
My experience as well. This is even worse than just having a mediocre model, because I can work around that. The inconsistency means it produces different outputs for the same prompt, and I can't rely on that as a business tool.

If we take it a step further: in a few years, why would anyone purchase generic software anymore? If we can perfectly customise software for our needs and preferences for almost free, why buy something generic from an App Store? I genuinely think Apple's business model is in jeopardy.

Most apps aren’t standalone and the services they depend on are nontrivial to build. For example, maybe you could vibe code a guitar tuner app, but not a ride share app.

I agree. The services which will be left standing will be those with a competitive moat: critical mass (Tinder, Facebook), content (YouTube, AppleTV), and scale (frontier AI models requiring expensive hardware), etc.

That said, if you look at the apps on your phone, I wager a large proportion don't have these moats. Translation, passwords, budget, reminders, email, to do, project management, messaging, browser, calendar, fitness, games, game tracking, etc.


The most recent software paradigm has been SaaS - software as a service. Capex is distributed among all customers and opex is paid for through the subscription. This avoids a large upfront capex and provides easy cost and revenue projections for both sides of the transaction. The key to SaaS is that the software is maximally generic, meaning it works well for the largest number of people. This necessitates making tough cuts on UX and functionality when they would only benefit small parts of the userbase.

Vibe coding or LLM accelerated development is going to turn this on its head. Everyone will be able to afford custom software to fit their specific needs and preferences. Where Salesforce currently has 150,000 customers, imagine 150,000 customers all using their own customised CRM. The scope for software expansion is unbelievably large right now.


SaaS is not a new idea and has been renamed multiple times.

In the 70s, it was called "time-sharing". Instead of buying a mainframe, you got a CICS application instance on a mainframe and used that. (tangentially, spare time on these built-out nation-wide dialup-supported networks is what gave birth to CompuServe and GEnie).

In the dot-com era, it was called "application service providers". Salesforce actually started in this era (1999), and so did NetSuite. This was the first attempt at browser-based delivery, but bandwidth and browsers sucked then.

I think PaaS is a more recent software paradigm, albeit a far less successful one.


Yeah it looks like it's just a shared project. I thought we could already do this?

Same! I thought I was going crazy but the effect is clear and reproducible. My hangovers are also less bad.

My anecdotal experience is that NAC makes me much more tolerant to alcohol. As in, I can drink a lot more without feeling the effects. Since I don't get the same buzz, I care less about reaching for a beer.

That is my goal and I invested a few dozen hours into the endeavour. My honest review is:

1. Something like OpenClaw will change the world.

2. OpenClaw is not yet ready.

The heart of OpenClaw (and the promise) is the autonomy. We can already do a lot with the paid harnesses offered by OpenAI and Anthropic, so the secret sauce here is agents doing stuff for us without us having to babysit them or even ask them.

The problem is that OpenClaw does this in an extremely rudimentary way: with "heartbeats." These are basically cron jobs which execute every five minutes. The cron job executes a list of tasks, which in turn execute other tasks. The architecture is extremely inefficient, heavy on LLM compute, and prone to failure. I could enumerate the thousand ways it can and will fail, but that's not important. The point is that the autonomy part of the autonomous assistant works very badly. Many people end up with a series of prescriptive cron jobs and mistakenly call that OpenClaw.
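To make the shape of the problem concrete, the heartbeat pattern described above boils down to something like this (a minimal sketch in Python; the function names and task structure are illustrative, not OpenClaw's actual code):

```python
import time

def run_task(task, llm_call):
    # Each task is itself an LLM invocation, and subtasks fan out into
    # further calls -- so one heartbeat can trigger many model round-trips.
    result = llm_call(task["prompt"])
    for sub in task.get("subtasks", []):
        run_task(sub, llm_call)
    return result

def heartbeat(tasks, llm_call, interval=300):
    # Fires every five minutes whether or not anything has changed --
    # the source of the wasted compute described above.
    while True:
        for task in tasks:
            run_task(task, llm_call)
        time.sleep(interval)
```

Nothing here is event-driven: the loop burns tokens on a fixed cadence regardless of whether there is anything to do, which is why this scales so poorly.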

Compounding this is memory. It is extremely primitive. Unfortunately even the most advanced RAG solutions out there are poor. LLMs are powerful due to the calculated weights between parametric knowledge; referring to non-parametric knowledge is incredibly inefficient. It's the difference between a wheelchair and a rocket ship. This compounds over time. Each time OpenClaw needs to "think" about anything, it preloads a huge amount of "memories" into the query: everything from your personal details to the architecture to the specific task. Something as simple as "what time is it" can chew through tens of thousands of tokens. Now consider what happens over time as the agent learns more and more about you. Does that all get included in every single query? It eventually fails under its own weight.
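The blowup is easy to demonstrate with a toy model of the naive preload strategy (a sketch; the ~4 characters-per-token heuristic is a rough assumption, not a real tokenizer):

```python
def build_query_context(memories, question):
    # Naive preload: every stored memory rides along with every query,
    # so the prompt grows linearly with everything the agent has learned.
    return "\n".join(memories) + "\n\nUser: " + question

def rough_token_count(text):
    # Crude heuristic: roughly 4 characters per token for English prose.
    return len(text) // 4

# 500 modest memories of ~200 characters each...
memories = [f"memory {i}: " + "x" * 200 for i in range(500)]
prompt = build_query_context(memories, "what time is it")
# ...and a trivial question already costs tens of thousands of tokens.
print(rough_token_count(prompt))
```

The cost is paid on every single query, and it only grows as the agent accumulates context about you.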

There is no elegant solution to this. You can "compress" previous knowledge, but this is very lossy and LLMs do a terrible job of intelligently retaining the right stuff. RAG solutions are testing intelligent routing. One method is an agentic memory feedback loop which seeks out knowledge that might exist. The problem is this is circular: the agent has to already know what it's missing in order to go search for it. Does the LLM always attempt to search every memory file in the hope that one of the .md files contains something useful? That is hopelessly slow. Does it try to infer based on weekly/monthly summaries? That has proven extremely error-prone.

At this point I think this will be first solved by OpenAI and/or Anthropic. They'll create a clean vectorised memory solution (likely a light LLM which can train itself in the background on a schedule) and a sustainable heartbeat cadence packaged into their existing apps. Anthropic is clearly taking cues from OpenClaw right now. In a couple of years we might have a competent open source agent solution. By then we might also have decent local LLMs to give us some privacy, because sending all my most intimate info to OpenAI doesn't feel great.


Disclosure: I wrote the linked post.

Heartbeat cron and naive memory are the right thread to pull. Agree.

The problem is the data/trust boundary. One agent process, one credential store, all channels sharing both. Whenever we scale the memory up, which we all want to do, we scale the disaster radius of every prompt injection with it.

Wirken accounted for this in the first design step. Per-channel process isolation. Handshakes between adapters and the core. Compile-time type constraints so a Discord adapter cannot construct a Telegram session handle. An encrypted credential vault. A hash-chained audit log of every action. All of it remains model-agnostic, so local models and confidential-compute providers are drop-in.
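For anyone unfamiliar with the hash-chaining idea: each log entry commits to the hash of its predecessor, so a retroactive edit breaks every later link. A minimal sketch in Python (illustrative only; Wirken's actual implementation will differ):

```python
import hashlib
import json

def append_entry(log, action):
    # Each entry's hash covers the previous entry's hash plus a
    # canonical serialization of the action being recorded.
    prev = log[-1]["hash"] if log else "0" * 64
    entry = {"action": action, "prev": prev}
    entry["hash"] = hashlib.sha256(
        (prev + json.dumps(action, sort_keys=True)).encode()
    ).hexdigest()
    log.append(entry)
    return entry

def verify(log):
    # Walk the chain from genesis; any tampered entry (or any entry
    # after it) fails to reproduce its recorded hash.
    prev = "0" * 64
    for e in log:
        expected = hashlib.sha256(
            (prev + json.dumps(e["action"], sort_keys=True)).encode()
        ).hexdigest()
        if e["prev"] != prev or e["hash"] != expected:
            return False
        prev = e["hash"]
    return True
```

The point is that an agent (or an attacker riding a prompt injection) cannot quietly rewrite history: it can only append, and every append is verifiable.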

Your memory point is still unsolved at this layer. When memory does get solved, you want the solver running where it cannot leak the wrong credentials to the wrong channel. Otherwise the smarter it gets, the worse the breach.


> Compounding this is memory. It is extremely primitive. ... Now consider what happens over time as the agent learns more and more about you. Does that all get included in every single query? It eventually fails under its own weight.

Agentic coding has all of the same issues, and it gets solved in much the same way: give LLMs tool calls to file persistent memories by topic, list what topics are available (possibly with multiple levels of subtopics), and retrieve them into the context when relevant. Not too different from what humans do with a zettelkasten and the like.
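The tool surface described above is tiny: file, list, retrieve. A minimal sketch (class and method names are made up for illustration):

```python
class TopicMemory:
    """Memories filed under topics; the model lists topic names first,
    then pulls only the relevant notes into context."""

    def __init__(self):
        self.store = {}

    def file_memory(self, topic, note):
        # Tool call: persist a note under a topic heading.
        self.store.setdefault(topic, []).append(note)

    def list_topics(self):
        # Cheap call the model can always afford: names only, no bodies.
        return sorted(self.store)

    def retrieve(self, topic):
        # Only now do the note bodies enter the context window.
        return self.store.get(topic, [])
```

The key property is that the expensive step (loading note bodies) is deferred until the model has decided, from names alone, which topic is relevant. That's what keeps the context from growing with everything the agent has ever learned.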


Not sure why you think agentic coding is solved - it isn't, imo, and for exactly the same memory issues.

I suspect the distinction is API vs subscription. The app has some kind of very restrictive system prompt which appears to heavily restrict compute without some creative coaxing. API remains solid. So if you're using OpenCode or some other harness with an API key, that's why you're still having a good time.

The Mac Minis (probably 64GB RAM) are the most cost effective.

It is surprisingly competent. It's not Opus 4.6 but it works well for well structured tasks.
