Nice product! I'm looking for a tool to iterate fast on UI/UX design.
> Old images get dropped
> A screenshot attached four turns ago can be 50,000 tokens. It is not helping the agent decide what to do now. Once an image falls behind your last two user turns, Brilliant strips it and leaves a short [Image cleared] stub in its place. The recent ones stay, because those are the ones that matter for the next move. Zero user effort, zero config.
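The pruning described above can be sketched in a few lines. This is a minimal illustration, not Brilliant's actual implementation: the message shape (a `role` plus a list of content blocks with an `"image"` type) and the `prune_old_images` helper are assumptions for the example.

```python
def prune_old_images(messages, keep_last_user_turns=2):
    """Replace image blocks that fall behind the last N user turns
    with a short text stub, leaving recent images intact."""
    # Positions of user turns, oldest to newest.
    user_turns = [i for i, m in enumerate(messages) if m["role"] == "user"]
    # Messages at or after the cutoff keep their images.
    if len(user_turns) >= keep_last_user_turns:
        cutoff = user_turns[-keep_last_user_turns]
    else:
        cutoff = 0

    pruned = []
    for i, msg in enumerate(messages):
        if i >= cutoff:
            pruned.append(msg)
            continue
        # Swap each stale image block for a small text stub.
        blocks = [
            {"type": "text", "text": "[Image cleared]"} if b["type"] == "image" else b
            for b in msg.get("content", [])
        ]
        pruned.append({**msg, "content": blocks})
    return pruned
```

Because a 50,000-token image collapses into a few-token stub, the savings dominate even though (as the reply below notes) editing earlier messages breaks prefix caching from that point on.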
Ok the context gets smaller but doesn't it invalidate the LLM cache?
Cilantro really does taste different from one person to another (depending on cilantro's aldehyde content and on genetic variation). I don't know about sugar and aspartame, but saying it's purely a "preference" seems a little presumptuous to me.
To the previous poster: do other intense sweeteners (stevia, saccharin, sucralose) taste sweet to you?
They all have variations of a bitter aftertaste to me. It’s not sweet or pleasant at all.
And it’s a different form of bitterness than the one you get from kale, collard greens, Brussels sprouts, etc., which I quite enjoy. I _almost_ want to drink a diet drink alongside one of the “bitter” vegetables, or even a crème brûlée, to quantify the difference.
I feel like it's a good way to build the best PoC in any vertical. Either they create a product, or the big players will but Anthropic can provide them with the infra.
In some cases this is what I ask from my juniors.
Not for every commit, but during some specific reviews. The goal is to coach them on why and how they got a specific result.
Companies (C-suites) do not actually want their worker pool (humans + agents) to stay constant over time, and there is no reason for it to. C-suites have very different worries.
And "cost center" is a lie left over from the Outsourcing Era; forget about it.
Yes, most comments make no sense to me. The statement basically both allows surveillance of non-American people and prevents imaginary LLM weapons (I highly doubt we'll see an LLM fully automating a weapon...).
Something feels off about this announcement. Anyone else?
Credit where it's due, going on record like this isn't easy, particularly when facing pressure from a major government client. Still, the two limits Anthropic is defending deserve a closer look.
On surveillance: the carve-out only protects people inside the US. Speaking as someone based in Europe, that's a detail that doesn't go unnoticed. On autonomous weapons: realistically, current AI systems aren't anywhere near capable enough to run one independently. So that particular line in the sand isn't really costing them much.
What I find more candid is actually the revised RSP. It draws a clearer picture of where Anthropic's oversight genuinely holds and where it starts to break down as they race to stay at the cutting edge. The core tension, trying to be simultaneously the most powerful and the most principled player in the room, doesn't have a neat resolution.
This statement doesn't offer one either. But engaging with the question openly, even without all the answers, beats silence and gives the rest of us something real to push back on.
>the carve-out only protects people inside the US. Speaking as someone based in Europe, that's a detail that doesn't go unnoticed.
I'm not sure an American company prioritising the privacy of American people is worth questioning. As a European, Anthropic are very low on the list of companies I worry about in terms of the progressive eradication of my privacy.
Agreed. That said, Anthropic's original pitch was about embedding safety at the foundational level of the 'model' (acknowledging that a model is more than just its weights).
If the safeguard against mass surveillance is strictly tied to geolocation (US vs. non-US), it can't be an intrinsic property of the model. It has to be enforced at the API or contractual level. This means international users are left out of those core, embedded protections. Unless Anthropic is planning to deploy multiple, differently-aligned foundation models based on customer geography or industry, the safety harness isn't really in the model anymore.
What can be asserted without evidence can also be dismissed without evidence. The benchmark creators haven't demonstrated that higher scores result in fewer humans dying or any meaningful outcome like that. If the LLM outputs some naughty words that's not an actual safety problem.
How would stopping using hyperscaler models on their infra "get as much of this capability into the open as possible"?
Either "we" create models better than the commercial state of the art (by whatever means).
Or we use open models AND fund the organisations building such models (by purchasing services from these orgs or through donations; in which case, how would these orgs differ from hyperscalers?).
But I don't see how just hosting the models on some private servers would give us an edge.