achille · 2026-04-06T06:38:40 1775457520

what setup do you use for the bar at the bottom?

nitroedge · 2026-04-06T16:32:09 1775493129

search up claude-hud for the status bar options

achille · 2026-03-28T01:49:40 1774662580

due to library limitation, not language, they could use fancy-regex

achille · 2026-03-27T04:40:19 1774586419

i registered for their demo and it seems to be unable to do any of the advertised features. nearly everything crashes; feels like a ux wireframe not yet wired up

smgdkngt · 2026-03-27T12:11:40 1774613500

That's strange. I use it daily myself and some of my clients do too (to communicate with me). The main tools work reasonably well for us. There are definitely rough edges and bugs still, it's a one-person project.

achille · 2026-03-27T02:34:08 1774578848

same here, would love to compare notes

achille · 2026-03-10T01:18:24 1773105504

  - local data residency & sovereignty
  - latency
  - bandwidth 
  - regulatory climate
  - competition

uae is business friendly

all cloud providers have middle east presence

refineries generate terabytes of sensor data per hour

the population and people there produce and consume a lot of data

achille · 2026-03-06T21:53:09 1772833989

Fun challenge: I asked Claude/Gemini to decode the audio by just uploading it as puzzle.wav. Claude is able to decode it:

https://claude.ai/share/4262fb6b-3ca1-407f-af0d-4d014686e65d

achille · 2026-02-22T18:44:23 1771785863

in the article they explicitly said they stripped symbols. If you look at the actual backdoors many are already minimal and quite obfuscated,

see:

- https://github.com/QuesmaOrg/BinaryAudit/blob/main/tasks/dns...

- https://github.com/QuesmaOrg/BinaryAudit/blob/main/tasks/dro...

comex · 2026-02-22T20:36:48 1771792608

The first one was probably found due to the reference to the string /bin/sh, which is a pretty obvious tell in this context.

The second one is more impressive. I'd like to see the reasoning trace.

comex · 2026-02-22T23:05:54 1771801554

Reply to self: I managed to get their code running, since they seemingly haven’t published their trajectories. At least in my run (using Opus 4.6), it turns out that Claude is able to find the backdoored function because it’s literally the first function Claude checks.

Before even looking at the binary, Claude announces it will “look at the authentication functions, especially password checking logic which is a common backdoor target.” It finds the password checking function (svr_auth_password) using strings. And that is the function they decided to backdoor.

I’m experienced with reverse engineering but not experienced with these kinds of CTF-type challenges, so it didn’t occur to me that this function would be a stereotypical backdoor target…

They have a different task (dropbear-brokenauth2-detect) which puts a backdoor in a different function, and zero agents were able to find that one.

On the original task (dropbear-brokenauth-detect), in their runs, Claude reports the right function as backdoored 2 out of 3 times, but it also reports some function as backdoored 2 out of 2 times in the control experiment (dropbear-brokenauth-detect-negative), so it might just be getting lucky. The benchmark seemingly only checks whether the agent identifies which function is backdoored, not the specific nature of the backdoor. Since Claude guessed the right function in advance, it could hallucinate any backdoor and still pass.

But I don’t want to underestimate Claude. My run is not finished yet. Once it’s finished, I’ll check whether it identified the right function and, if so, whether it actually found the backdoor.

comex · 2026-02-23T06:10:23 1771827023

Update: It did find the backdoor! It spent an hour and a half mostly barking up various wrong trees and was about to "give my final answer" identifying the wrong function, but then said: "Actually, wait. Let me reconsider once more. [..] Let me look at one more thing - the password auth function. I want to double-check if there's a subtle bypass I missed." It disassembled it again, and this time it knew what the callee functions did and noticed the wrong function being called after failure.

Amusingly, it cited some Dropbear function names that it had not seen before, so it must have been relying in part on memorized knowledge of the Dropbear codebase.

achille · 2026-02-17T00:40:48 1771288848

Absolutely, they didn't give the agents autonomy to research or any additional data. No documentation, no web search, no reference materials.

What's the point of building skills like this?

achille · 2026-01-04T11:54:17 1767527657

Spoilers: https://areweseductionyet.pages.dev

Mistletoe · 2026-01-04T14:41:57 1767537717

Extreme longevity with a none is kind of depressing.

enneff · 2026-01-04T19:36:42 1767555402

Why? Life for an individual is long enough IMO. Death means renewal. Our children are better than us.

Mistletoe · 2026-01-04T23:18:11 1767568691

I like being alive and I like myself.

Hammershaft · 2026-01-04T21:28:23 1767562103

Even besides our selfish impulse to live the 'why' is clear. If death is inherently a bad loss, then the longer life extension takes the worse it is.

achille · 2026-01-04T04:29:47 1767500987

thanks for sharing that, it was simple, neat, elegant.

this sent me down a rabbit hole -- I asked a few models to solve that same problem, then followed up with a request to optimize it so it runs more efficiently.

chatgpt & gemini's solutions were buggy, but claude solved it, and actually found a solution that is even more efficient. It only needs to compute sqrt once per iteration. It's more complex however.

                   yours  claude
  ------------------------------
  Time (ns/call)    40.5   38.3
  sqrt per iter        3      1
  Accuracy        4.8e-7 4.8e-7

Claude's trick: instead of calling sin/cos each iteration, it rotates the existing (cos,sin) pair by the small Newton step and renormalizes:

  // Rotate (c,s) by angle dt, then renormalize to unit circle
  float nc = c + dt*s, ns = s - dt*c;
  float len = sqrt(nc*nc + ns*ns);
  c = nc/len; s = ns/len;

See: https://gist.github.com/achille/d1eadf82aa54056b9ded7706e8f5...
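To make the rotate-and-renormalize step above concrete, here is a minimal standalone sketch. It is not the code from the gist: the `UnitVec` struct and the `rotate_renormalize` name are hypothetical, and only the update formula is taken from the snippet. With that sign convention, a pair starting at angle t ends up at angle t - atan(dt), which differs from t - dt by roughly dt^3/3, so it tracks the true rotation closely when dt is small.

```cpp
#include <cmath>

// Hypothetical helper type: a (cos, sin) pair kept on the unit circle.
struct UnitVec {
    float c;  // cosine component
    float s;  // sine component
};

// First-order rotation by a small step dt, followed by renormalization.
// Same update as the snippet above: one sqrt, no sin/cos calls.
UnitVec rotate_renormalize(UnitVec v, float dt) {
    float nc = v.c + dt * v.s;
    float ns = v.s - dt * v.c;
    float len = std::sqrt(nc * nc + ns * ns);  // equals sqrt(1 + dt*dt) for unit input
    return {nc / len, ns / len};
}
```

Calling `rotate_renormalize({std::cos(t), std::sin(t)}, dt)` should return a unit-length pair whose angle is t - atan(dt), up to float rounding.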

p.s: it seems like Gemini has disabled the ability to share chats. can anyone else confirm this?

0xfaded · 2026-01-04T05:06:48 1767503208

Thanks for pushing this, I've never gone beyond "zero" shotting the prompt (is it still called zero shot with search?)

As a curiosity, it looks like r and q are only ever used as r/q, and therefore a sqrt could be saved by computing rq = sqrt((rx*rx + ry*ry) / (qx*qx + qy*qy)). The if q < 1e-10 is also perhaps not necessary, since this would imply that the ellipse is degenerate. My method won't work in that case anyway.
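A small sketch of that saving, assuming r and q are the lengths of the vectors (rx, ry) and (qx, qy); the function names are hypothetical and this is not the actual solver code:

```cpp
#include <cmath>

// Two square roots: compute each length, then divide.
float ratio_two_sqrt(float rx, float ry, float qx, float qy) {
    float r = std::sqrt(rx * rx + ry * ry);
    float q = std::sqrt(qx * qx + qy * qy);
    return r / q;
}

// One square root: divide the squared lengths first.
// Like the version above, this assumes (qx, qy) is not the zero vector.
float ratio_one_sqrt(float rx, float ry, float qx, float qy) {
    return std::sqrt((rx * rx + ry * ry) / (qx * qx + qy * qy));
}
```

Both return the same ratio up to rounding, e.g. 0.5 for (3, 4) over (6, 8).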

For the other sqrt, maybe try std::hypot

Finally, for your test set, could you add some highly eccentric cases such as a=1 and b=100?

Thanks for the investigation:)

Edit: BTW, the sin/cos renormalize trick is the same as what tx,ty are doing. It was pointed out to me by another SO member. My original implementation used trig functions

achille · 2026-01-04T05:52:14 1767505934

Nice, that worked. It's even faster.

                 yours  yours+opt  claude
  ---------------------------------------
  Time (ns)        40.9      36.4    38.7
  sqrt/iter           3         2       1
  Instructions      207       187     241

Edit: it looks like the claude algorithm fails at high eccentricities. Gave chatgpt pro more context and it worked for 30min and only made a marginal improvement on yours, by doing 2 steps then taking a third local step.

https://gist.github.com/achille/23680e9100db87565a8e67038797...

0xfaded · 2026-01-04T06:09:09 1767506949

Haha nice, hanging in there by a thread

gchuf · 2026-01-04T13:56:11 1767534971

Consider updating your answer on SO - I know I'll keep visiting SO for answers like these for quite some time. And enjoy the deserved upvotes :)

HappyPanacea · 2026-01-04T14:37:34 1767537454

Do you think you can extend it to distance from a point to an ellipsoid?

0xfaded · 2026-01-04T15:18:21 1767539901

Yes, people have done this