I'm glad there are many teams running automated scans of PyPI and npm. It raises the bar for making a backdoor that can survive for any length of time.
It's a spam flood by the attacker to complicate information sharing[1]. They did the same thing in the Trivy discussion, with many of the same accounts.[2]
This is tied to the TeamPCP activity over the last few weeks. I've been responding and keeping an up-to-date timeline. I hope it helps folks catch up and put this incident in context:
1. I don't have hard metrics at hand, but with the latest Sonnet I'd say we reach consensus around 80% of the time. With Opus it's almost always, but we are not using it due to cost.
2. When the agents don't reach consensus, the difference in their behavior is usually one or both of:
- one of them didn't explore enough and lacks context
- their risk assessment is off
The latter happens often. In other agent-based workflows we now give clear instructions on how to assess risk and where to draw the line for considering something a true positive.
3. Validation is on Sonnet. We don't use persona-based prompts; all 3 validators get the same task and context. The agent orchestrating them takes their output and makes the final decision. We use an internal fork of the Claude Code GitHub Action for now.
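For anyone trying to picture the setup in point 3, here is a rough sketch, not our actual code. It assumes each validator is just a callable that returns a verdict string, and it uses a plain majority vote for the final decision. The real orchestrator is itself an agent and may weigh the outputs differently. The names `Validator` and `orchestrate` are made up for illustration.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical verdict labels; a real validator would return richer output.
Verdict = str  # "true_positive" or "false_positive"
Validator = Callable[[str], Verdict]


def orchestrate(finding: str, validators: Sequence[Validator]) -> Tuple[Verdict, bool]:
    """Give every validator the same finding and combine their verdicts.

    Returns the majority verdict and whether the validators were unanimous.
    Majority voting is an assumption made for this sketch.
    """
    verdicts = [validate(finding) for validate in validators]
    winner, votes = Counter(verdicts).most_common(1)[0]
    unanimous = votes == len(verdicts)
    return winner, unanimous
```

With three validators and two possible verdicts there is always a majority, and the `unanimous` flag is what you would track to get a consensus rate like the one in point 1.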
* it can inform triage: if you use the extension, you're more likely to be impacted
* because it was VS Code, Workspace Trust actually partially mitigated this in at least 38 cases