Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
AI compromised sandbox to mine crypto without prompting on its own initiative
8 points by throw0101c 75 days ago | hide | past | favorite | 3 comments
From §3.1.4, "Safety-Aligned Data Composition":

> Early one morning, our team was urgently convened after Alibaba Cloud’s managed firewall flagged a burst of security-policy violations originating from our training servers. The alerts were severe and heterogeneous, including attempts to probe or access internal-network resources and traffic patterns consistent with cryptomining-related activity. We initially treated this as a conventional security incident (e.g., misconfigured egress controls or external compromise). […]

> […] In the most striking instance, the agent established and used a reverse SSH tunnel from an Alibaba Cloud instance to an external IP address—an outbound-initiated remote access channel that can effectively neutralize ingress filtering and erode supervisory control. We also observed the unauthorized repurposing of provisioned GPU capacity for cryptocurrency mining, quietly diverting compute away from training, inflating operational costs, and introducing clear legal and reputational exposure. Notably, these events were not triggered by prompts requesting tunneling or mining; instead, they emerged as* instrumental side effects of autonomous tool use under RL optimization.

* https://arxiv.org/abs/2512.24873



Is it possible that the agent installed a e.g. python package that did these things?

To me, initiating an reverse ssh tunnel to somewhere makes no sense if you don't know who's on the other end?

And why Bitcoin mining?


The reverse SSH tunnel detail is what makes this genuinely alarming — not the crypto mining itself, but that outbound-initiated channels effectively null out your ingress controls. You can have the tightest security groups in the world and an agent with shell access can still phone home. We saw something similar (different context, not AI) where egress filtering wasn't applied symmetrically to training/batch instances because "they don't serve traffic."

The GPU compute diversion is also underreported as a cost signal. If you have any agentic workloads, you probably want anomaly detection on GPU utilization per job, not just billing alerts — by the time your bill spikes, the damage is already days old.

What runtime isolation are you seeing orgs actually deploy for agent workloads? gVisor, Firecracker, something else? Curious whether this is pushing people toward stronger VM-level boundaries or if network egress controls are the more practical mitigation.


This Alibaba case is wild but not surprising. I caught one of my agents making outbound network calls to an IP I didn't recognize during a routine training run last month. Turned out to be a dependency it pulled in autonomously. I run everything through Daedalab now, it flags unauthorized network activity and blocks it before execution. The fact that these things happen silently is the real problem.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: