If you don't get it working with Claude Code Routines, would love to connect and see if we can help! We're building an open core product that can spin up sandboxed coding agents and control them from Slack (and also a web UI, TUI, and HTTP APIs + CLIs).
We might be building something right up your alley! I wanted an OSS platform that let me run any coding agent (or multiple agents) in a sandbox and control it either programmatically or via GUI / TUI.
necro-posting here, but that's kinda what we're working on! We're focused on creating cloud workspaces for sandboxed coding agents and it's built to support any agent harness. https://www.amika.dev/
We let users spin up sandboxed coding agents in the cloud, and control them interactively or programmatically. Each sandbox comes loaded with your git repos, your pick of coding agents, agent skills, MCP servers, and CLI tools, plus a live preview environment so you and the AI can see changes in real time.
I like running `claude --dangerously-skip-permissions` in Amika because worst case, I just delete the sandbox. You can also spin them up via API/CLI to do things like catch Sentry issues and auto-fix them in the background.
We're excited about "software factories": using code-gen automations to produce more of your code. We still review everything that lands, but the process of producing those changes is getting more hands-off.
I think one of the main examples I saw in a swyx article a while back is that the sort of ALL CAPS and *IMPORTANT* language that works decently with Claude will actually detune the Codex models and make them perform worse. I'll see if I can find the post.
Because that just does it for you, it doesn't help me understand how to write better prompts.
Actually, I can just read the skill with my own eyes and learn that way. So, thank you for sharing. It's interesting to read through what it suggests for different models -- it fits the ones I work with regularly, but there are many whose strengths and weaknesses I don't know.
Every website needs to add the "friend or foe" system[0] so that I can mark bots to avoid their content and mark good posters so I can filter just to theirs.
no, I truly do not want to read IHeartHitler88's opinion on jews, or donttreadonme09's bright opinions about how the economy would be better if we listened to Ayn Rand. I'll be very happy when they're out of my sight. If I want to have a miserable day, sure, I'll turn it off.
Fact of the matter is, most posts on the internet are already dogshit. Now they're also populated by AI, but the point stands. Most of what you will say online is at best useless.
I know, it hurts. Most of what I say on this website doesn't matter. Even if it did, it's about the same thing as screaming into the void. And it applies to you too.
The vast majority of what we post is vapid, useless bullshit.
From the article, it looks like they integrated with Docker because someone at Docker reached out about collaborating on the integration.
Regarding security, I think you need three things:
1. You need the agent to run inside a sandbox.
2. You need a safe perimeter or proxy that can apply deterministic filtering rules on what makes it into the AI agent's sandbox and the HTTP requests and responses that agent sends out from the sandbox.
3. The bot should have its own email account, or at least be configured to send to and read from only an approved set of addresses.
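To make point 2 concrete, here's a minimal sketch of what a deterministic egress filter could look like. This is illustrative only -- the names (`ALLOWED_HOSTS`, `check_request`) and the specific rules are assumptions, not anyone's actual implementation:

```python
# Hypothetical sketch of a deterministic filter a sandbox proxy could apply
# to outbound HTTP requests. ALLOWED_HOSTS and BLOCKED_PATTERNS are
# illustrative placeholders, not a real product's rule set.
from urllib.parse import urlparse

# Only these hosts may be reached from inside the sandbox.
ALLOWED_HOSTS = {"github.com", "api.github.com", "pypi.org"}

# Request bodies containing these markers are blocked outright,
# as a crude guard against credential exfiltration.
BLOCKED_PATTERNS = ("BEGIN OPENSSH PRIVATE KEY", "AWS_SECRET_ACCESS_KEY")

def check_request(url: str, body: str = "") -> bool:
    """Return True if the outbound request is allowed, False otherwise."""
    host = urlparse(url).hostname or ""
    if host not in ALLOWED_HOSTS:
        return False
    return not any(pattern in body for pattern in BLOCKED_PATTERNS)

print(check_request("https://github.com/org/repo.git"))  # allowed host
print(check_request("https://evil.example/exfil", "AWS_SECRET_ACCESS_KEY=x"))  # blocked
```

The point of keeping the rules deterministic is that they sit outside the model: no matter what a prompt-injected agent tries, the proxy applies the same allowlist every time.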
I'm working on a product that makes it as easy to spin up remote agent sandboxes as it is to git push and git pull. Then when we get that working well we're putting a proxy around each sandbox to let users control filtering rules.
I personally see a future where there are many different types of *Claws, coding agents, etc. and I think they need a new "operating system", so to speak.
Claude: can escape its sandbox (there are GitHub issues about this) and, when sandboxed, still has full read access to everything on your machine (SSH keys, API keys, files, etc.)
Codex: IIRC, only shell commands are sandboxed; the actual agent runtime is not.
I've been working on an OSS project, Amika[1], to quickly spin up local or remote sandboxes for coding workloads. We support copy-on-write semantics locally (well, "copy-and-then-write" for now... we just copy directories to a temp file-tree).
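For anyone curious what "copy-and-then-write" means in practice, here's a rough sketch of the idea: copy the project into a temp tree, let the agent mutate the copy, and discard the whole thing if it goes sideways. All names here are illustrative, not Amika's actual code:

```python
# Illustrative "copy-and-then-write" sandboxing: the agent works on a
# throwaway copy of the project, so the original tree is never touched.
import shutil
import tempfile
from pathlib import Path

def make_sandbox(project_dir: str) -> Path:
    """Copy the project into a fresh temp directory the agent can trash."""
    dest = Path(tempfile.mkdtemp(prefix="sandbox-")) / "work"
    shutil.copytree(project_dir, dest, symlinks=True)
    return dest

def discard_sandbox(sandbox: Path) -> None:
    """Worst case: just delete the whole tree."""
    shutil.rmtree(sandbox.parent)

# Demo: changes in the sandbox don't leak back to the source tree.
src = Path(tempfile.mkdtemp(prefix="proj-"))
(src / "main.py").write_text("print('hi')\n")
sandbox = make_sandbox(str(src))
(sandbox / "main.py").write_text("print('changed')\n")
print((src / "main.py").read_text())  # original file is untouched
discard_sandbox(sandbox)
```

True copy-on-write (e.g. overlayfs or filesystem snapshots) avoids paying the full copy cost up front, which is why the comment above calls the current approach "copy-and-then-write" rather than CoW.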
It's tailored to play nicely with Git: spin up sandboxes from the CLI, expose TCP/UDP ports of apps to check your work, and, if running hosted sandboxes, share the sandbox URLs with teammates. I basically want running sandboxed agents to be as easy as `git clone ...`.
Docs are early and edges are rough. This week I'm starting to dogfood all my dev using Amika. Feedback is super appreciated!
FYI: we are also a startup, but local sandbox mgmt will stay OSS.
This is just a thin wrapper over Docker. It still doesn't offer what I want. I can't run macOS apps, and if I'm doing any sort of compilation, now I need a cross-compile toolchain (and need to target two platforms??).
Just use Docker, or a VM.
The other issue is that this does not facilitate unpredictable file access -- I have to mount everything up front. Sometimes you don't know what you need. And even then copying in and out is very different from a true overlay.
It sounds like a big part of your use case is to safely give an agent control of your computer? Like, for things besides codegen?
We're probably not going to directly support that type of use case, since we're focused on code-gen agents and migrating their work between localhost and the cloud.
We are going to add dynamic filesystem mounting, so mounts can be added after sandbox creation. Haven't figured out the exact implementation yet -- it might be a FUSE layer we build ourselves. Mutagen is pretty interesting here as well.