Show HN: Coasts – Containerized Hosts for Agents (github.com/coast-guard)
74 points by jsunderland323 15 hours ago | 30 comments
Hi HN - We've been working on Coasts (“containerized hosts”) to make it so you can run multiple localhost instances, and multiple docker-compose runtimes, across git worktrees on the same computer. Here’s a demo: https://www.youtube.com/watch?v=yRiySdGQZZA. There are also videos in our docs that give a good conceptual overview: https://coasts.dev/docs/learn-coasts-videos.

Agents can make code changes in different worktrees in isolation, but it's hard for them to test their changes without multiple localhost runtimes that are isolated and scoped to those worktrees as well. You can do it up to a point with port hacking tricks, but it becomes impractical when you have a complex docker-compose with many services and multiple volumes.

We started playing with Codex and Conductor at the beginning of this year and had to come up with a bunch of hacky workarounds to give the agents access to isolated runtimes. After bastardizing our own docker-compose setup, we came up with Coasts as a way for agents to have their own runtimes without you having to change your original docker-compose.

A containerized host (from now on we’ll just say “coast” for short) is a representation of your project's runtime, like a devcontainer but without the IDE stuff—it’s just focused on the runtime. You create a Coastfile at your project root and usually point to your project's docker-compose from there. When you run `coast build` next to the Coastfile you will get a build (essentially a docker image) that can be used to spin up multiple Docker-in-Docker runtimes of your project.

Once you have a coast running, you can then do things like assign it to a worktree, with `coast assign dev-1 -w worktree-1`. The coast will then point at the worktree-1 root.

Under the hood, the host project root and any external worktree directories are Docker-bind-mounted into the container at creation time, but the /workspace dir, where we run the coast's services from, is a separate Linux bind mount that we create inside the running container. When switching worktrees we basically just run `umount -l /workspace`, `mount --bind <path_to_worktree_root> /workspace`, and `mount --make-rshared /workspace` inside the running coast. The rshared flag sets up mount propagation so that when we remount /workspace, the change flows down to the inner Docker daemon's containers.
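
The remount sequence described above can be sketched as a small command builder. This is only an illustration, not Coasts' actual code: the function names and the example worktree path are made up, and the bind target is assumed to be /workspace. Nothing is executed; the functions just return the commands in order.

```python
import shlex
from typing import List

WORKSPACE = "/workspace"  # mount point named in the post


def remount_commands(worktree_root: str, workspace: str = WORKSPACE) -> List[List[str]]:
    """Return the three remount steps, in order, as argv lists (nothing is run)."""
    return [
        # Lazily detach the current bind mount; processes with open files keep them.
        ["umount", "-l", workspace],
        # Bind the new worktree root onto the workspace path (target is an assumption).
        ["mount", "--bind", worktree_root, workspace],
        # Mark the mount recursively shared so nested mounts see the change.
        ["mount", "--make-rshared", workspace],
    ]


def as_shell(commands: List[List[str]]) -> str:
    """Join the steps into one shell line that stops at the first failure."""
    return " && ".join(shlex.join(command) for command in commands)
```

For example, `as_shell(remount_commands("/worktrees/worktree-1"))` yields a single line that could be run as root inside the container.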

The main idea is that the agents can continue to work host-side but then run exec commands against a specific coast instance if they need to test runtime changes or access runtime logs. This makes it so that we are harness agnostic and create interoperability around any agent or agent harness that runs host-side.

Each coast comes with its own set of dynamic ports: you define the ports you wish to expose back to the host machine in the Coastfile. You're also able to "checkout" a coast. When you do that, socat binds the canonical ports of your coast (e.g. web 3000, db 5432) to the host machine. This is useful if you have hard coded ports in your project or need to do something like test webhooks.
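
As a rough illustration of the "checkout" idea, here is one way a canonical port could be forwarded to a coast's dynamic port with socat. The post does not say how Coasts invokes socat, so the address options below (`TCP-LISTEN` with `fork,reuseaddr`) are just a common socat forwarding form, and the dynamic port numbers are invented.

```python
from typing import Dict, List


def checkout_commands(canonical_to_dynamic: Dict[int, int],
                      host: str = "127.0.0.1") -> List[List[str]]:
    """Build one socat forwarder per canonical port (commands are not executed)."""
    commands = []
    for canonical, dynamic in sorted(canonical_to_dynamic.items()):
        commands.append([
            "socat",
            # Listen on the canonical port and fork a child per connection.
            f"TCP-LISTEN:{canonical},fork,reuseaddr",
            # Relay each connection to the coast's dynamic port.
            f"TCP:{host}:{dynamic}",
        ])
    return commands
```

With a hypothetical mapping such as `{3000: 49321, 5432: 49322}`, this produces two forwarders, one for web and one for db.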

In your Coastfile you point to all the locations on your host machine where you store your worktrees for your project (e.g. ~/.codex/worktrees). When an agent runs `coast lookup` from a host-side worktree directory, it is able to find the name of the coast instance it is running on, so it can do things like call `coast exec dev-1 make tests`. If your agent needs to do things like test with Playwright, it can do that host-side by using the dynamic port of your frontend.
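
To make the lookup-then-exec flow concrete, here is a sketch of how a worktree directory could be resolved to an instance name. It does not reflect how `coast lookup` is implemented: the assignments table, the function names, and the example paths are all hypothetical. Only the `coast exec <instance> <command>` shape comes from the post.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Optional


def lookup_instance(cwd: str, assignments: Dict[str, str]) -> Optional[str]:
    """Return the instance whose worktree root contains cwd (deepest root wins)."""
    here = PurePosixPath(cwd)
    best_root: Optional[PurePosixPath] = None
    best_name: Optional[str] = None
    for root, name in assignments.items():
        root_path = PurePosixPath(root)
        inside = here == root_path or root_path in here.parents
        if inside and (best_root is None or len(root_path.parts) > len(best_root.parts)):
            best_root, best_name = root_path, name
    return best_name


def exec_argv(instance: str, command: List[str]) -> List[str]:
    """Build the argv for running a command inside a coast instance."""
    return ["coast", "exec", instance, *command]
```

An agent sitting in a subdirectory of worktree-1 would resolve to `dev-1` and could then build `coast exec dev-1 make tests`.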

You can also configure volume topologies, omit services and volumes that your agent doesn't need, as well as share certain services host-side so you don't add overhead to each coast instance. You can also do things like define strategies for how each service should behave after a worktree assignment change (e.g. none, hot, restart, rebuild). This helps you optimize switching worktrees so you don't have to do a whole docker-compose down and up cycle every time.
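
A per-service strategy table could look something like the sketch below. The strategy names come from the post, but the mapping from each name to a Compose command is a guess for illustration (only "hot" is described in the thread, as a force-recreate); the service names are hypothetical.

```python
from enum import Enum
from typing import Dict, List


class Strategy(Enum):
    NONE = "none"        # service does not care about the code change
    HOT = "hot"          # recreate the container against the new mount
    RESTART = "restart"  # restart the existing container
    REBUILD = "rebuild"  # rebuild the image, then bring the service up


def compose_commands(strategies: Dict[str, Strategy]) -> List[List[str]]:
    """Translate per-service strategies into docker compose argv lists."""
    commands: List[List[str]] = []
    for service, strategy in strategies.items():
        if strategy is Strategy.NONE:
            continue  # leave the service untouched
        if strategy is Strategy.HOT:
            commands.append(["docker", "compose", "up", "-d", "--force-recreate", service])
        elif strategy is Strategy.RESTART:
            commands.append(["docker", "compose", "restart", service])
        elif strategy is Strategy.REBUILD:
            commands.append(["docker", "compose", "up", "-d", "--build", service])
    return commands
```

The point of the table is that services marked `none` cost nothing on a worktree switch, so only the services that depend on the changed code pay for a restart or rebuild.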

We'd love to answer any questions and get your feedback!

 help



This is really cool, been feeling this pain with worktrees for a while.

Curious about the hot strategy: when you do umount -l /workspace + mount --bind + mount --make-rshared inside the DinD container, lazy unmount means a running file watcher can still hold open fds to the old worktree while the new bind is already live. Have you hit cases where it keeps writing to stale paths after the switch? Or does it just naturally recover once the watcher picks up the inotify events from the new mount?


I have waited 12 hours for someone to ask this! You are my hero.

So the name "hot" is a bit misleading. The containers don't actually stay alive through the switch. What happens is we do the umount -l /workspace, mount --bind, mount --make-rshared sequence first, and then we run docker compose up --force-recreate. Force-recreate skips compose down (which would tear down the network, named volumes, everything) and just swaps the container processes in place. The old containers and their file watchers are killed and new ones start up.

By the time the new container processes start, /workspace already points at the new worktree so all their file handles are fresh and correct. There's no window where a watcher could be writing to stale paths because the old processes are just gone.

I was pretty afraid of this at first too but it turns out the force-recreate sidesteps the whole problem.
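
For anyone wondering why the stale-handle question matters at all, the effect can be shown without root using a symlink as a stand-in for the bind mount. This is an analogy, not the DinD setup itself: a file opened through the old path keeps writing into the old tree after the path is repointed, which is why killing the old processes avoids the issue.

```python
import os
import tempfile
from typing import Tuple


def demo_stale_handle() -> Tuple[bool, bool]:
    """Return (log exists in old tree, log exists in new tree) after a late write."""
    with tempfile.TemporaryDirectory() as root:
        old_tree = os.path.join(root, "worktree-old")
        new_tree = os.path.join(root, "worktree-new")
        os.mkdir(old_tree)
        os.mkdir(new_tree)

        # "workspace" plays the role of /workspace and initially points at the old tree.
        workspace = os.path.join(root, "workspace")
        os.symlink(old_tree, workspace)

        # A long-lived process opens a file through the workspace path.
        handle = open(os.path.join(workspace, "out.log"), "w")

        # Repoint the workspace at the new tree by atomically replacing the symlink.
        staged = workspace + ".next"
        os.symlink(new_tree, staged)
        os.replace(staged, workspace)

        # The already-open handle still refers to the file in the old tree.
        handle.write("late write\n")
        handle.close()

        return (
            os.path.exists(os.path.join(old_tree, "out.log")),
            os.path.exists(os.path.join(new_tree, "out.log")),
        )
```

Calling `demo_stale_handle()` returns `(True, False)`: the write landed in the old tree even though the path already pointed at the new one.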


We have been trying to solve the same problem (and a bunch of other ones) with https://specific.dev as well. We’ve tried to stay away from Docker as much as we can though because of the still pretty bad experience on Mac.

Our approach is having our CLI handle port assignments (and pass any connection details/ports along as env vars), which lets it spin up “isolated” copies of the local dev environment. It has the added benefit that we can deploy the same config straight to production and switch in production database connection strings and anything else needed.
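
The "CLI assigns ports and passes them as env vars" approach can be sketched roughly as follows. This is a generic illustration, not specific.dev's implementation; the function names and the `<SERVICE>_PORT` naming scheme are assumptions.

```python
import socket
from typing import Dict, List


def allocate_ports(services: List[str]) -> Dict[str, int]:
    """Ask the OS for one free localhost port per service."""
    sockets = []
    try:
        ports: Dict[str, int] = {}
        for name in services:
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
            sockets.append(sock)         # hold it open so the next pick differs
            ports[name] = sock.getsockname()[1]
        return ports
    finally:
        # Ports are released here; another process could grab one before the
        # service starts, so a real tool would need to handle that race.
        for sock in sockets:
            sock.close()


def to_env(ports: Dict[str, int]) -> Dict[str, str]:
    """Render ports as environment variables, e.g. web -> WEB_PORT."""
    return {f"{name.upper().replace('-', '_')}_PORT": str(port)
            for name, port in ports.items()}
```

Each copy of the environment gets its own `to_env(allocate_ports([...]))` result, and services read their ports from the environment instead of hard-coding them.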


> "We’ve tried to stay away from Docker as much as we can though because of the still pretty bad experience on Mac."

This seems to be a pretty common perspective, but isn't it mostly about Docker Desktop? OrbStack solved my complaints, and I'm genuinely curious if I'm missing something significant (which is def possible).


I think this was a common perspective from the early Docker days with regard to local bind mounts (before Docker switched from VirtualBox to HyperKit on macOS). I do use OrbStack and have noticed faster build times with it, but I haven't really noticed any difference in runtime performance between OrbStack and Docker Desktop.

We started with an approach like that, but I think our grounding principle has been that you shouldn't have to modify your docker-compose to get parallelized local development. We want to layer onto your existing setup, not make you rewrite your stack around us.

I haven't really had a bad experience with Docker on Mac. But is the idea that you basically just build your service on top of specific.dev's provided services (Postgres and Redis), those run bare-metal locally, and then you can deploy to specific.dev's hosted solution?


Yes, exactly. Probably two different focuses between us, we are more focused on providing the full environment to build productively with coding agents, from local dev all the way to prod. The key thing for us is that the agent can write code, build infrastructure and test the entire system autonomously locally, and then deploying to production should be dead simple.

A bit of a different approach from the classic use case of docker-compose that is often orthogonal to the production infrastructure in some sense.

One thing I've used to great success though is taking an existing project or example docker-compose and simply asking the coding agent to translate it to Specific's IaC. Works a treat, especially as the coding agent can read all the code at the same time and connect it all together.

(also it looks like we were in the same batch!)


I could definitely see that being useful for folks who are Docker-fearful or just less infra literate in general.

I think we're focused on the other end of the spectrum. Folks who like docker and have a good docker setup but want to have parallel runtimes. Anyway, best of luck!


How reliably do agents stick to the 'coast exec' boundary in practice? Especially when they spawn subagents that may or may not inherit the instructions.

Actually pretty reliably but you do need to explicitly call out the skill. I usually start agent threads with /coasts or in codex $coasts. Once it’s in the conversation they stick to it though.

One cool thing we do is we have the docs and semantic search of our docs baked into the CLI, so if the agents get lost they can usually figure things out kind of quickly by searching the docs via the cli.

Also, we have a little section in our agent.md and claude.md; I'm not sure how well it works without that.


This is pretty cool, have personally felt this limitation many a time.

Basically been relying on spinning up cursor / niteshift / devin workflows since they have their own containers but this could be interesting to keep it all on your main machine.


Thanks!

Yeah, I think there's a ton of great remote solutions right now. I think worktrees make the local stuff tricky but hopefully Coasts can help you out.

Let me know how it goes!


This looks really cool and I've definitely been feeling this pain. I've been building out a solution for myself on top of docker. What are the advantages of using coasts over docker?

Hey thanks! To be clear it does use docker. It's a docker-in-docker solution.

I think there are quite a few things:

1) You need a control plane to manage the host-side ports. Docker alone cannot do that, so you're either going to write a docker-compose for your development environment where you hard code dynamic ports into a special docker-compose or you're going to end up writing your own custom control plane.

2) You can preserve your regular Docker setup without needing to alter it around dynamic ports and parallelized runtimes. I like this a lot because I want to know that my docker-compose is an approximation of production.

3) Docker basically leaves you with one type of strategy... docker compose up and docker compose down. With coasts you can decide on different strategies when you switch worktrees on a per service basis.

4) This is sort of back to point 2, but more often than not you want to do things like have some shared services or volumes across parallelized runtimes. Coasts makes that trivial (you can also have multiple coast configs, so you can easily create a coast type that has isolated volumes). If you go the pure Docker route, you are going to end up with multiple docker-composes for different scenarios that Coasts abstracts away.

5) The UI you get out of the box for keeping track of your assigned worktrees is super useful.

6) There's a lot of built in optimizations around switching worktrees in the inner bind mount that you'll have to manually code up yourself.

7) I think the ergonomics are just way better. I know that's kind of a vibesey answer but it was sort of the impetus for making Coasts in the first place.

8) There's a lot of stuff around secrets management that I think Coasts does particularly well but can get cumbersome if you're hand-rolling a docker solution.


Thank you for the detailed info! I will check it out

> docker-in-docker solution

Goodbye Mac users.


Why do you say that?

It works fine on Mac (that's what we developed it on), and it's not nearly as much overhead as I was initially expecting. There's probably some added latency from the virtualization layer, but it hasn't been noticeable in our usage.


HN questions we know are coming our way:

1) Could you run an agent in the coast?

You could... sort of. We started out with this in mind. We wanted to get Claude Max plans to work so we built a way to inject OAuth secrets from the host into the containerized host... unfortunately because the Coast runtime doesn't match the host machine the OAuth token is created on, Anthropic rapidly invalidates the OAuth tokens. This would really only work for TUIs/CLIs and you'd almost certainly have to bring a usage key (at least for Anthropic). You would also need to figure out how to get a browser runtime into the containerized host if you wanted things like playwright to work for your agent.

There are so many good host-side solutions for sandboxing. Coasts is not a sandboxing tool and we don't try to be. We should play well with all host-side sandboxing solutions though.

2) Why DinD and why not mount namespaces with unshare / nsenter?

Yes, DinD is heavy. A core principle of our design was to run the user's docker-compose unmodified. We wanted the full docker api inside the running containerized host. Raw mount namespaces can't provide image caches, network namespaces, and build layers without running against the host daemon or reimplementing Docker itself.

In practice, I've seen about 200 MB of overhead with each containerized host running DinD. We have a Podman runtime in the works, which may cut that down some. But the bulk of utilization comes from the services you're running and how you decide to optimize your containerized hosts and Docker stack. We have a concept of "shared services". For example, if you don't need isolated Postgres or Redis, you can declare those services as shared in your Coastfile, and they'll run once on the host Docker daemon instead of being duplicated inside each containerized host; Coasts will route to them.
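
To see what shared services buy you, here is a small sketch that splits a compose service list into per-coast and host-side groups and counts the resulting service containers. The function names and service names are hypothetical, and this says nothing about how the Coastfile actually declares sharing.

```python
from typing import List, Set, Tuple


def split_services(compose_services: List[str],
                   shared: Set[str]) -> Tuple[List[str], List[str]]:
    """Return (services run inside each coast, services run once on the host)."""
    unknown = shared - set(compose_services)
    if unknown:
        raise ValueError(f"shared services not in compose file: {sorted(unknown)}")
    per_coast = [name for name in compose_services if name not in shared]
    on_host = [name for name in compose_services if name in shared]
    return per_coast, on_host


def service_container_count(compose_services: List[str],
                            shared: Set[str], coasts: int) -> int:
    """Count service containers across all coasts plus the shared host-side ones."""
    per_coast, on_host = split_services(compose_services, shared)
    return len(per_coast) * coasts + len(on_host)
```

With four services and three coasts, sharing Postgres and Redis gives 2 × 3 + 2 = 8 service containers instead of 4 × 3 = 12.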


This is interesting for MCP server deployment. Right now most MCP servers run as local stdio processes. Containerizing them would solve the security and isolation concerns that come up every time someone installs a third-party MCP server.

Would love to see this support stdio-to-HTTP bridging so local MCP servers can be exposed as remote ones without rewriting them.


There are a couple of ways you can go about MCP within Coasts (it also depends on what the MCP does). You can install the MCP service host-side (something like Playwright), in which case everything should just work out of the box for you.

Alternatively, you can set up the Coast to install MCP services in the containers. There are some cases around specific logging or db MCPs where this might make sense.

>Would love to see this support stdio-to-HTTP bridging so local MCP servers can be exposed as remote ones without rewriting them.

Are you saying if you exposed the MCP service in the Coast and hosted it remotely you could expose back the MCP service remotely? That's actually a sort of interesting idea. Right now, the agents basically need to exec the mcp calls if they are running host-side and need to call an inner mcp. I hadn't considered the case of proxying the stdout to http. I'll think about how best to implement that!
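
One possible shape for the stdio-to-HTTP bridge discussed above is sketched below. This is not a Coasts feature and not a complete MCP transport: it assumes newline-delimited JSON messages on stdin/stdout and exactly one reply line per request (so notifications, which get no reply, are not handled). The class and function names are made up.

```python
import json
import subprocess
import threading
from http.server import BaseHTTPRequestHandler
from typing import List


class StdioBridge:
    """Send one JSON message per line to a child process and read one line back."""

    def __init__(self, argv: List[str]):
        self._proc = subprocess.Popen(
            argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            text=True, bufsize=1,
        )
        self._lock = threading.Lock()  # allow only one in-flight request

    def request(self, message: dict) -> dict:
        with self._lock:
            self._proc.stdin.write(json.dumps(message) + "\n")
            self._proc.stdin.flush()
            reply = self._proc.stdout.readline()
        if not reply:
            raise RuntimeError("child process closed its stdout")
        return json.loads(reply)

    def close(self) -> None:
        self._proc.stdin.close()      # EOF tells the child to exit
        self._proc.wait(timeout=5)
        self._proc.stdout.close()


def make_handler(bridge: StdioBridge):
    """Build an HTTP handler class that relays POST bodies through the bridge."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = json.loads(self.rfile.read(length))
            payload = json.dumps(bridge.request(body)).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    return Handler
```

To serve it, something like `HTTPServer(("127.0.0.1", 8080), make_handler(bridge)).serve_forever()` from `http.server` would do, where the port is arbitrary.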


Isn't the primary security concern with third-party MCP servers the actual injected context and not whatever sandbox the MCP server is in? It doesn't really matter if the MCP can't do something to its host; it's that it can manipulate the context to whatever ends it deems fit, which is then intractable for whatever LLM is calling it.

I'm really struggling to understand what people's security concepts are with LLMs.


Third-party MCP servers create at least two different security problems. One is prompt/context injection through the tool output. The other is the much more conventional risk of executing untrusted code with transitive dependencies on your machine (which is how the recent litellm compromise was discovered).

Containerization only helps with the second one, not the first, but that still matters. If you’re going to run random third-party MCP servers, isolating them from your host and any sensitive local data is still an obvious improvement over no isolation.


There's this naïve approach to security that obsesses with building walls, because walls are secure and nothing gets through.

Apparently a lot of people get nerd-sniped into building impenetrable 10-meter-thick steel walls instead of thinking about the doors and the windows.


Does it support native macOS containers?

It does not. It works through Docker Desktop, OrbStack, or Colima on macOS.

Just FYI you might want to reconsider your branding. Using the term "Coast Guard" in pretty much any capacity without written authorization is a felony.

Super not true. Unless they're actively _impersonating_ a Coast Guard officer and acting overtly in that purported role, there's no crime. Simply having a thing called "coast guard" doesn't run afoul of anything (18 U.S.C. §§ 912, 913).

Interesting, I was not aware.

Well fortunately it's the name of a local observability ui and not the actual product. We'll change it if it becomes a problem.


[flagged]


So technically you could use Coasts to sandbox but our default approach is actually not sandboxed at all. The agents still run host-side so unless you're sandboxing the agent host-side, you're not sandboxed. With coasts you're basically running exec commands against the coast container to extract runtime information.

>One thing I've been thinking about with agent infrastructure: the auth model gets complex fast when agents need to call external APIs on behalf of users. Per-key rate limiting and usage tracking at the edge (rather than in the container) has worked well for me. Curious how you’re handling the credential passing to containerized agents.

The way we handle secrets is that at build time we allow you to run scripts that can extract secrets and env vars host-side. The secrets get stored in a SQLite table (not baked into the coast image). When you start a coast, it injects those secrets. You can decide how the secrets should appear: either as env vars or written to the write layer. You're then able to trigger a re-injection of the secrets, so you can extract all the secrets again host-side and have them injected into all running coasts. This is useful because you don't have to rebuild and re-run just to update secrets.
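
The store-then-inject flow can be sketched with an in-memory SQLite table. The schema, column names, and function names here are invented for illustration and are not Coasts' actual layout; the sketch only shows secrets living outside the image, being rendered as env vars or files, and being updated in place for re-injection.

```python
import sqlite3
from typing import Dict, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory secrets table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE secrets ("
        " name TEXT PRIMARY KEY,"
        " value TEXT NOT NULL,"
        " mode TEXT NOT NULL CHECK (mode IN ('env', 'file')),"
        " target TEXT NOT NULL)"  # env var name or file path inside the coast
    )
    return conn


def put_secret(conn: sqlite3.Connection, name: str, value: str,
               mode: str, target: str) -> None:
    """Insert a secret, or overwrite it when it is re-extracted host-side."""
    conn.execute(
        "INSERT OR REPLACE INTO secrets (name, value, mode, target) VALUES (?, ?, ?, ?)",
        (name, value, mode, target),
    )


def injection_plan(conn: sqlite3.Connection) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (env vars to set, files to write) for a starting or running coast."""
    env: Dict[str, str] = {}
    files: Dict[str, str] = {}
    rows = conn.execute("SELECT value, mode, target FROM secrets ORDER BY name")
    for value, mode, target in rows:
        (env if mode == "env" else files)[target] = value
    return env, files
```

Re-injection then amounts to calling `put_secret` with the fresh values and applying `injection_plan` to each running coast again, with no image rebuild involved.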


[flagged]


>One thing I'm curious about: how do you handle state drift when agents are working on the same service across different worktrees? For example, if two agents are both making schema changes to a shared database service, do you have any coordination primitives, or is that left to the orchestration layer above? In my experience the runtime isolation is the easy part - the hard part is when agents need to share state (like a test database) without stepping on each other.

Great question! You can configure multiple coasts, so you could have a coast running with isolated dbs/state and also a shared version (you can either share the volume amongst the running coasts or move your db to run host-side as a singleton). So it's sort of left to the orchestration layer: you put rules in your md file about when to use each. There are trade-offs to each scenario. I've been using isolated dbs for integration tests, but for UI things I end up going with shared services.

>Re: For example, if two agents are both making schema changes to a shared database service

Obviously things can still go wrong here in the shared scenario, but it's worked fine for us and I haven't hit anything so far. It's just like having developers introducing schema changes across feature branches.

>Also, the per-service strategy config (none/hot/restart/rebuild) seems like the right abstraction. Most of the overhead in switching worktrees comes from unnecessary full restarts of services that don't actually care about the code change.

Totally. At first, switching worktrees for our 1M+ LOC repo took about 2 minutes. Then we introduced the hot/none strategies and got it down to about 8 seconds. This is by far one of the best features we have.



