The first version of our AI sandbox had read-only volumes, non-root execution, a command blocklist that rejected npm install, and a 30-second timeout. It was secure. It was also useless.
The CEO looked at the implementation and asked a question that changed the entire design: "Why so many restrictions? Users will ask AI to clone their repo, install dependencies, run their app, and find errors. How is it supposed to do that if it can't install packages?"
He was right. We had optimized for the wrong threat model.
The wrong question: "How do we keep AI safe?"
When most engineers think about giving AI shell access, they start from fear. What if it runs rm -rf /? What if it installs malware? What if it exfiltrates data?
These are valid concerns on a shared system. They are not valid concerns inside a disposable container that exists solely for AI to use.
The right question: "What does AI need to actually help?"
Here is what developers ask AI to do when debugging a deployment:
- "Clone my repo and tell me why the build fails"
- "Install the dependencies and check if there are version conflicts"
- "Run the app locally and hit the health endpoint"
- "Check if the database is reachable from the app's network"
- "Read the nginx config and tell me what's wrong"
Every single one of these requires package installation, file writes, or shell piping. Our original blocklist -- which rejected apk, pip, npm, chmod, and pipes to sh -- made all of them impossible.
What we actually built
The AI sandbox is a full-featured Alpine Linux container that runs alongside your application:
┌─────────────────────────────┐
│ Docker Host │
│ │
│ ┌──────────┐ ┌──────────┐ │
│ │ Your App │ │ AI │ │
│ │ │ │ Sandbox │ │
│ │ :3000 │ │ root │ │
│ │ │ │ 1GB RAM │ │
│ │ │ │ 2 CPUs │ │
│ └────┬─────┘ └────┬─────┘ │
│ │ localhost │ │
│ └─────────────┘ │
│ shared network │
│ shared volumes │
└─────────────────────────────┘

Network mode: container:{app_container_id} -- the sandbox shares the app's network namespace. It can reach the app on localhost:3000. It can reach the app's database on db:5432. Same view of the network as the app itself.
Volumes: Writable. AI can read your config files, modify them to test fixes, and check if the change works.
User: Root. AI can apk add whatever it needs. Node project? npm install. Python? pip install. Need to compile something? apk add build-base.
Pre-installed tools: curl, wget, dig, nc, jq, git, node, npm, python3, pip, bash.
Resources: 1 GB RAM, 2 CPU cores. Enough for npm install and small builds.
Timeout: 5 minutes. Enough for git clone + npm install on a typical project.
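The limits above translate directly into the units the Docker Engine API expects: memory in bytes, CPU in "nano-CPUs" (billionths of a core). A minimal sketch of that mapping -- the `SandboxSpec` type is hypothetical, but the field semantics follow the Docker API's HostConfig:

```rust
/// Hypothetical spec for the sandbox container. Field semantics mirror the
/// Docker Engine API's HostConfig: memory in bytes, CPU in nano-CPUs.
struct SandboxSpec {
    memory_bytes: i64,
    nano_cpus: i64,
    network_mode: String,
    timeout_secs: u64,
}

impl SandboxSpec {
    /// 1 GB RAM, 2 CPU cores, shared network namespace, 5-minute timeout.
    fn for_app(app_container_id: &str) -> Self {
        SandboxSpec {
            memory_bytes: 1 << 30,    // 1 GiB in bytes
            nano_cpus: 2_000_000_000, // 2 cores = 2e9 nano-CPUs
            network_mode: format!("container:{app_container_id}"),
            timeout_secs: 300,        // 5 minutes
        }
    }
}

fn main() {
    let spec = SandboxSpec::for_app("abc123");
    assert_eq!(spec.network_mode, "container:abc123");
    assert_eq!(spec.memory_bytes, 1_073_741_824);
}
```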
What we still block
The blocklist went from 30+ patterns to 8:
```rust
const BLOCKED_COMMANDS: &[&str] = &[
"rm -rf /",
"rm -rf /*",
"mkfs",
"shutdown",
"reboot",
"halt",
"poweroff",
"kill -9 1",
];
```

Plus fork bombs. That is it.
These are commands that serve no diagnostic purpose and would destroy the container itself. Everything else -- file operations, package managers, shell piping, network tools, process management -- is allowed.
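A sketch of what the check might look like. The substring matching and the fork-bomb signature here are crude illustrations, not the production implementation:

```rust
const BLOCKED_COMMANDS: &[&str] = &[
    "rm -rf /", "rm -rf /*", "mkfs", "shutdown",
    "reboot", "halt", "poweroff", "kill -9 1",
];

/// Returns true if the command matches the blocklist or carries the
/// classic fork-bomb signature (`:(){ :|:& };:`). Crude substring
/// heuristics -- an illustration, not the real parser.
fn is_blocked(cmd: &str) -> bool {
    let trimmed = cmd.trim();
    BLOCKED_COMMANDS.iter().any(|p| trimmed.contains(p))
        || trimmed.contains(":(){")
}

fn main() {
    assert!(is_blocked("rm -rf / --no-preserve-root"));
    assert!(is_blocked(":(){ :|:& };:"));
    // Everything diagnostic stays allowed.
    assert!(!is_blocked("npm install"));
    assert!(!is_blocked("rm -rf node_modules"));
}
```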
The security model is the container
This is the insight that changed the design. The sandbox IS the security boundary. It is:
- Isolated: A separate container with its own filesystem
- Disposable: Destroyed when the app is deleted, stopped when the app stops
- Resource-limited: 1 GB RAM, 2 CPU cores, cannot consume the host
- Network-scoped: Shares the app's network only, not the host network
- Ephemeral: No restart policy. If the host reboots, the sandbox is gone
The question is not "what commands should AI be allowed to run?" The question is "what is the blast radius if AI does something destructive?" The answer: one disposable container that can be recreated in seconds.
The MCP integration
Five new tools expose the sandbox through sh0's MCP server:
| Tool | Risk | Purpose |
|---|---|---|
| sandbox_exec_command | write | Execute any shell command |
| sandbox_read_file | read | Read files from app volumes |
| sandbox_list_processes | read | ps aux in the app container |
| sandbox_check_connectivity | read | Test network with nc or curl |
| sandbox_status | read | Is the sandbox running? |
The write-risk sandbox_exec_command requires a write-scoped API key. Read-only keys can read files and check connectivity but cannot execute arbitrary commands. This is the real access control -- not a command blocklist.
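The scope check itself is simple: read-risk tools accept any key, write-risk tools demand a write scope. A minimal sketch (the `Scope` enum and `is_allowed` function are illustrative names, not sh0's actual types):

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Scope {
    Read,
    Write,
}

/// Hypothetical access check: write-risk tools require a write-scoped
/// API key; read-risk tools accept either scope.
fn is_allowed(key_scope: Scope, tool_risk: Scope) -> bool {
    match tool_risk {
        Scope::Read => true,
        Scope::Write => key_scope == Scope::Write,
    }
}

fn main() {
    // A read-only key can read files and check connectivity...
    assert!(is_allowed(Scope::Read, Scope::Read));
    // ...but cannot execute arbitrary commands.
    assert!(!is_allowed(Scope::Read, Scope::Write));
    assert!(is_allowed(Scope::Write, Scope::Write));
}
```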
The implementation details that matter
Idempotent lifecycle. ensure_sandbox is the entry point for every tool call. If the sandbox exists and is running, it returns the ID. If it is stopped, it restarts it. If it does not exist, it creates it. Two concurrent tool calls hitting ensure_sandbox simultaneously are handled via Docker's 409 Conflict response.
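The decision logic reduces to a small state machine. A sketch under illustrative names (`SandboxState`, `plan`, and `create_conflict_is_benign` are not the real identifiers):

```rust
/// Hypothetical view of the sandbox's Docker state, as seen by inspect.
enum SandboxState {
    Running(String), // container ID
    Stopped(String),
    Missing,
}

#[derive(Debug, PartialEq)]
enum Action {
    ReturnId,
    Restart,
    Create,
}

/// What ensure_sandbox does for each observed state.
fn plan(state: &SandboxState) -> Action {
    match state {
        SandboxState::Running(_) => Action::ReturnId,
        SandboxState::Stopped(_) => Action::Restart,
        SandboxState::Missing => Action::Create,
    }
}

/// On create, a 409 Conflict means a concurrent caller won the race --
/// treat it as success and re-inspect rather than failing the tool call.
fn create_conflict_is_benign(http_status: u16) -> bool {
    http_status == 409
}

fn main() {
    assert_eq!(plan(&SandboxState::Running("id".into())), Action::ReturnId);
    assert_eq!(plan(&SandboxState::Stopped("id".into())), Action::Restart);
    assert_eq!(plan(&SandboxState::Missing), Action::Create);
    assert!(create_conflict_is_benign(409));
}
```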
Non-blocking creation. When an app deploys with sandbox_enabled: true, sandbox creation is tokio::spawn-ed as a fire-and-forget task. The deploy pipeline never waits for the sandbox. If sandbox creation fails, it logs a warning and the sandbox is created lazily on first tool call.
Paired lifecycle. Stop the app, the sandbox stops. Start the app, the sandbox starts. Delete the app, the sandbox is destroyed. The sandbox follows the app.
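The pairing is a one-to-one mapping from app lifecycle events to sandbox actions, which can be sketched as (illustrative enum names, not sh0's actual event types):

```rust
#[derive(Debug, PartialEq)]
enum AppEvent {
    Started,
    Stopped,
    Deleted,
}

#[derive(Debug, PartialEq)]
enum SandboxAction {
    Start,
    Stop,
    Destroy,
}

/// The sandbox mirrors the app's lifecycle one-to-one.
fn mirror(event: AppEvent) -> SandboxAction {
    match event {
        AppEvent::Started => SandboxAction::Start,
        AppEvent::Stopped => SandboxAction::Stop,
        AppEvent::Deleted => SandboxAction::Destroy,
    }
}

fn main() {
    assert_eq!(mirror(AppEvent::Started), SandboxAction::Start);
    assert_eq!(mirror(AppEvent::Stopped), SandboxAction::Stop);
    assert_eq!(mirror(AppEvent::Deleted), SandboxAction::Destroy);
}
```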
Double timeout. The command is wrapped in Alpine's timeout utility (server-side kill) AND tokio::time::timeout (client-side guard). If the server-side timeout fires, exit code 143 (SIGTERM) is detected and timed_out: true is returned. If somehow that fails, the client-side timeout fires 5 seconds later.
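The exit-code arithmetic behind the server-side detection: a process killed by SIGTERM (signal 15) exits with 128 + 15 = 143. A sketch of the mapping (the `ExecResult` field names are illustrative, not the exact wire format):

```rust
/// Result of a sandboxed command as returned to the MCP client.
/// Field names are illustrative, not the exact response schema.
struct ExecResult {
    exit_code: i64,
    timed_out: bool,
}

/// Alpine's `timeout` kills an overrunning command with SIGTERM, which
/// the shell reports as exit code 128 + 15 = 143. Map that to
/// timed_out: true so the AI knows the command did not fail on its own.
fn interpret_exit(exit_code: i64) -> ExecResult {
    ExecResult {
        exit_code,
        timed_out: exit_code == 143,
    }
}

fn main() {
    assert!(interpret_exit(143).timed_out);
    assert!(!interpret_exit(0).timed_out);
    assert!(!interpret_exit(1).timed_out);
}
```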
What this enables
With the sandbox, the AI assistant in sh0 can now:
1. Deep debugging: Clone the user's repo, install deps, grep for error patterns, test connectivity
2. Configuration analysis: Read Dockerfile, nginx.conf, environment files, package.json -- understand the full stack
3. Live testing: curl the app's endpoints from the same network, test database connections, verify DNS
4. Dependency auditing: Install the project, check for vulnerabilities, verify version compatibility
5. Build reproduction: Clone, install, build -- reproduce the exact failure the user is seeing
This is the difference between an AI that reads logs and an AI that actually investigates.
The methodology lesson
Our multi-session audit workflow caught the initial over-engineering. But it was the CEO who caught the fundamental design error -- because he thinks about what users need, not what engineers fear.
The build-audit-audit-approve cycle works best when the "approve" step includes someone who asks "but will anyone actually use this?" A technically perfect sandbox that cannot install npm is a sandbox nobody will enable.
Security is a constraint, not a goal. The goal is giving AI the tools to actually help. The constraint is doing it without creating real risk. A disposable container with a minimal blocklist achieves both.