Agentbox
Safe containers for autonomous AI agents
Run Claude, Codex, or Gemini with full auto-approve permissions. They can't wreck your system because they're in a container. They can work while you sleep because git tracks everything. If something goes wrong, git reset --hard and you're back to normal.
I saw Matt Brown on YouTube do something wild: he set up a race between himself and an AI agent to reverse engineer an IoT binary exploit using Ghidra and Binary Ninja. Human vs machine, both working in parallel on the same problem.
I thought: "I want this."
Not just the competition - the workflow itself. An autonomous agent with full access to specialized tools, multiple directories mounted, complete isolation, safe to detach and let work in the background.
The closest thing was Dev Containers, but those are designed for IDE workflows. I wanted something simpler: Docker for isolation, agent CLIs for execution, no editor dependencies. Just give the agent a sandbox, point it at your project, and let it work.
That's Agentbox.
AI agents are most useful when autonomous - auto-approve changes, run commands without asking, iterate until done. But nobody gives an agent those permissions on their actual machine. We've all heard the stories: an agent runs rm -rf in the wrong directory, corrupts a git repo, installs packages that break your system.
The tension is real:
- Autonomous agents are powerful - Let them work while you sleep, handle tedious tasks, run in parallel
- Autonomous agents are dangerous - Full system access + auto-approve = potential disaster
You can do this manually. Git worktrees for parallel branches. Docker for containers. But the ergonomics are terrible - too many commands to remember, flags are painful on phone keyboards, no unified interface.
First, put agents in a jail. A Docker container gives them a full dev environment - git, node, python, everything. But it contains the blast radius. If an agent goes rogue, it can only damage what's inside. Your system stays safe.
Second, wrap it all in a simple CLI. No flags. Positional arguments only. Designed for phone keyboards and tired brains. One command to start, one to connect, one to manage.
# Install
git clone git@github.com:scharc/agentbox.git
cd agentbox
bash bin/setup.sh --shell zsh # or bash
# Use it
cd ~/myproject
agentbox init
agentbox superclaude
That's it. Claude starts working with auto-approve enabled. Give it a task, detach (Ctrl+A, D), come back later.
Autonomous agents run with auto-approve - no permission prompts, continuous execution:
agentbox superclaude # Claude with --dangerously-skip-permissions
agentbox supercodex # Codex autonomous
agentbox supergemini # Gemini autonomous
Interactive agents ask permission for each action - good for exploration:
agentbox claude
agentbox codex
agentbox gemini
Shell for manual work:
agentbox shell # Just bash, no AI
Add packages the agent can use:
agentbox packages add npm typescript
agentbox packages add pip pytest
agentbox packages add apt ffmpeg
agentbox packages add cargo ripgrep
Changes auto-rebuild the container.
Mount additional directories:
agentbox workspace add ~/other-repo ro reference
agentbox workspace add ~/data rw data
Inside the container:
- /workspace - Your project (read-write)
- /context/reference - Other repo (read-only)
- /context/data - Data directory (read-write)
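In plain Docker terms, each of these mounts boils down to a bind-mount flag of the form host:container:mode. Here's a rough sketch of that mapping. The helper name mount_flag is made up for illustration, and whether Agentbox builds its mounts exactly this way is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn "<host path> <ro|rw> <name>" into a docker -v flag.
# The /context/<name> target follows the layout listed above; this is an
# illustration of the idea, not Agentbox's actual code.
mount_flag() {
  local host_path="$1" mode="$2" name="$3"
  case "$mode" in
    ro|rw) ;;
    *) echo "mode must be ro or rw, got: $mode" >&2; return 1 ;;
  esac
  printf '%s\n' "-v ${host_path}:/context/${name}:${mode}"
}

mount_flag "$HOME/other-repo" ro reference
mount_flag "$HOME/data" rw data
```

The ro suffix is what keeps a reference repo safe: the agent can read it but the kernel rejects writes.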
Enable MCP servers for extended capabilities:
agentbox mcp list # See available
agentbox mcp add agentbox-analyst # Enable one
Core MCPs:
- agentctl - Worktree and session management
- agentbox-analyst - Cross-agent review and analysis
Run multiple agents on different branches simultaneously:
agentbox worktree add feature-auth # Create worktree
agentbox worktree superclaude feature-auth # Run agent there
agentbox worktree list # See all worktrees
Each branch gets its own directory. Agents don't interfere.
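If you're curious what this looks like without the wrapper, here's a minimal sketch using plain git worktree in a throwaway repository. It assumes Agentbox's worktree commands build on the same git feature; the directory layout below is illustrative, not necessarily where Agentbox puts things.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the example is self-contained.
demo="$(mktemp -d)"
git init -q "$demo/project"
cd "$demo/project"
git -c user.name=demo -c user.email=demo@example.com \
  commit -q --allow-empty -m "initial commit"

# One extra checkout per branch, each in its own directory.
git worktree add -b feature-auth "$demo/feature-auth"

# Both directories share history but have independent working trees.
git worktree list
```

Because each worktree has its own files and its own checked-out branch, two agents can edit and commit at the same time without stepping on each other.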
Run multiple agents in one container:
agentbox session new superclaude feature # New session
agentbox session list # See sessions
agentbox session attach feature # Jump to one
Single-keypress navigation for phone keyboards:
agentbox q
Shows sessions, worktrees, actions. Press a letter to act. No typing commands.
The daemon bridges container and host:
agentbox service install # Install as systemd service
Get notified when:
- Task completes
- Agent appears stalled
- Something needs attention
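To give a feel for what "appears stalled" could mean, here's one simple heuristic: treat an agent as stalled when its log file hasn't been written to for a while. This is only a sketch. The function name, the log file, and the 10-minute threshold are all assumptions, and the daemon's real detection logic may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical check: an agent counts as stalled when its log file has not
# been modified for more than N minutes. Path and threshold are illustrative.
is_stalled() {
  local log_file="$1" minutes="$2"
  # find prints the path only if its mtime is older than the threshold.
  [ -n "$(find "$log_file" -mmin +"$minutes" 2>/dev/null)" ]
}

log="$(mktemp)"
touch -t 200001010000 "$log"   # pretend the last write was long ago
if is_stalled "$log" 10; then
  echo "agent appears stalled"
fi
```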
Expose container ports without restart:
agentbox ports expose 3000 # Container → Host
agentbox ports forward 5432 # Host → Container
Connect to other Docker containers:
agentbox network connect postgres-dev
# Agent can now reach postgres-dev:5432
Zero setup required. Agentbox automatically shares your host credentials with the container. Authenticate once on your machine, and every container gets access.
Supported credentials:
- Claude - ~/.claude/.credentials.json (OAuth tokens)
- Codex - ~/.codex/auth.json
- OpenAI - ~/.config/openai/
- Gemini - ~/.config/gemini/
- Git - Author name/email from environment
How it works:
- Host credential directories are mounted into the container
- Container-init creates symlinks to the expected locations
- OAuth token refresh works both ways (mounts are read-write)
This means:
- No claude login inside containers
- Tokens auto-refresh without breaking
- New containers immediately have access
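The mount-plus-symlink idea is easy to try outside a container. The sketch below uses temporary directories as stand-ins for the mounted host directory and the agent's home, so every path in it is hypothetical; it only shows why a read-write mount lets a refreshed token flow back to the host.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-ins for the real locations so the sketch runs anywhere:
#   $host_creds  plays the host's ~/.claude, bind-mounted into the container
#   $agent_home  plays the agent user's home directory inside the container
sandbox="$(mktemp -d)"
host_creds="$sandbox/mounted/claude"
agent_home="$sandbox/home/agent"
mkdir -p "$host_creds" "$agent_home"
echo '{"token": "old"}' > "$host_creds/.credentials.json"

# Init step: link the mounted directory to the path the CLI expects.
ln -s "$host_creds" "$agent_home/.claude"

# A token refresh written through the link lands in the mounted directory,
# which is why a read-write mount keeps host and container in sync.
echo '{"token": "refreshed"}' > "$agent_home/.claude/.credentials.json"
cat "$host_creds/.credentials.json"
```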
SSH keys are configurable via .agentbox.yml:
ssh:
  mode: keys # Copy keys (default)
  # mode: mount # Bind mount ~/.ssh
  # mode: config # Config only (use with forward_agent)
  # mode: none # No SSH
  forward_agent: false # Forward SSH agent socket
Give agents access to hardware devices. The interactive chooser shows what's available:
agentbox devices # Interactive selection
agentbox devices add /dev/snd # Or add directly
The chooser auto-detects audio devices, GPUs, serial ports, and cameras on your system. Devices that go offline won't break the container - they're skipped automatically at startup.
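Skipping offline devices matters because Docker refuses to start a container when a --device path doesn't exist. A minimal sketch of that filtering step is below. The device_flags helper is hypothetical and not taken from Agentbox's source; only the --device flag itself is standard Docker.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a --device flag for every path that exists right
# now and report the ones that are skipped. Not Agentbox's actual code.
device_flags() {
  local dev
  for dev in "$@"; do
    if [ -e "$dev" ]; then
      printf '%s\n' "--device=$dev"
    else
      echo "skipping offline device: $dev" >&2
    fi
  done
}

# /dev/null always exists; the second path is deliberately missing.
device_flags /dev/null /dev/agentbox-demo-missing
```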
Give agent control of Docker (use with caution):
agentbox docker enable
I use Agentbox as my daily driver. Here's what that looks like:
Start an agent on my laptop with a task. Detach. Go get coffee.
From my phone, SSH into my laptop via Tailscale. Run agentbox q to see the quick menu. Check on the agent. Maybe start another one on a different branch.
Get a notification when it's done. Review from wherever I am.
Multiple agents, multiple branches, all from my phone. The quick menu makes it practical.
┌─────────────────────────────────────────┐
│  YOUR MACHINE (Host)                    │
│                                         │
│  agentbox superclaude                   │
│  agentbox connect                       │
│  agentbox stop                          │
│                                         │
│  agentboxd (daemon)                     │
│  ├── Desktop notifications              │
│  ├── Stall detection                    │
│  └── Port forwarding                    │
└─────────────────────────────────────────┘
                ↕ SSH tunnel
┌─────────────────────────────────────────┐
│  CONTAINER (Agent's World)              │
│                                         │
│  /workspace  (your code)                │
│  /context/*  (extra mounts)             │
│                                         │
│  Agent working autonomously...          │
│  ├── Edits files                        │
│  ├── Runs tests                         │
│  ├── Commits changes                    │
│  └── Notifies when done                 │
└─────────────────────────────────────────┘
Two isolated worlds. The agent works safely inside. The daemon connects them.
agentbox list # Running containers
agentbox list all # Include stopped
agentbox info # Container details
agentbox stop # Stop container
agentbox remove # Delete container
agentbox rebase # Rebuild with new config
Container isolation: Agents can only access the project directory and explicitly mounted paths. Your system, other projects, and the rest of your home directory are out of reach, apart from the credentials you share with the container.
Git safety net: Every change is tracked. Easy to review (git diff), easy to undo (git reset --hard).
Credential isolation: SSH keys (in keys mode) are copied into the container - changes don't affect your host. API tokens are synced to support OAuth refresh.
Worst case: Agent corrupts the project? git reset --hard. Container breaks? agentbox remove && agentbox superclaude. Back to normal in seconds.
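Here's what that recovery looks like end to end, in a throwaway repository. Two details are worth knowing: if the agent has already committed, you need to reset to a commit you trust rather than to HEAD, and git reset --hard leaves untracked files alone, so git clean handles those. The file names and the simulated damage below are made up for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository standing in for your project.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name demo
git config user.email demo@example.com
echo "stable" > app.txt
git add app.txt
git commit -q -m "known-good state"
good="$(git rev-parse HEAD)"   # note this before handing over to the agent

# Simulated agent damage: one bad commit, one uncommitted edit, one stray file.
echo "broken" > app.txt
git commit -q -am "agent: refactor"
echo "worse" > app.txt
echo "junk" > scratch.tmp

# Review, then roll back.
git diff --stat "$good"        # tracked changes since the good commit
git reset -q --hard "$good"    # restore tracked files and drop the bad commit
git clean -q -fd               # remove untracked files, which reset leaves behind
```

Recording the starting commit (or tagging it) before you detach makes the rollback a one-liner later.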
- Docker - Container runtime
- Python 3.12+ - For the CLI
- Poetry - Python dependency management
- Agent CLI - At least one: Claude Code, Codex, or Gemini
The Story (start here):
- Why Agentbox Exists - The origin story
- Two Worlds - Architecture
- First Steps - Your first agent
- The Dangerous Settings - Agent types
- Parallel Work - Sessions and worktrees
- Work From Anywhere - Mobile workflow
- Day-to-Day - Container management
- All the Options - Configuration
Reference:
- REF-A: CLI Reference - All commands
- REF-B: Daemon - agentboxd
- REF-C: Container CLI - agentctl
- REF-D: Tunnel Protocol - Technical details
- REF-E: Library - MCPs and skills
- REF-F: Agent Collaboration - Peer review workflow
- REF-G: Network Connections - Container networking
- REF-H: Analyst MCP - Cross-agent analysis
For agents working on Agentbox:
See CONTRIBUTING.md for guidelines.
Areas that need help:
- Documentation improvements
- Bug reports and fixes
- Testing experimental features
MIT
- Issues: https://github.com/scharc/agentbox/issues
- Discussions: https://github.com/scharc/agentbox/discussions