The agent runtime runs the pi coding agent (@mariozechner/pi-coding-agent) inside an isol8 container. Instead of executing code, it executes a prompt — pi handles the LLM loop, tool calls (read, write, edit, bash), and file edits autonomously, entirely within the sandbox.

Quick start

isol8 run -e "add unit tests for the auth module" \
  --runtime agent \
  --net filtered \
  --allow "api.anthropic.com" \
  --secret "ANTHROPIC_API_KEY=sk-ant-..." \
  --agent-flags "--model anthropic/claude-sonnet-4-5"

How it works

When runtime: "agent" is used, isol8 runs:
pi --no-session --append-system-prompt '<sandbox context>' [agentFlags] -p '<code>'
  • --no-session — disables session persistence (ephemeral, non-interactive). This is always set by isol8; you do not need to include it in agentFlags.
  • --append-system-prompt — automatically injected by isol8 to inform pi of sandbox constraints
  • agentFlags — extra pi flags you supply (model, thinking level, tool restrictions)
  • -p '<code>' — your prompt, shell-quoted
pi then runs its own tool-call loop inside the container. It can read, write, and edit files under /sandbox, and run arbitrary bash commands — all within the sandbox’s resource and network limits.
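The command assembly above can be sketched in a few lines. This is an illustrative sketch only — `shellQuote` and `buildPiCommand` are hypothetical names, not isol8 APIs; the real implementation may differ. The key detail is POSIX single-quoting with the `'\''` escape, which keeps the prompt intact through the shell:

```typescript
// Hypothetical sketch of how the pi invocation is assembled (names are
// illustrative, not part of the isol8 API).
// Single-quote the string, escaping embedded quotes with '\'' so the
// prompt survives the shell unmodified.
function shellQuote(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

function buildPiCommand(prompt: string, agentFlags = ""): string {
  const parts = [
    "pi",
    "--no-session",
    "--append-system-prompt",
    shellQuote("<sandbox context>"),
  ];
  if (agentFlags) parts.push(agentFlags);
  parts.push("-p", shellQuote(prompt));
  return parts.join(" ");
}

console.log(buildPiCommand("add unit tests", "--model anthropic/claude-sonnet-4-5"));
// pi --no-session --append-system-prompt '<sandbox context>' --model anthropic/claude-sonnet-4-5 -p 'add unit tests'
```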
The isol8:agent Docker image (which provides bun, pi, and gh) is built automatically when you run isol8 setup or when DockerIsol8 first uses the agent runtime. If you need to build it manually — for example in an offline environment — run:
docker build --target agent -t isol8:agent \
  node_modules/@isol8/core/dist/docker/

Networking requirement

The agent runtime requires network access — the AI coding agent must reach its LLM provider API. network: "none" throws:
Error: Agent runtime requires network access.
The AI coding agent needs to reach its LLM provider API.
Use --net host, or --net filtered --allow "api.anthropic.com" (or your provider's domain).
Two valid network modes for the agent runtime:
  • "filtered" — Recommended. Restricts outbound traffic to an explicit allowlist of LLM API hostnames.
  • "host" — Full host network access. Use in trusted environments where you control what the agent calls.
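The check behind the error above can be sketched as follows. This is an assumption-laden illustration (`assertAgentNetwork` is a hypothetical name, not an isol8 API); it encodes the two documented rules: "none" is rejected outright, and filtered mode needs at least one whitelist entry:

```typescript
// Illustrative sketch of the agent-runtime network validation
// (assertAgentNetwork is a hypothetical helper, not part of isol8).
type NetworkMode = "none" | "filtered" | "host";

function assertAgentNetwork(mode: NetworkMode, whitelist: string[] = []): void {
  if (mode === "none") {
    // network: "none" is never valid for the agent runtime
    throw new Error("Agent runtime requires network access.");
  }
  if (mode === "filtered" && whitelist.length === 0) {
    // filtered mode without an allowlist would block the LLM API too
    throw new Error("Filtered mode needs at least one whitelist entry.");
  }
}

assertAgentNetwork("filtered", ["^api\\.anthropic\\.com$"]); // ok
assertAgentNetwork("host"); // ok
```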

Sandbox system prompt

Every pi invocation inside isol8 receives an automatically appended system prompt informing the agent that it is running in a sandbox with restricted network access and an ephemeral filesystem. This uses pi’s --append-system-prompt — it appends to pi’s default prompt without replacing it. You do not need to supply this yourself.

The code field

For the agent runtime, code is always the prompt text — never a script. It is passed to pi via -p '<prompt>' after shell-quoting.
await engine.execute({
  runtime: "agent",
  code: "Refactor authenticate() to use async/await and add JSDoc comments.",
});

Agent flags (agentFlags)

Use agentFlags (library/API) or --agent-flags (CLI) to pass extra arguments to pi before the -p flag.
await engine.execute({
  runtime: "agent",
  code: "Fix the failing tests",
  agentFlags: "--model anthropic/claude-sonnet-4-5 --thinking medium --no-extensions",
});

Useful pi flags

  • --model <provider/id> — LLM to use, e.g. anthropic/claude-sonnet-4-5, openai/gpt-4o, google/gemini-2.0-flash
  • --thinking <level> — Thinking budget: off, minimal, low, medium, high, xhigh
  • --tools <list> — Built-in tools to enable. Default: read,bash,edit,write. Also available: grep, find, ls
  • --no-tools — Disable all built-in tools
  • --no-skills — Disable auto-loading of skill files from the container
  • --no-extensions — Disable auto-loading of extensions from the container

Injecting files

Use files in ExecutionRequest (library/API) or --files <dir> (CLI) to inject local files into /sandbox before the agent runs.
import { readFileSync } from "node:fs";

await engine.execute({
  runtime: "agent",
  code: "Review the code in /sandbox and suggest improvements to error handling",
  agentFlags: "--model anthropic/claude-sonnet-4-5 --tools read,bash",
  files: {
    "src/auth.ts": readFileSync("./src/auth.ts", "utf-8"),
    "src/utils.ts": readFileSync("./src/utils.ts", "utf-8"),
    // pi auto-loads AGENTS.md from cwd — use this for project rules
    "AGENTS.md": "# Rules\n- Follow existing code style\n- No new dependencies\n",
  },
});
pi automatically loads AGENTS.md (and CLAUDE.md) from the working directory at startup. Injecting your project rules as /sandbox/AGENTS.md gives the agent project-specific context without touching the prompt.

Setup scripts

A setupScript runs as a bash script inside the container before pi receives its prompt. Use it to clone repos, write config files, install tools, or prepare any state the agent needs. The script runs as the sandbox user from /sandbox.

Clone a repo before the agent starts

The most common pattern: clone the target repo so pi finds it ready on the filesystem.
const result = await engine.execute({
  runtime: "agent",
  setupScript: `
    git clone https://$GITHUB_TOKEN@github.com/my-org/my-repo.git /sandbox/repo
    cd /sandbox/repo && git checkout -b agent/fix origin/main
  `,
  code: "Fix the failing TypeScript type errors in src/parser.ts. Write tests for any functions you change.",
  agentFlags: "--model anthropic/claude-sonnet-4-5",
  timeoutMs: 300_000,
});

Inject .npmrc or .gitconfig before the agent runs

The agent may need authenticated access to npm or private git remotes. Write config files via the setup script so credentials are in place before pi starts:
await engine.execute({
  runtime: "agent",
  setupScript: `
    # Authenticate npm to private registry
    cat > /sandbox/.npmrc << 'EOF'
registry=https://registry.npmjs.org/
//registry.npmjs.org/:_authToken=$NPM_TOKEN
EOF

    # Configure git identity for commits
    git config --global user.name "isol8-agent"
    git config --global user.email "agent@ci.internal"

    # Clone the target repo
    git clone https://$GITHUB_TOKEN@github.com/my-org/my-repo.git /sandbox/repo
    cd /sandbox/repo && npm ci
  `,
  code: "Add end-to-end tests for the checkout flow. Use the existing test patterns in tests/e2e/.",
  agentFlags: "--model anthropic/claude-sonnet-4-5 --thinking low",
  timeoutMs: 600_000,
});

Inject AGENTS.md via setup script

pi auto-loads AGENTS.md from its working directory. Write project rules via the setup script to give the agent context without touching the prompt:
await engine.execute({
  runtime: "agent",
  setupScript: `
    git clone https://$GITHUB_TOKEN@github.com/my-org/my-repo.git /sandbox/repo

    # Write project rules — pi picks these up automatically
    cat > /sandbox/repo/AGENTS.md << 'EOF'
# Coding rules
- Follow existing code style
- No new runtime dependencies without approval
- All new functions must have JSDoc comments
- Tests live in tests/ — use vitest
EOF
  `,
  code: "Refactor the authentication module to use async/await throughout.",
  agentFlags: "--model anthropic/claude-sonnet-4-5",
  timeoutMs: 300_000,
});

Bake setup into a custom image

For setup that never changes between runs (git identity, tool config, registry auth), bake it into a custom image using prebuiltImages[].setupScript in your config. The script runs on every execution against that image without adding per-request latency:
isol8.config.json
{
  "prebuiltImages": [
    {
      "tag": "my-org/node-devbox:latest",
      "runtime": "node",
      "installPackages": ["typescript", "eslint", "prettier", "vitest"],
      "setupScript": "git config --global user.name 'isol8-agent' && git config --global user.email 'agent@ci.internal' && git config --global core.autocrlf false && npm config set update-notifier false"
    }
  ]
}
Then your execution only needs the per-run parts:
const engine = new DockerIsol8({
  image: "my-org/node-devbox:latest",
  network: "filtered",
  networkFilter: { whitelist: ["^api\\.anthropic\\.com$", "^github\\.com$"], blacklist: [] },
});

await engine.start();

await engine.execute({
  runtime: "agent",
  // Image-level script already ran git config — only clone is needed here
  setupScript: "git clone https://$GITHUB_TOKEN@github.com/my-org/my-repo.git /sandbox/repo",
  code: "Add unit tests for the UserService class.",
  agentFlags: "--model anthropic/claude-sonnet-4-5 --thinking low",
  timeoutMs: 300_000,
});
When both image-level and request-level setupScript are set, the image-level script always runs first. See Setup scripts for the full reference.
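One way to picture the ordering guarantee is as a simple concatenation. This is only a mental model — `effectiveSetupScript` is a hypothetical helper, not isol8's actual mechanism, and the real runtime may execute the two scripts separately rather than as one combined script:

```typescript
// Mental-model sketch of the documented ordering: the image-level setup
// script always runs before the request-level one.
// (effectiveSetupScript is illustrative, not part of the isol8 API.)
function effectiveSetupScript(
  imageLevel: string | undefined,
  requestLevel: string | undefined,
): string {
  return [imageLevel, requestLevel]
    .filter((s): s is string => Boolean(s))
    .join("\n");
}

// git config (baked into the image) runs before the per-request clone:
effectiveSetupScript(
  "git config --global user.name 'isol8-agent'",
  "git clone https://github.com/my-org/my-repo.git /sandbox/repo",
);
```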

Persistent sessions

Use mode: "persistent" to run multiple steps in the same container — for example, cloning a repo with bash and then running the agent against it:
const engine = new DockerIsol8({
  mode: "persistent",
  network: "filtered",
  networkFilter: {
    whitelist: ["^api\\.anthropic\\.com$", "^github\\.com$"],
    blacklist: [],
  },
  secrets: {
    GITHUB_TOKEN: process.env.GITHUB_TOKEN!,
    ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY!,
  },
  pidsLimit: 200,
  memoryLimit: "2g",
});

await engine.start();

// Step 1: deterministic setup (bash)
await engine.execute({
  runtime: "bash",
  code: `
    git clone https://$GITHUB_TOKEN@github.com/my-org/my-repo.git /sandbox/repo
    cd /sandbox/repo && git checkout -b agent/task origin/main
  `,
});

// Step 2: agentic implementation (agent)
await engine.execute({
  runtime: "agent",
  code: "Fix the type errors in src/parser.ts. The project uses TypeScript strict mode.",
  agentFlags: "--model anthropic/claude-sonnet-4-5 --thinking low",
  env: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY! },
  timeoutMs: 300_000,
});

// Step 3: deterministic verification (bash)
const testResult = await engine.execute({
  runtime: "bash",
  code: "cd /sandbox/repo && npx tsc --noEmit && npx jest --ci",
});

await engine.stop();

Streaming agent output

pi produces output incrementally. Use executeStream to receive it in real time:
for await (const event of engine.executeStream({
  runtime: "agent",
  code: "Refactor the auth module to remove deprecated API calls",
  agentFlags: "--model anthropic/claude-sonnet-4-5",
})) {
  if (event.type === "stdout") process.stdout.write(event.data);
  if (event.type === "stderr") process.stderr.write(event.data);
  if (event.type === "exit") console.log(`\nAgent exited: ${event.data}`);
}
Each event carries an optional phase field ("setup" or "code") so you can distinguish setup-script output from agent output:
for await (const event of engine.executeStream({
  runtime: "agent",
  setupScript: "git clone https://$GITHUB_TOKEN@github.com/my-org/repo.git /sandbox/repo",
  code: "Fix the type errors in src/parser.ts",
  agentFlags: "--model anthropic/claude-sonnet-4-5",
})) {
  if (event.phase === "setup") {
    // output from the setupScript (clone, config, etc.)
    process.stderr.write(`[setup] ${event.data}`);
  } else if (event.type === "stdout") {
    process.stdout.write(event.data);
  } else if (event.type === "exit") {
    console.log(`\nAgent exited: ${event.data}`);
  }
}
If the setupScript exits non-zero, the stream yields a { type: "error", phase: "setup" } event followed by an exit event, and the agent never starts. Filter on phase to surface setup failures separately from agent failures.
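The failure-classification logic can be sketched as a pure function over the event stream. The event shape is assumed from the streaming examples above, and `classifyFailure` is an illustrative helper, not an isol8 API:

```typescript
// Sketch of surfacing setup failures separately from agent failures,
// based on the event shape shown in the streaming examples.
// (classifyFailure is illustrative, not part of the isol8 API.)
type StreamEvent = {
  type: "stdout" | "stderr" | "error" | "exit";
  phase?: "setup" | "code";
  data: string;
};

function classifyFailure(events: StreamEvent[]): "setup-failed" | "agent-failed" | "ok" {
  // A setup-phase error means the agent never started
  if (events.some((e) => e.type === "error" && e.phase === "setup")) {
    return "setup-failed";
  }
  // Otherwise a non-zero exit code indicates an agent failure
  const exit = events.find((e) => e.type === "exit");
  if (exit && exit.data !== "0") return "agent-failed";
  return "ok";
}
```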

Default resource limits

The agent runtime spawns subprocesses for tool calls (bash, package installs, git operations). The default pidsLimit of 64 is often too low — explicitly set pidsLimit: 200 to avoid process limit errors:
const engine = new DockerIsol8({
  network: "filtered",
  networkFilter: { whitelist: ["^api\\.anthropic\\.com$"], blacklist: [] },
  pidsLimit: 200, // required — the default of 64 is too low for agent workloads
});
  • pidsLimit — recommended for agent: 200 (set explicitly); default for all runtimes: 64
  • sandboxSize — recommended for agent: 2g; default for all runtimes: 512m

Retrieving output files

Use outputPaths to include files written by the agent in the result:
const result = await engine.execute({
  runtime: "agent",
  code: "Generate a test suite for the Parser class and write it to /sandbox/parser.test.ts",
  outputPaths: ["/sandbox/parser.test.ts"],
});

console.log(result.files?.["/sandbox/parser.test.ts"]);
Or retrieve files explicitly with getFile() after execution in a persistent session.

LLM API key handling

Pass the API key via engine secrets (recommended — masked from output) or per-request env:
// Via engine secrets — automatically masked in stdout/stderr
const engine = new DockerIsol8({
  secrets: { ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY! },
  // ...
});
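The masking behavior can be illustrated with a minimal sketch. This is an assumption about what "masked in stdout/stderr" implies (`maskSecrets` is a hypothetical helper; isol8's actual masking implementation may differ):

```typescript
// Illustrative sketch of secret masking: every occurrence of a secret
// value is replaced in captured output.
// (maskSecrets is hypothetical, not part of the isol8 API.)
function maskSecrets(output: string, secrets: Record<string, string>): string {
  let masked = output;
  for (const value of Object.values(secrets)) {
    // split/join replaces all occurrences without regex-escaping concerns
    masked = masked.split(value).join("***");
  }
  return masked;
}

maskSecrets("Using key sk-ant-abc123", { ANTHROPIC_API_KEY: "sk-ant-abc123" });
// → "Using key ***"
```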

Troubleshooting

  • Error: Agent runtime requires network access — Switch to network: "filtered" with at least one whitelist entry, or network: "host". network: "none" is not supported for the agent runtime.
  • Agent exits non-zero — Check result.stderr. Common causes: missing API key, endpoint not in whitelist, timeoutMs too short.
  • Agent can’t reach the LLM API — Verify the whitelist pattern. Patterns are matched as extended regular expressions using grep -E (substring match, not full-string). Without anchors, a pattern like anthropic\\.com would also match evil-anthropic.com.attacker.net. Use ^ and $ anchors for precise matching: ^api\\.anthropic\\.com$.
  • Files not in result — Add outputPaths or call getFile() after the run. In ephemeral mode, container state is discarded on exit.
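You can verify anchoring behavior directly with grep -E, the same extended-regex matching the filter uses. The hostnames below are the illustrative ones from the troubleshooting note:

```shell
# Unanchored pattern: substring match lets a lookalike domain through
echo "evil-anthropic.com.attacker.net" | grep -qE 'anthropic\.com' \
  && echo "unanchored: lookalike matched"

# Anchored pattern: the lookalike is rejected...
echo "evil-anthropic.com.attacker.net" | grep -qE '^api\.anthropic\.com$' \
  || echo "anchored: lookalike rejected"

# ...while the real hostname still matches
echo "api.anthropic.com" | grep -qE '^api\.anthropic\.com$' \
  && echo "anchored: real host allowed"
```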

One-shot coding agents

Architecture and pipeline: clone repo, implement, verify, fix, and open a PR — with no human in the loop.

Setup scripts

Full reference for setupScript: image-level vs request-level, execution order, error handling.

AI agent code execution

Foundational patterns for LLM tool-call loops with isol8.

Runtime reference

All six runtimes: commands, extensions, package install behavior.

Security model

Network controls, seccomp, secret masking, and isolation boundaries.