One-shot coding agents

Stripe’s Minions are unattended coding agents that one-shot tasks end to end: a developer kicks one off, and it produces a complete pull request with no human in the loop. Over 1,300 PRs merge at Stripe each week this way. This guide shows you how to build the same thing with isol8 — the architecture, each pipeline stage, and the patterns that make one-shot agents reliable.

Architecture

A one-shot coding agent system has three layers: a client that submits tasks, an orchestrator that sequences steps, and an isol8 container where all code and agent work happens. The client can be anything — a GitHub bot reacting to issue labels, a Slack bot responding to commands, a web UI with a task form, or a CLI script. It simply sends a task description to the orchestrator and optionally consumes progress events. The orchestrator is the core of the system. It creates a single persistent isol8 container and runs a pipeline of steps inside it: Every step runs inside the same container. The filesystem state built up during setup (/sandbox/repo) persists through implement, verify, fix, and ship.

The container: one persistent session per task

Each task gets a single DockerIsol8 instance in mode: "persistent". All steps share the container’s filesystem.

const engine = new DockerIsol8({
  mode: "persistent",
  network: "host",
  timeoutMs: 1_800_000,   // 30-minute hard ceiling for the whole run
  memoryLimit: "4g",
  cpuLimit: 2,
  pidsLimit: 200,          // agent spawns subprocesses; default of 64 is too low
  sandboxSize: "4g",
  maxOutputSize: 10 * 1024 * 1024,
  image: "isol8:agent",
  secrets: {
    GITHUB_TOKEN: githubToken,
    ANTHROPIC_API_KEY: anthropicKey,
  },
});

await engine.start();
// ... run all steps ...
await engine.stop(); // always in a finally block

network: "host" is used here because the agent needs to reach GitHub, the LLM provider API, and package registries simultaneously. If you know your exact set of hostnames, network: "filtered" with an explicit whitelist is more secure. See Security considerations.

Step 1: Setup — clone before the agent starts

Before the agent receives any prompt, a setupScript clones the repo and checks out a branch. Setup scripts run as bash inside the container and complete before the main execution begins.

function buildSetupScript(repo: string, branch: string): string {
  return `
set -e
git config --global user.email "agent@ci.internal"
git config --global user.name "Agent"
cd /sandbox && rm -rf repo
git clone https://x-access-token:$GITHUB_TOKEN@github.com/${repo}.git repo 2>&1
cd repo
git checkout -b ${branch} || git checkout ${branch}
npm ci
echo "ready — branch ${branch}"`;
}

await engine.execute({
  runtime: "agent",
  setupScript: buildSetupScript("my-org/my-repo", "agent/fix-auth"),
  code: 'echo "[setup] done"',
  timeoutMs: 300_000,
  workdir: "/sandbox/repo",
});

git checkout -b on retry: With set -e, git checkout -b exits non-zero if the branch already exists (e.g. on a retry). The || git checkout ${branch} fallback is load-bearing — always include it.

Step 2: Implement — the agent gets the task

The implement step passes a prompt to the agent runtime. The agent reads files, writes code, and runs tools — all inside the sandbox. A naive approach passes the raw task description directly:

await engine.execute({
  runtime: "agent",
  code: `You are working in /sandbox/repo on branch \`${branch}\` of ${repo}.

Implement the following task — read the repo structure first, make all necessary code
changes, run \`npm install\` if node_modules is missing. Do NOT commit anything.

Task:
${task}`,
  agentFlags: "--model anthropic/claude-sonnet-4-5 --no-session",
  timeoutMs: 1_200_000,
  workdir: "/sandbox/repo",
});

This works for simple tasks but is fragile. The isol8 agent has no way to ask follow-up questions — the prompt is the complete specification.

In practice, the orchestrator should act as a master agent. Gather context before handing off: read relevant files, pull issue details, summarize related PRs, fetch coding guidelines. Construct a self-sufficient prompt that gives the agent everything it needs without clarification.

A better implement step looks like this:

// The orchestrator gathers context before handing off
const relevantFiles = await readFilesFromGitHub(repo, issue.changedPaths);
const relatedIssues = await searchIssues(repo, issue.title);
const codeStyle = await readFile(repo, "CONTRIBUTING.md");

const prompt = `
You are working in /sandbox/repo. The codebase uses TypeScript strict mode.

## Task
${issue.body}

## Files most likely to need changes
${relevantFiles.map(f => `### ${f.path}\n\`\`\`\n${f.content}\n\`\`\``).join("\n\n")}

## Code style rules
${codeStyle}

## Related context
${relatedIssues.map(i => `- #${i.number}: ${i.title}`).join("\n")}

Make all necessary changes. Do NOT commit.
`;

await engine.execute({
  runtime: "agent",
  code: prompt,
  agentFlags: "--model anthropic/claude-sonnet-4-5 --no-session",
  timeoutMs: 1_200_000,
  workdir: "/sandbox/repo",
});

Step 3: Verify — lint and build

After the agent implements, deterministic shell steps verify the result. Lint and build are the only steps allowed to fail — their output is collected and fed into a fix loop, not treated as a hard error.

const lintResult = await engine.execute({
  runtime: "agent",
  cmd: "npm run lint 2>&1",
  workdir: "/sandbox/repo",
  timeoutMs: 120_000,
});

const buildResult = await engine.execute({
  runtime: "agent",
  cmd: "npm run build 2>&1",
  workdir: "/sandbox/repo",
  timeoutMs: 300_000,
});

Step 4: Fix loop — automated retry

If lint or build fails, the fix loop runs the agent again with the error output, then re-verifies. Two rounds is the practical ceiling — after that, hand off to humans.

const maxFixRounds = 2;

for (let round = 1; round <= maxFixRounds; round++) {
  await engine.execute({
    runtime: "agent",
    code: `You are working in /sandbox/repo. Fix all of the following errors.
Do NOT commit anything.

${lintErrors}
${buildErrors}`,
    agentFlags: "--model anthropic/claude-sonnet-4-5 --no-session",
    timeoutMs: 900_000,
    workdir: "/sandbox/repo",
  });

  // Re-verify after fix
  const relint = await engine.execute({
    runtime: "agent",
    cmd: "npm run lint 2>&1",
    workdir: "/sandbox/repo",
    timeoutMs: 120_000,
  });
  const rebuild = await engine.execute({
    runtime: "agent",
    cmd: "npm run build 2>&1",
    workdir: "/sandbox/repo",
    timeoutMs: 300_000,
  });

  if (relint.exitCode === 0 && rebuild.exitCode === 0) break;
  if (round === maxFixRounds) throw new Error("Still failing after max fix rounds");
}

Diminishing returns set in quickly. Two fix rounds is the practical ceiling — after that, hand off to humans rather than burning more tokens.

Step 5: Ship — commit and open a PR

The commit and PR steps are also delegated to the agent. Two shell patterns matter:

// Commit — write message to a file, use -F not -m (avoids interactive editor traps)
await engine.execute({
  runtime: "agent",
  code: `Write a conventional commit message to /tmp/commit-msg.txt, then run:
git add -A && git commit -F /tmp/commit-msg.txt && git push -u origin ${branch}`,
  agentFlags: "--model anthropic/claude-sonnet-4-5 --no-session",
  timeoutMs: 300_000,
  workdir: "/sandbox/repo",
});

// PR — write body to a file, use --body-file not --body (avoids shell escaping issues)
const prResult = await engine.execute({
  runtime: "agent",
  code: `Write the PR title to /tmp/pr-title.txt and body to /tmp/pr-body.md, then run:
gh pr create --title "$(cat /tmp/pr-title.txt)" --body-file /tmp/pr-body.md --base main --head ${branch}
Output the resulting PR URL on its own line.`,
  agentFlags: "--model anthropic/claude-sonnet-4-5 --no-session",
  timeoutMs: 300_000,
  workdir: "/sandbox/repo",
});

// Extract the PR URL from output
const PR_URL_REGEX = /https:\/\/github\.com\/[^\s]+\/pull\/(\d+)/;
const prMatch = (prResult.stdout + prResult.stderr).match(PR_URL_REGEX);
const prUrl = prMatch?.[0];

Full pipeline

Putting it all together, the orchestrator function:

async function runOneShot(task: string, repo: string, branch: string) {
  const engine = new DockerIsol8({
    mode: "persistent",
    network: "host",
    timeoutMs: 1_800_000,
    memoryLimit: "4g",
    cpuLimit: 2,
    pidsLimit: 200,
    sandboxSize: "4g",
    image: "isol8:agent",
    secrets: {
      GITHUB_TOKEN: process.env.GITHUB_TOKEN!,
      ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY!,
    },
  });

  await engine.start();

  try {
    // 1. Setup
    await engine.execute({
      runtime: "agent",
      setupScript: buildSetupScript(repo, branch),
      code: 'echo "setup complete"',
      timeoutMs: 300_000,
    });

    // 2. Implement
    await engine.execute({
      runtime: "agent",
      code: buildPrompt(task, repo, branch),
      agentFlags: "--model anthropic/claude-sonnet-4-5 --no-session",
      timeoutMs: 1_200_000,
      workdir: "/sandbox/repo",
    });

    // 3. Verify
    const lint = await engine.execute({
      runtime: "agent",
      cmd: "npm run lint 2>&1",
      workdir: "/sandbox/repo",
      timeoutMs: 120_000,
    });
    const build = await engine.execute({
      runtime: "agent",
      cmd: "npm run build 2>&1",
      workdir: "/sandbox/repo",
      timeoutMs: 300_000,
    });

    // 4. Fix loop (if needed)
    if (lint.exitCode !== 0 || build.exitCode !== 0) {
      await runFixLoop(engine, lint, build, branch);
    }

    // 5. Ship
    const prUrl = await commitAndCreatePR(engine, branch);
    return prUrl;
  } finally {
    await engine.stop();
  }
}

Streaming progress to clients

Every step can use executeStream() to emit progress in real-time. The phase field on each StreamEvent distinguishes setup output from agent output:

for await (const event of engine.executeStream({
  runtime: "agent",
  setupScript: buildSetupScript(repo, branch),
  code: buildPrompt(task, repo, branch),
  agentFlags: "--model anthropic/claude-sonnet-4-5 --no-session",
})) {
  if (event.phase === "setup") {
    onProgress({ step: "setup", data: event.data });
  } else if (event.type === "stdout") {
    onProgress({ step: "implement", data: event.data });
  } else if (event.type === "exit") {
    onProgress({ step: "implement", data: `exited: ${event.data}` });
  }
}

How you relay these events to the client depends on your architecture — SSE, WebSockets, a message queue, or writing to a database that the client polls. The orchestrator produces StreamEvents; what the client does with them is up to you.

If the setupScript exits non-zero, the stream yields a { type: "error", phase: "setup" } event followed by an exit event, and the agent never starts. Filter on phase to surface setup failures separately from agent failures.

Concurrency and cancellation

When running multiple tasks concurrently, use a job queue with bounded concurrency. Each task should get its own AbortController for cancellation — aborting triggers engine.stop() to destroy the container immediately.

import PQueue from "p-queue";

const queue = new PQueue({ concurrency: 3 });
const controllers = new Map<string, AbortController>();

function enqueueTask(taskId: string, task: string, repo: string, branch: string) {
  const controller = new AbortController();
  controllers.set(taskId, controller);

  queue.add(async () => {
    if (controller.signal.aborted) return;
    await runOneShot(task, repo, branch);
  }, { signal: controller.signal });
}

function cancelTask(taskId: string) {
  controllers.get(taskId)?.abort();
}

Inside the orchestrator, wire the abort signal to engine.stop():

const onAbort = () => engine.stop().catch(() => undefined);
signal?.addEventListener("abort", onAbort, { once: true });

try {
  // ... all steps ...
} finally {
  signal?.removeEventListener("abort", onAbort);
  await engine.stop();
}

Security considerations

For production deployments — especially multi-tenant or untrusted-task workloads — use network: "filtered" with an explicit allowlist instead of network: "host":

const engine = new DockerIsol8({
  mode: "persistent",
  network: "filtered",
  networkFilter: {
    whitelist: [
      "^api\\.anthropic\\.com$",       // LLM API
      "^github\\.com$",                // git clone + push
      "^api\\.github\\.com$",          // gh CLI (pr create, auth)
      "^registry\\.npmjs\\.org$",      // npm installs
    ],
    blacklist: [
      "^169\\.254\\.",                  // link-local (cloud metadata)
      "^10\\.",                         // RFC 1918
      "^172\\.(1[6-9]|2[0-9]|3[01])\\.", // RFC 1918
    ],
  },
  secrets: {
    GITHUB_TOKEN: githubToken,
    ANTHROPIC_API_KEY: anthropicKey,
  },
  memoryLimit: "4g",
  cpuLimit: 2,
  pidsLimit: 200,
  maxOutputSize: 10 * 1024 * 1024,
});

All other isol8 isolation guarantees apply regardless of network mode: read-only root filesystem, non-root sandbox user, seccomp syscall filtering, /sandbox tmpfs, and automatic secret masking in output.

Agent in a Box

Full reference for the agent runtime: flags, networking, file injection, and streaming.

Setup scripts

Image-level vs request-level setup, execution order, streaming output, and error handling.

AI agent code execution

Foundational patterns for LLM tool-call loops with isol8.

Security model

Network controls, seccomp, secret masking, and isolation boundaries.

Remote server

Deploy isol8 as a centralized execution server for agent fleets.

​Architecture

​The container: one persistent session per task

​Step 1: Setup — clone before the agent starts

​Step 2: Implement — the agent gets the task

​Step 3: Verify — lint and build

​Step 4: Fix loop — automated retry

​Step 5: Ship — commit and open a PR

​Full pipeline

​Streaming progress to clients

​Concurrency and cancellation

​Security considerations

​Related pages

Agent in a Box

Setup scripts

AI agent code execution

Security model

Remote server

Architecture

The container: one persistent session per task

Step 1: Setup — clone before the agent starts

Step 2: Implement — the agent gets the task

Step 3: Verify — lint and build

Step 4: Fix loop — automated retry

Step 5: Ship — commit and open a PR

Full pipeline

Streaming progress to clients

Concurrency and cancellation

Security considerations

Related pages