Use isol8 to run untrusted or user-provided data-processing scripts safely. The pattern is common on platforms where users upload data and define custom transformations, or where you need to run third-party analysis scripts without putting the host at risk.

Basic Data Transformation

Pass data in via stdin, process it with user-provided code, and capture the result:
import { DockerIsol8 } from "isol8";

const isol8 = new DockerIsol8({
  mode: "ephemeral",
  network: "none",
  memoryLimit: "1g",      // Data processing may need more memory
  timeoutMs: 60000,       // Allow longer for large datasets
  sandboxSize: "512m",    // Space for intermediate files
});

await isol8.start();

// User-provided transformation
const transformCode = `
import json, sys

data = json.load(sys.stdin)
result = [
    {**row, "total": row["price"] * row["quantity"]}
    for row in data
]
json.dump(result, sys.stdout, indent=2)
`;

const inputData = JSON.stringify([
  { item: "Widget", price: 9.99, quantity: 100 },
  { item: "Gadget", price: 24.99, quantity: 50 },
]);

const result = await isol8.execute({
  code: transformCode,
  runtime: "python",
  stdin: inputData,
});

const transformed = JSON.parse(result.stdout);
console.log(transformed);
// [{ item: "Widget", price: 9.99, quantity: 100, total: 999 }, ...]
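Since user code controls what gets printed, `result.stdout` is not guaranteed to be valid JSON — a crashed script prints a traceback instead. A small defensive wrapper (a hypothetical helper, not part of isol8) keeps the failure readable:

```typescript
// Hypothetical helper: parse sandbox stdout defensively, surfacing a
// truncated preview of the output instead of a bare SyntaxError.
function parseSandboxJson<T>(stdout: string): T {
  try {
    return JSON.parse(stdout) as T;
  } catch {
    throw new Error(`Sandbox returned non-JSON output: ${stdout.slice(0, 200)}`);
  }
}
```

Then `parseSandboxJson<TransformedRow[]>(result.stdout)` replaces the bare `JSON.parse` call when you want a clearer error path.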

CSV Processing with pandas

Install pandas on the fly for more complex data work:
const result = await isol8.execute({
  code: `
import pandas as pd
import json

df = pd.read_csv("/sandbox/data.csv")

# User-defined aggregation
summary = df.groupby("category").agg({
    "revenue": ["sum", "mean", "count"],
    "profit_margin": "mean"
}).round(2)

# Flatten the MultiIndex columns (("revenue", "sum") -> "revenue_sum")
# so to_json emits plain string keys
summary.columns = ["_".join(col) for col in summary.columns]

print(summary.to_json())
`,
  runtime: "python",
  installPackages: ["pandas"],
  files: {
    "/sandbox/data.csv": csvContent, // Pass CSV as a file
  },
});

const summary = JSON.parse(result.stdout);
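With pandas' default orient (`"columns"`), `to_json` produces `{ column: { indexLabel: value } }` — column-major, which is awkward to iterate in TypeScript. A small pivot helper (hypothetical, not part of isol8; the `category` key mirrors the groupby column above) turns it into one record per group:

```typescript
// Pivot pandas' column-major JSON ({ col: { label: value } }) into
// row records: [{ category: label, col1: ..., col2: ... }, ...]
function columnsToRows(
  byColumn: Record<string, Record<string, number>>,
): Record<string, number | string>[] {
  const labels = new Set<string>();
  for (const col of Object.values(byColumn)) {
    for (const label of Object.keys(col)) labels.add(label);
  }
  return [...labels].map((label) => {
    const row: Record<string, number | string> = { category: label };
    for (const [col, values] of Object.entries(byColumn)) {
      row[col] = values[label];
    }
    return row;
  });
}
```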

File-Based Pipelines

For complex pipelines, inject input files and retrieve output files:
const isol8 = new DockerIsol8({ mode: "persistent" });
await isol8.start();

// Step 1: Upload raw data
await isol8.putFile("/sandbox/raw.json", JSON.stringify(rawData));

// Step 2: Run cleaning script
await isol8.execute({
  code: `
import json

with open("/sandbox/raw.json") as f:
    data = json.load(f)

# Remove nulls, normalize strings
cleaned = [
    {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}
    for row in data
    if all(v is not None for v in row.values())
]

with open("/sandbox/cleaned.json", "w") as f:
    json.dump(cleaned, f)
print(f"Cleaned {len(data)} -> {len(cleaned)} records")
`,
  runtime: "python",
});

// Step 3: Run analysis script
await isol8.execute({
  code: `
import json

with open("/sandbox/cleaned.json") as f:
    data = json.load(f)

# Compute statistics
stats = {
    "count": len(data),
    "fields": list(data[0].keys()) if data else [],
}

with open("/sandbox/report.json", "w") as f:
    json.dump(stats, f, indent=2)
print("Report generated")
`,
  runtime: "python",
});

// Step 4: Retrieve the report
const report = await isol8.getFile("/sandbox/report.json");
console.log(report.toString());

await isol8.stop();
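In the pipeline above, an exception in any step would skip the final `stop()` and leak the persistent container. A small host-side wrapper (a pattern sketch, not an isol8 API — it only assumes the `start`/`stop` methods shown above) guarantees cleanup:

```typescript
// Minimal lifecycle guard: stop() always runs, even when a step throws.
interface Stoppable {
  start(): Promise<void>;
  stop(): Promise<void>;
}

async function withSandbox<S extends Stoppable, T>(
  sandbox: S,
  fn: (s: S) => Promise<T>,
): Promise<T> {
  await sandbox.start();
  try {
    return await fn(sandbox);
  } finally {
    await sandbox.stop(); // Clean up the container unconditionally
  }
}
```

Usage: `const report = await withSandbox(new DockerIsol8({ mode: "persistent" }), async (s) => { /* steps 1-4 */ });`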

Generating Charts and Visualizations

Generate plots with matplotlib and retrieve the image:
const result = await isol8.execute({
  code: `
import matplotlib
matplotlib.use('Agg')  # Non-interactive backend
import matplotlib.pyplot as plt
import json

with open("/sandbox/data.json") as f:
    data = json.load(f)

categories = [d["category"] for d in data]
values = [d["value"] for d in data]

plt.figure(figsize=(10, 6))
plt.bar(categories, values, color='#0E7C6B')
plt.title("Sales by Category")
plt.ylabel("Revenue ($)")
plt.tight_layout()
plt.savefig("/sandbox/chart.png", dpi=150)
print("Chart saved")
`,
  runtime: "python",
  installPackages: ["matplotlib"],
  files: {
    "/sandbox/data.json": JSON.stringify(chartData),
  },
  outputPaths: ["/sandbox/chart.png"],
});

// result.files["/sandbox/chart.png"] contains the PNG as base64
const chartBase64 = result.files?.["/sandbox/chart.png"];
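Assuming `result.files` maps sandbox paths to base64 strings as in the comment above (check the return shape of your isol8 version), decoding the PNG on the host is straightforward in Node:

```typescript
import { writeFileSync } from "node:fs";
import { join } from "node:path";
import { tmpdir } from "node:os";

// Decode a base64 payload back into raw PNG bytes
function decodeBase64Png(base64: string): Buffer {
  return Buffer.from(base64, "base64");
}

// Write the decoded chart somewhere the host can serve it from
function saveChart(base64: string, filename: string): string {
  const dest = join(tmpdir(), filename);
  writeFileSync(dest, decodeBase64Png(base64));
  return dest;
}
```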

Parallel Processing

Process multiple datasets concurrently using separate containers:
async function processDatasets(datasets: Record<string, number>[][]) {
  const tasks = datasets.map(async (dataset, i) => {
    const result = await isol8.execute({
      code: `
import json, sys
data = json.load(sys.stdin)
# Process each dataset independently
total = sum(row.get("amount", 0) for row in data)
print(json.dumps({"dataset": ${i}, "total": total}))
`,
      runtime: "python",
      stdin: JSON.stringify(dataset),
    });

    return JSON.parse(result.stdout);
  });

  // isol8's concurrency semaphore limits parallel execution automatically
  return Promise.all(tasks);
}
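If you also want an explicit host-side cap on parallelism (independent of whatever limiting isol8 applies internally), a minimal promise limiter is a few lines — this is a generic sketch, not an isol8 feature:

```typescript
// Create a limiter that allows at most `max` tasks in flight; excess
// callers wait in a FIFO queue until a slot frees up.
function createLimiter(max: number) {
  let active = 0;
  const queue: (() => void)[] = [];
  const release = () => {
    active--;
    queue.shift()?.(); // Wake the next waiter, if any
  };
  return async function limit<T>(task: () => Promise<T>): Promise<T> {
    if (active >= max) {
      await new Promise<void>((resolve) => queue.push(resolve));
    }
    active++;
    try {
      return await task();
    } finally {
      release();
    }
  };
}
```

Usage: `const limit = createLimiter(4); await Promise.all(datasets.map((d) => limit(() => runDataset(d))));`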

Node.js Data Processing

Not everything has to be Python — use Node.js for JSON-heavy workloads:
const result = await isol8.execute({
  code: `
const data = JSON.parse(require("fs").readFileSync(0, "utf8")); // fd 0 = stdin

const grouped = data.reduce((acc, item) => {
  const key = item.region;
  if (!acc[key]) acc[key] = [];
  acc[key].push(item);
  return acc;
}, {});

const summary = Object.entries(grouped).map(([region, items]) => ({
  region,
  count: items.length,
  avgRevenue: (items.reduce((s, i) => s + i.revenue, 0) / items.length).toFixed(2),
}));

console.log(JSON.stringify(summary, null, 2));
`,
  runtime: "node",
  stdin: JSON.stringify(salesData),
});
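The reduce-based grouping in that payload is plain JavaScript, so a typed host-side equivalent (a hypothetical helper, not part of isol8) is handy for sanity-checking sandbox output against a local computation on a small sample:

```typescript
interface SaleRow {
  region: string;
  revenue: number;
}

// Mirror the sandbox payload: group by region, then count and average
function summarizeByRegion(data: SaleRow[]) {
  const grouped = new Map<string, SaleRow[]>();
  for (const row of data) {
    const bucket = grouped.get(row.region) ?? [];
    bucket.push(row);
    grouped.set(row.region, bucket);
  }
  return [...grouped.entries()].map(([region, items]) => ({
    region,
    count: items.length,
    avgRevenue: (
      items.reduce((s, r) => s + r.revenue, 0) / items.length
    ).toFixed(2),
  }));
}
```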