Use this guide when user code needs internet access, but you still need strict control over which destinations can be reached.
Diagram: Policy-enforced scraping flow
Start with filtered mode
Enable only approved hosts first, then run scraping logic.
Library:

import { DockerIsol8 } from "@isol8/core";

const engine = new DockerIsol8({
  mode: "ephemeral",
  network: "filtered",
  networkFilter: {
    whitelist: [
      "^api\\.github\\.com$",
      "^en\\.wikipedia\\.org$",
    ],
    blacklist: ["^169\\.254\\."],
  },
  timeoutMs: 30000,
  memoryLimit: "512m",
});

await engine.start();

CLI:

isol8 run scraper.py \
  --net filtered \
  --allow "^api\.github\.com$" \
  --allow "^en\.wikipedia\.org$" \
  --deny "^169\.254\."

API request body:

{
  "request": {
    "code": "print('scrape')",
    "runtime": "python"
  },
  "options": {
    "network": "filtered",
    "networkFilter": {
      "whitelist": ["^api\\.github\\.com$"],
      "blacklist": ["^169\\.254\\."]
    }
  }
}
In filtered mode, blacklist rules take precedence over whitelist rules.
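The precedence rule can be illustrated with a small standalone sketch. This is not isol8's implementation: the function name `is_host_allowed` is hypothetical, and the choice to block hosts that match neither list is an assumption made only for this example.

```python
import re


def is_host_allowed(host, whitelist, blacklist):
    """Illustrative deny-over-allow check (hypothetical helper, not isol8 code)."""
    # A blacklist match always wins, even if a whitelist pattern also matches.
    if any(re.search(pattern, host) for pattern in blacklist):
        return False
    # Assumption for this sketch: a host must match at least one whitelist pattern.
    return any(re.search(pattern, host) for pattern in whitelist)


# The last whitelist entry is deliberately in conflict with the blacklist.
whitelist = [r"^api\.github\.com$", r"^en\.wikipedia\.org$", r"^169\.254\.169\.254$"]
blacklist = [r"^169\.254\."]

for host in ["api.github.com", "169.254.169.254", "example.com"]:
    verdict = "ALLOW" if is_host_allowed(host, whitelist, blacklist) else "BLOCK"
    print(f"{verdict} {host}")
```

In this sketch `169.254.169.254` is blocked even though it appears in the whitelist, because the blacklist pattern is checked first.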
Pattern 1: approved API fetch
const result = await engine.execute({
  runtime: "python",
  code: `
import urllib.request, json

url = "https://api.github.com/repos/Illusion47586/isol8"
resp = urllib.request.urlopen(url)
data = json.loads(resp.read())
print(json.dumps({
    "repo": data["full_name"],
    "stars": data["stargazers_count"]
}))
`,
});

console.log(result.stdout);
Pattern 2: graceful handling for blocked hosts
const result = await engine.execute({
  runtime: "python",
  code: `
import urllib.request

targets = [
    "https://api.github.com",
    "https://example-blocked-domain.invalid"
]
for url in targets:
    try:
        urllib.request.urlopen(url, timeout=5)
        print(f"ALLOW {url}")
    except Exception as e:
        print(f"BLOCK {url}: {e}")
`,
});
Pattern 3: scraping HTML with packages
For richer parsing, install parser libraries:
const result = await engine.execute({
  runtime: "python",
  installPackages: ["requests", "beautifulsoup4"],
  code: `
import requests
from bs4 import BeautifulSoup

resp = requests.get("https://en.wikipedia.org/wiki/Docker_(software)", timeout=10)
resp.raise_for_status()  # fail on HTTP errors instead of parsing an error page
soup = BeautifulSoup(resp.text, "html.parser")
# select_one returns None when nothing matches, so guard before using the result
first_p = soup.select_one(".mw-parser-output > p:not(.mw-empty-elt)")
print(first_p.get_text(strip=True)[:300] if first_p else "no paragraph found")
`,
});
Authenticated API calls with secrets
When scraping private APIs, inject credentials using secrets.
const secured = new DockerIsol8({
  mode: "ephemeral",
  network: "filtered",
  networkFilter: {
    whitelist: ["^api\\.example\\.com$"],
    blacklist: [],
  },
  secrets: {
    API_TOKEN: process.env.API_TOKEN!,
  },
});

const result = await secured.execute({
  runtime: "python",
  code: `
import os, urllib.request, json

req = urllib.request.Request(
    "https://api.example.com/data",
    headers={"Authorization": f"Bearer {os.environ['API_TOKEN']}"}
)
resp = urllib.request.urlopen(req)
print(resp.status)
`,
});
Secret masking applies only to stdout/stderr text. If the script writes secrets to files, the contents of those files are not automatically redacted.
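Since file contents are not masked, a script that persists debug output may want to redact secret values itself before writing. The following is a minimal sketch of that idea; `redact` is a hypothetical helper written for this example, not an isol8 API, and the token is a hard-coded stand-in for the value a real script would read from `os.environ["API_TOKEN"]`.

```python
def redact(text, secret_values, mask="***"):
    """Replace every occurrence of each non-empty secret value with a mask."""
    # Replace longer values first so a secret that contains another secret
    # is masked as a whole rather than partially.
    for value in sorted(secret_values, key=len, reverse=True):
        if value:
            text = text.replace(value, mask)
    return text


token = "demo-token-123"  # stand-in for os.environ["API_TOKEN"] in a real run
debug_line = f"GET /data Authorization: Bearer {token}"

safe_line = redact(debug_line, [token])
print(safe_line)  # this is the text that would be written to a file
```

Redacting at write time keeps the secret out of any artifact that leaves the sandbox, independent of what the stdout/stderr masking does.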
Observe network behavior during scraping
Enable network request logs for filtered runs:
isol8 run scraper.py \
--net filtered \
--allow "^api\.github\.com$" \
--log-network \
--no-stream
In non-stream mode, the CLI prints the collected network log entries when they are available.
Remote scraping workers
For centralized scraping infrastructure, run a remote server and connect to it with RemoteIsol8.
import { RemoteIsol8 } from "@isol8/core";

const remote = new RemoteIsol8(
  {
    host: "http://localhost:3000",
    apiKey: process.env.ISOL8_API_KEY!,
    sessionId: "scrape-job-001",
  },
  {
    network: "filtered",
    networkFilter: {
      whitelist: ["^api\\.github\\.com$"],
      blacklist: [],
    },
    timeoutMs: 30000,
  }
);

await remote.start();

const res = await remote.execute({
  runtime: "python",
  code: "print('remote scrape run')",
});

await remote.stop();
Safer scraping design patterns
- whitelist exact hostnames instead of broad wildcards
- keep timeouts short for external requests
- parse to structured output (JSON) rather than raw HTML dumps
- separate fetch and parse stages to isolate failures
- pre-bake stable dependencies to avoid per-run install overhead
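Several of these patterns can be combined in one script. The sketch below is one possible shape, not a prescribed isol8 structure: the names `fetch`, `parse`, `run`, and `TitleParser` are hypothetical, and the demo swaps in a stubbed fetcher with sample HTML so it runs without network access.

```python
import json
import urllib.request
from html.parser import HTMLParser


def fetch(url, timeout=5):
    """Fetch stage: network I/O only, with a short timeout."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


class TitleParser(HTMLParser):
    """Collects the text inside the <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse(html):
    """Parse stage: pure function from HTML text to a small dict."""
    parser = TitleParser()
    parser.feed(html)
    return {"title": parser.title.strip()}


def run(url, fetcher=fetch):
    """Run both stages and report which one failed, if any."""
    try:
        html = fetcher(url)
    except Exception as e:
        return {"url": url, "ok": False, "stage": "fetch", "error": str(e)}
    try:
        data = parse(html)
    except Exception as e:
        return {"url": url, "ok": False, "stage": "parse", "error": str(e)}
    return {"url": url, "ok": True, "data": data}


# Demo with a stubbed fetcher; drop the fetcher argument to use the network.
sample = "<html><head><title> Docker (software) </title></head></html>"
result = run("https://en.wikipedia.org/wiki/Docker_(software)", fetcher=lambda _url: sample)
print(json.dumps(result))
```

Because the script prints a single JSON object, the host can call `JSON.parse` on `result.stdout` and tell a blocked or failed fetch apart from a parsing problem by checking the `stage` field.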
Related pages
- Security model: Understand filtered mode enforcement and seccomp boundaries.
- Remote server and client: Run scraping workloads with centralized session/policy management.
- Execution guide: Execution request fields, streaming, and output behavior.
- Option mapping: Exact CLI/config/API/library mapping for network and runtime options.