Sandbox SDK -- let an agent run a CLI process

Step 0: Navigate to the lesson directory

Open your terminal and navigate to the lesson-13-sandbox directory:

cd ..\lesson-13-sandbox

This directory’s wrangler.jsonc already sets STATE to state-6-sandbox, which turns on the Sandbox CLI-enrichment panel in the analyst console. The panel is a faithful simulation in this workshop build, so you can complete the lesson on any account without Containers access.

Why CLI enrichment matters

In lessons 11-12, the Network Analyzer assessed destination reputation and infrastructure signals — but it was working entirely from what the LLM already knows. Ask a model “is 203.0.113.88 a known-bad IP?” and you’ll get a plausible-sounding answer that may be completely fabricated. The model has no live view of DNS records, WHOIS registrations, or current infrastructure.

Real security triage needs ground truth. A dig command returns actual DNS records. A whois lookup returns the real registrant, registration date, and hosting provider. These are facts, not inferences — and they dramatically change the quality of the network analysis.

The Sandbox SDK lets you run arbitrary CLI commands inside an isolated, ephemeral container on Cloudflare’s network. The sandbox starts on demand, executes the command, returns stdout/stderr, and can be destroyed immediately after. It’s designed for exactly this pattern: give an agent access to real tools without exposing your Worker’s runtime to untrusted processes.

How it works

The flow for the enhanced Network Analyzer:

Incident data ── Network Analyzer ─┬─ Create Sandbox
                                    ├─ sandbox.exec("dig example.com")
                                    ├─ sandbox.exec("whois 203.0.113.88")
                                    ├─ Capture stdout from each command
                                    ├─ Destroy Sandbox
                                    └─ Feed real CLI output into analysis prompt

The analyzer creates a sandbox, runs one or two enrichment commands based on the incident’s network indicators, captures the output, then includes that raw output in the prompt to Workers AI. The model now has real data to analyze instead of guessing.
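The flow above can be sketched in a few lines. This is a minimal illustration, not the workshop's implementation: `SandboxLike`, `ExecResult`, `Enrichment`, `enrich`, and `buildPrompt` are hypothetical names, the interface is modelled only on the exec and destroy calls shown in this lesson, and the prompt wording is invented for the example.

```typescript
// Hypothetical minimal shape of a sandbox handle, modelled on the exec/destroy
// calls shown in this lesson. It is not the SDK's own type definition.
interface ExecResult {
  stdout: string;
  stderr: string;
  exitCode: number;
}

interface SandboxLike {
  exec(command: string): Promise<ExecResult>;
  destroy(): Promise<void>;
}

interface Enrichment {
  command: string;
  output: string;
}

// Run each enrichment command, keep its output, and always destroy the sandbox,
// even if one of the commands throws.
async function enrich(sandbox: SandboxLike, commands: string[]): Promise<Enrichment[]> {
  const results: Enrichment[] = [];
  try {
    for (const command of commands) {
      const { stdout, stderr, exitCode } = await sandbox.exec(command);
      // Record failures explicitly so the model is told a lookup failed
      // instead of being left to guess.
      const output =
        exitCode === 0
          ? stdout.trim()
          : `command failed (exit ${exitCode}): ${stderr.trim()}`;
      results.push({ command, output });
    }
  } finally {
    await sandbox.destroy();
  }
  return results;
}

// Embed the raw CLI output in the analysis prompt (hypothetical wording).
function buildPrompt(incidentSummary: string, enrichments: Enrichment[]): string {
  const evidence = enrichments
    .map((e) => `$ ${e.command}\n${e.output}`)
    .join("\n\n");
  return `Incident: ${incidentSummary}\n\nCLI evidence:\n${evidence}\n\nBase the network assessment on the evidence above.`;
}
```

Wrapping the commands in try/finally is a design choice in this sketch: the sandbox is torn down even when a command fails, so a failed lookup does not leave a container running.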

Step 1: Replace your database ID and redeploy

This lesson is a STATE change — the directory is already at state-6-sandbox, which switches on the Sandbox CLI-enrichment panel. No new binding is required for the workshop build.

Replace REPLACE_WITH_YOUR_DATABASE_ID in lesson-13-sandbox\wrangler.jsonc with your D1 database_id from lesson 06, then redeploy:

npm run deploy

Step 2: See the CLI enrichment

Open your app and investigate the Malware + C2 Beacon incident. During triage, the Network Analyzer lane now shows a Sandbox CLI enrichment panel with:

The commands that were run (dig +short, whois)
Their output (resolved IP, origin ASN, registration age, reputation)

The panel is built from the incident’s network indicators. It demonstrates the pattern: give the agent real tool output instead of asking the model to guess DNS or WHOIS facts. Compare this with the generic network analysis from lesson 11: a statement such as “WHOIS shows this IP was registered 3 days ago” is a fact, not an inference.
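In a live integration, a field such as the resolved IP would have to be extracted from raw command output. The sketch below shows one way to do that for `dig +short`, which prints one answer per line. `parseDigShort`, `PanelRow`, and `toPanelRow` are hypothetical names for illustration; the workshop panel itself is simulated and does not necessarily work this way.

```typescript
// Extract IPv4 addresses from `dig +short` output.
// Lines that are not plain IPv4 addresses (for example CNAME targets ending
// in a dot, blank lines, or ";;" diagnostic lines) are skipped.
function parseDigShort(stdout: string): string[] {
  const ipv4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;
  return stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => {
      const match = ipv4.exec(line);
      // Each captured octet must be in the range 0..255.
      return match !== null && match.slice(1).every((octet) => Number(octet) <= 255);
    });
}

// Hypothetical shape of one row in an enrichment panel.
interface PanelRow {
  command: string;
  resolvedIps: string[];
}

function toPanelRow(command: string, stdout: string): PanelRow {
  return { command, resolvedIps: parseDigShort(stdout) };
}
```

An empty `resolvedIps` array is itself useful evidence: it tells the model the lookup returned no A records, which it would otherwise have no way of knowing.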

Wiring it live (optional, beyond the workshop)

A live integration adds the Sandbox binding to wrangler.jsonc:

"containers": [{ "class_name": "Sandbox", "image": "./Dockerfile", "instance_type": "lite", "max_instances": 1 }],
"durable_objects": { "bindings": [{ "class_name": "Sandbox", "name": "SANDBOX" }] },
"migrations": [{ "new_sqlite_classes": ["Sandbox"], "tag": "v2" }]

The integration also re-exports the Sandbox class from the Worker entry (export { Sandbox } from '@cloudflare/sandbox') and calls it from the Network Analyzer:

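import { getSandbox } from "@cloudflare/sandbox";

// One sandbox per incident, addressed by a stable ID.
const sandbox = getSandbox(env.SANDBOX, `network-${incident.id}`);
// domain and ip come from incident data; validate them before interpolating them into a command.
const digResult = await sandbox.exec(`dig ${domain} +short`);
const whoisResult = await sandbox.exec(`whois ${ip}`);
// Tear the container down once the output has been captured.
await sandbox.destroy();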

The raw stdout then feeds the analysis prompt — real DNS/WHOIS data instead of guesses. Each sandbox runs in its own ephemeral container with no access to your Worker’s bindings or secrets — the container is the trust boundary.
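Even with the container as the trust boundary, the domain and IP are interpolated into a shell command, and whatever that command prints ends up in the analysis prompt. A minimal sketch of validating those values first might look like the following. `isSafeHostname`, `isIPv4`, `digCommand`, and `whoisCommand` are hypothetical helpers, not part of the Sandbox SDK, and the rules are deliberately conservative (no IPv6, no internationalized names, no trailing dot).

```typescript
// Accept only plain hostnames: dot-separated labels made of letters, digits
// and hyphens, each starting and ending with a letter or digit.
function isSafeHostname(value: string): boolean {
  if (value.length === 0 || value.length > 253) return false;
  const label = /^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$/;
  return value.split(".").every((part) => label.test(part));
}

// Accept only dotted-quad IPv4 addresses with each octet in 0..255.
function isIPv4(value: string): boolean {
  const parts = value.split(".");
  return (
    parts.length === 4 &&
    parts.every((part) => /^\d{1,3}$/.test(part) && Number(part) <= 255)
  );
}

// Build the dig command only from a validated hostname.
function digCommand(domain: string): string {
  if (!isSafeHostname(domain)) {
    throw new Error(`refusing to run dig on unsafe value: ${JSON.stringify(domain)}`);
  }
  return `dig ${domain} +short`;
}

// Build the whois command only from a validated IPv4 address.
function whoisCommand(ip: string): string {
  if (!isIPv4(ip)) {
    throw new Error(`refusing to run whois on unsafe value: ${JSON.stringify(ip)}`);
  }
  return `whois ${ip}`;
}
```

With these helpers, the calls above would become `sandbox.exec(digCommand(domain))` and `sandbox.exec(whoisCommand(ip))`. A value such as `example.com; cat /etc/passwd` is rejected before it reaches the container.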

What comes next

In lesson 14, the Response Agent gains a similar execution capability — but instead of running existing CLI tools, it will generate and execute custom Worker code using Dynamic Workers. The agent writes a purpose-built IOC scoring function, deploys it on the fly, runs it, and returns the result — all within the triage flow.

Key takeaways

Tool output gives the model ground truth

Sandbox SDK isolates arbitrary command execution

Evidence-backed analysis beats plausible analysis