
Day 3 of Windmill launch week. You can now run AI coding agents like Claude Code or Codex in sandboxed environments with persistent storage, directly from your scripts and flows.
The problem
AI coding agents need two things that are hard to combine: isolation and persistence. You want them sandboxed so they cannot access the host filesystem or network. But you also want them to remember state across runs, produce artifacts, and pick up where they left off.
Teams end up managing Docker containers, mounting volumes manually, and writing wrapper scripts to handle session state. The orchestration layer has no opinion about where the agent runs or how its files persist.
AI sandboxes: two annotations
An AI sandbox is a regular Windmill script with two annotations: one for isolation, one for storage.
TypeScript:

```typescript
// sandbox
// volume: agent-state .agent
import Anthropic from '@anthropic-ai/sdk';

export async function main(prompt: string) {
  const client = new Anthropic();
  // The .agent directory persists across runs
  const result = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [{ role: 'user', content: prompt }],
  });
  return result;
}
```
Python:

```python
# sandbox
# volume: agent-state .agent
import anthropic

def main(prompt: str):
    client = anthropic.Anthropic()
    # The .agent directory persists across runs
    result = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return result
```
The `// sandbox` annotation enables NSJAIL process isolation. `// volume: agent-state .agent` mounts a persistent volume synced to your workspace object storage. That's it.
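To make the persistence concrete, here is a minimal sketch that keeps a run counter in the mounted volume. The `state.json` file name is illustrative, not part of any template; the point is simply that anything written under `.agent` survives to the next run:

```typescript
// sandbox
// volume: agent-state .agent
import * as fs from 'fs';
import * as path from 'path';

export async function main() {
  // Files under .agent are synced to object storage between runs
  const statePath = path.join('.agent', 'state.json');
  fs.mkdirSync('.agent', { recursive: true });

  let state = { runs: 0 };
  if (fs.existsSync(statePath)) {
    state = JSON.parse(fs.readFileSync(statePath, 'utf8'));
  }

  state.runs += 1;
  fs.writeFileSync(statePath, JSON.stringify(state));
  return state.runs; // increments on every execution
}
```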
Why we built it this way
Three design choices drove the architecture:
Process isolation with NSJAIL. Each execution runs in its own NSJAIL sandbox with filesystem isolation, network restrictions, and resource limits. The agent cannot access the host system or other jobs. You can force sandboxing instance-wide for all scripts.
Persistent volumes on object storage. Files in the mounted volume are synced to your workspace S3 (or Azure Blob, GCS) between runs. A per-worker LRU cache (up to 10 GB) avoids re-downloading on consecutive runs. Exclusive leasing prevents concurrent writes to the same volume.
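The per-worker cache can be pictured as a size-capped LRU keyed by volume name. This is a hypothetical sketch of the idea, not Windmill's actual implementation; the class and method names are illustrative:

```typescript
// Hypothetical size-capped LRU cache of downloaded volumes.
// Insertion order in a Map doubles as recency order: re-inserting
// a key moves it to the back, so the front is least recently used.
class VolumeCache {
  private entries = new Map<string, number>(); // volume name -> size in bytes
  private total = 0;
  constructor(private maxBytes: number) {}

  // Record that a volume was just used (downloaded or reused)
  touch(volume: string, sizeBytes: number) {
    if (this.entries.has(volume)) {
      this.total -= this.entries.get(volume)!;
      this.entries.delete(volume); // re-insert to mark as most recently used
    }
    this.entries.set(volume, sizeBytes);
    this.total += sizeBytes;
    // Evict least recently used volumes until back under the cap
    while (this.total > this.maxBytes && this.entries.size > 1) {
      const [oldest, size] = this.entries.entries().next().value!;
      this.entries.delete(oldest);
      this.total -= size;
    }
  }

  has(volume: string) {
    return this.entries.has(volume);
  }
}
```

With a 10 GB cap, a worker that keeps re-running the same agent never re-downloads its volume; rarely used volumes age out first.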
Works with any agent. Claude Code, Codex, OpenCode, or any custom agent that operates on a local filesystem. Windmill provides the sandbox and the storage; the agent brings its own logic. A built-in Claude Code template handles session persistence and token counting out of the box.
Built-in Claude Code template
Windmill ships with a ready-to-use Claude Code template. It handles session persistence (the session ID is stored in the volume), agent instructions, skill files, and token counting for cost monitoring.
```typescript
// sandbox
// volume: claude-sessions .agent
import { ClaudeCodeAgent } from '@anthropic-ai/claude-agent-sdk';

export async function main(prompt: string) {
  const agent = new ClaudeCodeAgent({
    instructions: 'You are a helpful coding assistant.',
  });
  return await agent.run(prompt);
}
```
Use cases
- Persistent agent memory: conversation history and session state survive across runs.
- Artifact generation: agents produce reports, code, or data files that persist in the volume.
- Multi-step workflows: a flow triggers an agent, waits for results, then passes artifacts to the next step.
- Safe execution at scale: resource limits and isolation let you run untrusted agent code without risk.
Getting started
- Configure workspace object storage (S3, Azure Blob, GCS, or filesystem).
- Add `// sandbox` and `// volume: <name> <path>` annotations to any script.
- Run it. Files in the volume path persist across executions.
What's next
Tomorrow is Day 4: Git sync & workspace forks. Sync with Git, stage workspaces, and deploy via CI/CD. Follow along.
You can self-host Windmill with a `docker compose up`, or go with the cloud app.
