# Agent recipes
Patterns for plugging firebox into a real agent loop. Pick one, copy, modify.
The shape is always the same:
```mermaid
flowchart LR
    A[Your agent / LLM] -- decides actions --> T{firebox tools}
    T -- sandbox_open --> S[microVM]
    T -- browser_* --> S
    T -- run --> S
    T -- search --> SX[SearxNG]
    S -- result / page text / screenshot --> A
    SX -- result list --> A
```
`A` is your agent — Claude, GPT, or your own loop. The tool layer is
firebox: the SDK from Python, or MCP for any MCP-aware host. Sandboxes
provide isolation; firebox provides the API.
## 1. Web research
> "Summarise the top three Hacker News stories and give me the authors of the linked articles."
The agent decides each step (navigate → read → click → read again →
synthesize). Firebox provides search + browser_*.
```python
from firebox.sandbox import Sandbox

def research(question: str, llm) -> str:
    with Sandbox.create(template="browser-use", ttl_seconds=600) as sb:
        sb.browser.start()
        # 1. broad search
        results = sb.search.web(question, language="en")[:5]
        # 2. open each, pull headline + lede
        digests = []
        for r in results:
            sb.browser.navigate(r.url, timeout=15.0)
            text = sb.browser.text("article, main, body")[:1500]
            digests.append({"url": r.url, "snippet": text})
        # 3. ask the LLM to synthesize
        return llm.summarize(question, digests)
```
Over MCP, the model needs no glue code: it sees `search` and the
`browser_*` calls as native tools and sequences them itself.
User prompt → Claude:
> Use firebox tools to find the top 3 Hacker News stories, open each one, and summarize what they're about.

Claude calls (typically):

```
sandbox_open(template="browser-use")
  → browser_start
  → browser_navigate("https://news.ycombinator.com")
  → browser_text_all(".titleline > a")
  → for each top story: browser_click_at(...) → browser_text
  → sandbox_close
```
No code on your side at all.
## 2. Code interpreter
> "Run this Python code; if it errors, show me; if it produces a plot, return the image."
Pattern: write user code into the sandbox, run it, fetch any side-effect files (plots, csvs).
```python
from firebox.sandbox import Sandbox

def execute(code: str) -> dict:
    with Sandbox.create(template="base", ttl_seconds=120) as sb:
        sb.files.write("/work/main.py", code)
        result = sb.run("cd /work && python3 main.py", timeout=60)
        out = {
            "stdout": result.stdout,
            "stderr": result.stderr,
            "exit_code": result.exit_code,
        }
        # If the script produced a plot, return it
        try:
            out["plot_png"] = sb.files.read("/work/plot.png")  # bytes
        except RuntimeError:
            pass
        return out
```
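Downstream, the returned dict is easy to route without any firebox dependency. A minimal sketch — the `render_result` helper is illustrative, not part of firebox:

```python
def render_result(out: dict) -> str:
    """Turn the execute() dict into a user-facing message."""
    if out["exit_code"] != 0:
        # Surface the traceback on failure
        return f"Error (exit {out['exit_code']}):\n{out['stderr']}"
    msg = out["stdout"] or "(no output)"
    if "plot_png" in out:
        msg += f"\n[attached plot, {len(out['plot_png'])} bytes]"
    return msg

print(render_result({"stdout": "done\n", "stderr": "", "exit_code": 0}))
```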
Stream output while it runs (useful for long jobs):
```python
for chunk in sb.stream("cd /work && python3 train.py"):
    if chunk.stream == "stdout":
        forward_to_user(chunk.data)  # progress bars work
    elif chunk.stream == "final":
        print(f"exit {chunk.exit_code} in {chunk.duration:.2f}s")
```
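If the UI wants whole lines rather than raw fragments, a small stdlib buffer does the reassembly. A sketch — it assumes output data arrives as arbitrary string fragments, as above:

```python
def lines_from_chunks(chunks):
    """Reassemble arbitrary output fragments into complete lines."""
    buf = ""
    for data in chunks:
        buf += data
        while "\n" in buf:
            line, buf = buf.split("\n", 1)
            yield line
    if buf:  # trailing partial line at process exit
        yield buf

# e.g. feed it (chunk.data for chunk in sb.stream(...) if chunk.stream == "stdout")
lines = list(lines_from_chunks(["ep 1/3\nep 2", "/3\nep 3/3\n"]))
```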
## 3. Browser scraping (no LLM in the loop)
For deterministic scrapes — the agent that drives this is just your Python script.
```python
from firebox.sandbox import Sandbox

with Sandbox.create(template="browser-use", ttl_seconds=120) as sb:
    sb.browser.start()
    sb.browser.navigate("https://old.reddit.com/r/programming/")
    # JS escape hatch when CSS selectors get messy
    stories = sb.browser.evaluate("""
        () => [...document.querySelectorAll("div.thing.link:not(.promoted)")]
            .filter(el => !el.classList.contains("stickied"))
            .slice(0, 5)
            .map(el => ({
                title: el.querySelector("a.title")?.innerText,
                score: el.querySelector(".score.unvoted")?.innerText,
                url: el.querySelector("a.title")?.href,
            }))
    """)
    for s in stories:
        print(f"  ({s['score']}) {s['title']} — {s['url']}")
```
Real, working version: examples/browser-use/reddit_live.py.
## 4. Lead generation (search + parallel scrape)

Search for contact pages, fetch each with real-Chrome TLS (curl_cffi), and filter for plausible business emails.
```python
import re
from firebox.sandbox import Sandbox

EMAIL_RE = re.compile(r"[\w.+-]+@[\w.-]+\.[A-Za-z]{2,}")

def find_leads(queries: list[str], n: int = 10) -> list[dict]:
    with Sandbox.create(template="browser-use", ttl_seconds=600) as sb:
        # 1. fan-out search across the queries
        urls = []
        for q in queries:
            for r in sb.search.web(q):
                urls.append(r.url)
        # 2. fetch each page with real Chrome TLS (curl_cffi)
        leads, seen = [], set()
        for url in urls:
            try:
                html = sb.http.get(url, timeout=8).text
            except Exception:
                continue
            for email in set(EMAIL_RE.findall(html)):
                if email.lower() in seen:
                    continue
                seen.add(email.lower())
                leads.append({"email": email, "source": url})
                if len(leads) >= n:
                    return leads
        return leads
```
Real version with proper plausibility filtering and contact-page
sub-paths: examples/browser-use/lead_finder.py.
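A plausibility filter can be sketched without firebox at all. The rules below are hypothetical — tune them for your use case — but they drop the usual scraper false positives (no-reply addresses and asset paths like `icon@2x.png` that match the regex):

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w.-]+\.[A-Za-z]{2,}")
JUNK_LOCAL = {"noreply", "no-reply", "donotreply", "example", "test"}
JUNK_TLDS = {"png", "jpg", "jpeg", "gif", "svg", "webp"}  # asset paths that match the regex

def plausible(email: str) -> bool:
    local, _, domain = email.lower().partition("@")
    if local in JUNK_LOCAL:
        return False
    if domain.rsplit(".", 1)[-1] in JUNK_TLDS:  # e.g. "icon@2x.png"
        return False
    return True

html = '<img src="a/icon@2x.png"> mail us: team@acme.io'
hits = [e for e in EMAIL_RE.findall(html) if plausible(e)]
```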
## 5. Multi-agent fleet
Spawn N sandboxes and run different tasks concurrently. Each gets its own IP, its own browser, its own quota slot.
```python
from concurrent.futures import ThreadPoolExecutor
from firebox.sandbox import Sandbox

def worker(task: dict) -> dict:
    with Sandbox.create(template="browser-use", ttl_seconds=300) as sb:
        sb.browser.start()
        sb.browser.navigate(task["url"])
        return {
            "task": task["id"],
            "title": sb.browser.text("h1"),
            "screenshot": sb.browser.screenshot(),  # bytes
        }

tasks = [
    {"id": 1, "url": "https://example.com"},
    {"id": 2, "url": "https://hbs.si"},
    {"id": 3, "url": "https://news.ycombinator.com"},
]

with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(worker, tasks))
```
Or via the CLI:
```bash
firebox run-many \
  "echo agent-1 from \$(hostname)" \
  "echo agent-2 from \$(hostname)" \
  "echo agent-3 from \$(hostname)" \
  --concurrency 3
```
`run-many` ships in the SDK as `firebox.parallel.run_many` —
useful when you want a clean batch fan-out from inside a larger
agent loop.
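The same fan-out shape, stripped down to the stdlib. This runs commands on the host, not in sandboxes — `run_fleet` and its return shape are illustrative, not the `firebox.parallel.run_many` signature:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_fleet(commands: list[str], concurrency: int = 3) -> list[dict]:
    """Run shell commands concurrently, collect results in completion order."""
    def one(cmd: str) -> dict:
        p = subprocess.run(cmd, shell=True, capture_output=True, text=True)
        return {"cmd": cmd, "exit_code": p.returncode, "stdout": p.stdout}

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = [pool.submit(one, c) for c in commands]
        return [f.result() for f in as_completed(futures)]

results = run_fleet([f"echo agent-{i}" for i in (1, 2, 3)])
```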
## 6. Long-lived agent session (LLM in the loop)
When the agent should reuse the same VM across many turns — log into a site once, then click around for the rest of the session.
```python
from firebox.sandbox import Sandbox

class WebAgent:
    def __init__(self, llm):
        self.llm = llm
        self.sb = Sandbox.create(template="browser-use", ttl_seconds=900)
        # Load the saved login profile if one exists; fall back to a fresh browser
        try:
            self.sb.browser.start(profile="my-app-login")
        except Exception:
            self.sb.browser.start()

    def step(self, user_msg: str) -> str:
        screenshot = self.sb.browser.screenshot_annotated()
        clickables = self.sb.browser.clickables()
        action = self.llm.decide(
            user_msg,
            page_image=screenshot,
            available_actions=clickables,
        )
        if action.kind == "click":
            self.sb.browser.click_at(action.x, action.y)
        elif action.kind == "type":
            self.sb.browser.fill(action.selector, action.text)
        elif action.kind == "navigate":
            self.sb.browser.navigate(action.url)
        return self.sb.browser.text("body")[:2000]

    def close(self):
        self.sb.browser.save_profile("my-app-login")
        self.sb.close()
```
The agent's memory is the page; the LLM doesn't need a database. TTL keeps the VM alive between turns; activity resets the clock. Profile save/load preserves login state across sessions.
## 7. Agent that uses firebox via MCP
The cleanest path when you don't own the agent loop. Anything MCP-aware (Claude Desktop, Claude Code, Cursor, ChatGPT Agent, custom clients) can mount firebox tools and call them directly.
```mermaid
sequenceDiagram
    participant U as You
    participant Cl as Claude Desktop
    participant M as firebox MCP server
    participant D as firebox-daemon
    participant S as microVM
    U->>Cl: "Find me 10 fitness blog email leads."
    Cl->>M: tools/call sandbox_open
    M->>D: POST /sandboxes
    D-->>M: { id }
    M-->>Cl: { sandbox_id }
    loop for each query
        Cl->>M: tools/call search { q, categories: "general" }
        M->>D: GET /search → SearxNG
        M-->>Cl: { results: [...] }
        Cl->>M: tools/call browser_navigate, browser_text, ...
    end
    Cl->>M: tools/call sandbox_close
    M->>D: POST /sandboxes/X/close
    Cl-->>U: "Here are 10 leads: ..."
```
Setup: see MCP server. After that, your agent gets 22 firebox tools and you write zero glue code.
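Under the hood each tool call is plain JSON-RPC. The `sandbox_open` call from the diagram, as a raw request body — the envelope shape follows the MCP specification; the tool name and arguments come from this page:

```python
import json

request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",          # MCP method for invoking a mounted tool
    "params": {
        "name": "sandbox_open",      # firebox tool name
        "arguments": {"template": "browser-use"},
    },
}
body = json.dumps(request)
```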
## 8. Live progress to a UI / chat
Stream everything the sandbox is doing — search hits, file edits, shell output, browser activity — into your front-end as it happens.
```python
import threading

from firebox.sandbox import Sandbox

def emit(channel, payload):
    """Replace with: SSE write, websocket send, Slack post, agent log..."""
    print(f"[{channel}] {payload}")

with Sandbox.create(template="browser-use",
                    workspace="./project",
                    workspace_exclude=[".git/*", "__pycache__/*"]) as sb:
    # 1. Search — stream results engine-by-engine
    for r in sb.search.stream("user query", engines=["google", "duckduckgo", "brave"]):
        emit("search", {"engine": r.engine, "title": r.title, "url": r.url})

    # 2. Browser session — VNC live-stream (open in the user's browser)
    sb.browser.start(headless=False, stealth=True)
    sb.process.start("websockify 6080 localhost:5900", env={"DISPLAY": ":99"})
    emit("browser", {"vnc_url": f"http://{sb.ip}:6080/vnc.html"})
    sb.browser.navigate("https://example.com")  # the user sees it live

    # 3. File edits — push every change as the agent works
    def watcher():
        for evt in sb.files.watch("/work", timeout=600):
            emit("fs", evt)  # MODIFY a.py, CREATE README.md, ...

    threading.Thread(target=watcher, daemon=True).start()

    # 4. Shell output — stdout/stderr line-by-line
    for chunk in sb.stream("python3 /work/build.py"):
        emit("shell", {"stream": chunk.stream, "data": chunk.data})
```
Four primitives, one unified live feed. The frontend can render each
channel differently — search as a list, browser as an embedded
noVNC iframe, fs as a file tree with highlighted edits, shell as
a terminal pane.
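If the front-end consumes Server-Sent Events, `emit` can reduce to a one-line frame formatter. A sketch — mapping the channel to the SSE `event` field is a design choice, not a firebox requirement:

```python
import json

def sse_frame(channel: str, payload: dict) -> str:
    """Format one (channel, payload) pair as a text/event-stream frame."""
    return f"event: {channel}\ndata: {json.dumps(payload)}\n\n"

frame = sse_frame("search", {"engine": "brave", "url": "https://example.com"})
```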
## Choosing a pattern
| Goal | Pattern | LLM in loop? |
|---|---|---|
| Deterministic scrape, known DOM | #3 | No |
| Scrape across many sites | #4 | No (LLM optional for synthesis) |
| LLM decides actions step-by-step | #1 or #7 | Yes |
| User pastes code, you run it | #2 | No (LLM wrote the code) |
| N tasks in parallel | #5 | No |
| Multi-turn, stateful | #6 | Yes |
| Plug into Claude / Cursor | #7 | Yes |
| Live progress to user UI | #8 | Optional |
Mix freely. A typical real agent does multi-turn (#6) but occasionally fans out (#5) for parallel verification, and uses the search helper inside any of these.