Agent recipes

Patterns for plugging firebox into a real agent loop. Pick one, copy, modify.

The shape is always the same:

flowchart LR
    A[Your agent / LLM] -- decides actions --> T{firebox tools}
    T -- sandbox_open --> S[microVM]
    T -- browser_* --> S
    T -- run --> S
    T -- search --> SX[SearxNG]
    S -- result / page text / screenshot --> A
    SX -- result list --> A

A is your agent — Claude, GPT, your own loop. The tool layer is firebox: SDK from Python, or MCP for any MCP-aware host. Sandboxes provide isolation; firebox provides the API.


1. Web research

"Summarise the top three Hacker News stories and give me the authors of the linked articles."

The agent decides each step (navigate → read → click → read again → synthesize). Firebox provides search + browser_*.

from firebox.sandbox import Sandbox

def research(question: str, llm) -> str:
    with Sandbox.create(template="browser-use", ttl_seconds=600) as sb:
        sb.browser.start()
        # 1. broad search
        results = sb.search.web(question, language="en")[:5]
        # 2. open each, pull headline + lede
        digests = []
        for r in results:
            sb.browser.navigate(r.url, timeout=15.0)
            text = sb.browser.text("article, main, body")[:1500]
            digests.append({"url": r.url, "snippet": text})
        # 3. ask the LLM to synthesize
        return llm.summarize(question, digests)

Over MCP the model needs no glue code at all; it sees search and browser_* as native tools and picks them in order.

User prompt → Claude:

Use firebox tools to find the top 3 Hacker News stories, open each one, and summarize what they're about.

Claude calls (typically): sandbox_open(template="browser-use") → browser_start → browser_navigate("https://news.ycombinator.com") → browser_text_all(".titleline > a") → for each top story browser_click_at(...) → browser_text → sandbox_close.

No code on your side at all.


2. Code interpreter

"Run this Python code; if it errors, show me; if it produces a plot, return the image."

Pattern: write the user's code into the sandbox, run it, then fetch any side-effect files (plots, CSVs).

from firebox.sandbox import Sandbox

def execute(code: str) -> dict:
    with Sandbox.create(template="base", ttl_seconds=120) as sb:
        sb.files.write("/work/main.py", code)
        result = sb.run("cd /work && python3 main.py", timeout=60)
        out = {
            "stdout": result.stdout,
            "stderr": result.stderr,
            "exit_code": result.exit_code,
        }
        # If the script produced a plot, return it
        try:
            out["plot_png"] = sb.files.read("/work/plot.png")  # bytes
        except RuntimeError:
            pass
        return out
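The dict returned here is the whole contract between execute() and the LLM. A local-subprocess stand-in (no sandbox — illustration of that return shape only; never run untrusted code this way) looks like:

```python
import os
import subprocess
import tempfile

def execute_local(code: str) -> dict:
    """Same return contract as execute(), but on the host. Trusted code only."""
    with tempfile.TemporaryDirectory() as work:
        path = os.path.join(work, "main.py")
        with open(path, "w") as f:
            f.write(code)
        proc = subprocess.run(
            ["python3", path], capture_output=True, text=True, timeout=60
        )
        return {
            "stdout": proc.stdout,
            "stderr": proc.stderr,
            "exit_code": proc.returncode,
        }
```

Whatever runs the code, keeping stdout, stderr, and exit_code separate lets the LLM distinguish "it worked" from "it printed a traceback".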

Stream output while it runs (useful for long jobs):

for chunk in sb.stream("cd /work && python3 train.py"):
    if chunk.stream == "stdout":
        forward_to_user(chunk.data)         # progress bars work
    elif chunk.stream == "final":
        print(f"exit {chunk.exit_code} in {chunk.duration:.2f}s")
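Stream chunks arrive at arbitrary byte boundaries; if your UI wants whole lines, a tiny buffer around the loop above helps. This is plain Python, not part of the firebox API:

```python
class LineBuffer:
    """Accumulate raw stream chunks and yield only complete lines."""

    def __init__(self):
        self._pending = ""

    def feed(self, data: str) -> list[str]:
        # Everything before the last newline is complete; keep the tail.
        self._pending += data
        *lines, self._pending = self._pending.split("\n")
        return lines

    def flush(self) -> str:
        pending, self._pending = self._pending, ""
        return pending
```

Call buf.feed(chunk.data) for each stdout chunk and buf.flush() on the final event so a trailing partial line isn't lost.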

3. Browser scraping (no LLM in the loop)

For deterministic scrapes — the agent that drives this is just your Python script.

from firebox.sandbox import Sandbox

with Sandbox.create(template="browser-use", ttl_seconds=120) as sb:
    sb.browser.start()
    sb.browser.navigate("https://old.reddit.com/r/programming/")
    # JS escape hatch when CSS selectors get messy
    stories = sb.browser.evaluate("""
        () => [...document.querySelectorAll("div.thing.link:not(.promoted)")]
                  .filter(el => !el.classList.contains("stickied"))
                  .slice(0, 5)
                  .map(el => ({
                      title: el.querySelector("a.title")?.innerText,
                      score: el.querySelector(".score.unvoted")?.innerText,
                      url:   el.querySelector("a.title")?.href,
                  }))
    """)
    for s in stories:
        print(f"  ({s['score']}) {s['title']} -> {s['url']}")

Real, working version: examples/browser-use/reddit_live.py.
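The ?. chains above yield null whenever a selector misses, so the returned rows can be partial. A small Python-side normalizer keeps bad rows out of downstream code (a sketch, assuming the list-of-dicts shape that evaluate() returns above):

```python
def clean_stories(raw: list[dict]) -> list[dict]:
    """Drop rows missing a title or url; default a missing score."""
    cleaned = []
    for row in raw:
        if not row.get("title") or not row.get("url"):
            continue  # selector missed; row is unusable
        cleaned.append({
            "title": row["title"].strip(),
            "score": row.get("score") or "0 points",
            "url": row["url"],
        })
    return cleaned
```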


4. Lead generation (search + parallel scrape)

Search the web for contact pages, fetch each page with real-Chrome TLS, and filter the HTML for plausible business emails.

import re
from firebox.sandbox import Sandbox

EMAIL_RE = re.compile(r"[\w.+-]+@[\w.-]+\.[A-Za-z]{2,}")

def find_leads(queries: list[str], n: int = 10) -> list[dict]:
    with Sandbox.create(template="browser-use", ttl_seconds=600) as sb:
        # 1. fan-out search across N queries
        urls = []
        for q in queries:
            for r in sb.search.web(q):
                urls.append(r.url)
        # 2. fetch each page with real Chrome TLS (curl_cffi);
        #    sequential here — the real example parallelizes
        leads, seen = [], set()
        for url in urls:
            try:
                html = sb.http.get(url, timeout=8).text
            except Exception:
                continue
            for email in set(EMAIL_RE.findall(html)):
                if email.lower() in seen: continue
                seen.add(email.lower())
                leads.append({"email": email, "source": url})
                if len(leads) >= n:
                    return leads
        return leads

Real version with proper plausibility filtering and contact-page sub-paths: examples/browser-use/lead_finder.py.
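The regex alone also matches things that are not real inboxes — sprite@2x.png, build hashes, tracking strings. A cheap plausibility pass (a sketch; the heuristics here are illustrative, lead_finder.py does its own filtering) weeds most of them out:

```python
import re

IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp")

def plausible(email: str) -> bool:
    """Reject image-name false positives and obviously machine-made addresses."""
    email = email.lower()
    if email.endswith(IMAGE_SUFFIXES):
        return False  # "icon@2x.png" matches the email regex but isn't one
    local, _, domain = email.partition("@")
    if len(local) > 40 or "." not in domain:
        return False
    if re.search(r"[0-9a-f]{16}", local):
        return False  # long hex runs are usually asset hashes, not people
    return True
```

Drop it into the loop as `if not plausible(email): continue` before the seen-check.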


5. Multi-agent fleet

Spawn N sandboxes and run different tasks concurrently. Each gets its own IP, its own browser, its own quota slot.

from concurrent.futures import ThreadPoolExecutor
from firebox.sandbox import Sandbox

def worker(task: dict) -> dict:
    with Sandbox.create(template="browser-use", ttl_seconds=300) as sb:
        sb.browser.start()
        sb.browser.navigate(task["url"])
        return {
            "task": task["id"],
            "title": sb.browser.text("h1"),
            "screenshot": sb.browser.screenshot(),    # bytes
        }

tasks = [
    {"id": 1, "url": "https://example.com"},
    {"id": 2, "url": "https://hbs.si"},
    {"id": 3, "url": "https://news.ycombinator.com"},
]
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(worker, tasks))

Or via the CLI:

firebox run-many \
    "echo agent-1 from \$(hostname)" \
    "echo agent-2 from \$(hostname)" \
    "echo agent-3 from \$(hostname)" \
    --concurrency 3

run-many ships in the SDK as firebox.parallel.run_many — useful when you want a clean batch fan-out from inside a larger agent loop.
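If one task's page fails to load, you usually want the other workers to finish anyway. A generic error-collecting fan-out (plain concurrent.futures, no firebox dependency) looks like:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fan_out(worker, tasks, max_workers=5):
    """Run worker over tasks; collect successes and failures separately."""
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(worker, t): t for t in tasks}
        for fut in as_completed(futures):
            task = futures[fut]
            try:
                results.append(fut.result())
            except Exception as exc:
                errors.append((task, exc))  # keep going; report at the end
    return results, errors
```

Because each worker opens its own sandbox in a `with` block, a failed task still closes its VM before the exception reaches `errors`.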


6. Long-lived agent session (LLM in the loop)

When the agent should reuse the same VM across many turns — log into a site once, then click around for the rest of the session.

from firebox.sandbox import Sandbox

class WebAgent:
    def __init__(self, llm):
        self.llm = llm
        self.sb = Sandbox.create(template="browser-use", ttl_seconds=900)
        # Load a saved login profile if one exists; otherwise start fresh
        try:
            self.sb.browser.start(profile="my-app-login")
        except Exception:
            self.sb.browser.start()

    def step(self, user_msg: str) -> str:
        screenshot = self.sb.browser.screenshot_annotated()
        clickables = self.sb.browser.clickables()
        action = self.llm.decide(
            user_msg,
            page_image=screenshot,
            available_actions=clickables,
        )
        if action.kind == "click":
            self.sb.browser.click_at(action.x, action.y)
        elif action.kind == "type":
            self.sb.browser.fill(action.selector, action.text)
        elif action.kind == "navigate":
            self.sb.browser.navigate(action.url)
        return self.sb.browser.text("body")[:2000]

    def close(self):
        self.sb.browser.save_profile("my-app-login")
        self.sb.close()

The agent's memory is the page; the LLM doesn't need a database. TTL keeps the VM alive between turns; activity resets the clock. Profile save/load preserves login state across sessions.
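The step() dispatch assumes a small action contract from llm.decide(). The shape below is illustrative, not a firebox type — whatever your LLM wrapper returns just needs these fields:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Action:
    """What llm.decide() must return for the dispatch in step() to work."""
    kind: str                       # "click" | "type" | "navigate"
    x: Optional[int] = None         # click coordinates (kind == "click")
    y: Optional[int] = None
    selector: Optional[str] = None  # target field (kind == "type")
    text: Optional[str] = None
    url: Optional[str] = None       # destination (kind == "navigate")
```

Keeping the contract this narrow makes it easy to validate the LLM's output before touching the browser.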


7. Agent that uses firebox via MCP

The cleanest path when you don't own the agent loop. Anything MCP-aware (Claude Desktop, Claude Code, Cursor, ChatGPT Agent, custom clients) can mount firebox tools and call them directly.

sequenceDiagram
    participant U as You
    participant Cl as Claude Desktop
    participant M as firebox MCP server
    participant D as firebox-daemon
    participant S as microVM

    U->>Cl: "Find me 10 fitness blog email leads."
    Cl->>M: tools/call sandbox_open
    M->>D: POST /sandboxes
    D-->>M: { id }
    M-->>Cl: { sandbox_id }

    loop for each query
      Cl->>M: tools/call search { q, categories: "general" }
      M->>D: GET /search → SearxNG
      M-->>Cl: { results: [...] }
      Cl->>M: tools/call browser_navigate, browser_text, ...
    end

    Cl->>M: tools/call sandbox_close
    M->>D: POST /sandboxes/X/close
    Cl-->>U: "Here are 10 leads: ..."

Setup: see MCP server. After that, your agent gets 22 firebox tools and you write zero glue code.
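On the wire, each step in the diagram is a standard JSON-RPC 2.0 tools/call request. The argument names below mirror the diagram and are illustrative:

```python
import json

# What the host (Claude Desktop, Cursor, ...) sends the firebox MCP server
# for one search step — a standard MCP tools/call request.
request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {
        "name": "search",
        "arguments": {"q": "fitness blog contact email", "categories": "general"},
    },
}
print(json.dumps(request, indent=2))
```

The host generates these itself from the tool schemas the MCP server advertises — which is exactly why you write no glue code.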


8. Live progress to a UI / chat

Stream everything the sandbox is doing — search hits, file edits, shell output, browser activity — into your front-end as it happens.

from firebox.sandbox import Sandbox

def emit(channel, payload):
    """Replace with: SSE write, websocket send, Slack post, agent log..."""
    print(f"[{channel}] {payload}")

with Sandbox.create(template="browser-use",
                    workspace="./project",
                    workspace_exclude=[".git/*", "__pycache__/*"]) as sb:

    # 1. Search — stream results engine-by-engine
    for r in sb.search.stream("user query", engines=["google","duckduckgo","brave"]):
        emit("search", {"engine": r.engine, "title": r.title, "url": r.url})

    # 2. Browser session — VNC live-stream (open in user's browser)
    sb.browser.start(headless=False, stealth=True)
    sb.process.start("websockify 6080 localhost:5900", env={"DISPLAY": ":99"})
    emit("browser", {"vnc_url": f"http://{sb.ip}:6080/vnc.html"})
    sb.browser.navigate("https://example.com")     # user sees it live

    # 3. File edits — push every change as the agent works
    import threading
    def watcher():
        for evt in sb.files.watch("/work", timeout=600):
            emit("fs", evt)            # MODIFY a.py, CREATE README.md, ...
    threading.Thread(target=watcher, daemon=True).start()

    # 4. Shell output — stdout/stderr line-by-line
    for chunk in sb.stream("python3 /work/build.py"):
        emit("shell", {"stream": chunk.stream, "data": chunk.data})

Four primitives, one unified live feed. The frontend can render each channel differently — search as a list, browser as an embedded noVNC iframe, fs as a file tree with highlighted edits, shell as a terminal pane.
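If the front-end is a browser, Server-Sent Events is the simplest transport for emit() — the framing is just text. This helper is generic plumbing, not part of firebox:

```python
import json

def sse(channel: str, payload: dict) -> str:
    """Format one emit() call as a Server-Sent Events frame."""
    return f"event: {channel}\ndata: {json.dumps(payload)}\n\n"
```

Write each frame to a streaming HTTP response with Content-Type text/event-stream; on the browser side, `new EventSource(url)` plus one addEventListener per channel ("search", "fs", "shell", ...) routes frames to the right pane.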


Choosing a pattern

Goal                               | Pattern  | LLM in loop?
Deterministic scrape, known DOM    | #3       | No
Scrape across many sites           | #4       | No (LLM optional for synthesis)
LLM decides actions step-by-step   | #1 or #7 | Yes
User pastes code, you run it       | #2       | No (LLM wrote the code)
N tasks in parallel                | #5       | No
Multi-turn, stateful               | #6       | Yes
Plug into Claude / Cursor          | #7       | Yes
Live progress to user UI           | #8       | Optional

Mix freely. A typical real agent does multi-turn (#6) but occasionally fans out (#5) for parallel verification, and uses the search helper inside any of these.