Python SDK

Stdlib-only client. pip install pulls in no third-party dependencies — urllib, json, and base64 do the work.

from firebox.sandbox import Sandbox

Sandbox is the only entry point you'll actually instantiate. Every namespace (files, process, browser, http, search, audio, captcha) hangs off an instance.

Sandbox

sb = Sandbox.create(
    template: str | None = None,
    ttl_seconds: float = 300.0,
    vcpu: int = 2,
    mem_mib: int = 512,
    # Optional auto-upload of a local directory at boot:
    workspace: str | None = None,           # local dir path
    workspace_remote: str = "/work",        # remote target
    workspace_exclude: list[str] | None = None,  # fnmatch patterns
)
sb = Sandbox.attach(sandbox_id: str)

sb.id            # short id used everywhere
sb.ip            # 10.42.0.X
sb.template      # template name or None
sb.expires_at    # epoch seconds

sb.run(cmd, timeout=60, cwd=None, env=None, shell=None)  -> RunResult
sb.stream(cmd, ...)                                        -> Iterator[StreamChunk]
sb.close()

# Context manager (recommended)
with Sandbox.create(...) as sb:
    ...

RunResult is (stdout, stderr, exit_code, duration, timeout). StreamChunk.stream is "stdout" / "stderr" / "final".

workspace= tar-gzips the local directory in one HTTP roundtrip and extracts it inside the sandbox before create() returns — handy for shipping a project tree to the VM in one shot. See sb.files for the explicit upload_dir / download_dir and live watch.

sb.files

sb.files.read(path)          -> bytes
sb.files.read_text(path)     -> str
sb.files.write(path, content, mode=None)  -> bytes_written
sb.files.list(path="/")      -> list[FileEntry]      # name, type, size, mtime
sb.files.upload(local, remote, mode=None)
sb.files.download(remote, local)

# Bulk transfer — tar+gzip in one HTTP roundtrip
sb.files.upload_dir(local_dir, remote_dir, exclude=["*.pyc","__pycache__/*"])
sb.files.download_dir(remote_dir, local_dir)

# Live filesystem events — yields {path, event, file} per change
for evt in sb.files.watch("/work", recursive=True,
                          events="modify,create,delete,move",
                          timeout=600):
    print(evt["event"], evt["file"])    # MODIFY a.txt, CREATE b.py, ...

Bytes are base64-framed — binary files survive the round-trip unchanged. upload_dir is dramatically faster than looping upload for many files (one POST instead of N) and accepts fnmatch exclude patterns. watch runs inotifywait -m inside the sandbox and streams events back as they occur — perfect for live "agent edited file X" displays. Requires inotify-tools in the template (baked into browser-use).
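
The base64 framing is plain stdlib behavior; a quick round-trip shows why arbitrary binary survives a JSON transport unchanged (illustrative, not the SDK's wire format):

```python
import base64
import json

payload = bytes(range(256))                      # every byte value, incl. NUL
framed = json.dumps({"content": base64.b64encode(payload).decode("ascii")})
restored = base64.b64decode(json.loads(framed)["content"])
assert restored == payload                       # bit-for-bit identical
```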

sb.process

Background processes that outlive sb.run calls. The classic case: start a server, then probe it from the next call.

proc = sb.process.start("python3 -m http.server 8000")
sb.run("curl http://127.0.0.1:8000/")          # talks to the bg server
proc.logs()              -> ProcessLogs(stdout, stderr, running, exit_code)
proc.kill()              -> exit_code
proc.wait(timeout=60)    -> exit_code or None on timeout

sb.process.list()        -> list[Process]
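
A common wrapper around process.start is a readiness poll: keep probing until the server answers or a deadline passes. A generic sketch — the wait_until helper is hypothetical, not part of the SDK:

```python
import time

def wait_until(probe, timeout: float = 30.0, interval: float = 0.25) -> bool:
    """Call probe() until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False

# Usage against a sandbox (assumes the sb / proc objects from above):
# proc = sb.process.start("python3 -m http.server 8000")
# ready = wait_until(lambda: sb.run("curl -sf http://127.0.0.1:8000/").exit_code == 0)
```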

sb.browser

Stealth Chromium driven from outside. See Browser concepts for the full surface; the SDK signatures are:

sb.browser.start(headless=True, stealth=True, viewport=None,
                  user_agent=None, locale=None, timezone_id=None,
                  profile=None, proxy=None)
sb.browser.close()
sb.browser.state()                  -> {"open": bool, "url": str, "title": str}

# Navigation
sb.browser.navigate(url, wait_until="domcontentloaded", timeout=30.0)
sb.browser.back() / .forward() / .reload()

# Interaction
sb.browser.click(selector, timeout=10.0)
sb.browser.click_at(x, y, button="left", click_count=1, humanlike=True)
sb.browser.fill(selector, text, timeout=10.0)
sb.browser.press(key)
sb.browser.type(text, delay_ms=None, humanlike=True)
sb.browser.scroll(x=0, y=0)
sb.browser.wait_for(selector, state="visible", timeout=10.0)

# Reading
sb.browser.text(selector=None, timeout=10.0)        -> str
sb.browser.text_all(selector, timeout=10.0)         -> list[str]
sb.browser.attr(selector, name, timeout=10.0)       -> str | None
sb.browser.html(selector=None, timeout=10.0)        -> str
sb.browser.evaluate(script, arg=None)               -> Any
sb.browser.clickables()                             -> list[dict]

# Screenshots
sb.browser.screenshot(selector=None, full_page=False, save_path=None)            -> bytes
sb.browser.screenshot_annotated(full_page=False, save_path=None)                  -> bytes

# Persistence
sb.browser.cookies(urls=None)                       -> list[dict]
sb.browser.set_cookies(cookies)
sb.browser.save_profile(name)                       -> str (path)
sb.browser.list_profiles()                          -> list[str]
sb.browser.delete_profile(name)

# Captcha helpers
sb.browser.detect_captcha()                         -> dict | None
sb.browser.solve_captcha_on_page(api_key=...)       -> dict
sb.browser.inject_captcha_token(captcha_type, token)

sb.http

Raw HTTP with a real Chrome 120 TLS / JA3 / H2 fingerprint via curl_cffi. Use this for API calls and scraping where Cloudflare's TLS-layer detection rejects the Chromium browser.

r = sb.http.get(url, params=None, headers=None, timeout=30.0,
                  impersonate="chrome120", proxies=None)
r = sb.http.post(url, json={...}, ...)
r = sb.http.request("PUT", url, body=b"...", ...)

r.ok           -> bool
r.status       -> int
r.url          -> str
r.headers      -> dict
r.text         -> str
r.content      -> bytes
r.json()       -> Any

sb.search

Aggregated metasearch via SearxNG. See Search.

sb.search.web(q, **kw)        -> list[SearchResult]
sb.search.news(q, **kw)
sb.search.papers(q, **kw)
sb.search.code(q, **kw)
sb.search.images(q, **kw)
sb.search.videos(q, **kw)
sb.search.wiki(q, **kw)
sb.search.maps(q, **kw)
sb.search.query(q, categories=, engines=, language=, time_range=,
                 pageno=, safesearch=, base_url=, cache=True, timeout=20)

# Live streaming — yields SearchResult per engine as backends respond
for r in sb.search.stream(q, engines=["google","duckduckgo","brave","qwant"]):
    print(r.engine, r.title, r.url)

sb.search.stream queries each engine independently and yields its results as soon as they arrive, deduped on URL. Use it when you want to render results in a UI as they trickle in instead of waiting for the full SearxNG aggregation.
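
The dedupe-on-URL behavior is simple to picture — roughly this shape, sketched over plain dicts rather than SearchResult objects:

```python
from typing import Iterable, Iterator

def dedupe_by_url(results: Iterable[dict]) -> Iterator[dict]:
    """Yield each result the first time its URL is seen, drop repeats."""
    seen: set[str] = set()
    for r in results:
        if r["url"] not in seen:
            seen.add(r["url"])
            yield r

hits = [
    {"engine": "google",     "url": "https://a.example"},
    {"engine": "duckduckgo", "url": "https://a.example"},   # dropped: repeat URL
    {"engine": "brave",      "url": "https://b.example"},
]
unique = list(dedupe_by_url(hits))
```

Because it is a generator, results still flow to the caller as each engine responds; only repeats are swallowed.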

sb.audio

Local Whisper inside the sandbox. It powers the free reCAPTCHA audio solver, but is useful on its own:

result = sb.audio.transcribe(
    audio: bytes, format="mp3", model="tiny.en",
    language=None, beam_size=5, vad_filter=False,
)
result.text           # full text
result.language       # "en", probability close to 1
result.duration       # seconds
result.segments       # [{start, end, text}, ...]
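
The segments list maps directly onto caption formats; for instance, one SRT block per segment (an illustrative helper, not part of the SDK):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 3661.5 -> 01:01:01,500."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def segments_to_srt(segments: list[dict]) -> str:
    """Render [{start, end, text}, ...] as SubRip captions."""
    blocks = []
    for i, seg in enumerate(segments, start=1):
        blocks.append(f"{i}\n{srt_timestamp(seg['start'])} --> "
                      f"{srt_timestamp(seg['end'])}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)
```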

sb.captcha

Four solver paths — pick by cost / target site.

# Paid solver service (works for everything)
sb.captcha.solve_recaptcha_v2(sitekey, page_url, api_key=None, provider="2captcha", timeout=180)
sb.captcha.solve_hcaptcha(...)
sb.captcha.solve_turnstile(...)

# Free local Whisper — audio mode of reCAPTCHA v2 / hCaptcha
sb.captcha.solve_recaptcha_audio(retries=3, model="tiny.en")     -> dict
sb.captcha.solve_hcaptcha_audio(retries=3, model="tiny.en")      -> dict

# Free image-grid primitives — *your* vision LLM picks cells
sb.captcha.recaptcha_open_image_challenge(timeout=30.0)
    # → {instructions, target, grid_size, cells, screenshot_b64}
sb.captcha.recaptcha_click_cells(indices, timeout=10.0)
    # → {clicked, indices}
sb.captcha.recaptcha_verify_image(timeout=15.0)
    # → {verified, more_to_click, ...} or a fresh challenge

# Free human handoff via VNC
sb.captcha.handoff_to_vnc(password="firebox",
                           poll_until_solved=False, poll_timeout=600.0)
    # → {vnc_url, sandbox_ip, vnc_password, solved?, waited?}

The image-grid primitives deliberately don't run a model in firebox. The caller's own LLM (Claude / GPT-4o / Qwen-VL — whatever's already driving the agent) does the visual reasoning, which keeps firebox provider-agnostic.
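
The caller-side loop can be sketched generically — the three primitives plus your model as injected callables (pick_cells is a stand-in for whatever vision call your agent makes; the helper itself is hypothetical):

```python
def solve_image_grid(open_challenge, pick_cells, click_cells, verify,
                     max_rounds: int = 5) -> bool:
    """Drive the open -> pick -> click -> verify loop until solved or exhausted.

    open_challenge() -> challenge dict (instructions, cells, screenshot_b64)
    pick_cells(challenge) -> list of cell indices (your vision LLM's job)
    click_cells(indices) -> None
    verify() -> dict with at least {"verified": bool, "more_to_click": bool}
    """
    challenge = open_challenge()
    for _ in range(max_rounds):
        indices = pick_cells(challenge)
        click_cells(indices)
        result = verify()
        if result.get("verified"):
            return True
        # Not solved: verify() either flags more tiles on the same grid or
        # hands back a fresh challenge dict to feed into the next round.
        if not result.get("more_to_click"):
            challenge = result
    return False
```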

Error model

All SDK methods raise RuntimeError with the daemon's {"error": "..."} body inlined when the daemon returns a 4xx / 5xx. Timeouts raise TimeoutError.

try:
    sb = Sandbox.create(template="browser-use", mem_mib=99999)
except RuntimeError as e:
    # "daemon 429: token 'alice' would exceed max_mem_mib (...)"
    ...
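
The mapping from daemon response to exception is simple to picture — roughly this, as an illustrative sketch rather than the SDK's actual source:

```python
import json

def raise_for_daemon(status: int, body: bytes) -> None:
    """Turn a 4xx/5xx daemon response into the RuntimeError callers see."""
    if status < 400:
        return
    try:
        message = json.loads(body)["error"]
    except (ValueError, KeyError):
        message = body.decode("utf-8", errors="replace")
    raise RuntimeError(f"daemon {status}: {message}")
```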