Browser¶
sb.browser is a Playwright-driven Chromium running inside the
sandbox. It's stealthed by default (passes
bot.sannysoft.com fingerprint checks),
threads its calls through a per-VM worker, and keeps state across
calls — same tab, same cookies — until you close().
Drive it from outside¶
with Sandbox.create(template="browser-use") as sb:
    b = sb.browser
    b.start()  # launch Chromium
    b.navigate("https://example.com")
    print(b.text("h1"))  # → "Example Domain"
The browser stays up across calls. There's no LLM inside the sandbox — the caller is the agent.
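Because the decision-making lives with the caller, one way to structure the caller side is a small dispatcher that turns a model's decisions into `sb.browser` calls. The sketch below is illustrative only: the method names come from the API overview on this page, while the `{"op": ..., "args": [...]}` action shape and both helper names are assumptions made for the example.

```python
# Allow-list of sb.browser methods the caller-side agent may invoke.
# Names are taken from this page's API overview; argument shapes are assumed.
ALLOWED = {"navigate", "click", "fill", "press", "scroll", "text", "wait_for"}


def run_action(browser, action):
    """Apply one {"op": ..., "args": [...]} decision to the browser."""
    op = action["op"]
    if op not in ALLOWED:
        raise ValueError(f"unsupported action: {op}")
    return getattr(browser, op)(*action.get("args", []))


def run_plan(browser, plan):
    """Run actions in order and collect each call's return value."""
    return [run_action(browser, step) for step in plan]
```

With a live sandbox, `run_plan(sb.browser, plan)` would replay whatever list of actions the caller's model produced; the allow-list keeps an unexpected action name from reaching the browser.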
What's exposed¶
flowchart TB
subgraph SDK["sb.browser API"]
N1[start / close / state]
N2[navigate / back / forward / reload]
N3[click / click_at / fill / press / type / scroll]
N4[wait_for]
N5[text / text_all / attr / html]
N6[screenshot / screenshot_annotated / clickables]
N7[evaluate]
N8[cookies / set_cookies / save_profile / list_profiles]
N9[detect_captcha / inject_captcha_token]
end
There are three ways to act on a UI element; which tradeoff to accept is your call:
Deterministic, fast.
For sites where selectors are unstable. Pair with
screenshot_annotated() for an LLM-friendly numbered overlay.
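A caller can combine these approaches: try the selector first and fall back to coordinates when it fails. This is a minimal sketch, assuming `click(selector)` raises when the selector cannot be resolved and `click_at(x, y)` takes pixel coordinates; neither signature is confirmed on this page, and `click_element` is a hypothetical helper.

```python
def click_element(browser, selector=None, point=None):
    """Click by CSS selector when possible, else by (x, y) coordinates.

    Assumes click(selector) raises on a missing or stale selector and
    click_at(x, y) takes pixel coordinates; both are assumptions.
    """
    if selector is not None:
        try:
            browser.click(selector)
            return "selector"
        except Exception:
            # No fallback available: surface the original error.
            if point is None:
                raise
    if point is None:
        raise ValueError("need a selector or a point")
    x, y = point
    browser.click_at(x, y)
    return "coordinates"
```

The return value records which path was taken, which is handy when logging how often selectors break on a given site.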
Stealth profile¶
sb.browser.start() defaults to stealth=True. That turns on:
| Layer | Patch |
|---|---|
| Launch flags | --disable-blink-features=AutomationControlled, drop --enable-automation |
| Backend | Patchright (drop-in replacement for playwright that fixes Runtime.enable + console.debug leaks at the chromium binary level) |
| navigator.webdriver | scrubbed from prototype + own (so 'webdriver' in navigator is false) |
| window.chrome | populated |
| navigator.plugins | five-entry PluginArray (real plugins, real prototype) |
| navigator.languages | ['en-US', 'en'] |
| navigator.permissions.query | consistent with native Chrome |
| navigator.hardwareConcurrency | 8 |
| navigator.deviceMemory | 8 |
| navigator.maxTouchPoints | 0 |
| Notification.permission | 'default' (not 'denied') |
| window.outerHeight/Width | offset from inner |
| navigator.userAgentData | Chrome 120 brands + Linux platform |
| Canvas | sub-pixel noise on getImageData / toDataURL |
| AudioContext | float buffer noise on getChannelData |
| WebRTC | host-candidate IPs stripped (no real-IP leak) |
| WebGL | vendor / renderer reported as Intel Inc. / Iris OpenGL Engine |
| User-Agent | Linux Chrome 120 (no HeadlessChrome string) |
| Locale | en-US |
| Timezone | Europe/Ljubljana |
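Several rows of this table can be spot-checked from the caller by evaluating the corresponding JavaScript expressions in the page. The sketch below assumes `evaluate(expr)` accepts a JavaScript expression string and returns the deserialized result, as Playwright's `page.evaluate` does; that signature is not confirmed here, and `stealth_mismatches` is a hypothetical helper.

```python
# Expected values copied from the stealth table above.
# The JavaScript expressions themselves are standard browser APIs.
EXPECTED = {
    "'webdriver' in navigator": False,
    "navigator.hardwareConcurrency": 8,
    "navigator.deviceMemory": 8,
    "navigator.maxTouchPoints": 0,
    "Notification.permission": "default",
    "navigator.languages": ["en-US", "en"],
}


def stealth_mismatches(browser, expected=EXPECTED):
    """Return {expression: (wanted, got)} for every probe that disagrees."""
    mismatches = {}
    for expr, wanted in expected.items():
        got = browser.evaluate(expr)  # assumed signature, see the note above
        if got != wanted:
            mismatches[expr] = (wanted, got)
    return mismatches
```

An empty dict means every probed value matched the table; any entry shows the expected and observed value side by side.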
To get a vanilla automation profile (debugging, captcha tests), call sb.browser.start(stealth=False).
Captcha solving¶
Two paths. Use whichever is cheaper:
For reCAPTCHA v2 on sites that don't pre-block your IP. Cost: $0, accuracy ~70-80%, latency ~5-10 s.
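The API overview lists `detect_captcha` and `inject_captcha_token`. A caller-side flow built on those two calls might look like the sketch below. The return value of `detect_captcha()` (treated as falsy when no captcha is present) and the single token argument to `inject_captcha_token()` are assumptions, and `fetch_token` is a hypothetical caller-supplied function standing in for whatever produces the token.

```python
def clear_captcha(browser, fetch_token):
    """Detect a captcha and, if one is present, inject a solved token.

    fetch_token is a caller-supplied callable (hypothetical) that receives
    whatever detect_captcha() returned and hands back a token string.
    """
    found = browser.detect_captcha()  # assumed falsy when nothing is detected
    if not found:
        return False
    token = fetch_token(found)
    browser.inject_captcha_token(token)  # assumed to take the token string
    return True
```

Returning a boolean lets the caller skip the token step, and its cost, on pages where nothing was detected.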
Profile persistence¶
Cookies + localStorage + sessionStorage survive across sandbox lifetimes when you save them as a named profile:
# First sandbox: log in once.
with Sandbox.create(template="browser-use") as sb:
    sb.browser.start()
    sb.browser.navigate("https://app.example.com/login")
    sb.browser.fill("#email", "alice@example.com")
    sb.browser.fill("#password", "...")
    sb.browser.click("button[type=submit]")
    sb.browser.save_profile("alice-app")

# Later sandbox: restore, already logged in.
with Sandbox.create(template="browser-use") as sb:
    sb.browser.start(profile="alice-app")
    sb.browser.navigate("https://app.example.com/dashboard")
    print(sb.browser.text("h1"))  # whatever's gated behind login
Profiles live in /var/firebox-profiles/<name>.json inside the
template's rootfs. They don't leak across templates — alice-app
saved from a browser-use template only loads when you start from
that same template.
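The two steps can be folded into one helper that restores a profile when it exists and otherwise logs in and saves it. This is a sketch under assumptions: `list_profiles()` is taken to return an iterable of profile names and to be callable before `start()`, neither of which this page confirms, and `login` is a hypothetical caller-supplied function that drives the login form.

```python
def start_logged_in(browser, profile, login):
    """Start from a saved profile, or log in once and save it.

    Assumes list_profiles() returns an iterable of profile names.
    login is a caller-supplied function that drives the login form.
    """
    if profile in browser.list_profiles():
        browser.start(profile=profile)
        return "restored"
    browser.start()
    login(browser)
    browser.save_profile(profile)
    return "created"
```

In practice `login` should wait for the post-login page before returning, so that the saved profile actually contains the session cookies.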
Live preview via VNC¶
The browser-use template ships with Xvfb + x11vnc + fluxbox.
Useful when an LLM-driven agent's flow is failing and you want to
watch.
# In the sandbox:
sb.run("firebox-display firebox") # password=firebox
# On the host, DNAT a public port to the sandbox's :5900, then:
open vnc://:firebox@your-host:33000
The run_agent_vnc.py example wires this up end-to-end.
What it can't do¶
- GPU rendering: Chromium runs on CPU. Bake an --enable-gpu flag and live with software fallback if you must.
- Real Chrome TLS: Chromium's TLS hello differs from Chrome's. Cloudflare Bot Manager / Akamai do fingerprint at the TLS layer. Workaround: use sb.http (curl_cffi with real Chrome 120 JA3) for raw HTTP calls behind the same cookies.
- Mobile emulation: viewport-based only; no touch event fidelity.
- Pause / resume of the whole VM mid-test: no Firecracker snapshot integration yet.
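For the TLS workaround, "behind the same cookies" means carrying the browser session's cookies into the raw HTTP call. The sketch below builds a `Cookie` header from a cookie list, assuming `sb.browser.cookies()` returns Playwright-style dicts with `name`, `value`, and `domain` keys; that return shape is an assumption. How the header is handed to `sb.http` is left out because this page does not show that call's signature, and `cookie_header` is a hypothetical helper.

```python
def cookie_header(cookies, host):
    """Build a Cookie header value from the cookies that apply to host.

    cookies is assumed to be a list of Playwright-style dicts with
    "name", "value" and "domain" keys; a leading dot on the domain
    means the cookie also applies to subdomains.
    """
    pairs = []
    for cookie in cookies:
        domain = cookie.get("domain", "").lstrip(".")
        if domain and (host == domain or host.endswith("." + domain)):
            pairs.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(pairs)
```

This deliberately ignores cookie path, expiry, and the Secure flag, so it suits a quick same-site request rather than a full cookie jar.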