Search¶

sb.search is a metasearch facade over a self-hosted SearxNG instance reachable from the sandbox. One query fans out to 5–15 engines (Google via Brave proxy, Bing, DuckDuckGo, Wikipedia, arXiv, GitHub, Stack Overflow, ...) and returns a single deduplicated JSON response. No API keys are needed, and there are no per-engine rate limits when you self-host.

Routing¶

flowchart LR
    Caller["sb.search.query()"] --HTTP--> SX["SearxNG :8888"]
    SX -.fan-out.-> G[Google via Brave]
    SX -.-> B[Bing]
    SX -.-> D[DuckDuckGo]
    SX -.-> W[Wikipedia]
    SX -.-> A[arxiv]
    SX -.-> GH[GitHub]
    SX -.-> Etc[200+ more]

    G --> Merge[merge + dedupe]
    B --> Merge
    D --> Merge
    W --> Merge
    A --> Merge
    GH --> Merge
    Etc --> Merge
    Merge -->|json| Caller
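
The merge step in the diagram can be pictured with a minimal sketch. It assumes each engine returns dictionaries with url, title and score keys, and that duplicates are detected by comparing a normalised URL. SearxNG's own merging is more involved; the helper names normalise_url and merge_results are hypothetical.

```python
from urllib.parse import urlsplit


def normalise_url(url: str) -> str:
    """Reduce a URL to a comparison key: no scheme, no 'www.', no trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(per_engine: dict[str, list[dict]]) -> list[dict]:
    """Merge per-engine result lists, keeping one entry per normalised URL."""
    merged: dict[str, dict] = {}
    for engine, results in per_engine.items():
        for result in results:
            key = normalise_url(result["url"])
            if key not in merged:
                # First sighting: copy the result and start the engine list.
                merged[key] = {**result, "engines": [engine]}
            else:
                entry = merged[key]
                entry["engines"].append(engine)
                # Assumption: a hit confirmed by several engines ranks higher.
                entry["score"] += result["score"]
    return sorted(merged.values(), key=lambda r: r["score"], reverse=True)
```

Summing scores is only one possible ranking rule; it is used here because it rewards results that several engines agree on.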

By default sb.search points at http://10.42.0.1:8888, the SearxNG instance running on the host. Set FIREBOX_SEARXNG_URL in the agent's environment to point it elsewhere.
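
The lookup order can be sketched as follows. This is an illustration of the behaviour described above, not the SDK's actual code: resolve_searxng_url is a hypothetical helper, and treating an empty variable as unset is an assumption.

```python
import os

DEFAULT_SEARXNG_URL = "http://10.42.0.1:8888"  # default stated in the text


def resolve_searxng_url(env=None) -> str:
    """Return the SearxNG base URL, preferring FIREBOX_SEARXNG_URL when set."""
    env = os.environ if env is None else env
    url = env.get("FIREBOX_SEARXNG_URL", "").strip()
    # Assumption: an empty or whitespace-only value counts as unset.
    return (url or DEFAULT_SEARXNG_URL).rstrip("/")
```

Passing a plain dict as env makes the function easy to exercise without touching the real process environment.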

Helpers per category¶

sb.search.web("rust web framework")              # general
sb.search.news("AI agents", time_range="day")    # news engines, last day
sb.search.papers("microvm performance")          # arxiv, Scholar, Semantic Scholar
sb.search.code("playwright stealth github")      # GitHub, SO, NPM, PyPI
sb.search.images("hacker news logo")             # Bing/DDG images
sb.search.videos("rust async tutorial")          # YouTube, Vimeo, Bilibili
sb.search.wiki("Firecracker (software)")         # Wikipedia
sb.search.maps("Ljubljana")                      # OpenStreetMap

Or the lower-level call:

sb.search.query(
    "AI agent benchmarks",
    categories="general",          # or list
    engines=["brave", "startpage"],
    language="en",                 # ISO code
    time_range="month",            # day|week|month|year
    pageno=1,
    safesearch=0,                  # 0|1|2
    cache=True,                    # 5-min in-sandbox cache, on by default
)
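
For orientation, these arguments correspond closely to the query parameters of SearxNG's own /search endpoint. The sketch below builds such a request URL. It assumes the instance has the JSON output format enabled, and build_search_url is a hypothetical helper rather than part of the SDK.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, *, categories=None, engines=None,
                     language=None, time_range=None, pageno=1, safesearch=0):
    """Build a SearxNG /search URL that requests JSON output."""

    def as_csv(value):
        # SearxNG takes comma-separated lists for categories and engines.
        if value is None:
            return None
        return value if isinstance(value, str) else ",".join(value)

    params = {
        "q": query,
        "format": "json",
        "categories": as_csv(categories),
        "engines": as_csv(engines),
        "language": language,
        "time_range": time_range,
        "pageno": pageno,
        "safesearch": safesearch,
    }
    # Drop unset options so the instance falls back to its own defaults.
    params = {k: v for k, v in params.items() if v is not None}
    return base_url.rstrip("/") + "/search?" + urlencode(params)
```

Calling build_search_url("http://10.42.0.1:8888", "AI agent benchmarks", engines=["brave", "startpage"]) yields a URL that can be fetched with any HTTP client.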

Each result is a SearchResult:

@dataclass
class SearchResult:
    title: str
    url: str
    content: str         # snippet
    engine: str          # "brave" / "bing" / ...
    score: float
    category: str
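
A minimal sketch of working with these records follows. It redeclares the dataclass so the snippet is self-contained, and it assumes the raw payload is a SearxNG-style JSON object with a results list. parse_results and top_results are hypothetical helpers.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:  # mirrors the fields listed above
    title: str
    url: str
    content: str
    engine: str
    score: float
    category: str


def parse_results(payload: dict) -> list[SearchResult]:
    """Convert a SearxNG-style JSON payload into SearchResult objects."""
    results = []
    for item in payload.get("results", []):
        results.append(SearchResult(
            title=item.get("title", ""),
            url=item["url"],
            content=item.get("content", ""),  # some engines return no snippet
            engine=item.get("engine", ""),
            score=float(item.get("score", 0.0)),
            category=item.get("category", ""),
        ))
    return results


def top_results(results: list[SearchResult], n: int = 5) -> list[SearchResult]:
    """Return the n highest-scoring results."""
    return sorted(results, key=lambda r: r.score, reverse=True)[:n]
```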

Cache¶

The SDK caches responses in memory for 300 s, keyed on (base_url, normalised_params), with LRU eviction at 256 entries. Repeating a query within five minutes is served from memory in under a millisecond:

sb.search.web("foo")    # ~1 s, network round-trip
sb.search.web("foo")    # <1 ms, served from memory

Disable per call with cache=False if freshness matters more than speed.
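
The combination of a time-to-live and LRU eviction can be sketched with an OrderedDict. This illustrates the technique under the numbers quoted above (300 s, 256 entries); the class name TTLCache and its methods are hypothetical and not the SDK's internals.

```python
import time
from collections import OrderedDict


class TTLCache:
    """In-memory cache with a time-to-live and LRU eviction."""

    def __init__(self, ttl=300.0, max_entries=256, clock=time.monotonic):
        self.ttl = ttl
        self.max_entries = max_entries
        self.clock = clock  # injectable so expiry can be tested
        self._entries = OrderedDict()  # key -> (expires_at, value)

    @staticmethod
    def make_key(base_url, params):
        # Sort parameters so that argument order does not change the key.
        return (base_url, tuple(sorted((k, str(v)) for k, v in params.items())))

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: treat as a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self.clock() + self.ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

Normalising the parameters before building the key is what lets two calls with the same arguments in a different order share one cache entry.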

CLI¶

The CLI bypasses sandboxes entirely: it queries SearxNG directly at your FIREBOX_SEARXNG_URL (or the default address), so there is no microVM spin-up cost.

firebox search "firecracker microvm" -n 5
firebox search "AI agents" -c news -t week
firebox search "rust web" -c it -e brave,bing
firebox search "playwright stealth" --json | jq '.[0].url'

# Pipe-friendly:
firebox search "fitness blog contact" -n 20 -u \
    | xargs -P5 -I{} curl -s -o /dev/null -w "%{http_code} {}\n" {}

Flags:

Flag	Default	Notes
`-n` `--limit`	10	max results
`-c` `--category`	–	`general`/`news`/`science`/`it`/`images`/`videos`/`map`
`-e` `--engines`	–	comma-list, e.g. `brave,bing,google`
`-l` `--language`	–	ISO code
`-t` `--time-range`	–	`day`/`week`/`month`/`year`
`--pageno`	1	pagination
`--safesearch`	0	0/1/2
`--json`	–	machine-readable output
`-u` `--urls-only`	–	one URL per line, xargs-friendly
`-T` `--titles-only`	–	one title per line
`--field FIELD`	–	print one field per line (engine, content, ...)
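
For scripts that prefer Python to jq, the JSON output can be consumed as in the sketch below. It assumes --json prints an array of result objects that each carry a url field, as the jq filter '.[0].url' above implies. Both function names are hypothetical, and only the -n and --json flags from the table are used.

```python
import json
import subprocess


def extract_urls(raw_json: str) -> list[str]:
    """Pull the url field out of each result in the CLI's JSON output."""
    # Assumption: the output is a JSON array of objects with a "url" key.
    return [item["url"] for item in json.loads(raw_json) if "url" in item]


def firebox_search_urls(query: str, limit: int = 10) -> list[str]:
    """Run `firebox search` and return result URLs (needs the CLI on PATH)."""
    proc = subprocess.run(
        ["firebox", "search", query, "-n", str(limit), "--json"],
        capture_output=True, text=True, check=True,
    )
    return extract_urls(proc.stdout)
```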

When to use `sb.search` vs the CLI¶

Inside an agent flow: use sb.search. It shares the sandbox's IP and cache, and the caller does not need to reach SearxNG itself.
Quick terminal lookup or shell pipeline: use firebox search. There is no microVM cost and output is immediate.
From an MCP-aware agent: use the MCP search tool. It talks to the same backend in a single tool call.