Search
sb.search is an aggregated metasearch facade over a self-hosted
SearxNG instance reachable from the sandbox.
One query fans out to 5–15 engines (Google via Brave proxy, Bing,
DuckDuckGo, Wikipedia, arXiv, GitHub, Stack Overflow, ...) and returns
a single deduplicated JSON result list. No API keys. No per-engine rate
limits when you self-host.
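To make "fan out, then merge and dedupe" concrete, here is a minimal sketch of the merge step. It is not SearxNG's actual algorithm: the names `normalise_url` and `merge_results` are hypothetical, and the reciprocal-rank score is an illustrative assumption.

```python
from urllib.parse import urlsplit


def normalise_url(url: str) -> str:
    """Build a comparison key: drop scheme, 'www.', fragment and trailing slash."""
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


def merge_results(per_engine: dict[str, list[dict]]) -> list[dict]:
    """Merge per-engine result lists into one list with one entry per URL.

    Assumption: each hit contributes 1/rank to the merged score, so a URL
    returned by several engines ranks above one returned by a single engine.
    """
    merged: dict[str, dict] = {}
    for engine, results in per_engine.items():
        for rank, item in enumerate(results, start=1):
            key = normalise_url(item["url"])
            entry = merged.setdefault(
                key,
                {"title": item["title"], "url": item["url"], "engines": [], "score": 0.0},
            )
            entry["engines"].append(engine)
            entry["score"] += 1.0 / rank
    return sorted(merged.values(), key=lambda e: e["score"], reverse=True)
```

With this sketch, `https://www.example.com/a/` from one engine and `http://example.com/a` from another collapse into a single entry that lists both engines.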
Routing
flowchart LR
Caller["sb.search.query()"] --HTTP--> SX["SearxNG :8888"]
SX -.fan-out.-> G[Google via Brave]
SX -.-> B[Bing]
SX -.-> D[DuckDuckGo]
SX -.-> W[Wikipedia]
SX -.-> A[arxiv]
SX -.-> GH[GitHub]
SX -.-> Etc[200+ more]
G --> Merge[merge + dedupe]
B --> Merge
D --> Merge
W --> Merge
A --> Merge
GH --> Merge
Etc --> Merge
Merge -->|json| Caller
By default sb.search points at http://10.42.0.1:8888 — the host's
SearxNG. Set FIREBOX_SEARXNG_URL in the agent's env to point
elsewhere.
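The resolution order described above (environment variable first, then the default host address) and the resulting request can be sketched as follows. This is a hedged illustration, not the SDK's source: `resolve_base_url` and `build_search_url` are hypothetical names. The `/search` endpoint with `format=json` is SearxNG's search API, which only answers in JSON if that format is enabled in the instance settings.

```python
import os
from urllib.parse import urlencode

DEFAULT_SEARXNG_URL = "http://10.42.0.1:8888"  # default from the text above


def resolve_base_url(env=None) -> str:
    """Prefer FIREBOX_SEARXNG_URL, fall back to the host's SearxNG."""
    env = os.environ if env is None else env
    return (env.get("FIREBOX_SEARXNG_URL") or DEFAULT_SEARXNG_URL).rstrip("/")


def build_search_url(query: str, base_url=None, **params) -> str:
    """Compose a SearxNG search URL; list values become comma-separated."""
    base = base_url or resolve_base_url()
    fields = {"q": query, "format": "json"}
    for name, value in params.items():
        if value is None:
            continue  # unset options are left to the server's defaults
        fields[name] = ",".join(value) if isinstance(value, (list, tuple)) else value
    return f"{base}/search?{urlencode(fields)}"
```

The returned URL can be fetched with any HTTP client, for example `urllib.request.urlopen`.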
Helpers per category
sb.search.web("rust web framework") # general
sb.search.news("AI agents", time_range="day") # news engines, last day
sb.search.papers("microvm performance") # arxiv, Scholar, Semantic Scholar
sb.search.code("playwright stealth github") # GitHub, SO, NPM, PyPI
sb.search.images("hacker news logo") # Bing/DDG images
sb.search.videos("rust async tutorial") # YouTube, Vimeo, Bilibili
sb.search.wiki("Firecracker (software)") # Wikipedia
sb.search.maps("Ljubljana") # OpenStreetMap
Or the lower-level call:
sb.search.query(
    "AI agent benchmarks",
    categories="general",           # or list
    engines=["brave", "startpage"],
    language="en",                  # ISO code
    time_range="month",             # day|week|month|year
    pageno=1,
    safesearch=0,                   # 0|1|2
    cache=True,                     # 5-min in-sandbox cache, on by default
)
Each result is a SearchResult:
@dataclass
class SearchResult:
    title: str
    url: str
    content: str   # snippet
    engine: str    # "brave" / "bing" / ...
    score: float
    category: str
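As a rough illustration of how a SearxNG JSON response could be mapped onto this dataclass, consider the sketch below. `parse_results` is a hypothetical helper, not part of the SDK. The dataclass is repeated so the block runs on its own, and the fallback values for missing fields are assumptions.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    title: str
    url: str
    content: str   # snippet
    engine: str
    score: float
    category: str


def parse_results(payload: dict) -> list[SearchResult]:
    """Convert the 'results' array of a SearxNG JSON response."""
    parsed = []
    for item in payload.get("results", []):
        parsed.append(
            SearchResult(
                title=item.get("title", ""),
                url=item["url"],                      # a result without a URL is unusable
                content=item.get("content", ""),      # some engines return no snippet
                engine=item.get("engine", ""),
                score=float(item.get("score", 0.0)),
                category=item.get("category", ""),
            )
        )
    return parsed
```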
Cache
The SDK caches results keyed on (base_url, normalised_params) for 300 s
in memory, LRU-evicted at 256 entries. A repeated query within five
minutes returns in microseconds.
Disable per call with cache=False if freshness matters more than
speed.
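The behaviour described here (a 300 s time-to-live plus LRU eviction at 256 entries) can be sketched with an `OrderedDict`. This is an assumption about how such a cache might look, not the SDK's implementation: `TTLCache` and `make_key` are hypothetical names, and the injectable clock exists only to make the sketch easy to test.

```python
import time
from collections import OrderedDict


def make_key(base_url: str, params: dict) -> tuple:
    """Normalise params: drop unset values, sort by name, freeze lists."""
    items = (
        (name, tuple(value) if isinstance(value, list) else value)
        for name, value in params.items()
        if value is not None
    )
    return (base_url, tuple(sorted(items)))


class TTLCache:
    def __init__(self, ttl=300.0, max_entries=256, clock=time.monotonic):
        self._ttl = ttl
        self._max = max_entries
        self._clock = clock
        self._data = OrderedDict()  # key -> (stored_at, value), oldest use first

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._data[key]          # expired: treat as a miss
            return None
        self._data.move_to_end(key)      # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock(), value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the least recently used entry
```

Because the key is built from sorted, filtered parameters, two calls that differ only in argument order or in unset options share one cache entry.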
CLI
The CLI bypasses sandboxes entirely — it hits SearxNG directly via
your FIREBOX_SEARXNG_URL (or default). No microVM spin-up cost.
firebox search "firecracker microvm" -n 5
firebox search "AI agents" -c news -t week
firebox search "rust web" -c it -e brave,bing
firebox search "playwright stealth" --json | jq '.[0].url'
# Pipe-friendly:
firebox search "fitness blog contact" -n 20 -u \
| xargs -P5 -I{} curl -s -o /dev/null -w "%{http_code} {}\n" {}
Flags:
| Flag | Default | Notes |
|---|---|---|
| -n --limit | 10 | max results |
| -c --category | – | general/news/science/it/images/videos/map |
| -e --engines | – | comma-list, e.g. brave,bing,google |
| -l --language | – | ISO code |
| -t --time-range | – | day/week/month/year |
| --pageno | 1 | pagination |
| --safesearch | 0 | 0/1/2 |
| --json | – | machine-readable output |
| -u --urls-only | – | one URL per line, xargs-friendly |
| -T --titles-only | – | one title per line |
| --field FIELD | – | print one field per line (engine, content, ...) |
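For readers who want to mirror these flags in their own tooling, the flag list maps onto an `argparse` parser roughly as below. This is a hypothetical reconstruction from the documented flags and defaults only, not the firebox CLI's actual source. The `choices` restrictions and the comma-splitting of `--engines` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="firebox search")
    parser.add_argument("query")
    parser.add_argument("-n", "--limit", type=int, default=10, help="max results")
    parser.add_argument("-c", "--category")
    # Assumption: the comma-list is split into a Python list at parse time.
    parser.add_argument("-e", "--engines", type=lambda s: s.split(","))
    parser.add_argument("-l", "--language", help="ISO code")
    parser.add_argument("-t", "--time-range", choices=["day", "week", "month", "year"])
    parser.add_argument("--pageno", type=int, default=1)
    parser.add_argument("--safesearch", type=int, choices=[0, 1, 2], default=0)
    parser.add_argument("--json", action="store_true", help="machine-readable output")
    parser.add_argument("-u", "--urls-only", action="store_true")
    parser.add_argument("-T", "--titles-only", action="store_true")
    parser.add_argument("--field", metavar="FIELD")
    return parser
```

Parsing `["rust web", "-c", "it", "-e", "brave,bing"]` with this parser yields `category="it"` and `engines=["brave", "bing"]`, with every other option left at its documented default.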
When to use sb.search vs the CLI
- Inside an agent flow: sb.search, because it shares the sandbox's IP + cache and doesn't need the caller to reach SearxNG.
- Quick terminal lookup or shell pipeline: firebox search, with zero microVM cost and instant output.
- From an MCP-aware agent: the MCP search tool. Same backend, one tool call.