MCP server¶
Firebox ships an MCP stdio server that exposes the daemon's surface as 22 tools any MCP-aware client can call. Drop one config block into Claude Desktop / Cursor / Claude Code / ChatGPT Agent / any other MCP host and your model gets native sandbox + browser tools.
Install¶
The [mcp] extra pulls in the mcp Python SDK. Without it the
server entry point is missing.
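One way to confirm the extra is installed is to check whether the `mcp` package is importable. This is a minimal sketch using only the standard library; `has_module` is a hypothetical helper name, not part of Firebox.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if the top-level module `name` can be found on the import path."""
    return importlib.util.find_spec(name) is not None


if not has_module("mcp"):
    # Without the SDK the MCP server entry point cannot start.
    print("mcp SDK not found; install Firebox with the [mcp] extra")
```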
Configure your MCP host¶
For Claude Desktop, edit ~/Library/Application Support/Claude/claude_desktop_config.json
(macOS) or the platform equivalent:
{
  "mcpServers": {
    "firebox": {
      "command": "python3",
      "args": ["-m", "firebox.mcp.server"],
      "env": {
        "FIREBOX_URL": "https://firebox.example.com",
        "FIREBOX_TOKEN": "<paste-secret-here>"
      }
    }
  }
}
Restart Claude Desktop. The 22 firebox tools appear in the model's toolbox automatically — no per-conversation setup.
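If the config file already lists other MCP servers, the firebox entry has to be merged in rather than pasted over them. The following is a sketch of that merge, not an official installer; `add_firebox_server` and `update_config_file` are hypothetical helper names, and the entry mirrors the JSON block above.

```python
import json
from pathlib import Path


def add_firebox_server(config: dict, url: str, token: str) -> dict:
    """Return a copy of `config` with a "firebox" entry under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["firebox"] = {
        "command": "python3",
        "args": ["-m", "firebox.mcp.server"],
        "env": {"FIREBOX_URL": url, "FIREBOX_TOKEN": token},
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path, url: str, token: str) -> None:
    """Read the host config (if present), merge the entry, and write it back."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_firebox_server(existing, url, token), indent=2))
```

Calling `update_config_file` with the path to `claude_desktop_config.json` leaves every other server entry untouched.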
In hosts that register servers through a UI (Settings → MCP → New Server),
the pattern is the same: set command, args, and env so that the spawned process
inherits FIREBOX_URL and FIREBOX_TOKEN. The MCP server reads
both on startup and uses them for every tool call.
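To illustrate what reading the two variables on startup might look like, here is a minimal sketch. It is an assumption about the behavior, not the server's actual code; `load_firebox_settings` is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def load_firebox_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read FIREBOX_URL and FIREBOX_TOKEN, failing early if either is unset."""
    env = os.environ if env is None else env
    missing = [key for key in ("FIREBOX_URL", "FIREBOX_TOKEN") if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Strip a trailing slash so later path joins do not produce "//".
    return {"url": env["FIREBOX_URL"].rstrip("/"), "token": env["FIREBOX_TOKEN"]}
```

Failing at startup rather than on the first tool call makes a misconfigured host entry visible immediately.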
Tool catalog¶
flowchart LR
  subgraph Lifecycle
    sandbox_open
    sandbox_list
    sandbox_close
    sandbox_run
  end
  subgraph Files
    file_read
    file_write
  end
  subgraph Browser
    browser_start
    browser_navigate
    browser_close
    browser_back[browser_back/forward/reload]
    browser_click
    browser_click_at
    browser_fill
    browser_press
    browser_type
    browser_wait_for
    browser_text
    browser_text_all
    browser_html
    browser_screenshot
    browser_screenshot_annotated
    browser_clickables
    browser_evaluate
  end
  subgraph Search
    search
  end
Tools at a glance¶
| Tool | Returns | Notes |
|---|---|---|
| `sandbox_open` | `{id, ip, template}` | Pass the returned `id` to every other tool as `sandbox_id` |
| `sandbox_close` | `{closed}` | Frees resources immediately |
| `sandbox_list` | `{sandboxes}` | Filtered to caller's tokens |
| `sandbox_run` | `{stdout, stderr, exit_code}` | Synchronous shell exec |
| `file_read` / `file_write` | `{content}` / `{bytes_written}` | UTF-8 |
| `browser_start` | `{url, title}` | `stealth` defaults to true |
| `browser_navigate` | `{url, title}` | Reuses tab |
| `browser_click` / `browser_click_at` | `{ok, url, navigated}` | Selector or (x, y) |
| `browser_fill` | `{ok}` | Clears prior value |
| `browser_press` | `{ok}` | Single key, e.g. "Enter" |
| `browser_text` | `{text}` | Selector optional |
| `browser_text_all` | `{items}` | List of strings |
| `browser_html` | `{html}` | Whole doc or selector subtree |
| `browser_screenshot` | ImageContent (PNG) | LLM sees the image |
| `browser_screenshot_annotated` | ImageContent (PNG) | Yellow numbered boxes |
| `browser_clickables` | `{clickables}` | idx, x, y, text, href per element |
| `browser_evaluate` | `{result}` | Raw JS escape hatch |
| `search` | `{results}` | Aggregated SearxNG |
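The `browser_clickables` and `browser_click_at` rows combine naturally: pick an element from the list, then click its coordinates. The sketch below assumes each element is a dict with the `idx`, `x`, `y`, `text`, `href` fields listed in the table; the exact payload shape and the `x`/`y` argument names for `browser_click_at` are assumptions, and both helper names are hypothetical.

```python
from typing import Optional


def find_clickable(clickables: list, needle: str) -> Optional[dict]:
    """Return the first element whose text contains `needle` (case-insensitive)."""
    needle = needle.lower()
    for item in clickables:
        if needle in (item.get("text") or "").lower():
            return item
    return None


def click_at_arguments(sandbox_id: str, item: dict) -> dict:
    """Build the arguments for a browser_click_at call (argument names assumed)."""
    return {"sandbox_id": sandbox_id, "x": item["x"], "y": item["y"]}


# Illustrative data only; real values come from a browser_clickables call.
sample = [
    {"idx": 0, "x": 40, "y": 12, "text": "Home", "href": "/"},
    {"idx": 1, "x": 120, "y": 12, "text": "Sign in", "href": "/login"},
]
target = find_clickable(sample, "sign in")
```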
How a model uses it¶
A typical Claude conversation flows like this:
sequenceDiagram
  participant U as User
  participant M as Claude Desktop
  participant S as MCP firebox server
  participant D as Daemon
  U->>M: "Open hbs.si and tell me the headline."
  M->>S: tools/call sandbox_open {template: "browser-use"}
  S->>D: POST /sandboxes
  D-->>S: { id, ip }
  S-->>M: { sandbox_id }
  M->>S: tools/call browser_start {sandbox_id}
  S->>D: ... → in-VM agent
  M->>S: tools/call browser_navigate {url: "https://hbs.si"}
  M->>S: tools/call browser_text {selector: "h1"}
  S-->>M: { text: "Hermes: Digitalna transformacija" }
  M-->>U: "Hermes: Digitalna transformacija"
  M->>S: tools/call sandbox_close {sandbox_id}
The model decides which tools to call based on the user's instruction; the MCP server forwards each call to the daemon over HTTP. No model runs inside the sandbox.
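On the wire, each `tools/call` in the flow above is a JSON-RPC 2.0 request carrying the tool name and its arguments. The sketch below builds those requests as plain dicts. The envelope follows the MCP `tools/call` shape; the argument names are taken from the diagram, and passing `sandbox_id` on every browser call is an assumption. `tool_call` and `headline_flow` are hypothetical helper names.

```python
def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Wrap a tool invocation in a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def headline_flow(sandbox_id: str, url: str) -> list:
    """Requests sent after sandbox_open has returned `sandbox_id`."""
    steps = [
        ("browser_start", {"sandbox_id": sandbox_id}),
        ("browser_navigate", {"sandbox_id": sandbox_id, "url": url}),
        ("browser_text", {"sandbox_id": sandbox_id, "selector": "h1"}),
        ("sandbox_close", {"sandbox_id": sandbox_id}),
    ]
    # Request 1 is the sandbox_open call, so the follow-ups start at id 2.
    return [tool_call(i, name, args) for i, (name, args) in enumerate(steps, start=2)]


open_request = tool_call(1, "sandbox_open", {"template": "browser-use"})
```

In practice the MCP host builds these requests itself; the sketch only shows what the server receives.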
Verifying it works¶
From a terminal, drive the MCP server manually:
# MCP stdio framing is newline-delimited, so the whole JSON message must stay on one line.
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"0"}}}' | \
  python3 -m firebox.mcp.server
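The server replies with its protocol version, capabilities, and server info. The tool list itself is returned by a follow-up tools/list request once initialization completes. Claude Desktop performs the same handshake on its side.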
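To go one step further and list the tools, a script can send the full handshake: `initialize`, the `notifications/initialized` notification, then `tools/list`. The following is a sketch under assumptions: it presumes the server processes queued messages and exits when stdin closes, and `frame` and `smoke_test` are hypothetical helper names.

```python
import json
import subprocess


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


HANDSHAKE = [
    {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "smoke", "version": "0"},
        },
    },
    # Notifications carry no id and receive no response.
    {"jsonrpc": "2.0", "method": "notifications/initialized"},
    {"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
]


def smoke_test(timeout: float = 10.0) -> list:
    """Spawn the server, send the handshake, and return the parsed responses."""
    payload = b"".join(frame(m) for m in HANDSHAKE)
    proc = subprocess.run(
        ["python3", "-m", "firebox.mcp.server"],
        input=payload,
        capture_output=True,
        timeout=timeout,
    )
    return [json.loads(line) for line in proc.stdout.splitlines() if line.strip()]
```

Calling `smoke_test()` with FIREBOX_URL and FIREBOX_TOKEN set in the environment should yield one response per request that has an `id`.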