docs: Add agent browser use case and reorganize browser use section

beran-t · beran-t · commit 47c28f843c33 · 2026-03-17T19:16:49.000+01:00
diff --git a/docs.json b/docs.json
@@ -48,7 +48,14 @@
               "docs/use-cases/coding-agents",
               "docs/use-cases/computer-use",
               "docs/use-cases/ci-cd",
-              "docs/use-cases/browser-use"
+              {
+                "group": "Browser use",
+                "icon": "globe",
+                "pages": [
+                  "docs/use-cases/browser-use",
+                  "docs/use-cases/agent-browser"
+                ]
+              }
             ]
           },
           {
diff --git a/docs/use-cases/agent-browser.mdx b/docs/use-cases/agent-browser.mdx
@@ -0,0 +1,235 @@
+---
+title: "Agent remote browser"
+description: "Run an autonomous AI agent inside an E2B sandbox that browses the web using a Kernel cloud browser and the Browser Use framework."
+icon: "robot"
+---
+
+Run an AI agent inside an E2B sandbox that autonomously controls a [Kernel](https://www.kernel.computer/) cloud browser. The agent decides what to click, type, and navigate — you just give it a task.
+
+This builds on the [remote browser](/docs/use-cases/browser-use) pattern by adding the [Browser Use](https://docs.browser-use.com/) framework, which turns an LLM into a browser-controlling agent.
+
+## Architecture
+
+1. **E2B Sandbox** — isolated environment where the agent code runs. Pre-installed with Kernel SDK, Playwright, and Browser Use.
+2. **Kernel Cloud Browser** — remote Chromium instance the agent controls via CDP.
+3. **Browser Use** — agent framework that connects an LLM to Playwright. The LLM sees screenshots and decides actions (click, type, scroll, navigate).
+
+The orchestrator creates the sandbox and kicks off the agent. The agent runs autonomously inside the sandbox — it creates a Kernel browser, connects Browser Use, and executes the task.
+
+## Prerequisites
+
+- An [E2B API key](https://e2b.dev/dashboard?tab=keys)
+- A [Kernel API key](https://www.kernel.computer/)
+- An LLM API key (Anthropic, OpenAI, or another [supported model](https://docs.browser-use.com/customize/supported-models))
+- Python 3.10+
+
+```bash
+pip install e2b-code-interpreter
+```
+
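+Set your keys in the environment. The examples read them from `os.environ`, so if you keep them in a `.env` file, load it into your shell or process before running: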
+
+```bash .env
+E2B_API_KEY=e2b_***
+KERNEL_API_KEY=kernel_***
+ANTHROPIC_API_KEY=sk-ant-***
+```
+
+## How it works
+
+<Steps>
+<Step title="Create the sandbox">
+Start an E2B sandbox using the `kernel-agent-browser` template, which comes with the Kernel SDK, Playwright, and Browser Use pre-installed. Pass the API keys the agent will need.
+
+```python
+import os
+
+from e2b_code_interpreter import Sandbox
+
+sandbox = Sandbox.create(
+    "kernel-agent-browser",
+    envs={
+        "KERNEL_API_KEY": os.environ["KERNEL_API_KEY"],
+        "ANTHROPIC_API_KEY": os.environ["ANTHROPIC_API_KEY"],
+    },
+    timeout=300,
+)
+```
+</Step>
+
+<Step title="Write the agent script">
+The agent script creates a Kernel browser, connects Browser Use to it, and runs a task autonomously.
+
+```python
+AGENT_SCRIPT = '''
+import asyncio
+from kernel import Kernel
+from browser_use import Agent, Browser, ChatAnthropic
+
+async def main():
+    kernel = Kernel()
+    kb = kernel.browsers.create()
+
+    browser = Browser(cdp_url=kb.cdp_ws_url)
+
+    agent = Agent(
+        task="Go to Hacker News, find the top 3 AI stories, and summarize them",
+        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
+        browser=browser,
+    )
+    result = await agent.run()
+    print(result)
+
+asyncio.run(main())
+'''
+
+sandbox.files.write("/home/user/agent_task.py", AGENT_SCRIPT)
+```
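+
+To reuse the script with different tasks, one option is to generate it from a template on the orchestrator side. The sketch below is illustrative only: `AGENT_TEMPLATE`, the `__TASK__` placeholder, and `build_agent_script` are hypothetical names, not part of E2B, Kernel, or Browser Use. It embeds the task with `repr()` so that quotes and newlines in the task cannot break the generated source.
+
+```python
+import ast
+
+# Hypothetical template: mirrors the agent script above, with a placeholder for the task.
+AGENT_TEMPLATE = '''
+import asyncio
+from kernel import Kernel
+from browser_use import Agent, Browser, ChatAnthropic
+
+TASK = __TASK__
+
+async def main():
+    kernel = Kernel()
+    kb = kernel.browsers.create()
+    browser = Browser(cdp_url=kb.cdp_ws_url)
+    agent = Agent(
+        task=TASK,
+        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
+        browser=browser,
+    )
+    print(await agent.run())
+
+asyncio.run(main())
+'''
+
+
+def build_agent_script(task: str) -> str:
+    """Return agent source code with `task` embedded as a Python string literal."""
+    # repr() produces a valid, fully escaped literal for any str.
+    return AGENT_TEMPLATE.replace("__TASK__", repr(task))
+
+
+script = build_agent_script('Find the "top" story\non Hacker News')
+ast.parse(script)  # raises SyntaxError if the generated source is malformed
+```
+
+The returned string can be written to the sandbox with `sandbox.files.write(...)` in place of the hard-coded `AGENT_SCRIPT`.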
+</Step>
+
+<Step title="Run the agent">
+Execute the agent inside the sandbox. The agent will autonomously browse, click, type, and navigate to complete the task.
+
+```python
+result = sandbox.commands.run(
+    "python3 /home/user/agent_task.py",
+    timeout=180,
+)
+print(result.stdout)
+```
+</Step>
+</Steps>
+
+## Full example
+
+```python agent_browser.py expandable
+"""
+Agent Remote Browser — E2B + Kernel + Browser Use
+
+Spins up an E2B sandbox with the Browser Use framework and a Kernel cloud browser.
+An AI agent autonomously browses the web to complete a research task.
+"""
+
+import os
+
+from e2b_code_interpreter import Sandbox
+
+AGENT_SCRIPT = '''
+import asyncio
+from kernel import Kernel
+from browser_use import Agent, Browser, ChatAnthropic
+
+async def main():
+    # Create a Kernel cloud browser
+    kernel = Kernel()
+    kb = kernel.browsers.create()
+    print(f"Kernel browser created: {kb.id}")
+
+    # Connect Browser Use to the Kernel browser via CDP
+    browser = Browser(cdp_url=kb.cdp_ws_url)
+
+    # Create an AI agent that autonomously browses
+    agent = Agent(
+        task="""
+        Go to https://news.ycombinator.com and find the top 3 stories
+        that are about AI or machine learning. For each story:
+        1. Note the title and point count
+        2. Click through to the comments page
+        3. Read the top comment
+
+        Return a summary of your findings.
+        """,
+        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
+        browser=browser,
+        max_actions_per_step=4,
+    )
+
+    result = await agent.run()
+    print("\\n" + "=" * 60)
+    print("AGENT RESULT:")
+    print("=" * 60)
+    print(result)
+
+asyncio.run(main())
+'''
+
+
+def main():
+    sandbox = Sandbox.create(
+        "kernel-agent-browser",
+        envs={
+            "KERNEL_API_KEY": os.environ["KERNEL_API_KEY"],
+            "ANTHROPIC_API_KEY": os.environ["ANTHROPIC_API_KEY"],
+        },
+        timeout=300,
+    )
+
+    try:
+        sandbox.files.write("/home/user/agent_task.py", AGENT_SCRIPT)
+
+        result = sandbox.commands.run(
+            "python3 /home/user/agent_task.py",
+            timeout=180,
+        )
+
+        if result.exit_code != 0:
+            print(f"Agent failed: {result.stderr}")
+        else:
+            print(result.stdout)
+
+    finally:
+        sandbox.kill()
+
+
+if __name__ == "__main__":
+    main()
+```
+
+## Key concepts
+
+| Concept | Detail |
+|---|---|
+| **E2B template** | `kernel-agent-browser` — pre-built with Kernel SDK, Playwright, and Browser Use |
+| **Kernel browser** | `kernel.browsers.create()` spins up a remote Chromium; connect via `kb.cdp_ws_url` |
+| **Browser Use** | `Browser(cdp_url=...)` connects the agent framework to Kernel's CDP endpoint |
+| **LLM choice** | Browser Use supports `ChatAnthropic`, `ChatOpenAI`, `ChatGoogle`, and more |
+| **Autonomous agent** | The LLM sees the page (via screenshots) and decides what actions to take |
+
+## Choosing an LLM
+
+Browser Use supports multiple LLM providers. Import the one you need:
+
+```python
+# Anthropic (recommended)
+from browser_use import ChatAnthropic
+llm = ChatAnthropic(model="claude-sonnet-4-20250514")
+
+# OpenAI
+from browser_use import ChatOpenAI
+llm = ChatOpenAI(model="gpt-4o")
+
+# Google
+from browser_use import ChatGoogle
+llm = ChatGoogle(model="gemini-2.5-flash")
+```
+
+Pass the corresponding API key in the sandbox `envs`.
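+
+One way to keep the sandbox `envs` in sync with the chosen provider is a small lookup on the orchestrator side. This is a minimal sketch under assumptions: `PROVIDER_ENV_KEYS` and `build_sandbox_envs` are hypothetical names, and the variable names for OpenAI and Google (`OPENAI_API_KEY`, `GOOGLE_API_KEY`) are assumed rather than taken from this guide, so check them against each provider's documentation.
+
+```python
+import os
+
+# Assumed environment variable names; only ANTHROPIC_API_KEY appears in this guide.
+PROVIDER_ENV_KEYS = {
+    "anthropic": "ANTHROPIC_API_KEY",
+    "openai": "OPENAI_API_KEY",
+    "google": "GOOGLE_API_KEY",
+}
+
+
+def build_sandbox_envs(provider: str, environ=os.environ) -> dict:
+    """Collect the Kernel key plus the key for the chosen LLM provider."""
+    names = ["KERNEL_API_KEY", PROVIDER_ENV_KEYS[provider]]
+    # Fail early on the orchestrator instead of inside the sandbox.
+    missing = [name for name in names if not environ.get(name)]
+    if missing:
+        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
+    return {name: environ[name] for name in names}
+```
+
+The result can then be passed as `envs=build_sandbox_envs("openai")` when creating the sandbox.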
+
+## Adapting this example
+
+- **Different tasks** — change the `task` string to any web research, form filling, or data extraction task.
+- **Custom actions** — Browser Use supports [custom actions](https://docs.browser-use.com/customize/custom-actions) to extend agent capabilities.
+- **Vision control** — set `use_vision="auto"` on the Agent to let it decide when to use screenshots vs DOM.
+- **Multiple agents** — run several agents in parallel, each with their own Kernel browser, for concurrent research.
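+
+For the parallel case, the orchestrator can fan tasks out over a thread pool, since each run mostly waits on the sandbox. The sketch below shows only the fan-out: `run_tasks_in_parallel` is a hypothetical helper, and `fake_run_one` stands in for a function that would create a sandbox, write the agent script, run it, and return its output.
+
+```python
+from concurrent.futures import ThreadPoolExecutor
+
+
+def run_tasks_in_parallel(tasks, run_one, max_workers=4):
+    """Call `run_one(task)` for each task concurrently; results keep task order."""
+    with ThreadPoolExecutor(max_workers=max_workers) as pool:
+        return list(pool.map(run_one, tasks))
+
+
+def fake_run_one(task):
+    # Placeholder: a real version would start one sandbox per task and kill it when done.
+    return f"done: {task}"
+
+
+results = run_tasks_in_parallel(["task A", "task B"], fake_run_one)
+print(results)
+```
+
+If `run_one` raises for any task, `pool.map` re-raises that exception when its result is collected, so a real implementation may want to catch errors per task.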
+
+## Related guides
+
+<CardGroup cols={3}>
+  <Card title="Remote browser" icon="globe" href="/docs/use-cases/browser-use">
+    Programmatic browser automation with Playwright + Kernel
+  </Card>
+  <Card title="Computer use" icon="desktop" href="/docs/use-cases/computer-use">
+    Build AI agents that control virtual desktops
+  </Card>
+  <Card title="Sandbox lifecycle" icon="rotate" href="/docs/sandbox">
+    Create, manage, and control sandbox lifecycle
+  </Card>
+</CardGroup>
diff --git a/docs/use-cases/browser-use.mdx b/docs/use-cases/browser-use.mdx
@@ -1,5 +1,5 @@
 ---
-title: "Browser use"
+title: "Remote browser"
 description: "Deploy a web app in an E2B sandbox and use a Kernel cloud browser to screenshot every route and generate a preview report."
 icon: "globe"
 ---