Commit 47c28f8

docs: Add agent browser use case and reorganize browser use section
1 parent 0bf9040 commit 47c28f8

3 files changed

Lines changed: 244 additions & 2 deletions

docs.json

Lines changed: 8 additions & 1 deletion
@@ -48,7 +48,14 @@
           "docs/use-cases/coding-agents",
           "docs/use-cases/computer-use",
           "docs/use-cases/ci-cd",
-          "docs/use-cases/browser-use"
+          {
+            "group": "Browser use",
+            "icon": "globe",
+            "pages": [
+              "docs/use-cases/browser-use",
+              "docs/use-cases/agent-browser"
+            ]
+          }
         ]
       },
       {

docs/use-cases/agent-browser.mdx

Lines changed: 235 additions & 0 deletions
@@ -0,0 +1,235 @@
---
title: "Agent remote browser"
description: "Run an autonomous AI agent inside an E2B sandbox that browses the web using a Kernel cloud browser and the Browser Use framework."
icon: "robot"
---

Run an AI agent inside an E2B sandbox that autonomously controls a [Kernel](https://www.kernel.computer/) cloud browser. The agent decides what to click, what to type, and where to navigate; you only give it a task.

This builds on the [remote browser](/docs/use-cases/browser-use) pattern by adding the [Browser Use](https://docs.browser-use.com/) framework, which turns an LLM into a browser-controlling agent.

## Architecture

1. **E2B Sandbox**: the isolated environment where the agent code runs. It comes with the Kernel SDK, Playwright, and Browser Use pre-installed.
2. **Kernel Cloud Browser**: a remote Chromium instance that the agent controls over the Chrome DevTools Protocol (CDP).
3. **Browser Use**: the agent framework that connects an LLM to Playwright. The LLM sees screenshots and decides which actions to take (click, type, scroll, navigate).

The orchestrator (the script that calls the E2B SDK) creates the sandbox and starts the agent. The agent then runs autonomously inside the sandbox: it creates a Kernel browser, connects Browser Use to it, and executes the task.

## Prerequisites

- An [E2B API key](https://e2b.dev/dashboard?tab=keys)
- A [Kernel API key](https://www.kernel.computer/)
- An LLM API key (Anthropic, OpenAI, or other [supported model](https://docs.browser-use.com/customize/supported-models))
- Python 3.10+

```bash
pip install e2b-code-interpreter
```

Set your keys in the environment:

```bash .env
E2B_API_KEY=e2b_***
KERNEL_API_KEY=kernel_***
ANTHROPIC_API_KEY=sk-ant-***
```
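
Because the orchestrator reads these keys from `os.environ`, it can be useful to fail early with a clear message when one is missing. The following is a minimal sketch using only the standard library. `missing_keys` and `require_keys` are hypothetical helper names, not part of the E2B SDK, and the key list assumes the Anthropic setup shown above.

```python
import os

# Names taken from the .env example; swap the LLM key if you use another provider.
REQUIRED_KEYS = ("E2B_API_KEY", "KERNEL_API_KEY", "ANTHROPIC_API_KEY")


def missing_keys(environ, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty in `environ`."""
    return [name for name in required if not environ.get(name)]


def require_keys(environ=None, required=REQUIRED_KEYS):
    """Raise RuntimeError listing every missing key; return None if all are set."""
    environ = os.environ if environ is None else environ
    missing = missing_keys(environ, required)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `require_keys()` at the top of the orchestrator script surfaces a configuration problem before any sandbox is created.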

## How it works

<Steps>
<Step title="Create the sandbox">
Start an E2B sandbox from the `kernel-agent-browser` template, which comes with the Kernel SDK, Playwright, and Browser Use pre-installed. Pass the API keys the agent will need through `envs`.

```python
import os

from e2b_code_interpreter import Sandbox

sandbox = Sandbox.create(
    "kernel-agent-browser",
    envs={
        "KERNEL_API_KEY": os.environ["KERNEL_API_KEY"],
        "ANTHROPIC_API_KEY": os.environ["ANTHROPIC_API_KEY"],
    },
    timeout=300,
)
```
</Step>

<Step title="Write the agent script">
The agent script creates a Kernel browser, connects Browser Use to it, and runs a task autonomously.

```python
AGENT_SCRIPT = '''
import asyncio
from kernel import Kernel
from browser_use import Agent, Browser, ChatAnthropic

async def main():
    kernel = Kernel()
    kb = kernel.browsers.create()

    browser = Browser(cdp_url=kb.cdp_ws_url)

    agent = Agent(
        task="Go to Hacker News, find the top 3 AI stories, and summarize them",
        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
        browser=browser,
    )
    result = await agent.run()
    print(result)

asyncio.run(main())
'''

sandbox.files.write("/home/user/agent_task.py", AGENT_SCRIPT)
```
</Step>

<Step title="Run the agent">
Execute the agent script inside the sandbox. The agent autonomously browses, clicks, types, and navigates to complete the task. The command `timeout` (180 seconds) stays below the sandbox `timeout` (300 seconds) set in the first step.

```python
result = sandbox.commands.run(
    "python3 /home/user/agent_task.py",
    timeout=180,
)
print(result.stdout)
```
</Step>
</Steps>
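
The steps above hardcode the task inside `AGENT_SCRIPT`. One way to reuse the same script for different tasks is to render it from a template before writing it to the sandbox. The sketch below is an illustration rather than part of the E2B or Browser Use APIs: `AGENT_TEMPLATE`, the `__TASK__` placeholder, and `render_agent_script` are hypothetical names. It relies on `repr()` to turn the task into a valid Python string literal, so quotes and newlines in the task do not break the generated script.

```python
# Template for the script that runs inside the sandbox.
# __TASK__ is a hypothetical placeholder replaced by a Python string literal.
AGENT_TEMPLATE = '''
import asyncio
from kernel import Kernel
from browser_use import Agent, Browser, ChatAnthropic

async def main():
    kernel = Kernel()
    kb = kernel.browsers.create()
    browser = Browser(cdp_url=kb.cdp_ws_url)
    agent = Agent(
        task=__TASK__,
        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
        browser=browser,
    )
    result = await agent.run()
    print(result)

asyncio.run(main())
'''


def render_agent_script(task: str) -> str:
    """Return the agent script with `task` embedded as a quoted string literal."""
    return AGENT_TEMPLATE.replace("__TASK__", repr(task))
```

The rendered string can then be passed to `sandbox.files.write("/home/user/agent_task.py", ...)` in place of the fixed `AGENT_SCRIPT`.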

## Full example

```python agent_browser.py expandable
"""
Agent Remote Browser: E2B + Kernel + Browser Use

Spins up an E2B sandbox with Browser Use framework and Kernel cloud browser.
An AI agent autonomously browses the web to complete a research task.
"""

import os

from e2b_code_interpreter import Sandbox

AGENT_SCRIPT = '''
import asyncio
from kernel import Kernel
from browser_use import Agent, Browser, ChatAnthropic

async def main():
    # Create a Kernel cloud browser
    kernel = Kernel()
    kb = kernel.browsers.create()
    print(f"Kernel browser created: {kb.id}")

    # Connect Browser Use to the Kernel browser via CDP
    browser = Browser(cdp_url=kb.cdp_ws_url)

    # Create an AI agent that autonomously browses
    agent = Agent(
        task="""
            Go to https://news.ycombinator.com and find the top 3 stories
            that are about AI or machine learning. For each story:
            1. Note the title and point count
            2. Click through to the comments page
            3. Read the top comment

            Return a summary of your findings.
        """,
        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
        browser=browser,
        max_actions_per_step=4,
    )

    result = await agent.run()
    print("\\n" + "=" * 60)
    print("AGENT RESULT:")
    print("=" * 60)
    print(result)

asyncio.run(main())
'''


def main():
    sandbox = Sandbox.create(
        "kernel-agent-browser",
        envs={
            "KERNEL_API_KEY": os.environ["KERNEL_API_KEY"],
            "ANTHROPIC_API_KEY": os.environ["ANTHROPIC_API_KEY"],
        },
        timeout=300,
    )

    try:
        sandbox.files.write("/home/user/agent_task.py", AGENT_SCRIPT)

        result = sandbox.commands.run(
            "python3 /home/user/agent_task.py",
            timeout=180,
        )

        if result.exit_code != 0:
            print(f"Agent failed: {result.stderr}")
        else:
            print(result.stdout)

    finally:
        sandbox.kill()


if __name__ == "__main__":
    main()
```

## Key concepts

| Concept | Detail |
|---|---|
| **E2B template** | `kernel-agent-browser`, pre-built with the Kernel SDK, Playwright, and Browser Use |
| **Kernel browser** | `kernel.browsers.create()` spins up a remote Chromium; connect via `kb.cdp_ws_url` |
| **Browser Use** | `Browser(cdp_url=...)` connects the agent framework to Kernel's CDP endpoint |
| **LLM choice** | Browser Use supports `ChatAnthropic`, `ChatOpenAI`, `ChatGoogle`, and more |
| **Autonomous agent** | The LLM sees the page (via screenshots) and decides what actions to take |

## Choosing an LLM

Browser Use supports multiple LLM providers. Import the one you need:

```python
# Anthropic (recommended)
from browser_use import ChatAnthropic
llm = ChatAnthropic(model="claude-sonnet-4-20250514")

# OpenAI
from browser_use import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o")

# Google
from browser_use import ChatGoogle
llm = ChatGoogle(model="gemini-2.5-flash")
```

Pass the corresponding API key to the sandbox in `envs` so the agent script can read it.
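
As a rough sketch of how the `envs` dictionary could follow the provider choice, the helper below maps a provider name to the environment variable its client is expected to read. `PROVIDER_KEY_NAMES` and `build_envs` are hypothetical names. `ANTHROPIC_API_KEY` comes from this guide, while `OPENAI_API_KEY` and `GOOGLE_API_KEY` are assumptions that should be checked against the Browser Use documentation for the model you pick.

```python
import os

# Environment variable assumed for each provider (only the Anthropic name
# appears in this guide; the other two are assumptions).
PROVIDER_KEY_NAMES = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def build_envs(provider, environ=None):
    """Build the sandbox `envs` dict: the Kernel key plus the chosen LLM key."""
    environ = os.environ if environ is None else environ
    if provider not in PROVIDER_KEY_NAMES:
        raise ValueError(f"Unknown provider: {provider!r}")
    names = ("KERNEL_API_KEY", PROVIDER_KEY_NAMES[provider])
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

With this in place, the sandbox could be created with `Sandbox.create("kernel-agent-browser", envs=build_envs("openai"), timeout=300)`, and the agent script would import the matching chat class.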

## Adapting this example

- **Different tasks**: change the `task` string to any web research, form filling, or data extraction task.
- **Custom actions**: Browser Use supports [custom actions](https://docs.browser-use.com/customize/custom-actions) to extend agent capabilities.
- **Vision control**: set `use_vision="auto"` on the Agent to let it decide when to use screenshots and when to rely on the DOM.
- **Multiple agents**: run several agents in parallel, each with its own Kernel browser, for concurrent research.
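
For the parallel case, one possible shape is a small fan-out helper built on the standard library's `ThreadPoolExecutor`. This is a sketch under assumptions: `run_tasks_in_parallel` and the `run_one` callable are hypothetical names. In practice `run_one` would wrap the create, write, run, and kill sequence from the full example so that each task gets its own sandbox and Kernel browser.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks_in_parallel(tasks, run_one, max_workers=4):
    """Run `run_one(task)` for each task concurrently.

    Returns a list of (task, result, error) tuples in the same order as `tasks`.
    Exactly one of `result` and `error` is None for each entry, unless
    `run_one` itself returns None.
    """

    def safe(task):
        # Capture failures per task so one failing agent does not hide the others.
        try:
            return task, run_one(task), None
        except Exception as exc:
            return task, None, exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(safe, tasks))
```

Threads are a reasonable fit here because each worker spends most of its time waiting on remote calls. Keep `max_workers` within whatever concurrency limits apply to your E2B, Kernel, and LLM accounts.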

## Related guides

<CardGroup cols={3}>
<Card title="Remote browser" icon="globe" href="/docs/use-cases/browser-use">
Programmatic browser automation with Playwright + Kernel
</Card>
<Card title="Computer use" icon="desktop" href="/docs/use-cases/computer-use">
Build AI agents that control virtual desktops
</Card>
<Card title="Sandbox lifecycle" icon="rotate" href="/docs/sandbox">
Create, manage, and control sandbox lifecycle
</Card>
</CardGroup>

docs/use-cases/browser-use.mdx

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ---
-title: "Browser use"
+title: "Remote browser"
 description: "Deploy a web app in an E2B sandbox and use a Kernel cloud browser to screenshot every route and generate a preview report."
 icon: "globe"
 ---
