9 changes: 9 additions & 0 deletions README.md
@@ -275,6 +275,15 @@ dynamic benchmarks.
between the two executions. **Note**: this is a beta feature and will need some adaptation for your
own agent.

## Variables
Here's a list of the relevant environment variables used by AgentLab:
- `OPENAI_API_KEY`, used by default for OpenAI LLMs.
- `AZURE_OPENAI_API_KEY`, used by default for AzureOpenAI LLMs.
- `AZURE_OPENAI_ENDPOINT`, to specify your Azure endpoint.
- `OPENAI_API_VERSION`, for the Azure API.
- `OPENROUTER_API_KEY`, for the OpenRouter API.
- `AGENTLAB_EXP_ROOT`, the path where your experiments are stored; defaults to `~/agentlab-results`.
- `AGENTXRAY_SHARE_GRADIO`, which prompts AgentXRay to open a public tunnel on launch.
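
As a rough illustration of how the variables above might be consumed, here is a minimal sketch that collects them into a dictionary. It is not AgentLab's own code: the helper name `read_agentlab_env`, the dictionary keys, and the set of truthy strings accepted for `AGENTXRAY_SHARE_GRADIO` are all assumptions made for this example. Only the variable names and the `~/agentlab-results` default come from the list above.

```python
import os
from pathlib import Path


def read_agentlab_env(env=None):
    """Collect the documented variables into a dict (hypothetical helper)."""
    # Accept any mapping so the function can be exercised without touching os.environ.
    env = os.environ if env is None else env

    # Documented default for the experiment root; "~" is expanded to the home directory.
    raw_root = env.get("AGENTLAB_EXP_ROOT", "~/agentlab-results")
    exp_root = Path(os.path.expanduser(raw_root))

    # Assumption: treat "1", "true" and "yes" (case-insensitive) as enabling the tunnel.
    share_flag = env.get("AGENTXRAY_SHARE_GRADIO", "").strip().lower()

    return {
        "openai_api_key": env.get("OPENAI_API_KEY"),
        "azure_openai_api_key": env.get("AZURE_OPENAI_API_KEY"),
        "azure_openai_endpoint": env.get("AZURE_OPENAI_ENDPOINT"),
        "openai_api_version": env.get("OPENAI_API_VERSION"),
        "openrouter_api_key": env.get("OPENROUTER_API_KEY"),
        "exp_root": exp_root,
        "share_gradio": share_flag in {"1", "true", "yes"},
    }
```

Calling `read_agentlab_env()` with no argument reads the real process environment; passing a plain dict makes it easy to check which keys are missing before launching an experiment.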

## Misc

5 changes: 3 additions & 2 deletions pyproject.toml
@@ -13,10 +13,11 @@ authors = [
{name = "Alex Lacoste", email = "alex.lacoste@servicenow.com"},
{name = "Tom Marty", email = "tom.marty@polymtl.ca"},
{name = "Massimo Caccia", email = "massimo.caccia1@servicenow.com"},
- {name = "Thibault Le Sellier de Chezelles", email = "thibault.de.chezelles@gmail.com"}
+ {name = "Thibault Le Sellier de Chezelles", email = "thibault.de.chezelles@gmail.com"},
+ {name = "Aman Jaiswal", email = "aman.jaiswal@servicenow.com"},
]
readme = "README.md"
- requires-python = ">3.7"
+ requires-python = ">3.10"
license = {text = "Apache-2.0"}
classifiers = [
"Development Status :: 2 - Pre-Alpha",
2 changes: 2 additions & 0 deletions reproducibility_journal.csv
@@ -74,3 +74,5 @@ Leo Boisvert,GenericAgent-openai_o1-mini-2024-09-12,workarena_l1,0.4.1,2025-02-0
M: src/agentlab/analyze/agent_xray.py
M: src/agentlab/llm/llm_configs.py",0.13.3,1d2d7160e5b7ec9954ecb48988f71eb56288dd29,"
Leo Boisvert,GenericAgent-anthropic_claude-3.7-sonnet,workarena_l1,0.4.1,2025-02-25_02-32-09,d4f900c2-1de1-4e4b-a3ab-495ff2675fff,0.515,0.028,0,330/330,None,Linux (#68-Ubuntu SMP Mon Oct 7 14:34:20 UTC 2024),3.12.3,1.44.0,v0.4.0,c9d2ef9648435ef1119950ecb1a0734497ccc33b,,0.13.3,1d2d7160e5b7ec9954ecb48988f71eb56288dd29,
agentlabtraces,GenericAgent-meta-llama_llama-4-maverick,workarena_l1,0.4.1,2025-04-14_17-15-56,a6dc4022-2bb7-4b46-8b37-f62c010defc1,0.27,0.024,0,330/330,None,Linux (#135-Ubuntu SMP Fri Sep 27 13:53:58 UTC 2024),3.12.7,1.39.0,v0.4.0,5eb2ecb5e5b293170230bcbed8b17fe192af214a,,0.13.3,70dac253628c476aff1af6a975f27f8563453ad2,
agentlabtraces,GenericAgent-meta-llama_llama-4-maverick,workarena_l2_agent_curriculum_eval,0.4.1,2025-04-22_15-38-44,d62fed39-caac-4ef3-92ac-b29897c69f88,0.085,0.018,1,235/235,None,Linux (#68-Ubuntu SMP Mon Oct 7 14:34:20 UTC 2024),3.12.7,1.39.0,v0.4.0,43bafbcfbe398fca39e4ffdc57b2f226d2c6d3e1,,0.13.3,70dac253628c476aff1af6a975f27f8563453ad2,
2 changes: 2 additions & 0 deletions src/agentlab/agents/generic_agent/__init__.py
@@ -10,6 +10,7 @@
AGENT_3_5,
AGENT_8B,
AGENT_CUSTOM,
AGENT_LLAMA4_17B_INSTRUCT,
AGENT_LLAMA3_70B,
AGENT_LLAMA31_70B,
RANDOM_SEARCH_AGENT,
@@ -31,6 +32,7 @@
"AGENT_4o_VISION",
"AGENT_o3_MINI",
"AGENT_o1_MINI",
"AGENT_LLAMA4_17B_INSTRUCT",
"AGENT_LLAMA3_70B",
"AGENT_LLAMA31_70B",
"AGENT_8B",
6 changes: 5 additions & 1 deletion src/agentlab/agents/generic_agent/agent_configs.py
@@ -10,6 +10,7 @@

from .generic_agent import GenericAgentArgs
from .generic_agent_prompt import GenericPromptFlags
from .tmlr_config import BASE_FLAGS

FLAGS_CUSTOM = GenericPromptFlags(
obs=dp.ObsFlags(
@@ -296,7 +297,10 @@
chat_model_args=CHAT_MODEL_ARGS_DICT["openrouter/anthropic/claude-3.5-sonnet:beta"],
flags=FLAGS_GPT_4o_VISION,
)

AGENT_LLAMA4_17B_INSTRUCT = GenericAgentArgs(
chat_model_args=CHAT_MODEL_ARGS_DICT["openrouter/meta-llama/llama-4-maverick"],
flags=BASE_FLAGS,
)

DEFAULT_RS_FLAGS = GenericPromptFlags(
flag_group="default_rs",
2 changes: 1 addition & 1 deletion src/agentlab/analyze/agent_xray.py
@@ -550,7 +550,7 @@ def tag_screenshot_with_action(screenshot: Image, action: str) -> Image:
try:
coords = action[action.index("(") + 1 : action.index(")")].split(",")
coords = [c.strip() for c in coords]
- if len(coords) != 2:
+ if len(coords) not in [2, 3]:
raise ValueError(f"Invalid coordinate format: {coords}")
if coords[0].startswith("x="):
coords[0] = coords[0][2:]
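
To make the effect of the relaxed length check easier to see, the following standalone sketch mirrors the parsing logic in this hunk. It is not the code from `agent_xray.py`: the function name `parse_click_coords`, the `mouse_click(...)` action strings, the handling of a `y=` prefix, and the choice to ignore a third argument are assumptions for illustration only.

```python
def parse_click_coords(action: str) -> tuple[float, float]:
    """Extract (x, y) from an action string such as 'mouse_click(x=10, y=20)'.

    Hypothetical helper; a third argument, if present, is accepted but ignored.
    """
    # Take the text between the first "(" and the first ")" and split on commas.
    args = action[action.index("(") + 1 : action.index(")")].split(",")
    args = [a.strip() for a in args]

    # Same check as the changed line: two or three arguments are accepted.
    if len(args) not in [2, 3]:
        raise ValueError(f"Invalid coordinate format: {args}")

    x_raw, y_raw = args[0], args[1]
    # Strip optional keyword prefixes (the "y=" case is assumed by analogy with "x=").
    if x_raw.startswith("x="):
        x_raw = x_raw[2:]
    if y_raw.startswith("y="):
        y_raw = y_raw[2:]

    return float(x_raw), float(y_raw)
```

With the previous `!= 2` check, a three-argument call would raise `ValueError`; under the new check it is parsed and only the first two values are used in this sketch.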