fix(sdk): sandbox custom-code evaluators by default (RestrictedPython runner)#4636
mmabrouk wants to merge 1 commit into
… runner) Custom-code evaluators ran user Python with raw exec() in the services process, sandboxed only if an operator opted into Daytona. Add a RestrictedPython runner and make it the default; keep raw exec() as an explicit 'local' opt-in. The new sandbox uses a guarded __import__ limited to a pure-stdlib allowlist and safer_getattr to block the class-gadget escape, closing the two holes the previous RestrictedPython sandbox had.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
hosting/kubernetes/ee/values.ee.example.yaml (1)
116-125: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win
Helm schema blocks the new "restricted" default across both example files.
Both the EE and OSS Kubernetes examples now document and set sandboxRunner: restricted, but hosting/kubernetes/helm/values.schema.json defines the enum as ["local", "daytona"] and omits "restricted". Users applying these examples will hit schema validation failures. The schema enum must be updated to ["restricted", "local", "daytona"] to match the runtime contract and allow the new default value.
🧹 Nitpick comments (1)
sdks/python/agenta/sdk/engines/running/runners/restricted.py (1)
175-182: 💤 Low value
Redundant SyntaxError handler (harmless).
The SyntaxError catch at lines 178-179 is redundant since syntax errors are already caught during compilation at lines 146-147. However, defensive error handling is acceptable here.
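The point about the redundant handler can be illustrated with a minimal two-phase sketch. It uses the built-in compile() and exec() as stand-ins rather than the runner's actual RestrictedPython calls, and run_two_phase is a hypothetical helper, not code from restricted.py. Parse errors surface at the compile step, so a SyntaxError handler around the execution step is only reached if the executed code raises SyntaxError itself.

```python
def run_two_phase(source: str) -> str:
    """Compile first, then execute, and report which phase failed.

    Hypothetical helper for illustration; uses plain compile()/exec(),
    not the RestrictedPython runner from this PR.
    """
    try:
        # Phase 1: malformed source is rejected here.
        code = compile(source, "<evaluator>", "exec")
    except SyntaxError as exc:
        return f"compile: {exc.msg}"

    namespace: dict = {}
    try:
        # Phase 2: the code object is already syntactically valid.
        exec(code, namespace)
    except SyntaxError as exc:
        # Only reached if the running code raises SyntaxError on its own.
        return f"exec: {exc.msg}"
    except Exception as exc:
        return f"exec: {type(exc).__name__}"
    return "ok"


print(run_two_phase("def f(:\n    pass"))  # reported by the compile phase
print(run_two_phase("1 / 0"))              # reported by the exec phase
```
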
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 53ec6584-c261-4e60-99ba-f4d6cfa0e963
⛔ Files ignored due to path filters (1)
sdks/python/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (15)
- api/oss/src/utils/env.py
- docs/docs/evaluation/configure-evaluators/07-custom-evaluator.mdx
- docs/docs/self-host/02-configuration.mdx
- hosting/docker-compose/ee/env.ee.dev.example
- hosting/docker-compose/ee/env.ee.gh.example
- hosting/docker-compose/oss/env.oss.dev.example
- hosting/docker-compose/oss/env.oss.gh.example
- hosting/kubernetes/ee/values.ee.example.yaml
- hosting/kubernetes/oss/values.oss.example.yaml
- sdks/python/agenta/sdk/engines/running/runners/registry.py
- sdks/python/agenta/sdk/engines/running/runners/restricted.py
- sdks/python/agenta/sdk/engines/running/sandbox.py
- sdks/python/oss/tests/pytest/utils/test_code_v0.py
- sdks/python/oss/tests/pytest/utils/test_restricted_runner.py
- sdks/python/pyproject.toml
💤 Files with no reviewable changes (1)
- sdks/python/agenta/sdk/engines/running/sandbox.py
@mmabrouk
Context
Custom-code evaluators executed user-supplied Python with a raw exec() inside the services process. There was no sandbox unless an operator explicitly set AGENTA_SERVICES_CODE_SANDBOX_RUNNER=daytona. On a self-hosted deployment, any authenticated user who could create a custom-code evaluator could run arbitrary code on the host. The older RestrictedPython sandbox was dropped when evaluation moved to the new runner architecture, which left raw exec() (local) as the default.
Changes
Add a RestrictedPython-based runner and make it the default.
AGENTA_SERVICES_CODE_SANDBOX_RUNNER now selects one of three runners:
- restricted (new default): in-process RestrictedPython sandbox. Strict pure-stdlib import allowlist, no filesystem, network, or host access.
- local: raw exec(), no sandbox. Explicit opt-in for trusted or single-tenant deployments.
- daytona: isolated remote sandbox (unchanged).
The new sandbox closes the two holes the previous RestrictedPython sandbox had:
- Unguarded __import__: the previous sandbox let import os work. The new runner installs a guarded __import__ that allows only a pure-stdlib allowlist (math, statistics, datetime, json, re, random, string, typing, collections, itertools, functools). httpx and anything with host or network reach are excluded.
- _getattr_: the previous sandbox left the ().__class__.__bases__[0].__subclasses__() gadget open. The new runner sets safer_getattr, which blocks dunder and underscore attribute access.
Before:
After:
import os, import subprocess, import httpx, __import__('os'), open(...), eval(...), and the class-gadget escape all raise instead of running.
Behavior change (rollout note)
The default flips from unrestricted exec to the sandbox. Existing self-hosted evaluators that rely on non-allowlisted imports (for example httpx or os) or on raw exec will now fail under the default. They must opt back in with AGENTA_SERVICES_CODE_SANDBOX_RUNNER=local (trusted only) or move to daytona. The import and error messages name this escape hatch. RestrictedPython is a new dependency, so self-hosters need to rebuild the services image, not just change an env var. Agenta Cloud is unaffected; it already isolates evaluator execution.
Tests / notes
- sdks/python/oss/tests/pytest/utils/test_restricted_runner.py: v1 and v2 evaluators return floats, allowlisted imports work, and import os/subprocess/httpx, __import__('os'), open(), eval(), and the class-gadget escape are all blocked. Registry selection is covered (default restricted, local opt-in, legacy AGENTA_SERVICES_SANDBOX_RUNNER, daytona-without-key raises, unknown value raises).
- The test_code_v0.py suite now runs under the restricted default.
- uv run pytest on both files: 60 passed, 2 daytona-only skipped. ruff format and ruff check clean.
- Docs: a :::warning on the custom-evaluator page and the three runner options in the self-host configuration reference.
- Example env files and Helm values updated to show restricted as the default and warn on local.
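The guarded __import__ described under Changes can be sketched roughly as follows. This is a minimal illustration, not the code in restricted.py: guarded_import and run_snippet are hypothetical names, the allowlist is copied from the PR description, and plain exec() stands in for RestrictedPython. It covers only the import hole. The class-gadget escape stays open in this sketch, since blocking it relies on RestrictedPython's safer_getattr.

```python
import builtins

# Allowlist as listed in the PR description.
ALLOWED_MODULES = frozenset({
    "math", "statistics", "datetime", "json", "re", "random",
    "string", "typing", "collections", "itertools", "functools",
})


def guarded_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Allow an import only when its top-level package is allowlisted.

    Hypothetical stand-in for the runner's guarded __import__.
    """
    root = name.split(".")[0]
    if level != 0 or root not in ALLOWED_MODULES:
        raise ImportError(
            f"import of '{name}' is not allowed by the restricted runner"
        )
    return builtins.__import__(name, globals, locals, fromlist, level)


def run_snippet(source: str) -> dict:
    """Execute source with a tiny builtins table.

    Illustration only: plain exec() is NOT a sandbox by itself.
    """
    safe_builtins = {"__import__": guarded_import, "float": float, "len": len}
    namespace = {"__builtins__": safe_builtins}
    exec(source, namespace)
    return namespace


# An allowlisted import works.
ns = run_snippet("import math\nscore = float(math.sqrt(0.25))")
print(ns["score"])

# Non-allowlisted imports raise instead of running.
for blocked in ("import os", "import subprocess", "__import__('os')"):
    try:
        run_snippet(blocked)
    except ImportError as exc:
        print(f"blocked: {exc}")
```

Because the import statement resolves __import__ through the builtins of the executing namespace, both import os and a direct __import__('os') call go through the same guard. open() is simply absent from the table in this sketch, so calling it fails with NameError.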