[design] Class-oriented authoring API for the Agenta SDK (POC) by mmabrouk · Pull Request #4627 · Agenta-AI/agenta

mmabrouk · 2026-06-10T19:54:58Z

Context

Today users write evaluators and applications by decorating plain functions. Schemas are inferred from function signatures at runtime, and the settings/inputs/outputs contracts are untyped dicts. This makes it hard to see what a workflow expects at a glance, and means typos in column names or config keys fail deep inside an evaluation run rather than at definition time.

This PR proposes a class-oriented authoring model as an alternative. None of the code in this PR runs. It is a design POC that shows what the developer experience could look like.

What this adds

Eight annotated example files under docs/designs/class-based-sdk/.

The core pattern. A class IS the workflow. You declare three inner Pydantic models and implement one method:

# Before (today)
@ag.evaluator(slug="rubric-judge", name="Rubric Judge")
async def rubric_judge(inputs: dict, outputs, trace) -> dict:
    ...  # no schema, no validation, raw dicts

# Proposed
class RubricJudge(ag.Evaluator):
    slug = "rubric-judge"
    name = "Rubric Judge"

    class Parameters(BaseModel):     # -> schemas.parameters (the UI config form)
        judge_model: str = "gpt-4o-mini"
        rubric: str = "..."

    class Inputs(BaseModel):         # -> schemas.inputs (testset columns consumed)
        expected_answer: str | None = None

    class Outputs(BaseModel):        # -> schemas.outputs (score columns in the UI)
        score: float
        verdict: str

    async def evaluate(self, *, inputs: Inputs, outputs, parameters: Parameters) -> Outputs:
        ...

The three Pydantic models compile directly to JsonSchemas.parameters/inputs/outputs in the existing WorkflowRevisionData. The class is a typed front-end over the workflow data model, not a parallel system. Everything underneath (middleware chain, handler registry, tracing, upsert, serving) reuses the current engine.
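To make the "compile to JSON Schema" step concrete, here is a minimal sketch using Pydantic v2's `model_json_schema()`. The `compile_schemas` helper and the dict layout it returns are assumptions for illustration; the PR does not show how the real mapping onto WorkflowRevisionData would be written.

```python
from typing import Optional

from pydantic import BaseModel


class Parameters(BaseModel):
    judge_model: str = "gpt-4o-mini"
    rubric: str = "..."


class Inputs(BaseModel):
    expected_answer: Optional[str] = None


class Outputs(BaseModel):
    score: float
    verdict: str


def compile_schemas(parameters, inputs, outputs) -> dict:
    """Hypothetical helper: emit one JSON Schema per inner model."""
    return {
        "parameters": parameters.model_json_schema(),
        "inputs": inputs.model_json_schema(),
        "outputs": outputs.model_json_schema(),
    }


schemas = compile_schemas(Parameters, Inputs, Outputs)

# Fields without defaults become required properties in the schema;
# fields with defaults carry their default value instead.
print(schemas["outputs"]["required"])
print(schemas["parameters"]["properties"]["judge_model"]["default"])
```

Because the schemas are derived from the class body, a UI config form or a column check can be generated without invoking the handler.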

Framework adapters (06_framework_adapters.py). Three tiers for teams already using OpenAI Agents SDK, Pydantic AI, or LangGraph:

Manual: build the framework agent inside run() from parameters.
Factory: subclass ag.ext.openai_agents.Application, implement build(parameters), the base runs it.
Automatic: ag.Application.from_agent(existing_agent). An AgentAdapter port extracts Parameters/Inputs/Outputs from the agent object without mutating it.
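The automatic tier can be pictured roughly as follows. This is a stdlib-only sketch under stated assumptions: `FakeAgent`, `FakeAgentAdapter`, `extract_parameters`, and the free function `from_agent` are all hypothetical stand-ins, not the proposed ag API or any real framework's agent type. It only illustrates the "extract without mutating" contract of an adapter port.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Protocol


@dataclass(frozen=True)
class FakeAgent:
    """Stand-in for a framework agent object (hypothetical)."""
    name: str
    instructions: str
    model: str


class AgentAdapter(Protocol):
    """Port: knows how to read configuration out of one kind of agent."""

    def extract_parameters(self, agent: Any) -> Dict[str, Any]: ...


class FakeAgentAdapter:
    def extract_parameters(self, agent: FakeAgent) -> Dict[str, Any]:
        # asdict builds a fresh dict, so the agent itself is left untouched.
        return asdict(agent)


def from_agent(agent: FakeAgent, adapter: AgentAdapter) -> Dict[str, Any]:
    """Hypothetical: derive a workflow description from an existing agent."""
    return {"slug": agent.name, "parameters": adapter.extract_parameters(agent)}


agent = FakeAgent(name="support-bot", instructions="Be concise.", model="gpt-4o-mini")
workflow = from_agent(agent, FakeAgentAdapter())
```

Keeping the extraction behind a protocol means each framework needs only its own adapter, while the calling code stays the same.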

Config-only workflows (07_config_only.py). ag.Configuration has Parameters but no handler. It is a versioned, deployable config store (prompts, routing tables, rubrics) that any service can pull with afetch(environment="production"). Generalizes prompt management without a special-case API.
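As a rough illustration of "parameters but no handler", the sketch below models a versioned config store in memory. Only the `afetch(environment=...)` call shape comes from the description above; the `InMemoryConfiguration` class, its `commit` and `deploy` methods, and the example parameter names are assumptions made for this sketch.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InMemoryConfiguration:
    """Hypothetical stand-in for ag.Configuration: versioned parameters only."""
    slug: str
    revisions: List[dict] = field(default_factory=list)
    deployments: Dict[str, int] = field(default_factory=dict)

    def commit(self, parameters: dict) -> int:
        # Store a copy so later edits by the caller do not rewrite history.
        self.revisions.append(dict(parameters))
        return len(self.revisions) - 1

    def deploy(self, revision: int, environment: str) -> None:
        if not 0 <= revision < len(self.revisions):
            raise IndexError(f"unknown revision {revision}")
        self.deployments[environment] = revision

    async def afetch(self, *, environment: str) -> dict:
        # Resolve the environment to the revision currently deployed to it.
        return dict(self.revisions[self.deployments[environment]])


config = InMemoryConfiguration(slug="routing-table")
v0 = config.commit({"default_model": "gpt-4o-mini"})
v1 = config.commit({"default_model": "gpt-4o"})
config.deploy(v0, "production")
config.deploy(v1, "staging")

production = asyncio.run(config.afetch(environment="production"))
staging = asyncio.run(config.afetch(environment="staging"))
```

A consuming service only needs the environment name; which revision it resolves to is a deployment decision, not a code change.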

Testsets as classes (08_testsets.py). A Case inner model becomes the column schema. ag.aevaluate can check compatibility between testset columns and application/evaluator inputs before running anything, so a missing column raises at submit time rather than after 200 LLM calls.
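The submit-time compatibility check could look roughly like this. It is a sketch, assuming Pydantic v2 (`model_fields`, `FieldInfo.is_required()`); the `Case` and `AppInputs` models and the `missing_columns` and `check_compatible` helpers are hypothetical and do not reflect how ag.aevaluate would actually implement the check.

```python
from pydantic import BaseModel


class Case(BaseModel):
    """Testset row schema: each field is a column."""
    question: str
    expected_answer: str


class AppInputs(BaseModel):
    question: str
    context: str          # required, but no testset column provides it
    language: str = "en"  # optional, so a missing column is acceptable


def missing_columns(case_model, inputs_model) -> list:
    """Required input fields that have no matching testset column."""
    available = set(case_model.model_fields)
    return sorted(
        name
        for name, info in inputs_model.model_fields.items()
        if info.is_required() and name not in available
    )


def check_compatible(case_model, inputs_model) -> None:
    """Raise before any run starts if the testset cannot feed the inputs."""
    missing = missing_columns(case_model, inputs_model)
    if missing:
        raise ValueError(f"testset is missing required columns: {missing}")


print(missing_columns(Case, AppInputs))
```

The check only inspects class definitions, so it costs nothing to run before the first LLM call.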

Serving (05_serve.py). Each class exposes a standard APIRouter with typed /invoke and /inspect endpoints. You mount it with app.include_router like any FastAPI router. No custom registration call.

Notes

This is purely a design document to spark discussion. The ag.Application, ag.Evaluator, ag.Configuration, and ag.Testset classes and their related methods do not exist yet. The main question for review is whether the proposed surface feels right before any implementation starts.

The implementation path (how classes map onto WorkflowRevisionData, the ~20-line seam change in auto_workflow, what new files would live in sdk/authoring/) is sketched in docs/designs/class-based-sdk/README.md.

vercel · 2026-06-10T19:55:04Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 11, 2026 2:48pm

coderabbitai · 2026-06-10T19:55:08Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



design(sdk): class-oriented authoring API proposal (POC)

b36032b

mmabrouk requested review from ardaerzin and jp-agenta June 10, 2026 20:14

Add functional SDK extensions

766056f

vercel Bot deployed to Preview June 11, 2026 14:43

Merge branch 'main' into design/class-based-sdk

4f75f50

vercel Bot deployed to Preview June 11, 2026 14:48

mmabrouk wants to merge 3 commits into main from design/class-based-sdk


3 participants
