fix(frontend): show user feedback as filterable fields in observability by mmabrouk · Pull Request #4652 · Agenta-AI/agenta

mmabrouk · 2026-06-11T17:20:16Z

Context

When you sent user feedback on a trace through the API (POST /api/simple/traces/ with an evaluator slug), the Observability trace filter never offered that feedback as a filterable field. The feedback was stored but invisible to filtering.

Two root causes, both on the frontend:

- The filter built its feedback options from evaluator.metrics, but its data source returns thin evaluator refs that carry no data/metrics, so the list was always empty (regressed in 5867326f6b).
- Evaluators auto-created from feedback infer their output schema with genson over the whole trace data envelope, so the stored schema is wrapped one level deeper than for UI-created evaluators. Every frontend consumer expected the flat shape.

Per product decision the backend inference stays as is. The frontend normalizes the shape on read.

Changes

New evaluatorFeedbackSchemasAtom (web/packages/agenta-entities/.../state/evaluatorUtils.ts) resolves each non-archived evaluator's latest revision and exposes its output-metric properties. It follows the existing evaluatorKeyMapAtom pattern. Filters.tsx now reads it instead of the empty .metrics. deriveFeedbackValueType is unchanged, so number/boolean/string detection (including array item types) is preserved.

resolveOutputSchemaProperties (schema.ts) now unwraps the genson envelope when the output schema is a single outputs object with nested properties.

Before (auto-created feedback evaluator output schema):

{ properties: { outputs: { properties: { score, comment } } } }

After (what every consumer now reads):

{ score, comment }

The unwrap is a strict no-op for every other evaluator. No builtin or human evaluator has a lone outputs key with nested properties.
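
A minimal sketch of the unwrap rule described above, to make the "lone outputs key with nested properties" condition concrete. The `JsonSchema` type and the function body are simplified assumptions for illustration, not the code in schema.ts; only the helper name `unwrapEnvelopeProperties` is taken from the review walkthrough.

```typescript
// Simplified stand-in for a JSON Schema node (assumption, not the package's type).
type JsonSchema = {
    type?: string
    title?: string
    properties?: Record<string, JsonSchema>
}

// Unwrap `{outputs: {properties: {...}}}` to its inner properties.
// Anything that is not a lone `outputs` key with nested properties passes through.
function unwrapEnvelopeProperties(
    properties: Record<string, JsonSchema> | null | undefined,
): Record<string, JsonSchema> {
    if (!properties) return {}
    const keys = Object.keys(properties)
    if (keys.length !== 1 || keys[0] !== "outputs") return properties
    const nested = properties["outputs"]?.properties
    // A real metric literally named `outputs` has no nested properties: keep it.
    if (!nested || Object.keys(nested).length === 0) return properties
    return nested
}

const wrappedSchema: Record<string, JsonSchema> = {
    outputs: {type: "object", properties: {score: {type: "number"}, comment: {type: "string"}}},
}
const flatSchema: Record<string, JsonSchema> = {
    score: {type: "number"},
    comment: {type: "string"},
}
const metricNamedOutputs: Record<string, JsonSchema> = {outputs: {type: "string"}}

console.log(Object.keys(unwrapEnvelopeProperties(wrappedSchema))) // keys: score, comment
console.log(Object.keys(unwrapEnvelopeProperties(flatSchema))) // keys: score, comment
console.log(Object.keys(unwrapEnvelopeProperties(metricNamedOutputs))) // keys: outputs
```

The last case is why the check requires nested properties: a flat schema whose only metric is called outputs must not be unwrapped.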

Feedback field free-text entry: evaluators without an output schema suggest no metrics, and the feedback-field Select used to clear anything you typed. It now surfaces the typed text as a <typed> (custom) option that commits and persists, so you can filter by a feedback name even when no schema provides one.
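
The free-text behavior can be sketched as a pure function that appends a custom option when the typed (or already selected) value is not among the suggestions. `FieldOption` and `withCustomOption` are hypothetical names for illustration; the actual logic lives inside Filters.tsx and may differ.

```typescript
// Hypothetical option shape; the real Select options may carry more fields.
type FieldOption = {value: string; label: string}

// Return the suggested options plus a "<value> (custom)" entry for any typed or
// already-selected value that no schema suggested.
function withCustomOption(
    suggested: FieldOption[],
    typed: string,
    selected?: string,
): FieldOption[] {
    const options = suggested.slice()
    const has = (value: string) => options.some((o) => o.value === value)
    const candidates = [selected, typed.trim()]
    candidates.forEach((candidate) => {
        if (candidate && !has(candidate)) {
            options.push({value: candidate, label: `${candidate} (custom)`})
        }
    })
    return options
}

// Evaluator without a schema: the typed name becomes the only option.
console.log(withCustomOption([], "helpfulness"))
// Typed text matches a suggestion: nothing is added.
console.log(withCustomOption([{value: "score", label: "score"}], "score"))
```

Including the already-selected value keeps a committed custom name visible after the search text is cleared, which is what lets the row stay usable.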

Docs: corrected two pages that claimed evaluators are not auto-created. They are, with an output schema inferred from the first annotation.

Tests / notes

- Unit tests in web/packages/agenta-entities/tests/unit/resolveOutputSchemaProperties.test.ts cover the wrapped unwrap, flat passthrough, a single real metric literally named outputs, a multi-key schema, and empty/missing schemas. 6/6 pass.
- @agenta/entities typechecks. ESLint is clean on the changed files.
- Verified on the dev box: sent feedback with a fresh slug via POST /api/simple/traces/, confirmed the evaluator was auto-created with the wrapped schema, then confirmed score/comment appeared as feedback filter fields with score resolving to a Number filter.
- The evaluator-name dropdown still sources from the full evaluator list (including archived evaluators), while feedback fields come from non-archived evaluators only. Not a regression. Flag for follow-up if archived feedback evaluators should also be hidden from the dropdown.

What to QA

- In a project with user feedback sent via the API, open Observability, open Filter, add an "Is Annotation" condition, pick the evaluator, then open the Feedback field dropdown. It should list the evaluator's metric keys (e.g. score, comment). Selecting a numeric metric should give a Number filter.
- Send fresh feedback to a brand-new evaluator slug via POST /api/simple/traces/ (data.outputs + references.evaluator.slug + links.invocation). The new evaluator's fields should show up in the filter.
- In the feedback field, type a name that is not suggested (or pick an evaluator with no schema). A <name> (custom) option should appear; selecting it keeps the value and the row stays usable.
- Regression: existing UI-created human evaluators should still show their metrics in the filter and in the trace drawer annotation tab.

The observability trace filter never listed annotation feedback fields (score, comment, etc.) from evaluators, so feedback sent via the API was not filterable. Two causes, both fixed on the frontend:
- The filter read evaluator.metrics off thin list refs that carry no data; it now resolves each evaluator's latest revision via a new evaluatorFeedbackSchemasAtom.
- Auto-created feedback evaluators store a genson-inferred output schema wrapped one level deeper ({outputs:{properties}}); resolveOutputSchemaProperties now unwraps that envelope so real metric keys surface.

Also corrects docs that claimed evaluators are not auto-created.

coderabbitai · 2026-06-11T17:20:36Z

📝 Walkthrough

Summary by CodeRabbit

Release Notes

Documentation
- Clarified evaluator auto-creation behavior: Agenta automatically creates evaluators on first annotation and infers schemas from the provided data; users can pre-create evaluators to enforce specific schema details like fixed labels and required fields.
Improvements
- Enhanced feedback field selection with improved custom entry support and search functionality.

Walkthrough

This pull request implements evaluator feedback schema support by unwrapping genson-inferred output envelopes, building a Jotai atom to expose per-evaluator properties, refactoring the Filters component to consume feedback schemas (and accept typed custom fields), and updating docs to explain evaluator auto-creation and schema inference.

Changes

Evaluator Feedback Schema Pipeline

Layer / File(s)	Summary
Schema unwrapping helper and core logic `web/packages/agenta-entities/src/workflow/core/schema.ts`, `web/packages/agenta-entities/tests/unit/resolveOutputSchemaProperties.test.ts`	`unwrapEnvelopeProperties` detects and unwraps single-key `outputs` envelopes in genson-inferred schemas; `resolveOutputSchemaProperties` applies unwrapping to expose real metric keys. Tests validate envelope unwrapping across edge cases including preserved non-envelope schemas, non-unwrapped `outputs` with multiple siblings, and `null` handling.
Feedback schema state atom and exports `web/packages/agenta-entities/src/workflow/state/evaluatorUtils.ts`, `web/packages/agenta-entities/src/workflow/state/index.ts`, `web/packages/agenta-entities/src/workflow/index.ts`	`EvaluatorFeedbackSchema` interface and `evaluatorFeedbackSchemasAtom` build per-evaluator output-metric property data by resolving each evaluator's latest revision and applying `resolveOutputSchemaProperties`. The atom is re-exported through state and workflow barrel modules as public API.
Filters component integration `web/oss/src/components/Filters/Filters.tsx`	The component imports `evaluatorFeedbackSchemasAtom` and replaces metrics-based option derivation with schema-based building: iterating each evaluator's `properties`, using property keys as option values, `schema.title` as labels, and `deriveFeedbackValueType` to determine option types. Adds `feedbackFieldSearch` state and typed-custom/select-preserve behavior for feedback-field `<Select>`.
User documentation updates `docs/docs/observability/trace-with-python-sdk/08-annotate-traces.mdx`, `docs/docs/tutorials/cookbooks/01-capture-user-feedback.mdx`	Documentation clarifies that evaluators are auto-created on first annotation with inferred schemas, and users can pre-create evaluators to control output schema structure (fixed labels, required fields, value ranges).
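
The option derivation summarized in the Filters row above can be sketched roughly as follows. All names here (`FeedbackSchemaSketch`, `inferValueType`, `buildFeedbackOptions`) are hypothetical stand-ins; in particular `inferValueType` only approximates what `deriveFeedbackValueType` is described as doing (number/boolean/string detection, including array item types), and the fallback to the property key when `title` is missing is an assumption.

```typescript
// Simplified stand-ins for illustration; not the types exported by @agenta/entities.
type JsonSchema = {
    type?: string
    title?: string
    items?: JsonSchema
}
type FeedbackSchemaSketch = {evaluatorSlug: string; properties: Record<string, JsonSchema>}
type ValueType = "number" | "boolean" | "string"
type FeedbackOption = {evaluatorSlug: string; value: string; label: string; type: ValueType}

// Approximation of value-type detection: arrays are typed by their item type.
function inferValueType(schema: JsonSchema): ValueType {
    const t = schema.type === "array" ? schema.items?.type : schema.type
    if (t === "number" || t === "integer") return "number"
    if (t === "boolean") return "boolean"
    return "string"
}

// One option per output property: key as value, title (or key) as label.
function buildFeedbackOptions(schemas: FeedbackSchemaSketch[]): FeedbackOption[] {
    const options: FeedbackOption[] = []
    schemas.forEach(({evaluatorSlug, properties}) => {
        Object.keys(properties).forEach((key) => {
            const schema = properties[key]
            if (!schema) return
            options.push({
                evaluatorSlug,
                value: key,
                label: schema.title ?? key,
                type: inferValueType(schema),
            })
        })
    })
    return options
}

const feedbackOptions = buildFeedbackOptions([
    {
        evaluatorSlug: "user-feedback",
        properties: {
            score: {type: "number", title: "Score"},
            comment: {type: "string"},
            tags: {type: "array", items: {type: "string"}},
        },
    },
])
console.log(feedbackOptions)
```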

Sequence Diagram

sequenceDiagram
  participant Filter as Filters Component
  participant Atom as evaluatorFeedbackSchemasAtom
  participant Schema as resolveOutputSchemaProperties
  participant Evaluator as Evaluator Revision
  Filter->>Atom: useAtomValue(evaluatorFeedbackSchemasAtom)
  Atom->>Evaluator: iterate non-archived evaluators
  Evaluator->>Schema: revision.data
  Schema->>Schema: unwrapEnvelopeProperties
  Schema-->>Atom: properties
  Atom-->>Filter: EvaluatorFeedbackSchema[]
  Filter->>Filter: useMemo: iterate properties
  Filter->>Filter: property key → value, schema.title → label
  Filter-->>Filter: annotationFeedbackOptions

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main frontend fix—showing user feedback as filterable fields in observability—which is the primary objective of this PR.
Description check	✅ Passed	The description is comprehensive and related to the changeset, clearly explaining context, root causes, changes made, and testing/verification steps.
Linked Issues check	✅ Passed	The PR fully addresses linked issue `#4654` by fixing both root causes: it reads evaluator feedback schemas from a new atom instead of empty .metrics, and unwraps the genson-inferred nested schema so feedback metrics (score, comment) appear as filterable fields.
Out of Scope Changes check	✅ Passed	All changes align with the linked issue objectives: new atom for feedback schemas, schema unwrapping logic, Filters component updates, and documentation corrections. No unrelated changes detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 60.00%.

vercel · 2026-06-11T17:20:51Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agenta-documentation	Ready	Preview, Comment	Jun 12, 2026 11:06am

github-actions · 2026-06-11T17:35:47Z

Railway Preview Environment


Preview URL	https://gateway-production-c0e7.up.railway.app/w
Project	`agenta-oss-pr-4652`
Image tag	`pr-4652-b44023f`
Status	Deployed
Railway logs	Open logs
Workflow logs	View workflow run
Updated at 2026-06-12T11:17:33.320Z

linear-code · 2026-06-11T17:36:59Z

AGE-3826

junaway · 2026-06-11T19:30:46Z

@mmabrouk @ardaerzin
Shouldn't this be fixed instead of patched?
Very curious how this never popped up before.

…ty filter

Evaluators without an output schema expose no feedback metrics to suggest, and the feedback-field Select cleared any typed value. The Select now surfaces the typed text as a '<typed> (custom)' option that commits and persists, so users can filter by a feedback name even when the schema can't provide one.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

web/oss/src/components/Filters/Filters.tsx (1)

1788-1793: ⚡ Quick win

Debounce feedback-field search updates to reduce rerender churn.

onSearch writes state on every keystroke, per row. Please debounce or throttle this handler (even a small 150–250ms debounce) to align with frontend guidance and reduce unnecessary rerenders when many rows are open.

As per coding guidelines, “Debounce/throttle search inputs, filters, scroll and resize handlers in React components”.

Source: Coding guidelines
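
A minimal sketch of the suggested debounce, written as a dependency-free helper. The `debounce` helper, its `flush` method, and the `onFeedbackFieldSearch` handler are hypothetical illustrations, not code from Filters.tsx; a project would more likely reuse an existing debounce utility.

```typescript
type Debounced<A extends unknown[]> = ((...args: A) => void) & {flush: () => void}

// Trailing-edge debounce: only the last call within `waitMs` reaches `fn`.
function debounce<A extends unknown[]>(fn: (...args: A) => void, waitMs: number): Debounced<A> {
    let timer: ReturnType<typeof setTimeout> | undefined
    let pending: A | undefined

    const run = () => {
        timer = undefined
        if (pending === undefined) return
        const args = pending
        pending = undefined
        fn(...args)
    }

    const debounced = ((...args: A) => {
        pending = args
        if (timer !== undefined) clearTimeout(timer)
        timer = setTimeout(run, waitMs)
    }) as Debounced<A>

    // Run any pending call immediately (useful on blur or unmount).
    debounced.flush = () => {
        if (timer !== undefined) clearTimeout(timer)
        run()
    }
    return debounced
}

// Hypothetical usage: commit the search text 200ms after the last keystroke.
const committedSearches: string[] = []
const onFeedbackFieldSearch = debounce((rowId: string, text: string) => {
    committedSearches.push(`${rowId}:${text}`)
}, 200)

onFeedbackFieldSearch("row-a", "s")
onFeedbackFieldSearch("row-a", "sc")
onFeedbackFieldSearch("row-a", "sco")
onFeedbackFieldSearch.flush() // only the last keystroke state is committed
console.log(committedSearches)
```

Because a single debounced function keeps only its latest arguments, a shared instance would drop a pending update for one row when another row is typed into; one instance per row avoids that.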

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 5240b9b and 4c13a15.

📒 Files selected for processing (1)

web/oss/src/components/Filters/Filters.tsx

coderabbitai · 2026-06-12T11:08:50Z

+    // Free-text the user is typing into a row's feedback-field Select. Lets them name a
+    // feedback metric even when the evaluator has no output schema to suggest options.
+    const [feedbackFieldSearch, setFeedbackFieldSearch] = useState<Record<number, string>>({})


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Stale custom-search text can leak across rows after index shifts.

feedbackFieldSearch is keyed by row index, but row deletes/reordering can shift indices while this map is preserved. That can show a stale “(custom)” typed option on the wrong row. Prefer keying by a stable row id, or prune/remap feedbackFieldSearch whenever rows are removed/reset.

Also applies to: 1788-1809
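
One way to address this, sketched under the assumption that each filter row has (or can be given) a stable string id. `SearchByRowId` and `pruneSearchState` are hypothetical names; the reviewed code keys the map by numeric index.

```typescript
// Search text keyed by a stable row id instead of the row's array index.
type SearchByRowId = Record<string, string>

// Keep only entries whose row still exists, so removed rows leave no stale text.
function pruneSearchState(state: SearchByRowId, liveRowIds: string[]): SearchByRowId {
    const next: SearchByRowId = {}
    liveRowIds.forEach((id) => {
        const text = state[id]
        if (text !== undefined) next[id] = text
    })
    return next
}

// Row "a" is deleted; row "b" keeps its own text regardless of its new position.
const beforeDelete: SearchByRowId = {a: "sco", b: "com"}
const afterDelete = pruneSearchState(beforeDelete, ["b", "c"])
console.log(afterDelete)
```

With index keys, deleting the first row would leave the entry at index 0 attached to whatever row shifts into that slot; with id keys, pruning on row removal is enough.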

mmabrouk · 2026-06-12T12:30:48Z

@mmabrouk @ardaerzin Shouldn't this be fixed instead of patched? Very curious how this never popped up before.

How is this a patch? I considered the backend response as the ground truth and fixed the FE to follow that pattern.

The only "patch" is allowing users to write down the name of the feedback in case the evaluator does not have a schema (which is something we allow for the moment)

dosubot Bot added the size:L (this PR changes 100-499 lines, ignoring generated files), documentation, and Frontend labels Jun 11, 2026

vercel Bot deployed to Preview June 11, 2026 17:20 View deployment

Merge branch 'main' into fix/observability-feedback-filter

39993fc

vercel Bot deployed to Preview June 12, 2026 08:12 View deployment

vercel Bot deployed to Preview June 12, 2026 11:06 View deployment

coderabbitai Bot reviewed Jun 12, 2026

View reviewed changes

mmabrouk requested a review from ardaerzin June 12, 2026 12:29

ardaerzin approved these changes Jun 12, 2026

View reviewed changes

mmabrouk wants to merge 3 commits into main from fix/observability-feedback-filter
