fix(frontend): show user feedback as filterable fields in observability #4652
mmabrouk wants to merge 3 commits
The observability trace filter never listed annotation feedback fields
(score, comment, etc.) from evaluators, so feedback sent via the API was
not filterable.
Two causes, both fixed on the frontend:
- The filter read evaluator.metrics off thin list refs that carry no
data; it now resolves each evaluator's latest revision via a new
evaluatorFeedbackSchemasAtom.
- Auto-created feedback evaluators store a genson-inferred output schema
wrapped one level deeper ({outputs:{properties}});
resolveOutputSchemaProperties now unwraps that envelope so real metric
keys surface.
Also corrects docs that claimed evaluators are not auto-created.
Walkthrough (summary by CodeRabbit)
This pull request implements evaluator feedback schema support by unwrapping genson-inferred output envelopes, building a Jotai atom to expose per-evaluator properties, refactoring the Filters component to consume feedback schemas (and accept typed custom fields), and updating docs to explain evaluator auto-creation and schema inference.
Sequence Diagram
sequenceDiagram
participant Filter as Filters Component
participant Atom as evaluatorFeedbackSchemasAtom
participant Schema as resolveOutputSchemaProperties
participant Evaluator as Evaluator Revision
Filter->>Atom: useAtomValue(evaluatorFeedbackSchemasAtom)
Atom->>Evaluator: iterate non-archived evaluators
Evaluator->>Schema: revision.data
Schema->>Schema: unwrapEnvelopeProperties
Schema-->>Atom: properties
Atom-->>Filter: EvaluatorFeedbackSchema[]
Filter->>Filter: useMemo: iterate properties
Filter->>Filter: property key → value, schema.title → label
Filter-->>Filter: annotationFeedbackOptions
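The last steps of the diagram (property key → value, schema.title → label) can be sketched as a pure function. This is a minimal illustration, not the code in Filters.tsx: the `EvaluatorFeedbackSchema` shape, the `evaluatorSlug` field, and the `buildFeedbackOptions` name are assumptions made for the example, and falling back to the key when a property has no title is also an assumption.

```typescript
// Hypothetical shapes: the real EvaluatorFeedbackSchema type is not shown on this page.
interface PropertySchema {
    type?: string
    title?: string
}

interface EvaluatorFeedbackSchema {
    evaluatorSlug: string
    properties: Record<string, PropertySchema>
}

interface FeedbackOption {
    value: string
    label: string
    evaluatorSlug: string
}

// Map every property of every evaluator schema to one filter option:
// the property key becomes the value, schema.title (or the key) the label.
function buildFeedbackOptions(schemas: EvaluatorFeedbackSchema[]): FeedbackOption[] {
    const options: FeedbackOption[] = []
    for (const schema of schemas) {
        for (const [key, property] of Object.entries(schema.properties)) {
            options.push({
                value: key,
                label: property.title ?? key,
                evaluatorSlug: schema.evaluatorSlug,
            })
        }
    }
    return options
}

const demoOptions = buildFeedbackOptions([
    {
        evaluatorSlug: "user-feedback",
        properties: {score: {type: "number", title: "Score"}, comment: {type: "string"}},
    },
])
console.log(demoOptions.map((o) => `${o.value}:${o.label}`).join(", "))
```

In a component, a function like this would typically sit inside a useMemo keyed on the atom value, which matches the "useMemo: iterate properties" step in the diagram.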
…ty filter
Evaluators without an output schema expose no feedback metrics to suggest, and the feedback-field Select cleared any typed value. The Select now surfaces the typed text as a '<typed> (custom)' option that commits and persists, so users can filter by a feedback name even when the schema can't provide one.
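The '<typed> (custom)' behaviour described in this commit can be sketched as a small pure helper. This is an illustration under assumptions, not the code from Filters.tsx: the `SelectOption` shape and the `withCustomOption` name are hypothetical, and skipping the custom entry when the typed text already matches a listed value is a design choice made for the example.

```typescript
interface SelectOption {
    value: string
    label: string
}

// Append a "<typed> (custom)" entry when the typed text is non-empty
// and does not match an option the schema already provides.
function withCustomOption(options: SelectOption[], typed: string): SelectOption[] {
    const text = typed.trim()
    if (text === "") return options
    if (options.some((option) => option.value === text)) return options
    return [...options, {value: text, label: `${text} (custom)`}]
}

// An evaluator without an output schema suggests nothing,
// so the typed name is the only option offered.
console.log(withCustomOption([], "tone"))
```

Because the custom entry carries the raw typed text as its value, selecting it commits the same string a schema-provided option would.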
Actionable comments posted: 1
🧹 Nitpick comments (1)
web/oss/src/components/Filters/Filters.tsx (1)
1788-1793: ⚡ Quick win
Debounce feedback-field search updates to reduce rerender churn.
onSearch writes state on every keystroke per row. Please debounce/throttle this handler (even a small 150–250ms debounce) to align with frontend guidance and avoid unnecessary render pressure when many rows are open.
As per coding guidelines, “Debounce/throttle search inputs, filters, scroll and resize handlers in React components”.
Source: Coding guidelines
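A minimal sketch of the suggested debounce, written without React so it runs on its own. The `debounce` helper is a generic trailing-edge implementation; `searchByRow` and `commitSearch` are hypothetical stand-ins for the per-row state in Filters.tsx, and the 200 ms wait is just a value inside the suggested range.

```typescript
// Generic trailing-edge debounce: only the last call within waitMs runs.
function debounce<Args extends unknown[]>(
    fn: (...args: Args) => void,
    waitMs: number,
): (...args: Args) => void {
    let timer: ReturnType<typeof setTimeout> | undefined
    return (...args: Args): void => {
        if (timer !== undefined) clearTimeout(timer)
        timer = setTimeout(() => {
            timer = undefined
            fn(...args)
        }, waitMs)
    }
}

// Hypothetical stand-in for the per-row search state.
const searchByRow: Record<number, string> = {}
const commitSearch = (rowIndex: number, text: string): void => {
    searchByRow[rowIndex] = text
}

const onSearch = debounce(commitSearch, 200)
onSearch(0, "s")
onSearch(0, "sc")
onSearch(0, "score") // only this call writes state, roughly 200 ms later
```

A single shared debouncer drops a pending update if another row fires within the wait window, so in a component one would likely create the debounced handler per row (or memoize it) rather than share one instance.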
// Free-text the user is typing into a row's feedback-field Select. Lets them name a
// feedback metric even when the evaluator has no output schema to suggest options.
const [feedbackFieldSearch, setFeedbackFieldSearch] = useState<Record<number, string>>({})
Stale custom-search text can leak across rows after index shifts.
feedbackFieldSearch is keyed by row index, but row deletes/reordering can shift indices while this map is preserved. That can show a stale “(custom)” typed option on the wrong row. Prefer keying by a stable row id, or prune/remap feedbackFieldSearch whenever rows are removed/reset.
Also applies to: 1788-1809
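One way to follow this suggestion is to key the search text by a stable row id and prune entries for rows that no longer exist. The sketch below is an assumption-laden illustration: the `FilterRow` shape, its `id` field, and the `pruneSearchState` name are hypothetical and may not match how rows are modelled in Filters.tsx.

```typescript
interface FilterRow {
    id: string // hypothetical stable identifier, assigned when the row is created
    field: string
}

type SearchByRowId = Record<string, string>

// Keep only the search text that belongs to rows still present.
function pruneSearchState(search: SearchByRowId, rows: FilterRow[]): SearchByRowId {
    const liveIds = new Set(rows.map((row) => row.id))
    const next: SearchByRowId = {}
    for (const [id, text] of Object.entries(search)) {
        if (liveIds.has(id)) next[id] = text
    }
    return next
}

// Row "b" holds typed text. Deleting row "a" shifts indices,
// but the text stays attached to "b" because the key is the id.
const rowsAfterDelete: FilterRow[] = [
    {id: "b", field: "feedback"},
    {id: "c", field: "feedback"},
]
console.log(pruneSearchState({a: "stale", b: "tone"}, rowsAfterDelete))
```

With an index-keyed map the same delete would leave the text at index 1, which now points at row "c"; keying by id avoids that without any remapping.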
How is this a patch? I treated the backend response as the ground truth and fixed the frontend to follow that pattern. The only "patch" is letting users type the name of the feedback when the evaluator has no schema (which is something we allow for the moment).
Closes #4654
Context
When you sent user feedback on a trace through the API (POST /api/simple/traces/ with an evaluator slug), the Observability trace filter never offered that feedback as a filterable field. The feedback was stored but invisible to filtering.
Two root causes, both on the frontend:
- The filter read evaluator.metrics, but its data source returns thin evaluator refs that carry no data/metrics, so the list was always empty (regressed in 5867326f6b).
- Auto-created feedback evaluators have their output schema inferred (by genson) over the data envelope, so the stored schema is wrapped one level deeper than UI-created evaluators. Every frontend consumer expected the flat shape.
Per product decision the backend inference stays as is. The frontend normalizes the shape on read.
Changes
- New evaluatorFeedbackSchemasAtom (web/packages/agenta-entities/.../state/evaluatorUtils.ts) resolves each non-archived evaluator's latest revision and exposes its output-metric properties. It follows the existing evaluatorKeyMapAtom pattern.
- Filters.tsx now reads it instead of the empty .metrics. deriveFeedbackValueType is unchanged, so number/boolean/string detection (including array item types) is preserved.
- resolveOutputSchemaProperties (schema.ts) now unwraps the genson envelope when the output schema is a single outputs object with nested properties.
Before (auto-created feedback evaluator output schema): metric properties nested under a single outputs object ({outputs:{properties}}).
After (what every consumer now reads): the metric properties at the top level.
The unwrap is a strict no-op for every other evaluator. No builtin or human evaluator has a lone outputs key with nested properties.
- Feedback field free-text entry: evaluators without an output schema suggest no metrics, and the feedback-field Select used to clear anything you typed. It now surfaces the typed text as a <typed> (custom) option that commits and persists, so you can filter by a feedback name even when no schema provides one.
- Docs: corrected two pages that claimed evaluators are not auto-created. They are, with an output schema inferred from the first annotation.
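The envelope unwrap described above can be sketched as follows. This is a minimal reconstruction from the description, not the code in schema.ts: the `JsonSchema` type and the `resolveOutputSchemaPropertiesSketch` name are assumptions for the example, and the body of `unwrapEnvelopeProperties` (a name that appears in the sequence diagram) is a guess at its behaviour, written to match the cases the PR says its unit tests cover.

```typescript
// Simplified JSON Schema node, enough for this illustration.
interface JsonSchema {
    type?: string
    title?: string
    properties?: Record<string, JsonSchema>
    items?: JsonSchema
}

// Unwrap only when the sole key is "outputs" AND it carries nested properties.
// Anything else is returned untouched, so flat schemas are a strict no-op.
function unwrapEnvelopeProperties(
    properties: Record<string, JsonSchema>,
): Record<string, JsonSchema> {
    const keys = Object.keys(properties)
    if (keys.length !== 1 || keys[0] !== "outputs") return properties
    const nested = properties["outputs"]?.properties
    if (!nested || Object.keys(nested).length === 0) return properties
    return nested
}

function resolveOutputSchemaPropertiesSketch(
    schema: JsonSchema | null | undefined,
): Record<string, JsonSchema> {
    return unwrapEnvelopeProperties(schema?.properties ?? {})
}

// Wrapped shape, as stored for an auto-created feedback evaluator.
const wrapped: JsonSchema = {
    type: "object",
    properties: {
        outputs: {
            type: "object",
            properties: {score: {type: "number"}, comment: {type: "string"}},
        },
    },
}
console.log(Object.keys(resolveOutputSchemaPropertiesSketch(wrapped)))
```

Requiring nested properties before unwrapping is what keeps a real metric literally named outputs (for example a single number) from being mistaken for the envelope.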
Tests / notes
- Unit tests in web/packages/agenta-entities/tests/unit/resolveOutputSchemaProperties.test.ts cover the wrapped unwrap, flat passthrough, a single real metric literally named outputs, a multi-key schema, and empty/missing schemas. 6/6 pass.
- @agenta/entities typechecks. ESLint is clean on the changed files.
- Manual check: sent feedback through POST /api/simple/traces/, confirmed the evaluator was auto-created with the wrapped schema, then confirmed score/comment appeared as feedback filter fields with score resolving to a Number filter.
What to QA
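- In the Observability trace filter, an evaluator's feedback fields are listed (score, comment). Selecting a numeric metric gives a Number filter.
- Send feedback through POST /api/simple/traces/ (data.outputs + references.evaluator.slug + links.invocation). The new evaluator's fields show up in the filter.
- For an evaluator without an output schema, type a feedback name: a <name> (custom) option appears; selecting it keeps the value and the row stays usable.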