Skip to content

Add CV screening example with curated resume test set#4607

Draft
mmabrouk wants to merge 6 commits into
mainfrom
claude/cv-classifier-demo-oug3jb
Draft

Add CV screening example with curated resume test set#4607
mmabrouk wants to merge 6 commits into
mainfrom
claude/cv-classifier-demo-oug3jb

Conversation

@mmabrouk

@mmabrouk mmabrouk commented Jun 9, 2026

Copy link
Copy Markdown
Member

A walkthrough demo for classifying CVs against a job spec with Agenta:

  • Curated test set of 30 real Markdown CVs (from the public
    opensporks/resumes dataset on Hugging Face, a mirror of the Kaggle
    Resume Dataset), hand-labeled against an IT Manager job spec
  • prepare_testset.py rebuilds the CSV reproducibly and can upload it
    to Agenta via the SDK
  • create_app.py creates the completion app with the screening prompt
    and structured-output JSON schema, and deploys it to production
  • Streamlit demo UI: PDF upload -> Markdown (markitdown) -> prompt
    fetched from the Agenta registry -> structured score dashboard
  • Sample CV PDFs (one per classification) generated from the test set

https://claude.ai/code/session_01YMbf4sUb2VBFQHGNKv6yh3

claude added 2 commits June 9, 2026 20:46
A walkthrough demo for classifying CVs against a job spec with Agenta:

- Curated test set of 30 real Markdown CVs (from the public
  opensporks/resumes dataset on Hugging Face, a mirror of the Kaggle
  Resume Dataset), hand-labeled against an IT Manager job spec
- prepare_testset.py rebuilds the CSV reproducibly and can upload it
  to Agenta via the SDK
- create_app.py creates the completion app with the screening prompt
  and structured-output JSON schema, and deploys it to production
- Streamlit demo UI: PDF upload -> Markdown (markitdown) -> prompt
  fetched from the Agenta registry -> structured score dashboard
- Sample CV PDFs (one per classification) generated from the test set

https://claude.ai/code/session_01YMbf4sUb2VBFQHGNKv6yh3
The Streamlit app now shows a thumbs up/down form with an optional
comment after each screening. Submitting it attaches the feedback to
the screening's trace in Agenta as an annotation (evaluator slug
'user-feedback'), following the capture-user-feedback cookbook:
the invocation link is captured inside the instrumented classify_cv
call and the annotation is POSTed to /api/simple/traces/.

Screening results now persist in session state so the result and
feedback form survive Streamlit reruns. Entry scripts load .env via
python-dotenv, matching the documented setup flow.

https://claude.ai/code/session_01YMbf4sUb2VBFQHGNKv6yh3
@vercel

vercel Bot commented Jun 9, 2026

Copy link
Copy Markdown

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agenta-documentation Ready Ready Preview, Comment Jun 11, 2026 12:38pm

Request Review

@dosubot dosubot Bot added example python Pull requests that update Python code size:XL This PR changes 500-999 lines, ignoring generated files. labels Jun 9, 2026
@mmabrouk mmabrouk marked this pull request as draft June 9, 2026 21:43
@coderabbitai

coderabbitai Bot commented Jun 9, 2026

Copy link
Copy Markdown

Review Change Stack

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 9a496be8-f6c1-482e-ae68-8e55d9528686

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR introduces a complete, production-ready CV screening example for the Python SDK. It includes shared configuration with a structured JSON schema for classification results, scripts to prepare a curated test set from external resume data, an Agenta deployment script, and an interactive Streamlit demo that fetches prompts from Agenta, runs LLM screening, and collects user feedback as trace annotations.

Changes

CV Screening Example

Layer / File(s) Summary
Overview, Documentation, and Infrastructure
examples/python/Readme.md, examples/python/cv-screening/Readme.md, examples/python/cv-screening/.env.example, examples/python/cv-screening/requirements.txt, examples/python/cv-screening/data/.gitignore
Root README adds CV screening to use cases table. New cv-screening README documents the full end-to-end workflow. Environment template defines AGENTA_API_KEY, AGENTA_HOST, and OPENAI_API_KEY. Dependencies include Agenta SDK, OpenAI client, Streamlit, test-data tools, and PDF generation.
Shared Configuration and Agenta Deployment
examples/python/cv-screening/config.py, examples/python/cv-screening/create_app.py
config.py defines app/variant slugs, system/user prompts with {cv} template, a strict JSON schema enforcing scores (1–5), requirement lists, classification enum, and reasoning. create_app.py initializes the Agenta client, creates the service completion app, publishes the prompt variant with the schema-based LLM config, and deploys to production.
Test Set Preparation and Sample Generation
examples/python/cv-screening/Readme.md (test set docs), examples/python/cv-screening/prepare_testset.py, examples/python/cv-screening/make_sample_pdfs.py
README documents test set construction from external resume dataset with curated IT-manager classifications. prepare_testset.py downloads a public Hugging Face parquet, applies hand-curated resume ID mappings, converts resume HTML to Markdown, writes data/testset.csv, and optionally uploads to Agenta. make_sample_pdfs.py renders selected test CVs as PDF files under data/sample_cvs/ with text normalization and FPDF styling.
Interactive Demo with PDF Upload and Feedback
examples/python/cv-screening/Readme.md (demo walkthrough), examples/python/cv-screening/app.py
README walks through setup, Agenta deployment, test-set upload, playground iteration, Streamlit demo run, and feedback collection. app.py provides a Streamlit dashboard: PDF upload and Markdown conversion, fetches the production prompt from Agenta (with local fallback), invokes OpenAI chat completion with the schema, captures Agenta trace invocation IDs, renders classification banner and per-area score metrics with progress bars, requirement lists, and a feedback form (thumbs up/down + optional comment) that posts to Agenta trace data and triggers a success rerun.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant StreamlitApp as Streamlit App
  participant Agenta
  participant OpenAI
  
  User->>StreamlitApp: Upload CV PDF
  StreamlitApp->>StreamlitApp: Convert PDF to Markdown
  StreamlitApp->>Agenta: Fetch production prompt config
  Agenta-->>StreamlitApp: Return prompt + LLM config
  User->>StreamlitApp: Click "Screen CV" button
  StreamlitApp->>OpenAI: Call chat completion<br/>with prompt + schema
  OpenAI-->>StreamlitApp: Return structured JSON<br/>(scores, requirements, classification)
  StreamlitApp->>StreamlitApp: Render classification banner<br/>+ score metrics + requirements
  StreamlitApp->>Agenta: Capture trace invocation ID
  User->>StreamlitApp: Submit feedback<br/>(thumbs up/down + comment)
  StreamlitApp->>Agenta: POST feedback as trace annotation
  Agenta-->>StreamlitApp: Success response (200/202)
  StreamlitApp-->>User: Show feedback confirmation
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 21.05% which is insufficient. The required threshold is 60.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main change: adding a CV screening example with a curated resume test set, which aligns with the core content of the PR.
Description check ✅ Passed The description is directly related to the changeset, providing clear details about the CV screening demo, test set preparation, deployment scripts, and Streamlit UI implementation.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/cv-classifier-demo-oug3jb

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
examples/python/cv-screening/requirements.txt (1)

1-19: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Pin dependency versions to avoid pulling vulnerable packages.

The requirements file specifies no version constraints, which means pip install will fetch the latest versions of all packages and their transitive dependencies. OSV Scanner has flagged numerous critical and high-severity vulnerabilities in transitive dependencies that could be pulled in, including:

  • aiohttp: 23 CRITICAL issues (SSRF, header injection, DoS, credential leaks)
  • gitpython: 9 CRITICAL issues (RCE, path traversal, arbitrary code execution)
  • litellm: 13 CRITICAL issues (SSTI, SQL injection, SSRF, eval-based RCE)
  • pillow: 6 CRITICAL issues (arbitrary code execution, buffer overflow, DoS)
  • pyarrow: 3 CRITICAL issues (arbitrary code execution)

While this is example code, users may run it in environments connected to real data or networks. Unpinned dependencies create a supply-chain risk.

🔒 Recommendation

Generate a pinned requirements.txt by running:

pip install -r requirements.txt
pip freeze > requirements.txt

Then review the frozen versions and update any packages flagged by pip-audit or OSV Scanner. Alternatively, specify minimum safe versions inline:

 # Agenta SDK + LLM client
-agenta
-openai
-python-dotenv
+agenta>=0.28.0
+openai>=1.0.0
+python-dotenv>=1.0.0

For the remaining packages, apply the same pattern after verifying secure minimum versions.

🧹 Nitpick comments (1)
examples/python/cv-screening/Readme.md (1)

20-22: 💤 Low value

Add language identifier to code fence.

The code fence starting at line 20 lacks a language identifier, triggering a markdownlint warning (MD040). While this is ASCII art rather than code, specifying text or leaving it as triple-backticks with no syntax highlighting improves consistency.

📝 Proposed fix
-```
+```text
 PDF upload ──> Markdown (markitdown) ──> prompt fetched from Agenta ──> LLM ──> structured scores
</details>

<!-- cr-comment:v1:63b9a63971a5e8574a05aef6 -->

</blockquote></details>

</blockquote></details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Organization UI

**Review profile**: CHILL

**Plan**: Pro Plus

**Run ID**: `7c1b7401-bc27-47de-b94d-d0d734c5558f`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between aed2d47357cc8d88347011835c7cc1f3f7f08ea7 and c28d1a2dca9c1982a3b8885929de57447d80e256.

</details>

<details>
<summary>⛔ Files ignored due to path filters (4)</summary>

* `examples/python/cv-screening/data/sample_cvs/candidate_chef.pdf` is excluded by `!**/*.pdf`
* `examples/python/cv-screening/data/sample_cvs/candidate_it_manager.pdf` is excluded by `!**/*.pdf`
* `examples/python/cv-screening/data/sample_cvs/candidate_it_supervisor.pdf` is excluded by `!**/*.pdf`
* `examples/python/cv-screening/data/testset.csv` is excluded by `!**/*.csv`

</details>

<details>
<summary>📒 Files selected for processing (10)</summary>

* `examples/python/Readme.md`
* `examples/python/cv-screening/.env.example`
* `examples/python/cv-screening/Readme.md`
* `examples/python/cv-screening/app.py`
* `examples/python/cv-screening/config.py`
* `examples/python/cv-screening/create_app.py`
* `examples/python/cv-screening/data/.gitignore`
* `examples/python/cv-screening/make_sample_pdfs.py`
* `examples/python/cv-screening/prepare_testset.py`
* `examples/python/cv-screening/requirements.txt`

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment thread examples/python/cv-screening/app.py Outdated
Comment on lines +94 to +109
response = requests.post(
f"{host}/api/simple/traces/",
headers={
"Content-Type": "application/json",
"Authorization": f"ApiKey {os.environ['AGENTA_API_KEY']}",
},
json={
"trace": {
"data": {"outputs": outputs},
"references": {"evaluator": {"slug": FEEDBACK_EVALUATOR_SLUG}},
"links": {"invocation": invocation},
}
},
timeout=30,
)
return response.status_code in (200, 202)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle feedback POST failures explicitly.

requests.post(...) can raise on timeout/connection issues, which can crash feedback submission instead of returning a clean UI error path.

Proposed fix
 def send_feedback(invocation: dict, thumbs_up: bool, comment: str) -> bool:
@@
-    response = requests.post(
-        f"{host}/api/simple/traces/",
-        headers={
-            "Content-Type": "application/json",
-            "Authorization": f"ApiKey {os.environ['AGENTA_API_KEY']}",
-        },
-        json={
-            "trace": {
-                "data": {"outputs": outputs},
-                "references": {"evaluator": {"slug": FEEDBACK_EVALUATOR_SLUG}},
-                "links": {"invocation": invocation},
-            }
-        },
-        timeout=30,
-    )
-    return response.status_code in (200, 202)
+    try:
+        response = requests.post(
+            f"{host}/api/simple/traces/",
+            headers={
+                "Content-Type": "application/json",
+                "Authorization": f"ApiKey {os.environ['AGENTA_API_KEY']}",
+            },
+            json={
+                "trace": {
+                    "data": {"outputs": outputs},
+                    "references": {"evaluator": {"slug": FEEDBACK_EVALUATOR_SLUG}},
+                    "links": {"invocation": invocation},
+                }
+            },
+            timeout=30,
+        )
+    except requests.RequestException:
+        return False
+    return response.ok

Comment on lines +185 to +192
cv_markdown = pdf_to_markdown(uploaded.getvalue())
with st.expander("Extracted Markdown", expanded=False):
st.markdown(cv_markdown)

if st.button("Screen candidate", type="primary"):
with st.spinner("Evaluating CV against the job spec ..."):
result = classify_cv(cv_markdown, config)
st.session_state["screening"] = {"cv": cv_markdown, "result": result}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard conversion/classification with user-facing error handling.

The main screening path can raise on PDF parsing, LLM call, or JSON decoding; currently those failures bubble up and break the interaction.

Proposed fix
-    cv_markdown = pdf_to_markdown(uploaded.getvalue())
+    try:
+        cv_markdown = pdf_to_markdown(uploaded.getvalue())
+    except Exception as exc:
+        st.error(f"Could not read this PDF: {exc}")
+        return
@@
     if st.button("Screen candidate", type="primary"):
         with st.spinner("Evaluating CV against the job spec ..."):
-            result = classify_cv(cv_markdown, config)
+            try:
+                result = classify_cv(cv_markdown, config)
+            except Exception as exc:
+                st.error(f"Screening failed: {exc}")
+                return
         st.session_state["screening"] = {"cv": cv_markdown, "result": result}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
cv_markdown = pdf_to_markdown(uploaded.getvalue())
with st.expander("Extracted Markdown", expanded=False):
st.markdown(cv_markdown)
if st.button("Screen candidate", type="primary"):
with st.spinner("Evaluating CV against the job spec ..."):
result = classify_cv(cv_markdown, config)
st.session_state["screening"] = {"cv": cv_markdown, "result": result}
try:
cv_markdown = pdf_to_markdown(uploaded.getvalue())
except Exception as exc:
st.error(f"Could not read this PDF: {exc}")
return
with st.expander("Extracted Markdown", expanded=False):
st.markdown(cv_markdown)
if st.button("Screen candidate", type="primary"):
with st.spinner("Evaluating CV against the job spec ..."):
try:
result = classify_cv(cv_markdown, config)
except Exception as exc:
st.error(f"Screening failed: {exc}")
return
st.session_state["screening"] = {"cv": cv_markdown, "result": result}

Comment on lines +26 to +44
try:
ag.AppManager.create(app_slug=APP_SLUG, app_type="SERVICE:completion")
except Exception as exc: # noqa: BLE001 - app may already exist
print(f" Application not created ({exc}); assuming it already exists.")

print(f"Committing prompt to variant '{VARIANT_SLUG}' ...")
try:
variant = ag.VariantManager.create(
parameters=PROMPT_CONFIG,
app_slug=APP_SLUG,
variant_slug=VARIANT_SLUG,
)
except Exception:
# The variant already exists: commit a new version instead.
variant = ag.VariantManager.commit(
parameters=PROMPT_CONFIG,
app_slug=APP_SLUG,
variant_slug=VARIANT_SLUG,
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle duplicate-resource paths explicitly and re-raise real failures.

Both create steps treat any exception as an “already exists” case. That can silently hide real failures (auth, connectivity, API/server errors) and still attempt production deploy.

Proposed fix
     print(f"Creating application '{APP_SLUG}' ...")
     try:
         ag.AppManager.create(app_slug=APP_SLUG, app_type="SERVICE:completion")
     except Exception as exc:  # noqa: BLE001 - app may already exist
-        print(f"  Application not created ({exc}); assuming it already exists.")
+        if "already exists" in str(exc).lower():
+            print(f"  Application already exists ({exc}).")
+        else:
+            raise

     print(f"Committing prompt to variant '{VARIANT_SLUG}' ...")
     try:
         variant = ag.VariantManager.create(
             parameters=PROMPT_CONFIG,
             app_slug=APP_SLUG,
             variant_slug=VARIANT_SLUG,
         )
-    except Exception:
+    except Exception as exc:
         # The variant already exists: commit a new version instead.
-        variant = ag.VariantManager.commit(
-            parameters=PROMPT_CONFIG,
-            app_slug=APP_SLUG,
-            variant_slug=VARIANT_SLUG,
-        )
+        if "already exists" in str(exc).lower():
+            variant = ag.VariantManager.commit(
+                parameters=PROMPT_CONFIG,
+                app_slug=APP_SLUG,
+                variant_slug=VARIANT_SLUG,
+            )
+        else:
+            raise
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
try:
ag.AppManager.create(app_slug=APP_SLUG, app_type="SERVICE:completion")
except Exception as exc: # noqa: BLE001 - app may already exist
print(f" Application not created ({exc}); assuming it already exists.")
print(f"Committing prompt to variant '{VARIANT_SLUG}' ...")
try:
variant = ag.VariantManager.create(
parameters=PROMPT_CONFIG,
app_slug=APP_SLUG,
variant_slug=VARIANT_SLUG,
)
except Exception:
# The variant already exists: commit a new version instead.
variant = ag.VariantManager.commit(
parameters=PROMPT_CONFIG,
app_slug=APP_SLUG,
variant_slug=VARIANT_SLUG,
)
try:
ag.AppManager.create(app_slug=APP_SLUG, app_type="SERVICE:completion")
except Exception as exc: # noqa: BLE001 - app may already exist
if "already exists" in str(exc).lower():
print(f" Application already exists ({exc}).")
else:
raise
print(f"Committing prompt to variant '{VARIANT_SLUG}' ...")
try:
variant = ag.VariantManager.create(
parameters=PROMPT_CONFIG,
app_slug=APP_SLUG,
variant_slug=VARIANT_SLUG,
)
except Exception as exc:
# The variant already exists: commit a new version instead.
if "already exists" in str(exc).lower():
variant = ag.VariantManager.commit(
parameters=PROMPT_CONFIG,
app_slug=APP_SLUG,
variant_slug=VARIANT_SLUG,
)
else:
raise

Comment on lines +34 to +37
PARQUET_URL = (
"https://huggingface.co/api/datasets/opensporks/resumes"
"/parquet/default/train/0.parquet"
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Pin/verify the source dataset artifact for true reproducibility.

The script currently downloads from a mutable source path. If upstream data changes, data/testset.csv can drift over time, which conflicts with the reproducible-build objective.

Proposed fix
 import argparse
 import asyncio
 import csv
+import hashlib
 import re
 import sys
 from pathlib import Path
@@
 PARQUET_URL = (
@@
 )
+EXPECTED_PARQUET_SHA256 = "<fill-with-known-good-sha256>"
@@
 def download_dataset() -> pd.DataFrame:
@@
         response = requests.get(PARQUET_URL, timeout=120)
         response.raise_for_status()
-        CACHE_PATH.write_bytes(response.content)
+        content = response.content
+        digest = hashlib.sha256(content).hexdigest()
+        if digest != EXPECTED_PARQUET_SHA256:
+            raise RuntimeError(
+                f"Dataset artifact hash mismatch: got {digest}, expected {EXPECTED_PARQUET_SHA256}"
+            )
+        CACHE_PATH.write_bytes(content)

Comment on lines +126 to +130
for resume_id, expected in CURATED_RESUMES.items():
matches = df[df["ID"] == resume_id]
if matches.empty:
print(f"warning: resume {resume_id} not found in dataset, skipping")
continue

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don’t silently skip curated IDs; fail fast on missing records.

Skipping missing curated resumes can silently shrink the testset and invalidate evaluation comparisons.

Proposed fix
 def build_testset(df: pd.DataFrame) -> list[dict]:
     rows = []
+    missing_ids = []
     for resume_id, expected in CURATED_RESUMES.items():
         matches = df[df["ID"] == resume_id]
         if matches.empty:
-            print(f"warning: resume {resume_id} not found in dataset, skipping")
+            missing_ids.append(resume_id)
             continue
@@
-    return rows
+    if missing_ids:
+        raise RuntimeError(
+            f"Curated resumes missing from source dataset: {missing_ids}"
+        )
+    return rows

…pt revision

Move all the AI logic out of the Streamlit app into a new screening.py
module (prompt fetch, the LLM call, tracing, feedback), leaving app.py as
a UI-only shell. Any other frontend can import screening.py unchanged.

Tracing improvements so screenings are easy to act on from the UI:

- Auto-instrument the OpenAI client with OpenInference, so every trace has
  a child LLM span with the exact messages, token counts, and cost.
- classify_cv takes its inputs as a dict whose keys match the prompt input
  variables ({"cv": ...}), and the prompt config is kept out of the trace
  (ignore_inputs). The span data then mirrors the completion app's inputs.
- Link each span to the deployed prompt revision via ag.tracing.store_refs,
  so traces filter by app/environment and open in the playground on the
  right revision with inputs pre-filled.

Also fix create_app.py to read variant.variant_version as an attribute
(VariantManager now returns a ConfigurationResponse, not a dict).
mmabrouk added 2 commits June 11, 2026 14:00
The walkthrough needed a leaner story: the output schema is now
tech_match / experience_match / overall_match, each with a short reason,
plus the missing-requirements list. overall_match is a holistic
hire-or-not judgment, so a requirement like a language can flip it while
the other two stay true. The test set drops the bookkeeping columns and
carries one expected_* column per dimension; empty cells are skipped by
the code evaluator documented in the Readme.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

example python Pull requests that update Python code size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants