Nightly eval regression detected (2026-05-02) #17

@github-actions

Regression Report: math-20

Baseline: 0.2.0-cerebras-llama3.1-8b
Current: ci-nightly
Status: REGRESSION DETECTED

Metrics

Metric              Baseline    Current     Change
accuracy            0.9000      0.8500      -5.6%
avg_latency         11.2776     3.6134      -68.0%
total_tokens        3194.0000   1967.0000   -38.4%
avg_tool_accuracy   0.0000      0.0000      N/A

Alerts

  • [WARNING] math-20/accuracy: 0.9000 -> 0.8500 (-5.6%, threshold: 5%)
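The Change column and the alert above are consistent with a relative change of (current - baseline) / baseline, with "N/A" reported when the baseline is zero. The sketch below reproduces those numbers under that assumption. The function names (`relative_change`, `format_change`, `is_regression`) and the `higher_is_better` flag are hypothetical and are not taken from the eval harness. The harness's actual alerting rule is not shown in this report.

```python
from typing import Optional


def relative_change(baseline: float, current: float) -> Optional[float]:
    """Return (current - baseline) / baseline, or None when baseline is zero."""
    if baseline == 0:
        return None
    return (current - baseline) / baseline


def format_change(change: Optional[float]) -> str:
    """Render a relative change as a percentage with one decimal, or 'N/A'."""
    return "N/A" if change is None else f"{change * 100:.1f}%"


def is_regression(
    baseline: float,
    current: float,
    threshold: float = 0.05,        # assumed to match the 5% threshold in the alert
    higher_is_better: bool = True,  # hypothetical flag; accuracy is higher-is-better
) -> bool:
    """Flag a metric whose move in the bad direction exceeds the threshold."""
    change = relative_change(baseline, current)
    if change is None:
        return False
    worsening = -change if higher_is_better else change
    return worsening > threshold


# Values copied from the Metrics table in this report.
metrics = {
    "accuracy": (0.9000, 0.8500),
    "avg_latency": (11.2776, 3.6134),
    "total_tokens": (3194.0, 1967.0),
    "avg_tool_accuracy": (0.0, 0.0),
}

for name, (baseline, current) in metrics.items():
    print(name, format_change(relative_change(baseline, current)))
```

Under these assumptions, accuracy is the only metric flagged: its relative drop of about 5.6% exceeds the 5% threshold, while latency and token count both moved downward, which is an improvement if lower is treated as better.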
