Regression Report: math-20
Baseline: 0.2.0-cerebras-llama3.1-8b
Current: ci-monitor
Status: REGRESSION DETECTED
Metrics
| Metric | Baseline | Current | Change |
|---|---|---|---|
| accuracy | 0.9000 | 0.8500 | -5.6% |
| avg_latency | 11.2776 | 3.3946 | -69.9% |
| total_tokens | 3194.0000 | 1966.0000 | -38.4% |
| avg_tool_accuracy | 0.0000 | 0.0000 | N/A |
Alerts
- [WARNING] math-20/accuracy: 0.9000 -> 0.8500 (-5.6%, threshold: 5%)
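
The Change values and the alert above are consistent with a relative change measured against the baseline, compared to a percentage threshold. The following is a minimal sketch of that check, not the tool's actual implementation: the function names `relative_change` and `is_regression`, the `higher_is_better` flag, and the strict `>` comparison are all assumptions made for illustration.

```python
from typing import Optional


def relative_change(baseline: float, current: float) -> Optional[float]:
    """Percent change from baseline to current; None when baseline is 0 (shown as N/A)."""
    if baseline == 0:
        return None
    return (current - baseline) / baseline * 100.0


def is_regression(
    baseline: float,
    current: float,
    threshold_pct: float = 5.0,      # assumed to match the "threshold: 5%" in the alert
    higher_is_better: bool = True,   # hypothetical flag: accuracy vs. latency/tokens
) -> bool:
    """Flag a metric when it moves in the bad direction by more than threshold_pct."""
    change = relative_change(baseline, current)
    if change is None:
        return False
    # A drop is bad for higher-is-better metrics; a rise is bad otherwise.
    worsening = -change if higher_is_better else change
    return worsening > threshold_pct


if __name__ == "__main__":
    print(round(relative_change(0.9000, 0.8500), 1))
    print(is_regression(0.9000, 0.8500))
    print(is_regression(11.2776, 3.3946, higher_is_better=False))
```

Under these assumptions, the accuracy drop from 0.9000 to 0.8500 exceeds the 5% threshold, while the decreases in avg_latency and total_tokens count as improvements and raise no alert.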