[Fix] discard failed and filtered rollout_state#1922
Conversation
|
@claude review |
|
Claude: ## Summary Discards FAILED and FILTERED rollout groups in ProduceBatchResult impact: RoutedExperts impact: Positive — fixes a gap where IssuesCritical
Warning
Nit
VerdictREQUEST_CHANGES |
|
@claude review |
| result.filtered_samples = status_counts.get(Status.FILTERED, 0) | ||
|
|
||
|
|
||
| def _merge_discarded_counts(result: ProduceBatchResult, progress: Any, task_name: str) -> None: |
There was a problem hiding this comment.
Claude: Nit — progress: Any bypasses mypy in a strict-checked path.
Both ProduceProgress and DisaggProduceProgress implement consume_discarded(task_name) -> tuple[int, int], but using Any here skips static verification. A lightweight Protocol or explicit union type would catch future signature drift:
| def _merge_discarded_counts(result: ProduceBatchResult, progress: Any, task_name: str) -> None: | |
| def _merge_discarded_counts(result: ProduceBatchResult, progress: ProduceProgress | DisaggProduceProgress, task_name: str) -> None: |
(Requires importing DisaggProduceProgress — alternatively a Protocol with the consume_discarded method.)
| failed_samples_state = state.get( | ||
| "failed_samples", | ||
| state.get("failed_groups", state.get("discarded_failed_groups", {})), | ||
| ) | ||
| filtered_samples_state = state.get( | ||
| "filtered_samples", | ||
| state.get("filtered_groups", state.get("discarded_filtered_groups", {})), | ||
| ) |
There was a problem hiding this comment.
Claude: Nit — fallback keys "failed_groups" / "discarded_failed_groups" don't appear in any existing checkpoint format.
If these were intermediate names during development that never shipped, the nested .get() chains are dead code. Simplifying to just state.get("failed_samples", {}) makes the contract clearer and avoids future confusion about which key names are "supported."
|
Claude: ## Updated Review (all 3 commits) The previous review flagged critical issues in commit 1 (in-place mutation corrupting group status,
ProduceBatchResult impact: RoutedExperts impact: Positive — Remaining IssuesNit
VerdictAPPROVE — the memory fix is correct and well-tested. Only cosmetic nits remain. |
修复点
修复后内存曲线