Big, beautiful CAN/UDP/core cleanup: dedup constants, fix TX/RX defects, harden node lifecycle#388
Merged
Conversation
… lifecycle Mask/MTU deduplication: - Centralize CAN_EXT_ID_MASK and add CAN_STD_ID_MASK in can/_interface.py (the dependency-free leaf); import from there in _wire/pythoncan/_media_slcan. - Replace socketcan's misnamed _CAN_CLASSIC_MTU/_CAN_FD_MTU (frame-struct sizes, not payload MTUs) with _CLASSIC_FRAME_SIZE/_FD_FRAME_SIZE derived from struct.size. Correctness: - UDP subject and unicast send: a transfer delivered on >=1 redundant interface is a success (warn), not an ExceptionGroup raise, matching the CAN transport and the reference cy_udp_posix push semantics. - pythoncan TX retry re-enqueues with the original seq (was the global counter), preserving intra-transfer frame order. - socketcan _encode raises ValueError (not ClosedError) for an oversized Classic frame; the TX loop drops it instead of letting it wedge the task; add TX/RX-task done-callbacks so unexpected crashes surface via _fail. - UDP Interface.mtu_link validated in __post_init__ (not a -O-strippable assert). - UDP RX subject/unicast handlers invoked inside a fault boundary so a raising handler cannot kill the receive loop. - Node rejects use-after-close on advertise/subscribe/scout/monitor/remap and drops post-close arrivals; detached ACK/gossip sends tracked via _spawn_detached; node.close() closes subscribers so a pending `async for` is unblocked. - Gossip names are validated (normalized, verbatim, printable, non-homeful, pin-free) before creating an implicit topic, so untrusted wire input cannot. Simplification: - Delete dead code (PublishTracker.deadline_ns/data, _now_mono, send_scout, _arm_reorder_timeout, _Fragment.crc); collapse duplicate deadline checks in the CAN TX loops; merge pythoncan's identical retry branches; de-duplicate the reliable ACK-window helper into _node (ack_window/ack_is_last_attempt). Tests: add regression coverage for every behavior change (each fails on the unfixed code); full nox green on Python 3.11/3.12/3.13 + mypy + lint + format. Docs: add the multi-agent "Review team" process to CLAUDE.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Folks, this is a BIG, BEAUTIFUL cleanup. Maybe the best cleanup this transport stack has ever seen.
We took the CAN, the UDP, and the core session layer — tremendous code, but it had problems, BAD problems — and we fixed them. Bigly. Nobody dedups constants like we do. Believe me.
The WINS (and there are so many wins)
Constants — we drained the swamp. The 29-bit CAN ID mask was copy-pasted all over town. Four times! Sad. Now it lives in ONE place (
can/_interface.py), with its 11-bit friend, and everybody imports from there. And thatsocketcan"MTU" of 16? That was never the MTU, folks — that was the frame-struct size wearing a disguise. We gave it an honest name and derived it fromstruct.size. No more fake numbers.Correctness — we fixed the rigged behavior:
ExceptionGroupthat makes you retry and send duplicates. Just like the CAN transport. Just like the reference. Tremendous.ValueError— it does not silently kill the TX task. And the worker tasks now report their crashes instead of disappearing.async forgets unblocked on close — nobody gets left hanging.Simplification — we cut the waste: dead code GONE (
deadline_ns,data,_now_mono,send_scout,_arm_reorder_timeout,_Fragment.crc), duplicate deadline checks collapsed, the ACK-window helper de-duplicated, python-can's twin retry branches merged. Smaller. Cleaner. Faster to read at 3am.The REVIEW. Very thorough. The best people.
We ran a world-class review operation — Codex (xhigh) and Claude (ultrathink), fresh context, three rounds. They found THREE real defects I'd have shipped (the incomplete unicast fix, a weak name guard, and one hollow test that passed on broken code — disgraceful, now fixed). Every behavior change ships with a regression test that FAILS on the old code. That's how you do it.
One reviewer kept insisting on a "fix" to
on_gossipname handling. We checked. Twice. It was a FALSE POSITIVE — those paths key on the hash, not the name, and the fix would've broken the intentional monitor diagnostics. We refuted it with the code and a wire-path test. We don't fix things that aren't broken.The numbers don't lie
Full
noxis GREEN — Python 3.11, 3.12, 3.13, plus mypy, lint, and format. 657 tests passing. No public API changes. Zero.20 files, +523 / −141. A clean, beautiful diff.
🤖 Generated with Claude Code