tests: --full-matrix runs N-adapter cross-driver interop tables #34
Merged
Extends tests/regress.py with a --full-matrix mode that iterates every
ordered (TX, RX) pair of plugged-in DUTs across all four driver-side
combinations (kernel-only, devourer-TX/kernel-RX, kernel-TX/devourer-RX,
devourer-only) and emits one NxN table per mode instead of one 4-cell
table for a single pair. Useful for catching cross-chipset interop
regressions in PRs that touch shared HAL code.
Usage:
    sudo python3 tests/regress.py --full-matrix --channel 100 \
        --vm-name devourer-testrig --vm-ssh <user>@<VM-IP>
For N adapters this runs N*(N-1)*4 cells in total. With N=3 that is 24
cells, which at ~30-40s per cell in VM mode takes roughly 12-16 min, so
still manageable. The diagonal is blanked (the same physical adapter
can't simultaneously TX and RX with one driver). The script reuses
run_cell as-is; the additions are the outer pair loop, a result dict
keyed by (tx_side, rx_side, tx_vidpid, rx_vidpid), and a new
emit_full_markdown that renders four NxN tables.
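
A minimal sketch of the outer pair loop and the NxN rendering described above, not the actual code in tests/regress.py. The names `run_full_matrix` and `render_table` are hypothetical, and `run_cell` is stubbed with an assumed signature that returns a hit count; the real one may differ.

```python
from itertools import permutations

# The four driver-side combinations, as (tx_side, rx_side).
MODES = [
    ("kernel", "kernel"),
    ("devourer", "kernel"),
    ("kernel", "devourer"),
    ("devourer", "devourer"),
]


def run_full_matrix(adapters, run_cell):
    """Run every ordered (TX, RX) pair in every mode.

    `run_cell` is a stand-in callable (tx_side, rx_side, tx, rx) -> hits;
    this signature is an assumption made for the sketch.
    """
    results = {}
    for tx_side, rx_side in MODES:
        # permutations(..., 2) yields ordered pairs and skips the diagonal.
        for tx, rx in permutations(adapters, 2):
            results[(tx_side, rx_side, tx, rx)] = run_cell(tx_side, rx_side, tx, rx)
    return results


def render_table(results, adapters, tx_side, rx_side):
    """Render one NxN markdown table; rows are TX, columns are RX."""
    lines = [
        "| TX \\ RX | " + " | ".join(adapters) + " |",
        "|---|" + "---|" * len(adapters),
    ]
    for tx in adapters:
        cells = [
            "" if tx == rx else str(results[(tx_side, rx_side, tx, rx)])
            for rx in adapters
        ]
        lines.append(f"| {tx} | " + " | ".join(cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    duts = ["0bda:8812", "0bda:8813", "2357:0120"]
    # Placeholder cell runner: touches no hardware, always reports 0 hits.
    res = run_full_matrix(duts, lambda tx_side, rx_side, tx, rx: 0)
    print(len(res))  # N*(N-1)*4 cells for N=3
    print(render_table(res, duts, "kernel", "kernel"))
```

Keying the dict by the full (tx_side, rx_side, tx_vidpid, rx_vidpid) tuple lets each of the four tables be rendered from the same result set without regrouping.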
Also scrubs personal identifiers from earlier docs/scripts (PR #33):
- tests/setup_vm.sh now reads VM_USER from $SUDO_USER / $USER instead
of hardcoding a specific username
- tests/README.md + regress.py docstrings switch to <user>@<VM-IP>
placeholders in example commands
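
For illustration only, the fallback order implied by `$SUDO_USER / $USER` can be sketched in Python. The actual change lives in the shell script tests/setup_vm.sh; the function name `resolve_vm_user` is hypothetical and the exact order the script uses is assumed from the wording above.

```python
import os


def resolve_vm_user(env=None):
    """Pick the invoking user: SUDO_USER first, then USER.

    Under sudo, USER is typically "root" while SUDO_USER holds the
    account that ran sudo, so SUDO_USER has to take precedence.
    """
    env = os.environ if env is None else env
    user = env.get("SUDO_USER") or env.get("USER")
    if not user:
        raise RuntimeError("cannot determine VM user; set SUDO_USER or USER")
    return user
```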
Validation on a 3-adapter rig (Ubuntu 22.04 VM with aircrack-ng/88XXau,
0bda:8812 + 0bda:8813 + 2357:0120, channel 100, 10s/cell):
## Kernel-only (rig sanity)
All 6 cross-chipset cells pass — 88XXau handles all three chipsets
cleanly in the pinned-kernel VM (88-271 hits per cell).
## devourer-TX → kernel-RX
devourer-TX confirmed for 8812 (4114, 4693 hits) AND 8821 (4341 hits
reaching 8814 kernel RX). 8814 TX flaky after passthrough cycles
(chip-state degradation across cell sequencing — known sensitive).
## kernel-TX → devourer-RX
Surprise — devourer-RX 8821 caught 200 frames from kernel-TX 8814,
contradicting PR #30's "RX silent" finding. devourer-RX 8812
confirmed (100 hits from each of 8814, 8821 TX). devourer-RX 8814
confirmed broken (0 hits all directions — known TODO).
## devourer ↔ devourer
All 0 — every cell hits at least one broken side (8814 RX or 8814 TX
degraded mid-run).
Net new product signal from the full matrix:
- devourer-TX 8821 actually works (was unvalidated since PR #30 had no
peer sniffer in that session — VM mode is the peer)
- devourer-RX 8821 works under at least one TX condition — reopen PR
#30's "RX silent" conclusion
- 8814 chip state degrades through repeated host↔VM passthrough — needs
investigation, may want a chip reset between cells
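
The last point could be explored with something like the sketch below, which resets both adapters before each cell. This is untested against the rig: it assumes the usbutils `usbreset` tool is installed and accepts a `vid:pid` argument, that it is run on the side where the device is currently attached, and that `run_cell` has the stand-in signature shown. `usbreset_argv` and `run_cells_with_reset` are hypothetical names, and whether a reset actually cures the 8814 degradation is an open question.

```python
import subprocess


def usbreset_argv(vidpid):
    """Build the argv for `usbreset` from a 'vvvv:pppp' hex string."""
    vid, _, pid = vidpid.partition(":")
    if len(vid) != 4 or len(pid) != 4:
        raise ValueError(f"expected vvvv:pppp, got {vidpid!r}")
    int(vid, 16)  # raises ValueError on non-hex input
    int(pid, 16)
    return ["usbreset", f"{vid}:{pid}"]


def run_cells_with_reset(cells, run_cell, reset=None):
    """Run cells in order, resetting TX and RX adapters before each one.

    `cells` holds (tx_side, rx_side, tx_vidpid, rx_vidpid) tuples.
    `reset` can be injected for dry runs; by default it shells out.
    """
    if reset is None:
        def reset(vidpid):
            subprocess.run(usbreset_argv(vidpid), check=True)
    results = {}
    for cell in cells:
        tx_side, rx_side, tx, rx = cell
        reset(tx)
        reset(rx)
        results[cell] = run_cell(tx_side, rx_side, tx, rx)
    return results
```

Injecting `reset` keeps the sequencing testable without hardware, for example by passing `calls.append` to record which adapters would be reset.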
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 23, 2026
josephnef added a commit that referenced this pull request on May 23, 2026:
… RX asymmetry) (#37)

## Status: partial (register-state parity with kernel, but doesn't fix #30's 8812-peer asymmetry)

Brings devourer's `CHIP_8821` init register-write set into parity with `aircrack-ng/88XXau`'s post-fwdl monitor-mode bring-up, based on a usbmon-trace diff. Adds 13 missing post-fwdl writes, 17 BB/AGC value overrides, MAC-address programming, and corrects an earlier wrong guess at REG_USB_HRPWM.

**This does NOT fix the asymmetry** (`devourer-RX 8821` catches kernel-TX 8814 frames but drops kernel-TX 8812 frames at 0/280) that #34 surfaced. The cheap "what's different at the register level" patches don't cross the threshold. Filing anyway because it's honest progress and brings the chip's baseline state into kernel parity, which the next fix attempt can build on.

## What this changes

| Where | Change |
|---|---|
| `HalModule.cpp::rtl8812au_hal_init` | REG_USB_HRPWM for CHIP_8821: 0x84 → 0x00 (kernel writes 0; my earlier "leave LPS wake" guess was wrong) |
| `HalModule.cpp::rtl8812au_hal_init` | CHIP_8821: program MAC address to REG_MACID (0x0610-0x0615). Kernel always does this even in monitor; some chip RX-path logic gates on MACID being non-zero. |
| `HalModule.cpp::rtl8812au_hal_init` | CHIP_8821: 13 trace-derived post-fwdl writes (REG_AUTO_LLT region, REG_TX_PTCL_CTRL, NAV-related, 8821 BB high-addr block 0x1874-0x187f) |
| `HalModule.cpp::rtl8812au_hal_init` | CHIP_8821: 17 BB/AGC value overrides (0x0830 PWED_TH, 0x0c20-0x0c44 AGC table, 0x0e90 TX power region) forced to kernel's observed runtime values |
| `tests/inject_beacon.py` | `--rate` CLI arg (added during rate-decode hypothesis testing; useful diagnostic knob) |

8812AU + 8814AU paths untouched.

## Why the RX gap probably remains

Aircrack-ng runs **phydm**, a runtime feedback loop that continually adjusts AGC / RX-gain based on observed signal quality. This PR sets the chip's *static* values to match kernel's post-init snapshot, but the chip needs the active loop to *maintain* them during RX. Without phydm running, the chip drifts back toward defaults that work for higher-SNR peers (8814) but not for 8812.

Three follow-up paths to actually fix the asymmetry:

1. **Port phydm runtime** for 8821 (substantial: most of `aircrack-ng/hal/phydm/rtl8821a/`). Days of work, cleanest fix.
2. **Loop-replay**: capture kernel's full register sequence over a 5-second RX window (not just init) and replay periodically from devourer. Hacky but cheap.
3. **Accept devourer-RX 8821 as "works for some peers, not others"** until phydm is in. Document the limit in the adapter inventory.

## usbmon methodology (for future trace-diff work)

Capture kernel side (in VM with `aircrack-ng/88XXau`):

```bash
# On VM:
sudo modprobe usbmon
sudo bash -c "cat /sys/kernel/debug/usb/usbmon/0u > /tmp/trace_kernel.txt &"
# Then attach DUT via virsh, bring up monitor mode, stop usbmon
```

Capture devourer side (on host):

```bash
sudo bash -c "timeout 20 cat /sys/kernel/debug/usb/usbmon/Nu > /tmp/trace_dvr.txt" &
sudo ./build/WiFiDriverDemo ...
```

Filter Realtek control writes:

```bash
grep -E "S Co:.* 40 05 " trace.txt | awk '{print $8, $13}'
```

**Important encoding gotcha:** usbmon shows wire bytes in transmission order, which for a u32 is its little-endian byte sequence. To write the same value via `rtw_write32` on a LE host, read the four trace bytes as a little-endian u32: the trace text `82824001` represents u32 `0x01408282` (NOT `0x82824001`). The first attempt at the post-fwdl writes in this PR used wrong-endian values and naturally had no effect; the fix was to reverse the byte order of each value (hex pair by hex pair).
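
The filter and the endianness rule above can be combined in a small parser. This is a sketch under stated assumptions: it expects usbmon's text format with whitespace-separated fields in the positions the `awk` command uses, it only handles 4-byte writes, and the function names are hypothetical. The sample line is synthetic (tag, timestamp, bus/device and register address are made up); only the data word `82824001` comes from the text.

```python
def usbmon_word_to_u32(hex_text):
    """Convert wire-order hex from a usbmon trace to the u32 a
    little-endian host would pass to a 32-bit register write."""
    raw = bytes.fromhex(hex_text)
    if len(raw) != 4:
        raise ValueError("expected exactly 4 bytes")
    return int.from_bytes(raw, "little")


def parse_realtek_write(line):
    """Return (register, u32 value) for a 4-byte vendor control write.

    Assumed field layout of a control submission line:
    <tag> <ts> S Co:<bus>:<dev>:<ep> s <bmRequestType> <bRequest>
    <wValue> <wIndex> <wLength> <len> = <data>
    Returns None for any line that does not match.
    """
    f = line.split()
    if len(f) < 13 or f[2] != "S" or not f[3].startswith("Co:"):
        return None
    if f[4] != "s" or f[5:7] != ["40", "05"] or f[11] != "=":
        return None
    data = "".join(f[12:])
    if len(data) != 8:
        return None  # only 32-bit writes are handled in this sketch
    return int(f[7], 16), usbmon_word_to_u32(data)


if __name__ == "__main__":
    # Synthetic sample line, for illustration only.
    sample = "ffff8880a1b2c3d0 1234567890 S Co:1:004:0 s 40 05 0c20 0000 0004 4 = 82824001"
    reg, value = parse_realtek_write(sample)
    print(f"0x{reg:04x} 0x{value:08x}")  # 0x0c20 0x01408282
```

Doing the conversion in one place avoids hand-reversing hex strings, which is where the wrong-endian values crept in.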
## Validation

Single-pair test on the trainer rig (Arch host, Ubuntu 22.04 VM with aircrack-ng/88XXau, channel 100, 10s):

| Cell | Hits | Note |
|---|---|---|
| kernel-TX 8812 → kernel-RX 8821 (baseline) | 283 ✓ | rig sane |
| devourer-TX 8821 → kernel-RX 8814 | 4663 ✓ | dvr-TX 8821 fine |
| kernel-TX 8812 → devourer-RX 8821 | **0 ✗** | the asymmetry, unresolved |
| devourer-TX 8821 → devourer-RX 8812 | **0 ✗** | same |

Other chips untouched.

## Test plan

- [x] Builds clean
- [x] CHIP_8812 + CHIP_8814A paths unchanged
- [x] CHIP_8821 init now writes the kernel-equivalent register set
- [x] devourer-TX 8821 still works → kernel-RX 8814 (4663 hits)
- [ ] dvr-RX 8821 ↔ 8812: known *not* fixed by this PR
- [ ] phydm runtime ported (follow-up)

Refs #30 (initial 8821 port), #34 (full matrix that surfaced the asymmetry).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
## What this is

Extends `tests/regress.py` with `--full-matrix`: iterates every ordered (TX, RX) pair of plugged-in DUTs across all 4 driver-side combinations (kernel-only / dvr-TX×kernel-RX / kernel-TX×dvr-RX / devourer-only) and emits one NxN table per mode. Useful for catching cross-chipset interop regressions in PRs that touch shared HAL code.

    sudo python3 tests/regress.py --full-matrix --channel 100 \
        --vm-name devourer-testrig --vm-ssh <user>@<VM-IP>

For N adapters → N×(N-1)×4 cells. ~16 min for N=3 in VM mode. Diagonal blanked (same physical adapter can't simultaneously TX and RX with one driver).

Also scrubs hardcoded usernames from earlier docs/scripts (`tests/setup_vm.sh` now reads `VM_USER` from `$SUDO_USER` / `$USER` instead of hardcoding; READMEs + docstrings use `<user>@<VM-IP>`).

## Validation on a 3-adapter rig
Ubuntu 22.04 VM with aircrack-ng/88XXau, DUTs `0bda:8812 + 0bda:8813 + 2357:0120`, channel 100, 10s/cell:

### Kernel-only (rig sanity / cross-chipset kernel interop)

6/6 cells pass. `88XXau` in the pinned-kernel VM gives full cross-chipset kernel-side interop. Rig is sound.

### devourer-TX → kernel-RX (does devourer emit valid frames?)

### kernel-TX → devourer-RX (does devourer RX a known-good frame?)

### devourer ↔ devourer (end-to-end devourer)

All 0. Every cell hits at least one broken side (8814 RX broken or 8814 TX degraded mid-run).
## Net new product signal

## What this PR doesn't touch

- `--full-matrix` extension doesn't change any existing code path; the single-pair mode (`--tx-pid` / `--rx-pid`) still works exactly as before. Pure addition.
- … `usbreset` between cells)

## Test plan

- `--full-matrix` runs end-to-end against 3 adapters in VM mode

🤖 Generated with Claude Code