(ignore) feat(n-vm): in-VM test infrastructure (testn absorption + e1000) #1583

Draft
daniel-noland wants to merge 38 commits into pr/daniel-noland/acl-pat-design from pr/daniel-noland/n-vm-again

@daniel-noland

Collaborator

Absorb the n-vm, n-it, n-vm-macros, and n-vm-protocol crates from the external testn repository into the workspace and add the supporting infrastructure for running tests inside a QEMU/cloud-hypervisor guest.

This is a squashed revival of the n-vm-again branch, rebased onto the current ACL/dpdk line. The branch had forked before the ACL module existed; its stale fork of dpdk/, dpdk-sys/, and dpdk/src/acl/* has been dropped in favour of the authoritative versions on the base, keeping only the orthogonal VM test infrastructure.

Contents:

  • n-vm / n-vm-protocol: QEMU and cloud-hypervisor backends, dynamic vsock allocation, hugepage and NIC-model configuration, scratch-only container mode with nix-store bind mounts.
  • n-vm-macros: the #[in_vm] attribute and companion attributes (#[network], guest/hypervisor config), compile-fail trybuild tests.
  • n-it: in-guest init system (PID 1) with mount-table-driven teardown.
  • nix: linux-fancy kernel built from config fragments (VFIO, IOMMU, virtio, e1000/e1000e), testroot/vmroot derivations, merge-config.nix.
  • hardware / dpdk-sys: e1000/e1000e NIC binding and rte_net_e1000 PMD linkage; driver() returns Ok(None) for unbound devices.
  • mgmt: re-enable test_sample_config under #[in_vm].
  • justfile: export N_VM_TEST_ROOT/N_VM_VM_ROOT for #[in_vm] tests.

The hardware/tests/dpdk_in_vm.rs integration test (virtio-net/e1000/e1000e) is intentionally NOT carried forward: it was written against the pre-rewrite dpdk device API (StartedDev queue handles, RxOffloadConfig, public Headers fields) and needs a port to the current API. It is preserved on the backup/n-vm-again branch for a clean revival.

daniel-noland and others added 21 commits June 3, 2026 13:38
Add a `dataplane-lifecycle` crate with `Shutdown` and `Subsystem`
primitives, signal-handler installation, and a process-wide shutdown
watchdog.

`Shutdown` bundles a root `CancellationToken` and one `Subsystem` per
long-lived component (workers, router, mgmt, metrics). Each subsystem
exposes a per-subsystem cancel token, a `TaskTracker`, and a shared
fatal flag. Subsystems drain in topological order with per-subsystem
deadlines; the detached watchdog enforces an absolute upper bound on
total shutdown duration.

No consumers yet -- wired up in follow-on commits.
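
A minimal std-only sketch of the shape described above, assuming plain atomic flags in place of `CancellationToken` and omitting the `TaskTracker`, per-subsystem deadlines, and the watchdog. The names `CancelFlag` and `drain`, and every signature here, are hypothetical and are not the crate's actual API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Stand-in for a cancellation token: a shared, clonable flag.
#[derive(Clone, Default)]
pub struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// One long-lived component with its own cancel flag.
pub struct Subsystem {
    pub name: &'static str,
    pub cancel: CancelFlag,
    fatal: Arc<AtomicBool>, // shared across all subsystems
    root: CancelFlag,       // shared root flag
}

impl Subsystem {
    /// A fatal report flips the shared flag and cancels the root.
    pub fn report_fatal(&self) {
        self.fatal.store(true, Ordering::SeqCst);
        self.root.cancel();
    }
}

pub struct Shutdown {
    pub root: CancelFlag,
    fatal: Arc<AtomicBool>,
    pub subsystems: Vec<Subsystem>, // stored in drain order
}

impl Shutdown {
    pub fn new(names_in_drain_order: &[&'static str]) -> Self {
        let root = CancelFlag::default();
        let fatal = Arc::new(AtomicBool::new(false));
        let subsystems = names_in_drain_order
            .iter()
            .map(|&name| Subsystem {
                name,
                cancel: CancelFlag::default(),
                fatal: Arc::clone(&fatal),
                root: root.clone(),
            })
            .collect();
        Shutdown { root, fatal, subsystems }
    }

    pub fn is_fatal(&self) -> bool {
        self.fatal.load(Ordering::SeqCst)
    }

    /// Cancel subsystems one at a time, in stored order, and return that order.
    pub fn drain(&self) -> Vec<&'static str> {
        self.subsystems
            .iter()
            .map(|s| {
                s.cancel.cancel();
                s.name
            })
            .collect()
    }
}
```

In this sketch a fatal report from any subsystem cancels the root, and the drain order is whatever order the caller passed to `new`.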

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Plumb lifecycle Subsystems into the routing crate as the first step of
the threading rewrite. The rest of `main`'s shutdown signaling (ctrlc +
mpsc<i32> + start_mgmt + MetricsServer + old DriverKernel::start) stays
in place; follow-on commits migrate each.

- `Router::new` takes `(mgmt, mgmt_handle, router)` Subsystems +
  runtime handle. Plumbed through `packet_processor::start_router`.
- `start_rio` takes `&Subsystem`; the IO loop observes its cancel
  between poll cycles (worst-case exit latency = 1s poll). Adds an
  ExitGuard so panic-unwind or unexpected loop exit reports fatal.
- `RouterCtlMsg::Finish` removed; `RioHandle::finish` becomes idempotent.
- `bmp::spawn_background` spawns onto the caller-provided runtime handle
  tracked under `mgmt`; no more leaked runtime.
- `runtime.rs` builds a multi-thread mgmt runtime (BMP is its only
  tenant in this commit) and a `Shutdown` for plumbing into Router::new.
- `dataplane` and `mgmt` Cargo.toml gain `lifecycle` dep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Move mgmt + metrics from dedicated OS threads with private runtimes to
tenants of the multi-thread mgmt runtime introduced in the prior commit.
The kernel driver still uses the legacy ctrlc + mpsc<i32> path; the
final unification commit migrates that.

- `run_mgmt` (renamed from `start_mgmt`): synchronous init on the
  caller-provided handle, then spawns the three long-lived tasks (config
  processor, status updater, config watcher) via
  `Subsystem::spawn_fatal_on_exit` so their unexpected exit flips fatal.
  Init observes `mgmt.root_token()` so SIGINT during k8s retries returns
  `LaunchError::Cancelled` within cancel latency.
- `LaunchError::Cancelled` is a clean-shutdown signal; the call site in
  `runtime.rs` forwards the existing mpsc stop channel with code 0 so
  the legacy shutdown path stays consistent.
- `spawn_metrics` replaces `MetricsServer`: HTTP endpoint, upkeep ticker,
  and stats collector all spawn onto `mgmt_handle` tracked under
  `metrics`. Uses plain `spawn_on` (not `spawn_fatal_on_exit`) — a dead
  metrics endpoint should not take down the dataplane.
- Drop stale `LaunchError` variants no longer constructed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Replace the last legacy shutdown signaling (ctrlc handler + mpsc<i32>
exit code + dedicated controller thread) with the single
`lifecycle::Shutdown` path, and migrate the kernel driver to scoped
threads with cancellation observation. After this commit there is one
signaling path: SIGINT/SIGTERM -> shutdown.root, or any subsystem's
report_fatal -> shutdown.root, with the watchdog as the absolute
upper bound.

- `main`: install `spawn_signal_handler` and `spawn_shutdown_watchdog`,
  run everything inside `concurrency::thread::scope`, block on
  `root.cancelled()`, then drain subsystems in canonical order
  (workers -> router -> metrics -> mgmt). Exit code from `is_fatal()`.
- `DriverKernel::start`: takes `&Scope` and `&Subsystem`; workers spawn
  via `spawn_scoped` with an `ExitGuard` Drop pattern that reports
  fatal on panic-unwind, early `?`-return, and unexpected normal exit.
  Reader loops observe cancel between reads. Supervisor joins-and-logs.
- Drop `dataplane/src/drivers/tokio_util.rs` and its
  `run_in_local_tokio_runtime` helper (inlined where needed).
- Drop `ctrlc` and `mio` from `dataplane` dependencies; drop `ctrlc`
  from the workspace.
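
The `ExitGuard` Drop pattern mentioned above can be sketched as follows. This is an illustration under assumptions, not the code in this commit: the `disarm` method, the `worker` function, and the bare `AtomicBool` flags are hypothetical stand-ins for the real `Subsystem` plumbing.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Reports fatal on drop unless explicitly disarmed.
pub struct ExitGuard {
    fatal: Arc<AtomicBool>,
    armed: bool,
}

impl ExitGuard {
    pub fn new(fatal: Arc<AtomicBool>) -> Self {
        ExitGuard { fatal, armed: true }
    }

    /// Call only on the sanctioned exit path (cancellation observed).
    pub fn disarm(mut self) {
        self.armed = false;
        // `self` is dropped here with `armed == false`, so nothing is reported.
    }
}

impl Drop for ExitGuard {
    fn drop(&mut self) {
        // Runs on panic-unwind, early `?`-return, and plain fall-through alike.
        if self.armed {
            self.fatal.store(true, Ordering::SeqCst);
        }
    }
}

/// Hypothetical worker body: `fail` simulates an early error return.
pub fn worker(fatal: Arc<AtomicBool>, cancel: &AtomicBool, fail: bool) -> Result<(), String> {
    let guard = ExitGuard::new(fatal);
    if fail {
        return Err("read failed".to_string()); // guard drops armed: fatal
    }
    // Observe cancellation between units of work.
    while !cancel.load(Ordering::SeqCst) {
        std::thread::yield_now();
    }
    guard.disarm(); // clean exit: not fatal
    Ok(())
}
```

The point of the pattern is that the fatal report is the default and the clean exit is the one path that has to opt out.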

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: Daniel Noland <daniel@githedgehog.com>
Lets callers derive the rte_acl input-buffer size requirement from a
const FIELD_DEFS array at compile time -- useful for feeding into a
const generic (e.g. the STRIDE on a DPDK-backed Lookup type alias)
rather than rediscovering it at runtime.

The runtime path through AclBuildConfig::new still calls into the
same helper and caches the result; only the surface broadens from
fn to const fn (with the for-loop rewritten as an index-based while
loop, since `for` is not usable in a const fn).  Existing
FieldExtentOverflow validation is preserved; in a const context
overflow becomes a compile error instead of UB.

Adds one test exercising the const path: const MIN_INPUT_SIZE =
AclBuildConfig::compute_min_input_size(&DEFS); plus the assertion
that the runtime path returns the same value.
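
A sketch of the const path, assuming the requirement is the largest `offset + size` over all fields. The `FieldDef` shape, the free function, and the formula are assumptions made for illustration; the real helper lives on `AclBuildConfig` and may compute something different.

```rust
/// Hypothetical field definition: byte offset and byte width.
#[derive(Clone, Copy)]
pub struct FieldDef {
    pub offset: u32,
    pub size: u8,
}

#[derive(Debug, PartialEq, Eq)]
pub struct FieldExtentOverflow;

/// Largest `offset + size` over all fields, usable in const contexts.
pub const fn compute_min_input_size(defs: &[FieldDef]) -> Result<u32, FieldExtentOverflow> {
    let mut max = 0u32;
    let mut i = 0;
    // `for` is not available in const fn, so walk the slice by index.
    while i < defs.len() {
        let end = match defs[i].offset.checked_add(defs[i].size as u32) {
            Some(end) => end,
            None => return Err(FieldExtentOverflow),
        };
        if end > max {
            max = end;
        }
        i += 1;
    }
    Ok(max)
}

pub const DEFS: [FieldDef; 3] = [
    FieldDef { offset: 0, size: 1 },
    FieldDef { offset: 4, size: 4 },
    FieldDef { offset: 8, size: 4 },
];

/// Evaluated at compile time; an overflow would fail the build here.
pub const MIN_INPUT_SIZE: u32 = match compute_min_input_size(&DEFS) {
    Ok(n) => n,
    Err(_) => panic!("field extent overflow"),
};

/// The constant can feed a const generic or an array length.
pub type InputBuf = [u8; MIN_INPUT_SIZE as usize];
```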

just fmt; cargo check -p dataplane-dpdk --all-targets passes.
Introduces shared EAL test scaffolding for downstream crates that
need to exercise rte_acl, mempools, or any other DPDK runtime under
`#[test]`.  Two pieces, both behind the new dpdk `test` feature so
production builds remain unchanged:

- dpdk-test-macros: proc-macro crate exposing #[with_eal], which
  injects `let _eal = <dpdk_crate>::test_support::start_eal();` at
  the top of a #[test] function.  Resolves the dpdk crate's path via
  proc-macro-crate so the macro works in-tree (`crate`), under the
  workspace alias (`::dpdk`), or with the canonical name
  (`::dataplane_dpdk`).
- dpdk::test_support: hosts a shared OnceLock<Eal> initialized with
  --no-huge / --no-pci / --in-memory and the cpu-affinity-aware
  --lcores derivation that was previously inline in acl/mod.rs.
  Once-per-process by construction; safe under nextest's per-test
  forking and single-process runners alike.
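
The once-per-process property can be sketched with a plain `OnceLock`. Here `Eal` is a hypothetical stand-in that only records its arguments, `INIT_COUNT` exists only to make the single initialization observable, and the `--lcores` derivation is left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Stand-in for the real EAL handle; it only records the arguments used.
pub struct Eal {
    pub args: Vec<String>,
}

static EAL: OnceLock<Eal> = OnceLock::new();
/// Counts how many times the initializer body actually ran.
pub static INIT_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Every caller gets the same instance; the closure runs at most once.
pub fn start_eal() -> &'static Eal {
    EAL.get_or_init(|| {
        INIT_COUNT.fetch_add(1, Ordering::SeqCst);
        Eal {
            args: ["--no-huge", "--no-pci", "--in-memory"]
                .iter()
                .map(|s| s.to_string())
                .collect(),
        }
    })
}
```

Because `get_or_init` blocks concurrent callers until the first initializer finishes, tests running on several threads in one process still share a single instance.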

dpdk/src/acl/mod.rs picks up the new macro: every test in the module
loses its `let _eal = start_eal();` prologue and gains a #[with_eal]
attribute above #[test].  The inline start_eal helper and the
module-scoped OnceLock<Eal> static go away.

dpdk/Cargo.toml grows the `test` feature (turns on dpdk-test-macros
plus the id and nix runtime helpers that test_support needs) and a
self-referencing dev-dep `dataplane-dpdk = { path = ".", features =
["test"] }` so dpdk's own tests see the macro surface.

just fmt; cargo check --workspace --all-targets passes.
Introduces a no_std FixedSize trait carrying a known byte width plus
a big-endian write.  Lives in its own crate so value-providing
crates (`net`) and downstream key-packing frameworks (a future
`match-action` PR) can both depend on it without depending on each
other -- `net` implements FixedSize on its own newtypes without an
orphan-rule violation, and the framework doesn't need to know about
any particular value supplier.

fixed-size/src/lib.rs:

- pub trait FixedSize: Copy, with const SIZE: usize and fn
  write_be(&self, out: &mut [u8]).
- Blanket impls for the network-order primitives the framework cares
  about: u8 / u16 / u32 / u64 / u128 / Ipv4Addr / Ipv6Addr.

net/src/fixed_size.rs is the consumer bridge: FixedSize impls for
TcpPort, UdpPort, UnicastIpv4Addr, and Vni.  Each delegates write_be
to the underlying primitive's impl, so wire bytes match what writing
the raw integer would produce.  Vni is a 24-bit value written into a
4-byte field (high byte zero) because backends model fields in 1/2/4
widths, not 3.
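
A sketch of the trait and the Vni case, following the signature given above. The `Vni::new` constructor, its validity check, and the `pack2` helper are hypothetical; the real `net` newtypes carry their own invariants.

```rust
pub trait FixedSize: Copy {
    /// Width of the value on the wire, in bytes.
    const SIZE: usize;
    /// Write the value big-endian into the first `SIZE` bytes of `out`.
    fn write_be(&self, out: &mut [u8]);
}

impl FixedSize for u16 {
    const SIZE: usize = 2;
    fn write_be(&self, out: &mut [u8]) {
        out[..Self::SIZE].copy_from_slice(&self.to_be_bytes());
    }
}

impl FixedSize for u32 {
    const SIZE: usize = 4;
    fn write_be(&self, out: &mut [u8]) {
        out[..Self::SIZE].copy_from_slice(&self.to_be_bytes());
    }
}

/// Hypothetical 24-bit identifier stored in a u32.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Vni(u32);

impl Vni {
    pub fn new(raw: u32) -> Option<Self> {
        if raw < (1 << 24) { Some(Vni(raw)) } else { None }
    }
}

impl FixedSize for Vni {
    // Packed as a 4-byte field; the high byte is always zero.
    const SIZE: usize = <u32 as FixedSize>::SIZE;
    fn write_be(&self, out: &mut [u8]) {
        self.0.write_be(out);
    }
}

/// Pack two fields back to back and return the number of bytes written.
pub fn pack2<A: FixedSize, B: FixedSize>(a: A, b: B, out: &mut [u8]) -> usize {
    a.write_be(&mut out[..A::SIZE]);
    b.write_be(&mut out[A::SIZE..A::SIZE + B::SIZE]);
    A::SIZE + B::SIZE
}
```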

net/Cargo.toml grows the fixed-size workspace dep; net/src/lib.rs
adds the module behind no cfg gates (it has no further dependencies
of its own).

just fmt; cargo check --workspace --all-targets passes.
Adds the read-only key/action lookup vocabulary downstream match-action
backends will implement.  Tiny crate by design (one trait pair, two
collection impls, inline tests) so consumer crates depend on this
without depending on each other.

lookup/src/lib.rs:

- pub trait Projection<T>: extracts a key of type T from self.
  Implemented on `&'a Source` so the lifetime threads into T when T
  borrows; for owned T the lifetime is unused.  The same source can
  implement Projection<T> for many T -- the call site picks one by
  inference from a Lookup backend's key type.
- impl<K> Projection<Option<K>> for Option<K>: identity, so
  classify_opt accepts a pre-computed Option<K> directly.  Scoped to
  Option<K> (not a general impl<T> Projection<T> for T) so a backend
  that implements Lookup<K, A> for every K can't make classify
  ambiguous.
- pub trait Lookup<K, A>: lookup(&K) -> Option<&A>, plus default
  methods classify (S: Projection<K>) and classify_opt (S:
  Projection<Option<K>>) that project and look up in one call.  Two
  trait type parameters, so one backend can serve many (K, A) pairs.
- Blanket Lookup impls for BTreeMap<K, V> and HashMap<K, V, S> so
  tests / simple consumers get a working backend out of the box.
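
The trait pair might look roughly like this. The method name `project`, the `Packet` source type, and its two projections are assumptions for illustration; only the overall shape (two trait parameters, default `classify` / `classify_opt`, a BTreeMap impl, the `Option<K>` identity) comes from the description above.

```rust
use std::collections::BTreeMap;

/// Extract a key of type `T` from `self`.
pub trait Projection<T> {
    fn project(self) -> T;
}

/// Identity projection so `classify_opt` accepts a ready-made `Option<K>`.
impl<K> Projection<Option<K>> for Option<K> {
    fn project(self) -> Option<K> {
        self
    }
}

pub trait Lookup<K, A> {
    fn lookup(&self, key: &K) -> Option<&A>;

    /// Project the source to a key, then look it up.
    fn classify<S: Projection<K>>(&self, source: S) -> Option<&A> {
        self.lookup(&source.project())
    }

    /// As `classify`, but a `None` projection is a miss.
    fn classify_opt<S: Projection<Option<K>>>(&self, source: S) -> Option<&A> {
        let key = source.project()?;
        self.lookup(&key)
    }
}

impl<K: Ord, V> Lookup<K, V> for BTreeMap<K, V> {
    fn lookup(&self, key: &K) -> Option<&V> {
        self.get(key)
    }
}

/// Hypothetical source type with two key widths.
pub struct Packet {
    pub src: [u8; 4],
    pub dst: [u8; 4],
    pub sport: u16,
    pub dport: u16,
}

pub type TwoTuple = ([u8; 4], [u8; 4]);
pub type FourTuple = ([u8; 4], [u8; 4], u16, u16);

impl<'a> Projection<TwoTuple> for &'a Packet {
    fn project(self) -> TwoTuple {
        (self.src, self.dst)
    }
}

impl<'a> Projection<FourTuple> for &'a Packet {
    fn project(self) -> FourTuple {
        (self.src, self.dst, self.sport, self.dport)
    }
}
```

With a `BTreeMap<TwoTuple, _>` the call `table.classify(&pkt)` selects the 2-tuple projection, and with a `BTreeMap<FourTuple, _>` the same call selects the 4-tuple one; the table's key type drives the choice.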

Inline tests exercise: projection-by-table-type inference at v4
2-tuple and 4-tuple widths, lifetime threading through borrowed
projections, miss returning None, classify_opt short-circuit on None
projection, the identity Projection<Option<K>> for Option<K>, and
HashMap parity with BTreeMap.

just fmt; cargo check --workspace --all-targets passes.
Lands the match-action key vocabulary plus its derive.  Backends in
downstream PRs consume the result via a structural FIELD_SPECS view
and a byte-packing as_key_into(); the bolero property-test
generators land next as their own feature gate.

match-action/src/:

- lib.rs publishes the FieldKind enum (Prefix / Mask / Range /
  Exact), the FieldSpec layout record (name / kind / size / offset),
  and the MatchKey trait (N, KEY_SIZE, field_specs(),
  as_key_into()).  Trait methods take slices rather than
  Self::N-sized arrays because Self::N in a trait fn needs unstable
  generic_const_exprs; the derive emits sized inherent helpers on
  concrete types instead.
- field.rs is a thin re-export of FixedSize from the fixed-size
  crate, so consumers can refer to it via match_action::FixedSize.
- predicate.rs holds the type-erased FieldPredicate form (Prefix,
  Mask, Range, Exact variants over FieldBytes) plus the Erased
  backend marker used by the reference oracle.
- rule.rs defines the four kind-typed *Spec wrappers (PrefixSpec,
  MaskSpec, RangeSpec, ExactSpec), the Backend / Accepts /
  IntoBackendField / IsUniversal traits, and the RuleField envelope
  carrying name / spec.

match-action-derive (proc-macro crate) emits, from a struct annotated
with #[prefix] / #[mask] / #[range] / #[exact]:

- the MatchKey impl,
- a parallel <Name>Rule struct holding the typed *Spec data,
- an inherent FIELD_SPECS const for compile-time inspection by
  downstream backends, and
- (for non-generic structs) an inherent as_key() -> [u8; KEY_SIZE]
  for ergonomic byte packing.

tests/derive_roundtrip.rs exercises a 5-tuple-shaped key covering
all four predicate kinds: asserts N, KEY_SIZE, the field_specs()
content, and that as_key_into / as_key produce the expected
big-endian byte layout (with declared field ordering preserved).

just fmt; cargo check --workspace --all-targets passes.
Adds property-test generators over the rule wrappers, behind a new
`bolero` feature gate so the bolero dep stays out of the default
build graph.  Downstream consumers (acl property tests, future
cascade Upsert-laws harnesses) enable the feature in their
dev-deps and draw random matching / non-matching packets for any
single rule.

match-action/src/generator.rs:

- FieldHit / FieldMiss TypeGenerators yielding bytes that satisfy /
  violate a given field predicate.  Cover all four predicate kinds:
  Prefix (any address sharing the rule's high-order bits / one with
  flipped bits in that prefix), Mask (value bits match / a flipped
  in-mask bit), Range (uniform draw in [min, max] / outside the
  bounds), Exact (the value / a different value).
- Miss generators skip universal predicates (range covering all
  values, mask of all zeros, prefix length zero, etc.); the
  derive-emitted IsUniversal check on <Name>Rule lets callers detect
  whole rules that have an empty miss set.
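
For the Prefix kind, the hit / miss construction can be sketched over a 32-bit field. This is a plain-function illustration that takes a `seed` in place of a bolero driver; the function names are hypothetical and the real generators operate on field bytes of any width.

```rust
/// Mask with the top `len` bits set (`len` is clamped to 32).
pub fn prefix_mask(len: u8) -> u32 {
    if len == 0 {
        0
    } else {
        u32::MAX << (32 - u32::from(len.min(32)))
    }
}

pub fn prefix_matches(value: u32, len: u8, x: u32) -> bool {
    let mask = prefix_mask(len);
    (x & mask) == (value & mask)
}

/// Hit: keep the rule's prefix bits, take every other bit from the seed.
pub fn prefix_hit(value: u32, len: u8, seed: u32) -> u32 {
    let mask = prefix_mask(len);
    (value & mask) | (seed & !mask)
}

/// Miss: start from a hit and flip one bit inside the prefix.
/// Returns `None` for the universal prefix (length zero), which has no miss.
pub fn prefix_miss(value: u32, len: u8, seed: u32) -> Option<u32> {
    if len == 0 {
        return None;
    }
    let len = len.min(32);
    // Index 0 is the most significant bit; every index < len is in the prefix.
    let bit_index = seed % u32::from(len);
    let flip = 1u32 << (31 - bit_index);
    Some(prefix_hit(value, len, seed) ^ flip)
}
```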

match-action/src/lib.rs picks up #[cfg(feature = "bolero")] gates on
the generator mod + its FieldHit / FieldMiss re-exports.

match-action/Cargo.toml grows the bolero optional dep and the
`bolero` feature, which also turns on the matching forwarded feature
on match-action-derive (a no-op today, kept for future emit changes
in the derive).  match-action-derive/Cargo.toml mirrors the
`bolero` feature declaration so cargo can forward through.

just fmt; cargo check --workspace --all-targets and
cargo check -p dataplane-match-action --features bolero pass.
Introduces the dataplane-acl crate with the software reference
classifier.  The DPDK rte_acl backend lands behind a feature gate in
a follow-up PR.

The reference backend is a linear-scan software classifier built on
the canonical FieldPredicate form from match-action
(rule.into_backend_fields::<Erased>()), so it speaks the same four
predicate kinds (Prefix / Mask / Range / Exact) as every other
backend.  Two roles:

1. Differential-testing oracle against rte_acl (a future PR's
   differential property tests pit both backends against the same
   random rule + packet draws).
2. Non-lossy substrate for a small-delta cascade front over a slow
   tail backend.

Layout:

- src/lib.rs declares the crate-level docs and re-exports the
  reference module.  The dpdk feature gate and dpdk_table_alias!
  macro land alongside the rte_acl backend itself in the next PR.
- src/reference/table.rs is the typed surface: ReferenceTable<K, A>
  parameterised by a MatchKey and an action; RefRule wraps the
  lowered Erased predicates plus an action.  Inline unit tests cover
  positional precedence (first match wins) and the four predicate
  kinds.
- src/reference/dyn_table.rs is the runtime-shape twin:
  DynReferenceTable carries its FieldSpec layout at runtime so
  property tests can fuzz the schema itself.  Returns DynShapeError
  on shape mismatch.
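
A minimal sketch of a first-match linear-scan classifier over the four predicate kinds. It simplifies every field to a `u32` and drops the `MatchKey` typing, so the types below are illustrative only and do not mirror `ReferenceTable<K, A>` or `RefRule` as implemented.

```rust
/// The four predicate kinds, simplified to 32-bit fields.
pub enum FieldPredicate {
    Prefix { value: u32, len: u8 },
    Mask { value: u32, mask: u32 },
    Range { min: u32, max: u32 },
    Exact(u32),
}

impl FieldPredicate {
    pub fn matches(&self, x: u32) -> bool {
        match *self {
            FieldPredicate::Prefix { value, len } => {
                // Top `len` bits must agree; length zero matches everything.
                let mask = if len == 0 {
                    0
                } else {
                    u32::MAX << (32 - u32::from(len.min(32)))
                };
                (x & mask) == (value & mask)
            }
            FieldPredicate::Mask { value, mask } => (x & mask) == (value & mask),
            FieldPredicate::Range { min, max } => min <= x && x <= max,
            FieldPredicate::Exact(v) => x == v,
        }
    }
}

pub struct RefRule<A> {
    pub fields: Vec<FieldPredicate>,
    pub action: A,
}

pub struct ReferenceTable<A> {
    rules: Vec<RefRule<A>>,
}

impl<A> ReferenceTable<A> {
    pub fn new(rules: Vec<RefRule<A>>) -> Self {
        ReferenceTable { rules }
    }

    /// O(rules * fields) scan; the first rule whose fields all match wins.
    pub fn lookup(&self, key: &[u32]) -> Option<&A> {
        self.rules
            .iter()
            .find(|rule| {
                rule.fields.len() == key.len()
                    && rule.fields.iter().zip(key).all(|(p, &x)| p.matches(x))
            })
            .map(|rule| &rule.action)
    }
}
```

A key whose field count differs from a rule's is treated as a miss for that rule in this sketch, whereas the runtime-shape table described above reports a shape error instead.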

just fmt; cargo check --workspace --all-targets passes.
Lands the static type machinery for the DPDK `rte_acl` backend
behind the new `dpdk` feature gate: how a MatchKey's FIELD_SPECS maps
into rte_acl's per-field FieldDef array, and how the four
match-action *Spec predicate kinds lower into IntoBackendField for
the `Dpdk` backend marker.  The runtime install / classify path
(install.rs, lookup.rs) and the dpdk_table_alias! macro land next.

acl/src/dpdk/:

- mod.rs declares the two submodules; carries a temporary
  #![allow(dead_code)] because the layout's `stride` field and the
  rule.rs RuleSpec fields are consumed only once install / lookup
  arrive in the next PR.  The allow goes away then.
- layout.rs has the rte_acl field planner: group fields by
  input_index (rte_acl requires the first field to be one byte,
  remaining fields grouped into <= 4-byte buckets), insert padding
  for gaps, and yield a DpdkLayout { field_defs: [FieldDef; N],
  stride, user_to_dpdk }.  const_extents() is const fn so a const
  alias can derive N / STRIDE from K::FIELD_SPECS without unstable
  generic_const_exprs.  Wide fields (Ipv6Addr, u128) decompose into
  four u32 sub-fields the way l3fwd-acl does.
- rule.rs holds the Dpdk backend marker, the AclWord trait (blanket
  impl over FixedSize via chunks()), the IntoBackendField impls
  carrying each *Spec into a backend-typed AclField group, the
  RuleSpec rule-field envelope, and splice_user_fields_to_dpdk for
  reordering user-declared fields into rte_acl's layout-driven
  ordering.
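
The wide-field split can be sketched as below: a 16-byte address becomes four big-endian u32 words, and a prefix length is spread across them 32 bits at a time. Both function names are hypothetical, and the actual planner works from FieldSpec entries, not from `Ipv6Addr` directly.

```rust
use std::net::Ipv6Addr;

/// Split a 16-byte address into four big-endian 32-bit words.
pub fn split_wide(addr: Ipv6Addr) -> [u32; 4] {
    let o = addr.octets();
    [
        u32::from_be_bytes([o[0], o[1], o[2], o[3]]),
        u32::from_be_bytes([o[4], o[5], o[6], o[7]]),
        u32::from_be_bytes([o[8], o[9], o[10], o[11]]),
        u32::from_be_bytes([o[12], o[13], o[14], o[15]]),
    ]
}

/// Spread a 0..=128 prefix length over the four sub-fields, most
/// significant word first; each sub-field takes at most 32 bits.
pub fn split_prefix_len(len: u8) -> [u8; 4] {
    let mut out = [0u8; 4];
    let mut remaining = len.min(128);
    for slot in out.iter_mut() {
        let take = remaining.min(32);
        *slot = take;
        remaining -= take;
    }
    out
}
```

A /48 prefix, for example, becomes a full 32-bit match on the first word, a 16-bit prefix on the second, and wildcards on the last two.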

acl/src/lib.rs picks up the #[cfg(feature = "dpdk")] gate on
pub mod dpdk; (no macro yet -- the dpdk_table_alias! macro lands with
its lookup-side referent next PR).

acl/Cargo.toml grows the dpdk feature and the optional dpdk
workspace dep.  No dev-deps yet.

just fmt; cargo check --workspace --all-targets and
cargo clippy -p dataplane-acl --features dpdk -- -D warnings pass.
Wires the layout planner and rule lowering from the previous PR into
a working DPDK backend: build an AclContext from a MatchKey plus its
rules, wrap it in a DpdkAclLookup, and classify packets through it.
First EAL-touching PR in the acl stack.

src/dpdk/:

- install.rs is the from-K-plus-rules constructor: take a MatchKey,
  call plan_layout to get the rte_acl FieldDefs, build an
  AclContext, splice each user RuleSpec through layout's
  user_to_dpdk map into rte_acl's column order, hand the rules to
  the context, build, and wrap the built context in a
  DpdkAclLookup<K, N, STRIDE, A>.
- lookup.rs is DpdkAclLookup itself: stack-packed key bytes
  (MAX_USER_KEY_BYTES sentinel feeds the compile-time guard in
  dpdk_table_alias!), the impl Lookup<K, A> single-shot path, and a
  batched classify_batch over a slice of K returning aligned actions.
- mod.rs picks up pub mod install / pub mod lookup and drops the
  temporary #![allow(dead_code)] from the previous PR -- RuleSpec
  fields and DpdkLayout.stride now have readers.

src/lib.rs gains the dpdk_table_alias! macro: dpdk_table_alias!(pub
type FiveTupleTable<Verdict> = FiveTuple); yields a DpdkAclLookup<K,
N, STRIDE, A> with N / STRIDE derived from K::FIELD_SPECS via
const_extents.  A const _: () = assert!(KEY_SIZE <=
MAX_USER_KEY_BYTES) guards against keys that wouldn't fit the
stack scratch buffer.  The hidden __match_action module re-exports
MatchKey so the macro resolves without a caller-side import.

tests/eal_install_classify.rs is the smoke: derive a MatchKey, install
two rules with priority precedence, classify via the single-shot path
and the batch path, assert userdata.

acl/Cargo.toml grows a single dev-dep -- self-overriding dpdk with the
`test` feature on so #[with_eal] from dpdk-test-macros works.

just fmt; cargo check --workspace --all-targets and
cargo clippy -p dataplane-acl --features dpdk -- -D warnings pass.
Adds the runtime-shape twin of DpdkAclLookup -- DynDpdkLookup carries
its FieldSpec layout at runtime instead of in const generics -- and
the shape-fuzz oracle that proves the byte-level pipeline agrees
between the reference oracle and rte_acl over an unconstrained
schema.

src/dpdk/dyn_table.rs is DynDpdkLookup<A>:

- new(name, max_rule_num, field_specs) plans the rte_acl layout from
  a Vec<FieldSpec> at runtime, builds an empty AclContext, and
  returns a typed lookup keyed by an Erased FieldPredicate vector.
- add_rules takes Vec<DynRuleSpec> -- the runtime-shape rule
  carrier (priority, category_mask, lowered fields, action) -- and
  splices each rule's field bytes through the user_to_dpdk map into
  rte_acl's column order, then builds.
- impl Lookup<Vec<FieldBytes>, A>: pack the probe bytes onto the
  stack scratch buffer in the layout's column order, hand them to
  rte_acl_classify, and translate the userdata hit back to &A.

src/dpdk/mod.rs picks up pub mod dyn_table; alongside the typed path.

tests/property_dyn_shape.rs is the schema fuzz:

- bolero TypeGenerator yields a random Vec<FieldSpec>, a single rule
  matching that shape, and packet seeds.
- For each shape: install the same rule into a DynReferenceTable
  (oracle) and a DynDpdkLookup, then probe both with both a
  hit-byte seed and a miss-byte seed.  Assert agreement on every
  probe.
- No MatchKey types involved -- exercises the byte-level pipeline
  end-to-end and catches drift in layout planning, the splice map,
  and rte_acl's per-predicate semantics simultaneously.

acl/Cargo.toml gains bolero + match-action[bolero] dev-deps the
test needs.

just fmt; cargo check --workspace --all-targets and
cargo clippy -p dataplane-acl --features dpdk -- -D warnings pass.
Pure test broadening; no src changes.  Adds the single-rule v4/v6
differential against the reference oracle and three Headers /
metadata projection demos that exercise classify / classify_opt
against real net::HeadersView packets.

tests/property_predicate.rs is the differential.  For a random
5-tuple rule + random hit/miss byte seeds drawn via match-action's
FieldHit / FieldMiss generators, both the reference oracle and the
DPDK backend must accept every hits() draw and reject every misses()
draw.  Parameterised over the address width via a sealed IpAddress
trait so a single body covers v4 (Ipv4Addr) and v6 (Ipv6Addr) -- the
DPDK wide-field split (one 16-byte address -> four 4-byte
sub-fields) is exercised end-to-end by the v6 invocation.  Single
rule only; multi-rule differential is deferred (positional
precedence vs numeric Priority).

tests/eal_classify_via_projection.rs is the end-to-end projection
demo: a real packet -> HeadersView -> Projection<FiveTuple> -> DPDK
Lookup<FiveTuple, _> -> action.  Shows Lookup::classify runs the
projection and the lookup as a single call -- the call site reads
table.classify(&headers) and doesn't see the intermediate key
construction.

tests/metadata_projection.rs is the partial-projection demo.  Header
fields live in Headers; VRF / VNI live in PacketMeta.  A projection
source bundles &HeadersView with &PacketMeta and projects to
Option<K>: the header part is total (shape proves presence), the
metadata part narrows from its Option with ?.  Missing metadata
projects to None and Lookup::classify_opt turns that into a table
miss with no explicit branch in user code.

tests/net_field_types.rs uses net wire newtypes (TcpPort, UdpPort,
Vni, UnicastIpv4Addr) directly as MatchKey fields with no acl-side
AclWord impl, leaning on net's FixedSize impls (PR 2a) and the DPDK
backend's blanket AclWord-over-FixedSize impl.

acl/Cargo.toml grows the net[test_buffer, builder] dev-dep these
projection demos need.

just fmt; cargo check --workspace --all-targets and
cargo clippy -p dataplane-acl --features dpdk -- -D warnings pass.
Adds criterion benchmarks for both backends at v4 and v6 widths,
plus the nix / just plumbing to produce bench binaries from a
sandboxed build.

acl/benches/:

- reference_five_tuple.rs sweeps a deep miss (full per-rule scan)
  and an early hit through the reference's O(rules * fields) linear
  scan.  Both widths.
- dpdk_five_tuple.rs is the rte_acl companion: trie walk cost (close
  to flat in rule count), miss vs hit, single-shot vs SIMD batch.
  v6 exercises the wide-field split (one 16-byte address -> four
  4-byte sub-fields).  Requires a live EAL.
- table_build.rs measures construction cost vs rule count: reference
  (lower + Vec wrap) and DPDK (rte_acl_build, the update-latency
  cost).  Both widths.  iter_batched so teardown is excluded.

acl/Cargo.toml gets the criterion dev-dep and three harness = false
[[bench]] entries.  Workspace Cargo.toml gets the criterion = 0.5.1
shared dep entry.

default.nix adds a bench-builder derivation: cargo bench --no-run
under the profile-appropriate DPDK sysroot, then copies each
compiled benchmark into $out/bin (stripping cargo's -<hash>
suffix).  Linked against the optimized DPDK when profile = release.

justfile adds a bench recipe that builds the benches package and
runs every binary under results/benches/bin/ in turn.

just fmt; cargo check --workspace --all-targets and
cargo clippy -p dataplane-acl --features dpdk -- -D warnings pass.
Introduces dataplane-cascade.  Models a small in-memory LSM: writes
land in a concurrent multi-writer head, periodically frozen into
immutable intermediate layers, eventually compacted into an immutable
tail.  Readers walk head -> frozen[] -> tail and stop at the first
definitive answer.
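
The read path can be sketched with a single-threaded toy. The real head is concurrent and multi-writer and the real layers go through the Upsert trait; the `Entry` enum and the `rotate` / `compact` methods below are simplified, hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// A layer entry: either a value or a tombstone that hides lower layers.
pub enum Entry<V> {
    Value(V),
    Tombstone,
}

pub struct Cascade<K, V> {
    head: HashMap<K, Entry<V>>,
    frozen: Vec<HashMap<K, Entry<V>>>, // index 0 is the newest frozen layer
    tail: HashMap<K, V>,
}

impl<K: Eq + Hash, V> Cascade<K, V> {
    pub fn new() -> Self {
        Cascade { head: HashMap::new(), frozen: Vec::new(), tail: HashMap::new() }
    }

    pub fn insert(&mut self, key: K, value: V) {
        self.head.insert(key, Entry::Value(value));
    }

    pub fn remove(&mut self, key: K) {
        self.head.insert(key, Entry::Tombstone);
    }

    /// Freeze the current head into the newest immutable layer.
    pub fn rotate(&mut self) {
        let sealed = std::mem::take(&mut self.head);
        self.frozen.insert(0, sealed);
    }

    /// Fold frozen layers into the tail, oldest first.
    pub fn compact(&mut self) {
        while let Some(layer) = self.frozen.pop() {
            for (key, entry) in layer {
                match entry {
                    Entry::Value(v) => {
                        self.tail.insert(key, v);
                    }
                    Entry::Tombstone => {
                        self.tail.remove(&key);
                    }
                }
            }
        }
    }

    /// Walk head -> frozen[] -> tail; stop at the first definitive answer.
    pub fn get(&self, key: &K) -> Option<&V> {
        for layer in std::iter::once(&self.head).chain(self.frozen.iter()) {
            match layer.get(key) {
                Some(Entry::Value(v)) => return Some(v),
                Some(Entry::Tombstone) => return None,
                None => {}
            }
        }
        self.tail.get(key)
    }
}
```

A tombstone counts as a definitive answer here, which is what lets a delete in the head suppress a value that still sits in a frozen layer or the tail.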

The same primitive serves three problems via one Upsert trait:
match-action table updates (atomic publish under load), hardware
offload programming (drain output feeds the HW backend), and
active-active state replication (serialized drain output ships to
peer dataplanes).  freeze / fuse / compact / merge are all the same
Upsert operation applied to different operands.

Layout:

- lib.rs publishes the surface: Cascade / DrainEvent / FrozenEntry /
  Snapshot, the Upsert / MergeInto / MutableHead traits, Generation,
  and re-exports Lookup / Projection from the lookup crate.
- cascade.rs is the central type plus the rotate / drain / publish
  state machine.  Cascade::subscribe is feature-gated under
  `subscribe` (tokio broadcast).
- head.rs / merge.rs / upsert.rs are the trait definitions plus
  LastWriteWins as a stock blanket-friendly Upsert impl.
- generation.rs carries the monotonic Generation counter that orders
  layers within a cascade.
- property_tests.rs is a reusable bolero harness (under the bolero
  feature) consumer crates use to verify their own Upsert impls
  against the cascade's algebraic laws.  Exercised by a self-test
  in the next PR.

tests/smoke.rs is the trait-shape end-to-end with a trivial concrete
implementation: head shadows sealed layers, sealed layers shadow the
tail, tombstones in the head suppress lower-layer hits, and rotate
seals and publishes.  Uses the tests/common/mod.rs helpers shared
with subsequent tests.

Cascade is independent of every other PR in this stack: no acl dep,
no match-action dep.  Real-shape consumer pressure plus
bolero / subscribe integration tests land in the next PR.

just fmt; cargo check --workspace --all-targets passes.
Three integration tests that round out the cascade's test surface --
each exercises a different part of the design pressure that shaped
the trait surface in the previous PR.

tests/acl_consumer.rs is the first real-shaped consumer: a minimal
ACL classifier built on top of the cascade with no upfront ACL
crate dep -- the toy ACL is built inline using only Cascade /
MergeInto / MutableHead.

Purpose: surface design pressure on the trait surface against a use
case that is NOT exact-match-keyed.  ACL classification looks up
packets by header match expressions, not by a single key, and rules
carry their own identity (priority) separate from the lookup input.
If the cascade trait shape works for ACL it almost certainly works
for anything simpler.

Scope intentionally tight: rules install-only (no shadow-rule
removal yet), match expressions src/dst IPv4 with optional single
port, head Lookup always returns None (writes visible only after a
rotation seals the head).  Comment block at the end captures the
open question about removal under cascade semantics.

tests/upsert_properties.rs is the self-test of the property harness
landed in the previous PR.  Exercises check_upsert_order_independent
against the provided LastWriteWins -- known correct by construction.
If it fails, the harness itself is broken; if it passes, it's a
useful black-box check for downstream Upsert impls.
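
An order-independence check of this kind might look like the sketch below. It assumes a last-write-wins value tagged with a generation number; the `Versioned` type, the one-method `Upsert` trait, and both helper functions are hypothetical and do not reproduce the crate's `LastWriteWins` or `check_upsert_order_independent`.

```rust
/// Hypothetical last-write-wins value: the higher generation wins.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Versioned<V> {
    pub generation: u64,
    pub value: V,
}

pub trait Upsert {
    fn upsert(&mut self, incoming: Self);
}

impl<V> Upsert for Versioned<V> {
    fn upsert(&mut self, incoming: Self) {
        if incoming.generation > self.generation {
            *self = incoming;
        }
    }
}

/// Fold a sequence of writes into one value, left to right.
pub fn fold_upserts<T: Upsert + Clone>(items: &[T]) -> Option<T> {
    let mut iter = items.iter().cloned();
    let mut acc = iter.next()?;
    for item in iter {
        acc.upsert(item);
    }
    Some(acc)
}

/// True when every rotation and the reversal fold to the same result.
/// This samples orderings; it does not enumerate all permutations.
pub fn order_independent<T: Upsert + Clone + PartialEq>(items: &[T]) -> bool {
    let forward = fold_upserts(items);
    let rotations_agree = (0..items.len()).all(|r| {
        let mut rotated = items.to_vec();
        rotated.rotate_left(r);
        fold_upserts(&rotated) == forward
    });
    let mut reversed = items.to_vec();
    reversed.reverse();
    rotations_agree && fold_upserts(&reversed) == forward
}
```

In this sketch two writes with the same generation and different values are order-dependent, which is the kind of defect such a check is meant to surface.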

tests/subscribe.rs is the drain-subscription integration under tokio.
Exercises Cascade::subscribe end-to-end -- gated by the cascade
`subscribe` feature (enabled in dev-deps via the self-path override
on Cargo.toml).

just fmt; cargo check --workspace --all-targets passes.
@coderabbitai

coderabbitai Bot commented Jun 6, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

daniel-noland force-pushed the pr/daniel-noland/n-vm-again branch 2 times, most recently from eda6318 to 5394823 on June 6, 2026 19:31
daniel-noland and others added 6 commits June 6, 2026 17:32
Absorb the `n-vm`, `n-it`, `n-vm-macros`, and `n-vm-protocol` crates
from the external `testn` repository into the workspace and add the
supporting infrastructure for running tests inside a QEMU/cloud-hypervisor
guest.

This is a squashed revival of the `n-vm-again` branch, rebased onto the
current ACL/dpdk line.  The branch had forked before the ACL module
existed; its stale fork of `dpdk/`, `dpdk-sys/`, and `dpdk/src/acl/*` has
been dropped in favour of the authoritative versions on the base, keeping
only the orthogonal VM test infrastructure.

Contents:

- `n-vm` / `n-vm-protocol`: QEMU and cloud-hypervisor backends, dynamic
  vsock allocation, hugepage and NIC-model configuration, scratch-only
  container mode with nix-store bind mounts.
- `n-vm-macros`: the `#[in_vm]` attribute and companion attributes
  (`#[network]`, guest/hypervisor config), compile-fail trybuild tests.
- `n-it`: in-guest init system (PID 1) with mount-table-driven teardown.
- `nix`: `linux-fancy` kernel built from config fragments (VFIO, IOMMU,
  virtio, e1000/e1000e), `testroot`/`vmroot` derivations, `merge-config.nix`.
- `hardware` / `dpdk-sys`: e1000/e1000e NIC binding and `rte_net_e1000`
  PMD linkage; `driver()` returns `Ok(None)` for unbound devices.
- `mgmt`: re-enable `test_sample_config` under `#[in_vm]`.
- `justfile`: export `N_VM_TEST_ROOT`/`N_VM_VM_ROOT` for `#[in_vm]` tests.

The `hardware/tests/dpdk_in_vm.rs` integration test (virtio-net/e1000/
e1000e) is intentionally NOT carried forward: it was written against the
pre-rewrite dpdk device API (`StartedDev` queue handles, `RxOffloadConfig`,
public `Headers` fields) and needs a port to the current API.  It is
preserved on the `backup/n-vm-again` branch for a clean revival.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Make the cross-architecture `#[in_vm]` pipeline run end-to-end and remove
the x86-centric framing of architecture-divergent VM config.

- container: run the foreign test binary under user-mode QEMU inside the
  container (mirroring `scripts/test-runner.sh`) rather than relying on a
  host `binfmt_misc` handler.  The `qemu-<arch>` interpreter is resolved on
  PATH to its `/nix/store` path, reachable via the bind-mounted store.
- cloud-hypervisor: set `mergeable = false` in the memory config; current
  cloud-hypervisor rejects `mergeable` together with `shared`, and `shared`
  is required for virtiofs.
- Arch is now an explicit dimension threaded through the QEMU arg builders
  and `build_kernel_cmdline` (not `Arch::current()` buried in each), so the
  per-ISA lowering is unit-tested for both ISAs on any build host.
- vIOMMU is folded into `Arch::virtual_iommu_device()`; an `iommu = true`
  request on an ISA with no lowering now resolves to a graceful skip in the
  host tier (reusing the cloud-hypervisor-cross skip path) instead of a
  launch panic.
- boot spike: also validate virtio-net / e1000 / e1000e driver binding in
  the guest.

Validated: x86 `just test n-vm` 184/184; aarch64
`just platform=aarch64 libc=musl test n-vm` 185/186.  The lone aarch64
failure is the `#[should_panic]` control test, whose deliberate panic comes
out of the TCG guest as SIGSEGV -- a separate runtime issue, deferred.
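
A minimal sketch of what threading the ISA as an explicit parameter can look like. The `Arch` variants, the `console()` helper, and the simplified `build_kernel_cmdline` signature are illustrative assumptions, not the crate's actual API; the console names are the usual Linux serial devices for each ISA.

```rust
/// Guest ISA, passed in explicitly rather than read from the build host.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Arch {
    X86_64,
    Aarch64,
}

impl Arch {
    /// Serial console device for the guest kernel (illustrative values).
    fn console(self) -> &'static str {
        match self {
            Arch::X86_64 => "ttyS0",
            Arch::Aarch64 => "ttyAMA0",
        }
    }
}

/// Hypothetical stand-in for `build_kernel_cmdline`: because the ISA is an
/// argument, both lowerings can be unit-tested on any build host.
fn build_kernel_cmdline(arch: Arch, extra: &[&str]) -> String {
    let mut parts = vec![format!("console={}", arch.console())];
    parts.extend(extra.iter().map(|s| s.to_string()));
    parts.join(" ")
}
```

With this shape an x86 build host can assert on `build_kernel_cmdline(Arch::Aarch64, ..)` directly, which an `Arch::current()` call buried inside the builder would not allow.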

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`#[should_panic]` cannot compose with `#[in_vm]`.  The generated function
runs under libtest at all three dispatch tiers (host, container, guest),
and a panic is absorbed at whichever tier produces it, so the semantics are
incoherent: when a guest test fails, the `run_container_tier_for` "VM test
failed" panic is caught by the *container-tier* libtest's own
`#[should_panic]` (so the container reports "1 passed"), the host then sees
a clean exit, and the host-tier `#[should_panic]` fails with "did not panic
as expected".  On aarch64 this is reliably hit because the guest's panic
faults (SIGSEGV) before it can be caught in-guest.

Reject the combination in the macro with a `compile_error!` that directs
the author to assert the failure condition in the body instead, add a
trybuild compile-fail case, and drop the broken
`test_which_runs_in_vm_control` (the harness's failure-detection is covered
by the verdict-decode unit tests).

With this, `just platform=aarch64 libc=musl test n-vm` is fully green
(185/185).
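
As a rough illustration of "assert the failure condition in the body", the sketch below checks an expected error directly instead of relying on `#[should_panic]`. `parse_port` is a hypothetical operation under test; in a real test the function would also carry `#[test]` and `#[in_vm]`, which are omitted here so the snippet stays self-contained.

```rust
use std::num::ParseIntError;

/// Hypothetical operation under test that is expected to fail on bad input.
fn parse_port(s: &str) -> Result<u16, ParseIntError> {
    s.parse::<u16>()
}

/// The failure is asserted in the body, so the verdict is the same at
/// whichever dispatch tier (host, container, guest) runs the function.
fn rejects_out_of_range_port() {
    let result = parse_port("70000");
    assert!(result.is_err(), "expected an out-of-range port to be rejected");
}
```

Because the test passes by returning normally, no tier needs to interpret a panic as success.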

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The vIOMMU realization was spread across three `Arch` methods -- the QEMU
`-device` string, the `kernel-irqchip=split` machine option, and the guest
kernel IOMMU params -- so adding a new ISA's vIOMMU meant remembering to
touch all three, and they could drift apart.

Replace them with a single `Arch::virtual_iommu() -> Option<VIommuLowering>`
carrying `device` + `machine_opts` + `kernel_params` as one object;
`supports_virtual_iommu()` derives from it.  Adding aarch64 SMMUv3 (a
follow-up) is now filling in one struct rather than editing several methods.

Behavior-preserving: the emitted QEMU args and guest kernel command line are
byte-identical on x86_64, and aarch64 still has no vIOMMU (`None`).  Verified
by the per-ISA builder and cmdline unit tests plus a new coherence property
test (an ISA has a complete lowering or none -- the pieces can't half-apply).
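
The shape described above might look roughly like the following. The field types, the concrete x86_64 strings, and the `iommu_device_args` helper are assumptions for illustration; only the names `VIommuLowering`, `virtual_iommu`, `supports_virtual_iommu`, and the three field names come from this commit message.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Arch {
    X86_64,
    Aarch64,
}

/// Everything needed to realize a vIOMMU on one ISA, carried as one value
/// so the pieces cannot half-apply.
#[derive(Debug, PartialEq, Eq)]
struct VIommuLowering {
    /// Argument for QEMU's `-device`.
    device: &'static str,
    /// Extra `-machine` option(s).
    machine_opts: &'static str,
    /// Guest kernel command-line parameters.
    kernel_params: &'static [&'static str],
}

impl Arch {
    fn virtual_iommu(self) -> Option<VIommuLowering> {
        match self {
            // Illustrative values, not copied from the crate.
            Arch::X86_64 => Some(VIommuLowering {
                device: "intel-iommu,intremap=on",
                machine_opts: "kernel-irqchip=split",
                kernel_params: &["intel_iommu=on", "iommu=pt"],
            }),
            // No lowering at this commit.
            Arch::Aarch64 => None,
        }
    }

    /// Derived from the lowering, so the two can never disagree.
    fn supports_virtual_iommu(self) -> bool {
        self.virtual_iommu().is_some()
    }
}

/// Hypothetical builder fragment: emits `-device` args only when a lowering exists.
fn iommu_device_args(arch: Arch) -> Vec<String> {
    arch.virtual_iommu()
        .map(|l| vec!["-device".to_string(), l.device.to_string()])
        .unwrap_or_default()
}
```

Callers branch once on the `Option` and then consume all three fields together, which is what the coherence property test relies on.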

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
daniel-noland and others added 3 commits June 6, 2026 23:13
aarch64's vIOMMU is QEMU's `virt` SMMUv3, lowered as a `-machine
iommu=smmuv3` option (not a `-device` like x86 intel-iommu) and auto-probed
from the device tree (no kernel command-line opt-in).  Add it as the
aarch64 `VIommuLowering` (`device: None`, `machine_opts: "iommu=smmuv3"`),
make `VIommuLowering.device` an `Option` so a machine-option-only IOMMU is
expressible, and add `CONFIG_ARM_SMMU_V3` to the guest kernel.
`supports_virtual_iommu()` is now true for aarch64, so the host-tier skip
flips off and `#[hypervisor(iommu)]` tests run on the cross path instead of
skipping.

Validated under TCG (scripts/n-vm-aarch64-smmu-spike.sh): the SMMUv3
probes, PCI devices (incl. e1000) land in their own IOMMU groups so
vfio-pci/DPDK is viable, and `iommu_platform=on,ats=on` is accepted without
faults.  The three `#[hypervisor(iommu)]` integration tests now run and pass
on aarch64 -- the vhost-vsock verdict channel and vhost-user-fs root both
work behind the SMMU -- with the suite at 186/186.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The in_vm test VMs always run with `-nographic`, so QEMU's GUI display
backends (gtk/sdl/vnc/spice/...) are dead weight that dragged
gtk4/gtk3/cairo/pango/vte/libepoxy/SDL into every test/dev root.  This
was invisible on the native path (the cached `qemu_kvm` simply fetched
that closure) but became loud on the cross path, where building
`qemu-system-aarch64` from source meant compiling all of GTK.

Select nixpkgs' headless `nixosTestRunner` QEMU profile instead:

  - Native guest: the prebuilt, cache-hit `qemu_test`
    (= `qemu_kvm` + `nixosTestRunner`) -- host-cpu-only, KVM, no GUI.
    Drops GTK from the common devroot with no loss of the binary cache.
  - Cross guest: the base `qemu`, headless and restricted to just the
    build-host + guest `*-softmmu` targets (the build-host target keeps
    QEMU's `qemu-kvm` compat symlink from dangling and tripping the
    `noBrokenSymlinks` install check).  A genuine-cross `pkgsBuildHost`
    qemu is never in the binary cache regardless (Hydra does not build
    that derivation), so trimming targets + GUI shrinks that unavoidable
    build from a 2247 MB closure to 582 MB.

`nixosTestRunner`'s only non-GUI effect is a 9p uid0 patch we never
exercise (we mount the guest root via vhost-user-fs, not `-virtfs`).

Validated gtk-free, full suites green: x86 186/186 (`qemu_test`, KVM)
and aarch64 186/186 (slim cross qemu, TCG).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The `compile_fail` trybuild test shells out to `cargo` at run time to
compile each case for the build target.  Under cross emulation
(`--cfg emulated`, set by nix/profiles.nix when the test arch differs
from the host) the test binary runs via qemu-user and the build target's
std/core is unavailable, so every case fails with E0463 ("can't find
crate for `core`") instead of the diagnostics it asserts -- which failed
`just libc=musl platform=bluefield3 test`.

Gate the test with `#[cfg_attr(emulated, ignore)]`, matching the repo's
existing convention (routing/nat/mgmt).  The macro's compile-time
diagnostics are arch-independent, so the native run is full coverage.

n-vm-macros' only test is then skipped, so an isolated
`just test n-vm-macros` cross run would trip nextest's no-tests-run
error (exit 4); add `--no-tests pass` to the interactive `test` recipe to
match what `test-each` already does for per-package archives.

Validated: native runs + passes (1/1), cross skips (exit 0).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@daniel-noland daniel-noland force-pushed the pr/daniel-noland/n-vm-again branch from 5394823 to a64c165 Compare June 7, 2026 06:48
daniel-noland and others added 8 commits June 8, 2026 21:41
ContainerGuard::into_result disarmed both safety nets (the defused flag
and the background CleanupThread) before calling collect_and_cleanup.
That helper can return early on inspect failure or missing state, before
remove_container runs -- and at that point Drop is a no-op, so the
container leaks on exactly the error path the guard exists to cover.

Collect and remove first; mark defused / defuse the thread only after the
removal succeeds, so an early `?` propagates with the guard still armed
and Drop triggers emergency removal.
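
The ordering can be sketched with a simplified guard. This is not the real `ContainerGuard`: the background `CleanupThread` is left out, `collect_and_cleanup` is reduced to a stub with a hypothetical `inspect_ok` switch, and the counter exists only to make the emergency path observable.

```rust
use std::cell::Cell;
use std::rc::Rc;

struct ContainerGuard {
    defused: bool,
    /// Counts emergency removals performed by Drop (illustration only).
    emergency_removals: Rc<Cell<u32>>,
}

impl ContainerGuard {
    /// Stub for the collect-then-remove helper; it can fail before removal.
    fn collect_and_cleanup(&self, inspect_ok: bool) -> Result<i32, String> {
        if !inspect_ok {
            return Err("inspect failed".to_string());
        }
        Ok(0) // exit code; the container was removed on this path
    }

    fn into_result(mut self, inspect_ok: bool) -> Result<i32, String> {
        // The guard is still armed here: an early `?` drops `self`,
        // and Drop performs the emergency removal.
        let code = self.collect_and_cleanup(inspect_ok)?;
        // Disarm only after removal has succeeded.
        self.defused = true;
        Ok(code)
    }
}

impl Drop for ContainerGuard {
    fn drop(&mut self) {
        if !self.defused {
            self.emergency_removals.set(self.emergency_removals.get() + 1);
        }
    }
}
```

Setting `defused` before the fallible call would turn Drop into a no-op on the error path, which is the leak this commit fixes.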

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
waitpid(-1, WNOHANG) reports "no children remain" as Err(ECHILD), not
WaitStatus::StillAlive. The reap loop fell through to the catch-all error
arm and logged `unexpected errno from waitpid in init: ECHILD` on every
shutdown round (and once per SIGTERM round in terminate_remaining_
processes). Match ECHILD explicitly and break cleanly; keep the warning
for genuinely unexpected errnos.
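
A sketch of the reap-loop shape with the explicit `ECHILD` arm. To stay dependency-free, `WaitStatus` and `Errno` here are local stand-ins rather than the real wait/errno types, the `wait` closure stands in for `waitpid(-1, WNOHANG)`, and `ReapSummary` is a hypothetical return type used to make the behavior checkable.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum WaitStatus {
    Exited(i32),
    StillAlive,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Errno {
    ECHILD,
    EINVAL,
}

#[derive(Debug, PartialEq, Eq)]
struct ReapSummary {
    reaped: usize,
    warnings: usize,
}

/// Reaps until nothing is left to collect right now.
fn reap_children(mut wait: impl FnMut() -> Result<WaitStatus, Errno>) -> ReapSummary {
    let mut summary = ReapSummary { reaped: 0, warnings: 0 };
    loop {
        match wait() {
            Ok(WaitStatus::Exited(_)) => summary.reaped += 1,
            // Children exist but none has changed state yet.
            Ok(WaitStatus::StillAlive) => break,
            // "No children remain" is the normal end state, not an error.
            Err(Errno::ECHILD) => break,
            // Anything else is genuinely unexpected and still worth a warning.
            Err(_) => {
                summary.warnings += 1;
                break;
            }
        }
    }
    summary
}
```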

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…efix

TestResult::parse stripped WIRE_PREFIX with no boundary check, so a line
like `n-it-resultpass` stripped to a `pass` verdict. For a pass/fail
parser whose contract is "garbled verdict means failure", a spurious pass
is the dangerous direction. Require whitespace immediately after the
prefix (the separator the wire form always emits) and add a regression
test.
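
A minimal sketch of the boundary check, assuming the prefix is `n-it-result` (inferred from the example line above) and that `parse` returns the verdict directly; the real signature and verdict set may differ.

```rust
/// Assumed value, inferred from the `n-it-resultpass` example.
const WIRE_PREFIX: &str = "n-it-result";

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TestResult {
    Pass,
    Fail,
}

impl TestResult {
    /// Anything that is not a well-formed pass line counts as a failure.
    fn parse(line: &str) -> TestResult {
        let Some(rest) = line.strip_prefix(WIRE_PREFIX) else {
            return TestResult::Fail;
        };
        // Boundary check: the prefix must be followed by whitespace,
        // so `n-it-resultpass` is no longer read as `pass`.
        if !rest.starts_with(char::is_whitespace) {
            return TestResult::Fail;
        }
        match rest.trim() {
            "pass" => TestResult::Pass,
            _ => TestResult::Fail,
        }
    }
}
```

Under these assumptions `TestResult::parse("n-it-result pass")` yields `Pass`, while `TestResult::parse("n-it-resultpass")` falls through to `Fail`.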

Also replaces the box-drawing section dividers in this file with ASCII
per the repo's ASCII-only source convention.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A bare `#[in_vm] fn foo() {}` with neither `#[test]` nor `#[tokio::test]`
compiled to an ordinary function that libtest never collects, so the test
silently never ran -- the worst failure mode for a test harness. Emit a
clear compile error instead, checked after the more specific diagnostics
(bad backend, params, return type, NIC/backend mismatch) so those still
take precedence. Adds a missing_test_attr compile-fail fixture.

Also makes the migrated-option and multiple-backends diagnostics ASCII
(em-dash and ellipsis -> `--` / `...`), re-blessing their .stderr.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The repo convention is ASCII-only source. Replaces box-drawing section
dividers (U+2500) with `-`, the multiplication sign (U+00D7) in topology
comments and the hugepage error message with `x`, and em-dashes (U+2014)
in doc comments with `--`. No behavioral change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The npm-distributed ckb (`npx @tastehub/ckb`) is upstream's `build-fast`
variant: CGO_ENABLED=0, no `-tags cartographer`. That compiles out the
tree-sitter AST analyzers, so `ckb review` reports "Complexity analysis
not available (tree-sitter not built)" -- the complexity/bug-pattern code
is `//go:build cgo` with a `!cgo` stub.

Build ckb from source with CGO on so complexity analysis works. It is
Rust-aware (internal/complexity imports go-tree-sitter's rust grammar and
dispatches LangRust); verified against n-vm/src/qemu/qmp.rs, which the
npm binary refuses with "requires CGO". Pinned via npins at v9.2.0 to
match the existing index.scip schema (v11).

Bug-pattern detection stays Go-only (Go-specific node types) -- harmless
on this Rust repo. The cartographer backend (layers/arch-health), which
links a Rust static lib via `-tags cartographer`, is deferred: it is a
much heavier build and we want to gauge the complexity analysis first.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@daniel-noland daniel-noland force-pushed the pr/daniel-noland/acl-pat-design branch 5 times, most recently from 4915824 to e5d8a01 Compare June 18, 2026 19:56