Benchmark prs #1529
Benchmark Results
kvm / amd (Linux) (❌ *1.81x slower* → 🚀 **2.41x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
kvm / intel (Linux) (❌ *1.88x slower* → 🚀 **7.91x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / amd (Linux) (❌ *2.19x slower* → 🚀 **2.85x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / intel (Linux) (❌ *1.56x slower* → 🚀 **1.89x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / amd (Windows) (❌ *4.96x slower* → ✅ **1.42x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / intel (Windows) (❌ *6.68x slower* → ✅ **1.31x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
Benchmark Results
kvm / amd (Linux) (❌ *1.92x slower* → 🚀 **2.44x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
kvm / intel (Linux) (❌ *1.70x slower* → 🚀 **5.39x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / amd (Linux) (❌ *1.98x slower* → 🚀 **2.60x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / intel (Linux) (❌ *2.33x slower* → 🚀 **1.94x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / amd (Windows) (❌ *8.66x slower* → ✅ **1.11x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / intel (Windows) (❌ *3.32x slower* → ✅ **1.24x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
Benchmark Results
kvm / amd (Linux) (❌ *1.35x slower* → 🚀 **2.81x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
kvm / intel (Linux) (✅ **1.01x slower** → 🚀 **13.07x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / amd (Linux) (❌ *1.44x slower* → 🚀 **4.04x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / intel (Linux) (✅ **1.10x slower** → 🚀 **2.89x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / amd (Windows) (❌ *3.62x slower* → 🚀 **1.82x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / intel (Windows) (❌ *1.99x slower* → ✅ **1.38x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
Benchmark Results
kvm / amd (Linux) (❌ *1.81x slower* → 🚀 **2.37x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
kvm / intel (Linux) (❌ *1.92x slower* → 🚀 **8.06x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / amd (Linux) (❌ *2.29x slower* → 🚀 **2.72x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / intel (Linux) (❌ *2.15x slower* → ✅ **1.70x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / amd (Windows) (❌ *6.07x slower* → ✅ **1.44x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / intel (Windows) (❌ *5.82x slower* → ✅ **1.29x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
Benchmark Results
kvm / amd (Linux) (❌ *1.77x slower* → 🚀 **2.48x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
kvm / intel (Linux) (❌ *2.04x slower* → 🚀 **12.49x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / amd (Linux) (❌ *1.87x slower* → 🚀 **2.73x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / intel (Linux) (❌ *1.43x slower* → 🚀 **2.26x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / amd (Windows) (❌ *7.91x slower* → ✅ **1.21x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / intel (Windows) (❌ *10.73x slower* → ✅ **1.46x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
Pull request overview
This PR introduces a new hyperlight-ci workspace crate to host CI/dev helper commands (benchmark runner + report generator), and wires it into GitHub Actions so PR benchmarks run in a matrix and get aggregated into a bot-posted PR comment.
Changes:
- Add a new `hyperlight-ci` binary with `bench` and `bench-report` subcommands (criterion-swarm execution + criterion-markdown rendering).
- Add a `cargo ci ...` alias and update `Justfile` benchmark targets to use it.
- Extend PR validation workflows to run benchmarks, upload per-matrix markdown reports, and combine them into a single artifact for `hyperlight-gh-bot`.
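The aggregation step in the last bullet can be pictured as concatenating one markdown section per matrix entry under a shared heading. The following is only a minimal sketch under that assumption: `MatrixReport` and `combine_reports` are hypothetical names that do not come from this PR, and the actual workflow may combine its artifacts differently.

```rust
/// One per-matrix report. Hypothetical type, for illustration only.
struct MatrixReport {
    /// Matrix entry label, e.g. "kvm / amd (Linux)".
    label: String,
    /// Markdown body produced for that matrix entry.
    body: String,
}

/// Join all reports into a single comment body. Reports are sorted by label
/// so the output does not depend on the order in which jobs finished.
fn combine_reports(mut reports: Vec<MatrixReport>) -> String {
    reports.sort_by(|a, b| a.label.cmp(&b.label));
    let mut out = String::from("## Benchmark Results\n");
    for report in &reports {
        out.push_str("\n### ");
        out.push_str(&report.label);
        out.push('\n');
        out.push_str(report.body.trim_end());
        out.push('\n');
    }
    out
}

fn main() {
    let reports = vec![
        MatrixReport { label: "mshv3 / amd (Linux)".into(), body: "report B\n".into() },
        MatrixReport { label: "kvm / amd (Linux)".into(), body: "report A\n".into() },
    ];
    print!("{}", combine_reports(reports));
}
```

Sorting before joining keeps the comment stable between runs, which makes successive bot comments easier to compare by eye.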
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/hyperlight_ci/src/main.rs | New CLI entrypoint with bench / bench-report subcommands. |
| src/hyperlight_ci/src/bench.rs | Implements the benchmark runner using criterion-swarm and output mode flags. |
| src/hyperlight_ci/src/bench_report.rs | Generates markdown reports from existing target/criterion results. |
| src/hyperlight_ci/Cargo.toml | New crate manifest and dependencies for the CI tool. |
| Justfile | Switch benchmark recipes to cargo ci bench .... |
| Cargo.toml | Adds src/hyperlight_ci to the workspace members. |
| Cargo.lock | Locks new dependencies (criterion-swarm/criterion-markdown/etc.) and adds hyperlight-ci. |
| .github/workflows/ValidatePullRequest.yml | Adds benchmark matrix job + aggregation job producing the PR comment artifact. |
| .github/workflows/dep_benchmarks.yml | Generates and uploads a benchmark.md report artifact per matrix entry. |
| .github/hyperlight-bot.yml | Configures hyperlight-gh-bot to post the aggregated benchmark comment. |
| .cargo/config.toml | Adds cargo ci alias to run hyperlight-ci. |
    #[arg(long, short, default_value_t = 0)]
    pub jobs: usize,

    /// Build output mode (comma-separated or repeated): spinner, stream, summary, none
    pub build_output: Vec<OutputModeFlags>,

    /// Benchmarks output mode (comma-separated or repeated): spinner, stream, summary, none
    #[arg(long, value_delimiter = ',')]
    pub benchmarks_output: Vec<OutputModeFlags>,
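The doc comments above describe flags whose values may be given either comma-separated or repeated. The sketch below mimics that behaviour using only the standard library, so it runs without clap. `OutputMode` and `parse_output_modes` are hypothetical stand-ins for the PR's `OutputModeFlags` handling, whose real implementation is not shown in this thread.

```rust
use std::str::FromStr;

/// Hypothetical stand-in for the PR's `OutputModeFlags`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputMode {
    Spinner,
    Stream,
    Summary,
    None,
}

impl FromStr for OutputMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.trim() {
            "spinner" => Ok(OutputMode::Spinner),
            "stream" => Ok(OutputMode::Stream),
            "summary" => Ok(OutputMode::Summary),
            "none" => Ok(OutputMode::None),
            other => Err(format!("unknown output mode: {other:?}")),
        }
    }
}

/// Accept both `--x a,b` and `--x a --x b`: every occurrence of the flag is
/// split on commas and the pieces are collected in order. The first invalid
/// piece aborts parsing with an error.
fn parse_output_modes(occurrences: &[&str]) -> Result<Vec<OutputMode>, String> {
    occurrences
        .iter()
        .flat_map(|value| value.split(','))
        .map(|piece| piece.parse::<OutputMode>())
        .collect()
}

fn main() {
    let combined = parse_output_modes(&["spinner,summary"]).unwrap();
    let repeated = parse_output_modes(&["spinner", "summary"]).unwrap();
    // Both spellings yield the same list of modes.
    assert_eq!(combined, repeated);
    println!("{combined:?}");
}
```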
Benchmark Results
kvm / amd (Linux) (❌ *1.77x slower* → 🚀 **2.45x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
kvm / intel (Linux) (❌ *1.79x slower* → 🚀 **13.07x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / amd (Linux) (❌ *2.39x slower* → 🚀 **2.69x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
mshv3 / intel (Linux) (❌ *1.71x slower* → 🚀 **2.54x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / amd (Windows) (❌ *6.21x slower* → ✅ **1.46x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
hyperv-ws2025 / intel (Windows) (❌ *10.31x slower* → ✅ **1.26x faster**)
function_call_serialization
guest_calls
guest_functions_with_large_parameters
sample_workloads
sandboxes
shared_memory
snapshots
Summary
It's not clear what this summary is telling me. Are all of these faster, and are we good to go? I looked at the details for a few, and it looks like it's slower for some and faster for others.
Is this ready for review, @jprendes?
Benchmark Results
kvm / amd (Linux) (❌ *1.15x slower* → 🚀 **2.76x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
kvm / intel (Linux) (✅ **1.07x slower** → 🚀 **11.85x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
mshv3 / amd (Linux) (❌ *1.18x slower* → 🚀 **8.04x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
mshv3 / intel (Linux) (✅ **1.11x slower** → 🚀 **2.85x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
hyperv-ws2025 / amd (Windows) (❌ *1.29x slower* → ✅ **1.46x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
Benchmark Results
kvm / amd (Linux) (❌ *1.13x slower* → 🚀 **2.73x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
kvm / intel (Linux) (✅ **1.08x slower** → 🚀 **11.46x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
mshv3 / amd (Linux) (✅ **1.11x slower** → 🚀 **4.25x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
mshv3 / intel (Linux) (❌ *1.16x slower* → 🚀 **3.13x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
hyperv-ws2025 / amd (Windows) (✅ **1.11x slower** → 🚀 **1.95x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
hyperv-ws2025 / intel (Windows) (❌ *1.23x slower* → ✅ **1.76x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
Introduce a new internal tooling crate (hyperlight-ci) that provides:
- bench subcommand: Runs criterion benchmarks in parallel via criterion-swarm. Features include:
  - Configurable parallelism (-j N, defaults to all P-cores)
  - Configurable output modes (spinner, stream, summary)
  - Support for pre-built binaries (--binary) to skip rebuilds
  - Trailing args forwarded to criterion (filter, --exact, etc.)
- bench-report subcommand: Generates markdown comparison tables from criterion's target/criterion/ JSON output via criterion-markdown. Features include:
  - Benchmark discovery via criterion-swarm
  - Optional allowlist filtering via --binary or trailing args
  - Output to stdout

This replaces ad-hoc benchmark scripting with a unified tool suitable for both local development and CI report generation.

Signed-off-by: Jorge Prendes <jorge.prendes@gmail.com>
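The commit message says `-j N` defaults to all P-cores, and the reviewed snippet gives `jobs` a default of `0`. One way such a default could be resolved is sketched below, assuming `0` means "choose automatically". The standard library cannot tell performance cores from efficiency cores, so `std::thread::available_parallelism` is used here as a stand-in. `resolve_jobs` is a hypothetical helper, not the PR's code.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Resolve the effective number of parallel jobs.
/// Assumption: `0` means "pick automatically".
fn resolve_jobs(requested: usize) -> usize {
    match NonZeroUsize::new(requested) {
        // An explicit, non-zero request is used as-is.
        Some(n) => n.get(),
        // Otherwise ask the OS; fall back to one job if it cannot tell us.
        None => thread::available_parallelism()
            .map(NonZeroUsize::get)
            .unwrap_or(1),
    }
}

fn main() {
    println!("-j 4 -> {} jobs", resolve_jobs(4));
    println!("-j 0 -> {} jobs", resolve_jobs(0));
}
```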
- Add cargo alias (`cargo ci`) for convenient hyperlight-ci invocation
- Update dep_benchmarks workflow to use `cargo ci bench` and generate a markdown report via `cargo ci bench-report`, posting results as a PR comment per hypervisor/cpu matrix entry
- Add benchmarks job to ValidatePullRequest workflow with hypervisor and cpu matrix, gated behind docs-only and build-guests checks
- Grant pull-requests: write permission for PR comment posting
- Simplify Justfile bench recipes to delegate to `cargo ci bench`
- Update benchmarking docs to reflect the new workflow

Signed-off-by: Jorge Prendes <jorge.prendes@gmail.com>
No, some of the benchmarks were slower, and some were faster, e.g.:
mshv3 / intel (Linux) (❌ *1.71x slower* → 🚀 **2.54x faster**)
For that platform / hypervisor combination, there was at least one benchmark that was 1.71x slower, and at least one that was 2.54x faster. The values between brackets show the "spread" in benchmark results for that platform. I guess it could be confused with <before> → <after>. I'm open to ideas on how to improve it :-)
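The "spread" described above can be sketched as taking the worst slowdown and the best speedup across one platform's benchmarks. The sketch assumes each benchmark yields a positive ratio `new_time / baseline_time`, where a value above 1.0 means slower. `spread` and `describe` are hypothetical names, and the rule the bot uses to pick its emoji is not stated in this thread, so it is left out.

```rust
/// Returns (worst slowdown factor, best speedup factor) for a set of
/// positive ratios `new_time / baseline_time`. Both factors are at least
/// 1.0: if nothing regressed the slowdown is 1.0, and likewise for speedup.
fn spread(ratios: &[f64]) -> Option<(f64, f64)> {
    if ratios.is_empty() {
        return None;
    }
    // Largest ratio is the worst regression.
    let worst = ratios.iter().copied().fold(1.0_f64, f64::max);
    // Smallest ratio is the best improvement; invert it to get "N x faster".
    let best = ratios.iter().copied().fold(1.0_f64, f64::min);
    Some((worst, 1.0 / best))
}

/// Format the spread in the same "slower → faster" shape as the summary line.
fn describe(ratios: &[f64]) -> String {
    match spread(ratios) {
        Some((slower, faster)) => format!("{slower:.2}x slower → {faster:.2}x faster"),
        None => String::from("no benchmarks"),
    }
}

fn main() {
    // Three hypothetical benchmarks: one regressed, one unchanged, one improved.
    let ratios = [1.71, 1.0, 1.0 / 2.54];
    println!("{}", describe(&ratios));
}
```

Read this way, the two numbers are the extremes of a single run and not a before/after pair, which is the distinction the reply is drawing.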
Benchmark Results
kvm / amd (Linux) (❌ *1.83x slower* → 🚀 **2.45x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshot_files
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
kvm / intel (Linux) (❌ *1.74x slower* → 🚀 **12.06x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshot_files
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
mshv3 / amd (Linux) (❌ *2.04x slower* → 🚀 **2.96x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshot_files
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
mshv3 / intel (Linux) (❌ *1.37x slower* → 🚀 **2.27x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshot_files
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
hyperv-ws2025 / amd (Windows) (❌ *5.42x slower* → ✅ **1.45x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshot_files
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary
hyperv-ws2025 / intel (Windows) (❌ *4.82x slower* → ✅ **1.32x faster**)
alloc_fragmented
alloc_lifo
alloc_single
free
free_list_reuse
function_call_serialization
guest_calls
guest_functions_with_large_parameters
recycle_pool
sample_workloads
sandboxes
segmented_payload
shared_memory
snapshot_files
snapshots
virtq_readonly_allocator_strategy
virtq_readwrite_allocator_strategy
Summary