Commit 941880e

Fidget-Spinner authored and pablogsal committed
Add perf figures, tighten JIT claims
1 parent 2f241a5 commit 941880e

3 files changed

Lines changed: 42255 additions & 13 deletions

peps/pep-0830.rst

Lines changed: 25 additions & 13 deletions
@@ -405,18 +405,18 @@ The JIT Compiler Needs Frame Pointers to Be Debuggable
 ------------------------------------------------------
 
 CPython's copy-and-patch JIT (:pep:`744`) generates native machine code at
-runtime. Without frame pointers in the interpreter, stack unwinding through
+runtime. Without reserved frame pointers in the JIT code, stack unwinding through
 JIT frames is broken for virtually every tool in the ecosystem: GDB, LLDB,
 libunwind, libdw (elfutils), py-spy, Austin, pystack, memray, ``perf``, and
 all eBPF-based profilers. Ensuring full-stack observability for JIT-compiled
 code is a prerequisite for the JIT to be considered production-ready.
 
 Individual JIT stencils do not need frame-pointer prologues; the entire JIT
 region can be treated as a single frameless region for unwinding purposes.
-What matters is that the interpreter itself is built with frame pointers, so
+What matters is that the JIT itself must reserve frame pointers, so
 that the frame-pointer register (``%rbp`` on x86-64, ``x29`` on AArch64) is
 reserved and not clobbered by stencil code. With frame pointers in the
-interpreter, unwinders can walk through JIT regions without needing to inspect
+JIT, most unwinders can walk through JIT regions without needing to inspect
 individual stencils. This is a remarkably good outcome compared to other
 JIT compilers (V8, LuaJIT, .NET CoreCLR, Julia, LLVM's ORC JIT), which
 typically require hundreds to thousands of lines of code to implement custom
@@ -432,7 +432,7 @@ The shift toward frame pointers has already happened independently of CPython
 upstream, and at massive scale.
 
 ``python-build-standalone``, the hermetic Python distribution used by ``uv``,
-``mise``, ``rye``, and many CI systems, enabled ``-fno-omit-frame-pointer`` on
+``mise``, ``rye``, and many CI systems, enabled ``-fno-omit-frame-pointer`` on
 all x86-64 and AArch64 Linux builds in early 2026 and shipped in ``uv`` 0.11.0
 [#pbs992]_. Gregory Szorc, the project's creator, stated: "Frame pointers
 should be enabled on 100% of x86-64 / aarch64 binaries in 2026. Full stop." He
@@ -840,14 +840,14 @@ pyperformance JSON files can be found in
 ===================================== =======================
 Machine                               Geometric mean overhead
 ===================================== =======================
-Apple M2 Mac Mini (arm64)             1.01x slower
-Intel Xeon Platinum 8480 (x86-64)     1.01x slower
-AMD EPYC 9654 (x86-64)                1.01x slower
-AWS Graviton c7g.16xlarge (aarch64)   1.02x slower
-Ampere Altra Max (aarch64)            1.01x slower
-Raspberry Pi (aarch64)                1.00x slower
-macOS M3 Pro (arm64)                  1.00x slower
-Intel i7 12700H (x86-64)              1.02x slower
+Apple M2 Mac Mini (arm64)             1.006x slower
+macOS M3 Pro (arm64)                  1.001x slower
+Raspberry Pi (aarch64)                1.002x slower
+Ampere Altra Max (aarch64)            1.020x slower
+AWS Graviton c7g.16xlarge (aarch64)   1.027x slower
+Intel i7 12700H (x86-64)              1.019x slower
+AMD EPYC 9654 (x86-64)                1.008x slower
+Intel Xeon Platinum 8480 (x86-64)     1.006x slower
 ===================================== =======================
 
 This overhead applies to both the interpreter and to C extensions that inherit
@@ -1048,7 +1048,19 @@ Footnotes
 Appendix
 ========
 
-# TODO: KJ, once we have Diego's results.
+For all graphs below, the green dots are geometric means of the
+individual benchmarks' medians, while orange lines are the median of our data points.
+
+The first graph is the overall effect on pyperformance seen on each system.
+Apart from the Ubuntu AWS Graviton system, all system configurations have below 2%
+geometric mean and median slowdown:
+
+.. image:: pep-830_perf_over_baseline.svg
+
+For individual benchmark results, see the following:
+
+.. image:: pep-830_perf_over_baseline_indiv.svg
+
 
 Copyright
 =========
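
As a rough illustration of how a "geometric mean overhead" figure like those in the table above, and the median shown alongside it in the graphs, could be derived from per-benchmark timings, here is a minimal Python sketch. The benchmark names and timings are made-up placeholders, not data from the pyperformance runs, and the helper functions are hypothetical; the analysis actually used for the PEP may differ.

```python
from statistics import geometric_mean, median

# Hypothetical per-benchmark timings in seconds (NOT data from the PEP).
baseline = {"bench_a": [1.00, 1.02, 0.98], "bench_b": [2.00, 2.10, 1.90]}
with_fp = {"bench_a": [1.01, 1.03, 0.99], "bench_b": [2.04, 2.14, 1.94]}


def slowdown_ratios(base, new):
    """Ratio of median timings per benchmark (new / base); > 1 means slower."""
    return {name: median(new[name]) / median(base[name]) for name in base}


def summarize(base, new):
    """Return (geometric mean, median) of the per-benchmark slowdown ratios."""
    ratios = list(slowdown_ratios(base, new).values())
    return geometric_mean(ratios), median(ratios)


geo, med = summarize(baseline, with_fp)
print(f"geometric mean: {geo:.3f}x slower, median: {med:.3f}x slower")
```

The geometric mean is the usual choice for averaging ratios, because a 2x slowdown on one benchmark and a 2x speedup on another cancel out to 1.0x, which an arithmetic mean would not give.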
