PEP 831: Frame pointers#4896

Open
pablogsal wants to merge 22 commits into python:main from
pablogsal:frame-pointers

Member

@pablogsal pablogsal commented Apr 9, 2026

Basic requirements (all PEP Types)

  • Read and followed PEP 1 & PEP 12
  • File created from the latest PEP template
  • PEP has next available number, & set in filename (pep-NNNN.rst), PR title (PEP 123: <Title of PEP>) and PEP header
  • Title clearly, accurately and concisely describes the content in 79 characters or less
  • Core dev/PEP editor listed as Author or Sponsor, and formally confirmed their approval
  • Author, Status (Draft), Type and Created headers filled out correctly
  • PEP-Delegate, Topic, Requires and Replaces headers completed if appropriate
  • Required sections included
    • Abstract (first section)
    • Copyright (last section; exact wording from template required)
  • Code is well-formatted (PEP 7/PEP 8) and is in code blocks, with the right lexer names if non-Python
  • PEP builds with no warnings, pre-commit checks pass and content displays as intended in the rendered HTML
  • Authors/sponsor added to .github/CODEOWNERS for the PEP

Standards Track requirements

  • PEP topic discussed in a suitable venue with general agreement that a PEP is appropriate
  • Suggested sections included (unless not applicable)
    • Motivation
    • Specification
    • Rationale
    • Backwards Compatibility
    • Security Implications
    • How to Teach This
    • Reference Implementation
    • Rejected Ideas
    • Open Issues
    • Acknowledgements
    • Footnotes
    • Change History
  • Python-Version set to valid (pre-beta) future Python version, if relevant
  • Any project stated in the PEP as supporting/endorsing/benefiting from the PEP formally confirmed such
  • Right before or after initial merging, PEP discussion thread created and linked to in Discussions-To and Post-History

📚 Documentation preview 📚: https://pep-previews--4896.org.readthedocs.build/pep-0831/

@pablogsal pablogsal requested review from a team and brettcannon as code owners April 9, 2026 22:01
@pablogsal pablogsal changed the title from "Frame pointers" to "PEP 830: Frame pointers" Apr 9, 2026
@brianschubert brianschubert added the new-pep label (A new draft PEP submitted for initial review) Apr 9, 2026
Member

@savannahostrowski savannahostrowski left a comment

Given the numbers we've seen, we should update 0.5-2%...if we're rounding up?

@pablogsal pablogsal changed the title from "PEP 830: Frame pointers" to "PEP 831: Frame pointers" Apr 9, 2026
Member

@hugovk hugovk left a comment

Formatting suggestions.

Member

@markshannon markshannon left a comment

Very compelling.

I've added a few quick comments on the first part of the PEP.

This is long, and might be a bit much to expect anyone to read through in one go.
Maybe some of the supporting analyses could be moved into appendices and linked to from the body of the PEP?

For example, the detailed, but long, analyses of perf and eBPF tools could be moved to appendices.

above) prevents these tools from producing useful call stacks for Python
processes, and undermines the perf trampoline support CPython shipped in 3.12.

The measured overhead is under 2% geometric mean for typical workloads.
Member

For what platform, build configuration, etc.?
Can you link to the data? Otherwise readers' (at least this reader's) first thought is "where's the evidence?"

Member Author

Added a cross-reference to the Backwards Compatibility section which has the full per-platform table (8 machines across arm64 and x86-64, geometric mean across 80 pyperformance benchmarks). The abstract is intentionally concise but now points readers to the evidence.
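
For readers unfamiliar with how a single "geometric mean" overhead figure is derived from a benchmark suite, here is a minimal sketch. The benchmark names and ratios below are invented for illustration and are not the PEP's measurements; only `statistics.geometric_mean` is a real standard-library call.

```python
from statistics import geometric_mean

# Hypothetical per-benchmark time ratios: (frame-pointer build) / (baseline build).
# A ratio of 1.02 means the frame-pointer build was 2% slower on that benchmark.
# These values are made up for illustration; they are NOT the PEP's data.
ratios = {
    "bench_a": 1.010,
    "bench_b": 1.025,
    "bench_c": 0.995,
    "bench_d": 1.018,
}


def geomean_overhead_percent(time_ratios):
    """Return the geometric-mean slowdown as a percentage."""
    return (geometric_mean(list(time_ratios)) - 1.0) * 100.0


overhead = geomean_overhead_percent(ratios.values())
print(f"geometric mean overhead: {overhead:.2f}%")
```

The geometric mean is the usual choice for averaging ratios because a 2x slowdown on one benchmark and a 2x speedup on another cancel out exactly, which an arithmetic mean would not do.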

==========

Python's observability story (profiling, debugging, and system-level tracing)
is fundamentally limited by the absence of frame pointers. The core motivation
Member

This isn't quite true. It isn't impossible without frame pointers, just much, much harder.

Member Author

The text says "fundamentally limited", not impossible, no?

The performance wins that profiling enables far outweigh the modest overhead of
frame pointers. As Brendan Gregg notes: "I've seen frame pointers help find
performance wins ranging from 5% to 500%" [#gregg2024]_. A 0.5-2% overhead that
unlocks the ability to find 5-500% improvements is a favourable trade.
Member

Make it clear that large speedups are not going to happen for CPython, but might happen for larger systems using Python as a glue language.

Member Author

Good catch. Clarified that the 5-500% wins Gregg describes come from identifying hot paths in production systems, not from enabling frame pointers on CPython itself.

frame also stores the *previous* frame pointer, creating a linked list through
the entire call stack::

┌──────────────────┐
Member

Have the arrows actually point to the previous frame. I think it is OK to use an image, instead of ASCII art.

Member Author

Yeah perhaps an image may be better.

Just to clarify: The arrows do point to the previous frame as each saved %rbp ──► points to the frame above it (i.e. its caller), which is the correct direction for a frame-pointer chain. The chain walks up (toward main). I can look into replacing with an image if the ASCII art is unclear, but it would complicate the diff/review cycle. Open to suggestions.
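
To make the direction of the chain concrete, here is a small simulation of the walk described above. It is a sketch under stated assumptions: the addresses, return addresses, and the `main`/`foo`/`bar` frame names are invented, and the layout (saved frame pointer at `fp`, return address one word above it) follows the common x86-64 convention rather than anything specific to this PR.

```python
# Simulated stack memory: address -> 8-byte word.
# All addresses and return addresses are invented for illustration.
memory = {
    0x7000: 0x0,     # main's frame: saved frame pointer (0 ends the chain)
    0x7008: 0x4010,  # return address into the startup code
    0x6FE0: 0x7000,  # foo's frame: saved frame pointer -> main's frame
    0x6FE8: 0x4120,  # return address into main
    0x6FC0: 0x6FE0,  # bar's frame: saved frame pointer -> foo's frame
    0x6FC8: 0x4230,  # return address into foo
}

WORD = 8  # bytes per stack slot on a 64-bit target


def walk_frame_pointers(memory, fp, max_depth=128):
    """Follow saved frame pointers from the innermost frame outward."""
    return_addresses = []
    while fp != 0 and len(return_addresses) < max_depth:
        # The return address sits one word above the saved frame pointer.
        return_addresses.append(memory[fp + WORD])
        # Each saved frame pointer leads to the caller's frame.
        fp = memory[fp]
    return return_addresses


stack = walk_frame_pointers(memory, fp=0x6FC0)
print([hex(addr) for addr in stack])
```

Starting from the innermost frame (`bar`), each step reads two words and moves to the caller, so the walk ends at `main` after three iterations. That is the "walks up (toward main)" behaviour the diagram is meant to convey.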

default [#gcc_fomit]_. This frees the ``%rbp`` register for general use,
giving the optimiser one more register to work with. On x86-64 this is a gain
of one register out of 16 (about 7%). The performance benefit is small
(typically 0.5-2%) but it was considered worthwhile when the convention was
Member

Again, you need to be more nuanced about how much things slow down, and for which platforms.

I'd use terms like "a few" rather than "0.5-2%" and add a link to a section with a more detailed breakdown.

Member Author

Changed to "typically a few percent" and added a cross-reference to the detailed analysis section which has the full platform-by-platform breakdown.

established for 32-bit x86, where the gain was one register out of 6 (~20%).

Without frame pointers, the linked list does not exist. Tools that need to
walk the call stack must instead parse DWARF debug information (a complex,
Member

Don't forget Windows. It would make your case more compelling, as tools would need to parse DWARF and understand what Windows does.

Member Author

Good point. Added a mention of Windows .pdata / .xdata unwind metadata alongside DWARF.

follows the chain, and reconstructs the full call stack in microseconds. This
is fast enough to sample at 10,000 Hz in production with negligible overhead.

Without frame pointers, the profiler must fall back to DWARF unwinding, a
Member

Member Author

The earlier mention (in "What Are Frame Pointers?") is a brief general overview; this section ("Profilers Are Slower...") dives into the specific profiler impact with data (55x larger perf.data, 210x slower unwinding, etc.). Some overlap is intentional: the first establishes the concept, the second builds the quantitative case. I could trim the first mention further if you think it's too redundant.
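
As a rough back-of-the-envelope illustration of why per-sample unwinding cost matters at the 10,000 Hz rate quoted in the PEP text, consider the sketch below. The 1 microsecond frame-pointer walk cost is an assumed placeholder, and applying the "210x slower unwinding" figure from this thread directly to the per-sample cost is a simplification made only for the arithmetic.

```python
def sampling_overhead(sample_rate_hz, unwind_cost_us):
    """Fraction of one CPU core spent unwinding at the given sample rate."""
    return sample_rate_hz * unwind_cost_us / 1_000_000


RATE_HZ = 10_000    # sampling rate quoted in the PEP text
FP_UNWIND_US = 1.0  # ASSUMED cost of one frame-pointer walk (hypothetical)
SLOWDOWN = 210      # "210x slower unwinding" figure cited in this thread

fp_fraction = sampling_overhead(RATE_HZ, FP_UNWIND_US)
dwarf_fraction = sampling_overhead(RATE_HZ, FP_UNWIND_US * SLOWDOWN)

print(f"frame pointers: {fp_fraction:.1%} of one core")
print(f"{SLOWDOWN}x slower unwinder: {dwarf_fraction:.1%} of one core")
```

At 10,000 Hz there are only 100 microseconds between samples, so under these assumptions an unwinder that is two orders of magnitude slower would need more than the whole sampling period per stack.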

@pablogsal pablogsal requested a review from CAM-Gerlach as a code owner April 10, 2026 18:53
Co-authored-by: Mark Shannon <Mark.Shannon@arm.com>
Co-authored-by: Hugo van Kemenade <1324225+hugovk@users.noreply.github.com>
Member Author

pablogsal commented Apr 11, 2026

Thanks for the thorough review, Mark! Addressed all the inline comments (cross-references to data, clarified the Gregg quote, softened language, added Windows mention, etc.). Re: length and moving content to appendices, that is a fair point. I will move the eBPF and perf investigation to the appendix.

args: [--fix=lf]
- id: trailing-whitespace
name: "Remove trailing whitespace"
exclude: .*\.svg
Member

@Fidget-Spinner Did this break the SVGs? They're code, so it's usually okay to allow this.

Suggested change
exclude: .*\.svg

Member

Yes, they break the SVGs.

Member

Wait, let me correct that: I'm not sure if they break them, but it's very disruptive to reformat SVGs.
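
For anyone checking what the `exclude: .*\.svg` pattern in this hunk would skip, here is a minimal sketch. It assumes the pattern is applied as an unanchored Python regex search against repository-relative paths, and the file paths are hypothetical examples rather than files from this PR.

```python
import re

# Pattern copied from the hook configuration under discussion.
EXCLUDE = re.compile(r".*\.svg")


def is_excluded(path):
    """Mimic an unanchored regex search against a repository-relative path."""
    return EXCLUDE.search(path) is not None


# Hypothetical paths, chosen only to exercise the pattern.
paths = [
    "peps/pep-0831/chain.svg",
    "peps/pep-0831.rst",
    "notes/diagram.svg.txt",
]
for path in paths:
    print(path, is_excluded(path))
```

Because the pattern is not anchored, a name such as `diagram.svg.txt` would also be skipped under this assumption; a pattern like `\.svg$` would limit the exclusion to files that actually end in `.svg`.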

Labels

new-pep A new draft PEP submitted for initial review

8 participants