Skip to content

gh-90533: Implement BytesIO.peek()#30808

Open
marcelm wants to merge 32 commits intopython:mainfrom
marcelm:fix-issue-46375
Open

gh-90533: Implement BytesIO.peek()#30808
marcelm wants to merge 32 commits intopython:mainfrom
marcelm:fix-issue-46375

Conversation

@marcelm
Copy link
Copy Markdown

@marcelm marcelm commented Jan 22, 2022

Closes gh-90533

@the-knights-who-say-ni

This comment was marked as outdated.

@marcelm marcelm changed the title bpo-46375: Implement BytesIO.peek() gh-90533: Implement BytesIO.peek() Apr 11, 2022
@marcelm

This comment was marked as outdated.

@bedevere-bot
Copy link
Copy Markdown

Most changes to Python require a NEWS entry.

Please add it using the blurb_it web app or the blurb command-line tool.

@marcelm marcelm force-pushed the fix-issue-46375 branch 2 times, most recently from 6998be1 to be39ff2 Compare November 6, 2022 15:11
@marcelm
Copy link
Copy Markdown
Author

marcelm commented Nov 9, 2022

@AlexWaygood You’ve been the only human to interact with this PR so far, do you possibly have any advice on how to move this forward?

@AlexWaygood AlexWaygood added the type-feature A feature request or enhancement label Nov 9, 2022
@AlexWaygood
Copy link
Copy Markdown
Member

AlexWaygood commented Nov 9, 2022

Hi @marcelm — sorry for the delay in anybody looking at this. I haven't studied your PR in detail (or thought about whether the proposal is a good idea), but it looks well put together at first glance.

I'll try to take a look soon. I won't be able to review the C code, but I can comment on whether the proposal seems like a good idea, and I can review the Python implementation and the tests.

Note that if this proposal is accepted, it will also need:

  • Updates to the docs for BytesIO (Doc/library/io.rst)
  • An entry in "What's new in Python 3.12" (Doc/whatsnew/3.12.rst)

You can also add yourself to Misc/ACKS as part of this PR, and give yourself credit in the NEWS entry, if you like. (Neither is compulsory.)

@AlexWaygood AlexWaygood self-requested a review November 9, 2022 13:37
@marcelm
Copy link
Copy Markdown
Author

marcelm commented Nov 9, 2022

Thank you! I pushed a documentation update and will add an entry to the What’s new document in case the PR is reviewed favorably.

@AlexWaygood
Copy link
Copy Markdown
Member

I'll defer to @benjaminp's judgement on this one -- I'm really not the right reviewer for this, unfortunately :(

Please ping me again if you still haven't had a review in a few weeks.

@AlexWaygood AlexWaygood removed their request for review November 28, 2022 12:26
@marcelm
Copy link
Copy Markdown
Author

marcelm commented Nov 28, 2022

Thank you! I appreciate that you took the time.

@awalgarg
Copy link
Copy Markdown

awalgarg commented Jul 8, 2023

Please ping me again if you still haven't had a review in a few weeks.

@AlexWaygood Hi! Been 8 months, no review so far :) How can we take this forward? I'm not the original author of the PR but I also have a use-case for this API and would like to see this change land, happy to help progress this.

@marcelm
Copy link
Copy Markdown
Author

marcelm commented Jul 8, 2023

Thanks for your interest! I have updated the PR to fix the merge conflict and to reflect that it now needs to target 3.13.

@awalgarg
Copy link
Copy Markdown

awalgarg commented Jul 8, 2023

@marcelm That was quick, thank you so much!

@AA-Turner
Copy link
Copy Markdown
Member

cc: @erlend-aasland @vstinner @benjaminp for the io C code.

A

.. versionadded:: 3.13

Return bytes from the current position onwards but without advancing the
position. The number of bytes returned may be less or more than requested.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait. Why can it more than requested?

Apparently size=0 means "read all". Please document it.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was just copied from the documentation for BufferedReader.peek(), but I agree that it could be written better. Changed now.

Wait. Why can it more than requested?

BufferedReader.peek() ignores the size argument and just returns whatever it has in its internal buffer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer that BytesIO.peek() documents its behavior, rather than BufferedReader.peek() behavior.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely, it’s already changed.

(BTW, is there a policy on who clicks the "Resolve conversation" button?)

Lib/_pyio.py Outdated
pos = self.tell()
if size == 0:
size = -1
b = self.read(size)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to implement it without touching the position? This code is not thread safe. I don't know if it's supposed to be thread-safe. Maybe add a private read method which has an argument to decide to move the position or not.

Same remark for C code.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point; I implemented this now so that the position is not changed. I factored out a peek_bytes function in the C version that does not advance the position.

It seems though that neither the C nor the Python version of BytesIO are supposed to be thread safe. (They don’t use locks as e.g. BufferedReader does.) So I would suggest making them so would be a task for a different PR.

marcelm and others added 23 commits April 10, 2026 23:19
Semantic change: The default argument for peek is now size=1.
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
Co-authored-by: Erlend E. Aasland <erlend.aasland@protonmail.com>
@marcelm marcelm requested a review from AA-Turner as a code owner April 10, 2026 21:37
Copy link
Copy Markdown
Member

@vstinner vstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked BytesIO.peek(0) to return an empty string, but I didn't notice that BufferedReader.peek(0) returns non-empty string. Now I'm unhappy that the two classes have an inconsistent behavior on peek(0), sorry.

>>> f=open("/etc/issue", "rb")
>>> f.peek(0)
b'\\S\nKernel \\r on \\m (\\l)\n\n'

On a short file, BufferedReader.peek(0) even returned the whole file contents!

Since BufferedReader.peek(0) behavior already exists, I would suggest to use the same behavior on BytesIO.peek(): return non-empty string.

cc @cmaloney

marcelm and others added 2 commits April 13, 2026 13:46
@marcelm
Copy link
Copy Markdown
Author

marcelm commented Apr 13, 2026

Since BufferedReader.peek(0) behavior already exists, I would suggest to use the same behavior on BytesIO.peek(): return non-empty string.

Sure! I have changed it so that the rest of the buffer is returned for size=0.

The signature is still such that the default is size=1. Maybe that is actually a good compromise?

  • If desired, an explicit peek(0) returns a full (possibly expensive) copy - but the default is cheap.
  • peek() without arguments returns a single byte, which is the minimum necessary for compatibility with BufferedReader.peek(). Note that the latter also can return a single byte when that is how much is left in the buffer:
$ python3 -c 'f=open("README.rst", "rb");f.read(8191);print(f.peek())'
b'r'
  • If more than a single byte but not the full buffer is needed, an explicit size can be set.

Alternatively, the default could become size=0, but then the question is how huch of the buffer to return (which is where this PR got stalled last time).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting merge stale Stale PR or inactive for long period of time. topic-IO type-feature A feature request or enhancement

Projects

None yet

Development

Successfully merging this pull request may close these issues.

io.BytesIO does not have peek()

10 participants