Commit 939b03e

Use the :abbr: role for BMP (Basic Multilingual Plane)
1 parent 2f8f569 commit 939b03e

7 files changed

Lines changed: 12 additions & 7 deletions

Doc/library/idle.rst

Lines changed: 1 addition & 1 deletion
@@ -864,7 +864,7 @@ A Windows console, for instance, keeps a user-settable 1 to 9999 lines,
 with 300 the default.

 A Tk Text widget, and hence IDLE's Shell, displays characters (codepoints) in
-the BMP (Basic Multilingual Plane) subset of Unicode. Which characters are
+the :abbr:`BMP (Basic Multilingual Plane)` subset of Unicode. Which characters are
 displayed with a proper glyph and which with a replacement box depends on the
 operating system and installed fonts. Tab characters cause the following text
 to begin after the next tab stop. (They occur every 8 'characters'). Newline

Doc/library/pyexpat.rst

Lines changed: 2 additions & 1 deletion
@@ -76,7 +76,8 @@ The :mod:`!xml.parsers.expat` module contains two functions:
 For other encodings (including aliases like Latin1 and ASCII) it
 falls back to Python.
 It supports most of 8-bit encodings and many multi-byte encodings
-like Shift_JIS, although only BMP characters (``U+0000-U+FFFF``)
+like Shift_JIS, although only the :abbr:`BMP (Basic Multilingual Plane)`
+characters (U+0000 through U+FFFF)
 are supported with non-native encodings (this restriction is also
 applied to aliases like UTF8).
 These restrictions only apply if *encoding* is not given.
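
As a rough illustration of the BMP restriction described in this hunk, the sketch below lists the characters in a string that fall outside ``U+0000`` through ``U+FFFF``. The names `BMP_MAX` and `non_bmp_characters` are hypothetical helpers, not part of `xml.parsers.expat`, and the code does not call the parser itself.

```python
# Highest code point in the Basic Multilingual Plane.
BMP_MAX = 0xFFFF


def non_bmp_characters(text):
    """Return the characters of `text` that lie outside the BMP."""
    return [ch for ch in text if ord(ch) > BMP_MAX]


if __name__ == "__main__":
    # U+20AC (euro sign) is inside the BMP; U+1F600 (an emoji) is not.
    for sample in ("a\u00e9\u20ac", "a\U0001F600b"):
        outside = [f"U+{ord(ch):04X}" for ch in non_bmp_characters(sample)]
        print(ascii(sample), "->", outside or "BMP only")
```

A check like this could be run on document text before choosing one of the partially supported encodings, assuming the restriction applies as the changed paragraph states.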

Doc/whatsnew/3.16.rst

Lines changed: 2 additions & 1 deletion
@@ -115,7 +115,8 @@ xml
 * Add support for multiple multi-byte encodings in the :mod:`XML parser
 <xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP",
 "GB2312", "GBK", "johab", and "Shift_JIS".
-Add partial support (only BMP characters) for multi-byte encodings
+Add partial support (only the :abbr:`BMP (Basic Multilingual Plane)`
+characters) for multi-byte encodings
 "Big5-HKSCS", "EUC_JIS-2004", "EUC_JISX0213", "Shift_JIS-2004",
 "Shift_JISX0213", "utf-8-sig" and non-standard aliases like "UTF8"
 (without hyphen).

Doc/whatsnew/3.3.rst

Lines changed: 2 additions & 1 deletion
@@ -262,7 +262,8 @@ The storage of Unicode strings now depends on the highest code point in the stri

 * pure ASCII and Latin1 strings (``U+0000-U+00FF``) use 1 byte per code point;

-* BMP strings (``U+0000-U+FFFF``) use 2 bytes per code point;
+* :abbr:`BMP (Basic Multilingual Plane)` strings (``U+0000-U+FFFF``) use
+2 bytes per code point;

 * non-BMP strings (``U+10000-U+10FFFF``) use 4 bytes per code point.
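
The per-code-point storage sizes listed in this hunk can be observed on CPython with a small measurement. This is a minimal sketch, assuming CPython 3.3 or later; `bytes_per_code_point` is a hypothetical helper, and `sys.getsizeof` reports implementation-specific sizes, so other interpreters may give different numbers.

```python
import sys


def bytes_per_code_point(ch, n=1000):
    """Estimate the storage used per code point for strings made of `ch`.

    Compares a 1-character string with an (n + 1)-character string so the
    fixed object header cancels out. CPython-specific.
    """
    short = ch
    long = ch * (n + 1)
    return (sys.getsizeof(long) - sys.getsizeof(short)) // n


if __name__ == "__main__":
    samples = {
        "ASCII (U+0061)": "a",
        "Latin-1 (U+00E9)": "\u00e9",
        "BMP (U+20AC)": "\u20ac",
        "non-BMP (U+1F600)": "\U0001F600",
    }
    for label, ch in samples.items():
        print(f"{label}: {bytes_per_code_point(ch)} byte(s) per code point")
```

The difference between two lengths is used rather than a single `getsizeof` call because the string header size varies between the internal representations.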

Doc/whatsnew/3.4.rst

Lines changed: 2 additions & 1 deletion
@@ -418,7 +418,8 @@ Some smaller changes made to the core Python language are:
 * All the UTF-\* codecs (except UTF-7) now reject surrogates during both
 encoding and decoding unless the ``surrogatepass`` error handler is used,
 with the exception of the UTF-16 decoder (which accepts valid surrogate pairs)
-and the UTF-16 encoder (which produces them while encoding non-BMP characters).
+and the UTF-16 encoder (which produces them while encoding characters that
+are not in the :abbr:`BMP (Basic Multilingual Plane)`).
 (Contributed by Victor Stinner, Kang-Hao (Kenny) Lu and Serhiy Storchaka in
 :issue:`12892`.)
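
The surrogate behaviour described in this hunk can be sketched as follows, assuming Python 3.4 or later. `utf16_code_units` and `encodes_strictly` are hypothetical helper names introduced only for this illustration.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units produced when encoding `text`."""
    data = text.encode("utf-16-be")
    return list(struct.unpack(f">{len(data) // 2}H", data))


def encodes_strictly(text, codec="utf-8"):
    """Return True if `text` encodes with the default (strict) error handler."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    # A BMP character needs one code unit; a non-BMP character needs a
    # surrogate pair (two code units).
    print([hex(u) for u in utf16_code_units("\u20ac")])
    print([hex(u) for u in utf16_code_units("\U0001F600")])

    # A lone surrogate is rejected unless surrogatepass is requested.
    lone = "\ud800"
    print(encodes_strictly(lone))
    print(lone.encode("utf-8", "surrogatepass"))
```

Big-endian UTF-16 without a byte order mark is chosen here only so the code units can be unpacked directly.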

Doc/whatsnew/3.8.rst

Lines changed: 2 additions & 1 deletion
@@ -868,7 +868,8 @@ window are shown and hidden in the Options menu.
 (Contributed by Tal Einat and Saimadhav Heblikar in :issue:`17535`.)

 OS native encoding is now used for converting between Python strings and Tcl
-objects. This allows IDLE to work with emoji and other non-BMP characters.
+objects. This allows IDLE to work with emoji and other characters that are not
+in the :abbr:`BMP (Basic Multilingual Plane)`.
 These characters can be displayed or copied and pasted to or from the
 clipboard. Converting strings from Tcl to Python and back now never fails.
 (Many people worked on this for eight years but the problem was finally

Misc/NEWS.d/next/Library/2026-05-14-17-01-19.gh-issue-62259.ytlFD5.rst

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 Add support for multiple multi-byte encodings in the :mod:`XML parser
 <xml.parsers.expat>`: "cp932", "cp949", "cp950", "Big5","EUC-JP", "GB2312",
-"GBK", "johab", and "Shift_JIS". Add partial support (only BMP characters)
+"GBK", "johab", and "Shift_JIS". Add partial support (only the BMP characters)
 for multi-byte encodings "Big5-HKSCS", "EUC_JIS-2004", "EUC_JISX0213",
 "Shift_JIS-2004", "Shift_JISX0213", "utf-8-sig" and non-standard aliases
 like "UTF8" (without hyphen). The parser now raises :exc:`ValueError` for
