Commit e812928

Merge tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:

 - Introduce cxl_memdev_attach and pave way for soft reserved handling,
   type2 accelerator enabling, and LSA 2.0 enabling. All these series
   require the endpoint driver to settle before continuing the memdev
   driver probe.

 - Address CXL port error protocol handling and reporting. The large
   patch series was split into three parts. The first two parts are
   included here with the final part coming later.

   The first part consists of a series of code refactoring to PCI AER
   sub-system that addresses CXL and also CXL RAS code to prepare for
   port error handling.

   The second part refactors the CXL code to move management of
   component registers to cxl_port objects to allow all CXL AER errors
   to be handled through the cxl_port hierarchy.

 - Provide AMD Zen5 platform address translation for CXL using ACPI
   PRMT. This includes a conventions document to explain why this is
   needed and how it's implemented.

 - Misc CXL patches of fixes, cleanups, and updates. Including CXL
   address translation for unaligned MOD3 regions.
[ TLA service: CXL is "Compute Express Link" ]

* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
  cxl: Disable HPA/SPA translation handlers for Normalized Addressing
  cxl/region: Factor out code into cxl_region_setup_poison()
  cxl/atl: Lock decoders that need address translation
  cxl: Enable AMD Zen5 address translation using ACPI PRMT
  cxl/acpi: Prepare use of EFI runtime services
  cxl: Introduce callback for HPA address ranges translation
  cxl/region: Use region data to get the root decoder
  cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
  cxl/region: Separate region parameter setup and region construction
  cxl: Simplify cxl_root_ops allocation and handling
  cxl/region: Store HPA range in struct cxl_region
  cxl/region: Store root decoder in struct cxl_region
  cxl/region: Rename misleading variable name @HPA to @hpa_range
  Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
  cxl, doc: Moving conventions in separate files
  cxl, doc: Remove isonum.txt inclusion
  cxl/port: Unify endpoint and switch port lookup
  cxl/port: Move endpoint component register management to cxl_port
  cxl/port: Map Port RAS registers
  cxl/port: Move dport RAS setup to dport add time
  ...
2 parents cebcffe + 63fbf27 commit e812928

50 files changed

Lines changed: 2522 additions & 1240 deletions

Lines changed: 7 additions & 171 deletions
@@ -1,182 +1,18 @@
 .. SPDX-License-Identifier: GPL-2.0
-.. include:: <isonum.txt>

-=======================================
 Compute Express Link: Linux Conventions
-=======================================
+#######################################

 There exists shipping platforms that bend or break CXL specification
 expectations. Record the details and the rationale for those deviations.
 Borrow the ACPI Code First template format to capture the assumptions
 and tradeoffs such that multiple platform implementations can follow the
 same convention.

-<(template) Title>
-==================
+.. toctree::
+   :maxdepth: 1
+   :caption: Contents

-Document
---------
-CXL Revision <rev>, Version <ver>
-
-License
--------
-SPDX-License Identifier: CC-BY-4.0
-
-Creator/Contributors
---------------------
-
-Summary of the Change
----------------------
-
-<Detail the conflict with the specification and where available the
-assumptions and tradeoffs taken by the hardware platform.>
-
-
-Benefits of the Change
-----------------------
-
-<Detail what happens if platforms and Linux do not adopt this
-convention.>
-
-References
-----------
-
-Detailed Description of the Change
-----------------------------------
-
-<Propose spec language that corrects the conflict.>
-
-
-Resolve conflict between CFMWS, Platform Memory Holes, and Endpoint Decoders
-============================================================================
-
-Document
---------
-
-CXL Revision 3.2, Version 1.0
-
-License
--------
-
-SPDX-License Identifier: CC-BY-4.0
-
-Creator/Contributors
---------------------
-
-- Fabio M. De Francesco, Intel
-- Dan J. Williams, Intel
-- Mahesh Natu, Intel
-
-Summary of the Change
----------------------
-
-According to the current Compute Express Link (CXL) Specifications (Revision
-3.2, Version 1.0), the CXL Fixed Memory Window Structure (CFMWS) describes zero
-or more Host Physical Address (HPA) windows associated with each CXL Host
-Bridge. Each window represents a contiguous HPA range that may be interleaved
-across one or more targets, including CXL Host Bridges. Each window has a set
-of restrictions that govern its usage. It is the Operating System-directed
-configuration and Power Management (OSPM) responsibility to utilize each window
-for the specified use.
-
-Table 9-22 of the current CXL Specifications states that the Window Size field
-contains the total number of consecutive bytes of HPA this window describes.
-This value must be a multiple of the Number of Interleave Ways (NIW) * 256 MB.
-
-Platform Firmware (BIOS) might reserve physical addresses below 4 GB where a
-memory gap such as the Low Memory Hole for PCIe MMIO may exist. In such cases,
-the CFMWS Range Size may not adhere to the NIW * 256 MB rule.
-
-The HPA represents the actual physical memory address space that the CXL devices
-can decode and respond to, while the System Physical Address (SPA), a related
-but distinct concept, represents the system-visible address space that users can
-direct transaction to and so it excludes reserved regions.
-
-BIOS publishes CFMWS to communicate the active SPA ranges that, on platforms
-with LMH's, map to a strict subset of the HPA. The SPA range trims out the hole,
-resulting in lost capacity in the Endpoints with no SPA to map to that part of
-the HPA range that intersects the hole.
-
-E.g, an x86 platform with two CFMWS and an LMH starting at 2 GB:
-
-+--------+------------+-------------------+------------------+-------------------+------+
-| Window | CFMWS Base | CFMWS Size        | HDM Decoder Base | HDM Decoder Size  | Ways |
-+========+============+===================+==================+===================+======+
-|  0     | 0 GB       | 2 GB              | 0 GB             | 3 GB              | 12   |
-+--------+------------+-------------------+------------------+-------------------+------+
-|  1     | 4 GB       | NIW*256MB Aligned | 4 GB             | NIW*256MB Aligned | 12   |
-+--------+------------+-------------------+------------------+-------------------+------+
-
-HDM decoder base and HDM decoder size represent all the 12 Endpoint Decoders of
-a 12 ways region and all the intermediate Switch Decoders. They are configured
-by the BIOS according to the NIW * 256MB rule, resulting in a HPA range size of
-3GB. Instead, the CFMWS Base and CFMWS Size are used to configure the Root
-Decoder HPA range that results smaller (2GB) than that of the Switch and
-Endpoint Decoders in the hierarchy (3GB).
-
-This creates 2 issues which lead to a failure to construct a region:
-
-1) A mismatch in region size between root and any HDM decoder. The root decoders
-will always be smaller due to the trim.
-
-2) The trim causes the root decoder to violate the (NIW * 256MB) rule.
-
-This change allows a region with a base address of 0GB to bypass these checks to
-allow for region creation with the trimmed root decoder address range.
-
-This change does not allow for any other arbitrary region to violate these
-checks - it is intended exclusively to enable x86 platforms which map CXL memory
-under 4GB.
-
-Despite the HDM decoders covering the PCIE hole HPA region, it is expected that
-the platform will never route address accesses to the CXL complex because the
-root decoder only covers the trimmed region (which excludes this). This is
-outside the ability of Linux to enforce.
-
-On the example platform, only the first 2GB will be potentially usable, but
-Linux, aiming to adhere to the current specifications, fails to construct
-Regions and attach Endpoint and intermediate Switch Decoders to them.
-
-There are several points of failure that due to the expectation that the Root
-Decoder HPA size, that is equal to the CFMWS from which it is configured, has
-to be greater or equal to the matching Switch and Endpoint HDM Decoders.
-
-In order to succeed with construction and attachment, Linux must construct a
-Region with Root Decoder HPA range size, and then attach to that all the
-intermediate Switch Decoders and Endpoint Decoders that belong to the hierarchy
-regardless of their range sizes.
-
-Benefits of the Change
-----------------------
-
-Without the change, the OSPM wouldn't match intermediate Switch and Endpoint
-Decoders with Root Decoders configured with CFMWS HPA sizes that don't align
-with the NIW * 256MB constraint, and so it leads to lost memdev capacity.
-
-This change allows the OSPM to construct Regions and attach intermediate Switch
-and Endpoint Decoders to them, so that the addressable part of the memory
-devices total capacity is made available to the users.
-
-References
-----------
-
-Compute Express Link Specification Revision 3.2, Version 1.0
-<https://www.computeexpresslink.org/>
-
-Detailed Description of the Change
-----------------------------------
-
-The description of the Window Size field in table 9-22 needs to account for
-platforms with Low Memory Holes, where SPA ranges might be subsets of the
-endpoints HPA. Therefore, it has to be changed to the following:
-
-"The total number of consecutive bytes of HPA this window represents. This value
-shall be a multiple of NIW * 256 MB.
-
-On platforms that reserve physical addresses below 4 GB, such as the Low Memory
-Hole for PCIe MMIO on x86, an instance of CFMWS whose Base HPA range is 0 might
-have a size that doesn't align with the NIW * 256 MB constraint.
-
-Note that the matching intermediate Switch Decoders and the Endpoint Decoders
-HPA range sizes must still align to the above-mentioned rule, but the memory
-capacity that exceeds the CFMWS window size won't be accessible.".
+   conventions/cxl-lmh.rst
+   conventions/cxl-atl.rst
+   conventions/template.rst
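
Note on the Low Memory Hole convention text in the hunk above: the arithmetic behind its 12-way example can be sketched in a few lines of Python. This is only an illustration of the rule as the document words it, not the kernel's implementation. The function names (`window_size_aligned`, `root_decoder_acceptable`, `unreachable_capacity`) are hypothetical, and the "base address of 0" exception is modelled as a bare comparison, whereas the real code may apply further platform checks.

```python
MB = 1 << 20
GB = 1 << 30
GRANULE = 256 * MB  # Window Size is expected to be a multiple of NIW * 256 MB


def window_size_aligned(size: int, ways: int) -> bool:
    """Return True if size is a multiple of NIW * 256 MB."""
    return size % (ways * GRANULE) == 0


def root_decoder_acceptable(base: int, size: int, ways: int) -> bool:
    """Hypothetical check: tolerate an unaligned window only when its base is 0."""
    return window_size_aligned(size, ways) or base == 0


def unreachable_capacity(cfmws_size: int, hdm_size: int) -> int:
    """HPA bytes covered by the HDM decoders but not by the trimmed CFMWS window."""
    return max(hdm_size - cfmws_size, 0)


# Window 0 from the example table: CFMWS 0 GB / 2 GB, HDM decoders 0 GB / 3 GB, 12 ways.
ways = 12
print(window_size_aligned(2 * GB, ways))         # CFMWS size against the 3 GB granule
print(window_size_aligned(3 * GB, ways))         # HDM decoder size against the same granule
print(root_decoder_acceptable(0, 2 * GB, ways))  # base-0 exception
print(unreachable_capacity(2 * GB, 3 * GB) // GB, "GB not addressable")
```

With 12 ways the granule is 12 * 256 MB = 3 GB, so the 2 GB CFMWS window is unaligned while the 3 GB HDM decoder range is aligned, and the 1 GB difference is the capacity the text describes as lost to the hole.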
