Commit 5797d10

Merge tag 'cxl-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull compute express link (CXL) updates from Dave Jiang:
 "The additions of note are CXL region removal support for locked CXL
  decoders, unit testing support for XOR address translation, and unit
  testing support for extended linear cache.

  Misc:
   - Remove incorrect page-allocator quirk section in documentation
   - Remove unused devm_cxl_port_enumerate_dports() function
   - Fix typo in cdat.c code comment
   - Replace use of system_wq with system_percpu_wq
   - Add locked CXL decoder support for region removal
   - Return when generic target is updated
   - Rename region_res_match_cxl_range() to spa_maps_hpa()
   - Clarify comment in spa_maps_hpa()

  Enable unit testing for XOR address translation of SPA to DPA and
  vice versa:
   - Refactor address translation funcs for testing in cxl_region
   - Make the XOR calculations available for testing
   - Add cxl_translate module for address translation testing in
     cxl_test

  Extended linear cache changes:
   - Add extended linear cache size sysfs attribute
   - Adjust failure emission of extended linear cache detection in
     cxl_acpi
   - Add extended linear cache unit testing support in cxl_test

  Preparation refactor patches for PRM translation support:
   - Simplify cxl_rd_ops allocation and handling
   - Group xor arithmetic setup code in a single block
   - Remove local variable @inc in cxl_port_setup_targets()"

* tag 'cxl-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (22 commits)
  cxl/test: Assign overflow_err_count from log->nr_overflow
  cxl/test: Remove ret_limit race condition in mock_get_event()
  cxl/test: remove unused mock function for cxl_rcd_component_reg_phys()
  cxl/test: Add support for acpi extended linear cache
  cxl/test: Add cxl_test CFMWS support for extended linear cache
  cxl/test: Standardize CXL auto region size
  cxl/region: Remove local variable @inc in cxl_port_setup_targets()
  cxl/acpi: Group xor arithmetric setup code in a single block
  cxl: Simplify cxl_rd_ops allocation and handling
  cxl: Clarify comment in spa_maps_hpa()
  cxl: Rename region_res_match_cxl_range() to spa_maps_hpa()
  acpi/hmat: Return when generic target is updated
  cxl: Add handling of locked CXL decoder
  cxl/region: Add support to indicate region has extended linear cache
  cxl: Adjust extended linear cache failure emission in cxl_acpi
  cxl/test: Add cxl_translate module for address translation testing
  cxl/acpi: Make the XOR calculations available for testing
  cxl/region: Refactor address translation funcs for testing
  cxl/pci: replace use of system_wq with system_percpu_wq
  cxl: fix typos in cdat.c comments
  ...
2 parents 43dfc13 + ea5514e commit 5797d10

19 files changed

Lines changed: 839 additions & 327 deletions

Documentation/ABI/testing/sysfs-bus-cxl

Lines changed: 10 additions & 1 deletion

@@ -496,8 +496,17 @@ Description:
 		changed, only freed by writing 0. The kernel makes no guarantees
 		that data is maintained over an address space freeing event, and
 		there is no guarantee that a free followed by an allocate
-		results in the same address being allocated.
+		results in the same address being allocated. If extended linear
+		cache is present, the size indicates extended linear cache size
+		plus the CXL region size.
 
+What:		/sys/bus/cxl/devices/regionZ/extended_linear_cache_size
+Date:		October, 2025
+KernelVersion:	v6.19
+Contact:	linux-cxl@vger.kernel.org
+Description:
+		(RO) The size of extended linear cache, if there is an extended
+		linear cache. Otherwise the attribute will not be visible.
 
 What:		/sys/bus/cxl/devices/regionZ/mode
 Date:		January, 2023

Documentation/driver-api/cxl/allocation/page-allocator.rst

Lines changed: 0 additions & 31 deletions

@@ -41,37 +41,6 @@ To simplify this, the page allocator will prefer :code:`ZONE_MOVABLE` over
 will fallback to allocate from :code:`ZONE_NORMAL`.
 
 
-Zone and Node Quirks
-====================
-Let's consider a configuration where the local DRAM capacity is largely onlined
-into :code:`ZONE_NORMAL`, with no :code:`ZONE_MOVABLE` capacity present. The
-CXL capacity has the opposite configuration - all onlined in
-:code:`ZONE_MOVABLE`.
-
-Under the default allocation policy, the page allocator will completely skip
-:code:`ZONE_MOVABLE` as a valid allocation target. This is because, as of
-Linux v6.15, the page allocator does (approximately) the following: ::
-
-  for (each zone in local_node):
-
-    for (each node in fallback_order):
-
-      attempt_allocation(gfp_flags);
-
-Because the local node does not have :code:`ZONE_MOVABLE`, the CXL node is
-functionally unreachable for direct allocation. As a result, the only way
-for CXL capacity to be used is via `demotion` in the reclaim path.
-
-This configuration also means that if the DRAM ndoe has :code:`ZONE_MOVABLE`
-capacity - when that capacity is depleted, the page allocator will actually
-prefer CXL :code:`ZONE_MOVABLE` pages over DRAM :code:`ZONE_NORMAL` pages.
-
-We may wish to invert this priority in future Linux versions.
-
-If `demotion` and `swap` are disabled, Linux will begin to cause OOM crashes
-when the DRAM nodes are depleted. See the reclaim section for more details.
-
-
 CGroups and CPUSets
 ===================
 Finally, assuming CXL memory is reachable via the page allocation (i.e. onlined

drivers/acpi/numa/hmat.c

Lines changed: 6 additions & 5 deletions

@@ -910,12 +910,13 @@ static void hmat_register_target(struct memory_target *target)
 	 * Register generic port perf numbers. The nid may not be
 	 * initialized and is still NUMA_NO_NODE.
 	 */
-	mutex_lock(&target_lock);
-	if (*(u16 *)target->gen_port_device_handle) {
-		hmat_update_generic_target(target);
-		target->registered = true;
+	scoped_guard(mutex, &target_lock) {
+		if (*(u16 *)target->gen_port_device_handle) {
+			hmat_update_generic_target(target);
+			target->registered = true;
+			return;
+		}
 	}
-	mutex_unlock(&target_lock);
 
 	hmat_hotplug_target(target);
 }

drivers/cxl/acpi.c

Lines changed: 41 additions & 32 deletions

@@ -11,25 +11,36 @@
 #include "cxlpci.h"
 #include "cxl.h"
 
-struct cxl_cxims_data {
-	int nr_maps;
-	u64 xormaps[] __counted_by(nr_maps);
-};
-
 static const guid_t acpi_cxl_qtg_id_guid =
 	GUID_INIT(0xF365F9A6, 0xA7DE, 0x4071,
 		  0xA6, 0x6A, 0xB4, 0x0C, 0x0B, 0x4F, 0x8E, 0x52);
 
-static u64 cxl_apply_xor_maps(struct cxl_root_decoder *cxlrd, u64 addr)
+#define HBIW_TO_NR_MAPS_SIZE (CXL_DECODER_MAX_INTERLEAVE + 1)
+static const int hbiw_to_nr_maps[HBIW_TO_NR_MAPS_SIZE] = {
+	[1] = 0, [2] = 1, [3] = 0, [4] = 2, [6] = 1, [8] = 3, [12] = 2, [16] = 4
+};
+
+static const int valid_hbiw[] = { 1, 2, 3, 4, 6, 8, 12, 16 };
+
+u64 cxl_do_xormap_calc(struct cxl_cxims_data *cximsd, u64 addr, int hbiw)
 {
-	struct cxl_cxims_data *cximsd = cxlrd->platform_data;
-	int hbiw = cxlrd->cxlsd.nr_targets;
+	int nr_maps_to_apply = -1;
 	u64 val;
 	int pos;
 
-	/* No xormaps for host bridge interleave ways of 1 or 3 */
-	if (hbiw == 1 || hbiw == 3)
-		return addr;
+	/*
+	 * Strictly validate hbiw since this function is used for testing and
+	 * that nullifies any expectation of trusted parameters from the CXL
+	 * Region Driver.
+	 */
+	for (int i = 0; i < ARRAY_SIZE(valid_hbiw); i++) {
+		if (valid_hbiw[i] == hbiw) {
+			nr_maps_to_apply = hbiw_to_nr_maps[hbiw];
+			break;
+		}
+	}
+	if (nr_maps_to_apply == -1 || nr_maps_to_apply > cximsd->nr_maps)
+		return ULLONG_MAX;
 
 	/*
 	 * In regions using XOR interleave arithmetic the CXL HPA may not
@@ -60,6 +71,14 @@ static u64 cxl_apply_xor_maps(struct cxl_root_decoder *cxlrd, u64 addr)
 
 	return addr;
 }
+EXPORT_SYMBOL_FOR_MODULES(cxl_do_xormap_calc, "cxl_translate");
+
+static u64 cxl_apply_xor_maps(struct cxl_root_decoder *cxlrd, u64 addr)
+{
+	struct cxl_cxims_data *cximsd = cxlrd->platform_data;
+
+	return cxl_do_xormap_calc(cximsd, addr, cxlrd->cxlsd.nr_targets);
+}
 
 struct cxl_cxims_context {
 	struct device *dev;
@@ -353,7 +372,7 @@ static int cxl_acpi_set_cache_size(struct cxl_root_decoder *cxlrd)
 
 	rc = hmat_get_extended_linear_cache_size(&res, nid, &cache_size);
 	if (rc)
-		return rc;
+		return 0;
 
 	/*
 	 * The cache range is expected to be within the CFMWS.
@@ -378,21 +397,18 @@ static void cxl_setup_extended_linear_cache(struct cxl_root_decoder *cxlrd)
 	int rc;
 
 	rc = cxl_acpi_set_cache_size(cxlrd);
-	if (!rc)
-		return;
-
-	if (rc != -EOPNOTSUPP) {
+	if (rc) {
 		/*
-		 * Failing to support extended linear cache region resize does not
+		 * Failing to retrieve extended linear cache region resize does not
		 * prevent the region from functioning. Only causes cxl list showing
		 * incorrect region size.
		 */
		dev_warn(cxlrd->cxlsd.cxld.dev.parent,
-			 "Extended linear cache calculation failed rc:%d\n", rc);
-	}
+			 "Extended linear cache retrieval failed rc:%d\n", rc);
 
-	/* Ignoring return code */
-	cxlrd->cache_size = 0;
+		/* Ignoring return code */
+		cxlrd->cache_size = 0;
+	}
 }
 
 DEFINE_FREE(put_cxlrd, struct cxl_root_decoder *,
@@ -453,8 +469,6 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
 		ig = CXL_DECODER_MIN_GRANULARITY;
 	cxld->interleave_granularity = ig;
 
-	cxl_setup_extended_linear_cache(cxlrd);
-
 	if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR) {
 		if (ways != 1 && ways != 3) {
 			cxims_ctx = (struct cxl_cxims_context) {
@@ -470,18 +484,13 @@ static int __cxl_parse_cfmws(struct acpi_cedt_cfmws *cfmws,
				return -EINVAL;
			}
		}
+		cxlrd->ops.hpa_to_spa = cxl_apply_xor_maps;
+		cxlrd->ops.spa_to_hpa = cxl_apply_xor_maps;
	}
 
-	cxlrd->qos_class = cfmws->qtg_id;
-
-	if (cfmws->interleave_arithmetic == ACPI_CEDT_CFMWS_ARITHMETIC_XOR) {
-		cxlrd->ops = kzalloc(sizeof(*cxlrd->ops), GFP_KERNEL);
-		if (!cxlrd->ops)
-			return -ENOMEM;
+	cxl_setup_extended_linear_cache(cxlrd);
 
-		cxlrd->ops->hpa_to_spa = cxl_apply_xor_maps;
-		cxlrd->ops->spa_to_hpa = cxl_apply_xor_maps;
-	}
+	cxlrd->qos_class = cfmws->qtg_id;
 
 	rc = cxl_decoder_add(cxld);
 	if (rc)

drivers/cxl/core/cdat.c

Lines changed: 2 additions & 2 deletions

@@ -826,7 +826,7 @@ static struct xarray *cxl_switch_gather_bandwidth(struct cxl_region *cxlr,
 		cxl_coordinates_combine(coords, coords, ctx->coord);
 
 		/*
-		 * Take the min of the calculated bandwdith and the upstream
+		 * Take the min of the calculated bandwidth and the upstream
 		 * switch SSLBIS bandwidth if there's a parent switch
 		 */
 		if (!is_root)
@@ -949,7 +949,7 @@ static struct xarray *cxl_hb_gather_bandwidth(struct xarray *xa)
 /**
  * cxl_region_update_bandwidth - Update the bandwidth access coordinates of a region
  * @cxlr: The region being operated on
- * @input_xa: xarray holds cxl_perf_ctx wht calculated bandwidth per ACPI0017 instance
+ * @input_xa: xarray holds cxl_perf_ctx with calculated bandwidth per ACPI0017 instance
 */
 static void cxl_region_update_bandwidth(struct cxl_region *cxlr,
					struct xarray *input_xa)

drivers/cxl/core/hdm.c

Lines changed: 3 additions & 0 deletions

@@ -905,6 +905,9 @@ static void cxl_decoder_reset(struct cxl_decoder *cxld)
 	if ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)
 		return;
 
+	if (test_bit(CXL_DECODER_F_LOCK, &cxld->flags))
+		return;
+
 	if (port->commit_end == id)
 		cxl_port_commit_reap(cxld);
 	else

drivers/cxl/core/pci.c

Lines changed: 8 additions & 79 deletions

@@ -71,85 +71,6 @@ struct cxl_dport *__devm_cxl_add_dport_by_dev(struct cxl_port *port,
 }
 EXPORT_SYMBOL_NS_GPL(__devm_cxl_add_dport_by_dev, "CXL");
 
-struct cxl_walk_context {
-	struct pci_bus *bus;
-	struct cxl_port *port;
-	int type;
-	int error;
-	int count;
-};
-
-static int match_add_dports(struct pci_dev *pdev, void *data)
-{
-	struct cxl_walk_context *ctx = data;
-	struct cxl_port *port = ctx->port;
-	int type = pci_pcie_type(pdev);
-	struct cxl_register_map map;
-	struct cxl_dport *dport;
-	u32 lnkcap, port_num;
-	int rc;
-
-	if (pdev->bus != ctx->bus)
-		return 0;
-	if (!pci_is_pcie(pdev))
-		return 0;
-	if (type != ctx->type)
-		return 0;
-	if (pci_read_config_dword(pdev, pci_pcie_cap(pdev) + PCI_EXP_LNKCAP,
-				  &lnkcap))
-		return 0;
-
-	rc = cxl_find_regblock(pdev, CXL_REGLOC_RBI_COMPONENT, &map);
-	if (rc)
-		dev_dbg(&port->dev, "failed to find component registers\n");
-
-	port_num = FIELD_GET(PCI_EXP_LNKCAP_PN, lnkcap);
-	dport = devm_cxl_add_dport(port, &pdev->dev, port_num, map.resource);
-	if (IS_ERR(dport)) {
-		ctx->error = PTR_ERR(dport);
-		return PTR_ERR(dport);
-	}
-	ctx->count++;
-
-	return 0;
-}
-
-/**
- * devm_cxl_port_enumerate_dports - enumerate downstream ports of the upstream port
- * @port: cxl_port whose ->uport_dev is the upstream of dports to be enumerated
- *
- * Returns a positive number of dports enumerated or a negative error
- * code.
- */
-int devm_cxl_port_enumerate_dports(struct cxl_port *port)
-{
-	struct pci_bus *bus = cxl_port_to_pci_bus(port);
-	struct cxl_walk_context ctx;
-	int type;
-
-	if (!bus)
-		return -ENXIO;
-
-	if (pci_is_root_bus(bus))
-		type = PCI_EXP_TYPE_ROOT_PORT;
-	else
-		type = PCI_EXP_TYPE_DOWNSTREAM;
-
-	ctx = (struct cxl_walk_context) {
-		.port = port,
-		.bus = bus,
-		.type = type,
-	};
-	pci_walk_bus(bus, match_add_dports, &ctx);
-
-	if (ctx.count == 0)
-		return -ENODEV;
-	if (ctx.error)
-		return ctx.error;
-	return ctx.count;
-}
-EXPORT_SYMBOL_NS_GPL(devm_cxl_port_enumerate_dports, "CXL");
-
 static int cxl_dvsec_mem_range_valid(struct cxl_dev_state *cxlds, int id)
 {
	struct pci_dev *pdev = to_pci_dev(cxlds->dev);
@@ -1217,6 +1138,14 @@ int cxl_gpf_port_setup(struct cxl_dport *dport)
 	return 0;
 }
 
+struct cxl_walk_context {
+	struct pci_bus *bus;
+	struct cxl_port *port;
+	int type;
+	int error;
+	int count;
+};
+
 static int count_dports(struct pci_dev *pdev, void *data)
 {
 	struct cxl_walk_context *ctx = data;

drivers/cxl/core/port.c

Lines changed: 0 additions & 1 deletion

@@ -459,7 +459,6 @@ static void cxl_root_decoder_release(struct device *dev)
 	if (atomic_read(&cxlrd->region_id) >= 0)
 		memregion_free(atomic_read(&cxlrd->region_id));
 	__cxl_decoder_release(&cxlrd->cxlsd.cxld);
-	kfree(cxlrd->ops);
 	kfree(cxlrd);
 }
 
