Skip to content

Commit 44f732f

Browse files
committed
x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations
So the E820 code has a rather confusing area of code at around e820__reserve_resources(), which is, by its plain reading, rather self-contradictory. For example, the comment explaining e820__reserve_resources() claims: - '* Mark E820 reserved areas as busy for the resource manager' By 'E820 reserved areas' one can naively conclude that it's talking about E820_TYPE_RESERVED areas - while those areas are treated in exactly the opposite fashion by do_mark_busy(): switch (type) { case E820_TYPE_RESERVED: case E820_TYPE_SOFT_RESERVED: case E820_TYPE_PRAM: case E820_TYPE_PMEM: return false; Ie. E820_TYPE_RESERVED areas are *not* marked busy for the resource manager, because E820_TYPE_RESERVED areas are device regions that might eventually be claimed by a device driver. This type of confusion permeates this whole area of code, making it exceedingly difficult to read (for me at least). So untangle it bit by bit: - Instead of talking about ambiguous 'reserved areas', talk about 'E820 device address regions' instead, and 'register'/'lock' them. - The do_mark_busy() function is a misnomer as well, because despite its name it 'does' nothing - it only determines what type of resource handling an E820 type should receive from the kernel. Rename it to e820_device_region() and negate its meaning, to avoid the 'busy/reserved' confusion. Because that's what this code is really about: filtering out device regions such as E820_TYPE_RESERVED, E820_TYPE_PRAM, E820_TYPE_PMEM, etc., and allowing them to be claimed by device drivers later on. - All other E820 regions (system regions) are registered and locked early on, before the PCI resource manager does its search for device BAR addresses, etc. Also fix this somewhat misleading comment: /* * Try to bump up RAM regions to reasonable boundaries, to * avoid stolen RAM: */ and explain that here we register artificial 'gap' resources at the end of suspiciously sized RAM regions, as heuristics to try to avoid buggy firmware with undeclared 'stolen RAM' regions: /* * Create additional 'gaps' at the end of RAM regions, * rounding them up to 64k/1MB/64MB boundaries, should * they be weirdly sized, and register extra, locked * resource regions for them, to make sure drivers * won't claim those addresses. * * These are basically blind guesses and heuristics to * avoid resource conflicts with broken firmware that * doesn't properly list 'stolen RAM' as a system region * in the E820 map. */ Also improve the printout of this extra resource a bit: make the message more unambiguous, and upgrade it from pr_debug() (where very few people will see it), to pr_info() (where it will make it into the syslog on default distro configs). Also fix spelling and improve comment placement. No change in functionality intended. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Juergen Gross <jgross@suse.com> Cc: "H . Peter Anvin" <hpa@zytor.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://patch.msgid.link/20250515120549.2820541-13-mingo@kernel.org
1 parent d214484 commit 44f732f

1 file changed

Lines changed: 37 additions & 18 deletions

File tree

arch/x86/kernel/e820.c

Lines changed: 37 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1064,37 +1064,44 @@ static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
10641064
}
10651065
}
10661066

1067-
static bool __init do_mark_busy(enum e820_type type, struct resource *res)
1067+
/*
1068+
* We assign one resource entry for each E820 map entry:
1069+
*/
1070+
static struct resource __initdata *e820_res;
1071+
1072+
/*
1073+
* Is this a device address region that should not be marked busy?
1074+
* (Versus system address regions that we register & lock early.)
1075+
*/
1076+
static bool __init e820_device_region(enum e820_type type, struct resource *res)
10681077
{
1069-
/* this is the legacy bios/dos rom-shadow + mmio region */
1078+
/* This is the legacy BIOS/DOS ROM-shadow + MMIO region: */
10701079
if (res->start < (1ULL<<20))
1071-
return true;
1080+
return false;
10721081

10731082
/*
10741083
* Treat persistent memory and other special memory ranges like
1075-
* device memory, i.e. reserve it for exclusive use of a driver
1084+
* device memory, i.e. keep it available for exclusive use of a
1085+
* driver:
10761086
*/
10771087
switch (type) {
10781088
case E820_TYPE_RESERVED:
10791089
case E820_TYPE_SOFT_RESERVED:
10801090
case E820_TYPE_PRAM:
10811091
case E820_TYPE_PMEM:
1082-
return false;
1092+
return true;
10831093
case E820_TYPE_RAM:
10841094
case E820_TYPE_ACPI:
10851095
case E820_TYPE_NVS:
10861096
case E820_TYPE_UNUSABLE:
10871097
default:
1088-
return true;
1098+
return false;
10891099
}
10901100
}
10911101

10921102
/*
1093-
* Mark E820 reserved areas as busy for the resource manager:
1103+
* Mark E820 system regions as busy for the resource manager:
10941104
*/
1095-
1096-
static struct resource __initdata *e820_res;
1097-
10981105
void __init e820__reserve_resources(void)
10991106
{
11001107
int i;
@@ -1120,18 +1127,18 @@ void __init e820__reserve_resources(void)
11201127
res->desc = e820_type_to_iores_desc(entry);
11211128

11221129
/*
1123-
* Don't register the region that could be conflicted with
1124-
* PCI device BAR resources and insert them later in
1125-
* pcibios_resource_survey():
1130+
* Skip and don't register device regions that could be conflicted
1131+
* with PCI device BAR resources. They get inserted later in
1132+
* pcibios_resource_survey() -> e820__reserve_resources_late():
11261133
*/
1127-
if (do_mark_busy(entry->type, res)) {
1134+
if (!e820_device_region(entry->type, res)) {
11281135
res->flags |= IORESOURCE_BUSY;
11291136
insert_resource(&iomem_resource, res);
11301137
}
11311138
res++;
11321139
}
11331140

1134-
/* Expose the kexec e820 table to the sysfs. */
1141+
/* Expose the kexec e820 table to sysfs: */
11351142
for (i = 0; i < e820_table_kexec->nr_entries; i++) {
11361143
struct e820_entry *entry = e820_table_kexec->entries + i;
11371144

@@ -1165,6 +1172,10 @@ void __init e820__reserve_resources_late(void)
11651172
int i;
11661173
struct resource *res;
11671174

1175+
/*
1176+
* Register device address regions listed in the E820 map,
1177+
* these can be claimed by device drivers later on:
1178+
*/
11681179
res = e820_res;
11691180
for (i = 0; i < e820_table->nr_entries; i++) {
11701181
if (!res->parent && res->end)
@@ -1173,8 +1184,16 @@ void __init e820__reserve_resources_late(void)
11731184
}
11741185

11751186
/*
1176-
* Try to bump up RAM regions to reasonable boundaries, to
1177-
* avoid stolen RAM:
1187+
* Create additional 'gaps' at the end of RAM regions,
1188+
* rounding them up to 64k/1MB/64MB boundaries, should
1189+
* they be weirdly sized, and register extra, locked
1190+
* resource regions for them, to make sure drivers
1191+
* won't claim those addresses.
1192+
*
1193+
* These are basically blind guesses and heuristics to
1194+
* avoid resource conflicts with broken firmware that
1195+
* doesn't properly list 'stolen RAM' as a system region
1196+
* in the E820 map.
11781197
*/
11791198
for (i = 0; i < e820_table->nr_entries; i++) {
11801199
struct e820_entry *entry = &e820_table->entries[i];
@@ -1190,7 +1209,7 @@ void __init e820__reserve_resources_late(void)
11901209
if (start >= end)
11911210
continue;
11921211

1193-
printk(KERN_DEBUG "e820: reserve RAM buffer [mem %#010llx-%#010llx]\n", start, end);
1212+
pr_info("e820: register RAM buffer resource [mem %#010llx-%#010llx]\n", start, end);
11941213
reserve_region_with_split(&iomem_resource, start, end, "RAM buffer");
11951214
}
11961215
}

0 commit comments

Comments
 (0)