Commit a2fff99

soleen authored and akpm00 committed
kho: increase metadata bitmap size to PAGE_SIZE
KHO memory preservation metadata is preserved in 512 byte chunks which requires their allocation from slab allocator. Slabs are not safe to be used with KHO because of kfence, and because partial slabs may lead leaks to the next kernel. Change the size to be PAGE_SIZE. The kfence specifically may cause memory corruption, where it randomly provides slab objects that can be within the scratch area. The reason for that is that kfence allocates its objects prior to KHO scratch is marked as CMA region. While this change could potentially increase metadata overhead on systems with sparsely preserved memory, this is being mitigated by ongoing work to reduce sparseness during preservation via 1G guest pages. Furthermore, this change aligns with future work on a stateless KHO, which will also use page-sized bitmaps for its radix tree metadata. Link: https://lkml.kernel.org/r/20251021000852.2924827-3-pasha.tatashin@soleen.com Fixes: fc33e4b ("kexec: enable KHO support for memory preservation") Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: David Matlack <dmatlack@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent e38f65d · commit a2fff99

1 file changed

Lines changed: 11 additions & 10 deletions

File tree

kernel/kexec_handover.c

@@ -69,23 +69,25 @@ early_param("kho", kho_parse_enable);
  * Keep track of memory that is to be preserved across KHO.
  *
  * The serializing side uses two levels of xarrays to manage chunks of per-order
- * 512 byte bitmaps. For instance if PAGE_SIZE = 4096, the entire 1G order of a
- * 1TB system would fit inside a single 512 byte bitmap. For order 0 allocations
- * each bitmap will cover 16M of address space. Thus, for 16G of memory at most
- * 512K of bitmap memory will be needed for order 0.
+ * PAGE_SIZE byte bitmaps. For instance if PAGE_SIZE = 4096, the entire 1G order
+ * of a 8TB system would fit inside a single 4096 byte bitmap. For order 0
+ * allocations each bitmap will cover 128M of address space. Thus, for 16G of
+ * memory at most 512K of bitmap memory will be needed for order 0.
  *
  * This approach is fully incremental, as the serialization progresses folios
  * can continue be aggregated to the tracker. The final step, immediately prior
  * to kexec would serialize the xarray information into a linked list for the
  * successor kernel to parse.
  */
 
-#define PRESERVE_BITS (512 * 8)
+#define PRESERVE_BITS (PAGE_SIZE * 8)
 
 struct kho_mem_phys_bits {
 	DECLARE_BITMAP(preserve, PRESERVE_BITS);
 };
 
+static_assert(sizeof(struct kho_mem_phys_bits) == PAGE_SIZE);
+
 struct kho_mem_phys {
 	/*
 	 * Points to kho_mem_phys_bits, a sparse bitmap array. Each bit is sized
@@ -133,19 +135,19 @@ static struct kho_out kho_out = {
 	.finalized = false,
 };
 
-static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size_t sz)
+static void *xa_load_or_alloc(struct xarray *xa, unsigned long index)
 {
 	void *res = xa_load(xa, index);
 
 	if (res)
 		return res;
 
-	void *elm __free(kfree) = kzalloc(sz, GFP_KERNEL);
+	void *elm __free(kfree) = kzalloc(PAGE_SIZE, GFP_KERNEL);
 
 	if (!elm)
 		return ERR_PTR(-ENOMEM);
 
-	if (WARN_ON(kho_scratch_overlap(virt_to_phys(elm), sz)))
+	if (WARN_ON(kho_scratch_overlap(virt_to_phys(elm), PAGE_SIZE)))
 		return ERR_PTR(-EINVAL);
 
 	res = xa_cmpxchg(xa, index, NULL, elm, GFP_KERNEL);
@@ -218,8 +220,7 @@ static int __kho_preserve_order(struct kho_mem_track *track, unsigned long pfn,
 		}
 	}
 
-	bits = xa_load_or_alloc(&physxa->phys_bits, pfn_high / PRESERVE_BITS,
-				sizeof(*bits));
+	bits = xa_load_or_alloc(&physxa->phys_bits, pfn_high / PRESERVE_BITS);
 	if (IS_ERR(bits))
 		return PTR_ERR(bits);