Commit a493c7a

xhyf77 authored and akpm00 committed
mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
calculate_totalreserve_pages() currently finds the maximum lowmem_reserve[j] for a zone by scanning the full forward range [j = zone_idx .. MAX_NR_ZONES). However, for a given zone i, the lowmem_reserve[j] array (for j > i) is expected to form a monotonically non-decreasing sequence in j, not as an implementation detail, but as a consequence of the semantics of lowmem_reserve[].

For zone i, lowmem_reserve[j] expresses how many pages in zone i must effectively be kept in reserve when deciding whether an allocation class that may allocate from zones up to j is allowed to fall back into zone i. It protects less flexible allocation classes (which cannot use higher zones) from being starved by more flexible ones.

Viewed from these semantics, a partial ordering in j is natural: as j increases, the allocation class gains access to a strictly larger set of fallback zones. Therefore lowmem_reserve[j] is expected to be monotonically non-decreasing in j: more flexible allocation classes must not be allowed to deplete low zones more aggressively than less flexible ones. In other words, if lowmem_reserve[j] were ever observed to *decrease* as j grows, that would be unexpected from the point of view of the reserve semantics and would likely indicate a semantic change or a misconfiguration.

The current implementation in setup_per_zone_lowmem_reserve() reflects this policy by accumulating managed pages from higher zones and applying the configured ratio, which results in a non-decreasing sequence.

This patch makes calculate_totalreserve_pages() rely on that monotonicity explicitly and finds the maximum reserve value by scanning backward and stopping at the first non-zero entry. This avoids unnecessary iteration and reflects the conceptual model more directly. No functional behavior changes.

To keep this assumption explicit, a comment is added next to setup_per_zone_lowmem_reserve() documenting the monotonicity expectation and noting that calculate_totalreserve_pages() relies on it.

Link: https://lkml.kernel.org/r/tencent_EB0FED91B01B1F8B6DAEE96719C5F5797F07@qq.com
Signed-off-by: fujunjie <fujunjie1@qq.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
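As an illustration of why the sequence comes out non-decreasing, here is a simplified userspace sketch, not kernel code: the zone page counts and the reserve ratio are made-up sample values, and it only mirrors the accumulation described above, where a running sum of higher-zone managed pages divided by a fixed ratio can only grow with j.

/*
 * lowmem_reserve_monotonicity.c - simplified userspace sketch, NOT kernel code.
 *
 * The zone page counts and the reserve ratio below are made-up sample values.
 * The only point is the one the commit message relies on: a running sum of
 * higher-zone managed pages divided by a fixed ratio can only grow with j,
 * so each zone's lowmem_reserve[] row comes out monotonically non-decreasing.
 */
#include <assert.h>
#include <stdio.h>

#define NR_ZONES 4

int main(void)
{
	/* hypothetical managed page counts for four zones (low to high) */
	unsigned long managed[NR_ZONES] = { 4000, 500000, 3000000, 1000000 };
	unsigned long ratio = 256;	/* sample reserve ratio */
	unsigned long reserve[NR_ZONES][NR_ZONES] = { { 0 } };

	for (int i = 0; i < NR_ZONES - 1; i++) {
		unsigned long sum = 0;

		/* accumulate managed pages of all higher zones, as described */
		for (int j = i + 1; j < NR_ZONES; j++) {
			sum += managed[j];
			reserve[i][j] = sum / ratio;
		}

		/* the row is non-decreasing in j by construction */
		for (int j = i + 2; j < NR_ZONES; j++)
			assert(reserve[i][j] >= reserve[i][j - 1]);
	}

	for (int i = 0; i < NR_ZONES - 1; i++) {
		printf("zone %d reserves:", i);
		for (int j = i + 1; j < NR_ZONES; j++)
			printf(" %lu", reserve[i][j]);
		printf("\n");
	}
	return 0;
}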
1 parent 3cf41ed commit a493c7a

1 file changed

Lines changed: 29 additions & 4 deletions

File tree

mm/page_alloc.c

@@ -6311,10 +6311,21 @@ static void calculate_totalreserve_pages(void)
 			long max = 0;
 			unsigned long managed_pages = zone_managed_pages(zone);
 
-			/* Find valid and maximum lowmem_reserve in the zone */
-			for (j = i; j < MAX_NR_ZONES; j++)
-				max = max(max, zone->lowmem_reserve[j]);
+			/*
+			 * lowmem_reserve[j] is monotonically non-decreasing
+			 * in j for a given zone (see
+			 * setup_per_zone_lowmem_reserve()). The maximum
+			 * valid reserve lives at the highest index with a
+			 * non-zero value, so scan backwards and stop at the
+			 * first hit.
+			 */
+			for (j = MAX_NR_ZONES - 1; j > i; j--) {
+				if (!zone->lowmem_reserve[j])
+					continue;
 
+				max = zone->lowmem_reserve[j];
+				break;
+			}
 			/* we treat the high watermark as reserved pages. */
 			max += high_wmark_pages(zone);
 
@@ -6339,7 +6350,21 @@ static void setup_per_zone_lowmem_reserve(void)
 {
 	struct pglist_data *pgdat;
 	enum zone_type i, j;
-
+	/*
+	 * For a given zone node_zones[i], lowmem_reserve[j] (j > i)
+	 * represents how many pages in zone i must effectively be kept
+	 * in reserve when deciding whether an allocation class that is
+	 * allowed to allocate from zones up to j may fall back into
+	 * zone i.
+	 *
+	 * As j increases, the allocation class can use a strictly larger
+	 * set of fallback zones and therefore must not be allowed to
+	 * deplete low zones more aggressively than a less flexible one.
+	 * As a result, lowmem_reserve[j] is required to be monotonically
+	 * non-decreasing in j for each zone i. Callers such as
+	 * calculate_totalreserve_pages() rely on this monotonicity when
+	 * selecting the maximum reserve entry.
+	 */
 	for_each_online_pgdat(pgdat) {
 		for (i = 0; i < MAX_NR_ZONES - 1; i++) {
 			struct zone *zone = &pgdat->node_zones[i];
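To see that the backward early-exit scan matches the old forward scan on a non-decreasing row, a small userspace sanity check can compare the two. The reserve values below are sample numbers, not taken from any real configuration.

/*
 * max_lookup_equivalence.c - userspace sanity check, NOT kernel code.
 *
 * Compares the old forward full scan with the new backward "stop at the
 * first non-zero entry" scan on a sample monotonically non-decreasing
 * lowmem_reserve[] row; on such a row both find the same maximum.
 */
#include <assert.h>
#include <stdio.h>

#define MAX_NR_ZONES 5

int main(void)
{
	/* sample non-decreasing row for some zone i; reserve[i] itself is 0 */
	long reserve[MAX_NR_ZONES] = { 0, 16, 16, 1200, 1200 };
	int i = 0;
	long max_fwd = 0, max_bwd = 0;

	/* old approach: scan the full forward range [i .. MAX_NR_ZONES) */
	for (int j = i; j < MAX_NR_ZONES; j++)
		if (reserve[j] > max_fwd)
			max_fwd = reserve[j];

	/* new approach: scan backwards and stop at the first non-zero entry */
	for (int j = MAX_NR_ZONES - 1; j > i; j--) {
		if (!reserve[j])
			continue;

		max_bwd = reserve[j];
		break;
	}

	assert(max_fwd == max_bwd);
	printf("forward max = %ld, backward max = %ld\n", max_fwd, max_bwd);
	return 0;
}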
