Skip to content

Commit 3f1509c

Browse files
hnazakpm00
authored andcommitted
Revert "mm/vmscan: never demote for memcg reclaim"
This reverts commit 3a23569. Its premise was that cgroup reclaim cares about freeing memory inside the cgroup, and demotion just moves them around within the cgroup limit. Hence, pages from toptier nodes should be reclaimed directly. However, with NUMA balancing now doing tier promotions, demotion is part of the page aging process. Global reclaim demotes the coldest toptier pages to secondary memory, where their life continues and from which they have a chance to get promoted back. Essentially, tiered memory systems have an LRU order that spans multiple nodes. When cgroup reclaims pages coming off the toptier directly, there can be colder pages on lower tier nodes that were demoted by global reclaim. This is an aging inversion, not unlike if cgroups were to reclaim directly from the active lists while there are inactive pages. Proactive reclaim is another factor. The goal of that it is to offload colder pages from expensive RAM to cheaper storage. When lower tier memory is available as an intermediate layer, we want offloading to take advantage of it instead of bypassing to storage. Revert the patch so that cgroups respect the LRU order spanning the memory hierarchy. Of note is a specific undercommit scenario, where all cgroup limits in the system add up to <= available toptier memory. In that case, shuffling pages out to lower tiers first to reclaim them from there is inefficient. This is something could be optimized/short-circuited later on (although care must be taken not to accidentally recreate the aging inversion). Let's ensure correctness first. Link: https://lkml.kernel.org/r/20220518190911.82400-1-hannes@cmpxchg.org Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Yang Shi <shy828301@gmail.com> Acked-by: Roman Gushchin <roman.gushchin@linux.dev> Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Reviewed-by: Muchun Song <songmuchun@bytedance.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Shakeel Butt <shakeelb@google.com> Acked-by: Tim Chen <tim.c.chen@linux.intel.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent 83d7d04 commit 3f1509c

1 file changed

Lines changed: 2 additions & 7 deletions

File tree

mm/vmscan.c

Lines changed: 2 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -528,13 +528,8 @@ static bool can_demote(int nid, struct scan_control *sc)
528528
{
529529
if (!numa_demotion_enabled)
530530
return false;
531-
if (sc) {
532-
if (sc->no_demotion)
533-
return false;
534-
/* It is pointless to do demotion in memcg reclaim */
535-
if (cgroup_reclaim(sc))
536-
return false;
537-
}
531+
if (sc && sc->no_demotion)
532+
return false;
538533
if (next_demotion_node(nid) == NUMA_NO_NODE)
539534
return false;
540535

0 commit comments

Comments
 (0)