Commit 07f440c

Baolin Wang authored and akpm00 committed
arm64: mm: implement the architecture-specific clear_flush_young_ptes()
Implement the Arm64 architecture-specific clear_flush_young_ptes() to enable batched checking of young flags and TLB flushing, improving performance during large folio reclamation.

Performance testing: Allocate 10G clean file-backed folios by mmap() in a memory cgroup, and try to reclaim 8G file-backed folios via the memory.reclaim interface. I can observe 33% performance improvement on my Arm64 32-core server (and 10%+ improvement on my X86 machine). Meanwhile, the hotspot folio_check_references() dropped from approximately 35% to around 5%.

W/o patchset:
real 0m1.518s
user 0m0.000s
sys 0m1.518s

W/ patchset:
real 0m1.018s
user 0m0.000s
sys 0m1.018s

Link: https://lkml.kernel.org/r/ce749fbae3e900e733fa104a16fcb3ca9fe4f9bd.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent 6f0e114 commit 07f440c

1 file changed

Lines changed: 11 additions & 0 deletions

arch/arm64/include/asm/pgtable.h

@@ -1838,6 +1838,17 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
 	return contpte_clear_flush_young_ptes(vma, addr, ptep, 1);
 }
 
+#define clear_flush_young_ptes clear_flush_young_ptes
+static inline int clear_flush_young_ptes(struct vm_area_struct *vma,
+					 unsigned long addr, pte_t *ptep,
+					 unsigned int nr)
+{
+	if (likely(nr == 1 && !pte_cont(__ptep_get(ptep))))
+		return __ptep_clear_flush_young(vma, addr, ptep);
+
+	return contpte_clear_flush_young_ptes(vma, addr, ptep, nr);
+}
+
 #define wrprotect_ptes wrprotect_ptes
 static __always_inline void wrprotect_ptes(struct mm_struct *mm,
 				unsigned long addr, pte_t *ptep, unsigned int nr)
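
The dispatch in the new helper can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: the model_* names, the bit layout, and the flush counter are all hypothetical, and the model does not attempt to reproduce how contpte_clear_flush_young_ptes() handles contiguous-block alignment. It only shows the idea from the commit message: a single non-contiguous entry takes a cheap fast path, while a batch clears every young flag in one pass and issues a single range flush instead of one flush per entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit layout for this model only; not the arm64 PTE format. */
#define MODEL_PTE_YOUNG (UINT64_C(1) << 0)
#define MODEL_PTE_CONT  (UINT64_C(1) << 1)

typedef uint64_t model_pte_t;

/* Counts simulated TLB flush operations so the batching effect is visible. */
static unsigned int model_flush_count;

/* Stand-in for a TLB range flush: one call covers nr entries. */
static void model_flush_range(size_t first, size_t nr)
{
	(void)first;
	(void)nr;
	model_flush_count++;
}

/* Fast path: test and clear the young flag of one entry, flush if it was set. */
static int model_clear_flush_young(model_pte_t *ptep, size_t idx)
{
	int young = (*ptep & MODEL_PTE_YOUNG) != 0;

	if (young) {
		*ptep &= ~MODEL_PTE_YOUNG;
		model_flush_range(idx, 1);
	}
	return young;
}

/* Batched path: clear all young flags first, then flush at most once. */
static int model_clear_flush_young_range(model_pte_t *ptep, size_t idx,
					 unsigned int nr)
{
	int young = 0;

	for (unsigned int i = 0; i < nr; i++) {
		if (ptep[i] & MODEL_PTE_YOUNG) {
			ptep[i] &= ~MODEL_PTE_YOUNG;
			young = 1;
		}
	}
	if (young)
		model_flush_range(idx, nr);
	return young;
}

/* Mirrors the shape of the dispatch in the patch: fast path or batch. */
static int model_clear_flush_young_ptes(model_pte_t *ptep, size_t idx,
					unsigned int nr)
{
	assert(nr > 0);

	if (nr == 1 && !(ptep[0] & MODEL_PTE_CONT))
		return model_clear_flush_young(ptep, idx);

	return model_clear_flush_young_range(ptep, idx, nr);
}
```

In this model, calling model_clear_flush_young_ptes() on a four-entry array with two young entries returns 1 and increments model_flush_count once, whereas four separate single-entry calls on the same array would increment it twice.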
