Commit 52e054f
mm: rmap: support batched checks of the references for large folios
Patch series "support batch checking of references and unmapping for large
folios", v6.
Currently, folio_referenced_one() always checks the young flag for each
PTE sequentially, which is inefficient for large folios. This
inefficiency is especially noticeable when reclaiming clean file-backed
large folios, where folio_referenced() is observed as a significant
performance hotspot.
Moreover, on Arm architecture, which supports contiguous PTEs, there is
already an optimization to clear the young flags for PTEs within a
contiguous range. However, this is not sufficient. We can extend this to
perform batched operations for the entire large folio (which might exceed
the contiguous range: CONT_PTE_SIZE).
Similar to folio_referenced_one(), we can also apply batched unmapping for
large file folios to optimize the performance of file folio reclamation.
By supporting batched checking of the young flags, flushing TLB entries,
and unmapping, I can observed a significant performance improvements in my
performance tests for file folios reclamation. Please check the
performance data in the commit message of each patch.
This patch (of 5):
Currently, folio_referenced_one() always checks the young flag for each
PTE sequentially, which is inefficient for large folios. This
inefficiency is especially noticeable when reclaiming clean file-backed
large folios, where folio_referenced() is observed as a significant
performance hotspot.
Moreover, on Arm64 architecture, which supports contiguous PTEs, there is
already an optimization to clear the young flags for PTEs within a
contiguous range. However, this is not sufficient. We can extend this to
perform batched operations for the entire large folio (which might exceed
the contiguous range: CONT_PTE_SIZE).
Introduce a new API: clear_flush_young_ptes() to facilitate batched
checking of the young flags and flushing TLB entries, thereby improving
performance during large folio reclamation. And it will be overridden by
the architecture that implements a more efficient batch operation in the
following patches.
While we are at it, rename ptep_clear_flush_young_notify() to
clear_flush_young_ptes_notify() to indicate that this is a batch
operation.
Link: https://lkml.kernel.org/r/cover.1770645603.git.baolin.wang@linux.alibaba.com
Link: https://lkml.kernel.org/r/12132694536834262062d1fb304f8f8a064b6750.1770645603.git.baolin.wang@linux.alibaba.com
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Cc: Barry Song <baohua@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>1 parent f615cc9 commit 52e054f
3 files changed
Lines changed: 65 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
515 | 515 | | |
516 | 516 | | |
517 | 517 | | |
518 | | - | |
| 518 | + | |
519 | 519 | | |
520 | 520 | | |
521 | 521 | | |
522 | 522 | | |
523 | | - | |
| 523 | + | |
| 524 | + | |
524 | 525 | | |
525 | 526 | | |
526 | 527 | | |
527 | | - | |
| 528 | + | |
528 | 529 | | |
529 | 530 | | |
530 | 531 | | |
| |||
650 | 651 | | |
651 | 652 | | |
652 | 653 | | |
653 | | - | |
| 654 | + | |
654 | 655 | | |
655 | 656 | | |
656 | 657 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1068 | 1068 | | |
1069 | 1069 | | |
1070 | 1070 | | |
| 1071 | + | |
| 1072 | + | |
| 1073 | + | |
| 1074 | + | |
| 1075 | + | |
| 1076 | + | |
| 1077 | + | |
| 1078 | + | |
| 1079 | + | |
| 1080 | + | |
| 1081 | + | |
| 1082 | + | |
| 1083 | + | |
| 1084 | + | |
| 1085 | + | |
| 1086 | + | |
| 1087 | + | |
| 1088 | + | |
| 1089 | + | |
| 1090 | + | |
| 1091 | + | |
| 1092 | + | |
| 1093 | + | |
| 1094 | + | |
| 1095 | + | |
| 1096 | + | |
| 1097 | + | |
| 1098 | + | |
| 1099 | + | |
| 1100 | + | |
| 1101 | + | |
| 1102 | + | |
| 1103 | + | |
| 1104 | + | |
| 1105 | + | |
1071 | 1106 | | |
1072 | 1107 | | |
1073 | 1108 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
913 | 913 | | |
914 | 914 | | |
915 | 915 | | |
| 916 | + | |
916 | 917 | | |
917 | 918 | | |
918 | 919 | | |
| 920 | + | |
919 | 921 | | |
920 | 922 | | |
921 | 923 | | |
| |||
960 | 962 | | |
961 | 963 | | |
962 | 964 | | |
963 | | - | |
964 | | - | |
| 965 | + | |
| 966 | + | |
| 967 | + | |
| 968 | + | |
| 969 | + | |
| 970 | + | |
| 971 | + | |
| 972 | + | |
| 973 | + | |
| 974 | + | |
| 975 | + | |
965 | 976 | | |
| 977 | + | |
| 978 | + | |
| 979 | + | |
966 | 980 | | |
967 | 981 | | |
968 | 982 | | |
| |||
972 | 986 | | |
973 | 987 | | |
974 | 988 | | |
975 | | - | |
| 989 | + | |
| 990 | + | |
| 991 | + | |
| 992 | + | |
| 993 | + | |
| 994 | + | |
| 995 | + | |
| 996 | + | |
| 997 | + | |
976 | 998 | | |
977 | 999 | | |
978 | 1000 | | |
| |||
0 commit comments