
Commit 0cfc0e7

ryncsn authored and akpm00 committed
mm/shmem, swap: avoid redundant Xarray lookup during swapin
Patch series "mm/shmem, swap: bugfix and improvement of mTHP swap in", v6.

The current THP swapin path has several problems. It may potentially hang, may cause redundant faults due to false positive swap cache lookups, and it issues redundant Xarray walks. !CONFIG_TRANSPARENT_HUGEPAGE builds may also contain unnecessary THP checks.

This series fixes all of the mentioned issues; the code should be more robust and prepared for the swap table series. Four walks are now reduced to three (get order & confirm, confirm, insert folio), !CONFIG_TRANSPARENT_HUGEPAGE build overhead is also minimized, and a sanity check is added.

Performance is slightly better after this series. Sequential swap-in of 24G of data from ZRAM, using transparent_hugepage_tmpfs=always (24 samples each):

Before:         avg: 10.66s, stddev: 0.04
After patch 1:  avg: 10.58s, stddev: 0.04
After patch 2:  avg: 10.65s, stddev: 0.05
After patch 3:  avg: 10.65s, stddev: 0.04
After patch 4:  avg: 10.67s, stddev: 0.04
After patch 5:  avg: 9.79s,  stddev: 0.04
After patch 6:  avg: 9.79s,  stddev: 0.05
After patch 7:  avg: 9.78s,  stddev: 0.05
After patch 8:  avg: 9.79s,  stddev: 0.04

Several patches each improve performance a little, for a total gain of about 8%.

A kernel build test showed a very slight improvement, testing with make -j48 and defconfig in a 768M memcg, also using ZRAM as swap, and transparent_hugepage_tmpfs=always (6 test runs):

Before:         avg: 3334.66s, stddev: 43.76
After patch 1:  avg: 3349.77s, stddev: 18.55
After patch 2:  avg: 3325.01s, stddev: 42.96
After patch 3:  avg: 3354.58s, stddev: 14.62
After patch 4:  avg: 3336.24s, stddev: 32.15
After patch 5:  avg: 3325.13s, stddev: 22.14
After patch 6:  avg: 3285.03s, stddev: 38.95
After patch 7:  avg: 3287.32s, stddev: 26.37
After patch 8:  avg: 3295.87s, stddev: 46.24

This patch (of 7):

Currently shmem calls xa_get_order to get the swap radix entry order, requiring a full tree walk. This can easily be combined with the swap entry value check (shmem_confirm_swap) to avoid the duplicated lookup and to abort early if the entry is already gone, which should improve performance.

Link: https://lkml.kernel.org/r/20250728075306.12704-1-ryncsn@gmail.com
Link: https://lkml.kernel.org/r/20250728075306.12704-3-ryncsn@gmail.com
Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
1 parent: 5d79c2b

1 file changed: mm/shmem.c (25 additions, 9 deletions)

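For orientation before the diff, here is a minimal, self-contained sketch of the single-walk pattern this patch adopts: confirm that the expected entry is still in the XArray and, in the same pass, read its order. The helper name below is invented for illustration only; its body mirrors the new shmem_confirm_swap() shown in the diff, with the shmem-specific types dropped.

/* Illustration only -- hypothetical helper, not part of the patch. */
/* Kernel code; needs <linux/xarray.h> and <linux/rcupdate.h>. */
static int entry_order_if_present(struct xarray *xa, unsigned long index,
				  void *expected)
{
	XA_STATE(xas, xa, index);	/* cursor for a single walk starting at @index */
	void *entry;
	int order = -1;

	rcu_read_lock();
	do {
		entry = xas_load(&xas);
		if (entry == expected)
			order = xas_get_order(&xas);	/* order of the entry just found */
	} while (xas_retry(&xas, entry));	/* restart if we hit a transient retry entry */
	rcu_read_unlock();

	return order;	/* >= 0: entry still present at this order; -1: raced away */
}

A caller thus learns in one descent of the tree both whether the entry is still present and how many pages it spans (1 << order), instead of doing an xa_load() check followed by a separate xa_get_order() walk.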
@@ -512,15 +512,27 @@ static int shmem_replace_entry(struct address_space *mapping,
 
 /*
  * Sometimes, before we decide whether to proceed or to fail, we must check
- * that an entry was not already brought back from swap by a racing thread.
+ * that an entry was not already brought back or split by a racing thread.
  *
  * Checking folio is not enough: by the time a swapcache folio is locked, it
  * might be reused, and again be swapcache, using the same swap as before.
+ * Returns the swap entry's order if it still presents, else returns -1.
  */
-static bool shmem_confirm_swap(struct address_space *mapping,
-			       pgoff_t index, swp_entry_t swap)
+static int shmem_confirm_swap(struct address_space *mapping, pgoff_t index,
+			      swp_entry_t swap)
 {
-	return xa_load(&mapping->i_pages, index) == swp_to_radix_entry(swap);
+	XA_STATE(xas, &mapping->i_pages, index);
+	int ret = -1;
+	void *entry;
+
+	rcu_read_lock();
+	do {
+		entry = xas_load(&xas);
+		if (entry == swp_to_radix_entry(swap))
+			ret = xas_get_order(&xas);
+	} while (xas_retry(&xas, entry));
+	rcu_read_unlock();
+	return ret;
 }
 
 /*
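A note on the new helper's locking: the walk runs entirely under rcu_read_lock(), so xas_load() may observe a transient retry entry while the tree is being modified concurrently; xas_retry() detects that case and resets the cursor, which is why the load sits in a do/while loop. When the expected swap entry is found, xas_get_order() reports the order of the (possibly multi-index) entry at that position, so a single descent of the tree yields both the confirmation and the order.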
@@ -2293,16 +2305,20 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 		return -EIO;
 
 	si = get_swap_device(swap);
-	if (!si) {
-		if (!shmem_confirm_swap(mapping, index, swap))
+	order = shmem_confirm_swap(mapping, index, swap);
+	if (unlikely(!si)) {
+		if (order < 0)
 			return -EEXIST;
 		else
 			return -EINVAL;
 	}
+	if (unlikely(order < 0)) {
+		put_swap_device(si);
+		return -EEXIST;
+	}
 
 	/* Look it up and read it in.. */
 	folio = swap_cache_get_folio(swap, NULL, 0);
-	order = xa_get_order(&mapping->i_pages, index);
 	if (!folio) {
 		int nr_pages = 1 << order;
 		bool fallback_order0 = false;
@@ -2412,7 +2428,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	 */
 	folio_lock(folio);
 	if ((!skip_swapcache && !folio_test_swapcache(folio)) ||
-	    !shmem_confirm_swap(mapping, index, swap) ||
+	    shmem_confirm_swap(mapping, index, swap) < 0 ||
 	    folio->swap.val != swap.val) {
 		error = -EEXIST;
 		goto unlock;
@@ -2460,7 +2476,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	*foliop = folio;
 	return 0;
 failed:
-	if (!shmem_confirm_swap(mapping, index, swap))
+	if (shmem_confirm_swap(mapping, index, swap) < 0)
 		error = -EEXIST;
 	if (error == -EIO)
 		shmem_set_folio_swapin_error(inode, index, folio, swap,
