Commit 48647d3

tehcaster authored and Vlastimil Babka (SUSE) committed
slab: distinguish lock and trylock for sheaf_flush_main()
sheaf_flush_main() can be called from __pcs_replace_full_main() where it's fine if the trylock fails, and pcs_flush_all() where it's not expected to and for some flush callers (when destroying the cache or memory hotremove) it would be actually a problem if it failed and left the main sheaf not flushed. The flush callers can however safely use local_lock() instead of trylock.

The trylock failure should not happen in practice on !PREEMPT_RT, but can happen on PREEMPT_RT. The impact is limited in practice because when a trylock fails in the kmem_cache_destroy() path, it means someone is using the cache while destroying it, which is a bug on its own. The memory hotremove path is unlikely to be employed in a production RT config, but it's possible.

To fix this, split the function into sheaf_flush_main() (using local_lock()) and sheaf_try_flush_main() (using local_trylock()) where both call __sheaf_flush_main_batch() to flush a single batch of objects. This will also allow lockdep to verify our context assumptions.

The problem was raised in an off-list question by Marcelo.

Fixes: 2d517aa ("slab: add opt-in caching layer of percpu sheaves")
Cc: stable@vger.kernel.org
Reported-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Harry Yoo <harry.yoo@oracle.com>
Reviewed-by: Hao Li <hao.li@linux.dev>
Link: https://patch.msgid.link/20260211-b4-sheaf-flush-v1-1-4e7f492f0055@suse.cz
Signed-off-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
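The split described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: a pthread mutex stands in for the per-CPU local lock (the two are not equivalent), the `held` flag is a rough analogue of `lockdep_assert_held()`, and `struct model_sheaf`, `BATCH_MAX`, `SHEAF_CAP`, `flush_main_batch()`, `flush_main()` and `try_flush_main()` are hypothetical names chosen for illustration. The batch arithmetic inside the helper is likewise assumed, since the diff does not show that part of the function.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define BATCH_MAX 4	/* hypothetical stand-in for PCS_BATCH_MAX */
#define SHEAF_CAP 16	/* hypothetical sheaf capacity */

/* Hypothetical userspace model of one CPU's main sheaf. */
struct model_sheaf {
	pthread_mutex_t lock;	/* stands in for s->cpu_sheaves->lock */
	bool held;		/* lets the helper assert the lock is held */
	unsigned int size;	/* objects currently cached */
	unsigned int flushed;	/* objects released so far */
	void *objects[SHEAF_CAP];
};

/*
 * Flush one batch. Must be called with the lock held, returns with the
 * lock unlocked. Returns how many objects remain to be flushed.
 */
static unsigned int flush_main_batch(struct model_sheaf *s)
{
	void *objects[BATCH_MAX];	/* bounded on-stack copy */
	unsigned int batch, remaining;

	assert(s->held);	/* rough analogue of lockdep_assert_held() */

	batch = s->size < BATCH_MAX ? s->size : BATCH_MAX;
	s->size -= batch;
	memcpy(objects, s->objects + s->size, batch * sizeof(void *));
	remaining = s->size;
	s->flushed += batch;

	s->held = false;
	pthread_mutex_unlock(&s->lock);

	/* Real code would free the copied batch here, outside the lock. */
	(void)objects;
	return remaining;
}

/* Unconditional variant: blocks on the lock, so it always flushes fully. */
static void flush_main(struct model_sheaf *s)
{
	unsigned int remaining;

	do {
		pthread_mutex_lock(&s->lock);
		s->held = true;
		remaining = flush_main_batch(s);
	} while (remaining);
}

/* Opportunistic variant: returns true if at least partially flushed. */
static bool try_flush_main(struct model_sheaf *s)
{
	unsigned int remaining;
	bool ret = false;

	do {
		if (pthread_mutex_trylock(&s->lock) != 0)
			return ret;
		s->held = true;
		ret = true;
		remaining = flush_main_batch(s);
	} while (remaining);

	return ret;
}
```

In this model a caller that must not leave objects behind (the analogue of the cache-destroy and hotremove paths) would use `flush_main()`, while a caller that can fall back to another path would use `try_flush_main()` and treat a `false` return as "nothing was flushed".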
1 parent e9217ca commit 48647d3

1 file changed

Lines changed: 37 additions & 10 deletions

mm/slub.c

@@ -2858,19 +2858,19 @@ static void __kmem_cache_free_bulk(struct kmem_cache *s, size_t size, void **p);
  * object pointers are moved to a on-stack array under the lock. To bound the
  * stack usage, limit each batch to PCS_BATCH_MAX.
  *
- * returns true if at least partially flushed
+ * Must be called with s->cpu_sheaves->lock locked, returns with the lock
+ * unlocked.
+ *
+ * Returns how many objects are remaining to be flushed
  */
-static bool sheaf_flush_main(struct kmem_cache *s)
+static unsigned int __sheaf_flush_main_batch(struct kmem_cache *s)
 {
 	struct slub_percpu_sheaves *pcs;
 	unsigned int batch, remaining;
 	void *objects[PCS_BATCH_MAX];
 	struct slab_sheaf *sheaf;
-	bool ret = false;
 
-next_batch:
-	if (!local_trylock(&s->cpu_sheaves->lock))
-		return ret;
+	lockdep_assert_held(this_cpu_ptr(&s->cpu_sheaves->lock));
 
 	pcs = this_cpu_ptr(s->cpu_sheaves);
 	sheaf = pcs->main;
@@ -2888,10 +2888,37 @@ static bool sheaf_flush_main(struct kmem_cache *s)
 
 	stat_add(s, SHEAF_FLUSH, batch);
 
-	ret = true;
+	return remaining;
+}
 
-	if (remaining)
-		goto next_batch;
+static void sheaf_flush_main(struct kmem_cache *s)
+{
+	unsigned int remaining;
+
+	do {
+		local_lock(&s->cpu_sheaves->lock);
+
+		remaining = __sheaf_flush_main_batch(s);
+
+	} while (remaining);
+}
+
+/*
+ * Returns true if the main sheaf was at least partially flushed.
+ */
+static bool sheaf_try_flush_main(struct kmem_cache *s)
+{
+	unsigned int remaining;
+	bool ret = false;
+
+	do {
+		if (!local_trylock(&s->cpu_sheaves->lock))
+			return ret;
+
+		ret = true;
+		remaining = __sheaf_flush_main_batch(s);
+
+	} while (remaining);
 
 	return ret;
 }
@@ -5704,7 +5731,7 @@ __pcs_replace_full_main(struct kmem_cache *s, struct slub_percpu_sheaves *pcs,
 	if (put_fail)
 		stat(s, BARN_PUT_FAIL);
 
-	if (!sheaf_flush_main(s))
+	if (!sheaf_try_flush_main(s))
 		return NULL;
 
 	if (!local_trylock(&s->cpu_sheaves->lock))
