Commit f0498d2
Peter Zijlstra
sched: Fix stop_one_cpu_nowait() vs hotplug
Kuyo reported sporadic failures on a sched_setaffinity() vs CPU
hotplug stress-test -- notably affine_move_task() remains stuck in
wait_for_completion(), leading to a hung-task detector warning.
Specifically, it was reported that stop_one_cpu_nowait(.fn =
migration_cpu_stop) returns false -- this stopper is responsible for
the matching complete().
The race scenario is:
        CPU0                                    CPU1

                                        // doing _cpu_down()

  __set_cpus_allowed_ptr()
    task_rq_lock();
                                  takedown_cpu()
                                    stop_machine_cpuslocked(take_cpu_down..)

                                        <PREEMPT: cpu_stopper_thread()
                                          MULTI_STOP_PREPARE
                                          ...
    __set_cpus_allowed_ptr_locked()
      affine_move_task()
        task_rq_unlock();

  <PREEMPT: cpu_stopper_thread()\>
    ack_state()
                                          MULTI_STOP_RUN
                                            take_cpu_down()
                                              __cpu_disable();
                                              stop_machine_park();
                                                stopper->enabled = false;
                                         />
   />
        stop_one_cpu_nowait(.fn = migration_cpu_stop);
          if (stopper->enabled) // false!!!
That is, by doing stop_one_cpu_nowait() after dropping rq-lock, the
stopper thread gets a chance to preempt and allows the cpu-down for
the target CPU to complete.
OTOH, since stop_one_cpu_nowait() / cpu_stop_queue_work() needs to
issue a wakeup, it must not be run under the scheduler locks.
Solve this apparent contradiction by keeping preemption disabled over
the unlock + queue_stopper combination:
	preempt_disable();
	task_rq_unlock(...);
	if (!stop_pending)
	  stop_one_cpu_nowait(...)
	preempt_enable();
This respects the lock ordering constraints while still avoiding the
above race. That is, if we find the CPU is online under rq-lock, the
targeted stop_one_cpu_nowait() must succeed.
Apply this pattern to all similar stop_one_cpu_nowait() invocations.
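
The ordering argument above can be illustrated with a small, deterministic userspace model. This is only a sketch and not kernel code: every `toy_*` name is hypothetical, the "stopper thread" is reduced to a flag that fires at the next preemption point, and the target CPU's stopper->enabled is modeled as a plain bool. Under those assumptions, queueing after the unlock without preemption disabled fails, while the preempt_disable() / preempt_enable() bracket lets the queueing succeed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are kernel APIs. */
struct toy_state {
	int  preempt_count;   /* > 0 means the local stopper thread cannot preempt us */
	bool stopper_waiting; /* local stopper thread wants to run and ack the multi-stop */
	bool target_enabled;  /* models stopper->enabled on the CPU being taken down */
};

/* A preemption point: if allowed, the waiting stopper runs, acks the state,
 * and the remote take_cpu_down() then disables the target's stopper. */
static void toy_preempt_point(struct toy_state *s)
{
	if (s->preempt_count == 0 && s->stopper_waiting) {
		s->stopper_waiting = false;
		s->target_enabled = false;
	}
}

static void toy_preempt_disable(struct toy_state *s)
{
	s->preempt_count++;
}

static void toy_preempt_enable(struct toy_state *s)
{
	assert(s->preempt_count > 0);
	s->preempt_count--;
	toy_preempt_point(s);
}

/* The rq-lock is modeled as a non-preemptible section. */
static void toy_rq_lock(struct toy_state *s)   { toy_preempt_disable(s); }
static void toy_rq_unlock(struct toy_state *s) { toy_preempt_enable(s); }

/* Models the stopper->enabled check made when queueing the stop work. */
static bool toy_queue_stop_work(const struct toy_state *s)
{
	return s->target_enabled;
}

/* Returns whether the stop work could be queued.
 * fixed == false: unlock, then queue (the racy ordering).
 * fixed == true:  keep preemption disabled over unlock + queue. */
bool toy_move_task(bool fixed)
{
	struct toy_state s = { 0, false, true };
	bool queued;

	toy_rq_lock(&s);
	s.stopper_waiting = true;   /* cpu-down begins while we hold the lock */
	assert(s.target_enabled);   /* target is seen online under the lock */

	if (fixed)
		toy_preempt_disable(&s);
	toy_rq_unlock(&s);          /* preemption point unless 'fixed' */
	queued = toy_queue_stop_work(&s);
	if (fixed)
		toy_preempt_enable(&s); /* the stopper may only run from here on */

	return queued;
}
```

In this model `toy_move_task(false)` returns false and `toy_move_task(true)` returns true, which mirrors the claim that a CPU found online under rq-lock can still have its stop work queued when preemption stays disabled across the unlock.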
Fixes: 6d337ea ("sched: Fix migrate_disable() vs set_cpus_allowed_ptr()")
Reported-by: "Kuyo Chang (張建文)" <Kuyo.Chang@mediatek.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: "Kuyo Chang (張建文)" <Kuyo.Chang@mediatek.com>
Link: https://lkml.kernel.org/r/20231010200442.GA16515@noisy.programming.kicks-ass.net
1 parent 0c29240 commit f0498d2
4 files changed
Lines changed: 17 additions & 3 deletions