Commit d6962c4
sched: Clear ttwu_pending after enqueue_task()
We found a long tail latency in schbench whem m*t is close to nr_cpus.
(e.g., "schbench -m 2 -t 16" on a machine with 32 cpus.)
This is because when the wakee cpu is idle, rq->ttwu_pending is cleared
too early, and idle_cpu() will return true until the wakee task enqueued.
This will mislead the waker when selecting idle cpu, and wake multiple
worker threads on the same wakee cpu. This situation is enlarged by
commit f3dd3f6 ("sched: Remove the limitation of WF_ON_CPU on
wakelist if wakee cpu is idle") because it tends to use wakelist.
Here is the result of "schbench -m 2 -t 16" on a VM with 32vcpu
(Intel(R) Xeon(R) Platinum 8369B).
Latency percentiles (usec):
base base+revert_f3dd3f674555 base+this_patch
50.0000th: 9 13 9
75.0000th: 12 19 12
90.0000th: 15 22 15
95.0000th: 18 24 17
*99.0000th: 27 31 24
99.5000th: 3364 33 27
99.9000th: 12560 36 30
We also tested on unixbench and hackbench, and saw no performance
change.
Signed-off-by: Tianchen Ding <dtcccc@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Mel Gorman <mgorman@suse.de>
Link: https://lkml.kernel.org/r/20221104023601.12844-1-dtcccc@linux.alibaba.com1 parent 52b33d8 commit d6962c4
1 file changed
Lines changed: 11 additions & 7 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3739 | 3739 | | |
3740 | 3740 | | |
3741 | 3741 | | |
3742 | | - | |
3743 | | - | |
3744 | | - | |
3745 | | - | |
3746 | | - | |
3747 | | - | |
3748 | | - | |
3749 | 3742 | | |
3750 | 3743 | | |
3751 | 3744 | | |
| |||
3759 | 3752 | | |
3760 | 3753 | | |
3761 | 3754 | | |
| 3755 | + | |
| 3756 | + | |
| 3757 | + | |
| 3758 | + | |
| 3759 | + | |
| 3760 | + | |
| 3761 | + | |
| 3762 | + | |
| 3763 | + | |
| 3764 | + | |
| 3765 | + | |
3762 | 3766 | | |
3763 | 3767 | | |
3764 | 3768 | | |
| |||
0 commit comments