Commit 96492a6
perf/core: Fix perf_cgroup_switch()
There is a race problem that can trigger WARN_ON_ONCE(cpuctx->cgrp)
in perf_cgroup_switch().
CPU1 CPU2
perf_cgroup_sched_out(prev, next)
cgrp1 = perf_cgroup_from_task(prev)
cgrp2 = perf_cgroup_from_task(next)
if (cgrp1 != cgrp2)
perf_cgroup_switch(prev, PERF_CGROUP_SWOUT)
cgroup_migrate_execute()
task->cgroups = ?
perf_cgroup_attach()
task_function_call(task, __perf_cgroup_move)
perf_cgroup_sched_in(prev, next)
cgrp1 = perf_cgroup_from_task(prev)
cgrp2 = perf_cgroup_from_task(next)
if (cgrp1 != cgrp2)
perf_cgroup_switch(next, PERF_CGROUP_SWIN)
__perf_cgroup_move()
perf_cgroup_switch(task, PERF_CGROUP_SWOUT | PERF_CGROUP_SWIN)
The commit a8d757e ("perf events: Fix slow and broken cgroup
context switch code") want to skip perf_cgroup_switch() when the
perf_cgroup of "prev" and "next" are the same.
But task->cgroups can change in concurrent with context_switch()
in cgroup_migrate_execute(). If cgrp1 == cgrp2 in sched_out(),
cpuctx won't do sched_out. Then task->cgroups changed cause
cgrp1 != cgrp2 in sched_in(), cpuctx will do sched_in. So trigger
WARN_ON_ONCE(cpuctx->cgrp).
Even though __perf_cgroup_move() will be synchronized as the context
switch disables the interrupt, context_switch() still can see the
task->cgroups is changing in the middle, since task->cgroups changed
before sending IPI.
So we have to combine perf_cgroup_sched_in() into perf_cgroup_sched_out(),
unified into perf_cgroup_switch(), to fix the incosistency between
perf_cgroup_sched_out() and perf_cgroup_sched_in().
But we can't just compare prev->cgroups with next->cgroups to decide
whether to skip cpuctx sched_out/in since the prev->cgroups is changing
too. For example:
CPU1 CPU2
cgroup_migrate_execute()
prev->cgroups = ?
perf_cgroup_attach()
task_function_call(task, __perf_cgroup_move)
perf_cgroup_switch(task)
cgrp1 = perf_cgroup_from_task(prev)
cgrp2 = perf_cgroup_from_task(next)
if (cgrp1 != cgrp2)
cpuctx sched_out/in ...
task_function_call() will return -ESRCH
In the above example, prev->cgroups changing cause (cgrp1 == cgrp2)
to be true, so skip cpuctx sched_out/in. And later task_function_call()
would return -ESRCH since the prev task isn't running on cpu anymore.
So we would leave perf_events of the old prev->cgroups still sched on
the CPU, which is wrong.
The solution is that we should use cpuctx->cgrp to compare with
the next task's perf_cgroup. Since cpuctx->cgrp can only be changed
on local CPU, and we have irq disabled, we can read cpuctx->cgrp to
compare without holding ctx lock.
Fixes: a8d757e ("perf events: Fix slow and broken cgroup context switch code")
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220329154523.86438-4-zhouchengming@bytedance.com1 parent 6875186 commit 96492a6
1 file changed
Lines changed: 25 additions & 107 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
824 | 824 | | |
825 | 825 | | |
826 | 826 | | |
827 | | - | |
828 | | - | |
829 | | - | |
830 | 827 | | |
831 | 828 | | |
832 | | - | |
833 | | - | |
834 | | - | |
835 | 829 | | |
836 | | - | |
| 830 | + | |
837 | 831 | | |
| 832 | + | |
838 | 833 | | |
839 | 834 | | |
840 | 835 | | |
| |||
845 | 840 | | |
846 | 841 | | |
847 | 842 | | |
| 843 | + | |
| 844 | + | |
848 | 845 | | |
849 | 846 | | |
850 | 847 | | |
| 848 | + | |
| 849 | + | |
851 | 850 | | |
852 | 851 | | |
853 | 852 | | |
854 | 853 | | |
855 | | - | |
856 | | - | |
857 | | - | |
858 | | - | |
859 | | - | |
860 | | - | |
861 | | - | |
862 | | - | |
| 854 | + | |
| 855 | + | |
| 856 | + | |
| 857 | + | |
| 858 | + | |
| 859 | + | |
| 860 | + | |
| 861 | + | |
| 862 | + | |
| 863 | + | |
| 864 | + | |
| 865 | + | |
| 866 | + | |
863 | 867 | | |
864 | | - | |
865 | | - | |
866 | | - | |
867 | | - | |
868 | | - | |
869 | | - | |
870 | | - | |
871 | | - | |
872 | | - | |
873 | | - | |
874 | | - | |
875 | | - | |
876 | | - | |
877 | 868 | | |
878 | 869 | | |
879 | 870 | | |
880 | 871 | | |
881 | 872 | | |
882 | 873 | | |
883 | 874 | | |
884 | | - | |
885 | | - | |
886 | | - | |
887 | | - | |
888 | | - | |
889 | | - | |
890 | | - | |
891 | | - | |
892 | | - | |
893 | | - | |
894 | | - | |
895 | | - | |
896 | | - | |
897 | | - | |
898 | | - | |
899 | | - | |
900 | | - | |
901 | | - | |
902 | | - | |
903 | | - | |
904 | | - | |
905 | | - | |
906 | | - | |
907 | | - | |
908 | | - | |
909 | | - | |
910 | | - | |
911 | | - | |
912 | | - | |
913 | | - | |
914 | | - | |
915 | | - | |
916 | | - | |
917 | | - | |
918 | | - | |
919 | | - | |
920 | | - | |
921 | | - | |
922 | | - | |
923 | | - | |
924 | | - | |
925 | | - | |
926 | | - | |
927 | | - | |
928 | | - | |
929 | | - | |
930 | | - | |
931 | | - | |
932 | | - | |
933 | | - | |
934 | | - | |
935 | | - | |
936 | 875 | | |
937 | 876 | | |
938 | 877 | | |
| |||
1096 | 1035 | | |
1097 | 1036 | | |
1098 | 1037 | | |
1099 | | - | |
1100 | | - | |
1101 | | - | |
1102 | | - | |
1103 | | - | |
1104 | | - | |
1105 | | - | |
1106 | | - | |
1107 | | - | |
1108 | | - | |
1109 | 1038 | | |
1110 | 1039 | | |
1111 | 1040 | | |
| |||
1118 | 1047 | | |
1119 | 1048 | | |
1120 | 1049 | | |
1121 | | - | |
1122 | | - | |
1123 | | - | |
1124 | | - | |
1125 | | - | |
1126 | 1050 | | |
1127 | 1051 | | |
1128 | 1052 | | |
| |||
1142 | 1066 | | |
1143 | 1067 | | |
1144 | 1068 | | |
| 1069 | + | |
| 1070 | + | |
| 1071 | + | |
| 1072 | + | |
1145 | 1073 | | |
1146 | 1074 | | |
1147 | 1075 | | |
| |||
3661 | 3589 | | |
3662 | 3590 | | |
3663 | 3591 | | |
3664 | | - | |
| 3592 | + | |
3665 | 3593 | | |
3666 | 3594 | | |
3667 | 3595 | | |
| |||
3975 | 3903 | | |
3976 | 3904 | | |
3977 | 3905 | | |
3978 | | - | |
3979 | | - | |
3980 | | - | |
3981 | | - | |
3982 | | - | |
3983 | | - | |
3984 | | - | |
3985 | | - | |
3986 | | - | |
3987 | | - | |
3988 | 3906 | | |
3989 | 3907 | | |
3990 | 3908 | | |
| |||
13556 | 13474 | | |
13557 | 13475 | | |
13558 | 13476 | | |
13559 | | - | |
| 13477 | + | |
13560 | 13478 | | |
13561 | 13479 | | |
13562 | 13480 | | |
| |||
0 commit comments