Commit 84ad0af
net/sched: qdisc_destroy() old ingress and clsact Qdiscs before grafting
mini_Qdisc_pair::p_miniq is a double pointer to mini_Qdisc, initialized
in ingress_init() to point to net_device::miniq_ingress. ingress Qdiscs
access this per-net_device pointer in mini_qdisc_pair_swap(). Similar
for clsact Qdiscs and miniq_egress.
Unfortunately, after introducing RTNL-unlocked RTM_{NEW,DEL,GET}TFILTER
requests (thanks Hillf Danton for the hint), when replacing ingress or
clsact Qdiscs, for example, the old Qdisc ("@old") could access the same
miniq_{in,e}gress pointer(s) concurrently with the new Qdisc ("@new"),
causing race conditions [1] including a use-after-free bug in
mini_qdisc_pair_swap() reported by syzbot:
BUG: KASAN: slab-use-after-free in mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
Write of size 8 at addr ffff888045b31308 by task syz-executor690/14901
...
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:319
print_report mm/kasan/report.c:430 [inline]
kasan_report+0x11c/0x130 mm/kasan/report.c:536
mini_qdisc_pair_swap+0x1c2/0x1f0 net/sched/sch_generic.c:1573
tcf_chain_head_change_item net/sched/cls_api.c:495 [inline]
tcf_chain0_head_change.isra.0+0xb9/0x120 net/sched/cls_api.c:509
tcf_chain_tp_insert net/sched/cls_api.c:1826 [inline]
tcf_chain_tp_insert_unique net/sched/cls_api.c:1875 [inline]
tc_new_tfilter+0x1de6/0x2290 net/sched/cls_api.c:2266
...
@old and @new should not affect each other. In other words, @old should
never modify miniq_{in,e}gress after @new, and @new should not update
@old's RCU state.
Fixing without changing sch_api.c turned out to be difficult (please
refer to Closes: for discussions). Instead, make sure @new's first call
always happen after @old's last call (in {ingress,clsact}_destroy()) has
finished:
In qdisc_graft(), return -EBUSY if @old has any ongoing filter requests,
and call qdisc_destroy() for @old before grafting @new.
Introduce qdisc_refcount_dec_if_one() as the counterpart of
qdisc_refcount_inc_nz() used for filter requests. Introduce a
non-static version of qdisc_destroy() that does a TCQ_F_BUILTIN check,
just like qdisc_put() etc.
Depends on patch "net/sched: Refactor qdisc_graft() for ingress and
clsact Qdiscs".
[1] To illustrate, the syzkaller reproducer adds ingress Qdiscs under
TC_H_ROOT (no longer possible after commit c7cfbd1 ("net/sched:
sch_ingress: Only create under TC_H_INGRESS")) on eth0 that has 8
transmission queues:
Thread 1 creates ingress Qdisc A (containing mini Qdisc a1 and a2),
then adds a flower filter X to A.
Thread 2 creates another ingress Qdisc B (containing mini Qdisc b1 and
b2) to replace A, then adds a flower filter Y to B.
Thread 1 A's refcnt Thread 2
RTM_NEWQDISC (A, RTNL-locked)
qdisc_create(A) 1
qdisc_graft(A) 9
RTM_NEWTFILTER (X, RTNL-unlocked)
__tcf_qdisc_find(A) 10
tcf_chain0_head_change(A)
mini_qdisc_pair_swap(A) (1st)
|
| RTM_NEWQDISC (B, RTNL-locked)
RCU sync 2 qdisc_graft(B)
| 1 notify_and_destroy(A)
|
tcf_block_release(A) 0 RTM_NEWTFILTER (Y, RTNL-unlocked)
qdisc_destroy(A) tcf_chain0_head_change(B)
tcf_chain0_head_change_cb_del(A) mini_qdisc_pair_swap(B) (2nd)
mini_qdisc_pair_swap(A) (3rd) |
... ...
Here, B calls mini_qdisc_pair_swap(), pointing eth0->miniq_ingress to
its mini Qdisc, b1. Then, A calls mini_qdisc_pair_swap() again during
ingress_destroy(), setting eth0->miniq_ingress to NULL, so ingress
packets on eth0 will not find filter Y in sch_handle_ingress().
This is just one of the possible consequences of concurrently accessing
miniq_{in,e}gress pointers.
Fixes: 7a096d5 ("net: sched: ingress: set 'unlocked' flag for Qdisc ops")
Fixes: 87f3739 ("net: sched: ingress: set 'unlocked' flag for clsact Qdisc ops")
Reported-by: syzbot+b53a9c0d1ea4ad62da8b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/r/0000000000006cf87705f79acf1a@google.com/
Cc: Hillf Danton <hdanton@sina.com>
Cc: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>1 parent 2d5f6a8 commit 84ad0af
3 files changed
Lines changed: 42 additions & 8 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
137 | 137 | | |
138 | 138 | | |
139 | 139 | | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
140 | 147 | | |
141 | 148 | | |
142 | 149 | | |
| |||
652 | 659 | | |
653 | 660 | | |
654 | 661 | | |
| 662 | + | |
655 | 663 | | |
656 | 664 | | |
657 | 665 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1086 | 1086 | | |
1087 | 1087 | | |
1088 | 1088 | | |
1089 | | - | |
| 1089 | + | |
| 1090 | + | |
1090 | 1091 | | |
1091 | 1092 | | |
1092 | 1093 | | |
| 1094 | + | |
| 1095 | + | |
| 1096 | + | |
| 1097 | + | |
| 1098 | + | |
| 1099 | + | |
| 1100 | + | |
| 1101 | + | |
| 1102 | + | |
| 1103 | + | |
| 1104 | + | |
1093 | 1105 | | |
1094 | 1106 | | |
1095 | 1107 | | |
| |||
1110 | 1122 | | |
1111 | 1123 | | |
1112 | 1124 | | |
1113 | | - | |
1114 | | - | |
| 1125 | + | |
| 1126 | + | |
| 1127 | + | |
| 1128 | + | |
| 1129 | + | |
| 1130 | + | |
| 1131 | + | |
| 1132 | + | |
| 1133 | + | |
| 1134 | + | |
1115 | 1135 | | |
1116 | 1136 | | |
1117 | 1137 | | |
| |||
1125 | 1145 | | |
1126 | 1146 | | |
1127 | 1147 | | |
1128 | | - | |
1129 | | - | |
1130 | 1148 | | |
1131 | 1149 | | |
1132 | 1150 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
1046 | 1046 | | |
1047 | 1047 | | |
1048 | 1048 | | |
1049 | | - | |
| 1049 | + | |
1050 | 1050 | | |
1051 | 1051 | | |
1052 | 1052 | | |
| |||
1070 | 1070 | | |
1071 | 1071 | | |
1072 | 1072 | | |
| 1073 | + | |
| 1074 | + | |
| 1075 | + | |
| 1076 | + | |
| 1077 | + | |
| 1078 | + | |
| 1079 | + | |
| 1080 | + | |
1073 | 1081 | | |
1074 | 1082 | | |
1075 | 1083 | | |
| |||
1079 | 1087 | | |
1080 | 1088 | | |
1081 | 1089 | | |
1082 | | - | |
| 1090 | + | |
1083 | 1091 | | |
1084 | 1092 | | |
1085 | 1093 | | |
| |||
1094 | 1102 | | |
1095 | 1103 | | |
1096 | 1104 | | |
1097 | | - | |
| 1105 | + | |
1098 | 1106 | | |
1099 | 1107 | | |
1100 | 1108 | | |
| |||
0 commit comments