Commit 80b3fd4
rcu: Make rcu_barrier() no longer block CPU-hotplug operations
This commit removes the cpus_read_lock() and cpus_read_unlock() calls
from rcu_barrier(), thus allowing CPUs to come and go during the course
of rcu_barrier() execution. Posting of the ->barrier_head callbacks does
synchronize with portions of RCU's CPU-hotplug notifiers, but these locks
are held for short time periods on both sides. Thus, full CPU-hotplug
operations could both start and finish during the execution of a given
rcu_barrier() invocation.
Additional synchronization is provided by a global ->barrier_lock.
Since the ->barrier_lock is only used during rcu_barrier() execution and
during onlining/offlining a CPU, the contention for this lock should
be low. It might be tempting to make use of a per-CPU lock just on
general principles, but straightforward attempts to do this have the
problems shown below.
Initial state: 3 CPUs present. CPU0 and CPU1 have no callbacks,
and CPU2 has callbacks.
1. CPU0 calls rcu_barrier().
2. CPU1 starts offlining CPU2. CPU1 calls
rcutree_migrate_callbacks(), which calls rcu_barrier_entrain()
while holding CPU2's rdp->barrier_lock. It does not entrain
->barrier_head for CPU2, because rcu_barrier() on CPU0 has not
yet started the barrier sequence (by calling
rcu_seq_start(&rcu_state.barrier_sequence)).
3. CPU0 starts new barrier sequence. It iterates over
CPU0 and CPU1, after acquiring their per-cpu ->barrier_lock
and finds 0 segcblist length. It updates ->barrier_seq_snap
for CPU0 and CPU1 and continues loop iteration to CPU2.
	for_each_possible_cpu(cpu) {
		raw_spin_lock_irqsave(&rdp->barrier_lock, flags);
		if (!rcu_segcblist_n_cbs(&rdp->cblist)) {
			WRITE_ONCE(rdp->barrier_seq_snap, gseq);
			raw_spin_unlock_irqrestore(&rdp->barrier_lock, flags);
			rcu_barrier_trace(TPS("NQ"), cpu, rcu_state.barrier_sequence);
			continue;
		}
4. rcutree_migrate_callbacks() completes execution on CPU1.
Segcblist len for CPU2 becomes 0.
5. The loop iteration on CPU0 checks rcu_segcblist_n_cbs(&rdp->cblist)
for CPU2, finds it to be zero, and completes the loop iteration after
setting ->barrier_seq_snap.
6. Because no ->barrier_head callback has been entrained at
this point, rcu_barrier() on CPU0 returns.
7. The callbacks, which migrated from CPU2 to CPU1, execute.
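
The seven steps above can be replayed in a small single-threaded userspace
model. This is only a sketch under stated assumptions, not kernel code: the
toy_* names, the struct layout, the callback count of 4, and the use of an
odd sequence value to mean "barrier in flight" are all hypothetical
stand-ins for the fields named in the text. The per-CPU locks are not
modeled; the replay simply runs the steps in the problematic order.

```c
#include <assert.h>
#include <stdbool.h>

#define RACE_NR_CPUS 3

/* Toy per-CPU state; names echo the commit text, the layout is hypothetical. */
struct toy_rdp {
	int n_cbs;              /* stands in for rcu_segcblist_n_cbs() */
	unsigned long seq_snap; /* stands in for ->barrier_seq_snap */
	bool head_entrained;    /* true once ->barrier_head is queued */
};

struct toy_state {
	struct toy_rdp rdp[RACE_NR_CPUS];
	unsigned long barrier_sequence; /* odd while a barrier is in flight */
};

/* Entrain only if a barrier sequence is already in progress (step 2). */
static void toy_entrain(struct toy_state *s, int cpu)
{
	if (!(s->barrier_sequence & 1))
		return; /* no barrier in flight: nothing is entrained */
	s->rdp[cpu].head_entrained = true;
}

/* Move every callback from one CPU to another (step 4). */
static void toy_move_cbs(struct toy_state *s, int from, int to)
{
	s->rdp[to].n_cbs += s->rdp[from].n_cbs;
	s->rdp[from].n_cbs = 0;
}

/* One iteration of the barrier's scan loop (steps 3 and 5). */
static void toy_scan_cpu(struct toy_state *s, int cpu)
{
	if (!s->rdp[cpu].n_cbs) {
		s->rdp[cpu].seq_snap = s->barrier_sequence;
		return;
	}
	toy_entrain(s, cpu);
}

/* Returns how many callbacks are still pending when the toy barrier returns. */
int replay_per_cpu_race(void)
{
	struct toy_state s = { .rdp = { [2] = { .n_cbs = 4 } } };
	int cpu, heads = 0, pending = 0;

	toy_entrain(&s, 2);     /* step 2: sequence not started, so a no-op */
	s.barrier_sequence++;   /* step 3: models rcu_seq_start() */
	toy_scan_cpu(&s, 0);    /* step 3: CPU0 has no callbacks */
	toy_scan_cpu(&s, 1);    /* step 3: CPU1 has no callbacks yet */
	toy_move_cbs(&s, 2, 1); /* step 4: migration completes */
	toy_scan_cpu(&s, 2);    /* step 5: CPU2 is now empty */

	for (cpu = 0; cpu < RACE_NR_CPUS; cpu++) {
		heads += s.rdp[cpu].head_entrained;
		pending += s.rdp[cpu].n_cbs;
	}
	assert(heads == 0);     /* step 6: nothing to wait for */
	return pending;         /* step 7: these run after the barrier returned */
}
```

In this model replay_per_cpu_race() returns 4: all of CPU2's original
callbacks now sit on CPU1, which was scanned before they arrived, so the
barrier has nothing to wait on.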
Straightforward per-CPU locking is also subject to the following race
condition noted by Boqun Feng:
1. CPU0 calls rcu_barrier(), starting a new barrier sequence by invoking
rcu_seq_start() and init_completion(), but does not yet initialize
rcu_state.barrier_cpu_count.
2. CPU1 starts offlining CPU2, calling rcutree_migrate_callbacks(),
which in turn calls rcu_barrier_entrain() while holding CPU2's
rdp->barrier_lock. It then entrains ->barrier_head for CPU2
and atomically increments rcu_state.barrier_cpu_count, which is
unfortunately not yet initialized to the value 2.
3. The just-entrained RCU callback is invoked. It atomically
decrements rcu_state.barrier_cpu_count and sees that it is
now zero. This callback therefore invokes complete().
4. CPU0 continues executing rcu_barrier(), but is not blocked
by its call to wait_for_completion(). This results in rcu_barrier()
returning before all pre-existing callbacks have been invoked,
which is a bug.
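
The counter race above can also be replayed deterministically. The sketch
below is a hypothetical userspace model, not the kernel implementation: the
toy_* names are invented, plain ints replace the atomic operations, a bool
replaces the completion, and the counter is assumed to start at zero (as
left behind by a previous barrier). The flag selects whether the count is
set to 2 before or after the entrain step.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_barrier {
	int cpu_count;    /* stands in for rcu_state.barrier_cpu_count */
	bool completed;   /* stands in for the completion */
	bool seq_started; /* true once the barrier sequence has started */
};

/* Entrain ->barrier_head: bump the count if a barrier is in flight. */
static void toy_entrain_head(struct toy_barrier *b)
{
	if (b->seq_started)
		b->cpu_count++;
}

/* The entrained callback runs: drop the count and complete on zero. */
static void toy_head_invoked(struct toy_barrier *b)
{
	assert(b->cpu_count > 0);
	if (--b->cpu_count == 0)
		b->completed = true; /* models complete() */
}

/*
 * Replays steps 1-4. Returns true if the completion fired before the
 * barrier itself dropped its own references to the count.
 */
bool toy_completes_early(bool init_count_first)
{
	struct toy_barrier b = { 0 };
	bool early;

	b.seq_started = true;    /* step 1: rcu_seq_start(), init_completion() */
	if (init_count_first)
		b.cpu_count = 2; /* ordered case: count set before any entrain */

	toy_entrain_head(&b);    /* step 2: 0 -> 1 in the racy case */
	toy_head_invoked(&b);    /* step 3: 1 -> 0 in the racy case */
	early = b.completed;

	if (!init_count_first)
		b.cpu_count = 2; /* step 4: initialization arrives too late */
	return early;
}
```

Here toy_completes_early(false) returns true, matching step 3 of the race,
while toy_completes_early(true) returns false because the count only falls
from 3 to 2 when the entrained callback runs.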
Therefore, synchronization is provided by rcu_state.barrier_lock,
which is also held across the initialization sequence, especially the
rcu_seq_start() and the atomic_set() that sets rcu_state.barrier_cpu_count
to the value 2. In addition, this lock is held when entraining the
rcu_barrier() callback, when deciding whether or not a CPU has callbacks
that rcu_barrier() must wait on, when setting the ->qsmaskinitnext for
incoming CPUs, and when migrating callbacks from a CPU that is going
offline.
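
The effect of the single lock can be sketched with another single-threaded
userspace model. Everything here is an illustrative assumption rather than
the kernel code: the toy_* names are hypothetical, and the lock is modeled
only by running each critical section (initialization, one scan iteration,
or one whole migration) without interruption. The migrate_at argument picks
the lock hand-off point at which the migration of CPU2's callbacks to CPU1
takes place.

```c
#include <assert.h>

#define TOY_NR_CPUS 3

struct toy_cpu {
	int n_cbs;          /* queued callbacks */
	int n_uncovered;    /* callbacks with no ->barrier_head behind them */
	unsigned long snap; /* stands in for ->barrier_seq_snap */
};

struct toy_rcu {
	struct toy_cpu cpu[TOY_NR_CPUS];
	unsigned long seq;  /* odd while a barrier is in flight */
	int cpu_count;      /* stands in for rcu_state.barrier_cpu_count */
};

/* Caller "holds" the lock: handle one CPU unless it was already handled. */
static void toy_entrain_locked(struct toy_rcu *r, int c)
{
	if (!(r->seq & 1) || r->cpu[c].snap == r->seq)
		return;
	r->cpu[c].snap = r->seq;
	if (r->cpu[c].n_cbs) {
		r->cpu[c].n_uncovered = 0; /* a head now follows them all */
		r->cpu_count++;
	}
}

/* One critical section: entrain on the outgoing CPU, then move callbacks. */
static void toy_migrate_locked(struct toy_rcu *r, int from, int to)
{
	toy_entrain_locked(r, from);
	r->cpu[to].n_cbs += r->cpu[from].n_cbs;
	r->cpu[to].n_uncovered += r->cpu[from].n_uncovered;
	r->cpu[from].n_cbs = 0;
	r->cpu[from].n_uncovered = 0;
}

/*
 * migrate_at: 0 = before initialization, 1..3 = right after scanning
 * CPU (migrate_at - 1). Returns the callbacks the barrier fails to cover.
 */
int toy_barrier_global_lock(int migrate_at)
{
	struct toy_rcu r = { .cpu = { [2] = { .n_cbs = 4, .n_uncovered = 4 } } };
	int c, uncovered = 0;

	assert(migrate_at >= 0 && migrate_at <= TOY_NR_CPUS);
	if (migrate_at == 0)
		toy_migrate_locked(&r, 2, 1);

	/* One critical section: sequence start plus count initialization. */
	r.seq++;
	r.cpu_count = 2;

	for (c = 0; c < TOY_NR_CPUS; c++) {
		toy_entrain_locked(&r, c); /* one critical section per CPU */
		if (migrate_at == c + 1)
			toy_migrate_locked(&r, 2, 1);
	}
	for (c = 0; c < TOY_NR_CPUS; c++)
		uncovered += r.cpu[c].n_uncovered;
	assert(r.cpu_count > 2); /* some head was entrained after the init */
	return uncovered;
}
```

In this model every placement of the migration, toy_barrier_global_lock(0)
through toy_barrier_global_lock(3), returns 0: the callbacks are either
found by the scan on CPU1 or covered by a head entrained on CPU2 before
they move, and the count is always incremented from its initialized value.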
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Co-developed-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
1 parent a16578d commit 80b3fd4
2 files changed
Lines changed: 16 additions & 15 deletions