@@ -1973,9 +1973,7 @@ code, and the FQS loop, all of which refer to or modify this bookkeeping.
 Note that grace period initialization (rcu_gp_init()) must carefully sequence
 CPU hotplug scanning with grace period state changes. For example, the
 following race could occur in rcu_gp_init() if rcu_seq_start() were to happen
-after the CPU hotplug scanning.
-
-.. code-block:: none
+after the CPU hotplug scanning::
 
 	CPU0 (rcu_gp_init)                   CPU1                              CPU2
 	---------------------                ----                              ----
@@ -2008,22 +2006,22 @@ after the CPU hotplug scanning.
 	                                         kfree(r1);
 	                                         r2 = *r0; // USE-AFTER-FREE!
 
-By incrementing gp_seq first, CPU1's RCU read-side critical section
+By incrementing ``gp_seq`` first, CPU1's RCU read-side critical section
 is guaranteed to not be missed by CPU2.
 
-**Concurrent Quiescent State Reporting for Offline CPUs**
+Concurrent Quiescent State Reporting for Offline CPUs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 RCU must ensure that CPUs going offline report quiescent states to avoid
 blocking grace periods. This requires careful synchronization to handle
 race conditions.
 
-**Race condition causing Offline CPU to hang GP**
-
-A race between CPU offlining and new GP initialization (gp_init) may occur
-because `rcu_report_qs_rnp()` in `rcutree_report_cpu_dead()` must temporarily
-release the `rcu_node` lock to wake the RCU grace-period kthread:
+Race condition causing Offline CPU to hang GP
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-.. code-block:: none
+A race between CPU offlining and new GP initialization (gp_init()) may occur
+because rcu_report_qs_rnp() in rcutree_report_cpu_dead() must temporarily
+release the ``rcu_node`` lock to wake the RCU grace-period kthread::
 
 	CPU1 (going offline)                     CPU0 (GP kthread)
 	--------------------                     -----------------
@@ -2044,15 +2042,14 @@ release the `rcu_node` lock to wake the RCU grace-period kthread:
 	// Reacquire lock (but too late)
 	rnp->qsmaskinitnext &= ~mask             // Finally clears bit
 
-Without `ofl_lock`, the new grace period includes the offline CPU and waits
+Without ``ofl_lock``, the new grace period includes the offline CPU and waits
 forever for its quiescent state, causing a GP hang.
 
-**A solution with ofl_lock**
+A solution with ofl_lock
+^^^^^^^^^^^^^^^^^^^^^^^^
 
-The `ofl_lock` (offline lock) prevents `rcu_gp_init()` from running during
-the vulnerable window when `rcu_report_qs_rnp()` has released `rnp->lock`:
-
-.. code-block:: none
+The ``ofl_lock`` (offline lock) prevents rcu_gp_init() from running during
+the vulnerable window when rcu_report_qs_rnp() has released ``rnp->lock``::
 
 	CPU0 (rcu_gp_init)                   CPU1 (rcutree_report_cpu_dead)
 	------------------                   ------------------------------
@@ -2065,21 +2062,20 @@ the vulnerable window when `rcu_report_qs_rnp()` has released `rnp->lock`:
 	arch_spin_unlock(&ofl_lock) --->     // Now CPU1 can proceed
 	}                                    // But snapshot already taken
 
-**Another race causing GP hangs in rcu_gpu_init(): Reporting QS for Now-offline CPUs**
+Another race causing GP hangs in rcu_gp_init(): Reporting QS for Now-offline CPUs
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 After the first loop takes an atomic snapshot of online CPUs, as shown above,
-the second loop in `rcu_gp_init()` detects CPUs that went offline between
-releasing `ofl_lock` and acquiring the per-node `rnp->lock`. This detection is
-crucial because:
+the second loop in rcu_gp_init() detects CPUs that went offline between
+releasing ``ofl_lock`` and acquiring the per-node ``rnp->lock``.
+This detection is crucial because:
 
 1. The CPU might have gone offline after the snapshot but before the second loop
 2. The offline CPU cannot report its own QS if it's already dead
 3. Without this detection, the grace period would wait forever for CPUs that
    are now offline.
 
-The second loop performs this detection safely:
-
-.. code-block:: none
+The second loop performs this detection safely::
 
 	rcu_for_each_node_breadth_first(rnp) {
 	    raw_spin_lock_irqsave_rcu_node(rnp, flags);
@@ -2093,10 +2089,10 @@ The second loop performs this detection safely:
 	}
 
 This approach ensures atomicity: quiescent state reporting for offline CPUs
-happens either in `rcu_gp_init()` (second loop) or in `rcutree_report_cpu_dead()`,
-never both and never neither. The `rnp->lock` held throughout the sequence
-prevents races - `rcutree_report_cpu_dead()` also acquires this lock when
-clearing `qsmaskinitnext`, ensuring mutual exclusion.
+happens either in rcu_gp_init() (second loop) or in rcutree_report_cpu_dead(),
+never both and never neither. The ``rnp->lock``, held throughout the sequence,
+prevents races: rcutree_report_cpu_dead() also acquires this lock when
+clearing ``qsmaskinitnext``, ensuring mutual exclusion.
 
 Scheduler and RCU
 ~~~~~~~~~~~~~~~~~