@@ -16,18 +16,23 @@ to start learning about RCU:
1616| 6. The RCU API, 2019 Edition https://lwn.net/Articles/777036/
1717| 2019 Big API Table https://lwn.net/Articles/777165/
1818
19+ For those preferring video:
20+
21+ | 1. Unraveling RCU Mysteries: Fundamentals https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries
22+ | 2. Unraveling RCU Mysteries: Additional Use Cases https://www.linuxfoundation.org/webinars/unraveling-rcu-usage-mysteries-additional-use-cases
23+
1924
2025What is RCU?
2126
2227RCU is a synchronization mechanism that was added to the Linux kernel
2328during the 2.5 development effort that is optimized for read-mostly
24- situations. Although RCU is actually quite simple once you understand it,
25- getting there can sometimes be a challenge. Part of the problem is that
26- most of the past descriptions of RCU have been written with the mistaken
27- assumption that there is "one true way" to describe RCU. Instead,
28- the experience has been that different people must take different paths
29- to arrive at an understanding of RCU. This document provides several
30- different paths, as follows:
29+ situations. Although RCU is actually quite simple, making effective use
30+ of it requires you to think differently about your code. A further
31+ obstacle is the mistaken assumption that there is "one true way" to
32+ describe and to use RCU. Instead, the experience has been that different
33+ people must take different paths to arrive at an understanding of RCU,
34+ depending on their experiences and use cases. This document provides
35+ several different paths, as follows:
3136
3237:ref:`1. RCU OVERVIEW <1_whatisRCU>`
3338
@@ -157,34 +162,36 @@ rcu_read_lock()
157162^^^^^^^^^^^^^^^
158163 void rcu_read_lock(void);
159164
160- Used by a reader to inform the reclaimer that the reader is
161- entering an RCU read-side critical section. It is illegal
162- to block while in an RCU read-side critical section, though
163- kernels built with CONFIG_PREEMPT_RCU can preempt RCU
164- read-side critical sections. Any RCU-protected data structure
165- accessed during an RCU read-side critical section is guaranteed to
166- remain unreclaimed for the full duration of that critical section.
167- Reference counts may be used in conjunction with RCU to maintain
168- longer-term references to data structures.
165+ This temporal primitive is used by a reader to inform the
166+ reclaimer that the reader is entering an RCU read-side critical
167+ section. It is illegal to block while in an RCU read-side
168+ critical section, though kernels built with CONFIG_PREEMPT_RCU
169+ can preempt RCU read-side critical sections. Any RCU-protected
170+ data structure accessed during an RCU read-side critical section
171+ is guaranteed to remain unreclaimed for the full duration of that
172+ critical section. Reference counts may be used in conjunction
173+ with RCU to maintain longer-term references to data structures.
169174
170175rcu_read_unlock()
171176^^^^^^^^^^^^^^^^^
172177 void rcu_read_unlock(void);
173178
174- Used by a reader to inform the reclaimer that the reader is
175- exiting an RCU read-side critical section. Note that RCU
176- read-side critical sections may be nested and/or overlapping.
179+ This temporal primitive is used by a reader to inform the
180+ reclaimer that the reader is exiting an RCU read-side critical
181+ section. Note that RCU read-side critical sections may be nested
182+ and/or overlapping.
177183
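The reader-side rules above (sections may nest, and an updater waits for readers to leave) can be illustrated with a small userspace model. This is only a sketch under loud assumptions: every `toy_` name is hypothetical, a single global counter stands in for the kernel's far more scalable machinery, and the wait loop is more conservative than real RCU because it also waits for readers that start later.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical toy, not the kernel implementation: one global count of
 * readers currently inside a read-side critical section. */
static atomic_int toy_active_readers;

/* Enter a (possibly nested) read-side critical section. */
static void toy_rcu_read_lock(void)
{
    atomic_fetch_add(&toy_active_readers, 1);
}

/* Leave the innermost read-side critical section. */
static void toy_rcu_read_unlock(void)
{
    int prev = atomic_fetch_sub(&toy_active_readers, 1);

    assert(prev > 0); /* an unlock without a matching lock is a bug */
}

/* Nonzero when no reader is inside a critical section right now. */
static int toy_no_readers(void)
{
    return atomic_load(&toy_active_readers) == 0;
}

/* Spin until no reader is active. Any instant at which the count is zero
 * proves that all pre-existing readers have finished. */
static void toy_synchronize_rcu(void)
{
    while (!toy_no_readers())
        ;
}
```

Nesting falls out of the counter: two calls to `toy_rcu_read_lock()` need two calls to `toy_rcu_read_unlock()` before `toy_synchronize_rcu()` can return.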
178184synchronize_rcu()
179185^^^^^^^^^^^^^^^^^
180186 void synchronize_rcu(void);
181187
182- Marks the end of updater code and the beginning of reclaimer
183- code. It does this by blocking until all pre-existing RCU
184- read-side critical sections on all CPUs have completed.
185- Note that synchronize_rcu() will **not** necessarily wait for
186- any subsequent RCU read-side critical sections to complete.
187- For example, consider the following sequence of events::
188+ This temporal primitive marks the end of updater code and the
189+ beginning of reclaimer code. It does this by blocking until
190+ all pre-existing RCU read-side critical sections on all CPUs
191+ have completed. Note that synchronize_rcu() will **not**
192+ necessarily wait for any subsequent RCU read-side critical
193+ sections to complete. For example, consider the following
194+ sequence of events::
188195
189196 CPU 0 CPU 1 CPU 2
190197 ----------------- ------------------------- ---------------
@@ -211,13 +218,13 @@ synchronize_rcu()
211218 to be useful in all but the most read-intensive situations,
212219 synchronize_rcu()'s overhead must also be quite small.
213220
214- The call_rcu() API is a callback form of synchronize_rcu(),
215- and is described in more detail in a later section. Instead of
216- blocking, it registers a function and argument which are invoked
217- after all ongoing RCU read-side critical sections have completed.
218- This callback variant is particularly useful in situations where
219- it is illegal to block or where update-side performance is
220- critically important.
221+ The call_rcu() API is an asynchronous callback form of
222+ synchronize_rcu(), and is described in more detail in a later
223+ section. Instead of blocking, it registers a function and
224+ argument which are invoked after all ongoing RCU read-side
225+ critical sections have completed. This callback variant is
226+ particularly useful in situations where it is illegal to block
227+ or where update-side performance is critically important.
221228
222229 However, the call_rcu() API should not be used lightly, as use
223230 of the synchronize_rcu() API generally results in simpler code.
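The difference between blocking and registering a callback can be sketched in userspace as follows. This is an illustrative model only: `struct toy_rcu_head`, `toy_call_rcu()`, `toy_grace_period_ended()`, and `toy_count_cb()` are hypothetical names, the code is single-threaded with no locking, and the point at which a grace period ends is simply assumed to be signalled by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

/* Callbacks registered but not yet invoked. */
static struct toy_rcu_head *toy_pending;

/* Non-blocking: queue the callback and return immediately. */
static void toy_call_rcu(struct toy_rcu_head *head,
                         void (*func)(struct toy_rcu_head *head))
{
    head->func = func;
    head->next = toy_pending;
    toy_pending = head;
}

/* Assumed to be called once a grace period has elapsed. Invokes every
 * queued callback and returns how many were invoked. */
static int toy_grace_period_ended(void)
{
    struct toy_rcu_head *list = toy_pending;
    int n = 0;

    toy_pending = NULL;
    while (list) {
        /* Fetch next first: the callback may free the enclosing object. */
        struct toy_rcu_head *next = list->next;

        list->func(list);
        list = next;
        n++;
    }
    return n;
}

/* Example callback that only counts its invocations. */
static int toy_reclaimed;

static void toy_count_cb(struct toy_rcu_head *head)
{
    (void)head;
    toy_reclaimed++;
}
```

The updater never waits in `toy_call_rcu()`; the cost is that reclamation logic moves into a callback, which is one reason the blocking form tends to give simpler code.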
@@ -236,11 +243,13 @@ rcu_assign_pointer()
236243 would be cool to be able to declare a function in this manner.
237244 (Compiler experts will no doubt disagree.)
238245
239- The updater uses this function to assign a new value to an
246+ The updater uses this spatial macro to assign a new value to an
240247 RCU-protected pointer, in order to safely communicate the change
241- in value from the updater to the reader. This macro does not
242- evaluate to an rvalue, but it does execute any memory-barrier
243- instructions required for a given CPU architecture.
248+ in value from the updater to the reader. This is a spatial (as
249+ opposed to temporal) macro. It does not evaluate to an rvalue,
250+ but it does execute any memory-barrier instructions required
251+ for a given CPU architecture. Its ordering properties are those
252+ of a store-release operation.
244253
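The store-release behavior described above can be approximated in userspace with C11 atomics. This is a sketch, not the kernel macro: `toy_gp`, `toy_publish()`, and `toy_init_and_publish()` are hypothetical names, and `atomic_store_explicit()` with `memory_order_release` is used here as an assumed analogue of rcu_assign_pointer().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
    int a;
    int b;
};

/* Pointer shared between the updater and readers (hypothetical). */
static struct foo *_Atomic toy_gp;

/* Assumed userspace analogue of rcu_assign_pointer(toy_gp, p). */
static void toy_publish(struct foo *p)
{
    atomic_store_explicit(&toy_gp, p, memory_order_release);
}

/* Initialize the structure first, then publish the pointer. The release
 * store keeps the field initialization ordered before the pointer store. */
static void toy_init_and_publish(struct foo *p, int a, int b)
{
    p->a = a;
    p->b = b;
    toy_publish(p);
}
```

A reader that observes the new pointer through a suitably ordered load is then guaranteed to see the initialized fields as well.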
245254 Perhaps just as important, it serves to document (1) which
246255 pointers are protected by RCU and (2) the point at which a
@@ -255,14 +264,15 @@ rcu_dereference()
255264 Like rcu_assign_pointer(), rcu_dereference() must be implemented
256265 as a macro.
257266
258- The reader uses rcu_dereference() to fetch an RCU-protected
259- pointer, which returns a value that may then be safely
260- dereferenced. Note that rcu_dereference() does not actually
261- dereference the pointer, instead, it protects the pointer for
262- later dereferencing. It also executes any needed memory-barrier
263- instructions for a given CPU architecture. Currently, only Alpha
264- needs memory barriers within rcu_dereference() -- on other CPUs,
265- it compiles to nothing, not even a compiler directive.
267+ The reader uses the spatial rcu_dereference() macro to fetch
268+ an RCU-protected pointer, which returns a value that may
269+ then be safely dereferenced. Note that rcu_dereference()
270+ does not actually dereference the pointer; instead, it
271+ protects the pointer for later dereferencing. It also
272+ executes any needed memory-barrier instructions for a given
273+ CPU architecture. Currently, only Alpha needs memory barriers
274+ within rcu_dereference() -- on other CPUs, it compiles to a
275+ volatile load.
266276
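The reader side can be sketched in userspace in the same spirit. The names `toy_gp`, `toy_dereference()`, and `toy_read_a()` are hypothetical, and the acquire load is an assumption chosen for portability: it is stronger than the dependency-ordered volatile load described above. In kernel code the equivalent of `toy_read_a()` would run between rcu_read_lock() and rcu_read_unlock().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct foo {
    int a;
};

/* Pointer shared between the updater and readers (hypothetical). */
static struct foo *_Atomic toy_gp;

/* Assumed userspace analogue of rcu_dereference(toy_gp): fetches the
 * pointer but does not dereference it. */
static struct foo *toy_dereference(void)
{
    return atomic_load_explicit(&toy_gp, memory_order_acquire);
}

/* Snapshot the pointer once into a local, then use only the local, so
 * that a concurrent update cannot change it between uses. */
static int toy_read_a(int fallback)
{
    struct foo *p = toy_dereference();

    return p ? p->a : fallback;
}
```

Fetching the shared pointer a second time inside the same reader could return a different structure, which is why the local copy matters.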
267277 Common coding practice uses rcu_dereference() to copy an
268278 RCU-protected pointer to a local variable, then dereferences
@@ -355,12 +365,15 @@ reader, updater, and reclaimer.
355365 synchronize_rcu() & call_rcu()
356366
357367
358- The RCU infrastructure observes the time sequence of rcu_read_lock(),
368+ The RCU infrastructure observes the temporal sequence of rcu_read_lock(),
359369rcu_read_unlock(), synchronize_rcu(), and call_rcu() invocations in
360370order to determine when (1) synchronize_rcu() invocations may return
361371to their callers and (2) call_rcu() callbacks may be invoked. Efficient
362372implementations of the RCU infrastructure make heavy use of batching in
363373order to amortize their overhead over many uses of the corresponding APIs.
374+ The rcu_assign_pointer() and rcu_dereference() invocations communicate
375+ spatial changes via stores to and loads from the RCU-protected pointer in
376+ question.
364377
365378There are at least three flavors of RCU usage in the Linux kernel. The diagram
366379above shows the most common one. On the updater side, the rcu_assign_pointer(),
@@ -392,7 +405,9 @@ b. RCU applied to networking data structures that may be subjected
392405c. RCU applied to scheduler and interrupt/NMI-handler tasks.
393406
394407Again, most uses will be of (a). The (b) and (c) cases are important
395- for specialized uses, but are relatively uncommon.
408+ for specialized uses, but are relatively uncommon. The SRCU, RCU-Tasks,
409+ RCU-Tasks-Rude, and RCU-Tasks-Trace flavors have similar relationships
410+ among their assorted primitives.
396411
397412.. _3_whatisRCU:
398413
@@ -468,7 +483,7 @@ So, to sum up:
468483- Within an RCU read-side critical section, use rcu_dereference()
469484 to dereference RCU-protected pointers.
470485
471- - Use some solid scheme (such as locks or semaphores) to
486+ - Use some solid design (such as locks or semaphores) to
472487 keep concurrent updates from interfering with each other.
473488
474489- Use rcu_assign_pointer() to update an RCU-protected pointer.
@@ -579,6 +594,14 @@ to avoid having to write your own callback::
579594
580595 kfree_rcu(old_fp, rcu);
581596
597+ If the occasional sleep is permitted, the single-argument form may
598+ be used, omitting the rcu_head structure from struct foo::
599+
600+ kfree_rcu(old_fp);
601+
602+ This variant of kfree_rcu() almost never blocks, but might do so by
603+ invoking synchronize_rcu() in response to memory-allocation failure.
604+
582605Again, see checklist.rst for additional rules governing the use of RCU.
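For context, the callback that the two-argument form saves you from writing typically recovers the enclosing structure from its embedded rcu_head and frees it. The following userspace sketch shows that recovery step only. `struct toy_rcu_head`, `toy_container_of()`, `toy_foo_from_head()`, and `toy_foo_reclaim()` are hypothetical stand-ins, with `free()` standing in for kfree().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct toy_rcu_head {
    struct toy_rcu_head *next;
    void (*func)(struct toy_rcu_head *head);
};

struct foo {
    int a;
    struct toy_rcu_head rcu; /* embedded head, as in the two-argument form */
};

/* Userspace version of the container_of() idiom: step back from a member
 * pointer to the start of the enclosing structure. */
#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the struct foo that embeds the given head. */
static struct foo *toy_foo_from_head(struct toy_rcu_head *rp)
{
    return toy_container_of(rp, struct foo, rcu);
}

/* Hand-written reclaim callback: free the whole enclosing object. */
static void toy_foo_reclaim(struct toy_rcu_head *rp)
{
    free(toy_foo_from_head(rp));
}
```

Seen this way, the second argument of the two-argument form names the member that makes this recovery possible.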
583606
584607.. _5_whatisRCU:
@@ -596,7 +619,7 @@ lacking both functionality and performance. However, they are useful
596619in getting a feel for how RCU works. See kernel/rcu/update.c for a
597620production-quality implementation, and see:
598621
599- http://www.rdrop.com/users/paulmck/RCU
622+ https://docs.google.com/document/d/1X0lThx8OK0ZgLMqVoXiR4ZrGURHrXK6NyLRbeXe3Xac/edit
600623
601624for papers describing the Linux kernel RCU implementation. The OLS'01
602625and OLS'02 papers are a good introduction, and the dissertation provides
@@ -929,6 +952,8 @@ unfortunately any spinlock in a ``SLAB_TYPESAFE_BY_RCU`` object must be
929952initialized after each and every call to kmem_cache_alloc(), which renders
930953reference-free spinlock acquisition completely unsafe. Therefore, when
931954using ``SLAB_TYPESAFE_BY_RCU``, make proper use of a reference counter.
955+ (Those willing to use a kmem_cache constructor may also use locking,
956+ including cache-friendly sequence locking.)
932957
933958With traditional reference counting -- such as that implemented by the
934959kref library in Linux -- there is typically code that runs when the last
@@ -1047,6 +1072,30 @@ sched::
10471072 rcu_read_lock_sched_held
10481073
10491074
1075+ RCU-Tasks::
1076+
1077+ Critical sections Grace period Barrier
1078+
1079+ N/A call_rcu_tasks rcu_barrier_tasks
1080+ synchronize_rcu_tasks
1081+
1082+
1083+ RCU-Tasks-Rude::
1084+
1085+ Critical sections Grace period Barrier
1086+
1087+ N/A call_rcu_tasks_rude rcu_barrier_tasks_rude
1088+ synchronize_rcu_tasks_rude
1089+
1090+
1091+ RCU-Tasks-Trace::
1092+
1093+ Critical sections Grace period Barrier
1094+
1095+ rcu_read_lock_trace call_rcu_tasks_trace rcu_barrier_tasks_trace
1096+ rcu_read_unlock_trace synchronize_rcu_tasks_trace
1097+
1098+
10501099SRCU::
10511100
10521101 Critical sections Grace period Barrier
@@ -1087,35 +1136,43 @@ list can be helpful:
10871136
10881137a. Will readers need to block? If so, you need SRCU.
10891138
1090- b. What about the -rt patchset? If readers would need to block
1091- in an non-rt kernel, you need SRCU. If readers would block
1092- in a -rt kernel, but not in a non-rt kernel, SRCU is not
1093- necessary. (The -rt patchset turns spinlocks into sleeplocks,
1094- hence this distinction.)
1139+ b. Will readers need to block and are you doing tracing, for
1140+ example, ftrace or BPF? If so, you need RCU-Tasks,
1141+ RCU-Tasks-Rude, and/or RCU-Tasks-Trace.
1142+
1143+ c. What about the -rt patchset? If readers would need to block in
1144+ a non-rt kernel, you need SRCU. If readers would block when
1145+ acquiring spinlocks in a -rt kernel, but not in a non-rt kernel,
1146+ SRCU is not necessary. (The -rt patchset turns spinlocks into
1147+ sleeplocks, hence this distinction.)
10951148
1096- c. Do you need to treat NMI handlers, hardirq handlers,
1149+ d. Do you need to treat NMI handlers, hardirq handlers,
10971150 and code segments with preemption disabled (whether
10981151 via preempt_disable(), local_irq_save(), local_bh_disable(),
10991152 or some other mechanism) as if they were explicit RCU readers?
1100- If so, RCU-sched is the only choice that will work for you.
1101-
1102- d. Do you need RCU grace periods to complete even in the face
1103- of softirq monopolization of one or more of the CPUs? For
1104- example, is your code subject to network-based denial-of-service
1105- attacks? If so, you should disable softirq across your readers,
1106- for example, by using rcu_read_lock_bh().
1107-
1108- e. Is your workload too update-intensive for normal use of
1153+ If so, RCU-sched readers are the only choice that will work
1154+ for you, but since about v4.20 you can use the vanilla RCU
1155+ update primitives.
1156+
1157+ e. Do you need RCU grace periods to complete even in the face of
1158+ softirq monopolization of one or more of the CPUs? For example,
1159+ is your code subject to network-based denial-of-service attacks?
1160+ If so, you should disable softirq across your readers, for
1161+ example, by using rcu_read_lock_bh(). Since about v4.20 you
1162+ can use the vanilla RCU update primitives.
1163+
1164+ f. Is your workload too update-intensive for normal use of
11091165 RCU, but inappropriate for other synchronization mechanisms?
11101166 If so, consider SLAB_TYPESAFE_BY_RCU (which was originally
11111167 named SLAB_DESTROY_BY_RCU). But please be careful!
11121168
1113- f. Do you need read-side critical sections that are respected
1114- even though they are in the middle of the idle loop, during
1115- user-mode execution, or on an offlined CPU? If so, SRCU is the
1116- only choice that will work for you.
1169+ g. Do you need read-side critical sections that are respected even
1170+ on CPUs that are deep in the idle loop, during entry to or exit
1171+ from user-mode execution, or on an offlined CPU? If so, SRCU
1172+ and RCU-Tasks-Trace are the only choices that will work for you,
1173+ with SRCU being strongly preferred in almost all cases.
11171174
1118- g. Otherwise, use RCU.
1175+ h. Otherwise, use RCU.
11191176
11201177Of course, this all assumes that you have determined that RCU is in fact
11211178the right tool for your job.