@@ -37,7 +37,9 @@ compiler's use of code-motion and common-subexpression optimizations.
 Therefore, if a given access is involved in an intentional data race,
 using READ_ONCE() for loads and WRITE_ONCE() for stores is usually
 preferable to data_race(), which in turn is usually preferable to plain
-C-language accesses.
+C-language accesses.  It is permissible to combine #2 and #3, for example,
+data_race(READ_ONCE(a)), which will both restrict compiler optimizations
+and disable KCSAN diagnostics.
 
 KCSAN will complain about many types of data races involving plain
 C-language accesses, but marking all accesses involved in a given data
@@ -86,6 +88,10 @@ that fail to exclude the updates. In this case, it is important to use
 data_race() for the diagnostic reads because otherwise KCSAN would give
 false-positive warnings about these diagnostic reads.
 
+If it is necessary to both restrict compiler optimizations and disable
+KCSAN diagnostics, use both data_race() and READ_ONCE(), for example,
+data_race(READ_ONCE(a)).
+
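+As a rough sketch (hypothetical code, not part of this patch, with
+shared_counter standing in for some variable that is updated locklessly
+elsewhere), such a combined access might look as follows:
+
+	extern int shared_counter;
+
+	void dump_shared_counter(void)
+	{
+		/* READ_ONCE() limits optimizations; data_race() quiets KCSAN. */
+		pr_info("counter: %d\n", data_race(READ_ONCE(shared_counter)));
+	}
+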
 In theory, plain C-language loads can also be used for this use case.
 However, in practice this will have the disadvantage of causing KCSAN
 to generate false positives because KCSAN will have no way of knowing
@@ -126,6 +132,11 @@ consistent errors, which in turn are quite capable of breaking heuristics.
 Therefore use of data_race() should be limited to cases where some other
 code (such as a barrier() call) will force the occasional reload.
 
+Note that this use case requires that the heuristic be able to handle
+any possible error.  In contrast, if the heuristics might be fatally
+confused by one or more of the possible erroneous values, use READ_ONCE()
+instead of data_race().
+
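+For illustration (a hypothetical sketch, with approx_depth standing in
+for some locklessly updated statistic, and assuming that the surrounding
+code forces occasional reloads, for example via barrier()), such an
+error-tolerant heuristic might look as follows:
+
+	extern int approx_depth;
+
+	bool queue_looks_busy(void)
+	{
+		/* A stale or torn value merely mistunes the heuristic. */
+		return data_race(approx_depth) > 16;
+	}
+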
 In theory, plain C-language loads can also be used for this use case.
 However, in practice this will have the disadvantage of causing KCSAN
 to generate false positives because KCSAN will have no way of knowing
@@ -259,9 +270,9 @@ diagnostic purposes. The code might look as follows:
 		return ret;
 	}
 
-	int read_foo_diagnostic(void)
+	void read_foo_diagnostic(void)
 	{
-		return data_race(foo);
+		pr_info("Current value of foo: %d\n", data_race(foo));
 	}
 
 The reader-writer lock prevents the compiler from introducing concurrency
@@ -274,19 +285,34 @@ tells KCSAN that data races are expected, and should be silently
 ignored.  This data_race() also tells the human reading the code that
 read_foo_diagnostic() might sometimes return a bogus value.
 
-However, please note that your kernel must be built with
-CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n in order for KCSAN to
-detect a buggy lockless write.  If you need KCSAN to detect such a
-write even if that write did not change the value of foo, you also
-need CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n.  If you need KCSAN to
-detect such a write happening in an interrupt handler running on the
-same CPU doing the legitimate lock-protected write, you also need
-CONFIG_KCSAN_INTERRUPT_WATCHER=y.  With some or all of these Kconfig
-options set properly, KCSAN can be quite helpful, although it is not
-necessarily a full replacement for hardware watchpoints.  On the other
-hand, neither are hardware watchpoints a full replacement for KCSAN
-because it is not always easy to tell hardware watchpoint to conditionally
-trap on accesses.
+If it is necessary to suppress compiler optimization and also detect
+buggy lockless writes, read_foo_diagnostic() can be updated as follows:
+
+	void read_foo_diagnostic(void)
+	{
+		pr_info("Current value of foo: %d\n", data_race(READ_ONCE(foo)));
+	}
+
+Alternatively, given that KCSAN is to ignore all accesses in this function,
+this function can be marked __no_kcsan and the data_race() can be dropped:
+
+	void __no_kcsan read_foo_diagnostic(void)
+	{
+		pr_info("Current value of foo: %d\n", READ_ONCE(foo));
+	}
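+
+Note that __no_kcsan suppresses KCSAN instrumentation for the entire
+function, while data_race() marks only the one access.  Therefore, prefer
+data_race() when the function contains other accesses that KCSAN should
+still be checking.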
+
+However, in order for KCSAN to detect buggy lockless writes, your kernel
+must be built with CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n.  If you
+need KCSAN to detect such a write even if that write did not change
+the value of foo, you also need CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n.
+If you need KCSAN to detect such a write happening in an interrupt handler
+running on the same CPU doing the legitimate lock-protected write, you
+also need CONFIG_KCSAN_INTERRUPT_WATCHER=y.  With some or all of these
+Kconfig options set properly, KCSAN can be quite helpful, although
+it is not necessarily a full replacement for hardware watchpoints.
+On the other hand, neither are hardware watchpoints a full replacement
+for KCSAN because it is not always easy to tell hardware watchpoints to
+conditionally trap on accesses.
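+
+For reference, a debug build using the most sensitive of these settings
+might carry a .config fragment like the following (a sketch only; option
+dependencies vary across kernel versions):
+
+	CONFIG_KCSAN=y
+	# CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC is not set
+	# CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY is not set
+	CONFIG_KCSAN_INTERRUPT_WATCHER=y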
 
 
 Lock-Protected Writes With Lockless Reads
@@ -319,6 +345,99 @@ of the ASSERT_EXCLUSIVE_WRITER() is to allow KCSAN to check for a buggy
 concurrent lockless write.
 
 
+Lock-Protected Writes With Heuristic Lockless Reads
+---------------------------------------------------
+
+For another example, suppose that the code can normally make use of
+a per-data-structure lock, but there are times when a global lock
+is required.  These times are indicated via a global flag.  The code
+might look as follows, and is based loosely on nf_conntrack_lock(),
+nf_conntrack_all_lock(), and nf_conntrack_all_unlock():
+
+	bool global_flag;
+	DEFINE_SPINLOCK(global_lock);
+	struct foo {
+		spinlock_t f_lock;
+		int f_data;
+	};
+
+	/* All foo structures are in the following array. */
+	int nfoo;
+	struct foo *foo_array;
+
+	void do_something_locked(struct foo *fp)
+	{
+		/* This works even if data_race() returns nonsense. */
+		if (!data_race(global_flag)) {
+			spin_lock(&fp->f_lock);
+			if (!smp_load_acquire(&global_flag)) {
+				do_something(fp);
+				spin_unlock(&fp->f_lock);
+				return;
+			}
+			spin_unlock(&fp->f_lock);
+		}
+		spin_lock(&global_lock);
+		/* global_lock held, thus global_flag cannot be set. */
+		spin_lock(&fp->f_lock);
+		spin_unlock(&global_lock);
+		/*
+		 * global_flag might be set here, but begin_global()
+		 * will wait for ->f_lock to be released.
+		 */
+		do_something(fp);
+		spin_unlock(&fp->f_lock);
+	}
+
+	void begin_global(void)
+	{
+		int i;
+
+		spin_lock(&global_lock);
+		WRITE_ONCE(global_flag, true);
+		for (i = 0; i < nfoo; i++) {
+			/*
+			 * Wait for pre-existing local locks.  One at
+			 * a time to avoid lockdep limitations.
+			 */
+			spin_lock(&foo_array[i].f_lock);
+			spin_unlock(&foo_array[i].f_lock);
+		}
+	}
+
+	void end_global(void)
+	{
+		smp_store_release(&global_flag, false);
+		spin_unlock(&global_lock);
+	}
+
+All code paths leading from the do_something_locked() function's first
+read from global_flag acquire a lock, so endless load fusing cannot
+happen.
+
+If the value read from global_flag is false, then global_flag is
+rechecked while holding ->f_lock, which, if global_flag is still false,
+prevents begin_global() from completing.  It is therefore safe to invoke
+do_something().
+
+Otherwise, if either value read from global_flag is true, then after
+global_lock is acquired global_flag must be false.  The acquisition of
+->f_lock will prevent any call to begin_global() from returning, which
+means that it is safe to release global_lock and invoke do_something().
+
+For this to work, only those foo structures in foo_array[] may be passed
+to do_something_locked().  The reason for this is that the synchronization
+with begin_global() relies on momentarily holding the lock of each and
+every foo structure.
+
+The smp_load_acquire() and smp_store_release() are required because
+changes to a foo structure between calls to begin_global() and
+end_global() are carried out without holding that structure's ->f_lock.
+The smp_load_acquire() and smp_store_release() ensure that the next
+invocation of do_something() from do_something_locked() will see those
+changes.
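+
+For example (a hypothetical update_all_foo() function, shown here only
+to illustrate the usage pattern), a global update might proceed as
+follows:
+
+	void update_all_foo(int newval)
+	{
+		int i;
+
+		begin_global();
+		for (i = 0; i < nfoo; i++)
+			foo_array[i].f_data = newval; /* ->f_lock not needed. */
+		end_global();
+	}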
+
+
 Lockless Reads and Writes
 -------------------------
 