
Commit 1be5bdf

Merge tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
Pull KCSAN updates from Paul McKenney:
 "This provides KCSAN fixes and also the ability to take memory barriers
  into account for weakly-ordered systems. This last can increase the
  probability of detecting certain types of data races."

* tag 'kcsan.2022.01.09a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (29 commits)
  kcsan: Only test clear_bit_unlock_is_negative_byte if arch defines it
  kcsan: Avoid nested contexts reading inconsistent reorder_access
  kcsan: Turn barrier instrumentation into macros
  kcsan: Make barrier tests compatible with lockdep
  kcsan: Support WEAK_MEMORY with Clang where no objtool support exists
  compiler_attributes.h: Add __disable_sanitizer_instrumentation
  objtool, kcsan: Remove memory barrier instrumentation from noinstr
  objtool, kcsan: Add memory barrier instrumentation to whitelist
  sched, kcsan: Enable memory barrier instrumentation
  mm, kcsan: Enable barrier instrumentation
  x86/qspinlock, kcsan: Instrument barrier of pv_queued_spin_unlock()
  x86/barriers, kcsan: Use generic instrumentation for non-smp barriers
  asm-generic/bitops, kcsan: Add instrumentation for barriers
  locking/atomics, kcsan: Add instrumentation for barriers
  locking/barriers, kcsan: Support generic instrumentation
  locking/barriers, kcsan: Add instrumentation for barriers
  kcsan: selftest: Add test case to check memory barrier instrumentation
  kcsan: Ignore GCC 11+ warnings about TSan runtime support
  kcsan: test: Add test cases for memory barrier instrumentation
  kcsan: test: Match reordered or normal accesses
  ...
2 parents: 1c824bf + b473a38

27 files changed: 1347 additions & 172 deletions

Documentation/dev-tools/kcsan.rst

Lines changed: 63 additions & 13 deletions
@@ -204,17 +204,17 @@ Ultimately this allows to determine the possible executions of concurrent code,
 and if that code is free from data races.
 
 KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
-``atomic_*``, etc.), but is oblivious of any ordering guarantees and simply
-assumes that memory barriers are placed correctly. In other words, KCSAN
-assumes that as long as a plain access is not observed to race with another
-conflicting access, memory operations are correctly ordered.
-
-This means that KCSAN will not report *potential* data races due to missing
-memory ordering. Developers should therefore carefully consider the required
-memory ordering requirements that remain unchecked. If, however, missing
-memory ordering (that is observable with a particular compiler and
-architecture) leads to an observable data race (e.g. entering a critical
-section erroneously), KCSAN would report the resulting data race.
+``atomic_*``, etc.), and a subset of ordering guarantees implied by memory
+barriers. With ``CONFIG_KCSAN_WEAK_MEMORY=y``, KCSAN models load or store
+buffering, and can detect missing ``smp_mb()``, ``smp_wmb()``, ``smp_rmb()``,
+``smp_store_release()``, and all ``atomic_*`` operations with equivalent
+implied barriers.
+
+Note, KCSAN will not report all data races due to missing memory ordering,
+specifically where a memory barrier would be required to prohibit subsequent
+memory operation from reordering before the barrier. Developers should
+therefore carefully consider the required memory ordering requirements that
+remain unchecked.
 
 Race Detection Beyond Data Races
 --------------------------------
@@ -268,6 +268,56 @@ marked operations, if all accesses to a variable that is accessed concurrently
 are properly marked, KCSAN will never trigger a watchpoint and therefore never
 report the accesses.
 
+Modeling Weak Memory
+~~~~~~~~~~~~~~~~~~~~
+
+KCSAN's approach to detecting data races due to missing memory barriers is
+based on modeling access reordering (with ``CONFIG_KCSAN_WEAK_MEMORY=y``).
+Each plain memory access for which a watchpoint is set up, is also selected for
+simulated reordering within the scope of its function (at most 1 in-flight
+access).
+
+Once an access has been selected for reordering, it is checked along every
+other access until the end of the function scope. If an appropriate memory
+barrier is encountered, the access will no longer be considered for simulated
+reordering.
+
+When the result of a memory operation should be ordered by a barrier, KCSAN can
+then detect data races where the conflict only occurs as a result of a missing
+barrier. Consider the example::
+
+    int x, flag;
+    void T1(void)
+    {
+        x = 1;                  // data race!
+        WRITE_ONCE(flag, 1);    // correct: smp_store_release(&flag, 1)
+    }
+    void T2(void)
+    {
+        while (!READ_ONCE(flag));   // correct: smp_load_acquire(&flag)
+        ... = x;                    // data race!
+    }
+
+When weak memory modeling is enabled, KCSAN can consider ``x`` in ``T1`` for
+simulated reordering. After the write of ``flag``, ``x`` is again checked for
+concurrent accesses: because ``T2`` is able to proceed after the write of
+``flag``, a data race is detected. With the correct barriers in place, ``x``
+would not be considered for reordering after the proper release of ``flag``,
+and no data race would be detected.
+
+Deliberate trade-offs in complexity but also practical limitations mean only a
+subset of data races due to missing memory barriers can be detected. With
+currently available compiler support, the implementation is limited to modeling
+the effects of "buffering" (delaying accesses), since the runtime cannot
+"prefetch" accesses. Also recall that watchpoints are only set up for plain
+accesses, and the only access type for which KCSAN simulates reordering. This
+means reordering of marked accesses is not modeled.
+
+A consequence of the above is that acquire operations do not require barrier
+instrumentation (no prefetching). Furthermore, marked accesses introducing
+address or control dependencies do not require special handling (the marked
+access cannot be reordered, later dependent accesses cannot be prefetched).
+
 Key Properties
 ~~~~~~~~~~~~~~
 
@@ -290,8 +340,8 @@ Key Properties
 4. **Detects Racy Writes from Devices:** Due to checking data values upon
    setting up watchpoints, racy writes from devices can also be detected.
 
-5. **Memory Ordering:** KCSAN is *not* explicitly aware of the LKMM's ordering
-   rules; this may result in missed data races (false negatives).
+5. **Memory Ordering:** KCSAN is aware of only a subset of LKMM ordering rules;
+   this may result in missed data races (false negatives).
 
 6. **Analysis Accuracy:** For observed executions, due to using a sampling
    strategy, the analysis is *unsound* (false negatives possible), but aims to
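
The documentation's example above marks the two fixes inline. For clarity, a
minimal sketch of the corrected code follows (kernel-context C; the thread
setup around T1/T2 is omitted and assumed):

    #include <asm/barrier.h>    /* smp_store_release(), smp_load_acquire() */

    static int x, flag;

    static void T1(void)
    {
        x = 1;                          /* plain write, published below */
        smp_store_release(&flag, 1);    /* orders the write of x before flag */
    }

    static void T2(void)
    {
        int val;

        while (!smp_load_acquire(&flag))
            ;                           /* acquire pairs with T1's release */
        val = x;                        /* ordered after the acquire: no data race */
    }

With these barriers in place, KCSAN's weak memory modeling stops considering
``x`` for simulated reordering once T1 executes the release, so no data race
is reported.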

arch/x86/include/asm/barrier.h

Lines changed: 5 additions & 5 deletions
@@ -19,9 +19,9 @@
 #define wmb() asm volatile(ALTERNATIVE("lock; addl $0,-4(%%esp)", "sfence", \
                                        X86_FEATURE_XMM2) ::: "memory", "cc")
 #else
-#define mb()    asm volatile("mfence":::"memory")
-#define rmb()   asm volatile("lfence":::"memory")
-#define wmb()   asm volatile("sfence" ::: "memory")
+#define __mb()  asm volatile("mfence":::"memory")
+#define __rmb() asm volatile("lfence":::"memory")
+#define __wmb() asm volatile("sfence" ::: "memory")
 #endif
 
 /**
@@ -51,8 +51,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
 /* Prevent speculative execution past this barrier. */
 #define barrier_nospec() alternative("", "lfence", X86_FEATURE_LFENCE_RDTSC)
 
-#define dma_rmb()   barrier()
-#define dma_wmb()   barrier()
+#define __dma_rmb() barrier()
+#define __dma_wmb() barrier()
 
 #define __smp_mb()  asm volatile("lock; addl $0,-4(%%" _ASM_SP ")" ::: "memory", "cc")
 

arch/x86/include/asm/qspinlock.h

Lines changed: 1 addition & 0 deletions
@@ -53,6 +53,7 @@ static inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 
 static inline void queued_spin_unlock(struct qspinlock *lock)
 {
+	kcsan_release();
 	pv_queued_spin_unlock(lock);
 }
 
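
The kcsan_release() here is needed because pv_queued_spin_unlock() may resolve
to an uninstrumented paravirt or assembly implementation that KCSAN cannot see
into, so the release ordering of the unlock is annotated explicitly. A hedged
sketch of the same pattern for any out-of-line unlock path (my_arch_unlock and
__my_arch_unlock_raw are hypothetical names):

    #include <linux/kcsan-checks.h>

    /* Unlock whose final store happens in code KCSAN cannot instrument. */
    static inline void my_arch_unlock(int *lock)
    {
        kcsan_release();                /* model the release barrier for KCSAN */
        __my_arch_unlock_raw(lock);     /* assumed asm/out-of-line helper */
    }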

include/asm-generic/barrier.h

Lines changed: 40 additions & 14 deletions
@@ -14,12 +14,38 @@
 #ifndef __ASSEMBLY__
 
 #include <linux/compiler.h>
+#include <linux/kcsan-checks.h>
 #include <asm/rwonce.h>
 
 #ifndef nop
 #define nop()	asm volatile ("nop")
 #endif
 
+/*
+ * Architectures that want generic instrumentation can define __ prefixed
+ * variants of all barriers.
+ */
+
+#ifdef __mb
+#define mb()	do { kcsan_mb(); __mb(); } while (0)
+#endif
+
+#ifdef __rmb
+#define rmb()	do { kcsan_rmb(); __rmb(); } while (0)
+#endif
+
+#ifdef __wmb
+#define wmb()	do { kcsan_wmb(); __wmb(); } while (0)
+#endif
+
+#ifdef __dma_rmb
+#define dma_rmb()	do { kcsan_rmb(); __dma_rmb(); } while (0)
+#endif
+
+#ifdef __dma_wmb
+#define dma_wmb()	do { kcsan_wmb(); __dma_wmb(); } while (0)
+#endif
+
 /*
  * Force strict CPU ordering. And yes, this is required on UP too when we're
  * talking to devices.
@@ -62,15 +88,15 @@
 #ifdef CONFIG_SMP
 
 #ifndef smp_mb
-#define smp_mb()	__smp_mb()
+#define smp_mb()	do { kcsan_mb(); __smp_mb(); } while (0)
 #endif
 
 #ifndef smp_rmb
-#define smp_rmb()	__smp_rmb()
+#define smp_rmb()	do { kcsan_rmb(); __smp_rmb(); } while (0)
 #endif
 
 #ifndef smp_wmb
-#define smp_wmb()	__smp_wmb()
+#define smp_wmb()	do { kcsan_wmb(); __smp_wmb(); } while (0)
 #endif
 
 #else /* !CONFIG_SMP */
@@ -123,19 +149,19 @@ do { \
 #ifdef CONFIG_SMP
 
 #ifndef smp_store_mb
-#define smp_store_mb(var, value)	__smp_store_mb(var, value)
+#define smp_store_mb(var, value)	do { kcsan_mb(); __smp_store_mb(var, value); } while (0)
 #endif
 
 #ifndef smp_mb__before_atomic
-#define smp_mb__before_atomic()	__smp_mb__before_atomic()
+#define smp_mb__before_atomic()	do { kcsan_mb(); __smp_mb__before_atomic(); } while (0)
 #endif
 
 #ifndef smp_mb__after_atomic
-#define smp_mb__after_atomic()	__smp_mb__after_atomic()
+#define smp_mb__after_atomic()	do { kcsan_mb(); __smp_mb__after_atomic(); } while (0)
 #endif
 
 #ifndef smp_store_release
-#define smp_store_release(p, v)	__smp_store_release(p, v)
+#define smp_store_release(p, v)	do { kcsan_release(); __smp_store_release(p, v); } while (0)
 #endif
 
 #ifndef smp_load_acquire
@@ -178,13 +204,13 @@ do { \
 #endif /* CONFIG_SMP */
 
 /* Barriers for virtual machine guests when talking to an SMP host */
-#define virt_mb() __smp_mb()
-#define virt_rmb() __smp_rmb()
-#define virt_wmb() __smp_wmb()
-#define virt_store_mb(var, value) __smp_store_mb(var, value)
-#define virt_mb__before_atomic() __smp_mb__before_atomic()
-#define virt_mb__after_atomic() __smp_mb__after_atomic()
-#define virt_store_release(p, v) __smp_store_release(p, v)
+#define virt_mb() do { kcsan_mb(); __smp_mb(); } while (0)
+#define virt_rmb() do { kcsan_rmb(); __smp_rmb(); } while (0)
+#define virt_wmb() do { kcsan_wmb(); __smp_wmb(); } while (0)
+#define virt_store_mb(var, value) do { kcsan_mb(); __smp_store_mb(var, value); } while (0)
+#define virt_mb__before_atomic() do { kcsan_mb(); __smp_mb__before_atomic(); } while (0)
+#define virt_mb__after_atomic() do { kcsan_mb(); __smp_mb__after_atomic(); } while (0)
+#define virt_store_release(p, v) do { kcsan_release(); __smp_store_release(p, v); } while (0)
 #define virt_load_acquire(p) __smp_load_acquire(p)
 
 /**
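
Taken together with the x86 change above, the pattern is: an architecture
renames its primitive barriers to the __-prefixed form, and this generic
header wraps them with the matching kcsan_*() annotation. A sketch for a
hypothetical architecture (the "fence" instruction and the "myarch" path are
illustrative):

    /* arch/myarch/include/asm/barrier.h -- hypothetical */
    #define __mb()  asm volatile("fence" ::: "memory")

    #include <asm-generic/barrier.h>

    /*
     * The generic header now defines:
     *     #define mb() do { kcsan_mb(); __mb(); } while (0)
     * so every mb() emits the fence and also informs KCSAN's weak
     * memory modeling that a full barrier occurred.
     */

Architectures that do not define the __-prefixed variants keep their existing
uninstrumented definitions, so the instrumentation is strictly opt-in.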

include/asm-generic/bitops/instrumented-atomic.h

Lines changed: 3 additions & 0 deletions
@@ -67,6 +67,7 @@ static inline void change_bit(long nr, volatile unsigned long *addr)
  */
 static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
 {
+	kcsan_mb();
 	instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
 	return arch_test_and_set_bit(nr, addr);
 }
@@ -80,6 +81,7 @@ static inline bool test_and_set_bit(long nr, volatile unsigned long *addr)
  */
 static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
+	kcsan_mb();
 	instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
 	return arch_test_and_clear_bit(nr, addr);
 }
@@ -93,6 +95,7 @@ static inline bool test_and_clear_bit(long nr, volatile unsigned long *addr)
  */
 static inline bool test_and_change_bit(long nr, volatile unsigned long *addr)
 {
+	kcsan_mb();
 	instrument_atomic_read_write(addr + BIT_WORD(nr), sizeof(long));
 	return arch_test_and_change_bit(nr, addr);
 }
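
The kcsan_mb() calls reflect that value-returning test-and-modify bitops are
documented to imply a full memory barrier, which callers rely on for ordering.
A small usage sketch (busy_flags, claimed_data, and try_claim are illustrative
names):

    #include <linux/bitops.h>

    static unsigned long busy_flags;
    static int claimed_data;        /* written only while bit 0 is held */

    static bool try_claim(void)
    {
        if (test_and_set_bit(0, &busy_flags))
            return false;           /* already claimed by someone else */
        /*
         * The implied full barrier (now modeled via kcsan_mb()) orders
         * this write after the successful bitop.
         */
        claimed_data++;
        return true;
    }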

include/asm-generic/bitops/instrumented-lock.h

Lines changed: 3 additions & 0 deletions
@@ -22,6 +22,7 @@
  */
 static inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
 {
+	kcsan_release();
 	instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
 	arch_clear_bit_unlock(nr, addr);
 }
@@ -37,6 +38,7 @@ static inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
  */
 static inline void __clear_bit_unlock(long nr, volatile unsigned long *addr)
 {
+	kcsan_release();
 	instrument_write(addr + BIT_WORD(nr), sizeof(long));
 	arch___clear_bit_unlock(nr, addr);
 }
@@ -71,6 +73,7 @@ static inline bool test_and_set_bit_lock(long nr, volatile unsigned long *addr)
 static inline bool
 clear_bit_unlock_is_negative_byte(long nr, volatile unsigned long *addr)
 {
+	kcsan_release();
 	instrument_atomic_write(addr + BIT_WORD(nr), sizeof(long));
 	return arch_clear_bit_unlock_is_negative_byte(nr, addr);
 }
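
Here the annotation is kcsan_release() rather than kcsan_mb(), because the
*_unlock bitops have release rather than full-barrier semantics. A sketch of
the bit-lock pattern these functions serve (lock_word, counter, and
bitlock_inc are illustrative names):

    #include <linux/bitops.h>
    #include <asm/processor.h>      /* cpu_relax() */

    static unsigned long lock_word;
    static int counter;             /* protected by bit 0 of lock_word */

    static void bitlock_inc(void)
    {
        while (test_and_set_bit_lock(0, &lock_word))
            cpu_relax();            /* spin until the bit lock is acquired */
        counter++;                  /* critical section */
        /* Release; now modeled for KCSAN via kcsan_release(). */
        clear_bit_unlock(0, &lock_word);
    }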
