Commit d8fd74d

Marco Elver authored and Paul E. McKenney committed
kcsan: permissive: Ignore data-racy 1-bit value changes
Add rules to ignore data-racy reads with only 1-bit value changes. Details about the rules are captured in comments in kernel/kcsan/permissive.h. More background follows.

While investigating a number of data races, we found data-racy accesses on flags variables to be very common. The typical pattern is a reader masking all but one bit, and/or the writer setting/clearing only 1 bit (current->flags being a frequently encountered case; more examples in mm/sl[au]b.c, which disable KCSAN for this reason).

Since these types of data-racy accesses are common (with the assumption they are intentional and hard to miscompile), having the option (with CONFIG_KCSAN_PERMISSIVE=y) to filter them avoids forcing everyone to mark them; enabling the option is deliberately left to preference at this time.

One important motivation for having this option built-in is to move closer to being able to enable KCSAN on CI systems, or for testers wishing to test the whole kernel, while more easily filtering less interesting data races with higher probability.

For the implementation, we considered several alternatives, but had one major requirement: that the rules be kept together with the Linux-kernel tree. Adding them to the compiler would preclude us from making changes quickly; if the rules require tweaks, having them part of the compiler means waiting another ~1 year for the next release -- that's not realistic. We are left with the following options:

1. Maintain compiler plugins as part of the kernel tree that remove instrumentation for some accesses (e.g. plain-& with 1-bit mask). The analysis would be reader-side focused, as no assumption can be made about racing writers. Because it seems unrealistic to maintain 2 plugins, one for LLVM and one for GCC, we would likely pick LLVM. Furthermore, no kernel infrastructure exists to maintain LLVM plugins, and the build-system implications and maintenance overheads do not look great (historically, plugins written against old LLVM APIs are not guaranteed to work with newer LLVM APIs).

2. Find a set of rules that can be expressed in terms of observed value changes, and make them part of the KCSAN runtime. The analysis is writer-side focused, given we rely on observed value changes.

The approach taken here is (2). While a complete approach requires both (1) and (2), experiments show that the majority of data races involving trivial bit operations on flags variables can be removed with (2) alone.

It goes without saying that filtering data races using (1) or (2) does _not_ guarantee they are safe! Therefore, limiting ourselves to (2) for now is the conservative choice for setups that wish to enable CONFIG_KCSAN_PERMISSIVE=y.

Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
1 parent 9c827cd commit d8fd74d

2 files changed

Lines changed: 80 additions & 1 deletion

kernel/kcsan/kcsan_test.c

Lines changed: 32 additions & 0 deletions
@@ -414,6 +414,14 @@ static noinline void test_kernel_atomic_builtins(void)
 	__atomic_load_n(&test_var, __ATOMIC_RELAXED);
 }
 
+static noinline void test_kernel_xor_1bit(void)
+{
+	/* Do not report data races between the read-writes. */
+	kcsan_nestable_atomic_begin();
+	test_var ^= 0x10000;
+	kcsan_nestable_atomic_end();
+}
+
 /* ===== Test cases ===== */
 
 /* Simple test with normal data race. */
@@ -952,6 +960,29 @@ static void test_atomic_builtins(struct kunit *test)
 	KUNIT_EXPECT_FALSE(test, match_never);
 }
 
+__no_kcsan
+static void test_1bit_value_change(struct kunit *test)
+{
+	const struct expect_report expect = {
+		.access = {
+			{ test_kernel_read, &test_var, sizeof(test_var), 0 },
+			{ test_kernel_xor_1bit, &test_var, sizeof(test_var), __KCSAN_ACCESS_RW(KCSAN_ACCESS_WRITE) },
+		},
+	};
+	bool match = false;
+
+	begin_test_checks(test_kernel_read, test_kernel_xor_1bit);
+	do {
+		match = IS_ENABLED(CONFIG_KCSAN_PERMISSIVE)
+				? report_available()
+				: report_matches(&expect);
+	} while (!end_test_checks(match));
+	if (IS_ENABLED(CONFIG_KCSAN_PERMISSIVE))
+		KUNIT_EXPECT_FALSE(test, match);
+	else
+		KUNIT_EXPECT_TRUE(test, match);
+}
+
 /*
  * Generate thread counts for all test cases. Values generated are in interval
  * [2, 5] followed by exponentially increasing thread counts from 8 to 32.
@@ -1024,6 +1055,7 @@ static struct kunit_case kcsan_test_cases[] = {
 	KCSAN_KUNIT_CASE(test_jiffies_noreport),
 	KCSAN_KUNIT_CASE(test_seqlock_noreport),
 	KCSAN_KUNIT_CASE(test_atomic_builtins),
+	KCSAN_KUNIT_CASE(test_1bit_value_change),
 	{},
 };

kernel/kcsan/permissive.h

Lines changed: 48 additions & 1 deletion
@@ -12,6 +12,8 @@
 #ifndef _KERNEL_KCSAN_PERMISSIVE_H
 #define _KERNEL_KCSAN_PERMISSIVE_H
 
+#include <linux/bitops.h>
+#include <linux/sched.h>
 #include <linux/types.h>
 
 /*
@@ -22,7 +24,11 @@ static __always_inline bool kcsan_ignore_address(const volatile void *ptr)
 	if (!IS_ENABLED(CONFIG_KCSAN_PERMISSIVE))
 		return false;
 
-	return false;
+	/*
+	 * Data-racy bitops on current->flags are too common, ignore completely
+	 * for now.
+	 */
+	return ptr == &current->flags;
 }
 
 /*
@@ -41,6 +47,47 @@ kcsan_ignore_data_race(size_t size, int type, u64 old, u64 new, u64 diff)
 	if (type || size > sizeof(long))
 		return false;
 
+	/*
+	 * A common pattern is checking/setting just 1 bit in a variable; for
+	 * example:
+	 *
+	 *	if (flags & SOME_FLAG) { ... }
+	 *
+	 * and elsewhere flags is updated concurrently:
+	 *
+	 *	flags |= SOME_OTHER_FLAG; // just 1 bit
+	 *
+	 * While it is still recommended that such accesses be marked
+	 * appropriately, in many cases these types of data races are so common
+	 * that marking them all is often unrealistic and left to maintainer
+	 * preference.
+	 *
+	 * The assumption in all cases is that with all known compiler
+	 * optimizations (including those that tear accesses), because no more
+	 * than 1 bit changed, the plain accesses are safe despite the presence
+	 * of data races.
+	 *
+	 * The rules here will ignore the data races if we observe no more than
+	 * 1 bit changed.
+	 *
+	 * Of course many operations can effectively change just 1 bit, but the
+	 * general assumption that data races involving 1-bit changes can be
+	 * tolerated still applies.
+	 *
+	 * And in case a true bug is missed, the bug likely manifests as a
+	 * reportable data race elsewhere.
+	 */
+	if (hweight64(diff) == 1) {
+		/*
+		 * Exception: Report data races where the values look like
+		 * ordinary booleans (one of them was 0 and the 0th bit was
+		 * changed). More often than not, they come with interesting
+		 * memory ordering requirements, so let's report them.
+		 */
+		if (!((!old || !new) && diff == 1))
+			return true;
+	}
+
 	return false;
 }
