Skip to content

Commit ddfe129

Browse files
fbqpaulmckrcu
authored andcommitted
tools/memory-model: Provide extra ordering for unlock+lock pair on the same CPU
A recent discussion[1] shows that we are in favor of strengthening the ordering of unlock + lock on the same CPU: a unlock and a po-after lock should provide the so-called RCtso ordering, that is a memory access S po-before the unlock should be ordered against a memory access R po-after the lock, unless S is a store and R is a load. The strengthening meets programmers' expection that "sequence of two locked regions to be ordered wrt each other" (from Linus), and can reduce the mental burden when using locks. Therefore add it in LKMM. [1]: https://lore.kernel.org/lkml/20210909185937.GA12379@rowland.harvard.edu/ Co-developed-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc) Acked-by: Palmer Dabbelt <palmerdabbelt@google.com> (RISC-V) Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
1 parent fa55b7d commit ddfe129

2 files changed

Lines changed: 28 additions & 22 deletions

File tree

tools/memory-model/Documentation/explanation.txt

Lines changed: 25 additions & 19 deletions
Original file line numberDiff line numberDiff line change
@@ -1813,15 +1813,16 @@ spin_trylock() -- we can call these things lock-releases and
18131813
lock-acquires -- have two properties beyond those of ordinary releases
18141814
and acquires.
18151815

1816-
First, when a lock-acquire reads from a lock-release, the LKMM
1817-
requires that every instruction po-before the lock-release must
1818-
execute before any instruction po-after the lock-acquire. This would
1819-
naturally hold if the release and acquire operations were on different
1820-
CPUs, but the LKMM says it holds even when they are on the same CPU.
1821-
For example:
1816+
First, when a lock-acquire reads from or is po-after a lock-release,
1817+
the LKMM requires that every instruction po-before the lock-release
1818+
must execute before any instruction po-after the lock-acquire. This
1819+
would naturally hold if the release and acquire operations were on
1820+
different CPUs and accessed the same lock variable, but the LKMM says
1821+
it also holds when they are on the same CPU, even if they access
1822+
different lock variables. For example:
18221823

18231824
int x, y;
1824-
spinlock_t s;
1825+
spinlock_t s, t;
18251826

18261827
P0()
18271828
{
@@ -1830,9 +1831,9 @@ For example:
18301831
spin_lock(&s);
18311832
r1 = READ_ONCE(x);
18321833
spin_unlock(&s);
1833-
spin_lock(&s);
1834+
spin_lock(&t);
18341835
r2 = READ_ONCE(y);
1835-
spin_unlock(&s);
1836+
spin_unlock(&t);
18361837
}
18371838

18381839
P1()
@@ -1842,10 +1843,10 @@ For example:
18421843
WRITE_ONCE(x, 1);
18431844
}
18441845

1845-
Here the second spin_lock() reads from the first spin_unlock(), and
1846-
therefore the load of x must execute before the load of y. Thus we
1847-
cannot have r1 = 1 and r2 = 0 at the end (this is an instance of the
1848-
MP pattern).
1846+
Here the second spin_lock() is po-after the first spin_unlock(), and
1847+
therefore the load of x must execute before the load of y, even though
1848+
the two locking operations use different locks. Thus we cannot have
1849+
r1 = 1 and r2 = 0 at the end (this is an instance of the MP pattern).
18491850

18501851
This requirement does not apply to ordinary release and acquire
18511852
fences, only to lock-related operations. For instance, suppose P0()
@@ -1872,13 +1873,13 @@ instructions in the following order:
18721873

18731874
and thus it could load y before x, obtaining r2 = 0 and r1 = 1.
18741875

1875-
Second, when a lock-acquire reads from a lock-release, and some other
1876-
stores W and W' occur po-before the lock-release and po-after the
1877-
lock-acquire respectively, the LKMM requires that W must propagate to
1878-
each CPU before W' does. For example, consider:
1876+
Second, when a lock-acquire reads from or is po-after a lock-release,
1877+
and some other stores W and W' occur po-before the lock-release and
1878+
po-after the lock-acquire respectively, the LKMM requires that W must
1879+
propagate to each CPU before W' does. For example, consider:
18791880

18801881
int x, y;
1881-
spinlock_t x;
1882+
spinlock_t s;
18821883

18831884
P0()
18841885
{
@@ -1908,7 +1909,12 @@ each CPU before W' does. For example, consider:
19081909

19091910
If r1 = 1 at the end then the spin_lock() in P1 must have read from
19101911
the spin_unlock() in P0. Hence the store to x must propagate to P2
1911-
before the store to y does, so we cannot have r2 = 1 and r3 = 0.
1912+
before the store to y does, so we cannot have r2 = 1 and r3 = 0. But
1913+
if P1 had used a lock variable different from s, the writes could have
1914+
propagated in either order. (On the other hand, if the code in P0 and
1915+
P1 had all executed on a single CPU, as in the example before this
1916+
one, then the writes would have propagated in order even if the two
1917+
critical sections used different lock variables.)
19121918

19131919
These two special requirements for lock-release and lock-acquire do
19141920
not arise from the operational model. Nevertheless, kernel developers

tools/memory-model/linux-kernel.cat

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ include "lock.cat"
2727
(* Release Acquire *)
2828
let acq-po = [Acquire] ; po ; [M]
2929
let po-rel = [M] ; po ; [Release]
30-
let po-unlock-rf-lock-po = po ; [UL] ; rf ; [LKR] ; po
30+
let po-unlock-lock-po = po ; [UL] ; (po|rf) ; [LKR] ; po
3131

3232
(* Fences *)
3333
let R4rmb = R \ Noreturn (* Reads for which rmb works *)
@@ -70,12 +70,12 @@ let rwdep = (dep | ctrl) ; [W]
7070
let overwrite = co | fr
7171
let to-w = rwdep | (overwrite & int) | (addr ; [Plain] ; wmb)
7272
let to-r = addr | (dep ; [Marked] ; rfi)
73-
let ppo = to-r | to-w | fence | (po-unlock-rf-lock-po & int)
73+
let ppo = to-r | to-w | fence | (po-unlock-lock-po & int)
7474

7575
(* Propagation: Ordering from release operations and strong fences. *)
7676
let A-cumul(r) = (rfe ; [Marked])? ; r
7777
let cumul-fence = [Marked] ; (A-cumul(strong-fence | po-rel) | wmb |
78-
po-unlock-rf-lock-po) ; [Marked]
78+
po-unlock-lock-po) ; [Marked]
7979
let prop = [Marked] ; (overwrite & ext)? ; cumul-fence* ;
8080
[Marked] ; rfe? ; [Marked]
8181

0 commit comments

Comments
 (0)