Commit 6b63f90

Merge tag 'cgroup-for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fix from Tejun Heo:

 - Fix a race condition in css_rstat_updated() where CMPXCHG without
   LOCK prefix could cause lnode corruption when the flusher runs
   concurrently on another CPU. The issue was introduced in 6.17 and
   causes memcg stats to become corrupted in production.

* tag 'cgroup-for-6.19-rc1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: rstat: use LOCK CMPXCHG in css_rstat_updated
2 parents 8f0b4cc + 3309b63 commit 6b63f90

1 file changed

Lines changed: 8 additions & 5 deletions

kernel/cgroup/rstat.c

@@ -71,7 +71,6 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
 {
 	struct llist_head *lhead;
 	struct css_rstat_cpu *rstatc;
-	struct css_rstat_cpu __percpu *rstatc_pcpu;
 	struct llist_node *self;
 
 	/*
@@ -104,18 +103,22 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
 	/*
 	 * This function can be renentered by irqs and nmis for the same cgroup
 	 * and may try to insert the same per-cpu lnode into the llist. Note
-	 * that llist_add() does not protect against such scenarios.
+	 * that llist_add() does not protect against such scenarios. In addition
+	 * this same per-cpu lnode can be modified through init_llist_node()
+	 * from css_rstat_flush() running on a different CPU.
 	 *
 	 * To protect against such stacked contexts of irqs/nmis, we use the
 	 * fact that lnode points to itself when not on a list and then use
-	 * this_cpu_cmpxchg() to atomically set to NULL to select the winner
+	 * try_cmpxchg() to atomically set to NULL to select the winner
 	 * which will call llist_add(). The losers can assume the insertion is
 	 * successful and the winner will eventually add the per-cpu lnode to
 	 * the llist.
+	 *
+	 * Please note that we can not use this_cpu_cmpxchg() here as on some
+	 * archs it is not safe against modifications from multiple CPUs.
 	 */
 	self = &rstatc->lnode;
-	rstatc_pcpu = css->rstat_cpu;
-	if (this_cpu_cmpxchg(rstatc_pcpu->lnode.next, self, NULL) != self)
+	if (!try_cmpxchg(&rstatc->lnode.next, &self, NULL))
 		return;
 
 	lhead = ss_lhead_cpu(css->ss, cpu);
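
The winner-selection idiom described in the comment above can be sketched in userspace C. This is only an illustration under stated assumptions: C11 `atomic_compare_exchange_strong()` stands in for the kernel's `try_cmpxchg()`, and `struct node`, `node_init()`, `node_claim()` and `demo()` are hypothetical names that do not appear in kernel/cgroup/rstat.c. The demo is single-threaded, so it shows the self-pointer protocol but does not reproduce the cross-CPU race itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct llist_node. */
struct node {
	_Atomic(struct node *) next;
};

/* A node that is not on any list points to itself. */
static void node_init(struct node *n)
{
	atomic_store(&n->next, n);
}

/*
 * Try to claim the node for insertion: atomically swap the self pointer
 * for NULL. Exactly one caller observes the self pointer and wins; every
 * other caller sees a different value and loses. The compare-and-swap is
 * a full atomic operation, so it stays correct even if another thread
 * re-initializes the node concurrently.
 */
static bool node_claim(struct node *n)
{
	struct node *expected = n;

	return atomic_compare_exchange_strong(&n->next, &expected, NULL);
}

/* Usage: only the first of several back-to-back attempts wins. */
static void demo(void)
{
	struct node n;

	node_init(&n);
	assert(node_claim(&n));   /* winner: would go on to add the node */
	assert(!node_claim(&n));  /* loser: insertion is already pending */

	node_init(&n);            /* a flusher takes the node off the list */
	assert(node_claim(&n));   /* the node can be claimed again */
}
```

The point of the sketch is that the reset in `node_init()` and the claim in `node_claim()` may run on different CPUs, which is why the claim has to be a fully atomic compare-and-swap and not a CPU-local one.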
