Commit 4138787

swahlhpe authored and KAGA-KOKO committed
tick/sched: Limit non-timekeeper CPUs calling jiffies update
On large NUMA systems, while running a test program that saturates the inter-processor and inter-NUMA links, acquiring the jiffies_lock can be very expensive. If the CPU designated to do jiffies updates (tick_do_timer_cpu) gets delayed and other CPUs decide to do the jiffies update themselves, a large number of them decide to do so at the same time. The inexpensive check against tick_next_period is far quicker than actually acquiring the lock, so most of these get in line to obtain the lock. If obtaining the lock is slow enough, this spirals into the vast majority of CPUs continuously being stuck waiting for this lock, just to obtain it and find out that time has already been updated by another CPU. For example, on one random entry to kdb by manually-injected NMI, 2912 of 3840 CPUs were observed to be stuck there.

To avoid this, allow only one non-timekeeper CPU to call tick_do_update_jiffies64() at any given time, resetting ts->stalled_jiffies only if the jiffies update function is actually called.

With this change, when the test is manually interrupted, at most two CPUs are observed to invoke tick_do_update_jiffies64(): the timekeeper and one other.

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Link: https://patch.msgid.link/20251027183456.343407-1-steve.wahl@hpe.com
1 parent 391253b commit 4138787

1 file changed

Lines changed: 26 additions & 4 deletions


kernel/time/tick-sched.c

@@ -201,6 +201,27 @@ static inline void tick_sched_flag_clear(struct tick_sched *ts,
 	ts->flags &= ~flag;
 }
 
+/*
+ * Allow only one non-timekeeper CPU at a time update jiffies from
+ * the timer tick.
+ *
+ * Returns true if update was run.
+ */
+static bool tick_limited_update_jiffies64(struct tick_sched *ts, ktime_t now)
+{
+	static atomic_t in_progress;
+	int inp;
+
+	inp = atomic_read(&in_progress);
+	if (inp || !atomic_try_cmpxchg(&in_progress, &inp, 1))
+		return false;
+
+	if (ts->last_tick_jiffies == jiffies)
+		tick_do_update_jiffies64(now);
+	atomic_set(&in_progress, 0);
+	return true;
+}
+
 #define MAX_STALLED_JIFFIES 5
 
 static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
@@ -239,10 +260,11 @@ static void tick_sched_do_timer(struct tick_sched *ts, ktime_t now)
 		ts->stalled_jiffies = 0;
 		ts->last_tick_jiffies = READ_ONCE(jiffies);
 	} else {
-		if (++ts->stalled_jiffies == MAX_STALLED_JIFFIES) {
-			tick_do_update_jiffies64(now);
-			ts->stalled_jiffies = 0;
-			ts->last_tick_jiffies = READ_ONCE(jiffies);
+		if (++ts->stalled_jiffies >= MAX_STALLED_JIFFIES) {
+			if (tick_limited_update_jiffies64(ts, now)) {
+				ts->stalled_jiffies = 0;
+				ts->last_tick_jiffies = READ_ONCE(jiffies);
+			}
 		}
 	}
 
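The gating pattern used by tick_limited_update_jiffies64() can be tried outside the kernel. The following is a minimal user-space sketch, assuming C11 <stdatomic.h> in place of the kernel's atomic_t API. The names expensive_update(), limited_update(), stalled_tick() and update_calls are hypothetical and exist only for illustration; they are not kernel symbols, and the sketch omits the ts->last_tick_jiffies check.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_STALLED 5	/* mirrors MAX_STALLED_JIFFIES in the diff */

static atomic_int in_progress;	/* 0 = gate free, 1 = one updater is inside */
static int update_calls;	/* counts runs of the expensive update */

/* Hypothetical stand-in for tick_do_update_jiffies64(). */
static void expensive_update(void)
{
	update_calls++;
}

/*
 * Try to pass the gate exactly once. A caller that finds the gate
 * taken returns false immediately instead of queueing on a lock.
 */
static bool limited_update(void)
{
	int expected = atomic_load(&in_progress);

	/* Cheap read first, then a single compare-and-swap attempt. */
	if (expected || !atomic_compare_exchange_strong(&in_progress, &expected, 1))
		return false;

	expensive_update();
	atomic_store(&in_progress, 0);
	return true;
}

/*
 * One stalled tick on a non-timekeeper CPU. *stalled models
 * ts->stalled_jiffies and is reset only when the gate was passed.
 */
static bool stalled_tick(int *stalled)
{
	if (++*stalled >= MAX_STALLED) {
		if (limited_update()) {
			*stalled = 0;
			return true;
		}
	}
	return false;
}
```

Because a caller that loses the race keeps its counter, the counter can move past the threshold. This is presumably why the patch changes the comparison from == to >=: with ==, a CPU that was turned away once would not retry on its following ticks.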