Commit e2c73a6

rcu: Remove the RCU_FAST_NO_HZ Kconfig option
All of the uses of CONFIG_RCU_FAST_NO_HZ=y that I have seen involve systems with RCU callbacks offloaded. In this situation, all that this Kconfig option does is slow down idle entry/exit with an additional always-taken early exit. If this is the only use case, then this Kconfig option is nothing but an attractive nuisance that needs to go away. This commit therefore removes the RCU_FAST_NO_HZ Kconfig option.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
1 parent 24eab6e commit e2c73a6

11 files changed

Lines changed: 7 additions & 269 deletions

Documentation/RCU/stallwarn.rst

Lines changed: 0 additions & 11 deletions
@@ -254,17 +254,6 @@ period (in this case 2603), the grace-period sequence number (7075), and
 an estimate of the total number of RCU callbacks queued across all CPUs
 (625 in this case).
 
-In kernels with CONFIG_RCU_FAST_NO_HZ, more information is printed
-for each CPU::
-
-  0: (64628 ticks this GP) idle=dd5/3fffffffffffffff/0 softirq=82/543 last_accelerate: a345/d342 dyntick_enabled: 1
-
-The "last_accelerate:" prints the low-order 16 bits (in hex) of the
-jiffies counter when this CPU last invoked rcu_try_advance_all_cbs()
-from rcu_needs_cpu() or last invoked rcu_accelerate_cbs() from
-rcu_prepare_for_idle(). "dyntick_enabled: 1" indicates that dyntick-idle
-processing is enabled.
-
 If the grace period ends just as the stall warning starts printing,
 there will be a spurious stall-warning message, which will include
 the following::

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 0 additions & 4 deletions
@@ -4489,10 +4489,6 @@
 			on rcutree.qhimark at boot time and to zero to
 			disable more aggressive help enlistment.
 
-	rcutree.rcu_idle_gp_delay= [KNL]
-			Set wakeup interval for idle CPUs that have
-			RCU callbacks (RCU_FAST_NO_HZ=y).
-
 	rcutree.rcu_kick_kthreads= [KNL]
 			Cause the grace-period kthread to get an extra
 			wake_up() if it sleeps three times longer than

Documentation/timers/no_hz.rst

Lines changed: 3 additions & 7 deletions
@@ -184,16 +184,12 @@ There are situations in which idle CPUs cannot be permitted to
 enter either dyntick-idle mode or adaptive-tick mode, the most
 common being when that CPU has RCU callbacks pending.
 
-The CONFIG_RCU_FAST_NO_HZ=y Kconfig option may be used to cause such CPUs
-to enter dyntick-idle mode or adaptive-tick mode anyway. In this case,
-a timer will awaken these CPUs every four jiffies in order to ensure
-that the RCU callbacks are processed in a timely fashion.
-
-Another approach is to offload RCU callback processing to "rcuo" kthreads
+Avoid this by offloading RCU callback processing to "rcuo" kthreads
 using the CONFIG_RCU_NOCB_CPU=y Kconfig option. The specific CPUs to
 offload may be selected using The "rcu_nocbs=" kernel boot parameter,
 which takes a comma-separated list of CPUs and CPU ranges, for example,
-"1,3-5" selects CPUs 1, 3, 4, and 5.
+"1,3-5" selects CPUs 1, 3, 4, and 5. Note that CPUs specified by
+the "nohz_full" kernel boot parameter are also offloaded.
 
 The offloaded CPUs will never queue RCU callbacks, and therefore RCU
 never prevents offloaded CPUs from entering either dyntick-idle mode

kernel/rcu/Kconfig

Lines changed: 0 additions & 18 deletions
@@ -169,24 +169,6 @@ config RCU_FANOUT_LEAF
 
 	  Take the default if unsure.
 
-config RCU_FAST_NO_HZ
-	bool "Accelerate last non-dyntick-idle CPU's grace periods"
-	depends on NO_HZ_COMMON && SMP && RCU_EXPERT
-	default n
-	help
-	  This option permits CPUs to enter dynticks-idle state even if
-	  they have RCU callbacks queued, and prevents RCU from waking
-	  these CPUs up more than roughly once every four jiffies (by
-	  default, you can adjust this using the rcutree.rcu_idle_gp_delay
-	  parameter), thus improving energy efficiency. On the other
-	  hand, this option increases the duration of RCU grace periods,
-	  for example, slowing down synchronize_rcu().
-
-	  Say Y if energy efficiency is critically important, and you
-	  don't care about increased grace-period durations.
-
-	  Say N if you are unsure.
-
 config RCU_BOOST
 	bool "Enable RCU priority boosting"
 	depends on (RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT) || PREEMPT_RT

kernel/rcu/tree.c

Lines changed: 0 additions & 11 deletions
@@ -624,7 +624,6 @@ static noinstr void rcu_eqs_enter(bool user)
 	instrumentation_begin();
 	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
-	rcu_prepare_for_idle();
 	rcu_preempt_deferred_qs(current);
 
 	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
@@ -768,9 +767,6 @@ noinstr void rcu_nmi_exit(void)
 	trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
 	WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
 
-	if (!in_nmi())
-		rcu_prepare_for_idle();
-
 	// instrumentation for the noinstr rcu_dynticks_eqs_enter()
 	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
 	instrumentation_end();
@@ -872,7 +868,6 @@ static void noinstr rcu_eqs_exit(bool user)
 	// instrumentation for the noinstr rcu_dynticks_eqs_exit()
 	instrument_atomic_write(&rdp->dynticks, sizeof(rdp->dynticks));
 
-	rcu_cleanup_after_idle();
 	trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&rdp->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	WRITE_ONCE(rdp->dynticks_nesting, 1);
@@ -1014,12 +1009,6 @@ noinstr void rcu_nmi_enter(void)
 		rcu_dynticks_eqs_exit();
 		// ... but is watching here.
 
-		if (!in_nmi()) {
-			instrumentation_begin();
-			rcu_cleanup_after_idle();
-			instrumentation_end();
-		}
-
 		instrumentation_begin();
 		// instrumentation for the noinstr rcu_dynticks_curr_cpu_in_eqs()
 		instrument_atomic_read(&rdp->dynticks, sizeof(rdp->dynticks));

kernel/rcu/tree.h

Lines changed: 0 additions & 7 deletions
@@ -189,11 +189,6 @@ struct rcu_data {
 	bool rcu_urgent_qs;		/* GP old need light quiescent state. */
 	bool rcu_forced_tick;		/* Forced tick to provide QS. */
 	bool rcu_forced_tick_exp;	/* ... provide QS to expedited GP. */
-#ifdef CONFIG_RCU_FAST_NO_HZ
-	unsigned long last_accelerate;	/* Last jiffy CBs were accelerated. */
-	unsigned long last_advance_all;	/* Last jiffy CBs were all advanced. */
-	int tick_nohz_enabled_snap;	/* Previously seen value from sysfs. */
-#endif /* #ifdef CONFIG_RCU_FAST_NO_HZ */
 
 	/* 4) rcu_barrier(), OOM callbacks, and expediting. */
 	struct rcu_head barrier_head;
@@ -419,8 +414,6 @@ static bool rcu_is_callbacks_kthread(void);
 static void rcu_cpu_kthread_setup(unsigned int cpu);
 static void rcu_spawn_one_boost_kthread(struct rcu_node *rnp);
 static void __init rcu_spawn_boost_kthreads(void);
-static void rcu_cleanup_after_idle(void);
-static void rcu_prepare_for_idle(void);
 static bool rcu_preempt_has_tasks(struct rcu_node *rnp);
 static bool rcu_preempt_need_deferred_qs(struct task_struct *t);
 static void rcu_preempt_deferred_qs(struct task_struct *t);

kernel/rcu/tree_plugin.h

Lines changed: 2 additions & 183 deletions
@@ -51,8 +51,6 @@ static void __init rcu_bootup_announce_oddness(void)
 			RCU_FANOUT);
 	if (rcu_fanout_exact)
 		pr_info("\tHierarchical RCU autobalancing is disabled.\n");
-	if (IS_ENABLED(CONFIG_RCU_FAST_NO_HZ))
-		pr_info("\tRCU dyntick-idle grace-period acceleration is enabled.\n");
 	if (IS_ENABLED(CONFIG_PROVE_RCU))
 		pr_info("\tRCU lockdep checking is enabled.\n");
 	if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD))
@@ -1253,16 +1251,14 @@ static void __init rcu_spawn_boost_kthreads(void)
 
 #endif /* #else #ifdef CONFIG_RCU_BOOST */
 
-#if !defined(CONFIG_RCU_FAST_NO_HZ)
-
 /*
  * Check to see if any future non-offloaded RCU-related work will need
  * to be done by the current CPU, even if none need be done immediately,
  * returning 1 if so. This function is part of the RCU implementation;
  * it is -not- an exported member of the RCU API.
  *
- * Because we not have RCU_FAST_NO_HZ, just check whether or not this
- * CPU has RCU callbacks queued.
+ * Just check whether or not this CPU has non-offloaded RCU callbacks
+ * queued.
  */
 int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 {
@@ -1271,183 +1267,6 @@ int rcu_needs_cpu(u64 basemono, u64 *nextevt)
 	       !rcu_rdp_is_offloaded(this_cpu_ptr(&rcu_data));
 }
 
-/*
- * Because we do not have RCU_FAST_NO_HZ, don't bother cleaning up
- * after it.
- */
-static void rcu_cleanup_after_idle(void)
-{
-}
-
-/*
- * Do the idle-entry grace-period work, which, because CONFIG_RCU_FAST_NO_HZ=n,
- * is nothing.
- */
-static void rcu_prepare_for_idle(void)
-{
-}
-
-#else /* #if !defined(CONFIG_RCU_FAST_NO_HZ) */
-
-/*
- * This code is invoked when a CPU goes idle, at which point we want
- * to have the CPU do everything required for RCU so that it can enter
- * the energy-efficient dyntick-idle mode.
- *
- * The following preprocessor symbol controls this:
- *
- * RCU_IDLE_GP_DELAY gives the number of jiffies that a CPU is permitted
- *	to sleep in dyntick-idle mode with RCU callbacks pending. This
- *	is sized to be roughly one RCU grace period. Those energy-efficiency
- *	benchmarkers who might otherwise be tempted to set this to a large
- *	number, be warned: Setting RCU_IDLE_GP_DELAY too high can hang your
- *	system. And if you are -that- concerned about energy efficiency,
- *	just power the system down and be done with it!
- *
- * The value below works well in practice. If future workloads require
- * adjustment, they can be converted into kernel config parameters, though
- * making the state machine smarter might be a better option.
- */
-#define RCU_IDLE_GP_DELAY 4		/* Roughly one grace period. */
-
-static int rcu_idle_gp_delay = RCU_IDLE_GP_DELAY;
-module_param(rcu_idle_gp_delay, int, 0644);
-
-/*
- * Try to advance callbacks on the current CPU, but only if it has been
- * awhile since the last time we did so. Afterwards, if there are any
- * callbacks ready for immediate invocation, return true.
- */
-static bool __maybe_unused rcu_try_advance_all_cbs(void)
-{
-	bool cbs_ready = false;
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
-	struct rcu_node *rnp;
-
-	/* Exit early if we advanced recently. */
-	if (jiffies == rdp->last_advance_all)
-		return false;
-	rdp->last_advance_all = jiffies;
-
-	rnp = rdp->mynode;
-
-	/*
-	 * Don't bother checking unless a grace period has
-	 * completed since we last checked and there are
-	 * callbacks not yet ready to invoke.
-	 */
-	if ((rcu_seq_completed_gp(rdp->gp_seq,
-				  rcu_seq_current(&rnp->gp_seq)) ||
-	     unlikely(READ_ONCE(rdp->gpwrap))) &&
-	    rcu_segcblist_pend_cbs(&rdp->cblist))
-		note_gp_changes(rdp);
-
-	if (rcu_segcblist_ready_cbs(&rdp->cblist))
-		cbs_ready = true;
-	return cbs_ready;
-}
-
-/*
- * Allow the CPU to enter dyntick-idle mode unless it has callbacks ready
- * to invoke. If the CPU has callbacks, try to advance them. Tell the
- * caller about what to set the timeout.
- *
- * The caller must have disabled interrupts.
- */
-int rcu_needs_cpu(u64 basemono, u64 *nextevt)
-{
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
-	unsigned long dj;
-
-	lockdep_assert_irqs_disabled();
-
-	/* If no non-offloaded callbacks, RCU doesn't need the CPU. */
-	if (rcu_segcblist_empty(&rdp->cblist) ||
-	    rcu_rdp_is_offloaded(rdp)) {
-		*nextevt = KTIME_MAX;
-		return 0;
-	}
-
-	/* Attempt to advance callbacks. */
-	if (rcu_try_advance_all_cbs()) {
-		/* Some ready to invoke, so initiate later invocation. */
-		invoke_rcu_core();
-		return 1;
-	}
-	rdp->last_accelerate = jiffies;
-
-	/* Request timer and round. */
-	dj = round_up(rcu_idle_gp_delay + jiffies, rcu_idle_gp_delay) - jiffies;
-
-	*nextevt = basemono + dj * TICK_NSEC;
-	return 0;
-}
-
-/*
- * Prepare a CPU for idle from an RCU perspective. The first major task is to
- * sense whether nohz mode has been enabled or disabled via sysfs. The second
- * major task is to accelerate (that is, assign grace-period numbers to) any
- * recently arrived callbacks.
- *
- * The caller must have disabled interrupts.
- */
-static void rcu_prepare_for_idle(void)
-{
-	bool needwake;
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
-	struct rcu_node *rnp;
-	int tne;
-
-	lockdep_assert_irqs_disabled();
-	if (rcu_rdp_is_offloaded(rdp))
-		return;
-
-	/* Handle nohz enablement switches conservatively. */
-	tne = READ_ONCE(tick_nohz_active);
-	if (tne != rdp->tick_nohz_enabled_snap) {
-		if (!rcu_segcblist_empty(&rdp->cblist))
-			invoke_rcu_core(); /* force nohz to see update. */
-		rdp->tick_nohz_enabled_snap = tne;
-		return;
-	}
-	if (!tne)
-		return;
-
-	/*
-	 * If we have not yet accelerated this jiffy, accelerate all
-	 * callbacks on this CPU.
-	 */
-	if (rdp->last_accelerate == jiffies)
-		return;
-	rdp->last_accelerate = jiffies;
-	if (rcu_segcblist_pend_cbs(&rdp->cblist)) {
-		rnp = rdp->mynode;
-		raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
-		needwake = rcu_accelerate_cbs(rnp, rdp);
-		raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
-		if (needwake)
-			rcu_gp_kthread_wake();
-	}
-}
-
-/*
- * Clean up for exit from idle. Attempt to advance callbacks based on
- * any grace periods that elapsed while the CPU was idle, and if any
- * callbacks are now ready to invoke, initiate invocation.
- */
-static void rcu_cleanup_after_idle(void)
-{
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
-
-	lockdep_assert_irqs_disabled();
-	if (rcu_rdp_is_offloaded(rdp))
-		return;
-	if (rcu_try_advance_all_cbs())
-		invoke_rcu_core();
-}
-
-#endif /* #else #if !defined(CONFIG_RCU_FAST_NO_HZ) */
-
 /*
  * Is this CPU a NO_HZ_FULL CPU that should ignore RCU so that the
  * grace-period kthread will do force_quiescent_state() processing?
