Commit c3c1e98

Merge tag 'pm-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull more power management updates from Rafael Wysocki:
 "These are mostly fixes on top of the power management updates merged
  recently in cpuidle governors, in the Intel RAPL power capping driver
  and in the wake IRQ management code:

   - Fix the handling of package-scope MSRs in the intel_rapl power
     capping driver when called from the PMU subsystem and make it add
     all package CPUs to the PMU cpumask to allow tools to read RAPL
     events from any CPU in the package (Kuppuswamy Satharayananyan)

   - Rework the invalid version check in the intel_rapl_tpmi power
     capping driver to account for the fact that on partitioned
     systems, multiple TPMI instances may exist per package, but RAPL
     registers are only valid on one instance (Kuppuswamy
     Satharayananyan)

   - Describe the new intel_idle.table command line option in the
     admin-guide intel_idle documentation (Artem Bityutskiy)

   - Fix a crash in the ladder cpuidle governor on systems with only
     one (polling) idle state available by making the cpuidle core
     bypass the governor in those cases and adjust the other existing
     governors to that change (Aboorva Devarajan, Christian Loehle)

   - Update kerneldoc comments for wake IRQ management functions that
     have not been matching the code (Wang Jiayue)"

* tag 'pm-7.0-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: menu: Remove single state handling
  cpuidle: teo: Remove single state handling
  cpuidle: haltpoll: Remove single state handling
  cpuidle: Skip governor when only one idle state is available
  powercap: intel_rapl_tpmi: Remove FW_BUG from invalid version check
  PM: sleep: wakeirq: Update outdated documentation comments
  Documentation: PM: Document intel_idle.table command line option
  powercap: intel_rapl: Expose all package CPUs in PMU cpumask
  powercap: intel_rapl: Remove incorrect CPU check in PMU context
2 parents 23b0f90 + becbdde commit c3c1e98

10 files changed

Lines changed: 45 additions & 36 deletions

File tree

Documentation/admin-guide/pm/intel_idle.rst

Lines changed: 11 additions & 0 deletions
@@ -260,6 +260,17 @@ mode to off when the CPU is in any one of the available idle states. This may
 help performance of a sibling CPU at the expense of a slightly higher wakeup
 latency for the idle CPU.

+The ``table`` argument allows customization of idle state latency and target
+residency. The syntax is a comma-separated list of ``name:latency:residency``
+entries, where ``name`` is the idle state name, ``latency`` is the exit latency
+in microseconds, and ``residency`` is the target residency in microseconds. It
+is not necessary to specify all idle states; only those to be customized. For
+example, ``C1:1:3,C6:50:100`` sets the exit latency and target residency for
+C1 and C6 to 1/3 and 50/100 microseconds, respectively. Remaining idle states
+keep their default values. The driver verifies that deeper idle states have
+higher latency and target residency than shallower ones. Also, target
+residency cannot be smaller than exit latency. If any of these conditions is
+not met, the driver ignores the entire ``table`` parameter.

 .. _intel-idle-core-and-package-idle-states:
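The validation rules described in the documentation hunk above can be illustrated with a small user-space sketch. This is not the intel_idle driver's parser: the struct, the function names, the fixed-size buffers and the default values used in any caller are hypothetical, and the sketch assumes that "higher" means strictly greater and that an unknown state name invalidates the whole argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_STATES 8	/* hypothetical upper bound for this sketch */

struct idle_state {
	char name[16];
	unsigned int latency_us;	/* exit latency */
	unsigned int residency_us;	/* target residency */
};

/* Check the documented rules on a full table ordered shallow to deep. */
static bool table_is_valid(const struct idle_state *s, int n)
{
	assert(n > 0);
	for (int i = 0; i < n; i++) {
		/* Target residency cannot be smaller than exit latency. */
		if (s[i].residency_us < s[i].latency_us)
			return false;
		/* Deeper states need higher latency and residency (assumed strict). */
		if (i > 0 && (s[i].latency_us <= s[i - 1].latency_us ||
			      s[i].residency_us <= s[i - 1].residency_us))
			return false;
	}
	return true;
}

/*
 * Apply "name:latency:residency,..." on top of the defaults in @s.
 * On any parse or validation error, @s is left untouched and false is
 * returned, mirroring "ignores the entire table parameter".
 */
static bool apply_table(struct idle_state *s, int n, const char *arg)
{
	struct idle_state tmp[MAX_STATES];
	const char *p = arg;

	if (n <= 0 || n > MAX_STATES)
		return false;
	memcpy(tmp, s, (size_t)n * sizeof(*s));

	while (*p) {
		char entry[48], name[16];
		unsigned int lat, res;
		size_t len = strcspn(p, ",");
		int used = 0, i;

		if (len == 0 || len >= sizeof(entry))
			return false;
		memcpy(entry, p, len);
		entry[len] = '\0';

		/* Reject entries with missing fields or trailing junk. */
		if (sscanf(entry, "%15[^:]:%u:%u%n", name, &lat, &res, &used) != 3 ||
		    entry[used] != '\0')
			return false;

		for (i = 0; i < n; i++)
			if (!strcmp(tmp[i].name, name))
				break;
		if (i == n)
			return false;	/* unknown state name (assumption) */

		tmp[i].latency_us = lat;
		tmp[i].residency_us = res;

		p += len;
		if (*p == ',')
			p++;
	}

	if (!table_is_valid(tmp, n))
		return false;
	memcpy(s, tmp, (size_t)n * sizeof(*s));
	return true;
}
```

Working on a temporary copy keeps the all-or-nothing behaviour simple: the caller's table changes only after every entry has parsed and the combined table has passed both checks.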

drivers/base/power/wakeirq.c

Lines changed: 7 additions & 4 deletions
@@ -273,8 +273,10 @@ EXPORT_SYMBOL_GPL(dev_pm_set_dedicated_wake_irq_reverse);
  * otherwise try to disable already disabled wakeirq. The wake-up interrupt
  * starts disabled with IRQ_NOAUTOEN set.
  *
- * Should be only called from rpm_suspend() and rpm_resume() path.
- * Caller must hold &dev->power.lock to change wirq->status
+ * Should be called from rpm_suspend(), rpm_resume(),
+ * pm_runtime_force_suspend() or pm_runtime_force_resume().
+ * Caller must hold &dev->power.lock or disable runtime PM to change
+ * wirq->status.
  */
 void dev_pm_enable_wake_irq_check(struct device *dev,
 				  bool can_change_status)
@@ -306,7 +308,8 @@ void dev_pm_enable_wake_irq_check(struct device *dev,
  * @cond_disable: if set, also check WAKE_IRQ_DEDICATED_REVERSE
  *
  * Disables wake-up interrupt conditionally based on status.
- * Should be only called from rpm_suspend() and rpm_resume() path.
+ * Should be called from rpm_suspend(), rpm_resume(),
+ * pm_runtime_force_suspend() or pm_runtime_force_resume().
  */
 void dev_pm_disable_wake_irq_check(struct device *dev, bool cond_disable)
 {
@@ -332,7 +335,7 @@ void dev_pm_disable_wake_irq_check(struct device *dev, bool cond_disable)
  * enable wake IRQ after running ->runtime_suspend() which depends on
  * WAKE_IRQ_DEDICATED_REVERSE.
  *
- * Should be only called from rpm_suspend() path.
+ * Should be called from rpm_suspend() or pm_runtime_force_suspend().
  */
 void dev_pm_enable_wake_irq_complete(struct device *dev)
 {

drivers/cpuidle/cpuidle.c

Lines changed: 10 additions & 0 deletions
@@ -359,6 +359,16 @@ noinstr int cpuidle_enter_state(struct cpuidle_device *dev,
 int cpuidle_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		   bool *stop_tick)
 {
+	/*
+	 * If there is only a single idle state (or none), there is nothing
+	 * meaningful for the governor to choose. Skip the governor and
+	 * always use state 0 with the tick running.
+	 */
+	if (drv->state_count <= 1) {
+		*stop_tick = false;
+		return 0;
+	}
+
 	return cpuidle_curr_governor->select(drv, dev, stop_tick);
 }
364374

drivers/cpuidle/governors/haltpoll.c

Lines changed: 1 addition & 3 deletions
@@ -50,9 +50,7 @@ static int haltpoll_select(struct cpuidle_driver *drv,
 			   struct cpuidle_device *dev,
 			   bool *stop_tick)
 {
-	s64 latency_req = cpuidle_governor_latency_req(dev->cpu);
-
-	if (!drv->state_count || latency_req == 0) {
+	if (cpuidle_governor_latency_req(dev->cpu) == 0) {
 		*stop_tick = false;
 		return 0;
 	}

drivers/cpuidle/governors/menu.c

Lines changed: 1 addition & 1 deletion
@@ -281,7 +281,7 @@ static int menu_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 		data->bucket = BUCKETS - 1;
 	}

-	if (drv->state_count <= 1 || latency_req == 0 ||
+	if (latency_req == 0 ||
 	    ((data->next_timer_ns < drv->states[1].target_residency_ns ||
 	      latency_req < drv->states[1].exit_latency_ns) &&
 	     !dev->states_usage[0].disable)) {

drivers/cpuidle/governors/teo.c

Lines changed: 0 additions & 6 deletions
@@ -338,12 +338,6 @@ static int teo_select(struct cpuidle_driver *drv, struct cpuidle_device *dev,
 	 */
 	cpu_data->sleep_length_ns = KTIME_MAX;

-	/* Check if there is any choice in the first place. */
-	if (drv->state_count < 2) {
-		idx = 0;
-		goto out_tick;
-	}
-
 	if (!dev->states_usage[0].disable)
 		idx = 0;

drivers/powercap/intel_rapl_common.c

Lines changed: 8 additions & 13 deletions
@@ -254,7 +254,7 @@ static void rapl_init_domains(struct rapl_package *rp);
 static int rapl_read_data_raw(struct rapl_domain *rd,
 			      enum rapl_primitives prim,
 			      bool xlate, u64 *data,
-			      bool atomic);
+			      bool pmu_ctx);
 static int rapl_write_data_raw(struct rapl_domain *rd,
 			       enum rapl_primitives prim,
 			       unsigned long long value);
@@ -832,7 +832,7 @@ prim_fixups(struct rapl_domain *rd, enum rapl_primitives prim)
  */
 static int rapl_read_data_raw(struct rapl_domain *rd,
 			      enum rapl_primitives prim, bool xlate, u64 *data,
-			      bool atomic)
+			      bool pmu_ctx)
 {
 	u64 value;
 	enum rapl_primitives prim_fixed = prim_fixups(rd, prim);
@@ -854,7 +854,7 @@ static int rapl_read_data_raw(struct rapl_domain *rd,

 	ra.mask = rpi->mask;

-	if (rd->rp->priv->read_raw(get_rid(rd->rp), &ra, atomic)) {
+	if (rd->rp->priv->read_raw(get_rid(rd->rp), &ra, pmu_ctx)) {
 		pr_debug("failed to read reg 0x%llx for %s:%s\n", ra.reg.val, rd->rp->name, rd->name);
 		return -EIO;
 	}
@@ -1590,23 +1590,21 @@ static struct rapl_pmu rapl_pmu;

 /* PMU helpers */

-static int get_pmu_cpu(struct rapl_package *rp)
+static void set_pmu_cpumask(struct rapl_package *rp, cpumask_var_t mask)
 {
 	int cpu;

 	if (!rp->has_pmu)
-		return nr_cpu_ids;
+		return;

 	/* Only TPMI & MSR RAPL are supported for now */
 	if (rp->priv->type != RAPL_IF_TPMI && rp->priv->type != RAPL_IF_MSR)
-		return nr_cpu_ids;
+		return;

 	/* TPMI/MSR RAPL uses any CPU in the package for PMU */
 	for_each_online_cpu(cpu)
 		if (topology_physical_package_id(cpu) == rp->id)
-			return cpu;
-
-	return nr_cpu_ids;
+			cpumask_set_cpu(cpu, mask);
 }

 static bool is_rp_pmu_cpu(struct rapl_package *rp, int cpu)
@@ -1883,7 +1881,6 @@ static ssize_t cpumask_show(struct device *dev,
 {
 	struct rapl_package *rp;
 	cpumask_var_t cpu_mask;
-	int cpu;
 	int ret;

 	if (!alloc_cpumask_var(&cpu_mask, GFP_KERNEL))
@@ -1895,9 +1892,7 @@ static ssize_t cpumask_show(struct device *dev,

 	/* Choose a cpu for each RAPL Package */
 	list_for_each_entry(rp, &rapl_packages, plist) {
-		cpu = get_pmu_cpu(rp);
-		if (cpu < nr_cpu_ids)
-			cpumask_set_cpu(cpu, cpu_mask);
+		set_pmu_cpumask(rp, cpu_mask);
 	}
 	cpus_read_unlock();
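The difference between the old get_pmu_cpu() and the new set_pmu_cpumask() can be shown with a plain bitmask model. This is only a sketch under stated assumptions: the eight-CPU topology table, the `uint64_t` mask representation and the function names `first_cpu_mask()` and `all_cpus_mask()` are hypothetical and stand in for the kernel's cpumask API and topology helpers.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical system size */

/* Hypothetical topology: CPU index -> physical package id. */
static const int cpu_package[NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };

/* Old behaviour: report only the first online CPU of the package. */
static uint64_t first_cpu_mask(int pkg, uint64_t online)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (((online >> cpu) & 1) && cpu_package[cpu] == pkg)
			return UINT64_C(1) << cpu;
	return 0;	/* no online CPU in this package */
}

/* New behaviour: report every online CPU of the package. */
static uint64_t all_cpus_mask(int pkg, uint64_t online)
{
	uint64_t mask = 0;

	assert(pkg >= 0);
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (((online >> cpu) & 1) && cpu_package[cpu] == pkg)
			mask |= UINT64_C(1) << cpu;
	return mask;
}
```

With the wider mask, a tool that opens a RAPL event on any online CPU of the package finds that CPU in the advertised cpumask, which matches the goal stated in the merge message.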

drivers/powercap/intel_rapl_msr.c

Lines changed: 5 additions & 7 deletions
@@ -110,16 +110,14 @@ static int rapl_cpu_down_prep(unsigned int cpu)
 	return 0;
 }

-static int rapl_msr_read_raw(int cpu, struct reg_action *ra, bool atomic)
+static int rapl_msr_read_raw(int cpu, struct reg_action *ra, bool pmu_ctx)
 {
 	/*
-	 * When called from atomic-context (eg PMU event handler)
-	 * perform MSR read directly using rdmsrq().
+	 * When called from PMU context, perform MSR read directly using
+	 * rdmsrq() without IPI overhead. Package-scoped MSRs are readable
+	 * from any CPU in the package.
 	 */
-	if (atomic) {
-		if (unlikely(smp_processor_id() != cpu))
-			return -EIO;
-
+	if (pmu_ctx) {
 		rdmsrq(ra->reg.msr, ra->value);
 		goto out;
 	}

drivers/powercap/intel_rapl_tpmi.c

Lines changed: 1 addition & 1 deletion
@@ -157,7 +157,7 @@ static int parse_one_domain(struct tpmi_rapl_package *trp, u32 offset)
 	tpmi_domain_flags = tpmi_domain_header >> 32 & 0xffff;

 	if (tpmi_domain_version == TPMI_VERSION_INVALID) {
-		pr_warn(FW_BUG "Invalid version\n");
+		pr_debug("Invalid version, other instances may be valid\n");
 		return -ENODEV;
 	}

include/linux/intel_rapl.h

Lines changed: 1 addition & 1 deletion
@@ -152,7 +152,7 @@ struct rapl_if_priv {
 	union rapl_reg reg_unit;
 	union rapl_reg regs[RAPL_DOMAIN_MAX][RAPL_DOMAIN_REG_MAX];
 	int limits[RAPL_DOMAIN_MAX];
-	int (*read_raw)(int id, struct reg_action *ra, bool atomic);
+	int (*read_raw)(int id, struct reg_action *ra, bool pmu_ctx);
 	int (*write_raw)(int id, struct reg_action *ra);
 	void *defaults;
 	void *rpi;
