Commit 9244724

Merge tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull SMP updates from Thomas Gleixner:

"A large update for SMP management:

  - Parallel CPU bringup

    The reason why people are interested in parallel bringup is to
    shorten the (kexec) reboot time of cloud servers to reduce the
    downtime of the VM tenants.

    The current fully serialized bringup does the following per AP:

      1) Prepare callbacks (allocate, initialize, create threads)
      2) Kick the AP alive (e.g. INIT/SIPI on x86)
      3) Wait for the AP to report alive state
      4) Let the AP continue through the atomic bringup
      5) Let the AP run the threaded bringup to full online state

    There are two significant delays:

      #3 The time for an AP to report alive state in start_secondary()
         on x86 has been measured in the range between 350us and 3.5ms
         depending on vendor and CPU type, BIOS microcode size etc.

      #4 The atomic bringup does the microcode update. This has been
         measured to take up to ~8ms on the primary threads depending
         on the microcode patch size to apply.

    On a two socket SKL server with 56 cores (112 threads) the boot CPU
    spends on current mainline about 800ms busy waiting for the APs to
    come up and apply microcode. That's more than 80% of the actual
    onlining procedure.

    This can be reduced significantly by splitting the bringup
    mechanism into two parts:

      1) Run the prepare callbacks and kick the AP alive for each AP
         which needs to be brought up.

         The APs wake up, do their firmware initialization and run the
         low level kernel startup code including microcode loading in
         parallel up to the first synchronization point. (#1 and #2
         above)

      2) Run the rest of the bringup code strictly serialized per CPU
         (#3 - #5 above) as it's done today.

         Parallelizing that stage of the CPU bringup might be possible
         in theory, but it's questionable whether the required surgery
         would be justified for a pretty small gain.

    If the system is large enough the first AP is already waiting at
    the first synchronization point when the boot CPU finished the
    wake-up of the last AP. That reduces the AP bringup time on that
    SKL from ~800ms to ~80ms, i.e. by a factor ~10x.

    The actual gain varies wildly depending on the system, CPU,
    microcode patch size and other factors. There are some
    opportunities to reduce the overhead further, but that needs some
    deep surgery in the x86 CPU bringup code.

    For now this is only enabled on x86, but the core functionality
    obviously works for all SMP capable architectures.

  - Enhancements for SMP function call tracing so it is possible to
    locate the scheduling and the actual execution points. That allows
    measuring IPI delivery time precisely"

* tag 'smp-core-2023-06-26' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
  trace,smp: Add tracepoints for scheduling remotelly called functions
  trace,smp: Add tracepoints around remotelly called functions
  MAINTAINERS: Add CPU HOTPLUG entry
  x86/smpboot: Fix the parallel bringup decision
  x86/realmode: Make stack lock work in trampoline_compat()
  x86/smp: Initialize cpu_primary_thread_mask late
  cpu/hotplug: Fix off by one in cpuhp_bringup_mask()
  x86/apic: Fix use of X{,2}APIC_ENABLE in asm with older binutils
  x86/smpboot/64: Implement arch_cpuhp_init_parallel_bringup() and enable it
  x86/smpboot: Support parallel startup of secondary CPUs
  x86/smpboot: Implement a bit spinlock to protect the realmode stack
  x86/apic: Save the APIC virtual base address
  cpu/hotplug: Allow "parallel" bringup up to CPUHP_BP_KICK_AP_STATE
  x86/apic: Provide cpu_primary_thread mask
  x86/smpboot: Enable split CPU startup
  cpu/hotplug: Provide a split up CPUHP_BRINGUP mechanism
  cpu/hotplug: Reset task stack state in _cpu_up()
  cpu/hotplug: Remove unused state functions
  riscv: Switch to hotplug core state synchronization
  parisc: Switch to hotplug core state synchronization
  ...
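The effect of the split on the boot CPU's busy-wait time can be illustrated with a toy model in C. This is only a sketch and not kernel code: `struct bringup_model`, `serial_wait_us()` and `split_wait_us()` are hypothetical names, the per-AP costs are placeholders rather than measurements, and the model assumes that every AP performs its firmware initialization and microcode load fully in parallel once it has been kicked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-AP costs in microseconds; placeholders, not measurements. */
struct bringup_model {
	unsigned int nr_aps;	/* number of secondary CPUs to bring up */
	uint64_t kick_us;	/* boot CPU time to prepare and kick one AP */
	uint64_t alive_us;	/* AP time from kick until it reports alive */
	uint64_t ucode_us;	/* AP time spent applying microcode */
	uint64_t rest_us;	/* serialized remainder of the bringup per AP */
};

/* Fully serialized bringup: the boot CPU waits for every AP in turn. */
uint64_t serial_wait_us(const struct bringup_model *m)
{
	assert(m != NULL);
	return (uint64_t)m->nr_aps * (m->alive_us + m->ucode_us);
}

/*
 * Split bringup: kick all APs first, then run the remaining steps
 * serialized. The boot CPU only waits if an AP has not yet reached
 * the synchronization point when its turn comes.
 */
uint64_t split_wait_us(const struct bringup_model *m)
{
	uint64_t now, waited = 0;

	assert(m != NULL);

	/* Phase 1: the boot CPU has kicked every AP at this point in time. */
	now = (uint64_t)m->nr_aps * m->kick_us;

	/* Phase 2: handle the APs one by one in kick order. */
	for (unsigned int i = 0; i < m->nr_aps; i++) {
		/* AP i was kicked at (i + 1) * kick_us and then ran on its own. */
		uint64_t ready = (uint64_t)(i + 1) * m->kick_us +
				 m->alive_us + m->ucode_us;

		if (ready > now) {
			waited += ready - now;
			now = ready;
		}
		now += m->rest_us;
	}
	return waited;
}
```

In this model the wait drops to zero once kicking all APs takes longer than one AP needs to reach the synchronization point, which mirrors the "large enough system" remark in the message above. Real systems deviate from it, for example because the realmode stack is protected by a lock during startup.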
2 parents 7cffdbe + bf5a8c2 commit 9244724

66 files changed

Lines changed: 1077 additions & 1011 deletions

Documentation/admin-guide/kernel-parameters.txt

Lines changed: 6 additions & 14 deletions
@@ -818,20 +818,6 @@
 			Format:
 			<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
 
-	cpu0_hotplug	[X86] Turn on CPU0 hotplug feature when
-			CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
-			Some features depend on CPU0. Known dependencies are:
-			1. Resume from suspend/hibernate depends on CPU0.
-			Suspend/hibernate will fail if CPU0 is offline and you
-			need to online CPU0 before suspend/hibernate.
-			2. PIC interrupts also depend on CPU0. CPU0 can't be
-			removed if a PIC interrupt is detected.
-			It's said poweroff/reboot may depend on CPU0 on some
-			machines although I haven't seen such issues so far
-			after CPU0 is offline on a few tested machines.
-			If the dependencies are under your control, you can
-			turn on cpu0_hotplug.
-
 	cpuidle.off=1	[CPU_IDLE]
 			disable the cpuidle sub-system
 
@@ -852,6 +838,12 @@
 			on every CPU online, such as boot, and resume from suspend.
 			Default: 10000
 
+	cpuhp.parallel=
+			[SMP] Enable/disable parallel bringup of secondary CPUs
+			Format: <bool>
+			Default is enabled if CONFIG_HOTPLUG_PARALLEL=y. Otherwise
+			the parameter has no effect.
+
 	crash_kexec_post_notifiers
 			Run kdump after running panic-notifiers and dumping
 			kmsg. This only for the users who doubt kdump always

Documentation/core-api/cpu_hotplug.rst

Lines changed: 2 additions & 11 deletions
@@ -127,17 +127,8 @@ bring CPU4 back online::
  $ echo 1 > /sys/devices/system/cpu/cpu4/online
  smpboot: Booting Node 0 Processor 4 APIC 0x1
 
-The CPU is usable again. This should work on all CPUs. CPU0 is often special
-and excluded from CPU hotplug. On X86 the kernel option
-*CONFIG_BOOTPARAM_HOTPLUG_CPU0* has to be enabled in order to be able to
-shutdown CPU0. Alternatively the kernel command option *cpu0_hotplug* can be
-used. Some known dependencies of CPU0:
-
-* Resume from hibernate/suspend. Hibernate/suspend will fail if CPU0 is offline.
-* PIC interrupts. CPU0 can't be removed if a PIC interrupt is detected.
-
-Please let Fenghua Yu <fenghua.yu@intel.com> know if you find any dependencies
-on CPU0.
+The CPU is usable again. This should work on all CPUs, but CPU0 is often special
+and excluded from CPU hotplug.
 
 The CPU hotplug coordination
 ============================

MAINTAINERS

Lines changed: 12 additions & 0 deletions
@@ -5344,6 +5344,18 @@ F: include/linux/sched/cpufreq.h
 F:	kernel/sched/cpufreq*.c
 F:	tools/testing/selftests/cpufreq/
 
+CPU HOTPLUG
+M:	Thomas Gleixner <tglx@linutronix.de>
+M:	Peter Zijlstra <peterz@infradead.org>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
+F:	kernel/cpu.c
+F:	kernel/smpboot.*
+F:	include/linux/cpu.h
+F:	include/linux/cpuhotplug.h
+F:	include/linux/smpboot.h
+
 CPU IDLE TIME MANAGEMENT FRAMEWORK
 M:	"Rafael J. Wysocki" <rafael@kernel.org>
 M:	Daniel Lezcano <daniel.lezcano@linaro.org>

arch/Kconfig

Lines changed: 23 additions & 0 deletions
@@ -34,6 +34,29 @@ config ARCH_HAS_SUBPAGE_FAULTS
 config HOTPLUG_SMT
 	bool
 
+# Selected by HOTPLUG_CORE_SYNC_DEAD or HOTPLUG_CORE_SYNC_FULL
+config HOTPLUG_CORE_SYNC
+	bool
+
+# Basic CPU dead synchronization selected by architecture
+config HOTPLUG_CORE_SYNC_DEAD
+	bool
+	select HOTPLUG_CORE_SYNC
+
+# Full CPU synchronization with alive state selected by architecture
+config HOTPLUG_CORE_SYNC_FULL
+	bool
+	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
+	select HOTPLUG_CORE_SYNC
+
+config HOTPLUG_SPLIT_STARTUP
+	bool
+	select HOTPLUG_CORE_SYNC_FULL
+
+config HOTPLUG_PARALLEL
+	bool
+	select HOTPLUG_SPLIT_STARTUP
+
 config GENERIC_ENTRY
 	bool
 

arch/arm/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -125,6 +125,7 @@ config ARM
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_UID16
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
+	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
 	select IRQ_FORCED_THREADING
 	select MODULES_USE_ELF_REL
 	select NEED_DMA_MAP_STATE

arch/arm/include/asm/smp.h

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ extern void secondary_startup_arm(void);
 
 extern int __cpu_disable(void);
 
-extern void __cpu_die(unsigned int cpu);
+static inline void __cpu_die(unsigned int cpu) { }
 
 extern void arch_send_call_function_single_ipi(int cpu);
 extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);

arch/arm/kernel/smp.c

Lines changed: 7 additions & 11 deletions
@@ -288,15 +288,11 @@ int __cpu_disable(void)
 }
 
 /*
- * called on the thread which is asking for a CPU to be shutdown -
- * waits until shutdown has completed, or it is timed out.
+ * called on the thread which is asking for a CPU to be shutdown after the
+ * shutdown completed.
  */
-void __cpu_die(unsigned int cpu)
+void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
 {
-	if (!cpu_wait_death(cpu, 5)) {
-		pr_err("CPU%u: cpu didn't die\n", cpu);
-		return;
-	}
 	pr_debug("CPU%u: shutdown\n", cpu);
 
 	clear_tasks_mm_cpumask(cpu);
@@ -336,11 +332,11 @@ void __noreturn arch_cpu_idle_dead(void)
 	flush_cache_louis();
 
 	/*
-	 * Tell __cpu_die() that this CPU is now safe to dispose of. Once
-	 * this returns, power and/or clocks can be removed at any point
-	 * from this CPU and its cache by platform_cpu_kill().
+	 * Tell cpuhp_bp_sync_dead() that this CPU is now safe to dispose
+	 * of. Once this returns, power and/or clocks can be removed at
+	 * any point from this CPU and its cache by platform_cpu_kill().
 	 */
-	(void)cpu_report_death();
+	cpuhp_ap_report_dead();
 
 	/*
 	 * Ensure that the cache lines associated with that completion are

arch/arm64/Kconfig

Lines changed: 1 addition & 0 deletions
@@ -222,6 +222,7 @@ config ARM64
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
 	select HAVE_GENERIC_VDSO
+	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
 	select IRQ_DOMAIN
 	select IRQ_FORCED_THREADING
 	select KASAN_VMALLOC if KASAN

arch/arm64/include/asm/smp.h

Lines changed: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ static inline void arch_send_wakeup_ipi_mask(const struct cpumask *mask)
 
 extern int __cpu_disable(void);
 
-extern void __cpu_die(unsigned int cpu);
+static inline void __cpu_die(unsigned int cpu) { }
 extern void __noreturn cpu_die(void);
 extern void __noreturn cpu_die_early(void);
 

arch/arm64/kernel/smp.c

Lines changed: 5 additions & 9 deletions
@@ -332,17 +332,13 @@ static int op_cpu_kill(unsigned int cpu)
 }
 
 /*
- * called on the thread which is asking for a CPU to be shutdown -
- * waits until shutdown has completed, or it is timed out.
+ * Called on the thread which is asking for a CPU to be shutdown after the
+ * shutdown completed.
  */
-void __cpu_die(unsigned int cpu)
+void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
 {
 	int err;
 
-	if (!cpu_wait_death(cpu, 5)) {
-		pr_crit("CPU%u: cpu didn't die\n", cpu);
-		return;
-	}
 	pr_debug("CPU%u: shutdown\n", cpu);
 
 	/*
@@ -369,8 +365,8 @@ void __noreturn cpu_die(void)
 
 	local_daif_mask();
 
-	/* Tell __cpu_die() that this CPU is now safe to dispose of */
-	(void)cpu_report_death();
+	/* Tell cpuhp_bp_sync_dead() that this CPU is now safe to dispose of */
+	cpuhp_ap_report_dead();
 
 	/*
 	 * Actually shutdown the CPU. This must never fail. The specific hotplug
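
The arm and arm64 hunks above follow the same pattern: the architecture code no longer waits for the dying CPU itself, it only reports the dead state and provides a cleanup hook that runs after the wait has succeeded. A minimal userspace sketch of that handshake, assuming C11 atomics, could look as follows. All `model_*` names are hypothetical, the bounded poll count stands in for a real timeout, and the code is not the kernel's implementation of cpuhp_ap_report_dead() or cpuhp_bp_sync_dead().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states for this sketch; the kernel uses its own state set. */
enum model_sync_state { MODEL_ALIVE, MODEL_DEAD };

struct model_cpu {
	atomic_int sync_state;	/* written by the dying CPU, read by the control CPU */
	bool cleaned_up;	/* set once the cleanup hook has run */
};

void model_cpu_init(struct model_cpu *cpu)
{
	assert(cpu != NULL);
	atomic_init(&cpu->sync_state, MODEL_ALIVE);
	cpu->cleaned_up = false;
}

/* Dying CPU side: publish that this CPU is safe to dispose of. */
void model_ap_report_dead(struct model_cpu *cpu)
{
	atomic_store_explicit(&cpu->sync_state, MODEL_DEAD, memory_order_release);
}

/* Stand-in for the per-architecture cleanup hook. */
void model_cleanup_dead_cpu(struct model_cpu *cpu)
{
	cpu->cleaned_up = true;
}

/*
 * Control CPU side: poll a bounded number of times for the dead state.
 * The cleanup hook runs only after the dead state has been observed.
 * Returns false if the CPU never reported dead within max_polls.
 */
bool model_bp_sync_dead(struct model_cpu *cpu, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (atomic_load_explicit(&cpu->sync_state,
					 memory_order_acquire) == MODEL_DEAD) {
			model_cleanup_dead_cpu(cpu);
			return true;
		}
	}
	return false;
}
```

Keeping the wait in one shared function and leaving only the cleanup to the architecture is what lets the removed `cpu_wait_death(cpu, 5)` calls and their per-architecture error messages go away in the hunks above.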
