
Commit 4090871
Merge tag 'kvmarm-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 updates for 6.3

- Provide a virtual cache topology to the guest to avoid inconsistencies with migration on heterogeneous systems. Non-secure software has no practical need to traverse the caches by set/way in the first place.

- Add support for taking stage-2 access faults in parallel. This was an accidental omission in the original parallel faults implementation, but should provide a marginal improvement to machines without FEAT_HAFDBS (such as hardware from the fruit company).

- A preamble to adding support for nested virtualization to KVM, including vEL2 register state, rudimentary nested exception handling and masking unsupported features for nested guests.

- Fixes to the PSCI relay that avoid an unexpected host SVE trap when resuming a CPU when running pKVM.

- VGIC maintenance interrupt support for the AIC.

- Improvements to the arch timer emulation, primarily aimed at reducing the trap overhead of running nested.

- Add CONFIG_USERFAULTFD to the KVM selftests config fragment in the interest of CI systems.

- Avoid VM-wide stop-the-world operations when a vCPU accesses its own redistributor.

- Serialize when toggling CPACR_EL1.SMEN to avoid unexpected exceptions in the host.

- Aesthetic and comment/kerneldoc fixes.

- Drop the vestiges of the old Columbia mailing list and add [Oliver] as co-maintainer.

This also drags in arm64's 'for-next/sme2' branch, because both it and the PSCI relay changes touch the EL2 initialization code.
2 parents: 7f604e9 + 96a4627

85 files changed

Lines changed: 3105 additions & 402 deletions


Documentation/admin-guide/kernel-parameters.txt

Lines changed: 6 additions & 1 deletion
@@ -2553,9 +2553,14 @@
 			protected: nVHE-based mode with support for guests whose
 			state is kept private from the host.
 
+			nested: VHE-based mode with support for nested
+			virtualization. Requires at least ARMv8.3
+			hardware.
+
 			Defaults to VHE/nVHE based on hardware support. Setting
 			mode to "protected" will disable kexec and hibernation
-			for the host.
+			for the host. "nested" is experimental and should be
+			used with extreme caution.
 
 	kvm-arm.vgic_v3_group0_trap=
 			[KVM,ARM] Trap guest accesses to GICv3 group-0
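For concreteness, the new mode is selected through the same kvm-arm.mode= switch documented above; a boot command line enabling it might look like this (the console and root arguments are placeholders, not part of this change):

```
console=ttyAMA0 root=/dev/vda2 kvm-arm.mode=nested
```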

Documentation/arm64/booting.rst

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -369,6 +369,16 @@ Before jumping into the kernel, the following conditions must be met:
369369

370370
- HCR_EL2.ATA (bit 56) must be initialised to 0b1.
371371

372+
For CPUs with the Scalable Matrix Extension version 2 (FEAT_SME2):
373+
374+
- If EL3 is present:
375+
376+
- SMCR_EL3.EZT0 (bit 30) must be initialised to 0b1.
377+
378+
- If the kernel is entered at EL1 and EL2 is present:
379+
380+
- SMCR_EL2.EZT0 (bit 30) must be initialised to 0b1.
381+
372382
The requirements described above for CPU mode, caches, MMUs, architected
373383
timers, coherency and system registers apply to all CPUs. All CPUs must
374384
enter the kernel in the same exception level. Where the values documented

Documentation/arm64/elf_hwcaps.rst

Lines changed: 18 additions & 0 deletions
@@ -284,6 +284,24 @@ HWCAP2_RPRFM
 HWCAP2_SVE2P1
     Functionality implied by ID_AA64ZFR0_EL1.SVEver == 0b0010.
 
+HWCAP2_SME2
+    Functionality implied by ID_AA64SMFR0_EL1.SMEver == 0b0001.
+
+HWCAP2_SME2P1
+    Functionality implied by ID_AA64SMFR0_EL1.SMEver == 0b0010.
+
+HWCAP2_SMEI16I32
+    Functionality implied by ID_AA64SMFR0_EL1.I16I32 == 0b0101
+
+HWCAP2_SMEBI32I32
+    Functionality implied by ID_AA64SMFR0_EL1.BI32I32 == 0b1
+
+HWCAP2_SMEB16B16
+    Functionality implied by ID_AA64SMFR0_EL1.B16B16 == 0b1
+
+HWCAP2_SMEF16F16
+    Functionality implied by ID_AA64SMFR0_EL1.F16F16 == 0b1
+
 4. Unused AT_HWCAP bits
 -----------------------

Documentation/arm64/sme.rst

Lines changed: 43 additions & 9 deletions
@@ -18,14 +18,19 @@ model features for SME is included in Appendix A.
 1. General
 -----------
 
-* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA
-  register state and TPIDR2_EL0 are tracked per thread.
+* PSTATE.SM, PSTATE.ZA, the streaming mode vector length, the ZA and (when
+  present) ZTn register state and TPIDR2_EL0 are tracked per thread.
 
 * The presence of SME is reported to userspace via HWCAP2_SME in the aux vector
   AT_HWCAP2 entry. Presence of this flag implies the presence of the SME
   instructions and registers, and the Linux-specific system interfaces
   described in this document. SME is reported in /proc/cpuinfo as "sme".
 
+* The presence of SME2 is reported to userspace via HWCAP2_SME2 in the
+  aux vector AT_HWCAP2 entry. Presence of this flag implies the presence of
+  the SME2 instructions and ZT0, and the Linux-specific system interfaces
+  described in this document. SME2 is reported in /proc/cpuinfo as "sme2".
+
 * Support for the execution of SME instructions in userspace can also be
   detected by reading the CPU ID register ID_AA64PFR1_EL1 using an MRS
   instruction, and checking that the value of the SME field is nonzero. [3]
@@ -44,6 +49,7 @@ model features for SME is included in Appendix A.
   HWCAP2_SME_B16F32
   HWCAP2_SME_F32F32
   HWCAP2_SME_FA64
+  HWCAP2_SME2
 
 This list may be extended over time as the SME architecture evolves.
 
@@ -52,8 +58,8 @@ model features for SME is included in Appendix A.
   cpu-feature-registers.txt for details.
 
 * Debuggers should restrict themselves to interacting with the target via the
-  NT_ARM_SVE, NT_ARM_SSVE and NT_ARM_ZA regsets. The recommended way
-  of detecting support for these regsets is to connect to a target process
+  NT_ARM_SVE, NT_ARM_SSVE, NT_ARM_ZA and NT_ARM_ZT regsets. The recommended
+  way of detecting support for these regsets is to connect to a target process
   first and then attempt a
 
 	ptrace(PTRACE_GETREGSET, pid, NT_ARM_<regset>, &iov).
@@ -89,13 +95,13 @@ be zeroed.
 -------------------------
 
 * On syscall PSTATE.ZA is preserved, if PSTATE.ZA==1 then the contents of the
-  ZA matrix are preserved.
+  ZA matrix and ZTn (if present) are preserved.
 
 * On syscall PSTATE.SM will be cleared and the SVE registers will be handled
   as per the standard SVE ABI.
 
-* Neither the SVE registers nor ZA are used to pass arguments to or receive
-  results from any syscall.
+* None of the SVE registers, ZA or ZTn are used to pass arguments to
+  or receive results from any syscall.
 
 * On process creation (eg, clone()) the newly created process will have
   PSTATE.SM cleared.
@@ -134,6 +140,14 @@ be zeroed.
   __reserved[] referencing this space. za_context is then written in the
   extra space. Refer to [1] for further details about this mechanism.
 
+* If ZTn is supported and PSTATE.ZA==1 then a signal frame record for ZTn will
+  be generated.
+
+* The signal record for ZTn has magic ZT_MAGIC (0x5a544e01) and consists of a
+  standard signal frame header followed by a struct zt_context specifying
+  the number of ZTn registers supported by the system, then zt_context.nregs
+  blocks of 64 bytes of data per register.
+
 
 5. Signal return
 -----------------
@@ -151,6 +165,9 @@ When returning from a signal handler:
   the signal frame does not match the current vector length, the signal return
   attempt is treated as illegal, resulting in a forced SIGSEGV.
 
+* If ZTn is not supported or PSTATE.ZA==0 then it is illegal to have a
+  signal frame record for ZTn, resulting in a forced SIGSEGV.
+
 
 6. prctl extensions
 --------------------
@@ -214,8 +231,8 @@ prctl(PR_SME_SET_VL, unsigned long arg)
   vector length that will be applied at the next execve() by the calling
   thread.
 
-* Changing the vector length causes all of ZA, P0..P15, FFR and all bits of
-  Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
+* Changing the vector length causes all of ZA, ZTn, P0..P15, FFR and all
+  bits of Z0..Z31 except for Z0 bits [127:0] .. Z31 bits [127:0] to become
   unspecified, including both streaming and non-streaming SVE state.
   Calling PR_SME_SET_VL with vl equal to the thread's current vector
   length, or calling PR_SME_SET_VL with the PR_SVE_SET_VL_ONEXEC flag,
@@ -317,6 +334,15 @@ The regset data starts with struct user_za_header, containing:
 
 * The effect of writing a partial, incomplete payload is unspecified.
 
+* A new regset NT_ARM_ZT is defined for access to ZTn state via
+  PTRACE_GETREGSET and PTRACE_SETREGSET.
+
+* The NT_ARM_ZT regset consists of a single 512 bit register.
+
+* When PSTATE.ZA==0 reads of NT_ARM_ZT will report all bits of ZTn as 0.
+
+* Writes to NT_ARM_ZT will set PSTATE.ZA to 1.
+
 
 8. ELF coredump extensions
 ---------------------------
@@ -331,6 +357,11 @@ The regset data starts with struct user_za_header, containing:
   been read if a PTRACE_GETREGSET of NT_ARM_ZA were executed for each thread
   when the coredump was generated.
 
+* A NT_ARM_ZT note will be added to each coredump for each thread of the
+  dumped process. The contents will be equivalent to the data that would have
+  been read if a PTRACE_GETREGSET of NT_ARM_ZT were executed for each thread
+  when the coredump was generated.
+
 * The NT_ARM_TLS note will be extended to two registers, the second register
   will contain TPIDR2_EL0 on systems that support SME and will be read as
   zero with writes ignored otherwise.
@@ -406,6 +437,9 @@ In A64 state, SME adds the following:
 For best system performance it is strongly encouraged for software to enable
 ZA only when it is actively being used.
 
+* A new ZT0 register is introduced when SME2 is present. This is a 512 bit
+  register which is accessible when PSTATE.ZA is set, as ZA itself is.
+
 * Two new 1 bit fields in PSTATE which may be controlled via the SMSTART and
   SMSTOP instructions or by access to the SVCR system register:

MAINTAINERS

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -11362,13 +11362,12 @@ F: virt/kvm/*
1136211362

1136311363
KERNEL VIRTUAL MACHINE FOR ARM64 (KVM/arm64)
1136411364
M: Marc Zyngier <maz@kernel.org>
11365+
M: Oliver Upton <oliver.upton@linux.dev>
1136511366
R: James Morse <james.morse@arm.com>
1136611367
R: Suzuki K Poulose <suzuki.poulose@arm.com>
11367-
R: Oliver Upton <oliver.upton@linux.dev>
1136811368
R: Zenghui Yu <yuzenghui@huawei.com>
1136911369
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
1137011370
L: kvmarm@lists.linux.dev
11371-
L: kvmarm@lists.cs.columbia.edu (deprecated, moderated for non-subscribers)
1137211371
S: Maintained
1137311372
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git
1137411373
F: arch/arm64/include/asm/kvm*

arch/arm64/include/asm/cache.h

Lines changed: 9 additions & 0 deletions
@@ -16,6 +16,15 @@
 #define CLIDR_LOC(clidr)	(((clidr) >> CLIDR_LOC_SHIFT) & 0x7)
 #define CLIDR_LOUIS(clidr)	(((clidr) >> CLIDR_LOUIS_SHIFT) & 0x7)
 
+/* Ctypen, bits[3(n - 1) + 2 : 3(n - 1)], for n = 1 to 7 */
+#define CLIDR_CTYPE_SHIFT(level)	(3 * (level - 1))
+#define CLIDR_CTYPE_MASK(level)		(7 << CLIDR_CTYPE_SHIFT(level))
+#define CLIDR_CTYPE(clidr, level)	\
+	(((clidr) & CLIDR_CTYPE_MASK(level)) >> CLIDR_CTYPE_SHIFT(level))
+
+/* Ttypen, bits [2(n - 1) + 34 : 2(n - 1) + 33], for n = 1 to 7 */
+#define CLIDR_TTYPE_SHIFT(level)	(2 * ((level) - 1) + CLIDR_EL1_Ttypen_SHIFT)
+
 /*
  * Memory returned by kmalloc() may be used for DMA, so we must make
  * sure that all such allocations are cache aligned. Otherwise,

arch/arm64/include/asm/cpufeature.h

Lines changed: 6 additions & 0 deletions
@@ -769,6 +769,12 @@ static __always_inline bool system_supports_sme(void)
 		cpus_have_const_cap(ARM64_SME);
 }
 
+static __always_inline bool system_supports_sme2(void)
+{
+	return IS_ENABLED(CONFIG_ARM64_SME) &&
+		cpus_have_const_cap(ARM64_SME2);
+}
+
 static __always_inline bool system_supports_fa64(void)
 {
 	return IS_ENABLED(CONFIG_ARM64_SME) &&

arch/arm64/include/asm/el2_setup.h

Lines changed: 99 additions & 0 deletions
@@ -196,4 +196,103 @@
 	__init_el2_nvhe_prepare_eret
 .endm
 
+#ifndef __KVM_NVHE_HYPERVISOR__
+// This will clobber tmp1 and tmp2, and expect tmp1 to contain
+// the id register value as read from the HW
+.macro __check_override idreg, fld, width, pass, fail, tmp1, tmp2
+	ubfx	\tmp1, \tmp1, #\fld, #\width
+	cbz	\tmp1, \fail
+
+	adr_l	\tmp1, \idreg\()_override
+	ldr	\tmp2, [\tmp1, FTR_OVR_VAL_OFFSET]
+	ldr	\tmp1, [\tmp1, FTR_OVR_MASK_OFFSET]
+	ubfx	\tmp2, \tmp2, #\fld, #\width
+	ubfx	\tmp1, \tmp1, #\fld, #\width
+	cmp	\tmp1, xzr
+	and	\tmp2, \tmp2, \tmp1
+	csinv	\tmp2, \tmp2, xzr, ne
+	cbnz	\tmp2, \pass
+	b	\fail
+.endm
+
+// This will clobber tmp1 and tmp2
+.macro check_override idreg, fld, pass, fail, tmp1, tmp2
+	mrs	\tmp1, \idreg\()_el1
+	__check_override \idreg \fld 4 \pass \fail \tmp1 \tmp2
+.endm
+#else
+// This will clobber tmp
+.macro __check_override idreg, fld, width, pass, fail, tmp, ignore
+	ldr_l	\tmp, \idreg\()_el1_sys_val
+	ubfx	\tmp, \tmp, #\fld, #\width
+	cbnz	\tmp, \pass
+	b	\fail
+.endm
+
+.macro check_override idreg, fld, pass, fail, tmp, ignore
+	__check_override \idreg \fld 4 \pass \fail \tmp \ignore
+.endm
+#endif
+
+.macro finalise_el2_state
+	check_override id_aa64pfr0, ID_AA64PFR0_EL1_SVE_SHIFT, .Linit_sve_\@, .Lskip_sve_\@, x1, x2
+
+.Linit_sve_\@:	/* SVE register access */
+	mrs	x0, cptr_el2			// Disable SVE traps
+	bic	x0, x0, #CPTR_EL2_TZ
+	msr	cptr_el2, x0
+	isb
+	mov	x1, #ZCR_ELx_LEN_MASK		// SVE: Enable full vector
+	msr_s	SYS_ZCR_EL2, x1			// length for EL1.
+
+.Lskip_sve_\@:
+	check_override id_aa64pfr1, ID_AA64PFR1_EL1_SME_SHIFT, .Linit_sme_\@, .Lskip_sme_\@, x1, x2
+
+.Linit_sme_\@:	/* SME register access and priority mapping */
+	mrs	x0, cptr_el2			// Disable SME traps
+	bic	x0, x0, #CPTR_EL2_TSM
+	msr	cptr_el2, x0
+	isb
+
+	mrs	x1, sctlr_el2
+	orr	x1, x1, #SCTLR_ELx_ENTP2	// Disable TPIDR2 traps
+	msr	sctlr_el2, x1
+	isb
+
+	mov	x0, #0				// SMCR controls
+
+	// Full FP in SM?
+	mrs_s	x1, SYS_ID_AA64SMFR0_EL1
+	__check_override id_aa64smfr0, ID_AA64SMFR0_EL1_FA64_SHIFT, 1, .Linit_sme_fa64_\@, .Lskip_sme_fa64_\@, x1, x2
+
+.Linit_sme_fa64_\@:
+	orr	x0, x0, SMCR_ELx_FA64_MASK
+.Lskip_sme_fa64_\@:
+
+	// ZT0 available?
+	mrs_s	x1, SYS_ID_AA64SMFR0_EL1
+	__check_override id_aa64smfr0, ID_AA64SMFR0_EL1_SMEver_SHIFT, 4, .Linit_sme_zt0_\@, .Lskip_sme_zt0_\@, x1, x2
+.Linit_sme_zt0_\@:
+	orr	x0, x0, SMCR_ELx_EZT0_MASK
+.Lskip_sme_zt0_\@:
+
+	orr	x0, x0, #SMCR_ELx_LEN_MASK	// Enable full SME vector
+	msr_s	SYS_SMCR_EL2, x0		// length for EL1.
+
+	mrs_s	x1, SYS_SMIDR_EL1		// Priority mapping supported?
+	ubfx	x1, x1, #SMIDR_EL1_SMPS_SHIFT, #1
+	cbz	x1, .Lskip_sme_\@
+
+	msr_s	SYS_SMPRIMAP_EL2, xzr		// Make all priorities equal
+
+	mrs	x1, id_aa64mmfr1_el1		// HCRX_EL2 present?
+	ubfx	x1, x1, #ID_AA64MMFR1_EL1_HCX_SHIFT, #4
+	cbz	x1, .Lskip_sme_\@
+
+	mrs_s	x1, SYS_HCRX_EL2
+	orr	x1, x1, #HCRX_EL2_SMPME_MASK	// Enable priority mapping
+	msr_s	SYS_HCRX_EL2, x1
+.Lskip_sme_\@:
+.endm
+
 #endif /* __ARM_KVM_INIT_H__ */

arch/arm64/include/asm/esr.h

Lines changed: 5 additions & 0 deletions
@@ -272,6 +272,10 @@
 			 (((e) & ESR_ELx_SYS64_ISS_OP2_MASK) >>	\
 			  ESR_ELx_SYS64_ISS_OP2_SHIFT))
 
+/* ISS field definitions for ERET/ERETAA/ERETAB trapping */
+#define ESR_ELx_ERET_ISS_ERET		0x2
+#define ESR_ELx_ERET_ISS_ERETA		0x1
+
 /*
  * ISS field definitions for floating-point exception traps
  * (FP_EXC_32/FP_EXC_64).
@@ -350,6 +354,7 @@
 #define ESR_ELx_SME_ISS_ILL		1
 #define ESR_ELx_SME_ISS_SM_DISABLED	2
 #define ESR_ELx_SME_ISS_ZA_DISABLED	3
+#define ESR_ELx_SME_ISS_ZT_DISABLED	4
 
 #ifndef __ASSEMBLY__
 #include <asm/types.h>
