Commit f3826aa

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "This excludes the bulk of the x86 changes, which I will send
  separately. They have two not complex but relatively unusual conflicts
  so I will wait for other dust to settle.

  guest_memfd:

   - Add support for host userspace mapping of guest_memfd-backed memory
     for VM types that do NOT support KVM_MEMORY_ATTRIBUTE_PRIVATE
     (which isn't precisely the same thing as CoCo VMs, since x86's
     SEV-MEM and SEV-ES have no way to detect private vs. shared).

     This lays the groundwork for removal of guest memory from the
     kernel direct map, as well as for limited mmap() for
     guest_memfd-backed memory.

     For more information see:

       - commit a6ad541 ("Merge branch 'guest-memfd-mmap' into HEAD")

       - guest_memfd in Firecracker:
           https://github.com/firecracker-microvm/firecracker/tree/feature/secret-hiding

       - direct map removal:
           https://lore.kernel.org/all/20250221160728.1584559-1-roypat@amazon.co.uk/

       - mmap support:
           https://lore.kernel.org/all/20250328153133.3504118-1-tabba@google.com/

  ARM:

   - Add support for FF-A 1.2 as the secure memory conduit for pKVM,
     allowing more registers to be used as part of the message payload.

   - Change the way pKVM allocates its VM handles, making sure that the
     privileged hypervisor is never tricked into using uninitialised
     data.

   - Speed up MMIO range registration by avoiding unnecessary RCU
     synchronisation, which results in VMs starting much quicker.

   - Add the dump of the instruction stream when panic-ing in the EL2
     payload, just like the rest of the kernel has always done. This
     will hopefully help debugging non-VHE setups.

   - Add 52bit PA support to the stage-1 page-table walker, and make use
     of it to populate the fault level reported to the guest on failing
     to translate a stage-1 walk.

   - Add NV support to the GICv3-on-GICv5 emulation code, ensuring
     feature parity for guests, irrespective of the host platform.

   - Fix some really ugly architecture problems when dealing with debug
     in a nested VM. This has some bad performance impacts, but is at
     least correct.

   - Add enough infrastructure to be able to disable EL2 features and
     give effective values to the EL2 control registers. This then
     allows a bunch of features to be turned off, which helps cross-host
     migration.

   - Large rework of the selftest infrastructure to allow most tests to
     transparently run at EL2. This is the first step towards enabling
     NV testing.

   - Various fixes and improvements all over the map, including one BE
     fix, just in time for the removal of the feature.

  LoongArch:

   - Detect page table walk feature on new hardware

   - Add sign extension with kernel MMIO/IOCSR emulation

   - Improve in-kernel IPI emulation

   - Improve in-kernel PCH-PIC emulation

   - Move kvm_iocsr tracepoint out of generic code

  RISC-V:

   - Added SBI FWFT extension for Guest/VM with misaligned delegation
     and pointer masking PMLEN features

   - Added ONE_REG interface for SBI FWFT extension

   - Added Zicbop and bfloat16 extensions for Guest/VM

   - Enabled more common KVM selftests for RISC-V

   - Added SBI v3.0 PMU enhancements in KVM and perf driver

  s390:

   - Improve interrupt cpu for wakeup, in particular the heuristic to
     decide which vCPU to deliver a floating interrupt to.

   - Clear the PTE when discarding a swapped page because of CMMA; this
     bug was introduced in 6.16 when refactoring gmap code.

  x86 selftests:

   - Add #DE coverage in the fastops test (the only exception that's
     guest-triggerable in fastop-emulated instructions).

   - Fix PMU selftests errors encountered on Granite Rapids (GNR),
     Sierra Forest (SRF) and Clearwater Forest (CWF).

   - Minor cleanups and improvements

  x86 (guest side):

   - Force the legacy PCI hole (memory between TOLUD and 4GiB) to UC
     when overriding guest MTRR for TDX/SNP to fix an issue where ACPI
     auto-mapping could map devices as WB and prevent the device drivers
     from mapping their devices with UC/UC-.

   - Make kvm_async_pf_task_wake() a local static helper and remove its
     export.

   - Use native qspinlocks when running in a VM with dedicated
     vCPU=>pCPU bindings even when PV_UNHALT is unsupported.

  Generic:

   - Remove a redundant __GFP_NOWARN from kvm_setup_async_pf() as
     __GFP_NOWARN is now included in GFP_NOWAIT."

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (178 commits)
  KVM: s390: Fix to clear PTE when discarding a swapped page
  KVM: arm64: selftests: Cover ID_AA64ISAR3_EL1 in set_id_regs
  KVM: arm64: selftests: Remove a duplicate register listing in set_id_regs
  KVM: arm64: selftests: Cope with arch silliness in EL2 selftest
  KVM: arm64: selftests: Add basic test for running in VHE EL2
  KVM: arm64: selftests: Enable EL2 by default
  KVM: arm64: selftests: Initialize HCR_EL2
  KVM: arm64: selftests: Use the vCPU attr for setting nr of PMU counters
  KVM: arm64: selftests: Use hyp timer IRQs when test runs at EL2
  KVM: arm64: selftests: Select SMCCC conduit based on current EL
  KVM: arm64: selftests: Provide helper for getting default vCPU target
  KVM: arm64: selftests: Alias EL1 registers to EL2 counterparts
  KVM: arm64: selftests: Create a VGICv3 for 'default' VMs
  KVM: arm64: selftests: Add unsanitised helpers for VGICv3 creation
  KVM: arm64: selftests: Add helper to check for VGICv3 support
  KVM: arm64: selftests: Initialize VGICv3 only once
  KVM: arm64: selftests: Provide kvm_arch_vm_post_create() in library code
  KVM: selftests: Add ex_str() to print human friendly name of exception vectors
  selftests/kvm: remove stale TODO in xapic_state_test
  KVM: selftests: Handle Intel Atom errata that leads to PMU event overcount
  ...

2 parents bf897d2 + 99cab80 commit f3826aa

148 files changed

Lines changed: 4120 additions & 1489 deletions

Documentation/virt/kvm/api.rst

Lines changed: 9 additions & 0 deletions
@@ -6414,6 +6414,15 @@ most one mapping per page, i.e. binding multiple memory regions to a single
 guest_memfd range is not allowed (any number of memory regions can be bound to
 a single guest_memfd file, but the bound ranges must not overlap).

+When the capability KVM_CAP_GUEST_MEMFD_MMAP is supported, the 'flags' field
+supports GUEST_MEMFD_FLAG_MMAP. Setting this flag on guest_memfd creation
+enables mmap() and faulting of guest_memfd memory to host userspace.
+
+When the KVM MMU performs a PFN lookup to service a guest fault and the backing
+guest_memfd has the GUEST_MEMFD_FLAG_MMAP set, then the fault will always be
+consumed from guest_memfd, regardless of whether it is a shared or a private
+fault.
+
 See KVM_SET_USER_MEMORY_REGION2 for additional details.

 4.143 KVM_PRE_FAULT_MEMORY
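
Note on the lookup rule documented in this hunk: it can be read as a small decision function. The sketch below is a plain C model, not KVM code. The enum, the function name and its parameters are hypothetical, and the non-mmap branch (private faults served from guest_memfd, shared faults from the ordinary userspace mapping) is an assumption about the pre-existing behaviour rather than something stated in this diff.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; none of these exist in KVM. */
enum fault_source {
	FAULT_FROM_USERSPACE_MAPPING,
	FAULT_FROM_GUEST_MEMFD,
};

/*
 * gmem_backed:      the memslot is bound to a guest_memfd
 * gmem_mmap_flag:   that guest_memfd was created with GUEST_MEMFD_FLAG_MMAP
 * fault_is_private: the guest fault targets private memory
 */
static enum fault_source pick_fault_source(bool gmem_backed, bool gmem_mmap_flag,
					   bool fault_is_private)
{
	if (!gmem_backed)
		return FAULT_FROM_USERSPACE_MAPPING;

	/* Documented rule: with the flag set, guest_memfd serves every fault. */
	if (gmem_mmap_flag)
		return FAULT_FROM_GUEST_MEMFD;

	/* Assumed behaviour without the flag. */
	return fault_is_private ? FAULT_FROM_GUEST_MEMFD
				: FAULT_FROM_USERSPACE_MAPPING;
}
```

In this model the private/shared distinction matters only when the flag is clear, which is the point the added paragraph makes.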

arch/arm64/include/asm/kvm_asm.h

Lines changed: 2 additions & 0 deletions
@@ -81,6 +81,8 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff,
 	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_vmcr_aprs,
 	__KVM_HOST_SMCCC_FUNC___vgic_v3_restore_vmcr_aprs,
+	__KVM_HOST_SMCCC_FUNC___pkvm_reserve_vm,
+	__KVM_HOST_SMCCC_FUNC___pkvm_unreserve_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
 	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm,

arch/arm64/include/asm/kvm_emulate.h

Lines changed: 28 additions & 6 deletions
@@ -220,6 +220,20 @@ static inline bool vcpu_el2_tge_is_set(const struct kvm_vcpu *vcpu)

 static inline bool vcpu_el2_amo_is_set(const struct kvm_vcpu *vcpu)
 {
+	/*
+	 * DDI0487L.b Known Issue D22105
+	 *
+	 * When executing at EL2 and HCR_EL2.{E2H,TGE} = {1, 0} it is
+	 * IMPLEMENTATION DEFINED whether the effective value of HCR_EL2.AMO
+	 * is the value programmed or 1.
+	 *
+	 * Make the implementation choice of treating the effective value as 1 as
+	 * we cannot subsequently catch changes to TGE or AMO that would
+	 * otherwise lead to the SError becoming deliverable.
+	 */
+	if (vcpu_is_el2(vcpu) && vcpu_el2_e2h_is_set(vcpu) && !vcpu_el2_tge_is_set(vcpu))
+		return true;
+
 	return ctxt_sys_reg(&vcpu->arch.ctxt, HCR_EL2) & HCR_AMO;
 }
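
Note on the hunk above: the effective-AMO rule can be checked in isolation with a user-space model. This is a sketch, not the kernel helper. The function name and the `at_el2` parameter (standing in for `vcpu_is_el2()`) are hypothetical. The bit positions are the architectural HCR_EL2 ones (AMO bit 5, TGE bit 27, E2H bit 34).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural HCR_EL2 bit positions. */
#define HCR_AMO	(UINT64_C(1) << 5)
#define HCR_TGE	(UINT64_C(1) << 27)
#define HCR_E2H	(UINT64_C(1) << 34)

/* Hypothetical stand-in for vcpu_el2_amo_is_set(); at_el2 models vcpu_is_el2(). */
static bool effective_amo(uint64_t hcr_el2, bool at_el2)
{
	bool e2h = (hcr_el2 & HCR_E2H) != 0;
	bool tge = (hcr_el2 & HCR_TGE) != 0;

	/* At EL2 with {E2H,TGE} = {1,0}, treat AMO as 1 whatever was programmed. */
	if (at_el2 && e2h && !tge)
		return true;

	return (hcr_el2 & HCR_AMO) != 0;
}
```

For example, `effective_amo(HCR_E2H, true)` is true even though the AMO bit is clear, while the same register value evaluated outside EL2 yields false.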

@@ -511,21 +525,29 @@ static inline void kvm_vcpu_set_be(struct kvm_vcpu *vcpu)
 	if (vcpu_mode_is_32bit(vcpu)) {
 		*vcpu_cpsr(vcpu) |= PSR_AA32_E_BIT;
 	} else {
-		u64 sctlr = vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+		enum vcpu_sysreg r;
+		u64 sctlr;
+
+		r = vcpu_has_nv(vcpu) ? SCTLR_EL2 : SCTLR_EL1;
+
+		sctlr = vcpu_read_sys_reg(vcpu, r);
 		sctlr |= SCTLR_ELx_EE;
-		vcpu_write_sys_reg(vcpu, sctlr, SCTLR_EL1);
+		vcpu_write_sys_reg(vcpu, sctlr, r);
 	}
 }

 static inline bool kvm_vcpu_is_be(struct kvm_vcpu *vcpu)
 {
+	enum vcpu_sysreg r;
+	u64 bit;
+
 	if (vcpu_mode_is_32bit(vcpu))
 		return !!(*vcpu_cpsr(vcpu) & PSR_AA32_E_BIT);

-	if (vcpu_mode_priv(vcpu))
-		return !!(vcpu_read_sys_reg(vcpu, SCTLR_EL1) & SCTLR_ELx_EE);
-	else
-		return !!(vcpu_read_sys_reg(vcpu, SCTLR_EL1) & SCTLR_EL1_E0E);
+	r = is_hyp_ctxt(vcpu) ? SCTLR_EL2 : SCTLR_EL1;
+	bit = vcpu_mode_priv(vcpu) ? SCTLR_ELx_EE : SCTLR_EL1_E0E;
+
+	return vcpu_read_sys_reg(vcpu, r) & bit;
 }

 static inline unsigned long vcpu_data_guest_to_host(struct kvm_vcpu *vcpu,
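
Note on the endianness hunk above: the new `kvm_vcpu_is_be()` picks the register from the translation context and the bit from the privilege level. The model below covers only that AArch64 path and is a sketch: `struct model_vcpu` and `model_is_be()` are hypothetical, with two booleans standing in for `is_hyp_ctxt()` and `vcpu_mode_priv()`. The bit positions are the architectural ones (EE bit 25, E0E bit 24).

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural SCTLR bit positions. */
#define SCTLR_ELx_EE	(UINT64_C(1) << 25)
#define SCTLR_EL1_E0E	(UINT64_C(1) << 24)

/* Hypothetical, heavily reduced vCPU state for illustration. */
struct model_vcpu {
	bool hyp_ctxt;		/* stands in for is_hyp_ctxt() */
	bool priv;		/* stands in for vcpu_mode_priv() */
	uint64_t sctlr_el1;
	uint64_t sctlr_el2;
};

/* AArch64-only model of the new selection logic (32-bit path omitted). */
static bool model_is_be(const struct model_vcpu *v)
{
	uint64_t reg = v->hyp_ctxt ? v->sctlr_el2 : v->sctlr_el1;
	uint64_t bit = v->priv ? SCTLR_ELx_EE : SCTLR_EL1_E0E;

	return (reg & bit) != 0;
}
```

A vCPU in hyp context with EE set only in SCTLR_EL2 is reported as big-endian here, whereas the removed code always consulted SCTLR_EL1 and would have reported little-endian.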

arch/arm64/include/asm/kvm_host.h

Lines changed: 3 additions & 2 deletions
@@ -252,7 +252,8 @@ struct kvm_protected_vm {
 	pkvm_handle_t handle;
 	struct kvm_hyp_memcache teardown_mc;
 	struct kvm_hyp_memcache stage2_teardown_mc;
-	bool enabled;
+	bool is_protected;
+	bool is_created;
 };

 struct kvm_mpidr_data {
@@ -1442,7 +1443,7 @@ struct kvm *kvm_arch_alloc_vm(void);

 #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLBS_RANGE

-#define kvm_vm_is_protected(kvm) (is_protected_kvm_enabled() && (kvm)->arch.pkvm.enabled)
+#define kvm_vm_is_protected(kvm) (is_protected_kvm_enabled() && (kvm)->arch.pkvm.is_protected)

 #define vcpu_is_protected(vcpu) kvm_vm_is_protected((vcpu)->kvm)

arch/arm64/include/asm/kvm_nested.h

Lines changed: 25 additions & 2 deletions
@@ -83,6 +83,8 @@ extern void check_nested_vcpu_requests(struct kvm_vcpu *vcpu);
 extern void kvm_nested_flush_hwstate(struct kvm_vcpu *vcpu);
 extern void kvm_nested_sync_hwstate(struct kvm_vcpu *vcpu);

+extern void kvm_nested_setup_mdcr_el2(struct kvm_vcpu *vcpu);
+
 struct kvm_s2_trans {
 	phys_addr_t output;
 	unsigned long block_size;
@@ -265,15 +267,18 @@ static inline u64 decode_range_tlbi(u64 val, u64 *range, u16 *asid)
 	return base;
 }

-static inline unsigned int ps_to_output_size(unsigned int ps)
+static inline unsigned int ps_to_output_size(unsigned int ps, bool pa52bit)
 {
 	switch (ps) {
 	case 0: return 32;
 	case 1: return 36;
 	case 2: return 40;
 	case 3: return 42;
 	case 4: return 44;
-	case 5:
+	case 5: return 48;
+	case 6: if (pa52bit)
+			return 52;
+		fallthrough;
 	default:
 		return 48;
 	}
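
Note on `ps_to_output_size()`: the new mapping is easy to exercise outside the kernel. The copy below is a user-space sketch of the post-change function. The only deliberate difference is that the kernel's `fallthrough` macro is replaced by a comment, since that macro is not available in plain C.

```c
#include <stdbool.h>

/* User-space copy of the post-change mapping from a PS field to output bits. */
static unsigned int ps_to_output_size(unsigned int ps, bool pa52bit)
{
	switch (ps) {
	case 0: return 32;
	case 1: return 36;
	case 2: return 40;
	case 3: return 42;
	case 4: return 44;
	case 5: return 48;
	case 6:
		if (pa52bit)
			return 52;
		/* fall through: without 52-bit PA support, treat as 48 bits */
	default:
		return 48;
	}
}
```

With this mapping, an encoding of 6 yields 52 bits only when `pa52bit` is true. Every other encoding above 5 stays capped at 48 bits.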
@@ -285,20 +290,36 @@ enum trans_regime {
 	TR_EL2,
 };

+struct s1_walk_info;
+
+struct s1_walk_context {
+	struct s1_walk_info *wi;
+	u64 table_ipa;
+	int level;
+};
+
+struct s1_walk_filter {
+	int (*fn)(struct s1_walk_context *, void *);
+	void *priv;
+};
+
 struct s1_walk_info {
+	struct s1_walk_filter *filter;
 	u64 baddr;
 	enum trans_regime regime;
 	unsigned int max_oa_bits;
 	unsigned int pgshift;
 	unsigned int txsz;
 	int sl;
+	u8 sh;
 	bool as_el0;
 	bool hpd;
 	bool e0poe;
 	bool poe;
 	bool pan;
 	bool be;
 	bool s2;
+	bool pa52bit;
 };

 struct s1_walk_result {
@@ -334,6 +355,8 @@ struct s1_walk_result {

 int __kvm_translate_va(struct kvm_vcpu *vcpu, struct s1_walk_info *wi,
 		       struct s1_walk_result *wr, u64 va);
+int __kvm_find_s1_desc_level(struct kvm_vcpu *vcpu, u64 va, u64 ipa,
+			     int *level);

 /* VNCR management */
 int kvm_vcpu_allocate_vncr_tlb(struct kvm_vcpu *vcpu);
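
Note on the walk filter added in this file: the header only declares the callback type, so how the walker uses it is not visible here. The sketch below shows one plausible shape under stated assumptions. The filter is invoked once per level with the table IPA, and a non-zero return stops the walk. `model_walk()`, `struct match_ipa` and `match_table_ipa()` are hypothetical, and the structures are reduced to the fields the example needs.

```c
#include <stddef.h>
#include <stdint.h>

struct s1_walk_info;

struct s1_walk_context {
	struct s1_walk_info *wi;
	uint64_t table_ipa;
	int level;
};

struct s1_walk_filter {
	int (*fn)(struct s1_walk_context *, void *);
	void *priv;
};

/* Reduced to the single field this sketch uses. */
struct s1_walk_info {
	struct s1_walk_filter *filter;
};

/*
 * Hypothetical walker: visits a precomputed list of table IPAs, one per
 * level, and calls the filter at each step. Assumption: a non-zero return
 * from the filter aborts the walk and is propagated to the caller.
 */
static int model_walk(struct s1_walk_info *wi, const uint64_t *tables,
		      int start_level, int nr_levels)
{
	for (int i = 0; i < nr_levels; i++) {
		struct s1_walk_context ctxt = {
			.wi = wi,
			.table_ipa = tables[i],
			.level = start_level + i,
		};

		if (wi->filter) {
			int ret = wi->filter->fn(&ctxt, wi->filter->priv);

			if (ret)
				return ret;
		}
	}

	return 0;
}

/* Hypothetical filter state: records the level whose table sits at 'ipa'. */
struct match_ipa {
	uint64_t ipa;
	int level;
};

static int match_table_ipa(struct s1_walk_context *ctxt, void *priv)
{
	struct match_ipa *m = priv;

	if (ctxt->table_ipa != m->ipa)
		return 0;

	m->level = ctxt->level;
	return 1;	/* stop the walk: the level has been found */
}
```

A filter of this kind is one way a helper in the spirit of `__kvm_find_s1_desc_level()` could recover the level of a descriptor from its IPA. Whether the kernel does it this way is not shown in this part of the diff.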

arch/arm64/include/asm/kvm_pkvm.h

Lines changed: 1 addition & 0 deletions
@@ -18,6 +18,7 @@

 int pkvm_init_host_vm(struct kvm *kvm);
 int pkvm_create_hyp_vm(struct kvm *kvm);
+bool pkvm_hyp_vm_is_created(struct kvm *kvm);
 void pkvm_destroy_hyp_vm(struct kvm *kvm);
 int pkvm_create_hyp_vcpu(struct kvm_vcpu *vcpu);

arch/arm64/include/asm/traps.h

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ int kasan_brk_handler(struct pt_regs *regs, unsigned long esr);
 int ubsan_brk_handler(struct pt_regs *regs, unsigned long esr);

 int early_brk64(unsigned long addr, unsigned long esr, struct pt_regs *regs);
+void dump_kernel_instr(unsigned long kaddr);

 /*
  * Move regs->pc to next instruction and do necessary setup before it

arch/arm64/include/asm/vncr_mapping.h

Lines changed: 2 additions & 0 deletions
@@ -94,6 +94,8 @@
 #define VNCR_PMSICR_EL1 0x838
 #define VNCR_PMSIRR_EL1 0x840
 #define VNCR_PMSLATFR_EL1 0x848
+#define VNCR_PMSNEVFR_EL1 0x850
+#define VNCR_PMSDSFR_EL1 0x858
 #define VNCR_TRFCR_EL1 0x880
 #define VNCR_MPAM1_EL1 0x900
 #define VNCR_MPAMHCR_EL2 0x930

arch/arm64/kernel/cpufeature.c

Lines changed: 15 additions & 0 deletions
@@ -2550,6 +2550,15 @@ test_has_mpam_hcr(const struct arm64_cpu_capabilities *entry, int scope)
 	return idr & MPAMIDR_EL1_HAS_HCR;
 }

+static bool
+test_has_gicv5_legacy(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	if (!this_cpu_has_cap(ARM64_HAS_GICV5_CPUIF))
+		return false;
+
+	return !!(read_sysreg_s(SYS_ICC_IDR0_EL1) & ICC_IDR0_EL1_GCIE_LEGACY);
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.capability = ARM64_ALWAYS_BOOT,
@@ -3167,6 +3176,12 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.matches = has_cpuid_feature,
 		ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, GCIE, IMP)
 	},
+	{
+		.desc = "GICv5 Legacy vCPU interface",
+		.type = ARM64_CPUCAP_EARLY_LOCAL_CPU_FEATURE,
+		.capability = ARM64_HAS_GICV5_LEGACY,
+		.matches = test_has_gicv5_legacy,
+	},
 	{},
 };

arch/arm64/kernel/image-vars.h

Lines changed: 3 additions & 0 deletions
@@ -105,6 +105,9 @@ KVM_NVHE_ALIAS(__hyp_stub_vectors);
 KVM_NVHE_ALIAS(vgic_v2_cpuif_trap);
 KVM_NVHE_ALIAS(vgic_v3_cpuif_trap);

+/* Static key indicating whether GICv3 has GICv2 compatibility */
+KVM_NVHE_ALIAS(vgic_v3_has_v2_compat);
+
 /* Static key which is set if CNTVOFF_EL2 is unusable */
 KVM_NVHE_ALIAS(broken_cntvoff_key);
