Commit 51d90a1
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
 "ARM:

   - Support for userspace handling of synchronous external aborts
     (SEAs), allowing the VMM to potentially handle the abort in a
     non-fatal manner

   - Large rework of the VGIC's list register handling with the goal of
     supporting more active/pending IRQs than available list registers
     in hardware. In addition, the VGIC now supports EOImode==1 style
     deactivations for IRQs which may occur on a separate vCPU than the
     one that acked the IRQ

   - Support for FEAT_XNX (user / privileged execute permissions) and
     FEAT_HAF (hardware update to the Access Flag) in the software page
     table walkers and shadow MMU

   - Allow page table destruction to reschedule, fixing long
     need_resched latencies observed when destroying a large VM

   - Minor fixes to KVM and selftests

  LoongArch:

   - Get VM PMU capability from HW GCFG register

   - Add AVEC basic support

   - Use 64-bit register definition for EIOINTC

   - Add KVM timer test cases for tools/selftests

  RISC-V:

   - SBI message passing (MPXY) support for KVM guest

   - Give a new, more specific error subcode for the case when
     in-kernel AIA virtualization fails to allocate IMSIC VS-file

   - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually
     in small chunks

   - Fix guest page fault within HLV* instructions

   - Flush VS-stage TLB after VCPU migration for Andes cores

  s390:

   - Always allocate ESCA (Extended System Control Area), instead of
     starting with the basic SCA and converting to ESCA with the
     addition of the 65th vCPU. The price is an increased number of
     exits (and worse performance) on z10 and earlier processors; ESCA
     was introduced by z114/z196 in 2010

   - VIRT_XFER_TO_GUEST_WORK support

   - Operation exception forwarding support

   - Cleanups

  x86:

   - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO
     SPTE caching is disabled, as there can't be any relevant SPTEs to
     zap

   - Relocate a misplaced export

   - Fix an async #PF bug where KVM would clear the completion queue
     when the guest transitioned in and out of paging mode, e.g. when
     handling an SMI and then returning to paged mode via RSM

   - Leave KVM's user-return notifier registered even when disabling
     virtualization, as long as kvm.ko is loaded. On reboot/shutdown,
     keeping the notifier registered is ok; the kernel does not use the
     MSRs and the callback will run cleanly and restore host MSRs if
     the CPU manages to return to userspace before the system goes down

   - Use the checked version of {get,put}_user()

   - Fix a long-lurking bug where KVM's lack of catch-up logic for
     periodic APIC timers can result in a hard lockup in the host

   - Revert the periodic kvmclock sync logic now that KVM doesn't use a
     clocksource that's subject to NTP corrections

   - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the
     latter behind CONFIG_CPU_MITIGATIONS

   - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast
     path; the only reason they were handled in the fast path was to
     paper over a bug in the core #MC code, and that has long since
     been fixed

   - Add emulator support for AVX MOV instructions, to play nice with
     emulated devices whose guest drivers like to access PCI BARs with
     large multi-byte instructions

  x86 (AMD):

   - Fix a few missing "VMCB dirty" bugs

   - Fix the worst of KVM's lack of EFER.LMSLE emulation

   - Add AVIC support for addressing 4k vCPUs in x2AVIC mode

   - Fix incorrect handling of selective CR0 writes when checking
     intercepts during emulation of L2 instructions

   - Fix a currently-benign bug where KVM would clobber
     SPEC_CTRL[63:32] on VMRUN and #VMEXIT

   - Fix a bug where KVM would corrupt the guest code stream when
     re-injecting a soft interrupt if the guest patched the underlying
     code after the VM-Exit, e.g. when Linux patches code with a
     temporary INT3

   - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits
     to userspace, and extend KVM "support" to all policy bits that
     don't require any actual support from KVM

  x86 (Intel):

   - Use the root role from kvm_mmu_page to construct EPTPs instead of
     the current vCPU state, partly as worthwhile cleanup, but mostly
     to pave the way for tracking per-root TLB flushes, and elide EPT
     flushes on pCPU migration if the root is clean from a previous
     flush

   - Add a few missing nested consistency checks

   - Rip out support for doing "early" consistency checks via hardware
     as the functionality hasn't been used in years and is no longer
     useful in general; replace it with an off-by-default module param
     to WARN if hardware fails a check that KVM does not perform

   - Fix a currently-benign bug where KVM would drop the guest's
     SPEC_CTRL[63:32] on VM-Enter

   - Misc cleanups

   - Overhaul the TDX code to address systemic races where KVM (acting
     on behalf of userspace) could inadvertently trigger lock
     contention in the TDX-Module; KVM was either working around these
     in weird, ugly ways, or was simply oblivious to them (though even
     Yan's devilish selftests could only break individual VMs, not the
     host kernel)

   - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a
     TDX vCPU, if creating said vCPU failed partway through

   - Fix a few sparse warnings (bad annotation, 0 != NULL)

   - Use struct_size() to simplify copying TDX capabilities to
     userspace

   - Fix a bug where TDX would effectively corrupt user-return MSR
     values if the TDX Module rejects VP.ENTER and thus doesn't clobber
     host MSRs as expected

  Selftests:

   - Fix a math goof in mmu_stress_test when running on a single-CPU
     system/VM

   - Forcefully override ARCH from x86_64 to x86 to play nice with
     specifying ARCH=x86_64 on the command line

   - Extend a bunch of nested VMX tests to validate nested SVM as well

   - Add support for LA57 in the core VM_MODE_xxx macro, and add a test
     to verify KVM can save/restore nested VMX state when L1 is using
     5-level paging, but L2 is not

   - Clean up the guest paging code in anticipation of sharing the core
     logic for nested EPT and nested NPT

  guest_memfd:

   - Add NUMA mempolicy support for guest_memfd, and clean up a variety
     of rough edges in guest_memfd along the way

   - Define a CLASS to automatically handle get+put when grabbing a
     guest_memfd from a memslot to make it harder to leak references

   - Enhance KVM selftests to make it easier to develop and debug
     selftests like those added for guest_memfd NUMA support, e.g.
     where test and/or KVM bugs often result in hard-to-debug SIGBUS
     errors

   - Misc cleanups

  Generic:

   - Use the recently-added WQ_PERCPU when creating the per-CPU
     workqueue for irqfd cleanup

   - Fix a goof in the dirty ring documentation

   - Fix choice of target for directed yield across different calls to
     kvm_vcpu_on_spin(); the function was always starting from the
     first vCPU instead of continuing the round-robin search"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits)
  KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
  KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
  KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
  KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
  KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
  KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
  KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
  KVM: arm64: selftests: Add test for AT emulation
  KVM: arm64: nv: Expose hardware access flag management to NV guests
  KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
  KVM: arm64: Implement HW access flag management in stage-1 SW PTW
  KVM: arm64: Propagate PTW errors up to AT emulation
  KVM: arm64: Add helper for swapping guest descriptor
  KVM: arm64: nv: Use pgtable definitions in stage-2 walk
  KVM: arm64: Handle endianness in read helper for emulated PTW
  KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
  KVM: arm64: Call helper for reading descriptors directly
  KVM: arm64: nv: Advertise support for FEAT_XNX
  KVM: arm64: Teach ptdump about FEAT_XNX permissions
  KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions
  ...
2 parents 399ead3 + e0c26d4 commit 51d90a1

191 files changed

Lines changed: 6293 additions & 2625 deletions

Documentation/virt/kvm/api.rst

Lines changed: 66 additions & 4 deletions

@@ -7286,6 +7286,41 @@ exit, even without calls to ``KVM_ENABLE_CAP`` or similar. In this case,
 it will enter with output fields already valid; in the common case, the
 ``unknown.ret`` field of the union will be ``TDVMCALL_STATUS_SUBFUNC_UNSUPPORTED``.
 Userspace need not do anything if it does not wish to support a TDVMCALL.
+
+::
+
+		/* KVM_EXIT_ARM_SEA */
+		struct {
+#define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID	(1ULL << 0)
+			__u64 flags;
+			__u64 esr;
+			__u64 gva;
+			__u64 gpa;
+		} arm_sea;
+
+Used on arm64 systems. When the VM capability ``KVM_CAP_ARM_SEA_TO_USER`` is
+enabled, KVM exits to userspace if a guest access causes a synchronous
+external abort (SEA) and the host APEI fails to handle the SEA.
+
+``esr`` is set to a sanitized value of ESR_EL2 from the exception taken to KVM,
+consisting of the following fields:
+
+- ``ESR_EL2.EC``
+- ``ESR_EL2.IL``
+- ``ESR_EL2.FnV``
+- ``ESR_EL2.EA``
+- ``ESR_EL2.CM``
+- ``ESR_EL2.WNR``
+- ``ESR_EL2.FSC``
+- ``ESR_EL2.SET`` (when FEAT_RAS is implemented for the VM)
+
+``gva`` is set to the value of FAR_EL2 from the exception taken to KVM when
+``ESR_EL2.FnV == 0``. Otherwise, the value of ``gva`` is unknown.
+
+``gpa`` is set to the faulting IPA from the exception taken to KVM when
+the ``KVM_EXIT_ARM_SEA_FLAG_GPA_VALID`` flag is set. Otherwise, the value of
+``gpa`` is unknown.
+
 ::

		/* Fix the size of the union. */
@@ -7820,7 +7855,7 @@ where 0xff represents CPUs 0-7 in cluster 0.
 :Architectures: s390
 :Parameters: none
 
-With this capability enabled, all illegal instructions 0x0000 (2 bytes) will
+With this capability enabled, the illegal instruction 0x0000 (2 bytes) will
 be intercepted and forwarded to user space. User space can use this
 mechanism e.g. to realize 2-byte software breakpoints. The kernel will
 not inject an operating exception for these instructions, user space has
@@ -8028,7 +8063,7 @@ will be initialized to 1 when created. This also improves performance because
 dirty logging can be enabled gradually in small chunks on the first call
 to KVM_CLEAR_DIRTY_LOG. KVM_DIRTY_LOG_INITIALLY_SET depends on
 KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE (it is also only available on
-x86 and arm64 for now).
+x86, arm64 and riscv for now).
 
 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2 was previously available under the name
 KVM_CAP_MANUAL_DIRTY_LOG_PROTECT, but the implementation had bugs that make
@@ -8524,7 +8559,7 @@ Therefore, the ioctl must be called *before* reading the content of
 the dirty pages.
 
 The dirty ring can get full. When it happens, the KVM_RUN of the
-vcpu will return with exit reason KVM_EXIT_DIRTY_LOG_FULL.
+vcpu will return with exit reason KVM_EXIT_DIRTY_RING_FULL.
 
 The dirty ring interface has a major difference comparing to the
 KVM_GET_DIRTY_LOG interface in that, when reading the dirty ring from
@@ -8692,7 +8727,7 @@ given VM.
 When this capability is enabled, KVM resets the VCPU when setting
 MP_STATE_INIT_RECEIVED through IOCTL. The original MP_STATE is preserved.
 
-7.43 KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED
+7.44 KVM_CAP_ARM_CACHEABLE_PFNMAP_SUPPORTED
 -------------------------------------------
 
 :Architectures: arm64
@@ -8703,6 +8738,33 @@ This capability indicate to the userspace whether a PFNMAP memory region
 can be safely mapped as cacheable. This relies on the presence of
 force write back (FWB) feature support on the hardware.
 
+7.45 KVM_CAP_ARM_SEA_TO_USER
+----------------------------
+
+:Architecture: arm64
+:Target: VM
+:Parameters: none
+:Returns: 0 on success, -EINVAL if unsupported.
+
+When this capability is enabled, KVM may exit to userspace for SEAs taken to
+EL2 resulting from a guest access. See ``KVM_EXIT_ARM_SEA`` for more
+information.
+
+7.46 KVM_CAP_S390_USER_OPEREXEC
+-------------------------------
+
+:Architectures: s390
+:Parameters: none
+
+When this capability is enabled KVM forwards all operation exceptions
+that it doesn't handle itself to user space. This also includes the
+0x0000 instructions managed by KVM_CAP_S390_USER_INSTR0. This is
+helpful if user space wants to emulate instructions which are not
+(yet) implemented in hardware.
+
+This capability can be enabled dynamically even if VCPUs were already
+created and are running.
+
 8. Other capabilities.
 ======================

Documentation/virt/kvm/x86/errata.rst

Lines changed: 8 additions & 1 deletion

@@ -48,7 +48,14 @@ versus "has_error_code", i.e. KVM's ABI follows AMD behavior.
 Nested virtualization features
 ------------------------------
 
-TBD
+On AMD CPUs, when GIF is cleared, #DB exceptions or traps due to a breakpoint
+register match are ignored and discarded by the CPU. The CPU relies on the VMM
+to fully virtualize this behavior, even when vGIF is enabled for the guest
+(i.e. vGIF=0 does not cause the CPU to drop #DBs when the guest is running).
+KVM does not virtualize this behavior as the complexity is unjustified given
+the rarity of the use case. One way to handle this would be for KVM to
+intercept the #DB, temporarily disable the breakpoint, single-step over the
+instruction, then re-enable the breakpoint.
 
 x2APIC
 ------

arch/arm64/include/asm/kvm_arm.h

Lines changed: 1 addition & 0 deletions

@@ -111,6 +111,7 @@
 #define TCR_EL2_DS		(1UL << 32)
 #define TCR_EL2_RES1		((1U << 31) | (1 << 23))
 #define TCR_EL2_HPD		(1 << 24)
+#define TCR_EL2_HA		(1 << 21)
 #define TCR_EL2_TBI		(1 << 20)
 #define TCR_EL2_PS_SHIFT	16
 #define TCR_EL2_PS_MASK		(7 << TCR_EL2_PS_SHIFT)

arch/arm64/include/asm/kvm_asm.h

Lines changed: 4 additions & 4 deletions

@@ -79,7 +79,7 @@ enum __kvm_host_smccc_func {
	__KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_range,
	__KVM_HOST_SMCCC_FUNC___kvm_flush_cpu_context,
	__KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_vmcr_aprs,
+	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs,
	__KVM_HOST_SMCCC_FUNC___vgic_v3_restore_vmcr_aprs,
	__KVM_HOST_SMCCC_FUNC___pkvm_reserve_vm,
	__KVM_HOST_SMCCC_FUNC___pkvm_unreserve_vm,
@@ -246,9 +246,9 @@ extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu);
 extern int __kvm_tlbi_s1e2(struct kvm_s2_mmu *mmu, u64 va, u64 sys_encoding);
 
 extern void __kvm_timer_set_cntvoff(u64 cntvoff);
-extern void __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr);
-extern void __kvm_at_s1e2(struct kvm_vcpu *vcpu, u32 op, u64 vaddr);
-extern void __kvm_at_s12(struct kvm_vcpu *vcpu, u32 op, u64 vaddr);
+extern int __kvm_at_s1e01(struct kvm_vcpu *vcpu, u32 op, u64 vaddr);
+extern int __kvm_at_s1e2(struct kvm_vcpu *vcpu, u32 op, u64 vaddr);
+extern int __kvm_at_s12(struct kvm_vcpu *vcpu, u32 op, u64 vaddr);
 
 extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);

arch/arm64/include/asm/kvm_host.h

Lines changed: 3 additions & 0 deletions

@@ -54,6 +54,7 @@
 #define KVM_REQ_NESTED_S2_UNMAP		KVM_ARCH_REQ(8)
 #define KVM_REQ_GUEST_HYP_IRQ_PENDING	KVM_ARCH_REQ(9)
 #define KVM_REQ_MAP_L1_VNCR_EL2		KVM_ARCH_REQ(10)
+#define KVM_REQ_VGIC_PROCESS_UPDATE	KVM_ARCH_REQ(11)
 
 #define KVM_DIRTY_LOG_MANUAL_CAPS	(KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
					 KVM_DIRTY_LOG_INITIALLY_SET)
@@ -350,6 +351,8 @@ struct kvm_arch {
 #define KVM_ARCH_FLAG_GUEST_HAS_SVE			9
	/* MIDR_EL1, REVIDR_EL1, and AIDR_EL1 are writable from userspace */
 #define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS		10
+	/* Unhandled SEAs are taken to userspace */
+#define KVM_ARCH_FLAG_EXIT_SEA				11
	unsigned long flags;
 
	/* VM-wide vCPU feature set */

arch/arm64/include/asm/kvm_hyp.h

Lines changed: 2 additions & 1 deletion

@@ -77,12 +77,13 @@ DECLARE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
 int __vgic_v2_perform_cpuif_access(struct kvm_vcpu *vcpu);
 
 u64 __gic_v3_get_lr(unsigned int lr);
+void __gic_v3_set_lr(u64 val, int lr);
 
 void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_activate_traps(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_deactivate_traps(struct vgic_v3_cpu_if *cpu_if);
-void __vgic_v3_save_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if);
+void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_restore_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if);
 int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);

arch/arm64/include/asm/kvm_nested.h

Lines changed: 38 additions & 2 deletions

@@ -120,9 +120,42 @@ static inline bool kvm_s2_trans_writable(struct kvm_s2_trans *trans)
	return trans->writable;
 }
 
-static inline bool kvm_s2_trans_executable(struct kvm_s2_trans *trans)
+static inline bool kvm_has_xnx(struct kvm *kvm)
 {
-	return !(trans->desc & BIT(54));
+	return cpus_have_final_cap(ARM64_HAS_XNX) &&
+	       kvm_has_feat(kvm, ID_AA64MMFR1_EL1, XNX, IMP);
+}
+
+static inline bool kvm_s2_trans_exec_el0(struct kvm *kvm, struct kvm_s2_trans *trans)
+{
+	u8 xn = FIELD_GET(KVM_PTE_LEAF_ATTR_HI_S2_XN, trans->desc);
+
+	if (!kvm_has_xnx(kvm))
+		xn &= FIELD_PREP(KVM_PTE_LEAF_ATTR_HI_S2_XN, 0b10);
+
+	switch (xn) {
+	case 0b00:
+	case 0b01:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static inline bool kvm_s2_trans_exec_el1(struct kvm *kvm, struct kvm_s2_trans *trans)
+{
+	u8 xn = FIELD_GET(KVM_PTE_LEAF_ATTR_HI_S2_XN, trans->desc);
+
+	if (!kvm_has_xnx(kvm))
+		xn &= FIELD_PREP(KVM_PTE_LEAF_ATTR_HI_S2_XN, 0b10);
+
+	switch (xn) {
+	case 0b00:
+	case 0b11:
+		return true;
+	default:
+		return false;
+	}
 }
 
 extern int kvm_walk_nested_s2(struct kvm_vcpu *vcpu, phys_addr_t gipa,
@@ -320,6 +353,7 @@ struct s1_walk_info {
	bool	be;
	bool	s2;
	bool	pa52bit;
+	bool	ha;
 };
 
 struct s1_walk_result {
@@ -370,4 +404,6 @@ void kvm_handle_s1e2_tlbi(struct kvm_vcpu *vcpu, u32 inst, u64 val);
		(FIX_VNCR - __c);				\
	})
 
+int __kvm_at_swap_desc(struct kvm *kvm, gpa_t ipa, u64 old, u64 new);
+
 #endif /* __ARM64_KVM_NESTED_H */

arch/arm64/include/asm/kvm_pgtable.h

Lines changed: 42 additions & 7 deletions

@@ -89,7 +89,7 @@ typedef u64 kvm_pte_t;
 
 #define KVM_PTE_LEAF_ATTR_HI_S1_XN	BIT(54)
 
-#define KVM_PTE_LEAF_ATTR_HI_S2_XN	BIT(54)
+#define KVM_PTE_LEAF_ATTR_HI_S2_XN	GENMASK(54, 53)
 
 #define KVM_PTE_LEAF_ATTR_HI_S1_GP	BIT(50)
 
@@ -240,7 +240,9 @@ enum kvm_pgtable_stage2_flags {
 
 /**
  * enum kvm_pgtable_prot - Page-table permissions and attributes.
- * @KVM_PGTABLE_PROT_X:		Execute permission.
+ * @KVM_PGTABLE_PROT_UX:	Unprivileged execute permission.
+ * @KVM_PGTABLE_PROT_PX:	Privileged execute permission.
+ * @KVM_PGTABLE_PROT_X:		Privileged and unprivileged execute permission.
  * @KVM_PGTABLE_PROT_W:		Write permission.
  * @KVM_PGTABLE_PROT_R:		Read permission.
  * @KVM_PGTABLE_PROT_DEVICE:	Device attributes.
@@ -251,12 +253,15 @@ enum kvm_pgtable_stage2_flags {
  * @KVM_PGTABLE_PROT_SW3:	Software bit 3.
  */
 enum kvm_pgtable_prot {
-	KVM_PGTABLE_PROT_X			= BIT(0),
-	KVM_PGTABLE_PROT_W			= BIT(1),
-	KVM_PGTABLE_PROT_R			= BIT(2),
+	KVM_PGTABLE_PROT_PX			= BIT(0),
+	KVM_PGTABLE_PROT_UX			= BIT(1),
+	KVM_PGTABLE_PROT_X			= KVM_PGTABLE_PROT_PX |
+						  KVM_PGTABLE_PROT_UX,
+	KVM_PGTABLE_PROT_W			= BIT(2),
+	KVM_PGTABLE_PROT_R			= BIT(3),
 
-	KVM_PGTABLE_PROT_DEVICE			= BIT(3),
-	KVM_PGTABLE_PROT_NORMAL_NC		= BIT(4),
+	KVM_PGTABLE_PROT_DEVICE			= BIT(4),
+	KVM_PGTABLE_PROT_NORMAL_NC		= BIT(5),
 
	KVM_PGTABLE_PROT_SW0			= BIT(55),
	KVM_PGTABLE_PROT_SW1			= BIT(56),
@@ -355,6 +360,11 @@ static inline kvm_pte_t *kvm_dereference_pteref(struct kvm_pgtable_walker *walker
	return pteref;
 }
 
+static inline kvm_pte_t *kvm_dereference_pteref_raw(kvm_pteref_t pteref)
+{
+	return pteref;
+}
+
 static inline int kvm_pgtable_walk_begin(struct kvm_pgtable_walker *walker)
 {
	/*
@@ -384,6 +394,11 @@ static inline kvm_pte_t *kvm_dereference_pteref(struct kvm_pgtable_walker *walker
	return rcu_dereference_check(pteref, !(walker->flags & KVM_PGTABLE_WALK_SHARED));
 }
 
+static inline kvm_pte_t *kvm_dereference_pteref_raw(kvm_pteref_t pteref)
+{
+	return rcu_dereference_raw(pteref);
+}
+
 static inline int kvm_pgtable_walk_begin(struct kvm_pgtable_walker *walker)
 {
	if (walker->flags & KVM_PGTABLE_WALK_SHARED)
@@ -551,6 +566,26 @@ static inline int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2
  */
 void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
 
+/**
+ * kvm_pgtable_stage2_destroy_range() - Destroy the unlinked range of addresses.
+ * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
+ * @addr:	Intermediate physical address at which to place the mapping.
+ * @size:	Size of the mapping.
+ *
+ * The page-table is assumed to be unreachable by any hardware walkers prior
+ * to freeing and therefore no TLB invalidation is performed.
+ */
+void kvm_pgtable_stage2_destroy_range(struct kvm_pgtable *pgt,
+				      u64 addr, u64 size);
+
+/**
+ * kvm_pgtable_stage2_destroy_pgd() - Destroy the PGD of guest stage-2 page-table.
+ * @pgt:	Page-table structure initialised by kvm_pgtable_stage2_init*().
+ *
+ * It is assumed that the rest of the page-table is freed before this operation.
+ */
+void kvm_pgtable_stage2_destroy_pgd(struct kvm_pgtable *pgt);
+
 /**
  * kvm_pgtable_stage2_free_unlinked() - Free an unlinked stage-2 paging structure.
  * @mm_ops:	Memory management callbacks.

arch/arm64/include/asm/kvm_pkvm.h

Lines changed: 3 additions & 1 deletion

@@ -180,7 +180,9 @@ struct pkvm_mapping {
 
 int pkvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
			     struct kvm_pgtable_mm_ops *mm_ops);
-void pkvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
+void pkvm_pgtable_stage2_destroy_range(struct kvm_pgtable *pgt,
+				       u64 addr, u64 size);
+void pkvm_pgtable_stage2_destroy_pgd(struct kvm_pgtable *pgt);
 int pkvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size, u64 phys,
			    enum kvm_pgtable_prot prot, void *mc,
			    enum kvm_pgtable_walk_flags flags);

arch/arm64/include/asm/virt.h

Lines changed: 6 additions & 1 deletion

@@ -40,8 +40,13 @@
  */
 #define HVC_FINALISE_EL2	3
 
+/*
+ * HVC_GET_ICH_VTR_EL2 - Retrieve the ICH_VTR_EL2 value
+ */
+#define HVC_GET_ICH_VTR_EL2	4
+
 /* Max number of HYP stub hypercalls */
-#define HVC_STUB_HCALL_NR	4
+#define HVC_STUB_HCALL_NR	5
 
 /* Error returned when an invalid stub number is passed into x0 */
 #define HVC_STUB_ERR		0xbadca11
