Skip to content

Commit b47b93c

Browse files
committed
KVM: x86: Disallow setting CPUID and/or feature MSRs if L2 is active
Extend KVM's restriction on CPUID and feature MSR changes to disallow updates while L2 is active in addition to rejecting updates after the vCPU has run at least once. Like post-run vCPU model updates, attempting to react to model changes while L2 is active is practically infeasible, e.g. KVM would need to do _something_ in response to impossible situations where userspace has a removed a feature that was consumed as parted of nested VM-Enter. In practice, disallowing vCPU model changes while L2 is active is largely uninteresting, as the only way for L2 to be active without the vCPU having run at least once is if userspace stuffed state via KVM_SET_NESTED_STATE. And because KVM_SET_NESTED_STATE can't put the vCPU into L2 without userspace first defining the vCPU model, e.g. to enable SVM/VMX, modifying the vCPU model while L2 is active would require deliberately setting the vCPU model, then loading nested state, and then changing the model. I.e. no sane VMM should run afoul of the new restriction, and any VMM that does encounter problems has likely been running a broken setup for a long time. Cc: Yosry Ahmed <yosry.ahmed@linux.dev> Cc: Kevin Cheng <chengkev@google.com> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251230205641.4092235-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
1 parent 9ace475 commit b47b93c

5 files changed

Lines changed: 34 additions & 23 deletions

File tree

arch/x86/kvm/cpuid.c

Lines changed: 11 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -534,17 +534,20 @@ static int kvm_set_cpuid(struct kvm_vcpu *vcpu, struct kvm_cpuid_entry2 *e2,
534534
BUILD_BUG_ON(sizeof(vcpu_caps) != sizeof(vcpu->arch.cpu_caps));
535535

536536
/*
537-
* KVM does not correctly handle changing guest CPUID after KVM_RUN, as
538-
* MAXPHYADDR, GBPAGES support, AMD reserved bit behavior, etc.. aren't
539-
* tracked in kvm_mmu_page_role. As a result, KVM may miss guest page
540-
* faults due to reusing SPs/SPTEs. In practice no sane VMM mucks with
541-
* the core vCPU model on the fly. It would've been better to forbid any
542-
* KVM_SET_CPUID{,2} calls after KVM_RUN altogether but unfortunately
543-
* some VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and do
537+
* KVM does not correctly handle changing guest CPUID after KVM_RUN or
538+
* while L2 is active, as MAXPHYADDR, GBPAGES support, AMD reserved bit
539+
* behavior, etc. aren't tracked in kvm_mmu_page_role, and L2 state
540+
* can't be adjusted (without breaking L2 in some way). As a result,
541+
* KVM may reuse SPs/SPTEs and/or run L2 with bad/misconfigured state.
542+
*
543+
* In practice, no sane VMM mucks with the core vCPU model on the fly.
544+
* It would've been better to forbid any KVM_SET_CPUID{,2} calls after
545+
* KVM_RUN or KVM_SET_NESTED_STATE altogether, but unfortunately some
546+
* VMMs (e.g. QEMU) reuse vCPU fds for CPU hotplug/unplug and do
544547
* KVM_SET_CPUID{,2} again. To support this legacy behavior, check
545548
* whether the supplied CPUID data is equal to what's already set.
546549
*/
547-
if (kvm_vcpu_has_run(vcpu)) {
550+
if (!kvm_can_set_cpuid_and_feature_msrs(vcpu)) {
548551
r = kvm_cpuid_check_equal(vcpu, e2, nent);
549552
if (r)
550553
goto err;

arch/x86/kvm/mmu/mmu.c

Lines changed: 1 addition & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -6031,11 +6031,7 @@ void kvm_mmu_after_set_cpuid(struct kvm_vcpu *vcpu)
60316031
vcpu->arch.nested_mmu.cpu_role.ext.valid = 0;
60326032
kvm_mmu_reset_context(vcpu);
60336033

6034-
/*
6035-
* Changing guest CPUID after KVM_RUN is forbidden, see the comment in
6036-
* kvm_arch_vcpu_ioctl().
6037-
*/
6038-
KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm);
6034+
KVM_BUG_ON(!kvm_can_set_cpuid_and_feature_msrs(vcpu), vcpu->kvm);
60396035
}
60406036

60416037
void kvm_mmu_reset_context(struct kvm_vcpu *vcpu)

arch/x86/kvm/pmu.c

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -853,7 +853,7 @@ void kvm_pmu_refresh(struct kvm_vcpu *vcpu)
853853
{
854854
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
855855

856-
if (KVM_BUG_ON(kvm_vcpu_has_run(vcpu), vcpu->kvm))
856+
if (KVM_BUG_ON(!kvm_can_set_cpuid_and_feature_msrs(vcpu), vcpu->kvm))
857857
return;
858858

859859
/*

arch/x86/kvm/x86.c

Lines changed: 8 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -2314,13 +2314,14 @@ static int do_set_msr(struct kvm_vcpu *vcpu, unsigned index, u64 *data)
23142314
u64 val;
23152315

23162316
/*
2317-
* Disallow writes to immutable feature MSRs after KVM_RUN. KVM does
2318-
* not support modifying the guest vCPU model on the fly, e.g. changing
2319-
* the nVMX capabilities while L2 is running is nonsensical. Allow
2320-
* writes of the same value, e.g. to allow userspace to blindly stuff
2321-
* all MSRs when emulating RESET.
2322-
*/
2323-
if (kvm_vcpu_has_run(vcpu) && kvm_is_immutable_feature_msr(index) &&
2317+
* Reject writes to immutable feature MSRs if the vCPU model is frozen,
2318+
* as KVM doesn't support modifying the guest vCPU model on the fly,
2319+
* e.g. changing the VMX capabilities MSRs while L2 is active is
2320+
* nonsensical. Allow writes of the same value, e.g. so that userspace
2321+
* can blindly stuff all MSRs when emulating RESET.
2322+
*/
2323+
if (!kvm_can_set_cpuid_and_feature_msrs(vcpu) &&
2324+
kvm_is_immutable_feature_msr(index) &&
23242325
(do_get_msr(vcpu, index, &val) || *data != val))
23252326
return -EINVAL;
23262327

arch/x86/kvm/x86.h

Lines changed: 13 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -172,9 +172,20 @@ static inline void kvm_nested_vmexit_handle_ibrs(struct kvm_vcpu *vcpu)
172172
indirect_branch_prediction_barrier();
173173
}
174174

175-
static inline bool kvm_vcpu_has_run(struct kvm_vcpu *vcpu)
175+
/*
176+
* Disallow modifying CPUID and feature MSRs, which affect the core virtual CPU
177+
* model exposed to the guest and virtualized by KVM, if the vCPU has already
178+
* run or is in guest mode (L2). In both cases, KVM has already consumed the
179+
* current virtual CPU model, and doesn't support "unwinding" to react to the
180+
* new model.
181+
*
182+
* Note, the only way is_guest_mode() can be true with 'last_vmentry_cpu == -1'
183+
* is if userspace sets CPUID and feature MSRs (to enable VMX/SVM), then sets
184+
* nested state, and then attempts to set CPUID and/or feature MSRs *again*.
185+
*/
186+
static inline bool kvm_can_set_cpuid_and_feature_msrs(struct kvm_vcpu *vcpu)
176187
{
177-
return vcpu->arch.last_vmentry_cpu != -1;
188+
return vcpu->arch.last_vmentry_cpu == -1 && !is_guest_mode(vcpu);
178189
}
179190

180191
static inline void kvm_set_mp_state(struct kvm_vcpu *vcpu, int mp_state)

0 commit comments

Comments
 (0)