
Commit 1e17a6f

sean-jc authored and bonzini committed
KVM: x86: Don't snapshot pending INIT/SIPI prior to checking nested events
Don't snapshot pending INIT/SIPI events prior to checking nested events; architecturally there's nothing wrong with KVM processing (dropping) a SIPI that is received immediately after synthesizing a VM-Exit. Taking and consuming the snapshot makes the flow far more subtle than it needs to be, e.g. nVMX consumes/clears events that trigger VM-Exit (INIT/SIPI), and so at first glance it appears that KVM is double-dipping on pending INITs and SIPIs. But that's not the case, because INIT is blocked unconditionally in VMX root mode and the CPU cannot be in wait-for-SIPI after VM-Exit, i.e. the paths that truly consume the snapshot are unreachable if apic->pending_events is modified by kvm_check_nested_events().

nSVM is a similar story, as GIF is cleared by the CPU on VM-Exit; INIT is blocked regardless of whether or not it was pending prior to VM-Exit.

Drop the snapshot logic so that a future fix doesn't create weirdness when kvm_vcpu_running()'s call to kvm_check_nested_events() is moved to vcpu_block(). In that case, kvm_check_nested_events() will be called immediately before kvm_apic_accept_events(), which raises the obvious question of why that change doesn't break the snapshot logic.

Note, there is a subtle functional change. Previously, KVM would clear pending SIPIs if and only if a SIPI was pending prior to VM-Exit, whereas now KVM clears pending SIPIs unconditionally if INIT+SIPI are blocked. The latter is architecturally allowed, as SIPI is ignored if the CPU is not in wait-for-SIPI mode (arguably, KVM should be even more aggressive in dropping SIPIs). It is software's responsibility to ensure the SIPI is delivered, i.e. software shouldn't be firing INIT-SIPI at a CPU until it knows with 100% certainty that the target CPU isn't in VMX root mode. Furthermore, the existing code is extra weird, as SIPIs that arrive after VM-Exit _are_ dropped if there also happened to be a pending SIPI before VM-Exit.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220921003201.1441511-10-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
1 parent ea2f00c commit 1e17a6f

1 file changed

Lines changed: 10 additions & 26 deletions

File tree

arch/x86/kvm/lapic.c

@@ -3025,56 +3025,40 @@ int kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	u8 sipi_vector;
 	int r;
-	unsigned long pe;
 
-	if (!lapic_in_kernel(vcpu))
-		return 0;
-
-	/*
-	 * Read pending events before calling the check_events
-	 * callback.
-	 */
-	pe = smp_load_acquire(&apic->pending_events);
-	if (!pe)
+	if (!kvm_apic_has_pending_init_or_sipi(vcpu))
 		return 0;
 
 	if (is_guest_mode(vcpu)) {
 		r = kvm_check_nested_events(vcpu);
 		if (r < 0)
 			return r == -EBUSY ? 0 : r;
 		/*
-		 * If an event has happened and caused a vmexit,
-		 * we know INITs are latched and therefore
-		 * we will not incorrectly deliver an APIC
-		 * event instead of a vmexit.
+		 * Continue processing INIT/SIPI even if a nested VM-Exit
+		 * occurred, e.g. pending SIPIs should be dropped if INIT+SIPI
+		 * are blocked as a result of transitioning to VMX root mode.
 		 */
 	}
 
 	/*
-	 * INITs are blocked while CPU is in specific states
-	 * (SMM, VMX root mode, SVM with GIF=0).
-	 * Because a CPU cannot be in these states immediately
-	 * after it has processed an INIT signal (and thus in
-	 * KVM_MP_STATE_INIT_RECEIVED state), just eat SIPIs
-	 * and leave the INIT pending.
+	 * INITs are blocked while CPU is in specific states (SMM, VMX root
+	 * mode, SVM with GIF=0), while SIPIs are dropped if the CPU isn't in
+	 * wait-for-SIPI (WFS).
 	 */
 	if (!kvm_apic_init_sipi_allowed(vcpu)) {
 		WARN_ON_ONCE(vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED);
-		if (test_bit(KVM_APIC_SIPI, &pe))
-			clear_bit(KVM_APIC_SIPI, &apic->pending_events);
+		clear_bit(KVM_APIC_SIPI, &apic->pending_events);
 		return 0;
 	}
 
-	if (test_bit(KVM_APIC_INIT, &pe)) {
-		clear_bit(KVM_APIC_INIT, &apic->pending_events);
+	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
 		kvm_vcpu_reset(vcpu, true);
 		if (kvm_vcpu_is_bsp(apic->vcpu))
 			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		else
 			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
 	}
-	if (test_bit(KVM_APIC_SIPI, &pe)) {
-		clear_bit(KVM_APIC_SIPI, &apic->pending_events);
+	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events)) {
 		if (vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
 			/* evaluate pending_events before reading the vector */
 			smp_rmb();
