
Commit 2ed90cb

dmatlack authored and avpatel committed
KVM: RISC-V: Retry fault if vma_lookup() results become invalid
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can detect if the results of vma_lookup() (e.g. vma_pagesize) become stale before it acquires kvm->mmu_lock. This fixes a theoretical bug where a VMA could be changed by userspace after vma_lookup() and before KVM reads the mmu_invalidate_seq, causing KVM to install page table entries based on a (possibly) no-longer-valid vma_pagesize.

Re-order the MMU cache top-up to earlier in kvm_riscv_gstage_map() so that it is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid inducing spurious fault retries).

It's unlikely that any sane userspace currently modifies VMAs in such a way as to trigger this race. And even with directed testing I was unable to reproduce it. But a sufficiently motivated host userspace might be able to exploit this race.

Note KVM/ARM had the same bug and was fixed in a separate, near identical patch (see Link).

Link: https://lore.kernel.org/kvm/20230313235454.2964067-1-dmatlack@google.com/
Fixes: 9955371 ("RISC-V: KVM: Implement MMU notifiers")
Cc: stable@vger.kernel.org
Signed-off-by: David Matlack <dmatlack@google.com>
Tested-by: Anup Patel <anup@brainfault.org>
Signed-off-by: Anup Patel <anup@brainfault.org>
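To make the race and the fix concrete, here is a stand-alone C model of the sequence-counter protocol this patch relies on. All names below (invalidate_seq, invalidate_begin(), fault_would_be_stale(), ...) are invented for the sketch; they only mirror the kernel's kvm->mmu_invalidate_seq machinery and are not kernel API:

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for kvm->mmu_invalidate_seq and friends. */
static atomic_ulong invalidate_seq;
static atomic_int invalidate_in_progress;

/* Invalidation side, cf. kvm_mmu_invalidate_begin/end(). */
static void invalidate_begin(void)
{
	atomic_fetch_add(&invalidate_in_progress, 1);
	/* ... the VMA / page-table change happens here ... */
}

static void invalidate_end(void)
{
	/* Bump the sequence; the kernel orders this with smp_wmb(). */
	atomic_fetch_add(&invalidate_seq, 1);
	atomic_fetch_sub(&invalidate_in_progress, 1);
}

/*
 * Fault side: the results are stale if an invalidation completed
 * (sequence changed) or is still in flight.
 */
static bool fault_would_be_stale(unsigned long snapshot)
{
	if (atomic_load(&invalidate_in_progress))
		return true;
	return atomic_load(&invalidate_seq) != snapshot;
}

int main(void)
{
	/* 1. (mmap_lock held) look up the VMA, then snapshot the seq. */
	unsigned long mmu_seq = atomic_load(&invalidate_seq);

	/* 2. (mmap_lock dropped) a racing invalidation may run here. */
	invalidate_begin();
	invalidate_end();

	/* 3. (kvm->mmu_lock held) detect the race and retry. */
	printf("%s\n", fault_would_be_stale(mmu_seq) ? "retry fault"
						     : "install PTE");
	return 0;
}

Before this patch, the snapshot in step 1 was taken after the mmap_lock was already dropped, so an invalidation could complete entirely between vma_lookup() and the snapshot; the staleness check would then compare against the post-invalidation sequence value and miss the race.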
1 parent 6a8f57a commit 2ed90cb

1 file changed: arch/riscv/kvm/mmu.c (16 additions & 9 deletions)
@@ -628,6 +628,13 @@ int kvm_riscv_gstage_map(struct kvm_vcpu *vcpu,
 		!(memslot->flags & KVM_MEM_READONLY)) ? true : false;
 	unsigned long vma_pagesize, mmu_seq;
 
+	/* We need minimum second+third level pages */
+	ret = kvm_mmu_topup_memory_cache(pcache, gstage_pgd_levels);
+	if (ret) {
+		kvm_err("Failed to topup G-stage cache\n");
+		return ret;
+	}
+
 	mmap_read_lock(current->mm);
 
 	vma = vma_lookup(current->mm, hva);
@@ -648,6 +655,15 @@ int kvm_riscv_gstage_map(struct kvm_vcpu *vcpu,
 	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
 		gfn = (gpa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
 
+	/*
+	 * Read mmu_invalidate_seq so that KVM can detect if the results of
+	 * vma_lookup() or gfn_to_pfn_prot() become stale prior to acquiring
+	 * kvm->mmu_lock.
+	 *
+	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
+	 * with the smp_wmb() in kvm_mmu_invalidate_end().
+	 */
+	mmu_seq = kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
 	if (vma_pagesize != PUD_SIZE &&
@@ -657,15 +673,6 @@ int kvm_riscv_gstage_map(struct kvm_vcpu *vcpu,
 		return -EFAULT;
 	}
 
-	/* We need minimum second+third level pages */
-	ret = kvm_mmu_topup_memory_cache(pcache, gstage_pgd_levels);
-	if (ret) {
-		kvm_err("Failed to topup G-stage cache\n");
-		return ret;
-	}
-
-	mmu_seq = kvm->mmu_invalidate_seq;
-
 	hfn = gfn_to_pfn_prot(kvm, gfn, is_write, &writable);
 	if (hfn == KVM_PFN_ERR_HWPOISON) {
 		send_sig_mceerr(BUS_MCEERR_AR, (void __user *)hva,
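For context, the mmu_seq snapshot is consumed further down kvm_riscv_gstage_map(), under kvm->mmu_lock; that code is untouched by this patch and so does not appear in the hunks above. Roughly (cf. mmu_invalidate_retry() in include/linux/kvm_host.h):

	spin_lock(&kvm->mmu_lock);

	/*
	 * Retry the fault if an MMU invalidation ran after mmu_seq was
	 * snapshotted: the vma_lookup()/gfn_to_pfn_prot() results may be
	 * stale, so drop the lock and let the guest take the fault again.
	 */
	if (mmu_invalidate_retry(kvm, mmu_seq))
		goto out_unlock;

Moving the snapshot to before mmap_read_unlock() is what guarantees this check observes any invalidation that could have changed the VMA after vma_lookup().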
