Commit 86e00cd

KVM: Add irqfd to eventfd's waitqueue while holding irqfds.lock

Add an irqfd to its target eventfd's waitqueue while holding irqfds.lock,
which is mildly terrifying but functionally safe. irqfds.lock is taken
inside the waitqueue's lock, but if and only if the eventfd is being
released, i.e. that path is mutually exclusive with registration as KVM
holds a reference to the eventfd (and obviously must do so to avoid UAF).

This will allow using the eventfd's waitqueue to enforce KVM's requirement
that eventfd is assigned to at most one irqfd, without introducing races.

Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250522235223.3178519-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>

1 parent 5f8ca05 commit 86e00cd

1 file changed

Lines changed: 18 additions & 3 deletions

virt/kvm/eventfd.c

@@ -204,6 +204,11 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
 	int ret = 0;
 
 	if (flags & EPOLLIN) {
+		/*
+		 * WARNING: Do NOT take irqfds.lock in any path except EPOLLHUP,
+		 * as KVM holds irqfds.lock when registering the irqfd with the
+		 * eventfd.
+		 */
 		u64 cnt;
 		eventfd_ctx_do_read(irqfd->eventfd, &cnt);
 
@@ -225,6 +230,11 @@ irqfd_wakeup(wait_queue_entry_t *wait, unsigned mode, int sync, void *key)
 		/* The eventfd is closing, detach from KVM */
 		unsigned long iflags;
 
+		/*
+		 * Taking irqfds.lock is safe here, as KVM holds a reference to
+		 * the eventfd when registering the irqfd, i.e. this path can't
+		 * be reached while kvm_irqfd_add() is running.
+		 */
 		spin_lock_irqsave(&kvm->irqfds.lock, iflags);
 
 		/*
@@ -296,16 +306,21 @@ static void kvm_irqfd_register(struct file *file, wait_queue_head_t *wqh,
 
 	list_add_tail(&irqfd->list, &kvm->irqfds.items);
 
-	spin_unlock_irq(&kvm->irqfds.lock);
-
 	/*
 	 * Add the irqfd as a priority waiter on the eventfd, with a custom
 	 * wake-up handler, so that KVM *and only KVM* is notified whenever the
-	 * underlying eventfd is signaled.
+	 * underlying eventfd is signaled. Temporarily lie to lockdep about
+	 * holding irqfds.lock to avoid a false positive regarding potential
+	 * deadlock with irqfd_wakeup() (see irqfd_wakeup() for details).
 	 */
 	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
 
+	spin_release(&kvm->irqfds.lock.dep_map, _RET_IP_);
 	add_wait_queue_priority(wqh, &irqfd->wait);
+	spin_acquire(&kvm->irqfds.lock.dep_map, 0, 0, _RET_IP_);
+
+	spin_unlock_irq(&kvm->irqfds.lock);
+
 	p->ret = 0;
 }
 
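The safety argument in the commit message (registration nests the waitqueue's lock inside irqfds.lock, release nests them the other way around, and the two paths cannot overlap because registration runs with a reference held) can be illustrated with a small single-threaded userspace model. This is only a sketch under stated assumptions: `struct model`, `model_lock()`, `model_unlock()`, `model_register()` and `model_put()` are hypothetical names that do not exist in the kernel, the locks are plain flags rather than spinlocks, and the reference count is a simplified stand-in for whatever keeps the eventfd alive during registration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model {
	int  eventfd_refs;	/* references currently held on the eventfd */
	bool irqfds_lock;	/* stands in for kvm->irqfds.lock */
	bool wq_lock;		/* stands in for the eventfd waitqueue's lock */
	bool on_waitqueue;	/* the irqfd is queued on the eventfd */
};

/* Take a modeled lock; the assert stands in for "this would deadlock". */
static void model_lock(bool *lock)
{
	assert(!*lock);
	*lock = true;
}

static void model_unlock(bool *lock)
{
	assert(*lock);
	*lock = false;
}

/* Registration path: irqfds.lock outside, waitqueue lock inside. */
static void model_register(struct model *m)
{
	/* Assumption: the caller holds a reference for the whole call. */
	assert(m->eventfd_refs > 0);

	model_lock(&m->irqfds_lock);
	model_lock(&m->wq_lock);	/* add_wait_queue_priority() analogue */
	m->on_waitqueue = true;
	model_unlock(&m->wq_lock);
	model_unlock(&m->irqfds_lock);
}

/*
 * Drop one reference.  The release (EPOLLHUP-like) path, which takes the
 * locks in the opposite order, runs only when the last reference goes away.
 * Returns true if the release path ran.
 */
static bool model_put(struct model *m)
{
	assert(m->eventfd_refs > 0);
	if (--m->eventfd_refs > 0)
		return false;

	model_lock(&m->wq_lock);
	model_lock(&m->irqfds_lock);
	m->on_waitqueue = false;	/* detach from KVM */
	model_unlock(&m->irqfds_lock);
	model_unlock(&m->wq_lock);
	return true;
}
```

In this model the inverted lock order in `model_put()` is reachable only once `eventfd_refs` drops to zero, while `model_register()` asserts that a reference is held, so the two orderings never run against each other. The model does not cover the EPOLLIN path, which is why the real code carries the warning against taking irqfds.lock there.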