Skip to content

Commit c1e4f1d

Browse files
LuBaolujoergroedel
authored andcommitted
iommu/vt-d: Clear Present bit before tearing down context entry
When tearing down a context entry, the current implementation zeros the entire 128-bit entry using multiple 64-bit writes. This creates a window where the hardware can fetch a "torn" entry — where some fields are already zeroed while the 'Present' bit is still set — leading to unpredictable behavior or spurious faults. While x86 provides strong write ordering, the compiler may reorder writes to the two 64-bit halves of the context entry. Even without compiler reordering, the hardware fetch is not guaranteed to be atomic with respect to multiple CPU writes. Align with the "Guidance to Software for Invalidations" in the VT-d spec (Section 6.5.3.3) by implementing the recommended ownership handshake: 1. Clear only the 'Present' (P) bit of the context entry first to signal the transition of ownership from hardware to software. 2. Use dma_wmb() to ensure the cleared bit is visible to the IOMMU. 3. Perform the required cache and context-cache invalidation to ensure hardware no longer has cached references to the entry. 4. Fully zero out the entry only after the invalidation is complete. Also, add a dma_wmb() to context_set_present() to ensure the entry is fully initialized before the 'Present' bit becomes visible. Fixes: ba39592 ("Intel IOMMU: Intel IOMMU driver") Reported-by: Dmytro Maluka <dmaluka@chromium.org> Closes: https://lore.kernel.org/all/aTG7gc7I5wExai3S@google.com/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Dmytro Maluka <dmaluka@chromium.org> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20260120061816.2132558-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
1 parent 75ed000 commit c1e4f1d

3 files changed

Lines changed: 27 additions & 3 deletions

File tree

drivers/iommu/intel/iommu.c

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1240,10 +1240,12 @@ static void domain_context_clear_one(struct device_domain_info *info, u8 bus, u8
12401240
}
12411241

12421242
did = context_domain_id(context);
1243-
context_clear_entry(context);
1243+
context_clear_present(context);
12441244
__iommu_flush_cache(iommu, context, sizeof(*context));
12451245
spin_unlock(&iommu->lock);
12461246
intel_context_flush_no_pasid(info, context, did);
1247+
context_clear_entry(context);
1248+
__iommu_flush_cache(iommu, context, sizeof(*context));
12471249
}
12481250

12491251
int __domain_setup_first_level(struct intel_iommu *iommu, struct device *dev,

drivers/iommu/intel/iommu.h

Lines changed: 20 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -900,7 +900,26 @@ static inline int pfn_level_offset(u64 pfn, int level)
900900

901901
static inline void context_set_present(struct context_entry *context)
902902
{
903-
context->lo |= 1;
903+
u64 val;
904+
905+
dma_wmb();
906+
val = READ_ONCE(context->lo) | 1;
907+
WRITE_ONCE(context->lo, val);
908+
}
909+
910+
/*
911+
* Clear the Present (P) bit (bit 0) of a context table entry. This initiates
912+
* the transition of the entry's ownership from hardware to software. The
913+
* caller is responsible for fulfilling the invalidation handshake recommended
914+
* by the VT-d spec, Section 6.5.3.3 (Guidance to Software for Invalidations).
915+
*/
916+
static inline void context_clear_present(struct context_entry *context)
917+
{
918+
u64 val;
919+
920+
val = READ_ONCE(context->lo) & GENMASK_ULL(63, 1);
921+
WRITE_ONCE(context->lo, val);
922+
dma_wmb();
904923
}
905924

906925
static inline void context_set_fault_enable(struct context_entry *context)

drivers/iommu/intel/pasid.c

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1024,7 +1024,7 @@ static int device_pasid_table_setup(struct device *dev, u8 bus, u8 devfn)
10241024
}
10251025

10261026
if (context_copied(iommu, bus, devfn)) {
1027-
context_clear_entry(context);
1027+
context_clear_present(context);
10281028
__iommu_flush_cache(iommu, context, sizeof(*context));
10291029

10301030
/*
@@ -1044,6 +1044,9 @@ static int device_pasid_table_setup(struct device *dev, u8 bus, u8 devfn)
10441044
iommu->flush.flush_iotlb(iommu, 0, 0, 0, DMA_TLB_GLOBAL_FLUSH);
10451045
devtlb_invalidation_with_pasid(iommu, dev, IOMMU_NO_PASID);
10461046

1047+
context_clear_entry(context);
1048+
__iommu_flush_cache(iommu, context, sizeof(*context));
1049+
10471050
/*
10481051
* At this point, the device is supposed to finish reset at
10491052
* its driver probe stage, so no in-flight DMA will exist,

0 commit comments

Comments
 (0)