
Commit 3fd3327

Merge tag 'x86-pasid-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 PASID support from Thomas Gleixner:
 "Reenable ENQCMD/PASID support:

   - Simplify the PASID handling to allocate the PASID once, associate
     it to the mm of a process and free it on mm_exit().

     The previous attempt of refcounted PASIDs and dynamic
     alloc()/free() turned out to be error prone and too complex. The
     PASID space is 20 bits, so the case of resource exhaustion is a
     pure academic concern.

   - Populate the PASID MSR on demand via #GP to avoid racy updates via
     IPIs.

   - Reenable ENQCMD and let objtool check for the forbidden usage of
     ENQCMD in the kernel.

   - Update the documentation for Shared Virtual Addressing accordingly"

* tag 'x86-pasid-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Update documentation for SVA (Shared Virtual Addressing)
  tools/objtool: Check for use of the ENQCMD instruction in the kernel
  x86/cpufeatures: Re-enable ENQCMD
  x86/traps: Demand-populate PASID MSR via #GP
  sched: Define and initialize a flag to identify valid PASID in the task
  x86/fpu: Clear PASID when copying fpstate
  iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  kernel/fork: Initialize mm's PASID
  iommu/ioasid: Introduce a helper to check for valid PASIDs
  mm: Change CONFIG option for mm->pasid field
  iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA
2 parents eaa54b1 + 83aa52f commit 3fd3327

20 files changed

Lines changed: 197 additions & 120 deletions

Documentation/x86/sva.rst

Lines changed: 41 additions & 12 deletions
@@ -104,18 +104,47 @@ The MSR must be configured on each logical CPU before any application
 thread can interact with a device. Threads that belong to the same
 process share the same page tables, thus the same MSR value.
 
-PASID is cleared when a process is created. The PASID allocation and MSR
-programming may occur long after a process and its threads have been created.
-One thread must call iommu_sva_bind_device() to allocate the PASID for the
-process. If a thread uses ENQCMD without the MSR first being populated, a #GP
-will be raised. The kernel will update the PASID MSR with the PASID for all
-threads in the process. A single process PASID can be used simultaneously
-with multiple devices since they all share the same address space.
-
-One thread can call iommu_sva_unbind_device() to free the allocated PASID.
-The kernel will clear the PASID MSR for all threads belonging to the process.
-
-New threads inherit the MSR value from the parent.
+PASID Life Cycle Management
+===========================
+
+PASID is initialized as INVALID_IOASID (-1) when a process is created.
+
+Only processes that access SVA-capable devices need to have a PASID
+allocated. This allocation happens when a process opens/binds an SVA-capable
+device but finds no PASID for this process. Subsequent binds of the same, or
+other devices will share the same PASID.
+
+Although the PASID is allocated to the process by opening a device,
+it is not active in any of the threads of that process. It's loaded to the
+IA32_PASID MSR lazily when a thread tries to submit a work descriptor
+to a device using the ENQCMD.
+
+That first access will trigger a #GP fault because the IA32_PASID MSR
+has not been initialized with the PASID value assigned to the process
+when the device was opened. The Linux #GP handler notes that a PASID has
+been allocated for the process, and so initializes the IA32_PASID MSR
+and returns so that the ENQCMD instruction is re-executed.
+
+On fork(2) or exec(2) the PASID is removed from the process as it no
+longer has the same address space that it had when the device was opened.
+
+On clone(2) the new task shares the same address space, so will be
+able to use the PASID allocated to the process. The IA32_PASID is not
+preemptively initialized as the PASID value might not be allocated yet or
+the kernel does not know whether this thread is going to access the device
+and the cleared IA32_PASID MSR reduces context switch overhead by xstate
+init optimization. Since #GP faults have to be handled on any threads that
+were created before the PASID was assigned to the mm of the process, newly
+created threads might as well be treated in a consistent way.
+
+Due to complexity of freeing the PASID and clearing all IA32_PASID MSRs in
+all threads in unbind, free the PASID lazily only on mm exit.
+
+If a process does a close(2) of the device file descriptor and munmap(2)
+of the device MMIO portal, then the driver will unbind the device. The
+PASID is still marked VALID in the PASID_MSR for any threads in the
+process that accessed the device. But this is harmless as without the
+MMIO portal they cannot submit new work to the device.
 
 Relationships
 =============

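The life cycle described in the sva.rst hunk above (allocate once per mm, share across binds, keep on unbind, free only on mm exit) can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct mm_model`, the `mm_model_*()` helpers, the toy counter allocator and `INVALID_PASID` are all hypothetical names standing in for the mm, the IOMMU SVA helpers and `INVALID_IOASID`.

```c
#include <stdint.h>

/* Hypothetical stand-in for INVALID_IOASID (-1). */
#define INVALID_PASID UINT32_MAX

/* Hypothetical model of the per-process mm; only the PASID is tracked. */
struct mm_model {
	uint32_t pasid;
};

/* Toy allocator: hands out increasing values, never reuses them. */
static uint32_t next_free_pasid = 1;

/* Process creation: the mm starts without a PASID. */
static void mm_model_init(struct mm_model *mm)
{
	mm->pasid = INVALID_PASID;
}

/* Bind: allocate only if the mm has no PASID yet; later binds share it. */
static uint32_t mm_model_bind(struct mm_model *mm)
{
	if (mm->pasid == INVALID_PASID)
		mm->pasid = next_free_pasid++;
	return mm->pasid;
}

/* Unbind: intentionally leaves the PASID attached to the mm. */
static void mm_model_unbind(struct mm_model *mm)
{
	(void)mm;
}

/* mm exit: the only point in this model where the PASID is released. */
static void mm_model_exit(struct mm_model *mm)
{
	mm->pasid = INVALID_PASID;
}
```

In this model a second `mm_model_bind()` returns the value of the first, and `mm_model_unbind()` changes nothing, which mirrors the "free the PASID lazily only on mm exit" rule in the new text.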
arch/x86/include/asm/disabled-features.h

Lines changed: 5 additions & 2 deletions
@@ -56,8 +56,11 @@
 # define DISABLE_PTI (1 << (X86_FEATURE_PTI & 31))
 #endif
 
-/* Force disable because it's broken beyond repair */
-#define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
+#ifdef CONFIG_INTEL_IOMMU_SVM
+# define DISABLE_ENQCMD 0
+#else
+# define DISABLE_ENQCMD (1 << (X86_FEATURE_ENQCMD & 31))
+#endif
 
 #ifdef CONFIG_X86_SGX
 # define DISABLE_SGX 0

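The `(1 << (X86_FEATURE_ENQCMD & 31))` expression in the hunk above selects one bit inside a 32-bit feature word. The sketch below works through that arithmetic. It assumes the usual `word * 32 + bit` encoding and assumes ENQCMD is bit 29 of word 16; that value is not shown in this diff, so treat `FEATURE_ENQCMD_MODEL` and the helper names as illustrative only.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: feature numbers are encoded as word * 32 + bit, and
 * ENQCMD is word 16, bit 29. Neither value appears in this diff.
 */
#define FEATURE_ENQCMD_MODEL (16 * 32 + 29)

/* Index of the 32-bit capability word that holds the feature. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Mask of the feature's bit within its word. */
static uint32_t feature_bit_mask(unsigned int feature)
{
	return UINT32_C(1) << (feature & 31);
}

/*
 * Models the new #ifdef: nothing is force-disabled when SVM support is
 * configured, otherwise the ENQCMD bit is placed in the disabled mask.
 */
static uint32_t disable_enqcmd_mask(bool svm_configured)
{
	return svm_configured ? 0 : feature_bit_mask(FEATURE_ENQCMD_MODEL);
}
```

Under these assumptions the mask is 0x20000000 without SVM support and 0 with it, so the feature is no longer unconditionally masked off.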
arch/x86/kernel/fpu/core.c

Lines changed: 7 additions & 0 deletions
@@ -612,6 +612,13 @@ int fpu_clone(struct task_struct *dst, unsigned long clone_flags)
 	fpu_inherit_perms(dst_fpu);
 	fpregs_unlock();
 
+	/*
+	 * Children never inherit PASID state.
+	 * Force it to have its init value:
+	 */
+	if (use_xsave())
+		dst_fpu->fpstate->regs.xsave.header.xfeatures &= ~XFEATURE_MASK_PASID;
+
 	trace_x86_fpu_copy_src(src_fpu);
 	trace_x86_fpu_copy_dst(dst_fpu);
 

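Clearing a component's bit in the XSAVE header marks that component as being in its init state, so the next restore loads the init value instead of whatever the parent had saved. A minimal sketch of the bit operation follows, assuming the PASID state is xstate component 10; that bit position is not shown in this diff, and `XFEATURE_MASK_PASID_MODEL` and `child_xfeatures()` are hypothetical names.

```c
#include <stdint.h>

/* Assumption: PASID state is xstate component 10 (not shown in this diff). */
#define XFEATURE_MASK_PASID_MODEL (UINT64_C(1) << 10)

/*
 * Returns the xfeatures bitmap a child would start with: identical to
 * the parent's, except that the PASID component is marked as init.
 */
static uint64_t child_xfeatures(uint64_t parent_xfeatures)
{
	return parent_xfeatures & ~XFEATURE_MASK_PASID_MODEL;
}
```

All other components are left untouched, which matches the intent of the comment in the hunk: only the PASID state is forced back to its init value.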
arch/x86/kernel/traps.c

Lines changed: 55 additions & 0 deletions
@@ -39,6 +39,7 @@
 #include <linux/io.h>
 #include <linux/hardirq.h>
 #include <linux/atomic.h>
+#include <linux/ioasid.h>
 
 #include <asm/stacktrace.h>
 #include <asm/processor.h>
@@ -559,6 +560,57 @@ static bool fixup_iopl_exception(struct pt_regs *regs)
 	return true;
 }
 
+/*
+ * The unprivileged ENQCMD instruction generates #GPs if the
+ * IA32_PASID MSR has not been populated. If possible, populate
+ * the MSR from a PASID previously allocated to the mm.
+ */
+static bool try_fixup_enqcmd_gp(void)
+{
+#ifdef CONFIG_IOMMU_SVA
+	u32 pasid;
+
+	/*
+	 * MSR_IA32_PASID is managed using XSAVE. Directly
+	 * writing to the MSR is only possible when fpregs
+	 * are valid and the fpstate is not. This is
+	 * guaranteed when handling a userspace exception
+	 * in *before* interrupts are re-enabled.
+	 */
+	lockdep_assert_irqs_disabled();
+
+	/*
+	 * Hardware without ENQCMD will not generate
+	 * #GPs that can be fixed up here.
+	 */
+	if (!cpu_feature_enabled(X86_FEATURE_ENQCMD))
+		return false;
+
+	pasid = current->mm->pasid;
+
+	/*
+	 * If the mm has not been allocated a
+	 * PASID, the #GP can not be fixed up.
+	 */
+	if (!pasid_valid(pasid))
+		return false;
+
+	/*
+	 * Did this thread already have its PASID activated?
+	 * If so, the #GP must be from something else.
+	 */
+	if (current->pasid_activated)
+		return false;
+
+	wrmsrl(MSR_IA32_PASID, pasid | MSR_IA32_PASID_VALID);
+	current->pasid_activated = 1;
+
+	return true;
+#else
+	return false;
+#endif
+}
+
 DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
 {
 	char desc[sizeof(GPFSTR) + 50 + 2*sizeof(unsigned long) + 1] = GPFSTR;
@@ -567,6 +619,9 @@ DEFINE_IDTENTRY_ERRORCODE(exc_general_protection)
 	unsigned long gp_addr;
 	int ret;
 
+	if (user_mode(regs) && try_fixup_enqcmd_gp())
+		return;
+
 	cond_local_irq_enable(regs);
 
 	if (static_cpu_has(X86_FEATURE_UMIP)) {

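The decision sequence in `try_fixup_enqcmd_gp()` can be exercised outside the kernel with a small model. This is a sketch under stated assumptions: `struct task_model`, `fixup_enqcmd_gp_model()` and `INVALID_PASID` are hypothetical, the MSR is modelled as a plain field, and the valid flag is assumed to be bit 31 (the diff only shows the name `MSR_IA32_PASID_VALID`, not its value).

```c
#include <stdbool.h>
#include <stdint.h>

#define INVALID_PASID   UINT32_MAX           /* stands in for INVALID_IOASID */
#define PASID_MSR_VALID (UINT64_C(1) << 31)  /* assumed position of the valid bit */

/* Hypothetical per-thread view: the mm's PASID plus this thread's MSR state. */
struct task_model {
	uint32_t mm_pasid;       /* PASID allocated to the mm, or INVALID_PASID */
	bool pasid_activated;    /* has this thread already loaded the MSR? */
	uint64_t pasid_msr;      /* modelled IA32_PASID MSR contents */
};

/*
 * Returns true if the fault was fixed up (the faulting instruction should
 * be retried), false if the fault must be handled as an ordinary #GP.
 */
static bool fixup_enqcmd_gp_model(struct task_model *t, bool cpu_has_enqcmd)
{
	if (!cpu_has_enqcmd)
		return false;           /* hardware cannot raise this kind of #GP */
	if (t->mm_pasid == INVALID_PASID)
		return false;           /* no PASID was ever allocated to the mm */
	if (t->pasid_activated)
		return false;           /* MSR already loaded: fault has another cause */

	t->pasid_msr = t->mm_pasid | PASID_MSR_VALID;
	t->pasid_activated = true;
	return true;
}
```

The `pasid_activated` flag is what keeps a genuine #GP from being retried forever: the fixup succeeds at most once per thread, so a second fault on the same thread falls through to the normal handler.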
drivers/iommu/Kconfig

Lines changed: 3 additions & 3 deletions
@@ -144,8 +144,8 @@ config IOMMU_DMA
 	select IRQ_MSI_IOMMU
 	select NEED_SG_DMA_LENGTH
 
-# Shared Virtual Addressing library
-config IOMMU_SVA_LIB
+# Shared Virtual Addressing
+config IOMMU_SVA
 	bool
 	select IOASID
 
@@ -379,7 +379,7 @@ config ARM_SMMU_V3
 config ARM_SMMU_V3_SVA
 	bool "Shared Virtual Addressing support for the ARM SMMUv3"
 	depends on ARM_SMMU_V3
-	select IOMMU_SVA_LIB
+	select IOMMU_SVA
 	select MMU_NOTIFIER
 	help
 	  Support for sharing process address spaces with devices using the

drivers/iommu/Makefile

Lines changed: 1 addition & 1 deletion
@@ -27,6 +27,6 @@ obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
 obj-$(CONFIG_HYPERV_IOMMU) += hyperv-iommu.o
 obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o
-obj-$(CONFIG_IOMMU_SVA_LIB) += iommu-sva-lib.o io-pgfault.o
+obj-$(CONFIG_IOMMU_SVA) += iommu-sva-lib.o io-pgfault.o
 obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
 obj-$(CONFIG_APPLE_DART) += apple-dart.o

drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c

Lines changed: 1 addition & 4 deletions
@@ -340,14 +340,12 @@ __arm_smmu_sva_bind(struct device *dev, struct mm_struct *mm)
 	bond->smmu_mn = arm_smmu_mmu_notifier_get(smmu_domain, mm);
 	if (IS_ERR(bond->smmu_mn)) {
 		ret = PTR_ERR(bond->smmu_mn);
-		goto err_free_pasid;
+		goto err_free_bond;
 	}
 
 	list_add(&bond->list, &master->bonds);
 	return &bond->sva;
 
-err_free_pasid:
-	iommu_sva_free_pasid(mm);
 err_free_bond:
 	kfree(bond);
 	return ERR_PTR(ret);
@@ -377,7 +375,6 @@ void arm_smmu_sva_unbind(struct iommu_sva *handle)
 	if (refcount_dec_and_test(&bond->refs)) {
 		list_del(&bond->list);
 		arm_smmu_mmu_notifier_put(bond->smmu_mn);
-		iommu_sva_free_pasid(bond->mm);
 		kfree(bond);
 	}
 	mutex_unlock(&sva_lock);

drivers/iommu/intel/Kconfig

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ config INTEL_IOMMU_SVM
 	select PCI_PRI
 	select MMU_NOTIFIER
 	select IOASID
-	select IOMMU_SVA_LIB
+	select IOMMU_SVA
 	help
 	  Shared Virtual Memory (SVM) provides a facility for devices
 	  to access DMA resources through process address space by

drivers/iommu/intel/iommu.c

Lines changed: 2 additions & 2 deletions
@@ -4781,7 +4781,7 @@ static int aux_domain_add_dev(struct dmar_domain *domain,
 link_failed:
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
-		ioasid_put(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 
 	return ret;
 }
@@ -4811,7 +4811,7 @@ static void aux_domain_remove_dev(struct dmar_domain *domain,
 	spin_unlock_irqrestore(&device_domain_lock, flags);
 
 	if (list_empty(&domain->subdevices) && domain->default_pasid > 0)
-		ioasid_put(domain->default_pasid);
+		ioasid_free(domain->default_pasid);
 }
 
 static int prepare_domain_attach_device(struct iommu_domain *domain,

drivers/iommu/intel/svm.c

Lines changed: 0 additions & 9 deletions
@@ -514,11 +514,6 @@ static int intel_svm_alloc_pasid(struct device *dev, struct mm_struct *mm,
 	return iommu_sva_alloc_pasid(mm, PASID_MIN, max_pasid - 1);
 }
 
-static void intel_svm_free_pasid(struct mm_struct *mm)
-{
-	iommu_sva_free_pasid(mm);
-}
-
 static struct iommu_sva *intel_svm_bind_mm(struct intel_iommu *iommu,
 					   struct device *dev,
 					   struct mm_struct *mm,
@@ -662,8 +657,6 @@ static int intel_svm_unbind_mm(struct device *dev, u32 pasid)
 				kfree(svm);
 			}
 		}
-		/* Drop a PASID reference and free it if no reference. */
-		intel_svm_free_pasid(mm);
 	}
 out:
 	return ret;
@@ -1047,8 +1040,6 @@ struct iommu_sva *intel_svm_bind(struct device *dev, struct mm_struct *mm, void
 	}
 
 	sva = intel_svm_bind_mm(iommu, dev, mm, flags);
-	if (IS_ERR_OR_NULL(sva))
-		intel_svm_free_pasid(mm);
 	mutex_unlock(&pasid_mutex);
 
 	return sva;
