@@ -104,18 +104,47 @@ The MSR must be configured on each logical CPU before any application
 thread can interact with a device. Threads that belong to the same
 process share the same page tables, thus the same MSR value.
 
-PASID is cleared when a process is created. The PASID allocation and MSR
-programming may occur long after a process and its threads have been created.
-One thread must call iommu_sva_bind_device() to allocate the PASID for the
-process. If a thread uses ENQCMD without the MSR first being populated, a #GP
-will be raised. The kernel will update the PASID MSR with the PASID for all
-threads in the process. A single process PASID can be used simultaneously
-with multiple devices since they all share the same address space.
-
-One thread can call iommu_sva_unbind_device() to free the allocated PASID.
-The kernel will clear the PASID MSR for all threads belonging to the process.
-
-New threads inherit the MSR value from the parent.
+PASID Life Cycle Management
+===========================
+
+PASID is initialized as INVALID_IOASID (-1) when a process is created.
+
+Only processes that access SVA-capable devices need to have a PASID
+allocated. This allocation happens when a process opens/binds an SVA-capable
+device but finds no PASID for this process. Subsequent binds of the same or
+other devices will share the same PASID.
+
+Although the PASID is allocated to the process by opening a device,
+it is not active in any of the threads of that process. It is loaded into the
+IA32_PASID MSR lazily when a thread tries to submit a work descriptor
+to a device using the ENQCMD instruction.
+
+That first access will trigger a #GP fault because the IA32_PASID MSR
+has not been initialized with the PASID value assigned to the process
+when the device was opened. The Linux #GP handler notes that a PASID has
+been allocated for the process, and so initializes the IA32_PASID MSR
+and returns so that the ENQCMD instruction is re-executed.
+
+On fork(2) or exec(2) the PASID is removed from the process as it no
+longer has the same address space that it had when the device was opened.
+
+On clone(2) the new task shares the same address space, so it will be able to
+use the PASID allocated to the process. The IA32_PASID MSR is not preemptively
+initialized because the PASID value might not be allocated yet, the kernel
+does not know whether this thread is going to access the device, and a
+cleared IA32_PASID MSR reduces context switch overhead through the xstate
+init optimization. Since #GP faults have to be handled on any threads that
+were created before the PASID was assigned to the mm of the process, newly
+created threads might as well be treated in a consistent way.
+
+Due to the complexity of freeing the PASID and clearing the IA32_PASID MSR in
+all threads on unbind, the PASID is freed lazily, only on mm exit.
+
+If a process does a close(2) of the device file descriptor and munmap(2)
+of the device MMIO portal, then the driver will unbind the device. The
+PASID is still marked VALID in the IA32_PASID MSR for any threads in the
+process that accessed the device. But this is harmless as without the
+MMIO portal they cannot submit new work to the device.
 
 Relationships
 =============
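
The life cycle described by the added text can be modeled in a few lines of user-space C. This is only an illustrative sketch, not kernel code: every name in it (`mm_model`, `thread_model`, `bind_device`, `gp_fixup`, `enqcmd`, `clone_thread`, `fork_mm`, `INVALID_PASID`, and the toy counter allocator) is hypothetical and chosen for this example. It assumes one PASID per address space, a per-thread copy of the IA32_PASID MSR, and a fault handler that loads the MSR only when the mm already has a PASID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_PASID (-1)      /* stands in for INVALID_IOASID in the text */

/* Hypothetical model of an address space (mm): owns at most one PASID. */
struct mm_model {
    int pasid;                  /* INVALID_PASID until the first bind */
};

/* Hypothetical model of a thread with its own copy of the IA32_PASID MSR. */
struct thread_model {
    struct mm_model *mm;
    bool msr_valid;             /* models the valid bit of the MSR */
    int msr_pasid;
};

static int next_pasid = 1;      /* toy allocator, not the kernel's */

/* The first bind allocates; later binds of any device reuse the PASID. */
static int bind_device(struct mm_model *mm)
{
    assert(mm != NULL);
    if (mm->pasid == INVALID_PASID)
        mm->pasid = next_pasid++;
    return mm->pasid;
}

/* Unbind leaves the PASID in place; it is released only on mm exit. */
static void unbind_device(struct mm_model *mm)
{
    (void)mm;
}

/* Models the #GP fixup: true means the MSR was loaded, so retry ENQCMD. */
static bool gp_fixup(struct thread_model *t)
{
    if (t->msr_valid || t->mm->pasid == INVALID_PASID)
        return false;           /* nothing to fix up: a genuine fault */
    t->msr_pasid = t->mm->pasid;
    t->msr_valid = true;
    return true;
}

/* Models ENQCMD: returns the PASID used, or INVALID_PASID on a fatal fault. */
static int enqcmd(struct thread_model *t, int *faults)
{
    while (!t->msr_valid) {
        (*faults)++;            /* every attempt with a cleared MSR faults */
        if (!gp_fixup(t))
            return INVALID_PASID;
    }
    return t->msr_pasid;
}

/* clone(2): the new thread shares the mm but starts with a cleared MSR. */
static struct thread_model clone_thread(const struct thread_model *parent)
{
    struct thread_model t = { .mm = parent->mm, .msr_valid = false, .msr_pasid = 0 };
    return t;
}

/* fork(2)/exec(2): a new address space starts without a PASID. */
static struct mm_model fork_mm(void)
{
    struct mm_model mm = { .pasid = INVALID_PASID };
    return mm;
}
```

In this model each thread takes at most one fault before its first successful submission, whether it was created before or after the bind. That matches the hunk's argument for treating newly cloned threads the same way as pre-existing ones.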