Commit 6623c5f

Jie1zhang authored and alexdeucher committed
drm/amdgpu: fix lock warning in amdgpu_userq_fence_driver_process
Fix a potential deadlock caused by inconsistent spinlock usage between interrupt and process contexts in the userq fence driver.

The issue occurs when amdgpu_userq_fence_driver_process() is called from both:
- Interrupt context:
  gfx_v11_0_eop_irq() -> amdgpu_userq_fence_driver_process()
- Process context:
  amdgpu_eviction_fence_suspend_worker() -> amdgpu_userq_fence_driver_force_completion() -> amdgpu_userq_fence_driver_process()

In interrupt context, the spinlock was acquired without disabling interrupts, leaving it in {IN-HARDIRQ-W} state. When the same lock is acquired in process context, the kernel detects inconsistent locking since the process context acquisition would enable interrupts while holding a lock previously acquired in interrupt context.

Kernel log shows:
[ 4039.310790] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
[ 4039.310804] kworker/7:2/409 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 4039.310818] ffff9284e1bed000 (&fence_drv->fence_list_lock){?...}-{3:3},
[ 4039.310993] {IN-HARDIRQ-W} state was registered at:
[ 4039.311004] lock_acquire+0xc6/0x300
[ 4039.311018] _raw_spin_lock+0x39/0x80
[ 4039.311031] amdgpu_userq_fence_driver_process.part.0+0x30/0x180 [amdgpu]
[ 4039.311146] amdgpu_userq_fence_driver_process+0x17/0x30 [amdgpu]
[ 4039.311257] gfx_v11_0_eop_irq+0x132/0x170 [amdgpu]

Fix by using spin_lock_irqsave()/spin_unlock_irqrestore() to properly manage interrupt state regardless of calling context.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit ded3ad7)
Cc: stable@vger.kernel.org
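The lockdep complaint described in the message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `toy_*` name is hypothetical and not a kernel API, the model covers a single CPU, and the two usage marks only loosely mirror lockdep's {IN-HARDIRQ-W} and {HARDIRQ-ON-W} states. It runs one shared function first from a simulated interrupt handler and then from a simulated worker, and reports whether the lock ended up with both marks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy single-CPU model; none of these names are kernel APIs. */
static bool toy_irqs_enabled = true;  /* simulated CPU interrupt flag */
static bool toy_in_hardirq = false;   /* true while a simulated IRQ handler runs */

struct toy_lock {
    bool held;
    bool used_in_hardirq;    /* loosely: lockdep's IN-HARDIRQ-W */
    bool used_with_irqs_on;  /* loosely: lockdep's HARDIRQ-ON-W */
};

/* Record the context the lock is taken in, then take it. */
static void toy_acquire(struct toy_lock *l)
{
    assert(!l->held);
    if (toy_in_hardirq)
        l->used_in_hardirq = true;
    else if (toy_irqs_enabled)
        l->used_with_irqs_on = true;
    l->held = true;
}

static void toy_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Models spin_lock_irqsave(): save the old flag, mask IRQs, lock. */
static bool toy_lock_irqsave(struct toy_lock *l)
{
    bool flags = toy_irqs_enabled;

    toy_irqs_enabled = false;
    toy_acquire(l);
    return flags;
}

/* Models spin_unlock_irqrestore(): unlock, then restore the saved flag. */
static void toy_unlock_irqrestore(struct toy_lock *l, bool flags)
{
    toy_release(l);
    toy_irqs_enabled = flags;
}

/* Stand-in for the shared function that both contexts call. */
static void toy_process(struct toy_lock *l, bool use_irqsave)
{
    if (use_irqsave) {
        bool flags = toy_lock_irqsave(l);

        toy_unlock_irqrestore(l, flags);
    } else {
        toy_acquire(l);
        toy_release(l);
    }
}

/* Returns true if the lock was used both in hardirq and with IRQs on. */
bool toy_inconsistent_usage(bool use_irqsave)
{
    struct toy_lock l = { false, false, false };

    /* Interrupt path: IRQs are masked while the handler runs. */
    toy_in_hardirq = true;
    toy_irqs_enabled = false;
    toy_process(&l, use_irqsave);
    toy_in_hardirq = false;
    toy_irqs_enabled = true;

    /* Process path: the worker runs with IRQs enabled. */
    toy_process(&l, use_irqsave);

    return l.used_in_hardirq && l.used_with_irqs_on;
}
```

In this model, `toy_inconsistent_usage(false)` returns true because the plain lock is taken once in hardirq context and once with interrupts enabled, while `toy_inconsistent_usage(true)` returns false because the irqsave variant masks interrupts before every acquisition and puts the previous state back afterwards.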
1 parent 9f8fd53 commit 6623c5f

1 file changed

Lines changed: 3 additions & 2 deletions

drivers/gpu/drm/amd/amdgpu/amdgpu_userq_fence.c

@@ -151,15 +151,16 @@ void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_d
 {
 	struct amdgpu_userq_fence *userq_fence, *tmp;
 	struct dma_fence *fence;
+	unsigned long flags;
 	u64 rptr;
 	int i;
 
 	if (!fence_drv)
 		return;
 
+	spin_lock_irqsave(&fence_drv->fence_list_lock, flags);
 	rptr = amdgpu_userq_fence_read(fence_drv);
 
-	spin_lock(&fence_drv->fence_list_lock);
 	list_for_each_entry_safe(userq_fence, tmp, &fence_drv->fences, link) {
 		fence = &userq_fence->base;
 
@@ -174,7 +175,7 @@ void amdgpu_userq_fence_driver_process(struct amdgpu_userq_fence_driver *fence_d
 		list_del(&userq_fence->link);
 		dma_fence_put(fence);
 	}
-	spin_unlock(&fence_drv->fence_list_lock);
+	spin_unlock_irqrestore(&fence_drv->fence_list_lock, flags);
 }
 
 void amdgpu_userq_fence_driver_destroy(struct kref *ref)
