Commit f7afda7
drm/amd: Fix hang on amdgpu unload by using pci_dev_is_disconnected()
The commit 6a23e7b ("drm/amd: Clean up kfd node on surprise
disconnect") introduced early KFD cleanup when drm_dev_is_unplugged()
returns true. However, this causes hangs during normal module unload
(rmmod amdgpu).
The issue occurs because drm_dev_unplug() is called in amdgpu_pci_remove()
for all removal scenarios, not just surprise disconnects. This was done
intentionally in commit 39934d3 ("Revert "drm/amdgpu: TA unload
messages are not actually sent to psp when amdgpu is uninstalled"") to
fix IGT PCI software unplug test failures. As a result,
drm_dev_is_unplugged() returns true even during normal module unload,
triggering the early KFD cleanup inappropriately.
The correct check should distinguish between:
- Actual surprise disconnect (eGPU unplugged): pci_dev_is_disconnected()
returns true
- Normal module unload (rmmod): pci_dev_is_disconnected() returns false
Replace drm_dev_is_unplugged() with pci_dev_is_disconnected() to ensure
the early cleanup only happens during true hardware disconnect events.
Cc: stable@vger.kernel.org
Reported-by: Cal Peake <cp@absolutedigital.net>
Closes: https://lore.kernel.org/all/b0c22deb-c0fa-3343-33cf-fd9a77d7db99@absolutedigital.net/
Fixes: 6a23e7b ("drm/amd: Clean up kfd node on surprise disconnect")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>1 parent 6952255 commit f7afda7
1 file changed
Lines changed: 2 additions & 2 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
4924 | 4924 | | |
4925 | 4925 | | |
4926 | 4926 | | |
4927 | | - | |
| 4927 | + | |
4928 | 4928 | | |
4929 | 4929 | | |
4930 | 4930 | | |
| |||
4936 | 4936 | | |
4937 | 4937 | | |
4938 | 4938 | | |
4939 | | - | |
| 4939 | + | |
4940 | 4940 | | |
4941 | 4941 | | |
4942 | 4942 | | |
| |||
0 commit comments