Commit e80d655
net/mlx5e: Fix potential deadlock by deferring RX timeout recovery
mlx5e_reporter_rx_timeout() is currently invoked synchronously
in the driver's open error flow. This causes the thread holding
priv->state_lock to attempt acquiring the devlink lock, which
can result in a circular dependency with other devlink operations.
For example:
- Devlink health diagnose flow:
- __devlink_nl_pre_doit() acquires the devlink lock.
- devlink_nl_health_reporter_diagnose_doit() invokes the
driver's diagnose callback.
- mlx5e_rx_reporter_diagnose() then attempts to acquire
priv->state_lock.
- Driver open flow:
- mlx5e_open() acquires priv->state_lock.
- If an error occurs, devlink_health_reporter may be called,
attempting to acquire the devlink lock.
To prevent this circular locking scenario, defer the RX timeout
recovery by scheduling it via a workqueue. This ensures that the
recovery work acquires locks in a consistent order: first the
devlink lock, then priv->state_lock.
Additionally, make the recovery work acquire the netdev instance
lock to safely synchronize with the open/close channel flows,
similar to mlx5e_tx_timeout_work. Repeatedly attempt to acquire
the netdev instance lock until it is taken or the target RQ is no
longer active, as indicated by the MLX5E_STATE_CHANNELS_ACTIVE bit.
Fixes: 32c57fb ("net/mlx5e: Report and recover from rx timeout")
Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/1753256672-337784-4-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>1 parent 6d19c44 commit e80d655
3 files changed
Lines changed: 33 additions & 1 deletion
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
728 | 728 | | |
729 | 729 | | |
730 | 730 | | |
| 731 | + | |
731 | 732 | | |
732 | 733 | | |
733 | 734 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
170 | 170 | | |
171 | 171 | | |
172 | 172 | | |
| 173 | + | |
173 | 174 | | |
174 | 175 | | |
175 | 176 | | |
176 | 177 | | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
177 | 182 | | |
178 | 183 | | |
179 | 184 | | |
180 | 185 | | |
181 | 186 | | |
182 | 187 | | |
| 188 | + | |
| 189 | + | |
183 | 190 | | |
184 | 191 | | |
185 | 192 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
707 | 707 | | |
708 | 708 | | |
709 | 709 | | |
| 710 | + | |
| 711 | + | |
| 712 | + | |
| 713 | + | |
| 714 | + | |
| 715 | + | |
| 716 | + | |
| 717 | + | |
| 718 | + | |
| 719 | + | |
| 720 | + | |
| 721 | + | |
| 722 | + | |
| 723 | + | |
| 724 | + | |
| 725 | + | |
| 726 | + | |
| 727 | + | |
| 728 | + | |
| 729 | + | |
| 730 | + | |
710 | 731 | | |
711 | 732 | | |
712 | 733 | | |
| |||
830 | 851 | | |
831 | 852 | | |
832 | 853 | | |
| 854 | + | |
833 | 855 | | |
834 | 856 | | |
835 | 857 | | |
| |||
1204 | 1226 | | |
1205 | 1227 | | |
1206 | 1228 | | |
1207 | | - | |
| 1229 | + | |
| 1230 | + | |
1208 | 1231 | | |
1209 | 1232 | | |
1210 | 1233 | | |
| |||
1375 | 1398 | | |
1376 | 1399 | | |
1377 | 1400 | | |
| 1401 | + | |
1378 | 1402 | | |
1379 | 1403 | | |
1380 | 1404 | | |
| |||
0 commit comments