Commit d451994
net/mlx5: Fix misidentification of write combining CQE during poll loop
The write combining completion poll loop uses usleep_range() which can
sleep much longer than requested due to scheduler latency. Under load,
we witnessed a 20ms+ delay until the process was rescheduled, causing
the jiffies based timeout to expire while the thread is sleeping.
The original do-while loop structure (poll, sleep, check timeout) would
exit without a final poll when waking after timeout, missing a CQE that
arrived during sleep.
Instead of the open-coded while loop, use the kernel's poll_timeout_us()
which always performs an additional check after the sleep expiration,
and is less error-prone.
Note: poll_timeout_us() doesn't accept a sleep range, by passing 10
sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs.
Fixes: d98995b ("net/mlx5: Reimplement write combining test")
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20260218072904.1764634-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>1 parent ae3cb71 commit d451994
1 file changed
Lines changed: 5 additions & 9 deletions
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
2 | 2 | | |
3 | 3 | | |
4 | 4 | | |
| 5 | + | |
5 | 6 | | |
6 | 7 | | |
7 | 8 | | |
| |||
15 | 16 | | |
16 | 17 | | |
17 | 18 | | |
18 | | - | |
| 19 | + | |
19 | 20 | | |
20 | 21 | | |
21 | 22 | | |
| |||
359 | 360 | | |
360 | 361 | | |
361 | 362 | | |
362 | | - | |
363 | 363 | | |
364 | 364 | | |
365 | 365 | | |
| |||
389 | 389 | | |
390 | 390 | | |
391 | 391 | | |
392 | | - | |
393 | | - | |
394 | | - | |
395 | | - | |
396 | | - | |
397 | | - | |
398 | | - | |
| 392 | + | |
| 393 | + | |
| 394 | + | |
399 | 395 | | |
400 | 396 | | |
401 | 397 | | |
| |||
0 commit comments