Commit c53ed16

net/mlx5: Always drain health in shutdown callback
jira LE-2157
cve CVE-2024-43866
Rebuild_History Non-Buildable kernel-5.14.0-503.14.1.el9_5
commit-author Shay Drory <[email protected]>
commit 1b75da2
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-503.14.1.el9_5/1b75da22.failed

There is no point in recovery during device shutdown. if health
work started need to wait for it to avoid races and NULL pointer
access.

Hence, drain health WQ on shutdown callback.

Fixes: 1958fc2 ("net/mlx5: SF, Add auxiliary device driver")
Fixes: d2aa060 ("net/mlx5: Cancel health poll before sending panic teardown command")
Signed-off-by: Shay Drory <[email protected]>
Reviewed-by: Moshe Shemesh <[email protected]>
Signed-off-by: Tariq Toukan <[email protected]>
Reviewed-by: Wojciech Drewek <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
(cherry picked from commit 1b75da2)
Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
# drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c

1 parent b7e4a22 commit c53ed16

1 file changed: 71 additions & 0 deletions

@@ -0,0 +1,71 @@
+net/mlx5: Always drain health in shutdown callback
+
+jira LE-2157
+cve CVE-2024-43866
+Rebuild_History Non-Buildable kernel-5.14.0-503.14.1.el9_5
+commit-author Shay Drory <[email protected]>
+commit 1b75da22ed1e6171e261bc9265370162553d5393
+Empty-Commit: Cherry-Pick Conflicts during history rebuild.
+Will be included in final tarball splat. Ref for failed cherry-pick at:
+ciq/ciq_backports/kernel-5.14.0-503.14.1.el9_5/1b75da22.failed
+
+There is no point in recovery during device shutdown. if health
+work started need to wait for it to avoid races and NULL pointer
+access.
+
+Hence, drain health WQ on shutdown callback.
+
+Fixes: 1958fc2f0712 ("net/mlx5: SF, Add auxiliary device driver")
+Fixes: d2aa060d40fa ("net/mlx5: Cancel health poll before sending panic teardown command")
+Signed-off-by: Shay Drory <[email protected]>
+Reviewed-by: Moshe Shemesh <[email protected]>
+Signed-off-by: Tariq Toukan <[email protected]>
+Reviewed-by: Wojciech Drewek <[email protected]>
+Link: https://patch.msgid.link/[email protected]
+Signed-off-by: Jakub Kicinski <[email protected]>
+(cherry picked from commit 1b75da22ed1e6171e261bc9265370162553d5393)
+Signed-off-by: Jonathan Maple <[email protected]>
+
+# Conflicts:
+# drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
+diff --cc drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
+index 5c054a0005dd,b706f1486504..000000000000
+--- a/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
+@@@ -106,8 -109,11 +106,14 @@@ static void mlx5_sf_dev_remove(struct a
+  static void mlx5_sf_dev_shutdown(struct auxiliary_device *adev)
+  {
+  	struct mlx5_sf_dev *sf_dev = container_of(adev, struct mlx5_sf_dev, adev);
+ -	struct mlx5_core_dev *mdev = sf_dev->mdev;
+
+++<<<<<<< HEAD
+ +	mlx5_unload_one(sf_dev->mdev, false);
+++=======
++ 	set_bit(MLX5_BREAK_FW_WAIT, &mdev->intf_state);
++ 	mlx5_drain_health_wq(mdev);
++ 	mlx5_unload_one(mdev, false);
+++>>>>>>> 1b75da22ed1e (net/mlx5: Always drain health in shutdown callback)
+  }
+
+  static const struct auxiliary_device_id mlx5_sf_dev_id_table[] = {
+diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+index 8152cba96786..c00e0f67c971 100644
+--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
++++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
+@@ -2140,7 +2140,6 @@ static int mlx5_try_fast_unload(struct mlx5_core_dev *dev)
+ 	/* Panic tear down fw command will stop the PCI bus communication
+ 	 * with the HCA, so the health poll is no longer needed.
+ 	 */
+-	mlx5_drain_health_wq(dev);
+ 	mlx5_stop_health_poll(dev, false);
+
+ 	ret = mlx5_cmd_fast_teardown_hca(dev);
+@@ -2175,6 +2174,7 @@ static void shutdown(struct pci_dev *pdev)
+
+ 	mlx5_core_info(dev, "Shutdown was called\n");
+ 	set_bit(MLX5_BREAK_FW_WAIT, &dev->intf_state);
++	mlx5_drain_health_wq(dev);
+ 	err = mlx5_try_fast_unload(dev);
+ 	if (err)
+ 		mlx5_unload_one(dev, false);
+* Unmerged path drivers/net/ethernet/mellanox/mlx5/core/sf/dev/driver.c
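
The conflict in `sf/dev/driver.c` is left unresolved in this commit. As a rough illustration only, the sketch below shows the call order that the upstream side of the conflict asks for in the SF shutdown path: set `MLX5_BREAK_FW_WAIT`, drain the health work queue, then unload. It is a userspace mock, not kernel code: the struct layouts, the bit number, the helper bodies, `call_log`, `sf_dev_shutdown_sketch()` and `run_shutdown_sketch()` are all assumptions made for the example, and the actual backport resolution may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-ins for the kernel types; not the real definitions. */
struct mlx5_core_dev { unsigned long intf_state; };
struct mlx5_sf_dev { struct mlx5_core_dev *mdev; };

/* Assumed bit number, chosen only for this sketch. */
enum { MLX5_BREAK_FW_WAIT = 0 };

/* Records the order in which the stand-in helpers are called. */
static char call_log[64];

static void log_call(const char *name)
{
	strcat(call_log, name);
	strcat(call_log, ";");
}

/* Stand-in for the kernel's set_bit(): sets the bit and logs the call. */
static void set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
	log_call("set_bit");
}

/* Stand-in: the real helper waits for pending health work to finish. */
static void mlx5_drain_health_wq(struct mlx5_core_dev *dev)
{
	(void)dev;
	log_call("drain");
}

/* Stand-in: the real helper tears the device down. */
static void mlx5_unload_one(struct mlx5_core_dev *dev, bool suspend)
{
	(void)dev;
	(void)suspend;
	log_call("unload");
}

/* Mirrors the upstream side of the conflict: drain health before unloading. */
static void sf_dev_shutdown_sketch(struct mlx5_sf_dev *sf_dev)
{
	struct mlx5_core_dev *mdev = sf_dev->mdev;

	set_bit(MLX5_BREAK_FW_WAIT, &mdev->intf_state);
	mlx5_drain_health_wq(mdev);
	mlx5_unload_one(mdev, false);
}

/* Runs the sketch once on a fresh mock device and returns the call order. */
static const char *run_shutdown_sketch(void)
{
	static struct mlx5_core_dev mdev;
	struct mlx5_sf_dev sf_dev = { .mdev = &mdev };

	call_log[0] = '\0';
	mdev.intf_state = 0;
	sf_dev_shutdown_sketch(&sf_dev);
	return call_log;
}
```

The ordering is the point of the fix described in the commit message: once the health work queue has been drained, no recovery work can race with the unload that follows.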
