Commit f8e7ab6

scsi: storvsc: Fix handling of virtual Fibre Channel timeouts
jira LE-1907
Rebuild_History Non-Buildable kernel-5.14.0-284.30.1.el9_2
commit-author Michael Kelley <[email protected]>
commit 175544a
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-5.14.0-284.30.1.el9_2/175544ad.failed

Hyper-V provides the ability to connect Fibre Channel LUNs to the host
system and present them in a guest VM as a SCSI device. I/O to the vFC
device is handled by the storvsc driver. The storvsc driver includes a
partial integration with the FC transport implemented in the generic
portion of the Linux SCSI subsystem so that FC attributes can be displayed
in /sys. However, the partial integration means that some aspects of vFC
don't work properly. Unfortunately, a full and correct integration isn't
practical because of limitations in what Hyper-V provides to the guest.

In particular, in the context of Hyper-V storvsc, the FC transport timeout
function fc_eh_timed_out() causes a kernel panic because it can't find the
rport and dereferences a NULL pointer. The original patch that added the
call from storvsc_eh_timed_out() to fc_eh_timed_out() is faulty in this
regard.

In many cases a timeout is due to a transient condition, so the situation
can be improved by just continuing to wait like with other I/O requests
issued by storvsc, and avoiding the guaranteed panic. For a permanent
failure, continuing to wait may result in a hung thread instead of a panic,
which again may be better.

So fix the panic by removing the storvsc call to fc_eh_timed_out(). This
allows storvsc to keep waiting for a response. The change has been tested
by users who experienced a panic in fc_eh_timed_out() due to transient
timeouts, and it solves their problem.

In the future we may want to deprecate the vFC functionality in storvsc
since it can't be fully fixed. But it has current users for whom it is
working well enough, so it should probably stay for a while longer.

Fixes: 3930d73 ("scsi: storvsc: use default I/O timeout handler for FC devices")
Cc: [email protected]
Signed-off-by: Michael Kelley <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
(cherry picked from commit 175544a)
Signed-off-by: Jonathan Maple <[email protected]>

# Conflicts:
# drivers/scsi/storvsc_drv.c
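The failure mode described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `mock_rport`, `mock_cmd`, `transport_timed_out()` and the enum values are hypothetical stand-ins, not the kernel's definitions, and the NULL check exists only so the model can report the condition that the real transport handler would hit by dereferencing the missing rport.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; these are not the real definitions. */
enum timeout_action { EH_DONE, EH_RESET_TIMER };
enum port_state { PORT_ONLINE, PORT_BLOCKED };

struct mock_rport { enum port_state state; };
struct mock_cmd { struct mock_rport *rport; };	/* NULL models storvsc vFC */

/*
 * Models a transport-level timeout handler that expects a remote port.
 * Returns 0 and fills *out when an rport exists, or -1 at the point where
 * an unguarded handler would dereference a NULL rport.
 */
static int transport_timed_out(const struct mock_cmd *cmd,
			       enum timeout_action *out)
{
	if (cmd->rport == NULL)
		return -1;
	*out = (cmd->rport->state == PORT_BLOCKED) ? EH_RESET_TIMER : EH_DONE;
	return 0;
}

/* Exercises the model with and without an rport attached to the command. */
static void self_check(void)
{
	struct mock_rport blocked = { .state = PORT_BLOCKED };
	struct mock_cmd fc_cmd = { .rport = &blocked };
	struct mock_cmd vfc_cmd = { .rport = NULL };
	enum timeout_action action = EH_DONE;

	assert(transport_timed_out(&fc_cmd, &action) == 0);
	assert(action == EH_RESET_TIMER);
	assert(transport_timed_out(&vfc_cmd, &action) == -1);
}
```

In this model, a command with no rport can never be handled safely by the transport-level function, which is why the commit removes the call rather than trying to guard it inside storvsc.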
1 parent ab60ca2 commit f8e7ab6

1 file changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+scsi: storvsc: Fix handling of virtual Fibre Channel timeouts
+
+jira LE-1907
+Rebuild_History Non-Buildable kernel-5.14.0-284.30.1.el9_2
+commit-author Michael Kelley <[email protected]>
+commit 175544ad48cbf56affeef2a679c6a4d4fb1e2881
+Empty-Commit: Cherry-Pick Conflicts during history rebuild.
+Will be included in final tarball splat. Ref for failed cherry-pick at:
+ciq/ciq_backports/kernel-5.14.0-284.30.1.el9_2/175544ad.failed
+
+Hyper-V provides the ability to connect Fibre Channel LUNs to the host
+system and present them in a guest VM as a SCSI device. I/O to the vFC
+device is handled by the storvsc driver. The storvsc driver includes a
+partial integration with the FC transport implemented in the generic
+portion of the Linux SCSI subsystem so that FC attributes can be displayed
+in /sys. However, the partial integration means that some aspects of vFC
+don't work properly. Unfortunately, a full and correct integration isn't
+practical because of limitations in what Hyper-V provides to the guest.
+
+In particular, in the context of Hyper-V storvsc, the FC transport timeout
+function fc_eh_timed_out() causes a kernel panic because it can't find the
+rport and dereferences a NULL pointer. The original patch that added the
+call from storvsc_eh_timed_out() to fc_eh_timed_out() is faulty in this
+regard.
+
+In many cases a timeout is due to a transient condition, so the situation
+can be improved by just continuing to wait like with other I/O requests
+issued by storvsc, and avoiding the guaranteed panic. For a permanent
+failure, continuing to wait may result in a hung thread instead of a panic,
+which again may be better.
+
+So fix the panic by removing the storvsc call to fc_eh_timed_out(). This
+allows storvsc to keep waiting for a response. The change has been tested
+by users who experienced a panic in fc_eh_timed_out() due to transient
+timeouts, and it solves their problem.
+
+In the future we may want to deprecate the vFC functionality in storvsc
+since it can't be fully fixed. But it has current users for whom it is
+working well enough, so it should probably stay for a while longer.
+
+Fixes: 3930d7309807 ("scsi: storvsc: use default I/O timeout handler for FC devices")
+
+Signed-off-by: Michael Kelley <[email protected]>
+Link: https://lore.kernel.org/r/[email protected]
+Signed-off-by: Martin K. Petersen <[email protected]>
+(cherry picked from commit 175544ad48cbf56affeef2a679c6a4d4fb1e2881)
+Signed-off-by: Jonathan Maple <[email protected]>
+
+# Conflicts:
+# drivers/scsi/storvsc_drv.c
+diff --cc drivers/scsi/storvsc_drv.c
+index a8be138fda6e,047ffaf7d42a..000000000000
+--- a/drivers/scsi/storvsc_drv.c
++++ b/drivers/scsi/storvsc_drv.c
+@@@ -1656,13 -1672,9 +1656,17 @@@ static int storvsc_host_reset_handler(s
+   * be unbounded on Azure. Reset the timer unconditionally to give the host a
+   * chance to perform EH.
+   */
+ -static enum scsi_timeout_action storvsc_eh_timed_out(struct scsi_cmnd *scmnd)
+ +static enum blk_eh_timer_return storvsc_eh_timed_out(struct scsi_cmnd *scmnd)
+  {
+++<<<<<<< HEAD
+ +#if IS_ENABLED(CONFIG_SCSI_FC_ATTRS)
+ +	if (scmnd->device->host->transportt == fc_transport_template)
+ +		return fc_eh_timed_out(scmnd);
+ +#endif
+ +	return BLK_EH_RESET_TIMER;
+++=======
++ 	return SCSI_EH_RESET_TIMER;
+++>>>>>>> 175544ad48cb (scsi: storvsc: Fix handling of virtual Fibre Channel timeouts)
+  }
+
+  static bool storvsc_scsi_cmd_ok(struct scsi_cmnd *scmnd)
+* Unmerged path drivers/scsi/storvsc_drv.c
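
The conflict above appears to come from a naming difference rather than a logic difference: the HEAD side returns `enum blk_eh_timer_return` / `BLK_EH_RESET_TIMER`, while the upstream side returns `enum scsi_timeout_action` / `SCSI_EH_RESET_TIMER`. One plausible resolution for this tree, shown here as a userspace sketch and not as the content of the final tarball, keeps HEAD's return type, drops the `fc_eh_timed_out()` branch, and resets the timer unconditionally. The enum below is a mock whose values are illustrative, and `struct scsi_cmnd` is left opaque.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Mock of the enum used on the HEAD side of the conflict. The numeric
 * values are illustrative assumptions, not the kernel's.
 */
enum blk_eh_timer_return { BLK_EH_DONE, BLK_EH_RESET_TIMER };

struct scsi_cmnd;	/* opaque stand-in; the handler no longer inspects it */

/*
 * Assumed resolution: HEAD's signature with the upstream behaviour, i.e.
 * no transport-specific branch and an unconditional timer reset.
 */
static enum blk_eh_timer_return storvsc_eh_timed_out(struct scsi_cmnd *scmnd)
{
	(void)scmnd;
	return BLK_EH_RESET_TIMER;
}

/* Checks that the handler's result does not depend on the command. */
static void resolution_self_check(void)
{
	assert(storvsc_eh_timed_out(NULL) == BLK_EH_RESET_TIMER);
}
```

Because the handler ignores its argument, FC and non-FC hosts take the same path, which matches the commit message's intent of letting storvsc keep waiting for a response.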
