linux/drivers
Bill Kuzeja a5dd506e15 scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init
A system can get hung task timeouts if a qlogic board fails during
initialization (if the board breaks again or fails the init). The hang
involves the scsi scan.

In a nutshell, since commit beb9e315e6 ("qla2xxx: Prevent removal and
board_disable race"):

...it is possible to have freed ha (base_vha->hw) early by a call to
qla2x00_remove_one when pdev->enable_cnt equals zero:

       if (!atomic_read(&pdev->enable_cnt)) {
               scsi_host_put(base_vha->host);
               kfree(ha);
               pci_set_drvdata(pdev, NULL);
               return;

Almost always, the scsi_host_put above frees the vha structure
(attached to the end of the Scsi_Host we're putting) since it's the last
put, and life is good.  However, if we are entering this routine because
the adapter has broken sometime during initialization AND a scsi scan is
already in progress (and has done its own scsi_host_get), vha will not
be freed. What's worse, the scsi scan will access the freed ha structure
through qla2xxx_scan_finished:

        if (time > vha->hw->loop_reset_delay * HZ)
                return 1;

The scsi scan keeps checking to see if a scan is complete by calling
qla2xxx_scan_finished. There is a timeout value that limits the length
of time a scan can take (hw->loop_reset_delay, usually set to 5
seconds), but this definition is in the data structure (hw) that can get
freed early.

This can yield unpredictable results, the worst of which is that the
scsi scan can hang indefinitely. This happens when the freed structure
gets reused and loop_reset_delay gets overwritten with garbage, which
the scan obliviously uses as its timeout value.

The fix for this is simple: at the top of qla2xxx_scan_finished, check
for the UNLOADING bit in the vha structure (_vha is not freed at this
point).  If UNLOADING is set, we exit the scan for this adapter
immediately. After this last reference to the ha structure, we'll exit
the scan for this adapter, and continue on.

This problem is hard to hit, but I have run into it doing negative
testing many times now (with a test specifically designed to bring it
out), so I can verify that this fix works. My testing has been against a
RHEL7 driver variant, but the bug and patch are equally relevant to to
the upstream driver.

Fixes: beb9e315e6 ("qla2xxx: Prevent removal and board_disable race")
Cc: <stable@vger.kernel.org> # v3.18+
Signed-off-by: Bill Kuzeja <william.kuzeja@stratus.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2016-11-01 16:39:01 -04:00
..
accessibility
acpi nmi_backtrace: generate one-line reports for idle cpus 2016-10-07 18:46:30 -07:00
amba
android
ata
atm atm: iphase: fix newline escape and minor tweak to source formatting 2016-09-15 19:15:55 -04:00
auxdisplay
base memory-hotplug: fix store_mem_state() return value 2016-10-07 18:46:28 -07:00
bcma
block The big ticket item here is support for rbd exclusive-lock feature, 2016-10-10 13:52:05 -07:00
bluetooth Bluetooth: btusb: Fix atheros firmware download error 2016-10-07 09:46:56 +02:00
bus ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
cdrom
char Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
clk ARM: SoC: late DT updates for v4.9 2016-10-07 21:34:49 -07:00
clocksource ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
connector
cpufreq Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm 2016-10-06 07:59:37 -07:00
cpuidle nmi_backtrace: generate one-line reports for idle cpus 2016-10-07 18:46:30 -07:00
crypto Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-10-10 14:04:16 -07:00
dax
dca
devfreq PM / devfreq: rk3399_dmc: Remove explictly regulator_put call in .remove 2016-09-19 13:36:20 +02:00
dio
dma dmaengine updates for 4.8-rc1 2016-10-06 17:13:54 -07:00
dma-buf
edac * Altera Arria10 enablement of NAND, DMA, USB, QSPI and SD-MMC FIFO 2016-10-04 12:06:26 -07:00
eisa
extcon Merge branch 'next' into resolution 2016-09-15 16:45:20 +05:30
firewire
firmware ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
fmc
fpga
gpio Merge branch 'i2c/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2016-10-07 14:12:21 -07:00
gpu Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 13:04:49 -07:00
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid 2016-10-07 11:58:38 -07:00
hsi
hv Drivers: hv: get rid of id in struct vmbus_channel 2016-09-27 12:35:49 +02:00
hwmon hwmon updates for v4.9 2016-10-04 10:56:14 -07:00
hwspinlock
hwtracing
i2c Merge branch 'i2c/for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux 2016-10-07 14:12:21 -07:00
ide
idle nmi_backtrace: generate one-line reports for idle cpus 2016-10-07 18:46:30 -07:00
iio Staging/IIO patches for 4.9-rc1 2016-10-05 14:50:51 -07:00
infiniband Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2016-10-07 09:12:19 -07:00
iommu KVM updates for v4.9-rc1 2016-10-06 10:49:01 -07:00
ipack
irqchip Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-10-10 10:24:41 -07:00
isdn
leds leds: triggers: Check return value of kobject_uevent_env() 2016-09-20 10:22:10 +02:00
lguest
lightnvm Merge branch 'for-4.9/block' of git://git.kernel.dk/linux-block 2016-10-07 14:42:05 -07:00
macintosh powerpc: Remove all usages of NO_IRQ 2016-09-20 20:57:12 +10:00
mailbox Merge branch 'mailbox-for-next' of git://git.linaro.org/landing-teams/working/fujitsu/integration 2016-10-06 17:36:53 -07:00
mcb mcb: Add a dma_device to mcb_device 2016-09-27 12:33:47 +02:00
md Merge branch 'for-4.9/block-irq' of git://git.kernel.dk/linux-block 2016-10-09 17:29:33 -07:00
media ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
memory ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
memstick
message scsi: fusion: Fix error return code in mptfc_probe() 2016-09-14 14:26:19 -04:00
mfd - Core Frameworks 2016-10-07 08:35:35 -07:00
misc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
mmc USB/PHY/EXTCON patches for 4.9-rc1 2016-10-03 20:17:35 -07:00
mtd This pull request contains: 2016-10-11 10:49:44 -07:00
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-11 08:10:19 -07:00
nfc
ntb
nubus
nvdimm libnvdimm, region: fix flush hint table thinko 2016-09-24 11:45:38 -07:00
nvme Merge branch 'for-4.9/block-irq' of git://git.kernel.dk/linux-block 2016-10-09 17:29:33 -07:00
nvmem ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
of MTD updates for 4.9-rc1 2016-10-10 17:39:51 -07:00
oprofile Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
parisc
parport
pci Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2016-10-10 14:04:16 -07:00
pcmcia pcmcia: soc_common: add driver-data pointer 2016-09-22 09:39:16 +01:00
perf ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
phy phy-twl4030-usb: initialize charging-related stuff via pm_runtime 2016-09-14 10:59:12 +05:30
pinctrl ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
platform Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
pnp
power power supply and reset changes for the v4.9 series 2016-10-06 18:21:15 -07:00
powercap
pps
ps3 powerpc: Remove all usages of NO_IRQ 2016-09-20 20:57:12 +10:00
ptp ptp: Fix resource leak in case of error 2016-10-03 21:54:10 -04:00
pwm
rapidio rapidio/rio_cm: avoid GFP_KERNEL in atomic context 2016-09-19 15:36:17 -07:00
ras
regulator - Core Frameworks 2016-10-07 08:35:35 -07:00
remoteproc rpmsg updates for v4.9 2016-10-06 17:03:49 -07:00
reset
rpmsg
rtc Merge branches 'ib-mfd-gpio-4.9', 'ib-mfd-gpio-regulator-4.9', 'ib-mfd-input-4.9', 'ib-mfd-regulator-4.9', 'ib-mfd-regulator-4.9.1', 'ib-mfd-regulator-rtc-4.9', 'ib-mfd-regulator-rtc-4.9-1' and 'ib-mfd-rtc-4.9' into ibs-for-mfd-merged 2016-10-04 15:47:01 +01:00
s390 scsi: zfcp: spin_lock_irqsave() is not nestable 2016-10-14 16:21:08 -04:00
sbus
scsi scsi: qla2xxx: Fix scsi scan hang triggered if adapter fails during init 2016-11-01 16:39:01 -04:00
sfi
sh
sn
soc ARM: SoC driver updates for v4.9 2016-10-07 21:23:40 -07:00
spi Merge remote-tracking branches 'spi/topic/ti-qspi', 'spi/topic/tools', 'spi/topic/txx9' and 'spi/topic/xlp' into spi-next 2016-09-30 09:14:22 -07:00
spmi spmi: pmic-arb: Return an error code if sanity check fails 2016-09-27 12:43:34 +02:00
ssb
staging Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
target chcr/cxgb4i/cxgbit/RDMA/cxgb4: Allocate resources dynamically for all cxgb4 ULD's 2016-09-19 01:37:32 -04:00
tc
thermal
thunderbolt
tty Merge branch 'printk-cleanups' 2016-10-10 09:29:50 -07:00
uio
usb Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-10-10 20:16:43 -07:00
uwb
vfio vfio_pci: use pci_alloc_irq_vectors 2016-09-29 13:36:38 -06:00
vhost
video backlight: pwm_bl: Handle gpio that can sleep 2016-10-06 09:27:26 +01:00
virt
virtio
vlynq
vme vme: fake: remove unexpected unlock in fake_master_set() 2016-09-27 12:43:35 +02:00
w1
watchdog USB/PHY/EXTCON patches for 4.9-rc1 2016-10-03 20:17:35 -07:00
xen xen: features and fixes for 4.9-rc0 2016-10-06 11:19:10 -07:00
zorro
Kconfig
Makefile clk: probe common clock drivers earlier 2016-09-23 13:00:04 +02:00