1243 Commits

Author SHA1 Message Date
Sebastian Ott
d36f0c6638 [S390] cio: use pim to check for multipath.
To check if multipath is available we count the bits set in lpm,
which could change over time (via configure [on|off] of a path).

The following patch uses the pim (which is persistent) for this
decision.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:11 +01:00
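The multipath check described in d36f0c6638 can be sketched in user-space C. This is only an illustration: the function names and the 8-bit mask width are assumptions, and only the idea of counting bits in the persistent pim instead of the changeable lpm comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Count the bits set in a path mask (one bit per channel path). */
int path_mask_weight(uint8_t mask)
{
	int n = 0;

	while (mask) {
		mask &= (uint8_t)(mask - 1);	/* clear the lowest set bit */
		n++;
	}
	assert(n <= 8);	/* an 8-bit mask cannot hold more than 8 paths */
	return n;
}

/* Hypothetical helper: multipath is possible with more than one installed path. */
bool has_multipath(uint8_t pim)
{
	return path_mask_weight(pim) > 1;
}
```

Because the decision depends only on the installed-path mask, configuring a single path off or on would not flip the result in this sketch.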
Sebastian Ott
f444cc0e52 [S390] cio: commit all pmcw changes.
Sometimes we change the pmcw configuration but don't call msch
to transmit these changes to the channel subsystem.

The patch fixes this by calling cio_commit_config in such cases.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:10 +01:00
Sebastian Ott
13952ec12d [S390] cio: introduce cio_commit_config
To change the configuration of a subchannel we alter the modifiable
bits of the subchannel's schib field and issue a modify subchannel.
There can be the case that not all changes were applied -or worse-
quietly overwritten by the hardware. With the next store subchannel
we obtain the current state of the hardware but lose our target
configuration.

With this patch we introduce a subchannel_config structure which
contains the target subchannel configuration. Additionally the msch
wrapper cio_modify is replaced with cio_commit_config which
copies the desired changes to a temporary schib. msch is then
called with the temporary schib. This schib is only written back
to the subchannel if all changes were applied.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:10 +01:00
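The commit-to-temporary pattern described in 13952ec12d can be sketched in user-space C. All type and function names below are hypothetical, the schib is reduced to three fields, and the callback merely simulates issuing msch and observing what the hardware accepted; it is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in: the real schib has many more fields. */
struct schib_sketch {
	unsigned int ena;	/* subchannel enabled */
	unsigned int mp;	/* multipath mode */
	unsigned int isc;	/* interruption subclass */
};

struct subchannel_sketch {
	struct schib_sketch schib;	/* last state accepted by the hardware */
	struct schib_sketch config;	/* target configuration we want applied */
};

/* Stands in for msch plus reading back what the hardware now holds. */
typedef void (*modify_fn)(struct schib_sketch *hw_view);

/* Simulated hardware that applies every requested change. */
void hw_accepts_all(struct schib_sketch *hw_view)
{
	(void)hw_view;
}

/* Simulated hardware that quietly clears the multipath bit. */
void hw_drops_mp(struct schib_sketch *hw_view)
{
	hw_view->mp = 0;
}

bool commit_config_sketch(struct subchannel_sketch *sch, modify_fn modify)
{
	struct schib_sketch tmp;

	assert(sch != NULL && modify != NULL);
	tmp = sch->schib;	/* work on a temporary copy only */
	tmp.ena = sch->config.ena;
	tmp.mp = sch->config.mp;
	tmp.isc = sch->config.isc;

	modify(&tmp);

	if (tmp.ena != sch->config.ena || tmp.mp != sch->config.mp ||
	    tmp.isc != sch->config.isc)
		return false;	/* not applied: old schib and target config survive */

	sch->schib = tmp;	/* write back only when everything was applied */
	return true;
}
```

Keeping the target configuration in its own structure means a failed attempt can simply be retried later without reconstructing what was wanted.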
Sebastian Ott
cdb912a40d [S390] cio: introduce cio_update_schib
There is the chance that we get condition code 0 for a stsch but
the resulting schib is not valid. In the current code there are
2 cases:
* we do a check for validity of the schib after stsch, but at this
  time we have already stored the invalid schib in the subchannel
  structure. This may lead to problems.
* we don't do a check for validity, which is not that good either.

The patch addresses both issues by introducing the stsch wrapper
cio_update_schib which performs stsch on a local schib. This schib
is only written back to the subchannel if it's valid.

side note: For some functions (chp_events) the return codes are
different now (-ENXIO vs -ENODEV) but this shouldn't do harm
since the caller doesn't check for _specific_ errors.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:10 +01:00
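The store-to-local idea described in cdb912a40d can be sketched as follows. The names are hypothetical, validity is reduced to a single assumed flag, the callbacks only simulate stsch, and the -ENODEV return value is an illustrative choice rather than a statement about the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct schib_sketch {
	bool valid;		/* hypothetical single validity flag */
	unsigned int dev;	/* some payload, e.g. a device number */
};

struct subchannel_sketch {
	struct schib_sketch schib;
};

/* Stands in for stsch: fills *out and returns a condition code. */
typedef int (*store_fn)(struct schib_sketch *out);

int store_valid(struct schib_sketch *out)
{
	out->valid = true;
	out->dev = 42;
	return 0;
}

/* Condition code 0, but the stored contents are not usable. */
int store_invalid_cc0(struct schib_sketch *out)
{
	out->valid = false;
	out->dev = 0;
	return 0;
}

int update_schib_sketch(struct subchannel_sketch *sch, store_fn store)
{
	struct schib_sketch local;

	assert(sch != NULL && store != NULL);
	if (store(&local) != 0 || !local.valid)
		return -ENODEV;	/* illustrative error code */
	sch->schib = local;	/* only a validated schib reaches the subchannel */
	return 0;
}
```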
Cornelia Huck
d6a30761d8 [S390] cio: Use device_is_registered().
Check if a ccw device is registered via device_is_registered()
and not via the old kludge of checking the membership in driver
core internal klists.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:09 +01:00
Cornelia Huck
283fdd0b8a [S390] cio: Dont call ->release directly.
Just put the cdev's reference count to give up our reference.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:09 +01:00
Cornelia Huck
90ed2b692f [S390] cio: Dont fail probe for I/O subchannels.
If we fail the probe for an I/O subchannel, we won't be able
to unregister it again since there are no sch_event()
callbacks for unbound subchannels. Just succeed the probe in
any case and schedule unregistering the subchannel.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:08 +01:00
Cornelia Huck
5fb6b8544d [S390] cio: Only register ccw_device for registered subchannel.
There is a race between io_subchannel_register() and
io_subchannel_sch_event() which may cause a subchannel to be
unregistered because it is no longer operational before
io_subchannel_register() had run. We need to check whether the
subchannel is still registered before the ccw device can be
registered and just bail out if it is not.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:08 +01:00
Cornelia Huck
6eff208f47 [S390] cio: Fix I/O subchannel refcounting.
Subchannel refcounting was incorrect in some places, especially
a refcount was missing when ccw_device_call_sch_unregister()
was called and the refcount was not correctly switched after
moving devices.

Fix this by establishing the following rules:
- The ccw_device obtains a reference on its parent subchannel
  when dev.parent is set and gives it up in its release
  function. This is needed because we need a parent reference
  for correct refcounting even before the ccw device is (if at
  all) registered.
- When calling device_move(), obtain a reference on the new
  subchannel before moving the ccw device and give up the
  reference on the old parent after moving. This brings the
  refcount in line with the first rule.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:08 +01:00
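The two refcounting rules from 6eff208f47 can be sketched with plain counters. The structures and helpers below are hypothetical user-space stand-ins; in particular, the pointer assignment in move_child_sketch() only takes the place of device_move().

```c
#include <assert.h>
#include <stddef.h>

struct parent_sketch {
	int refcount;
};

struct child_sketch {
	struct parent_sketch *parent;
};

void get_parent(struct parent_sketch *p)
{
	p->refcount++;
}

void put_parent(struct parent_sketch *p)
{
	assert(p->refcount > 0);
	p->refcount--;
}

/* Rule 1: take the reference as soon as the parent pointer is set. */
void set_parent_sketch(struct child_sketch *c, struct parent_sketch *p)
{
	get_parent(p);
	c->parent = p;
}

/* Rule 1, second half: give the reference up in the release function. */
void release_child_sketch(struct child_sketch *c)
{
	put_parent(c->parent);
	c->parent = NULL;
}

/* Rule 2: get the new parent before moving, put the old one afterwards. */
void move_child_sketch(struct child_sketch *c, struct parent_sketch *new_parent)
{
	struct parent_sketch *old = c->parent;

	get_parent(new_parent);
	c->parent = new_parent;	/* stands in for device_move() */
	put_parent(old);
}
```

With this ordering the child holds a reference on some parent at every point during the move.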
Cornelia Huck
9cd6742197 [S390] cio: Fix reference counting for online/offline.
The current code attempts to get an extra reference count
for online devices by doing a get_device() in ccw_device_online()
and a put_device() in ccw_device_done(). However, this
- incorrectly obtains an extra reference for disconnected
  devices becoming available again (since they are already
  online)
- needs special checks for css_init_done in order to handle
  the console device
- is not obvious and
- may incorrectly drop a reference count in ccw_device_done() if
  that function is called after path verification for a device
  that just became not operational.

So let's just get the reference in ccw_device_set_online() and
drop it in ccw_device_set_offline(). (Unfortunately, we still
need the special case in io_subchannel_probe().)

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:07 +01:00
Cornelia Huck
97166f52fc [S390] cio: Put reference on correct device after moving.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:07 +01:00
Peter Oberparleiter
c619d4223e [S390] cio: fix ccwgroup online vs. ungroup race condition
Ensure atomicity of ungroup operation to prevent concurrent ungroup
and online processing which may lead to use-after-release situations.

Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:06 +01:00
Sebastian Ott
111e95a4ca [S390] cio: move irritating comment.
Due to former patches a comment and device id initialization were
split from the addressed function call in io_subchannel_probe.

Move it back to where it belongs.

Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:06 +01:00
Heiko Carstens
191fd44c11 [S390] cio: get rid of compile warning
Move cio_tpi() to the rest of the CONFIG_CCW_CONSOLE functions to
get rid of this one:

drivers/s390/cio/cio.c:115: warning: 'cio_tpi' defined but not used

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:05 +01:00
Kay Sievers
98df67b324 [S390] struct device - replace bus_id with dev_name(), dev_set_name()
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:03 +01:00
Stefan Haberland
0cd4bd4754 [S390] dasd: call cleanup_cqr with request_queue_lock
__dasd_cleanup_cqr should be called with request_queue_lock held and
__dasd_block_process_erp with queue_lock

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:02 +01:00
Stefan Haberland
50afd20f8c [S390] dasd: correct sense byte condition for SIM
SIM sense data are always 32 bit sense data so sense byte 27 bit 0
has not to be set.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:02 +01:00
Cornelia Huck
faf16aa9b3 [S390] dasd: Use accessors instead of using driver_data directly.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:01 +01:00
Stefan Haberland
2bf373b3e3 [S390] dasd: improve dasd statistics proc interface
For a large number of I/O requests the values were shifted binary.
The shift was not transparent for the user because the shift value
was not displayed. To make this interface more human readable the
values are shifted decimal and the scale factor is displayed.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:01 +01:00
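The decimal scaling described in 2bf373b3e3 can be sketched as below. The function name and the idea of passing the largest printable value as a parameter are assumptions made for illustration; the commit message only states that values are scaled by a power of ten and that the factor is shown to the user.

```c
#include <assert.h>

/*
 * Return the smallest power of ten by which max_value must be divided
 * so that the quotient does not exceed limit.
 */
unsigned int decimal_scale_factor(unsigned int max_value, unsigned int limit)
{
	unsigned int factor = 1;

	assert(limit >= 9);	/* at least one decimal digit; also bounds the loop */
	while (max_value / factor > limit)
		factor *= 10;
	return factor;
}
```

A caller would divide every counter by the returned factor before printing and print the factor itself alongside, so the reader can recover the magnitude.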
Christof Schmitt
bd43a42b7e [S390] zfcp: Report microcode level through service level interface
Register zfcp with the new /proc/service_levels interface to report the
FCP microcode level. When the adapter goes offline or a channel path
disappears, zfcp unregisters, since the microcode version might change
and zfcp does not know about it.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:01 +01:00
Martin Schwidefsky
6bcac508fb [S390] service level interface.
Add a new proc interface /proc/service_levels that allows any code
to report a relevant service level, e.g. the microcode level of
devices, the service level of the hypervisor, etc.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:00 +01:00
Jan Glauber
7a0b4cbc7d [S390] qdio: fix error reporting for hipersockets
Hipersocket connections can encounter temporary busy conditions.
In case of the busy bit set we retry the SIGA operation immediately.
If the busy condition still persists after 100 ms we fail and report
the error to the upper layer. The second stage retry logic is removed.
In case of ongoing busy conditions the upper layer needs to reset the
connection.

The reporting of a SIGA error is now done synchronously to allow the
network driver to requeue the buffers. Also no error trace is created
for the temporary SIGA errors so the error message view is not flooded.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:39:00 +01:00
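The single-stage retry described in 7a0b4cbc7d (retry immediately while busy, give up after 100 ms) can be sketched with an injected clock. Everything here is hypothetical: the enum, the callbacks, the fake device used to exercise the loop, and the -EBUSY return value are illustrative and not taken from the qdio code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BUSY_TIMEOUT_MS 100	/* the 100 ms limit named in the commit message */

enum siga_result { SIGA_OK = 0, SIGA_BUSY = 1 };

typedef enum siga_result (*siga_fn)(void *ctx);
typedef uint64_t (*clock_ms_fn)(void *ctx);

/* Fake device for illustration: every attempt costs 10 ms of fake time. */
struct fake_dev {
	uint64_t now_ms;
	unsigned int busy_calls;	/* number of attempts that report busy */
	unsigned int calls;
};

enum siga_result fake_siga(void *ctx)
{
	struct fake_dev *d = ctx;

	d->calls++;
	d->now_ms += 10;
	return d->calls <= d->busy_calls ? SIGA_BUSY : SIGA_OK;
}

uint64_t fake_clock(void *ctx)
{
	return ((struct fake_dev *)ctx)->now_ms;
}

/* Retry immediately while busy; report the error once the deadline passed. */
int siga_retry_sketch(siga_fn op, clock_ms_fn now_ms, void *ctx)
{
	uint64_t start;

	assert(op != NULL && now_ms != NULL);
	start = now_ms(ctx);
	for (;;) {
		if (op(ctx) == SIGA_OK)
			return 0;
		if (now_ms(ctx) - start > BUSY_TIMEOUT_MS)
			return -EBUSY;	/* synchronous error to the caller */
	}
}
```

Returning the error synchronously is what would let an upper layer requeue its buffers, as the commit message describes.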
Jan Glauber
50f769df1c [S390] qdio: improve inbound buffer acknowledgement
- Use automatic acknowledgement of incoming buffers in QEBSM mode
- Move ACK for non-QEBSM mode always to the newest buffer to prevent
  a race with qdio_stop_polling
- Remove the polling spinlock, the upper layer drivers return new buffers
  in the same code path and could not run in parallel
- Don't flood the error log in case of no-target-buffer-empty
- In handle_inbound we check if we would overwrite an ACK'ed buffer, if so
  advance the pointer to the oldest ACK'ed buffer so we don't overwrite an
  empty buffer in qdio_stop_polling

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:59 +01:00
Jan Glauber
22f9934767 [S390] qdio: rework debug feature logging
- make qdio_trace a per device view
- remove s390dbf exceptions
- remove CONFIG_QDIO_DEBUG, not needed anymore if we check for the level
  before calling sprintf
- use snprintf for dbf entries
- add start markers to see if the dbf view wrapped
- add a global error view for all queues

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:59 +01:00
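Two items from 22f9934767, checking the level before formatting and using snprintf for entries, can be sketched in user-space C. The view structure, the entry length, and the convention that a lower number means a more important event are assumptions for this illustration.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define DBF_ENTRY_LEN 32	/* hypothetical fixed entry size */

struct dbf_view_sketch {
	int level;			/* events above this level are dropped */
	char last[DBF_ENTRY_LEN];	/* most recent entry */
	unsigned int formatted;		/* how often we actually formatted */
};

/* Returns 1 if the event was recorded, 0 if it was filtered out. */
int dbf_log_sketch(struct dbf_view_sketch *v, int level, const char *fmt, ...)
{
	va_list ap;

	assert(v != NULL && fmt != NULL);
	if (level > v->level)
		return 0;	/* filtered before any formatting cost is paid */

	va_start(ap, fmt);
	vsnprintf(v->last, sizeof(v->last), fmt, ap);	/* bounded, always terminated */
	va_end(ap);
	v->formatted++;
	return 1;
}
```

Since the level check comes first, leaving the log calls compiled in costs little, which is the reason the commit gives for dropping a separate debug config option.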
Jan Glauber
9a1ce28aeb [S390] qdio: fix compile warning under 31 bit
The QEBSM instructions are only available for CONFIG_64BIT, they are not
used under 31 bit. Make compiler happy about the false positive:

drivers/s390/cio/qdio_main.c: In function ‘qdio_inbound_q_done’:
drivers/s390/cio/qdio_main.c:532: warning: ‘state’ may be used uninitialized in this function

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:58 +01:00
Jan Glauber
23589d057a [S390] qdio: add eqbs/sqbs instruction counters
Add counters for the eqbs and sqbs instructions that indicate how often
we issued the instructions and how often the instructions returned with
fewer buffers than specified.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:58 +01:00
Jan Glauber
bbd50e172f [S390] qdio: fix qeth port count detection
qeth needs to get the port count information before
qdio has allocated a page for the chsc operation.
Extend qdio_get_ssqd_desc() to store the data in the
specified structure.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:58 +01:00
Christian Maaser
43c207e6e5 [S390] ap: Minor code beautification.
Changed some symbol names for better and clearer code.

Signed-off-by: Christian Maaser <cmaaser@de.ibm.com>
Signed-off-by: Felix Beck <beckf@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:57 +01:00
Felix Beck
cb17a6364a [S390] zcrypt: Use of Thin Interrupts
When the machine supports AP adapter interrupts polling will be
switched off at module initialization and the driver will work in
interrupt mode.

Signed-off-by: Felix Beck <felix.beck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:57 +01:00
Christian Borntraeger
a114a9d69d [S390] vmcp: remove BKL
The vmcp driver uses the session->mutex for concurrent access of the data
structures. Therefore, the BKL in vmcp_open does not protect against any
other function in the driver.
The BKL in vmcp_open would protect concurrent access to the module init
but all necessary steps have finished before misc_register is called.
We can safely remove the lock_kernel from vmcp.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-12-25 13:38:54 +01:00
Swen Schillig
f7a65e92e4 [SCSI] zfcp: prevent double decrement on host_busy while being busy
The zfcp_scsi_queuecommand was not acting according to the standard
when the respective unit was not available. In this case an -EBUSY was
returned, which is not valid in itself, and in addition scsi_done
was called. This combination is not allowed and was leading to a
double finish of the request and therefore double decrement of the
host_busy counter.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:18:20 -06:00
Swen Schillig
fca55b6fb5 [SCSI] zfcp: fix deadlock between wq triggered port scan and ERP
Waiting for the ERP to be finished in a task running in the global
kernel work-queue is a bad idea, especially if the ERP needs to run
another job in this work-queue before it can finish. -> deadlock.

This patch removes the necessity to wait for a finished ERP from the
scan task and moves the job scheduling to the end of the ERP.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:18:04 -06:00
Swen Schillig
0ac55aa90f [SCSI] zfcp: eliminate race between validation and locking
The check of having a valid pointer was performed before the
processing was secured by the lock. Between those two steps the
pointer can turn invalid.  During further processing another value is
used (referenced by the pointer described above) as a function pointer
which is never verified to be valid either, resulting under some
circumstances in an invalid function call.  This patch is fixing both
issues.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:17:50 -06:00
Swen Schillig
26871c97d5 [SCSI] zfcp: verify for correct rport state before scanning for SCSI devs
Prevent a SCSI target scan for an rport which has turned invalid
in the meantime.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:17:34 -06:00
Swen Schillig
633528c304 [SCSI] zfcp: returning an ERR_PTR where a NULL value is expected
Aborting a SCSI cmnd might require sending an abort_fsf_cmnd. If the
creation of this fsf_req fails an ERR_PTR is returned where a NULL
value would be expected as an error indicator. This ERR_PTR is
dereferenced as valid fsf_req in succeeding processing leading to
an error.

Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:17:14 -06:00
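The mismatch described in 633528c304 can be illustrated in user-space C. The err_ptr() and is_err() helpers below are simplified stand-ins modeled on the kernel's ERR_PTR() and IS_ERR(), and the request type and creation functions are hypothetical. Converting at the boundary is one possible way to reconcile the two conventions; it is not claimed to be the change made by the commit.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* True if the pointer lies in the range reserved for encoded errors. */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct req_sketch {
	int id;
};

static struct req_sketch the_req = { 1 };

/* Creation function that reports failure as an error pointer. */
struct req_sketch *create_req_sketch(int fail)
{
	return fail ? err_ptr(-ENOMEM) : &the_req;
}

/* Boundary for callers that only test for NULL. */
struct req_sketch *create_req_or_null(int fail)
{
	struct req_sketch *req = create_req_sketch(fail);

	return is_err(req) ? NULL : req;
}
```

An error pointer is non-NULL, so a caller that only checks for NULL would go on to dereference it; that is the failure mode the commit message describes.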
Christof Schmitt
1c1cba17a9 [SCSI] zfcp: Fix opening of wka ports
Running two wka_port_get calls in parallel could issue two open_port
requests, overwriting the port handle. Don't issue an open_port
for the state PORT_OPENING, and only read the data from GOOD
responses.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Acked-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:16:59 -06:00
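The state handling described in 1c1cba17a9 can be sketched as a small state machine. The enum values, structure, and function names are hypothetical, and the sketch is single-threaded: real code would need locking around the state check, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>

enum wka_state_sketch { WKA_OFFLINE, WKA_OPENING, WKA_ONLINE };

struct wka_port_sketch {
	enum wka_state_sketch state;
	unsigned int handle;		/* port handle from a good response */
	unsigned int opens_issued;	/* how many open requests were sent */
};

/* Issue an open request only when no open is pending or completed. */
void wka_port_get_sketch(struct wka_port_sketch *p)
{
	assert(p != NULL);
	if (p->state == WKA_OFFLINE) {
		p->state = WKA_OPENING;
		p->opens_issued++;	/* stands in for sending open_port */
	}
	/* WKA_OPENING or WKA_ONLINE: nothing to send */
}

/* Take the handle only from a good response. */
void wka_open_response_sketch(struct wka_port_sketch *p, int good,
			      unsigned int handle)
{
	assert(p != NULL);
	if (good) {
		p->handle = handle;
		p->state = WKA_ONLINE;
	} else {
		p->state = WKA_OFFLINE;	/* handle stays untouched */
	}
}
```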
Martin Petermann
bce02614cd [SCSI] zfcp: fix remote port status check
For an incoming RSCN it was checked by the ZFCP_STATUS_PORT_DID_DID
define to re-open a remote port or to test the connection. Since this
define was re-used it was also necessary to replace that define with
ZFCP_STATUS_PORT_PHYS_OPEN.

Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-12-01 10:16:44 -06:00
Linus Torvalds
011331483d Merge branch 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.osdl.marist.edu/pub/scm/linux-2.6:
  [S390] fix s390x_newuname
  [S390] dasd: log sense for fatal errors
  [S390] cpu topology: fix locking
  [S390] cio: Fix refcount after moving devices.
  [S390] ftrace: fix kernel stack backchain walking
  [S390] ftrace: disable tracing on idle psw
  [S390] lockdep: fix compile bug
  [S390] kvm_s390: Fix oops in virtio device detection with "mem="
  [S390] sclp: emit error message if assign storage fails
  [S390] Fix range for add_active_range() in setup_memory()
2008-11-15 11:38:02 -08:00
Stefan Haberland
a9cffb227d [S390] dasd: log sense for fatal errors
The logging of sense data for fatal errors was accidentally removed
during Hyper PAV implementation.

Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:54 +01:00
Cornelia Huck
85acc407bf [S390] cio: Fix refcount after moving devices.
In ccw_device_move_to_orphanage(), a replacing ccw_device
is searched via get_{disc,orphaned}_ccwdev_by_dev_id()
which obtain a reference on the returned ccw_device.
This reference must be given up again after the device
has been moved to its new parent.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:54 +01:00
Christian Borntraeger
cc835f7872 [S390] kvm_s390: Fix oops in virtio device detection with "mem="
The current virtio model on s390 has the descriptor page above the main
memory. The guest virtio detection will oops if the mem= parameter is
used to reduce/change the memory size.
We have to use real_memory_size instead of max_pfn to detect the virtio
descriptor pages.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2008-11-14 18:18:52 +01:00
Heiko Carstens
675be97a32 [S390] sclp: emit error message if assign storage fails
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2008-11-14 18:18:52 +01:00
Christof Schmitt
d94ce6c6e9 [SCSI] zfcp: Fix hexdump data in s390dbf traces
Fix multiple problems found in the hexdump data:
 - length calculation was wrong, traces were incomplete
 - FC payloads were dumped in different record than the output
   function tried to read
 - minor fixes in output
 - allow complete RSCN traces (up to 1024 bytes according to spec)

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:55 -05:00
Martin Petermann
7ea633ffad [SCSI] zfcp: fix erp timeout cleanup for port open requests
If an open port fsf request times out (in erp), the
corresponding erp_action member of the fsf
request needs to be set to NULL. Otherwise, if the port
structure is removed later on, the fsf request still
holds a reference to the no longer existing
erp_action.

Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:40 -05:00
Christof Schmitt
77fd9494bc [SCSI] zfcp: Wait for port scan to complete when setting adapter online
Attaching a unit immediately after setting the adapter online should
be possible. The problem right now is that the port_scan runs from a
workqueue and has not finished when the set_online call returns and
the sysfs structures for the ports are not available yet. Fix that by
waiting for the port scan to complete.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:19 -05:00
Christof Schmitt
adc90daffb [SCSI] zfcp: Fix cast warning
Fix leftover from last typecast patch:

drivers/s390/scsi/zfcp_aux.c: In function ‘zfcp_port_enqueue’:
drivers/s390/scsi/zfcp_aux.c:629: warning: format ‘%016llx’ expects
type ‘long long unsigned int’, but argument 3 has type ‘u64’

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:47:03 -05:00
Christof Schmitt
3765138ae9 [SCSI] zfcp: Fix request list handling in error path
Fix the handling of the request list in the error path:
 - Use irqsave for the lock as in the good path.
 - Before removing the request, check if it is still in the list, a
   call to dismiss_all might have changed the list in between.
 - zfcp_qdio_send does not change the queue counters on failure,
   trying to revert something is wrong, so remove this.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:46:39 -05:00
Christof Schmitt
88f2a97787 [SCSI] zfcp: fix mempool usage for status_read requests
When allocating fsf requests without qtcb, store the pointer to the
mempool in the fsf requests for later call to mempool_free. This
codepath is only used by the status_read requests.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:45:07 -05:00
Heiko Carstens
45316a86a6 [SCSI] zfcp: fix req_list_locking.
The per adapter req_list_lock must be held with interrupts disabled, otherwise
we might end up with nice deadlocks as lockdep tells us (see below).

zfcp 0.0.1804: QDIO problem occurred.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.27-rc8-00035-g4a77035-dirty #86
---------------------------------------------------------
swapper/0 just changed the state of lock:
 (&adapter->erp_lock){++..}, at: [<00000000002c82ae>] zfcp_erp_adapter_reopen+0x4e/0x8c
but this lock took another, hard-irq-unsafe lock in the past:
 (&adapter->req_list_lock){-+..}

and interrupts could create inverse lock ordering between them.

[tons of backtraces, but only the interesting part follows]

the second lock's dependencies:
-> (&adapter->req_list_lock){-+..} ops: 2280627634176 {
   initial-use  at:
                        [<0000000000071f10>] __lock_acquire+0x504/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002cf684>] zfcp_fsf_req_dismiss_all+0x50/0x140
                        [<00000000002c87ee>] zfcp_erp_adapter_strategy_generic+0x66/0x3d0
                        [<00000000002c9498>] zfcp_erp_thread+0x88c/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
   in-softirq-W at:
                        [<0000000000072172>] __lock_acquire+0x766/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d7224>] _spin_lock_irqsave+0x6c/0xb0
                        [<00000000002ca73e>] zfcp_qdio_int_resp+0xbe/0x2ac
                        [<000000000027a1d6>] qdio_kick_inbound_handler+0x82/0xa0
                        [<000000000027daba>] tiqdio_inbound_processing+0x62/0xf8
                        [<0000000000047ba4>] tasklet_action+0x100/0x1f4
                        [<0000000000048b5a>] __do_softirq+0xae/0x154
                        [<0000000000021e4a>] do_softirq+0xea/0xf0
                        [<00000000000485de>] irq_exit+0xde/0xe8
                        [<0000000000268c64>] do_IRQ+0x160/0x1fc
                        [<00000000000261a2>] io_return+0x0/0x8
                        [<000000000001b8f8>] cpu_idle+0x17c/0x224
   hardirq-on-W at:
                        [<0000000000072190>] __lock_acquire+0x784/0x18bc
                        [<000000000007335c>] lock_acquire+0x94/0xbc
                        [<00000000003d702c>] _spin_lock+0x5c/0x9c
                        [<00000000002caff6>] zfcp_fsf_req_send+0x3e/0x158
                        [<00000000002ce7fe>] zfcp_fsf_exchange_config_data+0x106/0x124
                        [<00000000002c8948>] zfcp_erp_adapter_strategy_generic+0x1c0/0x3d0
                        [<00000000002c98ea>] zfcp_erp_thread+0xcde/0x1318
                        [<000000000001b0d2>] kernel_thread_starter+0x6/0xc
                        [<000000000001b0cc>] kernel_thread_starter+0x0/0xc
 }
 ... key      at: [<0000000000e356c8>] __key.26629+0x0/0x8

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmit@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:44:37 -05:00
Christof Schmitt
26816f1c2b [SCSI] zfcp: Dont clear reference from SCSI device to unit
It is possible that a remote port has a problem, the SCSI device gets
deleted after the rport timeout and then the timeout for pending SCSI
commands triggers an abort. For this case, don't delete the reference
from the SCSI device to the zfcp unit, so that we can still have the
reference to issue an abort request.

Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
2008-11-05 12:44:15 -05:00