Decouple the creation and destruction of the net_device from the order
of discovery and removal of nodes with RFC 2734 unit directories since
there is no reliable order. The net_device is now created when the
first RFC 2734 unit on a card is discovered, and destroyed when the last
RFC 2734 unit on a card goes away. This includes all remote units as
well as the local unit, which is therefore tracked as a peer now too.
Also, locking around the list of peers is slightly extended to guard
against peer removal. As a side effect, fwnet_peer.pdg_lock has become
superfluous and is deleted.
Peer data (max_rec, speed, node ID, generation) are updated more
carefully.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The driver is now called firewire-net. It might implement the transport
of other networking protocols in the future, notably IPv6 per RFC 3146.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Implement IPv4 over IEEE 1394 as per RFC 2734 for the newer firewire
stack. This feature has only been present in the older ieee1394 stack
via the eth1394 driver.
Still to do:
- fix ipv4_priv and ipv4_node lifetime logic
- fix determination of speeds and max payloads
- fix bus reset handling
- fix unaligned memory accesses
- fix coding style
- further testing/improvement of fragment reassembly
- perhaps multicast support
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (rebased, copyright note, changelog)
Tlabel is a 6 bits wide datum. Wrap it after 63 rather than 31 for more
safety against transaction label exhaustion and potential responders'
transaction layer bugs. (As noted by Guus Sliepen, this change requires
an expansion of tlabel_mask to 64 bits.)
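
A minimal sketch of what a 6-bit label allocator with a 64-bit mask can
look like. The struct and function names are hypothetical stand-ins for
illustration, not the actual firewire-core code:

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-card transaction label state. */
struct tlabel_state {
	uint64_t mask;	/* bit n set: label n is in flight */
	int current;	/* next label to try, 0..63 */
};

/* Returns an unused 6-bit label, or -1 if the next label is still busy. */
static int allocate_tlabel(struct tlabel_state *s)
{
	int tlabel = s->current;
	uint64_t bit = UINT64_C(1) << tlabel;

	if (s->mask & bit)
		return -1;

	s->current = (s->current + 1) & 0x3f;	/* wrap after 63, not 31 */
	s->mask |= bit;

	return tlabel;
}

/* Marks a label as free again once its transaction has completed. */
static void release_tlabel(struct tlabel_state *s, int tlabel)
{
	s->mask &= ~(UINT64_C(1) << tlabel);
}
```

With a 32-bit mask, a shift by 32 or more would be undefined, which is
why the wider wrap requires the wider mask.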
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This extra check will avoid Broadcast_Channel register related traffic
to many IIDC, SBP-2, and AV/C devices which aren't IRMC or have a
max_rec < 8 (i.e. support < 512 bytes async payload). This avoids a
little bit of traffic after bus reset and is even more careful with
devices which don't implement this CSR.
The assumption is that no other protocol than IP over 1394 uses the
broadcast channel for streams.
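
For illustration, the relation between max_rec and payload size, and the
shape of such a check, can be sketched as follows. The helper names and
the irmc parameter are assumptions made for this example only:

```c
/* max_rec from the bus info block encodes 2^(max_rec + 1) bytes. */
static unsigned int max_rec_to_payload(unsigned int max_rec)
{
	return 1u << (max_rec + 1);
}

/*
 * Hypothetical helper: only bother with BROADCAST_CHANNEL if the node
 * is IRM capable and supports at least 512 bytes async payload.
 */
static int may_use_broadcast_channel(int irmc, unsigned int max_rec)
{
	return irmc && max_rec >= 8;
}
```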
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The IP-over-1394 driver will add child devices beneath card devices
which are not of type fw_device. Hence firewire-core's callbacks in
device_for_each_child() and device_find_child() need to check for the
device type now.
Initial version written by Jay Fenlason.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Retrieval of an fw_unit's parent is a common pattern in high-level code.
Wrap it up as device = fw_parent_device(unit).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The source files of firewire-core, firewire-ohci, firewire-sbp2, i.e.
"drivers/firewire/fw-*.c"
are renamed to
"drivers/firewire/core-*.c",
"drivers/firewire/ohci.c",
"drivers/firewire/sbp2.c".
The old fw- prefix was redundant to the directory name. The new core-
prefix distinguishes the files according to which driver they belong to.
This change comes a little late, but still before further firewire
drivers are added as anticipated RSN.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The three header files of firewire-core, i.e.
"drivers/firewire/fw-device.h",
"drivers/firewire/fw-topology.h",
"drivers/firewire/fw-transaction.h",
are replaced by
"drivers/firewire/core.h",
"include/linux/firewire.h".
The latter includes everything which a firewire high-level driver (like
firewire-sbp2) needs besides linux/firewire-constants.h, while core.h
contains the rest which is needed by firewire-core itself and by low-
level drivers (card drivers) like firewire-ohci.
High-level drivers can now also reside outside of drivers/firewire
without having to add drivers/firewire to the header file search path in
makefiles. At least the firedtv driver will be such a driver.
I also considered spreading the contents of core.h over several files,
one for each .c file where the respective implementation resides. But
it turned out that most core .c files will end up including most of the
core .h files. Also, the combined core.h isn't unreasonably big, and it
will lose more of its contents to linux/firewire.h anyway soon when more
firewire drivers are added. (IP-over-1394, firedtv, and there are plans
for one or two more.)
Furthermore, fw-ohci.h is renamed to ohci.h. The name of core.h and
ohci.h is chosen with regard to name changes of the .c files in a
follow-up change.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Include required headers which were only indirectly included.
Remove unused includes and an unused constant.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
In the unlikely event that card->driver->get_bus_time() is called during
a cycle64Seconds interrupt, we could read garbage unless atomic accesses
are used.
The switch to atomic ops requires changing the 64 seconds counter from
unsigned to signed, but this shouldn't matter to the end result.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Due to AV/C protocol extensions, FireDTV devices need a vendor-specific
driver. But their configuration ROM features a vendor ID only in the
root directory, not in the unit directory.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
That way, the new firedtv driver will be able to use a single ID table
in builds against ieee1394 core and/or against firewire core.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This adds the attribute /sys/bus/firewire/devices/fw[0-9]+/units. It
can be used in udev rules like the following ones:
# IIDC devices: industrial cameras and some webcams
SUBSYSTEM=="firewire", ATTR{units}=="*0x00a02d:0x00010?*", GROUP="video"
# AV/C devices: camcorders, set-top boxes, TV sets, audio devices, ...
SUBSYSTEM=="firewire", ATTR{units}=="*0x00a02d:0x010001*", GROUP="video"
Background:
firewire-core manages two device types:
- fw_device is a FireWire node. A character device file is associated
with it.
- fw_unit is a unit directory on a node. Each fw_device may have 0..n
children of type fw_unit. The units tell us what kinds of protocols
a node implements.
We want to set ownership or ACLs or permissions of the character device
file of an fw_device, or/and create symlinks to it, based on available
protocols. Until now udev rules had to look at the fw_unit devices and
then modify their parent's character device file accordingly. This is
problematic for two reasons: 1) It happens sometime after the creation
of the fw_device, 2) an access policy may require that information from
all children is evaluated before a decision about the parent is made.
Problem 1) can ultimately not be avoided since this is the nature of
FireWire nodes: They may add or remove unit directories at any point in
time.
However, we can still help userland a lot by providing the protocol type
information of all units in a summary sysfs attribute directly at the
fw_device. This way,
- the information is immediately available at the affected device
when userspace goes about to handle an ADD or CHANGE event of the
fw_device,
- with most policies, it won't be necessary anymore to dig through
child attributes.
The new attribute is called "units". It contains space-separated tuples
of specifier_id and version of each present unit. The delimiter within
tuples is a colon. Specifier_id and version are printed as 0x%06x.
Here is an example of a node which implements an IPv4 unit and an IPv6
unit:
    $ cat /sys/bus/firewire/devices/fw2/units
    0x00005e:0x000001 0x00005e:0x000002
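
The format described above can be reproduced in userspace with a short
sketch. struct unit_id and format_units() are hypothetical names used
only for this illustration; the kernel's sysfs show function may differ
in detail (e.g. a trailing newline):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical pair of values read from one unit directory. */
struct unit_id {
	unsigned int specifier_id;
	unsigned int version;
};

/*
 * Writes space-separated "specifier_id:version" tuples into buf.
 * Returns the string length, or -1 if buf is too small.
 */
static int format_units(char *buf, size_t size,
			const struct unit_id *units, size_t n)
{
	size_t pos = 0;

	if (size == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		int len = snprintf(buf + pos, size - pos, "%s0x%06x:0x%06x",
				   i ? " " : "",
				   units[i].specifier_id, units[i].version);

		if (len < 0 || (size_t)len >= size - pos)
			return -1;
		pos += len;
	}

	return (int)pos;
}
```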
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
struct fw_attribute_group.attrs.[] must have enough room for all
attributes. This can and should be checked at build time.
Our previous check at run time was a little late and not reliable since
most of the time fewer than the available attributes are populated.
Furthermore, omit an increment of an index at its last usage.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
My recently added test for a device being local in fw-cdev.c got it
slightly wrong: Comparisons of node IDs are only valid if the
generation is current, which I forgot to check. Normally, serialization
by card->lock takes care of this, but a device in FW_DEVICE_GONE state
will necessarily have a wrong generation and invalid node_id.
The "is it local?" check is made 100% correct and simpler now by means
of a struct fw_device flag which is set at fw_device creation.
Besides the fw-cdev site which was to be fixed, there is another site
which can make use of the new flag, and an RFC-2734 driver will benefit
from it too.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cache the test result of whether a device implements BROADCAST_CHANNEL.
This minimizes traffic on the bus after each bus reset. A majority of
devices does not implement BROADCAST_CHANNEL.
Remove busy retries; just rely on the hardware to retry requests to busy
responders. Remove unnecessary log messages.
Rename the flag is_irm to broadcast_channel_allocated to better reflect
its meaning. Reset the flag earlier in fw_core_handle_bus_reset.
Pass the generation down as a call parameter; that way generation can't
be newer than card->broadcast_channel_allocated and device->node_id.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Per IEEE 1394 clause 8.4.2.5, bus manager capable nodes which are not
incumbent shall wait at least 125ms before trying to establish
themselves as bus manager.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This changes the as yet unreleased FW_CDEV_IOC_SEND_STREAM_PACKET ioctl
to generate an fw_cdev_event_response event just like the other two
ioctls for asynchronous request transmission do. This way, clients get
feedback on successful or unsuccessful transmission.
This also adds input validation for length, tag, channel, sy, speed.
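
A rough sketch of range checks on the stream packet header fields,
based on their widths in the IEEE 1394 isochronous header (tag: 2 bits,
channel: 6 bits, sy: 4 bits). The struct and function names are
hypothetical, and the length and speed checks are left out here:

```c
#include <errno.h>

/* Hypothetical mirror of the fields a client passes for a stream packet. */
struct stream_params {
	unsigned int tag;	/* 2-bit field */
	unsigned int channel;	/* 6-bit field */
	unsigned int sy;	/* 4-bit field */
};

/* Rejects values which do not fit into the respective header field. */
static int check_stream_params(const struct stream_params *p)
{
	if (p->tag > 3 || p->channel > 63 || p->sy > 15)
		return -EINVAL;

	return 0;
}
```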
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This changes the ioctl() return value of FW_CDEV_IOC_SEND_REQUEST and of
the as yet unreleased FW_CDEV_IOC_SEND_BROADCAST_REQUEST. They used to
return
sizeof(struct fw_cdev_send_request *) + data_length
which is obviously a failed attempt to emulate the return value of
raw1394's respective interface which uses write() instead of ioctl().
However, the first summand, as size of a kernel pointer, is entirely
meaningless to clients and the second summand is already known to
clients. And the result does not resemble raw1394's write() return
code anyway.
So simplify it to a constant non-negative value, i.e. 0. The only
dangers here would be that future client implementations check for error
by ret != 0 instead of ret < 0 when running on top of an old kernel; or
that current clients interpret ret = 0 or more as failure. But both are
hypothetical cases which don't justify returning confusing values.
While we touch this code, also remove "& 0x1f" from tcode in the call of
fw_send_request. The tcode cannot be bigger than 0x1f at this point.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The bus reset handler concurrently frees client->device->node. Use
device->node_id instead. This is equivalent to device->node->node_id
while device->generation is current.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The access permissions and ownership or ACL of /dev/fw* character device
files will typically be set based on the device type of the respective
nodes, as obtained by firewire-core from descriptors in the device's
configuration ROM. An example policy is to deny write permission by
default but grant write permission to files of AV/C video and audio
devices and IIDC video devices.
The FW_CDEV_IOC_ADD_DESCRIPTOR ioctl could be used to partly subvert
such a policy: Find a device file with relaxed permissions, use the
ioctl to add a descriptor with AV/C marker to the local node's ROM, thus
gain access to the local node's character device file. (This is only
possible if there are udev scripts installed which actively relax
permissions for known device types and if there is a device of such a
type connected.)
Accessibility of the local node's device file is relevant to host
security if the host contains two or more IEEE 1394 link layer
controllers which are plugged into a single bus.
Therefore change the ABI to deny FW_CDEV_IOC_ADD_DESCRIPTOR if the file
belongs to a remote node. (This change has no impact on known
implementers of the ABI: None of them uses the ioctl yet.)
Also clarify the documentation: The ioctl affects all local nodes, not
just one local node.
Cc: stable@kernel.org
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The as yet unreleased FW_CDEV_IOC_GET_SPEED ioctl puts only a single
integer into the parameter buffer. We can use ioctl()'s return value
instead.
(Also: Some whitespace change in firewire-cdev.h.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This patch adds the ISO broadcast channel support that is required of a
1394a IRM. Specifically, if the local device is the IRM, it allocates ISO
channel 31 and sets the broadcast channel register of all devices on the
local bus to BROADCAST_CHANNEL_INITIAL | BROADCAST_CHANNEL_VALID to indicate
that channel 31 can be used for broadcast messages.
One minor complication is that on startup the local device may become IRM
before all the devices on the bus have been enumerated by the stack. Therefore
we have to keep a "the local device is IRM" flag and possibly set the
broadcast channel register of new devices at enumeration time.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Allow userspace and other firewire drivers (fw-ipv4 I'm looking at
you!) to send Asynchronous Transmit Streams as described in 7.8.3 of
release 1.1 of the 1394 Open Host Controller Interface Specification.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (tweaks)
Standardize on
	if (err)
		handle_error;
and
	if (ret < 0)
		handle_error;
Don't call a variable err if we store values in it which mean success.
Also, offset some return statements by a blank line since this is how we
do it in drivers/firewire.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
reread_bus_info_block() only gets to see devices whose config_rom_length
is at least 6 (ROM header, bus info block, root directory header).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The kernel API documentation says that queue_delayed_work() returns 0
(only) if the work was already queued. The return codes of
schedule_delayed_work() are not documented but the same.
In init_iso_resource(), the work has never been queued yet, hence we
can assume schedule_delayed_work() to be a guaranteed success there.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Some fixes:
- Remove stale documentation.
- Fix a != vs. == thinko that got in the way of channel management.
- Try bandwidth deallocation even if channel deallocation failed.
A simplification:
- fw_cdev_allocate_iso_resource.channels is now ordered like
libdc1394's dc1394_iso_allocate_channel() channels_allowed
argument.
By the way, I looked closer at cards from NEC, TI, and VIA, and noticed
that they all don't implement IEEE 1394a behaviour which is meant to
deviate from IEEE 1212's notion of lock compare-swap. This means that
we have to do two lock transactions instead of one in many cases where
one transaction would already succeed on a fully 1394a compliant IRM.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
DMA must be halted before we DMA-unmap and free the DMA buffer. Since
we cannot rely on the client to stop the context before it closes the
fd, we have to reorder fw_iso_buffer_destroy vs. fw_iso_context_destroy.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
All of these functions are entered with IRQs enabled.
Hence the unconditional spin_unlock_irq can be used.
Function:                     Caller context:
dequeue_event()               client process, via read(2)
fill_bus_reset_event()        fw-device.c update workqueue job
release_client_resource()     client process, via ioctl(2)
fw_device_op_release()        client process, via close(2)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Make the size check of ioctl_send_request and
ioctl_send_broadcast_request speed dependent. Also change the error
return code from -EINVAL to -EIO to distinguish this from other errors
concerning the ioctl parameters.
Another payload size limit for which we don't check here though is the
remote node's Bus_Info_Block.max_rec.
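
As an illustration, a speed dependent size check can be sketched like
this. It relies on the IEEE 1394 maximum asynchronous payloads of 512
bytes at S100, doubling with each speed step; the function name and the
4096 bytes upper bound are assumptions of this sketch:

```c
#include <errno.h>
#include <stddef.h>

/* Speed codes: S100 = 0, S200 = 1, S400 = 2, S800 = 3. */
enum { SCODE_100, SCODE_200, SCODE_400, SCODE_800 };

/* Returns 0 if a payload of this length may be sent at this speed. */
static int check_request_length(size_t length, int speed)
{
	if (length > 4096 || length > ((size_t)512 << speed))
		return -EIO;

	return 0;
}
```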
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
We don't want random users to write to Memory Space (e.g. PCs with physical
DMA filters down) or to core CSRs like Reset_Start.
This does not protect SBP-2 target CSRs. But properly behaving SBP-2
targets ignore broadcast write requests to these registers, and the
maximum damage which can happen with laxer targets is DOS. But there
are ways to create DOS situations anyway if there are devices with weak
device file permissions (like audio/video devices) present at the same
bus as an SBP-2 target.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Write transactions to the broadcast node ID are a convenient way to
trigger functions of multiple nodes at once. IIDC is a protocol which
can make use of this if multiple cameras with same command_regs_base are
connected at the same bus.
Based on
Date: Wed, 10 Sep 2008 11:32:16 -0400
From: Jay Fenlason <fenlason@redhat.com>
Subject: [patch] SEND_BROADCAST_REQUEST
Changes: ioctl_send_request() and ioctl_send_broadcast_request() now
share code. Broadcast speed corrected to S100. Check for proper tcode.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
While the speed of asynchronous transactions is automatically chosen by
the kernel, the speed of isochronous streams has to be chosen by the
initiating client.
In case of 1394a bus topologies, the maximum possible speed could be
figured out with some effort by evaluation of the remote node's link
speed field in the config ROM, the local node's link speed field, and
the PHY speeds and topologic information in the local node's or IRM's
topology map CSR. However, this does not work in case of 1394b buses.
Hence add an ioctl to export the maximum speed which the kernel already
determined.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This adds ioctls for allocation and deallocation of a channel or/and
bandwidth without auto-reallocation and without auto-deallocation.
The benefit of these ioctls is that libraw1394-style isochronous
resource management can be implemented without write access to the IRM's
character device file.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Based on
Date: Tue, 18 Nov 2008 11:41:27 -0500
From: Jay Fenlason <fenlason@redhat.com>
Subject: [Patch V4] Add ISO resource management support
with several changes to the ABI and implementation. Only the part of
the ABI which enables auto-reallocation and auto-deallocation is
included here.
This implements ioctls for kernel-assisted allocation of isochronous
channels and isochronous bandwidth. The benefits are:
- The client does not have to have write access to the /dev/fw* device
corresponding to the IRM.
- The client does not have to perform reallocation after bus resets.
- Channel and bandwidth are deallocated by the kernel if the file is
closed before the client deallocated the resources. Thus resources
are released even if the client crashes.
It is anticipated that future in-kernel code (firewire-core IRM code;
the firewire port of firedtv), will use the fw-iso.c portions of this
code too.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: David Moore <dcm@acm.org>
to indicate that they are specializations of struct event or of struct
client_resource, respectively.
struct response was both an event and a client_resource; it is now split
into struct outbound_transaction_resource and ~_event in order to
document more explicitly which types of client resources exist.
struct request and struct request_event are renamed to struct
inbound_transaction_resource and ~_event because requests and responses
occur in outbound and in inbound transactions.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The lifetime of struct client instances must be longer than the lifetime
of any client resource.
This fixes a possible race between fw_device_op_release and transaction
completions. It also prepares for new ioctls for isochronous resource
management which will involve delayed processing of client resources.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reviewed-by: David Moore <dcm@acm.org>
OHCI-1394 1.1 clause 10.4.3 says: "If more than one IR DMA context
specifies receives for packets from the same isochronous channel, the
context destination for that channel's packets is undefined."
Any userspace client and in the future also kernelspace clients can
allocate IR DMA contexts for any channel. We don't want them to
interfere with each other, hence it is preferable to return -EBUSY if
allocation of a second context for a channel is attempted.
Notes:
- This limitation is OHCI-1394 specific, therefore its proper place of
implementation is down in the low-level driver.
- Since the <linux/firewire-cdev.h> ABI simply maps one userspace iso
client context to one hardware iso context, this OHCI-1394
limitation alas requires userspace to implement its own multiplexing
of iso reception from the same channel and card to multiple clients
when needed.
- The limitation is independent of channel allocation at the IRM; the
latter is really only important for the initiation of iso
transmission but not of iso reception.
- We don't need to do the same for IT DMA because OHCI-1394 does not
have any ties between IT contexts and channels. Only the voluntary
channel allocation protocol via the IRM, globally to the FireWire
bus, can ensure proper isochronous transmit behaviour anyway.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Like before my commit 1415d9189e,
fw_core_add_address_handler() does not align the address region now.
Instead the caller is required to pass valid parameters.
Since one of the callers of fw_core_add_address_handler() is the cdev
userspace interface, we now check for valid input. If the client is
buggy, we give it a hint with -EINVAL.
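
The kind of input check meant here could look roughly like the sketch
below. Which conditions exactly are tested is an assumption of this
example (quadlet alignment, a non-empty region, and bounds within the
48-bit node address space); the struct and function names are
hypothetical:

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical address range [start, end) as passed by a client. */
struct address_region {
	uint64_t start;
	uint64_t end;
};

/* Returns 0 for a plausible region, -EINVAL as a hint to buggy callers. */
static int check_address_region(const struct address_region *r)
{
	if ((r->start & UINT64_C(0xffff000000000003)) ||	/* > 48 bits or unaligned */
	    (r->end & 3) ||					/* unaligned end */
	    r->start >= r->end ||				/* empty or reversed */
	    r->end > UINT64_C(0x0001000000000000))		/* beyond 48 bits */
		return -EINVAL;

	return 0;
}
```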
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The current code uses a linked list and a counter for storing
resources and the corresponding handle numbers. By changing to an idr
we can be safe from counter wrap-around giving two resources the same
handle.
Furthermore, the deallocation ioctls now check whether the resource to
be freed is of the intended type.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Some rework by Stefan R:
- The idr API documentation says we get an ID within 0...0x7fffffff.
Hence we can rest assured that idr handles fit into cdev handles.
- Fix some races. Add a client->in_shutdown flag for this purpose.
- Add allocation retry to add_client_resource().
- It is possible to use idr_for_each() in fw_device_op_release().
- Fix ioctl_send_response() regression.
- Small style changes.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Unlink the client from the fw_device earlier in order to prevent bus
reset events being added to client->event_list during shutdown.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The behaviour of fw-transaction.c::fw_send_request is ill-defined for
any other tcodes than read/write/lock request tcodes. Therefore
prevent requests with wrong tcodes from entering the transaction layer.
Maybe fw_send_request should check them itself, but I am not inclined to
change it and fw_fill_request from void-valued functions to ones which
return error codes and pass those up. Besides, maybe fw_send_request is
going to support one more tcode than ioctl_send_request in the future
(TCODE_STREAM_DATA).
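
A sketch of such a filter, using the transaction codes as defined by
IEEE 1394. The function name check_tcode() is hypothetical, and any
additional driver-specific lock tcodes are deliberately left out of
this example:

```c
#include <errno.h>

/* Request transaction codes per IEEE 1394. */
#define TCODE_WRITE_QUADLET_REQUEST	0x0
#define TCODE_WRITE_BLOCK_REQUEST	0x1
#define TCODE_READ_QUADLET_REQUEST	0x4
#define TCODE_READ_BLOCK_REQUEST	0x5
#define TCODE_LOCK_REQUEST		0x9
#define TCODE_STREAM_DATA		0xa

/* Lets only read/write/lock requests through to the transaction layer. */
static int check_tcode(int tcode)
{
	switch (tcode) {
	case TCODE_WRITE_QUADLET_REQUEST:
	case TCODE_WRITE_BLOCK_REQUEST:
	case TCODE_READ_QUADLET_REQUEST:
	case TCODE_READ_BLOCK_REQUEST:
	case TCODE_LOCK_REQUEST:
		return 0;

	default:
		return -EINVAL;
	}
}
```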
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This adds a client_list_lock, which only protects the device's
client_list, so that future versions of the driver can call code that
takes the card->lock while holding the client_list_lock. Adding this
lock is much simpler than adding __ versions of all the functions that
the future version may need. The one ordering issue is to make sure
code never takes the client_list_lock with card->lock held. Since
client_list_lock is only used in three places, that isn't hard.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Update fill_bus_reset_event() accordingly. Include linux/spinlock.h.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Previously, when an iso context had header_size > 4, the iso header
(len/tag/channel/tcode/sy) was passed to userspace followed by quadlets
stripped from the payload. This patch changes the behavior:
header_size = 8 now passes the header quadlet followed by the timestamp
quadlet. When header_size > 8, quadlets are stripped from the payload.
The header_size = 4 case remains identical.
Since this alters the semantics of the API, the firewire API version
needs to be bumped concurrently with this change.
This change also refactors the header copying code slightly to be much
easier to read.
Signed-off-by: David Moore <dcm@acm.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Petr Vandrovec <petr@vandrovec.name>
After a controller initialization failure, addition of another card got
stuck due to card_list corruption.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
According to https://bugs.launchpad.net/bugs/294391
- 3rd generation iPods need the "fix capacity" workaround after all
(apparently they crash after the last sector was accessed),
- 2nd generation iPods need the "128 kB maximum request size"
workaround.
Alas both iPod generations feature the same model ID in the config ROM,
hence we can only define a shared quirks list entry for them. Luckily
the fix capacity workaround did not show a negative effect in Jarod's
tests with 2nd gen. iPod.
A side note: Apple computers in target mode (or at least an x86 Mac
mini) don't have firmware_version and model_id, hence none of the iPod
quirks list entries is active for them.
Tested-by: Jarod Wilson <jarod@redhat.com>
Acked-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reported-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
who also provided a first version of the fix.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1394-2008 clause 16.3.4.1 (1394b-2002 clause 16.3.1.1) defines tighter
limits than 1394-2008 clause 6.2.2.3 (1394a-2000 clause 6.2.2.3).
Our previously too large limit doesn't matter though if the controller
reports its max_receive correctly.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This fixes a regression by "firewire: keep highlevel drivers attached
during brief connection loss": Two seconds of unnecessary waiting were
added to the shutdown procedure of each controller.
We use card->link as status flag to signal the device handler that there
is no use to wait for a come-back.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Camcorders have a tendency to fail read requests to their config ROM and
write requests to their FCP command register with ack_busy_X. This has
become a problem with newer kernels and especially Panasonic camcorders,
causing AV/C in dvgrab and kino to fail. Dvgrab for example frequently
logs "send oops"; kino reports loss of AV/C control. I suspect that
lower CPU scheduling latencies in newer kernels made this issue more
prominent now.
According to
https://sourceforge.net/tracker/?func=detail&atid=114103&aid=2492640&group_id=14103
this can be fixed by configuring the FireWire controller for more
hardware retries for request transmission; these retries are evidently
more successful than libavc1394's own retry loop (typically 3 tries on
top of hardware retries).
Presumably the same issue has been reported at
https://bugzilla.redhat.com/show_bug.cgi?id=449252 and
https://bugzilla.redhat.com/show_bug.cgi?id=477279 .
In a quick test with a JVC camcorder (which didn't malfunction like the
reported camcorders), this change decreased the number of ack_busy_X
from 16 in three runs of dvgrab to 4 in three runs of the same capture
duration.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The present message is mostly just noise. We only need to be notified
if the "active" flag does not go off before the retry loop terminates.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
There are situations when nodes vanish from the bus and come back
quickly thereafter:
- When certain bus-powered hubs are plugged in,
- when certain devices are plugged into 6-port hubs,
- when certain disk enclosures are switched from self-power to bus
power or vice versa and break the daisy chain during the transition,
- when the user plugs a cable out and quickly plugs it back in, e.g.
to reorder a daisy chain (works on Mac OS X if done quickly enough),
- when certain hubs temporarily malfunction during high bus traffic.
Until now, firewire-core reported affected nodes as lost to the
highlevel drivers (firewire-sbp2 and userspace drivers). We now delay
the destruction of device representations until after at least two
seconds after the last bus reset. If a "new" device is detected in this
period whose bus information block and root directory header match that
of a device which is pending for deletion, we resurrect that device and
send update calls to highlevel drivers.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Noticed by Jarod Wilson: The bus manager work was unnecessarily delayed
each time the bus generation counter rolled over.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
The whole topology code only works if the old and new topologies which
are compared come from immediately successive self ID complete events.
If bus resets happened without self ID complete events in the
meantime, or self ID complete events with invalid selfIDs, the topology
comparison could identify nodes wrongly, or more likely just corrupt
kernel memory or panic right away.
We now discard all nodes of the old topology and treat all current nodes
as new ones if the current self ID generation is not the previous one
plus 1.
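
The "previous one plus 1" test can be sketched as below, assuming that
the self ID generation is an 8 bits wide counter which wraps from 255
to 0 (as the selfIDGeneration field of OHCI-1394 does). The function
name is chosen for this illustration:

```c
/* True if new_generation immediately follows old_generation, modulo 256. */
static int is_next_generation(unsigned int new_generation,
			      unsigned int old_generation)
{
	return (new_generation & 0xff) == ((old_generation + 1) & 0xff);
}
```

If this returns false, the old topology cannot be compared with the new
one and all current nodes have to be treated as new.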
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Due to commit 2831fe6f9c, "driver core:
create a private portion of struct device", device_initialize() can no
longer be called from atomic contexts.
We now defer it until after config ROM probing. This requires changes
to the bus manager code because this may use a device before it was
probed.
Reported-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
topology_map is by far the largest member in struct fw_card. Move it to
the very end of the struct so that card pointer dereferences have better
chances to hit the CPU cache.
This requires increasing the topology_map backing store to the size
specified in IEEE 1394, i.e. 256 rather than 255 quadlets. Otherwise
the topology_map response handler may access invalid memory.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
An earlier change, maybe long ago, removed the copying of self_id_count
into card->self_id_count. Since then each bus reset cleared
card->bm_retries even when it shouldn't.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Take a reference to the card whenever fw_card_bm_work() is scheduled on
that card and release it when the work is done. This allows us to
remove the cancel_delayed_work_sync() in fw_core_remove_card().
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (patch update)
What was I thinking when I added sbp2_set_generation()? Its locking did
nothing (except for implicitly providing the necessary barrier between
node IDs update and generation update).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
There is a DMA map/unmap imbalance whenever a block write request
packet is sent and then dequeued with ohci_cancel_packet. The latter
may happen frequently if the AR resp tasklet is executed before the AT
req tasklet for the same transaction.
Add the missing dma_unmap_single. This fixes
https://bugzilla.redhat.com/show_bug.cgi?id=475156
Reported-by: Emmanuel Kowalski
Tested-by: Emmanuel Kowalski
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Add another model ID of a broken firmware to prevent early I/O errors
by accesses at the end of the disk. Reported at linux1394-user,
http://marc.info/?t=122670842900002
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1: There is a small race between queue_delayed_work() and its
corresponding kref_get(). Do the kref_get first, and _put it again
if the queue_delayed_work() failed, so there is no chance of the
kref going to zero while the work is scheduled.
2: An SBP2_LOGOUT_REQUEST could be sent out with a login_id full of
garbage. Initialize it to an invalid value so we can tell if we
ever got a valid login_id.
3: The node ID and generation may have changed but the new values may
not yet have been recorded in lu and tgt when the final logout is
attempted. Use the latest values from the device in
sbp2_release_target().
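
The ordering from item 1 can be sketched in plain C as follows. The
struct and both functions are hypothetical stand-ins for struct kref and
queue_delayed_work(); a real kref is atomic, which this single-threaded
sketch does not model.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's kref and delayed work. */
struct target {
	int refcount;		/* plays the role of struct kref */
	bool work_pending;	/* true while the work item is queued */
};

/* Mimics queue_delayed_work(): fails if the work is already queued. */
static bool queue_work_once(struct target *t)
{
	if (t->work_pending)
		return false;
	t->work_pending = true;
	return true;
}

/* Take the reference first; give it back only if queueing failed. */
static void queue_work_with_ref(struct target *t)
{
	t->refcount++;
	if (!queue_work_once(t))
		t->refcount--;
}
```

Because the reference is taken before the work can possibly run, the
count cannot drop to zero while the work is still scheduled.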
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This optimizes firewire-sbp2's device probe for the case that the local
node and the SBP-2 node were discovered at the same time. In this case,
fw-core's bus management work and fw-sbp2's login and SCSI probe work
are scheduled in parallel (in the globally shared workqueue and in
fw-sbp2's workqueue, respectively). The bus reset from fw-core may then
disturb and extremely delay the login and SCSI probe because the latter
fails with several command timeouts and retries and has to be retried
from scratch.
We avoid this particular situation of sbp2_login() and fw_card_bm_work()
running in parallel by delaying the first sbp2_login() a little bit.
This is meant to be a short-term fix for
https://bugzilla.redhat.com/show_bug.cgi?id=466679. In the long run,
the SCSI probe, i.e. fw-sbp2's call of __scsi_add_device(), should be
parallelized with sbp2_reconnect().
Problem reported and fix tested and confirmed by Alex Kanavin.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The transmit and receive context dma memory was not being freed on
module removal. Neither was the config rom memory. Fix that.
The ab->next assignment is pure paranoia.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
With the bus_resets patch applied, it is easy to see this memory leak
by repeatedly resetting the firewire bus while running slabtop in
another window. Just watch kmalloc-32 grow and grow...
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The "color" is used during the topology building after a bus reset,
however in "struct fw_node"s it is stored in a u8, but in struct fw_card
it is stored in an int. When the value wraps in one struct, but not
the other, disaster strikes.
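
The mismatch can be reproduced in isolation. The two structs below are
hypothetical miniatures, and advancing the color by one per bus reset is
an assumption made only for the illustration.

```c
#include <stdint.h>

/* Hypothetical miniature versions of the two structs. */
struct node { uint8_t color; };
struct card { int color; };

/*
 * Advance both counters 'resets' times and report whether they still
 * compare equal afterwards.
 */
static int colors_agree_after(unsigned int resets)
{
	struct node n = { 0 };
	struct card c = { 0 };
	unsigned int i;

	for (i = 0; i < resets; i++) {
		n.color++;	/* wraps from 255 to 0 */
		c.color++;	/* keeps counting */
	}
	return n.color == c.color;
}
```

The values agree for the first 255 steps and diverge on the 256th, which
is why the problem only shows up after many bus resets.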
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Fixes http://bugzilla.kernel.org/show_bug.cgi?id=10922.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reported by Jay Fenlason: ioctl() did not return as intended
- the size of data read into ioctl_send_request,
- the number of datagrams enqueued by ioctl_queue_iso.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
queuecommand() looked at the remote and local node IDs before it read
the bus generation. The corresponding race with sbp2_reconnect updating
these data was probably impossible to happen though because the current
code blocks the SCSI layer during reconnection. However, better safe
than sorry, especially if someone later improves the code to not block
the SCSI layer.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
1. We don't need to round the SBP-2 segment size limit down to a
multiple of 4 kB (0xffff -> 0xf000). It is only necessary to
ensure quadlet alignment (0xffff -> 0xfffc).
2. Use dma_set_max_seg_size() to tell the DMA mapping infrastructure
and the block IO layer about the restriction. This way we can
remove the size checks and segment splitting in the queuecommand
path.
This assumes that no other code in the firewire stack uses
dma_map_sg() with conflicting requirements. It furthermore assumes
that the controller device's platform actually allows us to set the
segment size to our liking. Assert the latter with a BUG_ON().
3. Also use blk_queue_max_segment_size() to tell the block IO layer
about it. It cannot know it because our scsi_add_host() does not
point to the FireWire controller's device.
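
The two roundings mentioned in item 1 differ only in the alignment that
is applied to the 0xffff limit. A small sketch, with a hypothetical
helper name:

```c
#include <stdint.h>

/* The unaligned segment size limit named in the text. */
#define SBP2_MAX_SEG_SIZE_RAW 0xffffu

/* Round down to a multiple of 'align', which must be a power of two. */
static uint32_t round_down_pow2(uint32_t value, uint32_t align)
{
	return value & ~(align - 1);
}
```

round_down_pow2(SBP2_MAX_SEG_SIZE_RAW, 4) yields 0xfffc (quadlet
alignment), while an alignment of 4096 yields the old limit 0xf000.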
Thanks to Grant Grundler and FUJITA Tomonori for advice.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Share code between fw_send_request + wait_for_completion callers.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Addendum:
Removes an unnecessary struct and an unused retry loop.
Calls it fw_run_transaction() instead of fw_send_request_sync().
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Acked-by: Kristian Høgsberg <krh@redhat.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: Preserve response data alignment bug when it is harmless
Recently, a bug having to do with the alignment of transaction response
data was fixed. However, some apps such as libdc1394 relied on the
presence of that bug in order to function correctly. In order to stay
compatible with old versions of those apps, this patch preserves the bug
in cases where it is harmless to normal operation (such as the single
quadlet read) due to a simple duplication of data. This guarantees
maximum compatibility for those users who are using the old app with the
fixed kernel.
Signed-off-by: David Moore <dcm@acm.org>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: state userland requirements in Kconfig help
firewire: avoid memleak after phy config transmit failure
firewire: fw-ohci: TSB43AB22/A dualbuffer workaround
firewire: queue the right number of data
firewire: warn on unfinished transactions during card removal
firewire: small fw_fill_request cleanup
firewire: fully initialize fw_transaction before marking it pending
firewire: fix race of bus reset with request transmission
Add per-device dma_mapping_ops support for CONFIG_X86_64 as POWER
architecture does:
This enables us to cleanly fix the Calgary IOMMU issue that some devices
are not behind the IOMMU (http://lkml.org/lkml/2008/5/8/423).
I think that per-device dma_mapping_ops support would be also helpful for
KVM people to support PCI passthrough but Andi thinks that this makes it
difficult to support the PCI passthrough (see the above thread). So I
CC'ed this to KVM camp. Comments are appreciated.
A pointer to dma_mapping_ops to struct dev_archdata is added. If the
pointer is non NULL, DMA operations in asm/dma-mapping.h use it. If it's
NULL, the system-wide dma_ops pointer is used as before.
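
The lookup rule from the paragraph above, as a userspace sketch. The
structs are heavily reduced, hypothetical versions of the kernel ones,
and get_dma_ops() is an illustrative name, not necessarily the function
added by the patch.

```c
#include <stddef.h>

/* Hypothetical, heavily reduced versions of the kernel structures. */
struct dma_mapping_ops { const char *name; };
struct dev_archdata { const struct dma_mapping_ops *dma_ops; };
struct device { struct dev_archdata archdata; };

static const struct dma_mapping_ops system_dma_ops = { "system-wide" };

/* Prefer the per-device ops; fall back to the system-wide ones. */
static const struct dma_mapping_ops *get_dma_ops(const struct device *dev)
{
	if (dev != NULL && dev->archdata.dma_ops != NULL)
		return dev->archdata.dma_ops;
	return &system_dma_ops;
}
```

A device that sits behind an IOMMU would get its own ops pointer set,
while all other devices keep using the system-wide default.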
If it's useful for KVM people, I plan to implement a mechanism to register
a hook called when a new pci (or dma capable) device is created (it works
with hot plugging). It enables IOMMUs to set up an appropriate
dma_mapping_ops per device.
The major obstacle is that dma_mapping_error doesn't take a pointer to the
device unlike other DMA operations. So x86 can't have dma_mapping_ops per
device. Note all the POWER IOMMUs use the same dma_mapping_error function
so this is not a problem for POWER but x86 IOMMUs use different
dma_mapping_error functions.
The first patch adds the device argument to dma_mapping_error. The patch
is trivial but large since it touches lots of drivers and dma-mapping.h in
all the architectures.
This patch:
dma_mapping_error() doesn't take a pointer to the device unlike other DMA
operations. So we can't have dma_mapping_ops per device.
Note that POWER already has dma_mapping_ops per device but all the POWER
IOMMUs use the same dma_mapping_error function. x86 IOMMUs use device
argument.
[akpm@linux-foundation.org: fix sge]
[akpm@linux-foundation.org: fix svc_rdma]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix bnx2x]
[akpm@linux-foundation.org: fix s2io]
[akpm@linux-foundation.org: fix pasemi_mac]
[akpm@linux-foundation.org: fix sdhci]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc]
[akpm@linux-foundation.org: fix ibmvscsi]
Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Cc: Muli Ben-Yehuda <muli@il.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Avi Kivity <avi@qumranet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use only statically allocated data for PHY config packet transmission.
With the previous incarnation, some data wouldn't be freed if the packet
transmit callback was never called.
A theoretical drawback now is that, in PCs with more than one card,
card A may complete() for a waiter on card B. But this is highly
unlikely and its impact not serious. Bus manager B may reset bus B
before the PHY config went out, but the next phy config on B should be
fine. However, with a timeout of 100ms, this situation is close to
impossible.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Isochronous reception in dualbuffer mode is reportedly broken with
TI TSB43AB22A on x86-64. Descriptor addresses above 2G have been
determined as the trigger:
https://bugzilla.redhat.com/show_bug.cgi?id=435550
Two fixes are possible:
- pci_set_consistent_dma_mask(pdev, DMA_31BIT_MASK);
at least when IR descriptors are allocated, or
- simply don't use dualbuffer.
This fix implements the latter workaround.
But we keep using dualbuffer on x86-32 which won't give us highmem (and
thus physical addresses outside the 31bit range) in coherent DMA memory
allocations. Right now we could for example also whitelist PPC32, but
DMA mapping implementation details are expected to change there.
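
For illustration, the 31-bit range that the first option would enforce
can be tested like this. The macro and function names are hypothetical;
only the 2G boundary comes from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Highest address covered by a 31-bit mask, i.e. just below 2G. */
#define MASK_31BIT 0x7fffffffULL

/* True if 'bus_addr' can be expressed within 31 bits. */
static bool fits_31bit_mask(uint64_t bus_addr)
{
	return (bus_addr & ~MASK_31BIT) == 0;
}
```

Descriptor addresses for which this returns false are the ones reported
to trigger the dualbuffer problem.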
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
On some platforms, struct fw_cdev_event_response contains 4 padding
bytes, and its member __u32 data points at these padding bytes. When
complete_transaction() in fw-cdev.c queues the response and the data, it
queues them like this:
|response(excluding padding bytes)|4 padding bytes|4 padding bytes|data.
It queues 4 extra bytes. That is to say, it uses "&response +
sizeof(response)" while other places in the kernel and the userspace
library use "&response + offsetof(typeof(response), data)". As a result,
the last 4 bytes of data are lost. This patch fixes that without
changing the struct definition.
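
The difference between the two offsets can be checked with a struct of
the same shape. The layout below is modelled on fw_cdev_event_response
from memory, so the field list is an assumption and should be compared
with the real header.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed layout; field names are quoted from memory. */
struct event_response {
	uint64_t closure;
	uint32_t type;
	uint32_t rcode;
	uint32_t length;
	uint32_t data[];	/* payload starts here */
};

/* Tail padding that sizeof() counts but the payload offset does not. */
static size_t response_tail_padding(void)
{
	return sizeof(struct event_response) -
	       offsetof(struct event_response, data);
}
```

Where uint64_t is 8-byte aligned, the struct is padded to a multiple of
8, so sizeof() exceeds the offset of data by 4 bytes. Where it is 4-byte
aligned, both values coincide and the bug stays hidden.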
Signed-off-by: JiSheng Zhang <jszhang3@mail.ustc.edu.cn>
This fixes responses to outbound block read requests on 64bit architectures.
Tested on i686, x86-64, and x86-64 with i686 userland, using firecontrol and
gscanbus.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* 'sbp2-spindown' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
ieee1394: sbp2: spin disks down on suspend and shutdown
firewire: fw-sbp2: spin disks down on suspend and shutdown
ieee1394: sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
firewire: fw-sbp2: fix spindown for PL-3507 and TSB42AA9 firmwares
scsi: sd: optionally set power condition in START STOP UNIT
After card->done and card->work are completed, any remaining pending
request would be a bug. We cannot safely complete a transaction at
that point anymore.
IOW card users must not drop their last fw_card reference (usually
indirect references through fw_device references) before their last
outbound transaction through that card was finished.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
- better name for a function argument
- removal of a local variable which became unnecessary after
"fully initialize fw_transaction before marking it pending"
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
In theory, card->flush_timer could already access a transaction between
fw_send_request()'s spin_unlock_irqrestore and the rest of what happens
in fw_send_request(). This would happen if the process which sends the
request is preempted and put to sleep right after spin_unlock_irqrestore
for longer than 100ms.
Therefore we fill in everything in struct fw_transaction which the
flush_timer might look at before we lift the lock.
To do: Ensure that the timer does not pick up the transaction before
the time of the AT request event plus split transaction timeout.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Reported by Jay Fenlason: A bus reset tasklet may call
fw_flush_transactions and touch transactions (call their callback which
will free them) while the context which submitted the transaction is
still inserting it into the transmission queue.
A simple solution to this problem is to _not_ "flush" the transactions
because of a bus reset (complete the transactions as 'cancelled'). They
will now simply time out (completed as 'cancelled' by the split-timeout
timer).
Jay Fenlason thought of this fix too but I was quicker to type it out.
:-)
Background:
Contexts which access an instance of struct fw_transaction are:
1. the submitter, until it inserted the packet which is embedded in the
transaction into the AT req DMA,
2. the AsReqTrContext tasklet when the request packet was acked by the
responder node or transmission to the responder failed,
3. the AsRspRcvContext tasklet when it found a request which matched
an incoming response,
4. the card->flush_timer when it picks up timed-out transactions to
cancel them,
5. the bus reset tasklet when it cancels transactions (this access is
eliminated by this patch),
6. a process which shuts down an fw_card (unregisters it from fw-core
when the controller is unbound from fw-ohci) --- although in this
case there shouldn't really be any transactions anymore because we
wait until all card users finished their business with the card.
All of these contexts run concurrently (except for the 6th, presumably).
The 1st is safe against the 2nd and 3rd because of the way a request
packet is carefully submitted to the hardware. A race between 2nd and
3rd has been fixed a while ago (bug 9617). The 4th is almost safe
against 1st, 2nd, 3rd; there are issues with it if huge scheduling
latencies occur, to be fixed separately. The 5th looks safe against
2nd, 3rd, and 4th but is unsafe against 1st. Maybe this could be fixed
with an explicit state variable in struct fw_transaction. But this
would require fw_transaction to be rewritten as an only dynamically
allocatable object with reference counting --- not a good solution if we
also can simply kill this 5th accessing context (replace it by the 4th).
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Contrary to a comment in the source, request->ack of a broadcast write
request can be ACK_PENDING. Hence the existing check is insufficient.
Debug dmesg before:
AR spd 0 tl 00, ffc0 -> ffff, ack_pending , QW req, fffff0000234 = ffffffff
AT spd 0 tl 00, ffff -> ffc0, ack_complete, W resp
And the requesting node (linux1394) reports an unsolicited response.
Debug dmesg after:
AR spd 0 tl 00, ffc0 -> ffff, ack_pending , QW req, fffff0000234 = ffffffff
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This is a functionally equivalent replacement of the current reference
counting of struct fw_card instances. It only converts it to common
idioms as suggested by Kristian Høgsberg:
- struct kref replaces atomic_t as the counter.
- wait_for_completion is used to wait for all card users to complete.
BTW, it may make sense to count card->flush_timer and card->work as
card users too.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This instructs sd_mod to send START STOP UNIT on suspend and resume,
and on driver unbinding or unloading (including when the system is shut
down).
We don't do this though if multiple initiators may log in to the target.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Tino Keitel <tino.keitel@gmx.de>
Reported by Tino Keitel: PL-3507 with firmware from Prolific does not
spin down the disk on START STOP UNIT with power condition = 0 and start
= 0. It does however work with power condition = 2 or 3.
Also found while investigating this: DViCO Momobay CX-1 and FX-3A (TI
TSB42AA9/A based) become unresponsive after START STOP UNIT with power
condition = 0 and start = 0. They stay responsive if power condition is
set when stopping the motor.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Tested-by: Tino Keitel <tino.keitel@gmx.de>
There is a small off-by-one bug in firewire-sbp2. This causes problems
when a device exports multiple LUN Directories. I found it when trying
to talk to a SONY DVD Jukebox.
Signed-off-by: Richard Sharpe <realrichardsharpe@gmail.com>
Acked-by: Kristian Høgsberg <krh@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (op. order, changelog)
Emphasize the recommendation to build only one stack.
Trim the prompts to better fit into short attention spans.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
If the low-level driver failed to initialize a card properly without
noticing it, fw-core was blocked indefinitely when trying to send a
PHY config packet. This hung up the events kernel thread, e.g. locked
up keyboard input.
https://bugzilla.redhat.com/show_bug.cgi?id=444694
https://bugzilla.redhat.com/show_bug.cgi?id=446763
This problem was introduced between 2.6.25 and 2.6.26-rc1 by commit
2a0a259049 "firewire: wait until PHY
configuration packet was transmitted (fix bus reset loop)".
The solution is to wait with timeout. I tested it with 7 different
working controllers and 1 non-working controller. On the working ones,
the packet callback complete()s usually --- but not always --- before a
timeout of 10ms. Hence I chose a safer timeout of 100ms.
On the few tests with the non-working controller ALi M5271, PHY config
packet transmission always timed out so far. (Fw-ohci needs to be fixed
for this controller independently of this deadline fix. Often the core
doesn't even attempt to send a phy config because not even self ID
reception works.)
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
The messages which can be enabled by fw-ohci's debug module parameter
are changed from KERN_DEBUG to KERN_NOTICE level and uniformly prefixed
with "firewire_ohci: ". This further simplifies communication with
users when we ask them to capture debug messages.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Callers of fill_bus_reset_event() have to take card->lock. Otherwise
access to node data may oops if node removal is in progress.
A lockless alternative would be
- event->local_node_id = card->local_node->node_id;
+ tmp = fw_node_get(card->local_node);
+ event->local_node_id = tmp->node_id;
+ fw_node_put(tmp);
and ditto with the other node pointers which fill_bus_reset_event()
accesses. But I went the locked route because one of the two callers
already holds the lock. As a bonus, we don't need the memory barrier
anymore because device->generation and device->node_id are written in
a card->lock protected section.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
OHCI 1.1 clause 5.10 requires that selfIDBufferPtr is valid when a 1 is
written into LinkControl.rcvSelfID.
This driver bug has so far not been known to cause harm because most
chips obviously accept a later selfIDBufferPtr write, at least before
HCControl.linkEnable is written.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Kristian Høgsberg <krh@redhat.com>
We want the rcvPhyPkt bit in LinkControl off before we start using the
chip. However, the spec says that the reset value of it is undefined.
Hence switch it explicitly off.
https://bugzilla.redhat.com/show_bug.cgi?id=244576#c48 shows that for
example the nForce2 integrated FireWire controller seems to have it on
by default.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
header_length and payload_length are filled with random data if an
unknown tcode was read from the AR buffer (i.e. if the AR buffer
contained invalid data).
We still need a better strategy to recover from this, but at least
handle_ar_packet now doesn't return out of bound buffer addresses
anymore.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
BUG() at this place is wrong. (Unless the low-level driver already did
higher-level input validation of incoming request headers.)
Invalid incoming requests or bugs in the controller which corrupt the
AR-req buffer needlessly crashed the box because this is run in tasklet
context.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
If userspace ignores the POLLERR bit from poll(), and only attempts to
read() the device when POLLIN is set, it can still make ioctl() calls on
a device that has been removed from the system. The node_id and
generation returned by GET_INFO will be outdated, but INITIATE_BUS_RESET
would still cause a bus reset, and GET_CYCLE_TIMER will return data.
And if you guess the correct generation to use, you can send requests to
a different device on the bus, and get responses back.
This patch prevents open, ioctl, compat_ioctl, and mmap against shutdown
devices.
Signed-off-by: Jay Fenlason <fenlason@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
* git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6:
[SCSI] aic94xx: fix section mismatch
[SCSI] u14-34f: Fix 32bit only problem
[SCSI] dpt_i2o: sysfs code
[SCSI] dpt_i2o: 64 bit support
[SCSI] dpt_i2o: move from virt_to_bus/bus_to_virt to dma_alloc_coherent
[SCSI] dpt_i2o: use standard __init / __exit code
[SCSI] megaraid_sas: fix suspend/resume sections
[SCSI] aacraid: Add Power Management support
[SCSI] aacraid: Fix jbod operations scan issues
[SCSI] aacraid: Fix warning about macro side-effects
[SCSI] add support for variable length extended commands
[SCSI] Let scsi_cmnd->cmnd use request->cmd buffer
[SCSI] bsg: add large command support
[SCSI] aacraid: Fix down_interruptible() to check the return value correctly
[SCSI] megaraid_sas; Update the Version and Changelog
[SCSI] ibmvscsi: Handle non SCSI error status
[SCSI] bug fix for free list handling
[SCSI] ipr: Rename ipr's state scsi host attribute to prevent collisions
[SCSI] megaraid_mbox: fix Dell CERC firmware problem
- struct scsi_cmnd had a 16 bytes command buffer of its own.
This is an unnecessary duplication and copy of request's
cmd. It is probably left overs from the time that scsi_cmnd
could function without a request attached. So clean that up.
- Once above is done, few places, apart from scsi-ml, needed
adjustments due to changing the data type of scsi_cmnd->cmnd.
- Lots of drivers still use MAX_COMMAND_SIZE. So I have left
that #define but equate it to BLK_MAX_CDB. The way I see it,
and as reflected in the patch below:
MAX_COMMAND_SIZE - means: The longest fixed-length (*) SCSI CDB
as per the SCSI standard and is not related
to the implementation.
BLK_MAX_CDB. - The allocated space at the request level
- I have audited all ISA drivers and made sure none use ->cmnd in a DMA
Operation. Same audit was done by Andi Kleen.
(*)fixed-length here means commands whose size can be determined
by their opcode and the CDB does not carry a length specifier, (unlike
the VARIABLE_LENGTH_CMD(0x7f) command). This is actually not exactly
true and the SCSI standard also defines extended commands and
vendor specific commands that can be bigger than 16 bytes. The kernel
will support these using the same infrastructure used for VARLEN CDB's.
So in effect MAX_COMMAND_SIZE means the maximum size command
scsi-ml supports without specifying a cmd_len by ULD's
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: fw-sbp2: log scsi_target ID at release
ieee1394: fix NULL pointer dereference in sysfs access
None of these files use any of the functionality promised by
asm/semaphore.h. It's possible that they rely on it dragging in some
unrelated header file, but I can't build all these files, so we'll have
to fix any build failures as they come up.
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Fix: The fact that nodes had different gap counts would be overlooked
if the bus manager code would pick gap count 63 because of beta
repeaters or because of very large hop counts. In this case, the bus
manager code would miss that it actually has to send the PHY config
packet with gap count 63.
Related trivial changes: Use bool for an int used as bool, touch up
some comments.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
We now exit fw_send_phy_config /after/ the PHY config packet has been
transmitted, instead of before. A subsequent fw_core_initiate_bus_reset
will therefore not overlap with the transmission. This is meant to make
the send PHY config packet + reset bus routine more deterministic.
Fixes bus reset loop and eventual panic with
- VIA VT6307 + IOGEAR hub + Unibrain Fire-i camera
http://bugzilla.kernel.org/show_bug.cgi?id=10128
- JMicron card
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Trivial change to replace more meaningless (to the untrained eye) hex
values with defined CSR constants.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
When a device changes its configuration ROM, it announces this with a
bus reset. firewire-core has to check which node initiated a bus reset
and whether any unit directories went away or were added on this node.
Tested with an IOI FWB-IDE01AB which has its link-on bit set if bus
power is available but does not respond to ROM read requests if self
power is off. This implements
- recognition of the units if self power is switched on after fw-core
gave up the initial attempt to read the config ROM,
- shutdown of the units when self power is switched off.
Also tested with a second PC running Linux/ieee1394. When the eth1394
driver is inserted and removed on that node, fw-core now notices the
addition and removal of the IPv4 unit on the ieee1394 node.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
read_bus_info_block() is repeatedly called by workqueue jobs.
These will step on each other's toes eventually if there are multiple
workqueue threads, and we end up with corrupt config ROM images.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Unlike the ohci1394 driver, fw-ohci uses the selfIDGeneration field of
bus reset packets to determine the generation of incoming requests as
per OHCI 1.1 clause 8.4.2.3. This is more precise --- provided that the
controller inserts the correct generation. Texas Instruments chips
often don't.
This prevented the transmission of response packets, which for example
broke AV/C transactions as used when communicating with miniDV cameras
and any other AV/C devices.
There is apparently no way to detect and adjust incorrect generations.
Therefore we ignore the generation of bus reset packets from TI chips
and use the generation of the self ID buffer instead. Alas this is
received at a slightly wrong time. In rare cases, this could cause us
to not respond to legitimate requests or to respond to expired requests.
(The latter is less likely because the bus reset packet AR event is
typically handled before the self ID complete event.)
Bug reported by Mladen Kuntner, who was extraordinarily patient while
dealing with the driver maintainers. Fix confirmed to be required and
effective for TSB82AA2 and a TSB43AB22 or TSB43AB22A.
https://bugzilla.redhat.com/show_bug.cgi?id=243081
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Extend the logging of "AR evt_bus_reset, link internal" to "AR
evt_bus_reset, generation ${selfIDGeneration}". That way we can check
whether this generation matches the one seen in self ID complete event
logging. See OHCI 1.1 clause 8.4.2.3.
Also extend logging of "firewire_ohci: * selfIDs, generation *" by
"local node ID ffc*" in self ID logging to make the local node in AT/AR
event logs more obvious.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Add a debug option to watch bus reset interrupt events. Half of this
patch is taken from Jarod Wilson's first version of the JMicron fix.
BusReset interrupts are only generated if the respective module
parameter flag was set before the controller is initialized.
Else we keep this event masked to reduce IRQ load in normal operation
and to avoid potential problems with buggy chips.
Note, this is unlike the other IRQ events whose logging can be enabled
any time after chip initialization. This and the influence on what
interrupts the chip generates is why I added an extra flag for it.
Also, reorder the debug parameter flags according to their perceived
usefulness.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
I finally tracked down the issues with this JMicron PCI-e card in my
possession to a failure to comply with section 7.2.3.2 of the OHCI 1.1
specification (thanks to Kristian for the pointer to illustrate that it
is indeed a flaw in this card, not the driver). The controller should
simply flush the packets we've appended to its AT queue if a bus reset
occurs before they've been transmitted and we'll try again, but
something goes wrong and the controller winds up hung.
However, we can avoid the problem by simply checking if the
IntEvent.busReset register had been set before we try appending to the
AT context. When busReset is set, the AT context is completely halted
until busReset is cleared, so there's no point in appending AT packets
until the register is cleared. So at_context_queue_packet() now checks
for busReset being set, and bails with an RCODE_GENERATION packet ack,
which results in us trying to append the packet again after recognizing
the fact there has been a bus reset, and clearing busReset.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
While trying to debug this piece of crap JMicron PCI-e controller in my
possession, one thought was that perhaps I was encountering register access
failures. I'm not, but logging them would be good, so we can see if they
are a real problem we should be taking into account anywhere in the code.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de> (added list contact)
I've now witnessed multiple occasions where one of my controllers (a very
poorly working JMicron PCIe card) fails to get its registers properly set
up in ohci_enable(), apparently due to an occasionally very slow to
initiate SClk. The easy fix for this problem is to add a tiny while loop
to try again a time or three after initially enabling LPS before we
move on (or give up).
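
The retry idea can be sketched independently of the hardware. Everything
below is hypothetical: the probe callback stands in for a register read
that tells whether the link is up, and a real driver would sleep briefly
between attempts.

```c
#include <stdbool.h>

/* Hypothetical probe: returns true once register access works. */
typedef bool (*lps_probe_fn)(void *ctx);

/*
 * Try up to 'max_tries' times. Returns the number of probes used, or
 * -1 if the link never came up.
 */
static int wait_for_lps(lps_probe_fn probe, void *ctx, int max_tries)
{
	int i;

	for (i = 0; i < max_tries; i++) {
		/* a real driver would delay here before probing */
		if (probe(ctx))
			return i + 1;
	}
	return -1;
}

/* Example probe: succeeds once the countdown in *ctx reaches zero. */
static bool countdown_probe(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return false;
	}
	return true;
}
```

Bounding the loop keeps a controller that never starts its clock from
stalling initialization forever; the caller gives up on a -1 result.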
Of course, the card still isn't fully functional yet, but this gets it at
least one tiny step closer...
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
This adds debug printks for asynchronous transmission and reception and
for self ID reception. They can be enabled at module load time, and at
runtime via /sys/module/firewire_ohci/parameters/debug.
Signed-off-by: Jarod Wilson <jwilson@redhat.com>
Also added: Logging of interrupt event codes and of cancelled AT
packets.
The code now depends on a Kconfig variable. This makes it easier to
build firewire-ohci without the feature or to make it an option in the
future. The variable is currently hidden and always on.
This feature inflates firewire-ohci.ko by 7 kB = 27% on x86-64 and by
4 kB = 23% on i686.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
fw_core_handle_bus_reset() incorrectly relied on the assumption that
self_id_count > 0.
We check early in fw-ohci and discard the self ID complete event if
self_id_count == 0 because a valid event always has at least one self ID
packet in it (the one of the local node). Hence treat self_id_count ==
0 like any other kind of invalid self ID buffer.
Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Signed-off-by: Jarod Wilson <jwilson@redhat.com>