linux

mirror of https://github.com/FEX-Emu/linux.git synced 2024-12-22 17:33:01 +00:00

Author	SHA1	Message	Date
Yosef Etigin	5eb620c81c	IB/core: Add helpers for uncached GID and P_Key searches Add ib_find_gid() and ib_find_pkey() functions that use uncached device queries. The calls might block but the returns are always up-to-date. Cache P_Key and GID table lengths in core to avoid extra port info queries. Signed-off-by: Yosef Etigin <yosefe@voltaire.com> Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-19 08:51:53 -07:00
Eli Cohen	7b82cd8ee7	IB/core: Free umem when mm is already gone Free umem when task's mm is already destroyed by the time ib_umem_release gets called. Found by Dotan Barak at Mellanox. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-19 08:51:53 -07:00
Sean Hefty	6c719f5c6c	RDMA/cma: Add check to validate that cm_id is bound to a device Several checks in the rdma_cm check against the state of the cm_id, but only to validate that the cm_id is bound to an underlying transport specific CM and an RDMA device. Make the check explicit in what we're trying to check for, since we're not synchronizing against the cm_id state. This will allow a user to disconnect a cm_id or reject a connection after receiving a device removal event. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 14:10:32 -07:00
Sean Hefty	be65f086f2	RDMA/cma: Fix synchronization with device removal in cma_iw_handler The cma_iw_handler needs to validate the state of the rdma_cm_id before processing a new connection request to ensure that a device removal is not already being processed for the same rdma_cm_id. Without the state check, the user can receive simultaneous callbacks for the same cm_id, or a callback after they've destroyed the cm_id. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:56:32 -07:00
Sean Hefty	8aa08602bd	RDMA/cma: Simplify device removal handling code Add a new routine and rename another to encapsulate common code for synchronizing with device removal. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:54:49 -07:00
Roland Dreier	1bf66a3042	IB: Put rlimit accounting struct in struct ib_umem When memory pinned with ib_umem_get() is released, ib_umem_release() needs to subtract the amount of memory being unpinned from mm->locked_vm. However, ib_umem_release() may be called with mm->mmap_sem already held for writing if the memory is being released as part of an munmap() call, so it is sometimes necessary to defer this accounting into a workqueue. However, the work struct used to defer this accounting is dynamically allocated before it is queued, so there is the possibility of failing that allocation. If the allocation fails, then ib_umem_release has no choice except to bail out and leave the process with a permanently elevated locked_vm. Fix this by allocating the structure to defer accounting as part of the original struct ib_umem, so there's no possibility of failing a later allocation if creating the struct ib_umem and pinning memory succeeds. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:37 -07:00
Roland Dreier	f7c6a7b5d5	IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_mem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:37 -07:00
Linus Torvalds	972d45fb43	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IPoIB: Convert to NAPI IB: Return "maybe missed event" hint from ib_req_notify_cq() IB: Add CQ comp_vector support IB/ipath: Fix a race condition when generating ACKs IB/ipath: Fix two more spin lock problems IB/fmr_pool: Add prefix to all printks IB/srp: Set proc_name IB/srp: Add orig_dgid sysfs attribute to scsi_host IPoIB/cm: Don't crash if remote side uses one QP for both directions RDMA/cxgb3: Support for new abort logic RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message RDMA/cxgb3: Fail qp creation if the requested max_inline is too large RDMA/cxgb3: Fix TERM codes IPoIB/cm: Fix error handling in ipoib_cm_dev_open() IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed IB/mthca: Work around kernel QP starvation IB/ipath: Don't put QP in timeout queue if waiting to send IB/ipath: Don't call spin_lock_irq() from interrupt context	2007-05-07 12:18:21 -07:00
Michael S. Tsirkin	f4fd0b224d	IB: Add CQ comp_vector support Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Roland Dreier	1a70a05d9d	IB/fmr_pool: Add prefix to all printks Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Jean Delvare	6473d160b4	PCI: Cleanup the includes of <linux/pci.h> I noticed that many source files include <linux/pci.h> while they do not appear to need it. Here is an attempt to clean it all up. In order to find all possibly affected files, I searched for all files including <linux/pci.h> but without any other occurence of "pci" or "PCI". I removed the include statement from all of these, then I compiled an allmodconfig kernel on both i386 and x86_64 and fixed the false positives manually. My tests covered 66% of the affected files, so there could be false positives remaining. Untested files are: arch/alpha/kernel/err_common.c arch/alpha/kernel/err_ev6.c arch/alpha/kernel/err_ev7.c arch/ia64/sn/kernel/huberror.c arch/ia64/sn/kernel/xpnet.c arch/m68knommu/kernel/dma.c arch/mips/lib/iomap.c arch/powerpc/platforms/pseries/ras.c arch/ppc/8260_io/enet.c arch/ppc/8260_io/fcc_enet.c arch/ppc/8xx_io/enet.c arch/ppc/syslib/ppc4xx_sgdma.c arch/sh64/mach-cayman/iomap.c arch/xtensa/kernel/xtensa_ksyms.c arch/xtensa/platform-iss/setup.c drivers/i2c/busses/i2c-at91.c drivers/i2c/busses/i2c-mpc.c drivers/media/video/saa711x.c drivers/misc/hdpuftrs/hdpu_cpustate.c drivers/misc/hdpuftrs/hdpu_nexus.c drivers/net/au1000_eth.c drivers/net/fec_8xx/fec_main.c drivers/net/fec_8xx/fec_mii.c drivers/net/fs_enet/fs_enet-main.c drivers/net/fs_enet/mac-fcc.c drivers/net/fs_enet/mac-fec.c drivers/net/fs_enet/mac-scc.c drivers/net/fs_enet/mii-bitbang.c drivers/net/fs_enet/mii-fec.c drivers/net/ibm_emac/ibm_emac_core.c drivers/net/lasi_82596.c drivers/parisc/hppb.c drivers/sbus/sbus.c drivers/video/g364fb.c drivers/video/platinumfb.c drivers/video/stifb.c drivers/video/valkyriefb.c include/asm-arm/arch-ixp4xx/dma.h sound/oss/au1550_ac97.c I would welcome test reports for these files. I am fine with removing the untested files from the patch if the general opinion is that these changes aren't safe. The tested part would still be nice to have. Note that this patch depends on another header fixup patch I submitted to LKML yesterday: [PATCH] scatterlist.h needs types.h http://lkml.org/lkml/2007/3/01/141 Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-05-02 19:02:35 -07:00
Joachim Fenkes	1912ffbb88	IB: Set class_dev->dev in core for nice device symlink All RDMA drivers except ehca set class_dev->dev to their dma_device value (ehca leaves this unset). dma_device is the only value that makes any sense, so move this assignment to core/sysfs.c. This reduce the duplicated code in the rest of the drivers and gives ehca a nice /sys/class/infiniband/ehcaX/device symlink. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-04-24 21:30:38 -07:00
Hal Rosenstock	de493d47d8	IB/mad: Change SMI to use enums rather than magic return codes Clarify code by changing return values from SMI functions to named enum values instead of magic 0/1 values. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-04-24 16:31:12 -07:00
Sean Hefty	aeba84a925	IB/umad: Implement GRH handling for sent/received MADs We need to set the SGID index for routed MADs and pass received GRH information to userspace when a MAD is received. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:12 -07:00
Sean Hefty	d0e7bb1418	IB/sa: Set src_path_bits correctly in ib_init_ah_from_path() src_path_bits needs to mask off the base LID value. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:12 -07:00
Sean Hefty	9d41b7fdea	IB/ucm: Simplify ib_ucm_event() Use wait_event_interruptible() instead of a more complicated open-coded equivalent. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:11 -07:00
Sean Hefty	d92f76448c	RDMA/ucma: Simplify ucma_get_event() Use wait_event_interruptible() instead of a more complicated open-coded equivalent. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:11 -07:00
Hal Rosenstock	9a4b65e357	IB/umad: Fix declaration of dev_map[] The current ib_umad code never accesses bits past IB_UMAD_MAX_PORTS in dev_map[]. We shouldn't declare it to be twice as big. Pointed-out-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com>	2007-04-18 20:20:53 -07:00
Sean Hefty	3492856e33	RDMA/ucma: Avoid sending reject if backlog is full Change the returned error code to ENOMEM if the connection event backlog is full. This prevents the ib_cm from issuing a reject on the connection, which can allow retries to succeed. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-06 14:58:11 -08:00
Sean Hefty	cb164b8c6a	RDMA/cma: Initialize rdma_bind_list in cma_alloc_any_port() The struct rdma_bind_list fields for hlist are not being initialized, resulting in a corrupted list. Fix this by using kzalloc() to make sure all pointers are NULL. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-06 12:41:44 -08:00
Sean Hefty	1836854f25	RDMA/cma: Remove unused node_guid from cma_device structure Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:35 -08:00
Sean Hefty	e971b8cd19	IB/cm: Remove ca_guid from cm_device structure The cm_device references an ib_device, which already contains the node_guid. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:33 -08:00
Sean Hefty	962063e64b	RDMA/cma: Request reversible paths only The rdma_cm requires that path records be reversible. Set the reversible bit when issuing an path record query. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:07 -08:00
Sean Hefty	47645d8d25	IB/core: Set hop limit in ib_init_ah_from_wc correctly The hop_limit value in the ah_attr should be 0xFF, not the value read from the received GRH (which should be 0). See 13.5.4.4 in the 1.2 IB spec. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:02 -08:00
Roland Dreier	aaf1aef55f	IB/uverbs: Return correct error for invalid PD in register MR If no matching PD is found in ib_uverbs_reg_mr(), then the function jumps to err_release without setting the return value ret. This means that ret will hold the return value of the call to ib_umem_get() a few lines earlier; if the function reaches the point where it looks for the PD, we know that ib_umem_get() must have returned 0, so ib_uverbs_reg_mr() ends up return 0 for a bad PD ID. Fix this by setting ret to -EINVAL before jumping to the exit path when no PD is found. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 13:16:51 -08:00
Roland Dreier	7084f8429c	IB/core: Set static rate in ib_init_ah_from_path() The static rate from the path record should be put into the address vector -- a long time ago the rate in the address attributes needed to be a relative rate, which required more munging, but now that the conversion from absolute to relative is done in the low-level driver, it's easy for ib_init_ah_from_path() to put the absolute rate in. Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 15:31:24 -08:00
Roland Dreier	38abaa63bf	IB/core: Fix sparse warnings about shadowed declarations Change a couple of variable names to avoid sparse warnings about symbols being shadowed. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 14:41:14 -08:00
Sean Hefty	c8f6a362bf	RDMA/cma: Add multicast communication support Extend rdma_cm to support multicast communication. Multicast support is added to the existing RDMA_PS_UDP port space, as well as a new RDMA_PS_IPOIB port space. The latter port space allows joining the multicast groups used by IPoIB, which enables offloading IPoIB traffic to a separate QP. The port space determines the signature used in the MGID when joining the group. The newly added RDMA_PS_IPOIB also allows for unicast operations, similar to RDMA_PS_UDP. Supporting the RDMA_PS_IPOIB requires changing how UD QPs are initialized, since we can no longer assume that the qkey is constant. This requires saving the Q_Key to use when attaching to a device, so that it is available when creating the QP. The Q_Key information is exported to the user through the existing rdma_init_qp_attr() interface. Multicast support is also exported to userspace through the rdma_ucm. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 14:29:07 -08:00
Sean Hefty	faec2f7b96	IB/sa: Track multicast join/leave requests The IB SA tracks multicast join/leave requests on a per port basis and does not do any reference counting: if two users of the same port join the same group, and one leaves that group, then the SA will remove the port from the group even though there is one user who wants to stay a member left. Therefore, in order to support multiple users of the same multicast group from the same port, we need to perform reference counting locally. To do this, add an multicast submodule to ib_sa to perform reference counting of multicast join/leave operations. Modify ib_ipoib (the only in-kernel user of multicast) to use the new interface. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 14:20:02 -08:00
Steve Wise	ebb90986e1	RDMA/iwcm: iw_cm_id destruction race fixes iwcm iw_cm_id destruction race condition fixes: - iwcm_deref_id() always wakes up if there's another reference. - clean up race condition in cm_work_handler(). - create static void free_cm_id() which deallocs the work entries and then kfrees the cm_id memory. This reduces code replication. - rem_ref() if this is the last reference -and- the IWCM owns freeing the cm_id, then free it. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 13:57:35 -08:00
Tim Schmielau	cd354f1ae7	[PATCH] remove many unneeded #includes of sched.h After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:54 -08:00
Linus Torvalds	93bbad8fe1	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/mthca: Always fill MTTs from CPU IB/mthca: Merge MR and FMR space on 64-bit systems IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUs IB/mthca: Give reserved MTTs a separate cache line IB/mthca: Fix reserved MTTs calculation on mem-free HCAs RDMA/cxgb3: Add driver for Chelsio T3 RNIC IB: Remove redundant "_wq" from workqueue names RDMA/cma: Increment port number after close to avoid re-use IB/ehca: Fix memleak on module unloading IB/mthca: Work around gcc bug on sparc64 IPoIB: Connected mode experimental support IB/core: Use ARRAY_SIZE macro for mandatory_table IB/mthca: Use correct structure size in call to memset()	2007-02-13 21:16:39 -08:00
Arjan van de Ven	2b8693c061	[PATCH] mark struct file_operations const 3 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:45 -08:00
Sean Hefty	c7f743a669	IB: Remove redundant "_wq" from workqueue names Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-10 08:00:50 -08:00
Sean Hefty	aedec08050	RDMA/cma: Increment port number after close to avoid re-use Randomize the starting port number and avoid re-using port values immediately after they are closed. Instead keep track of the last port value used and increment it every time a new port number is assigned, to better replicate other port spaces. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-10 08:00:50 -08:00
Ahmed S. Darwish	9a6b090c0d	IB/core: Use ARRAY_SIZE macro for mandatory_table Use ARRAY_SIZE() macro already defined in kernel.h instead of open coding equivalent code. Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-10 08:00:47 -08:00
Steve Wise	1f12667021	RDMA/addr: Handle ethernet neighbour updates during route resolution The iWARP connection manager uses the ib_addr services to do route resolution (neighbour discovery in the IP world). The ib_addr netevent callback routine, however, currently only acts on InfiniBand neighbour updates. It needs to act on ethernet neighbour updates as well. This patch just removes filtering on device type altogether and will trigger on any neighour updates where the nud_type is valid. This simplifies the code some. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-04 14:11:57 -08:00
Michael S. Tsirkin	062dbb69f3	IB: Return qp pointer as part of ib_wc struct ib_wc currently only includes the local QP number: this matches the IB spec, but seems mostly useless. The following patch replaces this with the pointer to qp itself, and updates all low level drivers and all users. This has the following advantages: - Ability to get a per-qp context through wc->qp->qp_context - Existing drivers already have the qp pointer ready in poll cq, so this change actually saves a tiny bit (extra memory read) on data path (for ehca it would actually be expensive to find the QP pointer when polling a CQ, but ehca does not support SRQ so we can leave wc->qp as NULL for ehca) - Users that need the QP number can still get it through wc->qp->qp_num Use case: In IPoIB connected mode code, I have a common CQ shared by multiple QPs. To track connection usage, I need a way to get at some per-QP context upon the completion, and I would like to avoid allocating context object per work request just to stick a QP pointer into it. With this code, I can just use wc->qp->qp_context. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-04 14:11:55 -08:00
Sean Hefty	0cefcf0bbc	RDMA/ucma: Don't report events with invalid user context There's a problem with how rdma cm events are reported to userspace that can lead to application crashes. When a new connection request arrives, a context for the connection is allocated in the kernel. The connection event is then reported to userspace. The userspace library retrieves the event and allocates its own context for the connection. The userspace context is associated with the kernel's context when accepting. This allows the kernel to give userspace context with other events. A problem occurs if a second event for the same connection occurs before the user has had a chance to call accept. The userspace context has not yet been set, which causes the librdmacm to crash. (This has been seen when the app takes too long to call accept, resulting in the remote side timing out and rejecting the connection) Fix this by ignoring events for new connections until userspace has set their context. This can only happen if an error occurs on a new connection before the user accepts it. This is okay, since the accept will just fail later. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-01-07 20:20:08 -08:00
Sean Hefty	30a5ec982e	RDMA/ucma: Fix struct ucma_event leak when backlog is full We discard new connection requests while the listen backlog is full, but leak a struct ucma_event in the process. Free the structure in this case. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-01-07 20:17:34 -08:00
Steve Wise	881a045fc5	RDMA/iwcm: iWARP connection timeouts shouldn't be reported as rejects The iWARP CM should report timeouts as event RDMA_CM_EVENT_UNREACHABLE, not event RDMA_CM_EVENT_REJECTED. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-01-07 20:15:58 -08:00
Ralph Campbell	1527106ff8	IB/core: Use the new verbs DMA mapping functions Convert code in core/ to use the new DMA mapping functions for kernel verbs consumers. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 14:28:30 -08:00
Sean Hefty	7521663857	RDMA/cma: Export rdma cm interface to userspace Export the rdma cm interfaces to userspace via a misc device. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:22 -08:00
Sean Hefty	628e5f6d39	RDMA/cma: Add support for RDMA_PS_UDP Allow the use of UD QPs through the rdma_cm, in order to provide address translation services for resolving IB addresses for datagram messages using SIDR. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Sean Hefty	0fe313b000	RDMA/cma: Allow early transition to RTS to handle lost CM messages During connection establishment, the passive side of a connection can receive messages from the active side before the connection event has been delivered to the user. Allow the passive side to send messages in response to received data before the event is delivered. To handle the case where the connection messages are lost, a new rdma_notify() function is added that users may invoke to force a connection into the established state. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Sean Hefty	a1b1b61f80	RDMA/cma: Report connect info with connect events Connection information was never given to the recipient of a connection request or reply message. Only the event was delivered. Report the connection data with the event to allows user to reject the connection based on the requested parameters, or adjust their resources to match the request. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Sean Hefty	9b2e9c0c24	RDMA/cma: Remove unneeded qp_type parameter from rdma_cm The qp_type parameter into the rdma_cm is unneeded, and can be misleading. The QP type should be determined from the port space. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Roland Dreier	f47e22c6e4	IB/fmr: ib_flush_fmr_pool() may wait too long ib_flush_fmr_pool() stashes away the request generation number properly, but then goes ahead and rereads it every time it tests whether the flush generation number has caught up. This means that there is a theoretical possibility of livelock, if the request generation number keeps getting bumped and the flush generation number never catches up. The fix is simple: use the request generation number read at the beginning of the function. Also, atomic_inc() followed by atomic_read() can be replaced with atomic_int_return(). There's no real requirement for atomicity here but we might as well shrink the code. This bug was discovered using David Binderman's list of "set but never used" warnings from icc. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:19 -08:00
Josef Sipek	1cfd6e648b	[PATCH] struct path: convert infiniband Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:46 -08:00
David Howells	4c1ac1b491	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/infiniband/core/iwcm.c drivers/net/chelsio/cxgb2.c drivers/net/wireless/bcm43xx/bcm43xx_main.c drivers/net/wireless/prism54/islpci_eth.c drivers/usb/core/hub.h drivers/usb/input/hid-core.c net/core/netpoll.c Fix up merge failures with Linus's head and fix new compilation failures. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-12-05 14:37:56 +00:00
Michael S. Tsirkin	f469b2626f	IB/ucm: Fix deadlock in cleanup ib_ucm_cleanup_events() holds file_mutex while calling ib_destroy_cm_id(). This can deadlock since ib_destroy_cm_id() flushes event handlers, and ib_ucm_event_handler() needs file_mutex, too. Therefore, drop the file_mutex during the call to ib_destroy_cm_id(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:10 -08:00
Sean Hefty	e1444b5a16	IB/cm: Fix automatic path migration support The ib_cm_establish() function is replaced with a more generic ib_cm_notify(). This routine is used to notify the CM that failover has occurred, so that future CM messages (LAP, DREQ) reach the remote CM. (Currently, we continue to use the original path) This bumps the userspace CM ABI. New alternate path information is captured when a LAP message is sent or received. This allows QP attributes to be initialized for the user when a new path is loaded after failover occurs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:10 -08:00
Roland Dreier	04699a1f86	RDMA/addr: list_move() cleanups Replace a couple list_del()/list_add() combos with list_move(). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:09 -08:00
Krishna Kumar	c78bb8442b	RDMA/addr: Fix some cancellation problems in process_req() Fix following problems in process_req() relating to cancellation: - Function is wrongly doing another addr_remote() when cancelled, which is not required. - Make failure reporting immediate by using time_after_eq(). - On cancellation, -ETIMEDOUT was returned to the callback routine instead of the more appropriate -ECANCELLED (users getting notified may want to print/return this status, eg ucma_event_handler). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:09 -08:00
Krishna Kumar	9ab1ffa877	RDMA/iwcm: Fix comment for iwcm_deref_id() to match code In iwcm_deref_id(), the comment says : "If the last reference is being removed and iw_destroy_cm_id is waiting, wake up the waiting thread". The second part of the comment, "and iw_destroy_cm_id is waiting," is wrong, since this function either wakes the waiter already waiting in iwcm_deref_id, or enables it (so that when wait_for_completion() is performed later, it will immediately return). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:08 -08:00
Krishna Kumar	715a588f42	RDMA/iwcm: Remove unnecessary function argument Remove unnecessary cm_id_priv argument to copy_private_data(), and change text to reflect the code. Fix couple of typos in comments. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:08 -08:00
Krishna Kumar	13fccdb380	RDMA/iwcm: Remove unnecessary initializations Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:08 -08:00
Krishna Kumar	83b9658623	RDMA/iwcm: Fix memory leak If we get IW_CM_EVENT_CONNECT_REQUEST message and encounter an error (not in the LISTEN state, cannot create an id, cannot alloc work_entry, etc), then the memory allocated by cm_event_handler() in the event->private_data gets leaked. Since cm_work_handler has already put the event on the work_free_list, this allocated memory is leaked. High backlog value can allow DoS attacks. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:07 -08:00
Krishna Kumar	33ba0fa9f3	RDMA/iwcm: Fix memory corruption bug in cm_work_handler() Possible memory corruption scenario: after putting the work entry back on the work_free_list, we call process_event() which dereferences work->event, which could have been modified to another value meanwhile. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:07 -08:00
Roland Dreier	e54f81889c	IB: Convert kmem_cache_t -> struct kmem_cache Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:07 -08:00
Dotan Barak	e31353eaec	RDMA/cm: Remove setting local write as part of QP access flags The qp_access_flags are for remote access permissions only, so IB_ACCESS_LOCAL_WRITE is an invalid value. Remove it from the values set by cm_init_qp_init_attr() and cma_init_ib_qp(). Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:06 -08:00
Eric Sesterhenn	bed8bdfddd	IB: kmemdup() cleanup Replace open coded kmemdup() to save some screen space, and allow inlining/not inlining to be triggered by gcc. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:06 -08:00
Krishna Kumar	a1a733f65b	RDMA/cma: Rewrite cma_req_handler() to encapsulate common code Rewrite cma_req_handler error handling case to encapsulate common code. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:05 -08:00
Krishna Kumar	f115db4803	RDMA/addr: Use time_after_eq() instead of time_after() in queue_req() In queue_req(), use time_after_eq() instead of time_after() for following reasons : - Improves insert time if multiple entries with same time are present. - set_timeout need not be called if entry with same time is added to the list (and that happens to be the entry with the smallest time), saving atomic/locking operations. - Earlier entries with same time are deleted first (fifo). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:05 -08:00
Krishna Kumar	e4022274cf	RDMA/cma: Remove redundant check in cma_add_one Remove redundant check of node_guid in cma_add_one(). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:04 -08:00
Krishna Kumar	e82153b54d	RDMA/cma: Optimize cma_bind_loopback() to check for empty list Optimize to test for an empty list first. This ends up simplifying the code too. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:04 -08:00
David Howells	c4028958b6	WorkStruct: make allyesconfig Fix up for make allyesconfig. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-11-22 14:57:56 +00:00
Roland Dreier	39798695b4	IB/mad: Fix race between cancel and receive completion When ib_cancel_mad() is called, it puts the canceled send on a list and schedules a "flushed" callback from process context. However, this leaves a window where a receive completion could be processed before the send is fully flushed. This is fine, except that ib_find_send_mad() will find the MAD and return it to the receive processing, which results in the sender getting both a successful receive and a "flushed" send completion for the same request. Understandably, this confuses the sender, which is expecting only one of these two callbacks, and leads to grief such as a use-after-free in IPoIB. Fix this by changing ib_find_send_mad() to return a send struct only if the status is still successful (and not "flushed"). The search of the send_list already had this check, so this patch just adds the same check to the search of the wait_list. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-13 09:38:07 -08:00
Sean Hefty	7a118df3ea	RDMA/addr: Use client registration to fix module unload race Require registration with ib_addr module to prevent caller from unloading while a callback is in progress. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-02 14:26:04 -08:00
Jack Morgenstein	0b26c88f29	IB/uverbs: Return sq_draining value in query_qp response Return the sq_draining value back to user space for query_qp instead of the en_sqd_async notify value, which is valid only for modify_qp. For query_qp, the draining status should returned. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-30 21:19:35 -08:00
Krishna Kumar	255d0c14b3	RDMA/cma: rdma_bind_addr() leaks a cma_dev reference count rdma_bind_addr() leaks a cma_dev reference count in failure case. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2006-10-30 20:52:51 -08:00
Sean Hefty	82a9c16a10	IB/cm: Send DREP in response to unmatched DREQ Currently a DREP is only sent in response to a DREQ if a connection has been found matching the DREQ, and it is in the proper state. Once a DREP is sent, the local connection moves into timewait. Duplicate DREQs received while in this state result in re-sending the DREP. However, it's likely that the local connection will enter and exit timewait before the remote side times out a lost DREP and resends a DREQ. To handle this, we send a DREP in response to a DREQ, even if a local connection is not found. This avoids maintaining disconnected id's in timewait states for excessively long times, just to handle a lost DREP. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-10 12:50:38 -07:00
Sean Hefty	8575329d4f	IB/cm: Fix timewait crash after module unload If the ib_cm module is unloaded while id's are still in timewait, the CM will destroy the work queue used to process timewait. Once the id's exit timewait, their timers will fire, leading to a crash trying to access the destroyed work queue. We need to track id's that are in timewait, and cancel their deferred work on module unload. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-10 12:50:38 -07:00
Krishna Kumar	3f168d2b66	RDMA/cma: Optimize error handling Reorganize code relating to cma_get_net_info() and rdam_create_id() to optimize error case handling (no need to alloc memory/etc. as part of rdma_create_id() if input parameters are wrong). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-02 14:52:16 -07:00
Krishna Kumar	94de178ac6	RDMA/cma: Eliminate unnecessary remove_list Eliminate remove_list by using list_del_init() instead during device removal handling. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-02 14:52:16 -07:00
Sean Hefty	8f0472d331	RDMA/cma: Set status correctly on route resolution error On reporting a route error, also include the status for the error, rather than indicating a status of 0 when an error has occurred. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-02 14:52:15 -07:00
Krishna Kumar	6e35aabee1	RDMA/cma: Fix device removal race The race is as follows: A process : cma_process_remove() calls cma_remove_id_dev(), which sets id state to CMA_DEVICE_REMOVAL and calls wait_event(dev_remove). B process : cma_req_handler() had incremented dev_remove, and calls cma_acquire_ib_dev() and on failure calls cma_release_remove(), which does a wake_up of cma_process_remove(). Then cma_req_handler() calls rdma_destroy_id(); A Process : cma_remove_id_dev() gets woken and checks the state of id, and since it is still (wrongly) CMA_DEVICE_REMOVAL, it calls notify_user(id) and if that fails, the caller - cma_process_remove() calls rdma_destroy_id(id). Two processes can call rdma_destroy_id(), resulting in one de-referencing kfreed id_priv. Fix is for process B to set CMA_DESTROYING in cma_req_handler() so that process A will return instead of doing a rdma_destroy_id(). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-02 14:52:15 -07:00
Krishna Kumar	675a027c3d	RDMA/cma: Fix leak of cm_ids in case of failures cma_connect_ib() and cma_connect_iw() leak cm_id's in failure cases. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-10-02 14:52:15 -07:00
Al Viro	60cad5da57	[IPV4]: annotate inetdev.h helpers inet_confirm_addr(), inet_ifa_byprefix(), ip_dev_find(), inet_make_mask() and inet_ifa_match() annotated, along with inferred net-endian variables Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-09-28 18:01:05 -07:00
Alexey Dobriyan	1a1d92c10d	[PATCH] Really ignore kmem_cache_destroy return value * Rougly half of callers already do it by not checking return value * Code in drivers/acpi/osl.c does the following to be sure: (void)kmem_cache_destroy(cache); * Those who check it printk something, however, slab_error already printed the name of failed cache. * XFS BUGs on failed kmem_cache_destroy which is not the decision low-level filesystem driver should make. Converted to ignore. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-27 08:26:10 -07:00
Al Viro	d7b2004528	[PATCH] missing includes from infiniband merge indirect chains of includes are arch-specific and can't be relied upon... (hell, even attempt to build it for itanic would trigger vmalloc.h ones; err.h triggers on e.g. alpha). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-09-23 11:34:43 -07:00
Krishna Kumar	9cd330d36b	IB: Fix typo in kerneldoc for ib_set_client_data() Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:58 -07:00
Michael S. Tsirkin	a70d059009	IB/cm: Do not track remote QPN in timewait state Do not track remote QPN in TimeWait state, since QP is not connected. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:53 -07:00
Michael S. Tsirkin	c1a0b23bf4	IB/sa: Require SA registration Require users to register with SA module, to prevent the sa_query module text from going away while an SA query callback is still running. Update all in-tree users for the new interface. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:53 -07:00
Sean Hefty	61a73c708f	RDMA/cma: Protect against adding device during destruction Closes a window where address resolution can attach an rdma_cm_id to a device during destruction of the rdma_cm_id. This can result in the rdma_cm_id remaining in the device list after its memory has been freed. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:49 -07:00
Tom Tucker	07ebafbaaa	RDMA: iWARP Core Changes. Modifications to the existing rdma header files, core files, drivers, and ulp files to support iWARP, including: - Hook iWARP CM into the build system and use it in rdma_cm. - Convert enum ib_node_type to enum rdma_node_type, which includes the possibility of RDMA_NODE_RNIC, and update everything for this. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:47 -07:00
Tom Tucker	922a8e9fb2	RDMA: iWARP Connection Manager. Add an iWARP Connection Manager (CM), which abstracts connection management for iWARP devices (RNICs). It is a logical instance of the xx_cm where xx is the transport type (ib or iw). The symbols exported are used by the transport independent rdma_cm module, and are available also for transport dependent ULPs. Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:46 -07:00
Roland Dreier	3cd965646b	IB: Whitespace fixes Remove some trailing whitespace that has snuck in despite the best efforts of whitespace=error-all. Also fix a few other whitespace bogosities. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:46 -07:00
Sean Hefty	f06d265375	IB/cm: Randomize starting comm ID Randomize the starting local comm ID to avoid getting a rejected connection due to a stale connection after a system reboot or reloading of the ib_cm. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:45 -07:00
James Lentini	2b3e258e5d	IB/mad: Remove unused includes The ib_mad module does not use a kthread function, but mad_priv.h includes <linux/kthread.h>. mad_rmpp.c does not do any DMA-related stuff, but includes <linux/dma-mapping.h>. Remove the unused includes. Signed-off-by: James Lentini <jlentini@netapp.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:44 -07:00
Sean Hefty	75ab13443e	IB/mad: Add support for dual-sided RMPP transfers. The implementation assumes that any RMPP request that requires a response uses DS RMPP. Based on the RMPP start-up scenarios defined by the spec, this should be a valid assumption. That is, there is no start-up scenario defined where an RMPP request is followed by a non-RMPP response. By having this assumption we avoid any API changes. In order for a node that supports DS RMPP to communicate with one that does not, RMPP responses assume a new window size of 1 if a DS ACK has not been received. (By DS ACK, I'm referring to the turn-around ACK after the final ACK of the request.) This is a slight spec deviation, but is necessary to allow communication with nodes that do not generate the DS ACK. It also handles the case when a response is sent after the request state has been discarded. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:44 -07:00
Sean Hefty	76842405fc	IB/cm: Use correct reject code for invalid GID Set the reject code properly when rejecting a request that contains an invalid GID. A suitable GID is returned by the IB CM in the additional reject information (ARI). This is a spec compliancy issue. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:43 -07:00
Sean Hefty	c1f250c0b4	IB/cm: Enable atomics along with RDMA reads Enable atomic operations along with RDMA reads if a local RDMA read/atomic depth is provided by the user. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:42 -07:00
Ralph Campbell	9bc57e2d19	IB/uverbs: Pass userspace data to modify_srq and modify_qp methods Pass a struct ib_udata to the low-level driver's ->modify_srq() and ->modify_qp() methods, so that it can get to the device-specific data passed in by the userspace driver. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:25 -07:00
Ralph Campbell	64f817ba98	IB/uverbs: Allow resize CQ operation to return driver-specific data Add a ib_uverbs_resize_cq_resp.driver_data field so that low-level drivers can return data from a resize CQ operation to userspace. Have ib_uverbs_resize_cq() only copy the cqe field, to avoid having to bump the userspace ABI. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:22:24 -07:00
Roland Dreier	1ccf6aa19a	IB/uverbs: Fix lockdep warning when QP is created with 2 CQs Lockdep warns when userspace creates a QP that uses different CQs for send completions and receive completions, because both CQs are locked and their mutexes belong to the same lock class. However, we know that the mutexes are distinct and the nesting is safe (there is no possibility of AB-BA deadlock because the mutexes are locked with down_read()), so annotate the situation with SINGLE_DEPTH_NESTING to get rid of the lockdep warning. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:17:20 -07:00
Roland Dreier	ab10867621	IB/uverbs: Use idr_read_cq() where appropriate There were two functions that open-coded idr_read_cq() in terms of idr_read_uobj() rather than using the helper. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-22 15:17:19 -07:00
Michael S. Tsirkin	d5bb75999c	RDMA/cma: Increase the IB CM retry count in CMA 3 seems like a low number of IB Communication Manager retries to set; we see connections failing under stress, and in any case 3 just looks like an arbitrary number. 15 is the max value allowed by the InfiniBand spec. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-09-14 13:55:30 -07:00
Jack Morgenstein	acaea9ee46	IB/core: Fix SM LID/LID change with client reregister set After commit `12bbb2b7be`, when SM LID change or LID change MAD also has a client reregistration bit set, only CLIENT_REREGISTER event is generated. As a result, the sa_query module and the cache module don't update the port information, and ULPs (e.g. IPoIB) stop working. This is the regression we observe as compared to 2.6.17. Rather than generate multiple events (which would have negative performance impact), let us simply let cache and SA query respond to reregister event in the same way as to LID and SM change events. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-08-16 09:54:47 -07:00
Jack Morgenstein	fd60ae404f	IB/uverbs: Avoid a crash on device hot remove Wait until all users have closed their device context before allowing device unregistration to complete. This prevents a crash caused by referring to stale data structures. A better solution would be to have a way to revoke contexts rather than waiting for userspace to close the context, but that's a much bigger change that will have to wait. For now let's at least avoid the crash. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-08-03 10:56:42 -07:00
Sean Hefty	75df23e229	IB/cm: Fix error handling in ib_send_cm_req Report error code rather than success (0) on failure allocating timewait_info in ib_send_cm_req(). Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-08-03 09:44:22 -07:00
Tom Tucker	e795d09250	[NET] infiniband: Cleanup ib_addr module to use the netevents Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2006-08-02 13:38:22 -07:00
Sean Hefty	2527e681fd	IB/mad: Validate MADs for spec compliance Validate MADs sent by userspace clients for spec compliance with C13-18.1.1 (prevent duplicate requests and responses sent on the same port). Without this, RMPP transactions get aborted because of duplicate packets. This patch is similar to that provided by Jack Morgenstein. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-24 09:18:07 -07:00
Roland Dreier	43db2bc044	IB/uverbs: Fix lockdep warnings Lockdep warns because uverbs is trying to take uobj->mutex when it already holds that lock. This is because there are really multiple types of uobjs even though all of their locks are initialized in common code. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-23 15:16:04 -07:00
Michael S. Tsirkin	ec924b4726	IB/uverbs: Fix unlocking in error paths ib_uverbs_create_ah() and ib_uverbs_create_srq() did not release the PD's read lock in their error paths, which lead to deadlock when destroying the PD. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-07-23 15:16:03 -07:00
Michael S. Tsirkin	e322fedf0c	[PATCH] IB/core: use correct gfp_mask in sa_query Avoid bogus out of memory errors: fix sa_query to actually pass gfp_mask supplied by the user to idr_pre_get. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: "Sean Hefty" <mshefty@ichips.intel.com> Acked-by: "Roland Dreier" <rdreier@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:51 -07:00
Michael S. Tsirkin	adfaa888a2	[PATCH] fmr pool: remove unnecessary pointer dereference ib_fmr_pool_map_phys gets the virtual address by pointer but never writes there, and users (e.g. srp) seem to assume this and ignore the value returned. This patch cleans up the API to get the VA by value, and updates all users. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Acked-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:51 -07:00
Ira Weiny	74f76fbac7	[PATCH] IB/cm: set private data length for reject messages Set private data length for reject messages to the correct size. Fix from openib svn r8483. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:50 -07:00
Michael S. Tsirkin	f0ee3404cc	[PATCH] IB/addr: gid structure alignment fix The device address contains unsigned character arrays, which contain raw GID addresses. The GIDs may not be naturally aligned, so do not cast them to structures or unions. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:50 -07:00
Michael S. Tsirkin	04c335430f	[PATCH] IB/cm: drop REQ when out of memory If a user of the IB CM returns -ENOMEM from their connection callback, simply drop the incoming REQ - do not attempt to send a reject. This should allow the sender to retry the request. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-07-14 21:53:50 -07:00
Sean Hefty	0d8fdfd71b	IB/core: Set alternate port number when initializing QP attributes Set alternate port number when initializing QP attributes. This bug is OpenFabrics bugzilla bug #160. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-30 14:10:14 -07:00
Roland Dreier	146d26b2bf	IB/uverbs: Set correct user handle for user SRQs Store away the user handle passed in from userspace when creating an SRQ, so that the kernel can return the correct handle when an SRQ asynchronous event occurs. (A 0 was incorrectly stored as the user handle as part of the changes in `9ead190b`, "IB/uverbs: Don't serialize with ib_uverbs_idr_mutex") Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-30 13:40:13 -07:00
Eric Sesterhenn	5fd571cbc1	[PATCH] Array overrun in drivers/infiniband/core/cma.c This was spotted by coverity #id 1300. Since the array has only four elements, we should just use those four. Signed-off-by: Eric Sesterhenn <snakebyte@gmx.de> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 11:57:28 -07:00
Akinobu Mita	179e09172a	[PATCH] drivers: use list_move() This patch converts the combination of list_del(A) and list_add(A, B) to list_move(A, B) under drivers/. Acked-by: Corey Minyard <minyard@mvista.com> Cc: Ben Collins <bcollins@debian.org> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Alasdair Kergon <dm-devel@redhat.com> Cc: Gerd Knorr <kraxel@bytesex.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Frank Pavlic <fpavlic@de.ibm.com> Acked-by: Matthew Wilcox <matthew@wil.cx> Cc: Andrew Vasquez <linux-driver@qlogic.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Greg Kroah-Hartman <greg@kroah.com> Signed-off-by: Akinobu Mita <mita@miraclelinux.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-26 09:58:18 -07:00
Linus Torvalds	61b9175808	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/iser: iSER Kconfig and Makefile IB/iser: iSER handling of memory for RDMA IB/iser: iSER RDMA CM (CMA) and IB verbs interaction IB/iser: iSER initiator iSCSI PDU and TX/RX IB/iser: iSCSI iSER transport provider high level code IB/iser: iSCSI iSER transport provider header file IB/uverbs: Remove unnecessary list_del()s IB/uverbs: Don't free wr list when it's known to be empty	2006-06-25 16:07:58 -07:00
David Howells	454e2398be	[PATCH] VFS: Permit filesystem to override root dentry on mount Extend the get_sb() filesystem operation to take an extra argument that permits the VFS to pass in the target vfsmount that defines the mountpoint. The filesystem is then required to manually set the superblock and root dentry pointers. For most filesystems, this should be done with simple_set_mnt() which will set the superblock pointer and then set the root dentry to the superblock's s_root (as per the old default behaviour). The get_sb() op now returns an integer as there's now no need to return the superblock pointer. This patch permits a superblock to be implicitly shared amongst several mount points, such as can be done with NFS to avoid potential inode aliasing. In such a case, simple_set_mnt() would not be called, and instead the mnt_root and mnt_sb would be set directly. The patch also makes the following changes: () the get_sb_() convenience functions in the core kernel now take a vfsmount pointer argument and return an integer, so most filesystems have to change very little. () If one of the convenience function is not used, then get_sb() should normally call simple_set_mnt() to instantiate the vfsmount. This will always return 0, and so can be tail-called from get_sb(). () generic_shutdown_super() now calls shrink_dcache_sb() to clean up the dcache upon superblock destruction rather than shrink_dcache_anon(). This is required because the superblock may now have multiple trees that aren't actually bound to s_root, but that still need to be cleaned up. The currently called functions assume that the whole tree is rooted at s_root, and that anonymous dentries are not the roots of trees which results in dentries being left unculled. However, with the way NFS superblock sharing are currently set to be implemented, these assumptions are violated: the root of the filesystem is simply a dummy dentry and inode (the real inode for '/' may well be inaccessible), and all the vfsmounts are rooted on anonymous[] dentries with child trees. [] Anonymous until discovered from another tree. () The documentation has been adjusted, including the additional bit of changing ext2_ into foo_* in the documentation. [akpm@osdl.org: convert ipath_fs, do other stuff] Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Nathan Scott <nathans@sgi.com> Cc: Roland Dreier <rolandd@cisco.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-06-23 07:42:45 -07:00
Roland Dreier	9b8efc0242	IB/uverbs: Remove unnecessary list_del()s In ib_uverbs_cleanup_ucontext(), when iterating through the lists of objects, there's no reason to do list_del() to remove the objects, since both the objects and the lists that contain them are about to be freed anyway. Since list_del() is a moderately big inline function, getting rid of this extra work saves quite a bit of .text: add/remove: 0/0 grow/shrink: 1/2 up/down: 3/-217 (-214) function old new delta ib_uverbs_comp_handler 225 228 +3 ib_uverbs_async_handler 256 255 -1 ib_uverbs_close 905 689 -216 Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-22 07:47:27 -07:00
Krishna Kumar	183208284e	IB/uverbs: Don't free wr list when it's known to be empty In ib_uverbs_post_send(), move the "out:" label after the loop that frees the list of work requests, since the only place that jumps there is before any work requests could possibly be added to the list. This removes a compile warning: "is_ud might be used uninitialized in this function". Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-22 07:47:27 -07:00
Roland Dreier	9ead190bfd	IB/uverbs: Don't serialize with ib_uverbs_idr_mutex Currently, all userspace verbs operations that call into the kernel are serialized by ib_uverbs_idr_mutex. This can be a scalability issue for some workloads, especially for devices driven by the ipath driver, which needs to call into the kernel even for datapath operations. Fix this by adding reference counts to the userspace objects, and then converting ib_uverbs_idr_mutex into a spinlock that only protects the idrs long enough to take a reference on the object being looked up. Because remove operations may fail, we have to do a slightly funky two-step deletion, which is described in the comments at the top of uverbs_cmd.c. This also still leaves ib_uverbs_idr_lock as a single lock that is possibly subject to contention. However, the lock hold time will only be a single idr operation, so multiple threads should still be able to make progress, even if ib_uverbs_idr_lock is being ping-ponged. Surprisingly, these changes even shrink the object code: add/remove: 23/5 grow/shrink: 4/21 up/down: 633/-693 (-60) Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:44:49 -07:00
Roland Dreier	3463175d6e	IB/uverbs: Factor out common idr code Factor out common code for adding a userspace object to an idr into a function idr_add_uobj(). This shrinks both the source and object code: add/remove: 1/0 grow/shrink: 0/6 up/down: 57/-220 (-163) function old new delta idr_add_uobj - 57 +57 ib_uverbs_create_ah 543 512 -31 ib_uverbs_create_srq 662 630 -32 ib_uverbs_reg_mr 737 699 -38 ib_uverbs_create_cq 639 600 -39 ib_uverbs_alloc_pd 485 446 -39 ib_uverbs_create_qp 1020 979 -41 Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:40 -07:00
Roland Dreier	92b1582268	IB/uverbs: Don't decrement usecnt on error paths In error paths when destroying an object, uverbs should not decrement associated objects' usecnt, since ib_dereg_mr(), ib_destroy_qp(), etc. already do that. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:40 -07:00
Ganapathi CH	77f76013e3	IB/uverbs: Release lock on error path If ibdev->alloc_ucontext() fails then ib_uverbs_get_context() does not unlock file->mutex before returning error. Signed-off by: Ganapathi CH <cganapathi@novell.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:40 -07:00
Sean Hefty	ca222c6b2c	IB/cm: Use address handle helpers Use new ib_init_ah_from_wc() and ib_init_ah_from_path() helper functions to clean up the IB CM. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:40 -07:00
Sean Hefty	6d969a471b	IB/sa: Add ib_init_ah_from_path() Add a call to initialize address handle attributes given a path record. This is used by the CM, and would be useful for users of UD QPs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:39 -07:00
Sean Hefty	4e00d69454	IB: Add ib_init_ah_from_wc() Add a function to initialize address handle attributes from a work completion. This functionality is duplicated by both verbs and the CM. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:39 -07:00
Sean Hefty	75af908851	IB/ucm: Get rid of duplicate P_Key parameter The P_Key is provided into a SIDR REQ in two places, once as a parameter, and again in the path record. Remove the P_Key as a parameter and always use the one given in the path record. This change has no practical effect on ABI functionality. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:39 -07:00
Or Gerlitz	6c8c1aa25d	IB/fmr: Use device's max_map_map_per_fmr attribute in FMR pool. When creating a FMR pool, query the IB device and use the returned max_map_map_per_fmr attribute as for the max number of FMR remaps. If the device does not suport querying this attribute, use the original IB_FMR_MAX_REMAPS (32) default. Signed-off-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:37 -07:00
Jack Morgenstein	9874e74655	IB/mad: Check GID/LID when matching requests Check GID/LID for requester side when searching for request which matches received response. This is in order to guarantee uniqueness if the same TID is used when requesting via multiple source LIDs (when LMC is not zero). Use ports' cached LMC to perform the check. Further, do not perform LID check for direct-routed packets, since the permissive LID makes a proper check impossible. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:34 -07:00
Jack Morgenstein	6fb9cdbf2c	IB: Add caching of ports' LMC Add an LMC cache to struct ib_device, and add a function ib_get_cached_lmc() to query the cache. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:34 -07:00
Michael S. Tsirkin	856c256f88	IB/cm: remove unneeded flush_workqueue destroy_workqueue() already does flush_workqueue(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2006-06-17 20:37:33 -07:00
Sean Hefty	4be10c1e6d	IB/ucm: convert semaphore to mutex Convert semaphore in ib_ucm_file to a real mutex. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:33 -07:00
Roland Dreier	403a496fd4	IB: Make needlessly global ib_mad_cache static Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:31 -07:00
Sean Hefty	e51060f08a	IB: IP address based RDMA connection manager Kernel connection management agent over InfiniBand that connects based on IP addresses. The agent defines a generic RDMA connection abstraction to support clients wanting to connect over different RDMA devices. The agent also handles RDMA device hotplug events on behalf of clients. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:29 -07:00
Sean Hefty	7025fcd36b	IB: address translation to map IP toIB addresses (GIDs) Add an address translation service that maps IP addresses to InfiniBand GID addresses using IPoIB. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:28 -07:00
Sean Hefty	6e61d04f2d	IB/cm: Match connection requests based on private data Extend matching connection requests to listens in the InfiniBand CM to include private data checks. This allows applications to listen on the same service identifier, with private data directing the request to the appropriate application. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:28 -07:00
Sean Hefty	6a9af2e18a	IB: common handling for marshalling parameters to/from userspace Provide common handling for marshalling data between userspace clients and kernel InfiniBand drivers. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-06-17 20:37:27 -07:00
Roland Dreier	0cb4fe8d26	IB/uverbs: Don't leak ref to mm on error path In ib_umem_release_on_close(), if the kmalloc() fails, then a reference to current->mm will be leaked. Fix this by adding a mmput() instead of just returning on kmalloc() failure. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-05-17 22:20:50 -07:00
Sean Hefty	1b52fa98ed	IB: refcount race fixes Fix race condition during destruction calls to avoid possibility of accessing object after it has been freed. Instead of waking up a wait queue directly, which is susceptible to a race where the object is freed between the reference count going to 0 and the wake_up(), use a completion to wait in the function doing the freeing. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-05-12 14:57:52 -07:00
Ralph Campbell	d8b9f23b23	IB: Fix display of 4-bit port counters in sysfs The code to display local_link_integrity_errors and excessive_buffer_overrun_errors in /sys/class/infiniband/<hca>/ports/<n>/counters/ uses the wrong shift to extract the 4 bit values. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-05-09 10:50:28 -07:00
Hal Rosenstock	64cb9c6aff	IB/mad: Fix RMPP version check during agent registration Only check that RMPP version is not specified when MAD class does not support RMPP. Just because a class is allowed to use RMPP doesn't mean that rmpp_version needs to be set for the MAD agent to register. Checking this was a recent change which was too pedantic. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-04-19 11:40:11 -07:00
Michael S. Tsirkin	ce684df05a	IB/cache: Use correct pointer to calculate size When allocating gid_cache, use kmalloc(sizeof gid_cache, ...) rather than kmalloc(sizeof pkey_cache, ...). It doesn't really matter which one is used, since the size ends up the same either way, but it's much better to say what we mean. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-04-10 13:17:43 -07:00
Jack Morgenstein	bf6a9e31cf	IB: simplify static rate encoding Push translation of static rate to HCA format into low-level drivers, where it belongs. For static rate encoding, use encoding of rate field from IB standard PathRecord, with addition of value 0, for backwards compatibility with current usage. The changes are: - Add enum ib_rate to midlayer includes. - Get rid of static rate translation in IPoIB; just use static rate directly from Path and MulticastGroup records. - Update mthca driver to translate absolute static rate into the format used by hardware. This also fixes mthca's static rate handling for HCAs that are capable of 4X DDR. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-04-10 09:43:47 -07:00
Michael S. Tsirkin	37289efe3e	IB/mad: fix oops in cancel_mads We have seen the following OOPs in cancel_mads, when restarting opensm multiple times: Call Trace: [<c010549b>] show_stack+0x9b/0xb0 [<c01055ec>] show_registers+0x11c/0x190 [<c01057cd>] die+0xed/0x160 [<c031b966>] do_page_fault+0x3f6/0x5d0 [<c010511f>] error_code+0x4f/0x60 [<f8ac4e38>] cancel_mads+0x128/0x150 [ib_mad] [<f8ac2811>] unregister_mad_agent+0x11/0x130 [ib_mad] [<f8ac2a12>] ib_unregister_mad_agent+0x12/0x20 [ib_mad] [<f8b10f23>] ib_umad_close+0xf3/0x130 [ib_umad] [<c0162937>] __fput+0x187/0x1c0 [<c01627a9>] fput+0x19/0x20 [<c0160f7a>] filp_close+0x3a/0x60 [<c0121ca8>] put_files_struct+0x68/0xa0 [<c0103cf7>] do_signal+0x47/0x100 [<c0103ded>] do_notify_resume+0x3d/0x40 [<c0103f9e>] work_notifysig+0x13/0x25 We traced this back to local_completions unlocking mad_agent_priv->lock while still keeping a pointer into local_list. A later call to list_del(&local->completion_list) would then corrupt the list. To fix this, remove the entry from local_list after looking it up but before releasing mad_agent_priv->lock, to prevent cancel_mads from finding and freeing it. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-04-02 14:39:19 -07:00
Hal Rosenstock	618a3c03fc	IB/mad: RMPP support for additional classes Add RMPP support for additional management classes that support it. Also, validate RMPP is consistent with management class specified. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-30 07:19:51 -08:00
Jack Morgenstein	fa9656bbd9	IB/mad: include GID/class when matching receives Received responses are currently matched against sent requests based on TID only. According to the spec, responses should match based on the combination of TID, management class, and requester LID/GID. Without the additional qualification, an agent that is responding to two requests, both of which have the same TID, can match RMPP ACKs with the incorrect transaction. This problem can occur on the SM node when responding to SA queries. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-30 07:19:48 -08:00
Michael S. Tsirkin	dc05980dd7	IB/mad: Fix oopsable race on device removal Fix an oopsable race debugged by Eli Cohen <eli@mellanox.co.il>: After removing the port from port_list, ib_mad_port_close flushes port_priv->wq before destroying the special QPs. This means that a completion event could arrive, and queue a new work in this work queue after flush. This patch also removes an unnecessary flush_workqueue(): destroy_workqueue() already includes a flush. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:25 -08:00
Roland Dreier	048975ac58	IB: Coverity fixes to sysfs.c Fix two bugs found by coverity: - Memory leak in error path of alloc_group_attrs() - Fencepost error in state_show(): the test should be < ARRAY_SIZE(), not <= ARRAY_SIZE(). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:25 -08:00
Ami Perlmutter	702b2aaccf	IB/uverbs: Use correct alt_pkey_index in modify QP The old code incorrectly used the primary P_Key index as the alternate index too. Signed-off-by: Ami Perlmutter <amip@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:24 -08:00
Jack Morgenstein	f36e1793e2	IB/umad: Add support for large RMPP transfers Add support for sending and receiving large RMPP transfers. The old code supports transfers only as large as a single contiguous kernel memory allocation. This patch uses linked list of memory buffers when sending and receiving data to avoid needing contiguous pages for larger transfers. Receive side: copy the arriving MADs in chunks instead of coalescing to one large buffer in kernel space. Send side: split a multipacket MAD buffer to a list of segments, (multipacket_list) and send these using a gather list of size 2. Also, save pointer to last sent segment, and retrieve requested segments by walking list starting at last sent segment. Finally, save pointer to last-acked segment. When retrying, retrieve segments for resending relative to this pointer. When updating last ack, start at this pointer. Signed-off-by: Jack Morgenstein <jackm@mellanox.co.il> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:23 -08:00
Sean Hefty	87fd1a11ae	IB/cm: Check cm_id state before handling a REP Move checking the state of a cm_id before modifying it when handling a REP. This fixes a bug seen under MPI scale-up testing, where a NULL timewait_info pointer is dereferenced if a request times out before a REP is received. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-03-20 10:08:23 -08:00

1 2 3 4 5 ...

398 Commits