linux

mirror of https://github.com/FEX-Emu/linux.git synced 2024-12-22 17:33:01 +00:00

Author	SHA1	Message	Date
Roland Dreier	a394f83bdf	IB/umad: Fix bit ordering and 32-on-64 problems on big endian systems The declaration of struct ib_user_mad_reg_req.method_mask[] exported to userspace was an array of __u32, but the kernel internally treated it as a bitmap made up of longs. This makes a difference for 64-bit big-endian kernels, where numbering the bits in an array of __u32 gives: |31.....0|63....31|95....64|127...96| while numbering the bits in an array of longs gives: |63..............0|127............64| 64-bit userspace can handle this by just treating method_mask[] as an array of longs, but 32-bit userspace is really stuck: the meaning of the bits in method_mask[] depends on whether the kernel is 32-bit or 64-bit, and there's no sane way for userspace to know that. Fix this by updating <rdma/ib_user_mad.h> to make it clear that method_mask[] is an array of longs, and using a compat_ioctl method to convert to an array of 64-bit longs to handle the 32-on-64 problem. This fixes the interface description to match existing behavior (so working binaries continue to work) in almost all situations, and gives consistent semantics in the case of 32-bit userspace that can run on either a 32-bit or 64-bit kernel, so that the same binary can work for both 32-on-32 and 32-on-64 systems. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:15 -07:00
Roland Dreier	2be8e3ee8e	IB/umad: Add P_Key index support Add support for setting the P_Key index of sent MADs and getting the P_Key index of received MADs. This requires a change to the layout of the ABI structure struct ib_user_mad_hdr, so to avoid breaking compatibility, we default to the old (unchanged) ABI and add a new ioctl IB_USER_MAD_ENABLE_PKEY that allows applications that are aware of the new ABI to opt into using it. We plan on switching to the new ABI by default in a year or so, and this patch adds a warning that is printed when an application uses the old ABI, to push people towards converting to the new ABI. Signed-off-by: Roland Dreier <rolandd@cisco.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Hal Rosenstock <hal@xsigo.com>	2007-10-09 19:59:15 -07:00
Ralph Campbell	57cb61d587	IB/core: Fix handling of multicast response failures I was looking at the code for multicast.c and noticed that ib_sa_join_multicast() calls queue_join() which puts the request at the front of the group->pending_list. If this is a second request, it seems like it would interfere with process_join_error() since group->last_join won't point to the member at the head of the pending_list. The sequence would thus be: 1. ib_sa_join_multicast() puts member1 on head of pending_list and starts work thread 2. mcast_work_handler() calls send_join() which sets group->last_join to member1 3. ib_sa_join_multicast() puts member2 on head of pending_list 4. join operation for member1 receives failures response from SA. 5. join_handler() is called with error status 6. process_join_error() fails to process member1 since it doesn't match the first entry in the group->pending_list. The impact is that the failed join request is tossed. The second request is processed, and after it completes, the original request ends up being retried. This change also results in join requests being processed in FIFO order. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:14 -07:00
Steve Wise	935ef2d7a2	RDMA/cma: Use neigh_event_send() to start neighbour discovery Calling arp_send() to initiate neighbour discovery (ND) doesn't do the full ND protocol. Namely, it doesn't handle retransmitting the arp request if it is dropped. The function neigh_event_send() does all this. Without doing full ND, RDMA address resolution fails in the presence of dropped ARP broadcast packets. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:13 -07:00
Joachim Fenkes	c8d8beea03	IB/umem: Add hugetlb flag to struct ib_umem During ib_umem_get(), determine whether all pages from the memory region are hugetlb pages and report this in the "hugetlb" member. Low-level drivers can use this information if they need it. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:13 -07:00
Sean Hefty	7ce86409ad	RDMA/ucma: Allow user space to set service type Export the ability to set the type of service to user space. Model the interface after setsockopt. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:12 -07:00
Sean Hefty	a81c994d5e	RDMA/cma: Add ability to specify type of service Provide support to specify a type of service for a communication identifier. A new function call is used when dealing with IPv4 addresses. For IPv6 addresses, the ToS is specified through the traffic class field in the sockaddr_in6 structure. Signed-off-by: Sean Hefty <sean.hefty@intel.com> [ The comments Eitan Zahavi and myself have made over the v1 post at <http://lists.openfabrics.org/pipermail/general/2007-August/039247.html> were fully addressed. ] Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:12 -07:00
Sean Hefty	733d65fe33	IB/sa: Add new QoS fields to path record The QoS annex defines new fields for path records. Add them to the ib_sa for consumers that want to use them. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Or Gerlitz <ogerlitz@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:12 -07:00
Ali Ayoub	3c10c7c929	IB/sa: Error handling thinko fix ib_create_send_mad() returns an error code pointer on error, not NULL. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:07 -07:00
Anton Blanchard	8a68bbe31d	IB/fmr_pool: Clean up some error messages in fmr_pool.c A number of printks in fmr_pool.c don't have newlines, e.g.: fmr_create failed for FMR 0<5>FS-Cache: Loaded Fix them up. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:05 -07:00
Roland Dreier	65d470b3ea	IB: find_first_zero_bit() takes unsigned pointer Fix sparse warning drivers/infiniband/core/device.c:142:6: warning: incorrect type in argument 1 (different signedness) drivers/infiniband/core/device.c:142:6: expected unsigned long const addr drivers/infiniband/core/device.c:142:6: got long [assigned] inuse by making the local variable inuse unsigned. Does not affect generated code at all. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-10-09 19:59:04 -07:00
Dotan Barak	92ddc447ce	IB: Move the macro IB_UMEM_MAX_PAGE_CHUNK() to umem.c After moving the definition of struct ib_umem_chunk from ib_verbs.h to ib_umem.h there isn't any reason for the macro IB_UMEM_MAX_PAGE_CHUNK to stay in ib_verbs.h. Move the macro to umem.c, the only place where it is used. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-03 10:45:18 -07:00
Sean Hefty	38d5af9565	IB/mad: Fix address handle leak in mad_rmpp The address handle associated with dual-sided RMPP direction switch ACKs is never destroyed. Free the AH for ACKs which fall into this category. Problem was reported by Dotan Barak <dotanb@dev.mellanox.co.il>. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-03 10:45:17 -07:00
Hal Rosenstock	8fc394b197	IB/mad: agent_send_response() should be void Nothing looks at the return value of agent_send_response(), so there's no point in returning anything. Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-03 10:45:17 -07:00
Hal Rosenstock	86dfbecdea	IB/mad: Fix memory leak in switch handling in ib_mad_recv_done_handler() If agent_send_response() returns an error, we shouldn't do anything differently than if it succeeds; setting response to NULL just means that the response buffer gets leaked. Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com> Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-03 10:45:17 -07:00
Hal Rosenstock	445d68070c	IB/mad: Fix error path if response alloc fails in ib_mad_recv_done_handler() If ib_mad_recv_done_handler() fails to allocate response, then it just printed a warning and continued, which leads to an oops if the MAD is being handled for a switch device, because the switch code uses response without checking for NULL. Fix this by bailing out of the function if the allocation fails. Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com> Signed-off-by: Hal Rosenstock <hal.rosenstock@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-03 10:45:17 -07:00
Roland Dreier	5399891052	IB/sa: Don't need to check for default P_Key twice Now that ib_find_pkey() ignores the membership bit of P_Keys, there's no need for ib_sa to look for both 0x7fff and 0xffff in a port's P_Key table. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-03 10:45:17 -07:00
Moni Shoua	36026ecc20	IB/core: Ignore membership bit in ib_find_pkey() ib_find_pkey() is used as a replacement for ib_find_cached_pkey(), and the original function ignored the membership bit when searching for a P_Key, so ib_find_pkey() should ignore the bit too. In particular, IPoIB turns on the P_Key membership bit of limited membership P_Keys when creating a child interface and looks for the full membership P_key. This broke if a port was a partial member of a partition when IPoIB switched from ib_find_cached_pkey() to ib_find_pkey(), and this change fixes things again. Signed-off-by: Moni Shoua <monis@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-08-03 10:45:17 -07:00
Paul Mundt	20c2df83d2	mm: Remove slab destructors from kmem_cache_create(). Slab destructors were no longer supported after Christoph's `c59def9f22` change. They've been BUGs for both slab and slub, and slob never supported them either. This rips out support for the dtor pointer from kmem_cache_create() completely and fixes up every single callsite in the kernel (there were about 224, not including the slab allocator definitions themselves, or the documentation references). Signed-off-by: Paul Mundt <lethal@linux-sh.org>	2007-07-20 10:11:58 +09:00
Yoann Padioleau	dd00cc486a	some kmalloc/memset ->kzalloc (tree wide) Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc). Here is a short excerpt of the semantic patch performing this transformation: @@ type T2; expression x; identifier f,fld; expression E; expression E1,E2; expression e1,e2,e3,y; statement S; @@ x = - kmalloc + kzalloc (E1,E2) ... when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\) - memset((T2)x,0,E1); @@ expression E1,E2,E3; @@ - kzalloc(E1 * E2,E3) + kcalloc(E1,E2,E3) [akpm@linux-foundation.org: get kcalloc args the right way around] Signed-off-by: Yoann Padioleau <padator@wanadoo.fr> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Acked-by: Russell King <rmk@arm.linux.org.uk> Cc: Bryan Wu <bryan.wu@analog.com> Acked-by: Jiri Slaby <jirislaby@gmail.com> Cc: Dave Airlie <airlied@linux.ie> Acked-by: Roland Dreier <rolandd@cisco.com> Cc: Jiri Kosina <jkosina@suse.cz> Acked-by: Dmitry Torokhov <dtor@mail.ru> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Mauro Carvalho Chehab <mchehab@infradead.org> Acked-by: Pierre Ossman <drzeus-list@drzeus.cx> Cc: Jeff Garzik <jeff@garzik.org> Cc: "David S. Miller" <davem@davemloft.net> Acked-by: Greg KH <greg@kroah.com> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: "Antonino A. Daplas" <adaplas@pol.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-07-19 10:04:50 -07:00
Dotan Barak	8f076531cd	RDMA/cma: Remove local write permission from QP access flags Local write permission makes no sense as part of the QP access flags, since the access flags only control what the remote end of the connection is allowed to do. Remove the code in the RDMA CM that initializes qp_access_flags with IB_ACCESS_LOCAL_WRITE. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 20:30:22 -07:00
Roland Dreier	454a01e7f4	IB/cm: Make internal function cm_get_ack_delay() static Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-17 18:37:43 -07:00
Linus Torvalds	0cdf6990e9	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (76 commits) IB: Update MAINTAINERS with Hal's new email address IB/mlx4: Implement query SRQ IB/mlx4: Implement query QP IB/cm: Send no match if a SIDR REQ does not match a listen IB/cm: Fix handling of duplicate SIDR REQs IB/cm: cm_msgs.h should include ib_cm.h IB/cm: Include HCA ACK delay in local ACK timeout IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible IB/sa: Make sure SA queries use default P_Key IPoIB: Recycle loopback skbs instead of freeing and reallocating IB/mthca: Replace memset(<addr>, 0, PAGE_SIZE) with clear_page(<addr>) IPoIB/cm: Fix warning if IPV6 is not enabled IB/core: Take sizeof the correct pointer when calling kmalloc() IB/ehca: Improve latency by unlocking after triggering the hardware IB/ehca: Notify consumers of LID/PKEY/SM changes after nondisruptive events IB/ehca: Return QP pointer in poll_cq() IB/ehca: Change idr spinlocks into rwlocks IB/ehca: Refactor sync between completions and destroy_cq using atomic_t IB/ehca: Lock renaming, static initializers IB/ehca: Report RDMA atomic attributes in query_qp() ...	2007-07-12 16:45:40 -07:00
Tejun Heo	7b595756ec	sysfs: kill unnecessary attribute->owner sysfs is now completely out of driver/module lifetime game. After deletion, a sysfs node doesn't access anything outside sysfs proper, so there's no reason to hold onto the attribute owners. Note that often the wrong modules were accounted for as owners leading to accessing removed modules. This patch kills now unnecessary attribute->owner. Note that with this change, userland holding a sysfs node does not prevent the backing module from being unloaded. For more info regarding lifetime rule cleanup, please read the following message. http://article.gmane.org/gmane.linux.kernel/510293 (tweaked by Greg to not delete the field just yet, to make it easier to merge things properly.) Signed-off-by: Tejun Heo <htejun@gmail.com> Cc: Cornelia Huck <cornelia.huck@de.ibm.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-07-11 16:09:06 -07:00
Sean Hefty	6164c8cd13	IB/cm: Send no match if a SIDR REQ does not match a listen If a SIDR REQ does not match a listen, we should reply with status value 1 (service ID not supported), rather than dropping through to the default case of status 2 (rejected by service provider). Doing this also fixes a bug where the cm_id_priv is removed from the remote_sidr_table twice. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 21:52:28 -07:00
Sean Hefty	29c2731cbf	IB/cm: Fix handling of duplicate SIDR REQs Fix handling of duplicate SIDR REQs to avoid sending a reject if a duplicate is detected. Duplicates should just be silently discarded. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 21:51:43 -07:00
Sean Hefty	5d861be8c8	IB/cm: cm_msgs.h should include ib_cm.h cm_msgs.h uses definitions from ib_cm.h. Include it directly, rather than depending on a specific include order. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 21:50:53 -07:00
Sean Hefty	1d84612649	IB/cm: Include HCA ACK delay in local ACK timeout The IB CM should include the HCA ACK delay when calculating the local ACK timeout value to use for RC QPs. If the HCA ACK delay is large enough relative to the packet life time, then if it is not taken into account, the calculated timeout value ends up being too small, which can result in "retry exceeded" errors. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 21:50:05 -07:00
Sean Hefty	24be6e81c7	IB/cm: Use spin_lock_irq() instead of spin_lock_irqsave() when possible The ib_cm is a little overzealous about using spin_lock_irqsave, when spin_lock_irq would do. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 21:47:29 -07:00
Sean Hefty	2aec5c602c	IB/sa: Make sure SA queries use default P_Key MADs sent to the SA should use the default P_Key (0x7fff/0xffff). There's no requirement that the default P_Key is stored at index 0 in the local P_Key table, so add code to the sa_query module to look up the index of the default P_Key when creating an address handle for the SA (which is done any time the P_Key table might change), and use this index for all SA queries. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 21:45:31 -07:00
Dotan Barak	856c52a741	IB/core: Take sizeof the correct pointer when calling kmalloc() When allocating out_mad in show_pma_counter(), take sizeof out_mad instead of sizeof in_mad. It is true that today the types of in_mad and out_mad are the same, but this patch will give us cleaner code. Signed-off-by: Dotan Barak <dotanb@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-10 11:04:40 -07:00
Andrew Morton	1d3f4b905a	IB: Fix ib_umem_get() when npages == 0 gcc correctly warned: drivers/infiniband/core/umem.c: In function 'ib_umem_get': drivers/infiniband/core/umem.c:78: warning: 'ret' may be used uninitialized in this function Set ret to 0 in case npages == 0 and the loop isn't entered at all. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 16:17:33 -07:00
Roland Dreier	43506d954e	IB: Remove garbage non-ASCII characters from comments A few files had 0xa0 characters in comments. Remove them so that the files are clean ASCII text. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 16:17:32 -07:00
Hal Rosenstock	1bae4dbf95	IB/mad: Enhance SMI for switch support Extend the SMI with switch (intermediate hop) support. Care has been taken to ensure that the CA (and router) code paths are changed as little as possible. Signed-off-by: Suresh Shelvapille <suri@baymicrosystems.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-07-09 16:17:32 -07:00
Roland Dreier	24bce50803	IB/umem: Fix possible hang on process exit If ib_umem_release() is called after ib_uverbs_close() sets context->closing, then a process can get stuck in a D state, because the code boils down to if (down_write_trylock(&mm->mmap_sem)) down_write(&mm->mmap_sem); which is obviously a stupid instant deadlock. Fix the code so that we only try to take the lock once. This bug was introduced in commit `f7c6a7b5` ("IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules") which fortunately never made it into a release, and was reported by Pete Wyckoff <pw@osc.edu>. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-21 11:05:58 -07:00
Sean Hefty	bf2944bd56	RDMA/cma: Fix initialization of next_port next_port should be between sysctl_local_port_range[0] and [1]. However, it is initially set to a random value with get_random_bytes(). If the value is negative when treated as a signed integer, next_port can end up outside the expected range because of the result of the % operator being negative. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-06-07 23:24:38 -07:00
Sean Hefty	d998ccce02	IB/cm: Fix stale connection detection The ib_cm can incorrectly detect a stale connection (a new connection request for a QPN that is already connected) as a duplicate connection request. Separate the handling of potential duplicate REQs from stale connections. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-29 16:07:09 -07:00
Linus Torvalds	8aee74c8ee	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/cm: Improve local id allocation IPoIB/cm: Fix SRQ WR leak IB/ipoib: Fix typos in error messages IB/mlx4: Check if SRQ is full when posting receive IB/mlx4: Pass send queue sizes from userspace to kernel IB/mlx4: Fix check of opcode in mlx4_ib_post_send() mlx4_core: Fix array overrun in dump_dev_cap_flags() IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions IB/mthca: Fix RESET to ERROR transition IB/mlx4: Set GRH:HopLimit when sending globally routed MADs IB/mthca: Set GRH:HopLimit when building MLX headers IB/mlx4: Fix check of max_qp_dest_rdma in modify QP IB/mthca: Fix use-after-free on device restart IB/ehca: Return proper error code if register_mr fails IPoIB: Handle P_Key table reordering IB/core: Use start_port() and end_port() IB/core: Add helpers for uncached GID and P_Key searches IB/ipath: Fix potential deadlock with multicast spinlocks IB/core: Free umem when mm is already gone	2007-05-21 16:19:32 -07:00
Michael S. Tsirkin	9f81036c54	IB/cm: Improve local id allocation The IB CM uses an idr for local id allocations, with a running counter as start_id. This fails to generate distinct ids if 1. An id is constantly created and destroyed 2. A chunk of ids just beyond the current next_id value is occupied This in turn leads to an increased chance of connection request being mis-detected as a duplicate, sometimes for several retries, until next_id gets past the block of allocated ids. This has been observed in practice. As a fix, remember the last id allocated and start immediately above it. This also fixes a problem with the old code, where next_id might overflow and become negative. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-21 13:41:29 -07:00
Alexey Dobriyan	e8edc6e03a	Detach sched.h from mm.h First thing mm.h does is including sched.h solely for can_do_mlock() inline function which has "current" dereference inside. By dealing with can_do_mlock() mm.h can be detached from sched.h which is good. See below, why. This patch a) removes unconditional inclusion of sched.h from mm.h b) makes can_do_mlock() normal function in mm/mlock.c c) exports can_do_mlock() to not break compilation d) adds sched.h inclusions back to files that were getting it indirectly. e) adds less bloated headers to some files (asm/signal.h, jiffies.h) that were getting them indirectly Net result is: a) mm.h users would get less code to open, read, preprocess, parse, ... if they don't need sched.h b) sched.h stops being dependency for significant number of files: on x86_64 allmodconfig touching sched.h results in recompile of 4083 files, after patch it's only 3744 (-8.3%). Cross-compile tested on all arm defconfigs, all mips defconfigs, all powerpc defconfigs, alpha alpha-up arm i386 i386-up i386-defconfig i386-allnoconfig ia64 ia64-up m68k mips parisc parisc-up powerpc powerpc-up s390 s390-up sparc sparc-up sparc64 sparc64-up um-x86_64 x86_64 x86_64-up x86_64-defconfig x86_64-allnoconfig as well as my two usual configs. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-21 09:18:19 -07:00
Roland Dreier	1af4c435f3	IB/core: Use start_port() and end_port() Clean up ib_query_port() and ib_modify_port() slightly by using the just-added start_port() and end_port() helpers. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-19 08:51:54 -07:00
Yosef Etigin	5eb620c81c	IB/core: Add helpers for uncached GID and P_Key searches Add ib_find_gid() and ib_find_pkey() functions that use uncached device queries. The calls might block but the returns are always up-to-date. Cache P_Key and GID table lengths in core to avoid extra port info queries. Signed-off-by: Yosef Etigin <yosefe@voltaire.com> Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-19 08:51:53 -07:00
Eli Cohen	7b82cd8ee7	IB/core: Free umem when mm is already gone Free umem when task's mm is already destroyed by the time ib_umem_release gets called. Found by Dotan Barak at Mellanox. Signed-off-by: Eli Cohen <eli@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-19 08:51:53 -07:00
Sean Hefty	6c719f5c6c	RDMA/cma: Add check to validate that cm_id is bound to a device Several checks in the rdma_cm check against the state of the cm_id, but only to validate that the cm_id is bound to an underlying transport specific CM and an RDMA device. Make the check explicit in what we're trying to check for, since we're not synchronizing against the cm_id state. This will allow a user to disconnect a cm_id or reject a connection after receiving a device removal event. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 14:10:32 -07:00
Sean Hefty	be65f086f2	RDMA/cma: Fix synchronization with device removal in cma_iw_handler The cma_iw_handler needs to validate the state of the rdma_cm_id before processing a new connection request to ensure that a device removal is not already being processed for the same rdma_cm_id. Without the state check, the user can receive simultaneous callbacks for the same cm_id, or a callback after they've destroyed the cm_id. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:56:32 -07:00
Sean Hefty	8aa08602bd	RDMA/cma: Simplify device removal handling code Add a new routine and rename another to encapsulate common code for synchronizing with device removal. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-14 13:54:49 -07:00
Roland Dreier	1bf66a3042	IB: Put rlimit accounting struct in struct ib_umem When memory pinned with ib_umem_get() is released, ib_umem_release() needs to subtract the amount of memory being unpinned from mm->locked_vm. However, ib_umem_release() may be called with mm->mmap_sem already held for writing if the memory is being released as part of an munmap() call, so it is sometimes necessary to defer this accounting into a workqueue. However, the work struct used to defer this accounting is dynamically allocated before it is queued, so there is the possibility of failing that allocation. If the allocation fails, then ib_umem_release has no choice except to bail out and leave the process with a permanently elevated locked_vm. Fix this by allocating the structure to defer accounting as part of the original struct ib_umem, so there's no possibility of failing a later allocation if creating the struct ib_umem and pinning memory succeeds. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:37 -07:00
Roland Dreier	f7c6a7b5d5	IB/uverbs: Export ib_umem_get()/ib_umem_release() to modules Export ib_umem_get()/ib_umem_release() and put low-level drivers in control of when to call ib_umem_get() to pin and DMA map userspace, rather than always calling it in ib_uverbs_reg_mr() before calling the low-level driver's reg_user_mr method. Also move these functions to be in the ib_core module instead of ib_uverbs, so that driver modules using them do not depend on ib_uverbs. This has a number of advantages: - It is better design from the standpoint of making generic code a library that can be used or overridden by device-specific code as the details of specific devices dictate. - Drivers that do not need to pin userspace memory regions do not need to take the performance hit of calling ib_umem_get(). For example, although I have not tried to implement it in this patch, the ipath driver should be able to avoid pinning memory and just use copy_{to,from}_user() to access userspace memory regions. - Buffers that need special mapping treatment can be identified by the low-level driver. For example, it may be possible to solve some Altix-specific memory ordering issues with mthca CQs in userspace by mapping CQ buffers with extra flags. - Drivers that need to pin and DMA map userspace memory for things other than memory regions can use ib_umem_get() directly, instead of hacks using extra parameters to their reg_phys_mr method. For example, the mlx4 driver that is pending being merged needs to pin and DMA map QP and CQ buffers, but it does not need to create a memory key for these buffers. So the cleanest solution is for mlx4 to call ib_umem_get() in the create_qp and create_cq methods. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-08 18:00:37 -07:00
Linus Torvalds	972d45fb43	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IPoIB: Convert to NAPI IB: Return "maybe missed event" hint from ib_req_notify_cq() IB: Add CQ comp_vector support IB/ipath: Fix a race condition when generating ACKs IB/ipath: Fix two more spin lock problems IB/fmr_pool: Add prefix to all printks IB/srp: Set proc_name IB/srp: Add orig_dgid sysfs attribute to scsi_host IPoIB/cm: Don't crash if remote side uses one QP for both directions RDMA/cxgb3: Support for new abort logic RDMA/cxgb3: Initialize cpu_idx field in cpl_close_listserv_req message RDMA/cxgb3: Fail qp creation if the requested max_inline is too large RDMA/cxgb3: Fix TERM codes IPoIB/cm: Fix error handling in ipoib_cm_dev_open() IB/ipath: Don't corrupt pending mmap list when unmapped objects are freed IB/mthca: Work around kernel QP starvation IB/ipath: Don't put QP in timeout queue if waiting to send IB/ipath: Don't call spin_lock_irq() from interrupt context	2007-05-07 12:18:21 -07:00
Michael S. Tsirkin	f4fd0b224d	IB: Add CQ comp_vector support Add a num_comp_vectors member to struct ib_device and extend ib_create_cq() to pass in a comp_vector parameter -- this parallels the userspace libibverbs API. Update all hardware drivers to set num_comp_vectors to 1 and have all ULPs pass 0 for the comp_vector value. Pass the value of num_comp_vectors to userspace rather than hard-coding a value of 1. We want multiple CQ event vector support (via MSI-X or similar for adapters that can generate multiple interrupts), but it's not clear how many vectors we want, or how we want to deal with policy issues such as how to decide which vector to use or how to set up interrupt affinity. This patch is useful for experimenting, since no core changes will be necessary when updating a driver to support multiple vectors, and we know that we want to make at least these changes anyway. Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Roland Dreier	1a70a05d9d	IB/fmr_pool: Add prefix to all printks Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-05-06 21:18:11 -07:00
Jean Delvare	6473d160b4	PCI: Cleanup the includes of <linux/pci.h> I noticed that many source files include <linux/pci.h> while they do not appear to need it. Here is an attempt to clean it all up. In order to find all possibly affected files, I searched for all files including <linux/pci.h> but without any other occurrence of "pci" or "PCI". I removed the include statement from all of these, then I compiled an allmodconfig kernel on both i386 and x86_64 and fixed the false positives manually. My tests covered 66% of the affected files, so there could be false positives remaining. Untested files are: arch/alpha/kernel/err_common.c arch/alpha/kernel/err_ev6.c arch/alpha/kernel/err_ev7.c arch/ia64/sn/kernel/huberror.c arch/ia64/sn/kernel/xpnet.c arch/m68knommu/kernel/dma.c arch/mips/lib/iomap.c arch/powerpc/platforms/pseries/ras.c arch/ppc/8260_io/enet.c arch/ppc/8260_io/fcc_enet.c arch/ppc/8xx_io/enet.c arch/ppc/syslib/ppc4xx_sgdma.c arch/sh64/mach-cayman/iomap.c arch/xtensa/kernel/xtensa_ksyms.c arch/xtensa/platform-iss/setup.c drivers/i2c/busses/i2c-at91.c drivers/i2c/busses/i2c-mpc.c drivers/media/video/saa711x.c drivers/misc/hdpuftrs/hdpu_cpustate.c drivers/misc/hdpuftrs/hdpu_nexus.c drivers/net/au1000_eth.c drivers/net/fec_8xx/fec_main.c drivers/net/fec_8xx/fec_mii.c drivers/net/fs_enet/fs_enet-main.c drivers/net/fs_enet/mac-fcc.c drivers/net/fs_enet/mac-fec.c drivers/net/fs_enet/mac-scc.c drivers/net/fs_enet/mii-bitbang.c drivers/net/fs_enet/mii-fec.c drivers/net/ibm_emac/ibm_emac_core.c drivers/net/lasi_82596.c drivers/parisc/hppb.c drivers/sbus/sbus.c drivers/video/g364fb.c drivers/video/platinumfb.c drivers/video/stifb.c drivers/video/valkyriefb.c include/asm-arm/arch-ixp4xx/dma.h sound/oss/au1550_ac97.c I would welcome test reports for these files. I am fine with removing the untested files from the patch if the general opinion is that these changes aren't safe. The tested part would still be nice to have. Note that this patch depends on another header fixup patch I submitted to LKML yesterday: [PATCH] scatterlist.h needs types.h http://lkml.org/lkml/2007/3/01/141 Signed-off-by: Jean Delvare <khali@linux-fr.org> Cc: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2007-05-02 19:02:35 -07:00
Joachim Fenkes	1912ffbb88	IB: Set class_dev->dev in core for nice device symlink All RDMA drivers except ehca set class_dev->dev to their dma_device value (ehca leaves this unset). dma_device is the only value that makes any sense, so move this assignment to core/sysfs.c. This reduces the duplicated code in the rest of the drivers and gives ehca a nice /sys/class/infiniband/ehcaX/device symlink. Signed-off-by: Joachim Fenkes <fenkes@de.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-04-24 21:30:38 -07:00
Hal Rosenstock	de493d47d8	IB/mad: Change SMI to use enums rather than magic return codes Clarify code by changing return values from SMI functions to named enum values instead of magic 0/1 values. Signed-off-by: Hal Rosenstock <halr@voltaire.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-04-24 16:31:12 -07:00
Sean Hefty	aeba84a925	IB/umad: Implement GRH handling for sent/received MADs We need to set the SGID index for routed MADs and pass received GRH information to userspace when a MAD is received. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:12 -07:00
Sean Hefty	d0e7bb1418	IB/sa: Set src_path_bits correctly in ib_init_ah_from_path() src_path_bits needs to mask off the base LID value. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:12 -07:00
Sean Hefty	9d41b7fdea	IB/ucm: Simplify ib_ucm_event() Use wait_event_interruptible() instead of a more complicated open-coded equivalent. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:11 -07:00
Sean Hefty	d92f76448c	RDMA/ucma: Simplify ucma_get_event() Use wait_event_interruptible() instead of a more complicated open-coded equivalent. Signed-off-by: Sean Hefty <sean.hefty@intel.com>	2007-04-24 16:31:11 -07:00
Hal Rosenstock	9a4b65e357	IB/umad: Fix declaration of dev_map[] The current ib_umad code never accesses bits past IB_UMAD_MAX_PORTS in dev_map[]. We shouldn't declare it to be twice as big. Pointed-out-by: Roland Dreier <rolandd@cisco.com> Signed-off-by: Hal Rosenstock <halr@voltaire.com>	2007-04-18 20:20:53 -07:00
Sean Hefty	3492856e33	RDMA/ucma: Avoid sending reject if backlog is full Change the returned error code to ENOMEM if the connection event backlog is full. This prevents the ib_cm from issuing a reject on the connection, which can allow retries to succeed. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-06 14:58:11 -08:00
Sean Hefty	cb164b8c6a	RDMA/cma: Initialize rdma_bind_list in cma_alloc_any_port() The struct rdma_bind_list fields for hlist are not being initialized, resulting in a corrupted list. Fix this by using kzalloc() to make sure all pointers are NULL. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-03-06 12:41:44 -08:00
Sean Hefty	1836854f25	RDMA/cma: Remove unused node_guid from cma_device structure Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:35 -08:00
Sean Hefty	e971b8cd19	IB/cm: Remove ca_guid from cm_device structure The cm_device references an ib_device, which already contains the node_guid. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:33 -08:00
Sean Hefty	962063e64b	RDMA/cma: Request reversible paths only The rdma_cm requires that path records be reversible. Set the reversible bit when issuing a path record query. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:07 -08:00
Sean Hefty	47645d8d25	IB/core: Set hop limit in ib_init_ah_from_wc correctly The hop_limit value in the ah_attr should be 0xFF, not the value read from the received GRH (which should be 0). See 13.5.4.4 in the 1.2 IB spec. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 17:54:02 -08:00
Roland Dreier	aaf1aef55f	IB/uverbs: Return correct error for invalid PD in register MR If no matching PD is found in ib_uverbs_reg_mr(), then the function jumps to err_release without setting the return value ret. This means that ret will hold the return value of the call to ib_umem_get() a few lines earlier; if the function reaches the point where it looks for the PD, we know that ib_umem_get() must have returned 0, so ib_uverbs_reg_mr() ends up returning 0 for a bad PD ID. Fix this by setting ret to -EINVAL before jumping to the exit path when no PD is found. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-22 13:16:51 -08:00
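The error-path bug described in aaf1aef55f can be illustrated with a small userspace sketch. This is not the kernel code: `reg_mr()`, `lookup_pd()`, `get_umem()` and the `pd_table` array are hypothetical stand-ins, and the only point shown is that `ret` must be set before jumping to the shared exit label.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a protection domain; not the kernel's type. */
struct pd {
    int id;
};

static struct pd pd_table[] = { { 1 }, { 2 } };

/* Hypothetical lookup: returns NULL when no PD has the given ID. */
static struct pd *lookup_pd(int id)
{
    size_t i;

    for (i = 0; i < sizeof(pd_table) / sizeof(pd_table[0]); i++)
        if (pd_table[i].id == id)
            return &pd_table[i];
    return NULL;
}

/* Hypothetical earlier step that always succeeds in this sketch. */
static int get_umem(void)
{
    return 0;
}

int reg_mr(int pd_id)
{
    struct pd *pd;
    int ret;

    ret = get_umem();
    if (ret)
        goto err;

    pd = lookup_pd(pd_id);
    if (!pd) {
        /* Without this assignment ret would still hold the 0 from
         * get_umem(), and the caller would see success for a bad ID. */
        ret = -EINVAL;
        goto err_release;
    }

    return 0;

err_release:
    /* resources acquired by get_umem() would be released here */
err:
    return ret;
}
```

With the assignment in place, a call with an unknown ID reports `-EINVAL` instead of silently returning 0.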
Roland Dreier	7084f8429c	IB/core: Set static rate in ib_init_ah_from_path() The static rate from the path record should be put into the address vector -- a long time ago the rate in the address attributes needed to be a relative rate, which required more munging, but now that the conversion from absolute to relative is done in the low-level driver, it's easy for ib_init_ah_from_path() to put the absolute rate in. Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Cc: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 15:31:24 -08:00
Roland Dreier	38abaa63bf	IB/core: Fix sparse warnings about shadowed declarations Change a couple of variable names to avoid sparse warnings about symbols being shadowed. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 14:41:14 -08:00
Sean Hefty	c8f6a362bf	RDMA/cma: Add multicast communication support Extend rdma_cm to support multicast communication. Multicast support is added to the existing RDMA_PS_UDP port space, as well as a new RDMA_PS_IPOIB port space. The latter port space allows joining the multicast groups used by IPoIB, which enables offloading IPoIB traffic to a separate QP. The port space determines the signature used in the MGID when joining the group. The newly added RDMA_PS_IPOIB also allows for unicast operations, similar to RDMA_PS_UDP. Supporting the RDMA_PS_IPOIB requires changing how UD QPs are initialized, since we can no longer assume that the qkey is constant. This requires saving the Q_Key to use when attaching to a device, so that it is available when creating the QP. The Q_Key information is exported to the user through the existing rdma_init_qp_attr() interface. Multicast support is also exported to userspace through the rdma_ucm. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 14:29:07 -08:00
Sean Hefty	faec2f7b96	IB/sa: Track multicast join/leave requests The IB SA tracks multicast join/leave requests on a per port basis and does not do any reference counting: if two users of the same port join the same group, and one leaves that group, then the SA will remove the port from the group even though there is one user who wants to stay a member left. Therefore, in order to support multiple users of the same multicast group from the same port, we need to perform reference counting locally. To do this, add a multicast submodule to ib_sa to perform reference counting of multicast join/leave operations. Modify ib_ipoib (the only in-kernel user of multicast) to use the new interface. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 14:20:02 -08:00
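The local reference counting that faec2f7b96 describes can be sketched as follows. This is a simplified, single-threaded illustration and not the ib_sa multicast module: `mcast_join()`, `mcast_leave()`, the fixed-size `groups` table and the `sa_joins`/`sa_leaves` counters are all assumptions made for the example. The idea shown is that only the first join and the last leave for a group result in a request to the SA.

```c
#include <stddef.h>

#define MAX_GROUPS 8

/* Hypothetical per-port group record; a slot is free when refcount is 0. */
struct mcast_group {
    unsigned long long mgid;
    int refcount;
};

static struct mcast_group groups[MAX_GROUPS];

/* Count the requests that would actually be sent to the SA. */
static int sa_joins;
static int sa_leaves;

static struct mcast_group *find_group(unsigned long long mgid)
{
    size_t i;

    for (i = 0; i < MAX_GROUPS; i++)
        if (groups[i].refcount > 0 && groups[i].mgid == mgid)
            return &groups[i];
    return NULL;
}

/* Returns 0 on success, -1 if the table is full. */
int mcast_join(unsigned long long mgid)
{
    struct mcast_group *g = find_group(mgid);
    size_t i;

    if (g) {
        g->refcount++;      /* already a member: no SA traffic needed */
        return 0;
    }

    for (i = 0; i < MAX_GROUPS; i++) {
        if (groups[i].refcount == 0) {
            groups[i].mgid = mgid;
            groups[i].refcount = 1;
            sa_joins++;     /* first local user: join at the SA */
            return 0;
        }
    }
    return -1;
}

/* Returns 0 on success, -1 if the group was never joined. */
int mcast_leave(unsigned long long mgid)
{
    struct mcast_group *g = find_group(mgid);

    if (!g)
        return -1;

    if (--g->refcount == 0)
        sa_leaves++;        /* last local user gone: leave at the SA */
    return 0;
}
```

In this sketch two joins followed by one leave generate a single SA join and no SA leave, so the remaining user stays a member.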
Steve Wise	ebb90986e1	RDMA/iwcm: iw_cm_id destruction race fixes iwcm iw_cm_id destruction race condition fixes: - iwcm_deref_id() always wakes up if there's another reference. - clean up race condition in cm_work_handler(). - create static void free_cm_id() which deallocs the work entries and then kfrees the cm_id memory. This reduces code replication. - rem_ref() if this is the last reference -and- the IWCM owns freeing the cm_id, then free it. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Tom Tucker <tom@opengridcomputing.com> Acked-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-16 13:57:35 -08:00
Tim Schmielau	cd354f1ae7	[PATCH] remove many unneeded #includes of sched.h After Al Viro (finally) succeeded in removing the sched.h #include in module.h recently, it makes sense again to remove other superfluous sched.h includes. There are quite a lot of files which include it but don't actually need anything defined in there. Presumably these includes were once needed for macros that used to live in sched.h, but moved to other header files in the course of cleaning it up. To ease the pain, this time I did not fiddle with any header files and only removed #includes from .c-files, which tend to cause less trouble. Compile tested against 2.6.20-rc2 and 2.6.20-rc2-mm2 (with offsets) on alpha, arm, i386, ia64, mips, powerpc, and x86_64 with allnoconfig, defconfig, allmodconfig, and allyesconfig as well as a few randconfigs on x86_64 and all configs in arch/arm/configs on arm. I also checked that no new warnings were introduced by the patch (actually, some warnings are removed that were emitted by unnecessarily included header files). Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-14 08:09:54 -08:00
Linus Torvalds	93bbad8fe1	Merge branch 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband * 'for-linus' of master.kernel.org:/pub/scm/linux/kernel/git/roland/infiniband: IB/mthca: Always fill MTTs from CPU IB/mthca: Merge MR and FMR space on 64-bit systems IB/mthca: Fix access to MTT and MPT tables on non-cache-coherent CPUs IB/mthca: Give reserved MTTs a separate cache line IB/mthca: Fix reserved MTTs calculation on mem-free HCAs RDMA/cxgb3: Add driver for Chelsio T3 RNIC IB: Remove redundant "_wq" from workqueue names RDMA/cma: Increment port number after close to avoid re-use IB/ehca: Fix memleak on module unloading IB/mthca: Work around gcc bug on sparc64 IPoIB: Connected mode experimental support IB/core: Use ARRAY_SIZE macro for mandatory_table IB/mthca: Use correct structure size in call to memset()	2007-02-13 21:16:39 -08:00
Arjan van de Ven	2b8693c061	[PATCH] mark struct file_operations const 3 Many struct file_operations in the kernel can be "const". Marking them const moves these to the .rodata section, which avoids false sharing with potential dirty data. In addition it'll catch accidental writes at compile time to these shared resources. Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-02-12 09:48:45 -08:00
Sean Hefty	c7f743a669	IB: Remove redundant "_wq" from workqueue names Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-10 08:00:50 -08:00
Sean Hefty	aedec08050	RDMA/cma: Increment port number after close to avoid re-use Randomize the starting port number and avoid re-using port values immediately after they are closed. Instead keep track of the last port value used and increment it every time a new port number is assigned, to better replicate other port spaces. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-10 08:00:50 -08:00
Ahmed S. Darwish	9a6b090c0d	IB/core: Use ARRAY_SIZE macro for mandatory_table Use ARRAY_SIZE() macro already defined in kernel.h instead of open coding equivalent code. Signed-off-by: Ahmed S. Darwish <darwish.07@gmail.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-10 08:00:47 -08:00
Steve Wise	1f12667021	RDMA/addr: Handle ethernet neighbour updates during route resolution The iWARP connection manager uses the ib_addr services to do route resolution (neighbour discovery in the IP world). The ib_addr netevent callback routine, however, currently only acts on InfiniBand neighbour updates. It needs to act on ethernet neighbour updates as well. This patch just removes filtering on device type altogether and will trigger on any neighbour updates where the nud_type is valid. This simplifies the code some. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-04 14:11:57 -08:00
Michael S. Tsirkin	062dbb69f3	IB: Return qp pointer as part of ib_wc struct ib_wc currently only includes the local QP number: this matches the IB spec, but seems mostly useless. The following patch replaces this with the pointer to qp itself, and updates all low level drivers and all users. This has the following advantages: - Ability to get a per-qp context through wc->qp->qp_context - Existing drivers already have the qp pointer ready in poll cq, so this change actually saves a tiny bit (extra memory read) on data path (for ehca it would actually be expensive to find the QP pointer when polling a CQ, but ehca does not support SRQ so we can leave wc->qp as NULL for ehca) - Users that need the QP number can still get it through wc->qp->qp_num Use case: In IPoIB connected mode code, I have a common CQ shared by multiple QPs. To track connection usage, I need a way to get at some per-QP context upon the completion, and I would like to avoid allocating context object per work request just to stick a QP pointer into it. With this code, I can just use wc->qp->qp_context. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-02-04 14:11:55 -08:00
Sean Hefty	0cefcf0bbc	RDMA/ucma: Don't report events with invalid user context There's a problem with how rdma cm events are reported to userspace that can lead to application crashes. When a new connection request arrives, a context for the connection is allocated in the kernel. The connection event is then reported to userspace. The userspace library retrieves the event and allocates its own context for the connection. The userspace context is associated with the kernel's context when accepting. This allows the kernel to give userspace context with other events. A problem occurs if a second event for the same connection occurs before the user has had a chance to call accept. The userspace context has not yet been set, which causes the librdmacm to crash. (This has been seen when the app takes too long to call accept, resulting in the remote side timing out and rejecting the connection) Fix this by ignoring events for new connections until userspace has set their context. This can only happen if an error occurs on a new connection before the user accepts it. This is okay, since the accept will just fail later. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-01-07 20:20:08 -08:00
Sean Hefty	30a5ec982e	RDMA/ucma: Fix struct ucma_event leak when backlog is full We discard new connection requests while the listen backlog is full, but leak a struct ucma_event in the process. Free the structure in this case. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-01-07 20:17:34 -08:00
Steve Wise	881a045fc5	RDMA/iwcm: iWARP connection timeouts shouldn't be reported as rejects The iWARP CM should report timeouts as event RDMA_CM_EVENT_UNREACHABLE, not event RDMA_CM_EVENT_REJECTED. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2007-01-07 20:15:58 -08:00
Ralph Campbell	1527106ff8	IB/core: Use the new verbs DMA mapping functions Convert code in core/ to use the new DMA mapping functions for kernel verbs consumers. Signed-off-by: Ralph Campbell <ralph.campbell@qlogic.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 14:28:30 -08:00
Sean Hefty	7521663857	RDMA/cma: Export rdma cm interface to userspace Export the rdma cm interfaces to userspace via a misc device. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:22 -08:00
Sean Hefty	628e5f6d39	RDMA/cma: Add support for RDMA_PS_UDP Allow the use of UD QPs through the rdma_cm, in order to provide address translation services for resolving IB addresses for datagram messages using SIDR. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Sean Hefty	0fe313b000	RDMA/cma: Allow early transition to RTS to handle lost CM messages During connection establishment, the passive side of a connection can receive messages from the active side before the connection event has been delivered to the user. Allow the passive side to send messages in response to received data before the event is delivered. To handle the case where the connection messages are lost, a new rdma_notify() function is added that users may invoke to force a connection into the established state. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Sean Hefty	a1b1b61f80	RDMA/cma: Report connect info with connect events Connection information was never given to the recipient of a connection request or reply message. Only the event was delivered. Report the connection data with the event to allow the user to reject the connection based on the requested parameters, or adjust their resources to match the request. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Sean Hefty	9b2e9c0c24	RDMA/cma: Remove unneeded qp_type parameter from rdma_cm The qp_type parameter into the rdma_cm is unneeded, and can be misleading. The QP type should be determined from the port space. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:21 -08:00
Roland Dreier	f47e22c6e4	IB/fmr: ib_flush_fmr_pool() may wait too long ib_flush_fmr_pool() stashes away the request generation number properly, but then goes ahead and rereads it every time it tests whether the flush generation number has caught up. This means that there is a theoretical possibility of livelock, if the request generation number keeps getting bumped and the flush generation number never catches up. The fix is simple: use the request generation number read at the beginning of the function. Also, atomic_inc() followed by atomic_read() can be replaced with atomic_inc_return(). There's no real requirement for atomicity here but we might as well shrink the code. This bug was discovered using David Binderman's list of "set but never used" warnings from icc. Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-12-12 11:50:19 -08:00
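The livelock that f47e22c6e4 describes comes from comparing against a moving target. A minimal single-threaded sketch, assuming a hypothetical `struct pool` in which plain `int` counters stand in for the kernel's atomic counters, and hypothetical helpers `request_flush()` and `flush_done()`:

```c
/* Hypothetical pool state; plain ints stand in for atomic counters. */
struct pool {
    int req_ser;    /* bumped each time a flush is requested */
    int flush_ser;  /* last request serial the flusher has completed */
};

/* Bump the request serial and return the new value in one step,
 * the role atomic_inc_return() plays in the commit message. */
int request_flush(struct pool *p)
{
    return ++p->req_ser;
}

/* True once the flusher has caught up with the given serial. */
int flush_done(const struct pool *p, int serial)
{
    return p->flush_ser - serial >= 0;
}
```

A waiter that saves the value returned by `request_flush()` and keeps testing `flush_done(p, saved)` finishes as soon as its own request has been serviced. A waiter that passes the current `p->req_ser` on every test can be kept waiting for as long as other callers keep requesting flushes, which is the behaviour the commit removes.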
Josef Sipek	1cfd6e648b	[PATCH] struct path: convert infiniband Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>	2006-12-08 08:28:46 -08:00
David Howells	4c1ac1b491	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6 Conflicts: drivers/infiniband/core/iwcm.c drivers/net/chelsio/cxgb2.c drivers/net/wireless/bcm43xx/bcm43xx_main.c drivers/net/wireless/prism54/islpci_eth.c drivers/usb/core/hub.h drivers/usb/input/hid-core.c net/core/netpoll.c Fix up merge failures with Linus's head and fix new compilation failures. Signed-Off-By: David Howells <dhowells@redhat.com>	2006-12-05 14:37:56 +00:00
Michael S. Tsirkin	f469b2626f	IB/ucm: Fix deadlock in cleanup ib_ucm_cleanup_events() holds file_mutex while calling ib_destroy_cm_id(). This can deadlock since ib_destroy_cm_id() flushes event handlers, and ib_ucm_event_handler() needs file_mutex, too. Therefore, drop the file_mutex during the call to ib_destroy_cm_id(). Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:10 -08:00
Sean Hefty	e1444b5a16	IB/cm: Fix automatic path migration support The ib_cm_establish() function is replaced with a more generic ib_cm_notify(). This routine is used to notify the CM that failover has occurred, so that future CM messages (LAP, DREQ) reach the remote CM. (Currently, we continue to use the original path) This bumps the userspace CM ABI. New alternate path information is captured when a LAP message is sent or received. This allows QP attributes to be initialized for the user when a new path is loaded after failover occurs. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:10 -08:00
Roland Dreier	04699a1f86	RDMA/addr: list_move() cleanups Replace a couple list_del()/list_add() combos with list_move(). Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:09 -08:00
Krishna Kumar	c78bb8442b	RDMA/addr: Fix some cancellation problems in process_req() Fix the following problems in process_req() relating to cancellation: - Function is wrongly doing another addr_remote() when cancelled, which is not required. - Make failure reporting immediate by using time_after_eq(). - On cancellation, -ETIMEDOUT was returned to the callback routine instead of the more appropriate -ECANCELED (users getting notified may want to print/return this status, eg ucma_event_handler). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:09 -08:00
Krishna Kumar	9ab1ffa877	RDMA/iwcm: Fix comment for iwcm_deref_id() to match code In iwcm_deref_id(), the comment says: "If the last reference is being removed and iw_destroy_cm_id is waiting, wake up the waiting thread". The second part of the comment, "and iw_destroy_cm_id is waiting," is wrong, since this function either wakes the waiter already waiting in iwcm_deref_id, or enables it (so that when wait_for_completion() is performed later, it will immediately return). Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:08 -08:00
Krishna Kumar	715a588f42	RDMA/iwcm: Remove unnecessary function argument Remove unnecessary cm_id_priv argument to copy_private_data(), and change text to reflect the code. Fix a couple of typos in comments. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:08 -08:00
Krishna Kumar	13fccdb380	RDMA/iwcm: Remove unnecessary initializations Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:08 -08:00
Krishna Kumar	83b9658623	RDMA/iwcm: Fix memory leak If we get an IW_CM_EVENT_CONNECT_REQUEST message and encounter an error (not in the LISTEN state, cannot create an id, cannot alloc work_entry, etc), then the memory allocated by cm_event_handler() in the event->private_data gets leaked. Since cm_work_handler has already put the event on the work_free_list, this allocated memory is leaked. A high backlog value can allow DoS attacks. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:07 -08:00
Krishna Kumar	33ba0fa9f3	RDMA/iwcm: Fix memory corruption bug in cm_work_handler() Possible memory corruption scenario: after putting the work entry back on the work_free_list, we call process_event() which dereferences work->event, which could have been modified to another value meanwhile. Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <rolandd@cisco.com>	2006-11-29 15:33:07 -08:00

389 Commits