linux/net
Eric Dumazet bf368e4e70 net: Avoid extra wakeups of threads blocked in wait_for_packet()
In 2.6.25 we added UDP mem accounting.

This unfortunatly added a penalty when a frame is transmitted, since
we have at TX completion time to call sock_wfree() to perform necessary
memory accounting. This calls sock_def_write_space() and utimately
scheduler if any thread is waiting on the socket.
Thread(s) waiting for an incoming frame was scheduled, then had to sleep
again as event was meaningless.

(All threads waiting on a socket are using same sk_sleep anchor)

This adds lot of extra wakeups and increases latencies, as noted
by Christoph Lameter, and slows down softirq handler.

Reference : http://marc.info/?l=linux-netdev&m=124060437012283&w=2 

Fortunatly, Davide Libenzi recently added concept of keyed wakeups
into kernel, and particularly for sockets (see commit
37e5540b3c 
epoll keyed wakeups: make sockets use keyed wakeups)

Davide goal was to optimize epoll, but this new wakeup infrastructure
can help non epoll users as well, if they care to setup an appropriate
handler.

This patch introduces new DEFINE_WAIT_FUNC() helper and uses it
in wait_for_packet(), so that only relevant event can wakeup a thread
blocked in this function.

Trace of function calls from bnx2 TX completion bnx2_poll_work() is :
__kfree_skb()
 skb_release_head_state()
  sock_wfree()
   sock_def_write_space()
    __wake_up_sync_key()
     __wake_up_common()
      receiver_wake_function() : Stops here since thread is waiting for an INPUT


Reported-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-28 02:24:21 -07:00
..
9p 9p: fix sparse warning: cast adds address space 2009-02-26 23:13:32 -08:00
802 tr: fix leakage of device in net/802/tr.c 2009-04-11 01:43:17 -07:00
8021q vlan: update vlan carrier state for admin up/down 2009-04-25 18:03:35 -07:00
appletalk proc 2/2: remove struct proc_dir_entry::owner 2009-03-31 01:14:44 +04:00
atm proc 2/2: remove struct proc_dir_entry::owner 2009-03-31 01:14:44 +04:00
ax25 ax25: proc uid file misses header 2009-04-20 02:14:59 -07:00
bluetooth Bluetooth: Add workaround for wrong HCI event in eSCO setup 2009-04-19 19:30:03 +02:00
bridge netfilter: bridge: allow fragmentation of VLAN packets traversing a bridge 2009-04-20 17:12:35 +02:00
can can: Network Drop Monitor: Make use of consume_skb() in af_can.c 2009-04-17 01:38:46 -07:00
core net: Avoid extra wakeups of threads blocked in wait_for_packet() 2009-04-28 02:24:21 -07:00
dcb
dccp dccp: Do not let initial option overhead shrink the MPS 2009-03-02 03:07:23 -08:00
decnet net/*: use linux/kernel.h swap() 2009-03-21 13:36:17 -07:00
dsa dsa: add switch chip cascading support 2009-03-21 19:06:54 -07:00
econet net: convert usage of packet_type to read_mostly 2009-03-10 05:22:43 -07:00
ethernet
ipv4 ipv4: Limit size of route cache hash table 2009-04-27 05:42:24 -07:00
ipv6 ipv6:remove useless check 2009-04-14 02:21:41 -07:00
ipx ipx: use constant for strings and desciptor 2009-03-21 19:06:51 -07:00
irda proc tty: switch ircomm to ->proc_fops 2009-04-01 08:59:10 -07:00
iucv af_iucv: Fix race when queuing incoming iucv messages 2009-04-21 23:43:15 -07:00
key af_key: remove some pointless conditionals before kfree_skb() 2009-02-26 23:07:32 -08:00
lapb
llc proc 2/2: remove struct proc_dir_entry::owner 2009-03-31 01:14:44 +04:00
mac80211 mac80211: fix alignment calculation bug 2009-04-21 16:43:33 -04:00
netfilter Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-2.6 2009-04-25 17:46:34 -07:00
netlabel netlabel: Always remove the correct address selector 2009-04-22 00:46:09 -07:00
netlink Merge branch 'master' of /home/davem/src/GIT/linux-2.6/ 2009-03-26 15:23:24 -07:00
netrom net/netrom: Fix socket locking 2009-04-22 00:49:51 -07:00
packet packet: avoid warnings when high-order page allocation fails 2009-04-15 03:39:52 -07:00
phonet trivial: fix typos/grammar errors in Kconfig texts 2009-03-30 15:22:01 +02:00
rds RDS: Use spinlock to protect 64b value update on 32b archs 2009-04-02 00:52:22 -07:00
rfkill
rose Revert "rose: zero length frame filtering in af_rose.c" 2009-04-14 20:28:00 -07:00
rxrpc RxRPC: Fix a potential NULL dereference 2009-02-06 21:50:52 -08:00
sched net: sch_netem: Fix an inconsistency in ingress netem timestamps. 2009-04-20 02:14:59 -07:00
sctp proc 2/2: remove struct proc_dir_entry::owner 2009-03-31 01:14:44 +04:00
sunrpc Merge branch 'for-2.6.30' of git://linux-nfs.org/~bfields/linux 2009-04-06 13:25:56 -07:00
tipc tipc: fix non-const printf format arguments 2009-03-18 19:11:29 -07:00
unix New helper - current_umask() 2009-03-31 23:00:26 -04:00
wanrouter wanrouter: fix sparse warnings: context imbalance 2009-02-26 23:13:36 -08:00
wimax trivial: fix typos/grammar errors in Kconfig texts 2009-03-30 15:22:01 +02:00
wireless nl80211: Make nl80211_send_mlme_event() atomic 2009-04-20 16:36:26 -04:00
x25 af_rose/x25: Sanity check the maximum user frame size 2009-03-27 00:28:21 -07:00
xfrm xfrm: wrong hash value for temporary SA 2009-04-27 02:58:59 -07:00
compat.c net: socket infrastructure for SO_TIMESTAMPING 2009-02-15 22:43:35 -08:00
Kconfig trivial: fix typos/grammar errors in Kconfig texts 2009-03-30 15:22:01 +02:00
Makefile RDS: Kconfig and Makefile 2009-02-26 23:43:35 -08:00
nonet.c
socket.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6 2009-04-06 18:05:43 -07:00
sysctl_net.c net: sysctl_net - use net_eq to compare nets 2009-03-16 16:23:30 +01:00
TUNABLE