120172 Commits

Author SHA1 Message Date
Tomas Winkler
e0737a77d6 iwlwifi: iwl-fh.h cleanup
This patch fix value of upper FH register bound plus
it reorders and groups registers in more readable way

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Acked-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:41:20 -05:00
Zhu, Yi
34faf780cf iwlwifi: some fh document fix and cleanup
This patch cleans up some flow handler related document. It also
removes some blank lines.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Acked-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:41:20 -05:00
Zhu, Yi
352bc8de19 iwlwifi: configure_filter rewrite
The patch rewrites the mac80211 configure_filter handler to better mapping
mac80211 filter flags to iwlwifi hardware filter flags. We now can support
5 mac80211 filter flags: FIF_OTHER_BSS, FIF_ALLMULTI, FIF_PROMISC_IN_BSS,
FIF_BCN_PRBRESP_PROMISC and FIF_CONTROL. This patch also avoids reconnecting
if the filter flags are changed when the STA is associated. Because rx_assoc
is used when full rxon is not necessary.

Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:41:19 -05:00
Abhijeet Kolekar
c305606540 iwlwifi : fix checkpatch.pl errors
Patch fixes checkpatch.pl errors for iwlwifi.

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:41:18 -05:00
Chatre, Reinette
dbce56a456 iwlwifi: replace magic constants with define
use IWL_CCK_RATES_MASK and IWL_OFDM_RATES_MASK instead of
their values directly.

Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
cc: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:41:18 -05:00
Tomas Winkler
417f114bf2 iwlwifi: rs: remove fc variable and other cleanups
This patch
1. Removes use once use only fc variables, they are useless after refactoring
ieee80211 frame control handlers
2. Other trivial cleanups

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:41:17 -05:00
Tomas Winkler
9f58671e8d iwlwifi: consolidate station management code
This patch moves code around and group most of the station
management code into iwl-sta.c

No functional changes (yet)

Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:41:06 -05:00
Kolekar, Abhijeet
cee53ddb46 iwl3945 : Simplify iwl3945_pci_probe
Patch aligns iwl3945_pci_probe with iwlwifi's iwl_pci_probe.
Added few comments and code simplified to make readable.

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:35:20 -05:00
Ivo van Doorn
0e3de99846 rt2x00: Fix TX failure path
The callback function write_tx_data() can only fail
when our ENTRY_OWNER_DEVICE_DATA flag on a queue entry
failed to determine the entry was not available and
it is in fact still owned by the hardware.
This means that if that function fails the queue
must be stopped in mac80211.

When rt2x00queue_get_queue() returns NULL in the TX
path, it means mac80211 has passed us an invalid queue,
although this should be impossible, it shouldn't hurt
if we send mac80211 a signal to stop the queue either.

Both issues can simply be resolved by removing their
manual failure handler and making them use the failure path
provided in rt2x00mac_tx().

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:32:54 -05:00
Ivo van Doorn
0f829b1d6f rt2x00: Move rt73usb register access wrappers into rt2x00usb
rt2500usb and rt73usb have different register word sizes,
for that reason the register access wrappers were never
moved into rt2x00usb.
With rt2800usb on its way, we should favor the 32bit
register access and move those wrappers into rt2x00usb.
That saves duplicate code, since only rt2500usb will
need the special 16bit wrappers.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:32:53 -05:00
Ivo van Doorn
c9c3b1a5de rt2x00: Cleanup indirect register access
All code which accessed indirect registers was similar
in respect to the for-loop, the given timeout, etc.
Move it into a seperate function, which for PCI drivers
can be moved into rt2x00pci.

This allows us to cleanup the cleanup the code further
by removing the goto statementsand making the codepath
look a bit nicer.

Signed-off-by: Ivo van Doorn <IvDoorn@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:32:53 -05:00
Johannes Berg
9764f3f9c3 ath5k: name pci driver "ath5k" too
Call the ath5k pci driver struct "ath5k" too to be less
confusing in sysfs.

Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Acked-by: Nick Kossifidis <mickflemm@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:32:53 -05:00
Ingo Molnar
020cf6ba7a net/wireless/reg.c: fix bad WARN_ON in if statement
fix:

  net/wireless/reg.c:348:29: error: macro "if" passed 2 arguments, but takes just 1

triggered by the branch-tracer.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:13:09 -05:00
Martin Xu
02969b38e6 ath5k: disable beacon filter when station is not associated
Ath5k driver has too many interrupts per second at idle
http://bugzilla.kernel.org/show_bug.cgi?id=11749

Signed-off-by: Martin Xu <martin.xu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:13:08 -05:00
Cheng Renquan
33ab625f2a ath5k: fix Security issue in DebugFS part of ath5k
http://bugzilla.kernel.org/show_bug.cgi?id=12076

Remove any write access to groups and others, only keep write permission
to its owner, usually only root user.

Reported-by: Jérôme Poulin <jeromepoulin@gmail.com>
Signed-off-by: Cheng Renquan <crquan@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:13:08 -05:00
Luis R. Rodriguez
b4b6cda229 ath9k: correct expected max RX buffer size
We should only tell the hardware its capable of DMA'ing
to us only what we asked dev_alloc_skb(). Prior to this
it is possible a large RX'd frame could have corrupted
DMA data but for us but we were saved only because we
were previously also pci_map_single()'ing the same large
value. The issue prior to this though was we were unmapping
a smaller amount which the prior DMA patch fixed.

Signed-off-by: Bennyam Malavazi <Bennyam.Malavazi@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:13:08 -05:00
Luis R. Rodriguez
ca0c7e5101 ath9k: Fix SW-IOMMU bounce buffer starvation
This should fix the SW-IOMMU bounce buffer starvation
seen ok kernel.org bugzilla 11811:

http://bugzilla.kernel.org/show_bug.cgi?id=11811

Users on MacBook Pro 3.1/MacBook v2 would see something like:

DMA: Out of SW-IOMMU space for 4224 bytes at device 0000:0b:00.0

Unfortunately its only easy to trigger on MacBook Pro 3.1/MacBook v2
so far so its difficult to debug (even with swiotlb=force).

We were pci_unmap_single()'ing less bytes than what we called
for with pci_map_single() and as such we were starving
the swiotlb from its 64MB amount of bounce buffers. We remain
consistent and now always use sc->rxbufsize for RX. While at
it we update the beacon DMA maps as well to only use the data
portion of the skb, previous to this we were pci_map_single()'ing
more data for beaconing than what we tell the hardware it can use,
therefore pushing more iotlb abuse.

Still not sure why this is so easily triggerable on
MacBook Pro 3.1, it may be the hardware configuration
tends to use more memory > 3GB mark for DMA.

Signed-off-by: Maciej Zenczykowski <zenczykowski@gmail.com>
Signed-off-by: Bennyam Malavazi <Bennyam.Malavazi@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:13:08 -05:00
Abhijeet Kolekar
3dd3b79aea mac80211 : Fix setting ad-hoc mode and non-ibss channel
Patch fixes the kernel trace when user tries to set
ad-hoc mode on non IBSS channel.
e.g iwconfig wlan0 chan 36 mode ad-hoc

Signed-off-by: Abhijeet Kolekar <abhijeet.kolekar@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:13:08 -05:00
Johannes Berg
e91af0af86 iwlagn: fix DMA sync
For the RX DMA fix for iwlwifi ("iwlagn: fix RX skb alignment") Luis
pointed out:

> aligned_dma_addr can obviously be > real_dma_addr at this point, what
> guarantees we can use it on our own whim?

I asked around, and he's right, there may be platforms that do not allow
passing such such an address to the DMA API functions. This patch
changes it by using the proper dma_sync_single_range_for_cpu API
invented for this purpose.

Cc: Luis R. Rodriguez <mcgrof@gmail.com>
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2008-11-25 16:13:08 -05:00
David S. Miller
2f9889a20c Revert "hso: Fix crashes on close."
This reverts commit 4a3e818181e1baf970e9232ca8b747e233176b87.

On request from Alan Cox.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 03:53:09 -08:00
David S. Miller
ab153d84d9 Revert "hso: Fix free of mutexes still in use."
This reverts commit 52429eb216385fdc6969c0112ba8b46cffefaaef.

On request from Alan Cox.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 03:52:46 -08:00
David S. Miller
cd90ee1799 Revert "hso: Add TIOCM ioctl handling."
This reverts commit 7ea3a9ad9bf360f746a7ad6fa72511a5c359490d.

On request from Alan Cox.

Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 03:52:17 -08:00
Alexey Dobriyan
fb7e06748c xfrm: remove useless forward declarations
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:05:54 -08:00
Alexey Dobriyan
6daad37230 ah4/ah6: remove useless NULL assignments
struct will be kfreed in a moment, so...

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:05:09 -08:00
Alexander Duyck
69d728baf6 igb: loopback bits not correctly cleared from RCTL register
This change forces the bits to 0 by using an &= operation with an inverted
mask of all options instead of using an |= with a value of 0.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:04:03 -08:00
Alexander Duyck
9b07f3d315 igb: remove unneeded bit refrence when enabling jumbo frames
There is a reference to a Buffer Size extention bit that is unneded by
82575/82576 hardware.  Since it is not needed it should be removed from the
code.

Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:03:26 -08:00
Jeff Kirsher
7a6b6f515f DCB: fix kconfig option
Since the netlink option for DCB is necessary to actually be useful,
simplified the Kconfig option.  In addition, added useful help text for the
Kconfig option.

Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:02:08 -08:00
Trent Piepho
11c6dd2c72 phylib: Add Vitesse VSC8221 SGMII PHY
PHY is mostly compatible with the existing VSC8244 PHY.  The init sequence
is different and the interrupt mask lacks some bits present in the VSC8244.

Rather than making a copy of the existing VSC234x config_intr function and
change one constant, I modify it to select the interrupt mask based on
which driver is calling it.  This lets it be used by both drivers.

Signed-off-by: Trent Piepho <tpiepho@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 01:00:47 -08:00
Bernard Pidoux
244f46ae6e rose: zero length frame filtering in af_rose.c
Since changeset e79ad711a0108475c1b3a03815527e7237020b08 from  mainline,
>From David S. Miller,
empty packet can be transmitted on connected socket for datagram protocols.

However, this patch broke a high level application using ROSE network protocol with connected datagram.

Bulletin Board Stations perform bulletins forwarding between BBS stations via ROSE network using a forward protocol.
Now, if for some reason, a buffer in the application software happens to be empty at a specific moment,
ROSE sends an empty packet via unfiltered packet socket.
When received, this ROSE packet introduces perturbations of data exchange of BBS forwarding,
for the application message forwarding protocol is waiting for something else.
We agree that a more careful programming of the application protocol would avoid this situation and we are
willing to debug it.
But, as an empty frame is no use and does not have any meaning for ROSE protocol,
we may consider filtering zero length data both when sending and receiving socket data.

The proposed patch repaired BBS data exchange through ROSE network that were broken since 2.6.22.11 kernel.

Signed-off-by: Bernard Pidoux <f6bvp@amsat.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:56:20 -08:00
Harvey Harrison
411c41eea5 aoe: remove private mac address format function
Add %pm to omit the colons when printing a mac address.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:40:37 -08:00
Denis Joseph Barrow
9c8f92aed1 hso: Hook up ->reset_resume
Made usb_drivers reset_resume function point to hso_resume this 
fixes problems a usb reset is done when the network interface
is left idle for a few minutes. Possibly reset_resume should
initialise hardware more but this works in the common case.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:36:10 -08:00
Denis Joseph Barrow
7ea3a9ad9b hso: Add TIOCM ioctl handling.
Makes TIOCM ioctls for Data Carrier Detect & related functions
work like /drivers/serial/serial-core.c potentially needed 
for pppd & similar user programs.   

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:35:26 -08:00
Denis Joseph Barrow
52429eb216 hso: Fix free of mutexes still in use.
A new structure hso_mutex_table had to be declared statically
& used as as hso_device mutex_lock(&serial->parent->mutex) etc
is freed in hso_serial_open & hso_serial_close by kref_put while
the mutex is still in use.

This is a substantial change but should make the driver much stabler.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:33:13 -08:00
Denis Joseph Barrow
89930b7b5e hso: Fix URB submission -EINVAL.
Added check for IFF_UP in hso_resume, this should eliminate -EINVAL (-22)
errors caused from urb's being submitted twice, once by hso_resume
& once in hso_net_open, if suspend/resume USB power saving  mode is enabled

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:30:48 -08:00
Denis Joseph Barrow
4a3e818181 hso: Fix crashes on close.
Moved serial_open_count in hso_serial_open to
prevent crashes owing to the serial structure being made NULL
when hso_serial_close is called even though hso_serial_open
returned -ENODEV, Alan Cox pointed out this happens,
also put in sanity check in hso_serial_close
to check for a valid serial structure which should prevent
the most reproducable crash in the driver when the hso device
is disconnected while in use.

Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:27:50 -08:00
Denis Joseph Barrow
bab04c3adb hso: Add new usb device id's.
Signed-off-by: Denis Joseph Barrow <D.Barow@option.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:26:12 -08:00
Stephen Hemminger
47fd5b8373 netdev: add HAVE_NET_DEVICE_OPS
As a concession to vendors who have to deal with one source for different
kernel versions, add a HAVE_NET_DEVICE_OPS so they don't end up hard
coding ifdef against kernel version.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-25 00:20:43 -08:00
Ilpo Järvinen
0ace285605 tcp: handle shift/merge of cloned skbs too
This caused me to get repeatably:

  tcpdump: pcap_loop: recvfrom: Bad address

Happens occassionally when I tcpdump my for-looped test xfers:
  while [ : ]; do echo -n "$(date '+%s.%N') "; ./sendfile; sleep 20; done

Rest of the relevant commands:
  ethtool -K eth0 tso off
  tc qdisc add dev eth0 root netem drop 4%
  tcpdump -n -s0 -i eth0 -w sacklog.all

Running net-next under kvm, connection goes to the same host
(basically just out of kvm). The connection itself works ok
and data gets sent without corruption even with a large
number of tests while tcpdump fails usually within less than
5 tests.

Whether it only happens because of this change or not, I
don't know for sure but it's the only thing with which
I've seen that error. The non-cloned variant works w/o it
for much longer time. I'm yet to debug where the error
actually comes from.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:30:21 -08:00
Ilpo Järvinen
111cc8b913 tcp: add some mibs to track collapsing
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:27:22 -08:00
Ilpo Järvinen
92ee76b6d9 tcp: Make shifting not clear the hints
The earlier version was just very basic one which is "playing
safe" by always clearing the hints. However, clearing of a hint
is extremely costly operation with large windows, so it must be
avoided at all cost whenever possible, there is a way with
shifting too achieve not-clearing.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:26:56 -08:00
Ilpo Järvinen
832d11c5cd tcp: Try to restore large SKBs while SACK processing
During SACK processing, most of the benefits of TSO are eaten by
the SACK blocks that one-by-one fragment SKBs to MSS sized chunks.
Then we're in problems when cleanup work for them has to be done
when a large cumulative ACK comes. Try to return back to pre-split
state already while more and more SACK info gets discovered by
combining newly discovered SACK areas with the previous skb if
that's SACKed as well.

This approach has a number of benefits:

1) The processing overhead is spread more equally over the RTT
2) Write queue has less skbs to process (affect everything
   which has to walk in the queue past the sacked areas)
3) Write queue is consistent whole the time, so no other parts
   of TCP has to be aware of this (this was not the case with
   some other approach that was, well, quite intrusive all
   around).
4) Clean_rtx_queue can release most of the pages using single
   put_page instead of previous PAGE_SIZE/mss+1 calls

In case a hole is fully filled by the new SACK block, we attempt
to combine the next skb too which allows construction of skbs
that are even larger than what tso split them to and it handles
hole per on every nth patterns that often occur during slow start
overshoot pretty nicely. Though this to be really useful also
a retransmission would have to get lost since cumulative ACKs
advance one hole at a time in the most typical case.

TODO: handle upwards only merging. That should be rather easy
when segment is fully sacked but I'm leaving that as future
work item (it won't make very large difference anyway since
this current approach already covers quite a lot of normal
cases).

I was earlier thinking of some sophisticated way of tracking
timestamps of the first and the last segment but later on
realized that it won't be that necessary at all to store the
timestamp of the last segment. The cases that can occur are
basically either:
  1) ambiguous => no sensible measurement can be taken anyway
  2) non-ambiguous is due to reordering => having the timestamp
     of the last segment there is just skewing things more off
     than does some good since the ack got triggered by one of
     the holes (besides some substle issues that would make
     determining right hole/skb even harder problem). Anyway,
     it has nothing to do with this change then.

I choose to route some abnormal looking cases with goto noop,
some could be handled differently (eg., by stopping the
walking at that skb but again). In general, they either
shouldn't happen at all or are rare enough to make no difference
in practice.

In theory this change (as whole) could cause some macroscale
regression (global) because of cache misses that are taken over
the round-trip time but it gets very likely better because of much
less (local) cache misses per other write queue walkers and the
big recovery clearing cumulative ack.

Worth to note that these benefits would be very easy to get also
without TSO/GSO being on as long as the data is in pages so that
we can merge them. Currently I won't let that happen because
DSACK splitting at fragment that would mess up pcounts due to
sk_can_gso in tcp_set_skb_tso_segs. Once DSACKs fragments gets
avoided, we have some conditions that can be made less strict.

TODO: I will probably have to convert the excessive pointer
passing to struct sacktag_state... :-)

My testing revealed that considerable amount of skbs couldn't
be shifted because they were cloned (most likely still awaiting
tx reclaim)...

[The rest is considering future work instead since I got
repeatably EFAULT to tcpdump's recvfrom when I added
pskb_expand_head to deal with clones, so I separated that
into another, later patch]

...To counter that, I gave up on the fifth advantage:

5) When growing previous SACK block, less allocs for new skbs
   are done, basically a new alloc is needed only when new hole
   is detected and when the previous skb runs out of frags space

...which now only happens of if reclaim is fast enough to dispose
the clone before the SACK block comes in (the window is RTT long),
otherwise we'll have to alloc some.

With clones being handled I got these numbers (will be somewhat
worse without that), taken with fine-grained mibs:

                  TCPSackShifted 398
                   TCPSackMerged 877
            TCPSackShiftFallback 320
      TCPSACKCOLLAPSEFALLBACKGSO 0
  TCPSACKCOLLAPSEFALLBACKSKBBITS 0
  TCPSACKCOLLAPSEFALLBACKSKBDATA 0
    TCPSACKCOLLAPSEFALLBACKBELOW 0
    TCPSACKCOLLAPSEFALLBACKFIRST 1
 TCPSACKCOLLAPSEFALLBACKPREVBITS 318
      TCPSACKCOLLAPSEFALLBACKMSS 1
   TCPSACKCOLLAPSEFALLBACKNOHEAD 0
    TCPSACKCOLLAPSEFALLBACKSHIFT 0
          TCPSACKCOLLAPSENOOPSEQ 0
  TCPSACKCOLLAPSENOOPSMALLPCOUNT 0
     TCPSACKCOLLAPSENOOPSMALLLEN 0
             TCPSACKCOLLAPSEHOLE 12

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:20:15 -08:00
Ilpo Järvinen
f58b22fd3c tcp: make tcp_sacktag_one able to handle partial skb too
This is preparatory work for SACK combiner patch which may
have to count TCP state changes for only a part of the skb
because it will intentionally avoids splitting skb to SACKed
and not sacked parts.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:14:43 -08:00
Ilpo Järvinen
adb92db857 tcp: Make SACK code to split only at mss boundaries
Sadly enough, this adds possible divide though we try to avoid
it by checking one mss as common case.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:13:50 -08:00
Ilpo Järvinen
e8bae275d9 tcp: more aggressive skipping
I knew already when rewriting the sacktag that this condition
was too conservative, change it now since it prevent lot of
useless work (especially in the sack shifter decision code
that is being added by a later patch). This shouldn't change
anything really, just save some processing regardless of the
shifter.

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:12:28 -08:00
Ilpo Järvinen
e1aa680fa4 tcp: move tcp_simple_retransmit to tcp_input
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:11:55 -08:00
Ilpo Järvinen
4a17fc3add tcp: collapse more than two on retransmission
I always had thought that collapsing up to two at a time was
intentional decision to avoid excessive processing if 1 byte
sized skbs are to be combined for a full mtu, and consecutive
retransmissions would make the size of the retransmittee
double each round anyway, but some recent discussion made me
to understand that was not the case. Thus make collapse work
more and wait less.

It would be possible to take advantage of the shifting
machinery (added in the later patch) in the case of paged
data but that can be implemented on top of this change.

tcp_skb_is_last check is now provided by the loop.

I tested a bit (ss-after-idle-off, fill 4096x4096B xfer,
10s sleep + 4096 x 1byte writes while dropping them for
some a while with netem):

. 16774097:16775545(1448) ack 1 win 46
. 16775545:16776993(1448) ack 1 win 46
. ack 16759617 win 2399
P 16776993:16777217(224) ack 1 win 46
. ack 16762513 win 2399
. ack 16765409 win 2399
. ack 16768305 win 2399
. ack 16771201 win 2399
. ack 16774097 win 2399
. ack 16776993 win 2399
. ack 16777217 win 2399
P 16777217:16777257(40) ack 1 win 46
. ack 16777257 win 2399
P 16777257:16778705(1448) ack 1 win 46
P 16778705:16780153(1448) ack 1 win 46
FP 16780153:16781313(1160) ack 1 win 46
. ack 16778705 win 2399
. ack 16780153 win 2399
F 1:1(0) ack 16781314 win 2399

While without drop-all period I get this:

. 16773585:16775033(1448) ack 1 win 46
. ack 16764897 win 9367
. ack 16767793 win 9367
. ack 16770689 win 9367
. ack 16773585 win 9367
. 16775033:16776481(1448) ack 1 win 46
P 16776481:16777217(736) ack 1 win 46
. ack 16776481 win 9367
. ack 16777217 win 9367
P 16777217:16777218(1) ack 1 win 46
P 16777218:16777219(1) ack 1 win 46
P 16777219:16777220(1) ack 1 win 46
  ...
P 16777247:16777248(1) ack 1 win 46
. ack 16777218 win 9367
. ack 16777219 win 9367
  ...
. ack 16777233 win 9367
. ack 16777248 win 9367
P 16777248:16778696(1448) ack 1 win 46
P 16778696:16780144(1448) ack 1 win 46
FP 16780144:16781313(1169) ack 1 win 46
. ack 16780144 win 9367
F 1:1(0) ack 16781314 win 9367

The window seems to be 30-40 segments, which were successfully
combined into: P 16777217:16777257(40) ack 1 win 46

Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 21:03:43 -08:00
Eric Dumazet
a21bba9454 net: avoid a pair of dst_hold()/dst_release() in ip_push_pending_frames()
We can reduce pressure on dst entry refcount that slowdown UDP transmit
path on SMP machines. This pressure is visible on RTP servers when
delivering content to mediagateways, especially big ones, handling
thousand of streams. Several cpus send UDP frames to the same
destination, hence use the same dst entry.

This patch makes ip_push_pending_frames() steal the refcount its
callers had to take when filling inet->cork.dst.

This doesnt avoid all refcounting, but still gives speedups on SMP,
on UDP/RAW transmit path.

Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 16:07:50 -08:00
Herbert Xu
631339f1e5 bridge: netfilter: fix update_pmtu crash with GRE
As GRE tries to call the update_pmtu function on skb->dst and
bridge supplies an skb->dst that has a NULL ops field, all is
not well.

This patch fixes this by giving the bridge device an ops field
with an update_pmtu function.  For the moment I've left all
other fields blank but we can fill them in later should the
need arise.

Based on report and patch by Philip Craig.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 16:06:50 -08:00
Jan Engelhardt
f79fca55f9 netfilter: xtables: add missing const qualifier to xt_tgchk_param
When entryinfo was a standalone parameter to functions, it used to be
"const void *". Put the const back in.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 16:06:17 -08:00
Patrick McHardy
b54ad409fd netfilter: ctnetlink: fix conntrack creation race
Conntrack creation through ctnetlink has two races:

- the timer may expire and free the conntrack concurrently, causing an
  invalid memory access when attempting to put it in the hash tables

- an identical conntrack entry may be created in the packet processing
  path in the time between the lookup and hash insertion

Hold the conntrack lock between the lookup and insertion to avoid this.

Reported-by: Zoltan Borbely <bozo@andrews.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-11-24 15:56:17 -08:00