The flaw was in skipping the second byte in MAC header due to increasing
the pointer AND indexed access starting at '1'.
Signed-off-by: Joerg Marx <joerg.marx@secunet.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Assigning a socket in timewait state to skb->sk can trigger
kernel oops, e.g. in nfnetlink_log, which does:
if (skb->sk) {
read_lock_bh(&skb->sk->sk_callback_lock);
if (skb->sk->sk_socket && skb->sk->sk_socket->file) ...
in the timewait case, accessing sk->sk_callback_lock and sk->sk_socket
is invalid.
Either all of these spots will need to add a test for sk->sk_state != TCP_TIME_WAIT,
or xt_TPROXY must not assign a timewait socket to skb->sk.
This does the latter.
If a TW socket is found, assign the tproxy nfmark, but skip the skb->sk assignment,
thus mimicking behaviour of a '-m socket .. -j MARK/ACCEPT' re-routing rule.
The 'SYN to TW socket' case is left unchanged -- we try to redirect to the
listener socket.
Cc: Balazs Scheidler <bazsi@balabit.hu>
Cc: KOVACS Krisztian <hidden@balabit.hu>
Signed-off-by: Florian Westphal <fwestphal@astaro.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
As noticed by Eric, nf_iterate doesn't use RCU correctly by
accessing the prev pointer of a RCU protected list element when
a verdict of NF_REPEAT is issued.
Fix by jumping backwards to the hook invocation directly instead
of loading the previous list element before continuing the list
iteration.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
The TCP tracking code has a special case that allows to return
NF_REPEAT if we receive a new SYN packet while in TIME_WAIT state.
In this situation, the TCP tracking code destroys the existing
conntrack to start a new clean session.
[DESTROY] tcp 6 src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925 [ASSURED]
[NEW] tcp 6 120 SYN_SENT src=192.168.0.2 dst=192.168.1.2 sport=38925 dport=8000 [UNREPLIED] src=192.168.1.2 dst=192.168.1.100 sport=8000 dport=38925
However, this is a problem for the iptables' CT target event filtering
which will not work in this case since the conntrack template will not
be there for the new session. To fix this, we reassign the conntrack
template to the packet if we return NF_REPEAT.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
For the following rule:
iptables -I PREROUTING -t raw -j CT --ctevents assured
The event delivered looks like the following:
[UPDATE] tcp 6 src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
Note that the TCP protocol state is not included. For that reason
the CT event filtering is not very useful for conntrackd.
To resolve this issue, instead of conditionally setting the CT events
bits based on the ctmask, we always set them and perform the filtering
in the late stage, just before the delivery.
Thus, the event delivered looks like the following:
[UPDATE] tcp 6 432000 ESTABLISHED src=192.168.0.2 dst=192.168.1.2 sport=37041 dport=80 src=192.168.1.2 dst=192.168.1.100 sport=80 dport=37041 [ASSURED]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In 135367b "netfilter: xtables: change xt_target.checkentry return type",
the type returned by checkentry was changed from boolean to int, but the
return values where not adjusted.
arptables: Input/output error
This broke arptables with the mangle target since it returns true
under success, which is interpreted by xtables as >0, thus
returning EIO.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
iprange_ipv6_sub was substracting 2 unsigned ints and then casting
the result to int to find out whether they are lt, eq or gt each
other, this doesn't work if the full 32 bits of each part
can be used in IPv6 addresses. Patch should remedy that without
significant performance penalties. Also number of ntohl
calls can be reduced this way (Jozsef Kadlecsik).
Signed-off-by: Thomas Jacob <jacob@internet24.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
In 13ee6ac netfilter: fix race in conntrack between dump_table and
destroy, we recovered spinlocks to protect the dump of the conntrack
table according to reports from Stephen and acknowledgments on the
issue from Eric.
In that patch, the refcount bump that allows to keep a reference
to the current ct object was removed. However, we still decrement
the refcount for that object in the output path of
ctnetlink_dump_table():
if (last)
nf_ct_put(last)
Cc: Stephen Hemminger <stephen.hemminger@vyatta.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Currently tools like ip and ifconfig report incorrect state for cxgb4
interfaces that are up but do not have link and do so until first link
establishment. This is because the initial netif_carrier_off call is
before register_netdev and it needs to be after to be fully effective.
Fix this by moving netif_carrier_off into .ndo_open.
Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The option name of Delayed SACK Timer should be SCTP_DELAYED_SACK,
not SCTP_DELAYED_ACK.
Left SCTP_DELAYED_ACK be concomitant with SCTP_DELAYED_SACK,
for making compatibility with existing applications.
Reference:
8.1.19. Get or Set Delayed SACK Timer (SCTP_DELAYED_SACK)
(http://tools.ietf.org/html/draft-ietf-tsvwg-sctpsocket-25)
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Wei Yongjun <yjwei@cn.fujitsu.com>
Acked-by: Vlad Yasevich <vladislav.yasevich@hp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 0363466866 (net offloading: Convert checksums to use
centrally computed features.) mistakenly swapped can_checksum_protocol()
arguments.
This broke IPv6 on bnx2 for instance, on NIC without TCPv6 checksum
offloads.
Reported-by: Hans de Bruin <jmdebruin@xmsnet.nl>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Just stumbled upon the issue while looking for another bug.
The code looks correct, the indentation is not.
Signed-off-by: Anton Vorontsov <cbouatmailru@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sh_irda can not use RX/TX in same time,
but this driver didn't return to RX mode when TX error occurred.
This patch care xmit error case to solve this issue.
Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In netif_skb_features() we return only the features that are valid for vlans
if we have a vlan packet. However, we should not mask out NETIF_F_HW_VLAN_TX
since it enables transmission of vlan tags and is obviously valid.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- tx_fixup() can be called from either timer callback or from xmit()
in usbnet, so spinlock is added to avoid concurrency-related problem.
- minor correction due to checkpatch warning for some line over 80
chars after previous patch was applied.
Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In drivers/net/ns83820.c::ns83820_init_one() we dynamically allocate
memory via alloc_etherdev(). We then call PRIV() on the returned storage
which is 'return netdev_priv()'. netdev_priv() takes the pointer it is
passed and adds 'ALIGN(sizeof(struct net_device), NETDEV_ALIGN)' to it and
returns it. Then we test the resulting pointer for NULL, which it is
unlikely to be at this point, and later dereference it. This will go bad
if alloc_etherdev() actually returned NULL.
This patch reworks the code slightly so that we test for a NULL pointer
(and return -ENOMEM) directly after calling alloc_etherdev().
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a network namespace is created (via CLONE_NEWNET), the loopback
interface is automatically added to the new namespace, triggering a
printk in ipv6_add_dev() if CONFIG_IPV6_PRIVACY is set.
This is problematic for applications which use CLONE_NEWNET as
part of a sandbox, like Chromium's suid sandbox or recent versions of
vsftpd. On a busy machine, it can lead to thousands of useless
"lo: Disabled Privacy Extensions" messages appearing in dmesg.
It's easy enough to check the status of privacy extensions via the
use_tempaddr sysctl, so just removing the printk seems like the most
sensible solution.
Signed-off-by: Romain Francoise <romain@orebokech.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update bnx2x version to 1.62.00-4
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix AER settings for BCM57712 to allow accessing all device addresses range in CL45 MDC/MDIO
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix BCM84823 LED behavior which may show on some systems
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Device may show incorrect duplex mode for devices with external PHY
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improve microcode loading verification before proceeding to next stage
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
LED on BCM57712+BCM8727 systems requires different settings
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Common init used to be called by the driver when the first port comes up, mainly to reset and reload external PHY microcode.
However, in case management driver is active on the other port, traffic would halted. So limit the common init to be done only once after POR.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable controlling BCM8073 PN polarity swap through nvm configuration, which is required in certain systems
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When read valid tx/rx chains from EEPROM, there is a bug to use the
tx chain value for both tx and rx, the result of this cause low
receive throughput on 1x2 devices becuase rx will only utilize single
chain instead of two chains
Signed-off-by: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
ath5k_reset must be called with sc->lock. Since the tx queue
watchdog runs in a workqueue and accesses sc, it's appropriate
to just take the lock over the whole function.
Signed-off-by: Bob Copeland <me@bobcopeland.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
The power detector adc offset calibration has to be done
on 4 minutes interval (longcal * pa_skip_count). But the commit
"ath9k_hw: fix a noise floor calibration related race condition"
makes the PA calibration executed more frequently beased on
nfcal_pending value. Running PAOffset calibration lesser than
longcal interval doesn't help anything and the worse part is that
it causes NF load timeouts and RX deaf conditions.
In a very noisy environment, where the distance b/w AP & station
is ~10 meter and running a downlink udp traffic with frequent
background scan causes "Timeout while waiting for nf to load:
AR_PHY_AGC_CONTROL=0x40d1a" and moves the chip into deaf state.
This issue was originaly reported in Android platform where
the network-manager application does bgscan more frequently
on AR9271 chips. (AR9285 family usb device).
Cc: stable@kernel.org
Signed-off-by: Vasanthakumar Thiagarajan <vasanth@atheros.com>
Signed-off-by: Rajkumar Manoharan <rmanoharan@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
There is an interoperability with AR9382/AR9380 in L1 state with a
few root complexes which can cause a hang. This is fixed by
setting some work around bits on the PCIE PHY. We fix by using
a new ini array to modify these bits when the radio is idle.
Cc: stable@kernel.org
Cc: Jack Lee <jack.lee@atheros.com>
Cc: Carl Huang <carl.huang@atheros.com>
Cc: David Quan <david.quan@atheros.com>
Cc: Nael Atallah <nael.atallah@atheros.com>
Cc: Sarvesh Shrivastava <sarvesh.shrivastava@atheros.com>
Signed-off-by: Luis R. Rodriguez <lrodriguez@atheros.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
In case of single tx and rx queues, three MSI-x vectors are allocated instead
of two. This patch fixes that.
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Access to cmd register is racey, especially in smp environments. Protect
it using a spinlock.
Signed-off-by: Matthieu Bucchianeri <matthieu@vmware.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is a small possibility of a race where the suspend routine gets
called, while a napi callback is still pending and when that comes up,
it enables interrupts which just got disabled in the suspend routine.
This change adds napi disable call in suspend and enable in resume to
avoid race.
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Show per-queue stats in ethtool -S output for vmxnet3 interface. Register dump
of ethtool should dump registers for all tx and rx queues.
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a performance enhancement fix. vmxnet3 device performs better when
provided with at least 54 bytes (ethernet 14 + IP 20+ TCP 20) in the first SG
buffer. For UDP packets driver provides lesser than that in first sg. This
change fixes the same. Also avoid the redundant pskb_may_pull() call.
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make hw vlan tag stripping as enabled by default. Thereby remove
the code to conditionally enable it later.
Signed-off-by: Guolin Yang <gyang@vmware.com>
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While activating the device get it's MAC address from netdev. This will allow
the MAC address configured using ifconfig to persist through the reset.
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix a bug while changing ring size when MTU is changed.
Signed-off-by: Shreyas N Bhatewara <sbhatewara@vmware.com>
Acked-by: Dmitry Torokhov <dtor@vmware.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the original code we check if (servl == NULL) twice. The first time
should print the message that cfmuxl_remove_uplayer() failed and set
"ret" correctly, but instead it just returns success. The second check
should be checking the value of "ret" instead of "servl".
Signed-off-by: Dan Carpenter <error27@gmail.com>
Acked-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes the CAN socket code conform to the manpage of sendmsg.
Signed-off-by: Kurt Van Dijck <kurt.van.dijck@eia.be>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some OSA level have a bug in the hw tx csum logic. We can circumvent
this bug by turning on IP hw csum also.
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The open function of qeth is not executed if the qeth device is in
state DOWN or HARDSETUP. A recovery switches from state SOFTSETUP to
HARDSETUP to DOWN to HARDSETUP and back to SOFTSETUP. If open and
recover are running concurrently, open fails if it hits the states
HARDSETUP or DOWN. This patch inserts waiting for recovery finish
in the qeth open functions to enable successful qeth device opening
in spite of a running recovery.
Signed-off-by: Ursula Braun <ursula.braun@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linux 2.6.21 defines different macros for __attribute__ which are also
used inside batman-adv. The next version of checkpatch.pl warns about
the usage of __attribute__((packed))).
Linux 2.6.33 defines an extra macro __always_unused which is used to
assist source code analyzers and can be used to removed the last
existing __attribute__ inside the source code.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Fixes the following:
1. POLL should not enable IRQ when work is not completed
2. No locking between TX descriptor cleaning and XMIT descriptor handling
3. No locking between RX POLL and XMIT modifying control register
4. Since TX cleaning (called from POLL) is running in parallel with XMIT
unnecessary locking is needed.
5. IRQ handler looks at RX frame status solely, this is wrong when IRQ is
temporarily disabled (in POLL), and when IRQ is shared.
6. IRQ handler clears IRQ status, which is unnecessary
7. TX queue was stopped in preventing cause when not MAX_SKB_FRAGS+1
descriptors were available after a SKB been scheduled by XMIT. Instead
the TX queue is stopped first when not enough descriptors are available
upon entering XMIT.
It was hard to split up this patch in smaller pieces since all are tied
together somehow.
Note the RX flag used in the interrupt handler does not signal that
interrupt was asserted, but that a frame was received. Same goes for TX.
Also, IRQ is not asserted when the RX flag is set before enabling IRQ
enable until a new frame is received. So extra care must be taken to
avoid enabling IRQ and all descriptors are already used, hence dead lock
will upon us. See new POLL implementation that enableds IRQ then look at
the RX flag to determine if one or more IRQs may have been missed. TX/RX
flags are cleared before handling previously enabled descriptors, this
ensures that the RX/TX flags are valid when determining if IRQ should be
turned on again.
By moving TX cleaning from POLL to XMIT in the standard case, removes some
locking trouble. Enabling TX cleaning from poll only when not enough TX
descriptors are available is safe because the TX queue is at the same time
stopped, thus XMIT will not be called. The TX queue is woken up again when
enough descriptrs are available.
TX Frames are always enabled with IRQ, however the TX IRQ Enable flag will
not be enabled until XMIT must wait for free descriptors.
Locking RX and XMIT parts of the driver from each other is needed because
the RX/TX enable bits share the same register.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frame error interrupts must also be handled since the RX flag only indicates
successful reception, it is unlikely but the old code may lead to dead lock
if 128 error frames are recieved in a row.
Signed-off-by: Daniel Hellstrom <daniel@gaisler.com>
Signed-off-by: David S. Miller <davem@davemloft.net>