commit f88037(sctp: Drop ipfargok in sctp_xmit function)
has droped ipfragok and set local_df value properly.
So the change of commit 77e2f1(ipv6: Fix ip6_xmit to
send fragments if ipfragok is true) is not needed.
So the patch remove them.
Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix an oversight in ipmr_destroy_unres() - the net pointer is
unconditionally initialized to NULL, resulting in a NULL pointer
dereference later on.
Fix by adding a net pointer to struct mr_table and using it in
ipmr_destroy_unres().
Signed-off-by: Patrick McHardy <kaber@trash.net>
The patch to convert struct mfc_cache to list_heads (ipv4: ipmr: convert
struct mfc_cache to struct list_head) introduced a bug when adding new
cache entries that don't match any unresolved entries.
The unres queue is searched for a matching entry, which is then resolved.
When no matching entry is present, the iterator points to the head of the
list, but is treated as a matching entry. Use a seperate variable to
indicate that a matching entry was found.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Eric Paris reported netif_rx() is calling smp_processor_id() from
preemptible context, in particular when caller is
ip_dev_loopback_xmit().
RPS commit added this smp_processor_id() call, this patch makes sure
preemption is disabled. rps_get_cpus() wants rcu_read_lock() anyway, we
can dot it a bit earlier.
Reported-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for multiple independant multicast routing instances,
named "tables".
Userspace multicast routing daemons can bind to a specific table instance by
issuing a setsockopt call using a new option MRT_TABLE. The table number is
stored in the raw socket data and affects all following ipmr setsockopt(),
getsockopt() and ioctl() calls. By default, a single table (RT_TABLE_DEFAULT)
is created with a default routing rule pointing to it. Newly created pimreg
devices have the table number appended ("pimregX"), with the exception of
devices created in the default table, which are named just "pimreg" for
compatibility reasons.
Packets are directed to a specific table instance using routing rules,
similar to how regular routing rules work. Currently iif, oif and mark
are supported as keys, source and destination addresses could be supported
additionally.
Example usage:
- bind pimd/xorp/... to a specific table:
uint32_t table = 123;
setsockopt(fd, IPPROTO_IP, MRT_TABLE, &table, sizeof(table));
- create routing rules directing packets to the new table:
# ip mrule add iif eth0 lookup 123
# ip mrule add oif eth0 lookup 123
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Now that cache entries in unres_queue don't need to be distinguished by their
network namespace pointer anymore, we can remove it from struct mfc_cache
add pass the namespace as function argument to the functions that need it.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The unres_queue is currently shared between all namespaces. Following patches
will additionally allow to create multiple multicast routing tables in each
namespace. Having a single shared queue for all these users seems to excessive,
move the queue and the cleanup timer to the per-namespace data to unshare it.
As a side-effect, this fixes a bug in the seq file iteration functions: the
first entry returned is always from the current namespace, entries returned
after that may belong to any namespace.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Decouple the address family values used for fib_rules from the real
address families in socket.h. This allows to use fib_rules for
code that is not a real address family without increasing AF_MAX/NPROTO.
Values up to 127 are reserved for real address families and map directly
to the corresponding AF value, values starting from 128 are for other
uses. rtnetlink is changed to invoke the AF_UNSPEC dumpit/doit handlers
for these families.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
All fib_rules implementations need to set the family in their ->fill()
functions. Since the value is available to the generic fib_nl_fill_rule()
function, set it there.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Both functions are equivalent, consolidate them since a following patch
needs a third implementation for multicast routing.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The function dst_ifdown is called only two places but in a non-
performance critical code path, there is no reason to inline it.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb_bond_should_drop() is too big to be inlined.
This patch reduces kernel text size, and its compilation time as well
(shrinking include/linux/netdevice.h)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
At this point optlen == sizeof(sfilter) but some compilers are dumb.
Reported-by: Németh Márton <nm127@freemail.h
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Oliver Hartkopp <oliver@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The notifier for address down should only be called if address is completely
gone, not just being marked as tentative on link transistion. The code
in net-next would case bonding/sctp/s390 to see address disappear on link
down, but they would never see it reappear on link up.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since an address in hash list has to already have a ref count,
no additional ref count is needed.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When link goes down, want address to be preserved but in a tentative
state, therefore it has to stay in hash list.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recent changes preserve IPv6 address when link goes down (good).
But would cause address to point to dead dst entry (bad).
The simplest fix is to just not delete route if address is
being held for later use.
Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With latest CONFIG_PROVE_RCU stuff, I felt more comfortable to make this
work.
sk->sk_dst_cache is currently protected by a rwlock (sk_dst_lock)
This rwlock is readlocked for a very small amount of time, and dst
entries are already freed after RCU grace period. This calls for RCU
again :)
This patch converts sk_dst_lock to a spinlock, and use RCU for readers.
__sk_dst_get() is supposed to be called with rcu_read_lock() or if
socket locked by user, so use appropriate rcu_dereference_check()
condition (rcu_read_lock_held() || sock_owned_by_user(sk))
This patch avoids two atomic ops per tx packet on UDP connected sockets,
for example, and permits sk_dst_lock to be much less dirtied.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dont use netdev_warn() in dev_cap_txqueue() and get_rps_cpu() so that we
can catch following warnings without crash.
bond0.2240 received packet on queue 6, but number of RX queues is 1
bond0.2240 received packet on queue 11, but number of RX queues is 1
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable the SO_TIMESTAMPING socket infrastructure for raw packet sockets.
We introduce PACKET_TX_TIMESTAMP for the control message cmsg_type.
Similar support for UDP and CAN sockets was added in commit
51f31cabe3
Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back in commit 04a0551c87
("loopback: Drop obsolete ip_summed setting") we stopped
setting CHECKSUM_UNNECESSARY in the loopback xmit.
This is because such a setting was a lie since it implies that the
checksum field of the packet is properly filled in.
Instead what happens normally is that CHECKSUM_PARTIAL is set and
skb->csum is calculated as needed.
But this was only happening for TCP data packets (via the
skb->ip_summed assignment done in tcp_sendmsg()). It doesn't
happen for non-data packets like ACKs etc.
Fix this by setting skb->ip_summed in the common non-data packet
constructor. It already is setting skb->csum to zero.
But this reminds us that we still have things like ip_output.c's
ip_dev_loopback_xmit() which sets skb->ip_summed to the value
CHECKSUM_UNNECESSARY, which Herbert's patch teaches us is not
valid. So we'll have to address that at some point too.
Signed-off-by: David S. Miller <davem@davemloft.net>
inet: Remove unused send_check length argument
This patch removes the unused length argument from the send_check
function in struct inet_connection_sock_af_ops.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Yinghai <yinghai.lu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp: Handle CHECKSUM_PARTIAL for SYNACK packets for IPv6
This patch moves the common code between tcp_v6_send_check and
tcp_v6_gso_send_check into a new function __tcp_v6_send_check.
It then uses the new function in tcp_v6_send_synack as well as
tcp_v6_send_response so that they handle CHECKSUM_PARTIAL properly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Yinghai <yinghai.lu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp: Handle CHECKSUM_PARTIAL for SYNACK packets for IPv4
This patch moves the common code between tcp_v4_send_check and
tcp_v4_gso_send_check into a new function __tcp_v4_send_check.
It then uses the new function in tcp_v4_send_synack so that it
handles CHECKSUM_PARTIAL properly.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Tested-by: Yinghai <yinghai.lu@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 2626419ad5.
It causes regressions for people with IGB cards. Connection
requests don't complete etc. The true cause of the issue is
still not known, but we should sort this out in net-next-2.6
not net-2.6
Signed-off-by: David S. Miller <davem@davemloft.net>
As scan_work is queued from work_work it needs to be checked if scan
has been started during execution of work_work. Otherwise, when hw
scan is used, the stack gets error about hw being busy with ongoing
scan. This causes the stack to abort scan without notifying the driver
about it. This leads to a situation where the hw is scanning and the stack
thinks it's not. Then when the scan finishes, the stack will complain by
warnings.
Signed-off-by: Teemu Paasikivi <ext-teemu.3.paasikivi@nokia.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
We delay the skb linearising in ieee80211_rx_h_decrypt so that
frames do not require software decryption are not linearized. We
are safe to do this because ieee80211_get_mmie_keyidx() only
requires to touch nonlinear data for management frames, which are
already linearized before getting here.
Cc: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Enhance tracing by adding tracing for a variety of
callbacks that the drivers call, and also for
internal calls (currently limited to queue status).
This can aid debugging what is going on in mac80211
in interaction with drivers, since we can now see
what drivers call and not just what mac80211 calls
in the driver.
Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Grouped mesh action codes together with the other action codes in
ieee80211.h.
Signed-off-by: Javier Cardona <javier@cozybit.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
Back in commit 04a0551c87
("loopback: Drop obsolete ip_summed setting") we stopped
setting CHECKSUM_UNNECESSARY in the loopback xmit.
This is because such a setting was a lie since it implies that the
checksum field of the packet is properly filled in.
Instead what happens normally is that CHECKSUM_PARTIAL is set and
skb->csum is calculated as needed.
But this was only happening for TCP data packets (via the
skb->ip_summed assignment done in tcp_sendmsg()). It doesn't
happen for non-data packets like ACKs etc.
Fix this by setting skb->ip_summed in the common non-data packet
constructor. It already is setting skb->csum to zero.
But this reminds us that we still have things like ip_output.c's
ip_dev_loopback_xmit() which sets skb->ip_summed to the value
CHECKSUM_UNNECESSARY, which Herbert's patch teaches us is not
valid. So we'll have to address that at some point too.
Signed-off-by: David S. Miller <davem@davemloft.net>
Commits 5051ebd275 and
5051ebd275 ("ipv[46]: udp: optimize unicast RX
path") broke some programs.
After upgrading a L2TP server to 2.6.33 it started to fail, tunnels going up an
down, after the 10th tunnel came up. My modified rp-l2tp uses a global
unconnected socket bound to (INADDR_ANY, 1701) and one connected socket per
tunnel after parameter negotiation.
After ten sockets were open and due to mixed parameters to
udp[46]_lib_lookup2() kernel started to drop packets.
Signed-off-by: Jorge Boncompte [DTI2] <jorge@dti2.net>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix coding style errors and warnings output while running checkpatch.pl
on the files net/core/ethtool.c and include/linux/ethtool.h
Signed-off-by: chavey <chavey@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current X.25 code has some bugs in throughput negotiation:
1. It does negotiation in all cases, usually there is no need
2. It incorrectly attempts to negotiate the throughput class in one
direction only. There are separate throughput classes for input
and output and if either is negotiated both mist be negotiates.
This is bug https://bugzilla.kernel.org/show_bug.cgi?id=15681
This bug was first reported by Daniel Ferenci to the linux-x25 mailing
list on 6/8/2004, but is still present.
The current (2.6.34) x.25 code doesn't seem to know that the X.25
throughput facility includes two values, one for the required
throughput outbound, one for inbound.
This causes it to attempt to negotiate throughput 0x0A, which is
throughput 9600 inbound and the illegal value "0" for inbound
throughput.
Because of this some X.25 devices (e.g. Cisco 1600) refuse to connect
to Linux X.25.
The following patch fixes this behaviour. Unless the user specifies a
required throughput it does not attempt to negotiate. If the user
does not specify a throughput it accepts the suggestion of the remote
X.25 system. If the user requests a throughput then it validates both
the input and output throughputs and correctly negotiates them with
the remote end.
Signed-off-by: John Hughes <john@calva.com>
Tested-by: Andrew Hendry <andrew.hendry@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is a patch to stop X.25 examining fields beyond the end of the packet.
For example, when a simple CALL ACCEPTED was received:
10 10 0f
x25_parse_facilities was attempting to decode the FACILITIES field, but this
packet contains no facilities field.
Signed-off-by: John Hughes <john@calva.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The IGMP3 report parsing is looking at the wrong address for
group records. This patch fixes it.
Reported-by: Banyeer <banyeer@yahoo.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
As pointed by Randy Dunlap, we must include linux/proc_fs.h in
net/core/dev_addr_lists.c, regardless of CONFIG_PROC_FS
Reported-by: Randy Dunlap <randy.dunlap@oracle.com>,
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
WEP crypto was broken, but upon finding the problem
it is evident that other things were broken by the
paged RX patch as well.
To fix it, for now move the linearising in front.
This means that we linearise all frames, which is
not at all what we want, but at least it fixes the
problem for now.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: Zhu Yi <yi.zhu@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>