linux/net/netfilter
Herbert Xu 9a279bcbe3 net: Partially allow skb destructors to be used on receive path
As it currently stands, skb destructors are forbidden on the
receive path because the protocol end-points will overwrite
any existing destructor with their own.

This is the reason why we have to call skb_orphan in the loopback
driver before we reinject the packet back into the stack, thus
creating a period during which loopback traffic isn't charged
to any socket.

With virtualisation, we have a similar problem in that traffic
is reinjected into the stack without being associated with any
socket entity, thus providing no natural congestion push-back
for those poor folks still stuck with UDP.

Now had we been consistent in telling them that UDP simply has
no congestion feedback, I could just fob them off.  Unfortunately,
we appear to have gone to some length in catering for this on
the standard UDP path, with skb/socket accounting so that has
created a very unhealthy dependency.

Alas habits are difficult to break out of, so we may just have
to allow skb destructors on the receive path.

It turns out that making skb destructors useable on the receive path
isn't as easy as it seems.  For instance, simply adding skb_orphan
to skb_set_owner_r isn't enough.  This is because we assume all
over the IP stack that skb->sk is an IP socket if present.

The new transparent proxy code goes one step further and assumes
that skb->sk is the receiving socket if present.

Now all of this can be dealt with by adding simple checks such
as only treating skb->sk as an IP socket if skb->sk->sk_family
matches.  However, it turns out that for bridging at least we
don't need to do all of this work.

This is of interest because most virtualisation setups use bridging
so we don't actually go through the IP stack on the host (with
the exception of our old nemesis the bridge netfilter, but that's
easily taken care of).

So this patch simply adds skb_orphan to the point just before we
enter the IP stack, but after we've gone through the bridge on the
receive path.  It also adds an skb_orphan to the one place in
netfilter that touches skb->sk/skb->destructor, that is, tproxy.

One word of caution, because of the internal code structure, anyone
wishing to deploy this must use skb_set_owner_w as opposed to
skb_set_owner_r since many functions that create a new skb from
an existing one will invoke skb_set_owner_w on the new skb.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-04 16:55:27 -08:00
..
ipvs net: replace uses of __constant_{endian} 2009-02-01 00:45:17 -08:00
core.c netfilter: enable netfilter in netns 2008-10-08 11:35:11 +02:00
Kconfig netfilter: xt_NFLOG is dependant of nfnetlink_log 2008-12-10 17:24:33 -08:00
Makefile Merge branch 'lvs-next-2.6' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/lvs-2.6 2008-10-08 14:26:36 -07:00
nf_conntrack_acct.c net: '&' redux 2008-11-03 18:21:05 -08:00
nf_conntrack_amanda.c net: replace uses of __constant_{endian} 2009-02-01 00:45:17 -08:00
nf_conntrack_core.c netfilter 07/09: simplify nf_conntrack_alloc() error handling 2009-01-12 21:18:36 -08:00
nf_conntrack_ecache.c netfilter: ctnetlink: deliver events for conntracks changed from userspace 2008-11-18 11:56:20 +01:00
nf_conntrack_expect.c netfilter: ctnetlink: deliver events for conntracks changed from userspace 2008-11-18 11:56:20 +01:00
nf_conntrack_extend.c netfilter: nf_conntrack_extend: avoid unnecessary "ct->ext" dereferences 2008-07-26 17:50:05 -07:00
nf_conntrack_ftp.c netfilter: fix warning in net/netfilter/nf_conntrack_ftp.c 2008-11-25 18:23:03 +01:00
nf_conntrack_h323_asn1.c [NETFILTER]: nf_conntrack_h323: constify and annotate H.323 helper 2008-01-31 19:28:07 -08:00
nf_conntrack_h323_main.c net: replace uses of __constant_{endian} 2009-02-01 00:45:17 -08:00
nf_conntrack_h323_types.c [NETFILTER]: nf_conntrack_h323: constify and annotate H.323 helper 2008-01-31 19:28:07 -08:00
nf_conntrack_helper.c Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6 2008-11-28 02:19:15 -08:00
nf_conntrack_irc.c netfilter: nf_conntrack: connection tracking helper name persistent aliases 2008-11-17 16:01:42 +01:00
nf_conntrack_l3proto_generic.c [NETFILTER]: nf_conntrack: use bool type in struct nf_conntrack_l3proto 2008-04-14 11:15:52 +02:00
nf_conntrack_netbios_ns.c net: replace uses of __constant_{endian} 2009-02-01 00:45:17 -08:00
nf_conntrack_netlink.c netfilter: ctnetlink: fix scheduling while atomic 2009-01-21 12:19:49 -08:00
nf_conntrack_pptp.c net: replace uses of __constant_{endian} 2009-02-01 00:45:17 -08:00
nf_conntrack_proto_dccp.c netfilter: netns nf_conntrack: per-netns net.netfilter.nf_conntrack_log_invalid sysctl 2008-10-08 11:35:08 +02:00
nf_conntrack_proto_generic.c net: '&' redux 2008-11-03 18:21:05 -08:00
nf_conntrack_proto_gre.c netfilter: nf_conntrack_proto_gre: spread __exit 2008-11-20 10:01:37 +01:00
nf_conntrack_proto_sctp.c netfilter: nf_conntrack_proto_sctp: avoid bogus warning 2008-11-24 13:47:21 +01:00
nf_conntrack_proto_tcp.c net: '&' redux 2008-11-03 18:21:05 -08:00
nf_conntrack_proto_udp.c net: '&' redux 2008-11-03 18:21:05 -08:00
nf_conntrack_proto_udplite.c net: '&' redux 2008-11-03 18:21:05 -08:00
nf_conntrack_proto.c netfilter: netns ct: walk netns list under RTNL 2008-11-05 03:03:18 -08:00
nf_conntrack_sane.c netfilter: nf_conntrack: connection tracking helper name persistent aliases 2008-11-17 16:01:42 +01:00
nf_conntrack_sip.c netfilter: nf_conntrack: connection tracking helper name persistent aliases 2008-11-17 16:01:42 +01:00
nf_conntrack_standalone.c cpumask: prepare for iterators to only go to nr_cpu_ids/nr_cpumask_bits: net 2008-12-29 22:44:47 -08:00
nf_conntrack_tftp.c netfilter: nf_conntrack: connection tracking helper name persistent aliases 2008-11-17 16:01:42 +01:00
nf_internals.h netfilter: Use unsigned types for hooknum and pf vars 2008-10-08 11:35:00 +02:00
nf_log.c netfilter: Introduce NFPROTO_* constants 2008-10-08 11:35:00 +02:00
nf_queue.c netfilter: Introduce NFPROTO_* constants 2008-10-08 11:35:00 +02:00
nf_sockopt.c netfilter: enable netfilter in netns 2008-10-08 11:35:11 +02:00
nf_tproxy_core.c net: Partially allow skb destructors to be used on receive path 2009-02-04 16:55:27 -08:00
nfnetlink_log.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6 2008-12-28 12:49:40 -08:00
nfnetlink_queue.c netns: Use net_eq() to compare net-namespaces for optimization. 2008-07-19 22:34:43 -07:00
nfnetlink.c net: Remove CONFIG_KMOD from net/ (towards removing CONFIG_KMOD entirely) 2008-10-16 15:24:51 -07:00
x_tables.c netfilter 04/09: x_tables: fix match/target revision lookup 2009-01-12 21:18:34 -08:00
xt_CLASSIFY.c netfilter: xtables: move extension arguments into compound structure (4/6) 2008-10-08 11:35:19 +02:00
xt_comment.c netfilter: xtables: use NFPROTO_UNSPEC in more extensions 2008-10-08 11:35:20 +02:00
xt_connbytes.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_connlimit.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_connmark.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_CONNMARK.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_CONNSECMARK.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_conntrack.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_dccp.c nf/dccp: merge errorpaths 2008-12-14 23:19:02 -08:00
xt_dscp.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_DSCP.c netfilter: xtables: move extension arguments into compound structure (5/6) 2008-10-08 11:35:19 +02:00
xt_esp.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_hashlimit.c net: replace NIPQUAD() in net/netfilter/ 2008-10-31 00:54:29 -07:00
xt_helper.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_iprange.c net: replace NIPQUAD() in net/netfilter/ 2008-10-31 00:54:29 -07:00
xt_length.c netfilter: xtables: move extension arguments into compound structure (1/6) 2008-10-08 11:35:18 +02:00
xt_limit.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_mac.c netfilter: xtables: use NFPROTO_UNSPEC in more extensions 2008-10-08 11:35:20 +02:00
xt_mark.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_MARK.c netfilter: xtables: use NFPROTO_UNSPEC in more extensions 2008-10-08 11:35:20 +02:00
xt_multiport.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_NFLOG.c netfilter: xt_NFLOG: don't call nf_log_packet in NFLOG module. 2008-11-04 14:21:08 +01:00
xt_NFQUEUE.c netfilter: replace old NF_ARP calls with NFPROTO_ARP 2008-10-20 03:34:51 -07:00
xt_NOTRACK.c netfilter: xtables: use NFPROTO_UNSPEC in more extensions 2008-10-08 11:35:20 +02:00
xt_owner.c CRED: Use creds in file structs 2008-11-14 10:39:25 +11:00
xt_physdev.c netfilter: xtables: use NFPROTO_UNSPEC in more extensions 2008-10-08 11:35:20 +02:00
xt_pkttype.c netfilter: xtables: cut down on static data for family-independent extensions 2008-10-08 11:35:20 +02:00
xt_policy.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_quota.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_rateest.c netfilter: xtables: move extension arguments into compound structure (3/6) 2008-10-08 11:35:19 +02:00
xt_RATEEST.c netfilter: xtables: move extension arguments into compound structure (6/6) 2008-10-08 11:35:19 +02:00
xt_realm.c netfilter: xtables: use NFPROTO_UNSPEC in more extensions 2008-10-08 11:35:20 +02:00
xt_recent.c netfilter: xt_recent: don't save proc dirs 2008-11-20 09:57:01 +01:00
xt_sctp.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_SECMARK.c netfilter: xtables: move extension arguments into compound structure (6/6) 2008-10-08 11:35:19 +02:00
xt_socket.c tproxy: fixe a possible read from an invalid location in the socket match 2008-12-07 23:53:46 -08:00
xt_state.c netfilter: xtables: move extension arguments into compound structure (3/6) 2008-10-08 11:35:19 +02:00
xt_statistic.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_string.c netfilter: xtables: move extension arguments into compound structure (3/6) 2008-10-08 11:35:19 +02:00
xt_tcpmss.c netfilter: xtables: move extension arguments into compound structure (1/6) 2008-10-08 11:35:18 +02:00
xt_TCPMSS.c netfilter: xtables: move extension arguments into compound structure (5/6) 2008-10-08 11:35:19 +02:00
xt_TCPOPTSTRIP.c netfilter: xtables: move extension arguments into compound structure (4/6) 2008-10-08 11:35:19 +02:00
xt_tcpudp.c netfilter: xtables: move extension arguments into compound structure (2/6) 2008-10-08 11:35:18 +02:00
xt_time.c netfilter 08/09: xt_time: print timezone for user information 2009-01-12 21:18:36 -08:00
xt_TPROXY.c netfilter: xtables: move extension arguments into compound structure (5/6) 2008-10-08 11:35:19 +02:00
xt_TRACE.c netfilter: xtables: move extension arguments into compound structure (4/6) 2008-10-08 11:35:19 +02:00
xt_u32.c netfilter: xtables: move extension arguments into compound structure (1/6) 2008-10-08 11:35:18 +02:00