linux/net/ipv4
Wei Wang ba615f6752 tcp: avoid fastopen API to be used on AF_UNSPEC
The fastopen API should be used to perform fastopen operations on the TCP
socket. It does not make sense to use the fastopen API to perform a disconnect
by calling it with AF_UNSPEC. The fastopen data path is also prone to
race conditions and bugs when used with AF_UNSPEC.
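
For reference, the call pattern in question looks roughly like the sketch below. The helper names `make_unspec_addr` and `fastopen_send_unspec` are hypothetical, and the fallback definition of `MSG_FASTOPEN` is only there for older libc headers. This is an illustration of the user-space call, not code from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

#ifndef MSG_FASTOPEN
#define MSG_FASTOPEN 0x20000000 /* value from the Linux UAPI headers */
#endif

/* Hypothetical helper: build a destination address whose family is AF_UNSPEC. */
static struct sockaddr make_unspec_addr(void)
{
	struct sockaddr sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_family = AF_UNSPEC;
	return sa;
}

/*
 * Hypothetical helper: issue a sendto() with MSG_FASTOPEN and an
 * AF_UNSPEC destination on an existing socket descriptor.
 * Returns 0 on success or a negative errno value on failure.
 */
static int fastopen_send_unspec(int fd, const void *buf, size_t len)
{
	struct sockaddr sa = make_unspec_addr();
	ssize_t n;

	assert(buf != NULL);
	n = sendto(fd, buf, len, MSG_FASTOPEN, &sa, sizeof(sa));
	return n < 0 ? -errno : 0;
}
```

On a kernel that carries this change, such a call on a TCP socket is expected to fail with EOPNOTSUPP rather than disconnect the socket.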

One issue reported and analyzed by Vegard Nossum is as follows:
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Thread A:                            Thread B:
------------------------------------------------------------------------
sendto()
 - tcp_sendmsg()
     - sk_stream_memory_free() = 0
         - goto wait_for_sndbuf
             - sk_stream_wait_memory()
                - sk_wait_event() // sleep
          |                          sendto(flags=MSG_FASTOPEN, dest_addr=AF_UNSPEC)
          |                           - tcp_sendmsg()
          |                              - tcp_sendmsg_fastopen()
          |                                 - __inet_stream_connect()
          |                                    - tcp_disconnect() // because of AF_UNSPEC
          |                                       - tcp_transmit_skb() // send RST
          |                                    - return 0; // no reconnect!
          |                           - sk_stream_wait_connect()
          |                                 - sock_error()
          |                                    - xchg(&sk->sk_err, 0)
          |                                    - return -ECONNRESET
        - ... // wake up, see sk->sk_err == 0
    - skb_entail() on TCP_CLOSE socket

If the connection is reopened then we will send a brand new SYN packet
after thread A has already queued a buffer. At this point I think the
socket internal state (sequence numbers etc.) becomes messed up.

When the new connection is closed, the FIN-ACK is rejected because the
sequence number is outside the window. The other side tries to retransmit,
but __tcp_retransmit_skb() calls tcp_trim_head() on an empty skb which
corrupts the skb data length and hits a BUG() in copy_and_csum_bits().
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++
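
The lost error in the trace above can be modelled in user space with C11 atomics. This is a minimal sketch, not kernel code: `struct model_sock` and `model_sock_error` are hypothetical names, and only the read-and-clear exchange from the trace is reproduced.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Simplified stand-in for the socket; only the pending error is modelled. */
struct model_sock {
	atomic_int sk_err;
};

/*
 * Models the read-and-clear step from the trace: fetch the pending error
 * and reset it to 0 in one atomic exchange, returning it as a negative
 * errno value (or 0 when no error was pending).
 */
static int model_sock_error(struct model_sock *sk)
{
	assert(sk != NULL);
	return -atomic_exchange(&sk->sk_err, 0);
}
```

Whichever caller performs the exchange first receives the error; a later caller sees 0. That matches thread A waking up and finding sk->sk_err == 0 after thread B has already consumed the reset.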

Hence, this patch adds a check for AF_UNSPEC in the fastopen data path
and returns EOPNOTSUPP to the user in that case.
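
A check of this kind could look roughly like the following user-space sketch. The function name `fastopen_check_family` and its signature are assumptions made for illustration; the sketch only models the decision described above and is not the code added to the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>

/*
 * Hypothetical model of the check: reject a fastopen request whose
 * destination address family is AF_UNSPEC. A NULL address, or one too
 * short to hold sa_family, passes here and is left to later validation.
 */
static int fastopen_check_family(const struct sockaddr *uaddr, size_t namelen)
{
	if (uaddr && namelen >= sizeof(uaddr->sa_family) &&
	    uaddr->sa_family == AF_UNSPEC)
		return -EOPNOTSUPP;
	return 0;
}
```

Rejecting the request before any connect logic runs keeps the disconnect-on-AF_UNSPEC behaviour out of the fastopen path entirely, so the sequence shown in the trace cannot start from sendto().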

Fixes: cf60af03ca ("tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Wei Wang <weiwan@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-05-25 13:30:34 -04:00
netfilter Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next 2017-05-01 10:47:53 -04:00
af_inet.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2017-05-02 16:40:27 -07:00
ah4.c
arp.c arp: always override existing neigh entries with gratuitous ARP 2017-05-21 13:26:45 -04:00
cipso_ipv4.c
datagram.c
devinet.c net: rtnetlink: plumb extended ack to doit function 2017-04-17 15:35:38 -04:00
esp4_offload.c esp4/6: Fix GSO path for non-GSO SW-crypto packets 2017-04-19 07:48:57 +02:00
esp4.c esp4: Fix udpencap for local TCP packets. 2017-05-04 07:27:26 +02:00
fib_frontend.c net: Improve handling of failures on link and route dumps 2017-05-16 14:54:11 -04:00
fib_lookup.h
fib_notifier.c ipv4: fib: Remove redundant argument 2017-03-10 09:45:09 -08:00
fib_rules.c ipv4: fib_rules: Dump FIB rules when registering FIB notifier 2017-03-16 10:18:34 -07:00
fib_semantics.c net: ipv4: add support for ECMP hash policy choice 2017-03-21 15:27:19 -07:00
fib_trie.c net: Improve handling of failures on link and route dumps 2017-05-16 14:54:11 -04:00
fou.c
gre_demux.c
gre_offload.c
icmp.c net: ipv4: add support for ECMP hash policy choice 2017-03-21 15:27:19 -07:00
igmp.c igmp, mld: Fix memory leak in igmpv3/mld_del_delrec() 2017-02-09 16:43:45 -05:00
inet_connection_sock.c dccp/tcp: do not inherit mc_list from parent 2017-05-09 15:17:49 -04:00
inet_diag.c
inet_fragment.c
inet_hashtables.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
inet_timewait_sock.c
inetpeer.c
ip_forward.c
ip_fragment.c inet: frag: release spinlock before calling icmp_send() 2017-03-22 15:40:45 -07:00
ip_gre.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ip_input.c net: Add sysctl to toggle early demux for tcp and udp 2017-03-24 13:17:07 -07:00
ip_options.c
ip_output.c udp: avoid ufo handling on IP payload compression packets 2017-03-09 18:28:42 -08:00
ip_sockglue.c ipv4: get rid of ip_ra_lock 2017-04-30 22:44:04 -04:00
ip_tunnel_core.c netlink: pass extended ACK struct to parsing functions 2017-04-13 13:58:22 -04:00
ip_tunnel.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ip_vti.c vti: check nla_put_* return value 2017-05-08 15:10:31 -04:00
ipcomp.c
ipconfig.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-04-06 08:24:51 -07:00
ipip.c ip_tunnel: Allow policy-based routing through tunnels 2017-04-21 13:21:31 -04:00
ipmr.c ipmr: vrf: Find VIFs using the actual device 2017-05-16 12:52:17 -04:00
Kconfig Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next 2017-02-16 21:25:49 -05:00
Makefile ipv4: fib: Move FIB notification code to a separate file 2017-03-10 09:45:09 -08:00
netfilter.c netfilter: use skb_to_full_sk in ip_route_me_harder 2017-02-28 12:49:36 +01:00
ping.c ping: implement proper locking 2017-03-24 20:50:28 -07:00
proc.c net/tcp_fastopen: Add snmp counter for blackhole detection 2017-04-24 14:27:17 -04:00
protocol.c net: Add sysctl to toggle early demux for tcp and udp 2017-03-24 13:17:07 -07:00
raw_diag.c
raw.c ipv4, ipv6: ensure raw socket message is big enough to hold an IP header 2017-05-04 11:02:46 -04:00
route.c Revert "ipv4: restore rt->fi for reference counting" 2017-05-08 22:35:32 -04:00
syncookies.c tcp: randomize timestamps on syncookies 2017-05-05 12:00:11 -04:00
sysctl_net_ipv4.c net/tcp_fastopen: Disable active side TFO in certain scenarios 2017-04-24 14:27:17 -04:00
tcp_bbr.c
tcp_bic.c
tcp_cdg.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/clock.h> 2017-03-02 08:42:27 +01:00
tcp_cong.c tcp: memset ca_priv data to 0 properly 2017-04-26 14:58:32 -04:00
tcp_cubic.c tcp_cubic: fix typo in module param description 2017-04-20 16:16:44 -04:00
tcp_dctcp.c
tcp_diag.c
tcp_fastopen.c net/tcp_fastopen: Add snmp counter for blackhole detection 2017-04-24 14:27:17 -04:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: eliminate negative reordering in tcp_clean_rtx_queue 2017-05-16 12:45:21 -04:00
tcp_ipv4.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-05-10 10:30:46 -07:00
tcp_lp.c tcp: fix wraparound issue in tcp_lp 2017-05-02 15:07:02 -04:00
tcp_metrics.c treewide: use kv[mz]alloc* rather than opencoded variants 2017-05-08 17:15:13 -07:00
tcp_minisocks.c tcp: do not inherit fastopen_req from parent 2017-05-04 11:00:04 -04:00
tcp_nv.c
tcp_offload.c
tcp_output.c tcp: make congestion control optionally skip slow start after idle 2017-05-08 14:37:07 -04:00
tcp_probe.c tcp: Revert "tcp: tcp_probe: use spin_lock_bh()" 2017-02-21 13:26:03 -05:00
tcp_rate.c tcp: do not pass timestamp to tcp_rate_gen() 2017-04-26 14:44:38 -04:00
tcp_recovery.c tcp: tcp_rack_reo_timeout() must update tp->tcp_mstamp 2017-04-27 11:46:15 -04:00
tcp_scalable.c
tcp_timer.c net/tcp_fastopen: Remove mss check in tcp_write_timeout() 2017-04-24 14:27:17 -04:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c tcp_westwood: fix tcp_westwood_info() style mistakes 2017-03-16 20:23:28 -07:00
tcp_yeah.c
tcp.c tcp: avoid fastopen API to be used on AF_UNSPEC 2017-05-25 13:30:34 -04:00
tunnel4.c
udp_diag.c
udp_impl.h udp: make *udp*_queue_rcv_skb() functions static 2017-05-18 10:23:33 -04:00
udp_offload.c udp: disable inner UDP checksum offloads in IPsec case 2017-04-24 13:48:54 -04:00
udp_tunnel.c
udp.c udp: make *udp*_queue_rcv_skb() functions static 2017-05-18 10:23:33 -04:00
udplite.c
xfrm4_input.c esp: Add a software GRO codepath 2017-02-15 11:04:11 +01:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c xfrm: Add encapsulation header offsets while SKB is not encrypted 2017-04-14 10:07:39 +02:00
xfrm4_mode_tunnel.c xfrm: Add encapsulation header offsets while SKB is not encrypted 2017-04-14 10:07:39 +02:00
xfrm4_output.c xfrm: Add an IPsec hardware offloading API 2017-04-14 10:06:10 +02:00
xfrm4_policy.c xfrm: policy: make policy backend const 2017-02-09 10:22:19 +01:00
xfrm4_protocol.c xfrm: input: constify xfrm_input_afinfo 2017-02-09 10:22:17 +01:00
xfrm4_state.c
xfrm4_tunnel.c