linux/net/ipv4
Eric Dumazet 95bd09eb27 tcp: TSO packets automatic sizing
After hearing many people over past years complaining against TSO being
bursty or even buggy, we are proud to present automatic sizing of TSO
packets.

One part of the problem is that tcp_tso_should_defer() uses an heuristic
relying on upcoming ACKS instead of a timer, but more generally, having
big TSO packets makes little sense for low rates, as it tends to create
micro bursts on the network, and general consensus is to reduce the
buffering amount.

This patch introduces a per socket sk_pacing_rate, that approximates
the current sending rate, and allows us to size the TSO packets so
that we try to send one packet every ms.

This field could be set by other transports.

Patch has no impact for high speed flows, where having large TSO packets
makes sense to reach line rate.

For other flows, this helps better packet scheduling and ACK clocking.

This patch increases performance of TCP flows in lossy environments.

A new sysctl (tcp_min_tso_segs) is added, to specify the
minimal size of a TSO packet (default being 2).

A follow-up patch will provide a new packet scheduler (FQ), using
sk_pacing_rate as an input to perform optional per flow pacing.

This explains why we chose to set sk_pacing_rate to twice the current
rate, allowing 'slow start' ramp up.

sk_pacing_rate = 2 * cwnd * mss / srtt

v2: Neal Cardwell reported a suspect deferring of last two segments on
initial write of 10 MSS, I had to change tcp_tso_should_defer() to take
into account tp->xmit_size_goal_segs

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Yuchung Cheng <ycheng@google.com>
Cc: Van Jacobson <vanj@google.com>
Cc: Tom Herbert <therbert@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-08-29 15:50:06 -04:00
..
netfilter netfilter: add SYNPROXY core/target 2013-08-28 00:27:54 +02:00
af_inet.c gro: remove a sparse error 2013-06-12 15:03:24 -07:00
ah4.c ipv4: properly refresh rtable entries on pmtu/redirect events 2013-06-03 00:07:42 -07:00
arp.c arp: flush arp cache on IFF_NOARP change 2013-05-28 13:11:02 -07:00
cipso_ipv4.c
datagram.c ipv4: Add a socket release callback for datagram sockets 2013-01-21 14:17:05 -05:00
devinet.c net: igmp: Allow user-space configuration of igmp unsolicited report interval 2013-08-09 11:27:46 -07:00
esp4.c net: esp{4,6}: fix potential MTU calculation overflows 2013-08-05 12:26:50 -07:00
fib_frontend.c netlink: fix splat in skb_clone with large messages 2013-06-27 22:44:16 -07:00
fib_lookup.h
fib_rules.c fib_rules: fix suppressor names and default values 2013-08-03 10:40:23 -07:00
fib_semantics.c ipv4: use next hop exceptions also for input routes 2013-06-28 21:27:47 -07:00
fib_trie.c fib_trie: remove potential out of bound access 2013-08-05 15:26:11 -07:00
gre_demux.c net: gre: move GSO functions to gre_offload 2013-07-03 14:37:39 -07:00
gre_offload.c gso: Update tunnel segmentation to support Tx checksum offload 2013-07-11 12:18:49 -07:00
icmp.c icmp: avoid allocating large struct on stack 2013-06-03 00:28:44 -07:00
igmp.c net: igmp: Allow user-space configuration of igmp unsolicited report interval 2013-08-09 11:27:46 -07:00
inet_connection_sock.c tcp: Remove TCPCT 2013-03-17 14:35:13 -04:00
inet_diag.c netlink: rename ssk to sk in struct netlink_skb_params 2013-04-19 14:57:56 -04:00
inet_fragment.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-07-09 18:24:39 -07:00
inet_hashtables.c inet: fix spacing in assignment 2013-07-11 12:02:39 -07:00
inet_lro.c ipv4: replace ip_fast_csum with csum_replace2 2013-03-15 09:12:25 -04:00
inet_timewait_sock.c hlist: drop the node parameter from iterators 2013-02-27 19:10:24 -08:00
inetpeer.c
ip_forward.c
ip_fragment.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-04-22 20:32:51 -04:00
ip_gre.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-08-16 15:37:26 -07:00
ip_input.c net: add SNMP counters tracking incoming ECN bits 2013-08-08 22:24:59 -07:00
ip_options.c net/ipv4: Ensure that location of timestamp option is stored 2013-03-12 05:35:39 -04:00
ip_output.c ipv4: ip_output: remove inline marking of EXPORT_SYMBOL functions 2013-05-11 16:12:44 -07:00
ip_sockglue.c net: prevent setting ttl=0 via IP_TTL 2013-01-08 17:57:10 -08:00
ip_tunnel_core.c ip_tunnel: Do not use inner ip-header-id for tunnel ip-header-id. 2013-08-13 16:52:50 -07:00
ip_tunnel.c ipip: potential race in ip_tunnel_init_net() 2013-08-25 18:39:59 -04:00
ip_vti.c ipip: add x-netns support 2013-08-15 01:00:20 -07:00
ipcomp.c ipv4: properly refresh rtable entries on pmtu/redirect events 2013-06-03 00:07:42 -07:00
ipconfig.c ipconfig: add informative timeout messages while waiting for carrier 2013-04-02 14:35:33 -04:00
ipip.c ipip: add x-netns support 2013-08-15 01:00:20 -07:00
ipmr.c ipmr: change the prototype of ip_mr_forward(). 2013-07-23 17:01:05 -07:00
Kconfig Kconfig: remove dangling references to the deleted file 2013-06-04 15:17:39 -07:00
Makefile net: gre: move GSO functions to gre_offload 2013-07-03 14:37:39 -07:00
netfilter.c netfilter: add my copyright statements 2013-04-18 20:27:55 +02:00
ping.c net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
proc.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-08-16 15:37:26 -07:00
protocol.c ipv4: Disallow non-namespace aware protocols to register. 2013-02-05 14:42:23 -05:00
raw.c net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
route.c ipv4: raise IP_MAX_MTU to theoretical limit 2013-08-20 15:05:04 -07:00
syncookies.c net: syncookies: export cookie_v4_init_sequence/cookie_v4_check 2013-08-28 00:27:44 +02:00
sysctl_net_ipv4.c tcp: TSO packets automatic sizing 2013-08-29 15:50:06 -04:00
tcp_bic.c
tcp_cong.c tcp: remove Appropriate Byte Count support 2013-02-05 14:51:16 -05:00
tcp_cubic.c tcp: cubic: fix bug in bictcp_acked() 2013-08-07 10:35:08 -07:00
tcp_diag.c
tcp_fastopen.c tcp: add server ip to encrypt cookie in fast open 2013-08-10 00:35:33 -07:00
tcp_highspeed.c
tcp_htcp.c
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: TSO packets automatic sizing 2013-08-29 15:50:06 -04:00
tcp_ipv4.c tcp: trivial: Remove nocache argument from tcp_v4_send_synack 2013-08-20 15:05:04 -07:00
tcp_lp.c
tcp_memcontrol.c net: tcp_memcontrol: minor: remove unused variable 2013-04-14 15:41:49 -04:00
tcp_metrics.c tcp: do not expire TCP fastopen cookies 2013-05-05 16:58:02 -04:00
tcp_minisocks.c tcp: consolidate SYNACK RTT sampling 2013-07-22 17:53:42 -07:00
tcp_offload.c net: tcp: move GRO/GSO functions to tcp_offload 2013-06-07 14:39:05 -07:00
tcp_output.c tcp: TSO packets automatic sizing 2013-08-29 15:50:06 -04:00
tcp_probe.c net: tcp_probe: allow more advanced ingress filtering by mark 2013-08-27 15:53:34 -04:00
tcp_scalable.c
tcp_timer.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c tcp: refactor F-RTO 2013-03-21 11:47:50 -04:00
tcp_yeah.c
tcp.c tcp: TSO packets automatic sizing 2013-08-29 15:50:06 -04:00
tunnel4.c
udp_diag.c netlink: rename ssk to sk in struct netlink_skb_params 2013-04-19 14:57:56 -04:00
udp_impl.h
udp_offload.c net: udp4: move GSO functions to udp_offload 2013-06-12 00:47:25 -07:00
udp.c net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
udplite.c
xfrm4_input.c net: Add skb_unclone() helper function. 2013-02-15 15:10:37 -05:00
xfrm4_mode_beet.c
xfrm4_mode_transport.c
xfrm4_mode_tunnel.c xfrm: allow to avoid copying DSCP during encapsulation 2013-03-06 07:02:45 +01:00
xfrm4_output.c
xfrm4_policy.c xfrm: make gc_thresh configurable in all namespaces 2013-02-06 11:36:29 +01:00
xfrm4_state.c
xfrm4_tunnel.c sit: add IPv4 over IPv4 support 2013-05-31 17:19:05 -07:00