linux/net/sched
Eric Dumazet 4eaf3b84f2 net_sched: fix qdisc_tree_decrease_qlen() races
qdisc_tree_decrease_qlen() suffers from two problems on multiqueue
devices.

One problem is that it updates sch->q.qlen and sch->qstats.drops
on the mq/mqprio root qdisc, while it should not : Daniele
reported underflows errors :
[  681.774821] PAX: sch->q.qlen: 0 n: 1
[  681.774825] PAX: size overflow detected in function qdisc_tree_decrease_qlen net/sched/sch_api.c:769 cicus.693_49 min, count: 72, decl: qlen; num: 0; context: sk_buff_head;
[  681.774954] CPU: 2 PID: 19 Comm: ksoftirqd/2 Tainted: G           O    4.2.6.201511282239-1-grsec #1
[  681.774955] Hardware name: ASUSTeK COMPUTER INC. X302LJ/X302LJ, BIOS X302LJ.202 03/05/2015
[  681.774956]  ffffffffa9a04863 0000000000000000 0000000000000000 ffffffffa990ff7c
[  681.774959]  ffffc90000d3bc38 ffffffffa95d2810 0000000000000007 ffffffffa991002b
[  681.774960]  ffffc90000d3bc68 ffffffffa91a44f4 0000000000000001 0000000000000001
[  681.774962] Call Trace:
[  681.774967]  [<ffffffffa95d2810>] dump_stack+0x4c/0x7f
[  681.774970]  [<ffffffffa91a44f4>] report_size_overflow+0x34/0x50
[  681.774972]  [<ffffffffa94d17e2>] qdisc_tree_decrease_qlen+0x152/0x160
[  681.774976]  [<ffffffffc02694b1>] fq_codel_dequeue+0x7b1/0x820 [sch_fq_codel]
[  681.774978]  [<ffffffffc02680a0>] ? qdisc_peek_dequeued+0xa0/0xa0 [sch_fq_codel]
[  681.774980]  [<ffffffffa94cd92d>] __qdisc_run+0x4d/0x1d0
[  681.774983]  [<ffffffffa949b2b2>] net_tx_action+0xc2/0x160
[  681.774985]  [<ffffffffa90664c1>] __do_softirq+0xf1/0x200
[  681.774987]  [<ffffffffa90665ee>] run_ksoftirqd+0x1e/0x30
[  681.774989]  [<ffffffffa90896b0>] smpboot_thread_fn+0x150/0x260
[  681.774991]  [<ffffffffa9089560>] ? sort_range+0x40/0x40
[  681.774992]  [<ffffffffa9085fe4>] kthread+0xe4/0x100
[  681.774994]  [<ffffffffa9085f00>] ? kthread_worker_fn+0x170/0x170
[  681.774995]  [<ffffffffa95d8d1e>] ret_from_fork+0x3e/0x70

mq/mqprio have their own ways to report qlen/drops by folding stats on
all their queues, with appropriate locking.

A second problem is that qdisc_tree_decrease_qlen() calls qdisc_lookup()
without proper locking : concurrent qdisc updates could corrupt the list
that qdisc_match_from_root() parses to find a qdisc given its handle.

Fix first problem adding a TCQ_F_NOPARENT qdisc flag that
qdisc_tree_decrease_qlen() can use to abort its tree traversal,
as soon as it meets a mq/mqprio qdisc children.

Second problem can be fixed by RCU protection.
Qdisc are already freed after RCU grace period, so qdisc_list_add() and
qdisc_list_del() simply have to use appropriate rcu list variants.

A future patch will add a per struct netdev_queue list anchor, so that
qdisc_tree_decrease_qlen() can have more efficient lookups.

Reported-by: Daniele Fucini <dfucini@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Cong Wang <cwang@twopensource.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-03 14:59:05 -05:00
..
act_api.c net_sched: make tcf_hash_destroy() static 2015-08-26 11:01:44 -07:00
act_bpf.c bpf: add bpf_redirect() helper 2015-09-17 21:09:07 -07:00
act_connmark.c netfilter: nf_conntrack: Add a struct net parameter to l4_pkt_to_tuple 2015-09-18 22:00:04 +02:00
act_csum.c net: sched: add percpu stats to actions 2015-07-08 13:50:41 -07:00
act_gact.c net_sched: act_gact: remove spinlock in fast path 2015-07-08 13:50:42 -07:00
act_ipt.c netfilter: x_tables: Pass struct net in xt_action_param 2015-09-18 21:58:14 +02:00
act_mirred.c act_mirred: clear sender cpu before sending to tx 2015-10-08 04:59:04 -07:00
act_nat.c net: Change pseudohdr argument of inet_proto_csum_replace* to be a bool 2015-08-17 21:33:06 -07:00
act_pedit.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-07-31 23:52:20 -07:00
act_police.c
act_simple.c net: sched: add percpu stats to actions 2015-07-08 13:50:41 -07:00
act_skbedit.c net: sched: add percpu stats to actions 2015-07-08 13:50:41 -07:00
act_vlan.c net: sched: add percpu stats to actions 2015-07-08 13:50:41 -07:00
cls_api.c net: sched: fix call_rcu() race on classifier module unloads 2015-05-21 18:48:18 -04:00
cls_basic.c
cls_bpf.c sched, bpf: add helper for retrieving routing realms 2015-10-03 05:02:41 -07:00
cls_cgroup.c cls_cgroup: factor out classid retrieval 2015-07-20 12:41:30 -07:00
cls_flow.c sched: cls_flow: use skb_to_full_sk() helper 2015-11-08 20:56:39 -05:00
cls_flower.c flow_dissector: Add flags argument to skb_flow_dissector functions 2015-09-01 15:06:22 -07:00
cls_fw.c net: revert "net_sched: move tp->root allocation into fw_init()" 2015-09-24 14:33:30 -07:00
cls_route.c
cls_rsvp6.c
cls_rsvp.c
cls_rsvp.h net_sched: convert rsvp to call tcf_exts_destroy from rcu callback 2015-08-26 11:01:45 -07:00
cls_tcindex.c net_sched: convert tcindex to call tcf_exts_destroy from rcu callback 2015-08-26 11:01:44 -07:00
cls_u32.c cls_u32: complete the check for non-forced case in u32_destroy() 2015-08-25 17:02:48 -07:00
em_canid.c
em_cmp.c
em_ipset.c netfilter: x_tables: Pass struct net in xt_action_param 2015-09-18 21:58:14 +02:00
em_meta.c net_sched: em_meta: use skb_to_full_sk() helper 2015-11-08 20:56:39 -05:00
em_nbyte.c
em_text.c
em_u32.c
ematch.c
Kconfig net: add CONFIG_NET_INGRESS to enable ingress filtering 2015-05-14 01:10:05 -04:00
Makefile tc: introduce Flower classifier 2015-05-13 15:19:48 -04:00
sch_api.c net_sched: fix qdisc_tree_decrease_qlen() races 2015-12-03 14:59:05 -05:00
sch_atm.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_blackhole.c net/sched: make sch_blackhole.c explicitly non-modular 2015-10-09 07:52:28 -07:00
sch_cbq.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_choke.c net: sched: kill dead code in sch_choke.c 2015-11-03 13:30:47 -05:00
sch_codel.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-05-13 14:31:43 -04:00
sch_drr.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_dsmark.c sch_dsmark: improve memory locality 2015-09-17 22:37:19 -07:00
sch_fifo.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_fq_codel.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_fq.c net: synack packets can be attached to request sockets 2015-10-11 05:05:06 -07:00
sch_generic.c net_sched: fix qdisc_tree_decrease_qlen() races 2015-12-03 14:59:05 -05:00
sch_gred.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_hfsc.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_hhf.c sch_hhf: fix return value of hhf_drop() 2015-10-11 04:49:33 -07:00
sch_htb.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_ingress.c net: sched: further simplify handle_ing 2015-05-11 11:10:35 -04:00
sch_mq.c net_sched: fix qdisc_tree_decrease_qlen() races 2015-12-03 14:59:05 -05:00
sch_mqprio.c net_sched: fix qdisc_tree_decrease_qlen() races 2015-12-03 14:59:05 -05:00
sch_multiq.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_netem.c net: sched: deprecate enqueue_root() 2015-05-11 14:17:32 -04:00
sch_pie.c
sch_plug.c net: sched: drop all special handling of tx_queue_len == 0 2015-08-18 11:55:08 -07:00
sch_prio.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_qfq.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_red.c
sch_sfb.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_sfq.c net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
sch_tbf.c
sch_teql.c