linux/block
Tejun Heo 1d3650f713 cfq-iosched: implement hierarchy-ready cfq_group charge scaling
Currently, cfqg charges are scaled directly according to cfqg->weight.
Regardless of the number of active cfqgs or the amount of active
weights, a given weight value always scales charge the same way.  This
works fine as long as all cfqgs are treated equally regardless of
their positions in the hierarchy, which is what cfq currently
implements.  It can't work in hierarchical settings because the
interpretation of a given weight value depends on where the weight is
located in the hierarchy.

This patch reimplements cfqg charge scaling so that it can be used to
support hierarchy properly.  The scheme is fairly simple and
light-weight.

* When a cfqg is added to the service tree, v(disktime)weight is
  calculated.  It walks up the tree to root calculating the fraction
  it has in the hierarchy.  At each level, the fraction can be
  calculated as

    cfqg->weight / parent->level_weight

  By compounding these, the global fraction of vdisktime the cfqg has
  claim to - vfraction - can be determined.

* When the cfqg needs to be charged, the charge is scaled inversely
  proportionally to the vfraction.

The new scaling scheme uses the same CFQ_SERVICE_SHIFT for fixed point
representation as before; however, the smallest scaling factor is now
1 (ie. 1 << CFQ_SERVICE_SHIFT).  This is different from before where 1
was for CFQ_WEIGHT_DEFAULT and higher weight would result in smaller
scaling factor.

While this shifts the global scale of vdisktime a bit, it doesn't
change the relative relationships among cfqgs and the scheduling
result isn't different.

cfq_group_notify_queue_add uses fixed CFQ_IDLE_DELAY when appending
new cfqg to the service tree.  The specific value of CFQ_IDLE_DELAY
didn't have any relevance to vdisktime before and is unlikely to cause
any visible behavior difference now especially as the scale shift
isn't that large.

As the new scheme now makes proper distinction between cfqg->weight
and ->leaf_weight, reverse the weight aliasing for root cfqgs.  For
root, both weights are now mapped to ->leaf_weight instead of the
other way around.

Because we're still using cfqg_flat_parent(), this patch shouldn't
change the scheduling behavior in any noticeable way.

v2: Beefed up comments on vfraction as requested by Vivek.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
2013-01-09 08:05:11 -08:00
..
partitions Merge branch 'for-3.8/drivers' of git://git.kernel.dk/linux-block 2012-12-17 13:39:11 -08:00
blk-cgroup.c cfq-iosched: add leaf_weight 2013-01-09 08:05:10 -08:00
blk-cgroup.h cfq-iosched: add leaf_weight 2013-01-09 08:05:10 -08:00
blk-core.c block: export block_unplug tracepoint 2012-12-14 20:49:27 +01:00
blk-exec.c Merge branch 'for-3.8/core' of git://git.kernel.dk/linux-block 2012-12-17 08:27:23 -08:00
blk-flush.c
blk-integrity.c
blk-ioc.c block: uninitialized ioc->nr_tasks triggers WARN_ON 2012-08-01 12:17:27 +02:00
blk-iopoll.c
blk-lib.c block: add plug for blkdev_issue_discard 2012-12-14 20:46:04 +01:00
blk-map.c block: re-use existing 'reading' variable instead of checking direction again 2011-12-21 15:27:24 +01:00
blk-merge.c block: Implement support for WRITE SAME 2012-09-20 14:31:45 +02:00
blk-settings.c block: discard granularity might not be power of 2 2012-12-14 20:46:04 +01:00
blk-softirq.c sched, block: Unify cache detection 2012-01-27 13:28:48 +01:00
blk-sysfs.c block: Rename queue dead flag 2012-12-06 14:30:58 +01:00
blk-tag.c block/blk-tag.c: Remove useless kfree 2012-09-12 22:25:12 +02:00
blk-throttle.c block: Rename queue dead flag 2012-12-06 14:30:58 +01:00
blk-timeout.c block: Drop dead function blk_abort_queue() 2012-06-15 08:46:23 +02:00
blk.h block: Avoid that request_fn is invoked on a dead queue 2012-12-06 14:32:01 +01:00
bsg-lib.c bsg: Remove unused function bsg_goose_queue() 2012-12-06 14:33:02 +01:00
bsg.c bsg: fix sysfs link remove warning 2012-02-08 20:02:03 +01:00
cfq-iosched.c cfq-iosched: implement hierarchy-ready cfq_group charge scaling 2013-01-09 08:05:11 -08:00
compat_ioctl.c block: Add BLKROTATIONAL ioctl 2012-01-11 16:29:31 +01:00
deadline-iosched.c deadline: Allow 0ms deadline latency, increase the read speed 2012-12-09 19:19:23 +01:00
elevator.c block: recursive merge requests 2012-11-09 08:44:27 +01:00
genhd.c Merge branch 'for-3.8/drivers' of git://git.kernel.dk/linux-block 2012-12-17 13:39:11 -08:00
ioctl.c Merge branch 'for-3.7/core' of git://git.kernel.dk/linux-block 2012-10-11 09:04:23 +09:00
Kconfig percpu_rw_semaphore: introduce CONFIG_PERCPU_RWSEM 2012-12-17 17:15:18 -08:00
Kconfig.iosched blkcg: make CONFIG_BLK_CGROUP bool 2012-03-06 21:27:21 +01:00
Makefile separate partition format handling from generic code 2012-01-03 22:54:06 -05:00
noop-iosched.c elevator: make elevator_init_fn() return 0/-errno 2012-03-06 21:27:21 +01:00
partition-generic.c block: add partition resize function to blkpg ioctl 2012-08-01 12:24:18 +02:00
scsi_ioctl.c scsi: Silence unnecessary warnings about ioctl to partition 2012-06-15 12:52:46 +02:00