linux/drivers
Mike Snitzer 0ce65797a7 dm: impose configurable deadline for dm_request_fn's merge heuristic
Without such a deadline, for sequential workloads, dm_request_fn can
allow excessive request merging at the expense of increased service time.

Add a per-device sysfs attribute that lets the user control how long a
request that is a reasonable merge candidate can be queued on the
request queue.  The resolution of this request dispatch deadline is in
microseconds (ranging from 1 to 100000 usecs).  To set a 20us deadline:
  echo 20 > /sys/block/dm-7/dm/rq_based_seq_io_merge_deadline

The dm_request_fn's merge heuristic and associated extra accounting is
disabled by default (rq_based_seq_io_merge_deadline is 0).

This sysfs attribute is not applicable to bio-based DM devices so it
will only ever report 0 for them.
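
As an illustration of the range described above, here is a minimal sketch of a user-space helper that validates a value before writing it to the attribute. The function names and the decision to reject out-of-range values in the helper are assumptions made for this example and are not part of the commit; the sketch also assumes that writing 0 is accepted as a way to disable the heuristic, which the text implies but does not state outright.

```python
from pathlib import Path

# Upper bound taken from the commit message (1 to 100000 usecs).
MAX_DEADLINE_USECS = 100_000


def set_merge_deadline(attr_path, usecs):
    """Write a merge deadline (in microseconds) to the given sysfs attribute.

    attr_path is expected to look like
    /sys/block/dm-7/dm/rq_based_seq_io_merge_deadline.
    Assumption: 0 is accepted and disables the heuristic.
    """
    if not 0 <= usecs <= MAX_DEADLINE_USECS:
        raise ValueError(
            f"deadline must be between 0 and {MAX_DEADLINE_USECS} usecs"
        )
    Path(attr_path).write_text(f"{usecs}\n")


def get_merge_deadline(attr_path):
    """Read the current deadline; bio-based DM devices report 0."""
    return int(Path(attr_path).read_text().strip())
```

Calling set_merge_deadline("/sys/block/dm-7/dm/rq_based_seq_io_merge_deadline", 20) would be the equivalent of the echo command above and would normally require root privileges.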

Allowing a request to remain on the queue blocks other requests
on the queue.  But introducing a short dequeue delay has proven
very effective at enabling certain sequential IO workloads on really
fast, yet IOPS constrained, devices to build up slightly larger IOs --
yielding 90+% throughput improvements.  Having precise control over the
time taken to wait for larger requests to build affords control beyond
that of waiting for certain IO sizes to accumulate (which would require
a deadline anyway).  This knob will only ever make sense with sequential
IO workloads and the particular value used is storage configuration
specific.
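
The trade-off described above can be pictured with a small decision sketch. This is not the kernel's dm_request_fn code; the function and parameter names are hypothetical and only model the idea that a merge candidate is held back until the configured deadline has elapsed.

```python
def should_hold_for_merge(deadline_us, queued_us, is_merge_candidate):
    """Decide whether to keep a request queued in the hope of a merge.

    deadline_us:        configured deadline in microseconds (0 = disabled)
    queued_us:          how long the request has waited so far, in usecs
    is_merge_candidate: True if the request looks like sequential IO that
                        a following request could be merged into
    """
    if deadline_us == 0:
        # Heuristic disabled (the default): dispatch immediately.
        return False
    if not is_merge_candidate:
        # Nothing to gain from waiting on non-sequential IO.
        return False
    # Hold the request only while the deadline has not yet expired.
    return queued_us < deadline_us
```

With a 20us deadline, a sequential request that has waited 5us would be held, while the same request at 20us or later would be dispatched regardless of whether a merge occurred.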

Given the expected niche use-case for when this knob is useful it has
been deemed acceptable to expose this relatively crude method for
crafting optimal IO on specific storage -- especially given the solution
is simple yet effective.  In the context of DM multipath, it is
advisable to tune this sysfs attribute to a value that offers the best
performance for the common case (e.g. if 4 paths are expected active,
tune for that; if paths fail then performance may be slightly reduced).

Alternatives were explored to have request-based DM autotune this value
(e.g. if/when paths fail) but they were quickly deemed too fragile and
complex to warrant further design and development time.  If this problem
proves more common as faster storage emerges we'll have to look at
elevating a generic solution into the block core.

Tested-by: Shiva Krishna Merla <shivakrishna.merla@netapp.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2015-04-15 12:10:15 -04:00
..
accessibility
acpi Revert "x86/PCI: Refine the way to release PCI IRQ resources" 2015-03-20 14:56:19 +01:00
amba
android android: binder: fix binder mmap failures 2015-03-01 18:43:51 -08:00
ata ata: Add a new flag to destinguish sas controller 2015-03-19 14:14:43 -04:00
atm
auxdisplay
base regmap: Fix for v4.0 2015-03-24 16:42:54 -07:00
bcma
block NVMe: Initialize device list head before starting 2015-03-23 09:35:12 -06:00
bluetooth Bluetooth: btusb: Fix issue with CSR based Intel Wireless controllers 2015-02-23 09:30:35 +02:00
bus ARM: SoC platform changes 2015-02-17 09:27:54 -08:00
cdrom
char Not entirely surprising: the ongoing QEMU work on virtio 1.0 has revealed 2015-03-17 10:36:01 -07:00
clk The clk fixes for 4.0-rc4 comprise three themes. First are the usual 2015-03-15 15:07:08 -07:00
clocksource clocksource/drivers/sun5i: Fix cpufreq interaction with sched_clock() 2015-03-26 10:59:40 +01:00
connector
coresight
cpufreq Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal 2015-03-06 13:43:33 -08:00
cpuidle cpuidle: mvebu: Update cpuidle thresholds for Armada XP SOCs 2015-03-13 18:31:29 +01:00
crypto Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma 2015-02-18 08:49:20 -08:00
dca
devfreq
dio
dma dmaengine: pl08x: Define capabilities for generic capabilities reporting 2015-03-18 21:34:29 +05:30
dma-buf
edac * A fix to sb_edac for proper detection on SNB machines 2015-02-19 11:18:14 -08:00
eisa
extcon
firewire
firmware * Fix regression in DMI sysfs code for handling "End of Table" entry 2015-03-02 14:18:57 +01:00
fmc
gpio gpio: tps65912: fix wrong container_of arguments 2015-02-23 15:40:32 +01:00
gpu drm/i915: Fixup legacy plane->crtc link for initial fb config 2015-03-26 13:39:04 +02:00
hid HID: wacom: check for wacom->shared before following the pointer 2015-03-17 20:59:55 +01:00
hsi
hv
hwmon hwmon: (ads7828) Check return value of devm_regmap_init_i2c 2015-02-22 20:10:30 -08:00
hwspinlock
i2c Revert "i2c: core: Dispose OF IRQ mapping at client removal time" 2015-03-12 10:23:05 +01:00
ide ide_tape: convert jiffies with jiffies_to_msecs 2015-03-18 23:25:57 -04:00
idle
iio Second round of IIO fixes for the 4.0 cycle (or round one part two really!) 2015-02-28 07:19:27 -08:00
infiniband IB/mlx4: Saturate RoCE port PMA counters in case of overflow 2015-03-18 15:17:11 -04:00
input Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2015-03-19 16:43:10 -07:00
iommu iommu/io-pgtable-arm: Add built time dependency 2015-03-03 14:04:12 +01:00
ipack
irqchip Merge branch 'irqchip/urgent-gic' into irqchip/urgent 2015-03-15 01:41:26 +00:00
isdn isdn: icn: use strlcpy() when parsing setup options 2015-03-15 22:24:37 -04:00
leds
lguest OK, this has the big virtio 1.0 implementation, as specified by OASIS. 2015-02-18 09:24:01 -08:00
macintosh
mailbox
mcb
md dm: impose configurable deadline for dm_request_fn's merge heuristic 2015-04-15 12:10:15 -04:00
media
memory
memstick
message
mfd mfd: kempld-core: Fix callback return value check 2015-03-12 09:27:58 +00:00
misc mei: make device disabled on stop unconditionally 2015-03-01 19:34:50 -08:00
mmc mmc: pwrseq_simple: fix error path in mmc_pwrseq_simple_alloc 2015-03-19 11:26:35 +01:00
mtd This pull request fixes a bug introduced during the v4.0 merge window where we 2015-03-21 10:36:44 -07:00
net cx82310_eth: wait for firmware to become ready 2015-03-21 18:23:19 -04:00
nfc
ntb
nubus
of Revert "of: Fix premature bootconsole disable with 'stdout-path'" 2015-03-19 08:46:54 -05:00
oprofile
parisc
parport
pci PCI updates for v4.0: 2015-03-12 09:45:46 -07:00
pcmcia Revert "pcmcia: add a new resource manager for non ISA systems" 2015-03-11 14:21:23 +01:00
phy phy: omap-usb2: Fix missing clk_prepare call when using old dt name 2015-03-13 17:14:39 +05:30
pinctrl pinctrl: sun4i: GPIOs configured as irq must be set to input before reading 2015-03-18 10:56:46 +01:00
platform Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2015-02-21 11:12:07 -08:00
pnp Merge branches 'pnp', 'pm-cpuidle' and 'pm-cpufreq' 2015-02-21 04:29:16 +01:00
power
powercap powercap / RAPL: handle domains with different energy units 2015-03-13 23:18:44 +01:00
pps
ps3
ptp
pwm pwm: tegra: Use NSEC_PER_SEC 2015-02-18 08:40:29 +01:00
rapidio Merge branch 'for-linus' of git://git.infradead.org/users/vkoul/slave-dma 2015-02-18 08:49:20 -08:00
ras
regulator Merge remote-tracking branches 'regulator/fix/doc' and 'regulator/fix/palmas' into regulator-linus 2015-03-23 11:43:42 -07:00
remoteproc
reset
rpmsg virtio_rpmsg: set DRIVER_OK before using device 2015-03-13 15:55:42 +10:30
rtc drivers/rtc/rtc-mrst: fix suspend/resume 2015-03-25 16:20:30 -07:00
s390 s390/dcss: array index 'i' is used before limits check. 2015-02-26 09:24:48 +01:00
sbus
scsi Merge branch 'for-4.0-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata 2015-03-24 17:08:29 -07:00
sfi
sh drivers: sh: Disable PM runtime for multi-platform r8a7740 with genpd 2015-02-24 07:26:12 +09:00
sn
soc ARM: SoC driver updates 2015-02-17 09:38:59 -08:00
spi Merge remote-tracking branches 'spi/fix/dw', 'spi/fix/queue' and 'spi/fix/qup' into spi-linus 2015-03-24 10:38:44 -07:00
spmi
ssb
staging vt6655: Fix late setting of byRFType. 2015-03-09 11:33:13 +01:00
target target: do not reject FUA CDBs when write cache is enabled but emulate_write_cache is 0 2015-03-19 23:26:46 -07:00
tc
thermal thermal: Make sysfs attributes of cooling devices default attributes 2015-03-05 01:47:57 -04:00
thunderbolt
tty serial: 8250_dw: Fix deadlock in LCR workaround 2015-03-11 16:39:52 +01:00
uio
usb USB / PHY driver fixes for 4.0-rc5 2015-03-22 11:33:55 -07:00
uwb
vfio vfio-pci: Add missing break to enable VFIO_PCI_ERR_IRQ_INDEX 2015-03-12 09:51:38 -06:00
vhost Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2015-03-21 11:24:38 -07:00
video OMAPDSS: fix regression with display sysfs files 2015-02-26 10:23:15 +02:00
virt
virtio virtio_mmio: fix access width for mmio 2015-03-17 12:12:21 +10:30
vlynq
vme
w1
watchdog watchdog: imgpdc: Fix default heartbeat 2015-03-27 08:47:50 +01:00
xen Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending 2015-03-21 11:24:38 -07:00
zorro
Kconfig
Makefile