linux/drivers/block
Ilya Dryomov 2761713d35 rbd: fix copyup completion race
For write/discard obj_requests that involved a copyup method call, the
opcode of the first op is CEPH_OSD_OP_CALL and the ->callback is
rbd_img_obj_copyup_callback().  The latter frees copyup pages, sets
->xferred and delegates to rbd_img_obj_callback(), the "normal" image
object callback, for reporting to block layer and putting refs.

rbd_osd_req_callback() however treats CEPH_OSD_OP_CALL as a trivial op,
which means obj_request is marked done in rbd_osd_trivial_callback(),
*before* ->callback is invoked and rbd_img_obj_copyup_callback() has
a chance to run.  Marking obj_request done essentially means giving
rbd_img_obj_callback() a license to end it at any moment, so if another
obj_request from the same img_request is being completed concurrently,
rbd_img_obj_end_request() may very well be called on such a prematurely
marked-done request:

<obj_request-1/2 reply>
handle_reply()
  rbd_osd_req_callback()
    rbd_osd_trivial_callback()
    rbd_obj_request_complete()
    rbd_img_obj_copyup_callback()
    rbd_img_obj_callback()
                                    <obj_request-2/2 reply>
                                    handle_reply()
                                      rbd_osd_req_callback()
                                        rbd_osd_trivial_callback()
      for_each_obj_request(obj_request->img_request) {
        rbd_img_obj_end_request(obj_request-1/2)
        rbd_img_obj_end_request(obj_request-2/2) <--
      }

Calling rbd_img_obj_end_request() on such a request leads to trouble,
in particular because its ->xferred is 0.  We report 0 to the block
layer with blk_update_request(), get back 1 for "this request has more
data in flight" and then trip on

    rbd_assert(more ^ (which == img_request->obj_request_count));

with rhs (which == ...) being 1 because rbd_img_obj_end_request() has
been called for both requests and lhs (more) being 1 because we haven't
got a chance to set ->xferred in rbd_img_obj_copyup_callback() yet.
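
As a rough illustration of why the assertion trips, the check can be
modelled in isolation.  This is a minimal sketch, not the rbd code:
end_request_invariant_holds() is a hypothetical helper, and "more" and
"which" are plain parameters standing in for the values described above.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the condition checked by
 *     rbd_assert(more ^ (which == img_request->obj_request_count));
 *
 * more:  the block layer says the request still has data in flight
 * which: number of object requests that have been ended so far
 *
 * Exactly one of "more data in flight" and "all object requests
 * ended" is expected to be true; the function returns true when
 * that holds.
 */
static bool end_request_invariant_holds(bool more, unsigned int which,
                                        unsigned int obj_request_count)
{
    bool all_ended = (which == obj_request_count);

    return more ^ all_ended;
}
```

With two object requests, (more = 1, which = 1) and (more = 0, which = 2)
both satisfy the check.  The race produces (more = 1, which = 2): both
requests have been ended, but 0 bytes were reported for the second one,
so the block layer still expects more data.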

To fix this, leverage that rbd wants to call class methods in only two
cases: one is a generic method call wrapper (obj_request is standalone)
and the other is a copyup (obj_request is part of an img_request).  So
make a dedicated handler for CEPH_OSD_OP_CALL and directly invoke
rbd_img_obj_copyup_callback() from it if obj_request is part of an
img_request, similar to how CEPH_OSD_OP_READ handler invokes
rbd_img_obj_request_read_callback().
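
The dispatch described above can be sketched roughly as follows.  This
is a simplified model under stated assumptions, not the patch itself:
every sketch_* name is hypothetical, the structs carry only the fields
needed for the illustration, and the real copyup callback also frees
the copyup pages and handles errors.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced stand-ins for the rbd/ceph types. */
enum sketch_opcode { SKETCH_OP_READ, SKETCH_OP_CALL, SKETCH_OP_OTHER };

struct sketch_img_request {
    unsigned int obj_request_count;
};

struct sketch_obj_request {
    struct sketch_img_request *img_request; /* NULL if standalone */
    unsigned long length;
    unsigned long xferred;
    bool done;
};

static void sketch_obj_request_done_set(struct sketch_obj_request *req)
{
    req->done = true;
}

/* Trivial ops have nothing to record; just mark the request done. */
static void sketch_trivial_callback(struct sketch_obj_request *req)
{
    sketch_obj_request_done_set(req);
}

/* Copyup: record the transfer size first, and only then mark done. */
static void sketch_copyup_callback(struct sketch_obj_request *req)
{
    req->xferred = req->length;
    sketch_obj_request_done_set(req);
}

/* Dedicated handler for the class method call opcode. */
static void sketch_call_callback(struct sketch_obj_request *req)
{
    if (req->img_request)
        sketch_copyup_callback(req);    /* part of an img_request */
    else
        sketch_trivial_callback(req);   /* generic method call wrapper */
}

static void sketch_osd_req_callback(struct sketch_obj_request *req,
                                    enum sketch_opcode opcode)
{
    switch (opcode) {
    case SKETCH_OP_CALL:
        sketch_call_callback(req);
        break;
    default:
        sketch_trivial_callback(req);
        break;
    }
}
```

The point of the ordering in sketch_copyup_callback() is that a request
belonging to an img_request is never visible as done while its xferred
is still 0, which is the window the race above exploits.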

Since rbd_img_obj_copyup_callback() is now being called from the OSD
request callback (only), it is renamed to rbd_osd_copyup_callback().

Cc: Alex Elder <elder@linaro.org>
Cc: stable@vger.kernel.org # 3.10+, needs backporting for < 3.18
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Alex Elder <elder@linaro.org>
2015-07-31 11:38:57 +03:00
aoe
drbd Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
mtip32xx mtip32xx: Fix accessing freed memory 2015-06-24 08:48:46 -06:00
paride Char/Misc driver patches for 4.2-rc1 2015-06-26 14:51:15 -07:00
rsxx block/rsxx: use generic io stats accounting functions to simplify io stat accounting 2014-11-24 08:05:18 -07:00
xen-blkback xen: features and cleanups for 4.2-rc0 2015-07-01 11:53:46 -07:00
zram zram: check comp algorithm availability earlier 2015-06-25 17:00:37 -07:00
amiflop.c
ataflop.c
brd.c brd: rename XIP to DAX 2015-02-16 17:56:04 -08:00
cciss_cmd.h
cciss_scsi.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
cciss_scsi.h
cciss.c cciss: correct the non-resettable board list 2015-05-31 11:14:34 -07:00
cciss.h
cpqarray.c genirq: Remove the deprecated 'IRQF_DISABLED' request_irq() flag entirely 2015-03-05 20:53:06 +01:00
cpqarray.h
cryptoloop.c
DAC960.c
DAC960.h
floppy.c floppy: Avoid manual call of device_create_file() 2015-02-03 13:00:36 +01:00
hd.c
ida_cmd.h
ida_ioctl.h
Kconfig libnvdimm, pmem: move pmem to drivers/nvdimm/ 2015-06-24 21:24:10 -04:00
loop.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-07-04 19:36:06 -07:00
loop.h block: loop: don't hold lo_ctl_mutex in lo_open 2015-05-20 09:06:09 -06:00
Makefile libnvdimm, pmem: move pmem to drivers/nvdimm/ 2015-06-24 21:24:10 -04:00
mg_disk.c
nbd.c block: nbd: convert to blkdev_reread_part() 2015-05-20 09:06:13 -06:00
null_blk.c null_blk: fix use-after-free problem 2015-07-22 13:30:20 -06:00
nvme-core.c NVMe: Reread partitions on metadata formats 2015-07-15 15:36:47 -06:00
nvme-scsi.c Merge branch 'for-4.2/drivers' of git://git.kernel.dk/linux-block 2015-06-25 15:12:50 -07:00
osdblk.c block: support different tag allocation policy 2015-01-23 14:15:46 -07:00
pktcdvd.c writeback: separate out include/linux/backing-dev-defs.h 2015-06-02 08:33:34 -06:00
ps3disk.c
ps3vram.c block/ps3vram: Remove obsolete reference to MTD 2015-06-10 14:06:55 -06:00
rbd_types.h
rbd.c rbd: fix copyup completion race 2015-07-31 11:38:57 +03:00
skd_main.c
skd_s1120.h
smart1,2.h
sunvdc.c sunvdc: reconnect ldc after vds service domain restarts 2014-12-11 18:52:45 -08:00
swim3.c powerpc: Move Power Macintosh drivers to generic byteswappers 2015-03-23 14:29:40 +11:00
swim_asm.S
swim.c
sx8.c block: rename REQ_TYPE_SPECIAL to REQ_TYPE_DRV_PRIV 2015-05-05 13:40:03 -06:00
umem.c
umem.h
virtio_blk.c block: rename REQ_TYPE_SPECIAL to REQ_TYPE_DRV_PRIV 2015-05-05 13:40:03 -06:00
xen-blkfront.c xen: features and cleanups for 4.2-rc0 2015-07-01 11:53:46 -07:00
xsysace.c
z2ram.c