Commit Graph

3048 Commits

Author SHA1 Message Date
Kevin Wolf
cfa1a5723f block: Drop permissions when migration completes
With image locking, permissions affect other qemu processes as well. We
want to be sure that the destination can run, so let's drop permissions
on the source when migration completes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Kevin Wolf
4417ab7adf block: New BdrvChildRole.activate() for blk_resume_after_migration()
Instead of manually calling blk_resume_after_migration() in migration
code after doing bdrv_invalidate_cache_all(), integrate the BlockBackend
activation with cache invalidation into a single function. This is
achieved with a new callback in BdrvChildRole that is called by
bdrv_invalidate_cache_all().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz
293073a56c qcow2: Discard preallocated zero clusters
In discard_single_l2(), we completely discard normal clusters instead of
simply turning them into preallocated zero clusters. That means we
should probably do the same with such preallocated zero clusters:
Discard them instead of keeping them allocated.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz
564a6b6938 qcow2: Reuse preallocated zero clusters
Instead of just freeing preallocated zero clusters and completely
allocating them from scratch, reuse them.

We cannot do this in handle_copied(), however, since this is a COW
operation. Therefore, we have to add the new logic to handle_alloc() and
simply return the existing offset if it exists. The only catch is that
we have to convince qcow2_alloc_cluster_link_l2() not to free the old
clusters (because we have reused them).

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Max Reitz
92413c16be qcow2: Fix preallocation size formula
When calculating the number of reftable entries, we should actually use
the number of refblocks and not (wrongly[1]) re-calculate it.

[1] "Wrongly" means: Dividing the number of clusters by the number of
    entries per refblock and rounding down instead of up.

Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 12:08:24 +02:00
Fam Zheng
244a566810 file-posix: Add image locking to perm operations
This extends the permission bits of op blocker API to external using
Linux OFD locks.

Each permission in @perm and @shared_perm is represented by a locked
byte in the image file.  Requesting a permission in @perm is translated
to a shared lock of the corresponding byte; rejecting to share the same
permission is translated to a shared lock of a separate byte. With that,
we use 2x number of bytes of distinct permission types.

virtlockd in libvirt locks the first byte, so we do locking from a
higher offset.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:15:32 +02:00
Fam Zheng
1c3a555c35 file-win32: Error out if locking=on
We share the same set of QAPI options with file-posix, but locking is
not supported here. So error out if it is specified as 'on' for now.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:41 +02:00
Fam Zheng
16b48d5d66 file-posix: Add 'locking' option
Making this option available even before implementing it will let
converting tests easier: in coming patches they can specify the option
already when necessary, before we actually write code to lock the
images.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-05-11 11:08:40 +02:00
Markus Armbruster
bd269ebc82 sockets: Limit SocketAddressLegacy to external interfaces
SocketAddressLegacy is a simple union, and simple unions are awkward:
they have their variant members wrapped in a "data" object on the
wire, and require additional indirections in C.  SocketAddress is the
equivalent flat union.  Convert all users of SocketAddressLegacy to
SocketAddress, except for existing external interfaces.

See also commit fce5d53..9445673 and 85a82e8..c5f1ae3.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Minor editing accident fixed, commit message and a comment tweaked]

Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
62cf396b5d sockets: Rename SocketAddressFlat to SocketAddress
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
dfd100f242 sockets: Rename SocketAddress to SocketAddressLegacy
The next commit will rename SocketAddressFlat to SocketAddress, and
the commit after that will replace most uses of SocketAddressLegacy by
SocketAddress, replacing most of this commit's renames right back.

Note that checkpatch emits a few "line over 80 characters" warnings.
The long lines are all temporary; the SocketAddressLegacy replacement
will shorten them again.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:14:40 +02:00
Markus Armbruster
0785bd7a7c sockets: Prepare inet_parse() for flattened SocketAddress
I'm going to flatten SocketAddress: rename SocketAddress to
SocketAddressLegacy, SocketAddressFlat to SocketAddress, eliminate
SocketAddressLegacy except in external interfaces.

inet_parse() returns a newly allocated InetSocketAddress.  Lift the
allocation from inet_parse() into its caller socket_parse() to prepare
for flattening SocketAddress.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1493192202-3184-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Straightforward rebase]
2017-05-09 09:14:40 +02:00
Eric Blake
46f5ac205a qobject: Use simpler QDict/QList scalar insertion macros
We now have macros in place to make it less verbose to add a scalar
to QDict and QList, so use them.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then touched up manually to fix a couple of '?:' back to original
spacing, as well as avoiding a long line in monitor.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-7-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-09 09:13:51 +02:00
Eric Blake
de6e7951fe qobject: Drop useless QObject casts
We have macros in place to make it less verbose to add a subtype
of QObject to both QDict and QList. While we have made cleanups
like this in the past (see commit fcfcd8ffc, for example), having
it be automated by Coccinelle makes it easier to maintain.

Patch created mechanically via:
  spatch --sp-file scripts/coccinelle/qobject.cocci \
    --macro-file scripts/cocci-macro-file.h --dir . --in-place
then I verified that no manual touchups were required.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170427215821.19397-5-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-05-08 20:32:14 +02:00
Eric Blake
048c5fd1bf qcow2: Allow discard of final unaligned cluster
As mentioned in commit 0c1bd46, we ignored requests to
discard the trailing cluster of an unaligned image.  While
discard is an advisory operation from the guest standpoint,
(and we are therefore free to ignore any request), our
qcow2 implementation exploits the fact that a discarded
cluster reads back as 0.  As long as we discard on cluster
boundaries, we are fine; but that means we could observe
non-zero data leaked at the tail of an unaligned image.

Enhance iotest 66 to cover this case, and fix the implementation
to honor a discard request on the final partial cluster.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170407013709.18440-1-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz
f59adb3256 block: Add .bdrv_truncate() error messages
Add missing error messages for the block driver implementations of
.bdrv_truncate(); drop the generic one from block.c's bdrv_truncate().

Since one of these changes touches a mis-indented block in
block/file-posix.c, this patch fixes that coding style issue along the
way.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-5-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz
4bff28b81a block: Add errp to BD.bdrv_truncate()
Add an Error parameter to the block drivers' bdrv_truncate() interface.
If a block driver does not set this in case of an error, the generic
bdrv_truncate() implementation will do so.

Where it is obvious, this patch also makes some block drivers set this
value.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-4-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:03 +02:00
Max Reitz
ed3d2ec98a block: Add errp to b{lk,drv}_truncate()
For one thing, this allows us to drop the error message generation from
qemu-img.c and blockdev.c and instead have it unified in
bdrv_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170328205129.15138-3-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:02 +02:00
Max Reitz
55b9392b98 block/vhdx: Make vhdx_create() always set errp
This patch makes vhdx_create() always set errp in case of an error. It
also adds errp parameters to vhdx_create_bat() and
vhdx_create_new_region_table() so we can pass on the error object
generated by blk_truncate() as of a future commit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170328205129.15138-2-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-28 16:02:02 +02:00
Denis V. Lunev
f13ce1be35 block: fix alignment calculations in bdrv_co_do_zero_pwritev
tail_padding_bytes is calculated wrong. F.e. for
    offset = 0
    bytes = 2048
    align = 512
we will have tail_padding_bytes = 512 which is definitely wrong. The patch
fixes that arithmetics.

Fortunately this problem is harmless, we will have 1 extra allocation and
free thus there is no need to put this into stable. The problem is here
from the very beginning.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 16:24:01 +02:00
Max Reitz
de234897b6 block: Do not unref bs->file on error in BD's open
The block layer takes care of removing the bs->file child if the block
driver's bdrv_open()/bdrv_file_open() implementation fails. The block
driver therefore does not need to do so, and indeed should not unless it
sets bs->file to NULL afterwards -- because if this is not done, the
bdrv_unref_child() in bdrv_open_inherit() will dereference the freed
memory block at bs->file afterwards, which is not good.

We can now decide whether to add a "bs->file = NULL;" after each of the
offending bdrv_unref_child() invocations, or just drop them altogether.
The latter is simpler, so let's do that.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 16:12:13 +02:00
Fam Zheng
e914404efb block: Remove NULL check in bdrv_co_flush
Reported by Coverity. We already use bs in bdrv_inc_in_flight before
checking for NULL. It is unnecessary as all callers pass non-NULL bs, so
drop it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:50 +02:00
Max Reitz
362b3786eb Revert "block/io: Comment out permission assertions"
This reverts commit e3e0003a8f.

This commit was necessary for the 2.9 release because we were unable to
fix the underlying issue(s) in time. However, we will be for 2.10.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf
0d5e0bb2f7 file-win32: Remove unnecessary include
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Kevin Wolf
ad02b7af0c file-posix: Remove unnecessary includes
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Krzysztof Kozlowski
0731a50feb block: Constify data passed by pointer to blk_name
blk_name() is not modifying data passed to it through pointer and it
returns also a pointer to const so the argument can be made const for
code safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-27 15:39:49 +02:00
Jeff Cody
56e7cf8df0 block/rbd: Add support for reopen()
This adds support for reopen in rbd, for changing between r/w and r/o.

Note, that this is only a flag change, but we will block a change from
r/o to r/w if we are using an RBD internal snapshot.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: d4e87539167ec6527d44c97b164eabcccf96e4f3.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
80b61a27c6 block/rbd - update variable names to more apt names
Update 'clientname' to be 'user', which tracks better with both
the QAPI and rados variable naming.

Update 'name' to be 'image_name', as it indicates the rbd image.
Naming it 'image' would have been ideal, but we are using that for
the rados_image_t value returned by rbd_open().

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: b7ec1fb2e1cf36f9b6911631447a5b0422590b7d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
e2b8247a32 block: do not set BDS read_only if copy_on_read enabled
A few block drivers will set the BDS read_only flag from their
.bdrv_open() function.  This means the bs->read_only flag could
be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ
flag check occurs prior to the call to bdrv->bdrv_open().

This adds an error return to bdrv_set_read_only(), and an error will be
return if we try to set the BDS to read_only while copy_on_read is
enabled.

This patch also changes the behavior of vvfat.  Before, vvfat could
override the drive 'readonly' flag with its own, internal 'rw' flag.

For instance, this -drive parameter would result in a writable image:

"-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on"

This is not correct.  Now, attempting to use the above -drive parameter
will result in an error (i.e., 'rw' is incompatible with 'readonly=on').

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
fe5241bfe3 block: add bdrv_set_read_only() helper function
We have a helper wrapper for checking for the BDS read_only flag,
add a helper wrapper to set the read_only flag as well.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 9b18972d05f5fa2ac16c014f0af98d680553048d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Ashish Mittal
da92c3ff60 block/vxhs.c: Add support for a new block device type called "vxhs"
Source code for the qnio library that this code loads can be downloaded from:
https://github.com/VeritasHyperScale/libqnio.git

Sample command line using JSON syntax:
./x86_64-softmmu/qemu-system-x86_64 -name instance-00000008 -S -vnc 0.0.0.0:0
-k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
-msg timestamp=on
'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410",
"server":{"host":"172.172.17.4","port":"9999"}}'

Sample command line using URI syntax:
qemu-img convert -f raw -O raw -n
/var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad
vxhs://192.168.0.1:9999/c6718f6b-0401-441d-a8c3-1f0064d75ee0

Sample command line using TLS credentials (run in secure mode):
./qemu-io --object
tls-creds-x509,id=tls0,dir=/etc/pki/qemu/vxhs,endpoint=client -c 'read
-v 66000 2.5k' 'json:{"server.host": "127.0.0.1", "server.port": "9999",
"vdisk-id": "/test.raw", "driver": "vxhs", "tls-creds":"tls0"}'

[Jeff: Modified trace-events with the correct string formatting]

Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 1491277689-24949-2-git-send-email-Ashish.Mittal@veritas.com
2017-04-24 15:08:42 -04:00
Fam Zheng
cb8d4bf677 nfs: Make errp the last parameter of nfs_client_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-10-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
78bbd910bb block: Make errp the last parameter of commit_active_start
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-9-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
51ccfa2dbf mirror: Make errp the last parameter of mirror_start_job
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-8-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
375092332e crypto: Make errp the last parameter of functions
Move opaque to 2nd instead of the 2nd to last, so that compilers help
check with the conversion.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-7-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message typo corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:22 +02:00
Fam Zheng
6dffc1f670 socket: Make errp the last parameter of inet_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-3-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
226799cec5 socket: Make errp the last parameter of socket_connect
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-2-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
178bd438af block: Walk bs->children carefully in bdrv_drain_recurse
The recursive bdrv_drain_recurse may run a block job completion BH that
drops nodes. The coming changes will make that more likely and use-after-free
would happen without this patch

Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to
QLIST_FOREACH_SAFE to prevent such a case from happening.

Since bdrv_unref accesses global state that is not protected by the AioContext
lock, we cannot use bdrv_ref/bdrv_unref unconditionally.  Fortunately the
protection is not needed in IOThread because only main loop can modify a graph
with the AioContext lock held.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-2-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-18 22:56:28 +08:00
Max Reitz
e3e0003a8f block/io: Comment out permission assertions
In case of block migration, there may be writes to BlockBackends that do
not have the write permission taken. Before this issue is fixed (which
is not going to happen in 2.9), we therefore cannot assert that this is
the case.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170411145050.31290-1-mreitz@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:09:31 +01:00
Kevin Wolf
5eceb01adf sheepdog: Fix crash in co_read_response()
This fixes a regression introduced in commit 9d456654.

aio_co_wake() can only be used to reenter a coroutine that was already
previously entered, otherwise co->ctx is uninitialised and we access
garbage. Using it immediately after qemu_coroutine_create() like in
co_read_response() is wrong and causes segfaults.

Replace the call with aio_co_enter(), which gets an explicit AioContext
parameter and works even for new coroutines.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1491919733-21065-1-git-send-email-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:08:29 +01:00
Fam Zheng
2ec9a782d1 iscsi: Fix iscsi_create
Since d5895fcb (iscsi: Split URL into individual options), creating
qcow2 image on an iscsi LUN fails:

    qemu-img create -f qcow2 iscsi://$SERVER/$IQN/0 1G
    qemu-img: iscsi://$SERVER/$IQN/0: Could not create image: Invalid
        argument

The problem is iscsi_open now expects that transport_name, portal and
target are already parsed into structured options by
iscsi_parse_filename, but it is not called in iscsi_create.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170410075451.21329-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[mreitz: Dropped now superfluous
         qdict_put(bs_options, "filename", ...)]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Eric Blake
1606e4cf8a throttle: Remove block from group on hot-unplug
When a block device that is part of a throttle group is hot-unplugged,
we forgot to remove it from the throttle group. This leaves stale
memory around, and causes an easily reproducible crash:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-device virtio-scsi-pci,bus=pci.0 -drive \
id=drive_image2,if=none,format=raw,file=file2,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image2,drive=drive_image2 -drive \
id=drive_image3,if=none,format=raw,file=file3,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image3,drive=drive_image3
{'execute':'qmp_capabilities'}
{'execute':'device_del','arguments':{'id':'image3'}}
{'execute':'system_reset'}

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1428810

Suggested-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170406190847.29347-1-eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Dong Jia Shi
7a9e51198c block: pass the right options for BlockDriver.bdrv_open()
raw_open() expects the caller always passing in the right actual
@options parameter. But when trying to applying snapshot on a RBD
image, bdrv_snapshot_goto() calls raw_open() (by calling the
bdrv_open callback on the BlockDriver) with a NULL @options, and
that will result in a Segmentation fault.

For the other non-raw format drivers, it also makes sense to passing
in the actual options, althought they don't trigger the problem so
far.

Let's prepare a @options by adding the "file" key-value pair to a
copy of the actual options that were given for the node (i.e.
bs->options), and pass it to the callback.

BlockDriver.bdrv_open() expects bs->file to be NULL and just
overwrites it with the result from bdrv_open_child(). That means we
should actually make sure it's NULL because otherwise the child BDS
will have a reference count that is 1 too high. So we unconditionally
invoke bdrv_unref_child() before calling BlockDriver.bdrv_open(), and
we wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
deleted in the meantime.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-id: 20170405091909.36357-2-bjsdjshi@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Fam Zheng
76296dff97 sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE
When called from main thread, the coroutine should run in the context of
bs. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
49ca625913 block: Fix bdrv_co_flush early return
bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
BDRV_POLL_WHILE to work, even for the shortcut case where flush is
unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix
the variable declaration position.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
e92f0e1910 block: Use bdrv_coroutine_enter to start I/O coroutines
BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
the main context, which relies on the yielded coroutine continuing on bs->ctx
before notifying qemu_aio_context with bdrv_wakeup().

Thus, using qemu_coroutine_enter to start I/O is wrong because if the coroutine
is entered from main loop, co->ctx will be qemu_aio_context, as a result of the
"release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when
both main thread and the iothread access the same BDS:

  main loop                                iothread
-----------------------------------------------------------------------
  blockdev_snapshot
    aio_context_acquire(bs->ctx)
                                           virtio_scsi_data_plane_handle_cmd
    bdrv_drained_begin(bs->ctx)
    bdrv_flush(bs)
      bdrv_co_flush(bs)                      aio_context_acquire(bs->ctx).enter
        ...
        qemu_coroutine_yield(co)
      BDRV_POLL_WHILE()
        aio_context_release(bs->ctx)
                                             aio_context_acquire(bs->ctx).return
                                               ...
                                                 aio_co_wake(co)
        aio_poll(qemu_aio_context)               ...
          co_schedule_bh_cb()                    ...
            qemu_coroutine_enter(co)             ...

              /* (A) bdrv_co_flush(bs)           /* (B) I/O on bs */
                      continues... */
                                             aio_context_release(bs->ctx)
        aio_context_acquire(bs->ctx)

Note that in above case, bdrv_drained_begin() doesn't do the "release,
poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.

Fix this by using bdrv_coroutine_enter and enter coroutine in the right
context.

iotests 109 output is updated because the coroutine reenter flow during
mirror job complete is different (now through co_queue_wakeup, instead
of the unconditional qemu_coroutine_switch before), making the end job
len different.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
14e9559f46 block: Make bdrv_parent_drained_begin/end public
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
19dd29e8a7 mirror: Fix aio context of mirror_top_bs
It should be moved to the same context as source, before inserting to the
graph.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Kevin Wolf
1bf03e66fd block: Don't check permissions for copy on read
The assertion is currently failing. We can't require callers to have
write permissions when all they are doing is a read, so comment it out.
Add a FIXME comment in the code so that the check is re-enabled when
copy on read is refactored into its own filter driver.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2017-04-07 14:44:06 +02:00
Max Reitz
7a25fcd056 block/mirror: Fix use-after-free
If @bs does not have any parents, the only reference to @mirror_top_bs
will be held by the BlockJob object after the bdrv_unref() following
block_job_create(). However, if block_job_create() fails, this reference
will not exist and @mirror_top_bs will have been deleted when we
goto fail.

The issue comes back at all later entries to the fail label: We delete
the BlockJob object before rolling back our changes to the node graph.
This means that we will delete @mirror_top_bs in the process.

All in all, whenever @bs does not have any parents and we go down the
fail path we will dereference @mirror_top_bs after it has been deleted.

Fix this by invoking bdrv_unref() only when block_job_create() was
successful and by bdrv_ref()'ing @mirror_top_bs in the fail path before
deleting the BlockJob object. Finally, bdrv_unref() it at the end of the
fail path after we actually no longer need it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:05 +02:00