The Xen blkif protocol requires that sector based quantities should be
interpreted strictly as multiples of 512 bytes. Specifically:
"first_sect and last_sect in blkif_request_segment, as well as
sector_number in blkif_request, are always expressed in 512-byte units."
Commit fcab2b464e "xen: add header and build dataplane/xen-block.c"
incorrectly modified behaviour to use the block device logical_block_size
property as the scale, instead of correctly shifting values by the
hardcoded BDRV_SECTOR_BITS (and hence scaling them to 512 byte units).
This patch undoes that change and restores compliance with the spec.
Furthermore, this patch also restores the original xen_disk behaviour
of advertizing a hardcoded 'sector-size' value of 512 in xenstore and
scaling 'sectors' accordingly. The realize() method is also modified to
fail if logical_block_size is set to anything other than 512.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190401121719.27208-1-paul.durrant@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
...and properly enable it when synthesizing a drive.
The Xen toolstack sets 'discard-enable' to '1' in xenstore when it wants
to enable discard on a specified image. The code in
xen_block_drive_create() correctly parses this and uses it to set
'discard' to 'unmap' for the file_layer, but fails to do the same for the
driver_layer (which effectively disables it). Meanwhile the code in
xen_block_realize() advertizes discard support to the frontend in the
default case (because conf->discard_granularity defaults to -1), even when
the underlying image may not handle it.
This patch adds the missing option to the driver_layer in
xen_block_driver_create() and checks whether BDRV_O_UNMAP is actually
set on the block device before advertizing discard to the frontend.
In the case that discard is supported it also makes sure that the
granularity is set to the physical block size.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20190320142825.24565-1-paul.durrant@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Compiling with GCC 9 complains
hw/s390x/3270-ccw.c: In function ‘emulated_ccw_3270_cb’:
hw/s390x/3270-ccw.c:81:19: error: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer value [-Werror=address-of-packed-member]
81 | SCSW *s = &sch->curr_status.scsw;
| ^~~~~~~~~~~~~~~~~~~~~~
This local variable is only present to save a little bit of
typing when setting the field later. Get rid of this to avoid
the warning about unaligned accesses.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190329111104.17223-15-berrange@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Compiling with GCC 9 complains
hw/s390x/ipl.c: In function ‘s390_ipl_set_boot_menu’:
hw/s390x/ipl.c:256:25: warning: taking address of packed member of ‘struct QemuIplParameters’ may result in an unaligned pointer value [-Waddress-of-packed-member]
256 | uint32_t *timeout = &ipl->qipl.boot_menu_timeout;
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
This local variable is only present to save a little bit of
typing when setting the field later. Get rid of this to avoid
the warning about unaligned accesses.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190329111104.17223-14-berrange@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The GCC 9 compiler complains about many places in s390 code
that take the address of members of the 'struct SCHIB' which
is marked packed:
hw/s390x/css.c: In function ‘sch_handle_clear_func’:
hw/s390x/css.c:698:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer val\
ue [-Waddress-of-packed-member]
698 | PMCW *p = &sch->curr_status.pmcw;
| ^~~~~~~~~~~~~~~~~~~~~~
hw/s390x/css.c:699:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer val\
ue [-Waddress-of-packed-member]
699 | SCSW *s = &sch->curr_status.scsw;
| ^~~~~~~~~~~~~~~~~~~~~~
...snip many more...
Almost all of these are just done for convenience to avoid
typing out long variable/field names when referencing struct
members. We can get most of this convenience by taking the
address of the 'struct SCHIB' instead, avoiding triggering
the compiler warnings.
In a couple of places we copy via a local variable which is
a technique already applied elsewhere in s390 code for this
problem.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190329111104.17223-13-berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The GCC 9 compiler complains about many places in s390 code
that take the address of members of the 'struct SCHIB' which
is marked packed:
hw/vfio/ccw.c: In function ‘vfio_ccw_io_notifier_handler’:
hw/vfio/ccw.c:133:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer value \
[-Waddress-of-packed-member]
133 | SCSW *s = &sch->curr_status.scsw;
| ^~~~~~~~~~~~~~~~~~~~~~
hw/vfio/ccw.c:134:15: warning: taking address of packed member of ‘struct SCHIB’ may result in an unaligned pointer value \
[-Waddress-of-packed-member]
134 | PMCW *p = &sch->curr_status.pmcw;
| ^~~~~~~~~~~~~~~~~~~~~~
...snip many more...
Almost all of these are just done for convenience to avoid
typing out long variable/field names when referencing struct
members. We can get most of this convenience by taking the
address of the 'struct SCHIB' instead, avoiding triggering
the compiler warnings.
In a couple of places we copy via a local variable which is
a technique already applied elsewhere in s390 code for this
problem.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190329111104.17223-12-berrange@redhat.com>
Reviewed-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
VTD_RTADDR_RTT is dropped even by the VT-d spec, so QEMU should
probably do the same thing (after all we never really implemented it).
Since we've had a field for that in the migration stream, to keep
compatibility we need to fill the hole up.
Please refer to VT-d spec 10.4.6.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190329061422.7926-3-peterx@redhat.com>
Reviewed-by: Liu, Yi L <yi.l.liu@intel.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When introducing the initial support for scalable mode we added a
new field into vmstate however we blindly migrate that field without
notice. That'll break migration no matter forward or backward.
The normal way should be that we use something like
VMSTATE_UINT32_TEST() or subsections for the new vmstate field however
for this case of vt-d we can even make it simpler because we've
already migrated all the registers and it'll be fairly simple that we
re-generate root_scalable field from the register values during post
load of the device.
Fixes: fb43cf739e ("intel_iommu: scalable mode emulation")
Reviewed-by: Yi Sun <yi.y.sun@linux.intel.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20190329061422.7926-2-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Message-Id: <20190321161832.10533-1-yuval.shaia@oracle.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If we try to use the intel-iommu device with vfio-pci devices without
caching mode enabled, we're told:
qemu-system-x86_64: We need to set caching-mode=1 for intel-iommu to enable
device assignment with IOMMU protection.
But to enable caching mode, the option is actually "caching-mode=on".
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <155364147432.16467.15898335025013220939.stgit@gimli.home>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Williamson <<a href="mailto:alex.williamson@redhat.com" target="_blank" rel="noreferrer">alex.williamson@redhat.com</a>><br>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The callers to bios_linker_find_file() assert that the file entry returned
is not NULL, except for those in bios_linker_loader_add_pointer(). Add two
asserts in that case for completeness and to facilitate static code analysis.
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Message-Id: <1553199229-25318-1-git-send-email-liam.merwick@oracle.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Ensure watch IDs unique within a monitor and avoid integer wraparound
issues when many watches are set & unset over time.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJco1tUAAoJEL6G67QVEE/f9xYP/2OTsAk6dHCcWqc///jXN0JW
OLcnpmjuh5KgUmOrs7NIRYRo2DARrhNDDXckHjDuX2HmnzoTB5p7DXLWMPgAzxj6
o83yxYT0Qqw79wxqbDmJ1y3kUVka1r94wCmmvUmxQZjLbDk8u2ytNUDbEOVlM0VJ
R7T800j8PhbY1pg9PiJJZonm7dxZAVNSCF4AgkrJVIes6MqhvNUhVsck5q/CC3yi
0jkglsiVTyXjKpJsioM9VhQXeqiHfFl3qjcPi0XQ99lVpTMFysNd4UdnVXhNs9NI
Xe52yQ7NG3fEggUivtFBmvVplNOg09K6U2vkyk9z8S1WTLZsbabwvwSvkVPW3Ty0
QaDgI6wNSiBb7Lfl2bbUR6X3j9QXUn8bhlczBsorS4uU4+NdHGyERsZLKLZazYYz
5t+xm4YBGJhtlt8IglNBjtaJFBuqgjrBEuBkfpevB2ocioWnU6APOdwWkKlSe3+r
EuKFUcbhRiRAGO05DjntNkePigb66IVkSiJXd4uSid/+akn5kQ19PSD8tXdfvZ7h
8HYTnWZa3OSLd7tZlwRzimnKuVSIsSnj+NobkIs8HSS3N7oN1t4FC5jH9D6GmK0R
djQs6a5mHQbonrc0dXxlb+P8u6I+ppxznPpfL/aQJ4pFM18V6gmcZEQluhwALNe9
rdPUnCudC/9SUNpMf3rI
=qriK
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/berrange/tags/filemon-next-pull-request' into staging
filemon: various fixes / improvements to file monitor for USB MTP
Ensure watch IDs unique within a monitor and avoid integer wraparound
issues when many watches are set & unset over time.
# gpg: Signature made Tue 02 Apr 2019 13:53:40 BST
# gpg: using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>" [full]
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>" [full]
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/filemon-next-pull-request:
filemon: fix watch IDs to avoid potential wraparound issues
filemon: ensure watch IDs are unique to QFileMonitor scope
tests: refactor file monitor test to make it more understandable
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- file-posix: Ignore unlock failure instead of crashing
- gluster: Limit the transfer size to 512 MiB
- stream: Fix backing chain freezing
- qemu-img: Enable BDRV_REQ_MAY_UNMAP for zero writes in convert
- iotests fixes
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJco1nvAAoJEH8JsnLIjy/WKSAP/AyAGt4DFbsfVq1SbX2MB/3F
QtLi3q3Yjy71OaNyMJsqgd8dxoqO1bWKC7MmLxwFZwvwmXhgq7EVbJ6vJ7kzuHw5
FMsVavm61mL/vsKypFieYHbahiKC31R7sw4Vl7yeKG02oDgi/0SnIQxxO0rmIZxG
Jx+p5tBDmrJ2WpgCWI0yr3VuYeJ/gJyw8la3Ubf1OzTBuTyfg94vSCsECX7GWm1w
0YkcRODct96mmqKCf6haHXQs9tTa6FnC9fsReenYcQg7nyqTttyXJz08fuqGQwWm
sNcmMp2/YxAg2Z85Aq21fmTkXbGopEHtoC2fg71WET1DzpqMvvJPFBIOneBSzUon
KHrZfZnziAzUESHDqenCQcHlbtWXxPY+7H0lojvhsLVzyEns6eLOSer093HqAh4s
+av55+ucTUYRl9Uk8/vFPdFBh0bNJRJB2yazp0Fk5Yoo8y70TXePQ8ZkpjH9VSpb
uQXy+FBc7LXYHB1KW/vhtuVoKgftbHi29rQ3imb9y6HYuJoJYZS9uGgrFpzF0YBN
/mBG17YqE8Imfuavt0WQqNLSub/km9eqFrejnMqDBka1hZblecOB+3Ujcud5C7Vk
HtKDdr2/Z+M1KApFMKhNZzEksXwhOVQ6VVZZfRGCjc/EXMOZ2gg8YlqOtotUzA0i
vV2wMgpqE2Ihx1gIIqw/
=a4LV
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- file-posix: Ignore unlock failure instead of crashing
- gluster: Limit the transfer size to 512 MiB
- stream: Fix backing chain freezing
- qemu-img: Enable BDRV_REQ_MAY_UNMAP for zero writes in convert
- iotests fixes
# gpg: Signature made Tue 02 Apr 2019 13:47:43 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
tests/qemu-iotests/235: Allow fallback to tcg
block: test block-stream with a base node that is used by block-commit
block: freeze the backing chain earlier in stream_start()
block: continue until base is found in bdrv_freeze_backing_chain() et al
block/file-posix: do not fail on unlock bytes
tests/qemu-iotests: Remove redundant COPYING file
block/gluster: limit the transfer size to 512 MiB
qemu-img: Enable BDRV_REQ_MAY_UNMAP in convert
iotests: Fix test 200 on s390x without virtio-pci
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Watch IDs are allocated from incrementing a int counter against
the QFileMonitor object. In very long life QEMU processes with
a huge amount of USB MTP activity creating & deleting directories
it is just about conceivable that the int counter can wrap
around. This would result in incorrect behaviour of the file
monitor watch APIs due to clashing watch IDs.
Instead of trying to detect this situation, this patch changes
the way watch IDs are allocated. It is turned into an int64_t
variable where the high 32 bits are set from the underlying
inotify "int" ID. This gives an ID that is guaranteed unique
for the directory as a whole, and we can rely on the kernel
to enforce this. QFileMonitor then sets the low 32 bits from
a per-directory counter.
The USB MTP device only sets watches on the directory as a
whole, not files within, so there is no risk of guest
triggered wrap around on the low 32 bits.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The watch IDs are mistakenly only unique within the scope of the
directory being monitored. This is not useful for clients which are
monitoring multiple directories. They require watch IDs to be unique
globally within the QFileMonitor scope.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Bandan Das <bsd@redhat.com>
Reviewed-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
The current file monitor unit tests are too clever for their own good
making it hard to understand the desired output.
Instead of trying to infer the expected events, explicitly list the
events we expect in the operation sequence.
Instead of dynamically building a matrix of tests, just have one giant
operation sequence that validates all scenarios in a single test.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
When the user specifies a list of accelerators, we pick the first one
that initializes successfully. Recent commit 1a3ec8c156 broke that.
Reproducer:
$ qemu-system-x86_64 --machine accel=xen:tcg
xencall: error: Could not obtain handle on privileged command interface: No such file or directory
xen be core: xen be core: can't open xen interface
can't open xen interface
qemu-system-x86_64: failed to initialize Xen: Operation not permitted
qemu-system-x86_64: /home/armbru/work/qemu/qom/object.c:436: object_set_accelerator_compat_props: Assertion `!object_compat_props[0]' failed.
Root cause: we register accelerator compat properties even when the
accelerator fails. The failed assertion is
object_set_accelerator_compat_props() telling us off. Fix by calling
it only for the accelerator that succeeded.
Fixes: 1a3ec8c156
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190401090827.20793-6-armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190401090827.20793-5-armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
migrate_add_blocker() asserts we have a current_migration object, in
migrate_get_current(). We do only after migration_object_init().
This contributes to the following dependency cycle:
* configure_blockdev() must run before machine_set_property()
so machine properties can refer to block backends
* machine_set_property() before configure_accelerator()
so machine properties like kvm-irqchip get applied
* configure_accelerator() before migration_object_init()
so that Xen's accelerator compat properties get applied.
* migration_object_init() before configure_blockdev()
so configure_blockdev() can add migration blockers
The cycle was closed when recent commit cda4aa9a5a "Create block
backends before setting machine properties" added the first
dependency, and satisfied it by violating the last one. Broke block
backends that add migration blockers, as demonstrated by qemu-iotests
055.
To fix it, break the last dependency: make migrate_add_blocker()
usable before migration_object_init().
The previous commit already removed the use of migrate_get_current()
from migrate_add_blocker() itself. Didn't quite do the trick, as
there's another one hiding in migration_is_idle().
The use there isn't actually necessary: when no migration object has
been created yet, migration is surely idle. Make migration_is_idle()
return true then.
Fixes: cda4aa9a5a
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190401090827.20793-4-armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
This reverts commit 3df663e575.
This reverts commit b605c47b57.
Command line option --only-migratable is for disallowing any
configuration that can block migration.
Initially, --only-migratable set global variable @only_migratable.
Commit 3df663e575 "migration: move only_migratable to MigrationState"
replaced it by MigrationState member @only_migratable. That was a
mistake.
First, it doesn't make sense on the design level. MigrationState
captures the state of an individual migration, but --only-migratable
isn't a property of an individual migration, it's a restriction on
QEMU configuration. With fault tolerance, we could have several
migrations at once. --only-migratable would certainly protect all of
them. Storing it in MigrationState feels inappropriate.
Second, it contributes to a dependency cycle that manifests itself as
a bug now.
Putting @only_migratable into MigrationState means its available only
after migration_object_init().
We can't set it before migration_object_init(), so we delay setting it
with a global property (this is fixup commit b605c47b57 "migration:
fix handling for --only-migratable").
We can't get it before migration_object_init(), so anything that uses
it can only run afterwards.
Since migrate_add_blocker() needs to obey --only-migratable, any code
adding migration blockers can run only afterwards. This contributes
to the following dependency cycle:
* configure_blockdev() must run before machine_set_property()
so machine properties can refer to block backends
* machine_set_property() before configure_accelerator()
so machine properties like kvm-irqchip get applied
* configure_accelerator() before migration_object_init()
so that Xen's accelerator compat properties get applied.
* migration_object_init() before configure_blockdev()
so configure_blockdev() can add migration blockers
The cycle was closed when recent commit cda4aa9a5a "Create block
backends before setting machine properties" added the first
dependency, and satisfied it by violating the last one. Broke block
backends that add migration blockers.
Moving @only_migratable into MigrationState was a mistake. Revert it.
This doesn't quite break the "migration_object_init() before
configure_blockdev() dependency, since migrate_add_blocker() still has
another dependency on migration_object_init(). To be addressed the
next commit.
Note that the reverted commit made -only-migratable sugar for -global
migration.only-migratable=on below the hood. Documentation has only
ever mentioned -only-migratable. This commit removes the arcane &
undocumented alternative to -only-migratable again. Nobody should be
using it.
Conflicts:
include/migration/misc.h
migration/migration.c
migration/migration.h
vl.c
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190401090827.20793-3-armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
This reverts commit e60483f2f8.
Recent commit cda4aa9a5a moved block backend creation before machine
property evaluation. This broke block backends registering migration
blockers. Commit e60483f2f8 fixed it by moving migration object
creation before block backend creation. This broke migration with
Xen. Turns out we need to configure the accelerator before we create
the migration object so that Xen's accelerator compat properties get
applied. Revert the flawed commit. This fixes the Xen regression,
but brings back the block backend regression. The next commits will
fix it again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190401090827.20793-2-armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
We blame badness in positional arguments on the last option argument:
$ qemu-system-x86_64 -vnc :1 bad.img
qemu-system-x86_64: -vnc :1: Could not open 'foo': No such file or directory
I believe we've done this ever since we reported locations. Fix it to
qemu-system-x86_64: bad.img: Could not open 'bad.img': No such file or directory
Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190318183312.4684-1-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
iotest 235 currently only works with KVM - this is bad for systems where
it is not available, e.g. CI pipelines. The test also works when using
"tcg" as accelerator, so we can simply add that to the list of accelerators,
too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The base node of a block-stream operation indicates the first image
from the backing chain starting from which no data is copied to the
top node.
The block-stream job allows others to use that base image, so a second
block-stream job could be writing to it at the same time. An important
restriction is that the base image must not disappear while the stream
job is ongoing. stream_start() freezes the backing chain from top to
base with that purpose but it does it too late in the code so there is
a race condition there.
This bug was fixed in the previous commit, and this patch contains an
iotest for this scenario.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 6585493369 added code to freeze
the backing chain from 'top' to 'base' for the duration of the
block-stream job.
The problem is that the freezing happens too late in stream_start():
during the bdrv_reopen_set_read_only() call earlier in that function
another job can jump in and remove the base image. If that happens we
have an invalid chain and QEMU crashes.
This patch puts the bdrv_freeze_backing_chain() call at the beginning
of the function.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All three functions that handle the BdrvChild.frozen attribute walk
the backing chain from 'bs' to 'base' and stop either when 'base' is
found or at the end of the chain if 'base' is NULL.
However if 'base' is not found then the functions return without
errors as if it was NULL.
This is wrong: if the caller passed an incorrect parameter that means
that there is a bug in the code.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_replace_child() calls bdrv_check_perm() with error_abort on
loosening permissions. However file-locking operations may fail even
in this case, for example on NFS. And this leads to Qemu crash.
Let's avoid such errors. Note, that we ignore such things anyway on
permission update commit and abort.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The file tests/qemu-iotests/COPYING is the same text as in the
COPYING file in the main directory. So as far as I can see, we don't
need the duplicate here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Several versions of GlusterFS (3.12? -> 6.0.1) fail when the
transfer size is greater or equal to 1024 MiB, so we are
limiting the transfer size to 512 MiB to avoid this rare issue.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1691320
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
With Kevin's "block: Fix slow pre-zeroing in qemu-img convert"[1]
(commit c9fdcf202f, 'qemu-img: Use BDRV_REQ_NO_FALLBACK for
pre-zeroing') we skip the pre zero step called like this:
blk_make_zero(s->target, BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK)
And we write zeroes later using:
blk_co_pwrite_zeroes(s->target,
sector_num << BDRV_SECTOR_BITS,
n << BDRV_SECTOR_BITS, 0);
Since we use flags=0, this is translated to NBD_CMD_WRITE_ZEROES with
NBD_CMD_FLAG_NO_HOLE flag, which cause the NBD server to allocated space
instead of punching a hole.
Here is an example failure:
$ dd if=/dev/urandom of=src.img bs=1M count=5
$ truncate -s 50m src.img
$ truncate -s 50m dst.img
$ nbdkit -f -v -e '' -U nbd.sock file file=dst.img
$ ./qemu-img convert -n src.img nbd:unix:nbd.sock
We can see in nbdkit log that it received the NBD_CMD_FLAG_NO_HOLE
(may_trim=0):
nbdkit: file[1]: debug: newstyle negotiation: flags: export 0x4d
nbdkit: file[1]: debug: pwrite count=2097152 offset=0
nbdkit: file[1]: debug: pwrite count=2097152 offset=2097152
nbdkit: file[1]: debug: pwrite count=1048576 offset=4194304
nbdkit: file[1]: debug: zero count=33554432 offset=5242880 may_trim=0
nbdkit: file[1]: debug: zero count=13631488 offset=38797312 may_trim=0
nbdkit: file[1]: debug: flush
And the image became fully allocated:
$ qemu-img info dst.img
virtual size: 50M (52428800 bytes)
disk size: 50M
With this change we see that nbdkit did not receive the
NBD_CMD_FLAG_NO_HOLE (may_trim=1):
nbdkit: file[1]: debug: newstyle negotiation: flags: export 0x4d
nbdkit: file[1]: debug: pwrite count=2097152 offset=0
nbdkit: file[1]: debug: pwrite count=2097152 offset=2097152
nbdkit: file[1]: debug: pwrite count=1048576 offset=4194304
nbdkit: file[1]: debug: zero count=33554432 offset=5242880 may_trim=1
nbdkit: file[1]: debug: zero count=13631488 offset=38797312 may_trim=1
nbdkit: file[1]: debug: flush
And the file is sparse as expected:
$ qemu-img info dst.img
virtual size: 50M (52428800 bytes)
disk size: 5.0M
[1] http://lists.nongnu.org/archive/html/qemu-block/2019-03/msg00761.html
Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio-pci is optional on s390x, e.g. in downstream RHEL builds, it
is disabled. On s390x, virtio-ccw should be used instead. Other tests
like 051 or 240 already use virtio-scsi-ccw instead of virtio-scsi-pci
on s390x, so let's do the same here and always use virtio-scsi-ccw on
s390x.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently the default audio timer frequency is 10000Hz instead of
a period of 10000us. Also the audiodev timer-period property gets
converted like a frequency. Only handling of the legacy
QEMU_AUDIO_TIMER_PERIOD environment variable is correct because
it's actually a frequency.
With this patch the property timer-period is really a timer period
and QEMU_AUDIO_TIMER_PERIOD remains a frequency.
Fixes: 71830221fb "-audiodev command line option basic implementation."
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Reviewed-by: Zoltán Kővágó <DirtY.iCE.hu@gmail.com>
Message-id: 90b95e4f-39ef-2b01-da6a-857ebaee1ec5@t-online.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This function is used in the delete path only and can
be replaced by a call to usb_mtp_object_free.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <20190401211712.19012-3-bsd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Spotted by Coverity: CID 1399414
mtp delete allows the return status of delete succeeded,
partial_delete or readonly - when none of the objects could be
deleted. Give more meaningful names to return values of the
delete function.
Some initiators recurse over the objects themselves. In that case,
only READ_ONLY can be returned.
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <20190401211712.19012-2-bsd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
- Better behavior of qemu-img map on NBD images
- Fixes for NBD protocol alignment corner cases:
- the server has fewer places where it sends reads or block status
not aligned to its advertised block size
- the client has more cases where it can work around server
non-compliance present in qemu 3.1
- the client now avoids non-compliant requests when interoperating
with nbdkit or other servers not advertising block size
-----BEGIN PGP SIGNATURE-----
iQEcBAABCAAGBQJcohr+AAoJEKeha0olJ0Nq+pQH/2AfH172yH2QcTLFbamBhWr1
gCeU4Kk4U9vD5itKBY7SQfdJxebzSJTkDcSTsGSXI7Q/cadzCNU2HJh7/ALbkbTL
suDju7uzmvbW7ogMbV/EO4qmomn04cNg+mUkgUgPQZGcnZAaXt2tFtWzPOLuboc1
kJhsLDfE9A/tFuz+Io7Umrjw6TH8CuerGw+xUJ4hATqiWPd/5k0WazvmKis143io
zS8ffMZJzJBmpSDtohojz6EFYZOKTFrPNcUwly8cVtRReSV5OiOhFW2Ul6HNT6I7
2hxkSYFB/7V4zDM0OGPkbhdLAkggFAxAZD16sBzbggFnMyJvqFIj+RUBXuKpv/g=
=BGOe
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2019-04-01' into staging
nbd patches for 2019-04-01
- Better behavior of qemu-img map on NBD images
- Fixes for NBD protocol alignment corner cases:
- the server has fewer places where it sends reads or block status
not aligned to its advertised block size
- the client has more cases where it can work around server
non-compliance present in qemu 3.1
- the client now avoids non-compliant requests when interoperating
with nbdkit or other servers not advertising block size
# gpg: Signature made Mon 01 Apr 2019 15:06:54 BST
# gpg: using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg: aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2019-04-01:
nbd/client: Trace server noncompliance on structured reads
nbd/server: Advertise actual minimum block size
block: Add bdrv_get_request_alignment()
nbd/client: Support qemu-img convert from unaligned size
nbd/client: Reject inaccessible tail of inconsistent server
nbd/client: Report offsets in bdrv_block_status
nbd/client: Lower min_block for block-status, unaligned size
iotests: Add 241 to test NBD on unaligned images
nbd-client: Work around server BLOCK_STATUS misalignment at EOF
qemu-img: Gracefully shutdown when map can't finish
nbd: Permit simple error to NBD_CMD_BLOCK_STATUS
nbd: Don't lose server's error to NBD_CMD_BLOCK_STATUS
nbd: Tolerate some server non-compliance in NBD_CMD_BLOCK_STATUS
qemu-img: Report bdrv_block_status failures
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Just as we recently added a trace for a server sending block status
that doesn't match the server's advertised minimum block alignment,
let's do the same for read chunks. But since qemu 3.1 is such a
server (because it advertised 512-byte alignment, but when serving a
file that ends in data but is not sector-aligned, NBD_CMD_READ would
detect a mid-sector change between data and hole at EOF and the
resulting read chunks are unaligned), we don't want to change our
behavior of otherwise tolerating unaligned reads.
Note that even though we fixed the server for 4.0 to advertise an
actual block alignment (which gets rid of the unaligned reads at EOF
for posix files), we can still trigger it via other means:
$ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
Arguably, that is a bug in the blkdebug block status function, for
leaking a block status that is not aligned. It may also be possible to
observe issues with a backing layer with smaller alignment than the
active layer, although so far I have been unable to write a reliable
iotest for that scenario.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190330165349.32256-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Both NBD_CMD_BLOCK_STATUS and structured NBD_CMD_READ will split their
reply according to bdrv_block_status() boundaries. If the block device
has a request_alignment smaller than 512, but we advertise a block
alignment of 512 to the client, then this can result in the server
reply violating client expectations by reporting a smaller region of
the export than what the client is permitted to address (although this
is less of an issue for qemu 4.0 clients, given recent client patches
to overlook our non-compliance at EOF). Since it's always better to
be strict in what we send, it is worth advertising the actual minimum
block limit rather than blindly rounding it up to 512.
Note that this patch is not foolproof - it is still possible to
provoke non-compliant server behavior using:
$ qemu-nbd --image-opts driver=blkdebug,align=512,image.driver=file,image.filename=/path/to/non-aligned-file
That is arguably a bug in the blkdebug driver (it should never pass
back block status smaller than its alignment, even if it has to make
multiple bdrv_get_status calls and determine the
least-common-denominator status among the group to return). It may
also be possible to observe issues with a backing layer with smaller
alignment than the active layer, although so far I have been unable to
write a reliable iotest for that scenario (but again, an issue like
that could be argued to be a bug in the block layer, or something
where we need a flag to bdrv_block_status() to state whether the
result must be aligned to the current layer's limits or can be
subdivided for accuracy when chasing backing files).
Anyways, as blkdebug is not normally used, and as this patch makes our
server more interoperable with qemu 3.1 clients, it is worth applying
now, even while we still work on a larger patch series for the 4.1
timeframe to have byte-accurate file lengths.
Note that the iotests output changes - for 223 and 233, we can see the
server's better granularity advertisement; and for 241, the three test
cases have the following effects:
- natural alignment: the server's smaller alignment is now advertised,
and the hole reported at EOF is now the right result; we've gotten rid
of the server's non-compliance
- forced server alignment: the server still advertises 512 bytes, but
still sends a mid-sector hole. This is still a server compliance bug,
which needs to be fixed in the block layer in a later patch; output
does not change because the client is already being tolerant of the
non-compliance
- forced client alignment: the server's smaller alignment means that
the client now sees the server's status change mid-sector without any
protocol violations, but the fact that the map shows an unaligned
mid-sector hole is evidence of the block layer problems with aligned
block status, to be fixed in a later patch
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190329042750.14704-7-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: rebase to enhanced iotest 241 coverage]
The next patch needs access to a device's minimum permitted
alignment, since NBD wants to advertise this to clients. Add
an accessor function, borrowing from blk_get_max_transfer()
for accessing a backend's block limits.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20190329042750.14704-6-eblake@redhat.com>
If an NBD server advertises a size that is not a multiple of a sector,
the block layer rounds up that size, even though we set info.size to
the exact byte value sent by the server. The block layer then proceeds
to let us read or query block status on the hole that it added past
EOF, which the NBD server is unlikely to be happy with. Fortunately,
qemu as a server never advertizes an unaligned size, so we generally
don't run into this problem; but the nbdkit server makes it easy to
test:
$ printf %1000d 1 > f1
$ ~/nbdkit/nbdkit -fv file f1 & pid=$!
$ qemu-img convert -f raw nbd://localhost:10809 f2
$ kill $pid
$ qemu-img compare f1 f2
Pre-patch, the server attempts a 1024-byte read, which nbdkit
rightfully rejects as going beyond its advertised 1000 byte size; the
conversion fails and the output files differ (not even the first
sector is copied, because qemu-img does not follow ddrescue's habit of
trying smaller reads to get as much information as possible in spite
of errors). Post-patch, the client's attempts to read (and query block
status, for new enough nbdkit) are properly truncated to the server's
length, with sane handling of the hole the block layer forced on
us. Although f2 ends up as a larger file (1024 bytes instead of 1000),
qemu-img compare shows the two images to have identical contents for
display to the guest.
I didn't add iotests coverage since I didn't want to add a dependency
on nbdkit in iotests. I also did NOT patch write, trim, or write
zeroes - these commands continue to fail (usually with ENOSPC, but
whatever the server chose), because we really can't write to the end
of the file, and because 'qemu-img convert' is the most common case
where we care about being tolerant (which is read-only). Perhaps we
could truncate the request if the client is writing zeros to the tail,
but that seems like more work, especially if the block layer is fixed
in 4.1 to track byte-accurate sizing (in which case this patch would
be reverted as unnecessary).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190329042750.14704-5-eblake@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
The NBD spec suggests that a server should never advertise a size
inconsistent with its minimum block alignment, as that tail is
effectively inaccessible to a compliant client obeying those block
constraints. Since we have a habit of rounding up rather than
truncating, to avoid losing the last few bytes of user input, and we
cannot access the tail when the server advertises bogus block sizing,
abort the connection to alert the server to fix their bug. And
rejecting such servers matches what we already did for a min_block
that was not a power of 2 or which was larger than max_block.
Does not impact either qemu (which always sends properly aligned
sizes) or nbdkit (which does not send minimum block requirements yet);
so this is mostly aimed at new NBD server implementations, and ensures
that the rest of our code can assume the size is aligned.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190330155704.24191-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
In usb_mask_to_str() we convert a mask of USB speeds into
a human-readable string (like "full+high") for use in
tracing and error messages. However the conversion code
doesn't do anything to the string buffer if the passed in
speedmask doesn't match any of the recognized speeds,
which means that the tracing and error messages will
end up with random garbage in them. This can happen if
we're doing USB device passthrough.
Handle the "unrecognized speed" case by using the
string "unknown".
Fixes: https://bugs.launchpad.net/qemu/+bug/1603785
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20190328133503.6490-1-peter.maydell@linaro.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This reverts commit bd56d37884.
Turned out it isn't that simple as the device needs the pit object link.
So "-device isa-pcspk" isn't going wo work anyway. We are in freeze, so
just reverting the thing is the best way to handle this for now, trying
to come up with something better can be done in the 4.1 devel cycle.
Also add a comment noting the object link.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20190328071121.21147-1-kraxel@redhat.com
It is desirable for 'qemu-img map' to have the same output for a file
whether it is served over file or nbd protocols. However, ever since
we implemented block status for NBD (2.12), the NBD protocol forgot to
inform the block layer that as the final layer in the chain, the
offset is valid; without an offset, the human-readable form of
qemu-img map gives up with the unhelpful:
$ nbdkit -U - data data="1" size=512 --run 'qemu-img map $nbd'
Offset Length Mapped to File
qemu-img: File contains external, encrypted or compressed clusters.
The --output=json form always works, because it is reporting the
lower-level bdrv_block_status results directly rather than trying to
filter out sparse ranges for human consumption - but now it also
shows the offset member.
With this patch, the human output changes to:
Offset Length Mapped to File
0 0x200 0 nbd+unix://?socket=/tmp/nbdkitOxeoLa/socket
This change is observable to several iotests.
Fixes: 78a33ab5
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190329042750.14704-4-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
We have a latent bug in our NBD client code, tickled by the brand new
nbdkit 1.11.10 block status support:
$ nbdkit --filter=log --filter=truncate -U - \
data data="1" size=511 truncate=64K logfile=/dev/stdout \
--run 'qemu-img convert $nbd /var/tmp/out'
...
qemu-img: block/io.c:2122: bdrv_co_block_status: Assertion `*pnum && QEMU_IS_ALIGNED(*pnum, align) && align > offset - aligned_offset' failed.
The culprit? Our implementation of .bdrv_co_block_status can return
unaligned block status for any server that operates with a lower
actual alignment than what we tell the block layer in
request_alignment, in violation of the block layer's constraints. To
date, we've been unable to trip the bug, because qemu as NBD server
always advertises block sizing (at which point it is a server bug if
the server sends unaligned status - although qemu 3.1 is such a server
and I've sent separate patches for 4.0 both to get the server to obey
the spec, and to let the client to tolerate server oddities at EOF).
But nbdkit does not (yet) advertise block sizing, and therefore is not
in violation of the spec for returning block status at whatever
boundaries it wants, and those unaligned results can occur anywhere
rather than just at EOF. While we are still wise to avoid sending
sub-sector read/write requests to a server of unknown origin, we MUST
consider that a server telling us block status without an advertised
block size is correct. So, we either have to munge unaligned answers
from the server into aligned ones that we hand back to the block
layer, or we have to tell the block layer about a smaller alignment.
Similarly, if the server advertises an image size that is not
sector-aligned, we might as well assume that the server intends to let
us access those tail bytes, and therefore supports a minimum block
size of 1, regardless of whether the server supports block status
(although we still need more patches to fix the problem that with an
unaligned image, we can send read or block status requests that exceed
EOF to the server). Again, qemu as server cannot trip this problem
(because it rounds images to sector alignment), but nbdkit advertised
unaligned size even before it gained block status support.
Solve both alignment problems at once by using better heuristics on
what alignment to report to the block layer when the server did not
give us something to work with. Note that very few NBD servers
implement block status (to date, only qemu and nbdkit are known to do
so); and as the NBD spec mentioned block sizing constraints prior to
documenting block status, it can be assumed that any future
implementations of block status are aware that they must advertise
block size if they want a minimum size other than 1.
We've had a long history of struggles with picking the right alignment
to use in the block layer, as evidenced by the commit message of
fd8d372d (v2.12) that introduced the current choice of forced 512-byte
alignment.
There is no iotest coverage for this fix, because qemu can't provoke
it, and I didn't want to make test 241 dependent on nbdkit.
Fixes: fd8d372d
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20190329042750.14704-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>