Compatibility properties started life as a qdev property thing: we
supported them only for qdev properties, and implemented them with the
machinery backing command line option -global.
Recent commit fa0cb34d22 put them to use (tacitly) with memory
backend objects (subtypes of TYPE_MEMORY_BACKEND). To make that
possible, we first moved the work of applying them from the -global
machinery into TYPE_DEVICE's .instance_post_init() method
device_post_init(), in commits ea9ce8934c and b66bbee39f, then made
it available to TYPE_MEMORY_BACKEND's .instance_post_init() method
host_memory_backend_post_init() as object_apply_compat_props(), in
commit 1c3994f6d2.
Note the code smell: we now have function name starting with object_
in hw/core/qdev.c. It has to be there rather than in qom/, because it
calls qdev_get_machine() to find the current accelerator's and
machine's compat_props.
Turns out calling qdev_get_machine() there is problematic. If we
qdev_create() from a machine's .instance_init() method, we call
device_post_init() and thus qdev_get_machine() before main() can
create "/machine" in QOM. qdev_get_machine() tries to get it with
container_get(), which "helpfully" creates it as "container" object,
and returns that. object_apply_compat_props() tries to paper over the
problem by doing nothing when the value of qdev_get_machine() isn't a
TYPE_MACHINE. But the damage is done already: when main() later
attempts to create the real "/machine", it fails with "attempt to add
duplicate property 'machine' to object (type 'container')", and
aborts.
Since no machine .instance_init() calls qdev_create() so far, the bug
is latent. But since I want to do that, I get to fix the bug first.
Observe that object_apply_compat_props() doesn't actually need the
MachineState, only its the compat_props member of its MachineClass and
AccelClass. This permits a simple fix: register MachineClass and
AccelClass compat_props with the object_apply_compat_props() machinery
right after these classes get selected.
This is actually similar to how things worked before commits
ea9ce8934c and b66bbee39f, except we now register much earlier. The
old code registered them only after the machine's .instance_init()
ran, which would've broken compatibility properties for any devices
created there.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190308131445.17502-2-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Our pflash devices are simplistically modelled has having
"num-blocks" sectors of equal size "sector-length". Real hardware
commonly has sectors of different sizes. How our "sector-length"
property is related to the physical device's multiple sector sizes
is unclear.
Helper functions pflash_cfi01_register() and pflash_cfi02_register()
create a pflash device, set properties including "sector-length" and
"num-blocks", and realize. They take parameters @size, @sector_len
and @nb_blocs.
QOMification left parameter @size unused. Obviously, @size should
match @sector_len and @nb_blocs, i.e. size == sector_len * nb_blocs.
All callers satisfy this.
Remove @nb_blocs and compute it from @size and @sector_len.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-16-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
QOMification left parameter @qdev unused in pflash_cfi01_register()
and pflash_cfi02_register(). All callers pass NULL. Remove.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190308094610.21210-15-armbru@redhat.com>
We have two open-coded copies of macro PFLASH_CFI01(). Move the macro
to the header, so we can ditch the copies. Move PFLASH_CFI02() to the
header for symmetry.
We define macros TYPE_PFLASH_CFI01 and TYPE_PFLASH_CFI02 for type name
strings, then mostly use the strings. If the macros are worth
defining, they are worth using. Replace the strings by the macros.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-6-armbru@redhat.com>
pflash_cfi01.c and pflash_cfi02.c start their identifiers with
pflash_cfi01_ and pflash_cfi02_ respectively, except for
CFI_PFLASH01(), TYPE_CFI_PFLASH01, CFI_PFLASH02(), TYPE_CFI_PFLASH02.
Rename for consistency.
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190308094610.21210-5-armbru@redhat.com>
flash.h's incomplete struct pflash_t is completed both in
pflash_cfi01.c and in pflash_cfi02.c. The complete types are
incompatible. This can hide type errors, such as passing a pflash_t
created with pflash_cfi02_register() to pflash_cfi01_get_memory().
Furthermore, POSIX reserves typedef names ending with _t.
Rename the two structs to PFlashCFI01 and PFlashCFI02.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190308094610.21210-2-armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20190307080244.9011-4-kraxel@redhat.com
- qcow2: Support for external data files
- qcow2: Default to 4KB for the qcow2 cache entry size
- Apply block driver whitelist for -drive format=help
- Several qemu-iotests improvements
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcgmYDAAoJEH8JsnLIjy/WiEgP/jirn5n4bFHDSzpofRxgpcEG
2SoBpYtJQvAVgXmg8P1ZKeaX5yiiZpJS475ShuH6C3dnWuaHBBjvQfDkLLrh3v05
4KJyZQUFx0WQZqkEiHxtgmE3iIOAsWe8VzBCZjsFITdp+fN8HlRjKVofyYP0y48G
k9PzMlEe30Wu2s+lq6PEIXgpwIZqOVl1V3C7wDF8Vg/1OOsv+rK0vKD8yra/oQAc
mthM+hjkpa+ohjE5aTeCsDVxO56AStvXv0d3bE0aF/ZtCvQdbVh5coYj1ldNz6VY
Bvx+UP7kPHJw0wesZJXLeVyZUMyHAuu9vW6zKlDYJ7PoIXSLXC7rYyoEofvkAzQL
awSluFj4HpYU1dKseJ8LzrMyUqjJ2eLJ+K48iNIh0vNlVuvX8aR62dIv0Obo3rod
Y7zgyd6mSgDGulDa3xVsD0+eAUUEbLaUDBV80+M7S2/V9YP4Bt7lKbm0SQB4dUxm
eOJfnbMTpIYUUjYtpCkURED3MlTIA51fy28O87TMJwa11sJTXRscysE9oNQNsvwW
2UPhMPLykMI023glhV0vCwgXQ5kktOvpaB3U7LGQhhHd1ed9sdLEB1bO9eKWYr4N
h6QPLBRLGUYd+BFMMIBLQtx8r+mAn1hUZoX4zxTn0lD50Rch3pRnCjMoC1iWqRkS
J204Lzum5kgNPheyVxhy
=GCRI
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- qcow2: Support for external data files
- qcow2: Default to 4KB for the qcow2 cache entry size
- Apply block driver whitelist for -drive format=help
- Several qemu-iotests improvements
# gpg: Signature made Fri 08 Mar 2019 12:54:27 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (33 commits)
qcow2 spec: Describe string header extensions
qemu-iotests: Add dependency to qemu-nbd tool
ahci-test: Add dependency to qemu-img tool
qemu-iotests: amend with external data file
qemu-iotests: General tests for qcow2 with external data file
qemu-iotests: Preallocation with external data file
qcow2: Implement data-file-raw create option
qcow2: Store data file name in the image
qcow2: Creating images with external data file
qcow2: Add basic data-file infrastructure
qcow2: Support external data file in qemu-img check
qcow2: Return error for snapshot operation with data file
qcow2: External file I/O
qcow2: Prepare qcow2_co_block_status() for data file
qcow2: Return 0/-errno in qcow2_alloc_compressed_cluster_offset()
qcow2: Don't assume 0 is an invalid cluster offset
qcow2: Prepare count_contiguous_clusters() for external data file
qcow2: Prepare qcow2_get_cluster_type() for external data file
qcow2: Pass bs to qcow2_get_cluster_type()
qcow2: Basic definitions for external data files
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Provide an option to force QEMU to always keep the external data file
consistent as a standalone read-only raw image.
At the moment, this means making sure that write_zeroes requests are
forwarded to the data file instead of just updating the metadata, and
checking that no backing file is used.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_iterate_format (which is currently only used for printing out the
formats supported by the block layer) doesn't take format whitelisting
into account.
This creates a problem for tests: they enumerate supported formats to
decide which tests to enable, but then discover that QEMU doesn't let
them actually use some of those formats.
To avoid that, exclude formats that are not whitelisted from
enumeration, if whitelisting is in use. Since we have separate
whitelists for r/w and r/o, take this a parameter to
bdrv_iterate_format, and print two lists of supported formats (r/w and
r/o) in main qemu.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In existing code we create the gcontext dynamically at the first
access of the gcontext from caller. That can bring some complexity
and potential races during using iothread. Since the context itself
is not that big a resource, and we won't have millions of iothread,
let's simply create the gcontext unconditionally.
This will also be a preparation work further to move the thread
context push operation earlier than before (now it's only pushed right
before we want to start running the gmainloop).
Removing the g_once since it's not necessary, while introducing a new
run_gcontext boolean to show whether we want to run the gcontext.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20190306115532.23025-3-peterx@redhat.com
Message-Id: <20190306115532.23025-3-peterx@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Only sending an init-done message using lock+cond seems an overkill to
me. Replacing it with a simpler semaphore.
Meanwhile, init the semaphore unconditionally, then we can destroy it
unconditionally too in finalize which seems cleaner.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20190306115532.23025-2-peterx@redhat.com
Message-Id: <20190306115532.23025-2-peterx@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduced in 64b40bc54a, this definition is no more used since
a0b753dfd3. Remove it.
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Add qgraph API that allows to add/remove nodes and edges from the graph,
implementation of Depth First Search to discover the paths and basic unit
test to check correctness of the API.
Included also a main executable that takes care of starting the framework,
create the nodes, set the available drivers/machines, discover the path and
run tests.
graph.h provides the public API to manage the graph nodes/edges
graph_extra.h provides a more private API used successively by the gtest integration part
qos-test.c provides the main executable
Signed-off-by: Emanuele Giuseppe Esposito <e.emanuelegiuseppe@gmail.com>
[Paolo's changes compared to the Google Summer of Code submission:
* added subprocess to test options
* refactored object creation to support live migration tests
* removed driver .before callback (unused)
* removed test .after callbacks (replaced by GTest destruction queue)]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
slirp migration code uses QEMU vmstate so far, when building WITH_QEMU.
Introduce slirp_state_{load,save,version}() functions to move the
state saving handling to libslirp side.
So far, the bitstream compatibility should remain equal with current
QEMU, as this is effectively using the same code, with the same format
etc. When libslirp is made standalone, we will need some mechanism to
ensure bitstream compatibility regardless of the libslirp version
installed. See the FIXME note in the code.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20190212162524.31504-3-marcandre.lureau@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
As with the previous patch to qemu-nbd, the nbd-server-start QMP command
also needs to be able to specify authorization when enabling TLS encryption.
First the client must create a QAuthZ object instance using the
'object-add' command:
{
'execute': 'object-add',
'arguments': {
'qom-type': 'authz-list',
'id': 'authz0',
'parameters': {
'policy': 'deny',
'rules': [
{
'match': '*CN=fred',
'policy': 'allow'
}
]
}
}
}
They can then reference this in the new 'tls-authz' parameter when
executing the 'nbd-server-start' command:
{
'execute': 'nbd-server-start',
'arguments': {
'addr': {
'type': 'inet',
'host': '127.0.0.1',
'port': '9000'
},
'tls-creds': 'tls0',
'tls-authz': 'authz0'
}
}
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20190227162035.18543-3-berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently any client which can complete the TLS handshake is able to use
the NBD server. The server admin can turn on the 'verify-peer' option
for the x509 creds to require the client to provide a x509 certificate.
This means the client will have to acquire a certificate from the CA
before they are permitted to use the NBD server. This is still a fairly
low bar to cross.
This adds a '--tls-authz OBJECT-ID' option to the qemu-nbd command which
takes the ID of a previously added 'QAuthZ' object instance. This will
be used to validate the client's x509 distinguished name. Clients
failing the authorization check will not be permitted to use the NBD
server.
For example to setup authorization that only allows connection from a client
whose x509 certificate distinguished name is
CN=laptop.example.com,O=Example Org,L=London,ST=London,C=GB
escape the commas in the name and use:
qemu-nbd --object tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\
endpoint=server,verify-peer=yes \
--object 'authz-simple,id=auth0,identity=CN=laptop.example.com,,\
O=Example Org,,L=London,,ST=London,,C=GB' \
--tls-creds tls0 \
--tls-authz authz0 \
....other qemu-nbd args...
NB: a real shell command line would not have leading whitespace after
the line continuation, it is just included here for clarity.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20190227162035.18543-2-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: split long line in --help text, tweak 233 to show that whitespace
after ,, in identity= portion is actually okay]
Signed-off-by: Eric Blake <eblake@redhat.com>
Let's use a wrapper instead of looking it up manually. This function can
than be reused when we explicitly want to have the bus hotplug handler
(e.g. when the bus hotplug handler was overwritten by the machine
hotplug handler).
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190228122849.4296-4-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
it will allow to return another hotplug handler than the default
one for a specific bus based device type. Which is needed to handle
non trivial plug/unplug sequences that need the access to resources
configured outside of bus where device is attached.
That will allow for returned hotplug handler to orchestrate wiring
in arbitrary order, by chaining other hotplug handlers when
it's needed.
PS:
It could be used for hybrid virtio-mem and virtio-pmem devices
where it will return machine as hotplug handler which will do
necessary wiring at machine level and then pass control down
the chain to bus specific hotplug handler.
Example of top level hotplug handler override and custom plug sequence:
some_machine_get_hotplug_handler(machine){
if (object_dynamic_cast(OBJECT(dev), TYPE_SOME_BUS_DEVICE)) {
return HOTPLUG_HANDLER(machine);
}
return NULL;
}
some_machine_device_plug(hotplug_dev, dev) {
if (object_dynamic_cast(OBJECT(dev), TYPE_SOME_BUS_DEVICE)) {
/* do machine specific initialization */
some_machine_init_special_device(dev)
/* pass control to bus specific handler */
hotplug_handler_plug(dev->parent_bus->hotplug_handler, dev)
}
}
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190228122849.4296-3-david@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The qbus_is_full(BusState *bus) function (qdev_monitor.c) compares the max_index
value of the BusState structure with the max_dev value of the BusClass structure
to determine whether the maximum number of children has been reached for the
bus. The problem is, the max_index field of the BusState structure does not
necessarily reflect the number of devices that have been plugged into
the bus.
Whenever a child device is plugged into the bus, the bus's max_index value is
assigned to the child device and then incremented. If the child is subsequently
unplugged, the value of the max_index does not change and no longer reflects the
number of children.
When the bus's max_index value reaches the maximum number of devices
allowed for the bus (i.e., the max_dev field in the BusClass structure),
attempts to plug another device will be rejected claiming that the bus is
full -- even if the bus is actually empty.
To resolve the problem, a new 'num_children' field is being added to the
BusState structure to keep track of the number of children plugged into the
bus. It will be incremented when a child is plugged, and decremented when a
child is unplugged.
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Pierre Morel<pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <1545062250-7573-1-git-send-email-akrowiak@linux.ibm.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(This replaces the pull sent yesterday)
a) 4 small fixes including the cancel problem
that caused the ahci migration test to fail
intermittently
b) Yury's ignore-shared feature
c) Juan's extra tests
d) Wei Wang's free page hinting
e) Some Colo fixes from Zhang Chen
Diff from yesterdays pull:
1) A missing fix of mine (cleanup during exit)
2) Changes from Eric/Markus on 'Create socket-address parameter'
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcf7GJAAoJEAUWMx68W/3n+lMP/Rl/d7hpi0Ve2fm3VEwoFJea
IRiqo7Yk6heyTCIutFq15pD2ef49AXHpLeGBp9gFNb4bdFTQzHmwOPxeJWig8YXV
m+j5sGRaM9sV8XX24DsZM7yFhpVJmWky8ivMSv3LeEmjx251B9CNL13dc/qVUQHv
lYP6ewnOmtjvR+x+z9Q/+vafnpLWJSxup1G0pZWdQfLpl71E2sMf7FY/G5EroVnf
AXmJb1sjdFXF7n968myfcgYETHsnY0SUa89Bcnd+i40DXvSfa4njXSdE4FOhyIim
n8c4SyRA/Ah2EUl+UGxn8TQ78C4RA3dUS+uXJDmjL1e4ACvqq//nhsfIqTJ9AbgF
Jhx5ArwqrGf7D+/PM5ivDocNplT5JFcCB4OCmZO96Kn0/F6M3UHuL1+IvpQcFMm8
1Ar1REB7BZ6f+QLfY8KKuzVrVRzUBi0DbqFHj5TNIStizOkuUEMMRpcWImBMzslG
531YgTnsSeFfFr13ZJlXDscZSZ5i+fJMjNbH9QpTNy8qmLJoZzbKqpmP4pZmHVI2
w3g1pCHpFejuQtUTNMR3+9mVH5hO+MNrANsTH0yfAXYDNToJ6NkY1nnILHp4P7t1
tqHYN7AO2ZXTTTMSnfyv1+2wh3HZRFB/y7uF6uEowBfuZTRHBHnkaVQp5WbVVSJu
4ovMmHDkcX2bM7VWwTHS
=dk89
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20190306a' into staging
Migation pull 2019-03-06
(This replaces the pull sent yesterday)
a) 4 small fixes including the cancel problem
that caused the ahci migration test to fail
intermittently
b) Yury's ignore-shared feature
c) Juan's extra tests
d) Wei Wang's free page hinting
e) Some Colo fixes from Zhang Chen
Diff from yesterdays pull:
1) A missing fix of mine (cleanup during exit)
2) Changes from Eric/Markus on 'Create socket-address parameter'
# gpg: Signature made Wed 06 Mar 2019 11:39:53 GMT
# gpg: using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>" [full]
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20190306a: (22 commits)
qapi/migration.json: Remove a variable that doesn't exist in example
Migration/colo.c: Make COLO node running after failover
Migration/colo.c: Fix double close bug when occur COLO failover
virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT
migration/ram.c: add the free page optimization enable flag
migration/ram.c: add a notifier chain for precopy
migration: API to clear bits of guest free pages from the dirty bitmap
migration: use bitmap_mutex in migration_bitmap_clear_dirty
bitmap: bitmap_count_one_with_offset
bitmap: fix bitmap_count_one
tests: Add basic migration precopy tcp test
migration: Create socket-address parameter
tests: Add migration xbzrle test
migration: Add capabilities validation
tests/migration-test: Add a test for ignore-shared capability
migration: Add an ability to ignore shared RAM blocks
migration: Introduce ignore-shared capability
exec: Change RAMBlockIterFunc definition
migration/rdma: clang compilation fix
migration: Cleanup during exit
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The new feature enables the virtio-balloon device to receive hints of
guest free pages from the free page vq.
A notifier is registered to the migration precopy notifier chain. The
notifier calls free_page_start after the migration thread syncs the dirty
bitmap, so that the free page optimization starts to clear bits of free
pages from the bitmap. It calls the free_page_stop before the migration
thread syncs the bitmap, which is the end of the current round of ram
save. The free_page_stop is also called to stop the optimization in the
case when there is an error occurred in the process of ram saving.
Note: balloon will report pages which were free at the time of this call.
As the reporting happens asynchronously, dirty bit logging must be
enabled before this free_page_start call is made. Guest reporting must be
disabled before the migration dirty bitmap is synchronized.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-8-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
dgilbert: Dropped kernel header update, fixed up CMD_ID_* name change
This patch adds the free page optimization enable flag, and a function
to set this flag. When the free page optimization is enabled, not all
the pages are needed to be sent in the bulk stage.
Why using a new flag, instead of directly disabling ram_bulk_stage when
the optimization is running?
Thanks for Peter Xu's reminder that disabling ram_bulk_stage will affect
the use of compression. Please see save_page_use_compression. When
xbzrle and compression are used, if free page optimizaion causes the
ram_bulk_stage to be disabled, save_page_use_compression will return
false, which disables the use of compression. That is, if free page
optimization avoids the sending of half of the guest pages, the other
half of pages loses the benefits of compression in the meantime. Using a
new flag to let migration_bitmap_find_dirty skip the free pages in the
bulk stage will avoid the above issue.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-7-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds a notifier chain for the memory precopy. This enables various
precopy optimizations to be invoked at specific places.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-6-git-send-email-wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds an API to clear bits corresponding to guest free pages
from the dirty bitmap. Spilt the free page block if it crosses the QEMU
RAMBlock boundary.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-5-git-send-email-wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Count the number of 1s in a bitmap starting from an offset.
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1544516693-5395-3-git-send-email-wei.w.wang@intel.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
BITMAP_LAST_WORD_MASK(nbits) returns 0xffffffff when "nbits=0", which
makes bitmap_count_one fail to handle the "nbits=0" case. It appears to be
preferred to remain BITMAP_LAST_WORD_MASK identical to the kernel
implementation that it is ported from.
So this patch fixes bitmap_count_one to handle the nbits=0 case.
Inital Discussion Link:
https://www.mail-archive.com/qemu-devel@nongnu.org/msg554316.html
Signed-off-by: Wei Wang <wei.w.wang@intel.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Peter Xu <peterx@redhat.com>
Message-Id: <1544516693-5395-2-git-send-email-wei.w.wang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If ignore-shared capability is set then skip shared RAMBlocks during the
RAM migration.
Also, move qemu_ram_foreach_migratable_block (and rename) to the
migration code, because it requires access to the migration capabilities.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-4-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently, qemu_ram_foreach_* calls RAMBlockIterFunc with many
block-specific arguments. But often iter func needs RAMBlock*.
This refactoring is needed for fast access to RAMBlock flags from
qemu_ram_foreach_block's callback. The only way to achieve this now
is to call qemu_ram_block_from_host (which also enumerates blocks).
So, this patch reduces complexity of
qemu_ram_foreach_block() -> cb() -> qemu_ram_block_from_host()
from O(n^2) to O(n).
Fix RAMBlockIterFunc definition and add some functions to read
RAMBlock* fields witch were passed.
Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
Message-Id: <20190215174548.2630-2-yury-kotov@yandex-team.ru>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently we cleanup the migration object as we exit main after the
main_loop finishes; however if there's a migration running things
get messy and we can end up with the migration thread still trying
to access freed structures.
We now take a ref to the object around the migration thread itself,
so the act of dropping the ref during exit doesn't cause us to lose
the state until the thread quits.
Cancelling the migration during migration also tries to get the thread
to quit.
We do this a bit earlier; so hopefully migration gets out of the way
before all the devices etc are freed.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190227164900.16378-1-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
All accessors that have an endian infix DO have an underscore between
{size} and {endian}.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <155119086741.1037569.12734854713022304642.stgit@bahia.lan>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Function acpi_table_add_builtin() is not used anymore.
Remove the definition and declaration.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20190214084939.20640-3-richardw.yang@linux.intel.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Function pc_acpi_init() is not used anymore.
Remove the definition and declaration.
Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190214084939.20640-2-richardw.yang@linux.intel.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Implement the watchdog timer for the stellaris boards.
This device is a close variant of the CMSDK APB watchdog
device, so we can model it by subclassing that device and
tweaking the behaviour of some of its registers.
Signed-off-by: Michel Heily <michelheily@gmail.com>
Reviewed-by: Peter Maydell <petser.maydell@linaro.org>
[PMM: rewrote commit message, fixed a few checkpatch nits,
added comment giving the URL of the spec for the Stellaris
variant of the watchdog device]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Up to now the memory map has been static and the high IO region
base has always been 256GiB.
This patch modifies the virt_set_memmap() function, which freezes
the memory map, so that the high IO range base becomes floating,
located after the initial RAM and the device memory.
The function computes
- the base of the device memory,
- the size of the device memory,
- the high IO region base
- the highest GPA used in the memory map.
Entries of the high IO region are assigned a base address. The
device memory is initialized.
The highest GPA used in the memory map will be used at VM creation
to choose the requested IPA size.
Setting all the existing highmem IO regions beyond the RAM
allows to have a single contiguous RAM region (initial RAM and
possible hotpluggable device memory). That way we do not need
to do invasive changes in the EDK2 FW to support a dynamic
RAM base.
Still the user cannot request an initial RAM size greater than 255GB.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-8-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On ARM, the kvm_type will be resolved by querying the KVMState.
Let's add the MachineState handle to the callback so that we
can retrieve the KVMState handle. in kvm_init, when the callback
is called, the kvm_state variable is not yet set.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-5-eric.auger@redhat.com
[ppc parts]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the prospect to introduce an extended memory map supporting more
RAM, let's split the memory map array into two parts:
- the former a15memmap, renamed base_memmap, contains regions below
and including the RAM. MemMapEntries initialized in this array
have a static size and base address.
- extended_memmap, only initialized with entries located after the
RAM. MemMapEntries initialized in this array only get their size
initialized. Their base address is dynamically computed depending
on the the top of the RAM, with same alignment as their size.
Eventually base_memmap entries are copied into the extended_memmap
array. Using two separate arrays however clarifies which entries
are statically allocated and those which are dynamically allocated.
This new split will allow to grow the RAM size without changing the
description of the high IO entries.
We introduce a new virt_set_memmap() helper function which
"freezes" the memory map. We call it in machvirt_init as
memory attributes of the machine are not yet set when
virt_instance_init() gets called.
The memory map is unchanged (the top of the initial RAM still is
256GiB). Then come the high IO regions with same layout as before.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-4-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for a split of the memory map into a static
part and a dynamic part floating after the RAM, let's rename the
regions located after the RAM
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-id: 20190304101339.25970-3-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some network devices have a capability to do self announcements
(ex: virtio-net). Add infrastructure that would allow devices
to expose this ability.
Signed-off-by: Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Switch the announcements to using the new announce timer.
Move the code that does it to announce.c rather than savevm
because it really has nothing to do with the actual migration.
Migration starts the announce from bh's and so they're all
in the main thread/bql, and so there's never any racing with
the timers themselves.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Switch virtio's self announcement to use the AnnounceTimer.
It keeps it's own AnnounceTimer (per device), and starts running it
using a migration post-load and a virtual clock; that way the
announce happens once the guest is actually running.
The timer uses the migration parameters to set the timing of
the repeats.
Based on earlier patches by myself and
Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Add migration parameters that control RARP/GARP announcement timeouts.
Based on earlier patches by myself and
Vladislav Yasevich <vyasevic@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The 'announce timer' will be used by migration, and explicit
requests for qemu to perform network announces.
Based on the work by Germano Veit Michel <germano@redhat.com>
and Vlad Yasevich <vyasevic@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Lots of work on tests: BiosTablesTest UEFI app,
vhost-user testing for non-Linux hosts.
Misc cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJccBqMAAoJECgfDbjSjVRpvSEIAKYPRNdCBX/SSS/L/tmJS5Zt
8IyU/HW1YJ249vO+aT6z4Q3QPgqNC3KjXC3brx/WRoPZnRroen4rv2Kqnk6SayPa
a52d2ubXKWxb3swdG1CAVzFRhq/ABpgAPx0dr1JW+RXgo2lxpJ4GNYxKMosQTaPE
hRNeXl1XlcIK525kJhFH3Hlij9mTRuY6T7ydpPQd8dUq2dBRaL9RrzZRrkZxCy6l
gQPUqNzPhG0XXyOiJmwYyVX0zGzbYrMLrMQAor2SBIYmU+zv2eZGPJUYxoMTUMzt
YR0WCpvkvPITlAryaBoozAIDYVz8PxBRT1KRwpDal+2rzlm6o+veKDiF8R46gn0=
=GzUz
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, pc, virtio: fixes, cleanups, tests
Lots of work on tests: BiosTablesTest UEFI app,
vhost-user testing for non-Linux hosts.
Misc cleanups and fixes all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 22 Feb 2019 15:51:40 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (26 commits)
pci: Sanity test minimum downstream LNKSTA
hw/smbios: fix offset of type 3 sku field
pci: Move NVIDIA vendor id to the rest of ids
virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size
virtio-balloon: Use ram_block_discard_range() instead of raw madvise()
virtio-balloon: Rework ballon_page() interface
virtio-balloon: Corrections to address verification
virtio-balloon: Remove unnecessary MADV_WILLNEED on deflate
i386/kvm: ignore masked irqs when update msi routes
contrib/vhost-user-blk: fix the compilation issue
Revert "contrib/vhost-user-blk: fix the compilation issue"
pc-dimm: use same mechanism for [get|set]_addr
tests/data: introduce "uefi-boot-images" with the "bios-tables-test" ISOs
tests/uefi-test-tools: add build scripts
tests: introduce "uefi-test-tools" with the BiosTablesTest UEFI app
roms: build the EfiRom utility from the roms/edk2 submodule
roms: add the edk2 project as a git submodule
vhost-user-test: create a temporary directory per TestServer
vhost-user-test: small changes to init_hugepagefs
vhost-user-test: create a main loop per TestServer
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>