Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we
complete migration, and we need to know which we expect to be
in to change state safely.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add qemu_savevm_state_complete_postcopy to complement
qemu_savevm_state_complete_precopy together with a new
save_live_complete_postcopy method on devices.
The save_live_complete_precopy method is called on
all devices during a precopy migration, and all non-postcopy
devices during a postcopy migration at the transition.
The save_live_complete_postcopy method is called at
the end of postcopy for all postcopiable devices.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
'MIGRATION_STATUS_POSTCOPY_ACTIVE' is entered after migrate_start_postcopy
'migration_in_postcopy' is provided for other sections to know if
they're in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Once postcopy is enabled (with migrate_set_capability), the migration
will still start on precopy mode. To cause a transition into postcopy
the:
migrate_start_postcopy
command must be issued. Postcopy will start sometime after this
(when it's next checked in the migration loop).
Issuing the command before migration has started will error,
and issuing after it has finished is ignored.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Provide a check to see if the OS we're running on has all the bits
needed for postcopy.
Creates postcopy-ram.c which will get most of the other helpers we need.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Modify save_live_pending to return separate postcopiable and
non-postcopiable counts.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
MIG_CMD_PACKAGED is a migration command that wraps a chunk of migration
stream inside a package whose length can be determined purely by reading
its header. The destination guarantees that the whole MIG_CMD_PACKAGED
is read off the stream prior to parsing the contents.
This is used by postcopy to load device state (from the package)
while leaving the main stream free to receive memory pages.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The state of the postcopy process is managed via a series of messages;
* Add wrappers and handlers for sending/receiving these messages
* Add state variable that track the current state of postcopy
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The 'postcopy ram' capability allows postcopy migration of RAM;
note that the migration starts off in precopy mode until
postcopy mode is triggered (see the migrate_start_postcopy
patch later in the series).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy needs to have two migration streams loading concurrently;
one from memory (with the device state) and the other from the fd
with the memory transactions.
Split the core of qemu_loadvm_state out so we can use it for both.
Allow the inner loadvm loop to quit and cause the parent loops to
exit as well.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Open a return path, and handle messages that are received upon it.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add migrate_send_rp_message to send a message from destination to source along the return path.
(It uses a mutex to let it be called from multiple threads)
Add migrate_send_rp_shut to send a 'shut' message to indicate
the destination is finished with the RP.
Add migrate_send_rp_ack to send a 'PONG' message in response to a PING
Use it in the MSG_RP_PING handler
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add two src->dest commands:
* OPEN_RETURN_PATH - To request that the destination open the return path
* PING - Request an acknowledge from the destination
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Create QEMU_VM_COMMAND section type for sending commands from
source to destination. These commands are not intended to convey
guest state but to control the migration process.
For use in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy needs a method to send messages from the destination back to
the source, this is the 'return path'.
Wire it up for 'socket' QEMUFile's.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
In postcopy we're going to need to perform the complete phase
for postcopiable devices at a different point, start out by
renaming all of the 'complete's to make the difference obvious.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Suspend to file is very much like a migrate, and it makes life
easier if we have the Migration state available, so initialise it
in the savevm.c code for suspending.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewd-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Useful for debugging the migration bitmap and other bitmaps
of the same format (including the sentmap in postcopy).
The bitmap is printed to stderr.
Lines that are all the expected value are excluded so the output
can be quite compact for many bitmaps.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add QEMU_MADV_NOHUGEPAGE as an OS-independent version of
MADV_NOHUGEPAGE.
We include sys/mman.h before making the test to ensure
that we pick up the system defines.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add a wrapper to change the blocking status on a QEMUFile
rather than having to use qemu_set_block(qemu_get_fd(f));
it seems best to avoid exposing the fd since not all QEMUFile's
really have one. With this wrapper we could move the implementation
down to be different on different transports.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
qemu_get_buffer always copies the data it reads to a users buffer,
however in many cases the file buffer inside qemu_file could be given
back to the caller, avoiding the copy. This isn't always possible
depending on the size and alignment of the data.
Thus 'qemu_get_buffer_in_place' either copies the data to a supplied
buffer or updates a pointer to the internal buffer if convenient.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
'file' becomes confusing when you have flows in each direction;
rename to make it clear.
This leaves just the main forward direction ms->file, which is used
in a lot of places and is probably not worth renaming given the churn.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add a function to find a RAMBlock by name; use it in two
of the places that already open code that loop; we've
got another use later in postcopy.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy sends RAMBlock names and offsets over the wire (since it can't
rely on the order of ramaddr being the same), and it starts out with
HVA fault addresses from the kernel.
qemu_ram_block_from_host translates a HVA into a RAMBlock, an offset
in the RAMBlock and the global ram_addr_t value.
Rewrite qemu_ram_addr_from_host to use qemu_ram_block_from_host.
Provide qemu_ram_get_idstr since its the actual name text sent on the
wire.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The HOST_PAGE_ALIGN macros don't work until the page size variables
have been set up; later in postcopy I use those macros in the RAM
code, and it can be triggered using -object.
Fix this by initialising page_size_init() earlier - it's currently
initialised inside the accelerators, move it up into vl.c.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The migration code generally is built target-independent, however
there are a few places where knowing the target page size would
avoid artificially moving stuff into migration/ram.c.
Provide 'qemu_target_page_bits()' that returns TARGET_PAGE_BITS
to other bits of code so that they can stay target-independent.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add a flag that when set, will cause the primary CPU to start in secure
mode, even if the overall boot is non-secure. This is useful for when
there is a board-setup blob that needs to run from secure mode, but
device and secondary CPU init should still be done as-normal for a non-
secure boot.
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: d1170774d5446d715fced7739edfc61a5be931f9.1447007690.git.crosthwaite.peter@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We have several tests that perform multiple sub-actions that are
expected to fail. Asserting that an error occurred, then clearing
it up to prepare for the next action, turned into enough
boilerplate that it was sometimes forgotten (for example, a number
of tests added to test-qmp-input-visitor.c in d88f5fd leaked err).
Worse, if an error is not reset to NULL, we risk invalidating
later use of that error (passing a non-NULL err into a function
is generally a bad idea). Encapsulate the boilerplate into a
single helper function error_free_or_abort(), and consistently
use it.
The new function is added into error.c for use everywhere,
although it is anticipated that testsuites will be the main
client.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Adding an assertion to qobject_decref() will ensure that a
programming error causing use-after-free will result in
immediate failure (provided no other thread has started
using the memory) instead of silently attempting to wrap
refcnt around and leaving the problem to potentially bite
later at a harder point to diagnose.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1446791754-23823-4-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
To minimize code duplication, epoll is hooked into aio-posix's
aio_poll() instead of rolling its own. This approach also has both
compile-time and run-time switchability.
1) When QEMU starts with a small number of fds in the event loop, ppoll
is used.
2) When QEMU starts with a big number of fds, or when more devices are
hot plugged, epoll kicks in when the number of fds hits the threshold.
3) Some fds may not support epoll, such as tty based stdio. In this
case, it falls back to ppoll.
A rough benchmark with scsi-disk on virtio-scsi dataplane (epoll gets
enabled from 64 onward). Numbers are in MB/s.
===============================================
| master | epoll
| |
scsi disks # | read randrw | read randrw
-------------|----------------|----------------
1 | 86 36 | 92 45
8 | 87 43 | 86 41
64 | 71 32 | 70 38
128 | 48 24 | 58 31
256 | 37 19 | 57 28
===============================================
To comply with aio_{disable,enable}_external, we always use ppoll when
aio_external_disabled() is true.
[Removed #ifdef CONFIG_EPOLL around AioContext epollfd field declaration
since the field is also referenced outside CONFIG_EPOLL code.
--Stefan]
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1446177989-6702-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is the place to initialize platform specific bits of AioContext.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1446177989-6702-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This allows AioContext users to check the enable/disable state of
external clients.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1446177989-6702-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a Sysbus AHCI subclass for the Allwinner AHCI. It has a few extra
vendor specific registers which are used for phy and power init.
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 833b5b05ed5ade38bf69656679b0a7575e79492b.1445917756.git.crosthwaite.peter@gmail.com
[resolved patch context on pull --js]
Signed-off-by: John Snow <jsnow@redhat.com>
Also change the misleading definition of macro OBJECT_CLASS_CHECK
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJWPHM6AAoJEL/70l94x66DK5YIAJTNthYWL8eNhQ1iek6CLlV+
etVXm3JDmkV0zOfYVHLBb44VLZ6I1ocas+57F/kmz7SKpMLiI6bMXRxhTSkiO4D+
3N36cWQf3fq+P0DmxuikMlYGz8V6QQ5PQE2xJKV0ZIWAkiqInxilkN3qt81sNR+A
A9Ohom3sc0eGHyYJcVDK4krbnNSAZjIB2yMWperw61x+GYAhxjA02HPUgB32KK6q
KrdnKmnRu9Cw6y4wTCbbDITJztPexZYsX2DOJh30wC0eNcE+MZ7J2im8Frpxe+Ml
C8MUuvSqLOyeu9tUfrXGzd6kMtEKrmU+fh2nNbxJbtfowDjkW2jcIEgC0UjkGE4=
=BF1q
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream-replay' into staging
So here it is, let's see what happens.
# gpg: Signature made Fri 06 Nov 2015 09:30:34 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream-replay:
replay: recording of the user input
replay: command line options
replay: replay blockers for devices
replay: initialization and deinitialization
replay: ptimer
bottom halves: introduce bh call function
replay: checkpoints
icount: improve counting for record/replay
replay: shutdown event
replay: recording and replaying clock ticks
replay: asynchronous events infrastructure
replay: interrupts and exceptions
cpu: replay instructions sequence
cpu-exec: allow temporary disabling icount
replay: introduce icount event
replay: introduce mutex to protect the replay log
replay: internal functions for replay log
replay: global variables and function stubs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This records user input (keyboard and mouse events) in record mode and replays
these input events in replay mode.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162524.8676.11696.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Some devices are not supported by record/replay subsystem.
This patch introduces replay blocker which denies starting record/replay
if such devices are included into the configuration.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162512.8676.11367.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This patch introduces the functions for enabling the record/replay and for
freeing the resources when simulator closes.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162507.8676.90232.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This patch adds deterministic replay for hardware periodic countdown timers.
ptimer uses bottom halves layer to execute such an asynchronous callback.
We put this callback into the replay queue instead of bottom halves one.
When checkpoint is met by main loop thread, the replay queue is processed
and callback is executed. Binding callback moment to one of the checkpoints
makes it deterministic.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162456.8676.83366.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This patch introduces aio_bh_call function. It is used to execute
bottom halves as callbacks without adding them to the queue.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162450.8676.56980.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This patch introduces checkpoints that synchronize cpu thread and iothread.
When checkpoint is met in the code all asynchronous events from the queue
are executed.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162444.8676.52916.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This patch records and replays simulator shutdown event.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162433.8676.32262.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Clock ticks are considered as the sources of non-deterministic data for
virtual machine. This patch implements saving the clock values when they
are acquired (virtual, host clock).
When replaying the execution corresponding values are read from log and
transfered to the module, which wants to read the values.
Such a design required the clock polling to be synchronized. Sometimes
it is not true - e.g. when timeouts for timer lists are checked. In this case
we use a cached value of the clock, passing it to the client code.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162427.8676.36558.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This patch adds module for saving and replaying asynchronous events.
These events include network packets, keyboard and mouse input,
USB packets, thread pool and bottom halves callbacks.
All events are stored in the queue to be processed at synchronization points
such as beginning of TB execution, or checkpoint in the iothread.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162422.8676.88696.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This patch includes modifications of common cpu files. All interrupts and
exceptions occured during recording are written into the replay log.
These events allow correct replaying the execution by kicking cpu thread
when one of these events is found in the log.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162416.8676.57647.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
POPCNT is not available on Penryn and older and on Opteron_G2 and older,
and we want to make the default CPU runnable in most hosts, so it won't
be enabled by default in KVM mode.
We should eventually have all features supported by TCG enabled by
default in TCG mode, but as we don't have a good mechanism today to
ensure we have different defaults in KVM and TCG mode, disable POPCNT in
the qemu64 and qemu32 CPU models entirely.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
ABM is not available on Sandy Bridge and older, and we want to make the
default CPU runnable in most hosts, so it won't be enabled by default in
KVM mode.
We should eventually have all features supported by TCG enabled by
default in TCG mode, but as we don't have a good mechanism today to
ensure we have different defaults in KVM and TCG mode, disable ABM in
the qemu64 CPU model entirely.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
SSE4a is not available in any Intel CPU, and we want to make the default
CPU runnable in most hosts, so it doesn't make sense to enable it by
default in KVM mode.
We should eventually have all features supported by TCG enabled by
default in TCG mode, but as we don't have a good mechanism today to
ensure we have different defaults in KVM and TCG mode, disable SSE4a in
the qemu64 CPU model entirely.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The default CPU model (qemu64) have some issues today: it enables some
features (ABM and SSE4a) that are not present in many host CPUs. That
means many hosts (but not all of them) had those features silently
disabled in the default configuration in QEMU 2.4 and older.
With the new "check=on" default, this causes warnings to be printed in
the default configuration, because of the lack of SSE4A on all Intel
hosts, and the lack of ABM on Sandy Bridge and older hosts:
$ qemu-system-x86_64 -machine pc,accel=kvm
warning: host doesn't support requested feature: CPUID.80000001H:ECX.abm [bit 5]
warning: host doesn't support requested feature: CPUID.80000001H:ECX.sse4a [bit 6]
Those issues will be fixed in pc-*-2.5 and newer. But as we can't change
the guest ABI in pc-*-2.4, disable "check" mode by default in pc-*-2.4
and older so we don't print spurious warnings.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This patch adds calls to replay functions into the icount setup block.
In record mode number of executed instructions is written to the log.
In replay mode number of istructions to execute is taken from the replay log.
When replayed instructions counter is expired qemu_notify_event()
function is called to wake up the iothread.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162405.8676.31890.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch is required for deterministic replay to generate an exception
by trying executing an instruction without changing icount.
It adds new flag to TB for disabling icount while translating it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162359.8676.77011.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds icount event to the replay subsystem. This event corresponds
to execution of several instructions and used to synchronize input events
in the replay phase.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162354.8676.31351.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds global variables, defines, function declarations,
and function stubs for deterministic VM replay used by external modules.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20150917162337.8676.41538.stgit@PASHA-ISP.def.inno>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit de9d61e83d.
Now 'cpu_clean_all_dirty' is useless, we can revert the related code.
Conflicts:
include/sysemu/kvm.h
Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1446695464-27116-3-git-send-email-liang.z.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This makes the purpose of the function clearer: it is not about the
version of QEMU that's running, but the version string exposed in the
emulated hardware.
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: John Snow <jsnow@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1446233769-7892-3-git-send-email-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
'cleanup' seems more appropriate than 'cancel'.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
The function qemu_savevm_state_cancel is called after the migration
in migration_thread, it seems strange to 'cancel' it after completion,
rename it to qemu_savevm_state_cleanup looks better.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
Change armv7m_init to return the DeviceState* for the NVIC.
This allows access to all GPIO blocks, not just the IRQ inputs.
Move qdev_get_gpio_in() calls out of armv7m_init() into
board code for stellaris and stm32f205 boards.
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add an API for boards to inject their own preboot software (or
firmware) sequence.
The software then returns to the bootloader via the link register. This
allows boards to do their own little bits of firmware setup without
needed to replace the bootloader completely (which is the requirement
for existing firmware support).
The blob is loaded by a callback if and only if doing a linux boot
(similar to the existing write_secondary support).
Rewrite the comment for the primary boot blob.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 070295644c6ac84696d743913296e8cfefb48c15.1446182614.git.crosthwaite.peter@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This converts vga code to curses code in console_write_bh().
With this changes, we can see line graphics (for example, dialog uses)
correctly.
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Version: GnuPG v1
iQEcBAABAgAGBQJWMmDMAAoJEJykq7OBq3PIHkIIAKyL9iY4EipKrtdMoWZu/Kfm
I9g4NVVqPF4QmTfYpZVxWglvBy0g0+2p1h4DQ5KheUNr7DV2uchSSsN38MWnEgH/
XTRpY858jcWx4sSAvYpz+kUVRBEtJJL8a/1aTBvYRxcbNE1X1lm72m7mm4KXGGud
PZ0fdj/UODHeoTOnMHddbs8Rs0kdHhlckl2Mfkz2dUgYAuZMK7xR7OIE7kOqWBcR
p5/I1Jq3wgmp267ZPVNS17u8Cff2PIElv0Z3Ouubixhhf+k5kvLBtgTbTJ81h7/4
NfmIRwsmAPhtnDSDXqFJ8KgwUYpGYYtPrK8DIXIWYwPdSjkIIdl1gNtc2CGyV3w=
=nA50
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Thu 29 Oct 2015 18:09:16 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request:
block: Consider all child nodes in bdrv_requests_pending()
target-arm: xlnx-zynqmp: Add sdhci support.
sdhci: Split sdhci.h for public and internal device usage
sd.h: Move sd.h to include/hw/sd/
virtio: sync the dataplane vring state to the virtqueue before virtio_save
gdb command: qemu handlers
virtio-blk: switch off scsi-passthrough by default
ppc/spapr: add 2.4 compat props
s390x: include HW_COMPAT_* props
qemu-gdb: add $qemu_coroutine_sp and $qemu_coroutine_pc
qemu-gdb: extract parts of "qemu coroutine" implementation
qemu-gdb: allow using glibc_pointer_guard() on core dumps
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add two SYSBUS_SDHCI devices for xlnx-zynqmp
Signed-off-by: Sai Pavan Boddu <saipava@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Split sdhci.h into pubilc version (i.e include/hw/sd/sdhci.h) and
internal version (i.e hw/sd/sdhci-interna.h) based on register
declarations and object declaration.
Signed-off-by: Sai Pavan Boddu <saipava@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Create a sd directory under include/hw/ and move sd.h to
include/hw/sd/
Signed-off-by: Sai Pavan Boddu <saipava@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Devices that are compliant with virtio-1 do not support scsi
passthrough any more (and it has not been a recommended setup
anyway for quite some time). To avoid having to switch it off
explicitly in newer qemus that turn on virtio-1 by default, let's
switch the default to scsi=false for 2.5.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1444991154-79217-4-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QObject_HEAD is a macro expanding into the common part of structs that
are sub-types of QObject. It's always been just QObject base, and
unlikely to change. Drop the macro, because the code is clearer with
out it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1444918537-18107-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Enable PCIe device multi-function hot-add, just ensure function 0 is added
last, then driver will get the notification to scan the slot.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit aa8580cddf.
As described in
http://article.gmane.org/gmane.comp.emulators.qemu/371432
that commit causes linux guests to crash on memory hot-unplug.
The original problem it's trying to solve has now
been addressed within virtio.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio_map_sg currently fails if one of the entries it's mapping is
contigious in GPA but not HVA address space. Introduce virtio_map which
handles this by splitting sg entries.
This new API generally turns out to be a good idea since it's harder to
misuse: at least in one case the existing one was used incorrectly.
This will still fail if there's no space left in the sg, but luckily max
queue size in use is currently 256, while max sg size is 1024, so we
should be OK even is all entries happen to cross a single DIMM boundary.
Won't work well with very small DIMM sizes, unfortunately:
e.g. this will fail with 4K DIMMs where a single
request might span a large number of DIMMs.
Let's hope these are uncommon - at least we are not breaking things.
Note: virtio-scsi calls virtio_map_sg on data loaded from network, and
validates input, asserting on failure. Copy the validating code here -
it will be dropped from virtio-scsi in a follow-up patch.
Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Currently, if the kernel does not have live migration API, the migration
will still be attempted, but vGIC save/restore functions will just not do
anything. This will result in a broken machine state.
This patch fixes the problem by adding migration blocker if kernel API is
not supported.
Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJWLfISAAoJENro4Ql1lpzlHOsP/AkCzg2ontAGsZx+M1fCUn92
e0rRC14QQFkRGt1DchqWDnP5tkWkKeCi/gcyKVOHI6QcjcscxLNM3WVU0ZPw41ps
ZewbddKkDpTuv4yRGQGBe4BhcoMCYyuqfi1sfX19xqgM05SBjwk4kEGwSwZczz67
u1JSFAd4pjKj4Gfx8cLRk4GS4AyT5yvRW8GucrXKtF+Hhnk8Uq0wIvuBayHJvi9E
O40Jfg4fTU0QXYMI0keuYWhxJ12hStaUFgXANgelcuKOiUY+c3RzdFLKyL729Jf2
8PjyixxdPXKJCETCB/RxuPpS9cTifyBVL/0exVbzLvGk/W/9FTl782NxOHFEPcNc
CCnoZSEFUNtOzpvyf2K+xmbvuBYQ+5D272a7qvW1lMTgp0MvSfUMrh0qChrn/0j0
AJpAJOsf+Yverv5iY7/YcSAWbGCZWQypotPHQCd/9w0cXwQuY0V9Rm6PjSNc3SKi
3y7+5l6/sPyVBTDM6o23xd6Z9bRbliHzZd/zQEg6EYvlve2rtCJOlz5EBZAB/MTp
8SkHaKtTQVGFkw2YgF3HJGtc4EiqYwUh6vOV2CuFJO2yLhNrleKoCCXkBvtaX2ks
G3C9fr1mlqZYyAC1kDkHf6TywWkatBvSiiJLUOeWFG6CJ8c2YXuJEb8RqPcJ0j9c
pBBFeGc43sYxGIjdEQC9
=VOcb
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/elmarco/tags/ivshmem-pull-request' into staging
ivshmem series
# gpg: Signature made Mon 26 Oct 2015 09:27:46 GMT using RSA key ID 75969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/ivshmem-pull-request: (51 commits)
doc: document ivshmem & hugepages
ivshmem: use little-endian int64_t for the protocol
ivshmem: use kvm irqfd for msi notifications
ivshmem: rename MSI eventfd_table
ivshmem: remove EventfdEntry.vector
ivshmem: add hostmem backend
ivshmem: use qemu_strtosz()
ivshmem: do not keep shm_fd open
tests: add ivshmem qtest
qtest: add qtest_add_abrt_handler()
msix: implement pba write (but read-only)
contrib: remove unnecessary strdup()
ivshmem: add check on protocol version in QEMU
docs: update ivshmem device spec
ivshmem-server: fix hugetlbfs support
ivshmem-server: use a uint16 for client ID
ivshmem-client: check the number of vectors
contrib: add ivshmem client and server
util: const event_notifier_get_fd() argument
ivshmem: reset mask on device reset
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Send a protocol version as the first message from server, clients must
close communication if they don't support this protocol version. Older
QEMUs should be fine with this change in the protocol since they
overrides their own vm_id on reception of an id associated to no
eventfd.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[use fifo_update_and_get()]
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
ivshmem is going to use MSIX state conditionally.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
If a chardev is allowed to be created outside of QMP, then it must be
also possible to free it. This is useful for ivshmem that creates
chardev anonymously and must be able to free them.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Claudio Fontana <claudio.fontana@huawei.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJWKmeMAAoJEH8JsnLIjy/WUYYP/0hrpnuE14QBAW+RjV/40+fh
II3+RBX2Avz8ERWD29aAftqmNIigxPhdvzEdOP1/IRBnYNzBMUw6BTsqrV3IgA1X
ODIRFht3horyL6w5rfJLLbAVOyRPWTGHZNgxBN+GGy3Z/jLK+VH+1dK26rSd6p7o
QqsmBUPi5UQvSd89r+X1tVwFjT5Miw7CyFaijXdnVzs1LNpbtg49t4YpQH1eG5bf
aP4GXWn4g5/Ht8LSByuViDG3CpLjysSYSFPn/4HIP41BU6u3P6yD++g6nbdkvIsn
yDezoVpCEvKoYXfc1xGY3Q7+lwzV8wa5mzdtpy6eg2889dHoJuUePI6Yfza9TNJI
XzBJmYaBZx+289nxeAX2K3dRe0ilCEdWyujlhoonDuYOS9xbDiaouWcVZEw/0ky5
SUsRZYTZGGc1BOoFeBE4JpopFCPZ4a//bzi5GrlyEiwl7kpKPTMxFWvjSQpQ/Gzz
sPLxnn1y1AA4jAqgQNLFpCciJ1sH1WNmb00WjQkoEomIdpuvLvK1GUKfcwEERTWb
Ae8wlCbofkIJgQOwa9DTS/yDPfl3pUc/NgmRc+Qz/0snrtvmmsS+huJQQfCH1JDQ
p3jvurvQ7G5RkTzdOIbSkzfKaW8ZHq6ENWRP5HY/y8LontAVdYzT+DRLeyTpGfKL
ncgMgK6fT3rE+3lA8Acz
=xcrS
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Fri 23 Oct 2015 17:59:56 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream: (37 commits)
tests: Add test case for aio_disable_external
block: Add "drained begin/end" for internal snapshot
block: Add "drained begin/end" for transactional blockdev-backup
block: Add "drained begin/end" for transactional backup
block: Add "drained begin/end" for transactional external snapshot
block: Introduce "drained begin/end" API
aio: introduce aio_{disable,enable}_external
dataplane: Mark host notifiers' client type as "external"
nbd: Mark fd handlers client type as "external"
aio: Add "is_external" flag for event handlers
throttle: Remove throttle_group_lock/unlock()
blockdev: Allow more options for BB-less BDS tree
blockdev: Pull out blockdev option extraction
blockdev: Do not create BDS for empty drive
block: Prepare for NULL BDS
block: Add blk_insert_bs()
block: Prepare remaining BB functions for NULL BDS
block: Fail requests to empty BlockBackend
block: Make some BB functions fall back to BBRS
block: Add BlockBackendRootState
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The semantics is that after bdrv_drained_begin(bs), bs will not get new external
requests until the matching bdrv_drained_end(bs).
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers pass in false, and the real external ones will switch to
true in coming patches.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The group throttling code was always meant to handle its locking
internally. However, bdrv_swap() was touching the ThrottleGroup
structure directly and therefore needed an API for that.
Now that bdrv_swap() no longer exists there's no need for the
throttle_group_lock() API anymore.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function associates the given BlockDriverState with the given
BlockBackend.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This structure will store some of the state of the root BDS if the BDS
tree is removed, so that state can be restored once a new BDS tree is
inserted.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Throttle groups are not necessarily referenced by BDSs alone; a later
patch will essentially allow BBs to reference them, too. Make the
ref/unref functions public so that reference can be properly accounted
for.
Their interface is slightly adjusted in that they return and take a
ThrottleState pointer, respectively, instead of a ThrottleGroup pointer.
Functionally, they are equivalent, but since ThrottleGroup is not meant
to be used outside of block/throttle-groups.c, ThrottleState is easier
to handle.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These options are only relevant for the user of a whole BDS tree (like a
guest device or a block job) and should thus be moved into the
BlockBackend.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As the comment above bdrv_get_stats() says, BlockAcctStats is something
which belongs to the device instead of each BlockDriverState. This patch
therefore moves it into the BlockBackend.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockAcctStats contains statistics about the data transferred from and
to the device; wr_highest_sector does not fit in with the rest.
Furthermore, those statistics are supposed to be specific for a certain
device and not necessarily for a BDS (see the comment above
bdrv_get_stats()); on the other hand, wr_highest_sector may be a rather
important information to know for each BDS. When BlockAcctStats is
finally removed from the BDS, we will want to keep wr_highest_sector in
the BDS.
Finally, wr_highest_sector is renamed to wr_highest_offset and given the
appropriate meaning. Externally, it is represented as an offset so there
is no point in doing something different internally. Its definition is
changed to match that in qapi/block-core.json which is "the offset after
the greatest byte written to". Doing so should not cause any harm since
if external programs tried to calculate the volume usage by
(wr_highest_offset + 512) / volume_size, after this patch they will just
assume the volume to be full slightly earlier than before.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
guest_block_size is a guest device property so it should be moved into
the interface between block layer and guest devices, which is the
BlockBackend.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
blk_is_available() returns true iff the BDS is inserted (which means
blk_bs() is not NULL and bdrv_is_inserted() returns true) and if the
tray of the guest device is closed.
blk_is_inserted() is changed to return true only if blk_bs() is not
NULL.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make bdrv_is_inserted(), blk_is_inserted(), and the callback
BlockDriver.bdrv_is_inserted() return a bool.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The host cache information may not make sense for the guest if the VM
CPU topology doesn't match the host CPU topology. To make sure we won't
expose broken cache information to the guest, disable cache info
passthrough by default, and add a new "host-cache-info" property that
can be used to enable the old behavior for users that really need it.
Cc: Benoît Canet <benoit@irqsave.net>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Because of the way non-VFIO guest IOMMU operations are KVM accelerated, not
all TCE tables (guest IOMMU contexts) can support VFIO devices. Currently,
this is decided at creation time.
To support hotplug of VFIO devices, we need to allow a TCE table which
previously didn't allow VFIO devices to be switched so that it can. This
patch adds an spapr_tce_set_need_vfio() function to do this, by
reallocating the table in userspace if necessary.
Currently this doesn't allow the KVM acceleration to be re-enabled if all
the VFIO devices are removed. That's an optimization for another time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
The vfio_accel parameter used when creating a new TCE table (guest IOMMU
context) has a confusing name. What it really means is whether we need the
TCE table created to be able to support VFIO devices.
VFIO is relevant, because when available we use in-kernel acceleration of
the TCE table, but that may not work with VFIO devices because updates to
the table are handled in kernel, bypass qemu and so don't hit qemu's
infrastructure for keeping the VFIO host IOMMU state in sync with the guest
IOMMU state.
Rename the parameter to "need_vfio" throughout. This is a cosmetic change,
with no impact on the logic.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
At present the PCI host bridge (PHB) for the pseries machine type has a
fixed DMA window from 0..1GB (in PCI address space) which is mapped to real
memory via the PAPR paravirtualized IOMMU.
For better support of VFIO devices, we're going to want to allow for
different configurations of the DMA window.
Eventually we'll want to allow the guest itself to reconfigure the window
via the PAPR dynamic DMA window interface, but as a preliminary this patch
allows the user to reconfigure the window with new properties on the PHB
device.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>