Since commit 61f20b9dc5 ("spapr_nvram: Pre-initialize the NVRAM to
support the -prom-env parameter"), pseries machines can pre-initialize
the "system" partition in the NVRAM with the data passed to all -prom-env
parameters on the QEMU command line.
In this case it is assumed that all the data fits in 64 KiB, but the user
can easily pass more and crash QEMU:
$ qemu-system-ppc64 -M pseries $(for ((x=0;x<128;x++)); do \
echo -n " -prom-env " ; printf "%0.sx" {1..1024}; \
done) # this requires ~128 Kib
malloc(): corrupted top size
Aborted (core dumped)
This happens because we don't check if all the prom-env data fits in
the NVRAM and chrp_nvram_set_var() happily memcpy() it passed the
buffer.
This crash affects basically all ppc/ppc64 machine types that use -prom-env:
- pseries (all versions)
- g3beige
- mac99
and also sparc/sparc64 machine types:
- LX
- SPARCClassic
- SPARCbook
- SS-10
- SS-20
- SS-4
- SS-5
- SS-600MP
- Voyager
- sun4u
- sun4v
Add a max_len argument to chrp_nvram_create_system_partition() so that
it can check the available size before writing to memory.
Since NVRAM is populated at machine init, it seems reasonable to consider
this error as fatal. So, instead of reporting an error when we detect that
the NVRAM is too small and adapt all machine types to handle it, we simply
exit QEMU in all cases. This is still better than crashing. If someone
wants another behavior, I guess this can be reworked later.
Tested with:
$ yes q | \
(for arch in ppc ppc64 sparc sparc64; do \
echo == $arch ==; \
qemu=${arch}-softmmu/qemu-system-$arch; \
for mach in $($qemu -M help | awk '! /^Supported/ { print $1 }'); do \
echo $mach; \
$qemu -M $mach -monitor stdio -nodefaults -nographic \
$(for ((x=0;x<128;x++)); do \
echo -n " -prom-env " ; printf "%0.sx" {1..1024}; \
done) >/dev/null; \
done; echo; \
done)
Without the patch, affected machine types cause QEMU to report some
memory corruption and crash:
malloc(): corrupted top size
free(): invalid size
*** stack smashing detected ***: terminated
With the patch, QEMU prints the following message and exits:
NVRAM is too small. Try to pass less data to -prom-env
It seems that the conditions for the crash have always existed, but it
affects pseries, the machine type I care for, since commit 61f20b9dc5
only.
Fixes: 61f20b9dc5 ("spapr_nvram: Pre-initialize the NVRAM to support the -prom-env parameter")
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1867739
Reported-by: John Snow <jsnow@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159736033937.350502.12402444542194031035.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that kvmppc_xive_cpu_get_state() returns negative on error, use that
and get rid of the temporary Error object and error_propagate().
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707852916.1489912.8376334685349668124.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that kvmppc_xive_cpu_connect() returns a negative errno on failure,
use that and get rid of the local_err boilerplate.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707852234.1489912.16410314514265848075.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that all these functions return a negative errno on failure, check
that and get rid of the local_err boilerplate.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707851537.1489912.1030839306195472651.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that kvmppc_xive_cpu_get_state() and kvmppc_xive_cpu_set_state()
return negative errnos on failures, use that instead local_err because
it is the recommended practice. Also return that instead of -1 since
vmstate expects negative errnos.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707850840.1489912.14912810818646455474.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that all these functions return a negative errno on failure, check
that because it is preferred to local_err. And most of all, propagate it
because vmstate expects negative errnos.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707850148.1489912.18355118622296682631.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that kvmppc_xive_get_queues() returns a negative errno on failure, check
with that because it is preferred to local_err. And most of all, propagate
it because vmstate expects negative errnos.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707849455.1489912.6034461176847728064.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since kvm_device_access() returns a negative errno on failure, convert
kvmppc_xive_set_source_config() to use it for error checking. This allows
to get rid of the local_err boilerplate.
Propagate the return value so that callers may use it as well to check
failures.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707848764.1489912.17078842252160674523.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since kvmppc_xive_get_queue_config() has a return value, convert
kvmppc_xive_get_queues() to use it for error checking. This allows
to get rid of the local_err boiler plate.
Propagate the return value so that callers may use it as well to check
failures.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707848069.1489912.14879208798696134531.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since kvm_device_access() returns a negative errno on failure, convert
kvmppc_xive_get_queue_config() and kvmppc_xive_set_queue_config() to
use it for error checking. This allows to get rid of the local_err
boilerplate.
Propagate the return value so that callers may use it as well to check
failures.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707847357.1489912.2032291280645236480.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
kvm_set_one_reg() returns a negative errno on failure, use that instead
of errno. Also propagate it to callers so they can use it to check
for failures and hopefully get rid of their local_err boilerplate.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707846665.1489912.14267225652103441921.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Callers currently check failures of kvmppc_xive_mmap() through the
@errp argument, which isn't a recommanded practice. It is preferred
to use a return value when possible.
Since NULL isn't an invalid address in theory, it seems better to
return MAP_FAILED and to teach callers to handle it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707845972.1489912.719896767746375765.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since kvmppc_xive_source_reset_one() has a return value, convert
kvmppc_xive_source_reset() to use it for error checking. This
allows to get rid of the local_err boiler plate.
Propagate the return value so that callers may use it as well to check
failures.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707845245.1489912.9151822670764690034.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Use error_setg_errno() instead of error_setg(strerror()). While here,
use -ret instead of errno since kvm_vcpu_enable_cap() returns a negative
errno on failure.
Use ERRP_GUARD() to ensure that errp can be passed to error_append_hint(),
and get rid of the local_err boilerplate.
Propagate the return value so that callers may use it as well to check
failures.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707844549.1489912.4862921680328017645.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_phb_realize() function has a local_err variable which
is used to:
1) check failures of spapr_irq_findone() and spapr_irq_claim()
2) prepend extra information to the error message
Recent work from Markus Armbruster highlighted we get better
code when testing the return value of a function, rather than
setting up all the local_err boiler plate. For similar reasons,
it is now preferred to use ERRP_GUARD() and error_prepend()
rather than error_propagate_prepend().
Since spapr_irq_findone() and spapr_irq_claim() return negative
values in case of failure, do both changes.
This is just cleanup, no functional impact.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <159707843851.1489912.6108405733810934642.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All callers guard these functions with an xive_in_kernel() helper. Make
it clear that they are only to be called when the KVM XIVE device exists.
Note that the check on xive is dropped in kvmppc_xive_disconnect(). It
really cannot be NULL since it comes from set_active_intc() which only
passes pointers to allocated objects.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <159679994169.876294.11026653581505077112.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Calls to the KVM XIVE device are guarded by kvm_irqchip_in_kernel(). This
ensures that QEMU won't try to use the device if KVM is disabled or if
an in-kernel irqchip isn't required.
When using ic-mode=dual with the pseries machine, we have two possible
interrupt controllers: XIVE and XICS. The kvm_irqchip_in_kernel() helper
will return true as soon as any of the KVM device is created. It might
lure QEMU to think that the other one is also around, while it is not.
This is exactly what happens with ic-mode=dual at machine init when
claiming IRQ numbers, which must be done on all possible IRQ backends,
eg. RTAS event sources or the PHB0 LSI table : only the KVM XICS device
is active but we end up calling kvmppc_xive_source_reset_one() anyway,
which fails. This doesn't cause any trouble because of another bug :
kvmppc_xive_source_reset_one() lacks an error_setg() and callers don't
see the failure.
Most of the other kvmppc_xive_* functions have similar xive->fd
checks to filter out the case when KVM XIVE isn't active. It
might look safer to have idempotent functions but it doesn't
really help to understand what's going on when debugging.
Since we already have all the kvm_irqchip_in_kernel() in place,
also have the callers to check xive->fd as well before calling
KVM XIVE specific code. This is straight-forward for the spapr
specific XIVE code. Some more care is needed for the platform
agnostic XIVE code since it cannot access xive->fd directly.
Introduce new in_kernel() methods in some base XIVE classes
for this purpose and implement them only in spapr.
In all cases, we still need to call kvm_irqchip_in_kernel() so that
compilers can optimize the kvmppc_xive_* calls away when CONFIG_KVM
isn't defined, thus avoiding the need for stubs.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159679993438.876294.7285654331498605426.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Depending on whether XIVE is emultated or backed with a KVM XIVE device,
the ESB MMIOs of a XIVE source point to an I/O memory region or a mapped
memory region.
This is currently handled by checking kvm_irqchip_in_kernel() returns
false in xive_source_realize(). This is a bit awkward as we usually
need to do extra things when we're using the in-kernel backend, not
less. But most important, we can do better: turn the existing "xive.esb"
memory region into a plain container, introduce an "xive.esb-emulated"
I/O subregion and rename the existing "xive.esb" subregion in the KVM
code to "xive.esb-kvm". Since "xive.esb-kvm" is added with overlap
and a higher priority, it prevails over "xive.esb-emulated" (ie.
a guest using KVM XIVE will interact with "xive.esb-kvm" instead of
the default "xive.esb-emulated" region.
While here, consolidate the computation of the MMIO region size in
a common helper.
Suggested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159679992680.876294.7520540158586170894.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since this function begins with:
/* The KVM XIVE device is not in use */
if (!xive || xive->fd == -1) {
return;
}
we obviously don't need to check xive->fd again.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159673297296.766512.14780055521619233656.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the creation of the KVM XIVE device fails for some reasons, the
negative errno ends up in xive->fd, but the rest of the code assumes
that xive->fd either contains an open fd, ie. positive value, or -1.
This doesn't cause any misbehavior except kvmppc_xive_disconnect()
that will try to close(xive->fd) during rollback and likely be
rewarded with an EBADF.
Only set xive->fd with a open fd.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159673296585.766512.15404407281299745442.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When starting an L2 KVM guest with `ic-mode=dual,kernel-irqchip=on`,
QEMU fails with:
KVM is too old to support ic-mode=dual,kernel-irqchip=on
This error message was introduced to detect older KVM versions that
didn't allow destruction and re-creation of the XICS KVM device that
we do at reboot. But it is actually the same issue that we get with
nested guests : when running under pseries, KVM currently provides
a genuine XICS device (not the XICS-on-XIVE device that we get
under powernv) which doesn't support destruction/re-creation.
This will eventually be fixed in KVM but in the meantime, update
the error message and documentation to mention the nested case.
While here, mention that in "No XIVE support in KVM" section that
this can also happen with "guest OSes supporting XIVE" since
we check this at init time before starting the guest.
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1890290
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159664243614.622889.18307368735989783528.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Fix some typos in comments about code modeling coalescing points in the
XIVE routing engine (IVRE).
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Message-Id: <1595461434-27725-1-git-send-email-gromero@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Nested KVM HV only works if the kernel is using the radix MMU mode, ie.
the CPU is POWER9 and it is not running in some pre-power9 compat mode.
Otherwise, the KVM HV module fails to load in the guest with -ENODEV.
It might be painful for a user to discover this late that nested cannot
work with their setup. Erroring out at machine init instead seems to be
the best we can do.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <159491948127.188975.9621435875869177751.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We have a dedicated error API for hints. Use it instead of embedding
the hint in the error message, as recommanded in the "qapi/error.h"
header file.
While here, have cap_fwnmi_apply(), which already uses
error_append_hint(), to call ERRP_GUARD() as well.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <159594297421.8262.14314530897345809924.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When testing large LMB sizes (eg 4GB), I found a couple of places
that assume they are 32bit in size.
Signed-off-by: Anton Blanchard <anton@ozlabs.org>
Message-Id: <20200715004228.1262681-1-anton@ozlabs.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This likely affects other, less popular host architectures as well.
Less common host architectures under linux get QEMU_VMALLOC_ALIGN (from
which VIRTIO_MEM_MIN_BLOCK_SIZE is derived) define to a variable of
type uintptr, which isn't compatible with the format specifier used to
print a user message. Since this particular usage of the underlying data
seems unique to this file, the simple fix is to just cast
QEMU_VMALLOC_ALIGN to uint32_t, which corresponds to the format specifier
used.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-Id: <20200730130519.168475-1-brogers@suse.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
An assertion failure issue was found in the code that processes network packets
while adding data fragments into the packet context. It could be abused by a
malicious guest to abort the QEMU process on the host. This patch replaces the
affected assert() with a conditional statement, returning false if the current
data fragment exceeds max_raw_frags.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The imx_epit device has a software-controllable reset triggered by
setting the SWR bit in the CR register. An error in commit cc2722ec83
means that we will end up assert()ing if the guest does this, because
the code in imx_epit_write() starts ptimer transactions, and then
imx_epit_reset() also starts ptimer transactions, triggering
"ptimer_transaction_begin: Assertion `!s->in_transaction' failed".
The cleanest way to avoid this double-transaction is to move the
start-transaction for the CR write handling down below the check of
the SWR bit.
Fixes: https://bugs.launchpad.net/qemu/+bug/1880424
Fixes: cc2722ec83
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200727154550.3409-1-peter.maydell@linaro.org
The nrf51 SoC model wasn't setting the system_clock_scale
global.which meant that if guest code used the systick timer in "use
the processor clock" mode it would hang because time never advances.
Set the global to match the documented CPU clock speed for this SoC.
This SoC in fact doesn't have a SysTick timer (which is the only thing
currently that cares about the system_clock_scale), because it's
a configurable option in the Cortex-M0. However our Cortex-M0 and
thus our nrf51 and our micro:bit board do provide a SysTick, so
we ought to provide a functional one rather than a broken one.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20200727193458.31250-1-peter.maydell@linaro.org
The MSF2 SoC model and the Stellaris board code both wire
SYSRESETREQ up to a function that just invokes
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET);
This is now the default action that the NVIC does if the line is
not connected, so we can delete the handling code.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200728103744.6909-4-peter.maydell@linaro.org
The NVIC provides an outbound qemu_irq "SYSRESETREQ" which it signals
when the guest sets the SYSRESETREQ bit in the AIRCR register. This
matches the hardware design (where the CPU has a signal of this name
and it is up to the SoC to connect that up to an actual reset
mechanism), but in QEMU it mostly results in duplicated code in SoC
objects and bugs where SoC model implementors forget to wire up the
SYSRESETREQ line.
Provide a default behaviour for the case where SYSRESETREQ is not
actually connected to anything: use qemu_system_reset_request() to
perform a system reset. This will allow us to remove the
implementations of SYSRESETREQ handling from the boards where that's
exactly what it does, and also fixes the bugs in the board models
which forgot to wire up the signal:
* microbit
* mps2-an385
* mps2-an505
* mps2-an511
* mps2-an521
* musca-a
* musca-b1
* netduino
* netduinoplus2
We still allow the board to wire up the signal if it needs to, in case
we need to model more complicated reset controller logic or to model
buggy SoC hardware which forgot to wire up the line itself. But
defaulting to "reset the system" is more often going to be correct
than defaulting to "do nothing".
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200728103744.6909-3-peter.maydell@linaro.org
The netduino2 and netduinoplus2 boards forgot to set the system_clock_scale
global, which meant that if guest code used the systick timer in "use
the processor clock" mode it would hang because time never advances.
Set the global to match the documented CPU clock speed of these boards.
Judging by the data sheet this is slightly simplistic because the
SoC allows configuration of the SYSCLK source and frequency via the
RCC (reset and clock control) module, but we don't model that.
Fixes: https://bugs.launchpad.net/qemu/+bug/1876187
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20200727162617.26227-1-peter.maydell@linaro.org
As pointed out by Peter, g_memdup(ms->loadparm, sizeof(ms->loadparm) + 1)
reads one past of the end of ms->loadparm, so g_memdup() can not be used
here.
Let's use g_strndup instead!
Fixes: d664548328 ("s390x/s390-virtio-ccw: fix loadparm property getter")
Fixes: Coverity CID 1431058
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200730130156.35063-1-pasic@linux.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We try to check whether a peer is VDPA in order to get config from
there - with no peer, this leads to a NULL
pointer dereference. Add a check before trying to access the peer
type. No peer means not VDPA.
Fixes: 108a64818e ("vhost-vdpa: introduce vhost-vdpa backend")
Cc: Cindy Lu <lulu@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
We should use the index passed by the caller instead of the queue_sel
when checking the enablement of a specific virtqueue.
This is reported in https://bugzilla.redhat.com/show_bug.cgi?id=1702608
Fixes: f19bcdfedd ("virtio-pci: implement queue_enabled method")
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Minor bugfixes all over the places, including one CVE.
Additionally, a fix for an ancient bug in migration -
one has to wonder how come no one noticed.
The fix is also non-trivial since we dare not break all
existing machine types with pci - we have a work around
in the works, for now we just skip the work-around for
old machine types.
Great job by Hogan Wang noticing, debugging and fixing it,
and thanks to Dr. David Alan Gilbert for reviewing the patches.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAl8e9CIPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpsAIH/2EEq9rLpjqMJdzRvjq3/UAHsvm42zeTnJl7
81cM887Mrg2Nd7MXFoxurLK5UEehTzlD2DRTvaDFfJaJlrtkPM2QEU2X/6c3syAS
GbmOQaljQtR4zEFE81t84mZQS025Gp0s+uble7KvtXakgp1A/vdu93OEvJkhtRY8
JBdRMlTt2T0eizvHn1obBKjaQN7tAUKl5KagHWxP1ApGU0YibUbrBadpJ18ZcKMl
vwB3dwmoi4f7AjuC0GnxYKp7kC/MMhUPFoDxQKI7d+wMGFnbsAF4sBIN9EZKeOkv
xT2InNSAzk/PTSuQpnDnZQjmrf4dPuL/GNJ8vQk27eaFfVchJyc=
=Bu6o
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,pci: bugfixes
Minor bugfixes all over the places, including one CVE.
Additionally, a fix for an ancient bug in migration -
one has to wonder how come no one noticed.
The fix is also non-trivial since we dare not break all
existing machine types with pci - we have a work around
in the works, for now we just skip the work-around for
old machine types.
Great job by Hogan Wang noticing, debugging and fixing it,
and thanks to Dr. David Alan Gilbert for reviewing the patches.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 27 Jul 2020 16:34:58 BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
virtio-pci: fix virtio_pci_queue_enabled()
MAINTAINERS: Cover the firmware JSON schema
vhost-vdpa :Fix Coverity CID 1430270 / CID 1420267
libvhost-user: Report descriptor index on panic
Fix vhost-user buffer over-read on ram hot-unplug
hw/pci-host: save/restore pci host config register
virtio-mem-pci: force virtio version 1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In legacy mode, virtio_pci_queue_enabled() falls back to
virtio_queue_enabled() to know if the queue is enabled.
But virtio_queue_enabled() calls again virtio_pci_queue_enabled()
if k->queue_enabled is set. This ends in a crash after a stack
overflow.
The problem can be reproduced with
"-device virtio-net-pci,disable-legacy=off,disable-modern=true
-net tap,vhost=on"
And a look to the backtrace is very explicit:
...
#4 0x000000010029a438 in virtio_queue_enabled ()
#5 0x0000000100497a9c in virtio_pci_queue_enabled ()
...
#130902 0x000000010029a460 in virtio_queue_enabled ()
#130903 0x0000000100497a9c in virtio_pci_queue_enabled ()
#130904 0x000000010029a460 in virtio_queue_enabled ()
#130905 0x0000000100454a20 in vhost_net_start ()
...
This patch fixes the problem by introducing a new function
for the legacy case and calls it from virtio_pci_queue_enabled().
It also calls it from virtio_queue_enabled() to avoid code duplication.
Fixes: f19bcdfedd ("virtio-pci: implement queue_enabled method")
Cc: Jason Wang <jasowang@redhat.com>
Cc: Cindy Lu <lulu@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200727153319.43716-1-lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When booting an EL3 cpu with -kernel, we set up EL3 and then
drop down to EL2. We need to enable access to v8.5-MemTag
tag allocation at EL3 before doing so.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200724163853.504655-3-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When booting an EL3 cpu with -kernel, we set up EL3 and then
drop down to EL2. We need to enable access to v8.3-PAuth
keys and instructions at EL3 before doing so.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200724163853.504655-2-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SDRAM Memory Controller has a 32-bit address bus, thus
supports up to 4 GiB of DRAM. There is a signed to unsigned
conversion error with the AST2600 maximum memory size:
(uint64_t)(2048 << 20) = (uint64_t)(-2147483648)
= 0xffffffff40000000
= 16 EiB - 2 GiB
Fix by using the IEC suffixes which are usually safer, and add
an assertion check to verify the memory is valid. This would have
caught this bug:
$ qemu-system-arm -M ast2600-evb
qemu-system-arm: hw/misc/aspeed_sdmc.c:258: aspeed_sdmc_realize: Assertion `asc->max_ram_size < 4 * GiB' failed.
Aborted (core dumped)
Fixes: 1550d72679 ("aspeed/sdmc: Add AST2600 support")
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
data_length is a constant value, so we use assert instead of
condition check.
Signed-off-by: Dongjiu Geng <gengdongjiu@huawei.com>
Message-id: 20200622113146.33421-1-gengdongjiu@huawei.com
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the function vhost_vdpa_dma_map/unmap, The struct msg was not initialized all its fields.
Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20200710064642.24505-1-lulu@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS vhost-user protocol
feature introduced a shadow-table, used by the backend to dynamically
determine how a vdev's memory regions have changed since the last
vhost_user_set_mem_table() call. On hot-remove, a memmove() operation
is used to overwrite the removed shadow region descriptor(s). The size
parameter of this memmove was off by 1 such that if a VM with a backend
supporting the VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS filled it's
shadow-table (by performing the maximum number of supported hot-add
operatons) and attempted to remove the last region, Qemu would read an
out of bounds value and potentially crash.
This change fixes the memmove() bounds such that this erroneous read can
never happen.
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1594799958-31356-1-git-send-email-raphael.norwitz@nutanix.com>
Fixes: f1aeb14b08 ("Transmit vhost-user memory regions individually")
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The pci host config register is used to save PCI address for
read/write config data. If guest writes a value to config register,
and then QEMU pauses the vcpu to migrate, after the migration, the guest
will continue to write pci config data, and the write data will be ignored
because of new qemu process losing the config register state.
To trigger the bug:
1. guest is booting in seabios.
2. guest enables the SMRAM in seabios:piix4_apmc_smm_setup, and then
expects to disable the SMRAM by pci_config_writeb.
3. after guest writes the pci host config register, QEMU pauses vcpu
to finish migration.
4. guest write of config data(0x0A) fails to disable the SMRAM because
the config register state is lost.
5. guest continues to boot and crashes in ipxe option ROM due to SMRAM
in enabled state.
Example Reproducer:
step 1. Make modifications to seabios and qemu for increase reproduction
efficiency, write 0xf0 to 0x402 port notify qemu to stop vcpu after
0x0cf8 port wrote i440 configure register. qemu stop vcpu when catch
0x402 port wrote 0xf0.
seabios:/src/hw/pci.c
@@ -52,6 +52,11 @@ void pci_config_writeb(u16 bdf, u32 addr, u8 val)
writeb(mmconfig_addr(bdf, addr), val);
} else {
outl(ioconfig_cmd(bdf, addr), PORT_PCI_CMD);
+ if (bdf == 0 && addr == 0x72 && val == 0xa) {
+ dprintf(1, "stop vcpu\n");
+ outb(0xf0, 0x402); // notify qemu to stop vcpu
+ dprintf(1, "resume vcpu\n");
+ }
outb(val, PORT_PCI_DATA + (addr & 3));
}
}
qemu:hw/char/debugcon.c
@@ -60,6 +61,9 @@ static void debugcon_ioport_write(void *opaque, hwaddr addr, uint64_t val,
printf(" [debugcon: write addr=0x%04" HWADDR_PRIx " val=0x%02" PRIx64 "]\n", addr, val);
#endif
+ if (ch == 0xf0) {
+ vm_stop(RUN_STATE_PAUSED);
+ }
/* XXX this blocks entire thread. Rewrite to use
* qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(&s->chr, &ch, 1);
step 2. start vm1 by the following command line, and then vm stopped.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
-netdev tap,ifname=tap-test,id=hostnet0,vhost=on,downscript=no,script=no\
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
-device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
-chardev file,id=seabios,path=/var/log/test.seabios,append=on\
-device isa-debugcon,iobase=0x402,chardev=seabios\
-monitor stdio
step 3. start vm2 to accept vm1 state.
$ qemu-system-x86_64 -machine pc-i440fx-5.0,accel=kvm\
-netdev tap,ifname=tap-test1,id=hostnet0,vhost=on,downscript=no,script=no\
-device virtio-net-pci,netdev=hostnet0,id=net0,bus=pci.0,addr=0x13,bootindex=3\
-device cirrus-vga,id=video0,vgamem_mb=16,bus=pci.0,addr=0x2\
-chardev file,id=seabios,path=/var/log/test.seabios,append=on\
-device isa-debugcon,iobase=0x402,chardev=seabios\
-monitor stdio \
-incoming tcp:127.0.0.1:8000
step 4. execute the following qmp command in vm1 to migrate.
(qemu) migrate tcp:127.0.0.1:8000
step 5. execute the following qmp command in vm2 to resume vcpu.
(qemu) cont
Before this patch, we get KVM "emulation failure" error on vm2.
This patch fixes it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Hogan Wang <hogan.wang@huawei.com>
Message-Id: <20200727084621.3279-1-hogan.wang@huawei.com>
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Trying to run simple virtio-mem-pci examples currently fails with
qemu-system-x86_64: -device virtio-mem-pci,id=vm0,memdev=mem0,node=0,
requested-size=300M: device is modern-only, use disable-legacy=on
due to the added safety checks in 9b3a35ec82 ("virtio: verify that legacy
support is not accidentally on").
As noted by Conny, we have to force virtio version 1. While at it, use
qdev_realize() to set the parent bus and realize - like most other
virtio-*-pci implementations.
Fixes: 0b9a2443a4 ("virtio-pci: Proxy for virtio-mem")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200727115905.129397-1-david@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Don't send the trailing 0 from the string.
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1592215252-26742-2-git-send-email-frederic.konrad@adacore.com>
Message-Id: <20200724064509.331-4-alex.bennee@linaro.org>