* Added logic to check for the supported line width range before setting the line width to avoid errors.
I also moved the glLineWidth call so that it could be after the call to get the supported line width range for the desired line type.
* Moved the glLineWidth call outside the if/else
* Moved the code to query line GL_SMOOTH_LINE_WIDTH_RANGE and GL_ALIASED_LINE_WIDTH_RANGE to nv2a_gl_context_init(void) so that it's just called while OpenGL is being initialized.
* Removed the lineWidth local variable. It's simpler to just call glLineWidth in the if and else blocks
When vfio_enable_vectors() returns with less than requested nr_vectors
we retry with what kernel reported back. But the retry path doesn't
call vfio_prepare_kvm_msi_virq_batch() and this results in,
qemu-system-aarch64: vfio: Error: Failed to enable 4 MSI vectors, retry with 1
qemu-system-aarch64: ../hw/vfio/pci.c:602: vfio_commit_kvm_msi_virq_batch: Assertion `vdev->defer_kvm_irq_routing' failed
Fixes: dc580d51f7 ("vfio: defer to commit kvm irq routing when enable msi/msix")
Reviewed-by: Longpeng <longpeng2@huawei.com>
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit c17408892319712c12357e5d1c6b305499c58c2a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The kvm irqchip notifier is only registered if the device supports
INTx, however it's unconditionally removed in vfio realize error
path. If the assigned device does not support INTx, this will cause
QEMU to crash when vfio realize fails. Change it to conditionally
remove the notifier only if the notify hook is setup.
Before fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Connection closed by foreign host.
After fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Error: vfio 0000:81:11.1: xres and yres properties require display=on
(qemu)
Fixes: c5478fea27 ("vfio/pci: Respond to KVM irqchip change notifier")
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 357bd7932a136613d700ee8bc83e9165f059d1f7)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It is possible to store a very large value to the decrementer that it
does not raise the decrementer exception so the timer is scheduled, but
the next time value wraps and is treated as in the past.
This can occur if (u64)-1 is stored on a zero-triggered exception, or
(u64)-1 is stored twice on an underflow-triggered exception, for
example.
If such a value is set in DECAR, it gets stored to the decrementer by
the timer function, which then immediately causes another timer, which
hangs QEMU.
Clamp the decrementer to the implemented width, and use that as the
value for the timer calculation, effectively preventing this overflow.
Reported-by: sdicaro@DDCI.com
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530131214.373524-1-npiggin@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 09d2db9f46e38e2da990df8ad914d735d764251a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In the case where the console does not have gl capability, and
if blob is set to true, make sure that the display updates still
work. Commit e86a93f554 accidentally broke this by misplacing
the return statement (in resource_flush) causing the updates to
be silently ignored.
Fixes: e86a93f554 ("virtio-gpu: splitting one extended mode guest fb into n-scanouts")
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230623060454.3749910-1-vivek.kasireddy@intel.com>
(cherry picked from commit 34e29d85a7734802317c4cac9ad52b10d461c1dc)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
vhost_dev_start function does not release memory_listener object
in case of an error. This may crash the guest when vhost is unable
to set memory table:
stack trace of thread 125653:
Program terminated with signal SIGSEGV, Segmentation fault
#0 memory_listener_register (qemu-kvm + 0x6cda0f)
#1 vhost_dev_start (qemu-kvm + 0x699301)
#2 vhost_net_start (qemu-kvm + 0x45b03f)
#3 virtio_net_set_status (qemu-kvm + 0x665672)
#4 qmp_set_link (qemu-kvm + 0x548fd5)
#5 net_vhost_user_event (qemu-kvm + 0x552c45)
#6 tcp_chr_connect (qemu-kvm + 0x88d473)
#7 tcp_chr_new_client (qemu-kvm + 0x88cf83)
#8 tcp_chr_accept (qemu-kvm + 0x88b429)
#9 qio_net_listener_channel_func (qemu-kvm + 0x7ac07c)
#10 g_main_context_dispatch (libglib-2.0.so.0 + 0x54e2f)
Release memory_listener objects in the error path.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-Id: <20230529114333.31686-2-ppandit@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Fixes: c471ad0e9b ("vhost_net: device IOTLB support")
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1e3ffb34f764f8ac4c003b2b2e6a775b2b073a16)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Give current QEMU version string to SeaBIOS-hppa via fw_cfg interface so
that the firmware can show the QEMU version in the boot menu info.
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 069d296669448b9eef72c6332ae84af962d9582c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When the OS triggers a reboot, the reset helper function sends a
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET) together with an
EXCP_HLT exception to halt the CPUs.
So, at reboot when initializing the CPUs again, make sure to set all
instruction pointers to the firmware entry point, disable any interrupts,
disable data and instruction translations, enable PSW_Q bit and tell qemu
to unhalt (halted=0) the CPUs again.
This fixes the various reboot issues which were seen when rebooting a
Linux VM, including the case where even the monarch CPU has been virtually
halted from the OS (e.g. via "chcpu -d 0" inside the Linux VM).
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 50ba97e928b44ff5bc731c9ffe68d86acbe44639)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The nrf51_timer has a free-running counter which we implement using
the pattern of using two fields (update_counter_ns, counter) to track
the last point at which we calculated the counter value, and the
counter value at that time. Then we can find the current counter
value by converting the difference in wall-clock time between then
and now to a tick count that we need to add to the counter value.
Unfortunately the nrf51_timer's implementation of this has a bug
which means it loses time every time update_counter() is called.
After updating s->counter it always sets s->update_counter_ns to
'now', even though the actual point when s->counter hit the new value
will be some point in the past (half a tick, say). In the worst case
(guest code in a tight loop reading the counter, icount mode) the
counter is continually queried less than a tick after it was last
read, so s->counter never advances but s->update_counter_ns does, and
the guest never makes forward progress.
The fix for this is to only advance update_counter_ns to the
timestamp of the last tick, not all the way to 'now'. (This is the
pattern used in hw/misc/mps2-fpgaio.c's counter.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20230606134917.3782215-1-peter.maydell@linaro.org
(cherry picked from commit d2f9a79a8cf6ab992e1d0f27ad05b3e582d2b18a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In commit 2c5fa0778c3b430 we fixed an endianness bug in the Allwinner
A10 PIC model; however in the process we introduced a regression.
This is because the old code was robust against the incoming 'level'
argument being something other than 0 or 1, whereas the new code was
not.
In particular, the allwinner-sdhost code treats its IRQ line
as 0-vs-non-0 rather than 0-vs-1, so when the SD controller
set its IRQ line for any reason other than transmit the
interrupt controller would ignore it. The observed effect
was a guest timeout when rebooting the guest kernel.
Handle level values other than 0 or 1, to restore the old
behaviour.
Fixes: 2c5fa0778c3b430 ("hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()")
(Mjt: 5eb742fce5 in stable-7.2)
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20230606104609.3692557-2-peter.maydell@linaro.org
(cherry picked from commit f837b468cdaa7e736b5385c7dc4f8c5adcad3bf1)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As mentioned in docs/devel/style.rst "Automatic memory deallocation":
* Variables declared with g_auto* MUST always be initialized,
otherwise the cleanup function will use uninitialized stack memory
This avoids QEMU to coredump when running the "hash test" command
under Zephyr.
Cc: Steven Lee <steven_lee@aspeedtech.com>
Cc: Joel Stanley <joel@jms.id.au>
Cc: qemu-stable@nongnu.org
Fixes: c5475b3f9a ("hw: Model ASPEED's Hash and Crypto Engine")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <20230421131547.2177449-1-clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit c8f48b120b31f6bbe33135ef5d478e485c37e3c2)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Command "qemu-system-riscv64 -machine virt
-m 2G -smp 1 -numa node,mem=1G -numa node,mem=1G"
would trigger this problem.Backtrace with:
#0 0x0000555555b5b1a4 in riscv_numa_get_default_cpu_node_id at ../hw/riscv/numa.c:211
#1 0x00005555558ce510 in machine_numa_finish_cpu_init at ../hw/core/machine.c:1230
#2 0x00005555558ce9d3 in machine_run_board_init at ../hw/core/machine.c:1346
#3 0x0000555555aaedc3 in qemu_init_board at ../softmmu/vl.c:2513
#4 0x0000555555aaf064 in qmp_x_exit_preconfig at ../softmmu/vl.c:2609
#5 0x0000555555ab1916 in qemu_init at ../softmmu/vl.c:3617
#6 0x000055555585463b in main at ../softmmu/main.c:47
This commit fixes the issue by adding parameter checks.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com>
Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn>
Signed-off-by: Yin Wang <yin.wang@intel.com>
Message-Id: <20230519023758.1759434-1-yin.wang@intel.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit b9cedbf19cb4be04908a3a623f0f237875483499)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The printed offset value is prefixed with 0x, but was actually printed
in decimal. To spare others the confusion, adjust the format specifier
to hexadecimal.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com>
Reviewed-by: Jagannathan Raman <jag.raman@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 5fb9e8295531f957cf7ac20e89736c8963a25e04)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The 9p protocol does not specifically define how server shall behave when
client tries to open a special file, however from security POV it does
make sense for 9p server to prohibit opening any special file on host side
in general. A sane Linux 9p client for instance would never attempt to
open a special file on host side, it would always handle those exclusively
on its guest side. A malicious client however could potentially escape
from the exported 9p tree by creating and opening a device file on host
side.
With QEMU this could only be exploited in the following unsafe setups:
- Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
security model.
or
- Using 9p 'proxy' fs driver (which is running its helper daemon as
root).
These setups were already discouraged for safety reasons before,
however for obvious reasons we are now tightening behaviour on this.
Fixes: CVE-2023-2861
Reported-by: Yanwu Shen <ywsPlz@gmail.com>
Reported-by: Jietao Xiao <shawtao1125@gmail.com>
Reported-by: Jinku Li <jkli@xidian.edu.cn>
Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com>
(cherry picked from commit f6b0de53fb87ddefed348a39284c8e2f28dc4eda)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: drop adding qemu_fstat wrapper for 7.2 where wrappers aren't used)
When passing --smp with a number lower than XLNX_ZYNQMP_NUM_APU_CPUS,
the expression (ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS) will result
in a positive number as ms->smp.cpus is a unsigned int.
This will raise the following error afterwards, as Qemu will try to
instantiate some additional RPUs.
| $ qemu-system-aarch64 --smp 1 -M xlnx-zcu102
| **
| ERROR:../src/tcg/tcg.c:777:tcg_register_thread:
| assertion failed: (n < tcg_max_ctxs)
Signed-off-by: Clément Chigot <chigot@adacore.com>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20230524143714.565792-1-chigot@adacore.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c9ba1c9f02cfede5329f504cdda6fd3a256e0434)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When we receive a packet from the xilinx_axienet and then try to s2mem
through the xilinx_axidma, if the descriptor ring buffer is full in the
xilinx axidma driver, we’ll assert the DMASR.HALTED in the
function : stream_process_s2mem and return 0. In the end, we’ll be stuck in
an infinite loop in axienet_eth_rx_notify.
This patch checks the DMASR.HALTED state when we try to push data
from xilinx axi-enet to xilinx axi-dma. When the DMASR.HALTED is asserted,
we will not keep pushing the data and then prevent the infinte loop.
Signed-off-by: Tommy Wu <tommy.wu@sifive.com>
Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com>
Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-id: 20230519062137.1251741-1-tommy.wu@sifive.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 31afe04586efeccb80cc36ffafcd0e32a3245ffb)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit cef2e7148e32 ("hw/isa/i82378: Remove intermediate IRQ forwarder")
passes s->cpu_intr to i8259_init() in i82378_realize() directly. However, s-
>cpu_intr isn't initialized yet since that happens after the south bridge's
pci_realize_and_unref() in board code. Fix this by initializing s->cpu_intr
before realizing the south bridge.
Fixes: cef2e7148e32 ("hw/isa/i82378: Remove intermediate IRQ forwarder")
Signed-off-by: Bernhard Beschow <shentey@gmail.com>
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230304114043.121024-4-shentey@gmail.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 2237af5e60ada06d90bf714e85523deafd936b9b)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
QEMU aborts when default RAM backend should be used (i.e. no
explicit '-machine memory-backend=' specified) but user
has created an object which 'id' equals to default RAM backend
name used by board.
$QEMU -machine pc \
-object memory-backend-ram,id=pc.ram,size=4294967296
Actual results:
QEMU 7.2.0 monitor - type 'help' for more information
(qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
Aborted (core dumped)
Instead of abort, check for the conflicting 'id' and exit with
an error, suggesting how to remedy the issue.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2207886
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230522131717.3780533-1-imammedo@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a37531f2381c4e294e48b1417089474128388b44)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit b987718bbb1d0eabf95499b976212dd5f0120d75)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When the OHCI controller's framenumber is incremented, HccaPad1 register
should be set to zero (Ref OHCI Spec 4.4)
ReactOS uses hccaPad1 to determine if the OHCI hardware is running,
consequently it fails this check in current qemu master.
Signed-off-by: Ryan Wendland <wendland@live.com.au>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1048
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6301460ce9f59885e8feb65185bcfb6b128c8eff)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The bytes and packets counter registers are cleared on read.
Copying the "total counter" registers to the "good counter" registers has
side effects.
If the "total" register is never read by the OS, it only gets incremented.
This leads to exponential growth of the "good" register.
This commit increments the counters individually to avoid this.
Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 8d689f6aae8be096b4a1859be07c1b083865f755)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: removed hw/net/igb_core.c part: igb introduced in 8.0)
The Software Developer's Manual 13.7.4.5 "Packets Transmitted (64 Bytes)
Count" says:
> This register counts the number of packets transmitted that are
> exactly 64 bytes (from <Destination Address> through <CRC>,
> inclusively) in length.
It also says similar for the other Tx statistics registers. Add the
number of bytes for CRC to those registers.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c50b152485d4e10dfa1e1d7ea668f29a5fb92e9c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this for 7.2 too: a fix by its own and makes next patch to apply cleanly)
Ensure op_info is not NULL in case of QCRYPTODEV_BACKEND_ALG_SYM algtype.
Fixes: 0e660a6f90 ("crypto: Introduce RSA algorithm")
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Yiming Tao <taoym@zju.edu.cn>
Message-Id: <20230509075317.1132301-1-mcascell@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: zhenwei pi<pizhenwei@bytedance.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 3e69908907f8d3dd20d5753b0777a6e3824ba824)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context tweak after 999c789f00 cryptodev: Introduce cryptodev alg type in QAPI)
The commit 93a97dc520 ("virtio-net: enable vq reset feature") enables
unconditionally vq reset feature as long as the device is emulated.
This makes impossible to actually disable the feature, and it causes
migration problems from qemu version previous than 7.2.
The entire final commit is unneeded as device system already enable or
disable the feature properly.
This reverts commit 93a97dc520.
Fixes: 93a97dc520 ("virtio-net: enable vq reset feature")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230504101447.389398-1-eperezma@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1fac00f70b3261050af5564b20ca55c1b2a3059a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
QEMU invokes vhost_svq_add() when adding a guest's element
into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots()
to check whether QEMU can add the element into SVQ. If there is
enough space, then QEMU combines some out descriptors and some
in descriptors into one descriptor chain, and adds it into
`svq->vring.desc` by vhost_svq_vring_write_descs().
Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx`
in vhost_svq_available_slots() returns the number of occupied elements,
or the number of descriptor chains, instead of the number of occupied
descriptors, which may cause wrapping in SVQ descriptor ring.
Here is an example. In vhost_handle_guest_kick(), QEMU forwards
as many available buffers to device by virtqueue_pop() and
vhost_svq_add_element(). virtqueue_pop() returns a guest's element,
and then this element is added into SVQ by vhost_svq_add_element(),
a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and
vhost_svq_add_element() `svq->vring.num` times,
vhost_svq_available_slots() thinks QEMU just ran out of slots and
everything should work fine. But in fact, virtqueue_pop() returns
`svq->vring.num` elements or descriptor chains, more than
`svq->vring.num` descriptors due to guest memory fragmentation,
and this causes wrapping in SVQ descriptor ring.
This bug is valid even before marking the descriptors used.
If the guest memory is fragmented, SVQ must add chains
so it can try to add more descriptors than possible.
This patch solves it by adding `num_free` field in
VhostShadowVirtqueue structure and updating this field
in vhost_svq_add() and vhost_svq_get_buf(), to record
the number of free descriptors.
Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding")
Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230509084817.3973-1-yin31149@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit 5d410557dea452f6231a7c66155e29a37e168528)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Using linux 6.x guest, at boot time, an inquiry on a scsi-generic
device makes qemu crash. This is caused by a buffer overflow when
scsi-generic patches the block limits VPD page.
Do the operations on a temporary on-stack buffer that is guaranteed
to be large enough.
Reported-by: Théo Maillart <tmaillart@freebox.fr>
Analyzed-by: Théo Maillart <tmaillart@freebox.fr>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9bd634b2f5e2f10fe35d7609eb83f30583f2e15a)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This reverts commit a7f523c7d1.
The nested event loop is broken by design. It's only user was removed.
Drop the code as well so that nobody ever tries to use it again.
I had to fix a couple of trivial conflicts around return values because
of 025faa872b ("vhost-user: stick to -errno error return convention").
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20230119172424.478268-3-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
(cherry picked from commit 4382138f642f69fdbc79ebf4e93d84be8061191f)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This reverts commit db8a3772e3.
Motivation : this is breaking vhost-user with DPDK as reported in [0].
Received unexpected msg type. Expected 22 received 40
Fail to update device iotlb
Received unexpected msg type. Expected 40 received 22
Received unexpected msg type. Expected 22 received 11
Fail to update device iotlb
Received unexpected msg type. Expected 11 received 22
vhost VQ 1 ring restore failed: -71: Protocol error (71)
Received unexpected msg type. Expected 22 received 11
Fail to update device iotlb
Received unexpected msg type. Expected 11 received 22
vhost VQ 0 ring restore failed: -71: Protocol error (71)
unable to start vhost net: 71: falling back on userspace virtio
The failing sequence that leads to the first error is :
- QEMU sends a VHOST_USER_GET_STATUS (40) request to DPDK on the master
socket
- QEMU starts a nested event loop in order to wait for the
VHOST_USER_GET_STATUS response and to be able to process messages from
the slave channel
- DPDK sends a couple of legitimate IOTLB miss messages on the slave
channel
- QEMU processes each IOTLB request and sends VHOST_USER_IOTLB_MSG (22)
updates on the master socket
- QEMU assumes to receive a response for the latest VHOST_USER_IOTLB_MSG
but it gets the response for the VHOST_USER_GET_STATUS instead
The subsequent errors have the same root cause : the nested event loop
breaks the order by design. It lures QEMU to expect responses to the
latest message sent on the master socket to arrive first.
Since this was only needed for DAX enablement which is still not merged
upstream, just drop the code for now. A working solution will have to
be merged later on. Likely protect the master socket with a mutex
and service the slave channel with a separate thread, as discussed with
Maxime in the mail thread below.
[0] https://lore.kernel.org/qemu-devel/43145ede-89dc-280e-b953-6a2b436de395@redhat.com/
Reported-by: Yanghang Liu <yanghliu@redhat.com>
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2155173
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20230119172424.478268-2-groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
(cherry picked from commit f340a59d5a852d75ae34555723694c7e8eafbd0c)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Intel specifies that the Intel IGD must occupy slot 2 on the PCI bus,
as noted in docs/igd-assign.txt in the Qemu source code.
Currently, when the xl toolstack is used to configure a Xen HVM guest with
Intel IGD passthrough to the guest with the Qemu upstream device model,
a Qemu emulated PCI device will occupy slot 2 and the Intel IGD will occupy
a different slot. This problem often prevents the guest from booting.
The only available workarounds are not good: Configure Xen HVM guests to
use the old and no longer maintained Qemu traditional device model
available from xenbits.xen.org which does reserve slot 2 for the Intel
IGD or use the "pc" machine type instead of the "xenfv" machine type and
add the xen platform device at slot 3 using a command line option
instead of patching qemu to fix the "xenfv" machine type directly. The
second workaround causes some degredation in startup performance such as
a longer boot time and reduced resolution of the grub menu that is
displayed on the monitor. This patch avoids that reduced startup
performance when using the Qemu upstream device model for Xen HVM guests
configured with the igd-passthru=on option.
To implement this feature in the Qemu upstream device model for Xen HVM
guests, introduce the following new functions, types, and macros:
* XEN_PT_DEVICE_CLASS declaration, based on the existing TYPE_XEN_PT_DEVICE
* XEN_PT_DEVICE_GET_CLASS macro helper function for XEN_PT_DEVICE_CLASS
* typedef XenPTQdevRealize function pointer
* XEN_PCI_IGD_SLOT_MASK, the value of slot_reserved_mask to reserve slot 2
* xen_igd_reserve_slot and xen_igd_clear_slot functions
Michael Tsirkin:
* Introduce XEN_PCI_IGD_DOMAIN, XEN_PCI_IGD_BUS, XEN_PCI_IGD_DEV, and
XEN_PCI_IGD_FN - use them to compute the value of XEN_PCI_IGD_SLOT_MASK
The new xen_igd_reserve_slot function uses the existing slot_reserved_mask
member of PCIBus to reserve PCI slot 2 for Xen HVM guests configured using
the xl toolstack with the gfx_passthru option enabled, which sets the
igd-passthru=on option to Qemu for the Xen HVM machine type.
The new xen_igd_reserve_slot function also needs to be implemented in
hw/xen/xen_pt_stub.c to prevent FTBFS during the link stage for the case
when Qemu is configured with --enable-xen and --disable-xen-pci-passthrough,
in which case it does nothing.
The new xen_igd_clear_slot function overrides qdev->realize of the parent
PCI device class to enable the Intel IGD to occupy slot 2 on the PCI bus
since slot 2 was reserved by xen_igd_reserve_slot when the PCI bus was
created in hw/i386/pc_piix.c for the case when igd-passthru=on.
Move the call to xen_host_pci_device_get, and the associated error
handling, from xen_pt_realize to the new xen_igd_clear_slot function to
initialize the device class and vendor values which enables the checks for
the Intel IGD to succeed. The verification that the host device is an
Intel IGD to be passed through is done by checking the domain, bus, slot,
and function values as well as by checking that gfx_passthru is enabled,
the device class is VGA, and the device vendor in Intel.
Signed-off-by: Chuck Zmudzinski <brchuckz@aol.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <b1b4a21fe9a600b1322742dda55a40e9961daa57.1674346505.git.brchuckz@aol.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 4f67543bb8c5b031c2ad3785c1a2f3c255d72b25)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
xen_9pfs_free can't use gnttabdev since it is already closed and NULL-ed
out when free is called. Do the teardown in _disconnect(). This
matches the setup done in _connect().
trace-events are also added for the XenDevOps functions.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <20230502143722.15613-1-jandryuk@gmail.com>
[C.S.: - Remove redundant return in xen_9pfs_free().
- Add comment to trace-events. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
(cherry picked from commit 92e667f6fd5806a6a705a2a43e572bd9ec6819da)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: minor context conflict in hw/9pfs/xen-9p-backend.c)