Shameer Kolothum [Tue, 13 Jun 2023 14:09:43 +0000 (15:09 +0100)]
vfio/pci: Call vfio_prepare_kvm_msi_virq_batch() in MSI retry path
When vfio_enable_vectors() returns with less than requested nr_vectors
we retry with what kernel reported back. But the retry path doesn't
call vfio_prepare_kvm_msi_virq_batch() and this results in,
qemu-system-aarch64: vfio: Error: Failed to enable 4 MSI vectors, retry with 1
qemu-system-aarch64: ../hw/vfio/pci.c:602: vfio_commit_kvm_msi_virq_batch: Assertion `vdev->defer_kvm_irq_routing' failed
Fixes: dc580d51f7dd ("vfio: defer to commit kvm irq routing when enable msi/msix") Reviewed-by: Longpeng <longpeng2@huawei.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit c17408892319712c12357e5d1c6b305499c58c2a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Zhenzhong Duan [Thu, 29 Jun 2023 08:40:38 +0000 (16:40 +0800)]
vfio/pci: Fix a segfault in vfio_realize
The kvm irqchip notifier is only registered if the device supports
INTx, however it's unconditionally removed in vfio realize error
path. If the assigned device does not support INTx, this will cause
QEMU to crash when vfio realize fails. Change it to conditionally
remove the notifier only if the notify hook is setup.
Before fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Connection closed by foreign host.
After fix:
(qemu) device_add vfio-pci,host=81:11.1,id=vfio1,bus=root1,xres=1
Error: vfio 0000:81:11.1: xres and yres properties require display=on
(qemu)
Fixes: c5478fea27ac ("vfio/pci: Respond to KVM irqchip change notifier") Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit 357bd7932a136613d700ee8bc83e9165f059d1f7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Tue, 30 May 2023 13:12:12 +0000 (23:12 +1000)]
target/ppc: Fix decrementer time underflow and infinite timer loop
It is possible to store a very large value to the decrementer that it
does not raise the decrementer exception so the timer is scheduled, but
the next time value wraps and is treated as in the past.
This can occur if (u64)-1 is stored on a zero-triggered exception, or
(u64)-1 is stored twice on an underflow-triggered exception, for
example.
If such a value is set in DECAR, it gets stored to the decrementer by
the timer function, which then immediately causes another timer, which
hangs QEMU.
Clamp the decrementer to the implemented width, and use that as the
value for the timer calculation, effectively preventing this overflow.
Reported-by: sdicaro@DDCI.com Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530131214.373524-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 09d2db9f46e38e2da990df8ad914d735d764251a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Laurent Vivier [Fri, 2 Jun 2023 16:27:35 +0000 (18:27 +0200)]
vhost: fix vhost_dev_enable_notifiers() error case
in vhost_dev_enable_notifiers(), if virtio_bus_set_host_notifier(true)
fails, we call vhost_dev_disable_notifiers() that executes
virtio_bus_set_host_notifier(false) on all queues, even on queues that
have failed to be initialized.
This triggers a core dump in memory_region_del_eventfd():
virtio_bus_set_host_notifier: unable to init event notifier: Too many open files (-24)
vhost VQ 1 notifier binding failed: 24
.../softmmu/memory.c:2611: memory_region_del_eventfd: Assertion `i != mr->ioeventfd_nb' failed.
Fix the problem by providing to vhost_dev_disable_notifiers() the
number of queues to disable.
Fixes: 8771589b6f81 ("vhost: simplify vhost_dev_enable_notifiers") Cc: longpeng2@huawei.com Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20230602162735.3670785-1-lvivier@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 92099aa4e9a3bb6856c290afaf41c76f9e3dd9fd) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eugenio Pérez [Fri, 2 Jun 2023 17:33:28 +0000 (19:33 +0200)]
vdpa: mask _F_CTRL_GUEST_OFFLOADS for vhost vdpa devices
QEMU does not emulate it so it must be disabled as long as the backend
does not support it.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602173328.1917385-1-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit 51e84244a7799172f4239482199e9b4bdcd23172) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Tue, 27 Jun 2023 06:14:06 +0000 (16:14 +1000)]
icount: don't adjust virtual time backwards after warp
The icount-based QEMU_CLOCK_VIRTUAL runs ahead of the RT clock at times.
When warping, it is possible it is still ahead at the end of the warp,
which causes icount adaptive mode to adjust it backward. This can result
in the machine observing time going backwards.
Prevent this by clamping adaptive adjustment to 0 at minimum.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20230627061406.241847-1-npiggin@gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 67f85346ca9305d9fb3254ceff735ceaadeb0911) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This commit breaks "-drive if=pflash,readonly=on,file=image.iso". It
claims to merely replace an open-coded version of blk_name() by a
call, but that's not the case. Sorry for the inconvenience!
Reported-by: Jakub Jermář <jakub@jermar.eu> Cc: qemu-stable@nongnu.org Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230515151104.1350155-1-armbru@redhat.com> Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit ac5e8c1dec246950d73e22dceab5cb36e82aac0b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Vivek Kasireddy [Fri, 23 Jun 2023 06:04:54 +0000 (23:04 -0700)]
virtio-gpu: Make non-gl display updates work again when blob=true
In the case where the console does not have gl capability, and
if blob is set to true, make sure that the display updates still
work. Commit e86a93f55463 accidentally broke this by misplacing
the return statement (in resource_flush) causing the updates to
be silently ignored.
Fixes: e86a93f55463 ("virtio-gpu: splitting one extended mode guest fb into n-scanouts") Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Marc-André Lureau <marcandre.lureau@redhat.com> Cc: Dongwon Kim <dongwon.kim@intel.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230623060454.3749910-1-vivek.kasireddy@intel.com>
(cherry picked from commit 34e29d85a7734802317c4cac9ad52b10d461c1dc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Ani Sinha [Mon, 19 Jun 2023 06:52:09 +0000 (12:22 +0530)]
vhost-vdpa: do not cleanup the vdpa/vhost-net structures if peer nic is present
When a peer nic is still attached to the vdpa backend, it is too early to free
up the vhost-net and vdpa structures. If these structures are freed here, then
QEMU crashes when the guest is being shut down. The following call chain
would result in an assertion failure since the pointer returned from
vhost_vdpa_get_vhost_net() would be NULL:
Therefore, we defer freeing up the structures until at guest shutdown
time when qemu_cleanup() calls net_cleanup() which then calls
qemu_del_net_client() which would eventually call vhost_vdpa_cleanup()
again to free up the structures. This time, the loop in net_cleanup()
ensures that vhost_vdpa_cleanup() will be called one last time when
all the peer nics are detached and freed.
All unit tests pass with this change.
CC: imammedo@redhat.com CC: jusual@redhat.com CC: mst@redhat.com Fixes: CVE-2023-3301
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230619065209.442185-1-anisinha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a0d7215e339b61c7d7a7b3fcf754954d80d93eb8) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context change for stable-8.0)
Eugenio Pérez [Fri, 2 Jun 2023 17:34:51 +0000 (19:34 +0200)]
vdpa: fix not using CVQ buffer in case of error
Bug introducing when refactoring. Otherway, the guest never received
the used buffer.
Fixes: be4278b65fc1 ("vdpa: extract vhost_vdpa_net_cvq_add from vhost_vdpa_net_handle_ctrl_avail") Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602173451.1917999-1-eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit d45243bcfc61a3c34f96a4fc34bffcb9929daba0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Prasad Pandit [Mon, 29 May 2023 11:43:33 +0000 (17:13 +0530)]
vhost: release virtqueue objects in error path
vhost_dev_start function does not release virtqueue objects when
event_notifier_init() function fails. Release virtqueue objects
and log a message about function failure.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-Id: <20230529114333.31686-3-ppandit@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Fixes: f9a09ca3ea ("vhost: add support for configure interrupt") Reviewed-by: Peter Xu <peterx@redhat.com> Cc: qemu-stable@nongnu.org Acked-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 77ece20ba04582d94c345ac0107ddff2fd18d27a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Prasad Pandit [Mon, 29 May 2023 11:43:32 +0000 (17:13 +0530)]
vhost: release memory_listener object in error path
vhost_dev_start function does not release memory_listener object
in case of an error. This may crash the guest when vhost is unable
to set memory table:
Release memory_listener objects in the error path.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
Message-Id: <20230529114333.31686-2-ppandit@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Fixes: c471ad0e9b ("vhost_net: device IOTLB support") Cc: qemu-stable@nongnu.org Acked-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1e3ffb34f764f8ac4c003b2b2e6a775b2b073a16) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Helge Deller [Sat, 24 Jun 2023 09:45:52 +0000 (11:45 +0200)]
target/hppa: Update to SeaBIOS-hppa version 8
Update SeaBIOS-hppa to version 8.
Fixes:
- boot of HP-UX with SMP, and
- reboot of Linux and HP-UX with SMP
Enhancements:
- show qemu version in boot menu
- adds exit menu entry in boot menu to quit emulation
- allow to trace PCD_CHASSIS codes & machine run status
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 34ec3aea54368a92b62a55c656335885ba8c65ef) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Helge Deller [Tue, 20 Jun 2023 19:39:47 +0000 (21:39 +0200)]
target/hppa: New SeaBIOS-hppa version 7
Update SeaBIOS-hppa to version 7 which fixes a boot problem
with Debian-12 install CD images.
The problem with Debian-12 is, that the ramdisc got bigger
than what the firmware could load in one call to the LSI
scsi driver.
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit bb9c998ca9343d445c76b69fa15dea9db692f526) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this one before picking next 34ec3aea54368a92b6 "SeaBIOS-hppa version 8")
Helge Deller [Fri, 23 Jun 2023 06:24:30 +0000 (08:24 +0200)]
target/hppa: Fix OS reboot issues
When the OS triggers a reboot, the reset helper function sends a
qemu_system_reset_request(SHUTDOWN_CAUSE_GUEST_RESET) together with an
EXCP_HLT exception to halt the CPUs.
So, at reboot when initializing the CPUs again, make sure to set all
instruction pointers to the firmware entry point, disable any interrupts,
disable data and instruction translations, enable PSW_Q bit and tell qemu
to unhalt (halted=0) the CPUs again.
This fixes the various reboot issues which were seen when rebooting a
Linux VM, including the case where even the monarch CPU has been virtually
halted from the OS (e.g. via "chcpu -d 0" inside the Linux VM).
Signed-off-by: Helge Deller <deller@gmx.de>
(cherry picked from commit 50ba97e928b44ff5bc731c9ffe68d86acbe44639) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Tue, 20 Jun 2023 16:20:24 +0000 (17:20 +0100)]
pc-bios/keymaps: Use the official xkb name for Arabic layout, not the legacy synonym
The xkb official name for the Arabic keyboard layout is 'ara'.
However xkb has for at least the past 15 years also permitted it to
be named via the legacy synonym 'ar'. In xkeyboard-config 2.39 this
synoynm was removed, which breaks compilation of QEMU:
FAILED: pc-bios/keymaps/ar
/home/fred/qemu-git/src/qemu/build-full/qemu-keymap -f pc-bios/keymaps/ar -l ar
xkbcommon: ERROR: Couldn't find file "symbols/ar" in include paths
xkbcommon: ERROR: 1 include paths searched:
xkbcommon: ERROR: /usr/share/X11/xkb
xkbcommon: ERROR: 3 include paths could not be added:
xkbcommon: ERROR: /home/fred/.config/xkb
xkbcommon: ERROR: /home/fred/.xkb
xkbcommon: ERROR: /etc/xkb
xkbcommon: ERROR: Abandoning symbols file "(unnamed)"
xkbcommon: ERROR: Failed to compile xkb_symbols
xkbcommon: ERROR: Failed to compile keymap
The upstream xkeyboard-config change removing the compat
mapping is:
https://gitlab.freedesktop.org/xkeyboard-config/xkeyboard-config/-/commit/470ad2cd8fea84d7210377161d86b31999bb5ea6
Make QEMU always ask for the 'ara' xkb layout, which should work on
both older and newer xkeyboard-config. We leave the QEMU name for
this keyboard layout as 'ar'; it is not the only one where our name
for it deviates from the xkb standard name.
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20230620162024.1132013-1-peter.maydell@linaro.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1709
(cherry picked from commit 497fad38979c16b6412388927401e577eba43d26) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Thu, 22 Jun 2023 13:08:23 +0000 (14:08 +0100)]
host-utils: Avoid using __builtin_subcll on buggy versions of Apple Clang
We use __builtin_subcll() to do a 64-bit subtract with borrow-in and
borrow-out when the host compiler supports it. Unfortunately some
versions of Apple Clang have a bug in their implementation of this
intrinsic which means it returns the wrong value. The effect is that
a QEMU built with the affected compiler will hang when emulating x86
or m68k float80 division.
The upstream LLVM issue is:
https://github.com/llvm/llvm-project/issues/55253
The commit that introduced the bug apparently never made it into an
upstream LLVM release without the subsequent fix
https://github.com/llvm/llvm-project/commit/fffb6e6afdbaba563189c1f715058ed401fbc88d
but unfortunately it did make it into Apple Clang 14.0, as shipped
in Xcode 14.3 (14.2 is reported to be OK). The Apple bug number is FB12210478.
Add ifdefs to avoid use of __builtin_subcll() on Apple Clang version
14 or greater. There is not currently a version of Apple Clang which
has the bug fix -- when one appears we should be able to add an upper
bound to the ifdef condition so we can start using the builtin again.
We make the lower bound a conservative "any Apple clang with major
version 14 or greater" because the consequences of incorrectly
disabling the builtin when it would work are pretty small and the
consequences of not disabling it when we should are pretty bad.
Many thanks to those users who both reported this bug and also
did a lot of work in identifying the root cause; in particular
to Daniel Bertalan and osy.
Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1631
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1659 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Tested-by: Daniel Bertalan <dani@danielbertalan.dev> Tested-by: Tested-By: Solra Bizna <solra@bizna.name>
Message-id: 20230622130823.1631719-1-peter.maydell@linaro.org
(cherry picked from commit b0438861efe1dfbdfdd9fa1d9aa05100d37ea8ee) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
target/tricore: Add CHECK_REG_PAIR() for insn accessing 64 bit regs
some insns were not checking if an even index was used to access a 64
bit register. In the worst case that could lead to a buffer overflow as
reported in https://gitlab.com/qemu-project/qemu/-/issues/1698.
Reported-by: Siqi Chen <coc.cyqh@gmail.com> Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20230612113245.56667-4-kbastian@mail.uni-paderborn.de>
(cherry picked from commit 6991777ec4b2a344d47bddec62744bedd9883d78) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Siqi Chen [Mon, 12 Jun 2023 11:32:42 +0000 (13:32 +0200)]
target/tricore: Fix out-of-bounds index in imask instruction
When translating "imask" instruction of Tricore architecture, QEMU did not check whether the register index was out of bounds, resulting in a global-buffer-overflow.
Peter Maydell [Tue, 6 Jun 2023 13:49:17 +0000 (14:49 +0100)]
hw/timer/nrf51_timer: Don't lose time when timer is queried in tight loop
The nrf51_timer has a free-running counter which we implement using
the pattern of using two fields (update_counter_ns, counter) to track
the last point at which we calculated the counter value, and the
counter value at that time. Then we can find the current counter
value by converting the difference in wall-clock time between then
and now to a tick count that we need to add to the counter value.
Unfortunately the nrf51_timer's implementation of this has a bug
which means it loses time every time update_counter() is called.
After updating s->counter it always sets s->update_counter_ns to
'now', even though the actual point when s->counter hit the new value
will be some point in the past (half a tick, say). In the worst case
(guest code in a tight loop reading the counter, icount mode) the
counter is continually queried less than a tick after it was last
read, so s->counter never advances but s->update_counter_ns does, and
the guest never makes forward progress.
The fix for this is to only advance update_counter_ns to the
timestamp of the last tick, not all the way to 'now'. (This is the
pattern used in hw/misc/mps2-fpgaio.c's counter.)
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 20230606134917.3782215-1-peter.maydell@linaro.org
(cherry picked from commit d2f9a79a8cf6ab992e1d0f27ad05b3e582d2b18a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Tue, 6 Jun 2023 10:46:08 +0000 (11:46 +0100)]
hw/intc/allwinner-a10-pic: Handle IRQ levels other than 0 or 1
In commit 2c5fa0778c3b430 we fixed an endianness bug in the Allwinner
A10 PIC model; however in the process we introduced a regression.
This is because the old code was robust against the incoming 'level'
argument being something other than 0 or 1, whereas the new code was
not.
In particular, the allwinner-sdhost code treats its IRQ line
as 0-vs-non-0 rather than 0-vs-1, so when the SD controller
set its IRQ line for any reason other than transmit the
interrupt controller would ignore it. The observed effect
was a guest timeout when rebooting the guest kernel.
Handle level values other than 0 or 1, to restore the old
behaviour.
Fixes: 2c5fa0778c3b430 ("hw/intc/allwinner-a10-pic: Don't use set_bit()/clear_bit()")
(Mjt: af08c70ef5204fe in stable-8.0) Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20230606104609.3692557-2-peter.maydell@linaro.org
(cherry picked from commit f837b468cdaa7e736b5385c7dc4f8c5adcad3bf1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 19 Jun 2023 10:20:18 +0000 (11:20 +0100)]
target/arm: Return correct result for LDG when ATA=0
The LDG instruction loads the tag from a memory address (identified
by [Xn + offset]), and then merges that tag into the destination
register Xt. We implemented this correctly for the case when
allocation tags are enabled, but didn't get it right when ATA=0:
instead of merging the tag bits into Xt, we merged them into the
memory address [Xn + offset] and then set Xt to that.
Merge the tag bits into the old Xt value, as they should be.
Cc: qemu-stable@nongnu.org Fixes: c15294c1e36a7dd9b25 ("target/arm: Implement LDG, STG, ST2G instructions") Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 7e2788471f9e079fff696a694721a7d41a451839) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Mon, 19 Jun 2023 10:20:18 +0000 (11:20 +0100)]
target/arm: Fix return value from LDSMIN/LDSMAX 8/16 bit atomics
The atomic memory operations are supposed to return the old memory
data value in the destination register. This value is not
sign-extended, even if the operation is the signed minimum or
maximum. (In the pseudocode for the instructions the returned data
value is passed to ZeroExtend() to create the value in the register.)
We got this wrong because we were doing a 32-to-64 zero extend on the
result for 8 and 16 bit data values, rather than the correct amount
of zero extension.
Fix the bug by using ext8u and ext16u for the MO_8 and MO_16 data
sizes rather than ext32u.
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230602155223.2040685-2-peter.maydell@linaro.org
(cherry picked from commit 243705aa6ea3465b20e9f5a8bfcf36d3153f3c10) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As mentioned in docs/devel/style.rst "Automatic memory deallocation":
* Variables declared with g_auto* MUST always be initialized,
otherwise the cleanup function will use uninitialized stack memory
This avoids QEMU to coredump when running the "hash test" command
under Zephyr.
Cc: Steven Lee <steven_lee@aspeedtech.com> Cc: Joel Stanley <joel@jms.id.au> Cc: qemu-stable@nongnu.org Fixes: c5475b3f9a ("hw: Model ASPEED's Hash and Crypto Engine") Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-Id: <20230421131547.2177449-1-clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit c8f48b120b31f6bbe33135ef5d478e485c37e3c2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Yin Wang [Fri, 19 May 2023 02:37:58 +0000 (10:37 +0800)]
hw/riscv: qemu crash when NUMA nodes exceed available CPUs
Command "qemu-system-riscv64 -machine virt
-m 2G -smp 1 -numa node,mem=1G -numa node,mem=1G"
would trigger this problem.Backtrace with:
#0 0x0000555555b5b1a4 in riscv_numa_get_default_cpu_node_id at ../hw/riscv/numa.c:211
#1 0x00005555558ce510 in machine_numa_finish_cpu_init at ../hw/core/machine.c:1230
#2 0x00005555558ce9d3 in machine_run_board_init at ../hw/core/machine.c:1346
#3 0x0000555555aaedc3 in qemu_init_board at ../softmmu/vl.c:2513
#4 0x0000555555aaf064 in qmp_x_exit_preconfig at ../softmmu/vl.c:2609
#5 0x0000555555ab1916 in qemu_init at ../softmmu/vl.c:3617
#6 0x000055555585463b in main at ../softmmu/main.c:47
This commit fixes the issue by adding parameter checks.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com> Reviewed-by: Weiwei Li <liweiwei@iscas.ac.cn> Signed-off-by: Yin Wang <yin.wang@intel.com>
Message-Id: <20230519023758.1759434-1-yin.wang@intel.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit b9cedbf19cb4be04908a3a623f0f237875483499) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Tue, 30 May 2023 13:04:47 +0000 (23:04 +1000)]
target/ppc: Fix PMU hflags calculation
Some of the PMU hflags bits can go out of synch, for example a store to
MMCR0 with PMCjCE=1 fails to update hflags correctly and results in
hflags mismatch:
This can be reproduced by running perf on a recent machine.
Some of the fragility here is the duplication of PMU hflags calculations.
This change consolidates that in a single place to update pmu-related
hflags, to be called after a well defined state changes.
The post-load PMU update is pulled out of the MSR update because it does
not depend on the MSR value.
Fixes: 8b3d1c49a9f0 ("target/ppc: Add new PMC HFLAGS") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230530130447.372617-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 6494d2c1fd4ebc37b575130399a97a1fcfff1afc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Tue, 30 May 2023 13:21:27 +0000 (23:21 +1000)]
target/ppc: Fix nested-hv HEAI delivery
ppc hypervisors turn HEAI interrupts into program interrupts injected
into the guest that executed the illegal instruction, if the hypervisor
doesn't handle it some other way.
The nested-hv implementation failed to account for this HEAI->program
conversion. The virtual hypervisor wants to see the HEAI when running
a nested guest, so that interrupt type can be returned to its KVM
caller.
Fixes: 7cebc5db2eba6 ("target/ppc: Introduce a vhyp framework for nested HV support") Cc: balaton@eik.bme.hu Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20230530132127.385001-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 6c242e79b876b3570b8fd2f10f2a502467758e56) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Mon, 5 Jun 2023 02:54:42 +0000 (12:54 +1000)]
target/ppc: Fix lqarx to set cpu_reserve
lqarx does not set cpu_reserve, which causes stqcx. to never succeed.
Cc: qemu-stable@nongnu.org Fixes: 94bf2658676 ("target/ppc: Use atomic load for LQ and LQARX") Fixes: 57b38ffd0c6 ("target/ppc: Use tcg_gen_qemu_{ld,st}_i128 for LQARX, LQ, STQ") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230605025445.161932-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit e025e8f5a8a7e32409bb4c7c509d752486113188) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The printed offset value is prefixed with 0x, but was actually printed
in decimal. To spare others the confusion, adjust the format specifier
to hexadecimal.
Signed-off-by: Mattias Nissler <mnissler@rivosinc.com> Reviewed-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 5fb9e8295531f957cf7ac20e89736c8963a25e04) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
9pfs: prevent opening special files (CVE-2023-2861)
The 9p protocol does not specifically define how server shall behave when
client tries to open a special file, however from security POV it does
make sense for 9p server to prohibit opening any special file on host side
in general. A sane Linux 9p client for instance would never attempt to
open a special file on host side, it would always handle those exclusively
on its guest side. A malicious client however could potentially escape
from the exported 9p tree by creating and opening a device file on host
side.
With QEMU this could only be exploited in the following unsafe setups:
- Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
security model.
or
- Using 9p 'proxy' fs driver (which is running its helper daemon as
root).
These setups were already discouraged for safety reasons before,
however for obvious reasons we are now tightening behaviour on this.
Fixes: CVE-2023-2861 Reported-by: Yanwu Shen <ywsPlz@gmail.com> Reported-by: Jietao Xiao <shawtao1125@gmail.com> Reported-by: Jinku Li <jkli@xidian.edu.cn> Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn> Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Reviewed-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com>
(cherry picked from commit f6b0de53fb87ddefed348a39284c8e2f28dc4eda) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Mark Somerville [Thu, 6 Apr 2023 12:45:31 +0000 (13:45 +0100)]
qga: Fix suspend on Linux guests without systemd
Allow the Linux guest agent to attempt each of the suspend methods
(systemctl, pm-* and writing to /sys) in turn.
Prior to this guests without systemd failed to suspend due to
`guest_suspend` returning early regardless of the return value of
`systemd_supports_mode`.
Signed-off-by: Mark Somerville <mark@qpok.net> Reviewed-by: Konstantin Kostiuk <kkostiuk@redhat.com> Signed-off-by: Konstantin Kostiuk <kkostiuk@redhat.com>
(cherry picked from commit 86dcb6ab9b603450eb6d896cdc95286de2c7d561) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
David Woodhouse [Wed, 12 Apr 2023 18:50:59 +0000 (19:50 +0100)]
hw/xen: Fix memory leak in libxenstore_open() for Xen
There was a superfluous allocation of the XS handle, leading to it
being leaked on both the error path and the success path (where it gets
allocated again).
Fixes: ba2a92db1ff6 ("hw/xen: Add xenstore operations to allow redirection to internal emulation") Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20230412185102.441523-3-dwmw2@infradead.org> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 8442232eba1b041b379ca5845df8252c1e905e43) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Thu, 30 Mar 2023 15:26:13 +0000 (17:26 +0200)]
hw/mips/malta: Fix the malta machine on big endian hosts
Booting a Linux kernel with the malta machine is currently broken
on big endian hosts. The cpu_to_gt32 macro wants to byteswap a value
for little endian targets only, but uses the wrong way to do this:
cpu_to_[lb]e32 works the other way round on big endian hosts! Fix
it by using the same ways on both, big and little endian hosts.
Fixes: 0c8427baf0 ("hw/mips/malta: Use bootloader helper to set BAR registers") Cc: qemu-stable@nongnu.org
Message-Id: <20230330152613.232082-1-thuth@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit dc96009afd8cf2372fa1bbced0bcbcbb2c5d6f1b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fixes: 076d4d39b65f ("s390x/cpumodel: wire up cpu type + id for TCG") Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230605113950.1169228-2-iii@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 71b11cbe1c34411238703abe24bfaf2e9712c30d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Ilya Leoshkevich [Wed, 10 May 2023 23:02:12 +0000 (01:02 +0200)]
linux-user/s390x: Fix single-stepping SVC
Currently single-stepping SVC executes two instructions. The reason is
that EXCP_DEBUG for the SVC instruction itself is masked by EXCP_SVC.
Fix by re-raising EXCP_DEBUG.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230510230213.330134-2-iii@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 01b9990a3fb84bb9a14017255ab1a4fa86588215) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Peter Maydell [Tue, 23 May 2023 13:17:26 +0000 (14:17 +0100)]
target/arm: Explicitly select short-format FSR for M-profile
For M-profile, there is no guest-facing A-profile format FSR, but we
still use the env->exception.fsr field to pass fault information from
the point where a fault is raised to the code in
arm_v7m_cpu_do_interrupt() which interprets it and sets the M-profile
specific fault status registers. So it doesn't matter whether we
fill in env->exception.fsr in the short format or the LPAE format, as
long as both sides agree. As it happens arm_v7m_cpu_do_interrupt()
assumes short-form.
In compute_fsr_fsc() we weren't explicitly choosing short-form for
M-profile, but instead relied on it falling out in the wash because
arm_s1_regime_using_lpae_format() would be false. This was broken in
commit 452c67a4 when we added v8R support, because we said "PMSAv8 is
always LPAE format" (as it is for v8R), forgetting that we were
implicitly using this code path on M-profile. At that point we would
hit a g_assert_not_reached():
ERROR:../../target/arm/internals.h:549:arm_fi_to_lfsc: code should not be reached
#7 0x0000555555e055f7 in arm_fi_to_lfsc (fi=0x7fffecff9a90) at ../../target/arm/internals.h:549
#8 0x0000555555e05a27 in compute_fsr_fsc (env=0x555557356670, fi=0x7fffecff9a90, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff9a1c)
at ../../target/arm/tlb_helper.c:95
#9 0x0000555555e05b62 in arm_deliver_fault (cpu=0x555557354800, addr=268961344, access_type=MMU_INST_FETCH, mmu_idx=1, fi=0x7fffecff9a90)
at ../../target/arm/tlb_helper.c:132
#10 0x0000555555e06095 in arm_cpu_tlb_fill (cs=0x555557354800, address=268961344, size=1, access_type=MMU_INST_FETCH, mmu_idx=1, probe=false, retaddr=0)
at ../../target/arm/tlb_helper.c:260
The specific assertion changed when commit fcc7404eff24b4c added
"assert not M-profile" to arm_is_secure_below_el3(), because the
conditions being checked in compute_fsr_fsc() include
arm_el_is_aa64(), which will end up calling arm_is_secure_below_el3()
and asserting before we try to call arm_fi_to_lfsc():
#7 0x0000555555efaf43 in arm_is_secure_below_el3 (env=0x5555574665a0) at ../../target/arm/cpu.h:2396
#8 0x0000555555efb103 in arm_is_el2_enabled (env=0x5555574665a0) at ../../target/arm/cpu.h:2448
#9 0x0000555555efb204 in arm_el_is_aa64 (env=0x5555574665a0, el=1) at ../../target/arm/cpu.h:2509
#10 0x0000555555efbdfd in compute_fsr_fsc (env=0x5555574665a0, fi=0x7fffecff99e0, target_el=1, mmu_idx=1, ret_fsc=0x7fffecff996c)
Avoid the assertion and the incorrect FSR format selection by
explicitly making M-profile use the short-format in this function.
Fixes: 452c67a42704 ("target/arm: Enable TTBCR_EAE for ARMv8-R AArch32")a
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1658 Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230523131726.866635-1-peter.maydell@linaro.org
(cherry picked from commit d7fe699be54b2cbb8e4ee37b63588b3458a49da7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Clément Chigot [Wed, 24 May 2023 14:37:14 +0000 (16:37 +0200)]
hw/arm/xlnx-zynqmp: fix unsigned error when checking the RPUs number
When passing --smp with a number lower than XLNX_ZYNQMP_NUM_APU_CPUS,
the expression (ms->smp.cpus - XLNX_ZYNQMP_NUM_APU_CPUS) will result
in a positive number as ms->smp.cpus is a unsigned int.
This will raise the following error afterwards, as Qemu will try to
instantiate some additional RPUs.
| $ qemu-system-aarch64 --smp 1 -M xlnx-zcu102
| **
| ERROR:../src/tcg/tcg.c:777:tcg_register_thread:
| assertion failed: (n < tcg_max_ctxs)
Signed-off-by: Clément Chigot <chigot@adacore.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Tested-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Message-id: 20230524143714.565792-1-chigot@adacore.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit c9ba1c9f02cfede5329f504cdda6fd3a256e0434) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Tommy Wu [Thu, 25 May 2023 09:37:51 +0000 (10:37 +0100)]
hw/dma/xilinx_axidma: Check DMASR.HALTED to prevent infinite loop.
When we receive a packet from the xilinx_axienet and then try to s2mem
through the xilinx_axidma, if the descriptor ring buffer is full in the
xilinx axidma driver, we’ll assert the DMASR.HALTED in the
function : stream_process_s2mem and return 0. In the end, we’ll be stuck in
an infinite loop in axienet_eth_rx_notify.
This patch checks the DMASR.HALTED state when we try to push data
from xilinx axi-enet to xilinx axi-dma. When the DMASR.HALTED is asserted,
we will not keep pushing the data and then prevent the infinte loop.
Signed-off-by: Tommy Wu <tommy.wu@sifive.com> Reviewed-by: Edgar E. Iglesias <edgar@zeroasic.com> Reviewed-by: Frank Chang <frank.chang@sifive.com>
Message-id: 20230519062137.1251741-1-tommy.wu@sifive.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 31afe04586efeccb80cc36ffafcd0e32a3245ffb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
ui/sdl2: disable SDL_HINT_GRAB_KEYBOARD on Windows
Windows sends an extra left control key up/down input event for
every right alt key up/down input event for keyboards with
international layout. Since commit 830473455f ("ui/sdl2: fix
handling of AltGr key on Windows") QEMU uses a Windows low level
keyboard hook procedure to reliably filter out the special left
control key and to grab the keyboard on Windows.
The SDL2 version 2.0.16 introduced its own Windows low level
keyboard hook procedure to grab the keyboard. Windows calls this
callback before the QEMU keyboard hook procedure. This disables
the special left control key filter when the keyboard is grabbed.
To fix the problem, disable the SDL2 Windows low level keyboard
hook procedure.
Reported-by: Bernhard Beschow <shentey@gmail.com> Signed-off-by: Volker Rümelin <vr_qemu@t-online.de> Reviewed-by: Thomas Huth <thuth@redhat.com> Tested-by: Bernhard Beschow <shentey@gmail.com>
Message-Id: <20230418062823.5683-1-vr_qemu@t-online.de>
(cherry picked from commit 1dfea3f212e43bfd59d1e1f40b9776db440b211f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Bernhard Beschow [Mon, 17 Apr 2023 19:21:39 +0000 (21:21 +0200)]
ui/sdl2: Grab Alt+F4 also under Windows
SDL doesn't grab Alt+F4 under Windows by default. Pressing Alt+F4 thus closes
the VM immediately without confirmation, possibly leading to data loss. Fix
this by always grabbing Alt+F4 on Windows hosts, too.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20230417192139.43263-3-shentey@gmail.com>
(cherry picked from commit 083db9db44c89d7ea7f81844302194d708bcff2b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Bernhard Beschow [Mon, 17 Apr 2023 19:21:38 +0000 (21:21 +0200)]
ui/sdl2: Grab Alt+Tab also in fullscreen mode
By default, SDL grabs Alt+Tab only in non-fullscreen mode. This causes Alt+Tab
to switch tasks on the host rather than in the VM in fullscreen mode while it
switches tasks in non-fullscreen mode in the VM. Fix this confusing behavior
by grabbing Alt+Tab in fullscreen mode, always causing tasks to be switched in
the VM.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20230417192139.43263-2-shentey@gmail.com>
(cherry picked from commit efc00a37090eced53bff8b42d26991252aaacc44) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
commit 4814d3cbf ("ui/dbus: restrict opengl to gbm-enabled config")
assumes that whenever GBM is available, OpenGL is. This is not always
the case, let's further restrict opengl-related paths and fix some
compilation issues.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230515132348.1024663-1-marcandre.lureau@redhat.com>
(cherry picked from commit 0b31e48d62c8f3a282d1bffbcc0e90200df9f9f0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Erico Nunes [Mon, 20 Mar 2023 16:08:56 +0000 (17:08 +0100)]
ui/gtk-egl: fix scaling for cursor position in scanout mode
vc->gfx.w and vc->gfx.h are not updated appropriately in this code path,
which leads to a different scaling factor for rendering the cursor on
some edge cases (e.g. the focus has left and re-entered the gtk window).
This can be reproduced using vhost-user-gpu with the gtk ui on the x11
backend.
Use the surface dimensions which are already updated accordingly.
Signed-off-by: Erico Nunes <ernunes@redhat.com> Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230320160856.364319-2-ernunes@redhat.com>
(cherry picked from commit f8a951bb951140a585341c700ebeec58d83f7bbc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Erico Nunes [Mon, 20 Mar 2023 16:08:55 +0000 (17:08 +0100)]
ui/gtk: use widget size for cursor motion event
The gd_motion_event size has some calculations for the cursor position,
which also take into account things like different size of the
framebuffer compared to the window size.
The use of window size makes things more difficult though, as at least
in the case of Wayland includes the size of ui elements like a menu bar
at the top of the window. This leads to a wrong position calculation by
a few pixels.
Fix it by using the size of the widget, which already returns the size
of the actual space to render the framebuffer.
Erico Nunes [Mon, 20 Feb 2023 17:56:05 +0000 (18:56 +0100)]
ui/gtk: fix passing y0_top parameter to scanout
The dmabuf->y0_top flag is passed to .dpy_gl_scanout_dmabuf(), however
in the gtk ui both implementations dropped it when doing the next
scanout_texture call.
Fixes flipped linux console using vhost-user-gpu with the gtk ui
display.
Signed-off-by: Erico Nunes <ernunes@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20230220175605.43759-1-ernunes@redhat.com>
(cherry picked from commit 94400fa53f81c9f58ad88cf3f3e7ea89ec423d39) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit cef2e7148e32 ("hw/isa/i82378: Remove intermediate IRQ forwarder")
passes s->cpu_intr to i8259_init() in i82378_realize() directly. However, s-
>cpu_intr isn't initialized yet since that happens after the south bridge's
pci_realize_and_unref() in board code. Fix this by initializing s->cpu_intr
before realizing the south bridge.
Fixes: cef2e7148e32 ("hw/isa/i82378: Remove intermediate IRQ forwarder") Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20230304114043.121024-4-shentey@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 2237af5e60ada06d90bf714e85523deafd936b9b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Richard Purdie [Wed, 10 May 2023 11:19:13 +0000 (12:19 +0100)]
target/ppc: Fix fallback to MFSS for MFFS* instructions on pre 3.0 ISAs
The following commits changed the code such that the fallback to MFSS for MFFSCRN,
MFFSCRNI, MFFSCE and MFFSL on pre 3.0 ISAs was removed and became an illegal instruction:
The hardware will handle them as a MFFS instruction as the code did previously.
This means applications that were segfaulting under qemu when encountering these
instructions which is used in glibc libm functions for example.
The fallback for MFFSCDRN and MFFSCDRNI added in a later patch was also missing.
This patch restores the fallback to MFSS for these instructions on pre 3.0s ISAs
as the hardware decoder would, fixing the segfaulting libm code. It doesn't have
the fallback for 3.0 onwards to match hardware behaviour.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org> Reviewed-by: Matheus Ferst <matheus.ferst@eldorado.org.br> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230510111913.1718734-1-richard.purdie@linuxfoundation.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 5260ecffd24e36c029849f379c8b9cc3d099c879) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Mon, 24 Apr 2023 09:22:36 +0000 (10:22 +0100)]
scripts/device-crash-test: Add a parameter to run with TCG only
We're currently facing the problem that the device-crash-test script
runs twice as long in the CI when a runner supports KVM - which sometimes
results in a timeout of the CI job. To get a more deterministic runtime
here, add an option to the script that allows to run it with TCG only.
Reported-by: Eldon Stegall <eldon-qemu@eldondev.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230414145845.456145-3-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230424092249.58552-6-alex.bennee@linaro.org>
(cherry picked from commit 8b869aa59109d238fd684e1ade204b6942202120) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Mon, 24 Apr 2023 09:22:35 +0000 (10:22 +0100)]
gitlab-ci: Avoid to re-run "configure" in the device-crash-test jobs
After "make check-venv" had been added to these jobs, they started
to re-run "configure" each time since our logic in the makefile
thinks that some files are out of date here. Avoid it with the same
trick that we are using in buildtest-template.yml already by disabling
the up-to-date check via NINJA=":".
Fixes: 1d8cf47e5b ("tests: run 'device-crash-test' from tests/venv") Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230414145845.456145-2-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20230424092249.58552-5-alex.bennee@linaro.org>
(cherry picked from commit 4d3bd91b26a69b39a178744d3d6e5f23050afb23) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Wed, 10 May 2023 20:35:55 +0000 (22:35 +0200)]
block/export: Fix null pointer dereference in error path
There are some error paths in blk_exp_add() that jump to 'fail:' before
'exp' is even created. So we can't just unconditionally access exp->blk.
Add a NULL check, and switch from exp->blk to blk, which is available
earlier, just to be extra sure that we really cover all cases where
BlockDevOps could have been set for it (in practice, this only happens
in drv->create() today, so this part of the change isn't strictly
necessary).
Fixes: Coverity CID 1509238 Fixes: de79b52604e43fdeba6cee4f5af600b62169f2d2 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-3-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Tested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a184563778f2b8970eb93291f08108e66432a575) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0bfd14149b248e8097ea4da1f9d53beb5c5b0cca) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Igor Mammedov [Mon, 22 May 2023 13:17:17 +0000 (15:17 +0200)]
machine: do not crash if default RAM backend name has been stolen
QEMU aborts when default RAM backend should be used (i.e. no
explicit '-machine memory-backend=' specified) but user
has created an object which 'id' equals to default RAM backend
name used by board.
$QEMU -machine pc \
-object memory-backend-ram,id=pc.ram,size=4294967296
Actual results:
QEMU 7.2.0 monitor - type 'help' for more information
(qemu) Unexpected error in object_property_try_add() at ../qom/object.c:1239:
qemu-kvm: attempt to add duplicate property 'pc.ram' to object (type 'container')
Aborted (core dumped)
Instead of abort, check for the conflicting 'id' and exit with
an error, suggesting how to remedy the issue.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2207886 Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20230522131717.3780533-1-imammedo@redhat.com> Tested-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a37531f2381c4e294e48b1417089474128388b44) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Mon, 22 May 2023 09:10:11 +0000 (11:10 +0200)]
hw/scsi/lsi53c895a: Fix reentrancy issues in the LSI controller (CVE-2023-0330)
We cannot use the generic reentrancy guard in the LSI code, so
we have to manually prevent endless reentrancy here. The problematic
lsi_execute_script() function has already a way to detect whether
too many instructions have been executed - we just have to slightly
change the logic here that it also takes into account if the function
has been called too often in a reentrant way.
The code in fuzz-lsi53c895a-test.c has been taken from an earlier
patch by Mauro Matteo Cascella.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1563
Message-Id: <20230522091011.1082574-1-thuth@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit b987718bbb1d0eabf95499b976212dd5f0120d75) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Paolo Bonzini [Tue, 23 May 2023 15:58:40 +0000 (17:58 +0200)]
usb/ohci: Set pad to 0 after frame update
When the OHCI controller's framenumber is incremented, HccaPad1 register
should be set to zero (Ref OHCI Spec 4.4)
ReactOS uses hccaPad1 to determine if the OHCI hardware is running,
consequently it fails this check in current qemu master.
Signed-off-by: Ryan Wendland <wendland@live.com.au>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1048 Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6301460ce9f59885e8feb65185bcfb6b128c8eff) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:39:12 +0000 (11:39 +0900)]
util/vfio-helpers: Use g_file_read_link()
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is
12.1.0, the compiler complains as follows:
In file included from /usr/include/features.h:490,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdint.h:26,
from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9,
from /home/alarm/q/var/qemu/include/qemu/osdep.h:94,
from ../util/vfio-helpers.c:13:
In function 'readlink',
inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9,
inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18,
inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9:
/usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull]
119 | return __glibc_fortify (readlink, __len, sizeof (char),
| ^~~~~~~~~~~~~~~
This error implies the allocated buffer can be NULL. Use
g_file_read_link(), which allocates buffer automatically to avoid the
error.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Cédric Le Goater <clg@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com>
(cherry picked from commit dbdea0dbfe2cef9ef6c752e9077e4fc98724194c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Stefan Hajnoczi [Thu, 13 Apr 2023 17:19:46 +0000 (13:19 -0400)]
rtl8139: fix large_send_mss divide-by-zero
If the driver sets large_send_mss to 0 then a divide-by-zero occurs.
Even if the division wasn't a problem, the for loop that emits MSS-sized
packets would never terminate.
Solve these issues by skipping offloading when large_send_mss=0.
This issue was found by OSS-Fuzz as part of Alexander Bulekov's device
fuzzing work. The reproducer is:
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1582 Closes: https://gitlab.com/qemu-project/qemu/-/issues/1582 Cc: qemu-stable@nongnu.org Cc: Peter Maydell <peter.maydell@linaro.org> Fixes: 6d71357a3b65 ("rtl8139: honor large send MSS value") Reported-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Tested-by: Alexander Bulekov <alxndr@bu.edu> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 792676c165159c11412346870fd58fd243ab2166) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:43:00 +0000 (11:43 +0900)]
igb: Always copy ethernet header
igb_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.
The size of this copy is just 22 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it. Removing this also allows igb to assume
aligned accesses for the ethernet header.
Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit dc9ef1bf454811646b3ee6387f1b96f63f538a18) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:42:59 +0000 (11:42 +0900)]
e1000e: Always copy ethernet header
e1000e_receive_internal() used to check the iov length to determine
copy the iovs to a contiguous buffer, but the check is flawed in two
ways:
- It does not ensure that iovcnt > 0.
- It does not take virtio-net header into consideration.
The size of this copy is just 18 octets, which can be even less than
the code size required for checks. This (wrong) optimization is probably
not worth so just remove it.
Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 310a128eae12339f97f6c940a7ddf92f40d283e4) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:42:58 +0000 (11:42 +0900)]
net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().
Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2f0fa232b8c330df029120a6824c8be3d4eb5cae) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:42:57 +0000 (11:42 +0900)]
igb: Clear IMS bits when committing ICR access
The datasheet says contradicting statements regarding ICR accesses so it
is not reliable to determine the behavior of ICR accesses. However,
e1000e does clear IMS bits when reading ICR accesses and Linux also
expects ICR accesses will clear IMS bits according to:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/ethernet/intel/igb/igb_main.c?h=v6.2#n8048
Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f0b1df5c4502b5ec89f83417924935ab201511d0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:42:56 +0000 (11:42 +0900)]
igb: Do not require CTRL.VME for tx VLAN tagging
While the datasheet of e1000e says it checks CTRL.VME for tx VLAN
tagging, igb's datasheet has no such statements. It also says for
"CTRL.VLE":
> This register only affects the VLAN Strip in Rx it does not have any
> influence in the Tx path in the 82576.
(Appendix A. Changes from the 82575)
There is no "CTRL.VLE" so it is more likely that it is a mistake of
CTRL.VME.
Fixes: fba7c3b788 ("igb: respect VMVIR and VMOLR for VLAN") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit e209716749cda1581cfc8e582591c0216c30ab0d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:42:55 +0000 (11:42 +0900)]
igb: Fix Rx packet type encoding
igb's advanced descriptor uses a packet type encoding different from
one used in e1000e's extended descriptor. Fix the logic to encode
Rx packet type accordingly.
Fixes: 3a977deebe ("Intrdocue igb device emulation") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit ed447c60b341f1714b3c800d7f9c68898e873f78) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 23 May 2023 02:42:54 +0000 (11:42 +0900)]
e1000x: Fix BPRC and MPRC
Before this change, e1000 and the common code updated BPRC and MPRC
depending on the matched filter, but e1000e and igb decided to update
those counters by deriving the packet type independently. This
inconsistency caused a multicast packet to be counted twice.
Updating BPRC and MPRC depending on are fundamentally flawed anyway as
a filter can be used for different types of packets. For example, it is
possible to filter broadcast packets with MTA.
Always determine what counters to update by inspecting the packets.
Fixes: 3b27430177 ("e1000: Implementing various counters") Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit f3f9b726afba1f53663768603189e574f80b5907) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The bytes and packets counter registers are cleared on read.
Copying the "total counter" registers to the "good counter" registers has
side effects.
If the "total" register is never read by the OS, it only gets incremented.
This leads to exponential growth of the "good" register.
This commit increments the counters individually to avoid this.
Signed-off-by: Timothée Cocault <timothee.cocault@gmail.com> Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 8d689f6aae8be096b4a1859be07c1b083865f755) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Wed, 17 May 2023 15:28:33 +0000 (17:28 +0200)]
nbd/server: Fix drained_poll to wake coroutine in right AioContext
nbd_drained_poll() generally runs in the main thread, not whatever
iothread the NBD server coroutine is meant to run in, so it can't
directly reenter the coroutines to wake them up.
The code seems to have the right intention, it specifies the correct
AioContext when it calls qemu_aio_coroutine_enter(). However, this
functions doesn't schedule the coroutine to run in that AioContext, but
it assumes it is already called in the home thread of the AioContext.
To fix this, add a new thread-safe qio_channel_wake_read() that can be
called in the main thread to wake up the coroutine in its AioContext,
and use this in nbd_drained_poll().
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-3-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7c1f51bf38de8cea4ed5030467646c37b46edeb7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Wed, 17 May 2023 15:28:32 +0000 (17:28 +0200)]
graph-lock: Disable locking for now
In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They
come from callers that hold an AioContext lock, which is not allowed
during polling. In theory, we could temporarily release the lock, but
callers are inconsistent about whether they hold a lock, and if they do,
some are also confused about which one they hold. While all of this is
fixable, it's not trivial, and the best course of action for 8.0.1 is
probably just disabling the graph locking code temporarily.
We don't currently rely on graph locking yet. It is supposed to replace
the AioContext lock eventually to enable multiqueue support, but as long
as we still have the AioContext lock, it is sufficient without the graph
lock. Once the AioContext lock goes away, the deadlock doesn't exist any
more either and this commit can be reverted. (Of course, it can also be
reverted while the AioContext lock still exists if the callers have been
fixed.)
Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 80fc5d260002432628710f8b0c7cfc7d9b97bb9d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Stefan Hajnoczi [Mon, 1 May 2023 17:34:43 +0000 (13:34 -0400)]
block: compile out assert_bdrv_graph_readable() by default
reader_count() is a performance bottleneck because the global
aio_context_list_lock mutex causes thread contention. Put this debugging
assertion behind a new ./configure --enable-debug-graph-lock option and
disable it by default.
The --enable-debug-graph-lock option is also enabled by the more general
--enable-debug option.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501173443.153062-1-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 58a2e3f5c37be02dac3086b81bdda9414b931edf) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: pick this one up so the next patch which disables this applies cleanly)
Stefan Hajnoczi [Tue, 2 May 2023 18:41:34 +0000 (14:41 -0400)]
tested: add test for nested aio_poll() in poll handlers
Cc: qemu-stable@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-3-stefanha@redhat.com>
[kwolf: Restrict to CONFIG_POSIX, Windows doesn't support polling] Tested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 844a12a63e12b1235a8fc17f9b278929dc6eb00e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Stefan Hajnoczi [Tue, 2 May 2023 18:41:33 +0000 (14:41 -0400)]
aio-posix: do not nest poll handlers
QEMU's event loop supports nesting, which means that event handler
functions may themselves call aio_poll(). The condition that triggered a
handler must be reset before the nested aio_poll() call, otherwise the
same handler will be called and immediately re-enter aio_poll. This
leads to an infinite loop and stack exhaustion.
Poll handlers are especially prone to this issue, because they typically
reset their condition by finishing the processing of pending work.
Unfortunately it is during the processing of pending work that nested
aio_poll() calls typically occur and the condition has not yet been
reset.
Disable a poll handler during ->io_poll_ready() so that a nested
aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the
disabled poll handler and its associated fd handler do not run during
the nested aio_poll(). Calling aio_set_fd_handler() from inside nested
aio_poll() could cause it to run again. If the fd handler is pending
inside nested aio_poll(), then it will also run again.
In theory fd handlers can be affected by the same issue, but they are
more likely to reset the condition before calling nested aio_poll().
This is a special case and it's somewhat complex, but I don't see a way
around it as long as nested aio_poll() is supported.
Cc: qemu-stable@nongnu.org Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181 Fixes: c38270692593 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK") Cc: Kevin Wolf <kwolf@redhat.com> Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-2-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6d740fb01b9f0f5ea7a82f4d5e458d91940a19ee) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eugenio Pérez [Thu, 4 May 2023 10:14:47 +0000 (12:14 +0200)]
virtio-net: not enable vq reset feature unconditionally
The commit 93a97dc5200a ("virtio-net: enable vq reset feature") enables
unconditionally vq reset feature as long as the device is emulated.
This makes impossible to actually disable the feature, and it causes
migration problems from qemu version previous than 7.2.
The entire final commit is unneeded as device system already enable or
disable the feature properly.
This reverts commit 93a97dc5200a95e63b99cb625f20b7ae802ba413. Fixes: 93a97dc5200a ("virtio-net: enable vq reset feature") Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230504101447.389398-1-eperezma@redhat.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1fac00f70b3261050af5564b20ca55c1b2a3059a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Leonardo Bras [Wed, 3 May 2023 00:27:02 +0000 (21:27 -0300)]
hw/pci: Disable PCI_ERR_UNCOR_MASK register for machine type < 8.0
Since it's implementation on v8.0.0-rc0, having the PCI_ERR_UNCOR_MASK
set for machine types < 8.0 will cause migration to fail if the target
QEMU version is < 8.0.0 :
qemu-system-x86_64: get_pci_config_device: Bad config data: i=0x10a read: 40 device: 0 cmask: ff wmask: 0 w1cmask:0
qemu-system-x86_64: Failed to load PCIDevice:config
qemu-system-x86_64: Failed to load e1000e:parent_obj
qemu-system-x86_64: error while loading state for instance 0x0 of device '0000:00:02.0/e1000e'
qemu-system-x86_64: load of migration failed: Invalid argument
The above test migrated a 7.2 machine type from QEMU master to QEMU 7.2.0,
with this cmdline:
In order to fix this, property x-pcie-err-unc-mask was introduced to
control when PCI_ERR_UNCOR_MASK is enabled. This property is enabled by
default, but is disabled if machine type <= 7.2.
Fixes: 010746ae1d ("hw/pci/aer: Implement PCI_ERR_UNCOR_MASK register") Suggested-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Leonardo Bras <leobras@redhat.com>
Message-Id: <20230503002701.854329-1-leobras@redhat.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1576 Tested-by: Fiona Ebner <f.ebner@proxmox.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 5ed3dabe57dd9f4c007404345e5f5bf0e347317f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Hawkins Jiawei [Tue, 9 May 2023 08:48:17 +0000 (16:48 +0800)]
vhost: fix possible wrap in SVQ descriptor ring
QEMU invokes vhost_svq_add() when adding a guest's element
into SVQ. In vhost_svq_add(), it uses vhost_svq_available_slots()
to check whether QEMU can add the element into SVQ. If there is
enough space, then QEMU combines some out descriptors and some
in descriptors into one descriptor chain, and adds it into
`svq->vring.desc` by vhost_svq_vring_write_descs().
Yet the problem is that, `svq->shadow_avail_idx - svq->shadow_used_idx`
in vhost_svq_available_slots() returns the number of occupied elements,
or the number of descriptor chains, instead of the number of occupied
descriptors, which may cause wrapping in SVQ descriptor ring.
Here is an example. In vhost_handle_guest_kick(), QEMU forwards
as many available buffers to device by virtqueue_pop() and
vhost_svq_add_element(). virtqueue_pop() returns a guest's element,
and then this element is added into SVQ by vhost_svq_add_element(),
a wrapper to vhost_svq_add(). If QEMU invokes virtqueue_pop() and
vhost_svq_add_element() `svq->vring.num` times,
vhost_svq_available_slots() thinks QEMU just ran out of slots and
everything should work fine. But in fact, virtqueue_pop() returns
`svq->vring.num` elements or descriptor chains, more than
`svq->vring.num` descriptors due to guest memory fragmentation,
and this causes wrapping in SVQ descriptor ring.
This bug is valid even before marking the descriptors used.
If the guest memory is fragmented, SVQ must add chains
so it can try to add more descriptors than possible.
This patch solves it by adding `num_free` field in
VhostShadowVirtqueue structure and updating this field
in vhost_svq_add() and vhost_svq_get_buf(), to record
the number of free descriptors.
Fixes: 100890f7ca ("vhost: Shadow virtqueue buffers forwarding") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230509084817.3973-1-yin31149@gmail.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Lei Yang <leiyang@redhat.com>
(cherry picked from commit 5d410557dea452f6231a7c66155e29a37e168528) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Paolo Bonzini [Tue, 9 May 2023 14:17:15 +0000 (16:17 +0200)]
target/i386: fix operand size for VCOMI/VUCOMI instructions
Compared to other SSE instructions, VUCOMISx and VCOMISx are different:
the single and double precision versions are distinguished through a
prefix, however they use no-prefix and 0x66 for SS and SD respectively.
Scalar values usually are associated with 0xF2 and 0xF3.
Because of these, they incorrectly perform a 128-bit memory load instead
of a 32- or 64-bit load. Fix this by writing a custom decoding function.
I tested that the reproducer is fixed and the test-avx output does not
change.
Reported-by: Gabriele Svelto <gsvelto@mozilla.com>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1637 Fixes: f8d19eec0d53 ("target/i386: reimplement 0x0f 0x28-0x2f, add AVX", 2022-10-18) Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2b55e479e6fcbb466585fd25077a50c32e10dc3a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Paolo Bonzini [Wed, 10 May 2023 16:15:25 +0000 (18:15 +0200)]
scsi-generic: fix buffer overflow on block limits inquiry
Using linux 6.x guest, at boot time, an inquiry on a scsi-generic
device makes qemu crash. This is caused by a buffer overflow when
scsi-generic patches the block limits VPD page.
Do the operations on a temporary on-stack buffer that is guaranteed
to be large enough.
Reported-by: Théo Maillart <tmaillart@freebox.fr> Analyzed-by: Théo Maillart <tmaillart@freebox.fr> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9bd634b2f5e2f10fe35d7609eb83f30583f2e15a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eric Blake [Tue, 2 May 2023 20:52:12 +0000 (15:52 -0500)]
migration: Attempt disk reactivation in more failure scenarios
Commit fe904ea824 added a fail_inactivate label, which tries to
reactivate disks on the source after a failure while s->state ==
MIGRATION_STATUS_ACTIVE, but didn't actually use the label if
qemu_savevm_state_complete_precopy() failed. This failure to
reactivate is also present in commit 6039dd5b1c (also covering the new
s->state == MIGRATION_STATUS_DEVICE state) and 403d18ae (ensuring
s->block_inactive is set more reliably).
Consolidate the two labels back into one - no matter HOW migration is
failed, if there is any chance we can reach vm_start() after having
attempted inactivation, it is essential that we have tried to restart
disks before then. This also makes the cleanup more like
migrate_fd_cancel().
Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230502205212.134680-1-eblake@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 6dab4c93ecfae48e2e67b984d1032c1e988d3005) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: minor context tweak near added comment in migration/migration.c)
Consider what happens when performing a migration between two host
machines connected to an NFS server serving multiple block devices to
the guest, when the NFS server becomes unavailable. The migration
attempts to inactivate all block devices on the source (a necessary
step before the destination can take over); but if the NFS server is
non-responsive, the attempt to inactivate can itself fail. When that
happens, the destination fails to get the migrated guest (good,
because the source wasn't able to flush everything properly):
(qemu) qemu-kvm: load of migration failed: Input/output error
at which point, our only hope for the guest is for the source to take
back control. With the current code base, the host outputs a message, but then appears to resume:
Whether the guest is recoverable on the source after the first failure
is debatable, but what we do not want is to have qemu itself fail due
to an assertion. It looks like the problem is as follows:
In migration.c:migration_completion(), the source sets 'inactivate' to
true (since COLO is not enabled), then tries
savevm.c:qemu_savevm_state_complete_precopy() with a request to
inactivate block devices. In turn, this calls
block.c:bdrv_inactivate_all(), which fails when flushing runs up
against the non-responsive NFS server. With savevm failing, we are
now left in a state where some, but not all, of the block devices have
been inactivated; but migration_completion() then jumps to 'fail'
rather than 'fail_invalidate' and skips an attempt to reclaim those
those disks by calling bdrv_activate_all(). Even if we do attempt to
reclaim disks, we aren't taking note of failure there, either.
Thus, we have reached a state where the migration engine has forgotten
all state about whether a block device is inactive, because we did not
set s->block_inactive in enough places; so migration allows the source
to reach vm_start() and resume execution, violating the block layer
invariant that the guest CPUs should not be restarted while a device
is inactive. Note that the code in migration.c:migrate_fd_cancel()
will also try to reactivate all block devices if s->block_inactive was
set, but because we failed to set that flag after the first failure,
the source assumes it has reclaimed all devices, even though it still
has remaining inactivated devices and does not try again. Normally,
qmp_cont() will also try to reactivate all disks (or correctly fail if
the disks are not reclaimable because NFS is not yet back up), but the
auto-resumption of the source after a migration failure does not go
through qmp_cont(). And because we have left the block layer in an
inconsistent state with devices still inactivated, the later migration
attempt is hitting the assertion failure.
Since it is important to not resume the source with inactive disks,
this patch marks s->block_inactive before attempting inactivation,
rather than after succeeding, in order to prevent any vm_start() until
it has successfully reactivated all devices.
See also https://bugzilla.redhat.com/show_bug.cgi?id=2058982
Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Acked-by: Lukas Straub <lukasstraub2@web.de> Tested-by: Lukas Straub <lukasstraub2@web.de> Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 403d18ae384239876764bbfa111d6cc5dcb673d1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Michael Tokarev [Sun, 9 Apr 2023 10:53:27 +0000 (13:53 +0300)]
linux-user: fix getgroups/setgroups allocations
linux-user getgroups(), setgroups(), getgroups32() and setgroups32()
used alloca() to allocate grouplist arrays, with unchecked gidsetsize
coming from the "guest". With NGROUPS_MAX being 65536 (linux, and it
is common for an application to allocate NGROUPS_MAX for getgroups()),
this means a typical allocation is half the megabyte on the stack.
Which just overflows stack, which leads to immediate SIGSEGV in actual
system getgroups() implementation.
An example of such issue is aptitude, eg
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811087#72
Cap gidsetsize to NGROUPS_MAX (return EINVAL if it is larger than that),
and use heap allocation for grouplist instead of alloca(). While at it,
fix coding style and make all 4 implementations identical.
Try to not impose random limits - for example, allow gidsetsize to be
negative for getgroups() - just do not allocate negative-sized grouplist
in this case but still do actual getgroups() call. But do not allow
negative gidsetsize for setgroups() since its argument is unsigned.
Capping by NGROUPS_MAX seems a bit arbitrary, - we can do more, it is
not an error if set size will be NGROUPS_MAX+1. But we should not allow
integer overflow for the array being allocated. Maybe it is enough to
just call g_try_new() and return ENOMEM if it fails.
Maybe there's also no need to convert setgroups() since this one is
usually smaller and known beforehand (KERN_NGROUPS_MAX is actually 63, -
this is apparently a kernel-imposed limit for runtime group set).
The patch fixes aptitude segfault mentioned above.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230409105327.1273372-1-mjt@msgid.tls.msk.ru> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 1e35d327890bdd117a67f79c52e637fb12bb1bf4) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If a program requires fr1, we should set the FR bit of CP0 control status
register and add F64 hardware flag. The corresponding `else if` branch
statement is copied from the linux kernel sources (see `arch_check_elf` function
in linux/arch/mips/kernel/elf.c).
Signed-off-by: Daniil Kovalev <dkovalev@compiler-toolchain-for.me> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20230404052153.16617-1-dkovalev@compiler-toolchain-for.me> Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit a0f8d2701b205d9d7986aa555e0566b13dc18fa0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Alex Bennée [Tue, 2 May 2023 14:20:59 +0000 (15:20 +0100)]
tests/docker: bump the xtensa base to debian:11-slim
Stretch is going out of support so things like security updates will
fail. As the toolchain itself is binary it hopefully won't mind the
underlying OS being updated.
Message-Id: <20230503091244.1450613-3-alex.bennee@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 3217b84f3cd813a7daffc64b26543c313f3a042a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Lizhi Yang [Thu, 11 May 2023 08:01:19 +0000 (16:01 +0800)]
docs/about/emulation: fix typo
Duplicated word "are".
Signed-off-by: Lizhi Yang <sledgeh4w@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230511080119.99018-1-sledgeh4w@gmail.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit c70bb9a771d467302d1c7df5c5bd56b48f42716e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>