A subsystem reset contains a reset of AP resources which has been
missing. Adding the AP bridge to the list of device types that need
reset fixes this issue.
Reviewed-by: Jason J. Herne <jjherne@linux.ibm.com> Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Fixes: a51b3153 ("s390x/ap: base Adjunct Processor (AP) object model")
Message-ID: <20230823142219.1046522-2-seiden@linux.ibm.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 297ec01f0b9864ea8209ca0ddc6643b4c0574bdb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
812 return con->hw_ops->ui_info != NULL;
(gdb) bt
#0 0x0000555555888630 in dpy_ui_info_supported (con=0x0) at ../ui/console.c:812
#1 0x00005555558a44b1 in protocol_client_msg (vs=0x5555578c76c0, data=0x5555581e93f0 <incomplete sequence \373>, len=24) at ../ui/vnc.c:2585
#2 0x00005555558a19ac in vnc_client_read (vs=0x5555578c76c0) at ../ui/vnc.c:1607
#3 0x00005555558a1ac2 in vnc_client_io (ioc=0x5555581eb0e0, condition=G_IO_IN, opaque=0x5555578c76c0) at ../ui/vnc.c:1635
Fixes:
https://issues.redhat.com/browse/RHEL-2600
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Albert Esteve <aesteve@redhat.com>
(cherry picked from commit 48a35e12faf90a896c5aa4755812201e00d60316) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Stefan Berger [Thu, 13 Jul 2023 17:19:55 +0000 (13:19 -0400)]
hw/tpm: TIS on sysbus: Remove unsupport ppi command line option
The ppi command line option for the TIS device on sysbus never worked
and caused an immediate segfault. Remove support for it since it also
needs support in the firmware and needs testing inside the VM.
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com> Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20230713171955.149236-1-stefanb@linux.ibm.com
(cherry picked from commit 4c46fe2ed492f35f411632c8b5a8442f322bc3f0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Leon Schuermann [Tue, 29 Aug 2023 21:50:46 +0000 (17:50 -0400)]
target/riscv/pmp.c: respect mseccfg.RLB for pmpaddrX changes
When the rule-lock bypass (RLB) bit is set in the mseccfg CSR, the PMP
configuration lock bits must not apply. While this behavior is
implemented for the pmpcfgX CSRs, this bit is not respected for
changes to the pmpaddrX CSRs. This patch ensures that pmpaddrX CSR
writes work even on locked regions when the global rule-lock bypass is
enabled.
Signed-off-by: Leon Schuermann <leons@opentitan.org> Reviewed-by: Mayuresh Chitale <mchitale@ventanamicro.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230829215046.1430463-1-leon@is.currently.online> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 4e3adce1244e1ca30ec05874c3eca14911dc0825) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
target/riscv: fix satp_mode_finalize() when satp_mode.supported = 0
In the same emulated RISC-V host, the 'host' KVM CPU takes 4 times
longer to boot than the 'rv64' KVM CPU.
The reason is an unintended behavior of riscv_cpu_satp_mode_finalize()
when satp_mode.supported = 0, i.e. when cpu_init() does not set
satp_mode_max_supported(). satp_mode_max_from_map(map) does:
31 - __builtin_clz(map)
This means that, if satp_mode.supported = 0, satp_mode_supported_max
wil be '31 - 32'. But this is C, so satp_mode_supported_max will gladly
set it to UINT_MAX (4294967295). After that, if the user didn't set a
satp_mode, set_satp_mode_default_map(cpu) will make
cfg.satp_mode.map = cfg.satp_mode.supported
So satp_mode.map = 0. And then satp_mode_map_max will be set to
satp_mode_max_from_map(cpu->cfg.satp_mode.map), i.e. also UINT_MAX. The
guard "satp_mode_map_max > satp_mode_supported_max" doesn't protect us
here since both are UINT_MAX.
And finally we have 2 loops:
for (int i = satp_mode_map_max - 1; i >= 0; --i) {
Which are, in fact, 2 loops from UINT_MAX -1 to -1. This is where the
extra delay when booting the 'host' CPU is coming from.
Commit 43d1de32f8 already set a precedence for satp_mode.supported = 0
in a different manner. We're doing the same here. If supported == 0,
interpret as 'the CPU wants the OS to handle satp mode alone' and skip
satp_mode_finalize().
We'll also put a guard in satp_mode_max_from_map() to assert out if map
is 0 since the function is not ready to deal with it.
Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Fixes: 6f23aaeb9b ("riscv: Allow user to set the satp mode") Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Message-ID: <20230817152903.694926-1-dbarboza@ventanamicro.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 3a2fc23563885c219c73c8f24318921daf02f3f2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
On a dtb dumped from the virt machine, dt-validate complains:
soc: pmu: {'riscv,event-to-mhpmcounters': [[1, 1, 524281], [2, 2, 524284], [65561, 65561, 524280], [65563, 65563, 524280], [65569, 65569, 524280]], 'compatible': ['riscv,pmu']} should not be valid under {'type': 'object'}
from schema $id: http://devicetree.org/schemas/simple-bus.yaml#
That's pretty cryptic, but running the dtb back through dtc produces
something a lot more reasonable:
Warning (simple_bus_reg): /soc/pmu: missing or empty reg/ranges property
Moving the riscv,pmu node out of the soc bus solves the problem.
Signed-off-by: Conor Dooley <conor.dooley@microchip.com> Acked-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-ID: <20230727-groom-decline-2c57ce42841c@spud> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 9ff31406312500053ecb5f92df01dd9ce52e635d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
LIU Zhiwei [Fri, 11 Aug 2023 05:54:38 +0000 (13:54 +0800)]
linux-user/riscv: Use abi type for target_ucontext
We should not use types dependend on host arch for target_ucontext.
This bug is found when run rv32 applications.
Signed-off-by: LIU Zhiwei <zhiwei_liu@linux.alibaba.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230811055438.1945-1-zhiwei_liu@linux.alibaba.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit ae7d4d625cab49657b9fc2be09d895afb9bcdaf0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Jason Chien [Fri, 28 Jul 2023 08:24:39 +0000 (08:24 +0000)]
hw/intc: Make rtc variable names consistent
The variables whose values are given by cpu_riscv_read_rtc() should be named
"rtc". The variables whose value are given by cpu_riscv_read_rtc_raw()
should be named "rtc_r".
Signed-off-by: Jason Chien <jason.chien@sifive.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230728082502.26439-2-jason.chien@sifive.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 9382a9eafccad8dc6a487ea3a8d2bed03dc35db9) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Jason Chien [Fri, 28 Jul 2023 08:24:38 +0000 (08:24 +0000)]
hw/intc: Fix upper/lower mtime write calculation
When writing the upper mtime, we should keep the original lower mtime
whose value is given by cpu_riscv_read_rtc() instead of
cpu_riscv_read_rtc_raw(). The same logic applies to writes to lower mtime.
Signed-off-by: Jason Chien <jason.chien@sifive.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-ID: <20230728082502.26439-1-jason.chien@sifive.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit e0922b73baf00c4c19d4ad30d09bb94f7ffea0f4) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Fri, 21 Jul 2023 09:47:20 +0000 (11:47 +0200)]
hw/char/riscv_htif: Fix the console syscall on big endian hosts
Values that have been read via cpu_physical_memory_read() from the
guest's memory have to be swapped in case the host endianess differs
from the guest.
Fixes: a6e13e31d5 ("riscv_htif: Support console output via proxy syscall") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Bin Meng <bmeng@tinylab.org> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
Message-Id: <20230721094720.902454-3-thuth@redhat.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit 058096f1c55ab688db7e1d6814aaefc1bcd87f7a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(Mjt: context fix in hw/char/riscv_htif.c for #include)
Thomas Huth [Tue, 11 Apr 2023 18:34:17 +0000 (20:34 +0200)]
include/exec: Provide the tswap() functions for target independent code, too
In some cases of target independent code, it would be useful to have access
to the functions that swap endianess in case it differs between guest and
host. Thus re-implement the tswapXX() functions in a new header that can be
included separately. The check whether the swapping is needed continues to
be done at compile-time for target specific code, while it is done at
run-time in target-independent code.
Message-Id: <20230411183418.1640500-3-thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 24be3369ad63c3882be42dd510a45bad52816fd1) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(trivial change needed for the next commit 058096f1c5
"hw/char/riscv_htif: Fix the console syscall on big endian hosts")
Thomas Huth [Fri, 21 Jul 2023 09:47:19 +0000 (11:47 +0200)]
hw/char/riscv_htif: Fix printing of console characters on big endian hosts
The character that should be printed is stored in the 64 bit "payload"
variable. The code currently tries to print it by taking the address
of the variable and passing this pointer to qemu_chr_fe_write(). However,
this only works on little endian hosts where the least significant bits
are stored on the lowest address. To do this in a portable way, we have
to store the value in an uint8_t variable instead.
Fixes: 5033606780 ("RISC-V HTIF Console") Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Bin Meng <bmeng@tinylab.org> Reviewed-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230721094720.902454-2-thuth@redhat.com> Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
(cherry picked from commit c255946e3df4d9660e4f468a456633c24393d468) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Colton Lewis [Thu, 31 Aug 2023 19:00:52 +0000 (19:00 +0000)]
arm64: Restore trapless ptimer access
Due to recent KVM changes, QEMU is setting a ptimer offset resulting
in unintended trap and emulate access and a consequent performance
hit. Filter out the PTIMER_CNT register to restore trapless ptimer
access.
Quoting Andrew Jones:
Simply reading the CNT register and writing back the same value is
enough to set an offset, since the timer will have certainly moved
past whatever value was read by the time it's written. QEMU
frequently saves and restores all registers in the get-reg-list array,
unless they've been explicitly filtered out (with Linux commit 680232a94c12, KVM_REG_ARM_PTIMER_CNT is now in the array). So, to
restore trapless ptimer accesses, we need a QEMU patch to filter out
the register.
See
https://lore.kernel.org/kvmarm/gsntttsonus5.fsf@coltonlewis-kvm.c.googlers.com/T/#m0770023762a821db2a3f0dd0a7dc6aa54e0d0da9
for additional context.
Cc: qemu-stable@nongnu.org Signed-off-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Colton Lewis <coltonlewis@google.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Colton Lewis <coltonlewis@google.com>
Message-id: 20230831190052.129045-1-coltonlewis@google.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 682814e2a3c883b27f24b9e7cab47313c49acbd4) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Kevin Wolf [Tue, 5 Sep 2023 14:50:02 +0000 (16:50 +0200)]
virtio: Drop out of coroutine context in virtio_load()
virtio_load() as a whole should run in coroutine context because it
reads from the migration stream and we don't want this to block.
However, it calls virtio_set_features_nocheck() and devices don't
expect their .set_features callback to run in a coroutine and therefore
call functions that may not be called in coroutine context. To fix this,
drop out of coroutine context for calling virtio_set_features_nocheck().
Without this fix, the following crash was reported:
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007efc738c05d3 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007efc73873d26 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007efc738477f3 in __GI_abort () at abort.c:79
#4 0x00007efc7384771b in __assert_fail_base (fmt=0x7efc739dbcb8 "", assertion=assertion@entry=0x560aebfbf5cf "!qemu_in_coroutine()",
file=file@entry=0x560aebfcd2d4 "../block/graph-lock.c", line=line@entry=275, function=function@entry=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:92
#5 0x00007efc7386ccc6 in __assert_fail (assertion=0x560aebfbf5cf "!qemu_in_coroutine()", file=0x560aebfcd2d4 "../block/graph-lock.c", line=275,
function=0x560aebfcd34d "void bdrv_graph_rdlock_main_loop(void)") at assert.c:101
#6 0x0000560aebcd8dd6 in bdrv_register_buf ()
#7 0x0000560aeb97ed97 in ram_block_added.llvm ()
#8 0x0000560aebb8303f in ram_block_add.llvm ()
#9 0x0000560aebb834fa in qemu_ram_alloc_internal.llvm ()
#10 0x0000560aebb2ac98 in vfio_region_mmap ()
#11 0x0000560aebb3ea0f in vfio_bars_register ()
#12 0x0000560aebb3c628 in vfio_realize ()
#13 0x0000560aeb90f0c2 in pci_qdev_realize ()
#14 0x0000560aebc40305 in device_set_realized ()
#15 0x0000560aebc48e07 in property_set_bool.llvm ()
#16 0x0000560aebc46582 in object_property_set ()
#17 0x0000560aebc4cd58 in object_property_set_qobject ()
#18 0x0000560aebc46ba7 in object_property_set_bool ()
#19 0x0000560aeb98b3ca in qdev_device_add_from_qdict ()
#20 0x0000560aebb1fbaf in virtio_net_set_features ()
#21 0x0000560aebb46b51 in virtio_set_features_nocheck ()
#22 0x0000560aebb47107 in virtio_load ()
#23 0x0000560aeb9ae7ce in vmstate_load_state ()
#24 0x0000560aeb9d2ee9 in qemu_loadvm_state_main ()
#25 0x0000560aeb9d45e1 in qemu_loadvm_state ()
#26 0x0000560aeb9bc32c in process_incoming_migration_co.llvm ()
#27 0x0000560aebeace56 in coroutine_trampoline.llvm ()
Cc: qemu-stable@nongnu.org Buglink: https://issues.redhat.com/browse/RHEL-832 Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20230905145002.46391-3-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 92e2e6a867334a990f8d29f07ca34e3162fdd6ec) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Thu, 17 Aug 2023 12:56:00 +0000 (14:56 +0200)]
hw/net/vmxnet3: Fix guest-triggerable assert()
The assert() that checks for valid MTU sizes can be triggered by
the guest (e.g. with the reproducer code from the bug ticket
https://gitlab.com/qemu-project/qemu/-/issues/517 ). Let's avoid
this problem by simply logging the error and refusing to activate
the device instead.
Fixes: d05dcd94ae ("net: vmxnet3: validate configuration values during activate") Signed-off-by: Thomas Huth <thuth@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
[Mjt: change format specifier from %d to %u for uint32_t argument]
(cherry picked from commit 90a0778421acdf4ca903be64c8ed19378183c944) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
docs/multi-thread-compression.txt uses parameter names with
underscores instead of dashes. Wrong since day one.
docs/rdma.txt, tests/qemu-iotests/181, and tests/qtest/test-hmp.c are
wrong the same way since commit cbde7be900d2 (v6.0.0). Hard to see,
as test-hmp doesn't check whether the commands work, and iotest 181
appears to be unaffected.
Fixes: 263170e679df (docs: Add a doc about multiple thread compression) Fixes: cbde7be900d2 (migrate: remove QMP/HMP commands for speed, downtime and cache size) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit b21a6e31a182a5ae7436a444f840d49aac07c94f) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Tue, 29 Aug 2023 13:29:48 +0000 (15:29 +0200)]
qemu-options.hx: Rephrase the descriptions of the -hd* and -cdrom options
The current description says that these options will create a device
on the IDE bus, which is only true on x86. So rephrase these sentences
a little bit to speak of "default bus" instead.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit bcd8e243083c878884e52d609deddbe6be17c730) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Hang Yu [Sat, 12 Aug 2023 06:52:29 +0000 (14:52 +0800)]
hw/i2c/aspeed: Fix TXBUF transmission start position error
According to the ast2600 datasheet and the linux aspeed i2c driver,
the TXBUF transmission start position should be TXBUF[0] instead
of TXBUF[1],so the arg pool_start is useless,and the address is not
included in TXBUF.So even if Tx Count equals zero,there is at least
1 byte data needs to be transmitted,and M_TX_CMD should not be cleared
at this condition.The driver url is:
https://github.com/AspeedTech-BMC/linux/blob/aspeed-master-v5.15/drivers/i2c/busses/i2c-ast2600.c
Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn> Fixes: 6054fc73e8f4 ("aspeed/i2c: Add support for pool buffer transfers") Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 961faf3ddbd8ffcdf776bbcf88af0bc97218114a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Hang Yu [Sat, 12 Aug 2023 06:52:28 +0000 (14:52 +0800)]
hw/i2c/aspeed: Fix Tx count and Rx size error in buffer pool mode
Fixed inconsistency between the regisiter bit field definition header file
and the ast2600 datasheet. The reg name is I2CD1C:Pool Buffer Control
Register in old register mode and I2CC0C: Master/Slave Pool Buffer Control
Register in new register mode. They share bit field
[12:8]:Transmit Data Byte Count and bit field
[29:24]:Actual Received Pool Buffer Size according to the datasheet.
According to the ast2600 datasheet,the actual Tx count is
Transmit Data Byte Count plus 1, and the max Rx size is
Receive Pool Buffer Size plus 1, both in Pool Buffer Control Register.
The version before forgot to plus 1, and mistake Rx count for Rx size.
Signed-off-by: Hang Yu <francis_yuu@stu.pku.edu.cn> Fixes: 3be3d6ccf2ad ("aspeed: i2c: Migrate to registerfields API") Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 97b8aa5ae9ff197394395eda5062ea3681e09c28) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Niklas Cassel [Fri, 9 Jun 2023 14:08:44 +0000 (16:08 +0200)]
hw/ide/ahci: fix broken SError handling
When encountering an NCQ error, you should not write the NCQ tag to the
SError register. This is completely wrong.
The SError register has a clear definition, where each bit represents a
different error, see PxSERR definition in AHCI 1.3.1.
If we write a random value (like the NCQ tag) in SError, e.g. Linux will
read SError, and will trigger arbitrary error handling depending on the
NCQ tag that happened to be executing.
In case of success, ncq_cb() will call ncq_finish().
In case of error, ncq_cb() will call ncq_err() (which will clear
ncq_tfs->used), and then call ncq_finish(), thus using ncq_tfs->used is
sufficient to tell if finished should get set or not.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-9-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 9f89423537653de07ca40c18b5ff5b70b104cc93) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Niklas Cassel [Fri, 9 Jun 2023 14:08:43 +0000 (16:08 +0200)]
hw/ide/ahci: fix ahci_write_fis_sdb()
When there is an error, we need to raise a TFES error irq, see AHCI 1.3.1,
5.3.13.1 SDB:Entry.
If ERR_STAT is set, we jump to state ERR:FatalTaskfile, which will raise
a TFES IRQ unconditionally, regardless if the I bit is set in the FIS or
not.
Thus, we should never raise a normal IRQ after having sent an error IRQ.
It is valid to signal successfully completed commands as finished in the
same SDB FIS that generates the error IRQ. The important thing is that
commands that did not complete successfully (e.g. commands that were
aborted, do not get the finished bit set).
Before this commit, there was never a TFES IRQ raised on NCQ error.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-8-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 7e85cb0db4c693b4e084a00e66fe73a22ed1688a) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Niklas Cassel [Fri, 9 Jun 2023 14:08:42 +0000 (16:08 +0200)]
hw/ide/ahci: PxCI should not get cleared when ERR_STAT is set
For NCQ, PxCI is cleared on command queued successfully.
For non-NCQ, PxCI is cleared on command completed successfully.
Successfully means ERR_STAT, BUSY and DRQ are all cleared.
A command that has ERR_STAT set, does not get to clear PxCI.
See AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and RegFIS:ClearCI,
and 5.3.16.5 ERR:FatalTaskfile.
In the case of non-NCQ commands, not clearing PxCI is needed in order
for host software to be able to see which command slot that failed.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-7-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 1a16ce64fda11bdf50f0c4ab5d9fdde72c1383a2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Niklas Cassel [Fri, 9 Jun 2023 14:08:41 +0000 (16:08 +0200)]
hw/ide/ahci: PxSACT and PxCI is cleared when PxCMD.ST is cleared
According to AHCI 1.3.1 definition of PxSACT:
This field is cleared when PxCMD.ST is written from a '1' to a '0' by
software. This field is not cleared by a COMRESET or a software reset.
According to AHCI 1.3.1 definition of PxCI:
This field is also cleared when PxCMD.ST is written from a '1' to a '0'
by software.
Clearing PxCMD.ST is part of the error recovery procedure, see
AHCI 1.3.1, section "6.2 Error Recovery".
If we don't clear PxCI on error recovery, the previous command will
incorrectly still be marked as pending after error recovery.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-6-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit d73b84d0b664e60fffb66f46e84d0db4a8e1c713) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Niklas Cassel [Fri, 9 Jun 2023 14:08:40 +0000 (16:08 +0200)]
hw/ide/ahci: simplify and document PxCI handling
The AHCI spec states that:
For NCQ, PxCI is cleared on command queued successfully.
For non-NCQ, PxCI is cleared on command completed successfully.
(A non-NCQ command that completes with error does not clear PxCI.)
The current QEMU implementation either clears PxCI in check_cmd(),
or in ahci_cmd_done().
check_cmd() will clear PxCI for a command if handle_cmd() returns 0.
handle_cmd() will return -1 if BUSY or DRQ is set.
The QEMU implementation for NCQ commands will currently not set BUSY
or DRQ, so they will always have PxCI cleared by handle_cmd().
ahci_cmd_done() will never even get called for NCQ commands.
Non-NCQ commands are executed by ide_bus_exec_cmd().
Non-NCQ commands in QEMU are implemented either in a sync or in an async
way.
For non-NCQ commands implemented in a sync way, the command handler will
return true, and when ide_bus_exec_cmd() sees that a command handler
returns true, it will call ide_cmd_done() (which will call
ahci_cmd_done()). For a command implemented in a sync way,
ahci_cmd_done() will do nothing (since busy_slot is not set). Instead,
after ide_bus_exec_cmd() has finished, check_cmd() will clear PxCI for
these commands.
For non-NCQ commands implemented in an async way (using either aiocb or
pio_aiocb), the command handler will return false, ide_bus_exec_cmd()
will not call ide_cmd_done(), instead it is expected that the async
callback function will call ide_cmd_done() once the async command is
done. handle_cmd() will set busy_slot, if and only if BUSY or DRQ is
set, and this is checked _after_ ide_bus_exec_cmd() has returned.
handle_cmd() will return -1, so check_cmd() will not clear PxCI.
When the async callback calls ide_cmd_done() (which will call
ahci_cmd_done()), it will see that busy_slot is set, and
ahci_cmd_done() will clear PxCI.
This seems racy, since busy_slot is set _after_ ide_bus_exec_cmd() has
returned. The callback might come before busy_slot gets set. And it is
quite confusing that ahci_cmd_done() will be called for all non-NCQ
commands when the command is done, but will only clear PxCI in certain
cases, even though it will always write a D2H FIS and raise an IRQ.
Even worse, in the case where ahci_cmd_done() does not clear PxCI, it
still raises an IRQ. Host software might thus read an old PxCI value,
since PxCI is cleared (by check_cmd()) after the IRQ has been raised.
Try to simplify this by always setting busy_slot for non-NCQ commands,
such that ahci_cmd_done() will always be responsible for clearing PxCI
for non-NCQ commands.
For NCQ commands, clear PxCI when we receive the D2H FIS, but before
raising the IRQ, see AHCI 1.3.1, section 5.3.8, states RegFIS:Entry and
RegFIS:ClearCI.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-5-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit e2a5d9b3d9c3d311618160603cc9bc04fbd98796) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Niklas Cassel [Fri, 9 Jun 2023 14:08:39 +0000 (16:08 +0200)]
hw/ide/ahci: write D2H FIS when processing NCQ command
The way that BUSY + PxCI is cleared for NCQ (FPDMA QUEUED) commands is
described in SATA 3.5a Gold:
11.15 FPDMA QUEUED command protocol
DFPDMAQ2: ClearInterfaceBsy
"Transmit Register Device to Host FIS with the BSY bit cleared to zero
and the DRQ bit cleared to zero and Interrupt bit cleared to zero to
mark interface ready for the next command."
PxCI is currently cleared by handle_cmd(), but we don't write the D2H
FIS to the FIS Receive Area that actually caused PxCI to be cleared.
Similar to how ahci_pio_transfer() calls ahci_write_fis_pio() with an
additional parameter to write a PIO Setup FIS without raising an IRQ,
add a parameter to ahci_write_fis_d2h() so that ahci_write_fis_d2h()
also can write the FIS to the FIS Receive Area without raising an IRQ.
Change process_ncq_command() to call ahci_write_fis_d2h() without
raising an IRQ (similar to ahci_pio_transfer()), such that the FIS
Receive Area is in sync with the PxTFD shadow register.
E.g. Linux reads status and error fields from the FIS Receive Area
directly, so it is wise to keep the FIS Receive Area and the PxTFD
shadow register in sync.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Message-id: 20230609140844.202795-4-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 2967dc8209dd27b61a6ab7bad78cf7c6ec58ddb4) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Niklas Cassel [Fri, 9 Jun 2023 14:08:38 +0000 (16:08 +0200)]
hw/ide/core: set ERR_STAT in unsupported command completion
Currently, the first time sending an unsupported command
(e.g. READ LOG DMA EXT) will not have ERR_STAT set in the completion.
Sending the unsupported command again, will correctly have ERR_STAT set.
When ide_cmd_permitted() returns false, it calls ide_abort_command().
ide_abort_command() first calls ide_transfer_stop(), which will call
ide_transfer_halt() and ide_cmd_done(), after that ide_abort_command()
sets ERR_STAT in status.
ide_cmd_done() for AHCI will call ahci_write_fis_d2h() which writes the
current status in the FIS, and raises an IRQ. (The status here will not
have ERR_STAT set!).
Thus, we cannot call ide_transfer_stop() before setting ERR_STAT, as
ide_transfer_stop() will result in the FIS being written and an IRQ
being raised.
The reason why it works the second time, is that ERR_STAT will still
be set from the previous command, so when writing the FIS, the
completion will correctly have ERR_STAT set.
Set ERR_STAT before writing the FIS (calling cmd_done), so that we will
raise an error IRQ correctly when receiving an unsupported command.
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230609140844.202795-3-nks@flawful.org Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c3461c6264a7c8ca15b117e91fe5da786924a784) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
target/ppc: Flush inputs to zero with NJ in ppc_store_vscr
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1779 Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit af03aeb631eeb81a44d2c0ff5b429cd4b5dc2799) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Tue, 8 Aug 2023 04:19:44 +0000 (14:19 +1000)]
ppc/vof: Fix missed fields in VOF cleanup
Failing to reset the of_instance_last makes ihandle allocation continue
to increase, which causes record-replay replay fail to match the
recorded trace.
Not resetting claimed_base makes VOF eventually run out of memory after
some resets.
Cc: Alexey Kardashevskiy <aik@ozlabs.ru> Fixes: fc8c745d501 ("spapr: Implement Open Firmware client interface") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 7b8589d7ce7e23f26ff53338d575a5cbd7818e28) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Maksim Kostin [Wed, 9 Aug 2023 10:07:33 +0000 (13:07 +0300)]
hw/ppc/e500: fix broken snapshot replay
ppce500_reset_device_tree is registered for system reset, but after c4b075318eb1 this function rerandomizes rng-seed via
qemu_guest_getrandom_nofail. And when loading a snapshot, it tries to read
EVENT_RANDOM that doesn't exist, so we have an error:
qemu-system-ppc: Missing random event in the replay log
To fix this, use qemu_register_reset_nosnapshotload instead of
qemu_register_reset.
Reported-by: Vitaly Cheptsov <cheptsov@ispras.ru> Fixes: c4b075318eb1 ("hw/ppc: pass random seed to fdt ")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1634 Signed-off-by: Maksim Kostin <maksim.kostin@ispras.ru> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit 6ec65b69ba17c954414fa23a397fb8a3fcfb4a43) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
block-migration: Ensure we don't crash during migration cleanup
We can fail the blk_insert_bs() at init_blk_migration(), leaving the
BlkMigDevState without a dirty_bitmap and BlockDriverState. Account
for the possibly missing elements when doing cleanup.
Fix the following crashes:
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
359 BlockDriverState *bs = bitmap->bs;
#0 0x0000555555ec83ef in bdrv_release_dirty_bitmap (bitmap=0x0) at ../block/dirty-bitmap.c:359
#1 0x0000555555bba331 in unset_dirty_tracking () at ../migration/block.c:371
#2 0x0000555555bbad98 in block_migration_cleanup_bmds () at ../migration/block.c:681
Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
7073 QLIST_FOREACH_SAFE(blocker, &bs->op_blockers[op], list, next) {
#0 0x0000555555e971ff in bdrv_op_unblock (bs=0x0, op=BLOCK_OP_TYPE_BACKUP_SOURCE, reason=0x0) at ../block.c:7073
#1 0x0000555555e9734a in bdrv_op_unblock_all (bs=0x0, reason=0x0) at ../block.c:7095
#2 0x0000555555bbae13 in block_migration_cleanup_bmds () at ../migration/block.c:690
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Message-id: 20230731203338.27581-1-farosas@suse.de Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f187609f27b261702a17f79d20bf252ee0d4f9cd) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In early 2021 (see commit 2ad784339e "docs: update README to use
GitLab repo URLs") almost all of the code base was converted to
point to GitLab instead of git.qemu.org. During 2023, git.qemu.org
switched from a git mirror to a http redirect to GitLab (see [1]).
Update the LICENSE URL to match its previous content, displaying
the file raw content similarly to gitweb 'blob_plain' format ([2]).
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230822125716.55295-1-philmd@linaro.org>
(cherry picked from commit 09a3fffae00b042bed8ad9c351b1a58c505fde37) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 22 Aug 2023 16:31:02 +0000 (17:31 +0100)]
accel/kvm: Specify default IPA size for arm64
Before this change, the default KVM type, which is used for non-virt
machine models, was 0.
The kernel documentation says:
> On arm64, the physical address size for a VM (IPA Size limit) is
> limited to 40bits by default. The limit can be configured if the host
> supports the extension KVM_CAP_ARM_VM_IPA_SIZE. When supported, use
> KVM_VM_TYPE_ARM_IPA_SIZE(IPA_Bits) to set the size in the machine type
> identifier, where IPA_Bits is the maximum width of any physical
> address used by the VM. The IPA_Bits is encoded in bits[7-0] of the
> machine type identifier.
>
> e.g, to configure a guest to use 48bit physical address size::
>
> vm_fd = ioctl(dev_fd, KVM_CREATE_VM, KVM_VM_TYPE_ARM_IPA_SIZE(48));
>
> The requested size (IPA_Bits) must be:
>
> == =========================================================
> 0 Implies default size, 40bits (for backward compatibility)
> N Implies N bits, where N is a positive integer such that,
> 32 <= N <= Host_IPA_Limit
> == =========================================================
> Host_IPA_Limit is the maximum possible value for IPA_Bits on the host
> and is dependent on the CPU capability and the kernel configuration.
> The limit can be retrieved using KVM_CAP_ARM_VM_IPA_SIZE of the
> KVM_CHECK_EXTENSION ioctl() at run-time.
>
> Creation of the VM will fail if the requested IPA size (whether it is
> implicit or explicit) is unsupported on the host.
https://docs.kernel.org/virt/kvm/api.html#kvm-create-vm
So if Host_IPA_Limit < 40, specifying 0 as the type will fail. This
actually confused libvirt, which uses "none" machine model to probe the
KVM availability, on M2 MacBook Air.
Fix this by using Host_IPA_Limit as the default type when
KVM_CAP_ARM_VM_IPA_SIZE is available.
Cc: qemu-stable@nongnu.org Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-3-akihiko.odaki@daynix.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 1ab445af8cd99343f29032b5944023ad7d8edebf) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Akihiko Odaki [Tue, 22 Aug 2023 16:31:02 +0000 (17:31 +0100)]
kvm: Introduce kvm_arch_get_default_type hook
kvm_arch_get_default_type() returns the default KVM type. This hook is
particularly useful to derive a KVM type that is valid for "none"
machine model, which is used by libvirt to probe the availability of
KVM.
For MIPS, the existing mips_kvm_type() is reused. This function ensures
the availability of VZ which is mandatory to use KVM on the current
QEMU.
Cc: qemu-stable@nongnu.org Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20230727073134.134102-2-akihiko.odaki@daynix.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added doc comment for new function] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 5e0d65909c6f335d578b90491e165440c99adf81) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
and the display stays black. When running QEMU with "-d guest_errors",
it shows an error message like this:
virtio_gpu_create_mapping_iov: nr_entries is too big (83886080 > 16384)
which indicates that this value has not been properly byte-swapped.
And indeed, the virtio_gpu_create_blob_bswap() function (that should
swap the fields in the related structure) fails to swap some of the
entries. After correctly swapping all missing values here, too, the
virtio-gpu device is now also working with blob=true on s390x hosts.
Unlike most other instructions that contain an immediate element index,
VREP's one is 16-bit, and not 4-bit. The code uses only 8 bits, so
using, e.g., 0x101 does not lead to a specification exception.
Fix by checking all 16 bits.
Cc: qemu-stable@nongnu.org Fixes: 28d08731b1d8 ("s390x/tcg: Implement VECTOR REPLICATE") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230807163459.849766-1-iii@linux.ibm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 23e87d419f347b6b5f4da3bf70d222acc24cdb64) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
hw/sd/sdhci: Do not force sdhci_mmio_*_ops onto all SD controllers
Since commit c0a55a0c9da2 "hw/sd/sdhci: Support big endian SD host controller
interfaces" sdhci_common_realize() forces all SD card controllers to use either
sdhci_mmio_le_ops or sdhci_mmio_be_ops, depending on the "endianness" property.
However, there are device models which use different MMIO ops: TYPE_IMX_USDHC
uses usdhc_mmio_ops and TYPE_S3C_SDHCI uses sdhci_s3c_mmio_ops.
Forcing sdhci_mmio_le_ops breaks SD card handling on the "sabrelite" board, for
example. Fix this by defaulting the io_ops to little endian and switch to big
endian in sdhci_common_realize() only if there is a matchig big endian variant
available.
Fixes: c0a55a0c9da2 ("hw/sd/sdhci: Support big endian SD host controller
interfaces")
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Tested-by: Guenter Roeck <linux@roeck-us.net>
Message-Id: <20230709080950.92489-1-shentey@gmail.com>
(cherry picked from commit 3b830790151ff231531ef2595793e387dd154efb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Luca Bonissi [Thu, 3 Aug 2023 00:15:57 +0000 (02:15 +0200)]
Fixed incorrect LLONG alignment for openrisc and cris
OpenRISC (or1k) has long long alignment to 4 bytes, but currently not
defined in abitypes.h. This lead to incorrect packing of /epoll_event/
structure and eventually infinite loop while waiting for file
descriptor[s] event[s].
Fixed also CRIS alignments (1 byte for all types).
Signed-off-by: Luca Bonissi <qemu@bonslack.org>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1770 Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 6ee960823da8fd780ae9912c4327b7e85e80d846) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
include/exec/user: Set ABI_LLONG_ALIGNMENT to 4 for nios2
Based on gcc's nios2.h setting BIGGEST_ALIGNMENT to 32 bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit ea9812d93f9c3e1a308ac33097021c50d581d10e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
include/exec/user: Set ABI_LLONG_ALIGNMENT to 4 for microblaze
Based on gcc's microblaze.h setting BIGGEST_ALIGNMENT to 32 bits.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit e73f27003e777fd9b77d13e71c5268015b8ed2b6) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Klaus Jensen [Wed, 19 Jul 2023 18:21:58 +0000 (20:21 +0200)]
hw/nvme: fix compliance issue wrt. iosqes/iocqes
As of prior to this patch, the controller checks the value of CC.IOCQES
and CC.IOSQES prior to enabling the controller. As reported by Ben in
GitLab issue #1691, this is not spec compliant. The controller should
only check these values when queues are created.
This patch moves these checks to nvme_create_cq(). We do not need to
check it in nvme_create_sq() since that will error out if the completion
queue is not already created.
Also, since the controller exclusively supports SQEs of size 64 bytes
and CQEs of size 16 bytes, hard code that.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1691 Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit 6a33f2e920ec0b489a77200888e3692664077f2d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Klaus Jensen [Thu, 3 Aug 2023 18:44:23 +0000 (20:44 +0200)]
hw/nvme: fix oob memory read in fdp events log
As reported by Trend Micro's Zero Day Initiative, an oob memory read
vulnerability exists in nvme_fdp_events(). The host-provided offset is
not verified.
Fix this.
This is only exploitable when Flexible Data Placement mode (fdp=on) is
enabled.
Fixes: CVE-2023-4135 Fixes: 73064edfb864 ("hw/nvme: flexible data placement emulation") Reported-by: Trend Micro's Zero Day Initiative Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
(cherry picked from commit ecb1b7b082d3b7dceff0e486a114502fc52c0fdf) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
dump: kdump-zlib data pages not dumped with pvtime/aarch64
The kdump-zlib data pages are not dumped from aarch64 host when the
'pvtime' is involved, that is, when the block->target_end is not aligned to
page_size. In the below example, it is expected to dump two blocks.
However, there is an issue with get_next_page() so that the pages for
"mach-virt.ram" will not be dumped.
At line 1296, although we have reached at the end of the 'pvtime' block,
since it is not aligned to the page_size (e.g., 0x10000), it will not break
at line 1298.
As a result, get_next_page() will continue to the next
block ("mach-virt.ram"). Finally, when get_next_page() returns to the
caller:
- 'pfnptr' is referring to the 'pvtime'
- but 'blockptr' is referring to the "mach-virt.ram"
When get_next_page() is called the next time, "*pfnptr += 1" still refers
to the prior 'pvtime'. It will exit immediately because it is out of the
range of the current "mach-virt.ram".
The fix is to break when it is time to come to the next block, so that both
'pfnptr' and 'blockptr' refer to the same block.
Fixes: 94d788408d2d ("dump: fix kdump to work over non-aligned blocks") Cc: Joe Jin <joe.jin@oracle.com> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-ID: <20230713055819.30497-1-dongli.zhang@oracle.com>
(cherry picked from commit 8a64609eea8cb2bac015968c4b62da5bce266e22) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Zhao Liu [Wed, 28 Jun 2023 13:54:37 +0000 (21:54 +0800)]
hw/smbios: Fix core count in type4
>From SMBIOS 3.0 specification, core count field means:
Core Count is the number of cores detected by the BIOS for this
processor socket. [1]
Before 003f230e37d7 ("machine: Tweak the order of topology members in
struct CpuTopology"), MachineState.smp.cores means "the number of cores
in one package", and it's correct to use smp.cores for core count.
But 003f230e37d7 changes the smp.cores' meaning to "the number of cores
in one die" and doesn't change the original smp.cores' use in smbios as
well, which makes core count in type4 go wrong.
Fix this issue with the correct "cores per socket" caculation.
[1] SMBIOS 3.0.0, section 7.5.6, Processor Information - Core Count
Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology") Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-5-zhao1.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 196ea60a734c346d7d75f1d89aa37703d4d854e7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Zhao Liu [Wed, 28 Jun 2023 13:54:36 +0000 (21:54 +0800)]
hw/smbios: Fix thread count in type4
>From SMBIOS 3.0 specification, thread count field means:
Thread Count is the total number of threads detected by the BIOS for
this processor socket. It is a processor-wide count, not a
thread-per-core count. [1]
So here we should use threads per socket other than threads per core.
[1] SMBIOS 3.0.0, section 7.5.8, Processor Information - Thread Count
Fixes: c97294ec1b9e ("SMBIOS: Build aggregate smbios tables and entry point") Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-4-zhao1.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7298fd7de5551c4501f54381228458e3c21cab4b) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Zhao Liu [Wed, 28 Jun 2023 13:54:35 +0000 (21:54 +0800)]
hw/smbios: Fix smbios_smp_sockets caculation
smp.sockets is the number of sockets which is configured by "-smp" (
otherwise, the default is 1). Trying to recalculate it here with another
rules leads to errors, such as:
1. 003f230e37d7 ("machine: Tweak the order of topology members in struct
CpuTopology") changes the meaning of smp.cores but doesn't fix
original smp.cores uses.
With the introduction of cluster, now smp.cores means the number of
cores in one cluster. So smp.cores * smp.threads just means the
threads in a cluster not in a socket.
2. On the other hand, we shouldn't use smp.cpus here because it
indicates the initial number of online CPUs at the boot time, and is
not mathematically related to smp.sockets.
So stop reinventing the another wheel and use the topo values that
has been calculated.
Fixes: 003f230e37d7 ("machine: Tweak the order of topology members in struct CpuTopology") Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-3-zhao1.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d79a284a44bb7d88b233fb6bb12ea3723f43469d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Zhao Liu [Wed, 28 Jun 2023 13:54:34 +0000 (21:54 +0800)]
machine: Add helpers to get cores/threads per socket
The number of cores/threads per socket are needed for smbios, and are
also useful for other modules.
Provide the helpers to wrap the calculation of cores/threads per socket
so that we can avoid calculation errors caused by other modules miss
topology changes.
Suggested-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
Message-Id: <20230628135437.1145805-2-zhao1.liu@linux.intel.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit a1d027be95bc375238e5b9292c6aa661a8ddef4c) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As lpc-hc is designed for re-entrant calls from xscom, mark it
re-entrancy safe.
Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
[clg: mark opb_master_regs as re-entrancy safe also ] Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com> Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230526073850.2772197-1-clg@kaod.org> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 76f9ebffcd41b62ae9ec26a1c25676f2ae1d9cc3) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As the code is designed for re-entrant calls to apic-msi, mark apic-msi
as reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-9-alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 50795ee051a342c681a9b45671c552fbd6274db8) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As the code is designed for re-entrant calls from raven_io_ops to
pci-conf, mark raven_io_ops as reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230427211013.2994127-8-alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 6dad5a6810d9c60ca320d01276f6133bbcfa1fc7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
bcm2835_property: disable reentrancy detection for iomem
As the code is designed for re-entrant calls from bcm2835_property to
bcm2835_mbox and back into bcm2835_property, mark iomem as
reentrancy-safe.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-7-alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 985c4a4e547afb9573b6bd6843d20eb2c3d1d1cd) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Where somedisk.qcow2 is an image that contains already some partitions
and file systems.
In the boot menu of Fedora, go to
"Troubleshooting" -> "Rescue a Fedora system" -> "3) Skip to shell"
Then check "dmesg | grep -i 53c" for failure messages, and try to mount
a partition from somedisk.qcow2.
Message-Id: <20230516090556.553813-1-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit d139fe9ad8a27bcc50b4ead77d2f97d191a0e95e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
hw: replace most qemu_bh_new calls with qemu_bh_new_guarded
This protects devices from bh->mmio reentrancy issues.
Thanks: Thomas Huth <thuth@redhat.com> for diagnosing OS X test failure. Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-5-alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit f63192b0544af5d3e4d5edfd85ab520fcf671377) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Advise authors to use the _guarded versions of the APIs, instead.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-4-alxndr@bu.edu> Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit ef56ffbdd6b0605dc1e305611287b948c970e236) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
async: Add an optional reentrancy guard to the BH API
Devices can pass their MemoryReentrancyGuard (from their DeviceState),
when creating new BHes. Then, the async API will toggle the guard
before/after calling the BH call-back. This prevents bh->mmio reentrancy
issues.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <20230427211013.2994127-3-alxndr@bu.edu>
[thuth: Fix "line over 90 characters" checkpatch.pl error] Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 9c86c97f12c060bf7484dd931f38634e166a81f0) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
This flag is set/checked prior to calling a device's MemoryRegion
handlers, and set when device code initiates DMA. The purpose of this
flag is to prevent two types of DMA-based reentrancy issues:
1.) mmio -> dma -> mmio case
2.) bh -> dma write -> mmio case
These issues have led to problems such as stack-exhaustion and
use-after-frees.
Summary of the problem from Peter Maydell:
https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com
Signed-off-by: Alexander Bulekov <alxndr@bu.edu> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-2-alxndr@bu.edu>
[thuth: Replace warn_report() with warn_report_once()] Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit a2e1753b8054344f32cf94f31c6399a58794a380) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Matt Borgerson [Thu, 13 Jul 2023 07:29:01 +0000 (00:29 -0700)]
target/i386: Check CR0.TS before enter_mmx
When CR0.TS=1, execution of x87 FPU, MMX, and some SSE instructions will
cause a Device Not Available (DNA) exception (#NM). System software uses
this exception event to lazily context switch FPU state.
Before this patch, enter_mmx helpers may be generated just before #NM
generation, prematurely resetting FPU state before the guest has a
chance to save it.
Signed-off-by: Matt Borgerson <contact@mborgerson.com>
Message-ID: <CADc=-s5F10muEhLs4f3mxqsEPAHWj0XFfOC2sfFMVHrk9fcpMg@mail.gmail.com> Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b2ea6450d8e1336a33eb958ccc64604bc35a43dd) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Sun, 30 Jul 2023 11:18:42 +0000 (21:18 +1000)]
target/ppc: Fix VRMA page size for ISA v3.0
Until v2.07s, the VRMA page size (L||LP) was encoded in LPCR[VRMASD].
In v3.0 that moved to the partition table PS field.
The powernv machine can now run KVM HPT guests on POWER9/10 CPUs with
this fix and the patch to add ASDR.
Fixes: 3367c62f522b ("target/ppc: Support for POWER9 native hash") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230730111842.39292-1-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 0e2a3ec36885f6d79a96230f582d4455878c6373) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Wed, 26 Jul 2023 18:22:27 +0000 (04:22 +1000)]
target/ppc: Fix pending HDEC when entering PM state
HDEC is defined to not wake from PM state. There is a check in the HDEC
timer to avoid setting the interrupt if we are in a PM state, but no
check on PM entry to lower HDEC if it already fired. This can cause a
HDECR wake up and QEMU abort with unsupported exception in Power Save
mode.
Fixes: 4b236b621bf ("ppc: Initial HDEC support") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230726182230.433945-4-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 9915dac4847f3cc5ffd36e4c374a4eec83fe09b5) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Nicholas Piggin [Wed, 26 Jul 2023 18:22:25 +0000 (04:22 +1000)]
target/ppc: Implement ASDR register for ISA v3.0 for HPT
The ASDR register was introduced in ISA v3.0. It has not been
implemented for HPT. With HPT, ASDR is the format of the slbmte RS
operand (containing VSID), which matches the ppc_slb_t field.
Fixes: 3367c62f522b ("target/ppc: Support for POWER9 native hash") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230726182230.433945-2-npiggin@gmail.com> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
(cherry picked from commit 9201af096962a1967ce5d0b270ed16ae4edd3db6) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mq()
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."
Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.
Yet the problem is that, vhost_vdpa_net_load_mq() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.
This patch fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.
Fixes: f64c7cda69 ("vdpa: Add vhost_vdpa_net_load_mq") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <ec515ebb0b4f56368751b9e318e245a5d994fa72.1688438055.git.yin31149@gmail.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f45fd95ec9e8104f6af801c734375029dda0f542) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
vdpa: Return -EIO if device ack is VIRTIO_NET_ERR in _load_mac()
According to VirtIO standard, "The class, command and
command-specific-data are set by the driver,
and the device sets the ack byte.
There is little it can do except issue a diagnostic
if ack is not VIRTIO_NET_OK."
Therefore, QEMU should stop sending the queued SVQ commands and
cancel the device startup if the device's ack is not VIRTIO_NET_OK.
Yet the problem is that, vhost_vdpa_net_load_mac() returns 1 based on
`*s->status != VIRTIO_NET_OK` when the device's ack is VIRTIO_NET_ERR.
As a result, net->nc->info->load() also returns 1, this makes
vhost_net_start_one() incorrectly assume the device state is
successfully loaded by vhost_vdpa_net_load() and return 0, instead of
goto `fail` label to cancel the device startup, as vhost_net_start_one()
only cancels the device startup when net->nc->info->load() returns a
negative value.
This patch fixes this problem by returning -EIO when the device's
ack is not VIRTIO_NET_OK.
Fixes: f73c0c43ac ("vdpa: extract vhost_vdpa_net_load_mac from vhost_vdpa_net_load") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com> Acked-by: Jason Wang <jasowang@redhat.com> Acked-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <a21731518644abbd0c495c5b7960527c5911f80d.1688438055.git.yin31149@gmail.com> Tested-by: Lei Yang <leiyang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b479bc3c9d5e473553137641fd31069c251f0d6e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
vdpa: Fix possible use-after-free for VirtQueueElement
QEMU uses vhost_handle_guest_kick() to forward guest's available
buffers to the vdpa device in SVQ avail ring.
In vhost_handle_guest_kick(), a `g_autofree` `elem` is used to
iterate through the available VirtQueueElements. This `elem` is
then passed to `svq->ops->avail_handler`, specifically to the
vhost_vdpa_net_handle_ctrl_avail(). If this handler fails to
process the CVQ command, vhost_handle_guest_kick() regains
ownership of the `elem`, and either frees it or requeues it.
Yet the problem is that, vhost_vdpa_net_handle_ctrl_avail()
mistakenly frees the `elem`, even if it fails to forward the
CVQ command to vdpa device. This can result in a use-after-free
for the `elem` in vhost_handle_guest_kick().
This patch solves this problem by refactoring
vhost_vdpa_net_handle_ctrl_avail() to only freeing the `elem` if
it owns it.
Fixes: bd907ae4b0 ("vdpa: manual forward CVQ buffers") Signed-off-by: Hawkins Jiawei <yin31149@gmail.com>
Message-Id: <e3f2d7db477734afe5c6a5ab3fa8b8317514ea34.1688746840.git.yin31149@gmail.com> Reviewed-by: Eugenio Pérez <eperezma@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 031b1abacbdb3f4e016b6b926f7e7876c05339bb) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Wed, 2 Aug 2023 13:57:23 +0000 (15:57 +0200)]
include/hw/i386/x86-iommu: Fix struct X86IOMMU_MSIMessage for big endian hosts
The first bitfield here is supposed to be used as a 64-bit equivalent
to the "uint64_t msi_addr" in the union. To make this work correctly
on big endian hosts, too, the __addr_hi field has to be part of the
bitfield, and the the bitfield members must be declared with "uint64_t"
instead of "uint32_t" - otherwise the values are placed in the wrong
bytes on big endian hosts.
Same applies to the 32-bit "msi_data" field: __resved1 must be part
of the bitfield, and the members must be declared with "uint32_t"
instead of "uint16_t".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-7-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit e1e56c07d1fa24aa37a7e89e6633768fc8ea8705) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Wed, 2 Aug 2023 13:57:22 +0000 (15:57 +0200)]
hw/i386/x86-iommu: Fix endianness issue in x86_iommu_irq_to_msi_message()
The values in "msg" are assembled in host endian byte order (the other
field are also not swapped), so we must not swap the __addr_head here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-6-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 37cf5cecb039a063c0abe3b51ae30f969e73aa84) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Wed, 2 Aug 2023 13:57:21 +0000 (15:57 +0200)]
hw/i386/intel_iommu: Fix index calculation in vtd_interrupt_remap_msi()
The values in "addr" are populated locally in this function in host
endian byte order, so we must not swap the index_l field here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-5-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit fcd8027423300b201b37842b88393dc5c6c8ee9e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Wed, 2 Aug 2023 13:57:20 +0000 (15:57 +0200)]
hw/i386/intel_iommu: Fix struct VTDInvDescIEC on big endian hosts
On big endian hosts, we need to reverse the bitfield order in the
struct VTDInvDescIEC, just like it is already done for the other
bitfields in the various structs of the intel-iommu device.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-4-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 4572b22cf9ba432fa3955686853c706a1821bbc7) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thomas Huth [Wed, 2 Aug 2023 13:57:19 +0000 (15:57 +0200)]
hw/i386/intel_iommu: Fix endianness problems related to VTD_IR_TableEntry
The code already tries to do some endianness handling here, but
currently fails badly:
- While it already swaps the data when logging errors / tracing, it fails
to byteswap the value before e.g. accessing entry->irte.present
- entry->irte.source_id is swapped with le32_to_cpu(), though this is
a 16-bit value
- The whole union is apparently supposed to be swapped via the 64-bit
data[2] array, but the struct is a mixture between 32 bit values
(the first 8 bytes) and 64 bit values (the second 8 bytes), so this
cannot work as expected.
Fix it by converting the struct to two proper 64-bit bitfields, and
by swapping the values only once for everybody right after reading
the data from memory.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-3-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit 642ba89672279fbdd14016a90da239c85e845d18) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
After reading the guest memory with dma_memory_read(), we have
to make sure that we byteswap the little endian data to the host's
byte order.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230802135723.178083-2-thuth@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com>
(cherry picked from commit cc2a08480e19007c05be8fe5b6893e20448954dc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
pci: do not respond config requests after PCI device eject
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2224964
In migration with VF failover, Windows guest and ACPI hot
unplug we do not need to satisfy config requests, otherwise
the guest immediately detects the device and brings up its
driver. Many network VF's are stuck on the guest PCI bus after
the migration.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Message-Id: <20230728084049.191454-1-yuri.benditovich@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 348e354417b64c484877354ee7cc66f29fa6c7df) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
target/hppa: Move iaoq registers and thus reduce generated code size
On hppa the Instruction Address Offset Queue (IAOQ) registers specifies
the next to-be-executed instructions addresses. Each generated TB writes those
registers at least once, so those registers are used heavily in generated
code.
Looking at the generated assembly, for a x86-64 host this code
to write the address $0x7ffe826f into iaoq_f is generated:
0x7f73e8000184: c7 85 d4 01 00 00 6f 82 movl $0x7ffe826f, 0x1d4(%rbp)
0x7f73e800018c: fe 7f
0x7f73e800018e: c7 85 d8 01 00 00 73 82 movl $0x7ffe8273, 0x1d8(%rbp)
0x7f73e8000196: fe 7f
With the trivial change, by moving the variables iaoq_f and iaoq_b to
the top of struct CPUArchState, the offset to %rbp is reduced (from
0x1d4 to 0), which allows the x86-64 tcg to generate 3 bytes less of
generated code per move instruction:
0x7fc1e800018c: c7 45 00 6f 82 fe 7f movl $0x7ffe826f, (%rbp)
0x7fc1e8000193: c7 45 04 73 82 fe 7f movl $0x7ffe8273, 4(%rbp)
Overall this is a reduction of generated code (not a reduction of
number of instructions).
A test run with checks the generated code size by running "/bin/ls"
with qemu-user shows that the code size shrinks from 1616767 to 1569273
bytes, which is ~97% of the former size.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Helge Deller <deller@gmx.de> Cc: qemu-stable@nongnu.org
(cherry picked from commit f8c0fd9804f435a20c3baa4c0c77ba9a02af24ef) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
zhenwei pi [Thu, 3 Aug 2023 02:43:13 +0000 (10:43 +0800)]
virtio-crypto: verify src&dst buffer length for sym request
For symmetric algorithms, the length of ciphertext must be as same
as the plaintext.
The missing verification of the src_len and the dst_len in
virtio_crypto_sym_op_helper() may lead buffer overflow/divulged.
This patch is originally written by Yiming Tao for QEMU-SECURITY,
resend it(a few changes of error message) in qemu-devel.
Fixes: CVE-2023-3180 Fixes: 04b9b37edda("virtio-crypto: add data queue processing handler") Cc: Gonglei <arei.gonglei@huawei.com> Cc: Mauro Matteo Cascella <mcascell@redhat.com> Cc: Yiming Tao <taoym@zju.edu.cn> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Message-Id: <20230803024314.29962-2-pizhenwei@bytedance.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9d38a8434721a6479fe03fb5afb150ca793d3980) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Li Feng [Mon, 31 Jul 2023 12:10:06 +0000 (20:10 +0800)]
vhost: fix the fd leak
When the vhost-user reconnect to the backend, the notifer should be
cleanup. Otherwise, the fd resource will be exhausted.
Fixes: f9a09ca3ea ("vhost: add support for configure interrupt") Signed-off-by: Li Feng <fengli@smartx.com> Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20230731121018.2856310-2-fengli@smartx.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Tested-by: Fiona Ebner <f.ebner@proxmox.com>
(cherry picked from commit 18f2971ce403008d5e1c2875b483c9d1778143dc) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Eric Auger [Mon, 17 Jul 2023 16:21:26 +0000 (18:21 +0200)]
hw/virtio-iommu: Fix potential OOB access in virtio_iommu_handle_command()
In the virtio_iommu_handle_command() when a PROBE request is handled,
output_size takes a value greater than the tail size and on a subsequent
iteration we can get a stack out-of-band access. Initialize the
output_size on each iteration.
The issue was found with ASAN. Credits to:
Yiming Tao(Zhejiang University)
Gaoning Pan(Zhejiang University)
Fixes: 1733eebb9e7 ("virtio-iommu: Implement RESV_MEM probe request") Signed-off-by: Eric Auger <eric.auger@redhat.com> Reported-by: Mauro Matteo Cascella <mcascell@redhat.com> Cc: qemu-stable@nongnu.org
Message-Id: <20230717162126.11693-1-eric.auger@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cf2f89edf36a59183166ae8721a8d7ab5cd286bd) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.
Cc: qemu-stable@nongnu.org Fixes: 950272506d ("target/m68k: Use semihosting/syscalls.h") Reported-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230801154519.3505531-1-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 8caaae7319a5f7ca449900c0e6bfcaed78fa3ae2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The arguments for deposit64 are (value, start, length, fieldval); this
appears to have thought they were (value, fieldval, start,
length). Reorder the parameters to match the actual function.
Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Fixes: d1e23cbaa403b2d ("target/nios2: Use semihosting/syscalls.h") Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230731235245.295513-1-keithp@keithp.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit 71e2dd6aa1bdbac19c661638a4ae91816002ac9e) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Keith Packard [Tue, 1 Aug 2023 15:22:45 +0000 (08:22 -0700)]
target/nios2: Pass semihosting arg to exit
Instead of using R_ARG0 (the semihost function number), use R_ARG1
(the provided exit status).
Signed-off-by: Keith Packard <keithp@keithp.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230801152245.332749-1-keithp@keithp.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit c11d5bdae79a8edaf00dfcb2e49c064a50c67671) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
David Woodhouse [Fri, 7 Apr 2023 15:12:00 +0000 (17:12 +0200)]
hw/xen: fix off-by-one in xen_evtchn_set_gsi()
Coverity points out (CID 1508128) a bounds checking error. We need to check
for gsi >= IOAPIC_NUM_PINS, not just greater-than.
Also fix up an assert() that has the same problem, that Coverity didn't see.
Fixes: 4f81baa33ed6 ("hw/xen: Support GSI mapping to PIRQ") Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230801175747.145906-2-dwmw2@infradead.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
(cherry picked from commit cf885b19579646d6a085470658bc83432d6786d2) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
io: remove io watch if TLS channel is closed during handshake
The TLS handshake make take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.
CVE-2023-3354 Reported-by: jiangyegen <jiangyegen@huawei.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
(cherry picked from commit 10be627d2b5ec2d6b3dce045144aa739eef678b4) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Anthony PERARD [Tue, 4 Jul 2023 17:18:19 +0000 (18:18 +0100)]
xen-block: Avoid leaks on new error path
Commit 189829399070 ("xen-block: Use specific blockdev driver")
introduced a new error path, without taking care of allocated
resources.
So only allocate the qdicts after the error check, and free both
`filename` and `driver` when we are about to return and thus taking
care of both success and error path.
Coverity only spotted the leak of qdicts (*_layer variables).
Reported-by: Peter Maydell <peter.maydell@linaro.org> Fixes: Coverity CID 1508722, 1398649 Fixes: 189829399070 ("xen-block: Use specific blockdev driver") Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230704171819.42564-1-anthony.perard@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit aa36243514a777f76c8b8a19b1f8a71f27ec6c78) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Anthony PERARD [Fri, 14 Jul 2023 15:27:20 +0000 (16:27 +0100)]
thread-pool: signal "request_cond" while locked
thread_pool_free() might have been called on the `pool`, which would
be a reason for worker_thread() to quit. In this case,
`pool->request_cond` is been destroyed.
If worker_thread() didn't managed to signal `request_cond` before it
been destroyed by thread_pool_free(), we got:
util/qemu-thread-posix.c:198: qemu_cond_signal: Assertion `cond->initialized' failed.
One backtrace:
__GI___assert_fail (assertion=0x55555614abcb "cond->initialized", file=0x55555614ab88 "util/qemu-thread-posix.c", line=198,
function=0x55555614ad80 <__PRETTY_FUNCTION__.17104> "qemu_cond_signal") at assert.c:101
qemu_cond_signal (cond=0x7fffb800db30) at util/qemu-thread-posix.c:198
worker_thread (opaque=0x7fffb800dab0) at util/thread-pool.c:129
qemu_thread_start (args=0x7fffb8000b20) at util/qemu-thread-posix.c:505
start_thread (arg=<optimized out>) at pthread_create.c:486
linux-user/armeb: Fix __kernel_cmpxchg() for armeb
Commit 7f4f0d9ea870 ("linux-user/arm: Implement __kernel_cmpxchg with host
atomics") switched to use qatomic_cmpxchg() to swap a word with the memory
content, but missed to endianess-swap the oldval and newval values when
emulating an armeb CPU, which expects words to be stored in big endian in
the guest memory.
The bug can be verified with qemu >= v7.0 on any little-endian host, when
starting the armeb binary of the upx program, which just hangs without
this patch.
Cc: qemu-stable@nongnu.org Signed-off-by: Helge Deller <deller@gmx.de> Reported-by: "Markus F.X.J. Oberhumer" <markus@oberhumer.com> Reported-by: John Reiser <jreiser@BitWagon.com> Closes: https://github.com/upx/upx/issues/687
Message-Id: <ZMQVnqY+F+5sTNFd@p100> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
(cherry picked from commit 38dd78c41eaf08b490c9e7ec68fc508bbaa5cb1d) Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>