If reserved_va, then we have already reserved the entire
guest virtual address space; no need to remap page.
If !reserved_va, then use MAP_FIXED_NOREPLACE.
linux-user: Remove qemu_host_page_size from create_elf_tables
AT_PAGESZ is supposed to advertise the guest page size.
The random adjustment made here using qemu_host_page_size
does not match anything else within linux-user.
The idea here is good, but should be done more systemically
via adjustment to TARGET_PAGE_SIZE.
Jonathan Cameron [Mon, 19 Feb 2024 17:31:53 +0000 (17:31 +0000)]
tcg: Avoid double lock if page tables happen to be in mmio memory.
On i386, after fixing the page walking code to work with pages in
MMIO memory (specifically CXL emulated interleaved memory),
a crash was seen in an interrupt handling path.
Useful part of backtrace
7 0x0000555555ab1929 in bql_lock_impl (file=0x555556049122 "../../accel/tcg/cputlb.c", line=2033) at ../../system/cpus.c:524
8 bql_lock_impl (file=file@entry=0x555556049122 "../../accel/tcg/cputlb.c", line=line@entry=2033) at ../../system/cpus.c:520
9 0x0000555555c9f7d6 in do_ld_mmio_beN (cpu=0x5555578e0cb0, full=0x7ffe88012950, ret_be=ret_be@entry=0, addr=19595792376, size=size@entry=8, mmu_idx=4, type=MMU_DATA_LOAD, ra=0) at ../../accel/tcg/cputlb.c:2033
10 0x0000555555ca0fbd in do_ld_8 (cpu=cpu@entry=0x5555578e0cb0, p=p@entry=0x7ffff4efd1d0, mmu_idx=<optimized out>, type=type@entry=MMU_DATA_LOAD, memop=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/cputlb.c:2356
11 0x0000555555ca341f in do_ld8_mmu (cpu=cpu@entry=0x5555578e0cb0, addr=addr@entry=19595792376, oi=oi@entry=52, ra=0, ra@entry=52, access_type=access_type@entry=MMU_DATA_LOAD) at ../../accel/tcg/cputlb.c:2439
12 0x0000555555ca5f59 in cpu_ldq_mmu (ra=52, oi=52, addr=19595792376, env=0x5555578e3470) at ../../accel/tcg/ldst_common.c.inc:169
13 cpu_ldq_le_mmuidx_ra (env=0x5555578e3470, addr=19595792376, mmu_idx=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/ldst_common.c.inc:301
14 0x0000555555b4b5fc in ptw_ldq (ra=0, in=0x7ffff4efd320) at ../../target/i386/tcg/sysemu/excp_helper.c:98
15 ptw_ldq (ra=0, in=0x7ffff4efd320) at ../../target/i386/tcg/sysemu/excp_helper.c:93
16 mmu_translate (env=env@entry=0x5555578e3470, in=0x7ffff4efd3e0, out=0x7ffff4efd3b0, err=err@entry=0x7ffff4efd3c0, ra=ra@entry=0) at ../../target/i386/tcg/sysemu/excp_helper.c:174
17 0x0000555555b4c4b3 in get_physical_address (ra=0, err=0x7ffff4efd3c0, out=0x7ffff4efd3b0, mmu_idx=0, access_type=MMU_DATA_LOAD, addr=18446741874686299840, env=0x5555578e3470) at ../../target/i386/tcg/sysemu/excp_helper.c:580
18 x86_cpu_tlb_fill (cs=0x5555578e0cb0, addr=18446741874686299840, size=<optimized out>, access_type=MMU_DATA_LOAD, mmu_idx=0, probe=<optimized out>, retaddr=0) at ../../target/i386/tcg/sysemu/excp_helper.c:606
19 0x0000555555ca0ee9 in tlb_fill (retaddr=0, mmu_idx=0, access_type=MMU_DATA_LOAD, size=<optimized out>, addr=18446741874686299840, cpu=0x7ffff4efd540) at ../../accel/tcg/cputlb.c:1315
20 mmu_lookup1 (cpu=cpu@entry=0x5555578e0cb0, data=data@entry=0x7ffff4efd540, mmu_idx=0, access_type=access_type@entry=MMU_DATA_LOAD, ra=ra@entry=0) at ../../accel/tcg/cputlb.c:1713
21 0x0000555555ca2c61 in mmu_lookup (cpu=cpu@entry=0x5555578e0cb0, addr=addr@entry=18446741874686299840, oi=oi@entry=32, ra=ra@entry=0, type=type@entry=MMU_DATA_LOAD, l=l@entry=0x7ffff4efd540) at ../../accel/tcg/cputlb.c:1803
22 0x0000555555ca3165 in do_ld4_mmu (cpu=cpu@entry=0x5555578e0cb0, addr=addr@entry=18446741874686299840, oi=oi@entry=32, ra=ra@entry=0, access_type=access_type@entry=MMU_DATA_LOAD) at ../../accel/tcg/cputlb.c:2416
23 0x0000555555ca5ef9 in cpu_ldl_mmu (ra=0, oi=32, addr=18446741874686299840, env=0x5555578e3470) at ../../accel/tcg/ldst_common.c.inc:158
24 cpu_ldl_le_mmuidx_ra (env=env@entry=0x5555578e3470, addr=addr@entry=18446741874686299840, mmu_idx=<optimized out>, ra=ra@entry=0) at ../../accel/tcg/ldst_common.c.inc:294
25 0x0000555555bb6cdd in do_interrupt64 (is_hw=1, next_eip=18446744072399775809, error_code=0, is_int=0, intno=236, env=0x5555578e3470) at ../../target/i386/tcg/seg_helper.c:889
26 do_interrupt_all (cpu=cpu@entry=0x5555578e0cb0, intno=236, is_int=is_int@entry=0, error_code=error_code@entry=0, next_eip=next_eip@entry=0, is_hw=is_hw@entry=1) at ../../target/i386/tcg/seg_helper.c:1130
27 0x0000555555bb87da in do_interrupt_x86_hardirq (env=env@entry=0x5555578e3470, intno=<optimized out>, is_hw=is_hw@entry=1) at ../../target/i386/tcg/seg_helper.c:1162
28 0x0000555555b5039c in x86_cpu_exec_interrupt (cs=0x5555578e0cb0, interrupt_request=<optimized out>) at ../../target/i386/tcg/sysemu/seg_helper.c:197
29 0x0000555555c94480 in cpu_handle_interrupt (last_tb=<synthetic pointer>, cpu=0x5555578e0cb0) at ../../accel/tcg/cpu-exec.c:844
Peter identified this as being due to the BQL already being
held when the page table walker encounters MMIO memory and attempts
to take the lock again. There are other examples of similar paths
TCG, so this follows the approach taken in those of simply checking
if the lock is already held and if it is, don't take it again.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Message-Id: <20240219173153.12114-4-Jonathan.Cameron@huawei.com>
[rth: Use BQL_LOCK_GUARD] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Peter Maydell [Mon, 19 Feb 2024 17:31:51 +0000 (17:31 +0000)]
accel/tcg: Set can_do_io at at start of lookup_tb_ptr helper
If a page table is in IO memory and lookup_tb_ptr probes
the TLB it can result in a page table walk for the instruction
fetch. If this hits IO memory and io_prepare falsely assumes
it needs to do a TLB recompile.
Avoid that by setting can_do_io at the start of lookup_tb_ptr.
tcg/aarch64: Apple does not align __int128_t in even registers
From https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms
When passing an argument with 16-byte alignment in integer registers,
Apple platforms allow the argument to start in an odd-numbered xN
register. The standard ABI requires it to begin in an even-numbered
xN register.
Cc: qemu-stable@nongnu.org Fixes: 5427a9a7604 ("tcg: Add TCG_TARGET_CALL_{RET,ARG}_I128")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2169 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <9fc0c2c7-dd57-459e-aecb-528edb74b4a7@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
linux-user/elfload: Write corefile elf header in one block
Fixes a bug in which write_note() wrote namesz_rounded
and datasz_rounded bytes, even though name and data
pointers contain only the unrounded number of bytes.
Instead of many small writes, allocate a block to contain all
of the elf headers and all of the notes. Copy the data into the
block piecemeal and the write it to the file as a chunk.
This also avoids the need to lseek forward for alignment.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Swap the ordering of vma_init and open. This will be necessary
for further changes, and adjusts the error cleanup path. Narrow
the scope of corefile, as the variable can be freed immediately
after use in open().
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
linux-user/elfload: Tidy fill_note_info and struct elf_note_info
In fill_note_info, there were unnecessary checks for
success of g_new/g_malloc. But these structures do not
need to be dyamically allocated at all, and can in fact
be statically allocated within the parent structure.
This removes all error paths from fill_note_info, so
change the return type to void.
Change type of signr to match both caller (elf_core_dump)
and callee (fill_prstatus), which both use int for signr.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Peter Maydell [Wed, 28 Feb 2024 17:27:10 +0000 (17:27 +0000)]
Merge tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu into staging
Migration pull request
- Fabiano's fixed-ram patches (1-5 only)
- Peter's cleanups on multifd tls IOC referencing
- Steve's cpr patches for vfio (migration patches only)
- Fabiano's fix on mbps stats racing with COMPLETE state
- Fabiano's fix on return path thread hang
# -----BEGIN PGP SIGNATURE-----
#
# iIcEABYKADAWIQS5GE3CDMRX2s990ak7X8zN86vXBgUCZd7AbhIccGV0ZXJ4QHJl
# ZGhhdC5jb20ACgkQO1/MzfOr1wbg0gDyA3Vg3pIqCJ+u+hLZ+QKxY/pnu8Y5kF+E
# HK2IdslQUQD+OX4ATUnl+CGMiVX9fjs1fKx0Z0Qetq8gC1YJF13yuA0=
# =P2QF
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Feb 2024 05:11:10 GMT
# gpg: using EDDSA key B9184DC20CC457DACF7DD1A93B5FCCCDF3ABD706
# gpg: issuer "peterx@redhat.com"
# gpg: Good signature from "Peter Xu <xzpeter@gmail.com>" [marginal]
# gpg: aka "Peter Xu <peterx@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B918 4DC2 0CC4 57DA CF7D D1A9 3B5F CCCD F3AB D706
* tag 'migration-next-pull-request' of https://gitlab.com/peterx/qemu: (25 commits)
migration: Use migrate_has_error() in close_return_path_on_source()
migration: Join the return path thread before releasing to_dst_file
migration: Fix qmp_query_migrate mbps value
migration: options incompatible with cpr
migration: update cpr-reboot description
migration: stop vm for cpr
migration: notifier error checking
migration: refactor migrate_fd_connect failures
migration: per-mode notifiers
migration: MigrationNotifyFunc
migration: remove postcopy_after_devices
migration: MigrationEvent for notifiers
migration: convert to NotifierWithReturn
migration: remove error from notifier data
notify: pass error to notifier with return
migration/multifd: Drop unnecessary helper to destroy IOC
migration/multifd: Cleanup outgoing_args in state destroy
migration/multifd: Make multifd_channel_connect() return void
migration/multifd: Drop registered_yank
migration/multifd: Cleanup TLS iochannel referencing
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
ide, vl: turn -win2k-hack into a property on IDE devices
ide: collapse parameters to ide_init_drive
target/i386: leave the A20 bit set in the final NPT walk
target/i386: remove unnecessary/wrong application of the A20 mask
target/i386: Fix physical address truncation
target/i386: use separate MMU indexes for 32-bit accesses
target/i386: introduce function to query MMU indices
target/i386: check validity of VMCB addresses
target/i386: mask high bits of CR3 in 32-bit mode
vl, pc: turn -no-fd-bootchk into a machine property
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 28 Feb 2024 14:23:07 +0000 (14:23 +0000)]
Merge tag 'pull-maintainer-updates-280224-1' of https://gitlab.com/stsquad/qemu into staging
Testing, gdbstub and plugin updates:
- fix some test/tcg license headers to GPLv2+
- bump up check-tcg timeout to 120s
- avoid re-building VM images too often
- update OpenBSD to 7.4
- use GDBFeature to build gdbstub XML
- unify plugin vcpu count under qemu_plugin_num_vcpus
- avoid spurious idle/resume callbacks on new vCPUs
- ensure nios2-linux-user processes async work
- call vcpu_init plugin callback through async work
- define plugin helpers when registers being read
- add plugin API for reading register values
- add support for register tracking to execlog
- update plugin docs with assumptions
- mention plugins can trigger tb_flush in mttcg design doc
* tag 'pull-maintainer-updates-280224-1' of https://gitlab.com/stsquad/qemu: (29 commits)
docs/devel: plugins can trigger a tb flush
docs/devel: document some plugin assumptions
docs/devel: lift example and plugin API sections up
contrib/plugins: extend execlog to track register changes
contrib/plugins: fix imatch
tests/tcg: expand insn test case to exercise register API
plugins: add an API to read registers
plugins: create CPUPluginState and migrate plugin_mask
gdbstub: expose api to find registers
plugins: Use different helpers when reading registers
cpu: call plugin init hook asynchronously
linux-user: ensure nios2 processes queued work
plugins: fix order of init/idle/resume callback
plugins: add qemu_plugin_num_vcpus function
plugins: remove previous n_vcpus functions from API
gdbstub: Add members to identify registers to GDBFeature
hw/core/cpu: Remove gdb_get_dynamic_xml member
gdbstub: Infer number of core registers from XML
gdbstub: Simplify XML lookup
gdbstub: Change gdb_get_reg_cb and gdb_set_reg_cb
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Alex Bennée [Tue, 27 Feb 2024 14:43:34 +0000 (14:43 +0000)]
docs/devel: document some plugin assumptions
While we attempt to hide implementation details from the plugin we
shouldn't be totally obtuse. Let the user know what they can and can't
expect with the various instrumentation options.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-29-alex.bennee@linaro.org>
Alex Bennée [Tue, 27 Feb 2024 14:43:32 +0000 (14:43 +0000)]
contrib/plugins: extend execlog to track register changes
With the new plugin register API we can now track changes to register
values. Currently the implementation is fairly dumb which will slow
down if a large number of register values are being tracked. This
could be improved by only instrumenting instructions which mention
registers we are interested in tracking.
will display in the execlog any changes to the stack pointer (sp) and
the SVE Z registers.
As testing registers every instruction will be quite a heavy operation
there is an additional flag which attempts to optimise the register
tracking by only instrumenting instructions which are likely to change
its value. This relies on the QEMU disassembler showing up the register
names in disassembly so is an explicit opt-in.
Alex Bennée [Tue, 27 Feb 2024 14:43:31 +0000 (14:43 +0000)]
contrib/plugins: fix imatch
We can't directly save the ephemeral imatch from argv as that memory
will get recycled.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-26-alex.bennee@linaro.org>
Alex Bennée [Tue, 27 Feb 2024 14:43:29 +0000 (14:43 +0000)]
plugins: add an API to read registers
We can only request a list of registers once the vCPU has been
initialised so the user needs to use either call the get function on
vCPU initialisation or during the translation phase.
We don't expose the reg number to the plugin instead hiding it behind
an opaque handle. For now this is just the gdb_regnum encapsulated in
an anonymous GPOINTER but in future as we add more state for plugins
to track we can expand it.
Alex Bennée [Tue, 27 Feb 2024 14:43:28 +0000 (14:43 +0000)]
plugins: create CPUPluginState and migrate plugin_mask
As we expand the per-vCPU data for plugins we don't want to pollute
CPUState. For now this just moves the plugin_mask (renamed to
event_mask) as the memory callbacks are accessed directly by TCG
generated code.
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-23-alex.bennee@linaro.org>
Alex Bennée [Tue, 27 Feb 2024 14:43:24 +0000 (14:43 +0000)]
linux-user: ensure nios2 processes queued work
While async processes are rare for linux-user we do use them from time
to time. The most obvious one is tb_flush when we run out of
translation space. We will also need this when we move plugin
vcpu_init to an async task.
Fix nios2 to follow its older, wiser and more stable siblings.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-19-alex.bennee@linaro.org>
Pierrick Bouvier [Tue, 27 Feb 2024 14:43:23 +0000 (14:43 +0000)]
plugins: fix order of init/idle/resume callback
We found that vcpu_init_hook was called *after* idle callback.
vcpu_init is called from cpu_realize_fn, while idle/resume cb are called
from qemu_wait_io_event (in vcpu thread).
This change ensures we only call idle and resume cb only once a plugin
was init for a given vcpu.
Next change in the series will run vcpu_init asynchronously, which will
make it run *after* resume callback as well. So we fix this now.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240213094009.150349-4-pierrick.bouvier@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-18-alex.bennee@linaro.org>
Pierrick Bouvier [Tue, 27 Feb 2024 14:43:22 +0000 (14:43 +0000)]
plugins: add qemu_plugin_num_vcpus function
We now keep track of how many vcpus were started. This way, a plugin can
easily query number of any vcpus at any point of execution, which
unifies user and system mode workflows.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240213094009.150349-3-pierrick.bouvier@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-17-alex.bennee@linaro.org>
Akihiko Odaki [Tue, 27 Feb 2024 14:43:20 +0000 (14:43 +0000)]
gdbstub: Add members to identify registers to GDBFeature
These members will be used to help plugins to identify registers.
The added members in instances of GDBFeature dynamically generated by
CPUs will be filled in later changes.
Akihiko Odaki [Tue, 27 Feb 2024 14:43:17 +0000 (14:43 +0000)]
gdbstub: Simplify XML lookup
Now we know all instances of GDBFeature that is used in CPU so we can
traverse them to find XML. This removes the need for a CPU-specific
lookup function for dynamic XMLs.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231213-gdb-v17-7-777047380591@daynix.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-12-alex.bennee@linaro.org>
Akihiko Odaki [Tue, 27 Feb 2024 14:43:16 +0000 (14:43 +0000)]
gdbstub: Change gdb_get_reg_cb and gdb_set_reg_cb
Align the parameters of gdb_get_reg_cb and gdb_set_reg_cb with the
gdb_read_register and gdb_write_register members of CPUClass to allow
to unify the logic to access registers of the core and coprocessors
in the future.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231213-gdb-v17-6-777047380591@daynix.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-11-alex.bennee@linaro.org>
Akihiko Odaki [Tue, 27 Feb 2024 14:43:14 +0000 (14:43 +0000)]
gdbstub: Use GDBFeature for gdb_register_coprocessor
This is a tree-wide change to introduce GDBFeature parameter to
gdb_register_coprocessor(). The new parameter just replaces num_regs
and xml parameters for now. GDBFeature will be utilized to simplify XML
lookup in a following change.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231213-gdb-v17-4-777047380591@daynix.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-9-alex.bennee@linaro.org>
Akihiko Odaki [Tue, 27 Feb 2024 14:43:13 +0000 (14:43 +0000)]
target/riscv: Use GDBFeature for dynamic XML
In preparation for a change to use GDBFeature as a parameter of
gdb_register_coprocessor(), convert the internal representation of
dynamic feature from plain XML to GDBFeature.
Akihiko Odaki [Tue, 27 Feb 2024 14:43:12 +0000 (14:43 +0000)]
target/ppc: Use GDBFeature for dynamic XML
In preparation for a change to use GDBFeature as a parameter of
gdb_register_coprocessor(), convert the internal representation of
dynamic feature from plain XML to GDBFeature.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231213-gdb-v17-2-777047380591@daynix.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-7-alex.bennee@linaro.org>
Akihiko Odaki [Tue, 27 Feb 2024 14:43:11 +0000 (14:43 +0000)]
target/arm: Use GDBFeature for dynamic XML
In preparation for a change to use GDBFeature as a parameter of
gdb_register_coprocessor(), convert the internal representation of
dynamic feature from plain XML to GDBFeature.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231213-gdb-v17-1-777047380591@daynix.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-6-alex.bennee@linaro.org>
Alex Bennée [Tue, 27 Feb 2024 14:43:09 +0000 (14:43 +0000)]
tests/vm: avoid re-building the VM images all the time
The main problem is that "check-venv" is a .PHONY target will always
evaluate and trigger a full re-build of the VM images. While its
tempting to drop it from the dependencies that does introduce a
breakage on freshly configured builds.
Fortunately we do have the otherwise redundant --force flag for the
script which up until now was always on. If we make the usage of
--force conditional on dependencies other than check-venv triggering
the update we can avoid the costly rebuild and still run cleanly on a
fresh checkout.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2118 Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-4-alex.bennee@linaro.org>
Alex Bennée [Tue, 27 Feb 2024 14:43:08 +0000 (14:43 +0000)]
tests/tcg: bump TCG test timeout to 120s
This is less than ideal but easier than making sure we get all the
iterations of the memory test. Update the comment accordingly.
Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-3-alex.bennee@linaro.org>
Alex Bennée [Tue, 27 Feb 2024 14:43:07 +0000 (14:43 +0000)]
tests/tcg: update licenses to GPLv2 as intended
My default header template is GPLv3 but for QEMU code we really should
stick to GPLv2-or-later (allowing others to up-license it if they
wish). While this is test code we should still be consistent on the
source distribution.
I wrote all of this code so its not a problem. However there remains
one GPLv3 file left which is the crt0-tc2x.S for TriCore.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240227144335.1196131-2-alex.bennee@linaro.org>
Cédric Le Goater [Mon, 26 Feb 2024 20:31:22 +0000 (17:31 -0300)]
migration: Use migrate_has_error() in close_return_path_on_source()
close_return_path_on_source() retrieves the migration error from the
the QEMUFile '->to_dst_file' to know if a shutdown is required. This
shutdown is required to exit the return-path thread.
Avoid relying on '->to_dst_file' and use migrate_has_error() instead.
(using to_dst_file is a heuristic to infer whether
rp_state.from_dst_file might be stuck on a recvmsg(). Using a generic
method for detecting errors is more reliable. We also want to reduce
dependency on QEMUFile::last_error)
Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Cédric Le Goater <clg@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com>
[added some words about the motivation for this patch] Signed-off-by: Fabiano Rosas <farosas@suse.de> Link: https://lore.kernel.org/r/20240226203122.22894-3-farosas@suse.de Signed-off-by: Peter Xu <peterx@redhat.com>
Fabiano Rosas [Mon, 26 Feb 2024 20:31:21 +0000 (17:31 -0300)]
migration: Join the return path thread before releasing to_dst_file
The return path thread might hang at a blocking system call. Before
joining the thread we might need to issue a shutdown() on the socket
file descriptor to release it. To determine whether the shutdown() is
necessary we look at the QEMUFile error.
Make sure we only clean up the QEMUFile after the return path has been
waited for.
This fixes a hang when qemu_savevm_state_setup() produced an error
that was detected by migration_detect_error(). That skips
migration_completion() so close_return_path_on_source() would get
stuck waiting for the RP thread to terminate.
Fabiano Rosas [Mon, 26 Feb 2024 14:33:35 +0000 (11:33 -0300)]
migration: Fix qmp_query_migrate mbps value
The QMP command query_migrate might see incorrect throughput numbers
if it runs after we've set the migration completion status but before
migration_calculate_complete() has updated s->total_time and s->mbps.
The migration status would show COMPLETED, but the throughput value
would be the one from the last iteration and not the one from the
whole migration. This will usually be a larger value due to the time
period being smaller (one iteration).
Move migration_calculate_complete() earlier so that the status
MIGRATION_STATUS_COMPLETED is only emitted after the final counters
update. Keep everything under the BQL so the QMP thread sees the
updates as atomic.
Rename migration_calculate_complete to migration_completion_end to
reflect its new purpose of also updating s->state.
Steve Sistare [Thu, 22 Feb 2024 17:28:36 +0000 (09:28 -0800)]
migration: stop vm for cpr
When migration for cpr is initiated, stop the vm and set state
RUN_STATE_FINISH_MIGRATE before ram is saved. This eliminates the
possibility of ram and device state being out of sync, and guarantees
that a guest in the suspended state remains suspended, because qmp_cont
rejects a cont command in the RUN_STATE_FINISH_MIGRATE state.
Steve Sistare [Thu, 22 Feb 2024 17:28:35 +0000 (09:28 -0800)]
migration: notifier error checking
Check the status returned by migration notifiers for event type
MIG_EVENT_PRECOPY_SETUP, and report errors. None of the notifiers
return an error status at this time.
Steve Sistare [Thu, 22 Feb 2024 17:28:30 +0000 (09:28 -0800)]
migration: MigrationEvent for notifiers
Passing MigrationState to notifiers is unsound because they could access
unstable migration state internals or even modify the state. Instead, pass
the minimal info needed in a new MigrationEvent struct, which could be
extended in the future if needed.
Steve Sistare [Thu, 22 Feb 2024 17:28:29 +0000 (09:28 -0800)]
migration: convert to NotifierWithReturn
Change all migration notifiers to type NotifierWithReturn, so notifiers
can return an error status in a future patch. For now, pass NULL for the
notifier error parameter, and do not check the return value.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Link: https://lore.kernel.org/r/1708622920-68779-4-git-send-email-steven.sistare@oracle.com
[peterx: dropped unexpected update to roms/seabios-hppa] Signed-off-by: Peter Xu <peterx@redhat.com>
Steve Sistare [Thu, 22 Feb 2024 17:28:27 +0000 (09:28 -0800)]
notify: pass error to notifier with return
Pass an error object as the third parameter to "notifier with return"
notifiers, so clients no longer need to bundle an error object in the
opaque data. The new parameter is used in a later patch.
Peter Xu [Thu, 22 Feb 2024 09:53:01 +0000 (17:53 +0800)]
migration/multifd: Drop unnecessary helper to destroy IOC
Both socket_send_channel_destroy() and multifd_send_channel_destroy() are
unnecessary wrappers to destroy an IOC, as the only thing to do is to
release the final IOC reference. We have plenty of code that destroys an
IOC using direct unref() already; keep that style.
Peter Xu [Thu, 22 Feb 2024 09:53:00 +0000 (17:53 +0800)]
migration/multifd: Cleanup outgoing_args in state destroy
outgoing_args is a global cache of socket address to be reused in multifd.
Freeing the cache in per-channel destructor is more or less a hack. Move
it to multifd_send_cleanup_state() so it only get checked once. Use a
small helper to do so because it's internal of socket.c.
Peter Xu [Thu, 22 Feb 2024 09:52:58 +0000 (17:52 +0800)]
migration/multifd: Drop registered_yank
With a clear definition of p->c protocol, where we only set it up if the
channel is fully established (TLS or non-TLS), registered_yank boolean will
have equal meaning of "p->c != NULL".
Commit a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to
blocking handshake") introduced a thread for TLS channels, which will
resolve the issue on blocking the main thread. However in the same commit
p->c is slightly abused just to be able to pass over the pointer "p" into
the thread.
That's the major reason we'll need to conditionally free the io channel in
the fault paths.
To clean it up, using a separate structure to pass over both "p" and "tioc"
in the tls handshake thread. Then we can make it a rule that p->c will
never be set until the channel is completely setup. With that, we can drop
the tricky conditional unref of the io channel in the error path.
Fabiano Rosas [Tue, 20 Feb 2024 22:41:07 +0000 (19:41 -0300)]
tests/qtest/migration: Add a fd + file test
The fd URI supports an fd that is backed by a file. The code should
select between QIOChannelFile and QIOChannelSocket, depending on the
type of the fd. Add a test for that.
Paolo Bonzini [Fri, 22 Dec 2023 08:52:27 +0000 (09:52 +0100)]
target/i386: remove unnecessary/wrong application of the A20 mask
If ptw_translate() does a MMU_PHYS_IDX access, the A20 mask is already
applied in get_physical_address(), which is called via probe_access_full()
and x86_cpu_tlb_fill().
If ptw_translate() on the other hand does a MMU_NESTED_IDX access,
the A20 mask must not be applied to the address that is looked up in
the nested page tables; it must be applied only to the addresses that
hold the NPT entries (which is achieved via MMU_PHYS_IDX, per the
previous paragraph).
Therefore, we can remove A20 masking from the computation of the page
table entry's address, and let get_physical_address() or mmu_translate()
apply it when they know they are returning a host-physical address.
Cc: qemu-stable@nongnu.org Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 22 Dec 2023 17:01:52 +0000 (18:01 +0100)]
target/i386: Fix physical address truncation
The address translation logic in get_physical_address() will currently
truncate physical addresses to 32 bits unless long mode is enabled.
This is incorrect when using physical address extensions (PAE) outside
of long mode, with the result that a 32-bit operating system using PAE
to access memory above 4G will experience undefined behaviour.
The truncation code was originally introduced in commit 33dfdb5 ("x86:
only allow real mode to access 32bit without LMA"), where it applied
only to translations performed while paging is disabled (and so cannot
affect guests using PAE).
Commit 9828198 ("target/i386: Add MMU_PHYS_IDX and MMU_NESTED_IDX")
rearranged the code such that the truncation also applied to the use
of MMU_PHYS_IDX and MMU_NESTED_IDX. Commit 4a1e9d4 ("target/i386: Use
atomic operations for pte updates") brought this truncation into scope
for page table entry accesses, and is the first commit for which a
Windows 10 32-bit guest will reliably fail to boot if memory above 4G
is present.
The truncation code however is not completely redundant. Even though the
maximum address size for any executed instruction is 32 bits, helpers for
operations such as BOUND, FSAVE or XSAVE may ask get_physical_address()
to translate an address outside of the 32-bit range, if invoked with an
argument that is close to the 4G boundary. Likewise for processor
accesses, for example TSS or IDT accesses, when EFER.LMA==0.
So, move the address truncation in get_physical_address() so that it
applies to 32-bit MMU indexes, but not to MMU_PHYS_IDX and MMU_NESTED_IDX.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2040 Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Cc: qemu-stable@nongnu.org Co-developed-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Michael Brown <mcb30@ipxe.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 2 Jan 2024 14:40:18 +0000 (15:40 +0100)]
target/i386: use separate MMU indexes for 32-bit accesses
Accesses from a 32-bit environment (32-bit code segment for instruction
accesses, EFER.LMA==0 for processor accesses) have to mask away the
upper 32 bits of the address. While a bit wasteful, the easiest way
to do so is to use separate MMU indexes. These days, QEMU anyway is
compiled with a fixed value for NB_MMU_MODES. Split MMU_USER_IDX,
MMU_KSMAP_IDX and MMU_KNOSMAP_IDX in two.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 2 Jan 2024 14:36:51 +0000 (15:36 +0100)]
target/i386: introduce function to query MMU indices
Remove knowledge of specific MMU indexes (other than MMU_NESTED_IDX and
MMU_PHYS_IDX) from mmu_translate(). This will make it possible to split
32-bit and 64-bit MMU indexes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 22 Dec 2023 16:47:38 +0000 (17:47 +0100)]
target/i386: check validity of VMCB addresses
MSR_VM_HSAVE_PA bits 0-11 are reserved, as are the bits above the
maximum physical address width of the processor. Setting them to
1 causes a #GP (see "15.30.4 VM_HSAVE_PA MSR" in the AMD manual).
The same is true of VMCB addresses passed to VMRUN/VMLOAD/VMSAVE,
even though the manual is not clear on that.
Cc: qemu-stable@nongnu.org Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 22 Dec 2023 08:27:36 +0000 (09:27 +0100)]
target/i386: mask high bits of CR3 in 32-bit mode
CR3 bits 63:32 are ignored in 32-bit mode (either legacy 2-level
paging or PAE paging). Do this in mmu_translate() to remove
the last where get_physical_address() meaningfully drops the high
bits of the address.
Cc: qemu-stable@nongnu.org Suggested-by: Richard Henderson <richard.henderson@linaro.org> Fixes: 4a1e9d4d11c ("target/i386: Use atomic operations for pte updates", 2022-10-18) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Tue, 13 Feb 2024 09:56:56 +0000 (10:56 +0100)]
vl, pc: turn -no-fd-bootchk into a machine property
Add a fd-bootchk property to PC machine types, so that -no-fd-bootchk
returns an error if the machine does not support booting from floppies
and checking for boot signatures therein.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>