Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging
* hot-unplug fixes for ioport
* purge qatomic_mb_read/set from monitor
* build system fixes
* OHCI fix from gitlab
* provide EPYC-Rome CPU model not susceptible to XSAVES erratum
# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmRvGpEUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroOa/Af/WS5/tmIlEYgH7UOPERQXNqf7+Jwj
# bA2wgqv3ZoQwcgp5f4EVjfA8ABfpGxLZy6xIdUSbWANb8lDJNuh/nPd/em3rWUAU
# LnJGGdo1vF31gfsVQnlzb7hJi3ur+e2f8JqkRVskDCk3a7YY44OCN42JdKWLrN9u
# CFf2zYqxMqXHjrYrY0Kx2oTkfGDZrfwUlx0vM4dHb8IEoxaplfDd8lJXQzjO4htr
# 3nPBPjQ+h08EeC7mObH4XoJE0omzovR10GkBo8K4q952xGOQ041Y/2YY7JwLfx0D
# na7IanVo+ZAmvTJZoJFSBwNnXkTMHvDH5+Hc45NSTsDBtz0YJhRxPw/z/A==
# =A5Lp
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 25 May 2023 01:21:37 AM PDT
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [undefined]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
monitor: do not use mb_read/mb_set
monitor: extract request dequeuing to a new function
monitor: introduce qmp_dispatcher_co_wake
monitor: cleanup fetching of QMP requests
monitor: cleanup detection of qmp_dispatcher_co shutting down
monitor: do not use mb_read/mb_set for suspend_cnt
monitor: add more *_locked() functions
monitor: allow calling monitor_resume under mon_lock
monitor: use QEMU_LOCK_GUARD a bit more
softmmu/ioport.c: make MemoryRegionPortioList owner of portio_list MemoryRegions
softmmu/ioport.c: QOMify MemoryRegionPortioList
softmmu/ioport.c: allocate MemoryRegionPortioList ports on the heap
usb/ohci: Set pad to 0 after frame update
meson: move -no-pie from linker to compiler
meson: fix rule for qemu-ga installer
meson.build: Fix glib -Wno-unused-function workaround
target/i386: EPYC-Rome model without XSAVES
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Paolo Bonzini [Wed, 15 Mar 2023 11:34:01 +0000 (12:34 +0100)]
monitor: do not use mb_read/mb_set
Instead of relying on magic memory barriers, document the pattern that
is being used. It is the one based on Dekker's algorithm, and in this
case it is embodied as follows:
    enqueue request;             sleeping = true;
    smp_mb();                    smp_mb();
    if (sleeping) kick();        if (!have a request) yield();
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
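A minimal sketch of the Dekker-style handshake described in this commit, written with C11 atomics rather than QEMU's own helpers. The names pending_requests, sleeping, producer_enqueue() and consumer_should_yield() are hypothetical, and atomic_thread_fence(memory_order_seq_cst) is assumed to play the role of smp_mb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical shared state; these are not QEMU's variables. */
static atomic_int pending_requests;   /* number of queued requests */
static atomic_bool sleeping;          /* consumer announced it will sleep */

/* Producer: publish the request first, then look at the sleeping flag. */
static bool producer_enqueue(void)
{
    atomic_fetch_add_explicit(&pending_requests, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);      /* smp_mb() */
    /* true means the caller has to kick (wake up) the consumer. */
    return atomic_load_explicit(&sleeping, memory_order_relaxed);
}

/* Consumer: announce the sleep first, then re-check the queue. */
static bool consumer_should_yield(void)
{
    atomic_store_explicit(&sleeping, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);      /* smp_mb() */
    if (atomic_load_explicit(&pending_requests, memory_order_relaxed) > 0) {
        /* A request is already queued: stay awake and process it. */
        atomic_store_explicit(&sleeping, false, memory_order_relaxed);
        return false;
    }
    return true;
}
```

Because each side writes its own flag, issues a full barrier, and only then reads the other side's flag, at least one of the two is expected to observe the other's write, so a request should not be left queued while the consumer sleeps.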
Paolo Bonzini [Fri, 3 Mar 2023 11:51:33 +0000 (12:51 +0100)]
monitor: cleanup fetching of QMP requests
Use a continue statement so that "after going to sleep" is treated the same
way as "after processing a request". Pull the monitor_lock critical
section out of monitor_qmp_requests_pop_any_with_lock() and protect
qmp_dispatcher_co_shutdown with the monitor_lock.
The two changes are hard to separate because monitor_qmp_dispatcher_co()
previously had complicated logic to check for shutdown both before
and after going to sleep.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 3 Mar 2023 11:45:29 +0000 (12:45 +0100)]
monitor: cleanup detection of qmp_dispatcher_co shutting down
Instead of overloading qmp_dispatcher_co_busy, make the coroutine
pointer NULL. This will make things break spectacularly if somebody
tries to start a request after monitor_cleanup().
AIO_WAIT_WHILE_UNLOCKED() does not need qatomic_mb_read(), because
the macro contains all the necessary memory barriers.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Fri, 3 Mar 2023 12:32:13 +0000 (13:32 +0100)]
monitor: do not use mb_read/mb_set for suspend_cnt
Clean up monitor_event to just use monitor_suspend/monitor_resume,
using mon->mux_out to protect against incorrect nesting (especially
on startup).
The only remaining case of reading suspend_cnt is in the can_read
callback, which is just advisory and can use qatomic_read.
As an extra benefit, mux_out is now simply protected by mon_lock.
Also, moving the prompt to the beginning of the main loop removes
it from the output in some error cases where QEMU does not actually
start successfully. It is not a full fix and it would be nice to
also remove the monitor heading, but this is already a small (though
unintentional) improvement.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Wed, 17 May 2023 15:19:03 +0000 (17:19 +0200)]
monitor: allow calling monitor_resume under mon_lock
Move monitor_resume()'s call to readline_show_prompt() outside the
potentially locked section. Reuse the existing monitor_accept_input()
bottom half for this purpose.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark Cave-Ayland [Wed, 19 Apr 2023 15:16:52 +0000 (16:16 +0100)]
softmmu/ioport.c: make MemoryRegionPortioList owner of portio_list MemoryRegions
Currently when portio_list MemoryRegions are freed using portio_list_destroy() the RCU
thread segfaults generating a backtrace similar to that below:
#0 0x5555599a34b6 in phys_section_destroy ../softmmu/physmem.c:996
#1 0x5555599a37a3 in phys_sections_free ../softmmu/physmem.c:1011
#2 0x5555599b24aa in address_space_dispatch_free ../softmmu/physmem.c:2430
#3 0x55555996a283 in flatview_destroy ../softmmu/memory.c:292
#4 0x55555a2cb9fb in call_rcu_thread ../util/rcu.c:284
#5 0x55555a29b71d in qemu_thread_start ../util/qemu-thread-posix.c:541
#6 0x7ffff4a0cea6 in start_thread nptl/pthread_create.c:477
#7 0x7ffff492ca2e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0xfca2e)
The problem here is that portio_list_destroy() unparents the portio_list
MemoryRegions, causing them to be freed immediately; however, the flatview
still has a reference to the MemoryRegion, which causes a use-after-free
segfault when the RCU thread next updates the flatview.
Solve the lifetime issue by making MemoryRegionPortioList the owner of the
portio_list MemoryRegions, and then reparenting them to the portio_list
owner. This ensures that they can be accessed as QOM children via the
portio_list owner, yet the MemoryRegionPortioList owns the refcount.
Update portio_list_destroy() to unparent the MemoryRegion from the
portio_list owner (while keeping mrpio->mr live until finalization of the
MemoryRegionPortioList), so that the portio_list MemoryRegions remain
allocated until flatview_destroy() removes the final refcount upon the
next flatview update.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230419151652.362717-4-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark Cave-Ayland [Wed, 19 Apr 2023 15:16:51 +0000 (16:16 +0100)]
softmmu/ioport.c: QOMify MemoryRegionPortioList
The aim of QOMification is to allow the lifetime of the MemoryRegionPortioList
structure to be managed using QOM's built-in refcounting instead of having to
handle this manually.
Due to the use of an opaque pointer it isn't possible to model the new
TYPE_MEMORY_REGION_PORTIO_LIST directly using QOM properties, however since
use of the new object is restricted to the portio API we can simply set the
opaque pointer (and the heap-allocated port list) internally.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230419151652.362717-3-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mark Cave-Ayland [Wed, 19 Apr 2023 15:16:50 +0000 (16:16 +0100)]
softmmu/ioport.c: allocate MemoryRegionPortioList ports on the heap
In order to facilitate a conversion of MemoryRegionPortioList to a QOM object
move the allocation of MemoryRegionPortioList ports to the heap instead of
using a variable-length member at the end of the MemoryRegionPortioList
structure.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230419151652.362717-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
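To illustrate the layout change in this commit, here is a rough sketch contrasting a trailing variable-length member with a separately allocated array. The types PortEntry, ListFlex and ListHeap and the helpers list_heap_new() and list_heap_free() are hypothetical stand-ins, not the actual QEMU definitions. The motivation assumed here is that an object with a fixed instance size cannot carry a variable-length tail.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical port description; not QEMU's MemoryRegionPortio. */
typedef struct PortEntry {
    unsigned offset;
    unsigned size;
} PortEntry;

/* Before: the ports trail the struct, so each instance has its own size. */
typedef struct ListFlex {
    unsigned count;
    PortEntry ports[];
} ListFlex;

/* After: the struct has a fixed size and the ports live in their own block. */
typedef struct ListHeap {
    unsigned count;
    PortEntry *ports;
} ListHeap;

/* Allocate a ListHeap holding a private copy of `count` entries. */
static ListHeap *list_heap_new(const PortEntry *src, unsigned count)
{
    ListHeap *l = calloc(1, sizeof(*l));
    if (!l) {
        return NULL;
    }
    l->ports = calloc(count ? count : 1, sizeof(*l->ports));
    if (!l->ports) {
        free(l);
        return NULL;
    }
    if (count) {
        memcpy(l->ports, src, count * sizeof(*src));
    }
    l->count = count;
    return l;
}

/* Release the port array first, then the list itself. */
static void list_heap_free(ListHeap *l)
{
    if (l) {
        free(l->ports);
        free(l);
    }
}
```

The cost of the second layout is one extra allocation and an extra free on teardown, which is why the sketch pairs the constructor with list_heap_free().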
Paolo Bonzini [Tue, 23 May 2023 15:58:40 +0000 (17:58 +0200)]
usb/ohci: Set pad to 0 after frame update
When the OHCI controller's frame number is incremented, the HccaPad1 register
should be set to zero (see OHCI Spec 4.4).
ReactOS uses HccaPad1 to determine whether the OHCI hardware is running;
consequently, it fails this check on current QEMU master.
Signed-off-by: Ryan Wendland <wendland@live.com.au>
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1048
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
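The behavior described in this commit can be sketched as follows. The Hcca struct is a simplified, hypothetical stand-in for the real HCCA layout (only the frame number and pad fields are modeled), hcca_frame_update() is an invented name, and byte-order conversion is deliberately left out.

```c
#include <stdint.h>

/* Simplified, hypothetical view of the two HCCA fields involved. */
typedef struct Hcca {
    uint16_t frame;   /* HccaFrameNumber */
    uint16_t pad;     /* HccaPad1 */
} Hcca;

/* Advance the frame counter and publish it, clearing the pad as well. */
static void hcca_frame_update(Hcca *hcca, uint16_t *frame_number)
{
    *frame_number = (uint16_t)(*frame_number + 1);   /* wraps at 0xffff */
    hcca->frame = *frame_number;
    hcca->pad = 0;   /* the step this commit adds */
}
```

A guest that writes a nonzero value into the pad and later sees zero can infer that the controller has updated the frame number in the meantime, which is presumably the kind of check ReactOS performs.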
Paolo Bonzini [Mon, 22 May 2023 08:05:33 +0000 (10:05 +0200)]
meson: move -no-pie from linker to compiler
The large comment in the patch says it all; the -no-pie flag is broken and
this is why it was not included in QEMU_LDFLAGS before commit a988b4c5614
("build: move remaining compiler flag tests to meson", 2023-05-18). And
some distros made things even worse, so we have to add it to the compiler
command line.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1664
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 22 May 2023 07:19:03 +0000 (09:19 +0200)]
meson: fix rule for qemu-ga installer
The bindir variable is not available in the "glib" variable, which is an internal
dependency (created with "declare_dependency"). Use glib_pc instead, which contains
the variable as it is instantiated from glib-2.0.pc.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
meson.build: Fix glib -Wno-unused-function workaround
We want to only enable '-Wno-unused-function' if glib's version is
smaller than '2.57.2' and has a G_DEFINE_AUTOPTR_CLEANUP_FUNC()
implementation that doesn't take into account unused functions. But the
compilation test isn't working as intended as '-Wunused-function' isn't
enabled while running it.
Let's enable it.
Fixes: fc9a809e0d28 ("build: move glib detection and workarounds to meson")
Signed-off-by: Nicolas Saenz Julienne <nsaenz@amazon.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230524173123.66483-1-nsaenz@amazon.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Maksim Davydov [Wed, 24 May 2023 21:37:48 +0000 (00:37 +0300)]
target/i386: EPYC-Rome model without XSAVES
Based on the kernel commit "b0563468ee x86/CPU/AMD: Disable XSAVES on
AMD family 0x17", a host system with EPYC-Rome can clear the XSAVES
capability bit. In other words, an EPYC-Rome host without XSAVES can occur.
Thus, we need an EPYC-Rome CPU model (without this feature) that matches
the fix for this erratum.
Signed-off-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-Id: <20230524213748.8918-1-davydov-max@yandex-team.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Camilla Conte [Mon, 22 May 2023 17:41:54 +0000 (18:41 +0100)]
Add Kubernetes runner configuration
Custom values for the gitlab-runner Helm chart.
See https://wiki.qemu.org/Testing/CI/KubernetesRunners.
Signed-off-by: Camilla Conte <cconte@redhat.com>
Message-Id: <20230522174153.46801-6-cconte@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Camilla Conte [Mon, 22 May 2023 17:41:53 +0000 (18:41 +0100)]
Add CI variable RUNNER_TAG
This allows a job tag to be set dynamically.
We need this to be able to select the Kubernetes runner.
See https://wiki.qemu.org/Testing/CI/KubernetesRunners.
Signed-off-by: Camilla Conte <cconte@redhat.com>
Message-Id: <20230522174153.46801-5-cconte@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Camilla Conte [Mon, 22 May 2023 17:41:52 +0000 (18:41 +0100)]
Add loop over docker info
Wait for docker info to return successfully to ensure that
the docker server (daemon) has started.
This is needed for jobs running on Kubernetes.
See https://wiki.qemu.org/Testing/CI/KubernetesRunners.
Signed-off-by: Camilla Conte <cconte@redhat.com>
Message-Id: <20230522174153.46801-4-cconte@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Camilla Conte [Mon, 22 May 2023 17:41:51 +0000 (18:41 +0100)]
Use docker "stable" tag
Use the same tag in all jobs.
Signed-off-by: Camilla Conte <cconte@redhat.com>
Message-Id: <20230522174153.46801-3-cconte@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Camilla Conte [Mon, 22 May 2023 17:41:50 +0000 (18:41 +0100)]
Remove redundant CI variables
These are not needed when using gitlab.com shared runners.
Signed-off-by: Camilla Conte <cconte@redhat.com>
Message-Id: <20230522174153.46801-2-cconte@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Akihiko Odaki [Tue, 23 May 2023 02:39:12 +0000 (11:39 +0900)]
util/vfio-helpers: Use g_file_read_link()
When _FORTIFY_SOURCE=2, glibc version is 2.35, and GCC version is
12.1.0, the compiler complains as follows:
In file included from /usr/include/features.h:490,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdint.h:26,
from /usr/lib/gcc/aarch64-unknown-linux-gnu/12.1.0/include/stdint.h:9,
from /home/alarm/q/var/qemu/include/qemu/osdep.h:94,
from ../util/vfio-helpers.c:13:
In function 'readlink',
inlined from 'sysfs_find_group_file' at ../util/vfio-helpers.c:116:9,
inlined from 'qemu_vfio_init_pci' at ../util/vfio-helpers.c:326:18,
inlined from 'qemu_vfio_open_pci' at ../util/vfio-helpers.c:517:9:
/usr/include/bits/unistd.h:119:10: error: argument 2 is null but the corresponding size argument 3 value is 4095 [-Werror=nonnull]
119 | return __glibc_fortify (readlink, __len, sizeof (char),
| ^~~~~~~~~~~~~~~
This error implies the allocated buffer can be NULL. Use
g_file_read_link(), which allocates the buffer automatically, to avoid
the error.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Zhenzhong Duan [Wed, 17 May 2023 02:46:51 +0000 (10:46 +0800)]
vfio/pci: Fix a use-after-free issue
vbasedev->name is wrongly freed, which leads to garbage in the VFIO trace log.
Fix it by allocating a duplicate of vbasedev->name and then freeing the duplicate.
Fixes: 2dca1b37a760 ("vfio/pci: add support for VF token")
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
Merge tag 'pull-tcg-20230523-3' of https://gitlab.com/rth7680/qemu into staging
util: Host cpu detection for x86 and aa64
util: Use cpu detection for bufferiszero
migration: Use cpu detection for xbzrle
tcg: Replace and remove cpu_atomic_{ld,st}o*
host/include: Split qemu/atomic128.h
tcg: Remove DEBUG_DISAS
tcg: Remove USE_TCG_OPTIMIZATIONS
* tag 'pull-tcg-20230523-3' of https://gitlab.com/rth7680/qemu: (28 commits)
tcg: Remove USE_TCG_OPTIMIZATIONS
tcg: Remove DEBUG_DISAS
qemu/atomic128: Add runtime test for FEAT_LSE2
qemu/atomic128: Improve cmpxchg fallback for atomic16_set
tcg: Split out tcg/debug-assert.h
accel/tcg: Correctly use atomic128.h in ldst_atomicity.c.inc
qemu/atomic128: Split atomic16_read
accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128
accel/tcg: Remove prot argument to atomic_mmu_lookup
accel/tcg: Remove cpu_atomic_{ld,st}o_*_mmu
target/s390x: Always use cpu_atomic_cmpxchgl_be_mmu in do_csst
target/s390x: Use cpu_{ld,st}*_mmu in do_csst
accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmu
target/s390x: Use tcg_gen_qemu_{ld,st}_i128 for LPQ, STPQ
target/ppc: Use tcg_gen_qemu_{ld,st}_i128 for LQARX, LQ, STQ
include/qemu: Move CONFIG_ATOMIC128_OPT handling to atomic128.h
meson: Fix detect atomic128 support with optimization
include/host: Split out atomic128-ldst.h
include/host: Split out atomic128-cas.h
util: Add cpuinfo-aarch64.c
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu/atomic128: Split atomic16_read
Create both atomic16_read_ro and atomic16_read_rw.
Previously we pretended that we had atomic16_read in system mode,
because we "know" that all ram is always writable to the host.
Now, expose read-only and read-write versions all of the time.
For aarch64, do not fall back to __atomic_read_16 even if
supported by the compiler, to work around a clang bug.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128
These symbols will shortly become dynamic runtime tests and
therefore not appropriate for the preprocessor. Use the
matching CONFIG_* symbols for that purpose.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/s390x: Always use cpu_atomic_cmpxchgl_be_mmu in do_csst
Eliminate the CONFIG_USER_ONLY specialization.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/s390x: Use cpu_{ld,st}*_mmu in do_csst
Use cpu_ld16_mmu and cpu_st16_mmu to eliminate the special case,
and change all of the *_data_ra functions to match.
Note that we check the alignment of both compare and store
pointers at the top of the function, so MO_ALIGN* may be
safely removed from the individual memory operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
accel/tcg: Unify cpu_{ld,st}*_{be,le}_mmu
With the current structure of cputlb.c, there is no difference
between the little-endian and big-endian entry points, aside
from the assert. Unify the pairs of functions.
The only use of the functions with explicit endianness was in
target/sparc64, and that was only to satisfy the assert: the
correct endianness is already built into memop.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target/s390x: Use tcg_gen_qemu_{ld,st}_i128 for LPQ, STPQ
No need to roll our own, as this is now provided by tcg.
This was the last use of retxl, so remove that too.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
meson: Fix detect atomic128 support with optimization
Silly typo: sizeof(16) != 16.
Fixes: e61f1efeb730 ("meson: Detect atomic128 support with optimization")
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
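The pitfall named in this commit message can be shown in isolation. This is only an illustration of why sizeof(16) is not 16. It does not reproduce the meson test itself, and the Sixteen type and both function names are hypothetical.

```c
#include <stddef.h>

/* sizeof applied to the literal 16 yields the size of its type, int. */
static size_t size_of_literal(void)
{
    return sizeof(16);
}

/* Hypothetical 16-byte type, to show what a 16-byte size looks like. */
typedef struct Sixteen {
    unsigned char bytes[16];
} Sixteen;

static size_t size_of_type(void)
{
    return sizeof(Sixteen);
}
```

On common platforms size_of_literal() returns 4, so any check written with sizeof(16) silently tests a much smaller size or alignment than intended.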
Move the code from tcg/. The only use of these bits so far
is with respect to the atomicity of tcg operations.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The items in migration_files are built for libmigration and included
into softmmu_ss from there; no need to also include them directly.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Perform the function selection once, and only if CONFIG_AVX512_OPT
is enabled. Centralize the selection to xbzrle.c, instead of
spreading the init across 3 files.
Remove xbzrle-bench.c. The benefit of being able to benchmark
the different implementations is less important than not peeking
into the internals of the implementation.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Place the CONFIG_AVX512BW_OPT block at the top,
which will aid function selection in the next patch.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use cpuinfo_init() during init_accel(), and the variable cpuinfo
during test_buffer_is_zero_next_accel(). Adjust the logic that
cycles through the set of accelerators for testing.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use the CPUINFO_* bits instead of the individual boolean
variables that we had been using. Remove all of the init
code that was moved over to cpuinfo-i386.c.
Note that have_avx512* check both AVX512{F,VL}, as we had
previously done during tcg_target_init.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a bit to indicate when VMOVDQU is also atomic if aligned.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add cpuinfo.h for i386 and x86_64, and the initialization
for that in util/. Populate that with a slightly altered
copy of the tcg host probing code. Other uses of cpuid.h
will be adjusted one patch at a time.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The entire contents of the header is host-specific, but the
existence of such a header is not, which could prevent some
host specific ifdefs at the top of the file for the include.
Add host/include/{arch,generic} to the project arguments.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Alexander Graf [Mon, 3 Apr 2023 22:14:21 +0000 (22:14 +0000)]
hostmem-file: add offset option
Add an option for hostmem-file to start the memory object at an offset
into the target file. This is useful if multiple memory objects reside
inside the same target file, such as a device node.
In particular, it's useful to map guest memory directly into /dev/mem
for experimentation.
To make this work consistently, also fix up all places in QEMU that
expect fd offsets to be 0.
Signed-off-by: Alexander Graf <graf@amazon.com>
Message-Id: <20230403221421.60877-1-graf@amazon.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
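As a rough illustration of what starting a memory object at an offset into a file means at the system-call level, the sketch below maps part of a file with POSIX mmap(). The helper map_at_offset() is hypothetical and is not how QEMU implements the option. The page-alignment check reflects the usual mmap() requirement and is an assumption about what a caller would need to respect.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not QEMU's code: map `len` bytes of `fd`
 * starting `offset` bytes into the file. Returns NULL on failure.
 */
static void *map_at_offset(int fd, off_t offset, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);

    /* Reject offsets that are negative or not page-aligned. */
    if (page <= 0 || offset < 0 || offset % page != 0) {
        return NULL;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, offset);
    return p == MAP_FAILED ? NULL : p;
}
```

With such a helper, several memory objects could share one backing file by using disjoint, page-aligned offset ranges.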
Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging
# -----BEGIN PGP SIGNATURE-----
# Version: GnuPG v1
#
# iQEcBAABAgAGBQJkbGmXAAoJEO8Ells5jWIR4ogH/R5+IgkZi1dwN/IxCpzTIc5H
# l5ncKK6TCqKCfgpFnFFLNKhcDqDczq4LhO42s/vnuOF8vIXcUVhLAz0HULARb46o
# p/7Ufn1k8Zg/HGtWwIW+9CcTkymsHzTOwFcTRFiCjpdkjaW1Wprb2q968f0Px8eS
# cKqC5xln8U+s02KWQMHlJili6BTPuw1ZNnYV3iq/81Me96WOtPd8c8ZSF4aVR2AB
# Kqah+BBOnk4p4kg9Gs0OvM4TffEBrsab8iu4s6SSQGA6ymCWY6GeCX0Ik4u9P1yE
# 6NtKLixBPO4fqLwWxWuKVJmaLKmuEd/FjZXWwITx9EPNtDuBuGLDKuvW8fJxkhw=
# =dw2I
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 23 May 2023 12:21:59 AM PDT
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* tag 'net-pull-request' of https://github.com/jasowang/qemu: (50 commits)
rtl8139: fix large_send_mss divide-by-zero
docs/system/devices/igb: Note igb is tested for DPDK
MAINTAINERS: Add a reviewer for network packet abstractions
vmxnet3: Do not depend on PC
igb: Clear-on-read ICR when ICR.INTA is set
igb: Notify only new interrupts
e1000e: Notify only new interrupts
igb: Implement Tx timestamp
igb: Implement Rx PTP2 timestamp
igb: Implement igb-specific oversize check
igb: Filter with the second VLAN tag for extended VLAN
igb: Strip the second VLAN tag for extended VLAN
igb: Implement Tx SCTP CSO
igb: Implement Rx SCTP CSO
igb: Use UDP for RSS hash
igb: Implement MSI-X single vector mode
tests/qtest/libqos/igb: Set GPIE.Multiple_MSIX
hw/net/net_rx_pkt: Enforce alignment for eth_header
net/eth: Always add VLAN tag
net/eth: Use void pointers
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Stefan Hajnoczi [Thu, 13 Apr 2023 17:19:46 +0000 (13:19 -0400)]
rtl8139: fix large_send_mss divide-by-zero
If the driver sets large_send_mss to 0 then a divide-by-zero occurs.
Even if the division wasn't a problem, the for loop that emits MSS-sized
packets would never terminate.
Solve these issues by skipping offloading when large_send_mss=0.
This issue was found by OSS-Fuzz as part of Alexander Bulekov's device
fuzzing work. The reproducer is:
Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1582
Closes: https://gitlab.com/qemu-project/qemu/-/issues/1582
Cc: qemu-stable@nongnu.org
Cc: Peter Maydell <peter.maydell@linaro.org>
Fixes: 6d71357a3b65 ("rtl8139: honor large send MSS value")
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
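The guard described in this commit can be sketched with a simplified segmentation loop. count_segments() is a hypothetical function that only counts MSS-sized chunks and is not the rtl8139 code. It shows why a zero MSS has to be rejected before the loop: with a step of zero the loop would never advance.

```c
#include <stddef.h>

/*
 * Hypothetical model of MSS-based segmentation: returns how many
 * segments would be emitted, or 0 when offloading is skipped.
 */
static size_t count_segments(size_t payload_len, size_t large_send_mss)
{
    if (large_send_mss == 0) {
        return 0;   /* skip offloading instead of looping forever */
    }

    size_t segments = 0;
    for (size_t off = 0; off < payload_len; off += large_send_mss) {
        segments++;
    }
    return segments;
}
```

For example, a 3000-byte payload with an MSS of 1460 yields three segments in this model, while an MSS of 0 yields none.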
Akihiko Odaki [Tue, 23 May 2023 02:43:38 +0000 (11:43 +0900)]
MAINTAINERS: Add a reviewer for network packet abstractions
I have made significant changes for network packet abstractions so add
me as a reviewer.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:37 +0000 (11:43 +0900)]
vmxnet3: Do not depend on PC
vmxnet3 has no dependency on PC, and VMware Fusion actually makes it
available on Apple Silicon according to:
https://kb.vmware.com/s/article/90364
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:36 +0000 (11:43 +0900)]
igb: Clear-on-read ICR when ICR.INTA is set
For GPIE.NSICR, Section 7.3.2.1.2 says:
> ICR bits are cleared on register read. If GPIE.NSICR = 0b, then the
> clear on read occurs only if no bit is set in the IMS or at least one
> bit is set in the IMS and there is a true interrupt as reflected in
> ICR.INTA.
e1000e does something similar, though it checks for CTRL_EXT.IAME, which
does not exist on igb.
Suggested-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
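One possible reading of the datasheet passage quoted in this commit, expressed as a predicate. The function name and parameters are hypothetical, and treating GPIE.NSICR = 1 as "always clear on read" is an interpretation of the first quoted sentence, not something the commit message states explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical predicate: should reading ICR clear its bits?
 *   nsicr - value of GPIE.NSICR
 *   ims   - current interrupt mask (IMS)
 *   inta  - whether ICR.INTA reports a true interrupt
 */
static bool icr_clears_on_read(bool nsicr, uint32_t ims, bool inta)
{
    if (nsicr) {
        return true;            /* assumed: unconditional clear on read */
    }
    /* NSICR = 0: clear only if nothing is masked in, or INTA is set. */
    return ims == 0 || inta;
}
```

The second branch folds "no bit is set in the IMS" and "at least one bit is set in the IMS and ICR.INTA is set" into a single expression.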
Akihiko Odaki [Tue, 23 May 2023 02:43:34 +0000 (11:43 +0900)]
e1000e: Notify only new interrupts
In MSI-X mode, if there are interrupts already notified but not cleared
and a new interrupt arrives, e1000e incorrectly notifies the notified
ones again along with the new one.
To fix this issue, replace e1000e_update_interrupt_state() with
two new functions: e1000e_raise_interrupts() and
e1000e_lower_interrupts(). These functions not only raise or lower
interrupts, but also perform the register writes that update the
interrupt state. Before performing a register write, each function
determines the interrupts already raised, and compares them with the
interrupts raised after the register write to determine the interrupts
to notify.
The introduction of these functions made obsolete the tracepoints that
assume that the caller of e1000e_update_interrupt_state() performs
register writes. These tracepoints are now removed, and alternative ones
are added to the new functions.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
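The before/after comparison described in this commit boils down to a bitmask difference. The sketch below uses a hypothetical helper, causes_to_notify(), and plain 32-bit masks. It is not the e1000e code, only the arithmetic behind "notify only new interrupts".

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the set of interrupts raised before and
 * after a register write, return only the newly raised ones.
 */
static uint32_t causes_to_notify(uint32_t raised_before, uint32_t raised_after)
{
    return raised_after & ~raised_before;
}
```

If bits 0 and 1 were already raised and the write also raises bit 2, only bit 2 is reported, so already-notified interrupts are not sent again.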
Akihiko Odaki [Tue, 23 May 2023 02:43:21 +0000 (11:43 +0900)]
net/eth: Use void pointers
The uses of uint8_t pointers were misleading, as they are never accessed
as an array of octets, and they even require stricter alignment to be
accessed as struct eth_header.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:18 +0000 (11:43 +0900)]
igb: Clear EICR bits for delayed MSI-X interrupts
Section 7.3.4.1 says:
> When auto-clear is enabled for an interrupt cause, the EICR bit is
> set when a cause event mapped to this vector occurs. When the EITR
> Counter reaches zero, the MSI-X message is sent on PCIe. Then the
> EICR bit is cleared and enabled to be set by a new cause event
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:17 +0000 (11:43 +0900)]
igb: Fix igb_mac_reg_init coding style alignment
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:13 +0000 (11:43 +0900)]
e1000e: Reset packet state after emptying Tx queue
Keeping Tx packet state after the transmit queue is emptied has some
problems:
- The datasheet says the descriptors can be reused after the transmit
  queue is emptied, but the Tx packet state may keep references to them.
- The Tx packet state cannot be migrated, so it can be reset any time
  migration happens.
Always reset the Tx packet state after the queue is emptied.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:12 +0000 (11:43 +0900)]
igb: Read DCMD.VLE of the first Tx descriptor
Section 7.2.2.3 Advanced Transmit Data Descriptor says:
> For frames that spans multiple descriptors, all fields apart from
> DCMD.EOP, DCMD.RS, DCMD.DEXT, DTALEN, Address and DTYP are valid only
> in the first descriptors and are ignored in the subsequent ones.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:11 +0000 (11:43 +0900)]
igb: Remove goto
The goto is a bit confusing as it changes the control flow only if L4
protocol is not recognized. It is also different from e1000e, and
noisy when comparing e1000e and igb.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:10 +0000 (11:43 +0900)]
igb: Always log status after building rx metadata
Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:09 +0000 (11:43 +0900)]
e1000e: Always log status after building rx metadata
Without this change, the status flags may not be traced e.g. if checksum
offloading is disabled.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:08 +0000 (11:43 +0900)]
e1000x: Rename TcpIpv6 into TcpIpv6Ex
e1000e and igb employ NetPktRssIpV6TcpEx for the RSS hash if the TcpIpv6
MRQC bit is set. Moreover, igb also has an MRQC bit for NetPktRssIpV6Tcp,
though it is not implemented yet. Rename it to TcpIpv6Ex to avoid
confusion.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:07 +0000 (11:43 +0900)]
e1000x: Take CRC into consideration for size check
Section 13.7.15 Receive Length Error Count says:
> Packets over 1522 bytes are oversized if LongPacketEnable is 0b
> (RCTL.LPE). If LongPacketEnable (LPE) is 1b, then an incoming packet
> is considered oversized if it exceeds 16384 bytes.
> These lengths are based on bytes in the received packet from
> <Destination Address> through <CRC>, inclusively.
As QEMU processes packets without CRC, the number of bytes for the CRC
needs to be subtracted. This change adds to eth.h some size definitions
that are used to derive the new size thresholds.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
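The threshold adjustment in this commit can be worked through in a few lines. The constant names and is_oversized() are hypothetical and are not the definitions added to eth.h. The limits 1522 and 16384 come from the quoted datasheet text, and 4 bytes is the length of the Ethernet CRC.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; values are taken from the quoted datasheet text. */
enum {
    CRC_LEN = 4,            /* Ethernet frame check sequence */
    MAX_LEN_STD = 1522,     /* limit with CRC when RCTL.LPE = 0 */
    MAX_LEN_LONG = 16384    /* limit with CRC when RCTL.LPE = 1 */
};

/* `len_without_crc` is the packet length as seen without the CRC. */
static bool is_oversized(size_t len_without_crc, bool lpe)
{
    size_t limit = (size_t)(lpe ? MAX_LEN_LONG : MAX_LEN_STD) - CRC_LEN;
    return len_without_crc > limit;
}
```

Under these assumptions the effective limits become 1518 bytes without long packets and 16380 bytes with them.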
Akihiko Odaki [Tue, 23 May 2023 02:43:05 +0000 (11:43 +0900)]
net/eth: Rename eth_setup_vlan_headers_ex
The old eth_setup_vlan_headers has no users, so remove it and rename
eth_setup_vlan_headers_ex to take its place.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:03 +0000 (11:43 +0900)]
tests/avocado: Remove test_igb_nomsi_kvm
It is unlikely that more bugs would be found with KVM, so remove
test_igb_nomsi_kvm to save the time needed to run it.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:01 +0000 (11:43 +0900)]
Fix references to igb Avocado test
Fixes: 9f95111474 ("tests/avocado: re-factor igb test to avoid timeouts")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Akihiko Odaki [Tue, 23 May 2023 02:43:00 +0000 (11:43 +0900)]
igb: Always copy ethernet header
igb_receive_internal() used to check the iov length to determine
whether to copy the iovs to a contiguous buffer, but the check is flawed
in two ways:
- It does not ensure that iovcnt > 0.
- It does not take the virtio-net header into consideration.
The size of this copy is just 22 octets, which can be even less than
the code size required for the checks. This (wrong) optimization is
probably not worth it, so just remove it. Removing this also allows igb
to assume aligned accesses for the ethernet header.
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
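A minimal sketch of the "always copy" approach from this commit: gather a small header from a scatter list into a contiguous buffer, skipping a leading number of bytes (for instance a virtio-net header). gather_header() is a hypothetical helper written against POSIX struct iovec. QEMU has its own iov helpers, which are not used here.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/*
 * Hypothetical helper: copy up to `want` bytes into `dst`, starting
 * `skip` bytes into the scatter list. Returns the number of bytes
 * copied, which is 0 when iovcnt is 0.
 */
static size_t gather_header(void *dst, size_t want,
                            const struct iovec *iov, int iovcnt, size_t skip)
{
    size_t done = 0;

    for (int i = 0; i < iovcnt && done < want; i++) {
        const unsigned char *base = iov[i].iov_base;
        size_t len = iov[i].iov_len;

        if (skip >= len) {
            skip -= len;        /* this element is skipped entirely */
            continue;
        }

        size_t n = len - skip;
        if (n > want - done) {
            n = want - done;
        }
        memcpy((unsigned char *)dst + done, base + skip, n);
        done += n;
        skip = 0;
    }
    return done;
}
```

Copying unconditionally handles an empty scatter list and a header split across elements in the same code path, and the destination buffer can be declared with whatever alignment the header struct needs.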
Akihiko Odaki [Tue, 23 May 2023 02:42:59 +0000 (11:42 +0900)]
e1000e: Always copy ethernet header
e1000e_receive_internal() used to check the iov length to determine
whether to copy the iovs to a contiguous buffer, but the check is flawed
in two ways:
- It does not ensure that iovcnt > 0.
- It does not take the virtio-net header into consideration.
The size of this copy is just 18 octets, which can be even less than
the code size required for the checks. This (wrong) optimization is
probably not worth it, so just remove it.
Fixes: 6f3fbe4ed0 ("net: Introduce e1000e device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>