Thomas Huth [Tue, 3 Sep 2024 11:47:26 +0000 (13:47 +0200)]
tests/qtest/migration: Add a check for the availability of the "pc" machine
The test_vcpu_dirty_limit is the only test that does not check for the
availability of the machine before starting the test, so it fails when
QEMU has been configured with --without-default-devices. Add a check for
the "pc" machine type to fix it.
Arman Nabiev [Thu, 22 Aug 2024 16:56:53 +0000 (19:56 +0300)]
target/ppc: Fix migration of CPUs with TLB_EMB TLB type
In vmstate_tlbemb a cut-and-paste error meant we gave
this vmstate subsection the same "cpu/tlb6xx" name as
the vmstate_tlb6xx subsection. This breaks migration load
for any CPU using the TLB_EMB CPU type, because when we
see the "tlb6xx" name in the incoming data we try to
interpret it as a vmstate_tlb6xx subsection, which it
isn't the right format for:
$ qemu-system-ppc -drive
if=none,format=qcow2,file=/home/petmay01/test-images/virt/dummy.qcow2
-monitor stdio -M bamboo
QEMU 9.0.92 monitor - type 'help' for more information
(qemu) savevm foo
(qemu) loadvm foo
Missing section footer for cpu
Error: Error -22 while loading VM state
Correct the incorrect vmstate section name. Since migration
for these CPU types was completely broken before, we don't
need to care that this is a migration compatibility break.
This affects the PPC 405, 440, 460 and e200 CPU families.
Fabiano Rosas [Wed, 28 Aug 2024 14:56:50 +0000 (11:56 -0300)]
migration/multifd: Add documentation for multifd methods
Add documentation clarifying the usage of the multifd methods. The
general idea is that the client code calls into multifd to trigger
send/recv of data and multifd then calls these hooks back from the
worker threads at opportune moments so the client can process a
portion of the data.
Suggested-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Wed, 28 Aug 2024 14:56:48 +0000 (11:56 -0300)]
migration/multifd: Fix p->iov leak in multifd-uadk.c
The send_cleanup() hook should free the p->iov that was allocated at
send_setup(). This was missed because the UADK code is conditional on
the presence of the accelerator, so it's not tested by default.
Fixes: 819dd20636 ("migration/multifd: Add UADK initialization") Reported-by: Peter Xu <peterx@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Wed, 28 Aug 2024 14:56:47 +0000 (11:56 -0300)]
migration/multifd: Stop changing the packet on recv side
As observed by Philippe, the multifd_ram_unfill_packet() function
currently leaves the MultiFDPacket structure with mixed
endianness. This is harmless, but ultimately not very clean.
Stop touching the received packet and do the necessary work using
stack variables instead.
While here tweak the error strings and fix the space before
semicolons. Also remove the "100 times bigger" comment because it's
just one possible explanation for a size mismatch and it doesn't even
match the code.
CC: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:46:03 +0000 (14:46 -0300)]
migration/multifd: Move nocomp code into multifd-nocomp.c
In preparation for adding new payload types to multifd, move most of
the no-compression code into multifd-nocomp.c. Let's try to keep a
semblance of layering by not mixing general multifd control flow with
the details of transmitting pages of ram.
There are still some pieces leftover, namely the p->normal, p->zero,
etc variables that we use for zero page tracking and the packet
allocation which is heavily dependent on the ram code.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Prior to moving the ram code into multifd-nocomp.c, change the code to
register the nocomp ops dynamically so we don't need to have the ops
structure defined in multifd.c.
While here, move the ops struct initialization to the end of the file
to make the next diff cleaner.
Fabiano Rosas [Tue, 27 Aug 2024 17:46:00 +0000 (14:46 -0300)]
migration/multifd: Allow multifd sync without flush
Separate the multifd sync from flushing the client data to the
channels. These two operations are closely related but not strictly
necessary to be executed together.
The multifd sync is intrinsic to how multifd works. The multiple
channels operate independently and may finish IO out of order in
relation to each other. This applies also between the source and
destination QEMU.
Flushing the data that is left in the client-owned data structures
(e.g. MultiFDPages_t) prior to sync is usually the right thing to do,
but that is particular to how the ram migration is implemented with
several passes over dirty data.
Make these two routines separate, allowing future code to call the
sync by itself if needed. This also allows the usage of
multifd_ram_send to be isolated to ram code.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:59 +0000 (14:45 -0300)]
migration/multifd: Replace multifd_send_state->pages with client data
Multifd currently has a simple scheduling mechanism that distributes
work to the various channels by keeping storage space within each
channel and an extra space that is given to the client. Each time the
client fills the space with data and calls into multifd, that space is
given to the next idle channel and a free storage space is taken from
the channel and given to client for the next iteration.
This means we always need (#multifd_channels + 1) memory slots to
operate multifd.
This is fine, except that the presence of this one extra memory slot
doesn't allow different types of payloads to be processed at the same
time in different channels, i.e. the data type of
multifd_send_state->pages needs to be the same as p->pages.
For each new data type different from MultiFDPage_t that is to be
handled, this logic would need to be duplicated by adding new fields
to multifd_send_state, to the channels and to multifd_send_pages().
Fix this situation by moving the extra slot into the client and using
only the generic type MultiFDSendData in the multifd core.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:58 +0000 (14:45 -0300)]
migration/multifd: Don't send ram data during SYNC
Skip saving and loading any ram data in the packet in the case of a
SYNC. This fixes a shortcoming of the current code which requires a
reset of the MultiFDPages_t fields right after the previous
pending_job finishes, otherwise the very next job might be a SYNC and
multifd_send_fill_packet() will put the stale values in the packet.
By not calling multifd_ram_fill_packet(), we can stop resetting
MultiFDPages_t in the multifd core and leave that to the client code.
Actually moving the reset function is not yet done because
pages->num==0 is used by the client code to determine whether the
MultiFDPages_t needs to be flushed. The subsequent patches will
replace that with a generic flag that is not dependent on
MultiFDPages_t.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:57 +0000 (14:45 -0300)]
migration/multifd: Isolate ram pages packet data
While we cannot yet disentangle the multifd packet from page data, we
can make the code a bit cleaner by setting the page-related fields in
a separate function.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:56 +0000 (14:45 -0300)]
migration/multifd: Remove total pages tracing
The total_normal_pages and total_zero_pages elements are used only for
the end tracepoints of the multifd threads. These are not super useful
since they record per-channel numbers and are just the sum of all the
pages that are transmitted per-packet, for which we already have
tracepoints. Remove the totals from the tracing.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:55 +0000 (14:45 -0300)]
migration/multifd: Move pages accounting into multifd_send_zero_page_detect()
All references to pages are being removed from the multifd worker
threads in order to allow multifd to deal with different payload
types.
multifd_send_zero_page_detect() is called by all multifd migration
paths that deal with pages and is the last spot where zero pages and
normal page amounts are adjusted. Move the pages accounting into that
function.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:54 +0000 (14:45 -0300)]
migration/multifd: Replace p->pages with an union pointer
We want multifd to be able to handle more types of data than just ram
pages. To start decoupling multifd from pages, replace p->pages
(MultiFDPages_t) with the new type MultiFDSendData that hides the
client payload inside an union.
The general idea here is to isolate functions that *need* to handle
MultiFDPages_t and move them in the future to multifd-ram.c, while
multifd.c will stay with only the core functions that handle
MultiFDSendData/MultiFDRecvData.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:53 +0000 (14:45 -0300)]
migration/multifd: Make MultiFDPages_t:offset a flexible array member
We're about to use MultiFDPages_t from inside the MultiFDSendData
payload union, which means we cannot have pointers to allocated data
inside the pages structure, otherwise we'd lose the reference to that
memory once another payload type touches the union. Move the offset
array into the end of the structure and turn it into a flexible array
member, so it is allocated along with the rest of MultiFDSendData in
the next patches.
Note that other pointers, such as the ramblock pointer are still fine
as long as the storage for them is not owned by the migration code and
can be correctly released at some point.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:52 +0000 (14:45 -0300)]
migration/multifd: Introduce MultiFDSendData
Add a new data structure to replace p->pages in the multifd
channel. This new structure will hide the multifd payload type behind
an union, so we don't need to add a new field to the channel each time
we want to handle a different data type.
This also allow us to keep multifd_send_pages() as is, without needing
to complicate the pointer switching.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:49 +0000 (14:45 -0300)]
migration/multifd: Inline page_size and page_count
The MultiFD*Params structures are for per-channel data. Constant
values should not be there because that needlessly wastes cycles and
storage. The page_size and page_count fall into this category so move
them inline in multifd.h.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Tue, 27 Aug 2024 17:45:48 +0000 (14:45 -0300)]
migration/multifd: Reduce access to p->pages
I'm about to replace the p->pages pointer with an opaque pointer, so
do a cleanup now to reduce direct accesses to p->page, which makes the
next diffs cleaner.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
Peter Maydell [Tue, 20 Aug 2024 14:49:12 +0000 (15:49 +0100)]
tests/qtest/migration-test: Don't leak QTestState in test_multifd_tcp_cancel()
In test_multifd_tcp_cancel() we create three QEMU processes: 'from',
'to' and 'to2'. We clean up (via qtest_quit()) 'from' and 'to2' when
we call test_migrate_end(), but never clean up 'to', which results in
this leak:
Direct leak of 336 byte(s) in 1 object(s) allocated from:
#0 0x55e984fcd328 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f328) (BuildId: 710d409b68bb04427009e9ca6e1b63ff8af785d3)
#1 0x7f0878b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55e98503a172 in qtest_spawn_qemu tests/qtest/libqtest.c:397:21
#3 0x55e98502bc4a in qtest_init_internal tests/qtest/libqtest.c:471:9
#4 0x55e98502c5b7 in qtest_init_with_env tests/qtest/libqtest.c:533:21
#5 0x55e9850eef0f in test_migrate_start tests/qtest/migration-test.c:857:11
#6 0x55e9850eb01d in test_multifd_tcp_cancel tests/qtest/migration-test.c:3297:9
#7 0x55e985103407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Call qtest_quit() on 'to' to clean it up once it has exited.
Peter Maydell [Tue, 20 Aug 2024 14:49:11 +0000 (15:49 +0100)]
tests/qtest/migration-test: Don't strdup in get_dirty_rate()
We g_strdup() the "status" string we get out of the qdict in
get_dirty_rate(), but we never free it. Since we only use this
string while the dictionary is still valid, we don't need to strdup
at all; drop the unnecessary call to avoid this leak:
Direct leak of 18 byte(s) in 2 object(s) allocated from:
#0 0x564b3e01913e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: d6403a811332fcc846f93c45e23abfd06d1e67c4)
#1 0x7f2f278ff738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7f2f27914583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x564b3e14bb5b in get_dirty_rate tests/qtest/migration-test.c:3447:14
#4 0x564b3e138e00 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3565:16
#5 0x564b3e14f417 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Peter Maydell [Tue, 20 Aug 2024 14:49:10 +0000 (15:49 +0100)]
tests/qtest/migration-helpers: Don't dup argument to qdict_put_str()
In migrate_set_ports() we call qdict_put_str() with a value string
which we g_strdup(). However qdict_put_str() takes a copy of the
value string, it doesn't take ownership of it, so the g_strdup()
only results in a leak:
Direct leak of 6 byte(s) in 1 object(s) allocated from:
#0 0x56298023713e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: b2b9174a5a54707a7f76bca51cdc95d2aa08bac1)
#1 0x7fba0ad39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7fba0ad4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x56298036b16e in migrate_set_ports tests/qtest/migration-helpers.c:145:49
#4 0x56298036ad1c in migrate_qmp tests/qtest/migration-helpers.c:228:9
#5 0x56298035b3dd in test_precopy_common tests/qtest/migration-test.c:1820:5
#6 0x5629803549dc in test_multifd_tcp_channels_none tests/qtest/migration-test.c:3077:5
#7 0x56298036d427 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
Peter Maydell [Tue, 20 Aug 2024 14:49:09 +0000 (15:49 +0100)]
tests/unit/crypto-tls-x509-helpers: deinit privkey in test_tls_cleanup
We create a gnutls_x509_privkey_t in test_tls_init(), but forget
to deinit it in test_tls_cleanup(), resulting in leaks
reported in hte migration test such as:
Indirect leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x55fa6d11c12e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f12e) (BuildId: 852a267993587f557f50e5715f352f43720077ba)
#1 0x7f073982685d in __gmp_default_allocate (/lib/x86_64-linux-gnu/libgmp.so.10+0xa85d) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
#2 0x7f0739836193 in __gmpz_realloc (/lib/x86_64-linux-gnu/libgmp.so.10+0x1a193) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
#3 0x7f0739836594 in __gmpz_import (/lib/x86_64-linux-gnu/libgmp.so.10+0x1a594) (BuildId: f110719303ddbea25a5e89ff730fec520eed67b0)
#4 0x7f07398a91ed in nettle_mpz_set_str_256_u (/lib/x86_64-linux-gnu/libhogweed.so.6+0xb1ed) (BuildId: 3cc4a3474de72db89e9dcc93bfb95fe377f48c37)
#5 0x7f073a146a5a (/lib/x86_64-linux-gnu/libgnutls.so.30+0x131a5a) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#6 0x7f073a07192c (/lib/x86_64-linux-gnu/libgnutls.so.30+0x5c92c) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#7 0x7f073a078333 (/lib/x86_64-linux-gnu/libgnutls.so.30+0x63333) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#8 0x7f073a0e8353 (/lib/x86_64-linux-gnu/libgnutls.so.30+0xd3353) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#9 0x7f073a0ef0ac in gnutls_x509_privkey_import (/lib/x86_64-linux-gnu/libgnutls.so.30+0xda0ac) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#10 0x55fa6d2547e3 in test_tls_load_key tests/unit/crypto-tls-x509-helpers.c:99:11
#11 0x55fa6d25460c in test_tls_init tests/unit/crypto-tls-x509-helpers.c:128:15
#12 0x55fa6d2495c4 in test_migrate_tls_x509_start_common tests/qtest/migration-test.c:1044:5
#13 0x55fa6d24c23a in test_migrate_tls_x509_start_reject_anon_client tests/qtest/migration-test.c:1216:12
#14 0x55fa6d23fb40 in test_precopy_common tests/qtest/migration-test.c:1789:21
#15 0x55fa6d236b7c in test_precopy_tcp_tls_x509_reject_anon_client tests/qtest/migration-test.c:2614:5
(Oddly, there is no reported leak in the x509 unit tests, even though
those also use test_tls_init() and test_tls_cleanup().)
In the migration test we create several TLS certificates with
the TLS_* macros from crypto-tls-x509-helpers.h. These macros
create both a QCryptoTLSCertReq object which must be deinitialized
and also an on-disk certificate file. The migration test currently
removes the on-disk file in test_migrate_tls_x509_finish() but
never deinitializes the QCryptoTLSCertReq, which means that memory
allocated as part of it is leaked:
Indirect leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x5558ba33712e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f12e) (BuildId: 4c8618f663e538538cad19d35233124cea161491)
#1 0x7f64afc131f4 (/lib/x86_64-linux-gnu/libtasn1.so.6+0x81f4) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
#2 0x7f64afc18d58 in asn1_write_value (/lib/x86_64-linux-gnu/libtasn1.so.6+0xdd58) (BuildId: 2fde6ecb43c586fe4077118f771077aa1298e7ea)
#3 0x7f64af8fc678 in gnutls_x509_crt_set_version (/lib/x86_64-linux-gnu/libgnutls.so.30+0xe7678) (BuildId: 97b8f99f392f1fd37b969a7164bcea884e23649b)
#4 0x5558ba470035 in test_tls_generate_cert tests/unit/crypto-tls-x509-helpers.c:234:5
#5 0x5558ba464e4a in test_migrate_tls_x509_start_common tests/qtest/migration-test.c:1058:5
#6 0x5558ba462c8a in test_migrate_tls_x509_start_default_host tests/qtest/migration-test.c:1123:12
#7 0x5558ba45ab40 in test_precopy_common tests/qtest/migration-test.c:1786:21
#8 0x5558ba450015 in test_precopy_unix_tls_x509_default_host tests/qtest/migration-test.c:2077:5
#9 0x5558ba46d3c7 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
(and similar reports).
The only function currently provided to deinit a QCryptoTLSCertReq is
test_tls_discard_cert(), which also removes the on-disk certificate
file. For the migration tests we need to retain the on-disk files
until we've finished running the test, so the simplest fix is to
provide a new function test_tls_deinit_cert() which does only the
cleanup of the QCryptoTLSCertReq, and call it in the right places.
In migrate_get_socket_address() we leak the SocketAddressList:
(cd build/asan && \
ASAN_OPTIONS="fast_unwind_on_malloc=0:strip_path_prefix=/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../"
QTEST_QEMU_BINARY=./qemu-system-x86_64 \
./tests/qtest/migration-test --tap -k -p /x86_64/migration/multifd/tcp/tls/psk/match )
[...]
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x563d7f22f318 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f318) (BuildId: 2ad6282fb5d076c863ab87f41a345d46dc965ded)
#1 0x7f9de3b39c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x563d7f3a119c in qobject_input_start_list qapi/qobject-input-visitor.c:336:17
#3 0x563d7f390fbf in visit_start_list qapi/qapi-visit-core.c:80:10
#4 0x563d7f3882ef in visit_type_SocketAddressList /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-visit-sockets.c:519:10
#5 0x563d7f3658c9 in migrate_get_socket_address tests/qtest/migration-helpers.c:97:5
#6 0x563d7f362e24 in migrate_get_connect_uri tests/qtest/migration-helpers.c:111:13
#7 0x563d7f362bb2 in migrate_qmp tests/qtest/migration-helpers.c:222:23
#8 0x563d7f3533cd in test_precopy_common tests/qtest/migration-test.c:1817:5
#9 0x563d7f34dc1c in test_multifd_tcp_tls_psk_match tests/qtest/migration-test.c:3185:5
#10 0x563d7f365337 in migration_test_wrapper tests/qtest/migration-helpers.c:458:5
The code fishes out the SocketAddress from the list to return it, and the
callers are freeing that, but nothing frees the list.
Since this function is called in only two places, the simple fix is to
make it return the SocketAddressList rather than just a SocketAddress,
and then the callers can easily access the SocketAddress, and free
the whole SocketAddressList when they're done.
Peter Maydell [Tue, 20 Aug 2024 14:49:06 +0000 (15:49 +0100)]
tests/qtest/migration-test: Fix leaks in calc_dirtyrate_ready()
In calc_dirtyrate_ready() we g_strdup() a string but then never free it:
Direct leak of 19 byte(s) in 2 object(s) allocated from:
#0 0x55ead613413e in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f13e) (BuildId: e7cd5c37b2987a1af682b43ee5240b98bb316737)
#1 0x7f7a13d39738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7f7a13d4e583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x55ead6266f48 in calc_dirtyrate_ready tests/qtest/migration-test.c:3409:14
#4 0x55ead62669fe in wait_for_calc_dirtyrate_complete tests/qtest/migration-test.c:3422:13
#5 0x55ead6253df7 in test_vcpu_dirty_limit tests/qtest/migration-test.c:3562:9
#6 0x55ead626a407 in migration_test_wrapper tests/qtest/migration-helpers.c:456:5
We also fail to unref the QMP rsp_return, so we leak that also.
Rather than duplicating the string, use the in-place value from
the qdict, and then unref the qdict.
Peter Maydell [Tue, 20 Aug 2024 14:49:05 +0000 (15:49 +0100)]
tests/qtest/migration-test: Don't leak resp in multifd_mapped_ram_fdset_end()
In multifd_mapped_ram_fdset_end() we call qtest_qmp() but forgot
to unref the response QDict we get back, which means it is leaked:
Indirect leak of 4120 byte(s) in 1 object(s) allocated from:
#0 0x55c0c095d318 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/tests/qtest/migration-test+0x22f318) (BuildI
d: 07f667506452d6c467dbc06fd95191966d3e91b4)
#1 0x7f186f939c50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x55c0c0ae9b01 in qdict_new qobject/qdict.c:30:13
#3 0x55c0c0afc16c in parse_object qobject/json-parser.c:317:12
#4 0x55c0c0afb90f in parse_value qobject/json-parser.c:545:16
#5 0x55c0c0afb579 in json_parser_parse qobject/json-parser.c:579:14
#6 0x55c0c0afa21d in json_message_process_token qobject/json-streamer.c:92:12
#7 0x55c0c0bca2e5 in json_lexer_feed_char qobject/json-lexer.c:313:13
#8 0x55c0c0bc97ce in json_lexer_feed qobject/json-lexer.c:350:9
#9 0x55c0c0afabbc in json_message_parser_feed qobject/json-streamer.c:121:5
#10 0x55c0c09cbd52 in qmp_fd_receive tests/qtest/libqmp.c:86:9
#11 0x55c0c09be69b in qtest_qmp_receive_dict tests/qtest/libqtest.c:760:12
#12 0x55c0c09bca77 in qtest_qmp_receive tests/qtest/libqtest.c:741:27
#13 0x55c0c09bee9d in qtest_vqmp tests/qtest/libqtest.c:812:12
#14 0x55c0c09bd257 in qtest_qmp tests/qtest/libqtest.c:835:16
#15 0x55c0c0a87747 in multifd_mapped_ram_fdset_end tests/qtest/migration-test.c:2393:12
#16 0x55c0c0a85eb3 in test_file_common tests/qtest/migration-test.c:1978:9
#17 0x55c0c0a746a3 in test_multifd_file_mapped_ram_fdset tests/qtest/migration-test.c:2437:5
#18 0x55c0c0a93237 in migration_test_wrapper tests/qtest/migration-helpers.c:458:5
#19 0x7f186f958aed in test_case_run debian/build/deb/../../../glib/gtestutils.c:2930:15
#20 0x7f186f958aed in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3018:16
#21 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#22 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#23 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#24 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#25 0x7f186f95880a in g_test_run_suite_internal debian/build/deb/../../../glib/gtestutils.c:3035:18
#26 0x7f186f958faa in g_test_run_suite debian/build/deb/../../../glib/gtestutils.c:3109:18
#27 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2231:7
#28 0x7f186f959055 in g_test_run debian/build/deb/../../../glib/gtestutils.c:2218:1
#29 0x55c0c0a6e427 in main tests/qtest/migration-test.c:4033:11
Unref the object after we've confirmed that it is what we expect.
If you invoke the migration-test binary in such a way that it doesn't run
any tests, then we never call bootfile_create(), and at the end of
main() bootfile_delete() will try to unlink(NULL), which is not valid.
This can happen if for instance you tell the test binary to run a
subset of tests that turns out to be empty, like this:
(cd build/asan && QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test --tap -k -p bang)
# random seed: R02S6501b289ff8ced4231ba452c3a87bc6f
# Skipping test: userfaultfd not available
1..0
../../tests/qtest/migration-test.c:182:12: runtime error: null pointer passed as argument 1, which is declared to never be null
/usr/include/unistd.h:858:48: note: nonnull attribute specified here
Handle this by making bootfile_delete() not needing to do anything
because bootfile_create() was never called.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Fabiano Rosas <farosas@suse.de>
[fixed conflict with aee07f2563] Signed-off-by: Fabiano Rosas <farosas@suse.de>
Fabiano Rosas [Fri, 2 Aug 2024 14:53:01 +0000 (11:53 -0300)]
tests/qtest/migration: Remove vmstate-static-checker test
I fumbled one of my last pull requests when fixing in-tree an issue
with commit 87d67fadb9 ("monitor: Stop removing non-duplicated
fds"). Basically mixed-up my `git add -p` and `git checkout -p` and
committed a piece of test infra that has not been reviewed yet.
This has not caused any bad symptoms because the test is not enabled
by default anywhere: make check doesn't use two qemu binaries and the
CI doesn't have PYTHON set for the compat tests. Besides, the test
works fine anyway, it would not break anything.
Remove this because it was never intended to be merged.
Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Fabiano Rosas <farosas@suse.de>
John Snow [Thu, 22 Aug 2024 20:48:03 +0000 (16:48 -0400)]
docs/sphinx: fix extra stuff in TOC after freeform QMP sections
Freeform sections with titles are currently generating a TOC entry for
the first paragraph in the section after the header, which is not what
we want.
(Easiest to observe directly in the QMP reference manual's
"Introduction" section.)
When freeform sections are parsed, we create both a section header *and*
an empty, title-less section. This causes some problems with sphinx's
post-parse tree transforms, see also 2664f317 - this is a similar issue:
Sphinx doesn't like section-less titles and it also doesn't like
title-less sections.
Modify qapidoc.py to parse text directly into the preceding section
title as child nodes, eliminating the section duplication. This removes
the extra text from the TOC.
Only very, very lightly tested: "it looks right at a glance" :tm:. I am
still in the process of rewriting qapidoc, so I didn't give it much
deeper thought.
Reported-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: John Snow <jsnow@redhat.com>
Message-ID: <20240822204803.1649762-1-jsnow@redhat.com>
Eric Blake [Thu, 22 Aug 2024 14:35:29 +0000 (09:35 -0500)]
nbd/server: CVE-2024-7409: Avoid use-after-free when closing server
Commit 3e7ef738 plugged the use-after-free of the global nbd_server
object, but overlooked a use-after-free of nbd_server->listener.
Although this race is harder to hit, notice that our shutdown path
first drops the reference count of nbd_server->listener, then triggers
actions that can result in a pending client reaching the
nbd_blockdev_client_closed() callback, which in turn calls
qio_net_listener_set_client_func on a potentially stale object.
If we know we don't want any more clients to connect, and have already
told the listener socket to shut down, then we should not be trying to
update the listener socket's associated function.
* tag 'pull-request-2024-08-26' of https://gitlab.com/thuth/qemu:
tests/qtest: Delete previous boot file
.gitlab-ci.d/windows.yml: Disable the qtests in the MSYS2 job
gitlab-ci: Replace build_script -> step_script in Cirrus jobs
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Akihiko Odaki [Fri, 23 Aug 2024 06:13:12 +0000 (15:13 +0900)]
tests/qtest: Delete previous boot file
A test run may create boot files several times. Delete the previous boot
file before creating a new one.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-ID: <20240823-san-v4-7-a24c6dfa4ceb@daynix.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Thomas Huth [Tue, 20 Aug 2024 17:01:42 +0000 (19:01 +0200)]
.gitlab-ci.d/windows.yml: Disable the qtests in the MSYS2 job
The qtests are broken since a while in the MSYS2 job in the gitlab-CI,
likely due to some changes in the MSYS2 environment. So far nobody has
neither a clue what's going wrong here, nor an idea how to fix this
(in fact most QEMU developers even don't have a Windows environment
available for properly analyzing this problem), so we should disable the
qtests here for the time being to get at least test coverage again
for the remaining tests that are run here.
Since we already get compile-test coverage for the system emulation
in the cross-win64-system job, and since the MSYS2 job is one of the
longest running jobs in our CI (it takes more than 1 hour to complete),
let's seize the opportunity and also cut the run time by disabling
the system emulation completely here, including the libraries that
are only useful for system emulation. In case somebody ever figures
out the failure of the qtests on MSYS2, we can revert this patch
to get everything back.
Suggested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240820170142.55324-1-thuth@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
gitlab-ci: Replace build_script -> step_script in Cirrus jobs
Long due upgrade, see [1]:
In GitLab Runner 13.2 a translation for step_script to
build_script was added to the custom executor. In 14.0
the build_script stage will be replaced with step_script.
We are using GitLab 17 [2]!
This removes the following warning:
WARNING: Starting with version 17.0 the 'build_script'
stage will be replaced with 'step_script':
https://gitlab.com/groups/gitlab-org/-/epics/6112
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240816213203.18350-1-philmd@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>
Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into staging
trivial patches for 2024-08-23
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCAAdFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmbImVIACgkQcBtPaxpp
# PllP3wf/TaYAQs0HkQRQ62/2wqnfABpZYft/g6EhHveZ/04pJ/eNIIiVqqUg4DGs
# i8fENABRlRPoeK5HtGVhHYbOg6tzje7MR0qdSmWaKb2R5pPqkLHZ6NTtQlINLpOb
# O8Nh1c5/qDW/pDPCWVLkEMTqKhtGfINr0pHSlTfOr0W9FrU1I6srvr6AZtrTORlL
# 5b79j5IZGQSj5zR3ViuKyEPdA5NRSeTOewg8WCKGSxZGk4OlVPevrEAGOyQReOuN
# HTfNi8KQH/pPzl6+f+THkgKmYYfUAlPvzkJDndV9vcPFLPI8ZncZ1o1Kmog6UERc
# s5J2vTcir/ReEukApRRsZkKHLAoYdQ==
# =Srl8
# -----END PGP SIGNATURE-----
# gpg: Signature made Sat 24 Aug 2024 12:14:42 AM AEST
# gpg: using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full]
# gpg: aka "Michael Tokarev <mjt@debian.org>" [full]
# gpg: aka "Michael Tokarev <mjt@corpit.ru>" [full]
* tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu:
hw/display/vhost-user-gpu.c: fix vhost_user_gpu_chr_read()
system/vl.c: Print machine name, not "(null)", for unknown machine types
hw/x86: add a couple of comments explaining how the kernel image is parsed
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Peter Maydell [Thu, 22 Aug 2024 12:23:10 +0000 (13:23 +0100)]
system/vl.c: Print machine name, not "(null)", for unknown machine types
In commit 412d294ffdc we tried to improve the error message printed when
the machine type is unknown, but we used the wrong variable, resulting in:
$ ./build/x86/qemu-system-aarch64 -M bang
qemu-system-aarch64: unsupported machine type: "(null)"
Use -machine help to list supported machines
Use the right variable, so we produce more helpful output:
$ ./build/x86/qemu-system-aarch64 -M bang
qemu-system-aarch64: unsupported machine type: "bang"
Use -machine help to list supported machines
Note that we must move the qdict_del() to below the error_setg(),
because machine_type points into the value of that qdict entry,
and deleting it will make the pointer invalid.
Cc: qemu-stable@nongnu.org Fixes: 412d294ffdc ("vl.c: select_machine(): add selected machine type to error message") Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Ani Sinha [Fri, 19 Jul 2024 13:49:37 +0000 (19:19 +0530)]
hw/x86: add a couple of comments explaining how the kernel image is parsed
Cosmetic: add comments in x86_load_linux() pointing to the kernel documentation
so that users can better understand the code.
CC: qemu-trivial@nongnu.org Signed-off-by: Ani Sinha <anisinha@redhat.com> Reviewed-by: Zhao Liu <zhao1.liu@intel.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Merge tag 'pull-loongarch-20240821' of https://gitlab.com/gaosong/qemu into staging
Fix for 9.1
# -----BEGIN PGP SIGNATURE-----
#
# iLMEAAEKAB0WIQS4/x2g0v3LLaCcbCxAov/yOSY+3wUCZsVYjgAKCRBAov/yOSY+
# 306ZA/9/DFdJB5WbVtv8ZNaRKT2jj6N9o5YlLbO1HsdMGpJbDWNJAIrOIdfBCYzF
# oEvjuYItBI9DXcSUE748ucBkct/x4WkBwfL5mxfTRXOhvx3iKFeC2ZKyKPtsciRO
# QE4UDmrFbQ9IrW33Vw0+CRMlN/U8xBO7lPDfbk2MA7fM74ns8A==
# =EbRt
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 21 Aug 2024 01:01:34 PM AEST
# gpg: using RSA key B8FF1DA0D2FDCB2DA09C6C2C40A2FFF239263EDF
# gpg: Good signature from "Song Gao <m17746591750@163.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B8FF 1DA0 D2FD CB2D A09C 6C2C 40A2 FFF2 3926 3EDF
* tag 'pull-loongarch-20240821' of https://gitlab.com/gaosong/qemu:
hw/loongarch: Fix length for lowram in ACPI SRAT
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: fc100011f38d ("hw/loongarch: Refine acpi srat table for numa memory") Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Reviewed-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Song Gao <gaosong@loongson.cn>
Merge tag 'pull-misc-20240821' of https://gitlab.com/rth7680/qemu into staging
target/i386: Fix carry flag for BLSI
target/i386: Fix tss access size in switch_tss_ra
linux-user: Handle short reads in mmap_h_gt_g
bsd-user: Handle short reads in mmap_h_gt_g
* tag 'pull-misc-20240821' of https://gitlab.com/rth7680/qemu:
target/i386: Fix tss access size in switch_tss_ra
target/i386: Fix carry flag for BLSI
target/i386: Split out gen_prepare_val_nz
bsd-user: Handle short reads in mmap_h_gt_g
linux-user: Handle short reads in mmap_h_gt_g
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The two limit_max variables represent size - 1, just like the
encoding in the GDT, thus the 'old' access was off by one.
Access the minimal size of the new tss: the complete tss contains
the iopb, which may be a larger block than the access api expects,
and irrelevant because the iopb is not accessed during the
switch itself.
Fixes: 8b131065080a ("target/i386/tcg: use X86Access for TSS access")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2511 Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240819074052.207783-1-richard.henderson@linaro.org> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Split out the TCG_COND_TSTEQ logic from gen_prepare_eflags_z,
and use it for CC_OP_BMILG* as well. Prepare for requiring
both zero and non-zero senses.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240801075845.573075-2-richard.henderson@linaro.org>
In particular, if an image has a large bss, we can hit EOF before reading
all bytes of the mapping. Mirror the similar change to linux-user.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20240820050848.165253-3-richard.henderson@linaro.org>
Peter Maydell [Tue, 20 Aug 2024 14:44:29 +0000 (15:44 +0100)]
migration/multifd: Free MultiFDRecvParams::data
In multifd_recv_setup() we allocate (among other things)
* a MultiFDRecvData struct to multifd_recv_state::data
* a MultiFDRecvData struct to each multfd_recv_state->params[i].data
(Then during execution we might swap these pointers around.)
But in multifd_recv_cleanup() we free multifd_recv_state->data
in multifd_recv_cleanup_state() but we don't ever free the
multifd_recv_state->params[i].data. This results in a memory
leak reported by LeakSanitizer:
(cd build/asan && \
ASAN_OPTIONS="fast_unwind_on_malloc=0:strip_path_prefix=/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../" \
QTEST_QEMU_BINARY=./qemu-system-x86_64 \
./tests/qtest/migration-test --tap -k -p /x86_64/migration/multifd/file/mapped-ram )
[...]
Direct leak of 72 byte(s) in 3 object(s) allocated from:
#0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
#1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x561cc1e9c83c in multifd_recv_setup migration/multifd.c:1606:19
#3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9
#4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9
#5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5
#6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12
#7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9
#10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5
#11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11
#12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9
#13 0x561cc3796c1c in qemu_default_main system/main.c:37:14
#14 0x561cc3796c67 in main system/main.c:48:12
#15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3
#17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x561cc0afcfd8 in __interceptor_calloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218efd8) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
#1 0x7f89d37acc50 in g_malloc0 debian/build/deb/../../../glib/gmem.c:161:13
#2 0x561cc1e9bed9 in multifd_recv_setup migration/multifd.c:1588:32
#3 0x561cc1e68618 in migration_ioc_process_incoming migration/migration.c:972:9
#4 0x561cc1e3ac59 in migration_channel_process_incoming migration/channel.c:45:9
#5 0x561cc1e4fa0b in file_accept_incoming_migration migration/file.c:132:5
#6 0x561cc30f2c0c in qio_channel_fd_source_dispatch io/channel-watch.c:84:12
#7 0x7f89d37a3c43 in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#8 0x7f89d37a3c43 in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#9 0x561cc3b21659 in glib_pollfds_poll util/main-loop.c:287:9
#10 0x561cc3b1ff93 in os_host_main_loop_wait util/main-loop.c:310:5
#11 0x561cc3b1fb5c in main_loop_wait util/main-loop.c:589:11
#12 0x561cc1da2917 in qemu_main_loop system/runstate.c:801:9
#13 0x561cc3796c1c in qemu_default_main system/main.c:37:14
#14 0x561cc3796c67 in main system/main.c:48:12
#15 0x7f89d163bd8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#16 0x7f89d163be3f in __libc_start_main csu/../csu/libc-start.c:392:3
#17 0x561cc0a79fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: be72e086d4e47b172b0a72779972213fd9916466)
SUMMARY: AddressSanitizer: 96 byte(s) leaked in 4 allocation(s).
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio: regression fixes
3 small patches to make sure we don't ship regressions.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmbEdw8PHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRp0dsIAKTzhmBR3IviFQVo223RgcDfthxoKejTB5tv
# EhGVUi4ddrViIIHsKFZ0pTHXnRcwHpPRokg6GrbqNhrAM6K7ptP8pkEK1DDkbGtq
# HaeceK55nNZ/wM1O5xHpRLVc2WtxmBrliDTFHGB2HjURO/kpjoHqWbE6Sn4GILc1
# EYU2T3Wn1UFgj+H4L7yF4SzmQSmyzq+7Tml6Z2GzpsatdwCoFQz2nA28piCnRMCq
# lusMo2YdE6js9JS/h+zMqgKValuCyuU7S7ZbSO2dvYQwt/hgk07BegBrdsAENNh6
# 0IWRHrojwAg+4U6ULzbrBG6/hW2A8Q5065D8Nf9Bjy4eAU7QSbU=
# =K6xx
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 20 Aug 2024 08:59:27 PM AEST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: The key's User ID is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu:
virtio-pci: Fix the use of an uninitialized irqfd
hw/audio/virtio-snd: fix invalid param check
vhost: Add VIRTIO_NET_F_RSC_EXT to vhost feature bits
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Cindy Lu [Tue, 6 Aug 2024 09:37:12 +0000 (17:37 +0800)]
virtio-pci: Fix the use of an uninitialized irqfd
The crash was reported in MAC OS and NixOS, here is the link for this bug
https://gitlab.com/qemu-project/qemu/-/issues/2334
https://gitlab.com/qemu-project/qemu/-/issues/2321
In this bug, they are using the virtio_input device. The guest notifier was
not supported for this device, The function virtio_pci_set_guest_notifiers()
was not called, and the vector_irqfd was not initialized.
So the fix is adding the check for vector_irqfd in virtio_pci_get_notifier()
The function virtio_pci_get_notifier() can be used in various devices.
It could also be called when VIRTIO_CONFIG_S_DRIVER_OK is not set. In this situation,
the vector_irqfd being NULL is acceptable. We can allow the device continue to boot
If the vector_irqfd still hasn't been initialized after VIRTIO_CONFIG_S_DRIVER_OK
is set, it means that the function set_guest_notifiers was not called before the
driver started. This indicates that the device is not using the notifier.
At this point, we will let the check fail.
This fix is verified in vyatta,MacOS,NixOS,fedora system.
The bt tree for this bug is:
Thread 6 "CPU 0/KVM" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7c817be006c0 (LWP 1269146)]
kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817
817 if (irqfd->users == 0) {
(gdb) thread apply all bt
...
Thread 6 (Thread 0x7c817be006c0 (LWP 1269146) "CPU 0/KVM"):
0 kvm_virtio_pci_vq_vector_use () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:817
1 kvm_virtio_pci_vector_use_one () at ../qemu-9.0.0/hw/virtio/virtio-pci.c:893
2 0x00005983657045e2 in memory_region_write_accessor () at ../qemu-9.0.0/system/memory.c:497
3 0x0000598365704ba6 in access_with_adjusted_size () at ../qemu-9.0.0/system/memory.c:573
4 0x0000598365705059 in memory_region_dispatch_write () at ../qemu-9.0.0/system/memory.c:1528
5 0x00005983659b8e1f in flatview_write_continue_step.isra.0 () at ../qemu-9.0.0/system/physmem.c:2713
6 0x000059836570ba7d in flatview_write_continue () at ../qemu-9.0.0/system/physmem.c:2743
7 flatview_write () at ../qemu-9.0.0/system/physmem.c:2774
8 0x000059836570bb76 in address_space_write () at ../qemu-9.0.0/system/physmem.c:2894
9 0x0000598365763afe in address_space_rw () at ../qemu-9.0.0/system/physmem.c:2904
10 kvm_cpu_exec () at ../qemu-9.0.0/accel/kvm/kvm-all.c:2917
11 0x000059836576656e in kvm_vcpu_thread_fn () at ../qemu-9.0.0/accel/kvm/kvm-accel-ops.c:50
12 0x0000598365926ca8 in qemu_thread_start () at ../qemu-9.0.0/util/qemu-thread-posix.c:541
13 0x00007c8185bcd1cf in ??? () at /usr/lib/libc.so.6
14 0x00007c8185c4e504 in clone () at /usr/lib/libc.so.6
Fixes: 2ce6cff94d ("virtio-pci: fix use of a released vector") Cc: qemu-stable@nongnu.org Signed-off-by: Cindy Lu <lulu@redhat.com>
Message-Id: <20240806093715.65105-1-lulu@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Volker Rümelin [Fri, 2 Aug 2024 07:18:05 +0000 (09:18 +0200)]
hw/audio/virtio-snd: fix invalid param check
Commit 9b6083465f ("virtio-snd: check for invalid param shift
operands") tries to prevent invalid parameters specified by the
guest. However, the code is not correct.
Change the code so that the parameters format and rate, which are
a bit numbers, are compared with the bit size of the data type.
Fixes: 9b6083465f ("virtio-snd: check for invalid param shift operands") Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20240802071805.7123-1-vr_qemu@t-online.de> Reviewed-by: Manos Pitsidianakis <manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Akihiko Odaki [Fri, 2 Aug 2024 05:38:19 +0000 (14:38 +0900)]
vhost: Add VIRTIO_NET_F_RSC_EXT to vhost feature bits
VIRTIO_NET_F_RSC_EXT is implemented in the rx data path, which vhost
implements, so vhost needs to support the feature if it is ever to be
enabled with vhost. The feature must be disabled otherwise.
Fixes: 2974e916df87 ("virtio-net: support RSC v4/v6 tcp traffic for Windows HCK") Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20240802-rsc-v1-1-2b607bd2f555@daynix.com> Acked-by: Jason Wang <jasowang@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Klaus Jensen [Tue, 20 Aug 2024 04:16:48 +0000 (06:16 +0200)]
hw/nvme: fix leak of uninitialized memory in io_mgmt_recv
Yutaro Shimizu from the Cyber Defense Institute discovered a bug in the
NVMe emulation that leaks contents of an uninitialized heap buffer if
subsystem and FDP emulation are enabled.
Cc: qemu-stable@nongnu.org Reported-by: Yutaro Shimizu <shimizu@cyberdefense.jp> Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Merge tag 'hw-misc-20240820' of https://github.com/philmd/qemu into staging
Various fixes
- Null pointer dereference in IPI IOCSR (Jiaxun)
- Correct '-smbios type=4' in man page (Heinrich)
- Use correct MMU index in MIPS get_pte (Phil)
- Reset MPQEMU remote message using device_cold_reset (Peter)
- Update linux-user MIPS CPU list (Phil)
- Do not let exec_command read console if no pattern to wait for (Nick)
- Remove shadowed declaration warning (Pierrick)
- Restrict STQF opcode to SPARC V9 (Richard)
- Add missing Kconfig dependency for POWERNV ISA serial port (Bernhard)
- Do not allow vmport device without i8042 PS/2 controller (Kamil)
- Fix QCryptoTLSCredsPSK leak (Peter)
* tag 'hw-misc-20240820' of https://github.com/philmd/qemu:
crypto/tlscredspsk: Free username on finalize
hw/i386/pc: Ensure vmport prerequisites are fulfilled
hw/i386/pc: Unify vmport=auto handling
hw/ppc/Kconfig: Add missing SERIAL_ISA dependency to POWERNV machine
target/sparc: Restrict STQF to sparcv9
contrib/plugins/execlog: Fix shadowed declaration warning
tests/avocado: Mark ppc_hv_tests.py as non-flaky after fixed console interaction
tests/avocado: exec_command should not consume console output
linux-user/mips: Select Loongson CPU for Loongson binaries
linux-user/mips: Select MIPS64R2-generic for Rel2 binaries
linux-user/mips: Select Octeon68XX CPU for Octeon binaries
linux-user/mips: Do not try to use removed R5900 CPU
hw/remote/message.c: Don't directly invoke DeviceClass:reset
hw/dma/xilinx_axidma: Use semicolon at end of statement, not comma
target/mips: Load PTE as DATA
target/mips: Use correct MMU index in get_pte()
target/mips: Pass page table entry size as MemOp to get_pte()
qemu-options.hx: correct formatting -smbios type=4
hw/mips/loongson3_virt: Fix condition of IPI IOCSR connection
hw/mips/loongson3_virt: Store core_iocsr into LoongsonMachineState
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Peter Maydell [Mon, 19 Aug 2024 14:50:21 +0000 (15:50 +0100)]
crypto/tlscredspsk: Free username on finalize
When the creds->username property is set we allocate memory
for it in qcrypto_tls_creds_psk_prop_set_username(), but
we never free this when the QCryptoTLSCredsPSK is destroyed.
Free the memory in finalize.
This fixes a LeakSanitizer complaint in migration-test:
Direct leak of 5 byte(s) in 1 object(s) allocated from:
#0 0x5624e5c99dee in malloc (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x218edee) (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
#1 0x7fb199ae9738 in g_malloc debian/build/deb/../../../glib/gmem.c:128:13
#2 0x7fb199afe583 in g_strdup debian/build/deb/../../../glib/gstrfuncs.c:361:17
#3 0x5624e82ea919 in qcrypto_tls_creds_psk_prop_set_username /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../crypto/tlscredspsk.c:255:23
#4 0x5624e812c6b5 in property_set_str /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:2277:5
#5 0x5624e8125ce5 in object_property_set /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object.c:1463:5
#6 0x5624e8136e7c in object_set_properties_from_qdict /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:55:14
#7 0x5624e81372d2 in user_creatable_add_type /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:112:5
#8 0x5624e8137964 in user_creatable_add_qapi /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/object_interfaces.c:157:11
#9 0x5624e891ba3c in qmp_object_add /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qom/qom-qmp-cmds.c:227:5
#10 0x5624e8af9118 in qmp_marshal_object_add /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qapi/qapi-commands-qom.c:337:5
#11 0x5624e8bd1d49 in do_qmp_dispatch_bh /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../qapi/qmp-dispatch.c:128:5
#12 0x5624e8cb2531 in aio_bh_call /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:171:5
#13 0x5624e8cb340c in aio_bh_poll /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:218:13
#14 0x5624e8c0be98 in aio_dispatch /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/aio-posix.c:423:5
#15 0x5624e8cba3ce in aio_ctx_dispatch /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/async.c:360:5
#16 0x7fb199ae0d3a in g_main_dispatch debian/build/deb/../../../glib/gmain.c:3419:28
#17 0x7fb199ae0d3a in g_main_context_dispatch debian/build/deb/../../../glib/gmain.c:4137:7
#18 0x5624e8cbe1d9 in glib_pollfds_poll /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:287:9
#19 0x5624e8cbcb13 in os_host_main_loop_wait /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:310:5
#20 0x5624e8cbc6dc in main_loop_wait /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../util/main-loop.c:589:11
#21 0x5624e6f3f917 in qemu_main_loop /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/runstate.c:801:9
#22 0x5624e893379c in qemu_default_main /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:37:14
#23 0x5624e89337e7 in main /mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/../../system/main.c:48:12
#24 0x7fb197972d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#25 0x7fb197972e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#26 0x5624e5c16fa4 in _start (/mnt/nvmedisk/linaro/qemu-from-laptop/qemu/build/asan/qemu-system-x86_64+0x210bfa4) (BuildId: a9e623fa1009a9435c0142c037cd7b8c1ad04ce3)
SUMMARY: AddressSanitizer: 5 byte(s) leaked in 1 allocation(s).
Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-ID: <20240819145021.38524-1-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Kamil Szczęk [Sat, 17 Aug 2024 15:26:15 +0000 (15:26 +0000)]
hw/i386/pc: Ensure vmport prerequisites are fulfilled
Since commit 4ccd5fe22feb95137d325f422016a6473541fe9f ('pc: add option
to disable PS/2 mouse/keyboard'), the vmport will not be created unless
the i8042 PS/2 controller is enabled. To avoid confusion, let's fail if
vmport was explicitly requested, but the i8042 controller is disabled.
This also changes the behavior of vmport=auto to take i8042 controller
availability into account.
Signed-off-by: Kamil Szczęk <kamil@szczek.dev> Reviewed-by: Bernhard Beschow <shentey@gmail.com>
Message-ID: <0MS3y5E-hHqODIhiuFxmCnIrXd612JIGq31UuMsz4KGCKZ_wWuF-PHGKTRSGS0nWaPEddOdF4YOczHdgorulECPo792OhWov7O9BBF6UMX4=@szczek.dev> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Kamil Szczęk [Sat, 17 Aug 2024 15:25:31 +0000 (15:25 +0000)]
hw/i386/pc: Unify vmport=auto handling
The code which translates vmport=auto to on/off is currently separate
for each PC machine variant, while being functionally equivalent.
This moves the translation into a shared initialization function, while
also tightening the enum assertion.
Signed-off-by: Kamil Szczęk <kamil@szczek.dev> Reviewed-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <v8pz1uwgIYWkidgZK-o8H-qJvnSyl0641XVmNO43Qls307AA3QRPuad_py6xGe0JAxB6yDEe76oZ8tau_n-2Y6sJBCKzCujNbEUUFhd-ahI=@szczek.dev> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Bernhard Beschow [Wed, 14 Aug 2024 18:15:32 +0000 (20:15 +0200)]
hw/ppc/Kconfig: Add missing SERIAL_ISA dependency to POWERNV machine
The machine calls serial_hds_isa_init() which is provided by serial-isa.c,
guarded by SERIAL_ISA.
Signed-off-by: Bernhard Beschow <shentey@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240814181534.218964-4-shentey@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
../contrib/plugins/execlog.c: In function ‘vcpu_tb_trans’:
../contrib/plugins/execlog.c:236:22: error: declaration of ‘n’ shadows a previous local [-Werror=shadow=local]
236 | for (int n = 0; n < all_reg_names->len; n++) {
| ^
../contrib/plugins/execlog.c:184:12: note: shadowed declaration is here
184 | size_t n = qemu_plugin_tb_n_insns(tb);
|
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240814233645.944327-2-pierrick.bouvier@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Nicholas Piggin [Mon, 5 Aug 2024 23:28:13 +0000 (09:28 +1000)]
tests/avocado: Mark ppc_hv_tests.py as non-flaky after fixed console interaction
Now that exec_command doesn't incorrectly consume console output,
and guest time is set correctly, ppc_hv_tests.py is working more
reliably. Try marking it non-flaky.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-ID: <20240805232814.267843-3-npiggin@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Nicholas Piggin [Mon, 5 Aug 2024 23:28:12 +0000 (09:28 +1000)]
tests/avocado: exec_command should not consume console output
_console_interaction reads data from the console even when there is only
an input string to send, and no output data to wait on. This can cause
lines to be missed by wait_for_console_pattern calls that follows an
exec_command. Fix this by not reading the console if there is no pattern
to wait for.
This solves occasional hangs in ppc_hv_tests.py, usually when run on KVM
hosts that are fast enough to output important lines quickly enough to be
consumed by exec_command, so they get missed by subsequent wait for
pattern calls.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240805232814.267843-2-npiggin@gmail.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
linux-user/mips: Select Loongson CPU for Loongson binaries
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240814133928.6746-5-philmd@linaro.org>
linux-user/mips: Select MIPS64R2-generic for Rel2 binaries
Cc: YunQiang Su <syq@debian.org> Reported-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240814133928.6746-4-philmd@linaro.org>
Directly invoking the DeviceClass::reset method is a bad idea,
because if the device is using three-phase reset then it relies on
transitional reset machinery which is likely to disappear at some
point.
Reset the device in the standard way, by calling device_cold_reset().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240813165250.2717650-7-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Peter Maydell [Tue, 13 Aug 2024 16:52:45 +0000 (17:52 +0100)]
hw/dma/xilinx_axidma: Use semicolon at end of statement, not comma
In axidma_class_init() we accidentally used a comma at the end of
a statement rather than a semicolon. This has no ill effects, but
it's obviously not intended and it means that Coccinelle scripts
for instance will fail to match on the two statements. Use a
semicolon instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240813165250.2717650-6-peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20240814090452.2591-2-philmd@linaro.org>
Jiaxun Yang [Fri, 21 Jun 2024 13:11:14 +0000 (14:11 +0100)]
hw/mips/loongson3_virt: Fix condition of IPI IOCSR connection
>>> CID 1547264: Null pointer dereferences (REVERSE_INULL)
>>> Null-checking "ipi" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
Merge tag 'pull-riscv-to-apply-20240819-1' of https://github.com/alistair23/qemu into staging
RISC-V PR for 9.1
This reverts a commit adding `#msi-cells=<0>` to the virt machine
as that commit results in PCI devices unable to us MSIs. Even though
it's a kernel bug, we don't want to break existing users.
* Revert adding #msi-cells to virt machine
# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEaukCtqfKh31tZZKWr3yVEwxTgBMFAmbCzDEACgkQr3yVEwxT
# gBP2Jw/+Phcb9tw8vv3kHyjXaH5JuqMvRvE0DZi3Zub9cdwIygXEC8/o0q4Szh+4
# FGZbxSsQ6XdfOW87qY66kTlM8yxVJf2RoQcQ27QTs0kCM3TR/1nzRbc2wWPMYRmH
# FvOL926Nr+ysxtVd84HZc82GwQpEIG1qdWpy5VECMZXW8mtOTQjgltKuiH9Jl+ZX
# N0uqWc4/lp+x+UIZqS9b76AiZ8l1G5nRFdXgmKKU7J8iVeWLRRzV1NRu+cZP4WEv
# kjpMODdedScEcvqb122SVTTJcpdvhuB+bWH6mITajbt2G4YxsNYJ9594nef/sKBH
# hf3oSfXUnwDqTldnrkFonO9OhdO3ZCdtqw5Lzi1E/D2zny2CnMMIAcs8hbenVGkW
# NW0J/z84J+X1qf5gmt07l2BlUhBooCS8TJsbO8PX/lR2iCL/BxuKHEjxCnCZ6f5z
# 3FxhqO3Shk9FnfAsTxtY00RLmRo4t+ESTsBsZPiSXB3EmCo/BmgR/0Grm7UKZbbL
# /9lzUHyUYj09Mvk7IJc4KGjihfQ9TwjNdlmq2MlRHWdVT09+Bu7DRhHvNzuVYMb9
# 1iktWv4Fnit6Xe6rPOvNXF5ilmUu2fm3p6z2ogG8cRbPHPPQ7NLx8BQSqPvBHdfx
# KIV6f1xBJSSQcTdIq/ySnN1SF1h2YVPLIlv1Aap3kN/J71kkpLY=
# =C6id
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 19 Aug 2024 02:38:09 PM AEST
# gpg: using RSA key 6AE902B6A7CA877D6D659296AF7C95130C538013
# gpg: Good signature from "Alistair Francis <alistair@alistair23.me>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 6AE9 02B6 A7CA 877D 6D65 9296 AF7C 9513 0C53 8013
* tag 'pull-riscv-to-apply-20240819-1' of https://github.com/alistair23/qemu:
Revert "hw/riscv/virt.c: imsics DT: add '#msi-cells'"
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Linux does not properly handle '#msi-cells=<0>' when searching for
MSI controllers for PCI devices which results in the devices being
unable to use MSIs. A patch for Linux has been sent[1] but until it,
or something like it, is merged and in distro kernels we should stop
adding the property. It's harmless to stop adding it since the
absence of the property and a value of zero for the property mean
the same thing according to the DT binding definition.
Merge tag 'pull-maintainer-9.1-rc3-160824-1' of https://gitlab.com/stsquad/qemu into staging
Some fixes for 9.1-rc3 (build, replay, docs, plugins)
- re-enable gdbsim-r5f562n8 test
- ensure updates to python deps re-trigger configure
- tweak configure detection of GDB MTE support
- make checkpatch emit more warnings on updating headers
- allow i386 access_ptr to force slow path for plugins
- fixe some replay regressions
- update the replay-dump tool
- better handle muxed chardev during replay
- clean up TCG plugins docs to mention scoreboards
- fix plugin scoreboard race condition
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEEZoWumedRZ7yvyN81+9DbCVqeKkQFAma/UJcACgkQ+9DbCVqe
# KkT51gf/buOo0leJnBkYDTPWOOsDupW/nUUqOlTStvpKGEVNZgmxH0V4ffdCNO8E
# P4xQpD8WrpFKZHu2zE7EmXJ6/wkSp2BeSPcZ8lhld8jKNY3ksBlsCwb26/D9WsWK
# /JaqAegdg3fwCgbcQ057dRlKJV2ojjWD/JqPWa5G9AIlSqiHEfvcTj9t33BpJKXC
# xV7Yt1TZExkfkCAny54Sx4O6oiDhvSgJmWCUGIVE2W39+g3jUKf2tvbggR5MEIH3
# fJ/F2vmcnllmK21awiRa9/WVZ55+Cbgj6PlLf/Qh6rhzooTMy+x0G+5BkNtZwNCs
# 8qFu8vFkuJM9YwDw9btaz3b+nG8Mzg==
# =HUN1
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 16 Aug 2024 11:13:59 PM AEST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
* tag 'pull-maintainer-9.1-rc3-160824-1' of https://gitlab.com/stsquad/qemu: (21 commits)
plugins: fix race condition with scoreboards
docs/devel: update tcg-plugins page
docs: Fix some typos (found by typos) and grammar issues
savevm: Fix load_snapshot error path crash
virtio-net: Use virtual time for RSC timers
virtio-net: Use replay_schedule_bh_event for bhs that affect machine state
chardev: set record/replay on the base device of a muxed device
tests/avocado: replay_kernel.py add x86-64 q35 machine test
Revert "replay: stop us hanging in rr_wait_io_event"
replay: allow runstate shutdown->running when replaying trace
tests/avocado: excercise scripts/replay-dump.py in replay tests
scripts/replay-dump.py: rejig decoders in event number order
scripts/replay-dump.py: Update to current rr record format
buildsys: Fix building without plugins on Darwin
target/i386: allow access_ptr to force slow path on failed probe
scripts/checkpatch: more checks on files imported from Linux
configure: Fix GDB version detection for GDB_HAS_MTE
configure: Avoid use of param. expansion when using gdb_version
configure: Fix arch detection for GDB_HAS_MTE
Makefile: trigger re-configure on updated pythondeps
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Pierrick Bouvier [Tue, 13 Aug 2024 20:23:29 +0000 (21:23 +0100)]
plugins: fix race condition with scoreboards
A deadlock can be created if a new vcpu (a) triggers a scoreboard
reallocation, and another vcpu (b) wants to create a new scoreboard at
the same time.
In this case, (a) holds the plugin lock, and starts an exclusive
section, waiting for (b). But at the same time, (b) is waiting for
plugin lock.
The solution is to drop the lock before entering the exclusive section.
This bug can be easily reproduced by creating a callback for any tb
exec, that allocates a new scoreboard. In this case, as soon as we reach
more than 16 vcpus, the deadlock occurs.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2344 Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Message-Id: <20240812220748.95167-2-pierrick.bouvier@linaro.org>
[AJB: tweak var position to meet coding style] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20240813202329.1237572-22-alex.bennee@linaro.org>
Stefan Weil [Tue, 13 Aug 2024 20:23:27 +0000 (21:23 +0100)]
docs: Fix some typos (found by typos) and grammar issues
Fix the misspellings of "overriden" also in code comments.
Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20240813125638.395461-1-sw@weilnetz.de> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-20-alex.bennee@linaro.org>
Nicholas Piggin [Tue, 13 Aug 2024 20:23:25 +0000 (21:23 +0100)]
virtio-net: Use virtual time for RSC timers
Receive coalescing is visible to the target machine, so its timers
should use virtual time like other timers in virtio-net, to be
compatible with record-replay.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-10-npiggin@gmail.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-18-alex.bennee@linaro.org>
Nicholas Piggin [Tue, 13 Aug 2024 20:23:24 +0000 (21:23 +0100)]
virtio-net: Use replay_schedule_bh_event for bhs that affect machine state
The regular qemu_bh_schedule() calls result in non-deterministic
execution of the bh in record-replay mode, which causes replay failure.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-9-npiggin@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-17-alex.bennee@linaro.org>
Nicholas Piggin [Tue, 13 Aug 2024 20:23:22 +0000 (21:23 +0100)]
tests/avocado: replay_kernel.py add x86-64 q35 machine test
The x86-64 pc machine is flaky with record/replay, but q35 is more
stable. Add a q35 test to replay_kernel.py.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-7-npiggin@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-15-alex.bennee@linaro.org>
That commit causes reverse_debugging.py test failures, and does
not seem to solve the root cause of the problem x86-64 still
hangs in record/replay tests.
The problem with short-cutting the iowait that was taken during
record phase is that related events will not get consumed at the
same points (e.g., reading the clock).
A hang with zero icount always seems to be a symptom of an earlier
problem that has caused the recording to become out of synch with
the execution and consumption of events by replay.
Acked-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-6-npiggin@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-14-alex.bennee@linaro.org>
Nicholas Piggin [Tue, 13 Aug 2024 20:23:20 +0000 (21:23 +0100)]
replay: allow runstate shutdown->running when replaying trace
When replaying a trace, it is possible to go from shutdown to running
with a reverse-debugging step. This can be useful if the problem being
debugged triggers a reset or shutdown.
This can be tested by making a recording of a machine that shuts down,
then using -action shutdown=pause when replaying it. Continuing to the
end of the trace then reverse-stepping in gdb crashes due to invalid
runstate transition.
Just permitting the transition seems to be all that's necessary for
reverse-debugging to work well in such a state.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20240813050638.446172-5-npiggin@gmail.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20240813202329.1237572-13-alex.bennee@linaro.org>