xen/9pfs: yield when there isn't enough room on the ring
Instead of truncating replies, which is problematic, wait until the
client reads more data and frees bytes on the reply ring.
Do that by calling qemu_coroutine_yield(). The corresponding
qemu_coroutine_enter_if_inactive() is called from xen_9pfs_bh upon
receiving the next notification from the client.
We need to be careful to avoid races in case xen_9pfs_bh and the
coroutine are both active at the same time. In xen_9pfs_bh, wait until
either the critical section is over (ring->co == NULL) or until the
coroutine becomes inactive (qemu_coroutine_yield() was called) before
continuing. Then, simply wake up the coroutine if it is inactive.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <20200521192627.15259-2-sstabellini@kernel.org> Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit a4c4d462729466c4756bac8a0a8d77eb63b21ef7) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If delivery of some 9pfs response fails for some reason, log the
error message by mentioning the 9P protocol reply type, not by
client's request type. The latter could be misleading that the
error occurred already when handling the request input.
Dan Robertson [Mon, 25 May 2020 08:38:03 +0000 (10:38 +0200)]
9pfs: include linux/limits.h for XATTR_SIZE_MAX
linux/limits.h should be included for the XATTR_SIZE_MAX definition used
by v9fs_xattrcreate.
Fixes: 3b79ef2cf488 ("9pfs: limit xattr size in xattrcreate") Signed-off-by: Dan Robertson <dan@dlrobertson.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <20200515203015.7090-2-dan@dlrobertson.com> Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 03556ea920b23c466ce7c1283199033de33ee671) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Omar Sandoval [Thu, 14 May 2020 06:06:43 +0000 (08:06 +0200)]
9pfs: local: ignore O_NOATIME if we don't have permissions
QEMU's local 9pfs server passes through O_NOATIME from the client. If
the QEMU process doesn't have permissions to use O_NOATIME (namely, it
does not own the file nor have the CAP_FOWNER capability), the open will
fail. This causes issues when from the client's point of view, it
believes it has permissions to use O_NOATIME (e.g., a process running as
root in the virtual machine). Additionally, overlayfs on Linux opens
files on the lower layer using O_NOATIME, so in this case a 9pfs mount
can't be used as a lower layer for overlayfs (cf.
https://github.com/osandov/drgn/blob/dabfe1971951701da13863dbe6d8a1d172ad9650/vmtest/onoatimehack.c
and https://github.com/NixOS/nixpkgs/issues/54509).
Luckily, O_NOATIME is effectively a hint, and is often ignored by, e.g.,
network filesystems. open(2) notes that O_NOATIME "may not be effective
on all filesystems. One example is NFS, where the server maintains the
access time." This means that we can honor it when possible but fall
back to ignoring it.
Eric Blake [Mon, 8 Jun 2020 18:26:38 +0000 (13:26 -0500)]
block: Call attention to truncation of long NBD exports
Commit 93676c88 relaxed our NBD client code to request export names up
to the NBD protocol maximum of 4096 bytes without NUL terminator, even
though the block layer can't store anything longer than 4096 bytes
including NUL terminator for display to the user. Since this means
there are some export names where we have to truncate things, we can
at least try to make the truncation a bit more obvious for the user.
Note that in spite of the truncated display name, we can still
communicate with an NBD server using such a long export name; this was
deemed nicer than refusing to even connect to such a server (since the
server may not be under our control, and since determining our actual
length limits gets tricky when nbd://host:port/export and
nbd+unix:///export?socket=/path are themselves variable-length
expansions beyond the export name but count towards the block layer
name length).
Reported-by: Xueqiang Wei <xuwei@redhat.com> Fixes: https://bugzilla.redhat.com/1843684 Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200610163741.3745251-3-eblake@redhat.com>
(cherry picked from commit 5c86bdf1208916ece0b87e1151c9b48ee54faa3e) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-balloon: unref the iothread when unrealizing
We took a reference when realizing, so let's drop that reference when
unrealizing.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Cc: qemu-stable@nongnu.org Cc: Wei Wang <wei.w.wang@intel.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200520100439.19872-4-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 105aef9c9479786d27c1c45c9b0b1fa03dc46be3) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-balloon: fix free page hinting check on unrealize
Checking against guest features is wrong. We allocated data structures
based on host features. We can rely on "free_page_bh" as an indicator
whether to un-do stuff instead.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Cc: qemu-stable@nongnu.org Cc: Wei Wang <wei.w.wang@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Philippe Mathieu-Daudé <philmd@redhat.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200520100439.19872-3-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 49b01711b8eb3796c6904c7f85d2431572cfe54f) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Doing a "write 0xc0000e30 0x24
0x030000000300000003000000030000000300000003000000030000000300000003000000"
We will trigger a SEGFAULT. Let's move the check and bail out.
While at it, move the static initializations to instance_init().
free_page_report_status and block_iothread are implicitly set to the
right values (0/false) already, so drop the initialization.
Reviewed-by: Alexander Duyck <alexander.h.duyck@linux.intel.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Fixes: c13c4153f76d ("virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT") Reported-by: Alexander Bulekov <alxndr@bu.edu> Cc: qemu-stable@nongnu.org Cc: Wei Wang <wei.w.wang@intel.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Philippe Mathieu-Daudé <philmd@redhat.com> Cc: Alexander Duyck <alexander.duyck@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200520100439.19872-2-david@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 12fc8903a8ee09fb5f642de82699a0b211e1b5a7) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Eric Blake [Mon, 8 Jun 2020 18:26:37 +0000 (13:26 -0500)]
nbd/server: Avoid long error message assertions CVE-2020-10761
Ever since commit 36683283 (v2.8), the server code asserts that error
strings sent to the client are well-formed per the protocol by not
exceeding the maximum string length of 4096. At the time the server
first started sending error messages, the assertion could not be
triggered, because messages were completely under our control.
However, over the years, we have added latent scenarios where a client
could trigger the server to attempt an error message that would
include the client's information if it passed other checks first:
- requesting NBD_OPT_INFO/GO on an export name that is not present
(commit 0cfae925 in v2.12 echoes the name)
- requesting NBD_OPT_LIST/SET_META_CONTEXT on an export name that is
not present (commit e7b1948d in v2.12 echoes the name)
At the time, those were still safe because we flagged names larger
than 256 bytes with a different message; but that changed in commit 93676c88 (v4.2) when we raised the name limit to 4096 to match the NBD
string limit. (That commit also failed to change the magic number
4096 in nbd_negotiate_send_rep_err to the just-introduced named
constant.) So with that commit, long client names appended to server
text can now trigger the assertion, and thus be used as a denial of
service attack against a server. As a mitigating factor, if the
server requires TLS, the client cannot trigger the problematic paths
unless it first supplies TLS credentials, and such trusted clients are
less likely to try to intentionally crash the server.
We may later want to further sanitize the user-supplied strings we
place into our error messages, such as scrubbing out control
characters, but that is less important to the CVE fix, so it can be a
later patch to the new nbd_sanitize_name.
Consideration was given to changing the assertion in
nbd_negotiate_send_rep_verr to instead merely log a server error and
truncate the message, to avoid leaving a latent path that could
trigger a future CVE DoS on any new error message. However, this
merely complicates the code for something that is already (correctly)
flagging coding errors, and now that we are aware of the long message
pitfall, we are less likely to introduce such errors in the future,
which would make such error handling dead code.
Reported-by: Xueqiang Wei <xuwei@redhat.com> CC: qemu-stable@nongnu.org Fixes: https://bugzilla.redhat.com/1843684 CVE-2020-10761 Fixes: 93676c88d7 Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200610163741.3745251-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
(cherry picked from commit 5c4fe018c025740fef4a0a4421e8162db0c3eefd) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Greg Kurz [Mon, 25 May 2020 08:38:03 +0000 (10:38 +0200)]
9p: Lock directory streams with a CoMutex
Locking was introduced in QEMU 2.7 to address the deprecation of
readdir_r(3) in glibc 2.24. It turns out that the frontend code is
the worst place to handle a critical section with a pthread mutex:
the code runs in a coroutine on behalf of the QEMU mainloop and then
yields control, waiting for the fsdev backend to process the request
in a worker thread. If the client resends another readdir request for
the same fid before the previous one finally unlocked the mutex, we're
deadlocked.
This never bit us because the linux client serializes readdir requests
for the same fid, but it is quite easy to demonstrate with a custom
client.
A good solution could be to narrow the critical section in the worker
thread code and to return a copy of the dirent to the frontend, but
this causes quite some changes in both 9p.c and codir.c. So, instead
of that, in order for people to easily backport the fix to older QEMU
versions, let's simply use a CoMutex since all the users for this
sit in coroutines.
Fixes: 7cde47d4a89d ("9p: add locking to V9fsDir") Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Message-Id: <158981894794.109297.3530035833368944254.stgit@bahia.lan> Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit ed463454efd0ac3042ff772bfe1b1d846dc281a5) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Igor Mammedov [Thu, 30 Apr 2020 15:46:06 +0000 (11:46 -0400)]
hostmem: don't use mbind() if host-nodes is empty
Since 5.0 QEMU uses hostmem backend for allocating main guest RAM.
The backend however calls mbind() which is typically NOP
in case of default policy/absent host-nodes bitmap.
However when runing in container with black-listed mbind()
syscall, QEMU fails to start with error
"cannot bind memory to host NUMA nodes: Operation not permitted"
even when user hasn't provided host-nodes to pin to explictly
(which is the case with -m option)
To fix issue, call mbind() only in case when user has provided
host-nodes explicitly (i.e. host_nodes bitmap is not empty).
That should allow to run QEMU in containers with black-listed
mbind() without memory pinning. If QEMU provided memory-pinning
is required user still has to white-list mbind() in container
configuration.
Reported-by: Manuel Hohmann <mhohmann@physnet.uni-hamburg.de> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200430154606.6421-1-imammedo@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Cc: qemu-stable@nongnu.org Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 70b6d525dfb51d5e523d568d1139fc051bc223c5) Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Peter Maydell [Wed, 22 Apr 2020 12:45:01 +0000 (13:45 +0100)]
target/arm: Fix ID_MMFR4 value on AArch64 'max' CPU
In commit 41a4bf1feab098da4cd the added code to set the CNP
field in ID_MMFR4 for the AArch64 'max' CPU had a typo
where it used the wrong variable name, resulting in ID_MMFR4
fields AC2, XNX and LSM being wrong. Fix the typo.
Fixes: 41a4bf1feab098da4cd Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 20200422124501.28015-1-peter.maydell@linaro.org
Peter Maydell [Mon, 20 Apr 2020 18:57:18 +0000 (19:57 +0100)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.0-20200417' into staging
ppc patch queue for 2020-04-17
Here are a few late bugfixes for qemu-5.0 in the ppc target code.
Unless some really nasty last minute bug shows up, I expect this to be
the last ppc pull request for qemu-5.0.
* remotes/dgibson/tags/ppc-for-5.0-20200417:
target/ppc: Fix mtmsr(d) L=1 variant that loses interrupts
target/ppc: Fix wrong interpretation of the disposition flag.
linux-user/ppc: Fix padding in mcontext_t for ppc64
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
block/iscsi:fix heap-buffer-overflow in iscsi_aio_ioctl_cb
There is an overflow, the source 'datain.data[2]' is 100 bytes,
but the 'ss' is 252 bytes.This may cause a security issue because
we can access a lot of unrelated memory data.
The len for sbp copy data should take the minimum of mx_sb_len and
sb_len_wr, not the maximum.
If we use iscsi device for VM backend storage, ASAN show stack:
READ of size 252 at 0xfffd149dcfc4 thread T0
#0 0xaaad433d0d34 in __asan_memcpy (aarch64-softmmu/qemu-system-aarch64+0x2cb0d34)
#1 0xaaad45f9d6d0 in iscsi_aio_ioctl_cb /qemu/block/iscsi.c:996:9
#2 0xfffd1af0e2dc (/usr/lib64/iscsi/libiscsi.so.8+0xe2dc)
#3 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
#4 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
#5 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
#6 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
#7 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
#8 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
#9 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
#10 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
#11 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
#12 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
#13 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
#14 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
#15 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
#16 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
#17 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)
0xfffd149dcfc4 is located 0 bytes to the right of 100-byte region [0xfffd149dcf60,0xfffd149dcfc4)
allocated by thread T0 here:
#0 0xaaad433d1e70 in __interceptor_malloc (aarch64-softmmu/qemu-system-aarch64+0x2cb1e70)
#1 0xfffd1af0e254 (/usr/lib64/iscsi/libiscsi.so.8+0xe254)
#2 0xfffd1af0d174 (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
#3 0xfffd1af19fac (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
#4 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
#5 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
#6 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
#7 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
#8 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
#9 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
#10 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
#11 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
#12 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
#13 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
#14 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
#15 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
#16 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)
Reported-by: Euler Robot <euler.robot@huawei.com> Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200418062602.10776-1-kuhn.chenqun@huawei.com Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
More recently, Linux reduced the occurance of operations (e.g., rfi)
which stop translation and allow pending interrupts to be processed.
This started causing hangs in Linux boot in long-running kernel tests,
running with '-d int' shows the decrementer stops firing despite DEC
wrapping and MSR[EE]=1.
The cause is the broken mtmsr L=1 behaviour, which is contrary to the
architecture. From Power ISA v3.0B, p.977, Move To Machine State Register,
Programming Note states:
If MSR[EE]=0 and an External, Decrementer, or Performance Monitor
exception is pending, executing an mtmsrd instruction that sets
MSR[EE] to 1 will cause the interrupt to occur before the next
instruction is executed, if no higher priority exception exists
Fix this by handling L=1 exactly the same way as L=0, modulo the MSR
bits altered.
The confusion arises from L=0 being "context synchronizing" whereas L=1
is "execution synchronizing", which is a weaker semantic. However this
is not a relaxation of the requirement that these exceptions cause
interrupts when MSR[EE]=1 (e.g., when mtmsr executes to completion as
TCG is doing here), rather it specifies how a pipelined processor can
have multiple instructions in flight where one may influence how another
behaves.
Cc: qemu-stable@nongnu.org Reported-by: Anton Blanchard <anton@ozlabs.org> Reported-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Message-Id: <20200414111131.465560-1-npiggin@gmail.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Tested-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: Fix wrong interpretation of the disposition flag.
Bitwise AND with kvm_run->flags to evaluate if we recovered from
MCE or not is not correct, As disposition in kvm_run->flags is a
two-bit integer value and not a bit map, So check for equality
instead of bitwise AND.
Without the fix qemu treats any unrecoverable mce error as recoverable
and ends up in a mce loop inside the guest, Below are the MCE logs before
and after the fix.
linux-user/ppc: Fix padding in mcontext_t for ppc64
The padding that was added in 95cda4c44ee was added to a union,
and so it had no effect. This fixes misalignment errors detected
by clang sanitizers for ppc64 and ppc64le.
In addition, only ppc64 allocates space for VSX registers, so do
not save them for ppc32. The kernel only has references to
CONFIG_SPE in signal_32.c, so do not attempt to save them for ppc64.
Fixes: 95cda4c44ee Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200407032105.26711-1-richard.henderson@linaro.org> Acked-by: Laurent Vivier <laurent@vivier.eu> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
linux-user/syscall.c: add target-to-host mapping for epoll_create1()
Noticed by Barnabás Virágh as a python-3.7 failue on qemu-alpha.
The bug shows up on alpha as it's one of the targets where
EPOLL_CLOEXEC differs from other targets:
sysdeps/unix/sysv/linux/alpha/bits/epoll.h: EPOLL_CLOEXEC = 01000000
sysdeps/unix/sysv/linux/bits/epoll.h: EPOLL_CLOEXEC = 02000000
* remotes/mdroth/tags/qga-pull-2020-04-15-tag:
qga: Restrict guest-file-read count to 48 MB to avoid crashes
qga: Extract qmp_guest_file_read() to common commands.c
qga: Extract guest_file_handle_find() to commands-common.h
Revert "prevent crash when executing guest-file-read with large count"
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qga: Restrict guest-file-read count to 48 MB to avoid crashes
On [*] Daniel Berrangé commented:
The QEMU guest agent protocol is not sensible way to access huge
files inside the guest. It requires the inefficient process of
reading the entire data into memory than duplicating it again in
base64 format, and then copying it again in the JSON serializer /
monitor code.
For arbitrary general purpose file access, especially for large
files, use a real file transfer program or use a network block
device, not the QEMU guest agent.
To avoid bug reports as BZ#1594054 (CVE-2018-12617), follow his
suggestion to put a low, hard limit on "count" in the guest agent
QAPI schema, and don't allow count to be larger than 48 MB.
Fixes: CVE-2018-12617 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1594054 Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com> Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
*update schema documentation to indicate 48MB limit instead of 10MB Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Revert "prevent crash when executing guest-file-read with large count"
As noted by Daniel Berrangé in [*], the fix from commit 807e2b6fce
which replaced malloc() by try_malloc() is not enough, the process
can still run out of memory a few line later:
Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Peter Maydell [Wed, 15 Apr 2020 11:02:59 +0000 (12:02 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-more-fixes-150420-1' into staging
More small fixes for rc3
- tweak docker FEATURE flags for document building
- include sphinx configure check in config.log
- disable PIE for Windows builds
- fix /proc/self/stat handling
- a number of gdbstub fixups following GByteArray conversion
# gpg: Signature made Wed 15 Apr 2020 11:38:56 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-more-fixes-150420-1:
gdbstub: Introduce gdb_get_float32() to get 32-bit float registers
gdbstub: Do not use memset() on GByteArray
gdbstub: i386: Fix gdb_get_reg16() parameter to unbreak gdb
target/m68k/helper: Fix m68k_fpu_gdb_get_reg() use of GByteArray
linux-user: fix /proc/self/stat handling
configure: disable PIE for Windows builds
configure: redirect sphinx-build check to config.log
tests/docker: add docs FEATURE flag and use for test-misc
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Alex Bennée [Tue, 14 Apr 2020 20:06:23 +0000 (21:06 +0100)]
linux-user: fix /proc/self/stat handling
In the original bug report long files names in Guix caused
/proc/self/stat be truncated without the trailing ") " as specified in
proc manpage which says:
(2) comm %s
The filename of the executable, in parentheses. This
is visible whether or not the executable is swapped
out.
In the kernel this is currently done by do_task_stat calling
proc_task_name() which uses a structure limited by TASK_COMM_LEN (16).
Additionally it should only be reporting the executable name rather
than the full path. Fix both these failings while cleaning up the code
to use GString to build up the reported values. As the whole function
is cleaned up also adjust the white space to the current coding style.
Message-ID: <fb4c55fa-d539-67ee-c6c9-de8fb63c8488@inria.fr> Reported-by: Brice Goglin <Brice.Goglin@inria.fr> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200414200631.12799-10-alex.bennee@linaro.org>
Alex Bennée [Tue, 14 Apr 2020 20:06:21 +0000 (21:06 +0100)]
configure: redirect sphinx-build check to config.log
Otherwise it's hard to debug whats going on.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200414200631.12799-8-alex.bennee@linaro.org>
Alex Bennée [Tue, 14 Apr 2020 20:06:20 +0000 (21:06 +0100)]
tests/docker: add docs FEATURE flag and use for test-misc
The test-misc docker test fails on a number of images which don't have
the prerequisites to build the docs. Use the FEATURES flag so we can
skip those tests.
As the sphinx test fails to detect whatever feature we need to get
hxtool to work we drop them from debian9 so the windows build doesn't
attempt to build the docs.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200414200631.12799-7-alex.bennee@linaro.org>
* remotes/bonzini/tags/for-upstream:
hax: Windows doesn't like posix device names
tests: numa: test one backend with prealloc enabled
hostmem: set default prealloc_threads to valid value
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 14 Apr 2020 16:27:00 +0000 (17:27 +0100)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20200414' into staging
patch queue:
* Fix some problems that trip up Coverity's scanner
* run-coverity-scan: New script automating the scan-and-upload process
* docs: Improve our gdbstub documentation
* configure: Honour --disable-werror for Sphinx
* docs: Fix errors produced when building with Sphinx 3.0
* docs: Require Sphinx 1.6 or better
* Add deprecation notice for KVM support on AArch32 hosts
* remotes/pmaydell/tags/pull-target-arm-20200414:
Deprecate KVM support for AArch32
docs: Require Sphinx 1.6 or better
kernel-doc: Use c:struct for Sphinx 3.0 and later
scripts/kernel-doc: Add missing close-paren in c:function directives
configure: Honour --disable-werror for Sphinx
docs: Improve our gdbstub documentation
scripts/coverity-scan: Add Docker support
scripts/run-coverity-scan: Script to run Coverity Scan build
linux-user/flatload.c: Use "" for include of QEMU header target_flat.h
thread.h: Remove trailing semicolons from Coverity qemu_mutex_lock() etc
thread.h: Fix Coverity version of qemu_cond_timedwait()
osdep.h: Drop no-longer-needed Coverity workarounds
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 14 Apr 2020 12:09:35 +0000 (13:09 +0100)]
Deprecate KVM support for AArch32
The Linux kernel has dropped support for allowing 32-bit Arm systems
to host KVM guests (kernel commit 541ad0150ca4aa663a2, which just
landed upstream in the 5.7 merge window). Mark QEMU's support for
this configuration as deprecated, so that we can delete that support
code in 5.2.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Andrew Jones <drjones@redhat.com>
Peter Maydell [Tue, 14 Apr 2020 12:41:14 +0000 (13:41 +0100)]
docs: Require Sphinx 1.6 or better
Versions of Sphinx older than 1.6 can't build all of our documentation,
because they are too picky about the syntax of the argument to the
option:: directive; see Sphinx bugs #646, #3366:
Trying to build with a 1.4.x Sphinx fails with
docs/system/images.rst:4: SEVERE: Duplicate ID: "cmdoption-qcow2-arg-encrypt"
and a 1.5.x Sphinx fails with
docs/system/invocation.rst:544: WARNING: Malformed option description '[enable=]PATTERN', should look like "opt", "-opt
args", "--opt args", "/opt args" or "+opt args"
Update our needs_sphinx setting to indicate that we require at least
1.6. This will allow configure to fall back to "don't build the
docs" rather than causing the build to fail entirely, which is
probably what most users building on a host old enough to have such
an old Sphinx would want; if they do want the docs then they'll have
a useful indication of what they need to do (upgrade Sphinx!) rather
than a confusing error message.
In theory our distro support policy would suggest that we should
support building on the Sphinx shipped in those distros, but:
* EPEL7 has Sphinx 1.2.3 (which we've never supported!)
* Debian Stretch has Sphinx 1.4.8
Trying to get our docs to work with Sphinx 1.4 is not tractable
for the 5.0 release and I'm not sure it's worthwhile effort anyway;
at least with this change the build as a whole now succeeds.
Thanks to John Snow for doing the investigation and testing to
confirm what Sphinx versions fail in what ways and what distros
shipped what.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Peter Maydell [Tue, 14 Apr 2020 12:50:41 +0000 (13:50 +0100)]
kernel-doc: Use c:struct for Sphinx 3.0 and later
The kernel-doc Sphinx plugin and associated script currently emit
'c:type' directives for "struct foo" documentation.
Sphinx 3.0 warns about this:
/home/petmay01/linaro/qemu-from-laptop/qemu/docs/../include/exec/memory.h:3: WARNING: Type must be either just a name or a typedef-like declaration.
If just a name:
Error in declarator or parameters
Invalid C declaration: Expected identifier in nested name, got keyword: struct [error at 6]
struct MemoryListener
------^
If typedef-like declaration:
Error in declarator or parameters
Invalid C declaration: Expected identifier in nested name. [error at 21]
struct MemoryListener
---------------------^
because it wants us to use the new-in-3.0 'c:struct' instead.
Plumb the Sphinx version through to the kernel-doc script
and use it to select 'c:struct' for newer versions than 3.0.
Fixes: LP:1872113 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Peter Maydell [Sat, 11 Apr 2020 18:29:33 +0000 (19:29 +0100)]
scripts/kernel-doc: Add missing close-paren in c:function directives
When kernel-doc generates a 'c:function' directive for a function
one of whose arguments is a function pointer, it fails to print
the close-paren after the argument list of the function pointer
argument, for instance in the memory API documentation:
.. c:function:: void memory_region_init_resizeable_ram (MemoryRegion * mr, struct Object * owner, const char * name, uint64_t size, uint64_t max_size, void (*resized) (const char*, uint64_t length, void *host, Error ** errp)
which should have a ')' after the 'void *host' which is the
last argument to 'resized'.
Older versions of Sphinx don't try to parse the argumnet
to c:function, but Sphinx 3.0 does do this and will complain:
/home/petmay01/linaro/qemu-from-laptop/qemu/docs/../include/exec/memory.h:834: WARNING: Error in declarator or parameters
Invalid C declaration: Expecting "," or ")" in parameters, got "EOF". [error at 208]
void memory_region_init_resizeable_ram (MemoryRegion * mr, struct Object * owner, const char * name, uint64_t size, uint64_t max_size, void (*resized) (const char*, uint64_t length, void *host, Error ** errp)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------^
Add the missing close-paren.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200411182934.28678-3-peter.maydell@linaro.org Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Igor Mammedov [Wed, 25 Mar 2020 09:44:23 +0000 (05:44 -0400)]
tests: numa: test one backend with prealloc enabled
Cannibalize one backend in the HMAT test to make sure that
prealloc=y is tested.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200325094423.24293-3-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Igor Mammedov [Wed, 25 Mar 2020 09:44:22 +0000 (05:44 -0400)]
hostmem: set default prealloc_threads to valid value
Commit 4ebc74dbbf removed default prealloc_threads initialization
by mistake, and that makes QEMU crash with division on zero at
numpages_per_thread = numpages / memset_num_threads;
when QEMU is started with following backend
-object memory-backend-ram,id=ram-node0,prealloc=yes,size=128M
Return back initialization removed by 4ebc74dbbf to fix issue.
Fixes: 4ebc74dbbf7ad50e4101629f3f5da5fdc1544051 Reported-by: Raphael Norwitz <raphael.norwitz@nutanix.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20200325094423.24293-2-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Peter Maydell [Sat, 11 Apr 2020 18:29:32 +0000 (19:29 +0100)]
configure: Honour --disable-werror for Sphinx
If we are not making warnings fatal for compilation, make them
non-fatal when building the Sphinx documentation also. (For instance
Sphinx 3.0 warns about some constructs that older versions were happy
with, which is a build failure if we use the warnings-as-errors
flag.)
This provides a workaround at least for LP:1872113.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200411182934.28678-2-peter.maydell@linaro.org Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Peter Maydell [Fri, 3 Apr 2020 09:40:14 +0000 (10:40 +0100)]
docs: Improve our gdbstub documentation
The documentation of our -s and -gdb options is quite old; in
particular it still claims that it will cause QEMU to stop and wait
for the gdb connection, when this has not been true for some time:
you also need to pass -S if you want to make QEMU not launch the
guest on startup.
Improve the documentation to mention this requirement in the
executable's --help output, the documentation of the -gdb option in
the manual, and in the "GDB usage" chapter.
Includes some minor tweaks to these paragraphs of documentation
since I was editing them anyway (such as dropping the description
of our gdb support as "primitive").
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200403094014.9589-1-peter.maydell@linaro.org
Peter Maydell [Thu, 19 Mar 2020 19:33:23 +0000 (19:33 +0000)]
scripts/coverity-scan: Add Docker support
Add support for running the Coverity Scan tools inside a Docker
container rather than directly on the host system.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200319193323.2038-7-peter.maydell@linaro.org
Peter Maydell [Thu, 19 Mar 2020 19:33:22 +0000 (19:33 +0000)]
scripts/run-coverity-scan: Script to run Coverity Scan build
Add a new script to automate the process of running the Coverity
Scan build tools and uploading the resulting tarball to the
website.
This is intended eventually to be driven from Travis,
but it can be run locally, if you are a maintainer of the
QEMU project on the Coverity Scan website and have the secret
upload token.
The script must be run on a Fedora 30 system. Support for using a
Docker container is added in a following commit.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20200319193323.2038-6-peter.maydell@linaro.org
Peter Maydell [Thu, 19 Mar 2020 19:33:21 +0000 (19:33 +0000)]
linux-user/flatload.c: Use "" for include of QEMU header target_flat.h
The target_flat.h file is a QEMU header, so we should include it using
quotes, not angle brackets.
Coverity otherwise is unable to find the header:
"../linux-user/flatload.c", line 40: error #1712: cannot open source file
"target_flat.h"
#include <target_flat.h>
^
because the relevant directory is only on the -iquote path, not the -I path.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200319193323.2038-5-peter.maydell@linaro.org
Peter Maydell [Thu, 19 Mar 2020 19:33:20 +0000 (19:33 +0000)]
thread.h: Remove trailing semicolons from Coverity qemu_mutex_lock() etc
All the Coverity-specific definitions of qemu_mutex_lock() and friends
have a trailing semicolon. This works fine almost everywhere because
of QEMU's mandatory-braces coding style and because most callsites are
simple, but target/s390x/sigp.c has a use of qemu_mutex_trylock() as
an if() statement, which makes the ';' a syntax error:
"../target/s390x/sigp.c", line 461: warning #18: expected a ")"
if (qemu_mutex_trylock(&qemu_sigp_mutex)) {
^
Remove the bogus semicolons from the macro definitions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200319193323.2038-4-peter.maydell@linaro.org
Peter Maydell [Thu, 19 Mar 2020 19:33:19 +0000 (19:33 +0000)]
thread.h: Fix Coverity version of qemu_cond_timedwait()
For Coverity's benefit, we provide simpler versions of functions like
qemu_mutex_lock(), qemu_cond_wait() and qemu_cond_timedwait(). When
we added qemu_cond_timedwait() in commit 3dcc9c6ec4ea, a cut and
paste error meant that the Coverity version of qemu_cond_timedwait()
was using the wrong _impl function, which makes the Coverity parser
complain:
"/qemu/include/qemu/thread.h", line 159: warning #140: too many arguments in
function call
return qemu_cond_timedwait(cond, mutex, ms);
^
"/qemu/include/qemu/thread.h", line 159: warning #120: return value type does
not match the function type
return qemu_cond_timedwait(cond, mutex, ms);
^
"/qemu/include/qemu/thread.h", line 156: warning #1563: function
"qemu_cond_timedwait" not emitted, consider modeling it or review
parse diagnostics to improve fidelity
static inline bool (qemu_cond_timedwait)(QemuCond *cond, QemuMutex *mutex,
^
These aren't fatal, but reduce the scope of the analysis. Fix the error.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200319193323.2038-3-peter.maydell@linaro.org
Peter Maydell [Thu, 19 Mar 2020 19:33:18 +0000 (19:33 +0000)]
osdep.h: Drop no-longer-needed Coverity workarounds
In commit a1a98357e3fd in 2018 we added some workarounds for Coverity
not being able to handle the _Float* types introduced by recent
glibc. Newer versions of the Coverity scan tools have support for
these types, and will fail with errors about duplicate typedefs if we
have our workaround. Remove our copy of the typedefs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20200319193323.2038-2-peter.maydell@linaro.org
* remotes/bonzini/tags/for-upstream:
module: increase dirs array size by one
memory: Do not allow direct write access to rom_device regions
vl.c: error out if -mem-path is used together with -M memory-backend
rcu: do not mention atomic_mb_read/set in documentation
atomics: update documentation
atomics: convert to reStructuredText
oslib-posix: take lock before qemu_cond_broadcast
piix: fix xenfv regression, add compat machine xenfv-4.2
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Summarizing the issue:
1. Memory regions contain ram blocks with a different size, if the
size is not properly aligned. While memory regions can have an
unaligned size, ram blocks can't. This is true when creating
resizable memory region with an unaligned size.
2. When resizing a ram block/memory region, the size of the memory
region is set to the aligned size. The callback is called with
the aligned size. The unaligned piece is lost.
Because of the above, if ACPI blob length modifications happens
after the initial virt_acpi_build() call, and the changed blob
length is within the PAGE size boundary, then the revised size
is not seen by the firmware on Guest reboot.
Hence make sure callback is called if memory region size is changed,
irrespective of aligned or not.
Signed-off-by: David Hildenbrand <david@redhat.com>
[Shameer: added commit log] Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200403101827.30664-4-shameerali.kolothum.thodi@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Any sub-page size update to ACPI MRs will be lost during
migration, as we use aligned size in ram_load_precopy() ->
qemu_ram_resize() path. This will result in inconsistency in
FWCfgEntry sizes between source and destination. In order to avoid
this, save and restore them separately during migration.
Up until now, this problem may not be that relevant for x86 as both
ACPI table and Linker MRs gets padded and aligned. Also at present,
qemu_ram_resize() doesn't invoke callback to update FWCfgEntry for
unaligned size changes. But since we are going to fix the
qemu_ram_resize() in the subsequent patch, the issue may become
more serious especially for RSDP MR case.
Moreover, the issue will soon become prominent in arm/virt as well
where the MRs are not padded or aligned at all and eventually have
acpi table changes as part of future additions like NVDIMM hot-add
feature.
Suggested-by: David Hildenbrand <david@redhat.com> Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <20200403101827.30664-3-shameerali.kolothum.thodi@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use macro for "etc/table-loader" and move it to the header
file similar to ACPI_BUILD_TABLE_FILE/ACPI_BUILD_RSDP_FILE etc.
Signed-off-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200403101827.30664-2-shameerali.kolothum.thodi@huawei.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Raphael Norwitz [Thu, 26 Mar 2020 08:57:27 +0000 (04:57 -0400)]
MAINTAINERS: Add myself as vhost-user-blk maintainer
As suggested by Michael, let's add me as a maintainer of
vhost-user-blk and vhost-user-scsi.
CC: Michael S. Tsirkin <mst@redhat.com>
CC Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <1585213047-20089-1-git-send-email-raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Bruce Rogers [Sat, 11 Apr 2020 01:07:46 +0000 (19:07 -0600)]
module: increase dirs array size by one
With the module upgrades code change, the statically sized dirs array
can now overflow. Increase it's size by one, according to the new
maximum possible usage.
Fixes: bd83c861c0 ("modules: load modules from versioned /var/run dir") Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-Id: <20200411010746.472295-1-brogers@suse.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Alexander Duyck [Fri, 10 Apr 2020 03:41:50 +0000 (20:41 -0700)]
memory: Do not allow direct write access to rom_device regions
According to the documentation in memory.h a ROM memory region will be
backed by RAM for reads, but is supposed to go through a callback for
writes. Currently we were not checking for the existence of the rom_device
flag when determining if we could perform a direct write or not.
To correct that add a check to memory_region_is_direct so that if the
memory region has the rom_device flag set we will return false for all
checks where is_write is set.
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Message-Id: <20200410034150.24738.98143.stgit@localhost.localdomain> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Igor Mammedov [Thu, 9 Apr 2020 13:41:33 +0000 (09:41 -0400)]
vl.c: error out if -mem-path is used together with -M memory-backend
the former is not actually used by explicit backend, so instead of
silently ignoring the option in non valid context, exit with error.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20200409134133.11339-1-imammedo@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Mon, 6 Apr 2020 09:34:12 +0000 (11:34 +0200)]
atomics: update documentation
Some of the constraints on operand sizes have been relaxed, so adjust the
documentation.
Deprecate atomic_mb_read and atomic_mb_set; it is not really possible to
use them correctly because they do not interoperate with sequentially-consistent
RMW operations.
Finally, extend the memory barrier pairing section to cover acquire and
release semantics in general, roughly based on the KVM Forum 2016 talk,
"<atomic.h> weapons".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OPC_SYNC_WMB, OPC_SYNC_MB, OPC_SYNC_ACQUIRE, OPC_SYNC_RELEASE and
OPC_SYNC_RMB have wrong encode. According to the mips manual,
their encode should be 'OPC_SYNC | 0x?? << 6' rather than
'OPC_SYNC | 0x?? << 5'. Wrong encode can lead illegal instruction
errors. These instructions often appear with multi-threaded
simulation.
Fixes: 6f0b99104a3 ("tcg/mips: Add support for fence") Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Aleksandar Markovic <aleksandar.qemu.devel@gmail.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: lixinyu <precinct@mail.ustc.edu.cn>
Message-Id: <20200411124612.12560-1-precinct@mail.ustc.edu.cn> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In touch_all_pages, if the mutex is not taken around qemu_cond_broadcast,
qemu_cond_broadcast may be called before all touch page threads enter
qemu_cond_wait. In this case, the touch page threads wait forever for the
main thread to wake them up, causing a deadlock.
Signed-off-by: Bauerchen <bauerchen@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With QEMU 4.0 an incompatible change was added to pc_piix, which makes it
practical impossible to migrate domUs started with qemu2 or qemu3 to
newer qemu versions. Commit 7fccf2a06890e3bc3b30e29827ad3fb93fe88fea
added and enabled a new member "smbus_no_migration_support". In commit 4ab2f2a8aabfea95cc53c64e13b3f67960b27fdf the vmstate_acpi got new
elements, which are conditionally filled. As a result, an incoming
migration expected smbus related data unless smbus migration was
disabled for a given MachineClass. Since first commit forgot to handle
'xenfv', domUs started with QEMU 4.x are incompatible with their QEMU
siblings.
Using other existing machine types, such as 'pc-i440fx-3.1', is not
possible because 'xenfv' creates the 'xen-platform' PCI device at
00:02.0, while all other variants to run a domU would create it at
00:04.0.
To cover both the existing and the broken case of 'xenfv' in a single
qemu binary, a new compatibility variant of 'xenfv-4.2' must be added
which targets domUs started with qemu 4.2. The existing 'xenfv' restores
compatibility of QEMU 5.x with qemu 3.1.
Host admins who started domUs with QEMU 4.x (preferrable QEMU 4.2)
have to use a wrapper script which appends '-machine xenfv-4.2' to
the device-model command line. This is only required if there is no
maintenance window which allows to temporary shutdown the domU and
restart it with a fixed device-model.
The wrapper script is as simple as this:
#!/bin/sh
exec /usr/bin/qemu-system-i386 "$@" -machine xenfv-4.2
With xl this script will be enabled with device_model_override=, see
xl.cfg(5). To live migrate a domU, adjust the existing domU.cfg and pass
it to xl migrate or xl save/restore:
xl migrate -C new-domU.cfg domU remote-host
xl save domU CheckpointFile new-domU.cfg
xl restore new-domU.cfg CheckpointFile
With libvirt this script will be enabled with the <emulator> element in
domU.xml. Use 'virsh edit' prior 'virsh migrate' to replace the existing
<emulator> element to point it to the wrapper script.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Message-Id: <20200327151841.13877-1-olaf@aepfle.de>
[Adjust tests for blacklisted machine types, simplifying the one in
qom-test. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* remotes/stefanha/tags/block-pull-request:
async: use explicit memory barriers
aio-wait: delegate polling of main AioContext if BQL not held
aio-posix: signal-proof fdmon-io_uring
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Paolo Bonzini [Tue, 7 Apr 2020 14:07:46 +0000 (10:07 -0400)]
async: use explicit memory barriers
When using C11 atomics, non-seqcst reads and writes do not participate
in the total order of seqcst operations. In util/async.c and util/aio-posix.c,
in particular, the pattern that we use
write ctx->notify_me write bh->scheduled
read bh->scheduled read ctx->notify_me
if !bh->scheduled, sleep if ctx->notify_me, notify
needs to use seqcst operations for both the write and the read. In
general this is something that we do not want, because there can be
many sources that are polled in addition to bottom halves. The
alternative is to place a seqcst memory barrier between the write
and the read. This also comes with a disadvantage, in that the
memory barrier is implicit on strongly-ordered architectures and
it wastes a few dozen clock cycles.
Fortunately, ctx->notify_me is never written concurrently by two
threads, so we can assert that and relax the writes to ctx->notify_me.
The resulting solution works and performs well on both aarch64 and x86.
Note that the atomic_set/atomic_read combination is not an atomic
read-modify-write, and therefore it is even weaker than C11 ATOMIC_RELAXED;
on x86, ATOMIC_RELAXED compiles to a locked operation.
Analyzed-by: Ying Fang <fangying1@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Ying Fang <fangying1@huawei.com>
Message-Id: <20200407140746.8041-6-pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Paolo Bonzini [Tue, 7 Apr 2020 14:07:45 +0000 (10:07 -0400)]
aio-wait: delegate polling of main AioContext if BQL not held
Any thread that is not a iothread returns NULL for qemu_get_current_aio_context().
As a result, it would also return true for
in_aio_context_home_thread(qemu_get_aio_context()), causing
AIO_WAIT_WHILE to invoke aio_poll() directly. This is incorrect
if the BQL is not held, because aio_poll() does not expect to
run concurrently from multiple threads, and it can actually
happen when savevm writes to the vmstate file from the
migration thread.
Therefore, restrict in_aio_context_home_thread to return true
for the main AioContext only if the BQL is held.
The function is moved to aio-wait.h because it is mostly used
there and to avoid a circular reference between main-loop.h
and block/aio.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200407140746.8041-5-pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Stefan Hajnoczi [Wed, 8 Apr 2020 09:11:39 +0000 (10:11 +0100)]
aio-posix: signal-proof fdmon-io_uring
The io_uring_enter(2) syscall returns with errno=EINTR when interrupted
by a signal. Retry the syscall in this case.
It's essential to do this in the io_uring_submit_and_wait() case. My
interpretation of the Linux v5.5 io_uring_enter(2) code is that it
shouldn't affect the io_uring_submit() case, but there is no guarantee
this will always be the case. Let's check for -EINTR around both APIs.
Note that the liburing APIs have -errno return values.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200408091139.273851-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* remotes/xtensa/tags/20200407-xtensa:
target/xtensa: statically allocate xtensa_insnbufs in DisasContext
target/xtensa: fix pasto in pfwait.r opcode name
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Max Filippov [Tue, 7 Apr 2020 03:59:54 +0000 (20:59 -0700)]
target/xtensa: statically allocate xtensa_insnbufs in DisasContext
Rather than dynamically allocate, and risk failing to free
when we longjmp out of the translator, allocate the maximum
buffer size based on the maximum supported instruction length.
Suggested-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Max Filippov <jcmvbkbc@gmail.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Richard Henderson <richard.henderson@linaro.org>
Peter Maydell [Tue, 7 Apr 2020 21:12:04 +0000 (22:12 +0100)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-misc-fixes-070420-1' into staging
Various fixes:
- add .github repo lockdown config
- better handle missing symbols in elf-ops
- protect fcntl64 with #ifdef
- remove unused macros from test
- fix handling of /proc/self/maps
- avoid BAD_SHIFT in x80 softfloat
- properly terminate on .hex EOF
- fix configure probe on windows cross build
- fix %r12 guest_base initialization
# gpg: Signature made Tue 07 Apr 2020 16:31:14 BST
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-misc-fixes-070420-1:
tcg/i386: Fix %r12 guest_base initialization
configure: Add -Werror to PIE probe
hw/core: properly terminate loading .hex on EOF record
linux-user: clean-up padding on /proc/self/maps
linux-user: factor out reading of /proc/self/maps
softfloat: Fix BAD_SHIFT from normalizeFloatx80Subnormal
gdbstub: fix compiler complaining
target/xtensa: add FIXME for translation memory leak
linux-user: more debug for init_guest_space
tests/tcg: remove extraneous pasting macros
linux-user: protect fcntl64 with an #ifdef
elf-ops: bail out if we have no function symbols
.github: Enable repo-lockdown bot to refuse GitHub pull requests
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 7 Apr 2020 18:12:45 +0000 (19:12 +0100)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- Fix crashes and hangs related to iothreads, bdrv_drain and block jobs:
- Fix some AIO context locking in jobs
- Fix blk->in_flight during blk_wait_while_drained()
- vpc: Don't round up already aligned BAT sizes
# gpg: Signature made Tue 07 Apr 2020 15:25:24 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
vpc: Don't round up already aligned BAT sizes
block: Fix blk->in_flight during blk_wait_while_drained()
block: Increase BB.in_flight for coroutine and sync interfaces
block-backend: Reorder flush/pdiscard function definitions
backup: don't acquire aio_context in backup_clean
replication: assert we own context before job_cancel_sync
job: take each job's lock individually in job_txn_apply
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 7 Apr 2020 16:38:47 +0000 (17:38 +0100)]
Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2020-04-07' into staging
Block patches for 5.0-rc2:
- Fix double QLIST_REMOVE() and potential request object leak in
xen-block
- Prevent a potential assertion failure in qcow2's code for compressed
clusters by rejecting invalid (unaligned) requests with -EIO
- Prevent discards on qcow2 v2 images from making backing data reappear
- Make qemu-img convert report I/O error locations by byte offsets
consistently
- Fix for potential I/O test errors (accidental globbing due to missing
quotes)
When %gs cannot be used, we use register offset addressing.
This path is almost never used, so it was clearly not tested.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200406174803.8192-1-richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Without -Werror, the probe may succeed, but then compilation fails
later when -Werror is added for other reasons. Shows up on windows,
where the compiler complains about -fPIC.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200401214756.6559-1-richard.henderson@linaro.org>
Message-Id: <20200403191150.863-13-alex.bennee@linaro.org>
Alex Bennée [Fri, 3 Apr 2020 19:11:49 +0000 (20:11 +0100)]
hw/core: properly terminate loading .hex on EOF record
The https://makecode.microbit.org/#editor generates slightly weird
.hex files which work fine on a real microbit but causes QEMU to
choke. The reason is extraneous data after the EOF record which causes
the loader to attempt to write a bigger file than it should to the
"rom". According to the HEX file spec an EOF really should be the last
thing we process so lets do that.
Reported-by: Ursula Bennée <alex.bennee@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200403191150.863-12-alex.bennee@linaro.org>
Alex Bennée [Fri, 3 Apr 2020 19:11:47 +0000 (20:11 +0100)]
linux-user: clean-up padding on /proc/self/maps
Don't use magic spaces, calculate the justification for the file
field like the kernel does with seq_pad.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200403191150.863-10-alex.bennee@linaro.org>
Alex Bennée [Fri, 3 Apr 2020 19:11:46 +0000 (20:11 +0100)]
linux-user: factor out reading of /proc/self/maps
Unfortunately reading /proc/self/maps is still considered the gold
standard for a process finding out about it's own memory layout. As we
will want this data in other contexts soon factor out the code to read
and parse the data. Rather than just blindly copying the existing
sscanf based code we use a more modern glib version of the parsing
code to make a more general purpose map structure.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200403191150.863-9-alex.bennee@linaro.org>
softfloat: Fix BAD_SHIFT from normalizeFloatx80Subnormal
All other calls to normalize*Subnormal detect zero input before
the call -- this is the only outlier. This case can happen with
+0.0 + +0.0 = +0.0 or -0.0 + -0.0 = -0.0, so return a zero of
the correct sign.
Reported-by: Coverity (CID 1421991) Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20200327232042.10008-1-richard.henderson@linaro.org>
Message-Id: <20200403191150.863-8-alex.bennee@linaro.org>
./gdbstub.c: In function ‘handle_query_thread_extra’:
/usr/include/glib-2.0/glib/glib-autocleanups.h:28:10:
error: ‘cpu_name’ may be used uninitialized in this function
[-Werror=maybe-uninitialized]
g_free (*pp);
^
./gdbstub.c:2063:26: note: ‘cpu_name’ was declared here
g_autofree char *cpu_name;
^
cc1: all warnings being treated as errors
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <20200326151407.25046-1-dplotnikov@virtuozzo.com> Reported-by: Euler Robot <euler.robot@huawei.com> Reported-by: Chen Qun <kuhn.chenqun@huawei.com> Reviewed-by: Miroslav Rezanina <mrezanin@redhat.com>
Message-Id: <20200325092137.24020-1-kuhn.chenqun@huawei.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200403191150.863-7-alex.bennee@linaro.org>
Alex Bennée [Fri, 3 Apr 2020 19:11:43 +0000 (20:11 +0100)]
target/xtensa: add FIXME for translation memory leak
Dynamically allocating a new structure within the DisasContext can
potentially leak as we can longjmp out of the translation loop (see
test_phys_mem). The proper fix would be to use static allocation
within the DisasContext but as the Xtensa translator imports it's code
from elsewhere I leave that as an exercise for the maintainer.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Message-Id: <20200403191150.863-6-alex.bennee@linaro.org>
Alex Bennée [Fri, 3 Apr 2020 19:11:42 +0000 (20:11 +0100)]
linux-user: more debug for init_guest_space
Searching for memory space can cause problems so lets extend the
CPU_LOG_PAGE output so you can watch init_guest_space fail to
allocate memory. A more involved fix is actually required to make this
function play nicely with the large guard pages the sanitiser likes to
use.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20200403191150.863-5-alex.bennee@linaro.org>
Alex Bennée [Fri, 3 Apr 2020 19:11:41 +0000 (20:11 +0100)]
tests/tcg: remove extraneous pasting macros
We are not using them and they just get in the way.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200403191150.863-4-alex.bennee@linaro.org>