Peter Maydell [Sat, 12 Feb 2022 22:04:07 +0000 (22:04 +0000)]
Merge remote-tracking branch 'remotes/vsementsov/tags/pull-nbd-2022-02-09-v2' into staging
nbd: handle AioContext change correctly
v2: add my s-o-b marks to each commit
# gpg: Signature made Fri 11 Feb 2022 13:14:55 GMT
# gpg: using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E 86A1 561F 24C1 F19F 79FB
* remotes/vsementsov/tags/pull-nbd-2022-02-09-v2:
iotests/281: Let NBD connection yield in iothread
block/nbd: Move s->ioc on AioContext change
iotests/281: Test lingering timers
iotests.py: Add QemuStorageDaemon class
block/nbd: Assert there are no timers when closed
block/nbd: Delete open timer when done
block/nbd: Delete reconnect delay timer when done
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Fri, 11 Feb 2022 13:11:49 +0000 (13:11 +0000)]
Merge remote-tracking branch 'remotes/stsquad/tags/pull-testing-and-plugins-090222-1' into staging
Testing and plugin updates:
- include vhost tests in qtest
- clean-up gcov ephemera in clean/.gitignore
- lcitool and docker updates
- mention .editorconfig in devel notes
- switch Centos8 to Centos Stream 8
- remove TCG tracing support
- add coverage plugin using drcov format
- expand abilities of libinsn.so plugin
- use correct logging for i386 int cases
- move reset of plugin data to start of block
- deprecate ppc6432abi
- fix TARGET_ABI_FMT_ptr for softmmu builds
# gpg: Signature made Wed 09 Feb 2022 14:13:14 GMT
# gpg: using RSA key 6685AE99E75167BCAFC8DF35FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" [full]
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-testing-and-plugins-090222-1: (28 commits)
include/exec: fix softmmu version of TARGET_ABI_FMT_lx
linux-user: Remove the deprecated ppc64abi32 target
plugins: move reset of plugin data to tb_start
target/i386: use CPU_LOG_INT for IRQ servicing
tests/plugins: add instruction matching to libinsn.so
tests/plugin: allow libinsn.so per-CPU counts
contrib/plugins: add a drcov plugin
plugins: add helper functions for coverage plugins
tracing: excise the tcg related from tracetool
tracing: remove the trace-tcg includes from the build
tracing: remove TCG memory access tracing
docs: remove references to TCG tracing
tests/tcg/sh4: disable another unreliable test
tests: Update CentOS 8 container to CentOS Stream 8
tests/lcitool: Allow lcitool-refresh in out-of-tree builds, too
gitlab: fall back to commit hash in qemu-setup filename
docs/devel: mention our .editorconfig
tests/lcitool: Install libibumad to cover RDMA on Debian based distros
tests: Manually remove libxml2 on MSYS2 runners
tests/lcitool: Refresh submodule and remove libxml2
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Hanna Reitz [Fri, 4 Feb 2022 11:10:12 +0000 (12:10 +0100)]
iotests/281: Let NBD connection yield in iothread
Put an NBD block device into an I/O thread, and then read data from it,
hoping that the NBD connection will yield during that read. When it
does, the coroutine must be reentered in the block device's I/O thread,
which will only happen if the NBD block driver attaches the connection's
QIOChannel to the new AioContext. It did not do that after 4ddb5d2fde
("block/nbd: drop connection_co") and prior to "block/nbd: Move s->ioc
on AioContext change", which would cause an assertion failure.
To improve our chances of yielding, the NBD server is throttled to
reading 64 kB/s, and the NBD client reads 128 kB, so it should yield at
some point.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Hanna Reitz [Fri, 4 Feb 2022 11:10:11 +0000 (12:10 +0100)]
block/nbd: Move s->ioc on AioContext change
s->ioc must always be attached to the NBD node's AioContext. If that
context changes, s->ioc must be attached to the new context.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2033626 Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Hanna Reitz [Fri, 4 Feb 2022 11:10:10 +0000 (12:10 +0100)]
iotests/281: Test lingering timers
Prior to "block/nbd: Delete reconnect delay timer when done" and
"block/nbd: Delete open timer when done", both of those timers would
remain scheduled even after successfully (re-)connecting to the server,
and they would not even be deleted when the BDS is deleted.
This test constructs exactly this situation:
(1) Configure an @open-timeout, so the open timer is armed, and
(2) Configure a @reconnect-delay and trigger a reconnect situation
(which succeeds immediately), so the reconnect delay timer is armed.
Then we immediately delete the BDS, and sleep for longer than the
@open-timeout and @reconnect-delay. Prior to said patches, this caused
one (or both) of the timer CBs to access already-freed data.
Accessing freed data may or may not crash, so this test can produce
false successes, but I do not know how to show the problem in a better
or more reliable way. If you run this test on "block/nbd: Assert there
are no timers when closed" and without the fix patches mentioned above,
you should reliably see an assertion failure.
(But all other tests that use the reconnect delay timer (264 and 277)
will fail in that configuration, too; as will nbd-reconnect-on-open,
which uses the open timer.)
Remove this test from the quick group because of the two second sleep
this patch introduces.
(I decided to put this test case into 281, because the main bug this
series addresses is in the interaction of the NBD block driver and I/O
threads, which is precisely the scope of 281. The test case for that
other bug will also be put into the test class added here.
Also, excuse the test class's name, I couldn't come up with anything
better. The "yield" part will make sense two patches from now.)
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Hanna Reitz [Fri, 4 Feb 2022 11:10:09 +0000 (12:10 +0100)]
iotests.py: Add QemuStorageDaemon class
This is a rather simple class that allows creating a QSD instance
running in the background and stopping it when no longer needed.
The __del__ handler is a safety net for when something goes so wrong in
a test that e.g. the tearDown() method is not called (e.g. setUp()
launches the QSD, but then launching a VM fails). We do not want the
QSD to continue running after the test has failed, so __del__() will
take care to kill it.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Hanna Reitz [Fri, 4 Feb 2022 11:10:08 +0000 (12:10 +0100)]
block/nbd: Assert there are no timers when closed
Our two timers must not remain armed beyond nbd_clear_bdrvstate(), or
they will access freed data when they fire.
This patch is separate from the patches that actually fix the issue
(HEAD^^ and HEAD^) so that you can run the associated regression iotest
(281) on a configuration that reproducibly exposes the bug.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Hanna Reitz [Fri, 4 Feb 2022 11:10:07 +0000 (12:10 +0100)]
block/nbd: Delete open timer when done
We start the open timer to cancel the connection attempt after a while.
Once nbd_do_establish_connection() has returned, the attempt is over,
and we no longer need the timer.
Delete it before returning from nbd_open(), so that it does not persist
for longer. It has no use after nbd_open(), and just like the reconnect
delay timer, it might well be dangerous if it were to fire afterwards.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Hanna Reitz [Fri, 4 Feb 2022 11:10:06 +0000 (12:10 +0100)]
block/nbd: Delete reconnect delay timer when done
We start the reconnect delay timer to cancel the reconnection attempt
after a while. Once nbd_co_do_establish_connection() has returned, this
attempt is over, and we no longer need the timer.
Delete it before returning from nbd_reconnect_attempt(), so that it does
not persist beyond the I/O request that was paused for reconnecting; we
do not want it to fire in a drained section, because all sort of things
can happen in such a section (e.g. the AioContext might be changed, and
we do not want the timer to fire in the wrong context; or the BDS might
even be deleted, and so the timer CB would access already-freed data).
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Thomas Huth [Wed, 12 Jan 2022 11:27:22 +0000 (11:27 +0000)]
linux-user: Remove the deprecated ppc64abi32 target
It's likely broken, and nobody cared for picking it up again
during the deprecation phase, so let's remove this now.
Since this is the last entry in deprecated_targets_list, remove
the related code in the configure script, too.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Acked-by: Cédric Le Goater <clg@kaod.org> Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20211215084958.185214-1-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220112112722.3641051-32-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:35 +0000 (20:43 +0000)]
plugins: move reset of plugin data to tb_start
We can't always guarantee we get to the end of a translator loop.
Although this can happen for a variety of reasons it does happen more
often on x86 system emulation when an instruction spans across to an
un-faulted page. This caused confusion of the instruction tracking
data resulting in apparent reverse execution (at least from the
plugins point of view).
Fix this by moving the reset code to plugin_gen_tb_start so we always
start with a clean slate.
We unconditionally reset tcg_ctx->plugin_insn as the
plugin_insn_append code uses this as a proxy for knowing if plugins
are enabled for the current instruction. Otherwise we can hit a race
where a previously instrumented thread leaves a stale value after the
main thread exits and disables instrumentation.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/824 Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-27-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:34 +0000 (20:43 +0000)]
target/i386: use CPU_LOG_INT for IRQ servicing
I think these have been wrong since f193c7979c (do not depend on
thunk.h - more log items). Fix them so as not to confuse other
debugging.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220204204335.1689602-26-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:33 +0000 (20:43 +0000)]
tests/plugins: add instruction matching to libinsn.so
This adds simple instruction matching to the libinsn.so plugin which
is useful for examining the execution distance between instructions.
For example to track how often we flush in ARM due to TLB updates:
-plugin ./tests/plugin/libinsn.so,match=tlbi
which leads to output like this:
0xffffffc01019a918, 'tlbi vale1is, x1', 5702 hits, 31825 match hits, Δ+8112 since last match, 68859 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5703 hits, 56593 match hits, Δ+17712125 since last match, 33455 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5704 hits, 56594 match hits, Δ+12689 since last match, 33454 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5705 hits, 56595 match hits, Δ+12585 since last match, 33454 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5706 hits, 56596 match hits, Δ+10491 since last match, 33454 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5707 hits, 56597 match hits, Δ+4721 since last match, 33453 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5708 hits, 56598 match hits, Δ+10733 since last match, 33453 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5709 hits, 56599 match hits, Δ+61959 since last match, 33453 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5710 hits, 56600 match hits, Δ+55235 since last match, 33454 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5711 hits, 56601 match hits, Δ+54373 since last match, 33454 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5712 hits, 56602 match hits, Δ+2705 since last match, 33453 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5713 hits, 56603 match hits, Δ+17262 since last match, 33453 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5714 hits, 56604 match hits, Δ+17206 since last match, 33453 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5715 hits, 56605 match hits, Δ+28940 since last match, 33453 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5716 hits, 56606 match hits, Δ+7370 since last match, 33452 avg insns/match
0xffffffc01019a918, 'tlbi vale1is, x1', 5717 hits, 56607 match hits, Δ+7066 since last match, 33452 avg insns/match
showing we do some sort of TLBI invalidation every 33 thousand
instructions.
Cc: Vasilev Oleg <vasilev.oleg@huawei.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Emilio Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-25-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:32 +0000 (20:43 +0000)]
tests/plugin: allow libinsn.so per-CPU counts
We won't go fully flexible but for most system emulation 8 vCPUs
resolution should be enough for anybody ;-)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-24-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:29 +0000 (20:43 +0000)]
tracing: excise the tcg related from tracetool
Now we have no TCG trace events and no longer handle them in the code
we can remove the handling from the tracetool to generate them. vcpu
tracing is still available although the existing syscall event is an
exercise in redundancy (plugins and -strace can also get the
information).
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: Luis Vilanova <vilanova@imperial.ac.uk> Cc: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220204204335.1689602-21-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:28 +0000 (20:43 +0000)]
tracing: remove the trace-tcg includes from the build
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: Luis Vilanova <vilanova@imperial.ac.uk> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-20-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:27 +0000 (20:43 +0000)]
tracing: remove TCG memory access tracing
If you really want to trace all memory operations TCG plugins gives
you a more flexible interface for doing so.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Cc: Luis Vilanova <vilanova@imperial.ac.uk> Cc: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220204204335.1689602-19-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:26 +0000 (20:43 +0000)]
docs: remove references to TCG tracing
Users wanting this sort of functionality should turn to TCG plugins
instead.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Cc: Luis Vilanova <vilanova@imperial.ac.uk> Cc: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220204204335.1689602-18-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:25 +0000 (20:43 +0000)]
tests/tcg/sh4: disable another unreliable test
Given the other failures it looks like general thread handling on sh4
is sketchy. It fails more often on CI than on my developer machine
though. See https://gitlab.com/qemu-project/qemu/-/issues/856 for more
details.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Laurent Vivier <laurent@vivier.eu> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-17-alex.bennee@linaro.org>
Thomas Huth [Fri, 4 Feb 2022 20:43:24 +0000 (20:43 +0000)]
tests: Update CentOS 8 container to CentOS Stream 8
Support for CentOS 8 has stopped at the end of 2021, so let's
switch to the Stream variant instead.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220201101911.97900-1-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220204204335.1689602-16-alex.bennee@linaro.org>
Thomas Huth [Fri, 4 Feb 2022 20:43:23 +0000 (20:43 +0000)]
tests/lcitool: Allow lcitool-refresh in out-of-tree builds, too
When running "make lcitool-refresh" in an out-of-tree build, it
currently fails with an error message from git like this:
fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
Fix it by changing to the source directory first before updating
the submodule.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220201085554.85733-1-thuth@redhat.com> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220204204335.1689602-15-alex.bennee@linaro.org>
Stefan Hajnoczi [Fri, 4 Feb 2022 20:43:22 +0000 (20:43 +0000)]
gitlab: fall back to commit hash in qemu-setup filename
Personal repos may not have release tags (v6.0.0, v6.1.0, etc) and this
causes cross_system_build_job to fail when pretty-printing a unique
qemu-setup-*.exe name:
Alex Bennée [Fri, 4 Feb 2022 20:43:21 +0000 (20:43 +0000)]
docs/devel: mention our .editorconfig
Ideally we should keep all our automatic formatting gubins in here.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-13-alex.bennee@linaro.org>
tests/lcitool: Install libibumad to cover RDMA on Debian based distros
On Debian we also need libibumad to enable RDMA:
$ ../configure --enable-rdma
ERROR: OpenFabrics librdmacm/libibverbs/libibumad not present.
Your options:
(1) Fast: Install infiniband packages (devel) from your distro.
(2) Cleanest: Install libraries from www.openfabrics.org
(3) Also: Install softiwarp if you don't have RDMA hardware
Add the dependency to lcitool's qemu.yml (where librdmacm and
libibverbs are already listed) and refresh the generated files
by running:
$ make lcitool-refresh
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220121154134.315047-8-f4bug@amsat.org>
Message-Id: <20220204204335.1689602-12-alex.bennee@linaro.org>
lcitool doesn't support MSYS2 targets, so manually remove
this now unnecessary library.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220121154134.315047-7-f4bug@amsat.org>
Message-Id: <20220204204335.1689602-11-alex.bennee@linaro.org>
tests/lcitool: Refresh submodule and remove libxml2
The previous commit removed all uses of libxml2.
Refresh lcitool submodule, update qemu.yml and refresh the generated
files by running:
$ make lcitool-refresh
Note: This refreshment also removes libudev dependency on Fedora
and CentOS due to libvirt-ci commit 18bfaee ("mappings: Improve
mapping for libudev"), since "The udev project has been absorbed
by the systemd project", and lttng-ust on FreeBSD runners due to
libvirt-ci commit 6dd9b6f ("guests: drop lttng-ust from FreeBSD
platform").
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220121154134.315047-6-f4bug@amsat.org>
Message-Id: <20220204204335.1689602-10-alex.bennee@linaro.org>
Michael Tokarev [Fri, 4 Feb 2022 20:43:17 +0000 (20:43 +0000)]
drop libxml2 checks since libxml is not actually used (for parallels)
For a long time, we assumed that libxml2 is necessary for parallels
block format support (block/parallels*). However, this format actually
does not use libxml [*]. Since this is the only user of libxml2 in
whole QEMU tree, we can drop all libxml2 checks and dependencies too.
It is even more: --enable-parallels configure option was the only
option which was silently ignored when it's (fake) dependency
(libxml2) isn't installed.
Drop all mentions of libxml2.
[*] Actually the basis for libxml use were introduced in commit ed279a06c53 ("configure: add dependency") but the implementation
was never merged:
https://lore.kernel.org/qemu-devel/70227bbd-a517-70e9-714f-e6e0ec431be9@openvz.org/
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20220119090423.149315-1-mjt@msgid.tls.msk.ru> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMD: Updated description and adapted to use lcitool] Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220121154134.315047-5-f4bug@amsat.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220204204335.1689602-9-alex.bennee@linaro.org>
tests/lcitool: Include local qemu.yml when refreshing cirrus-ci files
The script only include the local qemu.yml for Dockerfiles.
Since we want to keep the Cirrus-CI generated files in sync,
also use the --data-dir option in generate_cirrus().
Fixes: c45a540f4bd (".gitlab-ci.d/cirrus: auto-generate variables with lcitool") Reported-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220121154134.315047-4-f4bug@amsat.org>
Message-Id: <20220204204335.1689602-8-alex.bennee@linaro.org>
MAINTAINERS: Cover lcitool submodule with build test / automation
lcitool is used by build test / automation, we want maintainers
to get notified if the submodule is updated.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220121154134.315047-2-f4bug@amsat.org>
Message-Id: <20220204204335.1689602-6-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:13 +0000 (20:43 +0000)]
.gitignore: add .gcov pattern
The gcovr tool is very messy and can leave a lot of crap in the source
tree even when using build directories.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220204204335.1689602-5-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:12 +0000 (20:43 +0000)]
Makefile: also remove .gcno files when cleaning
Left over .gcno files from old builds can really confuse gcov and the
user expects a clean slate after "make clean". Make clean mean clean.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-4-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:11 +0000 (20:43 +0000)]
tests/qtest: enable more vhost-user tests by default
If this starts causing failures again we should probably fix that.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20220204204335.1689602-3-alex.bennee@linaro.org>
Alex Bennée [Fri, 4 Feb 2022 20:43:10 +0000 (20:43 +0000)]
tests/Makefile.include: clean-up old code
This is no longer needed since a2ce7dbd91 ("meson: convert tests/qtest
to meson", 2020-08-21)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220204204335.1689602-2-alex.bennee@linaro.org>
Peter Maydell [Tue, 8 Feb 2022 11:40:08 +0000 (11:40 +0000)]
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20220208' into staging
target-arm queue:
* Fix handling of SVE ZCR_LEN when using VHE
* xlnx-zynqmp: 'Or' the QSPI / QSPI DMA IRQs
* Don't ever enable PSCI when booting guest in EL3
* Adhere to SMCCC 1.3 section 5.2
* highbank: Fix issues with booting SMP
* midway: Fix issues booting at all
* boot: Drop existing dtb /psci node rather than retaining it
* versal-virt: Always call arm_load_kernel()
* force flag recalculation when messing with DAIF
* hw/timer/armv7m_systick: Update clock source before enabling timer
* hw/arm/smmuv3: Fix device reset
* hw/intc/arm_gicv3_its: refactorings and minor bug fixes
* hw/sensor: Add lsm303dlhc magnetometer device
* remotes/pmaydell/tags/pull-target-arm-20220208: (39 commits)
hw/sensor: Add lsm303dlhc magnetometer device
hw/intc/arm_gicv3_its: Split error checks
hw/intc/arm_gicv3_its: Don't allow intid 1023 in MAPI/MAPTI
hw/intc/arm_gicv3_its: In MAPC with V=0, don't check rdbase field
hw/intc/arm_gicv3_its: Drop TableDesc and CmdQDesc valid fields
hw/intc/arm_gicv3_its: Make update_ite() use ITEntry
hw/intc/arm_gicv3_its: Pass ITE values back from get_ite() via a struct
hw/intc/arm_gicv3_its: Avoid nested ifs in get_ite()
hw/intc/arm_gicv3_its: Fix address calculation in get_ite() and update_ite()
hw/intc/arm_gicv3_its: Pass CTEntry to update_cte()
hw/intc/arm_gicv3_its: Keep CTEs as a struct, not a raw uint64_t
hw/intc/arm_gicv3_its: Pass DTEntry to update_dte()
hw/intc/arm_gicv3_its: Keep DTEs as a struct, not a raw uint64_t
hw/intc/arm_gicv3_its: Use address_space_map() to access command queue packets
hw/arm/smmuv3: Fix device reset
hw/timer/armv7m_systick: Update clock source before enabling timer
arm: force flag recalculation when messing with DAIF
hw/arm: versal-virt: Always call arm_load_kernel()
hw/arm/boot: Drop existing dtb /psci node rather than retaining it
hw/arm/boot: Drop nb_cpus field from arm_boot_info
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Kevin Townsend [Sun, 30 Jan 2022 09:50:32 +0000 (10:50 +0100)]
hw/sensor: Add lsm303dlhc magnetometer device
This commit adds emulation of the magnetometer on the LSM303DLHC.
It allows the magnetometer's X, Y and Z outputs to be set via the
mag-x, mag-y and mag-z properties, as well as the 12-bit
temperature output via the temperature property. Sensor can be
enabled with 'CONFIG_LSM303DLHC_MAG=y'.
Signed-off-by: Kevin Townsend <kevin.townsend@linaro.org>
Message-id: 20220130095032.35392-1-kevin.townsend@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 1 Feb 2022 19:32:07 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: Split error checks
In most of the ITS command processing, we check different error
possibilities one at a time and log them appropriately. In
process_mapti() and process_mapd() we have code which checks
multiple error cases at once, which means the logging is less
specific than it could be. Split those cases up.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-14-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:32:06 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: Don't allow intid 1023 in MAPI/MAPTI
When handling MAPI/MAPTI, we allow the supplied interrupt ID to be
either 1023 or something in the valid LPI range. This is a mistake:
only a real valid LPI is allowed. (The general behaviour of the ITS
is that most interrupt ID fields require a value in the LPI range;
the exception is that fields specifying a doorbell value, which are
all in GICv4 commands, allow also 1023 to mean "no doorbell".)
Remove the condition that incorrectly allows 1023 here.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-13-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:32:05 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: In MAPC with V=0, don't check rdbase field
In the MAPC command, if V=0 this is a request to delete a collection
table entry and the rdbase field of the command packet will not be
used. In particular, the specification says that the "UNPREDICTABLE
if rdbase is not valid" only applies for V=1.
We were doing a check-and-log-guest-error on rdbase regardless of
whether the V bit was set, and also (harmlessly but confusingly)
storing the contents of the rdbase field into the updated collection
table entry. Update the code so that if V=0 we don't check or use
the rdbase field value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-12-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:32:04 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: Drop TableDesc and CmdQDesc valid fields
Currently we track in the TableDesc and CmdQDesc structs the state of
the GITS_BASER<n> and GITS_CBASER Valid bits. However we aren't very
consistent abut checking the valid field: we test it in update_cte()
and update_dte(), but not anywhere else we look things up in tables.
The GIC specification says that it is UNPREDICTABLE if a guest fails
to set any of these Valid bits before enabling the ITS via
GITS_CTLR.Enabled. So we can choose to handle Valid == 0 as
equivalent to a zero-length table. This is in fact how we're already
catching this case in most of the table-access paths: when Valid is 0
we leave the num_entries fields in TableDesc or CmdQDesc set to zero,
and then the out-of-bounds check "index >= num_entries" that we have
to do anyway before doing any of these table lookups will always be
true, catching the no-valid-table case without any extra code.
So we can remove the checks on the valid field from update_cte()
and update_dte(): since these happen after the bounds check there
was never any case when the test could fail. That means the valid
fields would be entirely unused, so just remove them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-11-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:32:03 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: Make update_ite() use ITEntry
Make the update_ite() struct use the new ITEntry struct, so that
callers don't need to assemble the in-memory ITE data themselves, and
only get_ite() and update_ite() need to care about that in-memory
layout. We can then drop the no-longer-used IteEntry struct
definition.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-10-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:32:02 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: Pass ITE values back from get_ite() via a struct
In get_ite() we currently return the caller some of the fields of an
Interrupt Table Entry via a set of pointer arguments, and validate
some of them internally (interrupt type and valid bit) to return a
simple true/false 'valid' indication. Define a new ITEntry struct
which has all the fields that the in-memory ITE has, and bring the
get_ite() function in to line with get_dte() and get_cte().
This paves the way for handling virtual interrupts, which will want
a different subset of the fields in the ITE. Handling them under
the old "lots of pointer arguments" scheme would have meant a
confusingly large set of arguments for this function.
The new struct ITEntry is obviously confusably similar to the
existing IteEntry struct, whose fields are the raw 12 bytes
of the in-memory ITE. In the next commit we will make update_ite()
use ITEntry instead of IteEntry, which will allow us to delete
the IteEntry struct and remove the confusion.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-9-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:32:01 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: Avoid nested ifs in get_ite()
The get_ite() code has some awkward nested if statements; clean
them up by returning early if the memory accesses fail.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-8-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:32:00 +0000 (19:32 +0000)]
hw/intc/arm_gicv3_its: Fix address calculation in get_ite() and update_ite()
In get_ite() and update_ite() we work with a 12-byte in-guest-memory
table entry, which we intend to handle as an 8-byte value followed by
a 4-byte value. Unfortunately the calculation of the address of the
4-byte value is wrong, because we write it as:
table_base_address + (index * entrysize) + 4
(obfuscated by the way the expression has been written)
when it should be + 8. This bug meant that we overwrote the top
bytes of the 8-byte value with the 4-byte value. There are no
guest-visible effects because the top half of the 8-byte value
contains only the doorbell interrupt field, which is used only in
GICv4, and the two bugs in the "write ITE" and "read ITE" codepaths
cancel each other out.
We can't simply change the calculation, because this would break
migration of a (TCG) guest from the old version of QEMU which had
in-guest-memory interrupt tables written using the buggy version of
update_ite(). We must also at the same time change the layout of the
fields within the ITE_L and ITE_H values so that the in-memory
locations of the fields we care about (VALID, INTTYPE, INTID and
ICID) stay the same.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-7-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:31:59 +0000 (19:31 +0000)]
hw/intc/arm_gicv3_its: Pass CTEntry to update_cte()
Make update_cte() take a CTEntry struct rather than all the fields
of the new CTE as separate arguments.
This brings it into line with the update_dte() API.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-6-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:31:58 +0000 (19:31 +0000)]
hw/intc/arm_gicv3_its: Keep CTEs as a struct, not a raw uint64_t
In the ITS, a CTE is an entry in the collection table, which contains
multiple fields. Currently the function get_cte() which reads one
entry from the device table returns a success/failure boolean and
passes back the raw 64-bit integer CTE value via a pointer argument.
We then extract fields from the CTE as we need them.
Create a real C struct with the same fields as the CTE, and
populate it in get_cte(), so that that function and update_cte()
are the only ones which need to care about the in-guest-memory
format of the CTE.
This brings get_cte()'s API into line with get_dte().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-5-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:31:57 +0000 (19:31 +0000)]
hw/intc/arm_gicv3_its: Pass DTEntry to update_dte()
Make update_dte() take a DTEntry struct rather than all the fields of
the new DTE as separate arguments.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-4-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:31:56 +0000 (19:31 +0000)]
hw/intc/arm_gicv3_its: Keep DTEs as a struct, not a raw uint64_t
In the ITS, a DTE is an entry in the device table, which contains
multiple fields. Currently the function get_dte() which reads one
entry from the device table returns it as a raw 64-bit integer,
which we then pass around in that form, only extracting fields
from it as we need them.
Create a real C struct with the same fields as the DTE, and
populate it in get_dte(), so that that function and update_dte()
are the only ones that need to care about the in-guest-memory
format of the DTE.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-3-peter.maydell@linaro.org
Peter Maydell [Tue, 1 Feb 2022 19:31:55 +0000 (19:31 +0000)]
hw/intc/arm_gicv3_its: Use address_space_map() to access command queue packets
Currently the ITS accesses each 8-byte doubleword in a 4-doubleword
command packet with a separate address_space_ldq_le() call. This is
awkward because the individual command processing functions have
ended up with code to handle "load more doublewords out of the
packet", which is both unwieldy and also a potential source of bugs
because it's not obvious when looking at a line that pulls a field
out of the 'value' variable which of the 4 doublewords that variable
currently holds.
Switch to using address_space_map() to map the whole command packet
at once and fish the four doublewords out of it. Then each process_*
function can start with a few lines of code that extract the fields
it cares about.
This requires us to split out the guts of process_its_cmd() into a
new do_process_its_cmd(), because we were previously overloading the
value and offset arguments as a backdoor way to directly pass the
devid and eventid from a write to GITS_TRANSLATER. The new
do_process_its_cmd() takes those arguments directly, and
process_its_cmd() is just a wrapper that does the "read fields from
command packet" part.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220201193207.2771604-2-peter.maydell@linaro.org
Eric Auger [Wed, 2 Feb 2022 11:16:02 +0000 (12:16 +0100)]
hw/arm/smmuv3: Fix device reset
We currently miss a bunch of register resets in the device reset
function. This sometimes prevents the guest from rebooting after
a system_reset (with virtio-blk-pci). For instance, we may get
the following errors:
invalid STE
smmuv3-iommu-memory-region-0-0 translation failed for iova=0x13a9d2000(SMMU_EVT_C_BAD_STE)
Invalid read at addr 0x13A9D2000, size 2, region '(null)', reason: rejected
invalid STE
smmuv3-iommu-memory-region-0-0 translation failed for iova=0x13a9d2000(SMMU_EVT_C_BAD_STE)
Invalid write at addr 0x13A9D2000, size 2, region '(null)', reason: rejected
invalid STE
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 20220202111602.627429-1-eric.auger@redhat.com Fixes: 10a83cb988 ("hw/arm/smmuv3: Skeleton") Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Richard Petri [Tue, 1 Feb 2022 19:26:51 +0000 (20:26 +0100)]
hw/timer/armv7m_systick: Update clock source before enabling timer
Starting the SysTick timer and changing the clock source a the same time
will result in an error, if the previous clock period was zero. For exmaple,
on the mps2-tz platforms, no refclk is present. Right after reset, the
configured ptimer period is zero, and trying to enabling it will turn it off
right away. E.g., code running on the platform setting
should change the clock source and enable the timer on real hardware, but
resulted in an error in qemu.
Signed-off-by: Richard Petri <git@rpls.de> Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20220201192650.289584-1-git@rpls.de Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Alex Bennée [Wed, 2 Feb 2022 12:23:53 +0000 (12:23 +0000)]
arm: force flag recalculation when messing with DAIF
The recently introduced debug tests in kvm-unit-tests exposed an error
in our handling of singlestep cause by stale hflags. This is caught by
--enable-debug-tcg when running the tests.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Andrew Jones <drjones@redhat.com> Tested-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220202122353.457084-1-alex.bennee@linaro.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Always call arm_load_kernel() regardless of kernel_filename being
set. This is needed because arm_load_kernel() sets up reset for
the CPUs.
Fixes: 6f16da53ff (hw/arm: versal: Add a virtual Xilinx Versal board) Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 20220130110313.4045351-2-edgar.iglesias@gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 27 Jan 2022 15:46:39 +0000 (15:46 +0000)]
hw/arm/boot: Drop existing dtb /psci node rather than retaining it
If we're using PSCI emulation, we add a /psci node to the device tree
we pass to the guest. At the moment, if the dtb already has a /psci
node in it, we retain it, rather than replacing it. (This behaviour
was added in commit c39770cd637765 in 2018.)
This is a problem if the existing node doesn't match our PSCI
emulation. In particular, it might specify the wrong method (HVC vs
SMC), or wrong function IDs for cpu_suspend/cpu_off/etc, in which
case the guest will not get the behaviour it wants when it makes PSCI
calls.
An example of this is trying to boot the highbank or midway board
models using the device tree supplied in the kernel sources: this
device tree includes a /psci node that specifies function IDs that
don't match the (PSCI 0.2 compliant) IDs that QEMU uses. The dtb
cpu_suspend function ID happens to match the PSCI 0.2 cpu_off ID, so
the guest hangs after booting when the kernel tries to idle the CPU
and instead it gets turned off.
Instead of retaining an existing /psci node, delete it entirely
and replace it with a node whose properties match QEMU's PSCI
emulation behaviour. This matches the way we handle /memory nodes,
where we also delete any existing nodes and write in ones that
match the way QEMU is going to behave.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-17-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:38 +0000 (15:46 +0000)]
hw/arm/boot: Drop nb_cpus field from arm_boot_info
We use the arm_boot_info::nb_cpus field in only one place, and that
place can easily get the number of CPUs locally rather than relying
on the board code to have set the field correctly. (At least one
board, xlnx-versal-virt, does not set the field despite having more
than one CPU.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-16-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:37 +0000 (15:46 +0000)]
hw/arm/highbank: Drop unused secondary boot stub code
The highbank and midway board code includes boot-stub code for
handling secondary CPU boot which keeps the secondaries in a pen
until the primary writes to a known location with the address they
should jump to.
This code is never used, because the boards enable QEMU's PSCI
emulation, so secondary CPUs are kept powered off until the PSCI call
which turns them on, and then start execution from the address given
by the guest in that PSCI call. Delete the unreachable code.
(The code was wrong for midway in any case -- on the Cortex-A15 the
GIC CPU interface registers are at a different offset from PERIPHBASE
compared to the Cortex-A9, and the code baked-in the offsets for
highbank's A9.)
Note that this commit implicitly depends on the preceding "Don't
write secondary boot stub if using PSCI" commit -- the default
secondary-boot stub code overlaps with one of the highbank-specific
bootcode rom blobs, so we must suppress the secondary-boot
stub code entirely, not merely replace the highbank-specific
version with the default.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-15-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:36 +0000 (15:46 +0000)]
hw/arm/boot: Don't write secondary boot stub if using PSCI
If we're using PSCI emulation to start secondary CPUs, there is no
point in writing the "secondary boot" stub code, because it will
never be used -- secondary CPUs start powered-off, and when powered
on are set to begin execution at the address specified by the guest's
power-on PSCI call, not at the stub.
Move the call to the hook that writes the secondary boot stub code so
that we can do it only if we're starting a Linux kernel and not using
PSCI.
(None of the users of the hook care about the ordering of its call
relative to anything else: they only use it to write a rom blob to
guest memory.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-14-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:35 +0000 (15:46 +0000)]
hw/arm/boot: Prevent setting both psci_conduit and secure_board_setup
Now that we have dealt with the one special case (highbank) that needed
to set both psci_conduit and secure_board_setup, we don't need to
allow that combination any more. It doesn't make sense in general,
so use an assertion to ensure we don't add new boards that do it
by accident without thinking through the consequences.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-13-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:34 +0000 (15:46 +0000)]
hw/arm/highbank: Drop use of secure_board_setup
Guest code on highbank may make non-PSCI SMC calls in order to
enable/disable the L2x0 cache controller (see the Linux kernel's
arch/arm/mach-highbank/highbank.c highbank_l2c310_write_sec()
function). The ABI for this is documented in kernel commit 8e56130dcb as being borrowed from the OMAP44xx ROM. The OMAP44xx TRM
documents this function ID as having no return value and potentially
trashing all guest registers except SP and PC. For QEMU's purposes
(where our L2x0 model is a stub and enabling or disabling it doesn't
affect the guest behaviour) a simple "do nothing" SMC is fine.
We currently implement this NOP behaviour using a little bit of
Secure code we run before jumping to the guest kernel, which is
written by arm_write_secure_board_setup_dummy_smc(). The code sets
up a set of Secure vectors where the SMC entry point returns without
doing anything.
Now that the PSCI SMC emulation handles all SMC calls (setting r0 to
an error code if the input r0 function identifier is not recognized),
we can use that default behaviour as sufficient for the highbank
cache controller call. (Because the guest code assumes r0 has no
interesting value on exit it doesn't matter that we set it to the
error code). We can therefore delete the highbank board code that
sets secure_board_setup to true and writes the secure-code bootstub.
(Note that because the OMAP44xx ABI puts function-identifiers in
r12 and PSCI uses r0, we only avoid a clash because Linux's code
happens to put the function-identifier in both registers. But this
is true also when the kernel is running on real firmware that
implements both ABIs as far as I can see.)
This change fixes in passing booting on the 'midway' board model,
which has been completely broken since we added support for Hyp
mode to the Cortex-A15 CPU. When we did that boot.c was made to
start running the guest code in Hyp mode; this includes the
board_setup hook, which instantly UNDEFs because the NSACR is
not accessible from Hyp. (Put another way, we never made the
secure_board_setup hook support cope with Hyp mode.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-12-peter.maydell@linaro.org
Peter Maydell [Mon, 7 Feb 2022 16:54:30 +0000 (16:54 +0000)]
arm: tcg: Adhere to SMCCC 1.3 section 5.2
The SMCCC 1.3 spec section 5.2 says
The Unknown SMC Function Identifier is a sign-extended value of (-1)
that is returned in the R0, W0 or X0 registers. An implementation must
return this error code when it receives:
* An SMC or HVC call with an unknown Function Identifier
* An SMC or HVC call for a removed Function Identifier
* An SMC64/HVC64 call from AArch32 state
To comply with these statements, let's always return -1 when we encounter
an unknown HVC or SMC call.
[PMM:
This is a reinstatement of commit 9fcd15b9193e819b, previously
reverted in commit 4825eaae4fdd56fba0f; we can do this now that we
have arranged for all the affected board models to not enable the
PSCI emulation if they are running guest code at EL3. This avoids
the regressions that caused us to revert the change for 7.0.]
Signed-off-by: Alexander Graf <agraf@csgraf.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Thu, 27 Jan 2022 15:46:32 +0000 (15:46 +0000)]
hw/arm: highbank: For EL3 guests, don't enable PSCI, start all cores
Change the highbank/midway boards to use the new boot.c functionality
to allow us to enable psci-conduit only if the guest is being booted
in EL1 or EL2, so that if the user runs guest EL3 firmware code our
PSCI emulation doesn't get in its way.
To do this we stop setting the psci-conduit and start-powered-off
properties on the CPU objects in the board code, and instead set the
psci_conduit field in the arm_boot_info struct to tell the common
boot loader code that we'd like PSCI if the guest is starting at an
EL that it makes sense with (in which case it will set these
properties).
This means that when running guest code at EL3, all the cores
will start execution at once on poweron. This matches the
real hardware behaviour. (A brief description of the hardware
boot process is in the u-boot documentation for these boards:
https://u-boot.readthedocs.io/en/latest/board/highbank/highbank.html#boot-process
-- in theory one might run the 'a9boot'/'a15boot' secure monitor
code in QEMU, though we probably don't emulate enough for that.)
This affects the highbank and midway boards.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-10-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:31 +0000 (15:46 +0000)]
hw/arm/virt: Let boot.c handle PSCI enablement
Instead of setting the CPU psci-conduit and start-powered-off
properties in the virt board code, set the arm_boot_info psci_conduit
field so that the boot.c code can do it.
This will fix a corner case where we were incorrectly enabling PSCI
emulation when booting guest code into EL3 because it was an ELF file
passed to -kernel or to the generic loader. (EL3 guest code started
via -bios or -pflash was already being run with PSCI emulation
disabled.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-9-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:30 +0000 (15:46 +0000)]
hw/arm/versal: Let boot.c handle PSCI enablement
Instead of setting the CPU psci-conduit and start-powered-off
properties in the xlnx-versal-virt board code, set the arm_boot_info
psci_conduit field so that the boot.c code can do it.
This will fix a corner case where we were incorrectly enabling PSCI
emulation when booting guest code into EL3 because it was an ELF file
passed to -kernel. (EL3 guest code started via -bios, -pflash, or
the generic loader was already being run with PSCI emulation
disabled.)
Note that EL3 guest code has no way to turn on the secondary CPUs
because there's no emulated power controller, but this was already
true for EL3 guest code run via -bios, -pflash, or the generic
loader.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-8-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:29 +0000 (15:46 +0000)]
hw/arm/xlnx-zcu102: Don't enable PSCI conduit when booting guest in EL3
Change the Xilinx ZynqMP-based board xlnx-zcu102 to use the new
boot.c functionality to allow us to enable psci-conduit only if
the guest is being booted in EL1 or EL2, so that if the user runs
guest EL3 firmware code our PSCI emulation doesn't get in its
way.
To do this we stop setting the psci-conduit property on the CPU
objects in the SoC code, and instead set the psci_conduit field in
the arm_boot_info struct to tell the common boot loader code that
we'd like PSCI if the guest is starting at an EL that it makes
sense with.
Note that this means that EL3 guest code will have no way
to power on secondary cores, because we don't model any
kind of power controller that does that on this SoC.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com> Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220127154639.2090164-7-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:28 +0000 (15:46 +0000)]
hw/arm: allwinner: Don't enable PSCI conduit when booting guest in EL3
Change the allwinner-h3 based board to use the new boot.c
functionality to allow us to enable psci-conduit only if the guest is
being booted in EL1 or EL2, so that if the user runs guest EL3
firmware code our PSCI emulation doesn't get in its way.
To do this we stop setting the psci-conduit property on the CPU
objects in the SoC code, and instead set the psci_conduit field in
the arm_boot_info struct to tell the common boot loader code that
we'd like PSCI if the guest is starting at an EL that it makes sense
with.
This affects the orangepi-pc board.
This commit leaves the secondary CPUs in the powered-down state if
the guest is booting at EL3, which is the same behaviour as before
this commit. The secondaries can no longer be started by that EL3
code making a PSCI call but can still be started via the CPU
Configuration Module registers (which we model in
hw/misc/allwinner-cpucfg.c).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Niek Linnenbank <nieklinnenbank@gmail.com> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Message-id: 20220127154639.2090164-6-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:27 +0000 (15:46 +0000)]
hw/arm: imx: Don't enable PSCI conduit when booting guest in EL3
Change the iMX-SoC based boards to use the new boot.c functionality
to allow us to enable psci-conduit only if the guest is being booted
in EL1 or EL2, so that if the user runs guest EL3 firmware code our
PSCI emulation doesn't get in its way.
To do this we stop setting the psci-conduit property on the CPU
objects in the SoC code, and instead set the psci_conduit field in
the arm_boot_info struct to tell the common boot loader code that
we'd like PSCI if the guest is starting at an EL that it makes
sense with.
This affects the mcimx6ul-evk and mcimx7d-sabre boards.
Note that for the mcimx7d board, this means that when running guest
code at EL3 there is currently no way to power on the secondary CPUs,
because we do not currently have a model of the system reset
controller module which should be used to do that for the imx7 SoC,
only for the imx6 SoC. (Previously EL3 code which knew it was
running on QEMU could use a PSCI call to do this.) This doesn't
affect the imx6ul-evk board because it is uniprocessor.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org> Acked-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220127154639.2090164-5-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:26 +0000 (15:46 +0000)]
hw/arm/boot: Support setting psci-conduit based on guest EL
Currently we expect board code to set the psci-conduit property on
CPUs and ensure that secondary CPUs are created with the
start-powered-off property set to false, if the board wishes to use
QEMU's builtin PSCI emulation. This worked OK for the virt board
where we first wanted to use it, because the virt board directly
creates its CPUs and is in a reasonable position to set those
properties. For other boards which model real hardware and use a
separate SoC object, however, it is more awkward. Most PSCI-using
boards just set the psci-conduit board unconditionally.
This was never strictly speaking correct (because you would not be
able to run EL3 guest firmware that itself provided the PSCI
interface, as the QEMU implementation would overrule it), but mostly
worked in practice because for non-PSCI SMC calls QEMU would emulate
the SMC instruction as normal (by trapping to guest EL3). However,
we would like to make our PSCI emulation follow the part of the SMCC
specification that mandates that SMC calls with unknown function
identifiers return a failure code, which means that all SMC calls
will be handled by the PSCI code and the "emulate as normal" path
will no longer be taken.
We tried to implement that in commit 9fcd15b9193e81
("arm: tcg: Adhere to SMCCC 1.3 section 5.2"), but this
regressed attempts to run EL3 guest code on the affected boards:
* mcimx6ul-evk, mcimx7d-sabre, orangepi, xlnx-zcu102
* for the case only of EL3 code loaded via -kernel (and
not via -bios or -pflash), virt and xlnx-versal-virt
so for the 7.0 release we reverted it (in commit 4825eaae4fdd56f).
This commit provides a mechanism that boards can use to arrange that
psci-conduit is set if running guest code at a low enough EL but not
if it would be running at the same EL that the conduit implies that
the QEMU PSCI implementation is using. (Later commits will convert
individual board models to use this mechanism.)
We do this by moving the setting of the psci-conduit and
start-powered-off properties to arm_load_kernel(). Boards which want
to potentially use emulated PSCI must set a psci_conduit field in the
arm_boot_info struct to the type of conduit they want to use (SMC or
HVC); arm_load_kernel() will then set the CPUs up accordingly if it
is not going to start the guest code at the same or higher EL as the
fake QEMU firmware would be at.
Board/SoC code which uses this mechanism should no longer set the CPU
psci-conduit property directly. It should only set the
start-powered-off property for secondaries if EL3 guest firmware
running bare metal expects that rather than the alternative "all CPUs
start executing the firmware at once".
Note that when calculating whether we are going to run guest
code at EL3, we ignore the setting of arm_boot_info::secure_board_setup,
which might cause us to run a stub bit of guest code at EL3 which
does some board-specific setup before dropping to EL2 or EL1 to
run the guest kernel. This is OK because only one board that
enables PSCI sets secure_board_setup (the highbank board), and
the stub code it writes will behave the same way whether the
one SMC call it makes is handled by "emulate the SMC" or by
"PSCI default returns an error code". So we can leave that stub
code in place until after we've changed the PSCI default behaviour;
at that point we will remove it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20220127154639.2090164-4-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:25 +0000 (15:46 +0000)]
cpu.c: Make start-powered-off settable after realize
The CPU object's start-powered-off property is currently only
settable before the CPU object is realized. For arm machines this is
awkward, because we would like to decide whether the CPU should be
powered-off based on how we are booting the guest code, which is
something done in the machine model code and in common code called by
the machine model, which runs much later and in completely different
parts of the codebase from the SoC object code that is responsible
for creating and realizing the CPU objects.
Allow start-powered-off to be set after realize. Since this isn't
something that's supported by the DEFINE_PROP_* macros, we have to
switch the property definition to use the
object_class_property_add_bool() function.
Note that it doesn't conceptually make sense to change the setting of
the property after the machine has been completely initialized,
beacuse this would mean that the behaviour of the machine when first
started would differ from its behaviour when the system is
subsequently reset. (It would also require the underlying state to
be migrated, which we don't do.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20220127154639.2090164-3-peter.maydell@linaro.org
Peter Maydell [Thu, 27 Jan 2022 15:46:24 +0000 (15:46 +0000)]
target/arm: make psci-conduit settable after realize
We want to allow the psci-conduit property to be set after realize,
because the parts of the code which are best placed to decide if it's
OK to enable QEMU's builtin PSCI emulation (the board code and the
arm_load_kernel() function are distant from the code which creates
and realizes CPUs (typically inside an SoC object's init and realize
method) and run afterwards.
Since the DEFINE_PROP_* macros don't have support for creating
properties which can be changed after realize, change the property to
be created with object_property_add_uint32_ptr(), which is what we
already use in this function for creating settable-after-realize
properties like init-svtor and init-nsvtor.
Note that it doesn't conceptually make sense to change the setting of
the property after the machine has been completely initialized,
beacuse this would mean that the behaviour of the machine when first
started would differ from its behaviour when the system is
subsequently reset. (It would also require the underlying state to
be migrated, which we don't do.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20220127154639.2090164-2-peter.maydell@linaro.org
'Or' the IRQs coming from the QSPI and QSPI DMA models. This is done for
avoiding the situation where one of the models incorrectly deasserts an
interrupt asserted from the other model (which will result in that the IRQ
is lost and will not reach guest SW).
Signed-off-by: Francisco Iglesias <francisco.iglesias@xilinx.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20220203151742.1457-1-francisco.iglesias@xilinx.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/lvivier-gitlab/tags/linux-user-for-7.0-pull-request:
linux-user/syscall: Translate TARGET_RLIMIT_RTTIME
linux-user: Move generic TARGET_RLIMIT* definitions to generic/target_resource.h
linux-user: Implement starttime field in self stat emulation
linux-user: sigprocmask check read perms first
linux-user: rt_sigprocmask, check read perms first
linux-user: Fix inotify on aarch64
linux-user/alpha: Fix target rlimits for alpha and rearrange for clarity
linux-user: Remove unnecessary 'aligned' attribute from TaskState
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Sun, 6 Feb 2022 10:46:46 +0000 (10:46 +0000)]
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio,pc: features, cleanups, fixes
Part of ACPI ERST support
fixes, cleanups
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 06 Feb 2022 09:36:24 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (24 commits)
util/oslib-posix: Fix missing unlock in the error path of os_mem_prealloc()
ACPI ERST: step 6 of bios-tables-test.c
ACPI ERST: bios-tables-test testcase
ACPI ERST: qtest for ERST
ACPI ERST: create ACPI ERST table for pc/x86 machines
ACPI ERST: build the ACPI ERST table
ACPI ERST: support for ACPI ERST feature
ACPI ERST: header file for ERST
ACPI ERST: PCI device_id for ERST
ACPI ERST: bios-tables-test.c steps 1 and 2
libvhost-user: Map shared RAM with MAP_NORESERVE to support virtio-mem with hugetlb
libvhost-user: handle removal of identical regions
libvhost-user: prevent over-running max RAM slots
libvhost-user: fix VHOST_USER_REM_MEM_REG not closing the fd
libvhost-user: Simplify VHOST_USER_REM_MEM_REG
libvhost-user: Add vu_add_mem_reg input validation
libvhost-user: Add vu_rem_mem_reg input validation
tests: acpi: test short OEM_ID/OEM_TABLE_ID values in test_oem_fields()
tests: acpi: update expected blobs
acpi: fix OEM ID/OEM Table ID padding
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
util/oslib-posix: Fix missing unlock in the error path of os_mem_prealloc()
We're missing an unlock in case installing the signal handler failed.
Fortunately, we barely see this error in real life.
Fixes: a960d6642d39 ("util/oslib-posix: Support concurrent os_mem_prealloc() invocation") Fixes: CID 1468941 Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Pankaj Gupta <pankaj.gupta@ionos.com> Cc: Daniel P. Berrangé <berrange@redhat.com> Cc: Michal Privoznik <mprivozn@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220111120830.119912-1-david@redhat.com> Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Note that the contents of tests/data/q35/ERST.acpierst and
tests/data/microvm/ERST.pcie are the same except for differences
due to assigned base address.
Files tests/data/pc/DSDT.acpierst and tests/data/acpi/q35/DSDT.acpierst
are new files (and are included as a result of 'make check' process).
Rather than provide the entire content, I am providing the differences
between pc/DSDT and pc/DSDT.acpierst, and the difference between
q35/DSDT and q35/DSDT.acpierst, with an explanation to follow.
diff pc/DSDT pc/DSDT.acpierst:
@@ -5,13 +5,13 @@
*
* Disassembling to symbolic ASL+ operators
*
- * Disassembly of tests/data/acpi/pc/DSDT, Thu Dec 2 10:10:13 2021
+ * Disassembly of tests/data/acpi/pc/DSDT.acpierst, Thu Dec 2 12:59:36 2021
*
* Original Table Header:
* Signature "DSDT"
- * Length 0x00001772 (6002)
+ * Length 0x00001751 (5969)
* Revision 0x01 **** 32-bit table (V1), no 64-bit math support
- * Checksum 0x9E
+ * Checksum 0x95
* OEM ID "BOCHS "
* OEM Table ID "BXPC "
* OEM Revision 0x00000001 (1)
@@ -964,16 +964,11 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS "
For both pc and q35, there is but a small difference between this
DSDT.acpierst and the corresponding DSDT. In both cases, the changes
occur under the hiearchy:
Scope (\_SB)
{
Scope (PCI0)
{
which leads me to believe that the change to the DSDT was needed
due to the introduction of the ERST PCI device.
And is explained in detail by Ani Sinha:
I have convinced myself of the changes we see in the DSDT tables.
On i440fx side, we are adding a non-hotpluggable pci device on slot 3.
So the changes we see are basically replacing an empty hotpluggable
slot on the pci root port with a non-hotplugggable device.
On q35, bsel on pcie root bus is not set (its not hotpluggable bus),
so the change basically adds the address enumeration for the device.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Acked-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-11-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:08 +0000 (15:38 -0500)]
ACPI ERST: bios-tables-test testcase
This change implements the test suite checks for the ERST table.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-10-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:07 +0000 (15:38 -0500)]
ACPI ERST: qtest for ERST
This change provides a qtest that locates and then does a simple
interrogation of the ERST feature within the guest.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-9-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:06 +0000 (15:38 -0500)]
ACPI ERST: create ACPI ERST table for pc/x86 machines
This change exposes ACPI ERST support for x86 guests.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-8-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:05 +0000 (15:38 -0500)]
ACPI ERST: build the ACPI ERST table
This builds the ACPI ERST table to inform OSPM how to communicate
with the acpi-erst device.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-7-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:04 +0000 (15:38 -0500)]
ACPI ERST: support for ACPI ERST feature
This implements a PCI device for ACPI ERST. This implements the
non-NVRAM "mode" of operation for ERST as it is supported by
Linux and Windows.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-6-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:03 +0000 (15:38 -0500)]
ACPI ERST: header file for ERST
This change introduces the public defintions for ACPI ERST.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Reviewed-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-5-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:02 +0000 (15:38 -0500)]
ACPI ERST: PCI device_id for ERST
This change reserves the PCI device_id for the new ACPI ERST
device.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Ani Sinha <ani@anisinha.ca>
Message-Id: <1643402289-22216-4-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Eric DeVolder [Fri, 28 Jan 2022 20:38:00 +0000 (15:38 -0500)]
ACPI ERST: bios-tables-test.c steps 1 and 2
Following the guidelines in tests/qtest/bios-tables-test.c, this
change adds empty placeholder files per step 1 for the new ERST
table, and excludes resulting changed files in bios-tables-test-allowed-diff.h
per step 2.
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com> Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1643402289-22216-2-git-send-email-eric.devolder@oracle.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
libvhost-user: Map shared RAM with MAP_NORESERVE to support virtio-mem with hugetlb
For fd-based shared memory, MAP_NORESERVE is only effective for hugetlb,
otherwise it's ignored. Older Linux versions that didn't support
reservation of huge pages ignored MAP_NORESERVE completely.
The first client to mmap a hugetlb fd without MAP_NORESERVE will
trigger reservation of huge pages for the whole mmapped range. There are
two cases to consider:
1) QEMU mapped RAM without MAP_NORESERVE
We're not dealing with a sparse mapping, huge pages for the whole range
have already been reserved by QEMU. An additional mmap() without
MAP_NORESERVE won't have any effect on the reservation.
2) QEMU mapped RAM with MAP_NORESERVE
We're delaing with a sparse mapping, no huge pages should be reserved.
Further mappings without MAP_NORESERVE should be avoided.
For 1), it doesn't matter if we set MAP_NORESERVE or not, so we can
simply set it. For 2), we'd be overriding QEMUs decision and trigger
reservation of huge pages, which might just fail if there are not
sufficient huge pages around. We must map with MAP_NORESERVE.
This change is required to support virtio-mem with hugetlb: a
virtio-mem device mapped into the guest physical memory corresponds to
a sparse memory mapping and QEMU maps this memory with MAP_NORESERVE.
Whenever memory in that sparse region will be accessed by the VM, QEMU
populates huge pages for the affected range by preallocating memory
and handling any preallocation errors gracefully.
So let's map shared RAM with MAP_NORESERVE. As libvhost-user only
supports Linux, there shouldn't be anything to take care of in regard of
other OS support.
Without this change, libvhost-user will fail mapping the region if there
are currently not enough huge pages to perform the reservation:
fv_panic: libvhost-user: region mmap error: Cannot allocate memory
Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Raphael Norwitz <raphael.norwitz@nutanix.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20220111123939.132659-1-david@redhat.com> Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Raphael Norwitz [Mon, 17 Jan 2022 04:12:35 +0000 (04:12 +0000)]
libvhost-user: handle removal of identical regions
Today if QEMU (or any other VMM) has sent multiple copies of the same
region to a libvhost-user based backend and then attempts to remove the
region, only one instance of the region will be removed, leaving stale
copies of the region in dev->regions[].
This change resolves this by having vu_rem_mem_reg() iterate through all
regions in dev->regions[] and delete all matching regions.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20220117041050.19718-7-raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
Raphael Norwitz [Mon, 17 Jan 2022 04:12:34 +0000 (04:12 +0000)]
libvhost-user: prevent over-running max RAM slots
When VHOST_USER_PROTOCOL_F_CONFIGURE_MEM_SLOTS support was added to
libvhost-user, no guardrails were added to protect against QEMU
attempting to hot-add too many RAM slots to a VM with a libvhost-user
based backed attached.
This change adds the missing error handling by introducing a check on
the number of RAM slots the device has available before proceeding to
process the VHOST_USER_ADD_MEM_REG message.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20220117041050.19718-6-raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
libvhost-user: fix VHOST_USER_REM_MEM_REG not closing the fd
We end up not closing the file descriptor, resulting in leaking one
file descriptor for each VHOST_USER_REM_MEM_REG message.
Fixes: 875b9fd97b34 ("Support individual region unmap in libvhost-user") Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Raphael Norwitz <raphael.norwitz@nutanix.com> Cc: "Marc-André Lureau" <marcandre.lureau@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Coiby Xu <coiby.xu@gmail.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20220117041050.19718-5-raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's avoid having to manually copy all elements. Copy only the ones
necessary to close the hole and perform the operation in-place without
a second array.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David Hildenbrand <david@redhat.com> Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20220117041050.19718-4-raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Today if multiple FDs are sent from the VMM to the backend in a
VHOST_USER_ADD_MEM_REG message, one FD will be mapped and the remaining
FDs will be leaked. Therefore if multiple FDs are sent we report an
error and fail the operation, closing all FDs in the message.
Likewise in case the VMM sends a message with a size less than that
of a memory region descriptor, we add a check to gracefully report an
error and fail the operation rather than crashing.
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20220117041050.19718-3-raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
Today if multiple FDs are sent from the VMM to the backend in a
VHOST_USER_REM_MEM_REG message, one FD will be unmapped and the remaining
FDs will be leaked. Therefore if multiple FDs are sent we report an
error and fail the operation, closing all FDs in the message.
Likewise in case the VMM sends a message with a size less than that of a
memory region descriptor, we add a check to gracefully report an error
and fail the operation rather than crashing.
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Message-Id: <20220117041050.19718-2-raphael.norwitz@nutanix.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com>
Igor Mammedov [Fri, 14 Jan 2022 14:26:41 +0000 (09:26 -0500)]
tests: acpi: test short OEM_ID/OEM_TABLE_ID values in test_oem_fields()
Previous patch [1] added explicit whitespace padding to OEM_ID/OEM_TABLE_ID
values used in test_oem_fields() testcase to avoid false positive and
bisection issues when QEMU is switched to \0' padding. As result
testcase ceased to test values that were shorter than max possible
length values.
Update testcase to make sure that it's testing shorter IDs like it
used to before [2].
1) "tests: acpi: manually pad OEM_ID/OEM_TABLE_ID for test_oem_fields() test"
2) 602b458201 ("acpi: Permit OEM ID and OEM table ID fields to be changed")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220114142641.1727679-1-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Wed, 12 Jan 2022 13:03:32 +0000 (08:03 -0500)]
tests: acpi: update expected blobs
Expected changes caused by previous commit:
nvdimm ssdt (q35/pc/virt):
- * OEM Table ID "NVDIMM "
+ * OEM Table ID "NVDIMM"
SLIC test FADT (tests/data/acpi/q35/FACP.slic):
-[010h 0016 8] Oem Table ID : "ME "
+[010h 0016 8] Oem Table ID : "ME"
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220112130332.1648664-5-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor Mammedov [Wed, 12 Jan 2022 13:03:31 +0000 (08:03 -0500)]
acpi: fix OEM ID/OEM Table ID padding
Commit [2] broke original '\0' padding of OEM ID and OEM Table ID
fields in headers of ACPI tables. While it doesn't have impact on
default values since QEMU uses 6 and 8 characters long values
respectively, it broke usecase where IDs are provided on QEMU CLI.
It shouldn't affect guest (but may cause licensing verification
issues in guest OS).
One of the broken usecases is user supplied SLIC table with IDs
shorter than max possible length, where [2] mangles IDs with extra
spaces in RSDT and FADT tables whereas guest OS expects those to
mirror the respective values of the used SLIC table.
Fix it by replacing whitespace padding with '\0' padding in
accordance with [1] and expectations of guest OS
1) ACPI spec, v2.0b
17.2 AML Grammar Definition
...
//OEM ID of up to 6 characters. If the OEM ID is
//shorter than 6 characters, it can be terminated
//with a NULL character.
2) Fixes: 602b458201 ("acpi: Permit OEM ID and OEM table ID fields to be changed")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/707 Reported-by: Dmitry V. Orekhov <dima.orekhov@gmail.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Cc: qemu-stable@nongnu.org
Message-Id: <20220112130332.1648664-4-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Ani Sinha <ani@anisinha.ca> Tested-by: Dmitry V. Orekhov dima.orekhov@gmail.com
Igor Mammedov [Wed, 12 Jan 2022 13:03:30 +0000 (08:03 -0500)]
tests: acpi: whitelist nvdimm's SSDT and FACP.slic expected blobs
The next commit will revert OEM fields whitespace padding to
padding with '\0' as it was before [1]. That will change OEM
Table ID for:
* SSDT.*: where it was padded from 6 characters to 8
* FACP.slic: where it was padded from 2 characters to 8
after reverting whitespace padding, it will be replaced with
'\0' which effectively will shorten OEM table ID to 6 and 2
characters.
Whitelist affected tables before introducing the change.
1) 602b458201 ("acpi: Permit OEM ID and OEM table ID fields to be changed") Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20220112130332.1648664-3-imammedo@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>