With the stack mapped on a per-CPU basis there's no risk of other CPUs being
able to read the stack contents, but vCPUs running on the current pCPU could
read stack rubble from operations of previous vCPUs.
The #DF stack is not zeroed because handling of #DF results in a panic.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
x86/mm: switch to a per-CPU mapped stack when using ASI
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
There's further work required in order to allocate the stack from domheap
instead of xenheap, otherwise the per-CPU mapping is not really helpful.
Should the stack always be mapped in the per-CPU VA range regardless of whether
ASI is active on the system for any domain? That might simplify some of the
logic.
x86/pv: allow using a unique per-pCPU root page table
When running PV guests it's possible for the guest to use the same root page
table (L4) for all vCPUs, which in turn will result in Xen also using the same
root page table on all pCPUs that are running any domain vCPU.
With XPTI Xen switches to a per-CPU shadow L4 when running in guest context,
switching to the fully populated L4 when in Xen context.
Take advantage of this existing shadowing and force the usage of a per-CPU L4
that shadows the guest selected L4 when Address Space Isolation is requested
for PV guests.
In order to map the guest L4 in a per-CPU slot the CPU needs to be using a
per-CPU L4. Account for this and only attempt to map the guest L4 once the CPU
has already loaded the per-CPU shadow L4.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Wed, 26 Jun 2024 12:50:04 +0000 (14:50 +0200)]
x86/mm: introduce a per-CPU fixmap area
Introduce the logic to manage a per-CPU fixmap area. This includes adding a
new set of headers that are capable of creating mappings in the per-CPU
page-table regions by making use of map_pages_to_xen_cpu().
This per-CPU fixmap area is currently set to use one L3 slot: 1GiB of linear
address space.
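The 1GiB figure follows from the x86-64 paging geometry. A minimal worked calculation, assuming standard 4-level paging with 4KiB pages and 512 entries per table (the constant names are illustrative, not Xen's):

```python
# x86-64 4-level paging: 4 KiB pages (12 offset bits) and 512 entries
# per page table (9 index bits per level).
PAGE_SHIFT = 12
TABLE_INDEX_BITS = 9

# One L1 entry maps one 4 KiB page, one L2 entry maps 512 of those (2 MiB),
# and one L3 entry maps 512 L2 ranges.
l3_slot_bytes = 1 << (PAGE_SHIFT + 2 * TABLE_INDEX_BITS)

print(l3_slot_bytes // (1 << 30), "GiB per L3 slot")
```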
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Wed, 26 Jun 2024 14:27:57 +0000 (16:27 +0200)]
x86/mm: allow modifying per-CPU entries of remote page-tables
Add support for modifying the per-CPU page-table entries of remote CPUs. This
will be required in order to set up the page-tables of CPUs before bringing them
up. A restriction is added so that remote page-tables can only be modified as
long as the remote CPU is not yet online.
No functional change, as there's no user introduced that modifies remote
page-tables.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Can be merged with previous patch?
Roger Pau Monne [Wed, 26 Jun 2024 09:07:00 +0000 (11:07 +0200)]
x86/mm: introduce support to populate a per-CPU page-table region
Add logic in map_pages_to_xen() and modify_xen_mappings() so that TLB flushes
are only performed locally when dealing with entries in the per-CPU area of the
page-tables.
No functional change intended, as there are no callers added that set up or
modify per-CPU mappings, nor is the per-CPU area properly set up in
the page-tables yet.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Tue, 25 Jun 2024 16:16:40 +0000 (18:16 +0200)]
x86/mm: introduce usage of a per-CPU L3 for the per-domain slot
So far slot 260 has always been per-domain, ie: all vCPUs of a domain share the
same L3. Currently only 3 slots are used in that L3, which leaves plenty of
room.
Introduce a per-CPU L3, which gets populated with the running domain L3 slots,
basically being a mem copy of the contents of d->arch.perdomain_l3_pg.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Fri, 21 Jun 2024 13:28:49 +0000 (15:28 +0200)]
x86/idle: use a per-pCPU L4
Don't share the same L4 (currently idle_pg_table) across all the idle vCPUs.
Instead have a single L4 per-pCPU, as that allows having per-pCPU mappings.
This change only switches to a per-pCPU idle L4, but it should still be a clone
of idle_pg_table, hence no functional change expected.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Mon, 17 Jun 2024 16:02:34 +0000 (18:02 +0200)]
x86/hvm: use a per-pCPU monitor table in shadow mode
Instead of allocating a monitor table for each vCPU when running in HVM shadow
mode, use a per-pCPU monitor table, which gets the per-domain slot updated on
guest context switch.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Mon, 17 Jun 2024 16:02:02 +0000 (18:02 +0200)]
x86/hvm: use a per-pCPU monitor table in HAP mode
Instead of allocating a monitor table for each vCPU when running in HVM HAP
mode, use a per-pCPU monitor table, which gets the per-domain slot updated on
guest context switch.
This limits the amount of memory used for HVM HAP monitor tables to the amount
of active pCPUs, rather than to the number of vCPUs. It also simplifies vCPU
allocation and teardown, since the monitor table handling is removed from
there.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
x86/mm: move FLUSH_ROOT_PGTBL handling before TLB flush
Move the handling of FLUSH_ROOT_PGTBL in flush_area_local() ahead of the logic
that does the TLB flushing, in preparation for further changes requiring the
TLB flush to be strictly done after having handled FLUSH_ROOT_PGTBL.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Fri, 14 Jun 2024 10:41:04 +0000 (12:41 +0200)]
x86/pv: untie issuing FLUSH_ROOT_PGTBL from XPTI
The current logic gates issuing flush TLB requests with the FLUSH_ROOT_PGTBL
flag to XPTI being enabled.
In preparation for FLUSH_ROOT_PGTBL also being needed when not using XPTI,
untie it from the xpti domain boolean and instead introduce a new flush_root_pt
field.
No functional change intended, as flush_root_pt == xpti.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
x86/spec-ctrl: initialize per-domain XPTI in spec_ctrl_init_domain()
XPTI being a speculation mitigation, it feels better for it to be initialized in
spec_ctrl_init_domain().
No functional change intended, although the call to spec_ctrl_init_domain() in
arch_domain_create() needs to be moved ahead of pv_domain_initialise() for
d->arch.pv.xpti to be correctly set.
Move it ahead of most of the initialization functions, since
spec_ctrl_init_domain() doesn't depend on any value in the struct domain being
set.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Roger Pau Monne [Tue, 18 Jun 2024 14:51:51 +0000 (16:51 +0200)]
x86/dom0: only disable SMAP for the PV dom0 build
The PVH dom0 builder doesn't switch page tables and has no need to run with
SMAP disabled.
Put the SMAP disabling close to the code region where it's necessary, as it
then becomes obvious why switch_cr3_cr4() is required instead of
write_ptbase().
Note removing SMAP from cr4_pv32_mask is not required, as we never jump into
guest context, and hence updating the value of cr4_pv32_mask is not relevant.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Tue, 16 Jul 2024 12:09:14 +0000 (14:09 +0200)]
x86/IRQ: avoid double unlock in map_domain_pirq()
Ever since its introduction, the main loop in the function dealing
with multi-vector MSI had error exit points ("break") with different
properties: In one case no IRQ descriptor lock is being held.
Nevertheless the subsequent error cleanup path assumed such a lock would
uniformly need releasing. Identify the case by setting "desc" to NULL,
thus allowing the unlock to be skipped as necessary.
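The pattern can be sketched as follows. This is not Xen's code: `map_vectors()` and its parameters are hypothetical, and Python's `threading.Lock` stands in for the IRQ descriptor lock. The point is only that the shared cleanup path consults `desc` before unlocking.

```python
import threading


def map_vectors(nr, fail_before_lock_at=None, fail_with_lock_at=None):
    """Hypothetical model of a loop with two kinds of error exit."""
    locks = [threading.Lock() for _ in range(nr)]
    desc = None   # last descriptor lock taken (may be stale after release)
    error = None

    for i in range(nr):
        if i == fail_before_lock_at:
            # Error exit where no descriptor lock is held: flag it.
            desc = None
            error = "no-lock failure"
            break
        desc = locks[i]
        desc.acquire()
        if i == fail_with_lock_at:
            # Error exit where the descriptor lock is still held.
            error = "locked failure"
            break
        desc.release()

    # Shared error cleanup: unlock only if a lock is really held.
    if error is not None and desc is not None:
        desc.release()

    return error, [lock.locked() for lock in locks]
```

Without the `desc = None` assignment, the first error exit would reach the cleanup with a stale `desc` and release a lock that is not held, which `threading.Lock` reports as a `RuntimeError`.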
This is CVE-2024-31143 / XSA-458.
Coverity ID: 1605298 Fixes: d1b6d0a02489 ("x86: enable multi-vector MSI") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Thu, 11 Jul 2024 15:09:58 +0000 (16:09 +0100)]
CI: Add Ubuntu 22.04 (Jammy) and 24.04 (Noble) testing
The containers are exactly as per 20.04 (Focal). However, this now brings us
to 5 releases * 4 build jobs worth of Ubuntu testing, which is overkill.
The oldest and newest toolchains are the most likely to find problems with new
code, so reduce the middle 3 releases (18/20/22) to just a single smoke test
each.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 11 Jul 2024 15:09:22 +0000 (16:09 +0100)]
CI: Refresh Ubuntu Focal container as 20.04-x86_64
As with 16.04 (Xenial), with python3-setuptools included. Having this package
only in some containers was intentional; see commit bbc72a7877d8 ("automation:
Add python3's setuptools to some containers") for the rationale.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 13:37:53 +0000 (14:37 +0100)]
CI: Refresh OpenSUSE Leap container
See prior patch for most discussion.
Despite appearing to be a fixed release (and therefore not marked as permitted
failure), the dockerfile references the `leap` tag which is rolling in
practice. Switch to 15.6 explicitly, for better test stability.
Vs tumbleweed, use `zypper update` rather than dist-upgrade, and retain the
RomBIOS dependencies; bin86 and dev86.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 13:40:23 +0000 (14:40 +0100)]
CI: Refresh OpenSUSE Tumbleweed container
Existing as suse:opensuse-tumbleweed is a historical quirk, and adjusted for
consistency with all the other containers.
Make it non-root, use heredocs for legibility, and use the zypper long names
for the benefit of those wondering what was being referenced or duplicated.
Trim the dependencies substantially. Testing docs isn't very interesting and
saves a lot of space. Other savings come from removing a huge pile of
optional QEMU dependencies (QEMU just needs to build the Xen parts to be
useful here, not have a full GUI environment).
Finally, there were some packages such as bc, libssh2-devel, libtasn1-devel
and nasm that I'm not aware of any reason to have had, even historically.
Furthermore, identify which components of the build use which dependencies,
which will help managing them in the future.
Thanks to Olaf Hering for dependency fixes that have been subsumed into this
total overhaul.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 9 Jul 2024 14:54:52 +0000 (15:54 +0100)]
CI: Refresh and upgrade the GCC-IBT container
Upgrade from Debian buster to bookworm, GCC 11.3 to 11.4 and to be a non-root
container.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:18:22 +0000 (18:18 +0100)]
CI: Refresh bullseye-ppc64le as debian:11-ppc64le
... in the style of debian:12-ppc64le.
Rename the jobs and reposition them later as they're not a dependency for the
smoke testing any more.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:17:25 +0000 (18:17 +0100)]
CI: Use debian:12-ppc64le for smoke testing
qemu-system-ppc64/8.1.0-ppc64 was added because bullseye's QEMU didn't
understand the powernv9 machine. However bookworm's QEMU does and this is
preferable to maintaining a random build of QEMU ourselves.
Use the debian:12-ppc64le container and test the output of that build too.
Remove qemu-system-ppc64-8.1.0-ppc64-export which is unused now.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:00:21 +0000 (18:00 +0100)]
CI: Introduce a debian:12-ppc64le container
... conforming to the new naming scheme; $DISTRO-$VERSION-$ARCH-* so the jobs
sort more coherently.
Make it non-root by default, and set XEN_TARGET_ARCH=ppc64. Include QEMU too,
which will be used subsequently.
Add build jobs too, with debian-12-ppc64le-gcc-debug specifically early as it
will be used for smoke testing shortly.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 12:38:52 +0000 (13:38 +0100)]
CI: Mark Archlinux/x86 as allowing failures
Archlinux is a rolling distro. As a consequence, rebuilding the container
periodically changes the toolchain, and this affects all stable branches in
one go.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 00:01:13 +0000 (01:01 +0100)]
CI: Drop Ubuntu Trusty testing
This is also End of Life.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 9 Jul 2024 23:26:56 +0000 (00:26 +0100)]
CI: Drop Debian Stretch testing
Debian stretch is also End of Life. Update a couple of test steps to use
bookworm instead.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 9 Jul 2024 23:02:47 +0000 (00:02 +0100)]
CI: Drop Debian Jessie dockerfiles
These were removed from testing in Xen 4.18.
Fixes: 3817e3c1b4b8 ("automation: Remove testing on Debian Jessie") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
as PPC64 doesn't want randconfig right now, and buster-gcc-ibt is a special
job with a custom compiler.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:00:49 +0000 (18:00 +0100)]
CI: Fix CONTAINER_UID0=1 scripts/containerize
Right now, most build containers use root. Archlinux, Fedora and Yocto set up
a regular user called `user`.
For those containers, trying to containerize as root fails, because
CONTAINER_UID0=1 does nothing, whereas CONTAINER_UID0=0 forces the user away
from root.
To make CONTAINER_UID0=1 work reliably, force to root if requested.
Fixes: 17fbe6504dfd ("automation: introduce a new variable to control container user") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 4 Jul 2024 12:09:21 +0000 (13:09 +0100)]
build: Drop xorg-x11 as a build dependency
The history on this one is complicated. The note to README was added in
commit 1f95747a4f16 ("Add openssl-dev and xorg-x11-dev to README") in 2007.
At the time, there was a vendored version of Qemu in xen.git with a local
modification using <X11/keysymdef.h> to access the monitor console over VNC.
The final reference to keysymdef.h was dropped in commit 85896a7c4dc7 ("build:
add autoconf to replace custom checks in tools/check") in 2012. The next
prior mention was in 2009 with commit a8ccb671c377 ("tools: fix x11 check")
noting that x11 was not a direct dependency of Xen; it was transitive through
SDL for Qemu for source-based distros.
It appears there may have been other unspecified dependencies on xorg,
e.g. the use of lndir by unmodified_drivers which are no longer relevant
either.
These days it's only the Debian-based dockerfiles which install xorg-x11, and
Qemu builds fine in these and others without x11.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 20:02:20 +0000 (21:02 +0100)]
CI: Refresh the Coverity Github Action configuration
Update to Ubuntu 24.04, and checkout@v4 as v2 is deprecated.
The build step goes out of its way to exclude docs and stubdom (but include
plain MiniOS), so disable those at the ./configure stage.
Refresh the package list. libbz2-dev was in there twice, and e2fslibs-dev is
a transitional package to libext2fs-dev. I'm not aware of libtool ever
having been a Xen dependency.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 4 Jul 2024 12:08:40 +0000 (13:08 +0100)]
build: Fix the version of python checked for by ./configure
We previously upped the minimum python version to 2.7, but neglected to
reflect this in ./configure
Fixes: 2a353c048c68 ("tools: Don't use distutils in configure or Makefile") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 17:21:09 +0000 (18:21 +0100)]
build: Regenerate ./configure with Autoconf 2.71
This is the version now found in Debian Bookworm.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
x86/physdev: Return pirq that irq was already mapped to
Fix a bug introduced by 0762e2502f1f ("x86/physdev: factor out the code to allocate and
map a pirq"). After that refactoring, when pirq<0 and current_pirq>0, i.e. the
caller wants a free pirq allocated for an irq that already has a mapped pirq, the
function returns the negative pirq and the call fails. The logic before the
refactoring was different: it returned the current_pirq that the irq was already
mapped to, so the call succeeded. Return current_pirq in that case again.
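A minimal model of the intended behaviour, assuming nothing about the real function beyond what the description states. The name `allocate_pirq` here, its parameters, and the `get_free_pirq` callback are hypothetical, and other cases (such as a conflicting explicit pirq) are deliberately left out:

```python
def allocate_pirq(pirq, current_pirq, get_free_pirq):
    """Hypothetical sketch of the allocation decision, not the Xen code."""
    if pirq < 0:
        # The caller asked for "any free pirq".
        if current_pirq > 0:
            # The irq is already mapped: hand back the existing pirq
            # instead of propagating the negative request value.
            return current_pirq
        return get_free_pirq()
    # The caller asked for a specific pirq.
    return pirq
```

For example, `allocate_pirq(-1, 7, lambda: 42)` yields the already mapped pirq 7, while `allocate_pirq(-1, 0, lambda: 42)` falls through to the free-pirq allocator.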
Fixes: 0762e2502f1f ("x86/physdev: factor out the code to allocate and map a pirq") Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 11:06:46 +0000 (12:06 +0100)]
CI: Rework the CentOS7 container
CentOS 7 is fully End-of-life as of 2024-06-30, and the Yum repo configuration
points at URLs which have become non-existent.
First, start by using a heredoc RUN for legibility. It's important to use
`set -e` to offset the fact that we're no longer chaining every command
together with an &&.
Also, because we're using a single RUN command to perform all RPM operations,
we no longer need to work around the OverlayFS bug.
Adjust the CentOS-*.repo files to point at vault.centos.org. This also
involves swapping mirrorlist= for baseurl= in the yum config.
Use a minor bashism to express the dependencies more coherently, and identify
why we have certain dependencies. Some adjustments are:
* We need bzip2-devel for the dombuilder. bzip2 needs retaining for stubdom, or
`tar` fails to unpack the .bz2 archives.
* {lzo,lz4,zstd}-devel are new optional dependencies since the last time this
package list was refreshed.
* openssl-devel hasn't been a dependency since Xen 4.6.
* We long ago ceased being able to build Qemu and SeaBIOS in this container,
so drop their dependencies too.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
For inline files, use COPY with a heredoc, rather than opencoding it through
/bin/sh.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 2 Jul 2024 13:34:36 +0000 (14:34 +0100)]
CI: Formalise the use of heredocs
Commit b5739330d7f4 introduced the use of heredocs in the jessie/stretch
dockerfiles.
It turns out this was introduced by BuildKit in 2018 along with a
standardisation of Dockerfile syntax, and has subsequently been adopted by the
docker community.
Annotate all dockerfiles with a statement of the syntax in use, and extend
README.md details including how to activate BuildKit when it's available but
off by default.
This allows the containers to be rebuilt following commit a0e29b316363 ("CI:
Drop glibc-i386 from the build containers").
Fixes: b5739330d7f4 ("automation: fix jessie/stretch images to use archive.debian.org apt repos") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Python regexes should use raw strings. Convert all regexes, and drop escaped
backslashes. Note that regular escape sequences are interpreted normally when
parsing a regex, so \n even in a raw-string regex is a newline.
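A short illustration of both points, using only the standard `re` module (the patterns are made up for the example and are not taken from the converted scripts):

```python
import re

# Raw string: the regex engine receives backslash + "d" and treats it as
# the digit class.
version = re.compile(r"(\d+)\.(\d+)")

# The equivalent non-raw spelling needs every backslash doubled.
version_escaped = re.compile("(\\d+)\\.(\\d+)")

# r"\n" is two characters (backslash, "n"), but the regex engine interprets
# that escape itself, so the pattern still matches a newline.
newline = re.compile(r"\n")

print(version.match("4.19").groups())
print(newline.search("a\nb") is not None)
```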
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 20:59:34 +0000 (21:59 +0100)]
build/mkheader: Remove C-isms from the code
This was clearly written by a C programmer, rather than a python programmer.
Drop all the useless semi-colons.
The very final line of the script simply references f.close, rather than
calling the function. Switch to using a with: statement, as python does care
about unclosed files if you enable enough warnings.
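The difference between referencing and calling `close`, and the `with` form that avoids the problem, can be seen in a small standalone example (the file name and contents are placeholders):

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "out.h")

f = open(path, "w")
f.write("/* generated */\n")
f.close                      # only evaluates the bound method; nothing is closed
still_open = not f.closed    # the file is in fact still open here
f.close()                    # the call is what closes the file

# A with: statement closes the file on leaving the block, even on error.
with open(path, "a") as g:
    g.write("#define DONE 1\n")
```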
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 22:01:11 +0000 (23:01 +0100)]
tools/xs-clients: Fix `make clean` rule
Prior to the split, "the clients" used tools/xenstored/Makefile.common whose
clean rule includes *.o whereas after the split, the removal of *.o was lost
by virtue of not including Makefile.common any more.
This is the bug behind the following build error:
make[2]: Entering directory '/local/xen.git/tools/xs-clients'
gcc xenstore_client.o (snip)
/usr/bin/ld: xenstore_client.o: relocation R_X86_64_32S against `.rodata' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:35: xenstore] Error 1
which was caused by `make clean` not properly cleaning the tree as I was
swapping between various build containers.
Switch to a plain single-colon clean rule.
Fixes: 5c293058b130 ("tools/xenstore: move xenstored sources into dedicated directory") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
xen/riscv: use .insn with operands to support the older gas
Support for specifying "raw" insns was added only in gas 2.38.
To support older versions it would be better to switch to .insn
with operands.
The following compilation error occurs:
./arch/riscv/include/asm/processor.h: Assembler messages:
./arch/riscv/include/asm/processor.h:70: Error: unrecognized opcode `0x0100000F'
In case of the following Binutils:
$ riscv64-linux-gnu-as --version
GNU assembler (GNU Binutils for Debian) 2.35.2
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Drop the 0, which is in line with how we annotate RCs elsewhere.
Fixes: 4a73eb4c205d ("Update Xen version to 4.19-rc") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Wed, 3 Jul 2024 12:04:15 +0000 (14:04 +0200)]
cmdline: "extra_guest_irqs" is inapplicable to PVH
PVH in particular has no (externally visible) notion of pIRQ-s. Mention
that in the description of the respective command line option and have
arch_hwdom_irqs() also reflect this (thus suppressing the log message
there as well, as being pretty meaningless in this case anyway).
Suggested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Wed, 3 Jul 2024 12:03:27 +0000 (14:03 +0200)]
amend 'cmdline: document and enforce "extra_guest_irqs" upper bounds'
Address late review comments for what is now commit 17f6d398f765:
- bound max_irqs right away against nr_irqs
- introduce a #define for a constant used twice
Requested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 10:01:59 +0000 (12:01 +0200)]
xen: avoid UB in guest handle field accessors
Much like noted in 43d5c5d5f70b ("xen: avoid UB in guest handle
arithmetic"), address calculations involved in accessing a struct field
can overflow, too. Cast respective pointers to "unsigned long" and
convert type checking accordingly. Remaining arithmetic is, despite
there possibly being mathematical overflow, okay as per the C99 spec:
"A computation involving unsigned operands can never overflow, because a
result that cannot be represented by the resulting unsigned integer type
is reduced modulo the number that is one greater than the largest value
that can be represented by the resulting type." The overflow that we
need to guard against is checked for in array_access_ok().
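The quoted C99 rule can be modelled with explicit masking. This is only a sketch: `ulong_add()` and `range_ok()` are hypothetical helpers assuming a 64-bit "unsigned long", and `range_ok()` is not the definition of array_access_ok(), merely the kind of wrap-around check such a helper has to perform.

```python
BITS = 64
MASK = (1 << BITS) - 1   # largest value of a 64-bit unsigned type


def ulong_add(a, b):
    # Unsigned arithmetic never overflows: the result is reduced modulo
    # one more than the largest representable value.
    return (a + b) & MASK


def range_ok(base, size, limit):
    # Hypothetical bounds check: reject ranges whose end wrapped around
    # or lies beyond the limit.
    end = ulong_add(base, size)
    return base <= end <= limit
```

Here `ulong_add(MASK, 1)` wraps to 0, which is well defined, while `range_ok(MASK - 4, 16, MASK)` rejects the range because its end wrapped below its base.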
While there add the missing (see {,__}copy_to_guest_offset()) is-not-
const checks to {,__}copy_field_to_guest().
Typically, but not always, no change to generated code; code generation
(register allocation) is different for at least common/grant_table.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 10:01:21 +0000 (12:01 +0200)]
x86/entry: don't clear DF when raising #UD for lack of syscall handler
While doing so is intentional when invoking the actual callback, to
mimic a hard-coded SYSCALL_MASK / FMASK MSR, the same should not be done
when no handler is available and hence #UD is raised.
Fixes: ca6fcf4321b3 ("x86/pv: Inject #UD for missing SYSCALL callbacks") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 10:00:27 +0000 (12:00 +0200)]
cmdline: document and enforce "extra_guest_irqs" upper bounds
PHYSDEVOP_pirq_eoi_gmfn_v<N> accepting just a single GFN implies that no
more than 32k pIRQ-s can be used by a domain on x86. Document this upper
bound.
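The 32k figure can be reproduced with one line of arithmetic, assuming (as the limit implies) that the single guest frame is a 4KiB page used as a bitmap with one bit per pIRQ:

```python
PAGE_SIZE = 4096      # bytes in the single frame named by the GFN
BITS_PER_BYTE = 8

# One bit per pIRQ in a one-page bitmap.
max_pirqs = PAGE_SIZE * BITS_PER_BYTE

print(max_pirqs)
```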
To also enforce the limit, (ab)use both arch_hwdom_irqs() (changing its
parameter type) and setup_system_domains(). This is primarily to avoid
exposing the two static variables or introducing yet further arch hooks.
While touching arch_hwdom_irqs() also mark it hwdom-init.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 28 Jun 2024 13:04:30 +0000 (14:04 +0100)]
tools/libxs: Fix CLOEXEC handling in xs_fileno()
xs_fileno() opens a pipe on first use to communicate between the watch thread
and the main thread. Nothing ever sets CLOEXEC on the file descriptors.
Check for the availability of the pipe2() function with configure. Despite
starting life as Linux-only, FreeBSD and NetBSD have gained it.
When pipe2() isn't available, try our best with pipe() and set_cloexec().
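The same strategy, sketched with Python's `os` and `fcntl` modules rather than the C library calls. `cloexec_pipe()` is a hypothetical name, and the helper mirrors the set_cloexec() idea from the description rather than the libxs implementation:

```python
import fcntl
import os


def set_cloexec(fd):
    # Read-modify-write of the descriptor flags to add FD_CLOEXEC.
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def cloexec_pipe():
    if hasattr(os, "pipe2"):
        # Preferred: both ends are created with close-on-exec already set.
        return os.pipe2(os.O_CLOEXEC)
    # Best effort: between pipe() and fcntl() another thread could
    # fork and exec, leaking the descriptors.
    r, w = os.pipe()
    set_cloexec(r)
    set_cloexec(w)
    return r, w
```

The fallback is only "best effort" because setting the flag after creation is not atomic, which is the reason pipe2() is preferred where available.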
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Andrew Cooper [Fri, 28 Jun 2024 13:10:12 +0000 (14:10 +0100)]
tools/libxs: Fix CLOEXEC handling in get_dev()
Move the O_CLOEXEC compatibility outside of an #ifdef USE_PTHREAD block.
Introduce set_cloexec() to wrap fcntl() setting FD_CLOEXEC. It will be reused
for other CLOEXEC fixes too.
Use set_cloexec() when O_CLOEXEC isn't available as a best-effort fallback.
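A sketch of that open-time fallback in Python. `open_cloexec()` is a hypothetical name; note that Python itself already creates descriptors as non-inheritable, so the code only mirrors the shape of the C pattern:

```python
import fcntl
import os


def set_cloexec(fd):
    # Wrap fcntl() to set FD_CLOEXEC on an existing descriptor.
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def open_cloexec(path, flags):
    # Use O_CLOEXEC when the platform defines it, so the flag is set
    # atomically at open time.
    o_cloexec = getattr(os, "O_CLOEXEC", 0)
    fd = os.open(path, flags | o_cloexec)
    if not o_cloexec:
        # Best-effort fallback when O_CLOEXEC is unavailable.
        set_cloexec(fd)
    return fd
```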
Fixes: f4f2f3402b2f ("tools/libxs: Open /dev/xen/xenbus fds as O_CLOEXEC") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Andrew Cooper [Thu, 27 Jun 2024 12:22:14 +0000 (13:22 +0100)]
tools/dombuilder: Correct the length calculation in xc_dom_alloc_segment()
xc_dom_alloc_segment() is passed a size in bytes, calculates a size in pages
from it, then fills in the new segment information with a bytes value
re-calculated from the number of pages.
This causes the module information given to the guest (MB, or PVH) to have
incorrect sizes; specifically, sizes rounded up to the next page.
This in turn is problematic for Xen. When Xen finds a gzipped module, it
peeks at the end metadata to judge the decompressed size, which is a -4
backreference from the reported end of the module.
Fill in seg->vend using the correct number of bytes.
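Why the page-rounded size breaks the gzip size probe can be shown with a small experiment. The gzip trailer facts are standard (the last 4 bytes hold the uncompressed length, little-endian, modulo 2^32); the zero padding after the module and the helper names are assumptions made for the illustration:

```python
import gzip
import struct

PAGE_SIZE = 4096


def pages_for(nbytes):
    # Number of whole pages needed to hold nbytes.
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE


payload = b"x" * 10000
module = gzip.compress(payload)

exact_len = len(module)                          # size in bytes, as passed in
rounded_len = pages_for(exact_len) * PAGE_SIZE   # bytes re-derived from pages

# The segment occupies whole pages; assume the tail past the module is zero.
segment = module.ljust(rounded_len, b"\0")


def isize(buf, reported_len):
    # Read the gzip ISIZE field at a -4 offset from the reported end.
    return struct.unpack("<I", buf[reported_len - 4:reported_len])[0]


print(isize(segment, exact_len), isize(segment, rounded_len))
```

With the exact length the probe recovers the real decompressed size; with the rounded length it reads padding instead.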
Fixes: ea7c8a3d0e82 ("libxc: reorganize domain builder guest memory allocator") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
The GitLab CI randconfig job for RISC-V failed with an error:
common/trace.c:57:22: error: expected '=', ',', ';', 'asm' or
'__attribute__' before '__read_mostly'
57 | static u32 data_size __read_mostly;
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@cloud.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 06:35:56 +0000 (08:35 +0200)]
pirq_cleanup_check() leaks
Its original introduction had two issues: For one the "common" part of
the checks (carried out in the macro) was inverted. And then after
removal from the radix tree the structure wasn't scheduled for freeing.
(All structures still left in the radix tree would be freed upon domain
destruction, though.)
For the freeing to be safe even if it didn't use RCU (i.e. to avoid use-
after-free), re-arrange checks/operations in evtchn_close(), such that
the pointer wouldn't be used anymore after calling pirq_cleanup_check()
(noting that unmap_domain_pirq_emuirq() itself calls the function in the
success case).
Fixes: c24536b636f2 ("replace d->nr_pirqs sized arrays with radix tree") Fixes: 79858fee307c ("xen: fix hvm_domain_use_pirq's behavior") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
George Dunlap [Wed, 26 Jun 2024 15:07:30 +0000 (16:07 +0100)]
MAINTAINERS: Step down as maintainer and committer
Remain a Reviewer on the golang bindings and scheduler for now (using
a xenproject.org alias), since there may be architectural decisions I
can shed light on.
Remove the XENTRACE section entirely, as there's no obvious candidate
to take it over; having the respective parts fall back to the tools
and The Rest seems the most reasonable option.
Nicola Vetrini [Thu, 27 Jun 2024 11:48:08 +0000 (13:48 +0200)]
x86/traps: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Remove from the ECLAIR integration scripts an unused option, which
was already ignored, and make the help texts consistent
with the rest of the scripts.
Nicola Vetrini [Thu, 27 Jun 2024 11:47:16 +0000 (13:47 +0200)]
x86/irq: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
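As a small illustration of the hazard Rule 20.7 guards against, the sketch below contrasts an unparenthesized macro parameter with a parenthesized one. SCALE_BAD, SCALE_GOOD and the wrapper functions are made-up names for this example and do not come from the Xen tree.

```c
/* Non-compliant: the parameters are expanded without parentheses. */
#define SCALE_BAD(x, k)   (x * k)

/* Compliant with Rule 20.7: each expanded parameter is parenthesized. */
#define SCALE_GOOD(x, k)  ((x) * (k))

/* Expands to (a + b * k): operator precedence changes the meaning. */
static int scale_bad(int a, int b, int k)
{
    return SCALE_BAD(a + b, k);
}

/* Expands to ((a + b) * (k)): the argument is evaluated as a unit. */
static int scale_good(int a, int b, int k)
{
    return SCALE_GOOD(a + b, k);
}
```

For a = 1, b = 2, k = 3 the first wrapper yields 7 and the second yields 9.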
Nicola Vetrini [Thu, 27 Jun 2024 11:46:57 +0000 (13:46 +0200)]
automation/eclair_analysis: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses".
The local helpers GRP2 and XADD in the x86 emulator use their first
argument as the constant expression for a case label. This pattern
is deviated project-wide, because it is very unlikely to induce
developer confusion and result in the wrong control flow being
carried out.
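The pattern being deviated can be sketched like this. HANDLE_OP and decode() are hypothetical and only mimic the shape described above (a macro whose first argument ends up as a case label); they are not the emulator's GRP2 or XADD helpers.

```c
/*
 * Hypothetical helper: 'code' is used directly as the constant
 * expression of a case label, while 'result' is an ordinary expression
 * and is parenthesized as usual.
 */
#define HANDLE_OP(code, result) \
    case code:                  \
        ret = (result);         \
        break

static int decode(int op)
{
    int ret = -1;

    switch (op) {
    HANDLE_OP(1, 10);
    HANDLE_OP(2, 20);
    default:
        break;
    }

    return ret;
}
```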
Nicola Vetrini [Thu, 27 Jun 2024 11:46:27 +0000 (13:46 +0200)]
xen/guest_access: address violations of MISRA rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Nicola Vetrini [Thu, 27 Jun 2024 11:46:02 +0000 (13:46 +0200)]
xen/self-tests: address violations of MISRA rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Nicola Vetrini [Thu, 27 Jun 2024 11:45:18 +0000 (13:45 +0200)]
automation/eclair: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses".
The helper macro bitmap_switch has parameters that cannot be parenthesized
in order to comply with the rule, as that would break its functionality.
Moreover, the risk of misuse due to developer confusion is deemed not
substantial enough to warrant a more involved refactor, thus the macro
is deviated for this rule.
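A rough sketch of why such parameters resist parenthesization, assuming the macro receives whole statements as arguments: SIZE_SWITCH and pick() below are invented for illustration and are not the real bitmap_switch definition.

```c
/*
 * Hypothetical macro: 'small' and 'large' are statements (for example
 * "return 1"), so expanding them as (small) or (large) would be a
 * syntax error. Only 'nbits', an expression, can be parenthesized.
 */
#define SIZE_SWITCH(nbits, small, large) \
    do {                                 \
        if ((nbits) <= 32) {             \
            small;                       \
        } else {                         \
            large;                       \
        }                                \
    } while (0)

static int pick(unsigned int nbits)
{
    SIZE_SWITCH(nbits, return 1, return 2);
    return 0; /* not reached: both branches above return */
}
```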
George Dunlap [Mon, 24 Jun 2024 08:31:52 +0000 (09:31 +0100)]
CHANGELOG: Add entries related to tracing
Signed-off-by: George Dunlap <george.dunlap@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
George Dunlap [Mon, 24 Jun 2024 10:23:18 +0000 (11:23 +0100)]
tools/xenalyze: Remove argp_program_bug_address
xenalyze sets argp_program_bug_address to my old Citrix address. This
was done before xenalyze was in the xen.git tree; and it's the only
program in the tree which does so.
Now that xenalyze is part of the normal Xen distribution, it should be
obvious where to report bugs.
Signed-off-by: George Dunlap <george.dunlap@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
George Dunlap [Mon, 24 Jun 2024 08:43:04 +0000 (09:43 +0100)]
CHANGELOG.md: Fix indentation of "Removed" section
Signed-off-by: George Dunlap <george.dunlap@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
automation/eclair_analysis: deviate and|or|xor|not for MISRA C Rule 21.2
Rule 21.2 reports identifiers reserved for the C and POSIX standard
libraries: or, and, not and xor are reserved identifiers because they
constitute alternate spellings for the corresponding operators (they are
defined as macros by iso646.h); however Xen doesn't use standard library
headers, so there is no risk of overlap.
This addresses violations arising from x86_emulate/x86_emulate.c, where
labels named or, and, and xor appear.
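A minimal sketch of the situation, assuming a C translation unit that never includes iso646.h: the function alu() is hypothetical and merely shows labels with these names compiling as ordinary identifiers. This holds for C only; in C++ the same spellings are built-in alternative tokens.

```c
/*
 * <iso646.h> is deliberately NOT included, so 'and', 'or' and 'xor'
 * are plain identifiers here and may be used as label names.
 */
static unsigned int alu(char op, unsigned int a, unsigned int b)
{
    switch (op) {
    case '&': goto and;
    case '|': goto or;
    case '^': goto xor;
    default:  return 0;
    }

 and:
    return a & b;
 or:
    return a | b;
 xor:
    return a ^ b;
}
```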
Jan Beulich [Tue, 25 Jun 2024 09:37:44 +0000 (11:37 +0200)]
gnttab: fix compat query-size handling
The odd DEFINE_XEN_GUEST_HANDLE(), inconsistent with all other similar
constructs, should have caught my attention. Turns out it was needed for
the build to succeed merely because the corresponding #ifndef had a
typo. That typo in turn broke compat mode guests, by having query-size
requests of theirs wire into the domain_crash() at the bottom of the
switch().
Fixes: 8c3bb4d8ce3f ("xen/gnttab: Perform compat/native gnttab_query_size check")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <Oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 25 Jun 2024 09:36:59 +0000 (11:36 +0200)]
xen: re-add type checking to {,__}copy_from_guest_offset()
When re-working them to avoid UB on guest address calculations, I failed
to add explicit type checks in exchange for the implicit ones that until
then had happened in assignments that were there anyway.
Fixes: 43d5c5d5f70b ("xen: avoid UB in guest handle arithmetic")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
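One way an explicit type check can be expressed in a copy macro is sketched below. This is not the actual {,__}copy_from_guest_offset() implementation: int_handle_t and copy_from_handle_offset() are hypothetical, and memcpy() stands in for the real guest-access primitive.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical typed handle, loosely modelled on a guest handle. */
typedef struct { const int *p; } int_handle_t;

/*
 * Copy 'nr' elements, starting at element 'off' of the handle, into
 * 'ptr'. The (void)((ptr) == (hnd).p) comparison has no run-time
 * effect; it exists so the compiler diagnoses a destination pointer
 * whose type is incompatible with the handle's element type.
 */
#define copy_from_handle_offset(ptr, hnd, off, nr)            \
    ((void)((ptr) == (hnd).p),                                \
     memcpy((ptr), (hnd).p + (off), (nr) * sizeof(*(ptr))))
```

Passing, say, a char buffer as the destination would then draw a compiler diagnostic for comparing distinct pointer types, rather than silently copying with the wrong element size.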
Andrew Cooper [Fri, 21 Jun 2024 20:57:59 +0000 (21:57 +0100)]
x86/pagewalk: Address MISRA R8.3 violation in guest_walk_tables()
Commit 4c5d78a10dc8 ("x86/pagewalk: Re-implement the pagetable walker")
intentionally renamed guest_walk_tables()'s 'pfec' parameter to 'walk' because
it's not a PageFault Error Code, despite the name of some of the constants
passed in. Sadly the constants-cleanup I've been meaning to do since then
still hasn't come to pass.
Update the declaration to match, to placate MISRA.
Fixes: 4c5d78a10dc8 ("x86/pagewalk: Re-implement the pagetable walker")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
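MISRA C Rule 8.3 requires all declarations of a function to use the same parameter names and type qualifiers. The sketch below shows the compliant shape after such a rename. walk_tables() and its body are hypothetical and much simpler than guest_walk_tables().

```c
#include <stdint.h>

/* Declaration: the parameter is named 'walk', as in the definition. */
uint32_t walk_tables(uint64_t addr, uint32_t walk);

/*
 * Definition: uses the same parameter names as the declaration. Had
 * the declaration kept an older name such as 'pfec', the mismatch
 * would be reported as a Rule 8.3 violation, even though the code
 * compiles and behaves identically.
 */
uint32_t walk_tables(uint64_t addr, uint32_t walk)
{
    return (uint32_t)(addr >> 12) ^ walk;
}
```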
automation/eclair: add deviations of MISRA C Rule 5.5
MISRA C Rule 5.5 states that "Identifiers shall be distinct from macro
names".
Update ECLAIR configuration to deviate:
- macros expanding to their own name;
- clashes between macros and non-callable entities;
- clashes related to the selection of specific implementations of string
handling functions.
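The first deviated case, a macro expanding to its own name, can be sketched as follows. arch_feature() and have_feature() are hypothetical names. The idiom lets other code test with #ifdef whether a function is provided, which necessarily makes the macro name clash with the function identifier.

```c
/* A function, plus a macro of the same name that expands to itself. */
static int arch_feature(void)
{
    return 1;
}
#define arch_feature arch_feature

/* Generic code can now detect the implementation at preprocessing time. */
#ifdef arch_feature
static int have_feature(void)
{
    return arch_feature();
}
#else
static int have_feature(void)
{
    return 0;
}
#endif
```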