Andrew Cooper [Thu, 11 Jul 2024 15:09:22 +0000 (16:09 +0100)]
CI: Refresh Ubuntu Focal container as 20.04-x86_64
As with 16.04 (Xenial), with python3-setuptools included. Having this package
only in some containers was intentional; see commit bbc72a7877d8 ("automation:
Add python3's setuptools to some containers") for the rational.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 13:37:53 +0000 (14:37 +0100)]
CI: Refresh OpenSUSE Leap container
See prior patch for most discussion.
Despite appearing to be a fixed release (and therefore not marked as permitted
failure), the dockerfile references the `leap` tag which is rolling in
practice. Switch to 15.6 explicitly, for better test stability.
Vs tumbleweed, use `zypper update` rather than dist-upgrade, and retain the
RomBIOS dependencies; bin86 and dev86.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 13:40:23 +0000 (14:40 +0100)]
CI: Refresh OpenSUSE Tumbleweed container
Existing as suse:opensuse-tumbleweed is a historical quirk, and adjusted for
consistency with all the other containers.
Make it non-root, use heredocs for legibility, and use the zypper long names
for the benefit of those wondering what was being referenced or duplicated.
Trim the dependencies substantially. Testing docs isn't very interesting and
saves a lot of space. Other savings come from removing a huge pile of
optional QEMU dependencies (QEMU just needs to build the Xen parts to be
useful here, not have a full GUI environment).
Finally, there where some packages such as bc, libssh2-devel, libtasn1-devel
and nasm that I'm not aware of any reason to have had, even historically.
Furthermore, identify which components of the build use which dependencies,
which will help managing them in the future.
Thanks to Olaf Hering for dependency fixes that have been subsumed into this
total overhaul.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 9 Jul 2024 14:54:52 +0000 (15:54 +0100)]
CI: Refresh and upgrade the GCC-IBT container
Upgrade from Debian buster to bookworm, GCC 11.3 to 11.4 and to be a non-root
container.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:18:22 +0000 (18:18 +0100)]
CI: Refresh bullseye-ppc64le as debian:11-ppc64le
... in the style of debian:12-ppc64le.
Rename the jobs and reposition them later as they're not a dependency for the
smoke testing any more.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:17:25 +0000 (18:17 +0100)]
CI: Use debian:12-ppc64le for smoke testing
qemu-system-ppc64/8.1.0-ppc64 was added because bullseye's QEMU didn't
understand the powernv9 machine. However bookworm's QEMU does and this is
preferable to maintaining a random build of QEMU ourselves.
Use the debian:12-ppc64le container and test the output of that build too.
Remove qemu-system-ppc64-8.1.0-ppc64-export which is unused now.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:00:21 +0000 (18:00 +0100)]
CI: Introduce a debian:12-ppc64le container
... conforming to the new naming scheme; $DISTRO-$VERSION-$ARCH-* so the jobs
sort more coherently.
Make it non-root by default, and set XEN_TARGET_ARCH=ppc64. Include QEMU too,
which will be used subsequently.
Add build jobs too, with debian-12-ppc64le-gcc-debug specifically early as it
will be used for smoke testing shortly.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 12:38:52 +0000 (13:38 +0100)]
CI: Mark Archlinux/x86 as allowing failures
Archlinux is a rolling distro. As a consequence, rebuilding the container
periodically changes the toolchain, and this affects all stable branches in
one go.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 10 Jul 2024 00:01:13 +0000 (01:01 +0100)]
CI: Drop Ubuntu Trusty testing
This is also End of Life.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 9 Jul 2024 23:26:56 +0000 (00:26 +0100)]
CI: Drop Debian Stretch testing
Debian stretch is also End of Life. Update a couple of test steps to use
bookworm instead.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 9 Jul 2024 23:02:47 +0000 (00:02 +0100)]
CI: Drop Debian Jessie dockerfiles
These were removed from testing in Xen 4.18.
Fixes: 3817e3c1b4b8 ("automation: Remove testing on Debian Jessie") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
as PPC64 doesn't want randconfig right now, and buster-gcc-ibt is a special
job with a custom compiler.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Mon, 8 Jul 2024 17:00:49 +0000 (18:00 +0100)]
CI: Fix CONTAINER_UID0=1 scripts/containerize
Right now, most build containers use root. Archlinux, Fedora and Yocto set up
a regular user called `user`.
For those containers, trying to containerize as root fails, because
CONTAINER_UID0=1 does nothing, whereas CONTAINER_UID0=0 forces the user away
from root.
To make CONTAINER_UID0=1 work reliably, force to root if requested.
Fixes: 17fbe6504dfd ("automation: introduce a new variable to control container user") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 4 Jul 2024 12:09:21 +0000 (13:09 +0100)]
build: Drop xorg-x11 as a build dependency
The history on this one is complicated. The note to README was added in
commit 1f95747a4f16 ("Add openssl-dev and xorg-x11-dev to README") in 2007.
At the time, there was a vendered version of Qemu in xen.git with a local
modification using <X11/keysymdef.h> to access the monitor console over VNC.
The final reference to keysymdef.h was dropped in commit 85896a7c4dc7 ("build:
add autoconf to replace custom checks in tools/check") in 2012. The next
prior mention was in 2009 with commit a8ccb671c377 ("tools: fix x11 check")
noting that x11 was not a direct dependcy of Xen; it was transitive through
SDL for Qemu for source-based distros.
It appears there may have been other unspecified dependencies on xorg,
e.g. the use of lndir by unmodified_drivers which are no longer relevant
either.
These days its only the Debian based dockerfiles which install xorg-x11, and
Qemu builds fine in these and others without x11.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 20:02:20 +0000 (21:02 +0100)]
CI: Refresh the Coverity Github Action configuration
Update to Ubuntu 24.04, and checkout@v4 as v2 is deprecated.
The build step goes out of it's way to exclude docs and stubdom (but include
plain MiniOS), so disable those at the ./configure stage.
Refresh the package list. libbz2-dev was in there twice, and e2fslibs-dev is
a a transitional package to libext2fs-dev. I'm not aware of libtool ever
having been a Xen dependency.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 4 Jul 2024 12:08:40 +0000 (13:08 +0100)]
build: Fix the version of python checked for by ./configure
We previously upped the minimum python version to 2.7, but neglected to
reflect this in ./configure
Fixes: 2a353c048c68 ("tools: Don't use distutils in configure or Makefile") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 17:21:09 +0000 (18:21 +0100)]
build: Regenerate ./configure with Autoconf 2.71
This is the version now found in Debian Bookworm.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
x86/physdev: Return pirq that irq was already mapped to
Fix bug introduced by 0762e2502f1f ("x86/physdev: factor out the code to allocate and
map a pirq"). After that re-factoring, when pirq<0 and current_pirq>0, it means
caller want to allocate a free pirq for irq but irq already has a mapped pirq, then
it returns the negative pirq, so it fails. However, the logic before that
re-factoring is different, it should return the current_pirq that irq was already
mapped to and make the call success.
Fixes: 0762e2502f1f ("x86/physdev: factor out the code to allocate and map a pirq") Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Signed-off-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 11:06:46 +0000 (12:06 +0100)]
CI: Rework the CentOS7 container
CentOS 7 is fully End-of-life as of 2024-06-30, and the Yum repo configuration
points at URLs which have become non-existent.
First, start by using a heredoc RUN for legibility. It's important to use
`set -e` to offset the fact that we're no longer chaining every command
together with an &&.
Also, because we're using a single RUN command to perform all RPM operations,
we no longer need to work around the OverlayFS bug.
Adjust the CentOS-*.repo files to point at vault.centos.org. This also
involves swapping mirrorlist= for baseurl= in the yum config.
Use a minor bashism to express the dependenices more coherently, and identify
why we have certain dependencies. Some adjustments are:
* We need bzip2-devel for the dombuilder. bzip2 needs retaining stubdom or
`tar` fails to unpack the .bz2 archives.
* {lzo,lz4,ztd}-devel are new optional dependency since the last time this
package list was refreshed.
* openssl-devel hasn't been a dependency since Xen 4.6.
* We long ago ceased being able to build Qemu and SeaBIOS in this container,
so drop their dependencies too.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
For inline files, use COPY with a heredoc, rather than opencoding it through
/bin/sh.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 2 Jul 2024 13:34:36 +0000 (14:34 +0100)]
CI: Formalise the use of heredocs
Commit b5739330d7f4 introduced the use of heredocs in the jessie/stretch
dockerfiles.
It turns out this was introduced by BuildKit in 2018 along with a
standardisation of Dockerfile syntax, and has subsequently been adopted by the
docker community.
Annotate all dockerfiles with a statement of the syntax in use, and extend
README.md details including how to activate BuildKit when it's available but
off by default.
This allows the containers to be rebuilt following commit a0e29b316363 ("CI:
Drop glibc-i386 from the build containers").
Fixes: b5739330d7f4 ("automation: fix jessie/stretch images to use archive.debian.org apt repos") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Python regexes should use raw strings. Convert all regexes, and drop escaped
backslashes. Note that regular escape sequences are interpreted normally when
parsing a regex, so \n even in a raw-string regex is a newline.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 20:59:34 +0000 (21:59 +0100)]
build/mkheader: Remove C-isms from the code
This was clearly written by a C programmer, rather than a python programmer.
Drop all the useless semi-colons.
The very final line of the script simply references f.close, rather than
calling the function. Switch to using a with: statement, as python does care
about unclosed files if you enable enough warnings.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 3 Jul 2024 22:01:11 +0000 (23:01 +0100)]
tools/xs-clients: Fix `make clean` rule
Prior to the split, "the clients" used tools/xenstored/Makefile.common whose
clean rule includes *.o whereas after the split, the removal of *.o was lost
by virtule of not including Makefile.common any more.
This is the bug behind the following build error:
make[2]: Entering directory '/local/xen.git/tools/xs-clients'
gcc xenstore_client.o (snip)
/usr/bin/ld: xenstore_client.o: relocation R_X86_64_32S against `.rodata' can not be used when making a PIE object; recompile with -fPIE
/usr/bin/ld: failed to set dynamic section sizes: bad value
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:35: xenstore] Error 1
which was caused by `make clean` not properly cleaning the tree as I was
swapping between various build containers.
Switch to a plain single-colon clean rule.
Fixes: 5c293058b130 ("tools/xenstore: move xenstored sources into dedicated directory") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
xen/riscv: use .insn with operands to support the older gas
Support for specifying "raw" insns was added only in 2.38.
To support older version it would be better switch to .insn
with operands.
The following compilation error occurs:
./arch/riscv/include/asm/processor.h: Assembler messages:
./arch/riscv/include/asm/processor.h:70: Error: unrecognized opcode `0x0100000F'
In case of the following Binutils:
$ riscv64-linux-gnu-as --version
GNU assembler (GNU Binutils for Debian) 2.35.2
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Drop the 0, which is in line with how we annotate RCs elsewhere.
Fixes: 4a73eb4c205d ("Update Xen version to 4.19-rc") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Wed, 3 Jul 2024 12:04:15 +0000 (14:04 +0200)]
cmdline: "extra_guest_irqs" is inapplicable to PVH
PVH in particular has no (externally visible) notion of pIRQ-s. Mention
that in the description of the respective command line option and have
arch_hwdom_irqs() also reflect this (thus suppressing the log message
there as well, as being pretty meaningless in this case anyway).
Suggested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Wed, 3 Jul 2024 12:03:27 +0000 (14:03 +0200)]
amend 'cmdline: document and enforce "extra_guest_irqs" upper bounds'
Address late review comments for what is now commit 17f6d398f765:
- bound max_irqs right away against nr_irqs
- introduce a #define for a constant used twice
Requested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 10:01:59 +0000 (12:01 +0200)]
xen: avoid UB in guest handle field accessors
Much like noted in 43d5c5d5f70b ("xen: avoid UB in guest handle
arithmetic"), address calculations involved in accessing a struct field
can overflow, too. Cast respective pointers to "unsigned long" and
convert type checking accordingly. Remaining arithmetic is, despite
there possibly being mathematical overflow, okay as per the C99 spec:
"A computation involving unsigned operands can never overflow, because a
result that cannot be represented by the resulting unsigned integer type
is reduced modulo the number that is one greater than the largest value
that can be represented by the resulting type." The overflow that we
need to guard against is checked for in array_access_ok().
While there add the missing (see {,__}copy_to_guest_offset()) is-not-
const checks to {,__}copy_field_to_guest().
Typically, but not always, no change to generated code; code generation
(register allocation) is different for at least common/grant_table.c.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 10:01:21 +0000 (12:01 +0200)]
x86/entry: don't clear DF when raising #UD for lack of syscall handler
While doing so is intentional when invoking the actual callback, to
mimic a hard-coded SYCALL_MASK / FMASK MSR, the same should not be done
when no handler is available and hence #UD is raised.
Fixes: ca6fcf4321b3 ("x86/pv: Inject #UD for missing SYSCALL callbacks") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 10:00:27 +0000 (12:00 +0200)]
cmdline: document and enforce "extra_guest_irqs" upper bounds
PHYSDEVOP_pirq_eoi_gmfn_v<N> accepting just a single GFN implies that no
more than 32k pIRQ-s can be used by a domain on x86. Document this upper
bound.
To also enforce the limit, (ab)use both arch_hwdom_irqs() (changing its
parameter type) and setup_system_domains(). This is primarily to avoid
exposing the two static variables or introducing yet further arch hooks.
While touching arch_hwdom_irqs() also mark it hwdom-init.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 28 Jun 2024 13:04:30 +0000 (14:04 +0100)]
tools/libxs: Fix CLOEXEC handling in xs_fileno()
xs_fileno() opens a pipe on first use to communicate between the watch thread
and the main thread. Nothing ever sets CLOEXEC on the file descriptors.
Check for the availability of the pipe2() function with configure. Despite
starting life as Linux-only, FreeBSD and NetBSD have gained it.
When pipe2() isn't available, try our best with pipe() and set_cloexec().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Andrew Cooper [Fri, 28 Jun 2024 13:10:12 +0000 (14:10 +0100)]
tools/libxs: Fix CLOEXEC handling in get_dev()
Move the O_CLOEXEC compatibility outside of an #ifdef USE_PTHREAD block.
Introduce set_cloexec() to wrap fcntl() setting FD_CLOEXEC. It will be reused
for other CLOEXEC fixes too.
Use set_cloexec() when O_CLOEXEC isn't available as a best-effort fallback.
Fixes: f4f2f3402b2f ("tools/libxs: Open /dev/xen/xenbus fds as O_CLOEXEC") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Andrew Cooper [Thu, 27 Jun 2024 12:22:14 +0000 (13:22 +0100)]
tools/dombuilder: Correct the length calculation in xc_dom_alloc_segment()
xc_dom_alloc_segment() is passed a size in bytes, calculates a size in pages
from it, then fills in the new segment information with a bytes value
re-calculated from the number of pages.
This causes the module information given to the guest (MB, or PVH) to have
incorrect sizes; specifically, sizes rounded up to the next page.
This in turn is problematic for Xen. When Xen finds a gzipped module, it
peeks at the end metadata to judge the decompressed size, which is a -4
backreference from the reported end of the module.
Fill in seg->vend using the correct number of bytes.
Fixes: ea7c8a3d0e82 ("libxc: reorganize domain builder guest memory allocator") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
During Gitlab CI randconfig job for RISC-V failed witn an error:
common/trace.c:57:22: error: expected '=', ',', ';', 'asm' or
'__attribute__' before '__read_mostly'
57 | static u32 data_size __read_mostly;
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@cloud.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 2 Jul 2024 06:35:56 +0000 (08:35 +0200)]
pirq_cleanup_check() leaks
Its original introduction had two issues: For one the "common" part of
the checks (carried out in the macro) was inverted. And then after
removal from the radix tree the structure wasn't scheduled for freeing.
(All structures still left in the radix tree would be freed upon domain
destruction, though.)
For the freeing to be safe even if it didn't use RCU (i.e. to avoid use-
after-free), re-arrange checks/operations in evtchn_close(), such that
the pointer wouldn't be used anymore after calling pirq_cleanup_check()
(noting that unmap_domain_pirq_emuirq() itself calls the function in the
success case).
Fixes: c24536b636f2 ("replace d->nr_pirqs sized arrays with radix tree") Fixes: 79858fee307c ("xen: fix hvm_domain_use_pirq's behavior") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
George Dunlap [Wed, 26 Jun 2024 15:07:30 +0000 (16:07 +0100)]
MAINTAINERS: Step down as maintainer and committer
Remain a Reviewer on the golang bindings and scheduler for now (using
a xenproject.org alias), since there may be architectural decisions I
can shed light on.
Remove the XENTRACE section entirely, as there's no obvious candidate
to take it over; having the respective parts fall back to the tools
and The Rest seems the most reasonable option.
Nicola Vetrini [Thu, 27 Jun 2024 11:48:08 +0000 (13:48 +0200)]
x86/traps: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Remove from the ECLAIR integration scripts an unused option, which
was already ignored, and make the help texts consistent
with the rest of the scripts.
Nicola Vetrini [Thu, 27 Jun 2024 11:47:16 +0000 (13:47 +0200)]
x86/irq: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Nicola Vetrini [Thu, 27 Jun 2024 11:46:57 +0000 (13:46 +0200)]
automation/eclair_analysis: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses".
The local helpers GRP2 and XADD in the x86 emulator use their first
argument as the constant expression for a case label. This pattern
is deviated project-wide, because it is very unlikely to induce
developer confusion and result in the wrong control flow being
carried out.
Nicola Vetrini [Thu, 27 Jun 2024 11:46:27 +0000 (13:46 +0200)]
xen/guest_access: address violations of MISRA rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Nicola Vetrini [Thu, 27 Jun 2024 11:46:02 +0000 (13:46 +0200)]
xen/self-tests: address violations of MISRA rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Nicola Vetrini [Thu, 27 Jun 2024 11:45:18 +0000 (13:45 +0200)]
automation/eclair: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses".
The helper macro bitmap_switch has parameters that cannot be parenthesized
in order to comply with the rule, as that would break its functionality.
Moreover, the risk of misuse due developer confusion is deemed not
substantial enough to warrant a more involved refactor, thus the macro
is deviated for this rule.
George Dunlap [Mon, 24 Jun 2024 08:31:52 +0000 (09:31 +0100)]
CHANGELOG: Add entries related to tracing
Signed-off-by: George Dunlap <george.dunlap@cloud.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
George Dunlap [Mon, 24 Jun 2024 10:23:18 +0000 (11:23 +0100)]
tools/xenalyze: Remove argp_program_bug_address
xenalyze sets argp_program_bug_address to my old Citrix address. This
was done before xenalyze was in the xen.git tree; and it's the only
program in the tree which does so.
Now that xenalyze is part of the normal Xen distribution, it should be
obvious where to report bugs.
Signed-off-by: George Dunlap <george.dunlap@cloud.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
George Dunlap [Mon, 24 Jun 2024 08:43:04 +0000 (09:43 +0100)]
CHANGELOG.md: Fix indentation of "Removed" section
Signed-off-by: George Dunlap <george.dunlap@cloud.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
automation/eclair_analysis: deviate and|or|xor|not for MISRA C Rule 21.2
Rule 21.2 reports identifiers reserved for the C and POSIX standard
libraries: or, and, not and xor are reserved identifiers because they
constitute alternate spellings for the corresponding operators (they are
defined as macros by iso646.h); however Xen doesn't use standard library
headers, so there is no risk of overlap.
This addresses violations arising from x86_emulate/x86_emulate.c, where
label statements named as or, and and xor appear.
Jan Beulich [Tue, 25 Jun 2024 09:37:44 +0000 (11:37 +0200)]
gnttab: fix compat query-size handling
The odd DEFINE_XEN_GUEST_HANDLE(), inconsistent with all other similar
constructs, should have caught my attention. Turns out it was needed for
the build to succeed merely because the corresponding #ifndef had a
typo. That typo in turn broke compat mode guests, by having query-size
requests of theirs wire into the domain_crash() at the bottom of the
switch().
Fixes: 8c3bb4d8ce3f ("xen/gnttab: Perform compat/native gnttab_query_size check") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Oleksii Kurochko <Oleksii.kurochko@gmail.com>
Jan Beulich [Tue, 25 Jun 2024 09:36:59 +0000 (11:36 +0200)]
xen: re-add type checking to {,__}copy_from_guest_offset()
When re-working them to avoid UB on guest address calculations, I failed
to add explicit type checks in exchange for the implicit ones that until
then had happened in assignments that were there anyway.
Fixes: 43d5c5d5f70b ("xen: avoid UB in guest handle arithmetic") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 21 Jun 2024 20:57:59 +0000 (21:57 +0100)]
x86/pagewalk: Address MISRA R8.3 violation in guest_walk_tables()
Commit 4c5d78a10dc8 ("x86/pagewalk: Re-implement the pagetable walker")
intentionally renamed guest_walk_tables()'s 'pfec' parameter to 'walk' because
it's not a PageFault Error Code, despite the name of some of the constants
passed in. Sadly the constants-cleanup I've been meaning to do since then
still hasn't come to pass.
Update the declaration to match, to placate MISRA.
Fixes: 4c5d78a10dc8 ("x86/pagewalk: Re-implement the pagetable walker") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
automation/eclair: add deviations of MISRA C Rule 5.5
MISRA C Rule 5.5 states that "Identifiers shall be distinct from macro
names".
Update ECLAIR configuration to deviate:
- macros expanding to their own name;
- clashes between macros and non-callable entities;
- clashes related to the selection of specific implementations of string
handling functions.
Michal Orzel [Fri, 21 Jun 2024 09:22:05 +0000 (11:22 +0200)]
xen/arm: static-shmem: request host address to be specified for 1:1 domains
As a follow up to commit cb1ddafdc573 ("xen/arm/static-shmem: Static-shmem
should be direct-mapped for direct-mapped domains") add a check to
request that both host and guest physical address must be supplied for
direct mapped domains. Otherwise return an error to prevent unwanted
behavior.
Jan Beulich [Wed, 22 May 2024 10:17:30 +0000 (12:17 +0200)]
x86/shadow: Don't leave trace record field uninitialized
The emulation_count field is set only conditionally right now. Convert
all field setting to an initializer, thus guaranteeing that field to be
set to 0 (default initialized) when GUEST_PAGING_LEVELS != 3.
Rework trace_shadow_emulate() to be consistent with the other trace helpers.
Coverity-ID: 1598430 Fixes: 9a86ac1aa3d2 ("xentrace 5/7: Additional tracing for the shadow code") Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 22 May 2024 13:05:13 +0000 (14:05 +0100)]
x86/shadow: Rework trace_shadow_emulate_other() as sh_trace_gfn_va()
sh_trace_gfn_va() is very similar to sh_trace_gl1e_va(), and a rather shorter
name than trace_shadow_emulate_other().
It's only referenced in CONFIG_HVM=y builds, so give it a __maybe_unused to
placate randconfig builds.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 22 May 2024 12:58:22 +0000 (13:58 +0100)]
x86/shadow: Introduce sh_trace_gl1e_va()
trace_shadow_fixup() and trace_not_shadow_fault() both write out identical
trace records. Reimplement them in terms of a common sh_trace_gl1e_va().
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 22 May 2024 12:51:43 +0000 (13:51 +0100)]
x86/shadow: Rework trace_shadow_gen() into sh_trace_va()
The ((GUEST_PAGING_LEVELS - 2) << 8) expression in the event field is common
to all shadow trace events, so introduce sh_trace() as a very thin wrapper
around trace().
Then, rename trace_shadow_gen() to sh_trace_va() to better describe what it is
doing, and to be more consistent with later cleanup.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 7 May 2024 11:05:58 +0000 (12:05 +0100)]
tools/xl: Open xldevd.log with O_CLOEXEC
`xl devd` has been observed leaking /var/log/xldevd.log into children.
Note this is specifically safe; dup2() leaves O_CLOEXEC disabled on newfd, so
after setting up stdout/stderr, it's only the logfile fd which will close on
exec().
Link: https://github.com/QubesOS/qubes-issues/issues/8292 Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Demi Marie Obenour <demi@invisiblethingslab.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Thu, 20 Jun 2024 15:34:56 +0000 (17:34 +0200)]
libelf: avoid UB in elf_xen_feature_{get,set}()
When the left shift amount is up to 31, the shifted quantity wants to be
of unsigned int (or wider) type.
While there also adjust types: get doesn't alter the array and returns a
boolean, while both don't really accept negative "nr". Drop a stray
blank each as well.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Matthew Barnes [Thu, 20 Jun 2024 15:36:46 +0000 (16:36 +0100)]
x86/ioapic: Fix signed shifts in io_apic.c
There exists bitshifts in the IOAPIC code where signed integers are
shifted to the left by up to 31 bits, which is undefined behaviour.
This patch fixes this by changing the integers from signed to unsigned.
Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Thu, 20 Jun 2024 10:10:27 +0000 (12:10 +0200)]
livepatch: use appropriate type for buffer offset variables
As was made noticeable by the last of the commits referenced below,
using a fixed-size type for such purposes is not only against
./CODING_STYLE, but can lead to actual issues. Switch to using size_t
instead, thus also allowing calculations to be lighter-weight in 32-bit
builds.
No functional change for 64-bit builds.
Link: https://gitlab.com/xen-project/xen/-/jobs/7136417308 Fixes: b145b4a39c13 ("livepatch: Handle arbitrary size names with the list operation") Fixes: 5083e0ff939d ("livepatch: Add metadata runtime retrieval mechanism") Fixes: 43d5c5d5f70b ("xen: avoid UB in guest handle arithmetic") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Roger Pau Monné [Thu, 20 Jun 2024 10:09:32 +0000 (12:09 +0200)]
x86/irq: forward pending interrupts to new destination in fixup_irqs()
fixup_irqs() is used to evacuate interrupts from to be offlined CPUs. Given
the CPU is to become offline, the normal migration logic used by Xen where the
vector in the previous target(s) is left configured until the interrupt is
received on the new destination is not suitable.
Instead attempt to do as much as possible in order to prevent loosing
interrupts. If fixup_irqs() is called from the CPU to be offlined (as is
currently the case for CPU hot unplug) attempt to forward pending vectors when
interrupts that target the current CPU are migrated to a different destination.
Additionally, for interrupts that have already been moved from the current CPU
prior to the call to fixup_irqs() but that haven't been delivered to the new
destination (iow: interrupts with move_in_progress set and the current CPU set
in ->arch.old_cpu_mask) also check whether the previous vector is pending and
forward it to the new destination.
This allows us to remove the window with interrupts enabled at the bottom of
fixup_irqs(). Such window wasn't safe anyway: references to the CPU to become
offline are removed from interrupts masks, but the per-CPU vector_irq[] array
is not updated to reflect those changes (as the CPU is going offline anyway).
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Leigh Brown [Thu, 20 Jun 2024 10:09:02 +0000 (12:09 +0200)]
tools/libs/light: Fix nic->vlan memory allocation
After the following commit: 3bc14e4fa4b9 ("tools/libs/light: Add vlan field to libxl_device_nic")
xl list -l aborts with a double free error if a domain has at least
one vif defined:
$ sudo xl list -l
free(): double free detected in tcache 2
Aborted
Orginally, the vlan field was called vid and was defined as an integer.
It was appropriate to call libxl__xs_read_checked() with gc passed as
the string data was copied to a different variable. However, the final
version uses a string data type and the call should have been changed
to use NOGC instead of gc to allow that data to live past the gc
controlled lifetime, in line with the other string fields.
This patch makes the change to pass NOGC instead of gc and moves the
new code to be next to the other string fields (fixing a couple of
errant tabs along the way), as recommended by Jason.
Fixes: 3bc14e4fa4b9 ("tools/libs/light: Add vlan field to libxl_device_nic") Signed-off-by: Leigh Brown <leigh@solinno.co.uk> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jason Andryuk [Thu, 20 Jun 2024 10:08:42 +0000 (12:08 +0200)]
hotplug: Restore block-tap phy compatibility
backendtype=phy using the blktap kernel module needs to use write_dev,
but tapback can't support that. tapback should perform better, but make
the script compatible with the old kernel module again.
Fixes: 76a484193d ("hotplug: Update block-tap") Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Jan Beulich [Wed, 19 Jun 2024 12:11:07 +0000 (14:11 +0200)]
xen: avoid UB in guest handle arithmetic
At least XENMEM_memory_exchange can have huge values passed in the
nr_extents and nr_exchanged fields. Adding such values to pointers can
overflow, resulting in UB. Cast respective pointers to "unsigned long"
while at the same time making the necessary multiplication explicit.
Remaining arithmetic is, despite there possibly being mathematical
overflow, okay as per the C99 spec: "A computation involving unsigned
operands can never overflow, because a result that cannot be represented
by the resulting unsigned integer type is reduced modulo the number that
is one greater than the largest value that can be represented by the
resulting type." The overflow that we need to guard against is checked
for in array_access_ok().
Note that in / down from array_access_ok() the address value is only
ever cast to "unsigned long" anyway, which is why in the invocation from
guest_handle_subrange_okay() the value doesn't need casting back to
pointer type.
In compat grant table code change two guest_handle_add_offset() to avoid
passing in negative offsets.
Since {,__}clear_guest_offset() need touching anyway, also deal with
another (latent) issue there: They were losing the handle type, i.e. the
size of the individual objects accessed. Luckily the few users we
presently have all pass char or uint8 handles.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 30 Apr 2021 15:14:36 +0000 (16:14 +0100)]
x86/defns: Clean up X86_{XCR0,XSS}_* constants
With the exception of one case in read_bndcfgu() which can use ilog2(),
the *_POS defines are unused. Drop them.
X86_XCR0_X87 is the name used by both the SDM and APM, rather than
X86_XCR0_FP.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 30 Apr 2021 19:17:55 +0000 (20:17 +0100)]
x86/cpuid: Fix handling of XSAVE dynamic leaves
First, if XSAVE is available in hardware but not visible to the guest, the
dynamic leaves shouldn't be filled in.
Second, the comment concerning XSS state is wrong. VT-x doesn't manage
host/guest state automatically, but there is provision for "host only" bits to
be set, so the implications are still accurate.
Introduce xstate_compressed_size() to mirror the uncompressed one. Cross
check it at boot.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 30 Apr 2021 19:17:55 +0000 (20:17 +0100)]
x86/cpu-policy: Simplify recalculate_xstate()
Make use of xstate_uncompressed_size() helper rather than maintaining the
running calculation while accumulating feature components.
The rest of the CPUID data can come direct from the raw cpu policy. All
per-component data form an ABI through the behaviour of the X{SAVE,RSTOR}*
instructions.
Use for_each_set_bit() rather than opencoding a slightly awkward version of
it. Mask the attributes in ecx down based on the visible features. This
isn't actually necessary for any components or attributes defined at the time
of writing (up to AMX), but is added out of an abundance of caution.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 30 Apr 2021 19:17:55 +0000 (20:17 +0100)]
x86/xstate: Rework xstate_ctxt_size() as xstate_uncompressed_size()
We're soon going to need a compressed helper of the same form.
The size of the uncompressed image depends on the single element with the
largest offset + size. Sadly this isn't always the element with the largest
index.
Name the per-xstate-component cpu_policy struture, for legibility of the logic
in xstate_uncompressed_size(). Cross-check with hardware during boot, and
remove hw_uncompressed_size().
This means that the migration paths don't need to mess with XCR0 just to
sanity check the buffer size. It also means we can drop the "fastpath" check
against xfeature_mask (there to skip some XCR0 writes); this path is going to
be dead logic the moment Xen starts using supervisor states itself.
The users of hw_uncompressed_size() in xstate_init() can (and indeed need) to
be replaced with CPUID instructions. They run with feature_mask in XCR0, and
prior to setup_xstate_features() on the BSP.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 22 May 2024 23:55:34 +0000 (00:55 +0100)]
x86/boot: Collect the Raw CPU Policy earlier on boot
This is a tangle, but it's a small step in the right direction.
In the following change, xstate_init() is going to start using the Raw policy.
calculate_raw_cpu_policy() is sufficiently separate from the other policies to
safely move like this.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 21 Feb 2020 17:56:57 +0000 (17:56 +0000)]
x86/xstate: Cross-check dynamic XSTATE sizes at boot
Right now, xstate_ctxt_size() performs a cross-check of size with CPUID in for
every call. This is expensive, being used for domain create/migrate, as well
as to service certain guest CPUID instructions.
Instead, arrange to check the sizes once at boot. See the code comments for
details. Right now, it just checks hardware against the algorithm
expectations. Later patches will cross-check Xen's XSTATE calculations too.
Introduce more X86_XCR0_* and X86_XSS_* constants CPUID bits. This is to
maximise coverage in the sanity check, even if we don't expect to
use/virtualise some of these features any time soon. Leave HDC and HWP alone
for now; we don't have CPUID bits from them stored nicely.
Only perform the cross-checks when SELF_TESTS are active. It's only
developers or new hardware liable to trip these checks, and Xen at least
tracks "maximum value ever seen in xcr0" for the lifetime of the VM, which we
don't want to be tickling in the general case.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Wed, 22 May 2024 16:23:54 +0000 (17:23 +0100)]
x86/xstate: Fix initialisation of XSS cache
The clobbering of this_cpu(xcr0) and this_cpu(xss) to architecturally invalid
values is to force the subsequent set_xcr0() and set_msr_xss() to reload the
hardware register.
While XCR0 is reloaded in xstate_init(), MSR_XSS isn't. This causes
get_msr_xss() to return the invalid value, and logic of the form:
old = get_msr_xss();
set_msr_xss(new);
...
set_msr_xss(old);
to try and restore said invalid value.
The architecturally invalid value must be purged from the cache, meaning the
hardware register must be written at least once. This in turn highlights that
the invalid value must only be used in the case that the hardware register is
available.
Fixes: f7f4a523927f ("x86/xstate: reset cached register values on resume") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 14 Jun 2024 12:05:40 +0000 (13:05 +0100)]
xen/arch: Centralise __read_mostly and __ro_after_init
These living in cache.h is inherited from Linux, but cache.h is not a terribly
appropriately location for them to live.
__read_mostly is an optimisation related to data placement in order to avoid
having shared data in cachelines that are likely to be written to, but it
really is just a section of the linked image separating data by usage
patterns; it has nothing to do with cache sizes or flushing logic.
Worse, __ro_after_init was only in xen/cache.h because __read_mostly was in
arch/cache.h, and has literally nothing whatsoever to do with caches.
Move the definitions into xen/sections.h, which in particular means that
RISC-V doesn't need to repeat the problematic pattern. Take the opportunity
to provide a short descriptions of what these are used for.
For now, leave TODO comments next to the other identical definitions. It
turns out that unpicking cache.h is more complicated than it appears because a
number of files use it for transitive dependencies.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Tue, 18 Jun 2024 12:48:35 +0000 (13:48 +0100)]
xen/irq: Address MISRA Rule 8.3 violation
When centralising irq_ack_none(), different architectures had different names
for the parameter. As it's type is struct irq_desc *, it should be named
desc. Make this consistent.
No functional change.
Fixes: 8aeda4a241ab ("arch/irq: Make irq_ack_none() mandatory") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>