Andrew Cooper [Fri, 28 Mar 2025 11:19:23 +0000 (11:19 +0000)]
x86: Drop asm/byteorder.h
With the common code moved fully onto xen/byteorder.h, clean up the dregs.
It turns out that msi.h has not needed byteorder.h since the use of
__{BIG,LITTLE}_ENDIAN_BITFIELD was dropped in commit d58f3941ce3f ("x86/MSI:
use standard C types in structures/unions").
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* New
Andrew Cooper [Fri, 28 Mar 2025 11:50:16 +0000 (11:50 +0000)]
riscv: Remove asm/byteorder.h
With the common code moved fully onto xen/byteorder.h, clean up the dregs.
The use of byteorder.h in io.h appears to have been copy&paste from ARM. It's
not needed, but macros and types are.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* New
Andrew Cooper [Fri, 28 Mar 2025 13:10:58 +0000 (13:10 +0000)]
ppc: Drop asm/byteorder.h
With the common code moved fully onto xen/byteorder.h, clean up the dregs.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* New
Andrew Cooper [Fri, 28 Mar 2025 13:11:06 +0000 (13:11 +0000)]
arm: Remove asm/byteorder.h
With the common code moved fully onto xen/byteorder.h, clean up the dregs.
Sort includes in some files while swapping over to xen/byteorder.h.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* New
Andrew Cooper [Fri, 28 Mar 2025 13:06:42 +0000 (13:06 +0000)]
xen/common: Switch {asm -> xen}/byteorder.h
Sort the includes. Drop useless includes of xen/types.h
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* New
Andrew Cooper [Fri, 28 Mar 2025 13:02:53 +0000 (13:02 +0000)]
xsm/flask: Switch {asm -> xen}/byteorder.h
Sort the includes while at it.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* New
Lin Liu [Mon, 18 Oct 2021 10:32:39 +0000 (10:32 +0000)]
crypto/vmac: Switch to xen/byteswap.h
This file has its own implementation of swap bytes. Clean up
the code with xen/byteswap.h.
No functional change.
Signed-off-by: Lin Liu <lin.liu@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
Lin Liu [Thu, 21 Oct 2021 02:54:19 +0000 (02:54 +0000)]
xen: Remove old byteorder infrastructure
It is no longer used.
Signed-off-by: Lin Liu <lin.liu@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
Lin Liu [Fri, 5 Nov 2021 08:15:29 +0000 (04:15 -0400)]
xen/decompressors: Use new byteorder infrastructure
unaligned.h already includes byteorder.h, so most can simply be dropped.
No functional change.
Signed-off-by: Lin Liu <lin.liu@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* New
Lin Liu [Mon, 9 May 2022 05:47:10 +0000 (06:47 +0100)]
xen/arch: Switch to new byteorder infrastructure
This needs to be done in several steps, because of common vs arch issues.
Start by using the new common infrastructure inside the arch infrastructure.
libelf-private.h is awkward, and the only thing in Xen using swab??()
directly. It needs updating at the same time.
Signed-off-by: Lin Liu <lin.liu@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* Rebase
* Rearrange from other patches to maintain bisectability
Lin Liu [Thu, 21 Oct 2021 02:52:39 +0000 (02:52 +0000)]
xen/decompressors: Remove use of *_to_cpup() helpers
These wrappers simply hide a dereference, which adds to the cognitive complexity
of reading the code. As such, they're not going to be included in the new
byteswap infrastructure.
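As a rough illustration of the point above (a sketch, not the actual Xen code: `sketch_le32_to_cpu()`, `sketch_le32_to_cpup()` and the two reader functions are hypothetical names), the pointer-taking wrapper and the explicit form compute the same value, but only the explicit form shows the memory access at the call site:

```c
#include <stdint.h>

/* Hypothetical stand-in for a little-endian-to-CPU conversion. */
static inline uint32_t sketch_le32_to_cpu(uint32_t x)
{
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return __builtin_bswap32(x);
#else
    return x;
#endif
}

/* Old style: the helper takes a pointer and hides the dereference. */
static inline uint32_t sketch_le32_to_cpup(const uint32_t *p)
{
    return sketch_le32_to_cpu(*p);
}

/* Caller using the wrapper: the load from *hdr is not visible here. */
uint32_t sketch_read_len_old(const uint32_t *hdr)
{
    return sketch_le32_to_cpup(hdr);
}

/* Caller after the change: the dereference is explicit. */
uint32_t sketch_read_len_new(const uint32_t *hdr)
{
    return sketch_le32_to_cpu(*hdr);
}
```

Both readers return the same result for any input, which is consistent with the change being purely mechanical.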
No functional change.
Signed-off-by: Lin Liu <lin.liu@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v6:
* Fix lz4 and lzo1x too.
Lin Liu [Thu, 21 Oct 2021 02:52:39 +0000 (03:52 +0100)]
xen/device-tree: Remove use of *_to_cpup() helpers
These wrappers simply hide a dereference, which adds to the cognitive complexity
of reading the code. As such, they're not going to be included in the new
byteswap infrastructure.
No functional change.
Signed-off-by: Lin Liu <lin.liu@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v5:
* Rebase
* Split out of later patch
Lin Liu [Wed, 20 Oct 2021 04:29:46 +0000 (04:29 +0000)]
xen/lib: Switch to xen/byteorder.h
In divmod.c, additionally swap xen/lib.h for xen/macros.h as only ABS() is
needed.
In find-next-bit.c, ext2 has nothing to do with this logic. Despite the
comments, it was a local modification when the logic was imported from Linux,
because Xen didn't have a suitable helper.
The new infrastructure does have a suitable primitive, so use it.
No functional change.
Signed-off-by: Lin Liu <lin.liu@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
The find-next-bit.c changes, being inside __BIG_ENDIAN, aren't even compiled in
any build of Xen. I manually checked that they compile.
v5:
* Rebase
* Include a fix to divmod.c
* Explain why even Linux has never had anything by the name ext2_swab()
Lin Liu [Mon, 9 May 2022 05:47:10 +0000 (01:47 -0400)]
xen: Implement common byte{order,swap}.h
The current swab??() infrastructure is unnecessarily complicated, and can be
replaced entirely with compiler builtins.
All supported compilers provide __BYTE_ORDER__ and __builtin_bswap??().
Nothing in Xen cares about the values of __{BIG,LITTLE}_ENDIAN; just that one
of them is defined. Therefore, centralise their definitions in xen/config.h
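A minimal sketch of what builtin-based helpers can look like, assuming a GCC or Clang toolchain. The `sketch_*` names are hypothetical and are not the helpers defined in xen/byteorder.h or xen/byteswap.h; only `__BYTE_ORDER__`, `__ORDER_LITTLE_ENDIAN__`, `__ORDER_BIG_ENDIAN__` and `__builtin_bswap32()` are real compiler facilities:

```c
#include <stdint.h>

/* Unconditional swap, delegated entirely to the compiler builtin. */
static inline uint32_t sketch_bswap32(uint32_t x)
{
    return __builtin_bswap32(x);
}

/* Convert a CPU-order value to big-endian, selected at compile time. */
static inline uint32_t sketch_cpu_to_be32(uint32_t x)
{
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
    return __builtin_bswap32(x);
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return x;
#else
# error "Unknown endianness"
#endif
}

/* The conversion is its own inverse, so the reverse direction reuses it. */
static inline uint32_t sketch_be32_to_cpu(uint32_t x)
{
    return sketch_cpu_to_be32(x);
}
```

Note that nothing here depends on the numeric value of an endianness macro, only on which branch of the preprocessor conditional is taken.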
Signed-off-by: Lin Liu <lin.liu@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
v6:
* Fix typos
v5:
* Rebase substantially
* Drop PASTE(). It doesn't work when BITS_PER_LONG isn't a plain integer
* Simplify in light of new toolchain baseline
Andrew Cooper [Fri, 28 Mar 2025 10:13:25 +0000 (10:13 +0000)]
xen: Remove __{BIG,LITTLE}_ENDIAN_BITFIELD
There is a singular user. It's unlikely we'll gain a big-endian build of Xen,
but it's far more unlikely that bitfields will differ from main endianness.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
I'm tempted to simply drop the logic in maptrack_node. If any big-endian
build of Xen came along, that's probably the least of its worries.
Andrew Cooper [Fri, 28 Mar 2025 10:04:31 +0000 (10:04 +0000)]
xen/lzo: Remove more remnants of TMEM
This logic was inserted by commit 447f613c5404 ("lzo: update LZO compression
to current upstream version") but was only relevant for the TMEM logic, so
should have been deleted in commit c492e19fdd05 ("xen: remove tmem from
hypervisor")
Fixes: c492e19fdd05 ("xen: remove tmem from hypervisor") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
--- CC: Anthony PERARD <anthony.perard@vates.tech> CC: Michal Orzel <michal.orzel@amd.com> CC: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien@xen.org> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Bertrand Marquis <bertrand.marquis@arm.com> CC: Shawn Anastasio <sanastasio@raptorengineering.com> CC: Oleksii Kurochko <oleksii.kurochko@gmail.com> CC: Daniel P. Smith <dpsmith@apertussolutions.com> CC: Lin Liu <lin.liu@citrix.com>
Notably, this also removes the singular case where anything in Xen cares about
the value in __BYTE_ORDER, __LITTLE_ENDIAN and __BIG_ENDIAN, and even then it
was only an adaptation to the MiniOS environment.
Andrew Cooper [Thu, 20 Mar 2025 14:05:58 +0000 (14:05 +0000)]
Xen: Update compiler baseline checks
We have checks in both xen/compiler.h, and Config.mk. Both are incomplete.
The check in Config.mk sees $(CC) in system and cross-compiler form, so cannot
express anything more than the global baseline. Change it to simply 5.1.
In xen/compiler.h, rewrite the expression for clarity/brevity.
Include a GCC 12.2 check for RISCV, and include a Clang 11 baseline check.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 30 Aug 2024 13:25:28 +0000 (14:25 +0100)]
ARM/vgic: Use for_each_set_bit() in vgic_mmio_write_sgir()
The bitmap_for_each() expression only inspects the bottom 8 bits of targets.
Change its type to uint8_t and use for_each_set_bit() which is more efficient
over scalars.
GICD_SGI_TARGET_LIST_MASK is 2 bits wide. Two cases discard the prior
calculation of targets, and one case exits early.
Therefore, move the GICD_SGI_TARGET_MASK calculation into the only case which
wants it, and use MASK_EXTR() to simplify the expression.
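The extraction and iteration described above can be sketched as follows. This is an illustration under stated assumptions: `SKETCH_MASK_EXTR()` mirrors the usual "mask, then divide by the lowest set bit of the mask" idiom, `SKETCH_SGI_TARGET_MASK` assumes the target list occupies bits 23:16, and the open-coded loop merely stands in for for_each_set_bit(); none of these names are taken from the Xen tree.

```c
#include <stdint.h>

/* Extract the field selected by mask m from v, shifted down to bit 0. */
#define SKETCH_MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))

/* Assumed position of the 8-bit CPU target list in the register value. */
#define SKETCH_SGI_TARGET_MASK (0xffu << 16)

/*
 * Count the targeted vCPUs and record the highest-numbered one in *last.
 * *last is left untouched when no target bit is set.
 */
unsigned int sketch_count_targets(uint32_t sgir, unsigned int *last)
{
    uint8_t targets = SKETCH_MASK_EXTR(sgir, SKETCH_SGI_TARGET_MASK);
    unsigned int n = 0;

    /* Visit each set bit once, lowest first, clearing it afterwards. */
    for ( unsigned int t = targets; t; t &= t - 1 )
    {
        *last = __builtin_ctz(t);
        n++;
    }

    return n;
}
```

Because `targets` is a scalar, the loop needs no bitmap helpers and runs once per set bit rather than once per possible bit.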
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Andrew Cooper [Wed, 26 Mar 2025 15:26:56 +0000 (15:26 +0000)]
ARM/vgic: Fix out-of-bounds accesses in vgic_mmio_write_sgir()
The switch() statement is over bits 24:25 (unshifted) of the guest provided
value. This makes case 0x3: dead, and not an implementation of the 4th
possible state.
A guest which writes (0x3 << 24) | (0xff << 16) to this register will skip the
early exit, then enter bitmap_for_each() with targets not bound by nr_vcpus.
If the guest has fewer than 8 vCPUs, bitmap_for_each() will read off the end
of d->vcpu[] and use the resulting vcpu pointer to ultimately derive irq, and
perform out-of-bounds writes.
Fix this by changing case 0x3 to default.
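The dead case can be reproduced with a simplified model. This is a sketch, not the vgic code: the `SKETCH_*` constants and both functions are hypothetical, and each function only reports whether a write would take the early-exit path.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_TARGET_LIST_SHIFT 24
#define SKETCH_TARGET_LIST_MASK  (0x3u << SKETCH_TARGET_LIST_SHIFT)

/* Buggy shape: the switch value is unshifted, so it can never equal 0x3. */
bool sketch_buggy_rejects(uint32_t val)
{
    switch ( val & SKETCH_TARGET_LIST_MASK )
    {
    case 0x0u << SKETCH_TARGET_LIST_SHIFT:
    case 0x1u << SKETCH_TARGET_LIST_SHIFT:
    case 0x2u << SKETCH_TARGET_LIST_SHIFT:
        return false;

    case 0x3: /* Dead: never matches a value masked at bits 25:24. */
        return true;
    }

    return false; /* The reserved encoding falls through to here. */
}

/* Fixed shape: anything not explicitly handled is rejected. */
bool sketch_fixed_rejects(uint32_t val)
{
    switch ( val & SKETCH_TARGET_LIST_MASK )
    {
    case 0x0u << SKETCH_TARGET_LIST_SHIFT:
    case 0x1u << SKETCH_TARGET_LIST_SHIFT:
    case 0x2u << SKETCH_TARGET_LIST_SHIFT:
        return false;

    default:
        return true;
    }
}
```

With the value `0x3 << 24`, the buggy variant does not reject the write while the fixed variant does.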
Fixes: 08c688ca6422 ("ARM: new VGIC: Add SGIR register handler") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Oleksii Kurochko [Thu, 27 Mar 2025 11:23:10 +0000 (12:23 +0100)]
xen/riscv: add H extension to -march
H provides additional instructions and CSRs that control the new stage of
address translation and support hosting a guest OS in virtual S-mode
(VS-mode).
According to the Unprivileged Architecture (version 20240411) specification:
```
Table 74 summarizes the standardized extension names. The table also defines
the canonical order in which extension names must appear in the name string,
with top-to-bottom in table indicating first-to-last in the name string, e.g.,
RV32IMACV is legal, whereas RV32IMAVC is not.
```
According to Table 74, the h extension is placed last in the one-letter
extensions name part of the ISA string.
`h` is a standalone extension based on the patch [1] but it wasn't so
before.
The minimum supported GCC version for building Xen for RISC-V is 12.2.0.
That version still treats h as a prefix for hypervisor extensions, and
requires the name of a hypervisor extension to be more than 1 letter long.
A workaround (using `hh` as the H extension name) is therefore
implemented, as otherwise the following compilation error will occur:
error: '-march=rv64gc_h_zbb_zihintpause': name of hypervisor extension
must be more than 1 letter
After GCC version 13.1.0, the commit [1] introducing H extension support
allows us to drop the workaround with `hh` as hypervisor extension name
and use only one h in -march.
Jan Beulich [Thu, 27 Mar 2025 11:22:39 +0000 (12:22 +0100)]
Arm/domctl: correct XEN_DOMCTL_vuart_op error return value
copy_to_guest() returns the number of bytes not copied; that's not what
the function should return to its caller though. Convert to returning
-EFAULT instead.
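The pattern being corrected can be sketched as below. `sketch_copy_to_guest()` is a hypothetical stand-in that only imitates the "returns the number of bytes not copied" convention; its `writable` parameter is an invention for the example and does not exist in the real interface.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Stand-in: copies at most 'writable' bytes, returns bytes NOT copied. */
static size_t sketch_copy_to_guest(void *dst, const void *src,
                                   size_t len, size_t writable)
{
    size_t n = len < writable ? len : writable;

    memcpy(dst, src, n);

    return len - n;
}

/* Buggy shape: a positive leftover count escapes as the return value. */
int sketch_op_buggy(void *dst, const void *src, size_t len, size_t writable)
{
    return (int)sketch_copy_to_guest(dst, src, len, writable);
}

/* Fixed shape: any short copy is reported as -EFAULT. */
int sketch_op_fixed(void *dst, const void *src, size_t len, size_t writable)
{
    if ( sketch_copy_to_guest(dst, src, len, writable) )
        return -EFAULT;

    return 0;
}
```

A caller that treats negative values as errors would misread the positive count from the buggy variant as success.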
Fixes: 86039f2e8c20 ("xen/arm: vpl011: Add a new domctl API to initialize vpl011") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Michal Orzel <michal.orzel@amd.com>
Jan Beulich [Thu, 27 Mar 2025 11:22:06 +0000 (12:22 +0100)]
x86/pmstat: correct get_cpufreq_para()'s error return value
copy_to_guest() returns the number of bytes not copied; that's not what
the function should return to its caller though. Convert to returning
-EFAULT instead.
Fixes: 7542c4ff00f2 ("Add user PM control interface") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 27 Mar 2025 11:21:08 +0000 (12:21 +0100)]
x86/PVH: account for module command line length
As per observation in practice, initrd->cmdline_pa is not normally zero.
Hence so far we always appended at least one byte. That alone may
already render insufficient the "allocation" made by find_memory().
Things would be worse when there's actually a (perhaps long) command
line.
Skip setup when the command line is empty. Amend the "allocation" size
by padding and actual size of module command line. Along these lines
also skip initrd setup when the initrd is zero size.
Fixes: 0ecb8eb09f9f ("x86/pvh: pass module command line to dom0") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Roger Pau Monne [Fri, 14 Mar 2025 12:37:46 +0000 (13:37 +0100)]
automation/cirrus-ci: add smoke tests for the FreeBSD builds
Introduce a basic set of smoke tests using the XTF selftest image, and run
them on QEMU. Use the matrix keyword to create a different task for each
XTF flavor on each FreeBSD build.
Roger Pau Monne [Sat, 15 Mar 2025 08:35:12 +0000 (09:35 +0100)]
automation/cirrus-ci: use matrix keyword to generate per-version build tasks
Move the current logic to use the matrix keyword to generate a task for
each version of FreeBSD we want to build Xen on. The matrix keyword
however cannot be used in YAML aliases, so it needs to be explicitly used
inside of each task, which creates a bit of duplication. At least abstract
the FreeBSD minor version numbers to avoid repetition of image names.
Note that the full build uses matrix over an env variable instead of using
it directly in image_family. This is so that the alias can also be set
based on the FreeBSD version, in preparation for adding further tasks that
will depend on the full build having finished.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Tue, 25 Mar 2025 17:55:33 +0000 (17:55 +0000)]
x86/elf: Remove ASM_CALL_CONSTRAINT from elf_core_save_regs()
I was mistaken about when ASM_CALL_CONSTRAINT is applicable. It is not
applicable for plain pushes/pops, so remove it from the flags logic.
Clarify the description of ASM_CALL_CONSTRAINT to be explicit about unwinding
using framepointers.
Fixes: 0754534b8a38 ("x86/elf: Improve code generation in elf_core_save_regs()") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
This is MOV %cr8, which is wired up for hvm_mov_{to,from}_cr(); the VMExit
fastpaths, but not for the full emulation slowpaths.
Xen's handling of %cr8 turns out to be quite wrong. At a minimum, we need
storage for %cr8 separate to APIC_TPR, and to alter intercepts based on
whether the vLAPIC is enabled or not. But that's more work than there is time
for in the short term, so make a stopgap fix.
Extend hvmemul_{read,write}_cr() with %cr8 cases. Unlike hvm_mov_to_cr(),
hardware hasn't filtered out invalid values (#GP checks are ahead of
intercepts), so introduce X86_CR8_VALID_MASK.
Reported-by: Petr Beneš <w1benny@gmail.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 24 Mar 2025 21:44:30 +0000 (21:44 +0000)]
x86/emul: Rearrange the logic in hvmemul_{read,write}_cr()
In hvmemul_read_cr(), make the TRACE()/X86EMUL_OKAY path common in preparation
for adding a %cr8 case. Use a local 'val' variable instead of always
operating on a dereferenced pointer.
In both, calculate curr once.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 20 Mar 2025 14:13:56 +0000 (14:13 +0000)]
CI: Update build tests based on new minimum toolchain requirements
Drop CentOS 7 entirely. It's way too old now.
Ubuntu 22.04 is the oldest Ubuntu with a suitable version of Clang, so swap
the 16.04 clang builds for 22.04.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Jan Beulich [Wed, 26 Mar 2025 11:32:03 +0000 (12:32 +0100)]
x86/PVH: expose OEMx ACPI tables to Dom0
What they contain we don't know, but we can't sensibly hide them. On my
Skylake system OEM1 (with a description of "INTEL CPU EIST") is what
contains all the _PCT, _PPC, and _PSS methods, i.e. about everything
needed for cpufreq. (_PSD interestingly are in an SSDT there.)
Further OEM2 there has a description of "INTEL CPU HWP", while OEM4
has "INTEL CPU CST". Pretty clearly all three need exposing for
cpufreq and cpuidle to work.
Fixes: 8b1a5268daf0 ("pvh/dom0: whitelist PVH Dom0 ACPI tables") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Wed, 26 Mar 2025 11:31:33 +0000 (12:31 +0100)]
x86/pmstat: fold two allocations in get_cpufreq_para()
There's little point in allocating two uint32_t[] arrays separately.
We'll need the bigger of the two anyway, and hence we can use that
bigger one also for transiently storing the smaller number of items.
While there also drop j (we can use i twice) and adjust the type of
the remaining two variables on that line.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 26 Mar 2025 11:30:57 +0000 (12:30 +0100)]
xenpm: sanitize allocations in show_cpufreq_para_by_cpuid()
malloc(), when passed zero size, may return NULL (the behavior is
implementation defined). Mirror the ->gov_num check to the other two
allocations as well. Don't chance then actually using a NULL in
print_cpufreq_para().
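One way to sidestep the implementation-defined zero-size case is to never pass a zero count to the allocator. The sketch below uses hypothetical names (`sketch_alloc_array()`, `sketch_sum()`) and is not the xenpm code; it only shows the "check the count first, and never touch a NULL array" shape.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Allocate 'count' zeroed entries.  Returns 0 on success, -1 on failure.
 * For count == 0 no allocation is attempted and *out is set to NULL.
 */
int sketch_alloc_array(uint32_t **out, size_t count)
{
    *out = NULL;

    if ( count == 0 )
        return 0;

    *out = calloc(count, sizeof(**out));

    return *out ? 0 : -1;
}

/* Consumer that tolerates a NULL array as long as count is zero. */
uint32_t sketch_sum(const uint32_t *arr, size_t count)
{
    uint32_t sum = 0;

    for ( size_t i = 0; arr && i < count; i++ )
        sum += arr[i];

    return sum;
}
```

With this shape, a NULL result always means either "nothing requested" or a genuine allocation failure, and the return value distinguishes the two.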
Fixes: 75e06d089d48 ("xenpm: add cpu frequency control interface, through which user can") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Jan Beulich [Tue, 25 Mar 2025 08:23:48 +0000 (09:23 +0100)]
arinc653: move next_switch_time access under lock
Even before its recent movement to the scheduler's private data
structure it looks to have been wrong to update the field under lock,
but then read it with the lock no longer held.
Coverity-ID: 1644500 Fixes: 9f0c658baedc ("arinc: add cpu-pool support to scheduler") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Nathan Studer <nathan.studer@dornerworks.com>
Denis Mukhin [Tue, 25 Mar 2025 08:22:59 +0000 (09:22 +0100)]
x86/irq: introduce APIC_VECTOR_VALID()
Add new macro APIC_VECTOR_VALID() to validate the interrupt vector
range as per [1]. This macro replaces hardcoded checks against the
open-coded value 16 in LAPIC and virtual LAPIC code and simplifies
the code a bit.
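A minimal sketch of such a macro, assuming only the architectural facts that local APIC vectors 0 to 15 are reserved and that vectors are 8 bits wide. The name `SKETCH_APIC_VECTOR_VALID()` and its exact form are hypothetical; the definition in Xen may well be written differently.

```c
#include <stdbool.h>

/* Hypothetical: a vector is usable if it lies in the range 16..255. */
#define SKETCH_APIC_VECTOR_VALID(v) ((v) >= 16 && (v) <= 255)

/* Replaces an open-coded "vector < 16" style check at a call site. */
bool sketch_accept_vector(unsigned int vector)
{
    return SKETCH_APIC_VECTOR_VALID(vector);
}
```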
Sergiy Kibrik [Mon, 24 Mar 2025 11:55:39 +0000 (12:55 +0100)]
x86: make Viridian support optional
Add config option HVM_VIRIDIAN that covers viridian code within HVM.
Calls to viridian functions are guarded by is_viridian_domain() and related macros.
Having this option may be beneficial by reducing code footprint for systems
that are not using Hyper-V.
Jan Beulich [Mon, 24 Mar 2025 11:55:24 +0000 (12:55 +0100)]
process/release: mention MAINTAINERS adjustments
For many major releases I've been updating ./MAINTAINERS _after_ the
respective branch was handed over to me. That update, however, is
relevant not only from the .1 minor release onwards, but right from the
.0 release. Hence it ought to be done as one of the last things before
tagging the tree for the new major release.
See the seemingly unrelated parts (as far as the commit subject goes) of
e.g. 9d465658b405 ("update Xen version to 4.20.1-pre") for an example.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com>
Andrew Cooper [Sat, 28 Dec 2024 14:56:40 +0000 (14:56 +0000)]
x86/traps: Introduce early_traps_init() and simplify setup
Something I overlooked when last cleaning up exception handling is that a TSS
is not necessary if IST isn't configured, and IST isn't necessary until we're
running guest code.
Introduce early_traps_init(), and rearrange the existing logic between this
and traps_init() later on boot, to allow deferring TSS and IST setup.
In early_traps_init(), load the IDT and invalidate TR/LDTR; this is sufficient
system-table setup to make exception handling work. The setup of the BSP's
per-cpu variables stays early too; they're used on certain error paths.
Move load_system_tables() later into traps_init(). Note that it already
contains enable_each_ist(), so this call is simply dropped.
This removes some complexity prior to having exception support, and lays the
groundwork to not even allocate a TSS when using FRED.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 19 Mar 2025 12:12:37 +0000 (12:12 +0000)]
x86/boot: Simplify the expression for extra allocation space
The expression for one parameter of find_memory() is already complicated and
about to become more so. Break it out into a new variable, and express it in
an easier-to-follow way.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Fixes: 84c4461b7d3a ("Force out-of-line instances of inline functions into .init.text in init-only code") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Wed, 12 Mar 2025 17:51:43 +0000 (18:51 +0100)]
kconfig/randconfig: enable UBSAN for randconfig
Introduce an additional Kconfig check to only offer the option if the
compiler supports -fsanitize=undefined.
We no longer use Travis CI, so the original motivation for not enabling
UBSAN might no longer be present. Regardless, the option won't be present in
the first place if the compiler doesn't support -fsanitize=undefined.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Mon, 17 Mar 2025 17:51:21 +0000 (18:51 +0100)]
x86/vga: fix mapping of the VGA text buffer
The call to ioremap_wc() in video_init() will always fail, because
video_init() is called ahead of vm_init_type(), and so the underlying
__vmap() call will fail to allocate the linear address space.
Fix by reverting to the previous behavior and use __va() for the VGA text
buffer, as it's below the 1MB boundary, and thus always mapped in the
directmap.
Fixes: 81d195c6c0e2 ('x86: introduce ioremap_wc()') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Wed, 5 Mar 2025 17:08:13 +0000 (18:08 +0100)]
x86/mkelf32: account for offset when detecting note segment placement
mkelf32 attempts to check that the program header defined NOTE segment falls
inside of the LOAD segment, as the build-id should be loaded for Xen at
runtime to check.
However the current code doesn't take into account the LOAD program header
segment offset when calculating overlap with the NOTE segment. This
results in incorrect detection, and the following build error:
arch/x86/boot/mkelf32 --notes xen-syms ./.xen.elf32 0x200000 \
`nm xen-syms | sed -ne 's/^\([^ ]*\) . __2M_rwdata_end$/0x\1/p'`
Expected .note section within .text section!
Offset 4244776 not within 2910364!
Account for the program header offset of the LOAD segment when checking
whether the NOTE segment is contained within it. Also fix the logic to
ensure the NOTE segment is fully contained within the LOAD segment.
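The containment test described above can be sketched as follows. `struct sketch_seg` and `sketch_note_within_load()` are hypothetical and carry only a file offset and a size; this is not the mkelf32 code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of a program header. */
struct sketch_seg {
    uint64_t offset; /* File offset of the segment. */
    uint64_t filesz; /* Size of the segment in the file. */
};

/*
 * True if [note->offset, note->offset + note->filesz) lies entirely inside
 * [load->offset, load->offset + load->filesz).  The comparisons are
 * arranged so that no addition can overflow.
 */
bool sketch_note_within_load(const struct sketch_seg *load,
                             const struct sketch_seg *note)
{
    return note->offset >= load->offset &&
           note->filesz <= load->filesz &&
           note->offset - load->offset <= load->filesz - note->filesz;
}
```

Comparing the NOTE offset against the LOAD size alone, without subtracting the LOAD offset, is the kind of check that rejects a NOTE segment which is in fact inside the LOAD segment.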
Fixes: a353cab905af ('build_id: Provide ld-embedded build-ids') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Mon, 17 Mar 2025 09:31:07 +0000 (10:31 +0100)]
automation/console.exp: do not assume expect is always at /usr/bin/
Instead use env to find the location of expect.
Additionally do not use the -f flag, as it's only meaningful when passing
arguments on the command line, which we never do for console.exp. From the
expect 5.45.4 man page:
> The -f flag prefaces a file from which to read commands from. The flag
> itself is optional as it is only useful when using the #! notation (see
> above), so that other arguments may be supplied on the command line.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Roger Pau Monne [Tue, 18 Mar 2025 08:20:59 +0000 (09:20 +0100)]
x86/shadow: fix UB pointer arithmetic in sh_mfn_is_a_page_table()
UBSAN complains with:
UBSAN: Undefined behaviour in arch/x86/mm/shadow/private.h:515:30
pointer operation overflowed ffff82e000000000 to ffff82dfffffffe0
[...]
Xen call trace:
[<ffff82d040303782>] R common/ubsan/ubsan.c#ubsan_epilogue+0xa/0xc0
[<ffff82d040304bc3>] F __ubsan_handle_pointer_overflow+0xcb/0x100
[<ffff82d040471b2d>] F arch/x86/mm/shadow/guest_2.c#sh_page_fault__guest_2+0x1e350
[<ffff82d0403b206b>] F svm_vmexit_handler+0xdf3/0x2450
[<ffff82d0402049c0>] F svm_stgi_label+0x5/0x15
Fix by moving the call to mfn_to_page() after the check of whether the
passed gmfn is valid. This avoids the call to mfn_to_page() with an
INVALID_MFN parameter.
While there make the page local variable const, it's not modified by the
function.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
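The reordering can be sketched with toy stand-ins. Everything prefixed toy_/TOY_ below is invented for illustration and is not Xen's definition; the point is only that the validity check comes before any pointer is derived from the MFN:

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; the names echo Xen's but these are not Xen's definitions. */
#define TOY_INVALID_MFN (~0UL)
#define TOY_NR_FRAMES   16UL

struct toy_page { unsigned int shadow_flags; };

static struct toy_page toy_frame_table[TOY_NR_FRAMES];

static bool toy_mfn_valid(unsigned long mfn)
{
    return mfn < TOY_NR_FRAMES;
}

static bool toy_mfn_is_a_page_table(unsigned long gmfn)
{
    const struct toy_page *page;

    /*
     * Validate first: indexing the frame table with an invalid MFN would
     * compute a pointer outside of the array, which is undefined behaviour.
     */
    if ( !toy_mfn_valid(gmfn) )
        return false;

    page = &toy_frame_table[gmfn];

    return page->shadow_flags != 0;
}
```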
Adjust the calculations in COMPAT_ARG_XLAT_VIRT_BASE to subtract from the
per-domain area to obtain the mirrored linear address in the 4th slot,
instead of overflowing the per-domain linear address.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Fri, 14 Mar 2025 09:40:49 +0000 (10:40 +0100)]
x86/wait: prevent duplicated assembly labels
When enabling UBSAN with clang, the following error is triggered during the
build:
common/wait.c:154:9: error: symbol '.L_wq_resume' is already defined
154 | "push %%rbx; push %%rbp; push %%r12;"
| ^
<inline asm>:1:121: note: instantiated into assembly here
1 | push %rbx; push %rbp; push %r12;push %r13; push %r14; push %r15;sub %esp,%ecx;cmp $4096, %ecx;ja .L_skip;mov %rsp,%rsi;.L_wq_resume: rep movsb;mov %rsp,%rsi;.L_skip:pop %r15; pop %r14; pop %r13;pop %r12; pop %rbp; pop %rbx
| ^
common/wait.c:154:9: error: symbol '.L_skip' is already defined
154 | "push %%rbx; push %%rbp; push %%r12;"
| ^
<inline asm>:1:159: note: instantiated into assembly here
1 | push %rbx; push %rbp; push %r12;push %r13; push %r14; push %r15;sub %esp,%ecx;cmp $4096, %ecx;ja .L_skip;mov %rsp,%rsi;.L_wq_resume: rep movsb;mov %rsp,%rsi;.L_skip:pop %r15; pop %r14; pop %r13;pop %r12; pop %rbp; pop %rbx
| ^
2 errors generated.
The inline assembly block in __prepare_to_wait() is duplicated, thus
leading to multiple definitions of the otherwise unique labels inside the
assembly block. GCC extended-asm documentation notes the possibility of
duplicating asm blocks:
> Under certain circumstances, GCC may duplicate (or remove duplicates of)
> your assembly code when optimizing. This can lead to unexpected duplicate
> symbol errors during compilation if your asm code defines symbols or
> labels. Using ‘%=’ (see AssemblerTemplate) may help resolve this problem.
Work around the issue by latching esp to a local variable; this prevents
clang from duplicating the inline asm blocks.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
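For reference, the '%=' mechanism mentioned in the quoted GCC documentation can be sketched as below. This is not the approach taken by this commit (which latches esp to a local variable instead), and the function is a made-up example; it falls back to plain C on targets other than x86-64:

```c
/*
 * Illustration only: clamp v to limit with a conditional jump over a label.
 * "%=" expands to a number unique to each instance of the asm statement, so
 * a duplicated or inlined copy gets its own label name.
 */
static inline unsigned int clamp_to_limit(unsigned int v, unsigned int limit)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ ( "cmp %[lim], %[v]\n\t"   /* compare v against limit */
              "jbe .L_done_%=\n\t"     /* v <= limit (unsigned): keep v */
              "mov %[lim], %[v]\n"     /* otherwise v = limit */
              ".L_done_%=:"
              : [v] "+r" (v)
              : [lim] "r" (limit)
              : "cc" );
#else
    /* Portable fallback so the sketch builds on other targets. */
    if ( v > limit )
        v = limit;
#endif
    return v;
}
```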
Roger Pau Monne [Tue, 18 Mar 2025 08:31:35 +0000 (09:31 +0100)]
x86/msi: always propagate MSI register writes from __setup_msi_irq()
After 8e60d47cf011 writes from __setup_msi_irq() will no longer be
propagated to the MSI registers if the IOMMU IRTE was already allocated.
Given the purpose of __setup_msi_irq() is MSI initialization, always
propagate the write to the hardware, regardless of whether the IRTE was
already allocated.
No functional change expected, as the write should always be propagated in
__setup_msi_irq(), but make it explicit on the write_msi_msg() call.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Mon, 17 Mar 2025 14:40:11 +0000 (15:40 +0100)]
x86/msi: always propagate MSI writes when not in active system mode
Relax the limitation on MSI register writes, and only apply it when the
system is in active state. For example AMD IOMMU drivers rely on using
set_msi_affinity() to force an MSI register write on resume from
suspension.
The original patch intention was to reduce the number of MSI register
writes when the system is in active state. Leave the other states to
always perform the writes, as it's safer given the existing code, and it's
expected to not make a difference performance wise.
For such propagation to work even when the IRT index is not updated the MSI
message must be adjusted in all success cases for AMD IOMMU, not just when
the index has been newly allocated.
Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Fixes: 8e60d47cf011 ('x86/iommu: avoid MSI address and data writes if IRT index hasn't changed') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Jan Beulich [Thu, 20 Mar 2025 07:51:55 +0000 (08:51 +0100)]
x86/setup: correct off-by-1 in module mapping
If a module's length is an exact multiple of PAGE_SIZE, the 2nd argument
passed to set_pdx_range() would be one larger than intended. Use
PFN_{UP,DOWN}() there instead.
Fixes: cd7cc5320bb2 ("x86/boot: add start and size fields to struct boot_module") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
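The off-by-one can be illustrated with a small sketch. The macros follow the usual PFN_UP()/PFN_DOWN() rounding semantics with 4k pages; frames_covered_plus_one() is a hypothetical "round the end down, then add one" form used purely for comparison, and is not claimed to be the code that was replaced:

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Round a byte address down / up to a page frame number. */
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)
#define PFN_UP(x)   (((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)

/* Number of frames covered by [start, start + size), computed by rounding. */
static uint64_t frames_covered(uint64_t start, uint64_t size)
{
    return PFN_UP(start + size) - PFN_DOWN(start);
}

/* Hypothetical "round the end down, then add one" form, for comparison. */
static uint64_t frames_covered_plus_one(uint64_t start, uint64_t size)
{
    return PFN_DOWN(start + size) + 1 - PFN_DOWN(start);
}
```

The two forms agree when the end of the module falls in the middle of a page, and differ by one when the length is an exact multiple of PAGE_SIZE.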
Andrew Cooper [Fri, 7 Mar 2025 17:29:10 +0000 (17:29 +0000)]
xen: Update toolchain requirements to GCC 5.1/Binutils 2.25 or Clang/LLVM 11
GCC 4.1.2 is from 2007, and Binutils 2.16 is a similar vintage. Clang 3.5 is
from 2014. Supporting toolchains this old is a massive development and
testing burden.
Set a minimum baseline of GCC 5.1 across the board, along with Binutils 2.25
which is the same age. These were chosen *3 years ago* as Linux's minimum
requirements because even back then, they were ubiquitous in distros. Choose
Clang/LLVM 11 as a baseline for similar reasons; the Linux commit making this
change two years ago cites a laundry list of code generation bugs.
This will allow us to retire a lot of compatibility logic, and start using new
features previously unavailable because of no viable compatibility option.
Merge the ARM 32bit and 64bit sections now they're the same.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Julien Grall <jgrall@amazon.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Wed, 19 Mar 2025 02:58:18 +0000 (02:58 +0000)]
x86/mm: Fix IS_ALIGNED() check in IS_LnE_ALIGNED()
The current CI failures turn out to be a latent bug triggered by a narrow set
of properties of the initrd and the host memory map, which CI encountered by
chance.
One step during boot involves constructing directmap mappings for modules.
With some probing at the point of creation, it is observed that there's a 4k
mapping missing towards the end of the initrd.
The conditions for this bug appear to be map_pages_to_xen() call with a start
address of exactly 4k beyond a 2M boundary, some number of full 2M pages, then
a tail needing 4k pages.
Anyway, the condition for spotting superpage boundaries in map_pages_to_xen()
is wrong. The IS_ALIGNED() macro expects a power of two for the alignment
argument, and subtracts 1 itself.
Fixing this causes the failing case to now boot.
Fixes: 97fb6fcf26e8 ("x86/mm: introduce helpers to detect super page alignment") Debugged-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
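A sketch of why the argument matters, assuming the faulty check passed an already-decremented mask where IS_ALIGNED() expects the power-of-two alignment itself. The helper names and PAGES_PER_2M are made up for the example:

```c
#include <stdbool.h>

/* Same shape as the macro described above: 'a' must be a power of two. */
#define IS_ALIGNED(val, a) (((val) & ((a) - 1)) == 0)

#define PAGES_PER_2M 512UL   /* number of 4k pages in one 2M superpage */

/* Correct: pass the power-of-two alignment itself. */
static bool pfn_is_2m_aligned(unsigned long pfn)
{
    return IS_ALIGNED(pfn, PAGES_PER_2M);
}

/* Buggy pattern: an already-decremented mask ends up testing (pfn & 510). */
static bool pfn_is_2m_aligned_buggy(unsigned long pfn)
{
    return IS_ALIGNED(pfn, PAGES_PER_2M - 1);
}
```

With the buggy form, bit 0 of the frame number is ignored, so a frame exactly one 4k page past a 2M boundary is wrongly reported as superpage aligned. This lines up with the triggering condition described in the commit message.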
Jiqian Chen [Tue, 18 Mar 2025 08:48:00 +0000 (09:48 +0100)]
CHANGELOG.md: Mention PCI passthrough for HVM domUs
PCI passthrough is already supported for HVM domUs when dom0 is PVH
on x86. The last related patch on the QEMU side was merged after the Xen 4.20
release, so mention this feature in the Xen 4.21 entry.
SR-IOV is not yet supported on PVH dom0, so add a note for it.
Juergen Gross [Tue, 18 Mar 2025 08:47:45 +0000 (09:47 +0100)]
tools/xenstored: use xenmanage_poll_changed_domain()
Instead of checking each known domain after having received a
VIRQ_DOM_EXC event, use the new xenmanage_poll_changed_domain()
function for directly getting the domid of a domain having changed
its state.
A test doing "xl shutdown" of 1000 guests has shown the consumed cpu time
of xenstored to be reduced by 6% with this change applied.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Jan Beulich [Tue, 18 Mar 2025 08:44:57 +0000 (09:44 +0100)]
symbols: don't over-align generated data
x86 is one of the few architectures where .align has the same meaning as
.balign; most other architectures (Arm, PPC, and RISC-V in particular)
give it the same meaning as .p2align. Aligning every one of these items
to 256 bytes (on all 64-bit architectures except x86-64) is clearly too
much.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
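The arithmetic behind the 256-byte figure, as a sketch. It assumes the generated data used an operand of 8, which is inferred from the numbers above rather than quoted from the patch:

```c
/* Bytes of alignment requested by ".balign n": n itself. */
static unsigned long balign_bytes(unsigned int n)
{
    return n;
}

/* Bytes of alignment requested by ".p2align n": 2 to the power n. */
static unsigned long p2align_bytes(unsigned int n)
{
    return 1UL << n;
}
```

An operand of 8 therefore asks for 8 bytes where .align behaves like .balign, and for 256 bytes where it behaves like .p2align.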
tools: Mark ACPI SDTs as NVS in the PVH build path
Commit cefeffc7e583 marked ACPI tables as NVS in the hvmloader path
because SeaBIOS may otherwise just mark it as RAM. There is, however,
yet another reason to do it even in the PVH path. Xen's incarnation of
AML relies on having access to some ACPI tables (e.g: _STA of Processor
objects relies on reading the processor online bit in its MADT entry).
This is problematic if the OS tries to reclaim ACPI memory for page
tables as it's needed for runtime and can't be reclaimed after the OSPM
is up and running.
Fixes: de6d188a519f ("hvmloader: flip "ACPI data" to "ACPI NVS" type for ACPI table region") Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 11 Jun 2024 19:03:32 +0000 (20:03 +0100)]
x86/hvm: Use for_each_set_bit() in hvm_emulate_writeback()
... which is more concise than the opencoded form, and more efficient when
compiled.
Furthermore, now that find_{first,next}_bit() are no longer in use, the
seg_reg_{accessed,dirty} fields aren't forced to be unsigned long, although
they do need to remain unsigned int because of __set_bit() elsewhere.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
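A simplified sketch of what a for_each_set_bit()-style loop over an unsigned int looks like. toy_for_each_set_bit() is a stand-in written for this example, relies on the GCC/clang __builtin_ctz() builtin, and is not Xen's macro:

```c
/*
 * Simplified stand-in for a for_each_set_bit()-style loop over an
 * unsigned int; Xen's real macro is more general.
 */
#define toy_for_each_set_bit(bit, val)                              \
    for ( unsigned int v_ = (val);                                  \
          v_ && ((bit) = (unsigned int)__builtin_ctz(v_), 1);       \
          v_ &= v_ - 1 )   /* clear the lowest set bit each round */

/* Example use: sum the indices of all set bits. */
static unsigned int sum_of_set_bit_indices(unsigned int mask)
{
    unsigned int bit, sum = 0;

    toy_for_each_set_bit ( bit, mask )
        sum += bit;

    return sum;
}
```

The loop body runs once per set bit, rather than once per bit position with a test inside.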
Andrew Cooper [Tue, 31 Dec 2024 16:52:39 +0000 (16:52 +0000)]
x86/boot: Fix zap_low_mappings() to map less of the trampoline
Regular data access into the trampoline is via the directmap.
As now discussed quite extensively in asm/trampoline.h, the trampoline is
arranged so that only the AP and S3 paths need an identity mapping, and that
they fit within a single page.
Right now, PFN_UP(trampoline_end - trampoline_start) is 2, causing more than
expected of the trampoline to be mapped. Cut it down to just the single page it
ought to be.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Thu, 13 Mar 2025 11:19:48 +0000 (12:19 +0100)]
x86/ioremap: prevent additions against the NULL pointer
This was reported by clang UBSAN as:
UBSAN: Undefined behaviour in arch/x86/mm.c:6297:40
applying zero offset to null pointer
[...]
Xen call trace:
[<ffff82d040303662>] R common/ubsan/ubsan.c#ubsan_epilogue+0xa/0xc0
[<ffff82d040304aa3>] F __ubsan_handle_pointer_overflow+0xcb/0x100
[<ffff82d0406ebbc0>] F ioremap_wc+0xc8/0xe0
[<ffff82d0406c3728>] F video_init+0xd0/0x180
[<ffff82d0406ab6f5>] F console_init_preirq+0x3d/0x220
[<ffff82d0406f1876>] F __start_xen+0x68e/0x5530
[<ffff82d04020482e>] F __high_start+0x8e/0x90
Fix bt_ioremap() and ioremap{,_wc}() to not add the offset if the returned
pointer from __vmap() is NULL.
Fixes: d0d4635d034f ('implement vmap()') Fixes: f390941a92f1 ('x86/DMI: fix table mapping when one lives above 1Mb') Fixes: 81d195c6c0e2 ('x86: introduce ioremap_wc()') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
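The shape of the fix, sketched with toy functions. toy_map_pages() stands in for __vmap() and simply hands back a static buffer or NULL; neither function is Xen code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapper standing in for __vmap(): may return NULL on failure. */
static void *toy_map_pages(uintptr_t page_base, int fail)
{
    static char backing[8192];

    (void)page_base;
    return fail ? NULL : backing;
}

static void *toy_ioremap(uintptr_t pa, int fail)
{
    uintptr_t offs = pa & 0xfff;            /* offset within the 4k page */
    char *va = toy_map_pages(pa & ~(uintptr_t)0xfff, fail);

    /* Only apply the offset to a real mapping; keep failure as plain NULL. */
    return va ? va + offs : NULL;
}
```

Besides avoiding the undefined arithmetic, this keeps a failed mapping recognisable as NULL by the caller even when the in-page offset is non-zero.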
Roger Pau Monne [Thu, 13 Mar 2025 10:08:05 +0000 (11:08 +0100)]
x86/dom0: placate GCC 12 compile-time errors with UBSAN and PVH_GUEST
When building Xen with GCC 12 with UBSAN and PVH_GUEST both enabled the
compiler emits the following errors:
arch/x86/setup.c: In function '__start_xen':
arch/x86/setup.c:1504:19: error: 'consider_modules' reading 40 bytes from a region of size 4 [-Werror=stringop-overread]
1504 | end = consider_modules(s, e, reloc_size + mask,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1505 | bi->mods, bi->nr_modules, -1);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/setup.c:1504:19: note: referencing argument 4 of type 'const struct boot_module[0]'
arch/x86/setup.c:686:24: note: in a call to function 'consider_modules'
686 | static uint64_t __init consider_modules(
| ^~~~~~~~~~~~~~~~
arch/x86/setup.c:1535:19: error: 'consider_modules' reading 40 bytes from a region of size 4 [-Werror=stringop-overread]
1535 | end = consider_modules(s, e, size, bi->mods,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1536 | bi->nr_modules + relocated, j);
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
arch/x86/setup.c:1535:19: note: referencing argument 4 of type 'const struct boot_module[0]'
arch/x86/setup.c:686:24: note: in a call to function 'consider_modules'
686 | static uint64_t __init consider_modules(
| ^~~~~~~~~~~~~~~~
This seems to be the result of some function manipulation done by UBSAN
triggering GCC stringops related errors. Placate the errors by declaring
the function parameter as `const struct boot_module *` instead of `const
struct boot_module[]`.
Note that GCC 13 seems to be fixed, and doesn't trigger the error when
using `[]`.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Wed, 12 Mar 2025 12:35:53 +0000 (13:35 +0100)]
xen/ubsan: provide helper for clang's -fsanitize=function
clang's -fsanitize=function relies on the presence of
__ubsan_handle_function_type_mismatch() to print the detection of indirect
calls of a function through a function pointer of the wrong type.
Implement the helper, inspired by the llvm ubsan lib implementation.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Extend coverage of CONFIG_VM_EVENT option and make the build of VM events
and monitoring support optional. Also make MEM_PAGING option depend on VM_EVENT
to document that mem_paging is relying on vm_event.
This is to reduce code size on Arm when this option isn't enabled.
Sergiy Kibrik [Fri, 14 Mar 2025 05:23:14 +0000 (07:23 +0200)]
x86:monitor: control monitor.c build with CONFIG_VM_EVENT option
Replace more general CONFIG_HVM option with CONFIG_VM_EVENT which is more
relevant and specific to monitoring. This is only to clarify at build level
to which subsystem this file belongs.
No functional change here, as VM_EVENT depends on HVM.
Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Sergiy Kibrik [Fri, 14 Mar 2025 05:21:09 +0000 (07:21 +0200)]
xen: kconfig: rename MEM_ACCESS -> VM_EVENT
Use more generic CONFIG_VM_EVENT name throughout Xen code instead of
CONFIG_MEM_ACCESS. This reflects the fact that vm_event is a higher level
feature, with mem_access & monitor depending on it.
Suggested-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Andrew Cooper [Sun, 29 Dec 2024 14:06:18 +0000 (14:06 +0000)]
x86/elf: Improve code generation in elf_core_save_regs()
A CALL with 0 displacement is handled specially, and is why this logic
functions even with CET Shadow Stacks active. Nevertheless a RIP-relative LEA
is the more normal way of doing this in 64bit code.
The retrieval of flags modifies the stack pointer so needs to state a
dependency on the stack pointer. Despite its name, ASM_CALL_CONSTRAINT is
the way to do this.
read_sreg() forces the answer through a register, causing code generation of
the form:
Jan Beulich [Fri, 14 Mar 2025 09:18:34 +0000 (10:18 +0100)]
VT-d: have set_msi_source_id() return a success indicator
Handling possible internal errors by just emitting a (debug-build-only)
log message can't be quite enough. Return error codes in those cases,
and have the caller propagate those up.
Drop a pointless return path, rather than "inventing" an error code for
it.
While touching the function declarator anyway also constify its first
parameter.
Fixes: 476bbccc811c ("VT-d: fix MSI source-id of interrupt remapping") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 14 Mar 2025 09:18:12 +0000 (10:18 +0100)]
VT-d: move obtaining of MSI/HPET source ID
This was the original attempt to address XSA-467, until it was found
that IRQs can be off already from higher up the call stack. Nevertheless
moving code out of locked regions is generally desirable anyway; some of
the callers, after all, don't disable interrupts or acquire other locks.
Hence, despite this not addressing the original report:
Data collection solely depends on the passed in PCI device. Furthermore,
since the function only writes to a local variable, we can pull the
invocation of set_msi_source_id() (and also set_hpet_source_id()) ahead
of the acquiring of the (IRQ-safe) lock.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Fri, 14 Mar 2025 09:17:11 +0000 (10:17 +0100)]
xen/sched: fix arinc653 to not use variables across cpupools
a653sched_do_schedule() is using two function local static variables,
which is resulting in bad behavior when using more than one cpupool
with the arinc653 scheduler.
Fix that by moving those variables to the scheduler private data.
Fixes: 22787f2e107c ("ARINC 653 scheduler") Reported-by: Choi Anderson <Anderson.Choi@boeing.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Nathan Studer <nathan.studer@dornerworks.com>
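Why function-local statics misbehave with more than one scheduler instance can be sketched as follows. The structure, field, and function names are hypothetical and much simpler than the arinc653 code:

```c
/* Per-scheduler-instance state; one of these would exist per cpupool. */
struct toy_sched_priv {
    unsigned int slot_index;
};

/* Problematic form: the static is shared by every caller. */
static unsigned int next_slot_shared(unsigned int nr_slots)
{
    static unsigned int slot_index;

    slot_index = (slot_index + 1) % nr_slots;
    return slot_index;
}

/* Fixed form: the state lives in the instance passed in. */
static unsigned int next_slot(struct toy_sched_priv *priv,
                              unsigned int nr_slots)
{
    priv->slot_index = (priv->slot_index + 1) % nr_slots;
    return priv->slot_index;
}
```

With the shared form, two instances calling in turn advance the same counter, so each sees the other's progress. With the per-instance form each instance steps through its own sequence.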
Jan Beulich [Thu, 13 Mar 2025 09:24:15 +0000 (10:24 +0100)]
x86/shadow: replace p2m_is_valid() uses
The justification for dropping p2m_mmio_dm from p2m_is_valid() was wrong
for two of the shadow mode uses.
In _sh_propagate() we want to create special L1 entries for p2m_mmio_dm
pages. Hence we need to make sure we don't bail early for that type.
In _sh_page_fault() we want to handle p2m_mmio_dm by forwarding to
(internal or external) emulation. Pull the !p2m_is_mmio() check out of
the || expression (as otherwise it would need adding to the lhs as
well).
In both cases, p2m_is_valid() in combination with p2m_is_grant() still
doesn't cover foreign mappings. Hence use p2m_is_any_ram() plus (as
necessary) p2m_mmio_* instead.
Fixes: be59cceb2dbb ("x86/P2M: don't include MMIO_DM in p2m_is_valid()") Reported-by: Luca Fancellu <Luca.Fancellu@arm.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Luca Fancellu <luca.fancellu@arm.com>
Jason Andryuk [Thu, 13 Mar 2025 09:23:52 +0000 (10:23 +0100)]
tools/libxl: Skip missing PCI GSIs
A PCI device may not have a legacy IRQ. In that case, we don't need to
do anything, so don't fail in libxl__arch_hvm_map_gsi() and
libxl__arch_hvm_unmap_gsi().
Requires an updated pciback to return -ENOENT.
Fixes: f97f885c7198 ("tools: Add new function to do PIRQ (un)map on PVH dom0") Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
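The intended error handling can be sketched as below. Both functions and the GSI value are invented for illustration, and the real libxl code paths are more involved:

```c
#include <errno.h>

/* Hypothetical stand-in for the GSI lookup: negative errno on failure. */
static int toy_get_gsi(int has_legacy_irq, int broken)
{
    if ( broken )
        return -EIO;

    return has_legacy_irq ? 42 : -ENOENT;
}

/* Returns 0 on success or when there is nothing to do, negative on error. */
static int toy_map_gsi(int has_legacy_irq, int broken)
{
    int gsi = toy_get_gsi(has_legacy_irq, broken);

    if ( gsi == -ENOENT )
        return 0;       /* no legacy IRQ: nothing to map, not a failure */

    if ( gsi < 0 )
        return gsi;     /* genuine error: propagate */

    /* ... the mapping of 'gsi' would happen here ... */
    return 0;
}
```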
Jason Andryuk [Thu, 13 Mar 2025 09:23:42 +0000 (10:23 +0100)]
tools/ctrl: Silence missing GSI in xc_pcidev_get_gsi()
It is valid for a PCI device to not have a legacy IRQ. In that case, do
not print an error to keep the logs clean.
This relies on pciback being updated to return -ENOENT for a missing
GSI.
Fixes: b93e5981d258 ("tools: Add new function to get gsi from dev") Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Jan Beulich [Thu, 13 Mar 2025 09:23:10 +0000 (10:23 +0100)]
libxl: avoid infinite loop in libxl__remove_directory()
Infinitely retrying the rmdir() invocation makes little sense. While the
original observation was the log filling the disk (due to repeated
"Directory not empty" errors, in turn occurring for unclear reasons),
the loop wants breaking even if there was no error message being logged
(much like is done in the similar loops in libxl__remove_file() and
libxl__remove_file_or_directory()).
Fixes: c4dcbee67e6d ("libxl: provide libxl__remove_file et al") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech>
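A minimal sketch of a removal loop that always terminates, assuming the only condition worth retrying is an interrupted system call. The helper is hypothetical and is not libxl's implementation:

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper, not libxl's code: remove a directory, retrying only
 * when rmdir() was interrupted by a signal. Any other error ends the loop.
 */
static int remove_directory_bounded(const char *path)
{
    for ( ;; )
    {
        if ( rmdir(path) == 0 )
            return 0;

        if ( errno == EINTR )
            continue;           /* transient: try again */

        return -1;              /* e.g. ENOTEMPTY: give up, errno is set */
    }
}
```

A persistent failure such as "Directory not empty" is then reported once to the caller instead of being retried forever.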