xenbits.xensource.com Git - xen.git/log
6 months ago xen/arm: dom0less: cope with missing /gic phandle
Stewart Hildebrand [Fri, 11 Oct 2024 21:19:56 +0000 (17:19 -0400)]
xen/arm: dom0less: cope with missing /gic phandle

If a partial DT has a /gic node, but no references to it, dtc may omit
the phandle property. With the phandle property missing,
fdt_get_phandle() returns 0, leading Xen to generate a malformed domU
dtb due to invalid interrupt-parent phandle references. 0 is an invalid
phandle value. Add a zero check, and fall back to GUEST_PHANDLE_GIC.

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
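
The zero check described above can be sketched as follows. This is a minimal illustration, not the code from the patch: choose_gic_phandle() is a hypothetical helper, its argument stands in for the value fdt_get_phandle() returned for the /gic node, and the fallback constant is a placeholder for GUEST_PHANDLE_GIC.

```c
#include <stdint.h>

/* Placeholder value; the real GUEST_PHANDLE_GIC lives in Xen's public Arm headers. */
#define EXAMPLE_GUEST_PHANDLE_GIC 65000u

/*
 * Hypothetical helper. 'found' stands in for the result of fdt_get_phandle()
 * on the /gic node of the partial DT. A result of 0 means the node carries
 * no phandle property, and 0 is never a valid phandle.
 */
static uint32_t choose_gic_phandle(uint32_t found)
{
    /* Fall back to the fixed guest phandle instead of emitting 0. */
    return found != 0 ? found : EXAMPLE_GUEST_PHANDLE_GIC;
}
```

With this shape, interrupt-parent references in the generated domU dtb always carry a non-zero phandle, whether or not dtc emitted one for /gic.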
6 months ago device-tree: Move dt-overlay.c to common/device-tree/
Michal Orzel [Thu, 10 Oct 2024 10:57:46 +0000 (12:57 +0200)]
device-tree: Move dt-overlay.c to common/device-tree/

The code is DT specific and as such should be placed under the common
directory for DT related files. Update the MAINTAINERS file accordingly
and drop the line with a path from the top-level comment in dt-overlay.c:
it serves no purpose and has to be updated on every code movement.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 months ago x86emul/test: drop Xeon Phi S/G prefetch special case
Jan Beulich [Thu, 17 Oct 2024 12:14:51 +0000 (14:14 +0200)]
x86emul/test: drop Xeon Phi S/G prefetch special case

Another leftover from the dropping of Xeon Phi support.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 months ago x86emul/test: correct loop body indentation in evex-disp8.c:test_one()
Jan Beulich [Thu, 17 Oct 2024 12:14:31 +0000 (14:14 +0200)]
x86emul/test: correct loop body indentation in evex-disp8.c:test_one()

For some reason I entirely consistently screwed these up.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
6 months ago docs: update documentation of reboot param
Marek Marczykowski-Górecki [Thu, 17 Oct 2024 12:13:50 +0000 (14:13 +0200)]
docs: update documentation of reboot param

Reflect changed default mode, and fix formatting of `efi` value.

Fixes: d81dd3130351 ("x86/shutdown: change default reboot method preference")
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago x86/boot: Improve MBI2 structure check
Frediano Ziglio [Tue, 15 Oct 2024 08:25:13 +0000 (09:25 +0100)]
x86/boot: Improve MBI2 structure check

A tag structure must contain at least the tag header, and the entire
tag structure must be contained inside the MBI2 data.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
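
The two conditions above can be sketched as a standalone check. This is an illustration under assumptions, not the code in mbi2.c: the struct and function names are hypothetical, and only the generic Multiboot2 tag header layout (a 32-bit type followed by a 32-bit size) is relied upon.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Generic Multiboot2 tag header: every tag starts with a type and a size. */
struct example_mb2_tag {
    uint32_t type;
    uint32_t size;   /* size of the whole tag, header included */
};

/*
 * Hypothetical validity check. 'offset' is where the tag starts inside the
 * MBI2 blob and 'total_size' is the size of the whole blob.
 */
static bool example_tag_is_valid(const uint8_t *mbi, uint32_t total_size,
                                 uint32_t offset)
{
    struct example_mb2_tag tag;

    /* The header itself must fit in the remaining data. */
    if ( offset > total_size || total_size - offset < sizeof(tag) )
        return false;

    memcpy(&tag, mbi + offset, sizeof(tag));

    /* The tag must be at least as large as its own header. */
    if ( tag.size < sizeof(tag) )
        return false;

    /* The whole tag must end inside the MBI2 data. */
    return tag.size <= total_size - offset;
}
```

Writing the bounds as subtractions from total_size avoids overflow when offset and size are added together.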
6 months ago x86/boot: Align mbi2.c stack to 16 bytes
Frediano Ziglio [Tue, 15 Oct 2024 08:25:12 +0000 (09:25 +0100)]
x86/boot: Align mbi2.c stack to 16 bytes

Most of Xen is built with a stack alignment of 8 bytes, but the UEFI spec
mandates 16 and UEFI services will fault if the stack is misaligned.

While the caller of efi_multiboot2_prelude() takes care to align the stack,
mbi2.c accidentally got the Xen-wide default of 8, and has a 50% chance of
crashing depending on how many variables the compiler decided to spill to the
stack.

Compile mbi2.c with the appropriate alignment for UEFI functionality.

Also take the opportunity to make it a fully .init object.

Fixes: eb21ce14d709 ('x86/boot: Rewrite EFI/MBI2 code partly in C')
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
[rewrite the commit message]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 months ago xen/public: add comments regarding interface version bumps
Juergen Gross [Tue, 15 Oct 2024 12:24:45 +0000 (14:24 +0200)]
xen/public: add comments regarding interface version bumps

domctl.h and sysctl.h have an interface version, which needs to be
bumped in case of incompatible modifications of the interface.

In order to avoid misunderstandings, add a comment to both headers
specifying in which cases a bump is needed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
6 months ago x86/boot: Prep work for 32bit object changes
Frediano Ziglio [Tue, 15 Oct 2024 12:24:25 +0000 (14:24 +0200)]
x86/boot: Prep work for 32bit object changes

Broken out of the subsequent patch for clarity.

 * Rename head-bin-objs to obj32
 * Use a .32.o suffix to distinguish these objects
 * Factor out $(LD32)

No functional change.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago iommu/amd-vi: do not error if device referenced in IVMD is not behind any IOMMU
Roger Pau Monné [Tue, 15 Oct 2024 12:23:59 +0000 (14:23 +0200)]
iommu/amd-vi: do not error if device referenced in IVMD is not behind any IOMMU

The IVMD table describes memory ranges that must be assigned to devices
(and the permissions to use for them), or memory that must never be
accessible to devices.

Some hardware however contains ranges in IVMD that reference devices outside of
the IVHD tables (in other words, devices not behind any IOMMU).  Such a mismatch
causes Xen to fail in register_range_for_device(), ultimately leading to
the IOMMU being disabled, and to Xen crashing, as x2APIC support might already
be enabled and relying on the IOMMU functionality.

Relax IVMD parsing: allow IVMD blocks to reference devices not assigned to any
IOMMU.  It's impossible for Xen to fulfill the requirement in the IVMD block if
the device is not behind any IOMMU, but it's no worse than booting without
IOMMU support, and thus not parsing ACPI IVRS in the first place.

Reported-by: Willi Junga <xenproject@ymy.be>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen/riscv: parse and handle fdt command line
Oleksii Kurochko [Tue, 15 Oct 2024 12:23:41 +0000 (14:23 +0200)]
xen/riscv: parse and handle fdt command line

Retrieve Xen's command line from the DTB using boot_fdt_cmdline()
and pass it to cmdline_parse() for further processing and setup
of Xen-specific parameters.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen/riscv: initialize bootinfo from dtb
Oleksii Kurochko [Tue, 15 Oct 2024 12:23:19 +0000 (14:23 +0200)]
xen/riscv: initialize bootinfo from dtb

Parse DTB during startup, allowing memory banks and reserved
memory regions to be set up, along with early device tree node
(chosen, "xen,domain", "reserved-memory", etc) handling.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen/vpci: address violations of MISRA C Rule 16.3
Federico Serafini [Tue, 15 Oct 2024 12:22:56 +0000 (14:22 +0200)]
xen/vpci: address violations of MISRA C Rule 16.3

Address violations of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
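
For readers unfamiliar with Rule 16.3, a compliant switch looks roughly like the sketch below. The enum and function are hypothetical and are not taken from xen/vpci; the point is only that every switch-clause, including the last one, ends in an unconditional break.

```c
/* Hypothetical register decoder, not taken from xen/vpci. */
enum example_reg { EXAMPLE_REG_CMD, EXAMPLE_REG_STATUS, EXAMPLE_REG_OTHER };

static unsigned int example_reg_width(enum example_reg reg)
{
    unsigned int width = 0;

    switch ( reg )
    {
    case EXAMPLE_REG_CMD:
        width = 2;
        break;          /* every clause ends with an unconditional break */

    case EXAMPLE_REG_STATUS:
        width = 2;
        break;

    default:
        width = 4;
        break;          /* required even for the last clause */
    }

    return width;
}
```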
6 months ago xen/common: move device initialization code to common code
Oleksii Kurochko [Tue, 15 Oct 2024 12:22:00 +0000 (14:22 +0200)]
xen/common: move device initialization code to common code

Remove the device initialization code from `xen/arch/arm/device.c`
and move it to the common code to avoid duplication and make it accessible
for both ARM and other architectures.
device_get_class(), device_init(), _sdevice[] and _edevice[] are wrapped in
"#ifdef CONFIG_HAS_DEVICE_TREE" for the case where an architecture doesn't
support device tree.

Remove unnecessary inclusions of <asm/device.h> and <xen/init.h> from
`xen/arch/arm/device.c` as no code in the file relies on these headers.
Fix the inclusion order by moving <asm/setup.h> after <xen/*> headers to
resolve a compilation error:
   ./include/public/xen.h:968:35: error: unknown type name 'uint64_t'
    968 | __DEFINE_XEN_GUEST_HANDLE(uint64, uint64_t);
        |                                   ^~~~~~~~
   ./include/public/arch-arm.h:191:21: note: in definition of macro '___DEFINE_XEN_GUEST_HANDLE'
   191 |     typedef union { type *p; uint64_aligned_t q; }              \
       |                     ^~~~
   ./include/public/xen.h:968:1: note: in expansion of macro '__DEFINE_XEN_GUEST_HANDLE'
   968 | __DEFINE_XEN_GUEST_HANDLE(uint64, uint64_t);
because <asm/setup.h> includes <public/version.h>, which in turn includes
"xen.h", which requires <xen/types.h> to be processed correctly.
Additionally, add <xen/device_tree.h> to `device.c` as functions from this
header are used within the file.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
6 months ago xen/riscv: add section for device information in linker script
Oleksii Kurochko [Tue, 15 Oct 2024 12:21:14 +0000 (14:21 +0200)]
xen/riscv: add section for device information in linker script

Introduce a new `.dev.info` section in the RISC-V linker script to
handle device-specific information. This section is required by
common code (common/device.c: device_init(), device_get_class() ).
This section is aligned to `POINTER_ALIGN`, with `_sdevice` and `_edevice`
marking the start and end of the section, respectively.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen/ppc: add section for device information in linker script
Oleksii Kurochko [Tue, 15 Oct 2024 12:21:04 +0000 (14:21 +0200)]
xen/ppc: add section for device information in linker script

Introduce a new `.dev.info` section in the PPC linker script to
handle device-specific information. This section is required by
common code (common/device.c: device_init(), device_get_class() ).
This section is aligned to `POINTER_ALIGN`, with `_sdevice` and `_edevice`
marking the start and end of the section, respectively.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Shawn Anastasio <sanastasio@raptorengineering.com>
6 months ago xen/arm: use {DT,ACPI}_DEV_INFO for device info sections
Oleksii Kurochko [Tue, 15 Oct 2024 12:20:43 +0000 (14:20 +0200)]
xen/arm: use {DT,ACPI}_DEV_INFO for device info sections

Refactor arm/xen.lds.S by replacing the inline definitions for
device info sections with the newly introduced {DT,ACPI}_DEV_INFO
macros from xen/xen.lds.h.

Change alignment of DT_DEV_INFO and ACPI_DEV_INFO sections from
8 to POINTER_ALIGN as struct acpi_device_desc and struct device_desc
don't have any uint64_t's so it is safe to do that.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
6 months ago xen: define ACPI and DT device info sections macros
Oleksii Kurochko [Tue, 15 Oct 2024 12:20:05 +0000 (14:20 +0200)]
xen: define ACPI and DT device info sections macros

Introduce macros to define device information sections based on
the configuration of ACPI or device tree support. These sections
are required by the common code that initializes devices and
retrieves information about a device.

These macros are expected to be used across different
architectures (Arm, PPC, RISC-V), so they are moved to
the common xen/xen.lds.h, based on their original definition
in Arm.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen: introduce DECL_SECTION_WITH_LADDR
Oleksii Kurochko [Tue, 15 Oct 2024 12:19:07 +0000 (14:19 +0200)]
xen: introduce DECL_SECTION_WITH_LADDR

Introduce DECL_SECTION_WITH_LADDR in order to signal whether
DECL_SECTION() should specify a load address or not.

Update {ppc,x86}/xen.lds.S to use DECL_SECTION_WITH_LADDR.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen/spinlock: Fix UBSAN "load of address with insufficient space" in lock_prof_init()
Andrew Cooper [Mon, 14 Oct 2024 14:30:28 +0000 (15:30 +0100)]
xen/spinlock: Fix UBSAN "load of address with insufficient space" in lock_prof_init()

UBSAN complains:

  (XEN) ================================================================================
  (XEN) UBSAN: Undefined behaviour in common/spinlock.c:794:10
  (XEN) load of address ffff82d040ae24c8 with insufficient space
  (XEN) for an object of type 'struct lock_profile *'
  (XEN) ----[ Xen-4.20-unstable  x86_64  debug=y ubsan=y  Tainted:   C    ]----

This shows up with GCC-14, but not with GCC-12.  I have not bisected further.

Either way, the types for __lock_profile_{start,end} are incorrect.

They are an array of struct lock_profile pointers.  Correct the extern's
types, and adjust the loop to match.

No practical change.

Reported-by: Andreas Glashauser <ag@andreasglashauser.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
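
The idea of walking a section as an array of pointers can be sketched as below. This is an illustration only: the struct, the array and the counting helper are hypothetical stand-ins, and an ordinary static array takes the place of the linker-provided __lock_profile_{start,end} bounds.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct lock_profile. */
struct example_lock_profile {
    const char *name;
    unsigned long count;
};

static struct example_lock_profile example_a = { "a", 0 };
static struct example_lock_profile example_b = { "b", 0 };

/* Stand-in for the linker-provided section: an array of pointers. */
static struct example_lock_profile *example_profiles[] = {
    &example_a, &example_b,
};

/* Walk [start, end) with a pointer-to-pointer, matching the array's type. */
static size_t example_count_profiles(struct example_lock_profile **start,
                                     struct example_lock_profile **end)
{
    size_t n = 0;

    for ( struct example_lock_profile **q = start; q < end; q++ )
        if ( *q != NULL )
            n++;

    return n;
}
```

Because the bounds have array-of-pointer type here, stepping q from one bound to the other is ordinary array iteration, which is the property the corrected extern types are meant to restore.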
6 months ago stubdom: use real lib dependencies for xenstore stubdoms
Juergen Gross [Thu, 10 Oct 2024 15:54:59 +0000 (17:54 +0200)]
stubdom: use real lib dependencies for xenstore stubdoms

Today the build of Xenstore stubdoms depends on libxenguest just because
libxenguest depends on all needed libraries. In reality there is no
dependency on libxenguest for Xenstore stubdoms.

Use the actual dependencies instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 months ago tools/xenstored: remove unneeded libxenguest reference
Juergen Gross [Thu, 10 Oct 2024 15:54:58 +0000 (17:54 +0200)]
tools/xenstored: remove unneeded libxenguest reference

Today the xenstored Makefile contains an unneeded reference to the
unused libxenguest library.

Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 months ago config: update Mini-OS commit
Juergen Gross [Thu, 10 Oct 2024 15:54:57 +0000 (17:54 +0200)]
config: update Mini-OS commit

Update the Mini-OS upstream revision.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 months ago dt-overlay: Print overlay I/O memory ranges in hex
Michal Orzel [Fri, 4 Oct 2024 12:22:20 +0000 (14:22 +0200)]
dt-overlay: Print overlay I/O memory ranges in hex

Printing I/O memory rangeset ranges in decimal is not very helpful when
debugging, so switch to hex by adding RANGESETF_prettyprint_hex flag
for iomem_ranges rangeset.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
6 months ago dt-overlay: Ignore nodes that do not have __overlay__ as their subnode
Michal Orzel [Fri, 4 Oct 2024 12:22:19 +0000 (14:22 +0200)]
dt-overlay: Ignore nodes that do not have __overlay__ as their subnode

The assumption stated in the comments, that fdt_for_each_subnode() checks
for parent < 0, is utterly wrong. If parent is < 0, the node offset is set to
0 (i.e. the very first node in the tree) and the loop's body is executed.
This incorrect assumption causes overlay_node_count() to also count nodes
that do not have __overlay__ as their subnode. The same goes for
overlay_get_nodes_info(), where we end up requiring each node directly
under the root node to have "target-path" set. DTBOs can specify other nodes,
including special ones like __symbols__ and __fixups__, which can be left in
place to reduce the number of steps a user needs to take when dealing with
invalid phandles.

Fix it by checking for overlay < 0 after the respective calls to
fdt_subnode_offset().

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 months ago dt-overlay: Support target-path being root node
Michal Orzel [Fri, 4 Oct 2024 12:22:18 +0000 (14:22 +0200)]
dt-overlay: Support target-path being root node

Even though in most cases device nodes are not present directly under
the root node, it's a perfectly valid configuration (e.g. Qemu virt
machine dtb). At the moment, we don't handle this scenario, which leads
to the unconditional addition of an extra leading '/' in the node's full
path. This causes the attempt to add such a device overlay to fail.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
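
The path problem described above can be illustrated with a small sketch. example_join_path() is a hypothetical helper, not the code in dt-overlay.c; it only shows how skipping the separator for a root target-path avoids the extra leading '/'.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: build "<target-path>/<node name>" into 'out'.
 * Returns 0 on success, -1 if the result does not fit in 'len' bytes.
 */
static int example_join_path(char *out, size_t len,
                             const char *target, const char *name)
{
    /* When the target is the root node "/", do not add another separator. */
    const char *sep = (strcmp(target, "/") == 0) ? "" : "/";
    int n = snprintf(out, len, "%s%s%s", target, sep, name);

    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

A root target-path then gives "/uart" instead of "//uart", while a target such as "/axi" still gives "/axi/uart".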
6 months ago dt-overlay: Remove ASSERT_UNREACHABLE from add_nodes()
Michal Orzel [Fri, 4 Oct 2024 12:22:17 +0000 (14:22 +0200)]
dt-overlay: Remove ASSERT_UNREACHABLE from add_nodes()

The assumption stated in the comment that the code will never get there
is incorrect. In overlay_get_nodes_info() we manually combine path from
target-path property with the node path by adding '/' as a separator.
This can differ from a path obtained by libfdt due to more advanced
logic used there which can for instance get rid of excessive slashes.
In case of incorrect target-path (e.g. target-path = "//axi"), the
comparison in dt_find_node_by_path_from() can fail triggering the assert
in debug builds.

Fixes: 0c0facdab6f5 ("xen/arm: Implement device tree node addition functionalities")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <jgrall@amazon.com>
6 months ago device-tree: Remove __init from unflatten_dt_alloc()
Michal Orzel [Fri, 4 Oct 2024 12:22:16 +0000 (14:22 +0200)]
device-tree: Remove __init from unflatten_dt_alloc()

With CONFIG_OVERLAY_DTB=y, unflatten_dt_alloc() is used as part of
unflatten_dt_node(), which is used at runtime. If the binary is compiled
such that unflatten_dt_alloc() does not get inlined (e.g. using -Og), an
attempt to add an overlay to Xen (xl dt-overlay add) results in a crash.

(XEN) Instruction Abort Trap. Syndrome=0x7
(XEN) Walking Hypervisor VA 0xa00002c8cc0 on CPU2 via TTBR 0x0000000040340000
(XEN) 0TH[0x014] = 0x4033ff7f
(XEN) 1ST[0x000] = 0x4033ef7f
(XEN) 2ND[0x001] = 0x4000004033af7f
(XEN) 3RD[0x0c8] = 0x0
(XEN) CPU2: Unexpected Trap: Instruction Abort
(XEN) ----[ Xen-4.20-unstable  arm64  debug=y  Not tainted ]----
...
(XEN) Xen call trace:
(XEN)    [<00000a00002c8cc0>] 00000a00002c8cc0 (PC)
(XEN)    [<00000a0000202410>] device-tree.c#unflatten_dt_node+0xd0/0x504 (LR)
(XEN)    [<00000a0000204484>] unflatten_device_tree+0x54/0x1a0
(XEN)    [<00000a000020800c>] dt-overlay.c#handle_add_overlay_nodes+0x290/0x3d4
(XEN)    [<00000a0000208360>] dt_overlay_sysctl+0x8c/0x110
(XEN)    [<00000a000027714c>] arch_do_sysctl+0x1c/0x2c

Fixes: 9e9d2c079dc4 ("xen/arm/device: Remove __init from function type")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 months ago dt-overlay: Fix NULL pointer dereference
Michal Orzel [Fri, 4 Oct 2024 12:22:15 +0000 (14:22 +0200)]
dt-overlay: Fix NULL pointer dereference

An attempt to attach an overlay (xl dt-overlay attach) to a domain without
first adding this overlay to Xen (xl dt-overlay add) results in the
overlay track entry being NULL in handle_attach_overlay_nodes(). This
leads to a NULL pointer dereference and the following data abort crash:

(XEN) Cannot find any matching tracker with input dtbo. Operation is supported only for prior added dtbo.
(XEN) Data Abort Trap. Syndrome=0x5
(XEN) Walking Hypervisor VA 0x40 on CPU0 via TTBR 0x0000000046948000
(XEN) 0TH[0x000] = 0x46940f7f
(XEN) 1ST[0x000] = 0x0
(XEN) CPU0: Unexpected Trap: Data Abort
(XEN) ----[ Xen-4.20-unstable  arm64  debug=y  Not tainted ]----
...
(XEN) Xen call trace:
(XEN)    [<00000a0000208b30>] dt_overlay_domctl+0x304/0x370 (PC)
(XEN)    [<00000a0000208b30>] dt_overlay_domctl+0x304/0x370 (LR)
(XEN)    [<00000a0000274b7c>] arch_do_domctl+0x48/0x328

Fixes: 4c733873b5c2 ("xen/arm: Add XEN_DOMCTL_dt_overlay and device attachment to domains")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 months ago stubdom: add fine grained library config items to Mini-OS configs
Juergen Gross [Thu, 10 Oct 2024 11:19:46 +0000 (13:19 +0200)]
stubdom: add fine grained library config items to Mini-OS configs

Today Mini-OS can only be configured to use all or no Xen libraries.
In order to prepare a more fine grained configuration scheme, add per
library config items to the Mini-OS config files.

As some libraries pull in others, the config files need to be
extended at build time to reflect those indirect library uses.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
6 months ago ocaml/libs: Remove xsd_glue_dev package, package plugin_interface_v1.a
Andrii Sultanov [Wed, 9 Oct 2024 15:15:20 +0000 (16:15 +0100)]
ocaml/libs: Remove xsd_glue_dev package, package plugin_interface_v1.a

xsd_glue_dev packaging is inconsistent with the rest of OCaml packages and
isn't actually necessary. The .a is needed alongside compiled bytecode files
during linking and was missed in the initial oxenstored plugin work.

Specify OCAMLCFLAGS along with OCAMLOPTFLAGS.

Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Acked-by: Christian Lindig <christian.lindig@cloud.com>
6 months ago Flask: replace uses of __u32
Jan Beulich [Thu, 10 Oct 2024 08:59:38 +0000 (10:59 +0200)]
Flask: replace uses of __u32

... by uint32_t.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
6 months ago xen/riscv: register Xen's load address as a boot module
Oleksii Kurochko [Thu, 10 Oct 2024 08:55:24 +0000 (10:55 +0200)]
xen/riscv: register Xen's load address as a boot module

Avoid using BOOTMOD_XEN region for other purposes or boot modules
which could result in memory corruption or undefined behaviour.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen/riscv: switch LINK_TO_LOAD() to virt_to_maddr()
Oleksii Kurochko [Thu, 10 Oct 2024 08:55:05 +0000 (10:55 +0200)]
xen/riscv: switch LINK_TO_LOAD() to virt_to_maddr()

Use virt_to_maddr() instead of LINK_TO_LOAD as virt_to_maddr()
covers all the cases where LINK_TO_LOAD() is used.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 months ago xen/riscv: implement virt_to_maddr()
Oleksii Kurochko [Thu, 10 Oct 2024 08:54:46 +0000 (10:54 +0200)]
xen/riscv: implement virt_to_maddr()

Implement the virt_to_maddr() function to convert virtual addresses
to machine addresses. The function includes checks for valid address
ranges, specifically the direct mapping region (DIRECTMAP_VIRT_START)
and Xen's linkage region (XEN_VIRT_START). If the virtual address
falls outside of these regions, an assertion will trigger.
To implement this, the phys_offset variable is made accessible
outside of riscv/mm.c.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
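
The two-region translation described above can be sketched as follows. This is not the RISC-V port's implementation: all EX_* constants are made-up illustration values, the direct map is assumed to start at machine address 0, and a hypothetical fixed load address takes the place of the phys_offset variable.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up layout values for illustration; not the RISC-V port's real ones. */
#define EX_DIRECTMAP_VIRT_START 0xffffffc000000000ULL
#define EX_DIRECTMAP_SIZE       0x0000001000000000ULL
#define EX_XEN_VIRT_START       0xffffffffc0000000ULL
#define EX_XEN_VIRT_SIZE        0x0000000000200000ULL
/* Hypothetical machine address at which the Xen image was loaded. */
#define EX_XEN_LOAD_MADDR       0x0000000080200000ULL

static uint64_t ex_virt_to_maddr(uint64_t va)
{
    /* Direct map: assumed here to map machine address 0 at its start. */
    if ( va >= EX_DIRECTMAP_VIRT_START &&
         va - EX_DIRECTMAP_VIRT_START < EX_DIRECTMAP_SIZE )
        return va - EX_DIRECTMAP_VIRT_START;

    /* Anything else must lie inside Xen's own linked image. */
    assert(va >= EX_XEN_VIRT_START &&
           va - EX_XEN_VIRT_START < EX_XEN_VIRT_SIZE);

    return EX_XEN_LOAD_MADDR + (va - EX_XEN_VIRT_START);
}
```

An address outside both regions trips the assertion, which mirrors the behaviour the commit message describes.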
6 months ago x86: restore semicolon after explicit DS prefix
Jan Beulich [Thu, 10 Oct 2024 08:54:15 +0000 (10:54 +0200)]
x86: restore semicolon after explicit DS prefix

It's not unnecessary (as the earlier commit claimed): The integrated
assembler of Clang up to 11 complains about an "invalid operand for
instruction".

Fixes: b42cf31d1165 ("x86: use alternative_input() in cache_flush()")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 months ago xen: Update header guards - ARGO
Frediano Ziglio [Thu, 10 Oct 2024 08:53:15 +0000 (10:53 +0200)]
xen: Update header guards - ARGO

Update the header related to ARGO.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Christopher Clark <christopher.w.clark@gmail.com>
6 months ago x86/vlapic: Move lapic migration checks to the check hooks
Alejandro Vallejo [Thu, 10 Oct 2024 08:52:43 +0000 (10:52 +0200)]
x86/vlapic: Move lapic migration checks to the check hooks

While doing this, factor out checks common to architectural and hidden
state.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 months ago CI: Stop building QEMU in general
Andrew Cooper [Sat, 13 Jul 2024 16:50:30 +0000 (17:50 +0100)]
CI: Stop building QEMU in general

We spend an awful lot of CI time building QEMU, even though most changes don't
touch the subset of tools/libs/ used by QEMU.  Some numbers taken at a time
when CI was otherwise quiet:

                       With     Without
  Alpine:              13m38s   6m04s
  Debian 12:           10m05s   8m10s
  OpenSUSE Tumbleweed: 11m40s   7m54s
  Ubuntu 24.04:        14m56s   8m06s

which is a >50% improvement in wallclock time in some cases.

The only build we have that needs QEMU is alpine-3.18-gcc-debug.  This is the
build deployed and used by the QubesOS ADL-* and Zen3p-* jobs.

Xilinx-x86_64 deploys it too, but is PVH-only and doesn't use QEMU.

QEMU is also built by CirrusCI for FreeBSD (fully Clang/LLVM toolchain).

This should help quite a lot with Gitlab CI capacity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 months ago MAINTAINERS: Add myself as a reviewer for RISC-V
Oleksii Kurochko [Wed, 9 Oct 2024 07:57:37 +0000 (09:57 +0200)]
MAINTAINERS: Add myself as a reviewer for RISC-V

As an active contributor to Xen's RISC-V port, add myself
to the list of reviewers.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago types: replace remaining uses of s64
Jan Beulich [Wed, 9 Oct 2024 07:56:43 +0000 (09:56 +0200)]
types: replace remaining uses of s64

... and move the type itself to linux-compat.h. An exception being
arch/arm/arm64/cpufeature.c and arch/arm/include/asm/arm64/cpufeature.h,
which are to use linux-compat.h instead (the former by including the
latter).

While doing so
- correct the type of union uu's uq field in lib/divmod.c,
- switch a few adjacent types as well, for (a little bit of)
  consistency.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
7 months ago MAINTAINERS: add myself as maintainer for arm tee
Bertrand Marquis [Wed, 9 Oct 2024 07:56:16 +0000 (09:56 +0200)]
MAINTAINERS: add myself as maintainer for arm tee

With Tee mediators now containing Optee and FF-A implementations, add
myself as a maintainer to have someone handling the FF-A side.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
7 months ago x86/msr: add log messages to MSR state load error paths
Roger Pau Monné [Wed, 9 Oct 2024 07:55:38 +0000 (09:55 +0200)]
x86/msr: add log messages to MSR state load error paths

Some error paths in the MSR state loading logic don't contain error messages,
which makes debugging them quite hard without adding extra patches to print the
information.

Add two new log messages to the MSR state load path that print information
about the entry that failed to load, for both PV and HVM.

While there also adjust XEN_DOMCTL_set_vcpu_msrs to return -ENXIO in case the
MSR is unhandled or can't be loaded, so it matches the error code used by HVM
MSR loading (and it's less ambiguous than -EINVAL).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/APIC: Switch flat driver to use phys dst for ext ints
Matthew Barnes [Wed, 9 Oct 2024 07:54:48 +0000 (09:54 +0200)]
x86/APIC: Switch flat driver to use phys dst for ext ints

External interrupts via logical delivery mode in xAPIC do not benefit
from targeting multiple CPUs and instead simply bloat up the vector
space.

However the xAPIC flat driver currently uses logical delivery for
external interrupts.

This patch switches the xAPIC flat driver to use physical destination
mode for external interrupts, instead of logical destination mode.

This patch also applies the following non-functional changes:
- Remove now unused logical flat functions
- Expand GENAPIC_FLAT and GENAPIC_PHYS macros, and delete them.

Resolves: https://gitlab.com/xen-project/xen/-/issues/194
Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 months ago xen: Update header guards - RISC-V
Frediano Ziglio [Wed, 9 Oct 2024 07:53:49 +0000 (09:53 +0200)]
xen: Update header guards - RISC-V

Update headers related to RISC-V.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago xen: Update header guards - I/O MMU
Frediano Ziglio [Wed, 9 Oct 2024 07:53:25 +0000 (09:53 +0200)]
xen: Update header guards - I/O MMU

Update headers related to I/O MMU.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago xen: Update header guards - Intel TXT
Frediano Ziglio [Wed, 9 Oct 2024 07:53:05 +0000 (09:53 +0200)]
xen: Update header guards - Intel TXT

Update the header related to Intel trusted execution technology.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/domctl: fix maximum number of MSRs in XEN_DOMCTL_{get,set}_vcpu_msrs
Roger Pau Monné [Tue, 8 Oct 2024 12:37:53 +0000 (14:37 +0200)]
x86/domctl: fix maximum number of MSRs in XEN_DOMCTL_{get,set}_vcpu_msrs

Since the addition of the MSR_AMD64_DR{1-4}_ADDRESS_MASK MSRs to the
msrs_to_send array, the calculation of the maximum number of MSRs that
the hypercall can handle is off by 4.

Remove the addition of 4 to the maximum number of MSRs that
XEN_DOMCTL_{set,get}_vcpu_msrs supports, as those are already part of the
array.

A further adjustment could be to subtract 4 from the maximum size if the DBEXT
CPUID feature is not exposed to the guest, but guest_{rd,wr}msr() will already
perform that check when fetching or loading the MSRs.  The maximum array size is
used to tell the caller how large a buffer it needs to allocate in the get case,
and as early input sanitization in the set case, so using a buffer size slightly
larger than required is not an issue.

Fixes: 86d47adcd3c4 ('x86/msr: Handle MSR_AMD64_DR{0-3}_ADDRESS_MASK in the new MSR infrastructure')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
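
The off-by-4 can be illustrated with a small sketch. The array contents are placeholder indices, not real MSR numbers, and the helper names are hypothetical; the sketch only shows why adding 4 to the array size double counts entries that are already listed.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Placeholder indices, not real MSR numbers. */
static const uint32_t ex_msrs_to_send[] = {
    0x10, 0x11,             /* other MSRs sent on migration */
    0x20, 0x21, 0x22, 0x23, /* the four address-mask MSRs, already listed */
};

/* Old calculation: counts the four address-mask MSRs twice. */
static size_t ex_max_msrs_before(void)
{
    return EX_ARRAY_SIZE(ex_msrs_to_send) + 4;
}

/* Fixed calculation: the array already covers every MSR. */
static size_t ex_max_msrs_after(void)
{
    return EX_ARRAY_SIZE(ex_msrs_to_send);
}
```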
7 months ago docs: fusa: Replace VM with domain
Ayan Kumar Halder [Tue, 8 Oct 2024 12:37:37 +0000 (14:37 +0200)]
docs: fusa: Replace VM with domain

We should use the word domain everywhere (instead of VM or guest).

Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
7 months ago xen/pci: address a violation of MISRA C Rule 16.3
Federico Serafini [Tue, 8 Oct 2024 12:37:16 +0000 (14:37 +0200)]
xen/pci: address a violation of MISRA C Rule 16.3

Refactor the code to avoid an implicit fallthrough and address
a violation of MISRA C:2012 Rule 16.3: "An unconditional `break'
statement shall terminate every switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 months ago x86/emul: add defensive code
Federico Serafini [Tue, 8 Oct 2024 12:36:59 +0000 (14:36 +0200)]
x86/emul: add defensive code

Add defensive code after unreachable program points.
This also meets the requirements to deviate violations of MISRA C:2012
Rule 16.3: "An unconditional `break' statement shall terminate every
switch-clause".

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agoioreq: don't wrongly claim "success" in ioreq_send_buffered()
Jan Beulich [Tue, 8 Oct 2024 12:36:27 +0000 (14:36 +0200)]
ioreq: don't wrongly claim "success" in ioreq_send_buffered()

Returning a literal number is a bad idea anyway when all other returns
use IOREQ_STATUS_* values. The function is dead on Arm, and mapping to
X86EMUL_OKAY is surely wrong on x86.

Fixes: f6bf39f84f82 ("x86/hvm: add support for broadcast of buffered ioreqs...")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
7 months agox86/boot: Rationalise .gitignore
Frediano Ziglio [Mon, 7 Oct 2024 14:15:35 +0000 (15:15 +0100)]
x86/boot: Rationalise .gitignore

Strip all related content out of the root .gitignore, and provide a
more local .gitignore with up-to-date patterns.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago.gitignore: Remove not generated files
Frediano Ziglio [Mon, 7 Oct 2024 14:15:34 +0000 (15:15 +0100)]
.gitignore: Remove not generated files

Neither reloc.S nor cmdline.S has been generated since commit
1ab7c128d9d1 ("x86/build: Don't convert boot/{cmdline,head}.bin back to .S")

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoautomation: use python-3.11 in Leap container
Olaf Hering [Mon, 7 Oct 2024 15:25:09 +0000 (17:25 +0200)]
automation: use python-3.11 in Leap container

python311 is available since Leap 15.4 as additional Python variant.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months agoCI: Drop bin86/dev86 from archlinux container
Andrew Cooper [Tue, 2 Jul 2024 16:40:11 +0000 (17:40 +0100)]
CI: Drop bin86/dev86 from archlinux container

These packages have moved out of main to AUR, and are not easily accessible
any more.  Drop them, because they're only needed for RomBIOS which is very
legacy these days.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months agox86: Use standard C types in multiboot2.h header
Frediano Ziglio [Tue, 8 Oct 2024 08:41:57 +0000 (09:41 +0100)]
x86: Use standard C types in multiboot2.h header

The header already uses standard types for many fields, extend
their usage.
No functional change.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agobuild: move xenlibs-dependencies make definition to uselibs.mk
Juergen Gross [Sat, 5 Oct 2024 15:15:47 +0000 (17:15 +0200)]
build: move xenlibs-dependencies make definition to uselibs.mk

In order to be able to use the xenlibs-dependencies macro from stubdom
build, move it to tools/libs/uselibs.mk, which is included from
current users and stubdom/Makefile.

No functional change intended.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
7 months agostubdom: explicitly add libc and lwip Mini-OS config options
Juergen Gross [Sat, 5 Oct 2024 15:15:46 +0000 (17:15 +0200)]
stubdom: explicitly add libc and lwip Mini-OS config options

Today the Mini-OS build system derives libc and lwip config options
from the stubdom and LWIPDIR make variables supplied by the Xen build
system.

In order to prepare for those becoming explicit Mini-OS config options,
add them to the related stubdom Mini-OS config files.

While at it remove the CONFIG_START_NETWORK setting from config files
disabling lwip, as CONFIG_START_NETWORK requires lwip for becoming
effective.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
7 months agostubdom: switch to local .gitignore file
Juergen Gross [Sat, 5 Oct 2024 15:15:45 +0000 (17:15 +0200)]
stubdom: switch to local .gitignore file

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
7 months agox86/dpci: do not leak pending interrupts on CPU offline
Roger Pau Monné [Mon, 7 Oct 2024 09:10:21 +0000 (11:10 +0200)]
x86/dpci: do not leak pending interrupts on CPU offline

The current dpci logic relies on a softirq being executed as a side effect of
the cpu_notifier_call_chain() call in the code path that offlines the target
CPU.  However the call to cpu_notifier_call_chain() won't trigger any softirq
processing, and even if it did, such processing should be done after all
interrupts have been migrated off the current CPU, otherwise new pending dpci
interrupts could still appear.

Currently the ASSERT() in the cpu callback notifier is fairly easy to trigger
by doing CPU offline from a PVH dom0.

Solve this by instead moving out any dpci interrupts pending processing once
the CPU is dead.  This might introduce more latency than attempting to drain
before the CPU is put offline, but it's less complex, and CPU online/offline is
not a common action.  Any extra introduced latency should be tolerable.

Fixes: f6dd295381f4 ('dpci: replace tasklet with softirq')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoCODING_STYLE: header file guard naming rules
Jan Beulich [Mon, 7 Oct 2024 09:10:05 +0000 (11:10 +0200)]
CODING_STYLE: header file guard naming rules

Provide a (small) set of rules on how header guard identifiers ought to
be spelled and what precautions ought to be taken to avoid name
collisions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months agoefi: address violation of MISRA C Rule 16.3
Federico Serafini [Mon, 7 Oct 2024 09:08:18 +0000 (11:08 +0200)]
efi: address violation of MISRA C Rule 16.3

Use agreed syntax for pseudo-keyword fallthrough to meet the
requirements to deviate a violation of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.
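
A minimal sketch of how a fallthrough pseudo-keyword can mark an
intentional fall through.  The macro definition and the function below
are assumptions made for illustration; the definition in Xen's headers
may differ:

```c
/*
 * Assumption: "fallthrough" is defined roughly like this.  Where the
 * compiler lacks the attribute, fall back to an empty statement.
 */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while ( 0 )
#endif

/* Hypothetical example: count the stages run when starting at 'first'. */
static unsigned int stages_from(unsigned int first)
{
    unsigned int n = 0;

    switch ( first )
    {
    case 0:
        n++;
        fallthrough;
    case 1:
        n++;
        fallthrough;
    case 2:
        n++;
        break;

    default:
        break;
    }

    return n;
}
```

Each clause either ends in break or states the fall through explicitly,
which is what allows the deviation to be checked mechanically.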

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
7 months agoautomation/eclair: tag Rule 13.6 as clean
Federico Serafini [Mon, 30 Sep 2024 12:49:17 +0000 (14:49 +0200)]
automation/eclair: tag Rule 13.6 as clean

Update ECLAIR configuration to consider Rule 13.6 as clean:
new violations of this rule will cause a failure of the CI pipeline.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 months agoxen/gnttab: address a violation of MISRA C Rule 13.6
Federico Serafini [Mon, 30 Sep 2024 12:49:16 +0000 (14:49 +0200)]
xen/gnttab: address a violation of MISRA C Rule 13.6

guest_handle_ok()'s expansion contains a sizeof() involving its
first argument guest_handle_cast().
The expansion of the latter, in turn, contains a variable
initialization.

Since MISRA considers the initialization (even of a local variable)
a side effect, the chain of expansions mentioned above violates
MISRA C:2012 Rule 13.6 (The operand of the `sizeof' operator shall not
contain any expression which has potential side effect).

Refactor the code to address the rule violation.
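
To illustrate why the rule exists (a generic sketch with hypothetical
function names, not the guest_handle_ok() expansion itself): the operand
of sizeof is not evaluated unless it is a variable length array, so any
side effect written there silently does not happen.

```c
#include <stddef.h>

/* Non-compliant: the increment sits inside sizeof and is never executed. */
static int side_effect_in_sizeof(void)
{
    int counter = 0;
    size_t sz = sizeof(counter++);

    (void)sz;
    return counter;
}

/* Compliant: perform the side effect first, then take the size. */
static int side_effect_outside_sizeof(void)
{
    int counter = 0;
    size_t sz;

    counter++;
    sz = sizeof(counter);

    (void)sz;
    return counter;
}
```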

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 months agoEFI: address a violation of MISRA C Rule 13.6
Federico Serafini [Mon, 30 Sep 2024 12:49:15 +0000 (14:49 +0200)]
EFI: address a violation of MISRA C Rule 13.6

guest_handle_ok()'s expansion contains a sizeof() involving its
first argument which is guest_handle_cast().
The expansion of the latter, in turn, contains a variable
initialization.

Since MISRA considers the initialization (even of a local variable)
a side effect, the chain of expansions mentioned above violates
MISRA C:2012 Rule 13.6 (The operand of the `sizeof' operator shall not
contain any expression which has potential side effect).

Refactor the code to address the rule violation.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
7 months agoCI: Fix builds following qemu-xen update
Andrew Cooper [Fri, 4 Oct 2024 13:27:02 +0000 (14:27 +0100)]
CI: Fix builds following qemu-xen update

A recent update to qemu-xen has bumped the build requirements, with Python 3.8
being the new baseline but also needing the 'ensurepip' and 'tomllib/tomli'
packages.

 * Ubuntu/Debian package 'ensurepip' separately, but it can be obtained by
   installing the python3-venv package.

 * 'tomllib' was added to the python standard library in Python 3.11, but
   previously it was a separate package named 'tomli'.

In terms of changes required to build QEMU:

 * Ubuntu 24.04 (Noble) has Python 3.12 so only needs python3-venv

 * Ubuntu 22.04 (Jammy) has Python 3.10 but does have a python3-tomli package
   that QEMU is happy with.

 * FreeBSD has Python 3.9, but Python 3.11 is available.

In terms of exclusions:

 * Ubuntu 20.04 (Focal) has Python 3.8, but lacks any kind of tomli package.

 * Fedora 29 (Python 3.7), OpenSUSE Leap 15.6 (Python 3.6), and Ubuntu
   18.04/Bionic (Python 3.6) are now too old.

Detecting tomllib/tomli is more than can fit in build's one-liner, so break it
out into a proper script.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agoautomation: shorten the timeout for smoke tests
Marek Marczykowski-Górecki [Fri, 4 Oct 2024 02:29:39 +0000 (04:29 +0200)]
automation: shorten the timeout for smoke tests

The smoke tests, when successful, complete in about 5s. Don't waste
20min+ on failure; shorten the timeout to 120s.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoautomation: add a smoke test for xen.efi on X86
Marek Marczykowski-Górecki [Fri, 4 Oct 2024 02:29:38 +0000 (04:29 +0200)]
automation: add a smoke test for xen.efi on X86

Check if xen.efi is bootable with an XTF dom0.
The multiboot2+EFI path is tested on hardware tests already.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoautomation: preserve built xen.efi
Marek Marczykowski-Górecki [Fri, 4 Oct 2024 02:29:37 +0000 (04:29 +0200)]
automation: preserve built xen.efi

It will be useful for further tests.  Deduplicate the collection.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86: Introduce X86_ET_* constants in x86-defns.h
Andrew Cooper [Fri, 18 Sep 2020 15:50:15 +0000 (16:50 +0100)]
x86: Introduce X86_ET_* constants in x86-defns.h

The FRED spec architecturalises the Event Type encoding, previously exposed
only in VMCB/VMCS fields.

Introduce the constants in x86-defns.h, making them a bit more concise, and
retire enum x86_event_type.

Take the opportunity to introduce X86_ET_OTHER.  Its absence appears to be a
bug in Introspection's Monitor Trap Flag support, when considering VECTORING
events during another VMExit.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agox86/boot: Convert remaining uses of the legacy ALIGN
Andrew Cooper [Wed, 2 Oct 2024 19:59:19 +0000 (20:59 +0100)]
x86/boot: Convert remaining uses of the legacy ALIGN

There are only two remaining standalone uses of the legacy ALIGN macro.

Drop these by switching the .incbin's over to using FUNC()/END() which has
alignment handled internally.  While the incbin's aren't technically one
single function, they behave as if they are.

Finally, expand ALIGN inside the legacy ENTRY() macro in order to remove ALIGN
itself.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agoautomation: introduce TEST_TIMEOUT_OVERRIDE
Stefano Stabellini [Thu, 3 Oct 2024 20:22:51 +0000 (13:22 -0700)]
automation: introduce TEST_TIMEOUT_OVERRIDE

TEST_TIMEOUT is set as a CI/CD project variable, as it should be, to
match the capability and speed of the testing infrastructure.

As it turns out, TEST_TIMEOUT defined in test.yaml cannot override
TEST_TIMEOUT defined as CI/CD project variable. As a consequence, today
the TEST_TIMEOUT setting in test.yaml for the Xilinx jobs is ignored.

Instead, rename TEST_TIMEOUT to TEST_TIMEOUT_OVERRIDE in test.yaml and
check for TEST_TIMEOUT_OVERRIDE first in console.exp.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
7 months agox86/boot: Don't use INC to set defaults
Andrew Cooper [Thu, 3 Oct 2024 14:03:38 +0000 (15:03 +0100)]
x86/boot: Don't use INC to set defaults

__efi64_mb2_start() makes some bold assumptions about the efi_platform and
skip_realmode booleans.  Set them to 1 explicitly, which is more robust.

Make the comment a little more concise.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agoxen: move per-cpu area management into common code
Oleksii Kurochko [Thu, 3 Oct 2024 14:08:55 +0000 (16:08 +0200)]
xen: move per-cpu area management into common code

Centralize per-cpu area management to reduce code duplication and
enhance maintainability across architectures.

The per-cpu area management code, which is largely common among
architectures, is moved to a shared implementation in
xen/common/percpu.c. This change includes:
 * Remove percpu.c from the X86 and Arm architectures.
 * For x86, define INVALID_PERCPU_AREAS and PARK_OFFLINE_CPUS_VAR.
 * Drop the declaration of __per_cpu_offset[] from stubs.c in
   PPC and RISC-V to facilitate the build of the common per-cpu code.

No functional changes for x86.

For Arm, add support for CPU_RESUME_FAILED, CPU_REMOVE and freeing of
percpu in the case when system_state != SYS_STATE_suspend.  However,
there is no change in behavior for Arm at this time.

Move the asm-generic/percpu.h definitions to xen/percpu.h, except for
__per_cpu_start[] and __per_cpu_data_end[], which are moved to
common/percpu.c as they are only used in common/percpu.c.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
7 months agox86/boot: Rewrite EFI/MBI2 code partly in C
Frediano Ziglio [Tue, 1 Oct 2024 10:22:38 +0000 (11:22 +0100)]
x86/boot: Rewrite EFI/MBI2 code partly in C

No need to have it coded in assembly.
Declare efi_multiboot2 in a new header to reuse between implementations
and caller.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
7 months agostubdom: Fix newlib build with GCC-14
Andrew Cooper [Wed, 2 Oct 2024 18:01:26 +0000 (19:01 +0100)]
stubdom: Fix newlib build with GCC-14

Based on a fix from OpenSUSE, but adjusted to be Clang-compatible too.  Pass
-Wno-implicit-function-declaration library-wide rather than using local GCC
pragmas.

Fix copy_past_newline() to avoid triggering -Wstrict-prototypes.

Link: https://build.opensuse.org/request/show/1178775
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
7 months agox86/kexec: Separate code and data into different cache lines
Andrew Cooper [Fri, 17 Feb 2023 17:01:22 +0000 (17:01 +0000)]
x86/kexec: Separate code and data into different cache lines

No functional change, but it performs a bit better.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agox86: move ENTRY(), GLOBAL(), and ALIGN
Jan Beulich [Wed, 2 Oct 2024 06:59:03 +0000 (08:59 +0200)]
x86: move ENTRY(), GLOBAL(), and ALIGN

... to boot code, limiting their scope and thus allowing to drop
respective #undef-s from the linker script.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86: convert dom_crash_sync_extable() annotation
Jan Beulich [Wed, 2 Oct 2024 06:56:45 +0000 (08:56 +0200)]
x86: convert dom_crash_sync_extable() annotation

... to that from the generic framework in xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/kexec: convert entry point annotations
Jan Beulich [Wed, 2 Oct 2024 06:56:04 +0000 (08:56 +0200)]
x86/kexec: convert entry point annotations

Use the generic framework from xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/ACPI: annotate assembly function/data with type and size
Jan Beulich [Wed, 2 Oct 2024 06:55:31 +0000 (08:55 +0200)]
x86/ACPI: annotate assembly function/data with type and size

Use the generic framework from xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoVMX: convert entry point annotations
Jan Beulich [Wed, 2 Oct 2024 06:55:02 +0000 (08:55 +0200)]
VMX: convert entry point annotations

Use the generic framework from xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoxen/riscv: introduce early_fdt_map()
Oleksii Kurochko [Wed, 2 Oct 2024 06:54:36 +0000 (08:54 +0200)]
xen/riscv: introduce early_fdt_map()

Introduce a function which maps the FDT into Xen.

Also, initialization of device_tree_flattened happens using
early_fdt_map().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agoxen/riscv: page table handling
Oleksii Kurochko [Wed, 2 Oct 2024 06:53:59 +0000 (08:53 +0200)]
xen/riscv: page table handling

Implement map_pages_to_xen() which requires several
functions to manage page tables and entries:
- pt_update()
- pt_mapping_level()
- pt_update_entry()
- pt_next_level()
- pt_check_entry()

To support these operations, add functions for creating,
mapping, and unmapping Xen tables:
- create_table()
- map_table()
- unmap_table()

Introduce PTE_SMALL to indicate that 4KB mapping is needed
and PTE_POPULATE.

In addition introduce flush_tlb_range_va() for TLB flushing across
CPUs after updating the PTE for the requested mapping.
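
As a rough sketch of how a mapping level could be chosen (an
illustration only, assuming Sv39-style 4KB/2MB/1GB levels; the names
level_size() and mapping_level() and the force_small parameter are
hypothetical and do not reproduce pt_mapping_level()):

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT     12
#define PT_LEVEL_BITS  9   /* assumption: 512 entries per table (Sv39) */
#define MAX_LEVEL      2   /* level 0 = 4KB, 1 = 2MB, 2 = 1GB */

/* Number of bytes covered by a single entry at the given level. */
static uint64_t level_size(unsigned int level)
{
    return (uint64_t)1 << (PAGE_SHIFT + PT_LEVEL_BITS * level);
}

/*
 * Pick the highest level at which both addresses are aligned to the
 * entry size and the remaining length covers a whole entry.
 * force_small models a PTE_SMALL-like request for 4KB mappings.
 */
static unsigned int mapping_level(uint64_t va, uint64_t pa, uint64_t len,
                                  bool force_small)
{
    unsigned int level;

    if ( force_small )
        return 0;

    for ( level = MAX_LEVEL; level > 0; level-- )
    {
        uint64_t mask = level_size(level) - 1;

        if ( !((va | pa) & mask) && len >= level_size(level) )
            break;
    }

    return level;
}
```

Falling back to level 0 whenever alignment or length does not allow a
larger entry keeps the sketch conservative.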

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agox86: prefer RDTSCP in rdtsc_ordered()
Jan Beulich [Wed, 2 Oct 2024 06:52:18 +0000 (08:52 +0200)]
x86: prefer RDTSCP in rdtsc_ordered()

If available, its use is supposed to be cheaper than LFENCE+RDTSC, and
is virtually guaranteed to be cheaper than MFENCE+RDTSC.

Update commentary (and indentation) while there.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agodocs: fusa: Add Assumption of Use (AOU)
Michal Orzel [Tue, 24 Sep 2024 08:29:23 +0000 (09:29 +0100)]
docs: fusa: Add Assumption of Use (AOU)

AoU are the assumptions Xen makes about other components (e.g. platform,
domains) in order to fulfill its requirements. In our case, platform means
a combination of hardware, firmware and bootloader.

We have defined AoU in the intro.rst and added AoU for the generic
timer.

Also, fixed a requirement to denote that Xen shall **not** expose the
system counter frequency via the "clock-frequency" device tree property.
The reason is that the device tree documentation strongly discourages the
use of this property. Further, if "clock-frequency" is exposed, then
it overrides the value programmed in the CNTFRQ_EL0 register.

So, the frequency shall be exposed via the CNTFRQ_EL0 register only and
consequently there is an assumption on the platform to program the
register correctly.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
7 months agox86/pv: Rename pv.iobmp_limit to iobmp_nr and clarify behaviour
Andrew Cooper [Tue, 1 Oct 2024 12:00:13 +0000 (13:00 +0100)]
x86/pv: Rename pv.iobmp_limit to iobmp_nr and clarify behaviour

Ever since its introduction in commit 013351bd7ab3 ("Define new event-channel
and physdev hypercalls") in 2006, the public interface was named nr_ports
while the internal field was called iobmp_limit.

Rename the internal field to iobmp_nr to match the public interface, and
clarify that, when nonzero, Xen will read 2 bytes.

There isn't a perfect parallel with a real TSS, but iobmp_nr being 0 is the
paravirt "no IOPB" case, and it is important that no read occurs in this case.
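
The 2-byte read can be modelled as below.  This is a simplified sketch
with hypothetical names, not the guest_io_okay() implementation: it
works on a plain in-memory bitmap and assumes the caller provides one
byte beyond the last one holding port bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * A set bit denies access to the corresponding port.
 *
 * bitmap:   one bit per port, plus one trailing byte so that the 2-byte
 *           read never runs past the end (assumption of this sketch)
 * nr_ports: number of ports covered; 0 means "no bitmap", in which case
 *           the bitmap is never read
 */
static bool io_access_allowed(const uint8_t *bitmap, unsigned int nr_ports,
                              unsigned int port, unsigned int bytes)
{
    uint16_t mask;

    assert(bytes >= 1 && bytes <= 4);

    if ( nr_ports == 0 || port + bytes > nr_ports )
        return false;

    /* Always read two bytes: a multi-byte access may straddle a boundary. */
    mask = bitmap[port >> 3] | ((uint16_t)bitmap[(port >> 3) + 1] << 8);

    /* All 'bytes' consecutive bits starting at bit (port & 7) must be clear. */
    return (mask & (((1u << bytes) - 1) << (port & 7))) == 0;
}
```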

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agox86/pv: Handle #PF correctly when reading the IO permission bitmap
Andrew Cooper [Mon, 30 Sep 2024 15:20:29 +0000 (16:20 +0100)]
x86/pv: Handle #PF correctly when reading the IO permission bitmap

The switch statement in guest_io_okay() is a very expensive way of
pre-initialising x with ~0, and performing a partial read into it.

However, the logic isn't correct either.

In a real TSS, the CPU always reads two bytes (like here), and any TSS limit
violation turns silently into no-access.  But, in-limit accesses trigger #PF
as usual.  AMD document this property explicitly, and while Intel don't (so
far as I can tell), they do behave consistently with AMD.

Switch from __copy_from_guest_offset() to __copy_from_guest_pv(), like
everything else in this file.  This removes code generation setting up
copy_from_user_hvm() (in the likely path even), and safety LFENCEs from
evaluate_nospec().

Change the logic to raise #PF if __copy_from_guest_pv() fails, rather than
disallowing the IO port access.  This brings the behaviour better in line with
normal x86.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agox86/pv: Rework guest_io_okay() to return X86EMUL_*
Andrew Cooper [Mon, 30 Sep 2024 15:09:51 +0000 (16:09 +0100)]
x86/pv: Rework guest_io_okay() to return X86EMUL_*

In order to fix a bug with guest_io_okay() (subsequent patch), rework
guest_io_okay() to take in an emulation context, and return X86EMUL_* rather
than a boolean.

For the failing case, take the opportunity to inject #GP explicitly, rather
than returning X86EMUL_UNHANDLEABLE.  There is a logical difference between
"we know what this is, and it's #GP", vs "we don't know what this is".

There is no change in practice as emulation is the final step on general #GP
resolution, but returning X86EMUL_UNHANDLEABLE would be a latent bug if a
subsequent action were to appear.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agox86/MSR: improve code gen for rdmsr_safe() and rdtsc()
Jan Beulich [Tue, 1 Oct 2024 07:47:32 +0000 (09:47 +0200)]
x86/MSR: improve code gen for rdmsr_safe() and rdtsc()

To fold two 32-bit outputs from the asm()-s into a single 64-bit value
the compiler needs to emit a zero-extension insn for the low half. Both
RDMSR and RDTSC clear the upper halves of their output registers anyway,
though. So despite that zero-extending insn (a simple MOV) being cheap,
we can do better: Without one, by declaring the local variables as 64-
bit ones.
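
In portable C the folding looks roughly as follows.  The helpers are
hypothetical stand-ins; the real code uses asm() outputs, which this
sketch does not model, so it only shows the data flow:

```c
#include <stdint.h>

/*
 * Stand-in for RDMSR/RDTSC: deliver the low and high 32 bits of a value
 * in two separate outputs.  Both outputs are 64-bit variables whose
 * upper halves are already zero.
 */
static void read_halves(uint64_t value, uint64_t *lo, uint64_t *hi)
{
    *lo = (uint32_t)value;
    *hi = value >> 32;
}

/* With 64-bit halves, no separate zero-extension of 'lo' is required. */
static uint64_t fold_halves(uint64_t lo, uint64_t hi)
{
    return (hi << 32) | lo;
}
```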

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86: use alternative_input() in cache_flush()
Jan Beulich [Tue, 1 Oct 2024 07:47:05 +0000 (09:47 +0200)]
x86: use alternative_input() in cache_flush()

There's no point using alternative_io() when there are no outputs. While
there drop the unnecessary semicolon after "ds".

No functional change.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoiommu/amd-vi: make IOMMU list ro after init
Roger Pau Monné [Tue, 1 Oct 2024 07:46:09 +0000 (09:46 +0200)]
iommu/amd-vi: make IOMMU list ro after init

The only functions to modify the list, amd_iommu_detect_one_acpi() and
amd_iommu_init_cleanup(), are already init.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agox86/traps: Re-enable interrupts after reading cr2 in the #PF handler
Alejandro Vallejo [Tue, 1 Oct 2024 07:45:49 +0000 (09:45 +0200)]
x86/traps: Re-enable interrupts after reading cr2 in the #PF handler

Hitting a page fault clobbers %cr2, so if a page fault is handled while
handling a previous page fault then %cr2 will hold the address of the
latter fault rather than the former. In particular, if a debug key
handler happens to trigger during #PF and before %cr2 is read, and that
handler itself encounters a #PF, then %cr2 will be corrupt for the outer #PF
handler.

This patch makes the page fault path delay re-enabling IRQs until %cr2
has been read in order to ensure it stays consistent.

A similar argument holds in additional cases, but they happen to be safe:
    * %dr6 inside #DB: Safe because IST exceptions don't re-enable IRQs.
    * MSR_XFD_ERR inside #NM: Safe because AMX isn't used in #NM handler.

While in the area, remove redundant q suffix to a movq in entry.S and
the space after the comma.

Fixes: a4cd20a19073 ("[XEN] 'd' key dumps both host and guest state.")
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agox86/PV: simplify (and thus correct) guest accessor functions
Jan Beulich [Tue, 1 Oct 2024 07:44:55 +0000 (09:44 +0200)]
x86/PV: simplify (and thus correct) guest accessor functions

Taking a fault on a non-byte-granular insn means that the "number of
bytes not handled" return value would need extra care in calculating, if
we want callers to be able to derive e.g. exception context (to be
injected to the guest) - CR2 for #PF in particular - from the value. To
simplify things rather than complicating them, reduce inline assembly to
just byte-granular string insns. On recent CPUs that's also supposed to
be more efficient anyway.

For singular element accessors, however, alignment checks are added,
hence slightly complicating the code. Misaligned (user) buffer accesses
will now be forwarded to copy_{from,to}_guest_ll().

Naturally copy_{from,to}_unsafe_ll() accessors end up being adjusted the
same way, as they're produced by mere re-processing of the same code.
Otoh copy_{from,to}_unsafe() aren't similarly adjusted, but have their
comments made to match reality; down the road we may want to change their
return types, e.g. to bool.
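
Why byte granularity simplifies the "bytes not handled" return value can
be seen in a small model.  All names here are hypothetical, and a fault
is simulated by a marker address rather than by a real exception:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy byte by byte and stop at the first "faulting" source address.
 * Returns the number of bytes NOT copied, 0 on full success.
 */
static size_t copy_bytes_until_fault(uint8_t *dst, const uint8_t *src,
                                     size_t n, const uint8_t *fault_addr)
{
    size_t i;

    for ( i = 0; i < n; i++ )
    {
        if ( src + i == fault_addr )
            break;
        dst[i] = src[i];
    }

    return n - i;
}

/*
 * With byte granularity the faulting address (what CR2 would hold for
 * #PF) follows directly from the return value.
 */
static const uint8_t *faulting_address(const uint8_t *src, size_t n,
                                       size_t not_copied)
{
    return src + (n - not_copied);
}
```

With wider accesses the return value would be rounded to the access
size, and the caller could no longer recover the exact address this way.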

Fixes: 76974398a63c ("Added user-memory accessing functionality for x86_64")
Fixes: 7b8c36701d26 ("Introduce clear_user and clear_guest")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agodrivers/video: Convert source files to UTF-8
Frediano Ziglio [Thu, 26 Sep 2024 15:46:06 +0000 (16:46 +0100)]
drivers/video: Convert source files to UTF-8

Most of the tools nowadays assume this encoding.
These files do not specify any encoding so convert them to the default.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agotools: Add new function to do PIRQ (un)map on PVH dom0
Jiqian Chen [Mon, 30 Sep 2024 08:14:01 +0000 (10:14 +0200)]
tools: Add new function to do PIRQ (un)map on PVH dom0

When dom0 is PVH and a device is passed through to a domU, xl uses
the device's gsi number to do a pirq mapping (see
pci_add_dm_done->xc_physdev_map_pirq). However, the gsi number is
read from /sys/bus/pci/devices/<sbdf>/irq, which confuses irq with
gsi: they are in different spaces and are not equal, so the
mapping fails.
To solve this issue, get the real gsi and add a new function,
xc_physdev_map_pirq_gsi, to get a free pirq for the gsi.
Note: the current function xc_physdev_map_pirq is not reused
because it doesn't support allocating a free pirq, and changing
it would affect its callers; hence the new
xc_physdev_map_pirq_gsi.

Besides, a PVH dom0 doesn't have the PIRQs flag, so it doesn't do
PHYSDEVOP_map_pirq for each gsi. As a result, the grant call path
pci_add_dm_done->XEN_DOMCTL_irq_permission fails in
domain_pirq_to_irq. The old hypercall XEN_DOMCTL_irq_permission
requires a pirq to be passed in, so it is not suitable for granting
irq permission from a PVH dom0 that doesn't have PIRQs.
To solve this issue, use another hypercall,
XEN_DOMCTL_gsi_permission, to grant permission for the irq
(translated from the gsi) to the domU when dom0 has no PIRQs.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Chen Jiqian <Jiqian.Chen@amd.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
7 months agotools: Add new function to get gsi from dev
Jiqian Chen [Mon, 30 Sep 2024 08:13:46 +0000 (10:13 +0200)]
tools: Add new function to get gsi from dev

On PVH dom0, when passing a device through to a domU, QEMU and the xl
tools want to use the gsi number to do pirq mapping (see the QEMU code
xen_pt_realize->xc_physdev_map_pirq and the xl code
pci_add_dm_done->xc_physdev_map_pirq). In the current code, however,
the gsi number is read from /sys/bus/pci/devices/<sbdf>/irq. That is
wrong: irq is not equal to gsi, as they are in different spaces, so
pirq mapping fails.

Also, the current code gives userspace no method to get the gsi.
For the above purpose, add a new function to get the gsi; the
corresponding ioctl is implemented on the Linux kernel side.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Chen Jiqian <Jiqian.Chen@amd.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
7 months agox86/irq: allow setting IRQ permissions from GSI instead of pIRQ
Jiqian Chen [Mon, 30 Sep 2024 08:13:15 +0000 (10:13 +0200)]
x86/irq: allow setting IRQ permissions from GSI instead of pIRQ

Some domains are not aware of the pIRQ abstraction layer that maps
interrupt sources into Xen space interrupt numbers.  pIRQ values are
only exposed to domains that have the option to route physical
interrupts over event channels.

This creates issues for PCI-passthrough from a PVH domain, as some of
the passthrough related hypercalls use pIRQ as references to physical
interrupts on the system.  One such interface,
XEN_DOMCTL_irq_permission, used to grant or revoke access to
interrupts, takes a pIRQ as the reference to the interrupt to be
adjusted.

Since PVH doesn't manage interrupts in terms of pIRQs, introduce a new
hypercall that allows setting interrupt permissions based on GSI value
rather than pIRQ.

Note the GSI hypercall parameter is translated to an IRQ value (in
case there are ACPI overrides) before doing the checks.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agoxen/riscv: introduce and initialize SBI RFENCE extension
Oleksii Kurochko [Mon, 30 Sep 2024 08:12:40 +0000 (10:12 +0200)]
xen/riscv: introduce and initialize SBI RFENCE extension

Introduce functions to work with the SBI RFENCE extension for issuing
various fence operations to remote CPUs.

Add the sbi_init() function along with auxiliary functions and macro
definitions for proper initialization and checking the availability of
SBI extensions. Currently, this is implemented only for RFENCE.

Introduce sbi_remote_sfence_vma() to send SFENCE_VMA instructions to
a set of target HARTs. This will support the implementation of
flush_xen_tlb_range_va().

Integrate __sbi_rfence_v02 from Linux kernel 6.6.0-rc4 with minimal
modifications:
 - Adapt to Xen code style.
 - Use cpuid_to_hartid() instead of cpuid_to_hartid_map[].
 - Update BIT(...) to BIT(..., UL).
 - Rename __sbi_rfence_v02_call to sbi_rfence_v02_real and
   remove the unused arg5.
 - Handle NULL cpu_mask to execute rfence on all CPUs by calling
   sbi_rfence_v02_real(..., 0UL, -1UL,...) instead of creating hmask.
 - change type for start_addr and size to vaddr_t and size_t.
 - Add an explanatory comment about when batching can and cannot occur,
   and why batching happens in the first place.
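
The batching idea can be modelled in plain C as below.  This is a
simplified sketch and not the code added by the patch: hart IDs are
passed as an array, calls are recorded in a log instead of being issued
to the SBI, and all names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define MAX_CALLS     8

struct rfence_call { unsigned long hmask, hbase; };
struct rfence_log  { struct rfence_call calls[MAX_CALLS]; size_t nr; };

/* Stand-in for issuing one SBI call covering (hmask, hbase). */
static void record_call(struct rfence_log *log, unsigned long hmask,
                        unsigned long hbase)
{
    assert(log->nr < MAX_CALLS);
    log->calls[log->nr].hmask = hmask;
    log->calls[log->nr].hbase = hbase;
    log->nr++;
}

/*
 * Group hart IDs into (hmask, hbase) pairs.  A hart joins the current
 * batch only while all batched IDs fit into one unsigned long worth of
 * bits counted from hbase; otherwise the batch is flushed first.
 */
static void batch_harts(const unsigned long *hartids, size_t n,
                        struct rfence_log *log)
{
    unsigned long hmask = 0, hbase = 0, htop = 0;
    size_t i;

    for ( i = 0; i < n; i++ )
    {
        unsigned long hartid = hartids[i];

        if ( hmask )
        {
            if ( hartid + BITS_PER_LONG <= htop ||
                 hbase + BITS_PER_LONG <= hartid )
            {
                /* Does not fit into the current window: flush it. */
                record_call(log, hmask, hbase);
                hmask = 0;
            }
            else if ( hartid < hbase )
            {
                /* Shift the mask so the window starts at the lower ID. */
                hmask <<= hbase - hartid;
                hbase = hartid;
            }
        }

        if ( !hmask )
        {
            hbase = hartid;
            htop = hartid;
        }
        else if ( hartid > htop )
            htop = hartid;

        hmask |= 1UL << (hartid - hbase);
    }

    if ( hmask )
        record_call(log, hmask, hbase);
}
```

Batching keeps the number of calls low when hart IDs are close
together, while sparse IDs simply result in more calls.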

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>