xenbits.xensource.com Git - people/royger/xen.git/log
8 weeks ago xen/events: fix global virq handling (virq-v2, gitlab/virq-v2)
Juergen Gross [Fri, 7 Mar 2025 10:11:41 +0000 (11:11 +0100)]
xen/events: fix global virq handling

VIRQs are split into "global" and "per vcpu" ones. Unfortunately in
reality there are "per domain" ones, too.

send_global_virq() and set_global_virq_handler() only make sense for
the real "global" ones, so replace virq_is_global() with a new
function get_virq_type() returning one of the 3 possible types (global,
domain, vcpu VIRQ).

To make its intended purpose clearer, also rename
send_guest_global_virq() to send_guest_domain_virq().

Fixes: 980822c5edd1 ("xen/events: allow setting of global virq handler only for unbound virqs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
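
The three-way classification described in the commit above can be sketched
as follows. This is a minimal illustration, not the code from the patch: the
enum names, the EX_* VIRQ numbers and the helper may_set_global_handler() are
hypothetical, and the signature of get_virq_type() is an assumption.

```c
#include <assert.h>

/* The three VIRQ classes named in the commit message. */
enum virq_type { VIRQ_TYPE_GLOBAL, VIRQ_TYPE_DOMAIN, VIRQ_TYPE_VCPU };

/* Hypothetical VIRQ numbers, for illustration only. */
enum {
    EX_VIRQ_TIMER = 0,      /* delivered to one vCPU */
    EX_VIRQ_CONSOLE = 1,    /* one handler domain system-wide */
    EX_VIRQ_PER_DOMAIN = 2, /* delivered to a specific domain */
    EX_NR_VIRQS = 3
};

/* Assumed signature: classify a VIRQ number into one of the three types. */
static enum virq_type get_virq_type(unsigned int virq)
{
    assert(virq < EX_NR_VIRQS);

    switch (virq) {
    case EX_VIRQ_TIMER:
        return VIRQ_TYPE_VCPU;
    case EX_VIRQ_PER_DOMAIN:
        return VIRQ_TYPE_DOMAIN;
    default:
        return VIRQ_TYPE_GLOBAL;
    }
}

/* Only real global VIRQs may have their handler domain changed. */
static int may_set_global_handler(unsigned int virq)
{
    return get_virq_type(virq) == VIRQ_TYPE_GLOBAL;
}
```

With a boolean "is global" test, the per-domain case would have to be lumped
in with one of the other two; a three-valued result lets callers reject it
explicitly.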
8 weeks ago xen/events: fix get_global_virq_handler() usage without hardware domain
Juergen Gross [Thu, 6 Mar 2025 16:23:36 +0000 (17:23 +0100)]
xen/events: fix get_global_virq_handler() usage without hardware domain

Some use cases of get_global_virq_handler() didn't account for the
case of running without hardware domain.

Fix that by testing get_global_virq_handler() returning NULL where
needed (e.g. when directly dereferencing the result).

Fixes: 980822c5edd1 ("xen/events: allow setting of global virq handler only for unbound virqs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
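
The kind of NULL check the commit above describes might look like the sketch
below. It assumes the handler lookup falls back to the hardware domain when
no explicit handler is set; the names get_handler(), handler_domid() and the
two globals are hypothetical stand-ins, not the hypervisor's own code.

```c
#include <assert.h>
#include <stddef.h>

struct domain { int domain_id; };

/* Hypothetical stand-ins: either pointer may legitimately be NULL. */
static struct domain *explicit_handler;
static struct domain *hardware_domain;

/* Assumed lookup: explicit handler if set, else the hardware domain. */
static struct domain *get_handler(void)
{
    return explicit_handler ? explicit_handler : hardware_domain;
}

/* Returns the handler's domain ID, or -1 when there is no handler at all. */
static int handler_domid(void)
{
    const struct domain *d = get_handler();

    return d ? d->domain_id : -1; /* check before dereferencing */
}
```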
2 months ago XSM: correct xsm_get_domain_state()
Jan Beulich [Thu, 6 Mar 2025 14:21:52 +0000 (15:21 +0100)]
XSM: correct xsm_get_domain_state()

Add the missing first parameter and move it next to a close relative.

Fixes: 3ad3df1bd0aa ("xen: add new domctl get_domain_state")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago Revert "EFI: Avoid crash calling PrintErrMesg() from efi_multiboot2()"
Jan Beulich [Thu, 6 Mar 2025 14:20:39 +0000 (15:20 +0100)]
Revert "EFI: Avoid crash calling PrintErrMesg() from efi_multiboot2()"

This reverts commit eaed0d185ab8b73cd18ac2830878520b3011f5ab. It breaks the
build with old Clang (3.8).

2 months ago config: update Mini-OS commit
Juergen Gross [Thu, 6 Mar 2025 13:54:50 +0000 (14:54 +0100)]
config: update Mini-OS commit

Update the Mini-OS upstream revision.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/public: add missing Xenstore commands to xs_wire.h
Juergen Gross [Thu, 6 Mar 2025 13:03:51 +0000 (14:03 +0100)]
xen/public: add missing Xenstore commands to xs_wire.h

The GET_FEATURE, SET_FEATURE, GET_QUOTA and SET_QUOTA Xenstore commands
are defined in docs/misc/xenstore.txt, but they are missing in
xs_wire.h.

Add the missing commands to xs_wire.h

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/public: remove some unused defines from xs_wire.h
Juergen Gross [Thu, 6 Mar 2025 13:03:37 +0000 (14:03 +0100)]
xen/public: remove some unused defines from xs_wire.h

xs_wire.h contains some defines XS_WRITE_* which seem to be leftovers
from some decades ago. They haven't been used in the Xen tree since at
least Xen 2.0 and they make no sense anyway.

Remove them, as they seem not to be related to any Xen interface we
have today.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago RISCV/bitops: Use Zbb to provide arch-optimised bitops
Andrew Cooper [Thu, 6 Mar 2025 13:03:15 +0000 (14:03 +0100)]
RISCV/bitops: Use Zbb to provide arch-optimised bitops

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months ago xen/riscv: identify specific ISA supported by cpu
Oleksii Kurochko [Thu, 6 Mar 2025 13:02:51 +0000 (14:02 +0100)]
xen/riscv: identify specific ISA supported by cpu

Supported ISA extensions are specified in the device tree within the CPU
node, using two properties: `riscv,isa-extensions` and `riscv,isa`.

Currently, Xen does not support the `riscv,isa-extensions` property; support
for it will be added in the future.

The `riscv,isa` property is parsed for each CPU, and the common extensions
are stored in the `host_riscv_isa` bitmap.
This bitmap is then used by `riscv_isa_extension_available()` to check
if a specific extension is supported.

The current implementation is based on Linux kernel v6.12-rc3
implementation with the following changes:
 - Drop unconditional setting of {RISCV_ISA_EXT_ZICSR,
   RISCV_ISA_EXT_ZIFENCEI, RISCV_ISA_EXT_ZICNTR, RISCV_ISA_EXT_ZIHPM} because
   Xen is going to run on hardware produced after the aforementioned
   extensions were split out of "i".
 - Remove saving of the ISA for each CPU, only the common available ISA is
   saved.
 - Remove ACPI-related code as ACPI is not supported by Xen.
 - Drop handling of elf_hwcap, since Xen does not provide hwcap to
   userspace.
 - Replace of_cpu_device_node_get() API, which is not available in Xen,
   with a combination of dt_for_each_child_node(), dt_device_type_is_equal(),
   and dt_get_cpuid_from_node() to retrieve cpuid and riscv,isa in
   riscv_fill_hwcap_from_isa_string().
 - Rename arguments of __RISCV_ISA_EXT_DATA() from _name to ext_name, and
   _id to ext_id for clarity.
 - Replace instances of __RISCV_ISA_EXT_DATA with RISCV_ISA_EXT_DATA.
 - Replace instances of __riscv_isa_extension_available with
   riscv_isa_extension_available for consistency. Also, update the type of
   `bit` argument of riscv_isa_extension_available().
 - Redefine RISCV_ISA_EXT_DATA() to work only with ext_name and ext_id,
   as other fields are not used in Xen currently. Also RISCV_ISA_EXT_DATA()
   is reworked in the way to take only one argument `ext_name`.
 - Add check of first 4 letters of riscv,isa string to
   riscv_isa_parse_string() as Xen doesn't do this check before so it is
   necessary to check correctness of riscv,isa string (it should start with
   rv{32,64}, taking into account upper and lower case of "rv").
   Additionally, check that 'i' follows 'rv{32,64}' to be sure that
   `out_bitmap` can't be empty.
 - Drop an argument of riscv_fill_hwcap() and riscv_fill_hwcap_from_isa_string()
   as it isn't used, at the moment.
 - Update the comment message about QEMU workaround.
 - Apply Xen coding style.
 - s/pr_info/printk.
 - Drop handling of uppercase letters of riscv,isa in riscv_isa_parse_string() as
   Xen checks that riscv,isa should be in lowercase according to the device tree
   bindings.
 - Update logic of riscv_isa_parse_string(): it now stops parsing riscv,isa
   if an illegal symbol is found instead of ignoring it.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
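
The parsing rules listed in the commit above (prefix check, mandatory 'i',
lowercase only, stop at the first illegal symbol, keep only the ISA common to
all CPUs) can be sketched as below. This is a simplified illustration under
stated assumptions: parse_isa() and common_isa() are hypothetical names, only
single-letter extensions are recorded, and everything after the first '_' is
skipped rather than matched against a table of multi-letter extensions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: parse the single-letter part of a riscv,isa string.
 * Bit n of *out is set when extension ('a' + n) is present.
 * Returns 0 on success, -1 on a malformed string.
 */
static int parse_isa(const char *isa, uint32_t *out)
{
    uint32_t bits = 0;

    /* The string must start with lowercase "rv32" or "rv64". */
    if (strncmp(isa, "rv32", 4) != 0 && strncmp(isa, "rv64", 4) != 0)
        return -1;
    isa += 4;

    /* 'i' must follow, so the resulting bitmap can never be empty. */
    if (*isa != 'i')
        return -1;

    /* Stop at the first illegal symbol instead of ignoring it. */
    for (; *isa != '\0' && *isa != '_'; isa++) {
        if (*isa < 'a' || *isa > 'z')
            return -1;
        bits |= UINT32_C(1) << (*isa - 'a');
    }

    *out = bits;
    return 0;
}

/* Only extensions present on every CPU end up in the common bitmap. */
static uint32_t common_isa(const uint32_t *per_cpu, size_t n)
{
    uint32_t common = ~UINT32_C(0);

    for (size_t i = 0; i < n; i++)
        common &= per_cpu[i];

    return n ? common : 0;
}
```

Intersecting the per-CPU bitmaps is what makes it safe to keep a single
host-wide bitmap instead of one per CPU.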
2 months ago xen/riscv: make zbb as mandatory
Oleksii Kurochko [Thu, 6 Mar 2025 13:01:53 +0000 (14:01 +0100)]
xen/riscv: make zbb as mandatory

According to riscv/booting.txt, it is expected that Zbb should be supported.

Drop ANDN_INSN() in asm/cmpxchg.h as Zbb is mandatory now, so the `andn`
instruction can be used directly.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/riscv: drop CONFIG_RISCV_ISA_RV64G
Oleksii Kurochko [Thu, 6 Mar 2025 13:01:26 +0000 (14:01 +0100)]
xen/riscv: drop CONFIG_RISCV_ISA_RV64G

'G' stands for "imafd_zicsr_zifencei".

Extensions 'f' and 'd' aren't really needed for Xen, and allowing floating
point registers to be used can lead to crashes.

Extensions 'i', 'm', 'a', 'zicsr', and 'zifencei' are necessary for the
operation of Xen, which is why they are used explicitly (unconditionally)
in -march.

Drop "Base ISA" choice from riscv/Kconfig as it is always empty.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago automation: drop debian:11-riscv64 container
Oleksii Kurochko [Thu, 6 Mar 2025 13:01:07 +0000 (14:01 +0100)]
automation: drop debian:11-riscv64 container

There are two reasons for that:
1. In the README, GCC baseline is chosen to be 12.2, whereas Debian 11
   uses GCC 10.2.1.
2. Xen mandates some Z extensions, but GCC 10.2.1 does not
   support Z extensions in -march, causing the compilation to fail.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months ago VMX: convert vmx_vmfunc
Jan Beulich [Thu, 6 Mar 2025 13:00:25 +0000 (14:00 +0100)]
VMX: convert vmx_vmfunc

... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_ept_vpid_cap
Jan Beulich [Thu, 6 Mar 2025 12:59:56 +0000 (13:59 +0100)]
VMX: convert vmx_ept_vpid_cap

... to fields in the capability/controls struct: Take the opportunity
and split the two halves into separate EPT and VPID fields.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_vmentry_control
Jan Beulich [Thu, 6 Mar 2025 12:59:30 +0000 (13:59 +0100)]
VMX: convert vmx_vmentry_control

... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_vmexit_control
Jan Beulich [Thu, 6 Mar 2025 12:59:09 +0000 (13:59 +0100)]
VMX: convert vmx_vmexit_control

... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_tertiary_exec_control
Jan Beulich [Thu, 6 Mar 2025 12:58:47 +0000 (13:58 +0100)]
VMX: convert vmx_tertiary_exec_control

... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_secondary_exec_control
Jan Beulich [Thu, 6 Mar 2025 12:58:24 +0000 (13:58 +0100)]
VMX: convert vmx_secondary_exec_control

... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_cpu_based_exec_control
Jan Beulich [Thu, 6 Mar 2025 12:58:04 +0000 (13:58 +0100)]
VMX: convert vmx_cpu_based_exec_control

... to a field in the capability/controls struct.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_pin_based_exec_control
Jan Beulich [Thu, 6 Mar 2025 12:57:41 +0000 (13:57 +0100)]
VMX: convert vmx_pin_based_exec_control

... to a field in the capability/controls struct. Use an instance of
that struct also in vmx_init_vmcs_config().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: convert vmx_basic_msr
Jan Beulich [Thu, 6 Mar 2025 12:57:21 +0000 (13:57 +0100)]
VMX: convert vmx_basic_msr

... to a struct field, which is then going to be accompanied by other
capability/control data presently living in individual variables. As
this structure isn't supposed to be altered post-boot, put it in
.data.ro_after_init right away.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago VMX: drop vmcs_revision_id
Jan Beulich [Thu, 6 Mar 2025 12:56:49 +0000 (13:56 +0100)]
VMX: drop vmcs_revision_id

It's effectively redundant with vmx_basic_msr. For the #define
replacement to work, struct vmcs_struct's respective field name also
needs to change: Drop the not really meaningful "vmcs_" prefix from it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago x86/HVM: improve CET-IBT pruning of ENDBR
Jan Beulich [Thu, 6 Mar 2025 12:56:21 +0000 (13:56 +0100)]
x86/HVM: improve CET-IBT pruning of ENDBR

__init{const,data}_cf_clobber can have an effect only for pointers
actually populated in the respective tables. While not the case for SVM
right now, VMX installs a number of pointers only under certain
conditions. Hence the respective functions would have their ENDBR purged
only when those conditions are met. Invoke "pruning" functions after
having copied the respective tables, for them to install any "missing"
pointers.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago tools/xenstored: use new stable interface instead of libxenctrl
Juergen Gross [Thu, 6 Mar 2025 12:54:55 +0000 (13:54 +0100)]
tools/xenstored: use new stable interface instead of libxenctrl

Replace the current use of the unstable xc_domain_getinfo_single()
interface with the stable domctl XEN_DOMCTL_get_domain_state call
via the new libxenmanage library.

This will remove the last usage of libxenctrl by Xenstore, so update
the library dependencies accordingly.

For now only do a direct replacement without using the functionality
of obtaining information about domains having changed the state.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
2 months ago tools/libs: add a new libxenmanage library
Juergen Gross [Thu, 6 Mar 2025 12:53:56 +0000 (13:53 +0100)]
tools/libs: add a new libxenmanage library

In order to have a stable interface in user land for using stable
domctl and possibly later sysctl interfaces, add a new library
libxenmanage.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
2 months ago xen: add new domctl get_domain_state
Juergen Gross [Thu, 6 Mar 2025 12:52:38 +0000 (13:52 +0100)]
xen: add new domctl get_domain_state

Add a new domctl sub-function to get data of a domain having changed
state (this is needed by Xenstore).

The returned state just contains the domid, the domain unique id,
and some flags (existing, shutdown, dying).

In order to enable Xenstore stubdom being built for multiple Xen
versions, make this domctl stable.  For stable domctls the
interface_version is always 0.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen: add bitmap to indicate per-domain state changes
Juergen Gross [Thu, 6 Mar 2025 12:52:14 +0000 (13:52 +0100)]
xen: add bitmap to indicate per-domain state changes

Add a bitmap with one bit per possible domid indicating the respective
domain has changed its state (created, deleted, dying, crashed,
shutdown).

Registering the VIRQ_DOM_EXC event will result in setting the bits for
all existing domains and resetting all other bits.

As the usage of this bitmap is tightly coupled with the VIRQ_DOM_EXC
event, it is meant to be used only by a single consumer in the system,
just like the VIRQ_DOM_EXC event.

Resetting a bit will be done in a future patch.

This information is needed for Xenstore to keep track of all domains.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
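
A bitmap with one bit per domid, as described in the commit above, could be
sketched like this. The names mark_changed(), test_changed() and
reset_for_consumer() are hypothetical, and MAX_DOMID is an assumed bound on
ordinary domain IDs; locking and the eventual per-bit reset are left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_DOMID     0x7FF0u /* assumed bound on ordinary domain IDs */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NWORDS        ((MAX_DOMID + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per possible domid: set means "state changed since last look". */
static unsigned long changed[NWORDS];

static void mark_changed(unsigned int domid)
{
    assert(domid < MAX_DOMID);
    changed[domid / BITS_PER_WORD] |= 1UL << (domid % BITS_PER_WORD);
}

static bool test_changed(unsigned int domid)
{
    assert(domid < MAX_DOMID);
    return (changed[domid / BITS_PER_WORD] >> (domid % BITS_PER_WORD)) & 1UL;
}

/*
 * When the single consumer registers, every existing domain is flagged
 * and all other bits are cleared, so the consumer starts from a full view.
 */
static void reset_for_consumer(const unsigned int *existing, size_t n)
{
    memset(changed, 0, sizeof(changed));
    for (size_t i = 0; i < n; i++)
        mark_changed(existing[i]);
}
```

Because registration rewrites the whole bitmap, the design only works with a
single consumer, which matches the single-consumer nature of VIRQ_DOM_EXC
stated in the commit message.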
2 months ago xen/events: allow setting of global virq handler only for unbound virqs
Juergen Gross [Thu, 6 Mar 2025 12:51:55 +0000 (13:51 +0100)]
xen/events: allow setting of global virq handler only for unbound virqs

XEN_DOMCTL_set_virq_handler will happily steal a global virq from the
current domain having bound it and assign it to another domain. The
former domain will just never receive any further events for that
virq without knowing what happened.

Change the behavior to allow XEN_DOMCTL_set_virq_handler only if the
virq in question is not bound by the current domain allowed to use it.

Currently the only user of XEN_DOMCTL_set_virq_handler in the Xen code
base is init-xenstore-domain, so changing the behavior like above will
not cause any problems.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
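
The rule introduced by the commit above, refusing to reassign a global virq
while its current handler domain has it bound, can be modelled in a few
lines. struct virq_slot, set_handler() and the -1 error value are all
hypothetical; the real domctl path and its locking are not shown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-VIRQ bookkeeping. */
struct virq_slot {
    int handler_domid; /* domain currently allowed to use the virq */
    bool bound;        /* true while that domain has the virq bound */
};

/* Refuse to move the handler while the current one is still bound. */
static int set_handler(struct virq_slot *s, int new_domid)
{
    if (s->bound)
        return -1; /* hypothetical error value */

    s->handler_domid = new_domid;
    return 0;
}
```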
2 months ago xen/events: don't allow binding a global virq from any domain
Juergen Gross [Thu, 6 Mar 2025 12:51:35 +0000 (13:51 +0100)]
xen/events: don't allow binding a global virq from any domain

Today Xen will happily allow binding a global virq by a domain which
isn't configured to receive it. This won't result in any bad actions,
but the bind will appear to have succeeded with no event ever being
received by that event channel.

Instead of allowing the bind, error out if the domain isn't set to
handle that virq. Note that this check is inside the write_lock() on
purpose, as a future patch will put a related check into
set_global_virq_handler() with the addition of using the same lock.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
2 months ago EFI: Avoid crash calling PrintErrMesg() from efi_multiboot2()
Frediano Ziglio [Thu, 6 Mar 2025 12:51:01 +0000 (13:51 +0100)]
EFI: Avoid crash calling PrintErrMesg() from efi_multiboot2()

Although the code is compiled with the -fpic option, data is not position
independent. This causes data pointers to become invalid if the
code is not relocated properly, which is what happens for
efi_multiboot2, which is called by the multiboot entry code.

Tested by adding
   PrintErrMesg(L"Test message", EFI_BUFFER_TOO_SMALL);
in efi_multiboot2 before calling efi_arch_edd (this function
can potentially call PrintErrMesg).

Before the patch (XenServer installation on Qemu, xen replaced
with vanilla xen.gz):
  Booting `XenServer (Serial)'Booting `XenServer (Serial)'
  Test message: !!!! X64 Exception Type - 0E(#PF - Page-Fault)  CPU Apic ID - 00000000 !!!!
  ExceptionData - 0000000000000000  I:0 R:0 U:0 W:0 P:0 PK:0 SS:0 SGX:0
  RIP  - 000000007EE21E9A, CS  - 0000000000000038, RFLAGS - 0000000000210246
  RAX  - 000000007FF0C1B5, RCX - 0000000000000050, RDX - 0000000000000010
  RBX  - 0000000000000000, RSP - 000000007FF0C180, RBP - 000000007FF0C210
  RSI  - FFFF82D040467CE8, RDI - 0000000000000000
  R8   - 000000007FF0C1C8, R9  - 000000007FF0C1C0, R10 - 0000000000000000
  R11  - 0000000000001020, R12 - FFFF82D040467CE8, R13 - 000000007FF0C1B8
  R14  - 000000007EA33328, R15 - 000000007EA332D8
  DS   - 0000000000000030, ES  - 0000000000000030, FS  - 0000000000000030
  GS   - 0000000000000030, SS  - 0000000000000030
  CR0  - 0000000080010033, CR2 - FFFF82D040467CE8, CR3 - 000000007FC01000
  CR4  - 0000000000000668, CR8 - 0000000000000000
  DR0  - 0000000000000000, DR1 - 0000000000000000, DR2 - 0000000000000000
  DR3  - 0000000000000000, DR6 - 00000000FFFF0FF0, DR7 - 0000000000000400
  GDTR - 000000007F9DB000 0000000000000047, LDTR - 0000000000000000
  IDTR - 000000007F48E018 0000000000000FFF,   TR - 0000000000000000
  FXSAVE_STATE - 000000007FF0BDE0
  !!!! Find image based on IP(0x7EE21E9A) (No PDB)  (ImageBase=000000007EE20000, EntryPoint=000000007EE23935) !!!!

After the patch:
  Booting `XenServer (Serial)'Booting `XenServer (Serial)'
  Test message: Buffer too small
  BdsDxe: loading Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)
  BdsDxe: starting Boot0000 "UiApp" from Fv(7CB8BDC9-F8EB-4F34-AAEA-3EE4AF6516A1)/FvFile(462CAA21-7614-4503-836E-8AB6F4662331)

This partially rolls back commit 00d5d5ce23e6.

Fixes: 9180f5365524 ("x86: add multiboot2 protocol support for EFI platforms")
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 months ago xen/arm: mpu: Ensure that the page size is 4KB
Ayan Kumar Halder [Tue, 4 Mar 2025 17:57:08 +0000 (17:57 +0000)]
xen/arm: mpu: Ensure that the page size is 4KB

Similar to commit (d736b6eb451b, "xen/arm: mpu: Define Xen start address for
MPU systems"), one needs to add a build assertion to ensure that the page size
is 4KB on arm32 based systems as well.
The existing build assertion is moved under "xen/arch/arm/mpu" as it applies
for both arm64 and arm32 based systems.

Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
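
A build-time assertion of the kind mentioned in the commit above can be
written in standard C as shown below. This sketch uses C11 _Static_assert
rather than the hypervisor's own build-assertion macro, and the PAGE_SHIFT
and PAGE_SIZE definitions are assumed here for illustration.

```c
#include <assert.h>

/* Assumed definitions; the real values come from the architecture headers. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Compilation fails if the page size is ever configured to something else. */
_Static_assert(PAGE_SIZE == 4096u, "MPU code assumes a 4KB page size");
```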
2 months ago xen/arm: mpu: Move some of the definitions to common file
Ayan Kumar Halder [Tue, 4 Mar 2025 17:57:07 +0000 (17:57 +0000)]
xen/arm: mpu: Move some of the definitions to common file

For AArch32, refer to ARM DDI 0568A.c ID110520.
MPU_REGION_SHIFT is the same between AArch32 and AArch64 (HPRBAR).
Also, NUM_MPU_REGIONS_SHIFT is the same between AArch32 and AArch64
(HMPUIR).

Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
2 months ago Xen: CI fixes from XSN-2
Andrew Cooper [Wed, 5 Mar 2025 22:17:22 +0000 (22:17 +0000)]
Xen: CI fixes from XSN-2

 * Add cf_check annotation to cmp_patch_id() used by bsearch().
 * Add U suffix to the K[] table to fix MISRA Rule 7.2 violations.

Fixes: 372af524411f ("xen/lib: Introduce SHA2-256")
Fixes: 630e8875ab36 ("x86/ucode: Perform extra SHA2 checks on AMD Fam17h/19h microcode")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months ago x86/ucode: Perform extra SHA2 checks on AMD Fam17h/19h microcode
Andrew Cooper [Fri, 13 Dec 2024 14:34:00 +0000 (14:34 +0000)]
x86/ucode: Perform extra SHA2 checks on AMD Fam17h/19h microcode

Collisions have been found in the microcode signing algorithm used by AMD
Fam17h/19h CPUs, and now anyone can sign their own.

For more details, see:
  https://bughunters.google.com/blog/5424842357473280/zen-and-the-art-of-microcode-hacking
  https://www.amd.com/en/resources/product-security/bulletin/amd-sb-7033.html

As a stopgap mitigation, check the digest of patches against a table of blobs
with known provenance.  These are all Fam17h and Fam19h blobs included in
linux-firmware at the time of writing, specifically:

  https://git.kernel.org/firmware/linux-firmware/c/48bb90cceb882cab8e9ab692bc5779d3bf3a13b8

These checks can be opted out of by booting with ucode=no-digest-check, but
doing so is not recommended.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
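
The stopgap check described in the commit above, looking a patch up in a
table of known digests, might be structured as in the sketch below. It
assumes the table is sorted by patch ID so that bsearch() can be used, and it
takes the SHA2-256 digest as an already-computed input. struct known_patch,
digest_known() and the table contents are hypothetical placeholders, not real
microcode data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DIGEST_LEN 32 /* SHA2-256 digest size in bytes */

struct known_patch {
    uint32_t patch_id;
    uint8_t digest[DIGEST_LEN];
};

/* Placeholder entries, sorted by patch_id as bsearch() requires. */
static const struct known_patch known[] = {
    { 0x1000, { 0xaa } },
    { 0x2000, { 0xbb } },
};

/* bsearch() comparator: key is a patch ID, element is a table entry. */
static int cmp_patch_id(const void *key, const void *elem)
{
    uint32_t id = *(const uint32_t *)key;
    const struct known_patch *p = elem;

    return (id > p->patch_id) - (id < p->patch_id);
}

/* Accept a patch only if its ID is listed and its digest matches. */
static bool digest_known(uint32_t patch_id, const uint8_t digest[DIGEST_LEN])
{
    const struct known_patch *p =
        bsearch(&patch_id, known, sizeof(known) / sizeof(known[0]),
                sizeof(known[0]), cmp_patch_id);

    return p != NULL && memcmp(p->digest, digest, DIGEST_LEN) == 0;
}
```

An allow-list like this rejects any blob not in the table, including
legitimate newer ones, which is presumably why the commit describes it as a
stopgap and offers an opt-out.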
2 months ago xen/lib: Introduce SHA2-256
Andrew Cooper [Fri, 13 Dec 2024 14:34:00 +0000 (14:34 +0000)]
xen/lib: Introduce SHA2-256

A future change will need to calculate SHA2-256 digests.  Introduce an
implementation in lib/, derived from Trenchboot which itself is derived from
Linux.

In order to be useful to other architectures, it is careful with endianness
and misaligned accesses as well as being more MISRA friendly, but is only
wired up for x86 in the short term.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago Revert "xen/riscv: drop CONFIG_RISCV_ISA_RV64G"
Jan Beulich [Wed, 5 Mar 2025 16:06:23 +0000 (17:06 +0100)]
Revert "xen/riscv: drop CONFIG_RISCV_ISA_RV64G"

This reverts commit 86b1b8ec3d9d0508a95540e368432291b883837f. It
fails in CI without an adjustment there.

2 months ago tools/xl: fix channel configuration setting
Juergen Gross [Wed, 5 Mar 2025 15:37:37 +0000 (16:37 +0100)]
tools/xl: fix channel configuration setting

Channels work differently than other device types: their devid should
be -1 initially in order to distinguish them from the primary console
which has the devid of 0.

So when parsing the channel configuration, use
ARRAY_EXTEND_INIT_NODEVID() in order to avoid overwriting the devid
set by libxl_device_channel_init().

Fixes: 3a6679634766 ("libxl: set channel devid when not provided by application")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
2 months ago x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states()
Alejandro Vallejo [Wed, 5 Mar 2025 15:37:14 +0000 (16:37 +0100)]
x86/xstate: Map/unmap xsave area in {compress,expand}_xsave_states()

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/domctl: Map/unmap xsave area in arch_get_info_guest()
Alejandro Vallejo [Wed, 5 Mar 2025 15:37:02 +0000 (16:37 +0100)]
x86/domctl: Map/unmap xsave area in arch_get_info_guest()

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu()
Alejandro Vallejo [Wed, 5 Mar 2025 15:36:25 +0000 (16:36 +0100)]
x86/hvm: Map/unmap xsave area in hvmemul_{get,put}_fpu()

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv()
Alejandro Vallejo [Wed, 5 Mar 2025 15:35:57 +0000 (16:35 +0100)]
x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_setbv()

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/fpu: Map/umap xsave area in vcpu_{reset,setup}_fpu()
Alejandro Vallejo [Wed, 5 Mar 2025 15:35:37 +0000 (16:35 +0100)]
x86/fpu: Map/umap xsave area in vcpu_{reset,setup}_fpu()

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt()
Alejandro Vallejo [Wed, 5 Mar 2025 15:35:04 +0000 (16:35 +0100)]
x86/hvm: Map/unmap xsave area in hvm_save_cpu_ctxt()

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/xstate: Create map/unmap primitives for xsave areas
Alejandro Vallejo [Wed, 5 Mar 2025 15:34:27 +0000 (16:34 +0100)]
x86/xstate: Create map/unmap primitives for xsave areas

Add infrastructure to simplify ASI handling. With ASI in the picture
we'll have several different means of accessing the XSAVE area of a
given vCPU, depending on whether a domain is covered by ASI or not and
whether the vCPU in question is scheduled on the current pCPU or not.

Having these complexities exposed at the call sites becomes unwieldy
very fast. These wrappers are intended to be used in a similar way to
map_domain_page() and unmap_domain_page(). The map operation will
dispatch the appropriate pointer for each case in a future patch, while
unmap will remain a no-op where no unmap is required (e.g. when there's
no ASI) and remove the transient mapping if one was required.

Follow-up patches replace all uses of raw v->arch.xsave_area by this
mechanism, in preparation for adding the aforementioned dispatch logic
at a later time.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
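
The map/unmap wrapper pattern from the commit above can be illustrated as
follows. This sketch assumes the simplest case, where map just returns the
stored pointer and unmap only invalidates the caller's copy; the names
map_xsave_area(), unmap_xsave_area() and the reduced struct layouts are
hypothetical and stand in for the real primitives.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the real structures. */
struct xsave_struct { unsigned char data[64]; };
struct vcpu { struct xsave_struct *xsave_area; };

/* Today: hand back the raw pointer. Later: dispatch on ASI and scheduling. */
static struct xsave_struct *map_xsave_area(struct vcpu *v)
{
    return v->xsave_area;
}

/* Clearing the caller's pointer catches accidental use after unmap. */
static void unmap_xsave_area(struct vcpu *v, struct xsave_struct **x)
{
    (void)v; /* a future implementation may drop a transient mapping here */
    *x = NULL;
}

/* A call site brackets every access with map and unmap. */
static void set_first_byte(struct vcpu *v, unsigned char val)
{
    struct xsave_struct *xsa = map_xsave_area(v);

    xsa->data[0] = val;
    unmap_xsave_area(v, &xsa);
}
```

Routing every access through one pair of wrappers means the later dispatch
logic only has to change in one place rather than at each call site.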
2 months ago xen/cpufreq: abstract Energy Performance Preference value
Penny Zheng [Wed, 5 Mar 2025 14:45:10 +0000 (15:45 +0100)]
xen/cpufreq: abstract Energy Performance Preference value

Intel's hwp Energy Performance Preference value is compatible with
CPPC's Energy Performance Preference value, so this commit abstracts
the value and moves it to the common header file cpufreq.h, so that it
can be used beyond hwp in the future.

Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/riscv: drop CONFIG_RISCV_ISA_RV64G
Oleksii Kurochko [Wed, 5 Mar 2025 14:44:12 +0000 (15:44 +0100)]
xen/riscv: drop CONFIG_RISCV_ISA_RV64G

'G' stands for "imafd_zicsr_zifencei".

Extensions 'f' and 'd' aren't really needed for Xen, and allowing floating
point registers to be used can lead to crashes.

Extensions 'i', 'm', 'a', 'zicsr', and 'zifencei' are necessary for the
operation of Xen, which is why they are used explicitly (unconditionally)
in -march.

Drop "Base ISA" choice from riscv/Kconfig as it is always empty.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/README: add compiler and binutils versions for RISCV-64
Oleksii Kurochko [Wed, 5 Mar 2025 14:43:55 +0000 (15:43 +0100)]
xen/README: add compiler and binutils versions for RISCV-64

Considering that the Zbb extension is supported since GCC version 12 [1]
and that older GCC versions do not support Z extensions in -march (I haven't
faced this issue for GCC >=11.2), leading to compilation failures,
the baseline version for GCC is set to 12.2 and for GNU binutils to 2.39.

The GCC version is set to 12.2 instead of 12.1 because Xen's GitLab CI uses
Debian 12, which includes GCC 12.2 and GNU binutils 2.39.

[1] https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=149e217033f01410a9783c5cb2d020cf8334ae4c

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/list: fix comments in include/xen/list.h
Juergen Gross [Wed, 5 Mar 2025 14:43:32 +0000 (15:43 +0100)]
xen/list: fix comments in include/xen/list.h

There are several places in list.h where "list_struct" is used instead
of "struct list_head". Fix that.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/console: introduce console_{get,put}_domain()
Denis Mukhin [Wed, 5 Mar 2025 14:42:49 +0000 (15:42 +0100)]
xen/console: introduce console_{get,put}_domain()

console_input_domain() takes an RCU lock to protect the domain structure.
That implies a call to rcu_unlock_domain() after use.

Introduce a pair of console_get_domain() / console_put_domain() to highlight
the correct use of the call within the code interacting with Xen console
driver.

The new calls are used in __serial_rx(), which also fixes console forwarding to
late hardware domains which run with domain IDs different from 0.

While moving the guest_printk() invocation also drop the redundant _G infix.

Signed-off-by: Denis Mukhin <dmukhin@ford.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
2 months ago x86/HVM: drop redundant access splitting
Jan Beulich [Wed, 5 Mar 2025 14:42:12 +0000 (15:42 +0100)]
x86/HVM: drop redundant access splitting

With all paths into hvmemul_linear_mmio_access() coming through
linear_{read,write}(), there's no need anymore to split accesses at
page boundaries there. Leave an assertion, though.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago x86/HVM: slightly improve CMPXCHG16B emulation
Jan Beulich [Wed, 5 Mar 2025 14:41:14 +0000 (15:41 +0100)]
x86/HVM: slightly improve CMPXCHG16B emulation

Using hvmemul_linear_mmio_write() directly (as fallback when mapping the
memory operand isn't possible) won't work properly when the access
crosses a RAM/MMIO boundary. Use linear_write() instead, which splits at
such boundaries as necessary.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
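
Splitting an access at boundaries, as the commit above relies on, can be
sketched with page granularity, since that is the granularity at which the
backing type of guest memory can change. split_access() is a hypothetical
helper and a 4KB page size is assumed; the real linear_write() additionally
decides per chunk whether it targets RAM or MMIO, which is not modelled here.

```c
#include <assert.h>

#define PAGE_SIZE 4096u /* assumed page size */

/*
 * Break [addr, addr + bytes) into chunks that never cross a page boundary.
 * Chunk lengths are stored in chunks[]; the number of chunks is returned.
 */
static unsigned int split_access(unsigned long addr, unsigned int bytes,
                                 unsigned int chunks[], unsigned int max)
{
    unsigned int n = 0;

    while (bytes != 0 && n < max) {
        /* Bytes left before the end of the current page. */
        unsigned int room = PAGE_SIZE - (unsigned int)(addr & (PAGE_SIZE - 1));
        unsigned int len = bytes < room ? bytes : room;

        chunks[n++] = len;
        addr += len;
        bytes -= len;
    }

    return n;
}
```

A 16-byte access such as the one CMPXCHG16B performs yields two 8-byte chunks
when it starts 8 bytes before a page boundary, and a single chunk otherwise.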
2 months ago x86/dom0: be less restrictive with the Interrupt Address Range
Roger Pau Monne [Wed, 12 Feb 2025 10:37:50 +0000 (11:37 +0100)]
x86/dom0: be less restrictive with the Interrupt Address Range

Xen currently prevents dom0 from creating CPU or IOMMU page-table mappings
into the interrupt address range [0xfee00000, 0xfeefffff].  This range has
two different purposes.  For accesses from the CPU it contains the default
position of the local APIC page at 0xfee00000.  For accesses from devices
it's the MSI address range, so the address field in the MSI entries
(usually) points to an address in that range to trigger an interrupt.

There are reports of Lenovo Thinkpad devices placing what seems to be the
UCSI shared mailbox at address 0xfeec2000 in the interrupt address range.
Attempting to use that device with a Linux PV dom0 leads to an error when
the Linux kernel maps 0xfeec2000:

RIP: e030:xen_mc_flush+0x1e8/0x2b0
 xen_leave_lazy_mmu+0x15/0x60
 vmap_range_noflush+0x408/0x6f0
 __ioremap_caller+0x20d/0x350
 acpi_os_map_iomem+0x1a3/0x1c0
 acpi_ex_system_memory_space_handler+0x229/0x3f0
 acpi_ev_address_space_dispatch+0x17e/0x4c0
 acpi_ex_access_region+0x28a/0x510
 acpi_ex_field_datum_io+0x95/0x5c0
 acpi_ex_extract_from_field+0x36b/0x4e0
 acpi_ex_read_data_from_field+0xcb/0x430
 acpi_ex_resolve_node_to_value+0x2e0/0x530
 acpi_ex_resolve_to_value+0x1e7/0x550
 acpi_ds_evaluate_name_path+0x107/0x170
 acpi_ds_exec_end_op+0x392/0x860
 acpi_ps_parse_loop+0x268/0xa30
 acpi_ps_parse_aml+0x221/0x5e0
 acpi_ps_execute_method+0x171/0x3e0
 acpi_ns_evaluate+0x174/0x5d0
 acpi_evaluate_object+0x167/0x440
 acpi_evaluate_dsm+0xb6/0x130
 ucsi_acpi_dsm+0x53/0x80
 ucsi_acpi_read+0x2e/0x60
 ucsi_register+0x24/0xa0
 ucsi_acpi_probe+0x162/0x1e3
 platform_probe+0x48/0x90
 really_probe+0xde/0x340
 __driver_probe_device+0x78/0x110
 driver_probe_device+0x1f/0x90
 __driver_attach+0xd2/0x1c0
 bus_for_each_dev+0x77/0xc0
 bus_add_driver+0x112/0x1f0
 driver_register+0x72/0xd0
 do_one_initcall+0x48/0x300
 do_init_module+0x60/0x220
 __do_sys_init_module+0x17f/0x1b0
 do_syscall_64+0x82/0x170

Remove the restrictions to create mappings in the interrupt address range
for dom0.  Note that the restriction to map the local APIC page is enforced
separately, and that continues to be present.  Additionally make sure the
emulated local APIC page is also not mapped, in case dom0 is using it.

Note that even if the interrupt address range entries are populated in the
IOMMU page-tables no device access will reach those pages.  Device accesses
to the Interrupt Address Range will always be converted into Interrupt
Messages and are not subject to DMA remapping.

There's also the following restriction noted in Intel VT-d:

> Software must not program paging-structure entries to remap any address to
> the interrupt address range. Untranslated requests and translation requests
> that result in an address in the interrupt range will be blocked with
> condition code LGN.4 or SGN.8. Translated requests with an address in the
> interrupt address range are treated as Unsupported Request (UR).

Similarly for AMD-Vi:

> Accesses to the interrupt address range (Table 3) are defined to go through
> the interrupt remapping portion of the IOMMU and not through address
> translation processing. Therefore, when a transaction is being processed as
> an interrupt remapping operation, the transaction attribute of
> pretranslated or untranslated is ignored.
>
> Software Note: The IOMMU should
> not be configured such that an address translation results in a special
> address such as the interrupt address range.

However those restrictions don't apply to the identity mappings possibly
created for dom0, since the interrupt address range is never subject to DMA
remapping, and hence there's no output address after translation that
belongs to the interrupt address range.

Reported-by: Jürgen Groß <jgross@suse.com>
Link: https://lore.kernel.org/xen-devel/baade0a7-e204-4743-bda1-282df74e5f89@suse.com/
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months agox86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables
Roger Pau Monne [Fri, 14 Feb 2025 09:39:29 +0000 (10:39 +0100)]
x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables

The current code in arch_iommu_hwdom_init() kind of open-codes the same
MMIO permission ranges that are added to the hardware domain ->iomem_caps.
Avoid this duplication and use ->iomem_caps in arch_iommu_hwdom_init() to
filter which memory regions should be added to the dom0 IOMMU page-tables.

Note the IO-APIC and MCFG page(s) must be set as not accessible for a PVH
dom0, otherwise the internal Xen emulation for those ranges won't work.
This requires adjustments in dom0_setup_permissions().

The call to pvh_setup_mmcfg() in dom0_construct_pvh() must now strictly be
done ahead of setting up dom0 permissions, so take the opportunity to also
put it inside the existing is_hardware_domain() region.

Also the special casing of E820_UNUSABLE regions no longer needs to be done
in arch_iommu_hwdom_init(), as those regions are already blocked in
->iomem_caps and thus would be removed from the rangeset as part of
->iomem_caps processing in arch_iommu_hwdom_init().  The E820_UNUSABLE
regions below 1Mb are not removed from ->iomem_caps, that's a slight
difference for the IOMMU created page-tables, but the aim is to allow
access to the same memory either from the CPU or the IOMMU page-tables.

Since ->iomem_caps already takes into account the domain max paddr, there's
no need to remove any regions past the last address addressable by the
domain, as applying ->iomem_caps would have already taken care of that.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months agox86/dom0: correctly set the maximum ->iomem_caps bound for PVH
Roger Pau Monne [Tue, 18 Feb 2025 16:57:49 +0000 (17:57 +0100)]
x86/dom0: correctly set the maximum ->iomem_caps bound for PVH

The logic in dom0_setup_permissions() sets the maximum bound in
->iomem_caps unconditionally using paddr_bits, which is not correct for HVM
based domains.  Instead use domain_max_paddr_bits() to get the correct
maximum paddr bits for each possible domain type.

Switch to using PFN_DOWN() instead of PAGE_SHIFT, as that's shorter.
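
To show why the two spellings are equivalent, here is a small sketch. The macro definitions mirror the usual shape of PAGE_SHIFT and PFN_DOWN() but are redefined locally as assumptions, and max_pfn_for_bits() is a hypothetical helper, not the code touched by this patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT  12                       /* assumed 4 KiB pages */
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)      /* local stand-in definition */

/*
 * Hypothetical helper: highest page frame number addressable with the
 * given number of physical address bits (PAGE_SHIFT <= bits < 64).
 */
static uint64_t max_pfn_for_bits(unsigned int paddr_bits)
{
    return PFN_DOWN(UINT64_C(1) << paddr_bits) - 1;
}
```

With 36 address bits this yields 0xffffff, the same value as the open-coded (1 << (36 - PAGE_SHIFT)) - 1.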

Fixes: 53de839fb409 ('x86: constrain MFN range Dom0 may access')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months agox86/dom0: attempt to fixup p2m page-faults for PVH dom0
Roger Pau Monne [Thu, 13 Feb 2025 09:58:45 +0000 (10:58 +0100)]
x86/dom0: attempt to fixup p2m page-faults for PVH dom0

When building a PVH dom0 Xen attempts to map all (relevant) MMIO regions
into the p2m for dom0 access.  However the information Xen has about the
host memory map is limited.  Xen doesn't have access to any resources
described in ACPI dynamic tables, and hence the p2m mappings provided might
not be complete.

PV doesn't suffer from this issue because a PV dom0 is capable of mapping
into its page-tables any address not explicitly banned in d->iomem_caps.

Introduce a new command line option that allows Xen to attempt to fixup
the p2m page-faults, by creating p2m identity maps in response to p2m
page-faults.

This is intended as a workaround for small ACPI regions Xen doesn't know about.
Note that missing large MMIO regions mapped in this way will lead to
slowness due to the VM exit processing, plus the mappings will always use
small pages.

The ultimate aim is to attempt to bring better parity with a classic PV
dom0.

Note such fixup relies on the CPU doing the access to the unpopulated
address.  If the access is attempted from a device instead there's no
possible way to fixup, as IOMMU page-faults are asynchronous.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Only slightly tested on my local PVH dom0 deployment.
---
Changes since v1:
 - Make the fixup function static.
 - Print message in case mapping already exists.

2 months agox86/emul: dump unhandled memory accesses for PVH dom0
Roger Pau Monne [Thu, 13 Feb 2025 08:08:01 +0000 (09:08 +0100)]
x86/emul: dump unhandled memory accesses for PVH dom0

A PV dom0 can map any host memory as long as it's allowed by the IO
capability range in d->iomem_caps.  On the other hand, a PVH dom0 has no
way to populate MMIO regions into its p2m, so it's limited to what Xen
initially populates on the p2m based on the host memory map and the enabled
device BARs.

Introduce a new debug build only printk that reports attempts by dom0 to
access addresses not populated on the p2m, and not handled by any emulator.
This is for information purposes only, but might allow getting an idea of
what MMIO ranges might be missing on the p2m.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months agox86/IDT: Rename X86_NR_VECTORS to X86_IDT_VECTORS
Andrew Cooper [Thu, 2 Jan 2025 16:56:59 +0000 (16:56 +0000)]
x86/IDT: Rename X86_NR_VECTORS to X86_IDT_VECTORS

Observant readers may have noticed that the FRED spec has another 8 bits of
space reserved immediately following the vector field.

Make the existing constant more precise.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months agox86/IDT: Collect IDT related content idt.h
Andrew Cooper [Wed, 1 Jan 2025 15:43:20 +0000 (15:43 +0000)]
x86/IDT: Collect IDT related content idt.h

Logic concerning the IDT is somewhat different to the other system tables, and
in particular ought not to be in asm/processor.h.  Collect it together in a new
header.

While doing so, make a few minor adjustments:

 * Make set_ist() use volatile rather than ACCESS_ONCE(), as
   _write_gate_lower() already does, removing the need for xen/lib.h.

 * Move the BUILD_BUG_ON() from subarch_percpu_traps_init() into mm.c's
   build_assertions(), rather than including idt.h into x86_64/traps.c.

 * Drop UL from IST constants.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months agox86: Sort includes in various files
Andrew Cooper [Wed, 1 Jan 2025 15:51:57 +0000 (15:51 +0000)]
x86: Sort includes in various files

FRED support involves quite a lot of header file shuffling and cleanup.  Start
by sorting the includes of impacted files, and dropping duplicates.

  domain.c: Double asm/spec_ctrl.h
  power.c:  Double xen/sched.h
  setup.c:  Double xen/serial.h
  mm.c:     Double xen/mm.h

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months agoxen: Don't cast away const-ness in vcpu_show_registers()
Andrew Cooper [Mon, 30 Dec 2024 06:41:46 +0000 (06:41 +0000)]
xen: Don't cast away const-ness in vcpu_show_registers()

The final hunk is `(struct vcpu *)v` in disguise, expressed using a runtime
pointer chase through memory and a technicality of the C type system to work
around the fact that get_hvm_registers() strictly requires a mutable pointer.

For anyone interested, this is one reason why C cannot optimise any reads
across sequence points, even for a function purporting to take a const object.

Anyway, have the function correctly state that it needs a mutable vcpu.  All
callers have a mutable vCPU to hand, and it removes the runtime pointer chase
in x86.

Make one style adjustment in ARM while adjusting the parameter type.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoautomation/eclair: Reduce verbosity of ECLAIR logs.
Nicola Vetrini [Tue, 4 Mar 2025 17:49:36 +0000 (18:49 +0100)]
automation/eclair: Reduce verbosity of ECLAIR logs.

While activating verbose logging simplifies debugging, this causes
GitLab logs to be truncated, preventing the links to the ECLAIR
analysis database from being shown.

No functional change.

Fixes: c4392ec83244 ("automation: Add ECLAIR utilities and settings")
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoCHANGELOG.md: Set release date for 4.20
Andrew Cooper [Mon, 3 Mar 2025 14:06:55 +0000 (14:06 +0000)]
CHANGELOG.md: Set release date for 4.20

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agocommon: remove -fno-stack-protector from EMBEDDED_EXTRA_CFLAGS
Volodymyr Babchuk [Mon, 17 Feb 2025 02:49:16 +0000 (02:49 +0000)]
common: remove -fno-stack-protector from EMBEDDED_EXTRA_CFLAGS

This patch is preparation for making stack protector
configurable. First step is to remove -fno-stack-protector flag from
EMBEDDED_EXTRA_CFLAGS so separate components (Hypervisor in this case)
can enable/disable this feature by themselves.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months agoIOMMU/VT-d: Fix comment
Andrew Cooper [Mon, 24 Feb 2025 17:05:02 +0000 (17:05 +0000)]
IOMMU/VT-d: Fix comment

"find upstream bridge" is surprisingly jarring in context, considering that's
the name of the function whose return value we're testing.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months agoxen/spinlock: Don't perpetuate broken API in new logic
Andrew Cooper [Tue, 19 Mar 2024 11:17:16 +0000 (11:17 +0000)]
xen/spinlock: Don't perpetuate broken API in new logic

The single user wants this the sane way around.  Write it as a normal static
inline just like rspin_lock().

Fixes: cc3e8df542ed ("xen/spinlock: add rspin_[un]lock_irq[save|restore]()")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
2 months agox86/asm: Remove semicolon from LOCK prefix
Andrew Cooper [Fri, 28 Feb 2025 21:50:01 +0000 (21:50 +0000)]
x86/asm: Remove semicolon from LOCK prefix

Most of Xen's LOCK prefixes are already without semicolon, but we have a few
still remaining in the tree.

As noted in the Linux patch, this adversely affects size/inlining decisions,
and prevents the assembler from diagnosing certain classes of error.

No functional change.

Link: https://lore.kernel.org/lkml/20250228085149.2478245-1-ubizjak@gmail.com/
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months agoxen/arm: Fix platforms Kconfig indent
Bertrand Marquis [Mon, 3 Mar 2025 10:27:15 +0000 (11:27 +0100)]
xen/arm: Fix platforms Kconfig indent

Fix platforms/Kconfig and Kconfig.debug help indent to respect the
standard (tab + 2 spaces).
While there also move some default in Kconfig.debug before the help
message.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agoxen/arm: Don't blindly print hwdom in generic panic messages
Michal Orzel [Mon, 3 Mar 2025 08:56:50 +0000 (09:56 +0100)]
xen/arm: Don't blindly print hwdom in generic panic messages

These functions are generic and used not only for hardware domain. This
creates confusion when printing any of these panic messages (e.g.
failure when loading domU kernel would result in informing a user about
a failure in loading hwdom kernel).

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
2 months agoxen/arm: static-shmem: Drop unused size_cells
Michal Orzel [Mon, 3 Mar 2025 08:56:49 +0000 (09:56 +0100)]
xen/arm: static-shmem: Drop unused size_cells

Value stored in size_cells is never read because we're only interested
in retrieving gbase address of shmem region for which we only need
address cells.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
2 months agoxen/arm: Check return code from fdt_finish_reservemap()
Michal Orzel [Mon, 3 Mar 2025 08:56:48 +0000 (09:56 +0100)]
xen/arm: Check return code from fdt_finish_reservemap()

fdt_finish_reservemap() may fail (with -FDT_ERR_NOSPACE) in which case
further DTB creation (in prepare_dtb_hwdom()) makes no sense. Fix it.

Fixes: 13bb63b754e4 ("device tree,arm: supply a flat device tree to dom0")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
2 months agoxen/arm: dm: Bail out if padding != 0 for XEN_DMOP_set_irq_level
Michal Orzel [Mon, 3 Mar 2025 08:56:47 +0000 (09:56 +0100)]
xen/arm: dm: Bail out if padding != 0 for XEN_DMOP_set_irq_level

XEN_DMOP_set_irq_level operation requires elements of pad array (being
member of xen_dm_op_set_irq_level structure) to be 0. While handling the
hypercall we validate this. If one of the elements is not zero, we set
rc to -EINVAL. At this point we should stop further DM handling and bail
out propagating the error to the caller. However, instead of goto the
code uses break which has basically no meaningful effect. The rc value
is never read and the code continues with the hypercall processing ending
up (possibly) with the interrupt injection. Fix it.
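
The control-flow problem can be sketched in isolation. The structure and function below are simplified, hypothetical stand-ins (not the real xen_dm_op_set_irq_level handler); the point is only that break leaves the padding loop, whereas goto skips the rest of the handling.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the real op structure. */
struct irq_level_op {
    uint32_t irq;
    uint8_t level;
    uint8_t pad[3];
};

static int injected;  /* counts simulated interrupt injections */

static int handle_set_irq_level(const struct irq_level_op *op)
{
    int rc = 0;
    size_t i;

    for ( i = 0; i < sizeof(op->pad); i++ )
    {
        if ( op->pad[i] )
        {
            rc = -EINVAL;
            goto out;  /* a plain 'break' would only leave this loop */
        }
    }

    injected++;  /* stands in for the actual interrupt injection */

 out:
    return rc;
}
```

With break in place of goto, the injection line would still run for a request with non-zero padding, which is the behaviour the message describes.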

Fixes: 5d752df85f2c ("xen/dm: Introduce xendevicemodel_set_irq_level DM op")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
2 months agoxen/arm: Don't use copy_from_paddr for DTB relocation
Luca Fancellu [Wed, 26 Feb 2025 21:52:56 +0000 (21:52 +0000)]
xen/arm: Don't use copy_from_paddr for DTB relocation

Currently the early stage of the Arm boot maps the DTB using
early_fdt_map() with PAGE_HYPERVISOR_RO, which is cacheable
read-only memory. Later, during DTB relocation, the function
copy_from_paddr() is used to map pages in the same range on
the fixmap but using PAGE_HYPERVISOR_WC, which is non-cacheable
read-write memory.

The Arm specifications, ARM DDI0487L.a, section B2.11 "Mismatched
memory attributes" discourage using mismatched attributes for
aliases of the same location.

Given that there is nothing preventing the relocation since the region
is already mapped, fix that by open-coding copy_from_paddr inside
relocate_fdt, without mapping on the fixmap.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agodocs: add basic CI documentation
Marek Marczykowski-Górecki [Wed, 19 Feb 2025 02:56:55 +0000 (03:56 +0100)]
docs: add basic CI documentation

Include info on how to get access to and enable hardware runners, and how
to select individual jobs.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoautomation: add tools/tests jobs on the AMD Zen3+ runner too
Marek Marczykowski-Górecki [Wed, 19 Feb 2025 02:56:54 +0000 (03:56 +0100)]
automation: add tools/tests jobs on the AMD Zen3+ runner too

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@amd.com>
2 months agoautomation: allow selecting individual jobs via CI variables
Marek Marczykowski-Górecki [Wed, 19 Feb 2025 02:56:53 +0000 (03:56 +0100)]
automation: allow selecting individual jobs via CI variables

Debugging sometimes involves running specific jobs on different
versions. It's useful to easily skip all of the uninteresting
ones (for a given case) to save both time and CI resources. Doing so used
to require changing the yaml files, usually in several places.
Ease this step by adding SELECTED_JOBS_ONLY variable that takes a regex.
Note that one needs to satisfy job dependencies on their own (for
example if a test job needs a build job, that specific build job
needs to be included too).

The variable can be specified via Gitlab web UI when scheduling a
pipeline, but it can be also set when doing git push directly:

    git push -o ci.variable=SELECTED_JOBS_ONLY="/job1|job2/"

More details at https://docs.gitlab.co.jp/ee/user/project/push_options.html

The variable needs to include regex for selecting jobs, including
enclosing slashes.
A comma/space separated list of jobs to select would be friendlier UX,
but unfortunately that is not supported:
https://gitlab.com/gitlab-org/gitlab/-/issues/209904 (note the proposed
workaround doesn't work for job-level CI_JOB_NAME).
On the other hand, the regex is more flexible (one can select for
example all arm32 jobs).
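
To illustrate how a slash-enclosed value such as "/job1|job2/" selects jobs, the sketch below strips the slashes and applies a POSIX extended regex. This is an approximation only: GitLab evaluates the variable with its own regex engine, and job_selected() is a hypothetical helper, not part of the automation scripts.

```c
#include <assert.h>
#include <regex.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: 'var' is the variable value including the
 * enclosing slashes, 'job' is a job name. Matching is unanchored.
 */
static bool job_selected(const char *var, const char *job)
{
    size_t len = strlen(var);
    char pattern[256];
    regex_t re;
    bool match;

    /* Require the enclosing slashes and a pattern that fits the buffer. */
    if ( len < 2 || var[0] != '/' || var[len - 1] != '/' ||
         len - 2 >= sizeof(pattern) )
        return false;

    memcpy(pattern, var + 1, len - 2);
    pattern[len - 2] = '\0';

    if ( regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0 )
        return false;

    match = regexec(&re, job, 0, NULL, 0) == 0;
    regfree(&re);

    return match;
}
```

Because the match is unanchored, a pattern like "/arm32/" would pick up every job with "arm32" anywhere in its name, which is the flexibility mentioned above.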

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoautomation: add jobs running tests from tools/tests/*
Marek Marczykowski-Górecki [Wed, 19 Feb 2025 02:56:52 +0000 (03:56 +0100)]
automation: add jobs running tests from tools/tests/*

There are a bunch of tests in tools/tests/, let them run in CI.
For each subdirectory, expect that "make run" will run the test, and observe
its exit code. This way, adding new tests is easy, and they will be
automatically picked up.

For better visibility, log test output to junit xml format, and let
gitlab ingest it. Set SUT_ADDR variable with name/address of the system
under test, so a network can be used to extract the file. The actual
address is set using DHCP. And for the test internal network, still add
the 192.168.0.1 IP (but don't replace the DHCP-provided one).

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@amd.com>
2 months agoautomation: skip building domU if there is no test defined for it
Marek Marczykowski-Górecki [Wed, 19 Feb 2025 02:56:51 +0000 (03:56 +0100)]
automation: skip building domU if there is no test defined for it

This will be useful for later tests not using generic domU (unit tests,
xtf etc).

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@amd.com>
2 months agoautomation: Update ECLAIR analysis configuration
Nicola Vetrini [Fri, 14 Feb 2025 20:45:23 +0000 (21:45 +0100)]
automation: Update ECLAIR analysis configuration

The Xen configurations for the ARM64 and X86_64 ECLAIR analyses
are currently held in fixed files under
'automation/eclair_analysis/xen_{arm,x86}_config'. The values
of the configuration options there are susceptible to going stale
due to configuration option changes.

To enhance maintainability, the configuration under analysis is
derived from the respective architecture's defconfig, with suitable
changes added via EXTRA_XEN_CONFIG.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoxen/sched: address violation of MISRA C Rule 8.2
Nicola Vetrini [Fri, 14 Feb 2025 20:45:22 +0000 (21:45 +0100)]
xen/sched: address violation of MISRA C Rule 8.2

Rule 8.2 states: "Function types shall be in prototype form with
named parameters".

The parameter name is missing from the function pointer type
that constitutes the first parameter.
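
For readers unfamiliar with the rule, a generic illustration follows. The functions are invented for this example and are unrelated to the scheduler code actually changed; the non-compliant form is shown only in a comment.

```c
#include <assert.h>

/*
 * Flagged by Rule 8.2: the function pointer type's own parameter is unnamed.
 *
 *     int apply(int (*fn)(int), int value);
 *
 * Compliant form: every parameter, including the one inside the function
 * pointer type, carries a name.
 */
static int apply(int (*fn)(int arg), int value)
{
    return fn(value);
}

/* Hypothetical callback used to exercise apply(). */
static int twice(int arg)
{
    return 2 * arg;
}
```

Both declarations describe the same type, so adding the name changes nothing in the generated code, consistent with "No functional change".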

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
2 months agoxen/arm: platform: address violation of MISRA C Rule 7.2
Nicola Vetrini [Fri, 14 Feb 2025 20:45:21 +0000 (21:45 +0100)]
xen/arm: platform: address violation of MISRA C Rule 7.2

Rule 7.2 states: "A u or U suffix shall be applied to all integer
constants that are represented in an unsigned type".

Some PM_* constants are unsigned quantities, despite some
of them being representable in a signed type, so a 'U' suffix
should be present.
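
A small C11 sketch of what the rule is about, assuming a platform with 32-bit int. The constant names are hypothetical and are not the PM_* constants from the patch.

```c
#include <assert.h>

/* Evaluates to 1 if the expression has an unsigned integer type. */
#define IS_UNSIGNED(x) _Generic((x),      \
    unsigned int: 1,                      \
    unsigned long: 1,                     \
    unsigned long long: 1,                \
    default: 0)

/* Does not fit in a 32-bit int, so the constant is silently unsigned. */
#define EXAMPLE_IMPLICIT 0x80000000

/* Same value with the unsignedness made explicit, as Rule 7.2 asks. */
#define EXAMPLE_EXPLICIT 0x80000000U
```

The value and type are identical in both cases; the suffix only makes the type visible to the reader.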

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoxen/arm: platform: Add support for R-Car Gen4
Oleksandr Andrushchenko [Wed, 15 Jan 2025 09:21:43 +0000 (09:21 +0000)]
xen/arm: platform: Add support for R-Car Gen4

Add Rcar Gen4 platform choice to Kconfig to select all required
drivers automatically.

Changelog:
v1 -> v2:
- Added RB from Stefano Stabellini

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoARM: ITS: implement quirks and add support for Renesas Gen4 ITS
Oleksandr Andrushchenko [Wed, 15 Jan 2025 09:21:43 +0000 (09:21 +0000)]
ARM: ITS: implement quirks and add support for Renesas Gen4 ITS

A number of ITS implementations exist which differ from the base one in
functionality that is "IMPLEMENTATION DEFINED", e.g. there may be
differences in cacheability, shareability, memory requirements and others.
Such HW requirements need appropriate handling, which is implemented as
ITS quirks: GITS_IIDR (ITS Implementer Identification Register) is used to
differentiate the ITS implementations and select the appropriate set of
quirks, if any.

As an example, add a quirk implementation for the Renesas Gen4 ITS:
- add the possibility to override default cacheability and shareability
settings used for ITS memory allocations;
- change relevant memory allocations to alloc_xenheap_pages which allows
specifying memory access flags, free_xenheap_pages is used to free;
- add quirks validation to ensure that all ITSes share the same quirks
in case multiple ITSes are present in the system;

The Gen4 ITS memory requirements are not covered in any errata as of yet,
but observed behavior suggests that they are indeed required.

The idea of the quirk implementation is inspired by the Linux kernel ITS
quirk implementation [1].

Changelog:
v2 -> v3:
- added missing memset;
v1 -> v2:
- switched to using alloc_xenheap_pages/free_xenheap_pages for ITS memory
allocations;
- updated declaration of its_quirk_flags;
- added quirks validation to ensure that all ITSes share the same quirks;
- removed unnecessary vITS changes;

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
[1] https://elixir.bootlin.com/linux/v5.16.1/source/drivers/irqchip/irq-gic-v3-its.c

2 months agoxen/arm: Create GIC node using the node name from host dt
Michal Orzel [Wed, 19 Feb 2025 17:29:46 +0000 (18:29 +0100)]
xen/arm: Create GIC node using the node name from host dt

At the moment the GIC node we create for hwdom has a name
"interrupt-controller". Change it so that we use the same name as the
GIC node from host device tree. This is done for at least 2 purposes:
1) The convention in DT spec is that a node name with "reg" property
is formed "node-name@unit-address".
2) With DT overlay feature, many overlays refer to the GIC node using
the symbol under __symbols__ that we copy to hwdom 1:1. With the name
changed, the symbol is no longer valid and requires error prone manual
change by the user.

The unit-address part of the node name always refers to the first
address in the "reg" property which in case of GIC, always refers to
GICD and hwdom uses host memory layout.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoarch: arm64: always set IL=1 when injecting an abort exception
Volodymyr Babchuk [Thu, 13 Feb 2025 15:37:55 +0000 (15:37 +0000)]
arch: arm64: always set IL=1 when injecting an abort exception

ARM Architecture Reference Manual states that IL field of ESR_EL1
register should be 1 in some cases, and all these cases are covered by
inject_abt64_exception()

Section D24.2.40, page D24-7337 of ARM DDI 0487L:

  IL, bit [25]
  Instruction Length for synchronous exceptions. Possible values of this bit are:

  [...]

  0b1 - 32-bit instruction trapped.
  This value is also used when the exception is one of the following:
  [...]
   - An Instruction Abort exception.
   - A Data Abort exception for which the value of the ISV bit is 0.
  [...]

inject_abt64_exception() function injects either Instruction Abort or
Data Abort exception. In both cases, ISS is 0, which means that ISV
bit is 0 as well. Thus, IL must be set to 1 unconditionally.

To align code with the specification, set .len field to 1 in
inject_abt64_exception() and remove unneeded third parameter.
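
As a sketch of the resulting syndrome layout, the helper below composes a value with IL always set. The bit positions follow the quoted text (IL is bit 25) and the architectural EC field in bits [31:26]; the macro and function names are hypothetical and are not the Xen definitions.

```c
#include <assert.h>
#include <stdint.h>

#define ESR_EC_SHIFT 26                       /* EC occupies bits [31:26] */
#define ESR_EC_MASK  UINT32_C(0x3f)
#define ESR_IL       (UINT32_C(1) << 25)      /* IL, bit [25] */
#define ESR_ISS_MASK UINT32_C(0x1ffffff)      /* ISS, bits [24:0] */

#define EC_INSTR_ABORT_LOWER_EL UINT32_C(0x20)
#define EC_DATA_ABORT_LOWER_EL  UINT32_C(0x24)

/* Hypothetical helper: build a syndrome value with IL unconditionally 1. */
static uint32_t make_esr_il_set(uint32_t ec, uint32_t iss)
{
    return ((ec & ESR_EC_MASK) << ESR_EC_SHIFT) | ESR_IL |
           (iss & ESR_ISS_MASK);
}
```

With ISS equal to 0, an Instruction Abort from a lower EL comes out as 0x82000000 and a Data Abort as 0x92000000 in this sketch.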

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agoarch: arm64: always set IL=1 when injecting undefined exception
Volodymyr Babchuk [Thu, 13 Feb 2025 15:37:54 +0000 (15:37 +0000)]
arch: arm64: always set IL=1 when injecting undefined exception

ARM Architecture Reference Manual states that IL field of ESR_EL1
register should be 1 when EC is 0b000000 aka HSR_EC_UNKNOWN.

Section D24.2.40, page D24-7337 of ARM DDI 0487L:

  IL, bit [25]
  Instruction Length for synchronous exceptions. Possible values of this bit are:

  [...]

  0b1 - 32-bit instruction trapped.
  This value is also used when the exception is one of the following:
  [...]
   - An exception reported using EC value 0b000000.

To align code with the specification, set .len field to 1 in
inject_undef64_exception() and remove unneeded second parameter.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agodevice-tree: optimize size of struct dt_device_node
Michal Orzel [Wed, 12 Feb 2025 15:43:58 +0000 (17:43 +0200)]
device-tree: optimize size of struct dt_device_node

The current placement of fields in struct dt_device_node is not optimal and
introduces holes due to fields alignment.

Checked with "pahole xen-syms -C dt_device_node"

ARM64 size 144B, 16B holes:
/* size: 144, cachelines: 3, members: 15 */
/* sum members: 128, holes: 3, sum holes: 16 */
/* last cacheline: 16 bytes */
ARM32 size 72B, 4B holes
/* size: 72, cachelines: 2, members: 15 */
/* sum members: 68, holes: 2, sum holes: 4 */
/* last cacheline: 8 bytes */

This patch optimizes the size of struct dt_device_node by rearranging its
fields, which eliminates holes and reduces structure size by 16B (ARM64) and
4B (ARM32).

After ARM64 size 128B, no holes (-16B):
/* size: 128, cachelines: 2, members: 15 */
After ARM32 size 68B, no holes (-4B)
/* size: 68, cachelines: 2, members: 15 */
/* last cacheline: 4 bytes */
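
The effect of field ordering on padding can be reproduced with a toy example. The two structures below are hypothetical and much smaller than struct dt_device_node; they only show the mechanism.

```c
#include <assert.h>
#include <stddef.h>

/* Small fields separated by a pointer: padding follows each char. */
struct node_padded {
    char used;          /* 1 byte, then padding up to pointer alignment */
    const char *name;
    char is_protected;  /* 1 byte, then tail padding */
};

/* Same members, pointer first: the two chars share one padded slot. */
struct node_packed {
    const char *name;
    char used;
    char is_protected;
};
```

On a typical 64-bit ABI the first layout takes 24 bytes and the second 16; running pahole on the resulting binary would report the holes in the same style as the output quoted above.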

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Signed-off-by: Grygorii Strashko <grygorii_strashko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoarm/vuart: move vpl011-related code to vpl011 emulator
dmkhn@proton.me [Wed, 12 Feb 2025 21:19:58 +0000 (21:19 +0000)]
arm/vuart: move vpl011-related code to vpl011 emulator

The Xen console driver has vpl011-related logic which belongs in the vpl011
emulator code (Arm port). Move vpl011-related code from the arch-independent
console driver to Arm's vpl011.c.

Use rate-limiting guest_printk() for error logging in console driver in case
vpl011 cannot handle the console input.

Signed-off-by: Denis Mukhin <dmukhin@ford.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoxen/include: introduce resource.h
Denis Mukhin [Tue, 11 Feb 2025 15:55:44 +0000 (15:55 +0000)]
xen/include: introduce resource.h

Move resource definitions to a new architecture-agnostic shared header file.

Signed-off-by: Denis Mukhin <dmukhin@ford.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months agoxen/dom0less: support for vcpu affinity
Xenia Ragiadakou [Thu, 20 Feb 2025 21:37:12 +0000 (13:37 -0800)]
xen/dom0less: support for vcpu affinity

Add vcpu affinity to the dom0less bindings. Example:

    dom1 {
            ...
            cpus = <4>;
            vcpu0 {
                   compatible = "xen,vcpu";
                   id = <0>;
                   hard-affinity = "4-7";
            };
            vcpu1 {
                   compatible = "xen,vcpu";
                   id = <1>;
                   hard-affinity = "0-3,5";
            };
            vcpu2 {
                   compatible = "xen,vcpu";
                   id = <2>;
                   hard-affinity = "1,6";
            };
            ...

Note that the property hard-affinity is optional. It is possible to add
other properties in the future not only to specify soft affinity, but
also to provide more precise methods for configuring affinity. For
instance, on ARM the MPIDR could be used to specify the pCPU. For now, it
is left to the future.
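
The hard-affinity strings in the example use a range/list syntax such as "0-3,5". A minimal sketch of how such a string could be turned into a CPU bitmask is given below; parse_cpu_list() is a hypothetical helper limited to CPUs 0..63 and is not the parser Xen itself uses.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "a-b,c,..." into a bitmask of CPUs 0..63.
 * Returns 0 on success and -EINVAL on malformed input.
 */
static int parse_cpu_list(const char *s, uint64_t *mask)
{
    *mask = 0;

    while ( *s )
    {
        char *end;
        unsigned long first, last;

        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        first = last = strtoul(s, &end, 10);
        s = end;

        if ( *s == '-' )
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -EINVAL;
            last = strtoul(s, &end, 10);
            s = end;
        }

        if ( first > last || last > 63 )
            return -EINVAL;

        for ( ; first <= last; first++ )
            *mask |= UINT64_C(1) << first;

        if ( *s == ',' )
        {
            s++;
            if ( !*s )
                return -EINVAL;  /* trailing comma */
        }
        else if ( *s )
            return -EINVAL;      /* unexpected character */
    }

    return 0;
}
```

Under these assumptions the three strings from the example map to the masks 0xf0 ("4-7"), 0x2f ("0-3,5") and 0x42 ("1,6").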

Signed-off-by: Xenia Ragiadakou <xenia.ragiadakou@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Julien Grall <jgrall@amazon.com>
2 months agoxen/arm: introduce legacy dom0less option for xenstore allocation
Stefano Stabellini [Sat, 1 Feb 2025 00:42:12 +0000 (16:42 -0800)]
xen/arm: introduce legacy dom0less option for xenstore allocation

The new xenstore page allocation scheme might break older unpatched
Linux kernels that do not check for the Xenstore connection status
before proceeding with Xenstore initialization.

Introduce a dom0less configuration option to retain the older behavior.

The older behavior triggered by this option is to allocate the xenstore
page in init-dom0less. That does not work with static-mem guests.
However, it makes it possible to run older, unpatched Linux kernel
versions as regular guests.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agoinit-dom0less: allocate xenstore page if not already allocated
Stefano Stabellini [Sat, 1 Feb 2025 00:29:46 +0000 (16:29 -0800)]
init-dom0less: allocate xenstore page if not already allocated

We check if the xenstore page is already allocated. If yes, there is
nothing to do. If not, we proceed to allocate it to support old unpatched
Linux kernels.
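
The check-then-allocate flow described above can be sketched as follows.
This is a self-contained model, not the init-dom0less code: struct
domu_info, ensure_xs_page() and the XS_PFN_UNSET sentinel are all
hypothetical names and values chosen for illustration, and fallback_pfn
stands in for whatever the legacy allocation would produce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel meaning "no XenStore page recorded"; the real value may differ. */
#define XS_PFN_UNSET UINT64_MAX

/* Hypothetical stand-in for the per-domain state the tool would query. */
struct domu_info {
    uint64_t xs_pfn;
};

/*
 * Return true if a page had to be allocated (legacy path), false if an
 * already-allocated page was found and left untouched.
 */
bool ensure_xs_page(struct domu_info *d, uint64_t fallback_pfn)
{
    if (d->xs_pfn != XS_PFN_UNSET)
        return false;          /* already allocated: nothing to do */

    d->xs_pfn = fallback_pfn;  /* legacy path: allocate from the tool */
    return true;
}
```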

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agoautomation: add ping test to static-mem test
Stefano Stabellini [Fri, 31 Jan 2025 23:30:55 +0000 (15:30 -0800)]
automation: add ping test to static-mem test

With the recent fixes, Dom0less direct mapped domains can use PV
drivers. Extend the existing static-mem test with a PV network ping
test.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agodocs/features/dom0less: Update the late XenStore init protocol
Henry Wang [Fri, 24 May 2024 22:55:22 +0000 (15:55 -0700)]
docs/features/dom0less: Update the late XenStore init protocol

With the new allocation strategy for the Dom0less DomUs' XenStore page,
update the documentation of the late XenStore init protocol accordingly.

Signed-off-by: Henry Wang <xin.wang2@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agoxen/arm: Alloc XenStore page for Dom0less DomUs from hypervisor
Henry Wang [Fri, 24 May 2024 22:55:20 +0000 (15:55 -0700)]
xen/arm: Alloc XenStore page for Dom0less DomUs from hypervisor

There are use cases (for example using the PV driver) in a Dom0less
setup that require Dom0less DomUs to start immediately with Dom0, but
to initialize XenStore later, after Dom0's successful boot and the call
to the init-dom0less application.

An error message can be seen from the init-dom0less application on
1:1 direct-mapped domains:
```
Allocating magic pages
memory.c:238:d0v0 mfn 0x39000 doesn't belong to d1
Error on alloc magic pages
```

"Magic pages" is a term used in the toolstack for reserved pages that
give the VM access to virtual platform capabilities.
Currently the magic pages for Dom0less DomUs are populated by the
init-dom0less app through populate_physmap(), and populate_physmap()
automatically assumes gfn == mfn for 1:1 direct-mapped domains. This
cannot be true for the magic pages that are allocated later from the
init-dom0less application executed in Dom0. For domains using statically
allocated memory that are not 1:1 direct-mapped, a similar error, "failed
to retrieve a reserved page", can be seen, as the reserved memory list is
empty at that time.

For init-dom0less, the magic page region is only used for XenStore.
To solve the above issue, this commit allocates the XenStore page for
Dom0less DomUs at domain construction time. The PFN is recorded and
communicated to the init-dom0less application executed from Dom0. To
keep the XenStore late init protocol, set the connection status to
XENSTORE_RECONNECT.

Since the guest magic region allocation from init-dom0less is for
XenStore, and the XenStore page is now allocated from the hypervisor,
instead of hardcoding the guest magic pages region, use
xc_hvm_param_get() to get the XenStore page PFN. Rename alloc_xs_page()
to get_xs_page() to reflect the changes.

With this change, some existing code is not needed anymore, including:
(1) The definition of the XenStore page offset.
(2) Call to xc_domain_setmaxmem() and xc_clear_domain_page() as we
    don't need to set the max mem and clear the page anymore.
(3) Foreign mapping of the XenStore page, setting of XenStore interface
    status and HVM_PARAM_STORE_PFN from init-dom0less, as they are set
    by the hypervisor.

Take the opportunity to do some coding style improvements when possible.

Reported-by: Alec Kwapis <alec.kwapis@medtronic.com>
Signed-off-by: Henry Wang <xin.wang2@amd.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Tested-by: Michal Orzel <michal.orzel@amd.com>
2 months agoautomation: upgrade arm32 kernel from bullseye to bookworm
Stefano Stabellini [Thu, 20 Feb 2025 22:56:20 +0000 (14:56 -0800)]
automation: upgrade arm32 kernel from bullseye to bookworm

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
2 months agoautomation: upgrade Linux kernel for arm64 tests to 6.6.74
Stefano Stabellini [Fri, 31 Jan 2025 23:32:53 +0000 (15:32 -0800)]
automation: upgrade Linux kernel for arm64 tests to 6.6.74

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 months agoMISRA: Update path for bsearch deviation
Andrew Cooper [Fri, 28 Feb 2025 09:58:51 +0000 (09:58 +0000)]
MISRA: Update path for bsearch deviation

This ought to have been part of the original patch, so as to avoid breaking
CI.

Fixes: 31c0d6fdf421 ("xen/bsearch: Split out of lib.h into it's own header")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
2 months agoxen/bsearch: Split out of lib.h into it's own header
Andrew Cooper [Thu, 23 Jan 2025 15:11:47 +0000 (15:11 +0000)]
xen/bsearch: Split out of lib.h into it's own header

There are currently two users, and lib.h is included everywhere.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
2 months agoCHANGELOG.md: Finalize changes in 4.20 release cycle
Oleksii Kurochko [Thu, 27 Feb 2025 14:27:52 +0000 (15:27 +0100)]
CHANGELOG.md: Finalize changes in 4.20 release cycle

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months agoIOMMU/x86: the bus-to-bridge lock needs to be acquired IRQ-safe
Jan Beulich [Thu, 27 Feb 2025 12:58:32 +0000 (12:58 +0000)]
IOMMU/x86: the bus-to-bridge lock needs to be acquired IRQ-safe

The function's use from set_msi_source_id() is guaranteed to be in an
IRQs-off region. While the invocation of that function could be moved
ahead in msi_msg_to_remap_entry() (doesn't need to be in the IOMMU-
intremap-locked region), the call tree from map_domain_pirq() holds an
IRQ descriptor lock. Hence all use sites of the lock need to become
IRQ-safe ones.

In find_upstream_bridge() do a tiny bit of tidying in adjacent code:
Change a variable's type to unsigned and merge a redundant assignment
into another variable's initializer.

This is XSA-467 / CVE-2025-1713.

Fixes: 476bbccc811c ("VT-d: fix MSI source-id of interrupt remapping")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>