xenbits.xensource.com Git - people/royger/xen.git/log
2 months ago DNA: enable Qubes jobs on non-protected branches (refs: msi-pci-access, gitlab/msi-pci-access)
Roger Pau Monne [Fri, 28 Feb 2025 10:37:26 +0000 (11:37 +0100)]
DNA: enable Qubes jobs on non-protected branches

2 months ago x86/msi: prevent MSI entry re-writes of the same data
Roger Pau Monne [Thu, 27 Feb 2025 10:26:35 +0000 (11:26 +0100)]
x86/msi: prevent MSI entry re-writes of the same data

Attempt to reduce the MSI entry writes, and the associated checks of whether
memory decoding and MSI-X are enabled for the PCI device, when the MSI data
hasn't changed.

When using Interrupt Remapping the MSI entry contains an index into the
remapping table, and it is in that remapping table that the MSI vector and
destination CPU are stored.  As such, when using interrupt remapping,
changes to the interrupt affinity shouldn't result in changes to the MSI
entry, and the MSI entry update can be avoided.
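
The idea of skipping redundant writes can be sketched as below.  This is a
minimal illustration only: the struct and function names are hypothetical
and do not correspond to Xen's struct msi_msg or write_msi_msg(), and the
hardware write is replaced by a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified MSI message; not Xen's struct msi_msg. */
struct msi_msg_sketch {
    uint32_t address_lo;
    uint32_t address_hi;
    uint32_t data;
};

/* Hypothetical cached state for one MSI entry. */
struct msi_entry_sketch {
    struct msi_msg_sketch msg;  /* last message written to the device */
    unsigned int hw_writes;     /* counts simulated hardware writes */
};

/* Returns true if the device entry was (re)written, false if skipped. */
static bool write_msi_msg_sketch(struct msi_entry_sketch *entry,
                                 const struct msi_msg_sketch *msg)
{
    if ( entry->msg.address_lo == msg->address_lo &&
         entry->msg.address_hi == msg->address_hi &&
         entry->msg.data == msg->data )
        return false;           /* unchanged: skip the checks and the write */

    entry->msg = *msg;
    entry->hw_writes++;         /* stands in for the config space/MMIO write */
    return true;
}
```

With interrupt remapping, an affinity change would update the remapping
table entry while the message passed to such a helper stays identical, so
the second and later calls return early.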

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Ross Lagerwall <ross.lagerwall@citrix.com>
2 months ago x86/dom0: be less restrictive with the Interrupt Address Range
Roger Pau Monne [Wed, 12 Feb 2025 10:37:50 +0000 (11:37 +0100)]
x86/dom0: be less restrictive with the Interrupt Address Range

Xen currently prevents dom0 from creating CPU or IOMMU page-table mappings
into the interrupt address range [0xfee00000, 0xfeefffff].  This range has
two different purposes.  For accesses from the CPU it contains the default
position of the local APIC page at 0xfee00000.  For accesses from devices
it's the MSI address range, so the address field in the MSI entries
(usually) points to an address in that range to trigger an interrupt.
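
As a rough illustration of the range in question, the following sketch
tests whether a region overlaps [0xfee00000, 0xfeefffff].  The macro and
function names are hypothetical and are not taken from Xen; the helper
assumes addr + size does not wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

#define INTR_RANGE_START 0xfee00000ULL  /* from the commit message */
#define INTR_RANGE_END   0xfeefffffULL  /* inclusive upper bound */

/* Hypothetical helper: does [addr, addr + size) overlap the range? */
static bool overlaps_intr_range(uint64_t addr, uint64_t size)
{
    if ( size == 0 )
        return false;

    return addr <= INTR_RANGE_END && addr + size - 1 >= INTR_RANGE_START;
}
```

A 4K page at 0xfeec2000, the address reported for the UCSI mailbox, falls
inside the range, which is why the previous blanket restriction rejected
the mapping.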

There are reports of Lenovo Thinkpad devices placing what seems to be the
UCSI shared mailbox at address 0xfeec2000 in the interrupt address range.
Attempting to use that device with a Linux PV dom0 leads to an error when
the Linux kernel maps 0xfeec2000:

RIP: e030:xen_mc_flush+0x1e8/0x2b0
 xen_leave_lazy_mmu+0x15/0x60
 vmap_range_noflush+0x408/0x6f0
 __ioremap_caller+0x20d/0x350
 acpi_os_map_iomem+0x1a3/0x1c0
 acpi_ex_system_memory_space_handler+0x229/0x3f0
 acpi_ev_address_space_dispatch+0x17e/0x4c0
 acpi_ex_access_region+0x28a/0x510
 acpi_ex_field_datum_io+0x95/0x5c0
 acpi_ex_extract_from_field+0x36b/0x4e0
 acpi_ex_read_data_from_field+0xcb/0x430
 acpi_ex_resolve_node_to_value+0x2e0/0x530
 acpi_ex_resolve_to_value+0x1e7/0x550
 acpi_ds_evaluate_name_path+0x107/0x170
 acpi_ds_exec_end_op+0x392/0x860
 acpi_ps_parse_loop+0x268/0xa30
 acpi_ps_parse_aml+0x221/0x5e0
 acpi_ps_execute_method+0x171/0x3e0
 acpi_ns_evaluate+0x174/0x5d0
 acpi_evaluate_object+0x167/0x440
 acpi_evaluate_dsm+0xb6/0x130
 ucsi_acpi_dsm+0x53/0x80
 ucsi_acpi_read+0x2e/0x60
 ucsi_register+0x24/0xa0
 ucsi_acpi_probe+0x162/0x1e3
 platform_probe+0x48/0x90
 really_probe+0xde/0x340
 __driver_probe_device+0x78/0x110
 driver_probe_device+0x1f/0x90
 __driver_attach+0xd2/0x1c0
 bus_for_each_dev+0x77/0xc0
 bus_add_driver+0x112/0x1f0
 driver_register+0x72/0xd0
 do_one_initcall+0x48/0x300
 do_init_module+0x60/0x220
 __do_sys_init_module+0x17f/0x1b0
 do_syscall_64+0x82/0x170

Remove the restrictions to create mappings in the interrupt address range
for dom0.  Note that the restriction to map the local APIC page is enforced
separately, and that continues to be present.  Additionally make sure the
emulated local APIC page is also not mapped, in case dom0 is using it.

Note that even if the interrupt address range entries are populated in the
IOMMU page-tables no device access will reach those pages.  Device accesses
to the Interrupt Address Range will always be converted into Interrupt
Messages and are not subject to DMA remapping.

There's also the following restriction noted in Intel VT-d:

> Software must not program paging-structure entries to remap any address to
> the interrupt address range. Untranslated requests and translation requests
> that result in an address in the interrupt range will be blocked with
> condition code LGN.4 or SGN.8. Translated requests with an address in the
> interrupt address range are treated as Unsupported Request (UR).

Similarly for AMD-Vi:

> Accesses to the interrupt address range (Table 3) are defined to go through
> the interrupt remapping portion of the IOMMU and not through address
> translation processing. Therefore, when a transaction is being processed as
> an interrupt remapping operation, the transaction attribute of
> pretranslated or untranslated is ignored.
>
> Software Note: The IOMMU should
> not be configured such that an address translation results in a special
> address such as the interrupt address range.

However those restrictions don't apply to the identity mappings possibly
created for dom0, since the interrupt address range is never subject to DMA
remapping, and hence there's no output address after translation that
belongs to the interrupt address range.

Reported-by: Jürgen Groß <jgross@suse.com>
Link: https://lore.kernel.org/xen-devel/baade0a7-e204-4743-bda1-282df74e5f89@suse.com/
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables
Roger Pau Monne [Fri, 14 Feb 2025 09:39:29 +0000 (10:39 +0100)]
x86/iommu: account for IOMEM caps when populating dom0 IOMMU page-tables

The current code in arch_iommu_hwdom_init() kind of open-codes the same
MMIO permission ranges that are added to the hardware domain ->iomem_caps.
Avoid this duplication and use ->iomem_caps in arch_iommu_hwdom_init() to
filter which memory regions should be added to the dom0 IOMMU page-tables.

Note the IO-APIC and MCFG page(s) must be set as not accessible for a PVH
dom0, otherwise the internal Xen emulation for those ranges won't work.
This requires adjustments in dom0_setup_permissions().

The call to pvh_setup_mmcfg() in dom0_construct_pvh() must now strictly be
done ahead of setting up dom0 permissions, so take the opportunity to also
put it inside the existing is_hardware_domain() region.

Also the special casing of E820_UNUSABLE regions no longer needs to be done
in arch_iommu_hwdom_init(), as those regions are already blocked in
->iomem_caps and thus would be removed from the rangeset as part of
->iomem_caps processing in arch_iommu_hwdom_init().  The E820_UNUSABLE
regions below 1Mb are not removed from ->iomem_caps, that's a slight
difference for the IOMMU created page-tables, but the aim is to allow
access to the same memory either from the CPU or the IOMMU page-tables.

Since ->iomem_caps already takes into account the domain max paddr, there's
no need to remove any regions past the last address addressable by the
domain, as applying ->iomem_caps would have already taken care of that.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/dom0: correctly set the maximum ->iomem_caps bound for PVH
Roger Pau Monne [Tue, 18 Feb 2025 16:57:49 +0000 (17:57 +0100)]
x86/dom0: correctly set the maximum ->iomem_caps bound for PVH

The logic in dom0_setup_permissions() sets the maximum bound in
->iomem_caps unconditionally using paddr_bits, which is not correct for HVM
based domains.  Instead use domain_max_paddr_bits() to get the correct
maximum paddr bits for each possible domain type.

Switch to using PFN_DOWN() instead of PAGE_SHIFT, as that's shorter.
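
A small sketch of the bound calculation follows.  PAGE_SHIFT is assumed to
be 12 and PFN_DOWN() is defined locally in the usual shift form;
max_pfn_for_bits() is a hypothetical helper, not a Xen function, and it
assumes paddr_bits is below 64.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                       /* assumed 4K pages */
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)     /* address -> page frame number */

/* Hypothetical: last page frame addressable with the given paddr bits. */
static uint64_t max_pfn_for_bits(unsigned int paddr_bits)
{
    return PFN_DOWN((UINT64_C(1) << paddr_bits) - 1);
}
```

The point of the fix is only which bit width is fed in: the host-wide
paddr_bits, or the per-domain value that can be smaller for HVM based
domains.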

Fixes: 53de839fb409 ('x86: constrain MFN range Dom0 may access')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/dom0: attempt to fixup p2m page-faults for PVH dom0
Roger Pau Monne [Thu, 13 Feb 2025 09:58:45 +0000 (10:58 +0100)]
x86/dom0: attempt to fixup p2m page-faults for PVH dom0

When building a PVH dom0 Xen attempts to map all (relevant) MMIO regions
into the p2m for dom0 access.  However the information Xen has about the
host memory map is limited.  Xen doesn't have access to any resources
described in ACPI dynamic tables, and hence the p2m mappings provided might
not be complete.

PV doesn't suffer from this issue because a PV dom0 is capable of mapping
into its page-tables any address not explicitly banned in d->iomem_caps.

Introduce a new command line option that allows Xen to attempt to fixup
the p2m page-faults, by creating p2m identity maps in response to p2m
page-faults.

This is intended as a workaround for small ACPI regions Xen doesn't know about.
Note that missing large MMIO regions mapped in this way will lead to
slowness due to the VM exit processing, plus the mappings will always use
small pages.

The ultimate aim is to attempt to bring better parity with a classic PV
dom0.

Note such fixups rely on the CPU doing the access to the unpopulated
address.  If the access is attempted from a device instead there's no
possible way to fixup, as IOMMU page-faults are asynchronous.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Only slightly tested on my local PVH dom0 deployment.
---
Changes since v1:
 - Make the fixup function static.
 - Print message in case mapping already exists.

2 months ago x86/emul: dump unhandled memory accesses for PVH dom0
Roger Pau Monne [Thu, 13 Feb 2025 08:08:01 +0000 (09:08 +0100)]
x86/emul: dump unhandled memory accesses for PVH dom0

A PV dom0 can map any host memory as long as it's allowed by the IO
capability range in d->iomem_caps.  On the other hand, a PVH dom0 has no
way to populate MMIO regions into its p2m, so it's limited to what Xen
initially populates on the p2m based on the host memory map and the enabled
device BARs.

Introduce a new debug build only printk that reports attempts by dom0 to
access addresses not populated on the p2m, and not handled by any emulator.
This is for information purposes only, but might allow getting an idea of
what MMIO ranges might be missing on the p2m.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/bsearch: Split out of lib.h into it's own header
Andrew Cooper [Thu, 23 Jan 2025 15:11:47 +0000 (15:11 +0000)]
xen/bsearch: Split out of lib.h into it's own header

There are currently two users, and lib.h is included everywhere.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
2 months ago CHANGELOG.md: Finalize changes in 4.20 release cycle
Oleksii Kurochko [Thu, 27 Feb 2025 14:27:52 +0000 (15:27 +0100)]
CHANGELOG.md: Finalize changes in 4.20 release cycle

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago IOMMU/x86: the bus-to-bridge lock needs to be acquired IRQ-safe
Jan Beulich [Thu, 27 Feb 2025 12:58:32 +0000 (12:58 +0000)]
IOMMU/x86: the bus-to-bridge lock needs to be acquired IRQ-safe

The function's use from set_msi_source_id() is guaranteed to be in an
IRQs-off region. While the invocation of that function could be moved
ahead in msi_msg_to_remap_entry() (it doesn't need to be in the
IOMMU-intremap-locked region), the call tree from map_domain_pirq() holds
an IRQ descriptor lock. Hence all use sites of the lock need to become
IRQ-safe ones.

In find_upstream_bridge() do a tiny bit of tidying in adjacent code:
Change a variable's type to unsigned and merge a redundant assignment
into another variable's initializer.

This is XSA-467 / CVE-2025-1713.

Fixes: 476bbccc811c ("VT-d: fix MSI source-id of interrupt remapping")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago PPC: Activate UBSAN in testing
Andrew Cooper [Wed, 26 Feb 2025 03:27:33 +0000 (21:27 -0600)]
PPC: Activate UBSAN in testing

Also enable -fno-sanitize=alignment like x86 since support for unaligned
accesses is guaranteed by the ISA and the existing OPAL setup code
relies on it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/traps: Move guest_{rd,wr}msr_xen() into msr.c
Andrew Cooper [Tue, 31 Dec 2024 11:02:49 +0000 (11:02 +0000)]
x86/traps: Move guest_{rd,wr}msr_xen() into msr.c

They are out of place in traps.c, and only have a single caller each.  Make
them static inside msr.c.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/traps: Move cpuid_hypervisor_leaves() into cpuid.c
Andrew Cooper [Tue, 31 Dec 2024 10:56:00 +0000 (10:56 +0000)]
x86/traps: Move cpuid_hypervisor_leaves() into cpuid.c

It's out of place in traps.c, and only has a single caller.  Make it static
inside cpuid.c.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/ucode: Drop the match_reg[] field from AMD's microcode_patch
Andrew Cooper [Thu, 24 Oct 2024 12:47:20 +0000 (13:47 +0100)]
x86/ucode: Drop the match_reg[] field from AMD's microcode_patch

This was true in the K10 days, but even back then the match registers were
really payload data rather than header data.

But, it's really model specific data, and these days typically part of the
signature, so is random data for all intents and purposes.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/ucode: Rename hypercall-context functions
Andrew Cooper [Fri, 23 Aug 2024 18:38:59 +0000 (19:38 +0100)]
x86/ucode: Rename hypercall-context functions

microcode_update{,_helper}() are overly generic names in a file that has
multiple update routines and helper function contexts.

Rename microcode_update() to ucode_update_hcall() so it explicitly identifies
itself as hypercall context, and rename microcode_update_helper() to
ucode_update_hcall_cont() to make it clear it is in continuation context.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
2 months ago x86/DM: slightly simplify set_mem_type()
Jan Beulich [Wed, 26 Feb 2025 11:26:23 +0000 (12:26 +0100)]
x86/DM: slightly simplify set_mem_type()

There's no need to access the static array twice per iteration, even
more so when that's effectively open-coding array_access_nospec().
Along with renaming the "new type" variable, rename the "old type" one
as well, to clarify which one is which.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago xen/riscv: update mfn calculation in pt_mapping_level()
Oleksii Kurochko [Wed, 26 Feb 2025 11:24:26 +0000 (12:24 +0100)]
xen/riscv: update mfn calculation in pt_mapping_level()

When pt_update() is called with arguments (..., INVALID_MFN, ..., 0 or 1),
it indicates that a mapping is being destroyed/modified.

When modifying or destroying a mapping, it is necessary to
search until a leaf node is found, instead of searching for a page table
entry based on the precalculated `level` and `order` (look at pt_update()).
This is because when `mfn` == INVALID_MFN, the `mask` (in pt_mapping_level())
will take into account only `vfn`, which could accidentally return an
incorrect level, leading to the discovery of an incorrect page table entry.

For example, if `vfn` is page table level 1 aligned, but it was mapped as
page table level 0, then pt_mapping_level() will return `level` = 1, since
only `vfn` (which is page table level 1 aligned) is taken into account when
`mfn` == INVALID_MFN (look at pt_mapping_level()).
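
The failure mode can be reproduced with a toy version of the level
calculation.  The sketch below assumes 9 bits per page table level and
three levels; mapping_level_sketch() and INVALID_MFN_SKETCH are
hypothetical names and only loosely model pt_mapping_level().

```c
#include <stdint.h>

#define PT_BITS_PER_LEVEL  9            /* assumed: 512 entries per table */
#define INVALID_MFN_SKETCH UINT64_MAX   /* hypothetical stand-in */

/* Highest level (0..2) whose alignment is satisfied by vfn, mfn and nr. */
static unsigned int mapping_level_sketch(uint64_t vfn, uint64_t mfn,
                                         uint64_t nr)
{
    uint64_t mask = vfn;
    unsigned int level;

    /* When destroying/modifying, only vfn contributes to the mask. */
    if ( mfn != INVALID_MFN_SKETCH )
        mask |= mfn;

    for ( level = 2; level > 0; level-- )
    {
        uint64_t frames = UINT64_C(1) << (level * PT_BITS_PER_LEVEL);

        if ( !(mask & (frames - 1)) && nr >= frames )
            return level;
    }

    return 0;
}
```

With vfn = 512 and mfn = 513 the mapping can only have been created at
level 0, yet a later call with INVALID_MFN_SKETCH for the same vfn and
nr = 512 yields level 1, which is the mismatch described above.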

Have unmap_table() check for NULL, such that individual callers don't need
to.

Fixes: c2f1ded524 ("xen/riscv: page table handling")
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/MCE-telem: drop unnecessary per-CPU field
Jan Beulich [Wed, 26 Feb 2025 11:23:49 +0000 (12:23 +0100)]
x86/MCE-telem: drop unnecessary per-CPU field

struct mc_telem_cpu_ctl's processing field is used solely in
mctelem_process_deferred(), where the local variable can as well be used
directly when retrieving the head of the list to process. This then also
eliminates the field holding a dangling pointer once the processing of
the list finished, in particular when the entry is handed to
mctelem_dismiss().

No functional change intended.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago x86/MCE: fail init more gracefully when CPU vendor isn't supported
Jan Beulich [Wed, 26 Feb 2025 11:23:19 +0000 (12:23 +0100)]
x86/MCE: fail init more gracefully when CPU vendor isn't supported

When mcheck_init() doesn't recognize the CPU vendor, it will undo the
all-banks allocation, and it will in particular not install the CPU
notifier. This way APs will pointlessly try to re-establish an
all-banks allocation, while then falling over NULL pointers due to the
notifier not having run and hence not having allocated anything for
them.

Prevent both from happening, and additionally delay writing MCG_CTL
until no errors can occur anymore in mca_cap_init().

Fixes: 741367e77d6c ("mce: Clean-up mcheck_init handler")
Fixes: a5e1b534ac6f ("x86: mce cleanup for both Intel and AMD mce logic")
Fixes: 560cf418c845 ("x86/mcheck: allow varying bank counts per CPU")
Reported-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago radix-tree: drop "root" parameters from radix_tree_node_{alloc,free}()
Jan Beulich [Wed, 26 Feb 2025 11:22:22 +0000 (12:22 +0100)]
radix-tree: drop "root" parameters from radix_tree_node_{alloc,free}()

They aren't used anymore.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago xen/console: print Xen version via keyhandler
Denis Mukhin [Wed, 26 Feb 2025 11:17:01 +0000 (12:17 +0100)]
xen/console: print Xen version via keyhandler

Add Xen version printout to 'h' keyhandler output.

That is useful for debugging systems that have been left intact for a long
time.

Signed-off-by: Denis Mukhin <dmukhin@ford.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago CirrusCI: Use shallow clone
Andrew Cooper [Mon, 24 Feb 2025 15:36:11 +0000 (15:36 +0000)]
CirrusCI: Use shallow clone

This reduces the Clone step from ~50s to ~3s.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago scripts: Fix git-checkout.sh to work with non-master branches (take 2)
Andrew Cooper [Thu, 31 Oct 2024 13:35:40 +0000 (13:35 +0000)]
scripts: Fix git-checkout.sh to work with non-master branches (take 2)

First, rename $TAG to $COMMITTISH.  We already pass tags, branches (well, only
master) and full SHAs into this script.

Xen uses master for QEMU_UPSTREAM_REVISION, and has done for other trees too
in the past.  Apparently we've never specified a different branch, because the
git-clone rune only pulls in the master branch; it does not pull in diverging
branches.

Fix this by performing an explicit fetch of the $COMMITTISH, then checking out
the dummy branch from the FETCH_HEAD.

Suggested-by: Jason Andryuk <jason.andryuk@amd.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
2 months ago tools/ocaml: Fix oxenstored build warning
Andrii Sultanov [Fri, 14 Feb 2025 15:24:27 +0000 (15:24 +0000)]
tools/ocaml: Fix oxenstored build warning

OCaml, in preparation for a renaming of the error string associated with
conversion failure in 'int_of_string' functions, started to issue this
warning:

  File "process.ml", line 440, characters 13-28:
  440 |   | (Failure "int_of_string")    -> reply_error "EINVAL"
                     ^^^^^^^^^^^^^^^
  Warning 52 [fragile-literal-pattern]: Code should not depend on the actual values of
  this constructor's arguments. They are only for information
  and may change in future versions. (See manual section 11.5)

Deal with this at the source, and instead create our own stable
ConversionFailure exception that's raised on the None case in
'int_of_string_opt'.

'c_int_of_string' is safe and does not raise such exceptions.

Signed-off-by: Andrii Sultanov <andrii.sultanov@cloud.com>
Acked-by: Christian Lindig <christian.lindig@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago xen/ACPI: Drop local acpi_os_{v,}printf() and use plain {v,}printk()
Andrew Cooper [Mon, 17 Feb 2025 19:13:01 +0000 (19:13 +0000)]
xen/ACPI: Drop local acpi_os_{v,}printf() and use plain {v,}printk()

Now that Xen has a real vprintk(), there's no need to opencode it locally with
vsnprintf().  Redirect the debug routines to the real {v,}printk() and drop
the local acpi_os_{v,}printf() implementations.

Amongst other things, this removes one arbitrary limit on message size, as
well as removing a 512 byte static buffer that ought to have been in
__initdata given that it is private to an __init function.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/console: Optimise the parameter order of vprintk_common()
Andrew Cooper [Thu, 23 Jan 2025 03:27:07 +0000 (03:27 +0000)]
xen/console: Optimise the parameter order of vprintk_common()

For ABIs which pass parameters by register (all cases that we compile Xen
for), inserting new arguments on the left hand side involves shuffling all
other parameters along by one register whereas appending a new argument
doesn't involve shuffling of existing registers.

Reorder vprintk_common()'s prefix parameter to being last.  This is a marginal
improvement on all architectures:

  Function                              old     new   delta
  vprintk                                18      12      -6  x86
  vprintk                                32      24      -8  arm32
  vprintk                                52      48      -4  arm64
  vprintk                                52      48      -4  riscv64
  vprintk                                80      72      -8  ppc64
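
The register shuffling argument can be illustrated with two wrapper
variants.  All names below are hypothetical and plain integers stand in
for the real parameters; the noinline attribute is a GCC/Clang extension
used here only so that the calls survive optimisation.

```c
/* Callee with the extra argument first: existing arguments shift right. */
__attribute__((noinline))
static long common_prefix_first(long prefix, long a, long b)
{
    return prefix + a * 2 + b * 3;
}

/* Callee with the extra argument last: existing arguments stay in place. */
__attribute__((noinline))
static long common_prefix_last(long a, long b, long prefix)
{
    return prefix + a * 2 + b * 3;
}

/* Must move a and b into the next argument registers before the call. */
static long wrapper_shuffle(long a, long b)
{
    return common_prefix_first(1, a, b);
}

/* Only loads one additional register; a and b are passed through as-is. */
static long wrapper_append(long a, long b)
{
    return common_prefix_last(a, b, 1);
}
```

Both wrappers compute the same result; comparing their disassembly on a
register-passing ABI is one way to observe the kind of size difference
listed in the table.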

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/msi: Change __msi_set_enable() to take pci_sbdf_t
Andrew Cooper [Sun, 2 Feb 2025 13:48:40 +0000 (13:48 +0000)]
x86/msi: Change __msi_set_enable() to take pci_sbdf_t

This removes the unnecessary work of splitting a 32-bit number across
4 registers, and recombining later.  Bloat-o-meter reports:

  add/remove: 0/0 grow/shrink: 0/9 up/down: 0/-295 (-295)
  Function                                     old     new   delta
  enable_iommu                                1748    1732     -16
  iommu_msi_unmask                              98      81     -17
  iommu_msi_mask                               100      83     -17
  disable_iommu                                286     269     -17
  __msi_set_enable                              81      50     -31
  __pci_disable_msi                            178     146     -32
  pci_cleanup_msi                              268     229     -39
  pci_enable_msi                              1063    1019     -44
  pci_restore_msi_state                       1116    1034     -82
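
For illustration, the sketch below packs segment, bus, device and function
into a single 32-bit value, so one register carries what would otherwise
be four arguments.  sbdf_sketch_t and its helpers are hypothetical and are
not Xen's pci_sbdf_t definition; the bit layout is the conventional
seg:bus:dev.fn encoding.

```c
#include <stdint.h>

/* Hypothetical stand-in for a packed SBDF: one 32-bit value. */
typedef struct {
    uint32_t sbdf;
} sbdf_sketch_t;

static sbdf_sketch_t make_sbdf(uint16_t seg, uint8_t bus,
                               uint8_t dev, uint8_t fn)
{
    sbdf_sketch_t s = {
        ((uint32_t)seg << 16) | ((uint32_t)bus << 8) |
        ((uint32_t)(dev & 0x1f) << 3) | (uint32_t)(fn & 0x7)
    };

    return s;
}

/* Accessors extract each field again only where it is really needed. */
static uint16_t sbdf_seg(sbdf_sketch_t s) { return s.sbdf >> 16; }
static uint8_t  sbdf_bus(sbdf_sketch_t s) { return (s.sbdf >> 8) & 0xff; }
static uint8_t  sbdf_dev(sbdf_sketch_t s) { return (s.sbdf >> 3) & 0x1f; }
static uint8_t  sbdf_fn(sbdf_sketch_t s)  { return s.sbdf & 0x7; }
```

Passing such a value through a call chain avoids the split and recombine
steps at every level.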

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/ucode: Add option to scan microcode by default
Ross Lagerwall [Mon, 17 Feb 2025 17:50:11 +0000 (17:50 +0000)]
x86/ucode: Add option to scan microcode by default

A lot of systems automatically add microcode to the initramfs so it can
be useful as a vendor policy to always scan for microcode. Add a Kconfig
option to allow setting the default behaviour.

The default behaviour is unchanged since the new option defaults to
"no".

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago xen/riscv: update defintion of vmap_to_mfn()
Oleksii Kurochko [Tue, 25 Feb 2025 07:47:12 +0000 (08:47 +0100)]
xen/riscv: update defintion of vmap_to_mfn()

vmap_to_mfn() uses virt_to_maddr(), which is designed to work with VA from
either the direct map region or Xen's linkage region (XEN_VIRT_START).
An assertion will trigger if it is used with other regions, in particular for
the VMAP region.

Since RISC-V lacks a hardware feature to request the MMU to translate a VA to
a PA (as Arm does, for example), software page table walking (pt_walk()) is
used for the VMAP region to obtain the mfn from pte_t.

To avoid introducing a circular dependency between asm/mm.h and asm/page.h by
including each other, the static inline function _vmap_to_mfn() is introduced
in asm/page.h, as it uses struct pte_t and pte_is_mapping() from asm/page.h.
_vmap_to_mfn() is then reused in the definition of the vmap_to_mfn() macro in
asm/mm.h.

Fixes: 7db8d2bd9b ("xen/riscv: add minimal stuff to mm.h to build full Xen")
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago xen/riscv: implement software page table walking
Oleksii Kurochko [Tue, 25 Feb 2025 07:46:32 +0000 (08:46 +0100)]
xen/riscv: implement software page table walking

RISC-V doesn't have a hardware feature to ask the MMU to translate a
virtual address to a physical address (as Arm has, for example),
so software page table walking is implemented.
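
A software walk of this kind can be sketched over simulated memory.  The
code below is a toy model assuming an Sv39-like layout (three levels, 9
index bits each, valid bit 0, R/W/X bits 1-3, PPN from bit 10); sim_mem
and pt_walk_sketch() are hypothetical and unrelated to Xen's pt_walk()
implementation.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_VALID     0x1u
#define PTE_RWX       0xeu      /* any of R, W, X set marks a leaf */
#define PTE_PPN_SHIFT 10
#define PT_ENTRIES    512
#define SIM_PAGES     4

/* Simulated physical memory: page N holds one table of 512 PTEs. */
static uint64_t sim_mem[SIM_PAGES][PT_ENTRIES];

/* Walk the table rooted at page `root`; on success store the PA in *pa. */
static bool pt_walk_sketch(uint64_t root, uint64_t va, uint64_t *pa)
{
    uint64_t table = root;
    int level;

    for ( level = 2; level >= 0; level-- )
    {
        unsigned int shift = 12 + 9 * level;
        uint64_t pte;

        if ( table >= SIM_PAGES )
            return false;       /* points outside the simulated memory */

        pte = sim_mem[table][(va >> shift) & (PT_ENTRIES - 1)];

        if ( !(pte & PTE_VALID) )
            return false;       /* not mapped */

        if ( pte & PTE_RWX )
        {
            /* Leaf: combine the frame number with the in-page offset. */
            uint64_t offset_mask = (UINT64_C(1) << shift) - 1;
            uint64_t base = (pte >> PTE_PPN_SHIFT) << 12;

            *pa = (base & ~offset_mask) | (va & offset_mask);
            return true;
        }

        table = pte >> PTE_PPN_SHIFT;   /* descend to the next level */
    }

    return false;               /* non-leaf entry at the last level */
}
```

A leaf found above level 0 covers a superpage, which is why the offset
mask depends on the level at which the walk stops.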

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago x86/PV: don't half-open-code SIF_PM_MASK
Jan Beulich [Tue, 25 Feb 2025 07:45:46 +0000 (08:45 +0100)]
x86/PV: don't half-open-code SIF_PM_MASK

Avoid using the same literal number (8) in two distinct places, by using
MASK_INSR() instead of open-coding the shift.
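
How a mask-driven insert avoids repeating the shift count can be sketched
as follows.  Xen's insertion macro is spelled MASK_INSR(); the macros are
redefined locally here in the usual lowest-set-bit form, and
SIF_PM_MASK_SKETCH and set_pm_field() are hypothetical names assuming an
8-bit field starting at bit 8.

```c
#include <stdint.h>

/* (m) & -(m) isolates the lowest set bit of the mask, i.e. 1 << shift. */
#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
#define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))

#define SIF_PM_MASK_SKETCH (0xFFu << 8)     /* assumed field layout */

/* Replace the field selected by the mask without naming the shift. */
static uint32_t set_pm_field(uint32_t flags, uint32_t value)
{
    return (flags & ~SIF_PM_MASK_SKETCH) |
           MASK_INSR(value, SIF_PM_MASK_SKETCH);
}
```

The literal 8 now appears only in the mask definition, so changing the
field position requires a single edit.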

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago radix-tree: don't left-shift negative values
Jan Beulich [Tue, 25 Feb 2025 07:45:14 +0000 (08:45 +0100)]
radix-tree: don't left-shift negative values

Any (signed) integer is okay to pass into radix_tree_int_to_ptr(), yet
left shifting negative values is UB. Use an unsigned intermediate type,
reducing the impact to implementation defined behavior (for the
unsigned->signed conversion).
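
The technique can be sketched as below.  The function names are
hypothetical, the tag value 2 in the low bits is an assumption made for
the example, and the code assumes unsigned long is as wide as a pointer.

```c
/* Store an int in a pointer-sized slot, tagged in the two low bits. */
static void *int_to_ptr_sketch(int val)
{
    /* Converting to unsigned first makes the left shift well defined. */
    unsigned long bits = ((unsigned long)val << 2) | 2;

    return (void *)bits;
}

/* Recover the int; the unsigned->signed step is implementation defined. */
static int ptr_to_int_sketch(const void *ptr)
{
    return (int)((long)ptr >> 2);
}
```

On the common two's complement targets the round trip preserves negative
values, while the undefined left shift of a negative operand is gone.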

Also please Misra C:2012 rule 7.3 by dropping the lower case numeric 'l'
tag.

No difference in generated code, at least on x86.

Fixes: b004883e29bb ("Simplify and build-fix (for some gcc versions) radix_tree_int_to_ptr()")
Reported-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago VMX: don't run with CR4.VMXE set when VMX could not be enabled
Jan Beulich [Tue, 25 Feb 2025 07:44:32 +0000 (08:44 +0100)]
VMX: don't run with CR4.VMXE set when VMX could not be enabled

While generally benign, doing so is still at best misleading.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 months ago x86emul: drop open-coding of REX.W prefixes
Jan Beulich [Tue, 25 Feb 2025 07:43:07 +0000 (08:43 +0100)]
x86emul: drop open-coding of REX.W prefixes

Along the lines of 0e3642514719 ("x86: drop REX64_PREFIX"), move to well
formed FXSAVEQ / FXRSTORQ here as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago xen/console: introduce is_console_printable()
Denis Mukhin [Tue, 25 Feb 2025 07:42:37 +0000 (08:42 +0100)]
xen/console: introduce is_console_printable()

Add is_console_printable() to implement a common check for printable characters
in the UART emulation and guest logging code.
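
One plausible shape for such a shared check is sketched below.  The exact
set of accepted characters is an assumption (printable characters plus
newline and tab), and the _sketch suffix marks the function as
hypothetical rather than the helper added by this commit.

```c
#include <ctype.h>
#include <stdbool.h>

/* Assumed policy: printable characters, newline and tab are allowed. */
static bool is_console_printable_sketch(unsigned char c)
{
    return isprint(c) || c == '\n' || c == '\t';
}
```

Sharing one predicate keeps the UART emulation and the guest logging path
from drifting apart in what they consider printable.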

Signed-off-by: Denis Mukhin <dmukhin@ford.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 months ago xen/x86: add CPPC feature flag for AMD processors
Penny Zheng [Tue, 25 Feb 2025 07:41:41 +0000 (08:41 +0100)]
xen/x86: add CPPC feature flag for AMD processors

Add Collaborative Processor Performance Control feature flag for
AMD processors.

amd-cppc is the AMD CPU performance scaling driver that
introduces a new CPU frequency control mechanism on modern AMD
APU and CPU series.
There are two types of hardware implementations: "Full MSR Support"
and "Shared Memory Support".

Right now, Xen will only implement "Full MSR Support", and this new
feature flag indicates whether the processor has this feature or not.

Signed-off-by: Penny Zheng <Penny.Zheng@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 months ago ioreq: allow arch_vcpu_ioreq_completion() to signal an error
Sergiy Kibrik [Tue, 25 Feb 2025 07:41:07 +0000 (08:41 +0100)]
ioreq: allow arch_vcpu_ioreq_completion() to signal an error

Return false from arch_vcpu_ioreq_completion() when completion is not handled.
According to coding-best-practices.pandoc an error should be propagated to
the caller if the caller is expecting to handle it, which seems to be the
case for callers of arch_vcpu_ioreq_completion().

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 months ago CHANGELOG.md: Start a new 4.21 section
Andrew Cooper [Fri, 21 Feb 2025 14:53:57 +0000 (14:53 +0000)]
CHANGELOG.md: Start a new 4.21 section

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago Config.mk: Switch QEMU back to master
Andrew Cooper [Fri, 21 Feb 2025 14:33:33 +0000 (14:33 +0000)]
Config.mk: Switch QEMU back to master

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago Rerun ./autogen.sh for 4.21
Andrew Cooper [Fri, 21 Feb 2025 14:52:10 +0000 (14:52 +0000)]
Rerun ./autogen.sh for 4.21

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago Update Xen to 4.21
Andrew Cooper [Fri, 21 Feb 2025 14:41:41 +0000 (14:41 +0000)]
Update Xen to 4.21

Xen 4.20 has branched.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago eclair: mark R16.6 as clean
Stefano Stabellini [Thu, 20 Feb 2025 21:54:45 +0000 (13:54 -0800)]
eclair: mark R16.6 as clean

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months ago xen/x86: resolve the last 3 MISRA R16.6 violations
Stefano Stabellini [Thu, 20 Feb 2025 21:32:46 +0000 (13:32 -0800)]
xen/x86: resolve the last 3 MISRA R16.6 violations

MISRA R16.6 states that "Every switch statement shall have at least two
switch-clauses". There are only 3 violations left on x86 (zero on ARM).

One of them is only a violation depending on the kconfig configuration.
So deviate it instead with a SAF comment.

Two of them are deliberate to enable future additions. Deviate them as
such.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months ago Update Xen to 4.20.0-rc5
Andrew Cooper [Thu, 20 Feb 2025 15:47:30 +0000 (15:47 +0000)]
Update Xen to 4.20.0-rc5

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago CI: Mark MISRA Rule 11.2 as clean
Andrew Cooper [Thu, 20 Feb 2025 12:53:54 +0000 (12:53 +0000)]
CI: Mark MISRA Rule 11.2 as clean

Reviewed-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 months ago x86/MCE-telem: adjust cookie definition
Jan Beulich [Thu, 20 Feb 2025 12:50:19 +0000 (13:50 +0100)]
x86/MCE-telem: adjust cookie definition

struct mctelem_ent is opaque outside of mcetelem.c; the cookie
abstraction exists - afaict - just to achieve this opaqueness. Then it
is irrelevant though which kind of pointer mctelem_cookie_t resolves to.
IOW we can as well use struct mctelem_ent there, allowing the casts to be
removed from COOKIE2MCTE() and MCTE2COOKIE(). Their removal addresses
MISRA C:2012 rule 11.2 ("Conversions shall not be performed between a
pointer to an incomplete type and any other type") violations.
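
For illustration only (this is not the code from the patch): a minimal
sketch of a cookie typedef'd as a pointer to the opaque struct itself, so
that the conversion macros need no cast.  The macro bodies and the
refcount field are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* "Header" view: only a declaration, so users see an incomplete type. */
struct mctelem_ent;
typedef struct mctelem_ent *mctelem_cookie_t;

/* "mcetelem.c" view: hypothetical definition, the field is illustrative. */
struct mctelem_ent {
    int refcount;
};

/* With the cookie being a pointer to the real type, no cast is needed. */
#define COOKIE2MCTE(c) (c)
#define MCTE2COOKIE(e) (e)

/* Accessor living next to the definition, where the type is complete. */
static int cookie_refcount(mctelem_cookie_t cookie)
{
    assert(cookie != NULL);
    return COOKIE2MCTE(cookie)->refcount;
}
```

Code outside the defining file can still only pass the cookie around, as
it never sees the struct layout.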

No functional change intended.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agox86/svm: Separate STI and VMRUN instructions in svm_asm_do_resume()
Andrew Cooper [Mon, 17 Feb 2025 15:51:51 +0000 (15:51 +0000)]
x86/svm: Separate STI and VMRUN instructions in svm_asm_do_resume()

There is a corner case in the VMRUN instruction where its INTR_SHADOW state
leaks into guest state if a VMExit occurs before the VMRUN is complete.  An
example of this could be taking #NPF due to event injection.

Xen can safely execute STI anywhere between CLGI and VMRUN, as CLGI blocks
external interrupts too.  However, an exception (while fatal) will appear to
be in an irqs-on region (as GIF isn't considered), so position the STI after
the speculation actions but prior to the GPR pops.

Link: https://lore.kernel.org/all/CADH9ctBs1YPmE4aCfGPNBwA10cA8RuAk2gO7542DjMZgs4uzJQ@mail.gmail.com/
Fixes: 66b245d9eaeb ("SVM: limit GIF=0 region")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoxen/memory: Make resource_max_frames() to return 0 on unknown type
Oleksandr Tyshchenko [Mon, 17 Feb 2025 22:34:02 +0000 (00:34 +0200)]
xen/memory: Make resource_max_frames() to return 0 on unknown type

This is actually what the caller acquire_resource() expects on any kind
of error (the comment on top of resource_max_frames() also suggests that).
Otherwise, the caller will treat -errno as a valid value and propagate an incorrect
nr_frames to the VM. As a possible consequence, a VM trying to query the resource
size of an unknown type will get a success result from the hypercall and obtain
nr_frames 4294967201.
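
For illustration only: a minimal sketch of how a negative error code turns
into that value when returned through a 32-bit unsigned type.  The errno
value 95 (EOPNOTSUPP on Linux) is an assumption; it is merely consistent
with 2^32 - 4294967201 = 95.  The function names and the frame count of 32
are hypothetical.

```c
#include <stdint.h>

/* Assumed errno value for the example; 95 is EOPNOTSUPP on Linux. */
#define SKETCH_EOPNOTSUPP 95

/* Buggy shape: an error code leaks out through an unsigned return type. */
static unsigned int max_frames_buggy(int known_type)
{
    return known_type ? 32 : -SKETCH_EOPNOTSUPP;
}

/* Fixed shape: an unknown type reports 0 frames, which the caller rejects. */
static unsigned int max_frames_fixed(int known_type)
{
    return known_type ? 32 : 0;
}
```

With a 32-bit unsigned int, max_frames_buggy(0) yields 4294967201, which
a caller cannot tell apart from a (huge) valid frame count.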

Also, add an ASSERT_UNREACHABLE() in the default case of _acquire_resource().
Normally we won't get to this point, as an unknown type will always be rejected
earlier in resource_max_frames().

Also, update test-resource app to verify that Xen can deal with invalid
(unknown) resource type properly.

Fixes: 9244528955de ("xen/memory: Fix acquire_resource size semantics")
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoxen/console: Fix truncation of panic() messages
Andrew Cooper [Wed, 22 Jan 2025 12:13:24 +0000 (12:13 +0000)]
xen/console: Fix truncation of panic() messages

The panic() function uses a static buffer to format its arguments into, simply
to emit the result via printk("%s", buf).  This buffer is not large enough for
some existing users in Xen.  e.g.:

  (XEN) ****************************************
  (XEN) Panic on CPU 0:
  (XEN) Invalid device tree blob at physical address 0x46a00000.
  (XEN) The DTB must be 8-byte aligned and must not exceed 2 MB in size.
  (XEN)
  (XEN) Plea****************************************

The remainder of this particular message is 'e check your bootloader.', but
has been inherited by RISC-V from ARM.

It is also pointless double buffering.  Implement vprintk() beside printk(),
and use it directly rather than rendering into a local buffer, removing it as
one source of message limitation.
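
For illustration only, using the C standard library rather than Xen's
printk machinery: a minimal sketch contrasting rendering into a fixed
buffer with forwarding the va_list to a v*() printer.  Both helper names
are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Old shape: render into a fixed buffer first; long messages get cut off. */
static int render_to_buffer(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int needed;

    assert(buf != NULL && size > 0);
    va_start(args, fmt);
    /* vsnprintf() returns the length the full message would have had. */
    needed = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return needed;
}

/* New shape: hand the va_list straight to the printer, no local buffer. */
static int print_direct(FILE *out, const char *fmt, ...)
{
    va_list args;
    int written;

    va_start(args, fmt);
    written = vfprintf(out, fmt, args);
    va_end(args);
    return written;
}
```

A return value from render_to_buffer() that is not smaller than the
buffer size signals exactly the kind of truncation shown above.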

This marginally simplifies panic(), and drops a global used-once buffer.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoARM32/traps: Fix do_trap_undefined_instruction()'s detection of kernel text
Andrew Cooper [Fri, 7 Feb 2025 23:15:01 +0000 (23:15 +0000)]
ARM32/traps: Fix do_trap_undefined_instruction()'s detection of kernel text

While fixing some common/arch boundaries for UBSAN support on other
architectures, the following debugging patch:

  diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
  index c1f2d1b89d43..58d1d048d339 100644
  --- a/xen/arch/arm/setup.c
  +++ b/xen/arch/arm/setup.c
  @@ -504,6 +504,8 @@ void asmlinkage __init start_xen(unsigned long fdt_paddr)

       system_state = SYS_STATE_active;

  +    dump_execution_state();
  +
       for_each_domain( d )
           domain_unpause_by_systemcontroller(d);

failed with:

  (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
  (XEN) CPU0: Unexpected Trap: Undefined Instruction
  (XEN) ----[ Xen-4.20-rc  arm32  debug=n  Not tainted ]----
  (XEN) CPU:    0
  <snip>
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 0:
  (XEN) CPU0: Unexpected Trap: Undefined Instruction
  (XEN) ****************************************

This is because the condition for init text is wrong.  While there's nothing
interesting from that point onwards in start_xen(), it's wrong for
livepatches too.

Use is_active_kernel_text() which is the correct test for this purpose, and is
aware of init and livepatch regions as well as their lifetimes.

Fixes: 3e802c6ca1fb ("xen/arm: Correctly support WARN_ON")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agox86/HVM: use XVFREE() in hvmemul_cache_destroy()
Jan Beulich [Thu, 13 Feb 2025 13:32:13 +0000 (14:32 +0100)]
x86/HVM: use XVFREE() in hvmemul_cache_destroy()

My adjustments to move from xmalloc() et al to respective xvmalloc()
flavors were flawed - a freeing instance wasn't converted.

Fixes: 23d60dbb0493 ("x86/HVM: allocate emulation cache entries dynamically")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agox86/iommu: disable interrupts at shutdown
Roger Pau Monne [Tue, 4 Feb 2025 10:46:14 +0000 (11:46 +0100)]
x86/iommu: disable interrupts at shutdown

Add a new hook to inhibit interrupt generation by the IOMMU(s).  Note the
hook is currently only implemented for x86 IOMMUs.  The purpose is to
disable interrupt generation at shutdown so any kexec chained image finds
the IOMMU(s) in a quiesced state.

It would also prevent "Receive accept error" being raised as a result of
non-disabled interrupts targeting offline CPUs.

Note that the iommu_quiesce() call in nmi_shootdown_cpus() is still
required even when there's a preceding iommu_crash_shutdown() call; the
latter can become a no-op depending on the setting of the "crash-disable"
command line option.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agox86/pci: disable MSI(-X) on all devices at shutdown
Roger Pau Monne [Wed, 5 Feb 2025 14:05:47 +0000 (15:05 +0100)]
x86/pci: disable MSI(-X) on all devices at shutdown

Attempt to disable MSI(-X) capabilities on all PCI devices known by Xen at
shutdown.  Doing so should help a kexec chained kernel boot more
reliably, as device MSI(-X) interrupt generation should be
quiesced.

Only attempt to disable MSI(-X) on all devices in the crash context if the
PCI lock is not taken, otherwise the PCI device list could be in an
inconsistent state.  This requires introducing a new pcidevs_trylock()
helper to check whether the lock is currently taken.
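
For illustration only: a minimal sketch of the trylock pattern, built on
C11 atomics instead of Xen's locking primitives.  All names, the device
array and the unlock at the end are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_DEVS 3

static atomic_flag devs_lock = ATOMIC_FLAG_INIT;
static bool msi_enabled[NR_DEVS] = { true, true, true };

/* Take the lock only if it is free; never spin or block. */
static bool devs_trylock(void)
{
    return !atomic_flag_test_and_set_explicit(&devs_lock,
                                              memory_order_acquire);
}

static void devs_unlock(void)
{
    atomic_flag_clear_explicit(&devs_lock, memory_order_release);
}

/* Crash path: skip the walk if the list may be mid-update by the holder. */
static bool crash_disable_msi(void)
{
    if (!devs_trylock())
        return false;

    for (size_t i = 0; i < NR_DEVS; i++)
        msi_enabled[i] = false;

    devs_unlock();
    return true;
}
```

The point of the pattern is that the crash path never waits on a lock
whose holder may no longer be running.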

Disabling MSI(-X) should prevent "Receive accept error" being raised as a
result of non-disabled interrupts targeting offline CPUs.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agox86/smp: perform disabling on interrupts ahead of AP shutdown
Roger Pau Monne [Thu, 6 Feb 2025 11:20:04 +0000 (12:20 +0100)]
x86/smp: perform disabling on interrupts ahead of AP shutdown

Move the disabling of interrupt sources so it's done ahead of the offlining
of APs.  This is to prevent AMD systems triggering "Receive accept error"
when interrupts target CPUs that are no longer online.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agox86/irq: drop fixup_irqs() parameters
Roger Pau Monne [Tue, 28 Jan 2025 15:06:07 +0000 (16:06 +0100)]
x86/irq: drop fixup_irqs() parameters

The sole remaining caller always passes the same globally available
parameters.  Drop the parameters and modify fixup_irqs() to use
cpu_online_map in place of the input mask parameter, and always be verbose
in its output printing.

While there, remove some of the checks given the single context where
fixup_irqs() is now called, which should always be in the CPU offline path,
after the CPU going offline has been removed from cpu_online_map.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agox86/shutdown: offline APs with interrupts disabled on all CPUs
Roger Pau Monne [Tue, 28 Jan 2025 08:34:20 +0000 (09:34 +0100)]
x86/shutdown: offline APs with interrupts disabled on all CPUs

The current shutdown logic in smp_send_stop() will disable the APs while
having interrupts enabled on the BSP or possibly other APs. On AMD systems
this can lead to local APIC errors:

APIC error on CPU0: 00(08), Receive accept error

Such an error message can be printed in a loop, thus blocking the system from
rebooting.  I assume this loop is created by the error being triggered by
the console interrupt, which is further stirred by the ESR handler
printing to the console.

Intel SDM states:

"Receive Accept Error.

Set when the local APIC detects that the message it received was not
accepted by any APIC on the APIC bus, including itself. Used only on P6
family and Pentium processors."

So the error shouldn't trigger on any Intel CPU supported by Xen.

However AMD doesn't make such claims, and indeed the error is broadcast to
all local APICs when an interrupt targets a CPU that's already offline.

To prevent the error from stalling the shutdown process perform the
disabling of APs and the BSP local APIC with interrupts disabled on all
CPUs in the system, so that by the time interrupts are unmasked on the BSP
the local APIC is already disabled.  This can still lead to a spurious:

APIC error on CPU0: 00(00)

As a result of an LVT Error getting injected while interrupts are masked on
the CPU, and the vector only handled after the local APIC is already
disabled.  ESR reports 0 because as part of disable_local_APIC() the ESR
register is cleared.

Note the NMI crash path doesn't have such an issue, because disabling of APs
and the caller local APIC is already done in the same contiguous region
with interrupts disabled.  There's a possible window on the NMI crash path
(nmi_shootdown_cpus()) where some APs might be disabled (and thus
interrupts targeting them raising "Receive accept error") before other APs
have interrupts disabled.  However the shutdown NMI will be handled,
regardless of whether the AP is processing a local APIC error, and hence
such interrupts will not cause the shutdown process to get stuck.

Remove the call to fixup_irqs() in smp_send_stop(): it doesn't achieve the
intended goal of moving all interrupts to the BSP anyway.  The logic in
fixup_irqs() will move interrupts whose affinity doesn't overlap with the
passed mask, but the movement of interrupts is done to any CPU set in
cpu_online_map.  As in the shutdown path fixup_irqs() is called before APs
are cleared from cpu_online_map this leads to interrupts being shuffled
around, but not assigned to the BSP exclusively.

The Fixes tag is more of a guess than a certainty; it's possible the
previous sleep window in fixup_irqs() allowed any in-flight interrupt to be
delivered before APs went offline.  However fixup_irqs() was still
incorrectly used, as it didn't (and still doesn't) move all interrupts to
target the provided cpu mask.

Fixes: e2bb28d62158 ('x86/irq: forward pending interrupts to new destination in fixup_irqs()')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agotools: fix typo in sysconfig.xencommons.in
Denis Mukhin [Tue, 11 Feb 2025 07:31:57 +0000 (07:31 +0000)]
tools: fix typo in sysconfig.xencommons.in

Fixes: 7b61011e1450 ("tools: make xenstore domain easy configurable")
Signed-off-by: Denis Mukhin <dmukhin@ford.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoRISCV: Activate UBSAN in testing
Andrew Cooper [Fri, 7 Feb 2025 21:19:21 +0000 (21:19 +0000)]
RISCV: Activate UBSAN in testing

RISC-V has less complicated headers, so update ubsan.c to pull in everything
it needs.  Provide dump_execution_state(), and update the printk() message to
make it more obvious that it's an outstanding task.

As with commit 8ef2ac727e21 ("automation: enable UBSAN for debug tests"),
enable UBSAN in RISC-V testing too.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoRISCV/asm: Use CALL rather than JAL
Andrew Cooper [Fri, 7 Feb 2025 15:04:25 +0000 (15:04 +0000)]
RISCV/asm: Use CALL rather than JAL

JAL has a maximum displacement of 2M.  To branch further, it needs pairing
with an AUIPC instruction.  CALL is a pseudoinstruction which allows the
linker to pick the appropriate sequence when relaxations are enabled.
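
For illustration only: a minimal sketch of the reachability test, assuming
the 2M figure is the total span of the J-type immediate, i.e. a signed,
2-byte-aligned offset from -1 MiB to +1 MiB - 2.  The helper name is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* J-type immediate: 21-bit signed byte offset, bit 0 is implicitly zero. */
#define JAL_MIN_OFFSET (-(INT64_C(1) << 20))
#define JAL_MAX_OFFSET ((INT64_C(1) << 20) - 2)

/* True if a single JAL placed at pc can branch to target. */
static bool jal_can_reach(int64_t pc, int64_t target)
{
    int64_t offset = target - pc;

    if (offset & 1)
        return false; /* odd offsets are not encodable */

    return offset >= JAL_MIN_OFFSET && offset <= JAL_MAX_OFFSET;
}
```

Once a symbol falls outside this window, a two-instruction sequence is
needed, which is the choice CALL leaves to the toolchain.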

This avoids a build failure of the form:

  prelink.o: in function `start':
  xen/xen/arch/riscv/riscv64/head.S:28:(.text.header+0x2c):
  relocation truncated to fit: R_RISCV_JAL against symbol `calc_phys_offset' defined in .init.text section in prelink.o
  make[3]: *** [arch/riscv/Makefile:18: xen-syms] Error 1

when Xen gets large enough, e.g. with CONFIG_UBSAN enabled.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoRISCV/boot: Run constructors during setup
Andrew Cooper [Fri, 7 Feb 2025 14:35:37 +0000 (14:35 +0000)]
RISCV/boot: Run constructors during setup

Without this, RISC-V isn't running boot time selftests when they're compiled
in.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoautomation: enable UBSAN for debug tests
Stefano Stabellini [Thu, 6 Feb 2025 02:37:23 +0000 (18:37 -0800)]
automation: enable UBSAN for debug tests

Enable CONFIG_UBSAN and CONFIG_UBSAN_FATAL for the ARM64 and x86_64
build jobs, with debug enabled, which are later used for Xen tests on
QEMU and/or real hardware.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoradix-tree: introduce RADIX_TREE{,_INIT}()
Jan Beulich [Fri, 7 Feb 2025 09:00:04 +0000 (10:00 +0100)]
radix-tree: introduce RADIX_TREE{,_INIT}()

... now that static initialization is possible. Use RADIX_TREE() for
pci_segments and ivrs_maps.

This then fixes an ordering issue on x86: With the call to
radix_tree_init(), acpi_mmcfg_init()'s invocation of pci_segments_init()
will zap the possible earlier introduction of segment 0 by
amd_iommu_detect_one_acpi()'s call to pci_ro_device(), and thus the
write-protection of the PCI devices representing AMD IOMMUs.

Fixes: 3950f2485bbc ("x86/x2APIC: defer probe until after IOMMU ACPI table parsing")
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoradix-tree: purge node allocation override hooks
Jan Beulich [Fri, 7 Feb 2025 08:59:11 +0000 (09:59 +0100)]
radix-tree: purge node allocation override hooks

These were needed by TMEM only, which is long gone. The Linux original
doesn't have such either. This effectively reverts one of the "Other
changes" from 8dc6738dbb3c ("Update radix-tree.[ch] from upstream Linux
to gain RCU awareness").

Positive side effect: Two cf_check go away.

While there also convert xmalloc()+memset() to xzalloc(). (Don't convert
to xvzalloc(), as that would require touching the freeing side, too.)

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months agoAMD/IOMMU: drop stray MSI enabling
Jan Beulich [Tue, 4 Feb 2025 12:50:49 +0000 (13:50 +0100)]
AMD/IOMMU: drop stray MSI enabling

While the 2nd of the commits referenced below should have moved the call
to amd_iommu_msi_enable() instead of adding another one, the situation
wasn't quite right even before: It can't have done any good to enable
MSI when no IRQ was allocated for it, yet.

The other call to amd_iommu_msi_enable(), just out of patch context,
needs to stay there until S3 resume is re-worked. For the boot path that
call should be unnecessary, as iommu{,_maskable}_msi_startup() will have
done it already (by way of invoking iommu_msi_unmask()).

Fixes: 5f569f1ac50e ("AMD/IOMMU: allow enabling with IRQ not yet set up")
Fixes: d9e49d1afe2e ("AMD/IOMMU: adjust setup of internal interrupt for x2APIC mode")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoxen/arm: ffa: fix bind/unbind notification
Jens Wiklander [Mon, 3 Feb 2025 10:21:12 +0000 (11:21 +0100)]
xen/arm: ffa: fix bind/unbind notification

The notification bitmask is passed in the FF-A ABI in two 32-bit
registers w3 and w4. The lower 32 bits should go in w3 and the higher in
w4. These two registers have unfortunately been swapped for
FFA_NOTIFICATION_BIND and FFA_NOTIFICATION_UNBIND in the FF-A mediator.
So fix that by using the correct registers.
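
For illustration only: a minimal sketch of the register split described
above, with the low half in w3 and the high half in w4.  The struct and
helper names are hypothetical and not taken from the FF-A mediator.

```c
#include <assert.h>
#include <stdint.h>

/* The two 32-bit argument registers carrying the 64-bit bitmask. */
struct notif_regs {
    uint32_t w3; /* lower 32 bits */
    uint32_t w4; /* higher 32 bits */
};

/* Rebuild the 64-bit bitmask from the two registers. */
static uint64_t join_bitmap(uint32_t w3, uint32_t w4)
{
    return ((uint64_t)w4 << 32) | w3;
}

/* Split a 64-bit bitmask into the two registers. */
static struct notif_regs split_bitmap(uint64_t bitmap)
{
    struct notif_regs regs = {
        .w3 = (uint32_t)(bitmap & 0xffffffffu),
        .w4 = (uint32_t)(bitmap >> 32),
    };

    /* Splitting and joining must round-trip. */
    assert(join_bitmap(regs.w3, regs.w4) == bitmap);
    return regs;
}
```

Swapping the two arguments of join_bitmap() reproduces the bug: a bit set
in the low half would end up 32 positions too high.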

Fixes: b490f470f58d ("xen/arm: ffa: support notification")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoAMD/IOMMU: log IVHD contents
Jan Beulich [Mon, 3 Feb 2025 10:43:49 +0000 (11:43 +0100)]
AMD/IOMMU: log IVHD contents

Despite all the verbosity with "iommu=debug", information on the IOMMUs
themselves was missing.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoxen/arm: Fix build issue when CONFIG_PHYS_ADDR_T_32=y
Michal Orzel [Tue, 28 Jan 2025 09:40:02 +0000 (10:40 +0100)]
xen/arm: Fix build issue when CONFIG_PHYS_ADDR_T_32=y

On Arm32, when CONFIG_PHYS_ADDR_T_32 is set, a build failure is observed:
arch/arm/platforms/vexpress.c: In function 'vexpress_smp_init':
arch/arm/platforms/vexpress.c:102:12: error: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'long long unsigned int' [-Werror=format=]
  102 |     printk("Set SYS_FLAGS to %"PRIpaddr" (%p)\n",

When CONFIG_PHYS_ADDR_T_32 is set, paddr_t is defined as unsigned long.
Commit 96f35de69e59 dropped __virt_to_maddr() which used paddr_t as a
return type. Without a cast, the expression type is unsigned long long
which causes the issue. Fix it.

Fixes: 96f35de69e59 ("x86+Arm: drop (rename) __virt_to_maddr() / __maddr_to_virt()")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Tested-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months agodevice-tree: bootfdt: Fix build issue when CONFIG_PHYS_ADDR_T_32=y
Michal Orzel [Tue, 28 Jan 2025 09:40:01 +0000 (10:40 +0100)]
device-tree: bootfdt: Fix build issue when CONFIG_PHYS_ADDR_T_32=y

On Arm32, when CONFIG_PHYS_ADDR_T_32 is set, a build failure is observed:
common/device-tree/bootfdt.c: In function 'build_assertions':
./include/xen/macros.h:47:31: error: static assertion failed: "!(alignof(struct membanks) != 8)"
   47 | #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
      |                               ^~~~~~~~~~~~~~
common/device-tree/bootfdt.c:31:5: note: in expansion of macro 'BUILD_BUG_ON'
   31 |     BUILD_BUG_ON(alignof(struct membanks) != 8);

When CONFIG_PHYS_ADDR_T_32 is set, paddr_t is defined as unsigned long,
therefore the struct membanks alignment is 4B and not 8B. The check is
there to ensure the struct membanks and struct membank, which is a
member of the former, are equally aligned. Therefore modify the check to
compare alignments obtained via alignof, so as not to rely on hardcoded
values.
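
For illustration only: a minimal sketch of an alignment check that
compares the two types instead of hardcoding 8.  The struct fields are
assumptions and do not mirror the real definitions; paddr_t follows the
unsigned long definition quoted above.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* As with CONFIG_PHYS_ADDR_T_32=y, per the text above. */
typedef unsigned long paddr_t;

/* Hypothetical layouts, just enough to give both types an alignment. */
struct membank {
    paddr_t start;
    paddr_t size;
};

struct membanks {
    unsigned int nr_banks;
    unsigned int max_banks;
    struct membank bank[];
};

/* Holds whether paddr_t is 4 or 8 bytes wide. */
static_assert(alignof(struct membanks) == alignof(struct membank),
              "struct membanks and struct membank must be equally aligned");

/* Convenience accessor for the container's alignment. */
static size_t membanks_alignment(void)
{
    return alignof(struct membanks);
}
```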

Fixes: 2209c1e35b47 ("xen/arm: Introduce a generic way to access memory bank structures")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Tested-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <julien@xen.org>
3 months agox86/intel: Fix PERF_GLOBAL fixup when virtualised
Andrew Cooper [Tue, 21 Jan 2025 16:56:26 +0000 (16:56 +0000)]
x86/intel: Fix PERF_GLOBAL fixup when virtualised

Logic using performance counters needs to look at
MSR_MISC_ENABLE.PERF_AVAILABLE before touching any other resources.

When virtualised under ESX, Xen dies with a #GP fault trying to read
MSR_CORE_PERF_GLOBAL_CTRL.

Factor this logic out into a separate function (it's already too squashed to
the RHS), and insert a check of MSR_MISC_ENABLE.PERF_AVAILABLE.

This also avoids setting X86_FEATURE_ARCH_PERFMON if MSR_MISC_ENABLE says that
PERF is unavailable, although oprofile (the only consumer of this flag)
cross-checks too.

Fixes: 6bdb965178bb ("x86/intel: ensure Global Performance Counter Control is setup correctly")
Reported-by: Jonathan Katz <jonathan.katz@aptar.com>
Link: https://xcp-ng.org/forum/topic/10286/nesting-xcp-ng-on-esx-8
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Jonathan Katz <jonathan.katz@aptar.com>
3 months agox86/PV: further harden guest memory accesses against speculative abuse
Jan Beulich [Mon, 27 Jan 2025 14:23:59 +0000 (15:23 +0100)]
x86/PV: further harden guest memory accesses against speculative abuse

The original implementation has two issues: For one it doesn't preserve
non-canonical-ness of inputs in the range 0x8000000000000000 through
0x80007fffffffffff. Bogus guest pointers in that range would not cause a
(#GP) fault upon access, when they should.

And then there is an AMD-specific aspect, where only the low 48 bits of
an address are used for speculative execution; the architecturally
mandated #GP for non-canonical addresses would be raised at a later
execution stage. Therefore to prevent Xen controlled data from making it
into any of the caches in a guest controllable manner, we need to
additionally ensure that for non-canonical inputs bit 47 would be clear.

See the code comment for how addressing both is being achieved.

Fixes: 4dc181599142 ("x86/PV: harden guest memory accesses against speculative abuse")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 months agox86emul: further correct 64-bit mode zero count repeated string insn handling
Jan Beulich [Mon, 27 Jan 2025 14:23:19 +0000 (15:23 +0100)]
x86emul: further correct 64-bit mode zero count repeated string insn handling

In an entirely different context I came across Linux commit 428e3d08574b
("KVM: x86: Fix zero iterations REP-string"), which points out that
we're still doing things wrong: For one, there's no zero-extension at
all on AMD. And then while RCX is zero-extended from 32 bits uniformly
for all string instructions on newer hardware, RSI/RDI are only for MOVS
and STOS on the systems I have access to. (On an old family 0xf system
I've further found that for REP LODS even RCX is not zero-extended.)

While touching the lines anyway, replace two casts in get_rep_prefix().

Fixes: 79e996a89f69 ("x86emul: correct 64-bit mode repeated string insn handling with zero count")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoiommu/amd: atomically update IRTE amd-iommu gitlab/amd-iommu
Roger Pau Monne [Mon, 20 Jan 2025 14:48:21 +0000 (15:48 +0100)]
iommu/amd: atomically update IRTE

Whether using a 32-bit Interrupt Remapping Entry or a 128-bit one, update
the entry atomically, using cmpxchg unconditionally as IOMMU depends on
it.  No longer disable the entry by setting RemapEn = 0 ahead of updating
it.  As a consequence of not toggling RemapEn ahead of the update, the
Interrupt Remapping Table needs to be flushed after the entry update.

This avoids a window where the IRTE has RemapEn = 0, which can lead to
IO_PAGE_FAULT if the underlying interrupt source is not masked.
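
For illustration only: a minimal sketch of replacing a 32-bit entry in a
single atomic step, using C11 atomics rather than Xen's cmpxchg helpers.
The function name is hypothetical, and the table flush that has to follow
the update is not modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Swap in the new entry value without ever exposing an intermediate
 * state (such as a cleared enable bit).  Returns the previous value.
 */
static uint32_t irte_update(_Atomic uint32_t *entry, uint32_t new_val)
{
    uint32_t old;

    assert(entry != NULL);
    old = atomic_load(entry);

    /* On failure 'old' is refreshed with the current value; retry. */
    while (!atomic_compare_exchange_weak(entry, &old, new_val))
        ;

    return old;
}
```

A 128-bit entry would need a 16-byte compare-and-exchange instead, which
is where the CMPXCHG16B dependency comes from.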

The AMD-Vi specification gives no guidance on how IRTE updates should be
performed, as opposed to DTE updates, for which it has specific guidance.  However
DTE updating claims that reads will always be at least 128bits in size, and
hence for the purposes here assume that reads and caching of the IRTE
entries in either 32 or 128 bit format will be done atomically by
the IOMMU.

Note that as part of introducing a new raw128 field in the IRTE struct, the
current raw field is renamed to raw64 to explicitly contain the size in the
field name.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoiommu/vtd: cleanup MAP_SINGLE_DEVICE and related code
Teddy Astie [Thu, 18 Apr 2024 11:57:21 +0000 (11:57 +0000)]
iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code

This flag was only used in case cx16 is not available.  As those code paths no
longer exist, this flag now does basically nothing.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agox86/iommu: remove non-CX16 logic from DMA remapping
Teddy Astie [Thu, 18 Apr 2024 11:57:20 +0000 (11:57 +0000)]
x86/iommu: remove non-CX16 logic from DMA remapping

As CX16 support is now mandatory for IOMMU usage, the checks for CX16 in
the DMA remapping code are stale.  Remove them together with the associated
code introduced in case CX16 was not available.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoiommu/vtd: remove non-CX16 logic from interrupt remapping
Teddy Astie [Thu, 18 Apr 2024 11:57:21 +0000 (11:57 +0000)]
iommu/vtd: remove non-CX16 logic from interrupt remapping

As CX16 support is now mandatory for IOMMU usage, the checks for CX16 in
the interrupt remapping code are stale.  Remove them together with the
associated code introduced in case CX16 was not available.

Note that AMD-Vi support for atomically updating a 128bit IRTE entry is
still not implemented; it will be done by further changes.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agox86/iommu: check for CMPXCHG16B when enabling IOMMU
Teddy Astie [Fri, 24 Jan 2025 11:31:15 +0000 (12:31 +0100)]
x86/iommu: check for CMPXCHG16B when enabling IOMMU

All hardware with VT-d/AMD-Vi has CMPXCHG16B support. Check this at
initialisation time, and otherwise refuse to use the IOMMU.
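
For illustration only: a minimal sketch of the gating decision as a pure
function over the CPUID leaf 1 ECX value, where bit 13 reports CMPXCHG16B.
The helper names are hypothetical and this is not Xen's feature-detection
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 13 advertises CMPXCHG16B. */
#define CPUID1_ECX_CX16 (UINT32_C(1) << 13)

static_assert(CPUID1_ECX_CX16 == 0x2000, "CX16 is bit 13 of ECX");

static bool cpu_has_cx16(uint32_t cpuid1_ecx)
{
    return (cpuid1_ecx & CPUID1_ECX_CX16) != 0;
}

/* Refuse to use the IOMMU when CX16 is missing. */
static bool iommu_may_enable(bool iommu_present, uint32_t cpuid1_ecx)
{
    return iommu_present && cpu_has_cx16(cpuid1_ecx);
}
```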

If the local APICs support x2APIC mode the IOMMU support for interrupt
remapping will be checked earlier using a specific helper.  If no support
for CX16 is detected by that earlier hook disable the IOMMU at that point
and prevent further poking for CX16 later in the boot process, which would
also fail.

There's a possible corner case when running virtualized, with the underlying
hypervisor exposing an IOMMU but no CMPXCHG16B support.  In that case
ignoring the IOMMU is fine, although the most natural approach would be for the
underlying hypervisor to also expose CMPXCHG16B support if an IOMMU is
available to the VM.

Note this change only introduces the checks, but doesn't remove the now
stale checks for CX16 support sprinkled in the IOMMU code.  Further changes
will take care of that.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agox86/HVM: correct read/write split at page boundaries
Jan Beulich [Fri, 24 Jan 2025 09:15:56 +0000 (10:15 +0100)]
x86/HVM: correct read/write split at page boundaries

The MMIO cache is intended to have one entry used per independent memory
access that an insn does. This, in particular, is supposed to be
ignoring any page boundary crossing. Therefore when looking up a cache
entry, the access's starting (linear) address is relevant, not the one
possibly advanced past a page boundary.

In order for the same offset-into-buffer variable to be usable in
hvmemul_phys_mmio_access() for both the caller's buffer and the cache
entry's it is further necessary to have the un-adjusted caller buffer
passed into there.

Fixes: 2d527ba310dc ("x86/hvm: split all linear reads and writes at page boundary")
Reported-by: Manuel Andreas <manuel.andreas@tum.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agox86/HVM: allocate emulation cache entries dynamically
Jan Beulich [Fri, 24 Jan 2025 09:15:29 +0000 (10:15 +0100)]
x86/HVM: allocate emulation cache entries dynamically

Both caches may need higher capacity, and the upper bound will need to
be determined dynamically based on CPUID policy (for AMX's TILELOAD /
TILESTORE at least).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/HVM: correct MMIO emulation cache bounds check
Jan Beulich [Thu, 23 Jan 2025 10:14:48 +0000 (11:14 +0100)]
x86/HVM: correct MMIO emulation cache bounds check

To avoid overrunning the internal buffer we need to take the offset into
the buffer into account.
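
A minimal sketch of the kind of check meant here, assuming a fixed 64-byte
buffer.  The function names are hypothetical and this is not the code the
commit touches; it only contrasts a check that ignores the offset with one
that accounts for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_BYTES 64u

/* Incomplete check: ignores where in the buffer the access starts. */
static bool fits_ignoring_offset(size_t offset, size_t size)
{
    (void)offset;
    return size <= CACHE_BYTES;
}

/* Offset-aware check, written so that offset + size cannot wrap around. */
static bool fits_with_offset(size_t offset, size_t size)
{
    return offset <= CACHE_BYTES && size <= CACHE_BYTES - offset;
}

/* Store into the buffer only after the offset-aware check has passed. */
static void cache_store(uint8_t *cache_buf, size_t offset,
                        const void *src, size_t size)
{
    assert(fits_with_offset(offset, size));
    memcpy(cache_buf + offset, src, size);
}
```

An 8-byte access at offset 60 passes the first check but would overrun the
buffer by 4 bytes; the second check rejects it.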

Fixes: d95da91fb497 ("x86/HVM: grow MMIO cache data size to 64 bytes")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 months ago docs: fusa: Fix OFT tags for the design requirements
Ayan Kumar Halder [Tue, 14 Jan 2025 18:57:07 +0000 (18:57 +0000)]
docs: fusa: Fix OFT tags for the design requirements

The OFT tags for the design requirements are updated.

Fixes: b9f9b396452 ("docs: fusa: Add dom0less domain configuration requirements")
Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago automation/cirrus-ci: introduce FreeBSD randconfig builds randconf gitlab/randconf
Roger Pau Monne [Thu, 16 Jan 2025 08:06:26 +0000 (09:06 +0100)]
automation/cirrus-ci: introduce FreeBSD randconfig builds

Add a new randconfig job for each FreeBSD version.  This requires some
rework of the template so common parts can be shared between the full and
the randconfig builds.  Such randconfig builds are relevant because FreeBSD
is the only tested system that has a full non-GNU toolchain.

While there, replace the usage of the python311 package with python3, which is
already using 3.11, and remove the install of the plain python package for full
builds.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago automation/cirrus-ci: update FreeBSD to 13.4
Roger Pau Monne [Thu, 16 Jan 2025 08:07:31 +0000 (09:07 +0100)]
automation/cirrus-ci: update FreeBSD to 13.4

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago docs/misra: Document ECLAIR extension to Rule 20.7
Nicola Vetrini [Fri, 17 Jan 2025 07:54:39 +0000 (08:54 +0100)]
docs/misra: Document ECLAIR extension to Rule 20.7

MISRA C Rule 20.7 states:
"Expressions resulting from the expansion of macro parameters shall
be enclosed in parentheses".

Document the behaviour of ECLAIR with respect to the CPP extension
that allows variable macro arguments to be named.
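
For reference, a small example of what the rule itself is about.  The
macro names are made up, and the example does not attempt to reproduce
ECLAIR's handling of named variadic arguments.

```c
#include <assert.h>

/* Violates Rule 20.7: the expansions of a and b are not parenthesised. */
#define MUL_UNSAFE(a, b) (a * b)

/* Compliant: each macro parameter expansion is enclosed in parentheses. */
#define MUL_SAFE(a, b) ((a) * (b))

/* Expands to (1 + 2 * 3), so operator precedence changes the meaning. */
static int mul_unsafe_demo(void)
{
    return MUL_UNSAFE(1 + 2, 3);
}

/* Expands to ((1 + 2) * (3)), which is what the caller intended. */
static int mul_safe_demo(void)
{
    int r = MUL_SAFE(1 + 2, 3);

    assert(r == (1 + 2) * 3);
    return r;
}
```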

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago Manual pages: Fix a few typos
Bernhard Kaindl [Fri, 17 Jan 2025 07:54:25 +0000 (08:54 +0100)]
Manual pages: Fix a few typos

While skimming through the manual pages, I spotted a few typos.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago xl: properly dispose of libxl_dominfo struct instances
Jan Beulich [Fri, 17 Jan 2025 07:54:03 +0000 (08:54 +0100)]
xl: properly dispose of libxl_dominfo struct instances

The ssid_label field requires separate freeing; make sure to call
libxl_dominfo_dispose() as well as libxl_dominfo_init(). Since vcpuset()
calls only the former, add a call to the latter there at the same time.
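
The init/dispose pairing can be sketched with a stand-in type, so that the
example builds without libxl.  struct dominfo, dominfo_init(),
dominfo_fill() and dominfo_dispose() below are hypothetical models of the
libxl pattern, not the libxl functions themselves.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for libxl_dominfo, reduced to the one heap-allocated field. */
struct dominfo {
    char *ssid_label;
};

static int live_labels;          /* counts labels not yet freed */

static void dominfo_init(struct dominfo *info)
{
    memset(info, 0, sizeof(*info));
}

/* Models a query that allocates the label on success. */
static int dominfo_fill(struct dominfo *info, const char *label)
{
    size_t len;

    assert(label != NULL);
    len = strlen(label) + 1;
    info->ssid_label = malloc(len);
    if (!info->ssid_label)
        return -1;
    memcpy(info->ssid_label, label, len);
    live_labels++;
    return 0;
}

static void dominfo_dispose(struct dominfo *info)
{
    if (info->ssid_label) {
        free(info->ssid_label);
        live_labels--;
    }
    info->ssid_label = NULL;
}

/* Every path out of the function goes through dispose, including failure. */
static int label_length(const char *label)
{
    struct dominfo info;
    int len = -1;

    dominfo_init(&info);
    if (dominfo_fill(&info, label) == 0)
        len = (int)strlen(info.ssid_label);
    dominfo_dispose(&info);

    return len;
}
```

Calling init first is what makes the unconditional dispose safe when the
fill step fails before allocating anything.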

Coverity-ID: 1638727
Coverity-ID: 1638728
Fixes: c458c404da16 ("xl: use libxl_domain_info to get the uuid in printf_info")
Fixes: 48dab9767d2e ("tools/xl: use libxl_domain_info to get domain type for vcpu-pin")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago xl: properly dispose of vTPM struct instance
Jan Beulich [Fri, 17 Jan 2025 07:53:50 +0000 (08:53 +0100)]
xl: properly dispose of vTPM struct instance

The backend_domname field requires separate freeing; make sure to call
libxl_device_vtpm_dispose() also on respective error paths.

Coverity-ID: 1638719
Fixes: dde22055ac3a ("libxl: add vtpm support")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago xentrace: free CPU mask string before overwriting pointer
Jan Beulich [Fri, 17 Jan 2025 07:53:27 +0000 (08:53 +0100)]
xentrace: free CPU mask string before overwriting pointer

While multiple -c options may be unexpected, we'd still better deal with
them properly.
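
A sketch of the pattern, assuming the option value is kept in a single
heap-allocated string.  cpu_mask_str and set_cpu_mask() are hypothetical
names, not the xentrace ones.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical storage for the most recent -c argument. */
static char *cpu_mask_str;

/* Called once per -c option; a later option replaces an earlier one. */
static int set_cpu_mask(const char *arg)
{
    size_t len;
    char *copy;

    assert(arg != NULL);

    len = strlen(arg) + 1;
    copy = malloc(len);
    if (!copy)
        return -1;
    memcpy(copy, arg, len);

    /* Release the previous string first; free(NULL) is a no-op. */
    free(cpu_mask_str);
    cpu_mask_str = copy;

    return 0;
}
```

Without the free(), a second -c would leak the string stored by the first.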

Also restore the blank line that was bogusly zapped by the same commit.

Coverity-ID: 1638723
Fixes: e4ad2836842a ("xentrace: Implement cpu mask range parsing of human values (-c)")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago docs/misc: Fix a few typos
Bernhard Kaindl [Wed, 15 Jan 2025 15:09:04 +0000 (16:09 +0100)]
docs/misc: Fix a few typos

While skimming through the misc docs, I spotted a few typos.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago docs: Fix some typos in the design docs
Bernhard Kaindl [Wed, 15 Jan 2025 13:44:55 +0000 (14:44 +0100)]
docs: Fix some typos in the design docs

Skimming through the design docs, I saw some typos that needed fixing.

Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago xen/ppc: Fix double xen_ulong_t typedef in public/arch-ppc.h
Andrew Cooper [Wed, 15 Jan 2025 14:22:21 +0000 (14:22 +0000)]
xen/ppc: Fix double xen_ulong_t typedef in public/arch-ppc.h

public/arch-ppc.h contains two adjacent #ifndef __ASSEMBLY__ blocks.

With these merged, it becomes very obvious that there's a duplicate
definition of xen_ulong_t, which is also noticed by the docs build:

  /usr/bin/perl -w /local/xen.git/docs/xen-headers -O html/hypercall/ppc \
          -T 'arch-ppc - Xen public headers' \
          -X arch-arm -X arch-riscv -X arch-x86_32 -X arch-x86_64 \
          -X xen-arm -X xen-riscv -X xen-x86_32 -X xen-x86_64 \
          -X arch-x86 \
          /local/xen.git/docs/../xen include/public include/xen/errno.h
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:55
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:61
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:61
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:55

Drop the second typedef.  Finally, annotate the #endif so it's clear
what it refers to.
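
The resulting shape can be pictured as below.  This is an illustrative
header fragment with made-up names (EXAMPLE_ASSEMBLY, example_ulong_t,
example_regs) and an assumed 64-bit type, not the contents of
public/arch-ppc.h.

```c
#include <assert.h>
#include <stdint.h>

#ifndef EXAMPLE_ASSEMBLY

/* Single definition, now that the two adjacent blocks are one. */
typedef uint64_t example_ulong_t;

struct example_regs {
    example_ulong_t gpr[4];
};

/* Helper so the fragment has something to exercise. */
static example_ulong_t first_gpr(const struct example_regs *regs)
{
    assert(regs != 0);
    return regs->gpr[0];
}

#endif /* !EXAMPLE_ASSEMBLY */
```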

Fixes: 08c192cc1127 ("xen/ppc: Add public/arch-ppc.h")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Shawn Anastasio <sanastasio@raptorengineering.com>
3 months ago docs/sphinx: gitignore generated files
Yann Dirson [Wed, 15 Jan 2025 12:27:56 +0000 (12:27 +0000)]
docs/sphinx: gitignore generated files

Signed-off-by: Yann Dirson <yann.dirson@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago docs: rationalise .gitignore
Yann Dirson [Wed, 15 Jan 2025 12:27:56 +0000 (12:27 +0000)]
docs: rationalise .gitignore

Note I did not transplant the patterns under doc/txt/ (since the whole
dir is ignored already), and adjusted sort order to be fully
alphabetical.

Signed-off-by: Yann Dirson <yann.dirson@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago docs/sphinx: import sys for error reporting
Yann Dirson [Wed, 15 Jan 2025 12:27:56 +0000 (12:27 +0000)]
docs/sphinx: import sys for error reporting

Signed-off-by: Yann Dirson <yann.dirson@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago automation/gitlab: disable coverage from clang randconfig
Roger Pau Monne [Tue, 14 Jan 2025 14:10:14 +0000 (15:10 +0100)]
automation/gitlab: disable coverage from clang randconfig

If randconfig enables coverage support, the build times out due to GNU LD
taking too long.  For the time being, prevent coverage from being enabled in
the clang randconfig job.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/time: prefer CMOS over EFI_GET_TIME
Roger Pau Monne [Mon, 2 Sep 2024 14:00:19 +0000 (16:00 +0200)]
x86/time: prefer CMOS over EFI_GET_TIME

The EFI_GET_TIME implementation is well known to be broken on many firmware
implementations; for Xen the result on such implementations is:

----[ Xen-4.19-unstable  x86_64  debug=y  Tainted:   C    ]----
CPU:    0
RIP:    e008:[<0000000062ccfa70>] 0000000062ccfa70
[...]
Xen call trace:
   [<0000000062ccfa70>] R 0000000062ccfa70
   [<00000000732e9a3f>] S 00000000732e9a3f
   [<ffff82d04034f34f>] F arch/x86/time.c#get_cmos_time+0x1b3/0x26e
   [<ffff82d04045926f>] F init_xen_time+0x28/0xa4
   [<ffff82d040454bc4>] F __start_xen+0x1ee7/0x2578
   [<ffff82d040203334>] F __high_start+0x94/0xa0

Pagetable walk from 0000000062ccfa70:
 L4[0x000] = 000000207ef1c063 ffffffffffffffff
 L3[0x001] = 000000005d6c0063 ffffffffffffffff
 L2[0x116] = 8000000062c001e3 ffffffffffffffff (PSE)

****************************************
Panic on CPU 0:
FATAL PAGE FAULT
[error_code=0011]
Faulting linear address: 0000000062ccfa70
****************************************

Swap the preference to default to CMOS first, and EFI later, in an attempt to
use EFI_GET_TIME as a last resort option only.  Note that Linux for example
doesn't allow calling the get_time method, and instead provides a dummy handler
that unconditionally returns EFI_UNSUPPORTED on x86-64.

Such a change in the preferences requires some re-arranging of the function
logic, so that panic messages with workaround suggestions are suitably printed.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/time: introduce command line option to select wallclock
Roger Pau Monne [Mon, 2 Sep 2024 15:51:33 +0000 (17:51 +0200)]
x86/time: introduce command line option to select wallclock

Allow setting the used wallclock from the command line.  When the option is set
to a value different than `auto` the probing is bypassed and the selected
implementation is used (as long as it's available).

The `xen` and `efi` options require being booted as a Xen guest (with Xen guest
support built in) or from UEFI firmware respectively.
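
The selection logic might look roughly like the sketch below.  The enum,
function names and availability flags are hypothetical.  The `cmos` value,
the Xen-first probing order under `auto`, and returning "none" when the
requested source is unavailable are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum wallclock { WC_NONE, WC_AUTO, WC_XEN, WC_CMOS, WC_EFI };

/* Map the option string to a source; unknown strings give WC_NONE. */
static enum wallclock parse_wallclock(const char *s)
{
    assert(s != NULL);

    if (!strcmp(s, "auto"))
        return WC_AUTO;
    if (!strcmp(s, "xen"))
        return WC_XEN;
    if (!strcmp(s, "cmos"))      /* assumed option value */
        return WC_CMOS;
    if (!strcmp(s, "efi"))
        return WC_EFI;

    return WC_NONE;
}

/* An explicit request bypasses probing; `auto` probes with EFI last. */
static enum wallclock select_wallclock(enum wallclock requested,
                                       bool xen_guest, bool cmos_ok,
                                       bool efi_boot)
{
    switch (requested) {
    case WC_XEN:
        return xen_guest ? WC_XEN : WC_NONE;
    case WC_CMOS:
        return cmos_ok ? WC_CMOS : WC_NONE;
    case WC_EFI:
        return efi_boot ? WC_EFI : WC_NONE;
    case WC_AUTO:
        break;
    default:
        return WC_NONE;
    }

    if (xen_guest)
        return WC_XEN;
    if (cmos_ok)
        return WC_CMOS;
    if (efi_boot)
        return WC_EFI;

    return WC_NONE;
}
```

With both CMOS and EFI available, `auto` picks CMOS, while an explicit
`efi` still selects EFI.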

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago automation/eclair: make Misra rule 20.7 blocking
Roger Pau Monne [Tue, 14 Jan 2025 11:08:22 +0000 (12:08 +0100)]
automation/eclair: make Misra rule 20.7 blocking

There are no violations left, so make the rule globally blocking for both x86
and ARM.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago docs: Improve spelling of few cases in the documentation
Bernhard Kaindl [Wed, 15 Jan 2025 15:01:39 +0000 (16:01 +0100)]
docs: Improve spelling of few cases in the documentation

Skimming the docs, I came across a few places for spelling improvements.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago MAINTAINERS: Change reviewer of the ECLAIR integration
Nicola Vetrini [Wed, 15 Jan 2025 15:01:25 +0000 (16:01 +0100)]
MAINTAINERS: Change reviewer of the ECLAIR integration

Simone Ballarin is no longer actively involved in reviewing
the ECLAIR integration for Xen. I am stepping up as a reviewer.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Simone Ballarin <simone.ballarin@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago misra: add deviation for MISRA C Rule R11.8
Alessandro Zucchelli [Wed, 15 Jan 2025 15:01:13 +0000 (16:01 +0100)]
misra: add deviation for MISRA C Rule R11.8

Rule 11.8 states the following: "A cast shall not remove any `const' or
`volatile' qualification from the type pointed to by a pointer".

Function `__hvm_copy' in `xen/arch/x86/hvm/hvm.c' is a dual-use
function: its buffer parameter cannot be const, because depending on the
flags passed in it is either written to or only read. As it was decided
that a new const-only function would lead to more developer confusion than
it's worth, this violation is addressed by deviating the function.
All cases of casting away const-ness are accompanied by a comment
explaining why it is safe given the other flags passed in; this comment is used
by the deviation in order to match the appropriate function call.
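
The situation can be illustrated with a toy dual-use copy helper.  The
names (copy_guest, COPY_TO_GUEST, guest_mem) are invented for the example
and the code is not taken from hvm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COPY_TO_GUEST 0x1u       /* hypothetical direction flag */

static unsigned char guest_mem[16];

/*
 * Dual-use helper: buf is only read when COPY_TO_GUEST is set, and is
 * written otherwise, so the parameter cannot be declared const.
 */
static void copy_guest(void *buf, size_t off, size_t size, unsigned int flags)
{
    assert(off <= sizeof(guest_mem) && size <= sizeof(guest_mem) - off);

    if (flags & COPY_TO_GUEST)
        memcpy(&guest_mem[off], buf, size);
    else
        memcpy(buf, &guest_mem[off], size);
}

static void copy_to_guest(size_t off, const void *src, size_t size)
{
    /* Casting away const is safe: COPY_TO_GUEST means src is only read. */
    copy_guest((void *)src, off, size, COPY_TO_GUEST);
}

static void copy_from_guest(void *dst, size_t off, size_t size)
{
    copy_guest(dst, off, size, 0);
}
```

The cast in copy_to_guest() is the kind of construct Rule 11.8 flags, and
the comment next to it is what a deviation of this sort would key on.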

No functional change.

Signed-off-by: Alessandro Zucchelli <alessandro.zucchelli@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>