xenbits.xensource.com Git - xen.git/log
xen.git
2 months ago automation: enable UBSAN for debug tests 4.20.0-rc4
Stefano Stabellini [Thu, 6 Feb 2025 02:37:23 +0000 (18:37 -0800)]
automation: enable UBSAN for debug tests

Enable CONFIG_UBSAN and CONFIG_UBSAN_FATAL for the ARM64 and x86_64
build jobs, with debug enabled, which are later used for Xen tests on
QEMU and/or real hardware.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months ago radix-tree: introduce RADIX_TREE{,_INIT}()
Jan Beulich [Fri, 7 Feb 2025 09:00:04 +0000 (10:00 +0100)]
radix-tree: introduce RADIX_TREE{,_INIT}()

... now that static initialization is possible. Use RADIX_TREE() for
pci_segments and ivrs_maps.

This then fixes an ordering issue on x86: With the call to
radix_tree_init(), acpi_mmcfg_init()'s invocation of pci_segments_init()
will zap the possible earlier introduction of segment 0 by
amd_iommu_detect_one_acpi()'s call to pci_ro_device(), and thus the
write-protection of the PCI devices representing AMD IOMMUs.

Fixes: 3950f2485bbc ("x86/x2APIC: defer probe until after IOMMU ACPI table parsing")
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months ago radix-tree: purge node allocation override hooks
Jan Beulich [Fri, 7 Feb 2025 08:59:11 +0000 (09:59 +0100)]
radix-tree: purge node allocation override hooks

These were needed by TMEM only, which is long gone. The Linux original
doesn't have such either. This effectively reverts one of the "Other
changes" from 8dc6738dbb3c ("Update radix-tree.[ch] from upstream Linux
to gain RCU awareness").

Positive side effect: Two cf_check go away.

While there also convert xmalloc()+memset() to xzalloc(). (Don't convert
to xvzalloc(), as that would require touching the freeing side, too.)
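The allocation cleanup can be illustrated in hosted C. This is only a sketch: `calloc()` stands in for Xen's `xzalloc()`, and `struct node` with its fields is a hypothetical stand-in for a radix tree node.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical node type, standing in for a radix tree node. */
struct node {
    unsigned int height;
    unsigned int count;
    void *slots[4];
};

/* Before: allocate, then clear in a separate step. */
static struct node *node_alloc_old(void)
{
    struct node *n = malloc(sizeof(*n));

    if (n)
        memset(n, 0, sizeof(*n));
    return n;
}

/* After: a single zeroing allocation (calloc() here, xzalloc() in Xen). */
static struct node *node_alloc_new(void)
{
    return calloc(1, sizeof(struct node));
}
```

Both variants hand back zero-filled memory that is released with the same `free()`, which mirrors why the freeing side did not need touching.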

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
2 months ago AMD/IOMMU: drop stray MSI enabling
Jan Beulich [Tue, 4 Feb 2025 12:50:49 +0000 (13:50 +0100)]
AMD/IOMMU: drop stray MSI enabling

While the 2nd of the commits referenced below should have moved the call
to amd_iommu_msi_enable() instead of adding another one, the situation
wasn't quite right even before: It can't have done any good to enable
MSI when no IRQ was allocated for it, yet.

The other call to amd_iommu_msi_enable(), just out of patch context,
needs to stay there until S3 resume is re-worked. For the boot path that
call should be unnecessary, as iommu{,_maskable}_msi_startup() will have
done it already (by way of invoking iommu_msi_unmask()).

Fixes: 5f569f1ac50e ("AMD/IOMMU: allow enabling with IRQ not yet set up")
Fixes: d9e49d1afe2e ("AMD/IOMMU: adjust setup of internal interrupt for x2APIC mode")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago xen/arm: ffa: fix bind/unbind notification
Jens Wiklander [Mon, 3 Feb 2025 10:21:12 +0000 (11:21 +0100)]
xen/arm: ffa: fix bind/unbind notification

The notification bitmask is passed in the FF-A ABI in two 32-bit
registers, w3 and w4. The lower 32 bits should go in w3 and the higher in
w4. These two registers have unfortunately been swapped for
FFA_NOTIFICATION_BIND and FFA_NOTIFICATION_UNBIND in the FF-A mediator.
So fix that by using the correct registers.
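A minimal sketch of the register split described above. The struct and function names are hypothetical and are not taken from the FF-A mediator; only the w3/w4 assignment comes from the text.

```c
#include <stdint.h>

/* Hypothetical container for the two 32-bit register values. */
struct bitmap_regs {
    uint32_t w3; /* lower 32 bits of the notification bitmask */
    uint32_t w4; /* upper 32 bits of the notification bitmask */
};

/* Split a 64-bit bitmask into the two register values. */
static struct bitmap_regs split_bitmap(uint64_t bitmap)
{
    struct bitmap_regs r = {
        .w3 = (uint32_t)(bitmap & 0xffffffffU),
        .w4 = (uint32_t)(bitmap >> 32),
    };

    return r;
}

/* Reassemble the bitmask on the receiving side. */
static uint64_t join_bitmap(uint32_t w3, uint32_t w4)
{
    return ((uint64_t)w4 << 32) | w3;
}
```

Swapping the two arguments of `join_bitmap()` reproduces the kind of bug being fixed: the low and high halves trade places.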

Fixes: b490f470f58d ("xen/arm: ffa: support notification")
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago AMD/IOMMU: log IVHD contents
Jan Beulich [Mon, 3 Feb 2025 10:43:49 +0000 (11:43 +0100)]
AMD/IOMMU: log IVHD contents

Despite all the verbosity with "iommu=debug", information on the IOMMUs
themselves was missing.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Tested-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago xen/arm: Fix build issue when CONFIG_PHYS_ADDR_T_32=y 4.20.0-rc3
Michal Orzel [Tue, 28 Jan 2025 09:40:02 +0000 (10:40 +0100)]
xen/arm: Fix build issue when CONFIG_PHYS_ADDR_T_32=y

On Arm32, when CONFIG_PHYS_ADDR_T_32 is set, a build failure is observed:
arch/arm/platforms/vexpress.c: In function 'vexpress_smp_init':
arch/arm/platforms/vexpress.c:102:12: error: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'long long unsigned int' [-Werror=format=]
  102 |     printk("Set SYS_FLAGS to %"PRIpaddr" (%p)\n",

When CONFIG_PHYS_ADDR_T_32 is set, paddr_t is defined as unsigned long.
Commit 96f35de69e59 dropped __virt_to_maddr() which used paddr_t as a
return type. Without a cast, the expression type is unsigned long long
which causes the issue. Fix it.
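The type mismatch can be reproduced outside Xen. In this sketch the `paddr_t` and `PRIpaddr` definitions mimic the CONFIG_PHYS_ADDR_T_32=y case, while `phys_offset` and `format_maddr()` are hypothetical names used only to produce an `unsigned long long` expression; they are not the code from vexpress.c.

```c
#include <stdio.h>

/* As with CONFIG_PHYS_ADDR_T_32=y: paddr_t is unsigned long. */
typedef unsigned long paddr_t;
#define PRIpaddr "lx"

/* Hypothetical offset whose type widens the arithmetic below. */
static const unsigned long long phys_offset = 0x80000000ULL;

static int format_maddr(char *buf, size_t len, unsigned long va)
{
    /*
     * va + phys_offset has type unsigned long long, which does not match
     * the "%lx" conversion.  The cast makes the argument a paddr_t again.
     */
    return snprintf(buf, len, "%" PRIpaddr, (paddr_t)(va + phys_offset));
}
```

Dropping the cast makes GCC report the same -Wformat diagnostic as quoted in the commit message.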

Fixes: 96f35de69e59 ("x86+Arm: drop (rename) __virt_to_maddr() / __maddr_to_virt()")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Tested-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago device-tree: bootfdt: Fix build issue when CONFIG_PHYS_ADDR_T_32=y
Michal Orzel [Tue, 28 Jan 2025 09:40:01 +0000 (10:40 +0100)]
device-tree: bootfdt: Fix build issue when CONFIG_PHYS_ADDR_T_32=y

On Arm32, when CONFIG_PHYS_ADDR_T_32 is set, a build failure is observed:
common/device-tree/bootfdt.c: In function 'build_assertions':
./include/xen/macros.h:47:31: error: static assertion failed: "!(alignof(struct membanks) != 8)"
   47 | #define BUILD_BUG_ON(cond) ({ _Static_assert(!(cond), "!(" #cond ")"); })
      |                               ^~~~~~~~~~~~~~
common/device-tree/bootfdt.c:31:5: note: in expansion of macro 'BUILD_BUG_ON'
   31 |     BUILD_BUG_ON(alignof(struct membanks) != 8);

When CONFIG_PHYS_ADDR_T_32 is set, paddr_t is defined as unsigned long,
therefore the struct membanks alignment is 4B and not 8B. The check is
there to ensure the struct membanks and struct membank, which is a
member of the former, are equally aligned. Therefore modify the check to
compare alignments obtained via alignof, so as not to rely on hardcoded
values.
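A sketch of a check that compares the two alignments directly. The struct layouts are simplified, hypothetical stand-ins for the real `struct membank` and `struct membanks`, and `paddr_t` is assumed to be 32 bits wide.

```c
#include <stdalign.h>
#include <stdint.h>

typedef uint32_t paddr_t; /* assumed 32-bit paddr_t */

/* Simplified stand-in for a memory bank descriptor. */
struct membank {
    paddr_t start;
    paddr_t size;
};

/* Simplified stand-in for the container holding the banks. */
struct membanks {
    unsigned int nr_banks;
    unsigned int max_banks;
    struct membank bank[];
};

/* Holds for any paddr_t width, unlike a comparison against a literal 8. */
_Static_assert(alignof(struct membanks) == alignof(struct membank),
               "membanks and membank must be equally aligned");
```

With a 64-bit `paddr_t` both alignments become 8 and the assertion still holds, which is the point of not hardcoding the value.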

Fixes: 2209c1e35b47 ("xen/arm: Introduce a generic way to access memory bank structures")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Tested-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <julien@xen.org>
3 months ago x86/intel: Fix PERF_GLOBAL fixup when virtualised
Andrew Cooper [Tue, 21 Jan 2025 16:56:26 +0000 (16:56 +0000)]
x86/intel: Fix PERF_GLOBAL fixup when virtualised

Logic using performance counters needs to look at
MSR_MISC_ENABLE.PERF_AVAILABLE before touching any other resources.

When virtualised under ESX, Xen dies with a #GP fault trying to read
MSR_CORE_PERF_GLOBAL_CTRL.

Factor this logic out into a separate function (it's already too squashed to
the RHS), and insert a check of MSR_MISC_ENABLE.PERF_AVAILABLE.
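The ordering can be sketched as below. This is not Xen's function: the names are hypothetical, the MSR access is replaced by a caller-supplied reader so the sketch runs in user space, and the bit position (7) and MSR index (0x38f) are taken from the Intel SDM rather than from this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MISC_ENABLE bit 7: Performance Monitoring Available. */
#define MISC_ENABLE_PERF_AVAIL (UINT64_C(1) << 7)
#define MSR_PERF_GLOBAL_CTRL   0x38fU

/* Hypothetical MSR reader type, so the sketch needs no privileged code. */
typedef uint64_t (*rdmsr_fn)(uint32_t msr);

/* Fake reader for demonstration; counts how often it is called. */
static unsigned int fake_reads;
static uint64_t fake_rdmsr(uint32_t msr)
{
    (void)msr;
    fake_reads++;
    return 0xf;
}

/* Read the global control MSR only if MISC_ENABLE says PERF exists. */
static bool read_perf_global(uint64_t misc_enable, rdmsr_fn rd, uint64_t *ctrl)
{
    if (!(misc_enable & MISC_ENABLE_PERF_AVAIL))
        return false; /* do not touch any performance counter MSR */

    *ctrl = rd(MSR_PERF_GLOBAL_CTRL);
    return true;
}
```

When the availability bit is clear the reader is never invoked, which is what avoids the #GP fault in the nested case.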

This also avoids setting X86_FEATURE_ARCH_PERFMON if MSR_MISC_ENABLE says that
PERF is unavailable, although oprofile (the only consumer of this flag)
cross-checks too.

Fixes: 6bdb965178bb ("x86/intel: ensure Global Performance Counter Control is setup correctly")
Reported-by: Jonathan Katz <jonathan.katz@aptar.com>
Link: https://xcp-ng.org/forum/topic/10286/nesting-xcp-ng-on-esx-8
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Jonathan Katz <jonathan.katz@aptar.com>
3 months ago x86/PV: further harden guest memory accesses against speculative abuse
Jan Beulich [Mon, 27 Jan 2025 14:23:59 +0000 (15:23 +0100)]
x86/PV: further harden guest memory accesses against speculative abuse

The original implementation has two issues: For one it doesn't preserve
non-canonical-ness of inputs in the range 0x8000000000000000 through
0x80007fffffffffff. Bogus guest pointers in that range would not cause a
(#GP) fault upon access, when they should.

And then there is an AMD-specific aspect, where only the low 48 bits of
an address are used for speculative execution; the architecturally
mandated #GP for non-canonical addresses would be raised at a later
execution stage. Therefore to prevent Xen controlled data from making it
into any of the caches in a guest controllable manner, we need to
additionally ensure that for non-canonical inputs bit 47 would be clear.

See the code comment for how addressing both is being achieved.
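For readers unfamiliar with the terminology, a small sketch of what canonical means with 48-bit virtual addresses. The helper names are hypothetical and this is not the hardening sequence itself, only the property it has to preserve.

```c
#include <stdbool.h>
#include <stdint.h>

/* Canonical with 48-bit addressing: bits 63..47 are all equal. */
static bool is_canonical48(uint64_t addr)
{
    uint64_t top = addr >> 47; /* the 17 bits 63..47 */

    return top == 0 || top == 0x1ffff;
}

/* Bit 47, the highest bit visible when only the low 48 bits are used. */
static bool bit47(uint64_t addr)
{
    return (addr >> 47) & 1;
}
```

Every address in the range 0x8000000000000000 through 0x80007fffffffffff has bit 63 set but bit 47 clear, so it is non-canonical and an access through it should fault.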

Fixes: 4dc181599142 ("x86/PV: harden guest memory accesses against speculative abuse")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 months ago x86emul: further correct 64-bit mode zero count repeated string insn handling
Jan Beulich [Mon, 27 Jan 2025 14:23:19 +0000 (15:23 +0100)]
x86emul: further correct 64-bit mode zero count repeated string insn handling

In an entirely different context I came across Linux commit 428e3d08574b
("KVM: x86: Fix zero iterations REP-string"), which points out that
we're still doing things wrong: For one, there's no zero-extension at
all on AMD. And then while RCX is zero-extended from 32 bits uniformly
for all string instructions on newer hardware, RSI/RDI are only for MOVS
and STOS on the systems I have access to. (On an old family 0xf system
I've further found that for REP LODS even RCX is not zero-extended.)

While touching the lines anyway, replace two casts in get_rep_prefix().

Fixes: 79e996a89f69 ("x86emul: correct 64-bit mode repeated string insn handling with zero count")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago iommu/amd: atomically update IRTE
Roger Pau Monne [Mon, 20 Jan 2025 14:48:21 +0000 (15:48 +0100)]
iommu/amd: atomically update IRTE

When using either a 32bit Interrupt Remapping Entry or a 128bit one, update
the entry atomically, using cmpxchg unconditionally as the IOMMU depends on
it.  No longer disable the entry by setting RemapEn = 0 ahead of updating
it.  As a consequence of not toggling RemapEn ahead of the update, the
Interrupt Remapping Table needs to be flushed after the entry update.

This avoids a window where the IRTE has RemapEn = 0, which can lead to
IO_PAGE_FAULT if the underlying interrupt source is not masked.

There's no guidance in the AMD-Vi specification about how an IRTE update
should be performed, as opposed to DTE updating, which has specific guidance.
However DTE updating claims that reads will always be at least 128bits in
size, and hence for the purposes here assume that reads and caching of the
IRTE entries in either 32 or 128 bit format will be done atomically by
the IOMMU.
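The idea of replacing an entry in one step, rather than clearing an enable bit first, can be sketched with C11 atomics. The type and function names are hypothetical, the entry is treated as an opaque 32-bit value, and the 128-bit variant (which needs CMPXCHG16B) is not shown.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 32-bit remapping entry, treated as an opaque value. */
typedef _Atomic uint32_t irte32_t;

/*
 * Swap the whole entry in one atomic step.  Returns false if the entry
 * no longer holds expected_old, in which case nothing is written.
 */
static bool irte32_update(irte32_t *entry, uint32_t expected_old,
                          uint32_t new_val)
{
    return atomic_compare_exchange_strong(entry, &expected_old, new_val);
}
```

An observer sees either the old or the new value, never an intermediate disabled state; per the text above, a flush of the Interrupt Remapping Table is still needed after the update.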

Note that as part of introducing a new raw128 field in the IRTE struct, the
current raw field is renamed to raw64 to explicitly contain the size in the
field name.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code
Teddy Astie [Thu, 18 Apr 2024 11:57:21 +0000 (11:57 +0000)]
iommu/vtd: cleanup MAP_SINGLE_DEVICE and related code

This flag was only used in case CX16 was not available.  As those code paths
no longer exist, this flag now does basically nothing.

Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/iommu: remove non-CX16 logic from DMA remapping
Teddy Astie [Thu, 18 Apr 2024 11:57:20 +0000 (11:57 +0000)]
x86/iommu: remove non-CX16 logic from DMA remapping

As CX16 support is now mandatory for IOMMU usage, the checks for CX16 in
the DMA remapping code are stale.  Remove them together with the associated
code introduced in case CX16 was not available.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago iommu/vtd: remove non-CX16 logic from interrupt remapping
Teddy Astie [Thu, 18 Apr 2024 11:57:21 +0000 (11:57 +0000)]
iommu/vtd: remove non-CX16 logic from interrupt remapping

As CX16 support is now mandatory for IOMMU usage, the checks for CX16 in
the interrupt remapping code are stale.  Remove them together with the
associated code introduced in case CX16 was not available.

Note that AMD-Vi support for atomically updating a 128bit IRTE entry is
still not implemented; it will be done by further changes.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/iommu: check for CMPXCHG16B when enabling IOMMU
Teddy Astie [Fri, 24 Jan 2025 11:31:15 +0000 (12:31 +0100)]
x86/iommu: check for CMPXCHG16B when enabling IOMMU

All hardware with VT-d/AMD-Vi has CMPXCHG16B support. Check this at
initialisation time, and otherwise refuse to use the IOMMU.

If the local APICs support x2APIC mode the IOMMU support for interrupt
remapping will be checked earlier using a specific helper.  If no support
for CX16 is detected by that earlier hook disable the IOMMU at that point
and prevent further poking for CX16 later in the boot process, which would
also fail.

There's a possible corner case when running virtualized, with the underlying
hypervisor exposing an IOMMU but no CMPXCHG16B support.  In that case
ignoring the IOMMU is fine, although the most natural approach would be for
the underlying hypervisor to also expose CMPXCHG16B support if an IOMMU is
available to the VM.

Note this change only introduces the checks, but doesn't remove the now
stale checks for CX16 support sprinkled in the IOMMU code.  Further changes
will take care of that.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Teddy Astie <teddy.astie@vates.tech>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/HVM: correct read/write split at page boundaries
Jan Beulich [Fri, 24 Jan 2025 09:15:56 +0000 (10:15 +0100)]
x86/HVM: correct read/write split at page boundaries

The MMIO cache is intended to have one entry used per independent memory
access that an insn does. This, in particular, is supposed to be
ignoring any page boundary crossing. Therefore when looking up a cache
entry, the access's starting (linear) address is relevant, not the one
possibly advanced past a page boundary.

In order for the same offset-into-buffer variable to be usable in
hvmemul_phys_mmio_access() for both the caller's buffer and the cache
entry's it is further necessary to have the un-adjusted caller buffer
passed into there.

Fixes: 2d527ba310dc ("x86/hvm: split all linear reads and writes at page boundary")
Reported-by: Manuel Andreas <manuel.andreas@tum.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/HVM: allocate emulation cache entries dynamically
Jan Beulich [Fri, 24 Jan 2025 09:15:29 +0000 (10:15 +0100)]
x86/HVM: allocate emulation cache entries dynamically

Both caches may need higher capacity, and the upper bound will need to
be determined dynamically based on CPUID policy (for AMX's TILELOAD /
TILESTORE at least).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/HVM: correct MMIO emulation cache bounds check
Jan Beulich [Thu, 23 Jan 2025 10:14:48 +0000 (11:14 +0100)]
x86/HVM: correct MMIO emulation cache bounds check

To avoid overrunning the internal buffer we need to take the offset into
the buffer into account.
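A bounds check that accounts for the offset might look like the following sketch. The function name is hypothetical and the 64-byte size is borrowed from the subject of the commit being fixed; the real check lives in the HVM emulation code and may be shaped differently.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_BUF_SIZE 64 /* assumed size of the per-entry data buffer */

/* True if [offset, offset + size) lies entirely within the buffer. */
static bool access_fits(size_t offset, size_t size)
{
    /* Subtract rather than add, so offset + size cannot wrap around. */
    return offset <= CACHE_BUF_SIZE && size <= CACHE_BUF_SIZE - offset;
}
```

Checking only `size <= CACHE_BUF_SIZE` would accept a 64-byte access at offset 1, which overruns the buffer by one byte.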

Fixes: d95da91fb497 ("x86/HVM: grow MMIO cache data size to 64 bytes")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 months ago docs: fusa: Fix OFT tags for the design requirements
Ayan Kumar Halder [Tue, 14 Jan 2025 18:57:07 +0000 (18:57 +0000)]
docs: fusa: Fix OFT tags for the design requirements

The OFT tags for the design requirements are updated.

Fixes: b9f9b396452 ("docs: fusa: Add dom0less domain configuration requirements")
Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago automation/cirrus-ci: introduce FreeBSD randconfig builds 4.20.0-rc2
Roger Pau Monne [Thu, 16 Jan 2025 08:06:26 +0000 (09:06 +0100)]
automation/cirrus-ci: introduce FreeBSD randconfig builds

Add a new randconfig job for each FreeBSD version.  This requires some
rework of the template so common parts can be shared between the full and
the randconfig builds.  Such randconfig builds are relevant because FreeBSD
is the only tested system that has a full non-GNU toolchain.

While there replace the usage of the python311 package with python3, which is
already using 3.11, and remove the install of the plain python package for full
builds.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago automation/cirrus-ci: update FreeBSD to 13.4
Roger Pau Monne [Thu, 16 Jan 2025 08:07:31 +0000 (09:07 +0100)]
automation/cirrus-ci: update FreeBSD to 13.4

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago docs/misra: Document ECLAIR extension to Rule 20.7
Nicola Vetrini [Fri, 17 Jan 2025 07:54:39 +0000 (08:54 +0100)]
docs/misra: Document ECLAIR extension to Rule 20.7

MISRA C Rule 20.7 states:
"Expressions resulting from the expansion of macro parameters shall
be enclosed in parentheses".

Document the behaviour of ECLAIR with respect to the CPP extension
that allows variable macro arguments to be named.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago Manual pages: Fix a few typos
Bernhard Kaindl [Fri, 17 Jan 2025 07:54:25 +0000 (08:54 +0100)]
Manual pages: Fix a few typos

While skimming through the manual pages, I spotted a few typos.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago xl: properly dispose of libxl_dominfo struct instances
Jan Beulich [Fri, 17 Jan 2025 07:54:03 +0000 (08:54 +0100)]
xl: properly dispose of libxl_dominfo struct instances

The ssid_label field requires separate freeing; make sure to call
libxl_dominfo_dispose() as well as libxl_dominfo_init(). Since vcpuset()
calls only the former, add a call to the latter there at the same time.

Coverity-ID: 1638727
Coverity-ID: 1638728
Fixes: c458c404da16 ("xl: use libxl_domain_info to get the uuid in printf_info")
Fixes: 48dab9767d2e ("tools/xl: use libxl_domain_info to get domain type for vcpu-pin")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago xl: properly dispose of vTPM struct instance
Jan Beulich [Fri, 17 Jan 2025 07:53:50 +0000 (08:53 +0100)]
xl: properly dispose of vTPM struct instance

The backend_domname field requires separate freeing; make sure to call
libxl_device_vtpm_dispose() also on respective error paths.

Coverity-ID: 1638719
Fixes: dde22055ac3a ("libxl: add vtpm support")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago xentrace: free CPU mask string before overwriting pointer
Jan Beulich [Fri, 17 Jan 2025 07:53:27 +0000 (08:53 +0100)]
xentrace: free CPU mask string before overwriting pointer

While multiple -c options may be unexpected, we'd still better deal with
them properly.
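The leak pattern and its fix, as a stand-alone sketch. `cpu_mask_str` and `set_cpu_mask()` are hypothetical names; xentrace's actual option handling is not reproduced here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical storage for the most recent -c argument. */
static char *cpu_mask_str;

/* Record a copy of arg, releasing any copy kept from an earlier option. */
static int set_cpu_mask(const char *arg)
{
    size_t len = strlen(arg) + 1;
    char *copy = malloc(len);

    if (!copy)
        return -1;
    memcpy(copy, arg, len);

    free(cpu_mask_str); /* free(NULL) is a no-op on the first call */
    cpu_mask_str = copy;
    return 0;
}
```

Without the `free()`, a second -c option would overwrite the pointer and leak the first string.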

Also restore the blank line that was bogusly zapped by the same commit.

Coverity-ID: 1638723
Fixes: e4ad2836842a ("xentrace: Implement cpu mask range parsing of human values (-c)")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago docs/misc: Fix a few typos
Bernhard Kaindl [Wed, 15 Jan 2025 15:09:04 +0000 (16:09 +0100)]
docs/misc: Fix a few typos

While skimming through the misc docs, I spotted a few typos.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago docs: Fix some typos in the design docs
Bernhard Kaindl [Wed, 15 Jan 2025 13:44:55 +0000 (14:44 +0100)]
docs: Fix some typos in the design docs

Skimming through the design docs, I saw some typos that needed fixing.

Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago xen/ppc: Fix double xen_ulong_t typedef in public/arch-ppc.h
Andrew Cooper [Wed, 15 Jan 2025 14:22:21 +0000 (14:22 +0000)]
xen/ppc: Fix double xen_ulong_t typedef in public/arch-ppc.h

public/arch-ppc.h contains two adjacent #ifndef __ASSEMBLY__ blocks.

With these merged, it becomes very obvious that there's a duplicate
definition of xen_ulong_t, which is also noticed by the docs build:

  /usr/bin/perl -w /local/xen.git/docs/xen-headers -O html/hypercall/ppc \
          -T 'arch-ppc - Xen public headers' \
          -X arch-arm -X arch-riscv -X arch-x86_32 -X arch-x86_64 \
          -X xen-arm -X xen-riscv -X xen-x86_32 -X xen-x86_64 \
          -X arch-x86 \
          /local/xen.git/docs/../xen include/public include/xen/errno.h
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:55
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:61
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:61
  include/public/memory.h:63: multiple definitions of Typedef xen_ulong_t: include/public/arch-ppc.h:55

Drop the second typedef.  Finally, annotate the #endif so it's clear
what it refers to.

Fixes: 08c192cc1127 ("xen/ppc: Add public/arch-ppc.h")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Shawn Anastasio <sanastasio@raptorengineering.com>
3 months ago docs/sphinx: gitignore generated files
Yann Dirson [Wed, 15 Jan 2025 12:27:56 +0000 (12:27 +0000)]
docs/sphinx: gitignore generated files

Signed-off-by: Yann Dirson <yann.dirson@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago docs: rationalise .gitignore
Yann Dirson [Wed, 15 Jan 2025 12:27:56 +0000 (12:27 +0000)]
docs: rationalise .gitignore

Note I did not transplant the patterns under doc/txt/ (since the whole
dir is ignored already), and adjusted sort order to be fully
alphabetical.

Signed-off-by: Yann Dirson <yann.dirson@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago docs/sphinx: import sys for error reporting
Yann Dirson [Wed, 15 Jan 2025 12:27:56 +0000 (12:27 +0000)]
docs/sphinx: import sys for error reporting

Signed-off-by: Yann Dirson <yann.dirson@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago automation/gitlab: disable coverage from clang randconfig
Roger Pau Monne [Tue, 14 Jan 2025 14:10:14 +0000 (15:10 +0100)]
automation/gitlab: disable coverage from clang randconfig

If randconfig enables coverage support, the build times out due to GNU LD
taking too long.  For the time being prevent coverage from being enabled in
the clang randconfig job.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/time: prefer CMOS over EFI_GET_TIME
Roger Pau Monne [Mon, 2 Sep 2024 14:00:19 +0000 (16:00 +0200)]
x86/time: prefer CMOS over EFI_GET_TIME

The EFI_GET_TIME implementation is well known to be broken in many firmware
implementations.  For Xen the result on such implementations is:

----[ Xen-4.19-unstable  x86_64  debug=y  Tainted:   C    ]----
CPU:    0
RIP:    e008:[<0000000062ccfa70>] 0000000062ccfa70
[...]
Xen call trace:
   [<0000000062ccfa70>] R 0000000062ccfa70
   [<00000000732e9a3f>] S 00000000732e9a3f
   [<ffff82d04034f34f>] F arch/x86/time.c#get_cmos_time+0x1b3/0x26e
   [<ffff82d04045926f>] F init_xen_time+0x28/0xa4
   [<ffff82d040454bc4>] F __start_xen+0x1ee7/0x2578
   [<ffff82d040203334>] F __high_start+0x94/0xa0

Pagetable walk from 0000000062ccfa70:
 L4[0x000] = 000000207ef1c063 ffffffffffffffff
 L3[0x001] = 000000005d6c0063 ffffffffffffffff
 L2[0x116] = 8000000062c001e3 ffffffffffffffff (PSE)

****************************************
Panic on CPU 0:
FATAL PAGE FAULT
[error_code=0011]
Faulting linear address: 0000000062ccfa70
****************************************

Swap the preference to default to CMOS first, and EFI later, in an attempt to
use EFI_GET_TIME as a last resort option only.  Note that Linux for example
doesn't allow calling the get_time method, and instead provides a dummy handler
that unconditionally returns EFI_UNSUPPORTED on x86-64.

Such change in the preferences requires some re-arranging of the function
logic, so that panic messages with workaround suggestions are suitably printed.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago x86/time: introduce command line option to select wallclock
Roger Pau Monne [Mon, 2 Sep 2024 15:51:33 +0000 (17:51 +0200)]
x86/time: introduce command line option to select wallclock

Allow setting the used wallclock from the command line.  When the option is set
to a value different than `auto` the probing is bypassed and the selected
implementation is used (as long as it's available).

The `xen` and `efi` options require being booted as a Xen guest (with Xen guest
support built in) or from UEFI firmware respectively.
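How such an option could be parsed is sketched below. The enum and function names are hypothetical, and the `cmos` value is an assumption based on CMOS being one of the wallclock sources; only `auto`, `xen` and `efi` are named in the text above.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical set of wallclock sources. */
enum wallclock_src {
    WC_AUTO,    /* probe for a usable source */
    WC_XEN,     /* needs to run as a Xen guest */
    WC_CMOS,    /* assumed value, see lead-in */
    WC_EFI,     /* needs to be booted from UEFI firmware */
    WC_INVALID,
};

/* Map the option string to a source; unknown strings are rejected. */
static enum wallclock_src parse_wallclock(const char *s)
{
    if (!strcmp(s, "auto"))
        return WC_AUTO;
    if (!strcmp(s, "xen"))
        return WC_XEN;
    if (!strcmp(s, "cmos"))
        return WC_CMOS;
    if (!strcmp(s, "efi"))
        return WC_EFI;
    return WC_INVALID;
}

/* Probing only happens when no specific source was requested. */
static bool needs_probe(enum wallclock_src src)
{
    return src == WC_AUTO;
}
```

A real implementation would additionally verify that the selected source is available before using it, as the commit message notes.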

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago automation/eclair: make Misra rule 20.7 blocking
Roger Pau Monne [Tue, 14 Jan 2025 11:08:22 +0000 (12:08 +0100)]
automation/eclair: make Misra rule 20.7 blocking

There are no violations left, so make the rule globally blocking for both x86
and ARM.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months ago docs: Improve spelling of few cases in the documentation
Bernhard Kaindl [Wed, 15 Jan 2025 15:01:39 +0000 (16:01 +0100)]
docs: Improve spelling of few cases in the documentation

Skimming the docs, I came across a few places for spelling improvements.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months ago MAINTAINERS: Change reviewer of the ECLAIR integration
Nicola Vetrini [Wed, 15 Jan 2025 15:01:25 +0000 (16:01 +0100)]
MAINTAINERS: Change reviewer of the ECLAIR integration

Simone Ballarin is no longer actively involved in reviewing
the ECLAIR integration for Xen. I am stepping up as a reviewer.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Simone Ballarin <simone.ballarin@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago misra: add deviation for MISRA C Rule R11.8
Alessandro Zucchelli [Wed, 15 Jan 2025 15:01:13 +0000 (16:01 +0100)]
misra: add deviation for MISRA C Rule R11.8

Rule 11.8 states the following: "A cast shall not remove any `const' or
`volatile' qualification from the type pointed to by a pointer".

Function `__hvm_copy' in `xen/arch/x86/hvm/hvm.c' is a double-use
function, where the parameter needs to not be const because it can be
set for write or not. As it was decided a new const-only function will
lead to more developer confusion than it's worth, this violation is
addressed by deviating the function.
All cases of casting away const-ness are accompanied by a comment
explaining why it is safe given the other flags passed in; this comment is
used by the deviation in order to match the appropriate function call.

No functional change.

Signed-off-by: Alessandro Zucchelli <alessandro.zucchelli@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months ago x86: Add Support for Paging-Write Feature
Petr Beneš [Thu, 2 Jan 2025 17:13:28 +0000 (17:13 +0000)]
x86: Add Support for Paging-Write Feature

This patch introduces a new XENMEM_access_r_pw permission.
Functionally, it is similar to XENMEM_access_r, but for processors
with TERTIARY_EXEC_EPT_PAGING_WRITE support (Intel 12th Gen/Alder Lake
and later, Xeon 4th Gen/Sapphire Rapids and later), it also permits the
CPU to write to the page during guest page-table walks (e.g., updating
A/D bits) without triggering an EPT violation.

This behavior works by both enabling the EPT paging-write feature and
setting the EPT paging-write flag in the EPT leaf entry.

This feature provides a significant performance boost for
introspection tools that monitor guest page-table updates. Previously,
every page-table modification by the guest—including routine updates
like setting A/D bits—triggered an EPT violation, adding unnecessary
overhead. The new XENMEM_access_r_pw permission allows these
"uninteresting" updates to occur without EPT violations, improving
efficiency.

Additionally, this feature simplifies the handling of race conditions
in scenarios where an introspection tool:

- Sets an "invisible breakpoint" in the altp2m view for a function F.
- Monitors guest page-table updates to track whether the page
  containing F is paged out.
- Encounters a cleared Access (A) bit on the page containing F while
  the guest is about to execute the breakpoint.

In the current implementation:

- If xc_monitor_inguest_pagefault() is enabled, the introspection tool
  must emulate both the breakpoint and the setting of the Access bit.
- If xc_monitor_inguest_pagefault() is disabled, Xen handles the EPT
  violation without notifying the introspection tool, setting the
  Access bit and emulating the instruction. However, Xen fetches the
  instruction from the default view instead of the altp2m view,
  potentially causing the breakpoint to be missed.

With this patch, setting XENMEM_access_r_pw for monitored guest
page-tables prevents EPT violations in these cases. This change
enhances performance and reduces complexity for introspection tools,
ensuring seamless breakpoint handling while tracking guest page-table
updates.

Signed-off-by: Petr Beneš <w1benny@gmail.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months agox86: Rename _rsvd field to pw and move it to the bit 58
Petr Beneš [Thu, 2 Jan 2025 17:13:27 +0000 (17:13 +0000)]
x86: Rename _rsvd field to pw and move it to the bit 58

The EPT Paging-write feature (when enabled by the
TERTIARY_EXEC_EPT_PAGING_WRITE bit) uses bit 58 of the EPT entry to
indicate that guest paging may update the page, even if the W access
is not set.

This patch is a preparation for the EPT Paging-write feature.

Signed-off-by: Petr Beneš <w1benny@gmail.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 months agobuild: Set DATE to SOURCE_DATE_EPOCH if available
Maximilian Engelhardt [Mon, 30 Dec 2024 21:00:30 +0000 (22:00 +0100)]
build: Set DATE to SOURCE_DATE_EPOCH if available

Use the solution described in [1] to provide a wrapper to the 'date'
command that uses SOURCE_DATE_EPOCH if available.  This is needed for
reproducible builds.

The -d "@..." syntax was introduced in GNU date about 2005 (but only
added to the documentation in 2011), so I assume a version supporting
this syntax is available, if SOURCE_DATE_EPOCH is defined. If
SOURCE_DATE_EPOCH is not defined, nothing changes with respect to the
current behavior.
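
The idea behind the wrapper can be sketched in Python as follows. This is
only an analogue of the 'date' wrapper, not the build-system code itself;
the function name is hypothetical and the use of UTC is an assumption.

```python
import os
from datetime import datetime, timezone


def build_date(environ=None) -> datetime:
    """Return the timestamp a reproducible build should embed."""
    environ = os.environ if environ is None else environ
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch:
        # Reproducible case: the timestamp is fixed by the environment.
        return datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    # Unchanged behaviour: fall back to the current time.
    return datetime.now(tz=timezone.utc)
```

Passing the environment as a parameter keeps the sketch easy to exercise
without touching the real process environment.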

Update all users of 'date' in the tree to use the new wrapper.

[1] https://reproducible-builds.org/docs/source-date-epoch/

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months agodocs/Makefile: Add ppc and riscv to DOC_ARCHES
Maximilian Engelhardt [Fri, 10 Jan 2025 21:19:03 +0000 (22:19 +0100)]
docs/Makefile: Add ppc and riscv to DOC_ARCHES

Not having ppc and riscv included in DOC_ARCHES causes "multiple
definitions of ..." messages during the documentation build, similar to the
example shown below:

include/public/arch-ppc.h:91: multiple definitions of Typedef
vcpu_guest_core_regs_t: include/public/arch-arm.h:300
include/public/arch-ppc.h:91: multiple definitions of Typedef
vcpu_guest_core_regs_t: include/public/arch-ppc.h:85

It can also make the generated html documentation link to the header
files of another architecture. This is additionally a problem as it can
randomly make the documentation build non-reproducible.

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months agoCI: Add an x86_64 Clang Randconfig job
Andrew Cooper [Fri, 10 Jan 2025 16:02:17 +0000 (16:02 +0000)]
CI: Add an x86_64 Clang Randconfig job

This was recently identified as a hole in testing.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
3 months agoUpdate Xen version to 4.20-rc 4.20.0-rc1
Andrew Cooper [Thu, 9 Jan 2025 15:06:34 +0000 (15:06 +0000)]
Update Xen version to 4.20-rc

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoConfig.mk: Pin QEMU_UPSTREAM_REVISION
Andrew Cooper [Thu, 9 Jan 2025 15:10:01 +0000 (15:10 +0000)]
Config.mk: Pin QEMU_UPSTREAM_REVISION

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@vates.tech>
Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoxen/arm: Fully initialise struct membanks_hdr fields
Luca Fancellu [Thu, 9 Jan 2025 13:02:04 +0000 (13:02 +0000)]
xen/arm: Fully initialise struct membanks_hdr fields

Commit a14593e3995a ("xen/device-tree: Allow region overlapping with
/memreserve/ ranges") introduced a type in the 'struct membanks_hdr'
but forgot to update the 'struct kernel_info' initialiser. This
doesn't lead to failures, because the field is not currently used
while managing kernel_info structures, but it's good to have it
for completeness.

There are other instances of structures using 'struct membanks_hdr'
that are dynamically allocated and don't fully initialise these
fields; provide a static inline helper for that.

Fixes: a14593e3995a ("xen/device-tree: Allow region overlapping with /memreserve/ ranges")
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agodocs: fusa: Add dom0less domain configuration requirements
Michal Orzel [Wed, 8 Jan 2025 17:03:04 +0000 (17:03 +0000)]
docs: fusa: Add dom0less domain configuration requirements

Add requirements for dom0less domain creation.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
3 months agoxen/events: fix race with set_global_virq_handler()
Juergen Gross [Thu, 9 Jan 2025 16:34:01 +0000 (17:34 +0100)]
xen/events: fix race with set_global_virq_handler()

There is a possible race scenario between set_global_virq_handler()
and clear_global_virq_handlers() targeting the same domain, which
might result in that domain ending up as a zombie domain.

In case set_global_virq_handler() is being called for a domain which
is just dying, it might happen that clear_global_virq_handlers() is
running first, resulting in set_global_virq_handler() taking a new
reference for that domain and entering it in the global_virq_handlers[]
array afterwards. The reference will never be dropped, thus the domain
will never be freed completely.

This can be fixed by checking the is_dying state of the domain inside
the region guarded by global_virq_handlers_lock. In case the domain is
dying, handle it as if the domain didn't exist, which will be the
case in the near future anyway.

Fixes: 87521589aa6a ("xen: allow global VIRQ handlers to be delegated to other domains")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 months agoxen/arm: ffa: fix build with clang
Stewart Hildebrand [Thu, 9 Jan 2025 16:33:14 +0000 (17:33 +0100)]
xen/arm: ffa: fix build with clang

Clang 16 reports:

In file included from arch/arm/tee/ffa.c:72:
arch/arm/tee/ffa_private.h:329:17: error: 'used' attribute ignored on a non-definition declaration [-Werror,-Wignored-attributes]
extern uint32_t __ro_after_init ffa_fw_version;
                ^

The variable ffa_fw_version is only used in ffa.c. Remove the
declaration in the header and make the definition in ffa.c static.

Fixes: 2f9f240a5e87 ("xen/arm: ffa: Fine granular call support")
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
3 months agoCI: Update Fedora to 41
Andrew Cooper [Wed, 8 Jan 2025 12:05:38 +0000 (12:05 +0000)]
CI: Update Fedora to 41

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months agoxen/arm64: Drop relocate_and_switch_ttbr() stub
Michal Orzel [Wed, 8 Jan 2025 07:57:19 +0000 (08:57 +0100)]
xen/arm64: Drop relocate_and_switch_ttbr() stub

In the original patch e7a80636f16e ("xen/arm: add cache coloring support
for Xen image"), the stub was added under the wrong assumption that DCE
won't remove the function call if it's not static. This assumption is
incorrect as we already rely on DCE for cases like this one. Therefore
drop the stub, which would otherwise be a place potentially prone to
errors in the future.

Suggested-by: Julien Grall <julien@xen.org>
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
3 months agoxen/flask: Wire up XEN_DOMCTL_set_llc_colors
Michal Orzel [Tue, 7 Jan 2025 09:27:19 +0000 (10:27 +0100)]
xen/flask: Wire up XEN_DOMCTL_set_llc_colors

Addition of FLASK permission for this hypercall was overlooked in the
original patch. Fix it. Setting LLC colors is only possible during domain
creation.

Fixes: 6985aa5e0c3c ("xen: extend domctl interface for cache coloring")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
3 months agoxen/flask: Wire up XEN_DOMCTL_dt_overlay
Michal Orzel [Tue, 7 Jan 2025 09:27:18 +0000 (10:27 +0100)]
xen/flask: Wire up XEN_DOMCTL_dt_overlay

Addition of FLASK permission for this hypercall was overlooked in the
original patch. Fix it. The only dt overlay operation is attaching, which can
happen only after the domain is created. Dom0 can attach an overlay to itself
as well.

Fixes: 4c733873b5c2 ("xen/arm: Add XEN_DOMCTL_dt_overlay and device attachment to domains")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
3 months agoxen/flask: Wire up XEN_DOMCTL_vuart_op
Michal Orzel [Tue, 7 Jan 2025 09:27:17 +0000 (10:27 +0100)]
xen/flask: Wire up XEN_DOMCTL_vuart_op

Addition of FLASK permission for this hypercall was overlooked in the
original patch. Fix it. The only VUART operation is initialization, which
can occur only during domain creation.

Fixes: 86039f2e8c20 ("xen/arm: vpl011: Add a new domctl API to initialize vpl011")
Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
3 months agox86emul: correct put_fpu()'s segment selector handling
Jan Beulich [Wed, 8 Jan 2025 10:02:16 +0000 (11:02 +0100)]
x86emul: correct put_fpu()'s segment selector handling

All selector fields under ctxt->regs are (normally) poisoned in the HVM
case, and the four ones besides CS and SS are potentially stale for PV.
Avoid using them in the hypervisor incarnation of the emulator, when
trying to cover for a missing ->read_segment() hook.

To make sure there's always a valid ->read_segment() handler for all HVM
cases, add a respective function to shadow code, even if it is not
expected for FPU insns to be used to update page tables.

Fixes: 0711b59b858a ("x86emul: correct FPU code/data pointers and opcode handling")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months agox86emul: VCVT{,U}DQ2PD ignores embedded rounding
Jan Beulich [Wed, 8 Jan 2025 10:01:17 +0000 (11:01 +0100)]
x86emul: VCVT{,U}DQ2PD ignores embedded rounding

IOW we shouldn't raise #UD in that case. Be on the safe side though and
only encode fully legitimate forms into the stub to be executed.

Things weren't quite right for VCVT{,U}SI2SD either, in the attempt to
be on the safe side: Clearing EVEX.L'L isn't useful; it's EVEX.b which
primarily needs clearing. Also reflect the somewhat improved doc
situation in the comment there.

Fixes: ed806f373730 ("x86emul: support AVX512F legacy-equivalent packed int/FP conversion insns")
Fixes: baf4a376f550 ("x86emul: support AVX512F legacy-equivalent scalar int/FP conversion insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 months agolibxl: drop setting XEN_QEMU_CONSOLE_LIMIT in the environment (XSA-180 / CVE-2014...
James Dingwall [Wed, 8 Jan 2025 10:00:54 +0000 (11:00 +0100)]
libxl: drop setting XEN_QEMU_CONSOLE_LIMIT in the environment (XSA-180 / CVE-2014-3672)

The corresponding code in the Xen qemu repository was not applied from
qemu-xen-4.18.0.

Signed-off-by: James Dingwall <james@dingwall.me.uk>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
3 months agoxen/perfc: Cleanup
Andrew Cooper [Sun, 29 Dec 2024 19:36:34 +0000 (19:36 +0000)]
xen/perfc: Cleanup

 * Strip trailing whitespace.
 * Remove PRIperfc.  It has never been used and isn't useful.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months agoxen/perfc: Trim includes
Andrew Cooper [Sun, 29 Dec 2024 18:01:34 +0000 (18:01 +0000)]
xen/perfc: Trim includes

This is mostly for the removal of xen/lib.h and xen/smp.h from perfc.h.  All
that is needed is xen/macros.h.

Trim and sort the includes for perfc.c too.  There's no need for smp.h,
keyhandler.h or mm.h, but cpumask.h is needed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months agoxen/perfc: Add perfc_defn.h to asm-generic
Andrew Cooper [Sun, 29 Dec 2024 18:18:22 +0000 (18:18 +0000)]
xen/perfc: Add perfc_defn.h to asm-generic

... and hook it up for RISC-V and PPC.

On RISC-V at least, no combination of headers pulls in errno.h, so include it
explicitly.

Guard the hypercalls array declaration based on NR_hypercalls existing.  This
is sufficient to get PERF_COUNTERS fully working on RISC-V and PPC, so drop
the randconfig override.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
3 months agoxen/perfc: Drop arch_perfc_{gather,reset}()
Andrew Cooper [Sun, 29 Dec 2024 18:31:32 +0000 (18:31 +0000)]
xen/perfc: Drop arch_perfc_{gather,reset}()

These were only ever used by the IA64 port, which was dropped in commit
570c311ca2c7 ("remove ia64") in 2012.

Remove them, and clean up the arm/x86 stub headers.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 months agox86/amd: Misc setup for Fam1Ah processors
Andrew Cooper [Tue, 31 Dec 2024 14:15:22 +0000 (14:15 +0000)]
x86/amd: Misc setup for Fam1Ah processors

Fam1Ah is similar to Fam19h in these regards.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 months agox86/pv: Fix build with Clang and CONFIG_PERF_COUNTERS
Andrew Cooper [Thu, 2 Jan 2025 19:46:19 +0000 (19:46 +0000)]
x86/pv: Fix build with Clang and CONFIG_PERF_COUNTERS

Clang, of at least version 17, complains:

  arch/x86/pv/hypercall.c:30:10: error: variable 'eax' is used uninitialized
  whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
     30 |     if ( !compat )
        |          ^~~~~~~
  arch/x86/pv/hypercall.c:87:29: note: uninitialized use occurs here
     87 |     perfc_incra(hypercalls, eax);
        |                             ^~~

This function is forced always_inline to cause compat to be
constant-propagated through, but that is only a heuristic to try and get the
compiler to do what we want, not a guarantee that it does.

Clang doesn't appear to be able to see that the only case where compat is
true (and therefore the if() is false) is when there's an else clause on the
end which sets eax too.

Initialise eax to -1, which ought to be optimised out, but if for whatever
reason it happens not to be, then perfc_incra() will fail its bounds check
and do nothing.
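
The reason -1 is a safe initialiser can be shown with a small Python model
of the bounds check. It assumes the index is treated as an unsigned 64-bit
value, as an unsigned long would be on x86_64; the counter list and the
function below are hypothetical stand-ins for the real macro.

```python
ULONG_MASK = (1 << 64) - 1  # models conversion to a 64-bit unsigned long


def perfc_incra(counters: list, index: int) -> None:
    """Increment counters[index], silently ignoring out-of-range indices."""
    index &= ULONG_MASK          # -1 becomes the largest unsigned value
    if index < len(counters):    # the bounds check
        counters[index] += 1
```

With this model, an index of -1 wraps to a huge value, fails the check and
leaves every counter untouched, while a valid index is counted as usual.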

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 months agox86/traps: Rework LER initialisation and support Zen5/Diamond Rapids
Andrew Cooper [Tue, 31 Dec 2024 14:06:19 +0000 (14:06 +0000)]
x86/traps: Rework LER initialisation and support Zen5/Diamond Rapids

AMD have always used the architectural MSRs for LER.  As the first processor
to support LER was the K7 (which was 32bit), we can assume its presence
unconditionally in 64bit mode.

Intel are about to run out of space in Family 6 and start using 19.  It is
only the Pentium 4 which uses non-architectural LER MSRs.

percpu_traps_init(), which runs on every CPU, contains a lot of code which
should be init-only, and is the only reason why opt_ler can't be in initdata.

Write a brand new init_ler() which expects all future Intel and AMD CPUs to
continue using the architectural MSRs, and does all setup together.  Call it
from trap_init(), and remove the setup logic from percpu_traps_init() except for
the single path configuring MSR_IA32_DEBUGCTLMSR.

Leave behind a warning if the user asked for LER and Xen couldn't enable it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 months agoeclair-analysis: tidy toolchain.ecl configuration and mark Rule 1.1 clean
Nicola Vetrini [Sun, 22 Dec 2024 14:04:08 +0000 (15:04 +0100)]
eclair-analysis: tidy toolchain.ecl configuration and mark Rule 1.1 clean

Reformat the list of GNU extensions and non-standard tokens used by Xen
in the ECLAIR configuration to make it easier to review any changes to it.

The extension "ext_missing_varargs_arg", which captures the GNU extension that
allows variadic functions and macros not to require at least one named parameter
before C23, has been renamed to "ext_c_missing_varargs_arg" in the current version
of ECLAIR used in CI, therefore this resolves regressions on MISRA C Rule 1.1:

"The program shall contain no violations of the standard C syntax and constraints,
and shall not exceed the implementation's translation limits."

As a result, Rule 1.1 now has no violations and is tagged as such.

Remove two unused configurations, that were already commented out.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Fixes: 631f535a3d4f ("xen: update ECLAIR service identifiers from MC3R1 to MC3A2.")
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 months agoxen/scripts: Fix regex syntax warnings with Python 3.12
Ariel Otilibili [Thu, 19 Dec 2024 18:10:43 +0000 (19:10 +0100)]
xen/scripts: Fix regex syntax warnings with Python 3.12

Same fix as commit 826a9eb072 (tools: Fix regex syntax warnings with Python 3.12).

It clears out the warning:

```
$ xen/scripts/xen-analysis.py
xen/scripts/xen_analysis/cppcheck_analysis.py:94: SyntaxWarning: invalid escape sequence '\*'
  comment_line_starts = re.match('^[ \t]*/\*.*$', line)
```

The warning appears only the first time the command is run, then it disappears.
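
For reference, the usual way to silence this class of warning is a raw
string, which passes the backslashes through to the regex engine unchanged.
The sketch below assumes that is the shape of the fix; the helper name is
hypothetical. The warning is emitted when the source file is compiled, so
it does not show up again once the bytecode has been cached.

```python
import re

# Raw string: '\*' reaches the regex engine as an escaped asterisk, and
# '\t' is interpreted by the regex engine as a tab character.
COMMENT_START = r'^[ \t]*/\*.*$'


def is_comment_start(line: str) -> bool:
    """True if the line begins (after optional blanks) with '/*'."""
    return re.match(COMMENT_START, line) is not None
```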

Fixes: 02b26c02c7 (xen/scripts: add cppcheck tool to the xen-analysis.py script)
Signed-off-by: Ariel Otilibili <Ariel.Otilibili-Anieli@eurecom.fr>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
--
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Anthony PERARD <anthony.perard@vates.tech>
Cc: Michal Orzel <michal.orzel@amd.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien@xen.org>
Cc: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
4 months agox86/spec-ctrl: Support for SRSO_U/S_NO and SRSO_MSR_FIX
Andrew Cooper [Mon, 25 Mar 2024 15:14:46 +0000 (15:14 +0000)]
x86/spec-ctrl: Support for SRSO_U/S_NO and SRSO_MSR_FIX

AMD have updated the SRSO whitepaper[1] with further information.  These
features exist on AMD Zen5 CPUs and are necessary for Xen to use.

The two features are in principle unrelated:

 * SRSO_U/S_NO is an enumeration saying that SRSO attacks can't cross the
   User(CPL3) / Supervisor(CPL<3) boundary.  i.e. Xen don't need to use
   IBPB-on-entry for PV64.  PV32 guests are explicitly unsupported for
   speculative issues, and excluded from consideration for simplicity.

 * SRSO_MSR_FIX is an enumeration identifying that the BP_SPEC_REDUCE bit is
   available in MSR_BP_CFG.  When set, SRSO attacks can't cross the host/guest
   boundary.  i.e. Xen don't need to use IBPB-on-entry for HVM.

Extend ibpb_calculations() to account for these when calculating
opt_ibpb_entry_{pv,hvm} defaults.  Add a `bp-spec-reduce=<bool>` option to
control the use of BP_SPEC_REDUCE, with it active by default.
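
A much simplified model of how the two enumerations feed into the defaults
is sketched below. It assumes a CPU that would otherwise need IBPB-on-entry
for both guest types and ignores every other input the real
ibpb_calculations() considers; the function and parameter names are
hypothetical.

```python
def ibpb_entry_defaults(srso_us_no: bool, srso_msr_fix: bool,
                        bp_spec_reduce: bool = True) -> dict:
    """Return the default IBPB-on-entry setting for PV and HVM guests."""
    # BP_SPEC_REDUCE can only be used when SRSO_MSR_FIX is enumerated, and
    # only helps when the bp-spec-reduce option leaves it active.
    reduce_active = srso_msr_fix and bp_spec_reduce
    return {
        # SRSO_U/S_NO: attacks cannot cross the user/supervisor boundary.
        "pv": not srso_us_no,
        # BP_SPEC_REDUCE: attacks cannot cross the host/guest boundary.
        "hvm": not reduce_active,
    }
```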

Because MSR_BP_CFG is core-scoped with a race condition updating it, repurpose
amd_check_erratum_1485() into amd_check_bp_cfg() and calculate all updates at
once.

Xen also needs to advertise SRSO_U/S_NO to guests to allow the guest kernel
to skip SRSO mitigations too:

 * This is trivial for HVM guests.  It is also accurate for PV32 guests
   too, but we have already excluded them from consideration, and do so again
   here to simplify the policy logic.

 * As written, SRSO_U/S_NO does not help for the PV64 user->kernel boundary.
   However, after discussing with AMD, an implementation detail of having
   BP_SPEC_REDUCE active causes the PV64 user->kernel boundary to have the
   property described by SRSO_U/S_NO, so we can advertise SRSO_U/S_NO to
   guests when the BP_SPEC_REDUCE precondition is met.

Finally, fix a typo in the SRSO_NO's comment.

[1] https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 months agoxen/arch/x86: make objdump output user locale agnostic
Maximilian Engelhardt [Mon, 30 Dec 2024 21:00:31 +0000 (22:00 +0100)]
xen/arch/x86: make objdump output user locale agnostic

The objdump output is fed to grep, so make sure it doesn't change with
different user locales and break the grep parsing.
This problem was identified while updating xen in Debian and the fix is
needed for generating reproducible builds in varying environments.
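
The same technique, expressed in Python for a script that parses tool
output: force the C locale for the child process only. Using LC_ALL=C is an
assumption about the shape of the fix, and the helper names are
hypothetical.

```python
import os
import subprocess


def c_locale_env(base=None) -> dict:
    """Copy the environment and force untranslated, C-locale output."""
    env = dict(os.environ if base is None else base)
    env["LC_ALL"] = "C"  # LC_ALL overrides LANG and the other LC_* variables
    return env


def run_objdump(args: list) -> str:
    """Run objdump with a fixed locale so its output can be parsed."""
    result = subprocess.run(["objdump", *args], env=c_locale_env(),
                            capture_output=True, text=True, check=True)
    return result.stdout
```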

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 months agotools: fix typo: subsytem -> subsystem
Maximilian Engelhardt [Mon, 30 Dec 2024 21:00:33 +0000 (22:00 +0100)]
tools: fix typo: subsytem -> subsystem

This was found by the lintian tool (Debian package checker) during
packaging xen for Debian.

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 months agodocs/man: fix typo: hexidecimal -> hexadecimal
Maximilian Engelhardt [Mon, 30 Dec 2024 21:00:32 +0000 (22:00 +0100)]
docs/man: fix typo: hexidecimal -> hexadecimal

This was found by the lintian tool (Debian package checker) during
packaging xen for Debian.

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 months agodocs/man/xen-vbd-interface.7: Provide properly-formatted NAME section
Ian Jackson [Mon, 30 Dec 2024 21:00:29 +0000 (22:00 +0100)]
docs/man/xen-vbd-interface.7: Provide properly-formatted NAME section

This manpage was omitted from
   docs/man: Provide properly-formatted NAME sections
   (423c4def1f7a01eeff56fa70564180640ef3af43)
because I was previously building with markdown not installed.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Tested-by: Maximilian Engelhardt <maxi@daemonizer.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 months agoCHANGELOG: Mention LLC coloring feature on Arm
Michal Orzel [Fri, 20 Dec 2024 08:19:40 +0000 (09:19 +0100)]
CHANGELOG: Mention LLC coloring feature on Arm

It's definitely worth mentioning as one of the most notable features on
Arm this release.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
4 months agotools: Introduce a xc_xenver_buildid() wrapper
Andrew Cooper [Tue, 17 Jan 2023 12:52:01 +0000 (12:52 +0000)]
tools: Introduce a xc_xenver_buildid() wrapper

... which converts binary content to hex automatically.

Update libxl to match.  No API/ABI change.

This removes a latent libxl bug for cases when the buildid is longer than 4092
bytes.
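
The conversion itself is simple; what matters is that the output is sized
from the input instead of a fixed buffer. A Python sketch of the idea, with
a hypothetical function name:

```python
def buildid_to_hex(raw: bytes) -> str:
    """Render a binary build ID as lowercase hex, two digits per byte."""
    return raw.hex()
```

The result is always exactly twice as long as the input, so no length
limit has to be baked into the caller.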

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
4 months agotools: Introduce a non-truncating xc_xenver_cmdline()
Andrew Cooper [Tue, 17 Jan 2023 12:47:44 +0000 (12:47 +0000)]
tools: Introduce a non-truncating xc_xenver_cmdline()

Update libxl to match.  No API/ABI change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
4 months agotools: Introduce a non-truncating xc_xenver_changeset()
Andrew Cooper [Tue, 17 Jan 2023 12:45:37 +0000 (12:45 +0000)]
tools: Introduce a non-truncating xc_xenver_changeset()

Update libxl and the ocaml stubs to match.  No API/ABI change in either.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 months agotools: Introduce a non-truncating xc_xenver_capabilities()
Andrew Cooper [Tue, 17 Jan 2023 12:39:48 +0000 (12:39 +0000)]
tools: Introduce a non-truncating xc_xenver_capabilities()

Update libxl and the ocaml stubs to match.  No API/ABI change in either.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 months agotools: Introduce a non-truncating xc_xenver_extraversion()
Andrew Cooper [Mon, 16 Jan 2023 16:56:17 +0000 (16:56 +0000)]
tools: Introduce a non-truncating xc_xenver_extraversion()

... which uses XENVER_extraversion2.

In order to do this sensibly, use manual hypercall buffer handling.  Not only
does this avoid an extra bounce buffer (we need to strip the xen_varbuf_t
header anyway), it's also shorter and easier to follow.

Update libxl and the ocaml stubs to match.  No API/ABI change in either.

With this change made, `xl info` can now correctly access a >15 char
extraversion:

  # xl info xen_version
  4.18-unstable+REALLY LONG EXTRAVERSION

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 months agotools/libxc: Move xc_version() out of xc_private.c into its own file
Andrew Cooper [Mon, 16 Jan 2023 14:40:07 +0000 (14:40 +0000)]
tools/libxc: Move xc_version() out of xc_private.c into its own file

kexec-tools uses xc_version(), meaning that it is not a private API.  As we're
going to extend the functionality substantially, move it to its own file.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
4 months agoxen/version: Misc style fixes
Andrew Cooper [Tue, 20 Dec 2022 16:45:23 +0000 (16:45 +0000)]
xen/version: Misc style fixes

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
4 months agoxen/version: Fold build_id handling into xenver_varbuf_op()
Andrew Cooper [Tue, 3 Jan 2023 19:06:43 +0000 (19:06 +0000)]
xen/version: Fold build_id handling into xenver_varbuf_op()

struct xen_build_id and struct xen_varbuf are identical from an ABI point of
view, so XENVER_build_id can reuse xenver_varbuf_op() rather than having its
own almost identical copy of the logic.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 months agoxen/version: Introduce non-truncating deterministically-signed XENVER_* subops
Andrew Cooper [Tue, 20 Dec 2022 13:12:52 +0000 (13:12 +0000)]
xen/version: Introduce non-truncating deterministically-signed XENVER_* subops

In XenServer, we have encountered problems caused by both XENVER_extraversion
and XENVER_commandline having fixed bounds.

More than just the invariant size, the APIs/ABIs are also broken by typedef-ing an
array, and using an unqualified 'char' which has implementation-specific
signed-ness.

Provide brand new ops, which are capable of expressing variable length
strings, and mark the older ops as broken.

This fixes all issues around XENVER_extraversion being longer than 15 chars.
Further work beyond just this API is needed to remove other assumptions about
XENVER_commandline being 1023 chars long.
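
The difference between the two styles of interface can be illustrated in
Python. The sketch assumes the variable-length buffer is a 32-bit length
header followed by the payload bytes, and models the old interface as a
16-byte NUL-terminated array; the helper names are hypothetical.

```python
import struct

FIXED_EXTRAVERSION_LEN = 16  # 15 characters plus a NUL terminator


def truncate_fixed(text: bytes) -> bytes:
    """What a fixed 16-byte, NUL-terminated buffer can carry."""
    return text[:FIXED_EXTRAVERSION_LEN - 1]


def pack_varbuf(payload: bytes) -> bytes:
    """Length header followed by the payload, with no terminator."""
    return struct.pack("=I", len(payload)) + payload


def unpack_varbuf(blob: bytes) -> bytes:
    """Strip the length header and return exactly that many payload bytes."""
    (length,) = struct.unpack_from("=I", blob, 0)
    payload = blob[4:4 + length]
    if len(payload) != length:
        raise ValueError("buffer shorter than its length header claims")
    return payload
```

Using bytes rather than a text type also sidesteps the signedness question
raised above, since every element is an unsigned value.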

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
---
Non-technical objections to this patch were raised, and subsequently rejected
by a community wide vote.  The results of the vote have not been shared with
the community at the time of committing.

4 months agoxen/version: Calculate xen_capabilities_info once at boot
Andrew Cooper [Fri, 13 Jan 2023 17:20:41 +0000 (17:20 +0000)]
xen/version: Calculate xen_capabilities_info once at boot

The arch_get_xen_caps() infrastructure is horribly inefficient for something
that is constant after features have been resolved on boot.

Every instance used snprintf() to format constants into a string (which gets
shorter when %d gets resolved!), and which get double buffered on the stack.

Switch to using string literals with the "3.0" inserted - these numbers
haven't changed in 19 years; the Xen 3.0 release was Dec 5th 2005.

Use initcalls to format the data into xen_cap_info, which is deliberately not
of type xen_capabilities_info_t because a 1k array is a silly overhead for
storing a maximum of 77 chars (the x86 version) and isn't liable to need any
more space in the foreseeable future.  RISC-V and PPC have their stub dropped,
with the expectation that they won't carry this legacy interface forward.
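
The "format once, read many times" pattern looks roughly like this in
Python. The capability strings follow the familiar x86 naming with "3.0"
written literally, but the feature flags, function names and exact
selection logic are assumptions made for illustration.

```python
_xen_cap_info = None  # formatted once by init_caps(), read-only afterwards


def init_caps(pv: bool, pv32: bool, hvm: bool) -> None:
    """Build the capabilities string once, after features are resolved."""
    global _xen_cap_info
    parts = []
    if pv:
        parts.append("xen-3.0-x86_64")
    if pv32:
        parts.append("xen-3.0-x86_32p")
    if hvm:
        parts += ["hvm-3.0-x86_32", "hvm-3.0-x86_32p", "hvm-3.0-x86_64"]
    _xen_cap_info = "".join(p + " " for p in parts)


def xenver_capabilities() -> str:
    """The hot path: return the cached string without any formatting."""
    return _xen_cap_info
```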

This speeds up the XENVER_capabilities hypercall, but the purpose of the
change is to allow us to introduce a better XENVER_* API that doesn't force
the use of a 1k buffer on the stack.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 months agoMAINTAINERS: Add myself as maintainer for NXP S32G3
Andrei Cherechesu [Thu, 19 Dec 2024 11:23:15 +0000 (13:23 +0200)]
MAINTAINERS: Add myself as maintainer for NXP S32G3

Add myself as maintainer for NXP S32G3 SoCs Family,
and the S32 Linux Team as relevant reviewers list.

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Acked-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
4 months agoSUPPORT.md: Describe SCMI-SMC layer feature
Andrei Cherechesu [Thu, 19 Dec 2024 11:23:14 +0000 (13:23 +0200)]
SUPPORT.md: Describe SCMI-SMC layer feature

Describe the layer which enables SCMI over SMC calls forwarding
to EL3 FW if issued by the Hardware domain. If the SCMI firmware
node is not found in the Host DT during initialization, it fails
silently as it's not mandatory.

The SCMI SMCs trapping at EL2 now lets hwdom perform SCMI ops for
interacting with system-level resources almost as if it would be
running natively.

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
4 months agoCHANGELOG.md: Add NXP S32G3 and SCMI-SMC layer support mentions
Andrei Cherechesu [Thu, 19 Dec 2024 11:23:13 +0000 (13:23 +0200)]
CHANGELOG.md: Add NXP S32G3 and SCMI-SMC layer support mentions

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
4 months agoxen/arm: platforms: Add NXP S32G3 Processors config
Andrei Cherechesu [Thu, 19 Dec 2024 11:23:12 +0000 (13:23 +0200)]
xen/arm: platforms: Add NXP S32G3 Processors config

Platforms based on NXP S32G3 processors use the NXP LINFlexD
UART driver for console by default, and rely on Dom0 having
access to SCMI services for system-level resources from
firmware at EL3.

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 months agoxen/arm: vsmc: Enable handling SiP-owned SCMI SMC calls
Andrei Cherechesu [Thu, 19 Dec 2024 11:23:11 +0000 (13:23 +0200)]
xen/arm: vsmc: Enable handling SiP-owned SCMI SMC calls

Change the handling of SiP SMC calls to be more generic,
instead of directly relying on the `platform_smc()` callback
implementation.

Try to handle the SiP SMC first through the `platform_smc()`
callback (if implemented). Otherwise, try to handle it as an SCMI
message.
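
The dispatch order described above can be modeled with a minimal Python
sketch. It is not the Xen C code: `handle_sip`, the `Handler` type and the
boolean "handled" return convention are assumptions made for illustration.

```python
from typing import Callable, Optional

# Hypothetical handler type: takes the SMC function ID and returns True
# when the call was handled. The real handlers operate on CPU registers.
Handler = Callable[[int], bool]


def handle_sip(fid: int,
               platform_smc: Optional[Handler],
               scmi_handler: Handler) -> bool:
    """Dispatch a SiP-owned SMC: platform callback first, SCMI as fallback."""
    # Give the platform-specific callback the first chance, if one exists.
    if platform_smc is not None and platform_smc(fid):
        return True
    # Otherwise try to interpret the call as an SCMI message.
    return scmi_handler(fid)
```

A platform without a `platform_smc()` implementation passes `None`, in which
case every SiP call goes straight to the SCMI path.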

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <jgrall@amazon.com>
4 months ago xen/arm: firmware: Add SCMI over SMC calls handling layer
Andrei Cherechesu [Thu, 19 Dec 2024 11:23:10 +0000 (13:23 +0200)]
xen/arm: firmware: Add SCMI over SMC calls handling layer

Introduce the SCMI-SMC layer to have some basic degree of
awareness about SCMI calls that are based on the ARM System
Control and Management Interface (SCMI) specification (DEN0056E).

The SCMI specification includes various protocols for managing
system-level resources, such as: clocks, pins, reset, system power,
power domains, performance domains, etc. The clients are named
"SCMI agents" and the server is named "SCMI platform".

Only support the shared-memory based transport with SMCs as
the doorbell mechanism for notifying the platform. Also, this
implementation only handles the "arm,scmi-smc" compatible,
requiring the following properties:
- "arm,smc-id" (unique SMC ID)
- "shmem" (one or more phandles pointing to shmem zones
for each channel)

The initialization is done as an initcall, since we need
SMCs, and PSCI should already have probed EL3 FW for SMCCC support.
If no "arm,scmi-smc" compatible node is found in the host
DT, the initialization fails silently, as it's not mandatory.
Otherwise, we get the 'arm,smc-id' DT property from the node,
to know the SCMI SMC ID we handle. The 'shmem' memory ranges
are not validated, as the SMC calls are only passed through
to EL3 FW if coming from the hardware domain.
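
The initialization and forwarding rules above can be sketched as follows.
This is a simplified Python model, not the Xen implementation: the host DT
is represented as a list of dicts, and `scmi_init`, `scmi_should_forward`
and the SMC ID used in the usage example are hypothetical.

```python
from typing import Optional

SCMI_COMPATIBLE = "arm,scmi-smc"


def scmi_init(host_dt: list) -> Optional[int]:
    """Return the SCMI SMC ID from the first matching node, or None."""
    for node in host_dt:
        if SCMI_COMPATIBLE in node.get("compatible", []):
            # 'shmem' is deliberately not validated in this model either.
            return node.get("arm,smc-id")
    # No matching node: SCMI stays disabled and no error is reported.
    return None


def scmi_should_forward(fid: int, smc_id: Optional[int],
                        is_hwdom: bool) -> bool:
    """Forward to EL3 FW only the configured SMC ID, and only for hwdom."""
    return smc_id is not None and fid == smc_id and is_hwdom
```

For example, with a node such as
`{"compatible": ["arm,scmi-smc"], "arm,smc-id": 0xC20000FE, "shmem": [1]}`
(an illustrative ID), a call with that ID is forwarded only when it comes
from the hardware domain.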

Create a new 'firmware' folder to keep the SCMI code separate
from the generic ARM code.

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Michal Orzel <michal.orzel@amd.com>
4 months ago xen/arm: add cache coloring support for Xen image
Carlo Nonato [Tue, 17 Dec 2024 17:06:37 +0000 (18:06 +0100)]
xen/arm: add cache coloring support for Xen image

The Xen image is relocated to a new colored physical space. Some relocation
functionality must be brought back:
- the virtual address of the new space is taken from 0c18fb76323b
  ("xen/arm: Remove unused BOOT_RELOC_VIRT_START").
- relocate_xen() and get_xen_paddr() are taken from f60658c6ae47
  ("xen/arm: Stop relocating Xen").

setup_pagetables() must be adapted for coloring and for relocation. Runtime
page tables are used to map the colored space, but they are also linked in
boot tables so that the new space is temporarily available for relocation.
This implies that Xen protection must happen after the copy.

Finally, since the alternative framework needs to remap the Xen text and
inittext sections, this operation must be done in a coloring-aware way.
The function xen_remap_colored() is introduced for that.

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com> # common
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
4 months ago xen/arm: make consider_modules() available for xen relocation
Carlo Nonato [Tue, 17 Dec 2024 17:06:36 +0000 (18:06 +0100)]
xen/arm: make consider_modules() available for xen relocation

Cache coloring must physically relocate Xen in order to color the hypervisor
and consider_modules() is a key function that is needed to find a new
available physical address.

672d67f339c0 ("xen/arm: Split MMU-specific setup_mm() and related code out")
moved consider_modules() under arm32. Move it to mmu/setup.c and make it
non-static so that it can be used outside.

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
4 months ago xen/arm: add Xen cache colors command line parameter
Luca Miccio [Tue, 17 Dec 2024 17:06:35 +0000 (18:06 +0100)]
xen/arm: add Xen cache colors command line parameter

Add a new command line parameter to configure Xen cache colors.
These colors are dumped together with other coloring info.

Benchmarking the VM interrupt response time provides an estimation of
LLC usage by Xen's most latency-critical runtime task. Results on Arm
Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
reserves 64 KiB of L2, is enough to attain best responsiveness:
- Xen 1 color latency:  3.1 us
- Xen 2 color latency:  3.1 us

Since this is the most common target for Arm cache coloring, the default
number of Xen colors is set to one.
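
The "one color reserves 64 KiB of L2" figure can be reproduced with a short
calculation. The sketch below assumes a 1 MiB, 16-way set-associative L2
and 4 KiB pages; these geometry values are not stated in this message and
are used only for illustration, as is the helper name `llc_color_geometry`.

```python
def llc_color_geometry(cache_bytes: int, ways: int, page_bytes: int = 4096):
    """Return (number of colors, cache bytes covered by one color)."""
    # One way of the cache spans this many bytes of the index space.
    way_bytes = cache_bytes // ways
    # Pages that map to distinct set ranges within one way get distinct colors.
    num_colors = way_bytes // page_bytes
    # Each color owns an equal share of the whole cache.
    bytes_per_color = cache_bytes // num_colors
    return num_colors, bytes_per_color


# Assumed geometry: 1 MiB L2, 16 ways, 4 KiB pages.
colors, per_color = llc_color_geometry(1 << 20, 16)
```

Under these assumptions there are 16 colors and each covers 64 KiB.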

More colors are instead very likely to be needed on processors whose L1
cache is physically-indexed and physically-tagged, such as Cortex-A57.
In such cases, coloring applies to L1 also, and there typically are two
distinct L1-colors. Therefore, reserving only one color for Xen would
senselessly partition a cache memory that is already private, i.e.
underutilize it.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 months ago xen: add cache coloring allocator for domains
Carlo Nonato [Tue, 17 Dec 2024 17:06:34 +0000 (18:06 +0100)]
xen: add cache coloring allocator for domains

Add a new memory page allocator that implements the cache coloring mechanism.
The allocation algorithm enforces equal frequency distribution of cache
partitions, following the coloring configuration of a domain. This allows
for an even utilization of cache sets for every domain.

Pages are stored in a color-indexed array of lists. Those lists are filled
by a simple init function which computes the color of each page.
When a domain requests a page, the allocator extracts the page from the list
with the maximum number of free pages among those that the domain can access,
given its coloring configuration.
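
The selection policy described above can be sketched as follows. This is a
minimal Python model, not the Xen allocator: the class and method names are
hypothetical, and the color of a page is assumed to be the low bits of its
frame number, with a power-of-two number of colors.

```python
class ColoredAllocator:
    """Toy model of a color-indexed array of free-page lists."""

    def __init__(self, num_colors: int):
        self.num_colors = num_colors  # assumed to be a power of two
        self.free = [[] for _ in range(num_colors)]

    def color_of(self, mfn: int) -> int:
        # Assumption: the color is given by the low bits of the frame number.
        return mfn & (self.num_colors - 1)

    def add_page(self, mfn: int) -> None:
        # Init step: file each free page under its color.
        self.free[self.color_of(mfn)].append(mfn)

    def alloc(self, domain_colors):
        """Take one page from the fullest list the domain may use."""
        best = max(domain_colors, key=lambda c: len(self.free[c]),
                   default=None)
        if best is None or not self.free[best]:
            return None  # nothing left in any permitted color
        return self.free[best].pop()
```

Always drawing from the fullest permitted list is what keeps the usage of a
domain's colors evenly distributed over time.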

The allocator can only handle order-0 page requests. This simplifies the
implementation and, since cache coloring targets only embedded systems,
is assumed not to be a major limitation.

The buddy allocator must coexist with the colored one because the Xen heap
isn't colored. For this reason a new Kconfig option and a command line
parameter are added to let the user set the amount of memory reserved for
the buddy allocator. Even when cache coloring is enabled, this memory
isn't managed by the colored allocator.

Colored heap information is dumped in the dump_heap() debug-key function.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
4 months ago xen/arm: add support for cache coloring configuration via device-tree
Carlo Nonato [Tue, 17 Dec 2024 17:06:32 +0000 (18:06 +0100)]
xen/arm: add support for cache coloring configuration via device-tree

Add the "llc-colors" Device Tree property to express DomUs and Dom0less
color configurations.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com> # non-Arm
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
4 months ago tools: add support for cache coloring configuration
Carlo Nonato [Tue, 17 Dec 2024 17:06:31 +0000 (18:06 +0100)]
tools: add support for cache coloring configuration

Add a new "llc_colors" parameter that defines the LLC color assignment for
a domain. The user can specify one or more color ranges using the same
syntax as every other color configuration, as described in the
documentation.
The parameter is defined as a list of strings that represent the color
ranges.
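
As an illustration of how such a list of range strings could be expanded,
here is a minimal Python sketch. It assumes each string is either a single
color ("8") or a hyphen-separated closed interval ("0-3"); the function
name `parse_llc_colors` is hypothetical and this is not the libxl code.

```python
def parse_llc_colors(ranges):
    """Expand strings such as "0-3" or "8" into a flat list of colors."""
    colors = []
    for item in ranges:
        start, sep, end = item.partition("-")
        first = int(start)
        # A string without a hyphen names a single color.
        last = int(end) if sep else first
        if last < first:
            raise ValueError(f"invalid color range: {item!r}")
        colors.extend(range(first, last + 1))
    return colors
```

For example, `parse_llc_colors(["0-3", "8"])` yields the colors 0 to 3
followed by 8.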

Documentation is also added.
Golang bindings are regenerated.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
4 months ago xen: extend domctl interface for cache coloring
Carlo Nonato [Tue, 17 Dec 2024 17:06:30 +0000 (18:06 +0100)]
xen: extend domctl interface for cache coloring

Add a new domctl hypercall to allow the user to set LLC coloring
configurations. Colors can be set only once, just after domain creation,
since recoloring isn't supported.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 months ago xen/arm: add Dom0 cache coloring support
Carlo Nonato [Tue, 17 Dec 2024 17:06:29 +0000 (18:06 +0100)]
xen/arm: add Dom0 cache coloring support

Add a command line parameter to allow the user to set the coloring
configuration for Dom0.
A common configuration syntax for cache colors is introduced and
documented.
Take the opportunity to also add:
 - default configuration notion.
 - function to check well-formed configurations.

Direct mapping Dom0 isn't possible when coloring is enabled, so the
CDF_directmap flag is removed when creating it.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
4 months ago xen/arm: permit non direct-mapped Dom0 construction
Carlo Nonato [Tue, 17 Dec 2024 17:06:28 +0000 (18:06 +0100)]
xen/arm: permit non direct-mapped Dom0 construction

Cache coloring requires Dom0 not to be direct-mapped because of its
non-contiguous mapping nature, so allocate_memory() is needed in this case.
8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
moved allocate_memory() to dom0less_build.c. In order to use it
in Dom0 construction, bring it back to domain_build.c and declare it in
domain_build.h.

Adapt the implementation of allocate_memory() so that it uses the host
layout when called on the hwdom, via find_unallocated_memory().

Since gnttab information is needed in the process, move find_gnttab_region()
before allocate_memory() in construct_dom0().

Introduce add_hwdom_free_regions() callback to add hwdom banks in descending
order.

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
4 months ago xen/arm: add initial support for LLC coloring on arm64
Carlo Nonato [Tue, 17 Dec 2024 17:06:27 +0000 (18:06 +0100)]
xen/arm: add initial support for LLC coloring on arm64

LLC coloring needs to know the last level cache layout in order to make the
best use of it. This can be probed by inspecting the CLIDR_EL1 register,
so the Last Level is defined as the last level visible by this register.
Note that this excludes system caches in some platforms.
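
To illustrate how the last level can be derived from CLIDR_EL1, the Python
sketch below decodes the register's 3-bit Ctype<n> fields (n = 1..7, where
0b000 means no cache at that level). It is a simplified model under those
assumptions, with a hypothetical function name; the actual implementation
may apply further criteria, for example on the cache type.

```python
CTYPE_BITS = 3   # width of each Ctype<n> field
MAX_LEVELS = 7   # CLIDR_EL1 describes up to seven cache levels


def last_cache_level(clidr: int) -> int:
    """Return the highest cache level reported by a CLIDR_EL1 value."""
    last = 0
    for level in range(1, MAX_LEVELS + 1):
        ctype = (clidr >> (CTYPE_BITS * (level - 1))) & 0b111
        if ctype == 0:
            # No cache at this level: higher levels are not described.
            break
        last = level
    return last
```

For instance, a value of 0x23 (separate L1 instruction and data caches, a
unified L2) reports level 2 as the last level; a system cache that is not
listed in the register is invisible to this scan.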

Static memory allocation and cache coloring are incompatible because static
memory can't be guaranteed to use only colors assigned to the domain.
Panic during DomU creation when both are enabled.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>