xenbits.xensource.com Git - xen.git/log
12 months ago docs/man: Add xenwatchdog manual page
Leigh Brown [Tue, 23 Apr 2024 12:11:14 +0000 (14:11 +0200)]
docs/man: Add xenwatchdog manual page

Add a manual page for xenwatchdogd.

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago tools/misc: Add xenwatchdogd.c copyright notice
Leigh Brown [Tue, 23 Apr 2024 12:10:16 +0000 (14:10 +0200)]
tools/misc: Add xenwatchdogd.c copyright notice

Add copyright notice and description of the program.

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago tools/misc: xenwatchdogd enhancements
Leigh Brown [Tue, 23 Apr 2024 12:10:03 +0000 (14:10 +0200)]
tools/misc: xenwatchdogd enhancements

Add usage() function, the ability to run in the foreground, and
the ability to disarm the watchdog timer when exiting.

Add enhanced parameter parsing and validation, making use of
getopt_long().  Check that the number of parameters is correct, that
the timeout is at least two seconds (to allow a minimum sleep time of
one second), and that the sleep time is at least one second and less
than the watchdog timeout.

With these changes, the daemon will no longer instantly reboot
the domain if you enter a zero timeout (or a non-numeric parameter),
and can no longer consume 100% of a CPU due to a zero sleep
time.
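
The range checks described above can be sketched as follows.  This is an
illustration only, not the actual xenwatchdogd code: the function name and
the two limit macros are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical limits, chosen to match the commit message. */
#define MIN_TIMEOUT 2 /* seconds; leaves room for a sleep of at least 1s */
#define MIN_SLEEP   1 /* seconds; a zero sleep would busy-loop */

/*
 * Return true if the timeout/sleep pair is usable: the timeout must be at
 * least MIN_TIMEOUT, and the sleep must be at least MIN_SLEEP and strictly
 * less than the timeout.
 */
static bool validate_intervals(unsigned int timeout, unsigned int sleep_secs)
{
    if (timeout < MIN_TIMEOUT)
        return false;
    if (sleep_secs < MIN_SLEEP || sleep_secs >= timeout)
        return false;
    return true;
}
```

Rejecting the pair up front means a zero timeout or a zero sleep never
reaches the main loop.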

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago tools/misc: xenwatchdogd: add parse_secs()
Leigh Brown [Tue, 23 Apr 2024 12:09:50 +0000 (14:09 +0200)]
tools/misc: xenwatchdogd: add parse_secs()

Create a new parse_secs() function to parse the timeout and sleep
parameters. This ensures that non-numeric parameters are not
accidentally treated as numbers.
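
A minimal sketch of strict numeric parsing, assuming strtol() with an end
pointer check.  The name parse_secs_sketch() and the -1 error return are
assumptions made for illustration; the real parse_secs() may differ.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical strict parser: return the number of seconds in 'arg', or -1
 * if 'arg' contains no digits, has trailing characters after the number,
 * is negative, or does not fit in an int.
 */
static int parse_secs_sketch(const char *arg)
{
    char *end;
    long val;

    errno = 0;
    val = strtol(arg, &end, 10);

    /* end == arg: no digits consumed; *end != '\0': trailing junk. */
    if (end == arg || *end != '\0' || errno == ERANGE)
        return -1;
    if (val < 0 || val > INT_MAX)
        return -1;

    return (int)val;
}
```

A conversion that ignores the end pointer would turn "abc" into 0, which is
exactly the accidental zero the commit message warns about.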

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago x86/rtc: Avoid UIP flag being set for longer than expected
Ross Lagerwall [Tue, 23 Apr 2024 12:09:18 +0000 (14:09 +0200)]
x86/rtc: Avoid UIP flag being set for longer than expected

In a test, OVMF reported an error initializing the RTC without
indicating the precise nature of the error. The only plausible
explanation I can find is as follows:

As part of the initialization, OVMF reads register C and then reads
register A repeatedly until the UIP flag is not set. If this takes longer
than 100 ms, OVMF fails and reports an error. This may happen with the
following sequence of events:

At guest time=0s, rtc_init() calls check_update_timer() which schedules
update_timer for t=(1 - 244us).

At t=1s, the update_timer function happens to have been called >= 244us
late. In the timer callback, it sets the UIP flag and schedules
update_timer2 for t=1s.

Before update_timer2 runs, the guest reads register C which calls
check_update_timer(). check_update_timer() stops the scheduled
update_timer2 and since the guest time is now outside of the update
cycle, it schedules update_timer for t=(2 - 244us).

The UIP flag will therefore be set for a whole second from t=1 to t=2
while the guest repeatedly reads register A waiting for the UIP flag to
clear. Fix it by clearing the UIP flag when scheduling update_timer.

I was able to reproduce this issue with a synthetic test and this
resolves the issue.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months ago x86/pvh: zero VGA information
Roger Pau Monné [Mon, 22 Apr 2024 13:13:30 +0000 (15:13 +0200)]
x86/pvh: zero VGA information

PVH guests skip real mode VGA detection, and never have a VGA available, hence
the default VGA selection is not applicable, and at worst can cause confusion
when parsing Xen boot log.

Zero the boot_vid_info structure when Xen is booted from the PVH entry point.

This fixes Xen incorrectly reporting:

(XEN) Video information:
(XEN)  VGA is text mode 80x25, font 8x16

When booted as a PVH guest.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months ago x86/video: add boot_video_info offset generation to asm-offsets
Roger Pau Monné [Mon, 22 Apr 2024 13:13:00 +0000 (15:13 +0200)]
x86/video: add boot_video_info offset generation to asm-offsets

Currently the offsets into the boot_video_info struct are manually encoded in
video.S, which is fragile.  Generate them in asm-offsets.c and switch the
current code to use those instead.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago automation/eclair_analysis: substitute deprecated service STD.emptrecd
Nicola Vetrini [Mon, 22 Apr 2024 13:12:47 +0000 (15:12 +0200)]
automation/eclair_analysis: substitute deprecated service STD.emptrecd

The ECLAIR service STD.emptrecd (which checks for empty structures) is being
deprecated.  As a preventive measure, use STD.anonstct instead, which checks
for structures with no named members (undefined behaviour in C99).  Since the
latter is a more general case of the former, this change does not affect the
analysis.  The new service is already supported by the current version of
ECLAIR.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Julien Grall <jgrall@amazon.com>
12 months ago xen/riscv: check whether the assembler has Zbb extension support
Oleksii Kurochko [Mon, 22 Apr 2024 13:12:03 +0000 (15:12 +0200)]
xen/riscv: check whether the assembler has Zbb extension support

Update the argument of as-insn for the Zbb case to verify that
Zbb is supported not only by the compiler, but also by the assembler.

Also introduce a check-extension(ext_name, "insn") helper macro
to check whether an extension is supported by both the compiler and
the assembler.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago xen/domain: deviate MISRA C Rule 16.2 violation
Nicola Vetrini [Mon, 22 Apr 2024 13:11:38 +0000 (15:11 +0200)]
xen/domain: deviate MISRA C Rule 16.2 violation

MISRA C Rule 16.2 states:
"A switch label shall only be used when the most closely-enclosing
compound statement is the body of a switch statement".

The PROGRESS_VCPU local helper specifies a case that is directly
inside the compound statement of a for loop, hence violating the rule.
To avoid this, the construct is deviated with a text-based deviation.
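
To illustrate the kind of construct the rule objects to, here is a small
sketch of a resumable loop whose case label sits inside a for body.  It is
loosely modelled on the description above; all names are hypothetical and
this is not the PROGRESS_VCPU helper itself.

```c
#include <stdbool.h>

/* Hypothetical progress markers, standing in for per-domain state. */
enum progress { PROG_none, PROG_item };

struct teardown_state {
    enum progress val; /* where to resume */
    int idx;           /* item to resume at */
    bool done[3];      /* which items have been processed */
};

static void teardown_sketch(struct teardown_state *s)
{
    int i;

    switch ( s->val )
    {
    case PROG_none:
        for ( i = 0; i < 3; i++ )
        {
            /* Record progress so that a later call can resume here. */
            s->val = PROG_item;
            s->idx = i;
            /* fall through */
    case PROG_item: /* Rule 16.2: this label is enclosed by the for body */
            i = s->idx;
            s->done[i] = true;
        }
        break;
    }
}
```

The code is valid C and lets a later call jump straight back into the loop,
but the most closely-enclosing compound statement of the second label is the
loop body rather than the switch body, which is what Rule 16.2 forbids.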

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago x86/PVH: Use unsigned int for dom0 e820 index
Jason Andryuk [Mon, 22 Apr 2024 13:11:02 +0000 (15:11 +0200)]
x86/PVH: Use unsigned int for dom0 e820 index

Switch to unsigned int for the dom0 e820 index.  This eliminates the
potential for array underflows, and the compiler might be able to
generate better code.

Requested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago x86/svm: Add flushbyasid in the supported features
Vaishali Thakkar [Tue, 16 Apr 2024 09:08:12 +0000 (09:08 +0000)]
x86/svm: Add flushbyasid in the supported features

TLB Flush by ASID is missing in the list of supported features
here. So, add it.

Signed-off-by: Vaishali Thakkar <vaishali.thakkar@vates.tech>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago eclair_analysis: deviate x86 emulator for Rule 16.2
Nicola Vetrini [Fri, 19 Apr 2024 06:51:24 +0000 (08:51 +0200)]
eclair_analysis: deviate x86 emulator for Rule 16.2

MISRA C Rule 16.2 states:
"A switch label shall only be used when the most closely-enclosing
compound statement is the body of a switch statement".

Since making the x86 emulator comply with this rule would lead to
a lot of code duplication, it is deemed better to exempt those
files from this guideline.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
12 months ago xen/riscv: add minimal stuff to page.h to build full Xen
Oleksii Kurochko [Fri, 19 Apr 2024 06:47:36 +0000 (08:47 +0200)]
xen/riscv: add minimal stuff to page.h to build full Xen

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago xen/riscv: introduce io.h
Oleksii Kurochko [Fri, 19 Apr 2024 06:47:13 +0000 (08:47 +0200)]
xen/riscv: introduce io.h

The header is taken from Linux 6.4.0-rc1 and is based on
arch/riscv/include/asm/mmio.h, with the following changes:
- drop forcing of endianness for the read*() and write*() functions:
  whatever the CPU endianness, the endianness a particular device
  (and hence its MMIO region(s)) uses is entirely independent.
  Hence conversion, where necessary, needs to occur one layer up.
  Another reason to drop endianness conversion here is:
  https://patchwork.kernel.org/project/linux-riscv/patch/20190411115623.5749-3-hch@lst.de/
  One of the replies from the author of that commit:
    And we don't know if Linux will be around if that ever changes.
    The point is:
     a) the current RISC-V spec is LE only
     b) the current linux port is LE only except for this little bit
    There is no point in leaving just this bitrotting code around.  It
    just confuses developers, (very very slightly) slows down compiles
    and will bitrot.  It also won't be any significant help to a future
    developer down the road doing a hypothetical BE RISC-V Linux port.
- drop unused argument of __io_ar() macros.
- drop "#define _raw_{read,write}{b,w,l,d,q} _raw_{read,write}{b,w,l,d,q}"
  as they are unnecessary.
- Adopt the Xen code style for this header, considering that significant changes
  are not anticipated in the future.
  In the event of any issues, adapting them to Xen style should be easily
  manageable.
- drop unnecessary __r variables in macros read*_cpu()
- update inline assembler constraints for addr argument for
  __raw_read{b,w,l,q} and __raw_write{b,w,l,q} to tell a compiler that
  *addr will be accessed.
- add stubs for __raw_readq() and __raw_writeq() for RISCV_32

Additionally, definitions of ioremap_*() were added to the header.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago xen/ppc: mm-radix: Replace debug printing code with printk
Shawn Anastasio [Fri, 19 Apr 2024 06:46:29 +0000 (08:46 +0200)]
xen/ppc: mm-radix: Replace debug printing code with printk

Now that we have common code building, there's no need to keep the old
itoa64+debug print function in mm-radix.c

Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months ago x86/MCE: move intel mcheck init code to separate file
Sergiy Kibrik [Fri, 19 Apr 2024 06:45:23 +0000 (08:45 +0200)]
x86/MCE: move intel mcheck init code to separate file

Separate Intel nonfatal MCE initialization code from generic MCE code, the same
way it is done for AMD code. This is to be able to later make Intel/AMD MCE
code optional in the build.

Convert to Xen coding style. Clean up unused includes. Remove seemingly
outdated comment about MCE check period.

No functional change intended.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months ago xen/gzip: Drop huffman code table tracking
Daniel P. Smith [Wed, 17 Apr 2024 14:37:16 +0000 (10:37 -0400)]
xen/gzip: Drop huffman code table tracking

The memory usage tracking isn't used outside of a debugging option which can't
compile under Xen anyway.  Drop it.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago xen/gzip: Remove custom memory allocator
Daniel P. Smith [Wed, 17 Apr 2024 14:37:13 +0000 (10:37 -0400)]
xen/gzip: Remove custom memory allocator

All the other decompression routines use xmalloc_bytes(), thus there is no
reason for gzip to be handling its own allocation of memory. In fact, there is
a bug somewhere in the allocator as decompression started to break when adding
additional allocations. Instead of troubleshooting the allocator, replace it
with xmalloc_bytes().

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago xen/gzip: Drop unused define checks
Daniel P. Smith [Wed, 17 Apr 2024 14:37:11 +0000 (10:37 -0400)]
xen/gzip: Drop unused define checks

Drop various macros and checks which are never used.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago x86/emul: Simplify segment override prefix decoding
Andrew Cooper [Thu, 28 Dec 2023 18:41:30 +0000 (18:41 +0000)]
x86/emul: Simplify segment override prefix decoding

x86_seg_* uses architectural encodings.  Therefore, we can fold the prefix
handling cases together and derive the segment from the prefix byte itself.
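
As a sketch of the idea: the six segment override prefix bytes carry the
architectural segment number in their own bits, so two folded cases suffice.
The enum and function names below are hypothetical stand-ins for Xen's
x86_seg_* values, not the emulator's actual code.

```c
#include <stdint.h>

/* Architectural segment register numbers. */
enum seg {
    SEG_NONE = -1,
    SEG_ES = 0, SEG_CS = 1, SEG_SS = 2, SEG_DS = 3, SEG_FS = 4, SEG_GS = 5,
};

/* Map a prefix byte to a segment number, or SEG_NONE if it is no override. */
static int seg_from_prefix(uint8_t b)
{
    switch (b) {
    case 0x26: /* ES */
    case 0x2e: /* CS */
    case 0x36: /* SS */
    case 0x3e: /* DS */
        return (b >> 3) & 3; /* bits 4:3 hold the segment number */

    case 0x64: /* FS */
    case 0x65: /* GS */
        return b & 7;        /* low three bits give 4 and 5 */

    default:
        return SEG_NONE;
    }
}
```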

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months ago xen/efi: Rewrite DOS/PE magic checking without memcmp()
Andrew Cooper [Tue, 16 Apr 2024 15:21:34 +0000 (16:21 +0100)]
xen/efi: Rewrite DOS/PE magic checking without memcmp()

Misra Rule 21.16 doesn't like the use of memcmp() against character arrays (a
string literal in this case).  This is a rare piece of logic where we're
looking for a magic marker that just happens to make sense when expressed as
ASCII.  Rewrite using plain compares.
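
A minimal sketch of magic checking with plain compares instead of memcmp().
The function names and the byte-pointer interface are assumptions; the
actual Xen code operates on its own header structures.

```c
#include <stdbool.h>
#include <stdint.h>

/* The DOS header starts with the two bytes "MZ". */
static bool is_dos_magic(const uint8_t *p)
{
    return p[0] == 'M' && p[1] == 'Z';
}

/* The PE header starts with the four bytes "PE\0\0". */
static bool is_pe_magic(const uint8_t *p)
{
    return p[0] == 'P' && p[1] == 'E' && p[2] == 0 && p[3] == 0;
}
```

Comparing byte by byte keeps the ASCII meaning visible while avoiding a
memcmp() call against a string literal.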

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months ago docs/misra: mark the gzip folder as adopted code
Federico Serafini [Mon, 15 Apr 2024 09:56:30 +0000 (11:56 +0200)]
docs/misra: mark the gzip folder as adopted code

Mark the whole gzip folder as adopted code and remove the redundant
deviation of file inflate.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago Revert "public: s/int/int32_t"
Julien Grall [Wed, 17 Apr 2024 12:46:55 +0000 (13:46 +0100)]
Revert "public: s/int/int32_t"

This reverts commit afab29d0882f1d6889c73302fdf04632a492c529.

This is breaking the build. I mistakenly committed the wrong version.

Signed-off-by: Julien Grall <jgrall@amazon.com>
12 months ago docs: arm: Update where Xen should be loaded in memory
Michal Orzel [Fri, 12 Apr 2024 06:16:24 +0000 (08:16 +0200)]
docs: arm: Update where Xen should be loaded in memory

Since commit 6cd046c501bc ("xen/arm: Enlarge identity map space to 10TB")
Xen can be loaded below 10 TiB. Update docs accordingly.

Take the opportunity to update stale links to Linux docs.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
12 months ago public: s/int/int32_t
Stefano Stabellini [Tue, 9 Apr 2024 23:19:21 +0000 (16:19 -0700)]
public: s/int/int32_t

Straightforward int -> int32_t and unsigned int -> uint32_t replacements
in public headers. No ABI or semantic changes intended.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
12 months ago docs/misra: add Rule 16.4
Stefano Stabellini [Thu, 14 Mar 2024 21:50:21 +0000 (14:50 -0700)]
docs/misra: add Rule 16.4

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Bertrand Marquis <bertrand.marquis@arm.com>
12 months ago docs/misra/rules.rst: add rule 5.5
Stefano Stabellini [Fri, 15 Mar 2024 00:35:03 +0000 (17:35 -0700)]
docs/misra/rules.rst: add rule 5.5

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Bertrand Marquis <bertrand.marquis@arm.com>
12 months ago docs/hypercall-abi: State that the hypercall page is optional
Andrew Cooper [Thu, 11 Apr 2024 14:37:57 +0000 (15:37 +0100)]
docs/hypercall-abi: State that the hypercall page is optional

Xen doesn't care (and indeed, cannot feasibly tell) whether a hypercall was
initiated using the hypercall page or not.

For SEV-SNP/TDX encrypted VMs, use of a hypercall page would violate the
integrity properties wanted.

Explicitly state that the hypercall page is optional.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months ago xen/gzip: Colocate gunzip code files
Daniel P. Smith [Thu, 11 Apr 2024 15:25:14 +0000 (11:25 -0400)]
xen/gzip: Colocate gunzip code files

This patch moves the gunzip code files to common/gzip. Makefiles are adjusted
accordingly.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago altcall: fix __alt_call_maybe_initdata so it's safe for livepatch
Roger Pau Monne [Thu, 11 Apr 2024 16:08:38 +0000 (18:08 +0200)]
altcall: fix __alt_call_maybe_initdata so it's safe for livepatch

Setting alternative call variables as __init is not safe for use with
livepatch, as livepatches can rightfully introduce new alternative calls to
structures marked as __alt_call_maybe_initdata (possibly just indirectly due to
replacing existing functions that use those).  Attempting to resolve those
alternative calls then results in page faults as the variable that holds the
function pointer address has been freed.

When livepatch is supported use the __ro_after_init attribute instead of
__initdata for __alt_call_maybe_initdata.

Fixes: f26bb285949b ('xen: Implement xen/alternative-call.h for use in common code')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago libxl: devd: Spawn QEMU for 9pfs
Jason Andryuk [Sun, 7 Apr 2024 20:58:09 +0000 (16:58 -0400)]
libxl: devd: Spawn QEMU for 9pfs

Add support for xl devd to support 9pfs in a domU.  devd needs to spawn a
pvqemu for the domain to service 9pfs as well as qdisk backends.  Rename
num_qdisks to pvqemu_refcnt to be more generic.

Keep the qdisk-backend-pid xenstore key as well as the disk-%u log file.
They are externally visible, so they might be used by other tooling.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago libxl: Use vkb=[] for HVMs
Jason Andryuk [Sun, 7 Apr 2024 14:32:08 +0000 (10:32 -0400)]
libxl: Use vkb=[] for HVMs

xl/libxl only applies vkb=[] to PV & PVH guests.  HVM gets only a single
vkb by default, but that can be disabled by the vkb_device boolean.
Notably the HVM vkb cannot be configured, so feature-abs-pointer or the
backend-type cannot be specified.

Re-arrange the logic so that vkb=[] is handled regardless of domain
type.  If vkb is empty or unspecified, follow the vkb_device boolean for
HVMs.  Nothing changes for PVH & PV.  HVMs can now get a configured vkb
instead of just the default one.

The chance for regression is an HVM config with
vkb=["$something"]
vkb_device=false

Which would now get a vkb.

This is useful for vGlass which provides a VKB to HVMs.  vGlass wants to
specify feature-abs-pointer, but that is racily written by vGlass
instead of coming through the xl.cfg.  Unhelpfully, Linux xen-kbdfront
reads the backend nodes without checking that the backend is in
InitWait.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago xen/include: move definition of ASM_INT() to xen/linkage.h
Juergen Gross [Wed, 3 Apr 2024 12:03:23 +0000 (14:03 +0200)]
xen/include: move definition of ASM_INT() to xen/linkage.h

ASM_INT() is defined in arch/[arm|x86]/include/asm/asm_defns.h in
exactly the same way. Instead of replicating this definition for riscv
and ppc, move it to include/xen/linkage.h, where other arch agnostic
definitions for assembler code are living already.

Adapt the generation of assembler sources via tools/binfile to include
the new home of ASM_INT().

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
12 months ago MAINTAINERS: Update livepatch maintainers
Ross Lagerwall [Tue, 9 Apr 2024 10:32:07 +0000 (11:32 +0100)]
MAINTAINERS: Update livepatch maintainers

Remove Konrad from the livepatch maintainers list as he hasn't been
active for a few years.
At the same time, add Roger as a new maintainer since he has been
actively working on it for a while.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months ago tools/misc: xenwatchdogd: add static qualifier
Leigh Brown [Fri, 29 Mar 2024 11:10:53 +0000 (11:10 +0000)]
tools/misc: xenwatchdogd: add static qualifier

Make all functions except main() static in xenwatchdogd.c. Also make
the remaining global variable static.

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago tools/misc: rework xenwatchdogd signal handling
Leigh Brown [Fri, 29 Mar 2024 11:10:52 +0000 (11:10 +0000)]
tools/misc: rework xenwatchdogd signal handling

Rework xenwatchdogd signal handling to do the minimum in the signal
handler. This is a very minor enhancement.
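
The usual way to "do the minimum in the signal handler" is to set a flag and
act on it from the main loop.  The sketch below shows that pattern with
standard C signal(); the variable and function names are hypothetical and
not taken from xenwatchdogd.c.

```c
#include <signal.h>
#include <stdbool.h>

/* Set by the handler, polled by the main loop. */
static volatile sig_atomic_t done;

/* Do the minimum in the handler: only record that a signal arrived. */
static void catch_exit(int sig)
{
    (void)sig;
    done = 1;
}

/* Install the handler for 'signum'; returns true on success. */
static bool install_exit_handler(int signum)
{
    return signal(signum, catch_exit) != SIG_ERR;
}
```

The main loop would then test 'done' after each sleep and perform any
cleanup outside of signal context.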

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago tools/misc: xenwatchdogd: use EXIT_* constants
Leigh Brown [Fri, 29 Mar 2024 11:10:51 +0000 (11:10 +0000)]
tools/misc: xenwatchdogd: use EXIT_* constants

Use EXIT_SUCCESS/EXIT_FAILURE constants instead of magic numbers.

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
12 months ago xen/acpi: Allow xen/acpi.h to be included on non-ACPI archs
Shawn Anastasio [Fri, 5 Apr 2024 18:20:31 +0000 (13:20 -0500)]
xen/acpi: Allow xen/acpi.h to be included on non-ACPI archs

Conditionalize xen/acpi.h's inclusion of acpi/acpi.h and asm/acpi.h on
CONFIG_ACPI and import ARM's !CONFIG_ACPI stub for acpi_disabled() so
that the header can be included on architectures without ACPI support,
like ppc.

This change revealed some missing #includes across the ARM tree, so fix
those as well.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
[Fold Randconfig fix]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago xen/xsm: address violation of MISRA C Rule 16.2
Nicola Vetrini [Fri, 5 Apr 2024 09:14:35 +0000 (11:14 +0200)]
xen/xsm: address violation of MISRA C Rule 16.2

Refactor the switch so that a violation of
MISRA C Rule 16.2 is resolved (A switch label shall only be used
when the most closely-enclosing compound statement is the body of
a switch statement).
Note that the switch clause ending with the pseudo
keyword "fallthrough" is an allowed exception to Rule 16.3.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months ago x86/hvm: address violations of MISRA C Rule 16.2
Nicola Vetrini [Fri, 5 Apr 2024 09:14:34 +0000 (11:14 +0200)]
x86/hvm: address violations of MISRA C Rule 16.2

Refactor the switch so that a violation of
MISRA C Rule 16.2 is resolved (a switch label should be immediately
enclosed in the compound statement of the switch).

The switch clause ending with the pseudo
keyword "fallthrough" is an allowed exception to Rule 16.3.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago xen/domctl: address violations of MISRA C Rule 16.2
Nicola Vetrini [Fri, 5 Apr 2024 09:14:33 +0000 (11:14 +0200)]
xen/domctl: address violations of MISRA C Rule 16.2

Refactor the first clauses so that a violation of
MISRA C Rule 16.2 is resolved (a switch label should be immediately
enclosed in the compound statement of the switch).
Note that the switch clause ending with the pseudo
keyword "fallthrough" is an allowed exception to Rule 16.3.

Convert fallthrough comments in other clauses to the pseudo-keyword
while at it.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago x86/efi: tidy switch statement and address MISRA violation
Nicola Vetrini [Fri, 5 Apr 2024 09:14:32 +0000 (11:14 +0200)]
x86/efi: tidy switch statement and address MISRA violation

Refactor the first clauses so that a violation of
MISRA C Rule 16.2 is resolved (a switch label, "default" in this
case, should be immediately enclosed in the compound statement
of the switch). Note that the switch clause ending with the pseudo
keyword "fallthrough" is an allowed exception to Rule 16.3.

Convert fallthrough comments in other clauses to the pseudo-keyword
while at it.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago x86/irq: tidy switch statement and address MISRA violation
Nicola Vetrini [Fri, 5 Apr 2024 09:14:31 +0000 (11:14 +0200)]
x86/irq: tidy switch statement and address MISRA violation

Refactor the clauses so that a MISRA C Rule 16.2 violation is resolved
(A switch label shall only be used when the most closely-enclosing
compound statement is the body of a switch statement).
Note that the switch clause ending with the pseudo keyword "fallthrough"
is an allowed exception to Rule 16.3.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago x86/cpuid: address violation of MISRA C Rule 16.2
Nicola Vetrini [Fri, 5 Apr 2024 09:14:30 +0000 (11:14 +0200)]
x86/cpuid: address violation of MISRA C Rule 16.2

Refactor the switch so that a violation of MISRA C Rule 16.2 is resolved
(A switch label shall only be used when the most closely-enclosing
compound statement is the body of a switch statement).
Note that the switch clause ending with the pseudo
keyword "fallthrough" is an allowed exception to Rule 16.3.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months ago x86/vlapic: tidy switch statement and address MISRA violation
Nicola Vetrini [Fri, 5 Apr 2024 09:14:29 +0000 (11:14 +0200)]
x86/vlapic: tidy switch statement and address MISRA violation

Refactor the last clauses so that a violation of
MISRA C Rule 16.2 is resolved (A switch label shall only be used
when the most closely-enclosing compound statement is the body of
a switch statement). The switch clause ending with the
pseudo keyword "fallthrough" is an allowed exception to Rule 16.3.

No functional change.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months ago x86/emul: Adjust X86EMUL_OPC_EXT_MASK to placate MISRA
Andrew Cooper [Wed, 10 Apr 2024 19:41:27 +0000 (20:41 +0100)]
x86/emul: Adjust X86EMUL_OPC_EXT_MASK to placate MISRA

Resolves 4740 MISRA R7.2 violations (of 4935, so 96% of them).
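
For context, Rule 7.2 asks for a "u" or "U" suffix on every integer constant
that is represented in an unsigned type.  The sketch below uses hypothetical
macro names and an illustrative value, and assumes a 32-bit int; it is not
the actual definition changed by this commit.

```c
#include <stdint.h>

/*
 * 0xffff0000 does not fit in a 32-bit int, so this hexadecimal constant
 * silently takes type unsigned int.  Rule 7.2 flags every use of it.
 */
#define EXT_MASK_IMPLICIT 0xffff0000

/* Same value and same type, with the unsignedness spelled out. */
#define EXT_MASK_EXPLICIT 0xffff0000U

/* Example user of the mask: keep only the upper 16 bits. */
static uint32_t ext_bits(uint32_t opc)
{
    return opc & EXT_MASK_EXPLICIT;
}
```

Because a violation is reported at each expansion, fixing one widely used
macro can clear a large number of reports at once.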

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months ago xen/spinlock: Adjust LOCK_DEBUG_INITVAL to placate MISRA
Andrew Cooper [Wed, 10 Apr 2024 19:32:24 +0000 (20:32 +0100)]
xen/spinlock: Adjust LOCK_DEBUG_INITVAL to placate MISRA

Resolves 160 MISRA R7.2 violations.

Fixes: c286bb93d20c ("xen/spinlock: support higher number of cpus")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
12 months ago xen/vPCI: Remove shadowed variable
Andrew Cooper [Wed, 10 Apr 2024 19:28:23 +0000 (20:28 +0100)]
xen/vPCI: Remove shadowed variable

Resolves a MISRA R5.3 violation.

Fixes: 622bdd962822 ("vpci/header: handle p2m range sets per BAR")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months ago xen/nospec: Remove unreachable code
Andrew Cooper [Wed, 10 Apr 2024 19:08:03 +0000 (20:08 +0100)]
xen/nospec: Remove unreachable code

When CONFIG_SPECULATIVE_HARDEN_LOCK is active, this reads:

  static always_inline bool lock_evaluate_nospec(bool condition)
  {
      return arch_lock_evaluate_nospec(condition);
      return condition;
  }

Insert an #else to take out the second return.
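
A sketch of how the function might read with the #else in place.  The
placement of the #ifdef is inferred from the description; the always_inline
definition (GCC/Clang attribute syntax), the forced CONFIG_ define and the
stub for arch_lock_evaluate_nospec() are stand-ins so the example compiles
on its own.

```c
#include <stdbool.h>

/* Stand-ins so the sketch is self-contained; Xen provides the real ones. */
#define always_inline inline __attribute__((__always_inline__))
#define CONFIG_SPECULATIVE_HARDEN_LOCK 1

static always_inline bool arch_lock_evaluate_nospec(bool condition)
{
    return condition; /* placeholder: no speculation barrier here */
}

static always_inline bool lock_evaluate_nospec(bool condition)
{
#ifdef CONFIG_SPECULATIVE_HARDEN_LOCK
    return arch_lock_evaluate_nospec(condition);
#else
    return condition;
#endif
}
```

With either setting of the option, exactly one return statement survives
preprocessing, so no unreachable code remains.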

Fixes: 7ef0084418e1 ("x86/spinlock: introduce support for blocking speculation into critical regions")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months ago x86/hvm: Fix Misra Rule 19.1 regression
Andrew Cooper [Wed, 10 Apr 2024 10:26:24 +0000 (11:26 +0100)]
x86/hvm: Fix Misra Rule 19.1 regression

Despite noticing an impending Rule 19.1 violation, the adjustment made (the
uint32_t cast) wasn't sufficient to avoid it.  Try again.

Subsequently noticed by Coverity too.

Fixes: 6a98383b0877 ("x86/HVM: clear upper halves of GPRs upon entry from 32-bit code")
Coverity-IDs: 1596289 thru 1596298
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months ago xen/virtual-region: Drop setup_virtual_regions()
Andrew Cooper [Fri, 15 Mar 2024 17:47:58 +0000 (17:47 +0000)]
xen/virtual-region: Drop setup_virtual_regions()

All other actions it used to perform have been converted to build-time
initialisation.  The extable setup can be done at build time too.

This is one fewer setup step required to get exceptions working.

Take the opportunity to move 'core' into read_mostly, where it probably should
have lived all along.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Tested-by: Luca Fancellu <luca.fancellu@arm.com>
12 months ago xen/virtual-region: Link the list build time
Andrew Cooper [Fri, 15 Mar 2024 17:18:42 +0000 (17:18 +0000)]
xen/virtual-region: Link the list build time

Given 3 statically initialised objects, it's easy to link the list at build
time.  There's no need to do it during runtime at boot (and with IRQs-off,
even).
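
A minimal sketch of linking a circular doubly-linked list purely through
static initialisers.  It assumes the three objects are a list head and two
entries; the types and names are hypothetical and much simpler than Xen's
list implementation.

```c
#include <stddef.h>

struct list_node {
    struct list_node *next, *prev;
};

struct region {
    struct list_node list;
    const char *name;
};

/* Tentative definitions, so the initialisers below can refer to each other. */
static struct list_node region_list;
static struct region region_first, region_second;

/* head -> first -> second -> head, all resolved at build time. */
static struct list_node region_list = { &region_first.list, &region_second.list };
static struct region region_first = { { &region_second.list, &region_list }, "first" };
static struct region region_second = { { &region_list, &region_first.list }, "second" };

/* Walk the list from the head and count the entries. */
static size_t count_regions(void)
{
    size_t n = 0;

    for (const struct list_node *p = region_list.next; p != &region_list; p = p->next)
        n++;

    return n;
}
```

Since every pointer is an address constant, the list is already consistent
before any code runs, and no boot-time registration step is needed for these
entries.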

As a consequence, register_virtual_region() can now move inside ifdef
CONFIG_LIVEPATCH like unregister_virtual_region().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
12 months ago xen/virtual-region: Rework how bugframe linkage works
Andrew Cooper [Fri, 15 Mar 2024 18:43:53 +0000 (18:43 +0000)]
xen/virtual-region: Rework how bugframe linkage works

The start/stop1/etc linkage scheme predates struct virtual_region, and as
setup_virtual_regions() shows, it's awkward to express in the new scheme.

Change the linker to provide explicit start/stop symbols for each bugframe
type, and change virtual_region to have a stop pointer rather than a count.

This marginally simplifies both do_bug_frame()s and prepare_payload(), but it
massively simplifies setup_virtual_regions() by allowing the compiler to
initialise the .frame[] array at build time.

virtual_region.c is the only user of the linker symbols, and this is unlikely
to change given the purpose of struct virtual_region, so move their externs
out of bug.h

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
12 months agoxen/link: Introduce a common BUGFRAMES definition
Andrew Cooper [Fri, 15 Mar 2024 18:21:31 +0000 (18:21 +0000)]
xen/link: Introduce a common BUGFRAMES definition

Bugframe linkage is identical in all architectures.  This is not surprising
given that it is (now) only consumed by common/virtual_region.c

Introduce a common BUGFRAMES define in xen.lds.h ahead of rearranging their
structure.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
12 months agox86/Kconfig: Introduce CONFIG_{AMD,INTEL} and conditionalise ucode
Andrew Cooper [Wed, 25 Oct 2023 13:18:15 +0000 (14:18 +0100)]
x86/Kconfig: Introduce CONFIG_{AMD,INTEL} and conditionalise ucode

We eventually want to be able to build a stripped down Xen for a single
platform.  Make a start with CONFIG_{AMD,INTEL} (hidden behind EXPERT, but
available to randconfig), and adjust the microcode logic.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months agox86/ucode: Move vendor specifics back out of early_microcode_init()
Andrew Cooper [Tue, 24 Oct 2023 18:32:31 +0000 (19:32 +0100)]
x86/ucode: Move vendor specifics back out of early_microcode_init()

I know it was me who dropped microcode_init_{intel,amd}() in c/s
dd5f07997f29 ("x86/ucode: Rationalise startup and family/model checks"), but
times have moved on.  We've gained new conditional support, and a wish to
compile-time specialise Xen to a single platform.

(Re)introduce ucode_probe_{amd,intel}() and move the recent vendor specific
additions back out.  Encode the conditional support state in the NULL-ness of
hooks as it's already done on other paths.

No functional change.
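
A minimal sketch of encoding support state in the NULL-ness of hooks might look
like the following.  The structure, its members and the probe function are
hypothetical names chosen for illustration, not the real microcode interface.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hook table; the member names are illustrative only. */
struct ucode_ops_sketch {
    int (*collect_cpu_info)(void);
    int (*apply_microcode)(const void *patch);  /* NULL: loading unsupported */
};

static int collect_stub(void)
{
    return 0;
}

static int apply_stub(const void *patch)
{
    return patch ? 0 : -1;
}

/* A vendor probe fills in only the hooks the platform can support. */
static void probe_sketch(struct ucode_ops_sketch *ops, bool can_load)
{
    ops->collect_cpu_info = collect_stub;
    ops->apply_microcode = can_load ? apply_stub : NULL;
}

/* Callers test the hook itself rather than a separate capability flag. */
static bool can_apply(const struct ucode_ops_sketch *ops)
{
    return ops->apply_microcode != NULL;
}
```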

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agodocs/misra: document the expected sizes of integer types
Stefano Stabellini [Fri, 5 Apr 2024 18:44:46 +0000 (11:44 -0700)]
docs/misra: document the expected sizes of integer types

Xen makes assumptions about the size of integer types on the various
architectures. Document these assumptions.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Bertrand Marquis <bertrand.marquis@arm.com>
12 months agoMAINTAINERS: Become a reviewer of iMX8Q{M,XP} related patches
John Ernberg [Mon, 8 Apr 2024 16:11:35 +0000 (16:11 +0000)]
MAINTAINERS: Become a reviewer of iMX8Q{M,XP} related patches

I have experience with the IMX8QXP, and the supported parts of the IMX8QM
are identical.

Help review patches touching these areas.

Signed-off-by: John Ernberg <john.ernberg@actia.se>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Peng Fan <peng.fan@nxp.com>
12 months agoxen/drivers: imx-lpuart: Replace iMX8QM compatible with iMX8QXP
John Ernberg [Mon, 8 Apr 2024 16:11:35 +0000 (16:11 +0000)]
xen/drivers: imx-lpuart: Replace iMX8QM compatible with iMX8QXP

Allow the uart to probe also with iMX8QXP. The ip-block is the same as in
the QM.

Since the fsl,imx8qm-lpuart compatible in Linux exists in name only and is
not used in the driver any iMX8QM device tree that can boot Linux must set
fsl,imx8qxp-lpuart compatible as well as the QM one.

Thus we replace the compatible rather than adding just another one.

Signed-off-by: John Ernberg <john.ernberg@actia.se>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
12 months agoxen/arm: Add imx8q{m,x} platform glue
John Ernberg [Mon, 8 Apr 2024 16:11:35 +0000 (16:11 +0000)]
xen/arm: Add imx8q{m,x} platform glue

When using Linux for dom0 there are a bunch of drivers that need to do SMC
SIP calls into the firmware to enable certain hardware bits like the
watchdog.

Provide a basic platform glue that implements the needed SMC forwarding.

The format of these calls is as follows:
 - reg 0: function ID
 - reg 1: subfunction ID (when there's a subfunction)
 - remaining regs: args

For now we only allow Dom0 to make these calls as they are all managing
hardware. There is no specification for these SIP calls; the IDs and names
have been extracted from the upstream Linux kernel and the vendor kernel.

We can reject CPUFREQ because Dom0 cannot make an informed decision
regarding CPU frequency scaling, and WAKEUP_SRC because it is used to wake
up from suspend, which Xen doesn't support at this time.

This leaves the TIME SIP, OTP SIPs which for now are allowed to Dom0.
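
The filtering policy described above could be sketched roughly as follows.
The numeric function IDs, the split of the OTP calls into read and write, and
the function name are all hypothetical; the real IDs come from the Linux
sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical function IDs; the real values come from the Linux sources. */
enum sip_fid_sketch {
    SIP_CPUFREQ_SKETCH    = 1,
    SIP_TIME_SKETCH       = 2,
    SIP_WAKEUP_SRC_SKETCH = 3,
    SIP_OTP_READ_SKETCH   = 4,
    SIP_OTP_WRITE_SKETCH  = 5,
};

/* regs[0] = function ID, regs[1] = subfunction ID, the rest are arguments. */
static bool sip_call_allowed(bool is_dom0, const uint64_t regs[])
{
    if ( !is_dom0 )
        return false;            /* only Dom0 manages the hardware */

    switch ( regs[0] )
    {
    case SIP_TIME_SKETCH:
    case SIP_OTP_READ_SKETCH:
    case SIP_OTP_WRITE_SKETCH:
        return true;             /* forwarded to the firmware */

    case SIP_CPUFREQ_SKETCH:     /* Dom0 cannot make an informed decision */
    case SIP_WAKEUP_SRC_SKETCH:  /* suspend is not supported */
    default:
        return false;
    }
}
```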

NOTE: This code is based on code found in NXP Xen tree located here:
https://github.com/nxp-imx/imx-xen/blob/lf-5.10.y_4.13/xen/arch/arm/platforms/imx8qm.c

Signed-off-by: Peng Fan <peng.fan@nxp.com>
[jernberg: Add SIP call filtering]
Signed-off-by: John Ernberg <john.ernberg@actia.se>
Reviewed-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
[stefano: commit message improvement]
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
12 months agox86/entry: Fix build with older toolchains
Andrew Cooper [Tue, 9 Apr 2024 20:39:51 +0000 (21:39 +0100)]
x86/entry: Fix build with older toolchains

Binutils older than 2.29 doesn't know INCSSPD.

Fixes: 8e186f98ce0e ("x86: Use indirect calls in reset-stack infrastructure")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
12 months agox86/spec-ctrl: Support the "long" BHB loop sequence
Andrew Cooper [Fri, 22 Mar 2024 19:29:34 +0000 (19:29 +0000)]
x86/spec-ctrl: Support the "long" BHB loop sequence

Out of an abundance of caution, implement the long loop too, and allow for
it to be opted in to.

This is part of XSA-456 / CVE-2024-2201.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/spec-ctrl: Wire up the Native-BHI software sequences
Andrew Cooper [Thu, 8 Jun 2023 18:41:44 +0000 (19:41 +0100)]
x86/spec-ctrl: Wire up the Native-BHI software sequences

In the absence of BHI_DIS_S, mitigating Native-BHI requires the use of a
software sequence.

Introduce a new bhb-seq= option to select between available sequences and
bhb-entry= to control the per-PV/HVM actions like we have for other blocks.

Activate the short sequence by default for PV and HVM guests on affected
hardware if BHI_DIS_S isn't present.

This is part of XSA-456 / CVE-2024-2201.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/spec-ctrl: Software BHB-clearing sequences
Andrew Cooper [Thu, 8 Jun 2023 18:41:44 +0000 (19:41 +0100)]
x86/spec-ctrl: Software BHB-clearing sequences

Implement clear_bhb_{tsx,loops}() as per the BHI guidance.  The loops variant
is set up as the "short" sequence.

Introduce SCF_entry_bhb and extend SPEC_CTRL_ENTRY_* with a conditional call
to the selected clearing routine.

Note that due to a limitation in the ALTERNATIVE capability, the TEST/JZ can't
be included alongside a CALL in a single alternative block.  This is going to
require further work to untangle.

The BHB sequences (if used) must be after the restoration of Xen's
MSR_SPEC_CTRL value, which must be accounted for when judging whether it is
safe to skip the safety LFENCEs.

This is part of XSA-456 / CVE-2024-2201.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/spec-ctrl: Support BHI_DIS_S in order to mitigate BHI
Andrew Cooper [Tue, 26 Mar 2024 19:01:37 +0000 (19:01 +0000)]
x86/spec-ctrl: Support BHI_DIS_S in order to mitigate BHI

Introduce a "bhi-dis-s" boolean to match the other options we have for
MSR_SPEC_CTRL values.  Also introduce bhi_calculations().

Use BHI_DIS_S whenever possible.

Guests which are levelled to be migration compatible with older CPUs can't see
BHI_DIS_S, and Xen must fill in the difference to make the guest safe.  Use
the virt MSR_SPEC_CTRL infrastructure to force BHI_DIS_S behind the guest's
back.

This is part of XSA-456 / CVE-2024-2201.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/tsx: Expose RTM_ALWAYS_ABORT to guests
Andrew Cooper [Sat, 6 Apr 2024 19:36:54 +0000 (20:36 +0100)]
x86/tsx: Expose RTM_ALWAYS_ABORT to guests

A TSX Abort is one option to mitigate Native-BHI, but a guest kernel doesn't get
to see this if Xen has turned RTM off using MSR_TSX_{CTRL,FORCE_ABORT}.

Therefore, the meaning of RTM_ALWAYS_ABORT has been adjusted to "XBEGIN won't
fault", and it should be exposed to guests so they can make a better decision.

Expose it in the max policy for any RTM-capable system.  Offer it by default
only if RTM has been disabled.

Update test-tsx to account for this new meaning.  While adjusting the logic in
test_guest_policies(), take the opportunity to use feature names (now they're
available) to make the logic easier to follow.

This is part of XSA-456 / CVE-2024-2201.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agox86: Drop INDIRECT_JMP
Andrew Cooper [Fri, 22 Dec 2023 18:01:37 +0000 (18:01 +0000)]
x86: Drop INDIRECT_JMP

Indirect JMPs which are not tailcalls can lead to an unwelcome form of
speculative type confusion, and we've removed the uses of INDIRECT_JMP to
compensate.  Remove the temptation to reintroduce new instances.

This is part of XSA-456 / CVE-2024-2201.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agox86: Use indirect calls in reset-stack infrastructure
Andrew Cooper [Fri, 22 Dec 2023 17:44:48 +0000 (17:44 +0000)]
x86: Use indirect calls in reset-stack infrastructure

Mixing up JMP and CALL indirect targets leads to a very fun form of speculative
type confusion.  A target which is expecting to be CALLed needs a
return address on the stack, and an indirect JMP doesn't place one there.

An indirect JMP which predicts to a target intending to be CALLed can end up
with a RET speculatively executing with a value from the JMPer's stack frame.

There are several ways to get indirect JMPs in Xen.

 * From tailcall optimisations.  These are safe because the compiler has
   arranged the stack to point at the callee's return address.

 * From jump tables.  These are unsafe, but Xen is built with -fno-jump-tables
   to work around several compiler issues.

 * From reset_stack_and_jump_ind(), which is particularly unsafe.  Because of
   the additional stack adjustment made, the value picked up off the stack is
   regs->r15 of the next vCPU to run.

In order to mitigate this type confusion, we want to make all indirect targets
be CALL targets, and remove the use of indirect JMP except via tailcall
optimisation.

Luckily due to XSA-348, all C target functions of reset_stack_and_jump_ind()
are noreturn.  {svm,vmx}_do_resume() exits via reset_stack_and_jump(); a
direct JMP with entirely different prediction properties.  idle_loop() is an
infinite loop which eventually exits via reset_stack_and_jump_ind() from a new
schedule.  i.e. These paths are all fine having one extra return address on
the stack.

This leaves continue_pv_domain(), which is expecting to be a JMP target.
Alter it to strip the return address off the stack, which is safe because
there isn't actually a RET expecting to return to its caller.

This allows us to change reset_stack_and_jump_ind() to reset_stack_and_call_ind()
in order to mitigate the speculative type confusion.

This is part of XSA-456 / CVE-2024-2201.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/spec-ctrl: Widen the {xen,last,default}_spec_ctrl fields
Andrew Cooper [Tue, 26 Mar 2024 22:43:18 +0000 (22:43 +0000)]
x86/spec-ctrl: Widen the {xen,last,default}_spec_ctrl fields

Right now, they're all bytes, but MSR_SPEC_CTRL has been steadily gaining new
features.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/vmx: Add support for virtualize SPEC_CTRL
Roger Pau Monne [Thu, 15 Feb 2024 16:46:53 +0000 (17:46 +0100)]
x86/vmx: Add support for virtualize SPEC_CTRL

The feature is defined in the tertiary exec control, and is available starting
from Sapphire Rapids and Alder Lake CPUs.

When enabled, two extra VMCS fields are used: SPEC_CTRL mask and shadow.  Bits
set in mask are not allowed to be toggled by the guest (either set or clear)
and the value in the shadow field is the value the guest expects to be in the
SPEC_CTRL register.

By using it the hypervisor can force the value of SPEC_CTRL bits behind the
guest's back without having to trap all accesses to SPEC_CTRL.  Note that no bits
are forced into the guest as part of this patch.  It also allows getting rid of
SPEC_CTRL in the guest MSR load list, since the value in the shadow field will
be loaded by the hardware on vmentry.
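
The mask/shadow behaviour described above can be approximated by a small
software model.  This is only a sketch of the semantics as summarised in this
message: the structure, field and function names are illustrative and are not
the VMCS field names, and the hardware details should be checked against the
vendor documentation.

```c
#include <stdint.h>

/* Simplified software model; field names are illustrative only. */
struct spec_ctrl_model {
    uint64_t mask;    /* bits the guest is not allowed to toggle */
    uint64_t shadow;  /* value the guest expects to be in SPEC_CTRL */
    uint64_t real;    /* value actually in effect */
};

/* Guest write: masked bits keep their value, the others follow the guest. */
static void guest_wrmsr(struct spec_ctrl_model *m, uint64_t val)
{
    m->shadow = val;
    m->real = (m->real & m->mask) | (val & ~m->mask);
}

/* Guest read: returns the shadow, so forced bits stay invisible. */
static uint64_t guest_rdmsr(const struct spec_ctrl_model *m)
{
    return m->shadow;
}
```

With a mask bit set and the matching bit forced in the real value, a guest
write of 0 leaves the forced bit in place while the guest still reads back 0.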

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months agox86/spec-ctrl: Detail the safety properties in SPEC_CTRL_ENTRY_*
Andrew Cooper [Mon, 25 Mar 2024 11:09:35 +0000 (11:09 +0000)]
x86/spec-ctrl: Detail the safety properties in SPEC_CTRL_ENTRY_*

The complexity is getting out of hand.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/spec-ctrl: Simplify DO_COND_IBPB
Andrew Cooper [Fri, 22 Mar 2024 14:33:17 +0000 (14:33 +0000)]
x86/spec-ctrl: Simplify DO_COND_IBPB

With the prior refactoring, SPEC_CTRL_ENTRY_{PV,INTR} both load SCF into %ebx,
and handle the conditional safety including skipping if interrupting Xen.

Therefore, we can drop the maybexen parameter and the conditional safety.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/spec_ctrl: Hold SCF in %ebx across SPEC_CTRL_ENTRY_{PV,INTR}
Andrew Cooper [Fri, 22 Mar 2024 12:08:02 +0000 (12:08 +0000)]
x86/spec_ctrl: Hold SCF in %ebx across SPEC_CTRL_ENTRY_{PV,INTR}

... as we do in the exit paths too.  This will allow simplification to the
sub-blocks.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/entry: Arrange for %r14 to be STACK_END across SPEC_CTRL_ENTRY_FROM_PV
Andrew Cooper [Fri, 22 Mar 2024 15:52:06 +0000 (15:52 +0000)]
x86/entry: Arrange for %r14 to be STACK_END across SPEC_CTRL_ENTRY_FROM_PV

Other SPEC_CTRL_* paths already use %r14 like this, and it will allow for
simplifications.

All instances of SPEC_CTRL_ENTRY_FROM_PV are followed by a GET_STACK_END()
invocation, so this change is only really logic and register shuffling.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/spec-ctrl: Rework conditional safety for SPEC_CTRL_ENTRY_*
Andrew Cooper [Fri, 22 Mar 2024 11:41:41 +0000 (11:41 +0000)]
x86/spec-ctrl: Rework conditional safety for SPEC_CTRL_ENTRY_*

Right now, we have a mix of safety strategies in different blocks, making the
logic fragile and hard to follow.

Start addressing this by having a safety LFENCE at the end of the blocks,
which can be patched out if other safety criteria are met.  This will allow us
to simplify the sub-blocks.  For SPEC_CTRL_ENTRY_FROM_IST, simply leave an
LFENCE unconditionally at the end; the IST path is not a fast-path by any
stretch of the imagination.

For SPEC_CTRL_ENTRY_FROM_INTR, the existing description was incorrect.  The
IRET #GP path is non-fatal but can occur with the guest's choice of
MSR_SPEC_CTRL.  It is safe to skip the flush/barrier-like protections when
interrupting Xen, but we must run DO_SPEC_CTRL_ENTRY irrespective.

This will skip RSB stuffing which was previously unconditional even when
interrupting Xen.

AFAICT, this is a missing cleanup from commit 3fffaf9c13e9 ("x86/entry: Avoid
using alternatives in NMI/#MC paths") where we split the IST entry path out of
the main INTR entry path.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/spec-ctrl: Rename spec_ctrl_flags to scf
Andrew Cooper [Thu, 28 Mar 2024 11:57:25 +0000 (11:57 +0000)]
x86/spec-ctrl: Rename spec_ctrl_flags to scf

XSA-455 was ultimately caused by having fields with too-similar names.

Both {xen,last}_spec_ctrl are fields containing an architectural MSR_SPEC_CTRL
value.  The spec_ctrl_flags field contains Xen-internal flags.

To more-obviously distinguish the two, rename spec_ctrl_flags to scf, which is
also the prefix of the constants used by the fields.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/spec-ctrl: Fix BTC/SRSO mitigations
Andrew Cooper [Tue, 26 Mar 2024 22:47:25 +0000 (22:47 +0000)]
x86/spec-ctrl: Fix BTC/SRSO mitigations

We were looking for SCF_entry_ibpb in the wrong variable in the top-of-stack
block, and xen_spec_ctrl won't have had bit 5 set because Xen doesn't
understand SPEC_CTRL_RRSBA_DIS_U yet.

This is XSA-455 / CVE-2024-31142.

Fixes: 53a570b28569 ("x86/spec-ctrl: Support IBPB-on-entry")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/cpuid: Don't expose {IPRED,RRSBA,BHI}_CTRL to PV guests
Andrew Cooper [Tue, 9 Apr 2024 14:03:05 +0000 (15:03 +0100)]
x86/cpuid: Don't expose {IPRED,RRSBA,BHI}_CTRL to PV guests

All of these are prediction-mode (i.e. CPL) based.  They don't operate as
advertised in PV context.

Fixes: 4dd676070684 ("x86/spec-ctrl: Expose IPRED_CTRL to guests")
Fixes: 478e4787fa64 ("x86/spec-ctrl: Expose RRSBA_CTRL to guests")
Fixes: 583f1d095052 ("x86/spec-ctrl: Expose BHI_CTRL to guests")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/alternatives: fix .init section reference in _apply_alternatives()
Roger Pau Monné [Tue, 9 Apr 2024 12:50:46 +0000 (14:50 +0200)]
x86/alternatives: fix .init section reference in _apply_alternatives()

The code in _apply_alternatives() will unconditionally attempt to read
__initdata_cf_clobber_{start,end} when called as part of applying alternatives
to a livepatch payload when Xen is using IBT.

That leads to a page-fault as __initdata_cf_clobber_{start,end} living in
.init section will have been unmapped by the time a livepatch gets loaded.

Fix by adding a check that limits the clobbering of endbr64 instructions to
boot time only.

Fixes: 37ed5da851b8 ('x86/altcall: Optimise away endbr64 instruction where possible')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months agohypercall_xlat_continuation: Replace BUG_ON with domain_crash
Bjoern Doebel [Wed, 27 Mar 2024 17:31:38 +0000 (17:31 +0000)]
hypercall_xlat_continuation: Replace BUG_ON with domain_crash

Instead of crashing the host in case of unexpected hypercall parameters,
resort to only crashing the calling domain.

This is part of XSA-454 / CVE-2023-46842.

Fixes: b8a7efe8528a ("Enable compatibility mode operation for HYPERVISOR_memory_op")
Reported-by: Manuel Andreas <manuel.andreas@tum.de>
Signed-off-by: Bjoern Doebel <doebel@amazon.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agox86/HVM: clear upper halves of GPRs upon entry from 32-bit code
Jan Beulich [Wed, 27 Mar 2024 17:31:38 +0000 (17:31 +0000)]
x86/HVM: clear upper halves of GPRs upon entry from 32-bit code

Hypercalls in particular can be the subject of continuations, and logic
there checks updated state against incoming register values. If the
guest manufactured a suitable argument register with a non-zero upper
half before entering compatibility mode and issuing a hypercall from
there, checks in hypercall_xlat_continuation() might trip.

Since for HVM we want to also be sure to not hit a corner case in the
emulator, initiate the clipping right from the top of
{svm,vmx}_vmexit_handler(). Also rename the invoked function, as it no
longer does only invalidation of fields.

Note that architecturally the upper halves of registers are undefined
after a switch between compatibility and 64-bit mode (either direction).
Hence once having entered compatibility mode, the guest can't assume
the upper half of any register to retain its value.
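
The clipping itself amounts to truncating each register to 32 bits when the
guest was not in 64-bit mode.  A minimal sketch, using a hypothetical register
structure and function name rather than Xen's own:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the guest GPR state. */
struct gprs_sketch {
    uint64_t rax, rbx, rcx, rdx;
};

/* Zero the upper halves unless the guest was executing 64-bit code. */
static void clip_upper_halves(struct gprs_sketch *r, bool in_64bit_mode)
{
    if ( in_64bit_mode )
        return;

    r->rax = (uint32_t)r->rax;
    r->rbx = (uint32_t)r->rbx;
    r->rcx = (uint32_t)r->rcx;
    r->rdx = (uint32_t)r->rdx;
}
```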

This is part of XSA-454 / CVE-2023-46842.

Fixes: b8a7efe8528a ("Enable compatibility mode operation for HYPERVISOR_memory_op")
Reported-by: Manuel Andreas <manuel.andreas@tum.de>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
12 months agodrivers: char: Enable OMAP UART driver for TI K3 devices
Vaishnav Achath [Mon, 8 Apr 2024 15:03:17 +0000 (20:33 +0530)]
drivers: char: Enable OMAP UART driver for TI K3 devices

TI K3 devices (J721E, J721S2, AM62X, etc.) have the same variant
of UART as OMAP4. Add the compatible used in Linux device tree,
"ti,am654-uart" to the OMAP UART dt_match so that the driver can
be used with these devices. Also, enable the driver for ARM64
platforms.

Signed-off-by: Vaishnav Achath <vaishnav.a@ti.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
12 months agoxen/compiler: address violation of MISRA C Rule 20.9
Nicola Vetrini [Mon, 8 Apr 2024 07:23:15 +0000 (09:23 +0200)]
xen/compiler: address violation of MISRA C Rule 20.9

The rule states:
"All identifiers used in the controlling expression of #if or #elif
preprocessing directives shall be #define'd before evaluation".
In this case, using defined(identifier) is a MISRA-compliant
way to achieve the same effect.
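
A small sketch of the pattern, using a hypothetical FEATURE_SKETCH macro.  The
two forms behave the same only when the macro is either undefined or defined
to a non-zero value, which is assumed here.

```c
/* FEATURE_SKETCH is a hypothetical macro that may or may not be defined. */

/*
 * Non-compliant form: an undefined identifier silently evaluates to 0.
 *   #if FEATURE_SKETCH
 */

/* Compliant form: defined() is valid whether or not the macro exists. */
#if defined(FEATURE_SKETCH)
# define FEATURE_ENABLED 1
#else
# define FEATURE_ENABLED 0
#endif

static int feature_enabled(void)
{
    return FEATURE_ENABLED;
}
```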

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months agox86/PVH: Support relocatable dom0 kernels
Jason Andryuk [Mon, 8 Apr 2024 07:22:56 +0000 (09:22 +0200)]
x86/PVH: Support relocatable dom0 kernels

Xen tries to load a PVH dom0 kernel at the fixed guest physical address
from the elf headers.  For Linux, this defaults to 0x1000000 (16MB), but
it can be configured.

Unfortunately there exist firmwares that have reserved regions at this
address, so Xen fails to load the dom0 kernel since it's not RAM.

The PVH entry code is not relocatable - it loads from absolute
addresses, which fail when the kernel is loaded at a different address.
With a suitably modified kernel, a relocatable entry point is possible.

Add XEN_ELFNOTE_PHYS32_RELOC which specifies optional alignment,
minimum, and maximum addresses needed for the kernel.  The presence of
the NOTE indicates the kernel supports a relocatable entry path.

Change the loading to check for an acceptable load address.  If the
kernel is relocatable, support finding an alternate load address.

The primary motivation for an explicit align field is that Linux has a
configurable CONFIG_PHYSICAL_ALIGN field.  This value is present in the
bzImage setup header, but not the ELF program headers p_align, which
report 2MB even when CONFIG_PHYSICAL_ALIGN is greater.  Since a kernel
is only considered relocatable if the PHYS32_RELOC elf note is present,
the alignment constraints can just be specified within the note instead
of searching for an alignment value via a heuristic.

Load alignment uses the PHYS32_RELOC note value if specified.
Otherwise, the maximum ELF PHDR p_align value is selected if greater than
or equal to PAGE_SIZE.  Finally, the fallback default is 2MB.
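
The selection order above could be expressed roughly as follows.  The function
name, the use of 0 to mean "not specified" and the 4KB page size are
assumptions made for this sketch.

```c
#include <stdint.h>

#define PAGE_SIZE_SKETCH   4096ULL        /* assumed 4KB pages */
#define DEFAULT_ALIGN_2MB  (2ULL << 20)   /* fallback default */

/* note_align == 0 means the note did not specify an alignment. */
static uint64_t pick_load_align(uint64_t note_align, uint64_t max_p_align)
{
    if ( note_align )
        return note_align;                /* PHYS32_RELOC note value */

    if ( max_p_align >= PAGE_SIZE_SKETCH )
        return max_p_align;               /* maximum PHDR p_align */

    return DEFAULT_ALIGN_2MB;
}
```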

libelf-private.h includes common-macros.h to satisfy the fuzzer build.

Link: https://gitlab.com/xen-project/xen/-/issues/180
Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agolibelf: Store maximum PHDR p_align
Jason Andryuk [Mon, 8 Apr 2024 07:22:28 +0000 (09:22 +0200)]
libelf: Store maximum PHDR p_align

While parsing the PHDRs, store the maximum p_align value.  This may be
consulted for moving a PVH image's load address.

Signed-off-by: Jason Andryuk <jason.andryuk@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
12 months agoxen/rwlock: raise the number of possible cpus
Juergen Gross [Mon, 8 Apr 2024 07:21:41 +0000 (09:21 +0200)]
xen/rwlock: raise the number of possible cpus

The rwlock handling is limiting the number of cpus to 4095 today. The
main reason is the use of the atomic_t data type for the main lock
handling, which needs 2 bits for the locking state (writer waiting or
write locked), 12 bits for the id of a possible writer, and a 12 bit
counter for readers. The limit isn't 4096 due to an off by one sanity
check.

The atomic_t data type is 32 bits wide, so in theory 15 bits for the
writer's cpu id and 15 bits for the reader count seem to be fine, but
via read_trylock() more readers than cpus are possible.

This means that it is possible to raise the number of cpus to 16384
without changing the rwlock_t data structure. In order to avoid the
reader count wrapping to zero, don't let read_trylock() succeed in case
the highest bit of the reader's count is set already. This leaves enough
headroom for non-recursive readers to enter without risking a wrap.

While at it calculate _QW_CPUMASK and _QR_SHIFT from _QW_SHIFT and
add a sanity check for not overflowing the atomic_t data type.
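
A non-atomic sketch of such a layout, assuming 14 bits for the writer's cpu id
(16384 cpus), 2 state bits and the remaining 16 bits for the reader count.
The macro and function names are illustrative, and a real implementation would
operate atomically on the lock word.

```c
#include <stdbool.h>
#include <stdint.h>

#define QW_SHIFT_SKETCH   14                              /* writer cpu id bits */
#define QW_CPUMASK_SKETCH ((1U << QW_SHIFT_SKETCH) - 1)   /* writer cpu id mask */
#define QW_STATE_SKETCH   (3U << QW_SHIFT_SKETCH)         /* 2 lock state bits */
#define QR_SHIFT_SKETCH   (QW_SHIFT_SKETCH + 2)           /* reader count shift */
#define QR_BIAS_SKETCH    (1U << QR_SHIFT_SKETCH)         /* one reader */
#define QR_TOPBIT_SKETCH  (1U << 31)                      /* top reader bit */

/* Sanity check: the reader count must still fit in the 32-bit word. */
_Static_assert(QR_SHIFT_SKETCH < 32, "reader count must fit in 32 bits");

/* Refuse new readers while a writer is active or the top count bit is set. */
static bool read_trylock_sketch(uint32_t *cnts)
{
    if ( *cnts & (QW_STATE_SKETCH | QR_TOPBIT_SKETCH) )
        return false;

    *cnts += QR_BIAS_SKETCH;
    return true;
}
```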

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agoxen/spinlock: support higher number of cpus
Juergen Gross [Mon, 8 Apr 2024 07:21:13 +0000 (09:21 +0200)]
xen/spinlock: support higher number of cpus

Allow 16 bits per cpu number, which is the limit imposed by
spinlock_tickets_t.

This will allow up to 65535 cpus, while increasing only the size of
recursive spinlocks in debug builds from 8 to 12 bytes.

The current Xen limit of 4095 cpus is imposed by SPINLOCK_CPU_BITS
being 12. There are machines available with more cpus than the current
Xen limit, so it makes sense to have the possibility to use more cpus.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agoxen/spinlock: let all is_locked and trylock variants return bool
Juergen Gross [Mon, 8 Apr 2024 07:20:24 +0000 (09:20 +0200)]
xen/spinlock: let all is_locked and trylock variants return bool

Switch the remaining trylock and is_locked variants to return bool.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agoxen/spinlock: split recursive spinlocks from normal ones
Juergen Gross [Mon, 8 Apr 2024 07:19:34 +0000 (09:19 +0200)]
xen/spinlock: split recursive spinlocks from normal ones

Recursive and normal spinlocks are sharing the same data structure for
representation of the lock. This has two major disadvantages:

- it is not clear from the definition of a lock, whether it is intended
  to be used recursive or not, while a mixture of both usage variants
  needs to be

- in production builds (builds without CONFIG_DEBUG_LOCKS) the needed
  data size of an ordinary spinlock is 8 bytes instead of 4, due to the
  additional recursion data needed (associated with that the rwlock
  data is using 12 instead of only 8 bytes)

Fix that by introducing a struct spinlock_recursive for recursive
spinlocks only, and switch recursive spinlock functions to require
pointers to this new struct.

This allows correct usage to be checked at build time.
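
The idea of separate types can be sketched with two toy, single-threaded lock
structures.  The layouts and function names below are hypothetical and much
simpler than the real ones.

```c
#include <stdbool.h>

/* Hypothetical layouts; Xen's real structures differ. */
struct spinlock_sketch {
    bool held;
};

struct rspinlock_sketch {
    bool held;
    int owner;            /* -1 when nobody holds the lock */
    unsigned int depth;   /* recursion depth of the owner */
};

/* Plain lock: only accepts the plain type. */
static bool spin_trylock_sketch(struct spinlock_sketch *l)
{
    if ( l->held )
        return false;

    l->held = true;
    return true;
}

/* Recursive lock: the current owner may take it again. */
static bool rspin_trylock_sketch(struct rspinlock_sketch *l, int cpu)
{
    if ( l->held && l->owner != cpu )
        return false;

    l->held = true;
    l->owner = cpu;
    l->depth++;
    return true;
}
```

Passing a struct spinlock_sketch pointer to rspin_trylock_sketch() draws an
incompatible-pointer-type diagnostic from the compiler, which is the kind of
misuse that separate types expose.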

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agoxen/spinlock: add missing rspin_is_locked() and rspin_barrier()
Juergen Gross [Mon, 8 Apr 2024 07:18:40 +0000 (09:18 +0200)]
xen/spinlock: add missing rspin_is_locked() and rspin_barrier()

Add rspin_is_locked() and rspin_barrier() in order to prepare differing
spinlock_t and rspinlock_t types.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agoxen/spinlock: add another function level
Juergen Gross [Mon, 8 Apr 2024 07:17:47 +0000 (09:17 +0200)]
xen/spinlock: add another function level

Add another function level in spinlock.c hiding the spinlock_t layout
from the low level locking code.

This is done in preparation of introducing rspinlock_t for recursive
locks without having to duplicate all of the locking code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
12 months agoxen/spinlock: add explicit non-recursive locking functions
Juergen Gross [Mon, 8 Apr 2024 07:16:23 +0000 (09:16 +0200)]
xen/spinlock: add explicit non-recursive locking functions

In order to prepare a type-safe recursive spinlock structure, add
explicitly non-recursive locking functions to be used for non-recursive
locking of spinlocks, which are used recursively, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
13 months agoMISRA C Rule 17.1 states: "The features of `<stdarg.h>' shall not be used"
Simone Ballarin [Thu, 28 Mar 2024 10:29:35 +0000 (11:29 +0100)]
MISRA C Rule 17.1 states: "The features of `<stdarg.h>' shall not be used"

The Xen community wants to avoid using variadic functions except for
specific circumstances where it feels appropriate by strict code review.

Functions hypercall_create_continuation and hypercall_xlat_continuation
are internal helper functions made to break long running hypercalls into
multiple calls. They take a variable number of arguments depending on the
original hypercall they are trying to continue.

Add SAF deviations for the aforementioned functions.

Signed-off-by: Simone Ballarin <simone.ballarin@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
13 months agoMISRA C:2012 Rule 17.1 states: The features of `<stdarg.h>' shall not be used
Simone Ballarin [Thu, 28 Mar 2024 10:29:34 +0000 (11:29 +0100)]
MISRA C:2012 Rule 17.1 states: The features of `<stdarg.h>' shall not be used

The Xen community wants to avoid using variadic functions except for
specific circumstances where it feels appropriate by strict code review.

Add deviation for printf()-like functions.

Signed-off-by: Simone Ballarin <simone.ballarin@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
13 months agoautomation/eclair: add deviations for Rule 20.7
Nicola Vetrini [Fri, 29 Mar 2024 09:11:33 +0000 (10:11 +0100)]
automation/eclair: add deviations for Rule 20.7

These deviations deal with the following cases:
- macro arguments used directly as initializer list arguments;
- uses of the __config_enabled macro, that can't be brought
  into compliance without breaking its functionality;
- exclude files that are out of scope (efi headers and cpu_idle);
- uses of alternative_{call,vcall}[0-9] macros.

The existing configuration for R20.7 is reordered so that it matches the
cases listed in its documentation comment.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
13 months agoarm/public: address violations of MISRA C Rule 20.7
Nicola Vetrini [Fri, 29 Mar 2024 09:11:30 +0000 (10:11 +0100)]
arm/public: address violations of MISRA C Rule 20.7

MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.

No functional change.
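
To illustrate why the parentheses matter, consider the two hypothetical macros
below, which are not taken from the Xen headers.

```c
/* Hypothetical macros, not taken from the Xen headers. */
#define SCALE_UNSAFE(x) (x * 2)      /* parameter not parenthesised */
#define SCALE_SAFE(x)   ((x) * 2)    /* compliant with Rule 20.7 */

/* Expands to (a + b * 2): precedence changes the meaning. */
static int scale_unsafe(int a, int b)
{
    return SCALE_UNSAFE(a + b);
}

/* Expands to ((a + b) * 2): the argument is kept as one operand. */
static int scale_safe(int a, int b)
{
    return SCALE_SAFE(a + b);
}
```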

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
13 months agox86: Address MISRA Rule 13.6
Andrew Cooper [Tue, 2 Apr 2024 15:26:22 +0000 (16:26 +0100)]
x86: Address MISRA Rule 13.6

MISRA Rule 13.6 doesn't like having an expression in a sizeof() which
potentially has side effects, including function calls.

Address several violations by pulling the expression out into a local
variable.

No functional change.
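
A sketch of the transformation, using a hypothetical accessor get_item().
Note that in this toy example the call is now actually evaluated, whereas an
operand of sizeof is not, so the accessor is assumed to be side-effect free.

```c
#include <stddef.h>

/* Hypothetical object and accessor, for illustration only. */
struct item_sketch {
    int payload[4];
};

static struct item_sketch storage;

static struct item_sketch *get_item(void)
{
    return &storage;
}

static size_t payload_size(void)
{
    /* Before: sizeof(get_item()->payload), a function call inside sizeof. */
    const struct item_sketch *item = get_item();

    return sizeof(item->payload);
}
```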

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
13 months agox86/tsx: Cope with RTM_ALWAYS_ABORT vs RTM mismatch
Andrew Cooper [Wed, 3 Apr 2024 16:43:42 +0000 (17:43 +0100)]
x86/tsx: Cope with RTM_ALWAYS_ABORT vs RTM mismatch

It turns out there is something wonky on some but not all CPUs with
MSR_TSX_FORCE_ABORT.  The presence of RTM_ALWAYS_ABORT causes Xen to think
it's safe to offer HLE/RTM to guests, but in this case, XBEGIN instructions
genuinely #UD.

Spot this case and try to back out as cleanly as we can.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
13 months agochar: lpuart: Drop useless variables from UART structure
Michal Orzel [Thu, 4 Apr 2024 07:51:43 +0000 (09:51 +0200)]
char: lpuart: Drop useless variables from UART structure

These variables are useless. They are being assigned a value which is
never used since UART is expected to be pre-configured.

No functional change.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>