xenbits.xensource.com Git - xen.git/log
11 months ago vpci/header: emulate PCI_COMMAND register for guests
Oleksandr Andrushchenko [Thu, 23 May 2024 08:18:04 +0000 (10:18 +0200)]
vpci/header: emulate PCI_COMMAND register for guests

Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the
guest's (domU) view of this will want to be zero (for now), the host
having set it to 1 should be preserved, or else we'd effectively be
giving the domU control of the bit. Thus, PCI_COMMAND register needs
proper emulation in order to honor host's settings.

According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
Device Control" the reset state of the command register is typically 0,
so when assigning a PCI device use 0 as the initial state for the
guest's (domU) view of the command register.

Here is the full list of command register bits with notes about
PCI/PCIe specification, and how Xen handles the bit. QEMU's behavior is
also documented here since that is our current reference implementation
for PCI passthrough.

PCI_COMMAND_IO (bit 0)
  PCIe 6.1: RW
  PCI LB 3.0: RW
  QEMU: (emu_mask) QEMU provides an emulated view of this bit. Guest
    writes do not propagate to hardware. QEMU sets this bit to 1 in
    hardware if an I/O BAR is exposed to the guest.
  Xen domU: (rsvdp_mask) We treat this bit as RsvdP for now since we
    don't yet support I/O BARs for domUs.
  Xen dom0: We allow dom0 to control this bit freely.

PCI_COMMAND_MEMORY (bit 1)
  PCIe 6.1: RW
  PCI LB 3.0: RW
  QEMU: (emu_mask) QEMU provides an emulated view of this bit. Guest
    writes do not propagate to hardware. QEMU sets this bit to 1 in
    hardware if a Memory BAR is exposed to the guest.
  Xen domU/dom0: We handle writes to this bit by mapping/unmapping BAR
    regions.
  Xen domU: For devices assigned to DomUs, memory decoding will be
    disabled at the time of initialization.

PCI_COMMAND_MASTER (bit 2)
  PCIe 6.1: RW
  PCI LB 3.0: RW
  QEMU: Pass through writes to hardware.
  Xen domU/dom0: Pass through writes to hardware.

PCI_COMMAND_SPECIAL (bit 3)
  PCIe 6.1: RO, hardwire to 0
  PCI LB 3.0: RW
  QEMU: Pass through writes to hardware.
  Xen domU/dom0: Pass through writes to hardware.

PCI_COMMAND_INVALIDATE (bit 4)
  PCIe 6.1: RO, hardwire to 0
  PCI LB 3.0: RW
  QEMU: Pass through writes to hardware.
  Xen domU/dom0: Pass through writes to hardware.

PCI_COMMAND_VGA_PALETTE (bit 5)
  PCIe 6.1: RO, hardwire to 0
  PCI LB 3.0: RW
  QEMU: Pass through writes to hardware.
  Xen domU/dom0: Pass through writes to hardware.

PCI_COMMAND_PARITY (bit 6)
  PCIe 6.1: RW
  PCI LB 3.0: RW
  QEMU: (emu_mask) QEMU provides an emulated view of this bit. Guest
    writes do not propagate to hardware.
  Xen domU: (rsvdp_mask) We treat this bit as RsvdP.
  Xen dom0: We allow dom0 to control this bit freely.

PCI_COMMAND_WAIT (bit 7)
  PCIe 6.1: RO, hardwire to 0
  PCI LB 3.0: hardwire to 0
  QEMU: res_mask
  Xen domU: (rsvdp_mask) We treat this bit as RsvdP.
  Xen dom0: We allow dom0 to control this bit freely.

PCI_COMMAND_SERR (bit 8)
  PCIe 6.1: RW
  PCI LB 3.0: RW
  QEMU: (emu_mask) QEMU provides an emulated view of this bit. Guest
    writes do not propagate to hardware.
  Xen domU: (rsvdp_mask) We treat this bit as RsvdP.
  Xen dom0: We allow dom0 to control this bit freely.

PCI_COMMAND_FAST_BACK (bit 9)
  PCIe 6.1: RO, hardwire to 0
  PCI LB 3.0: RW
  QEMU: (emu_mask) QEMU provides an emulated view of this bit. Guest
    writes do not propagate to hardware.
  Xen domU: (rsvdp_mask) We treat this bit as RsvdP.
  Xen dom0: We allow dom0 to control this bit freely.

PCI_COMMAND_INTX_DISABLE (bit 10)
  PCIe 6.1: RW
  PCI LB 3.0: RW
  QEMU: (emu_mask) QEMU provides an emulated view of this bit. Guest
    writes do not propagate to hardware. QEMU checks if INTx was mapped
    for a device. If it is not, then guest can't control
    PCI_COMMAND_INTX_DISABLE bit.
  Xen domU: We prohibit a guest from enabling INTx if MSI(X) is enabled.
  Xen dom0: We allow dom0 to control this bit freely.

Bits 11-15
  PCIe 6.1: RsvdP
  PCI LB 3.0: Reserved
  QEMU: res_mask
  Xen domU: rsvdp_mask
  Xen dom0: We allow dom0 to control these bits freely.
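A minimal sketch of how the domU rules above could combine a guest write with the host's value. This is not the code from the patch: the helper names and DOMU_RSVDP_MASK are hypothetical, the mask is simply collected from the bits listed as rsvdp_mask above, and the special handling of PCI_COMMAND_MEMORY (BAR mapping) and PCI_COMMAND_INTX_DISABLE (MSI(X) check) is left out.

```c
#include <stdint.h>

/* Bit values follow the command register layout listed above. */
#define PCI_COMMAND_IO            0x0001u
#define PCI_COMMAND_MEMORY        0x0002u
#define PCI_COMMAND_MASTER        0x0004u
#define PCI_COMMAND_PARITY        0x0040u
#define PCI_COMMAND_WAIT          0x0080u
#define PCI_COMMAND_SERR          0x0100u
#define PCI_COMMAND_FAST_BACK     0x0200u
#define PCI_COMMAND_RSVD_11_15    0xf800u

/* Hypothetical: bits a domU sees as RsvdP according to the list above. */
#define DOMU_RSVDP_MASK (PCI_COMMAND_IO | PCI_COMMAND_PARITY | \
                         PCI_COMMAND_WAIT | PCI_COMMAND_SERR | \
                         PCI_COMMAND_FAST_BACK | PCI_COMMAND_RSVD_11_15)

/*
 * Value to write to hardware: RsvdP bits keep the host's setting,
 * all other bits come from the guest's write.
 */
static uint16_t domu_cmd_to_hw(uint16_t host_cmd, uint16_t guest_write)
{
    return (uint16_t)((host_cmd & DOMU_RSVDP_MASK) |
                      (guest_write & ~DOMU_RSVDP_MASK));
}

/* Value the guest reads back: RsvdP bits always read as zero. */
static uint16_t domu_cmd_guest_view(uint16_t guest_write)
{
    return (uint16_t)(guest_write & ~DOMU_RSVDP_MASK);
}
```

With the host having set PCI_COMMAND_SERR, a guest write of MEMORY | MASTER would reach hardware with SERR still set, while the guest keeps reading SERR as 0.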

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
11 months ago arm/vpci: honor access size when returning an error
Volodymyr Babchuk [Thu, 23 May 2024 08:17:30 +0000 (10:17 +0200)]
arm/vpci: honor access size when returning an error

A guest can read config space using different access sizes: 8, 16, 32
or 64 bits. We need to take this into account when returning an error
to the MMIO handler, otherwise it is possible to provide more data than
requested: for example, the guest issues an LDRB instruction to read one
byte, but we write 0xFFFFFFFFFFFFFFFF to the target register.
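The idea can be sketched as follows. The helper name is hypothetical and the access size is assumed to be given in bytes; the actual handler interface in Xen may look different.

```c
#include <assert.h>
#include <stdint.h>

/*
 * All-ones error value truncated to the width of the access.
 * size_bytes is assumed to be 1, 2, 4 or 8.
 */
static uint64_t error_value_for_access(unsigned int size_bytes)
{
    assert(size_bytes == 1 || size_bytes == 2 ||
           size_bytes == 4 || size_bytes == 8);

    /* Shift right instead of left so an 8-byte access never shifts by 64. */
    return ~UINT64_C(0) >> (64 - 8 * size_bytes);
}
```

A one-byte read then yields 0xFF rather than a full 64-bit pattern.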

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
Acked-by: Julien Grall <jgrall@amazon.com>
11 months ago x86: detect PIT aliasing on ports other than 0x4[0-3]
Jan Beulich [Thu, 23 May 2024 08:16:52 +0000 (10:16 +0200)]
x86: detect PIT aliasing on ports other than 0x4[0-3]

... in order to also deny Dom0 access through the alias ports (commonly
observed on Intel chipsets). Without this it is only giving the
impression of denying access to PIT. Unlike for CMOS/RTC, do detection
pretty early, to avoid disturbing normal operation later on (even if
typically we won't use much of the PIT).

Like for CMOS/RTC a fundamental assumption of the probing is that reads
from the probed alias port won't have side effects (beyond such that PIT
reads have anyway) in case it does not alias the PIT's.

As to the port 0x61 accesses: Unlike other accesses we do, this masks
off the top four bits (in addition to the bottom two ones), following
Intel chipset documentation saying that these (read-only) bits should
only be written with zero.
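As a sketch of the port 0x61 masking described above: clearing the top four bits and the bottom two bits leaves only bits 2 and 3 of the value read back. The names below are hypothetical, and the actual port I/O (reading the port, then writing the masked value) is left out so the snippet stays self-contained.

```c
#include <stdint.h>

/* Keep bits 3:2 only; bits 7:4 are read-only status, bits 1:0 get cleared. */
#define PORT61_KEEP_MASK 0x0cu

/* Value to write back to port 0x61, given the value just read from it. */
static uint8_t port61_value_to_write(uint8_t current)
{
    return (uint8_t)(current & PORT61_KEEP_MASK);
}
```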

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
11 months ago x86/PIT: supply and use #define-s
Jan Beulich [Thu, 23 May 2024 08:16:07 +0000 (10:16 +0200)]
x86/PIT: supply and use #define-s

Help reading of code programming the PIT by introducing constants for
control word, read back and latch commands, as well as status.
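To illustrate what such constants buy, here is a sketch based on the standard 8254 control word layout (bits 7:6 counter select, bits 5:4 access mode, bits 3:1 operating mode, bit 0 BCD). The macro and function names are made up for this example and are not the ones introduced by the patch.

```c
#include <stdint.h>

/* Hypothetical names; field layout is that of the 8254 control word. */
#define PIT_CW_SELECT(ch)      ((ch) << 6)   /* counter 0..2 */
#define PIT_CW_ACCESS_LATCH    (0u << 4)     /* counter latch command */
#define PIT_CW_ACCESS_LSB      (1u << 4)
#define PIT_CW_ACCESS_MSB      (2u << 4)
#define PIT_CW_ACCESS_LSB_MSB  (3u << 4)
#define PIT_CW_MODE(m)         ((m) << 1)    /* operating mode 0..5 */
#define PIT_CW_BCD             1u

/* Compose a binary-counting control word from its fields. */
static uint8_t pit_control_word(unsigned int channel, unsigned int access,
                                unsigned int mode)
{
    return (uint8_t)(PIT_CW_SELECT(channel & 3u) | access |
                     PIT_CW_MODE(mode & 7u));
}
```

For instance, pit_control_word(0, PIT_CW_ACCESS_LSB_MSB, 2) spells out the otherwise opaque constant 0x34.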

Requested-by: Jason Andryuk <jason.andryuk@amd.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
11 months ago xen/riscv: add required things to current.h
Oleksii Kurochko [Fri, 17 May 2024 13:54:58 +0000 (15:54 +0200)]
xen/riscv: add required things to current.h

Add the minimal required things to be able to build full Xen.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/riscv: introduce atomic.h
Oleksii Kurochko [Fri, 17 May 2024 13:54:55 +0000 (15:54 +0200)]
xen/riscv: introduce atomic.h

Initially the patch was introduced by Bobby, who took the header from
the Linux kernel.

The following changes were done on top of Bobby's changes:
 - atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
   to use __*xchg_generic()
 - drop casts in write_atomic() as they are unnecessary
 - drop introduction of WRITE_ONCE() and READ_ONCE().
   Xen provides ACCESS_ONCE()
 - remove zero-length array access in read_atomic()
 - drop defines similar to pattern:
   #define atomic_add_return_relaxed   atomic_add_return_relaxed
 - move not RISC-V specific functions to asm-generic/atomics-ops.h
 - drop atomic##prefix##_{cmp}xchg_{relaxed, acquire, release}() as they
   are not used in Xen.
 - update the definition of atomic##prefix##_{cmp}xchg according to
   {cmp}xchg() implementation in Xen.
 - some ATOMIC_OP() macros were updated:
   - drop size argument for ATOMIC_OP which defines atomic##prefix##_xchg()
     and atomic##prefix##_cmpxchg().
   - drop c_op argument for ATOMIC_OPS which defines ATOMIC_OPS(and, and),
     ATOMIC_OPS( or,  or), ATOMIC_OPS(xor, xor), ATOMIC_OPS(add, add, +),
     ATOMIC_OPS(sub, add, -) as c_op is always "+" for them.
   - drop "" from definition of __atomic_{acquire/release}_fence.

The current implementation follows the same approach as 8e86f0b409a4
("arm64: atomics: fix use of acquire + release for full barrier
semantics") [1].
RISC-V could combine acquire and release into the SC
instructions and it could reduce a fence instruction to gain better
performance. Here is related description from RISC-V ISA 10.2
Load-Reserved/Store-Conditional Instructions:

 - .aq:   The LR/SC sequence can be given acquire semantics by
          setting the aq bit on the LR instruction.
 - .rl:   The LR/SC sequence can be given release semantics by
          setting the rl bit on the SC instruction.
 - .aqrl: Setting the aq bit on the LR instruction, and setting
          both the aq and the rl bit on the SC instruction makes
          the LR/SC sequence sequentially consistent, meaning that
          it cannot be reordered with earlier or later memory
          operations from the same hart.

 Software should not set the rl bit on an LR instruction unless
 the aq bit is also set, nor should software set the aq bit on an
 SC instruction unless the rl bit is also set. LR.rl and SC.aq
 instructions are not guaranteed to provide any stronger ordering
 than those with both bits clear, but may result in lower
 performance.
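As a portable illustration only, the fully ordered LR/SC loop can be compared to a C11 compare-and-swap retry loop. This is a sketch, not the RISC-V implementation from the header: C11 memory orders are only a loose analogue of the .aq/.rl annotations, and the function name is hypothetical.

```c
#include <stdatomic.h>

/*
 * Fully ordered fetch-and-add built from a CAS retry loop.  The
 * load/compare-exchange pair plays the role of the lr/sc pair, and
 * memory_order_seq_cst on success stands in for the .aqrl annotation.
 */
static int fetch_add_fully_ordered(atomic_int *v, int delta)
{
    int old = atomic_load_explicit(v, memory_order_relaxed);

    /* On failure, 'old' is refreshed with the current value; retry. */
    while (!atomic_compare_exchange_weak_explicit(v, &old, old + delta,
                                                  memory_order_seq_cst,
                                                  memory_order_relaxed))
        ;

    return old;
}
```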

Also, the way of transforming ".rl + full barrier" to ".aqrl" was
approved by the author of the RVWMO spec [2].

[1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/1391516953-14541-1-git-send-email-will.deacon@arm.com/
[2] https://lore.kernel.org/linux-riscv/41e01514-74ca-84f2-f5cc-2645c444fd8e@nvidia.com/

Signed-off-by: Bobby Eshleman <bobbyeshleman@gmail.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/riscv: introduce cmpxchg.h
Oleksii Kurochko [Fri, 17 May 2024 13:54:54 +0000 (15:54 +0200)]
xen/riscv: introduce cmpxchg.h

The header was taken from Linux kernel 6.4.0-rc1.

Additionally, the following changes were made:
* add emulation of {cmp}xchg for 1/2 byte types using 32-bit atomic
  access.
* replace tabs with spaces
* replace __* variables with *__
* introduce generic version of xchg_* and cmpxchg_*.
* drop {cmp}xchg{release,relaxed,acquire} as Xen doesn't use them
* drop barriers and use instruction suffixes instead ( .aq, .rl, .aqrl )
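The 1/2 byte emulation can be sketched in portable C11 as follows. This is an illustration under stated assumptions, not the header's code: it uses a compare-and-swap loop on an aligned 32-bit word instead of LR/SC assembly, addresses the byte by its index within the word (0 being the least significant byte) rather than by pointer, and the function name is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Atomically exchange one byte inside an aligned 32-bit word using only
 * 32-bit atomic accesses.  Returns the previous value of that byte.
 */
static uint8_t xchg_u8_in_word(_Atomic uint32_t *word, unsigned int idx,
                               uint8_t new_val)
{
    unsigned int shift = 8 * (idx & 3);
    uint32_t mask = UINT32_C(0xff) << shift;
    uint32_t old = atomic_load_explicit(word, memory_order_relaxed);
    uint32_t repl;

    do {
        /* Replace only the selected byte, keep the other three intact. */
        repl = (old & ~mask) | ((uint32_t)new_val << shift);
    } while (!atomic_compare_exchange_weak_explicit(word, &old, repl,
                                                    memory_order_seq_cst,
                                                    memory_order_relaxed));

    return (uint8_t)((old & mask) >> shift);
}
```

The retry loop is what keeps concurrent updates to the neighbouring bytes from being lost.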

The implementation of the 4- and 8-byte cases was updated according to the spec:
```
              ....
Linux Construct         RVWMO AMO Mapping
    ...
atomic <op>             amo<op>.{w|d}.aqrl
Linux Construct         RVWMO LR/SC Mapping
    ...
atomic <op>             loop: lr.{w|d}.aq; <op>; sc.{w|d}.aqrl; bnez loop

Table A.5: Mappings from Linux memory primitives to RISC-V primitives

```

The current implementation follows the same approach as 8e86f0b409a4
("arm64: atomics: fix use of acquire + release for full barrier
semantics") [1].
RISC-V could combine acquire and release into the SC
instructions and it could reduce a fence instruction to gain better
performance. Here is related description from RISC-V ISA 10.2
Load-Reserved/Store-Conditional Instructions:

 - .aq:   The LR/SC sequence can be given acquire semantics by
          setting the aq bit on the LR instruction.
 - .rl:   The LR/SC sequence can be given release semantics by
          setting the rl bit on the SC instruction.
 - .aqrl: Setting the aq bit on the LR instruction, and setting
          both the aq and the rl bit on the SC instruction makes
          the LR/SC sequence sequentially consistent, meaning that
          it cannot be reordered with earlier or later memory
          operations from the same hart.

 Software should not set the rl bit on an LR instruction unless
 the aq bit is also set, nor should software set the aq bit on an
 SC instruction unless the rl bit is also set. LR.rl and SC.aq
 instructions are not guaranteed to provide any stronger ordering
 than those with both bits clear, but may result in lower
 performance.

Also, the way of transforming ".rl + full barrier" to ".aqrl" was
approved by the author of the RVWMO spec [2].

[1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/1391516953-14541-1-git-send-email-will.deacon@arm.com/
[2] https://lore.kernel.org/linux-riscv/41e01514-74ca-84f2-f5cc-2645c444fd8e@nvidia.com/

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/x86: Simplify header dependencies in x86/hvm
Alejandro Vallejo [Thu, 23 May 2024 08:07:31 +0000 (10:07 +0200)]
xen/x86: Simplify header dependencies in x86/hvm

Otherwise it's not possible to call functions described in hvm/vlapic.h from the
inline functions of hvm/hvm.h.

This is because a static inline in vlapic.h depends on hvm.h, and pulls it
transitively through vpt.h. The ultimate cause is having hvm.h included in any
of the "v*.h" headers, so break the cycle moving the guilty inline into hvm.h.

No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 months ago iommu/x86: print RMRR/IVMD ranges using full addresses
Roger Pau Monné [Thu, 23 May 2024 08:03:33 +0000 (10:03 +0200)]
iommu/x86: print RMRR/IVMD ranges using full addresses

It's easier to correlate with the physical memory map if the addresses are
fully printed, instead of using frame numbers.
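A rough sketch of the conversion involved, assuming 4 KiB pages and an inclusive frame range. The function name and the output format are made up for illustration and do not reproduce Xen's actual log lines.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/*
 * Format an inclusive frame-number range as full byte addresses.
 * Returns the snprintf() result.
 */
static int format_range(char *buf, size_t len,
                        uint64_t first_pfn, uint64_t last_pfn)
{
    uint64_t start = first_pfn << PAGE_SHIFT;
    /* Last byte of the last frame in the range. */
    uint64_t end = ((last_pfn + 1) << PAGE_SHIFT) - 1;

    return snprintf(buf, len, "[%#" PRIx64 ", %#" PRIx64 "]", start, end);
}
```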

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/livepatch: make .livepatch.funcs read-only for in-tree tests
Roger Pau Monné [Thu, 23 May 2024 08:03:14 +0000 (10:03 +0200)]
xen/livepatch: make .livepatch.funcs read-only for in-tree tests

This matches the flags of the .livepatch.funcs section when generated using
livepatch-build-tools, which only sets the SHT_ALLOC flag.

Also constify the definitions of the livepatch_func variables in the tests
themselves, in order to better match the resulting output.  Note that just
making those variables constant is not enough to force the generated sections
to be read-only.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
11 months ago x86_64/cpu_idle: address violations of MISRA C Rule 20.7
Nicola Vetrini [Tue, 21 May 2024 14:01:17 +0000 (16:01 +0200)]
x86_64/cpu_idle: address violations of MISRA C Rule 20.7

MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
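A small generic example of the problem the rule guards against; the macros below are illustrative and are not taken from the files touched by this patch.

```c
/* Violates Rule 20.7: the parameter is expanded without parentheses. */
#define DOUBLE_UNSAFE(x) (x * 2)

/* Compliant: the expanded parameter is enclosed in parentheses. */
#define DOUBLE_SAFE(x)   ((x) * 2)

/*
 * DOUBLE_UNSAFE(1 + 2) expands to (1 + 2 * 2), so operator precedence
 * changes the meaning of the argument.  DOUBLE_SAFE(1 + 2) expands to
 * ((1 + 2) * 2) as the caller intended.
 */
static int unsafe_result(void) { return DOUBLE_UNSAFE(1 + 2); }
static int safe_result(void)   { return DOUBLE_SAFE(1 + 2); }
```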

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago x86_64/uaccess: address violations of MISRA C Rule 20.7
Nicola Vetrini [Tue, 21 May 2024 14:00:47 +0000 (16:00 +0200)]
x86_64/uaccess: address violations of MISRA C Rule 20.7

MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.

xlat_malloc_init is touched for consistency, despite the construct
being already deviated.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago x86/hvm: address violations of MISRA C Rule 20.7
Nicola Vetrini [Tue, 21 May 2024 14:00:20 +0000 (16:00 +0200)]
x86/hvm: address violations of MISRA C Rule 20.7

MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago x86/vpmu: address violations of MISRA C Rule 20.7
Nicola Vetrini [Tue, 21 May 2024 13:59:50 +0000 (15:59 +0200)]
x86/vpmu: address violations of MISRA C Rule 20.7

MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/common/dt-overlay: Fix lock issue when add/remove the device
Henry Wang [Tue, 21 May 2024 13:59:14 +0000 (15:59 +0200)]
xen/common/dt-overlay: Fix lock issue when add/remove the device

If CONFIG_DEBUG=y, below assertion will be triggered:
(XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
(XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
[...]
(XEN) Xen call trace:
(XEN)    [<00000a0000257418>] iommu_remove_dt_device+0x8c/0xd4 (PC)
(XEN)    [<00000a00002573a0>] iommu_remove_dt_device+0x14/0xd4 (LR)
(XEN)    [<00000a000020797c>] dt-overlay.c#remove_node_resources+0x8c/0x90
(XEN)    [<00000a0000207f14>] dt-overlay.c#remove_nodes+0x524/0x648
(XEN)    [<00000a0000208460>] dt_overlay_sysctl+0x428/0xc68
(XEN)    [<00000a00002707f8>] arch_do_sysctl+0x1c/0x2c
(XEN)    [<00000a0000230b40>] do_sysctl+0x96c/0x9ec
(XEN)    [<00000a0000271e08>] traps.c#do_trap_hypercall+0x1e8/0x288
(XEN)    [<00000a0000273490>] do_trap_guest_sync+0x448/0x63c
(XEN)    [<00000a000025c480>] entry.o#guest_sync_slowpath+0xa8/0xd8
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'rw_is_locked(&dt_host_lock)' failed at drivers/passthrough/device_tree.c:146
(XEN) ****************************************

This is because iommu_remove_dt_device() is called without taking the
dt_host_lock. dt_host_lock is meant to ensure that the DT node will not
disappear behind our back. So fix the issue by taking the lock as soon
as we get hold of overlay_node.

Similar issue will be observed in adding the dtbo:
(XEN) Assertion 'system_state < SYS_STATE_active || rw_is_locked(&dt_host_lock)'
failed at xen-source/xen/drivers/passthrough/device_tree.c:192
(XEN) ----[ Xen-4.19-unstable  arm64  debug=y  Not tainted ]----
[...]
(XEN) Xen call trace:
(XEN)    [<00000a00002594f4>] iommu_add_dt_device+0x7c/0x17c (PC)
(XEN)    [<00000a0000259494>] iommu_add_dt_device+0x1c/0x17c (LR)
(XEN)    [<00000a0000267db4>] handle_device+0x68/0x1e8
(XEN)    [<00000a0000208ba8>] dt_overlay_sysctl+0x9d4/0xb84
(XEN)    [<00000a000027342c>] arch_do_sysctl+0x24/0x38
(XEN)    [<00000a0000231ac8>] do_sysctl+0x9ac/0xa34
(XEN)    [<00000a0000274b70>] traps.c#do_trap_hypercall+0x230/0x2dc
(XEN)    [<00000a0000276330>] do_trap_guest_sync+0x478/0x688
(XEN)    [<00000a000025e480>] entry.o#guest_sync_slowpath+0xa8/0xd8

This is because the lock is released too early. So fix the issue by
releasing the lock after handle_device().
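The locking pattern behind both fixes can be sketched with a toy lock. Everything below is a stand-in invented for illustration (the lock type, the helpers and the function names are not Xen's); it only shows that the callee asserts the lock is held, so the caller has to keep it across the whole call.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a write lock; not Xen's rwlock implementation. */
struct toy_lock { bool held; };

static struct toy_lock host_lock_sketch;

static void toy_lock_acquire(struct toy_lock *l)
{
    assert(!l->held);
    l->held = true;
}

static void toy_lock_release(struct toy_lock *l)
{
    assert(l->held);
    l->held = false;
}

/* Callee that, like the asserting functions above, requires the lock. */
static int remove_device_sketch(void)
{
    assert(host_lock_sketch.held);
    return 0;
}

/* Caller: take the lock first, release it only after the callee returns. */
static int remove_overlay_sketch(void)
{
    int rc;

    toy_lock_acquire(&host_lock_sketch);
    rc = remove_device_sketch();
    toy_lock_release(&host_lock_sketch);

    return rc;
}
```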

Fixes: 7e5c4a8b86f1 ("xen/arm: Implement device tree node removal functionalities")
Signed-off-by: Henry Wang <xin.wang2@amd.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
11 months ago x86/p2m: Add braces for better code clarity
Petr Beneš [Tue, 21 May 2024 07:16:25 +0000 (09:16 +0200)]
x86/p2m: Add braces for better code clarity

No functional change.

Signed-off-by: Petr Beneš <w1benny@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
11 months ago xen/riscv: introduce vm_event_*() functions
Oleksii Kurochko [Tue, 21 May 2024 07:16:02 +0000 (09:16 +0200)]
xen/riscv: introduce vm_event_*() functions

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
11 months ago xen/riscv: introduce monitor.h
Oleksii Kurochko [Tue, 21 May 2024 07:15:37 +0000 (09:15 +0200)]
xen/riscv: introduce monitor.h

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
11 months ago xen/x86: pretty print interrupt CPU affinity masks
Roger Pau Monné [Tue, 21 May 2024 07:15:03 +0000 (09:15 +0200)]
xen/x86: pretty print interrupt CPU affinity masks

Print the CPU affinity masks as numeric ranges instead of plain hexadecimal
bitfields.
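To show the difference in output, here is a standalone sketch that renders a 64-bit mask as numeric ranges. It is not the formatting code Xen uses; the function name and the 64-CPU limit are assumptions of this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Render the set bits of 'mask' as comma-separated ranges, e.g. "0-3,8".
 * Returns the number of characters written, excluding the terminator.
 * Output is cut at a range boundary if the buffer is too small.
 */
static size_t mask_to_ranges(uint64_t mask, char *buf, size_t len)
{
    size_t off = 0;
    unsigned int bit = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    while (bit < 64) {
        unsigned int first, last;
        int n;

        if (!((mask >> bit) & 1)) {
            bit++;
            continue;
        }

        /* Find the end of this run of consecutive set bits. */
        first = bit;
        while (bit < 64 && ((mask >> bit) & 1))
            bit++;
        last = bit - 1;

        if (first == last)
            n = snprintf(buf + off, len - off, "%s%u",
                         off ? "," : "", first);
        else
            n = snprintf(buf + off, len - off, "%s%u-%u",
                         off ? "," : "", first, last);

        if (n < 0 || (size_t)n >= len - off) {
            buf[off] = '\0';  /* drop the partially written range */
            break;
        }
        off += (size_t)n;
    }

    return off;
}
```

A mask printed as 0x10f in hexadecimal would come out as "0-3,8".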

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
11 months ago xen/trace: Drop old trace API
Andrew Cooper [Mon, 20 Sep 2021 12:40:21 +0000 (13:40 +0100)]
xen/trace: Drop old trace API

With all users updated to the new API, drop the old API.  This includes all of
asm/hvm/trace.h, which allows us to drop some includes.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@cloud.com>
11 months ago xen/trace: Removal final {__,}trace_var() users in favour of the new API
Andrew Cooper [Tue, 21 Sep 2021 18:55:47 +0000 (19:55 +0100)]
xen/trace: Removal final {__,}trace_var() users in favour of the new API

The cycles parameter (which gets removed as a consequence) determines whether
trace() or trace_time() is used.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@cloud.com>
11 months ago xen: Switch to new TRACE() API
Andrew Cooper [Fri, 17 Sep 2021 23:31:27 +0000 (00:31 +0100)]
xen: Switch to new TRACE() API

(Almost) no functional change.

 * In irq_move_cleanup_interrupt(), use the 'me' local variable rather than
   calling smp_processor_id() again.  This manifests as a minor code
   improvement.
 * In vlapic_update_timer() and lapic_rearm(), introduce a new 'timer_period'
   local variable to simplify the expressions used for both the trace and
   create_periodic_time() calls.

All other differences in the compiled binary are to do with line numbers
changing.

Some conversion notes:
 * HVMTRACE_LONG_[234]D() and TRACE_2_LONG_[234]D() were latently buggy.  They
   blindly discard extra parameters, but luckily no users are impacted.  They
   are also obfuscated wrappers, depending on exactly one or two parameters
   being TRC_PAR_LONG() to compile successfully.
 * HVMTRACE_LONG_1D() behaves unlike its named companions, and takes exactly
   one 64bit parameter which it splits manually.  Its one user,
   vmx_cr_access()'s LMSW path, gets adjusted.
 * TRACE_?D() and TRACE_2_LONG_*() change to TRACE_TIME() as cycles is always
   enabled.
 * HVMTRACE_ND() is opencoded for VMENTRY/VMEXIT records to include cycles.
   These are converted to TRACE_TIME(), with the old modifier parameter
   expressed as an OR at the callsite.  One callsite, svm_vmenter_helper() had
   a nested tb_init_done check, which is dropped.  (The optimiser also spotted
   this, which is why it doesn't manifest as a binary difference.)
 * All uses of *LONG() are either opencoded or swapped to using a struct, to
   avoid MISRA issues.
 * All HVMTRACE_?D() change to TRACE() as cycles is explicitly skipped.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@cloud.com>
11 months ago xen/sched: Clean up trace handling
Andrew Cooper [Mon, 20 Sep 2021 13:07:43 +0000 (14:07 +0100)]
xen/sched: Clean up trace handling

There is no need for bitfields anywhere - use more sensible types.  There is
also no need to cast 'd' to (unsigned char *) before passing it to a function
taking void *.  Switch to new trace_time() API.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@cloud.com>
11 months ago xen/rt: Clean up trace handling
Andrew Cooper [Fri, 17 Sep 2021 15:28:19 +0000 (16:28 +0100)]
xen/rt: Clean up trace handling

Most uses of bitfields and __packed are unnecessary.  There is also no need to
cast 'd' to (unsigned char *) before passing it to a function taking void *.
Switch to new trace_time() API.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@cloud.com>
11 months ago xen/credit2: Clean up trace handling
Andrew Cooper [Wed, 15 Sep 2021 16:01:43 +0000 (17:01 +0100)]
xen/credit2: Clean up trace handling

There is no need for bitfields anywhere - use types with an explicit width
instead.  There is also no need to cast 'd' to (unsigned char *) before
passing it to a function taking void *.  Switch to new trace_time() API.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
11 months ago xen/trace: Introduce new API
Andrew Cooper [Mon, 20 Sep 2021 12:36:12 +0000 (13:36 +0100)]
xen/trace: Introduce new API

trace() and trace_time(), in function form for struct arguments, and macro
form for simple uint32_t list arguments.
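The two forms could look roughly like this. It is a sketch only: trace_sketch(), TRACE_SKETCH() and the capture buffer are hypothetical stand-ins rather than Xen's actual definitions, and the macro is assumed to be called with at least one argument.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sink that just remembers the last record. */
static uint32_t last_event;
static size_t last_size;
static uint32_t last_data[8];

/* Function form: takes an arbitrary blob, typically a struct. */
static void trace_sketch(uint32_t event, size_t size, const void *data)
{
    last_event = event;
    last_size = size > sizeof(last_data) ? sizeof(last_data) : size;
    memcpy(last_data, data, last_size);
}

/* Macro form: packs a plain list of uint32_t values into an array. */
#define TRACE_SKETCH(event, ...)                        \
    do {                                                \
        const uint32_t args_[] = { __VA_ARGS__ };       \
        trace_sketch(event, sizeof(args_), args_);      \
    } while (0)

/* Example record for the function form, using explicit-width fields. */
struct switch_rec_sketch {
    uint32_t from_id;
    uint32_t to_id;
};
```

A caller would write TRACE_SKETCH(0x1234, 7, 9) for simple values, or fill a struct switch_rec_sketch and pass it with sizeof() to trace_sketch().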

This will be used to clean up the mess of macros which exists throughout the
codebase, as well as eventually dropping __trace_var().

There is intentionally no macro to split a 64-bit parameter in the new API,
for MISRA reasons.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@cloud.com>
11 months ago tools/xen-cpuid: Drop old names
Roger Pau Monné [Thu, 2 May 2024 11:49:22 +0000 (13:49 +0200)]
tools/xen-cpuid: Drop old names

Not used any more.  Split out of previous patch to aid legibility.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
11 months ago tools/xen-cpuid: Use automatically generated feature names
Roger Pau Monné [Thu, 2 May 2024 11:49:22 +0000 (12:49 +0100)]
tools/xen-cpuid: Use automatically generated feature names

Have gen-cpuid.py write out INIT_FEATURE_VAL_TO_NAME, derived from the same
data source as INIT_FEATURE_NAME_TO_VAL, although both aliases of common_1d
are needed.

In xen-cpuid.c, sanity check at build time that leaf_info[] and
feature_names[] are of sensible length.

As dump_leaf() rendered missing names as numbers, always dump leaves even if
we don't have the leaf name.  This conversion was arguably missed in commit
59afdb8a81d6 ("tools/misc: Tweak reserved bit handling for xen-cpuid").

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
11 months ago tools/xen-cpuid: Rename decodes[] to leaf_info[]
Roger Pau Monné [Thu, 2 May 2024 11:49:22 +0000 (12:49 +0100)]
tools/xen-cpuid: Rename decodes[] to leaf_info[]

Split out of subsequent patch to aid legibility.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
11 months ago x86/gen-cpuid: Minor cleanup
Andrew Cooper [Fri, 10 May 2024 19:04:51 +0000 (20:04 +0100)]
x86/gen-cpuid: Minor cleanup

Rename INIT_FEATURE_NAMES to INIT_FEATURE_NAME_TO_VAL as we're about to gain an
inverse mapping of the same thing.

Use dict.items() unconditionally.  iteritems() is a marginal perf optimisation
for Python2 only, and simply not worth the effort on a script this small.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
11 months ago tools/golang: Add missing golang bindings for vlan
Henry Wang [Mon, 20 May 2024 08:21:45 +0000 (16:21 +0800)]
tools/golang: Add missing golang bindings for vlan

It is noticed that commit:
3bc14e4fa4b9 ("tools/libs/light: Add vlan field to libxl_device_nic")
introduces a new "vlan" string field to libxl_device_nic. But the
golang bindings are missing. Add it in this patch.

Fixes: 3bc14e4fa4b9 ("tools/libs/light: Add vlan field to libxl_device_nic")
Signed-off-by: Henry Wang <xin.wang2@amd.com>
Acked-by: George Dunlap <george.dunlap@cloud.com>
11 months ago x86/msi: prevent watchdog triggering when dumping MSI state
Roger Pau Monné [Fri, 17 May 2024 13:56:05 +0000 (15:56 +0200)]
x86/msi: prevent watchdog triggering when dumping MSI state

Use the same check that's used in dump_irqs().

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago include/ctype.h: fix MISRA R10.2 violation
Stefano Stabellini [Wed, 15 May 2024 22:52:04 +0000 (15:52 -0700)]
include/ctype.h: fix MISRA R10.2 violation

The value returned by __toupper is used in arithmetic operations causing
MISRA C 10.2 violations. Cast to plain char in the toupper macro. Also
do the same in tolower for consistency.
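The shape of such a change can be sketched as below. The names are hypothetical stand-ins for the ctype.h helpers, and the snippet only illustrates where the cast to plain char sits; it makes no claim about MISRA compliance of the helper itself.

```c
/* Hypothetical helper standing in for the internal conversion routine. */
static inline unsigned char toupper_helper_sketch(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (unsigned char)(c - ('a' - 'A')) : c;
}

/*
 * The macro casts the helper's result to plain char, so callers that
 * combine the result with other character values see a char type.
 */
#define toupper_sketch(c) \
    ((char)toupper_helper_sketch((unsigned char)(c)))
```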

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/arm: Add DT reserve map regions to bootinfo.reserved_mem
Luca Fancellu [Thu, 25 Apr 2024 13:11:18 +0000 (14:11 +0100)]
xen/arm: Add DT reserve map regions to bootinfo.reserved_mem

Currently the code lists device tree reserve map regions as reserved
memory for Xen, but they are not added to bootinfo.reserved_mem, and
they are fetched in multiple places using the same code sequence,
causing duplication. Fix this by adding them to bootinfo.reserved_mem
at an early stage.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
11 months ago xen/arm64: lib: Use the generic xen/linkage.h macros
Edgar E. Iglesias [Sat, 4 May 2024 11:55:14 +0000 (13:55 +0200)]
xen/arm64: lib: Use the generic xen/linkage.h macros

Use the generic xen/linkage.h macros to annotate code symbols.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months ago xen/arm64: cache: Use the generic xen/linkage.h macros
Edgar E. Iglesias [Sat, 4 May 2024 11:55:13 +0000 (13:55 +0200)]
xen/arm64: cache: Use the generic xen/linkage.h macros

Use the generic xen/linkage.h macros to annotate code symbols.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months ago xen/arm64: mmu/head: Add missing code symbol annotations
Edgar E. Iglesias [Sat, 4 May 2024 11:55:12 +0000 (13:55 +0200)]
xen/arm64: mmu/head: Add missing code symbol annotations

Use the generic xen/linkage.h macros to annotate code symbols
and add missing annotations.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months ago xen/arm64: bpi: Add missing code symbol annotations
Edgar E. Iglesias [Sat, 4 May 2024 11:55:11 +0000 (13:55 +0200)]
xen/arm64: bpi: Add missing code symbol annotations

Use the generic xen/linkage.h macros to annotate code symbols
and add missing annotations.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoxen/arm64: debug: Add missing code symbol annotations
Edgar E. Iglesias [Sat, 4 May 2024 11:55:10 +0000 (13:55 +0200)]
xen/arm64: debug: Add missing code symbol annotations

Use the generic xen/linkage.h macros to annotate code symbols
and add missing annotations.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoxen/arm64: head: Add missing code symbol annotations
Edgar E. Iglesias [Sat, 4 May 2024 11:55:09 +0000 (13:55 +0200)]
xen/arm64: head: Add missing code symbol annotations

Use the generic xen/linkage.h macros to annotate code symbols
and add missing annotations.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoxen/arm64: sve: Add missing code symbol annotations
Edgar E. Iglesias [Sat, 4 May 2024 11:55:08 +0000 (13:55 +0200)]
xen/arm64: sve: Add missing code symbol annotations

Use the generic xen/linkage.h macros to annotate code symbols
and add missing annotations.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoxen/arm64: smc: Add missing code symbol annotations
Edgar E. Iglesias [Sat, 4 May 2024 11:55:07 +0000 (13:55 +0200)]
xen/arm64: smc: Add missing code symbol annotations

Use the generic xen/linkage.h macros to annotate code symbols
and add missing annotations.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoxen/arm64: entry: Add missing code symbol annotations
Edgar E. Iglesias [Sat, 4 May 2024 11:55:06 +0000 (13:55 +0200)]
xen/arm64: entry: Add missing code symbol annotations

Use the generic xen/linkage.h macros to annotate code symbols
and add missing annotations.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@amd.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agox86/ucode: Further fixes to identify "ucode already up to date"
Andrew Cooper [Thu, 16 May 2024 11:09:39 +0000 (12:09 +0100)]
x86/ucode: Further fixes to identify "ucode already up to date"

When the revision in hardware is newer than anything Xen has to hand,
'microcode_cache' isn't set up.  Then, `xen-ucode` initiates the update
because it doesn't know whether the revisions across the system are symmetric
or not.  This involves the patch getting all the way into the
apply_microcode() hooks before being found to be too old.

This is all a giant mess and needs an overhaul, but in the short term simply
adjust the apply_microcode() to return -EEXIST.

Also, unconditionally print the preexisting microcode revision on boot.  It's
relevant information which is otherwise unavailable if Xen doesn't find new
microcode to use.

Fixes: 648db37a155a ("x86/ucode: Distinguish "ucode already up to date"")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
11 months agox86/p2m: move altp2m-related code to separate file
Sergiy Kibrik [Thu, 16 May 2024 11:36:22 +0000 (13:36 +0200)]
x86/p2m: move altp2m-related code to separate file

Move altp2m code from the generic p2m.c file to altp2m.c, so that it is kept
separate and can potentially be disabled in the build. We may want to disable
it when building only for a specific platform that doesn't support alternate
p2m.

No functional change intended.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months agox86/MCE: guard {intel/amd}_mcheck_init() calls
Sergiy Kibrik [Thu, 16 May 2024 11:35:54 +0000 (13:35 +0200)]
x86/MCE: guard {intel/amd}_mcheck_init() calls

Guard calls to CPU-specific mcheck init routines in common MCE code
using new INTEL/AMD config options.

The purpose is to avoid building platform-specific mcheck code, and calls to
it, when that platform is disabled in the config.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months agox86/MCE: guard access to Intel/AMD-specific MCA MSRs
Sergiy Kibrik [Thu, 16 May 2024 11:35:34 +0000 (13:35 +0200)]
x86/MCE: guard access to Intel/AMD-specific MCA MSRs

Add build-time checks for newly introduced INTEL/AMD config options when
calling vmce_{intel/amd}_{rdmsr/wrmsr}() routines.
This way platform-specific code can be omitted from the vmce code when that
platform is disabled in the config.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months agox86/vpmu: separate amd/intel vPMU code
Sergiy Kibrik [Thu, 16 May 2024 11:34:54 +0000 (13:34 +0200)]
x86/vpmu: separate amd/intel vPMU code

Build AMD vPMU when CONFIG_AMD is on, and Intel vPMU when CONFIG_INTEL
is on, respectively, allowing for a platform-specific build.

No functional change intended.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months agoxen/bitops: put __ffs() into linux compatible header
Oleksii Kurochko [Thu, 16 May 2024 08:08:37 +0000 (10:08 +0200)]
xen/bitops: put __ffs() into linux compatible header

The mentioned macros exist only for Linux compatibility.

The patch defines __ffs() in terms of Xen bitops. Defining it this way
(as ffsl() - 1) is safe because __ffs() was previously defined as
__builtin_ctzl(x), which has undefined behavior when x=0, so it is
assumed that such cases are not encountered in the current code.

To avoid including <xen/linux-compat.h> in Xen library files, __ffs() and
__ffz() are defined locally in find-next-bit.c.

Apart from the __ffs() usage in find-next-bit.c, only one use of __ffs()
remains, in smmu-v3.c. It seems that __ffs() could be changed to ffsl(x)-1
in this file, but to keep smmu-v3.c close to Linux it was decided just
to define __ffs() in xen/linux-compat.h and include it in smmu-v3.c.
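
As a rough illustration of the equivalence relied on here, the sketch below
compares an ffsl()-style definition with __builtin_ctzl(). The name
ffs_compat_sketch() is hypothetical, and the GCC/Clang builtin
__builtin_ffsl() is only assumed to behave like Xen's ffsl(); this is not
the code from the patch.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a Linux-style __ffs(): 0-based index of the
 * least significant set bit.  __builtin_ffsl() returns a 1-based index
 * (and 0 for x == 0), so subtracting 1 matches __builtin_ctzl() for any
 * non-zero x.
 */
static inline unsigned long ffs_compat_sketch(unsigned long x)
{
    assert(x != 0); /* x == 0 is outside the contract, as with ctzl */
    return (unsigned long)__builtin_ffsl((long)x) - 1UL;
}
```

For any non-zero input the result is the same as __builtin_ctzl(x); only the
x == 0 case differs, which is why the commit treats that case as not
occurring.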

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
Acked-by: Rahul Singh <rahul.singh@arm.com>
11 months agox86: detect PIC aliasing on ports other than 0x[2A][01]
Jan Beulich [Thu, 16 May 2024 08:03:16 +0000 (10:03 +0200)]
x86: detect PIC aliasing on ports other than 0x[2A][01]

... in order to also deny Dom0 access through the alias ports (commonly
observed on Intel chipsets). Without this it is only giving the
impression of denying access to both PICs. Unlike for CMOS/RTC, do
detection very early, to avoid disturbing normal operation later on.

Like for CMOS/RTC a fundamental assumption of the probing is that reads
from the probed alias port won't have side effects in case it does not
alias the respective PIC's one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
11 months agox86: allow to suppress port-alias probing
Jan Beulich [Thu, 16 May 2024 08:02:34 +0000 (10:02 +0200)]
x86: allow to suppress port-alias probing

By default there's already no use for this when we run in shim mode.
Plus there may also be a need to suppress the probing in case of issues
with it. Before introducing further port alias probing, introduce a
command line option allowing to bypass it, default it to on when in shim
mode, and gate RTC/CMOS port alias probing on it.

Requested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
11 months agoautomation/eclair_analysis: deviate macro count_args_ for MISRA Rule 20.7
Nicola Vetrini [Tue, 23 Apr 2024 15:12:45 +0000 (17:12 +0200)]
automation/eclair_analysis: deviate macro count_args_ for MISRA Rule 20.7

The count_args_ macro violates Rule 20.7, but it can't be made
compliant with Rule 20.7 without breaking its functionality. Since
it's very unlikely for this macro to be misused, it is deviated.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoautomation/eclair_analysis: fully deviate MISRA C Rules 21.9 and 21.10
Nicola Vetrini [Wed, 15 May 2024 07:51:59 +0000 (09:51 +0200)]
automation/eclair_analysis: fully deviate MISRA C Rules 21.9 and 21.10

These rules are concerned with the use of facilities provided by the
C Standard Library (qsort, bsearch for rule 21.9, and those provided
by <time.h> for rule 21.10).

Xen provides in its source code its own implementation of some of these
functions and macros, therefore a justification is provided for allowing
uses of these functions in the project.

The rules are also marked as clean as a consequence.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agox86/mtrr: avoid system wide rendezvous when setting AP MTRRs
Roger Pau Monne [Mon, 13 May 2024 08:59:25 +0000 (10:59 +0200)]
x86/mtrr: avoid system wide rendezvous when setting AP MTRRs

There's no point in forcing a system wide update of the MTRRs on all processors
when there are no changes to be propagated.  On AP startup it's only the AP
that needs to write the system wide MTRR values in order to match the rest of
the already online CPUs.

We have occasionally seen the watchdog trigger during `xen-hptool cpu-online`
in one Intel Cascade Lake box with 448 CPUs due to the re-setting of the MTRRs
on all the CPUs in the system.

While there adjust the comment to clarify why the system-wide resetting of the
MTRR registers is not needed for the purposes of mtrr_ap_init().

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months agotools/xl: add vlan keyword to vif option
Leigh Brown [Wed, 8 May 2024 21:38:21 +0000 (22:38 +0100)]
tools/xl: add vlan keyword to vif option

Update parse_nic_config() to support a new `vlan' keyword. This
keyword specifies the VLAN configuration to assign to the VIF when
attaching it to the bridge port, on operating systems that support
the capability (e.g. Linux). The vlan keyword will allow one or
more VLANs to be configured on the VIF when adding it to the bridge
port. This will be done by the vif-bridge script and functions.

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
11 months agotools/libs/light: Add vlan field to libxl_device_nic
Leigh Brown [Wed, 8 May 2024 21:38:20 +0000 (22:38 +0100)]
tools/libs/light: Add vlan field to libxl_device_nic

Add `vlan' string field to libxl_device_nic, to allow a VLAN
configuration to be specified for the VIF when adding it to the
bridge device.

Update libxl_nic.c to read and write the vlan field from the
xenstore.

This provides the capability for supported operating systems (e.g.
Linux) to perform VLAN filtering on bridge ports.  The Xen
hotplug scripts need to be updated to read this information from
the xenstore and perform the required configuration.

Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
11 months agotools/xentop: Fix cpu% sort order
Leigh Brown [Tue, 14 May 2024 08:13:44 +0000 (09:13 +0100)]
tools/xentop: Fix cpu% sort order

In compare_cpu_pct(), there is a double -> unsigned long long conversion when
calling compare().  In C, this discards the fractional part, resulting in an
out-of-order sorting such as:

        NAME  STATE   CPU(sec) CPU(%)
       xendd --b---       4020    5.7
    icecream --b---       2600    3.8
    Domain-0 -----r       1060    1.5
        neon --b---        827    1.1
      cheese --b---        225    0.7
       pizza --b---        359    0.5
     cassini --b---        490    0.4
     fusilli --b---        159    0.2
         bob --b---        502    0.2
     blender --b---        121    0.2
       bread --b---         69    0.1
    chickpea --b---         67    0.1
      lentil --b---         67    0.1

Introduce compare_dbl() function and update compare_cpu_pct() to call it.
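
A minimal sketch of the problem and the fix, assuming a three-way comparison
helper; all function names below are hypothetical and this is not the xentop
source:

```c
#include <assert.h>

/* Three-way comparison on integers (stand-in for a generic helper). */
static int compare_ull_sketch(unsigned long long a, unsigned long long b)
{
    return (a > b) - (a < b);
}

/* Truncating path: the fractional part is dropped before comparing. */
static int compare_pct_truncating(double a, double b)
{
    return compare_ull_sketch((unsigned long long)a, (unsigned long long)b);
}

/* Comparison performed directly on doubles, keeping the fractional part. */
static int compare_dbl_sketch(double a, double b)
{
    return (a > b) - (a < b);
}

static void compare_selfcheck(void)
{
    /* 0.7 and 0.5 both truncate to 0, so they look equal ... */
    assert(compare_pct_truncating(0.7, 0.5) == 0);
    /* ... whereas comparing as doubles orders them correctly. */
    assert(compare_dbl_sketch(0.7, 0.5) == 1);
}
```

This matches the listing above, where entries below 1.0% end up in arbitrary
relative order.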

Fixes: 49839b535b78 ("Add xenstat framework.")
Signed-off-by: Leigh Brown <leigh@solinno.co.uk>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months agotools/hvmloader: Further simplify SMP setup
Andrew Cooper [Thu, 9 May 2024 17:40:11 +0000 (18:40 +0100)]
tools/hvmloader: Further simplify SMP setup

Now that we're using hypercalls to start APs, we can replace the 'ap_cpuid'
global with a regular function parameter.  This requires telling the compiler
that we'd like the parameter in a register rather than on the stack.

While adjusting, rename to cpu_setup().  It's always been used on the BSP,
making the name ap_start() specifically misleading.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
11 months agox86/cpufreq: Rename cpuid variable/parameters to cpu
Andrew Cooper [Sat, 11 May 2024 18:25:00 +0000 (19:25 +0100)]
x86/cpufreq: Rename cpuid variable/parameters to cpu

Various functions have a parameter or local variable called cpuid, but this
triggers a MISRA R5.3 violation because we also have a function called cpuid()
which wraps the real CPUID instruction.

In all these cases, it's a Xen cpu index, which is far more commonly named
just cpu in our code.

While adjusting these, fix a couple of other issues:

 * cpufreq_cpu_init() is on the end of a hypercall (with in-memory parameters,
   even), making EFAULT the wrong error to use.  Use EOPNOTSUPP instead.

 * check_est_cpu() is wrong to tie EIST to just Intel, and nowhere else using
   EIST makes this restriction.  Just check the feature itself, which is more
   succinctly done after being folded into its single caller.

 * In powernow_cpufreq_update(), replace an opencoded cpu_online().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 months agox86: respect mapcache_domain_init() failing
Jan Beulich [Wed, 15 May 2024 13:35:15 +0000 (15:35 +0200)]
x86: respect mapcache_domain_init() failing

The function itself properly handles and hands onwards failure from
create_perdomain_mapping(). Therefore its caller should respect possible
failure, too.

Fixes: 4b28bf6ae90b ("x86: re-introduce map_domain_page() et al")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
11 months agoxen/sched: set all sched_resource data inside locked region for new cpu
Juergen Gross [Wed, 15 May 2024 15:25:39 +0000 (17:25 +0200)]
xen/sched: set all sched_resource data inside locked region for new cpu

When adding a cpu to a scheduler, set all data items of struct
sched_resource inside the locked region, as otherwise a race might
happen (e.g. when trying to access the cpupool of the cpu):

  (XEN) ----[ Xen-4.19.0-1-d  x86_64  debug=y  Tainted:     H  ]----
  (XEN) CPU:    45
  (XEN) RIP:    e008:[<ffff82d040244cbf>] common/sched/credit.c#csched_load_balance+0x41/0x877
  (XEN) RFLAGS: 0000000000010092   CONTEXT: hypervisor
  (XEN) rax: ffff82d040981618   rbx: ffff82d040981618   rcx: 0000000000000000
  (XEN) rdx: 0000003ff68cd000   rsi: 000000000000002d   rdi: ffff83103723d450
  (XEN) rbp: ffff83207caa7d48   rsp: ffff83207caa7b98   r8:  0000000000000000
  (XEN) r9:  ffff831037253cf0   r10: ffff83103767c3f0   r11: 0000000000000009
  (XEN) r12: ffff831037237990   r13: ffff831037237990   r14: ffff831037253720
  (XEN) r15: 0000000000000000   cr0: 000000008005003b   cr4: 0000000000f526e0
  (XEN) cr3: 000000005bc2f000   cr2: 0000000000000010
  (XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
  (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
  (XEN) Xen code around <ffff82d040244cbf> (common/sched/credit.c#csched_load_balance+0x41/0x877):
  (XEN)  48 8b 0c 10 48 8b 49 08 <48> 8b 79 10 48 89 bd b8 fe ff ff 49 8b 4e 28 48
  <snip>
  (XEN) Xen call trace:
  (XEN)    [<ffff82d040244cbf>] R common/sched/credit.c#csched_load_balance+0x41/0x877
  (XEN)    [<ffff82d040245a18>] F common/sched/credit.c#csched_schedule+0x36a/0x69f
  (XEN)    [<ffff82d040252644>] F common/sched/core.c#do_schedule+0xe8/0x433
  (XEN)    [<ffff82d0402572dd>] F common/sched/core.c#schedule+0x2e5/0x2f9
  (XEN)    [<ffff82d040232f35>] F common/softirq.c#__do_softirq+0x94/0xbe
  (XEN)    [<ffff82d040232fc8>] F do_softirq+0x13/0x15
  (XEN)    [<ffff82d0403075ef>] F arch/x86/domain.c#idle_loop+0x92/0xe6
  (XEN)
  (XEN) Pagetable walk from 0000000000000010:
  (XEN)  L4[0x000] = 000000103ff61063 ffffffffffffffff
  (XEN)  L3[0x000] = 000000103ff60063 ffffffffffffffff
  (XEN)  L2[0x000] = 0000001033dff063 ffffffffffffffff
  (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
  (XEN)
  (XEN) ****************************************
  (XEN) Panic on CPU 45:
  (XEN) FATAL PAGE FAULT
  (XEN) [error_code=0000]
  (XEN) Faulting linear address: 0000000000000010
  (XEN) ****************************************

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Fixes: a8c6c623192e ("sched: clarify use cases of schedule_cpu_switch()")
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months agoxen/console: fix Rule 10.2 violation
Stefano Stabellini [Fri, 10 May 2024 23:37:11 +0000 (16:37 -0700)]
xen/console: fix Rule 10.2 violation

Change opt_conswitch to char to fix a violation of Rule 10.2.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
11 months agodocs/misra: add R21.6 R21.9 R21.10 R21.14 R21.15 R21.16
Stefano Stabellini [Fri, 26 Apr 2024 21:36:28 +0000 (14:36 -0700)]
docs/misra: add R21.6 R21.9 R21.10 R21.14 R21.15 R21.16

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months agox86/io: Don't cast away constness in read{b..q}()
Andrew Cooper [Fri, 10 May 2024 19:23:40 +0000 (20:23 +0100)]
x86/io: Don't cast away constness in read{b..q}()

Addresses various MISRA R11.8 violations.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoRevert "evtchn: refuse EVTCHNOP_status for Xen-bound event channels"
Andrew Cooper [Tue, 2 Apr 2024 14:50:19 +0000 (15:50 +0100)]
Revert "evtchn: refuse EVTCHNOP_status for Xen-bound event channels"

The commit makes a claim without justification.

The claim is false; it broke lsevtchn in dom0, a debugging utility which
absolutely does care about all of the domain's event channels.

Whether to return information about a xen-owned evtchn is a matter of policy,
and it's not acceptable to subvert Xen's security subsystem on the decision.

This reverts commit f60ab5337f968e2f10c639ab59db7afb0fe4f7c3.

Fixes: f60ab5337f96 ("evtchn: refuse EVTCHNOP_status for Xen-bound event channels")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
11 months agoxen: Use -Wuninitialized and -Winit-self
Andrew Cooper [Fri, 10 May 2024 22:56:52 +0000 (23:56 +0100)]
xen: Use -Wuninitialized and -Winit-self

Assigning a variable to itself is an anti-pattern.  It introduces definite UB
in an attempt to silence a warning about possible UB.

As it's definite undefined behaviour, it also mis-compiles in simple cases,
using whatever stale value happened to be in the allocated register.

Clang includes -Wuninitialized within -Wall, but GCC only includes it in
-Wextra, which is not used by Xen at this time.

Furthermore, the specific pattern of assigning a variable to itself in its
declaration is only diagnosed by GCC with -Winit-self.  Clang does diagnose
simple forms of this pattern with a plain -Wuninitialized, but it fails to
diagnose the instances in Xen that GCC manages to find.

GCC, with -Wuninitialized and -Winit-self notices:

  arch/x86/time.c: In function ‘read_pt_and_tsc’:
  arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
    297 |     uint32_t best = best;
        |              ^~~~
  arch/x86/time.c: In function ‘read_pt_and_tmcct’:
  arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
   1022 |     uint64_t best = best;
        |              ^~~~

Fix these up to start with a value of ~0, which is also more robust in the
case that something goes wrong.
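
A sketch of the pattern, assuming a simple "track the smallest sample" loop;
min_sample_sketch() is a hypothetical helper, not the code in
arch/x86/time.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the smallest sample seen.  Starting from ~0 (the largest
 * uint32_t) means any real sample replaces it, and an empty input yields
 * an obviously out-of-range value instead of whatever an uninitialised
 * `uint32_t best = best;` happens to contain.
 */
static uint32_t min_sample_sketch(const uint32_t *samples, size_t n)
{
    uint32_t best = ~0U;

    for (size_t i = 0; i < n; i++)
        if (samples[i] < best)
            best = samples[i];

    return best;
}

static void min_sample_selfcheck(void)
{
    const uint32_t samples[] = { 7, 3, 9 };

    assert(min_sample_sketch(samples, 3) == 3);
    assert(min_sample_sketch(samples, 0) == UINT32_MAX);
}
```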

Fixes: 23658e823238 ("x86/time: further improve TSC / CPU freq calibration accuracy")
Fixes: 3f3906b462d5 ("x86/APIC: calibrate against platform timer when possible")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoxen: Use -Wflex-array-member-not-at-end when available
Andrew Cooper [Sat, 13 Jan 2024 17:40:48 +0000 (17:40 +0000)]
xen: Use -Wflex-array-member-not-at-end when available

This option is new in GCC-14, and maps to MISRA Rule 1.1.  The codebase is
clean to it, and Eclair is blocking.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoautomation/eclair_analysis: tag MISRA C Rule 1.1 as clean
Nicola Vetrini [Fri, 10 May 2024 18:03:36 +0000 (20:03 +0200)]
automation/eclair_analysis: tag MISRA C Rule 1.1 as clean

Tag the rule as clean, as there are no more violations in the codebase since
93c27d54dd23 ("xen/arm: Fix MISRA regression on R1.1,
flexible array member not at the end").

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months agolibxl: Fix handling XenStore errors in device creation
Demi Marie Obenour [Sat, 27 Apr 2024 02:17:03 +0000 (22:17 -0400)]
libxl: Fix handling XenStore errors in device creation

If xenstored runs out of memory it is possible for it to fail operations
that should succeed.  libxl wasn't robust against this, and could fail
to ensure that the TTY path of a non-initial console was created and
read-only for guests.  This doesn't qualify for an XSA because guests
should not be able to run xenstored out of memory, but it still needs to
be fixed.

Add the missing error checks to ensure that all errors are properly
handled and that at no point can a guest make the TTY path of its
frontend directory writable.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
11 months agox86/hvm: Allow access to registers on the same page as MSI-X table
Marek Marczykowski-Górecki [Fri, 10 May 2024 03:53:22 +0000 (05:53 +0200)]
x86/hvm: Allow access to registers on the same page as MSI-X table

Some devices (notably the Intel Wifi 6 AX210 card) keep auxiliary registers
on the same page as the MSI-X table. A device model (especially one in a
stubdomain) cannot really handle those, as direct writes to that page are
refused (the page is on the mmio_ro_ranges list). Instead, extend
msixtbl_mmio_ops to handle such accesses too.

Doing this requires correlating the read/write location with the guest
MSI-X table address. Since QEMU doesn't map the MSI-X table to the guest,
it requires msixtbl_entry->gtable, which is HVM-only. A similar feature
for PV would need to be done separately.

This will also be used to read the Pending Bit Array (PBA), if it lives on
the same page, so that QEMU does not need /dev/mem access at all (especially
helpful with lockdown enabled in dom0). If the PBA lives on another page,
QEMU will map it to the guest directly.
If the PBA lives on the same page, discard writes and log a message.
Technically, writes outside of the PBA could be allowed, but at this moment
the precise location of the PBA isn't saved, and also no known device abuses
the spec in this way (at least yet).

To access those registers, msixtbl_mmio_ops needs the relevant page
mapped. MSI handling already has infrastructure for that, using fixmap,
so try to map the first/last page of the MSI-X table (if necessary) and save
their fixmap indexes. Note that msix_get_fixmap() does reference
counting and reuses existing mapping, so just call it directly, even if
the page was mapped before. Also, it uses a specific range of fixmap
indexes which doesn't include 0, so use 0 as default ("not mapped")
value - which simplifies code a bit.

Based on the assumption that all MSI-X page accesses are handled by Xen, do
not forward adjacent accesses to other hypothetical ioreq servers, even
if the access wasn't handled for some reason (failure to map pages etc).
Relevant places already log a message about that.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
11 months agox86/msi: Extend per-domain/device warning mechanism
Marek Marczykowski-Górecki [Fri, 10 May 2024 03:53:21 +0000 (05:53 +0200)]
x86/msi: Extend per-domain/device warning mechanism

The arch_msix struct had a single "warned" field with the domid for which
a warning was issued. An upcoming patch will need a similar mechanism for a
few more warnings, so change it to save a bit field of issued warnings.
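
One way such a "warn once per domain and per kind" mechanism could look is
sketched below. The structure layout, the names, and the choice to reset the
bits when the domain changes are all assumptions made for illustration; they
are not taken from arch_msix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical warning kinds, one bit each. */
#define WARN_KIND_FIRST  (1U << 0)
#define WARN_KIND_SECOND (1U << 1)

struct warn_state_sketch {
    uint16_t domid;     /* domain the recorded warnings belong to */
    unsigned int kinds; /* bit set => that warning was already issued */
};

/* Return true only the first time a given kind is seen for a given domain. */
static bool should_warn_sketch(struct warn_state_sketch *s,
                               uint16_t domid, unsigned int kind)
{
    if (s->domid != domid) {
        /* Different domain: forget what was issued for the previous one. */
        s->domid = domid;
        s->kinds = 0;
    }

    if (s->kinds & kind)
        return false;

    s->kinds |= kind;
    return true;
}

static void warn_selfcheck(void)
{
    struct warn_state_sketch s = { .domid = 0xffff, .kinds = 0 };

    assert(should_warn_sketch(&s, 1, WARN_KIND_FIRST));
    assert(!should_warn_sketch(&s, 1, WARN_KIND_FIRST));
    assert(should_warn_sketch(&s, 1, WARN_KIND_SECOND));
}
```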

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 months agolibxl: fix population of the online vCPU bitmap for PVH
Roger Pau Monne [Fri, 10 May 2024 12:49:13 +0000 (14:49 +0200)]
libxl: fix population of the online vCPU bitmap for PVH

libxl passes some information to libacpi to create the ACPI table for a PVH
guest, and among that information is a bitmap of which vCPUs are online,
which can be fewer than the maximum number of vCPUs assigned to the domain.

While the population of the bitmap is done correctly for HVM based on the
number of online vCPUs, for PVH the population of the bitmap is done based on
the maximum number of vCPUs allowed.  This leads to all local APIC entries in
the MADT being set as enabled, which contradicts the data in xenstore if vCPUs
is different from maximum vCPUs.

Fix by copying the internal libxl bitmap that's populated based on the vCPUs
parameter.

Reported-by: Arthur Borsboom <arthurborsboom@gmail.com>
Link: https://gitlab.com/libvirt/libvirt/-/issues/399
Reported-by: Leigh Brown <leigh@solinno.co.uk>
Fixes: 14c0d328da2b ('libxl/acpi: Build ACPI tables for HVMlite guests')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Leigh Brown <leigh@solinno.co.uk>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months agoxen: allow up to 16383 cpus
Juergen Gross [Fri, 10 May 2024 14:16:36 +0000 (16:16 +0200)]
xen: allow up to 16383 cpus

With lock handling now allowing up to 16384 cpus (spinlocks can handle
65535 cpus, rwlocks can handle 16384 cpus), raise the allowed limit for
the number of cpus to be configured to 16383.

The new limit is imposed by IOMMU_CMD_BUFFER_MAX_ENTRIES and
QINVAL_MAX_ENTRY_NR required to be larger than 2 * CONFIG_NR_CPUS.

Add a support limit of physical CPUs to SUPPORT.md (4096 on x86, 128
on ARM).

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
11 months agoautomation/eclair: hide reports coming from adopted code in scheduled analysis
Federico Serafini [Fri, 3 May 2024 13:14:11 +0000 (15:14 +0200)]
automation/eclair: hide reports coming from adopted code in scheduled analysis

To improve clarity and ease of navigation do not show reports related
to adopted code in the scheduled analysis.
Configuration options are commented out because they may be useful
in the future.

Signed-off-by: Simone Ballarin <simone.ballarin@bugseng.com>
Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoautomation/eclair_analysis: amend configuration for some MISRA rules
Alessandro Zucchelli [Fri, 10 May 2024 01:02:17 +0000 (18:02 -0700)]
automation/eclair_analysis: amend configuration for some MISRA rules

Adjust ECLAIR configuration for rules: R21.14, R21.15, R21.16 by taking
into account mem* macros defined in the Xen sources as if they were
equivalent to the ones in Standard Library.

Signed-off-by: Alessandro Zucchelli <alessandro.zucchelli@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agoxen/arm: Fix MISRA regression on R1.1, flexible array member not at the end
Luca Fancellu [Tue, 30 Apr 2024 11:09:22 +0000 (12:09 +0100)]
xen/arm: Fix MISRA regression on R1.1, flexible array member not at the end

Commit 2209c1e35b47 ("xen/arm: Introduce a generic way to access memory
bank structures") introduced a MISRA regression for Rule 1.1 because a
flexible array member is introduced in the middle of a struct. Furthermore,
this uses a GCC extension that is going to be deprecated in GCC 14, where
a warning (-Wflex-array-member-not-at-end) will be available to identify
such cases.

In order to fix this issue, use the macro __struct_group to create a
structure 'struct membanks_hdr' which will hold the common data among
structures using the 'struct membanks' interface.

Modify the 'struct shared_meminfo' and 'struct meminfo' to use this new
structure, effectively removing the flexible array member from the middle
of the structure and modify the code accessing the .common field to use
the macro container_of to maintain the functionality of the interface.

Given this change, container_of needs to be supplied with a type, so
the macro 'kernel_info_get_mem' inside arm/include/asm/kernel.h can't be
an option since it uses const and non-const types for struct membanks.
Introduce two static inline helpers instead, one of which will keep the
const qualifier.
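
The sketch below only illustrates why a pair of helpers is needed once a
type has to be named: one returns a mutable pointer and the other preserves
const. The structure layouts and helper names are hypothetical and much
simpler than the real Xen definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layouts; the real Xen structures differ. */
struct hdr_sketch {
    unsigned int nr_banks;
    unsigned int max_banks;
};

struct meminfo_sketch {
    struct hdr_sketch common;
    unsigned long bank_start[4];
};

/* Recover the enclosing structure from a pointer to its .common member. */
static inline struct meminfo_sketch *meminfo_from_hdr(struct hdr_sketch *hdr)
{
    return (struct meminfo_sketch *)
        ((char *)hdr - offsetof(struct meminfo_sketch, common));
}

/* Const-preserving variant: a single macro could not yield both types. */
static inline const struct meminfo_sketch *
meminfo_from_hdr_const(const struct hdr_sketch *hdr)
{
    return (const struct meminfo_sketch *)
        ((const char *)hdr - offsetof(struct meminfo_sketch, common));
}

static void container_selfcheck(void)
{
    struct meminfo_sketch m = { .common = { .nr_banks = 1, .max_banks = 4 } };

    assert(meminfo_from_hdr(&m.common) == &m);
    assert(meminfo_from_hdr_const(&m.common)->common.max_banks == 4);
}
```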

Given the complexity of the interface, which carries a lot of benefit but
on the other hand could be prone to developer confusion if the access is
open-coded, introduce two static inline helpers for the
'struct kernel_info' .shm_mem member and get rid of the open-coded
shm_mem.common accesses.

Fixes: 2209c1e35b47 ("xen/arm: Introduce a generic way to access memory bank structures")
Reported-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
11 months agoxen/kernel.h: Import __struct_group from Linux
Luca Fancellu [Tue, 30 Apr 2024 11:09:21 +0000 (12:09 +0100)]
xen/kernel.h: Import __struct_group from Linux

Import __struct_group from Linux, commit 50d7bd38c3aa
("stddef: Introduce struct_group() helper macro"), in order to
allow access to the members through the anonymous structure
without also having to write the name, e.g.:

struct foo {
    int one;
    struct {
        int two;
        int three, four;
    } thing;
    int five;
};

would become:

struct foo {
    int one;
    __struct_group(/* None */, thing, /* None */,
        int two;
        int three, four;
    );
    int five;
};

This allows users of the structure to access the .thing members by
using .two/.three/.four directly on the struct foo.
This construct will become useful in order to have some generalized
interfaces that share some common members.
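
To make the mechanism concrete, here is a simplified sketch of the idea.
STRUCT_GROUP_SKETCH is a hypothetical two-argument macro, not the imported
__struct_group (which also takes a tag and attributes), and it assumes C11
anonymous structs and unions.

```c
#include <assert.h>

/*
 * The same member list is declared twice inside an anonymous union: once
 * as an anonymous struct (members reachable directly) and once as a named
 * struct (members reachable as a group).
 */
#define STRUCT_GROUP_SKETCH(NAME, ...) \
    union {                            \
        struct { __VA_ARGS__ };        \
        struct { __VA_ARGS__ } NAME;   \
    }

struct foo_sketch {
    int one;
    STRUCT_GROUP_SKETCH(thing,
        int two;
        int three, four;
    );
    int five;
};

static void struct_group_selfcheck(void)
{
    struct foo_sketch f = { 0 };

    f.two = 2;        /* direct access ... */
    f.thing.four = 4; /* ... and access through the group name */

    assert(f.thing.two == 2);
    assert(f.four == 4);
    assert(sizeof(f.thing) == 3 * sizeof(int));
}
```

Both views alias the same storage, so the group can be passed around or
sized as a unit while existing code keeps using the plain member names.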

Origin: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 50d7bd38c3aa
Reported-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months agox86/boot: Refactor pvh_load_kernel() to have an initrd_len local
Andrew Cooper [Tue, 23 Apr 2024 11:42:47 +0000 (12:42 +0100)]
x86/boot: Refactor pvh_load_kernel() to have an initrd_len local

The expression gets more complicated when ->mod_end isn't being abused as a
size field.  Introduce and use an initrd_len local variable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 months ago x86/boot: Explain how moving mod[0] works
Andrew Cooper [Tue, 23 Apr 2024 15:45:36 +0000 (16:45 +0100)]
x86/boot: Explain how moving mod[0] works

modules_headroom is a misleading name as it applies strictly to mod[0] only,
and the movement loop is deeply unintuitive and completely undocumented.

Provide help to whomever needs to look at this code next.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Jason Andryuk <jason.andryuk@gmail.com>
11 months ago x86/IOMMU: address violations of MISRA C:2012 Rule 14.4
Maria Celeste Cesario [Wed, 8 May 2024 18:46:21 +0000 (20:46 +0200)]
x86/IOMMU: address violations of MISRA C:2012 Rule 14.4

The xen sources contain violations of MISRA C:2012 Rule 14.4 whose
headline states:
"The controlling expression of an if statement and the controlling
expression of an iteration-statement shall have essentially Boolean type".

Add comparisons to avoid using enum constants as controlling expressions
to comply with Rule 14.4.
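
A minimal sketch of the kind of change this implies is shown here. The enum and function names are hypothetical stand-ins, not the actual Xen identifiers; the point is only the explicit comparison.

```c
#include <stdbool.h>

/* Hypothetical enum standing in for a setting such as iommu_intremap. */
enum remap_mode {
    remap_off,          /* zero, which boolean-style uses relied upon */
    remap_restricted,
    remap_full,
};

static inline bool remap_enabled(enum remap_mode mode)
{
    /*
     * Non-compliant form: "if ( mode )" uses an enum value directly as a
     * controlling expression.  The explicit comparison below yields an
     * essentially Boolean expression instead.
     */
    return mode != remap_off;
}
```

The generated code is expected to be the same in both forms, which is consistent with the "No functional change" note.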

Amend the comment in the enum definition to reflect the fact that
boolean uses of iommu_intremap are no longer allowed.

No functional change.

Signed-off-by: Maria Celeste Cesario <maria.celeste.cesario@bugseng.com>
Signed-off-by: Simone Ballarin <simone.ballarin@bugseng.com>
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago automation/eclair: add deviation of MISRA C:2012 Rule 14.4
Federico Serafini [Thu, 2 May 2024 13:11:15 +0000 (15:11 +0200)]
automation/eclair: add deviation of MISRA C:2012 Rule 14.4

Update ECLAIR configuration to take into account the deviations
agreed during MISRA meetings.

Amend an existing entry of Rule 14.4 in deviations-rst:
it is not a project-wide deviation.

Tag Rule 14.4 as clean for arm.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months ago xen/pci: address violations of MISRA C Rule 20.7
Nicola Vetrini [Tue, 30 Apr 2024 14:28:16 +0000 (16:28 +0200)]
xen/pci: address violations of MISRA C Rule 20.7

MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
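
To make the hazard concrete, here is a small sketch with two hypothetical macros (neither is taken from the Xen sources). It shows how a missing pair of parentheses changes the result once an expression is passed as the argument.

```c
/* Hypothetical macro without parentheses around its parameter. */
#define DOUBLE_UNSAFE(x) (x * 2)

/* Same macro with the parameter parenthesized, as Rule 20.7 asks. */
#define DOUBLE_SAFE(x)   ((x) * 2)

/*
 * DOUBLE_UNSAFE(1 + 2) expands to (1 + 2 * 2), so precedence changes the
 * meaning of the argument.  DOUBLE_SAFE(1 + 2) expands to ((1 + 2) * 2).
 */
static inline int double_unsafe_of_sum(void)
{
    return DOUBLE_UNSAFE(1 + 2);
}

static inline int double_safe_of_sum(void)
{
    return DOUBLE_SAFE(1 + 2);
}
```

Existing callers that only pass simple identifiers or constants see no difference, which is why such changes can be made with no functional change.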

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/unaligned: address violation of MISRA C Rule 20.7
Nicola Vetrini [Tue, 30 Apr 2024 14:28:15 +0000 (16:28 +0200)]
xen/unaligned: address violation of MISRA C Rule 20.7

MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago x86/hvm: Defer the size calculation in hvm_save_cpu_xsave_states()
Andrew Cooper [Mon, 29 Apr 2024 16:31:03 +0000 (17:31 +0100)]
x86/hvm: Defer the size calculation in hvm_save_cpu_xsave_states()

HVM_CPU_XSAVE_SIZE() may rewrite %xcr0 twice.  Defer the calculation until
after we've decided to write out an XSAVE record.

Note in hvm_load_cpu_xsave_states() that there were versions of Xen which
wrote out a useless XSAVE record.  This sadly limits our ability to tidy up
the existing infrastructure.  Also leave a note in xstate_ctxt_size() that 0
still needs tolerating for now.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 months ago tools/hvmloader: Wake APs with hypercalls rather than INIT+SIPI+SIPI
Alejandro Vallejo [Wed, 8 May 2024 12:39:23 +0000 (13:39 +0100)]
tools/hvmloader: Wake APs with hypercalls rather than INIT+SIPI+SIPI

... in order to change how LAPIC_ID handling works.  Importantly, this allows
us to start APs by vCPU ID in order to query the LAPIC_ID, rather than needing
to know the APIC_ID in order to wake them.

Other improvements avoid:
 * The 16bit entry stub
 * A LMSW insn, which has no decode assist on AMD and needs emulating fully
 * 13 vLAPIC emulations when 3 hypercalls can do
 * 4 pages of stack when 1 is plenty

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago tools/hvmloader: Move various helpers to being static inlines
Andrew Cooper [Wed, 24 Aug 2022 10:08:28 +0000 (11:08 +0100)]
tools/hvmloader: Move various helpers to being static inlines

The IO port, MSR, IO-APIC and LAPIC accessors compile typically to single or
pairs of instructions, which is less overhead than even the stack manipulation
to call the helpers.

Move the implementations from util.c to being static inlines in util.h

In addition, turn ioapic_base_address into a constant as it is never modified
from 0xfec00000 (substantially shrinks the IO-APIC logic), and make use of the
"A" constraint for WRMSR/RDMSR like we already do for RDTSC.

Bloat-o-meter reports a net:
  add/remove: 0/13 grow/shrink: 0/18 up/down: 0/-790 (-790)

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 months ago xen/gunzip: Move crc state into gunzip_state
Daniel P. Smith [Wed, 24 Apr 2024 16:34:22 +0000 (12:34 -0400)]
xen/gunzip: Move crc state into gunzip_state

Move the crc and its state into struct gunzip_state.  In the process, expand
the only use of CRC_VALUE as it hides what is being compared.

Furthermore, all variables here should be uint32_t rather than unsigned long,
which halves the storage space required.  Filter the type changes through the
logic.

Adjust the logic to hold crc in a positive form, and negate it for update in
flush_window().  This is the more normal way to write CRC algorithms, and
avoids weird-to-follow logic in gunzip().
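
The "positive form" convention can be sketched as follows. This is not the Xen implementation: the structure and function names are hypothetical, and a bitwise CRC-32 loop is used in place of a lookup table to keep the example short. The stored value is always the finished CRC, and it is complemented on entry to and exit from the update step.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state holder: crc is the finished CRC-32 of data so far. */
struct crc_state {
    uint32_t crc;
};

/* Fold len bytes of buf into the running CRC-32 (reflected 0xEDB88320). */
static void crc_update(struct crc_state *s, const unsigned char *buf,
                       size_t len)
{
    uint32_t c = ~s->crc;       /* negate into the working form */

    for ( size_t i = 0; i < len; i++ )
    {
        c ^= buf[i];
        for ( int bit = 0; bit < 8; bit++ )
            c = (c >> 1) ^ (0xEDB88320u & -(c & 1u));
    }

    s->crc = ~c;                /* store back in positive form */
}
```

With this convention the final comparison against the checksum stored in the gzip trailer needs no extra inversion, because the state always holds a directly comparable value.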

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago xen/gunzip: Move bitbuffer into gunzip_state
Daniel P. Smith [Wed, 24 Apr 2024 16:34:21 +0000 (12:34 -0400)]
xen/gunzip: Move bitbuffer into gunzip_state

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago xen/gunzip: Move output count into gunzip_state
Daniel P. Smith [Wed, 24 Apr 2024 16:34:20 +0000 (12:34 -0400)]
xen/gunzip: Move output count into gunzip_state

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago xen/gunzip: Move input buffer handling into gunzip_state
Daniel P. Smith [Wed, 24 Apr 2024 16:34:19 +0000 (12:34 -0400)]
xen/gunzip: Move input buffer handling into gunzip_state

Move the input buffer handling, buffer pointer(inbuf), size(insize), and
index(inptr), into gunzip_state. Adjust functions and macros that consumed the
input buffer to accept a struct gunzip_state reference.

Convert get_byte() into a real function and subsume fill_inbuf().  Fix the
failure path to work correctly when error() stops being a plain panic().
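
A rough sketch of what a function-style get_byte() over per-instance state could look like is given below. The structure name, the error field, and the -1 return value are assumptions made for illustration; only the inbuf/insize/inptr names come from the description above.

```c
#include <stddef.h>

/* Hypothetical cut-down state: only the input-side fields are shown. */
struct gunzip_state_sketch {
    const unsigned char *inbuf;  /* input buffer */
    size_t insize;               /* number of valid bytes in inbuf */
    size_t inptr;                /* index of the next byte to consume */
    int error;                   /* set once input has run out */
};

/*
 * Return the next input byte, or -1 when the input is exhausted.  The
 * caller gets a value it can check, so the failure path still behaves
 * sensibly if error reporting no longer halts execution.
 */
static int get_byte(struct gunzip_state_sketch *s)
{
    if ( s->inptr >= s->insize )
    {
        s->error = 1;
        return -1;
    }

    return s->inbuf[s->inptr++];
}
```

Returning int rather than unsigned char leaves room for the out-of-band failure value.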

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago xen/gunzip: Move window position into gunzip_state
Daniel P. Smith [Wed, 24 Apr 2024 16:34:18 +0000 (12:34 -0400)]
xen/gunzip: Move window position into gunzip_state

Move the window position, outcnt/wp, into struct gunzip_state.  This removes
'outcnt' and its alias 'wp'.

Consistently use the term "position" which is better than "pointer" given that
this is a plain integer field.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago xen/gunzip: Introduce struct gunzip_state and refactor window
Daniel P. Smith [Wed, 24 Apr 2024 16:34:17 +0000 (12:34 -0400)]
xen/gunzip: Introduce struct gunzip_state and refactor window

Introduce struct gunzip_state so the state can be per-instance rather than
global.  Allocate and free the structure in perform_gunzip().

Move the window (output) pointer into gunzip_state first, which involves
plumbing the state pointer all the way down into flush_window().

Drop the 'slide' alias, and the flush_output() macro too as it hides at least
one "wp = wp" assignment.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago xen/gunzip: don't leak memory on error paths
Jan Beulich [Mon, 6 May 2024 08:08:40 +0000 (10:08 +0200)]
xen/gunzip: don't leak memory on error paths

While decompression errors are likely going to be fatal to Xen's boot
process anyway, at the latest with the goal of doing multiple decompressor
runs it is likely better to avoid leaks even on error paths. All the
more when this way code size actually shrinks a tiny bit.
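
One common way to get leak-free error paths is a single exit label that releases everything. The sketch below illustrates that pattern with standard malloc()/free() and hypothetical names; it is not the code from this commit, and the fail parameter merely simulates a decompression error.

```c
#include <stdlib.h>

/* Hypothetical per-run state with one dynamically allocated buffer. */
struct decomp_state {
    unsigned char *window;
};

/* Returns 0 on success, -1 on any failure; nothing is leaked either way. */
static int run_decompressor(int fail)
{
    int rc = -1;
    struct decomp_state *s = malloc(sizeof(*s));

    if ( !s )
        return -1;

    s->window = malloc(32768);
    if ( !s->window )
        goto out;

    if ( fail )              /* simulated decompression error */
        goto out;

    rc = 0;

 out:
    free(s->window);         /* free(NULL) is a no-op */
    free(s);

    return rc;
}
```

Because success and failure share the same cleanup code, the function can be called repeatedly without accumulating allocations.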

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago x86/ucode: Distinguish "ucode already up to date"
Andrew Cooper [Wed, 8 May 2024 15:56:12 +0000 (16:56 +0100)]
x86/ucode: Distinguish "ucode already up to date"

Right now, Xen returns -ENOENT for both "the provided blob isn't correct for
this CPU", and "the blob isn't newer than what's loaded".

This in turn causes xen-ucode to exit with an error, when "nothing to do" is
more commonly a success condition.

Handle EEXIST specially and exit cleanly.
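
On the tool side, the special case could be sketched roughly as follows. The function name and its parameters are hypothetical; the assumption is that the update call reports failure with a nonzero return value and an errno-style code.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical mapping from an update result to a process exit status.
 * ret is the return value of the update call, err the errno it left.
 */
static int ucode_exit_status(int ret, int err)
{
    if ( ret == 0 )
        return EXIT_SUCCESS;     /* new microcode was loaded */

    if ( err == EEXIST )
        return EXIT_SUCCESS;     /* already up to date: nothing to do */

    return EXIT_FAILURE;         /* e.g. ENOENT: blob not for this CPU */
}
```

Scripts that apply microcode unconditionally at boot then see success in the common "nothing to do" case while still being told about genuinely unsuitable blobs.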

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
11 months ago automation/eclair_analysis: unblock pipelines from certain repositories
Nicola Vetrini [Mon, 6 May 2024 08:52:31 +0000 (10:52 +0200)]
automation/eclair_analysis: unblock pipelines from certain repositories

Repositories under people/* only execute the analyze step if manually
triggered, but in order to avoid blocking the rest of the pipeline
if such step is not run, allow it to fail.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
11 months ago automation/eclair_analysis: tag MISRA C Rule 8.2 as clean.
Nicola Vetrini [Thu, 9 May 2024 12:04:07 +0000 (14:04 +0200)]
automation/eclair_analysis: tag MISRA C Rule 8.2 as clean.

Tag the rule as clean, as there are no more violations in the codebase
since e8e8afee990a ("svm: Fix MISRA 8.2 violation").

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 months ago svm: Fix MISRA 8.2 violation
George Dunlap [Thu, 25 Apr 2024 08:49:42 +0000 (09:49 +0100)]
svm: Fix MISRA 8.2 violation

Misra 8.2 requires named parameters in prototypes.  Use the name from
the implementation.

Fixes: 0d19d3aab0 ("svm/nestedsvm: Introduce nested capabilities bit")
Reported-by: Andrew Cooper <andrew.cooper@cloud.com>
Reported-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Signed-off-by: George Dunlap <george.dunlap@cloud.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
12 months ago x86/cpu-policy: Fix migration from Ice Lake to Cascade Lake
Andrew Cooper [Tue, 7 May 2024 11:19:41 +0000 (12:19 +0100)]
x86/cpu-policy: Fix migration from Ice Lake to Cascade Lake

Ever since Xen 4.14, there has been a latent bug with migration.

While some toolstacks can level the features properly, they don't shrink
feat.max_subleaf when all features have been dropped.  This is because
we *still* have not completed the toolstack side work for full CPU Policy
objects.

As a consequence, even when properly feature levelled, VMs can't migrate
"backwards" across hardware which reduces feat.max_subleaf.  One such example
is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).

Extend the max policies' feat.max_subleaf to the highest number Xen knows
about, but leave the default policies matching the host.  This will allow VMs
with a higher feat.max_subleaf than strictly necessary to migrate in.
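
The effect can be illustrated with a deliberately reduced model of the compatibility check. Everything in this sketch is hypothetical: the real policy objects and checks cover far more than one field, and the value 2 below is just the Ice Lake example from above, not a statement about what Xen's maximum is.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical one-field stand-in for a CPU policy object. */
struct policy_sketch {
    uint32_t feat_max_subleaf;
};

/* A guest can migrate in only if the max policy covers its max_subleaf. */
static bool can_migrate_in(const struct policy_sketch *guest,
                           const struct policy_sketch *max)
{
    return guest->feat_max_subleaf <= max->feat_max_subleaf;
}
```

In this model a guest created on Ice Lake with max_subleaf=2 is rejected by a max policy of 0, and accepted once the max policy is extended, even though every feature bit in the extra subleaves has been levelled away.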

Eventually we'll manage to teach the toolstack how to avoid creating such VMs
in the first place, but there's still more work to do there.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
12 months ago tools/libxs: Open /dev/xen/xenbus fds as O_CLOEXEC
Andrew Cooper [Sat, 4 May 2024 01:10:33 +0000 (02:10 +0100)]
tools/libxs: Open /dev/xen/xenbus fds as O_CLOEXEC

The header description for xs_open() goes as far as to suggest that the fd is
O_CLOEXEC, but it isn't actually.

`xl devd` has been observed leaking /dev/xen/xenbus into children.

Link: https://github.com/QubesOS/qubes-issues/issues/8292
Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
12 months ago x86/platform: correct #undef in compat checking
Jan Beulich [Mon, 6 May 2024 12:53:17 +0000 (14:53 +0200)]
x86/platform: correct #undef in compat checking

A stray 'p' was there, rendering the #undef ineffectual.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>