xenbits.xensource.com Git - people/andrewcoop/xen.git/log
7 months ago x86/spec-ctrl: Introduce and use DO_COND_BHB_SEQ [xen-decode-lite]
Andrew Cooper [Mon, 22 Apr 2024 17:15:48 +0000 (18:15 +0100)]
x86/spec-ctrl: Introduce and use DO_COND_BHB_SEQ

Now that alternatives can fix up call displacements even when they're not the
first instruction of the replacement, move the SCF_entry_bhb conditional
inside the replacement block.

This removes a conditional branch from the fastpaths of BHI-unaffected
hardware.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
7 months ago x86/alternative: Relocate all insn-relative fields
Andrew Cooper [Tue, 16 Apr 2024 16:24:18 +0000 (17:24 +0100)]
x86/alternative: Relocate all insn-relative fields

Right now, relocation of displacements is restricted to finding 0xe8/e9 as the
first byte of the replacement, but this is overly restrictive.

Use x86_decode_lite() to find and adjust all insn-relative fields.

As with disp8's not leaving the replacement block, some disp32's don't either.
e.g. the RSB stuffing loop.  These stay unmodified.

For now, leave the altcall devirtualisation alone.  These require more care to
transform into the new scheme.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
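The displacement fixup described in this commit can be sketched roughly as follows. This is an illustration only, not the code from the patch: the function name, its parameters, and the choice to treat a target at the very end of the block as "inside" are all assumptions. The idea is that a disp32 assembled at the replacement's address must grow by (repl - orig) to reach the same absolute target once the bytes are copied to the original site, unless the target lies within the block and therefore moves with it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper.  buf holds a copy of a replacement block of block_len
 * bytes which was assembled at address repl and will be placed at address
 * orig.  insn_end is the offset just past an instruction whose final 4 bytes
 * are a disp32 (stored in host byte order, i.e. little-endian on x86).
 *
 * Returns 1 if the displacement was adjusted, 0 if it was left alone.
 */
static int fixup_disp32(uint8_t *buf, size_t block_len, size_t insn_end,
                        uint64_t repl, uint64_t orig)
{
    int32_t disp;
    int64_t target_off, delta = (int64_t)(repl - orig);

    assert(insn_end >= 4 && insn_end <= block_len);

    memcpy(&disp, buf + insn_end - 4, sizeof(disp));

    /* Offset of the branch target from the start of the block. */
    target_off = (int64_t)insn_end + disp;

    /* Targets inside the block move with it, so stay unmodified. */
    if ( target_off >= 0 && (uint64_t)target_off <= block_len )
        return 0;

    /* Keep pointing at the same absolute address from the new location. */
    disp = (int32_t)(disp + delta);
    memcpy(buf + insn_end - 4, &disp, sizeof(disp));

    return 1;
}
```

A caller would invoke this once per instruction that a decoder reports as carrying a 4-byte relative field.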
7 months ago x86/alternative: Replace a continue with a goto
Andrew Cooper [Wed, 17 Apr 2024 13:34:52 +0000 (14:34 +0100)]
x86/alternative: Replace a continue with a goto

A subsequent patch is going to insert a loop, which interferes with the
continue in the devirtualisation logic.

Replace it with a goto, and a paragraph explaining why we intentionally avoid
setting a->priv = 1.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
7 months ago x86/alternative: Indent the relocation logic
Andrew Cooper [Wed, 17 Apr 2024 12:43:03 +0000 (13:43 +0100)]
x86/alternative: Indent the relocation logic

... to make subsequent patches legible.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
7 months ago x86/alternative: Walk all replacements during self tests
Andrew Cooper [Mon, 15 Apr 2024 16:35:57 +0000 (17:35 +0100)]
x86/alternative: Walk all replacements during self tests

When self tests are active, walk all alternative replacements with
x86_decode_lite().

This checks that we can decode all instructions, and also lets us check that
disp8's don't leave the replacement block as such a case will definitely
malfunction.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
v2:
 * Rebase over API changes in patch 1
 * Use +%lu and drop casts
 * Swap to CONFIG_SELF_TESTS

7 months ago tests/x86: Introduce a userspace test harness for x86_decode_lite()
Andrew Cooper [Fri, 12 Apr 2024 09:45:16 +0000 (10:45 +0100)]
tests/x86: Introduce a userspace test harness for x86_decode_lite()

All the interesting behaviour is in insns.S.

There are 4 interesting cases; "not an instruction we tolerate", or one we do
tolerate, split by no relation, disp8 or disp32.  The DECL()/END() macros
start and terminate the tests_*[] arrays used by C.

Between DECL()/END(), a macro named _ adds an entry into the array, including
a name and the length of the instruction according to the assembler, while
being as visually unintrusive as possible.

Plain labels are ad-hoc and there to aid legibility during disassembly.  In a
couple of cases, the macro named n (for name) allows for choosing a name
manually, and is used for cases where the assembler doesn't like the mnemonic.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
v2:
 * New

Despite claiming full APX support in the 2.43 release, binutils trunk doesn't
tolerate JMPABS at all.  Clang-IAS likes it but only when encoded as an
immediate, despite the fact the operand should be a moffset and encoded
without a $ prefix.  https://godbolt.org/z/P4Ph3svha

Back to this patch, I can't find any way to get Clang happy with rex.w for
explicit prefixing.  I suspect we're just going to need to ignore this test
case for clang=y.

Also, Clang and GAS disagree on needing .allow_index_reg for %riz.

7 months ago x86: Introduce x86_decode_lite()
Andrew Cooper [Fri, 12 Apr 2024 09:45:16 +0000 (10:45 +0100)]
x86: Introduce x86_decode_lite()

In order to relocate all IP-relative fields in an alternative replacement
block, we need to decode the instructions enough to obtain their length and
any relative fields.

Full x86_decode() is far too heavyweight, so introduce a minimal form which
can make several simplifying assumptions.

This is a mostly-complete decoder for integer instructions in the onebyte and
twobyte maps.  Some instructions are intentionally unsupported, owing to being
unlikely to find in alternatives, and so as to reduce decode complexity.  See
the subsequent patch adding a userspace test harness for further details.

This logic can decode all alternative blocks that exist in Xen right now.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
v2:
 * Switch to 0 on failure, rel_sz in bytes
 * Mostly complete the integer instructions; paired with userspace harness
 * Put in .init when !CONFIG_LIVEPATCH
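To make the interface shape concrete (an instruction length plus the size in bytes of any relative field, with 0 meaning failure, per the v2 notes), here is a toy decoder. It is a sketch under heavy assumptions: the struct and function names are hypothetical, and it only understands a handful of prefix-less onebyte opcodes. It is not x86_decode_lite() itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: len == 0 signals "could not decode". */
struct toy_decode {
    unsigned int len;    /* Total instruction length in bytes. */
    unsigned int rel_sz; /* Size of the trailing relative field: 0, 1 or 4. */
};

/* Toy decoder for a few onebyte opcodes with no prefixes and no ModRM. */
static struct toy_decode toy_decode_lite(const uint8_t *ip, const uint8_t *end)
{
    const struct toy_decode fail = { 0, 0 };
    struct toy_decode res = { 0, 0 };
    uint8_t op;

    assert(ip && end);

    if ( ip >= end )
        return fail;

    op = ip[0];

    if ( op == 0x90 || op == 0xc3 )                /* nop, ret */
        res.len = 1;
    else if ( (op & 0xf0) == 0x70 || op == 0xeb )  /* Jcc disp8, jmp disp8 */
    {
        res.len = 2;
        res.rel_sz = 1;
    }
    else if ( op == 0xe8 || op == 0xe9 )           /* call/jmp disp32 */
    {
        res.len = 5;
        res.rel_sz = 4;
    }
    else
        return fail;                               /* Not tolerated. */

    /* An instruction running off the end of the buffer is also a failure. */
    if ( res.len > (size_t)(end - ip) )
        return fail;

    return res;
}
```

Walking a block then amounts to calling the decoder repeatedly, advancing by len, and stopping on a zero length.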

7 months ago x86: move ENTRY(), GLOBAL(), and ALIGN
Jan Beulich [Wed, 2 Oct 2024 06:59:03 +0000 (08:59 +0200)]
x86: move ENTRY(), GLOBAL(), and ALIGN

... to boot code, limiting their scope and thus allowing to drop
respective #undef-s from the linker script.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago x86: convert dom_crash_sync_extable() annotation
Jan Beulich [Wed, 2 Oct 2024 06:56:45 +0000 (08:56 +0200)]
x86: convert dom_crash_sync_extable() annotation

... to that from the generic framework in xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago x86/kexec: convert entry point annotations
Jan Beulich [Wed, 2 Oct 2024 06:56:04 +0000 (08:56 +0200)]
x86/kexec: convert entry point annotations

Use the generic framework from xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago x86/ACPI: annotate assembly function/data with type and size
Jan Beulich [Wed, 2 Oct 2024 06:55:31 +0000 (08:55 +0200)]
x86/ACPI: annotate assembly function/data with type and size

Use the generic framework from xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago VMX: convert entry point annotations
Jan Beulich [Wed, 2 Oct 2024 06:55:02 +0000 (08:55 +0200)]
VMX: convert entry point annotations

Use the generic framework from xen/linkage.h.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago xen/riscv: introduce early_fdt_map()
Oleksii Kurochko [Wed, 2 Oct 2024 06:54:36 +0000 (08:54 +0200)]
xen/riscv: introduce early_fdt_map()

Introduce a function to map the FDT into Xen.

Also, initialization of device_tree_flattened happens using
early_fdt_map().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago xen/riscv: page table handling
Oleksii Kurochko [Wed, 2 Oct 2024 06:53:59 +0000 (08:53 +0200)]
xen/riscv: page table handling

Implement map_pages_to_xen() which requires several
functions to manage page tables and entries:
- pt_update()
- pt_mapping_level()
- pt_update_entry()
- pt_next_level()
- pt_check_entry()

To support these operations, add functions for creating,
mapping, and unmapping Xen tables:
- create_table()
- map_table()
- unmap_table()

Introduce PTE_SMALL to indicate that 4KB mapping is needed
and PTE_POPULATE.

In addition introduce flush_tlb_range_va() for TLB flushing across
CPUs after updating the PTE for the requested mapping.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
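One way a helper such as pt_mapping_level() could choose a mapping size is sketched below. Everything here is an assumption made for illustration: the Sv39-style layout (9 bits per level, levels 0 to 2), the function name, and the force_small flag standing in for PTE_SMALL. The patch's actual logic may differ.

```c
#include <assert.h>

#define TOY_PAGETABLE_ORDER 9 /* Assumed: 512 entries per table (Sv39). */

/*
 * Pick the largest level whose mapping size divides both frame numbers and
 * fits in the remaining nr pages.  Level 0 is a 4KB page, level 1 is 2MB,
 * level 2 is 1GB.  force_small models a caller insisting on 4KB mappings.
 */
static unsigned int toy_mapping_level(unsigned long vfn, unsigned long mfn,
                                      unsigned long nr, int force_small)
{
    unsigned long mask = vfn | mfn;
    unsigned int level;

    assert(nr > 0);

    if ( force_small )
        return 0;

    for ( level = 2; level > 0; level-- )
    {
        unsigned long pages = 1UL << (level * TOY_PAGETABLE_ORDER);

        if ( !(mask & (pages - 1)) && nr >= pages )
            return level;
    }

    return 0;
}
```

A mapping loop would call this per iteration and advance vfn, mfn and nr by the number of pages covered at the returned level.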
7 months ago x86: prefer RDTSCP in rdtsc_ordered()
Jan Beulich [Wed, 2 Oct 2024 06:52:18 +0000 (08:52 +0200)]
x86: prefer RDTSCP in rdtsc_ordered()

If available, its use is supposed to be cheaper than LFENCE+RDTSC, and
is virtually guaranteed to be cheaper than MFENCE+RDTSC.

Update commentary (and indentation) while there.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago docs: fusa: Add Assumption of Use (AOU)
Michal Orzel [Tue, 24 Sep 2024 08:29:23 +0000 (09:29 +0100)]
docs: fusa: Add Assumption of Use (AOU)

AoU are the assumptions that Xen relies on other components (e.g. platform,
domains) to fulfill its requirements. In our case, platform means
a combination of hardware, firmware and bootloader.

We have defined AoU in the intro.rst and added AoU for the generic
timer.

Also, fixed a requirement to denote that Xen shall **not** expose the
system counter frequency via the "clock-frequency" device tree property.
The reason being the device tree documentation strongly discourages the
use of this property. Further, if the "clock-frequency" is exposed, then
it overrides the value programmed in the CNTFRQ_EL0 register.

So, the frequency shall be exposed via the CNTFRQ_EL0 register only and
consequently there is an assumption on the platform to program the
register correctly.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
7 months ago x86/pv: Rename pv.iobmp_limit to iobmp_nr and clarify behaviour
Andrew Cooper [Tue, 1 Oct 2024 12:00:13 +0000 (13:00 +0100)]
x86/pv: Rename pv.iobmp_limit to iobmp_nr and clarify behaviour

Ever since its introduction in commit 013351bd7ab3 ("Define new event-channel
and physdev hypercalls") in 2006, the public interface was named nr_ports
while the internal field was called iobmp_limit.

Rename the internal field to iobmp_nr to match the public interface, and
clarify that, when nonzero, Xen will read 2 bytes.

There isn't a perfect parallel with a real TSS, but iobmp_nr being 0 is the
paravirt "no IOPB" case, and it is important that no read occurs in this case.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/pv: Handle #PF correctly when reading the IO permission bitmap
Andrew Cooper [Mon, 30 Sep 2024 15:20:29 +0000 (16:20 +0100)]
x86/pv: Handle #PF correctly when reading the IO permission bitmap

The switch statement in guest_io_okay() is a very expensive way of
pre-initialising x with ~0, and performing a partial read into it.

However, the logic isn't correct either.

In a real TSS, the CPU always reads two bytes (like here), and any TSS limit
violation turns silently into no-access.  But, in-limit accesses trigger #PF
as usual.  AMD document this property explicitly, and while Intel don't (so
far as I can tell), they do behave consistently with AMD.

Switch from __copy_from_guest_offset() to __copy_from_guest_pv(), like
everything else in this file.  This removes code generation setting up
copy_from_user_hvm() (in the likely path even), and safety LFENCEs from
evaluate_nospec().

Change the logic to raise #PF if __copy_from_guest_pv() fails, rather than
disallowing the IO port access.  This brings the behaviour better in line with
normal x86.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
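The bitmap semantics described above (a limit violation silently means no access, otherwise two bytes are always read and every bit covering the access must be clear) can be sketched as below. The function name and parameters are hypothetical, the bitmap is a plain host buffer rather than guest memory, and the #PF path for a failed guest read is deliberately not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * bitmap:   IO permission bitmap; a set bit denies access to that port.
 *           The buffer needs one byte beyond the last covered port, because
 *           two bytes are always read.
 * nr_ports: number of ports covered; 0 models the "no bitmap" case, in
 *           which nothing is read at all.
 */
static bool toy_io_okay(const uint8_t *bitmap, unsigned int nr_ports,
                        unsigned int port, unsigned int bytes)
{
    uint16_t word;

    assert(bytes == 1 || bytes == 2 || bytes == 4);

    /* Out-of-limit accesses turn silently into no-access. */
    if ( port + bytes > nr_ports )
        return false;

    /* Always read two bytes, starting at the byte holding the first port. */
    word = (uint16_t)(bitmap[port / 8] | (bitmap[port / 8 + 1] << 8));

    /* One bit per byte of the access; all of them must be clear. */
    return !((word >> (port & 7)) & ((1u << bytes) - 1));
}
```

Reading two bytes is what allows a 4-byte access starting at bit 5 or later of one byte to be checked in a single step.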
7 months ago x86/pv: Rework guest_io_okay() to return X86EMUL_*
Andrew Cooper [Mon, 30 Sep 2024 15:09:51 +0000 (16:09 +0100)]
x86/pv: Rework guest_io_okay() to return X86EMUL_*

In order to fix a bug with guest_io_okay() (subsequent patch), rework
guest_io_okay() to take in an emulation context, and return X86EMUL_* rather
than a boolean.

For the failing case, take the opportunity to inject #GP explicitly, rather
than returning X86EMUL_UNHANDLEABLE.  There is a logical difference between
"we know what this is, and it's #GP", vs "we don't know what this is".

There is no change in practice as emulation is the final step on general #GP
resolution, but returning X86EMUL_UNHANDLEABLE would be a latent bug if a
subsequent action were to appear.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/MSR: improve code gen for rdmsr_safe() and rdtsc()
Jan Beulich [Tue, 1 Oct 2024 07:47:32 +0000 (09:47 +0200)]
x86/MSR: improve code gen for rdmsr_safe() and rdtsc()

To fold two 32-bit outputs from the asm()-s into a single 64-bit value
the compiler needs to emit a zero-extension insn for the low half. Both
RDMSR and RDTSC clear the upper halves of their output registers anyway,
though. So despite that zero-extending insn (a simple MOV) being cheap,
we can do better: Without one, by declaring the local variables as
64-bit ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago x86: use alternative_input() in cache_flush()
Jan Beulich [Tue, 1 Oct 2024 07:47:05 +0000 (09:47 +0200)]
x86: use alternative_input() in cache_flush()

There's no point using alternative_io() when there are no outputs. While
there drop the unnecessary semicolon after "ds".

No functional change.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago iommu/amd-vi: make IOMMU list ro after init
Roger Pau Monné [Tue, 1 Oct 2024 07:46:09 +0000 (09:46 +0200)]
iommu/amd-vi: make IOMMU list ro after init

The only functions to modify the list, amd_iommu_detect_one_acpi() and
amd_iommu_init_cleanup(), are already init.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/traps: Re-enable interrupts after reading cr2 in the #PF handler
Alejandro Vallejo [Tue, 1 Oct 2024 07:45:49 +0000 (09:45 +0200)]
x86/traps: Re-enable interrupts after reading cr2 in the #PF handler

Hitting a page fault clobbers %cr2, so if a page fault is handled while
handling a previous page fault then %cr2 will hold the address of the
latter fault rather than the former. In particular, if a debug key
handler happens to trigger during #PF and before %cr2 is read, and that
handler itself encounters a #PF, then %cr2 will be corrupt for the outer #PF
handler.

This patch makes the page fault path delay re-enabling IRQs until %cr2
has been read in order to ensure it stays consistent.

A similar argument holds in additional cases, but they happen to be safe:
    * %dr6 inside #DB: Safe because IST exceptions don't re-enable IRQs.
    * MSR_XFD_ERR inside #NM: Safe because AMX isn't used in #NM handler.

While in the area, remove redundant q suffix to a movq in entry.S and
the space after the comma.

Fixes: a4cd20a19073 ("[XEN] 'd' key dumps both host and guest state.")
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
7 months ago x86/PV: simplify (and thus correct) guest accessor functions
Jan Beulich [Tue, 1 Oct 2024 07:44:55 +0000 (09:44 +0200)]
x86/PV: simplify (and thus correct) guest accessor functions

Taking a fault on a non-byte-granular insn means that the "number of
bytes not handled" return value would need extra care in calculating, if
we want callers to be able to derive e.g. exception context (to be
injected to the guest) - CR2 for #PF in particular - from the value. To
simplify things rather than complicating them, reduce inline assembly to
just byte-granular string insns. On recent CPUs that's also supposed to
be more efficient anyway.

For singular element accessors, however, alignment checks are added,
hence slightly complicating the code. Misaligned (user) buffer accesses
will now be forwarded to copy_{from,to}_guest_ll().

Naturally copy_{from,to}_unsafe_ll() accessors end up being adjusted the
same way, as they're produced by mere re-processing of the same code.
Otoh copy_{from,to}_unsafe() aren't similarly adjusted, but have their
comments made to match reality; down the road we may want to change their
return types, e.g. to bool.

Fixes: 76974398a63c ("Added user-memory accessing functionality for x86_64")
Fixes: 7b8c36701d26 ("Introduce clear_user and clear_guest")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago drivers/video: Convert source files to UTF-8
Frediano Ziglio [Thu, 26 Sep 2024 15:46:06 +0000 (16:46 +0100)]
drivers/video: Convert source files to UTF-8

Most of the tools nowadays assume this encoding.
These files do not specify any encoding so convert them to the default.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago tools: Add new function to do PIRQ (un)map on PVH dom0
Jiqian Chen [Mon, 30 Sep 2024 08:14:01 +0000 (10:14 +0200)]
tools: Add new function to do PIRQ (un)map on PVH dom0

When dom0 is PVH and a device is passed through to domU, xl uses
the device's gsi number to do a pirq mapping (see
pci_add_dm_done->xc_physdev_map_pirq). However, that number is
read from /sys/bus/pci/devices/<sbdf>/irq, which confuses irq
with gsi: they live in different number spaces and are not equal,
so the mapping fails.
To solve this, obtain the real gsi and add a new function,
xc_physdev_map_pirq_gsi, to get a free pirq for that gsi.
Note: the existing xc_physdev_map_pirq is not reused because it
cannot allocate a free pirq, and changing it would affect its
existing callers; hence the new xc_physdev_map_pirq_gsi.

Besides, PVH dom0 doesn't have the PIRQs flag, so it doesn't do
PHYSDEVOP_map_pirq for each gsi. The permission-granting call path
pci_add_dm_done->XEN_DOMCTL_irq_permission therefore fails in
domain_pirq_to_irq. The old hypercall XEN_DOMCTL_irq_permission
requires a pirq to be passed in, which makes it unsuitable for
granting irq permission from a PVH dom0 that has no PIRQs.
To solve this, use another hypercall,
XEN_DOMCTL_gsi_permission, to grant permission for the irq
(translated from the gsi) to domU when dom0 has no PIRQs.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Chen Jiqian <Jiqian.Chen@amd.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
7 months ago tools: Add new function to get gsi from dev
Jiqian Chen [Mon, 30 Sep 2024 08:13:46 +0000 (10:13 +0200)]
tools: Add new function to get gsi from dev

On PVH dom0, when passing through a device to domU, QEMU and the xl
tools want to use the gsi number to do pirq mapping (see QEMU code
xen_pt_realize->xc_physdev_map_pirq, and xl code
pci_add_dm_done->xc_physdev_map_pirq). In the current code, however,
the gsi number is read from /sys/bus/pci/devices/<sbdf>/irq, which is
wrong: irq is not equal to gsi, as they are in different spaces, so
the pirq mapping fails.

Currently there is also no way for userspace to obtain the gsi.
For the above purpose, add a new function to get the gsi; the
corresponding ioctl is implemented on the Linux kernel side.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Chen Jiqian <Jiqian.Chen@amd.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
7 months ago x86/irq: allow setting IRQ permissions from GSI instead of pIRQ
Jiqian Chen [Mon, 30 Sep 2024 08:13:15 +0000 (10:13 +0200)]
x86/irq: allow setting IRQ permissions from GSI instead of pIRQ

Some domains are not aware of the pIRQ abstraction layer that maps
interrupt sources into Xen space interrupt numbers.  pIRQs values are
only exposed to domains that have the option to route physical
interrupts over event channels.

This creates issues for PCI-passthrough from a PVH domain, as some of
the passthrough related hypercalls use pIRQ as references to physical
interrupts on the system.  One such interface,
XEN_DOMCTL_irq_permission, used to grant or revoke access to
interrupts, takes a pIRQ as the reference to the interrupt to be
adjusted.

Since PVH doesn't manage interrupts in terms of pIRQs, introduce a new
hypercall that allows setting interrupt permissions based on GSI value
rather than pIRQ.

Note the GSI hypercall parameter is translated to an IRQ value (in
case there are ACPI overrides) before doing the checks.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago xen/riscv: introduce and initialize SBI RFENCE extension
Oleksii Kurochko [Mon, 30 Sep 2024 08:12:40 +0000 (10:12 +0200)]
xen/riscv: introduce and initialize SBI RFENCE extension

Introduce functions to work with the SBI RFENCE extension for issuing
various fence operations to remote CPUs.

Add the sbi_init() function along with auxiliary functions and macro
definitions for proper initialization and checking the availability of
SBI extensions. Currently, this is implemented only for RFENCE.

Introduce sbi_remote_sfence_vma() to send SFENCE_VMA instructions to
a set of target HARTs. This will support the implementation of
flush_xen_tlb_range_va().

Integrate __sbi_rfence_v02 from Linux kernel 6.6.0-rc4 with minimal
modifications:
 - Adapt to Xen code style.
 - Use cpuid_to_hartid() instead of cpuid_to_hartid_map[].
 - Update BIT(...) to BIT(..., UL).
 - Rename __sbi_rfence_v02_call to sbi_rfence_v02_real and
   remove the unused arg5.
 - Handle NULL cpu_mask to execute rfence on all CPUs by calling
   sbi_rfence_v02_real(..., 0UL, -1UL,...) instead of creating hmask.
 - change type for start_addr and size to vaddr_t and size_t.
 - Add an explanatory comment about when batching can and cannot occur,
   and why batching happens in the first place.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
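The batching mentioned in the last bullet exists because an SBI v0.2 RFENCE call addresses harts as a base plus a bitmask one unsigned long wide, so hart IDs that do not fit in one window need separate calls. A simplified sketch of that grouping follows. The names are hypothetical, the batches are written to an array instead of being passed to a real SBI call, and the input is assumed to be sorted and unique, which the Linux-derived code does not require.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BITS_PER_LONG (sizeof(unsigned long) * 8)

/* One would-be SBI call: harts hbase + i for every bit i set in hmask. */
struct toy_batch {
    unsigned long hmask, hbase;
};

/*
 * Group sorted, unique hart IDs into (hbase, hmask) batches.
 * Returns the number of batches written to out[].
 */
static size_t toy_rfence_batch(const unsigned long *harts, size_t nr,
                               struct toy_batch *out, size_t max_out)
{
    unsigned long hmask = 0, hbase = 0;
    size_t i, n = 0;

    assert((harts || !nr) && out);

    for ( i = 0; i < nr; i++ )
    {
        assert(i == 0 || harts[i] > harts[i - 1]);

        /* Emit the current batch if this hart falls outside its window. */
        if ( hmask && harts[i] - hbase >= TOY_BITS_PER_LONG )
        {
            assert(n < max_out);
            out[n].hmask = hmask;
            out[n].hbase = hbase;
            n++;
            hmask = 0;
        }

        if ( !hmask )
            hbase = harts[i];

        hmask |= 1UL << (harts[i] - hbase);
    }

    /* Emit whatever is left over. */
    if ( hmask )
    {
        assert(n < max_out);
        out[n].hmask = hmask;
        out[n].hbase = hbase;
        n++;
    }

    return n;
}
```

With densely numbered harts this collapses to a single call; sparse hart IDs are the case where more than one batch appears.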
7 months ago xen/riscv: introduce functionality to work with CPU info
Oleksii Kurochko [Mon, 30 Sep 2024 08:11:18 +0000 (10:11 +0200)]
xen/riscv: introduce functionality to work with CPU info

Introduce struct pcpu_info to store pCPU-related information.
Initially, it includes only processor_id and hart id, but it
will be extended to include guest CPU information and
temporary variables for saving/restoring vCPU registers.

Add set_processor_id() function to set processor_id stored in
pcpu_info.

Define smp_processor_id() to provide accurate information,
replacing the previous "dummy" value of 0.

Initialize tp registers to point to pcpu_info[0].
Set processor_id to 0 for logical CPU 0 and store the physical
CPU ID in pcpu_info[0].

Introduce helpers for getting/setting hart_id ( physical CPU id
in RISC-V terms ) from Xen CPU id.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago xen/riscv: introduce asm/pmap.h header
Oleksii Kurochko [Mon, 30 Sep 2024 08:09:37 +0000 (10:09 +0200)]
xen/riscv: introduce asm/pmap.h header

Introduce arch_pmap_{un}map functions and select HAS_PMAP for CONFIG_RISCV.

Add pte_from_mfn() for use in arch_pmap_map().

Introduce flush_xen_tlb_one_local() and use it in arch_pmap_{un}map().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months ago xen/riscv: set up fixmap mappings
Oleksii Kurochko [Mon, 30 Sep 2024 08:08:51 +0000 (10:08 +0200)]
xen/riscv: set up fixmap mappings

Set up fixmap mappings and the L0 page table for fixmap support.

Modify the PTEs (xen_fixmap[]) directly in arch_pmap_map() instead
of using set_fixmap() which is expected to be implemented using
map_pages_to_xen(), which, in turn, is expected to use
arch_pmap_map() during early boot, resulting in a loop.

Define new macros in riscv/config.h for calculating
the FIXMAP_BASE address, including BOOT_FDT_VIRT_{START, SIZE},
XEN_VIRT_SIZE, and XEN_VIRT_END.

Update the check for Xen size in riscv/xen.lds.S to use
XEN_VIRT_SIZE instead of a hardcoded constant.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago xen/riscv: allow write_atomic() to work with non-scalar types
Oleksii Kurochko [Mon, 30 Sep 2024 08:06:44 +0000 (10:06 +0200)]
xen/riscv: allow write_atomic() to work with non-scalar types

Update the definition of write_atomic() to support non-scalar types,
bringing it closer to the behavior of read_atomic().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/intel: optional build of PSR support
Sergiy Kibrik [Mon, 30 Sep 2024 08:06:13 +0000 (10:06 +0200)]
x86/intel: optional build of PSR support

Xen's implementation of PSR only supports Intel CPUs right now, hence it can be
made dependent on the CONFIG_INTEL build option.
Since the platform implementation is not limited to a single vendor, an
intermediate option CONFIG_X86_PSR is introduced, which is selected by
CONFIG_INTEL.

When !X86_PSR, the PSR-related sysctls XEN_SYSCTL_psr_cmt_op &
XEN_SYSCTL_psr_alloc are off as well.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86: introduce x86_seg_sys
Jan Beulich [Mon, 30 Sep 2024 08:05:25 +0000 (10:05 +0200)]
x86: introduce x86_seg_sys

To represent the USER-MSR bitmap access, a new segment type needs
introducing, behaving like x86_seg_none in terms of address treatment,
but behaving like a system segment for page walk purposes (implicit
supervisor-mode access).

While there also add x86_seg_none handling to the test harness's
read() hook, as will be needed for MSR-LIST support.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago blkif: Fix a couple of typos
Anthony PERARD [Thu, 26 Sep 2024 12:53:50 +0000 (12:53 +0000)]
blkif: Fix a couple of typos

Those were fixed in OVMF's copy. (And one of them was fixed in QEMU's
copy but later discarded by an update.)

Signed-off-by: Anthony PERARD <anthony.perard@vates.tech>
Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 months ago blkif: Fix alignment description for discard request
Anthony PERARD [Thu, 26 Sep 2024 12:53:50 +0000 (12:53 +0000)]
blkif: Fix alignment description for discard request

The discard feature has another xenstore node to describe the size
of the blocks that can be discarded, "discard-granularity", which
defaults to "sector-size" when absent, as noted in the properties and in
note 4. So discard requests should be aligned on this value.

Fixes: 221f2748e8da ("blkif: reconcile protocol specification with in-use implementations")
Signed-off-by: Anthony PERARD <anthony.perard@vates.tech>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 months ago x86/boot: Refactor BIOS/PVH start
Frediano Ziglio [Thu, 26 Sep 2024 09:21:07 +0000 (10:21 +0100)]
x86/boot: Refactor BIOS/PVH start

The two code paths shared a fair amount of common code; reuse it instead
of duplicating it.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago x86/alternatives: build time check feature is in range
Roger Pau Monné [Thu, 26 Sep 2024 10:14:31 +0000 (12:14 +0200)]
x86/alternatives: build time check feature is in range

Ensure at build time the feature(s) used for the alternative blocks are in
range of the featureset.

No functional change intended, as all current usages are correct.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
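A C11 analogue of such a build-time range check is sketched here. The macro names and the featureset size are assumptions, and the actual patch may well implement the check at the assembler or preprocessor level rather than like this; the point is only that an out-of-range constant stops the build instead of misbehaving at runtime.

```c
#include <assert.h>

#define TOY_NCAPINTS    4                    /* Assumed featureset size. */
#define TOY_NR_FEATURES (TOY_NCAPINTS * 32)

/*
 * Evaluates to feat, but refuses to compile when the constant is outside
 * the featureset.  The struct exists only to host the static assertion.
 */
#define TOY_ALT_FEATURE(feat)                                             \
    ((unsigned int)(0 * sizeof(struct {                                   \
        _Static_assert((feat) < TOY_NR_FEATURES, "feature out of range"); \
        char dummy;                                                       \
    })) + (feat))

/* Index of the 32-bit word holding a feature bit. */
static unsigned int toy_feature_word(unsigned int feat)
{
    assert(feat < TOY_NR_FEATURES); /* Runtime backstop for non-constants. */

    return feat / 32;
}
```

Writing TOY_ALT_FEATURE(128) with the sizes above would fail to compile, which is the property the commit is after.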
7 months ago x86/alternatives: do not BUG during apply
Roger Pau Monné [Thu, 26 Sep 2024 10:14:30 +0000 (12:14 +0200)]
x86/alternatives: do not BUG during apply

alternatives is used both at boot time, and when loading livepatch payloads.
While for the former it makes sense to panic, it's not useful for the latter, as
for livepatches it's possible to fail to load the livepatch if alternatives
cannot be resolved and continue operating normally.

Relax the BUGs in _apply_alternatives() to instead return an error code.  The
caller will figure out whether the failures are fatal and panic.

Print an error message to provide some user-readable information about what
went wrong.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
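The pattern of reporting a failure to the caller rather than asserting can be sketched as below. The structure, the function name and the specific length check are illustrative assumptions, not the contents of _apply_alternatives(); the shape to note is "print, return an error, let the caller decide".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, much-reduced stand-in for an alternative entry. */
struct toy_alt {
    unsigned int orig_len; /* Bytes available at the patch site. */
    unsigned int repl_len; /* Bytes of replacement code. */
};

/* Returns 0 on success, or -EINVAL on the first malformed entry. */
static int toy_apply_alternatives(const struct toy_alt *alts, size_t nr)
{
    size_t i;

    assert(alts || !nr);

    for ( i = 0; i < nr; i++ )
    {
        if ( alts[i].repl_len > alts[i].orig_len )
        {
            /* Say what went wrong instead of bringing the system down. */
            fprintf(stderr, "alt %zu: replacement (%u) exceeds site (%u)\n",
                    i, alts[i].repl_len, alts[i].orig_len);
            return -EINVAL;
        }

        /* Patching of the site itself would happen here. */
    }

    return 0;
}
```

A boot-time caller would treat a nonzero return as fatal, while a livepatch loader would reject the payload and carry on.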
7 months ago xen/livepatch: do Xen build-id check earlier
Roger Pau Monné [Thu, 26 Sep 2024 10:14:29 +0000 (12:14 +0200)]
xen/livepatch: do Xen build-id check earlier

The check against the expected Xen build ID should be done ahead of attempting
to apply the alternatives contained in the livepatch.

If the CPUID in the alternatives patching data is out of the scope of the
running Xen featureset the BUG() in _apply_alternatives() will trigger thus
bringing the system down.  Note the layout of struct alt_instr could also
change between versions.  It's also possible for struct exception_table_entry
to have changed format, hence leading to other kinds of errors if parsing of the
payload is done ahead of checking if the Xen build-id matches.

Move the Xen build ID check as early as possible.  To do so introduce a new
check_xen_buildid() function that parses and checks the Xen build-id before
moving the payload.  Since the expected Xen build-id is used early to
detect whether the livepatch payload could be loaded, there's no reason to
store it in the payload struct, as a non-matching Xen build-id won't get the
payload populated in the first place.

Note printing the expected Xen build ID as part of dumping the payload
information is no longer done: all loaded payloads would have Xen build IDs
matching the running Xen, otherwise they would have failed to load.

Fixes: 879615f5db1d ('livepatch: Always check hypervisor build ID upon livepatch upload')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
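The ordering argument above can be sketched as follows: compare build IDs first, and only then touch anything whose layout depends on the Xen version. All names here are hypothetical, and the parsed flag merely stands in for the later steps such as moving the payload and applying alternatives.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Returns 0 if the payload's expected build ID matches the running one. */
static int toy_check_buildid(const void *want, size_t want_len,
                             const void *have, size_t have_len)
{
    assert(want && have);

    if ( want_len != have_len || memcmp(want, have, want_len) )
        return -EINVAL;

    return 0;
}

/* Hypothetical loader: *parsed is set only if the ID check passed. */
static int toy_load_payload(const void *xen_depends, size_t dep_len,
                            const void *running, size_t run_len,
                            int *parsed)
{
    int rc = toy_check_buildid(xen_depends, dep_len, running, run_len);

    *parsed = 0;

    /* Bail before looking at any version-specific structures. */
    if ( rc )
        return rc;

    *parsed = 1; /* Stands in for relocation, alternatives, extable, etc. */

    return 0;
}
```

Because a mismatching payload never reaches the parsing stage, nothing about the expected ID needs to be kept once the check has passed.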
7 months ago xen/livepatch: simplify and unify logic in prepare_payload()
Roger Pau Monné [Thu, 26 Sep 2024 10:14:28 +0000 (12:14 +0200)]
xen/livepatch: simplify and unify logic in prepare_payload()

The following sections: .note.gnu.build-id, .livepatch.xen_depends and
.livepatch.depends are mandatory and ensured to be present by
check_special_sections() before prepare_payload() is called.

Simplify the logic in prepare_payload() by introducing a generic function to
parse the sections that contain a buildid.  Note the function assumes the
buildid related section to always be present.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago xen/livepatch: drop load_addr Elf section field
Roger Pau Monné [Thu, 26 Sep 2024 10:14:27 +0000 (12:14 +0200)]
xen/livepatch: drop load_addr Elf section field

The Elf loading logic will initially use the `data` section field to stash a
pointer to the temporary loaded data (from the buffer allocated in
livepatch_upload()), which is later relocated and the new pointer stashed in
`load_addr`.

Remove this dual field usage and use an `addr` field uniformly.  Initially
`addr` will point to the temporary buffer, until relocation happens, at which
point the pointer will be updated to the relocated address.

This avoids leaving a dangling pointer in the `data` field once the temporary
buffer is freed by livepatch_upload().

Note the `addr` field cannot retain the const attribute from the previous
`data` field, as there's logic that performs manipulations against the loaded
sections, like applying relocations or sorting the exception table.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Michal Orzel <michal.orzel@amd.com>
7 months agoxen/livepatch: remove useless check for duplicated sections
Roger Pau Monné [Wed, 25 Sep 2024 14:48:33 +0000 (16:48 +0200)]
xen/livepatch: remove useless check for duplicated sections

The current check for duplicated sections in a payload is not effective.  The
check is done inside a loop that iterates over the section names, so it's
logically impossible for a bit in the bitmap to be set more than once.

The usage of a bitmap in check_patching_sections() has been replaced with a
boolean, since the function just cares that at least one of the special
sections is present.
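
As a rough illustration of why a flag suffices, here is a minimal sketch.  It
is not the actual check_patching_sections(); the helper name, its parameters
and the section names used to exercise it are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the actual check_patching_sections(): report
 * whether at least one of the wanted section names occurs in the payload.
 */
static bool any_special_section(const char *const payload[], size_t nr_payload,
                                const char *const wanted[], size_t nr_wanted)
{
    bool found = false;

    /* Each wanted name is visited exactly once, so a flag is all we need. */
    for ( size_t i = 0; i < nr_wanted && !found; i++ )
        for ( size_t j = 0; j < nr_payload; j++ )
            if ( !strcmp(wanted[i], payload[j]) )
            {
                found = true;
                break;
            }

    return found;
}
```

Because the outer loop visits each name once, a per-name bit could never be
observed as already set, which is why the bitmap added nothing.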

No functional change intended, as the check was useless.

Fixes: 29f4ab0b0a4f ('xsplice: Implement support for applying/reverting/replacing patches.')
Fixes: 76b3d4098a92 ('livepatch: Do not enforce ELF_LIVEPATCH_FUNC section presence')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/boot: Initialise BSS sooner
Frediano Ziglio [Wed, 25 Sep 2024 14:47:51 +0000 (16:47 +0200)]
x86/boot: Initialise BSS sooner

This allows calling C code earlier.
In order to safely call C code we need to set up the stack, selectors and BSS.

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agochangelog: add note about blkif protocol fixes
Roger Pau Monné [Wed, 25 Sep 2024 14:47:35 +0000 (16:47 +0200)]
changelog: add note about blkif protocol fixes

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
7 months agox86/defns: Fix typo in comment "Porection" -> "Protection"
Frediano Ziglio [Wed, 25 Sep 2024 11:09:46 +0000 (12:09 +0100)]
x86/defns: Fix typo in comment "Porection" -> "Protection"

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoxen: introduce common macros for per-CPU sections definition
Oleksii Kurochko [Tue, 24 Sep 2024 16:42:27 +0000 (18:42 +0200)]
xen: introduce common macros for per-CPU sections definition

Introduce PERCPU_BSS macro which manages:
 * Alignment of the section start
 * Insertion of per-CPU data sections
 * Alignment and start/end markers for per-CPU data
This change simplifies the linker script maintenance and ensures a unified
approach for per-CPU sections across different architectures.

Refactor the linker scripts for Arm, PPC, and x86 architectures by using
the common macro PERCPU_BSS defined in xen/xen.lds.h to handle per-CPU
data sections.

No functional changes.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoxen/efi: efibind: Fix typo in comment
Frediano Ziglio [Mon, 16 Sep 2024 09:35:57 +0000 (10:35 +0100)]
xen/efi: efibind: Fix typo in comment

expresion -> expression

Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agoxen/ucode: Make Intel's microcode_sanity_check() stricter
Demi Marie Obenour [Fri, 13 Sep 2024 13:19:30 +0000 (14:19 +0100)]
xen/ucode: Make Intel's microcode_sanity_check() stricter

The SDM states that data size must be a multiple of 4, but Xen doesn't check
this property.

This is liable to cause later failures, but should be checked explicitly.
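
A minimal sketch of the kind of check described, assuming the data size has
already been read from the header.  The helper name is hypothetical and this
is not the code in microcode_sanity_check().

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: accept a data size only if it is a whole number of
 * 32-bit words, as the SDM requires.
 */
static bool ucode_data_size_ok(uint32_t data_size)
{
    return (data_size % 4) == 0;
}
```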

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agoxen/ucode: Improve commentary for parsing AMD containers
Andrew Cooper [Fri, 13 Sep 2024 11:20:37 +0000 (12:20 +0100)]
xen/ucode: Improve commentary for parsing AMD containers

Even though I wrote this code, it's not the easiest logic to follow.

Shorten the UCODE_EQUIV_TYPE name, and provide more of an explanation of
what's going on.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agox86/APIC: Remove x2APIC pure cluster mode
Matthew Barnes [Mon, 23 Sep 2024 14:35:59 +0000 (15:35 +0100)]
x86/APIC: Remove x2APIC pure cluster mode

With the introduction of mixed x2APIC mode (using cluster addressing for
IPIs and physical for external interrupts) the use of pure cluster mode
doesn't have any benefit.

Remove the mode itself, leaving only the code required for logical
addressing when sending IPIs.

Resolves: https://gitlab.com/xen-project/xen/-/issues/189
Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
7 months agox86/vLAPIC: prevent undue recursion of vlapic_error()
Jan Beulich [Tue, 24 Sep 2024 12:23:29 +0000 (14:23 +0200)]
x86/vLAPIC: prevent undue recursion of vlapic_error()

With the error vector set to an illegal value, the function invoking
vlapic_set_irq() would bring execution back here, with the non-recursive
lock already held. Avoid the call in this case, merely further updating
ESR (if necessary).

This is XSA-462 / CVE-2024-45817.

Fixes: 5f32d186a8b1 ("x86/vlapic: don't silently accept bad vectors")
Reported-by: Federico Serafini <federico.serafini@bugseng.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/efi: Use generic PE/COFF structures
Nikola Jelic [Mon, 23 Sep 2024 17:50:08 +0000 (19:50 +0200)]
x86/efi: Use generic PE/COFF structures

Adapt the x86 EFI parser and mkreloc utility to use the generic PE header
(efi/pe.h), instead of locally defined structures for each component.

Signed-off-by: Nikola Jelic <nikola.jelic@rt-rk.com>
Signed-off-by: Milan Djokic <milan.djokic@rt-rk.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
7 months agox86/cpufeature: Reposition cpu_has_{lfence_dispatch,nscb}
Andrew Cooper [Tue, 10 Sep 2024 19:59:37 +0000 (20:59 +0100)]
x86/cpufeature: Reposition cpu_has_{lfence_dispatch,nscb}

LFENCE_DISPATCH used to be a synthetic feature, but was given a real CPUID bit
by AMD.  The define wasn't moved when this was changed.

NSCB has always been a real CPUID bit, and was misplaced when introduced in
the synthetic block alongside LFENCE_DISPATCH.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agotools/libxs: Style consistency improvements
Andrew Cooper [Fri, 28 Jun 2024 12:05:47 +0000 (13:05 +0100)]
tools/libxs: Style consistency improvements

This is mostly Linux style.  Make the file self-consistent.  Drop trailing
whitespace, and use tabs consistently.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
7 months agox86: enable long section names for xen.efi
Jan Beulich [Tue, 24 Sep 2024 08:34:35 +0000 (10:34 +0200)]
x86: enable long section names for xen.efi

While for our present .data.read_mostly it may be deemed tolerable that
the name is truncated to .data.re, for the planned .init.trampoline an
abbreviation to .init.tr would end up pretty meaningless. Engage the
long section names extension that GNU ld has had support for already in
2.22 (which we consider the baseline release for xen.efi building).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>
7 months agox86/mwait-idle: add dependency on general Intel CPU support
Sergiy Kibrik [Tue, 24 Sep 2024 08:33:38 +0000 (10:33 +0200)]
x86/mwait-idle: add dependency on general Intel CPU support

Currently the mwait_idle driver in Xen only implements support for Intel CPUs.
Thus in order to reduce dead code in non-Intel build configurations it can
be made explicitly dependent on the CONFIG_INTEL option.

Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agox86/boot: Drop stale comment about zeroing the stack
Andrew Cooper [Mon, 16 Sep 2024 11:56:06 +0000 (12:56 +0100)]
x86/boot: Drop stale comment about zeroing the stack

This used to be true, but was altered by commit 37786b23b027 ("x86/cet: Remove
writeable mapping of the BSPs shadow stack") which moved cpu0_stack into
.init.bss.stack_aligned.

Fixes: 37786b23b027 ("x86/cet: Remove writeable mapping of the BSPs shadow stack")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months agoxen/riscv: use {read,write}{b,w,l,q}_cpu() to define {read,write}_atomic()
Oleksii Kurochko [Mon, 23 Sep 2024 14:32:17 +0000 (16:32 +0200)]
xen/riscv: use {read,write}{b,w,l,q}_cpu() to define {read,write}_atomic()

The functions {read,write}{b,w,l,q}_cpu() do not need to be memory-ordered
atomic operations in Xen, based on their definitions for other architectures.

Therefore, {read,write}{b,w,l,q}_cpu() can be used instead of
{read,write}{b,w,l,q}(), allowing the caller to decide if additional
fences should be applied before or after {read,write}_atomic().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agoubsan: use linux-compat.h
Jan Beulich [Mon, 23 Sep 2024 14:31:49 +0000 (16:31 +0200)]
ubsan: use linux-compat.h

Instead of replacing the s64 (and later also u64) uses, keep the file as
little modified as possible from its Linux origin. (Sadly the two cast
adjustments are needed to avoid compiler warnings.)

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agodocs/misra: add R17.2 and R18.2
Stefano Stabellini [Wed, 18 Sep 2024 20:23:19 +0000 (13:23 -0700)]
docs/misra: add R17.2 and R18.2

The Xen community is already informally following both rules. Let's make
it explicit. Both rules have zero violations, only cautions. While we
want to go down to zero cautions in time, adding both rules to rules.rst
enables us to immediately make both rules gating in the ECLAIR job part
of gitlab-ci.

Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Acked-by: Bertrand Marquis <bertrand.marquis@arm.com>
7 months agodocs: fusa: Add requirements for emulated uart
Michal Orzel [Tue, 17 Sep 2024 13:13:36 +0000 (14:13 +0100)]
docs: fusa: Add requirements for emulated uart

Add the requirements for emulated SBSA UART.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Signed-off-by: Ayan Kumar Halder <ayan.kumar.halder@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
7 months agoautomation/eclair: add deviation for MISRA C 2012 Dir 4.10
Alessandro Zucchelli [Tue, 10 Sep 2024 14:15:36 +0000 (16:15 +0200)]
automation/eclair: add deviation for MISRA C 2012 Dir 4.10

Add deviation to address violations of MISRA C:2012 Directive 4.10
("Precautions shall be taken in order to prevent the contents of a
header file being included more than once").

This deviation suppresses the violation arising from autogenerated file
xen/include/generated/autoconf.h

No functional change.

Signed-off-by: Alessandro Zucchelli <alessandro.zucchelli@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months agoarm/smmu: Complete SMR masking support
Michal Orzel [Wed, 4 Sep 2024 12:43:49 +0000 (14:43 +0200)]
arm/smmu: Complete SMR masking support

SMR masking support allows deriving a mask either using a 2-cell iommu
specifier (per master) or stream-match-mask SMMU dt property (global
config). Even though the mask is stored in the fwid when adding a
device (in arm_smmu_dt_xlate_generic()), we still set it to 0 when
allocating SMEs (in arm_smmu_master_alloc_smes()). So at the end, we
always ignore the mask when programming SMRn registers. This leads to
SMMU failures. Fix it by completing the support.
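
For illustration, a minimal sketch of splitting a firmware ID into stream ID
and mask.  It assumes the packing used by the Linux arm-smmu driver (stream ID
in bits 15:0, mask in bits 31:16); the helper names are hypothetical and not
taken from the Xen driver.

```c
#include <stdint.h>

/* Assumed layout: SMR mask stored in the upper 16 bits of the fwid. */
#define SMR_MASK_SHIFT 16

/* Hypothetical helpers to unpack the two halves of a fwid. */
static uint16_t fwid_to_sid(uint32_t fwid)
{
    return fwid & 0xffff;
}

static uint16_t fwid_to_mask(uint32_t fwid)
{
    return fwid >> SMR_MASK_SHIFT;
}
```

Hardcoding the mask to 0 at SME allocation time discards the upper half, which
is the behaviour being fixed.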

A bit of history:
Linux support for SMR allocation was mainly done with:
588888a7399d ("iommu/arm-smmu: Intelligent SMR allocation")
021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support")

Taking the mask into account in arm_smmu_master_alloc_smes() was added
as part of the second commit, although quite hidden in the thicket of
other changes. We backported only the first patch with: 0435784cc75d
("xen/arm: smmuv1: Intelligent SMR allocation") but the changes to take
the mask into account were missed.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
7 months agoxen/arm: Enable workaround for Cortex-A53 erratum #1530924
Andrei Cherechesu [Tue, 10 Sep 2024 14:34:11 +0000 (17:34 +0300)]
xen/arm: Enable workaround for Cortex-A53 erratum #1530924

All versions of Cortex-A53 cores are affected by the speculative
AT instruction erratum, as mentioned in the Cortex-A53 Revision r0
SDEN v21 documentation.

Enable ARM64_WORKAROUND_AT_SPECULATE for all versions of Cortex-A53
cores, to avoid corrupting the TLB if performing a speculative AT
instruction during a guest context switch.

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Acked-by: Julien Grall <jgrall@amazon.com>
7 months agoarm: Drop deprecated early printk platform options
Michal Orzel [Fri, 13 Sep 2024 06:15:29 +0000 (08:15 +0200)]
arm: Drop deprecated early printk platform options

The predefined configurations for early printk have been deprecated for
a sufficient amount of time. Let's finally remove them.

Note:
In order not to lose these predefined configurations, I wrote a wiki
page: https://wiki.xenproject.org/wiki/Xen_on_ARM_Early_Printk

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
7 months agoxen/ucode: Fix buffer under-run when parsing AMD containers
Demi Marie Obenour [Fri, 13 Sep 2024 10:29:32 +0000 (11:29 +0100)]
xen/ucode: Fix buffer under-run when parsing AMD containers

The AMD container format has no formal spec.  It is, at best, precision
guesswork based on AMD's prior contributions to open source projects.  The
Equivalence Table has both an explicit length, and an expectation of having a
NULL entry at the end.

Xen was sanity checking the NULL entry, but without confirming that an entry
was present, resulting in a read off the front of the buffer.  With some
manual debugging/annotations this manifests as:

  (XEN) *** Buf ffff83204c00b19c, eq ffff83204c00b194
  (XEN) *** eq: 0c 00 00 00 44 4d 41 00 00 00 00 00 00 00 00 00 aa aa aa aa
                            ^-Actual buffer-------------------^
  (XEN) *** installed_cpu: 000c
  (XEN) microcode: Bad equivalent cpu table
  (XEN) Parsing microcode blob error -22

When loaded by hypercall, the 4 bytes interpreted as installed_cpu happen to
be the containing struct ucode_buf's len field, and luckily will be nonzero.

When loaded at boot, it's possible for the access to #PF if the module happens
to have been placed on a 2M boundary by the bootloader.  Under Linux, it will
commonly be the end of the CPIO header.

Drop the probe of the NULL entry; nothing else cares.  A container without one
is well formed, insofar as we can still parse it correctly.  With this
dropped, the same container results in:

  (XEN) microcode: couldn't find any matching ucode in the provided blob!
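
The missing precondition can be sketched as follows.  Note the fix itself
drops the probe rather than guarding it; this sketch only shows why probing
the last entry of an empty table reads before the buffer.  The struct layout
and helper name are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an equivalence table entry. */
struct equiv_entry {
    uint32_t installed_cpu;
    uint32_t other[3];
};

/*
 * Returns true when the table ends in an entry with installed_cpu == 0.
 * Refuses to look when the table holds no complete entry.
 */
static bool has_null_terminator(const struct equiv_entry *eq, size_t len_bytes)
{
    size_t nr = len_bytes / sizeof(*eq);

    if ( nr == 0 )
        return false; /* Without this, eq[nr - 1] reads before the buffer. */

    return eq[nr - 1].installed_cpu == 0;
}
```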

Fixes: 4de936a38aa9 ("x86/ucode/amd: Rework parsing logic in cpu_request_microcode()")
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agoxen/keyhandler: Move key_table[] into __ro_after_init
Andrew Cooper [Thu, 12 Sep 2024 10:30:44 +0000 (11:30 +0100)]
xen/keyhandler: Move key_table[] into __ro_after_init

All registration is done at boot.  Almost...

iommu_dump_page_tables() is registered in iommu_hwdom_init(), which is called
twice when LATE_HWDOM is in use.

register_irq_keyhandler() has an ASSERT() guarding against multiple
registration attempts, and the absence of bug reports hints at how many
configurations use LATE_HWDOM in practice.

Move the registration into iommu_setup() just after printing the overall
status of the IOMMU.  For starters, the hardware domain is specifically
excluded by iommu_dump_page_tables().

ept_dump_p2m_table is registered in setup_ept_dump() which is non-__init, but
whose sole caller, start_vmx(), is __init.  Move setup_ept_dump() to match.

With these two tweaks, all keyhandler registration is from __init functions,
so register_{,irq_}keyhandler() can move, and key_table[] can become
__ro_after_init.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agox86/hvm: Simplify stdvga_mem_accept() further
Andrew Cooper [Thu, 12 Sep 2024 11:04:17 +0000 (12:04 +0100)]
x86/hvm: Simplify stdvga_mem_accept() further

stdvga_mem_accept() is called on almost all IO emulations, and the
overwhelmingly likely answer is to reject the ioreq.  Simply rearranging the
expression yields an improvement:

  add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-57 (-57)
  Function                                     old     new   delta
  stdvga_mem_accept                            109      52     -57

which is best explained looking at the disassembly:

  Before:                                                    After:
  f3 0f 1e fa           endbr64                              f3 0f 1e fa           endbr64
  0f b6 4e 1e           movzbl 0x1e(%rsi),%ecx            |  0f b6 46 1e           movzbl 0x1e(%rsi),%eax
  48 8b 16              mov    (%rsi),%rdx                |  31 d2                 xor    %edx,%edx
  f6 c1 40              test   $0x40,%cl                  |  a8 30                 test   $0x30,%al
  75 38                 jne    <stdvga_mem_accept+0x48>   |  75 23                 jne    <stdvga_mem_accept+0x31>
  31 c0                 xor    %eax,%eax                  <
  48 81 fa ff ff 09 00  cmp    $0x9ffff,%rdx              <
  76 26                 jbe    <stdvga_mem_accept+0x41>   <
  8b 46 14              mov    0x14(%rsi),%eax            <
  8b 7e 10              mov    0x10(%rsi),%edi            <
  48 0f af c7           imul   %rdi,%rax                  <
  48 8d 54 02 ff        lea    -0x1(%rdx,%rax,1),%rdx     <
  31 c0                 xor    %eax,%eax                  <
  48 81 fa ff ff 0b 00  cmp    $0xbffff,%rdx              <
  77 0c                 ja     <stdvga_mem_accept+0x41>   <
  83 e1 30              and    $0x30,%ecx                 <
  75 07                 jne    <stdvga_mem_accept+0x41>   <
  83 7e 10 01           cmpl   $0x1,0x10(%rsi)               83 7e 10 01           cmpl   $0x1,0x10(%rsi)
  0f 94 c0              sete   %al                        |  75 1d                 jne    <stdvga_mem_accept+0x31>
  c3                    ret                               |  48 8b 0e              mov    (%rsi),%rcx
  66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)           |  48 81 f9 ff ff 09 00  cmp    $0x9ffff,%rcx
  8b 46 10              mov    0x10(%rsi),%eax            |  76 11                 jbe    <stdvga_mem_accept+0x31>
  8b 7e 14              mov    0x14(%rsi),%edi            |  8b 46 14              mov    0x14(%rsi),%eax
  49 89 d0              mov    %rdx,%r8                   |  48 8d 44 01 ff        lea    -0x1(%rcx,%rax,1),%rax
  48 83 e8 01           sub    $0x1,%rax                  |  48 3d ff ff 0b 00     cmp    $0xbffff,%rax
  48 8d 54 3a ff        lea    -0x1(%rdx,%rdi,1),%rdx     |  0f 96 c2              setbe  %dl
  48 0f af c7           imul   %rdi,%rax                  |  89 d0                 mov    %edx,%eax
  49 29 c0              sub    %rax,%r8                   <
  31 c0                 xor    %eax,%eax                  <
  49 81 f8 ff ff 09 00  cmp    $0x9ffff,%r8               <
  77 be                 ja     <stdvga_mem_accept+0x2a>   <
  c3                    ret                                  c3                    ret

By moving the "p->count != 1" check ahead of the
ioreq_mmio_{first,last}_byte() calls, both multiplies disappear along with a
lot of surrounding logic.
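
The reordering can be sketched roughly as below.  This is a simplified model,
not the real stdvga_mem_accept(): the struct is a hypothetical, trimmed
stand-in for ioreq_t and the constants mirror the 0xa0000-0xbffff range seen
in the disassembly.

```c
#include <stdbool.h>
#include <stdint.h>

#define VGA_MEM_BASE 0xa0000u
#define VGA_MEM_SIZE 0x20000u

/* Hypothetical, trimmed stand-in for ioreq_t. */
struct mmio_req {
    uint64_t addr;
    uint32_t count;
    uint32_t size;
    bool is_read;
    bool data_is_ptr;
};

static bool vga_accept(const struct mmio_req *p)
{
    /* Cheap rejections first: most requests fail one of these. */
    if ( p->is_read || p->data_is_ptr || p->count != 1 )
        return false;

    /*
     * With count == 1 the first byte is addr and the last byte is
     * addr + size - 1, so no multiplication by count is needed.
     */
    return p->addr >= VGA_MEM_BASE &&
           p->addr + p->size - 1 < VGA_MEM_BASE + VGA_MEM_SIZE;
}
```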

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 months agoARM/cache: Drop legacy __read_mostly/__ro_after_init definitions
Andrew Cooper [Thu, 30 May 2024 20:09:48 +0000 (21:09 +0100)]
ARM/cache: Drop legacy __read_mostly/__ro_after_init definitions

These are no longer needed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
7 months agox86/mm: undo type change of partial_flags
Jan Beulich [Thu, 12 Sep 2024 15:52:27 +0000 (17:52 +0200)]
x86/mm: undo type change of partial_flags

Clang dislikes the boolean type combined with the field being set using
PTF_partial_set.

Fixes: 5ffe6d4a02e0 ("types: replace remaining uses of s16")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Frediano Ziglio <frediano.ziglio@cloud.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agoblkif: reconcile protocol specification with in-use implementations
Roger Pau Monné [Thu, 12 Sep 2024 12:04:56 +0000 (14:04 +0200)]
blkif: reconcile protocol specification with in-use implementations

Current blkif implementations (both backends and frontends) all differ slightly
in how they handle the 'sector-size' xenstore node, and in how other fields are
derived from this value or hardcoded to be expressed in units of 512 bytes.

To give some context, this is an excerpt of how different implementations use
the value in 'sector-size' as the base unit for other fields rather than
just to set the logical sector size of the block device:

                        │ sectors xenbus node │ requests sector_number │ requests {first,last}_sect
────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
FreeBSD blk{front,back} │     sector-size     │      sector-size       │           512
────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
Linux blk{front,back}   │         512         │          512           │           512
────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
QEMU blkback            │     sector-size     │      sector-size       │       sector-size
────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
Windows blkfront        │     sector-size     │      sector-size       │       sector-size
────────────────────────┼─────────────────────┼────────────────────────┼───────────────────────────
MiniOS                  │     sector-size     │          512           │           512

An attempt was made by 67e1c050e36b in order to change the base units of the
request fields and the xenstore 'sectors' node.  That however only led to more
confusion, as the specification now clearly diverged from the reference
implementation in Linux.  Such change was only implemented for QEMU Qdisk
and Windows PV blkfront.

Partially revert to the state before 67e1c050e36b while adjusting the
documentation for 'sectors' to match what it used to be previous to
2fa701e5346d:

 * Declare 'feature-large-sector-size' deprecated.  Frontends should not expose
   the node, backends should not make decisions based on its presence.

 * Clarify that 'sectors' xenstore node and the requests fields are always in
   512-byte units, like it was previous to 2fa701e5346d and 67e1c050e36b.

All base units for the fields used in the protocol are 512-byte based, the
xenbus 'sector-size' field is only used to signal the logical block size.  When
'sector-size' is greater than 512, blkfront implementations must make sure that
the offsets and sizes (despite being expressed in 512-byte units) are aligned
to the logical block size specified in 'sector-size', otherwise the backend
will fail to process the requests.
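
A minimal sketch of the alignment requirement, assuming 'sector-size' is a
multiple of 512.  The helper and macro names are hypothetical and not part of
blkif.h; the offset and length are both expressed in 512-byte units.

```c
#include <stdbool.h>
#include <stdint.h>

#define BLKIF_UNIT 512u /* protocol base unit, in bytes */

/*
 * Hypothetical helper: true when an offset and length, both expressed in
 * 512-byte units, are aligned to a logical block of sector_size bytes.
 */
static bool blkif_req_aligned(uint64_t sector_number, uint32_t nr_units,
                              uint32_t sector_size)
{
    uint32_t units_per_block = sector_size / BLKIF_UNIT;

    /* Reject sizes that are zero or not a multiple of the base unit. */
    if ( units_per_block == 0 || (sector_size % BLKIF_UNIT) != 0 )
        return false;

    return (sector_number % units_per_block) == 0 &&
           (nr_units % units_per_block) == 0;
}
```

With a 4096-byte logical block, for example, offsets and lengths must be
multiples of 8 units even though they are still counted in 512-byte units.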

This will require changes to some of the frontends and backends in order to
properly support 'sector-size' nodes greater than 512.

Fixes: 2fa701e5346d ('blkif.h: Provide more complete documentation of the blkif interface')
Fixes: 67e1c050e36b ('public/io/blkif.h: try to fix the semantics of sector based quantities')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
7 months agotypes: replace remaining uses of s32
Jan Beulich [Thu, 12 Sep 2024 12:03:50 +0000 (14:03 +0200)]
types: replace remaining uses of s32

... and move the type itself to linux-compat.h.

While doing so switch a few adjacent types as well, for (a little bit
of) consistency.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agotypes: replace remaining uses of s16
Jan Beulich [Thu, 12 Sep 2024 12:01:42 +0000 (14:01 +0200)]
types: replace remaining uses of s16

... and move the type itself to linux-compat.h.

While doing so switch an adjacent x86 struct page_info field to bool.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agoxen/x86/pvh: handle ACPI RSDT table in PVH Dom0 build
Stefano Stabellini [Thu, 12 Sep 2024 07:18:25 +0000 (09:18 +0200)]
xen/x86/pvh: handle ACPI RSDT table in PVH Dom0 build

Xen always generates an XSDT table even if the firmware only provided an
RSDT table.  Copy the RSDT header from the firmware table, adjusting the
signature, for the XSDT table when not provided by the firmware.
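
As an illustration only, the header copy could look like the sketch below.
The struct follows the standard 36-byte ACPI table header layout; the type and
function names are hypothetical and this is not the code in the PVH Dom0
builder.  Length and checksum still need recomputing by the caller.

```c
#include <stdint.h>
#include <string.h>

/* Standard 36-byte ACPI system description table header. */
struct acpi_header {
    char     signature[4];
    uint32_t length;
    uint8_t  revision;
    uint8_t  checksum;
    char     oem_id[6];
    char     oem_table_id[8];
    uint32_t oem_revision;
    char     asl_compiler_id[4];
    uint32_t asl_compiler_revision;
};

/*
 * Hypothetical helper: seed an XSDT header from the firmware's RSDT header,
 * changing only the signature.  The caller must fix up length and checksum
 * once the table entries are in place.
 */
static void xsdt_header_from_rsdt(struct acpi_header *xsdt,
                                  const struct acpi_header *rsdt)
{
    *xsdt = *rsdt;
    memcpy(xsdt->signature, "XSDT", 4);
}
```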

This is necessary to run Xen on QEMU.

Fixes: 1d74282c455f ('x86: setup PVHv2 Dom0 ACPI tables')
Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com>
Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 months agox86/HVM: drop .complete hook for intercept handling
Jan Beulich [Thu, 12 Sep 2024 07:17:43 +0000 (09:17 +0200)]
x86/HVM: drop .complete hook for intercept handling

No user of the hook exists anymore.

While touching hvm_mmio_internal() also make direction of the request
explicit - it only so happens that IOREQ_WRITE is zero. Yet it being a
write is imperative for stdvga.c to "accept" the request.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: drop stdvga's "lock" struct member
Jan Beulich [Thu, 12 Sep 2024 07:17:02 +0000 (09:17 +0200)]
x86/HVM: drop stdvga's "lock" struct member

No state is left to protect. It being the last field, drop the struct
itself as well. Similarly, as it then ends up empty, drop the .complete
handler.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: drop stdvga's "vram_page[]" struct member
Jan Beulich [Thu, 12 Sep 2024 07:15:52 +0000 (09:15 +0200)]
x86/HVM: drop stdvga's "vram_page[]" struct member

No uses are left, hence its setup, teardown, and the field itself can
also go away. stdvga_deinit() is then empty and can be dropped as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: drop stdvga's "{g,s}r_index" struct members
Jan Beulich [Thu, 12 Sep 2024 07:15:23 +0000 (09:15 +0200)]
x86/HVM: drop stdvga's "{g,s}r_index" struct members

No consumers are left, hence the producer and the fields themselves can
also go away. stdvga_outb() is then useless, rendering stdvga_out()
useless as well. Hence the entire I/O port intercept can go away.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: drop stdvga's "sr[]" struct member
Jan Beulich [Thu, 12 Sep 2024 07:14:55 +0000 (09:14 +0200)]
x86/HVM: drop stdvga's "sr[]" struct member

No consumers are left, hence the producer and the array itself can also
go away. The static sr_mask[] is then orphaned and hence needs dropping,
too.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: drop stdvga's "gr[]" struct member
Jan Beulich [Thu, 12 Sep 2024 07:14:27 +0000 (09:14 +0200)]
x86/HVM: drop stdvga's "gr[]" struct member

No consumers are left, hence the producer and the array itself can also
go away. The static gr_mask[] is then orphaned and hence needs dropping,
too.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: remove unused MMIO handling code
Jan Beulich [Thu, 12 Sep 2024 07:13:57 +0000 (09:13 +0200)]
x86/HVM: remove unused MMIO handling code

All read accesses are rejected by the ->accept handler, while writes
bypass the bulk of the function body. Drop the dead code, leaving an
assertion in the read handler.

A number of other static items (and a macro) are then unreferenced and
hence also need (want) dropping. The same applies to the "latch" field
of the state structure.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: drop stdvga's "stdvga" struct member
Jan Beulich [Thu, 12 Sep 2024 07:13:27 +0000 (09:13 +0200)]
x86/HVM: drop stdvga's "stdvga" struct member

Two of its consumers are dead (in compile-time constant conditionals)
and the only remaining ones are merely controlling debug logging. Hence
the field is now pointless to set, which in particular allows getting rid
of the questionable conditional from which the field's value was
established (afaict 551ceee97513 ["x86, hvm: stdvga cache always on"]
had dropped too much of the earlier extra check that was there, and
quite likely further checks were missing).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86/HVM: properly reject "indirect" VRAM writes
Jan Beulich [Thu, 12 Sep 2024 07:13:04 +0000 (09:13 +0200)]
x86/HVM: properly reject "indirect" VRAM writes

While ->count will only be different from 1 for "indirect" (data in
guest memory) accesses, it being 1 does not exclude the request being an
"indirect" one. Check both to be on the safe side, and bring the ->count
part also in line with what ioreq_send_buffered() actually refuses to
handle.

Fixes: 3bbaaec09b1b ("x86/hvm: unify stdvga mmio intercept with standard mmio intercept")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months agox86emul: support CMPccXADD
Jan Beulich [Thu, 12 Sep 2024 07:11:53 +0000 (09:11 +0200)]
x86emul: support CMPccXADD

Unconditionally wire this through the ->rmw() hook. Since x86_emul_rmw()
now wants to construct and invoke a stub, make stub_exn available to it
via a new field in the emulator state structure.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago automation/eclair_analysis: address violation of Rule 20.7
Nicola Vetrini [Tue, 10 Sep 2024 12:43:21 +0000 (14:43 +0200)]
automation/eclair_analysis: address violation of Rule 20.7

MISRA Rule 20.7 states:
"Expressions resulting from the expansion of macro parameters
shall be enclosed in parentheses".

The files imported from the gnu-efi package are already deviated, yet
the macro NextMemoryDescriptor is used in non-excluded code, so a further
deviation is needed to exclude also any expansion of the macro.

No functional change.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months ago xen/bitmap: remove redundant deviations
Federico Serafini [Tue, 10 Sep 2024 10:50:07 +0000 (12:50 +0200)]
xen/bitmap: remove redundant deviations

Remove comment-based deviations, since a project-wide deviation that
covers such cases is present.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months ago automation/eclair: update configuration of Rule 20.7
Federico Serafini [Thu, 12 Sep 2024 00:34:37 +0000 (17:34 -0700)]
automation/eclair: update configuration of Rule 20.7

MISRA C:2012 Rule 20.7 states that "Expressions resulting from the
expansion of macro parameters shall be enclosed in parentheses".
The rationale of the rule is that if a macro argument expands to an
expression, there may be problems related to operator precedence, e.g.,

#define M(A, B) A * B

M(1+1, 2+2) will expand to: 1+1 * 2+2

Update ECLAIR configuration to tag as 'safe' the expansions of macro
arguments surrounded by the tokens '{', '}' and ';', since in their presence
problems related to operator precedence cannot occur.
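
The precedence problem and the 'safe' case can be sketched in C as follows. The macro and function names (MUL_UNSAFE, MUL_SAFE, RUN_STMT and the helpers) are hypothetical and only serve the illustration; they do not come from the Xen tree or the ECLAIR configuration.

```c
/* Parameters used bare: the expansion is subject to operator precedence. */
#define MUL_UNSAFE(A, B) A * B

/* Parameters parenthesised, as Rule 20.7 asks for. */
#define MUL_SAFE(A, B) ((A) * (B))

/* Expands to 1+1 * 2+2, which parses as 1 + (1 * 2) + 2. */
int mul_unsafe_example(void)
{
    return MUL_UNSAFE(1+1, 2+2);
}

/* Expands to ((1+1) * (2+2)). */
int mul_safe_example(void)
{
    return MUL_SAFE(1+1, 2+2);
}

/*
 * The parameter sits between '{' and ';', so no neighbouring operator can
 * bind to part of the argument, even without parentheses.
 */
#define RUN_STMT(stmt) do { stmt; } while ( 0 )

int run_stmt_example(void)
{
    int x = 0;

    RUN_STMT(x = 1 + 1);

    return x;
}
```

In RUN_STMT the argument is a whole statement, so wrapping it in parentheses would not even be possible in general, which is why tagging such expansions as safe is reasonable.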

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months ago automation/eclair_analysis: deviate linker symbols for Rule 18.2
Nicola Vetrini [Sat, 7 Sep 2024 13:03:25 +0000 (15:03 +0200)]
automation/eclair_analysis: deviate linker symbols for Rule 18.2

MISRA C Rule 18.2 states: "Subtraction between pointers shall
only be applied to pointers that address elements of the same array".

Subtractions between pointers where at least one symbol is a
symbol defined by the linker are safe and thus deviated, because
the compiler cannot exploit the undefined behaviour that would
arise from violating the rules in this case.

To create an ECLAIR configuration that contains the list of
linker-defined symbols, the script "linker-symbols.sh" is used
after a build of xen (without static analysis) is performed.
The generated file "linker_symbols.ecl" is then used as part of the
static analysis configuration.

Additional changes to the ECLAIR integration are:
- perform a build of xen without static analysis during prepare.sh
- run the scripts to generate the ECL configuration during prepare.sh,
  rather than analysis.sh
- export ECLAIR_PROJECT_ROOT earlier, to allow such generation

Additionally, the macro page_to_mfn performs a subtraction that is safe,
so its uses are deviated.

No functional changes.

Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 months ago automation/eclair_analysis: fix MISRA Rule 20.7 regression in self-tests.h
Nicola Vetrini [Sun, 8 Sep 2024 13:27:57 +0000 (15:27 +0200)]
automation/eclair_analysis: fix MISRA Rule 20.7 regression in self-tests.h

Prior to bd1664db7b7d ("xen/bitops: Introduce a multiple_bits_set() helper")
the definition of {COMPILE,RUNTIME}_CHECK was fully compliant with respect
to MISRA C Rule 20.7:

"Expressions resulting from the expansion of macro parameters shall be
enclosed in parentheses."

However, to allow testing function-like macros, parentheses on the "fn"
parameter were removed and thus new violations of the rule have been
introduced. Given the usefulness of this functionality,
it is deemed ok to deviate these two macros for this rule, because
their scope of (direct) usage is limited to just the file where they
are defined, and misuse is unlikely.
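
Why the parentheses had to go can be sketched as follows. All names here (twice_fn, twice_macro, SKETCH_CHECK) are hypothetical and are not the macros from self-tests.h. The point is only that a function-like macro is expanded when its name is directly followed by '(', which `(fn)(val)` would prevent.

```c
/* A real function and a function-like macro computing the same value. */
int twice_fn(int x)
{
    return 2 * x;
}

#define twice_macro(x) (2 * (x))

/*
 * "fn" is deliberately left unparenthesised.  With (fn)(val), passing
 * twice_macro would yield (twice_macro)(3), which is not a macro
 * invocation and fails to compile because no such function exists.
 */
#define SKETCH_CHECK(fn, val, res) (fn(val) == (res))

int check_function(void)
{
    return SKETCH_CHECK(twice_fn, 3, 6);
}

int check_macro(void)
{
    return SKETCH_CHECK(twice_macro, 3, 6);
}
```

The other two parameters can stay parenthesised, so only "fn" is left in violation of the rule.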

No functional change.

Fixes: bd1664db7b7d ("xen/bitops: Introduce a multiple_bits_set() helper")
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 months ago x86/hvm: allow {,un}map_pirq hypercalls unconditionally
Jiqian Chen [Wed, 11 Sep 2024 10:58:24 +0000 (12:58 +0200)]
x86/hvm: allow {,un}map_pirq hypercalls unconditionally

The current hypercall interfaces to manage and assign interrupts to
domains are mostly based on using pIRQs as handlers.  Such pIRQ values
are abstract domain-specific references to interrupts.

Classic HVM domains can have access to {,un}map_pirq hypercalls if the
domain is allowed to route physical interrupts over event channels.
That's however a different interface, limited to only mapping
interrupts to itself. PVH domains on the other hand never had access
to the interface, as PVH domains are not allowed to route interrupts
over event channels.

In order to allow setting up PCI passthrough from a PVH domain it
needs access to the {,un}map_pirq hypercalls so interrupts can be
assigned a pIRQ handler that can then be used by further hypercalls to
bind the interrupt to a domain.

Note that the {,un}map_pirq hypercalls end up calling helpers that are
already used against a PVH domain in order to set up interrupts for the
hardware domain when running in PVH mode.  physdev_map_pirq() will
call allocate_and_map_{gsi,msi}_pirq() which is already used by the
vIO-APIC or the vPCI code respectively.  So the exposed code paths are
not new when targeting a PVH domain, but rather previous callers are
not hypercall but emulation based.

Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Jiqian Chen <Jiqian.Chen@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/HVM: drop stdvga's "cache" struct member
Jan Beulich [Wed, 11 Sep 2024 10:57:53 +0000 (12:57 +0200)]
x86/HVM: drop stdvga's "cache" struct member

Since 68e1183411be ("libxc: introduce a xc_dom_arch for hvm-3.0-x86_32
guests"), HVM guests are built using XEN_DOMCTL_sethvmcontext, which
ends up disabling stdvga caching because of arch_hvm_load() being
involved in the processing of the request. With that the field is
useless, and can be dropped. Drop the helper functions manipulating /
checking as well right away, but leave the use sites of
stdvga_cache_is_enabled() with the hard-coded result the function would
have produced, to aid validation of subsequent dropping of further code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 months ago x86/mmcfg: address violation of MISRA C Rule 16.3
Federico Serafini [Wed, 11 Sep 2024 10:57:07 +0000 (12:57 +0200)]
x86/mmcfg: address violation of MISRA C Rule 16.3

Address a violation of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".
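
A minimal sketch of the kind of change this rule asks for, assuming the common case of a final switch-clause that lacks a `break`. The function classify_op and its values are hypothetical and unrelated to the mmcfg code touched by this patch.

```c
/* Hypothetical example: every switch-clause ends in an unconditional break. */
int classify_op(int op)
{
    int cost;

    switch ( op )
    {
    case 0:
        cost = 1;
        break;

    case 1:
        cost = 2;
        break;

    default:
        cost = -1;
        break; /* Redundant for control flow, but required by Rule 16.3. */
    }

    return cost;
}
```

Adding a `break` to the last clause leaves behaviour unchanged, which matches the "No functional change" note typical of such fixes.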

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/mm: address violations of MISRA C Rule 16.3
Federico Serafini [Wed, 11 Sep 2024 10:56:33 +0000 (12:56 +0200)]
x86/mm: address violations of MISRA C Rule 16.3

Address violations of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/monitor: address violation of MISRA C Rule 16.3
Federico Serafini [Wed, 11 Sep 2024 10:56:03 +0000 (12:56 +0200)]
x86/monitor: address violation of MISRA C Rule 16.3

Address a violation of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
7 months ago x86/hypercall: address violations of MISRA C Rule 16.3
Federico Serafini [Wed, 11 Sep 2024 10:55:35 +0000 (12:55 +0200)]
x86/hypercall: address violations of MISRA C Rule 16.3

Address violations of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/vm_event: address violation of MISRA C Rule 16.3
Federico Serafini [Wed, 11 Sep 2024 10:55:14 +0000 (12:55 +0200)]
x86/vm_event: address violation of MISRA C Rule 16.3

Address a violation of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
7 months ago x86/time: address violations of MISRA C Rule 16.3
Federico Serafini [Wed, 11 Sep 2024 10:54:52 +0000 (12:54 +0200)]
x86/time: address violations of MISRA C Rule 16.3

Address violations of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 months ago x86/psr: address violation of MISRA C Rule 16.3
Federico Serafini [Wed, 11 Sep 2024 10:54:22 +0000 (12:54 +0200)]
x86/psr: address violation of MISRA C Rule 16.3

Address a violation of MISRA C:2012 Rule 16.3:
"An unconditional `break' statement shall terminate every
switch-clause".

No functional change.

Signed-off-by: Federico Serafini <federico.serafini@bugseng.com>
Acked-by: Jan Beulich <jbeulich@suse.com>