xenbits.xensource.com Git - xen.git/log
xen.git
2 years ago vpci/msix: restore PBA access length and alignment restrictions
Roger Pau Monné [Wed, 29 Mar 2023 12:56:33 +0000 (14:56 +0200)]
vpci/msix: restore PBA access length and alignment restrictions

Accesses to the PBA array have the same length and alignment
limitations as accesses to the MSI-X table:

"For all accesses to MSI-X Table and MSI-X PBA fields, software must
use aligned full DWORD or aligned full QWORD transactions; otherwise,
the result is undefined."

Introduce such length and alignment checks into the handling of PBA
accesses for vPCI.  Dropping them was a mistake of mine, caused by not
reading the specification correctly.

Note that accesses must now be aligned, and hence there's no longer a
need to check that the end of the access falls into the PBA region as
both the access and the region addresses must be aligned.
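
As a rough illustration of the rule quoted above (a sketch only, not the
actual vPCI code), a length and alignment check could look like the
following.  The helper name access_allowed() is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not Xen's implementation: accept only aligned
 * full DWORD (4-byte) or full QWORD (8-byte) accesses.
 */
static bool access_allowed(uint64_t addr, unsigned int len)
{
    if ( len != 4 && len != 8 )
        return false;

    /* len is a power of two here, so (len - 1) is the alignment mask. */
    return !(addr & (len - 1));
}
```

With such a check in place an 8-byte access at offset 4 is rejected, while
a 4-byte access at the same offset is accepted.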

Fixes: b177892d2d ('vpci/msix: handle accesses adjacent to the MSI-X table')
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago ns16550: correct name/value pair parsing for PCI port/bridge
Jan Beulich [Wed, 29 Mar 2023 12:55:37 +0000 (14:55 +0200)]
ns16550: correct name/value pair parsing for PCI port/bridge

First of all these were inverted: "bridge=" caused the port coordinates
to be established, while "port=" controlled the bridge coordinates. And
then the error messages being identical also wasn't helpful. While
correcting this also move both case blocks close together.

Fixes: 97fd49a7e074 ("ns16550: add support for UART parameters to be specifed with name-value pairs")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago tools/xenstore: remove stale comment in create_node()
Juergen Gross [Wed, 29 Mar 2023 12:54:20 +0000 (14:54 +0200)]
tools/xenstore: remove stale comment in create_node()

There is a part of a comment in create_node() which should have been
deleted when modifying the related code.

Fixes: 1cd3cc7ea27c ("tools/xenstore: create_node: Don't defer work to undo any changes on failure")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
2 years ago vpci/msix: handle accesses adjacent to the MSI-X table
Roger Pau Monné [Tue, 28 Mar 2023 12:20:35 +0000 (14:20 +0200)]
vpci/msix: handle accesses adjacent to the MSI-X table

The handling of the MSI-X table accesses by Xen requires that any
pages part of the MSI-X related tables are not mapped into the domain
physmap.  As a result, any device registers in the same pages as the
start or the end of the MSIX or PBA tables are not currently
accessible, as the accesses are just dropped.

Note the spec forbids such placing of registers, as the MSIX and PBA
tables must be 4K isolated from any other registers:

"If a Base Address register that maps address space for the MSI-X
Table or MSI-X PBA also maps other usable address space that is not
associated with MSI-X structures, locations (e.g., for CSRs) used in
the other address space must not share any naturally aligned 4-KB
address range with one where either MSI-X structure resides."

Yet the 'Intel Wi-Fi 6 AX201' device on one of my boxes has registers
in the same page as the MSIX tables, and thus won't work on a PVH dom0
without this fix.

In order to cope with this behavior, pass through to the underlying
hardware any accesses that fall on the same page as the MSIX tables
(but don't fall in between).  Such forwarding also takes care of the
PBA accesses, so it allows removing the code doing this handling in
msix_{read,write}.  Note that as a result accesses to the PBA array
are no longer limited to 4 and 8 byte sizes, as there's no access size
restriction for PBA accesses documented in the specification.
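
A minimal sketch of how "same page as the table, but not inside it" could
be decided is shown below.  It assumes 4K pages; the helper name
adjacent_to_table() and its parameters are made up for illustration and
are not taken from the vPCI code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PFN(a)     ((a) >> PAGE_SHIFT)

/*
 * Hypothetical helper: true if addr shares a 4K page with the first or
 * last byte of the table [start, start + size) without being part of the
 * table itself.  size must be non-zero.
 */
static bool adjacent_to_table(uint64_t addr, uint64_t start, uint64_t size)
{
    uint64_t end = start + size; /* first byte past the table */

    if ( addr >= start && addr < end )
        return false; /* inside the table: handled by the emulation */

    return PFN(addr) == PFN(start) || PFN(addr) == PFN(end - 1);
}
```

For a table occupying [0x2800, 0x3800), an access at 0x2000 or 0x3900
would count as adjacent, while one at 0x4000 would not.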

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago include: don't mention stub headers more than once in a make rule
Jan Beulich [Tue, 28 Mar 2023 12:20:16 +0000 (14:20 +0200)]
include: don't mention stub headers more than once in a make rule

When !GRANT_TABLE and !PV_SHIM headers-n contains grant_table.h twice,
causing make to complain "target '...' given more than once in the same
rule" for the rule generating the stub headers. We don't need duplicate
entries in headers-n anywhere, so zap them (by using $(sort ...)) right
where the final value of the variable is constructed.

Fixes: 6bec713f871f ("include/compat: produce stubs for headers not otherwise generated")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2 years ago x86/monitor: add new monitor event to catch I/O instructions
Dmitry Isaykin [Tue, 28 Mar 2023 12:18:46 +0000 (14:18 +0200)]
x86/monitor: add new monitor event to catch I/O instructions

Adds monitor support for I/O instructions.

Signed-off-by: Dmitry Isaykin <isaikin-dmitry@yandex.ru>
Signed-off-by: Anton Belousov <abelousov@ptsecurity.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
2 years ago CI: Minor updates to buster-gcc-ibt
Andrew Cooper [Fri, 24 Feb 2023 18:23:38 +0000 (18:23 +0000)]
CI: Minor updates to buster-gcc-ibt

 * Update from GCC 11.2 to 11.3
 * Use python3-minimal instead of python
 * Use --no-install-recommends, requiring ca-certificates, g++-multilib and
   build-essential to be listed explicitly

The resulting container is ~50M smaller

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
2 years ago CI: Remove llvm-8 from the Debian Stretch container
Andrew Cooper [Fri, 24 Mar 2023 17:59:56 +0000 (17:59 +0000)]
CI: Remove llvm-8 from the Debian Stretch container

For similar reasons to c/s a6b1e2b80fe20.  While this container is still
build-able for now, all the other problems with explicitly-versioned compilers
remain.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 years ago configure: Drop --enable-githttp
Andrew Cooper [Fri, 24 Mar 2023 20:09:33 +0000 (20:09 +0000)]
configure: Drop --enable-githttp

Following Demi's work to use HTTPS everywhere, all users of GIT_HTTP have
been removed from the build system.  Drop the configure knob.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2 years ago x86/boot: Restrict directmap permissions for .text/.rodata
Andrew Cooper [Mon, 6 Dec 2021 13:07:40 +0000 (13:07 +0000)]
x86/boot: Restrict directmap permissions for .text/.rodata

While we've been diligent to ensure that the main text/data/rodata mappings
have suitable restrictions, their aliases via the directmap were left fully
read/write.  Worse, we even had pieces of code making use of this as a
feature.

Restrict the permissions for .text/rodata, as we have no legitimate need for
writeability of these areas via the directmap alias.  Note that the
compile-time allocated pagetables do get written through their directmap
alias, so need to remain writeable.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago x86/ucode: Fix error paths control_thread_fn()
Andrew Cooper [Mon, 4 May 2020 12:32:21 +0000 (13:32 +0100)]
x86/ucode: Fix error paths control_thread_fn()

These two early exits skipped re-enabling the watchdog, restoring the NMI
callback, and clearing the nmi_patch global pointer.  Always execute the tail
of the function on the way out.

Fixes: 8dd4dfa92d62 ("x86/microcode: Synchronize late microcode loading")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago automation: add a smoke and suspend test on an Alder Lake system
Marek Marczykowski-Górecki [Sat, 25 Mar 2023 21:11:58 +0000 (22:11 +0100)]
automation: add a smoke and suspend test on an Alder Lake system

This is a first test using Qubes OS CI infra. The gitlab-runner has
access to an ssh-based control interface (control@thor.testnet, ssh key
exposed to the test via ssh-agent) and a pre-configured HTTP dir for boot
files (mapped under /scratch/gitlab-runner/tftp inside the container).
Details about the setup are described on
https://www.qubes-os.org/news/2022/05/05/automated-os-testing-on-physical-laptops/

There are two tests. The first is a simple dom0+domU boot smoke test,
similar to other existing tests. The second boots Xen and checks whether
S3 works. It runs on an ADL-based desktop system. The test script is
based on the Xilinx one.

The machine needs a newer kernel than the other x86 tests run, so use the
6.1.x kernel added in the previous commit.

The usage of fakeroot is necessary to preserve device nodes (/dev/null
etc) when repacking rootfs. The test runs in a rootless podman
container, which doesn't have full root permissions. BTW the same
applies to docker with user namespaces enabled (but it's only an opt-in
feature there).

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 years ago automation: update x86-64 tests to Linux 6.1.19
Marek Marczykowski-Górecki [Sat, 25 Mar 2023 21:11:57 +0000 (22:11 +0100)]
automation: update x86-64 tests to Linux 6.1.19

It will be used in tests added in subsequent patches.
Enable config options needed for those tests.
While at it, migrate all the x86 tests to the newer kernel, and
introduce x86-64-test-needs to allow deduplication later (for now it's
used only once).

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 years ago x86/vmx: Don't spuriously crash the domain when INIT is received
Andrew Cooper [Thu, 24 Feb 2022 19:40:15 +0000 (19:40 +0000)]
x86/vmx: Don't spuriously crash the domain when INIT is received

In VMX operation, the handling of INIT IPIs is changed.  Instead of the CPU
resetting, the next VMEntry fails with EXIT_REASON_INIT.  From the TXT spec,
the intent of this behaviour is so that an entity which cares can scrub
secrets from RAM before participating in an orderly shutdown.

Right now, Xen's behaviour is that when an INIT arrives, the HVM VM which
schedules next is killed (citing an unknown VMExit), *and* we ignore the INIT
and continue blindly onwards anyway.

This patch addresses only the first of these two problems by ignoring the INIT
and continuing without crashing the VM in question.

The second wants addressing too, just as soon as we've figured out something
better to do...

Discovered as collateral damage from when an AP triple faults on S3 resume on
Intel TigerLake platforms.

Link: https://github.com/QubesOS/qubes-issues/issues/7283
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
2 years ago Revert "build: Change remaining xenbits.xen.org link to HTTPS"
Andrew Cooper [Fri, 24 Mar 2023 20:32:24 +0000 (20:32 +0000)]
Revert "build: Change remaining xenbits.xen.org link to HTTPS"

This reverts commit e1d75084443f676be681fdaf47585cc9a5f5b820.

After spending ages sorting out Gitlab CI, it appears that OSSTest too has an
out-of-date Let's Encrypt cert.  Revert again in the short term while we fix
this up.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago misc: Replace git:// and http:// with https://
Demi Marie Obenour [Tue, 21 Mar 2023 17:33:42 +0000 (13:33 -0400)]
misc: Replace git:// and http:// with https://

Obtaining code over an insecure transport is a terrible idea for
blatantly obvious reasons.  Even for non-executable data, insecure
transports are considered deprecated.

This patch enforces the use of secure transports in misc places.
All URLs are known to work.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago configure: Replace git:// and http:// with https://
Demi Marie Obenour [Tue, 21 Mar 2023 17:33:40 +0000 (13:33 -0400)]
configure: Replace git:// and http:// with https://

Obtaining code over an insecure transport is a terrible idea for
blatantly obvious reasons.  Even for non-executable data, insecure
transports are considered deprecated.

This patch enforces the use of secure transports in the build system.
Some URLs returned 301 or 302 redirects, so I replaced them with the
URLs that were redirected to.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago configure: Do not try to use broken links
Demi Marie Obenour [Tue, 21 Mar 2023 17:33:38 +0000 (13:33 -0400)]
configure: Do not try to use broken links

The upstream URLs for zlib, PolarSSL, and the TPM emulator do not work
anymore, so do not attempt to use them.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago build: Change remaining xenbits.xen.org link to HTTPS
Demi Marie Obenour [Tue, 21 Mar 2023 17:33:36 +0000 (13:33 -0400)]
build: Change remaining xenbits.xen.org link to HTTPS

Obtaining code over an insecure transport is a terrible idea for
blatantly obvious reasons.  Even for non-executable data, insecure
transports are considered deprecated.

This patch enforces the use of secure transports for all xenbits.xen.org
URLs.  All altered links have been tested and are known to work.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago build: Use HTTPS for all xenbits.xen.org Git repos
Demi Marie Obenour [Tue, 21 Mar 2023 17:33:34 +0000 (13:33 -0400)]
build: Use HTTPS for all xenbits.xen.org Git repos

Obtaining code over an insecure transport is a terrible idea for
blatantly obvious reasons.  Even for non-executable data, insecure
transports are considered deprecated.

This patch enforces the use of secure transports for all xenbits git
repositories.  It was generated with the following shell script:

    git ls-files -z |
    xargs -0 -- sed -Ei -- 's@(git://xenbits\.xen\.org|http://xenbits\.xen\.org/git-http)/@https://xenbits.xen.org/git-http/@g'

All altered links have been tested and are known to work.

Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago xen/trace: Minor code cleanup
Andrew Cooper [Wed, 15 Sep 2021 17:24:19 +0000 (18:24 +0100)]
xen/trace: Minor code cleanup

 * Delete trailing whitespace
 * Replace an opencoded DIV_ROUND_UP()
 * Drop bogus smp_rmb() - spin_lock_irqsave() has full smp_mb() semantics.
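
For readers unfamiliar with the idiom, DIV_ROUND_UP() is conventionally
defined along the lines below.  This is the common formulation rather
than a quote of Xen's header, and pages_for() is a made-up user.

```c
/* Integer division of n by d, rounding the result up. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical example: number of 4K pages needed to hold 'bytes'. */
static unsigned int pages_for(unsigned int bytes)
{
    return DIV_ROUND_UP(bytes, 4096u);
}
```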

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago xen/credit2: Remove tail padding from TRC_CSCHED2_* records
Andrew Cooper [Wed, 15 Sep 2021 16:01:43 +0000 (17:01 +0100)]
xen/credit2: Remove tail padding from TRC_CSCHED2_* records

All three of these records have tail padding, leaking stack rubble into the
trace buffer.  Introduce an explicit _pad field and have the compiler zero the
padding automatically.
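
The general technique can be sketched as follows.  The structures and
field names are invented for illustration and do not match the real
TRC_CSCHED2_* records.

```c
#include <stdint.h>

/* On common 64-bit ABIs the compiler appends 6 bytes of tail padding. */
struct rec_implicit {
    uint64_t time;
    uint16_t cpu;
};

/* Same data, with the padding spelled out as a named field. */
struct rec_explicit {
    uint64_t time;
    uint16_t cpu;
    uint16_t _pad[3];
};

static struct rec_explicit make_rec(uint64_t time, uint16_t cpu)
{
    /* Members not named in the initializer, including _pad, are zeroed. */
    struct rec_explicit r = { .time = time, .cpu = cpu };

    return r;
}
```

Because _pad is an ordinary member, every byte of the record copied into
the trace buffer has a defined value.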

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
2 years ago xen/memory: Remove tail padding from TRC_MEM_* records
Andrew Cooper [Wed, 15 Sep 2021 15:49:01 +0000 (16:49 +0100)]
xen/memory: Remove tail padding from TRC_MEM_* records

Four TRC_MEM_* records supply custom structures with tail padding, leaking
stack rubble into the trace buffer.  Three of the records were fine in 32-bit
builds of Xen, due to the relaxed alignment of 64-bit integers, but
POD_SUPERPAGE_SPLINTER was broken right from the outset.

We could pack the datastructures to remove the padding, but xentrace_format
has no way of rendering the upper half of a 16-bit field.  Instead, expand all
16-bit fields to 32-bit.

For POD_SUPERPAGE_SPLINTER, introduce an order field as it is relevant
information, and to match DECREASE_RESERVATION, and so it doesn't require a
__packed attribute to drop tail padding.

Update xenalyze's structures to match, and introduce xentrace_format rendering
which was absent previously.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
2 years ago xen/trace: Don't over-read trace objects
Andrew Cooper [Thu, 16 Sep 2021 09:24:26 +0000 (10:24 +0100)]
xen/trace: Don't over-read trace objects

In the case that 'extra' isn't a multiple of uint32_t, the calculation rounds
the number of bytes up, causing later logic to read unrelated bytes beyond the
end of the object.

Also, asserting that the object is within TRACE_EXTRA_MAX, but truncating it
in release builds is rude.  Instead, reject any out-of-spec records, leaving
enough of a message to identify the faulty caller.
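
A small sketch of the arithmetic involved, under the assumption that the
limit is 7 uint32_t words; both helper names are hypothetical and the
real checks in Xen may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EXTRA_MAX_WORDS 7 /* assumed limit, in uint32_t units */

/* The problematic pattern: 5 bytes become 2 words, i.e. 8 bytes read. */
static size_t words_rounded_up(size_t extra)
{
    return (extra + sizeof(uint32_t) - 1) / sizeof(uint32_t);
}

/* Stricter alternative: reject sizes that are not whole, in-range words. */
static bool extra_size_ok(size_t extra)
{
    return !(extra % sizeof(uint32_t)) &&
           extra / sizeof(uint32_t) <= EXTRA_MAX_WORDS;
}
```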

There is one buggy trace record, TRC_RTDS_BUDGET_BURN.  As it must remain
__packed (as cur_budget is misaligned), change bool has_extratime to uint32_t
to compensate.

It turns out that the new printk() can also be hit by HVMOP_xentrace, because
the hypercall is broken.  It cannot be used outside of custom debugging, as
none of the tooling was ever updated to understand TRC_GUEST, nor is there any
evidence of the hypercall ever being used in public.

While the hypercall was clearly intended to be used with units of uint32_t's,
that's not how the API/ABI works - Xen will in fact read the entire structure
rather than the initialised subset out of guest memory (most likely, stack
rubble), then copy up to 3 bytes of it (rounding up to the next uint32_t) into
the real tracebuffer.

There are several possible ways to fix this, but as the hypercall is broken
and does not plausibly have any users, go with the one that needs the least
logic in Xen, by rejecting tracing attempts that are not of uint32_t size.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago x86/hvm: Improve hvm_set_guest_pat() code generation again
Edwin Török [Mon, 16 May 2022 19:45:13 +0000 (20:45 +0100)]
x86/hvm: Improve hvm_set_guest_pat() code generation again

Following on from cset 9ce0a5e207f3 ("x86/hvm: Improve hvm_set_guest_pat()
code generation"), and the discovery that Clang/LLVM makes some especially
disastrous code generation for the loop at -O2

  https://github.com/llvm/llvm-project/issues/54644

Edwin decided to remove the loop entirely by fully vectorising it.  This is
substantially more efficient than the loop, and rather harder for a typical
compiler to mess up.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 years ago x86/boot: Factor move_xen() out of __start_xen()
Andrew Cooper [Fri, 3 Dec 2021 20:33:57 +0000 (20:33 +0000)]
x86/boot: Factor move_xen() out of __start_xen()

Partly for clarity because there is a lot of subtle magic at work here.
Expand the commentary of what's going on.

Also because there is no need to double copy the stack (32kB).  Spilled
content does need accounting for, but this can be sorted by copying only a
handful of words.

No practical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago x86/shadow: Fix build with no PG_log_dirty
Andrew Cooper [Thu, 23 Mar 2023 23:41:20 +0000 (23:41 +0000)]
x86/shadow: Fix build with no PG_log_dirty

Gitlab Randconfig found:

  arch/x86/mm/shadow/common.c: In function 'shadow_prealloc':
  arch/x86/mm/shadow/common.c:1023:18: error: implicit declaration of function
      'paging_logdirty_levels'; did you mean 'paging_log_dirty_init'? [-Werror=implicit-function-declaration]
   1023 |         count += paging_logdirty_levels();
        |                  ^~~~~~~~~~~~~~~~~~~~~~
        |                  paging_log_dirty_init
  arch/x86/mm/shadow/common.c:1023:18: error: nested extern declaration of 'paging_logdirty_levels' [-Werror=nested-externs]

The '#if PG_log_dirty' expression is currently SHADOW_PAGING && !HVM &&
PV_SHIM_EXCLUSIVE.  Move the declaration outside.

Fixes: 33fb3a661223 ("x86/shadow: account for log-dirty mode when pre-allocating")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago x86/hvmloader: Don't override stddef.h
Andrew Cooper [Wed, 24 Aug 2022 10:06:18 +0000 (11:06 +0100)]
x86/hvmloader: Don't override stddef.h

Since c/s 73b13705af7c ("firmware: provide a stand alone set of headers"),
we've had a proper stddef.h.  Actually use it.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 years ago x86/hvmloader: Don't build as PIC
Andrew Cooper [Wed, 24 Aug 2022 09:48:48 +0000 (10:48 +0100)]
x86/hvmloader: Don't build as PIC

HVMLoader is not relocatable in memory, and 32bit PIC code has a large
overhead.  Override the compiler's choice of pic/no-pic and force it to be
non-relocatable.

Bloat-o-meter reports a net:
  add/remove: 0/0 grow/shrink: 3/107 up/down: 14/-3370 (-3356)

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago xen: Modify domain_crash() to take a print string
Andrew Cooper [Thu, 20 Jan 2022 15:45:02 +0000 (15:45 +0000)]
xen: Modify domain_crash() to take a print string

There are two problems with domain_crash().

First, that it is frequently not preceded by a printk() at all, or only by a
dprintk().  Either way, critical diagnostic information is missing for an
event which is fatal to the guest.

Second, the embedded __LINE__ is an issue for livepatching, creating unwanted
churn in the binary diff.  This is the final __LINE__ remaining in
livepatching-relevant contexts.

The end goal is to have domain_crash() require a print string which gets fed
to printk(), making it far less easy to omit relevant diagnostic information.

However, modifying all callers at once is far too big and complicated, so use
some macro magic to tolerate the old API (no print string) in the short term.
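
One way such an optional print string can be accommodated is sketched
below.  This is not the macro from the patch: it uses printf() and a
stand-in do_crash() so that it builds outside Xen, and it relies on the
compiler accepting an empty variadic argument list (GNU C, or C23).

```c
#include <stdio.h>

static int crashed_domid = -1; /* stand-in state for the example */

static void do_crash(int domid)
{
    crashed_domid = domid;
}

/*
 * Hypothetical sketch: pasting a string literal in front of __VA_ARGS__
 * yields a valid printf() call both with and without a caller-supplied
 * format string.
 */
#define domain_crash(d, ...)                  \
    do {                                      \
        printf("crash: " __VA_ARGS__);        \
        do_crash(d);                          \
    } while ( 0 )
```

Both domain_crash(d) and domain_crash(d, "Bad selector %#x\n", sel) are
accepted by this form, so callers can be converted one at a time.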

Adjust two callers in load_segments() to demonstrate the new API.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago x86/nospec: Fix evaluate_nospec() code generation under Clang
Andrew Cooper [Mon, 25 Apr 2022 16:25:53 +0000 (17:25 +0100)]
x86/nospec: Fix evaluate_nospec() code generation under Clang

It turns out that evaluate_nospec() code generation is not safe under Clang.
Given:

  void eval_nospec_test(int x)
  {
      if ( evaluate_nospec(x) )
          asm volatile ("nop #true" ::: "memory");
      else
          asm volatile ("nop #false" ::: "memory");
  }

Clang emits:

  <eval_nospec_test>:
         0f ae e8                lfence
         85 ff                   test   %edi,%edi
         74 02                   je     <eval_nospec_test+0x9>
         90                      nop
         c3                      ret
         90                      nop
         c3                      ret

which is not safe because the lfence has been hoisted above the conditional
jump.  Clang concludes that both barrier_nospec_true()'s have identical side
effects and can safely be merged.

Clang can be persuaded that the side effects are different if there are
different comments in the asm blocks.  This is fragile, but no more fragile
than other aspects of this construct.

Introduce barrier_nospec_false() with a separate internal comment to prevent
Clang merging it with barrier_nospec_true() despite the otherwise-identical
content.  The generated code now becomes:

  <eval_nospec_test>:
         85 ff                   test   %edi,%edi
         74 05                   je     <eval_nospec_test+0x9>
         0f ae e8                lfence
         90                      nop
         c3                      ret
         0f ae e8                lfence
         90                      nop
         c3                      ret

which has the correct number of lfence's, and in the correct place.

Link: https://github.com/llvm/llvm-project/issues/55084
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years ago tools/migration: Fix iovec handling in send_checkpoint_dirty_pfn_list()
Andrew Cooper [Mon, 5 Jul 2021 20:05:14 +0000 (21:05 +0100)]
tools/migration: Fix iovec handling in send_checkpoint_dirty_pfn_list()

We shouldn't be using two struct iovec's to write half of 'rec' each, and
there is no need to malloc() for two struct iovec's at all.

Simplify down to just two - one covering the whole of 'rec', and one covering
the pfns array.
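
The shape of the resulting code can be sketched like this.  The record
header layout and both function names are invented for the example; only
struct iovec and writev() are real POSIX interfaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical record header, not the real migration stream layout. */
struct rec_hdr {
    uint32_t type;
    uint32_t length;
};

/* Fill two entries: the whole header, then the pfns array. */
static size_t build_iov(struct iovec iov[2], struct rec_hdr *rec,
                        uint64_t *pfns, unsigned int count)
{
    iov[0].iov_base = rec;
    iov[0].iov_len = sizeof(*rec);
    iov[1].iov_base = pfns;
    iov[1].iov_len = count * sizeof(*pfns);

    return iov[0].iov_len + iov[1].iov_len;
}

static ssize_t write_record(int fd, struct rec_hdr *rec,
                            uint64_t *pfns, unsigned int count)
{
    struct iovec iov[2]; /* on the stack, so nothing to malloc() or free() */

    build_iov(iov, rec, pfns, count);

    return writev(fd, iov, 2);
}
```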

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Olaf Hering <olaf@aepfle.de>
2 years ago xen/riscv: Fix early_puts() newline handling
Andrew Cooper [Thu, 2 Mar 2023 20:35:28 +0000 (20:35 +0000)]
xen/riscv: Fix early_puts() newline handling

OpenSBI already expands \n to \r\n.  Don't repeat the expansion, as it doubles
the size of the resulting log with every other line being blank.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
2 years ago xen/check-endbr.sh: Explain the purpose of the script
Andrew Cooper [Tue, 5 Jul 2022 14:51:58 +0000 (15:51 +0100)]
xen/check-endbr.sh: Explain the purpose of the script

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 years ago xen/argo: Fixes to argo_dprintk()
Andrew Cooper [Fri, 14 Oct 2022 13:39:55 +0000 (14:39 +0100)]
xen/argo: Fixes to argo_dprintk()

Rewrite argo_dprintk() so printk() format typechecking can always be
performed.  This also fixes the fact that parameters are not evaluated at all
in the default case.
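
A common pattern for this kind of debug macro is sketched below, using
printf() in place of printk().  The names dbg_printf() and ARGO_DEBUG as
used here are assumptions for the example, and ##__VA_ARGS__ is a GNU
extension.

```c
#include <stdio.h>

#ifndef ARGO_DEBUG
#define ARGO_DEBUG 0
#endif

/*
 * Hypothetical sketch: the printf() call is always compiled, so the
 * format string and arguments are type-checked in every build.  With
 * ARGO_DEBUG at 0 the branch is dead code the compiler can discard, and
 * the call (including its arguments) is never executed at run time.
 */
#define dbg_printf(fmt, ...)                          \
    do {                                              \
        if ( ARGO_DEBUG )                             \
            printf("argo: " fmt, ##__VA_ARGS__);      \
    } while ( 0 )
```

The older style of defining the macro to nothing when debugging is off
hides format mismatches until someone enables the option.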

Emit the messages at XENLOG_DEBUG.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
2 years ago x86/shadow: OOS mode is HVM-only
Jan Beulich [Fri, 24 Mar 2023 10:20:59 +0000 (11:20 +0100)]
x86/shadow: OOS mode is HVM-only

XEN_DOMCTL_CDF_oos_off is forced set for PV domains, so the logic can't
ever be engaged for them. Conditionalize respective fields and remove
the respective bit from SHADOW_OPTIMIZATIONS when !HVM. As a result the
SH_type_oos_snapshot constant can disappear altogether in that case, and
a couple of #ifdef-s can also be dropped/combined.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago x86/shadow: purge {write,cmpxchg}_guest_entry() hooks
Jan Beulich [Fri, 24 Mar 2023 10:19:37 +0000 (11:19 +0100)]
x86/shadow: purge {write,cmpxchg}_guest_entry() hooks

These aren't mode dependent (see 06f04f54ba97 ["x86/shadow:
sh_{write,cmpxchg}_guest_entry() are PV-only"], where they were moved
out of multi.c) and hence there's no need to have pointers to the
functions in struct shadow_paging_mode. Due to include dependencies,
however, the "paging" wrappers need to move out of paging.h; they're
needed from PV memory management code only anyway, so by moving them
their exposure is reduced at the same time.

By carefully placing the (moved and renamed) shadow function
declarations, #ifdef can also be dropped from the "paging" wrappers
(paging_mode_shadow() is constant false when !SHADOW_PAGING).

While moving the code, drop the (largely wrong) comment from
paging_write_guest_entry() and reduce that of
paging_cmpxchg_guest_entry().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago tools/libs/vchan: remove private offsetof() definition
Juergen Gross [Fri, 24 Mar 2023 10:14:25 +0000 (11:14 +0100)]
tools/libs/vchan: remove private offsetof() definition

vchan/init.c is defining offsetof privately. Remove that definition
and just use stddef.h instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2 years ago tools/libfsimage: remove private offsetof() definition
Juergen Gross [Fri, 24 Mar 2023 10:14:11 +0000 (11:14 +0100)]
tools/libfsimage: remove private offsetof() definition

xfs/fsys_xfs.c is defining offsetof privately. Remove that definition
and just use stddef.h instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2 years ago tools/hvmloader: remove private offsetof() definition
Juergen Gross [Fri, 24 Mar 2023 10:13:57 +0000 (11:13 +0100)]
tools/hvmloader: remove private offsetof() definition

util.h contains a definition of offsetof(), which isn't needed, as
firmware/include/stddef.h's doesn't really need overriding.

Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 years ago tools: add container_of() macro to xen-tools/common-macros.h
Juergen Gross [Fri, 24 Mar 2023 10:13:43 +0000 (11:13 +0100)]
tools: add container_of() macro to xen-tools/common-macros.h

Instead of having 3 identical copies of the definition of a
container_of() macro in different tools header files, add that macro
to xen-tools/common-macros.h and use that instead.
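
For reference, container_of() is commonly written along the following
lines.  The exact definition in xen-tools/common-macros.h may differ, and
the two structures here are made up for the example.

```c
#include <stddef.h>

/* Common formulation: step back from a member to its enclosing object. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct list_node {
    struct list_node *next;
};

struct item {
    int id;
    struct list_node node;
};

/* Recover the item that embeds the given list node. */
static struct item *item_from_node(struct list_node *n)
{
    return container_of(n, struct item, node);
}
```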

Delete the other copies of that macro.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2 years ago tools: get rid of additional min() and max() definitions
Juergen Gross [Fri, 24 Mar 2023 10:12:32 +0000 (11:12 +0100)]
tools: get rid of additional min() and max() definitions

Defining min(), min_t(), max() and max_t() at other places than
xen-tools/common-macros.h isn't needed, as the definitions in said
header can be used instead.

Same applies to BUILD_BUG_ON() in hvmloader.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2 years ago x86/PV: conditionalize arch_set_info_guest()'s call to update_cr3()
Jan Beulich [Fri, 24 Mar 2023 10:11:48 +0000 (11:11 +0100)]
x86/PV: conditionalize arch_set_info_guest()'s call to update_cr3()

sh_update_paging_modes() as its last action already invokes
sh_update_cr3(). Therefore there is no reason to invoke update_cr3()
another time immediately after calling paging_update_paging_modes(),
especially as sh_update_cr3() does not short-circuit the "nothing
changed" case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago x86/shadow: replace memcmp() in sh_resync_l1()
Jan Beulich [Fri, 24 Mar 2023 10:10:41 +0000 (11:10 +0100)]
x86/shadow: replace memcmp() in sh_resync_l1()

Ordinary scalar operations are used in a multitude of other places, so
do so here as well. In fact take the opportunity and drop a local
variable then as well, first and foremost to get rid of a bogus cast.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago x86/shadow: fold/rename sh_unhook_*_mappings()
Jan Beulich [Fri, 24 Mar 2023 10:08:36 +0000 (11:08 +0100)]
x86/shadow: fold/rename sh_unhook_*_mappings()

The "32b" and "pae" functions are identical at the source level (they
differ in what they get compiled to, due to differences in
SHADOW_FOREACH_L2E()), leaving aside a comment the PAE variant has and
the non-PAE one doesn't. Replace these infixes by the more usual l<N>
ones (and then also for the "64b" one for consistency; that'll also
allow for re-use once we support 5-level paging, if need be). The two
different instances are still distinguishable by their "level" suffix.

While fiddling with the names, convert the last parameter to boolean
as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years agox86/shadow: fix and improve sh_page_has_multiple_shadows()
Jan Beulich [Fri, 24 Mar 2023 10:07:08 +0000 (11:07 +0100)]
x86/shadow: fix and improve sh_page_has_multiple_shadows()

While no caller currently invokes the function without first making sure
there is at least one shadow [1], we'd better eliminate UB here:
find_first_set_bit() requires input to be non-zero to return a well-
defined result.

Further, using find_first_set_bit() isn't very efficient in the first
place for the intended purpose.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[1] The function has exactly two uses, and both are from OOS code, which
    is HVM-only. For HVM (but not for PV) sh_mfn_is_a_page_table(),
    guarding the call to sh_unsync(), guarantees at least one shadow.
    Hence even if sh_page_has_multiple_shadows() returned a bogus value
    when invoked for a PV domain, the subsequent is_hvm_vcpu() and
    oos_active checks (the former being redundant with the latter) will
    compensate. (Arguably that oos_active check should come first, for
    both clarity and efficiency reasons.)
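
Illustrative sketch (not taken from the patch): one common way to test for
"more than one bit set" without find_first_set_bit() is to clear the lowest
set bit and check whether anything remains. The function name and the 32-bit
mask type are assumptions made for this example; the actual Xen change may
differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if more than one bit is set in mask. mask & (mask - 1) clears the
 * lowest set bit, so the result is non-zero only when at least two bits
 * were set. A zero input is handled by the first operand, so no operation
 * here has an undefined result.
 */
static bool has_multiple_bits(uint32_t mask)
{
    return mask && (mask & (mask - 1));
}
```

A zero mask and a single-bit mask both yield false; any mask with two or
more bits set yields true.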

2 years agotools/xl: rework p9 config parsing
Juergen Gross [Thu, 23 Mar 2023 08:18:26 +0000 (09:18 +0100)]
tools/xl: rework p9 config parsing

Rework the config parsing of a p9 device to use the
split_string_into_pair() function instead of open coding it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2 years agotools/xl: make split_string_into_pair() more usable
Juergen Gross [Thu, 23 Mar 2023 08:18:12 +0000 (09:18 +0100)]
tools/xl: make split_string_into_pair() more usable

Today split_string_into_pair() will not really do what its name is
suggesting: instead of splitting a string into a pair of strings using
a delimiter, it will return the first two strings of the initial string
by using the delimiter.

This is never what the callers want, so modify split_string_into_pair()
to split the string only at the first delimiter found, resulting in
something like "x=a=b" being split into "x" and "a=b" when being called
with "=" as the delimiter. Today the returned strings would be "x" and
"a".

At the same time switch the delimiter from "const char *" (allowing
multiple delimiter characters) to "char" (a single character only), as
this makes the function simpler without breaking any use cases.

Suggested-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
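
Illustrative sketch (not the xl implementation): the "split only at the first
delimiter" behaviour described above could look roughly like this. The name
split_at_first() and its signature are hypothetical and differ from the real
split_string_into_pair().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Split s at the first occurrence of delim into two newly allocated
 * strings. Returns 0 on success, -1 if delim is absent or allocation
 * fails. On success the caller frees *a and *b.
 */
static int split_at_first(const char *s, char delim, char **a, char **b)
{
    const char *p;
    size_t len;

    assert(s && a && b);

    p = strchr(s, delim);
    if (!p)
        return -1;

    len = (size_t)(p - s);
    *a = malloc(len + 1);
    *b = malloc(strlen(p + 1) + 1);
    if (!*a || !*b) {
        free(*a);
        free(*b);
        return -1;
    }

    memcpy(*a, s, len);      /* everything before the first delimiter */
    (*a)[len] = '\0';
    strcpy(*b, p + 1);       /* everything after it, later delimiters included */
    return 0;
}
```

Because only the first delimiter is consumed, any further '=' characters stay
in the second string.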
2 years agotools: use libxenlight for writing xenstore-stubdom console nodes
Juergen Gross [Thu, 23 Mar 2023 08:17:57 +0000 (09:17 +0100)]
tools: use libxenlight for writing xenstore-stubdom console nodes

Instead of duplicating libxl__device_console_add() work in
init-xenstore-domain.c, just use libxenlight.

This requires adding a small wrapper function to libxenlight, as
libxl__device_console_add() is an internal function.

This at once removes a theoretical race between starting xenconsoled
and xenstore-stubdom, as the old code wasn't using a single
transaction for writing all the entries, leading to the possibility
that xenconsoled would see only some of the entries being written.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2 years agoVT-d: fix iommu=no-igfx if the IOMMU scope contains fake device(s)
Marek Marczykowski-Górecki [Thu, 23 Mar 2023 08:16:41 +0000 (09:16 +0100)]
VT-d: fix iommu=no-igfx if the IOMMU scope contains fake device(s)

If the scope for IGD's IOMMU contains an additional device that doesn't
actually exist, iommu=no-igfx would not disable that IOMMU. In this
particular case (Thinkpad x230) it included 00:02.1, but there is no
such device on this platform. Consider only existing devices for the
"gfx only" check as well as the establishing of IGD DRHD address
(underlying is_igd_drhd(), which is used to determine applicability of
two workarounds).

Fixes: 2d7f191b392e ("VT-d: generalize and correct "iommu=no-igfx" handling")
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
2 years agotools/xl: allow split_string_into_pair() to trim values
Juergen Gross [Wed, 22 Mar 2023 09:00:09 +0000 (10:00 +0100)]
tools/xl: allow split_string_into_pair() to trim values

Most use cases of split_string_into_pair() require the returned
strings to be white space trimmed.

In order to avoid the same code pattern multiple times, add a predicate
parameter to split_string_into_pair() which can be specified to call
trim() with that predicate for the string pair returned. Specifying
NULL for the predicate will avoid the call of trim().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
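
Illustrative sketch (not the xl code): trimming driven by an optional
predicate, where NULL means "do not trim". trim_in_place() and trim_pred_t
are hypothetical names; the real trim() in xl has a different signature.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Predicate type: returns non-zero for characters that should be trimmed. */
typedef int (*trim_pred_t)(int c);

/*
 * Strip characters matching pred from both ends of s, in place.
 * A NULL predicate leaves s untouched.
 */
static void trim_in_place(char *s, trim_pred_t pred)
{
    size_t start = 0, end;

    assert(s);
    if (!pred)
        return;

    end = strlen(s);
    while (start < end && pred((unsigned char)s[start]))
        start++;
    while (end > start && pred((unsigned char)s[end - 1]))
        end--;

    memmove(s, s + start, end - start);
    s[end - start] = '\0';
}
```

Passing isspace as the predicate gives ordinary white space trimming, while
callers that need the raw strings pass NULL.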
2 years agomove {,vcpu_}show_execution_state() declarations to common header
Jan Beulich [Wed, 22 Mar 2023 08:58:25 +0000 (09:58 +0100)]
move {,vcpu_}show_execution_state() declarations to common header

These are used from common code, so their signatures should be
consistent across architectures. This is achieved / guaranteed easiest
when their declarations are in a common header.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
2 years agotools: rename xen-tools/libs.h file to common-macros.h
Juergen Gross [Wed, 22 Mar 2023 08:57:19 +0000 (09:57 +0100)]
tools: rename xen-tools/libs.h file to common-macros.h

In order to better reflect the contents of the header and to make it
more appropriate to use it for different runtime environments like
programs, libraries, and firmware, rename the libs.h include file to
common-macros.h. Additionally add a comment pointing out the need to be
self-contained.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> # tools/python/xen/lowlevel/xc/xc.c
Acked-by: Christian Lindig <christian.lindig@cloud.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 years agox86/spec-ctrl: Defer CR4_PV32_RESTORE on the cstar_enter path
Andrew Cooper [Fri, 10 Feb 2023 21:11:14 +0000 (21:11 +0000)]
x86/spec-ctrl: Defer CR4_PV32_RESTORE on the cstar_enter path

As stated (correctly) by the comment next to SPEC_CTRL_ENTRY_FROM_PV, between
the two hunks visible in the patch, RET's are not safe prior to this point.

CR4_PV32_RESTORE hides a CALL/RET pair in certain configurations (PV32
compiled in, SMEP or SMAP active), and the RET can be attacked with one of
several known speculative issues.

Furthermore, CR4_PV32_RESTORE also hides a reference to the cr4_pv32_mask
global variable, which is not safe when XPTI is active before restoring Xen's
full pagetables.

This crash has gone unnoticed because it is only AMD CPUs which permit the
SYSCALL instruction in compatibility mode, and these are not vulnerable to
Meltdown so don't activate XPTI by default.

This is XSA-429 / CVE-2022-42331

Fixes: 5e7962901131 ("x86/entry: Organise the use of MSR_SPEC_CTRL at each entry/exit point")
Fixes: 5784de3e2067 ("x86: Meltdown band-aid against malicious 64-bit PV guests")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years agox86/HVM: serialize pinned cache attribute list manipulation
Jan Beulich [Tue, 21 Mar 2023 12:01:01 +0000 (12:01 +0000)]
x86/HVM: serialize pinned cache attribute list manipulation

While the RCU variants of list insertion and removal allow lockless list
traversal (with RCU just read-locked), insertions and removals still
need serializing amongst themselves. To keep things simple, use the
domain lock for this purpose.

This is CVE-2022-42334 / part of XSA-428.

Fixes: 642123c5123f ("x86/hvm: provide XEN_DMOP_pin_memory_cacheattr")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
2 years agox86/HVM: bound number of pinned cache attribute regions
Jan Beulich [Tue, 21 Mar 2023 12:01:01 +0000 (12:01 +0000)]
x86/HVM: bound number of pinned cache attribute regions

This is exposed via DMOP, i.e. to potentially not fully privileged
device models. With that we may not permit registration of an (almost)
unbounded amount of such regions.

This is CVE-2022-42333 / part of XSA-428.

Fixes: 642123c5123f ("x86/hvm: provide XEN_DMOP_pin_memory_cacheattr")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years agox86/shadow: account for log-dirty mode when pre-allocating
Jan Beulich [Tue, 21 Mar 2023 11:58:50 +0000 (11:58 +0000)]
x86/shadow: account for log-dirty mode when pre-allocating

Pre-allocation is intended to ensure that in the course of constructing
or updating shadows there won't be any risk that just made shadows or
shadows being acted upon disappear under our feet. The amount of
pages pre-allocated then, however, needs to account for all possible
subsequent allocations. While the use in sh_page_fault() accounts for
all shadows which may need making, so far it didn't account for
allocations coming from log-dirty tracking (which piggybacks onto the
P2M allocation functions).

Since shadow_prealloc() takes a count of shadows (or other data
structures) rather than a count of pages, putting the adjustment at the
call site of this function won't work very well: We simply can't express
the correct count that way in all cases. Instead take care of this in
the function itself, by "snooping" for L1 type requests. (While not
applicable right now, future new request sites of L1 tables would then
also be covered right away.)

It is relevant to note here that pre-allocations like the one done from
shadow_alloc_p2m_page() are benign when they fall in the "scope" of an
earlier pre-alloc which already included that count: The inner call will
simply find enough pages available then; it'll bail right away.

This is CVE-2022-42332 / XSA-427.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
2 years agox86/vmx: Provide named fields for IO exit qualification
Andrew Cooper [Thu, 16 Mar 2023 17:53:56 +0000 (17:53 +0000)]
x86/vmx: Provide named fields for IO exit qualification

This removes most of the opencoded bit logic on the exit qualification.
Unfortunately, size is 1-based not 0-based, so needs adjusting in a separate
variable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
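
Illustrative sketch (not the Xen definition): named fields for the I/O
instruction exit qualification, following the bit layout documented in the
Intel SDM. The type and field names are assumptions for this example, and
the bitfield ordering relies on GCC/Clang behaviour on little-endian targets.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Low 32 bits of the I/O instruction exit qualification. Bitfield
 * ordering is implementation-defined; this assumes LSB-first allocation.
 */
typedef union {
    uint32_t raw;
    struct {
        unsigned int size:3;   /* access size minus one: 0, 1 or 3 */
        unsigned int in:1;     /* 0 = OUT, 1 = IN */
        unsigned int str:1;    /* string instruction (INS/OUTS) */
        unsigned int rep:1;    /* REP prefixed */
        unsigned int imm:1;    /* 0 = port in DX, 1 = immediate operand */
        unsigned int :9;       /* reserved */
        unsigned int port:16;  /* I/O port number */
    } f;
} io_qual_t;

/* The size field needs a +1 adjustment to give the access width in bytes. */
static unsigned int io_access_bytes(io_qual_t q)
{
    return q.f.size + 1;
}
```

The separate helper mirrors the adjustment mentioned above: the raw field
cannot be used directly as a byte count.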
2 years agoAMD/IOMMU: without XT, x2APIC needs to be forced into physical mode
Jan Beulich [Tue, 21 Mar 2023 08:23:25 +0000 (09:23 +0100)]
AMD/IOMMU: without XT, x2APIC needs to be forced into physical mode

An earlier change with the same title (commit 1ba66a870eba) altered only
the path where x2apic_phys was already set to false (perhaps from the
command line). The same of course needs applying when the variable
wasn't modified yet from its initial value.

Reported-by: Elliott Mitchell <ehem+xen@m5p.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years agoautomation: arm64: Create test jobs for testing static shared memory on qemu
Jiamei Xie [Thu, 16 Mar 2023 09:12:24 +0000 (09:12 +0000)]
automation: arm64: Create test jobs for testing static shared memory on qemu

Create 2 new test jobs, called qemu-smoke-dom0less-arm64-gcc-static-shared-mem
and qemu-smoke-dom0less-arm64-gcc-debug-static-shared-mem.

Adjust qemu-smoke-dom0less-arm64.sh script to accommodate the static
shared memory test as a new test variant. The test variant is determined
based on the first argument passed to the script. For testing static
shared memory, the argument is 'static-shared-mem'.

The test configures two dom0less DOMUs with a static shared memory
region and adds a check in the init script.

The check consists in comparing the contents of the /proc/device-tree/reserved-memory
xen-shmem entry with the static shared memory range and id with which
DOMUs were configured. If the memory layout is correct, a message gets
printed by DOMU.

At the end of the qemu run, the script searches for the specific message
in the logs and fails if not found.

Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 years agoautomation: arm64: Create test jobs for testing static heap on qemu
Jiamei Xie [Thu, 16 Mar 2023 09:12:23 +0000 (09:12 +0000)]
automation: arm64: Create test jobs for testing static heap on qemu

Create 2 new test jobs, called qemu-smoke-dom0less-arm64-gcc-staticheap
and qemu-smoke-dom0less-arm64-gcc-debug-staticheap.

Add property "xen,static-heap" under /chosen node to enable static-heap.
If the domU can start successfully with static-heap enabled, then this
test passes.

ImageBuilder sets the kernel and ramdisk range based on the file size.
It will use the memory range from 0x45600000 to 0x47AED1E8. It uses
MEMORY_START and MEMORY_END from the cfg file as a range in which it can
instruct u-boot where to place the images.

Change MEMORY_END to 0x50000000 for all test cases.

Signed-off-by: Jiamei Xie <jiamei.xie@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 years agoxen/console: skip switching serial input to non existing domains
Michal Orzel [Mon, 20 Mar 2023 16:12:51 +0000 (17:12 +0100)]
xen/console: skip switching serial input to non existing domains

At the moment, we direct serial input to hardware domain by default.
This does not make any sense when running in true dom0less mode, since
such domain does not exist. As a result, users wishing to write to
an emulated UART of a domU are always forced to execute CTRL-AAA first.
The same issue arises when rotating among serial inputs, where we always
have to go through hardware domain case. This problem can be elaborated
further to all the domains that no longer exist.

Modify switch_serial_input() so that we skip switching serial input to
non existing domains. Take the opportunity to define and make use of
macro max_console_rx to make it clear what 'max_init_domid + 1' means
in the console code context. Also, modify call to printk() to use correct
format specifier for unsigned int.

For now, to minimize the required changes and to match the current
behavior with hwdom, the default input goes to the first real domain.
The choice is more or less arbitrary since dom0less domUs are supposedly
equal. This will be handled in the future by adding support in boot time
configuration for marking a specific domain preferred in terms of
directing serial input to.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
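
Illustrative sketch (not the Xen code): rotating the serial input target
while skipping domains that do not exist. It assumes the convention that
value 0 means Xen itself and value N > 0 means domain N - 1. MAX_INIT_DOMID
and the domain_exists[] table are hypothetical stand-ins for the real domain
lookup.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_INIT_DOMID 3                      /* hypothetical value */
#define max_console_rx (MAX_INIT_DOMID + 1)   /* highest valid input target */

/* Hypothetical stand-in for a domain lookup: only domains 1 and 3 exist. */
static const bool domain_exists[MAX_INIT_DOMID + 1] = {
    false, true, false, true,
};

/* Return the next input target after cur, skipping non-existing domains. */
static unsigned int next_console_rx(unsigned int cur)
{
    unsigned int next = cur;

    assert(cur <= max_console_rx);

    for ( ; ; )
    {
        next = (next >= max_console_rx) ? 0 : next + 1;
        /* Xen (0) is always a valid target, so the loop terminates. */
        if ( next == 0 || domain_exists[next - 1] )
            return next;
    }
}
```

With the table above the rotation visits Xen, domain 1 and domain 3 only,
without stopping at the missing domains 0 and 2.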
2 years agolibacpi: fix PCI hotplug AML
David Woodhouse [Mon, 20 Mar 2023 16:12:34 +0000 (17:12 +0100)]
libacpi: fix PCI hotplug AML

The emulated PIIX3 uses a nybble for the status of each PCI function,
so the status for e.g. slot 0 functions 0 and 1 respectively can be
read as (\_GPE.PH00 & 0x0F), and (\_GPE.PH00 >> 0x04).

The AML that Xen gives to a guest gets the operand order for the odd-
numbered functions the wrong way round, returning (0x04 >> \_GPE.PH00)
instead.

As far as I can tell, this was the wrong way round in Xen from the
moment that PCI hotplug was first introduced in commit 83d82e6f35a8:

+                    ShiftRight (0x4, \_GPE.PH00, Local1)
+                    Return (Local1) /* IN status as the _STA */

Or maybe there's bizarre AML operand ordering going on there, like
Intel's wrong-way-round assembler, and it only broke later when it was
changed to being generated?

Either way, it's definitely wrong now, and instrumenting a Linux guest
shows that it correctly sees _STA being 0x00 in function 0 of an empty
slot, but then the loop in acpiphp_glue.c::get_slot_status() goes on to
look at function 1 and sees that _STA evaluates to 0x04. Thus reporting
an adapter is present in every slot in /sys/bus/pci/slots/*

Quite why Linux wants to look for function 1 being physically present
when function 0 isn't... I don't want to think about right now.

Fixes: 83d82e6f35a8 ("hvmloader: pass-through: multi-function PCI hot-plug")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
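
Illustrative sketch (C rather than AML, not part of the patch): the effect of
the swapped operands on an empty slot. The function names are made up for
this example; ph stands for the value read from \_GPE.PH00.

```c
#include <assert.h>
#include <stdint.h>

/* Status of the even-numbered function: low nybble. */
static uint8_t sta_even(uint8_t ph)
{
    return ph & 0x0F;
}

/* Status of the odd-numbered function: high nybble. */
static uint8_t sta_odd(uint8_t ph)
{
    return ph >> 4;
}

/*
 * The swapped operand order: shifts the constant 0x04 by the register
 * value. The guard only avoids an over-wide shift in C; 0x04 shifted
 * right by three or more is zero anyway.
 */
static uint8_t sta_odd_swapped(uint8_t ph)
{
    return ph < 3 ? (uint8_t)(0x04 >> ph) : 0;
}
```

For an empty slot (ph == 0) the correct expressions both give 0x00, whereas
the swapped form gives 0x04, matching the bogus _STA value seen by the Linux
guest for function 1.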
2 years agoxen/riscv: initialize .bss section
Oleksii Kurochko [Mon, 20 Mar 2023 16:12:04 +0000 (17:12 +0100)]
xen/riscv: initialize .bss section

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Bobby Eshleman <bobbyeshleman@gmail.com>
2 years agoxen/riscv: read/save hart_id and dtb_base passed by bootloader
Oleksii Kurochko [Mon, 20 Mar 2023 16:11:13 +0000 (17:11 +0100)]
xen/riscv: read/save hart_id and dtb_base passed by bootloader

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Bobby Eshleman <bobbyeshleman@gmail.com>
2 years agoxen/riscv: disable fpu
Oleksii Kurochko [Mon, 20 Mar 2023 16:10:34 +0000 (17:10 +0100)]
xen/riscv: disable fpu

Disable FPU to detect illegal usage of floating point in kernel
space.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Bobby Eshleman <bobbyeshleman@gmail.com>
2 years agoautomation: Drop sles11sp4 dockerfile
Michal Orzel [Fri, 3 Mar 2023 12:53:46 +0000 (13:53 +0100)]
automation: Drop sles11sp4 dockerfile

It has reached EOL and there are no jobs using it on any branch.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2 years agotools: Use -s for python shebangs
Andrew Cooper [Tue, 14 Mar 2023 10:53:51 +0000 (10:53 +0000)]
tools: Use -s for python shebangs

This is mandated by the Fedora packaging guidelines because it is a security
vulnerability otherwise in suid scripts.  While Xen doesn't have suid scripts,
it's a very good idea generally because it prevents the user's local python
environment from interfering with system packaged scripts.

pygrub is the odd-script-out, being installed by distutils rather than
manually with INSTALL_PYTHON_PROG.  distutils has no nice way of editing the
shebang, so arrange to use INSTALL_PYTHON_PROG for pygrub too.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 years agotools/python: Drop shebangs from library files
Andrew Cooper [Tue, 14 Mar 2023 11:32:11 +0000 (11:32 +0000)]
tools/python: Drop shebangs from library files

These aren't runnable scripts, so shouldn't have shebangs.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 years agotools/python: Improve unit test handling
Andrew Cooper [Tue, 14 Mar 2023 10:59:25 +0000 (10:59 +0000)]
tools/python: Improve unit test handling

 * Add X86_{CPUID,MSR}_POLICY_FORMAT checks which were missed previously.
 * Drop test_suite().  It hasn't been necessary since the Py2.3 era.
 * Drop the __main__ logic.  This can't be used without manually adjusting the
   include path, and `make test` knows how to do the right thing.
 * For `make test`, use `-v` to see which tests have been discovered and run.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 years agotools/pygrub: Factor out common setup.py parts
Andrew Cooper [Tue, 14 Mar 2023 11:24:22 +0000 (11:24 +0000)]
tools/pygrub: Factor out common setup.py parts

... to mirror the tools/python side in c/s 2b8314a3c354.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2 years agotools: Delete trailing whitespace in python scripts
Andrew Cooper [Tue, 14 Mar 2023 13:17:19 +0000 (13:17 +0000)]
tools: Delete trailing whitespace in python scripts

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 years agotools/misc: Drop xencons
Andrew Cooper [Tue, 14 Mar 2023 13:31:32 +0000 (13:31 +0000)]
tools/misc: Drop xencons

This script is not python3 compatible, but has its shebang altered to say
python3 by INSTALL_PYTHON_PROG.

The most recent reference I can find to this script (which isn't incidental
adjustments in the makefile) is from the Xen book, fileish 561e30b80402 which
says

  %% <snip>  Alternatively, if the
  %% Xen machine is connected to a serial-port server then we supply a
  %% dumb TCP terminal client, {\tt xencons}.

So this is a not-invented-here version of telnet.  Delete it.

Resolves: xen-project/xen#159
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 years agotools/python: Drop pylintrc
Andrew Cooper [Tue, 14 Mar 2023 13:18:41 +0000 (13:18 +0000)]
tools/python: Drop pylintrc

This was added in 2004 in c/s b7d4a69f0ccb5 and has never been referenced
since.  Given the commit message of simply "Added .", it was quite
possibly a mistake in the first place.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
2 years agox86/svm: Provide EXITINFO decodes for IO intercepts
Andrew Cooper [Wed, 15 Mar 2023 19:52:25 +0000 (19:52 +0000)]
x86/svm: Provide EXITINFO decodes for IO intercepts

This removes raw number manipulation, and makes the logic easier to follow.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2 years agoCHANGELOG: mention xl/libxl SMBIOS support
Jason Andryuk [Thu, 16 Mar 2023 13:50:08 +0000 (14:50 +0100)]
CHANGELOG: mention xl/libxl SMBIOS support

Add an entry for the new xl/libxl SMBIOS support.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: Henry Wang <Henry.Wang@arm.com>
2 years agox86/shadow: drop zero initialization from shadow_domain_init()
Jan Beulich [Thu, 16 Mar 2023 13:49:20 +0000 (14:49 +0100)]
x86/shadow: drop zero initialization from shadow_domain_init()

There's no need for this as struct domain starts out zero-filled.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years agox86/paging: move and conditionalize flush_tlb() hook
Jan Beulich [Thu, 16 Mar 2023 13:48:23 +0000 (14:48 +0100)]
x86/paging: move and conditionalize flush_tlb() hook

The hook isn't mode dependent, hence it's misplaced in struct
paging_mode. (Or alternatively I see no reason why the alloc_page() and
free_page() hooks don't also live there.) Move it to struct
paging_domain.

The hook also is used for HVM guests only, so make respective pieces
conditional upon CONFIG_HVM.

While there also add __must_check to the hook declaration, as it's
imperative that callers deal with getting back "false".

While moving the shadow implementation, introduce a "curr" local
variable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years agox86/paging: move update_paging_modes() hook
Jan Beulich [Thu, 16 Mar 2023 13:46:31 +0000 (14:46 +0100)]
x86/paging: move update_paging_modes() hook

The hook isn't mode dependent, hence it's misplaced in struct
paging_mode. (Or alternatively I see no reason why the alloc_page() and
free_page() hooks don't also live there.) Move it to struct
paging_domain.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
2 years agox86/paging: drop set-allocation from final-teardown
Jan Beulich [Thu, 16 Mar 2023 13:43:31 +0000 (14:43 +0100)]
x86/paging: drop set-allocation from final-teardown

The fixes for XSA-410 have arranged for P2M pages being freed by P2M
code to be properly freed directly, rather than being put back on the
paging pool list. Therefore whatever p2m_teardown() may return will no
longer need taking care of here. Drop the code, leaving the assertions
in place and adding "total" back to the PAGING_PRINTK() message.

With merely the (optional) log message and the assertions left, there's
really no point anymore to hold the paging lock there, so drop that too.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
2 years agox86/paging: fold most HAP and shadow final teardown
Jan Beulich [Thu, 16 Mar 2023 13:42:04 +0000 (14:42 +0100)]
x86/paging: fold most HAP and shadow final teardown

HAP does a few things beyond what's common, which are left there at
least for now. Common operations, however, are moved to
paging_final_teardown(), allowing shadow_final_teardown() to go away.

While moving (and hence generalizing) the respective SHADOW_PRINTK()
drop the logging of total_pages from the 2nd instance - the value is
necessarily zero after {hap,shadow}_set_allocation() - and shorten the
messages, in part accounting for PAGING_PRINTK() logging __func__
already.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
2 years agox86: don't include processor.h from system.h
Jan Beulich [Thu, 16 Mar 2023 12:23:14 +0000 (13:23 +0100)]
x86: don't include processor.h from system.h

processor.h in particular pulls in xen/smp.h, which is overly heavy for
a supposedly pretty fundamental header like system.h. To keep things
building, move the declarations of struct cpuinfo_x86 and boot_cpu_data
to asm/cpufeature.h (which arguably also is where they belong). In the
course of the move switch away from using fixed-width types and convert
plain "int" to "unsigned int" for the two x86_cache_* fields.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years agoconsole: use more appropriate domain RCU-locking function
Jan Beulich [Thu, 16 Mar 2023 12:21:50 +0000 (13:21 +0100)]
console: use more appropriate domain RCU-locking function

While both 19afff14b4cb ("xen: support console_switching between Dom0
and DomUs on ARM") and 1ee1e4b0d1ff ("xen/arm: Allow vpl011 to be used
by DomU") were part of the same series (iirc), the latter correctly used
rcu_lock_domain_by_id() in console_input_domain(), whereas the former
for some reason used rcu_lock_domain_by_any_id() instead, despite that
code only kind of open-coding console_input_domain(). There's no point
here to deal with DOMID_SELF, which is the sole difference between the
two functions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years agoxen/grants: repurpose command line max options
Roger Pau Monne [Tue, 14 Mar 2023 14:45:53 +0000 (15:45 +0100)]
xen/grants: repurpose command line max options

Slightly change the meaning of the command line
gnttab_max_{maptrack_,}frames: do not use them as upper bounds for the
passed values at domain creation, instead just use them as defaults
in the absence of any provided value.

It's not very useful for the options to be used both as defaults and
as capping values for domain creation inputs.  The defaults passed on
the command line are used by dom0, which has very different grant
requirements than a regular domU.  dom0 usually needs a bigger
maptrack array, while domU usually requires a bigger number of grant
frames.

The relaxation in the logic for the maximum grant and maptrack table
sizes doesn't change the fact that the domain creation hypercall can
cause resource exhaustion, so disaggregated setups should take it into
account.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
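
Illustrative sketch (not the Xen code): the difference between treating the
command line value as an upper bound and treating it purely as a default.
Both helpers are hypothetical, as is the use of a negative request to model
"no value provided at domain creation".

```c
#include <assert.h>
#include <stdbool.h>

/* Earlier meaning: a provided value had to stay within the command line value. */
static bool within_old_bound(int requested, unsigned int cmdline)
{
    return requested < 0 || (unsigned int)requested <= cmdline;
}

/* New meaning: the command line value is only a default. */
static unsigned int effective_frames(int requested, unsigned int cmdline)
{
    return requested < 0 ? cmdline : (unsigned int)requested;
}
```

Under the new meaning a domain creation request for more frames than the
command line value is honoured, which is why resource limits need to be
considered by whoever issues the hypercall.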
2 years agolibxl: Fix libxl__device_pci_reset error messages
Jason Andryuk [Mon, 13 Mar 2023 19:57:55 +0000 (15:57 -0400)]
libxl: Fix libxl__device_pci_reset error messages

Don't use the LOG*D macros.  They expect a domid, but "domain" here is
the PCI domain.  Hence it is inappropriate for this use.

Make the write error messages uniform with LOGE.  errno has the
interesting information while rc is just -1.  Drop printing rc and use
LOGE to print errno as text.

The interesting part of a failed write to do_flr is the PCI BDF, so
print that.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
2 years agoxl/libxl: Add OEM string support to smbios
Jason Andryuk [Mon, 6 Mar 2023 20:40:24 +0000 (15:40 -0500)]
xl/libxl: Add OEM string support to smbios

Add support for OEM strings in the SMBIOS type 11.

hvmloader checks them sequentially, so hide the implementation detail.
Allow multiple plain oem= items and assign the numeric values
internally.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2 years agoxl/libxl: Add ability to specify SMBIOS strings
Jason Andryuk [Mon, 6 Mar 2023 20:40:23 +0000 (15:40 -0500)]
xl/libxl: Add ability to specify SMBIOS strings

hvm_xs_strings.h specifies xenstore entries which can be used to set or
override smbios strings.  hvmloader has support for reading them, but
xl/libxl support is not wired up.

Allow specifying the strings with the new xl.cfg option:
smbios=["bios_vendor=Xen Project","system_version=1.0"]

In terms of strings, the SMBIOS specification 3.5 says:
https://www.dmtf.org/sites/default/files/standards/documents/DSP0134_3.5.0.pdf
"""
Strings must be encoded as UTF-8 with no byte order mark (BOM). For
compatibility with older SMBIOS parsers, US-ASCII characters should be
used.  NOTE There is no limit on the length of each individual text
string. However, the length of the entire structure table (including all
strings) must be reported in the Structure Table Length field of the
32-bit Structure Table Entry Point (see 5.2.1) and/or the Structure
Table Maximum Size field of the 64-bit Structure Table Entry Point (see
5.2.2).
"""

The strings aren't checked for utf-8 or length.  hvmloader has a sanity
check on the overall length.

The libxl_smbios_type enum starts at 1 since otherwise the 0th key is
not printed in the json output.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
2 years agogolang/xenlight: Extend KeyedUnion to support Arrays
Jason Andryuk [Mon, 6 Mar 2023 20:40:22 +0000 (15:40 -0500)]
golang/xenlight: Extend KeyedUnion to support Arrays

Generation for KeyedUnion types doesn't support Arrays.  The smbios
support will place an smbios array inside the hvm KeyedUnion, and
gentotypes doesn't generate buildable Go code.

Have KeyedUnion add an idl.Array check and issue the appropriate
xenlight_golang_array_to_C and xenlight_golang_array_from_C calls when
needed.  This matches how it is done in xenlight_golang_define_to_C &
xenlight_golang_define_from_C

xenlight_golang_array_to_C and xenlight_golang_array_from_C need to be
extended to set the cvarname and govarname as appropriate for the
KeyedUnion cases to match the surrounding code.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
2 years agoarch/arm: time: Add support for parsing interrupts by names
Andrei Cherechesu [Mon, 13 Mar 2023 13:08:03 +0000 (15:08 +0200)]
arch/arm: time: Add support for parsing interrupts by names

Added support for parsing the ARM generic timer interrupts DT
node by the "interrupt-names" property, if it is available.

If not available, the usual parsing based on the expected
IRQ order is performed.

Also treated a return value of 0 from the platform_get_irq() calls as
an error, since 0 is not a valid PPI ID and accepting it would only
cause Xen to BUG() later, when trying to reserve a vIRQ that is an
SGI.

Added the "hyp-virt" PPI to the timer PPI list, even
though it's currently not in use. If the "hyp-virt" PPI is
not found, the hypervisor won't panic.
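
As a rough illustration of the lookup order described above, the
following self-contained C sketch resolves a timer IRQ by name when
"interrupt-names" is present, falls back to the expected position
otherwise, and rejects 0 in both cases. struct timer_node and
timer_irq_lookup() are hypothetical stand-ins invented for this
illustration; they are not Xen's struct dt_device_node or
platform_get_irq_byname().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a DT timer node (not Xen's dt_device_node). */
struct timer_node {
    const char *const *names;   /* "interrupt-names" entries, NULL if absent */
    const unsigned int *irqs;   /* "interrupts" entries, in DT order */
    size_t nr;                  /* number of entries */
};

/*
 * Resolve an IRQ by name if names are available, else by expected position.
 * Returns the IRQ number, or -1 on failure; 0 is treated as a failure.
 */
int timer_irq_lookup(const struct timer_node *n, const char *name,
                     size_t fallback_idx)
{
    size_t idx = fallback_idx;

    assert(n && name);

    if ( n->names )
    {
        /* Named lookup: the position in the DT no longer matters. */
        for ( idx = 0; idx < n->nr; idx++ )
            if ( !strcmp(n->names[idx], name) )
                break;
    }

    /* Not found, out of range, or IRQ 0: all reported as errors. */
    if ( idx >= n->nr || !n->irqs[idx] )
        return -1;

    return (int)n->irqs[idx];
}
```

A caller in this sketch would treat -1 for an optional entry such as
"hyp-virt" as "not present" rather than as a fatal condition.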

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Michal Orzel <michal.orzel@amd.com>
2 years ago arch/arm: irq: Add platform_get_irq_byname() implementation
Andrei Cherechesu [Mon, 13 Mar 2023 13:08:02 +0000 (15:08 +0200)]
arch/arm: irq: Add platform_get_irq_byname() implementation

Moved implementation for the function which parses the IRQs of a DT
node by the "interrupt-names" property from the SMMU-v3 driver
to the IRQ core code and made it non-static to be used as helper.

Also changed it to receive a "struct dt_device_node*" as parameter,
like its counterpart, platform_get_irq(). Updated its usage inside
the SMMU-v3 driver accordingly.

Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
2 years ago flask/label-pci: Allow specifying optional irq label
Jason Andryuk [Tue, 14 Mar 2023 09:46:00 +0000 (10:46 +0100)]
flask/label-pci: Allow specifying optional irq label

IRQs can be shared between devices, so using the same label as the PCI
device can create conflicts: the IRQ ends up labeled with one of the
device labels, preventing assignment of the second device to the second
domain.  Add the ability to specify an irq label distinct from the PCI
device, so a shared irq label can be specified.  The policy would then
be written such that the two domains can each use the shared IRQ type in
addition to their labeled PCI device.  That way we can still label most
of the PCI device resources and assign devices in the face of shared
IRQs.

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
2 years ago bunzip: work around gcc13 warning
Jan Beulich [Tue, 14 Mar 2023 09:45:28 +0000 (10:45 +0100)]
bunzip: work around gcc13 warning

While it is provable that length[0] is always initialized (because symCount
cannot be zero), upcoming gcc13 fails to recognize this and warns about
the unconditional use of the value immediately following the loop.

See also https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106511.

Reported-by: Martin Liška <martin.liska@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago build: run targets cscope,tags,... using tree-wide approach
Michal Orzel [Tue, 14 Mar 2023 09:44:47 +0000 (10:44 +0100)]
build: run targets cscope,tags,... using tree-wide approach

Although it is a matter of taste, there are in general two main approaches
to code tagging: tree-wide, where all the sources are taken into account,
and config-wide, where only the Kconfig options and the files actually
built are considered. At the moment, the all_sources variable is defined
using SUBDIRS, which lists all the directories except arch/, where only
$(TARGET_ARCH) is taken into account. This makes it difficult to reason
about and creates fuzzy boundaries, which is a blocker when considering new
directories that might be config-dependent (like crypto/, which is missing
from SUBDIRS).

For now, switch to the intermediate solution of listing all the directories
in SUBDIRS without exceptions (also including crypto/). This way, the
approach taken is clear, allowing new directories to be listed right away
without waiting to fix the infrastructure first. In the future, we can
then add support for the config-wide approach.

Signed-off-by: Michal Orzel <michal.orzel@amd.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
2 years ago VT-d: constrain IGD check
Jan Beulich [Tue, 14 Mar 2023 09:44:08 +0000 (10:44 +0100)]
VT-d: constrain IGD check

Marking a DRHD as controlling an IGD isn't very sensible without
checking that at the very least it's a graphics device that lives at
0000:00:02.0. Re-use the reading of the class-code to control both the
clearing of "gfx_only" and the setting of "igd_drhd_address".
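
The condition described above can be sketched in a few lines of C:
a device is only considered an IGD if it sits at 0000:00:02.0 and its
class code identifies a display controller. struct sbdf and
looks_like_igd() are hypothetical names used only for this
illustration, not the identifiers used in the VT-d code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_BASE_CLASS_DISPLAY 0x03  /* PCI base class: display controller */

/* Hypothetical segment/bus/device/function tuple for the sketch. */
struct sbdf {
    uint16_t seg;
    uint8_t bus, dev, fn;
};

/*
 * class_rev is the 32-bit value at config space offset 0x08:
 * bits 31:24 base class, 23:16 sub-class, 15:8 prog-if, 7:0 revision.
 */
bool looks_like_igd(struct sbdf s, uint32_t class_rev)
{
    bool at_igd_slot;

    assert(s.dev < 32 && s.fn < 8);   /* valid PCI device/function range */

    at_igd_slot = s.seg == 0 && s.bus == 0 && s.dev == 2 && s.fn == 0;

    /* Both the location and the class code have to match. */
    return at_igd_slot && (class_rev >> 24) == PCI_BASE_CLASS_DISPLAY;
}
```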

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
2 years ago x86emul/test: suppress GNU ld 2.39 warning about RWX load segments
Jan Beulich [Tue, 14 Mar 2023 09:42:51 +0000 (10:42 +0100)]
x86emul/test: suppress GNU ld 2.39 warning about RWX load segments

Commit 68f5aac012b9 ("build: suppress future GNU ld warning about RWX
load segments") didn't quite cover all the cases: I missed ones in the
building of the test code blobs. Clone the workaround to the helper
Makefile in question, kind of open-coding the hypervisor build system's
ld-option macro.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 years ago x86/altp2m: help gcc13 to avoid it emitting a warning
Jan Beulich [Mon, 13 Mar 2023 14:16:21 +0000 (15:16 +0100)]
x86/altp2m: help gcc13 to avoid it emitting a warning

Switches of altp2m-s always expect a valid altp2m to be in place (and
indeed altp2m_vcpu_initialise() sets the active one to be at index 0).
The compiler, however, cannot know that, and hence it cannot eliminate
p2m_get_altp2m()'s case of returning (literal) NULL. If then the compiler
decides to special case that code path in the caller, the dereference in
instances of

    atomic_dec(&p2m_get_altp2m(v)->active_vcpus);

can, to the code generator, appear to be NULL dereferences, leading to

In function 'atomic_dec',
    inlined from '...' at ...:
./arch/x86/include/asm/atomic.h:182:5: error: array subscript 0 is outside array bounds of 'int[0]' [-Werror=array-bounds=]

Aid the compiler by adding a BUG_ON() checking the return value of the
problematic p2m_get_altp2m(). Since, with the use of the local variable,
the second p2m_get_altp2m() in each instance will look questionable at
first glance (why is the local variable not used here?), open-code the
only relevant piece of p2m_get_altp2m() there.

To avoid repeatedly doing these transformations, and also to limit how
"bad" the open-coding really is, convert the entire operation to an
inline helper, used by all three instances (and accepting the redundant
BUG_ON(idx >= MAX_ALTP2M) in two of the three cases).
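
A minimal sketch of what such an inline helper might look like, using
simplified stand-in types: the structures, the set_altp2m() name, the
MAX_ALTP2M value and the assert()-based BUG_ON() below are assumptions
made for illustration, and plain ints replace Xen's atomic counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ALTP2M 10                 /* value assumed for the sketch */
#define BUG_ON(c) assert(!(c))        /* stand-in for Xen's BUG_ON() */

struct p2m  { int active_vcpus; };    /* simplified; Xen uses atomic_t */
struct dom  { struct p2m *altp2m[MAX_ALTP2M]; };
struct vcpu { struct dom *d; unsigned int p2midx; };

/* Switch v to altp2m view idx; returns true if the index changed. */
bool set_altp2m(struct vcpu *v, unsigned int idx)
{
    struct p2m *orig;

    BUG_ON(idx >= MAX_ALTP2M);

    if ( idx == v->p2midx )
        return false;

    /* The currently active view is expected to always be valid. */
    orig = v->d->altp2m[v->p2midx];
    BUG_ON(!orig);                    /* lets the compiler rule out NULL */
    orig->active_vcpus--;

    /* Open-coded access to the new view instead of a second getter call. */
    v->p2midx = idx;
    v->d->altp2m[idx]->active_vcpus++;

    return true;
}
```

With the check placed inside the helper, every caller gets the same
non-NULL guarantee without repeating the transformation.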

Reported-by: Charles Arnold <carnold@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago core-parking: fix build with gcc12 and NR_CPUS=1
Jan Beulich [Mon, 13 Mar 2023 14:15:42 +0000 (15:15 +0100)]
core-parking: fix build with gcc12 and NR_CPUS=1

Gcc12 takes issue with core_parking_remove()'s

    for ( ; i < cur_idle_nums; ++i )
        core_parking_cpunum[i] = core_parking_cpunum[i + 1];

complaining that the right hand side array access is past the bounds of
1. Clearly the compiler can't know that cur_idle_nums can only ever be
zero in this case (as the sole CPU cannot be parked).

Arrange for core_parking.c's contents to not be needed altogether, and
then disable its building when NR_CPUS == 1.
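
To make the bounds argument concrete, here is a small standalone C
sketch of the same shifting pattern. parked[] and parked_remove() are
hypothetical stand-ins for core_parking_cpunum[] and
core_parking_remove(), and NR_CPUS is set to 4 purely for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4   /* assumed for the sketch; the commit concerns NR_CPUS == 1 */

static unsigned int parked[NR_CPUS];   /* stand-in for core_parking_cpunum[] */
static unsigned int cur_idle_nums;     /* number of valid entries in parked[] */

/* Drop cpu from the parked list, closing the gap; true if it was present. */
bool parked_remove(unsigned int cpu)
{
    unsigned int i;

    for ( i = 0; i < cur_idle_nums; ++i )
        if ( parked[i] == cpu )
            break;

    if ( i == cur_idle_nums )
        return false;

    --cur_idle_nums;

    /*
     * The loop reads parked[i + 1] with i + 1 <= cur_idle_nums, so the
     * access is in bounds only while cur_idle_nums < NR_CPUS. That holds
     * because at least one CPU is never parked, a fact the compiler
     * cannot see on its own.
     */
    assert(cur_idle_nums < NR_CPUS);

    for ( ; i < cur_idle_nums; ++i )
        parked[i] = parked[i + 1];

    return true;
}
```

With an array of size 1 the index i + 1 can never be valid, which is
why the compiler flags the access even though the loop body is
unreachable in that configuration.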

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago x86/platform: make XENPF_get_dom0_console actually usable
Jan Beulich [Mon, 13 Mar 2023 14:14:38 +0000 (15:14 +0100)]
x86/platform: make XENPF_get_dom0_console actually usable

struct dom0_vga_console_info has been extended in the past, and it may
be extended again. The use in PV Dom0's start info already covers for
that by supplying the size of the provided data. For the recently
introduced platform-op, the size needs providing similarly. Go the easiest
available route and simply supply size via the hypercall return value.

While there, also add a build-time check that possible future growth of
the struct won't affect xen_platform_op_t's size.
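
The two ideas above (reporting the size through the return value and
guarding the container's size at build time) can be sketched as
follows. struct console_info, struct platform_op, OP_PAD_BYTES and
get_console() are hypothetical, simplified stand-ins and do not mirror
the real layout of struct dom0_vga_console_info or xen_platform_op_t.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OP_PAD_BYTES 128   /* assumed fixed payload size for the sketch */

/* Hypothetical, simplified console description that may grow over time. */
struct console_info {
    uint8_t  video_type;
    uint8_t  pad[3];
    uint32_t columns, rows;
};

/* Hypothetical hypercall argument whose size must stay fixed. */
struct platform_op {
    uint32_t cmd;
    union {
        struct console_info console;
        uint8_t pad[OP_PAD_BYTES];
    } u;
};

/* Build-time check: growth of console_info must not enlarge the union. */
_Static_assert(sizeof(struct console_info) <= OP_PAD_BYTES,
               "console_info outgrew the platform_op payload");

/* Fill in the console data; the return value is the number of valid bytes. */
int get_console(struct platform_op *op)
{
    assert(op);

    memset(&op->u, 0, sizeof(op->u));
    op->u.console.video_type = 1;
    op->u.console.columns = 80;
    op->u.console.rows = 25;

    return (int)sizeof(op->u.console);
}
```

A consumer built against an older or newer layout can then compare the
returned size with its own sizeof() and only read the fields that both
sides know about.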

Fixes: 4dd160583c79 ("x86/platform: introduce hypercall to get initial video console settings")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
2 years ago x86/pvh: report ACPI VFCT table to dom0 if present
Roger Pau Monne [Sun, 12 Mar 2023 07:54:50 +0000 (15:54 +0800)]
x86/pvh: report ACPI VFCT table to dom0 if present

The VFCT ACPI table is used by AMD GPUs to expose the vbios ROM image
from the firmware instead of doing it on the PCI ROM on the physical
device.

As such, this needs to be available for PVH dom0 to access, or else
the GPU won't work.

Reported-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-and-Tested-by: Huang Rui <ray.huang@amd.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
2 years ago x86/sysctl: Retrofit XEN_SYSCTL_cpu_featureset_{pv,hvm}_max
Andrew Cooper [Fri, 10 Mar 2023 19:37:56 +0000 (19:37 +0000)]
x86/sysctl: Retrofit XEN_SYSCTL_cpu_featureset_{pv,hvm}_max

Featuresets are supposed to be disappearing when the CPU policy infrastructure
is complete, but that has taken longer than expected, and isn't going to be
complete imminently either.

In the meantime, Xen does have proper default/max featuresets, and xen-cpuid
can even get them via the XEN_SYSCTL_cpu_policy_* interface, but only knows
how to render them nicely via the featureset interface.

Differences between default and max are a frequent source of errors,
frequently too in secret leading up to an embargo, so extend the featureset
sysctl to allow xen-cpuid to render them all nicely.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Christian Lindig <christian.lindig@cloud.com>