xenbits.xensource.com Git - xen.git/log
xen.git
3 years ago x86: limit number of hypercall parameters to 5
Juergen Gross [Fri, 3 Dec 2021 10:18:38 +0000 (11:18 +0100)]
x86: limit number of hypercall parameters to 5

Today there is no hypercall with more than 5 parameters, while the ABI
allows up to 6 parameters. Especially in the x86 32-bit case, using
6 parameters would require running without a frame pointer, which is
unfortunate. Note that for Arm the limit is 5 parameters already.

So limit the maximum number of parameters to 5 for x86, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/HVM: skip offline vCPU-s when dumping VMCBs/VMCSes
Jan Beulich [Fri, 3 Dec 2021 10:17:50 +0000 (11:17 +0100)]
x86/HVM: skip offline vCPU-s when dumping VMCBs/VMCSes

There's not really any register state associated with vCPU-s that
haven't been initialized yet, so avoid spamming the log with largely
useless information while still leaving an indication of the fact.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years ago x86/HVM: also dump stacks from show_execution_state()
Jan Beulich [Fri, 3 Dec 2021 10:15:57 +0000 (11:15 +0100)]
x86/HVM: also dump stacks from show_execution_state()

Wire up show_hvm_stack() also on this path. Move the show_guest_stack()
invocation out of show_stack(), rendering dead the is-HVM check there.

While separating guest and host paths, also move the show_code()
invocation - the function bails immediately when guest_mode() returns
"true".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years ago x86/PV: properly set shadow allocation for Dom0
Jan Beulich [Fri, 3 Dec 2021 10:14:24 +0000 (11:14 +0100)]
x86/PV: properly set shadow allocation for Dom0

Leaving shadow setup just to the L1TF tasklet means running Dom0 on a
minimally acceptable shadow memory pool, rather than what normally
would be used (also, for example, for PVH). Populate the pool before
triggering the tasklet (or in preparation for L1TF checking logic to
trigger it), on a best-effort basis (again, as is done for PVH).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
3 years ago x86/boot: Support __ro_after_init
Andrew Cooper [Mon, 29 Nov 2021 20:11:01 +0000 (20:11 +0000)]
x86/boot: Support __ro_after_init

For security hardening reasons, it is advantageous to make setup-once data
immutable after boot.  Borrow __ro_after_init from Linux.

On x86, place .data.ro_after_init at the start of .rodata, excluding it from
the early permission restrictions.  Re-apply RO restrictions to the whole of
.rodata in init_done(), attempting to reform the superpage if possible.

For architectures which don't implement __ro_after_init explicitly, variables
merge into .data.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/boot: Adjust .text/.rodata/etc permissions in one place
Andrew Cooper [Mon, 29 Nov 2021 20:04:11 +0000 (20:04 +0000)]
x86/boot: Adjust .text/.rodata/etc permissions in one place

At the moment, we have two locations selecting restricted permissions, not
very far apart on boot, dependent on opposite answers from using_2M_mapping().
The later location however can shatter superpages if needed, while the former
cannot.

Collect together all the permission adjustments at the slightly later point in
boot, as we likely need to shatter a superpage to support __ro_after_init.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/boot: Drop xen_virt_end
Andrew Cooper [Mon, 29 Nov 2021 19:01:50 +0000 (19:01 +0000)]
x86/boot: Drop xen_virt_end

The calculation in __start_xen() for xen_virt_end is an opencoding of
ROUNDUP(_end, 2M).  This is __2M_rwdata_end as provided by the linker script.

This corrects the bound calculations in arch_livepatch_init() and
update_xen_mappings() to not enforce 2M alignment when Xen is not compiled
with CONFIG_XEN_ALIGN_2M.

Furthermore, since 52975142d154 ("x86/boot: Create the l2_xenmap[] mappings
dynamically"), there have not been extraneous mappings to delete, meaning that
the call to destroy_xen_mappings() has been a no-op.
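
For illustration, the rounding that xen_virt_end open-coded can be sketched as
below.  This is a minimal standalone example: the MB() and ROUNDUP()
definitions are assumed stand-ins written for this sketch rather than copied
from the Xen headers, and round_up_2m() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for illustration; the alignment must be a power of two. */
#define MB(n)         ((uint64_t)(n) << 20)
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Hypothetical helper: round an address up to the next 2M boundary. */
static uint64_t round_up_2m(uint64_t addr)
{
    return ROUNDUP(addr, MB(2));
}
```

An address that is already 2M aligned is returned unchanged; anything else
moves up to the next boundary.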

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/boot: Fix data placement around __high_start()
Andrew Cooper [Mon, 29 Nov 2021 19:52:05 +0000 (19:52 +0000)]
x86/boot: Fix data placement around __high_start()

multiboot_ptr should be in __initdata - it is only used on the BSP path.
Furthermore, the .align 8 then .long means that stack_start is misaligned.

Move both into setup.c, which lets the compiler handle the details correctly,
as well as providing proper debug information for them.

Declare stack_start in setup.h and avoid extern-ing it locally in smpboot.c.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/boot: Better describe the pagetable relocation loops
Andrew Cooper [Mon, 29 Nov 2021 19:19:43 +0000 (19:19 +0000)]
x86/boot: Better describe the pagetable relocation loops

The first loop relocates all reachable non-leaf entries in idle_pg_table[],
which includes l2_xenmap[511]'s reference to l1_fixmap_x[].

The second loop relocates only leaf mappings in l2_xenmap[], so update the
skip condition to be opposite to the first loop.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/boot: Drop incorrect mapping at l2_xenmap[0]
Andrew Cooper [Mon, 29 Nov 2021 16:09:08 +0000 (16:09 +0000)]
x86/boot: Drop incorrect mapping at l2_xenmap[0]

It has been 4 years since the default load address changed from 1M to 2M, and
_stext ceased residing in l2_xenmap[0].  We should not be inserting an unused
mapping.

To ensure we don't create mappings accidentally, loop from 0 and obey
_PAGE_PRESENT on all entries.

Fixes: 7ed93f3a0dff ("x86: change default load address from 1 MiB to 2 MiB")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago bitops: Fix incorrect value in comment
Ayan Kumar Halder [Tue, 30 Nov 2021 18:12:38 +0000 (18:12 +0000)]
bitops: Fix incorrect value in comment

GENMASK(30, 21) should be 0x7fe00000. Fix this in the comment
in bitops.h.
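
The value can be checked with a small standalone sketch.  GENMASK32() below is
a hypothetical 32-bit stand-in written for this example, not the macro from
bitops.h; it sets bits h down to l inclusive.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical 32-bit stand-in for GENMASK(): bits h..l inclusive are set.
 * Assumes 0 <= l <= h <= 31.
 */
#define GENMASK32(h, l) \
    ((UINT32_MAX << (l)) & (UINT32_MAX >> (31 - (h))))

/* Bits 30..21 are ten consecutive bits starting at bit 21. */
static uint32_t genmask_30_21(void)
{
    return GENMASK32(30, 21);
}
```

Ten set bits (0x3ff) shifted left by 21 give 0x7fe00000, matching the
corrected comment.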

Signed-off-by: Ayan Kumar Halder <ayankuma@xilinx.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Tweak text, to put an end to any further bikeshedding]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago CHANGELOG.md: Start new "unstable" section
Ian Jackson [Wed, 1 Dec 2021 18:07:40 +0000 (18:07 +0000)]
CHANGELOG.md: Start new "unstable" section

I have just forward-ported the CHANGELOG.md updates from the
stable-4.16 branch.  But we need a new section for work in this
release cycle.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
3 years ago CHANGELOG.md: Set 4.16 version and date
Ian Jackson [Tue, 30 Nov 2021 11:40:21 +0000 (11:40 +0000)]
CHANGELOG.md: Set 4.16 version and date

Signed-off-by: Ian Jackson <iwj@xenproject.org>
(cherry picked from commit 36aa64095d0419d52d2466405ac13b9858463f48)

3 years ago CHANGELOG: add missing entries for work during the 4.16 release cycle
Roger Pau Monne [Wed, 24 Nov 2021 11:24:03 +0000 (12:24 +0100)]
CHANGELOG: add missing entries for work during the 4.16 release cycle

Document some of the relevant changes during the 4.16 release cycle.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
(cherry picked from commit e2544a28beacd854f295095d102a8773743ac917)

3 years ago arm/efi: Improve performance requesting filesystem handle
Luca Fancellu [Tue, 16 Nov 2021 15:06:24 +0000 (15:06 +0000)]
arm/efi: Improve performance requesting filesystem handle

Currently, the code used to handle and possibly load from the filesystem
modules defined in the DT is allocating and closing the filesystem handle
for each module to be loaded.

To improve the performance, the filesystem handle pointer is passed
through the call stack, requested only once (when first needed), and
closed if it was allocated.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 years ago Update libfdt to v1.6.1
Vikram Garhwal [Fri, 12 Nov 2021 07:27:20 +0000 (23:27 -0800)]
Update libfdt to v1.6.1

Update libfdt to v1.6.1, taken from git://github.com/dgibson/dtc.
This update is done to support device tree overlays.

A few minor changes are done to make it compatible with Xen:
    fdt_overlay.c: overlay_fixup_phandle()

        Replace strtoul() with simple_strtoul(), as strtoul() is not available in
        the Xen lib, and include lib.h.

        Change char *endptr to const char *endptr. This change is required for
        using simple_strtoul().

    libfdt_env.h:
        Remaining Xen changes to libfdt_env.h carried over from existing
        libfdt (v1.4.0)

Signed-off-by: Vikram Garhwal <fnu.vikram@xilinx.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Tested-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
3 years ago x86/crash: Drop manual hooking of exception_table[]
Andrew Cooper [Thu, 7 Oct 2021 13:02:10 +0000 (14:02 +0100)]
x86/crash: Drop manual hooking of exception_table[]

NMI hooking in the crash path has undergone several revisions since its
introduction.  What we have now is not sufficiently different from the regular
nmi_callback() mechanism to warrant special casing.

Use set_nmi_callback() directly, and do away with patching a read-only data
structure via a read-write alias.  This also means that the
vmx_vmexit_handler() can and should call do_nmi() directly, rather than
indirecting through the exception table to pick up the crash path hook.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/traps: Drop dummy_nmi_callback()
Andrew Cooper [Fri, 8 Oct 2021 12:11:21 +0000 (13:11 +0100)]
x86/traps: Drop dummy_nmi_callback()

The unconditional nmi_callback() call in do_nmi() calls dummy_nmi_callback()
in all cases other than for a few specific and rare tasks (alternative
patching, microcode loading, etc).

Indirect calls are expensive under retpoline, so rearrange the logic to use
NULL as the default, and skip the call entirely in the common case.

While rearranging the code, fold the exit paths.
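
The NULL-as-default pattern can be sketched as follows.  This is a simplified
standalone illustration, not the code from traps.c: the callback type takes
only a CPU number here, and nmi_claimed_by_callback() and claim_nmi() are
hypothetical names introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical callback type: nonzero means "NMI handled". */
typedef int (*nmi_callback_t)(int cpu);

/* NULL means "no callback registered", which is the common case. */
static nmi_callback_t nmi_callback;

/* Install a new callback (or NULL) and return the previous one. */
static nmi_callback_t set_nmi_callback(nmi_callback_t cb)
{
    nmi_callback_t old = nmi_callback;

    nmi_callback = cb;
    return old;
}

/* Returns 1 if a registered callback claimed the NMI, 0 otherwise. */
static int nmi_claimed_by_callback(int cpu)
{
    nmi_callback_t cb = nmi_callback;

    /* Only pay for the indirect call when something is registered. */
    return cb && cb(cpu);
}

/* Example callback used to exercise the sketch. */
static int claim_nmi(int cpu)
{
    (void)cpu;
    return 1;
}
```

With nothing registered the handler takes a plain conditional branch and never
makes an indirect call, which is the case the commit optimises for.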

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/traps: Collect PERFC_exceptions stats for IST vectors too
Andrew Cooper [Fri, 8 Oct 2021 09:47:07 +0000 (10:47 +0100)]
x86/traps: Collect PERFC_exceptions stats for IST vectors too

This causes NMIs, #DB and #MC to be counted, rather than being reported as 0.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/dom0: Fix command line parsing issues with dom0_nodes=
Andrew Cooper [Fri, 19 Nov 2021 13:16:12 +0000 (13:16 +0000)]
x86/dom0: Fix command line parsing issues with dom0_nodes=

This is a simple comma separated list, so use the normal form.

 * Don't cease processing subsequent elements on an error
 * Do report -EINVAL for things like `dom0_nodes=4foo`
 * Don't opencode the cmdline_strcmp() helper
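
The first two points can be illustrated with a standalone sketch of parsing a
comma separated list of node numbers.  It uses the standard strtoul() rather
than Xen's own helpers, handles numeric elements only, and parse_node_list()
and MAX_DEMO_NODES are hypothetical names for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEMO_NODES 64 /* hypothetical limit, one bit per node */

/*
 * Parse "a,b,c" into a bitmask.  An invalid element sets the return value to
 * -EINVAL but does not stop the remaining elements from being processed.
 */
static int parse_node_list(const char *s, unsigned long long *mask)
{
    const char *ss;
    int rc = 0;

    assert(s && mask);

    do {
        char *end;
        unsigned long node;

        /* Find the end of this element: the next comma or the terminator. */
        ss = strchr(s, ',');
        if (!ss)
            ss = s + strlen(s);

        node = strtoul(s, &end, 0);

        /* Reject empty elements, trailing junk ("4foo") and large values. */
        if (end == s || end != ss || node >= MAX_DEMO_NODES)
            rc = -EINVAL;
        else
            *mask |= 1ULL << node;

        s = ss + 1;
    } while (*ss);

    return rc;
}
```

For example, "1,x,3" reports -EINVAL for the middle element yet still records
nodes 1 and 3.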

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/hvm: Remove callback from paging->flush_tlb() hook
Andrew Cooper [Wed, 17 Nov 2021 17:45:21 +0000 (17:45 +0000)]
x86/hvm: Remove callback from paging->flush_tlb() hook

TLB flushing is a hotpath, and function pointer calls are
expensive (especially under retpoline) for what amounts to an identity
transform on the data.  Just pass the vcpu_bitmap bitmap directly.

As we use NULL for all rather than none, introduce a flush_vcpu() helper to
avoid the risk of logical errors from opencoding the expression.  This also
means the viridian callers can avoid writing an all-ones bitmap for the
flushing logic to consume.
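
A helper of this kind might look like the sketch below.  It is a standalone
approximation: the signature takes a plain vCPU ID instead of a struct vcpu
pointer, and the bit test is written out by hand so the example has no
dependencies.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * A NULL bitmap means "flush every vCPU"; otherwise flush only those whose
 * bit is set.  Keeping the rule in one helper stops callers from getting the
 * "NULL means all, not none" convention backwards.
 */
static bool flush_vcpu(const unsigned long *vcpu_bitmap, unsigned int vcpu_id)
{
    return !vcpu_bitmap ||
           ((vcpu_bitmap[vcpu_id / DEMO_BITS_PER_LONG] >>
             (vcpu_id % DEMO_BITS_PER_LONG)) & 1);
}
```

A caller wanting to flush everything passes NULL and never has to build an
all-ones bitmap.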

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
3 years ago x86/IO-APIC: Drop function pointers from __ioapic_{read,write}_entry()
Andrew Cooper [Sat, 30 Oct 2021 23:03:56 +0000 (00:03 +0100)]
x86/IO-APIC: Drop function pointers from __ioapic_{read,write}_entry()

Function pointers are expensive, and the raw parameter is a constant at the
root of all call trees, meaning that it predicts very well with local branch
history.

Furthermore, the knock-on effects are quite impressive.

  $ ../scripts/bloat-o-meter xen-syms-before xen-syms-after
  add/remove: 0/4 grow/shrink: 3/9 up/down: 459/-823 (-364)
  Function                                     old     new   delta
  __ioapic_write_entry                          73     286    +213
  __ioapic_read_entry                           75     276    +201
  save_IO_APIC_setup                           182     227     +45
  eoi_IO_APIC_irq                              241     229     -12
  disable_IO_APIC                              296     280     -16
  mask_IO_APIC_setup                           272     240     -32
  __io_apic_write                               46       -     -46
  __io_apic_read                                46       -     -46
  io_apic_set_pci_routing                      985     930     -55
  __io_apic_eoi.part                           223     161     -62
  io_apic_write                                 69       -     -69
  io_apic_read                                  69       -     -69
  restore_IO_APIC_setup                        325     253     -72
  ioapic_guest_write                          1413    1333     -80
  clear_IO_APIC_pin                            447     343    -104
  setup_IO_APIC                               5148    4988    -160

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago xen/wait: Remove indirect jump
Andrew Cooper [Fri, 22 Oct 2021 15:07:07 +0000 (16:07 +0100)]
xen/wait: Remove indirect jump

There is no need for this to be an indirect jump at all.  Execution always
returns to a specific location, so use a direct jump instead.

Use a named label for the jump.  As both 1 and 2 have disappeared as labels,
rename 3 to skip to better describe its purpose.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago xen/smp: Support NULL IPI function pointers
Andrew Cooper [Wed, 17 Nov 2021 16:16:23 +0000 (16:16 +0000)]
xen/smp: Support NULL IPI function pointers

There are several cases where the act of interrupting a remote processor has
the required side effect.  Explicitly allow NULL function pointers so the
calling code doesn't have to provide a stub implementation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/ACPI: drop dead interpreter-related code
Jan Beulich [Fri, 5 Nov 2021 12:35:46 +0000 (13:35 +0100)]
x86/ACPI: drop dead interpreter-related code

CONFIG_ACPI_INTERPRETER does not get defined anywhere, the enclosed code
wouldn't build, and the default-to-phys logic works differently anyway
(see genapic/bigsmp.c:probe_bigsmp()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/APIC: rename cmdline_apic
Jan Beulich [Fri, 5 Nov 2021 12:35:21 +0000 (13:35 +0100)]
x86/APIC: rename cmdline_apic

The name hasn't been appropriate for a long time: It covers not only
command line overrides, but also x2APIC pre-enabled state.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/APIC: drop probe_default()
Jan Beulich [Fri, 5 Nov 2021 12:34:57 +0000 (13:34 +0100)]
x86/APIC: drop probe_default()

The function does nothing but return success. Simply treat absence of a
probe hook to mean just this. This then eliminates the (purely
theoretical at this point) risk of trying to call through
apic_x2apic_{cluster,phys}'s respective NULL pointers.

While doing this also eliminate generic_apic_probe()'s "changed"
variable: apic_probe[]'s default entry will now be used unconditionally
in yet more obvious a way, such that separately setting genapic from
apic_default is (hopefully) no longer justified. Yet that was the main
purpose of the variable.

To help prove that apic_default's probe() hook doesn't get used
elsewhere, further make apic_probe[] static at this occasion.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/APIC: drop {acpi_madt,mps}_oem_check() hooks
Jan Beulich [Fri, 5 Nov 2021 12:34:37 +0000 (13:34 +0100)]
x86/APIC: drop {acpi_madt,mps}_oem_check() hooks

The hook functions have been empty for a very long time, if not
(according to git history) forever. Ditch them alongside the then empty
mach_mpparse.h instances and the then unused APICFUNC() macro.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/APIC: drop clustered_apic_check() hook
Jan Beulich [Fri, 5 Nov 2021 12:34:12 +0000 (13:34 +0100)]
x86/APIC: drop clustered_apic_check() hook

The hook functions have been empty forever (x2APIC) or issuing merely a
printk() for a long time (xAPIC). Since that printk() is (a) generally
useful (i.e. also in the x2APIC case) and (b) would better only be
issued once the final APIC driver to use was determined, move (and
generalize) it into connect_bsp_APIC().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/cpufreq: Drop opencoded CPUID handling from powernow
Andrew Cooper [Fri, 12 Nov 2021 16:00:13 +0000 (16:00 +0000)]
x86/cpufreq: Drop opencoded CPUID handling from powernow

Xen already collects CPUID.0x80000007.edx by default, meaning that we can
refer to per-cpu data directly.  This also avoids the need to IPI the onlining
CPU to identify whether Core Performance Boost is available.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/cpufreq: Rework APERF/MPERF handling
Andrew Cooper [Fri, 12 Nov 2021 16:28:24 +0000 (16:28 +0000)]
x86/cpufreq: Rework APERF/MPERF handling

Currently, each feature_detect() (called on CPU add) hook for both cpufreq
drivers duplicates cpu_has_aperfmperf in a per-cpu datastructure, and edits
cpufreq_driver.getavg to point at get_measured_perf().

As all parts of this are vendor-neutral, drop the function pointer and
duplicated boolean, and call get_measured_perf() directly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/cpufreq: Clean up powernow registration
Andrew Cooper [Fri, 12 Nov 2021 15:13:36 +0000 (15:13 +0000)]
x86/cpufreq: Clean up powernow registration

powernow_register_driver() is currently written with a K&R type definition;
I'm surprised that compilers don't object to a mismatch with its declaration,
which is written in an ANSI-C compatible way.

Furthermore, its sole caller is cpufreq_driver_init() which is a pre-smp
initcall.  There are no other online CPUs, and even if there were, checking
the BSP's CPUID data $N times is pointless.  Simplify registration to only
look at the BSP.

While at it, drop obviously unused includes.  Also rewrite the expression in
cpufreq_driver_init() for clarity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years ago xen/xsm: Improve fallback handling in xsm_fixup_ops()
Andrew Cooper [Thu, 4 Nov 2021 03:12:49 +0000 (03:12 +0000)]
xen/xsm: Improve fallback handling in xsm_fixup_ops()

The current xsm_fixup_ops() is just shy of a full page when compiled, and very
fragile to NULL function pointer errors.

Address both of these issues with a minor piece of structure (ab)use.
Introduce dummy_ops, and fix up the provided xsm_ops pointer by treating both
as an array of unsigned longs.

The compiled size improvement speaks for itself:

  $ ../scripts/bloat-o-meter xen-syms-before xen-syms-after
  add/remove: 1/0 grow/shrink: 0/1 up/down: 712/-3897 (-3185)
  Function                                     old     new   delta
  dummy_ops                                      -     712    +712
  xsm_fixup_ops                               3987      90   -3897

and there is an additional safety check that will make it obvious during
development if there is an issue with the fallback handling.
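
The idea of fixing up a table of hooks slot by slot can be sketched as below.
This is a standalone illustration, not the xsm code: struct demo_ops and its
hooks are hypothetical, and the sketch copies each slot with memcpy() instead
of casting to unsigned long, to stay within ISO C aliasing rules.  It assumes
function pointers are the size of uintptr_t and that a NULL function pointer
is all-bits-zero.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ops table consisting solely of function pointers. */
struct demo_ops {
    int (*first)(void);
    int (*second)(void);
    int (*third)(void);
};

static int dummy_first(void)  { return -1; }
static int dummy_second(void) { return -2; }
static int dummy_third(void)  { return -3; }

/* Fallback table: every slot is populated. */
static const struct demo_ops dummy_ops = {
    dummy_first, dummy_second, dummy_third,
};

/* The slot-wise walk only works if the struct is an array of pointers. */
_Static_assert(sizeof(int (*)(void)) == sizeof(uintptr_t),
               "function pointers must be pointer-sized");
_Static_assert(sizeof(struct demo_ops) % sizeof(uintptr_t) == 0,
               "ops table must be a whole number of slots");

/* Fill every NULL slot in *ops from the matching slot in dummy_ops. */
static void fixup_ops(struct demo_ops *ops)
{
    unsigned char *dst = (unsigned char *)ops;
    const unsigned char *src = (const unsigned char *)&dummy_ops;
    size_t i;

    for (i = 0; i < sizeof(*ops); i += sizeof(uintptr_t)) {
        uintptr_t slot;

        memcpy(&slot, dst + i, sizeof(slot));
        if (!slot)
            memcpy(dst + i, src + i, sizeof(slot));
    }

    /* Safety check: no hook may be left unset after fixup. */
    assert(ops->first && ops->second && ops->third);
}

/* Example of a hook supplied by a real module. */
static int custom_second(void) { return 42; }
```

The loop body stays the same size however many hooks the table gains, which is
where the size reduction over per-hook checks comes from.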

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
3 years ago xen/xsm: Drop xsm_hvm_control() hook
Andrew Cooper [Fri, 29 Oct 2021 21:43:50 +0000 (22:43 +0100)]
xen/xsm: Drop xsm_hvm_control() hook

The final caller was dropped by c/s 58cbc034dc62 "dm_op: convert
HVMOP_inject_trap and HVMOP_inject_msi"

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
3 years ago xen/xsm: Complete altcall conversion of xsm interface
Andrew Cooper [Thu, 4 Nov 2021 19:36:16 +0000 (19:36 +0000)]
xen/xsm: Complete altcall conversion of xsm interface

With alternative_call() capable of handling compound types, the three
remaining hooks can be optimised at boot time too.

Fixes: 164a0b9653f4 ("xsm: refactor xsm_ops handling")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
3 years ago x86/altcall: allow compound types to be passed
Jan Beulich [Thu, 4 Nov 2021 16:04:05 +0000 (17:04 +0100)]
x86/altcall: allow compound types to be passed

Replace the conditional operator in ALT_CALL_ARG(), which was intended
to limit usable types to scalar ones, by a size check. Some restriction
here is necessary to make sure we don't violate the ABI's calling
conventions, but limiting to scalar types was both too restrictive
(disallowing e.g. guest handles) and too permissive (allowing e.g.
__int128_t).

Note that there was some anomaly with that conditional operator anyway:
Something - I don't recall what - made it impossible to omit the middle
operand.

Code-generation-wise this has the effect of removing certain zero- or
sign-extending in some altcall invocations. This ought to be fine as the
ABI doesn't require sub-sizeof(int) values to be extended, except when
passed through an ellipsis. No function subject to altcall patching has
a variable number of arguments, though.
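
The difference between a "scalar only" rule and a size rule can be shown with
a small standalone sketch.  It is not the ALT_CALL_ARG() macro itself:
FITS_IN_ONE_REGISTER() and both structs are hypothetical, with demo_wide
standing in for an over-wide type such as __int128_t.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical predicate: can a value of this type travel in one register? */
#define FITS_IN_ONE_REGISTER(type) (sizeof(type) <= sizeof(void *))

/* Stand-in for a guest handle: a compound type wrapping a single pointer. */
struct demo_handle { void *p; };

/* Stand-in for an over-wide type: twice the width of a register. */
struct demo_wide { void *lo, *hi; };

/* A compound type the size of a pointer passes the size check. */
static bool handle_ok(void)
{
    return FITS_IN_ONE_REGISTER(struct demo_handle);
}

/* A type wider than a pointer fails it, scalar or not. */
static bool wide_ok(void)
{
    return FITS_IN_ONE_REGISTER(struct demo_wide);
}
```

A size-based rule accepts the pointer-sized compound type and rejects the wide
one, which is the behaviour described above.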

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Unfortunately this triggers -Werror=sizeof-array-argument on some versions of
GCC, so alter xsm_{alloc,free}_security_evtchns() to use a pointer rather than
array parameter.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
3 years ago Revert "x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents"
Andrew Cooper [Wed, 24 Nov 2021 19:06:02 +0000 (19:06 +0000)]
Revert "x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents"

OSSTest has identified a 3rd regression caused by this change.  Migration
between Xen 4.15 and 4.16 on the nocera pair of machines (AMD Opteron 4133)
fails with:

  xc: error: Failed to set CPUID policy: leaf 00000000, subleaf ffffffff, msr ffffffff (22 = Invalid argument): Internal error
  xc: error: Restore failed (22 = Invalid argument): Internal error

which is a safety check to prevent resuming the guest when the CPUID data has
been truncated.  The problem is caused by shrinking of the max policies, which
is an ABI that needs handling compatibly between different versions of Xen.

Furthermore, shrinking of the default policies also breaks things in some
cases, because certain cpuid= settings in a VM config file which used to work
will now be refused.  Also external toolstacks that attempt to set the CPUID
policy from a featureset might now see some filled leaves not reachable due to
the shrinking done to the default domain policy before applying the
featureset.

This reverts commit 540d911c2813c3d8f4cdbb3f5672119e5e768a3d, as well as the
partial fix attempt in 81da2b544cbb003a5447c9b14d275746ad22ab37 (which added
one new case where cpuid= settings might not apply correctly) and restores the
same behaviour as Xen 4.15.

Fixes: 540d911c2813 ("x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents")
Fixes: 81da2b544cbb ("x86/cpuid: prevent shrinking migrated policies max leaves")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years ago VT-d: conditionalize IOTLB register offset check
Jan Beulich [Wed, 24 Nov 2021 10:12:44 +0000 (11:12 +0100)]
VT-d: conditionalize IOTLB register offset check

As of commit 6773b1a7584a ("VT-d: Don't assume register-based
invalidation is always supported") we don't (try to) use register based
invalidation anymore when that's not supported by hardware. Hence
there's also no point in the respective check, avoiding pointless IOMMU
initialization failure. After all the spec (version 3.3 at the time of
writing) doesn't say what the respective Extended Capability Register
field would contain in such a case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: correct off-by-1 in fault register range check
Jan Beulich [Wed, 24 Nov 2021 10:12:03 +0000 (11:12 +0100)]
VT-d: correct off-by-1 in fault register range check

All our present implementation requires is that the range fully fits
in a single page. No need to exclude the case of the last register
extending right to the end of that page.
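
The boundary condition can be sketched as follows.  This is a generic
standalone illustration rather than the VT-d code: range_fits_in_page() and
DEMO_PAGE_SIZE are hypothetical, and offsets are taken relative to the start
of the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumed page size for the example */

/*
 * Does [offset, offset + len) lie entirely within one page?  The comparison
 * is "<=" so that a range ending exactly at the page boundary is accepted,
 * and it is written without computing offset + len to avoid overflow.
 */
static bool range_fits_in_page(uint32_t offset, uint32_t len)
{
    return offset <= DEMO_PAGE_SIZE && len <= DEMO_PAGE_SIZE - offset;
}
```

A 16-byte register at offset 4080 ends exactly at the page end and is
accepted, while the same register at offset 4081 is rejected.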

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: prune SAGAW recognition
Jan Beulich [Wed, 24 Nov 2021 10:11:24 +0000 (11:11 +0100)]
VT-d: prune SAGAW recognition

Bit 0 of SAGAW in the capability register has become reserved at or
before spec version 2.2. Treat it as such. Replace the effective open-
coding of find_first_set_bit(). Adjust local variable types.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago x86/Viridian: drop dead variable updates
Jan Beulich [Wed, 24 Nov 2021 10:10:36 +0000 (11:10 +0100)]
x86/Viridian: drop dead variable updates

Both hvcall_flush_ex() and hvcall_ipi_ex() update "size" without
subsequently using the value; future compilers may warn about such.
Alongside dropping the updates, shrink the variables' scopes to
demonstrate that there are no outer scope uses.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
3 years ago x86/Viridian: fix error code use
Jan Beulich [Wed, 24 Nov 2021 10:09:56 +0000 (11:09 +0100)]
x86/Viridian: fix error code use

Both the wrong use of HV_STATUS_* and the return type of
hv_vpset_to_vpmask() can lead to viridian_hypercall()'s
ASSERT_UNREACHABLE() triggering when translating error codes from Xen
to Viridian representation.

Fixes: b4124682db6e ("viridian: add ExProcessorMasks variants of the flush hypercalls")
Fixes: 9afa867d42ba ("viridian: add ExProcessorMasks variant of the IPI hypercall")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
3 years ago MAINTAINERS: declare REMUS support orphaned
Roger Pau Monné [Wed, 24 Nov 2021 10:07:52 +0000 (11:07 +0100)]
MAINTAINERS: declare REMUS support orphaned

The designated maintainer email address for the remus entry is
bouncing, so remove it and declare the entry as orphaned as there's no
other maintainer.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
3 years ago VT-d: don't leak domid mapping on error path
Jan Beulich [Wed, 24 Nov 2021 10:07:11 +0000 (11:07 +0100)]
VT-d: don't leak domid mapping on error path

While domain_context_mapping() invokes domain_context_unmap() in a sub-
case of handling DEV_TYPE_PCI when encountering an error, thus avoiding
a leak, individual calls to domain_context_mapping_one() aren't
similarly covered. Such a leak might persist until domain destruction.
Leverage that these cases can be recognized by pdev being non-NULL.

Fixes: dec403cc668f ("VT-d: fix iommu_domid for PCI/PCIx devices assignment")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: split domid map cleanup check into a function
Jan Beulich [Wed, 24 Nov 2021 10:06:20 +0000 (11:06 +0100)]
VT-d: split domid map cleanup check into a function

This logic will want invoking from elsewhere.

No functional change intended.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: properly reserve DID 0 for caching mode IOMMUs
Jan Beulich [Wed, 24 Nov 2021 10:05:36 +0000 (11:05 +0100)]
VT-d: properly reserve DID 0 for caching mode IOMMUs

Merely setting bit 0 in the bitmap is insufficient, as then Dom0 will
still have DID 0 allocated to it, because of the zero-filling of
domid_map[]. Set slot 0 to DOMID_INVALID to keep DID 0 from getting
used.

Fixes: b9c20c78789f ("VT-d: per-iommu domain-id")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: don't needlessly engage the untrusted-MSI workaround
Jan Beulich [Wed, 24 Nov 2021 10:04:32 +0000 (11:04 +0100)]
VT-d: don't needlessly engage the untrusted-MSI workaround

The quarantine domain doesn't count as a DomU, as it won't itself
trigger any bad behavior. The workaround only needs enabling when an
actual DomU is about to gain control of a device. This then also means
enabling of the workaround can be deferred until immediately ahead of
the call to domain_context_mapping(). While there also stop open-coding
is_hardware_domain().

Fixes: 319f9a0ba94c ("passthrough: quarantine PCI devices")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: prune super-page related capability macros
Jan Beulich [Wed, 24 Nov 2021 10:03:52 +0000 (11:03 +0100)]
VT-d: prune super-page related capability macros

cap_super_page_val() and cap_super_offset() are unused (apart from the
latter using the former). I don't see how cap_super_offset() can be
useful in its current shape: cap_super_page_val()'s result is not an
lvalue and hence can't have its address taken. Plus a user would have
to check the capability register field is non-zero, for
find_first_bit() (or find_first_set_bit(), if suitably corrected) to be
valid in the first place. Yet as per the spec when the field is non-zero
the low bit would always be set, so the result would be independent of
the actual value the field holds.

Further zap cap_sps_512gb() and cap_sps_1tb(). While earlier versions
of the spec had things spelled out that way, the current version marks
the two bits as reserved. And "48-bit offset to page frame" wasn't in
line with 1Tb pages anyway - clearly 256Tb pages would have been meant
here.

Finally properly parenthesize parameter uses in the remaining two
macros.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago add .gitignore entries for *.[is] below xen
Juergen Gross [Wed, 24 Nov 2021 10:03:09 +0000 (11:03 +0100)]
add .gitignore entries for *.[is] below xen

Instead of listing each single file with .s or .i suffixes in
.gitignore use pattern based entries. Restrict those to the xen
directory as we have e.g. tools/libs/stat/bindings/swig/xenstat.i in
our tree.

Below xen the pattern based entries are fine, as we have pattern rules
for creating *.s and *.i files in xen/Rules.mk.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years agox86: modify hvm_memory_op() prototype
Juergen Gross [Wed, 24 Nov 2021 10:02:24 +0000 (11:02 +0100)]
x86: modify hvm_memory_op() prototype

hvm_memory_op() should take an unsigned long as cmd, like
do_memory_op().

As hvm_memory_op() is basically just calling do_memory_op() (or
compat_memory_op()) passing through the parameters the cmd parameter
should have no smaller size than that of the called functions.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agox86/PV: drop "vcpu" local variable from show_guest_stack()
Jan Beulich [Wed, 24 Nov 2021 10:01:05 +0000 (11:01 +0100)]
x86/PV: drop "vcpu" local variable from show_guest_stack()

It's not really needed and has been misleading me more than once to try
and spot its "actual" use(s). It should really have been dropped when
the 32-bit specific logic was purged from here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agoSet version to 4.17: rerun autogen.sh
Ian Jackson [Tue, 23 Nov 2021 16:55:32 +0000 (16:55 +0000)]
Set version to 4.17: rerun autogen.sh

Signed-off-by: Ian Jackson <iwj@xenproject.org>
3 years agoSet version to 4.17; 4.16 has branched
Ian Jackson [Tue, 23 Nov 2021 16:54:08 +0000 (16:54 +0000)]
Set version to 4.17; 4.16 has branched

Signed-off-by: Ian Jackson <iwj@xenproject.org>
3 years agoRevert "Config.mk: pin QEMU_UPSTREAM_REVISION (prep for Xen 4.16 RC1)"
Ian Jackson [Tue, 23 Nov 2021 16:51:47 +0000 (16:51 +0000)]
Revert "Config.mk: pin QEMU_UPSTREAM_REVISION (prep for Xen 4.16 RC1)"

This branch is unstable again now.

This reverts commit c9ce6afbf2d7772f47fc572bb7fc9555724927ed.

3 years agox86/P2M: deal with partial success of p2m_set_entry() 4.16.0-rc4
Jan Beulich [Mon, 22 Nov 2021 11:12:32 +0000 (11:12 +0000)]
x86/P2M: deal with partial success of p2m_set_entry()

M2P and PoD stats need to remain in sync with P2M; if an update succeeds
only partially, respective adjustments need to be made. If updates get
made before the call, they may also need undoing upon complete failure
(i.e. including the single-page case).

Log-dirty state would better also be kept in sync.

Note that the change to set_typed_p2m_entry() may not be strictly
necessary (due to the order restriction enforced near the top of the
function), but is being kept here to be on the safe side.

This is CVE-2021-28705 and CVE-2021-28709 / XSA-389.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agox86/PoD: handle intermediate page orders in p2m_pod_cache_add()
Jan Beulich [Mon, 22 Nov 2021 11:11:44 +0000 (11:11 +0000)]
x86/PoD: handle intermediate page orders in p2m_pod_cache_add()

p2m_pod_decrease_reservation() may pass pages to the function which
aren't 4k, 2M, or 1G. Handle all intermediate orders as well, to avoid
hitting the BUG() at the switch() statement's "default" case.

This is CVE-2021-28708 / part of XSA-388.

Fixes: 3c352011c0d3 ("x86/PoD: shorten certain operations on higher order ranges")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agox86/PoD: deal with misaligned GFNs
Jan Beulich [Mon, 22 Nov 2021 11:11:44 +0000 (11:11 +0000)]
x86/PoD: deal with misaligned GFNs

Users of XENMEM_decrease_reservation and XENMEM_populate_physmap aren't
required to pass in order-aligned GFN values. (While I consider this
bogus, I don't think we can fix this there, as that might break existing
code, e.g. Linux's swiotlb, which - while affecting PV only - until
recently had been enforcing only page alignment on the original
allocation.) Only non-PoD code paths (guest_physmap_{add,remove}_page(),
p2m_set_entry()) look to be dealing with this properly (in part by being
implemented inefficiently, handling every 4k page separately).

Introduce wrappers taking care of splitting the incoming request into
aligned chunks, without putting much effort in trying to determine the
largest possible chunk at every iteration.
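
The splitting described above can be sketched roughly as follows. This is
a standalone illustration, not the code added by this change: the function
names and the callback are hypothetical, and picking the lowest set bit of
(gfn | left) is just one simple way of getting an aligned chunk without
searching for the largest possible one.

```c
#include <assert.h>

/* Index of the lowest set bit of a non-zero value. */
static unsigned int lowest_set_bit(unsigned long v)
{
    unsigned int n = 0;

    assert(v != 0);
    while ( !(v & 1) )
    {
        v >>= 1;
        n++;
    }

    return n;
}

/*
 * Split the 2^order pages starting at gfn into naturally aligned
 * power-of-two chunks, invoking fn (if non-NULL) on each of them.
 * Returns the number of chunks processed.
 */
static unsigned int for_each_aligned_chunk(unsigned long gfn,
                                           unsigned int order,
                                           void (*fn)(unsigned long gfn,
                                                      unsigned int order))
{
    unsigned long left = 1UL << order;
    unsigned int chunks = 0;

    while ( left )
    {
        /*
         * The lowest set bit of (gfn | left) is no larger than gfn's
         * alignment and no larger than what is left to process.
         */
        unsigned int chunk_order = lowest_set_bit(gfn | left);

        if ( fn )
            fn(gfn, chunk_order);

        gfn += 1UL << chunk_order;
        left -= 1UL << chunk_order;
        chunks++;
    }

    return chunks;
}
```

For an order-aligned GFN the loop runs exactly once, matching the note
below that behavior is unchanged for well-behaved callers.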

Also "handle" p2m_set_entry() failure for non-order-0 requests by
crashing the domain in one more place. Alongside putting a log message
there, also add one to the other similar path.

Note regarding locking: This is left in the actual worker functions on
the assumption that callers aren't guaranteed atomicity wrt acting on
multiple pages at a time. For mis-aligned GFNs gfn_lock() wouldn't have
locked the correct GFN range anyway, if it didn't simply resolve to
p2m_lock(), and for well-behaved callers there continues to be only a
single iteration, i.e. behavior is unchanged for them. (FTAOD pulling
out just pod_lock() into p2m_pod_decrease_reservation() would result in
a lock order violation.)

This is CVE-2021-28704 and CVE-2021-28707 / part of XSA-388.

Fixes: 3c352011c0d3 ("x86/PoD: shorten certain operations on higher order ranges")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agoxen/page_alloc: Harden assign_pages()
Julien Grall [Mon, 22 Nov 2021 11:11:05 +0000 (11:11 +0000)]
xen/page_alloc: Harden assign_pages()

domain_tot_pages() and d->max_pages are 32-bit values. While the order
should always be quite small, it would still be possible to overflow
if domain_tot_pages() is near to (2^32 - 1).

As this code may be called by a guest via XENMEM_increase_reservation
and XENMEM_populate_physmap, we want to make sure the guest is not going
to be able to allocate more than it is allowed.

Rework the allocation check to avoid any possible overflow. While the
check domain_tot_pages() < d->max_pages should technically not be
necessary, it is probably best to have it to catch any possible
inconsistencies in the future.
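
A minimal sketch of the kind of check meant here, using plain 32-bit
integers in place of domain_tot_pages() and d->max_pages. The function
names are hypothetical; the actual check in assign_pages() may be shaped
differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Naive form: tot_pages + nr can wrap around in 32-bit arithmetic. */
static bool over_allocation_naive(uint32_t tot_pages, uint32_t max_pages,
                                  unsigned int order)
{
    assert(order < 32);
    return tot_pages + (UINT32_C(1) << order) > max_pages;
}

/* Overflow-safe form: compare the request against the remaining headroom. */
static bool over_allocation_safe(uint32_t tot_pages, uint32_t max_pages,
                                 unsigned int order)
{
    uint32_t nr;

    assert(order < 32);
    nr = UINT32_C(1) << order;

    /* The first test also catches an already inconsistent page count. */
    return tot_pages > max_pages || nr > max_pages - tot_pages;
}
```

With tot_pages at 2^32 - 1 the naive sum wraps to a small number and the
request slips through, while the subtraction based form rejects it.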

This is CVE-2021-28706 / part of XSA-385.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agoefi: fix alignment of function parameters in compat mode
Roger Pau Monne [Thu, 18 Nov 2021 08:28:06 +0000 (09:28 +0100)]
efi: fix alignment of function parameters in compat mode

Currently the max_store_size, remain_store_size and max_size in
compat_pf_efi_runtime_call are 4 byte aligned, which makes clang
13.0.0 complain with:

In file included from compat.c:30:
./runtime.c:646:13: error: passing 4-byte aligned argument to 8-byte aligned parameter 2 of 'QueryVariableInfo' may result in an unaligned pointer access [-Werror,-Walign-mismatch]
            &op->u.query_variable_info.max_store_size,
            ^
./runtime.c:647:13: error: passing 4-byte aligned argument to 8-byte aligned parameter 3 of 'QueryVariableInfo' may result in an unaligned pointer access [-Werror,-Walign-mismatch]
            &op->u.query_variable_info.remain_store_size,
            ^
./runtime.c:648:13: error: passing 4-byte aligned argument to 8-byte aligned parameter 4 of 'QueryVariableInfo' may result in an unaligned pointer access [-Werror,-Walign-mismatch]
            &op->u.query_variable_info.max_size);
            ^
Fix this by bouncing the variables on the stack in order for them to
be 8 byte aligned.
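
The bouncing pattern can be sketched as below. The structure layout and
both function names are hypothetical stand-ins (the real compat structure
and the EFI runtime call differ), and the attribute syntax assumes gcc or
clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical compat layout: 64-bit fields are only 4-byte aligned. */
struct compat_query_info {
    uint32_t attr;
    uint64_t max_store_size;
    uint64_t remain_store_size;
    uint64_t max_size;
} __attribute__((packed, aligned(4)));

/* Stand-in for a callee expecting naturally aligned uint64_t pointers. */
static int query_variable_info(uint32_t attr, uint64_t *max_store,
                               uint64_t *remain_store, uint64_t *max_size)
{
    (void)attr;
    *max_store = 65536;
    *remain_store = 4096;
    *max_size = 1024;

    return 0;
}

static int compat_query(struct compat_query_info *info)
{
    /* Locals on the stack are naturally (8-byte) aligned. */
    uint64_t max_store, remain_store, max_size;
    int rc;

    assert(info);
    max_store = info->max_store_size;
    remain_store = info->remain_store_size;
    max_size = info->max_size;

    rc = query_variable_info(info->attr, &max_store, &remain_store,
                             &max_size);

    /* Copy the results back into the 4-byte aligned fields. */
    info->max_store_size = max_store;
    info->remain_store_size = remain_store;
    info->max_size = max_size;

    return rc;
}
```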

Note this could be done in a more selective manner to only apply to
compat code calls, but given the overhead of making an EFI call doing
an extra copy of 3 variables doesn't seem to warrant the special
casing.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v3:
 - Remove hard tabs.  Apply Jan's r-b as authorised in email.
Changes since v2:
 - Adjust the commentary as per discussion.
Changes since v1:
 - Copy back the results.

3 years agogolang/xenlight: regen generated code
Anthony PERARD [Fri, 19 Nov 2021 10:29:48 +0000 (10:29 +0000)]
golang/xenlight: regen generated code

Fixes: 7379f9e10a3b ("gnttab: allow setting max version per-domain")
Fixes: 1e6706b0d123 ("xen/arm: Introduce gpaddr_bits field to struct xen_domctl_getdomaininfo")
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoVT-d: fix reduced page table levels support when sharing tables
Jan Beulich [Fri, 19 Nov 2021 14:14:08 +0000 (15:14 +0100)]
VT-d: fix reduced page table levels support when sharing tables

domain_pgd_maddr() contains logic to adjust the root address to be put
in the context entry in case 4-level page tables aren't supported by an
IOMMU. This logic may not be bypassed when sharing page tables.

This is CVE-2021-28710 / XSA-390.

Fixes: 25ccd093425c ("iommu: remove the share_p2m operation")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agotools/python: fix python libxc bindings to pass a max grant version
Roger Pau Monné [Wed, 17 Nov 2021 11:43:05 +0000 (12:43 +0100)]
tools/python: fix python libxc bindings to pass a max grant version

Such max version should be provided by the caller, otherwise the
bindings will default to specifying a max version of 2, which is
in line with the current defaults in the hypervisor.

Fixes: 7379f9e10a ('gnttab: allow setting max version per-domain')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoCHANGELOG: set Xen 4.16 release date
Roger Pau Monné [Wed, 17 Nov 2021 11:35:26 +0000 (12:35 +0100)]
CHANGELOG: set Xen 4.16 release date

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agotest/tsx: set grant version for created domains
Roger Pau Monné [Wed, 17 Nov 2021 07:13:18 +0000 (08:13 +0100)]
test/tsx: set grant version for created domains

Set the grant table version for the created domains to use version 1,
as such test domains don't require the usage of the grant table at
all. A TODO note is added to switch those dummy domains to not have a
grant table at all when possible. Without setting the grant version
the domains for the tests cannot be created.

Fixes: 7379f9e10a ('gnttab: allow setting max version per-domain')
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agotests/resource: set grant version for created domains
Roger Pau Monné [Wed, 17 Nov 2021 07:13:02 +0000 (08:13 +0100)]
tests/resource: set grant version for created domains

Set the grant table version for the created domains to use version 1,
as that's the version used by the test cases. Without setting the grant
version the domains for the tests cannot be created.

Fixes: 7379f9e10a ('gnttab: allow setting max version per-domain')
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agodomctl: introduce a macro to set the grant table max version
Roger Pau Monné [Wed, 17 Nov 2021 07:12:00 +0000 (08:12 +0100)]
domctl: introduce a macro to set the grant table max version

Such macro just clamps the passed version to fit in the designated
bits of the domctl field. The main purpose is to make it clearer in
the code when max grant version is being set in the grant_opts field.
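
A sketch of what such a macro could look like. The names below are made
up for illustration and are not claimed to match the identifiers in the
public headers; the 4-bit width follows the commit introducing the field,
and fitting the value by masking is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names: the low 4 bits of the options field hold the version. */
#define GRANT_OPTS_VERSION_MASK 0xfU
#define GRANT_OPTS_VERSION(v)   ((v) & GRANT_OPTS_VERSION_MASK)

/* Build an options value carrying only the max grant version. */
static uint32_t make_grant_opts(unsigned int max_version)
{
    uint32_t opts = GRANT_OPTS_VERSION(max_version);

    /* Whatever was passed in, the result fits the designated bits. */
    assert(opts <= GRANT_OPTS_VERSION_MASK);

    return opts;
}
```

A call site then reads e.g. make_grant_opts(1), which makes it obvious
that a grant version is being set rather than some other flag.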

Existing users that were setting the version in the grant_opts field
are switched to use the macro.

No functional change intended.

Requested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agopublic/gnttab: relax v2 recommendation 4.16.0-rc3
Jan Beulich [Tue, 16 Nov 2021 16:34:06 +0000 (17:34 +0100)]
public/gnttab: relax v2 recommendation

With there being a way to disable v2 support, telling new guests to use
v2 exclusively is not a good suggestion.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agotests/resource: Extend to check that the grant frames are mapped correctly
Jane Malalane [Fri, 12 Nov 2021 14:48:21 +0000 (14:48 +0000)]
tests/resource: Extend to check that the grant frames are mapped correctly

Previously, we checked that we could map 40 pages with nothing
complaining. Now we're adding extra logic to check that those 40
frames are "correct".

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agox86/cpuid: prevent shrinking migrated policies max leaves
Roger Pau Monne [Wed, 10 Nov 2021 17:40:59 +0000 (18:40 +0100)]
x86/cpuid: prevent shrinking migrated policies max leaves

CPUID policies of guests being migrated shouldn't have their maximum
leaves shrunk, as that would be a guest visible change. The hypervisor
has no knowledge of whether a guest has been migrated or is built from
scratch, and hence it must not blindly shrink the CPUID policy in
recalculate_cpuid_policy. Remove the
x86_cpuid_policy_shrink_max_leaves call from recalculate_cpuid_policy.
Removing this call could be seen as a partial revert of 540d911c28.

Instead let the toolstack shrink the policies for newly created
guests, while keeping the previous values for guests that are migrated
in. Note that guests migrated in without a CPUID policy won't get any
kind of shrinking applied.

Fixes: 540d911c28 ('x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoVT-d: per-domain IOMMU bitmap needs to have dynamic size
Jan Beulich [Fri, 12 Nov 2021 12:56:51 +0000 (13:56 +0100)]
VT-d: per-domain IOMMU bitmap needs to have dynamic size

With no upper bound (anymore) on the number of IOMMUs, a fixed-size
64-bit map may be insufficient (systems with 40 IOMMUs have already been
observed).
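
For illustration, a dynamically sized bitmap can be sketched as below.
This is a userspace approximation with hypothetical helper names; the
hypervisor has its own allocation and bitmap helpers, which this sketch
does not attempt to reproduce.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

#define BITS_PER_LONG    (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/* Allocate a zeroed bitmap with room for nr_iommus bits; NULL on failure. */
static unsigned long *iommu_bitmap_alloc(unsigned int nr_iommus)
{
    return calloc(BITS_TO_LONGS(nr_iommus), sizeof(unsigned long));
}

static void iommu_bitmap_set(unsigned long *map, unsigned int idx)
{
    assert(map);
    map[idx / BITS_PER_LONG] |= 1UL << (idx % BITS_PER_LONG);
}

static bool iommu_bitmap_test(const unsigned long *map, unsigned int idx)
{
    assert(map);
    return (map[idx / BITS_PER_LONG] & (1UL << (idx % BITS_PER_LONG))) != 0;
}
```

Sizing the allocation from the number of IOMMUs found at boot removes any
dependency on a compile-time upper bound.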

Fixes: 27713fa2aa21 ("VT-d: improve save/restore of registers across S3")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoMAINTAINERS: add Bertrand to the ARM reviewers
Stefano Stabellini [Fri, 5 Nov 2021 15:44:45 +0000 (08:44 -0700)]
MAINTAINERS: add Bertrand to the ARM reviewers

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoxen/arm: allocate_bank_memory: don't create memory banks of size zero
Stefano Stabellini [Wed, 10 Nov 2021 20:55:55 +0000 (12:55 -0800)]
xen/arm: allocate_bank_memory: don't create memory banks of size zero

allocate_bank_memory can be called with a tot_size of zero; as an
example, see the implementation of allocate_memory, which can call
allocate_bank_memory with a tot_size of zero for the second memory bank.

If tot_size == 0, don't create an empty memory bank, just return
immediately without error. Otherwise a zero-size memory bank will be
added to the domain device tree.
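
The early return can be sketched as follows, using simplified stand-in
types and a hypothetical add_bank() helper rather than the real
allocate_bank_memory() and its data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_BANKS 4   /* hypothetical limit for this sketch */

struct membank {
    uint64_t start, size;
};

struct meminfo {
    unsigned int nr_banks;
    struct membank bank[MAX_BANKS];
};

/* Append a bank; a zero-sized request succeeds without adding anything. */
static bool add_bank(struct meminfo *mem, uint64_t start, uint64_t size)
{
    assert(mem);

    if ( size == 0 )
        return true;            /* nothing to allocate, not an error */

    if ( mem->nr_banks >= MAX_BANKS )
        return false;

    mem->bank[mem->nr_banks].start = start;
    mem->bank[mem->nr_banks].size = size;
    mem->nr_banks++;

    return true;
}
```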

Note that Linux is known to be able to cope with zero-size memory banks,
and Xen more recently gained the ability to do so as well (5a37207df520
"xen/arm: bootfdt: Ignore empty memory bank"). However, there might be
other non-Linux OSes that are not able to cope with empty memory banks
as well as Linux (and now Xen). It would be more robust to avoid
zero-size memory banks unless required.

Moreover, the code to find empty address regions in make_hypervisor_node
in Xen is not able to cope with empty memory banks today and would
result in a Xen crash. This is only a latent bug because
make_hypervisor_node is only called for Dom0 at present and
allocate_memory is only called for DomU at the moment. (But if
make_hypervisor_node were to be called for a DomU, then the Xen crash
would become manifest.)

Fixes: f2931b4233ec ("xen/arm: introduce allocate_memory")
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoxen/arm: don't assign domU static-mem to dom0 as reserved-memory
Stefano Stabellini [Wed, 10 Nov 2021 20:18:12 +0000 (12:18 -0800)]
xen/arm: don't assign domU static-mem to dom0 as reserved-memory

DomU static-mem ranges are added to the reserved_mem array for
accounting, but they shouldn't be assigned to dom0 like the other regular
reserved-memory ranges in device tree.

In make_memory_nodes, fix the error by skipping banks with xen_domain
set to true in the reserved-memory array. Also make sure to use the
first valid (!xen_domain) start address for the memory node name.
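
The filtering can be sketched as below. The structure and the helper
names are simplified, hypothetical stand-ins for what make_memory_nodes
operates on; only the xen_domain flag is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct resv_bank {
    uint64_t start, size;
    bool xen_domain;        /* true: static memory reserved for a DomU */
};

/* Index of the first regular reserved-memory bank, or -1 if there is none. */
static int first_regular_bank(const struct resv_bank *banks, unsigned int nr)
{
    unsigned int i;

    assert(banks || !nr);
    for ( i = 0; i < nr; i++ )
        if ( !banks[i].xen_domain )
            return i;

    return -1;
}

/* Number of banks that would be handed to dom0. */
static unsigned int count_regular_banks(const struct resv_bank *banks,
                                        unsigned int nr)
{
    unsigned int i, count = 0;

    assert(banks || !nr);
    for ( i = 0; i < nr; i++ )
        if ( !banks[i].xen_domain )
            count++;

    return count;
}
```

The start address of the bank found by first_regular_bank() is what would
go into the node name, instead of unconditionally using bank 0.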

Fixes: 41c031ff437b ("xen/arm: introduce domain on Static Allocation")
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Penny Zheng <penny.zheng@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agotools/configure: make iPXE dependent on QEMU traditional
Roger Pau Monne [Tue, 9 Nov 2021 09:47:21 +0000 (10:47 +0100)]
tools/configure: make iPXE dependent on QEMU traditional

iPXE is only used by QEMU traditional, so make it off by default
unless QEMU traditional is enabled.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Fixes: bcf77ce510 ('configure: modify default of building rombios')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
3 years agognttab: allow setting max version per-domain
Roger Pau Monne [Thu, 4 Nov 2021 10:48:34 +0000 (11:48 +0100)]
gnttab: allow setting max version per-domain

Introduce a new domain create field so that the toolstack can specify the
maximum grant table version usable by the domain. This is plumbed into
xl and settable by the user as max_grant_version.

Previously this was only settable on a per host basis using the
gnttab command line option.

Note the version is specified using 4 bits, which leaves room to
specify up to grant table version 15. Given that we only have 2 grant
table versions right now, and a new version is unlikely in the near
future using 4 bits seems more than enough.

xenstored stubdomains are limited to grant table v1 because the
current MiniOS code used to build them only has support for grants v1.
There are existing limits set for xenstored stubdomains at creation
time that already match the defaults in MiniOS.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoxen: Report grant table v1/v2 capabilities to the toolstack
Andrew Cooper [Fri, 29 Oct 2021 17:38:13 +0000 (18:38 +0100)]
xen: Report grant table v1/v2 capabilities to the toolstack

In order to let the toolstack be able to set the gnttab version on a
per-domain basis, it needs to know which ABIs Xen supports.  Introduce
XEN_SYSCTL_PHYSCAP_gnttab_v{1,2} for the purpose, and plumb it down into
userspace.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoxen/efi: Fix Grub2 boot on arm64 4.16.0-rc2
Luca Fancellu [Fri, 5 Nov 2021 13:07:28 +0000 (13:07 +0000)]
xen/efi: Fix Grub2 boot on arm64

The code introduced by commit a1743fc3a9fe9b68c265c45264dddf214fd9b882
("arm/efi: Use dom0less configuration when using EFI boot") introduced
a problem when booting Xen using Grub2 on ARM machines using EDK2.

Contrary to the UEFI specification, EDK2+Grub2 returns a NULL DeviceHandle
inside the interface given by the LOADED_IMAGE_PROTOCOL service. This
handle is used later by efi_bs->HandleProtocol(...) inside
get_parent_handle(...) when requesting the SIMPLE_FILE_SYSTEM_PROTOCOL
interface, causing Xen to stop the boot because of an EFI_INVALID_PARAMETER
error.

Before the commit above, the function was never called because the
logic was skipping the call when there were multiboot modules in the
DT because the filesystem was never used and the bootloader had
put in place all the right modules in memory and the addresses
in the DT.

To fix the problem the old logic is put back in place. Because the handle
was given to the efi_check_dt_boot(...), but the revert put the handle
out of scope, the signature of the function is changed to use an
EFI_LOADED_IMAGE handle and request the EFI_FILE_HANDLE only when
needed (module found using xen,uefi-binary).

Another problem is found when the UEFI stub tries to check whether a Dom0
image or DomUs are present.
The logic doesn't work when the UEFI stub is not responsible for loading
any modules, so the efi_check_dt_boot(...) return value is modified
to return the number of multiboot modules found and not only the number
of modules loaded by the stub.
Take the opportunity to also update the comment in handle_module_node(...)
to explain why we return success even if xen,uefi-binary is not found.

Fixes: a1743fc3a9 ("arm/efi: Use dom0less configuration when using EFI boot")
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
3 years agotools: disable building qemu-trad per default
Juergen Gross [Thu, 4 Nov 2021 16:11:21 +0000 (17:11 +0100)]
tools: disable building qemu-trad per default

Using qemu-traditional as device model has been deprecated for some time
now.

So change the default for building it to "disable". This will affect
ioemu-stubdom, too, as there is a direct dependency between the two.

Today it is possible to use a PVH/HVM Linux-based stubdom as device
model. Additionally using ioemu-stubdom isn't really helping for
security, as it requires running a very old and potentially buggy qemu
version in a PV domain. This probably adds more security problems
than using a stubdom removes.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoconfigure: modify default of building rombios
Juergen Gross [Thu, 4 Nov 2021 16:11:20 +0000 (17:11 +0100)]
configure: modify default of building rombios

The tools/configure script will default to build rombios if qemu
traditional is enabled. If rombios is being built, ipxe will be built
per default, too.

This results in rombios and ipxe no longer being built by default when
disabling qemu traditional.

Fix that by rearranging the dependencies:

- build ipxe by default
- build rombios by default if either ipxe or qemu traditional are
  being built

This modification prepares not building qemu traditional by default
without affecting build of rombios and ipxe.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-acked-by: Ian Jackson <iwj@xenproject.org>
3 years agotools/helpers: fix broken xenstore stubdom init
Juergen Gross [Thu, 4 Nov 2021 14:42:42 +0000 (15:42 +0100)]
tools/helpers: fix broken xenstore stubdom init

Commit 1787cc167906f3f ("libs/guest: Move the guest ABI check earlier
into xc_dom_parse_image()") broke starting the xenstore stubdom. This
is due to a rather special way the xenstore stubdom domain config is
being initialized: in order to support both PV and PVH stubdoms,
init-xenstore-domain is using xc_dom_parse_image() to find the correct
domain type. Unfortunately the above commit requires xc_dom_boot_xen_init()
to have been called before using xc_dom_parse_image(). This requires
the domid, which is known only after xc_domain_create(), which requires
the domain type.

In order to break this circular dependency, call xc_dom_boot_xen_init()
with an arbitrary domid first, and then set dom->guest_domid later.

Fixes: 1787cc167906f3f ("libs/guest: Move the guest ABI check earlier into xc_dom_parse_image()")
Signed-off-by: Juergen Gross <jgross@suse.com>
Release-acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agox86/APIC: avoid iommu_supports_x2apic() on error path
Jan Beulich [Thu, 4 Nov 2021 13:44:43 +0000 (14:44 +0100)]
x86/APIC: avoid iommu_supports_x2apic() on error path

The value it returns may change from true to false in case
iommu_enable_x2apic() fails and, as a side effect, clears iommu_intremap
(as can happen at least on AMD). Latch the return value from the first
invocation to replace the second one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agox86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing
Jan Beulich [Thu, 4 Nov 2021 13:44:01 +0000 (14:44 +0100)]
x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing

x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC
mode (physical vs clustered) depends on iommu_intremap, that variable
needs to be set to off as soon as we know we can't / won't enable
interrupt remapping, i.e. in particular when parsing of the respective
ACPI tables failed. Move the turning off of iommu_intremap from AMD
specific code into acpi_iommu_init(), accompanying it by clearing of
iommu_enable.

Take the opportunity and also fully skip ACPI table parsing logic on
VT-d when both "iommu=off" and "iommu=no-intremap" are in effect anyway,
like was already the case for AMD.

The tag below only references the commit uncovering a pre-existing
anomaly.

Fixes: d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier")
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agox86/xstate: reset cached register values on resume
Marek Marczykowski-Górecki [Thu, 4 Nov 2021 13:42:37 +0000 (14:42 +0100)]
x86/xstate: reset cached register values on resume

set_xcr0() and set_msr_xss() use a cached value to avoid setting the
register to the same value over and over. But suspend/resume implicitly
reset the registers and since percpu areas are not deallocated on
suspend anymore, the cache gets stale.
Reset the cache on resume, to ensure the next write will really hit the
hardware. Choose value 0, as it will never be a legitimate write to
those registers - and so, will force write (and cache update).
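
The caching scheme and its reset can be modelled in a few lines. This is
a simulation with hypothetical names: a plain variable stands in for the
hardware register, the reset value is an assumption of the sketch, and a
counter records how often the "hardware" is actually written.

```c
#include <assert.h>
#include <stdint.h>

#define REG_RESET_VALUE 1   /* assumed post-resume value for this sketch */

static uint64_t hw_reg = REG_RESET_VALUE;  /* stand-in for the register */
static uint64_t cached_reg;                /* last value written; 0 = invalid */
static unsigned int hw_writes;             /* number of real writes */

/* Write the register unless the cache says it already holds val. */
static void set_reg_cached(uint64_t val)
{
    assert(val != 0);   /* 0 is reserved to mean "cache invalid" */

    if ( val == cached_reg )
        return;

    hw_reg = val;
    hw_writes++;
    cached_reg = val;
}

/* Resume: hardware lost its state, so the cache must be invalidated too. */
static void simulate_resume(void)
{
    hw_reg = REG_RESET_VALUE;
    cached_reg = 0;
}
```

Without the cache reset in simulate_resume() the first write after resume
would be skipped and the register would keep its reset value.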

Note the cache is used in get_xcr0() and get_msr_xss() too, but:
- set_xcr0() is called few lines below in xstate_init(), so it will
  update the cache with appropriate value
- get_msr_xss() is not used anywhere - and thus not before any
  set_msr_xss() that will fill the cache

Fixes: aca2a985a55a "xen: don't free percpu areas during suspend"
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agox86/traps: Fix typo in do_entry_CP()
Andrew Cooper [Tue, 28 Sep 2021 20:55:56 +0000 (21:55 +0100)]
x86/traps: Fix typo in do_entry_CP()

The call to debugger_trap_entry() should pass the correct vector.  The
break-for-gdbsx logic is in practice unreachable because PV guests can't
generate #CP, but it will interfere with anyone inserting custom debugging
into debugger_trap_entry().

Fixes: 5ad05b9c2490 ("x86/traps: Implement #CP handler and extend #PF for shadow stacks")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoxen/arm: fix SBDF calculation for vPCI MMIO handlers
Oleksandr Andrushchenko [Tue, 2 Nov 2021 11:20:41 +0000 (13:20 +0200)]
xen/arm: fix SBDF calculation for vPCI MMIO handlers

In the vPCI MMIO trap handlers for the guest PCI host bridge it is not
enough for SBDF translation to simply call VPCI_ECAM_BDF(info->gpa), as
the base address may not be aligned in a way that makes the translation
always work. If the gpa is not adjusted with respect to the base address,
the SBDF may not be converted properly.
Fix this by adjusting the gpa with respect to the host bridge base address
in the same way as is done for x86.
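
To illustrate the difference, a small sketch with hypothetical names. It
assumes the usual ECAM layout, where bits 12-27 of the offset into the
window encode bus, device and function; the base addresses used with it
are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 12-27 of an ECAM offset hold bus (8), device (5) and function (3). */
#define ECAM_BDF(addr) (((addr) & 0x0ffff000ULL) >> 12)

/* Decode straight from the guest physical address. */
static uint16_t bdf_naive(uint64_t gpa)
{
    return ECAM_BDF(gpa);
}

/* Decode from the offset into the host bridge's ECAM window. */
static uint16_t bdf_adjusted(uint64_t gpa, uint64_t base)
{
    assert(gpa >= base);
    return ECAM_BDF(gpa - base);
}
```

Both variants agree as long as the base has bits 0 to 27 clear. For a
base such as 0x10100000 the naive decode of the config space of bus 1
yields bus 2, while the adjusted one still yields bus 1.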

Please note that this change is not strictly required given the current
value of GUEST_VPCI_ECAM_BASE which has bits 0 to 27 clear, but could cause
issues if such value is changed, or when handlers for dom0 ECAM
regions are added as those will be mapped over existing hardware
regions that could use non-aligned base addresses.

Fixes: d59168dc05a5 ("xen/arm: Enable the existing x86 virtual PCI support for ARM")
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoRevert "tools: disable building qemu-trad per default"
Ian Jackson [Wed, 3 Nov 2021 15:20:02 +0000 (15:20 +0000)]
Revert "tools: disable building qemu-trad per default"

Unfortunately this breaks the gitlab CI.  See mails on-list.

This reverts commit ce309942c791628ff42082d1b74bfaeaa5267ae0.

3 years agox86/shstk: Fix use of shadow stacks with XPTI active
Andrew Cooper [Mon, 1 Nov 2021 20:45:26 +0000 (20:45 +0000)]
x86/shstk: Fix use of shadow stacks with XPTI active

The call to setup_cpu_root_pgt(0) in smp_prepare_cpus() is too early.  It
clones the BSP's stack while the .data mapping is still in use, causing all
mappings to be fully read/write (and with no guard pages either).  This
ultimately causes #DF when trying to enter the dom0 kernel for the first time.

Defer setting up the BSP's XPTI pagetable until reinit_bsp_stack() after we've set
up proper shadow stack permissions.

Fixes: 60016604739b ("x86/shstk: Rework the stack layout to support shadow stacks")
Fixes: b60ab42db2f0 ("x86/shstk: Activate Supervisor Shadow Stacks")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agotools: disable building qemu-trad per default
Juergen Gross [Fri, 10 Sep 2021 05:55:18 +0000 (07:55 +0200)]
tools: disable building qemu-trad per default

Using qemu-traditional as device model has been deprecated for some time now.

So change the default for building it to "disable". This will affect
ioemu-stubdom, too, as there is a direct dependency between the two.

Today it is possible to use a PVH/HVM Linux-based stubdom as device
model. Additionally, using ioemu-stubdom doesn't really help security,
as it requires running a very old and potentially buggy qemu version in
a PV domain. This probably adds more security problems than using a
stubdom removes.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoupdate system time immediately when VCPUOP_register_vcpu_info
Dongli Zhang [Wed, 3 Nov 2021 09:19:06 +0000 (10:19 +0100)]
update system time immediately when VCPUOP_register_vcpu_info

The guest may access the pv vcpu_time_info immediately after
VCPUOP_register_vcpu_info. This borrows the idea from
VCPUOP_register_vcpu_time_memory_area, where
force_update_vcpu_system_time() is called immediately when the new memory
area is registered.

Otherwise, we may observe clock drift on the VM side if the VM accesses
the clocksource immediately after VCPUOP_register_vcpu_info().
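
To see why the record has to be current, here is a minimal sketch of how a
guest derives system time from a vcpu_time_info record. The field names
follow the public vcpu_time_info layout; the function name and the example
numbers are hypothetical, and the version-based retry loop a real guest
needs is left out.

```python
MASK64 = (1 << 64) - 1


def pv_system_time_ns(tsc_now, tsc_timestamp, system_time,
                      tsc_to_system_mul, tsc_shift):
    """Return nanoseconds of system time for the TSC value tsc_now.

    The result is extrapolated from the (tsc_timestamp, system_time)
    snapshot using a 32.32 fixed-point multiplier, so it is only as good
    as the snapshot the hypervisor last wrote.
    """
    delta = (tsc_now - tsc_timestamp) & MASK64
    if tsc_shift >= 0:
        delta = (delta << tsc_shift) & MASK64
    else:
        delta >>= -tsc_shift
    return system_time + ((delta * tsc_to_system_mul) >> 32)


# Illustrative 1 GHz TSC: one tick per nanosecond, so mul = 1.0 in 32.32.
ONE_NS_PER_TICK = 1 << 32
```

For example, 4000 ticks after a snapshot taken at system_time = 10**9 the
guest computes 10**9 + 4000 ns. If the fields have not been written yet
(assumed here, purely for illustration, to read as zero), the same
computation yields 0, which the guest cannot tell apart from a real value.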

Reference: https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg00571.html
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agox86: de-duplicate MONITOR/MWAIT CPUID-related definitions
Jan Beulich [Wed, 3 Nov 2021 09:17:47 +0000 (10:17 +0100)]
x86: de-duplicate MONITOR/MWAIT CPUID-related definitions

As of 724b55f48a6c ("x86: introduce MWAIT-based, ACPI-less CPU idle
driver") they (also) live in asm/mwait.h; no idea how I missed the
duplicates back at the time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoREADME, xen/Makefile: Change version to 4.16-rc 4.16.0-rc1
Ian Jackson [Mon, 1 Nov 2021 12:36:26 +0000 (12:36 +0000)]
README, xen/Makefile: Change version to 4.16-rc

Signed-off-by: Ian Jackson <iwj@xenproject.org>
3 years agoConfig.mk: pin QEMU_UPSTREAM_REVISION (prep for Xen 4.16 RC1)
Ian Jackson [Mon, 1 Nov 2021 12:33:54 +0000 (12:33 +0000)]
Config.mk: pin QEMU_UPSTREAM_REVISION (prep for Xen 4.16 RC1)

Signed-off-by: Ian Jackson <iwj@xenproject.org>
3 years agoautomation: add a QEMU based x86_64 Dom0/DomU test
Stefano Stabellini [Fri, 29 Oct 2021 16:33:38 +0000 (09:33 -0700)]
automation: add a QEMU based x86_64 Dom0/DomU test

Introduce a test based on QEMU to run Xen, Dom0 and start a DomU.
This is similar to the existing qemu-alpine-arm64.sh script and test.
The only differences are:
- use Debian's qemu-system-x86_64 (on ARM we build our own)
- use ipxe instead of u-boot and ImageBuilder

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years agoautomation: Linux 5.10.74 test-artifact
Stefano Stabellini [Tue, 26 Oct 2021 00:55:40 +0000 (17:55 -0700)]
automation: Linux 5.10.74 test-artifact

Build a 5.10 kernel to be used as Dom0 and DomU kernel for testing. This
is almost the same as the existing ARM64 recipe for Linux 5.9; the
only differences are:
- upgrade to latest 5.10.x stable
- force Xen modules to built-in (on ARM it was already done by defconfig)

Also add the exporting job to build.yaml so that the binary can be used
during gitlab-ci runs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years agoautomation: add x86_64 alpine 3.12 test-artifact
Stefano Stabellini [Fri, 22 Oct 2021 20:30:49 +0000 (13:30 -0700)]
automation: add x86_64 alpine 3.12 test-artifact

It is the same as the existing ARM64 alpine 3.12 test-artifact. It is
used to export an Alpine rootfs for Dom0 used for testing.

Also add the exporting job to build.yaml so that the binaries can be
used during gitlab-ci runs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years agox86/cpuid: prevent decreasing of hypervisor max leaf on migration
Roger Pau Monne [Wed, 27 Oct 2021 14:00:50 +0000 (16:00 +0200)]
x86/cpuid: prevent decreasing of hypervisor max leaf on migration

In order to be compatible with previous Xen versions, and not change
max hypervisor leaf as a result of a migration, keep the clamping of
the maximum leaf value provided to XEN_CPUID_MAX_NUM_LEAVES, instead
of doing it based on the domain type. Also set the default maximum
leaf without taking the domain type into account. The maximum
hypervisor leaf is not migrated, so we need the default to not regress
beyond what might already be reported to a guest by existing Xen
versions.

This is a partial revert of 540d911c28. It restores the previous
behaviour and ensures that HVM guests won't see their maximum
hypervisor leaf reduced from 5 to 4 as a result of a migration.
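
The restored policy can be sketched as below. This is an illustration in
Python rather than the Xen code: XEN_CPUID_MAX_NUM_LEAVES is the constant
named above, its value of 5 is inferred from the "5 to 4" remark, the
function name is hypothetical, and any lower bound the real code applies is
left out.

```python
XEN_CPUID_MAX_NUM_LEAVES = 5  # inferred from the commit text, see lead-in


def max_hypervisor_leaf(configured=None):
    """Return the maximum hypervisor leaf reported to a guest.

    The domain type is deliberately not a parameter: both the default and
    the clamp depend only on XEN_CPUID_MAX_NUM_LEAVES, so the value cannot
    shrink just because the guest was migrated to a newer hypervisor.
    """
    if configured is None:
        return XEN_CPUID_MAX_NUM_LEAVES
    return min(configured, XEN_CPUID_MAX_NUM_LEAVES)
```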

Fixes: 540d911c28 ('x86/CPUID: shrink max_{,sub}leaf fields according to actual leaf contents')
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agox86/hpet: setup HPET even when disabled due to stopping in deep C states
Roger Pau Monne [Tue, 26 Oct 2021 15:12:33 +0000 (17:12 +0200)]
x86/hpet: setup HPET even when disabled due to stopping in deep C states

Always allow the HPET to be set up, but don't report a frequency back
to the platform time source probe, to prevent it from being selected
as a valid timer if it's not usable.

Doing the setup even when not intended to be used as a platform timer
is required so that it can be used in legacy replacement mode in order
to assert the IO-APIC is capable of receiving interrupts.
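
The split between "set up" and "selected" can be sketched as follows. All
names here are hypothetical and the model is far simpler than the real
probe; it only assumes that a source which reports no frequency is skipped.

```python
class Hpet:
    """Toy HPET model: setup always happens, the frequency may be withheld."""

    def __init__(self, frequency_hz, stops_in_deep_cstates):
        self.frequency_hz = frequency_hz
        self.stops_in_deep_cstates = stops_in_deep_cstates
        self.is_set_up = False

    def init(self):
        self.is_set_up = True  # hardware setup is unconditional
        if self.stops_in_deep_cstates:
            return 0           # report no frequency: not a platform timer
        return self.frequency_hz


class FixedTimer:
    """Toy fallback source that is always usable."""

    def __init__(self, frequency_hz):
        self.frequency_hz = frequency_hz

    def init(self):
        return self.frequency_hz


def probe_platform_timer(sources):
    """Return (name, frequency) of the first source reporting a frequency."""
    for name, source in sources:
        frequency = source.init()
        if frequency:
            return name, frequency
    return None
```

In this model an HPET that stops in deep C states is passed over by the
probe, yet it has still been initialised and so remains available for other
uses such as the legacy replacement mode check.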

Fixes: c12731493a ('x86/hpet: Use another crystalball to evaluate HPET usability')
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
3 years agoautomation: actually build with clang for ubuntu-focal-clang* jobs
Anthony PERARD [Fri, 22 Oct 2021 16:36:44 +0000 (17:36 +0100)]
automation: actually build with clang for ubuntu-focal-clang* jobs

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 years agoxen/arm: vgic: Ignore write access to ICPENDR*
Hongda Deng [Thu, 21 Oct 2021 12:03:19 +0000 (20:03 +0800)]
xen/arm: vgic: Ignore write access to ICPENDR*

Currently, Xen will return IO unhandled when guests write ICPENDR*
virtual registers, which will raise a data abort inside the guest.
Linux guests do not access these virtual registers, but Zephyr does
so during initialization, so a Zephyr guest will get an IO data abort
and crash. Emulating ICPENDR is not easy with the existing vGIC, so
this patch reworks the emulation to ignore write access to ICPENDR*
virtual registers and print a message about whether they are already
pending instead of returning unhandled.
More details can be found at [1].

[1] https://github.com/zephyrproject-rtos/zephyr/blob/eaf6cf745df3807e6ecc941c3a30de6c179ae359/drivers/interrupt_controller/intc_gicv3.c#L274
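
The "ignore and log" behaviour can be sketched as below. This is a toy model
in Python, not the vGIC code: the function name, the log format and the
pending set are hypothetical, while the register range (distributor offsets
0x280 to 0x2FC, 32 interrupts per register) follows the GIC architecture.

```python
GICD_ICPENDR_FIRST = 0x280
GICD_ICPENDR_LAST = 0x2FC


def distributor_write(offset, value, pending_irqs, log):
    """Handle a 32-bit guest write to the emulated distributor.

    Returns True when the access was handled and False when it was not.
    In this sketch every other register is simply reported as unhandled.
    """
    if GICD_ICPENDR_FIRST <= offset <= GICD_ICPENDR_LAST:
        first_irq = (offset - GICD_ICPENDR_FIRST) // 4 * 32
        for bit in range(32):
            if (value >> bit) & 1:
                irq = first_irq + bit
                state = "pending" if irq in pending_irqs else "not pending"
                log.append(f"ignoring ICPENDR write for IRQ {irq} ({state})")
        return True  # write ignored, but no data abort for the guest
    return False
```

The pending state is left untouched on purpose: the write is acknowledged so
the guest keeps running, and the log line records whether ignoring it could
have mattered.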

Signed-off-by: Hongda Deng <hongda.deng@arm.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
3 years agotools/xenstored: Ignore domain we were unable to restore
Julien Grall [Wed, 20 Oct 2021 14:45:19 +0000 (14:45 +0000)]
tools/xenstored: Ignore domain we were unable to restore

Commit 939775cfd3 "handle dying domains in live update" was meant to
gracefully handle dying domains. However, the @releaseDomain watch
will end up being sent as soon as we finish restoring Xenstored
state.

This may be before Xen reports the domain to be dying (such as if
the guest decided to revoke access to the xenstore page). Consequently,
daemons like xenconsoled will not clean up the domain and it will be
left as a zombie.

To avoid the problem, mark the connection as ignored. This also
requires tweaking conn_can_write() and conn_can_read() to prevent
dereferencing a NULL pointer (the interface will not be mapped).

The check conn->is_ignored was originally added after the callbacks
because the helpers for a socket connection may close the fd. However,
ignore_connection() will close a socket connection directly, so it is
fine to reorder the checks.
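
The reordering can be sketched as follows. The names conn_can_read(),
conn_can_write() and is_ignored come from the text above; everything else
(the Connection class, the ring field names, the ring size) is a
hypothetical stand-in for the real structures, written in Python for
brevity.

```python
RING_SIZE = 1024  # illustrative ring capacity


class Connection:
    """Toy connection: interface is None when the ring was never mapped."""

    def __init__(self, interface=None, is_ignored=False):
        self.interface = interface
        self.is_ignored = is_ignored


def conn_can_read(conn):
    # Check the flag first: an ignored connection has no mapped interface,
    # so touching it would be the equivalent of a NULL dereference.
    if conn.is_ignored:
        return False
    return conn.interface["req_prod"] != conn.interface["req_cons"]


def conn_can_write(conn):
    if conn.is_ignored:
        return False
    used = conn.interface["rsp_prod"] - conn.interface["rsp_cons"]
    return used < RING_SIZE
```

An ignored connection therefore never reports readiness, and the main loop
skips it without looking at the missing interface.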

Signed-off-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>