xenbits.xensource.com Git - people/jgross/xen.git/log
people/jgross/xen.git
4 years agolibxl / libxlu: support 'xl pci-attach/detach' by name staging origin/staging
Paul Durrant [Tue, 5 Jan 2021 17:46:42 +0000 (17:46 +0000)]
libxl / libxlu: support 'xl pci-attach/detach' by name

This patch modifies libxlu_pci_parse_spec_string() to parse the new 'name'
parameter of PCI_SPEC_STRING detailed in the updated documentation in
xl-pci-configuration(5) and populate the 'name' field of 'libxl_device_pci'.

If the 'name' field is non-NULL then both libxl_device_pci_add() and
libxl_device_pci_remove() will use it to look up the device BDF in
the list of assignable devices.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
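The name-based lookup described in the commit above can be pictured with a minimal sketch. The structure and function below are hypothetical stand-ins, not libxl's actual types or API; they only illustrate resolving a name to a BDF by scanning a list of assignable devices.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an assignable-device entry. */
struct assignable_dev {
    const char *name;                     /* NULL if the device was never named */
    unsigned int domain, bus, dev, func;  /* the BDF the name resolves to */
};

/*
 * Return the entry whose name matches, or NULL if there is none.
 * Entries without a name are skipped.
 */
static const struct assignable_dev *
lookup_by_name(const struct assignable_dev *list, size_t nr, const char *name)
{
    size_t i;

    if (name == NULL)
        return NULL;

    for (i = 0; i < nr; i++)
        if (list[i].name != NULL && strcmp(list[i].name, name) == 0)
            return &list[i];

    return NULL;
}
```

A caller would fall back to the BDF given by the user when no name is supplied, and report an error when a name is supplied but no entry matches.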
4 years agodocs/man: modify xl-pci-configuration(5) to add 'name' field to PCI_SPEC_STRING
Paul Durrant [Tue, 5 Jan 2021 17:46:41 +0000 (17:46 +0000)]
docs/man: modify xl-pci-configuration(5) to add 'name' field to PCI_SPEC_STRING

Since assignable devices can be named, a subsequent patch will support use
of a PCI_SPEC_STRING containing a 'name' parameter instead of a 'bdf'. In
this case the name will be used to look up the 'bdf' in the list of assignable
(or assigned) devices.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoxl: support naming of assignable devices
Paul Durrant [Tue, 5 Jan 2021 17:46:40 +0000 (17:46 +0000)]
xl: support naming of assignable devices

With this patch applied 'xl pci-assignable-add' will take an optional '--name'
parameter, 'xl pci-assignable-remove' can be passed either a BDF or a name and
'xl pci-assignable-list' will take an optional '--show-names' flag which
determines whether names are displayed in its output.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agolibxl: add 'name' field to 'libxl_device_pci' in the IDL...
Paul Durrant [Tue, 5 Jan 2021 17:46:39 +0000 (17:46 +0000)]
libxl: add 'name' field to 'libxl_device_pci' in the IDL...

... and modify libxl_pci_bdf_assignable_add/remove/list() to make use of it.

libxl_pci_bdf_assignable_add() will store the name of the device in xenstore
if the field is specified (i.e. non-NULL) and libxl_pci_bdf_assignable_remove()
will remove devices specified only by name, looking up the BDF as necessary.

libxl_pci_bdf_assignable_list() will also populate the 'name' field if a name
was stored by libxl_pci_bdf_assignable_add().

NOTE: This patch also fixes whitespace in the declaration of 'libxl_device_pci'
      in the IDL.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agolibxl: stop setting 'vdevfn' in pci_struct_fill()
Paul Durrant [Tue, 5 Jan 2021 17:46:38 +0000 (17:46 +0000)]
libxl: stop setting 'vdevfn' in pci_struct_fill()

There are only two call-sites. One always sets it to 0 (which is unnecessary
as the structure is already initialized to zero) and the other can simply set
the 'vdevfn' field directly (after proper structure initialization), avoiding
the need for a local variable.

A subsequent patch will also make use of pci_struct_fill() in a context
where 'vdevfn' may already have been set.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agolibxlu: introduce xlu_pci_parse_spec_string()
Paul Durrant [Tue, 5 Jan 2021 17:46:37 +0000 (17:46 +0000)]
libxlu: introduce xlu_pci_parse_spec_string()

This patch largely re-writes the code to parse a PCI_SPEC_STRING and enters
it via the newly introduced function. The new parser also deals with 'bdf'
and 'vslot' as non-positional parameters, as per the documentation in
xl-pci-configuration(5).

The existing xlu_pci_parse_bdf() function remains, but now strictly parses
BDF values. Some existing callers of xlu_pci_parse_bdf() are
modified to call xlu_pci_parse_spec_string() as per the documentation in xl(1).

NOTE: Usage text in xl_cmdtable.c and error messages are also modified
      appropriately.
      As a side-effect this patch also fixes a bug where using '*' to specify
      all functions would lead to an assertion failure at the end of
      xlu_pci_parse_bdf().

Fixes: d25cc3ec93eb ("libxl: workaround gcc 10.2 maybe-uninitialized warning")
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agodocs/man: modify xl(1) in preparation for naming of assignable devices
Paul Durrant [Tue, 5 Jan 2021 17:46:36 +0000 (17:46 +0000)]
docs/man: modify xl(1) in preparation for naming of assignable devices

A subsequent patch will introduce code to allow a name to be specified to
'xl pci-assignable-add' such that the assignable device may be referred to
by that name in subsequent operations.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agox86/dpci: do not remove pirqs from domain tree on unbind
Roger Pau Monné [Thu, 21 Jan 2021 15:11:41 +0000 (16:11 +0100)]
x86/dpci: do not remove pirqs from domain tree on unbind

A fix for a previous issue removed the pirqs from the domain tree when
they are unbound in order to prevent shared pirqs from triggering a
BUG_ON in __pirq_guest_unbind if they are unbound multiple times. That
caused free_domain_pirqs to no longer unmap the pirqs because they
are gone from the domain pirq tree, thus leaving stale unbound pirqs
after domain destruction if the domain had mapped dpci pirqs after
shutdown.

Take a different approach to fix the original issue: instead of
removing the pirq from d->pirq_tree, clear the flags of the dpci pirq
struct to signal that the pirq is now unbound. This prevents calling
pirq_guest_unbind multiple times for the same pirq without having to
remove it from the domain pirq tree.

This is XSA-360.

Fixes: 5b58dad089 ('x86/pass-through: avoid double IRQ unbind during domain cleanup')
Reported-by: Samuel Verschelde <samuel.verschelde@vates.fr>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
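The approach taken by the dpci commit above (mark the entry as unbound instead of removing it) can be sketched as follows. The flag name, the structure and the counter are hypothetical and much simpler than Xen's real data structures; the point is only that a cleared flags field makes a second unbind a no-op while the entry stays available for later cleanup.

```c
#include <stdbool.h>

/* Hypothetical flag and structure; not Xen's actual definitions. */
#define PIRQ_FLAG_BOUND 0x1u

struct pirq_state {
    unsigned int flags;         /* 0 means "not bound" */
    bool in_tree;               /* still tracked by the owning domain */
    unsigned int unbind_calls;  /* counts how often real unbind work ran */
};

/* Unbind at most once, and keep the entry in the tree for later cleanup. */
static bool pirq_unbind(struct pirq_state *p)
{
    if (!(p->flags & PIRQ_FLAG_BOUND))
        return false;           /* already unbound: nothing to do */

    p->unbind_calls++;          /* stands in for the real unbind work */
    p->flags = 0;               /* signal "unbound" without removing the entry */
    return true;
}
```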
4 years agoxen/arm: Don't ignore the affinity level 3 in the MPIDR
Wei Chen [Fri, 8 Jan 2021 06:29:53 +0000 (14:29 +0800)]
xen/arm: Don't ignore the affinity level 3 in the MPIDR

Currently, Xen assumes that all the affinity bits are defined below
bit 32. However, Arm64 defines a 3rd affinity level in bits 32-39.

The function gicv3_send_sgi_list in the GICv3 driver will compute the
cluster using the following code:

uint64_t cluster_id = cpu_logical_map(cpu) & ~MPIDR_AFF0_MASK;

Because MPIDR_AFF0_MASK is defined as a 32-bit value, we will miss out
the 3rd level affinity. As a consequence, the IPI would not be sent to
the correct vCPU.

This particular error can be solved by switching MPIDR_AFF0_MASK to use
unsigned long. However, take the opportunity to switch all the MPIDR_*
defines to use unsigned long to avoid any further issues.

Signed-off-by: Wei Chen <wei.chen@arm.com>
[julien: Reword the commit message]
Reviewed-by: Julien Grall <jgrall@amazon.com>
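The truncation described in the MPIDR commit above follows from C's integer conversion rules and can be reproduced in isolation. The macro and function names below are illustrative, not Xen's, and the second variant assumes an LP64 target where unsigned long is 64 bits wide.

```c
#include <stdint.h>

/* Illustrative definitions only; the names are not Xen's. */
#define AFF0_MASK_32BIT 0xffU    /* unsigned int: ~ yields a 32-bit value */
#define AFF0_MASK_LONG  0xffUL   /* unsigned long: 64-bit on LP64 targets */

/* ~0xffU is 0xffffff00; zero-extended to 64 bits it clears bits 32-63. */
static uint64_t cluster_id_truncated(uint64_t mpidr)
{
    return mpidr & ~AFF0_MASK_32BIT;
}

/* ~0xffUL keeps all upper bits set, so Aff3 (bits 32-39) survives. */
static uint64_t cluster_id_full(uint64_t mpidr)
{
    return mpidr & ~AFF0_MASK_LONG;
}
```

With an MPIDR value of 0x0000000100000203, the first function yields 0x200 and silently loses the level 3 affinity, while the second yields 0x100000200.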
4 years agoxen/irq: Propagate the error from init_one_desc_irq() in init_*_irq_data()
Julien Grall [Sat, 28 Nov 2020 11:36:42 +0000 (11:36 +0000)]
xen/irq: Propagate the error from init_one_desc_irq() in init_*_irq_data()

init_one_desc_irq() can return an error if it is unable to allocate
memory. While this is unlikely to happen during boot (called from
init_{,local_}irq_data()), it is better to harden the code by
propagating the return value.

Spotted by coverity.

CID: 106529

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
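The hardening described in the xen/irq commit above amounts to checking each per-descriptor return value and passing the first failure up. The sketch below uses hypothetical helper names and a made-up descriptor count; it is not the code from the patch.

```c
#include <errno.h>

#define NR_DESCS 4   /* hypothetical number of descriptors */

/* Stand-in for init_one_desc_irq(): fails for one chosen index. */
static int init_one(unsigned int idx, int fail_at)
{
    return ((int)idx == fail_at) ? -ENOMEM : 0;
}

/* Stop at the first failure and hand the error code back to the caller. */
static int init_all(int fail_at)
{
    unsigned int i;

    for (i = 0; i < NR_DESCS; i++) {
        int rc = init_one(i, fail_at);

        if (rc)
            return rc;
    }

    return 0;
}
```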
4 years agoxen/arm: Relax GIC version check
Vladimir Murzin [Wed, 20 Jan 2021 11:26:44 +0000 (11:26 +0000)]
xen/arm: Relax GIC version check

Supported values are

0b0000 GIC CPU interface system registers not implemented.

0b0001 System register interface to versions 3.0 and 4.0 of the GIC
       CPU interface is supported.

0b0011 System register interface to version 4.1 of the GIC CPU
       interface is supported.

4.1 is still backward compatible with 4.0/3.0; moreover, the ARM ARM
guarantees that future versions of the GIC CPU interface remain
backwards compatible.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
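The difference between a strict and a relaxed check for the GIC commit above can be sketched as below. The GIC field occupies bits [27:24] of ID_AA64PFR0_EL1; the helper names are hypothetical, and the exact comparison used in the Xen patch is an assumption here.

```c
#include <stdbool.h>
#include <stdint.h>

#define PFR0_GIC_SHIFT 24      /* ID_AA64PFR0_EL1.GIC is bits [27:24] */
#define PFR0_GIC_MASK  0xfULL

static unsigned int pfr0_gic_field(uint64_t pfr0)
{
    return (unsigned int)((pfr0 >> PFR0_GIC_SHIFT) & PFR0_GIC_MASK);
}

/* Strict check: accepts only 0b0001 (versions 3.0 and 4.0). */
static bool gic_sysregs_strict(uint64_t pfr0)
{
    return pfr0_gic_field(pfr0) == 1;
}

/* Relaxed check: any non-zero value, so 0b0011 (version 4.1) passes too. */
static bool gic_sysregs_relaxed(uint64_t pfr0)
{
    return pfr0_gic_field(pfr0) != 0;
}
```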
4 years agoxen/arm: Hide Pointer Authentication (PAC)
Vladimir Murzin [Wed, 20 Jan 2021 11:27:12 +0000 (11:27 +0000)]
xen/arm: Hide Pointer Authentication (PAC)

The ARMv8.3 Pointer Authentication extension is not supported by Xen
at the moment, so do not expose that via ID register.

Signed-off-by: Vladimir Murzin <vladimir.murzin@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
[julien: s/__res0/__res2/ to avoid name duplication]
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/arm: Add defensive barrier in get_cycles for Arm64
Wei Chen [Fri, 8 Jan 2021 06:21:26 +0000 (14:21 +0800)]
xen/arm: Add defensive barrier in get_cycles for Arm64

Per the discussion [1] on the mailing list, we had better have a
barrier after reading CNTPCT in get_cycles. Without such a barrier,
if get_cycles is used in a seqlock critical section in the future,
the counter read could be speculated outside of that section.

We import Linux commit 75a19a0202db21638a1c2b424afb867e1f9a2376:
    arm64: arch_timer: Ensure counter register reads occur with seqlock held

    When executing clock_gettime(), either in the vDSO or via a system call,
    we need to ensure that the read of the counter register occurs within
    the seqlock reader critical section. This ensures that updates to the
    clocksource parameters (e.g. the multiplier) are consistent with the
    counter value and therefore avoids the situation where time appears to
    go backwards across multiple reads.

    Extend the vDSO logic so that the seqlock critical section covers the
    read of the counter register as well as accesses to the data page. Since
    reads of the counter system registers are not ordered by memory barrier
    instructions, introduce dependency ordering from the counter read to a
    subsequent memory access so that the seqlock memory barriers apply to
    the counter access in both the vDSO and the system call paths.

    Cc: <stable@vger.kernel.org>
    Cc: Marc Zyngier <marc.zyngier@arm.com>
    Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
    Link: https://lore.kernel.org/linux-arm-kernel/alpine.DEB.2.21.1902081950260.1662@nanos.tec.linutronix.de/
    Reported-by: Thomas Gleixner <tglx@linutronix.de>
    Signed-off-by: Will Deacon <will.deacon@arm.com>

While we are not aware of such use in Xen, it would be best to add the
barrier to avoid any surprise.

In order to reduce the impact of the new barrier, we prefer to
use enforce order instead of ISB [2].

Currently, enforce order is not applied to arm32 as this is
not done in Linux at the date of this patch. If this is done
in Linux it will need to be also done in Xen.

To avoid adding read_cntpct_enforce_ordering everywhere, we introduce
a new helper, read_cntpct_stable, to replace the original get_cycles, and
turn get_cycles into a wrapper to which read_cntpct_enforce_ordering can
easily be added.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-12/msg00181.html
[2] https://lkml.org/lkml/2020/3/13/645

Signed-off-by: Wei Chen <wei.chen@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/gnttab: Log when grant_table_init() fails
Andrew Cooper [Tue, 19 Jan 2021 11:08:17 +0000 (11:08 +0000)]
xen/gnttab: Log when grant_table_init() fails

... so debug builds can see what went wrong, rather than getting an
unqualified -EINVAL out of domain creation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/domain: Introduce vcpu_teardown()
Andrew Cooper [Mon, 28 Sep 2020 13:17:02 +0000 (14:17 +0100)]
xen/domain: Introduce vcpu_teardown()

Similarly to c/s 98d4d6d8a6 "xen/domain: Introduce domain_teardown()",
introduce a common mechanism for restartable per-vcpu teardown logic.

Extend the PROGRESS() mechanism to support saving and restoring the vcpu loop
variable across hypercalls.

This will eventually supersede domain_relinquish_resources(), and reduce the
quantity of redundant logic performed.

No functional change (yet).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/CPUID: unconditionally set XEN_HVM_CPUID_IOMMU_MAPPINGS
Roger Pau Monné [Tue, 19 Jan 2021 15:04:06 +0000 (16:04 +0100)]
x86/CPUID: unconditionally set XEN_HVM_CPUID_IOMMU_MAPPINGS

This is a revert of f5cfa0985673 plus a rework of the comment that
accompanies the setting of the flag so we don't forget why it needs to
be unconditionally set: it's indicating whether the version of Xen has
the original issue fixed and IOMMU entries are created for
grant/foreign maps.

If the flag is only exposed when the IOMMU is enabled the guest could
resort to use bounce buffers when running backends as it would assume
the underlying Xen version still has the bug present and thus
grant/foreign maps cannot be used with devices.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agokconfig: ensure strndup() declaration is visible
Jan Beulich [Tue, 19 Jan 2021 15:03:41 +0000 (16:03 +0100)]
kconfig: ensure strndup() declaration is visible

Its guard was updated such that it is visible by default when POSIX 2008
was adopted by glibc. It's not visible by default on older glibc.

Fixes: f80fe2b34f08 ("xen: Update Kconfig to Linux v5.4")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
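The visibility problem in the kconfig commit above is governed by feature test macros. A minimal sketch, assuming a glibc-style libc: requesting POSIX.1-2008 before the first include makes the strndup() declaration visible on older versions as well. Whether the Xen patch uses this macro or a different one is not stated in the log, and dup_prefix is a hypothetical wrapper.

```c
/* Must come before the first #include to take effect. */
#define _POSIX_C_SOURCE 200809L

#include <stdlib.h>
#include <string.h>

/* Duplicate at most n bytes of s; the caller frees the result. */
static char *dup_prefix(const char *s, size_t n)
{
    return strndup(s, n);
}
```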
4 years agotools/gdbsx: Use right path for privcmd on NetBSD
Manuel Bouyer [Tue, 12 Jan 2021 18:12:28 +0000 (19:12 +0100)]
tools/gdbsx: Use right path for privcmd on NetBSD

On NetBSD the privcmd interface node is /kern/xen/privcmd

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agotools/xenstat: Remove unused NetBSD code
Manuel Bouyer [Tue, 12 Jan 2021 18:12:42 +0000 (19:12 +0100)]
tools/xenstat: Remove unused NetBSD code

Remove PROCNETDEV_HEADER[] and read_attributes_vbd(); gcc complains that
they are unused.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agotools/xenpaging: include errno.h
Manuel Bouyer [Tue, 12 Jan 2021 18:12:40 +0000 (19:12 +0100)]
tools/xenpaging: include errno.h

Include errno.h to get the writable definition of errno on NetBSD.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agotools/xenbackendd: Remove xenbackendd
Manuel Bouyer [Tue, 12 Jan 2021 18:12:26 +0000 (19:12 +0100)]
tools/xenbackendd: Remove xenbackendd

NetBSD doesn't need xenbackendd with xl toolstack so don't build it.
Remove now unused xenbackendd directory/files, and remaining references
in the hotplug scripts.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
[Also clean up stale comments in the Linux xencommons script]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibs/evtchn: fix build on NetBSD
Manuel Bouyer [Mon, 18 Jan 2021 18:38:41 +0000 (18:38 +0000)]
libs/evtchn: fix build on NetBSD

Use xenio3.h for ioctl definitions

read_exact/write_exact seem not to be available here, which causes a gcc
error.  Use plain read/write, the xenevtchn interface won't do partial
read/write on NetBSD anyway so it should be safe.  This is in line with the
rest of the OS specific helpers.

Fixes: b7f76a699dc ('tools: Refactor /dev/xen/evtchn wrappers into libxenevtchn')
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoocaml/libs/eventchn: drop unneeded evtchn.h
Manuel Bouyer [Tue, 12 Jan 2021 18:12:39 +0000 (19:12 +0100)]
ocaml/libs/eventchn: drop unneeded evtchn.h

On NetBSD xen/sys/evtchn.h is not available any more. Just remove it as it's
not needed.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 years agox86/mem_sharing: fix uninitialized 'preempted' variable
Tamas K Lengyel [Mon, 18 Jan 2021 17:23:06 +0000 (10:23 -0700)]
x86/mem_sharing: fix uninitialized 'preempted' variable

UBSAN catches an uninitialized use of the 'preempted' variable in
fork_hap_allocation when there is no preemption.

Fixes: 41548c5472a ("mem_sharing: VM forking")
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
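The class of bug fixed by the mem_sharing commit above arises when a callee writes an out-parameter on only one path. The functions below are hypothetical and unrelated to the actual fork_hap_allocation code; they show why the caller has to initialise the variable itself.

```c
#include <stdbool.h>

/* Hypothetical worker: writes *preempted only when it stops early. */
static int do_work(unsigned int budget, unsigned int needed, bool *preempted)
{
    if (needed > budget) {
        *preempted = true;
        return -1;
    }

    return 0;
}

static bool work_was_preempted(unsigned int budget, unsigned int needed)
{
    bool preempted = false;   /* initialise: the callee may never write it */

    do_work(budget, needed, &preempted);
    return preempted;
}
```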
4 years agoxen/domain: Reposition config copying in domain_create()
Andrew Cooper [Mon, 18 Jan 2021 14:50:57 +0000 (14:50 +0000)]
xen/domain: Reposition config copying in domain_create()

This is cleanup for two pending series which will copy more data than just
flags from config.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agotools/libxenhypfs: fix reading of gzipped string
Juergen Gross [Mon, 18 Jan 2021 12:06:28 +0000 (13:06 +0100)]
tools/libxenhypfs: fix reading of gzipped string

Reading a gzipped string value from hypfs doesn't add a 0 byte at the
end. Fix that.

Fixes: 86234eafb95295 ("libs: add libxenhypfs")
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
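The libxenhypfs fix above concerns turning a raw byte buffer into a C string. A minimal sketch of that step, with a hypothetical helper name and without the decompression itself, is to allocate one byte more than the payload and write the terminator explicitly.

```c
#include <stdlib.h>
#include <string.h>

/* Copy len raw bytes into a fresh buffer and append a terminating 0 byte. */
static char *bytes_to_string(const void *buf, size_t len)
{
    char *str = malloc(len + 1);   /* one extra byte for the terminator */

    if (str == NULL)
        return NULL;

    memcpy(str, buf, len);
    str[len] = '\0';
    return str;
}
```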
4 years agognttab: consolidate pin-to-status syncing
Jan Beulich [Mon, 18 Jan 2021 11:14:19 +0000 (12:14 +0100)]
gnttab: consolidate pin-to-status syncing

Ever since the fix for XSA-230 the 2nd of the comments ahead of
fixup_status_for_copy_pin() has been stale - there's nothing specific to
transitive grants there anymore.

Move the function up, drop the "copy" part from its name again, add a
"readonly" parameter, and use it also on other paths having decremented
one (or not having got to increment any) of the pin counts.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agognttab: adjust pin count overflow checks
Jan Beulich [Mon, 18 Jan 2021 11:13:42 +0000 (12:13 +0100)]
gnttab: adjust pin count overflow checks

It's at least odd to check counters which aren't going to be
incremented, resulting in failure just because prior operations may have
reached the refcount limit. And it's also not helpful to use open-coded
literal numbers in these checks.

Calculate the increment values first and derive from them the mask to
use in the checks.

Also move the pin count checks ahead of the calculation of the status
(and for copy also sha2) pointers: They're not needed in the failure
cases, and this way the compiler may also have an easier time keeping
the variables at least transiently in registers for the subsequent uses.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
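The idea in the pin count commit above (work out the increment first, then derive the overflow mask from it) can be sketched with a hypothetical layout of four 8-bit counters packed into one 32-bit word. The constants, the function name and the saturation test are assumptions made for illustration and do not reproduce Xen's GNTPIN_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define CNTR_WIDTH 8u
#define CNTR_MAX   ((1u << CNTR_WIDTH) - 1)   /* 0xff */

/*
 * Increment the counter that starts at bit 'shift' (0, 8, 16 or 24).
 * Only that counter is checked, so a saturated neighbour does not
 * cause a failure.
 */
static bool pin_get(uint32_t *pin, unsigned int shift)
{
    uint32_t inc  = UINT32_C(1) << shift;   /* value to add */
    uint32_t mask = inc * CNTR_MAX;         /* bits holding that counter */

    if ((*pin & mask) == mask)
        return false;                       /* this counter is saturated */

    *pin += inc;
    return true;
}
```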
4 years agox86/Dom0: support zstd compressed kernels
Jan Beulich [Mon, 18 Jan 2021 11:12:23 +0000 (12:12 +0100)]
x86/Dom0: support zstd compressed kernels

Taken from Linux at commit 1c4dd334df3a ("lib: decompress_unzstd: Limit
output size") for unzstd.c (renamed from decompress_unzstd.c) and
36f9ff9e03de ("lib: Fix fall-through warnings for Clang") for zstd/,
with bits from linux/zstd.h merged into suitable other headers.

To limit the editing necessary, introduce ptrdiff_t.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolib: introduce xxhash
Jan Beulich [Mon, 18 Jan 2021 11:10:34 +0000 (12:10 +0100)]
lib: introduce xxhash

Taken from Linux at commit d89775fc929c ("lib/: replace HTTP links with
HTTPS ones"), but split into separate 32-bit and 64-bit sources, since
the immediate consumer (zstd) will need only the latter.

Note that the building of this code is restricted to x86 for now because
of the need to sort asm/unaligned.h for Arm.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agointroduce unaligned.h
Jan Beulich [Mon, 18 Jan 2021 11:09:13 +0000 (12:09 +0100)]
introduce unaligned.h

Rather than open-coding commonly used constructs in yet more places when
pulling in zstd decompression support (and its xxhash prereq), pull out
the custom bits into a commonly used header (for the hypervisor build;
the tool stack and stubdom builds of libxenguest will still remain in
need of similarly taking care of). For now this is limited to x86, where
custom logic isn't needed (considering this is going to be used in init
code only, even using alternatives patching to use MOVBE doesn't seem
worthwhile).

For Arm64 with CONFIG_ACPI=y (due to efi-dom0.c's re-use of xz/crc32.c)
drop the not really necessary inclusion of xz's private.h.

No change in generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
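For readers unfamiliar with what an unaligned-access helper does, here is a generic, portable sketch. It is not the content of the header introduced by the commit above, which per the commit message needs no custom logic on x86; the function name is hypothetical.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from a possibly unaligned address. */
static uint32_t load_le32(const void *p)
{
    const uint8_t *b = p;

    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}
```

Assembling the value byte by byte avoids both alignment faults and any dependence on the host's endianness.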
4 years agoxen/arm: livepatch: Include xen/mm.h rather than asm/mm.h
Julien Grall [Fri, 15 Jan 2021 19:29:47 +0000 (19:29 +0000)]
xen/arm: livepatch: Include xen/mm.h rather than asm/mm.h

Livepatch fails to build on Arm after commit ced9795c6cb4 "mm: split
out mfn_t / gfn_t / pfn_t definitions and helpers":

In file included from livepatch.c:13:0:
/oss/xen/xen/include/asm/mm.h:32:28: error: field ‘list’ has incomplete type
     struct page_list_entry list;
                            ^~~~
/oss/xen/xen/include/asm/mm.h:53:43: error: ‘MAX_ORDER’ undeclared here (not in a function); did you mean ‘PFN_ORDER’?
                 unsigned long first_dirty:MAX_ORDER + 1;
                                           ^~~~~~~~~
                                           PFN_ORDER
/oss/xen/xen/include/asm/mm.h:53:31: error: bit-field ‘first_dirty’ width not an integer constant
                 unsigned long first_dirty:MAX_ORDER + 1;
                               ^~~~~~~~~~~

This is happening because asm/mm.h is included directly by livepatch.c.
Yet it depends on xen/mm.h to be included first so MAX_ORDER is defined.

Resolve the build failure by including xen/mm.h rather than asm/mm.h.

Fixes: ced9795c6cb4 ("mm: split out mfn_t / gfn_t / pfn_t definitions and helpers")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoNetBSD: Fix lock directory path
Manuel Bouyer [Tue, 12 Jan 2021 18:12:22 +0000 (19:12 +0100)]
NetBSD: Fix lock directory path

On NetBSD the lock directory is in /var/run/

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoArm: don't hard-code grant table limits in create_domUs()
Jan Beulich [Fri, 15 Jan 2021 15:05:03 +0000 (16:05 +0100)]
Arm: don't hard-code grant table limits in create_domUs()

I can only assume that f2ae59bc4b9b ("Rationalize max_grant_frames and
max_maptrack_frames handling") unintentionally left Arm's create_domUs()
set limits to explicit values, as at least some of the same constraints
apply here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agomm: split out mfn_t / gfn_t / pfn_t definitions and helpers
Jan Beulich [Fri, 15 Jan 2021 15:03:56 +0000 (16:03 +0100)]
mm: split out mfn_t / gfn_t / pfn_t definitions and helpers

xen/mm.h has heavy dependencies, while in a number of cases only these
type definitions are needed. This separation then also allows pulling in
these definitions when including xen/mm.h would cause cyclic
dependencies.

Replace xen/mm.h inclusion where possible in include/xen/. (In
xen/iommu.h also take the opportunity and correct the few remaining
sorting issues.)

While the change could be dropped, remove an unnecessary asm/io.h
inclusion from xen/arch/x86/acpi/power.c. This was the initial attempt
to address build issues with it, until it became clear that the header
itself needs adjustment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoinclude: don't use asm/page.h from common headers
Jan Beulich [Fri, 15 Jan 2021 15:02:13 +0000 (16:02 +0100)]
include: don't use asm/page.h from common headers

Doing so limits what can be done in (in particular included by) this per-
arch header. Abstract out page shift/size related #define-s, which is all
the respective headers care about. Extend the replacement / removal to
some x86 headers as well; some others now need to include page.h (and
they really should have before).

Arm's VADDR_BITS gets dropped altogether: Its current value is clearly
wrong for 64-bit, but the constant also isn't used anywhere right now.

While Arm used vaddr_t in PAGE_OFFSET(), this use is compatible with
that of unsigned long in the new common implementation.

Also drop the dead PAGE_FLAG_MASK at this occasion.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agodocs: update the xenstore migration stream documentation
Juergen Gross [Fri, 15 Jan 2021 08:29:48 +0000 (09:29 +0100)]
docs: update the xenstore migration stream documentation

For live update of Xenstore some records defined in the migration
stream document need to be changed:

- Support of the read-only socket has been dropped from all Xenstore
  implementations, so ro-socket-fd in the global record can be removed.

- Some guests require the event channel to Xenstore to remain the same
  on Xenstore side, so Xenstore has to keep the event channel interface
  open across a live update. For this purpose an evtchn-fd needs to be
  added to the global record.

- With no read-only support the flags field in the connection record
  can be dropped.

- The evtchn field in the connection record needs to be switched to
  hold the port of the Xenstore side of the event channel.

- A flags field needs to be added to permission specifiers in order to
  be able to mark a permission as stale (XSA-322).

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agotools/libxenevtchn: add possibility to not close file descriptor on exec
Juergen Gross [Fri, 15 Jan 2021 08:29:38 +0000 (09:29 +0100)]
tools/libxenevtchn: add possibility to not close file descriptor on exec

Today the file descriptor for the access of the event channel driver
is being closed in case of exec(2). For the support of live update of
a daemon using libxenevtchn this can be problematic, so add a way to
keep that file descriptor open.

Add support of a flag XENEVTCHN_NO_CLOEXEC for xenevtchn_open() which
will result in _not_ setting O_CLOEXEC when opening the event channel
driver node.

The caller can then obtain the file descriptor via xenevtchn_fd().

Add an alternative open function xenevtchn_fdopen() which takes that
file descriptor as an additional parameter. This allows allocating a
xenevtchn_handle and associating it with that file descriptor.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agotools/libxenevtchn: propagate xenevtchn_open() flags parameter
Juergen Gross [Fri, 15 Jan 2021 08:29:37 +0000 (09:29 +0100)]
tools/libxenevtchn: propagate xenevtchn_open() flags parameter

Propagate the flags parameter of xenevtchn_open() to the OS-specific
handlers in order to enable handling them there.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/libxenevtchn: check xenevtchn_open() flags for not supported bits
Juergen Gross [Fri, 15 Jan 2021 08:29:36 +0000 (09:29 +0100)]
tools/libxenevtchn: check xenevtchn_open() flags for not supported bits

Refuse a call of xenevtchn_open() with unsupported bits in flags being
set.

This will change behavior for callers passing junk in flags today,
but those would otherwise get probably unwanted side effects when the
flags they specify today get any meaning. So checking flags is the
right thing to do.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
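Rejecting unknown bits, as the flags commit above describes, is usually a one-line mask test. The flag name, the supported set and the choice of EINVAL below are assumptions made for illustration and are not taken from libxenevtchn.

```c
#include <errno.h>

#define DEMO_FLAG_NO_CLOEXEC (1u << 0)              /* hypothetical flag */
#define DEMO_FLAGS_SUPPORTED DEMO_FLAG_NO_CLOEXEC   /* all known bits */

/* Return 0 if only supported bits are set, else -1 with errno = EINVAL. */
static int check_open_flags(unsigned int flags)
{
    if (flags & ~DEMO_FLAGS_SUPPORTED) {
        errno = EINVAL;
        return -1;
    }

    return 0;
}
```

Checking early keeps currently unused bits free to be given a meaning later without silently changing behaviour for existing callers.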
4 years agotools/libxenevtchn: rename open_flags to flags
Juergen Gross [Fri, 15 Jan 2021 08:29:35 +0000 (09:29 +0100)]
tools/libxenevtchn: rename open_flags to flags

Rename the xenevtchn_open() parameter open_flags to flags as it might
be used for things not passed on to open().

No functional change.
No API/ABI changes.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/libxenevtchn: switch to standard xen coding style
Juergen Gross [Fri, 15 Jan 2021 08:29:34 +0000 (09:29 +0100)]
tools/libxenevtchn: switch to standard xen coding style

There is a mixture of different styles in libxenevtchn. Use the
standard xen style only.

No functional change.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoautomation: use test-artifacts/qemu-system-aarch64 instead of Debian's
Stefano Stabellini [Tue, 5 Jan 2021 22:58:45 +0000 (14:58 -0800)]
automation: use test-artifacts/qemu-system-aarch64 instead of Debian's

Instead of apt-get'ing Debian's qemu-system-aarch64, simply use the
provided QEMU binary under binaries.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add a job to import qemu-system-aarch64 into the pipeline
Stefano Stabellini [Tue, 5 Jan 2021 22:58:44 +0000 (14:58 -0800)]
automation: add a job to import qemu-system-aarch64 into the pipeline

In order to use the pre-built test-artifacts/qemu-system-aarch64 binary
for our tests, first we need to import it into the pipeline. Let's do
that the same way we did it for the kernel and Alpine Linux filesystem:
by creating a special job for it.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add qemu-system-aarch64 to test-artifacts
Stefano Stabellini [Tue, 5 Jan 2021 22:58:43 +0000 (14:58 -0800)]
automation: add qemu-system-aarch64 to test-artifacts

Currently we are using Debian's qemu-system-aarch64 for our tests.
However, sometimes it crashes. It is hard to debug and even harder to
apply any fixes to it.

Instead, build our own QEMU as one of our test-artifacts, which are only
built once, then imported into each pipeline via phony jobs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agostubdom: fix tpm_version
Olaf Hering [Thu, 14 Jan 2021 12:03:23 +0000 (13:03 +0100)]
stubdom: fix tpm_version

It is just a declaration, not a variable.

ld: /home/abuild/rpmbuild/BUILD/xen-4.14.20200616T103126.3625b04991/non-dbg/stubdom/vtpmmgr/vtpmmgr.a(vtpm_cmd_handler.o):(.bss+0x0): multiple definition of `tpm_version'; /home/abuild/rpmbuild/BUILD/xen-4.14.20200616T103126.3625b04991/non-dbg/stubdom/vtpmmgr/vtpmmgr.a(vtpmmgr.o):(.bss+0x0): first defined here

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
4 years agotools/libxenstat: ensure strnlen() declaration is visible
Jan Beulich [Thu, 14 Jan 2021 12:03:01 +0000 (13:03 +0100)]
tools/libxenstat: ensure strnlen() declaration is visible

Its guard was updated such that it is visible by default when POSIX 2008
was adopted by glibc. It's not visible by default on older glibc.

Fixes: 40fe714ca424 ("tools/libs/stat: use memcpy instead of strncpy in getBridge")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoargo: don't pointlessly use get_domain_by_id()
Jan Beulich [Thu, 14 Jan 2021 12:02:35 +0000 (13:02 +0100)]
argo: don't pointlessly use get_domain_by_id()

For short-lived references rcu_lock_domain_by_id() is the better
(slightly cheaper) alternative.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christopher Clark <christopher.w.clark@gmail.com>
4 years agolib: drop (replace) debug_build()
Jan Beulich [Thu, 14 Jan 2021 12:01:14 +0000 (13:01 +0100)]
lib: drop (replace) debug_build()

Its expansion shouldn't be tied to NDEBUG - down the road we may want to
allow enabling assertions independently of CONFIG_DEBUG. Replace the few
uses by a new xen_build_info() helper, subsuming gcov_string at the same
time (while replacing the stale CONFIG_GCOV used there) and also adding
CONFIG_UBSAN indication.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agomemory: avoid pointless continuation in xenmem_add_to_physmap()
Jan Beulich [Thu, 14 Jan 2021 12:00:26 +0000 (13:00 +0100)]
memory: avoid pointless continuation in xenmem_add_to_physmap()

Adjust so we uniformly avoid needlessly arranging for a continuation on
the last iteration.

Fixes: 5777a3742d88 ("IOMMU: hold page ref until after deferred TLB flush")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/arm: don't read aarch32 regs when aarch32 isn't available
Stefano Stabellini [Tue, 12 Jan 2021 23:44:50 +0000 (15:44 -0800)]
xen/arm: don't read aarch32 regs when aarch32 isn't available

Don't read aarch32 system registers at boot time when the aarch32 state
is not available at EL0. They are UNKNOWN, so it is not useful to read
them. Moreover, on Cavium ThunderX reading ID_PFR2_EL1 generates an
unsupported exception which causes a Xen crash.  Instead, only read them
when aarch32 is available.

Leave the corresponding fields in struct cpuinfo_arm so that they
are read-as-zero from a guest.

Since we are editing identify_cpu, also fix the indentation: 4 spaces
instead of 8.

Fixes: 9cfdb489af81 ("xen/arm: Add ID registers and complete cpuinfo")
Link: https://lore.kernel.org/xen-devel/f90e40ee-b042-6cc5-a08d-aef41a279527@suse.com/
Suggested-by: Julien Grall <julien@xen.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/arm: Correct the coding style of get_cycles
Wei Chen [Tue, 5 Jan 2021 07:19:45 +0000 (15:19 +0800)]
xen/arm: Correct the coding style of get_cycles

It seems the arm inline function get_cycles has used 8 spaces for
line indent since 2012. This patch corrects them to 4 spaces and
removes the extra space between function name and bracket.

Signed-off-by: Wei Chen <wei.chen@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agox86/mem_sharing: fix wrong field name used in 2c5119d
Tamas K Lengyel [Wed, 13 Jan 2021 02:28:45 +0000 (18:28 -0800)]
x86/mem_sharing: fix wrong field name used in 2c5119d

The arch_domain struct has "msr", not "msrs".

Spotted by a TravisCI Randconfig build.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools: Move memshrtool from tests/ to misc/
Andrew Cooper [Tue, 12 Jan 2021 18:37:53 +0000 (18:37 +0000)]
tools: Move memshrtool from tests/ to misc/

memshrtool is a tool for a human to use, rather than a test.  Move it into
misc/ as a more appropriate location to live.  Also rename it to
xen-memshare.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agotools: Move xen-access from tests/ to misc/
Andrew Cooper [Tue, 12 Jan 2021 18:37:53 +0000 (18:37 +0000)]
tools: Move xen-access from tests/ to misc/

xen-access is a tool for a human to use, rather than a test.  Move it
into misc/ as a more appropriate location to live.

Move the -DXC_WANT_COMPAT_DEVICEMODEL_API from CFLAGS into xen-access.c itself
to avoid adding Makefile complexity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
4 years agotools/tests: Drop obsolete running scripts
Andrew Cooper [Tue, 12 Jan 2021 18:33:39 +0000 (18:33 +0000)]
tools/tests: Drop obsolete running scripts

The python unit tests were dropped in Xen 4.12 due to being obsolete, but the
scripts to run the tests were missed.  Clean up .gitignore as well.

Also drop the libxenctrl {C,LD}FLAGS adjustments in the Makefile.  This logic
isn't used, and isn't appropriate even in principle, as there are tests in
here which don't want to use libxenctrl.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoxen/memory: Fix compat XENMEM_acquire_resource for size requests
Andrew Cooper [Tue, 28 Jul 2020 10:23:54 +0000 (11:23 +0100)]
xen/memory: Fix compat XENMEM_acquire_resource for size requests

Copy the nr_frames from the structure which actually has the correct value, so
the caller doesn't unconditionally receive 0.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agoxen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE
Andrew Cooper [Mon, 27 Jul 2020 11:28:24 +0000 (12:28 +0100)]
xen/memory: Introduce CONFIG_ARCH_ACQUIRE_RESOURCE

New architectures shouldn't be forced to implement no-op stubs for unused
functionality.

Introduce CONFIG_ARCH_ACQUIRE_RESOURCE which can be opted in to, and provide
compatibility logic in xen/mm.h

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/serial: scif: Rework how the parameters are found
Julien Grall [Thu, 24 Dec 2020 16:50:21 +0000 (16:50 +0000)]
xen/serial: scif: Rework how the parameters are found

clang 11 will throw the following error while building Xen:

scif-uart.c:333:33: error: cast to smaller integer type 'enum port_types' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
    uart->params = &port_params[(enum port_types)match->data];
                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~

The error can be prevented by directly storing a pointer to the port
parameters rather than a cast of the port type.
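
A minimal sketch of the idea, using made-up names (example_port_type, example_port_params, example_match) rather than the driver's real structures: the match table's opaque data field holds the address of the parameter entry itself, so no pointer-to-enum cast is needed when looking it up.

```c
#include <stddef.h>

/* Hypothetical port types and parameters, for illustration only. */
enum example_port_type { EXAMPLE_PORT_A, EXAMPLE_PORT_B, EXAMPLE_NR_PORTS };

struct example_port_params {
    unsigned int fifo_size;
};

static const struct example_port_params example_params[EXAMPLE_NR_PORTS] = {
    [EXAMPLE_PORT_A] = { .fifo_size = 16 },
    [EXAMPLE_PORT_B] = { .fifo_size = 64 },
};

/* Hypothetical match entry with an opaque data pointer. */
struct example_match {
    const char *compatible;
    const void *data;
};

/* Store the parameter pointer directly instead of an enum disguised as one. */
static const struct example_match example_matches[] = {
    { "vendor,port-a", &example_params[EXAMPLE_PORT_A] },
    { "vendor,port-b", &example_params[EXAMPLE_PORT_B] },
};

/* const void * converts to the parameter type without any cast. */
static inline const struct example_port_params *
example_match_params(const struct example_match *match)
{
    return match->data;
}
```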

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoiommu/arm: ipmmu-vmsa: Use 1U << 31 rather than 1 << 31
Oleksandr Tyshchenko [Mon, 11 Jan 2021 10:33:55 +0000 (12:33 +0200)]
iommu/arm: ipmmu-vmsa: Use 1U << 31 rather than 1 << 31

Replace all uses of 1 << 31 with 1U << 31 to prevent undefined
behavior in the IPMMU-VMSA driver.
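
To sketch why the suffix matters: with a 32-bit int, 1 << 31 shifts a one into the sign bit of a signed type, which the C standard leaves undefined, whereas 1U << 31 is an unsigned shift with a well-defined result. The macro and helper below are hypothetical names, not taken from the driver:

```c
#include <stdint.h>

/* Hypothetical register bit; the U suffix keeps the shift unsigned. */
#define EXAMPLE_CTRL_ENABLE (1U << 31)

/* Returns reg with the top bit set. */
static inline uint32_t example_set_enable(uint32_t reg)
{
    return reg | EXAMPLE_CTRL_ENABLE;
}
```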

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoxen/iommu: smmu: Use 1U << 31 rather than 1 << 31
Julien Grall [Thu, 24 Dec 2020 15:24:19 +0000 (15:24 +0000)]
xen/iommu: smmu: Use 1U << 31 rather than 1 << 31

Replace all uses of 1 << 31 with 1U << 31 to prevent undefined
behavior in the SMMU driver.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
[stefano: fix title and description]
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agox86/acpi: remove dead code
Roger Pau Monné [Mon, 11 Jan 2021 13:58:00 +0000 (14:58 +0100)]
x86/acpi: remove dead code

After the recent changes to acpi_fadt_parse_sleep_info the bad label
can never be called with facs mapped, and hence the unmap can be
removed.

Additionally remove the whole label, since it was used by a
single caller. Move the relevant code from the label.

No functional change intended.

CID: 1471722
Fixes: 16ca5b3f873 ('x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86: drop fake CONFIG_{HPET,X86_PM}_TIMER
Jan Beulich [Mon, 11 Jan 2021 13:56:53 +0000 (14:56 +0100)]
x86: drop fake CONFIG_{HPET,X86_PM}_TIMER

I don't think we mean to ever make them real Kconfig options, so let's
just do away with them.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoACPI: replace casts by container_of()
Jan Beulich [Mon, 11 Jan 2021 13:56:23 +0000 (14:56 +0100)]
ACPI: replace casts by container_of()

The latter is slightly more type-safe. Also add const where possible,
including without need to touch further code. Additionally replace an
adjacent unnecessary use of u16.
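
For illustration, a simplified container_of()-style macro (a sketch using GNU C extensions, not Xen's exact definition) shows where the extra type safety can come from: the pointer is first assigned to a variable of the member's type, so passing a pointer to the wrong type draws a compiler diagnostic, which a plain cast would silence. All example_* names are hypothetical:

```c
#include <stddef.h>

/* Sketch only: checks the pointer against the member's type, then steps back. */
#define example_container_of(ptr, type, member) ({              \
    __typeof__(((type *)0)->member) *mptr_ = (ptr);             \
    (type *)((char *)mptr_ - offsetof(type, member)); })

/* Hypothetical table layout with an embedded common header. */
struct example_header {
    unsigned int length;
};

struct example_table {
    struct example_header header;
    unsigned int flags;
};

/* Recover the enclosing table from a pointer to its header. */
static inline struct example_table *
to_example_table(struct example_header *hdr)
{
    return example_container_of(hdr, struct example_table, header);
}
```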

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/ACPI: don't overwrite FADT
Jan Beulich [Mon, 11 Jan 2021 13:55:52 +0000 (14:55 +0100)]
x86/ACPI: don't overwrite FADT

When marking fields invalid for our own purposes, we should do so in our
local copy (so we will notice later on), not in the firmware provided
one (which another entity may want to look at again, e.g. after kexec).
Also mark the function parameter const to notice such issues right away.

Instead use the pointer at the firmware copy for specifying an adjacent
printk()'s arguments. If nothing else this at least reduces the number
of relocations the assembler has to emit and the linker has to process.

Fixes: 62d1a69a4e9f ("ACPI: support v5 (reduced HW) sleep interface")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoACPI: reduce verbosity by default
Jan Beulich [Mon, 11 Jan 2021 13:55:16 +0000 (14:55 +0100)]
ACPI: reduce verbosity by default

While they're KERN_INFO messages and hence not visible by default, we
still have had reports that the amount of output is too large, not the
least because
- the command line controlled resizing of the console ring buffer
  happens only after SRAT parsing (which may alone produce more than 16k
  of output),
- the default resizing of the console ring buffer happens only after
  ACPI table parsing, since the default size gets calculated depending
  on the number of processors found.

Gate all per-processor logging behind a new "acpi=verbose", making sure
we wouldn't unintentionally pass this on to Dom0.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoevtchn: closing of vIRQ-s doesn't require looping over all vCPU-s
Jan Beulich [Mon, 11 Jan 2021 13:53:55 +0000 (14:53 +0100)]
evtchn: closing of vIRQ-s doesn't require looping over all vCPU-s

Global vIRQ-s have their event channel association tracked on vCPU 0.
Per-vCPU vIRQ-s can't have their notify_vcpu_id changed. Hence it is
well-known which vCPU's virq_to_evtchn[] needs updating.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agoevtchn: don't call Xen consumer callback with per-channel lock held
Jan Beulich [Mon, 11 Jan 2021 13:53:02 +0000 (14:53 +0100)]
evtchn: don't call Xen consumer callback with per-channel lock held

While there don't look to be any problems with this right now, the lock
order implications from holding the lock can be very difficult to follow
(and may be easy to violate unknowingly). The present callbacks don't
(and no such callback should) have any need for the lock to be held.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agox86/PV: fold redundant calls to adjust_guest_l<N>e()
Jan Beulich [Mon, 11 Jan 2021 13:51:39 +0000 (14:51 +0100)]
x86/PV: fold redundant calls to adjust_guest_l<N>e()

At least from an abstract perspective it is quite odd for us to compare
adjusted old and unadjusted new page table entries when determining
whether the fast path can be used. This is largely benign because
FASTPATH_FLAG_WHITELIST covers most of the flags which the adjustments
may set, and the flags getting set don't affect the outcome of
get_page_from_l<N>e(). There's one exception: 32-bit L3 entries get
_PAGE_RW set, but get_page_from_l3e() doesn't allow linear page tables
to be created at this level for such guests. Apart from this _PAGE_RW
is unused by get_page_from_l<N>e() (for N > 1), and hence forcing the
bit on early has no functional effect.

The main reason for the change, however, is that adjust_guest_l<N>e()
aren't exactly cheap - both in terms of pure code size and because each
one has at least one evaluate_nospec() by way of containing
is_pv_32bit_domain() conditionals.

Call the functions once ahead of the fast path checks, instead of twice
after.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/PV: consistently inline {,un}adjust_guest_l<N>e()
Jan Beulich [Mon, 11 Jan 2021 13:50:38 +0000 (14:50 +0100)]
x86/PV: consistently inline {,un}adjust_guest_l<N>e()

Commit 8a74707a7c ("x86/nospec: Use always_inline to fix code gen for
evaluate_nospec") converted inline to always_inline for
adjust_guest_l[134]e(), but left adjust_guest_l2e() and
unadjust_guest_l3e() alone without saying why these two would differ in
the needed / wanted treatment. Adjust these two as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoxen/arm: do not read MVFR2 when is not defined
Stefano Stabellini [Tue, 5 Jan 2021 19:05:48 +0000 (11:05 -0800)]
xen/arm: do not read MVFR2 when is not defined

MVFR2 is not available on ARMv7. It is available on ARMv8 aarch32 and
aarch64. If Xen reads MVFR2 on ARMv7 it could crash.

Avoid the issue by doing the following:

- define MVFR2_MAYBE_UNDEFINED on arm32
- if MVFR2_MAYBE_UNDEFINED, do not attempt to read MVFR2 in Xen
- keep the 3rd register_t in struct cpuinfo_arm.mvfr on arm32 so that a
  guest read to the register returns '0' instead of crashing the guest.

'0' is an appropriate value to return to the guest because it is defined
as "no support for miscellaneous features".

Aarch64 Xen is not affected by this patch.

Fixes: 9cfdb489af81 ("xen/arm: Add ID registers and complete cpuinfo")
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agox86/hypercall: fix gnttab hypercall args conditional build on pvshim
Roger Pau Monné [Fri, 8 Jan 2021 15:51:52 +0000 (16:51 +0100)]
x86/hypercall: fix gnttab hypercall args conditional build on pvshim

A pvshim build doesn't require the grant table functionality built in,
but it does require knowing the number of arguments the hypercall has
so the hypercall parameter clobbering works properly.

Instead of also setting the argument count for the gnttab case if PV
shim functionality is enabled, just drop all of the conditionals from
hypercall_args_table, as a hypercall having a NULL handler won't get
to use that information anyway.

Note this hasn't been detected by osstest because the tools pvshim
build is done without debug enabled, so the hypercall parameter
clobbering doesn't happen.

Fixes: d2151152dd2 ('xen: make grant table support configurable')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/shadow: adjust TLB flushing in sh_unshadow_for_p2m_change()
Jan Beulich [Fri, 8 Jan 2021 15:51:19 +0000 (16:51 +0100)]
x86/shadow: adjust TLB flushing in sh_unshadow_for_p2m_change()

Accumulating transient state of d->dirty_cpumask in a local variable is
unnecessary here: The flush is fine to make with the dirty set at the
time of the call. With this, move the invocation to a central place at
the end of the function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: cosmetics to sh_unshadow_for_p2m_change()
Jan Beulich [Fri, 8 Jan 2021 15:50:47 +0000 (16:50 +0100)]
x86/shadow: cosmetics to sh_unshadow_for_p2m_change()

Besides the adjustments for style
- use switch(),
- widen scope of commonly used variables,
- narrow scope of other variables.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/p2m: pass old PTE directly to write_p2m_entry_pre() hook
Jan Beulich [Fri, 8 Jan 2021 15:50:11 +0000 (16:50 +0100)]
x86/p2m: pass old PTE directly to write_p2m_entry_pre() hook

In no case is a pointer to non-const needed. Since no pointer arithmetic
is done by the sole user of the hook, passing in the PTE itself is quite
fine.

While doing this adjustment also
- drop the intermediate sh_write_p2m_entry_pre():
  sh_unshadow_for_p2m_change() can itself be used as the hook function,
  moving the conditional into there,
- introduce a local variable holding the flags of the old entry.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/p2m: avoid unnecessary calls of write_p2m_entry_pre() hook
Jan Beulich [Fri, 8 Jan 2021 15:49:23 +0000 (16:49 +0100)]
x86/p2m: avoid unnecessary calls of write_p2m_entry_pre() hook

When shattering a large page, we first construct the new page table page
and only then hook it up. The "pre" hook in this case does nothing, for
the page starting out all blank. Avoid 512 calls into shadow code in
this case by passing in INVALID_GFN, indicating the page being updated
is (not yet) associated with any GFN. (The alternative to this change
would be to actually pass in a correct GFN, which can't be all the same
on every loop iteration.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/mem_sharing: resolve mm-lock order violations when forking VMs with nested p2m
Tamas K Lengyel [Fri, 8 Jan 2021 10:51:36 +0000 (11:51 +0100)]
x86/mem_sharing: resolve mm-lock order violations when forking VMs with nested p2m

Several lock-order violations have been encountered while attempting to fork
VMs with nestedhvm=1 set. This patch resolves the issues.

The order violations stem from a call to p2m_flush_nestedp2m being performed
whenever the hostp2m changes. This function always takes the p2m lock for the
nested_p2m. However, with sharing the p2m locks always have to be taken before
the sharing lock. To resolve this issue we avoid taking the sharing lock where
possible (and was actually unnecessary to begin with). But we also make
p2m_flush_nestedp2m aware that the p2m lock may have already been taken and
preemptively take all nested_p2m locks before unsharing a page where taking the
sharing lock is necessary.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86: fold indirect_thunk_asm.h into asm-defns.h
Jan Beulich [Fri, 8 Jan 2021 10:50:32 +0000 (11:50 +0100)]
x86: fold indirect_thunk_asm.h into asm-defns.h

There's little point in having two separate headers both getting
included by asm_defns.h. This in particular reduces the number of
instances of guarding asm(".include ...") suitably in such dual use
headers.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86: drop ASM_{CL,ST}AC
Jan Beulich [Fri, 8 Jan 2021 10:48:09 +0000 (11:48 +0100)]
x86: drop ASM_{CL,ST}AC

Use ALTERNATIVE directly, such that at the use sites it is visible that
alternative code patching is in use. Similarly avoid hiding the fact in
SAVE_ALL.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86: replace __ASM_{CL,ST}AC
Jan Beulich [Fri, 8 Jan 2021 10:45:07 +0000 (11:45 +0100)]
x86: replace __ASM_{CL,ST}AC

Introduce proper assembler macros instead, enabled only when the
assembler itself doesn't support the insns. To avoid duplicating the
macros for assembly and C files, have them processed into asm-macros.h.
This in turn requires adding a multiple inclusion guard when generating
that header.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoxen/arm: optee: The function identifier is always 32-bit
Roman Skakun [Wed, 6 Jan 2021 11:26:57 +0000 (13:26 +0200)]
xen/arm: optee: The function identifier is always 32-bit

Per the SMCCC specification (see section 3.1 in ARM DEN 0028D), the
function identifier is only stored in the least significant 32-bits.
The most significant 32-bits should be ignored.
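
A minimal sketch of that rule, assuming the identifier arrives in a 64-bit register value; the helper name example_smccc_fid() is hypothetical and not taken from the patch:

```c
#include <stdint.h>

/* Keep only the least significant 32 bits of the first argument register. */
static inline uint32_t example_smccc_fid(uint64_t x0)
{
    return (uint32_t)x0;
}
```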

Signed-off-by: Roman Skakun <roman_skakun@epam.com>
Acked-by: Volodymyr Babchyk <volodymyr_babchuk@epam.com>
[jgrall: Reword the commit message and comment]
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoxsm/dummy: harden against speculative abuse
Jan Beulich [Thu, 7 Jan 2021 14:11:25 +0000 (15:11 +0100)]
xsm/dummy: harden against speculative abuse

First of all don't open-code is_control_domain(), which is already
suitably using evaluate_nospec(). Then also apply this construct to the
other paths of xsm_default_action(). Also guard two paths not using this
function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
4 years agox86/dpci: EOI interrupt regardless of its masking status
Roger Pau Monné [Thu, 7 Jan 2021 14:10:29 +0000 (15:10 +0100)]
x86/dpci: EOI interrupt regardless of its masking status

Modify hvm_pirq_eoi to always EOI the interrupt if required, instead
of not doing such EOI if the interrupt is routed through the vIO-APIC
and the entry is masked at the time the EOI is performed.

Further unmask of the vIO-APIC pin won't EOI the interrupt, and thus
the guest OS has to wait for the timeout to expire and the automatic
EOI to be performed.

This allows simplifying the helpers and dropping the vioapic_redir_entry
parameter from all of them.

Fixes: ccfe4e08455 ('Intel vt-d specific changes in arch/x86/hvm/vmx/vtd.')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86: drop use of E801 memory "map" (and alike)
Jan Beulich [Thu, 7 Jan 2021 14:09:47 +0000 (15:09 +0100)]
x86: drop use of E801 memory "map" (and alike)

ACPI mandates use of E820 (or newer, e.g. EFI), and in fact firmware
has been observed to include E820_ACPI ranges in what E801 reports as
available (really "configured") memory. Since all 64-bit systems ought
to support ACPI, drop our use of older BIOS and boot loader interfaces.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/mem-sharing: don't pointlessly use get_domain_by_id()
Jan Beulich [Thu, 7 Jan 2021 14:09:20 +0000 (15:09 +0100)]
x86/mem-sharing: don't pointlessly use get_domain_by_id()

For short-lived references rcu_lock_domain_by_id() is the better
(slightly cheaper) alternative.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86: don't pointlessly use get_domain_by_id()
Jan Beulich [Thu, 7 Jan 2021 14:08:51 +0000 (15:08 +0100)]
x86: don't pointlessly use get_domain_by_id()

For short-lived references rcu_lock_domain_by_id() is the better
(slightly cheaper) alternative.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agocommon: don't (kind of) open-code rcu_lock_domain_by_any_id()
Jan Beulich [Thu, 7 Jan 2021 14:06:15 +0000 (15:06 +0100)]
common: don't (kind of) open-code rcu_lock_domain_by_any_id()

Even more so when using rcu_lock_domain_by_id() in place of the more
efficient rcu_lock_current_domain().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agovPCI/MSI-X: fold clearing of entry->updated
Jan Beulich [Thu, 7 Jan 2021 14:03:17 +0000 (15:03 +0100)]
vPCI/MSI-X: fold clearing of entry->updated

Both call sites clear the flag after a successful call to
update_entry(). This can be simplified by moving the clearing into the
function, onto its success path.

As a result of neither caller caring about update_entry()'s return value
anymore, the function gets switched to return void.
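
The shape of this refactoring can be sketched with a toy structure; example_msix_entry, example_propagate() and the can_succeed knob are all hypothetical stand-ins, not the vPCI code:

```c
#include <stdbool.h>

/* Toy entry: 'updated' marks a guest write not yet propagated. */
struct example_msix_entry {
    bool updated;
    unsigned int data; /* value last written by the guest */
    unsigned int hw;   /* value propagated so far */
};

/* Hypothetical stand-in for the step that may fail. */
static inline bool example_propagate(struct example_msix_entry *e,
                                     bool can_succeed)
{
    if ( !can_succeed )
        return false;
    e->hw = e->data;
    return true;
}

/*
 * Returns void: the flag is cleared here, on the success path only,
 * so callers neither check a result nor clear the flag themselves.
 */
static inline void example_update_entry(struct example_msix_entry *e,
                                        bool can_succeed)
{
    if ( !e->updated || !example_propagate(e, can_succeed) )
        return;

    e->updated = false;
}
```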

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/vm_event: transfer nested p2m base info
Tamas K Lengyel [Sun, 3 Jan 2021 18:41:17 +0000 (11:41 -0700)]
x86/vm_event: transfer nested p2m base info

Required to introspect events originating from nested VMs.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/mem_sharing: Copy CPUID and MSR configuration during vm forking
Tamas K Lengyel [Tue, 5 Jan 2021 21:58:23 +0000 (13:58 -0800)]
x86/mem_sharing: Copy CPUID and MSR configuration during vm forking

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/libxenguest: move M2P macros to xg_private.h
Olaf Hering [Tue, 5 Jan 2021 15:13:56 +0000 (16:13 +0100)]
tools/libxenguest: move M2P macros to xg_private.h

Just code movement as a preparatory change before xg_sr_* will be moved.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/libxenguest: remove FOLD_CR3 from xg_save_restore.h
Olaf Hering [Tue, 5 Jan 2021 15:05:36 +0000 (16:05 +0100)]
tools/libxenguest: remove FOLD_CR3 from xg_save_restore.h

The last user was removed with commit b15bc4345e772df92e5ffdbc4c1e9ae2a6206617

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/libxenguest: remove get_platform_info from xg_save_restore.h
Olaf Hering [Tue, 5 Jan 2021 15:02:47 +0000 (16:02 +0100)]
tools/libxenguest: remove get_platform_info from xg_save_restore.h

Last user was removed with commit 4ddf474e2b7c045fadeaf765ac6157de745e84d6
Previously it was also used in migration code, which was removed with commit
b15bc4345e772df92e5ffdbc4c1e9ae2a6206617

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibxl: cleanup remaining backend xs dirs after driver domain
Marek Marczykowski-Górecki [Sun, 8 Nov 2020 14:59:42 +0000 (15:59 +0100)]
libxl: cleanup remaining backend xs dirs after driver domain

When a device is removed, the backend domain (which may be a driver domain) is
responsible for removing backend entries from xenstore. But in the case of a
driver domain, it has no access to remove all of them - specifically the
directory named after frontend-id remains. This may accumulate enough to
exceed the xenstore quota of the driver domain, breaking further devices.

Fix this by calling libxl__xs_path_cleanup() on the backend path from
libxl__device_destroy() in the toolstack domain too. Note
libxl__device_destroy() is called when the driver domain already removed
what it can (see device_destroy_be_watch_cb()->device_hotplug_done()).

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agotools: ipxe: update for fixing build with GCC10
Olaf Hering [Mon, 4 Jan 2021 11:52:23 +0000 (12:52 +0100)]
tools: ipxe: update for fixing build with GCC10

Update to v1.21.1 to fix the build in Tumbleweed, which has been broken
for months due to the lack of a new release.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wl@xen.org>
4 years agotools/libxenguest: handle more than 16T in precopy_stats
Olaf Hering [Tue, 5 Jan 2021 08:30:48 +0000 (09:30 +0100)]
tools/libxenguest: handle more than 16T in precopy_stats

total_written tracks the number of transferred dirty pages.
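
As a rough illustration of where a 16T limit can come from, assuming 4 KiB pages and a 32-bit page counter (both assumptions, not stated in this message): 2^32 pages of 2^12 bytes each is 2^44 bytes, i.e. 16 TiB, and a 32-bit counter wraps at that point. The names below are hypothetical:

```c
#include <stdint.h>

/* Assumed page size of 4 KiB. */
#define EXAMPLE_PAGE_SHIFT 12

/* Bytes represented by a page count held in a 64-bit type. */
static inline uint64_t example_bytes_written(uint64_t pages)
{
    return pages << EXAMPLE_PAGE_SHIFT;
}

/* What a 32-bit counter would retain of the same page count. */
static inline uint32_t example_narrow_count(uint64_t pages)
{
    return (uint32_t)pages;
}
```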

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wl@xen.org>
4 years agolibs/devicemodel: add dm_op support for FreeBSD
Roger Pau Monne [Tue, 5 Jan 2021 10:25:46 +0000 (11:25 +0100)]
libs/devicemodel: add dm_op support for FreeBSD

The FreeBSD ioctls have the same fields as the Linux ones, so the
same file can be shared between both OSes.

No functional change for OSes other than FreeBSD.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agolibs/foreignmemory: implement the missing functions on FreeBSD
Roger Pau Monne [Tue, 5 Jan 2021 10:25:45 +0000 (11:25 +0100)]
libs/foreignmemory: implement the missing functions on FreeBSD

Implement restrict, map resource and unmap resource helpers on
FreeBSD.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agolib/sort: adjust types
Jan Beulich [Tue, 5 Jan 2021 12:20:54 +0000 (13:20 +0100)]
lib/sort: adjust types

First and foremost do away with the use of plain int for sizes or size-
derived values. Use size_t, despite this requiring some adjustment to
the logic. Also replace u32 by uint32_t.
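
One kind of logic adjustment an unsigned size type tends to require, shown as a generic sketch rather than the actual sort() change: a descending loop can no longer test "i >= 0", since that is always true for size_t, so the decrement moves into the loop body. The helper name is hypothetical:

```c
#include <stddef.h>

/*
 * Hypothetical helper: index of the last element equal to value,
 * or num if there is none. Visits indices num-1 down to 0.
 */
static inline size_t example_last_index_of(const int *arr, size_t num,
                                           int value)
{
    size_t i;

    for ( i = num; i > 0; )
    {
        --i; /* decrement first, so i never wraps below zero */
        if ( arr[i] == value )
            return i;
    }

    return num;
}
```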

While not directly related also drop a leftover #ifdef from x86's
swap_ex - this was needed only back when 32-bit Xen was still a thing.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agovPCI/MSI-X: tidy init_msix()
Jan Beulich [Tue, 5 Jan 2021 12:20:13 +0000 (13:20 +0100)]
vPCI/MSI-X: tidy init_msix()

First of all introduce a local variable for the to be allocated struct.
The compiler can't CSE all the occurrences (I'm observing 80 bytes of
code saved with gcc 10). Additionally, while the caller can cope and
there was no memory leak, globally "announce" the struct only once done
initializing it. This also removes the dependency of the function on
the caller cleaning up after it in case of an error.

Also prefer a local variable over using a structure field previously
set from this very variable.

Finally move the call to vpci_add_register() ahead of all further
initialization of the struct, to bail early in case of error.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>