Paul Durrant [Tue, 5 Jan 2021 17:46:42 +0000 (17:46 +0000)]
libxl / libxlu: support 'xl pci-attach/detach' by name
This patch modifies libxlu_pci_parse_spec_string() to parse the new 'name'
parameter of PCI_SPEC_STRING detailed in the updated documentation in
xl-pci-configuration(5) and populate the 'name' field of 'libxl_device_pci'.
If the 'name' field is non-NULL then both libxl_device_pci_add() and
libxl_device_pci_remove() will use it to look up the device BDF in
the list of assignable devices.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wl@xen.org>
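The name-to-BDF lookup described above can be sketched as follows. This is a minimal illustration only: 'struct demo_pci_dev' and demo_lookup_by_name() are hypothetical names, not libxl types or functions, and the real code consults the list of assignable devices rather than a plain array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an assignable-device record. */
struct demo_pci_dev {
    unsigned int domain, bus, dev, func;
    const char *name;            /* NULL when no name was assigned */
};

/*
 * Return the entry whose name matches, or NULL if none does.
 * Entries without a name are skipped.
 */
static const struct demo_pci_dev *
demo_lookup_by_name(const struct demo_pci_dev *list, size_t n,
                    const char *name)
{
    size_t i;

    if (!name)
        return NULL;

    for (i = 0; i < n; i++)
        if (list[i].name && !strcmp(list[i].name, name))
            return &list[i];

    return NULL;
}
```

The BDF of the returned entry would then be used exactly as if it had been given directly.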
Paul Durrant [Tue, 5 Jan 2021 17:46:41 +0000 (17:46 +0000)]
docs/man: modify xl-pci-configuration(5) to add 'name' field to PCI_SPEC_STRING
Since assignable devices can be named, a subsequent patch will support use
of a PCI_SPEC_STRING containing a 'name' parameter instead of a 'bdf'. In
this case the name will be used to look up the 'bdf' in the list of assignable
(or assigned) devices.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wl@xen.org>
Paul Durrant [Tue, 5 Jan 2021 17:46:40 +0000 (17:46 +0000)]
xl: support naming of assignable devices
With this patch applied 'xl pci-assignable-add' will take an optional '--name'
parameter, 'xl pci-assignable-remove' can be passed either a BDF or a name and
'xl pci-assignable-list' will take an optional '--show-names' flag which
determines whether names are displayed in its output.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wl@xen.org>
Paul Durrant [Tue, 5 Jan 2021 17:46:39 +0000 (17:46 +0000)]
libxl: add 'name' field to 'libxl_device_pci' in the IDL...
... and modify libxl_pci_bdf_assignable_add/remove/list() to make use of it.
libxl_pci_bdf_assignable_add() will store the name of the device in xenstore
if the field is specified (i.e. non-NULL) and libxl_pci_bdf_assignable_remove()
will remove devices specified only by name, looking up the BDF as necessary.
libxl_pci_bdf_assignable_list() will also populate the 'name' field if a name
was stored by libxl_pci_bdf_assignable_add().
NOTE: This patch also fixes whitespace in the declaration of 'libxl_device_pci'
in the IDL.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wl@xen.org>
Paul Durrant [Tue, 5 Jan 2021 17:46:38 +0000 (17:46 +0000)]
libxl: stop setting 'vdevfn' in pci_struct_fill()
There are only two call-sites. One always sets it to 0 (which is unnecessary
as the structure is already initialized to zero) and the other can simply set
the 'vdevfn' field directly (after proper structure initialization), avoiding
the need for a local variable.
A subsequent patch will also make use of pci_struct_fill() in a context
where 'vdevfn' may already have been set.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wl@xen.org>
Paul Durrant [Tue, 5 Jan 2021 17:46:37 +0000 (17:46 +0000)]
libxlu: introduce xlu_pci_parse_spec_string()
This patch largely re-writes the code to parse a PCI_SPEC_STRING and enters
it via the newly introduced function. The new parser also deals with 'bdf'
and 'vslot' as non-positional parameters, as per the documentation in
xl-pci-configuration(5).
The existing xlu_pci_parse_bdf() function remains, but now strictly parses
BDF values. Some existing callers of xlu_pci_parse_bdf() are
modified to call xlu_pci_parse_spec_string() as per the documentation in xl(1).
NOTE: Usage text in xl_cmdtable.c and error messages are also modified
appropriately.
As a side-effect this patch also fixes a bug where using '*' to specify
all functions would lead to an assertion failure at the end of
xlu_pci_parse_bdf().
Fixes: d25cc3ec93eb ("libxl: workaround gcc 10.2 maybe-uninitialized warning") Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wl@xen.org>
Paul Durrant [Tue, 5 Jan 2021 17:46:36 +0000 (17:46 +0000)]
docs/man: modify xl(1) in preparation for naming of assignable devices
A subsequent patch will introduce code to allow a name to be specified to
'xl pci-assignable-add' such that the assignable device may be referred to
by that name in subsequent operations.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Wei Liu <wl@xen.org>
Roger Pau Monné [Thu, 21 Jan 2021 15:11:41 +0000 (16:11 +0100)]
x86/dpci: do not remove pirqs from domain tree on unbind
A fix for a previous issue removed the pirqs from the domain tree when
they are unbound in order to prevent shared pirqs from triggering a
BUG_ON in __pirq_guest_unbind if they are unbound multiple times. That
caused free_domain_pirqs to no longer unmap the pirqs because they
are gone from the domain pirq tree, thus leaving stale unbound pirqs
after domain destruction if the domain had mapped dpci pirqs after
shutdown.
Take a different approach to fix the original issue: instead of
removing the pirq from d->pirq_tree clear the flags of the dpci pirq
struct to signal that the pirq is now unbound. This prevents calling
pirq_guest_unbind multiple times for the same pirq without having to
remove it from the domain pirq tree.
This is XSA-360.
Fixes: 5b58dad089 ('x86/pass-through: avoid double IRQ unbind during domain cleanup') Reported-by: Samuel Verschelde <samuel.verschelde@vates.fr> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
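The idea of marking a pirq as unbound by clearing its flags, rather than removing it from the tree, can be sketched as below. The structure and names ('struct demo_dpci_pirq', demo_unbind()) are hypothetical and heavily simplified; they are not the Xen definitions.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-pirq passthrough state. */
struct demo_dpci_pirq {
    unsigned int flags;      /* non-zero while the pirq is bound */
    unsigned int unbinds;    /* counts real unbind operations (demo only) */
};

#define DEMO_FLAG_BOUND 0x1u

/*
 * Unbind at most once: the entry stays in whatever container holds it,
 * and cleared flags mark it as already unbound.
 * Returns true if an unbind was actually performed.
 */
static bool demo_unbind(struct demo_dpci_pirq *p)
{
    if (!p->flags)
        return false;        /* already unbound, nothing to do */

    p->unbinds++;            /* stands in for the real unbind work */
    p->flags = 0;            /* signal "unbound" without removing the entry */
    return true;
}
```

Because the entry is never removed, a later walk over the container still finds it and can unmap it.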
Because MPIDR_AFF0_MASK is defined as a 32-bit value, we will miss out
the 3rd level affinity. As a consequence, the IPI would not be sent to
the correct vCPU.
This particular error can be solved by switching MPIDR_AFF0_MASK to use
unsigned long. However, take the opportunity to switch all the MPIDR_*
defines to unsigned long to avoid any further issues.
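One way a 32-bit mask can drop the upper affinity bits is shown in the sketch below. The macro and function names are hypothetical, the sketch assumes a 32-bit unsigned int, and it uses a ULL constant for portability where the commit uses unsigned long.

```c
#include <stdint.h>

/* Hypothetical names; the values mirror an 8-bit Aff0 field. */
#define DEMO_AFF0_MASK_32  0xffU      /* unsigned int: 32 bits wide */
#define DEMO_AFF0_MASK_64  0xffULL    /* 64 bits wide */

/*
 * Clear Aff0 using the narrow mask. ~0xffU is 0xffffff00 as an
 * unsigned int, which is zero-extended to 64 bits, so bits 63:32
 * (where Aff3 lives) are cleared as well.
 */
static uint64_t demo_clear_aff0_narrow(uint64_t mpidr)
{
    return mpidr & ~DEMO_AFF0_MASK_32;
}

/* Clear Aff0 using the wide mask: only bits 7:0 are cleared. */
static uint64_t demo_clear_aff0_wide(uint64_t mpidr)
{
    return mpidr & ~DEMO_AFF0_MASK_64;
}
```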
Julien Grall [Sat, 28 Nov 2020 11:36:42 +0000 (11:36 +0000)]
xen/irq: Propagate the error from init_one_desc_irq() in init_*_irq_data()
init_one_desc_irq() can return an error if it is unable to allocate
memory. While this is unlikely to happen during boot (called from
init_{,local_}irq_data()), it is better to harden the code by
propagating the return value.
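The pattern of propagating the error can be sketched as follows. demo_init_one() and demo_init_all() are hypothetical stand-ins, not the Xen functions; the failure is simulated through a parameter.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-descriptor init: fails for one chosen index. */
static int demo_init_one(size_t idx, size_t fail_at)
{
    return idx == fail_at ? -ENOMEM : 0;
}

/* Stop at the first failure and hand the error code back to the caller. */
static int demo_init_all(size_t count, size_t fail_at)
{
    size_t i;

    for (i = 0; i < count; i++) {
        int rc = demo_init_one(i, fail_at);

        if (rc)
            return rc;
    }

    return 0;
}
```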
Wei Chen [Fri, 8 Jan 2021 06:21:26 +0000 (14:21 +0800)]
xen/arm: Add defensive barrier in get_cycles for Arm64
Per the discussion [1] on the mailing list, we had better have a
barrier after reading CNTPCT in get_cycles. Without such a barrier,
if get_cycles is used in a seqlock critical section in the future,
the counter read could potentially be speculated out of that
section.
When executing clock_gettime(), either in the vDSO or via a system call,
we need to ensure that the read of the counter register occurs within
the seqlock reader critical section. This ensures that updates to the
clocksource parameters (e.g. the multiplier) are consistent with the
counter value and therefore avoids the situation where time appears to
go backwards across multiple reads.
Extend the vDSO logic so that the seqlock critical section covers the
read of the counter register as well as accesses to the data page. Since
reads of the counter system registers are not ordered by memory barrier
instructions, introduce dependency ordering from the counter read to a
subsequent memory access so that the seqlock memory barriers apply to
the counter access in both the vDSO and the system call paths.
Cc: <stable@vger.kernel.org> Cc: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/linux-arm-kernel/alpine.DEB.2.21.1902081950260.1662@nanos.tec.linutronix.de/ Reported-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Will Deacon <will.deacon@arm.com>
While we are not aware of such use in Xen, it would be best to add the
barrier to avoid any surprise.
In order to reduce the impact of the new barrier, we prefer to
enforce ordering instead of using an ISB [2].
Currently, enforce order is not applied to arm32 as this is
not done in Linux at the date of this patch. If this is done
in Linux it will need to be also done in Xen.
To avoid adding read_cntpct_enforce_ordering everywhere, we introduce
a new helper, read_cntpct_stable, to replace the original get_cycles,
and turn get_cycles into a wrapper to which
read_cntpct_enforce_ordering can easily be added.
Roger Pau Monné [Tue, 19 Jan 2021 15:04:06 +0000 (16:04 +0100)]
x86/CPUID: unconditionally set XEN_HVM_CPUID_IOMMU_MAPPINGS
This is a revert of f5cfa0985673 plus a rework of the comment that
accompanies the setting of the flag so we don't forget why it needs to
be unconditionally set: it's indicating whether the version of Xen has
the original issue fixed and IOMMU entries are created for
grant/foreign maps.
If the flag is only exposed when the IOMMU is enabled the guest could
resort to using bounce buffers when running backends as it would assume
the underlying Xen version still has the bug present and thus
grant/foreign maps cannot be used with devices.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 19 Jan 2021 15:03:41 +0000 (16:03 +0100)]
kconfig: ensure strndup() declaration is visible
Its guard was updated such that it is visible by default when POSIX 2008
was adopted by glibc. It's not visible by default on older glibc.
Fixes: f80fe2b34f08 ("xen: Update Kconfig to Linux v5.4") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Doug Goldstein <cardoe@cardoe.com>
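The commit text does not say how the declaration is made visible. One common approach, shown here purely as an assumption, is to define a feature-test macro before the first system header is included; demo_dup_prefix() is a hypothetical helper.

```c
/* Must come before the first system header is included. */
#define _POSIX_C_SOURCE 200809L

#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Duplicate at most max bytes of s. The caller frees the result.
 * strndup() is specified by POSIX.1-2008, hence the macro above.
 */
static char *demo_dup_prefix(const char *s, size_t max)
{
    return strndup(s, max);
}
```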
Manuel Bouyer [Tue, 12 Jan 2021 18:12:26 +0000 (19:12 +0100)]
tools/xenbackendd: Remove xenbackendd
NetBSD doesn't need xenbackendd with xl toolstack so don't build it.
Remove now unused xenbackendd directory/files, and remaining references
in the hotplug scripts.
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
[Also clean up stale comments in the Linux xencommons script] Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
read_exact/write_exact seem not to be available here, which causes a gcc
error. Use plain read/write, the xenevtchn interface won't do partial
read/write on NetBSD anyway so it should be safe. This is in line with the
rest of the OS specific helpers.
Fixes: b7f76a699dc ('tools: Refactor /dev/xen/evtchn wrappers into libxenevtchn') Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
UBSAN catches an uninitialized use of the 'preempted' variable in
fork_hap_allocation when there is no preemption.
Fixes: 41548c5472a ("mem_sharing: VM forking") Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 18 Jan 2021 11:14:19 +0000 (12:14 +0100)]
gnttab: consolidate pin-to-status syncing
Ever since the fix for XSA-230 the 2nd of the comments ahead of
fixup_status_for_copy_pin() has been stale - there's nothing specific to
transitive grants there anymore.
Move the function up, drop the "copy" part from its name again, add a
"readonly" parameter, and use it also on other paths having decremented
one (or not having got to increment any) of the pin counts.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 18 Jan 2021 11:13:42 +0000 (12:13 +0100)]
gnttab: adjust pin count overflow checks
It's at least odd to check counters which aren't going to be
incremented, resulting in failure just because prior operations may have
reached the refcount limit. And it's also not helpful to use open-coded
literal numbers in these checks.
Calculate the increment values first and derive from them the mask to
use in the checks.
Also move the pin count checks ahead of the calculation of the status
(and for copy also sha2) pointers: They're not needed in the failure
cases, and this way the compiler may also have an easier time keeping
the variables at least transiently in registers for the subsequent uses.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
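Deriving the overflow mask from the increment can be sketched as below. The counter layout, DEMO_CNTR_WIDTH, DEMO_INC() and both functions are hypothetical; they only illustrate checking exactly those counters that are about to be incremented.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical layout: four 8-bit counters packed into one 32-bit word.
 * An "increment" has a 1 in the lowest bit of each counter that is
 * about to be bumped.
 */
#define DEMO_CNTR_WIDTH   8
#define DEMO_INC(n)       (UINT32_C(1) << ((n) * DEMO_CNTR_WIDTH))

/* Top bit of every counter named in incr. */
static uint32_t demo_incr_to_oflow_mask(uint32_t incr)
{
    return incr << (DEMO_CNTR_WIDTH - 1);
}

/* Apply incr only if none of the affected counters has its top bit set. */
static bool demo_pin(uint32_t *pin, uint32_t incr)
{
    if (*pin & demo_incr_to_oflow_mask(incr))
        return false;

    *pin += incr;
    return true;
}
```

A counter that has reached its limit then blocks only operations that would increment that same counter.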
Jan Beulich [Mon, 18 Jan 2021 11:12:23 +0000 (12:12 +0100)]
x86/Dom0: support zstd compressed kernels
Taken from Linux at commit 1c4dd334df3a ("lib: decompress_unzstd: Limit
output size") for unzstd.c (renamed from decompress_unzstd.c) and 36f9ff9e03de ("lib: Fix fall-through warnings for Clang") for zstd/,
with bits from linux/zstd.h merged into suitable other headers.
To limit the editing necessary, introduce ptrdiff_t.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 18 Jan 2021 11:10:34 +0000 (12:10 +0100)]
lib: introduce xxhash
Taken from Linux at commit d89775fc929c ("lib/: replace HTTP links with
HTTPS ones"), but split into separate 32-bit and 64-bit sources, since
the immediate consumer (zstd) will need only the latter.
Note that the building of this code is restricted to x86 for now because
of the need to sort asm/unaligned.h for Arm.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 18 Jan 2021 11:09:13 +0000 (12:09 +0100)]
introduce unaligned.h
Rather than open-coding commonly used constructs in yet more places when
pulling in zstd decompression support (and its xxhash prereq), pull out
the custom bits into a commonly used header (for the hypervisor build;
the tool stack and stubdom builds of libxenguest will still remain in
need of similarly taking care of). For now this is limited to x86, where
custom logic isn't needed (considering this is going to be used in init
code only, even using alternatives patching to use MOVBE doesn't seem
worthwhile).
For Arm64 with CONFIG_ACPI=y (due to efi-dom0.c's re-use of xz/crc32.c)
drop the not really necessary inclusion of xz's private.h.
No change in generated code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Julien Grall [Fri, 15 Jan 2021 19:29:47 +0000 (19:29 +0000)]
xen/arm: livepatch: Include xen/mm.h rather than asm/mm.h
Livepatch fails to build on Arm after commit ced9795c6cb4 "mm: split
out mfn_t / gfn_t / pfn_t definitions and helpers":
In file included from livepatch.c:13:0:
/oss/xen/xen/include/asm/mm.h:32:28: error: field ‘list’ has incomplete type
struct page_list_entry list;
^~~~
/oss/xen/xen/include/asm/mm.h:53:43: error: ‘MAX_ORDER’ undeclared here (not in a function); did you mean ‘PFN_ORDER’?
unsigned long first_dirty:MAX_ORDER + 1;
^~~~~~~~~
PFN_ORDER
/oss/xen/xen/include/asm/mm.h:53:31: error: bit-field ‘first_dirty’ width not an integer constant
unsigned long first_dirty:MAX_ORDER + 1;
^~~~~~~~~~~
This is happening because asm/mm.h is included directly by livepatch.c.
Yet it depends on xen/mm.h to be included first so MAX_ORDER is defined.
Resolve the build failure by including xen/mm.h rather than asm/mm.h.
Fixes: ced9795c6cb4 ("mm: split out mfn_t / gfn_t / pfn_t definitions and helpers") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Jan Beulich [Fri, 15 Jan 2021 15:05:03 +0000 (16:05 +0100)]
Arm: don't hard-code grant table limits in create_domUs()
I can only assume that f2ae59bc4b9b ("Rationalize max_grant_frames and
max_maptrack_frames handling") unintentionally left Arm's create_domUs()
set limits to explicit values, as at least some of the same constraints
apply here.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Jan Beulich [Fri, 15 Jan 2021 15:03:56 +0000 (16:03 +0100)]
mm: split out mfn_t / gfn_t / pfn_t definitions and helpers
xen/mm.h has heavy dependencies, while in a number of cases only these
type definitions are needed. This separation then also allows pulling in
these definitions when including xen/mm.h would cause cyclic
dependencies.
Replace xen/mm.h inclusion where possible in include/xen/. (In
xen/iommu.h also take the opportunity and correct the few remaining
sorting issues.)
While the change could be dropped, remove an unnecessary asm/io.h
inclusion from xen/arch/x86/acpi/power.c. This was the initial attempt
to address build issues with it, until it became clear that the header
itself needs adjustment.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Fri, 15 Jan 2021 15:02:13 +0000 (16:02 +0100)]
include: don't use asm/page.h from common headers
Doing so limits what can be done in (in particular included by) this per-
arch header. Abstract out page shift/size related #define-s, which is all
the respective headers care about. Extend the replacement / removal to
some x86 headers as well; some others now need to include page.h (and
they really should have before).
Arm's VADDR_BITS gets dropped altogether: Its current value is clearly
wrong for 64-bit, but the constant also isn't used anywhere right now.
While Arm used vaddr_t in PAGE_OFFSET(), this use is compatible with
that of unsigned long in the new common implementation.
Also drop the dead PAGE_FLAG_MASK at this occasion.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com>
Juergen Gross [Fri, 15 Jan 2021 08:29:48 +0000 (09:29 +0100)]
docs: update the xenstore migration stream documentation
For live update of Xenstore some records defined in the migration
stream document need to be changed:
- Support of the read-only socket has been dropped from all Xenstore
implementations, so ro-socket-fd in the global record can be removed.
- Some guests require the event channel to Xenstore to remain the same
on Xenstore side, so Xenstore has to keep the event channel interface
open across a live update. For this purpose an evtchn-fd needs to be
added to the global record.
- With no read-only support the flags field in the connection record
can be dropped.
- The evtchn field in the connection record needs to be switched to
hold the port of the Xenstore side of the event channel.
- A flags field needs to be added to permission specifiers in order to
be able to mark a permission as stale (XSA-322).
Juergen Gross [Fri, 15 Jan 2021 08:29:38 +0000 (09:29 +0100)]
tools/libxenevtchn: add possibility to not close file descriptor on exec
Today the file descriptor for the access of the event channel driver
is being closed in case of exec(2). For the support of live update of
a daemon using libxenevtchn this can be problematic, so add a way to
keep that file descriptor open.
Add support of a flag XENEVTCHN_NO_CLOEXEC for xenevtchn_open() which
will result in _not_ setting O_CLOEXEC when opening the event channel
driver node.
The caller can then obtain the file descriptor via xenevtchn_fd().
Add an alternative open function xenevtchn_fdopen() which takes that
file descriptor as an additional parameter. This allows allocating a
xenevtchn_handle and associating it with that file descriptor.
Juergen Gross [Fri, 15 Jan 2021 08:29:36 +0000 (09:29 +0100)]
tools/libxenevtchn: check xenevtchn_open() flags for not supported bits
Refuse a call of xenevtchn_open() with unsupported bits in flags being
set.
This will change behavior for callers passing junk in flags today,
but those callers would otherwise probably see unwanted side effects
once the bits they pass today acquire a meaning. So checking flags is
the right thing to do.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
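Rejecting unknown flag bits can be sketched as follows. The flag value, DEMO_SUPPORTED_FLAGS and demo_check_flags() are hypothetical and are not the library's definitions; the use of EINVAL is likewise an assumption.

```c
#include <errno.h>

/* Hypothetical flag values, not the library's actual definitions. */
#define DEMO_FLAG_NO_CLOEXEC   (1u << 0)
#define DEMO_SUPPORTED_FLAGS   (DEMO_FLAG_NO_CLOEXEC)

/* Return 0 if every set bit is known, otherwise -1 with errno = EINVAL. */
static int demo_check_flags(unsigned int flags)
{
    if (flags & ~DEMO_SUPPORTED_FLAGS) {
        errno = EINVAL;
        return -1;
    }

    return 0;
}
```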
Juergen Gross [Fri, 15 Jan 2021 08:29:35 +0000 (09:29 +0100)]
tools/libxenevtchn: rename open_flags to flags
Rename the xenevtchn_open() parameter open_flags to flags as it might
be used for things not passed on to open().
No functional change.
No API/ABI changes.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
automation: add a job to import qemu-system-aarch64 into the pipeline
In order to use the pre-built test-artifacts/qemu-system-aarch64 binary
for our tests, first we need to import it into the pipeline. Let's do
that the same way we did it for the kernel and Alpine Linux filesystem:
by creating a special job for it.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Acked-by: Wei Liu <wl@xen.org>
automation: add qemu-system-aarch64 to test-artifacts
Currently we are using Debian's qemu-system-aarch64 for our tests.
However, sometimes it crashes. It is hard to debug and even harder to
apply any fixes to it.
Instead, build our own QEMU as one of our test-artifacts, which are only
built once, then imported into each pipeline via phony jobs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Acked-by: Wei Liu <wl@xen.org>
Olaf Hering [Thu, 14 Jan 2021 12:03:23 +0000 (13:03 +0100)]
stubdom: fix tpm_version
It is just a declaration, not a variable.
ld: /home/abuild/rpmbuild/BUILD/xen-4.14.20200616T103126.3625b04991/non-dbg/stubdom/vtpmmgr/vtpmmgr.a(vtpm_cmd_handler.o):(.bss+0x0): multiple definition of `tpm_version'; /home/abuild/rpmbuild/BUILD/xen-4.14.20200616T103126.3625b04991/non-dbg/stubdom/vtpmmgr/vtpmmgr.a(vtpmmgr.o):(.bss+0x0): first defined here
Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Jan Beulich [Thu, 14 Jan 2021 12:03:01 +0000 (13:03 +0100)]
tools/libxenstat: ensure strnlen() declaration is visible
Its guard was updated such that it is visible by default when POSIX 2008
was adopted by glibc. It's not visible by default on older glibc.
Fixes: 40fe714ca424 ("tools/libs/stat: use memcpy instead of strncpy in getBridge") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Thu, 14 Jan 2021 12:02:35 +0000 (13:02 +0100)]
argo: don't pointlessly use get_domain_by_id()
For short-lived references rcu_lock_domain_by_id() is the better
(slightly cheaper) alternative.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Christopher Clark <christopher.w.clark@gmail.com>
Jan Beulich [Thu, 14 Jan 2021 12:01:14 +0000 (13:01 +0100)]
lib: drop (replace) debug_build()
Its expansion shouldn't be tied to NDEBUG - down the road we may want to
allow enabling assertions independently of CONFIG_DEBUG. Replace the few
uses by a new xen_build_info() helper, subsuming gcov_string at the same
time (while replacing the stale CONFIG_GCOV used there) and also adding
CONFIG_UBSAN indication.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Thu, 14 Jan 2021 12:00:26 +0000 (13:00 +0100)]
memory: avoid pointless continuation in xenmem_add_to_physmap()
Adjust so we uniformly avoid needlessly arranging for a continuation on
the last iteration.
Fixes: 5777a3742d88 ("IOMMU: hold page ref until after deferred TLB flush") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
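Skipping the continuation on the last iteration can be sketched as below. demo_preempt_needed() and demo_process() are hypothetical; the preemption check is hard-wired to request a yield so that the effect is visible.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical preemption check; here it always asks to yield. */
static bool demo_preempt_needed(void)
{
    return true;
}

/*
 * Process items [start, total). Returns the index to resume from, or
 * total when everything is done. The preemption check is skipped after
 * the final item, so no continuation is set up with nothing left to do.
 */
static size_t demo_process(size_t start, size_t total, size_t *processed)
{
    size_t i;

    for (i = start; i < total; i++) {
        (*processed)++;                 /* stands in for the real work */

        if (i + 1 < total && demo_preempt_needed())
            return i + 1;               /* continuation: resume here */
    }

    return total;
}
```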
xen/arm: don't read aarch32 regs when aarch32 isn't available
Don't read aarch32 system registers at boot time when the aarch32 state
is not available at EL0. They are UNKNOWN, so it is not useful to read
them. Moreover, on Cavium ThunderX reading ID_PFR2_EL1 generates an
unsupported exception which causes a Xen crash. Instead, only read them
when aarch32 is available.
Leave the corresponding fields in struct cpuinfo_arm so that they
are read-as-zero from a guest.
Since we are editing identify_cpu, also fix the indentation: 4 spaces
instead of 8.
Wei Chen [Tue, 5 Jan 2021 07:19:45 +0000 (15:19 +0800)]
xen/arm: Correct the coding style of get_cycles
It seems the arm inline function get_cycles has used 8 spaces for
line indent since 2012. This patch corrects them to 4 spaces and
removes the extra space between the function name and bracket.
Tamas K Lengyel [Wed, 13 Jan 2021 02:28:45 +0000 (18:28 -0800)]
x86/mem_sharing: fix wrong field name used in 2c5119d
The arch_domain struct has "msr", not "msrs".
Spotted by a TravisCI Randconfig build.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 12 Jan 2021 18:37:53 +0000 (18:37 +0000)]
tools: Move xen-access from tests/ to misc/
xen-access is a tool for a human to use, rather than a test. Move it
into misc/ as a more appropriate location to live.
Move the -DXC_WANT_COMPAT_DEVICEMODEL_API from CFLAGS into xen-access.c itself
to avoid adding Makefile complexity.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Andrew Cooper [Tue, 12 Jan 2021 18:33:39 +0000 (18:33 +0000)]
tools/tests: Drop obsolete running scripts
The python unit tests were dropped in Xen 4.12 due to being obsolete, but the
scripts to run the tests were missed. Clean up .gitignore as well.
Also drop the libxenctrl {C,LD}FLAGS adjustments in the Makefile. This logic
isn't used, and isn't appropriate even in principle, as there are tests in
here which don't want to use libxenctrl.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
New architectures shouldn't be forced to implement no-op stubs for unused
functionality.
Introduce CONFIG_ARCH_ACQUIRE_RESOURCE which can be opted in to, and provide
compatibility logic in xen/mm.h
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Julien Grall <jgrall@amazon.com>
Roger Pau Monné [Mon, 11 Jan 2021 13:58:00 +0000 (14:58 +0100)]
x86/acpi: remove dead code
After the recent changes to acpi_fadt_parse_sleep_info the bad label
can never be called with facs mapped, and hence the unmap can be
removed.
Additionally remove the whole label, since it was used by a
single caller. Move the relevant code from the label.
No functional change intended.
CID: 1471722 Fixes: 16ca5b3f873 ('x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Mon, 11 Jan 2021 13:56:23 +0000 (14:56 +0100)]
ACPI: replace casts by container_of()
The latter is slightly more type-safe. Also add const where possible,
including without need to touch further code. Additionally replace an
adjacent unnecessary use of u16.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
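The difference between a cast and container_of() can be sketched as below. The macro and structures are hypothetical and minimal: a plain cast only works while the embedded member is the first field, whereas stepping back by the member offset works wherever it sits. Implementations relying on the typeof extension can additionally check the pointer's type; this sketch omits that check.

```c
#include <stddef.h>

/* Minimal container_of: step back from a member to its enclosing struct. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_header {
    unsigned int length;
};

struct demo_table {
    unsigned int flags;
    struct demo_header header;   /* embedded, deliberately not first */
};

/* Recover the enclosing table from a pointer to its embedded header. */
static struct demo_table *demo_table_from_header(struct demo_header *h)
{
    return demo_container_of(h, struct demo_table, header);
}
```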
Jan Beulich [Mon, 11 Jan 2021 13:55:52 +0000 (14:55 +0100)]
x86/ACPI: don't overwrite FADT
When marking fields invalid for our own purposes, we should do so in our
local copy (so we will notice later on), not in the firmware provided
one (which another entity may want to look at again, e.g. after kexec).
Also mark the function parameter const to notice such issues right away.
Instead use the pointer at the firmware copy for specifying an adjacent
printk()'s arguments. If nothing else this at least reduces the number
of relocations the assembler has to emit and the linker has to process.
Fixes: 62d1a69a4e9f ("ACPI: support v5 (reduced HW) sleep interface") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 11 Jan 2021 13:55:16 +0000 (14:55 +0100)]
ACPI: reduce verbosity by default
While they're KERN_INFO messages and hence not visible by default, we
still have had reports that the amount of output is too large, not the
least because
- the command line controlled resizing of the console ring buffer
happens only after SRAT parsing (which may alone produce more than 16k
of output),
- the default resizing of the console ring buffer happens only after
ACPI table parsing, since the default size gets calculated depending
on the number of processors found.
Gate all per-processor logging behind a new "acpi=verbose", making sure
we wouldn't unintentionally pass this on to Dom0.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 11 Jan 2021 13:53:55 +0000 (14:53 +0100)]
evtchn: closing of vIRQ-s doesn't require looping over all vCPU-s
Global vIRQ-s have their event channel association tracked on vCPU 0.
Per-vCPU vIRQ-s can't have their notify_vcpu_id changed. Hence it is
well-known which vCPU's virq_to_evtchn[] needs updating.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Mon, 11 Jan 2021 13:53:02 +0000 (14:53 +0100)]
evtchn: don't call Xen consumer callback with per-channel lock held
While there don't look to be any problems with this right now, the lock
order implications from holding the lock can be very difficult to follow
(and may be easy to violate unknowingly). The present callbacks don't
(and no such callback should) have any need for the lock to be held.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com>
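Calling a callback only after dropping the lock can be sketched as follows. Everything here is hypothetical: the channel structure, the atomic_flag spinlock and the example callback are illustrations, not the Xen event channel code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*demo_cb_t)(void *arg);

/* Hypothetical channel with a tiny spinlock built on atomic_flag. */
struct demo_chn {
    atomic_flag lock;
    demo_cb_t cb;
    void *cb_arg;
};

static void demo_lock(struct demo_chn *c)
{
    while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
        ;   /* spin */
}

static void demo_unlock(struct demo_chn *c)
{
    atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

/* Snapshot the callback under the lock, then invoke it after unlocking. */
static void demo_notify(struct demo_chn *c)
{
    demo_cb_t cb;
    void *arg;

    demo_lock(c);
    cb = c->cb;
    arg = c->cb_arg;
    demo_unlock(c);

    if (cb)
        cb(arg);
}

struct demo_ctx {
    struct demo_chn *chn;
    bool ran;
};

/*
 * Example callback: takes the same lock, which does not deadlock only
 * because demo_notify() no longer holds it at this point.
 */
static void demo_relock_cb(void *arg)
{
    struct demo_ctx *ctx = arg;

    demo_lock(ctx->chn);
    ctx->ran = true;
    demo_unlock(ctx->chn);
}
```

With the lock dropped before the call, the callback's own locking no longer has to be ordered against the per-channel lock.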
Jan Beulich [Mon, 11 Jan 2021 13:51:39 +0000 (14:51 +0100)]
x86/PV: fold redundant calls to adjust_guest_l<N>e()
At least from an abstract perspective it is quite odd for us to compare
adjusted old and unadjusted new page table entries when determining
whether the fast path can be used. This is largely benign because
FASTPATH_FLAG_WHITELIST covers most of the flags which the adjustments
may set, and the flags getting set don't affect the outcome of
get_page_from_l<N>e(). There's one exception: 32-bit L3 entries get
_PAGE_RW set, but get_page_from_l3e() doesn't allow linear page tables
to be created at this level for such guests. Apart from this _PAGE_RW
is unused by get_page_from_l<N>e() (for N > 1), and hence forcing the
bit on early has no functional effect.
The main reason for the change, however, is that adjust_guest_l<N>e()
aren't exactly cheap - both in terms of pure code size and because each
one has at least one evaluate_nospec() by way of containing
is_pv_32bit_domain() conditionals.
Call the functions once ahead of the fast path checks, instead of twice
after.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Commit 8a74707a7c ("x86/nospec: Use always_inline to fix code gen for
evaluate_nospec") converted inline to always_inline for
adjust_guest_l[134]e(), but left adjust_guest_l2e() and
unadjust_guest_l3e() alone without saying why these two would differ in
the needed / wanted treatment. Adjust these two as well.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
MVFR2 is not available on ARMv7. It is available on ARMv8 aarch32 and
aarch64. If Xen reads MVFR2 on ARMv7 it could crash.
Avoid the issue by doing the following:
- define MVFR2_MAYBE_UNDEFINED on arm32
- if MVFR2_MAYBE_UNDEFINED, do not attempt to read MVFR2 in Xen
- keep the 3rd register_t in struct cpuinfo_arm.mvfr on arm32 so that a
guest read to the register returns '0' instead of crashing the guest.
'0' is an appropriate value to return to the guest because it is defined
as "no support for miscellaneous features".
Aarch64 Xen is not affected by this patch.
Fixes: 9cfdb489af81 ("xen/arm: Add ID registers and complete cpuinfo") Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Acked-by: Julien Grall <jgrall@amazon.com>
Roger Pau Monné [Fri, 8 Jan 2021 15:51:52 +0000 (16:51 +0100)]
x86/hypercall: fix gnttab hypercall args conditional build on pvshim
A pvshim build doesn't require the grant table functionality built in,
but it does require knowing the number of arguments the hypercall has
so the hypercall parameter clobbering works properly.
Instead of also setting the argument count for the gnttab case if PV
shim functionality is enabled, just drop all of the conditionals from
hypercall_args_table, as a hypercall having a NULL handler won't get
to use that information anyway.
Note this hasn't been detected by osstest because the tools pvshim
build is done without debug enabled, so the hypercall parameter
clobbering doesn't happen.
Fixes: d2151152dd2 ('xen: make grant table support configurable') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
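A simplified sketch of why the argument counts can be listed unconditionally follows. It is not the Xen dispatcher: the hypercall numbering, the handlers[] and nr_args[] tables, the poison value and dispatch_sketch() are all hypothetical names chosen for illustration. It only shows that a NULL handler returns before the argument count is consulted, while an installed handler needs the count for the clobbering to be correct.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hypercall numbering, for illustration only. */
enum { HC_MEMORY_OP, HC_GRANT_TABLE_OP, HC_NR };

#define NR_PARAM_REGS 6
#define POISON 0xdeadbeefUL

typedef long (*hypercall_fn_t)(unsigned long *regs);

static long do_memory_op_sketch(unsigned long *regs)
{
    return (long)regs[0];
}

/* A NULL slot models a handler that is compiled out or not installed. */
static hypercall_fn_t handlers[HC_NR] = {
    [HC_MEMORY_OP] = do_memory_op_sketch,
};

/* Argument counts are listed unconditionally, with no #ifdef around them. */
static const uint8_t nr_args[HC_NR] = {
    [HC_MEMORY_OP] = 2,
    [HC_GRANT_TABLE_OP] = 3,
};

static long dispatch_sketch(unsigned int nr, unsigned long regs[NR_PARAM_REGS])
{
    unsigned int i;

    assert(regs != NULL);

    /* With a NULL handler the argument count is never consulted. */
    if (nr >= HC_NR || handlers[nr] == NULL)
        return -ENOSYS;

    /* Debug-style clobbering of the registers this hypercall does not use. */
    for (i = nr_args[nr]; i < NR_PARAM_REGS; i++)
        regs[i] = POISON;

    return handlers[nr](regs);
}
```

If a handler is later stored into the HC_GRANT_TABLE_OP slot, the count of 3 is already in place, which is the situation the pvshim build depends on.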
Jan Beulich [Fri, 8 Jan 2021 15:51:19 +0000 (16:51 +0100)]
x86/shadow: adjust TLB flushing in sh_unshadow_for_p2m_change()
Accumulating transient state of d->dirty_cpumask in a local variable is
unnecessary here: The flush is fine to make with the dirty set at the
time of the call. With this, move the invocation to a central place at
the end of the function.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Fri, 8 Jan 2021 15:50:11 +0000 (16:50 +0100)]
x86/p2m: pass old PTE directly to write_p2m_entry_pre() hook
In no case is a pointer to non-const needed. Since no pointer arithmetic
is done by the sole user of the hook, passing in the PTE itself is quite
fine.
While doing this adjustment also
- drop the intermediate sh_write_p2m_entry_pre():
sh_unshadow_for_p2m_change() can itself be used as the hook function,
moving the conditional into there,
- introduce a local variable holding the flags of the old entry.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Fri, 8 Jan 2021 15:49:23 +0000 (16:49 +0100)]
x86/p2m: avoid unnecessary calls of write_p2m_entry_pre() hook
When shattering a large page, we first construct the new page table page
and only then hook it up. The "pre" hook in this case does nothing, for
the page starting out all blank. Avoid 512 calls into shadow code in
this case by passing in INVALID_GFN, indicating the page being updated
is not (yet) associated with any GFN. (The alternative to this change
would be to actually pass in a correct GFN, which can't be all the same
on every loop iteration.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
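The effect can be illustrated with a small stand-alone sketch. The types and names here (gfn_sketch_t, INVALID_GFN_SKETCH, pre_hook(), write_entry(), shatter_sketch()) are hypothetical, and where exactly the real code tests for INVALID_GFN is an assumption; the sketch only shows the "pre" hook being skipped for all 512 entries of a freshly built table.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfn_t and INVALID_GFN. */
typedef struct { unsigned long g; } gfn_sketch_t;
#define INVALID_GFN_SKETCH ((gfn_sketch_t){ ~0UL })
#define ENTRIES_PER_TABLE 512

static unsigned int pre_hook_calls;

/* Models the "pre" hook; here it only counts invocations. */
static void pre_hook(gfn_sketch_t gfn, unsigned long old)
{
    (void)gfn;
    (void)old;
    pre_hook_calls++;
}

static void write_entry(unsigned long *slot, gfn_sketch_t gfn,
                        unsigned long new_val)
{
    assert(slot != NULL);

    /* No GFN is associated with the page yet, so there is nothing to undo. */
    if (gfn.g != INVALID_GFN_SKETCH.g)
        pre_hook(gfn, *slot);

    *slot = new_val;
}

/* Fill a new, still unhooked page table page without invoking the hook. */
static void shatter_sketch(unsigned long table[ENTRIES_PER_TABLE],
                           unsigned long base)
{
    unsigned int i;

    for (i = 0; i < ENTRIES_PER_TABLE; i++)
        write_entry(&table[i], INVALID_GFN_SKETCH, base + i);
}
```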
Tamas K Lengyel [Fri, 8 Jan 2021 10:51:36 +0000 (11:51 +0100)]
x86/mem_sharing: resolve mm-lock order violations when forking VMs with nested p2m
Several lock-order violations have been encountered while attempting to fork
VMs with nestedhvm=1 set. This patch resolves the issues.
The order violations stem from a call to p2m_flush_nestedp2m being performed
whenever the hostp2m changes. This function always takes the p2m lock for the
nested_p2m. However, with sharing the p2m locks always have to be taken before
the sharing lock. To resolve this issue we avoid taking the sharing lock where
possible (it was actually unnecessary to begin with). We also make
p2m_flush_nestedp2m aware that the p2m lock may have already been taken, and
preemptively take all nested_p2m locks before unsharing a page where taking the
sharing lock is necessary.
Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 8 Jan 2021 10:50:32 +0000 (11:50 +0100)]
x86: fold indirect_thunk_asm.h into asm-defns.h
There's little point in having two separate headers both getting
included by asm_defns.h. This in particular reduces the number of
instances of guarding asm(".include ...") suitably in such dual use
headers.
No change to generated code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Fri, 8 Jan 2021 10:48:09 +0000 (11:48 +0100)]
x86: drop ASM_{CL,ST}AC
Use ALTERNATIVE directly, such that at the use sites it is visible that
alternative code patching is in use. Similarly avoid hiding the fact in
SAVE_ALL.
No change to generated code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 8 Jan 2021 10:45:07 +0000 (11:45 +0100)]
x86: replace __ASM_{CL,ST}AC
Introduce proper assembler macros instead, enabled only when the
assembler itself doesn't support the insns. To avoid duplicating the
macros for assembly and C files, have them processed into asm-macros.h.
This in turn requires adding a multiple inclusion guard when generating
that header.
No change to generated code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Roman Skakun [Wed, 6 Jan 2021 11:26:57 +0000 (13:26 +0200)]
xen/arm: optee: The function identifier is always 32-bit
Per the SMCCC specification (see section 3.1 in ARM DEN 0028D), the
function identifier is only stored in the least significant 32-bits.
The most significant 32-bits should be ignored.
Signed-off-by: Roman Skakun <roman_skakun@epam.com> Acked-by: Volodymyr Babchyk <volodymyr_babchuk@epam.com>
[jgrall: Reword the commit message and comment] Acked-by: Julien Grall <jgrall@amazon.com>
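As a minimal sketch of the rule, the helper below truncates a 64-bit register value to the 32-bit function identifier before comparing it. The names smccc_function_id(), matches_fid() and EXAMPLE_FID are hypothetical and not taken from the OP-TEE mediator code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical function identifier, used only for this example. */
#define EXAMPLE_FID 0x32000001u

/* Keep the least significant 32 bits; the upper half is ignored. */
static inline uint32_t smccc_function_id(uint64_t x0)
{
    return (uint32_t)x0;
}

static bool matches_fid(uint64_t x0, uint32_t fid)
{
    return smccc_function_id(x0) == fid;
}

/* Usage: garbage in the upper half must not affect the comparison. */
static void smccc_example(void)
{
    assert(matches_fid(0xdeadbeef32000001ULL, EXAMPLE_FID));
}
```

Comparing the full 64-bit value against the identifier instead would reject a call whose upper bits happen to be non-zero.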
Jan Beulich [Thu, 7 Jan 2021 14:11:25 +0000 (15:11 +0100)]
xsm/dummy: harden against speculative abuse
First of all don't open-code is_control_domain(), which is already
suitably using evaluate_nospec(). Then also apply this construct to the
other paths of xsm_default_action(). Also guard two paths not using this
function.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wl@xen.org>
Roger Pau Monné [Thu, 7 Jan 2021 14:10:29 +0000 (15:10 +0100)]
x86/dpci: EOI interrupt regardless of its masking status
Modify hvm_pirq_eoi to always EOI the interrupt if required, instead
of not doing such EOI if the interrupt is routed through the vIO-APIC
and the entry is masked at the time the EOI is performed.
Further unmask of the vIO-APIC pin won't EOI the interrupt, and thus
the guest OS has to wait for the timeout to expire and the automatic
EOI to be performed.
This allows simplifying the helpers and dropping the vioapic_redir_entry
parameter from all of them.
Fixes: ccfe4e08455 ('Intel vt-d specific changes in arch/x86/hvm/vmx/vtd.') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 7 Jan 2021 14:09:47 +0000 (15:09 +0100)]
x86: drop use of E801 memory "map" (and alike)
ACPI mandates use of E820 (or newer, e.g. EFI), and in fact firmware
has been observed to include E820_ACPI ranges in what E801 reports as
available (really "configured") memory. Since all 64-bit systems ought
to support ACPI, drop our use of older BIOS and boot loader interfaces.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 7 Jan 2021 14:03:17 +0000 (15:03 +0100)]
vPCI/MSI-X: fold clearing of entry->updated
Both call sites clear the flag after a successful call to
update_entry(). This can be simplified by moving the clearing into the
function, onto its success path.
As a result of neither caller caring about update_entry()'s return value
anymore, the function gets switched to return void.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
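The shape of the change can be sketched as follows. struct entry_sketch, program_hw_sketch() and update_entry_sketch() are hypothetical stand-ins rather than the vPCI code, and the hw_fails field exists only to simulate a failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an MSI-X table entry. */
struct entry_sketch {
    bool updated;    /* set when the entry was modified and needs syncing */
    bool hw_fails;   /* test knob: makes the simulated programming fail */
};

static int program_hw_sketch(const struct entry_sketch *entry)
{
    return entry->hw_fails ? -EIO : 0;
}

/* Returns void: no caller needs the result once the flag is handled here. */
static void update_entry_sketch(struct entry_sketch *entry)
{
    assert(entry != NULL);

    if (program_hw_sketch(entry))
        return;               /* failure: the flag stays set */

    entry->updated = false;   /* success path: cleared here, not in callers */
}
```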
libxl: cleanup remaining backend xs dirs after driver domain
When a device is removed, the backend domain (which may be a driver domain) is
responsible for removing backend entries from xenstore. But in the case of a
driver domain, it does not have the access needed to remove all of them:
specifically, the directory named after the frontend-id remains. These may
accumulate enough to exceed the xenstore quota of the driver domain, breaking
further devices.
Fix this by calling libxl__xs_path_cleanup() on the backend path from
libxl__device_destroy() in the toolstack domain too. Note
libxl__device_destroy() is called when the driver domain already removed
what it can (see device_destroy_be_watch_cb()->device_hotplug_done()).
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Wei Liu <wl@xen.org>
Jan Beulich [Tue, 5 Jan 2021 12:20:54 +0000 (13:20 +0100)]
lib/sort: adjust types
First and foremost do away with the use of plain int for sizes or size-
derived values. Use size_t, despite this requiring some adjustment to
the logic. Also replace u32 by uint32_t.
While not directly related also drop a leftover #ifdef from x86's
swap_ex - this was needed only back when 32-bit Xen was still a thing.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
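To illustrate why switching to size_t needs a logic adjustment: a countdown loop written as "for ( ; i >= 0; i -= size )" never terminates once i is unsigned. The heapsort below is an independent sketch using byte offsets of type size_t, not the lib/sort code; sort_sketch(), sift_down(), swap_bytes() and cmp_int() are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*cmp_fn_t)(const void *, const void *);

static void swap_bytes(char *a, char *b, size_t size)
{
    while (size--) {
        char t = *a;
        *a++ = *b;
        *b++ = t;
    }
}

/* root and n are byte offsets into base; n marks the end of the heap. */
static void sift_down(char *base, size_t root, size_t n, size_t size,
                      cmp_fn_t cmp)
{
    for (;;) {
        size_t child = root * 2 + size;

        if (child >= n)
            break;
        if (child + size < n && cmp(base + child, base + child + size) < 0)
            child += size;
        if (cmp(base + root, base + child) >= 0)
            break;
        swap_bytes(base + root, base + child, size);
        root = child;
    }
}

static void sort_sketch(void *base_, size_t num, size_t size, cmp_fn_t cmp)
{
    char *base = base_;
    size_t n = num * size, i;

    assert(size > 0);
    if (num < 2)
        return;

    /* Heapify: test before decrementing, so i never wraps below zero. */
    for (i = (num / 2) * size; i > 0; ) {
        i -= size;
        sift_down(base, i, n, size, cmp);
    }

    /* Sort: i is a multiple of size, so it reaches exactly 0 and stops. */
    for (i = n - size; i > 0; i -= size) {
        swap_bytes(base, base + i, size);
        sift_down(base, 0, i, size, cmp);
    }
}

/* Example comparator for int elements. */
static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;

    return (x > y) - (x < y);
}
```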
Jan Beulich [Tue, 5 Jan 2021 12:20:13 +0000 (13:20 +0100)]
vPCI/MSI-X: tidy init_msix()
First of all introduce a local variable for the to-be-allocated struct.
The compiler can't CSE all the occurrences (I'm observing 80 bytes of
code saved with gcc 10). Additionally, while the caller can cope and
there was no memory leak, globally "announce" the struct only once done
initializing it. This also removes the dependency of the function on
the caller cleaning up after it in case of an error.
Also prefer a local variable over using a structure field previously
set from this very variable.
Finally move the call to vpci_add_register() ahead of all further
initialization of the struct, to bail early in case of error.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
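The resulting pattern (register first, initialize through a local pointer, publish last) can be sketched as below. This is not init_msix() itself: struct msix_sketch, struct pdev_sketch and add_register_sketch() are hypothetical, the latter standing in for vpci_add_register() with a knob to simulate failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the MSI-X state and its owning device. */
struct msix_sketch { unsigned int max_entries; };
struct pdev_sketch { struct msix_sketch *msix; };

/* Stand-in for the register hookup; can be told to fail for testing. */
static int add_register_sketch(bool fail)
{
    return fail ? -ENOMEM : 0;
}

static int init_msix_sketch(struct pdev_sketch *pdev,
                            unsigned int max_entries, bool fail_register)
{
    struct msix_sketch *msix;
    int rc;

    assert(pdev != NULL);

    msix = calloc(1, sizeof(*msix));
    if (!msix)
        return -ENOMEM;

    /* Do the step that can fail before any further initialization. */
    rc = add_register_sketch(fail_register);
    if (rc) {
        free(msix);           /* nothing was published, so clean up locally */
        return rc;
    }

    msix->max_entries = max_entries;

    /* "Announce" the struct only once it is fully initialized. */
    pdev->msix = msix;

    return 0;
}
```

On the error path the caller never sees a half-initialized structure, so it has nothing to clean up.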