Andrew Cooper [Mon, 27 Jul 2020 16:24:11 +0000 (17:24 +0100)]
xen/memory: Fix mapping grant tables with XENMEM_acquire_resource
A guest's default number of grant frames is 64, and XENMEM_acquire_resource
will reject an attempt to map more than 32 frames. This limit is caused by
the size of mfn_list[] on the stack.
Fix mapping of arbitrary size requests by looping over batches of 32 in
acquire_resource(), and using hypercall continuations when necessary.
To start with, break _acquire_resource() out of acquire_resource() to cope
with type-specific dispatching, and update the return semantics to indicate
the number of mfns returned. Update gnttab_acquire_resource() and x86's
arch_acquire_resource() to match these new semantics.
Have do_memory_op() pass start_extent into acquire_resource() so it can pick
up where it left off after a continuation, and loop over batches of 32 until
all the work is done, or a continuation needs to occur.
compat_memory_op() is a bit more complicated, because it also has to marshal
frame_list in the XLAT buffer. Have it account for continuation information
itself and hide details from the upper layer, so it can marshal the buffer in
chunks if necessary.
With these fixes in place, it is now possible to map the whole grant table for
a guest.
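The batching described above can be sketched as follows. This is a minimal model and not the Xen implementation: process_batches(), BATCH_SIZE and the max_batches preemption budget are hypothetical names; only the batch size of 32 and the idea of resuming from start_extent come from the commit message.

```c
#include <assert.h>

/* Batch size; the commit message ties the value 32 to the size of mfn_list[]. */
#define BATCH_SIZE 32u

/*
 * Hypothetical model: handle frames [start_extent, nr_frames) in batches of
 * at most BATCH_SIZE. max_batches stands in for a preemption check; once it
 * is used up we stop early, as a hypercall continuation would.
 * Returns the new start_extent (equal to nr_frames once all work is done).
 */
static unsigned int process_batches(unsigned int nr_frames,
                                    unsigned int start_extent,
                                    unsigned int max_batches)
{
    unsigned int done = start_extent;
    unsigned int batches = 0;

    assert(start_extent <= nr_frames);

    while ( done < nr_frames && batches < max_batches )
    {
        unsigned int todo = nr_frames - done;

        if ( todo > BATCH_SIZE )
            todo = BATCH_SIZE;

        /* A real implementation would map `todo` frames here. */
        done += todo;
        batches++;
    }

    return done;
}
```

Calling the function again with the returned value as start_extent resumes where the previous call stopped, which is the role start_extent plays after a continuation.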
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 5 Feb 2021 13:09:42 +0000 (14:09 +0100)]
x86/EFI: work around GNU ld 2.36 issue
Our linker capability check fails with the recent binutils release's ld:
.../check.o:(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
.../check.o:(.debug_info+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_abbrev'
.../check.o:(.debug_info+0xc): relocation truncated to fit: R_X86_64_32 against `.debug_str'+76
.../check.o:(.debug_info+0x11): relocation truncated to fit: R_X86_64_32 against `.debug_str'+d
.../check.o:(.debug_info+0x15): relocation truncated to fit: R_X86_64_32 against `.debug_str'+2b
.../check.o:(.debug_info+0x29): relocation truncated to fit: R_X86_64_32 against `.debug_line'
.../check.o:(.debug_info+0x30): relocation truncated to fit: R_X86_64_32 against `.debug_str'+19
.../check.o:(.debug_info+0x37): relocation truncated to fit: R_X86_64_32 against `.debug_str'+71
.../check.o:(.debug_info+0x3e): relocation truncated to fit: R_X86_64_32 against `.debug_str'
.../check.o:(.debug_info+0x45): relocation truncated to fit: R_X86_64_32 against `.debug_str'+5e
.../check.o:(.debug_info+0x4c): additional relocation overflows omitted from the output
Tell the linker to strip debug info as a workaround. Debug info has been
getting stripped already anyway when linking the actual xen.efi.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Fri, 5 Feb 2021 12:19:38 +0000 (13:19 +0100)]
tools/tests: fix resource test build on FreeBSD
error.h is not a standard header, and none of the functions declared
there are actually used by the code. This fixes the build on FreeBSD,
which doesn't have error.h.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 23 Jul 2020 16:26:16 +0000 (17:26 +0100)]
tools/tests: Introduce a test for acquire_resource
For now, simply try to map 40 frames of grant table. This catches most of the
basic errors with resource sizes found and fixed through the 4.15 dev window.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Manuel Bouyer [Wed, 3 Feb 2021 16:54:19 +0000 (17:54 +0100)]
tools/xenstored: close socket connections on error
On error, close socket connections instead of keeping them in the ignored state.
When the remote end of a socket is closed, xenstored will flag it as an
error and switch the connection to ignored. But on some OSes (e.g.
NetBSD), poll(2) will return only POLLIN in this case, so sockets in ignored
state will stay open forever in xenstored (and it will loop with CPU 100%
busy).
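As an illustration of the failure mode, here is a minimal sketch assuming a plain stream socket. drain_or_close() is a hypothetical helper and not xenstored code: when poll(2) reports only POLLIN for a connection whose peer has gone away, the following read returns 0, and closing the descriptor at that point keeps it from being polled forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from a connected socket that poll() reported as
 * readable. A return of 0 from recv() means the peer closed the connection,
 * so close our end instead of keeping the descriptor around.
 * Returns true if the connection was closed.
 */
static bool drain_or_close(int fd)
{
    char buf[64];
    ssize_t n;

    assert(fd >= 0);

    n = recv(fd, buf, sizeof(buf), 0);
    if ( n > 0 )
        return false;   /* data consumed, connection still alive */

    close(fd);          /* EOF (n == 0) or error (n < 0) */
    return true;
}
```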
Fixes: d2fa370d3ef9 ("tools/xenstore: Preserve bad client until they are destroyed") Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Manuel Bouyer [Wed, 3 Feb 2021 16:54:18 +0000 (17:54 +0100)]
tools/hotplug: Add a qemu-ifup script on NetBSD
On NetBSD, qemu-xen will use a qemu-ifup script to set up the tap interfaces
(as qemu-xen-traditional used to). Copy the script from qemu-xen-traditional,
and install it on NetBSD. While there, document parameters and environment
variables.
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 4 Feb 2021 15:50:16 +0000 (15:50 +0000)]
libs/devicemodel: Fix ABI breakage from xendevicemodel_set_irq_level()
It is not permitted to edit the VERS clause for a version in a release of Xen.
Revert xendevicemodel_set_irq_level()'s inclusion in .so.1.2 and bump the
library minor version to .so.1.4 instead.
Fixes: 5d752df85f ("xen/dm: Introduce xendevicemodel_set_irq_level DM op") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Edwin Török [Fri, 15 Jan 2021 19:38:58 +0000 (19:38 +0000)]
tools/oxenstored: mkdir conflicts were sometimes missed
Due to how set_write_lowpath was used here it didn't detect create/delete
conflicts. When we create an entry we must mark our parent as modified
(this is what creating a new node via write does).
Otherwise we can have 2 transactions, one creating and another deleting a node,
both succeeding depending on timing. Or one transaction can read an entry,
conclude it doesn't exist, do some other work based on that information and
successfully commit even if another transaction creates the node via mkdir
meanwhile.
Signed-off-by: Edwin Török <edvin.torok@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Edwin Török [Fri, 15 Jan 2021 19:28:37 +0000 (19:28 +0000)]
tools/oxenstored: Reject invalid watch paths early
Watches on invalid paths were accepted, but they would never trigger. The
client also got no notification that its watch is bad and would never trigger.
Found again by the structured fuzzer, due to an error on live update reload:
the invalid watch paths would get rejected during live update and the list of
watches would be different pre/post live update.
The testcase is watch on `//`, which is an invalid path.
Signed-off-by: Edwin Török <edvin.torok@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Edwin Török [Fri, 15 Jan 2021 19:11:32 +0000 (19:11 +0000)]
tools/oxenstored: Fix quota calculation for mkdir EEXIST
We increment the domain's quota on mkdir even when the node already exists.
This results in a quota inconsistency after live update, where reconstructing
the tree from scratch results in a different quota.
Not a security issue because the domain uses up quota faster, so it will only
get a Quota error sooner than it should.
Found by the structured fuzzer.
Signed-off-by: Edwin Török <edvin.torok@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Thu, 4 Feb 2021 13:02:32 +0000 (14:02 +0100)]
x86/efi: enable MS ABI attribute on clang
Or else the EFI service calls will use the wrong calling convention.
The __ms_abi__ attribute is available on all supported versions of
clang. Add a specific Clang check because the GCC version reported by
Clang is below the required 4.4 to use the __ms_abi__ attribute.
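A compiler check of this kind could look like the sketch below. It is an illustration and not Xen's actual macro: EFI_CALL_ABI, add_ms_abi() and ms_abi_selfcheck() are hypothetical names, and the x86-64 guard is an addition so the snippet also builds elsewhere.

```c
#include <assert.h>

/*
 * Clang is tested explicitly because the GCC version it reports through
 * __GNUC__/__GNUC_MINOR__ is older than 4.4.
 */
#if defined(__x86_64__) && \
    (defined(__clang__) || \
     (defined(__GNUC__) && \
      (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 4))))
# define EFI_CALL_ABI __attribute__((__ms_abi__))
#else
# define EFI_CALL_ABI /* attribute unavailable: default calling convention */
#endif

/* A function that uses the Microsoft x64 calling convention where available. */
static EFI_CALL_ABI long add_ms_abi(long a, long b)
{
    return a + b;
}

/* Usage: the attribute changes the calling convention, not the result. */
static void ms_abi_selfcheck(void)
{
    assert(add_ms_abi(2, 3) == 5);
}
```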
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 4 Feb 2021 12:59:56 +0000 (13:59 +0100)]
x86/string: correct memmove()'s forwarding to memcpy()
With memcpy() expanding to the compiler builtin, we may not hand it
overlapping source and destination. We strictly mean to forward to our
own implementation (a few lines up in the same source file).
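The point can be illustrated with a small, self-contained sketch. It is not the x86 assembly implementation in Xen; copy_forward() and sketch_memmove() are hypothetical names. The forward case is delegated to a local routine whose behaviour for overlapping buffers is known, rather than to memcpy(), which the compiler may treat as a builtin that assumes no overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Our own forward byte copy; well defined even when dst overlaps src from below. */
static void *copy_forward(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n == 0 || (dst && src));

    while ( n-- )
        *d++ = *s++;

    return dst;
}

/*
 * memmove() sketch: when dst is below src a forward copy is safe, so forward
 * to copy_forward(). Otherwise copy backwards, last byte first.
 */
static void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ( d == s || n == 0 )
        return dst;

    if ( d < s )
        return copy_forward(dst, src, n);

    while ( n-- )
        d[n] = s[n];

    return dst;
}
```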
Fixes: 78825e1c60fa ("x86/string: Clean up x86/string.h") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 3 Feb 2021 15:43:35 +0000 (15:43 +0000)]
libs/foreignmem: Fix/simplify errno handling for map_resource
Simplify the FreeBSD and Linux logic, left in this state by the previous
change. No functional change.
Duplicate the FreeBSD logic for NetBSD, to maintain the uniform ABI for
callers that EOPNOTSUPP covers all Xen/Kernel support.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Wed, 3 Feb 2021 15:41:55 +0000 (15:41 +0000)]
libs/foreignmem: Drop useless and/or misleading logging
These log lines are all in response to single system calls, and do not provide
any information which the immediate caller can't determine themselves. It is
however rude to put junk like this onto stderr, especially as system call
failures are not even error conditions in certain circumstances.
The FreeBSD logging has stale function names in, and Solaris shouldn't have
passed code review to start with.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Tue, 2 Feb 2021 10:36:50 +0000 (11:36 +0100)]
x86/build: correctly record dependencies of asm-offsets.s
Going through an intermediate *.new file requires telling the compiler
what the real target is, so that the inclusion of the resulting .*.d
file will actually be useful.
Fixes: 7d2d7a43d014 ("x86/build: limit rebuilding of asm-offsets.h") Reported-by: Julien Grall <julien@xen.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Tue, 2 Feb 2021 10:35:42 +0000 (11:35 +0100)]
memory: fix build with COVERAGE but !HVM
Xen is heavily relying on the DCE stage to remove unused code so the
linker doesn't throw an error because a function is not implemented
yet we defined a prototype for it.
On some GCC versions (such as 9.4 provided by Debian sid), the compiler
DCE stage will not manage to figure that out for
xenmem_add_to_physmap_batch():
ld: ld: prelink.o: in function `xenmem_add_to_physmap_batch':
/xen/xen/common/memory.c:942: undefined reference to `xenmem_add_to_physmap_one'
/xen/xen/common/memory.c:942:(.text+0x22145): relocation truncated
to fit: R_X86_64_PLT32 against undefined symbol `xenmem_add_to_physmap_one'
prelink-efi.o: in function `xenmem_add_to_physmap_batch':
/xen/xen/common/memory.c:942: undefined reference to `xenmem_add_to_physmap_one'
make[2]: *** [Makefile:215: /root/xen/xen/xen.efi] Error 1
make[2]: *** Waiting for unfinished jobs....
ld: /xen/xen/.xen-syms.0: hidden symbol `xenmem_add_to_physmap_one' isn't defined
ld: final link failed: bad value
It is not entirely clear why the compiler DCE is not detecting the
unused code. However, cloning the check introduced by the commit below
into xenmem_add_to_physmap_batch() does the trick.
No functional change intended.
Fixes: d4f699a0df6c ("x86/mm: p2m_add_foreign() is HVM-only") Reported-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Signed-off-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wl@xen.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
A non-breaking space isn't a valid C preprocessor token.
Fixes: ffbb8aa282de ("xenstore: fix build on {Net/Free}BSD") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Manuel Bouyer [Sat, 30 Jan 2021 18:27:10 +0000 (19:27 +0100)]
xenpmd.c: use dynamic allocation
On NetBSD, d_name is larger than 256, so file_name[284] may not be large
enough (and gcc emits a format-truncation error).
Use asprintf() instead of snprintf() on a static on-stack buffer.
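The difference between a fixed buffer and a dynamically sized one can be sketched as below. To stay portable the sketch does not call asprintf() itself; it uses the standard measure-then-allocate idiom with snprintf() and malloc(), which has the same effect. join_path() is a hypothetical helper, not code from xenpmd.c.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: build "<dir>/<name>" in a heap buffer sized to fit,
 * so a long directory entry name cannot be truncated. Returns NULL on
 * failure; the caller frees the result.
 */
static char *join_path(const char *dir, const char *name)
{
    int len;
    char *out;

    assert(dir && name);

    len = snprintf(NULL, 0, "%s/%s", dir, name);   /* measure only */
    if ( len < 0 )
        return NULL;

    out = malloc((size_t)len + 1);
    if ( !out )
        return NULL;

    snprintf(out, (size_t)len + 1, "%s/%s", dir, name);

    return out;
}
```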
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Plus
define GNU_SOURCE for asprintf()
Harmless on NetBSD.
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Tamas K Lengyel [Sat, 30 Jan 2021 01:59:53 +0000 (20:59 -0500)]
x86/debug: fix page-overflow bug in dbg_rw_guest_mem
When using gdbsx, dbg_rw_guest_mem is used to read/write guest memory. When the
buffer being accessed is on a page-boundary, the next page needs to be grabbed
to access the correct memory for the buffer's overflown parts. While
dbg_rw_guest_mem has logic to handle that, it broke with 229492e210a. Instead
of grabbing the next page the code right now is looping back to the
start of the first page. This results in errors like the following while trying
to use gdb with Linux' lx-dmesg:
[ 0.114457] PM: hibernation: Registered nosave memory: [mem
0xfdfff000-0xffffffff]
[ 0.114460] [mem 0x90000000-0xfbffffff] available for PCI demem 0
[ 0.114462] f]f]
Python Exception <class 'ValueError'> embedded null character:
Error occurred in Python: embedded null character
Fix this bug by taking the variable assignment outside the loop.
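The class of bug can be shown with a simplified model. This is a sketch under stated assumptions, not the dbg_rw_guest_mem() code: read_guest(), PAGE_SIZE_SK and the two-dimensional page array are hypothetical, and the page size is tiny to keep the example readable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE_SK 16u   /* tiny page size so the example is easy to follow */

/*
 * Copy len bytes starting at guest address addr out of an array of pages.
 * addr is set once, before the loop, and advanced by each chunk. Resetting
 * it inside the loop would make every iteration start at the first page
 * again, which is the kind of bug the commit describes.
 */
static void read_guest(unsigned char (*pages)[PAGE_SIZE_SK], size_t nr_pages,
                       size_t addr, unsigned char *buf, size_t len)
{
    while ( len > 0 )
    {
        size_t page = addr / PAGE_SIZE_SK;
        size_t off = addr % PAGE_SIZE_SK;
        size_t chunk = PAGE_SIZE_SK - off;

        if ( chunk > len )
            chunk = len;

        assert(page < nr_pages);
        memcpy(buf, &pages[page][off], chunk);

        addr += chunk;   /* move on to the next page, if any */
        buf += chunk;
        len -= chunk;
    }
}
```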
Fixes: 229492e210a ("x86/debugger: use copy_to/from_guest() in dbg_rw_guest_mem()") Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 20 Jan 2021 19:06:19 +0000 (19:06 +0000)]
xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace
We're about to introduce support for Intel Processor Trace, but similar
functionality exists in other platforms.
Aspects of vmtrace can reasonably be common, so start with
XEN_SYSCTL_PHYSCAP_vmtrace and plumb the signal from Xen all the way down into
`xl info`.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
The frame_list is an input, or an output, depending on whether the calling
domain is translated or not. The array does not need marshalling in both
directions.
Furthermore, the copy-in loop was very inefficient, copying 4 bytes at a
time. Rewrite it to copy in all nr_frames at once, and then expand
compat_pfn_t to xen_pfn_t in place.
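Expanding in place works because the wide entries never overwrite narrow entries that are still unread, provided the loop runs from the last element to the first. A minimal sketch, assuming 32-bit compat frame numbers and 64-bit native ones; widen_in_place() is a hypothetical name, not the function in the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * buf holds nr 32-bit frame numbers packed at its start and has room for nr
 * 64-bit entries. Widen them in place, iterating backwards so that no 32-bit
 * input is overwritten before it has been read.
 */
static void widen_in_place(uint64_t *buf, unsigned int nr)
{
    unsigned char *raw = (unsigned char *)buf;

    assert(buf != NULL || nr == 0);

    for ( unsigned int i = nr; i-- > 0; )
    {
        uint32_t narrow;

        /* Read the i-th packed 32-bit value without aliasing issues. */
        memcpy(&narrow, raw + (size_t)i * sizeof(narrow), sizeof(narrow));
        buf[i] = narrow;
    }
}
```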
Re-position the copy-in loop to simplify continuation support in a future
patch, and reduce the scope of certain variables.
No change in guest observed behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org>
Andrew Cooper [Thu, 23 Jul 2020 14:18:33 +0000 (15:18 +0100)]
xen/memory: Fix acquire_resource size semantics
Calling XENMEM_acquire_resource with a NULL frame_list is a request for the
size of the resource, but the returned 32 is bogus.
If someone tries to follow it for XENMEM_resource_ioreq_server, the acquire
call will fail as IOREQ servers currently top out at 2 frames, and it is only
half the size of the default grant table limit for guests.
Also, no users actually request a resource size, because it was never wired up
in the sole implementation of resource acquisition in Linux.
Introduce a new resource_max_frames() to calculate the size of a resource, and
implement it in the IOREQ and grant subsystems.
It is impossible to guarantee that a mapping call following a successful size
call will succeed (e.g. the target IOREQ server gets destroyed, or the domain
switches from grant v2 to v1). Document the restriction, and use the
flexibility to simplify the paths to be lockless.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 27 Jul 2020 12:40:06 +0000 (13:40 +0100)]
xen/gnttab: Rework resource acquisition
The existing logic doesn't function in the general case for mapping a guest's
grant table, due to the arbitrary 32 frame limit, and the default grant table
limit being 64.
In order to start addressing this, rework the existing grant table logic by
implementing a single gnttab_acquire_resource(). This is far more efficient
than the previous acquire_grant_table() in memory.c because it doesn't take
the grant table write lock, and attempt to grow the table, for every single
frame.
The new gnttab_acquire_resource() function subsumes the previous two
gnttab_get_{shared,status}_frame() helpers.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
The ABI is unfortunate, and frame being 64 bits leads to all kinds of problems
performing correct overflow checks.
Reject out-of-range values, and combinations which overflow, and use unsigned
int consistently elsewhere. This fixes several truncation bugs in the grant
call tree, as the underlying limits are expressed with unsigned int to begin
with.
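A range check of the kind described could look like the sketch below. It is an illustration and not the code from the patch: frame_range_end() is a hypothetical name, and the parameter widths (64-bit frame, 32-bit count, unsigned int limits) follow the description above.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Accept the request only if frame fits in unsigned int and
 * frame + nr_frames does not wrap. On success *end holds the first frame
 * past the requested range.
 */
static bool frame_range_end(uint64_t frame, uint32_t nr_frames,
                            unsigned int *end)
{
    if ( frame > UINT_MAX )
        return false;               /* out-of-range value */

    if ( nr_frames > UINT_MAX - (unsigned int)frame )
        return false;               /* combination overflows */

    *end = (unsigned int)frame + nr_frames;
    assert(*end >= (unsigned int)frame);   /* guaranteed by the checks above */

    return true;
}
```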
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Manuel Bouyer [Tue, 26 Jan 2021 22:47:58 +0000 (23:47 +0100)]
libs/light: pass some infos to qemu
Pass bridge name to qemu as command line option
When starting qemu, set an environment variable XEN_DOMAIN_ID,
to be used by qemu helper scripts
The only functional difference of using the br parameter is that the
bridge name gets passed to the QEMU script.
NetBSD doesn't have the ioctl to rename network interfaces implemented, and
thus cannot rename the interface from tapX to vifX.Y-emu. Only qemu knows
the tap interface name, so we need to use the qemu script from qemu itself.
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Manuel Bouyer [Tue, 26 Jan 2021 22:47:57 +0000 (23:47 +0100)]
libs/light: make it build without setresuid()
NetBSD doesn't have setresuid(). Introduce libxl__setresuid(),
which on NetBSD assert()s that it's never called (it should not be called when
dm restriction is off, and NetBSD doesn't support dm restriction at
this time).
On Linux and FreeBSD it just calls setresuid().
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Manuel Bouyer [Tue, 26 Jan 2021 22:47:54 +0000 (23:47 +0100)]
libs/light: Switch NetBSD to QEMU_XEN
Switch NetBSD to QEMU_XEN.
All 3 versions of libxl__default_device_model() now return
LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN, so remove it and just set
b_info->device_model_version to LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN in
libxl__domain_build_info_setdefault().
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Manuel Bouyer [Tue, 26 Jan 2021 22:47:49 +0000 (23:47 +0100)]
NetBSD hotplug: fix block unconfigure on destroy
When a domain is destroyed, xparams may not be available any more when
the block script is called to unconfigure the vnd.
Check xparam only at configure time, and just unconfigure any vnd present
in the xenstore.
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Manuel Bouyer [Tue, 26 Jan 2021 22:47:48 +0000 (23:47 +0100)]
NetBSD hotplug: Introduce locking functions
On NetBSD, some block device configuration requires serialisation.
Introduce locking functions (derived from the Linux version), and use them
in the block script where appropriate.
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen/ioreq: Make the IOREQ feature selectable on Arm
The purpose of this patch is to let the user select IOREQ support on
Arm (where it is disabled by default) while retaining the current
behaviour on x86 (where it is selected by HVM and its prompt is not
visible).
Also make IOREQ depend on CONFIG_EXPERT on Arm, since it is considered
a Technological Preview feature, and update SUPPORT.md.
xen/ioreq: Do not let bufioreq to be used on other than x86 arches
This patch prevents a device model running on a non-x86 system from
using the buffered I/O feature for now.
Please note, there is currently no caller on Arm which needs to send
a buffered I/O request; the purpose of this check is to catch any
future user of bufioreq.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com> Acked-by: Paul Durrant <paul@xen.org>
We need to send a mapcache invalidation request to qemu/demu every time
a page gets removed from a guest.
At the moment, the Arm code doesn't explicitly remove the existing
mapping before inserting the new mapping. Instead, this is done
implicitly by __p2m_set_entry().
First of all, we need to recognize the case where the "freed" entry
contains a RAM page, in order to set the corresponding flag.
The most suitable place to do this is p2m_free_entry(), where we can
find the correct leaf type. The invalidation request will be sent
in do_trap_hypercall() later on.
Taking the following into account, do_trap_hypercall() is the best
place to send the invalidation request:
- The only way a guest can modify its P2M on Arm is via a hypercall
- When sending the invalidation request, the vCPU will be blocked
until all the IOREQ servers have acknowledged the invalidation
xen/ioreq: Make x86's send_invalidate_req() common
As the IOREQ is a common feature now and we also need to
invalidate qemu/demu mapcache on Arm when the required condition
occurs this patch moves this function to the common code
(and renames it to ioreq_signal_mapcache_invalidate).
This patch also moves per-domain qemu_mapcache_invalidate
variable out of the arch sub-struct (and drops "qemu" prefix).
We don't put this variable inside the #ifdef CONFIG_IOREQ_SERVER
at the end of struct domain, but in the hole next to the group
of 5 bools further up which is more efficient.
The subsequent patch will add mapcache invalidation handling on Arm.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> CC: Julien Grall <julien.grall@arm.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
In the ideal world we would never get an undefined behavior when
propagating the sign bit since that bit can only be set for access
size smaller than the register size (i.e byte/half-word for aarch32,
byte/half-word/word for aarch64).
In the real world we need to care for *possible* hardware bug such as
advertising a sign extension for either 64-bit (or 32-bit) on Arm64
(resp. Arm32).
So harden a bit more the code to prevent undefined behavior when
propagating the sign bit in case of buggy hardware.
In order to avoid code duplication (both handle_read() and
handle_ioserv() contain the same code for the sign-extension)
put this code to a common helper to be used for both.
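Such a helper could look like the sketch below, written for a 64-bit register. It is an illustration with hypothetical names and parameters, not the helper added by the patch. The size check is the hardening step: shifting a 64-bit value by 64 bits is undefined behaviour in C, so nothing is done when the access is already register sized.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * value holds an access of size_bits bits (for example 8, 16, 32 or 64) in a
 * 64-bit register. If sign extension is requested and the access is narrower
 * than the register, replicate the top bit of the access into the upper bits.
 */
static uint64_t sign_extend_sketch(uint64_t value, unsigned int size_bits,
                                   bool sign)
{
    assert(size_bits >= 1 && size_bits <= 64);

    if ( sign && size_bits < 64 &&
         (value & (UINT64_C(1) << (size_bits - 1))) )
        value |= ~UINT64_C(0) << size_bits;

    return value;
}
```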
xen/dm: Introduce xendevicemodel_set_irq_level DM op
This patch adds the ability for the device emulator to notify the other
end (some entity running in the guest) using an SPI, and implements the
Arm specific bits for it. The proposed interface allows the emulator to
set the logical level of one of a domain's IRQ lines.
We can't reuse the existing DM op (xen_dm_op_set_isa_irq_level)
to inject an interrupt as the "isa_irq" field is only 8-bit and
able to cover IRQ 0 - 255, whereas we need a wider range (0 - 1020).
Please note, for edge-triggered interrupt (which is used for
the virtio-mmio emulation) we only trigger the interrupt on Arm
if the level is asserted (rising edge) and do nothing if the level
is deasserted (falling edge), so the call could be named "trigger_irq"
(without the level parameter). But, in order to model the line closely
(to be able to support level-triggered interrupt) we need to know whether
the line is low or high, so the proposed interface has been chosen.
However, it is worth mentioning that in case of the level-triggered
interrupt, we should keep injecting the interrupt to the guest until
the line is deasserted (this is not covered by current patch).
Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
This patch introduces a helper the main purpose of which is to check
if a domain is using IOREQ server(s).
On Arm the current benefit is to avoid calling vcpu_ioreq_handle_completion()
(which implies iterating over all possible IOREQ servers anyway)
on every return in leave_hypervisor_to_guest() if there are no active
servers for the particular domain.
Also this helper will be used by one of the subsequent patches on Arm.
xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
This patch implements reference counting of foreign entries in
set_foreign_p2m_entry() on Arm. This is mandatory if we want to run
an emulator (IOREQ server) in a domain other than dom0, as we can't
trust it to do the right thing if it is not running in dom0. So we
need to grab a reference on the page to avoid it disappearing.
It is valid to always pass "p2m_map_foreign_rw" type to
guest_physmap_add_entry() since the current and foreign domains
would be always different. A case when they are equal would be
rejected by rcu_lock_remote_domain_by_id(). Besides the similar
comment in the code put a respective ASSERT() to catch incorrect
usage in future.
It was tested with IOREQ feature to confirm that all the pages given
to this function belong to a domain, so we can use the same approach
as for XENMAPSPACE_gmfn_foreign handling in xenmem_add_to_physmap_one().
This involves adding an extra parameter for the foreign domain to
set_foreign_p2m_entry() and a helper to indicate whether the arch
supports the reference counting of foreign entries and the restriction
for the hardware domain in the common code can be skipped for it.
xen/arm: Call vcpu_ioreq_handle_completion() in check_for_vcpu_work()
This patch adds remaining bits needed for the IOREQ support on Arm.
Besides just calling vcpu_ioreq_handle_completion() we need to handle
its return value to make sure that all the vCPU work is done before
we return to the guest (vcpu_ioreq_handle_completion() may return
false if there is vCPU work to do or the IOREQ state is invalid).
For that reason we use an unbounded loop in leave_hypervisor_to_guest().
The worst that can happen here is that the vCPU will never run again
(the I/O will never complete). But, in Xen's case, if the I/O never
completes then it most likely means that something went horribly
wrong with the Device Emulator. And it is most likely not safe
to continue. So letting the vCPU spin forever if the I/O never
completes is a safer action than letting it continue and leaving
the guest in an unclear state, and is the best we can do for now.
Please note, using this loop we will not spin forever on a pCPU,
preventing any other vCPUs from being scheduled. At every loop
we will call check_for_pcpu_work() that will process pending
softirqs. In case of failure, the guest will crash and the vCPU
will be unscheduled. In normal case, if the rescheduling is necessary
the vCPU will be rescheduled to give place to someone else.
Julien Grall [Fri, 29 Jan 2021 01:48:42 +0000 (03:48 +0200)]
arm/ioreq: Introduce arch specific bits for IOREQ/DM features
This patch adds basic IOREQ/DM support on Arm. The subsequent
patches will improve functionality and add remaining bits.
The IOREQ/DM features are supposed to be built with IOREQ_SERVER
option enabled, which is disabled by default on Arm for now.
Please note, the "PIO handling" TODO is expected to be left unaddressed
for the current series. It is not a big issue for now while Xen
doesn't have support for vPCI on Arm. On Arm64 they are only used
for PCI IO Bar and we would probably want to expose them to emulator
as PIO access to make a DM completely arch-agnostic. So "PIO handling"
should be implemented when we add support for vPCI.
xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
The cmpxchg() in ioreq_send_buffered() operates on memory shared
with the emulator domain (and the target domain if the legacy
interface is used).
In order to be on the safe side we need to switch
to guest_cmpxchg64() to prevent a domain from DoSing Xen on Arm.
The point of using the 64-bit version of the helper is to support Arm32,
since the IOREQ code uses cmpxchg() with a 64-bit value.
As there is no plan to support the legacy interface on Arm,
a page will be mapped in a single domain at a time,
so we can use s->emulator in guest_cmpxchg64() safely.
Thankfully the only user of the legacy interface is x86 so far
and there is no concern regarding the atomic operations.
Please note, that the legacy interface *must* not be used on Arm
without revisiting the code.
xen/ioreq: Remove "hvm" prefixes from involved function names
This patch removes "hvm" prefixes and infixes from IOREQ related
function names in the common code and performs a renaming where
appropriate according to the more consistent new naming scheme:
- IOREQ server functions should start with "ioreq_server_"
- IOREQ functions should start with "ioreq_"
A few function names are clarified to better fit into their purposes:
handle_hvm_io_completion -> vcpu_ioreq_handle_completion
hvm_io_pending -> vcpu_ioreq_pending
hvm_ioreq_init -> ioreq_domain_init
hvm_alloc_ioreq_mfn -> ioreq_server_alloc_mfn
hvm_free_ioreq_mfn -> ioreq_server_free_mfn
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
Julien Grall [Fri, 29 Jan 2021 01:48:39 +0000 (03:48 +0200)]
xen/mm: Make x86's XENMEM_resource_ioreq_server handling common
As x86 implementation of XENMEM_resource_ioreq_server can be
re-used on Arm later on, this patch makes it common and removes
arch_acquire_resource (and the corresponding option) as unneeded.
Also re-order #include-s alphabetically.
This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.
Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
xen/ioreq: Move x86's io_completion/io_req fields to struct vcpu
The IOREQ is a common feature now and these fields will be used
on Arm as is. Move them to common struct vcpu as a part of new
struct vcpu_io and drop duplicating "io" prefixes. Also move
enum hvm_io_completion to xen/sched.h and remove "hvm" prefixes.
This patch completely removes layering violation in the common code.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
Julien Grall [Fri, 29 Jan 2021 01:48:37 +0000 (03:48 +0200)]
xen/ioreq: Make x86's IOREQ related dm-op handling common
As a lot of x86 code can be re-used on Arm later on, this patch
moves the IOREQ related dm-op handling to the common code.
The idea is to have the top level dm-op handling arch-specific
and call into ioreq_server_dm_op() for otherwise unhandled ops.
Pros:
- More natural than doing it the other way around (top level dm-op
handling common).
- Leave compat_dm_op() in x86 code.
Cons:
- Code duplication. Both arches have to duplicate dm_op(), etc.
Make the corresponding functions static and rename them according
to the new naming scheme (including dropping the "hvm" prefixes).
Introduce common dm.c file as a resting place for the do_dm_op()
(which is identical for both Arm and x86) to minimize code duplication.
The common DM feature is supposed to be built with IOREQ_SERVER
option enabled (as well as the IOREQ feature), which is selected
for x86's config HVM for now.
Also update XSM code a bit to let dm-op be used on Arm.
This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.
Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
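The split described in this commit (arch-specific top level, common fallback) can be sketched roughly as below. This is an illustration only, not the Xen code: the op numbers, the struct, the return markers and the function signatures are all hypothetical. Only the names dm_op() and ioreq_server_dm_op() are taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical op numbers; the real XEN_DMOP_* values are not used here. */
enum { OP_ARCH_SPECIFIC = 1, OP_IOREQ_SERVER = 2, OP_UNKNOWN = 99 };

/* Hypothetical markers so a caller can see which layer handled the op. */
enum { HANDLED_BY_ARCH = 1, HANDLED_BY_COMMON = 2 };

struct dm_op_sketch {
    int op;
};

/* Common code: handles the IOREQ server ops, rejects everything else. */
static int ioreq_server_dm_op(const struct dm_op_sketch *op)
{
    assert(op != NULL);

    switch (op->op) {
    case OP_IOREQ_SERVER:
        return HANDLED_BY_COMMON;
    default:
        return -EOPNOTSUPP;
    }
}

/* Arch code: top-level dispatch, falling back to the common handler. */
static int dm_op(const struct dm_op_sketch *op)
{
    assert(op != NULL);

    switch (op->op) {
    case OP_ARCH_SPECIFIC:
        return HANDLED_BY_ARCH;
    default:
        /* Otherwise unhandled ops are passed on to common code. */
        return ioreq_server_dm_op(op);
    }
}
```

With this shape each architecture keeps its own dm_op(), which is the duplication listed under "Cons", while the IOREQ server ops live in one place.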
xen/ioreq: Move x86's ioreq_server to struct domain
The IOREQ is a common feature now and this struct will be used
on Arm as is. Move it to common struct domain. This also
significantly reduces the layering violation in the common code
(*arch.hvm* usage).
We don't move ioreq_gfn since it is not used in the common code
(the "legacy" mechanism is x86 specific).
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
xen/ioreq: Make x86's hvm_ioreq_(page/vcpu/server) structs common
The IOREQ is a common feature now and these structs will be used
on Arm as is. Move them to xen/ioreq.h and remove "hvm" prefixes.
Also there is no need to include public/hvm/dm_op.h by
asm-x86/hvm/domain.h anymore since #define NR_IO_RANGE_TYPES
(which uses XEN_DMOP_IO_RANGE_PCI) gets moved to another location.
Instead include it by 2 places (p2m-pt.c and p2m-ept.c) which
require that header, but don't directly include it so far.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
xen/ioreq: Make x86's hvm_ioreq_needs_completion() common
The IOREQ is a common feature now and this helper will be used
on Arm as is. Move it to xen/ioreq.h and remove "hvm" prefix.
Although PIO handling on Arm is not introduced with the current series
(it will be implemented when we add support for vPCI), technically
the PIOs exist on Arm (however they are accessed the same way as MMIO)
and it would be better not to diverge now.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
As a lot of x86 code can be re-used on Arm later on, this patch
moves previously prepared IOREQ support to the common code
(the code movement is verbatim copy).
The "legacy" mechanism of mapping magic pages for the IOREQ servers
remains x86 specific and not exposed to the common code.
The common IOREQ feature is supposed to be built with IOREQ_SERVER
option enabled, which is selected for x86's config HVM for now.
In order to avoid having a gigantic patch here, the subsequent
patches will update remaining bits in the common code step by step:
- Make IOREQ related structs/materials common
- Drop the "hvm" prefixes and infixes
- Remove layering violation by moving corresponding fields
out of *arch.hvm* or abstracting away accesses to them
Introduce asm/ioreq.h wrapper to be included by common ioreq.h
instead of asm/hvm/ioreq.h to avoid HVM-ism in the common code.
Also include <xen/domain_page.h> which will be needed on Arm
to avoid touching the common code again when introducing Arm specific bits.
This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
x86/ioreq: Provide out-of-line wrapper for the handle_mmio()
The IOREQ is about to become a common feature and Arm will have its own
implementation.
But the name of the function is pretty generic and can be confusing
on Arm (we already have a try_handle_mmio()).
In order not to rename the function (which is used for a varying
set of purposes on x86) globally, and to get a non-confusing variant on Arm,
provide a wrapper arch_ioreq_complete_mmio() to be used in common
and Arm code.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
x86/ioreq: Prepare IOREQ feature for making it common
As a lot of x86 code can be re-used on Arm later on, this
patch makes some preparations to x86/hvm/ioreq.c before moving it
to the common code. This way we will get a verbatim copy
for the code movement in a subsequent patch.
This patch mostly introduces specific hooks to abstract arch
specific materials, taking into account the requirement to leave
the "legacy" mechanism of mapping magic pages for the IOREQ servers
x86 specific and not expose it to the common code.
These hooks are named according to the more consistent new naming
scheme right away (including dropping the "hvm" prefixes and infixes):
- IOREQ server functions should start with "ioreq_server_"
- IOREQ functions should start with "ioreq_"
other functions will be renamed in subsequent patches.
Introduce common ioreq.h right away and put arch hook declarations
there.
Also re-order #include-s alphabetically.
This support is going to be used on Arm to be able to run a device
emulator outside of the Xen hypervisor.
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com> CC: Julien Grall <julien.grall@arm.com>
[On Arm only] Tested-by: Wei Chen <Wei.Chen@arm.com>
Roger Pau Monné [Fri, 29 Jan 2021 16:10:33 +0000 (17:10 +0100)]
x86/pvh: pass module command line to dom0
Both the multiboot and the HVM start info structures allow passing a
string together with a module. Implement the missing support in
pvh_load_kernel so that module strings found in the multiboot
structure are forwarded to dom0.
Fixes: 62ba982424 ('x86: parse Dom0 kernel for PVHv2') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Igor Druzhinin [Fri, 29 Jan 2021 13:18:43 +0000 (14:18 +0100)]
viridian: allow vCPU hotplug for Windows VMs
If Viridian extensions are enabled, Windows wouldn't currently allow
a hotplugged vCPU to be brought up dynamically. We need to expose a special
bit to let the guest know we allow it. Hide it behind an option to stay
on the safe side regarding compatibility with existing guests but
nevertheless set the option on by default.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Igor Druzhinin [Fri, 29 Jan 2021 13:18:01 +0000 (14:18 +0100)]
viridian: remove implicit limit of 64 VPs per partition
TLFS 7.8.1 stipulates that "a virtual processor index must be less than
the maximum number of virtual processors per partition" that "can be obtained
through CPUID leaf 0x40000005". Furthermore, "Requirements for Implementing
the Microsoft Hypervisor Interface" defines that starting from Windows Server
2012, which allowed more than 64 CPUs to be brought up, this leaf can now
contain a value -1 basically assuming the hypervisor has no restriction while
0 (that we currently expose) means the default restriction is still present.
Along with the previous changes exposing ExProcessorMasks this allows a recent
Windows VM with Viridian extension enabled to have more than 64 vCPUs without
going into BSOD in some cases.
Since we didn't expose the leaf before and to keep CPUID data consistent for
incoming streams from previous Xen versions - let's keep it behind an option.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
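How a guest might interpret the value read from CPUID leaf 0x40000005 can be sketched as follows. This is an assumption-laden illustration of the two values named in the commit message (0 and -1), not code from Xen or Windows; the helper names and the macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEFAULT_MAX_VPS   64u         /* the implicit limit this commit lifts */
#define NO_VP_RESTRICTION UINT32_MAX  /* "-1" seen as an unsigned 32-bit value */

/* Hypothetical helper: turn the leaf value into an effective VP limit. */
static uint32_t effective_max_vps(uint32_t leaf_value)
{
    /* 0 means the default restriction is still in place. */
    return leaf_value == 0 ? DEFAULT_MAX_VPS : leaf_value;
}

/* A VP index must be less than the maximum number of VPs per partition. */
static bool vp_index_is_valid(uint32_t vp_index, uint32_t leaf_value)
{
    return vp_index < effective_max_vps(leaf_value);
}

/* Usage examples for the two values discussed in the commit message. */
static void vp_limit_examples(void)
{
    assert(vp_index_is_valid(63, 0));
    assert(!vp_index_is_valid(64, 0));
    assert(vp_index_is_valid(64, NO_VP_RESTRICTION));
    assert(vp_index_is_valid(200, NO_VP_RESTRICTION));
}
```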
Norbert Kamiński [Tue, 12 Jan 2021 20:27:43 +0000 (21:27 +0100)]
x86: Support booting under Secure Startup via SKINIT
For now, this is simply enough logic to let Xen come up after the bootloader
has executed an SKINIT instruction to begin a Secure Startup.
During a Secure Startup, the BSP operates with the GIF clear (blocks all
external interrupts, even SMI/NMI), and INIT_REDIRECTION active (converts INIT
IPIs to #SX exceptions, if e.g. the platform needs to scrub secrets before
resetting). To afford APs the same Secure Startup protections as the BSP, the
INIT IPI must be skipped, and SIPI must be the first interrupt seen.
Full details are available in AMD APM Vol2 15.27 "Secure Startup with SKINIT"
Introduce skinit_enable_intr() and call it from cpu_init(), next to the
enable_nmis() which performs a related function for tboot startups.
Also introduce ap_boot_method to control the sequence of actions for AP boot.
Signed-off-by: Marek Kasiewicz <marek.kasiewicz@3mdeb.com> Signed-off-by: Norbert Kamiński <norbert.kaminski@3mdeb.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
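The effect of ap_boot_method on the AP wakeup sequence can be sketched as below. Only the name ap_boot_method comes from the commit message; the enumerator names, the IPI enum and the helper are hypothetical, and a real wakeup sequence also involves delays and possibly repeated SIPIs, which are left out here.

```c
#include <assert.h>
#include <stddef.h>

enum ipi_sketch { IPI_INIT, IPI_SIPI };

/* Hypothetical enumerators for the boot method. */
enum ap_boot_method { AP_BOOT_NORMAL, AP_BOOT_SKINIT };

/*
 * Fill seq[] with the IPIs to send to an AP and return how many were written.
 * Under Secure Startup the INIT IPI is skipped, so that SIPI is the first
 * interrupt the AP sees.
 */
static size_t ap_wakeup_sequence(enum ap_boot_method method,
                                 enum ipi_sketch *seq, size_t max)
{
    size_t n = 0;

    assert(seq != NULL && max >= 2);

    if (method != AP_BOOT_SKINIT)
        seq[n++] = IPI_INIT;
    seq[n++] = IPI_SIPI;

    return n;
}
```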
Jan Beulich [Fri, 29 Jan 2021 10:36:54 +0000 (11:36 +0100)]
x86/HVM: re-order error path of hvm_domain_initialise()
hvm_destroy_all_ioreq_servers(), called from
hvm_domain_relinquish_resources(), invokes relocate_portio_handler(),
which uses d->arch.hvm.io_handler. Defer freeing of this array
accordingly on the error path of hvm_domain_initialise().
Similarly rtc_deinit() requires d->arch.hvm.pl_time to still be around,
or else an armed timer structure would get freed, and that timer would
never get killed.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 29 Jan 2021 10:34:37 +0000 (11:34 +0100)]
memory: bail from page scrubbing when CPU is no longer online
Scrubbing can significantly delay the offlining (parking) of a CPU (e.g.
because of booting in smt=0 mode), to a degree that the "CPU <n>
still not dead..." messages logged on x86 in 1s intervals can be seen
multiple times. There are no softirqs involved in this process, so
extend the existing preemption check in the scrubbing logic to also exit
when the CPU is no longer observed online.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
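The extended exit condition can be sketched with a small simulation. None of these names are Xen's: the struct, its fields and both functions are hypothetical, and the offlining request is simulated by a counter so that the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cpu_sketch {
    bool softirq_pending;     /* the pre-existing preemption condition */
    bool online;              /* the newly checked condition */
    unsigned int park_after;  /* simulation: go offline after this many pages */
    unsigned int scrubbed;
};

/* Simulated page scrub; also simulates the CPU being taken offline. */
static void scrub_one_page(struct cpu_sketch *cpu)
{
    cpu->scrubbed++;
    if (cpu->scrubbed >= cpu->park_after)
        cpu->online = false;
}

/* Scrub up to nr pages, bailing out early; returns the number scrubbed. */
static unsigned int scrub_pages(struct cpu_sketch *cpu, unsigned int nr)
{
    unsigned int done = 0;

    assert(cpu != NULL);

    while (done < nr) {
        scrub_one_page(cpu);
        done++;

        /* Exit on pending work, and also once the CPU is no longer online. */
        if (cpu->softirq_pending || !cpu->online)
            break;
    }

    return done;
}
```

Without the second half of the condition, a CPU being parked would keep scrubbing until the batch ran out, since no softirq is raised in that path.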
Roger Pau Monné [Fri, 29 Jan 2021 08:09:05 +0000 (09:09 +0100)]
libs/foreignmemory: fix MiniOS build
Keep the dummy handlers for restrict, map_resource and unmap_resource
for MiniOS, or else the build breaks with:
ld: /home/osstest/build.158759.build-amd64/xen/stubdom/mini-os-x86_64-xenstore/mini-os.o: in function `xenforeignmemory_restrict':
/home/osstest/build.158759.build-amd64/xen/stubdom/libs-x86_64/foreignmemory/core.c:137: undefined reference to `osdep_xenforeignmemory_restrict'
ld: /home/osstest/build.158759.build-amd64/xen/stubdom/mini-os-x86_64-xenstore/mini-os.o: in function `xenforeignmemory_map_resource':
/home/osstest/build.158759.build-amd64/xen/stubdom/libs-x86_64/foreignmemory/core.c:171: undefined reference to `osdep_xenforeignmemory_map_resource'
ld: /home/osstest/build.158759.build-amd64/xen/stubdom/mini-os-x86_64-xenstore/mini-os.o: in function `xenforeignmemory_unmap_resource':
/home/osstest/build.158759.build-amd64/xen/stubdom/libs-x86_64/foreignmemory/core.c:185: undefined reference to `osdep_xenforeignmemory_unmap_resource'
ld: /home/osstest/build.158759.build-amd64/xen/stubdom/mini-os-x86_64-xenstore/mini-os.o: in function `xenforeignmemory_resource_size':
/home/osstest/build.158759.build-amd64/xen/stubdom/libs-x86_64/foreignmemory/core.c:200: undefined reference to `osdep_xenforeignmemory_map_resource'
Fixes: 2b4b33ffe7d67 ('libs/foreignmemory: Implement on NetBSD') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
A recent thread [1] has exposed a couple of issues with our current way
of handling EXPERT.
1) It is not obvious that "Configure standard Xen features (expert
users)" is actually the famous EXPERT we keep talking about on xen-devel
2) It is not obvious when we need to enable EXPERT to get a specific
feature
In particular if you want to enable ACPI support so that you can boot
Xen on an ACPI platform, you have to enable EXPERT first. But searching
through the kconfig menu it is really not clear (type '/' and "ACPI"):
nothing in the description tells you that you need to enable EXPERT to
get the option.
So this patch makes things easier by doing three things:
- introduce a new kconfig option UNSUPPORTED which is clearly to enable
UNSUPPORTED features as defined by SUPPORT.md
- change EXPERT options to UNSUPPORTED where it makes sense: keep
depending on EXPERT for features made for experts
- tag unsupported features by adding (UNSUPPORTED) to the one-line
description
Andrew Cooper [Wed, 27 Jan 2021 19:43:32 +0000 (19:43 +0000)]
x86/boot: Drop 'noapic' suggestion from check_timer()
In practice, there is no such thing as a real 64bit system without
APICs. (PVH style virtual environments, sure, but they don't end up here).
The suggestion to try and use noapic only makes a bad situation worse.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 25 Nov 2020 13:22:08 +0000 (13:22 +0000)]
xen-release-management doc: More info on schedule
This documents our practice, established in 2018
https://lists.xen.org/archives/html/xen-devel/2018-07/msg02240.html
et seq
CC: Jürgen Groß <jgross@suse.com> CC: Paul Durrant <xadimgnik@gmail.com> CC: Wei Liu <wl@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Ian Jackson <iwj@xenproject.org>
Manuel Bouyer [Tue, 12 Jan 2021 18:12:21 +0000 (19:12 +0100)]
Fix error: array subscript has type 'char'
Use unsigned char variable, or cast to (unsigned char), for
tolower()/islower() and friends. Fix compiler error
array subscript has type 'char' [-Werror=char-subscripts]
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
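A minimal example of the cast, assuming a hypothetical helper str_tolower() that is not part of the patch. Plain char may be signed, and passing a negative value other than EOF to tolower() is undefined behaviour, so the argument is converted to unsigned char first; this also silences -Wchar-subscripts on libcs that implement the ctype functions as array lookups.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Lower-case a NUL-terminated string in place. */
static void str_tolower(char *s)
{
    assert(s != NULL);

    for (; *s != '\0'; s++)
        *s = (char)tolower((unsigned char)*s);
}
```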
Andrew Cooper [Thu, 23 Jul 2020 14:58:48 +0000 (15:58 +0100)]
tools/foreignmem: Support querying the size of a resource
With the Xen side of this interface (soon to be) fixed to return real sizes,
userspace needs to be able to make the query.
Introduce xenforeignmemory_resource_size() for the purpose, bumping the
library minor version.
Update all osdep_xenforeignmemory_map_resource() implementations to
understand size requests, skip the mmap() operation, and copy back the
nr_frames field.
For NetBSD, also fix up the ioctl() error path to issue an unmap(), which was
overlooked by c/s 4a64e2bb39 "libs/foreignmemory: Implement on NetBSD".
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Acked-by: Wei Liu <wl@xen.org>
Andrew Cooper [Mon, 26 Oct 2020 15:32:12 +0000 (15:32 +0000)]
x86/ucode: Introduce ucode=allow-same for testing purposes
Many CPUs will actually reload microcode when offered the same version as
currently loaded. This allows for easy testing of the late microcode loading
path.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 26 Oct 2020 15:27:35 +0000 (15:27 +0000)]
x86/ucode/intel: Fix handling of microcode revision
For Intel microcode blobs, the revision field is signed (as documented in the
SDM) and negative revisions are used for pre-production/test microcode (not
documented publicly anywhere I can spot).
Adjust the revision checking to match the algorithm presented here:
This treats pre-production microcode as always applicable, but also production
microcode as having higher precedence than pre-production. It is expected that
anyone using pre-production microcode knows what they are doing.
This is necessary to load production microcode on an SDP with pre-production
microcode embedded in firmware.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
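One possible reading of the rules stated above, as a sketch. The algorithm the commit message points to is not reproduced in this log, so the function below is an assumption built only from the prose: revisions are compared as signed values, and a negative (pre-production) revision is always accepted. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum ucode_match { OLD_UCODE, SAME_UCODE, NEW_UCODE };

/* Compare the loaded revision against a candidate, both as signed values. */
static enum ucode_match compare_revisions(int32_t old_rev, int32_t new_rev)
{
    if (new_rev > old_rev)
        return NEW_UCODE;   /* includes production (>0) over pre-production (<0) */

    if (new_rev == old_rev)
        return SAME_UCODE;

    if (new_rev < 0)
        return NEW_UCODE;   /* pre-production is treated as always applicable */

    return OLD_UCODE;
}

/* Usage examples covering each rule. */
static void compare_revisions_examples(void)
{
    assert(compare_revisions(0x20, 0x21) == NEW_UCODE);
    assert(compare_revisions(0x21, 0x21) == SAME_UCODE);
    assert(compare_revisions(0x21, 0x20) == OLD_UCODE);
    assert(compare_revisions(-5, 0x20) == NEW_UCODE);
    assert(compare_revisions(0x20, -5) == NEW_UCODE);
}
```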
Andrew Cooper [Thu, 6 Aug 2020 12:00:07 +0000 (13:00 +0100)]
x86/timer: Fix boot on Intel systems using ITSSPRC static PIT clock gating
Recent Intel client devices have disabled the legacy PIT for powersaving
reasons, breaking compatibility with a traditional IBM PC. Xen depends on a
legacy timer interrupt to check that the IO-APIC/PIC routing is configured
correctly, and fails to boot with:
(XEN) *******************************
(XEN) Panic on CPU 0:
(XEN) IO-APIC + timer doesn't work! Boot with apic_verbosity=debug and send report. Then try booting with the `noapic` option
(XEN) *******************************
While this setting can be undone by Xen, the details of how to do so differ by
chipset, and doing so would be very short-sighted for battery-powered devices. See bit 2
"8254 Static Clock Gating Enable" in:
All impacted systems have an HPET, but there is no indication of the absence
of PIT functionality, nor a suitable way to probe for its absence. As a short
term fix, reconfigure the HPET into legacy replacement mode. A better
longterm fix would be to avoid the reliance on the timer interrupt entirely.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Wed, 27 Jan 2021 16:08:32 +0000 (17:08 +0100)]
xenstored: fix build on libc without O_CLOEXEC
The call to lu_read_state() would remain unresolved in this case. Frame
the construct by a suitable #ifdef, and while at it also frame command
line handling related pieces similarly.
Fixes: 9777fa6b6ea0 ("tools/xenstore: evaluate the live update flag when starting") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Wed, 27 Jan 2021 16:08:14 +0000 (17:08 +0100)]
libxlutil: avoid almost-undefined behavior
While only value computations of an object are disallowed in the
presence of another unsequenced side effect, at least gcc 4.3 looks to
extend this to taking the object's address. The resulting warning causes
the build to fail, because of -Werror.
While there also correct an adjacent comment.
Fixes: bdc0799fe26a ("libxlu: introduce xlu_pci_parse_spec_string()") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Wed, 27 Jan 2021 16:07:57 +0000 (17:07 +0100)]
libxenguest: drop now unused le32_to_cpup() from lz4 decompression
While gcc doesn't warn about this because of it being static inline,
clang does, causing the build to fail there because of -Werror.
Fixes: d8099d94dfaa ("libxenguest: add get_unaligned_le32()") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Wed, 27 Jan 2021 07:47:13 +0000 (08:47 +0100)]
x86/PV: use 64-bit subtract to adjust guest RIP upon missing SYSCALL callbacks
When discussing the shrunk down version of the commit in question it
was said (in reply to my conditional choosing of the width):
"However, the 32bit case isn't actually interesting here. A
guest can't execute a SYSCALL instruction on/across the 4G->0 boundary
because the M2P is mapped NX up to the 4G boundary, so we can never
reach this point with %eip < 2.
Therefore, the 64bit-only form is the appropriate one to use, which
solves any question of cleverness, or potential decode stalls it
causes."
Fixes: ca6fcf4321b3 ("x86/pv: Inject #UD for missing SYSCALL callbacks") Signed-off-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
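The adjustment itself is small. The sketch below assumes a hypothetical register structure and helper name; it relies only on SYSCALL being a two-byte instruction (0F 05), so that after its execution the saved instruction pointer has to be moved back by 2 for #UD to be reported against the SYSCALL itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SYSCALL_INSN_LEN 2  /* SYSCALL encodes as 0F 05 */

/* Hypothetical stand-in for the saved guest register state. */
struct regs_sketch {
    uint64_t rip;
};

/* Point the saved RIP back at the SYSCALL, using a full 64-bit subtract. */
static void rewind_over_syscall(struct regs_sketch *regs)
{
    assert(regs != NULL);
    regs->rip -= SYSCALL_INSN_LEN;
}
```

Per the quoted discussion, a 32-bit-only variant would matter only for %eip < 2, which a guest cannot reach, hence the single 64-bit form.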
Plain MSI doesn't allow caching the MSI address and data fields while
the capability is enabled and not masked, hence we need to allow any
changes to those fields to update the binding of the interrupt. For
reference, the same doesn't apply to MSI-X that is allowed to cache
the data and address fields while the entry is unmasked, see section
6.8.3.5 of the PCI Local Bus Specification 3.0.
Allowing such updates means that a guest can write an invalid address
(ie: all zeros) and then a valid one, so the PIRQs shouldn't be
unmapped when the interrupt cannot be bound to the guest, since
further updates to the address or data fields can result in the
binding succeeding.
Modify the vPCI MSI arch helpers to track whether the interrupt is
bound, and make failures in vpci_msi_update not unmap the PIRQ, so
that further calls can attempt to bind the PIRQ again.
Note this requires some modifications to the MSI-X handlers, but there
shouldn't be any functional changes in that area.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
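The state tracking described here can be sketched as follows. This is not the vPCI code: the struct, the INVALID_PIRQ marker, bind_pirq() and msi_update() are hypothetical stand-ins, and the binding step simply fails for an all-zeros address to mimic the example in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define INVALID_PIRQ (-1)

struct msi_sketch {
    int pirq;    /* stays mapped across failed updates */
    bool bound;  /* whether the interrupt is currently bound to the guest */
};

/* Stand-in for the binding step: an all-zeros address cannot be bound. */
static int bind_pirq(uint64_t address)
{
    return address != 0 ? 0 : -EINVAL;
}

/* Re-bind after the guest changed the address (or data) field. */
static int msi_update(struct msi_sketch *msi, uint64_t address)
{
    int rc;

    assert(msi != NULL && msi->pirq != INVALID_PIRQ);

    msi->bound = false;      /* any previous binding is dropped first */

    rc = bind_pirq(address);
    if (rc)
        return rc;           /* PIRQ is left mapped so a later write can retry */

    msi->bound = true;
    return 0;
}
```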
Jan Beulich [Tue, 26 Jan 2021 16:42:56 +0000 (17:42 +0100)]
tools/libs: honor build dependencies for recently moved subdirs
While the lack of proper dependency tracking of #include-d files is
wider than just the libs/ subtree, dealing with the problem universally
there or in tools/Rules.mk is too much of a risk at this point in the
release cycle. Add the missing inclusion of $(DEPS_INCLUDE) only in the
specific Makefile-s, after having checked that their prior Makefile-s
had such includes.
Interestingly the $(DEPS_RM) use is present in tools/libs/libs.mk's
clean target, so doesn't need taking care of in individual Makefile-s.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wl@xen.org> Release-acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Tue, 26 Jan 2021 13:42:23 +0000 (14:42 +0100)]
xen/include: compat/xlat.h may change with .config changes
$(xlat-y) getting derived from $(headers-y) means its contents may
change with changes to .config. The individual files $(xlat-y) refers
to, otoh, may not change, and hence not trigger rebuilding of xlat.h.
(Note that the issue was already present before the commit referred to
below, but it was far more limited in affecting only changes to
CONFIG_XSM_FLASK.)
Fixes: 2c8fabb2232d ("x86: only generate compat headers actually needed") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Add a DOMPRINTF() other methods have, indicating success. To facilitate
this, introduce an "outsize" local variable and update *size as well as
*blob only once done. The latter then also avoids leaving a pointer to
freed memory in dom->kernel_blob in case of a decompression error.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wl@xen.org> Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
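The "update the outputs only once done" pattern can be shown with a toy decoder. Everything here is hypothetical and unrelated to the real decompressors: the input format is one length byte followed by (count, byte) pairs, chosen only so that an error can occur after the output buffer has been allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Decode *blob (of *size bytes) into a freshly allocated buffer.
 * On success *blob and *size describe the output, which the caller frees.
 * On failure both are left untouched, so no pointer to freed memory escapes.
 */
static int toy_decompress(const void **blob, size_t *size)
{
    const unsigned char *in;
    unsigned char *out;
    size_t insize, outsize, pos = 0, i;

    assert(blob != NULL && size != NULL && *blob != NULL);

    in = *blob;
    insize = *size;
    if (insize < 1 || (insize - 1) % 2 != 0)
        return -1;

    outsize = in[0];                     /* local, not yet visible to the caller */
    out = malloc(outsize ? outsize : 1);
    if (out == NULL)
        return -1;

    for (i = 1; i < insize; i += 2) {
        size_t run = in[i];

        if (run > outsize - pos) {       /* corrupt stream: would overrun */
            free(out);
            return -1;
        }
        memset(out + pos, in[i + 1], run);
        pos += run;
    }

    if (pos != outsize) {                /* corrupt stream: too short */
        free(out);
        return -1;
    }

    /* Only now publish the results. */
    *blob = out;
    *size = outsize;
    return 0;
}
```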
Jan Beulich [Tue, 26 Jan 2021 13:16:34 +0000 (14:16 +0100)]
libxenguest: support zstd compressed kernels
This follows the logic used for other decompression methods utilizing an
external library, albeit here we can't ignore the 32-bit size field
appended to the compressed image - its presence causes decompression to
fail. Leverage the field instead to allocate the output buffer in one
go, i.e. without incrementally realloc()ing.
As far as configure.ac goes, I'm pretty sure there is a better (more
"standard") way of using PKG_CHECK_MODULES(). The construct also gets
put next to the other decompression library checks, albeit I think they
all ought to be x86-specific (e.g. placed in the existing case block a
few lines down).
Note that, where possible, instead of #ifdef-ing xen/*.h inclusions,
they get removed.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wl@xen.org> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Tue, 26 Jan 2021 13:14:39 +0000 (14:14 +0100)]
libxenguest: add get_unaligned_le32()
Abstract xc_dom_check_gzip()'s reading of the uncompressed size into a
helper re-usable, in particular, by other decompressor code.
Sadly in the mini-os case this conflicts with other functions of the
same name (and purpose), which can't be easily replaced individually.
Yet it was requested that no full set of helpers be introduced at this
point in the release cycle. Hence the awkward XG_NEED_UNALIGNED.
Requested-by: Ian Jackson <iwj@xenproject.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
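A byte-wise helper of this kind might look like the sketch below. It reuses the name get_unaligned_le32() from the commit subject but is not claimed to match the actual implementation; gzip_trailing_size() is a hypothetical caller, relying on the gzip format storing the uncompressed size (modulo 2^32) in the last four bytes, little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a possibly unaligned address. */
static inline uint32_t get_unaligned_le32(const void *p)
{
    const uint8_t *b = p;

    return (uint32_t)b[0] |
           ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) |
           ((uint32_t)b[3] << 24);
}

/* Hypothetical caller: fetch the size field appended to a gzip image. */
static uint32_t gzip_trailing_size(const void *image, size_t len)
{
    assert(image != NULL && len >= 4);
    return get_unaligned_le32((const uint8_t *)image + len - 4);
}
```

Assembling the value from individual bytes avoids both misaligned loads and any dependence on host endianness.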