Jan Beulich [Thu, 22 Feb 2024 11:15:20 +0000 (12:15 +0100)]
IRQ: drop regs parameter from handler functions
It's simply not needed anymore. Note how Linux made this change many
years ago already, in 2.6.19 (late 2006, see [1]).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Julien Grall <jgrall@amazon.com>
[1] https://git.kernel.org/torvalds/c/7d12e780e003f93433d49ce78cfedf4b4c52adc5
Jan Beulich [Thu, 22 Feb 2024 11:11:47 +0000 (12:11 +0100)]
keyhandler: drop regs parameter from handle_keyregs()
In preparation for further removal of regs parameters, drop it here. In
the two places where it's actually needed, retrieve IRQ context if
available, or else guest context.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Thu, 22 Feb 2024 11:10:38 +0000 (12:10 +0100)]
serial: fake IRQ-regs context in poll handlers
In preparation for dropping the register parameters from
serial_[rt]x_interrupt() and in turn from IRQ handler functions,
register state needs to be made available another way for the few key
handlers which need it. Fake IRQ-like state.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Thu, 22 Feb 2024 10:54:32 +0000 (11:54 +0100)]
x86emul: make run32 test harness goal work again
When re-working library call wrapping the sed invocation didn't account
for all sources living in the parent directory when building the 32-bit
harness binary.
Fixes: 6fba45ca3be1 ("x86emul: rework wrapping of libc functions in test and fuzzing harnesses") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 22 Feb 2024 10:54:07 +0000 (11:54 +0100)]
x86emul: add missing EVEX.R' checks
EVEX.R' is not ignored in 64-bit code when encoding a GPR or mask
register. While for mask registers suitable checks are in place (there
also covering EVEX.R), they were missing for the few cases where in
EVEX-encoded instructions ModR/M.reg encodes a GPR. While for VPEXTRW
the bit is replaced before an emulation stub is invoked, for
VCVT{,T}{S,D,H}2{,U}SI this actually would have led to #UD from inside
an emulation stub, in turn raising #UD to the guest, but accompanied by
log messages indicating something's wrong in Xen nevertheless.
Fixes: 001bd91ad864 ("x86emul: support AVX512{F,BW,DQ} extract insns") Fixes: baf4a376f550 ("x86emul: support AVX512F legacy-equivalent scalar int/FP conversion insns") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
x86/uaccess: add attribute noreturn to __{get,put}_user_bad()
__get_user_bad() and __put_user_bad() are undefined symbols used
to assert the unreachability of a program point:
a call to one of such functions is optimized away if it is considered
unreachable by the compiler. Otherwise, a linker error is reported.
In accordance with the purpose of such constructs:
1) add the attribute noreturn to __get_user_bad() and __put_user_bad();
2) change return type of __get_user_bad() to void (returning long is a
leftover from the past).
Point (1) meets the requirements to deviate MISRA C:2012 Rule 16.3
("An unconditional break statement shall terminate every switch
clause") since functions with noreturn attribute are considered
as allowed terminals for switch clauses.
Point (2) addresses several violations of MISRA C:2012 Rule 17.7
("The value returned by a function having non-void return type
shall be used").
While there also zap "extern".
No functional change.
Signed-off-by: Federico Serafini <federico.serafini@bugseng.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
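To illustrate point (1) above, here is a minimal sketch of a size-dispatching switch whose default clause ends in a noreturn call rather than a break. The names load_bad_size() and load_sized() are hypothetical and not taken from Xen. Unlike __get_user_bad(), which is left undefined so that a reachable call fails at link time, the stand-in here is defined (and aborts) purely so the sketch links and runs.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for __get_user_bad(). It is defined here only so
 * the sketch is runnable; the real symbol is intentionally undefined.
 */
static void __attribute__((noreturn)) load_bad_size(void)
{
    abort();
}

/* Load a 1-, 2-, 4- or 8-byte value from src, widened to 64 bits. */
static uint64_t load_sized(const void *src, unsigned int size)
{
    uint64_t val = 0;

    switch ( size )
    {
    case 1: { uint8_t  v; memcpy(&v, src, sizeof(v)); val = v; break; }
    case 2: { uint16_t v; memcpy(&v, src, sizeof(v)); val = v; break; }
    case 4: { uint32_t v; memcpy(&v, src, sizeof(v)); val = v; break; }
    case 8: { uint64_t v; memcpy(&v, src, sizeof(v)); val = v; break; }
    default:
        /* The call never returns, so this clause needs no break. */
        load_bad_size();
    }

    return val;
}
```

Because the function returns void, there is also no return value left for callers to ignore, which is the aspect point (2) refers to.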
Jan Beulich [Thu, 22 Feb 2024 10:52:47 +0000 (11:52 +0100)]
build: make sure build fails when running kconfig fails
Because of using "-include", failure to (re)build auto.conf (with
auto.conf.cmd produced as a secondary target) won't stop make from
continuing the build. Arrange for it being possible to drop the - from
Rules.mk, requiring that the include be skipped for tools-only targets.
Note that relying on the inclusion in those cases wouldn't be correct
anyway, as it might be a stale file (yet to be rebuilt) which would be
included, while during initial build, the file would be absent
altogether.
Fixes: 8d4c17a90b0a ("xen/build: silence make warnings about missing auto.conf*") Reported-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Frediano Ziglio [Thu, 22 Feb 2024 10:51:19 +0000 (11:51 +0100)]
Constify some parameters
Make clear they are not changed in the functions.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com> # XSM Acked-by: George Dunlap <george.dunlap@cloud.com> # sched
Jan Beulich [Thu, 22 Feb 2024 10:49:10 +0000 (11:49 +0100)]
gnttab: fully ignore zero-size copy requests
In line with observations made in the context of XSA-448, no field in
struct gnttab_copy_ptr is relevant when no data is to be copied, much
like e.g. the pointers passed to memcpy() are irrelevant (and would
never be "validated") when the passed length is zero.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
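The memcpy() analogy can be sketched as a copy helper that returns before looking at its pointers when the length is zero. This is only an illustration of the principle; copy_checked() is a hypothetical name and the code is not the grant-table implementation.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy len bytes from src to dst.
 * Returns 0 on success, -1 if a pointer is unusable.
 */
static int copy_checked(void *dst, const void *src, size_t len)
{
    /* Nothing to copy: succeed without inspecting either pointer. */
    if ( len == 0 )
        return 0;

    /* Validation only happens once there is data to move. */
    if ( dst == NULL || src == NULL )
        return -1;

    memcpy(dst, src, len);
    return 0;
}
```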
libxl: Disable relocating memory for qemu-xen in stubdomain too
According to comments (and experiments) qemu-xen cannot handle memory
relocation done by hvmloader. The code was already disabled when running
qemu-xen in dom0 (see libxl__spawn_local_dm()), but it was missed when
adding qemu-xen support to stubdomain. Adjust libxl__spawn_stub_dm() to
be consistent in this regard.
Reported-by: Neowutran <xen@neowutran.ovh> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Michal Orzel [Thu, 15 Feb 2024 14:39:47 +0000 (15:39 +0100)]
xen/arm: Make hwdom vUART optional feature
At the moment, the hardware domain vUART is always compiled in. In the
spirit of fine granular configuration, make it optional so that the
feature can be disabled if not needed. This UART is not exposed (e.g.
via device tree) to a domain and is mostly used to support special use
cases like Linux early printk, prints from the decompressor code, etc.
Introduce Kconfig option CONFIG_HWDOM_VUART, enabled by default (to keep
the current behavior) and use it to protect the vUART related code.
Provide stubs for domain_vuart_{init,free}() in case the feature is
disabled. Take the opportunity to add a struct domain forward declaration
to vuart.h, so that the header is self contained.
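The stub arrangement described above commonly looks like the following sketch. The function names and the Kconfig symbol come from the commit message; the stub bodies, and in particular the assumption that the init stub reports success, are illustrative guesses rather than the actual contents of vuart.h.

```c
#include <stddef.h>

/* Forward declaration, so the header does not depend on other includes. */
struct domain;

#ifdef CONFIG_HWDOM_VUART

/* Feature enabled: the real implementations live in the vUART code. */
int domain_vuart_init(struct domain *d);
void domain_vuart_free(struct domain *d);

#else /* !CONFIG_HWDOM_VUART */

/* Assumption: the stub returns success so callers need no #ifdef. */
static inline int domain_vuart_init(struct domain *d)
{
    (void)d;
    return 0;
}

static inline void domain_vuart_free(struct domain *d)
{
    (void)d;
}

#endif /* CONFIG_HWDOM_VUART */
```

With this pattern, call sites stay identical whether or not the option is enabled.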
Oleksii Kurochko [Tue, 20 Feb 2024 11:23:50 +0000 (12:23 +0100)]
xen/asm-generic: fold struct devarch into struct dev
The 'struct dev_archdata' is exclusively used within 'struct device',
so it can be merged into 'struct device'.
After the merger, it is necessary to update the 'dev_archdata()'
macros and the comments above 'struct arm_smmu_xen_device' in
drivers/passthrough/arm/smmu.c.
Additionally, it is required to update instances of
"dev->archdata->iommu" to "dev->iommu".
Oleksii Kurochko [Tue, 20 Feb 2024 11:23:00 +0000 (12:23 +0100)]
xen/arm: switch Arm to use asm-generic/device.h
The following changes were done as a result of switching to
asm-generic/device.h:
* DEVICE_GIC was renamed to DEVICE_INTERRUPT_CONTROLLER according
to definition of enum device_class in asm-generic/device.h.
* ACPI-related things in Arm code were guarded by #ifdef CONFIG_ACPI,
as struct acpi_device_desc is guarded in asm-generic; the function
acpi_device_init() was guarded too, as it uses struct
acpi_device_desc.
* drop arm/include/asm/device.h and update arm/include/asm/Makefile
to use asm-generic/device.h instead.
As 'struct device_desc' is protected by CONFIG_HAS_DEVICE_TREE,
_sdevice, _edevice, device_init(), and device_get_class should also be
protected.
However, this protection was not implemented because Arm always has
CONFIG_HAS_DEVICE_TREE=y at the moment.
Oleksii Kurochko [Tue, 20 Feb 2024 11:21:38 +0000 (12:21 +0100)]
xen/asm-generic: introduce generic device.h
Arm, PPC and RISC-V introduce the same things in asm/device.h, so
generic device.h was introduced.
Arm's device.h was taken as a base with the following changes:
- #ifdef ACPI related things.
- Rename #ifdef guards.
- Add SPDX tag.
- #ifdef CONFIG_HAS_DEVICE_TREE related things.
- #ifdef-ing iommu related things with CONFIG_HAS_PASSTHROUGH.
Frediano Ziglio [Mon, 19 Feb 2024 11:46:21 +0000 (12:46 +0100)]
x86: Reduce assembly code size of entry points
On many entry points we push an 8-byte zero, and the exception constants
are small, so we can just write a single byte, saving 3 bytes per
instruction.
With ENDBR64 this reduces the size of many entry points from 32 to
16 bytes (due to alignment).
The push and the mov are overlapping stores either way. Swapping
between movl and movb will make no difference at all on performance.
Similar code is already used in autogen_stubs.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
libxl: Add "grant_usage" parameter for virtio disk devices
Allow administrators to control whether Xen grant mappings for
the virtio disk devices should be used. By default (when new
parameter is not specified), the existing behavior is retained
(we enable grants if backend-domid != 0).
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Anthony PERARD [Mon, 19 Feb 2024 11:45:48 +0000 (12:45 +0100)]
build: Replace `which` with `command -v`
The `which` command is not standard, may not exist on the build host,
or may not behave as expected by the build system. It is recommended
to use `command -v` to find out whether a command exists and to obtain
its path; it is part of the POSIX shell standard (at least, it seems to
be mandatory since IEEE Std 1003.1-2008, but was optional before).
Fixes: c8a8645f1efe ("xen/build: Automatically locate a suitable python interpreter") Fixes: 3b47bcdb6d38 ("xen/build: Use a distro version of figlet") Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Mon, 19 Feb 2024 11:44:50 +0000 (12:44 +0100)]
mm: add the __must_check attribute to {gfn,mfn,dfn}_add()
It's not obvious from just the function name whether the incremented value will
be stored in the parameter, or returned to the caller. That has led to bugs
in the past, as callers may assume the incremented value is stored in the
parameter.
Add the __must_check attribute to the function to easily spot callers that
don't consume the returned value, which signals an error in the caller logic.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
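A minimal sketch of the idea, assuming a GCC/Clang style warn_unused_result attribute: MUST_CHECK is a hypothetical stand-in for Xen's __must_check, and the mfn_t and mfn_add() definitions are simplified for illustration, not copied from the tree.

```c
/* Hypothetical stand-in for __must_check (GCC/Clang attribute). */
#define MUST_CHECK __attribute__((warn_unused_result))

/* Simplified typesafe frame number wrapper. */
typedef struct { unsigned long m; } mfn_t;

/* Returns the incremented value; the argument itself is not modified. */
static inline MUST_CHECK mfn_t mfn_add(mfn_t mfn, unsigned long i)
{
    return (mfn_t){ mfn.m + i };
}

/* Count the frames in the half-open range [start, end). */
static unsigned long count_frames(mfn_t start, mfn_t end)
{
    unsigned long n = 0;
    mfn_t cur;

    /*
     * The result has to be assigned back. Writing just mfn_add(cur, 1)
     * as the increment would leave cur unchanged and loop forever; with
     * the attribute the compiler warns about the discarded result.
     */
    for ( cur = start; cur.m < end.m; cur = mfn_add(cur, 1) )
        n++;

    return n;
}
```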
Julien Grall [Tue, 16 Jan 2024 19:25:58 +0000 (19:25 +0000)]
xen/arm: fixmap: Rename the fixmap slots to follow the x86 convention
At the moment the fixmap slots are prefixed differently between arm and
x86.
Some of them (e.g. the PMAP slots) are used in common code. So it would
be better if they are named the same way to avoid having to create
aliases.
I have decided to use the x86 naming because it requires fewer changes. So
all the Arm fixmap slots will now be prefixed with FIX rather than
FIXMAP.
Signed-off-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Elias El Yandouzi <eliasely@amazon.com> Reviewed-by: Henry Wang <Henry.Wang@arm.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Juergen Gross [Thu, 15 Feb 2024 13:04:52 +0000 (14:04 +0100)]
tools/xen-9pfsd: add 9pfs response generation support
Add support for generating a 9pfs protocol response via a format based
approach.
Strings are stored in a per device string buffer and they are
referenced via their offset in this buffer. This avoids having to
dynamically allocate memory for each single string.
As a first user of the response handling add a generic p9_error()
function which will be used to return any error to the client.
Add all format parsing variants in order to avoid additional code churn
later when adding the users of those variants. Prepare a special case
for the "read" case already (format character 'D'): in order to avoid
adding another buffer for read data support doing the read I/O directly
into the response buffer.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
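The string handling described above (one buffer per device, strings referenced by offset) can be sketched as follows. The structure, the fixed buffer size and the function names are hypothetical and chosen for illustration only; they are not the xen-9pfsd data structures.

```c
#include <stddef.h>
#include <string.h>

#define STR_BUF_SIZE 256   /* hypothetical fixed capacity */

/* One such buffer would exist per device. */
struct str_buf {
    char data[STR_BUF_SIZE];
    size_t used;           /* bytes occupied, including terminators */
};

/*
 * Append a NUL-terminated copy of s. Returns the offset at which the
 * string was stored, or -1 if it does not fit.
 */
static long str_buf_add(struct str_buf *b, const char *s)
{
    size_t len = strlen(s) + 1;
    size_t off = b->used;

    if ( len > STR_BUF_SIZE - b->used )
        return -1;

    memcpy(b->data + off, s, len);
    b->used += len;

    return (long)off;
}

/* Look a string up by offset; NULL if the offset is out of range. */
static const char *str_buf_get(const struct str_buf *b, size_t off)
{
    return off < b->used ? b->data + off : NULL;
}
```

Since users hold offsets rather than pointers, no per-string allocation or free is needed; resetting used to 0 discards all strings at once.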
Juergen Gross [Thu, 15 Feb 2024 13:04:51 +0000 (14:04 +0100)]
tools/xen-9pfsd: add transport layer
Add the transport layer of 9pfs. This is basically the infrastructure
to receive requests from the frontend and to send the related answers
via the rings.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Juergen Gross [Thu, 15 Feb 2024 13:04:49 +0000 (14:04 +0100)]
tools: add a new xen 9pfs daemon
Add "xen-9pfsd", a new 9pfs daemon meant to support infrastructure
domains (e.g. xenstore-stubdom) to access files in dom0.
For now only add the code needed for starting the daemon and
registering it with Xenstore via a new "libxl/xen-9pfs/state" node by
writing the "running" state to it.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Juergen Gross [Thu, 8 Feb 2024 16:05:15 +0000 (17:05 +0100)]
docs: add a best practices coding guide
Today the CODING_STYLE contains a section "Handling unexpected
conditions" specific to the hypervisor. This section is kind of
misplaced for a coding style. It should rather be part of a "Coding
best practices" guide.
Add such a guide as docs/process/coding-best-practices.pandoc and
move the mentioned section from CODING_STYLE to the new file, while
converting the format to pandoc.
Roger Pau Monné [Wed, 14 Feb 2024 13:18:06 +0000 (14:18 +0100)]
iommu/x86: fix IVMD/RMRR range checker loop increment
mfn_add() doesn't store the incremented value in the parameter, and instead
returns it to the caller. As a result, the loop in iommu_unity_region_ok()
didn't make progress. Fix it by storing the incremented value.
Fixes: e45801dea17b ('iommu/x86: introduce a generic IVMD/RMRR range validity helper')
Coverity-ID: 1592056 Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Wed, 14 Feb 2024 09:41:59 +0000 (10:41 +0100)]
tools: add access macros for unaligned data
Add the basic access macros for unaligned data to common-macros.h.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
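For readers unfamiliar with the problem, the sketch below shows one portable way to access unaligned data, going through memcpy() so that the compiler never dereferences a misaligned pointer. The helper names are hypothetical; the macros actually added to common-macros.h may have different names and may be implemented differently.

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from an address with no alignment guarantee. */
static inline uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}

/* Write a 32-bit value to an address with no alignment guarantee. */
static inline void store_u32_unaligned(void *p, uint32_t v)
{
    memcpy(p, &v, sizeof(v));
}
```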
Cyril Rébert [Wed, 14 Feb 2024 09:41:43 +0000 (10:41 +0100)]
tools/xentop: add option to display dom0 first
Add a command line option to xentop to be able to display dom0 first, on top of the list.
This is unconditional, so sorting domains with the S option will also ignore dom0.
Signed-off-by: Cyril Rébert (zithro) <slack@rabbit.lu> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Jason Andryuk [Wed, 14 Feb 2024 09:41:17 +0000 (10:41 +0100)]
libxl: Allow Phy backend for CDROM devices
A Linux HVM domain ignores PV block devices with type cdrom. The
Windows PV drivers also ignore device-type != "disk". Therefore QEMU's
emulated CD-ROM support is used. This allows ejection and other CD-ROM
features to work.
With a stubdom, QEMU is running in the stubdom. A PV disk is still
connected into the stubdom, and then QEMU can emulate the CD-ROM into
the guest. Phy support has been enhanced to provide a placeholder file
for empty disks, so it is usable as a CDROM backend as well. Allow Phy
to pass the check as well.
(Bypassing just for a linux-based stubdom doesn't work because
libxl__device_disk_setdefault() gets called early in domain creation
before xenstore is populated with relevant information for the stubdom
type. The build information isn't readily available and won't exist in
some call trees, so it isn't usable either.)
Let disk_try_backend() allow format empty for Phy cdrom drives.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Jason Andryuk [Wed, 14 Feb 2024 09:40:56 +0000 (10:40 +0100)]
libxl: Create empty file for Phy cdrom
With a device model stubdom, dom0 exports a PV disk to the stubdom.
Inside the stubdom, QEMU emulates a cdrom to the guest with a
host_device pointing at the PV frontend (/dev/xvdc)
An empty cdrom drive causes problems booting the stubdom. The PV disk
protocol isn't designed to support no media. That can be partially
hacked around, but the stubdom kernel waits for all block devices to
transition to Connected. Since the backend never connects empty media,
stubdom launch times out and it is destroyed.
Empty media and the PV disks not connecting is fine at runtime since the
stubdom keeps running irrespective of the disk state.
Empty media can be worked around by providing an empty file to the
stubdom for the PV disk source. This works as the disk is exposed as a
zero-size disk. Dynamically create the empty file as needed and remove
in the stubdom cleanup.
libxl__device_disk_set_backend() needs to allow through these "empty"
disks with a pdev_path.
Fixup the params writing since scripts have trouble with an empty params
field.
This works for non-stubdom HVMs as well.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Jan Beulich [Wed, 14 Feb 2024 09:38:38 +0000 (10:38 +0100)]
Argo: drop meaningless mfn_valid() check
Holding a valid struct page_info * in hands already means the referenced
MFN is valid; there's no need to check that again. Convert the checking
logic to a switch(), to help keeping the extra (and questionable) x86-
only check in somewhat tidy shape.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Christopher Clark <christopher.w.clark@gmail.com>
Nicola Vetrini [Thu, 8 Feb 2024 15:50:09 +0000 (16:50 +0100)]
docs/misra: add asm-offset.c to exclude-list
These files contain several deliberate violations of MISRA C rules such
as:
* R20.12 for macros DEFINE and OFFSET, where the second argument
of OFFSET is a macro and is used as a normal parameter and a
stringification operand.
* R2.1 because the file is not linked. (That said, it was decided to
deviate the rule itself to address that aspect.)
Roger Pau Monné [Tue, 13 Feb 2024 08:37:20 +0000 (09:37 +0100)]
iommu/vt-d: switch to common RMRR checker
Use the newly introduced generic unity map checker.
Also drop the message recommending the usage of iommu_inclusive_mapping: the
ranges would end up being mapped anyway even if some of the checks above
failed, regardless of whether iommu_inclusive_mapping is set. Plus such option
is not supported for PVH, and it's deprecated.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 13 Feb 2024 08:36:14 +0000 (09:36 +0100)]
x86/HVM: tidy state on hvmemul_map_linear_addr()'s error path
While in the vast majority of cases failure of the function will not
be followed by re-invocation with the same emulation context, a few
very specific insns - involving multiple independent writes, e.g. ENTER
and PUSHA - exist where this can happen. Since failure of the function
only signals to the caller that it ought to try an MMIO write instead,
such failure also cannot be assumed to result in wholesale failure of
emulation of the current insn. Instead we have to maintain internal
state such that another invocation of the function with the same
emulation context remains possible. To achieve that we need to reset MFN
slots after putting page references on the error path.
Note that all of this affects debugging code only, in causing an
assertion to trigger (higher up in the function). There's otherwise no
misbehavior - such a "leftover" slot would simply be overwritten by new
contents in a release build.
Also extend the related unmap() assertion, to further check for MFN 0.
Fixes: 8cbd4fb0b7ea ("x86/hvm: implement hvmemul_write() using real mappings") Reported-by: Manuel Andreas <manuel.andreas@tum.de> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Paul Durrant <paul@xen.org>
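The pattern at stake (a debug assertion that all slots are empty on entry, plus an error path which must restore that state) can be sketched in isolation. Everything here is hypothetical: the structure, the use of 0 as the "empty" marker and the fail_at parameter that simulates a mapping failure are illustration only, not hvmemul_map_linear_addr() itself.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS   2
#define SLOT_EMPTY 0UL   /* assumption: 0 marks an unused slot */

struct emul_ctx {
    unsigned long slot[NR_SLOTS];
};

/*
 * Record nr frames in the context. fail_at simulates a failure at that
 * index (pass nr or more for no failure). Returns 0 or -1.
 */
static int map_frames(struct emul_ctx *c, const unsigned long *frames,
                      unsigned int nr, unsigned int fail_at)
{
    unsigned int i;

    if ( nr > NR_SLOTS )
        return -1;

    /* Debug check: an earlier call must not have left slots behind. */
    for ( i = 0; i < NR_SLOTS; i++ )
        assert(c->slot[i] == SLOT_EMPTY);

    for ( i = 0; i < nr; i++ )
    {
        if ( i == fail_at )
            goto err;
        c->slot[i] = frames[i];
    }

    return 0;

 err:
    /* Release what was taken so far and reset the slots as well. */
    while ( i-- > 0 )
        c->slot[i] = SLOT_EMPTY;

    return -1;
}
```

Without the reset on the error path, a second call with the same context would trip the entry assertion in a debug build, which is the symptom the commit message describes.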
asm/iommu.h shouldn't need to be included when CONFIG_HAS_PASSTHROUGH
isn't enabled.
As <asm/iommu.h> is ifdef-ed by CONFIG_HAS_PASSTHROUGH, the field
"struct arch_iommu arch" in struct domain_iommu should be ifdef-ed as
well, as the definition of arch_iommu is located in <asm/iommu.h>.
These changes are just enough to avoid generation of an empty
asm/iommu.h for now.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jason Andryuk [Tue, 13 Feb 2024 08:28:58 +0000 (09:28 +0100)]
libxl: Add support for blktap vbd3
This patch re-introduces blktap support to libxl. Unlike earlier
versions, it does not link against any blktap library. libxl changes
are needed to write to the vbd3 backend XenStore nodes.
blktap has three components. tapdisk is a daemon implementing the disk
IO, NBD (Network Block Device), and Xen PV interfaces. tap-ctl is a
tool to control tapdisks - creating, starting, stopping and freeing.
tapback manages the XenStore operations and instructs tapdisk to
connect.
It is notable that tapdisk performs the grant and event channel ops, but
doesn't interact with XenStore. tapback performs XenStore operations
and notifies tapdisks of values and changes.
The flow is: libxl writes to the "vbd3" XenStore nodes and runs the
block-tap script. The block-tap script runs tap-ctl to create a tapdisk
instance as the physical device. tapback then sees the tapdisk and
instructs the tapdisk to connect up the PV blkif interface.
This is expected to work without the kernel blktap driver, so the
block-tap script is modified accordingly to write the UNIX NBD path.
backendtype=tap was not fully removed previously, but it would never
succeed since it would hit the hardcoded error in disk_try_backend().
It is reused now.
An example command to attach a vhd:
xl block-attach vm 'vdev=xvdf,backendtype=tap,format=vhd,target=/srv/target.vhd'
Format raw also works to run an "aio:" tapdisk.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Jan Beulich [Tue, 13 Feb 2024 08:27:47 +0000 (09:27 +0100)]
x86/p2m: make p2m_get_page_from_gfn() handle grant case correctly
The 'fast' path of p2m_get_page_from_gfn handles three cases: normal ram,
foreign p2m entries, and grant map entries. For normal ram and grant table
entries, get_page() is called, but for foreign entries,
page_get_owner_and_reference() is called, since the current domain is
expected not to be the owner.
Unfortunately, grant maps are *also* generally expected to be owned by
foreign domains; so this function will fail for any p2m entry containing a
grant map that doesn't happen to be local.
Have grant maps take the same path as foreign entries. Since grants may
actually be either foreign or local, adjust the assertion to allow for this.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: George Dunlap <george.dunlap@cloud.com> Reviewed-by: George Dunlap <george.dunlap@cloud.com>
Petr Beneš [Mon, 12 Feb 2024 08:37:58 +0000 (09:37 +0100)]
x86/hvm: Fix fast singlestep state persistence
This patch addresses an issue where the fast singlestep setting would persist
despite xc_domain_debug_control being called with XEN_DOMCTL_DEBUG_OP_SINGLE_STEP_OFF.
Specifically, if fast singlestep was enabled in a VMI session and that session
stopped before the MTF trap occurred, the fast singlestep setting remained
active even though MTF itself was disabled. This led to a situation where, upon
starting a new VMI session, the first event to trigger an EPT violation would
cause the corresponding EPT event callback to be skipped due to the lingering
fast singlestep setting.
The fix ensures that the fast singlestep setting is properly reset when
disabling single step debugging operations.
Signed-off-by: Petr Beneš <w1benny@gmail.com> Reviewed-by: Tamas K Lengyel <tamas@tklengyel.com>
Jan Beulich [Mon, 12 Feb 2024 08:37:18 +0000 (09:37 +0100)]
x86/PV32: restore PAE-extended-CR3 logic
While the PAE-extended-CR3 VM assist is a 32-bit only concept, it still
applies to guests also when run on a 64-bit hypervisor: The "extended
CR3" format has to be used there as well, to fit the address in the only
32-bit wide register there. As a result it was a mistake that the check
was never enabled for that case, and was then mistakenly deleted in the
course of removal of 32-bit-Xen code (218adf199e68 ["x86: We can assume
CONFIG_PAGING_LEVELS==4"]).
Similarly during Dom0 construction kernel awareness needs to be taken
into account, and respective code was again mistakenly never enabled for
32-bit Dom0 when running on 64-bit Xen (and thus wrongly deleted by 5d1181a5ea5e ["xen: Remove x86_32 build target"]).
At the same time restrict enabling of the assist for Dom0 to just the
32-bit case. Furthermore there's no need for an atomic update there.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
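As background on why a special format is needed: a PAE top-level page table may live above 4GiB, yet a 32-bit guest's CR3 is only 32 bits wide. The sketch below shows the rotate-by-12 encoding as I recall it from the xen_pfn_to_cr3() / xen_cr3_to_pfn() macros in the public x86-32 header; treat the exact bit layout as an assumption to verify against the tree. The function names used here are hypothetical.

```c
#include <stdint.h>

/*
 * Encode a page frame number into the 32-bit "extended CR3" format:
 * the low 20 bits of the PFN go to bits 12-31, the upper 12 bits of
 * the PFN are placed in bits 0-11.
 */
static inline uint32_t pfn_to_ext_cr3(uint32_t pfn)
{
    return (pfn << 12) | (pfn >> 20);
}

/* Inverse of the above. */
static inline uint32_t ext_cr3_to_pfn(uint32_t cr3)
{
    return (cr3 >> 12) | (cr3 << 20);
}
```

For a frame below 4GiB the result is simply the usual physical address; only frames above that boundary make use of the low bits.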
The failure is reported for the following line:
(paddr_t)(uintptr_t)(_start + boot_phys_offset)
This occurs because the compiler treats (ptr + size) with size bigger than
PTRDIFF_MAX as undefined behavior. To address this, switch to macro
virt_to_maddr(), given the future plans to eliminate boot_phys_offset.
eclair: move function and macro properties outside ECLAIR
Function and macro properties contained in ECLAIR/call_properties.ecl are of
general interest: this patch moves these annotations into a generic JSON file
in docs. In this way, they can be exploited for other purposes (e.g. documentation,
other tools).
Add rst file containing explanation on how to update function_macro_properties.json.
Add script to convert the JSON file in ECL configurations.
Remove ECLAIR/call_properties.ecl: the file is now automatically generated from
the JSON file.
Jason Andryuk [Wed, 7 Feb 2024 12:46:52 +0000 (13:46 +0100)]
block-common: Fix same_vm for no targets
same_vm is broken when the two main domains do not have targets. otvm
and targetvm are both missing, which means they get set to -1 and then
converted to empty strings:
++10697+ local targetvm=-1
++10697+ local otvm=-1
++10697+ otvm=
++10697+ othervm=/vm/cc97bc2f-3a91-43f7-8fbc-4cb92f90b4e4
++10697+ targetvm=
++10697+ local frontend_uuid=/vm/844dea4e-44f8-4e3e-8145-325132a31ca5
The final comparison returns true since the two empty strings match:
Replace -1 with distinct strings indicating the lack of a value and
remove the coalescing to empty strings. The strings themselves will no
longer match, and that is correct.
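The script in question is shell, but the underlying pitfall is language independent and can be shown in a few lines of C. This is a deliberately simplified sketch: NULL stands for "no target", the function names and placeholder strings are hypothetical, and the real same_vm logic performs more comparisons than this.

```c
#include <stddef.h>
#include <string.h>

/* Buggy variant: both missing values collapse to "" and compare equal. */
static int targets_match_buggy(const char *target, const char *other)
{
    const char *a = target ? target : "";
    const char *b = other ? other : "";

    return strcmp(a, b) == 0;
}

/* Fixed variant: each missing value gets its own distinct placeholder. */
static int targets_match_fixed(const char *target, const char *other)
{
    const char *a = target ? target : "No target";
    const char *b = other ? other : "No other target";

    return strcmp(a, b) == 0;
}
```

Two domains without targets are reported as matching by the first variant and as distinct by the second, while genuinely equal targets still match.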
Michal Orzel [Tue, 6 Feb 2024 15:20:12 +0000 (16:20 +0100)]
automation: Switch yocto-qemux86-64 job to run on x86
At the moment, all Yocto jobs run on Arm64 runners. To address CI
capacity issues, move yocto-qemux86-64 job to x86. Reflect the change in
the makefile generating Yocto docker files and fix CONTAINER name
definition that incorrectly expects YOCTO_HOST variable to be set for x86
container as well, which does not have a platform name appended.
Signed-off-by: Michal Orzel <michal.orzel@amd.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Frediano Ziglio [Tue, 6 Feb 2024 10:56:38 +0000 (11:56 +0100)]
x86/paging: Use more specific constant
__HYPERVISOR_arch_1 and __HYPERVISOR_paging_domctl_cont for x86
have the same value but this function is handling
"paging_domctl_cont" hypercall so using the latter mnemonic in
the code is clearer.
Signed-off-by: Frediano Ziglio <frediano.ziglio@cloud.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Tue, 6 Feb 2024 10:56:13 +0000 (11:56 +0100)]
amd-vi: fix IVMD memory type checks
The current code that parses the IVMD blocks is relaxed with regard to the
restriction that such unity regions should always fall into memory ranges
marked as reserved in the memory map.
However the type checks for the IVMD addresses are inverted, and as a result
IVMD ranges falling into RAM areas are accepted. Note that having such ranges
in the first place is a firmware bug, as IVMD should always fall into reserved
ranges.
Fixes: ed6c77ebf0c1 ('AMD/IOMMU: check / convert IVMD ranges for being / to be reserved') Reported-by: Ox <oxjo@proton.me> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: oxjo <oxjo@proton.me> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Hongyan Xia [Tue, 6 Feb 2024 10:54:50 +0000 (11:54 +0100)]
acpi: vmap pages in acpi_os_alloc_memory
Also, introduce a wrapper around vmap that maps a contiguous range for
boot allocations. Unfortunately, the new helper cannot be a static inline
because the dependencies are a mess. We would need to re-include
asm/page.h (was removed in aa4b9d1ee653 "include: don't use asm/page.h
from common headers") and it doesn't look to be enough anymore
because bits from asm/cpufeature.h are used in the definition of PAGE_NX.
Lastly, with the move to vmap(), it is now easier to find the size
of the mapping. So pass the whole area to init_boot_pages() rather than
just the first page.
Signed-off-by: Hongyan Xia <hongyxia@amazon.com> Signed-off-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Elias El Yandouzi <eliasely@amazon.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Julien Grall [Tue, 6 Feb 2024 10:54:17 +0000 (11:54 +0100)]
xen/vmap: Introduce vmap_size() and use it
vunmap() and vfree() currently duplicate the (small) logic to find the
size of a vmap area. In a follow-up patch, we will want to introduce
another one (this time externally).
So introduce a new helper vmap_size() that will return the number of
pages in the area starting at the given address. Take the opportunity
to replace the open-coded version.
Note that vfree() was storing the type of the area in a local variable.
But this seems to have never been used (even when it was introduced).
Signed-off-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Elias El Yandouzi <eliasely@amazon.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Tue, 6 Feb 2024 10:53:04 +0000 (11:53 +0100)]
setup: Move vm_init() before acpi calls
After the direct map removal, pages from the boot allocator are not
going to be mapped in the direct map. Although we have map_domain_page(),
its mappings are ephemeral and are less helpful for ranges of more than a
page, so we want a mechanism to globally map a range of pages, which is
what vmap is for. Therefore, we bring vm_init into early boot stage.
To allow vmap to be initialised and used in early boot, we need to
modify vmap to receive pages from the boot allocator during early boot
stage.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: David Woodhouse <dwmw2@amazon.com> Signed-off-by: Hongyan Xia <hongyxia@amazon.com> Signed-off-by: Julien Grall <jgrall@amazon.com> Signed-off-by: Elias El Yandouzi <eliasely@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Juergen Gross [Mon, 5 Feb 2024 10:49:57 +0000 (11:49 +0100)]
tools/xenstored: map stubdom interface
When running as stubdom, map the stubdom's Xenstore ring page in order
to support using the 9pfs frontend.
Use the same pattern as in dom0_init() when running as daemon in dom0
(introduce the own domain, then send an event to the client side to
signal Xenstore is ready to communicate).
Juergen Gross [Mon, 5 Feb 2024 10:49:56 +0000 (11:49 +0100)]
tools/xenstored: split domain_init()
Today domain_init() is called either just before calling dom0_init()
in case no live update is being performed, or it is called after
reading the global state from read_state_global(), as the event
channel fd is needed.
Split up domain_init() into a preparation part which can be called
unconditionally, and a part setting up the event channel handle.
Note that there is no chance that chk_domain_generation() can be
called now before xc_handle has been setup, so there is no need for
the related special case anymore.
Juergen Gross [Mon, 5 Feb 2024 10:49:55 +0000 (11:49 +0100)]
tools/xenstored: rework ring page (un)map functions
When [un]mapping the ring page of a Xenstore client, different actions
are required for "normal" guests and dom0. Today this distinction is
made at call site.
Move this distinction into [un]map_interface() instead, avoiding code
duplication and preparing special handling for [un]mapping the stub
domain's ring page.
Juergen Gross [Mon, 5 Feb 2024 10:49:52 +0000 (11:49 +0100)]
tools/xenstored: move all log-pipe handling into posix.c
All of the log-pipe handling is needed only when running as daemon.
Move it into posix.c. This requires having a service function in the
main event loop for handling the related requests and one for setting
the fds[] array, which is renamed to poll_fds to have a more specific
name. Use a generic name for those functions, as socket handling can
be added to them later, too.
Juergen Gross [Mon, 5 Feb 2024 10:49:50 +0000 (11:49 +0100)]
tools/xenstored: add early_init() function
Some xenstored initialization needs to be done in the daemon case only,
so split it out into a new early_init() function being a stub in the
stubdom case.
Remove the call of talloc_enable_leak_report_full(), as it serves no
real purpose: the daemon only ever exits due to a crash, in which case
a log of talloc()ed memory hardly has any value.
Juergen Gross [Mon, 5 Feb 2024 10:49:47 +0000 (11:49 +0100)]
tools/xenstored: rename xenbus_evtchn()
Rename the xenbus_evtchn() function to get_xenbus_evtchn() in order to
avoid two externally visible symbols with the same name when Xenstore-
stubdom is being built with a Mini-OS with CONFIG_XENBUS set.
Cyril Rébert [Sun, 4 Feb 2024 10:19:40 +0000 (11:19 +0100)]
tools/xentop: fix sorting bug for some columns
Sort doesn't work on columns VBD_OO, VBD_RD, VBD_WR and VBD_RSECT.
Fix by adjusting variables names in compare functions.
Bug fix only. No functional change.
Fixes: 91c3e3dc91d6 ("tools/xentop: Display '-' when stats are not available.") Signed-off-by: Cyril Rébert (zithro) <slack@rabbit.lu> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>