Leigh Brown [Wed, 8 May 2024 21:38:20 +0000 (22:38 +0100)]
tools/libs/light: Add vlan field to libxl_device_nic
Add `vlan' string field to libxl_device_nic, to allow a VLAN
configuration to be specified for the VIF when adding it to the
bridge device.
Update libxl_nic.c to read and write the vlan field from the
xenstore.
This provides the capability for supported operating systems (e.g.
Linux) to perform VLAN filtering on bridge ports. The Xen
hotplug scripts need to be updated to read this information from
the xenstore and perform the required configuration.
Signed-off-by: Leigh Brown <leigh@solinno.co.uk> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Leigh Brown [Tue, 14 May 2024 08:13:44 +0000 (09:13 +0100)]
tools/xentop: Fix cpu% sort order
In compare_cpu_pct(), there is a double -> unsigned long long conversion when
calling compare(). In C, this discards the fractional part, resulting in an
out-of-order sorting such as:
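The truncation can be illustrated in isolation. The following is a minimal sketch, not xentop's code: compare_ull() and compare_dbl() are hypothetical stand-ins for an integer and a floating-point three-way compare helper.

```c
/* Hypothetical integer three-way compare: returns -1, 0 or 1. */
static int compare_ull(unsigned long long a, unsigned long long b)
{
    return (a > b) - (a < b);
}

/* Hypothetical floating-point three-way compare: returns -1, 0 or 1. */
static int compare_dbl(double a, double b)
{
    return (a > b) - (a < b);
}

/*
 * Passing doubles to compare_ull() converts them implicitly, dropping the
 * fractional part, so 1.9 and 1.2 both become 1 and compare as equal.
 */
static int cmp_pct_truncating(double a, double b)
{
    return compare_ull(a, b);
}

/* Comparing as doubles keeps the fractional part. */
static int cmp_pct_exact(double a, double b)
{
    return compare_dbl(a, b);
}
```

With these helpers, cmp_pct_truncating(1.9, 1.2) reports the two values as equal, while cmp_pct_exact(1.9, 1.2) orders them correctly.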
Andrew Cooper [Thu, 9 May 2024 17:40:11 +0000 (18:40 +0100)]
tools/hvmloader: Further simplify SMP setup
Now that we're using hypercalls to start APs, we can replace the 'ap_cpuid'
global with a regular function parameter. This requires telling the compiler
that we'd like the parameter in a register rather than on the stack.
While adjusting, rename to cpu_setup(). It's always been used on the BSP,
making the name ap_start() specifically misleading.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
Andrew Cooper [Sat, 11 May 2024 18:25:00 +0000 (19:25 +0100)]
x86/cpufreq: Rename cpuid variable/parameters to cpu
Various functions have a parameter or local variable called cpuid, but this
triggers a MISRA R5.3 violation because we also have a function called cpuid()
which wraps the real CPUID instruction.
In all these cases, it's a Xen cpu index, which is far more commonly named
just cpu in our code.
While adjusting these, fix a couple of other issues:
* cpufreq_cpu_init() is on the end of a hypercall (with in-memory parameters,
even), making EFAULT the wrong error to use. Use EOPNOTSUPP instead.
* check_est_cpu() is wrong to tie EIST to just Intel, and nowhere else using
EIST makes this restriction. Just check the feature itself, which is more
succinctly done after being folded into its single caller.
* In powernow_cpufreq_update(), replace an opencoded cpu_online().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Wed, 15 May 2024 13:35:15 +0000 (15:35 +0200)]
x86: respect mapcache_domain_init() failing
The function itself properly handles and hands onwards failure from
create_perdomain_mapping(). Therefore its caller should respect possible
failure, too.
Fixes: 4b28bf6ae90b ("x86: re-introduce map_domain_page() et al") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Juergen Gross [Wed, 15 May 2024 15:25:39 +0000 (17:25 +0200)]
xen/sched: set all sched_resource data inside locked region for new cpu
When adding a cpu to a scheduler, set all data items of struct
sched_resource inside the locked region, as otherwise a race might
happen (e.g. when trying to access the cpupool of the cpu):
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Fixes: a8c6c623192e ("sched: clarify use cases of schedule_cpu_switch()") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 2 Apr 2024 14:50:19 +0000 (15:50 +0100)]
Revert "evtchn: refuse EVTCHNOP_status for Xen-bound event channels"
The commit makes a claim without justification.
The claim is false; it broke lsevtchn in dom0, a debugging utility which
absolutely does care about all of the domain's event channels.
Whether to return information about a xen-owned evtchn is a matter of policy,
and it's not acceptable to subvert Xen's security subsystem on the decision.
Fixes: f60ab5337f96 ("evtchn: refuse EVTCHNOP_status for Xen-bound event channels") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Andrew Cooper [Fri, 10 May 2024 22:56:52 +0000 (23:56 +0100)]
xen: Use -Wuninitialized and -Winit-self
Assigning a variable to itself is an anti-pattern. It introduces definite UB
in an attempt to silence a warning about possible UB.
As it's definite undefined behaviour, it also mis-compiles in simple cases,
using whatever stale value happened to be in the allocated register.
Clang includes -Wuninitialized within -Wall, but GCC only includes it in
-Wextra, which is not used by Xen at this time.
Furthermore, the specific pattern of assigning a variable to itself in its
declaration is only diagnosed by GCC with -Winit-self. Clang does diagnose
simple forms of this pattern with a plain -Wuninitialized, but it fails to
diagnose the instances in Xen that GCC manages to find.
GCC, with -Wuninitialized and -Winit-self notices:
  arch/x86/time.c: In function ‘read_pt_and_tsc’:
  arch/x86/time.c:297:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
    297 |     uint32_t best = best;
        |              ^~~~
  arch/x86/time.c: In function ‘read_pt_and_tmcct’:
  arch/x86/time.c:1022:14: error: ‘best’ is used uninitialized in this function [-Werror=uninitialized]
   1022 |     uint64_t best = best;
        |              ^~~~
Fix these up to start with a value of ~0, which is also more robust in the
case that something goes wrong.
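The shape of the fix can be sketched as follows. This is an illustrative stand-alone function, not the code in arch/x86/time.c: min_sample() and its parameters are hypothetical, and only the initialisation of best to ~0 is taken from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the smallest value in samples[0..n), or UINT32_MAX when n is 0.
 * Starting from ~0 gives a well-defined result even if the loop body never
 * runs, unlike the self-assignment "uint32_t best = best;".
 */
static uint32_t min_sample(const uint32_t *samples, size_t n)
{
    uint32_t best = ~0u;

    for (size_t i = 0; i < n; i++)
        if (samples[i] < best)
            best = samples[i];

    return best;
}
```

Compiling the self-assigning variant with GCC and -Wuninitialized -Winit-self should produce a diagnostic of the kind quoted above.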
Fixes: 23658e823238 ("x86/time: further improve TSC / CPU freq calibration accuracy") Fixes: 3f3906b462d5 ("x86/APIC: calibrate against platform timer when possible") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Nicola Vetrini [Fri, 10 May 2024 18:03:36 +0000 (20:03 +0200)]
automation/eclair_analysis: tag MISRA C Rule 1.1 as clean
Tag the rule as clean, as there are no more violations in the codebase since 93c27d54dd23 ("xen/arm: Fix MISRA regression on R1.1,
flexible array member not at the end").
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
libxl: Fix handling XenStore errors in device creation
If xenstored runs out of memory it is possible for it to fail operations
that should succeed. libxl wasn't robust against this, and could fail
to ensure that the TTY path of a non-initial console was created and
read-only for guests. This doesn't qualify for an XSA because guests
should not be able to run xenstored out of memory, but it still needs to
be fixed.
Add the missing error checks to ensure that all errors are properly
handled and that at no point can a guest make the TTY path of its
frontend directory writable.
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Reviewed-by: Juergen Gross <jgross@suse.com>
x86/hvm: Allow access to registers on the same page as MSI-X table
Some devices (notably Intel Wifi 6 AX210 card) keep auxiliary registers
on the same page as MSI-X table. Device model (especially one in
stubdomain) cannot really handle those, as direct writes to that page are
refused (page is on the mmio_ro_ranges list). Instead, extend
msixtbl_mmio_ops to handle such accesses too.
Doing this requires correlating read/write location with guest
MSI-X table address. Since QEMU doesn't map MSI-X table to the guest,
it requires msixtbl_entry->gtable, which is HVM-only. Similar feature
for PV would need to be done separately.
This will also be used to read the Pending Bit Array, if it lives on the same
page, so that QEMU does not need /dev/mem access at all (especially helpful
with lockdown enabled in dom0). If PBA lives on another page, QEMU will
map it to the guest directly.
If PBA lives on the same page, discard writes and log a message.
Technically, writes outside of PBA could be allowed, but at this moment
the precise location of PBA isn't saved, and also no known device abuses
the spec in this way (at least yet).
To access those registers, msixtbl_mmio_ops need the relevant page
mapped. MSI handling already has infrastructure for that, using fixmap,
so try to map first/last page of the MSI-X table (if necessary) and save
their fixmap indexes. Note that msix_get_fixmap() does reference
counting and reuses existing mapping, so just call it directly, even if
the page was mapped before. Also, it uses a specific range of fixmap
indexes which doesn't include 0, so use 0 as default ("not mapped")
value - which simplifies code a bit.
Based on the assumption that all MSI-X page accesses are handled by Xen, do
not forward adjacent accesses to other hypothetical ioreq servers, even
if the access wasn't handled for some reason (failure to map pages etc).
Relevant places log a message about that already.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
The arch_msix struct had a single "warned" field with a domid for which
a warning was issued. An upcoming patch will need a similar mechanism for a few
more warnings, so change it to save a bit field of issued warnings.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Fri, 10 May 2024 12:49:13 +0000 (14:49 +0200)]
libxl: fix population of the online vCPU bitmap for PVH
libxl passes some information to libacpi to create the ACPI table for a PVH
guest, and among that information is a bitmap of which vCPUs are online,
which can be fewer than the maximum number of vCPUs assigned to the domain.
While the population of the bitmap is done correctly for HVM based on the
number of online vCPUs, for PVH the population of the bitmap is done based on
the number of maximum vCPUs allowed. This leads to all local APIC entries in
the MADT being set as enabled, which contradicts the data in xenstore if vCPUs
differs from maximum vCPUs.
Fix by copying the internal libxl bitmap that's populated based on the vCPUs
parameter.
Reported-by: Arthur Borsboom <arthurborsboom@gmail.com> Link: https://gitlab.com/libvirt/libvirt/-/issues/399 Reported-by: Leigh Brown <leigh@solinno.co.uk> Fixes: 14c0d328da2b ('libxl/acpi: Build ACPI tables for HVMlite guests') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Leigh Brown <leigh@solinno.co.uk> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Fri, 10 May 2024 14:16:36 +0000 (16:16 +0200)]
xen: allow up to 16383 cpus
With lock handling now allowing up to 16384 cpus (spinlocks can handle
65535 cpus, rwlocks can handle 16384 cpus), raise the allowed limit for
the number of cpus to be configured to 16383.
The new limit is imposed by IOMMU_CMD_BUFFER_MAX_ENTRIES and
QINVAL_MAX_ENTRY_NR required to be larger than 2 * CONFIG_NR_CPUS.
Add a support limit of physical CPUs to SUPPORT.md (4096 on x86, 128
on ARM).
automation/eclair: hide reports coming from adopted code in scheduled analysis
To improve clarity and ease of navigation do not show reports related
to adopted code in the scheduled analysis.
Configuration options are commented out because they may be useful
in the future.
automation/eclair_analysis: amend configuration for some MISRA rules
Adjust ECLAIR configuration for rules: R21.14, R21.15, R21.16 by taking
into account mem* macros defined in the Xen sources as if they were
equivalent to the ones in Standard Library.
xen/arm: Fix MISRA regression on R1.1, flexible array member not at the end
Commit 2209c1e35b47 ("xen/arm: Introduce a generic way to access memory
bank structures") introduced a MISRA regression for Rule 1.1 because a
flexible array member is introduced in the middle of a struct. Furthermore,
this uses a GCC extension that is going to be deprecated in GCC 14, where a
warning (-Wflex-array-member-not-at-end) will be present to identify such
cases.
In order to fix this issue, use the macro __struct_group to create a
structure 'struct membanks_hdr' which will hold the common data among
structures using the 'struct membanks' interface.
Modify the 'struct shared_meminfo' and 'struct meminfo' to use this new
structure, effectively removing the flexible array member from the middle
of the structure and modify the code accessing the .common field to use
the macro container_of to maintain the functionality of the interface.
Given this change, container_of needs to be supplied with a type and so
the macro 'kernel_info_get_mem' inside arm/include/asm/kernel.h can't be
an option since it uses const and non-const types for struct membanks, so
introduce two static inline functions, one of which will keep the const
qualifier.
Given the complexity of the interface, which carries a lot of benefit but
on the other hand could be prone to developer confusion if the access is
open-coded, introduce two static inline helpers for the
'struct kernel_info' .shm_mem member and get rid of the open-coded
shm_mem.common access.
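The pattern described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the Xen code: the container_of() definition is a minimal one, the struct layouts are cut down, and the helper names meminfo_banks() and meminfo_banks_const() are hypothetical. It relies, as the commit does, on the generic view and the concrete container sharing the same leading layout.

```c
#include <stddef.h>

/* Minimal container_of(); a production macro would also type-check ptr. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct membank {
    unsigned long start, size;
};

/* Common header shared by every bank container. */
struct membanks_hdr {
    unsigned int nr_banks;
    unsigned int max_banks;
};

/* Generic view: header first, flexible array member at the very end. */
struct membanks {
    struct membanks_hdr common;
    struct membank bank[];
};

/* Concrete container: same header, fixed-size storage right after it. */
struct meminfo {
    struct membanks_hdr common;
    struct membank bank[4];
};

/* Non-const accessor: go from the embedded header to the generic view. */
static inline struct membanks *meminfo_banks(struct meminfo *mem)
{
    return container_of(&mem->common, struct membanks, common);
}

/* Const accessor: same computation, but the const qualifier is kept. */
static inline const struct membanks *
meminfo_banks_const(const struct meminfo *mem)
{
    return (const struct membanks *)
        ((const char *)&mem->common - offsetof(struct membanks, common));
}
```

Two separate functions are needed because a single macro cannot return a const pointer for const input and a non-const pointer otherwise without extra machinery.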
Fixes: 2209c1e35b47 ("xen/arm: Introduce a generic way to access memory bank structures") Reported-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Import __struct_group from Linux, commit 50d7bd38c3aa
("stddef: Introduce struct_group() helper macro"), in order to
allow access to the members through the anonymous structure
without also having to write the name, e.g.:
    struct foo {
        int one;
        struct {
            int two;
            int three, four;
        } thing;
        int five;
    };
would become:
    struct foo {
        int one;
        __struct_group(/* None */, thing, /* None */,
            int two;
            int three, four;
        );
        int five;
    };
This allows users of the structure to access the .thing members by
using .two/.three/.four directly on the struct foo.
This construct will become useful in order to have some generalized
interfaces that share some common members.
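How such a macro can work is sketched below. The definition is a simplified rendition written for illustration (a union of an anonymous struct and a named struct with identical members); the macro imported into Xen may differ in detail, for example in how attributes are applied.

```c
/*
 * Simplified sketch: declare the same members twice inside an anonymous
 * union, once as an anonymous struct (direct access) and once as a named
 * struct (access via NAME). TAG and ATTRS may be left empty.
 */
#define __struct_group(TAG, NAME, ATTRS, ...) \
    union { \
        struct { __VA_ARGS__ } ATTRS; \
        struct TAG { __VA_ARGS__ } ATTRS NAME; \
    }

struct foo {
    int one;
    __struct_group(/* None */, thing, /* None */,
        int two;
        int three, four;
    );
    int five;
};
```

With this definition, f.two and f.thing.two name the same storage, and sizeof(f.thing) covers exactly the grouped members.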
Andrew Cooper [Tue, 23 Apr 2024 15:45:36 +0000 (16:45 +0100)]
x86/boot: Explain how moving mod[0] works
modules_headroom is a misleading name as it applies strictly to mod[0] only,
and the movement loop is deeply unintuitive and completely undocumented.
Provide help to whomever needs to look at this code next.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@gmail.com>
x86/IOMMU: address violations of MISRA C:2012 Rule 14.4
The xen sources contain violations of MISRA C:2012 Rule 14.4 whose
headline states:
"The controlling expression of an if statement and the controlling
expression of an iteration-statement shall have essentially Boolean type".
Add comparisons to avoid using enum constants as controlling expressions
to comply with Rule 14.4.
Amend the comment in the enum definition to reflect the fact that
boolean uses of iommu_intremap are no longer allowed.
No functional change.
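The kind of change involved can be sketched as follows. The enum and function below are illustrative only; the constant names are assumptions and are not taken from the patch.

```c
#include <stdbool.h>

/* Illustrative three-state setting; names are hypothetical. */
enum intremap_mode {
    intremap_off,
    intremap_restricted,
    intremap_full,
};

static bool intremap_enabled(enum intremap_mode mode)
{
    /*
     * Non-compliant form: "if ( mode )" uses an enum value as the
     * controlling expression. The explicit comparison below is
     * essentially Boolean, as Rule 14.4 requires.
     */
    if (mode != intremap_off)
        return true;

    return false;
}
```

The generated code is expected to be the same either way, which is consistent with the commit's "No functional change" statement.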
Signed-off-by: Maria Celeste Cesario <maria.celeste.cesario@bugseng.com> Signed-off-by: Simone Ballarin <simone.ballarin@bugseng.com> Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
xen/unaligned: address violation of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
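Why the parentheses matter can be seen in a small example. The macros below are hypothetical and are not among those touched by the Rule 20.7 patches in this log.

```c
/* Non-compliant: the parameter is expanded without parentheses. */
#define SCALE_BAD(x)  (x * 2)

/* Compliant with Rule 20.7: the expanded parameter is parenthesised. */
#define SCALE_GOOD(x) ((x) * 2)
```

SCALE_BAD(1 + 2) expands to (1 + 2 * 2), which is 5, whereas SCALE_GOOD(1 + 2) expands to ((1 + 2) * 2), which is 6.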
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Mon, 29 Apr 2024 16:31:03 +0000 (17:31 +0100)]
x86/hvm: Defer the size calculation in hvm_save_cpu_xsave_states()
HVM_CPU_XSAVE_SIZE() may rewrite %xcr0 twice. Defer the calculation until
after we've decided to write out an XSAVE record.
Note in hvm_load_cpu_xsave_states() that there were versions of Xen which
wrote out a useless XSAVE record. This sadly limits our ability to tidy up
the existing infrastructure. Also leave a note in xstate_ctxt_size() that 0
still needs tolerating for now.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
tools/hvmloader: Wake APs with hypercalls rather than INIT+SIPI+SIPI
... in order to change how LAPIC_ID handling works. Importantly, this allows
us to start APs by vCPU ID in order to query the LAPIC_ID, rather than needing
to know the APIC_ID in order to wake them.
Other improvements avoid:
* The 16bit entry stub
* A LMSW insn, which has no decode assist on AMD and needs emulating fully
* 13 vLAPIC emulations when 3 hypercalls can do
* 4 pages of stack when 1 is plenty
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 24 Aug 2022 10:08:28 +0000 (11:08 +0100)]
tools/hvmloader: Move various helpers to being static inlines
The IO port, MSR, IO-APIC and LAPIC accessors compile typically to single or
pairs of instructions, which is less overhead than even the stack manipulation
to call the helpers.
Move the implementations from util.c to being static inlines in util.h
In addition, turn ioapic_base_address into a constant as it is never modified
from 0xfec00000 (substantially shrinks the IO-APIC logic), and make use of the
"A" constraint for WRMSR/RDMSR like we already do for RDTSC.
Daniel P. Smith [Wed, 24 Apr 2024 16:34:22 +0000 (12:34 -0400)]
xen/gunzip: Move crc state into gunzip_state
Move the crc and its state into struct gunzip_state. In the process, expand
the only use of CRC_VALUE as it hides what is being compared.
Furthermore, all variables here should be uint32_t rather than unsigned long,
which halves the storage space required. Filter the type changes through the
logic.
Adjust the logic to hold crc in a positive form, and negate it for update in
flush_window(). This is the more normal way to write CRC algorithms, and
avoids weird-to-follow logic in gunzip().
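The "positive form" convention can be sketched with a plain bitwise CRC-32. This is an illustrative stand-alone routine, not the table-driven code in Xen's gunzip; crc32_update() is a hypothetical name. The stored value is the finished CRC, and it is complemented on entry to and exit from each update.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Feed len bytes into a CRC-32 (reflected polynomial 0xEDB88320).
 * crc is held in positive form: start from 0 and the return value is
 * directly comparable with the CRC stored in a gzip trailer.
 */
static uint32_t crc32_update(uint32_t crc, const unsigned char *buf,
                             size_t len)
{
    uint32_t c = ~crc;                     /* complement for the update */

    for (size_t i = 0; i < len; i++) {
        c ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            c = (c >> 1) ^ (0xEDB88320u & -(c & 1u));
    }

    return ~c;                             /* back to positive form */
}
```

Because the complement happens inside the update, the caller can feed data in several chunks and compare the running value against the expected CRC without any further fix-up.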
Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Daniel P. Smith [Wed, 24 Apr 2024 16:34:19 +0000 (12:34 -0400)]
xen/gunzip: Move input buffer handling into gunzip_state
Move the input buffer handling, i.e. the buffer pointer (inbuf), size (insize),
and index (inptr), into gunzip_state. Adjust functions and macros that consumed
the input buffer to accept a struct gunzip_state reference.
Convert get_byte() into a real function and subsume fill_inbuf(). Fix the
failure path to work correctly when error() stops being a plain panic().
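A rough sketch of the resulting shape is given below. The field names inbuf, insize and inptr come from the description above; the struct name gunzip_state_sketch and the choice of returning -1 on exhausted input are assumptions made for this illustration, and the real function reports the failure through error().

```c
#include <stddef.h>

/* Cut-down stand-in for the input-related part of struct gunzip_state. */
struct gunzip_state_sketch {
    const unsigned char *inbuf;   /* input buffer */
    size_t insize;                /* number of valid bytes in inbuf */
    size_t inptr;                 /* index of the next byte to consume */
};

/*
 * Return the next input byte, or -1 once the buffer is exhausted.
 * Returning explicitly keeps the failure path well-defined even if the
 * error reporting routine returns to its caller instead of panicking.
 */
static int get_byte(struct gunzip_state_sketch *s)
{
    if (s->inptr >= s->insize)
        return -1;

    return s->inbuf[s->inptr++];
}
```

Keeping all three fields in one state object is what allows several decompressor runs without global state leaking between them.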
Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 6 May 2024 08:08:40 +0000 (10:08 +0200)]
xen/gunzip: don't leak memory on error paths
While decompression errors are likely going to be fatal to Xen's boot
process anyway, with the goal of doing multiple decompressor runs it is
likely better to avoid leaks even on error paths, all the more so as code
size actually shrinks a tiny bit this way.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Nicola Vetrini [Mon, 6 May 2024 08:52:31 +0000 (10:52 +0200)]
automation/eclair_analysis: unblock pipelines from certain repositories
Repositories under people/* only execute the analyze step if manually
triggered, but in order to avoid blocking the rest of the pipeline
if such step is not run, allow it to fail.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
George Dunlap [Thu, 25 Apr 2024 08:49:42 +0000 (09:49 +0100)]
svm: Fix MISRA 8.2 violation
MISRA 8.2 requires named parameters in prototypes. Use the name from
the implementation.
Fixes: 0d19d3aab0 ("svm/nestedsvm: Introduce nested capabilities bit") Reported-by: Andrew Cooper <andrew.cooper@cloud.com> Reported-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Signed-off-by: George Dunlap <george.dunlap@cloud.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 7 May 2024 11:19:41 +0000 (12:19 +0100)]
x86/cpu-policy: Fix migration from Ice Lake to Cascade Lake
Ever since Xen 4.14, there has been a latent bug with migration.
While some toolstacks can level the features properly, they don't shrink
feat.max_subleaf when all features have been dropped. This is because
we *still* have not completed the toolstack side work for full CPU Policy
objects.
As a consequence, even when properly feature levelled, VMs can't migrate
"backwards" across hardware which reduces feat.max_subleaf. One such example
is Ice Lake (max_subleaf=2 for INTEL_PSFD) to Cascade Lake (max_subleaf=0).
Extend the max policies feat.max_subleaf to the highest number Xen knows
about, but leave the default policies matching the host. This will allow VMs
with a higher feat.max_subleaf than strictly necessary to migrate in.
Eventually we'll manage to teach the toolstack how to avoid creating such VMs
in the first place, but there's still more work to do there.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Sat, 4 May 2024 01:10:33 +0000 (02:10 +0100)]
tools/libxs: Open /dev/xen/xenbus fds as O_CLOEXEC
The header description for xs_open() goes as far as to suggest that the fd is
O_CLOEXEC, but it isn't actually.
`xl devd` has been observed leaking /dev/xen/xenbus into children.
Link: https://github.com/QubesOS/qubes-issues/issues/8292 Reported-by: Demi Marie Obenour <demi@invisiblethingslab.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Mon, 6 May 2024 12:52:48 +0000 (14:52 +0200)]
AMD/IOMMU: add helper to check whether ATS is to be used for a device
The same set of conditions is used in three places, which need to be kept
in sync. Introduce a helper to centralize these checks.
To allow all parameters of the new helper to be pointer-to-const,
iommu_has_cap() also needs its 1st parameter to be constified. Beyond
that, further "modernize" that function.
Requested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 6 May 2024 12:51:29 +0000 (14:51 +0200)]
VT-d: tidy error handling of RMRR parsing
It's acpi_parse_one_rmrr() where the allocation is coming from (by way
of invoking acpi_parse_dev_scope()), or in add_one_user_rmrr()'s case
allocation is even open-coded there, so freeing would better also happen
there. Care needs to be taken to preserve acpi_parse_one_rmrr()'s
ultimate return value.
While fiddling with callers also move scope_devices_free() to .init and
have it use XFREE() instead of open-coding it. To avoid making the
situation worse for register_one_rmrr(), mark that __init right here as
well.
In register_one_rmrr() also have the "ignore" path take the main
function return path.
Suggested-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com>
MISRA C:2012 Rule 16.4 states that "Every switch statement shall have a
default label".
Update ECLAIR configuration to take into account the deviations
agreed during MISRA meetings.
Roger Pau Monné [Mon, 6 May 2024 07:24:10 +0000 (09:24 +0200)]
ppc/riscv: fix arch_acquire_resource_check()
None of the implementations support set_foreign_p2m_entry() yet, nor do they
have a p2m walk in domain_relinquish_resources() in order to remove the foreign
mappings from the p2m and thus drop the extra refcounts.
Adjust the arch helpers to return false and introduce a comment that clearly
states it is not only taking extra refcounts that's needed, but also dropping
them on domain teardown.
Fixes: 4988704e00d8 ('xen/riscv: introduce p2m.h') Fixes: 4a2f68f90930 ('xen/ppc: Define minimal stub headers required for full build') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Nicola Vetrini [Mon, 6 May 2024 07:23:30 +0000 (09:23 +0200)]
drivers/char: address violation of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
Jan Beulich [Mon, 6 May 2024 07:22:45 +0000 (09:22 +0200)]
VT-d: correct ATS checking for root complex integrated devices
Spec version 4.1 says
"The ATSR structures identifies PCI Express Root-Ports supporting
Address Translation Services (ATS) transactions. Software must enable
ATS on endpoint devices behind a Root Port only if the Root Port is
reported as supporting ATS transactions."
Clearly root complex integrated devices aren't "behind root ports",
matching my observation on a SapphireRapids system having an ATS-
capable root complex integrated device. Hence for such devices we
shouldn't try to locate a corresponding ATSR.
Since both pci_find_ext_capability() and pci_find_cap_offset() return
"unsigned int", change "pos" to that type at the same time.
Fixes: 903b93211f56 ("[VTD] laying the ground work for ATS") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Thu, 2 May 2024 17:35:09 +0000 (18:35 +0100)]
xen/Kconfig: Drop the final remnants of ---help---
We deprecated the use of ---help--- a while ago, but a lot of new content
copy&pastes bad examples. Convert the remaining instances, and update
Kconfig's parser to no longer recognise it.
Juergen Gross [Thu, 2 May 2024 13:21:36 +0000 (15:21 +0200)]
tools/tests: don't let test-xenstore write nodes exceeding default size
Today test-xenstore will write nodes with 3000 bytes of node data. This
size exceeds the default quota for the allowed node size. While this
works in dom0 with C-xenstored, OCAML-xenstored does not like that.
Use a size of 2000 instead, which is lower than the allowed default
node size of 2048.
Fixes: 3afc5e4a5b75 ("tools/tests: add xenstore testing framework") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 26 Apr 2024 15:53:08 +0000 (16:53 +0100)]
x86/cpu-policy: Annotate the accumulated features
Some features need accumulating rather than intersecting to make migration
safe. Introduce the new '|' attribute for this purpose.
Right now, it's only used by the Xapi toolstack, but it will be used by
xl/libxl when the full policy-object work is complete, and until then it's
still a useful hint for hand-crafted cpuid= lines in vm.cfg files.
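The difference between intersecting and accumulating can be sketched with plain bitmasks. This is a simplified illustration, not the toolstack's levelling code: level_features() and accumulate_mask are hypothetical, with the mask standing for the set of bits carrying the '|' annotation.

```c
#include <stdint.h>

/*
 * Combine the feature words of two hosts into one migration-safe word.
 * Ordinary features are kept only if both hosts have them (intersection);
 * features selected by accumulate_mask are kept if either host has them.
 */
static uint32_t level_features(uint32_t host_a, uint32_t host_b,
                               uint32_t accumulate_mask)
{
    uint32_t intersected = host_a & host_b & ~accumulate_mask;
    uint32_t accumulated = (host_a | host_b) & accumulate_mask;

    return intersected | accumulated;
}
```

For example, with host_a = 0x3, host_b = 0x5 and bit 2 marked for accumulation, the result is 0x5: bit 0 survives the intersection and bit 2 is accumulated from host_b.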
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
That change aimed at eliminating an open-coded lock-like construct,
which really isn't all that similar to, in particular, get_page(). The
function always succeeds. Any remaining concern would want taking care
of by placing block_lock_speculation() at the end of the function.
Since the function is called only during page (de)validation, any
possible performance concerns over such extra serialization could
likely be addressed by pre-validating (e.g. via pinning) page tables.
The fundamental issue with the change being reverted is that it detects
bad state only after already having caused possible corruption. While
the system is going to be halted in such an event, there is a time
window during which the resulting incorrect state could be leveraged by
a clever (in particular: fast enough) attacker.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Now, the check-extension() macro has 1 argument instead of 2.
This change helps to reduce redundancy around usage of extensions
name (in the case of the zbb extension, the name was used 3 times).
To implement this, a new variable was introduced:
<extension name>-insn
which represents the instruction support that is being checked.
Additionally, zbb-insn is updated to use $(comma) instead of ",".
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jason Andryuk [Tue, 30 Apr 2024 06:33:41 +0000 (08:33 +0200)]
xen/xsm: Wire up get_dom0_console
An XSM hook for get_dom0_console is currently missing. Using XSM with
a PVH dom0 shows:
(XEN) FLASK: Denying unknown platform_op: 64.
Wire up the hook, and allow it for dom0.
Fixes: 4dd160583c ("x86/platform: introduce hypercall to get initial video console settings") Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
x86/msi: passthrough all MSI-X vector ctrl writes to device model
QEMU needs to know whether clearing maskbit of a vector is really
clearing, or was already cleared before. Currently Xen sends only
clearing that bit to the device model, but not setting it, so QEMU
cannot detect it. Because of that, QEMU works around this by
checking via /dev/mem, but that isn't the proper approach.
Give all necessary information to QEMU by passing all ctrl writes,
including masking a vector. Advertise the new behavior via
XENVER_get_features, so QEMU can know it doesn't need to access /dev/mem
anymore.
While this commit doesn't move the whole maskbit handling to QEMU (as
discussed on xen-devel as one of the possibilities), it is a necessary
first step anyway, including telling QEMU it will get all the required
information to do so. The actual implementation would need to include:
- a hypercall for QEMU to control just maskbit (without (re)binding the
interrupt again)
- a method for QEMU to tell Xen it will actually do the work
Those are not part of this series.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
The ->profile member is at different offsets in struct rspinlock and
struct spinlock. When initializing the profiling bits of an rspinlock,
an unrelated member in struct rspinlock was being overwritten, leading
to mild havoc. Use the correct pointer.
Fixes: b053075d1a7b ("xen/spinlock: make struct lock_profile rspinlock_t aware") Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 26 Apr 2024 10:43:01 +0000 (12:43 +0200)]
x86/entry: shrink insn size for some of our EFLAGS manipulation
Much like was recently done for setting entry vector, and along the
lines of what we already had in handle_exception_saved, avoid 32-bit
immediates where 8-bit ones do. Reduces .text.entry size by 16 bytes in
my non-CET reference build, while in my CET reference build section size
doesn't change (there and in .text only padding space increases).
Inspired by other long->byte conversion work.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
The vPCI prefetchable memory range is >= 4GB, so the memory space flags
should be set to 64-bit. See IEEE Std 1275-1994 [1] chapter 2.2.1.1 for
a definition of the field.
Introduce accepted_guidelines.sh: a script to autogenerate the
configuration file accepted.ecl from docs/misra/rules.rst which enables
all accepted guidelines.
Introduce monitored.ecl: a manual selection of accepted guidelines
which are clean or almost clean; it is intended to be used for the
analyses triggered by commits.
Reorganize tagging.ecl:
- Remove "accepted" tags: keeping track of accepted guidelines by tagging
them as "accepted" in the configuration file tagging.ecl is no
longer needed since docs/rules.rst is keeping track of them.
- Tag more guidelines as clean.
Reorganize eclair pipelines:
- Set1, Set2, Set3 are now obsolete: remove the corresponding
pipelines and ecl files.
- Amend scheduled eclair pipeline to use accepted.ecl.
- Amend triggered eclair pipeline to use monitored.ecl.
Rename and improve action_check_clean_regressions.sh to print a
diagnostic in case a commit introduces a violation of a clean guideline.
An example diagnostic is the following:
Failure: 13 regressions found for clean guidelines
service MC3R1.R8.2: (required) Function types shall be in prototype form with named parameters:
violation: 13
It's currently too restrictive by just checking whether there's a BHB clearing
sequence selected. It should instead check whether BHB clearing is used on
entry from PV or HVM specifically.
Switch to use opt_bhb_entry_{pv,hvm} instead, and then remove cpu_has_bhb_seq
since it no longer has any users.
Reported-by: Jan Beulich <jbeulich@suse.com> Fixes: 954c983abcee ('x86/spec-ctrl: Software BHB-clearing sequences') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
x86/spec: fix reporting of BHB clearing usage from guest entry points
Reporting whether the BHB clearing on entry is done for the different domains
types based on cpu_has_bhb_seq is unhelpful, as that variable signals whether
there's a BHB clearing sequence selected, but that alone doesn't imply that
such sequence is used from the PV and/or HVM entry points.
Instead use opt_bhb_entry_{pv,hvm} which do signal whether BHB clearing is
performed on entry from PV/HVM.
Fixes: 689ad48ce9cf ('x86/spec-ctrl: Wire up the Native-BHI software sequences') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Edwin Török [Wed, 27 Mar 2024 16:30:21 +0000 (16:30 +0000)]
tools/ocaml: Fix warnings in config.ml
Fixes warnings such as:
File "config.ml", line 102, characters 12-27:
102 | | Failure "int_of_string" -> append (k, "expect int arg")
^^^^^^^^^^^^^^^
Warning 52: Code should not depend on the actual values of
this constructor's arguments. They are only for information
and may change in future versions. (See manual section 9.5)
Do not rely on the string values of the `Failure` exception, but use the
`_opt` functions instead.
Signed-off-by: Edwin Török <edwin.torok@cloud.com> Acked-by: Christian Lindig <christian.lindig@cloud.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 25 Apr 2024 07:53:55 +0000 (09:53 +0200)]
x86/paging: vCPU host mode is always set
... thanks to paging_vcpu_init() being part of vCPU creation. Further
if paging is enabled on a domain, it's also guaranteed to be either HAP
or shadow. Drop respective unnecessary (parts of) conditionals.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
x86/msr: add suffix 'U' to MSR_AMD_CSTATE_CFG macro
This addresses violations of MISRA C:2012 Rule 7.2 which states the
following: A “u” or “U” suffix shall be applied to all integer constants
that are represented in an unsigned type.
No functional change.
Fixes: 652683e1aeaa ("x86/hvm: address violations of MISRA C:2012 Rule 7.2") Signed-off-by: Alessandro Zucchelli <alessandro.zucchelli@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
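A small sketch of what the rule is about, using a hypothetical constant (not the real MSR number) and assuming a 32-bit int: a hexadecimal constant too large for int already has an unsigned type, and the suffix only makes that visible, which is why such a change is not functional.

```c
#include <assert.h>

/* Hypothetical constants for illustration; assumes a 32-bit int. */
#define EXAMPLE_REG_IMPLICIT 0xc0000000   /* unsigned, but not visibly so */
#define EXAMPLE_REG_EXPLICIT 0xc0000000U  /* Rule 7.2 compliant spelling */

/* Evaluates to 1 when the expression has type unsigned int (C11 _Generic). */
#define IS_UNSIGNED_INT(e) _Generic((e), unsigned int: 1, default: 0)

/* Both spellings have the same type, so adding the suffix changes nothing. */
static_assert(IS_UNSIGNED_INT(EXAMPLE_REG_IMPLICIT), "implicitly unsigned");
static_assert(IS_UNSIGNED_INT(EXAMPLE_REG_EXPLICIT), "explicitly unsigned");

static int implicit_is_unsigned(void)
{
    return IS_UNSIGNED_INT(EXAMPLE_REG_IMPLICIT);
}

static int explicit_is_unsigned(void)
{
    return IS_UNSIGNED_INT(EXAMPLE_REG_EXPLICIT);
}

static int values_are_equal(void)
{
    return EXAMPLE_REG_IMPLICIT == EXAMPLE_REG_EXPLICIT;
}
```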
This addresses violations of MISRA C:2012 Rule 7.2 which states the
following: A “u” or “U” suffix shall be applied to all integer constants
that are represented in an unsigned type.
No functional change.
Signed-off-by: Alessandro Zucchelli <alessandro.zucchelli@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Introduce a xen-livepatch tool --force option, that's propagated into the
hypervisor for livepatch operations. The intention is for the option to be
used to bypass some checks that would otherwise prevent the patch from being
loaded.
Repurpose the pad field in xen_sysctl_livepatch_op to be a flags field that
applies to all livepatch operations. The flag is currently only set by the
hypercall wrappers for the XEN_SYSCTL_LIVEPATCH_UPLOAD operation, as that's
the only one where it will be used initially. Other uses can be added as
required.
Note that helpers used to set the .pad field to 0; that's been removed since
the structure is already zero-initialized at definition.
No functional usages of the new flag introduced in this patch.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
It's incorrect to restrict strncmp to the length of the command line input
parameter, as then a user passing a rune like:
% xen-livepatch up foo.livepatch
Would match against the "upload" command, because the string comparison has
been truncated to the length of the input argument. Use strcmp instead which
doesn't truncate. Otherwise in order to keep using strncmp we would need to
also check strings are of the same length before doing the comparison.
Fixes: 05bb8afedede ('xen-xsplice: Tool to manipulate xsplice payloads') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
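The difference between the two comparisons can be sketched as follows. The helper names are hypothetical and the code is not the actual xen-livepatch source; it only contrasts a strncmp() limited to the input length with a plain strcmp():

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Buggy variant: compares only as many bytes as the user typed. */
static bool matches_truncated(const char *input, const char *command)
{
    return strncmp(input, command, strlen(input)) == 0;
}

/* Fixed variant: the whole string, terminator included, must match. */
static bool matches_exact(const char *input, const char *command)
{
    return strcmp(input, command) == 0;
}

/* Self-check: "up" is a prefix of "upload" but not the same command. */
static void check_matching(void)
{
    assert(matches_truncated("up", "upload"));
    assert(!matches_exact("up", "upload"));
    assert(matches_exact("upload", "upload"));
}
```

With the truncated comparison even an empty argument matches every command, since a zero-length strncmp() always reports equality.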
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
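A minimal sketch of the hazard the rule guards against, using hypothetical macros that are not taken from the Xen tree:

```c
#include <assert.h>

/* Hypothetical macros for illustration; not taken from the Xen tree. */
#define SCALE_UNSAFE(x) (x * 4)    /* parameter not parenthesized */
#define SCALE_SAFE(x)   ((x) * 4)  /* Rule 20.7 compliant */

/* Expands to (a + b * 4): precedence silently changes the meaning. */
static int scale_unsafe(int a, int b)
{
    return SCALE_UNSAFE(a + b);
}

/* Expands to ((a + b) * 4): the argument is kept as one operand. */
static int scale_safe(int a, int b)
{
    return SCALE_SAFE(a + b);
}

/* Self-check of the two expansions for a = 1, b = 2. */
static void check_scaling(void)
{
    assert(scale_unsafe(1, 2) == 9);
    assert(scale_safe(1, 2) == 12);
}
```

When every existing caller passes a simple identifier or constant, adding the parentheses changes nothing, which matches the "No functional change" note.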
x86/debugreg: address violation of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
x86/vhpet: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
xen/spinlock: address violations of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
xen/page-defs: address violation of MISRA C Rule 20.7
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
MISRA C Rule 20.7 states: "Expressions resulting from the expansion
of macro parameters shall be enclosed in parentheses". Therefore, some
macro definitions should gain additional parentheses to ensure that all
current and future users will be safe with respect to expansions that
can possibly alter the semantics of the passed-in macro parameter.
No functional change.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jason Andryuk [Thu, 25 Apr 2024 07:47:52 +0000 (09:47 +0200)]
libxl: Support blktap with HVM device model
blktap exposes disks over UNIX socket Network Block Device (NBD).
Modify libxl__device_disk_find_local_path() to provide back the
QEMU-formatted NBD path. This allows tapdisk to be used for booting an
HVM.
Use the nbd+unix:/// format specified by the protocol at
https://github.com/NetworkBlockDevice/nbd/blob/master/doc/uri.md
Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
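As a rough sketch of the URI shape, assuming an empty export name and the socket path passed through the "socket" query parameter as the NBD URI document describes. format_nbd_unix_uri() is a hypothetical helper, not the libxl implementation, and it does no percent-encoding of the path:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes "nbd+unix:///?socket=<path>" into buf.
 * Returns 0 on success, -1 if the result was truncated or formatting failed.
 * The socket path is copied verbatim (no percent-encoding).
 */
static int format_nbd_unix_uri(char *buf, size_t len, const char *socket_path)
{
    int n = snprintf(buf, len, "nbd+unix:///?socket=%s", socket_path);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Self-check with an illustrative socket path. */
static void check_uri(void)
{
    char uri[64];

    assert(format_nbd_unix_uri(uri, sizeof(uri), "/tmp/nbd.sock") == 0);
    assert(strcmp(uri, "nbd+unix:///?socket=/tmp/nbd.sock") == 0);
}
```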
Jason Andryuk [Thu, 25 Apr 2024 07:46:56 +0000 (09:46 +0200)]
hotplug: Update block-tap
Implement a sharing check like the regular block script.
Checking tapback inside block-tap is too late since it needs to be
running to transition the backend to InitWait before block-tap is run.
tap-ctl check will be removed when the requirement for the blktap kernel
driver is removed. Remove it now as it is of limited use.
find_device() needs to be non-fatal to allow a sharing check.
Only write physical-device-path because that is all that tapback needs.
Also write_dev doesn't handle files and would incorrectly store
physical-device as 0:0 which would confuse the minor inside tapback.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Michal Orzel [Tue, 23 Apr 2024 16:11:21 +0000 (18:11 +0200)]
automation: Add arm64 test for running Xen with GICv3
At the moment, all the Arm64 Qemu tests use GICv2 which is the default
GIC version used by Qemu. Improve the coverage by adding a new test in
which Qemu will be configured to have GICv3.
Rename host device tree name to "virt.dtb" to be GIC version agnostic.
Use "gic-version" Qemu option to select the version to use. Unless the
test variant is set to "gicv3", version 2 will be used.
Signed-off-by: Michal Orzel <michal.orzel@amd.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Michal Orzel [Tue, 23 Apr 2024 16:11:20 +0000 (18:11 +0200)]
automation: Add arm{64,32} earlyprintk jobs
Introduce qemu based Arm earlyprintk test and build jobs to cover this
feature in debug variant. The tests simply check for the presence of the
last message printed by the bootstrap code before entering the C world.
Signed-off-by: Michal Orzel <michal.orzel@amd.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Michal Orzel [Tue, 23 Apr 2024 16:11:19 +0000 (18:11 +0200)]
automation: Drop some of the non-debug variants of the same Arm jobs
To save some bandwidth that can be later on used to increase the test
coverage by adding new tests, drop the following non-debug test/build
jobs existing in both debug and non-debug variants:
- static memory (arm64, arm32)
- static shared memory (arm64)
- static heap (arm64)
- boot cpupools (arm64)
- gzip (arm32)
More generic tests existing in both variants were left unmodified.
Signed-off-by: Michal Orzel <michal.orzel@amd.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
xen/arm: List static shared memory regions as /memory nodes
Currently Xen is not exporting the static shared memory regions
to the device tree as /memory nodes; this commit fixes this
issue.
Given that now make_memory_node needs a parameter 'struct kernel_info'
in order to call the new function shm_mem_node_fill_reg_range,
take the occasion to remove the unused struct domain parameter.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
xen/arm: fix duplicate /reserved-memory node in Dom0
In case there is a /reserved-memory node already present in the host
dtb, current Xen code would create yet another /reserved-memory node
when the static shared memory feature is enabled and static shared
memory regions are present.
This would result in an incorrect device tree generation and hwdom
would not be able to detect the static shared memory region.
Avoid this issue by checking the presence of the /reserved-memory
node and appending the nodes instead of generating a duplicate
/reserved-memory.
Make make_shm_memory_node externally visible and rename it to
make_shm_resv_memory_node to make clear it produces children for
/reserved-memory.