Roger Pau Monne [Thu, 16 Jan 2025 08:07:31 +0000 (09:07 +0100)]
automation/cirrus-ci: update FreeBSD to 13.4
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
Jan Beulich [Fri, 17 Jan 2025 07:54:03 +0000 (08:54 +0100)]
xl: properly dispose of libxl_dominfo struct instances
The ssid_label field requires separate freeing; make sure to call
libxl_dominfo_dispose() as well as libxl_dominfo_init(). Since vcpuset()
calls only the former, add a call to the latter there at the same time.
Coverity-ID: 1638727
Coverity-ID: 1638728 Fixes: c458c404da16 ("xl: use libxl_domain_info to get the uuid in printf_info") Fixes: 48dab9767d2e ("tools/xl: use libxl_domain_info to get domain type for vcpu-pin") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
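The init/dispose pairing described in this commit can be sketched in plain C. The `demo_dominfo` type and its helpers are hypothetical stand-ins, not libxl's API; they only mirror the idea that a struct owning a heap string (such as `ssid_label`) must go through a dispose call on every path after it has been initialised.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a struct that owns a heap-allocated field. */
struct demo_dominfo {
    char *ssid_label;   /* owned; released by demo_dominfo_dispose() */
    int domid;
};

void demo_dominfo_init(struct demo_dominfo *info)
{
    memset(info, 0, sizeof(*info));
}

void demo_dominfo_dispose(struct demo_dominfo *info)
{
    free(info->ssid_label);
    info->ssid_label = NULL;
}

/* Stand-in for a query that fills the struct and allocates the label. */
int demo_domain_info(struct demo_dominfo *info, int domid)
{
    static const char label[] = "demo_label";

    assert(info->ssid_label == NULL);   /* caller must have run init */
    info->ssid_label = malloc(sizeof(label));
    if (info->ssid_label == NULL)
        return -1;
    memcpy(info->ssid_label, label, sizeof(label));
    info->domid = domid;
    return 0;
}

/* Every path after init goes through dispose, including the error path. */
int demo_lookup_domid(int domid)
{
    struct demo_dominfo info;
    int rc = -1;

    demo_dominfo_init(&info);
    if (demo_domain_info(&info, domid) == 0)
        rc = info.domid;
    demo_dominfo_dispose(&info);
    return rc;
}
```

Keeping init and dispose in the same function makes a missing call easy to spot in review.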
Jan Beulich [Fri, 17 Jan 2025 07:53:27 +0000 (08:53 +0100)]
xentrace: free CPU mask string before overwriting pointer
While multiple -c options may be unexpected, we'd still better deal with
them properly.
Also restore the blank line that was bogusly zapped by the same commit.
Coverity-ID: 1638723 Fixes: e4ad2836842a ("xentrace: Implement cpu mask range parsing of human values (-c)") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
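A minimal sketch of the "free before overwriting" pattern follows. The `demo_opts` structure and `demo_set_cpu_mask()` are hypothetical names chosen for illustration and are not taken from xentrace; the point is only that a repeated option must release the string kept from the previous occurrence.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical option storage; not xentrace's actual structure. */
struct demo_opts {
    char *cpu_mask_str;   /* owned copy of the most recent -c argument */
};

/* Store a copy of arg, releasing any value kept from an earlier -c. */
int demo_set_cpu_mask(struct demo_opts *opts, const char *arg)
{
    size_t len;
    char *copy;

    assert(opts != NULL && arg != NULL);
    len = strlen(arg) + 1;
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, arg, len);

    free(opts->cpu_mask_str);   /* free(NULL) is a no-op on first use */
    opts->cpu_mask_str = copy;
    return 0;
}
```

Allocating the new copy before freeing the old one means a failed allocation leaves the previous value intact.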
Bernhard Kaindl [Wed, 15 Jan 2025 15:09:04 +0000 (16:09 +0100)]
docs/misc: Fix a few typos
While skimming through the misc docs, I spotted a few typos.
Signed-off-by: Bernhard Kaindl <bernhard.kaindl@cloud.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Roger Pau Monne [Tue, 14 Jan 2025 14:10:14 +0000 (15:10 +0100)]
automation/gitlab: disable coverage from clang randconfig
If randconfig enables coverage support the build times out due to GNU LD
taking too long. For the time being prevent coverage from being enabled in
clang randconfig job.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Release-Acked-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
****************************************
Panic on CPU 0:
FATAL PAGE FAULT
[error_code=0011]
Faulting linear address: 0000000062ccfa70
****************************************
Swap the preference to default to CMOS first, and EFI later, in an attempt to
use EFI_GET_TIME as a last resort option only. Note that Linux for example
doesn't allow calling the get_time method, and instead provides a dummy handler
that unconditionally returns EFI_UNSUPPORTED on x86-64.
Such change in the preferences requires some re-arranging of the function
logic, so that panic messages with workaround suggestions are suitably printed.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-By: Oleksii Kurochko<oleksii.kurochko@gmail.com> Release-Acked-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
x86/time: introduce command line option to select wallclock
Allow setting the used wallclock from the command line. When the option is set
to a value different than `auto` the probing is bypassed and the selected
implementation is used (as long as it's available).
The `xen` and `efi` options require being booted as a Xen guest (with Xen guest
support built in) or from UEFI firmware respectively.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Release-Acked-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
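The selection logic from the two wallclock commits above might be modelled roughly as below. All names are hypothetical. Two points are assumptions of the sketch rather than statements from the commit messages: the position of the Xen guest source in the probe order, and what is returned when an explicitly requested source is unavailable. The CMOS-before-EFI ordering and the bypass of probing for non-`auto` values come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_wallclock { DEMO_WC_NONE, DEMO_WC_XEN, DEMO_WC_CMOS, DEMO_WC_EFI };

struct demo_platform {
    bool xen_guest;     /* booted as a Xen guest */
    bool cmos_rtc;      /* CMOS RTC responds to probing */
    bool efi_runtime;   /* booted from UEFI firmware */
};

static bool demo_available(const struct demo_platform *p,
                           enum demo_wallclock wc)
{
    switch (wc) {
    case DEMO_WC_XEN:  return p->xen_guest;
    case DEMO_WC_CMOS: return p->cmos_rtc;
    case DEMO_WC_EFI:  return p->efi_runtime;
    default:           return false;
    }
}

/* requested == DEMO_WC_NONE models the `auto` setting. */
enum demo_wallclock demo_pick_wallclock(const struct demo_platform *p,
                                        enum demo_wallclock requested)
{
    /* CMOS is tried before EFI, so EFI is a last resort. */
    static const enum demo_wallclock order[] = {
        DEMO_WC_XEN, DEMO_WC_CMOS, DEMO_WC_EFI
    };
    size_t i;

    assert(p != NULL);

    /* An explicit request bypasses probing entirely. */
    if (requested != DEMO_WC_NONE)
        return demo_available(p, requested) ? requested : DEMO_WC_NONE;

    for (i = 0; i < sizeof(order) / sizeof(order[0]); i++)
        if (demo_available(p, order[i]))
            return order[i];

    return DEMO_WC_NONE;
}
```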
Roger Pau Monne [Tue, 14 Jan 2025 11:08:22 +0000 (12:08 +0100)]
automation/eclair: make Misra rule 20.7 blocking
There are no violations left, make the rule globally blocking for both x86
and ARM.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Oleksii Kurochko<oleksii.kurochko@gmail.com>
Rule 11.8 states the following: "A cast shall not remove any `const' or
`volatile' qualification from the type pointed to by a pointer".
Function `__hvm_copy' in `xen/arch/x86/hvm/hvm.c' is a double-use
function, where the parameter needs to not be const because it can be
set for write or not. As it was decided a new const-only function will
lead to more developer confusion than it's worth, this violation is
addressed by deviating the function.
All cases of casting away const-ness are accompanied with a comment
explaining why it is safe given the other flags passed in; such comment is used
by the deviation in order to match the appropriate function call.
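The shape of such a double-use function, and of the commented cast the deviation matches on, can be illustrated with a small sketch. The names and the flag below are hypothetical and far simpler than `__hvm_copy'; they only show why the buffer parameter cannot be const and why the cast is safe when the flag makes the worker read-only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_COPY_TO_GUEST (1u << 0)   /* hypothetical direction flag */

/* Dual-use worker: buf is written only when DEMO_COPY_TO_GUEST is clear. */
static void demo_copy(void *buf, unsigned char *guest, size_t len,
                      unsigned int flags)
{
    assert(buf != NULL && guest != NULL);
    if (flags & DEMO_COPY_TO_GUEST)
        memcpy(guest, buf, len);   /* buf is only read */
    else
        memcpy(buf, guest, len);   /* buf is written */
}

void demo_copy_to_guest(unsigned char *guest, const void *src, size_t len)
{
    /* Safe to cast away const: DEMO_COPY_TO_GUEST means buf is only read. */
    demo_copy((void *)src, guest, len, DEMO_COPY_TO_GUEST);
}

void demo_copy_from_guest(void *dst, unsigned char *guest, size_t len)
{
    demo_copy(dst, guest, len, 0);
}
```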
Petr Beneš [Thu, 2 Jan 2025 17:13:28 +0000 (17:13 +0000)]
x86: Add Support for Paging-Write Feature
This patch introduces a new XENMEM_access_r_pw permission.
Functionally, it is similar to XENMEM_access_r, but for processors
with TERTIARY_EXEC_EPT_PAGING_WRITE support (Intel 12th Gen/Alder Lake
and later, Xeon 4th Gen/Sapphire Rapids and later), it also permits the
CPU to write to the page during guest page-table walks (e.g., updating
A/D bits) without triggering an EPT violation.
This behavior works by both enabling the EPT paging-write feature and
setting the EPT paging-write flag in the EPT leaf entry.
This feature provides a significant performance boost for
introspection tools that monitor guest page-table updates. Previously,
every page-table modification by the guest—including routine updates
like setting A/D bits—triggered an EPT violation, adding unnecessary
overhead. The new XENMEM_access_r_pw permission allows these
"uninteresting" updates to occur without EPT violations, improving
efficiency.
Additionally, this feature simplifies the handling of race conditions
in scenarios where an introspection tool:
- Sets an "invisible breakpoint" in the altp2m view for a function F.
- Monitors guest page-table updates to track whether the page
containing F is paged out.
- Encounters a cleared Access (A) bit on the page containing F while
the guest is about to execute the breakpoint.
In the current implementation:
- If xc_monitor_inguest_pagefault() is enabled, the introspection tool
must emulate both the breakpoint and the setting of the Access bit.
- If xc_monitor_inguest_pagefault() is disabled, Xen handles the EPT
violation without notifying the introspection tool, setting the
Access bit and emulating the instruction. However, Xen fetches the
instruction from the default view instead of the altp2m view,
potentially causing the breakpoint to be missed.
With this patch, setting XENMEM_access_r_pw for monitored guest
page-tables prevents EPT violations in these cases. This change
enhances performance and reduces complexity for introspection tools,
ensuring seamless breakpoint handling while tracking guest page-table
updates.
Signed-off-by: Petr Beneš <w1benny@gmail.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Petr Beneš [Thu, 2 Jan 2025 17:13:27 +0000 (17:13 +0000)]
x86: Rename _rsvd field to pw and move it to the bit 58
The EPT Paging-write feature (when enabled by the
TERTIARY_EXEC_EPT_PAGING_WRITE bit) uses bit 58 of the EPT entry to
indicate that guest paging may update the page, even if the W access
is not set.
This patch is a preparation for the EPT Paging-write feature.
Signed-off-by: Petr Beneš <w1benny@gmail.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Jan Beulich <jbeulich@suse.com>
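The effect of the paging-write bit described in the two commits above can be sketched on a simplified EPT leaf entry. The macro and function names are hypothetical, and the entry omits memory type and other fields; only the read/write/execute bits (0 to 2) and bit 58 for paging-write are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_EPT_R  (UINT64_C(1) << 0)
#define DEMO_EPT_W  (UINT64_C(1) << 1)
#define DEMO_EPT_X  (UINT64_C(1) << 2)
#define DEMO_EPT_PW (UINT64_C(1) << 58)   /* paging-write, per the text */

/* Build a read-only leaf, optionally writable by guest page-table walks. */
uint64_t demo_ept_leaf_r_pw(uint64_t mfn, bool paging_write)
{
    uint64_t e;

    assert((mfn >> 40) == 0);   /* frame number must stay below bit 52 */
    e = (mfn << 12) | DEMO_EPT_R;
    if (paging_write)
        e |= DEMO_EPT_PW;
    return e;
}

/* An ordinary guest store needs the W bit. */
bool demo_guest_write_allowed(uint64_t e)
{
    return (e & DEMO_EPT_W) != 0;
}

/* A page-walk update (e.g. A/D bits) is allowed by W or by PW. */
bool demo_walk_write_allowed(uint64_t e)
{
    return (e & (DEMO_EPT_W | DEMO_EPT_PW)) != 0;
}
```

With the paging-write bit set, ordinary guest stores to the page would still fault while A/D updates by the page walker would not, which is the distinction XENMEM_access_r_pw is meant to expose.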
Use the solution described in [1] to provide a wrapper to the 'date'
command that uses SOURCE_DATE_EPOCH if available. This is needed for
reproducible builds.
The -d "@..." syntax was introduced in GNU date about 2005 (but only
added to the documentation in 2011), so I assume a version supporting
this syntax is available, if SOURCE_DATE_EPOCH is defined. If
SOURCE_DATE_EPOCH is not defined, nothing changes with respect to the
current behavior.
Update all users of 'date' in the tree to use the new wrapper.
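The wrapper in the tree is a shell-level one around 'date'; the same decision can be sketched in C for illustration. The function names are hypothetical. Falling back to the current time on an unparsable value and formatting in UTC are choices made for this sketch, not behaviour stated above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Pick the build timestamp: epoch_str if set and valid, else fallback. */
time_t demo_build_time(const char *epoch_str, time_t fallback)
{
    char *end;
    long long v;

    if (epoch_str == NULL || *epoch_str == '\0')
        return fallback;

    v = strtoll(epoch_str, &end, 10);
    if (*end != '\0' || v < 0)
        return fallback;

    return (time_t)v;
}

/* Format as YYYY-MM-DD in UTC so the local timezone does not leak in. */
size_t demo_format_date(time_t t, char *buf, size_t len)
{
    struct tm *tm;

    assert(buf != NULL);
    tm = gmtime(&t);
    return tm ? strftime(buf, len, "%Y-%m-%d", tm) : 0;
}
```

A caller would pass `getenv("SOURCE_DATE_EPOCH")` and `time(NULL)` as the two arguments of `demo_build_time()`.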
Not having ppc and riscv included in DOC_ARCHES causes "multiple
definitions of ..." message on documentation build, similar to the
example shown below:
include/public/arch-ppc.h:91: multiple definitions of Typedef
vcpu_guest_core_regs_t: include/public/arch-arm.h:300
include/public/arch-ppc.h:91: multiple definitions of Typedef
vcpu_guest_core_regs_t: include/public/arch-ppc.h:85
It can also make the generated html documentation link to the header
files of another architecture. This is additionally a problem as it can
randomly make the documentation build non-reproducible.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 9 Jan 2025 15:06:34 +0000 (15:06 +0000)]
Update Xen version to 4.20-rc
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 9 Jan 2025 15:10:01 +0000 (15:10 +0000)]
Config.mk: Pin QEMU_UPSTREAM_REVISION
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Release-Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Commit a14593e3995a ("xen/device-tree: Allow region overlapping with
/memreserve/ ranges") introduced a type in the 'struct membanks_hdr'
but forgot to update the 'struct kernel_info' initialiser. While
this doesn't lead to failures because the field is not currently
used while managing kernel_info structures, it's good to have it
for completeness.
There are other instances of structures using 'struct membanks_hdr'
that are dynamically allocated and don't fully initialise these
fields; provide a static inline helper for that.
Fixes: a14593e3995a ("xen/device-tree: Allow region overlapping with /memreserve/ ranges") Signed-off-by: Luca Fancellu <luca.fancellu@arm.com> Reviewed-by: Michal Orzel <michal.orzel@amd.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Juergen Gross [Thu, 9 Jan 2025 16:34:01 +0000 (17:34 +0100)]
xen/events: fix race with set_global_virq_handler()
There is a possible race scenario between set_global_virq_handler()
and clear_global_virq_handlers() targeting the same domain, which
might result in that domain ending up as a zombie domain.
In case set_global_virq_handler() is being called for a domain which
is just dying, it might happen that clear_global_virq_handlers() is
running first, resulting in set_global_virq_handler() taking a new
reference for that domain and entering it in the global_virq_handlers[]
array afterwards. The reference will never be dropped, thus the domain
will never be freed completely.
This can be fixed by checking the is_dying state of the domain inside
the region guarded by global_virq_handlers_lock. In case the domain is
dying, handle it as if the domain wouldn't exist, which will be the
case in near future anyway.
Fixes: 87521589aa6a ("xen: allow global VIRQ handlers to be delegated to other domains") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
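The ordering this fix relies on can be modelled with a small sketch. Everything here is hypothetical and much simpler than Xen's event channel code: a single handler slot, a plain reference counter, and a spinlock built on C11 `atomic_flag`. It only shows that the dying check sits inside the locked region, so a set that runs after the clear can no longer take a reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_domain {
    int refcnt;
    bool is_dying;   /* set before teardown clears the handlers */
};

static atomic_flag demo_lock = ATOMIC_FLAG_INIT;
static struct demo_domain *demo_handler;   /* one slot of the array */

static void demo_lock_acquire(void)
{
    while (atomic_flag_test_and_set(&demo_lock)) {
        /* spin until the lock is released */
    }
}

static void demo_lock_release(void)
{
    atomic_flag_clear(&demo_lock);
}

/* Returns true if d was installed as the handler. */
bool demo_set_handler(struct demo_domain *d)
{
    bool installed = false;

    assert(d != NULL);
    demo_lock_acquire();
    /* Checked under the lock: a dying domain is treated as nonexistent. */
    if (!d->is_dying) {
        if (demo_handler != NULL)
            demo_handler->refcnt--;
        d->refcnt++;
        demo_handler = d;
        installed = true;
    }
    demo_lock_release();
    return installed;
}

/* Teardown path: drop the reference held through the handler slot. */
void demo_clear_handler(struct demo_domain *d)
{
    demo_lock_acquire();
    if (demo_handler == d) {
        d->refcnt--;
        demo_handler = NULL;
    }
    demo_lock_release();
}
```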
In file included from arch/arm/tee/ffa.c:72:
arch/arm/tee/ffa_private.h:329:17: error: 'used' attribute ignored on a non-definition declaration [-Werror,-Wignored-attributes]
extern uint32_t __ro_after_init ffa_fw_version;
^
The variable ffa_fw_version is only used in ffa.c. Remove the
declaration in the header and make the definition in ffa.c static.
Fixes: 2f9f240a5e87 ("xen/arm: ffa: Fine granular call support") Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Andrew Cooper [Wed, 8 Jan 2025 12:05:38 +0000 (12:05 +0000)]
CI: Update Fedora to 41
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Michal Orzel [Wed, 8 Jan 2025 07:57:19 +0000 (08:57 +0100)]
xen/arm64: Drop relocate_and_switch_ttbr() stub
In the original patch e7a80636f16e ("xen/arm: add cache coloring support
for Xen image"), the stub was added under the wrong assumption that DCE
won't remove the function call if it's not static. This assumption is
incorrect as we already rely on DCE for cases like this one. Therefore
drop the stub, that otherwise would be a place potentially prone to
errors in the future.
Michal Orzel [Tue, 7 Jan 2025 09:27:19 +0000 (10:27 +0100)]
xen/flask: Wire up XEN_DOMCTL_set_llc_colors
Addition of FLASK permission for this hypercall was overlooked in the
original patch. Fix it. Setting LLC colors is only possible during domain
creation.
Fixes: 6985aa5e0c3c ("xen: extend domctl interface for cache coloring") Signed-off-by: Michal Orzel <michal.orzel@amd.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Michal Orzel [Tue, 7 Jan 2025 09:27:18 +0000 (10:27 +0100)]
xen/flask: Wire up XEN_DOMCTL_dt_overlay
Addition of FLASK permission for this hypercall was overlooked in the
original patch. Fix it. The only dt overlay operation is attaching that can
happen only after the domain is created. Dom0 can attach overlay to itself
as well.
Fixes: 4c733873b5c2 ("xen/arm: Add XEN_DOMCTL_dt_overlay and device attachment to domains") Signed-off-by: Michal Orzel <michal.orzel@amd.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Michal Orzel [Tue, 7 Jan 2025 09:27:17 +0000 (10:27 +0100)]
xen/flask: Wire up XEN_DOMCTL_vuart_op
Addition of FLASK permission for this hypercall was overlooked in the
original patch. Fix it. The only VUART operation is initialization that
can occur only during domain creation.
Fixes: 86039f2e8c20 ("xen/arm: vpl011: Add a new domctl API to initialize vpl011") Signed-off-by: Michal Orzel <michal.orzel@amd.com> Release-Acked-By: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
All selector fields under ctxt->regs are (normally) poisoned in the HVM
case, and the four ones besides CS and SS are potentially stale for PV.
Avoid using them in the hypervisor incarnation of the emulator, when
trying to cover for a missing ->read_segment() hook.
To make sure there's always a valid ->read_segment() handler for all HVM
cases, add a respective function to shadow code, even if it is not
expected for FPU insns to be used to update page tables.
Fixes: 0711b59b858a ("x86emul: correct FPU code/data pointers and opcode handling") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 8 Jan 2025 10:01:17 +0000 (11:01 +0100)]
x86emul: VCVT{,U}DQ2PD ignores embedded rounding
IOW we shouldn't raise #UD in that case. Be on the safe side though and
only encode fully legitimate forms into the stub to be executed.
Things weren't quite right for VCVT{,U}SI2SD either, in the attempt to
be on the safe side: Clearing EVEX.L'L isn't useful; it's EVEX.b which
primarily needs clearing. Also reflect the somewhat improved doc
situation in the comment there.
Fixes: ed806f373730 ("x86emul: support AVX512F legacy-equivalent packed int/FP conversion insns") Fixes: baf4a376f550 ("x86emul: support AVX512F legacy-equivalent scalar int/FP conversion insns") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Sun, 29 Dec 2024 18:18:22 +0000 (18:18 +0000)]
xen/perfc: Add perfc_defn.h to asm-generic
... and hook it up for RISC-V and PPC.
On RISC-V at least, no combination of headers pulls in errno.h, so include it
explicitly.
Guard the hypercalls array declaration based on NR_hypercalls existing. This
is sufficient to get PERF_COUNTERS fully working on RISC-V and PPC, so drop
the randconfig override.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Oleksii Kurohcko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 2 Jan 2025 19:46:19 +0000 (19:46 +0000)]
x86/pv: Fix build with Clang and CONFIG_PERF_COUNTERS
Clang, of at least version 17, complains:
arch/x86/pv/hypercall.c:30:10: error: variable 'eax' is used uninitialized
whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
30 | if ( !compat )
| ^~~~~~~
arch/x86/pv/hypercall.c:87:29: note: uninitialized use occurs here
87 | perfc_incra(hypercalls, eax);
| ^~~
This function is forced always_inline to cause compat to be
constant-propagated through, but that is only a heuristic to try and get the
compiler to do what we want, not a guarantee that it does.
Clang doesn't appear to be able to see that the only case where compat is
true (and therefore the if() is false) is when there's an else clause on the
end which sets eax too.
Initialise eax to -1, which ought to be optimised out, but if for whatever
reason it happens not to be, then perfc_incra() will fail its bounds check
and do nothing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
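Why -1 is a harmless initial value can be seen in a reduced model of a bounds-checked counter array. The names and the array size are hypothetical; the real perfc_incra() and hypercall path are more involved.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NR_HYPERCALLS 64   /* hypothetical table size */

static unsigned long demo_counters[DEMO_NR_HYPERCALLS];

/* Array increment with a bounds check: out-of-range indices are ignored. */
static void demo_perfc_incra(unsigned int idx)
{
    if (idx < DEMO_NR_HYPERCALLS)
        demo_counters[idx]++;
}

unsigned long demo_count(unsigned int idx)
{
    assert(idx < DEMO_NR_HYPERCALLS);
    return demo_counters[idx];
}

/* have_nr == false models a path on which eax is never assigned. */
void demo_hypercall(bool have_nr, unsigned int nr)
{
    unsigned int eax = (unsigned int)-1;   /* UINT_MAX: fails the check */

    if (have_nr)
        eax = nr;

    demo_perfc_incra(eax);
}
```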
Andrew Cooper [Tue, 31 Dec 2024 14:06:19 +0000 (14:06 +0000)]
x86/traps: Rework LER initialisation and support Zen5/Diamond Rapids
AMD have always used the architectural MSRs for LER. As the first processor
to support LER was the K7 (which was 32bit), we can assume its presence
unconditionally in 64bit mode.
Intel are about to run out of space in Family 6 and start using 19. It is
only the Pentium 4 which uses non-architectural LER MSRs.
percpu_traps_init(), which runs on every CPU, contains a lot of code which
should be init-only, and is the only reason why opt_ler can't be in initdata.
Write a brand new init_ler() which expects all future Intel and AMD CPUs to
continue using the architectural MSRs, and does all setup together. Call it
from trap_init(), and remove the setup logic from percpu_traps_init() except for
the single path configuring MSR_IA32_DEBUGCTLMSR.
Leave behind a warning if the user asked for LER and Xen couldn't enable it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Nicola Vetrini [Sun, 22 Dec 2024 14:04:08 +0000 (15:04 +0100)]
eclair-analysis: tidy toolchain.ecl configuration and mark Rule 1.1 clean
Reformat the list of GNU extensions and non-standard tokens used by Xen
in the ECLAIR configuration to make it easier to review any changes to it.
The extension "ext_missing_varargs_arg", which captures the GNU extension that
allows variadic functions and macros not to require at least one named parameter
before C23 has been renamed to "ext_c_missing_varargs_arg" in the current version
of ECLAIR used in CI, therefore this resolves regressions on MISRA C Rule 1.1:
"The program shall contain no violations of the standard C syntax and constraints,
and shall not exceed the implementation's translation limits."
As a result, Rule 1.1 now has no violations and is tagged as such.
Remove two unused configurations, that were already commented out.
Signed-off-by: Nicola Vetrini <nicola.vetrini@bugseng.com> Fixes: 631f535a3d4f ("xen: update ECLAIR service identifiers from MC3R1 to MC3A2.") Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Mon, 25 Mar 2024 15:14:46 +0000 (15:14 +0000)]
x86/spec-ctrl: Support for SRSO_U/S_NO and SRSO_MSR_FIX
AMD have updated the SRSO whitepaper[1] with further information. These
features exist on AMD Zen5 CPUs and are necessary for Xen to use.
The two features are in principle unrelated:
* SRSO_U/S_NO is an enumeration saying that SRSO attacks can't cross the
User(CPL3) / Supervisor(CPL<3) boundary. i.e. Xen don't need to use
IBPB-on-entry for PV64. PV32 guests are explicitly unsupported for
speculative issues, and excluded from consideration for simplicity.
* SRSO_MSR_FIX is an enumeration identifying that the BP_SPEC_REDUCE bit is
available in MSR_BP_CFG. When set, SRSO attacks can't cross the host/guest
boundary. i.e. Xen don't need to use IBPB-on-entry for HVM.
Extend ibpb_calculations() to account for these when calculating
opt_ibpb_entry_{pv,hvm} defaults. Add a `bp-spec-reduce=<bool>` option to
control the use of BP_SPEC_REDUCE, with it active by default.
Because MSR_BP_CFG is core-scoped with a race condition updating it, repurpose
amd_check_erratum_1485() into amd_check_bp_cfg() and calculate all updates at
once.
Xen also needs to advertise SRSO_U/S_NO to guests to allow the guest kernel
to skip SRSO mitigations too:
* This is trivial for HVM guests. It is also accurate for PV32 guests
too, but we have already excluded them from consideration, and do so again
here to simplify the policy logic.
* As written, SRSO_U/S_NO does not help for the PV64 user->kernel boundary.
However, after discussing with AMD, an implementation detail of having
BP_SPEC_REDUCE active causes the PV64 user->kernel boundary to have the
property described by SRSO_U/S_NO, so we can advertise SRSO_U/S_NO to
guests when the BP_SPEC_REDUCE precondition is met.
Finally, fix a typo in the SRSO_NO's comment.
[1] https://www.amd.com/content/dam/amd/en/documents/corporate/cr/speculative-return-stack-overflow-whitepaper.pdf Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
xen/arch/x86: make objdump output user locale agnostic
The objdump output is fed to grep, so make sure it doesn't change with
different user locales and break the grep parsing.
This problem was identified while updating xen in Debian and the fix is
needed for generating reproducible builds in varying environments.
Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Jackson [Mon, 30 Dec 2024 21:00:29 +0000 (22:00 +0100)]
docs/man/xen-vbd-interface.7: Provide properly-formatted NAME section
This manpage was omitted from
docs/man: Provide properly-formatted NAME sections
(423c4def1f7a01eeff56fa70564180640ef3af43)
because I was previously building with markdown not installed.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com> Tested-by: Maximilian Engelhardt <maxi@daemonizer.de> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 17 Jan 2023 12:45:37 +0000 (12:45 +0000)]
tools: Introduce a non-truncating xc_xenver_changeset()
Update libxl and the ocaml stubs to match. No API/ABI change in either.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com>
Andrew Cooper [Tue, 17 Jan 2023 12:39:48 +0000 (12:39 +0000)]
tools: Introduce a non-truncating xc_xenver_capabilities()
Update libxl and the ocaml stubs to match. No API/ABI change in either.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com>
Andrew Cooper [Mon, 16 Jan 2023 16:56:17 +0000 (16:56 +0000)]
tools: Introduce a non-truncating xc_xenver_extraversion()
... which uses XENVER_extraversion2.
In order to do this sensibly, use manual hypercall buffer handling. Not only
does this avoid an extra bounce buffer (we need to strip the xen_varbuf_t
header anyway), it's also shorter and easier to follow.
Update libxl and the ocaml stubs to match. No API/ABI change in either.
With this change made, `xl info` can now correctly access a >15 char
extraversion:
# xl info xen_version
4.18-unstable+REALLY LONG EXTRAVERSION
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com>
Andrew Cooper [Mon, 16 Jan 2023 14:40:07 +0000 (14:40 +0000)]
tools/libxc: Move xc_version() out of xc_private.c into its own file
kexec-tools uses xc_version(), meaning that it is not a private API. As we're
going to extend the functionality substantially, move it to its own file.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Andrew Cooper [Tue, 20 Dec 2022 16:45:23 +0000 (16:45 +0000)]
xen/version: Misc style fixes
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Andrew Cooper [Tue, 3 Jan 2023 19:06:43 +0000 (19:06 +0000)]
xen/version: Fold build_id handling into xenver_varbuf_op()
struct xen_build_id and struct xen_varbuf are identical from an ABI point of
view, so XENVER_build_id can reuse xenver_varbuf_op() rather than having its
own almost identical copy of the logic.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
In XenServer, we have encountered problems caused by both XENVER_extraversion
and XENVER_commandline having fixed bounds.
More than just the invariant size, the APIs/ABIs are also broken by typedef-ing an
array, and using an unqualified 'char' which has implementation-specific
signed-ness.
Provide brand new ops, which are capable of expressing variable length
strings, and mark the older ops as broken.
This fixes all issues around XENVER_extraversion being longer than 15 chars.
Further work beyond just this API is needed to remove other assumptions about
XENVER_commandline being 1023 chars long.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
---
Non-technical objections to this patch were raised, and subsequently rejected
by a community wide vote. The results of the vote have not been shared with
the community at the time of committing.
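The difference between a fixed-bound and a variable-length string interface can be sketched as follows. The function names are hypothetical and this is not the hypercall ABI. The 16-byte bound matches the "15 chars" limit mentioned above, while the convention of returning the full length (and accepting a NULL buffer for a size query) is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_FIXED_LEN 16   /* room for 15 chars plus the terminator */

/* Old style: fixed-size destination, silently truncating. */
void demo_get_fixed(char dst[DEMO_FIXED_LEN], const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL);
    n = strlen(src);
    if (n > DEMO_FIXED_LEN - 1)
        n = DEMO_FIXED_LEN - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/*
 * New style: returns the full length of the string. With buf == NULL the
 * caller only queries the size; otherwise at most len bytes are copied,
 * without a terminator.
 */
size_t demo_get_var(char *buf, size_t len, const char *src)
{
    size_t need;

    assert(src != NULL);
    need = strlen(src);
    if (buf != NULL)
        memcpy(buf, src, need < len ? need : len);
    return need;
}
```

A caller of the variable-length form can query the size first, allocate, and then fetch the whole string, so no compile-time bound leaks into the interface.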
Andrew Cooper [Fri, 13 Jan 2023 17:20:41 +0000 (17:20 +0000)]
xen/version: Calculate xen_capabilities_info once at boot
The arch_get_xen_caps() infrastructure is horribly inefficient for something
that is constant after features have been resolved on boot.
Every instance used snprintf() to format constants into a string (which gets
shorter when %d gets resolved!), and which get double buffered on the stack.
Switch to using string literals with the "3.0" inserted - these numbers
haven't changed in 19 years; the Xen 3.0 release was Dec 5th 2005.
Use initcalls to format the data into xen_cap_info, which is deliberately not
of type xen_capabilities_info_t because a 1k array is a silly overhead for
storing a maximum of 77 chars (the x86 version) and isn't liable to need any
more space in the foreseeable future. RISC-V and PPC have their stub dropped,
with the expectation that they won't carry this legacy interface forward.
This speeds up the XENVER_capabilities hypercall, but the purpose of the
change is to allow us to introduce a better XENVER_* API that doesn't force
the use of a 1k buffer on the stack.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Jan Beulich <jbeulich@suse.com>
Describe the layer which enables SCMI over SMC calls forwarding
to EL3 FW if issued by the Hardware domain. If the SCMI firmware
node is not found in the Host DT during initialization, it fails
silently as it's not mandatory.
The SCMI SMCs trapping at EL2 now lets hwdom perform SCMI ops for
interacting with system-level resources almost as if it would be
running natively.
Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> Acked-by: Michal Orzel <michal.orzel@amd.com>
Platforms based on NXP S32G3 processors use the NXP LINFlexD
UART driver for console by default, and rely on Dom0 having
access to SCMI services for system-level resources from
firmware at EL3.
Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
xen/arm: firmware: Add SCMI over SMC calls handling layer
Introduce the SCMI-SMC layer to have some basic degree of
awareness about SCMI calls that are based on the ARM System
Control and Management Interface (SCMI) specification (DEN0056E).
The SCMI specification includes various protocols for managing
system-level resources, such as: clocks, pins, reset, system power,
power domains, performance domains, etc. The clients are named
"SCMI agents" and the server is named "SCMI platform".
Only support the shared-memory based transport with SMCs as
the doorbell mechanism for notifying the platform. Also, this
implementation only handles the "arm,scmi-smc" compatible,
requiring the following properties:
- "arm,smc-id" (unique SMC ID)
- "shmem" (one or more phandles pointing to shmem zones
for each channel)
The initialization is done as initcall, since we need
SMCs, and PSCI should already probe EL3 FW for SMCCC support.
If no "arm,scmi-smc" compatible node is found in the host
DT, the initialization fails silently, as it's not mandatory.
Otherwise, we get the 'arm,smc-id' DT property from the node,
to know the SCMI SMC ID we handle. The 'shmem' memory ranges
are not validated, as the SMC calls are only passed through
to EL3 FW if coming from the hardware domain.
Create a new 'firmware' folder to keep the SCMI code separate
from the generic ARM code.
Signed-off-by: Andrei Cherechesu <andrei.cherechesu@nxp.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Michal Orzel <michal.orzel@amd.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:37 +0000 (18:06 +0100)]
xen/arm: add cache coloring support for Xen image
Xen image is relocated to a new colored physical space. Some relocation
functionalities must be brought back:
- the virtual address of the new space is taken from 0c18fb76323b
("xen/arm: Remove unused BOOT_RELOC_VIRT_START").
- relocate_xen() and get_xen_paddr() are taken from f60658c6ae47
("xen/arm: Stop relocating Xen").
setup_pagetables() must be adapted for coloring and for relocation. Runtime
page tables are used to map the colored space, but they are also linked in
boot tables so that the new space is temporarily available for relocation.
This implies that Xen protection must happen after the copy.
Finally, since the alternative framework needs to remap the Xen text and
inittext sections, this operation must be done in a coloring-aware way.
The function xen_remap_colored() is introduced for that.
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Reviewed-by: Jan Beulich <jbeulich@suse.com> # common Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:36 +0000 (18:06 +0100)]
xen/arm: make consider_modules() available for xen relocation
Cache coloring must physically relocate Xen in order to color the hypervisor
and consider_modules() is a key function that is needed to find a new
available physical address.
672d67f339c0 ("xen/arm: Split MMU-specific setup_mm() and related code out")
moved consider_modules() under arm32. Move it to mmu/setup.c and make it
non-static so that it can be used outside.
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Luca Miccio [Tue, 17 Dec 2024 17:06:35 +0000 (18:06 +0100)]
xen/arm: add Xen cache colors command line parameter
Add a new command line parameter to configure Xen cache colors.
These colors are dumped together with other coloring info.
Benchmarking the VM interrupt response time provides an estimation of
LLC usage by Xen's most latency-critical runtime task. Results on Arm
Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
reserves 64 KiB of L2, is enough to attain best responsiveness:
- Xen 1 color latency: 3.1 us
- Xen 2 color latency: 3.1 us
Since this is the most common target for Arm cache coloring, the default
number of Xen colors is set to one.
More colors are instead very likely to be needed on processors whose L1
cache is physically-indexed and physically-tagged, such as Cortex-A57.
In such cases, coloring applies to L1 also, and there typically are two
distinct L1-colors. Therefore, reserving only one color for Xen would
senselessly partition a cache memory that is already private, i.e.
underutilize it.
Signed-off-by: Luca Miccio <lucmiccio@gmail.com> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:34 +0000 (18:06 +0100)]
xen: add cache coloring allocator for domains
Add a new memory page allocator that implements the cache coloring mechanism.
The allocation algorithm enforces equal frequency distribution of cache
partitions, following the coloring configuration of a domain. This allows
for an even utilization of cache sets for every domain.
Pages are stored in a color-indexed array of lists. Those lists are filled
by a simple init function which computes the color of each page.
When a domain requests a page, the allocator extracts the page from the list
with the maximum number of free pages among those that the domain can access,
given its coloring configuration.
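The selection step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the allocator's actual code: PAGE_SHIFT, NR_COLORS, free_pages[], addr_to_color() and pick_color() are hypothetical names, per-color counters stand in for the real lists, and the mapping of a page to its color (low bits of the frame number) is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4 KiB pages */
#define NR_COLORS  16   /* assumption: power-of-two number of colors */

/* Hypothetical per-color free page counters standing in for the lists. */
unsigned long free_pages[NR_COLORS];

/* Assumed mapping: the color is given by the low bits of the frame number. */
unsigned int addr_to_color(uint64_t paddr)
{
    return (unsigned int)((paddr >> PAGE_SHIFT) & (NR_COLORS - 1));
}

/*
 * Among the colors a domain may use, pick the one whose list currently
 * holds the most free pages. Returns -1 if all of those lists are empty.
 */
int pick_color(const unsigned int *colors, size_t nr)
{
    int best = -1;
    unsigned long best_free = 0;

    assert(colors != NULL || nr == 0);

    for ( size_t i = 0; i < nr; i++ )
    {
        unsigned int c = colors[i];

        assert(c < NR_COLORS);
        if ( free_pages[c] > best_free )
        {
            best_free = free_pages[c];
            best = (int)c;
        }
    }

    return best;
}
```

Always draining the fullest list keeps the per-color lists balanced, which is one way to obtain the equal frequency distribution mentioned above.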
The allocator can only handle requests of order-0 pages. This allows for
easier implementation and since cache coloring targets only embedded systems,
it's assumed not to be a major problem.
The buddy allocator must coexist with the colored one because the Xen heap
isn't colored. For this reason a new Kconfig option and a command line
parameter are added to let the user set the amount of memory reserved for
the buddy allocator. Even when cache coloring is enabled, this memory
isn't managed by the colored allocator.
Colored heap information is dumped in the dump_heap() debug-key function.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Michal Orzel <michal.orzel@amd.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:32 +0000 (18:06 +0100)]
xen/arm: add support for cache coloring configuration via device-tree
Add the "llc-colors" Device Tree property to express DomUs and Dom0less
color configurations.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Reviewed-by: Jan Beulich <jbeulich@suse.com> # non-Arm Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:31 +0000 (18:06 +0100)]
tools: add support for cache coloring configuration
Add a new "llc_colors" parameter that defines the LLC color assignment for
a domain. The user can specify one or more color ranges using the same
syntax used everywhere else for color config described in the
documentation.
The parameter is defined as a list of strings that represent the color
ranges.
Documentation is also added.
Golang bindings are regenerated.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Carlo Nonato [Tue, 17 Dec 2024 17:06:30 +0000 (18:06 +0100)]
xen: extend domctl interface for cache coloring
Add a new domctl hypercall to allow the user to set LLC coloring
configurations. Colors can be set only once, just after domain creation,
since recoloring isn't supported.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:29 +0000 (18:06 +0100)]
xen/arm: add Dom0 cache coloring support
Add a command line parameter to allow the user to set the coloring
configuration for Dom0.
A common configuration syntax for cache colors is introduced and
documented.
Take the opportunity to also add:
- default configuration notion.
- function to check well-formed configurations.
Direct mapping Dom0 isn't possible when coloring is enabled, so
CDF_directmap flag is removed when creating it.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:28 +0000 (18:06 +0100)]
xen/arm: permit non direct-mapped Dom0 construction
Cache coloring requires Dom0 not to be direct-mapped because of its non
contiguous mapping nature, so allocate_memory() is needed in this case.
8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
moved allocate_memory() to dom0less_build.c. In order to use it
in Dom0 construction bring it back to domain_build.c and declare it in
domain_build.h.
Adapt the implementation of allocate_memory() so that it uses the host
layout when called on the hwdom, via find_unallocated_memory().
Since gnttab information is needed in the process, move find_gnttab_region()
before allocate_memory() in construct_dom0().
Introduce add_hwdom_free_regions() callback to add hwdom banks in descending
order.
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:27 +0000 (18:06 +0100)]
xen/arm: add initial support for LLC coloring on arm64
LLC coloring needs to know the last level cache layout in order to make the
best use of it. This can be probed by inspecting the CLIDR_EL1 register,
so the Last Level is defined as the last level visible by this register.
Note that this excludes system caches in some platforms.
Static memory allocation and cache coloring are incompatible because static
memory can't be guaranteed to use only colors assigned to the domain.
Panic during DomUs creation when both are enabled.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Reviewed-by: Michal Orzel <michal.orzel@amd.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Carlo Nonato [Tue, 17 Dec 2024 17:06:26 +0000 (18:06 +0100)]
xen/common: add cache coloring common code
Last Level Cache (LLC) coloring allows partitioning the cache into smaller
chunks called cache colors.
Since not all architectures can actually implement it, add a HAS_LLC_COLORING
Kconfig option.
LLC_COLORS_ORDER Kconfig option has a range maximum of 10 (2^10 = 1024)
because that's the number of colors that fit in a 4 KiB page when integers
are 4 bytes long.
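The arithmetic behind that limit can be checked with a few lines of C. This is a sketch only: PAGE_SIZE_BYTES, MAX_NR_COLORS and max_colors_per_page() are hypothetical names, and it assumes unsigned int is the 4-byte integer type meant above.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE_BYTES   4096u                     /* one 4 KiB page */
#define LLC_COLORS_ORDER  10                        /* range maximum */
#define MAX_NR_COLORS     (1u << LLC_COLORS_ORDER)  /* 2^10 */

/* The sketch only holds where integers are 4 bytes long. */
static_assert(sizeof(unsigned int) == 4, "sketch assumes 4-byte integers");

/* 1024 four-byte color entries fill exactly one page. */
static_assert(MAX_NR_COLORS * sizeof(unsigned int) == PAGE_SIZE_BYTES,
              "2^10 colors must fit in one page");

/* Number of color entries that fit in a single page. */
unsigned int max_colors_per_page(void)
{
    return (unsigned int)(PAGE_SIZE_BYTES / sizeof(unsigned int));
}
```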
LLC colors are a property of the domain, so struct domain has to be extended.
Based on original work from: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech> Acked-by: Michal Orzel <michal.orzel@amd.com>
Oleksii Kurochko [Thu, 19 Dec 2024 11:18:31 +0000 (12:18 +0100)]
automation: Pin down CONFIG_QEMU_PLATFORM for RISC-V's randconfig job
Besides setting CONFIG_QEMU_PLATFORM=y in tiny64_defconfig,
CONFIG_QEMU_PLATFORM should also be fixed for RISC-V's randconfig job.
Otherwise, the expected compilation error for RISC-V's randconfig job
will occur since clean_and_invalidate_dcache_va_range() and
clean_dcache_va_range() are currently implemented only for the QEMU
platform.
Additionally, sort the EXTRA_FIXED_RANDCONFIG list alphabetically.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Fixes: f92e2709bd ("xen/riscv: implement data and instruction cache operations") Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Sergiy Kibrik [Thu, 19 Dec 2024 11:13:26 +0000 (13:13 +0200)]
xen/ioreq: Fix check for CONFIG_ARCH_VCPU_IOREQ_COMPLETION
It should be CONFIG_ARCH_VCPU_IOREQ_COMPLETION (as in Kconfig) and not
misspelled CONFIG_VCPU_ARCH_IOREQ_COMPLETION.
Fixes: 979cfdd3e58c ("ioreq: do not build arch_vcpu_ioreq_completion() for non-VMX configurations") Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
The important part: XZ decompression error: Memory usage limit reached
This looks to be related to the following change in Linux: 8653c909922743bceb4800e5cc26087208c9e0e6 ("xz: use 128 MiB dictionary and force single-threaded mode")
Fix this by increasing the block size to 256MiB. And remove the
misleading comment (from lack of better ideas).
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Anthony PERARD <anthony.perard@vates.tech> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Teddy Astie [Mon, 2 Dec 2024 09:49:14 +0000 (09:49 +0000)]
x86/hvm: Use constants for x86 modes
In many places in the x86 HVM code, integer constants are used to indicate which mode the
CPU is running in (real, vm86, 16-bit, 32-bit, 64-bit). However, these constants are
written directly as integers, which hides the actual meaning of these modes.
This patch introduces X86_MODE_* macros and replaces those occurrences with them.
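As an illustration of the idea, named constants might look like the sketch below. The commit message only gives the X86_MODE_* prefix; the individual macro names, their numeric values and the x86_mode_name() helper are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed names and values; only the X86_MODE_* prefix is from the text. */
#define X86_MODE_REAL   0
#define X86_MODE_VM86   1
#define X86_MODE_16BIT  2
#define X86_MODE_32BIT  4
#define X86_MODE_64BIT  8

/* Assumption of this sketch: wider protected modes compare greater. */
static_assert(X86_MODE_16BIT < X86_MODE_32BIT &&
              X86_MODE_32BIT < X86_MODE_64BIT,
              "mode constants are assumed to be ordered by width");

/* Hypothetical helper: a readable name for a mode, NULL if unknown. */
const char *x86_mode_name(int mode)
{
    switch ( mode )
    {
    case X86_MODE_REAL:  return "real";
    case X86_MODE_VM86:  return "vm86";
    case X86_MODE_16BIT: return "16-bit";
    case X86_MODE_32BIT: return "32-bit";
    case X86_MODE_64BIT: return "64-bit";
    default:             return NULL;
    }
}
```

A comparison such as `mode == X86_MODE_64BIT` then states its intent directly, where a bare integer would not.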
Signed-off-by: Teddy Astie <teddy.astie@vates.tech> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Teddy Astie <teddy.astie@vates.tech>
Andrew Cooper [Thu, 27 Jun 2024 12:55:51 +0000 (13:55 +0100)]
tools/libxg: Don't gunzip the guests initrd
Decompressing the kernel is necessary to inspect the ELF notes, but the
dombuilder will gunzip() secondary modules too. Specifically gunzip(), no
other decompression algorithms.
This may have been necessary in the dim and distant past, but it is broken
today. Linux specifically supports concatenating CPIO fragments of differing
compressions, and any attempt to interpret it with a single algorithm may
corrupt later parts.
This was an unexpected discovery while trying to test Xen's gunzip()
logic (Xen as a PVH guest, with a gzipped XTF kernel as dom0).
Interpreting secondary modules should be left as an exercise to the guest.
This reduces work done in dom0.
This is not expected to cause a practical difference to guests these days.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Thu, 12 Sep 2024 01:18:40 +0000 (02:18 +0100)]
xen/sched: Untangle credit2 vs cpu_nr_siblings()
Credit2 has no business including asm/cpufeature.h or asm/processor.h.
This was caused by a bad original abstraction, and an even less wise attempt
to fix the build on my part. It is also the sole reason why PPC and RISC-V
need a cpufeature.h header.
Worst of all, cpu_data[cpu].x86_num_siblings doesn't even have the same
meaning between vendors on x86 CPUs.
Implement cpu_nr_siblings() locally in credit2.c, leaving behind a TODO. Drop
the stub from each architecture.
Fixes: 8e2aa76dc167 ("xen: credit2: limit the max number of CPUs in a runqueue") Fixes: ad33a573c009 ("xen/credit2: Fix build following c/s 8e2aa76dc (take 2)") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Oleksii Kurochko [Thu, 19 Dec 2024 09:23:48 +0000 (10:23 +0100)]
xen/riscv: relocating and unflattening host device tree
Introduce relocate_fdt() and call it to relocate the FDT to the Xen heap
instead of using the early mapping, as it is expected that discard_initial_modules()
(which is supposed to be called in the future) discards the FDT boot module and
remove_early_mappings() destroys the early mapping.
Unflatten a device tree, creating the tree of struct device_node.
It also fills the "name" and "type" pointers of the nodes so the normal
device-tree walking functions can be used.
Set device_tree_flattened to NULL in the case when acpi_disabled is
equal to false.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Oleksii Kurochko [Thu, 19 Dec 2024 09:23:28 +0000 (10:23 +0100)]
xen/riscv: implement prereq for DTB relocation
DTB relocation in the Xen heap requires the following functions, which are
introduced in the current patch:
- xvmalloc_array()
- copy_from_paddr()
For internal use of xvmalloc, the functions flush_page_to_ram() and
virt_to_page() are introduced. virt_to_page() is also required for
free_xenheap_pages().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Oleksii Kurochko [Thu, 19 Dec 2024 09:22:46 +0000 (10:22 +0100)]
xen/riscv: implement data and instruction cache operations
Implement following cache operations:
- clean_and_invalidate_dcache_va_range()
- clean_dcache_va_range()
- invalidate_icache()
The first two functions may require support for the CMO (Cache Management
Operations) extension and/or hardware-specific instructions.
Currently, only QEMU is supported, which does not model cache behavior.
Therefore, clean_and_invalidate_dcache_va_range() and clean_dcache_va_range()
are implemented to simply return 0. For other cases, generate a compilation error
so that a user won't forget to update these functions if necessary.
If hardware supports CMO or hardware-specific instructions, these functions
should be updated accordingly. To support the current implementation of these
functions, CONFIG_QEMU_PLATFORM is introduced.
invalidate_icache() is implemented using fence.i instruction as
mentioned in the unpriv spec:
The FENCE.I instruction was designed to support a wide variety of
implementations. A simple implementation can flush the local instruction
cache and the instruction pipeline when the FENCE.I is executed.
A more complex implementation might snoop the instruction (data) cache
on every data (instruction) cache miss, or use an inclusive unified
private L2 cache to invalidate lines from the primary instruction cache
when they are being written by a local store instruction.
If instruction and data caches are kept coherent in this way, or if the
memory system consists of only uncached RAMs, then just the fetch pipeline
needs to be flushed at a FENCE.I.
The FENCE.I instruction requires the presence of the Zifencei extension,
which might not always be available. However, Xen uses the RV64G ISA, which
guarantees the presence of the Zifencei extension. According to the
unprivileged ISA specification (version 20240411):
One goal of the RISC-V project is that it be used as a stable software
development target. For this purpose, we define a combination of a base ISA
(RV32I or RV64I) plus selected standard extensions (IMAFD, Zicsr, Zifencei)
as a "general-purpose" ISA, and we use the abbreviation G for the
IMAFDZicsr_Zifencei combination of instruction-set extensions.
Set CONFIG_QEMU_PLATFORM=y in tiny64_defconfig to have a proper implementation of
clean_and_invalidate_dcache_va_range() and clean_dcache_va_range() for CI.
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Oleksii Kurochko [Thu, 19 Dec 2024 09:21:11 +0000 (10:21 +0100)]
xen/riscv: update layout table in config.h
Make all upper bounds (end addresses) for areas inclusive to align
with the corresponding definitions.
For the Direct map region, the upper bound was calculated incorrectly
in efadb18dd58aba ("xen/riscv: add VM space layout"). It should be
0x7f80000000 (considering that the value is exclusive, instead of
0x7f40000000). Therefore, the inclusive upper bound for that region
is 0x7f80000000 - 1.
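The conversion from the exclusive end to the inclusive bound is plain arithmetic, sketched here. DIRECTMAP_END_EXCL and inclusive_end() are hypothetical names for this example; only the address value comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Exclusive end of the direct map region, as stated in the message. */
#define DIRECTMAP_END_EXCL  UINT64_C(0x7f80000000)

/* Convert an exclusive end address into an inclusive upper bound. */
uint64_t inclusive_end(uint64_t end_excl)
{
    assert(end_excl > 0);   /* an empty range has no inclusive bound */
    return end_excl - 1;
}
```

With the value above, the inclusive bound evaluates to 0x7f7fffffff.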
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
PGC_static and PGC_extra need to be preserved when assigning a page.
Define a new macro that groups those flags and use it instead of or'ing
every time.
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Ariel Otilibili [Mon, 16 Dec 2024 23:07:20 +0000 (00:07 +0100)]
tools: Fix regex syntax warnings with Python 3.12
Since Python 3.12, invalid escape sequences generate SyntaxWarning. In the
future, these invalid sequences will raise a SyntaxError, so changed to using
raw string notation.
Link: https://docs.python.org/3/whatsnew/3.12.html#other-language-changes Fixes: d8f3a67bf98 ("pygrub: further improve grub2 support") Fixes: dd03048708a ("xen/pygrub: grub2/grub.cfg from RHEL 7 has new commands in menuentry") Fixes: d1b93ea2615 ("tools/pygrub: Make pygrub understand default entry in string format") Fixes: 622e368758b ("Add ZFS libfsimage support patch") Fixes: 02b26c02c7c ("xen/scripts: add cppcheck tool to the xen-analysis.py script") Fixes: 56c0063f4e7 ("xen/misra: xen-analysis.py: Improve the cppcheck version check") Signed-off-by: Ariel Otilibili <Ariel.Otilibili-Anieli@eurecom.fr> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 28 Apr 2022 08:44:02 +0000 (09:44 +0100)]
x86/CET: Support cet=<bool> on the command line
... as a shorthand for setting both suboptions at once. Currently, an admin
needs to pass cet=no-shstk,no-ibt to turn both off, where cet=0 is a better
option.
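A minimal sketch of how such a shorthand could be parsed is shown below. It is not Xen's parser: struct cet_opts and parse_cet() are hypothetical, the function receives only the text after "cet=", and only "0" and "1" are recognised as booleans here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the two suboptions. */
struct cet_opts {
    bool shstk;
    bool ibt;
};

/* Parse the value of "cet=". Returns 0 on success, -1 on a bad token. */
int parse_cet(const char *s, struct cet_opts *o)
{
    assert(s != NULL && o != NULL);

    /* Shorthand: one boolean sets both suboptions at once. */
    if ( !strcmp(s, "0") || !strcmp(s, "1") )
    {
        o->shstk = o->ibt = (s[0] == '1');
        return 0;
    }

    /* Otherwise: comma-separated list of [no-]shstk and [no-]ibt. */
    while ( *s )
    {
        size_t len = strcspn(s, ",");
        const char *tok = s;
        size_t tlen = len;
        bool val = true;

        if ( tlen > 3 && !strncmp(tok, "no-", 3) )
        {
            val = false;
            tok += 3;
            tlen -= 3;
        }

        if ( tlen == 5 && !strncmp(tok, "shstk", 5) )
            o->shstk = val;
        else if ( tlen == 3 && !strncmp(tok, "ibt", 3) )
            o->ibt = val;
        else
            return -1;

        s += len;
        if ( *s == ',' )
            s++;
    }

    return 0;
}
```

In this sketch, "0" and "no-shstk,no-ibt" leave the structure in the same state, which is the equivalence the shorthand is meant to provide.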
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Fixes: 631f535a3d4f ("xen: update ECLAIR service identifiers from MC3R1 to MC3A2.") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
xen: update ECLAIR service identifiers from MC3R1 to MC3A2.
Rename all instances of ECLAIR MISRA C:2012 service identifiers,
identified by the prefix MC3R1, to use the prefix MC3A2, which
refers to MISRA C:2012 Amendment 2 guidelines.
This update is motivated by the need to upgrade ECLAIR GitLab runners
that use the new naming scheme for MISRA C:2012 Amendment 2 guidelines.
Changes to the docs/misra directory are needed in order to keep
comment-based deviation up to date.
Andrew Cooper [Fri, 22 Nov 2024 16:00:37 +0000 (16:00 +0000)]
docs/guest-guide: Discuss when not use a hypercall page
The Linux rethunk and safe-ret speculative safety techniques involve
transforming `ret` to `jmp __x86_return_thunk` at compile time. Placing naked
`ret`s back in executable .text breaks these mitigations.
CET-IBT requires ENDBR instructions, and while we could in principle fix that,
the need to select between ENDBR32 or ENDBR64 means that the contents of the
hypercall page would need to become more mode-specific than it currently
is (HVM hypercall pages are currently 32bit and 64bit compatible). However,
there's no feasible way to make a hypercall page compatible with fine-grain
CFI schemes such as FineIBT.
OSes which care about either of these things are better off avoiding the
hypercall page.
This is part of XSA-466.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monne [Mon, 16 Dec 2024 18:33:29 +0000 (19:33 +0100)]
x86/io-apic: prevent early exit from i8259 loop detection
Avoid exiting early from the loop when a pin that could be connected to the
i8259 is found, as such early exit would leave the EOI handler translation
array only partially allocated and/or initialized.
Otherwise on systems with multiple IO-APICs and an unmasked ExtINT pin on
any IO-APIC that's not the last one the following NULL pointer dereference
triggers:
(XEN) Enabling APIC mode. Using 2 I/O APICs
(XEN) ----[ Xen-4.20-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82d040328046>] __ioapic_write_entry+0x83/0x95
[...]
(XEN) Xen call trace:
(XEN) [<ffff82d040328046>] R __ioapic_write_entry+0x83/0x95
(XEN) [<ffff82d04027464b>] F amd_iommu_ioapic_update_ire+0x1ea/0x273
(XEN) [<ffff82d0402755a1>] F iommu_update_ire_from_apic+0xa/0xc
(XEN) [<ffff82d040328056>] F __ioapic_write_entry+0x93/0x95
(XEN) [<ffff82d0403283c1>] F arch/x86/io_apic.c#clear_IO_APIC_pin+0x7c/0x10e
(XEN) [<ffff82d040328480>] F arch/x86/io_apic.c#clear_IO_APIC+0x2d/0x61
(XEN) [<ffff82d0404448b7>] F enable_IO_APIC+0x2e3/0x34f
(XEN) [<ffff82d04044c9b0>] F smp_prepare_cpus+0x254/0x27a
(XEN) [<ffff82d04044bec2>] F __start_xen+0x1ce1/0x23ae
(XEN) [<ffff82d0402033ae>] F __high_start+0x8e/0x90
(XEN)
(XEN) Pagetable walk from 0000000000000000:
(XEN) L4[0x000] = 000000007dbfd063 ffffffffffffffff
(XEN) L3[0x000] = 000000007dbfa063 ffffffffffffffff
(XEN) L2[0x000] = 000000007dbcc063 ffffffffffffffff
(XEN) L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: 0000000000000000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
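The shape of the fix can be sketched as a loop that records the first candidate but keeps iterating, so that every IO-APIC gets its state set up. This is a simplified model, not the code in io_apic.c: scan_ioapics() and both arrays are hypothetical, and a boolean flag stands in for allocating the EOI handler translation array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns the index of the first IO-APIC with an ExtINT candidate pin,
 * or -1 if there is none. Every entry of initialised[] is set regardless.
 */
int scan_ioapics(const bool has_extint[], bool initialised[], size_t nr)
{
    int found = -1;

    assert(has_extint != NULL && initialised != NULL);

    for ( size_t i = 0; i < nr; i++ )
    {
        /* Stand-in for setting up this IO-APIC's translation array. */
        initialised[i] = true;

        /* Remember the first candidate, but do not break out early. */
        if ( found < 0 && has_extint[i] )
            found = (int)i;
    }

    return found;
}
```

Had the loop exited on the first match, initialised[] would stay false for all later IO-APICs, which models the partially set up array behind the crash above.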
Reported-by: Sergii Dmytruk <sergii.dmytruk@3mdeb.com> Fixes: 86001b3970fe ('x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>