automation/alpine: add elfutils-dev and coreutils for livepatch-tools
In preparation for adding some livepatch-tools test update the Alpine
container to also install elfutils-dev and coreutils.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
I don't very much like to add coreutils, as it's also good to test
that we can build Xen with busybox, but I also got tired of adjusting
livepatch-tools.
Andrew Cooper [Wed, 5 Apr 2023 12:36:13 +0000 (13:36 +0100)]
tools/libs/guest: Fix build following libx86 changes
I appear to have lost this hunk somewhere...
Fixes: 1b67fccf3b02 ("libx86: Update library API for cpu_policy") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Thu, 30 Mar 2023 17:21:01 +0000 (18:21 +0100)]
x86: Out-of-inline the policy<->featureset convertors
These are already getting over-large for being inline functions, and are only
going to grow further over time. Out of line them, yielding the following net
delta from bloat-o-meter:
Andrew Cooper [Wed, 29 Mar 2023 11:01:33 +0000 (12:01 +0100)]
x86: Drop struct old_cpu_policy
With all the complicated callers of x86_cpu_policies_are_compatible() updated
to use a single cpu_policy object, we can drop the final user of struct
old_cpu_policy.
Update x86_cpu_policies_are_compatible() to take (new) cpu_policy pointers,
reducing the amount of internal pointer chasing, and update all callers to
pass their cpu_policy objects directly.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 29 Mar 2023 11:37:33 +0000 (12:37 +0100)]
x86: Merge xc_cpu_policy's cpuid and msr objects
Right now, they're the same underlying type, containing disjoint information.
Use a single object instead. Also take the opportunity to rename 'entries' to
'msrs' which is more descriptive, and more in line with nr_msrs being the
count of MSR entries in the API.
test-tsx uses xg_private.h to access the internals of xc_cpu_policy, so needs
updating at the same time. Take the opportunity to improve the code clarity
by passing a cpu_policy rather than an xc_cpu_policy into some functions.
No practical change. This undoes the transient doubling of storage space from
earlier patches.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 29 Mar 2023 06:39:44 +0000 (07:39 +0100)]
x86: Merge the system {cpuid,msr} policy objects
Right now, they're the same underlying type, containing disjoint information.
Introduce a new cpu-policy.{h,c} to be the new location for all policy
handling logic. Place the combined objects in __ro_after_init, which is new
since the original logic was written.
As we're trying to phase out the use of struct old_cpu_policy entirely, rework
update_domain_cpu_policy() to not pointer-chase through system_policies[].
This in turn allows system_policies[] in sysctl.c to become static and reduced
in scope to XEN_SYSCTL_get_cpu_policy.
No practical change. This undoes the transient doubling of storage space from
earlier patches.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 28 Mar 2023 20:24:20 +0000 (21:24 +0100)]
x86: Merge struct msr_policy into struct cpu_policy
As with the cpuid side, use a temporary define to make struct msr_policy still
work.
Note, this means that domains now have two separate struct cpu_policy
allocations with disjoint information, and system policies are in a similar
position, as well as xc_cpu_policy objects in libxenguest. All of these
duplications will be addressed in the following patches.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 28 Mar 2023 17:55:19 +0000 (18:55 +0100)]
x86: Rename struct cpuid_policy to struct cpu_policy
Also merge lib/x86/cpuid.h entirely into lib/x86/cpu-policy.h
Use a temporary define to make struct cpuid_policy still work.
There's one forward declaration of struct cpuid_policy in
tools/tests/x86_emulator/x86-emulate.h that isn't covered by the define, and
it's easier to rename that now than to rearrange the includes.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
These weren't great names to begin with, and using {leaves,msrs} matches up
better with the existing nr_{leaves,msr} parameters anyway.
Furthermore, by renaming these fields we can get away with using some #define
trickery to avoid the struct {cpuid,msr}_policy merge needing to happen in a
single changeset.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 28 Mar 2023 19:31:33 +0000 (20:31 +0100)]
x86: Rename struct cpu_policy to struct old_cpuid_policy
We want to merge struct cpuid_policy and struct msr_policy together, and the
result wants to be called struct cpu_policy.
The current struct cpu_policy, being a pair of pointers, isn't terribly
useful. Rename the type to struct old_cpu_policy, but it will disappear
entirely once the merge is complete.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 4 Apr 2023 13:45:18 +0000 (15:45 +0200)]
x86emul: correct AVX512VL+VPCLMUL test descriptions
The stride values (based on 32-bit element size) were wrong for these
two test, yielding misleading output (especially when comparing with the
test variants also involving AVX512-VBMI2).
Also insert a missing blank on a nearby, related line.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 4 Apr 2023 13:44:11 +0000 (15:44 +0200)]
x86/PV: ignore PAE_MODE ELF note for 64-bit Dom0
Besides a printk() the main effect is slight corruption of the start
info magic: While that's meant to be xen-3.0-x86_64, it wrongly ended
up as xen-3.0-x86_64p. (The extended-CR3 VM-assist thus won't be
enabled anymore either, but that's meaningless to 64-bit PV anyway.)
Note that no known users exist that would have developed a dependency on
the bogus magic string. In particular Linux, NetBSD, and mini-os have
been checked.
Fixes: 460060f83d41 ("libelf: use for x86 dom0 builder") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 4 Apr 2023 13:43:29 +0000 (15:43 +0200)]
x86emul/fuzzer: re-arrange cleaning
The latter of the two commits referenced below converted x86_emulate
from a symlinked dir to a real one, holding symlinked files. Yet even
before that the split between distclean and clean was suspicious: A
similar split, removing symlinks only in distclean, doesn't exist
anywhere else in the tree afaics.
Fixes: c808475882ef ("tools/fuzz: introduce x86 instruction emulator target") Fixes: 9ace97ab9b87 ("x86emul: split off opcode 0f01 handling") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Mon, 3 Apr 2023 12:17:35 +0000 (13:17 +0100)]
x86/emul: Fix test harness build with blk.c moved out of x86_emulate.c
Trying to build the test harness fails with:
x86_emulate/blk.c: In function 'x86_emul_blk':
x86_emulate/blk.c:74:15: error: expected ':' or ')' before 'ASM_FLAG_OUT'
74 | ASM_FLAG_OUT(, "; setz %[zf]")
| ^~~~~~~~~~~~
This is because ASM_FLAG_OUT() is still local to x86_emulate.c. Move it into
x86-emulate.h instead so it ends up in all files including private.h. The
main Xen build gets this macro from compiler.h.
Fixes: c80243f94386 ("x86emul: move x86_emul_blk() to separate source file") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
xen/x86: switch to use generic implemetation of bug.h
The following changes were made:
* Make GENERIC_BUG_FRAME mandatory for X86
* Update asm/bug.h using generic implementation in <xen/bug.h>
* Update do_invalid_op using generic do_bug_frame()
* Define BUG_DEBUGGER_TRAP_FATAL to debugger_trap_fatal(X86_EXC_GP,regs)
* type of eip variable was changed to 'const void *'
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
The idea of the patch is to change all <asm/bug.h> to <xen/bug.h> and
keep Xen compilable with adding only minimal amount of changes:
1. It was added "#include <xen/types.h>" to ARM's "<asm/bug.h>" as it
uses uint_{16,32}t in 'struct bug_frame'.
2. It was added '#define BUG_FRAME_STRUCT' which means that ARM hasn't
been switched to generic implementation yet.
3. It was added '#define BUG_FRAME_STRUCT' which means that x86 hasn't
been switched to generic implementation yet.
4. BUGFRAME_* and _start_bug_frame[], _stop_bug_frame_*[] were removed
for ARM & x86 to deal with compilation errors such as:
redundant redeclaration of ...
5. Remove BUG_DISP_WIDTH, BUG_LINE_LO_WIDTH, BUG_LINE_HI_WIDTH from
x86's <asm.bug.h> to not to produce #undef for them and #define again
with the same values as in <xen/bug.h>. These #undef and #define will
be anyway removed in the patch [2]
6. Remove <asm/bug.h> from <x86/acpi/cpufreq/cpufreq.c> and
<drivers/cpufreq/cpufreq.c> as nothing from <xen/bug.h> are used in
<*/cpufreq.c>
In the following two patches x86 and ARM archictectures will be
switched fully:
[1] xen/arm: switch ARM to use generic implementation of bug.h
[2] xen/x86: switch x86 to use generic implemetation of bug.h
Jan Beulich [Mon, 3 Apr 2023 10:48:12 +0000 (12:48 +0200)]
x86emul: move various utility functions to separate source files
Many are needed by the hypervisor only - have one file for this purpose.
Some are also needed by the harness (but not the fuzzer) - have another
file for these.
Code moved gets slightly adjusted in a few places, e.g. replacing
"state" by "s" (like was done for other that has been split off).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 3 Apr 2023 10:47:08 +0000 (12:47 +0200)]
x86emul: move x86_emul_blk() to separate source file
The function is already non-trivial and is expected to further grow.
Code moved gets slightly adjusted in a few places, e.g. replacing EXC_*
by X86_EXC_* (such that EXC_* don't need to move as well; we want these
to be phased out anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 3 Apr 2023 10:46:08 +0000 (12:46 +0200)]
x86emul: split off insn decoding
This is a fair chunk of code and data and can easily live separate from
the main emulation function.
Code moved gets slightly adjusted in a few places, e.g. replacing EXC_*
by X86_EXC_* (such that EXC_* don't need to move as well; we want these
to be phased out anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 3 Apr 2023 10:44:59 +0000 (12:44 +0200)]
x86emul: split off FPU opcode handling
Some of the helper functions/macros are needed only for this, and the
code is otherwise relatively independent of other parts of the emulator.
Code moved gets slightly adjusted in a few places, e.g. replacing EXC_*
by X86_EXC_* (such that EXC_* don't need to move as well; we want these
to be phased out anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 3 Apr 2023 10:43:51 +0000 (12:43 +0200)]
x86emul: split off opcode 0fc7 handling
There's a fair amount of sub-cases (with some yet to be implemented), so
a separate function seems warranted.
Code moved gets slightly adjusted in a few places, e.g. replacing EXC_*
by X86_EXC_* (such that EXC_* don't need to move as well; we want these
to be phased out anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 3 Apr 2023 10:42:44 +0000 (12:42 +0200)]
x86emul: split off opcode 0fae handling
There's a fair amount of sub-cases (with some yet to be implemented), so
a separate function seems warranted.
Code moved gets slightly adjusted in a few places, e.g. replacing EXC_*
by X86_EXC_* (such that EXC_* don't need to move as well; we want these
to be phased out anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 3 Apr 2023 10:41:08 +0000 (12:41 +0200)]
x86emul: split off opcode 0f01 handling
There's a fair amount of sub-cases (with some yet to be implemented), so
a separate function seems warranted.
Code moved gets slightly adjusted in a few places, e.g. replacing EXC_*
by X86_EXC_* (such that EXC_* don't need to move as well; we want these
to be phased out anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Fri, 31 Mar 2023 06:12:45 +0000 (08:12 +0200)]
x86/shadow: drop redundant present bit checks from FOREACH_PRESENT_L<N>E() "bodies"
FOREACH_PRESENT_L<N>E(), as their names (now) say, already invoke the
"body" only when the present bit is set; no need to re-do the check.
While there also
- stop open-coding mfn_to_maddr() in code being touched (re-indented)
anyway,
- stop open-coding mfn_eq() in code being touched or in adjacent code,
- drop local variables when they're no longer used at least twice.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 31 Mar 2023 06:11:53 +0000 (08:11 +0200)]
x86/shadow: rename SHADOW_FOREACH_L<N>E() to FOREACH_PRESENT_L<N>E()
These being local macros, the SHADOW prefix doesn't gain us much. What
is more important to be aware of at use sites is that the supplied code
is invoked for present entries only.
While making the adjustment also properly use NULL for the 3rd argument
at respective invocation sites.
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
1. One should use 'PRIpaddr' to display 'paddr_t' variables. However,
while creating nodes in fdt, the address (if present in the node name)
should be represented using 'PRIx64'. This is to be in conformance
with the following rule present in https://elinux.org/Device_Tree_Linux
. node names
"unit-address does not have leading zeros"
As 'PRIpaddr' introduces leading zeros, we cannot use it.
So, we have introduced a wrapper ie domain_fdt_begin_node() which will
represent physical address using 'PRIx64'.
2. One should use 'PRIx64' to display 'u64' in hex format. The current
use of 'PRIpaddr' for printing PTE is buggy as this is not a physical
address.
Andrew Cooper [Mon, 6 Feb 2023 20:06:45 +0000 (20:06 +0000)]
tools/ocaml/mmap: Drop the len parameter from Xenmmap.write
Strings in Ocaml carry their own length. Absolutely nothing good can come
from having caml_string_length(data) be different to len.
Use the appropriate accessor, String_val(), but retain the workaround for the
Ocaml -safe-string constness bug in the same way as we've done elsewhere in
Xen.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Christian Lindig <christian.lindig@cloud.com>
Jan Beulich [Thu, 30 Mar 2023 11:07:16 +0000 (13:07 +0200)]
x86emul: pull permission check ahead for REP INS/OUTS
Based on observations on a fair range of hardware from both primary
vendors even zero-iteration-count instances of these insns perform the
port related permission checking first.
Fixes: fe300600464c ("x86: Fix emulation of REP prefix") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
When vgic_reserve_virq() fails and backend is in domain, we should also
free the allocated event channel.
When backend is in Xen and call to xzalloc() returns NULL, there is no
need to call xfree(). This should be done instead on an error path
from vgic_reserve_virq(). Moreover, we should be calling XFREE() to
prevent an extra free in domain_vpl011_deinit().
In order not to repeat the same steps twice, call domain_vpl011_deinit()
on an error path whenever there is more work to do than returning rc.
Since this function can now be called from different places and more
than once, add proper guards, use XFREE() instead of xfree() and move
vgic_free_virq() to it.
Take the opportunity to return -ENOMEM instead of -EINVAL when memory
allocation fails.
Fixes: 1ee1e4b0d1ff ("xen/arm: Allow vpl011 to be used by DomU") Signed-off-by: Michal Orzel <michal.orzel@amd.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> Reviewed-by: Julien Grall <jgrall@amazon.com>
Michal Orzel [Thu, 23 Mar 2023 13:56:35 +0000 (14:56 +0100)]
xen/arm: domain_build: Check return code of domain_vpl011_init
We are assigning return code of domain_vpl011_init() to a variable
without checking it for an error. Fix it.
Fixes: 3580c8b2dfc3 ("xen/arm: if direct-map domain use native UART address and IRQ number for vPL011") Signed-off-by: Michal Orzel <michal.orzel@amd.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> Acked-by: Julien Grall <jgrall@amazon.com>
Juergen Gross [Tue, 28 Mar 2023 14:43:27 +0000 (16:43 +0200)]
tools/xenstore: fix quota check in acc_fix_domains()
Today when finalizing a transaction the number of node quota is checked
to not being exceeded after the transaction. This check is always done,
even if the transaction is being performed by a privileged connection,
or if there were no nodes created in the transaction.
Correct that by checking quota only if:
- the transaction is being performed by an unprivileged guest, and
- at least one node was created in the transaction
Roger Pau Monné [Wed, 29 Mar 2023 12:56:33 +0000 (14:56 +0200)]
vpci/msix: restore PBA access length and alignment restrictions
Accesses to the PBA array have the same length and alignment
limitations as accesses to the MSI-X table:
"For all accesses to MSI-X Table and MSI-X PBA fields, software must
use aligned full DWORD or aligned full QWORD transactions; otherwise,
the result is undefined."
Introduce such length and alignment checks into the handling of PBA
accesses for vPCI. This was a mistake of mine for not reading the
specification correctly.
Note that accesses must now be aligned, and hence there's no longer a
need to check that the end of the access falls into the PBA region as
both the access and the region addresses must be aligned.
Fixes: b177892d2d ('vpci/msix: handle accesses adjacent to the MSI-X table') Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Wed, 29 Mar 2023 12:55:37 +0000 (14:55 +0200)]
ns16550: correct name/value pair parsing for PCI port/bridge
First of all these were inverted: "bridge=" caused the port coordinates
to be established, while "port=" controlled the bridge coordinates. And
then the error messages being identical also wasn't helpful. While
correcting this also move both case blocks close together.
Fixes: 97fd49a7e074 ("ns16550: add support for UART parameters to be specifed with name-value pairs") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Tue, 28 Mar 2023 12:20:35 +0000 (14:20 +0200)]
vpci/msix: handle accesses adjacent to the MSI-X table
The handling of the MSI-X table accesses by Xen requires that any
pages part of the MSI-X related tables are not mapped into the domain
physmap. As a result, any device registers in the same pages as the
start or the end of the MSIX or PBA tables is not currently
accessible, as the accesses are just dropped.
Note the spec forbids such placing of registers, as the MSIX and PBA
tables must be 4K isolated from any other registers:
"If a Base Address register that maps address space for the MSI-X
Table or MSI-X PBA also maps other usable address space that is not
associated with MSI-X structures, locations (e.g., for CSRs) used in
the other address space must not share any naturally aligned 4-KB
address range with one where either MSI-X structure resides."
Yet the 'Intel Wi-Fi 6 AX201' device on one of my boxes has registers
in the same page as the MSIX tables, and thus won't work on a PVH dom0
without this fix.
In order to cope with the behavior passthrough any accesses that fall
on the same page as the MSIX tables (but don't fall in between) to the
underlying hardware. Such forwarding also takes care of the PBA
accesses, so it allows to remove the code doing this handling in
msix_{read,write}. Note that as a result accesses to the PBA array
are no longer limited to 4 and 8 byte sizes, there's no access size
restriction for PBA accesses documented in the specification.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 28 Mar 2023 12:20:16 +0000 (14:20 +0200)]
include: don't mention stub headers more than once in a make rule
When !GRANT_TABLE and !PV_SHIM headers-n contains grant_table.h twice,
causing make to complain "target '...' given more than once in the same
rule" for the rule generating the stub headers. We don't need duplicate
entries in headers-n anywhere, so zap them (by using $(sort ...)) right
where the final value of the variable is constructed.
Fixes: 6bec713f871f ("include/compat: produce stubs for headers not otherwise generated") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Dmitry Isaykin [Tue, 28 Mar 2023 12:18:46 +0000 (14:18 +0200)]
x86/monitor: add new monitor event to catch I/O instructions
Adds monitor support for I/O instructions.
Signed-off-by: Dmitry Isaykin <isaikin-dmitry@yandex.ru> Signed-off-by: Anton Belousov <abelousov@ptsecurity.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Andrew Cooper [Fri, 24 Feb 2023 18:23:38 +0000 (18:23 +0000)]
CI: Minor updates to buster-gcc-ibt
* Update from GCC 11.2 to 11.3
* Use python3-minimal instead of python
* Use --no-install-recommends, requiring ca-certificates, g++-multilib and
build-essential to be listed explicitly
The resulting container is ~50M smaller
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Fri, 24 Mar 2023 17:59:56 +0000 (17:59 +0000)]
CI: Remove llvm-8 from the Debian Stretch container
For similar reasons to c/s a6b1e2b80fe20. While this container is still
build-able for now, all the other problems with explicitly-versioned compilers
remain.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Fri, 24 Mar 2023 20:09:33 +0000 (20:09 +0000)]
configure: Drop --enable-githttp
Following Demi's work to use HTTPS everywhere, all users of GIT_HTTP have
been removed from the build system. Drop the configure knob.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Demi Marie Obenour <demi@invisiblethingslab.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Andrew Cooper [Mon, 6 Dec 2021 13:07:40 +0000 (13:07 +0000)]
x86/boot: Restrict directmap permissions for .text/.rodata
While we've been diligent to ensure that the main text/data/rodata mappings
have suitable restrictions, their aliases via the directmap were left fully
read/write. Worse, we even had pieces of code making use of this as a
feature.
Restrict the permissions for .text/rodata, as we have no legitimate need for
writeability of these areas via the directmap alias. Note that the
compile-time allocated pagetables do get written through their directmap
alias, so need to remain writeable.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 4 May 2020 12:32:21 +0000 (13:32 +0100)]
x86/ucode: Fix error paths control_thread_fn()
These two early exits skipped re-enabling the watchdog, restoring the NMI
callback, and clearing the nmi_patch global pointer. Always execute the tail
of the function on the way out.
Fixes: 8dd4dfa92d62 ("x86/microcode: Synchronize late microcode loading") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
automation: add a smoke and suspend test on an Alder Lake system
This is a first test using Qubes OS CI infra. The gitlab-runner has
access to ssh-based control interface (control@thor.testnet, ssh key
exposed to the test via ssh-agent) and pre-configured HTTP dir for boot
files (mapped under /scratch/gitlab-runner/tftp inside the container).
Details about the setup are described on
https://www.qubes-os.org/news/2022/05/05/automated-os-testing-on-physical-laptops/
There are two test. First is a simple dom0+domU boot smoke test, similar
to other existing tests. The second is one boots Xen, and try if S3
works. It runs on a ADL-based desktop system. The test script is based
on the Xilinx one.
The machine needs newer kernel than other x86 tests run, so use 6.1.x
kernel added in previous commit.
The usage of fakeroot is necessary to preserve device nodes (/dev/null
etc) when repacking rootfs. The test runs in a rootless podman
container, which doesn't have full root permissions. BTW the same
applies to docker with user namespaces enabled (but it's only opt-in
feature there).
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
It will be used in tests added in subsequent patches.
Enable config options needed for those tests.
While at it, migrate all the x86 tests to the newer kernel, and
introduce x86-64-test-needs to allow deduplication later (for now it's
used only once).
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Thu, 24 Feb 2022 19:40:15 +0000 (19:40 +0000)]
x86/vmx: Don't spuriously crash the domain when INIT is received
In VMX operation, the handling of INIT IPIs is changed. Instead of the CPU
resetting, the next VMEntry fails with EXIT_REASON_INIT. From the TXT spec,
the intent of this behaviour is so that an entity which cares can scrub
secrets from RAM before participating in an orderly shutdown.
Right now, Xen's behaviour is that when an INIT arrives, the HVM VM which
schedules next is killed (citing an unknown VMExit), *and* we ignore the INIT
and continue blindly onwards anyway.
This patch addresses only the first of these two problems by ignoring the INIT
and continuing without crashing the VM in question.
The second wants addressing too, just as soon as we've figured out something
better to do...
Discovered as collateral damage from when an AP triple faults on S3 resume on
Intel TigerLake platforms.
After spending ages sorting out Gitlab CI, it appears that OSSTest too has an
out-of-date Lets Encrypt cert. Revert again in the short term while we fix
this up.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Obtaining code over an insecure transport is a terrible idea for
blatently obvious reasons. Even for non-executable data, insecure
transports are considered deprecated.
This patch enforces the use of secure transports in misc places.
All URLs are known to work.
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
configure: Replace git:// and http:// with https://
Obtaining code over an insecure transport is a terrible idea for
blatently obvious reasons. Even for non-executable data, insecure
transports are considered deprecated.
This patch enforces the use of secure transports in the build system.
Some URLs returned 301 or 302 redirects, so I replaced them with the
URLs that were redirected to.
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
build: Change remaining xenbits.xen.org link to HTTPS
Obtaining code over an insecure transport is a terrible idea for
blatently obvious reasons. Even for non-executable data, insecure
transports are considered deprecated.
This patch enforces the use of secure transports for all xenbits.xen.org
URLs. All altered links have been tested and are known to work.
Signed-off-by: Demi Marie Obenour <demi@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
build: Use HTTPS for all xenbits.xen.org Git repos
Obtaining code over an insecure transport is a terrible idea for
blatently obvious reasons. Even for non-executable data, insecure
transports are considered deprecated.
This patch enforces the use of secure transports for all xenbits git
repositories. It was generated with the following shell script:
Andrew Cooper [Wed, 15 Sep 2021 16:01:43 +0000 (17:01 +0100)]
xen/credit2: Remove tail padding from TRC_CSCHED2_* records
All three of these records have tail padding, leaking stack rubble into the
trace buffer. Introduce an explicit _pad field and have the compiler zero the
padding automatically.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Andrew Cooper [Wed, 15 Sep 2021 15:49:01 +0000 (16:49 +0100)]
xen/memory: Remove tail padding from TRC_MEM_* records
Four TRC_MEM_* records supply custom structures with tail padding, leaking
stack rubble into the trace buffer. Three of the records were fine in 32-bit
builds of Xen, due to the relaxed alignment of 64-bit integers, but
POD_SUPERPAGE_SPLITER was broken right from the outset.
We could pack the datastructures to remove the padding, but xentrace_format
has no way of rendering the upper half of a 16-bit field. Instead, expand all
16-bit fields to 32-bit.
For POD_SUPERPAGE_SPLINTER, introduce an order field as it is relevant
information, and to match DECREASE_RESERVATION, and so it doesn't require a
__packed attribute to drop tail padding.
Update xenalyze's structures to match, and introduce xentrace_format rendering
which was absent previously.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Andrew Cooper [Thu, 16 Sep 2021 09:24:26 +0000 (10:24 +0100)]
xen/trace: Don't over-read trace objects
In the case that 'extra' isn't a multiple of uint32_t, the calculation rounds
the number of bytes up, causing later logic to read unrelated bytes beyond the
end of the object.
Also, asserting that the object is within TRACE_EXTRA_MAX, but truncating it
in release builds is rude. Instead, reject any out-of-spec records, leaving
enough of a message to identify the faulty caller.
There is one buggy trace record, TRC_RTDS_BUDGET_BURN. As it must remain
__packed (as cur_budget is misaligned), change bool has_extratime to uint32_t
to compensate.
It turns out that the new printk() can also be hit by HVMOP_xentrace, because
the hypercall is broken. It cannot be used outside of custom debugging, as
none of the tooling was ever updated to understand TRC_GUEST, nor is there any
evidence of hypercall ever being used in public.
While the hypercall was clearly intended to be used with units if uint32_t's,
that's not how the API/ABI works - Xen will in fact read the entire structure
rather than the initialised subset out of guest memory (most likely, stack
rubble), then copy up to 3 bytes of it (rounding up to the next uint32_t) into
the real tracebuffer.
There are several possible ways to fix this, but as the hypercall, and does
not plausibly have any users, go with the one that is least logic in Xen, by
rejecting tracing attempts that are not of uint32_t size.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Edwin Török [Mon, 16 May 2022 19:45:13 +0000 (20:45 +0100)]
x86/hvm: Improve hvm_set_guest_pat() code generation again
Following on from cset 9ce0a5e207f3 ("x86/hvm: Improve hvm_set_guest_pat()
code generation"), and the discovery that Clang/LLVM makes some especially
disastrous code generation for the loop at -O2
https://github.com/llvm/llvm-project/issues/54644
Edvin decided to remove the loop entirely by fully vectorising it. This is
substantially more efficient than the loop, and rather harder for a typical
compiler to mess up.
Signed-off-by: Edwin Török <edvin.torok@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 3 Dec 2021 20:33:57 +0000 (20:33 +0000)]
x86/boot: Factor move_xen() out of __start_xen()
Partly for clarity because there is a lot of subtle magic at work here.
Expand the commentary of what's going on.
Also because there is no need to double copy the stack (32kB). Spilled
content does need accounting for, but this can be sorted by only copying only
a handful of words.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 23 Mar 2023 23:41:20 +0000 (23:41 +0000)]
x86/shadow: Fix build with no PG_log_dirty
Gitlab Randconfig found:
arch/x86/mm/shadow/common.c: In function 'shadow_prealloc':
arch/x86/mm/shadow/common.c:1023:18: error: implicit declaration of function
'paging_logdirty_levels'; did you mean 'paging_log_dirty_init'? [-Werror=implicit-function-declaration]
1023 | count += paging_logdirty_levels();
| ^~~~~~~~~~~~~~~~~~~~~~
| paging_log_dirty_init
arch/x86/mm/shadow/common.c:1023:18: error: nested extern declaration of 'paging_logdirty_levels' [-Werror=nested-externs]
The '#if PG_log_dirty' expression is currently SHADOW_PAGING && !HVM &&
PV_SHIM_EXCLUSIVE. Move the declaration outside.
Fixes: 33fb3a661223 ("x86/shadow: account for log-dirty mode when pre-allocating") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 24 Aug 2022 09:48:48 +0000 (10:48 +0100)]
x86/hvmloader: Don't build as PIC
HVMLoader is not relocatable in memory, and 32bit PIC code has a large
overhead. Override the compilers choice of pic/no-pic and force it to be
non-relocatable.
Andrew Cooper [Thu, 20 Jan 2022 15:45:02 +0000 (15:45 +0000)]
xen: Modify domain_crash() to take a print string
There are two problems with domain_crash().
First, that it is frequently not preceded by a printk() at all, or only by a
dprintk(). Either way, critical diagnostic information is missing for an
event which is fatal to the guest.
Second, the embedded __LINE__ is an issue for livepatching, creating unwanted
churn in the binary diff. This is the final __LINE__ remaining in
livepatching-relevant contexts.
The end goal is to have domain_crash() require a print string which gets fed
to printk(), making it far less easy to omit relevant diagnostic information.
However, modifying all callers at once is far too big and complicated, so use
some macro magic to tolerate the old API (no print string) in the short term.
Adjust two callers in load_segments() to demonstrate the new API.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
<eval_nospec_test>:
0f ae e8 lfence
85 ff test %edi,%edi
74 02 je <eval_nospec_test+0x9>
90 nop
c3 ret
90 nop
c3 ret
which is not safe because the lfence has been hoisted above the conditional
jump. Clang concludes that both barrier_nospec_true()'s have identical side
effects and can safely be merged.
Clang can be persuaded that the side effects are different if there are
different comments in the asm blocks. This is fragile, but no more fragile
that other aspects of this construct.
Introduce barrier_nospec_false() with a separate internal comment to prevent
Clang merging it with barrier_nospec_true() despite the otherwise-identical
content. The generated code now becomes:
<eval_nospec_test>:
85 ff test %edi,%edi
74 05 je <eval_nospec_test+0x9>
0f ae e8 lfence
90 nop
c3 ret
0f ae e8 lfence
90 nop
c3 ret
which has the correct number of lfence's, and in the correct place.
Link: https://github.com/llvm/llvm-project/issues/55084 Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 14 Oct 2022 13:39:55 +0000 (14:39 +0100)]
xen/argo: Fixes to argo_dprintk()
Rewrite argo_dprintk() so printk() format typechecking can always be
performed. This also fixes the fact that parameters are not evaulated at all
in the default case.
Emit the messages at XENLOG_DEBUG.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Jan Beulich [Fri, 24 Mar 2023 10:20:59 +0000 (11:20 +0100)]
x86/shadow: OOS mode is HVM-only
XEN_DOMCTL_CDF_oos_off is forced set for PV domains, so the logic can't
ever be engaged for them. Conditionalize respective fields and remove
the respective bit from SHADOW_OPTIMIZATIONS when !HVM. As a result the
SH_type_oos_snapshot constant can disappear altogether in that case, and
a couple of #ifdef-s can also be dropped/combined.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
These aren't mode dependent (see 06f04f54ba97 ["x86/shadow:
sh_{write,cmpxchg}_guest_entry() are PV-only"], where they were moved
out of multi.c) and hence there's no need to have pointers to the
functions in struct shadow_paging_mode. Due to include dependencies,
however, the "paging" wrappers need to move out of paging.h; they're
needed from PV memory management code only anyway, so by moving them
their exposure is reduced at the same time.
By carefully placing the (moved and renamed) shadow function
declarations, #ifdef can also be dropped from the "paging" wrappers
(paging_mode_shadow() is constant false when !SHADOW_PAGING).
While moving the code, drop the (largely wrong) comment from
paging_write_guest_entry() and reduce that of
paging_cmpxchg_guest_entry().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Fri, 24 Mar 2023 10:13:43 +0000 (11:13 +0100)]
tools: add container_of() macro to xen-tools/common-macros.h
Instead of having 3 identical copies of the definition of a
container_of() macro in different tools header files, add that macro
to xen-tools/common-macros.h and use that instead.
Delete the other copies of that macro.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Juergen Gross [Fri, 24 Mar 2023 10:12:32 +0000 (11:12 +0100)]
tools: get rid of additional min() and max() definitions
Defining min(), min_t(), max() and max_t() at other places than
xen-tools/common-macros.h isn't needed, as the definitions in said
header can be used instead.
Same applies to BUILD_BUG_ON() in hvmloader.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Jan Beulich [Fri, 24 Mar 2023 10:11:48 +0000 (11:11 +0100)]
x86/PV: conditionalize arch_set_info_guest()'s call to update_cr3()
sh_update_paging_modes() as its last action already invokes
sh_update_cr3(). Therefore there is no reason to invoke update_cr3()
another time immediately after calling paging_update_paging_modes(),
especially as sh_update_cr3() does not short-circuit the "nothing
changed" case.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 24 Mar 2023 10:10:41 +0000 (11:10 +0100)]
x86/shadow: replace memcmp() in sh_resync_l1()
Ordinary scalar operations are used in a multitude of other places, so
do so here as well. In fact take the opportunity and drop a local
variable then as well, first and foremost to get rid of a bogus cast.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 24 Mar 2023 10:08:36 +0000 (11:08 +0100)]
x86/shadow: fold/rename sh_unhook_*_mappings()
The "32b" and "pae" functions are identical at the source level (they
differ in what they get compiled to, due to differences in
SHADOW_FOREACH_L2E()), leaving aside a comment the PAE variant has and
the non-PAE one doesn't. Replace these infixes by the more usual l<N>
ones (and then also for the "64b" one for consistency; that'll also
allow for re-use once we support 5-level paging, if need be). The two
different instances are still distinguishable by their "level" suffix.
While fiddling with the names, convert the last parameter to boolean
as well.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 24 Mar 2023 10:07:08 +0000 (11:07 +0100)]
x86/shadow: fix and improve sh_page_has_multiple_shadows()
While no caller currently invokes the function without first making sure
there is at least one shadow [1], we'd better eliminate UB here:
find_first_set_bit() requires input to be non-zero to return a well-
defined result.
Further, using find_first_set_bit() isn't very efficient in the first
place for the intended purpose.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[1] The function has exactly two uses, and both are from OOS code, which
is HVM-only. For HVM (but not for PV) sh_mfn_is_a_page_table(),
guarding the call to sh_unsync(), guarantees at least one shadow.
Hence even if sh_page_has_multiple_shadows() returned a bogus value
when invoked for a PV domain, the subsequent is_hvm_vcpu() and
oos_active checks (the former being redundant with the latter) will
compensate. (Arguably that oos_active check should come first, for
both clarity and efficiency reasons.)
Juergen Gross [Thu, 23 Mar 2023 08:18:12 +0000 (09:18 +0100)]
tools/xl: make split_string_into_pair() more usable
Today split_string_into_pair() will not really do what its name is
suggesting: instead of splitting a string into a pair of strings using
a delimiter, it will return the first two strings of the initial string
by using the delimiter.
This is never what the callers want, so modify split_string_into_pair()
to split the string only at the first delimiter found, resulting in
something like "x=a=b" to be split into "x" and "a=b" when being called
with "=" as the delimiter. Today the returned strings would be "x" and
"a".
At the same time switch the delimiter from "const char *" (allowing
multiple delimiter characters) to "char" (a single character only), as
this makes the function more simple without breaking any use cases.
Suggested-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Juergen Gross [Thu, 23 Mar 2023 08:17:57 +0000 (09:17 +0100)]
tools: use libxenlight for writing xenstore-stubdom console nodes
Instead of duplicating libxl__device_console_add() work in
init-xenstore-domain.c, just use libxenlight.
This requires to add a small wrapper function to libxenlight, as
libxl__device_console_add() is an internal function.
This at once removes a theoretical race between starting xenconsoled
and xenstore-stubdom, as the old code wasn't using a single
transaction for writing all the entries, leading to the possibility
that xenconsoled would see only some of the entries being written.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
VT-d: fix iommu=no-igfx if the IOMMU scope contains fake device(s)
If the scope for IGD's IOMMU contains additional device that doesn't
actually exist, iommu=no-igfx would not disable that IOMMU. In this
particular case (Thinkpad x230) it included 00:02.1, but there is no
such device on this platform. Consider only existing devices for the
"gfx only" check as well as the establishing of IGD DRHD address
(underlying is_igd_drhd(), which is used to determine applicability of
two workarounds).
Fixes: 2d7f191b392e ("VT-d: generalize and correct "iommu=no-igfx" handling") Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Juergen Gross [Wed, 22 Mar 2023 09:00:09 +0000 (10:00 +0100)]
tools/xl: allow split_string_into_pair() to trim values
Most use cases of split_string_into_pair() are requiring the returned
strings to be white space trimmed.
In order to avoid the same code pattern multiple times, add a predicate
parameter to split_string_into_pair() which can be specified to call
trim() with that predicate for the string pair returned. Specifying
NULL for the predicate will avoid the call of trim().
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Jan Beulich [Wed, 22 Mar 2023 08:58:25 +0000 (09:58 +0100)]
move {,vcpu_}show_execution_state() declarations to common header
These are used from common code, so their signatures should be
consistent across architectures. This is achieved / guaranteed easiest
when their declarations are in a common header.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Julien Grall <jgrall@amazon.com>