Juergen Gross [Tue, 30 May 2023 08:24:14 +0000 (10:24 +0200)]
tools/xenstore: add framework to commit accounting data on success only
Instead of modifying accounting data and undoing those modifications in
case of an error during further processing, add a framework for
collecting the needed changes and commit them only when the whole
operation has succeeded.
This scheme can reuse large parts of the per transaction accounting.
The changed_domain handling can be reused, but it should be possible for
the array size of the accounting data to differ between the two use cases.
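The collect-then-commit idea described above can be sketched roughly as
follows. This is an illustration only, not the xenstored code: the names
pending_acc, acc_add(), acc_finish() and the fixed MAX_DOMAINS limit are
hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_DOMAINS 4               /* hypothetical limit for this sketch */

/* Hypothetical committed per-domain node counters. */
static int committed_nodes[MAX_DOMAINS];

/* Changes collected while a single request is being processed. */
struct pending_acc {
    int delta[MAX_DOMAINS];
};

static void acc_init(struct pending_acc *p)
{
    for (size_t i = 0; i < MAX_DOMAINS; i++)
        p->delta[i] = 0;
}

/* Record a change without touching the committed data. */
static void acc_add(struct pending_acc *p, unsigned int domid, int n)
{
    p->delta[domid] += n;
}

/* Apply the collected changes only if the whole operation succeeded. */
static void acc_finish(struct pending_acc *p, bool success)
{
    if (success)
        for (size_t i = 0; i < MAX_DOMAINS; i++)
            committed_nodes[i] += p->delta[i];

    acc_init(p);                    /* pending data is dropped either way */
}
```

With this shape an error path needs no undo logic: it simply calls
acc_finish() with success set to false.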
Juergen Gross [Tue, 30 May 2023 08:24:12 +0000 (10:24 +0200)]
tools/xenstore: manage per-transaction domain accounting data in an array
In order to prepare keeping accounting data in an array instead of
using independent fields, switch the struct changed_domain accounting
data to that scheme, for now only using an array with one element.
In order to be able to extend this scheme add the needed indexing enum
to xenstored_domain.h.
Juergen Gross [Tue, 30 May 2023 08:24:11 +0000 (10:24 +0200)]
tools/xenstore: take transaction internal nodes into account for quota
The accounting for the number of nodes of a domain in an active
transaction is not working correctly, as it is checking the node quota
only against the number of nodes outside the transaction.
This can result in the transaction finally failing, as node quota is
checked at the end of the transaction again.
On the other hand even in a transaction deleting many nodes, new nodes
might not be creatable, in case the node quota was already reached at
the start of the transaction.
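A minimal sketch of the corrected check, assuming the quota is compared
against the node count outside the transaction plus the transaction's own
net change. The function name and parameters are hypothetical and not
taken from xenstored.

```c
#include <stdbool.h>

/*
 * Returns true if creating one more node would exceed the quota.
 * nodes_outside_ta: nodes owned by the domain outside the transaction
 * ta_delta:         net nodes added (positive) or removed (negative)
 *                   inside the transaction so far
 */
static bool node_quota_exceeded(int nodes_outside_ta, int ta_delta, int quota)
{
    return nodes_outside_ta + ta_delta + 1 > quota;
}
```

A domain already at its quota can then still create nodes inside a
transaction that has deleted some, and a transaction that adds too many
nodes is refused early instead of failing at its end.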
Luca Fancellu [Wed, 31 May 2023 07:24:10 +0000 (08:24 +0100)]
tools: add physinfo arch_capabilities handling for Arm
On Arm, the SVE vector length is encoded in the arch_capabilities field
of struct xen_sysctl_physinfo. Make use of this field in the tools
when building for Arm.
Create header arm-arch-capabilities.h to handle the arch_capabilities
field of physinfo for Arm.
Signed-off-by: Luca Fancellu <luca.fancellu@arm.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Christian Lindig <christian.lindig@cloud.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Luca Fancellu [Wed, 31 May 2023 07:24:09 +0000 (08:24 +0100)]
xen/physinfo: encode Arm SVE vector length in arch_capabilities
When the arm platform supports SVE, advertise the feature in the
field arch_capabilities in struct xen_sysctl_physinfo by encoding
the SVE vector length in it.
Luca Fancellu [Wed, 31 May 2023 07:24:08 +0000 (08:24 +0100)]
xen: enable Dom0 to use SVE feature
Add a command line parameter to allow Dom0 the use of SVE resources.
The command line parameter sve=<integer>, a sub-argument of dom0=,
controls the feature for this domain and sets the maximum SVE vector
length for Dom0.
Add a new function, parse_signed_integer(), to parse an integer
command line argument.
Luca Fancellu [Wed, 31 May 2023 07:24:07 +0000 (08:24 +0100)]
xen/common: add dom0 xen command line argument for Arm
Currently x86 defines a Xen command line argument dom0=<list> in which
dom0-controlling sub-options can be specified. To use it on Arm as
well, move the code that loops through the list of arguments from
x86 to the common code and, from there, call architecture-specific
functions to handle the comma-separated sub-options.
Luca Fancellu [Wed, 31 May 2023 07:24:06 +0000 (08:24 +0100)]
arm/sve: save/restore SVE context switch
Save and restore the SVE state on context switch. Allocate memory to
contain the Z0-31 registers, which are at most 2048 bits each, and
FFR, which is at most 256 bits. The amount of memory allocated depends
on the vector length of the domain and on how many bits are supported
by the platform.
Save P0-15 whose length is maximum 256 bits each, in this case the
memory used is from the fpregs field in struct vfp_state,
because V0-31 are part of Z0-31 and this space would have been
unused for SVE domain otherwise.
Create zcr_el{1,2} fields in arch_vcpu, initialise zcr_el2 on vcpu
creation given the requested vector length and restore it on
context switch, save/restore ZCR_EL1 value as well.
List import macros from Linux in README.LinuxPrimitives.
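The sizes quoted above follow from the architecture: each predicate
register and FFR hold one bit per byte of vector length. A small sketch
of the arithmetic, using hypothetical helper names that are not part of
Xen:

```c
#include <stddef.h>

/* Z0-31: 32 registers of vl_bits each. */
static size_t sve_zregs_bytes(unsigned int vl_bits)
{
    return 32 * (size_t)(vl_bits / 8);
}

/* FFR: vl_bits / 8 bits, i.e. vl_bits / 64 bytes. */
static size_t sve_ffr_bytes(unsigned int vl_bits)
{
    return vl_bits / 64;
}

/* P0-15: 16 registers of vl_bits / 8 bits each. */
static size_t sve_pregs_bytes(unsigned int vl_bits)
{
    return 16 * (size_t)(vl_bits / 64);
}

/* V0-31: 32 registers of 128 bits each. */
static size_t vfp_vregs_bytes(void)
{
    return 32 * (128 / 8);
}
```

At the maximum vector length of 2048 bits the predicate registers need
512 bytes, which is exactly the space taken by V0-31, so they fit in the
otherwise unused fpregs storage.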
Luca Fancellu [Wed, 31 May 2023 07:24:03 +0000 (08:24 +0100)]
xen/arm: add SVE vector length field to the domain
Add sve_vl field to arch_domain and xen_arch_domainconfig struct,
to give the domain information about the SVE feature
and the number of SVE register bits that are allowed for this
domain.
sve_vl field is the vector length in bits divided by 128, which
uses less space in the structures.
The field is used also to allow or forbid a domain to use SVE,
because a value equal to zero means the guest is not allowed to
use the feature.
Check that the requested vector length is lower than or equal to the
platform supported vector length, otherwise fail on domain
creation.
Check that only 64 bit domains have SVE enabled, otherwise fail.
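The encoding and the checks listed above can be sketched as below. The
helper names are hypothetical, and the rejection of lengths that are not
a multiple of 128 is an assumption of this sketch rather than something
stated in the commit message.

```c
#include <stdbool.h>

#define SVE_VL_GRANULE_BITS 128U

/* Vector length in bits -> value stored in the sve_vl field. */
static unsigned int vl_bits_to_field(unsigned int vl_bits)
{
    return vl_bits / SVE_VL_GRANULE_BITS;
}

/* Value stored in the sve_vl field -> vector length in bits. */
static unsigned int vl_field_to_bits(unsigned int sve_vl)
{
    return sve_vl * SVE_VL_GRANULE_BITS;
}

/* Domain creation check; a requested length of zero means SVE is off. */
static bool sve_request_ok(unsigned int req_bits, unsigned int platform_bits,
                           bool is_64bit_domain)
{
    if (req_bits == 0)
        return true;
    if (!is_64bit_domain)
        return false;
    if (req_bits % SVE_VL_GRANULE_BITS)    /* assumption, see above */
        return false;
    return req_bits <= platform_bits;
}
```

The maximum architectural length of 2048 bits encodes to 16, so the
field fits comfortably in a single byte.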
Luca Fancellu [Wed, 31 May 2023 07:24:02 +0000 (08:24 +0100)]
xen/arm: enable SVE extension for Xen
Enable Xen to handle the SVE extension, add code in cpufeature module
to handle ZCR SVE register, disable trapping SVE feature on system
boot only when SVE resources are accessed.
While there, correct coding style for the comment on coprocessor
trapping.
Now cptr_el2 is part of the domain context and it will be restored
on context switch, this is a preparation for saving the SVE context
which will be part of VFP operations, so restore it before the call
to save VFP registers.
To save an additional isb barrier, restore cptr_el2 before an
existing isb barrier and move the call for saving VFP context after
that barrier. To keep ctxt_switch_to() and ctxt_switch_from() (mostly)
symmetric, move vfp_save_state() up in the function.
Change the KConfig entry to make ARM64_SVE symbol selectable, by
default it will be not selected.
Create sve module and sve_asm.S that contains assembly routines for
the SVE feature. This code is inspired by Linux and it uses
instruction encoding to be compatible with compilers that do not
support SVE; imported instructions are documented in
README.LinuxPrimitives.
Add a feature to the diff-report.py script that improves the comparison
between two analysis reports, one from a baseline codebase and the other
from the changes applied to the baseline.
Comparing reports from different codebases is difficult because
entries in the baseline may have moved due to the addition
or deletion of unrelated lines, or may disappear because the line
they refer to was deleted, making the comparison between two revisions
of the code harder.
Given a baseline report, a report of the codebase with the changes
(the "new report"), and a git diff format file that describes the
changes made to the code since the baseline, this feature can
work out which entries from the baseline report are deleted or shifted
in position due to changes to unrelated lines, and can modify them to
match how they will appear in the "new report".
With the "patched baseline" and the "new report", it is now simple
to diff them and print only the entries that are new.
Luca Fancellu [Thu, 25 May 2023 08:33:59 +0000 (09:33 +0100)]
xen/misra: add diff-report.py tool
Add a new tool, diff-report.py, that can be used to diff
reports generated by the xen-analysis.py tool.
Currently this tool supports the Xen cppcheck text report format in
its operations.
The tool prints every finding that is in the report passed with -r
(check report) which is not in the report passed with -b (baseline).
x86/microcode: Add missing unlock in microcode_update_helper()
microcode_update_helper() may return early while holding
cpu_add_remove_lock, hence preventing any writers from taking it again.
Leave through the `put` label instead so it's properly released.
Fixes: 5ed12565aa32 ("microcode: rendezvous CPUs in NMI handler and load ucode") Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
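The pattern behind this fix, sketched with a counter standing in for the
real lock. Only the `put` label name comes from the commit message; the
function and variable names are hypothetical.

```c
#include <stdbool.h>

static int lock_depth;              /* stand-in for a real lock */

static void take_lock(void)    { lock_depth++; }
static void release_lock(void) { lock_depth--; }

static int update_helper(bool fail_early)
{
    int ret = 0;

    take_lock();

    if (fail_early) {
        ret = -1;
        goto put;                   /* a bare "return ret;" would leak the lock */
    }

    /* ... the main work would happen here ... */

 put:
    release_lock();
    return ret;
}
```

Routing every exit through a single label keeps the lock balanced no
matter which path is taken.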
Andrew Cooper [Mon, 5 Jun 2023 09:48:59 +0000 (10:48 +0100)]
xen: Fix incorrect taint constant
Insecure is the word being looked for here. Especially given the nature of
the sole caller, and the (correct) comment next to it.
Also update the taint marker from 'U' to 'I' for consistency; this isn't
expected to impact anyone in practice.
Fixes: 82c0d3d491cc ("xen: Add an unsecure Taint type") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Jan Beulich [Mon, 5 Jun 2023 14:54:30 +0000 (16:54 +0200)]
x86emul: AVX512-FP16 testing
Naming of some of the builtins isn't fully consistent with that of pre-
existing ones, so there's a need for a new BR2() wrapper macro.
With the tests providing some proof of proper functioning of the
emulator code also enable use of the feature by guests, as there's no
other infrastructure involved in enabling this ISA extension.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Henry Wang <Henry.Wang@arm.com> # CHANGELOG
Jan Beulich [Mon, 5 Jun 2023 13:02:39 +0000 (15:02 +0200)]
build: use $(dot-target)
While slightly longer, I agree with Andrew that using it helps
readability. Where touching them anyway, also wrap some overly long
lines.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
These are easiest in that they have same-size source and destination
vectors, yet they're different from other conversion insns in that they
use opcodes which have different meaning in the 0F encoding space
({,V}H{ADD,SUB}P{S,D}), hence requiring a little bit of overriding.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 5 Jun 2023 12:58:25 +0000 (14:58 +0200)]
x86emul: handle AVX512-FP16 Map6 misc insns
While, as before, this leverages that the Map6 encoding space is a very
sparse clone of the "0f38" one, switch around the simd_size overriding
for opcode 2D. This way fewer separate overrides are needed.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 5 Jun 2023 12:57:47 +0000 (14:57 +0200)]
x86emul: handle AVX512-FP16 fma-like insns
The Map6 encoding space is a very sparse clone of the "0f38" one. Once
again re-use that table, as the entries corresponding to invalid opcodes
in Map6 are simply benign with simd_size forced to other than simd_none
(preventing undue memory reads in SrcMem handling early in
x86_emulate()).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 5 Jun 2023 12:56:25 +0000 (14:56 +0200)]
x86emul: handle AVX512-FP16 Map5 arithmetic insns
This encoding space is a very sparse clone of the "twobyte" one. Re-use
that table, as the entries corresponding to invalid opcodes in Map5 are
simply benign with simd_size forced to other than simd_none (preventing
undue memory reads in SrcMem handling early in x86_emulate()).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 5 Jun 2023 12:55:07 +0000 (14:55 +0200)]
x86emul: handle AVX512-FP16 insns encoded in 0f3a opcode map
In order to re-use (also in subsequent patches) existing code and tables
as much as possible, simply introduce a new boolean field in emulator
state indicating whether an insn is one with a half-precision source.
Everything else then follows "naturally".
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 5 Jun 2023 12:53:54 +0000 (14:53 +0200)]
x86emul: rework compiler probing in the test harness
Checking for what $(SIMD) contains was initially right, but already the
addition of $(FMA) wasn't. Later categories (correctly) weren't added.
Instead what is of interest is anything the main harness source file
uses outside of suitable #if and without resorting to .byte, as that's
the one file (containing actual tests) which has to succeed in building.
The auxiliary binary blobs we utilize may fail to build; the resulting
empty blobs are recognized and reported as "n/a" when the harness is
run.
Note that strictly speaking we'd need to probe the assembler. We assume
that a compiler knowing of a certain ISA extension is backed by an
equally capable assembler.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 1 Jun 2023 14:26:02 +0000 (15:26 +0100)]
x86/ucode: Exit early from early_update_cache() if loading not available
If for any reason early_microcode_init() concludes that no microcode loading
is available, early_update_cache() will fall over a NULL function pointer:
which is actually parse_blob()'s use of ucode_ops.collect_cpu_info.
Skip trying to cache anything if microcode loading is unavailable.
Fixes: dc380df12acf ("x86/ucode: load microcode earlier on boot CPU") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 31 May 2023 15:26:56 +0000 (16:26 +0100)]
xen/cpu-policy: Add an IBRS -> AUTO_IBRS dependency
AUTO_IBRS is an extension over regular (AMD) IBRS, and needs hiding if IBRS is
levelled out for any reason.
Fixes: defaf651631a ("x86/hvm: Expose Automatic IBRS to guests") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Olaf Hering [Wed, 31 May 2023 16:06:56 +0000 (17:06 +0100)]
xentrace: remove return value from monitor_tbufs
The program is structured so that fatal errors cause exit() to be
called directly, rather than being passed up the stack; returning a
value here may mislead people into believing otherwise.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: George Dunlap <george.dunlap@cloud.com>
Jan Beulich [Wed, 31 May 2023 14:04:30 +0000 (16:04 +0200)]
vPCI: fix test harness build
The earlier commit introduced two uses of is_hardware_domain().
Fixes: 465217b0f872 ("vPCI: account for hidden devices") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Wed, 31 May 2023 10:01:11 +0000 (12:01 +0200)]
vPCI: account for hidden devices
Hidden devices (e.g. an add-in PCI serial card used for Xen's serial
console) are associated with DomXEN, not Dom0. This means that while
looking for overlapping BARs such devices cannot be found on Dom0's list
of devices; DomXEN's list also needs to be scanned.
Suppress vPCI init altogether for r/o devices (which constitute a subset
of hidden ones).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Stefano Stabellini <sstabellini@kernel.org>
In xen/include/public/io/9pfs.h the name of the Xenstore backend node
"security-model" should be "security_model", as this is how the Xen
tools are creating it and qemu is reading it.
Fixes: ad58142e73a9 ("xen/public: move xenstore related doc into 9pfs.h") Fixes: cf1d2d22fdfd ("docs/misc: Xen transport for 9pfs") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Oleksii Kurochko [Wed, 31 May 2023 09:59:53 +0000 (11:59 +0200)]
xen/riscv: align __bss_start
The bss clearing loop requires proper alignment of __bss_start.
ALIGN(PAGE_SIZE) before "*(.bss.page_aligned)" in xen.lds.S
was removed as any contribution to "*(.bss.page_aligned)" has to
specify proper alignment itself.
Fixes: cfa0409f7cbb ("xen/riscv: initialize .bss section") Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Bobby Eshleman <bobbyeshleman@gmail.com>
Oleksii Kurochko [Wed, 31 May 2023 09:55:58 +0000 (11:55 +0200)]
xen/riscv: introduce setup_initial_pages
The idea was taken from xvisor but the following changes
were done:
* Use only a minimal part of the code enough to enable MMU
* rename {_}setup_initial_pagetables functions
* add an argument to setup_initial_mapping to allow
setting PTE flags.
* update setup_initial_pagetables function to map sections
with correct PTE flags.
* Rewrite enable_mmu() to C.
* map linker addresses range to load addresses range without
1:1 mapping. It will be 1:1 only in case when
load_start_addr is equal to linker_start_addr.
* add safety checks such as:
* Xen size is less than page size
* linker addresses range doesn't overlap load addresses
range
* Rework macros {THIRD,SECOND,FIRST,ZEROETH}_{SHIFT,MASK}
* change PTE_LEAF_DEFAULT to RW instead of RWX.
* Remove phys_offset as it is not used now
* Remove alignment of {map, pa}_start &= XEN_PT_LEVEL_MAP_MASK(0);
in setup_initial_mapping() as they should be already aligned.
Make a check that {map, pa}_start are aligned.
* Remove clear_pagetables() as initial pagetables will be
zeroed during bss initialization
* Remove __attribute__((section(".entry")) for setup_initial_pagetables()
as there is no such section in xen.lds.S
* Update the argument of pte_is_valid() to "const pte_t *p"
* Add check that Xen's load address is aligned at 4k boundary
* Refactor setup_initial_pagetables() so it maps the linker
address range to the load address range. Afterwards, set up the needed
permissions for specific sections (such as .text, .rodata, etc.);
otherwise RW permission is set by default.
* Add function to check that requested SATP_MODE is supported
Origin: git@github.com:xvisor/xvisor.git 9be2fdd7 Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Bobby Eshleman <bobbyeshleman@gmail.com>
Andrew Cooper [Tue, 30 May 2023 15:03:16 +0000 (16:03 +0100)]
x86/spec-ctrl: Update hardware hints
* Rename IBRS_ALL to EIBRS. EIBRS is the term that everyone knows, and this
makes ARCH_CAPS_EIBRS match the X86_FEATURE_EIBRS form.
* Print RRSBA too, which is also a hint about behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
This is an AMD feature to reduce the IBRS handling overhead. Once enabled,
processes running at CPL=0 are automatically IBRS-protected even if
SPEC_CTRL.IBRS is not set. Furthermore, the RAS/RSB is cleared on VMEXIT.
The feature is exposed in CPUID and toggled in EFER.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 30 May 2023 10:00:34 +0000 (12:00 +0200)]
x86/vPIC: register only one ELCR handler instance
There's no point consuming two port-I/O slots. Even less so considering
that some real hardware permits both ports to be accessed in one go,
emulating of which requires there to be only a single instance.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Simplify the declarations by getting rid of the macro (and thus the
__aligned/__section/__used attributes) in the header. No functional change
intended as the macro/attributes are present in the respective definitions in
xen/arch/arm/mm.c.
Fixes: 1c78d76b67e1 ("xen/arm64: mm: Introduce helpers to prepare/enable/disable the identity mapping") Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> Acked-by: Julien Grall <jgrall@amazon.com>
Cyril Rébert [Tue, 30 May 2023 09:57:42 +0000 (11:57 +0200)]
tools/xenstore: remove deprecated parameter from xenstore commands help
Completing commit c65687e ("tools/xenstore: remove socket-only option from xenstore client").
As the socket-only option (-s) has been removed from the Xenstore access commands (xenstore-*),
also remove the parameter from the commands help (xenstore-* -h).
Luca Fancellu [Tue, 30 May 2023 09:57:02 +0000 (11:57 +0200)]
xen/misra: xen-analysis.py: Fix latent bug
Currently there is a latent bug that is not triggered because
the function cppcheck_merge_txt_fragments is called with the
parameter strip_paths having a list of only one element.
The bug is that the split function should not be in the
loop for strip_paths, but one level before, fix it.
Jan Beulich [Tue, 30 May 2023 09:54:55 +0000 (11:54 +0200)]
VMX/cpu-policy: check availability of RDTSCP and INVPCID
Both have separate enable bits, which are optional. While on real
hardware we can perhaps expect these VMX controls to be available if
(and only if) the base CPU feature is available, when running
virtualized ourselves this may not be the case.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
xen: dt: Replace u64 with uint64_t as the callback function parameters for dt_for_each_range()
In the callback functions invoked by dt_for_each_range() ie handle_pci_range(),
map_range_to_domain(), 'u64' should be replaced with 'uint64_t' as the data type
for the parameters. The reason being Xen coding style mentions that u32/u64
should be avoided.
Also dt_for_each_range() invokes the callback functions with 'uint64_t'
arguments. Thus, is_bar_valid() needs to change the parameter types accordingly.
xen/arm: domain_build: Check if the address fits the range of physical address
handle_pci_range() and map_range_to_domain() take addr and len as uint64_t
parameters. Then frame numbers are obtained from addr and len by right shifting
with PAGE_SHIFT. The frame numbers are expressed using unsigned long.
Now, if a 64-bit value is right shifted by PAGE_SHIFT, the result has 52
valid bits. On a 32-bit system, 'unsigned long' is 32 bits wide. Thus, there
is a potential loss of value when the result is stored as 'unsigned long'.
To mitigate this issue, we check if the starting and end address can be
contained within the range of physical address supported on the system. If not,
then an appropriate error is returned.
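A minimal sketch of such a range check, written to avoid overflow in the
end address computation. The function name and the paddr_bits parameter
are hypothetical; the real code may express the limit differently.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if [addr, addr + len) lies entirely within a physical
 * address space that is paddr_bits wide.
 */
static bool range_fits_paddr(uint64_t addr, uint64_t len,
                             unsigned int paddr_bits)
{
    uint64_t max = (paddr_bits >= 64) ? UINT64_MAX
                                      : (UINT64_C(1) << paddr_bits) - 1;

    if (addr > max)
        return false;
    if (len == 0)
        return true;

    /* Compare against the remaining space instead of computing addr + len. */
    return (len - 1) <= (max - addr);
}
```

A caller would return an error when this check fails, before any frame
number is derived from the address.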
xen/arm: smmu: Use writeq_relaxed_non_atomic() for writing to SMMU_CBn_TTBR0
Refer to ARM IHI 0062D.c ID070116 (SMMU 2.0 spec), 17-360, 17.3.9:
SMMU_CBn_TTBR0 is a 64 bit register. Thus, one can use
writeq_relaxed_non_atomic() to write to it instead of invoking
writel_relaxed() twice for lower half and upper half of the register.
This also helps us as p2maddr is 'paddr_t' (which may be u32 in future).
Thus, one can assign p2maddr to a 64 bit register and do the bit
manipulations on it, to generate the value for SMMU_CBn_TTBR0.
xen/arm: Introduce a wrapper for dt_device_get_address() to handle paddr_t
dt_device_get_address() can accept uint64_t only for address and size.
However, the address/size denotes physical addresses. Thus, they should
be represented by 'paddr_t'.
Consequently, we introduce a wrapper for dt_device_get_address(), i.e.
dt_device_get_paddr(), which accepts address/size as paddr_t and in turn
invokes dt_device_get_address() after converting address/size to
uint64_t.
The reason for introducing this is that in future 'paddr_t' may not
always be 64-bit. Thus, we need an explicit wrapper to do the type
conversion and return an error in case of truncation.
With this, callers can now invoke dt_device_get_paddr(). However, ns16550.c
is left unchanged as it requires some prior cleanup. For details, see
https://patchew.org/Xen/20230413173735.48387-1-ayan.kumar.halder@amd.com.
This will be addressed in a subsequent series.
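The truncation check at the heart of such a wrapper can be sketched as
follows. The 32-bit paddr_t is an assumption made here to show the
failure case, and to_paddr() is a hypothetical helper, not the actual
dt_device_get_paddr() implementation.

```c
#include <stdint.h>

typedef uint32_t paddr_t;           /* assumption: 32-bit physical addresses */

/*
 * Convert a 64-bit value read from the device tree to paddr_t.
 * Returns 0 on success, -1 if the value does not fit.
 */
static int to_paddr(uint64_t value, paddr_t *out)
{
    if (value != (paddr_t)value)
        return -1;                  /* upper bits would be lost */

    *out = (paddr_t)value;
    return 0;
}
```

When paddr_t is 64 bits wide the comparison can never fail, so the same
source works for both configurations.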
The DT functions (dt_read_number(), device_tree_get_reg(), fdt_get_mem_rsv())
currently accept or return 64-bit values.
In future when we support 32-bit physical address, these DT functions are
expected to accept/return 32-bit or 64-bit values (depending on the width of
physical address). Also, we wish to detect if any truncation has occurred
(i.e. while parsing 32-bit physical addresses from 64-bit values read from DT).
device_tree_get_reg() should now be able to return paddr_t. This is invoked by
various callers to get DT address and size.
For fdt_get_mem_rsv(), we have introduced a wrapper named
fdt_get_mem_rsv_paddr() which will invoke fdt_get_mem_rsv() and translate
uint64_t to paddr_t. The reason being we cannot modify fdt_get_mem_rsv() as it
has been imported from external source.
For dt_read_number(), we have also introduced a wrapper named
dt_read_paddr() to read physical addresses. We chose not to modify the original
function as it is used in places where it needs to specifically read 64-bit
values from dt (e.g. dt_property_read_u64()).
Xen prints a warning when it detects truncation in cases where it is not able
to return an error.
Also, replaced u32/u64 with uint32_t/uint64_t in the functions touched
by the code changes.
xen/arm: domain_build: Track unallocated pages using the frame number
rangeset_{xxx}_range() functions are invoked with 'start' and 'size' as
arguments which are either 'uint64_t' or 'paddr_t'. However, the function
accepts 'unsigned long' for 'start' and 'size'. 'unsigned long' is 32 bits for
Arm32. Thus, there is an implicit downcasting from 'uint64_t'/'paddr_t' to
'unsigned long' when invoking rangeset_{xxx}_range().
So, it may seem there is a possibility of loss of data due to truncation.
In reality, 'start' and 'size' are always page aligned. And Arm32 currently
supports 40 bits as the width of physical address.
So if the addresses are page aligned, the last 12 bits contain zeroes.
Thus, we could instead pass page frame number which will contain 28 bits (40-12
on Arm32) and this can be represented using 'unsigned long'.
On Arm64, this change will not induce any adverse side effect as the max
supported width of physical address is 48 bits. Thus, the width of 'gfn'
(ie 48 - 12 = 36) can be represented using 'unsigned long' (which is 64 bits
wide).
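The bit counting above can be checked with a short sketch. PAGE_SHIFT of
12 matches the 4K pages assumed in the text; the helper names are
hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12

/* Page frame number of a page-aligned physical address. */
static uint64_t addr_to_pfn(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}

/* Number of bits needed for a frame number given the address width. */
static unsigned int pfn_bits(unsigned int paddr_bits)
{
    return paddr_bits - PAGE_SHIFT;
}
```

The highest page of a 40-bit address space has a frame number that still
fits in 32 bits, which is why passing frame numbers is safe on Arm32.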
Roger Pau Monné [Fri, 26 May 2023 07:18:37 +0000 (09:18 +0200)]
vpci/header: cope with devices not having vpci allocated
When traversing the list of pci devices assigned to a domain cope with
some of them not having the vpci struct allocated. It should be
possible for the hardware domain to have read-only devices assigned
that are not handled by vPCI, such support will be added by further
patches.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Anthony PERARD [Fri, 26 May 2023 07:17:46 +0000 (09:17 +0200)]
build: use $(filechk, ) for all compat/.xlat/%.lst
Making use of filechk means that we don't have to use
$(move-if-changed,). It also means that we will sometimes have "UPD .." in
the build output when the target changed, rather than having "GEN ..."
all the time when "xlat.lst" happens to have a more recent modification
timestamp.
While there, replace `grep -v` by `sed '//d'` to avoid an extra
fork and pipe when building.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> Tested-by: Luca Fancellu <luca.fancellu@arm.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 26 May 2023 07:16:44 +0000 (09:16 +0200)]
x86/shadow: restrict OOS allocation to when it's really needed
PV domains won't use it, and even HVM ones won't when OOS is turned off
for them. There's therefore no point in putting extra pressure on the
(limited) pool of memory.
While there also zap the sh_type_to_size[] entry when OOS is disabled
altogether.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Yann Dirson [Fri, 26 May 2023 07:15:39 +0000 (09:15 +0200)]
docs: fix complex-and-wrong xenstore-path wording
"0 or 1 ... to indicate whether it is capable or incapable, respectively"
is luckily just swapped words. Making this shorter will
make the reading easier.
Jan Beulich [Fri, 26 May 2023 07:15:18 +0000 (09:15 +0200)]
build: shorten macro references
Presumably by copy-and-paste we've accumulated a number of instances of
$(@D)/$(@F), which really is nothing else than $@. The split form only
needs using when we want to e.g. insert a leading . at the beginning of
the file name portion of the full name.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> Acked-by: Alistair Francis <alistair.francis@wdc.com>
Roger Pau Monné [Thu, 25 May 2023 12:57:14 +0000 (14:57 +0200)]
x86/iommu: adjust type in arch_iommu_hwdom_init()
The 'i' iterator index stores a PDX, not a PFN, and hence the initial
assignment of start (which stores a PFN) needs a conversion from PFN
to PDX.
This is harmless currently, as the PDX compression skips the bottom
MAX_ORDER bits which cover the low 1MB, but still do the conversion
from PFN to PDX for type correctness.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 16 May 2023 13:07:43 +0000 (14:07 +0100)]
x86/cpufeature: Rework {boot_,}cpu_has()
One area where Xen deviates from Linux is that test_bit() forces a volatile
read. This leads to poor code generation, because the optimiser cannot merge
bit operations on the same word.
Drop the use of test_bit(), and write the expressions in regular C. This
removes the include of bitops.h (which is a frequent source of header
tangles), and it offers the optimiser far more flexibility.
with half of that in x86_emulate() alone. vmx_ctxt_switch_to() seems to be
the fastpath with the greatest delta at -24, where the optimiser has
successfully removed the branch hidden in cpu_has_msr_tsc_aux.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
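A rough illustration of a feature test written in regular C, which lets
the compiler combine several tests on the same word. The function name
and the layout of the capability words are assumptions of this sketch,
not the actual cpu_has() definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability words, 32 feature bits per word. */
static uint32_t caps[4];

/* Plain, non-volatile read of one feature bit. */
static bool has_feature(const uint32_t *words, unsigned int bit)
{
    return (words[bit / 32] & (UINT32_C(1) << (bit % 32))) != 0;
}
```

Because the read is not volatile, two calls testing bits in the same
word may be merged by the optimiser into a single load and mask.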
Andrew Cooper [Fri, 12 May 2023 14:53:35 +0000 (15:53 +0100)]
x86/boot: Expose MSR_ARCH_CAPS data in guest max policies
We already have common and default feature adjustment helpers. Introduce one
for max featuresets too.
Offer MSR_ARCH_CAPS unconditionally in the max policy, and stop clobbering the
data inherited from the Host policy. This will be necessary to level a VM
safely for migration. Annotate the ARCH_CAPS CPUID bit as special. Note:
ARCH_CAPS is still max-only for now, so will not be inherited by the default
policies.
With this done, the special case for dom0 can be shrunk to just resampling the
Host policy (as ARCH_CAPS isn't visible by default yet).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 12 May 2023 14:37:02 +0000 (15:37 +0100)]
x86/boot: Record MSR_ARCH_CAPS for the Raw and Host CPU policy
Extend x86_cpu_policy_fill_native() with a read of ARCH_CAPS based on the
CPUID information just read, removing the special handling in
calculate_raw_cpu_policy().
Right now, the only use of x86_cpu_policy_fill_native() outside of Xen is the
unit tests. Getting MSR data in this context is left to whomever first
encounters a genuine need to have it.
Extend generic_identify() to read ARCH_CAPS into x86_capability[], which is
fed into the Host Policy. This in turn means there's no need to special case
arch_caps in calculate_host_policy().
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 12 May 2023 17:50:59 +0000 (18:50 +0100)]
x86/cpu-policy: MSR_ARCH_CAPS feature names
Seed the default visibility from the dom0 special case, which for the most
part just exposes the *_NO bits. EIBRS is the one non-*_NO bit, which is
"just" a status bit to the guest indicating a change in implementation of IBRS
which is already fully supported.
Insert a block dependency from the ARCH_CAPS CPUID bit to the entire content
of the MSR. This is because MSRs have no structure information similar to
CPUID, and is used by x86_cpu_policy_clear_out_of_range_leaves(), in order to
bulk-clear inaccessible words.
The overall CPUID bit is still max-only, so all of MSR_ARCH_CAPS is hidden in
the default policies.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 15 May 2023 13:14:53 +0000 (14:14 +0100)]
x86/boot: Adjust MSR_ARCH_CAPS handling for the Host policy
We are about to move MSR_ARCH_CAPS into featureset, but the order of
operations (copy raw policy, then copy x86_capability[] in) will end up
clobbering the ARCH_CAPS value.
Some toolstacks use this information to handle TSX compatibility across the
CPUs and microcode versions where support was removed.
To avoid this transient breakage, read from raw_cpu_policy rather than
modifying it in place. This logic will be removed entirely in due course.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 12 May 2023 12:52:39 +0000 (13:52 +0100)]
x86/boot: Rework dom0 feature configuration
Right now, dom0's feature configuration is split between the common
path and a dom0-specific one. This mostly is by accident, and causes some
very subtle bugs.
First, start by clearly defining init_dom0_cpuid_policy() to be the domain
that Xen builds automatically. The late hwdom case is still constructed in a
mostly normal way, with the control domain having full discretion over the CPU
policy.
Identifying this highlights a latent bug - the two halves of the MSR_ARCH_CAPS
bodge are asymmetric with respect to the hardware domain. This means that
shim, or a control-only dom0 sees the MSR_ARCH_CAPS CPUID bit but none of the
MSR content. This in turn declares the hardware to be retpoline-safe by
failing to advertise the {R,}RSBA bits appropriately. Restrict this logic to
the hardware domain, although the special case will cease to exist shortly.
For the CPUID Faulting adjustment, the comment in ctxt_switch_levelling()
isn't actually relevant. Provide a better explanation.
Move the recalculate_cpuid_policy() call outside of the dom0-cpuid= case.
This is no change for now, but will become necessary shortly.
Finally, place the second half of the MSR_ARCH_CAPS bodge after the
recalculate_cpuid_policy() call. This is necessary to avoid transiently
breaking the hardware domain's view while the handling is cleaned up. This
special case will cease to exist shortly.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Wed, 24 May 2023 14:22:11 +0000 (16:22 +0200)]
x86: do away with HAVE_AS_NEGATIVE_TRUE
There's no real need for the associated probing - we can easily convert
to a uniform value without knowing the specific behavior (note also that
the respective comments weren't fully correct and have gone stale).
No difference in generated code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Daniel P. Smith [Wed, 24 May 2023 14:21:32 +0000 (16:21 +0200)]
maintainers: add regex matching for xsm
XSM is a subsystem where how and where its hooks are called is as important as
the implementation of the hooks. The people best suited to evaluating the how
and where are the XSM maintainers and reviewers. This creates a challenge, as
the hooks are used throughout the hypervisor, and the XSM maintainers and
reviewers are not, and should not be, reviewers for each of these subsystems in
the MAINTAINERS file. However, the MAINTAINERS file does support regex matches,
via the 'K' identifier, that are applied to both the commit message and the
commit delta. Adding the 'K' identifier declares that any patch relating to XSM
requires input from the XSM maintainers and reviewers. For those who use the
get_maintainers script, the 'K' identifier will automatically add the XSM
maintainers and reviewers. Anyone not using get_maintainers is responsible for
ensuring that the XSM maintainers and reviewers are copied if their work
touches an XSM hook.
This patch adds a pair of regex expressions to the XSM section. The first is
`xsm_.*` which seeks to match XSM hooks in the commit's delta. The second is
`\b(xsm|XSM)\b` which seeks to match strictly the words xsm or XSM and should
not capture words with a substring of "xsm".
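The behaviour of the two patterns can be checked with a short sketch. It uses
Python's re module as a stand-in for whatever regex engine the get_maintainers
script uses, which is an assumption; the sample strings and the touches_xsm()
helper are made up for illustration and are not part of the patch.

```python
import re

# The two patterns quoted in the commit message.
HOOK_RE = re.compile(r"xsm_.*")
WORD_RE = re.compile(r"\b(xsm|XSM)\b")


def touches_xsm(text):
    """Return True if either pattern matches anywhere in text (hypothetical helper)."""
    return bool(HOOK_RE.search(text) or WORD_RE.search(text))


# Illustrative inputs only; none of these are taken from a real patch.
samples = [
    "rc = xsm_domain_create(XSM_HOOK, d, ssidref);",  # hook call in a delta
    "xsm: fix policy load",                           # the word "xsm"
    "Rework XSM dummy policy",                        # the word "XSM"
    "update saxsmooth helper",                        # "xsm" only as a substring
]

for line in samples:
    print(touches_xsm(line), line)
```

The first three samples match and the last one does not, because `\b` requires
a word boundary on both sides of xsm or XSM.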
Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com> Acked-by: Julien Grall <jgrall@amazon.com>
sched/null: avoid crash after failed domU creation
When creating a domU, but the creation fails, there is a corner case that may
lead to a crash in the null scheduler when running a debug build of Xen.
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Assertion 'npc->unit == unit' failed at common/sched/null.c:379
(XEN) ****************************************
The events leading to the crash are:
* null_unit_insert() was invoked with the unit offline. Since the unit was
offline, unit_assign() was not called, and null_unit_insert() returned.
* Later during domain creation, the unit was onlined
* Eventually, domain creation failed due to bad configuration
* null_unit_remove() was invoked with the unit still online. Since the unit was
online, it called unit_deassign() and triggered an ASSERT.
To fix this, only call unit_deassign() when npc->unit is non-NULL in
null_unit_remove().
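The sequence of events can be replayed in a toy model. This is a Python
simulation, not the scheduler's C code: ToyUnit, ToyPcpu and all function
bodies are hypothetical, and the pre-fix removal logic is modelled only from
the description above (deassign whenever the unit is online). Whether the real
function keeps other checks is not stated in the message.

```python
class ToyUnit:
    """Hypothetical stand-in for a scheduling unit."""

    def __init__(self):
        self.online = False


class ToyPcpu:
    """Hypothetical stand-in for the per-CPU data (npc in the message)."""

    def __init__(self):
        self.unit = None


def unit_assign(npc, unit):
    npc.unit = unit


def unit_deassign(npc, unit):
    # Mirrors the assertion quoted in the panic message.
    if npc.unit is not unit:
        raise AssertionError("npc->unit == unit")
    npc.unit = None


def unit_insert(npc, unit):
    if unit.online:  # offline units are not assigned
        unit_assign(npc, unit)


def unit_remove_before(npc, unit):
    if unit.online:  # modelled pre-fix behaviour: keyed on the online state
        unit_deassign(npc, unit)


def unit_remove_after(npc, unit):
    if npc.unit is not None:  # the fix: keyed on whether anything is assigned
        unit_deassign(npc, unit)


def failed_creation(remove):
    """Replay the listed events; return True if the assertion fires."""
    npc, unit = ToyPcpu(), ToyUnit()
    unit_insert(npc, unit)  # inserted while offline, so nothing is assigned
    unit.online = True      # onlined later during domain creation
    try:
        remove(npc, unit)   # creation fails and the unit is removed
    except AssertionError:
        return True
    return False


print(failed_creation(unit_remove_before))
print(failed_creation(unit_remove_after))
```

In this model the assertion fires with the pre-fix removal and does not fire
once the removal is guarded on npc.unit.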
Yann Dirson [Mon, 22 May 2023 14:11:21 +0000 (16:11 +0200)]
docs: fix xenstore-paths doc structure
We currently have "Per Domain Paths" as an empty section, whereas it
looks like "General Paths" was not intended to include all the
following sections.
Signed-off-by: Yann Dirson <yann.dirson@vates.fr> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Olaf Hering [Wed, 17 May 2023 05:57:22 +0000 (05:57 +0000)]
automation: allow to rerun build script
Calling build twice in the same environment will fail because the
directory 'binaries' was already created before. Use mkdir -p to ignore
an existing directory and move on to the actual build.
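The difference between the two behaviours can be shown with a small sketch,
using Python's os.mkdir() and os.makedirs(exist_ok=True) as analogues of
mkdir and mkdir -p. The build_twice() helper and the temporary directory are
illustrative assumptions and not part of the build script.

```python
import os
import tempfile


def build_twice(make_dir):
    """Run the directory-creation step twice; return True if the rerun succeeds."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "binaries")
        make_dir(target)      # first build creates the directory
        try:
            make_dir(target)  # rerun in the same environment
        except FileExistsError:
            return False
        return True


def plain_mkdir(path):
    os.mkdir(path)  # like mkdir: fails if the path already exists


def mkdir_p(path):
    os.makedirs(path, exist_ok=True)  # like mkdir -p: an existing directory is fine


print(build_twice(plain_mkdir))
print(build_twice(mkdir_p))
```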
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Olaf Hering [Tue, 16 May 2023 15:41:27 +0000 (15:41 +0000)]
automation: update documentation about how to build a container
The command used in the example is different from the command used in
the Gitlab CI pipelines. Adjust it to simulate what will be used by CI.
This is essentially the build script, which is invoked with a number of
expected environment variables such as CC, CXX and debug.
In addition the input should not be a tty, which disables colors from
meson and interactive questions from kconfig.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Luca Fancellu [Thu, 4 May 2023 13:12:45 +0000 (14:12 +0100)]
xen/misra: xen-analysis.py: use the relative path from the ...
repository in the reports
Currently the cppcheck report entries show the relative file path
from the /xen folder of the repository instead of the base folder.
In order to ease the checks, for example when looking at git diff
output and the report, use the repository folder as base.
Currently Cppcheck has a limitation that prevents using make with a
parallel build and running a parallel Cppcheck invocation on each
translation unit (the .c files), because of spurious internal errors.
The issue comes from the fact that when using the build directory,
Cppcheck saves temporary files as <filename>.c.<many-extensions>, but
this doesn't work well when files with the same name are being
analysed at the same time, leading to race conditions.
Fix the issue by creating, under the build directory, the same directory
structure as the file being analysed to avoid any clash.
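A minimal sketch of the idea follows. The function names, the "build"
directory and the two source paths are illustrative assumptions; this is not
the code from xen-analysis.py, and posixpath is used only to keep the output
platform-independent.

```python
import posixpath


def flat_output_path(build_dir, src):
    """Old scheme as described: output keyed on the file name only."""
    return posixpath.join(build_dir, posixpath.basename(src))


def mirrored_output_path(build_dir, src):
    """New scheme: reproduce the source's directory structure under build_dir."""
    return posixpath.join(build_dir, posixpath.normpath(src))


# Two translation units that share a file name (paths chosen for illustration).
sources = ["arch/arm/setup.c", "arch/x86/setup.c"]

for src in sources:
    print(flat_output_path("build", src), mirrored_output_path("build", src))
```

With the flat scheme both files map to the same location, so two parallel
Cppcheck invocations would race on it. With the mirrored scheme each file gets
its own directory.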
Fixes: 02b26c02c7c4 ("xen/scripts: add cppcheck tool to the xen-analysis.py script") Signed-off-by: Luca Fancellu <luca.fancellu@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Olaf Hering [Fri, 12 May 2023 12:26:14 +0000 (12:26 +0000)]
tools: drop bogus and obsolete ptyfuncs.m4
According to openpty(3) it is required to include <pty.h> to get the
prototypes for openpty() and login_tty(). But this is not what the
function AX_CHECK_PTYFUNCS actually does. It makes no attempt to include
the required header.
The two source files which call openpty() and login_tty() already contain
the conditionals to include the required header.
Remove the bogus m4 file to fix build with clang, which complains about
calls to undeclared functions.
Remove usage of INCLUDE_LIBUTIL_H in libxl_bootloader.c, it is already
covered by inclusion of libxl_osdep.h.
Remove usage of PTYFUNCS_LIBS in libxl/Makefile, it is already covered
by UTIL_LIBS from config/StdGNU.mk.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>