Julien Grall [Fri, 12 Aug 2022 19:24:43 +0000 (20:24 +0100)]
xen/arm32: heap: Rework adr_l so it doesn't rely on where Xen is loaded
At the moment, the macro adr_l needs to know whether the caller
is running with the MMU on. This is fine today because there are
only two possible cases:
1) MMU off
2) MMU on and linked to the virtual address
This is still cumbersome to use for the developer as they need
to know if the MMU is on.
Thankfully, Linux developers came up with a great way to allow
adr_l to work within the range +/- 4GB of PC by emitting a PC-relative
reference [1].
Re-use the same approach on Arm and drop the parameter 'mmu'.
[1] 0b1674638a5c ("ARM: assembler: introduce adr_l, ldr_l and str_l macros")
Julien Grall [Fri, 12 Aug 2022 19:24:42 +0000 (20:24 +0100)]
xen/arm32: head: Introduce get_table_slot() and use it
There are a few places in the code that need to find the slot at a
given page-table level.
So create a new macro get_table_slot() for that. This will reduce
the effort to figure out whether the code is doing the right thing.
The new macro is using 'ubfx' (or 'lsr' for the first level) rather
than the existing sequence (mov_w, lsr, and) because it doesn't require
a scratch register and reduces the number of instructions (4 -> 1).
Julien Grall [Fri, 12 Aug 2022 19:24:41 +0000 (20:24 +0100)]
xen/arm64: head: Introduce get_table_slot() and use it
There are a few places in the code that need to find the slot
at a given page-table level.
So create a new macro get_table_slot() for that. This will reduce
the effort to figure out whether the code is doing the right thing.
Take the opportunity to use 'ubfx'. The only benefit is reducing
the number of instructions from 2 to 1.
The new macro is used everywhere we need to compute the slot. This
requires tweaking the parameters of create_table_entry() to pass
a level rather than a shift.
Note, for slot 0 the code is currently skipping the masking part. While
this is fine, it is safer to mask it as technically slot 0 only covers
bits 47:39 (assuming 4KB page granularity).
Take the opportunity to correct the comment when finding the second
slot for the identity mapping (we are computing the second slot
rather than first).
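As a rough illustration of what get_table_slot() computes, the following C sketch extracts the slot index for a given page-table level. It assumes a 4KB granule with 9 index bits per level and levels numbered 0 to 3; the names PAGE_SHIFT_4K, LPAE_SHIFT, level_shift() and table_slot() are hypothetical and are not taken from the Xen sources.

```c
#include <stdint.h>

/* Assumptions for this sketch: 4KB pages, 9 index bits per level, levels 0..3. */
#define PAGE_SHIFT_4K 12u
#define LPAE_SHIFT    9u

/* Bit position where the index for 'level' starts (level 3 -> 12, level 0 -> 39). */
static unsigned int level_shift(unsigned int level)
{
    return PAGE_SHIFT_4K + LPAE_SHIFT * (3u - level);
}

/* C equivalent of a single bitfield extract: shift down, then keep LPAE_SHIFT bits. */
static unsigned int table_slot(uint64_t va, unsigned int level)
{
    return (unsigned int)((va >> level_shift(level)) & ((1u << LPAE_SHIFT) - 1u));
}
```

Masking at level 0 as well means that any address bits above the level-0 index are ignored instead of leaking into the slot number.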
Jan Beulich [Wed, 24 Aug 2022 12:33:06 +0000 (14:33 +0200)]
Arm32: correct string.h functions for "int" -> "unsigned char" conversion
While Arm64 does so uniformly, for Arm32 only strchr() currently handles
this properly. Add the necessary conversion also to strrchr(), memchr(),
and memset().
As to the placement in memset(): Putting the new insn at the beginning
of the function is apparently deemed more "obvious". It could be placed
later, as the code reachable without ever making it to the "1" label
only ever does byte stores.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
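The conversion in question can be sketched in C. The helper below is a hypothetical memchr()-like function (find_byte() is not a Xen or libc name) that converts its "int" argument to "unsigned char" before comparing, which is the behaviour the C standard specifies for these functions.

```c
#include <stddef.h>

/* Hypothetical memchr()-like helper: 'c' is converted to unsigned char first. */
static const void *find_byte(const void *s, int c, size_t n)
{
    const unsigned char *p = s;
    unsigned char needle = (unsigned char)c; /* the "int" -> "unsigned char" step */

    for (size_t i = 0; i < n; i++)
        if (p[i] == needle)
            return &p[i];

    return NULL;
}
```

Without the conversion, a caller passing a value outside 0..255 (for example a negative "char" promoted to "int") would never match any byte.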
Jan Beulich [Wed, 24 Aug 2022 12:23:59 +0000 (14:23 +0200)]
x86/CPUID: surface suitable value in EBX of XSTATE subleaf 1
While the SDM isn't very clear about this, our present behavior makes
Linux 5.19 unhappy. As of commit 8ad7e8f69695 ("x86/fpu/xsave: Support
XSAVEC in the kernel") they're using this CPUID output also to size
the compacted area used by XSAVEC. Getting back zero there isn't really
liked, yet for PV that's the default on capable hardware: XSAVES isn't
exposed to PV domains.
Considering that the size reported is that of the compacted save area,
I view Linux's assumption as appropriate (short of the SDM properly
considering the case). Therefore we need to populate the field also when
only XSAVEC is supported for a guest.
Fixes: 460b9a4b3630 ("x86/xsaves: enable xsaves/xrstors for hvm guest") Fixes: 8d050ed1097c ("x86: don't expose XSAVES capability to PV guests") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
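To make the "size of the compacted save area" concrete, here is a minimal C sketch of how such a size can be accumulated: the 512-byte legacy region, the 64-byte XSAVE header, then each enabled component from index 2 upwards, optionally aligned to 64 bytes. The struct xstate_comp table and the function compacted_size() are hypothetical; real per-component sizes and alignment flags come from CPUID leaf 0xD, and this is not the code Xen uses.

```c
#include <stdint.h>

#define XSAVE_LEGACY_SIZE 512u /* legacy x87/SSE region */
#define XSAVE_HDR_SIZE     64u /* XSAVE header */

/* Hypothetical per-component description, as CPUID leaf 0xD would report it. */
struct xstate_comp {
    uint32_t size;  /* size in bytes */
    int align64;    /* non-zero: start on a 64-byte boundary when compacted */
};

/* 'nr' is the number of entries in 'comp' and is assumed to be at most 64. */
static uint32_t compacted_size(uint64_t enabled,
                               const struct xstate_comp *comp, unsigned int nr)
{
    uint32_t size = XSAVE_LEGACY_SIZE + XSAVE_HDR_SIZE;

    /* Components 0 and 1 (x87, SSE) live in the legacy region. */
    for (unsigned int i = 2; i < nr; i++) {
        if (!(enabled & (1ull << i)))
            continue;
        if (comp[i].align64)
            size = (size + 63u) & ~63u;
        size += comp[i].size;
    }

    return size;
}
```

A guest that only has XSAVEC still needs this number to size its buffers, which is why reporting zero is a problem.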
Xenia Ragiadakou [Wed, 24 Aug 2022 12:23:00 +0000 (14:23 +0200)]
arm/processor: fix MISRA C 2012 Rule 20.7 violations
In macros MPIDR_LEVEL_SHIFT() and MPIDR_AFFINITY_LEVEL(), add parentheses
around the macro parameters 'level' and 'mpidr', respectively, to protect
against unintended expansions.
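The kind of unintended expansion that Rule 20.7 guards against can be shown with a small C sketch. The two macros below are hypothetical and are not the Xen definitions of MPIDR_LEVEL_SHIFT() or MPIDR_AFFINITY_LEVEL(); they only contrast an unparenthesised parameter with a parenthesised one.

```c
/* Hypothetical macros for illustration only, not the Xen definitions. */
#define LEVEL_SHIFT_UNSAFE(level) (level * 8)   /* parameter not parenthesised */
#define LEVEL_SHIFT_SAFE(level)   ((level) * 8) /* parameter parenthesised */

/* Expands to (a + b * 8): the multiplication binds to 'b' only. */
static int shift_unsafe(int a, int b)
{
    return LEVEL_SHIFT_UNSAFE(a + b);
}

/* Expands to ((a + b) * 8), which is what the caller meant. */
static int shift_safe(int a, int b)
{
    return LEVEL_SHIFT_SAFE(a + b);
}
```

With a = 1 and b = 1 the first helper yields 9 and the second yields 16.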
Xenia Ragiadakou [Wed, 24 Aug 2022 12:21:26 +0000 (14:21 +0200)]
arm/gic_v3_its: fix MISRA C 2012 Rule 20.7 violations
In macros GITS_TYPER_DEVICE_ID_BITS(), GITS_TYPER_EVENT_ID_BITS() and
GITS_BASER_ENTRY_SIZE(), add parentheses around the macro parameter to
protect against unintended expansions.
Realign subsequent lines, if any.
Penny Zheng [Tue, 16 Aug 2022 02:36:53 +0000 (10:36 +0800)]
xen: add field "flags" to cover all internal CDF_XXX
With more and more internal CDF_xxx flags being added, and to save space,
this commit introduces a new field "flags" in struct domain to store the
internal CDF_* flags directly.
Another new CDF_xxx will be introduced in the next patch.
Penny Zheng [Tue, 16 Aug 2022 02:36:52 +0000 (10:36 +0800)]
xen: do not merge reserved pages in free_heap_pages()
The code in free_heap_pages() will try to merge pages with the
successor/predecessor if pages are suitably aligned. So if the pages
reserved are right next to the pages given to the heap allocator,
free_heap_pages() will merge them, and give the reserved pages to heap
allocator accidentally as a result.
So in order to avoid the above scenario, this commit updates free_heap_pages()
to check whether the predecessor and/or successor has PGC_static set,
when trying to merge the about-to-be-freed chunk with the predecessor
and/or successor.
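The added check can be pictured with the following C sketch. The flag values, struct page and can_merge() are hypothetical simplifications (Xen's struct page_info and PGC_* values differ); the point is only that a neighbouring chunk with the static flag set is never considered for merging.

```c
#include <stdbool.h>

/* Hypothetical flag values and page descriptor; Xen's real ones differ. */
#define PGC_STATE_FREE 0x1ul
#define PGC_STATIC     0x2ul

struct page {
    unsigned long count_info; /* PGC_* flags */
    unsigned int order;       /* order of the free chunk headed by this page */
};

/* A neighbour is mergeable only if free, of the same order, and not static. */
static bool can_merge(const struct page *neighbour, unsigned int order)
{
    if (neighbour->count_info & PGC_STATIC)
        return false;

    return (neighbour->count_info & PGC_STATE_FREE) &&
           neighbour->order == order;
}
```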
Rahul Singh [Thu, 11 Aug 2022 15:42:04 +0000 (16:42 +0100)]
xen/arm: smmu: Set s2cr to type fault when the devices are deassigned
When devices are deassigned/assigned, SMMU global fault is observed
because SMEs are freed in detach function and not allocated again when
the device is assigned back to the guest.
Don't free the SMEs when devices are deassigned, set the s2cr to type
fault. This way the SMMU will generate a fault if a DMA access is done
by a device not assigned to a guest.
Remove the arm_smmu_master_free_smes() as this is not needed anymore,
arm_smmu_write_s2cr() will be used to set the s2cr to type fault.
Andrew Cooper [Mon, 22 Aug 2022 21:17:18 +0000 (22:17 +0100)]
x86/domain: Fix struct domain memory corruption when building PV guests
arch_domain_create() can't blindly write into d->arch.hvm union. Move the
logic into hvm_domain_initialise(), which involves passing config down.
Fixes: 2ce11ce249a3 ("x86/HVM: allow per-domain usage of hardware virtualized APIC") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Mon, 22 Aug 2022 12:46:39 +0000 (13:46 +0100)]
x86/entry: Fix !PV build
early_page_fault() needs to be outside of #ifdef CONFIG_PV.
Spotted by Gitlab CI.
Fixes: fe3f50726e87 ("x86/entry: move .init.text section higher up in the code for readability") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Mon, 22 Aug 2022 10:10:00 +0000 (12:10 +0200)]
xenbaked: properly use time_t in dump_stats()
"int" is not a suitable type to convert time()'s return value to. Avoid
casts and other extra fiddling by using difftime(), on the assumption
that the overhead of using "double" doesn't matter here.
Coverity ID: 1509374 Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
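A minimal C sketch of the difftime() approach follows. The helper name elapsed_seconds() is hypothetical and is not the code in dump_stats(); it only shows that the subtraction is left to the library, so no cast of time_t to "int" is needed.

```c
#include <time.h>

/* Elapsed seconds between two time() samples, with no time_t -> int cast. */
static double elapsed_seconds(time_t start, time_t end)
{
    return difftime(end, start);
}
```

A caller would typically sample time(NULL) at the start and end of an interval and pass both values in.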
Anthony PERARD [Mon, 22 Aug 2022 10:09:10 +0000 (12:09 +0200)]
tools/helper: Cleanup Makefile
Use $(TARGETS) to collect targets. Use := for the first target instead
of +=.
Collect library to link against in $(LDLIBS).
Remove the extra "-f" flag that is already part of $(RM).
Anthony PERARD [Mon, 22 Aug 2022 10:09:07 +0000 (12:09 +0200)]
tools: Introduce $(xenlibs-ldlibs, ) macro
This can be used when linking against multiple in-tree Xen libraries,
and avoids duplicated flags. It can be used instead of multiple
$(LDLIBS_libxen*).
For now, replace the open-coding in libs.mk.
The macro $(xenlibs-libs, ) will be useful later when only the path to
the libraries is wanted (e.g. for checking for dependencies).
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>
Anthony PERARD [Mon, 22 Aug 2022 10:09:05 +0000 (12:09 +0200)]
tools: Introduce $(xenlibs-rpath,..) to replace $(SHDEPS_lib*)
This patch introduces a new macro $(xenlibs-dependencies,) to generate
a list of all the Xen libraries that a library is linked against, with
each listed only once. We use the side effect of $(sort ) which removes
duplicates.
This is used by another macro $(xenlibs-rpath,) which is to replace
$(SHDEPS_libxen*).
In libs.mk, we don't need to $(sort ) SHLIB_lib* anymore as this was used
to remove duplicates and there are no more duplicates.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>
Roger Pau Monné [Mon, 15 Aug 2022 07:58:55 +0000 (09:58 +0200)]
amd/msr: implement VIRT_SPEC_CTRL for HVM guests using legacy SSBD
Expose VIRT_SSBD to guests if the hardware supports setting SSBD in
the LS_CFG MSR (a.k.a. non-architectural way). Different AMD CPU
families use different bits in LS_CFG, so exposing VIRT_SPEC_CTRL.SSBD
allows for a unified way of exposing SSBD support to guests on AMD
hardware that's compatible migration-wise, regardless of what
underlying mechanism is used to set SSBD.
Note that on AMD Family 17h and Hygon Family 18h processors the value
of SSBD in LS_CFG is shared between threads on the same core, so
there's extra logic in order to synchronize the value and have SSBD
set as long as one of the threads in the core requires it to be set.
Such logic also requires extra storage for each thread state, which is
allocated at initialization time.
Do the context switching of the SSBD selection in LS_CFG between
hypervisor and guest in the same handler that's already used to switch
the value of VIRT_SPEC_CTRL.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Henry Wang <Henry.Wang@arm.com>
Re-committed with a tag removed.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
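The "set as long as one of the threads in the core requires it" rule can be sketched in C with a per-core counter. Everything here is hypothetical (struct core_ssbd, core_update_ssbd()), locking between sibling threads is deliberately left out, and this is not presented as the way Xen implements it.

```c
#include <stdbool.h>

/* Hypothetical per-core state: how many sibling threads currently want SSBD. */
struct core_ssbd {
    unsigned int count;
};

/*
 * Record that one thread changed its request from 'was' to 'want' and return
 * whether the shared LS_CFG bit should be set afterwards. A real
 * implementation would need a lock around this update.
 */
static bool core_update_ssbd(struct core_ssbd *core, bool was, bool want)
{
    if (want && !was)
        core->count++;
    else if (!want && was)
        core->count--;

    return core->count != 0;
}
```

The bit is only cleared once the last thread that asked for SSBD has dropped its request.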
Roger Pau Monné [Mon, 15 Aug 2022 07:58:08 +0000 (09:58 +0200)]
amd/msr: allow passthrough of VIRT_SPEC_CTRL for HVM guests
Allow HVM guests access to MSR_VIRT_SPEC_CTRL if the platform Xen is
running on has support for it. This requires adding logic in the
vm{entry,exit} paths for SVM in order to context switch between the
hypervisor value and the guest one. The added handlers for context
switch will also be used for the legacy SSBD support.
Introduce a new synthetic feature leaf (X86_FEATURE_VIRT_SC_MSR_HVM)
to signal whether VIRT_SPEC_CTRL needs to be handled on guest
vm{entry,exit}.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Re-committed with a tag removed.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Roger Pau Monné [Mon, 15 Aug 2022 07:57:23 +0000 (09:57 +0200)]
amd/msr: implement VIRT_SPEC_CTRL for HVM guests on top of SPEC_CTRL
Use the logic to set shadow SPEC_CTRL values in order to implement
support for VIRT_SPEC_CTRL (signaled by VIRT_SSBD CPUID flag) for HVM
guests. This includes using the spec_ctrl vCPU MSR variable to store
the guest set value of VIRT_SPEC_CTRL.SSBD, which will be OR'ed with
any SPEC_CTRL values being set by the guest.
On hardware having SPEC_CTRL VIRT_SPEC_CTRL will not be offered by
default to guests. VIRT_SPEC_CTRL will only be part of the max CPUID
policy so it can be enabled for compatibility purposes.
Use '!' to annotate the feature in order to express that the presence
of the bit is not directly tied to its value in the host policy.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Re-committed with a tag removed.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
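The OR'ing described above amounts to the following C sketch. The function effective_spec_ctrl() is a hypothetical name; the only hardware fact relied upon is that SSBD is bit 2 of SPEC_CTRL.

```c
#include <stdbool.h>
#include <stdint.h>

#define SPEC_CTRL_SSBD (1u << 2) /* SSBD is bit 2 of SPEC_CTRL */

/*
 * Value to load into SPEC_CTRL while the guest runs: the guest's own
 * SPEC_CTRL value OR'ed with its VIRT_SPEC_CTRL.SSBD request.
 */
static uint32_t effective_spec_ctrl(uint32_t guest_spec_ctrl,
                                    bool guest_virt_ssbd)
{
    return guest_spec_ctrl | (guest_virt_ssbd ? SPEC_CTRL_SSBD : 0u);
}
```

Because the two requests are OR'ed, SSBD stays enabled if the guest asked for it through either interface.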
George Dunlap [Fri, 19 Aug 2022 19:18:46 +0000 (20:18 +0100)]
Temporarily revert "amd/msr: implement VIRT_SPEC_CTRL for HVM guests on top of SPEC_CTRL"
A person tagged in commit ebaaa72ee080c8774b1df5783220d4811159c327
claims the tag is inaccurate; revert this commit so that we can
re-commit it with the tag corrected.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
George Dunlap [Fri, 19 Aug 2022 19:17:30 +0000 (20:17 +0100)]
Temporarily revert "amd/msr: allow passthrough of VIRT_SPEC_CTRL for HVM guests"
A person tagged in commit a2eeaa6906101fbf322766f37f8f061dd36fe58d
claims the tag is inaccurate; revert this commit so that we can
re-commit it with the tag corrected.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
George Dunlap [Fri, 19 Aug 2022 19:15:22 +0000 (20:15 +0100)]
Temporarily revert "amd/msr: implement VIRT_SPEC_CTRL for HVM guests using legacy SSBD"
A person tagged in commit 646589ac148a2ff6bb222a6081b4d7b13ee468c0
claims the tag is inaccurate; revert this commit so that we can
re-commit it with the tag corrected.
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Thu, 18 Aug 2022 07:30:41 +0000 (09:30 +0200)]
x86: rework hypercall argument count table instantiation & use
The initial observation was duplicate symbols that our checking warns
about. Instead of merely renaming one or both pair(s) of symbols,
reduce #ifdef-ary at the same time by moving the instantiation of the
arrays into a macro. While doing the conversion also stop open-coding
array_access_nospec().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com>
Jane Malalane [Thu, 18 Aug 2022 07:30:10 +0000 (09:30 +0200)]
x86/entry: move .init.text section higher up in the code for readability
.init.text is a small section currently located amongst .text.entry
code. Move it above .text.entry.
This has no functional change but makes the code a bit more readable.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jane Malalane <jane.malalane@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 18 Aug 2022 07:29:34 +0000 (09:29 +0200)]
x86/P2M: allow 2M superpage use for shadowed guests
For guests in shadow mode the P2M table gets used only by software. The
only place where it matters whether superpages in the P2M can be dealt
with is sh_unshadow_for_p2m_change(): The table is never made accessible
to hardware for address translation, and the only checks of _PAGE_PSE in
P2M entries in shadow code are in this function (all others are against
guest page table entries). That function has been capable of handling
them even before commit 0ca1669871f8a ("P2M: check whether hap mode is
enabled before using 2mb pages") disabled 2M use in this case for
dubious reasons ("potential errors when hap is disabled").
While doing this, move "order" into more narrow scope and replace the
local variable "d" by a new "hap" one.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
In preparation for reactivating the presently dead 2M page path of the
function, also deal with the case of replacing an L1 page table all in
one go. Note that the prior comparing of MFNs to bypass the removal of
shadows was insufficient (but kind of benign, for being dead code so
far) - at the very least the R/W bit also needs considering there (to be
on the safe side, compare the full [virtual] PTEs).
While adjusting the first conditional in the loop for the use of the new
local variable "nflags", also drop mfn_valid(): If anything we'd need to
compare against INVALID_MFN, but that won't come out of l1e_get_mfn().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
Pull common checks out of the switch(). This includes extending a
_PAGE_PRESENT check to L1 as well, which presumably was deemed redundant
with p2m_is_valid() || p2m_is_grant(), but I think we are better off
being explicit in all cases. Note that for L2 (or higher) the grant
check isn't strictly necessary, as grants are only ever single pages.
Leave a respective assertion.
With _PAGE_PRESENT checked uniformly, the suspicious mfn_valid(omfn)
checks can be dropped rather than moved/folded - if anything we'd need
to compare against INVALID_MFN, but that won't come out of l1e_get_mfn().
For L1 replace the moved out condition with a PTE comparison: There's
no need for any update or flushing when the two match.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
Replace a p2m_is_ram() check in the 2M case by an explicit _PAGE_PRESENT
one, to make more obvious that the subsequent l1e_get_mfn() actually
retrieves something that really is an MFN. It doesn't really matter
whether it's RAM, as the subsequent comparison with the original MFN is
going to lead to zapping of everything except the "same MFN again" case.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
Anthony PERARD [Thu, 18 Aug 2022 07:25:50 +0000 (09:25 +0200)]
tools/libxl: Replace deprecated -soundhw on QEMU command line
-soundhw is deprecated since 825ff02911c9 ("audio: add soundhw
deprecation notice"), QEMU v5.1, and has been removed for the upcoming v7.1
by 039a68373c45 ("introduce -audio as a replacement for -soundhw").
Instead we can just add the sound card with "-device", for most options
that "-soundhw" could handle. "-device" is an option that existed
before QEMU 1.0, and could already be used to add audio hardware.
The list of possible options for libxl's "soundhw" is taken from the list
in QEMU 7.0.
The options for "soundhw" are listed in order of preference in
the manual. The first three (hda, ac97, es1370) are PCI devices and
easy to test on Linux, and the last four are ISA devices which don't
seem to work out of the box on Linux.
The sound card 'pcspk' isn't listed even if it used to be accepted by
'-soundhw' because QEMU crashes when trying to add it to a Xen domain.
Also, it wouldn't work with "-device"; it might need to be "-machine
pcspk-audiodev=default" instead.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Anthony PERARD [Wed, 17 Aug 2022 15:21:06 +0000 (16:21 +0100)]
build: Fix missing MAKEFLAGS --no-print-directory
While we already have "--no-print-directory" added to the make flags
in some cases, there's one case where the flag is missing, when doing
an out-of-tree build with O=, e.g.
cd xen; make O=build
Without it, we just have loads of "Entering directory" and "Leaving
directory" with the same directory.
The comment and location in the Makefile are copied from Linux.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 12 Aug 2022 17:25:55 +0000 (18:25 +0100)]
x86/traps: Make nmi_show_execution_state() more useful
* Always emit current. It's critically important.
* Do not render (0000000000000000) for the symbol in guest context. It's
just line-noise. Instead, explicitly identify Xen vs guest context.
* Try to tabulate the data, because there is often lots of it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Penny Zheng [Tue, 16 Aug 2022 09:23:56 +0000 (11:23 +0200)]
xen/arm: rename PGC_reserved to PGC_static
PGC_reserved could be ambiguous, and we have to tell what the pages are
reserved for, so this commit intends to rename PGC_reserved to
PGC_static, which clearly indicates the page is reserved for static
memory.
drivers/char: add support for selecting specific xhci
Handle parameters similar to dbgp=ehci.
Implement this by not resetting dbc->sbdf again in dbc_init_xhc(), but
using a value found there if non-zero. Additionally, add xue->xhc_num to
select n-th controller.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
drivers/char: reset XHCI ports when initializing dbc
Reset ports, to force host system to re-enumerate devices. Otherwise it
will require the cable to be re-plugged, or will wait in the
"configuring" state indefinitely.
Trick and code copied from Linux:
drivers/usb/early/xhci-dbc.c:xdbc_start()->xdbc_reset_debug_port()
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Connor Davis [Tue, 16 Aug 2022 09:20:01 +0000 (11:20 +0200)]
drivers/char: add support for USB3 DbC debugger
[Connor]
Xue is a cross-platform USB 3 debugger that drives the Debug
Capability (DbC) of xHCI-compliant host controllers. This patch
implements the operations needed for xue to initialize the host
controller's DbC and communicate with it. It also implements a struct
uart_driver that uses xue as a backend. Note that only target -> host
communication is supported for now. To use Xue as a console, add
'console=dbgp dbgp=xhci' to the command line.
[Marek]
The Xue driver is taken from https://github.com/connojd/xue and heavily
refactored to fit into Xen code base. Major changes include:
- rename to xhci_dbc
- drop support for non-Xen systems
- drop xue_ops abstraction
- use Xen's native helper functions for PCI access
- move all the code to xue.c, drop "inline"
- build for x86 only
- annotate functions with cf_check
- adjust for Xen's code style
At this stage, only the first xHCI is considered, and only output is
supported. Later patches add support for choosing a specific device, and
input handling.
The driver is initialized before the memory allocator works, so all the
transfer buffers (about 230KiB of them) are allocated statically and will
use memory even if the XUE console is not selected. The driver can be
disabled at build time to reclaim this memory.
Most of this memory is shared with the controller via DMA. A later patch
will adjust structure placement to avoid anything else being placed on
those DMA-reachable pages. This also means str_buf cannot use a static
initializer without reserving (at least) a whole page in .data (or
more, when combined with other structures).
Signed-off-by: Connor Davis <davisc@ainfosec.com> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Anthony PERARD [Tue, 16 Aug 2022 09:18:39 +0000 (11:18 +0200)]
tools/flask/utils: list build targets in $(TARGETS)
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Henry Wang <Henry.Wang@arm.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Roger Pau Monné [Mon, 15 Aug 2022 07:58:55 +0000 (09:58 +0200)]
amd/msr: implement VIRT_SPEC_CTRL for HVM guests using legacy SSBD
Expose VIRT_SSBD to guests if the hardware supports setting SSBD in
the LS_CFG MSR (a.k.a. non-architectural way). Different AMD CPU
families use different bits in LS_CFG, so exposing VIRT_SPEC_CTRL.SSBD
allows for a unified way of exposing SSBD support to guests on AMD
hardware that's compatible migration-wise, regardless of what
underlying mechanism is used to set SSBD.
Note that on AMD Family 17h and Hygon Family 18h processors the value
of SSBD in LS_CFG is shared between threads on the same core, so
there's extra logic in order to synchronize the value and have SSBD
set as long as one of the threads in the core requires it to be set.
Such logic also requires extra storage for each thread state, which is
allocated at initialization time.
Do the context switching of the SSBD selection in LS_CFG between
hypervisor and guest in the same handler that's already used to switch
the value of VIRT_SPEC_CTRL.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Henry Wang <Henry.Wang@arm.com>
Roger Pau Monné [Mon, 15 Aug 2022 07:58:08 +0000 (09:58 +0200)]
amd/msr: allow passthrough of VIRT_SPEC_CTRL for HVM guests
Allow HVM guests access to MSR_VIRT_SPEC_CTRL if the platform Xen is
running on has support for it. This requires adding logic in the
vm{entry,exit} paths for SVM in order to context switch between the
hypervisor value and the guest one. The added handlers for context
switch will also be used for the legacy SSBD support.
Introduce a new synthetic feature leaf (X86_FEATURE_VIRT_SC_MSR_HVM)
to signal whether VIRT_SPEC_CTRL needs to be handled on guest
vm{entry,exit}.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Mon, 15 Aug 2022 07:57:23 +0000 (09:57 +0200)]
amd/msr: implement VIRT_SPEC_CTRL for HVM guests on top of SPEC_CTRL
Use the logic to set shadow SPEC_CTRL values in order to implement
support for VIRT_SPEC_CTRL (signaled by VIRT_SSBD CPUID flag) for HVM
guests. This includes using the spec_ctrl vCPU MSR variable to store
the guest set value of VIRT_SPEC_CTRL.SSBD, which will be OR'ed with
any SPEC_CTRL values being set by the guest.
On hardware having SPEC_CTRL VIRT_SPEC_CTRL will not be offered by
default to guests. VIRT_SPEC_CTRL will only be part of the max CPUID
policy so it can be enabled for compatibility purposes.
Use '!' to annotate the feature in order to express that the presence
of the bit is not directly tied to its value in the host policy.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Anthony PERARD [Mon, 15 Aug 2022 06:55:25 +0000 (08:55 +0200)]
tools/xentop: rework makefile
Add "xentop" to "TARGETS" because this variable will be useful later.
Always define all the targets, even when configured with
--disable-monitor, instead don't visit the subdirectory.
This means xentop/ isn't visited anymore during "make clean", which is how
most other subdirs in tools/ work.
Also add the missing "xentop" rules. It only works without them because we
still have make's built-in rules and variables, but fix this to not
have to rely on them.
Use $(TARGETS) with $(INSTALL_PROG), and thus install into the
directory rather than spelling the program name.
In the "clean" rule, use $(RM) and remove all "*.o" instead of just
one object.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Anthony PERARD [Mon, 15 Aug 2022 06:55:21 +0000 (08:55 +0200)]
tools/libfsimage: Cleanup makefiles
Remove the need for "fs-*" targets by creating a "common.mk" which
has flags that are common to libfsimage/common/ and the other
libfsimages/*/ directories.
In common.mk, make $(PIC_OBJS) a recursively expanded variable so it
doesn't matter where $(LIB_SRCS-y) is defined, and remove the extra
$(PIC_OBJS) from libfsimage/common/Makefile.
Use a $(TARGETS) variable to list things to be built. And $(TARGETS)
can be used in the clean target in common.mk.
iso9660/:
Remove the explicit dependency between fsys_iso9660.c and
iso9660.h; this is handled automatically by the .*.d dependency files,
and iso9660.h already exists.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com>
Rework dependencies of all objects. We don't need to add dependencies
for headers that $(CC) is capable of generating, we only need to
include $(DEPS_INCLUDE). Some dependencies are still needed so make
knows to generate symlinks for them.
We remove the use of "vpath" for cpuid.c. While it works fine for now,
when we will convert this makefile to subdirmk, vpath will not be
usable. Also, "-iquote" is now needed to build "cpuid.o".
Replace "-I." by "-iquote .", so it applies to double-quote includes
only.
Rather than checking if a symlink exists, always regenerate the
symlink. So if the source tree changed location, the symlink is
updated.
Since we are creating a new .gitignore for the symlink, also move the
entry to it.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Anthony PERARD [Mon, 15 Aug 2022 06:55:14 +0000 (08:55 +0200)]
tools/firmware/hvmloader: rework Makefile
Setup proper dependencies with libacpi so we don't need to run "make
hvmloader" in the "all" target. ("build.o" new prerequisite isn't
exactly proper but a side effect of building the $(DSDT_FILES) is to
generate the "ssdt_*.h" needed by "build.o".)
Make use of "-iquote" instead of a plain "-I".
For "roms.inc" target, use "$(SHELL)" instead of plain "sh". And use
the full path to "mkhex" instead of a relative one. Lastly, add the "-f" flag
to "mv" to avoid a prompt in case the target already exists and we
don't have write permission.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Mon, 15 Aug 2022 06:53:11 +0000 (08:53 +0200)]
x86/mm: re-arrange type check around _get_page_type()'s TLB flush
Checks dependent on only d and x can be pulled out, thus allowing the
flush mask calculation to be skipped.
(Also-)Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 14 Apr 2022 09:33:05 +0000 (10:33 +0100)]
x86/build: Don't convert boot/{cmdline,head}.bin back to .S
There's no point wasting time converting binaries back to asm source. Just
use .incbin directly. Explain in head.S what these binaries are.
Also, explicitly align the blobs. They contain 4-byte objects, and happen to
be 4-byte aligned currently because of the position of `lret` and the size of
cmdline.S but this is incredibly fragile.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
It's not clear why for x86-64 a different approach was used than the
(shorter) one x86-32 has been using. Move the setting to the respective
OS files and reuse x86-32's approach for x86-64, while at the same time
using an OS-independent variable name (thus avoiding the indirection
through $(XEN_OS)).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 12 Aug 2022 06:37:50 +0000 (08:37 +0200)]
PCI: bring pci_get_real_pdev() in line with pci_get_pdev()
Fold the three parameters into a single pci_sbdf_t one.
No functional change intended, despite the "(8 - stride)" ->
"stride" replacement (not really sure why it was written the more
complicated way originally).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Rahul Singh <rahul.singh@arm.com> Tested-by: Rahul Singh <rahul.singh@arm.com>
Jan Beulich [Fri, 12 Aug 2022 06:37:09 +0000 (08:37 +0200)]
PCI: fold pci_get_pdev{,_by_domain}()
Rename the latter, subsuming the functionality of the former when passed
NULL as first argument.
Since this requires touching all call sites anyway, take the opportunity
and fold the remaining three parameters into a single pci_sbdf_t one.
No functional change intended. In particular the locking related
assertion needs to continue to be kept silent when a non-NULL domain
pointer is passed - both vpci_read() and vpci_write() call the function
without holding the lock (adding respective locking to vPCI [or finding
an alternative to doing so] is the topic of a separate series).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Rahul Singh <rahul.singh@arm.com> Tested-by: Rahul Singh <rahul.singh@arm.com>
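Folding segment, bus and devfn into one value can be sketched in C as below. The type sbdf_t and its helpers are hypothetical stand-ins rather than Xen's pci_sbdf_t definition; the sketch assumes the conventional packing of a 16-bit segment, 8-bit bus and 8-bit devfn into 32 bits.

```c
#include <stdint.h>

/* Hypothetical stand-in for a combined segment/bus/devfn value. */
typedef struct {
    uint32_t sbdf;
} sbdf_t;

/* Pack the three former parameters into a single value. */
static sbdf_t make_sbdf(uint16_t seg, uint8_t bus, uint8_t devfn)
{
    sbdf_t s = { ((uint32_t)seg << 16) | ((uint32_t)bus << 8) | devfn };
    return s;
}

/* Accessors recovering the individual fields. */
static uint16_t sbdf_seg(sbdf_t s)   { return (uint16_t)(s.sbdf >> 16); }
static uint8_t  sbdf_bus(sbdf_t s)   { return (uint8_t)(s.sbdf >> 8); }
static uint8_t  sbdf_devfn(sbdf_t s) { return (uint8_t)s.sbdf; }
```

Passing one value instead of three makes call sites shorter and removes the risk of swapping arguments of the same integer type.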
The last "wildcard" use of either function went away with f591755823a7
("IOMMU/PCI: don't let domain cleanup continue when device de-assignment
failed"). Don't allow them to be called this way anymore. Besides
simplifying the code this also fixes two bugs:
1) When seg != -1, the outer loops should have been terminated after the
first iteration, or else a device with the same BDF but on another
segment could be found / returned.
Reported-by: Rahul Singh <rahul.singh@arm.com>
2) When seg == -1 calling get_pseg() is bogus. The function (taking a
u16) would look for segment 0xffff, which might exist. If it exists,
we might then find / return a wrong device.
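The second bug comes from ordinary C integer conversion: a wildcard value of -1 passed to a parameter of 16-bit unsigned type becomes 0xffff. A minimal sketch of that effect follows; segment_seen_by_lookup() is a hypothetical stand-in for illustration only, not the actual get_pseg() implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a lookup taking a 16-bit segment number.
 * It simply reports which segment value the callee ends up seeing.
 */
static uint16_t segment_seen_by_lookup(uint16_t seg)
{
    return seg;
}

/*
 * The "wildcard" value -1 is implicitly converted to uint16_t at the call,
 * so the callee sees 0xffff, which may be a real segment number.
 */
static bool wildcard_aliases_real_segment(void)
{
    int seg = -1;

    return segment_seen_by_lookup(seg) == 0xffff;
}
```

Under these assumptions wildcard_aliases_real_segment() returns true, which is why a wildcard cannot be safely expressed through a u16 parameter.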
In pci_get_pdev_by_domain() also switch from using the per-segment list
to using the per-domain one, with the exception of the hardware domain
(see the code comment there).
While there also constify "pseg" and drop "pdev"'s already previously
unnecessary initializer.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Rahul Singh <rahul.singh@arm.com> Tested-by: Rahul Singh <rahul.singh@arm.com>
Jan Beulich [Thu, 11 Aug 2022 15:45:12 +0000 (17:45 +0200)]
build/x86: suppress GNU ld 2.39 warning about RWX load segments
Commit 68f5aac012b9 ("build: suppress future GNU ld warning about RWX
load segments") didn't quite cover all the cases: Apparently I missed
ones in the building of 32-bit helper objects because of only looking at
incremental builds (where those wouldn't normally be re-built). Clone
the workaround there to the specific Makefile in question.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ross Lagerwall [Thu, 11 Aug 2022 15:44:26 +0000 (17:44 +0200)]
x86/amd: only call setup_force_cpu_cap for boot CPU
This should only be called for the boot CPU to avoid calling _init code
after it has been unloaded.
Fixes: 062868a5a8b4 ("x86/amd: Work around CLFLUSH ordering on older parts") Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Xenia Ragiadakou [Thu, 11 Aug 2022 09:47:34 +0000 (11:47 +0200)]
arm/vgic: fix coding style in macro REG_RANK_INDEX()
Add parentheses around the macro parameter 's' to protect against unintended
expansions. This also resolves a MISRA C 2012 Rule 20.7 violation warning.
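The kind of unintended expansion that Rule 20.7 guards against can be sketched with a pair of hypothetical macros. SCALE_UNSAFE() and SCALE_SAFE() are made up for illustration and are not the Xen definitions; the only difference between them is the parentheses around the parameter.

```c
#include <assert.h>

/* Hypothetical macros, for illustration only. */
#define SCALE_UNSAFE(x) (x * 4)     /* parameter not parenthesized */
#define SCALE_SAFE(x)   ((x) * 4)   /* parameter parenthesized */

/* Expands to (a + b * 4): only 'b' is multiplied. */
static inline int scale_unsafe(int a, int b)
{
    return SCALE_UNSAFE(a + b);
}

/* Expands to ((a + b) * 4): the whole argument is multiplied. */
static inline int scale_safe(int a, int b)
{
    return SCALE_SAFE(a + b);
}
```

With a = 1 and b = 2, scale_unsafe() yields 9 while scale_safe() yields 12, because the argument expression is substituted textually before operator precedence is applied.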
Anthony PERARD [Thu, 11 Aug 2022 09:47:11 +0000 (11:47 +0200)]
tools/libxl: Replace deprecated -sdl option on QEMU command line
"-sdl" is deprecated upstream since 6695e4c0fd9e ("softmmu/vl:
Deprecate the -sdl and -curses option"), QEMU v6.2, and the option is
removed by 707d93d4abc6 ("ui: Remove deprecated options "-sdl" and
"-curses""), in upcoming QEMU v7.1.
Instead, use "-display sdl", available since 1472a95bab1e ("Introduce
-display argument"), before QEMU v1.0.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Dario Faggioli [Thu, 11 Aug 2022 09:46:22 +0000 (11:46 +0200)]
xen/sched: setup dom0 vCPUs affinity only once
Right now, affinity for dom0 vCPUs is set up in two steps. This is a
problem as, at least in Credit2, unit_insert() sees and uses the
"intermediate" affinity, and places the vCPUs on CPUs where they cannot
be run. This in turn results in boot hangs if the "dom0_nodes"
parameter is used.
Fix this by setting up the affinity properly once and for all, in
sched_init_vcpu() called by create_vcpu().
Note that, unless a soft-affinity is explicitly specified for dom0 (by
using the relaxed mode of "dom0_nodes") we set it to the default, which
is all CPUs, instead of computing it based on hard affinity (if any).
This is because hard and soft affinity should be considered as
independent user controlled properties. In fact, if we do derive dom0's
soft-affinity from its boot-time hard-affinity, such computed value will
continue to be used even if later the user changes the hard-affinity.
And this could result in the vCPUs behaving differently than what the
user wanted and expects.
Fixes: dafd936dddbd ("Make credit2 the default scheduler") Reported-by: Olaf Hering <ohering@suse.de> Signed-off-by: Dario Faggioli <dfaggioli@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
xen/arm: vreg: Fix MISRA C 2012 Rule 20.7 violation
In VREG_REG_HELPERS(), the macro parameter 'offmask' is used as an expression
and should therefore be enclosed in parentheses to protect against
unintended expansions.
xen/arm: regs: Fix MISRA C 2012 Rule 20.7 violation
In macro psr_mode(), the macro parameter 'm' is used as an expression and
should therefore be enclosed in parentheses to protect against
unintended expansions.
Jason Andryuk [Tue, 19 Jul 2022 20:08:15 +0000 (16:08 -0400)]
x86: Expose more MSR_ARCH_CAPS to hwdom
commit e46474278a0e ("x86/intel: Expose MSR_ARCH_CAPS to dom0") started
exposing MSR_ARCH_CAPS to dom0. More bits in MSR_ARCH_CAPS have since
been defined, but they haven't been exposed. Update the list to allow
them through.
As one example, this allows a Linux Dom0 to know that it has the
appropriate microcode via FB_CLEAR. Notably, with the updated
microcode, the content of dom0's
/sys/devices/system/cpu/vulnerabilities/mmio_stale_data changes from:
"Vulnerable: Clear CPU buffers attempted, no microcode; SMT Host state unknown"
to:
"Mitigation: Clear CPU buffers; SMT Host state unknown"
This exposes the MMIO Stale Data and Intel Branch History Injection
(BHI) controls as well as the page size change MCE issue bit.
Fixes: commit 2ebe8fe9b7e0 ("x86/spec-ctrl: Enumeration for MMIO Stale Data controls") Fixes: commit cea9ae062295 ("x86/spec-ctrl: Enumeration for new Intel BHI controls") Fixes: commit 59e89cdabc71 ("x86/vtx: Disable executable EPT superpages to work around CVE-2018-12207") Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
In MASK_DECLARE_ macros, the macro parameter 'x' is used as an expression and
should therefore be enclosed in parentheses to protect against
unintended expansions.
Signed-off-by: Xenia Ragiadakou <burzalodowa@gmail.com>
While there add the blanks missing around the + operators involved.
Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Jane Malalane [Tue, 9 Aug 2022 09:49:43 +0000 (11:49 +0200)]
x86/kexec: Add the '.L_' prefix to is_* and call_* labels
These are local symbols and shouldn't be externally visible.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jane Malalane <jane.malalane@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
automation: qemu-smoke-arm64: Run ping test over a pv network interface
This patch modifies the test in the following way:
- Dom0 is booted with an alpine linux rootfs with the xen tools.
- Once Dom0 is booted, it starts xenstored, calls init-dom0less to set up
the xenstore interface for the dom0less Dom1, sets up the bridged network
and attaches a pv network interface to Dom1.
- In the meantime, Dom1 in its init script tries to assign an IP to eth0
and ping Dom0.
- If Dom1 manages to ping Dom0, it prints 'passed'.
Use kernel 5.19 to unblock testing dom0less enhanced.
This kernel version has the necessary patches for deferring xenbus probe
until xenstore is fully initialized.
Also, build kernel with bridging and xen netback support enabled because
it will be used for testing network connectivity between Dom0 and Dom1
over a pv network interface.
automation: disable xen,enhanced in qemu-smoke-arm64
Disable xen,enhanced because we don't use PV drivers in this test and
also because the kernel used for testing is old and unpatched and would
break if xen,enhanced is passed.
Edwin Török [Fri, 29 Jul 2022 17:53:25 +0000 (18:53 +0100)]
tools/ocaml/*/Makefile: generate paths.ml from configure
paths.ml contains various paths known to configure, and currently is generated
via a Makefile rule. Simplify this and generate it through configure, similar
to how oxenstored.conf is generated from oxenstored.conf.in.
This will allow the generated file to be reused more easily with Dune.
No functional change.
Signed-off-by: Edwin Török <edvin.torok@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com>
Andrew Cooper [Tue, 2 Aug 2022 13:30:30 +0000 (14:30 +0100)]
x86/spec-ctrl: Use IST RSB protection for !SVM systems
There is a corner case where a VT-x guest which manages to reliably trigger
non-fatal #MC's could evade the rogue RSB speculation protections that were
supposed to be in place.
This is a lack of defence in depth; Xen does not architecturally execute more
RET than CALL instructions, so an attacker would have to locate a different
gadget (e.g. SpectreRSB) first to execute a transient path of excess RET
instructions.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
xen/hypfs: check the return value of snprintf to avoid leaking stack accidentally
The function snprintf() returns the number of characters that would have been
written in the buffer if the buffer size had been sufficiently large,
not counting the terminating null character.
Hence, the value returned is not guaranteed to be smaller than the buffer size.
Check the return value of snprintf() to prevent leaking stack contents to the
guest by accident.
Also, for debug builds, add an assertion to ensure that the assumption made on
the size of the destination buffer still holds.
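A minimal sketch of such a check, assuming a hypothetical helper format_checked() (not the hypfs code itself): a return value that is negative or at least as large as the buffer size means the output was not fully written, so the caller must not trust the returned length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format 'value' into 'buf'.
 * Returns the number of bytes written including the terminating NUL,
 * or -1 if snprintf() failed or the output was truncated.
 */
static int format_checked(char *buf, size_t size, unsigned int value)
{
    int ret;

    /* Debug-build sanity check on the destination buffer. */
    assert(buf != NULL && size > 0);

    ret = snprintf(buf, size, "%u", value);

    /* snprintf() returns the length it *would* have written. */
    if (ret < 0 || (size_t)ret >= size)
        return -1;

    return ret + 1;
}
```

With a 4-byte buffer, formatting 123 fits (three digits plus the NUL), whereas formatting 1234 is reported as an error rather than as a length exceeding the buffer.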
xen/compiler: fix MISRA C 2012 Rule 20.7 violation
In __must_be_array(), the macro parameter 'a' is used as an expression and
should therefore be enclosed in parentheses to protect against
unintended expansions.
Signed-off-by: Xenia Ragiadakou <burzalodowa@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Fri, 5 Aug 2022 06:36:54 +0000 (08:36 +0200)]
tools/xenstore: add documentation for new set/get-feature commands
Add documentation for two new Xenstore wire commands SET_FEATURE and
GET_FEATURE used to set or query the Xenstore features visible in the
ring page of a given domain.
When calling python tools to convert misra documentation or merge
cppcheck xml files, use $(PYTHON).
While there fix misra document conversion script to be executable.
Fixes: 57caa5375321 ("xen: Add MISRA support to cppcheck make rule") Fixes: 43aa3f6e72d3 ("xen/build: Add cppcheck and cppcheck-html make rules") Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Add git commands examples that can be used to generate fixes and how to
use the pretty configuration for git.
This should make it easier for contributors to have the right format.