Anthony PERARD [Thu, 30 May 2019 17:08:45 +0000 (18:08 +0100)]
libxl: Use ev_qmp in libxl_set_vcpuonline
Removed libxl__qmp_cpu_add since it's not used anymore.
`cpumap' arg of libxl__set_vcpuonline_xenstore is constified.
The QMP command "query-cpus" is going to be called from different
places, so the algorithm that parses the answer is in a separate
function, qmp_parse_query_cpus.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Tue, 30 Jul 2019 14:56:30 +0000 (15:56 +0100)]
libxl_pci: Only check if qemu-dm is running in qemu-trad case
QEMU upstream (or qemu-xen) may not have set "running" state in
xenstore. "running" with QEMU doesn't mean that the binary is
running, it means that the emulation has started. When adding a
pci-passthrough device to QEMU, we do so via QMP, so we have a direct
answer to whether QEMU is running or not; there is no need to check ahead.
Moving the check to do it only with qemu-trad makes upcoming changes
simpler.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 9 May 2019 17:08:09 +0000 (18:08 +0100)]
libxl_pci: Coding style of do_pci_add
do_pci_add is going to be asynchronous, so we start by having a single
path out of the function. All `return`s instead set rc and goto out.
While here, some uses of `rc' stored the return value of
libxc calls; change them to store into `r'. Also, add the value of `r'
to the error message of those calls.
There was an `out' label that was used, it seems, to skip setting up the
IRQ; the label has been renamed to `out_no_irq'.
No functional changes.
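A minimal sketch of the single-exit-path convention described above, not the actual do_pci_add code. All names here (demo_pci_add, demo_xc_call, DEMO_ERROR_FAIL, demo_exit_count) are hypothetical stand-ins: `rc' carries the function's own return code, `r' carries the raw return value of a lower-level call, and every failure goes through the `out' label.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical return codes standing in for libxl's own error values. */
enum { DEMO_OK = 0, DEMO_ERROR_FAIL = -3 };

/* Counts how many times the single exit path has been taken. */
static int demo_exit_count;

/* Hypothetical stand-in for a libxc call: 0 on success, -1 on failure. */
static int demo_xc_call(int should_fail)
{
    return should_fail ? -1 : 0;
}

static int demo_pci_add(int fail_first, int fail_second)
{
    int rc; /* return code of this function */
    int r;  /* raw return value of the lower-level call */

    r = demo_xc_call(fail_first);
    if (r < 0) {
        /* The value of r is included in the error message. */
        fprintf(stderr, "first call failed: %d\n", r);
        rc = DEMO_ERROR_FAIL;
        goto out;
    }

    r = demo_xc_call(fail_second);
    if (r < 0) {
        fprintf(stderr, "second call failed: %d\n", r);
        rc = DEMO_ERROR_FAIL;
        goto out;
    }

    rc = DEMO_OK;
out:
    /* Single exit: an asynchronous version can call its callback here. */
    demo_exit_count++;
    return rc;
}
```

With one exit point, turning the function asynchronous later only requires replacing the final `return rc' with a call to a completion callback.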
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Wed, 8 May 2019 14:23:52 +0000 (15:23 +0100)]
libxl: Use aodev for libxl__device_usbdev_remove
This also means libxl__initiate_device_usbctrl_remove, which uses
libxl__device_usbdev_remove synchronously, needs to be updated to use
it with multidev.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Tue, 7 May 2019 14:54:08 +0000 (15:54 +0100)]
libxl: Add device_{config,type} to libxl__ao_device
These two fields help to give more information about the device being
hotplugged/hot-unplugged to callbacks.
There is already `dev' of type `libxl__device', but it is mostly
useful when the backend/frontend is xenstore. Some devices (like
`usbdev') don't have a devid, so `dev' can't be used.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Wed, 17 Apr 2019 16:16:07 +0000 (17:16 +0100)]
libxl: Add libxl__ev_qmp to libxl__ao_device
`aodev->qmp' is initialised in libxl__prepare_ao_device(), but since
there isn't a single exit path for a `libxl__ao_device', users of this
new `qmp' field will have to dispose of it.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Tue, 7 May 2019 16:18:56 +0000 (17:18 +0100)]
libxl: Inline do_usbdev_remove into libxl__device_usbdev_remove
Having the function do_usbdev_remove makes it harder to add asynchronous
calls into it. Move its body back into libxl__device_usbdev_remove and
adjust the latter as there is no reason to have a separate function.
No functional changes.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 18 Apr 2019 11:10:30 +0000 (12:10 +0100)]
libxl: Inline do_usbdev_add into libxl__device_usbdev_add
Having the function do_usbdev_add makes it harder to add asynchronous
calls into it. Move its body back into libxl__device_usbdev_add and
adjust the latter as there is no reason to have a separate function.
No functional changes.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Sun, 26 May 2019 12:37:44 +0000 (13:37 +0100)]
libxl: Re-introduce libxl__domain_resume
libxl__domain_resume is a rework of libxl__domain_resume_deprecated. It
makes use of ev_xswatch and ev_qmp, to replace synchronous QMP calls
and the libxl__wait_for_device_model_deprecated call.
This patch also introduces libxl__dm_resume which is a sub-operation of
both libxl__domain_resume and libxl__domain_unpause and can be used
instead of libxl__domain_resume_device_model_deprecated.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 23 May 2019 14:07:52 +0000 (15:07 +0100)]
libxl: Deprecate libxl__domain_{unpause,resume}
These two functions are used from many places in libxl and need to
change to be able to accommodate libxl__ev_qmp calls and thus need to
be asynchronous.
(There is also libxl__domain_resume_device_model in the mix.)
A later patch will introduce a new libxl__domain_resume and
libxl__domain_unpause which will make use of libxl__ev_qmp.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Fri, 17 May 2019 09:39:13 +0000 (10:39 +0100)]
libxl: Replace libxl__qmp_initializations by ev_qmp calls
Set up a timeout of 10s for all the commands. It used to be about 5s
per command.
The order of commands is changed, we call 'query-vnc' before
'change-vnc-password', but that should not matter. That makes it
easier to call 'change-vnc-password' conditionally.
Also 'change' command is replaced by 'change-vnc-password'
because 'change' is deprecated. The new command is available in all
QEMU versions that also have Xen support.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 16 May 2019 13:23:28 +0000 (14:23 +0100)]
libxl: Move "qmp_initializations" to libxl_dm
libxl__qmp_initializations is part of the device model startup, it
queries information about the newly spawned QEMU and does some
post-startup configuration. So the function call doesn't belong to the
general domain creation, but only to the device model part of the
process, thus the call belongs to libxl_dm and libxl__dm_spawn_state's
machinery.
We move the call ahead of a follow-up patch which is going to "inline"
libxl__qmp_initializations.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 13 Jun 2019 15:51:29 +0000 (16:51 +0100)]
libxl_usb: Use usbctrl instead of usbctrlinfo
The functions that call usbctrl_getinfo() only need information that
can be found in a `libxl_device_usbctrl'. So avoid calling
libxl_device_usbctrl_getinfo and call libxl_devid_to_device_usbctrl
instead. (libxl_device_usbctrl_getinfo needs a `libxl_device_usbctrl'
anyway.)
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 13 Jun 2019 15:26:40 +0000 (16:26 +0100)]
libxl_usb: usbctrl, make use of generic device handling functions
Two functions for `libxl_device_usbctrl' can be replaced by
generic macros:
- libxl_device_usbctrl_list -> LIBXL_DEFINE_DEVICE_LIST
- libxl_devid_to_device_usbctrl -> LIBXL_DEFINE_DEVID_TO_DEVICE
This patch only needs to define `libxl__usbctrl_devtype.from_xenstore'
to make use of them.
Small change: libxl_devid_to_device_usbctrl doesn't list all usbctrls
anymore before finding the right one.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 13 Jun 2019 15:42:09 +0000 (16:42 +0100)]
libxl: Constify libxl_device_* param of *_getinfo
The libxl_device_TYPE parameter of all the libxl_device_TYPE_getinfo
functions seems to be used only as input to find more information to be
stored in the libxl_TYPEinfo parameter.
Make sure this is always true and constify the input parameter to avoid
further mistakes.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 13 Jun 2019 11:20:31 +0000 (12:20 +0100)]
libxl_usb: Fix wrong usage of asserts
Replace the assert(0) by abort() since the intention in libxl is that
asserts are always compiled in. This patch makes it clear and removes
the need to deal with asserts being compiled out.
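A small sketch of why abort() is the more robust choice for a "cannot happen" path. The enum and function below are hypothetical and are not taken from libxl_usb. In standard C, defining NDEBUG makes assert(0) expand to nothing, so control would fall off the end of a non-void function; abort() terminates the program regardless of NDEBUG.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical device kinds, for illustration only. */
typedef enum { DEMO_USB_HOSTDEV, DEMO_USB_OTHER } demo_usb_kind;

static const char *demo_usb_kind_name(demo_usb_kind kind)
{
    switch (kind) {
    case DEMO_USB_HOSTDEV:
        return "hostdev";
    case DEMO_USB_OTHER:
        return "other";
    }
    /*
     * Unreachable for valid input. assert(0) here would vanish when
     * NDEBUG is defined; abort() is never compiled out.
     */
    abort();
}
```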
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Wed, 29 May 2019 16:01:06 +0000 (17:01 +0100)]
libxl_usb: Use proper domid value, from libxl__device
ao->domid isn't a reliable way of getting a domid, it might not be set
(this isn't the case here). The right domid value can be found in the
libxl__device (which is the device we want to remove) attached to
libxl__ao_device.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Fri, 12 Apr 2019 16:54:48 +0000 (17:54 +0100)]
libxl_dom_save: Reorder functions for switch_qemu_logdirty
There are two different sets of callbacks here, one for
libxl__domain_common_switch_qemu_logdirty,
and one for libxl__domain_suspend_common_switch_qemu_logdirty.
The first set calls the second.
Pure code motion.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 9 May 2019 14:04:33 +0000 (15:04 +0100)]
libxl_pci: Constify arg `pcidev' of libxl__device_pci_add_xenstore
libxl__device_pci_add_xenstore doesn't modify `pcidev', so it can be
constified. Also, we don't need pcidev_saved anymore, so remove the
saved copy. (device_add_domain_config is going to make its own copy
anyway.)
To achieve this, constify pcidev in all functions that
libxl__device_pci_add_xenstore calls.
No functional changes.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 9 May 2019 15:52:33 +0000 (16:52 +0100)]
libxl_pci: Make libxl__create_pci_backend static
libxl__create_pci_backend isn't called from outside of libxl_pci
anymore, and it's only useful as part of the pci_add process, so
remove the prototype from libxl_internal.h.
No functional changes.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 18 Apr 2019 16:26:09 +0000 (17:26 +0100)]
libxl: Rename struct libxl_device_type to libxl__device_type
libxl__device_type is internal to libxl, so rename it to use the
internal-only prefix. Also eliminate the redundant 'struct' keyword, in
accord with the coding style.
No functional changes.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Julien Grall [Tue, 20 Aug 2019 12:22:55 +0000 (13:22 +0100)]
xen/arm: iommu: Panic if not all IOMMUs are initialized
At the moment, the platform can come up with only part of the IOMMUs
initialized. This could lead to a failure later on when building the
hardware domain or even trying to assign a device to a guest.
To avoid unwanted behavior, Xen will not continue if one of the IOMMUs
has not been initialized correctly.
[stefano: fix typo in comment, add '\n' to panic message]
Anthony PERARD [Fri, 22 Mar 2019 15:04:57 +0000 (15:04 +0000)]
libxl_disk: Cut libxl_cdrom_insert into steps ..
.. and use a new "slow" lock to avoid holding the userdata lock across
several functions.
This patch cuts libxl_cdrom_insert into different steps/functions but
they are still called synchronously. (Taking the ev_lock is the only
step that might be asynchronous.) A later patch will call them
asynchronously when QMP is involved.
The userdata lock (json_lock) used to protect against concurrent changes
of cdrom is replaced by an ev_lock which can be held across different
CTX_LOCK sections. The json_lock is still used when reading/modifying
the domain userdata (mandatory) and updating xenstore (mostly because
it's updated at the same time as the userdata).
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Fri, 7 Jun 2019 14:19:02 +0000 (15:19 +0100)]
libxl: Add optimisation to ev_lock
It will often be the case that the lock is free to grab. So we first
try to grab it before we have to fork. Even though in this case the
locks are grabbed in the wrong order in the lock hierarchy (ev_lock
should be outside of CTX_LOCK), it is fine to try without blocking. If
that fails, we will release CTX_LOCK and try to grab both locks again
in the right order.
That optimisation is only enabled in releases (debug=n) so the more
complicated code with fork is actually exercised.
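The try-first idea can be sketched as follows. This is a single-threaded toy model under stated assumptions, not libxl's implementation: demo_lock, demo_trylock and demo_ev_lock_acquire are hypothetical names, and the slow path only reports that the caller must fall back to the blocking acquisition in the proper order (which in libxl involves a fork).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock: just a flag, enough to model "free" versus "held". */
typedef struct { bool held; } demo_lock;

typedef enum { DEMO_FAST_PATH, DEMO_SLOW_PATH } demo_path;

/* Non-blocking attempt: succeeds only if the lock is currently free. */
static bool demo_trylock(demo_lock *l)
{
    if (l->held)
        return false;
    l->held = true;
    return true;
}

static void demo_unlock(demo_lock *l)
{
    assert(l->held);
    l->held = false;
}

/*
 * Called with ctx_lock held. ev_lock sits outside ctx_lock in the
 * hierarchy, so taking it here is out of order; that is only safe
 * because the attempt never blocks.
 */
static demo_path demo_ev_lock_acquire(demo_lock *ctx_lock, demo_lock *ev_lock)
{
    assert(ctx_lock->held);

    if (demo_trylock(ev_lock))
        return DEMO_FAST_PATH; /* common case: lock was free */

    /*
     * Contended: drop the inner lock so that both locks can be taken
     * again in the right order by the (blocking) slow path.
     */
    demo_unlock(ctx_lock);
    return DEMO_SLOW_PATH;
}
```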
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 6 Jun 2019 13:32:11 +0000 (14:32 +0100)]
libxl_internal: Introduce libxl__ev_devlock for devices hotplug via QMP
The current lock `domain_userdata_lock' can't be used when a modification
to a guest is done by sending commands to QEMU; this is a slow process
and requires calling CTX_UNLOCK, which is not possible while holding
the `domain_userdata_lock'.
To resolve this issue, we create a new lock which can take over part
of the job of the json_lock.
This lock is outside CTX_LOCK in the lock hierarchy.
libxl__ev_devlock_lock will call CTX_UNLOCK before trying to grab the
ev_devlock. The callback is used to notify when the ev_devlock has
been acquired.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 18 Sep 2019 16:10:15 +0000 (17:10 +0100)]
tools/configure: Allow specifying python to be found from path
./configure takes a PYTHON=... argument. You can use this to specify
the python interpreter. However, for no good reason, it expects an
absolute path.
Fix this. The new logic is:
* if not set, default to `python'
* if not absolute, look it up with type -p
* split into directory and executable name
The results in config/Tools.mk (which contains @PYTHON@ and
@PYTHONPATH@) are identical for both
./configure
./configure PYTHON=/usr/bin/python
so I assert this has no functional change except that now you can say
./configure PYTHON=python
In particular you can now say
./configure PYTHON=python2
./configure PYTHON=python3
The latter is useful if you want python3 (which should probably be the
default, but does not work right now). The former is useful if you
want python2 but your distro has foolishly made "python" refer to
python3.
CC: Doug Goldstein <cardoe@cardoe.com> CC: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Wei Liu <wl@xen.org>
Andrew Cooper [Mon, 9 Sep 2019 10:43:28 +0000 (11:43 +0100)]
x86: Misc trivial cleanup of bootsym_phys()
In smpboot, there is no need to abstract setup_trampoline() away. Drop the
define and use bootsym_phys() directly.
In tboot, the 3 size calculations are invariant of their bootsym_phys()/__pa()
transformations, but the compiler can't tell this. Drop the transformations,
which simplifies the compiled function.
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-134 (-134)
Function old new delta
tboot_shutdown 620 486 -134
Total: Before=3337042, After=3336908, chg -0.00%
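The invariance argument can be illustrated with a toy model. This sketch assumes the translation is a constant offset (DEMO_VIRT_OFFSET and demo_pa are hypothetical; they are not Xen's __pa() or bootsym_phys()): translating both ends of a range before subtracting gives the same size as subtracting the untranslated addresses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant offset between the two address spaces. */
#define DEMO_VIRT_OFFSET UINT64_C(0xffff800000000000)

/* Toy translation: subtract a constant, as a stand-in for __pa(). */
static uint64_t demo_pa(uint64_t va)
{
    return va - DEMO_VIRT_OFFSET;
}

/* Size computed after translating both ends. */
static uint64_t demo_size_translated(uint64_t start, uint64_t end)
{
    return demo_pa(end) - demo_pa(start);
}

/* Size computed directly: the offset cancels out. */
static uint64_t demo_size_direct(uint64_t start, uint64_t end)
{
    return end - start;
}
```

In this toy the compiler can see through demo_pa(); the commit's point is that in the real code it could not, so dropping the translations by hand is what shrinks tboot_shutdown.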
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
tools/arm: optee: create optee firmware node in DT if tee=optee
If TEE support is enabled with "tee=optee" option in xl.cfg,
then we need to inform guest about available TEE, by creating
corresponding node in the guest's device tree.
Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Wed, 18 Sep 2019 13:20:00 +0000 (15:20 +0200)]
x86/CPUID: drop INVPCID dependency on PCID
PCID validly depends on LM, as it can be enabled in Long Mode only.
INVPCID, otoh, can be used not only without PCID enabled, but also
outside of Long Mode altogether. In both cases its functionality is
simply restricted to PCID 0, which is sort of expected as no other PCID
can be activated there.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 18 Sep 2019 13:14:49 +0000 (15:14 +0200)]
x86: limit the amount of TLB flushing in switch_cr3_cr4()
We really need to flush the TLB just once, if we do so with or after the
CR3 write. The only case where two flushes are unavoidable is when we
mean to turn off CR4.PGE (perhaps just temporarily; see the code
comment).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 18 Sep 2019 13:13:21 +0000 (15:13 +0200)]
x86emul: treat Hygon guests like AMD ones
For some reason the Hygon enabling series left out the insn emulator.
Make appropriate adjustments wherever we've been special casing AMD.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wl@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Wed, 18 Sep 2019 13:12:33 +0000 (15:12 +0200)]
core-parking: interact with runtime SMT-disabling
When disabling SMT at runtime, secondary threads should no longer be
candidates for bringing back up in response to _PUR ACPI events. Purge
them from the tracking array.
Doing so involves adding locking to guard accounting data in the core
parking code. While adding the declaration for the lock, take the
liberty to drop two unnecessary forward function declarations.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
make subdirs-install
make[2]: Entering directory `/home/travis/build/andyhhp/xen/tools'
make[3]: Entering directory `/home/travis/build/andyhhp/xen/tools'
make -C libs install
make[4]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs'
make[5]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs'
make -C toolcore install
make[6]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs/toolcore'
make libs
make[7]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs/toolcore'
for i in include/xentoolcore.h include/xentoolcore_internal.h; do \
gcc -x c -ansi -Wall -Werror -I<snip>/xen/tools/libs/toolcore/../../../tools/include \
-S -o /dev/null $i || exit 1; \
echo $i; \
done >headers.chk.new
include/xentoolcore_internal.h:30:31: fatal error: _xentoolcore_list.h: No such file or directory
#include "_xentoolcore_list.h"
^
compilation terminated.
make[7]: *** [headers.chk] Error 1
The problem is that xentoolcore_internal.h includes _xentoolcore_list.h which
hasn't been generated yet.
The toolcore headers.chk rule (unlike the other libraries) had an additional
dependency against $(AUTOINCS), which forced the headers to be generated
first. Replicate this in the common libs.mk
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen/arm: Zero BSS after the MMU and D-cache is turned on
At the moment BSS is zeroed before the MMU and D-Cache are turned on.
In other words, the cache will be bypassed when zeroing the BSS section.
On Arm64, per the Image protocol [1], the state of the cache for BSS region
is not known because it is not part of the "loaded kernel image".
On Arm32, the boot protocol [2] does not mention anything about the
state of the cache. Therefore, it should be assumed that it is not known
for BSS region.
This means that the cache will need to be invalidated twice for the BSS
region:
1) Before zeroing to remove any dirty cache line. Otherwise they may
get evicted while zeroing and therefore overwrite the value.
2) After zeroing to remove any cache line that may have been
speculated. Otherwise when turning on MMU and D-Cache, the CPU may
see old values.
At the moment, the only reason to have BSS zeroed early is because the
boot page tables are part of it. To avoid the two cache invalidations,
it would be better if the boot page tables are part of the "loaded
kernel image" and therefore be zeroed when loading the image into
memory. A good candidate is the section .data.page_aligned.
A new macro DEFINE_BOOT_PAGE_TABLE is introduced to create and mark
page-tables used before BSS is zeroed. This includes all boot_* but also
xen_fixmap as zero_bss() will print a message when earlyprintk is
enabled.
Boot CPU and secondary CPUs will use different entry points to C code. At
the moment, the decision on which entry to use is taken within launch().
In order to avoid using conditional instructions and make the call
clearer, launch() is reworked to take as parameters the entry point and its
arguments.
Lastly, document the behavior and the main register usage within the
function.
Andrew Cooper [Fri, 13 Sep 2019 16:17:21 +0000 (17:17 +0100)]
drivers/acpi: Drop "ERST table was not found" message
ERST isn't a mandatory table, and also isn't very common to find. The message
is unnecessary noise during boot. Furthermore, it is redundant with the list
of found ACPI tables printed just ahead.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
This patch defines a new bit reported in the hw_cap field of struct
xen_sysctl_physinfo to indicate whether the platform supports sharing of
HAP page tables (i.e. the P2M) with the IOMMU. This informs the toolstack
whether the domain needs extra memory to store discrete IOMMU page tables
or not.
NOTE: This patch makes sure iommu_hap_pt_shared is clear if HAP is not
supported or the IOMMU is disabled, and defines it to false if
!CONFIG_HVM.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Christian Lindig <christian.lindig@citrix.com> Acked-by: Wei Liu <wl@xen.org> Acked-by: Julien Grall <julien.grall@arm.com>
Paul Durrant [Tue, 17 Sep 2019 14:11:48 +0000 (16:11 +0200)]
use is_iommu_enabled() where appropriate...
...rather than testing the global iommu_enabled flag and ops pointer.
Now that there is a per-domain flag indicating whether the domain is
permitted to use the IOMMU (which determines whether the ops pointer will
be set), many tests of the global iommu_enabled flag and ops pointer can
be translated into tests of the per-domain flag. Some of the other tests of
purely the global iommu_enabled flag can also be translated into tests of
the per-domain flag.
NOTE: The comment in iommu_share_p2m_table() is also fixed; need_iommu()
disappeared some time ago. Also, whilst the style of the 'if' in
flask_iommu_resource_use_perm() is fixed, I have not translated any
instances of u32 into uint32_t to keep consistency. IMO such a
translation would be better done globally for the source module in
a separate patch.
The change to the definition of iommu_call() is to keep the PV shim
build happy. Without this change it will fail to compile with errors
of the form:
Paul Durrant [Tue, 17 Sep 2019 14:10:38 +0000 (16:10 +0200)]
domain: introduce XEN_DOMCTL_CDF_iommu flag
This patch introduces a common domain creation flag to determine whether
the domain is permitted to make use of the IOMMU. Currently the flag is
always set for both dom0 and any domU created by libxl if the IOMMU is
globally enabled (i.e. iommu_enabled == 1). sanitise_domain_config() is
modified to reject the flag if !iommu_enabled.
A new helper function, is_iommu_enabled(), is added to test the flag and
iommu_domain_init() will return immediately if !is_iommu_enabled(). This is
slightly different to the previous behaviour based on !iommu_enabled where
the call to arch_iommu_domain_init() was made regardless, however it appears
that this call was only necessary to initialize the dt_devices list for ARM
such that iommu_release_dt_devices() can be called unconditionally by
domain_relinquish_resources(). Adding a simple check of is_iommu_enabled()
into iommu_release_dt_devices() keeps this unconditional call working.
No functional change should be observed with this patch applied.
Subsequent patches will allow the toolstack to control whether use of the
IOMMU is enabled for a domain.
NOTE: The introduction of the is_iommu_enabled() helper function might
seem excessive but its use is expected to increase with subsequent
patches. Also, having iommu_domain_init() bail before calling
arch_iommu_domain_init() is not strictly necessary, but I think the
consequent addition of the call to is_iommu_enabled() in
iommu_release_dt_devices() makes the code clearer.
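The flag handling described above can be sketched roughly as follows. This is an illustrative model only: the DEMO_CDF_IOMMU bit position, struct demo_domain and all demo_* functions are assumptions and do not reproduce Xen's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit value for the per-domain creation flag. */
#define DEMO_CDF_IOMMU (1u << 3)

/* Stand-in for the global iommu_enabled setting. */
static bool demo_iommu_enabled;

struct demo_domain {
    unsigned int options; /* creation flags recorded for the domain */
};

/* Per-domain test, in the spirit of is_iommu_enabled(). */
static bool demo_is_iommu_enabled(const struct demo_domain *d)
{
    return (d->options & DEMO_CDF_IOMMU) != 0;
}

/* Reject the flag when the IOMMU is globally disabled. */
static int demo_sanitise_domain_config(unsigned int flags)
{
    if ((flags & DEMO_CDF_IOMMU) && !demo_iommu_enabled)
        return -1;
    return 0;
}

/* Bail out before the arch-specific initialisation if the flag is clear. */
static int demo_iommu_domain_init(const struct demo_domain *d,
                                  int *arch_init_calls)
{
    if (!demo_is_iommu_enabled(d))
        return 0;
    (*arch_init_calls)++; /* stands in for arch_iommu_domain_init() */
    return 0;
}
```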
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: "Roger Pau Monné" <roger.pau@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Christian Lindig <christian.lindig@citrix.com> Acked-by: Julien Grall <julien.grall@arm.com>
sched: populate cpupool0 only after all cpus are up
Simplify cpupool initialization by populating cpupool0 with cpus only
after all cpus are up. This avoids having to call the cpu notifier
directly for cpu 0.
With that in place there is no need to create cpupool0 earlier, so
do that just before assigning the cpus. Initialize free cpus with all
online cpus at that time in order to be able to add the cpu notifier
late, too.
Print the lock profile data when the system crashes and add some more
information for each lock data (lock address, cpu holding the lock).
While at it use the PRI_stime format specifier for printing time data.
This is especially beneficial for watchdog triggered crashes in case
of deadlocks.
In order to have the cpu holding the lock available let the
lock profile config option select DEBUG_LOCKS.
As printing the lock profile data will make use of locking, too, we
need to disable spinlock debugging before calling
spinlock_profile_printall() from panic().
While at it remove a superfluous #ifdef CONFIG_LOCK_PROFILE and rename
CONFIG_LOCK_PROFILE to CONFIG_DEBUG_LOCK_PROFILE.
Also move the .lockprofile.data section to init area in linker scripts
as the data is no longer needed after boot.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com>
spinlocks: in debug builds store cpu holding the lock
Add the cpu currently holding the lock to struct lock_debug. This makes
analysis of locking errors easier and it can be tested whether the
correct cpu is releasing a lock again.
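A rough sketch of the idea, assuming a much simplified lock: the debug data records which cpu took the lock, and the release path can check that the same cpu is releasing it. The struct layout and function names are hypothetical and do not match Xen's struct lock_debug.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_NO_CPU (-1)

/* Debug data attached to the lock: which cpu currently holds it. */
struct demo_lock_debug {
    int cpu;
};

struct demo_spinlock {
    bool locked;
    struct demo_lock_debug debug;
};

/* Take the lock and record the owner (no real spinning in this sketch). */
static void demo_spin_lock(struct demo_spinlock *l, int cpu)
{
    assert(!l->locked);
    l->locked = true;
    l->debug.cpu = cpu;
}

/* Release only if the releasing cpu is the recorded owner. */
static bool demo_spin_unlock(struct demo_spinlock *l, int cpu)
{
    if (!l->locked || l->debug.cpu != cpu)
        return false; /* a locking error worth reporting */
    l->locked = false;
    l->debug.cpu = DEMO_NO_CPU;
    return true;
}
```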
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 17 Sep 2019 14:06:15 +0000 (16:06 +0200)]
x86/PCI: read MSI-X table entry count early
Rather than doing this every time we set up interrupts for a device
anew (and then in two distinct places) fill this invariant field
right after allocating struct arch_msix.
While at it also obtain the MSI-X capability structure position just
once, in msix_capability_init(), rather than in each caller.
Furthermore take the opportunity and eliminate the multi_msix_capable()
alias of msix_table_size().
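For illustration, the entry count can be derived from the MSI-X Message Control word, whose low 11 bits hold the table size encoded as N-1. The sketch below assumes hypothetical names (demo_msix_table_size, struct demo_arch_msix, demo_msix_alloc_init) and only shows the "compute once, right after allocation" idea.

```c
#include <assert.h>
#include <stdint.h>

/* Table Size field of the MSI-X Message Control word (bits 10:0). */
#define DEMO_MSIX_FLAGS_QSIZE 0x07ffu

/* The field is encoded as N-1, so add one to get the entry count. */
static unsigned int demo_msix_table_size(uint16_t control)
{
    return (control & DEMO_MSIX_FLAGS_QSIZE) + 1u;
}

/* Hypothetical per-device MSI-X state. */
struct demo_arch_msix {
    unsigned int nr_entries; /* invariant for the device's lifetime */
};

/* Fill the invariant field once, instead of on every interrupt setup. */
static void demo_msix_alloc_init(struct demo_arch_msix *msix, uint16_t control)
{
    msix->nr_entries = demo_msix_table_size(control);
}
```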
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 17 Sep 2019 14:05:01 +0000 (16:05 +0200)]
AMD/IOMMU: introduce a "valid" flag for IVRS mappings
For us to no longer blindly allocate interrupt remapping tables for
everything the ACPI tables name, we can't use struct ivrs_mappings'
intremap_table field anymore to also have the meaning of "this entry
is valid". Add a separate boolean field instead.
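A minimal sketch of separating the two meanings, with a hypothetical struct that is not Xen's struct ivrs_mappings: once the table pointer may legitimately be NULL for an entry named by the ACPI tables, validity has to be tracked by its own field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily reduced mapping entry. */
struct demo_ivrs_mapping {
    void *intremap_table; /* may be NULL even for a valid entry */
    bool valid;           /* set when the ACPI tables name this entry */
};

static bool demo_entry_is_valid(const struct demo_ivrs_mapping *m)
{
    return m->valid;
}

static bool demo_entry_has_table(const struct demo_ivrs_mapping *m)
{
    return m->intremap_table != NULL;
}
```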
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 17 Sep 2019 14:03:44 +0000 (16:03 +0200)]
AMD/IOMMU: don't free shared IRT multiple times
Calling amd_iommu_free_intremap_table() for every IVRS entry is correct
only in per-device-IRT mode. Use a NULL 2nd argument to indicate that
the shared table should be freed, and call the function exactly once in
shared mode.
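The freeing rule can be sketched like this. Everything here is an assumption made for illustration (one-argument demo_free_intremap_table, the demo_shared_table global, the call counter); only the convention that a NULL argument selects the shared table follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct demo_mapping {
    void *intremap_table;
};

static void *demo_shared_table; /* used by all entries in shared mode */
static int demo_free_calls;     /* counts actual frees, for checking */

/* NULL selects the shared table; otherwise free the entry's own table. */
static void demo_free_intremap_table(struct demo_mapping *m)
{
    void **slot = m ? &m->intremap_table : &demo_shared_table;

    if (*slot) {
        free(*slot);
        *slot = NULL;
        demo_free_calls++;
    }
}

static void demo_free_all(struct demo_mapping *maps, size_t n, bool per_device)
{
    size_t i;

    if (per_device) {
        /* Each entry owns its table: free one per entry. */
        for (i = 0; i < n; i++)
            demo_free_intremap_table(&maps[i]);
        return;
    }

    /* Shared mode: entries only alias the table, so free it exactly once. */
    for (i = 0; i < n; i++)
        maps[i].intremap_table = NULL;
    demo_free_intremap_table(NULL);
}
```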
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
microcode: pass a patch pointer to apply_microcode()
apply_microcode()'s always loading the cached ucode patch forces
a patch to be stored before being loaded. Make apply_microcode()
accept a patch pointer to remove the limitation so that a patch
can be stored after a successful loading.
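A small sketch of the ordering this enables, using hypothetical names and a toy "CPU revision" variable rather than real microcode loading: because the apply function takes the patch as an argument, the caller can load first and store the patch only if loading succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_patch {
    uint32_t revision;
};

static uint32_t demo_cpu_revision = 0x10;    /* toy CPU state */
static const struct demo_patch *demo_cached; /* toy cache slot */

/* Apply the patch passed in, rather than whatever is cached. */
static int demo_apply_microcode(const struct demo_patch *patch)
{
    if (patch == NULL || patch->revision <= demo_cpu_revision)
        return -1;
    demo_cpu_revision = patch->revision;
    return 0;
}

/* Load first; keep the patch only after a successful load. */
static int demo_load_then_store(const struct demo_patch *patch)
{
    int rc = demo_apply_microcode(patch);

    if (rc == 0)
        demo_cached = patch;
    return rc;
}
```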
Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
microcode/amd: call svm_host_osvw_init() in common code
Introduce a vendor hook, .end_update_percpu, for svm_host_osvw_init().
The hook function is called on each cpu after loading an update.
It is a preparation for splitting out apply_microcode() from
cpu_request_microcode().
Note that svm_host_osvw_init() should be called regardless of the
result of loading an update.
Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Some callbacks in microcode_ops or related functions take a cpu
id parameter. But at current call sites, the cpu id parameter is
always equal to current cpu id. Some of them even use an assertion
to guarantee this. Remove this redundant 'cpu' parameter.
Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Remove the per-cpu cache field in struct ucode_cpu_info since it has
been replaced by a global cache. This leaves only one field
remaining in ucode_cpu_info, so the struct is removed and the
remaining field (cpu signature) is stored in the per-cpu area.
The cpu status notifier is also removed. It was used to free the "mc"
field to avoid a memory leak.
Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Previously, a per-cpu ucode cache was maintained. Each CPU had one
per-cpu update cache and there might be multiple versions of microcode.
Thus microcode_resume_cpu tried its best to update microcode by loading
every update cache until a successful load.
But now the cache struct is simplified a lot and only a single ucode is
cached. A single invocation of ->apply_microcode() loads the cache
and updates the microcode.
Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
microcode: introduce a global cache of ucode patch
to replace the current per-cpu cache 'uci->mc'.
With the assumption that all CPUs in the system have the same signature
(family, model, stepping and 'pf'), a microcode update that matches
one cpu should match the others. Having differing microcode revisions
on cpus would make the system unstable and should be avoided. Hence, caching
one microcode update is good enough for all cases.
Introduce a global variable, microcode_cache, to store the newest
matching microcode update. Whenever we get a new valid microcode update,
its revision id is compared against that of the cached microcode update to
determine whether the "microcode_cache" needs to be replaced. And
this global cache is loaded to cpu in apply_microcode().
All operations on the cache are protected by 'microcode_mutex'.
Note that I deliberately avoid touching the old per-cpu cache ('uci->mc')
as I am going to remove it completely in the following patches. We copy
everything to create the new cache blob to avoid reusing some buffers
previously allocated for the old per-cpu cache. It is not so efficient,
but it is already corrected by a patch later in this series.
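The replacement rule for the global cache can be sketched as below. This is a simplified, single-threaded model with hypothetical names (struct demo_patch, demo_cache, demo_save_patch); it assumes the candidate has already been matched against the CPU signature, and it omits the mutex that protects the real cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct demo_patch {
    uint32_t revision;
};

/* Single global cache slot; real code would guard it with a mutex. */
static struct demo_patch *demo_cache;

/*
 * Replace the cached patch if the candidate is newer. The candidate is
 * copied, so the cache never aliases the caller's buffer.
 * Returns true if the cache was replaced.
 */
static bool demo_save_patch(const struct demo_patch *candidate)
{
    struct demo_patch *copy;

    if (demo_cache && candidate->revision <= demo_cache->revision)
        return false; /* cached patch is at least as new */

    copy = malloc(sizeof(*copy));
    if (!copy)
        return false; /* keep the old cache on allocation failure */

    *copy = *candidate;
    free(demo_cache);
    demo_cache = copy;
    return true;
}
```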
Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
microcode/amd: distinguish old and mismatched ucode in microcode_fits()
Sometimes, a ucode with a level lower than or equal to the current CPU's
patch level is useful. For example, to work around a broken bios which
only loads ucode for the BSP, when the BSP parses a ucode blob during bootup,
it is better to save a ucode with lower or equal level for the APs.
No functional change is made in this patch. But a following patch will
handle "old ucode" and "mismatched ucode" separately.
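One way to picture the distinction is a three-way result instead of a yes/no answer. The enum, its value names and the plain id comparison below are assumptions for illustration; the real AMD code matches on an equivalent CPU id and is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-way result of checking a patch against a CPU. */
enum demo_fit {
    DEMO_MIS_UCODE, /* patch is for a different CPU */
    DEMO_OLD_UCODE, /* right CPU, but not newer than what is loaded */
    DEMO_NEW_UCODE  /* right CPU and newer */
};

static enum demo_fit demo_microcode_fits(uint32_t cpu_id, uint32_t cpu_rev,
                                         uint32_t patch_id, uint32_t patch_rev)
{
    if (patch_id != cpu_id)
        return DEMO_MIS_UCODE;
    if (patch_rev <= cpu_rev)
        return DEMO_OLD_UCODE; /* may still be worth saving for the APs */
    return DEMO_NEW_UCODE;
}
```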
Signed-off-by: Chao Gao <chao.gao@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>