xenbits.xensource.com Git - xen.git/log
3 years ago x86/spec-ctrl: Drop SPEC_CTRL_{ENTRY_FROM,EXIT_TO}_HVM
Andrew Cooper [Wed, 12 Jan 2022 16:36:29 +0000 (16:36 +0000)]
x86/spec-ctrl: Drop SPEC_CTRL_{ENTRY_FROM,EXIT_TO}_HVM

These were written before Spectre/Meltdown went public, and there was large
uncertainty in how the protections would evolve.  As it turns out, they're
very specific to Intel hardware, and not very suitable for AMD.

Drop the macros, opencoding the relevant subset of functionality, and leaving
grep-fodder to locate the logic.  No change at all for VT-x.

For AMD, the only relevant piece of functionality is DO_OVERWRITE_RSB,
although we will soon be adding (different) logic to handle MSR_SPEC_CTRL.

This has a marginal improvement of removing an unconditional pile of long-nops
from the vmentry/exit path.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years ago x86/msr: Split MSR_SPEC_CTRL handling
Andrew Cooper [Wed, 12 Jan 2022 13:52:47 +0000 (13:52 +0000)]
x86/msr: Split MSR_SPEC_CTRL handling

In order to fix a VT-x bug, and support MSR_SPEC_CTRL on AMD, move
MSR_SPEC_CTRL handling into the new {pv,hvm}_{get,set}_reg() infrastructure.

Duplicate the msrs->spec_ctrl.raw accesses in the PV and VT-x paths for now.
The SVM path is currently unreachable because of the CPUID policy.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/guest: Introduce {get,set}_reg() infrastructure
Andrew Cooper [Mon, 17 Jan 2022 12:28:39 +0000 (12:28 +0000)]
x86/guest: Introduce {get,set}_reg() infrastructure

Various registers have per-guest-type or per-vendor locations or access
requirements.  To support their use from common code, provide accessors which
allow for per-guest-type behaviour.

For now, just infrastructure handling default cases and expectations.
Subsequent patches will start handling registers using this infrastructure.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/hvm: Drop .is_singlestep_supported() callback
Andrew Cooper [Thu, 13 Jan 2022 18:37:13 +0000 (18:37 +0000)]
x86/hvm: Drop .is_singlestep_supported() callback

There is absolutely no need for a function pointer call here.

Drop the hook, introduce a singlestep_supported boolean, and configure it in
start_vmx() like all other optional functionality.

No functional change, but rather more efficient logic.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tamas K Lengyel <tamas@tklengyel.com>
3 years ago Config.mk: update seabios to 1.15.0
Wei Liu [Sun, 16 Jan 2022 12:54:27 +0000 (12:54 +0000)]
Config.mk: update seabios to 1.15.0

Signed-off-by: Wei Liu <wl@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years ago libs/guest: move cpu policy related prototypes to xenguest.h
Roger Pau Monné [Wed, 19 Jan 2022 12:51:26 +0000 (13:51 +0100)]
libs/guest: move cpu policy related prototypes to xenguest.h

Do this before adding any more stuff to xg_cpuid_x86.c.

The placement in xenctrl.h is wrong, as they are implemented by the
xenguest library. Note that xg_cpuid_x86.c needs to include
xg_private.h, and in turn also fix xg_private.h to include
xc_bitops.h. The bitops definition of BITS_PER_LONG needs to be
changed to not be an expression, so that xxhash.h can use it in a
preprocessor if directive.

As a result also modify xen-cpuid to include xenguest.h.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
3 years ago x86/mwait-idle: Adjust the SKX C6 parameters if PC6 is disabled
Chen Yu [Wed, 19 Jan 2022 12:50:43 +0000 (13:50 +0100)]
x86/mwait-idle: Adjust the SKX C6 parameters if PC6 is disabled

Because cpuidle assumes worst-case C-state parameters, PC6 parameters
are used for describing C6, which is worst-case for requesting CC6.
When PC6 is enabled, this is appropriate. But if PC6 is disabled
in the BIOS, the exit latency and target residency should be adjusted
accordingly.

Exit latency:
Previously the C6 exit latency was measured as the PC6 exit latency.
With PC6 disabled, the C6 exit latency should be the one of CC6.

Target residency:
With PC6 disabled, the idle duration within [CC6, PC6) would make the
idle governor choose C1E over C6. This would cause low energy-efficiency.
We should lower the bar to request C6 when PC6 is disabled.

To fill this gap, check if PC6 is disabled in the BIOS in the
MSR_PKG_CST_CONFIG_CONTROL(0xe2) register. If so, use the CC6 exit latency
for C6 and set target_residency to 3 times the new exit latency. [This
is consistent with how intel_idle driver uses _CST to calculate the
target_residency.] As a result, the OS would be more likely to choose C6
over C1E when PC6 is disabled, which is reasonable, because if C6 is
enabled, it implies that the user cares about energy, so choosing C6 more
frequently makes sense.

The new CC6 exit latency of 92us was measured with wult[1] on SKX via NIC
wakeup as the 99.99th percentile. Also CLX and CPX both have the same CPU
model number as SKX, but their CC6 exit latencies are similar to the SKX
one, 96us and 89us respectively, so reuse the SKX value for them.

There is a concern that it might be better to use a more generic approach
instead of optimizing every platform. However, if the required code
complexity and different PC6 bit interpretation on different platforms
are taken into account, tuning the code per platform seems to be an
acceptable tradeoff.

Link: https://intel.github.io/wult/
Suggested-by: Len Brown <len.brown@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit: 64233338499126c5c31e07165735ab5441c7e45a]

Alongside the dropping of "const" from skx_cstates[] add __read_mostly,
and extend that to other similar non-const tables.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
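
The PC6 adjustment described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the struct, the function names, and the decoding of the package C-state limit (low three bits of the MSR value, with values below 2 taken to mean PC6 cannot be entered) are assumptions. Only the 92us latency and the "3 times the exit latency" rule come from the commit message.

```c
#include <stdint.h>

/* Hypothetical, simplified C-state descriptor (times in microseconds). */
struct cstate {
    unsigned int exit_latency;
    unsigned int target_residency;
};

/* CC6 exit latency quoted in the commit message. */
#define SKX_CC6_EXIT_LATENCY_US 92u

/*
 * Assumption: the low 3 bits of MSR_PKG_CST_CONFIG_CONTROL (0xe2) encode
 * the deepest package C-state allowed, and values below 2 mean PC6 cannot
 * be entered.
 */
static int pc6_disabled(uint64_t pkg_cst_config)
{
    return (pkg_cst_config & 0x7) < 2;
}

/* Adjust the C6 entry when PC6 is disabled; otherwise leave it alone. */
static void adjust_c6(struct cstate *c6, uint64_t pkg_cst_config)
{
    if (!pc6_disabled(pkg_cst_config))
        return;

    c6->exit_latency = SKX_CC6_EXIT_LATENCY_US;
    c6->target_residency = 3 * SKX_CC6_EXIT_LATENCY_US;
}
```

With PC6 disabled the sketch yields a target residency of 276us, so the idle governor would pick C6 for shorter idle periods than it would with the PC6-based parameters.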
3 years ago Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Jan Beulich [Wed, 19 Jan 2022 12:49:30 +0000 (13:49 +0100)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging

3 years ago x86/mwait-idle: add Icelake-D support
Artem Bityutskiy [Wed, 19 Jan 2022 12:46:05 +0000 (13:46 +0100)]
x86/mwait-idle: add Icelake-D support

This patch adds Icelake Xeon D support to the intel_idle driver.

Since Icelake D and Icelake SP C-state characteristics are the same,
we use the Icelake SP C-states table for Icelake D as well.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit: 22141d5f411895bb1b0df2a6b05f702e11e63918]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
3 years ago x86/mwait-idle: update ICX C6 data
Artem Bityutskiy [Wed, 19 Jan 2022 12:45:11 +0000 (13:45 +0100)]
x86/mwait-idle: update ICX C6 data

Change IceLake Xeon C6 latency from 128 us to 170 us. The latency
was measured with the "wult" tool and corresponds to the 99.99th
percentile when measuring with the "nic" method. Note, the 128 us
figure corresponds to the median latency, but in intel_idle we use
the "worst case" latency figure instead.

C6 target residency was increased from 384 us to 600 us, which may
result in less C6 residency in some workloads. This value was tested
and compared to values 384 and 1000. Value 600 is a reasonable
tradeoff between power and performance.

Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit: d484b8bfc6fa71a088e4ac85d9ce11aa0385867e]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
3 years ago x86/mwait-idle: mention assumption that WBINVD is not needed
Alexander Monakov [Wed, 19 Jan 2022 12:44:31 +0000 (13:44 +0100)]
x86/mwait-idle: mention assumption that WBINVD is not needed

Intel SDM does not explicitly say that entering a C-state via MWAIT will
implicitly flush CPU caches as appropriate for that C-state. However,
documentation for individual Intel CPU generations does mention this
behavior.

Since intel_idle binds to any Intel CPU with MWAIT, list this assumption
of MWAIT behavior.

In passing, reword opening comment to make it clear that the driver can
load on any old and future Intel CPU with MWAIT.

Signed-off-by: Alexander Monakov <amonakov@ispras.ru>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit: 8bb2e2a887afdf8a39e68fa0dccf82a168aae655]

Dropped "reword opening comment" part - this doesn't apply to our code:
First thing mwait_idle_probe() does is call x86_match_cpu(); we do not
have a 2nd such call looking for just MWAIT (in order to then use _CST
data directly, which we can't get our hands on at this point yet).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
3 years ago tools/libs/gnttab: remove old mini-os callback
Juergen Gross [Wed, 19 Jan 2022 07:28:23 +0000 (08:28 +0100)]
tools/libs/gnttab: remove old mini-os callback

It is possible now to delete minios_gnttab_close_fd().

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago tools/libs/evtchn: remove old mini-os callback
Juergen Gross [Wed, 19 Jan 2022 07:28:22 +0000 (08:28 +0100)]
tools/libs/evtchn: remove old mini-os callback

It is possible now to delete minios_evtchn_close_fd() and the extern
declaration of event_queue.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago config: use more recent mini-os commit
Juergen Gross [Wed, 19 Jan 2022 07:28:21 +0000 (08:28 +0100)]
config: use more recent mini-os commit

In order to be able to use the recent Mini-OS features switch to the
most recent commit.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago tools/libs/ctrl: remove file related handling
Juergen Gross [Sun, 16 Jan 2022 08:23:46 +0000 (09:23 +0100)]
tools/libs/ctrl: remove file related handling

There is no special file handling related to libxenctrl in Mini-OS
any longer, so the close hook can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
3 years ago tools/libs/gnttab: decouple more from mini-os
Juergen Gross [Sun, 16 Jan 2022 08:23:45 +0000 (09:23 +0100)]
tools/libs/gnttab: decouple more from mini-os

libgnttab is using implementation details of Mini-OS. Change that by
letting libgnttab use the new alloc_file_type() and get_file_from_fd()
functions and the generic dev pointer of struct file from Mini-OS.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago tools/libs/evtchn: decouple more from mini-os
Juergen Gross [Sun, 16 Jan 2022 08:23:44 +0000 (09:23 +0100)]
tools/libs/evtchn: decouple more from mini-os

Mini-OS and libevtchn are using implementation details of each other.
Change that by letting libevtchn use the new alloc_file_type() and
get_file_from_fd() function and the generic dev pointer of struct file
from Mini-OS.

By using private struct declarations Mini-OS will be able to drop the
libevtchn specific definitions of struct evtchn_port_info and
evtchn_port_list in future. While at it use bool for "pending" and
"bound".

Switch to use xce as function parameter instead of fd where possible.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago config: use more recent mini-os commit
Juergen Gross [Tue, 18 Jan 2022 14:21:02 +0000 (15:21 +0100)]
config: use more recent mini-os commit

In order to be able to use the recent Mini-OS features switch to the
most recent commit.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/APIC: mark wait_tick_pvh() __init
Jan Beulich [Mon, 17 Jan 2022 16:29:42 +0000 (17:29 +0100)]
x86/APIC: mark wait_tick_pvh() __init

It should have been that way right from its introduction by 02e0de011555
("x86: APIC timer calibration when running as a guest").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
3 years ago MAINTAINERS: email address update in TXT section
Lukasz Hawrylko [Mon, 17 Jan 2022 16:29:00 +0000 (17:29 +0100)]
MAINTAINERS: email address update in TXT section

As I am not working for Intel anymore, I would like to update my email address
to my private one.

Signed-off-by: Lukasz Hawrylko <lukasz@hawrylko.pl>
3 years ago MAINTAINERS: update my email address
Nick Rosbrook [Mon, 17 Jan 2022 16:28:36 +0000 (17:28 +0100)]
MAINTAINERS: update my email address

I am no longer an employee at AIS. Use my personal email address
instead.

Signed-off-by: Nick Rosbrook <rosbrookn@gmail.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
3 years ago x86/HVM: convert remaining hvm_funcs hook invocations to alt-call
Jan Beulich [Mon, 17 Jan 2022 08:45:04 +0000 (09:45 +0100)]
x86/HVM: convert remaining hvm_funcs hook invocations to alt-call

The aim being to have as few indirect calls as possible (see [1]),
whereas during initial conversion performance was the main aspect and
hence rarely used hooks didn't get converted. Apparently one use of
get_interrupt_shadow() was missed at the time.

While doing this, drop NULL checks ahead of CPU management and .nhvm_*()
calls when the hook is always present. Also convert the
.nhvm_vcpu_reset() call to alternative_vcall(), as the return value is
unused and the caller has currently no way of propagating it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tamas K Lengyel <tamas@tklengyel.com>
[1] https://lists.xen.org/archives/html/xen-devel/2021-11/msg01822.html

3 years ago build: adjust include/xen/compile.h generation
Jan Beulich [Fri, 14 Jan 2022 10:03:03 +0000 (11:03 +0100)]
build: adjust include/xen/compile.h generation

Prior to 19427e439e01 ("build: generate "include/xen/compile.h" with
if_changed") running "make install-xen" as root would not have printed
the banner under normal circumstances. Its printing would instead have
indicated that something was wrong (or during a normal build the lack
of printing would do so).

Furthermore, the aforementioned change had another undesirable effect, which I
didn't notice during review: Originally compile.h would have been
re-generated (and final binaries re-linked) when its dependencies were
updated after an earlier build. This is no longer the case now, which
means that if some other file also was updated, then the re-build done
during "make install-xen" would happen with a stale compile.h (as its
updating is suppressed in this case).

Restore the earlier behavior for both aspects.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years ago x86/hvm: Improve hvm_set_guest_pat() code generation
Andrew Cooper [Wed, 12 Jan 2022 13:54:12 +0000 (13:54 +0000)]
x86/hvm: Improve hvm_set_guest_pat() code generation

This is a fastpath on virtual vmentry/exit, and forcing guest_pat to be
spilled to the stack is bad.  Performing the shift in a register is far more
efficient.

Drop the (IMO useless) log message.  MSR_PAT only gets altered on boot, and a
bad value will be entirely evident in the ensuing #GP backtrace.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/hvm: Rework nested hap functions to reduce parameters
Andrew Cooper [Tue, 30 Nov 2021 17:05:09 +0000 (17:05 +0000)]
x86/hvm: Rework nested hap functions to reduce parameters

Most functions in this call chain have 8 parameters, meaning that the final
two booleans are spilled to the stack for calls.

First, delete nestedhap_walk_L1_p2m and introduce nhvm_hap_walk_L1_p2m() as a
thin wrapper around hvm_funcs, just like all the other nhvm_*() hooks.  This
involves including xen/mm.h as the forward declaration of struct npfec is no
longer enough.

Next, replace the triple of booleans with struct npfec, which contains the
same information in the bottom 3 bits.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
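
The idea of replacing a triple of booleans with one small struct can be sketched as below. This is only an illustration under assumptions: `struct access_flags` is a hypothetical stand-in and does not reproduce the real layout of struct npfec, and both helper functions are invented for the example.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct npfec: three access flags as bitfields. */
struct access_flags {
    unsigned int read:1;
    unsigned int write:1;
    unsigned int exec:1;
};

/* Before: each flag is a separate parameter. */
static bool allowed_bools(unsigned int perms, bool r, bool w, bool x)
{
    unsigned int want = (r ? 1u : 0u) | (w ? 2u : 0u) | (x ? 4u : 0u);

    return (perms & want) == want;
}

/* After: the same information travels in one small struct passed by value. */
static bool allowed_struct(unsigned int perms, struct access_flags f)
{
    unsigned int want = f.read | (f.write << 1) | (f.exec << 2);

    return (perms & want) == want;
}
```

Both variants compute the same answer; the second simply needs two fewer argument slots, which matters once a call chain is already at the limit of what fits in registers.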
3 years ago x86/hvm: Simplify hvm_enable_msr_interception()
Andrew Cooper [Tue, 30 Nov 2021 14:37:59 +0000 (14:37 +0000)]
x86/hvm: Simplify hvm_enable_msr_interception()

The sole caller doesn't check the return value, and both vendors implement the
hook.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago libxl/PCI: Fix PV hotplug & stubdom coldplug
Jason Andryuk [Thu, 13 Jan 2022 13:33:16 +0000 (14:33 +0100)]
libxl/PCI: Fix PV hotplug & stubdom coldplug

commit 0fdb48ffe7a1 "libxl: Make sure devices added by pci-attach are
reflected in the config" broke PCI hotplug (xl pci-attach) for PV
domains when it moved libxl__create_pci_backend() later in the function.

This also broke HVM + stubdom PCI passthrough coldplug.  For that, the
PCI devices are hotplugged to a running PV stubdom, and then the QEMU
QMP device_add commands are made to QEMU inside the stubdom.

A running PV domain calls libxl__wait_for_backend().  With the current
placement of libxl__create_pci_backend(), the path does not exist and
the call immediately fails:
libxl: error: libxl_device.c:1388:libxl__wait_for_backend: Backend /local/domain/0/backend/pci/43/0 does not exist
libxl: error: libxl_pci.c:1764:device_pci_add_done: Domain 42:libxl__device_pci_add failed for PCI device 0:2:0.0 (rc -3)
libxl: error: libxl_create.c:1857:domcreate_attach_devices: Domain 42:unable to add pci devices

The wait is only relevant when:
1) The domain is PV
2) The domain is running
3) The backend is already present

This is because:

1) xen-pcifront is only used for PV.  It does not load for HVM domains
   where QEMU is used.

2) If the domain is not running (starting), then the frontend state will
   be Initialising.  xen-pciback waits for the frontend to transition to
   at least Initialised before attempting to connect.  So a wait for a
   non-running domain is not applicable as the backend will not
   transition to Connected.

3) For presence, num_devs is already used to determine if the backend
   needs to be created.  Re-use num_devs to determine if the backend
   wait is necessary.  The wait is necessary to avoid racing with
   another PCI attachment reconfiguring the front/back or changing to
   some other state like closing.  If we are creating the backend, then
   we don't have to worry about the state since it is being created.

Fixes: 0fdb48ffe7a1 ("libxl: Make sure devices added by pci-attach are
reflected in the config")

Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years ago build: correct usage comments in Kbuild.include
Jan Beulich [Thu, 13 Jan 2022 13:32:34 +0000 (14:32 +0100)]
build: correct usage comments in Kbuild.include

Macros with arguments need to be invoked via $(call ...); don't misguide
people looking up usage of such macros.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/time: improve TSC / CPU freq calibration accuracy
Jan Beulich [Thu, 13 Jan 2022 13:31:52 +0000 (14:31 +0100)]
x86/time: improve TSC / CPU freq calibration accuracy

While the problem report was for extreme errors, even smaller ones would
better be avoided: The calculated period to run calibration loops over
can (and usually will) be shorter than the actual time elapsed between
first and last platform timer and TSC reads. Adjust values returned from
the init functions accordingly.

On a Skylake system I've tested this on accuracy (using HPET) went from
detecting in some cases more than 220kHz too high a value to about
±2kHz. On other systems (or on this system, but with PMTMR) the original
error range was much smaller, with less (in some cases only very little)
improvement.

Reported-by: James Dingwall <james-xen@dingwall.me.uk>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years ago x86/time: use relative counts in calibration loops
Jan Beulich [Thu, 13 Jan 2022 13:30:18 +0000 (14:30 +0100)]
x86/time: use relative counts in calibration loops

Looping until reaching/exceeding a certain value is error prone: If the
target value is close enough to the wrapping point, the loop may not
terminate at all. Switch to using delta values, which then allows to
fold the two loops each into just one.

Fixes: 93340297802b ("x86/time: calibrate TSC against platform timer")
Reported-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
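
Why delta values are safer than an absolute target can be shown with a small sketch. It assumes a hypothetical 32-bit free-running counter, simulated here by a variable that advances on each read; it is not the calibration code touched by the patch.

```c
#include <stdint.h>

/* Hypothetical 32-bit free-running counter, simulated for illustration. */
static uint32_t fake_counter;

static uint32_t read_counter(void)
{
    uint32_t now = fake_counter;

    fake_counter += 1000;   /* advances (and eventually wraps) on each read */
    return now;
}

/*
 * Wait until at least 'ticks' counter ticks have elapsed. Unsigned
 * subtraction yields the elapsed count modulo 2^32, so the loop ends
 * even if the counter wraps while waiting. Returns the elapsed ticks.
 */
static uint32_t wait_ticks(uint32_t ticks)
{
    uint32_t start = read_counter();
    uint32_t elapsed;

    do {
        elapsed = read_counter() - start;
    } while (elapsed < ticks);

    return elapsed;
}
```

A loop of the form "read until the counter reaches start + ticks" misbehaves when that sum lies close to the wrapping point: the counter can wrap before any read observes a value at or above the target, and the loop keeps spinning. Comparing the elapsed delta avoids that case.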
3 years ago tools/libs/evtchn: Deduplicate xenevtchn_fd()
Andrew Cooper [Mon, 10 Jan 2022 12:29:05 +0000 (12:29 +0000)]
tools/libs/evtchn: Deduplicate xenevtchn_fd()

struct xenevtchn_handle is common in private.h, meaning that xenevtchn_fd()
has exactly one correct implementation.

Implement it in core.c, rather than identically for each OS.  This matches all
other libraries (call, gnttab, gntshr) which implement an fd getter.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
3 years ago MAINTAINERS: requesting to be TXT reviewer
Daniel P. Smith [Wed, 12 Jan 2022 07:55:20 +0000 (08:55 +0100)]
MAINTAINERS: requesting to be TXT reviewer

I would like to submit myself, Daniel P. Smith, as a reviewer of TXT support in
Xen.

Signed-off-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
3 years ago tools/debugger: fix make distclean
Juergen Gross [Wed, 12 Jan 2022 07:54:59 +0000 (08:54 +0100)]
tools/debugger: fix make distclean

"make distclean" will complain that "-c" is not a supported flag for make.

Fix that by using "-C".

The error has been present for a long time, but it was uncovered only
recently.

Fixes: 2400a9a365c5619 ("tools/debugger: Allow make to recurse into debugger/")
Fixes: f9c9b127753e9ed ("tools: fix make distclean")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Tested-by: Jason Andryuk <jandryuk@gmail.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
3 years ago x86/paging: replace most mfn_valid() in log-dirty handling
Jan Beulich [Wed, 12 Jan 2022 07:54:20 +0000 (08:54 +0100)]
x86/paging: replace most mfn_valid() in log-dirty handling

Top level table and intermediate table entries get explicitly set to
INVALID_MFN when un-allocated. There's therefore no need to use the more
expensive mfn_valid() when checking for that sentinel.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago x86/paging: tidy paging_mfn_is_dirty()
Jan Beulich [Wed, 12 Jan 2022 07:53:05 +0000 (08:53 +0100)]
x86/paging: tidy paging_mfn_is_dirty()

The function returning a boolean indicator, make it return bool. Also
constify its struct domain parameter, albeit requiring to also adjust
mm_locked_by_me(). Furthermore the function is used by shadow code only.

Since mm_locked_by_me() needs touching anyway, also switch its return
type to bool.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago SUPPORT.md: limit support statement for Linux and Windows frontends
Juergen Gross [Tue, 11 Jan 2022 10:43:48 +0000 (11:43 +0100)]
SUPPORT.md: limit support statement for Linux and Windows frontends

Change the support state of Linux and Windows pv frontends from
"supported" to "supported with caveats" in order to reflect that the
frontends can probably be harmed by their respective backends.

Some of the Linux frontends have been hardened already.

This is XSA-376

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/viridian: EOI MSR should always happen in affected vCPU context
Roger Pau Monné [Tue, 11 Jan 2022 10:42:49 +0000 (11:42 +0100)]
x86/viridian: EOI MSR should always happen in affected vCPU context

The HV_X64_MSR_EOI wrmsr should always happen with the target vCPU
as current, as there's no support for EOI'ing interrupts on a remote
vCPU.

While there also turn the unconditional assert at the top of the
function into an error on non-debug builds.

No functional change intended.

Requested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago x86/altp2m: p2m_altp2m_get_or_propagate() should honor present page order
Jan Beulich [Thu, 6 Jan 2022 15:12:39 +0000 (16:12 +0100)]
x86/altp2m: p2m_altp2m_get_or_propagate() should honor present page order

Prior to XSA-304 the only caller merely happened to not use any further
the order value that it passes into the function. Already then this was
a latent issue: The function really should, in the "get" case, hand back
the order the underlying mapping actually uses (or actually the smaller
of the two), such that (going forward) there wouldn't be any action on
unrelated mappings (in particular ones which did already diverge from
the host P2M).

Similarly in the "propagate" case only the smaller of the two orders
should actually get used for creating the new entry, again to avoid
altering mappings which did already diverge from the host P2M.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tamas K Lengyel <tamas@tklengyel.com>
3 years ago tools/xen-detect: avoid possible pitfall with cpuid()
Jan Beulich [Thu, 6 Jan 2022 15:12:15 +0000 (16:12 +0100)]
tools/xen-detect: avoid possible pitfall with cpuid()

The 64-bit form forces %ecx to 0 while the 32-bit one so far didn't - it
only ended up that way when "pv_context" is zero. While presently no
leaf queried by callers has separate subleaves, let's avoid chancing it.

While there
- replace references to operands by number,
- relax constraints where possible,
- limit PUSH/POP to just the registers not also used as input,
all where applicable also for the 64-bit variant.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years ago x86/spec-ctrl: Fix default calculation of opt_srb_lock
Andrew Cooper [Tue, 4 Jan 2022 14:11:55 +0000 (14:11 +0000)]
x86/spec-ctrl: Fix default calculation of opt_srb_lock

Since this logic was introduced, opt_tsx has become more complicated and
shouldn't be compared to 0 directly.  While there are no buggy logic paths,
the correct expression is !(opt_tsx & 1) but the rtm_disabled boolean is
easier and clearer to use.

Fixes: 8fe24090d940 ("x86/cpuid: Rework HLE and RTM handling")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years ago tools/libxc: Drop copy-in in xc_physinfo()
Andrew Cooper [Thu, 23 Dec 2021 16:10:15 +0000 (16:10 +0000)]
tools/libxc: Drop copy-in in xc_physinfo()

The first thing XEN_SYSCTL_physinfo does is zero op->u.physinfo.

Do not copy-in.  It's pointless, and most callers don't initialise their
xc_physinfo_t buffer to begin with.  Remove the redundant zeroing from the
remaining callers.

Spotted by Coverity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Chen <Wei.Chen@arm.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
3 years ago xenperf: omit meaningless trailing zeroes from output
Jan Beulich [Tue, 4 Jan 2022 09:21:12 +0000 (10:21 +0100)]
xenperf: omit meaningless trailing zeroes from output

There's no point producing a long chain of zeroes when the previously
calculated total value was zero. To guard against mistakenly skipping
non-zero individual fields, widen "sum" to "unsigned long long".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
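
The reason for widening "sum" can be illustrated with a short sketch. It assumes 32-bit individual counters and uses hypothetical helper names; it is not the xenperf code itself.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sum 32-bit counters into a 64-bit accumulator. With a 32-bit sum,
 * values such as 0x80000000 + 0x80000000 would wrap to 0 and the row
 * would wrongly look empty.
 */
static unsigned long long sum_counters(const uint32_t *vals, size_t n)
{
    unsigned long long sum = 0;

    for (size_t i = 0; i < n; i++)
        sum += vals[i];

    return sum;
}

/* Hypothetical helper: details are worth printing only if the total is non-zero. */
static int should_print_details(const uint32_t *vals, size_t n)
{
    return sum_counters(vals, n) != 0;
}
```

With the wider accumulator, a zero total can only mean that every individual field is zero, so skipping the per-field output is safe.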
3 years ago libxc: avoid clobbering errno in xc_domain_pod_target()
Jan Beulich [Tue, 4 Jan 2022 09:20:15 +0000 (10:20 +0100)]
libxc: avoid clobbering errno in xc_domain_pod_target()

do_memory_op() supplies return value and has "errno" set the usual way.
Don't overwrite "errno" with 1 (aka EPERM on at least Linux). There's
also no reason to overwrite "err".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
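
The pattern the commit above restores, a wrapper that passes on the callee's return value and leaves errno untouched, might look like the following sketch. Both functions are hypothetical stand-ins; they are not do_memory_op() or xc_domain_pod_target().

```c
#include <errno.h>

/* Hypothetical low-level call: fails with -1 and sets errno, like a syscall wrapper. */
static int low_level_op(int fail_with)
{
    if (fail_with) {
        errno = fail_with;
        return -1;
    }

    return 0;
}

/* Wrapper that propagates the failure and leaves errno as the callee set it. */
static int wrapper_op(int fail_with)
{
    int rc = low_level_op(fail_with);

    if (rc < 0)
        return rc;   /* no "errno = ..." here: the callee's value is kept */

    return 0;
}
```

A caller that sees -1 from the wrapper can then inspect errno and get the original cause rather than a generic value.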
3 years ago VT-d: shorten vtd_flush_{context,iotlb}_reg()
Jan Beulich [Tue, 4 Jan 2022 09:19:32 +0000 (10:19 +0100)]
VT-d: shorten vtd_flush_{context,iotlb}_reg()

Their calculations of the value to write to the respective command
register can be partly folded, resulting in almost 100 bytes less code
for these two relatively short functions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: use DMA_TLB_IVA_ADDR()
Jan Beulich [Tue, 4 Jan 2022 09:18:18 +0000 (10:18 +0100)]
VT-d: use DMA_TLB_IVA_ADDR()

Let's use the macro in the one place it's supposed to be used, and in
favor of then unnecessary manipulations of the address in
iommu_flush_iotlb_psi(): All leaf functions then already deal correctly
with the supplied address.

There also has never been a need to require (i.e. assert for) the
passing in of 4k-aligned addresses - it'll always be the order-sized
range containing the address which gets flushed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years ago VT-d: properly parenthesize a number of macros
Jan Beulich [Tue, 4 Jan 2022 09:17:44 +0000 (10:17 +0100)]
VT-d: properly parenthesize a number of macros

Let's eliminate the risk of any of these macros getting used with more
complex expressions as arguments.

Where touching lines anyway, also
- switch from u64 to uint64_t,
- drop unnecessary parentheses,
- drop pointless 0x prefixes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
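
The risk being eliminated can be demonstrated with two hypothetical macros (not taken from the VT-d headers) that differ only in whether the argument is parenthesized.

```c
/* Hypothetical macros, not the ones from the VT-d headers. */
#define SHIFT_BAD(x)   (x << 12)       /* argument not parenthesized */
#define SHIFT_GOOD(x)  ((x) << 12)     /* argument parenthesized */

static unsigned long bad_result(unsigned long a, unsigned long b)
{
    return SHIFT_BAD(a | b);    /* expands to (a | b << 12) */
}

static unsigned long good_result(unsigned long a, unsigned long b)
{
    return SHIFT_GOOD(a | b);   /* expands to ((a | b) << 12) */
}
```

Because `<<` binds more tightly than `|`, the unparenthesized form shifts only `b`. With simple arguments such as a single identifier both macros agree, which is why the problem stays latent until someone passes a more complex expression.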
3 years ago xenperf: name "newer" hypercalls
Jan Beulich [Tue, 4 Jan 2022 09:16:48 +0000 (10:16 +0100)]
xenperf: name "newer" hypercalls

This table must not have got updated in quite a while; tmem_op for
example has managed to not only appear since then, but also disappear
again (adding a name for it nevertheless, to make more obvious that
something strange is going on if the slot would ever have a non-zero
value).

Also resolve arch_0 and arch_1 to more meaningful names on x86.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years ago VT-d: avoid allocating domid_{bit,}map[] when possible
Jan Beulich [Tue, 4 Jan 2022 09:16:04 +0000 (10:16 +0100)]
VT-d: avoid allocating domid_{bit,}map[] when possible

When an IOMMU implements the full 16 bits worth of DID in context
entries, there's no point going through a memory-based translation table.
For IOMMUs not using Caching Mode we can simply use the domain IDs
verbatim, while for Caching Mode we need to avoid DID 0.
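
A minimal sketch of such a table-free mapping follows. The helper name
is hypothetical, and using the bitwise complement to steer clear of
DID 0 is an assumption of this example (it relies on domain IDs never
reaching 0xffff); the message above does not say which transformation
the patch actually uses.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive a 16-bit DID from a domain ID without
 * any lookup table.  Without Caching Mode the ID is used verbatim;
 * with Caching Mode the complement is taken (an assumed choice), so
 * that domain 0 does not end up with DID 0.
 */
static inline uint16_t sketch_domid_to_did(uint16_t domid, bool caching_mode)
{
    return caching_mode ? (uint16_t)~domid : domid;
}
```

One property of this particular choice is that the same function
converts in both directions.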

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years agox86/EPT: squash meaningless TLB flush
Jan Beulich [Tue, 4 Jan 2022 09:13:06 +0000 (10:13 +0100)]
x86/EPT: squash meaningless TLB flush

ept_free_entry() gets called after a flush was already issued, if one is
necessary in the first place. That behavior is similar to NPT, which
also doesn't have any further flush in p2m_free_entry(). (Furthermore,
with the function being recursive, far too many flushes would have been
issued whenever it actually recursed.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
3 years agomm: introduce INVALID_{G,M}FN_RAW
Jan Beulich [Tue, 21 Dec 2021 09:42:02 +0000 (10:42 +0100)]
mm: introduce INVALID_{G,M}FN_RAW

This allows properly tying together INVALID_{G,M}FN and
INVALID_{G,M}FN_INITIALIZER as well as using the actual values in
compile time constant expressions (or even preprocessor directives).
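
The idea can be sketched roughly as follows. The type and macro names
are stand-ins prefixed with sketch_/SKETCH_ and are not Xen's real
definitions; the point is only that one raw constant feeds the typed
value, the initializer, and a preprocessor check.

```c
/* Hypothetical stand-in for a type-safe frame number wrapper. */
typedef struct { unsigned long m; } sketch_mfn_t;

/* Raw value: usable in constant expressions and in #if directives. */
#define SKETCH_INVALID_MFN_RAW (~0UL)
/* Both typed forms derive from the single raw definition. */
#define SKETCH_INVALID_MFN_INITIALIZER { SKETCH_INVALID_MFN_RAW }
#define SKETCH_INVALID_MFN ((sketch_mfn_t)SKETCH_INVALID_MFN_INITIALIZER)

/* A struct value could not be tested here, but the raw one can. */
#if SKETCH_INVALID_MFN_RAW == 0
# error "the invalid marker must not collide with frame 0"
#endif

static inline int sketch_mfn_is_invalid(sketch_mfn_t mfn)
{
    return mfn.m == SKETCH_INVALID_MFN_RAW;
}
```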

Since INVALID_PFN is unused, and with x86'es paging_mark_pfn_dirty()
being the only user of pfn_t it also doesn't seem likely that new uses
would appear, remove that one at this same occasion.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
3 years agox86/perfc: conditionalize HVM and shadow counters
Jan Beulich [Tue, 21 Dec 2021 09:38:18 +0000 (10:38 +0100)]
x86/perfc: conditionalize HVM and shadow counters

There's no point including them when the respective functionality isn't
enabled in the build. Note that this covers only larger groups; more
fine grained exclusion may want to be done later on.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agox86/traps: Clean up diagnostics
Andrew Cooper [Fri, 8 Oct 2021 12:40:17 +0000 (13:40 +0100)]
x86/traps: Clean up diagnostics

do{_unhandled,}_trap() should use fatal_trap() rather than opencoding part of
it.  This lets the remote stack trace logic work in more fatal error
conditions.

With do_trap() converted, there is only one single user of trapstr()
remaining.  Tweak the formatting in pv_inject_event(), and remove trapstr()
entirely.  Rename vec_name() to vector_name() now that it is exported.

Take the opportunity of vector_name() being exported to improve the
diagnostics in stub_selftest().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agox86/traps: Drop exception_table[] and use if/else dispatching
Andrew Cooper [Thu, 7 Oct 2021 13:04:03 +0000 (14:04 +0100)]
x86/traps: Drop exception_table[] and use if/else dispatching

There is also a lot of redundancy in the table.  8 vectors head to do_trap(),
3 are handled in the IST logic, and that only leaves 7 others not heading to
the do_reserved_trap() catch-all.  This also removes the fragility that any
accidental NULL entry in the table becomes a ticking timebomb.

Function pointers are expensive under retpoline, and different vectors have
wildly different frequencies.  Drop the indirect call, and use an if/else chain
instead, which is a code layout technique used by profile-guided optimisation.

Using Xen's own perfcounter infrastructure, we see the following frequencies of
vectors measured from boot until I can SSH into dom0 and collect the stats:

  vec | CFL-R   | Milan   | Notes
  ----+---------+---------+
  NMI |     345 |    3768 | Watchdog.  Milan has many more CPUs.
  ----+---------+---------+
  #PF | 1233234 | 2006441 |
  #GP |   90054 |   96193 |
  #UD |     848 |     851 |
  #NM |       0 |     132 | Per-vendor lazy vs eager FPU policy.
  #DB |      67 |      67 | No clue, but it's something in userspace.

Bloat-o-meter (after some manual insertion of ELF metadata) reports:

  add/remove: 0/1 grow/shrink: 2/0 up/down: 102/-256 (-154)
  Function                                     old     new   delta
  handle_exception_saved                       148     226     +78
  handle_ist_exception                         453     477     +24
  exception_table                              256       -    -256

showing that the if/else chains are less than half the size that
exception_table[] was in the first place.
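
As a simplified sketch of the technique (in C rather than the assembly
entry code, with invented handler names and return values), dispatching
by an if/else chain ordered by expected frequency could look like this:

```c
/* Architectural x86 vector numbers. */
enum {
    SKETCH_VEC_DB = 1,
    SKETCH_VEC_UD = 6,
    SKETCH_VEC_GP = 13,
    SKETCH_VEC_PF = 14,
};

/* Invented handlers; each returns a marker so the path is observable. */
static int sketch_do_page_fault(void) { return 14; }
static int sketch_do_gp(void)         { return 13; }
static int sketch_do_ud(void)         { return 6; }
static int sketch_do_unhandled(void)  { return -1; }

static int sketch_dispatch(unsigned int vec)
{
    /* Hottest vectors first; every call below is a direct call. */
    if ( vec == SKETCH_VEC_PF )
        return sketch_do_page_fault();
    else if ( vec == SKETCH_VEC_GP )
        return sketch_do_gp();
    else if ( vec == SKETCH_VEC_UD )
        return sketch_do_ud();

    /* Catch-all: there is no table slot that could be left NULL. */
    return sketch_do_unhandled();
}
```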

As part of this change, make two other minor changes.  do_reserved_trap() is
renamed to do_unhandled_trap() because it is the catchall, and already covers
things that aren't reserved any more (#VE/#VC/#HV/#SX).

Furthermore, don't forward #TS to guests.  #TS is specifically for errors
relating to the Task State Segment, which is a Xen-owned structure, not a
guest-owned structure.  Even in the 32bit days, we never let guests register
their own Task State Segments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years agoxen/domain: Remove function pointers from domain pause helpers
Andrew Cooper [Thu, 28 Oct 2021 03:07:02 +0000 (04:07 +0100)]
xen/domain: Remove function pointers from domain pause helpers

Function pointer calls are expensive (especially with Spectre v2 protections),
and all these do are select between the sync and nosync helpers.  Pass a
boolean instead, and use direct calls everywhere.
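
The shape of the change can be sketched as below, using invented
stand-in helpers rather than the real domain pause functions. The
return values exist only to make the chosen path visible.

```c
#include <stdbool.h>

/* Invented stand-ins for the sync and nosync pause helpers. */
static int sketch_pause_sync(void)   { return 1; }
static int sketch_pause_nosync(void) { return 0; }

/*
 * Before (sketch): int sketch_do_pause(int (*pause_fn)(void))
 *                  { return pause_fn(); }
 * After: the caller passes a boolean, and both calls are direct.
 */
static int sketch_do_pause(bool sync)
{
    if ( sync )
        return sketch_pause_sync();

    return sketch_pause_nosync();
}
```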

Pause/unpause operations on behalf of dom0 are not fastpaths, so avoid
exposing the __domain_pause_by_systemcontroller() internal.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agoxen/arm64: Zero the top 32 bits of gp registers on entry...
Michal Orzel [Fri, 17 Dec 2021 07:21:59 +0000 (08:21 +0100)]
xen/arm64: Zero the top 32 bits of gp registers on entry...

to hypervisor when switching from AArch32 state.

According to section D1.20.2 of the Arm ARM (DDI 0487A.j):
"If the general-purpose register was accessible from AArch32 state the
upper 32 bits either become zero, or hold the value that the same
architectural register held before any AArch32 execution.
The choice between these two options is IMPLEMENTATION DEFINED"

Currently Xen does not ensure that the top 32 bits are zeroed and this
needs to be fixed, because there are places in Xen where we assume that
the top 32 bits are zero for AArch32 guests. If they are not, Xen can
misinterpret what the guest requested. For example hypercalls returning
an error encoded in a signed long like do_sched_op, do_hvm_op,
do_memory_op would return -ENOSYS if the command passed as the first
argument was clobbered.

Create a macro clobber_gp_top_halves to clobber top 32 bits of gp
registers when hyp == 0 (guest mode) and compat == 1 (AArch32 mode).
Add a compile time check to ensure that save_x0_x1 == 1 if
compat == 1.
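
The failure mode can be modelled in plain C as follows. This is only a
sketch: the real fix is an assembly macro on the entry path, and the
command decoder, its valid range and the SKETCH_ENOSYS constant are
invented for illustration.

```c
#include <stdint.h>

#define SKETCH_ENOSYS 38 /* stand-in for the ENOSYS errno value */

/* What zeroing the top half amounts to for one 64-bit register. */
static inline uint64_t sketch_clobber_top_half(uint64_t reg)
{
    return (uint32_t)reg;
}

/* Invented hypercall front end: only commands 0..2 are known. */
static inline long sketch_do_op(uint64_t cmd)
{
    return cmd <= 2 ? 0 : -SKETCH_ENOSYS;
}
```

With stale upper bits, a valid 32-bit command such as 1 is seen as an
unknown 64-bit value and rejected; after zeroing, it is recognised.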

Signed-off-by: Michal Orzel <michal.orzel@arm.com>
[julieng: Tweak the comment in clobber_gp_top_halves]
Acked-by: Julien Grall <jgrall@amazon.com>
3 years agotools/xenstore: drop support for running under SunOS
Juergen Gross [Fri, 17 Dec 2021 07:50:59 +0000 (08:50 +0100)]
tools/xenstore: drop support for running under SunOS

For several years now xenstored has no longer been able to run under
SunOS, as the needed libxengnttab interfaces are not available there.

Several attempts to let the SunOS maintainers address this situation
didn't change anything in this regard.

For those reasons drop SunOS support in xenstored by removing the SunOS
specific code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
3 years agohvmloader: tidy pci_mem_{start,end}
Jan Beulich [Fri, 17 Dec 2021 07:56:34 +0000 (08:56 +0100)]
hvmloader: tidy pci_mem_{start,end}

For one at least pci_mem_start has to be precisely 32 bits wide, so use
uint32_t for both. Otherwise expressions like "pci_mem_start <<= 1"
won't have the intended effect (in their context).
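
The width dependence can be seen in a small sketch (the function names
are invented; this is not hvmloader code). The same shift gives
different results depending on the width of the variable it is applied
to.

```c
#include <stdint.h>

/* In 32 bits the top bit is shifted out and the value wraps to 0. */
static inline uint32_t sketch_double32(uint32_t start)
{
    start <<= 1;
    return start;
}

/* In 64 bits the value silently grows past the 4GiB boundary. */
static inline uint64_t sketch_double64(uint64_t start)
{
    start <<= 1;
    return start;
}
```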

Further since its introduction pci_mem_end was never written to. Mark it
const to make this explicit.

Finally drop PCI_MEM_END: It is used just once and needlessly
disconnected from the other constant (RESERVED_MEMBASE) it needs to
match. Use RESERVED_MEMBASE as initializer of pci_mem_end instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agorevert "hvmloader: PA range 0xfc000000-0xffffffff should be UC"
Jan Beulich [Fri, 17 Dec 2021 07:56:15 +0000 (08:56 +0100)]
revert "hvmloader: PA range 0xfc000000-0xffffffff should be UC"

This reverts commit c22bd567ce22f6ad9bd93318ad0d7fd1c2eadb0d.

While its description is correct from an abstract or real hardware pov,
the range is special inside HVM guests. The range being UC in particular
gets in the way of OVMF, which places itself at [FFE00000,FFFFFFFF].
While this is benign to epte_get_entry_emt() as long as the IOMMU isn't
enabled for a guest, it becomes a very noticeable problem otherwise: It
takes about half a minute for OVMF to decompress itself into its
designated address range.

And even beyond OVMF there's no reason to have e.g. the ACPI memory
range marked UC.

Fixes: c22bd567ce22 ("hvmloader: PA range 0xfc000000-0xffffffff should be UC")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agoarm/efi: Handle Xen bootargs from both xen.cfg and DT
Luca Fancellu [Mon, 13 Dec 2021 11:48:54 +0000 (11:48 +0000)]
arm/efi: Handle Xen bootargs from both xen.cfg and DT

Currently the Xen UEFI stub can accept Xen boot arguments from
the Xen configuration file using the "options=" keyword, but also
directly from the device tree specifying xen,xen-bootargs
property.

When the configuration file is used, device tree boot arguments
are ignored and overwritten even if the keyword "options=" is
not used.

This patch handles this case, so if the Xen configuration file is not
specifying boot arguments, the device tree boot arguments will be
used, if they are present.
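
The resulting precedence can be sketched as below. The function is
hypothetical and is not the UEFI stub's code; treating an empty
"options=" value the same as an absent one is an assumption of this
sketch.

```c
#include <stddef.h>

/*
 * Hypothetical selection helper: prefer boot arguments from the
 * configuration file, fall back to the device tree ones, and use an
 * empty command line if neither source provides any.
 */
static inline const char *sketch_pick_bootargs(const char *cfg_options,
                                               const char *dt_bootargs)
{
    if ( cfg_options && cfg_options[0] )
        return cfg_options;

    if ( dt_bootargs && dt_bootargs[0] )
        return dt_bootargs;

    return "";
}
```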

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 years agoxen/arm: increase memory banks number define value
Luca Fancellu [Thu, 16 Dec 2021 22:43:19 +0000 (14:43 -0800)]
xen/arm: increase memory banks number define value

Currently the maximum number of memory banks (NR_MEM_BANKS define)
is fixed to 128, but on some new platforms that have a large amount
of memory, this value is not enough and prevents Xen from booting.

Increase the value to 256.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
3 years agox86/cpuid: Advertise SERIALIZE by default to guests
Andrew Cooper [Tue, 14 Dec 2021 20:04:17 +0000 (20:04 +0000)]
x86/cpuid: Advertise SERIALIZE by default to guests

I've played with SERIALIZE, TSXLDTRK, MOVDIRI and MOVDIR64 on real hardware,
and they all seem fine, including emulation support.

SERIALIZE exists specifically to have a userspace usable serialising operation
without other side effects.  (The only other two choices are CPUID which is a
VMExit under virt and clobbers 4 registers, and IRET-to-self which is very slow
and consumes content from the stack.)

TSXLDTRK is a niche TSX feature, and TSX itself is niche outside of demos of
speculative sidechannels.  Leave the feature opt-in until a usecase is found,
in an effort to preempt the multiple person years of effort it has taken to
mop up TSX issues impacting every processor line.

MOVDIRI and MOVDIR64 are harder to judge.  They're architectural building
blocks towards ENQCMD{,S} without obvious usecases on their own.  They're of
no use to domains without PCI devices, so leave them opt-in for now.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years agox86/cpuid: Introduce dom0-cpuid command line option
Andrew Cooper [Tue, 14 Dec 2021 16:53:36 +0000 (16:53 +0000)]
x86/cpuid: Introduce dom0-cpuid command line option

Specifically, this lets the user opt in to non-default features.

Collect all dom0 settings together in dom0_{en,dis}able_feat[], and apply it
to dom0's policy when other tweaks are being made.

As recalculate_cpuid_policy() is an expensive action, and dom0-cpuid= is
likely to only be used by the x86 maintainers for development purposes, forgo
the recalculation in the general case.
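
Applying such enable/disable sets to one word of a feature bitmap might
look like the sketch below. The function name is invented, and letting
"disable" win over "enable" for the same bit is an assumption of this
example rather than something stated above.

```c
#include <stdint.h>

/*
 * Hypothetical per-word application of dom0 feature overrides:
 * first opt in to the requested bits, then strip the disabled ones.
 */
static inline uint32_t sketch_apply_word(uint32_t policy, uint32_t enable,
                                         uint32_t disable)
{
    policy |= enable;
    policy &= ~disable;

    return policy;
}
```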

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agox86/cpuid: Factor common parsing out of parse_xen_cpuid()
Andrew Cooper [Wed, 15 Dec 2021 16:30:25 +0000 (16:30 +0000)]
x86/cpuid: Factor common parsing out of parse_xen_cpuid()

dom0-cpuid= is going to want to reuse the common parsing loop, so factor it
out into parse_cpuid().

Irritatingly, despite being static const, the features[] array gets duplicated
each time parse_cpuid() is inlined.  As it is a large (and ever growing with
new CPU features) datastructure, move it to being file scope so all inlines
use the same single object.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agox86/cpuid: Split dom0 handling out of init_domain_cpuid_policy()
Andrew Cooper [Wed, 15 Dec 2021 15:36:59 +0000 (15:36 +0000)]
x86/cpuid: Split dom0 handling out of init_domain_cpuid_policy()

To implement dom0-cpuid= support, the special cases would need extending.
However there is already a problem with late hwdom where the special cases
override toolstack settings, which is unintended and poor behaviour.

Introduce a new init_dom0_cpuid_policy() for the purpose, moving the ITSC and
ARCH_CAPS logic.  The is_hardware_domain() can be dropped, and for now there
is no need to rerun recalculate_cpuid_policy(); this is a relatively expensive
operation, and will become more-so over time.

Rearrange the logic in create_dom0() to make room for a call to
init_dom0_cpuid_policy().  The AMX plans for having variable sized XSAVE
states require that modifications to the policy happen before vCPUs are
created.

Additionally, factor out domid into a variable so we can be slightly more
correct in the case of a failure, and also print the error from
domain_create().  This will at least help distinguish -EINVAL from -ENOMEM.

No practical change in behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agostubdom: only build libxen*.a from tools/libs/
Anthony PERARD [Mon, 6 Dec 2021 17:02:35 +0000 (17:02 +0000)]
stubdom: only build libxen*.a from tools/libs/

Avoid generating *.map files or running headers.chk when all we need
is the libxen*.a.

Also, force make to check again whether libxen*.a needs to be rebuilt by
adding a '.PHONY' prerequisite.

Also, remove DESTDIR= as we don't do installation in this target, so
the value of DESTDIR doesn't matter.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
3 years agotools/Rules.mk: Cleanup %.pc rules
Anthony PERARD [Mon, 6 Dec 2021 17:02:33 +0000 (17:02 +0000)]
tools/Rules.mk: Cleanup %.pc rules

PKG_CONFIG_VARS isn't set anymore, so is dead logic.

For "local" pkg-config file, we only have one headers directory now,
"tools/include", so there is no need to specify it twice. So remove
$(CFLAGS_xeninclude) from "Cflags:".

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agolibs/toolcore: don't install xentoolcore_internal.h anymore
Anthony PERARD [Mon, 6 Dec 2021 17:02:32 +0000 (17:02 +0000)]
libs/toolcore: don't install xentoolcore_internal.h anymore

With "xentoolcore_internal.h" being in LIBHEADER, it was installed. But
its dependency "_xentoolcore_list.h" wasn't installed, so the header
couldn't be used anyway.

This patch also means that the rule "headers.chk" doesn't check it
anymore either.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
3 years agolibs: Remove both "libs" and "build" target
Anthony PERARD [Mon, 6 Dec 2021 17:02:22 +0000 (17:02 +0000)]
libs: Remove both "libs" and "build" target

"libs" is odd and has been introduced without a reason by c7d3afbb44.
Instead, only use "all".

Also remove "build" target as "all" is more appropriate and nothing is
using "build" in libs/ in the xen.git repo.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
3 years agotools/xcutils: rework Makefile
Anthony PERARD [Mon, 6 Dec 2021 17:02:17 +0000 (17:02 +0000)]
tools/xcutils: rework Makefile

Use TARGETS to collect targets to build

Remove "build" target.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Clean up $(RM)]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/vchan: Collect targets in TARGETS
Anthony PERARD [Mon, 6 Dec 2021 17:02:16 +0000 (17:02 +0000)]
tools/vchan: Collect targets in TARGETS

And use the new TARGETS to clean them. Now "clean" will remove
"vchan-socket-proxy".

$(RM) already has the "-f" flag, so remove the second one.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/misc: rework Makefile
Anthony PERARD [Mon, 6 Dec 2021 17:02:15 +0000 (17:02 +0000)]
tools/misc: rework Makefile

Add the missing "xen-detect" rule. The build only works without it because
we still have make's built-in rules and variables; fix this so as not to
rely on them.

Rename $(TARGETS_BUILD) to $(TARGETS).

Remove the unused "build" target.

Also, there are no more "build-only" targets, so remove the extra code.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/debugger: Allow make to recurse into debugger/
Anthony PERARD [Mon, 6 Dec 2021 17:02:06 +0000 (17:02 +0000)]
tools/debugger: Allow make to recurse into debugger/

Avoid the need for explicit rules to recurse into debugger/* dirs by
adding a Makefile in debugger/.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/include/xen-foreign: avoid to rely on default .SUFFIXES
Anthony PERARD [Mon, 6 Dec 2021 17:02:04 +0000 (17:02 +0000)]
tools/include/xen-foreign: avoid to rely on default .SUFFIXES

When a rule isn't a pattern rule, and thus doesn't have a %, the
value of the automatic variable stem $* depends on .SUFFIXES. The GNU
make manual explains that it is better to avoid this "bizarre" behavior,
which exists for compatibility.

Use $(basename ) instead, so that we can one day avoid make's built-in
rules and variables.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/Rules.mk: introduce FORCE target
Anthony PERARD [Mon, 6 Dec 2021 17:02:03 +0000 (17:02 +0000)]
tools/Rules.mk: introduce FORCE target

And replace the one defined in libs.mk.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
3 years agotools: Use config.h from autoconf instead of "buildmakevars2header"
Anthony PERARD [Mon, 6 Dec 2021 17:02:01 +0000 (17:02 +0000)]
tools: Use config.h from autoconf instead of "buildmakevars2header"

This avoids the need to generate the _paths.h header when the
information is from autoconf anyway.

There are no more users of the "buildmakevars2header" macro, so it can
be removed from "Config.mk".

Also removed the extra "-f" flag where "$(RM)" is used (xl/Makefile).

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
3 years agotools/xl: Remove unnecessary -I. from CFLAGS
Anthony PERARD [Mon, 6 Dec 2021 17:02:00 +0000 (17:02 +0000)]
tools/xl: Remove unnecessary -I. from CFLAGS

GCC will search the directory where the source file is for
quote-includes.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/ocaml: Remove generation of _paths.h
Anthony PERARD [Mon, 6 Dec 2021 17:01:59 +0000 (17:01 +0000)]
tools/ocaml: Remove generation of _paths.h

_paths.h isn't useful anymore in systemd_stubs.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
3 years agotools/libacpi: cleanup Makefile, don't check for iasl binary
Anthony PERARD [Mon, 6 Dec 2021 17:01:58 +0000 (17:01 +0000)]
tools/libacpi: cleanup Makefile, don't check for iasl binary

iasl is checked for presence by ./configure, so this Makefile
doesn't have to do it. Also start to use the $(IASL) variable that
./configure generates.

iasl hasn't been downloaded by our build system for a while and the
dependency on iasl is in the main xen.git README.

Make use of $< in one rule instead of spelling the %.asl file again.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/flask/utils: remove unused variables/targets from Makefile
Anthony PERARD [Mon, 6 Dec 2021 17:01:57 +0000 (17:01 +0000)]
tools/flask/utils: remove unused variables/targets from Makefile

There are no *.opic or *.so in this subdir, so no need to clean them.

The TEST* variables don't seem to be used anywhere, and they weren't
used by xen.git when introduced.
Neither of the CLIENTS_* variables is used.
Both targets "print-dir" and "print-end" only exist in this directory
and are probably not used anywhere.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
[Drop trailing whitespace and use $(RM) consistently]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agotools/libs: Don't recursively expand MAJOR ?= $(shell ...)
Andrew Cooper [Mon, 13 Dec 2021 18:49:17 +0000 (18:49 +0000)]
tools/libs: Don't recursively expand MAJOR ?= $(shell ...)

?= is a deferred assignment.  Switch to an alternative form which lets us use
an immediate assignment.

Before, version.sh gets run anywhere between 46 and 88 times, with 50 on a
`clean`.  After, between 6 and 12 times.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years agotools/libxl: Don't read STORE/CONSOLE_PFN from Xen
Andrew Cooper [Thu, 9 Dec 2021 16:59:06 +0000 (16:59 +0000)]
tools/libxl: Don't read STORE/CONSOLE_PFN from Xen

The values are already available in dom->{console,xenstore}_pfn, just like on
the PV side of things.  No need to ask Xen.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
3 years agoxen/build: Fix `make cscope` rune
Andrew Cooper [Thu, 16 Dec 2021 02:38:57 +0000 (02:38 +0000)]
xen/build: Fix `make cscope` rune

There are two problems, both in the all_sources definition.

First, everything in arch/*/include gets double hits with cscope queries,
because they end up getting listed twice in cscope.files.

Drop the first `find` rune of the three, because it's redundant with the third
rune following c/s 725381a5eab3 ("xen: move include/asm-* to
arch/*/include/asm").

Second, and this has been the case for a long time:

  $ make cscope
  ( find arch/x86/include -name '*.h' -print; find include -name '*.h' -print;
  find xsm arch/x86 common drivers lib test -name '*.[chS]' -print ) >
  cscope.files
  cscope -k -b -q
  cscope: cannot find file arch/x86/efi/efi.h
  cscope: cannot find file arch/x86/efi/ebmalloc.c
  cscope: cannot find file arch/x86/efi/compat.c
  cscope: cannot find file arch/x86/efi/pe.c
  cscope: cannot find file arch/x86/efi/boot.c
  cscope: cannot find file arch/x86/efi/runtime.c

This is caused by these being symlinks to common/efi.  Restrict all find runes
to `-type f` to skip symlinks, because common/efi/*.c are already listed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com>
3 years agoxen: make some per-scheduler performance counters sched global ones
Juergen Gross [Thu, 16 Dec 2021 05:45:02 +0000 (06:45 +0100)]
xen: make some per-scheduler performance counters sched global ones

Some performance counters listed to be credit or credit2 specific are
being used by the null scheduler, too.

Make those sched global ones.

Fixes: ab6ba8c6753fa76 ("perfc: conditionalize credit/credit2 counters")
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
3 years agoxen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m
Oleksandr Andrushchenko [Thu, 9 Dec 2021 07:29:18 +0000 (09:29 +0200)]
xen/arm: do not map PCI ECAM and MMIO space to Domain-0's p2m

PCI host bridges are special devices in terms of implementing PCI
passthrough. According to [1] the current implementation depends on
Domain-0 to perform the initialization of the relevant PCI host
bridge hardware and perform PCI device enumeration. In order to
achieve that one of the required changes is to not map all the memory
ranges in map_range_to_domain as we traverse the device tree on startup
and perform some additional checks if the range needs to be mapped to
Domain-0.

The generic PCI host controller device tree binding says [2]:
- ranges: As described in IEEE Std 1275-1994, but must provide
          at least a definition of non-prefetchable memory. One
          or both of prefetchable Memory and IO Space may also
          be provided.

- reg   : The Configuration Space base address and size, as accessed
          from the parent bus.  The base address corresponds to
          the first bus in the "bus-range" property.  If no
          "bus-range" is specified, this will be bus 0 (the default).

From the above none of the memory ranges from the "ranges" property
needs to be mapped to Domain-0 at startup as MMIO mapping is going to
be handled dynamically by vPCI as we assign PCI devices, e.g. each
device assigned to Domain-0/guest will have its MMIOs mapped/unmapped
as needed by Xen.

The "reg" property covers not only ECAM space, but may also describe
memory ranges other than the configuration ones, for example [3]:
- reg: Should contain rc_dbi, config registers location and length.
- reg-names: Must include the following entries:
   "rc_dbi": controller configuration registers;
   "config": PCIe configuration space registers.

This patch makes it possible to not map all the ranges from the
"ranges" property and also ECAM from the "reg". All the rest from the
"reg" property still needs to be mapped to Domain-0, so the PCI
host bridge remains functional in Domain-0. This is done by first
skipping the mappings while traversing the device tree as it is done for
usual devices and then by calling a dedicated pci_host_bridge_mappings
function which only maps MMIOs required by the host bridges leaving the
regions, needed for vPCI traps, unmapped.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-07/msg00777.html
[2] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/hisilicon-pcie.txt

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>
3 years agoxen/arm: account IO handler for emulated PCI host bridge
Oleksandr Andrushchenko [Thu, 9 Dec 2021 07:29:17 +0000 (09:29 +0200)]
xen/arm: account IO handler for emulated PCI host bridge

At the moment, we always allocate an extra 16 slots for IO handlers
(see MAX_IO_HANDLER). So while adding an IO trap handler for the emulated
PCI host bridge we are not breaking anything, but we have a latent bug
as the maximum number of IOs may be exceeded.
Fix this by explicitly stating that we have an additional IO handler, so
that it is accounted for.

Fixes: d59168dc05a5 ("xen/arm: Enable the existing x86 virtual PCI support for ARM")
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>
3 years agoxen/arm: setup MMIO range trap handlers for hardware domain
Oleksandr Andrushchenko [Thu, 9 Dec 2021 07:29:16 +0000 (09:29 +0200)]
xen/arm: setup MMIO range trap handlers for hardware domain

In order for vPCI to work it needs to maintain guest and hardware
domain's views of the configuration space. For example, BARs and
COMMAND registers require emulation for guests and the guest view
of the registers needs to be in sync with the real contents of the
relevant registers. For that ECAM address space needs to also be
trapped for the hardware domain, so we need to implement PCI host
bridge specific callbacks to properly setup MMIO handlers for those
ranges depending on particular host bridge implementation.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>
[julieng: Add ASSERT_UNREACHABLE()]
Acked-by: Julien Grall <jgrall@amazon.com>
3 years agoxen/arm: add pci-domain for disabled devices
Oleksandr Andrushchenko [Thu, 9 Dec 2021 07:29:15 +0000 (09:29 +0200)]
xen/arm: add pci-domain for disabled devices

If a PCI host bridge device is present in the device tree, but is
disabled, then its PCI host bridge driver was not instantiated.
This results in the failure of the pci_get_host_bridge_segment()
and the following panic during Xen start:

(XEN) Device tree generation failed (-22).
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Could not set up DOM0 guest OS
(XEN) ****************************************

Fix this by adding "linux,pci-domain" property for all device tree nodes
which have "pci" device type, so we know which segments will be used by
the guest for which bridges.

Fixes: 4cfab4425d39 ("xen/arm: Add linux,pci-domain property for hwdom if not available.")
Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Rahul Singh <rahul.singh@arm.com>
Tested-by: Rahul Singh <rahul.singh@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
3 years agoarm/traps: remove debugger_trap_fatal() calls
Bobby Eshleman [Tue, 28 Sep 2021 20:30:24 +0000 (13:30 -0700)]
arm/traps: remove debugger_trap_fatal() calls

ARM doesn't actually use any of the debugger_trap_* hooks, which are stubbed out.

This commit simply removes the unneeded calls.

Signed-off-by: Bobby Eshleman <bobby.eshleman@gmail.com>
Acked-by: Julien Grall <jgrall@amazon.com>
3 years agoArm: drop memguard_{,un}guard_range() stubs
Jan Beulich [Wed, 15 Dec 2021 09:24:45 +0000 (10:24 +0100)]
Arm: drop memguard_{,un}guard_range() stubs

These exist for no reason: The code using them is only ever built for
Arm32. And memguard_guard_stack() has no use outside of x86-specific
code at all.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
3 years agox86: drop MEMORY_GUARD
Jan Beulich [Wed, 15 Dec 2021 09:23:51 +0000 (10:23 +0100)]
x86: drop MEMORY_GUARD

The functions it guards are dead code. Worse, while intended to exist in
debug builds only, as of commit bacbf0cb7349 ("build: convert debug to
Kconfig") they also get compiled in release builds.

The remaining uses in show_stack_overflow() aren't really related to any
memory guarding anymore - with CET-SS support the stacks now get set up
the same in debug and release builds. Drop them as well; there's no harm
providing the information there in all cases.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agox86/PVH: permit more physdevop-s to be used by Dom0
Jan Beulich [Wed, 15 Dec 2021 09:20:35 +0000 (10:20 +0100)]
x86/PVH: permit more physdevop-s to be used by Dom0

Certain notifications of Dom0 to Xen are independent of the mode Dom0 is
running in. Permit further PCI related ones (only their modern forms).
Also include the USB2 debug port operation at this occasion. While
largely relevant for the latter, drop the has_vpci() part of the
conditional as redundant with is_hardware_domain(): There's no PVH Dom0
without vPCI.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agox86/PVH: improve Dom0 memory size calculation
Jan Beulich [Wed, 15 Dec 2021 09:19:54 +0000 (10:19 +0100)]
x86/PVH: improve Dom0 memory size calculation

Assuming that the accounting for IOMMU page tables will also take care
of the P2M needs was wrong: dom0_paging_pages() can determine a far
higher value, high enough for the system to run out of memory while
setting up Dom0. Hence in the case of shared page tables the larger of
the two values needs to be used (without shared page tables the sum of
both continues to be applicable).
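
The rule described above can be sketched in C as follows. This is only an
illustration of the max-versus-sum choice, not the code in the patch: the
function name, its parameters, and the idea of passing both page counts in as
plain integers are assumptions made for the example.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a Xen function): decide how many pages to set
 * aside for Dom0's paging structures.
 *
 * iommu_pages: estimate for the IOMMU page tables
 * p2m_pages:   estimate for the P2M (what dom0_paging_pages() would report)
 * shared_pt:   true when IOMMU and P2M share one set of page tables
 */
static unsigned long dom0_reserve_pages(unsigned long iommu_pages,
                                        unsigned long p2m_pages,
                                        bool shared_pt)
{
    /* Shared tables: one set serves both, so the larger estimate wins. */
    if ( shared_pt )
        return iommu_pages > p2m_pages ? iommu_pages : p2m_pages;

    /* Separate tables: both sets get allocated, so account for the sum. */
    return iommu_pages + p2m_pages;
}
```

With estimates of 100 and 400 pages, this reserves 400 pages in the shared
case and 500 pages otherwise.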

To not further complicate the logic, eliminate the up-to-2-iteration
loop in favor of doing a few calculations twice (before and after
calling dom0_paging_pages()). While this will lead to slightly too high
a value in "cpu_pages", it is deemed better to account a few too many
than a few too few.

As a result the calculation is now deemed good enough to no longer
warrant the warning message, which therefore gets dropped.

Also uniformly use paging_mode_enabled(), not is_hvm_domain().

While there also account for two further aspects in the PV case: With
"iommu=dom0-passthrough" no IOMMU page tables would get allocated, so
none need accounting for. And if shadow mode is to be enabled (including
only potentially, because of "pv-l1tf=dom0"), setting aside a suitable
amount for the P2M pool to get populated is also necessary (i.e. similar
to the non-shared-page-tables case of PVH).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
3 years agobuild: adjust $(TARGET).efi creation in arch/arm
Anthony PERARD [Wed, 15 Dec 2021 09:17:34 +0000 (10:17 +0100)]
build: adjust $(TARGET).efi creation in arch/arm

There is no need to try to guess a relative path to the "xen.efi" file,
we can simply use $@. Also, there's no need to use `notdir`, make
already does that work via $(@F).

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
3 years agobuild: generate "include/xen/compile.h" with if_changed
Anthony PERARD [Wed, 15 Dec 2021 09:16:51 +0000 (10:16 +0100)]
build: generate "include/xen/compile.h" with if_changed

This will avoid regenerating "compile.h" if the content hasn't changed.

As is currently the case, the file isn't regenerated during `sudo
make install` if it exists and belongs to a different user, thus we
can remove the target "delete-unfresh-files". Target "$(TARGET)" still
needs a phony dependency, so add "FORCE".

Use "$(dot-target).tmp" as the temporary file, as this is already covered
by the ".*.tmp" pattern in ".gitconfig".

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
3 years agoxen: move include/asm-* to arch/*/include/asm
Anthony PERARD [Wed, 15 Dec 2021 09:14:13 +0000 (10:14 +0100)]
xen: move include/asm-* to arch/*/include/asm

This avoids the need to create the symbolic link "include/asm".

Whenever a comment refers to an "asm" header, this patch avoids
spelling out the arch when not needed, to limit code churn.

One unrelated change is to sort entries in MAINTAINERS for "INTEL(R)
VT FOR X86 (VT-X)"

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Paul Durrant <paul@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
3 years agobuild: factorise generation of the linker scripts
Anthony PERARD [Wed, 15 Dec 2021 09:08:38 +0000 (10:08 +0100)]
build: factorise generation of the linker scripts

In the Arm and x86 makefiles, generating the linker script is the same, so
we can simply have both call the same macro.

We need to add *.lds files into extra-y so that Rules.mk can find the
.*.cmd dependency file and load it.

Changes made to the command line:
- Use the cpp_flags macro, which simply filters -Wa,% options from $(a_flags).
- Add -D__LINKER__ even though it is only used by Arm's lds.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
3 years agox86/cpuid: Fix TSXLDTRK definition
Andrew Cooper [Mon, 13 Dec 2021 20:33:42 +0000 (20:33 +0000)]
x86/cpuid: Fix TSXLDTRK definition

TSXLDTRK lives in CPUID leaf 7[0].edx, not 7[0].ecx.

Bit 16 in ecx is LA57.
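
As a small illustration of why the register matters, the sketch below tests
bit 16 in each register separately. The macro and function names are invented
for this example and are not Xen's identifiers; the caller is assumed to have
already read the ecx and edx outputs of CPUID leaf 7, subleaf 0.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only. Both features sit at bit 16, in different registers. */
#define LEAF7_0_ECX_LA57      (UINT32_C(1) << 16)
#define LEAF7_0_EDX_TSXLDTRK  (UINT32_C(1) << 16)

/* TSXLDTRK must be looked up in edx of CPUID leaf 7, subleaf 0. */
static bool leaf7_has_tsxldtrk(uint32_t edx)
{
    return (edx & LEAF7_0_EDX_TSXLDTRK) != 0;
}

/* The same bit position in ecx reports LA57 instead. */
static bool leaf7_has_la57(uint32_t ecx)
{
    return (ecx & LEAF7_0_ECX_LA57) != 0;
}
```

Checking the ecx value for TSXLDTRK would therefore report the feature on any
CPU with LA57, whether or not TSXLDTRK is actually present.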

Fixes: a6d1b558471f ("x86emul: support X{SUS,RES}LDTRK")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agoperfc: drop calls_to_multicall performance counter
Juergen Gross [Tue, 14 Dec 2021 08:50:07 +0000 (09:50 +0100)]
perfc: drop calls_to_multicall performance counter

The calls_to_multicall performance counter is basically redundant to
the multicall hypercall counter. The only difference is the counting
of continuation calls, which isn't really that interesting.

Drop the calls_to_multicall performance counter.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agox86/perfc: add hypercall performance counters for hvm, correct pv
Juergen Gross [Tue, 14 Dec 2021 08:49:23 +0000 (09:49 +0100)]
x86/perfc: add hypercall performance counters for hvm, correct pv

The HVM hypercall handler does not increment the per-hypercall
counters. Add that.

The counters for PV are handled incorrectly, as they are not using
perf_incra() with the number of the hypercall as index, but are
incrementing the first hypercall entry (set_trap_table) for each
hypercall. Fix that.
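
The difference between the broken and the intended accounting can be sketched
with a plain counter array. This is a simplified stand-in, not Xen's perfc
implementation: the array, its size, and both function names are assumptions
made for the example, and perf_incra() itself is not reproduced here.

```c
/* Illustrative table size; not Xen's actual number of hypercalls. */
#define NR_HYPERCALLS_SKETCH 64

static unsigned long hypercall_count[NR_HYPERCALLS_SKETCH];

/* Broken pattern: every hypercall bumps entry 0 (set_trap_table's slot). */
static void count_hypercall_broken(unsigned int nr)
{
    (void)nr;               /* the hypercall number is ignored */
    hypercall_count[0]++;
}

/* Intended pattern: use the hypercall number as the array index. */
static void count_hypercall(unsigned int nr)
{
    if ( nr < NR_HYPERCALLS_SKETCH )  /* ignore out-of-range numbers */
        hypercall_count[nr]++;
}
```

With the broken variant, the first entry ends up holding the total of all
hypercalls while every other entry stays at zero; the indexed variant gives
each hypercall its own count.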

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
3 years agox86emul: drop "seg" parameter from insn_fetch() hook
Jan Beulich [Tue, 14 Dec 2021 08:48:17 +0000 (09:48 +0100)]
x86emul: drop "seg" parameter from insn_fetch() hook

This is specified (and asserted for in a number of places) to always be
CS. Passing this as an argument in various places is therefore
pointless. The price to pay is two simple new functions, with the
benefit of the PTWR case now gaining a more appropriate error code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Paul Durrant <paul@xen.org>