xenbits.xensource.com Git - people/liuw/xen.git/log
5 years agoconfigure: fix print syntax for python 3 python-fix
Wei Liu [Wed, 18 Sep 2019 16:07:50 +0000 (17:07 +0100)]
configure: fix print syntax for python 3

16cc3362a missed one print statement.

Signed-off-by: Wei Liu <wl@xen.org>
5 years agotools/arm: optee: create optee firmware node in DT if tee=optee
Volodymyr Babchuk [Wed, 19 Jun 2019 17:54:19 +0000 (17:54 +0000)]
tools/arm: optee: create optee firmware node in DT if tee=optee

If TEE support is enabled with the "tee=optee" option in xl.cfg,
then we need to inform the guest about the available TEE by creating
the corresponding node in the guest's device tree.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agotools/arm: tee: add "tee" option for xl.cfg
Volodymyr Babchuk [Wed, 19 Jun 2019 17:54:16 +0000 (17:54 +0000)]
tools/arm: tee: add "tee" option for xl.cfg

This enumeration controls the TEE type for a domain. Currently there are
two possible options: either 'none' or 'optee'.

'none' is the default value and it disables TEE support
altogether.

'optee' enables access to the OP-TEE running on a host machine. This
requires special OP-TEE build with virtualization support enabled.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
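
As a rough illustration of the two-value enumeration described in the commit above, the following C sketch maps the string given for the "tee" option to an enum. The names `tee_type`, `TEE_TYPE_NONE`, `TEE_TYPE_OPTEE` and `parse_tee` are hypothetical and are not claimed to be the identifiers libxl uses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the TEE type enumeration (not libxl's real type). */
enum tee_type {
    TEE_TYPE_NONE,   /* default: no TEE support for the domain */
    TEE_TYPE_OPTEE   /* guest may access the OP-TEE running on the host */
};

/*
 * Map the value of a "tee" option to the enumeration.
 * A missing option (NULL) selects the default, 'none'.
 * Returns 0 on success, -1 for an unrecognised value.
 */
static inline int parse_tee(const char *value, enum tee_type *out)
{
    assert(out != NULL);

    if (value == NULL || strcmp(value, "none") == 0) {
        *out = TEE_TYPE_NONE;
        return 0;
    }
    if (strcmp(value, "optee") == 0) {
        *out = TEE_TYPE_OPTEE;
        return 0;
    }
    return -1;
}
```

Treating a missing option as 'none' mirrors the stated default; rejecting anything else keeps a typo from silently disabling the TEE.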
5 years agox86: PCID is unused when !PV
Jan Beulich [Wed, 18 Sep 2019 13:21:51 +0000 (15:21 +0200)]
x86: PCID is unused when !PV

This allows in particular some streamlining of the TLB flushing code
paths.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/CPUID: drop INVPCID dependency on PCID
Jan Beulich [Wed, 18 Sep 2019 13:20:00 +0000 (15:20 +0200)]
x86/CPUID: drop INVPCID dependency on PCID

PCID validly depends on LM, as it can be enabled in Long Mode only.
INVPCID, otoh, can be used not only without PCID enabled, but also
outside of Long Mode altogether. In both cases its functionality is
simply restricted to PCID 0, which is sort of expected as no other PCID
can be activated there.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/HVM: relax shadow mode check in hvm_set_cr3()
Jan Beulich [Wed, 18 Sep 2019 13:19:08 +0000 (15:19 +0200)]
x86/HVM: relax shadow mode check in hvm_set_cr3()

There's no need to re-obtain a page reference if only bits not affecting
the address change.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86: limit the amount of TLB flushing in switch_cr3_cr4()
Jan Beulich [Wed, 18 Sep 2019 13:14:49 +0000 (15:14 +0200)]
x86: limit the amount of TLB flushing in switch_cr3_cr4()

We really need to flush the TLB just once, if we do so with or after the
CR3 write. The only case where two flushes are unavoidable is when we
mean to turn off CR4.PGE (perhaps just temporarily; see the code
comment).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86: adjust cr3_pcid() return type
Jan Beulich [Wed, 18 Sep 2019 13:14:08 +0000 (15:14 +0200)]
x86: adjust cr3_pcid() return type

There's no need for it to be 64 bits wide - only the low twelve bits
of CR3 hold the PCID.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
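
To make the reasoning in the cr3_pcid() commit above concrete, here is a minimal sketch of extracting the PCID from a CR3 value. The mask name `X86_CR3_PCID_MASK` and the function body are assumptions made for illustration; only the "low twelve bits" fact comes from the commit message.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Assumed mask name: CR3 bits 11:0 hold the PCID. */
#define X86_CR3_PCID_MASK 0xfffULL

/* Twelve bits always fit in an unsigned int, so no 64-bit return is needed. */
static_assert(X86_CR3_PCID_MASK <= UINT_MAX, "PCID must fit in unsigned int");

/* Sketch of a narrowed cr3_pcid(): return only the PCID field of CR3. */
static inline unsigned int cr3_pcid(uint64_t cr3)
{
    return (unsigned int)(cr3 & X86_CR3_PCID_MASK);
}
```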
5 years agox86emul: treat Hygon guests like AMD ones
Jan Beulich [Wed, 18 Sep 2019 13:13:21 +0000 (15:13 +0200)]
x86emul: treat Hygon guests like AMD ones

For some reason the Hygon enabling series left out the insn emulator.
Make appropriate adjustments wherever we've been special casing AMD.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
5 years agocore-parking: interact with runtime SMT-disabling
Jan Beulich [Wed, 18 Sep 2019 13:12:33 +0000 (15:12 +0200)]
core-parking: interact with runtime SMT-disabling

When disabling SMT at runtime, secondary threads should no longer be
candidates for bringing back up in response to _PUR ACPI events. Purge
them from the tracking array.

Doing so involves adding locking to guard accounting data in the core
parking code. While adding the declaration for the lock, take the
liberty to drop two unnecessary forward function declarations.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agotools/libs: Fix build following c/s 56dccee3f, take 2
Andrew Cooper [Tue, 17 Sep 2019 18:30:05 +0000 (19:30 +0100)]
tools/libs: Fix build following c/s 56dccee3f, take 2

The fix for c/s 01ba8f62b618 was speculative given no local repro.  It turns
out that it didn't fix the problem.

The $(AUTOINCS) variable needs to be visible before libs.mk is included, to
have any effect.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agotools/libs: Fix build following c/s 56dccee3f
Andrew Cooper [Tue, 17 Sep 2019 17:39:14 +0000 (18:39 +0100)]
tools/libs: Fix build following c/s 56dccee3f

Travis reports:

  make subdirs-install
  make[2]: Entering directory `/home/travis/build/andyhhp/xen/tools'
  make[3]: Entering directory `/home/travis/build/andyhhp/xen/tools'
  make -C libs install
  make[4]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs'
  make[5]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs'
  make -C toolcore install
  make[6]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs/toolcore'
  make libs
  make[7]: Entering directory `/home/travis/build/andyhhp/xen/tools/libs/toolcore'
  for i in include/xentoolcore.h include/xentoolcore_internal.h; do \
          gcc -x c -ansi -Wall -Werror -I<snip>/xen/tools/libs/toolcore/../../../tools/include \
                    -S -o /dev/null $i || exit 1; \
                        echo $i; \
                        done >headers.chk.new
  include/xentoolcore_internal.h:30:31: fatal error: _xentoolcore_list.h: No such file or directory
   #include "_xentoolcore_list.h"
                                 ^
  compilation terminated.
  make[7]: *** [headers.chk] Error 1

The problem is that xentoolcore_internal.h includes _xentoolcore_list.h which
hasn't been generated yet.

The toolcore headers.chk rule (unlike the other libraries) had an additional
dependency against $(AUTOINCS), which forced the headers to be generated
first.  Replicate this in the common libs.mk

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agoxen/arm: Zero BSS after the MMU and D-cache is turned on
Julien Grall [Sun, 14 Apr 2019 20:46:29 +0000 (21:46 +0100)]
xen/arm: Zero BSS after the MMU and D-cache is turned on

At the moment BSS is zeroed before the MMU and D-Cache are turned on.
In other words, the cache will be bypassed when zeroing the BSS section.

On Arm64, per the Image protocol [1], the state of the cache for BSS region
is not known because it is not part of the "loaded kernel image".

On Arm32, the boot protocol [2] does not mention anything about the
state of the cache. Therefore, it should be assumed that it is not known
for BSS region.

This means that the cache will need to be invalidated twice for the BSS
region:
    1) Before zeroing to remove any dirty cache line. Otherwise they may
    get evicted while zeroing and therefore overwrite the zeroed value.
    2) After zeroing to remove any cache line that may have been
    speculated. Otherwise when turning on MMU and D-Cache, the CPU may
    see old values.

At the moment, the only reason to have BSS zeroed early is because the
boot page tables are part of it. To avoid the two cache invalidations,
it would be better if the boot page tables are part of the "loaded
kernel image" and therefore be zeroed when loading the image into
memory. A good candidate is the section .data.page_aligned.

A new macro DEFINE_BOOT_PAGE_TABLE is introduced to create and mark
page-tables used before BSS is zeroed. This includes all boot_* but also
xen_fixmap as zero_bss() will print a message when earlyprintk is
enabled.

[1] linux/Documentation/arm64/booting.txt
[2] linux/Documentation/arm/Booting

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
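
The idea behind DEFINE_BOOT_PAGE_TABLE in the commit above can be sketched as follows. This is not Xen's definition: `lpae_t` is reduced to a plain 64-bit integer, the sizes are assumed for a 4 KiB granule, and the section attribute is only applied on ELF targets. The point is simply that an explicitly placed, page-aligned object ends up in a data section of the loaded image rather than in BSS.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE    4096
#define LPAE_ENTRIES 512             /* assumed: 4 KiB page / 8-byte entries */

typedef uint64_t lpae_t;             /* simplified stand-in for Xen's lpae_t */

static_assert(sizeof(lpae_t) * LPAE_ENTRIES == PAGE_SIZE,
              "one table fills exactly one page");

/* Only name the section on ELF targets, where this spelling is valid. */
#if defined(__ELF__)
#define BOOT_PT_SECTION __attribute__((section(".data.page_aligned")))
#else
#define BOOT_PT_SECTION
#endif

/*
 * Define a page-aligned table that lives in a data section, so it is part
 * of the loaded image and already zero before zero_bss() runs.
 */
#define DEFINE_BOOT_PAGE_TABLE(name)                         \
    lpae_t name[LPAE_ENTRIES]                                \
        __attribute__((aligned(PAGE_SIZE))) BOOT_PT_SECTION = { 0 }

DEFINE_BOOT_PAGE_TABLE(boot_pgtable);
DEFINE_BOOT_PAGE_TABLE(xen_fixmap);
```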
5 years agoxen/arm32: head: Setup HTTBR in enable_mmu() and add missing isb
Julien Grall [Sat, 20 Apr 2019 13:36:50 +0000 (14:36 +0100)]
xen/arm32: head: Setup HTTBR in enable_mmu() and add missing isb

At the moment, HTTBR is set up in create_page_tables(). This is fine as
it is called by every CPU.

However, such an assumption may not hold in the future. To make changes
easier, the HTTBR is now set up in enable_mmu().

Take the opportunity to add the missing isb() to ensure the HTTBR is
seen before the MMU is turned on.

Lastly, the only use of r5 in create_page_tables() is now removed. So
the register can be removed from the clobber list of the function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Rework and document launch()
Julien Grall [Mon, 22 Jul 2019 15:08:30 +0000 (16:08 +0100)]
xen/arm32: head: Rework and document launch()

Boot CPU and secondary CPUs will use different entry points to C code. At
the moment, the decision on which entry to use is taken within launch().

In order to avoid using conditional instructions and make the call
clearer, launch() is reworked to take the entry point and its arguments
as parameters.

Lastly, document the behavior and the main registers usage within the
function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agodrivers/acpi: Drop "ERST table was not found" message
Andrew Cooper [Fri, 13 Sep 2019 16:17:21 +0000 (17:17 +0100)]
drivers/acpi: Drop "ERST table was not found" message

ERST isn't a mandatory table, and also isn't very common to find.  The message
is unnecessary noise during boot.  Furthermore, it is redundant with the list
of found ACPI tables printed just ahead.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/vpmu: Drop "VPMU: disabled" message
Andrew Cooper [Fri, 13 Sep 2019 16:13:35 +0000 (17:13 +0100)]
x86/vpmu: Drop "VPMU: disabled" message

Printing "$foo disabled" is unnecessary noise during boot.  All other VPMU
settings emit a message, so this doesn't result in any ambiguity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agotools/libs: put common Makefile parts into new libs.mk
Juergen Gross [Fri, 6 Sep 2019 12:41:03 +0000 (14:41 +0200)]
tools/libs: put common Makefile parts into new libs.mk

The Makefiles below tools/libs have a lot in common. Put those common
parts into a new libs.mk and include that from the specific Makefiles.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
5 years agovpci: honor read-only devices
Roger Pau Monné [Tue, 17 Sep 2019 14:13:39 +0000 (16:13 +0200)]
vpci: honor read-only devices

Don't allow the hardware domain write access to the PCI config space of
devices marked as read-only.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agosysctl / libxl: report whether IOMMU/HAP page table sharing is supported
Paul Durrant [Tue, 17 Sep 2019 14:12:47 +0000 (16:12 +0200)]
sysctl / libxl: report whether IOMMU/HAP page table sharing is supported

This patch defines a new bit reported in the hw_cap field of struct
xen_sysctl_physinfo to indicate whether the platform supports sharing of
HAP page tables (i.e. the P2M) with the IOMMU. This informs the toolstack
whether the domain needs extra memory to store discrete IOMMU page tables
or not.

NOTE: This patch makes sure iommu_hap_pt_shared is clear if HAP is not
      supported or the IOMMU is disabled, and defines it to false if
      !CONFIG_HVM.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agouse is_iommu_enabled() where appropriate...
Paul Durrant [Tue, 17 Sep 2019 14:11:48 +0000 (16:11 +0200)]
use is_iommu_enabled() where appropriate...

...rather than testing the global iommu_enabled flag and ops pointer.

Now that there is a per-domain flag indicating whether the domain is
permitted to use the IOMMU (which determines whether the ops pointer will
be set), many tests of the global iommu_enabled flag and ops pointer can
be translated into tests of the per-domain flag. Some of the other tests of
purely the global iommu_enabled flag can also be translated into tests of
the per-domain flag.

NOTE: The comment in iommu_share_p2m_table() is also fixed; need_iommu()
      disappeared some time ago. Also, whilst the style of the 'if' in
      flask_iommu_resource_use_perm() is fixed, I have not translated any
      instances of u32 into uint32_t to keep consistency. IMO such a
      translation would be better done globally for the source module in
      a separate patch.
      The change to the definition of iommu_call() is to keep the PV shim
      build happy. Without this change it will fail to compile with errors
      of the form:

iommu.c:361:32: error: unused variable ‘hd’ [-Werror=unused-variable]
     const struct domain_iommu *hd = dom_iommu(d);
                                     ^~

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: "Roger Pau Monné" <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agodomain: introduce XEN_DOMCTL_CDF_iommu flag
Paul Durrant [Tue, 17 Sep 2019 14:10:38 +0000 (16:10 +0200)]
domain: introduce XEN_DOMCTL_CDF_iommu flag

This patch introduces a common domain creation flag to determine whether
the domain is permitted to make use of the IOMMU. Currently the flag is
always set for both dom0 and any domU created by libxl if the IOMMU is
globally enabled (i.e. iommu_enabled == 1). sanitise_domain_config() is
modified to reject the flag if !iommu_enabled.

A new helper function, is_iommu_enabled(), is added to test the flag and
iommu_domain_init() will return immediately if !is_iommu_enabled(). This is
slightly different to the previous behaviour based on !iommu_enabled where
the call to arch_iommu_domain_init() was made regardless, however it appears
that this call was only necessary to initialize the dt_devices list for ARM
such that iommu_release_dt_devices() can be called unconditionally by
domain_relinquish_resources(). Adding a simple check of is_iommu_enabled()
into iommu_release_dt_devices() keeps this unconditional call working.

No functional change should be observed with this patch applied.

Subsequent patches will allow the toolstack to control whether use of the
IOMMU is enabled for a domain.

NOTE: The introduction of the is_iommu_enabled() helper function might
      seem excessive but its use is expected to increase with subsequent
      patches. Also, having iommu_domain_init() bail before calling
      arch_iommu_domain_init() is not strictly necessary, but I think the
      consequent addition of the call to is_iommu_enabled() in
      iommu_release_dt_devices() makes the code clearer.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: "Roger Pau Monné" <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
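
The per-domain flag logic described in the XEN_DOMCTL_CDF_iommu commit above can be sketched roughly as below. Everything here is simplified: `struct domain` is reduced to a single field, the bit value chosen for `XEN_DOMCTL_CDF_iommu` is a placeholder, and the return value of `iommu_domain_init()` is invented for the sketch so the early exit is observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit; the real value is defined in Xen's public headers. */
#define XEN_DOMCTL_CDF_iommu (1U << 5)

/* Global "the IOMMU is usable on this host" flag. */
static bool iommu_enabled;

/* Heavily reduced stand-in for struct domain: only the creation flags. */
struct domain {
    uint32_t options;
};

/* Per-domain test that can replace checks of the global flag. */
static inline bool is_iommu_enabled(const struct domain *d)
{
    assert(d != NULL);
    return (d->options & XEN_DOMCTL_CDF_iommu) != 0;
}

/* Reject the creation flag when the IOMMU is globally disabled. */
static inline int sanitise_domain_config(uint32_t flags)
{
    if ((flags & XEN_DOMCTL_CDF_iommu) && !iommu_enabled)
        return -EINVAL;
    return 0;
}

/* Return immediately for domains that may not use the IOMMU. */
static inline int iommu_domain_init(struct domain *d)
{
    if (!is_iommu_enabled(d))
        return 0;
    /* ... arch-specific initialisation would follow here ... */
    return 1;   /* sketch only: 1 means "initialisation was attempted" */
}
```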
5 years agosched: populate cpupool0 only after all cpus are up
Juergen Gross [Tue, 17 Sep 2019 14:09:50 +0000 (16:09 +0200)]
sched: populate cpupool0 only after all cpus are up

Simplify cpupool initialization by populating cpupool0 with cpus only
after all cpus are up. This avoids having to call the cpu notifier
directly for cpu 0.

With that in place there is no need to create cpupool0 earlier, so
do that just before assigning the cpus. Initialize free cpus with all
online cpus at that time in order to be able to add the cpu notifier
late, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
5 years agospinlocks: print lock profile info in panic()
Juergen Gross [Tue, 17 Sep 2019 14:08:48 +0000 (16:08 +0200)]
spinlocks: print lock profile info in panic()

Print the lock profile data when the system crashes and add some more
information for each lock data (lock address, cpu holding the lock).
While at it use the PRI_stime format specifier for printing time data.

This is especially beneficial for watchdog triggered crashes in case
of deadlocks.

In order to have the cpu holding the lock available let the
lock profile config option select DEBUG_LOCKS.

As printing the lock profile data will make use of locking, too, we
need to disable spinlock debugging before calling
spinlock_profile_printall() from panic().

While at it remove a superfluous #ifdef CONFIG_LOCK_PROFILE and rename
CONFIG_LOCK_PROFILE to CONFIG_DEBUG_LOCK_PROFILE.

Also move the .lockprofile.data section to init area in linker scripts
as the data is no longer needed after boot.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen: add new CONFIG_DEBUG_LOCKS option
Juergen Gross [Tue, 17 Sep 2019 14:08:03 +0000 (16:08 +0200)]
xen: add new CONFIG_DEBUG_LOCKS option

Instead of enabling lock debugging for debug builds only, add a dedicated
Kconfig option for that purpose which defaults to DEBUG.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agospinlocks: in debug builds store cpu holding the lock
Juergen Gross [Tue, 17 Sep 2019 14:07:11 +0000 (16:07 +0200)]
spinlocks: in debug builds store cpu holding the lock

Add the cpu currently holding the lock to struct lock_debug. This makes
analysis of locking errors easier and it can be tested whether the
correct cpu is releasing a lock again.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
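
A minimal sketch of the owner tracking described in the commit above, assuming a single `cpu` field and a sentinel for "not held". The helper names and the sentinel are hypothetical; the real struct lock_debug layout is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>

#define LOCK_DEBUG_NO_CPU (-1)   /* hypothetical sentinel: nobody holds the lock */

/* Simplified stand-in for struct lock_debug. */
struct lock_debug {
    int cpu;   /* CPU currently holding the lock, or LOCK_DEBUG_NO_CPU */
};

/* Record the new owner; the lock must have been free beforehand. */
static inline void lock_debug_got(struct lock_debug *dbg, int cpu)
{
    assert(dbg->cpu == LOCK_DEBUG_NO_CPU);
    dbg->cpu = cpu;
}

/* Report whether the releasing CPU is the recorded owner, then clear it. */
static inline bool lock_debug_rel(struct lock_debug *dbg, int cpu)
{
    bool owner_ok = (dbg->cpu == cpu);

    dbg->cpu = LOCK_DEBUG_NO_CPU;
    return owner_ok;
}
```

A debug build could turn a false result from `lock_debug_rel()` into a warning or a crash; the sketch leaves that policy to the caller.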
5 years agox86/PCI: read MSI-X table entry count early
Jan Beulich [Tue, 17 Sep 2019 14:06:15 +0000 (16:06 +0200)]
x86/PCI: read MSI-X table entry count early

Rather than doing this every time we set up interrupts for a device
anew (and then in two distinct places) fill this invariant field
right after allocating struct arch_msix.

While at it also obtain the MSI-X capability structure position just
once, in msix_capability_init(), rather than in each caller.

Furthermore take the opportunity and eliminate the multi_msix_capable()
alias of msix_table_size().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
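
For readers unfamiliar with the field being cached in the MSI-X commit above: the MSI-X Message Control register encodes the table size as N - 1 in its low eleven bits. The sketch below decodes it once and stores the result. `struct arch_msix` is reduced to one field and `msix_init_count()` is a hypothetical helper; reading the control word from config space is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Table Size field: bits 10:0 of the MSI-X Message Control register. */
#define PCI_MSIX_FLAGS_QSIZE 0x07ffu

/* The field is encoded as N - 1, so a device has between 1 and 2048 entries. */
static inline unsigned int msix_table_size(uint16_t control)
{
    return (control & PCI_MSIX_FLAGS_QSIZE) + 1u;
}

/* Reduced stand-in for struct arch_msix: only the invariant entry count. */
struct arch_msix {
    unsigned int nr_entries;
};

/* Fill the count once, right after allocation (hypothetical helper). */
static inline void msix_init_count(struct arch_msix *msix, uint16_t control)
{
    assert(msix != NULL);
    msix->nr_entries = msix_table_size(control);
}
```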
5 years agoAMD/IOMMU: let callers of amd_iommu_alloc_intremap_table() handle errors
Jan Beulich [Tue, 17 Sep 2019 14:05:34 +0000 (16:05 +0200)]
AMD/IOMMU: let callers of amd_iommu_alloc_intremap_table() handle errors

Additional users of the function will want to handle errors more
gracefully. Remove the BUG_ON()s and make the current caller panic()
instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoAMD/IOMMU: introduce a "valid" flag for IVRS mappings
Jan Beulich [Tue, 17 Sep 2019 14:05:01 +0000 (16:05 +0200)]
AMD/IOMMU: introduce a "valid" flag for IVRS mappings

For us to no longer blindly allocate interrupt remapping tables for
everything the ACPI tables name, we can't use struct ivrs_mappings'
intremap_table field anymore to also have the meaning of "this entry
is valid". Add a separate boolean field instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoAMD/IOMMU: don't free shared IRT multiple times
Jan Beulich [Tue, 17 Sep 2019 14:03:44 +0000 (16:03 +0200)]
AMD/IOMMU: don't free shared IRT multiple times

Calling amd_iommu_free_intremap_table() for every IVRS entry is correct
only in per-device-IRT mode. Use a NULL 2nd argument to indicate that
the shared table should be freed, and call the function exactly once in
shared mode.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agolivepatch: always print XENLOG_ERR information (ARM, ELF)
Pawel Wieczorkiewicz [Wed, 21 Aug 2019 10:04:30 +0000 (10:04 +0000)]
livepatch: always print XENLOG_ERR information (ARM, ELF)

This complements [1] commit for ARM and livepatch_elf files.

[1] 4470efeae4 livepatch: always print XENLOG_ERR information

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
5 years agomicrocode: pass a patch pointer to apply_microcode()
Chao Gao [Fri, 13 Sep 2019 10:31:34 +0000 (12:31 +0200)]
microcode: pass a patch pointer to apply_microcode()

apply_microcode()'s always loading the cached ucode patch forces
a patch to be stored before being loaded. Make apply_microcode()
accept a patch pointer to remove the limitation so that a patch
can be stored after a successful loading.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agomicrocode/amd: call svm_host_osvw_init() in common code
Chao Gao [Fri, 13 Sep 2019 10:31:01 +0000 (12:31 +0200)]
microcode/amd: call svm_host_osvw_init() in common code

Introduce a vendor hook, .end_update_percpu, for svm_host_osvw_init().
The hook function is called on each cpu after loading an update.
It is a preparation for splitting out apply_microcode() from
cpu_request_microcode().

Note that svm_host_osvw_init() should be called regardless of the
result of loading an update.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agomicrocode: remove pointless 'cpu' parameter
Chao Gao [Fri, 13 Sep 2019 10:30:12 +0000 (12:30 +0200)]
microcode: remove pointless 'cpu' parameter

Some callbacks in microcode_ops or related functions take a cpu
id parameter. But at current call sites, the cpu id parameter is
always equal to the current cpu id. Some of them even use an assertion
to guarantee this. Remove this redundant 'cpu' parameter.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agomicrocode: remove struct ucode_cpu_info
Chao Gao [Fri, 13 Sep 2019 10:28:44 +0000 (12:28 +0200)]
microcode: remove struct ucode_cpu_info

Remove the per-cpu cache field in struct ucode_cpu_info since it has
been replaced by a global cache. This leaves only one field in
ucode_cpu_info, so the struct is removed and the remaining field
(cpu signature) is stored in the per-cpu area.

The cpu status notifier is also removed. It was used to free the "mc"
field to avoid memory leak.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agomicrocode: clean up microcode_resume_cpu
Chao Gao [Fri, 13 Sep 2019 10:28:13 +0000 (12:28 +0200)]
microcode: clean up microcode_resume_cpu

Previously, a per-cpu ucode cache was maintained. Each CPU had one
per-cpu update cache and there might be multiple versions of microcode.
Thus microcode_resume_cpu tried its best to update microcode by loading
every update cache until a load succeeded.

But now the cache struct is simplified a lot and only a single ucode is
cached. A single invocation of ->apply_microcode() loads the cache
and updates the microcode.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agomicrocode: introduce a global cache of ucode patch
Chao Gao [Fri, 13 Sep 2019 10:27:42 +0000 (12:27 +0200)]
microcode: introduce a global cache of ucode patch

to replace the current per-cpu cache 'uci->mc'.

With the assumption that all CPUs in the system have the same signature
(family, model, stepping and 'pf'), a microcode update that matches
one cpu should match the others. Having differing microcode revisions
on cpus would make the system unstable and should be avoided. Hence, caching
one microcode update is good enough for all cases.

Introduce a global variable, microcode_cache, to store the newest
matching microcode update. Whenever we get a new valid microcode update,
its revision id is compared against that of the cached microcode update to
determine whether the "microcode_cache" needs to be replaced. And
this global cache is loaded to cpu in apply_microcode().

All operations on the cache are protected by 'microcode_mutex'.

Note that I deliberately avoid touching the old per-cpu cache ('uci->mc')
as I am going to remove it completely in the following patches. We copy
everything to create the new cache blob to avoid reusing some buffers
previously allocated for the old per-cpu cache. It is not so efficient,
but it is already corrected by a patch later in this series.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
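
The replacement rule described in the global-cache commit above can be sketched like this. It is deliberately reduced: the patch is just a revision number, the cache stores a pointer instead of copying a blob, and the caller is assumed to hold microcode_mutex and to have checked the CPU signature already. `microcode_update_cache()` is a name chosen for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced stand-in for a microcode update; real patches carry vendor data. */
struct microcode_patch {
    uint32_t rev;   /* revision id */
};

/* Single global cache holding the newest matching update. */
static const struct microcode_patch *microcode_cache;

/*
 * Replace the cache when the candidate is newer than what is cached.
 * Assumptions: the caller holds microcode_mutex and the candidate has
 * already been validated against the CPU signature.
 */
static inline bool microcode_update_cache(const struct microcode_patch *patch)
{
    assert(patch != NULL);

    if (microcode_cache == NULL || patch->rev > microcode_cache->rev) {
        microcode_cache = patch;
        return true;
    }
    return false;
}
```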
5 years agomicrocode/amd: distinguish old and mismatched ucode in microcode_fits()
Chao Gao [Fri, 13 Sep 2019 10:26:51 +0000 (12:26 +0200)]
microcode/amd: distinguish old and mismatched ucode in microcode_fits()

Sometimes, a ucode with a level lower than or equal to the current CPU's
patch level is useful. For example, to work around a broken BIOS which
only loads ucode for the BSP, when the BSP parses a ucode blob during bootup,
it is better to save a ucode with lower or equal level for the APs.

No functional change is made in this patch. But a following patch will
handle "old ucode" and "mismatched ucode" separately.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agomicrocode/intel: extend microcode_update_match()
Chao Gao [Fri, 13 Sep 2019 10:26:16 +0000 (12:26 +0200)]
microcode/intel: extend microcode_update_match()

to a more generic function, so that it can be used alone to check
an update against the CPU signature and current update revision.

Note that enum microcode_match_result will be used in common code
(aka microcode.c), so it has been placed in the common header. Also
constify the parameter of microcode_sanity_check() so that it
can be called by microcode_update_match().

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
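
The two microcode matching commits above distinguish an update that is merely old from one that does not fit the CPU at all. A hedged sketch of such a three-way check follows. The enumerator names, both structs and the signature rule (equal signature, overlapping or both-zero platform flags) are assumptions for illustration rather than a copy of the Xen code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Three-way result; the enumerator names are illustrative. */
enum microcode_match_result {
    OLD_UCODE,   /* signature matches, revision not newer than the CPU's */
    NEW_UCODE,   /* signature matches, revision is newer */
    MIS_UCODE    /* signature does not match this CPU */
};

/* Reduced stand-ins: signature, platform-flag mask and revision only. */
struct cpu_sig   { uint32_t sig; uint32_t pf; uint32_t rev; };
struct ucode_hdr { uint32_t sig; uint32_t pf; uint32_t rev; };

/* Check an update against the CPU signature and current revision. */
static inline enum microcode_match_result
microcode_update_match(const struct ucode_hdr *mc, const struct cpu_sig *cpu)
{
    assert(mc != NULL && cpu != NULL);

    /* Assumed rule: same signature and compatible platform flags. */
    if (mc->sig != cpu->sig ||
        !((mc->pf & cpu->pf) || (mc->pf == 0 && cpu->pf == 0)))
        return MIS_UCODE;

    return mc->rev > cpu->rev ? NEW_UCODE : OLD_UCODE;
}
```

Keeping OLD_UCODE separate from MIS_UCODE lets a caller still save an older but matching blob, which is the BSP/AP workaround the AMD commit describes.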
5 years agopublic/xen.h: update the comment explaining 'Wallclock time'
Paul Durrant [Fri, 13 Sep 2019 10:21:47 +0000 (12:21 +0200)]
public/xen.h: update the comment explaining 'Wallclock time'

Since commit 0629adfd80e "Actually set a HVM domain's time offset when it
sets the RTC", the comment in the public header has been misleading, since
it claims that wallclock time is only updated by control software.
Moreover, the comments stating that wc_sec and wc_nsec are seconds and
nanoseconds (respectively) in UTC since the Unix epoch are bogus. Their
values are adjusted by the domain's time_offset_seconds value, which is
updated by a guest write to the emulated RTC and hence the wallclock
timezone is under guest control.

This patch attempts to bring the comment in line with reality whilst
keeping it reasonably short.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoUpdate my MAINTAINERS entries
Paul Durrant [Thu, 12 Sep 2019 14:18:47 +0000 (15:18 +0100)]
Update my MAINTAINERS entries

My Citrix email address will expire shortly.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agodebugtrace: fix Arm build
Juergen Gross [Fri, 13 Sep 2019 06:15:05 +0000 (08:15 +0200)]
debugtrace: fix Arm build

Add missing #includes.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/arm: setup: Relocate the Device-Tree later on in the boot
Julien Grall [Wed, 11 Sep 2019 15:31:34 +0000 (16:31 +0100)]
xen/arm: setup: Relocate the Device-Tree later on in the boot

At the moment, the Device-Tree is relocated into xenheap while setting
up the memory subsystem. This is actually not necessary because the
early mapping is still present and we don't require the virtual address
to be stable until unflattening the Device-Tree.

So the relocation can safely be moved after the memory subsystem is
fully set up. This has the nice advantage of making the relocation common
and letting the xenheap allocator decide where to put it.

Lastly, the device-tree is not going to be used on ACPI systems. So
there is no need to relocate it and it can just be discarded.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm: bootfd: Fix indentation in process_multiboot_node()
Julien Grall [Wed, 11 Sep 2019 15:19:42 +0000 (16:19 +0100)]
xen/arm: bootfd: Fix indentation in process_multiboot_node()

One line in process_multiboot_node() is using a hard tab rather than soft
tabs. So fix it!

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoscripts/add_maintainers.pl: Add logic to use V entry
Lars Kurth [Fri, 30 Aug 2019 19:35:13 +0000 (20:35 +0100)]
scripts/add_maintainers.pl: Add logic to use V entry

Add logic to use V section entry in THE REST for identifying xen trees

Specifically:
* Move check until after the MAINTAINERS file has been read
* Add get_xen_maintainers_file_version() for check
* Remove top_of_tree as not needed any more
* Fail with extended error message when used out of tree

Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agoMAINTAINERS: Add V section entry to allow identification of Xen file
Lars Kurth [Fri, 30 Aug 2019 17:42:56 +0000 (18:42 +0100)]
MAINTAINERS: Add V section entry to allow identification of Xen file

This change provides sufficient information to allow get_maintainer.pl /
add_maintainers.pl scripts to be run on xen sister repositories such as
mini-os.git, osstest.git, etc

A suggested template for sister repositories of Xen is

========================================================
This file follows the same conventions as outlined in
xen.git:MAINTAINERS. Please refer to the file in xen.git
for more information.

THE REST
M:      MAINTAINER1 <maintainer1@email.com>
M:      MAINTAINER2 <maintainer2@email.com>
L:      xen-devel@lists.xenproject.org
S:      Supported
F:      *
F:      */
V:      xen-maintainers-1
========================================================

Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agoscripts/add_maintainers.pl: Remove hardcoding
Lars Kurth [Fri, 30 Aug 2019 17:18:16 +0000 (18:18 +0100)]
scripts/add_maintainers.pl: Remove hardcoding

Instead of using a hardcoded location, inherit the
location from $0

Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agodebugtrace: add entry when entry count is wrapping
Juergen Gross [Thu, 12 Sep 2019 13:13:47 +0000 (15:13 +0200)]
debugtrace: add entry when entry count is wrapping

The debugtrace entry count is a 32 bit variable, so it can wrap when
lots of trace entries are being produced. Making it wider would result
in a waste of buffer space as the printed count value would consume
more bytes when not wrapping.

So instead of letting the count value grow to huge values let it wrap
and add a wrap counter printed in this situation. This will keep the
needed buffer space at today's value while still allowing all entries to
be sorted when multiple trace buffers are involved.

Note that the wrap message will be printed before the first trace
entry in case output is switched to console early. This is on purpose
in order to enable a future support of debugtrace to console without
any allocated buffer.
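
As an illustration only (a Python model, not the Xen code; the names
next_entry, sort_key and COUNT_MASK are hypothetical), the wrap scheme
could be sketched as a 32-bit count paired with a wrap counter, which
together still give a total order across buffers:

```python
COUNT_MASK = 0xFFFFFFFF  # the entry count is a 32-bit value


def next_entry(count, wraps):
    """Advance the (count, wraps) pair by one trace entry.

    Returns the new count, the new wrap counter, and a flag telling
    whether a wrap marker should be emitted for this entry.
    """
    count = (count + 1) & COUNT_MASK
    wrapped = count == 0
    if wrapped:
        wraps += 1
    return count, wraps, wrapped


def sort_key(wraps, count):
    # Entries from several buffers order by wrap counter first, then count.
    return (wraps, count)
```

The printed count never needs more digits than a 32-bit value, which is
the buffer-space argument made above.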

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agodebugtrace: add per-cpu buffer option
Juergen Gross [Thu, 12 Sep 2019 13:12:21 +0000 (15:12 +0200)]
debugtrace: add per-cpu buffer option

debugtrace is normally writing trace entries into a single trace
buffer. There are cases where this is not optimal, e.g. when hunting
a bug which requires writing lots of trace entries and one cpu is
stuck. This will result in other cpus filling the trace buffer and
finally overwriting the interesting trace entries of the hanging cpu.

In order to be able to debug such situations add the capability to use
per-cpu trace buffers. This can be selected by specifying the
debugtrace boot parameter with the modifier "cpu:", like:

  debugtrace=cpu:16

At the same time switch the parsing function to accept size modifiers
(e.g. 4M or 1G).

Printing out the trace entries is done for each buffer in order to
minimize the effort needed during printing. As each entry is prefixed
with its sequence number sorting the entries can easily be done when
analyzing them.
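
A rough Python sketch of how such a parameter value might be parsed,
assuming a bare number is counted in KiB (an assumption of this sketch,
as are the names parse_debugtrace and SUFFIXES; the hypervisor's parser
is written in C and may differ):

```python
SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_debugtrace(value, default_unit=1 << 10):
    """Return (per_cpu, size_in_bytes) for a debugtrace= value."""
    per_cpu = value.startswith("cpu:")
    if per_cpu:
        value = value[len("cpu:"):]      # strip the modifier
    unit = default_unit                  # assumed default: KiB
    if value and value[-1].upper() in SUFFIXES:
        unit = SUFFIXES[value[-1].upper()]
        value = value[:-1]               # strip the size suffix
    return per_cpu, int(value) * unit
```

Under those assumptions "cpu:16" selects per-cpu buffers of 16 KiB each,
and "4M" a single 4 MiB buffer.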

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agosysctl: report shadow paging capability
Roger Pau Monne [Tue, 10 Sep 2019 15:25:38 +0000 (17:25 +0200)]
sysctl: report shadow paging capability

Report whether shadow paging is supported by the hypervisor, since it
can be disabled at build time.

Reuse and tweak LIBXL_HAVE_PHYSINFO_CAP_HAP as it hasn't appeared in a
released version of Xen yet.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/msr: Fix 'plaform' typo
Andrew Cooper [Thu, 12 Sep 2019 09:57:37 +0000 (10:57 +0100)]
x86/msr: Fix 'plaform' typo

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agosysctl/libxl: choose a sane default for HAP
Roger Pau Monné [Wed, 11 Sep 2019 12:55:20 +0000 (14:55 +0200)]
sysctl/libxl: choose a sane default for HAP

Current libxl code will always enable Hardware Assisted Paging (HAP),
expecting that the hypervisor will fall back to shadow if HAP is not
available. With the changes to DOMCTL_createdomain that's not the case
any longer, and the hypervisor will raise an error if HAP is not
available instead of silently falling back to shadow.

In order to keep the previous functionality report whether HAP is
available or not in XEN_SYSCTL_physinfo, so that the toolstack can
select a sane default if there's no explicit user selection of whether
HAP should be used.

Note that on ARM hardware HAP capability is always reported since it's
a required feature in order to run Xen.
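
The selection logic described above might look roughly like this on the
toolstack side. This is a Python sketch with hypothetical names
(default_hap, user_choice, cap_hap), not the libxl code:

```python
def default_hap(user_choice, cap_hap):
    """Decide whether to request HAP for a new domain.

    user_choice: True or False when set explicitly, None when unset.
    cap_hap: whether physinfo reports HAP as available.
    """
    if user_choice is not None:
        # An explicit selection is passed through unchanged; the
        # hypervisor rejects it if the feature is unavailable.
        return user_choice
    return cap_hap  # no selection: use HAP only when it is reported
```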

Fixes: d0c0ba7d3de ('x86/hvm/domain: remove the 'hap_enabled' flag')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agox86/shadow: fold p2m page accounting into sh_min_allocation()
Jan Beulich [Wed, 11 Sep 2019 12:54:34 +0000 (14:54 +0200)]
x86/shadow: fold p2m page accounting into sh_min_allocation()

This is to make the function live up to the promise its name makes. And
it simplifies all callers.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
5 years agotools/ocaml: abi check: #include on x86 only. Spotted by Gitlab CI
Ian Jackson [Tue, 10 Sep 2019 15:16:51 +0000 (16:16 +0100)]
tools/ocaml: abi check: #include on x86 only.  Spotted by Gitlab CI

Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: fix test harness and fuzzer build dependencies
Jan Beulich [Tue, 10 Sep 2019 14:35:09 +0000 (16:35 +0200)]
x86emul: fix test harness and fuzzer build dependencies

Commit fd35f32b4b ("tools/x86emul: Use struct cpuid_policy in the
userspace test harnesses") didn't account for the dependencies of
cpuid-autogen.h to potentially change between incremental builds. In
particular the harness has a "run" goal which is supposed to be usable
independently of the rest of the tools sub-tree building, and both the
harness and the fuzzer code are also supposed to be buildable
independently. Therefore a re-build of the generated header needs to be
triggered first, which is achieved by introducing a new top-level target
pattern (for just the "run" part for now).

Further cpuid.o did not have any dependencies added for it.

Finally, while at it, add a "run" target to the cpu-policy test harness.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agox86/IRQ: make 'i' debug output more tabular again
Jan Beulich [Tue, 10 Sep 2019 14:34:21 +0000 (16:34 +0200)]
x86/IRQ: make 'i' debug output more tabular again

Since the affinity values are no longer of uniform width, move them
further to the right such that as much of the output as possible comes
out aligned with one another.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup
Roger Pau Monné [Tue, 10 Sep 2019 14:32:47 +0000 (16:32 +0200)]
ioreq: fix hvm_all_ioreq_servers_add_vcpu fail path cleanup

The loop in FOR_EACH_IOREQ_SERVER is backwards hence the cleanup on
failure needs to be done forwards.
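
To illustrate the direction issue, here is a small Python model. It
assumes the server table is walked from the highest id down to 0; the
names add_vcpu_to_all and MAX_SERVERS are hypothetical:

```python
MAX_SERVERS = 8  # hypothetical table size


def add_vcpu_to_all(servers, add, remove):
    """servers: list of MAX_SERVERS slots, None for an empty slot."""
    for sid in reversed(range(MAX_SERVERS)):        # main loop runs backwards
        server = servers[sid]
        if server is None:
            continue
        if not add(server):
            # Only slots above sid have been visited, so undo those,
            # walking forwards from sid + 1.
            for done in range(sid + 1, MAX_SERVERS):
                if servers[done] is not None:
                    remove(servers[done])
            return False
    return True
```

Walking the cleanup backwards from the failing id would instead touch
servers the vCPU was never added to.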

Fixes: 97a5a3e30161 ('x86/hvm/ioreq: maintain an array of ioreq servers rather than a list')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
5 years agotools/ocaml: Fix build error with CentOS 7
Andrew Cooper [Tue, 10 Sep 2019 14:04:55 +0000 (15:04 +0100)]
tools/ocaml: Fix build error with CentOS 7

gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28) complains:

  xenctrl_stubs.c: In function 'stub_xc_domain_create':
  xenctrl_stubs.c:216:28: error: 'val' may be used uninitialized
                          in this function [-Werror=maybe-uninitialized]
     cfg.arch.emulation_flags = ocaml_list_to_c_bitmap
                              ^
  xenctrl_stubs.c:198:12: error: 'val' may be used uninitialized
                          in this function [-Werror=maybe-uninitialized]
    cfg.flags = ocaml_list_to_c_bitmap
              ^
  cc1: all warnings being treated as errors

GCC doesn't point at the correct piece of code, but the diagnostic text is
correct, and can occur when the list is empty. Initialise val to 0.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agotools/ocaml: abi: Use formal conversion and check in more places
Andrew Cooper [Tue, 10 Sep 2019 11:17:30 +0000 (12:17 +0100)]
tools/ocaml: abi: Use formal conversion and check in more places

Now we have a caller for ocaml_list_to_c_bitmap.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agotools/ocaml: tools/ocaml: Add missing CDF_* values
Ian Jackson [Tue, 10 Sep 2019 11:34:03 +0000 (12:34 +0100)]
tools/ocaml: tools/ocaml: Add missing CDF_* values

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agotools/ocaml: abi-check: Check properly.
Ian Jackson [Tue, 10 Sep 2019 11:27:45 +0000 (12:27 +0100)]
tools/ocaml: abi-check: Check properly.

Fix a broken regexp which would mention `$/' when it ought to have
mentioned `$'.  The result would be that it would match lines like
    type some_ocaml_type = Thing | Other_Thing
but ignore everything but the type name, giving wrong answers.

Check that we check mentioned types.  Otherwise if we fail to spot
some suitable thing in the ocaml, we would just omit checking this
type !

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agotools/ocaml: Reformat domain_create_flag
Andrew Cooper [Tue, 10 Sep 2019 11:14:51 +0000 (12:14 +0100)]
tools/ocaml: Reformat domain_create_flag

This will allow us to apply the abi checker soon.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agotools/ocaml: abi-check: Cope with multiple conversions of same type
Ian Jackson [Tue, 10 Sep 2019 11:25:26 +0000 (12:25 +0100)]
tools/ocaml: abi-check: Cope with multiple conversions of same type

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agotools/ocaml: abi-check: Improve output and error messages
Ian Jackson [Tue, 10 Sep 2019 11:34:38 +0000 (12:34 +0100)]
tools/ocaml: abi-check: Improve output and error messages

In the generated C, add some comments saying where we found the ocaml
type.  This helps with debugging.  (I considered emitting #line
directives but decided this would be more confusing than helpful.)

Improve two dies.

Use better-named filehandles (perl prints their names when it dies).

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agotools/ocaml: abi handling: Provide ocaml->C conversion/check
Andrew Cooper [Tue, 10 Sep 2019 11:18:45 +0000 (12:18 +0100)]
tools/ocaml: abi handling: Provide ocaml->C conversion/check

No users of this yet so no overall change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agotools/ocaml: abi-check: Add comments
Ian Jackson [Tue, 10 Sep 2019 11:12:44 +0000 (12:12 +0100)]
tools/ocaml: abi-check: Add comments

Provide interface documentation for this script.

Explain why we check .ml not .mli.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
5 years agoxen/domctl: Drop guest suffix from XEN_DOMCTL_CDF_hvm
Andrew Cooper [Tue, 10 Sep 2019 10:41:33 +0000 (11:41 +0100)]
xen/domctl: Drop guest suffix from XEN_DOMCTL_CDF_hvm

The suffix is redundant, and dropping it helps to simplify the Ocaml/C
ABI checking.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agotools/ocaml: Introduce xenctrl ABI build-time checks
Ian Jackson [Mon, 9 Sep 2019 17:12:06 +0000 (18:12 +0100)]
tools/ocaml: Introduce xenctrl ABI build-time checks

c/s f089fddd941 broke the Ocaml ABI by renumbering
XEN_SYSCTL_PHYSCAP_directio without adjusting the Ocaml
physinfo_cap_flag enumeration.

Add build machinery which will check the ABI correspondence.

This will result in a compile time failure whenever constants get
renumbered/added without a compatible adjustment to the Ocaml ABI.
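
The real check runs at build time. As a runtime model of the same idea
in Python, assume that an OCaml constructor's position in its variant
must equal the bit number of the matching C constant; the bit numbers
and the helper name check_abi below are illustrative assumptions:

```python
def check_abi(constructors, c_bits, ocaml_prefix, c_prefix):
    """Return mismatches between an OCaml variant and C bit numbers.

    constructors: OCaml constructor names in declaration order.
    c_bits: mapping of C constant name to bit position.
    """
    errors = []
    for pos, ctor in enumerate(constructors):
        c_name = c_prefix + ctor[len(ocaml_prefix):].lower()
        if c_bits.get(c_name) != pos:
            errors.append((ctor, pos, c_name, c_bits.get(c_name)))
    if len(constructors) != len(c_bits):
        errors.append(("count", len(constructors), "count", len(c_bits)))
    return errors


# Assumed bit numbers, for illustration only.
C_BITS = {
    "XEN_SYSCTL_PHYSCAP_hvm": 0,
    "XEN_SYSCTL_PHYSCAP_pv": 1,
    "XEN_SYSCTL_PHYSCAP_directio": 2,
}
```

With this table a variant of CAP_HVM | CAP_DirectIO is reported as
mismatched, while CAP_HVM | CAP_PV | CAP_DirectIO passes.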

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
5 years agotools/ocaml: Add missing CAP_PV
Andrew Cooper [Mon, 9 Sep 2019 17:12:05 +0000 (18:12 +0100)]
tools/ocaml: Add missing CAP_PV

c/s f089fddd941 broke the Ocaml ABI by renumbering XEN_SYSCTL_PHYSCAP_directio
without adjusting the Ocaml physinfo_cap_flag enumeration.  Fix this by
inserting CAP_PV between CAP_HVM and CAP_DirectIO.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agotools/ocaml: Add missing X86_EMU_VPCI
Ian Jackson [Mon, 9 Sep 2019 17:12:04 +0000 (18:12 +0100)]
tools/ocaml: Add missing X86_EMU_VPCI

This was missing from x86_arch_emulation_flags.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
5 years agox86/boot: Improve code generation from bootsym()
Andrew Cooper [Mon, 9 Sep 2019 10:35:03 +0000 (11:35 +0100)]
x86/boot: Improve code generation from bootsym()

The code generation for bootsym() is atrocious, and unnecessarily complicated.
Given the appropriate physical address, all we need is to construct a virtual
address of the appropriate type.

  add/remove: 0/0 grow/shrink: 0/9 up/down: 0/-4256 (-4256)
  Function                                     old     new   delta
  kexec_reserve_area.constprop                 165     159      -6
  reset_videomode_after_s3                     231      70    -161
  identify_cpu                                1341    1176    -165
  parse_acpi_sleep                             408     240    -168
  early_init_intel                             632     440    -192
  __cpu_up                                    1983    1682    -301
  do_platform_op                              6469    5526    -943
  compat_platform_op                          6433    5482    -951
  __start_xen                                12939   11570   -1369
  Total: Before=3341298, After=3337042, chg -0.13%

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/cpuid: Fix build with CentOS 6 following c/s 7479151106
Andrew Cooper [Mon, 9 Sep 2019 15:53:28 +0000 (16:53 +0100)]
x86/cpuid: Fix build with CentOS 6 following c/s 7479151106

GCC of a CentOS 6 vintage complains:

  cpuid.c: In function 'parse_xen_cpuid':
  cpuid.c:32: error: 'mid' may be used uninitialized in this function

This can't occur in practice because the while() loop is guaranteed to be
entered, but initialise mid to work around the issues.

Spotted by Gitlab CI.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/cpuid: Fix handling of the CPUID.7[0].eax levelling MSR
Andrew Cooper [Fri, 6 Sep 2019 15:59:02 +0000 (16:59 +0100)]
x86/cpuid: Fix handling of the CPUID.7[0].eax levelling MSR

7a0 is an integer field, not a mask - taking the logical and of the hardware
and policy values results in nonsense.  Instead, take the policy value
directly.
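
A quick Python illustration of why masking an integer field gives
nonsense (the function names are hypothetical and the values are chosen
for the example):

```python
def levelled_7a0_masked(hw, policy):
    # Treats the field as a feature mask: wrong for an integer.
    return hw & policy


def levelled_7a0(hw, policy):
    # Takes the policy value directly.
    return policy


# Hardware offers max subleaf 2 (0b10), the policy wants 1 (0b01).
hw_max, policy_max = 2, 1
masked = levelled_7a0_masked(hw_max, policy_max)   # 0b10 & 0b01
direct = levelled_7a0(hw_max, policy_max)
```

Here the masked result is neither the hardware nor the policy value.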

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@cirtrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen: refactor debugtrace data
Juergen Gross [Mon, 9 Sep 2019 12:37:25 +0000 (14:37 +0200)]
xen: refactor debugtrace data

As a preparation for per-cpu buffers do a little refactoring of the
debugtrace data: put the needed buffer admin data into the buffer as
it will be needed for each buffer. In order not to limit buffer size
switch the related fields from unsigned int to unsigned long, as on
huge machines with RAM in the TB range it might be interesting to
support buffers >4GB.

While at it switch debugtrace_send_to_console and debugtrace_used to
bool and delete an empty line.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen: move debugtrace coding to common/debugtrace.c
Juergen Gross [Mon, 9 Sep 2019 12:36:10 +0000 (14:36 +0200)]
xen: move debugtrace coding to common/debugtrace.c

Instead of living in drivers/char/console.c move the debugtrace
related coding to a new file common/debugtrace.c

No functional change, code movement only.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen: fix debugtrace clearing
Juergen Gross [Mon, 9 Sep 2019 12:34:37 +0000 (14:34 +0200)]
xen: fix debugtrace clearing

After dumping the debugtrace buffer it is cleared. This results in some
entries not being printed in case the buffer is dumped again before
having wrapped.

While at it remove the trailing zero byte in the buffer as it is no
longer needed. Commit b5e6e1ee8da59f introduced passing the number of
chars to be printed in the related interfaces, so the trailing 0 byte
is no longer required.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agosysctl: report existing physcaps on Arm
Roger Pau Monne [Fri, 6 Sep 2019 14:30:20 +0000 (16:30 +0200)]
sysctl: report existing physcaps on Arm

Current physcaps in XEN_SYSCTL_physinfo are only used by x86, albeit
the capabilities themselves are not x86 specific.

This patch adds support for also reporting the current capabilities on
Arm hardware. Note that on Arm PHYSCAP_hvm is always reported, and
setting PHYSCAP_directio has been moved to common code since the same
logic to set it is used by x86 and Arm.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agoxen/arm32: head: Don't setup the fixmap on secondary CPUs
Julien Grall [Mon, 22 Jul 2019 13:24:43 +0000 (14:24 +0100)]
xen/arm32: head: Don't setup the fixmap on secondary CPUs

setup_fixmap() will setup the fixmap in the boot page tables in order to
use earlyprintk and also update the register r11 holding the address to
the UART.

However, secondary CPUs are not using earlyprintk between turning the
MMU on and switching to the runtime page table. So setting up the
fixmap in the boot pages table is pointless.

This means most of setup_fixmap() is not necessary for the secondary
CPUs. The update of UART address is now moved out of setup_fixmap() and
duplicated in the CPU boot and secondary CPUs boot. Additionally, the
call to setup_fixmap() is removed from secondary CPUs boot.

Lastly, take the opportunity to replace load from literal pool with the
new macro mov_w.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Move assembly switch to the runtime PT in secondary CPUs path
Julien Grall [Sat, 20 Apr 2019 17:18:01 +0000 (18:18 +0100)]
xen/arm32: head: Move assembly switch to the runtime PT in secondary CPUs path

The assembly switch to the runtime PT is only necessary for the
secondary CPUs. So move the code in the secondary CPUs path.

This is definitely not compliant with the Arm Arm, as we are switching
between two different sets of page-tables without turning off the MMU.
However, turning off the MMU is impossible here as the ID map may clash
with other mappings in the runtime page-tables. Avoiding the problem will
require more rework, so for now add a TODO in the code.

Finally, the code currently assumes that r5 will be properly set to 0
beforehand. This is done by create_page_tables(), which is called quite
early in the boot process. There is a risk this may be overlooked in the
future, breaking secondary CPUs boot. Instead, set r5 to 0 just before
using it.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Document enable_mmu()
Julien Grall [Sat, 20 Apr 2019 12:33:31 +0000 (13:33 +0100)]
xen/arm32: head: Document enable_mmu()

Document the behavior and the main registers usage within enable_mmu().

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Document create_pages_tables()
Julien Grall [Sun, 21 Jul 2019 18:35:19 +0000 (19:35 +0100)]
xen/arm32: head: Document create_pages_tables()

Document the behavior and the main registers usage within the function.
Note that r6 is now only used within the function, so it does not need
to be part of the common register.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Rework and document zero_bss()
Julien Grall [Wed, 26 Jun 2019 20:23:50 +0000 (21:23 +0100)]
xen/arm32: head: Rework and document zero_bss()

On secondary CPUs, zero_bss() will be a NOP because BSS only needs to be
zeroed once at boot. So the call in the secondary CPUs path can be
removed.

Lastly, document the behavior and the main registers usage within the
function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Rework and document check_cpu_mode()
Julien Grall [Tue, 16 Apr 2019 13:53:19 +0000 (14:53 +0100)]
xen/arm32: head: Rework and document check_cpu_mode()

A branch in the success case can be avoided by inverting the branch
condition. At the same time, remove a pointless comment as Xen can only
run at Hypervisor Mode.

Lastly, document the behavior and the main registers usage within the
function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Introduce distinct paths for the boot CPU and secondary CPUs
Julien Grall [Wed, 26 Jun 2019 12:46:56 +0000 (13:46 +0100)]
xen/arm32: head: Introduce distinct paths for the boot CPU and secondary CPUs

The boot code is currently quite difficult to go through because of the
lack of documentation and a number of indirections to avoid executing
some paths in either the boot CPU or secondary CPUs.

In an attempt to make the boot code easier to follow, each part of the
boot is now in a separate function. Furthermore, the paths for the boot
CPU and secondary CPUs are now distinct and for now will call each
function.

Follow-ups will remove unnecessary calls and do further improvement
(such as adding documentation and reshuffling).

Note that the switch from using the ID mapping to the runtime mapping
is duplicated for each path. This is because in the future we will need
to stay longer in the ID mapping for the boot CPU.

Lastly, it is now required to save lr in cpu_init() because the
function will call other functions and therefore clobber lr.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Introduce print_reg
Julien Grall [Mon, 15 Apr 2019 22:11:42 +0000 (23:11 +0100)]
xen/arm32: head: Introduce print_reg

At the moment, the user should save r14/lr if it cares about it.

Follow-up patches will introduce more uses of putn in places where lr
should be preserved.

Furthermore, any user of putn should also move the value to register r0
if it was stored in a different register.

For convenience, a new macro is introduced to print a given register.
The macro takes care of moving the value to r0 and of preserving lr.

Lastly the new macro is used to replace all the call sites of putn. This
will simplify rework/review later on.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Rework UART initialization on boot CPU
Julien Grall [Mon, 15 Apr 2019 21:16:25 +0000 (22:16 +0100)]
xen/arm32: head: Rework UART initialization on boot CPU

Anything executed after the label common_start can be executed on all
CPUs. However most of the instructions executed between the label
common_start and init_uart are not executed on the boot CPU.

The only instructions executed are to lookup the CPUID so it can be
printed on the console (if earlyprintk is enabled). Printing the CPUID
is not entirely useful to have for the boot CPU and requires a
conditional branch to bypass unused instructions.

Furthermore, the function init_uart is only called for boot CPU
requiring another conditional branch. This makes the code a bit tricky
to follow.

The UART initialization is now moved before the label common_start. This
now requires a slightly altered print for the boot CPU and setting
the early UART base address in each of the two paths (boot CPU and
secondary CPUs).

This has the nice effect of removing a couple of conditional branches in
the code.

After this rework, the CPUID is only used at the very beginning of the
secondary CPUs boot path. So there is no need to "reserve" x24 for the
CPUID.

Lastly, take the opportunity to replace load from literal pool with the
new macro mov_w.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Don't clobber r14/lr in the macro PRINT
Julien Grall [Mon, 15 Apr 2019 14:57:38 +0000 (15:57 +0100)]
xen/arm32: head: Don't clobber r14/lr in the macro PRINT

The current implementation of the macro PRINT will clobber r14/lr. This
means the user should save r14 if it cares about it.

Follow-up patches will introduce more uses of PRINT in places where lr
should be preserved. Rather than requiring all users to preserve lr,
the macro PRINT is modified to save and restore it.

While the comment states r3 will be clobbered, this is not the case. So
PRINT will use r3 to preserve lr.

Lastly, take the opportunity to move the comment on top of PRINT and use
PRINT in init_uart. Both changes will be helpful in a follow-up patch.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Mark the end of subroutines with ENDPROC
Julien Grall [Wed, 26 Jun 2019 11:29:54 +0000 (12:29 +0100)]
xen/arm32: head: Mark the end of subroutines with ENDPROC

putn() and puts() are two subroutines. Add ENDPROC for the benefit of
static analysis tools and the reader.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm32: head: Add a macro to move an immediate constant into a 32-bit register
Julien Grall [Mon, 15 Apr 2019 20:58:51 +0000 (21:58 +0100)]
xen/arm32: head: Add a macro to move an immediate constant into a 32-bit register

The current boot code is using the pattern ldr rX, =... to move an
immediate constant into a 32-bit register.

This pattern implies loading the immediate constant from a literal pool,
meaning a memory access will be performed.

The memory access can be avoided by using movw/movt instructions.

A new macro is introduced to move an immediate constant into a 32-bit
register without a memory load. Follow-up patches will make use of it.
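
For illustration, the split such a movw/movt pair relies on can be
modelled in a few lines of Python (the helper names are hypothetical;
the macro itself is assembly):

```python
def split_imm32(value):
    """Split a 32-bit immediate into the halves used by movw and movt."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit value")
    return value & 0xFFFF, value >> 16      # (low half, high half)


def join_imm32(low, high):
    # movw writes the low half and zeroes the top half;
    # movt then fills in the top half without touching the low one.
    return (high << 16) | low
```

Both halves are encoded in the instructions themselves, so no literal
pool entry and no memory load are needed.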

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm64: head: Fix typo in the documentation on top of init_uart()
Julien Grall [Wed, 31 Jul 2019 19:26:19 +0000 (20:26 +0100)]
xen/arm64: head: Fix typo in the documentation on top of init_uart()

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm64: head: Introduce a macro to get a PC-relative address of a symbol
Julien Grall [Mon, 17 Jun 2019 13:51:21 +0000 (14:51 +0100)]
xen/arm64: head: Introduce a macro to get a PC-relative address of a symbol

Arm64 provides instructions to load a PC-relative address, but with some
limitations:
   - adr is able to cope with +/-1MB
   - adrp is able to cope with +/-4GB but relative to a 4KB page
     address

Because of that, the code requires 2 instructions to load any Xen
symbol. To make the code more obvious, a new macro adr_l is
introduced.

The new macro is used to replace a couple of open-coded use in
efi_xen_start.

The macro is copied from Linux 5.2-rc4.
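
As a sketch of the address arithmetic behind such a two-instruction
sequence (a Python model with hypothetical helper names, assuming the
usual adrp followed by an add of the low 12 bits):

```python
PAGE_MASK = 0xFFF  # 4KB pages


def adr_reach(pc, target):
    """adr alone: the byte offset must fit in +/-1MB."""
    return -(1 << 20) <= target - pc < (1 << 20)


def adr_l(pc, target):
    """Model adrp + add: page-relative base plus the low 12 bits."""
    page_delta = (target & ~PAGE_MASK) - (pc & ~PAGE_MASK)
    if not -(1 << 32) <= page_delta < (1 << 32):
        raise ValueError("target out of adrp range")
    base = (pc & ~PAGE_MASK) + page_delta    # what adrp produces
    return base + (target & PAGE_MASK)       # the add supplies the page offset
```

The first instruction yields the 4KB page of the symbol and the second
adds the offset within that page.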

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm64: head: Setup TTBR_EL2 in enable_mmu() and add missing isb
Julien Grall [Sat, 13 Apr 2019 21:55:18 +0000 (22:55 +0100)]
xen/arm64: head: Setup TTBR_EL2 in enable_mmu() and add missing isb

At the moment, TTBR_EL2 is set up in create_page_tables(). This is fine
as it is called by every CPU.

However, such an assumption may not hold in the future. To make changes
easier, the TTBR_EL2 is now set up in enable_mmu().

Take the opportunity to add the missing isb() to ensure the TTBR_EL2 is
seen before the MMU is turned on.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm64: head: Rework and document launch()
Julien Grall [Mon, 15 Apr 2019 11:24:30 +0000 (12:24 +0100)]
xen/arm64: head: Rework and document launch()

Boot CPU and secondary CPUs will use different entry points to C code. At
the moment, the decision on which entry to use is taken within launch().

In order to avoid a branch for the decision and make the code clearer,
launch() is reworked to take in parameters the entry point and its
arguments.

Lastly, document the behavior and the main registers usage within the
function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm: lpae: Allow more LPAE helpers to be used in assembly
Julien Grall [Tue, 6 Aug 2019 17:14:08 +0000 (18:14 +0100)]
xen/arm: lpae: Allow more LPAE helpers to be used in assembly

A follow-up patch will need to use *_table_offset() and *_MASK helpers
from assembly. This can be achieved by using _AT() macro to remove the type
when called from assembly.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agox86/cpuid: Extend the cpuid= option to support all named features
Andrew Cooper [Mon, 26 Nov 2018 17:06:23 +0000 (17:06 +0000)]
x86/cpuid: Extend the cpuid= option to support all named features

For gen-cpuid.py, fix a comment describing self.names, and generate the
reverse mapping in self.values.  Write out INIT_FEATURE_NAMES which maps a
string name to a bit position.

For parse_cpuid(), use cmdline_strcmp() and perform a binary search over
INIT_FEATURE_NAMES.  A tweak to cmdline_strcmp() is needed to break at equals
signs as well.
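
The lookup could be pictured as follows. This is a Python sketch with a
made-up, name-sorted table; the names and bit positions are placeholders
and not the generated INIT_FEATURE_NAMES:

```python
import bisect

# Hypothetical excerpt, sorted by name; bit positions are placeholders.
FEATURE_NAMES = [("avx", 10), ("pcid", 11), ("smep", 12)]


def lookup_feature(arg):
    """Map an option such as 'pcid=0' to a bit position, or None."""
    name = arg.split("=", 1)[0]          # compare only up to an equals sign
    names = [n for n, _ in FEATURE_NAMES]
    i = bisect.bisect_left(names, name)  # binary search over sorted names
    if i < len(names) and names[i] == name:
        return FEATURE_NAMES[i][1]
    return None
```

The binary search only works because the table is emitted in sorted
order.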

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/apic: do not initialize LDR and DFR for bigsmp
Bandan Das [Fri, 6 Sep 2019 15:07:55 +0000 (17:07 +0200)]
x86/apic: do not initialize LDR and DFR for bigsmp

Legacy APIC init uses bigsmp for SMP systems with 8 or more CPUs. The
bigsmp APIC implementation uses physical destination mode, but it
nevertheless initializes LDR and DFR. The LDR even ends up incorrectly with
multiple bits being set.

This does not cause a functional problem because LDR and DFR are ignored
when physical destination mode is active, but it triggered a problem on a
32-bit KVM guest which jumps into a kdump kernel.

The multiple bits set unearthed a bug in the KVM APIC implementation. The
code which creates the logical destination map for VCPUs ignores the
disabled state of the APIC and ends up overwriting an existing valid entry
and as a result, APIC calibration hangs in the guest during kdump
initialization.

Remove the bogus LDR/DFR initialization.

This is not intended to work around the KVM APIC bug. The LDR/DFR
initialization is wrong on its own.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Bandan Das <bsd@redhat.com>
[Linux commit bae3a8d3308ee69a7dbdf145911b18dfda8ade0d]

Drop init_apic_ldr_x2apic_phys() at the same time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/apic: include the LDR when clearing out APIC registers
Bandan Das [Fri, 6 Sep 2019 15:07:14 +0000 (17:07 +0200)]
x86/apic: include the LDR when clearing out APIC registers

Although APIC initialization will typically clear out the LDR before
setting it, the APIC cleanup code should reset the LDR.

This was discovered with a 32-bit KVM guest jumping into a kdump
kernel. The stale bits in the LDR triggered a bug in the KVM APIC
implementation which caused the destination mapping for VCPUs to be
corrupted.

Note that this isn't intended to paper over the KVM APIC bug. The kernel
has to clear the LDR when resetting the APIC registers except when X2APIC
is enabled.

Signed-off-by: Bandan Das <bsd@redhat.com>
[Linux commit 558682b5291937a70748d36fd9ba757fb25b99ae]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86: drop CONFIG_X86_MCE_THERMAL
Jan Beulich [Fri, 6 Sep 2019 15:06:19 +0000 (17:06 +0200)]
x86: drop CONFIG_X86_MCE_THERMAL

There's no point having this if it's not exposed through Kconfig.

Take the liberty to also drop an unnecessary "return" in context.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/mwait-idle: add support for Jacobsville
Zhang Rui [Fri, 6 Sep 2019 15:05:39 +0000 (17:05 +0200)]
x86/mwait-idle: add support for Jacobsville

Jacobsville uses the same C-states as Denverton.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
[Linux commit 04b1d5d098491244f506c4265cc95b87210eef2f]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/xstate: make use_xsave non-init
Roger Pau Monné [Fri, 6 Sep 2019 15:04:39 +0000 (17:04 +0200)]
x86/xstate: make use_xsave non-init

LLVM code generation can attempt to load from a variable in the next
condition of an expression under certain circumstances, thus
attempting to load use_xsave regardless of the value of the bsp
variable, which leads to a page fault when the init section has
already been unmapped.

Fix this by making use_xsave non-init, thus preventing the page fault;
use __read_mostly instead. The LLVM bug with the discussion about this
issue can be found at:

https://bugs.llvm.org/show_bug.cgi?id=39707

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>