xenbits.xensource.com Git - people/royger/xen.git/log
6 years ago x86: restrict HVMOP_pagetable_dying to current
Jan Beulich [Fri, 26 Oct 2018 13:18:52 +0000 (15:18 +0200)]
x86: restrict HVMOP_pagetable_dying to current

This is not used (and probably was never meant to be) by the tool stack.
Limiting it to the current domain in particular allows eliminating a
bogus use of vCPU 0 in pagetable_dying().

Remove the now unnecessary domain/vCPU parameters from the wrapper/hook
functions at the same time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: don't build guest-walk code without HVM and SHADOW_PAGING
Jan Beulich [Fri, 26 Oct 2018 13:16:23 +0000 (15:16 +0200)]
x86: don't build guest-walk code without HVM and SHADOW_PAGING

It's dead code in that case.

We could go further, as we don't really need the 2- and 3-level walk
code in PV mode, but to drop their compilation requires quite a bit of
disentangling of shadow mode code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/vvmx: Disallow the use of VT-x instructions when nested virt is disabled
Andrew Cooper [Wed, 10 Oct 2018 09:17:15 +0000 (09:17 +0000)]
x86/vvmx: Disallow the use of VT-x instructions when nested virt is disabled

c/s ac6a4500b "vvmx: set vmxon_region_pa of vcpu out of VMX operation to an
invalid address" was a real bugfix as described, but has a very subtle bug
which results in all VT-x instructions being usable by a guest.

The toolstack constructs a guest by issuing:

  XEN_DOMCTL_createdomain
  XEN_DOMCTL_max_vcpus

and optionally later, HVMOP_set_param to enable nested virt.

As a result, the call to nvmx_vcpu_initialise() in hvm_vcpu_initialise()
(which is what makes the above patch look correct during review) is actually
dead code.  In practice, nvmx_vcpu_initialise() first gets called when nested
virt is enabled, which is typically never.

As a result, the zeroed memory of struct vcpu causes nvmx_vcpu_in_vmx() to
return true before nested virt is enabled for the guest.

Fixing the order of initialisation is a work in progress for other reasons,
but not viable for security backports.

A compounding factor is that the vmexit handlers for all instructions, other
than VMXON, pass 0 into vmx_inst_check_privilege()'s vmxop_check parameter,
which skips the CR4.VMXE check.  (This is one of many reasons why nested virt
isn't a supported feature yet.)

However, the overall result is that when nested virt is not enabled by the
toolstack (i.e. the default configuration for all production guests), the VT-x
instructions (other than VMXON) are actually usable, and Xen very quickly
falls over the fact that the nvmx structure is uninitialised.

In order to fail safe in the supported case, re-implement all the VT-x
instruction handling using a single function with a common prologue, covering
all the checks which should cause #UD or #GP faults.  This deliberately
doesn't use any state from the nvmx structure, in case there are other lurking
issues.

This is XSA-278

Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
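
The "single function with a common prologue" idea described above can be sketched roughly as follows. This is an illustration only, not the code from the patch: `struct guest_state`, `enum insn_result` and `vtx_insn_prologue()` are hypothetical names, and the real set and order of checks in Xen may differ. The point it tries to show is that the decision relies only on explicitly set configuration and architectural register state, never on the possibly uninitialised nvmx structure.

```c
#include <stdbool.h>

/* Hypothetical model of the state the shared prologue inspects. */
struct guest_state {
    bool nested_virt_enabled; /* set explicitly by the toolstack */
    bool cr4_vmxe;            /* guest CR4.VMXE */
    unsigned int cpl;         /* current privilege level, 0-3 */
};

/* Hypothetical result codes: proceed, inject #UD, or inject #GP. */
enum insn_result { INSN_OK, INSN_UD, INSN_GP };

/* Run the shared checks before dispatching to any per-instruction handler. */
static enum insn_result vtx_insn_prologue(const struct guest_state *g)
{
    /* Nested virt not enabled for the guest, or CR4.VMXE clear: #UD. */
    if ( !g->nested_virt_enabled || !g->cr4_vmxe )
        return INSN_UD;

    /* VT-x instructions are privileged: #GP outside CPL 0. */
    if ( g->cpl > 0 )
        return INSN_GP;

    return INSN_OK;
}
```

Because every instruction passes through the same checks, a guest without nested virt enabled gets #UD regardless of which VT-x instruction it attempts.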
6 years ago QEMU_TAG update
Ian Jackson [Wed, 24 Oct 2018 15:18:37 +0000 (16:18 +0100)]
QEMU_TAG update

6 years ago tools/dombuilder: Initialise vcpu debug registers correctly
Andrew Cooper [Mon, 28 May 2018 14:18:17 +0000 (15:18 +0100)]
tools/dombuilder: Initialise vcpu debug registers correctly

In particular, initialising %dr6 with the value 0 is buggy, because on
hardware supporting Transactional Memory, it will cause the sticky RTM bit to
be asserted, even though a debug exception from a transaction hasn't actually
been observed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/domain: Initialise vcpu debug registers correctly
Andrew Cooper [Mon, 28 May 2018 14:18:17 +0000 (14:18 +0000)]
x86/domain: Initialise vcpu debug registers correctly

In particular, initialising %dr6 with the value 0 is buggy, because on
hardware supporting Transactional Memory, it will cause the sticky RTM bit to
be asserted, even though a debug exception from a transaction hasn't actually
been observed.

Introduce arch_vcpu_regs_init() to set various architectural defaults, and
reuse this in the hvm_vcpu_reset_state() path.

Architecturally, %edx's init state contains the processor's model information,
and 0xf looks to be a remnant of the old Intel processors.  We clearly have no
software which cares, seeing as it is wrong for the last decade's worth of
Intel hardware and for all other vendors, so let's use the value 0 for
simplicity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago x86/boot: Initialise the debug registers correctly
Andrew Cooper [Mon, 28 May 2018 14:18:17 +0000 (15:18 +0100)]
x86/boot: Initialise the debug registers correctly

In particular, initialising %dr6 with the value 0 is buggy, because on
hardware supporting Transactional Memory, it will cause the sticky RTM bit to
be asserted, even though a debug exception from a transaction hasn't actually
been observed.

Move X86_DR6_DEFAULT into x86-defns.h along with the other architectural
register constants, and introduce a new X86_DR7_DEFAULT.  Use the existing
write_debugreg() helper, rather than opencoded inline assembly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
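
To make the %dr6 problem concrete, here is a small sketch of why 0 is the wrong initial value. The constant values are assumed to be the architectural reset values from the Intel SDM (0xffff0ff0 for %dr6, 0x00000400 for %dr7); `DR6_RTM` and `dr6_reports_rtm_debug()` are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed architectural reset values; names follow the commit message. */
#define X86_DR6_DEFAULT 0xffff0ff0u
#define X86_DR7_DEFAULT 0x00000400u

/* %dr6 bit 16 (RTM) has inverted polarity: clear means "#DB hit inside RTM". */
#define DR6_RTM (1u << 16)

/* Hypothetical helper: does this %dr6 value claim a #DB inside a transaction? */
static bool dr6_reports_rtm_debug(uint32_t dr6)
{
    return !(dr6 & DR6_RTM);
}
```

With these definitions, a %dr6 of 0 reports a transactional debug exception that never happened, while the default value does not.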
6 years ago SUPPORT: Correct the description of altp2m
Andrew Cooper [Tue, 23 Oct 2018 13:49:09 +0000 (14:49 +0100)]
SUPPORT: Correct the description of altp2m

Altp2m aids monitoring guest memory, not hypervisor memory.  Also, put its
common name in brackets to aid searching.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago tools/libfsimage: Set soname to 4.12 not 0.4.12
Ian Jackson [Mon, 15 Oct 2018 15:20:26 +0000 (16:20 +0100)]
tools/libfsimage: Set soname to 4.12 not 0.4.12

This was set to 0.4.12 by accident in
  c69a6aca8522c7f676953e56191584381adf2c06
    tools/libfsimage: Bump soname to 4.12

The extra 0. is harmless but ugly.  We should be somewhat consistent.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/boot: enable NMIs after traps init
Sergey Dyasli [Tue, 23 Oct 2018 10:59:12 +0000 (11:59 +0100)]
x86/boot: enable NMIs after traps init

In certain scenarios, NMIs might be disabled during the Xen boot process.
Such a situation will cause alternative_instructions() to:

    panic("Timed out waiting for alternatives self-NMI to hit\n");

This bug was originally seen when using Tboot to boot Xen 4.11.

To prevent this from happening, enable NMIs during cpu_init() and
during __start_xen() for BSP.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago arm: fix Dom0 creation after ef72c93df9
Wei Liu [Mon, 22 Oct 2018 13:40:21 +0000 (14:40 +0100)]
arm: fix Dom0 creation after ef72c93df9

ARM Dom0 creation was broken by the said commit because ARM neither
provided XEN_DOMCTL_CDF_hvm_guest nor had CONFIG_PV set.

Set XEN_DOMCTL_CDF_hvm_guest flag for ARM Dom0 to fix the issue. Also
set XEN_DOMCTL_CDF_hap while at it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago xen/vsprintf: Introduce %*pb[l] for printing bitmaps
Andrew Cooper [Thu, 6 Sep 2018 10:25:59 +0000 (10:25 +0000)]
xen/vsprintf: Introduce %*pb[l] for printing bitmaps

The format identifier is consistent with Linux.  The code is adapted from
bitmap_scn{,list}printf() but cleaned up.

This change allows all callers to avoid needing a secondary buffer to render a
cpumask/nodemask into.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: <jbeulich@suse.com>
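
As a rough picture of what a list-style bitmap rendering produces, the sketch below turns set bits into a range list such as "0-3,8,10-11", which is the style Linux uses for `%*pbl`. It is a standalone illustration, not the vsprintf code from this commit: `bitmap_to_list()` and `test_bit()` here are hypothetical helpers, and unlike the new format specifier they still need a caller-supplied buffer.

```c
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Return non-zero if the given bit is set in the bitmap. */
static int test_bit(const unsigned long *map, unsigned int bit)
{
    return (map[bit / BITS_PER_LONG] >> (bit % BITS_PER_LONG)) & 1;
}

/*
 * Hypothetical helper: render the set bits of map as a comma-separated
 * range list.  Returns the string length, or -1 if buf is too small.
 */
static int bitmap_to_list(char *buf, size_t len,
                          const unsigned long *map, unsigned int nbits)
{
    size_t pos = 0;
    unsigned int bit = 0;

    if ( len )
        buf[0] = '\0';

    while ( bit < nbits )
    {
        unsigned int start, end;
        int n;

        if ( !test_bit(map, bit) )
        {
            bit++;
            continue;
        }

        /* Extend the run while the following bit is also set. */
        start = end = bit;
        while ( end + 1 < nbits && test_bit(map, end + 1) )
            end++;
        bit = end + 1;

        if ( start == end )
            n = snprintf(buf + pos, len - pos, "%s%u",
                         pos ? "," : "", start);
        else
            n = snprintf(buf + pos, len - pos, "%s%u-%u",
                         pos ? "," : "", start, end);

        if ( n < 0 || (size_t)n >= len - pos )
            return -1; /* output did not fit */

        pos += (size_t)n;
    }

    return (int)pos;
}
```

For a 16-bit map holding 0xD0F, this helper yields "0-3,8,10-11".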
6 years ago x86: don't setup legacy syscall vector when !CONFIG_PV
Wei Liu [Fri, 19 Oct 2018 14:28:36 +0000 (15:28 +0100)]
x86: don't setup legacy syscall vector when !CONFIG_PV

The code snippet switches between SYS_DESC_trap_gate and
SYS_DESC_irq_gate depending on whether XPTI is used. When PV is
disabled there is no need to switch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: stub out PV only code in do_debug
Wei Liu [Fri, 19 Oct 2018 14:28:38 +0000 (15:28 +0100)]
x86: stub out PV only code in do_debug

When PV is disabled those symbols won't be available. It is impossible
for Xen to hit #DB there.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: connect guest creation with CONFIG_PV
Wei Liu [Fri, 19 Oct 2018 14:28:34 +0000 (15:28 +0100)]
x86: connect guest creation with CONFIG_PV

This is a bit more complicated than the HVM case because system
domains have PV guest type. Leave them like that.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/pv: make guest_io_{read,write} local functions
Wei Liu [Fri, 19 Oct 2018 14:28:31 +0000 (15:28 +0100)]
x86/pv: make guest_io_{read,write} local functions

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: make construct_dom0 build with !CONFIG_PV
Wei Liu [Fri, 19 Oct 2018 14:28:30 +0000 (15:28 +0100)]
x86: make construct_dom0 build with !CONFIG_PV

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago xen/arm: Don't build GICv3 with the new vGIC
Julien Grall [Fri, 19 Oct 2018 14:23:55 +0000 (15:23 +0100)]
xen/arm: Don't build GICv3 with the new vGIC

Commit 54ec59f6b0 "xen/arm: vgic-v3: Don't create empty re-distributor
regions" breaks compilation when using the new vGIC.

This is because the field nr_regions does not exist in the vgic
structure. For simplicity, as vGICv3 is not yet imported, disable GICv3.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago x86/hvm/ioreq: allow ioreq servers to use HVM_PARAM_[BUF]IOREQ_PFN
Paul Durrant [Tue, 9 Oct 2018 08:25:48 +0000 (09:25 +0100)]
x86/hvm/ioreq: allow ioreq servers to use HVM_PARAM_[BUF]IOREQ_PFN

Since commit 2c257bd6 "x86/hvm: remove default ioreq server (again)" the
GFNs allocated by the toolstack and set in HVM_PARAM_IOREQ_PFN and
HVM_PARAM_BUFIOREQ_PFN have been unused. This patch allows them to be used
by (non-default) ioreq servers.

While in the area, also make sure HVM_PARAM_[BUF]IOREQ_PFN can only be set
once. These parameters should have always been in the 'set once' category
but this has, so far, not been enforced.

NOTE: This fixes a compatibility issue. A guest created on a version of
      Xen that pre-dates the initial ioreq server implementation and then
      migrated in will currently fail to resume because its migration
      stream will lack values for HVM_PARAM_IOREQ_SERVER_PFN and
      HVM_PARAM_NR_IOREQ_SERVER_PAGES *unless* the system has an
      emulator domain that uses direct resource mapping (which depends
      on the version of privcmd it happens to have) in which case it
      will not require use of GFNs for the ioreq server shared
      pages.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/svm: Remove the pdpe fields from struct vmcb
Andrew Cooper [Fri, 5 Oct 2018 17:02:15 +0000 (17:02 +0000)]
x86/svm: Remove the pdpe fields from struct vmcb

These fields have existed since the SVM code was first introduced.

The earliest reference I can find is c/s d1bd157fbc9 which is unfortunately a
rebase & squash of a separate dev tree.  Looking at the commit message, I'm
guessing it was introduced by:

  > user:        twoller@xen-trw1.site
  > date:        Tue Dec 13 19:49:53 2005 -0500
  > files:       ... xen/include/asm-x86/svm_vmcb.h ...
  > description:
  > Add SVM base files to repository.

Anyway, the AMD SDM has no mention of PDPE fields in the VMCB and marks this
part of the VMCB as reserved.  The manual does explicitly say that 32bit PAE
paging may read the PDPE fields from memory rather than from the CPU registers.

Chances are very good that this is a vestigial remnant of an early design.
Xen doesn't use the fields at all, except to copy them on virtual
vmentry/vmexit.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago x86/svm: Fix svm_update_guest_efer() for domains using shadow paging
Andrew Cooper [Thu, 4 Oct 2018 16:36:35 +0000 (16:36 +0000)]
x86/svm: Fix svm_update_guest_efer() for domains using shadow paging

When using shadow paging, EFER.NX is a Xen controlled bit, and is required by
the shadow pagefault handler to distinguish instruction fetches from data
accesses.

This can be observed by a guest which has NX and SMEP clear but SMAP active by
attempting to execute code on a user mapping.  The first attempt to build the
target shadow will #PF so is handled by the shadow code, but when walking the
guest pagetables, the lack of PFEC_insn_fetch being signalled causes the
shadow code to mistake the instruction fetch for a data fetch, and believe
that it is a real guest fault.  As a result, the guest receives #PF[-d-srP]
for an action which should complete successfully.

The suspicious-looking gymnastics with LME is actually a subtle corner case
with shadow paging.  When dropping out of Long Mode, a guest's choice of LME
and Xen's choice of CR0.PG cause hardware to operate in Long Mode, but the
shadow code to operate in 2-on-3 mode.

In addition to describing this corner case in the SVM side, extend the comment
for the same fix on the VT-x side.  (I have a suspicion that I've just worked
out why VT-x doesn't tolerate LMA != LME when Unrestricted Guest is clear.)

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago Reservation of PCI device range 0xc200-0xc2ff to XCP-ng Project
Alexander Schulz [Wed, 17 Oct 2018 16:29:03 +0000 (17:29 +0100)]
Reservation of PCI device range 0xc200-0xc2ff to XCP-ng Project

We are the XCP-ng project (https://xcp-ng.org) and want to distribute our
own PV-Tools (possibly also via Windows Update), so we need an extra range.

We also registered a PCI-Device:

"XCP-ng Project PCI Device for Windows Update" ->
https://pci-ids.ucw.cz/read/PC/5853/c200

Signed-off-by: Alexander Schulz <code@schulzalex.de>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago mem_access: Fix npfec.kind propagation
George Dunlap [Thu, 27 Sep 2018 11:25:36 +0000 (12:25 +0100)]
mem_access: Fix npfec.kind propagation

The name of the "with_gla" flag is confusing; it has nothing to do
with the existence or lack thereof of a faulting GLA, but rather where
the fault originated.  The npfec.kind value is always valid, and
should thus be propagated, regardless of whether gla_valid is set or
not.

In particular, gla_valid will never be set on AMD systems; but
npfec.kind will still be valid and should still be propagated.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
6 years ago rangeset: introduce rangeset_merge
Roger Pau Monne [Tue, 17 Jul 2018 09:48:26 +0000 (11:48 +0200)]
rangeset: introduce rangeset_merge

This new helper will merge two rangesets.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/altp2m: Add a subop for obtaining the mem access of a page
Razvan Cojocaru [Thu, 27 Sep 2018 07:58:54 +0000 (10:58 +0300)]
x86/altp2m: Add a subop for obtaining the mem access of a page

Currently there is a subop for setting the memaccess of a page, but not
for consulting it.  The new HVMOP_altp2m_get_mem_access adds this
functionality.

Both altp2m get/set mem access functions use the struct
xen_hvm_altp2m_mem_access which has now dropped the `set' part and has
been renamed from xen_hvm_altp2m_set_mem_access.

Signed-off-by: Adrian Pop <apop@bitdefender.com>
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years ago x86: provide stub for arch_do_multicall_call
Wei Liu [Thu, 4 Oct 2018 15:43:25 +0000 (16:43 +0100)]
x86: provide stub for arch_do_multicall_call

This hypercall is PV only on x86. Provide a stub for it when
!CONFIG_PV.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: make x86_64/traps.c build with !CONFIG_PV
Wei Liu [Thu, 4 Oct 2018 15:43:24 +0000 (16:43 +0100)]
x86: make x86_64/traps.c build with !CONFIG_PV

Provide declarations for hypercall_page_initialise_ring*_kernel, make
sure DCE works as expected.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: introduce is_pv_64bit_{vcpu,domain}
Wei Liu [Thu, 4 Oct 2018 15:43:23 +0000 (16:43 +0100)]
x86: introduce is_pv_64bit_{vcpu,domain}

This is useful to rewrite the following pattern (v is PV vcpu)

   if ( is_pv_32bit_vcpu(v) )
       do_foo;
   else
       do_bar;

to

   if ( is_pv_32bit_vcpu(v) )
       do_foo;
   else if ( is_pv_64bit_vcpu(v) )
       do_bar;
   else
       ASSERT_UNREACHABLE;
.

Previously it was not possible to rely on DCE to eliminate the do_bar
part. It becomes possible with the new code structure.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
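
The dead-code-elimination argument can be seen in a reduced form. The sketch below is hypothetical throughout: `struct vcpu` is a two-field stand-in, `CONFIG_PV` is modelled as a plain 0/1 macro, and `do_foo()`/`do_bar()` are placeholders. It only illustrates why an explicit `else if` on a predicate that folds to a compile-time false lets the compiler drop the `do_bar` path, which a bare `else` would not.

```c
#include <stdbool.h>

/* Hypothetical build-time switch standing in for CONFIG_PV. */
#ifndef CONFIG_PV
#define CONFIG_PV 1
#endif

/* Simplified, hypothetical vcpu: just the two properties we test. */
struct vcpu {
    bool is_pv;
    bool is_32bit;
};

/* Both predicates fold to constant false when CONFIG_PV is 0. */
static inline bool is_pv_32bit_vcpu(const struct vcpu *v)
{
    return CONFIG_PV && v->is_pv && v->is_32bit;
}

static inline bool is_pv_64bit_vcpu(const struct vcpu *v)
{
    return CONFIG_PV && v->is_pv && !v->is_32bit;
}

static int do_foo(void) { return 32; }
static int do_bar(void) { return 64; }

static int dispatch(const struct vcpu *v)
{
    if ( is_pv_32bit_vcpu(v) )
        return do_foo();
    else if ( is_pv_64bit_vcpu(v) )
        return do_bar();
    else
        return -1; /* stands in for the ASSERT_UNREACHABLE branch */
}
```

Building with `-DCONFIG_PV=0` leaves neither call reachable, so an optimising compiler is free to discard both helpers.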
6 years ago x86: turn is_pv_{,32bit_}{domain,vcpu} into inline functions
Wei Liu [Thu, 4 Oct 2018 15:43:22 +0000 (16:43 +0100)]
x86: turn is_pv_{,32bit_}{domain,vcpu} into inline functions

And make them work with CONFIG_PV.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago tools/libfsimage: Rename /usr/lib/fs to /usr/lib/xenfsimage
Ian Jackson [Tue, 9 Oct 2018 16:15:48 +0000 (17:15 +0100)]
tools/libfsimage: Rename /usr/lib/fs to /usr/lib/xenfsimage

Again, avoid namespace pollution.  These paths are purely internal to
libfsimage and its fs-specific modules, so no visible change from the
outside.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/pygrub: Add `xen' to fsimage python module name
Ian Jackson [Tue, 9 Oct 2018 16:14:34 +0000 (17:14 +0100)]
tools/pygrub: Add `xen' to fsimage python module name

This module should be called `libxenfsimage' for the same reasons that
the C library should.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/libfsimage: Add `xen' to .h names and principal .so name
Ian Jackson [Tue, 9 Oct 2018 16:02:42 +0000 (17:02 +0100)]
tools/libfsimage: Add `xen' to .h names and principal .so name

`fsimage' is rather general.  And we do not expect this library to be
very useful out of tree because of its unstable ABI.

So add the word `xen'.  This will avoid naming conflicts with anyone
else's fsimage library.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/libfsimage: Bump soname to 4.12
Ian Jackson [Tue, 9 Oct 2018 16:02:34 +0000 (17:02 +0100)]
tools/libfsimage: Bump soname to 4.12

This library does not have a stable ABI promise.  As far as we know it
is used only by pygrub.  Bump its soname to the Xen version (and
intend to change it each time).

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xenstore.h: Put ( ) around XS_* define shifts
Ian Jackson [Tue, 9 Oct 2018 15:25:38 +0000 (16:25 +0100)]
xenstore.h: Put ( ) around XS_* define shifts

These definitions were not properly protected from unwanted operator
precedence interactions.

Existing use sites in-tree all use & or |, so this does not change any
actual behaviour in-tree.

The same seems likely to be true in external callers.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
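
The precedence hazard is easy to reproduce in isolation. In the sketch below the macro names are hypothetical (they are not the XS_* definitions), and multiplication is used as the operator that binds tighter than the shift.

```c
/* Hypothetical flag definitions: same value, with and without parentheses. */
#define FLAG_BARE   1UL<<2
#define FLAG_PAREN  (1UL<<2)

/* Expands to 1UL << (2 * 3): the multiplication binds before the shift. */
static unsigned long bare_times_three(void)
{
    return FLAG_BARE * 3;
}

/* Expands to (1UL << 2) * 3, which is what a caller would expect. */
static unsigned long paren_times_three(void)
{
    return FLAG_PAREN * 3;
}

/* & binds more loosely than <<, so the bare form happens to work here. */
static unsigned long bare_and(unsigned long flags)
{
    return flags & FLAG_BARE;
}
```

The first function returns 64 and the second 12, while the `&` case gives the same answer either way, which matches the observation that existing `&` and `|` users are unaffected.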
6 years ago tools/debugger/kdd: Install as `xen-kdd', not just `kdd'
Ian Jackson [Fri, 28 Sep 2018 14:30:54 +0000 (15:30 +0100)]
tools/debugger/kdd: Install as `xen-kdd', not just `kdd'

`kdd' is an unfortunate namespace landgrab.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years ago xenmon: Install as xenmon, not xenmon.py
Ian Jackson [Fri, 28 Sep 2018 14:27:21 +0000 (15:27 +0100)]
xenmon: Install as xenmon, not xenmon.py

Adding the implementation language as a suffix to a program name is
poor practice.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago pygrub fsimage.so: Honour LDFLAGS when building
Ian Jackson [Thu, 4 Oct 2018 11:32:00 +0000 (12:32 +0100)]
pygrub fsimage.so: Honour LDFLAGS when building

This seems to have been simply omitted.  Obviously this is needed when
building and not just when installing.  Passing only when installing
is ineffective.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago gdbsx: Honour LDFLAGS when linking
Ian Jackson [Thu, 4 Oct 2018 11:30:37 +0000 (12:30 +0100)]
gdbsx: Honour LDFLAGS when linking

This command does the link, so it needs LDFLAGS.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
6 years ago tools/Rules.mk: Honour PREPEND_LDFLAGS_XEN_TOOLS
Ian Jackson [Fri, 5 Oct 2018 16:52:54 +0000 (17:52 +0100)]
tools/Rules.mk: Honour PREPEND_LDFLAGS_XEN_TOOLS

This allows the caller to provide some LDFLAGS to the Xen build
system.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/xsm: Add new SILO mode for XSM
Xin Li [Tue, 9 Oct 2018 09:33:20 +0000 (17:33 +0800)]
xen/xsm: Add new SILO mode for XSM

When SILO is enabled, there would be no page-sharing or event notifications
between unprivileged VMs (no grant tables or event channels).

Signed-off-by: Xin Li <xin.li@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago xen/xsm: Introduce new boot parameter xsm
Xin Li [Tue, 9 Oct 2018 09:33:19 +0000 (17:33 +0800)]
xen/xsm: Introduce new boot parameter xsm

Introduce a new boot parameter, xsm, to choose which XSM module is enabled,
with dummy as the default. Also add a new Kconfig option to choose the
default XSM implementation.

Signed-off-by: Xin Li <xin.li@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago xen/xsm: remove unnecessary #define
Xin Li [Tue, 9 Oct 2018 09:33:18 +0000 (17:33 +0800)]
xen/xsm: remove unnecessary #define

This #define is unnecessary since XSM_INLINE is redefined in
xsm/dummy.h, and it risks build breakage, so remove it.

Signed-off-by: Xin Li <xin.li@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
6 years ago amd-iommu: use correct constants in amd_iommu_get_next_table_from_pte()
Paul Durrant [Wed, 26 Sep 2018 13:44:07 +0000 (14:44 +0100)]
amd-iommu: use correct constants in amd_iommu_get_next_table_from_pte()

...and change the name to amd_iommu_get_address_from_pte() since the
address read is not necessarily the address of a next level page table.
(If the 'next level' field is not 1 - 6 then the address is a page
address).

The constants in use prior to this patch relate to device table entries
rather than page table entries. Although they do have the same value, it
makes the code confusing to read.

This patch also changes the PDE/PTE pointer argument to void *, and
removes any u32/uint32_t casts in the call sites. Unnecessary casts
surrounding call sites are also removed.

No functional change.

NOTE: The patch also adds emacs boilerplate to iommu_map.c

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Brian Woods <brian.woods@amd.com>
6 years ago x86/dom0: switch parse_dom0_param to use parse_boolean
Roger Pau Monne [Tue, 9 Oct 2018 09:42:32 +0000 (11:42 +0200)]
x86/dom0: switch parse_dom0_param to use parse_boolean

No functional change expected.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: don't report PV support when !CONFIG_PV
Wei Liu [Thu, 4 Oct 2018 15:43:33 +0000 (16:43 +0100)]
x86: don't report PV support when !CONFIG_PV

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago tools/pvh: set coherent MTRR state for all vCPUs
Roger Pau Monne [Wed, 10 Oct 2018 14:39:35 +0000 (16:39 +0200)]
tools/pvh: set coherent MTRR state for all vCPUs

Instead of just doing it for the BSP. This requires storing the
maximum number of possible vCPUs in xc_dom_image.

This has been a latent bug so far because PVH doesn't yet support
pci-passthrough, so the effective memory cache attribute is forced to
WB by the hypervisor. Note also that even without this in place vCPU#0
is preferred in certain scenarios in order to calculate the memory
cache attributes.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/shadow: put PV L1TF functions under CONFIG_PV
Wei Liu [Thu, 4 Oct 2018 15:43:20 +0000 (16:43 +0100)]
x86/shadow: put PV L1TF functions under CONFIG_PV

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/vtd: fix IOMMU share PT destruction path
Wei Liu [Tue, 9 Oct 2018 14:57:08 +0000 (15:57 +0100)]
x86/vtd: fix IOMMU share PT destruction path

Commit 2916951c1 ("mm / iommu: include need_iommu() test in
iommu_use_hap_pt()") included need_iommu() in iommu_use_hap_pt and
91d4eca7add ("mm / iommu: split need_iommu() into has_iommu_pt() and
need_iommu_pt_sync()") made things finer grained by splitting need_iommu
into three states.

The destruction path can't use iommu_use_hap_pt because at the point
the platform op is called, the IOMMU has either already been switched
to, or has always been in, the disabled state, and the shared PT test
would always be false.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years ago libxl: Restore scheduling parameters after migrate in best-effort fashion
George Dunlap [Wed, 10 Oct 2018 11:36:25 +0000 (12:36 +0100)]
libxl: Restore scheduling parameters after migrate in best-effort fashion

Commit 3b4adba ("tools/libxl: include scheduler parameters in the
output of xl list -l") added scheduling parameters to the set of
information collected by libxl_retrieve_domain_configuration(), in
order to report that information in `xl list -l`.

Unfortunately, libxl_retrieve_domain_configuration() is also called by
the migration / save code, and the results passed to the restore /
receive code.  This meant scheduler parameters were inadvertently
added to the migration stream, without proper consideration for how to
handle corner cases.  The result was that if migrating from a host
running one scheduler to a host running a different scheduler, the
migration would fail with an error like the following:

libxl: error: libxl_sched.c:232:sched_credit_domain_set: Domain 1:Getting domain sched credit: Invalid argument
libxl: error: libxl_create.c:1275:domcreate_rebuild_done: Domain 1:cannot (re-)build domain: -3

Luckily there's a fairly straightforward way to set parameters in a
"best-effort" fashion.  libxl provides a single struct containing the
parameters of all schedulers, as well as a parameter specifying which
scheduler.  Parameters not used by a given scheduler are ignored.
If you specify a specific scheduler, libxl_domain_sched_params_set()
will fail if there's a different scheduler.  However, if you pass
LIBXL_SCHEDULER_UNKNOWN, it will use the value of the current scheduler
for that domain.

In domcreate_stream_done(), before calling libxl__build_post(), set
the scheduler to LIBXL_SCHEDULER_UNKNOWN.  This will propagate
scheduler parameters from the previous instantiation on a best-effort
basis.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
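
The "best-effort" behaviour described above can be modelled in a few lines. This is a toy model, not libxl: `enum sched_id`, `struct sched_params` and `sched_params_set()` are hypothetical stand-ins for the real libxl types and for libxl_domain_sched_params_set(), and a single `weight` field stands in for the full parameter set.

```c
/* Hypothetical scheduler identifiers; SCHED_UNKNOWN means "whatever is running". */
enum sched_id { SCHED_UNKNOWN, SCHED_CREDIT, SCHED_CREDIT2 };

/* Hypothetical, heavily reduced parameter struct. */
struct sched_params {
    enum sched_id sched;
    int weight;
};

/*
 * Apply req to a domain whose host runs scheduler `current`.
 * Returns 0 on success, -1 if req names a different scheduler.
 */
static int sched_params_set(enum sched_id current,
                            const struct sched_params *req,
                            struct sched_params *applied)
{
    /* SCHED_UNKNOWN defers to the scheduler the domain is already using. */
    enum sched_id target =
        (req->sched == SCHED_UNKNOWN) ? current : req->sched;

    if ( target != current )
        return -1;

    applied->sched = current;
    applied->weight = req->weight;
    return 0;
}
```

In this model, a request recorded under one scheduler fails on a host running another, and succeeds once its scheduler field is reset to the "unknown" value before the call.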
6 years ago iommu: fix arm build after e9be34be5
Wei Liu [Tue, 9 Oct 2018 18:58:12 +0000 (19:58 +0100)]
iommu: fix arm build after e9be34be5

The function iommu_share_p2m_table is used by both ARM and x86 but
hap_enabled macro is x86 only. Put the ASSERT under CONFIG_X86.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: put_page_from_l2e() should honor _PAGE_RW
Jan Beulich [Tue, 9 Oct 2018 14:27:59 +0000 (16:27 +0200)]
x86: put_page_from_l2e() should honor _PAGE_RW

56fff3e5e9 ("x86: nuke PV superpage option and code") has introduced a
(luckily latent only) bug here, in that it didn't make reference
dropping dependent on whether the page was mapped writable. The only
current source of large page mappings for PV domains is the Dom0
builder, which only produces writeable ones.

Take the opportunity and also convert to bool both put_data_page()'s
respective parameter and the argument put_page_from_l3e() passes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/vtd: fix iommu_share_p2m_table
Roger Pau Monné [Tue, 9 Oct 2018 14:27:13 +0000 (16:27 +0200)]
x86/vtd: fix iommu_share_p2m_table

Commit 2916951c1 "mm / iommu: include need_iommu() test in
iommu_use_hap_pt()" changed the check in iommu_share_p2m_table to use
need_iommu(d) (as part of iommu_use_hap_pt) instead of iommu_enabled,
which broke the check because at the point in domain construction
where iommu_share_p2m_table is called need_iommu(d) will always return
false.

Fix this by reverting to the previous logic.

While there, turn the hap_enabled check into an ASSERT, since the only
caller of iommu_share_p2m_table already performs the hap_enabled check
before calling the function.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago flask: sort io{port,mem}con entries
Daniel De Graaf [Tue, 9 Oct 2018 14:26:54 +0000 (16:26 +0200)]
flask: sort io{port,mem}con entries

These entries are not always sorted by checkpolicy, so sort them during
policy load (as is already done for later ocontext additions).

Reported-by: Nicolas Poirot <nicolas.poirot@bertin.fr>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Tested-by: Nicolas Poirot <nicolas.poirot@bertin.fr>
Reviewed-by: Nicolas Poirot <nicolas.poirot@bertin.fr>
6 years agox86/HVM: move vendor independent CPU save/restore logic to shared code
Jan Beulich [Tue, 9 Oct 2018 14:25:35 +0000 (16:25 +0200)]
x86/HVM: move vendor independent CPU save/restore logic to shared code

A few pieces of the handling here are (no longer?) vendor specific, and
hence there's no point in replicating the code. Zero the full structure
before calling the save hook, eliminating the need for the hook
functions to zero individual fields.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
6 years agotools/libxenstat: Fix SONAME following c/s 57077cc42
Andrew Cooper [Tue, 9 Oct 2018 14:06:25 +0000 (15:06 +0100)]
tools/libxenstat: Fix SONAME following c/s 57077cc42

The unstable ABI version is 4.12, not 4.11

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen/sched: Drop set_current_state()
Andrew Cooper [Mon, 8 Oct 2018 14:28:28 +0000 (15:28 +0100)]
xen/sched: Drop set_current_state()

This appears to have been a Linux-ism which found its way into the Xen
codebase with the IA64 port, and remained after IA64 was removed.

As far as I can tell from code archeology, none of the other architectures
have ever had a current->state field.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
6 years agolibfsimage: Honour general LDFLAGS
Ian Jackson [Thu, 4 Oct 2018 11:31:25 +0000 (12:31 +0100)]
libfsimage: Honour general LDFLAGS

Do not reset LDFLAGS to empty.  Instead, append the fsimage-special
LDFLAGS.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/xenstat: Fix shared library version
Bastian Blank [Sat, 5 Jul 2014 09:46:50 +0000 (11:46 +0200)]
tools/xenstat: Fix shared library version

libxenstat does not have a stable ABI.  Set its version to the current
Xen release version.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agodocs/man/xen-pv-channel.pod.7: Remove a spurious blank line
Ian Jackson [Wed, 3 Oct 2018 17:43:55 +0000 (18:43 +0100)]
docs/man/xen-pv-channel.pod.7: Remove a spurious blank line

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agodocs/man: Provide properly-formatted NAME sections
Ian Jackson [Wed, 3 Oct 2018 17:42:42 +0000 (18:42 +0100)]
docs/man: Provide properly-formatted NAME sections

A manpage `foo.7.pod' must start with

  =head1 NAME

  foo - some summary of what foo is or what this manpage is

because otherwise manpage catalogue systems cannot generate a proper
`whatis' entry.

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoVarious: Fix typo `mappping'
Ian Jackson [Wed, 3 Oct 2018 18:00:22 +0000 (19:00 +0100)]
Various: Fix typo `mappping'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoVarious: Fix typo `infomation'
Ian Jackson [Wed, 3 Oct 2018 17:59:18 +0000 (18:59 +0100)]
Various: Fix typo `infomation'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agotools/python/xen/lowlevel: Fix typo `sucess'
Ian Jackson [Wed, 3 Oct 2018 17:57:13 +0000 (18:57 +0100)]
tools/python/xen/lowlevel: Fix typo `sucess'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoVarious: Fix typo `reseting'
Ian Jackson [Wed, 3 Oct 2018 17:56:39 +0000 (18:56 +0100)]
Various: Fix typo `reseting'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoVarious: Fix typo `occured'
Ian Jackson [Wed, 3 Oct 2018 17:55:36 +0000 (18:55 +0100)]
Various: Fix typo `occured'

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agoVarious: Fix typos `unkown', `retreive' (detected by lintian)
Ian Jackson [Wed, 3 Oct 2018 17:51:50 +0000 (18:51 +0100)]
Various: Fix typos `unkown', `retreive' (detected by lintian)

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/xentrace/xenalyze: Fix typos detected by lintian
Ian Jackson [Wed, 3 Oct 2018 17:46:47 +0000 (18:46 +0100)]
tools/xentrace/xenalyze: Fix typos detected by lintian

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agodocs/man: Fix two typos detected by the Debian lintian tool
Ian Jackson [Wed, 3 Oct 2018 17:44:18 +0000 (18:44 +0100)]
docs/man: Fix two typos detected by the Debian lintian tool

Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/ocaml: Release the global lock before invoking block syscalls
Yang Qian [Mon, 8 Oct 2018 03:10:14 +0000 (11:10 +0800)]
tools/ocaml: Release the global lock before invoking block syscalls

Functions related to event channels are parallelizable, so release the global
lock before invoking C functions which will eventually make blocking syscalls.

Signed-off-by: Yang Qian <yang.qian@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agomm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()
Paul Durrant [Fri, 5 Oct 2018 14:47:10 +0000 (16:47 +0200)]
mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()

The name 'need_iommu()' is a little confusing as it suggests a domain needs
to use the IOMMU but something might not be set up yet, when in fact it
represents a tri-state value (not a boolean as might be expected) where
-1 means 'IOMMU mappings being set up' and 1 means 'IOMMU mappings have
been fully set up'.

Two different meanings are also inferred from the macro in various
places in the code:

- Some callers want to test whether a domain has IOMMU mappings at all
- Some callers want to test whether they need to synchronize the domain's
  P2M and IOMMU mappings

This patch replaces the 'need_iommu' tri-state value with a defined
enumeration and adds a boolean flag 'need_sync' to separate these meanings,
and places both of these in struct domain_iommu, rather than directly in
struct domain.
This patch also creates two new boolean macros:

- 'has_iommu_pt()' evaluates to true if a domain has IOMMU mappings, even
  if they are still under construction.
- 'need_iommu_pt_sync()' evaluates to true if a domain requires explicit
  synchronization of the P2M and IOMMU mappings.

All callers of need_iommu() are then modified to use the macro appropriate
to what they are trying to test, except for the instance in
xen/drivers/passthrough/pci.c:assign_device() which has simply been
removed since it appears to be unnecessary.

NOTE: There are some callers of need_iommu() that strictly operate on
      the hardware domain. In some of these cases a more global flag is
      used instead.
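
The split described above can be sketched roughly as follows. This is a
minimal illustration, not the patch itself: the enumerator names, the
dom_iommu() accessor and the cut-down struct domain are assumptions made so
that the fragment compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed enumerator names; they replace the old -1/0/1 tri-state. */
enum iommu_status {
    IOMMU_STATUS_disabled,      /* no IOMMU mappings */
    IOMMU_STATUS_initializing,  /* mappings being set up */
    IOMMU_STATUS_initialized    /* mappings fully set up */
};

/* A zero-filled domain must start out without IOMMU mappings. */
static_assert(IOMMU_STATUS_disabled == 0, "disabled must be the zero value");

struct domain_iommu {
    enum iommu_status status;
    bool need_sync;             /* P2M and IOMMU mappings need explicit sync */
};

/* Cut-down stand-in for struct domain (hypothetical layout). */
struct domain {
    struct domain_iommu iommu;
};

#define dom_iommu(d)          (&(d)->iommu)

/* True as soon as mappings exist, even if still under construction. */
#define has_iommu_pt(d)       (dom_iommu(d)->status != IOMMU_STATUS_disabled)

/* True only if the P2M and IOMMU mappings must be kept in sync by hand. */
#define need_iommu_pt_sync(d) (dom_iommu(d)->need_sync)
```

With this shape, a domain whose mappings are still being built answers true
to has_iommu_pt() while need_iommu_pt_sync() stays independent of that state.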

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agomm / iommu: include need_iommu() test in iommu_use_hap_pt()
Paul Durrant [Fri, 5 Oct 2018 14:36:56 +0000 (16:36 +0200)]
mm / iommu: include need_iommu() test in iommu_use_hap_pt()

The name 'iommu_use_hap_pt' suggests that the P2M table is in use as the
domain's IOMMU pagetable which, prior to this patch, is not strictly true
since the macro did not test whether the domain actually has IOMMU
mappings.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
6 years agovtd: add lookup_page method to iommu_ops
Paul Durrant [Fri, 5 Oct 2018 14:35:23 +0000 (16:35 +0200)]
vtd: add lookup_page method to iommu_ops

This patch adds a new method to the VT-d IOMMU implementation to find the
MFN currently mapped by the specified DFN along with a wrapper function
in generic IOMMU code to call the implementation if it exists.

NOTE: This patch only adds a Xen-internal interface. This will be used by
      a subsequent patch.
      Another subsequent patch will add similar functionality for AMD
      IOMMUs.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
6 years agopass-through: provide two !HVM stubs
Jan Beulich [Fri, 5 Oct 2018 14:25:43 +0000 (16:25 +0200)]
pass-through: provide two !HVM stubs

Older gcc (4.3 in my case), despite eliminating pci_clean_dpci_irqs()
when !HVM, does not manage to also eliminate pci_clean_dpci_irq(). Cope
with this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agofix uninitialized variable error in do_poll()
Jan Beulich [Fri, 5 Oct 2018 14:24:56 +0000 (16:24 +0200)]
fix uninitialized variable error in do_poll()

Now that CONFIG_HVM can (and should) be turned off for the shim, gcc 8.2
apparently is no longer sure that "port" is indeed initialized at

    if ( sched_poll->nr_ports == 1 )
        v->poll_evtchn = port;

It doesn't look to be impossible for the compiler to prove it is not,
but we also can't rely on that to be the case. Add an initializer.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agox86: use VMLOAD for PV context switch
Jan Beulich [Fri, 5 Oct 2018 14:24:05 +0000 (16:24 +0200)]
x86: use VMLOAD for PV context switch

Having noticed that VMLOAD alone is about as fast as a single one of the
involved WRMSRs, I thought it might be a reasonable idea to also use it
for PV. Measurements, however, have shown that an actual improvement can
be achieved only with an early prefetch of the VMCB (thanks to Andrew
for suggesting to try this), which I have to admit I can't really
explain. This way on my Fam15 box context switch takes over 100 clocks
less on average (the measured values are heavily varying in all cases,
though).

This is intentionally not using a new hvm_funcs hook: For one, this is
all about PV, and something similar can hardly be done for VMX.
Furthermore the indirect to direct call patching that is meant to be
applied to most hvm_funcs hooks would be ugly to make work with
functions having more than 6 parameters.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agomemory: add check_get_page_from_gfn() as a wrapper...
Paul Durrant [Fri, 5 Oct 2018 14:22:37 +0000 (16:22 +0200)]
memory: add check_get_page_from_gfn() as a wrapper...

...for some uses of get_page_from_gfn().

There are many occurrences of the following pattern in the code:

    q = <readonly look-up> ? P2M_ALLOC : P2M_UNSHARE;
    page = get_page_from_gfn(d, gfn, &p2mt, q);

    if ( p2m_is_paging(p2mt) )
    {
        if ( page )
            put_page(page);

        p2m_mem_paging_populate(d, gfn);
        return <-EAGAIN or equivalent>;
    }

    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
    {
        if ( page )
            put_page(page);

        return <-EAGAIN or equivalent>;
    }

    if ( !page )
        return <-EINVAL or equivalent>;

There are some small differences between the exact way the occurrences
are coded but the desired semantic is the same.

This patch introduces a new common implementation of this code in
check_get_page_from_gfn() and then converts the various open-coded patterns
into calls to this new function.

NOTE: A forward declaration of p2m_type_t enum has been introduced in
      p2m-common.h so that it is possible to declare
      check_get_page_from_gfn() there rather than having to add
      duplicate declarations in the per-architecture p2m headers.
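
For illustration, the pattern above could be folded into a helper along
these lines. This is only a sketch: the types, the toy get_page_from_gfn()
and put_page() stand-ins, and the exact parameter list of
check_get_page_from_gfn() are assumptions chosen so the fragment builds on
its own, not the code added by this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for Xen's types, just enough to compile. */
typedef enum { p2m_ram_rw, p2m_ram_paging_out, p2m_ram_shared } p2m_type_t;
typedef unsigned int p2m_query_t;
#define P2M_ALLOC   (1u << 0)
#define P2M_UNSHARE (1u << 1)
#define p2m_is_paging(t) ((t) == p2m_ram_paging_out)
#define p2m_is_shared(t) ((t) == p2m_ram_shared)

struct page_info { unsigned int refs; };
struct domain {
    p2m_type_t type;         /* toy model: one type for every gfn */
    struct page_info *page;  /* toy model: one page (or NULL) for every gfn */
    unsigned int populates;  /* number of paging populate requests issued */
};

static struct page_info *get_page_from_gfn(struct domain *d, unsigned long gfn,
                                           p2m_type_t *t, p2m_query_t q)
{
    (void)gfn; (void)q;
    *t = d->type;
    if ( d->page )
        d->page->refs++;     /* a successful look-up takes a reference */
    return d->page;
}

static void put_page(struct page_info *page)
{
    assert(page->refs > 0);
    page->refs--;
}

static void p2m_mem_paging_populate(struct domain *d, unsigned long gfn)
{
    (void)gfn;
    d->populates++;
}

/* The open-coded pattern, collected in one place. */
int check_get_page_from_gfn(struct domain *d, unsigned long gfn, bool readonly,
                            p2m_type_t *p2mt_p, struct page_info **page_p)
{
    p2m_query_t q = readonly ? P2M_ALLOC : P2M_UNSHARE;
    p2m_type_t p2mt;
    struct page_info *page = get_page_from_gfn(d, gfn, &p2mt, q);

    if ( p2m_is_paging(p2mt) )
    {
        if ( page )
            put_page(page);
        p2m_mem_paging_populate(d, gfn);
        return -EAGAIN;
    }

    if ( (q & P2M_UNSHARE) && p2m_is_shared(p2mt) )
    {
        if ( page )
            put_page(page);
        return -EAGAIN;
    }

    if ( !page )
        return -EINVAL;

    if ( p2mt_p )
        *p2mt_p = p2mt;
    *page_p = page;
    return 0;
}
```

A caller then only has to check the return value; on success it owns one
reference to the returned page and must drop it with put_page().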

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monne <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agoiommu: push use of type-safe DFN and MFN into iommu_ops
Paul Durrant [Fri, 5 Oct 2018 14:21:05 +0000 (16:21 +0200)]
iommu: push use of type-safe DFN and MFN into iommu_ops

This patch modifies the methods in struct iommu_ops to use type-safe DFN
and MFN. This follows on from the prior patch that modified the functions
exported in xen/iommu.h.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agoiommu: make use of type-safe DFN and MFN in exported functions
Paul Durrant [Fri, 5 Oct 2018 14:16:13 +0000 (16:16 +0200)]
iommu: make use of type-safe DFN and MFN in exported functions

This patch modifies the declaration of the entry points to the IOMMU
sub-system to use dfn_t and mfn_t in place of unsigned long. A subsequent
patch will similarly modify the methods in the iommu_ops structure.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Roger Pau Monne <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agoAMD/IOMMU: Drop get_field_from_byte()
Andrew Cooper [Mon, 24 Sep 2018 10:39:46 +0000 (11:39 +0100)]
AMD/IOMMU: Drop get_field_from_byte()

It is MASK_EXTR() in disguise, but less flexible.

No functional change.
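
As a reminder of what MASK_EXTR() does, here is a small sketch. The macro
body is written from memory and should be treated as an assumption rather
than a copy of the Xen header: it isolates the masked bits and divides by
the lowest set bit of the mask, so no separate shift argument is needed.

```c
#include <assert.h>

/*
 * Extract the field selected by mask m from value v.
 * (m) & -(m) is the lowest set bit of the mask, so the division
 * shifts the field down to bit 0 whatever the field's position is.
 */
#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))

/* Compile-time check: the high nibble of 0xab is 0xa. */
static_assert(MASK_EXTR(0xabu, 0xf0u) == 0xau, "high nibble extraction");

/* Works for any width, not just a single byte. */
unsigned int extract_field(unsigned int value, unsigned int mask)
{
    return MASK_EXTR(value, mask);
}
```

Because the macro is type-agnostic, it covers the byte-only case handled by
the dropped helper as well as wider fields.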

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years agoAMD/IOMMU: Don't opencode memcpy() in queue_iommu_command()
Andrew Cooper [Mon, 24 Sep 2018 10:16:21 +0000 (11:16 +0100)]
AMD/IOMMU: Don't opencode memcpy() in queue_iommu_command()

In practice, this allows the compiler to replace the loop with a pair of movs.

No functional change.
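
To illustrate the difference, a sketch of the two forms follows. The
function names and the assumption that one command entry is four 32-bit
words (16 bytes) are illustrative, not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CMD_WORDS 4  /* assumed: one command entry is four 32-bit words */

static_assert(CMD_WORDS * sizeof(uint32_t) == 16, "entry is 16 bytes");

/* Open-coded copy: the compiler has to recognise the loop to improve it. */
void copy_cmd_loop(uint32_t *dst, const uint32_t *src)
{
    for ( unsigned int i = 0; i < CMD_WORDS; i++ )
        dst[i] = src[i];
}

/* Fixed-size memcpy(): typically lowered to a couple of wide moves. */
void copy_cmd_memcpy(uint32_t *dst, const uint32_t *src)
{
    memcpy(dst, src, CMD_WORDS * sizeof(*src));
}
```

Both functions produce the same destination contents; only the generated
code is expected to differ.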

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years agox86: fix !CONFIG_HVM build for clang 3.8
Wei Liu [Thu, 4 Oct 2018 16:37:56 +0000 (17:37 +0100)]
x86: fix !CONFIG_HVM build for clang 3.8

It was discovered that hvm_funcs made it into monitor.o even when HVM
is disabled. This version of clang doesn't seem to completely
eliminate the code after is_hvm_domain() in
arch_monitor_get_capabilities().

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agotools/ocaml: Delete the Xenctrl.with_intf wrapper
Andrew Cooper [Wed, 3 Oct 2018 13:11:20 +0000 (14:11 +0100)]
tools/ocaml: Delete the Xenctrl.with_intf wrapper

This wrapper hides an opening and closing of the xenctrl handle, which amongst
other things opens and closes multiple device files.

A process should create one handle at the start of day and reuse that; indeed
there is no guarantee that the process will retain sufficient permissions to
re-open /dev/xen/privcmd at a later point.

With the final user of Xenctrl.with_intf removed, drop the wrapper entirely.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
6 years agooxenstored: Don't re-open a xenctrl handle for every domain introduction
Andrew Cooper [Wed, 3 Oct 2018 09:32:54 +0000 (10:32 +0100)]
oxenstored: Don't re-open a xenctrl handle for every domain introduction

Currently, an xc handle is opened in main() which is used for cleanup
activities, and a new xc handle is temporarily opened every time a domain is
introduced.  This is inefficient, and amongst other things, requires full root
privileges for the lifetime of oxenstored.

All code using the Xenctrl handle is in domains.ml, so initialise xc as a
global (now happens just before main() is called) and drop it as a parameter
from Domains.create and Domains.cleanup.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
6 years agotools/ocaml: Strip all trailing whitespace
Andrew Cooper [Wed, 3 Oct 2018 09:31:39 +0000 (10:31 +0100)]
tools/ocaml: Strip all trailing whitespace

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
6 years agotools/xen-hvmctx: drop bogus casts from dump_mtrr()
Jan Beulich [Thu, 4 Oct 2018 12:55:38 +0000 (14:55 +0200)]
tools/xen-hvmctx: drop bogus casts from dump_mtrr()

Also make the iteration variable unsigned.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/xen-hvmctx: drop bogus casts from dump_hpet()
Jan Beulich [Thu, 4 Oct 2018 12:55:15 +0000 (14:55 +0200)]
tools/xen-hvmctx: drop bogus casts from dump_hpet()

Also specify field widths of the multiple similar lines printed in the
course of the loop, to help readability.

Make the iteration variable unsigned.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/xen-hvmctx: drop bogus casts from dump_lapic_regs()
Jan Beulich [Thu, 4 Oct 2018 12:55:01 +0000 (14:55 +0200)]
tools/xen-hvmctx: drop bogus casts from dump_lapic_regs()

The casts weren't even to the right type - all LAPIC registers are
32-bit (pairs/groups of registers may be combined to form larger logical
ones, but this is not visible in the given data representation).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/xen-hvmctx: drop bogus casts from dump_cpu()
Jan Beulich [Thu, 4 Oct 2018 12:54:48 +0000 (14:54 +0200)]
tools/xen-hvmctx: drop bogus casts from dump_cpu()

Also avoid printing the MSR flags (they're always zero as of commit
2f1add6e1c "x86/vmx: Don't leak host syscall MSR state into HVM
guests"), and print FPU registers only when the respective flag
indicates the space holds valid data.

Adjust format specifiers a little at the same time, in particular to
avoid printing at least some leading zeros when the positions
can't ever be non-zero. This helps readability in my opinion.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agovtd: add missing check for shared EPT...
Paul Durrant [Thu, 4 Oct 2018 12:53:57 +0000 (14:53 +0200)]
vtd: add missing check for shared EPT...

...in intel_iommu_unmap_page().

This patch also includes some non-functional modifications in
intel_iommu_map_page().

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agoiommu: introduce the concept of DFN...
Paul Durrant [Thu, 4 Oct 2018 12:50:41 +0000 (14:50 +0200)]
iommu: introduce the concept of DFN...

...meaning 'device DMA frame number' i.e. a frame number mapped in the IOMMU
(rather than the MMU) and hence used for DMA address translation.

This patch is a largely cosmetic change that replaces the terms 'gfn'
and 'gaddr' with 'dfn' and 'daddr' in all the places where the frame number
or address relates to a device rather than the CPU.

The parts that are not purely cosmetic are:

 - the introduction of a type-safe declaration of dfn_t and definition of
   INVALID_DFN to make the substitution of gfn_x(INVALID_GFN) mechanical.
 - the introduction of __dfn_to_daddr and __daddr_to_dfn (and type-safe
   variants without the leading __) with some use of the former.

Subsequent patches will convert code to make use of type-safe DFNs.
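
A rough sketch of what such a type-safe DFN could look like is given below.
The struct-wrapper approach, the _dfn()/dfn_x() helper names, the daddr_t
type and the 4k PAGE_SHIFT are assumptions for illustration; only
INVALID_DFN, __dfn_to_daddr and __daddr_to_dfn are named in the text above.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12            /* assumed 4k pages */

typedef uint64_t daddr_t;        /* device DMA address (assumed type name) */

/* Wrapping the number in a struct stops it mixing silently with a gfn/mfn. */
typedef struct { uint64_t dfn; } dfn_t;

static_assert(sizeof(dfn_t) == sizeof(uint64_t), "wrapper adds no overhead");

static inline dfn_t _dfn(uint64_t n)   { return (dfn_t){ n }; }
static inline uint64_t dfn_x(dfn_t d)  { return d.dfn; }

#define INVALID_DFN _dfn(~0ULL)

/* Raw conversions on plain integers... */
#define __dfn_to_daddr(dfn)   ((daddr_t)(dfn) << PAGE_SHIFT)
#define __daddr_to_dfn(daddr) ((daddr) >> PAGE_SHIFT)

/* ...and the type-safe variants without the leading underscores. */
#define dfn_to_daddr(dfn)     __dfn_to_daddr(dfn_x(dfn))
#define daddr_to_dfn(daddr)   _dfn(__daddr_to_dfn(daddr))
```

With the wrapper in place, passing a plain unsigned long where a dfn_t is
expected becomes a compile error rather than a silent mix-up.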

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
6 years agox86: fix "xpti=" and "pv-l1tf=" yet again
Jan Beulich [Thu, 4 Oct 2018 12:49:56 +0000 (14:49 +0200)]
x86: fix "xpti=" and "pv-l1tf=" yet again

While commit 2a3b34ec47 ("x86/spec-ctrl: Yet more fixes for xpti=
parsing") indeed fixed "xpti=dom0", it broke "xpti=no-dom0", in that
this then became equivalent to "xpti=no". In particular, the presence
of "xpti=" alone on the command line means nothing as to which default
is to be overridden; "xpti=no-dom0", for example, ought to have no
effect for DomU-s, as this is distinct from both "xpti=no-dom0,domu"
and "xpti=no-dom0,no-domu".
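
The intended semantics can be sketched with a toy parser. This is not
Xen's command line code: the struct, the function names and the use of -1
for "not specified, keep the default" are assumptions used to make the
behaviour concrete.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* -1: not specified (keep default), 0: off, 1: on. Hypothetical layout. */
struct xpti_opts {
    int hwdom;
    int domu;
};

static bool tok_is(const char *s, size_t len, const char *word)
{
    return strlen(word) == len && !strncmp(s, word, len);
}

/* Parse a comma-separated "xpti=" value; returns 0, or -1 on unknown input. */
int parse_xpti(const char *s, struct xpti_opts *o)
{
    int rc = 0;

    assert(s && o);
    o->hwdom = o->domu = -1;

    while ( *s )
    {
        size_t len = strcspn(s, ",");

        if ( tok_is(s, len, "no") )
            o->hwdom = o->domu = 0;
        else if ( tok_is(s, len, "yes") )
            o->hwdom = o->domu = 1;
        else if ( tok_is(s, len, "dom0") )
            o->hwdom = 1;
        else if ( tok_is(s, len, "no-dom0") )
            o->hwdom = 0;        /* must leave the DomU setting untouched */
        else if ( tok_is(s, len, "domu") )
            o->domu = 1;
        else if ( tok_is(s, len, "no-domu") )
            o->domu = 0;
        else
            rc = -1;

        s += len;
        if ( *s == ',' )
            s++;
    }

    return rc;
}
```

Under this model "no-dom0" leaves the DomU value at -1, so the built-in
default still applies to DomU-s, which is what distinguishes it from both
"no-dom0,domu" and "no-dom0,no-domu".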

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86: split opt_pv_l1tf
Jan Beulich [Thu, 4 Oct 2018 12:49:19 +0000 (14:49 +0200)]
x86: split opt_pv_l1tf

Use separate tracking variables for the hardware domain and DomU-s.

No functional change intended, but adjust the comment in
init_speculation_mitigations() to match prior as well as resulting code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86: split opt_xpti
Jan Beulich [Thu, 4 Oct 2018 12:48:18 +0000 (14:48 +0200)]
x86: split opt_xpti

Use separate tracking variables for the hardware domain and DomU-s.

No functional change intended.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoxentrace: handle sparse cpu ids correctly in xen trace buffer handling
Juergen Gross [Thu, 4 Oct 2018 11:47:24 +0000 (12:47 +0100)]
xentrace: handle sparse cpu ids correctly in xen trace buffer handling

The per-cpu buffers for Xentrace are addressed by cpu-id, but the info
array for the buffers is sized only by number of online cpus. This
might lead to crashes when using Xentrace with smt=0.

The t_info structure has to be sized based on nr_cpu_ids.
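
The mismatch can be shown with a small model. The helper names and the
online map are hypothetical; the point is only that an array sized by the
number of online cpus is too short once cpu ids are sparse.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Number of online cpus in a map covering ids 0 .. nr_cpu_ids - 1. */
size_t num_online(const bool *online, size_t nr_cpu_ids)
{
    size_t n = 0;

    assert(online);
    for ( size_t cpu = 0; cpu < nr_cpu_ids; cpu++ )
        n += online[cpu];

    return n;
}

/*
 * Lowest online cpu id that would index past an array of array_len
 * entries, or nr_cpu_ids if every online cpu id fits.
 */
size_t first_overflowing_cpu(const bool *online, size_t nr_cpu_ids,
                             size_t array_len)
{
    assert(online);
    for ( size_t cpu = 0; cpu < nr_cpu_ids; cpu++ )
        if ( online[cpu] && cpu >= array_len )
            return cpu;

    return nr_cpu_ids;
}
```

With every second cpu offline (as with smt=0 on a two-thread-per-core
system), sizing by the online count leaves the upper half of the cpu ids
out of bounds, while sizing by nr_cpu_ids covers all of them.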

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agoxentrace: allow sparse cpu list
Juergen Gross [Thu, 4 Oct 2018 11:47:23 +0000 (12:47 +0100)]
xentrace: allow sparse cpu list

Modify the xentrace utility to allow a sparse cpu list, resulting in not
all possible cpus having a trace buffer allocated.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agotools/libxl: Switch Arm guest type to PVH
Julien Grall [Mon, 1 Oct 2018 18:57:21 +0000 (19:57 +0100)]
tools/libxl: Switch Arm guest type to PVH

Currently, the toolstack always considers Arm guests to be PV. However,
they are very similar to PVH because HW virtualization extensions are used
and QEMU is not started. So switch the Arm guest type to PVH.

To keep compatibility with toolstacks creating Arm guests with PV type
(e.g. libvirt), libxl will now convert those guests to PVH.

Furthermore, the default type for Arm in xl will now be PVH to allow a
smooth transition for users.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/libxl: Deprecate PV fields kernel, ramdisk, cmdline
Julien Grall [Mon, 1 Oct 2018 18:57:19 +0000 (19:57 +0100)]
tools/libxl: Deprecate PV fields kernel, ramdisk, cmdline

The PV fields kernel, ramdisk, cmdline are only there for compatibility
with old toolstacks. Instead of manually copying them over to their new
field, use the deprecated_by attribute in the IDL.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools/libxl: Rename libxl__arch_domain_build_info_acpi_setdefault to...
Julien Grall [Mon, 1 Oct 2018 18:57:17 +0000 (19:57 +0100)]
tools/libxl: Rename libxl__arch_domain_build_info_acpi_setdefault to...

libxl__arch_domain_build_info_setdefault

A follow-up will need to modify the defaults of multiple fields of
build_info. So rename the function accordingly.

No functional change.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen/arm: vgic-v3: Don't create empty re-distributor regions
Julien Grall [Mon, 1 Oct 2018 16:42:27 +0000 (17:42 +0100)]
xen/arm: vgic-v3: Don't create empty re-distributor regions

At the moment, Xen assumes the hardware domain will have the same
number of re-distributor regions as the host. However, as the
number of CPUs or the stride (e.g. on GICv4) may be different, we end up
exposing regions which do not contain any re-distributors.

When booting, Linux will go through all the re-distributor regions to
check whether a property (e.g. vLPIs) is available across all the
re-distributors. This will result in a data abort on empty regions
because there are no underlying re-distributors.

So we need to limit the number of regions exposed to the hardware
domain. The code is reworked to only expose the minimum number of regions
required by the hardware domain. It is assumed the regions will be
populated starting from the first one.

Lastly, rename vgic_v3_rdist_count to reflect the value returned by the
helper.

Reported-by: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: vgic-v3: Delay the initialization of the domain information
Julien Grall [Mon, 1 Oct 2018 16:42:26 +0000 (17:42 +0100)]
xen/arm: vgic-v3: Delay the initialization of the domain information

A follow-up patch will need to know the number of vCPUs when
initializing the vGICv3 domain structure. However, this information is
not available at domain creation. It is only known once
XEN_DOMCTL_max_vcpus is called for that domain.

In order to have the maximum number of vCPUs available, delay the domain
part of the vGIC v3 initialization until the first vCPU of the domain is
initialized.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi@huawei.com>
Acked-but-disliked-by: Stefano Stabellini <sstabellini@kernel.org>