xenbits.xensource.com Git - people/dwmw2/xen.git/log
people/dwmw2/xen.git
6 years ago tools/libxl: correct vcpu affinity output with sparse physical cpu map
Juergen Gross [Fri, 31 Aug 2018 15:22:04 +0000 (17:22 +0200)]
tools/libxl: correct vcpu affinity output with sparse physical cpu map

With not all physical cpus online (e.g. with smt=0) the output of the
vcpu affinities is wrong, as the affinity bitmaps are capped after
nr_cpus bits, instead of using max_cpu_id.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
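
The effect of capping at the wrong bound can be sketched as follows. This is
a minimal illustration and not libxl code: format_cpumap(), the topology
constants and the example bitmap are hypothetical names chosen for the
sketch, assuming 8 CPU ids of which only the even ones are online.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write one '0'/'1' character per CPU id in
 * [0, nr_bits) into buf, which must hold nr_bits + 1 bytes. */
void format_cpumap(const uint8_t *map, unsigned int nr_bits, char *buf)
{
    assert(map != NULL && buf != NULL);
    for (unsigned int i = 0; i < nr_bits; i++)
        buf[i] = (map[i / 8] & (1u << (i % 8))) ? '1' : '0';
    buf[nr_bits] = '\0';
}

/* Assumed example topology: ids 0..7 exist, only 0, 2, 4, 6 are online. */
enum { NR_ONLINE_CPUS = 4, MAX_CPU_ID = 7 };

/* Affinity to CPUs 0, 2, 4 and 6 (bits 0, 2, 4, 6 set). */
const uint8_t example_affinity[1] = { 0x55 };
```

Printing NR_ONLINE_CPUS bits silently drops CPUs 4 and 6 from the output,
whereas printing MAX_CPU_ID + 1 bits covers every id that can appear in the
map.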
6 years ago libxl: create control/sysrq xenstore node
Vitaly Kuznetsov [Tue, 4 Sep 2018 11:39:29 +0000 (13:39 +0200)]
libxl: create control/sysrq xenstore node

'xl sysrq' command doesn't work with modern Linux guests with the following
message in guest's log:

 xen:manage: sysrq_handler: Error -13 writing sysrq in control/sysrq

xenstore trace confirms:

 IN 0x24bd9a0 20180904 04:36:32 WRITE (control/sysrq )
 OUT 0x24bd9a0 20180904 04:36:32 ERROR (EACCES )

The problem seems to be in the fact that we don't pre-create control/sysrq
xenstore node and libxl_send_sysrq() doing libxl__xs_printf() creates it as
read-only. As we want to allow guests to clean 'control/sysrq' after the
requested action is performed, we need to make this node writable.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/xl: fix output of xl vcpu-pin dry run with smt=0
Juergen Gross [Mon, 3 Sep 2018 11:26:30 +0000 (13:26 +0200)]
tools/xl: fix output of xl vcpu-pin dry run with smt=0

Fix another smt=0 fallout: xl -N vcpu-pin prints only parts of the
affinities as it is using the number of online cpus instead of the
maximum cpu number.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86: monitor.o is currently HVM only
Wei Liu [Tue, 4 Sep 2018 16:15:21 +0000 (17:15 +0100)]
x86: monitor.o is currently HVM only

There has been a plan to make PV work, but it is not yet there.  Provide
stubs to make it build with !CONFIG_HVM.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
6 years ago x86: change name of parameter for various invlpg functions
Wei Liu [Tue, 4 Sep 2018 16:15:18 +0000 (17:15 +0100)]
x86: change name of parameter for various invlpg functions

They all incorrectly named a parameter virtual address while it should
have been linear address.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago xen/domain: Make rangeset_domain_destroy() idempotent
Andrew Cooper [Mon, 3 Sep 2018 12:56:55 +0000 (13:56 +0100)]
xen/domain: Make rangeset_domain_destroy() idempotent

... and move it into the common __domain_destroy() path.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Fold xsm_free_security_domain() paths together
Andrew Cooper [Mon, 3 Sep 2018 11:48:13 +0000 (12:48 +0100)]
xen/domain: Fold xsm_free_security_domain() paths together

xsm_free_security_domain() is idempotent (both the dummy handler, and the
flask handler).  Move it into the shared __domain_destroy() path, and drop the
INIT_xsm flag from domain_create()

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Call lock_profile_deregister_struct() from common code
Andrew Cooper [Mon, 3 Sep 2018 11:10:48 +0000 (12:10 +0100)]
xen/domain: Call lock_profile_deregister_struct() from common code

lock_profile_register_struct() is called from common code, but the matching
deregister was previously only called from x86 code.

The practical upshot of this is that, when using CONFIG_LOCK_PROFILE, destroyed domains
on ARM (and in particular, the freed page behind struct domain) remain on the
lockprofile linked list, which will become corrupt when the page is reused.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Break _domain_destroy() out of domain_create() and complete_domain_destroy()
Andrew Cooper [Mon, 3 Sep 2018 10:52:17 +0000 (11:52 +0100)]
xen/domain: Break _domain_destroy() out of domain_create() and complete_domain_destroy()

This is the first step in making the destroy path idempotent, and using it in
place of the ad-hoc cleanup paths in the create path.

To begin with, the trivial free operations are broken out.  The rest of the
cleanup code will be moved as it is demonstrated (or made) to be idempotent.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
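
The idea of an idempotent destroy path reused by the create path can be
sketched as below. This is only an illustration under simplifying
assumptions: struct toy_domain, its two resources and the fail_at parameter
are hypothetical and do not correspond to the real struct domain or to
Xen's allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much-simplified stand-in for struct domain. */
struct toy_domain {
    int *evtchn;   /* first resource */
    int *grants;   /* second resource */
};

/* Idempotent teardown: safe on a partially constructed object and safe to
 * call more than once, because free(NULL) is a no-op and each pointer is
 * cleared after it is released. */
void toy_domain_destroy(struct toy_domain *d)
{
    assert(d != NULL);
    free(d->evtchn);
    d->evtchn = NULL;
    free(d->grants);
    d->grants = NULL;
}

/* Construction uses the same teardown for every failure, instead of one
 * hand-written unwind sequence per failure point. fail_at (1 or 2) forces
 * the corresponding step to fail; 0 means no forced failure. */
int toy_domain_create(struct toy_domain *d, int fail_at)
{
    assert(d != NULL);
    d->evtchn = NULL;
    d->grants = NULL;

    d->evtchn = (fail_at == 1) ? NULL : calloc(16, sizeof(*d->evtchn));
    if (!d->evtchn)
        goto fail;

    d->grants = (fail_at == 2) ? NULL : calloc(16, sizeof(*d->grants));
    if (!d->grants)
        goto fail;

    return 0;

 fail:
    toy_domain_destroy(d);
    return -1;
}
```

Because the teardown tolerates any intermediate state, adding a new
construction step only requires a matching line in the destroy function,
not a new error label.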
6 years ago xen/domain: Prepare data for is_{pv,hvm}_domain() as early as possible
Andrew Cooper [Mon, 3 Sep 2018 13:22:16 +0000 (14:22 +0100)]
xen/domain: Prepare data for is_{pv,hvm}_domain() as early as possible

Given two subtle failures from getting this wrong before, and more cleanup on
the way, move the setting of d->guest_type as early as possible.

Note that despite moving the assignment of d->guest_type outside of the
is_idle_domain(d) check, it still behaves the same.  Previously, system
domains had no direct assignment of d->guest_type and behaved as PV guests
because guest_type_pv has the value 0.

While tidying up the predicate, leave a comment referring to
is_system_domain(), and move the associated ASSERT() to be beside the
assignment.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
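
The remark about guest_type_pv having the value 0 can be illustrated with a
small sketch. The toy_ names below are hypothetical stand-ins; the only
property taken from the message above is that the PV enumerator is 0, so a
zero-filled structure reads as a PV guest until something assigns the type.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of the guest type enumeration: the PV value is 0. */
enum toy_guest_type { toy_guest_type_pv = 0, toy_guest_type_hvm = 1 };

struct toy_domain {
    enum toy_guest_type guest_type;
    bool is_system;
};

bool toy_is_pv_domain(const struct toy_domain *d)
{
    assert(d != NULL);
    return d->guest_type == toy_guest_type_pv;
}

/* Zero-fill a domain, as a zeroing allocator would, and optionally set
 * the type explicitly. */
void toy_domain_init(struct toy_domain *d, bool set_hvm)
{
    assert(d != NULL);
    memset(d, 0, sizeof(*d));
    if (set_hvm)
        d->guest_type = toy_guest_type_hvm;
}
```

This is why any predicate evaluated before the assignment gives a plausible
but possibly wrong answer, and why setting the field as early as possible
removes a class of subtle ordering bugs.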
6 years ago x86emul: clean up AVX2 insn use in test harness
Jan Beulich [Tue, 4 Sep 2018 09:30:29 +0000 (11:30 +0200)]
x86emul: clean up AVX2 insn use in test harness

Drop the pretty pointless conditionals from code testing AVX insns and
properly use AVX2 mnemonics in code testing AVX2 insns (the test harness
is already requiring sufficiently new a compiler/assembler).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86emul: extend MASKMOV{Q,DQU} tests
Jan Beulich [Tue, 4 Sep 2018 09:29:22 +0000 (11:29 +0200)]
x86emul: extend MASKMOV{Q,DQU} tests

While deriving the first AVX512 pieces from existing code I've got the
(in the end wrong) impression that the emulation of these insns would be
broken. Besides testing that the instructions act as no-ops when the
controlling mask bits are all zero, add ones to also check that the data
merging actually works.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86emul: fix FMA scalar operand sizes
Jan Beulich [Tue, 4 Sep 2018 09:28:30 +0000 (11:28 +0200)]
x86emul: fix FMA scalar operand sizes

FMA insns, unlike the earlier AVX additions, don't use the low opcode
bit to distinguish between single and double vector elements. While the
difference is benign for packed flavors, the scalar ones need to use
VEX.W here. Oddly enough the table entries didn't even use
simd_scalar_fp, but uniformly used simd_packed_fp (implying the
distinction was by [VEX-encoded] opcode prefix).

Split simd_scalar_fp into simd_scalar_opc and simd_scalar_vexw, and
correct FMA scalar table entries to use the latter.

Also correct the scalar insn comments (they only ever use XMM registers
as operands).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago hvmloader: set entry point in linker script
Roger Pau Monné [Tue, 4 Sep 2018 09:27:41 +0000 (11:27 +0200)]
hvmloader: set entry point in linker script

Or else it defaults to using 0x100000 as the entry point, which might
or might not point to _start. This is a fix for 09b3907f93.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: Fix mapping corner case during task switching
Andrew Cooper [Wed, 1 Aug 2018 13:48:33 +0000 (13:48 +0000)]
x86/hvm: Fix mapping corner case during task switching

hvm_map_entry() can fail for a number of reasons, including for a misaligned
LDT/GDT access which crosses a 4K boundary.  Architecturally speaking, this
should be fixed, but Long Mode doesn't support task switches, and no 32bit OS
is going to misalign its LDT/GDT base, which is why this task isn't very high
on the TODO list.

However, the hvm_map_fail error label returns failure without raising an
exception, which interferes with hvm_task_switch()'s exception tracking, and
can cause it to finish and return to guest context as if the task switch had
completed successfully.

Resolve this corner case by folding all the failure paths together, which
causes an hvm_map_entry() failure to result in #TS[SEL].  hvm_unmap_entry()
copes fine with a NULL pointer so can be called unconditionally.

In practice, this is just a latent corner case as all hvm_map_entry() failures
crash the domain, but it should be fixed nevertheless.

Finally, rename hvm_load_segment_selector() to task_switch_load_seg() to avoid
giving the impression that it is usable for general segment loading.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/mm: Drop {HAP,SHADOW}_ERROR() wrappers
Andrew Cooper [Wed, 24 Jan 2018 16:43:55 +0000 (16:43 +0000)]
x86/mm: Drop {HAP,SHADOW}_ERROR() wrappers

Unlike the PRINTK/DEBUG wrappers, these go straight out to the console, rather
than ending up in the debugtrace buffer.

A number of these users are followed by domain_crash(), and future changes
will want to combine the printk() into the domain_crash() call.  Expand these
wrappers in place, using XENLOG_ERR before a BUG(), and XENLOG_G_ERR before a
domain_crash().

Perform some %pv/PRI_mfn/etc cleanup while modifying the invocations, and
explicitly drop some calls which are unnecessary (bad shadow op, and the empty
stubs for incorrect sh_map_and_validate_gl?e() calls).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years ago xen/x86: Ignore the automatically generated include/asm-x86/asm-macros.h
Andrew Cooper [Mon, 3 Sep 2018 16:45:52 +0000 (17:45 +0100)]
xen/x86: Ignore the automatically generated include/asm-x86/asm-macros.h

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago The hvmloader binary generated when using LLVM LD doesn't work
Roger Pau Monné [Mon, 3 Sep 2018 15:54:12 +0000 (17:54 +0200)]
The hvmloader binary generated when using LLVM LD doesn't work
properly and seems to get stuck while trying to generate and load the
ACPI tables. This is caused by the layout of the binary when linked
with LLVM LD.

LLVM LD has a different default linker script than GNU LD, and the
resulting hvmloader binary is slightly different:

LLVM LD:
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x000ff034 0x000ff034 0x00060 0x00060 R   0x4
  LOAD           0x000000 0x000ff000 0x000ff000 0x38000 0x38000 RWE 0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0

GNU LD:
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000080 0x00100000 0x00100000 0x36308 0x3fd74 RWE 0x10
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

Note that in the LLVM LD case (as with GNU LD) the .text section does
indeed have the address set to 0x100000 as requested on the command
line:

[ 1] .text             PROGBITS        00100000 001000 00dd10 00  AX  0   0 16

There's however the PHDR which is not present when using GNU LD.

Fix this by using a very simple linker script that generates the same
binary regardless of whether LLVM or GNU LD is used. By using a linker
script the usage of -Ttext can also be avoided by placing the desired
.text load address directly in the linker script.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/boot: silence MADT table entry logging
Jan Beulich [Mon, 3 Sep 2018 15:51:40 +0000 (17:51 +0200)]
x86/boot: silence MADT table entry logging

Logging disabled LAPIC / x2APIC entries with invalid local APIC IDs
(ones having "broadcast" meaning when used) isn't very useful, and can
be quite noisy on larger systems. Suppress their logging unless
opt_cpu_info is true.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: assorted array_index_nospec() insertions
Jan Beulich [Mon, 3 Sep 2018 15:50:10 +0000 (17:50 +0200)]
x86: assorted array_index_nospec() insertions

Don't chance having Spectre v1 (including BCBS) gadgets. In some of the
cases the insertions are more of precautionary nature rather than there
provably being a gadget, but I think we should err on the safe (secure)
side here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
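
For readers unfamiliar with the technique, the general shape of an index
clamp against Spectre v1 can be sketched as follows. This is not Xen's
array_index_nospec() implementation: the toy_ functions are hypothetical,
and the portable mask computation used here is an assumption made for
readability. Production implementations are written (often with
architecture-specific assembly) so that the compiler cannot turn the mask
back into a conditional branch.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: all-ones when idx < size, zero otherwise. This
 * portable form gives no guarantee of being compiled without a branch. */
size_t toy_index_mask(size_t idx, size_t size)
{
    return (size_t)0 - (size_t)(idx < size);
}

/* Clamp idx so that even a mispredicted bounds check cannot index
 * outside [0, size): out-of-range values become 0. */
size_t toy_index_nospec(size_t idx, size_t size)
{
    return idx & toy_index_mask(idx, size);
}

/* Typical use: architectural bounds check first, then the clamp on the
 * value actually used for the memory access. */
int toy_lookup(const int *table, size_t size, size_t idx)
{
    assert(table != NULL);
    if (idx >= size)
        return -1;
    return table[toy_index_nospec(idx, size)];
}
```

The clamp is applied in addition to the ordinary bounds check, not instead
of it: the check provides architectural correctness, while the mask limits
what a speculatively executed load can reach.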
6 years ago xen/arm: Fix dom0 boot following c/s 580c45869
Andrew Cooper [Fri, 31 Aug 2018 18:01:25 +0000 (19:01 +0100)]
xen/arm: Fix dom0 boot following c/s 580c45869

c/s 580c45869 "Call arch_domain_create() as early as possible in
domain_create()" overlooked the fact that ARM uses is_hardware_domain() in at
least two places during arch_domain_create().

The bug manifests as:

  (XEN) Freed 292kB init memory.
  (XEN) traps.c:2017:d0v0 HSR=0x938c0007 pc=0xc0639d08 gva=0xe0800004 gpa=0x00000010481004

when dom0 tries to use the vuart.  Judging by other uses of
is_hardware_domain(), I expect the x86 PVH dom0 boot is similarly broken.

Reposition the code which sets up hardware_domain so that the
is_hardware_domain() predicate works correctly all the way through domain
creation.

While moving it, leave a related comment explaining the positioning of the
is_priv assignment, which in hindsight should have been part of c/s ef765ec98
when exactly the same problem was discovered for the is_control_domain()
predicate.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
6 years ago x86/hvm: Drop hvm_{vmx,svm} shorthands
Andrew Cooper [Tue, 28 Aug 2018 16:00:36 +0000 (16:00 +0000)]
x86/hvm: Drop hvm_{vmx,svm} shorthands

By making {vmx,svm} in hvm_vcpu into an anonymous union (consistent with
domain side of things), the hvm_{vmx,svm} defines can be dropped, and all code
refer to the correctly-named fields.  This means that the data hierarchy is no
longer obscured from grep/cscope/tags/etc.

Reformat one comment and switch one bool_t to bool while making changes.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago x86/svm: Rename arch_svm_struct to svm_vcpu
Andrew Cooper [Tue, 28 Aug 2018 15:59:28 +0000 (15:59 +0000)]
x86/svm: Rename arch_svm_struct to svm_vcpu

The suffix and prefix are redundant, and the name is curiously odd.  Rename it
to svm_vcpu to be consistent with all the other similar structures.  In
addition, rename arch_svm local variables to svm for further
consistency.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago x86/vmx: Rename arch_vmx_struct to vmx_vcpu
Andrew Cooper [Tue, 28 Aug 2018 15:53:06 +0000 (15:53 +0000)]
x86/vmx: Rename arch_vmx_struct to vmx_vcpu

The suffix and prefix are redundant, and the name is curiously odd.  Rename it
to vmx_vcpu to be consistent with all the other similar structures.  In
addition, rename arch_vmx local variables to vmx for further
consistency.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
CC: Roger Pau Monné <roger.pau@citrix.com>
Some of the local pointers are named arch_vmx.  I'm open to renaming them to
just vmx (like all the other local pointers) if people are happy with the
additional patch delta.

6 years ago x86/hvm: Rename v->arch.hvm_vcpu to v->arch.hvm
Andrew Cooper [Tue, 28 Aug 2018 15:52:34 +0000 (15:52 +0000)]
x86/hvm: Rename v->arch.hvm_vcpu to v->arch.hvm

The trailing _vcpu suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate.  No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago xen/hvm: Rename d->arch.hvm_domain to d->arch.hvm
Andrew Cooper [Tue, 28 Aug 2018 15:50:41 +0000 (15:50 +0000)]
xen/hvm: Rename d->arch.hvm_domain to d->arch.hvm

The trailing _domain suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate, and switch to using the new XFREE/etc wrappers
where applicable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago xen/domain: Allocate d->vcpu[] in domain_create()
Andrew Cooper [Mon, 19 Mar 2018 17:07:50 +0000 (17:07 +0000)]
xen/domain: Allocate d->vcpu[] in domain_create()

For ARM, the call to arch_domain_create() needs to have completed before
domain_max_vcpus() will return the correct upper bound.

For each arch's dom0's, drop the temporary max_vcpus parameter, and allocation
of dom0->vcpu.

With d->max_vcpus now correctly configured before evtchn_init(), the poll mask
can be constructed suitably for the domain, rather than for the worst-case
setting.

Due to the evtchn_init() fixes, it no longer calls domain_max_vcpus(), and
ARM's two implementations of vgic_max_vcpus() no longer need to work around the
out-of-order call.

From this point on, d->max_vcpus and d->vcpu[] are valid for any domain which
can be looked up by domid.

The XEN_DOMCTL_max_vcpus hypercall is modified to reject any call attempt with
max != d->max_vcpus, which does match the older semantics (not that it is
obvious from the code).  The logic to allocate d->vcpu[] is dropped, but at
this point the hypercall still needs making to allocate each vcpu.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
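
The resulting split of responsibilities can be sketched as below. This is a
simplified model and not the hypervisor code: the toy_ types and functions
are hypothetical, standard calloc() stands in for Xen's allocators, and
error unwinding is reduced to the minimum.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_vcpu { unsigned int id; };

struct toy_domain {
    unsigned int max_vcpus;
    struct toy_vcpu **vcpu;   /* max_vcpus slots, each NULL until allocated */
};

/* The vcpu pointer array is sized at creation time, so it is never NULL
 * for a domain that other code can look up. */
struct toy_domain *toy_domain_create(unsigned int max_vcpus)
{
    struct toy_domain *d;

    if (max_vcpus == 0)
        return NULL;
    d = calloc(1, sizeof(*d));
    if (!d)
        return NULL;
    d->vcpu = calloc(max_vcpus, sizeof(*d->vcpu));
    if (!d->vcpu) {
        free(d);
        return NULL;
    }
    d->max_vcpus = max_vcpus;
    return d;
}

/* The later call may only confirm the value chosen at creation; it then
 * allocates the individual vcpus. */
int toy_set_max_vcpus(struct toy_domain *d, unsigned int max)
{
    assert(d != NULL);
    if (max != d->max_vcpus)
        return -EINVAL;
    for (unsigned int i = 0; i < max; i++) {
        if (d->vcpu[i])
            continue;
        d->vcpu[i] = calloc(1, sizeof(*d->vcpu[i]));
        if (!d->vcpu[i])
            return -ENOMEM;
        d->vcpu[i]->id = i;
    }
    return 0;
}
```

With this shape, code that indexes the array only has to cope with NULL
slots, never with a NULL array.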
6 years ago xen/dom0: Arrange for dom0_cfg to contain the real max_vcpus value
Andrew Cooper [Mon, 19 Mar 2018 17:28:50 +0000 (17:28 +0000)]
xen/dom0: Arrange for dom0_cfg to contain the real max_vcpus value

Make dom0_max_vcpus() a common interface, and implement it on ARM by splitting
the existing alloc_dom0_vcpu0() function in half.

As domain_create() doesn't yet set up the vcpu array, the max value is also
passed into alloc_dom0_vcpu0().  This is temporary for bisectibility and
removed in the following patch.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago tools: Pass max_vcpus to XEN_DOMCTL_createdomain
Andrew Cooper [Tue, 27 Feb 2018 17:39:37 +0000 (17:39 +0000)]
tools: Pass max_vcpus to XEN_DOMCTL_createdomain

XEN_DOMCTL_max_vcpus is a mandatory hypercall, but nothing actually prevents a
toolstack from unpausing a domain with no vcpus.

Originally, d->vcpu[] was an embedded array in struct domain, but c/s
fb442e217 "x86_64: allow more vCPU-s per guest" in Xen 4.0 altered it to being
dynamically allocated.  A side effect of this is that d->vcpu[] is NULL until
XEN_DOMCTL_max_vcpus has completed, but a lot of hypercalls blindly
dereference it.

Even today, the behaviour of XEN_DOMCTL_max_vcpus is a mandatory singleton
call which can't change the number of vcpus once a value has been chosen.

In preparation to remove the hypercall, extend xen_domctl_createdomain with
a max_vcpus field and arrange for all callers to pass the appropriate
value.  There is no change in construction behaviour yet, but later patches
will rearrange the hypervisor internals.

For the python stubs, extend the domain_create keyword list to take a
max_vcpus parameter, in lieu of deleting the pyxc_domain_max_vcpus function.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Call arch_domain_create() as early as possible in domain_create()
Andrew Cooper [Mon, 19 Mar 2018 16:50:46 +0000 (16:50 +0000)]
xen/domain: Call arch_domain_create() as early as possible in domain_create()

This is in preparation to set up d->max_vcpus and d->vcpu[] in domain_create(),
and allow later parts of domain construction to have access to the values.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago xen/gnttab: Fold grant_table_{create,set_limits}() into grant_table_init()
Andrew Cooper [Mon, 19 Mar 2018 16:06:24 +0000 (16:06 +0000)]
xen/gnttab: Fold grant_table_{create,set_limits}() into grant_table_init()

Now that the max_{grant,maptrack}_frames are specified from the very beginning
of grant table construction, the various initialisation functions can be
folded together and simplified as a result.

Leave grant_table_init() as the public interface, which is more consistent
with other subsystems.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago xen/domctl: Remove XEN_DOMCTL_set_gnttab_limits
Andrew Cooper [Tue, 27 Feb 2018 17:39:37 +0000 (17:39 +0000)]
xen/domctl: Remove XEN_DOMCTL_set_gnttab_limits

Now that XEN_DOMCTL_createdomain handles the grant table limits, remove
XEN_DOMCTL_set_gnttab_limits (including XSM hooks and libxc wrappers).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago xen/gnttab: Pass max_{grant,maptrack}_frames into grant_table_create()
Andrew Cooper [Mon, 19 Mar 2018 11:19:52 +0000 (11:19 +0000)]
xen/gnttab: Pass max_{grant,maptrack}_frames into grant_table_create()

... rather than setting the limits up after domain_create() has completed.

This removes the common gnttab infrastructure for calculating the number of
dom0 grant frames (as the common grant table code is not an appropriate place
for it to live), opting instead to require the dom0 construction code to pass
a sane value in via the configuration.

In practice, this now means that there is never a partially constructed grant
table for a reference-able domain.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years ago tools: Pass grant table limits to XEN_DOMCTL_set_gnttab_limits
Andrew Cooper [Tue, 27 Feb 2018 17:39:37 +0000 (17:39 +0000)]
tools: Pass grant table limits to XEN_DOMCTL_set_gnttab_limits

XEN_DOMCTL_set_gnttab_limits is a fairly new hypercall, and is strictly
mandatory.  As it pertains to domain limits, it should be provided at
createdomain time.

In preparation to remove the hypercall, extend xen_domctl_createdomain with
the fields and arrange for all callers to pass appropriate details.  There is
no change in construction behaviour yet, but later patches will rearrange the
hypervisor internals, then delete the hypercall.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/pv: Deprecate support for paging out the LDT
Andrew Cooper [Tue, 3 Oct 2017 10:18:37 +0000 (11:18 +0100)]
x86/pv: Deprecate support for paging out the LDT

This code is believed to be a vestigial remnant of the PV Windows XP port.  It
is not used by Linux, NetBSD, Solaris or MiniOS.  Furthermore the
implementation is incomplete; it only functions for a present => not-present
transition, rather than a present => read/write transition.

The for_each_vcpu() is one scalability limitation for PV guests, which can't
reasonably be altered to be continuable.  Most importantly, however, this is
the only codepath which plays with descriptor frames of a remote vcpu.

A side effect of dropping support for paging the LDT out is that the LDT no
longer automatically cleans itself up on domain destruction.  Cover this by
explicitly releasing the LDT frames at the same time as the GDT frames.

Finally, leave some asserts around to confirm the expected behaviour of all
the functions playing with PGT_seg_desc_page references.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/pv: Rename v->arch.pv_vcpu to v->arch.pv
Andrew Cooper [Tue, 28 Aug 2018 15:50:27 +0000 (15:50 +0000)]
x86/pv: Rename v->arch.pv_vcpu to v->arch.pv

The trailing _vcpu suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate, and switch to using the new XFREE/etc wrappers
where applicable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/pv: Rename d->arch.pv_domain to d->arch.pv
Andrew Cooper [Tue, 28 Aug 2018 15:49:09 +0000 (15:49 +0000)]
x86/pv: Rename d->arch.pv_domain to d->arch.pv

The trailing _domain suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate, and switch to using the new XFREE/etc wrappers
where applicable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/genapic: drop .target_cpus() hook
Jan Beulich [Thu, 30 Aug 2018 09:08:19 +0000 (11:08 +0200)]
x86/genapic: drop .target_cpus() hook

All flavors specify target_cpus_all() anyway - replace use of the hook
by &cpu_online_map.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/grant: mute gcc 4.1.x warning in steal_linear_address()
Zhenzhong Duan [Thu, 30 Aug 2018 09:05:01 +0000 (11:05 +0200)]
x86/grant: mute gcc 4.1.x warning in steal_linear_address()

Move the reference to ol1e ahead, or else we see the warning below.

cc1: warnings being treated as errors
grant_table.c: In function 'replace_grant_pv_mapping':
grant_table.c:142: warning: 'ol1e.l1' may be used uninitialized in this function

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/alternatives: allow using assembler macros in favor of C ones
Jan Beulich [Thu, 30 Aug 2018 09:03:47 +0000 (11:03 +0200)]
x86/alternatives: allow using assembler macros in favor of C ones

As was validly pointed out as motivation for similar Linux side changes
(https://lkml.org/lkml/2018/6/22/677), using long sequences of
directives and auxiliary instructions, like is commonly the case when
setting up an alternative patch site, gcc can be misled into believing
an asm() to be more heavy weight than it really is. By presenting it
with an assembler macro invocation instead, this can be avoided.

Initially I wanted to outright change the C macros ALTERNATIVE() and
ALTERNATIVE_2() to invoke the respective assembler ones, but doing so
would require quite a bit of cleanup of some use sites, because of the
extra necessary quoting combined with the need that each assembler macro
argument must consist of just a single string literal. We can consider
working towards that subsequently.

For now, set the stage of using the assembler macros here by providing a
new generated header, being the slightly massaged pre-processor output
of (for now just) alternative-asm.h. The massaging is primarily to be
able to properly track the build dependency: For this, we need the C
compiler to see the inclusion, which means we shouldn't directly use an
asm(". include ...") directive.

The dependency added to asm-offsets.s is not a true one; it's just the
easiest approach I could think of to make sure the new header gets
generated early on, without having to fiddle with xen/Makefile (and
introducing some x86-specific construct there).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago VMX: reduce number of posted-interrupt hooks
Jan Beulich [Thu, 30 Aug 2018 09:02:09 +0000 (11:02 +0200)]
VMX: reduce number of posted-interrupt hooks

Three of the four hooks are not exposed outside of vmx.c, and all of
them have only a single possible non-NULL value. So there's no reason to
use hooks here - a simple set of flag indicators is sufficient (and we
don't even need a flag for the VM entry one, as it's always
(de-)activated together with the vCPU blocking hook, which needs to
remain an actual function pointer). This matters all the more now that,
with the Spectre v2 workarounds, indirect calls have become more expensive.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years ago x86/mm: re-arrange get_page_from_l<N>e() vs pv_l1tf_check_l<N>e()
Jan Beulich [Thu, 30 Aug 2018 09:01:02 +0000 (11:01 +0200)]
x86/mm: re-arrange get_page_from_l<N>e() vs pv_l1tf_check_l<N>e()

Restore symmetry between get_page_from_l<N>e(): pv_l1tf_check_l<N>e() is
now uniformly invoked from outside of them. They're no longer getting
called for non-present PTEs. This way the slightly odd three-way return
value meaning of the higher level ones can also be got rid of.

Leave an assertion in get_page_from_l1e() as the only non-static one of
the four siblings, to ensure that no new unguarded calls go unnoticed.

Introduce local variables holding the page table entries processed, and
use them throughout the loop bodies instead of re-reading them from the
page table several times.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/pt: split out HVM functions from vtd.c
Wei Liu [Sun, 26 Aug 2018 12:19:43 +0000 (13:19 +0100)]
x86/pt: split out HVM functions from vtd.c

Functions are moved to hvm.c. Reorder makefile items while at it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
6 years agox86/pt: make it build with !CONFIG_HVM
Wei Liu [Sun, 26 Aug 2018 12:19:42 +0000 (13:19 +0100)]
x86/pt: make it build with !CONFIG_HVM

This requires providing stubs for a few functions which are part of
HVM code.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
6 years agoxen/arm: fix SMMU driver build
Stefano Stabellini [Tue, 28 Aug 2018 23:47:40 +0000 (16:47 -0700)]
xen/arm: fix SMMU driver build

Add missing "CONFIG_". This build regression was introduced by commit
277aa3523d "arm: make it possible to disable the SMMU driver".

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
[julieng: Add the commit where the regression was introduced]
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agox86: reduce "visibility" of spec_ctrl_asm.h
Jan Beulich [Wed, 29 Aug 2018 14:32:17 +0000 (16:32 +0200)]
x86: reduce "visibility" of spec_ctrl_asm.h

Unlike indirect_thunk_asm.h, spec_ctrl_asm.h is a header generally
needed by assembly source files only. Avoid having all C sources have a
dependency on that header (the set of assembly sources now gaining a
dependency on the C header is much smaller and hence more acceptable).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86: move quoting of __ASM_{STAC,CLAC}
Jan Beulich [Wed, 29 Aug 2018 14:31:32 +0000 (16:31 +0200)]
x86: move quoting of __ASM_{STAC,CLAC}

Both consumers want them quoted, so quote them right away instead of
using __stringify() upon use. In the spirit of other recent additions
also make the assembly forms assembler macros, allowing the helper
#define-s to be #undef-ed subsequently.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/alternatives: fully leverage automatic NOP filling
Jan Beulich [Wed, 29 Aug 2018 14:30:54 +0000 (16:30 +0200)]
x86/alternatives: fully leverage automatic NOP filling

As of commit 4008c71d7a ("x86/alt: Support for automatic padding
calculations") there's no point having explicit ASM_NOPn instances in
alternatives anymore - drop them. As a result also drop the asm/nops.h
inclusion from alternative.h, adding explicit inclusions in the two
remaining C files needing them.

While touching it also move the CR4_PV32_RESTORE definition out of the
SMAP-specific conditional into a more general one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86: drop NO_XPTI synthetic feature
Jan Beulich [Wed, 29 Aug 2018 14:29:42 +0000 (16:29 +0200)]
x86: drop NO_XPTI synthetic feature

With there not being any patching done based on it, we don't need this.
Non-patching conditionals can use opt_xpti instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/spec-ctrl: split reporting for PV and HVM guests
Jan Beulich [Wed, 29 Aug 2018 14:28:52 +0000 (16:28 +0200)]
x86/spec-ctrl: split reporting for PV and HVM guests

Putting them on separate lines was suggested before, and is going to
become necessary eventually anyway as things get added here. Split them
now, and put the respective pieces in CONFIG_* conditionals at the same
time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years agox86: report use of PCID together with reporting XPTI status
Jan Beulich [Wed, 29 Aug 2018 14:28:01 +0000 (16:28 +0200)]
x86: report use of PCID together with reporting XPTI status

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/alt: Fix build when CONFIG_LIVEPATCH is disabled
Andrew Cooper [Wed, 29 Aug 2018 10:55:32 +0000 (11:55 +0100)]
x86/alt: Fix build when CONFIG_LIVEPATCH is disabled

c/s b28cd21c3628 "x86/build: Use new .nops directive when available"
introduced a __read_mostly boolean which is included if the toolchain supports
the .nops directive.

When CONFIG_LIVEPATCH is compiled out, alternative.o is expected to be a fully
init module, and toolchain_nops_are_ideal trips the build system check:

  Error: size of alternative.o:.data.read_mostly is 0x01
  /local/xen.git/xen/Rules.mk:206: recipe for target 'alternative.init.o' failed
  make[3]: *** [alternative.init.o] Error 12

Introduce init_or_livepatch_read_mostly and switch the annotation for
toolchain_nops_are_ideal.

Reported-by: Olaf Hering <olaf@aepfle.de>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
6 years agox86/build: Use new .nops directive when available
Andrew Cooper [Fri, 9 Feb 2018 12:47:58 +0000 (12:47 +0000)]
x86/build: Use new .nops directive when available

Newer versions of binutils are capable of emitting an exact number of bytes' worth
of optimised nops, which are P6 nops.  Use this in preference to .skip when
available.

Check at boot time whether the toolchain nops are the correct ones for the running
hardware, and skip optimising nops entirely when possible.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/shadow: Use mfn_t in shadow_track_dirty_vram()
Andrew Cooper [Fri, 20 Jul 2018 17:50:28 +0000 (17:50 +0000)]
x86/shadow: Use mfn_t in shadow_track_dirty_vram()

... as the only user of sl1mfn would prefer it that way.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/shadow: Clean up the MMIO fastpath helpers
Andrew Cooper [Fri, 20 Jul 2018 14:28:20 +0000 (15:28 +0100)]
x86/shadow: Clean up the MMIO fastpath helpers

Use bool when appropriate, remove extraneous brackets and fix up comment
style.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/shadow: Use MASK_* helpers for the MMIO fastpath PTE manipulation
Andrew Cooper [Fri, 20 Jul 2018 14:21:51 +0000 (15:21 +0100)]
x86/shadow: Use MASK_* helpers for the MMIO fastpath PTE manipulation

Drop the now-unused SH_L1E_MMIO_GFN_SHIFT definition.

No functional change.
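
For readers unfamiliar with the helpers, here is a minimal sketch of the
pattern, assuming MASK_EXTR()/MASK_INSR() definitions along the lines
shown. The field mask and the pte_*() wrappers are hypothetical and do
not reflect the real shadow MMIO encoding:

```c
#include <stdint.h>

/* Assumed definitions; the lowest set bit of the mask acts as the shift. */
#define MASK_EXTR(v, m) (((v) & (m)) / ((m) & -(m)))
#define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))

/* Hypothetical field layout: a GFN stored in bits 12..43 of a 64-bit PTE. */
#define SKETCH_GFN_MASK UINT64_C(0x00000ffffffff000)

/* Insert a GFN without spelling out a separate shift constant. */
static uint64_t pte_set_gfn(uint64_t pte, uint64_t gfn)
{
    return (pte & ~SKETCH_GFN_MASK) | MASK_INSR(gfn, SKETCH_GFN_MASK);
}

/* Extract the GFN again using only the mask. */
static uint64_t pte_get_gfn(uint64_t pte)
{
    return MASK_EXTR(pte, SKETCH_GFN_MASK);
}
```

Because the shift is derived from the mask, a separate *_SHIFT constant
becomes redundant.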

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/shadow: Switch shadow_domain.has_fast_mmio_entries to bool
Andrew Cooper [Fri, 20 Jul 2018 14:06:28 +0000 (15:06 +0100)]
x86/shadow: Switch shadow_domain.has_fast_mmio_entries to bool

Remove an unnecessary if().

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/shadow: Use more appropriate conversion functions
Andrew Cooper [Fri, 20 Jul 2018 16:57:24 +0000 (16:57 +0000)]
x86/shadow: Use more appropriate conversion functions

Replace pfn_to_paddr(mfn_x(...)) with mfn_to_maddr(), and replace an opencoded
gfn_to_gaddr().

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/mm: Use mfn_eq()/mfn_add() rather than opencoded variations
Andrew Cooper [Fri, 1 Jun 2018 11:56:09 +0000 (12:56 +0100)]
x86/mm: Use mfn_eq()/mfn_add() rather than opencoded variations

Use l1e_get_mfn() in place of l1e_get_pfn() when applicable, and fix up style
on affected lines.

For sh_remove_shadow_via_pointer(), map_domain_page() is guaranteed to succeed
so there is no need to ASSERT() its success.  This allows the pointer
arithmetic to be folded into the previous expression, and for vaddr to be
properly typed as l1_pgentry_t, avoiding the cast in l1e_get_mfn().

No functional change.
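
To show why the typed helpers are preferable to opencoded comparisons,
here is a simplified sketch of a type-safe frame number. The constructor
is named mk_mfn() here purely for illustration (Xen spells it _mfn()),
and the real types are generated differently:

```c
#include <stdbool.h>

/* A struct wrapper stops accidental mixing with plain integers or with
 * other kinds of frame numbers. */
typedef struct { unsigned long m; } mfn_t;

/* Hypothetical constructor name for this sketch. */
static inline mfn_t mk_mfn(unsigned long m) { return (mfn_t){ m }; }
static inline unsigned long mfn_x(mfn_t m) { return m.m; }

/* Comparison and arithmetic stay within the typed domain. */
static inline bool mfn_eq(mfn_t a, mfn_t b)
{
    return mfn_x(a) == mfn_x(b);
}

static inline mfn_t mfn_add(mfn_t m, unsigned long i)
{
    return mk_mfn(mfn_x(m) + i);
}
```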

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/domctl: XEN_DOMCTL_debug_op is HVM only
Wei Liu [Sun, 26 Aug 2018 12:19:51 +0000 (13:19 +0100)]
x86/domctl: XEN_DOMCTL_debug_op is HVM only

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()
Zhenzhong Duan [Tue, 28 Aug 2018 15:13:42 +0000 (17:13 +0200)]
x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling acpi_parse_dmar()

pci_conf_read8() needs pci mmcfg mapping to work on multiple pci
segments system such as HPE Superdome-Flex.

Move acpi_mmcfg_init() call in acpi_boot_init() before calling
acpi_parse_dmar() so that when pci_conf_read8() is called in
acpi_parse_dev_scope(), we already have the mapping set up.

mmio_ro_ranges initialization is also moved ahead as it's the only
dependency of pci_mmcfg_arch_enable() that needs to be moved. Also
checked the code between the old and new call sites to ensure we
don't break anything.

Furthermore MMCFG will continue to not work this early (or
more precisely not at all until Dom0 boot has progressed far
enough) if the range(s) isn't/aren't marked reserved in E820.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Tested-by: Gopalasetty, Manoj <manoj.gopalasetty@hpe.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agoVMX: make vmx_read_guest_msr() cope with callers not checking its return value
Jan Beulich [Tue, 28 Aug 2018 15:12:05 +0000 (17:12 +0200)]
VMX: make vmx_read_guest_msr() cope with callers not checking its return value

It took till the 4.5 backports of the L1TF prereqs that gcc 8.2 finally
noticed that the vPMU callers, not checking the function's return value,
may consume uninitialized data. Guard against this by storing zero on
the error path.
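
The pattern can be sketched as follows (hypothetical types and lookup,
not the real VMX MSR list code): the output is written on every path, so
a caller ignoring the return value still sees a defined value.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical MSR list entry. */
struct msr_entry_sketch { uint32_t index; uint64_t data; };

/* Hypothetical lookup standing in for the real MSR list search. */
static int read_guest_msr_sketch(const struct msr_entry_sketch *list,
                                 size_t n, uint32_t msr, uint64_t *val)
{
    for ( size_t i = 0; i < n; i++ )
    {
        if ( list[i].index == msr )
        {
            *val = list[i].data;
            return 0;
        }
    }

    /* Error path: store zero so callers ignoring the return value
     * never consume uninitialized data. */
    *val = 0;
    return -ESRCH;
}
```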

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agoxenforeignmemory: fix fd leakage in error path
Wei Liu [Tue, 28 Aug 2018 14:19:55 +0000 (15:19 +0100)]
xenforeignmemory: fix fd leakage in error path

b49ef5d3 (xenforeignmemory: work around bug in older privcmd) added an
error path but forgot to close fd there.

Spotted by Coverity.
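
A generic sketch of this class of fix, assuming a POSIX environment; the
function and probe names are hypothetical and this is not the
libxenforeignmemory code:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical probe callbacks; return 0 on success, -1 on failure. */
typedef int (*probe_fn)(int fd);

static int probe_ok(int fd)   { (void)fd; return 0; }
static int probe_fail(int fd) { (void)fd; return -1; }

/* Open a device node and run a probe on it. */
static int open_with_probe(const char *path, probe_fn probe)
{
    int fd = open(path, O_RDWR);

    if ( fd < 0 )
        return -1;

    if ( probe(fd) < 0 )
    {
        /* The added error path must release the descriptor too. */
        close(fd);
        return -1;
    }

    return fd;
}
```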

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agorombios: remove packed attribute for pushad_regs_t
Wei Liu [Tue, 28 Aug 2018 13:56:38 +0000 (14:56 +0100)]
rombios: remove packed attribute for pushad_regs_t

The structure already has explicit padding.

Removing the attribute silences a clang 6 warning:

tcgbios.c:1519:34: error: taking address of packed member 'u' of class or structure 'pushad_regs_t' may result in an unaligned pointer value [-Werror,-Waddress-of-packed-member]
                                                  &regs->u.r32.edx);
                                                   ^~~~~~~~~~~~~~~

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen: is_hvm_{domain,vcpu} should evaluate to false when !CONFIG_HVM
Wei Liu [Sun, 26 Aug 2018 12:19:35 +0000 (13:19 +0100)]
xen: is_hvm_{domain,vcpu} should evaluate to false when !CONFIG_HVM

Turn them into static inline functions which evaluate to false when
CONFIG_HVM is not set. ARM won't be broken because ARM guests are set
to PV type in the hypervisor.

But ARM plans to switch to HVM guest type inside the hypervisor, so
preemptively introduce CONFIG_HVM for ARM here.
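
A simplified sketch of such a predicate, with a hypothetical domain
structure and an HVM_ENABLED macro standing in for whatever mechanism
the real code uses to test the config option:

```c
#include <stdbool.h>

/* Hypothetical, simplified domain structure. */
enum guest_type_sketch { GUEST_PV, GUEST_HVM };
struct domain_sketch { enum guest_type_sketch guest_type; };

/* Stand-in for a compile-time config test: 1 when the option is defined. */
#ifdef CONFIG_HVM
#define HVM_ENABLED 1
#else
#define HVM_ENABLED 0
#endif

static inline bool is_hvm_domain(const struct domain_sketch *d)
{
    /* Constant false without CONFIG_HVM, so the compiler can drop
     * HVM-only paths guarded by this predicate. */
    return HVM_ENABLED && d->guest_type == GUEST_HVM;
}
```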

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agoxen/xsm: Rename CONFIG_XSM_POLICY to CONFIG_XSM_FLASK_POLICY
Andrew Cooper [Tue, 26 Jun 2018 09:59:10 +0000 (10:59 +0100)]
xen/xsm: Rename CONFIG_XSM_POLICY to CONFIG_XSM_FLASK_POLICY

The embedded policy is specifically a flask policy, so update the
infrastructure to reflect this.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
6 years agoxen/xsm: Rename CONFIG_FLASK_* to CONFIG_XSM_FLASK_*
Andrew Cooper [Tue, 26 Jun 2018 09:56:50 +0000 (10:56 +0100)]
xen/xsm: Rename CONFIG_FLASK_* to CONFIG_XSM_FLASK_*

Flask is one single XSM module, and another is about to be introduced.
Properly namespace the symbols for clarity.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
6 years agox86/svm: Fixes to OS Visible Workaround handling
Andrew Cooper [Tue, 27 Feb 2018 17:22:40 +0000 (17:22 +0000)]
x86/svm: Fixes to OS Visible Workaround handling

OSVW data is technically per-cpu, but it is the firmware's responsibility to
make it equivalent on each cpu.  A guest's OSVW data is sourced from global
data in Xen, clearly making it per-domain data rather than per-vcpu data.

Move the data from struct arch_svm_struct to struct svm_domain, and call
svm_guest_osvw_init() from svm_domain_initialise() instead of
svm_vcpu_initialise().

In svm_guest_osvw_init(), reading osvw_length and osvw_status must be done
under the osvw_lock to avoid observing mismatched values.  The guest's view of
osvw_length also needs clipping at 64 as we only offer one status register (To
date, 5 is the maximum index defined AFAICT).  Avoid opencoding max().

Drop svm_handle_osvw() as it is shorter and simpler to implement the
functionality inline in svm_msr_{read,write}_intercept().  As the OSVW MSRs
are a contiguous block, we can access them as an array for simplicity.
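
The clipping and the array-style access can be sketched like this. The
helper names and the two-element layout are assumptions made for
illustration; only the MSR indices are the architectural ones:

```c
#include <stdint.h>

#define MSR_OSVW_ID_LENGTH 0xc0010140u
#define MSR_OSVW_STATUS    0xc0010141u

/* Only one 64-bit status register is offered, so clip the length at 64. */
static uint64_t clip_osvw_length(uint64_t host_length)
{
    return host_length > 64 ? 64 : host_length;
}

/* Hypothetical read handler: osvw[0] is the length, osvw[1] the status.
 * The MSRs are contiguous, so the index is just an offset. */
static int osvw_msr_read(const uint64_t osvw[2], uint32_t msr, uint64_t *val)
{
    if ( msr < MSR_OSVW_ID_LENGTH || msr > MSR_OSVW_STATUS )
        return -1;

    *val = osvw[msr - MSR_OSVW_ID_LENGTH];
    return 0;
}
```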

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years agoxen/pt: io.c contains HVM only code
Wei Liu [Sun, 26 Aug 2018 12:19:41 +0000 (13:19 +0100)]
xen/pt: io.c contains HVM only code

We also need to make it x86 only because ARM will define CONFIG_HVM at
some point.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/vpmu: put HVM only code under CONFIG_HVM
Wei Liu [Sun, 26 Aug 2018 12:19:40 +0000 (13:19 +0100)]
x86/vpmu: put HVM only code under CONFIG_HVM

Change u32 to uint32_t while at it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86: provide stub for memory_type_changed
Wei Liu [Sun, 26 Aug 2018 12:19:38 +0000 (13:19 +0100)]
x86: provide stub for memory_type_changed

Jan indicated that for PV guests the memory type is not changed, for
HVM guests memory_type_changed is needed for EPT's effective memory
type calculation.  This means memory_type_changed is HVM only.

Provide a stub to minimise code churn.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/hvm: provide hvm_hap_supported
Wei Liu [Sun, 26 Aug 2018 12:19:37 +0000 (13:19 +0100)]
x86/hvm: provide hvm_hap_supported

And replace direct accesses in non-HVM subsystems to
hvm_funcs.hap_supported with the new function, to avoid accessing an
internal data structure of another subsystem directly.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86: enclose hvm_op and dm_op in CONFIG_HVM in relevant tables
Wei Liu [Sun, 26 Aug 2018 12:19:36 +0000 (13:19 +0100)]
x86: enclose hvm_op and dm_op in CONFIG_HVM in relevant tables

PV guest (Dom0) needs to be able to use these two hypercalls in order to
serve HVM guests. But if xen doesn't support HVM at all there is no
point in exposing them to PV guests.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoRevert "x86/hvm: remove default ioreq server"
Jan Beulich [Mon, 27 Aug 2018 13:50:50 +0000 (15:50 +0200)]
Revert "x86/hvm: remove default ioreq server"

This reverts commit 629856eae2a7f766f1f024a06ad3abf1fd4b9d37,
which breaks at least one of the qemu builds.

6 years agoVT-d/dmar: iommu mem leak fix
Zhenzhong Duan [Mon, 27 Aug 2018 09:37:24 +0000 (11:37 +0200)]
VT-d/dmar: iommu mem leak fix

Release memory allocated for drhd iommu in error path.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agobuild: remove tboot make targets
Doug Goldstein [Mon, 27 Aug 2018 09:37:01 +0000 (11:37 +0200)]
build: remove tboot make targets

The tboot targets are woefully out of date. These should really be
retired because setting up tboot is more complex than the build process
for it.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Christopher Clark <christopher.clark6@baesystems.com>
6 years agox86/hvm: remove default ioreq server
Paul Durrant [Mon, 27 Aug 2018 09:30:18 +0000 (11:30 +0200)]
x86/hvm: remove default ioreq server

My recent patch [1] to qemu-xen-traditional removes the last use of the
'default' ioreq server in Xen. (This is a catch-all ioreq server that is
used if no explicitly registered I/O range is targeted).

This patch can be applied once that patch is committed, to remove the
(>100 lines of) redundant code in Xen.

NOTE: The removal of the special case for HVM_PARAM_DM_DOMAIN in
      hvm_allow_set_param() is not directly related to removal of
      default ioreq servers. It could have been cleaned up at any time
      after commit 9a422c03 "x86/hvm: stop passing explicit domid to
      hvm_create_ioreq_server()". It is now added to the new
      deprecated sets introduced by this patch.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00270.html

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/nestedhvm: provide some stubs for p2m code
Wei Liu [Mon, 13 Aug 2018 14:02:32 +0000 (15:02 +0100)]
x86/nestedhvm: provide some stubs for p2m code

Make two functions static inline so that they can be referenced in p2m
code. Check nestedhvm is enabled before calling
nestedhvm_vmcx_flushtlb (which also has a side effect of not issuing
unnecessary IPIs for non-nested case).

While moving, reformat code and use proper boolean.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/mm/shadow: split out HVM only code
Wei Liu [Fri, 17 Aug 2018 10:03:24 +0000 (11:03 +0100)]
x86/mm/shadow: split out HVM only code

Move the code previously enclosed in CONFIG_HVM into its own file.

Note that although some code explicitly checks is_hvm_*, which hints it
can be used for PV too, I can't find a code path that would be the
case.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/mm/shadow: make it build with !CONFIG_HVM
Wei Liu [Thu, 16 Aug 2018 10:05:34 +0000 (11:05 +0100)]
x86/mm/shadow: make it build with !CONFIG_HVM

Enclose HVM only emulation code under CONFIG_HVM. Add some BUG()s to
catch any issue.

Note that although some code checks is_hvm_*, which hints it can be
called for PV as well, I can't find such paths.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
6 years agox86/vm_event: put vm_event_fill_regs under CONFIG_HVM
Wei Liu [Fri, 17 Aug 2018 10:19:42 +0000 (11:19 +0100)]
x86/vm_event: put vm_event_fill_regs under CONFIG_HVM

Ideally the HVM specific part of VM event should be moved into hvm/ at
some point, but this will do for now.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
6 years agox86/mem_access: put HVM only function under CONFIG_HVM
Wei Liu [Fri, 17 Aug 2018 12:51:11 +0000 (13:51 +0100)]
x86/mem_access: put HVM only function under CONFIG_HVM

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
6 years agox86: guard HAS_VPCI with CONFIG_HVM
Wei Liu [Fri, 10 Aug 2018 17:08:00 +0000 (18:08 +0100)]
x86: guard HAS_VPCI with CONFIG_HVM

VPCI is only useful for PVH / HVM guests. Ideally CONFIG_HVM should
imply !PV_SHIM_EXCLUSIVE, but we still want to build PV_SHIM_EXCLUSIVE
with CONFIG_HVM at this stage because a lot of things are still
entangled.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agoxenforeignmemory: work around bug in older privcmd
Paul Durrant [Fri, 24 Aug 2018 12:16:26 +0000 (13:16 +0100)]
xenforeignmemory: work around bug in older privcmd

Versions of linux privcmd prior to commit dc9eab6fd94d ("return -ENOTTY
for unimplemented IOCTLs") will return -EINVAL rather than the conventional
-ENOTTY for unimplemented codes. This breaks the error path in
libxenforeignmemory resource mapping, which only translates ENOTTY into
EOPNOTSUPP to inform callers of the need to use an alternative (legacy)
mechanism.

This patch adds a new 'unimplemented' [1] ioctl code into the local
privcmd header which is then used to probe for the appropriate errno to
translate in the resource mapping error path.

[1] this is a code that has, so far, never been used in any version of
    privcmd and will be added to future versions of the header in the
    linux source, to make sure it stays unimplemented.
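
The probing idea can be sketched as follows. The two driver models and
all function names are hypothetical; the real code issues an ioctl on
the privcmd file descriptor instead of calling a function pointer:

```c
#include <errno.h>

/* Hypothetical driver models: fail with a version-dependent errno. */
static int old_privcmd(int code) { (void)code; errno = EINVAL; return -1; }
static int new_privcmd(int code) { (void)code; errno = ENOTTY; return -1; }

/* Issue a code known to be unimplemented and remember the errno it
 * yields; 0 means the probe unexpectedly succeeded. */
static int probe_unimpl_errno(int (*do_ioctl)(int), int unimpl_code)
{
    if ( do_ioctl(unimpl_code) == 0 )
        return 0;

    return errno;
}

/* Translate the probed errno into EOPNOTSUPP; pass others through. */
static int translate_errno(int err, int unimpl_errno)
{
    return (unimpl_errno && err == unimpl_errno) ? EOPNOTSUPP : err;
}
```

With this shape, callers see EOPNOTSUPP for an unimplemented mapping
ioctl regardless of which errno the running driver uses.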

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years agotools: building IPXE should be determined by CONFIG_IPXE
Wei Liu [Fri, 24 Aug 2018 10:54:04 +0000 (11:54 +0100)]
tools: building IPXE should be determined by CONFIG_IPXE

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years agoQEMU_TAG update
Ian Jackson [Thu, 23 Aug 2018 14:10:11 +0000 (15:10 +0100)]
QEMU_TAG update

6 years agoxen/arm: p2m: Introduce a new variable removing_mapping in __p2m_set_entry
Julien Grall [Mon, 16 Jul 2018 17:27:10 +0000 (18:27 +0100)]
xen/arm: p2m: Introduce a new variable removing_mapping in __p2m_set_entry

This makes the code slightly easier to understand.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: p2m: Rename ret to mfn in p2m_lookup
Julien Grall [Mon, 16 Jul 2018 17:27:09 +0000 (18:27 +0100)]
xen/arm: p2m: Rename ret to mfn in p2m_lookup

Cosmetic change to make clearer what is returned ('ret' is a bit
too generic).

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: guest_walk: Use lpae_is_mapping to simplify the code
Julien Grall [Mon, 16 Jul 2018 17:27:06 +0000 (18:27 +0100)]
xen/arm: guest_walk: Use lpae_is_mapping to simplify the code

!lpae_is_page(pte, level) && !lpae_is_superpage(pte, level) is
equivalent to !lpae_is_mapping(pte, level).

At the same time drop lpae_is_page(pte, level) that is now unused.
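
The equivalence can be checked exhaustively on a simplified model. The
sketch below assumes an entry reduced to its valid and table bits, with
pages existing only at level 3 and superpages only above it; the helper
names are hypothetical and are not the real lpae_* definitions:

```c
#include <stdbool.h>

/* Hypothetical, simplified entry: only the two bits that matter here. */
struct pte_sketch { bool valid; bool table; };

static bool is_superpage(struct pte_sketch p, unsigned int level)
{
    return p.valid && !p.table && level < 3;
}

static bool is_page(struct pte_sketch p, unsigned int level)
{
    return p.valid && p.table && level == 3;
}

static bool is_mapping(struct pte_sketch p, unsigned int level)
{
    if ( !p.valid )
        return false;

    return level == 3 ? p.table : !p.table;
}

/* Compare both forms over all levels and bit combinations. */
static bool forms_agree(void)
{
    for ( unsigned int level = 0; level <= 3; level++ )
        for ( int bits = 0; bits < 4; bits++ )
        {
            struct pte_sketch p = { bits & 1, (bits >> 1) & 1 };
            bool longhand = !is_page(p, level) && !is_superpage(p, level);

            if ( longhand != !is_mapping(p, level) )
                return false;
        }

    return true;
}
```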

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: Rename lpae_valid to lpae_is_valid
Julien Grall [Mon, 16 Jul 2018 17:27:05 +0000 (18:27 +0100)]
xen/arm: Rename lpae_valid to lpae_is_valid

This will help to keep the naming consistent across all lpae helpers.

No functional change intended.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: Rework lpae_table
Julien Grall [Mon, 16 Jul 2018 17:27:04 +0000 (18:27 +0100)]
xen/arm: Rework lpae_table

Currently, lpae_table can only work on entry from any level other than
3. Make it work with any level by extending the prototype to pass the
level.

At the same time, rename the function to lpae_is_mapping so naming stays
consistent across all lpae_* helpers.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: Rework lpae_mapping
Julien Grall [Mon, 16 Jul 2018 17:27:03 +0000 (18:27 +0100)]
xen/arm: Rework lpae_mapping

Currently, lpae_mapping can only work on entry from any level other than
3. Make it work with any level by extending the prototype to pass the
level.

At the same time, rename the function to lpae_is_mapping so naming stays
consistent across lpae_* helpers.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: p2m: Limit call to mem access code use in get_page_from_gva
Julien Grall [Mon, 16 Jul 2018 17:27:02 +0000 (18:27 +0100)]
xen/arm: p2m: Limit call to mem access code use in get_page_from_gva

Mem access has only an impact on the hardware translation between a
guest virtual address and the machine physical address. So it is not
necessary to fall back to memaccess for all the other cases (e.g. when it
is not possible to acquire the page behind the MFN).

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: p2m: Reduce the locking section in get_page_from_gva
Julien Grall [Mon, 16 Jul 2018 17:27:01 +0000 (18:27 +0100)]
xen/arm: p2m: Reduce the locking section in get_page_from_gva

The p2m lock is only necessary to prevent gvirt_to_maddr failing when a
break-before-make sequence is used in the P2M update concurrently on
another pCPU. So reduce the locking section.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: cpregs: Fix typo in the documentation of TTBCR
Julien Grall [Mon, 16 Jul 2018 17:26:59 +0000 (18:26 +0100)]
xen/arm: cpregs: Fix typo in the documentation of TTBCR

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: cpregs: Allow HSR_CPREG* to receive more than 1 parameter
Julien Grall [Mon, 16 Jul 2018 17:26:58 +0000 (18:26 +0100)]
xen/arm: cpregs: Allow HSR_CPREG* to receive more than 1 parameter

At the moment, HSR_CPREG is expected to receive only the co-processor
register name as a parameter. Because the name is actually a define, this
may have been expanded by a previous macro.

Rather than imposing the use of _HSR_CPREG* in such cases, allow
HSR_CPREG to receive more than 1 parameter.
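
The preprocessor behaviour behind this can be sketched with a made-up
encoding. ENC_CPREG32, SKETCH_REG and FORWARD are hypothetical names,
and the bit layout is not the real HSR one:

```c
/* Hypothetical encoding of the five register fields. */
#define ENC_CPREG32_(cp, op1, crn, crm, op2) \
    (((cp) << 16) | ((op1) << 12) | ((crn) << 8) | ((crm) << 4) | (op2))

/* Variadic wrapper: accepts either one macro name that expands to five
 * comma-separated fields, or the five fields already expanded. */
#define ENC_CPREG32(...) ENC_CPREG32_(__VA_ARGS__)

/* A register "name" is a define expanding to its fields. */
#define SKETCH_REG 15, 0, 1, 0, 0

/* An intermediate macro expands its argument before forwarding it, so
 * the wrapper above ends up receiving five arguments here. */
#define FORWARD(reg) ENC_CPREG32(reg)

static unsigned int enc_direct(void)  { return ENC_CPREG32(SKETCH_REG); }
static unsigned int enc_forward(void) { return FORWARD(SKETCH_REG); }
```

With a single-parameter wrapper, the FORWARD() use would fail to compile
because five arguments would reach a one-argument macro.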

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: rename acpi_make_chosen_node to make_chosen_node
Stefano Stabellini [Tue, 31 Jul 2018 23:27:50 +0000 (16:27 -0700)]
xen/arm: rename acpi_make_chosen_node to make_chosen_node

acpi_make_chosen_node is actually generic and can be reused. Rename it
to make_chosen_node and make it available to non-ACPI builds.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agoxen/arm: move evtchn_allocate call out of make_hypervisor_node
Stefano Stabellini [Tue, 31 Jul 2018 23:27:49 +0000 (16:27 -0700)]
xen/arm: move evtchn_allocate call out of make_hypervisor_node

In the case of domUs, evtchn_irq is allocated by arch_domain_create and
set to GUEST_EVTCHN_PPI.

To make make_hypervisor_node more reusable, move the call to
evtchn_allocate out of make_hypervisor_node, to the dom0 specific caller
(handle_node).

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agoxen/arm: move a few DT related defines to public/device_tree_defs.h
Stefano Stabellini [Tue, 31 Jul 2018 23:27:45 +0000 (16:27 -0700)]
xen/arm: move a few DT related defines to public/device_tree_defs.h

Move a few constants defined by libxl_arm.c to
xen/include/public/device_tree_defs.h, so that they can be used from Xen
and libxl. Prepend GUEST_ to avoid conflicts.

Move the DT_IRQ_TYPE* definitions from libxl_arm.c to
public/device_tree_defs.h. Use them in Xen where appropriate.

Re-define the existing Xen internal IRQ_TYPEs as DT_IRQ_TYPEs: they
already happen to be the same, let's make it clear.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
CC: ian.jackson@eu.citrix.com
6 years agoxen/arm: do not pass dt_host to make_memory_node and make_hypervisor_node
Stefano Stabellini [Tue, 31 Jul 2018 23:27:48 +0000 (16:27 -0700)]
xen/arm: do not pass dt_host to make_memory_node and make_hypervisor_node

In order to make make_memory_node and make_hypervisor_node more
reusable, do not pass them dt_host. As they only use it to calculate
addrcells and sizecells, pass addrcells and sizecells directly.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>