]> xenbits.xensource.com Git - xen.git/log
xen.git
8 years agox86: use unambiguous register names
Jan Beulich [Fri, 6 Jan 2017 14:07:31 +0000 (15:07 +0100)]
x86: use unambiguous register names

Eliminate the mis-naming of 64-bit fields with 32-bit register names
(eflags instead of rflags etc). To ensure no piece of code was missed,
transiently use the underscore prefixed names only for 32-bit register
accesses. This will be cleaned up subsequently.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: drop cpu_has_sse{,2}
Jan Beulich [Fri, 6 Jan 2017 14:06:09 +0000 (15:06 +0100)]
x86: drop cpu_has_sse{,2}

Commit dc88221c97 ("x86: rename XMM* features to SSE*") pointlessly
added them - these features are always available on 64-bit CPUs. (Let's
not assume this for MMX though in at least the insn emulator.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support fencing insns
Jan Beulich [Fri, 6 Jan 2017 14:04:22 +0000 (15:04 +0100)]
x86emul: support fencing insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/mtrr: use stdbool instead of int + define
Doug Goldstein [Thu, 5 Jan 2017 16:26:09 +0000 (10:26 -0600)]
x86/mtrr: use stdbool instead of int + define

Instead of using an int and providing a define for TRUE and FALSE,
change the code to use stdbool that Xen provides.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Minor style tweaks]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agolibxl: Update xenstore on VCPU hotplug for all guest types
Boris Ostrovsky [Tue, 3 Jan 2017 14:04:12 +0000 (09:04 -0500)]
libxl: Update xenstore on VCPU hotplug for all guest types

Currently HVM guests that use upstream qemu do not update xenstore's
availability entry for VCPUs. While it is not strictly necessary for
hotplug to work, xenstore ends up not reflecting actual status of
VCPUs. We should fix this.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agobuild: use debug_symbols to add -g3
Wei Liu [Thu, 5 Jan 2017 16:36:51 +0000 (16:36 +0000)]
build: use debug_symbols to add -g3

While doing archeology I found 38ce7ce3,  we should make sure
debug_symbols is responsible for adding "-g" to CFLAGS.

Move adding "-g3" from being guarded by debug to being guarded by
debug_symbols.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agobuild: move debug{,_symbols} to tools/Rules.mk
Wei Liu [Fri, 23 Dec 2016 12:24:16 +0000 (12:24 +0000)]
build: move debug{,_symbols} to tools/Rules.mk

31d41d7b tried to make debug affect tools build only but failed to take
care of debug_symbols (which appends "-g" to CFLAGS).

Move both to tools/Rules.mk at once in this patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agobuild: move setting LTO options to xen/Rules.mk
Wei Liu [Fri, 23 Dec 2016 12:12:36 +0000 (12:12 +0000)]
build: move setting LTO options to xen/Rules.mk

Having them in StdGNU.mk would affect both hypervisor and tools build.
However judging from the commit message of e4cdd74f LTO was only meant
to affect hypvervisor build.

Move the relevant bits to xen/Rules.mk.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools/libxl: include scheduler parameters in the output of xl list -l
Roger Pau Monne [Thu, 5 Jan 2017 10:08:34 +0000 (10:08 +0000)]
tools/libxl: include scheduler parameters in the output of xl list -l

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Fatih Acar <fatih@gandi.net>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/pv: Defer I/O bitmap checks even in 64bit mode for emulate_privilege_op()
Andrew Cooper [Thu, 5 Jan 2017 11:41:50 +0000 (11:41 +0000)]
x86/pv: Defer I/O bitmap checks even in 64bit mode for emulate_privilege_op()

The I/O bitmap doesn't change function depending on mode.  64bit userspace
such as an X server still needs to enter guest_io_okay() to find that the PV
kernel did set up an appropriate virtual I/O bitmap to permit access.

While moving the check, alter its representation to be easier to read.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/pv: Fix determination of 64bit mode in emulate_privilege_op()
Andrew Cooper [Thu, 5 Jan 2017 11:23:15 +0000 (11:23 +0000)]
x86/pv: Fix determination of 64bit mode in emulate_privilege_op()

ctxt->addr_size is expressed in bits rather than bytes, and has the value 16,
32 or 64.  Comparing < 8 made the intended non-64bit paths dead.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/vvmx: Drop sreg_to_index[]
Andrew Cooper [Tue, 3 Jan 2017 11:55:54 +0000 (11:55 +0000)]
x86/vvmx: Drop sreg_to_index[]

Since c/s 0888d36b "x86/emul: Correct the decoding of SReg3 operands",
x86_seg_* have followed hardware encodings, meaning that this translation
table is now an identiy transform.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/VMX: use unambiguous register names
Jan Beulich [Thu, 5 Jan 2017 10:11:19 +0000 (11:11 +0100)]
x86/VMX: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/apicv: fix RTC periodic timer and apicv issue
Quan Xu [Thu, 5 Jan 2017 10:10:01 +0000 (11:10 +0100)]
x86/apicv: fix RTC periodic timer and apicv issue

When Xen apicv is enabled, wall clock time is faster on Windows7-32
guest with high payload (with 2vCPU, captured from xentrace, in
high payload, the count of IPI interrupt increases rapidly between
these vCPUs).

If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector 0xd1)
are both pending (index of bit set in vIRR), unfortunately, the IPI
intrrupt is high priority than periodic timer interrupt. Xen updates
IPI interrupt bit set in vIRR to guest interrupt status (RVI) as a high
priority and apicv (Virtual-Interrupt Delivery) delivers IPI interrupt
within VMX non-root operation without a VM-Exit. Within VMX non-root
operation, if periodic timer interrupt index of bit is set in vIRR and
highest, the apicv delivers periodic timer interrupt within VMX non-root
operation as well.

But in current code, if Xen doesn't update periodic timer interrupt bit
set in vIRR to guest interrupt status (RVI) directly, Xen is not aware
of this case to decrease the count (pending_intr_nr) of pending periodic
timer interrupt, then Xen will deliver a periodic timer interrupt again.

And that we update periodic timer interrupt in every VM-entry, there is
a chance that already-injected instance (before EOI-induced exit happens)
will incur another pending IRR setting if there is a VM-exit happens
between virtual interrupt injection (vIRR->0, vISR->1) and EOI-induced
exit (vISR->0), since pt_intr_post hasn't been invoked yet, then the
guest receives more periodic timer interrupt.

So we set eoi_exit_bitmap for intack.vector - give a chance to post
periodic time interrupts when periodic time interrupts become the
highest one.

Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
8 years agox86/cpuid: Untangle the <asm/cpufeature.h> include hierachy
Andrew Cooper [Thu, 8 Dec 2016 08:46:42 +0000 (08:46 +0000)]
x86/cpuid: Untangle the <asm/cpufeature.h> include hierachy

The use of X86_FEATURES_ONLY was shortlived in Linux for the same problem
encountered here.  The following series needs to add extra includes to
asm/cpuid.h, which breaks the build elsewhere given the current hierachy.

Move the feature definitions into a separate header file, which also matches
the solution Linux used.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/svm: Replace opencoded 1GB superpage check
Andrew Cooper [Tue, 3 Jan 2017 17:46:58 +0000 (17:46 +0000)]
x86/svm: Replace opencoded 1GB superpage check

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agox86/mwait-idle: add Knights Mill CPUID
Piotr Luc [Wed, 4 Jan 2017 13:29:30 +0000 (14:29 +0100)]
x86/mwait-idle: add Knights Mill CPUID

Add Knights Mill (KNM) to the list of CPUIDs supported by mwait-idle.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
[Linux commit: a2c1bc645e87346150516b3abf1933ed29d0f48b]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/mwait-idle: add CPU model 0x4a (Atom Z34xx series)
Andy Shevchenko [Wed, 4 Jan 2017 13:29:08 +0000 (14:29 +0100)]
x86/mwait-idle: add CPU model 0x4a (Atom Z34xx series)

Add CPU ID for Atom Z34xx processors. Datasheets indicate support for this,
detailed information about potential quirks or limitations are missing, though.
So we just reuse the definition from official BSP code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
[Linux commit: 5e7ec268fd48d63cfd0e3a9be6c6443f01673bd4]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: use unambiguous register names
Jan Beulich [Wed, 4 Jan 2017 13:28:32 +0000 (14:28 +0100)]
x86emul: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc).

Note that the result is not fully consistent until after at least one
more patch is in place, primarily to limit patch size (by trying to not
touch the same line twice).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: make _PRE_EFLAGS() tolerate first argument being 32-bit
Jan Beulich [Wed, 4 Jan 2017 13:28:02 +0000 (14:28 +0100)]
x86emul: make _PRE_EFLAGS() tolerate first argument being 32-bit

While this may appear to introduce a truncation issue, the high 32 bits
get zapped already anyway (early in _PRE_EFLAGS() as well as in
_POST_EFLAGS()). Once a subsequent patch switches to use proper 32-bit
EFLAGS operands, we'll in fact end up with more correct code, as that
zeroing of the upper halves will then go away.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support LAR/LSL/VERR/VERW
Jan Beulich [Wed, 4 Jan 2017 13:27:17 +0000 (14:27 +0100)]
x86emul: support LAR/LSL/VERR/VERW

This involves protmode_load_seg() accepting x86_seg_none as input, with
the meaning to
- suppress any exceptions other than #PF,
- not commit any state.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoxen/arm: fix GIC_INVALID_LR
Stefano Stabellini [Thu, 22 Dec 2016 02:15:10 +0000 (18:15 -0800)]
xen/arm: fix GIC_INVALID_LR

GIC_INVALID_LR should be 0xff, but actually, defined as ~(uint8_t)0, is
0xffffffff. Fix the problem by placing the ~ operator before the cast.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years agox86/cpu: Improvements to get_cpu_vendor()
Andrew Cooper [Fri, 16 Dec 2016 17:36:22 +0000 (17:36 +0000)]
x86/cpu: Improvements to get_cpu_vendor()

Comparing 3 integers is more efficient than using strcmp(), and is more useful
to the gcv_guest case than having to fabricate a suitable string to pass.  The
gcv_host cases have both options easily to hand, and experimentally, the
resulting code is more efficient.

Update the cpu_dev structure to be more efficient.  c_vendor[] only needs to
be 8 bytes long to cover all the CPU drivers Xen has, which avoids storing an
8-byte pointer to 8 bytes of data.  Drop c_ident[1] as we have no CPU drivers
with a second ident string, and turn it into an anonymous union to allow
access to the integer values directly.

This avoids all need for the vendor_id union in update_domain_cpuid_info().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpu: Don't update this_cpu for get_cpu_vendor(, gcv_guest)
Andrew Cooper [Tue, 3 Jan 2017 12:55:55 +0000 (12:55 +0000)]
x86/cpu: Don't update this_cpu for get_cpu_vendor(, gcv_guest)

Otherwise booting a cross-vendor guest would cause PCPU hotplug to
malfunction, because of trying to use the wrong CPU driver.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpu: Drop unused X86_VENDOR_* values
Andrew Cooper [Fri, 16 Dec 2016 17:53:09 +0000 (17:53 +0000)]
x86/cpu: Drop unused X86_VENDOR_* values

Xen only has CPU drivers for Intel, Centaur and AMD.  All other contributions
to X86_VENDOR_NUM simply make the cpu_devs[] array longer, reducing the
efficiency of get_cpu_vendor()

There is one remaning hidden reference to X86_VENDOR_CYRIX in the MTRR code.
However, as far as I can tell, Cyrix never realeased a 64bit processor.  It is
therefore dead code.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agolibxl: fix libxl_set_memory_target
Wei Liu [Thu, 29 Dec 2016 16:36:31 +0000 (16:36 +0000)]
libxl: fix libxl_set_memory_target

Commit 26dbc93a ("libxl: Remove pointless hypercall from
libxl_set_memory_target") removed the call to xc_domain_getinfolist, but
it failed to notice that "info" was actually needed later.

Put that back. While at it, make the code conform to coding style
requirement.

Reported-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/HVM: constify VMFUNC emulation hook
Jan Beulich [Tue, 3 Jan 2017 08:44:43 +0000 (09:44 +0100)]
x86/HVM: constify VMFUNC emulation hook

... to clarify that the register state does not get altered (behind the
back of the emulator).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/SVM: use unambiguous register names
Jan Beulich [Tue, 3 Jan 2017 08:44:10 +0000 (09:44 +0100)]
x86/SVM: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
8 years agox86/HVMemul: use unambiguous register names
Jan Beulich [Tue, 3 Jan 2017 08:43:29 +0000 (09:43 +0100)]
x86/HVMemul: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/guest-walk: use unambiguous register names
Jan Beulich [Tue, 3 Jan 2017 08:42:52 +0000 (09:42 +0100)]
x86/guest-walk: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agox86/MSR: introduce MSR access split/fold helpers
Jan Beulich [Tue, 3 Jan 2017 08:42:10 +0000 (09:42 +0100)]
x86/MSR: introduce MSR access split/fold helpers

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
8 years agolibxl/libxl_qmp.c: Fix code style in qmp_next()
Zhang Chen [Mon, 26 Dec 2016 07:18:09 +0000 (15:18 +0800)]
libxl/libxl_qmp.c: Fix code style in qmp_next()

Fix text-indent.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/blktap2: remove unused inclusion of sys/sysctl.l
Alistair Francis [Tue, 20 Dec 2016 19:47:00 +0000 (11:47 -0800)]
tools/blktap2: remove unused inclusion of sys/sysctl.l

That header file is not used. Removing it would avoid build error with
musl libc, which doesn't have that header file.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
[ wei: rewrote commit message ]
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoINSTALL: remove stale lto build instruction
Wei Liu [Wed, 21 Dec 2016 16:44:24 +0000 (16:44 +0000)]
INSTALL: remove stale lto build instruction

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/emul: Correct the return value handling of VMFUNC
Andrew Cooper [Fri, 9 Dec 2016 18:40:11 +0000 (18:40 +0000)]
x86/emul: Correct the return value handling of VMFUNC

The bracketing of x86_emulate() calling the ops->vmfunc() hook is wrong with
respect to the assignment to rc, which can trip the new assertions in
x86_emulate_wrapper().

The hvmemul_vmfunc() hook should only raise #UD if X86EMUL_EXCEPTION is
returned.  This is only a latent bug at the moment.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agotools/blktap2: Fix missing header file
Alistair Francis [Tue, 20 Dec 2016 19:46:59 +0000 (11:46 -0800)]
tools/blktap2: Fix missing header file

To avoid build errors relating to missing declarations of ssize_t add
the appropriate header file to atomic.h.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/blktap2/vhd: Remove unused struct stat stats
Alistair Francis [Tue, 20 Dec 2016 19:46:58 +0000 (11:46 -0800)]
tools/blktap2/vhd: Remove unused struct stat stats

The unsued variable 'struct stat stats' causes build errors in some
situations. As it isn't used just remove it.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoCorrected comment typo "count not" to "could not"
Eric DeVolder [Wed, 21 Dec 2016 21:37:31 +0000 (13:37 -0800)]
Corrected comment typo "count not" to "could not"

Fix cut-n-paste typo; changed the words "count not" to "could not".

No functional changes.

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibacpi: don't build x86-only AML for ARM64 mk_dsdt
Boris Ostrovsky [Thu, 22 Dec 2016 09:56:34 +0000 (10:56 +0100)]
libacpi: don't build x86-only AML for ARM64 mk_dsdt

Commit d6ac8e22c7c5 ("acpi/x86: define ACPI IO registers for
PVH guests") broke ARM64 build of mk_dsdt.c due to introduction
of XEN_ACPI_CPU_MAP[_LEN] macros that are needed only for x86
guests.

We could fix the build by dealing specifically with those macros
but since post-MADT code is not executed on ARM64 anyway we can
compile it for x86 only.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agoinit/FreeBSD: fix incorrect usage of $rc_pids in xendriverdomain
Roger Pau Monne [Wed, 21 Dec 2016 16:47:26 +0000 (16:47 +0000)]
init/FreeBSD: fix incorrect usage of $rc_pids in xendriverdomain

It should be rc_pid.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Nathan Friess <nathan.friess@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoinit/FreeBSD: add rc control variables
Roger Pau Monne [Mon, 19 Dec 2016 15:02:04 +0000 (15:02 +0000)]
init/FreeBSD: add rc control variables

Those are used in order to decide which scripts are executed at init.

Ref: https://www.freebsd.org/doc/en/articles/rc-scripting/article.html#rcng-confdummy

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: fix up conflict ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agoinit/FreeBSD: fix xencommons so it can only be launched by Dom0
Roger Pau Monne [Mon, 19 Dec 2016 15:02:03 +0000 (15:02 +0000)]
init/FreeBSD: fix xencommons so it can only be launched by Dom0

At the moment the execution of xencommons is gated on the presence of the
privcmd device, but that's not correct, since privcmd is available to all Xen
domains (privileged or unprivileged). Instead of using privcmd use the
xenstored device, which will only be available to the domain that's in charge
of running xenstored, and thus xencommons.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoinit/FreeBSD: remove xendriverdomain_precmd
Roger Pau Monne [Mon, 19 Dec 2016 15:02:02 +0000 (15:02 +0000)]
init/FreeBSD: remove xendriverdomain_precmd

...because it's empty. While there also rename xendriverdomain_startcmd to
xendriverdomain_start in order to match the nomenclature of the file.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: fix up minor error ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agoinit/FreeBSD: set correct PATH for xl devd
Roger Pau Monne [Mon, 19 Dec 2016 15:02:01 +0000 (15:02 +0000)]
init/FreeBSD: set correct PATH for xl devd

FreeBSD init scripts don't have /usr/local/{bin/sbin} in it's PATH, which
prevents `xl devd` from working properly since hotplug scripts require the set
of xenstore cli tools to be in PATH.

While there also fix the usage of --pidfile, which according to the xl help
doesn't use "=", and add braces around XLDEVD_PIDFILE.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl: fix coding style issues in init_acpi_config
Wei Liu [Fri, 16 Dec 2016 17:40:09 +0000 (17:40 +0000)]
libxl: fix coding style issues in init_acpi_config

1. Use "r" to store return values from xc calls.
2. Don't initialise "rc" at the beginning of the function.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agox86/shadow: use unambiguous register names
Jan Beulich [Wed, 21 Dec 2016 16:02:52 +0000 (17:02 +0100)]
x86/shadow: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/misc: use unambiguous register names
Jan Beulich [Wed, 21 Dec 2016 16:01:58 +0000 (17:01 +0100)]
x86/misc: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/traps: use unambiguous register names
Jan Beulich [Wed, 21 Dec 2016 16:01:34 +0000 (17:01 +0100)]
x86/traps: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/vm-event: use unambiguous register names
Jan Beulich [Wed, 21 Dec 2016 16:01:08 +0000 (17:01 +0100)]
x86/vm-event: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/HVM: use unambiguous register names
Jan Beulich [Wed, 21 Dec 2016 16:00:40 +0000 (17:00 +0100)]
x86/HVM: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/oprofile: use unambiguous register names
Jan Beulich [Wed, 21 Dec 2016 15:59:13 +0000 (16:59 +0100)]
x86/oprofile: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: don't unconditionally clear segment bases upon null selector loads
Jan Beulich [Wed, 21 Dec 2016 15:58:20 +0000 (16:58 +0100)]
x86emul: don't unconditionally clear segment bases upon null selector loads

AMD explicitly documents that namely FS and GS don't have their bases
cleared in that case, and I see no reason why guests may not rely on
that behavior. To facilitate this a new input field (the CPU vendor) is
being added.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: some REX related polishing
Jan Beulich [Wed, 21 Dec 2016 15:57:34 +0000 (16:57 +0100)]
x86emul: some REX related polishing

While there are a few cases where it seems better to open-code REX_*
values, there's one where this clearly is a bad idea. And the SYSEXIT
emulation has no need to look at REX at all, it can simply use op_bytes
instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agosched: removal of redundant check in Credit
Praveen Kumar [Wed, 21 Dec 2016 15:53:35 +0000 (16:53 +0100)]
sched: removal of redundant check in Credit

The patch gets rid of a redundant check in csched_vcpu_acct. In fact,
the function is only called from csched_tick, which already checks
that current is not the idle vcpu. The patch also adds an ASSERT to
the same effect, in order to make assumption ( i.e., no calling this
on idle vcpus) even more clear and as a guard for future mis-use.

Signed-off-by: Praveen Kumar <kpraveen.lkml@gmail.com>
Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
8 years agox86/HVM: add missing NULL check before using VMFUNC hook
Jan Beulich [Wed, 21 Dec 2016 15:47:19 +0000 (16:47 +0100)]
x86/HVM: add missing NULL check before using VMFUNC hook

This is CVE-2016-10025 / XSA-203.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: force EFLAGS.IF on when exiting to PV guests
Jan Beulich [Wed, 21 Dec 2016 15:46:13 +0000 (16:46 +0100)]
x86: force EFLAGS.IF on when exiting to PV guests

Guest kernels modifying instructions in the process of being emulated
for another of their vCPU-s may effect EFLAGS.IF to be cleared upon
next exiting to guest context, by converting the being emulated
instruction to CLI (at the right point in time). Prevent any such bad
effects by always forcing EFLAGS.IF on. And to cover hypothetical other
similar issues, also force EFLAGS.{IOPL,NT,VM} to zero.

This is CVE-2016-10024 / XSA-202.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/hvm: Don't emulate all instructions hitting the #UD intercept
Andrew Cooper [Mon, 19 Dec 2016 12:05:20 +0000 (12:05 +0000)]
x86/hvm: Don't emulate all instructions hitting the #UD intercept

Having the instruction emulator fill in all #UDs when using FEP is unhelpful
when trying to test emulation behaviour against hardware.

Restrict emulation from the #UD intercept to the cross-vendor case, and when a
postive Forced Emulation Prefix has been identified.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/emul: Don't opencode CR0_TS in CLTS handling
Andrew Cooper [Mon, 19 Dec 2016 10:19:29 +0000 (10:19 +0000)]
x86/emul: Don't opencode CR0_TS in CLTS handling

Also replace implicit 0 checks with X86EMUL_OKAY

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agoacpi/x86: define ACPI IO registers for PVH guests
Boris Ostrovsky [Tue, 20 Dec 2016 08:54:38 +0000 (09:54 +0100)]
acpi/x86: define ACPI IO registers for PVH guests

Define VCPU available map address (used by AML's PRSC method)
and GPE0 CPU hotplug event number. Use these definitions in mk_dsdt
instead hardcoded values.

These definitions will later be used by both the hypervisor and
the toolstack (initially for PVH guests only), thus they are
placed in public headers.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/pmtimer: move ACPI registers from PMTState to hvm_domain
Boris Ostrovsky [Tue, 20 Dec 2016 08:54:12 +0000 (09:54 +0100)]
x86/pmtimer: move ACPI registers from PMTState to hvm_domain

These registers (pm1a specifically) are not all specific to pm timer
and are accessed by non-pmtimer code (for example, sleep/power button
emulation).

The public name for save state structure is kept as 'pmtimer' to avoid
code churn with the expected changes in migration code. hvm_hw_acpi
name is introduced for internal use but when migration code is updated
hvm_hw_pmtimer will be renamed to hvm_hw_acpi.

No functional changes are introduced.

(While this file is being modified, also add emacs mode style rune)

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agovvmx: replace vmreturn() by vmsucceed() and vmfail*()
Haozhong Zhang [Tue, 20 Dec 2016 08:53:39 +0000 (09:53 +0100)]
vvmx: replace vmreturn() by vmsucceed() and vmfail*()

Replace vmreturn() by vmsucceed(), vmfail(), vmfail_valid() and
vmfail_invalid(), which are consistent to the pseudo code on Intel
SDM, and allow to return VM instruction error numbers to L1
hypervisor.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agovvmx: fix the wrong address width in c/s 08fac63
Haozhong Zhang [Tue, 20 Dec 2016 08:51:45 +0000 (09:51 +0100)]
vvmx: fix the wrong address width in c/s 08fac63

c/s 08fac63 misused v->domain-arch.paging.gfn_bits as the width of
guest physical address and missed adding PAGE_SHIFT to it when
checking vmxon operand.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86emul: check for CMPXCHG8B availability
Jan Beulich [Tue, 20 Dec 2016 08:51:08 +0000 (09:51 +0100)]
x86emul: check for CMPXCHG8B availability

We can't exclude someone wanting to hide the instruction from guests.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: fix asm() constraint in clear_user()
Jan Beulich [Mon, 19 Dec 2016 16:52:42 +0000 (17:52 +0100)]
x86: fix asm() constraint in clear_user()

Commit 2fdf5b2554 ("x86: streamline copying to/from user memory")
wrongly used "g" here, when it obviously needs to be a register.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/emul: Correct the handling of eflags with SYSCALL
Andrew Cooper [Sun, 18 Dec 2016 15:42:59 +0000 (15:42 +0000)]
x86/emul: Correct the handling of eflags with SYSCALL

A singlestep #DB is determined by the resulting eflags value from the
execution of SYSCALL, not the original eflags value.

By using the original eflags value, we negate the guest kernels attempt to
protect itself from a privilege escalation by masking TF.

(re)introduce a singlestep boolean, defaulting to the original eflags state,
but have the SYSCALL emulation recalculate it after masking has occurred.

This is XSA-204

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/SMP: CPU0's scratch mask is needed earlier
Jan Beulich [Mon, 19 Dec 2016 10:49:20 +0000 (11:49 +0100)]
x86/SMP: CPU0's scratch mask is needed earlier

When putting together commit 3b61726458 ("x86: introduce and use
scratch CPU mask") I failed to remember that AMD IOMMU setups needs the
scratch mask prior to smp_prepare_cpus() having run. Use a static mask
for the boot CPU instead.

Note that the definition of scratch_cpu0mask could also be put inside a
"NR_CPUS > 2 * BITS_PER_LONG" conditional, but it seems preferable to
me to carry the extra variable in all cases and avoid the #ifdef-ary.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoxen/arm: Add support for 16 bit VMIDs
Bhupinder Thakur [Fri, 16 Dec 2016 07:16:28 +0000 (12:46 +0530)]
xen/arm: Add support for 16 bit VMIDs

VMID space is increased to 16-bits from 8-bits in ARMv8 8.1 revision.
This allows more than 256 VMs to be supported by Xen.

This change adds support for 16-bit VMIDs in Xen based on whether the
architecture supports it.

Signed-off-by: Bhupinder Thakur <bhupinder.thakur@linaro.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years agoxen/arm: Move p2m_vmid_allocator_init() inside setup_virt_paging()
Bhupinder Thakur [Fri, 16 Dec 2016 07:16:27 +0000 (12:46 +0530)]
xen/arm: Move p2m_vmid_allocator_init() inside setup_virt_paging()

Since VMIDs are related to 2nd stage address translation, it makes more sense
to move the call to p2m_vmid_allocator_init(), which initializes the vmid
allocation bitmap, inside setup_virt_paging(), where 2nd stage address translation
is set up.

Signed-off-by: Bhupinder Thakur <bhupinder.thakur@linaro.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years agolibxl: set rc to 0 in init_acpi_config in success path
Wei Liu [Fri, 16 Dec 2016 15:51:33 +0000 (15:51 +0000)]
libxl: set rc to 0 in init_acpi_config in success path

xc_doamin_getinfo returns >=0 in success path, and if there is no vnode
configured, that rc will be returned to caller, which indicates error.

Fix that by setting rc to 0 in success path.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agox86/emul: Simplfy L{ES,DS,SS,FS,GS} handling
Andrew Cooper [Wed, 14 Dec 2016 11:05:18 +0000 (11:05 +0000)]
x86/emul: Simplfy L{ES,DS,SS,FS,GS} handling

%ss, %fs and %gs can be calculated by directly masking the opcode.  %es and
%ds cant, but the calculation isn't hard.

Use seg rather than dst.val for storing the calculated segment, which is
appropriately typed.  Drop the sel local variable entirely and use dst.val
instead.  The mode_64bit() check can be repositioned and simplified to drop
the ext check.  Replace opencoding of X86EMUL_OKAY.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/HVM: handle_{mmio*,pio}() return value adjustments
Jan Beulich [Fri, 16 Dec 2016 13:38:29 +0000 (14:38 +0100)]
x86/HVM: handle_{mmio*,pio}() return value adjustments

Don't ignore their return values. Don't indicate success to callers of
handle_pio() when in fact the domain has been crashed.

Make all three functions return bool. Adjust formatting of switch()
statements being touched anyway.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
8 years agox86/boot: fix build with certain older gcc versions
Jan Beulich [Fri, 16 Dec 2016 13:37:35 +0000 (14:37 +0100)]
x86/boot: fix build with certain older gcc versions

Despite all attempts so far (ending in commit fecf584294 ["Config.mk:
fix comment for debug option"] adjusting the respective comment),
Config.mk's debug= setting still affects the hypervisor build: CFLAGS
gets -g added there.

xen/arch/x86/boot/build32.mk includes that file, and hence inherits the
setting too. Some gcc versions take -g to create an .eh_frame section
despite -fno-asynchronous-unwind-tables (which instead one would expect
to produce .debug_frame).

In turn, commit 93c0c0287a ("x86/boot: create *.lnk files with linker
script") was - in my understanding - supposed to make sure .text is
first, but apparently it did also not really achieve that effect: Both
reloc.lnk and reloc.bin in the case here ended up with .eh_frame first,
which obviously rendered the whole final binary unusable.

Explicitly suppress generation of any kind of debug info when building
reloc.o.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: CMPXCHG16B requires an aligned operand
Jan Beulich [Fri, 16 Dec 2016 13:37:11 +0000 (14:37 +0100)]
x86emul: CMPXCHG16B requires an aligned operand

This distinguishes it from CMPXCHG8B.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: reduce CMPXCHG{8,16}B footprint and casting
Jan Beulich [Fri, 16 Dec 2016 13:36:36 +0000 (14:36 +0100)]
x86emul: reduce CMPXCHG{8,16}B footprint and casting

Re-use an existing stack variable (reducing stack footprint, which also
results in smaller code due to some stack accesses no longer needing a
32-bit displacement), at once using a union instead of casts. Also
switch to rex_prefix based conditionals instead of op_bytes ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support {RD,WR}{F,G}SBASE
Jan Beulich [Fri, 16 Dec 2016 13:35:58 +0000 (14:35 +0100)]
x86emul: support {RD,WR}{F,G}SBASE

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: introduce and use scratch CPU mask
Jan Beulich [Fri, 16 Dec 2016 13:34:34 +0000 (14:34 +0100)]
x86: introduce and use scratch CPU mask

__get_page_type(), so far using an on-stack CPU mask variable, is
involved in recursion when e.g. pinning page tables. This means there
may be up to five instances of the function active at a time, implying
five instances of the (up to 512 bytes large) CPU mask variable. An IRQ
happening at the deepest point of the stack has been observed to cause
a stack overflow with a 4095-pCPU build, when the IRQ handling results
in send_guest_pirq() being called (leading to vcpu_kick() -> ... ->
csched_vcpu_wake() -> __runq_tickle() -> cpumask_raise_softirq(), the
last two of which also have CPU mask variables on their stacks).

Introduce a per-CPU variable instead, which can then be used by any
code never running in IRQ context.

The mask can then also be used by other MMU code as well as by
msi_compose_msg() (and quite likely we'll find further uses down the
road).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoVT-d: correct dma_msi_set_affinity()
Jan Beulich [Fri, 16 Dec 2016 13:33:43 +0000 (14:33 +0100)]
VT-d: correct dma_msi_set_affinity()

Commit 83cd2038fe ("VT-d: use msi_compose_msg()) together with
15aa6c6748 ("amd iommu: use base platform MSI implementation"),
introducing the use of a per-CPU scratch CPU mask, went too far:
dma_msi_set_affinity() may, at least in theory, be called in
interrupt context, and hence the use of that scratch variable is not
correct.

Since the function overwrites the destination information anyway,
allow msi_compose_msg() to be called with a NULL CPU mask, avoiding
the use of that scratch variable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: streamline copying to/from user memory
Jan Beulich [Fri, 16 Dec 2016 13:32:51 +0000 (14:32 +0100)]
x86: streamline copying to/from user memory

Their size parameters being "unsigned", there's neither a point for
them returning "unsigned long", nor for any of their (assembly)
arithmetic to involved 64-bit operations on other than addresses.

Take the opportunity and fold __do_clear_user() into its single user
(using qword stores instead of dword ones), name all asm() operands,
and reduce the amount of (redundant) operands.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoxsm: allow relevant permission during migrate and gpu-passthrough.
Anshul Makkar [Mon, 12 Dec 2016 14:00:05 +0000 (14:00 +0000)]
xsm: allow relevant permission during migrate and gpu-passthrough.

During guest migrate allow permission to prevent
spurious page faults.
Prevents these errors:
d73: Non-privileged (73) attempt to map I/O space 00000000

avc: denied  { set_misc_info } for domid=0 target=11
scontext=system_u:system_r:dom0_t
tcontext=system_u:system_r:domU_t tclass=domain

GPU passthrough for hvm guest:
avc:  denied  { send_irq } for domid=0 target=10
scontext=system_u:system_r:dom0_t
tcontext=system_u:system_r:domU_t tclass=hvm

Signed-off-by: Anshul Makkar <anshul.makkar@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
8 years agolibxl: init_acpi_config should return rc in exit path
Wei Liu [Wed, 14 Dec 2016 11:44:36 +0000 (11:44 +0000)]
libxl: init_acpi_config should return rc in exit path

... otherwise it returns 0 even if the function fails.

Coverity-ID: 1397121

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools/xenstat: Don't disable xentop when cross-compiling
Edgar E. Iglesias [Thu, 15 Dec 2016 12:36:09 +0000 (13:36 +0100)]
tools/xenstat: Don't disable xentop when cross-compiling

This partially reverts 16504669c5cbb8b195d20412aadc838da5c428f7
since xentop cross-compiles fine.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/xenstat: Remove redundant check for curses.h
Edgar E. Iglesias [Thu, 15 Dec 2016 12:36:08 +0000 (13:36 +0100)]
tools/xenstat: Remove redundant check for curses.h

This check for curses.h does not consider cross-compilation.
It only checks host paths.

Luckily, commit 65da4913214120ddc95bd846cb3649a29f87146a
introduced proper configure checks for ncurses so we can
remove the redundant check in the Makefile.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/paging: Rename paging_mark_pfn_dirty() and use pfn_t
Andrew Cooper [Wed, 14 Dec 2016 14:20:12 +0000 (14:20 +0000)]
x86/paging: Rename paging_mark_pfn_dirty() and use pfn_t

paging_mark_gfn_dirty() actually takes a pfn, even by paramter name.  Rename
the function and alter the type to pfn_t to match.

Push pfn_t into the LOGDIRTY_IDX() macros, and clean up a couple of local
variable types in paging_mark_pfn_dirty().

Leave an explicit comment in vmx_vcpu_flush_pml_buffer() when we intentally
perform a straight conversion from gfn to pfn.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86/paging: Update paging_mark_dirty() to use mfn_t
Andrew Cooper [Wed, 14 Dec 2016 14:13:10 +0000 (14:13 +0000)]
x86/paging: Update paging_mark_dirty() to use mfn_t

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86emul: ignore most segment bases for 64-bit mode in is_aligned()
Jan Beulich [Thu, 15 Dec 2016 10:13:32 +0000 (11:13 +0100)]
x86emul: ignore most segment bases for 64-bit mode in is_aligned()

ops->read_segment() will report whatever is actually there in the
register, so we need to actively distinguish ES/CS/SS/DS from FS/GS.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agonestedhvm: replace VMCX_EADDR by INVALID_PADDR
Haozhong Zhang [Thu, 15 Dec 2016 10:12:34 +0000 (11:12 +0100)]
nestedhvm: replace VMCX_EADDR by INVALID_PADDR

... because INVALID_PADDR is a more general one.

Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agovvmx: check the operand of L1 vmxon
Haozhong Zhang [Thu, 15 Dec 2016 10:12:06 +0000 (11:12 +0100)]
vvmx: check the operand of L1 vmxon

Check whether the operand of L1 vmxon is a valid VMXON region address
and whether the VMXON region at that address contains a valid revision
ID.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agovvmx: return VMfail to L1 if L1 vmxon is executed in VMX operation
Haozhong Zhang [Thu, 15 Dec 2016 10:11:45 +0000 (11:11 +0100)]
vvmx: return VMfail to L1 if L1 vmxon is executed in VMX operation

According to Intel SDM, section "VMXON - Enter VMX Operation", a
VMfail should be returned to L1 hypervisor if L1 vmxon is executed in
VMX operation, rather than just print a warning message.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agovvmx: set vmxon_region_pa of vcpu out of VMX operation to an invalid address
Haozhong Zhang [Thu, 15 Dec 2016 10:11:20 +0000 (11:11 +0100)]
vvmx: set vmxon_region_pa of vcpu out of VMX operation to an invalid address

nvmx_handle_vmxon() previously checks whether a vcpu is in VMX
operation by comparing its vmxon_region_pa with GPA 0. However, 0 is
also a valid VMXON region address. If L1 hypervisor had set the VMXON
region address to 0, the check in nvmx_handle_vmxon() will be skipped.
Fix this problem by using an invalid VMXON region address for vcpu
out of VMX operation.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/vm_event: add support for VM_EVENT_REASON_INTERRUPT
Razvan Cojocaru [Thu, 15 Dec 2016 10:09:03 +0000 (11:09 +0100)]
x86/vm_event: add support for VM_EVENT_REASON_INTERRUPT

Added support for a new event type, VM_EVENT_REASON_INTERRUPT,
which is now fired in a one-shot manner when enabled via the new
VM_EVENT_FLAG_GET_NEXT_INTERRUPT vm_event response flag.
The patch also fixes the behaviour of the xc_hvm_inject_trap()
hypercall, which would lead to non-architectural interrupts
overwriting pending (specifically reinjected) architectural ones.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agox86/HVM: introduce hvm_get_cpl() and respective hook
Jan Beulich [Thu, 15 Dec 2016 10:07:55 +0000 (11:07 +0100)]
x86/HVM: introduce hvm_get_cpl() and respective hook

... instead of repeating the same code in various places (and getting
it  wrong in some of them).

In vmx_inst_check_privilege() also stop open coding
vmx_guest_x86_mode().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agotools/livepatch: Exit with 2 if a timeout occurs
Ross Lagerwall [Wed, 14 Dec 2016 07:52:00 +0000 (07:52 +0000)]
tools/livepatch: Exit with 2 if a timeout occurs

Exit with 0 for success.
Exit with 1 for an error.
Exit with 2 if the operation should be retried for any reason (e.g. a
timeout or because another operation was in progress).

This allows a program or script driving xen-livepatch to determine if
the operation should be retried without parsing the output.

Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agotools/livepatch: Save errno where needed
Ross Lagerwall [Wed, 14 Dec 2016 07:51:59 +0000 (07:51 +0000)]
tools/livepatch: Save errno where needed

Fix a number of incorrect uses of errno after an operation that could
set it (e.g. fprintf, close).

Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agotools/livepatch: Remove unused struct member
Ross Lagerwall [Wed, 14 Dec 2016 07:51:58 +0000 (07:51 +0000)]
tools/livepatch: Remove unused struct member

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years agotools/livepatch: Remove pointless retry loop
Ross Lagerwall [Wed, 14 Dec 2016 07:51:57 +0000 (07:51 +0000)]
tools/livepatch: Remove pointless retry loop

The default timeout in the hypervisor for a livepatch operation is 30 ms,
but xen-livepatch currently waits for up to 30 seconds for the operation
to complete. Instead, remove the retry loop and simply wait for 2 * 30 ms
for the operation to complete. The extra period is to account for the
time to actually start the operation.

Furthermore, have xen-livepatch set the hypervisor timeout rather than
relying on the hypervisor default since the tool doesn't know how long
it will be. Use nanosleep rather than usleep since usleep has been
removed from POSIX.1-2008.

Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agolivepatch: Fix documentation of timeout
Ross Lagerwall [Wed, 14 Dec 2016 07:51:56 +0000 (07:51 +0000)]
livepatch: Fix documentation of timeout

The hypervisor expects the timeout from the hypercall to be in
nanoseconds, so document this correctly. Also correctly document
what happens when timeout is set to zero.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agotools/livepatch: Improve output
Ross Lagerwall [Wed, 14 Dec 2016 07:51:55 +0000 (07:51 +0000)]
tools/livepatch: Improve output

Improving the output of xen-livepatch, which is currently hard to read,
especially when an error occurs.

Some examples of the changes:
Before:
    $ xen-livepatch apply test
    Performing apply:. completed
After:
    $ xen-livepatch apply test
    Applying test:. completed

Before:
    $ xen-livepatch apply test2
    test2 failed with 22(Invalid argument)
    Performing apply: (no newline)
After:
    $ xen-livepatch apply test2
    Applying test2: failed
    Error 22: Invalid argument

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agotools/livepatch: Set stdout and stderr unbuffered
Ross Lagerwall [Wed, 14 Dec 2016 07:51:54 +0000 (07:51 +0000)]
tools/livepatch: Set stdout and stderr unbuffered

Using both stdout and stderr interleaved without newlines can result in
strange output when using line buffered mode (e.g. a terminal) or when
fully buffered (e.g. redirected to a file). Set stdout to unbuffered mode
to fix this (stderr is always unbuffered by default).

Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agotools/livepatch: Show the correct expected state before action
Ross Lagerwall [Wed, 14 Dec 2016 07:51:53 +0000 (07:51 +0000)]
tools/livepatch: Show the correct expected state before action

Somewhat confusingly, before the action has been executed the patch is
expected to be in the "allow" state, not the "expected" state.  The
check for this was correct but the subsequent error message was not.
Fix the error message to show this state correctly.

Before:
    $ xen-livepatch unload test
    test: in wrong state (APPLIED), expected (unknown)
After:
    $ xen-livepatch unload test
    test: in wrong state (APPLIED), expected (CHECKED)

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agox86/traps: Correct pagefault handling issues introduced in c/s d5c251c
Andrew Cooper [Wed, 14 Dec 2016 11:33:17 +0000 (11:33 +0000)]
x86/traps: Correct pagefault handling issues introduced in c/s d5c251c

There are two bugs.

Firstly, the ASSERT(paging_mode_only_log_dirty(d)) can trip when servicing a
hypervisor #PF in the context of an HVM guest, e.g. a copy_to_user() failure
in the shadow pagetable code.

Secondly, the entry conditions paging_fault() were previously guarded on
!paging_mode_external(d) which limited entry to PV contexts, but for both
guest and hypervisor faults.  Switching this to paging_mode_log_dirty() opened
it up to HVM contexts as well.

Reinstate the old !paging_mode_external(d) check, as it is actually the
relevent fact, and extend the comment to explicitly state that hypervisor
faults should follow this path.

Inside, we are now guarenteed to be in the context of a PV guest, so can
safely use the assertion about log dirty.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>