xenbits.xensource.com Git - xen.git/log
10 years ago x86: cleanup usage of nmi_watchdog
Boris Ostrovsky [Tue, 3 Feb 2015 10:29:28 +0000 (11:29 +0100)]
x86: cleanup usage of nmi_watchdog

Use NMI_NONE when testing whether NMI watchdog is off.

Remove unused NMI_INVALID macro.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago x86/mm: introduce a p2m class
Ed White [Tue, 3 Feb 2015 10:27:46 +0000 (11:27 +0100)]
x86/mm: introduce a p2m class

Use the class to differentiate between host and nested p2m's, and
potentially other classes in the future.

Fix p2m class checks that implicitly assume nested and host are
the only two classes that will ever exist.

Signed-off-by: Ed White <edmund.h.white@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
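
The class-check fix in the commit above can be illustrated with a minimal sketch. The enum and function names below are hypothetical (they are not taken from the Xen source), and the third class only stands in for "potentially other classes in the future":

```c
#include <assert.h>

/* Illustrative class tags; the third entry is a hypothetical future class. */
enum p2m_class_tag { P2M_HOST, P2M_NESTED, P2M_FUTURE };

/* Fragile: treats "not host" as "nested", which only holds with two classes. */
static int is_nested_implicit(enum p2m_class_tag c)
{
    return c != P2M_HOST;
}

/* Explicit: names the class it means, so new classes are not misclassified. */
static int is_nested_explicit(enum p2m_class_tag c)
{
    assert(c == P2M_HOST || c == P2M_NESTED || c == P2M_FUTURE);
    return c == P2M_NESTED;
}
```

With only two classes both checks agree; they diverge as soon as a third class exists, which is the situation the commit guards against.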
10 years ago time: widen wallclock seconds to 64 bits
Jan Beulich [Tue, 3 Feb 2015 10:25:47 +0000 (11:25 +0100)]
time: widen wallclock seconds to 64 bits

Linux is in the process of converting its seconds representation to
64 bits, so in order to support it consistently we should follow suit
(which at some point in quite a few years we'd have to do anyway). To
represent this in struct shared_info we leverage a 32-bit hole in
x86-64's and arm's variant of the structure; for x86-32 guests the only
(reasonable) choice we have is to put the extension in struct
arch_shared_info.

A note on the conditional suppressing the xen_wc_sec_hi helper macro
definition in the ix86 case for hypervisor and tools: Neither of the
two actually needs this, and its presence causes the tools to fail to
build (due to the inclusion of both the x86-64 and x86-32 variants of
the header).

As a secondary change, x86's do_platform_op() gets a pointless
initializer as well as a pointless assignment of that same variable
dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
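
As a rough illustration of the widening described in the commit above, a consumer could combine the existing 32-bit seconds field with the new upper half as sketched below. The struct and field names (`wc_sample`, `sec_lo`, `sec_hi`) are hypothetical stand-ins, not the actual struct shared_info layout:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the two halves of the wallclock seconds value. */
struct wc_sample {
    uint32_t sec_lo;  /* existing 32-bit seconds field */
    uint32_t sec_hi;  /* new upper 32 bits */
};

/* Combine both halves into one 64-bit seconds count. */
static uint64_t wc_seconds(const struct wc_sample *wc)
{
    assert(wc != NULL);
    /* Widen before shifting so the upper half is not lost. */
    return ((uint64_t)wc->sec_hi << 32) | wc->sec_lo;
}
```

A guest that only knows the 32-bit field keeps reading `sec_lo` unchanged, which is presumably why the extension is placed in previously unused space.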
10 years ago QEMU_TAG update
Ian Jackson [Mon, 2 Feb 2015 17:11:56 +0000 (17:11 +0000)]
QEMU_TAG update

10 years ago ocaml/xenctrl: Fix stub_xc_readconsolering()
Andrew Cooper [Fri, 30 Jan 2015 14:11:14 +0000 (14:11 +0000)]
ocaml/xenctrl: Fix stub_xc_readconsolering()

The Ocaml stub to retrieve the hypervisor console ring had a few problems.

 * A single 32k buffer would truncate a large console ring.
 * The buffer was static and not under the protection of the Ocaml GC lock so
   could be clobbered by concurrent accesses.
 * Embedded NUL characters would cause caml_copy_string() (which is strlen()
   based) to truncate the buffer.

The function is rewritten from scratch, using the same algorithm as the python
stubs, but uses the protection of the Ocaml GC lock to maintain a static
running total of the ring size, to avoid redundant realloc()ing in future
calls.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Dave Scott <dave.scott@eu.citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: David Scott <dave.scott@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
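
The embedded-NUL problem listed in the commit above is generic to any strlen()-based copy. A minimal sketch in plain C, using hypothetical helper names and no OCaml runtime calls:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strlen()-based copy: stops at the first NUL byte in the source. */
static size_t copy_by_strlen(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t n = strlen(src);
    memcpy(dst, src, n);
    return n;
}

/* Length-based copy: copies exactly len bytes, embedded NULs included. */
static size_t copy_by_length(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
    return len;
}
```

For a buffer such as "ab\0cd" the first helper copies two bytes and the second copies all five, which is why a length-aware allocation is needed for console ring contents.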
10 years ago ocaml/xenctrl: Make failwith_xc() thread safe
Andrew Cooper [Wed, 28 Jan 2015 17:55:32 +0000 (17:55 +0000)]
ocaml/xenctrl: Make failwith_xc() thread safe

The static error_str[] buffer is not thread-safe, and 1024 bytes is
unreasonably large.  Reduce to 256 bytes (which is still much larger than any
current use), and move it to being a stack variable.

Also, propagate the Noreturn attribute from caml_raise_with_string().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Dave Scott <Dave.Scott@eu.citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: David Scott <dave.scott@citrix.com>
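
The thread-safety change in the commit above follows a common pattern: format into a buffer owned by the caller's stack frame instead of a shared static one. A minimal sketch, where `format_error` and `ERR_BUF_LEN` are hypothetical names and only the 256-byte size comes from the commit message:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define ERR_BUF_LEN 256  /* size mentioned in the commit message */

/* Format into a caller-provided buffer rather than a shared static one,
 * so concurrent callers cannot overwrite each other's message. */
static int format_error(char *buf, size_t len, int code, const char *msg)
{
    assert(buf != NULL && msg != NULL && len > 0);
    return snprintf(buf, len, "%d: %s", code, msg);
}
```

Each caller declares `char buf[ERR_BUF_LEN];` locally, so every thread formats into its own storage.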
10 years ago tools/xenctrl: correct some function declarations
Tiejun Chen [Fri, 30 Jan 2015 07:32:26 +0000 (15:32 +0800)]
tools/xenctrl: correct some function declarations

When commit 6865e52b78f4, "PCI multi-seg: adjust domctl interface",
was introduced, we failed to sync that header file.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
10 years ago tools/libxc: Don't leave scratch_pfn uninitialised if the domain has no memory
Andrew Cooper [Wed, 28 Jan 2015 15:52:35 +0000 (15:52 +0000)]
tools/libxc: Don't leave scratch_pfn uninitialised if the domain has no memory

c/s 5b5c40c0d1 "libxc: introduce a per architecture scratch pfn for temporary
grant mapping" accidentally introduced an issue whereby there were two paths
out of xc_core_arch_get_scratch_gpfn() which returned 0, but only one of which
assigned a value to the gpfn parameter.

xc_domain_maximum_gpfn() can validly return 0, at which point gpfn 1 is a
valid scratch page to use.

In addition, widen rc before adding 1 and possibly overflowing.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Julien Grall <julien.grall@linaro.org>
CC: Jan Beulich <JBeulich@suse.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
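
The two points in the commit above (assign the output on every success path, and widen before adding 1) can be sketched as follows. The helper name and signature are hypothetical and do not mirror the libxc function:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: derive a scratch frame number from a signed
 * "maximum gpfn" result, where negative values stand for errors. */
static int scratch_gpfn_from_max(int max_gpfn, uint64_t *gpfn)
{
    assert(gpfn != NULL);
    if (max_gpfn < 0)
        return max_gpfn;            /* propagate the error, *gpfn untouched */

    /* Widen first: (uint64_t)max_gpfn + 1 cannot overflow, whereas
     * max_gpfn + 1 in int arithmetic would for INT_MAX. */
    *gpfn = (uint64_t)max_gpfn + 1;
    return 0;                       /* every success path has assigned *gpfn */
}
```

A maximum gpfn of 0 is treated as a valid result here, so the scratch frame becomes 1, matching the case the commit message calls out.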
10 years ago libxl_set_memory_target: only remove videoram from absolute targets
Stefano Stabellini [Mon, 26 Jan 2015 16:47:11 +0000 (16:47 +0000)]
libxl_set_memory_target: only remove videoram from absolute targets

If the new target is relative to the current target, do not remove
videoram again: it has already been removed from the current target.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
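
The arithmetic behind the commit above can be sketched like this. It is a simplified illustration with hypothetical names, assuming the current target already excludes videoram as the commit message states; it is not the libxl code:

```c
#include <assert.h>

/* Compute a new memory target in KiB.
 * relative != 0: request_kb is a delta applied to the current target.
 * relative == 0: request_kb is an absolute amount that still includes videoram. */
static long new_target_kb(long current_kb, long request_kb,
                          int relative, long videoram_kb)
{
    assert(videoram_kb >= 0);
    if (relative)
        return current_kb + request_kb;   /* videoram already removed earlier */
    return request_kb - videoram_kb;      /* remove videoram exactly once */
}
```

Subtracting videoram in the relative case as well would shrink the target by that amount on every relative adjustment.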
10 years ago xen/arm: vgic-v2: message in the emulation code should be rate-limited
Julien Grall [Mon, 19 Jan 2015 12:59:42 +0000 (12:59 +0000)]
xen/arm: vgic-v2: message in the emulation code should be rate-limited

printk is not rate-limited by default. Therefore a malicious guest may
be able to flood the Xen console.

If we use gdprintk, unnecessary information will be printed such as the
filename and the line. Instead use XENLOG_G_ERR combined with %pv.

This is XSA-118.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
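
To show why rate limiting bounds what a guest can push to the console, here is a minimal fixed-window limiter. It is a generic sketch with hypothetical names and is not how Xen implements its guest log rate limiting:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limiter state: at most `burst` messages per `interval_ms`. */
struct msg_ratelimit {
    uint64_t window_start_ms;
    uint64_t interval_ms;
    unsigned int burst;
    unsigned int printed;
};

/* Returns true if a message may be emitted at time now_ms. */
static bool msg_ratelimit_allow(struct msg_ratelimit *rl, uint64_t now_ms)
{
    assert(rl != NULL);
    if (now_ms - rl->window_start_ms >= rl->interval_ms) {
        rl->window_start_ms = now_ms;   /* start a new window */
        rl->printed = 0;
    }
    if (rl->printed >= rl->burst)
        return false;                   /* budget for this window is spent */
    rl->printed++;
    return true;
}
```

However often the guest triggers the message, the console sees at most `burst` lines per window.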
10 years ago xen/arm: vgic-v3: message in the emulation code should be rate-limited
Julien Grall [Mon, 19 Jan 2015 14:01:09 +0000 (14:01 +0000)]
xen/arm: vgic-v3: message in the emulation code should be rate-limited

printk is not rate-limited by default. Therefore a malicious guest
may be able to flood the Xen console.

If we use gdprintk, unnecessary information will be printed such as the
filename and the line. Instead use XENLOG_G_{ERR,DEBUG} combined with %pv.

Also remove the vGICv3 prefix which is not necessary and update some
messages which were wrong.

This is XSA-118.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago gnttab: fix a printk() format specifier
Jan Beulich [Thu, 29 Jan 2015 14:57:11 +0000 (15:57 +0100)]
gnttab: fix a printk() format specifier

... to fix arm32 build.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years ago random: add missing include xen/cache.h
Julien Grall [Thu, 29 Jan 2015 14:50:00 +0000 (15:50 +0100)]
random: add missing include xen/cache.h

The commit f6c9698 "x86: allow reading MSR_IA32_TSC with XENPF_resource_op"
introduced a build regression on ARM platforms.

random.c:8:28: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘boot_random’
 unsigned int __read_mostly boot_random;
                            ^
The macro __read_mostly is defined in asm/cache.h which is included by
other headers on x86 but not on ARM. Include xen/cache.h to fix the
build.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago x86/shadow: use shorter constants for callback masks
Jan Beulich [Thu, 29 Jan 2015 13:46:09 +0000 (13:46 +0000)]
x86/shadow: use shorter constants for callback masks

With private.h defining them, I can't see why they couldn't be used here to
make the code easier to read.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years ago x86/shadow: adjust mask shadow_audit_tables() passes to hash_foreach()
Jan Beulich [Thu, 29 Jan 2015 13:42:20 +0000 (13:42 +0000)]
x86/shadow: adjust mask shadow_audit_tables() passes to hash_foreach()

It so far having been ~1 made most of the code preceding the call
pointless, but I assume this wasn't meant to be that way. Also replace
the remaining hard coded ~1 with an expression documenting the
intention a little better.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Adjust again to use SHF_page_type_mask, at Jan's suggestion.

Signed-off-by: Tim Deegan <tim@xen.org>
10 years ago x86/shadow: convert non-const statics
Jan Beulich [Thu, 29 Jan 2015 13:40:40 +0000 (13:40 +0000)]
x86/shadow: convert non-const statics

To make obvious that such statics are safe to use, they should be
const. In some of the cases, they wouldn't even need to be static, but
keep them so upon the maintainer's request.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years ago x86: support SMBIOS v3
Jan Beulich [Thu, 29 Jan 2015 13:24:04 +0000 (14:24 +0100)]
x86: support SMBIOS v3

While presumably of primary use to ARM64 (once the code gets
generalized), we should still support this more modern variant,
allowing for the actual DMI data to reside in memory above 4GB.

While based on draft version 3.0.0d, it is assumed that the final
version of the specification will not render this implementation
invalid (not least because Linux 3.19 already makes the same
assumption).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago grant-table: defer releasing pages acquired in a grant copy
David Vrabel [Thu, 29 Jan 2015 13:22:22 +0000 (14:22 +0100)]
grant-table: defer releasing pages acquired in a grant copy

Acquiring a page for the source or destination of a grant copy is an
expensive operation.  A common use case is for two adjacent grant copy
ops to operate on either the same source or the same destination page.

Instead of always acquiring and releasing destination and source pages
for each operation, release the page once it is no longer valid for
the next op.

If either the source or destination domain changes, both pages are
released as it is unlikely that either will still be valid.

XenServer's performance benchmarks show modest improvements in network
receive throughput (netback uses grant copy in the guest Rx path) and
no regressions in disk performance (using tapdisk3 which grant copies
as the backend).

                         Baseline   Deferred Release
Interhost receive to VM  7.2 Gb/s   ~9 Gb/s
Interhost aggregate      24 Gb/s    28 Gb/s
Intrahost single stream  14 Gb/s    14 Gb/s
Intrahost aggregate      34 Gb/s    36 Gb/s
Aggregate disk write     900 MB/s   900 MB/s
Aggregate disk read      890 MB/s   890 MB/s

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
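
The deferred-release idea in the commit above amounts to a one-entry cache per buffer. The sketch below uses hypothetical names and counts acquisitions and releases instead of taking real page references, purely to show the control flow:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-entry cache for the page used by the previous copy op. */
struct held_page {
    bool held;
    int domid;
    unsigned long gfn;
    unsigned int acquires;  /* how many (expensive) acquisitions happened */
    unsigned int releases;  /* how many releases happened */
};

/* Make (domid, gfn) the held page, reusing the previous one when it matches. */
static void held_page_use(struct held_page *hp, int domid, unsigned long gfn)
{
    assert(hp != NULL);
    if (hp->held && hp->domid == domid && hp->gfn == gfn)
        return;                 /* same page as the previous op: reuse it */
    if (hp->held)
        hp->releases++;         /* previous page is no longer valid: release */
    hp->held = true;
    hp->domid = domid;
    hp->gfn = gfn;
    hp->acquires++;
}
```

Two adjacent ops on the same page cost one acquisition instead of two; a final release after the last op in the batch is still required and is omitted here.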
10 years ago grant-table: refactor grant copy to reduce duplicate code
David Vrabel [Thu, 29 Jan 2015 13:21:00 +0000 (14:21 +0100)]
grant-table: refactor grant copy to reduce duplicate code

Much of the grant copy operation is identical for the source and
destination buffers.  Refactor the code into per-buffer functions.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years ago x86/shadow: make some log-dirty handling functions static
Jan Beulich [Thu, 29 Jan 2015 11:18:32 +0000 (11:18 +0000)]
x86/shadow: make some log-dirty handling functions static

Noticed while introducing the stub replacement for disabling shadow
paging support at build time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years ago bunzip2: off by one in get_next_block()
Dan Carpenter [Wed, 28 Jan 2015 15:50:08 +0000 (16:50 +0100)]
bunzip2: off by one in get_next_block()

"origPtr" is used as an offset into the bd->dbuf[] array.  That array is
allocated in start_bunzip() and has "bd->dbufSize" number of elements so
the test here should be >= instead of >.

Later we check "origPtr" again before using it as an offset so I don't
know if this bug can be triggered in real life.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Trivial adjustments to make the respective Linux commit
b5c8afe5be51078a979d86ae5ae78c4ac948063d apply to Xen.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
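
The off-by-one in the commit above is the usual bounds-check boundary: for an array of N elements the valid offsets are 0 to N-1, so N itself must be rejected. A minimal sketch with a hypothetical helper:

```c
#include <assert.h>
#include <stddef.h>

/* Fetch buf[idx] only if idx is a valid offset into an array of `size`
 * elements. Returns 0 on success, -1 if idx is out of range. */
static int checked_get(const unsigned int *buf, unsigned int size,
                       unsigned int idx, unsigned int *out)
{
    assert(buf != NULL && out != NULL);
    if (idx >= size)   /* "idx > size" would wrongly accept idx == size */
        return -1;
    *out = buf[idx];
    return 0;
}
```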
10 years ago x86: skip further initialization for idle domains
Jan Beulich [Wed, 28 Jan 2015 15:38:20 +0000 (16:38 +0100)]
x86: skip further initialization for idle domains

While in the end not really found necessary, early versions of the
patches to follow pointed out that we needlessly set up paging for idle
domains. Arranging for that to be skipped made me notice that we can at
once skip vMCE setup for them. Leverage the adjustment to further
re-arrange the way FPU setup gets skipped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago x86: allow reading MSR_IA32_TSC with XENPF_resource_op
Chao Peng [Wed, 28 Jan 2015 15:33:01 +0000 (16:33 +0100)]
x86: allow reading MSR_IA32_TSC with XENPF_resource_op

Memory bandwidth monitoring requires system time information returned
along with the monitoring counter to verify the correctness of the
counter value and to calculate the time elapsed between two samplings.

Add MSR_IA32_TSC to the read path; it returns scaled system time (ns)
instead of the raw timestamp to eliminate the need to convert. The returned
time is obfuscated with the boot-time random value to eliminate potential
abuse of it. RESOURCE_ACCESS_MAX_ENTRIES is also increased to 3 so MSR_IA32_TSC
can be used together with an MSR write/read operation pair.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Fix uninitialized variable build error.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years ago kexec: prefer __copy_to_guest() when possible
Jan Beulich [Wed, 28 Jan 2015 15:32:01 +0000 (16:32 +0100)]
kexec: prefer __copy_to_guest() when possible

It's slightly cheaper and safe as long as a copy_from_guest() for the same
guest address range was issued before.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
10 years ago docs/commandline: correct information for 'x2apic_phys' parameter
Andrew Cooper [Wed, 28 Jan 2015 15:31:07 +0000 (16:31 +0100)]
docs/commandline: correct information for 'x2apic_phys' parameter

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years ago x86: also use tzcnt instead of bsf in __scanbit()
Jan Beulich [Wed, 28 Jan 2015 15:29:46 +0000 (16:29 +0100)]
x86: also use tzcnt instead of bsf in __scanbit()

... when available, i.e. by runtime patching. This saves the
conditional move, having a back-to-back dependency on BSF's (EFLAGS)
result.

The need to include asm/cpufeatures.h from asm/bitops.h requires a
workaround for an otherwise resulting circular header file dependency:
Provide a mode by which the including site of the former header can
request to only get the X86_FEATURE_* defines (and very little more)
from it, allowing it to nevertheless be included in its entirety later
on.

While doing this I also noticed that the function's "max" parameter was
pointlessly "unsigned long" - the function only returning
"unsigned int", this can't be of any use, and hence gets converted at
once, along with the necessary adjustments to CMOVZ's output operands.

Note that while only alternative_io() is needed by this change (and
hence gets pulled over from Linux), for completeness its input-only
counterpart alternative_input() gets added as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
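
The semantics that the commit above implements in inline assembly can be shown in portable C. This is only a behavioural sketch with a hypothetical name, not the patched Xen code: the function returns the index of the lowest set bit, or `max` when no bit is set, which is the case where BSF needs the extra conditional move.

```c
#include <assert.h>

/* Return the index of the lowest set bit in val, or max if val is zero. */
static unsigned int scanbit_sketch(unsigned long val, unsigned int max)
{
    unsigned int i = 0;

    if (val == 0)
        return max;            /* the case BSF alone cannot express */
    while (!(val & 1UL)) {     /* shift until the lowest set bit reaches bit 0 */
        val >>= 1;
        i++;
    }
    return i;
}

/* Small usage check of the three interesting cases. */
static void scanbit_sketch_demo(void)
{
    assert(scanbit_sketch(0UL, 64) == 64);
    assert(scanbit_sketch(1UL, 64) == 0);
    assert(scanbit_sketch(8UL, 64) == 3);
}
```

Both the result and `max` fit in `unsigned int` here, in line with the commit's observation that an `unsigned long` parameter brings no benefit.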
10 years ago libxl: correct function name
Wei Liu [Wed, 28 Jan 2015 13:26:21 +0000 (13:26 +0000)]
libxl: correct function name

spaw_ -> spawn_

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago tools/libxc: Disable CONFIG_MIGRATE in stubdom environments
Andrew Cooper [Tue, 27 Jan 2015 16:58:06 +0000 (16:58 +0000)]
tools/libxc: Disable CONFIG_MIGRATE in stubdom environments

The legacy save/restore infrastructure requires several function pointers from
the toolstack (libxl or Xend in the past) in order to work, and for HVM guests
also needs to be able to play around in dom0's filesystem to move the device
model save record.

Migration v2 changes some of this, but is similarly dependent on
toolstack-provided function pointers.

Someone who wishes to re-architect the interaction of moving parts for running
a domain might be in a position to re-enable this, but for now, explicitly
fail with ENOSYS (from xc_nomigrate.c) rather than failing with an error about
a missing function pointer (or indeed falling over a NULL pointer on certain
paths).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago ocaml/xenctrl: Check return values from hypercalls
Andrew Cooper [Tue, 27 Jan 2015 20:38:11 +0000 (20:38 +0000)]
ocaml/xenctrl: Check return values from hypercalls

rather than blindly continuing and possibly using negative values.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Dave Scott <dave.scott@eu.citrix.com>
Acked-by: David Scott <dave.scott@citrix.com>
10 years ago xen/arm: split the init_xen_time() in 2 parts
Oleksandr Tyshchenko [Wed, 28 Jan 2015 10:54:41 +0000 (12:54 +0200)]
xen/arm: split the init_xen_time() in 2 parts

Create preinit_xen_time() and move to it the minimum required
subset of operations needed to properly initialize the
cpu_khz and boot_count vars. This allows us to use udelay()
immediately after the call.

Signed-off-by: Oleksandr Tyshchenko <oleksandr.tyshchenko@globallogic.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago vTPM/TPM2: Record some information in docs/misc/vtpmmgr.txt about
Quan Xu [Thu, 15 Jan 2015 09:21:53 +0000 (04:21 -0500)]
vTPM/TPM2: Record some information in docs/misc/vtpmmgr.txt about

'vtpmmgr on TPM 2.0'

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Unbind group keys and sectors data on disk
Quan Xu [Thu, 15 Jan 2015 09:21:52 +0000 (04:21 -0500)]
vTPM/TPM2: Unbind group keys and sectors data on disk

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Bind group keys and sectors data on disk
Quan Xu [Thu, 15 Jan 2015 09:21:51 +0000 (04:21 -0500)]
vTPM/TPM2: Bind group keys and sectors data on disk

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Support TPM 2.0 bind and unbind data
Quan Xu [Thu, 15 Jan 2015 09:21:50 +0000 (04:21 -0500)]
vTPM/TPM2: Support TPM 2.0 bind and unbind data

Bind data with TPM2_RSA_Encrypt, which performs RSA encryption using
the indicated padding scheme according to PKCS#1 v2.1 (PKCS#1). If the
scheme of keyHandle is TPM_ALG_NULL, then the caller may use inScheme
to specify the padding scheme.
Unbind data with TPM2_RSA_Decrypt, which performs RSA decryption using
the indicated padding scheme according to PKCS#1 v2.1 (PKCS#1).

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: TPM 2.0 PCRs read
Quan Xu [Thu, 15 Jan 2015 09:21:49 +0000 (04:21 -0500)]
vTPM/TPM2: TPM 2.0 PCRs read

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Support 'tpm2' extra command line.
Quan Xu [Thu, 15 Jan 2015 09:21:48 +0000 (04:21 -0500)]
vTPM/TPM2: Support 'tpm2' extra command line.

Make the vtpm-stubdom domain able to launch on both TPM 1.x and TPM 2.0.
Add:
..
     extra="tpm2=1"
..
to launch the vtpm-stubdom domain on TPM 2.0; omit it on TPM 1.x. For
example,
vtpm-stubdom domain configuration on TPM 2.0:

  kernel="/usr/lib/xen/boot/vtpmmgr-stubdom.gz"
  memory=16
  disk=["file:/var/scale/vdisk/vmgr,hda,w"]
  name="vtpmmgr"
  iomem=["fed40,5"]
  extra="tpm2=1"

vtpm-stubdom domain configuration on TPM 1.x:

  kernel="/usr/lib/xen/boot/vtpmmgr-stubdom.gz"
  memory=16
  disk=["file:/var/scale/vdisk/vmgr,hda,w"]
  name="vtpmmgr"
  iomem=["fed40,5"]

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Add main entrance vtpmmgr2_init()
Quan Xu [Thu, 15 Jan 2015 09:21:47 +0000 (04:21 -0500)]
vTPM/TPM2: Add main entrance vtpmmgr2_init()

Accept commands from the vtpm-stubdom domains via the mini-os TPM
backend driver. The vTPM manager communicates directly with hardware
TPM 2.0 using the mini-os tpm2_tis driver.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: TPM2.0 TIS initialization and self test.
Quan Xu [Thu, 15 Jan 2015 09:21:46 +0000 (04:21 -0500)]
vTPM/TPM2: TPM2.0 TIS initialization and self test.

Access the various TPM 2.0 registers that allow communication between
the TPM 2.0 and platform hardware and software. TPM2_SelfTest causes
the TPM 2.0 to perform a test of its capabilities.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Create and load SK on TPM 2.0
Quan Xu [Thu, 15 Jan 2015 09:21:45 +0000 (04:21 -0500)]
vTPM/TPM2: Create and load SK on TPM 2.0

TPM2_Create is used to create an object that can be loaded into a
TPM using TPM2_Load(). If the command completes successfully, the
TPM will create the new object and return the object’s creation
data (creationData), its public area (outPublic), and its encrypted
sensitive area (outPrivate). Preservation of the returned data is
the responsibility of the caller. The object will need to be loaded
(TPM2_Load()).
TPM2_Load is used to load objects into the TPM. This command is used
when both a TPM2B_PUBLIC and TPM2B_PRIVATE are to be loaded. If only
a TPM2B_PUBLIC is to be loaded, the TPM2_LoadExternal command is used.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: TPM 2.0 takes ownership and create SRK
Quan Xu [Thu, 15 Jan 2015 09:21:44 +0000 (04:21 -0500)]
vTPM/TPM2: TPM 2.0 takes ownership and create SRK

TPM2_CreatePrimary is used to create a Primary Object under one of
the Primary Seeds or a Temporary Object under TPM_RH_NULL. The command
uses a TPM2B_PUBLIC as a template for the object to be created. The
command will create and load a Primary Object. The sensitive area is
not returned. Any type of object and attributes combination that is
allowed by TPM2_Create() may be created by this command. The constraints
on templates and parameters are the same as TPM2_Create() except that a
Primary Storage Key and a Temporary Storage Key are not constrained to
use the algorithms of their parents.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Add TPM 2.0 Exposed APIs
Quan Xu [Thu, 15 Jan 2015 09:21:43 +0000 (04:21 -0500)]
vTPM/TPM2: Add TPM 2.0 Exposed APIs

These are the TPM 2.0 exposed APIs that let Mini-OS access TPM 2.0
hardware.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Add global data in vtpm_globals{}
Quan Xu [Thu, 15 Jan 2015 09:21:42 +0000 (04:21 -0500)]
vTPM/TPM2: Add global data in vtpm_globals{}

This data is for Mini-OS to access TPM 2.0 hardware.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: TPM 2.0 data structures marshal
Quan Xu [Thu, 15 Jan 2015 09:21:41 +0000 (04:21 -0500)]
vTPM/TPM2: TPM 2.0 data structures marshal

Add TPM 2.0 marshalling code for packing and unpacking TPM
2.0 data structures.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago vTPM/TPM2: Add TPM 2.0 data structures and commands definition
Quan Xu [Thu, 15 Jan 2015 09:21:40 +0000 (04:21 -0500)]
vTPM/TPM2: Add TPM 2.0 data structures and commands definition

Add TPM 2.0 data structures based on Trusted Platform Module Library Part 2:
Structures and Trusted Platform Module Library Part 3: Commands.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years ago tools/libxl: Correct static pattern rule for pkgconfig files
Andrew Cooper [Tue, 27 Jan 2015 20:34:02 +0000 (20:34 +0000)]
tools/libxl: Correct static pattern rule for pkgconfig files

Attempting to build libxl causes Make to emit the following warnings

andrewcoop@andrewcoop:xen.git$ make -C tools/libxl all
...
Makefile:253: target `xenlight.pc' doesn't match the target pattern
Makefile:253: target `xlutil.pc' doesn't match the target pattern
...

because the static pattern rule is malformed.  'Makefile' as the only
prereq-pattern does not contain a pattern.

The rule ends up working because of the use of $@.in where $< should have been
used, but lacked any dependency between a $FOO.pc and its .in source file.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago xen: arm: enable sync console in machine_reboot.
Ian Campbell [Thu, 15 Jan 2015 11:22:27 +0000 (11:22 +0000)]
xen: arm: enable sync console in machine_reboot.

Otherwise the last thing printed is "(XE" or something.

In line with x86 also disable the watchdog and spin debugging.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
10 years ago libxl: Prevent qemu closing QMP socket on shutdown before libxl is done with it.
Sander Eikelenboom [Thu, 22 Jan 2015 17:21:40 +0000 (18:21 +0100)]
libxl: Prevent qemu closing QMP socket on shutdown before libxl is done with it.

At present on shutdown when using pci-passthrough with qemu-xen, qemu
closes the QMP socket before libxl is done with it causing these
errors to be logged by libxl:

    Waiting for domain test (domid 1) to die [pid 11568]
    Domain 1 has shut down, reason code 0 0x0
    Action for shutdown reason code 0 is destroy
    Domain 1 needs to be cleaned up: destroying the domain
    libxl: error: libxl_qmp.c:443:qmp_next: Socket read error: Connection reset by peer
    libxl: error: libxl_qmp.c:701:libxl__qmp_initialize: Failed to connect to QMP
    libxl: error: libxl_qmp.c:686:libxl__qmp_initialize: Connection error: Connection refused
    libxl: error: libxl_dm.c:1588:kill_device_model: Device Model already exited
    Done. Exiting now

Prevent this by using the qemu '-no-shutdown' parameter which is
described as doing:

    "Don’t exit QEMU on guest shutdown, but instead only stop the emulation.
     This allows for instance switching to monitor to commit changes to the disk image."

So Qemu will stop emulating, but keeps the QMP socket open and waits
for libxl to kill the qemu process when it is done, preventing the
race and resulting in this being logged by libxl:

    Waiting for domain test (domid 1) to die [pid 10859]
    Domain 1 has shut down, reason code 0 0x0
    Action for shutdown reason code 0 is destroy
    Domain 1 needs to be cleaned up: destroying the domain
    Done. Exiting now

Signed-off-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
10 years ago libxl: Spice streaming video setting support for upstream qemu
Fabio Fantoni [Tue, 20 Jan 2015 10:33:17 +0000 (11:33 +0100)]
libxl: Spice streaming video setting support for upstream qemu

Usage:
spice_streaming_video=[filter|all|off]

Specifies what streaming video setting is to be used by spice (if given),
otherwise the qemu default will be used.

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Wei Liu <wei.liu2@citrix.com>
10 years ago libxl: Spice image compression setting support for upstream qemu
Fabio Fantoni [Tue, 20 Jan 2015 10:26:30 +0000 (11:26 +0100)]
libxl: Spice image compression setting support for upstream qemu

Usage:
spice_image_compression=[auto_glz|auto_lz|quic|glz|lz|off]

Specifies what image compression is to be used by spice (if given),
otherwise the qemu default will be used.

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Wei Liu <wei.liu2@citrix.com>
10 years ago tools/Makefile: fix qemu-xen-traditional build
Wei Liu [Sun, 25 Jan 2015 15:38:59 +0000 (15:38 +0000)]
tools/Makefile: fix qemu-xen-traditional build

In d9740237a ("tools: unhook blktap1 from the build and remove all
references to it"), one spot was left unchanged, which leads to failure
in building qemu-xen-traditional.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago tools: generate systemd service files only when systemd is available
Wei Liu [Tue, 20 Jan 2015 11:47:46 +0000 (11:47 +0000)]
tools: generate systemd service files only when systemd is available

Generating them without systemd is not in any way harmful, but it is
not very useful either.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- rerun autogen.sh ]
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago tools: fix "make distclean"
Wei Liu [Tue, 20 Jan 2015 13:31:12 +0000 (13:31 +0000)]
tools: fix "make distclean"

The original rule to target "distclean" in tools/Rules.mk was in effect
"make clean". It should be "make distclean".

However not all Makefiles in subdirectories have distclean target
defined. So this patch also adds a bunch of distclean targets to various
Makefiles. They only depend on clean target and don't have any actions
in most cases.

With the patch applied, the following command outputs no results:

  find tools -name 'Makefile*' -exec grep -L 'distclean' {} \+ \
     | grep -v ocaml | grep -v libfsimage

Ocaml and libfsimage are known to have distclean defined in a dedicated
rules file.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago libxl: provide xlutil.pc
Wei Liu [Tue, 20 Jan 2015 12:22:50 +0000 (12:22 +0000)]
libxl: provide xlutil.pc

Please rerun autogen.sh after applying this patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years ago libxl: delete xenlight.pc.in in distclean
Wei Liu [Tue, 20 Jan 2015 12:22:49 +0000 (12:22 +0000)]
libxl: delete xenlight.pc.in in distclean

That file is generated by configure. Deleting it in "make clean" forces
configure to be rerun. Move it under the distclean target.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxc: introduce a per architecture scratch pfn for temporary grant mapping
Julien Grall [Wed, 21 Jan 2015 13:25:44 +0000 (13:25 +0000)]
libxc: introduce a per architecture scratch pfn for temporary grant mapping

The code to initialize the grant table in libxc uses
xc_domain_maximum_gpfn() + 1 to get a guest pfn for mapping the grant
frame and to initialize it.

This solution has two major issues:
    - The check of the return value of xc_domain_maximum_gpfn is buggy:
    xen_pfn_t is unsigned and errors are returned as -ERRNO, so they are
    never caught by ( pfn <= 0 ).
    - The guest memory layout may be filled up to the end, i.e.
    xc_domain_maximum_gpfn() + 1 gives either 0 or an invalid PFN due to
    hardware limitations.
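
The first issue can be illustrated with a minimal sketch. The type and
helper names below are hypothetical stand-ins, not the libxc code; the
sketch only assumes an unsigned frame-number type and a callee that
reports failure as a negative errno value.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for xen_pfn_t: an unsigned frame number. */
typedef uint64_t sk_pfn_t;

/* Hypothetical callee that reports failure as a negative errno value. */
static int sk_max_gpfn(int fail)
{
    return fail ? -ENOSYS : 0x1000;
}

/*
 * Buggy pattern: the negative value is converted to a huge unsigned one,
 * so "pfn <= 0" is only true for exactly 0 and the error goes unnoticed.
 */
static int sk_unsigned_check_detects_error(int fail)
{
    sk_pfn_t pfn = (sk_pfn_t)sk_max_gpfn(fail);

    return pfn <= 0;
}

/* Safer pattern: test the signed return value before converting it. */
static int sk_signed_check_detects_error(int fail)
{
    int rc = sk_max_gpfn(fail);

    return rc < 0;
}
```

With a failing callee, only the signed check reports the error.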

Furthermore, on ARM, xc_domain_maximum_gpfn() is not implemented and
returns -ENOSYS. This makes libxc always use the same PFN, which may
collide with an already mapped region (see xen/include/public/arch-arm.h
for the layout).

This patch only addresses the problem for ARM; the x86 version keeps the same
behavior (i.e. xc_domain_maximum_gpfn() + 1), as I'm not familiar with Xen x86.

A new function xc_core_arch_get_scratch_gpfn is introduced to be able to
choose the gpfn per architecture.

For the ARM version, we use GUEST_GNTTAB_GUEST, which is the base of
the region used by the guest to map the grant table. At build time,
nothing is mapped there.

At the same time, correctly check the return value of
xc_domain_maximum_gpfn for x86.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agox86: vcpu_destroy_pagetables() must not return -EINTR
Konrad Rzeszutek Wilk [Mon, 26 Jan 2015 11:51:09 +0000 (12:51 +0100)]
x86: vcpu_destroy_pagetables() must not return -EINTR

Otherwise there are two side effects: domain_relinquish_resources will
stop and return to user-space with -EINTR, an error code user-space is
not equipped to deal with; and vcpu_reset will ignore it and convert
the error to -ENOMEM.

The preemption mechanism we have for domain destruction is to return
-EAGAIN (and then user-space calls the hypercall again) and as such we need
to catch the case of:

domain_relinquish_resources
  ->vcpu_destroy_pagetables
    -> put_page_and_type_preemptible
       -> __put_page_type
           returns -EINTR

and convert it to the proper type. For:

XEN_DOMCTL_setvcpucontext
 -> vcpu_reset
   -> vcpu_destroy_pagetables

we need to return -ERESTART otherwise we end up returning -ENOMEM.
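
One possible arrangement of this conversion is sketched below. All names
and numeric error values are placeholders chosen for the sketch (the
real values come from the hypervisor's errno definitions), and the
layering is an assumption, not a copy of the actual change.

```c
/* Placeholder error numbers, for illustration only. */
enum { SK_EINTR = 4, SK_EAGAIN = 11, SK_ERESTART = 85 };

/* Innermost step: may be preempted and then reports -SK_EINTR. */
static int sk_put_page(int preempted)
{
    return preempted ? -SK_EINTR : 0;
}

/*
 * Translate "preempted" into a restart request so that callers never
 * see -SK_EINTR (this is what the vcpu_reset path would return).
 */
static int sk_destroy_pagetables(int preempted)
{
    int rc = sk_put_page(preempted);

    return rc == -SK_EINTR ? -SK_ERESTART : rc;
}

/* Domain destruction path: user-space retries the call on -SK_EAGAIN. */
static int sk_relinquish_resources(int preempted)
{
    int rc = sk_destroy_pagetables(preempted);

    return rc == -SK_ERESTART ? -SK_EAGAIN : rc;
}
```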

The other callers that reach vcpu_destroy_pagetables through
arch_vcpu_reset (vcpu_reset) are:
 - hvm_s3_suspend (asserts on any return code),
 - vlapic_init_sipi_one (asserts on any return code).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years agox86: use tzcnt instead of bsf
Jan Beulich [Mon, 26 Jan 2015 11:50:21 +0000 (12:50 +0100)]
x86: use tzcnt instead of bsf

Following a compiler change done in 2012, make use of the fact that for
non-zero input BSF and TZCNT produce the same numeric result (EFLAGS
setting differs), and that CPUs not knowing of TZCNT will treat the
instruction as BSF (i.e. ignore what looks like a REP prefix to them).
The assumption here is that TZCNT would never have worse performance
than BSF.

Also extend the asm() input in find_first_set_bit() to allow memory
operands.
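
The equivalence relied upon here can be sketched with plain C models of
the two instructions. These are illustrative software models with
made-up names, not the inline assembly touched by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Model of TZCNT: number of trailing zero bits, 64 for a zero input. */
static unsigned int sk_tzcnt64(uint64_t x)
{
    unsigned int n = 0;

    if (x == 0)
        return 64;
    while (!(x & 1)) {
        x >>= 1;
        n++;
    }
    return n;
}

/*
 * Model of BSF: index of the lowest set bit. The destination is
 * undefined for a zero input, so the model only accepts non-zero values.
 */
static unsigned int sk_bsf64(uint64_t x)
{
    unsigned int idx;

    assert(x != 0);
    for (idx = 0; idx < 64; idx++)
        if ((x >> idx) & 1)
            return idx;
    return 0; /* not reached for non-zero input */
}
```

For every non-zero input both models return the same number; they only
differ in how a zero input is treated.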

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/HVM: improve EFER validation error messages
Andrew Cooper [Mon, 26 Jan 2015 11:48:38 +0000 (12:48 +0100)]
x86/HVM: improve EFER validation error messages

The previous error message was of very little use in identifying the actual
problem after the fact.  Now, hvm_efer_valid() will indicate the issue which
it objects to, which is far more useful for diagnosing issues from logs.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agoRevert "x86/VPMU: handle APIC_LVTPC accesses"
Jan Beulich [Mon, 26 Jan 2015 11:47:30 +0000 (12:47 +0100)]
Revert "x86/VPMU: handle APIC_LVTPC accesses"

This reverts commit 8097616fbdda2d214b305dc41f2468f9fb88d500, most
likely responsible for regressions found by osstest.

10 years agointel/VPMU: MSR_CORE_PERF_GLOBAL_CTRL should be initialized to zero
Boris Ostrovsky [Fri, 23 Jan 2015 16:54:23 +0000 (17:54 +0100)]
intel/VPMU: MSR_CORE_PERF_GLOBAL_CTRL should be initialized to zero

MSR_CORE_PERF_GLOBAL_CTRL register should be set to zero initially. It is up to
the guest to set it so that counters are enabled.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
10 years agox86/VPMU: handle APIC_LVTPC accesses
Boris Ostrovsky [Fri, 23 Jan 2015 16:53:49 +0000 (17:53 +0100)]
x86/VPMU: handle APIC_LVTPC accesses

Don't have the hypervisor update APIC_LVTPC when _it_ thinks the vector should
be updated. Instead, handle guest's APIC_LVTPC accesses and write what the guest
explicitly wanted.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
10 years agovmx: merge MSR management routines
Boris Ostrovsky [Fri, 23 Jan 2015 16:53:01 +0000 (17:53 +0100)]
vmx: merge MSR management routines

vmx_add_host_load_msr() and vmx_add_guest_msr() share a fair amount of code. Merge
them to simplify code maintenance.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
10 years agox86/VPMU: clean up Intel VPMU code
Boris Ostrovsky [Fri, 23 Jan 2015 16:52:23 +0000 (17:52 +0100)]
x86/VPMU: clean up Intel VPMU code

Remove struct pmumsr and core2_pmu_enable. Replace static MSR structures with
fields in core2_vpmu_context.

Call core2_get_pmc_count() once, during initialization.

Properly clean up when core2_vpmu_alloc_resource() fails.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
10 years agox86/VPMU: make vpmu macros a bit more efficient
Boris Ostrovsky [Fri, 23 Jan 2015 16:51:43 +0000 (17:51 +0100)]
x86/VPMU: make vpmu macros a bit more efficient

Introduce vpmu_are_all_set that allows testing multiple bits at once. Convert macros
into inlines for better compiler checking.
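
A minimal sketch of the difference between an "any bit" and an "all
bits" test follows. The structure and flag names are hypothetical; only
the idea behind vpmu_are_all_set comes from this change.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical context and flag values, for illustration only. */
struct sk_vpmu {
    uint32_t flags;
};

#define SK_CONTEXT_ALLOCATED 0x1u
#define SK_CONTEXT_LOADED    0x2u
#define SK_RUNNING           0x4u

/* True if at least one bit of the mask is set. */
static inline bool sk_is_set(const struct sk_vpmu *v, uint32_t mask)
{
    return (v->flags & mask) != 0;
}

/* True only if every bit of the mask is set; one test covers several flags. */
static inline bool sk_are_all_set(const struct sk_vpmu *v, uint32_t mask)
{
    return (v->flags & mask) == mask;
}
```

Being inline functions rather than macros, both helpers get their
argument types checked by the compiler.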

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
10 years agox86/VPMU: set MSR bitmaps only for HVM/PVH guests
Boris Ostrovsky [Fri, 23 Jan 2015 16:51:15 +0000 (17:51 +0100)]
x86/VPMU: set MSR bitmaps only for HVM/PVH guests

In preparation for making VPMU code shared with PV, make sure that we update
MSR bitmaps only for HVM/PVH guests.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
10 years agox86/VPMU: manage VPMU_CONTEXT_SAVE flag in vpmu_save_force()
Boris Ostrovsky [Fri, 23 Jan 2015 16:50:53 +0000 (17:50 +0100)]
x86/VPMU: manage VPMU_CONTEXT_SAVE flag in vpmu_save_force()

There is a possibility that we set VPMU_CONTEXT_SAVE on VPMU context in
vpmu_load() and never clear it (because vpmu_save_force() will see
VPMU_CONTEXT_LOADED bit clear, which is possible on AMD processors).

The problem is that amd_vpmu_save() assumes that if VPMU_CONTEXT_SAVE is set
then (1) we need to save counters and (2) we don't need to "stop" control
registers since they must have been stopped earlier. The latter may cause all
sorts of problems (like counters still running in the wrong guest and the
hypervisor sending unexpected PMU interrupts to that guest).

Since setting this flag is currently always done prior to calling
vpmu_save_force() let's both set and clear it there.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
10 years agox86/VPMU: don't globally disable VPMU if initialization fails
Boris Ostrovsky [Fri, 23 Jan 2015 16:49:50 +0000 (17:49 +0100)]
x86/VPMU: don't globally disable VPMU if initialization fails

The failure to initialize VPMU may be temporary, so we shouldn't disable VPMU
forever.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
10 years agox86: prevent access to HPET from Dom0
Roger Pau Monné [Fri, 23 Jan 2015 14:16:18 +0000 (15:16 +0100)]
x86: prevent access to HPET from Dom0

Prevent Dom0 from accessing the HPET MMIO region by adding the HPET mfn to the
list of forbidden memory regions (if the ACPI_HPET_PAGE_PROTECT4 or
ACPI_HPET_PAGE_PROTECT64 flag is set) or to the list of read-only regions.

Also provide an option, ro-hpet, that prevents adding the HPET to the
read-only memory regions, in case there are systems that put other stuff in
the HPET page.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Don't loop over iomem_deny_access() for consecutive MFNs.

Put new command line option's doc entry in right spot.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years agox86/pvh: check permissions when adding MMIO regions
Roger Pau Monné [Fri, 23 Jan 2015 14:15:30 +0000 (15:15 +0100)]
x86/pvh: check permissions when adding MMIO regions

Check that MMIO regions added to PVH Dom0 are allowed. Previously a PVH Dom0
would have access to the full MMIO range.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
10 years agox86: allow set_mmio_p2m_entry to specify access type
Roger Pau Monné [Fri, 23 Jan 2015 14:14:56 +0000 (15:14 +0100)]
x86: allow set_mmio_p2m_entry to specify access type

Preparatory change that allows setting the access type to
set_mmio_p2m_entry.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agox86/HVM: replace plain numbers
Jan Beulich [Fri, 23 Jan 2015 14:13:39 +0000 (15:13 +0100)]
x86/HVM: replace plain numbers

... making the code better document itself. No functional change
intended.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/HVM: make hvm_efer_valid() honor guest features
Jan Beulich [Fri, 23 Jan 2015 14:13:05 +0000 (15:13 +0100)]
x86/HVM: make hvm_efer_valid() honor guest features

Following the earlier similar change validating CR4 modifications.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agohandle XENMEM_get_vnumainfo in compat_memory_op
Wei Liu [Fri, 23 Jan 2015 14:06:26 +0000 (15:06 +0100)]
handle XENMEM_get_vnumainfo in compat_memory_op

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agogrant-table: use uint16_t consistently for grant copy offset and length
David Vrabel [Fri, 23 Jan 2015 14:05:48 +0000 (15:05 +0100)]
grant-table: use uint16_t consistently for grant copy offset and length

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agoVMX: replace plain numbers
Jan Beulich [Fri, 23 Jan 2015 14:05:08 +0000 (15:05 +0100)]
VMX: replace plain numbers

... making the code better document itself. No functional change
intended.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agox86/traps: replace plain numbers
Jan Beulich [Fri, 23 Jan 2015 14:04:26 +0000 (15:04 +0100)]
x86/traps: replace plain numbers

... making the code better document itself. No functional change
intended.

Note that for now (as we don't support RTM yet) DR_STATUS_RESERVED_ONE
and its users don't take DR_NOT_RTM into consideration yet.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/HVM: replace plain number in hvm_combine_hw_exceptions()
Jan Beulich [Fri, 23 Jan 2015 14:03:28 +0000 (15:03 +0100)]
x86/HVM: replace plain number in hvm_combine_hw_exceptions()

While doing so also take care of #VE here (even if we don't make use of
it yet). Note that contributory_exceptions, unlike the original
0x7c01 constant, doesn't include #PF anymore, but the check where the
variable is used comes after one that already filtered out #PF.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years agoarm64: fix fls()
Jan Beulich [Fri, 23 Jan 2015 14:02:39 +0000 (15:02 +0100)]
arm64: fix fls()

Using CLZ on a 64-bit register while specifying the input operand as
only 32 bits wide is wrong: An operand intentionally shrunk down to 32
bits at the source level doesn't imply respective zero extension also
happens at the machine instruction level, and hence the wrong result
could get returned.

Add suitable inline assembly abstraction so that the function can
remain shared between arm32 and arm64. The need to include asm_defns.h
in bitops.h makes it necessary to adjust processor.h though - it is
generally wrong to include public headers without making sure that
integer types are properly defined. (I didn't investigate or try
whether the possible alternative of moving the public/arch-arm.h
inclusion down in the file would also work.)
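
The failure mode can be modelled in plain C. This is a sketch with
hypothetical helper names, assuming a 64-bit register whose upper half
holds leftover bits while only the lower 32 bits carry the operand; it
is not the inline assembly from the patch.

```c
#include <stdint.h>

/* Model of a 64-bit count-leading-zeros instruction. */
static unsigned int sk_clz64(uint64_t x)
{
    unsigned int n = 0;

    if (x == 0)
        return 64;
    while (!(x >> 63)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Model of a 32-bit count-leading-zeros instruction. */
static unsigned int sk_clz32(uint32_t x)
{
    unsigned int n = 0;

    if (x == 0)
        return 32;
    while (!(x >> 31)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Wrong: counts over the whole register, including the stale upper half. */
static unsigned int sk_fls_wrong(uint64_t reg)
{
    return 64 - sk_clz64(reg);
}

/* Fixed: only the 32 bits that actually hold the operand are examined. */
static unsigned int sk_fls_fixed(uint64_t reg)
{
    return 32 - sk_clz32((uint32_t)reg);
}
```

For a register holding 0x10 in its lower half and stale bits above, only
the second variant returns 5.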

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agomake fls() and ffs() consistent across architectures
Jan Beulich [Fri, 23 Jan 2015 13:59:37 +0000 (14:59 +0100)]
make fls() and ffs() consistent across architectures

Their parameter types differed between ARM and x86.

Along with generalizing the functions this fixes
- x86's non-long functions having long parameter types
- ARM's ffs() using a long intermediate variable
- generic_fls64() being broken when the upper half of the input is
  non-zero
- common (and in one case also ARM) code using fls() when flsl() was
  meant

Also drop ARM's constant_fls() in favor of the identical generic_fls().
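
For reference, a 64-bit "find last set" that handles a non-zero upper
half can be sketched as follows. The names are hypothetical and this is
not the generic_fls64() implementation from the tree.

```c
#include <stdint.h>

/* 1-based index of the most significant set bit, 0 for a zero input. */
static int sk_fls32(uint32_t x)
{
    int r = 0;

    while (x) {
        x >>= 1;
        r++;
    }
    return r;
}

/* 64-bit variant: consult the upper half first, then fall back to the lower. */
static int sk_fls64(uint64_t x)
{
    uint32_t hi = (uint32_t)(x >> 32);

    return hi ? sk_fls32(hi) + 32 : sk_fls32((uint32_t)x);
}
```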

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agox86/vlapic: express x2apic msr readability with a bitmap
Andrew Cooper [Thu, 22 Jan 2015 11:59:14 +0000 (12:59 +0100)]
x86/vlapic: express x2apic msr readability with a bitmap

The x2apic MSR space is currently defined between 0x800 and 0x83f, which
conveniently fits in a 64 bit wide bitmap.  This is far more efficient than
the cascade comparisons generated by the switch statement, which can't be
optimised because of the case ranges used for the ISR, TMR and IRR blocks.
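
The idea can be sketched as below. The set of readable registers used
here is an illustrative subset (APIC ID, version and the ISR block) and
the names are hypothetical; the real bitmap lives in the vlapic code.

```c
#include <stdbool.h>
#include <stdint.h>

#define SK_X2APIC_FIRST 0x800u
#define SK_X2APIC_LAST  0x83fu

/* One bit per MSR, indexed by (msr - SK_X2APIC_FIRST). */
#define SK_BIT(idx) (UINT64_C(1) << (idx))

/* Illustrative subset: 0x802, 0x803 and the eight ISR registers 0x810-0x817. */
static const uint64_t sk_readable =
    SK_BIT(0x02) | SK_BIT(0x03) | (UINT64_C(0xff) << 0x10);

/* A range check plus a single shift and mask replace the switch statement. */
static bool sk_x2apic_msr_readable(uint32_t msr)
{
    if (msr < SK_X2APIC_FIRST || msr > SK_X2APIC_LAST)
        return false;
    return (sk_readable >> (msr - SK_X2APIC_FIRST)) & 1;
}
```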

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Convert 0xffUL to ((1UL << (NR_VECTORS / 32)) - 1) and drop a couple of
clearly superfluous parentheses.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years agox86/domain: improvements to switch_native() and switch_compat()
Andrew Cooper [Thu, 22 Jan 2015 11:58:47 +0000 (12:58 +0100)]
x86/domain: improvements to switch_native() and switch_compat()

Both are called with known-good domains, making the NULL check redundant.
Both also have open-coded forms of for_each_vcpu() which are replaced.

switch_compat() is updated to propagate the error from set_compat_l4(), rather
than automatically overriding with -ENOMEM.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agoVMX: use cached "current" where available
Jan Beulich [Thu, 22 Jan 2015 11:58:06 +0000 (12:58 +0100)]
VMX: use cached "current" where available

..., yielding better code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agoVMX: drop VMCS *_HIGH enumerators
Jan Beulich [Thu, 22 Jan 2015 11:57:27 +0000 (12:57 +0100)]
VMX: drop VMCS *_HIGH enumerators

Most of them have been unused since the dropping of 32-bit support, and
the few remaining cases are more efficiently dealt with using a generic
macro (and probably things should have been done that way from the
beginning).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agoVMX: dump further control state
Jan Beulich [Thu, 22 Jan 2015 11:56:33 +0000 (12:56 +0100)]
VMX: dump further control state

A few relevant control state fields did not get dumped so far; in
particular, VM_ENTRY_INSTRUCTION_LEN got printed twice (instead of also
printing VM_EXIT_INSTRUCTION_LEN). Where suitable (to reduce the amount
of output) make some of the dumping conditional upon guest settings
(this isn't required for correctness as vmr() already uses
__vmread_safe(), i.e. it is fine to access non-existing fields).

Also drop casts.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agoVMX: dump full host state
Jan Beulich [Thu, 22 Jan 2015 11:55:56 +0000 (12:55 +0100)]
VMX: dump full host state

A few host state fields did not get dumped so far. Where suitable (to
reduce the amount of output) make some of the dumping conditional upon
guest settings (this isn't required for correctness as vmr() already
uses __vmread_safe(), i.e. it is fine to access non-existing fields).

Also drop casts - many of them haven't been needed anymore since the
dropping of 32-bit support.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agoVMX: dump full guest state
Jan Beulich [Thu, 22 Jan 2015 11:54:49 +0000 (12:54 +0100)]
VMX: dump full guest state

Several guest state fields did not get dumped so far. Where suitable
(to reduce the amount of output) make some of the dumping conditional
upon guest settings (this isn't required for correctness as vmr()
already uses __vmread_safe(), i.e. it is fine to access non-existing
fields).

Move CR3_TARGET_* and TSC_OFFSET processing into the control state
section, at once making the upper bound of CR3_TARGET_VALUEn printed
depend on CR3_TARGET_COUNT (which architecturally can be higher than
4).

Also rename GUEST_PDPTRn to GUEST_PDPTEn (matching the SDM naming) and
group them as well as CR3_TARGET_VALUEn similar to EOI_EXIT_BITMAP.

Finally, drop casts - they haven't been needed anymore since the
dropping of 32-bit support (and some of them were not really needed in
the first place). Introduce vmr16() and vmr32() helper macros to avoid
the "l" printk format modifier and at the same time validate that only
16-/32-bit fields get accessed this way.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agox86: don't open-code cpuid_count() in pv_cpuid()
Jan Beulich [Thu, 22 Jan 2015 11:49:04 +0000 (12:49 +0100)]
x86: don't open-code cpuid_count() in pv_cpuid()

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86: correctly check for sub-leaf zero of leaf 7 in pv_cpuid()
Jan Beulich [Thu, 22 Jan 2015 11:48:40 +0000 (12:48 +0100)]
x86: correctly check for sub-leaf zero of leaf 7 in pv_cpuid()

Only the low 32 bits are relevant.

For consistency also change a cast on regs->eax to regs->_eax.
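
A minimal sketch of the check, assuming the sub-leaf arrives in a 64-bit
register whose upper half may hold unrelated bits (helper names are
hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Too strict: stale upper bits make a sub-leaf of zero look non-zero. */
static bool sk_subleaf_is_zero_naive(uint64_t rcx)
{
    return rcx == 0;
}

/* Only the low 32 bits select the sub-leaf, so compare just those. */
static bool sk_subleaf_is_zero(uint64_t rcx)
{
    return (uint32_t)rcx == 0;
}
```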

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86: don't expose XSAVES capability to PV guests
Jan Beulich [Thu, 22 Jan 2015 11:47:56 +0000 (12:47 +0100)]
x86: don't expose XSAVES capability to PV guests

As done by the recent Linux commit b65d6e17fe ("kvm: x86: mask out
XSAVES") for KVM, we should also mask out XSAVES from what PV guests
get to see as long as we don't emulate accesses to MSR_IA32_XSS.

Actually, go beyond that: Just like for leaf 7, switch from
blacklisting to whitelisting, i.e. only allow XSAVEOPT and XSAVEC for
the time being. And do these overrides consistently for both Dom0 and
DomU-s.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agodrop redundant MAX_VIRT_CPUS bounds checks
Andrew Cooper [Thu, 22 Jan 2015 11:46:43 +0000 (12:46 +0100)]
drop redundant MAX_VIRT_CPUS bounds checks

In all 4 cases, bounds checks against d->max_vcpus are visible in the context.
Domain building will ensure that d->max_vcpus never exceeds an appropriate
bound.  In the x86 case, different types of domains have different maxima for
vcpus, making the checks wrong as opposed to simply redundant.

For vpsci in ARM, 'vcpuid' is an unsigned type so could never be less than 0.

For the common changes to do_{,compat}_vcpu_op(), these changes do result in a
guest visible change, but only in so far as certain invalid vcpu ids will now
fail with -ENOENT rather than -EINVAL.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agosched/arinc653: remove MAX_VIRT_CPUS bounds check
Andrew Cooper [Thu, 22 Jan 2015 11:46:10 +0000 (12:46 +0100)]
sched/arinc653: remove MAX_VIRT_CPUS bounds check

The arinc653 interface is capable of specifying a domain in the schedule (from
the toolstack) before the domain itself exists, or is present in the cpupool
(The domain is identified by UUID rather than domid). As a result, the
schedule can't be validated at this point.

The vcpu_id from userspace is only ever used to compare against a list of real
vcpus available to the scheduler, which prevents ill-specified vcpus from
actually being scheduled.

Remove the MAX_VIRT_CPUS test, as it is not an appropriate bound for vcpu_id.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Robert VanVossen <robert.vanvossen@dornerworks.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agoevtchn: reduce the size of the poll_mask where possible
Andrew Cooper [Thu, 22 Jan 2015 11:45:13 +0000 (12:45 +0100)]
evtchn: reduce the size of the poll_mask where possible

Use domain_max_vcpus(d) in preference to MAX_VIRT_CPUS when allocating the
poll mask.  This allows x86 HVM guests to have a poll mask of 128 bits rather
than 8k bits.

While changing this, use xzalloc_array() in preference to xmalloc_array() to
avoid needing the subsequent call to bitmap_zero().
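
The size difference can be sketched with a small allocation helper.
Standard calloc() stands in for xzalloc_array() here, and the helper
names are hypothetical.

```c
#include <limits.h>
#include <stdlib.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Number of unsigned longs needed to hold the given number of bits. */
static size_t sk_bits_to_longs(size_t bits)
{
    return (bits + SK_BITS_PER_LONG - 1) / SK_BITS_PER_LONG;
}

/* Zeroed allocation sized by the per-domain vcpu limit; no separate clearing. */
static unsigned long *sk_alloc_poll_mask(size_t max_vcpus)
{
    return calloc(sk_bits_to_longs(max_vcpus), sizeof(unsigned long));
}
```

A 128-bit mask takes 16 bytes, whereas an 8k-bit mask takes 1024 bytes.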

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agointroduce domain_max_vcpus() helper and implement per arch
Andrew Cooper [Thu, 22 Jan 2015 11:44:03 +0000 (12:44 +0100)]
introduce domain_max_vcpus() helper and implement per arch

This allows the common XEN_DOMCTL_max_vcpus handler to lose some x86-specific
architecture knowledge.

It turns out that Xen had the same magic number twice in-tree with different
names (HVM_MAX_VCPUS and MAX_HVM_VCPUS).  This removes all use of
MAX_HVM_VCPUS, and x86 uses HVM_MAX_VCPUS from the public headers.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agomake error codes a formal part of the ABI
Jan Beulich [Thu, 22 Jan 2015 11:41:50 +0000 (12:41 +0100)]
make error codes a formal part of the ABI

Now that we have two cases where patches against hvmloader got
submitted needing to include the hypervisor's errno.h (for the host's
system header not necessarily reflecting the correct numbers), take
this as a strong sign that we need to make the error return values part
of the hypervisor ABI (which de facto they've always been).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agore-order struct domain fields
Jan Beulich [Tue, 20 Jan 2015 09:47:21 +0000 (10:47 +0100)]
re-order struct domain fields

... to reduce padding holes.

I also wonder whether having independent spin locks side by side is
really a good thing cache-line-bouncing-wise.

Also change suspend_evtchn's type to evtchn_port_t.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agox86: latch current->domain in do_physdev_op()
Jan Beulich [Tue, 20 Jan 2015 09:46:19 +0000 (10:46 +0100)]
x86: latch current->domain in do_physdev_op()

... and drop global latching of current, as being needed more than once
only in PHYSDEVOP_set_iopl and PHYSDEVOP_set_iobitmap, and not at all
in all other cases.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86: slightly simplify PHYSDEVOP_pirq_eoi_gmfn_v* handling
Jan Beulich [Tue, 20 Jan 2015 09:45:01 +0000 (10:45 +0100)]
x86: slightly simplify PHYSDEVOP_pirq_eoi_gmfn_v* handling

We don't really need the MFN in more than one place (after dropping
mfn_to_page() translations where we know the result already), so no
need to have a local variable for it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agophysdev: hide compatibility definitions for new enough interface version
Jan Beulich [Tue, 20 Jan 2015 09:43:52 +0000 (10:43 +0100)]
physdev: hide compatibility definitions for new enough interface version

There's no point in continuing to expose those.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxsm/evtchn: never pretend to have successfully created a Xen event channel
Andrew Cooper [Tue, 20 Jan 2015 09:42:26 +0000 (10:42 +0100)]
xsm/evtchn: never pretend to have successfully created a Xen event channel

Xen event channels are not internal resources.  They still have one end in a
domain, and are created at the request of privileged domains.  This logic
which "successfully" creates a Xen event channel opens up undesirable failure
cases with ill-specified XSM policies.

If a domain is permitted to create ioreq servers or memevent listeners, but
not to create event channels, the ioreq/memevent creation will succeed but
attempting to bind the returned event channel will fail without any indication
of a permission error.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years agox86/cpuid: correct parameter types for cpuid_count()
Andrew Cooper [Tue, 20 Jan 2015 09:41:18 +0000 (10:41 +0100)]
x86/cpuid: correct parameter types for cpuid_count()

About half of the cpuid space has the top bit of op set, and op is always
specified with unsigned integers.  There are no problematic uses in tree at
the moment.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>