xenbits.xensource.com Git - xen.git/log
10 years agoalways print offending CPU on bringup/teardown failure
Dario Faggioli [Thu, 7 May 2015 13:15:24 +0000 (15:15 +0200)]
always print offending CPU on bringup/teardown failure

In fact, before this change, if bringing up or tearing down a
CPU fails with -EBUSY, we BUG_ON() and never get to see what
CPU caused the problem.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
10 years agox86/hvm: use white-lists for HVM param guest accessibility checks
Paul Durrant [Thu, 7 May 2015 13:08:43 +0000 (15:08 +0200)]
x86/hvm: use white-lists for HVM param guest accessibility checks

There are actually very few HVM parameters that a guest needs to read
and even fewer that a guest needs to write. Use white-lists to specify
those parameters and also ensure that, by default, newly introduced
parameters are not accessible.
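
The default-deny idea can be sketched as follows. This is a minimal illustration, not the code from the patch: the enum values and function names are hypothetical, and which parameters appear on each list is an assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical parameter indices, for illustration only. */
enum example_param {
    EX_PARAM_CALLBACK_IRQ,
    EX_PARAM_STORE_PFN,
    EX_PARAM_STORE_EVTCHN,
    EX_PARAM_INTERNAL_STATE,   /* not meant for guest use */
    EX_PARAM_COUNT
};

/* White-list of guest-writable parameters: anything not listed is refused. */
bool example_guest_may_set(enum example_param p)
{
    switch (p) {
    case EX_PARAM_CALLBACK_IRQ:
        return true;
    default:
        return false;
    }
}

/* White-list of guest-readable parameters: anything not listed is refused. */
bool example_guest_may_get(enum example_param p)
{
    switch (p) {
    case EX_PARAM_CALLBACK_IRQ:
    case EX_PARAM_STORE_PFN:
    case EX_PARAM_STORE_EVTCHN:
        return true;
    default:
        return false;
    }
}
```

Because the `default` case refuses access, a parameter added to the enum later stays inaccessible until someone lists it explicitly.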

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/hvm: introduce functions for HVMOP_get/set_param allowance checks
Paul Durrant [Thu, 7 May 2015 13:07:57 +0000 (15:07 +0200)]
x86/hvm: introduce functions for HVMOP_get/set_param allowance checks

Some parameters can only (validly) be set once. Some should not be set
by a guest for its own domain, and others must not be set since they
require the domain to be paused. Consolidate these checks, along with
the XSM check, in a new hvm_allow_set_param() function for clarity.

Also, introduce hvm_allow_get_param() for similar reasons.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/hvm: give HVMOP_set_param and HVMOP_get_param their own functions
Paul Durrant [Thu, 7 May 2015 13:06:25 +0000 (15:06 +0200)]
x86/hvm: give HVMOP_set_param and HVMOP_get_param their own functions

The level of switch nesting in those ops is getting unreadable. Giving
them their own functions does introduce some code duplication in the
pre-op checks but the overall result is easier to follow.

This patch is code movement (including style fixes). There is no
functional change.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/apic: refactor error_interrupt
Tiejun Chen [Wed, 6 May 2015 12:28:04 +0000 (14:28 +0200)]
x86/apic: refactor error_interrupt

Just make this readable while debugging.

Signed-off-by: Tiejun Chen <tiejun.chen@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86: allow 64-bit PV guest kernels to suppress user mode exposure of M2P
Jan Beulich [Tue, 5 May 2015 16:01:33 +0000 (18:01 +0200)]
x86: allow 64-bit PV guest kernels to suppress user mode exposure of M2P

Xen L4 entries being uniformly installed into any L4 table and 64-bit
PV kernels running in ring 3 means that user mode was able to see the
read-only M2P presented by Xen to the guests. While apparently not
really representing an exploitable information leak, this still very
certainly was never meant to be that way.

Building on the fact that these guests already have separate kernel and
user mode page tables we can allow guest kernels to tell Xen that they
don't want user mode to see this table. We can't, however, do this by
default: There is no ABI requirement that kernel and user mode page
tables be separate. Therefore introduce a new VM-assist flag allowing
the guest to control respective hypervisor behavior:
- when not set, L4 tables get created with the respective slot blank,
  and whenever the L4 table gets used as a kernel one the missing
  mapping gets inserted,
- when set, L4 tables get created with the respective slot initialized
  as before, and whenever the L4 table gets used as a user one the
  mapping gets zapped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years agodomctl: don't truncate XEN_DOMCTL_max_mem requests
Jan Beulich [Tue, 5 May 2015 16:00:03 +0000 (18:00 +0200)]
domctl: don't truncate XEN_DOMCTL_max_mem requests

Instead saturate the value if the input can't be represented in the
respective struct domain field.
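
The difference between truncating and saturating can be sketched as below. The function names, the 4 KiB page size, and the choice of `unsigned int` as the destination type are assumptions for the example, not taken from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Convert a 64-bit KiB count into an unsigned int page count.
 * Values that do not fit are clamped to UINT_MAX instead of being truncated. */
unsigned int example_kb_to_pages_saturating(uint64_t max_memkb)
{
    uint64_t pages = max_memkb >> (EXAMPLE_PAGE_SHIFT - 10);

    return pages > UINT_MAX ? UINT_MAX : (unsigned int)pages;
}

/* The truncating variant, shown for contrast: the high bits are silently lost. */
unsigned int example_kb_to_pages_truncating(uint64_t max_memkb)
{
    return (unsigned int)(max_memkb >> (EXAMPLE_PAGE_SHIFT - 10));
}
```

With a 32-bit `unsigned int`, a request of 2^34 KiB truncates to zero pages, while the saturating variant yields the largest representable limit.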

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen/arm64: Use virtual address when setting up early_printk fixmap
Chen Baozi [Tue, 7 Apr 2015 11:24:44 +0000 (19:24 +0800)]
xen/arm64: Use virtual address when setting up early_printk fixmap

We have already switched to the boot pagetable when reaching the point
of early_printk fixmap setup. Thus it is no longer necessary to
calculate the physical address of xen_fixmap.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- fixed commit message typos ]

10 years agoxen/arm: p2m: Restrict preemption check in apply_p2m_changes
Julien Grall [Tue, 5 May 2015 15:02:09 +0000 (16:02 +0100)]
xen/arm: p2m: Restrict preemption check in apply_p2m_changes

Commit 569fb6c "xen/arm: Data abort exception (R/W) mem_access
events" makes apply_p2m_changes call hypercall_preempt_check for any
operation rather than only for relinquish.

The function hypercall_preempt_check calls local_events_need_delivery,
which relies on the current VCPU not being an idle VCPU.
However, during DOM0 building the current VCPU is an idle one. This
makes Xen crash with the following stack trace:

(XEN) CPU0: Unexpected Trap: Data Abort
[...]
(XEN) Xen call trace:
(XEN)    [<00256ef4>] apply_p2m_changes+0x210/0x1190 (PC)
(XEN)    [<002506b4>] gic_events_need_delivery+0x5c/0x13c (LR)
(XEN)    [<002580ec>] map_mmio_regions+0x64/0x74
(XEN)    [<00251958>] gicv2v_setup+0xf8/0x150
(XEN)    [<00250964>] gicv_setup+0x20/0x30
(XEN)    [<0024cb3c>] arch_domain_create+0x170/0x244
(XEN)    [<00207df0>] domain_create+0x2ac/0x4d8
(XEN)    [<0028e3d0>] start_xen+0xcbc/0xee4
(XEN)    [<00200540>] paging+0x94/0xd8
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) CPU0: Unexpected Trap: Data Abort
(XEN)
(XEN) ****************************************

hypercall_preempt_check expects to be called only when the current
VCPU belongs to a real domain (see x86 behavior).

As the bug prevents Xen from booting on some platforms, fix it by only
checking preemption when the current VCPU is not an idle one for now. We
could improve it later.

Reported-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Julien Grall <julien.grall@citrix.com>
CC: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agotools/libxc: Migration v2 compatibility for unmodified libxl
Andrew Cooper [Thu, 2 Apr 2015 10:33:58 +0000 (11:33 +0100)]
tools/libxc: Migration v2 compatibility for unmodified libxl

These changes cause migration v2 to behave similarly enough to legacy
migration to function for HVM guests under an unmodified xl/libxl.

The migration v2 work for libxl will fix the layering issues with the
toolstack and qemu records, at which point this patch will be unneeded.

It is however included here for people wishing to experiment with migration v2
ahead of the libxl work.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agodocs: libxc migration stream specification
David Vrabel [Tue, 3 Jun 2014 13:48:12 +0000 (14:48 +0100)]
docs: libxc migration stream specification

Add the specification for a new migration stream format.  The document
includes all the details but to summarize:

The existing (legacy) format is dependent on the word size of the
toolstack.  This prevents domains from migrating from hosts running
32-bit toolstacks to hosts running 64-bit toolstacks (and vice-versa).

The legacy format lacks any version information, making it difficult to
extend in a compatible way.

The new format has a header (the image header) with version information,
a domain header with basic information of the domain and a stream of
records for the image data.

The format will be used for future domain types (such as on ARM).

The specification is pandoc format (an extended markdown format) and the
documentation build system is extended to support pandoc format documents.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: common restore code
Andrew Cooper [Tue, 17 Feb 2015 18:20:23 +0000 (18:20 +0000)]
tools/libxc: common restore code

Restore a domain from the new format.  This reads and validates the domain and
image header and loads the guest memory from the PAGE_DATA records, populating
the p2m as it does so.

This provides the xc_domain_restore2() function as an alternative to the
existing xc_domain_restore().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: common save code
Andrew Cooper [Sun, 8 Jun 2014 02:03:29 +0000 (03:03 +0100)]
tools/libxc: common save code

Save a domain, calling domain type specific functions at the appropriate
points.  This implements the xc_domain_save2() API function which is
equivalent to the existing xc_domain_save().

This writes the image and domain headers, and writes all the PAGE_DATA records
using a "live" process.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: x86 HVM restore code
Andrew Cooper [Sat, 7 Jun 2014 20:17:51 +0000 (21:17 +0100)]
tools/libxc: x86 HVM restore code

Restore the x86 HVM specific parts of a domain.  This is the HVM_CONTEXT and
HVM_PARAMS records.

There is no need for any page localisation.

This also includes writing the trailing qemu save record to a file because
this is what libxc currently does.  This is intended to be moved into libxl
proper in the future.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: x86 HVM save code
Andrew Cooper [Sat, 7 Jun 2014 20:17:33 +0000 (21:17 +0100)]
tools/libxc: x86 HVM save code

Save the x86 HVM specific parts of the domain.  This is considerably simpler
than an x86 PV domain.  Only the HVM_CONTEXT and HVM_PARAMS records are
needed.

There is no need for any page normalisation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: x86 PV restore code
Andrew Cooper [Sat, 7 Jun 2014 20:17:09 +0000 (21:17 +0100)]
tools/libxc: x86 PV restore code

Restore the x86 PV specific parts.  The X86_PV_INFO, the P2M_FRAMES,
SHARED_INFO, and VCPU context records.

The localise_page callback is called from the common PAGE_DATA code to convert
PFNs in page tables to MFNs.

Page tables are pinned and the guest's P2M is updated when the stream is
complete.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: x86 PV save code
Andrew Cooper [Sat, 7 Jun 2014 20:17:02 +0000 (21:17 +0100)]
tools/libxc: x86 PV save code

Save the x86 PV specific parts of a domain.  This is the X86_PV_INFO record,
the P2M_FRAMES, the X86_PV_SHARED_INFO, the three different VCPU context
records, and the MSR records.

The normalise_page callback used by the common code when writing the PAGE_DATA
records, converts MFNs in page tables to PFNs.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: x86 PV common code
Andrew Cooper [Sat, 7 Jun 2014 20:16:33 +0000 (21:16 +0100)]
tools/libxc: x86 PV common code

Add functions common to save and restore of x86 PV guests.  This includes
functions for dealing with the P2M and M2P and the VCPU context.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: x86 common code
Andrew Cooper [Sat, 7 Jun 2014 20:32:07 +0000 (21:32 +0100)]
tools/libxc: x86 common code

Save/restore records common to all x86 domain types (HVM, PV).

This is only the TSC_INFO record.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agotools/libxc: generic common code
Andrew Cooper [Sun, 8 Jun 2014 02:05:40 +0000 (03:05 +0100)]
tools/libxc: generic common code

Add the context structure used to keep state during the save/restore
process.

Define the set of architecture or domain type specific operations with a
set of callbacks (save_ops, and restore_ops).

Add common functions for writing records.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: C implementation of stream format
Andrew Cooper [Sat, 15 Mar 2014 20:18:45 +0000 (20:18 +0000)]
tools/libxc: C implementation of stream format

Provide the C structures matching the binary (wire) format of the new
stream format.  All header/record fields are naturally aligned and
explicit padding fields are used to ensure the correct layout (i.e.,
there is no need for any non-standard structure packing pragma or
attribute).
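
The layout technique can be sketched as follows. The structures and field names here are hypothetical and are not the definitions from the patch. They only show how explicit padding plus natural alignment removes the need for a packing attribute, and how that can be checked at compile time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record header, not the actual libxc definition. */
struct example_rec_hdr {
    uint32_t type;
    uint32_t length;
};

/* Hypothetical header mixing field widths.  The explicit padding field keeps
 * the 64-bit member naturally aligned without any packing attribute. */
struct example_hdr {
    uint16_t version;
    uint16_t _res1;      /* explicit padding */
    uint32_t options;
    uint64_t marker;
};

/* Compile-time checks that the in-memory layout matches the intended wire layout. */
_Static_assert(sizeof(struct example_rec_hdr) == 8, "record header is 8 bytes");
_Static_assert(offsetof(struct example_hdr, options) == 4, "options at offset 4");
_Static_assert(offsetof(struct example_hdr, marker) == 8, "marker at offset 8");
_Static_assert(sizeof(struct example_hdr) == 16, "no implicit padding");
```

Since every field already sits at a multiple of its own size, the compiler has no reason to insert hidden padding, so the structure can be read from or written to the stream as-is.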

Provide some helper functions for converting types to string for
diagnostic purposes.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: Migration v2 framework
Andrew Cooper [Sat, 15 Mar 2014 18:50:31 +0000 (18:50 +0000)]
tools/libxc: Migration v2 framework

For testing purposes, the environmental variable "XG_MIGRATION_V2" allows the
two save/restore codepaths to coexist, and have a runtime switch.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agolibxc/progress: Extend the progress interface
Andrew Cooper [Thu, 24 Jul 2014 12:05:27 +0000 (13:05 +0100)]
libxc/progress: Extend the progress interface

Progress information is logged via a different logger to regular libxc log
messages, and currently can only express a range.  However, not everything
which needs reporting as progress comes with a range.  Extend the interface to
allow reporting of a single statement.

The programming interface now looks like:
  xc_set_progress_prefix()
    set the prefix string to be used
  xc_report_progress_single()
    report a single action
  xc_report_progress_step()
    report $X of $Y

The new programming interface is implemented in a compatible way with the
existing caller interface (by reporting a single action as "0 of 0").
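
The compatibility trick can be sketched as below. All names and signatures are hypothetical stand-ins (the real xc_* signatures are not given in this message). The sketch only shows a single action being forwarded to a range-only logger as "0 of 0".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical state: the current prefix and the last message handed to the logger. */
static const char *example_prefix = "";
static char example_last[128];

/* Stand-in for the pre-existing logger callback, which can only express a range. */
static void example_logger_progress(const char *what,
                                    unsigned long done, unsigned long total)
{
    snprintf(example_last, sizeof(example_last), "%s: %lu of %lu",
             what, done, total);
}

/* Set the prefix string used by subsequent step reports. */
void example_set_progress_prefix(const char *prefix)
{
    example_prefix = prefix;
}

/* Report "$X of $Y" for the current prefix. */
void example_report_progress_step(unsigned long done, unsigned long total)
{
    example_logger_progress(example_prefix, done, total);
}

/* Report a single action; the range-only logger sees it as "0 of 0". */
void example_report_progress_single(const char *doing)
{
    example_logger_progress(doing, 0, 0);
}
```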

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agotools/libxc: Implement writev_exact() in the same style as write_exact()
Andrew Cooper [Tue, 1 Jul 2014 18:10:35 +0000 (19:10 +0100)]
tools/libxc: Implement writev_exact() in the same style as write_exact()

This implementation of writev_exact() will cope with an iovcnt greater than
IOV_MAX because glibc will actually let this work anyway, and it is very
useful not to have to worry about this in the caller of writev_exact().  The
caller is still required to ensure that the sum of iov_len's doesn't overflow
a ssize_t.
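
A loop of this kind can be sketched as follows. This is not the libxc implementation: the function name is hypothetical, and unlike the behaviour described above it caps each writev() call at IOV_MAX entries explicitly rather than relying on glibc. It does show the handling of partial writes that an "exact" wrapper needs.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef IOV_MAX
#define IOV_MAX 1024   /* assumption: fallback when <limits.h> does not define it */
#endif

/* Write every byte described by iov[0..iovcnt) to fd.
 * Returns 0 on success, -1 on error (errno set by writev or malloc). */
int example_writev_exact(int fd, const struct iovec *iov, int iovcnt)
{
    struct iovec *copy;
    int idx = 0, rc = -1;

    if (iovcnt <= 0)
        return 0;

    /* Work on a private copy so partial writes can adjust entries in place. */
    copy = malloc(sizeof(*copy) * (size_t)iovcnt);
    if (!copy)
        return -1;
    memcpy(copy, iov, sizeof(*copy) * (size_t)iovcnt);

    while (idx < iovcnt) {
        int chunk = iovcnt - idx;
        ssize_t len;

        if (chunk > IOV_MAX)
            chunk = IOV_MAX;   /* never pass more than IOV_MAX entries at once */

        len = writev(fd, &copy[idx], chunk);
        if (len < 0) {
            if (errno == EINTR)
                continue;
            goto out;
        }

        /* Skip fully written entries, then trim a partially written one. */
        while (idx < iovcnt && (size_t)len >= copy[idx].iov_len) {
            len -= (ssize_t)copy[idx].iov_len;
            idx++;
        }
        if (len > 0) {
            copy[idx].iov_base = (char *)copy[idx].iov_base + len;
            copy[idx].iov_len -= (size_t)len;
        }
    }
    rc = 0;

out:
    free(copy);
    return rc;
}
```

The caller's iovec array is left untouched, because the adjustments for partial writes are made on the private copy.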

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agoxen: arm: X-Gene Storm check GIC DIST address for EOI quirk
Pranavkumar Sawargaonkar [Wed, 29 Apr 2015 09:38:27 +0000 (15:08 +0530)]
xen: arm: X-Gene Storm check GIC DIST address for EOI quirk

In old X-Gene Storm firmware and DT, secure mode addresses have been
mentioned in GICv2 node. In this case maintenance interrupt is used
instead of EOI HW method.

This patch checks the GIC Distributor Base Address to enable EOI quirk
for old firmware.

Ref:
http://lists.xen.org/archives/html/xen-devel/2014-07/msg01263.html

Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Tested-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: p2m: Add an ASSERT to check that p2m lock is taken in __p2m_lookup
Julien Grall [Mon, 27 Apr 2015 14:58:33 +0000 (15:58 +0100)]
xen/arm: p2m: Add an ASSERT to check that p2m lock is taken in __p2m_lookup

__p2m_lookup should be called with the p2m lock taken. Add an ASSERT in
order to catch wrong callers in debug builds.

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: convert strings and ints to xenbus_state
Olaf Hering [Fri, 24 Apr 2015 09:07:14 +0000 (09:07 +0000)]
libxl: convert strings and ints to xenbus_state

Convert all plain ints and strings which are used for xenbus "state"
files to xenbus_state. This makes it easier to find code which deals
with backend/frontend state changes.

Convert usage of libxl__sprintf to GCSPRINTF.

No change in behaviour is expected from this change, besides a small
increase of runtime memory usage in places that used a string constant.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agotools/libxc: Set HVM_PARAM_CONSOLE_EVTCHN during restore
Boris Ostrovsky [Thu, 23 Apr 2015 02:49:18 +0000 (22:49 -0400)]
tools/libxc: Set HVM_PARAM_CONSOLE_EVTCHN during restore

When resuming, the guest needs to check whether the port has changed. HVM
guests use this parameter to get the port number.

(We can't always use xenstore where this value is also written: for example
on Linux the console is resumed very early, before the store is up).

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
10 years agop2m/ept: enable PML in p2m-ept for log-dirty
Kai Huang [Mon, 4 May 2015 10:19:25 +0000 (12:19 +0200)]
p2m/ept: enable PML in p2m-ept for log-dirty

This patch first enables EPT A/D bits if PML is used, as PML depends on EPT
A/D bits to work. The A bit is set for all present p2m types in middle and leaf
EPT entries, and the D bit is set for all writable types in EPT leaf entries,
except for the log-dirty type with PML.

With PML, for 4K pages, instead of setting the EPT entry to read-only, we just
need to clear the D bit in order to log that GFN. For superpages, we still need
to set the entry to read-only, as we need to split the superpage into 4K pages
on EPT violation.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agolog-dirty: refine common code to support PML
Kai Huang [Mon, 4 May 2015 10:18:51 +0000 (12:18 +0200)]
log-dirty: refine common code to support PML

With PML, dirty GPAs may still be logged in vcpus' PML buffers when
userspace peeks at or clears dirty pages, so we need to flush them before
reporting dirty pages to userspace. This applies to both video ram tracking and
paging_log_dirty_op.

This patch adds new p2m layer functions to enable/disable PML and flush PML
buffers. The new functions are named generically to cover potential further
PML-like features for other platforms.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agovmx: disable PML in vmx_vcpu_destroy
Kai Huang [Mon, 4 May 2015 10:17:43 +0000 (12:17 +0200)]
vmx: disable PML in vmx_vcpu_destroy

A domain may still be in log-dirty mode when it is about to be
destroyed, in which case we should manually disable PML for it.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agovmx: handle PML enabling in vmx_vcpu_initialise
Kai Huang [Mon, 4 May 2015 10:17:10 +0000 (12:17 +0200)]
vmx: handle PML enabling in vmx_vcpu_initialise

A domain may already be in log-dirty mode when a vcpu is created, in
which case we should enable PML for this vcpu if PML has been enabled for the
domain.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agovmx: handle PML buffer full VMEXIT
Kai Huang [Mon, 4 May 2015 10:15:49 +0000 (12:15 +0200)]
vmx: handle PML buffer full VMEXIT

We need to flush PML buffer when it's full.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agovmx: add help functions to support PML
Kai Huang [Mon, 4 May 2015 10:15:07 +0000 (12:15 +0200)]
vmx: add help functions to support PML

This patch adds help functions to enable/disable PML, and flush PML buffer for
single vcpu and particular domain for further use.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agovmx: add new data structure member to support PML
Kai Huang [Mon, 4 May 2015 10:14:15 +0000 (12:14 +0200)]
vmx: add new data structure member to support PML

A new 4K page pointer is added to arch_vmx_struct as PML buffer for vcpu. And a
new 'status' field is added to vmx_domain to indicate whether PML is enabled for
the domain or not.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agovmx: add PML definition and feature detection
Kai Huang [Mon, 4 May 2015 10:12:11 +0000 (12:12 +0200)]
vmx: add PML definition and feature detection

The patch adds PML definition and feature detection. Note PML won't be detected
if PML is disabled from boot parameter. PML is also disabled in construct_vmcs,
as it will only be enabled when domain is switched to log dirty mode.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agolog-dirty: add new paging_mark_gfn_dirty
Kai Huang [Mon, 4 May 2015 10:10:41 +0000 (12:10 +0200)]
log-dirty: add new paging_mark_gfn_dirty

PML logs GPAs in the PML buffer. The original paging_mark_dirty takes an MFN as
parameter, but it gets the guest pfn internally and uses it as the index for
looking up the radix log-dirty tree. When flushing the PML buffer, calling
paging_mark_dirty directly introduces redundant p2m lookups (gfn->mfn->gfn).
Therefore we introduce paging_mark_gfn_dirty, which is the bulk of
paging_mark_dirty but takes a guest pfn as parameter, and call it directly when
flushing the PML buffer. The original paging_mark_dirty then simply becomes a
wrapper around paging_mark_gfn_dirty.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agovmx: add new boot parameter to control PML enabling
Kai Huang [Mon, 4 May 2015 10:09:03 +0000 (12:09 +0200)]
vmx: add new boot parameter to control PML enabling

A top level EPT parameter "ept=<options>" and a sub boolean "opt_pml_enabled"
are added to control PML. Other booleans can be further added for any other EPT
related features.

The document description for the new parameter is also added.

Signed-off-by: Kai Huang <kai.huang@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agotest_x86_emulate: extend EFLAGS check of CMPXCHG test
Eugene Korenevsky [Mon, 4 May 2015 09:56:21 +0000 (11:56 +0200)]
test_x86_emulate: extend EFLAGS check of CMPXCHG test

CMPXCHG: in the case of inequality of the rAX and the operand,
need to check CF, PF, AF, SF and OF flags as well.

This adjustment covers the fix of incorrect comparison during
CMPXCHG emulation.

Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
10 years agox86_emulate: fix EFLAGS setting of CMPXCHG emulation
Eugene Korenevsky [Mon, 4 May 2015 09:55:41 +0000 (11:55 +0200)]
x86_emulate: fix EFLAGS setting of CMPXCHG emulation

CMPXCHG sets the CF, PF, AF, SF, and OF flags according to the result of
comparing rAX with the operand of the instruction.
rAX must be the first argument of the comparison (the minuend), the operand
must be the second one (the subtrahend).

Due to the improper order of the comparison arguments, the CF, PF, AF, SF and OF
flags were set incorrectly in the case of inequality. Swap the arguments.
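
Why the argument order matters can be sketched with a small flag calculation. This is an illustrative model, not the emulator's code: the names are hypothetical, the operand size is fixed at 32 bits, and PF and AF are omitted for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct example_flags {
    bool cf, zf, sf, of;
};

/* Flags of a 32-bit "CMP minuend, subtrahend", i.e. of minuend - subtrahend.
 * PF and AF are left out to keep the sketch short. */
struct example_flags example_cmp32(uint32_t minuend, uint32_t subtrahend)
{
    uint32_t res = minuend - subtrahend;
    struct example_flags f;

    f.cf = minuend < subtrahend;                 /* unsigned borrow */
    f.zf = res == 0;
    f.sf = (res >> 31) & 1;
    /* Signed overflow: the operands differ in sign and the result's sign
     * differs from the minuend's. */
    f.of = (((minuend ^ subtrahend) & (minuend ^ res)) >> 31) & 1;
    return f;
}
```

With rAX = 1 and an operand of 2, `example_cmp32(1, 2)` reports a borrow (CF set) and a negative result (SF set), whereas the swapped call `example_cmp32(2, 1)` clears both. ZF is the same either way, which is why the bug only showed in the inequality case.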

Signed-off-by: Eugene Korenevsky <ekorenevsky@gmail.com>
10 years agox86: improve psr scheduling code
Chao Peng [Mon, 4 May 2015 09:54:39 +0000 (11:54 +0200)]
x86: improve psr scheduling code

Switching RMID from the previous vcpu to the next vcpu only needs to write
MSR_IA32_PSR_ASSOC once. Writing it with the value of the next vcpu is enough;
there is no need to write '0' first. The idle domain has RMID set to 0, and
because the MSR is already updated lazily, just switch to it like any other.

Also move the initialization of the per-CPU variable which is used for the lazy
update from context switch to CPU starting.
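
The lazy-update pattern can be sketched as below. Everything here is a hypothetical stand-in: the per-CPU cache is a plain array, and the MSR write is replaced by a counter so the effect is observable. It also assumes the hardware value starts at 0.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins: one cached value per CPU, and a counter in place of
 * the real MSR write. */
#define EXAMPLE_NR_CPUS 4
static uint64_t example_cached_assoc[EXAMPLE_NR_CPUS];
static unsigned int example_msr_writes;

static void example_wrmsr(uint64_t val)
{
    (void)val;
    example_msr_writes++;
}

/* Called once when a CPU starts, rather than on every context switch. */
void example_cpu_starting(unsigned int cpu)
{
    example_cached_assoc[cpu] = 0;
}

/* On context switch, write the next vcpu's value directly, and only if it
 * differs from what this CPU last wrote. */
void example_ctxt_switch_to(unsigned int cpu, uint64_t next_rmid)
{
    if (example_cached_assoc[cpu] != next_rmid) {
        example_cached_assoc[cpu] = next_rmid;
        example_wrmsr(next_rmid);
    }
}
```

Switching between two vcpus with the same RMID costs no write at all, and switching to the idle domain (RMID 0) needs no special case.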

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
10 years agolibxlu: don't crash on empty lists
Jan Beulich [Fri, 24 Apr 2015 10:15:15 +0000 (12:15 +0200)]
libxlu: don't crash on empty lists

Prior to 1a09c5113a ("libxlu: rework internal representation of
setting") empty lists in config files did get accepted. Restore that
behavior.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
10 years agox86/hvm: implicitly disable an ioreq server when it is destroyed
Paul Durrant [Fri, 24 Apr 2015 10:14:23 +0000 (12:14 +0200)]
x86/hvm: implicitly disable an ioreq server when it is destroyed

Currently, unless a (non-default) ioreq server is explicitly disabled before
being destroyed, its gmfns will not be placed back into the p2m but still
released back into the ioreq_gmfn mask. This is somewhat counter-intuitive
and easily remedied by this small patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/hvm: actually release ioreq server pages
Paul Durrant [Fri, 24 Apr 2015 10:13:48 +0000 (12:13 +0200)]
x86/hvm: actually release ioreq server pages

hvm_free_ioreq_gmfn has the sense of the ioreq_gmfn mask inverted; it
needs to set a bit to release the gmfn, not clear it.
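
The sense of such a mask can be sketched with a tiny allocator in which a set bit means "free". The names and the mask width are hypothetical and the real code differs. The point is only that allocation clears a bit and release must set it again.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical allocator: a set bit means the slot is free. */
#define EXAMPLE_NR_SLOTS (sizeof(unsigned long) * CHAR_BIT)

/* Returns a free slot index, or -1 if none is left. */
int example_alloc_slot(unsigned long *mask)
{
    for (unsigned int i = 0; i < EXAMPLE_NR_SLOTS; i++) {
        if (*mask & (1UL << i)) {
            *mask &= ~(1UL << i);   /* clear the bit: slot now in use */
            return (int)i;
        }
    }
    return -1;
}

/* Releasing must set the bit again; clearing it here would leak the slot. */
void example_free_slot(unsigned long *mask, unsigned int i)
{
    assert(i < EXAMPLE_NR_SLOTS);
    *mask |= 1UL << i;
}
```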

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agouse 'Hardware domain' instead of 'Domain 0' in hwdom_shutdown()
Vitaly Kuznetsov [Fri, 24 Apr 2015 10:07:00 +0000 (12:07 +0200)]
use 'Hardware domain' instead of 'Domain 0' in hwdom_shutdown()

hwdom_shutdown() operates with hardware domains, use the proper wording.
Eliminate pointless braces from switch cases.

Use hardware_domain->domain_id instead of hardware_domid to print the actual
domain ID as in some cases it can differ (e.g. Dom0 dies before the actual HW
domain got created, kexec for the HW domain is being performed,...).

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
10 years agoAMD IOMMU: only translate remapped IO-APIC RTEs
Jan Beulich [Fri, 24 Apr 2015 10:06:26 +0000 (12:06 +0200)]
AMD IOMMU: only translate remapped IO-APIC RTEs

1aeb1156fa ("x86 don't change affinity with interrupt unmasked")
introducing RTE reads prior to the respective interrupt having got
enabled for the first time uncovered a bug in 2ca9fbd739 ("AMD IOMMU:
allocate IRTE entries instead of using a static mapping"): We obviously
shouldn't be translating RTEs for which remapping didn't get set up
yet.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
10 years agopassthrough/amd: avoid reading an uninitialized variable
Tim Deegan [Fri, 24 Apr 2015 10:04:57 +0000 (12:04 +0200)]
passthrough/amd: avoid reading an uninitialized variable

update_intremap_entry_from_msi() doesn't write to its data pointer on
some error paths, so copying that variable into the msg would count
as undefined behaviour.

Signed-off-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <Suravee.Suthikulpanit@amd.com>
10 years agox86/shadow: fix big-memory build
Jan Beulich [Thu, 23 Apr 2015 11:10:19 +0000 (13:10 +0200)]
x86/shadow: fix big-memory build

Modifiers to the pointer passed into list_next_entry() are also being
applied to the macro's return type, and hence if the input pointer is
const-qualified a variable the result gets assigned to would also need
to be. As that doesn't seem desirable here, drop the const qualifier
on the input pointer instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agorefine C++ header checking compiler invocation
Jan Beulich [Thu, 23 Apr 2015 11:09:10 +0000 (13:09 +0200)]
refine C++ header checking compiler invocation

g++ 4.1.x dies with "cc1plus: error: output filename specified twice"
on the currently used construct. That's apparently due to it converting
the manually specified "c++" into "c++-header", and mis-handling that
(which, when using "c++-header" explicitly btw gets mis-handled even
with 4.9.x and also, using "c-header", by the plain C compiler).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoadjust assertion in alloc_heap_pages()
Jan Beulich [Thu, 23 Apr 2015 11:08:40 +0000 (13:08 +0200)]
adjust assertion in alloc_heap_pages()

Older gcc warns (and due to -Werror fails) on this ASSERT() now that
"node" is of unsigned type. Make it more useful at once.

Coverity-ID: 1055630

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agosysctl: zero structures on the stack
Andrew Cooper [Thu, 23 Apr 2015 11:07:59 +0000 (13:07 +0200)]
sysctl: zero structures on the stack

None of these structures currently contain a hole.  However, there is a risk
that a change to the structure might introduce a hole, and thus create a
hypervisor stack leak to the toolstack.

Mitigate this risk by preemptively zeroing these structures.  These are not
hotpaths, so the slight overhead is not an issue.
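
The risk and the mitigation can be sketched as follows. The structure is hypothetical and is chosen so that it has a padding hole on common ABIs, and the copy to the toolstack is modelled by a plain memcpy(). The sketch relies on the usual compiler behaviour that padding bytes cleared by memset() stay zero across the member assignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical structure with a padding hole after 'flag' on common ABIs. */
struct example_info {
    uint8_t  flag;
    uint64_t value;
};

/* Fill 'out' as if copying a stack structure to a less privileged caller. */
void example_fill_info(void *out)
{
    struct example_info info;

    /* memset() clears the padding bytes too; assigning the members alone
     * would leave whatever was on the stack in the hole. */
    memset(&info, 0, sizeof(info));
    info.flag = 1;
    info.value = 42;

    memcpy(out, &info, sizeof(info));
}
```

If a later change adds or reorders fields and opens a new hole, the memset() already covers it, which is the point of zeroing preemptively.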

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agoVT-d: replace bogus gprintk()
Jan Beulich [Thu, 23 Apr 2015 11:05:33 +0000 (13:05 +0200)]
VT-d: replace bogus gprintk()

Just like the other messages in this function this one should be issued
through plain printk() - the current vCPU is irrelevant here. (Noticed
while backporting to older trees, which don't have gprintk().)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agox86/hvm: refactor code that allocates ioreq gfns.
Tim Deegan [Thu, 16 Apr 2015 16:34:24 +0000 (17:34 +0100)]
x86/hvm: refactor code that allocates ioreq gfns.

It was confusing GCC's uninitialized-variable detection.

Signed-off-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agolibxl: fd events: Suppress spurious fd events
Ian Jackson [Thu, 16 Apr 2015 18:23:28 +0000 (19:23 +0100)]
libxl: fd events: Suppress spurious fd events

Always recheck with poll() right before making the callback.

All sorts of things may have happened since poll() originally signaled
the fd.  We would like the main functional libxl code not to have to
worry about spurious wakeups.

In particular, this fixes a bug in the save/restore callout: the save
helper message reader operates with the fd in blocking mode.  In a
multithreaded program one thread might have eaten all the messages out
of the fd while another one is busy returning from poll and reacquiring
the libxl lock, possibly resulting in a deadlock.
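
A minimal sketch of the recheck idea, assuming a plain POSIX environment: poll the descriptor again with a zero timeout just before acting on it. The helper name `fd_recheck` is hypothetical and is not the libxl function.

```c
#include <poll.h>

/*
 * Return 1 if fd still reports any of the requested events right now,
 * 0 if it does not, and -1 if poll() itself failed.
 * A zero timeout makes poll() return immediately instead of blocking.
 */
int fd_recheck(int fd, short events)
{
    struct pollfd pfd = { .fd = fd, .events = events, .revents = 0 };
    int r = poll(&pfd, 1, 0);

    if (r < 0)
        return -1;
    return (r > 0 && (pfd.revents & events)) ? 1 : 0;
}
```

A caller would invoke the event callback only when this returns 1, so a thread that lost the race for the data does not go on to block in read().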

(Also, we abolish the anomalous direct caller of efd->func.)

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reported-by: Jim Fehlig <jfehlig@suse.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jim Fehlig <jfehlig@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: fd events: Break out fd_occurs
Ian Jackson [Thu, 16 Apr 2015 18:23:27 +0000 (19:23 +0100)]
libxl: fd events: Break out fd_occurs

No functional change, only code motion.

Currently, contrary to this function's name, there are two sites where
efd->func() is called so one of them doesn't go through here just yet.
That will be dealt with in the next commit.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jim Fehlig <jfehlig@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: fd events: Break out libxl__fd_poll_recheck
Ian Jackson [Thu, 16 Apr 2015 18:23:26 +0000 (19:23 +0100)]
libxl: fd events: Break out libxl__fd_poll_recheck

Replaces two call sites where a rechecking poll() was open-coded.

No functional change, other than to highly unusual error path
diagnosis, and debug and error message output.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Jim Fehlig <jfehlig@suse.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agodocs/build: Support generation of pandoc documents
Andrew Cooper [Tue, 21 Apr 2015 15:47:25 +0000 (16:47 +0100)]
docs/build: Support generation of pandoc documents

pandoc is a superset of markdown

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agodocs/build: Move install checks into individual build targets
Andrew Cooper [Tue, 21 Apr 2015 15:47:05 +0000 (16:47 +0100)]
docs/build: Move install checks into individual build targets

For top-level targets which use more than a single program to produce content
(txt already, pdf once pandoc is supported), these current checks are
unsuitable.

By moving the install checks to the rules which actually use the programs,
it is now possible to build a subset of a top-level target depending on the
installed programs.

As a bonus, it removes the need to recurse for txt, man-pages and pdf targets.

A side effect of this is that every individual source which cannot be
generated will have a specific message logged, giving the file and program.
As such, these messages are updated to consistently report the target file
which was not generated.

Finally, update "ifdef foo" to "ifneq($(foo),)" to be more resilient to errors
caused by having foo defined as an empty string.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agodocs/build: Do not create directories if we are not going to use them
Andrew Cooper [Mon, 20 Apr 2015 10:49:24 +0000 (11:49 +0100)]
docs/build: Do not create directories if we are not going to use them

and be quiet about doing so; these are only intermediate directories.

No practical change, but the build log is roughly halved.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agodocs/build: Do not use move-if-changed
Andrew Cooper [Mon, 20 Apr 2015 10:49:23 +0000 (11:49 +0100)]
docs/build: Do not use move-if-changed

Nothing expensive depends on these results.

Also prefer $(INSTALL_DATA) over cp to get correct file attributes (see
fb33b2b "docs: make .txt files over-writable when building from r/o sources")

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agodocs/build: Move two rules for consistency, and comment sections
Andrew Cooper [Mon, 20 Apr 2015 10:49:22 +0000 (11:49 +0100)]
docs/build: Move two rules for consistency, and comment sections

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agodocs/build: Do not open-code $*
Andrew Cooper [Mon, 20 Apr 2015 10:49:21 +0000 (11:49 +0100)]
docs/build: Do not open-code $*

Sometimes there is already a round enough wheel to hand.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agodocs/build: Misc cleanup
Andrew Cooper [Mon, 20 Apr 2015 10:49:20 +0000 (11:49 +0100)]
docs/build: Misc cleanup

 * Use $(PANDOC) from ./configure
 * Swap '-N' for its less-obscure longer form
 * Don't explicitly echo about markdown.  The call to markdown is emitted
 * Whitespace cleanup

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
10 years agohotplug/FreeBSD: set network interface MTU to bridge MTU
Gustau Perez [Mon, 20 Apr 2015 07:12:52 +0000 (09:12 +0200)]
hotplug/FreeBSD: set network interface MTU to bridge MTU

At creation time, tap and xnb interfaces are created with an mtu of
1500 bytes, assuming the bridge will have the same value.
Instead, check the bridge mtu and configure the new xnb or
tap interface with the same value.

The tools used are sed and ifconfig, both included in base. No need
to install additional ports (no new dependencies).

Signed-off-by: Gustau Perez <gustau.perez@gmail.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.cmapbell@citrix.com>
[ ijc -- clarified title ]

10 years agolibxl: document foreground '-F' option of create command
Giuseppe Mazzotta [Fri, 17 Apr 2015 15:36:34 +0000 (17:36 +0200)]
libxl: document foreground '-F' option of create command

Signed-off-by: Giuseppe Mazzotta <g.mazzotta@iragan.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: use DEBUG log level instead of INFO
Wei Liu [Fri, 17 Apr 2015 11:31:29 +0000 (12:31 +0100)]
libxl: use DEBUG log level instead of INFO

Make libxl less noisy when destroying a domain.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: provide libxl_bitmap_{or,and}
Linda Jacobson [Wed, 15 Apr 2015 17:02:07 +0000 (11:02 -0600)]
libxl: provide libxl_bitmap_{or,and}

New functions to provide the logical AND and OR of two bitmaps.  These are
generically useful utility functions added to the public API for the
benefit of libxl's users.
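
For readers unfamiliar with the operation, here is a small sketch of byte-wise OR and AND over two bitmaps. The `struct bitmap` type and the `bitmap_or` / `bitmap_and` names are hypothetical; libxl's own type, signatures and size handling may well differ. The sketch assumes the destination buffer is already allocated and treats bytes beyond the end of a shorter input as zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical byte-array bitmap; not libxl's own definition. */
struct bitmap {
    size_t   size;  /* number of bytes in map */
    uint8_t *map;
};

/* Read byte i of bm, treating anything past its end as zero. */
static uint8_t byte_or_zero(const struct bitmap *bm, size_t i)
{
    return i < bm->size ? bm->map[i] : 0;
}

/* dst = a | b, computed over dst->size bytes. */
void bitmap_or(struct bitmap *dst, const struct bitmap *a,
               const struct bitmap *b)
{
    for (size_t i = 0; i < dst->size; i++)
        dst->map[i] = (uint8_t)(byte_or_zero(a, i) | byte_or_zero(b, i));
}

/* dst = a & b, computed over dst->size bytes. */
void bitmap_and(struct bitmap *dst, const struct bitmap *a,
                const struct bitmap *b)
{
    for (size_t i = 0; i < dst->size; i++)
        dst->map[i] = (uint8_t)(byte_or_zero(a, i) & byte_or_zero(b, i));
}
```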

In the future they may also be useful internally, e.g. in the
vNUMA configuration check function.

Signed-off-by: Linda Jacobson <lindaj@jma3.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- rewrote commit message and fixed typo ]

10 years agoxen/arm: Enable mem_access on ARM
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:24 +0000 (17:06 +0200)]
xen/arm: Enable mem_access on ARM

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
10 years agotools/tests: Enable xen-access on ARM
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:23 +0000 (17:06 +0200)]
tools/tests: Enable xen-access on ARM

Switch to use maximum gpfn as the limit to setting permissions. Also,
move HAS_MEM_ACCESS definition into config.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- removed obsolete reference to test_and_set_bit from the
         commit message ]

10 years agotools/libxc: Allocate magic page for mem access on ARM
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:22 +0000 (17:06 +0200)]
tools/libxc: Allocate magic page for mem access on ARM

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: Implement domain_get_maximum_gpfn
Julien Grall [Mon, 20 Apr 2015 15:06:21 +0000 (17:06 +0200)]
xen/arm: Implement domain_get_maximum_gpfn

The function domain_get_maximum_gpfn is returning the maximum gpfn ever
mapped in the guest. We can use d->arch.p2m.max_mapped_gfn for this purpose.

We use this in xen-access to stop the user from attempting to set page
permissions on pages which don't exist for the domain, as a non-arch specific
sanity check.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: Instruction prefetch abort (X) mem_access event handling
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:19 +0000 (17:06 +0200)]
xen/arm: Instruction prefetch abort (X) mem_access event handling

Add missing structure definition for iabt and update the trap handling
mechanism to only inject the exception if the mem_access checker
decides to do so.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: Data abort exception (R/W) mem_access events
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:18 +0000 (17:06 +0200)]
xen/arm: Data abort exception (R/W) mem_access events

This patch enables storing, setting, checking and delivering LPAE R/W mem_events.
As the LPAE PTEs lack enough available software programmable bits,
we store the permissions in a Radix tree. The tree is only looked at if
mem_access_enabled is turned on.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: Allow hypervisor access to mem_access protected pages
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:17 +0000 (17:06 +0200)]
xen/arm: Allow hypervisor access to mem_access protected pages

The hypervisor may use the MMU to verify that the given guest has read/write
access to a given page during hypercalls. As we may have custom mem_access
permissions set on these pages, we do a software-based type checking in case
the MMU based approach failed, but only if mem_access_enabled is set.

These memory accesses are not forwarded to the mem_event listener. Accesses
performed by the hypervisor are currently not part of the mem_access scheme.
This is consistent behaviour with the x86 side as well.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: groundwork for mem_access support on ARM
Tamas K Lengyel [Mon, 20 Apr 2015 15:06:16 +0000 (17:06 +0200)]
xen/arm: groundwork for mem_access support on ARM

Add necessary changes for page table construction routines to pass
the default access information and hypercall continuation mask. Also,
define necessary functions and data fields to be used later by mem_access.

The p2m_access_t info will be stored in a Radix tree as the PTE lacks
enough software programmable bits, thus in this patch we add the radix-tree
construction/destruction portions. The tree itself will be used later
by mem_access.

Signed-off-by: Tamas K Lengyel <tklengyel@sec.in.tum.de>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoConfig.mk: Fix (and, effectively, update) QEMU_TAG
Ian Jackson [Tue, 21 Apr 2015 10:27:59 +0000 (11:27 +0100)]
Config.mk: Fix (and, effectively, update) QEMU_TAG

In 952944f7 "QEMU_TAG update" my tag update script mangled the
machinery which sets QEMU_TRADITIONAL_REVISION, by replacing the first
assignment to QEMU_TRADITIONAL_REVISION it found rather than the one
which ought to have been replaced.

The result was that:
 * From that commit on, QEMU_TAG was no longer honoured although
   QEMU_TRADITIONAL_REVISION still was
 * That particular update to QEMU_TRADITIONAL_REVISION's default
   value was effective
 * The next attempt to update QEMU_TRADITIONAL_REVISION, in
   1fc3aeb3 "libxl: use new QEMU xenstore protocol" was totally
   ineffective.

Fix this by restoring the transfer from QEMU_TAG.  The effects are:
 * Once more, honour QEMU_TAG.
 * Belatedly apply the qemu-trad change part of "libxl: use new QEMU
   xenstore protocol".

(I have also fixed my script to not do this again.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
10 years agosysctl: make XEN_SYSCTL_numainfo a little more efficient
Boris Ostrovsky [Tue, 21 Apr 2015 07:06:00 +0000 (09:06 +0200)]
sysctl: make XEN_SYSCTL_numainfo a little more efficient

A number of changes to XEN_SYSCTL_numainfo interface:

* Make sysctl NUMA topology query use fewer copies by combining some
  fields into a single structure and copying distances for each node
  in a single copy.
* NULL meminfo and distance handles are a request for maximum number
  of nodes (num_nodes). If those handles are valid and num_nodes is
  smaller than the number of nodes in the system then -ENOBUFS is
  returned (and correct num_nodes is provided)
* Instead of using max_node_index for passing number of nodes keep this
  value in num_nodes: almost all uses of max_node_index required adding
  or subtracting one to eventually get to number of nodes anyway.
* Replace INVALID_NUMAINFO_ID with XEN_INVALID_MEM_SZ and add
  XEN_INVALID_NODE_DIST.
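
The size-query convention described in the list above can be sketched as a stand-alone function. This is only a model of the described behaviour, not the real sysctl: `query_nodes`, `SYSTEM_NODES` and the made-up memory sizes are all assumptions for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SYSTEM_NODES 4u  /* hypothetical node count for this sketch */

/*
 * meminfo == NULL: report the node count in *num_nodes and return 0.
 * *num_nodes too small: store the correct count and return -ENOBUFS.
 * Otherwise fill meminfo[0..SYSTEM_NODES-1] and return 0.
 */
int query_nodes(uint64_t *meminfo, unsigned int *num_nodes)
{
    if (meminfo == NULL) {
        *num_nodes = SYSTEM_NODES;
        return 0;
    }
    if (*num_nodes < SYSTEM_NODES) {
        *num_nodes = SYSTEM_NODES;
        return -ENOBUFS;
    }
    for (unsigned int i = 0; i < SYSTEM_NODES; i++)
        meminfo[i] = (uint64_t)(i + 1) << 30;  /* made-up sizes */
    *num_nodes = SYSTEM_NODES;
    return 0;
}
```

A caller would typically call once with a NULL buffer to learn the count, allocate that many entries, then call again; on -ENOBUFS it can retry with the count that was written back.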

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
10 years agox86/domctl: don't allow a toolstack domain to pause itself
Andrew Cooper [Tue, 21 Apr 2015 07:05:26 +0000 (09:05 +0200)]
x86/domctl: don't allow a toolstack domain to pause itself

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
10 years agox86/domctl: cleanup
Andrew Cooper [Tue, 21 Apr 2015 07:04:45 +0000 (09:04 +0200)]
x86/domctl: cleanup

 * latch curr/currd once at start
 * drop redundant "ret = 0" and braces
 * use "copyback = 1" when appropriate
 * move break statements inside case-specific braced scopes
 * don't bother checking for NULL before calling xfree()
 * eliminate trailing whitespace
 * Xen style corrections

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agodomctl/sysctl: don't leak hypervisor stack to toolstacks
Andrew Cooper [Tue, 21 Apr 2015 07:03:15 +0000 (09:03 +0200)]
domctl/sysctl: don't leak hypervisor stack to toolstacks

This is CVE-2015-3340 / XSA-132.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agox86/efi: Reserve SMBIOS table region when EFI booting
Ross Lagerwall [Fri, 17 Apr 2015 08:44:48 +0000 (10:44 +0200)]
x86/efi: Reserve SMBIOS table region when EFI booting

Some EFI firmware implementations may place the SMBIOS table in RAM
marked as BootServicesData, which Xen does not consider as reserved.
When dom0 tries to access the SMBIOS, the region is not contained in the
initial P2M and it crashes with a page fault. To fix this, reserve the
SMBIOS region.

Also, fix the memcmp checks for existence of the SMBIOS.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
10 years agopublic/grant_table.h: fix description of GNTTABOP_map_grant_ref
Rafał Wojdyła [Fri, 17 Apr 2015 08:44:29 +0000 (10:44 +0200)]
public/grant_table.h: fix description of GNTTABOP_map_grant_ref

Error code is not returned in the <handle> field of the
gnttab_map_grant_ref structure but in the <status> field only.

Signed-off-by: Rafał Wojdyła <omeg@invisiblethingslab.com>
10 years agoVMX: replace some plain numbers
Liang Li [Fri, 17 Apr 2015 08:42:13 +0000 (10:42 +0200)]
VMX: replace some plain numbers

... making the code better document itself. No functional change
intended.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
10 years agovtpmmgr: execute deep quote in locality 0
Emil Condrea [Wed, 15 Apr 2015 18:00:14 +0000 (21:00 +0300)]
vtpmmgr: execute deep quote in locality 0

Enables deep quote execution for vtpmmgr which can not be started
using locality 2. Flags are used to request additional data to be
present when executing quote. They are interpreted as a bitmask of:
 * VTPM_QUOTE_FLAGS_HASH_UUID
 * VTPM_QUOTE_FLAGS_VTPM_MEASUREMENTS
 * VTPM_QUOTE_FLAGS_GROUP_INFO
 * VTPM_QUOTE_FLAGS_GROUP_PUBKEY

The externData param for TPM_Quote is calculated as:
externData = SHA1 (
       extraInfoFlags
       requestData
       [SHA1 (
          [SHA1 (UUIDs if requested)]
          [SHA1 (vTPM measurements if requested)]
          [SHA1 (vTPM group update policy if requested)]
          [SHA1 (vTPM group public key if requested)]
       ) if flags !=0 ]
)

The response param pcrValues is an array containing the requested hashes used
for externData calculation: UUIDs, vTPM measurements, vTPM group update
policy, group public key. At the end of these hashes the PCR values are
appended.
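
To make the response layout concrete, the sketch below works out where the PCR values would begin in pcrValues for a given flag mask, on the assumption that each requested item contributes exactly one 20-byte SHA-1 digest. The `EX_QUOTE_*` names and their bit values are placeholders invented for this example; the real VTPM_QUOTE_FLAGS_* values should be taken from the vtpmmgr headers.

```c
#include <stddef.h>
#include <stdint.h>

#define SHA1_DIGEST_LEN 20u

/* Placeholder bit assignments; the real flag values may differ. */
#define EX_QUOTE_HASH_UUID         (1u << 0)
#define EX_QUOTE_VTPM_MEASUREMENTS (1u << 1)
#define EX_QUOTE_GROUP_INFO        (1u << 2)
#define EX_QUOTE_GROUP_PUBKEY      (1u << 3)

#define EX_QUOTE_ALL (EX_QUOTE_HASH_UUID | EX_QUOTE_VTPM_MEASUREMENTS | \
                      EX_QUOTE_GROUP_INFO | EX_QUOTE_GROUP_PUBKEY)

/* Number of extra hashes requested by the flag mask. */
unsigned int extra_hash_count(uint32_t flags)
{
    unsigned int n = 0;

    for (uint32_t f = flags & EX_QUOTE_ALL; f != 0; f >>= 1)
        n += f & 1u;
    return n;
}

/* Byte offset at which the PCR values would start, under the assumption above. */
size_t pcr_values_offset(uint32_t flags)
{
    return (size_t)extra_hash_count(flags) * SHA1_DIGEST_LEN;
}
```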

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years agovtpm: deep quote flags
Emil Condrea [Wed, 15 Apr 2015 18:00:13 +0000 (21:00 +0300)]
vtpm: deep quote flags

Currently, the flags are not interpreted by vTPM. They are just
packed and sent to vtpmmgr.

Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
10 years agoxen/vm_event: Add RESUME option to vm_event_op domctl
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:53 +0000 (16:32 +0200)]
xen/vm_event: Add RESUME option to vm_event_op domctl

Thus far mem_access and mem_sharing memops had been able to signal
to Xen to start pulling responses off the corresponding rings. In this patch
we retire these memops and add the equivalent option to the vm_event_op domctl.

The vm_event_op domctl suboptions are the same for each ring thus we
consolidate them into XEN_VM_EVENT_ENABLE/DISABLE/RESUME.

As part of this patch in libxc we also rename the mem_access_enable/disable
functions to monitor_enable/disable and move them into xc_monitor.c.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen/xsm: Split vm_event_op into three separate labels
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:52 +0000 (16:32 +0200)]
xen/xsm: Split vm_event_op into three separate labels

The XSM label vm_event_op has been used to control the three memops
controlling mem_access, mem_paging and mem_sharing. While these systems
rely on vm_event, these are not vm_event operations themselves. Thus,
in this patch we introduce three separate labels for each of these memops.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen/vm_event: Relocate memop checks
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:51 +0000 (16:32 +0200)]
xen/vm_event: Relocate memop checks

The memop handler function for paging/sharing responsible for calling XSM
doesn't really have anything to do with vm_event, thus in this patch we
relocate it into mem_paging_memop and mem_sharing_memop. This has already
been the approach in mem_access_memop, so in this patch we just make it
consistent.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
10 years agoxen/vm_event: Decouple vm_event and mem_access.
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:50 +0000 (16:32 +0200)]
xen/vm_event: Decouple vm_event and mem_access.

The vm_event subsystem has been artificially tied to the presence of mem_access.
While mem_access does depend on vm_event, vm_event is an entirely independent
subsystem that can be used for arbitrary function-offloading to helper apps in
domains. This patch removes the dependency that mem_access needs to be supported
in order to enable vm_event.

A new vm_event_resume function is introduced which pulls all responses off the
given ring and delegates handling to appropriate helper functions (if
necessary). By default, vm_event_resume just pulls the response from the ring
and unpauses the corresponding vCPU. This approach reduces code duplication
and presents a single point of entry for the entire vm_event subsystem's
response handling mechanism.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen/vm_event: Deprecate VM_EVENT_FLAG_DUMMY flag
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:49 +0000 (16:32 +0200)]
xen/vm_event: Deprecate VM_EVENT_FLAG_DUMMY flag

There are no use-cases for this flag.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Tim Deegan <tim@xen.org>
10 years agoxen: Introduce monitor_op domctl
Tamas K Lengyel [Thu, 9 Apr 2015 14:32:48 +0000 (16:32 +0200)]
xen: Introduce monitor_op domctl

In preparation for allowing for introspecting ARM and PV domains the old
control interface via the hvm_op hypercall is retired. A new control mechanism
is introduced via the domctl hypercall: monitor_op.

This patch aims to establish a base API on which future applications can
build.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Tim Deegan <tim@xen.org>
10 years agolibxenstat: qmp_read fix and cleanup
Wei Liu [Wed, 8 Apr 2015 16:08:22 +0000 (17:08 +0100)]
libxenstat: qmp_read fix and cleanup

The second argument of poll(2) is the number of file descriptors. POLLIN
is defined as 1 so it happens to work. Also reduce the size of array to
one as there is only one file descriptor.
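
For reference, a minimal sketch of a poll(2) call on a single descriptor, with the count passed as the second argument and the event mask set in the pollfd entry. The helper name `wait_readable` is hypothetical and is not the libxenstat function.

```c
#include <poll.h>

/*
 * Wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 if not (timeout, or only other conditions
 * were reported), and -1 if poll() failed.
 */
int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd[1];  /* one descriptor, so a one-element array */
    int r;

    pfd[0].fd = fd;
    pfd[0].events = POLLIN;
    pfd[0].revents = 0;

    /* Second argument is the number of entries in pfd, not an event mask. */
    r = poll(pfd, 1, timeout_ms);
    if (r < 0)
        return -1;
    return (r > 0 && (pfd[0].revents & POLLIN)) ? 1 : 0;
}
```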

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxenstat: always free qmp_stats
Wei Liu [Wed, 8 Apr 2015 16:08:21 +0000 (17:08 +0100)]
libxenstat: always free qmp_stats

Originally qmp_stats was only freed in the failure path and leaked in the
success path.

Instead of wiring up the success path, rearrange the code a bit to
always free qmp_stats before checking if info is NULL.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxenstat: YAJL_GET_STRING may return NULL
Wei Liu [Wed, 8 Apr 2015 16:08:20 +0000 (17:08 +0100)]
libxenstat: YAJL_GET_STRING may return NULL

Passing NULL to strcmp can cause a segmentation fault. Continue in that
case.
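
The guard can be sketched as a loop that skips NULL entries before comparing. The function `count_matches` and its arguments are hypothetical; in the real code the strings come from the YAJL tree rather than from a plain array.

```c
#include <stddef.h>
#include <string.h>

/* Count entries equal to key, never passing NULL to strcmp(). */
size_t count_matches(const char *const *items, size_t n, const char *key)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (items[i] == NULL)
            continue;  /* nothing to compare; move on to the next entry */
        if (strcmp(items[i], key) == 0)
            hits++;
    }
    return hits;
}
```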

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxenstat: reuse xc_handle open in xenstat_init
Wei Liu [Wed, 8 Apr 2015 16:08:19 +0000 (17:08 +0100)]
libxenstat: reuse xc_handle open in xenstat_init

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Charles Arnold <carnold@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl: check return value of libxl_vcpu_setaffinity
Wei Liu [Wed, 8 Apr 2015 16:05:24 +0000 (17:05 +0100)]
libxl: check return value of libxl_vcpu_setaffinity

That function can fail.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoxen/arm: Don't write to GICH_MISR
Edgar E. Iglesias [Fri, 10 Apr 2015 06:21:10 +0000 (16:21 +1000)]
xen/arm: Don't write to GICH_MISR

GICH_MISR is read-only in GICv2.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agoREADME: Reference some more comprehensive docs from the Quick-start
Ian Campbell [Tue, 14 Apr 2015 15:25:49 +0000 (16:25 +0100)]
README: Reference some more comprehensive docs from the Quick-start

The quick-start is not terribly comprehensive for beginners.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Jan Beulich <JBeulich@suse.com>
10 years agoxenstore: document xs_set_permissions
Wei Liu [Tue, 31 Mar 2015 12:26:11 +0000 (13:26 +0100)]
xenstore: document xs_set_permissions

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
10 years agolibxl/vcpu-set - allow to decrease vcpu count on overcommitted guests (v5)
Konrad Rzeszutek Wilk [Fri, 3 Apr 2015 20:02:34 +0000 (16:02 -0400)]
libxl/vcpu-set - allow to decrease vcpu count on overcommitted guests (v5)

We have a check to warn the user if they are overcommitting.
But the check only looks at the host's CPU count and does
not take into account the case when the user is trying to fix
the overcommit, that is, when they want to limit the number of
online VCPUs.

This fix allows the user to offline vCPUs without any
warnings when they are running an overcommitted guest.
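
The resulting rule can be written as a single predicate. This is a sketch of the logic as described in the message, with a hypothetical function name and parameters; the actual xl code obtains these counts from libxl.

```c
#include <stdbool.h>

/*
 * Should changing the vCPU count trigger the overcommit warning?
 * Warn only when the request both exceeds the host CPU count and raises
 * the number of online vCPUs; lowering the count never warns.
 */
bool should_warn_overcommit(unsigned int requested, unsigned int online,
                            unsigned int host_cpus)
{
    return requested > host_cpus && requested > online;
}
```

For instance, a guest with 8 online vCPUs on a 4-CPU host can be reduced to 6 without a warning, even though 6 still exceeds the host's CPU count.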

Also fix the extra space in the message.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>