xenbits.xensource.com Git - people/liuw/libxenctrl-split/xen.git/log
9 years ago libxl: convert libxl__sprintf(gc) to GCSPRINTF
Wei Liu [Tue, 17 Nov 2015 16:19:19 +0000 (16:19 +0000)]
libxl: convert libxl__sprintf(gc) to GCSPRINTF

The rune used is:

  sed -i 's/libxl__sprintf(gc,\s*/GCSPRINTF(/g' libxl*.c

This rune is simple and better than trying to match every possible
pattern.
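
As an illustration only (not part of the patch), the sketch below shows why
such a textual substitution is safe: a GCSPRINTF-style macro simply supplies
the variable named gc from the enclosing scope. The names toy_gc,
toy_sprintf, TOY_GCSPRINTF and vif_name are hypothetical stand-ins, not
libxl's real definitions.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

#define TOY_GC_MAX 16

/* Hypothetical stand-in for a gc context: it only remembers allocations
 * so that they can all be freed together later. */
typedef struct {
    void *ptrs[TOY_GC_MAX];
    int count;
} toy_gc;

/* Hypothetical stand-in for libxl__sprintf(): format into a fresh buffer
 * that is owned by the gc. Returns NULL on failure. */
static char *toy_sprintf(toy_gc *gc, const char *fmt, ...)
{
    va_list ap;
    int len;
    char *buf;

    assert(gc != NULL && gc->count < TOY_GC_MAX);

    /* First pass: measure the formatted length. */
    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0)
        return NULL;

    buf = malloc((size_t)len + 1);
    if (buf == NULL)
        return NULL;

    /* Second pass: actually format into the buffer. */
    va_start(ap, fmt);
    vsnprintf(buf, (size_t)len + 1, fmt, ap);
    va_end(ap);

    gc->ptrs[gc->count++] = buf;
    return buf;
}

/* Picks up the variable named `gc` from the enclosing scope, so
 * toy_sprintf(gc, "x%d", n) can be rewritten as TOY_GCSPRINTF("x%d", n). */
#define TOY_GCSPRINTF(...) toy_sprintf(gc, __VA_ARGS__)

/* Release everything the gc has recorded. */
static void toy_gc_free(toy_gc *gc)
{
    while (gc->count > 0)
        free(gc->ptrs[--gc->count]);
}

static char *vif_name(toy_gc *gc, int domid, int devid)
{
    return TOY_GCSPRINTF("vif%d.%d", domid, devid);
}
```

The macro only works where a variable called gc is in scope, which is the
assumption the sed rune relies on as well.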

Two instances in libxl_dm.c need fixing up. They are in fact better to just
use libxl__strdup.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago tools/hotplug: quote all variables in vif-bridge
Olaf Hering [Thu, 19 Nov 2015 08:32:52 +0000 (08:32 +0000)]
tools/hotplug: quote all variables in vif-bridge

Cosmetics: most of the variables used in vif-bridge are already quoted.
Add quoting also to the remaining shell variables.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago docs: Introduce xenstore paths for guest network address information
Paul Durrant [Tue, 17 Nov 2015 11:32:05 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for guest network address information

It is useful for a toolstack to be able to see the network addresses
in use by a domain for a particular vif in xenstore for display
purposes and, for example, so that a VNC session can be established
to the guest GUI.

This patch documents paths to allow a domain to advertise an interface
name, MAC (unicast and multicast) and IP (version 4 and 6) address
information.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago docs: Introduce xenstore paths for hotplug features
Paul Durrant [Tue, 17 Nov 2015 11:32:04 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for hotplug features

Without some indication from a guest it is not possible for a
toolstack to know whether instantiation of a new vbd or vif should
result in a new PV device of the appropriate type being brought online.
(In other words whether guest PV drivers are present and functioning).

This patch documents two paths which vif and vbd frontend drivers can
use to advertise their ability to respond to new vif or vbd
instantiations.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago docs: Introduce xenstore paths for PV driver information
Paul Durrant [Tue, 17 Nov 2015 11:32:03 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for PV driver information

For domain management purposes it is convenient to be able to see
information about PV drivers in xenstore. The XAPI toolstack in
XenServer has always created a ~/drivers path for this purpose.

This patch documents that path and also adds a specification of how
it should be used.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years ago docs: Introduce xenstore paths for PV control features
Paul Durrant [Tue, 17 Nov 2015 11:32:02 +0000 (11:32 +0000)]
docs: Introduce xenstore paths for PV control features

XenServer already makes use of ~/control/feature-suspend being written
to advertise guest capability of responding to 'suspend' when written to
~/control/shutdown and, since they are derived from XenServer drivers,
the Xen Project Windows PV drivers attempt to write this value. The write
currently fails for libxl provisioned VMs because ~/control is read-only
to the guest (only ~/control/shutdown is writable, for acknowledgement
purposes).

This patch documents feature-suspend and also a set of similar control
feature flags, so that they may be added to libxl provisioned
guests by subsequent patches:

feature-poweroff: PV drivers/agent can shut down the guest
feature-reboot: PV drivers/agent can reboot the guest
feature-s3: PV drivers/agent can trigger guest sleep (HVM only)
feature-s4: PV drivers/agent can trigger guest hibernate (HVM only)

The patch (because it adds features relating to S3 and S4 power states)
also clarifies that the initial set of platform properties mentioned are
booleans, and updates the specifier accordingly.
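
As an illustration only (not part of the patch), a toolstack could build the
xenstore path of one of these flags as sketched below. The sketch assumes
that "~" denotes the guest's home path /local/domain/<domid>; the helper
names toy_feature_known and toy_feature_path are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Control feature flags named in the commit message above. */
static const char *const toy_features[] = {
    "suspend", "poweroff", "reboot", "s3", "s4",
};

/* Return 1 if `name` is one of the documented feature flags, else 0. */
static int toy_feature_known(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(toy_features) / sizeof(toy_features[0]); i++)
        if (strcmp(name, toy_features[i]) == 0)
            return 1;
    return 0;
}

/* Build "/local/domain/<domid>/control/feature-<name>" into buf.
 * Returns 0 on success, -1 if the flag is unknown or buf is too small. */
static int toy_feature_path(char *buf, size_t len,
                            unsigned int domid, const char *name)
{
    int n;

    assert(buf != NULL && name != NULL);

    if (!toy_feature_known(name))
        return -1;

    n = snprintf(buf, len, "/local/domain/%u/control/feature-%s",
                 domid, name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Reading or writing the resulting key would go through the xenstore client
library, which is deliberately left out of this sketch.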

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
9 years ago get_maintainer: fix perl 5.22/5.24 deprecated/incompatible "\C" use
Joe Perches [Thu, 19 Nov 2015 08:43:53 +0000 (08:43 +0000)]
get_maintainer: fix perl 5.22/5.24 deprecated/incompatible "\C" use

Perl 5.22 emits a deprecated message when "\C" is used in a regex.  Perl
5.24 will disallow it altogether.

Fix it by using [A-Z] instead of \C.

 [ Upstream commit ce8155f7a3d59ce868ea16d8891edda4d865e873 ]

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Tim Deegan <tim@xen.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago tools/libxl: Drop dead code following calls to libxl__exec()
Andrew Cooper [Thu, 19 Nov 2015 12:43:52 +0000 (12:43 +0000)]
tools/libxl: Drop dead code following calls to libxl__exec()

libxl__exec() doesn't ever return.  Inform the compiler of this, and
remove all dead code.

No functional change.
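
As an illustration only (not part of the patch), the sketch below shows the
general technique with the GCC/Clang noreturn attribute. The names toy_exec
and toy_spawn are hypothetical, and the exact annotation used in libxl is
not shown in this log.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Tell the compiler that this function never returns to its caller
 * (GCC/Clang attribute syntax). */
static void toy_exec(const char *path) __attribute__((noreturn));

static void toy_exec(const char *path)
{
    assert(path != NULL);
    /* A real implementation would call an exec*() function here and only
     * reach the lines below if that call failed. */
    fprintf(stderr, "toy_exec: cannot run %s\n", path);
    exit(127);
}

/* Models the caller: the child branch hands over to toy_exec(). */
static int toy_spawn(int is_child, const char *path)
{
    if (is_child) {
        toy_exec(path);
        /* Anything placed here, such as "return -1;", would be dead code
         * and can be dropped once the callee is marked noreturn. */
    }
    return 0; /* parent path */
}
```

With the attribute in place, the compiler no longer expects a return path
after the call.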

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago xen/arm: use masking operation instead of test_bit for MCSF bits
Julien Grall [Thu, 19 Nov 2015 12:46:09 +0000 (12:46 +0000)]
xen/arm: use masking operation instead of test_bit for MCSF bits

This is a follow-up to commit 90f2e2a307fc6a6258c39cc87b3b2bf9441c0fa7 "use
masking operation instead of test_bit for MCSF bits" where the ARM
changes were missing.
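
As an illustration only (not part of the patch), the sketch below contrasts
the two styles of checking a flag. All TOY_* names and the bit positions are
hypothetical and do not reproduce Xen's real MCSF definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical bit numbers and the masks derived from them. */
#define TOY_BIT_IN_MULTICALL    0
#define TOY_BIT_CALL_PREEMPTED  1
#define TOY_MASK_IN_MULTICALL   (1UL << TOY_BIT_IN_MULTICALL)
#define TOY_MASK_CALL_PREEMPTED (1UL << TOY_BIT_CALL_PREEMPTED)

/* Simplified, non-atomic stand-in for a test_bit() helper. */
static bool toy_test_bit(unsigned int nr, const unsigned long *word)
{
    assert(nr < sizeof(*word) * CHAR_BIT);
    return (*word >> nr) & 1UL;
}

/* Old style: look the bit up by number through a helper. */
static bool preempted_via_test_bit(unsigned long flags)
{
    return toy_test_bit(TOY_BIT_CALL_PREEMPTED, &flags);
}

/* New style: a plain mask on the flags word. */
static bool preempted_via_mask(unsigned long flags)
{
    return (flags & TOY_MASK_CALL_PREEMPTED) != 0;
}
```

Both functions return the same answer for every flags value; the mask form
just avoids the helper call and the pointer to the flags word.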

Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago MAINTAINERS: mini-os patches should be copied to minios-devel
Ian Campbell [Fri, 20 Nov 2015 14:22:11 +0000 (14:22 +0000)]
MAINTAINERS: mini-os patches should be copied to minios-devel

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: samuel.thibault@ens-lyon.org
Cc: stefano.stabellini@eu.citrix.com
Cc: minios-devel@lists.xenproject.org
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
9 years ago MINIOS_UPSTREAM_REVISION Update
Ian Campbell [Tue, 24 Nov 2015 16:10:32 +0000 (16:10 +0000)]
MINIOS_UPSTREAM_REVISION Update

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago Config.mk: Update SEABIOS_UPSTREAM_TAG to rel-1.9.0
Ian Campbell [Wed, 18 Nov 2015 12:01:33 +0000 (12:01 +0000)]
Config.mk: Update SEABIOS_UPSTREAM_TAG to rel-1.9.0

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago sched: get rid of the per domain vCPU list in Credit2
Dario Faggioli [Tue, 24 Nov 2015 13:50:30 +0000 (14:50 +0100)]
sched: get rid of the per domain vCPU list in Credit2

As, currently, there is no reason for bothering having
it and keeping it updated.

In fact, it is only used for dumping and changing
vCPUs parameters, but that can be achieved easily with
for_each_vcpu.

While there, improve alignment of comments, and
add a const qualifier to a pointer, making things
more consistent with what happens everywhere else
in the source file.

This also allows us to kill one of the remaining
FIXMEs in the code, which is always good.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years ago sched: get rid of the per domain vCPU list in RTDS
Dario Faggioli [Tue, 24 Nov 2015 13:50:09 +0000 (14:50 +0100)]
sched: get rid of the per domain vCPU list in RTDS

As, currently, there is no reason for bothering having
it and keeping it updated.

In fact, it is only used for dumping and changing
vCPUs parameters, but that can be achieved easily with
for_each_vcpu.

While there, take care of the case when
XEN_DOMCTL_SCHEDOP_getinfo is called but no vCPUs have
been allocated yet (by returning the default scheduling
parameters).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
9 years ago sched: better handle (not) inserting idle vCPUs in runqueues
Dario Faggioli [Tue, 24 Nov 2015 13:49:47 +0000 (14:49 +0100)]
sched: better handle (not) inserting idle vCPUs in runqueues

Idle vCPUs are set to run immediately, as a part of their
own initialization, so we shouldn't even try to put them
in a runqueue. In fact, no scheduler does that, even when
asked to (that is rather explicit in Credit2 and RTDS, a
bit less evident in Credit1).

Let's make things look as follows:
 - in generic code, explicitly avoid even trying to
   insert idle vCPUs in runqueues;
 - in specific schedulers' code, enforce that.

Note that, as csched_vcpu_insert() is no longer being
called, during boot (from sched_init_vcpu()) we can
safely avoid saving the flags when taking the runqueue
lock.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
9 years ago sched: clarify use cases of schedule_cpu_switch()
Dario Faggioli [Tue, 24 Nov 2015 13:49:09 +0000 (14:49 +0100)]
sched: clarify use cases of schedule_cpu_switch()

schedule_cpu_switch() is meant to be only used for moving
pCPUs from a cpupool to no cpupool, and from there back
to a cpupool, *not* to move them directly from one cpupool
to another.

This is something inherent to the way the function is
implemented and called, but is not that clear, just by the
look of it.

Make it more evident by:
 - adding commentary and ASSERT()s;
 - updating the cpupool per-CPU variable (mapping pCPUs to
   pools) directly in schedule_cpu_switch(), rather than
   in various places in cpupool.c.
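
As an illustration only (not part of the patch), the rule described above
can be sketched as an assertion on the old and new pool of a pCPU. The
toy_* names are hypothetical, and the check is a simplification rather than
the ASSERT()s the patch actually adds.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

struct toy_cpupool {
    int id;
};

/* Per-CPU pointer to the pool a pCPU belongs to; NULL means "no pool". */
static struct toy_cpupool *toy_cpu_pool[TOY_NR_CPUS];

/* Move a pCPU either from a pool to no pool, or from no pool into a pool.
 * A direct pool-to-pool move is rejected by the assertion. */
static void toy_cpu_switch(unsigned int cpu, struct toy_cpupool *new_pool)
{
    struct toy_cpupool *old_pool;

    assert(cpu < TOY_NR_CPUS);
    old_pool = toy_cpu_pool[cpu];

    /* Exactly one of the two sides must be "no pool". */
    assert((old_pool == NULL) != (new_pool == NULL));

    /* The per-CPU mapping is updated here, in one place. */
    toy_cpu_pool[cpu] = new_pool;
}
```

Moving a pCPU between two pools then takes two calls, with "no pool" as the
intermediate state.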

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
9 years ago sched: fix locking for insert_vcpu() in credit1 and RTDS
Dario Faggioli [Tue, 24 Nov 2015 13:48:34 +0000 (14:48 +0100)]
sched: fix locking for insert_vcpu() in credit1 and RTDS

The insert_vcpu() hook is handled with inconsistent locking.
In fact, schedule_cpu_switch() calls the hook with runqueue
lock held, while sched_move_domain() relies on the hook
implementations to take the lock themselves (and, since that
is not done in Credit1 and RTDS, such operation is not safe
in those cases).

This is fixed as follows:
 - take the lock in the hook implementations, in specific
   schedulers' code;
 - avoid calling insert_vcpu(), for the idle vCPU, in
   schedule_cpu_switch(). In fact, idle vCPUs are set to run
   immediately, and the various schedulers won't insert them
   in their runqueues anyway, even when explicitly asked to.

While there, still in schedule_cpu_switch(), locking with
_irq() is enough (there's no need to do *_irqsave()).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years ago x86/HVM: type adjustments
Jan Beulich [Tue, 24 Nov 2015 11:31:13 +0000 (12:31 +0100)]
x86/HVM: type adjustments

- constify struct hvm_trap * function parameters
- width reduce and shuffle some struct hvm_trap members
- use bool_t for boolean fields of struct hvm_function_table
- use unsigned for struct hvm_function_table's hap_capabilities field

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago VMX: fix/adjust trap injection
Jan Beulich [Tue, 24 Nov 2015 11:30:31 +0000 (12:30 +0100)]
VMX: fix/adjust trap injection

In the course of investigating the 4.1.6 backport issue of the XSA-156
patch I realized that #DB injection has always been broken, but with it
now getting always intercepted the problem has got worse: Documentation
clearly states that neither DR7.GD nor DebugCtl.LBR get cleared before
the intercept, so this is something we need to do before reflecting the
intercepted exception.

While adjusting this (and also with 4.1.6's strange use of
X86_EVENTTYPE_SW_EXCEPTION for #DB in mind) I further realized that
the special casing of individual vectors shouldn't be done for
software interrupts (resulting from INT $nn).

And then some code movement: Setting of CR2 for #PF can be done in the
same switch() statement (no need for a separate if()), and reading of
intr_info is better done close to the consumption of the variable
(allowing the compiler to generate better code / use fewer registers
for variables).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago ACPI 6.0: Add changes for FADT table
Bob Moore [Tue, 24 Nov 2015 11:25:37 +0000 (12:25 +0100)]
ACPI 6.0: Add changes for FADT table

ACPICA commit 72b0b6741990f619f6aaa915302836b7cbb41ac4

One new 64-bit field at the end of the table.
FADT version is now 6.

Signed-off-by: Bob Moore <robert.moore@intel.com>
[Linux commit aeb823bbacc2a3aaee29eda5875b58a049fa1f78]
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
9 years ago acpi/NUMA: build NUMA for x86 only
Naresh Bhat [Tue, 24 Nov 2015 11:18:02 +0000 (12:18 +0100)]
acpi/NUMA: build NUMA for x86 only

NUMA is currently not supported for ARM in Xen. Add a new compilation
option HAS_NUMA for NUMA. Configure and build NUMA only for x86
architecture now.

Signed-off-by: Naresh Bhat <naresh.bhat@linaro.org>
Signed-off-by: Parth Dixit <parth.dixit@linaro.org>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
9 years ago VT-d: dump the posted format IRTE
Feng Wu [Tue, 24 Nov 2015 11:14:17 +0000 (12:14 +0100)]
VT-d: dump the posted format IRTE

Add the utility to dump the posted format IRTE.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago vt-d: extend struct iremap_entry to support VT-d Posted-Interrupts
Feng Wu [Tue, 24 Nov 2015 11:13:58 +0000 (12:13 +0100)]
vt-d: extend struct iremap_entry to support VT-d Posted-Interrupts

Extend struct iremap_entry according to VT-d Posted-Interrupts Spec.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago VT-d: remove pointless casts
Feng Wu [Tue, 24 Nov 2015 11:13:03 +0000 (12:13 +0100)]
VT-d: remove pointless casts

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago vmx: initialize VT-d Posted-Interrupts Descriptor
Feng Wu [Tue, 24 Nov 2015 11:12:39 +0000 (12:12 +0100)]
vmx: initialize VT-d Posted-Interrupts Descriptor

This patch initializes the VT-d Posted-interrupt Descriptor.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years ago vmx: add some helper functions for Posted-Interrupts
Feng Wu [Tue, 24 Nov 2015 11:11:00 +0000 (12:11 +0100)]
vmx: add some helper functions for Posted-Interrupts

This patch adds some helper functions to manipulate the
Posted-Interrupts Descriptor.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago vmx: extend struct pi_desc to support VT-d Posted-Interrupts
Feng Wu [Tue, 24 Nov 2015 11:10:36 +0000 (12:10 +0100)]
vmx: extend struct pi_desc to support VT-d Posted-Interrupts

Extend struct pi_desc according to VT-d Posted-Interrupts Spec.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years ago VT-d Posted-Interrupts feature detection
Feng Wu [Tue, 24 Nov 2015 11:10:10 +0000 (12:10 +0100)]
VT-d Posted-Interrupts feature detection

VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago iommu: add iommu_intpost to control VT-d Posted-Interrupts feature
Feng Wu [Tue, 24 Nov 2015 11:09:28 +0000 (12:09 +0100)]
iommu: add iommu_intpost to control VT-d Posted-Interrupts feature

VT-d Posted-Interrupts is an enhancement to CPU side Posted-Interrupt.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when guest is running in non-root mode.

This patch adds variable 'iommu_intpost' to control whether VT-d
posted interrupts are enabled in the generic IOMMU code.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years ago vVMX: use latched VMCS machine address
Jan Beulich [Tue, 24 Nov 2015 11:07:27 +0000 (12:07 +0100)]
vVMX: use latched VMCS machine address

Instead of calling domain_page_map_to_mfn() over and over, latch the
guest VMCS machine address unconditionally (i.e. independent of whether
VMCS shadowing is supported by the hardware).

Since this requires altering the parameters of __[gs]et_vmcs{,_real}()
(and hence all their callers) anyway, take the opportunity to also drop
the bogus double underscores from their names (and from
__[gs]et_vmcs_virtual() as well).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago VMX: allocate VMCS pages from domain heap
Jan Beulich [Tue, 24 Nov 2015 11:06:26 +0000 (12:06 +0100)]
VMX: allocate VMCS pages from domain heap

There being only very few uses of the virtual address of a VMCS,
convert these cases to establish a mapping and lift the Xen heap
restriction from the VMCS allocation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
9 years ago MINIOS_UPSTREAM_REVISION Update
Ian Campbell [Mon, 23 Nov 2015 09:39:17 +0000 (09:39 +0000)]
MINIOS_UPSTREAM_REVISION Update

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago tools/libxc: Correct XC_DOM_PAGE_SIZE() to return a long long
Andrew Cooper [Thu, 19 Nov 2015 14:45:41 +0000 (14:45 +0000)]
tools/libxc: Correct XC_DOM_PAGE_SIZE() to return a long long

c/s abdf3c5b "libxc: create p2m list outside of kernel mapping if supported"
introduces a use which Coverity objects to; an int used to mask a uint64_t.

The result needs to be signed to allow ~XC_DOM_PAGE_SIZE() to function
correctly, and long long to function properly in 32bit builds.
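
As an illustration only (not part of the patch), the sketch below shows why
the signedness and width of the page size matter when its complement is used
to mask a 64-bit value. The function names and TOY_PAGE_SHIFT are
hypothetical, and the mask shape ~(size - 1) is an assumption made for the
example; it covers only the signed-versus-unsigned aspect.

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12

/* Page size held in an unsigned 32-bit type: the complement is computed in
 * 32 bits and then zero-extended, so the upper half of addr is lost. */
static uint64_t align_down_unsigned32(uint64_t addr)
{
    uint32_t page_size = UINT32_C(1) << TOY_PAGE_SHIFT;

    return addr & ~(page_size - 1);
}

/* Page size held in a signed long long: the complement is a negative
 * 64-bit value, so every upper bit of the mask stays set. */
static uint64_t align_down_long_long(uint64_t addr)
{
    long long page_size = 1LL << TOY_PAGE_SHIFT;

    return addr & ~(page_size - 1);
}
```

For addresses below 4 GiB the two functions agree, which is why a problem of
this kind can go unnoticed for a long time.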

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxl: correct bug in domain builder regarding page tables for pvh
Juergen Gross [Thu, 19 Nov 2015 16:11:08 +0000 (17:11 +0100)]
libxl: correct bug in domain builder regarding page tables for pvh

Commit 81a76e4b12961a9f54f5021809074196dfe6dbba ("libxc: rework of
domain builder's page table handler") dropped a special case for pvh
resulting in page tables being mapped read-only. This led to a panic
of the domain in early boot.

Correct this error.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
9 years ago x86/P2M: consolidate handling of types not requiring a valid MFN
Jan Beulich [Fri, 20 Nov 2015 11:38:33 +0000 (12:38 +0100)]
x86/P2M: consolidate handling of types not requiring a valid MFN

As noted regarding the mixture of checks in p2m_pt_set_entry(),
introduce a new P2M type group that can be used wherever we just care
about accepting operations with either a valid MFN or a type that may
be used without a (valid) MFN.

Note that p2m_mmio_dm is not included in P2M_NO_MFN_TYPES, as for the
intended purpose that one ought to be treated similar to p2m_invalid
(perhaps the two should ultimately get folded anyway).

Note further that PoD superpages now get INVALID_MFN used when creating
page table entries (was _mfn(0) before).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years ago x86/PoD: tighten conditions for checking super page
Jan Beulich [Fri, 20 Nov 2015 11:37:37 +0000 (12:37 +0100)]
x86/PoD: tighten conditions for checking super page

Since calling the function isn't cheap, try to avoid the call when we
know up front it won't help; see the code comment for details on those
conditions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years ago x86/IO-APIC: fix setting of destinations
Jan Beulich [Thu, 19 Nov 2015 15:46:10 +0000 (16:46 +0100)]
x86/IO-APIC: fix setting of destinations

In commit a85da715cf ("x86/IO-APIC: adjust setting of destinations") I
made a pretty blatant mistake: get_apic_id() can be used there only
when running APICs in physical mode. For both flat and clustered modes
the change was wrong, causing different kinds of boot problems on
affected systems. Don't revert that change though, but use TARGET_CPUS
(equaling cpu_online_map, and with there only being a single online CPU
fulfilling the original commit's intention).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago x86: fixes to LAPIC probing
Andrew Cooper [Thu, 19 Nov 2015 15:44:59 +0000 (16:44 +0100)]
x86: fixes to LAPIC probing

* Fix (unsafe) assumption that X86_FEATURE_APIC resided in feature word 0.
* All 64bit processors have local APICs; drop the vendor check.
* Unconditionally probe MSR_IA32_APICBASE (safely, to fail more gracefully in
  broken situations) and avoid a redundant double rdmsr().
* Avoid repeatedly OR'ing APICBASE_ENABLE and DEFAULT_PHYS_BASE when
  attempting to reenable the LAPIC.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
9 years ago ns16550: limit mapped MMIO size
Jan Beulich [Tue, 17 Nov 2015 12:23:11 +0000 (13:23 +0100)]
ns16550: limit mapped MMIO size

There's no point in mapping more than the memory we actually may need
to touch, and in fact the too large region could actually extend into
another device's one (which currently is benign on x86 since only a
single page gets mapped anyway, but which is a latent bug on ARM
whenever PCI support gets enabled there).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago ns16550: reset bar_64 on each iteration
Jan Beulich [Tue, 17 Nov 2015 12:22:44 +0000 (13:22 +0100)]
ns16550: reset bar_64 on each iteration

Re-using the possibly non-zero value from a previous iteration can't
do any good.
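
As an illustration only (not part of the patch), the sketch below shows the
kind of stale-value hazard being described: a variable that lives outside
the loop keeps the upper half of an earlier 64-bit BAR unless it is reset.
The struct and function names are hypothetical and do not mirror the real
ns16550 probing code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of a BAR. */
struct toy_bar {
    uint32_t low;    /* lower 32 address bits */
    uint32_t high;   /* upper 32 address bits, only valid if is_64bit */
    bool is_64bit;
};

/* Walk the BARs up to index `want` and return the address computed for it.
 * When reset_each_iteration is false, `upper` keeps its value from an
 * earlier 64-bit BAR and leaks into a later 32-bit one. */
static uint64_t toy_bar_address(const struct toy_bar *bars, size_t n,
                                size_t want, bool reset_each_iteration)
{
    uint64_t upper = 0;
    uint64_t addr = 0;
    size_t i;

    for (i = 0; i <= want && i < n; i++) {
        if (reset_each_iteration)
            upper = 0;
        if (bars[i].is_64bit)
            upper = bars[i].high;
        addr = (upper << 32) | bars[i].low;
    }
    return addr;
}
```

Declaring such a variable inside the loop body gives the same effect as the
explicit reset and narrows its scope at the same time.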

Take the opportunity and
- limit a few other variables' scopes at once,
- adjust a few types.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years ago x86: move some APIC related macros to apicdef.h
Feng Wu [Tue, 17 Nov 2015 12:21:52 +0000 (13:21 +0100)]
x86: move some APIC related macros to apicdef.h

Move some APIC related macros to apicdef.h, so they can be used
outside of vlapic.c.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years ago x86: add cmpxchg16b support
Feng Wu [Tue, 17 Nov 2015 12:21:33 +0000 (13:21 +0100)]
x86: add cmpxchg16b support

This patch adds cmpxchg16b support for x86-64, so software
can perform 128-bit atomic write/read.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
9 years ago blkif.h: document blkif multi-queue/ring extension
Bob Liu [Tue, 17 Nov 2015 12:21:13 +0000 (13:21 +0100)]
blkif.h: document blkif multi-queue/ring extension

Document the multi-queue/ring feature in terms of XenStore keys to be written
by the backend and by the frontend.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years ago Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Ian Campbell [Mon, 16 Nov 2015 13:38:33 +0000 (13:38 +0000)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging

9 years ago libxc: create p2m list outside of kernel mapping if supported
Juergen Gross [Thu, 12 Nov 2015 13:43:36 +0000 (14:43 +0100)]
libxc: create p2m list outside of kernel mapping if supported

In case the kernel of a new pv-domU indicates it is supporting a p2m
list outside the initial kernel mapping by specifying INIT_P2M, let
the domain builder allocate the memory for the p2m list from physical
guest memory only and map it to the address the kernel is expecting.

This will enable loading pv-domUs larger than 512 GB.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxc: rework of domain builder's page table handler
Juergen Gross [Thu, 12 Nov 2015 13:43:35 +0000 (14:43 +0100)]
libxc: rework of domain builder's page table handler

In order to prepare a p2m list outside of the initial kernel mapping
do a rework of the domain builder's page table handler. The goal is
to be able to use common helpers for page table allocation and setup
for initial kernel page tables and page tables mapping the p2m list.
This is achieved by supporting multiple mapping areas. The mapped
virtual addresses of the single areas must not overlap, while the
page tables of a new area added might already be partially present.
In particular, the top-level page table exists only once, of course.

Currently restrict the number of mappings to 1 because the only mapping
now is the initial mapping created by the toolstack. No behaviour change
or guest-visible change should be introduced.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxc: split p2m allocation in domain builder from other magic pages
Juergen Gross [Thu, 12 Nov 2015 13:43:34 +0000 (14:43 +0100)]
libxc: split p2m allocation in domain builder from other magic pages

Carve out the p2m list allocation from the .alloc_magic_pages hook of
the domain builder in order to prepare allocating the p2m list outside
of the initial kernel mapping. This will be needed to support loading
domains with huge memory (>512 GB).

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxc: create unmapped initrd in domain builder if supported
Juergen Gross [Thu, 12 Nov 2015 13:43:33 +0000 (14:43 +0100)]
libxc: create unmapped initrd in domain builder if supported

In case the kernel of a new pv-domU indicates it is supporting an
unmapped initrd, don't waste precious virtual space for the initrd,
but allocate only guest physical memory for it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxc: use domain builder architecture private data for x86 pv domains
Juergen Gross [Thu, 12 Nov 2015 13:43:32 +0000 (14:43 +0100)]
libxc: use domain builder architecture private data for x86 pv domains

Move some data private to the x86 domain builder to the private data
section. Remove extra_pages as they are used nowhere.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxc: introduce domain builder architecture specific data
Juergen Gross [Thu, 12 Nov 2015 13:43:31 +0000 (14:43 +0100)]
libxc: introduce domain builder architecture specific data

Reorganize struct xc_dom_image to contain a pointer to domain builder
architecture specific private data. This will abstract the architecture
or domain type specific data from the generally used data.

The new area is allocated as soon as the domain type is known.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago libxc: rename domain builder count_pgtables to alloc_pgtables
Juergen Gross [Thu, 12 Nov 2015 13:43:30 +0000 (14:43 +0100)]
libxc: rename domain builder count_pgtables to alloc_pgtables

Rename the count_pgtables hook of the domain builder to alloc_pgtables
and do the allocation of the guest memory for page tables inside this
hook. This will remove the need for accessing the x86 specific pgtables
member of struct xc_dom_image in the generic domain builder code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago xen: add generic flag to elf_dom_parms indicating support of unmapped initrd
Juergen Gross [Thu, 12 Nov 2015 13:43:29 +0000 (14:43 +0100)]
xen: add generic flag to elf_dom_parms indicating support of unmapped initrd

Support of an unmapped initrd is indicated by the kernel of the domain
via elf notes. In order not to have to use raw elf data in the tools
for support of an unmapped initrd add a flag to the parsed data area
to indicate the kernel supporting this feature.

Switch to using this flag in the hypervisor domain builder.

Cc: andrew.cooper3@citrix.com
Cc: jbeulich@suse.com
Cc: keir@xen.org
Suggested-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago libxc: reorganize domain builder guest memory allocator
Juergen Gross [Thu, 12 Nov 2015 13:43:28 +0000 (14:43 +0100)]
libxc: reorganize domain builder guest memory allocator

Guest memory allocation in the domain builder of libxc is done via
virtual addresses only. In order to be able to support preallocated
areas not virtually mapped reorganize the memory allocator to keep
track of allocated pages globally and in allocated segments.

This requires an interface change of the allocate callback of the
domain builder which currently is using the last mapped virtual
address as a parameter. This is no problem as the only user of this
callback is stubdom/grub/kexec.c using this virtual address to
calculate the last used pfn.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years ago x86: drop hard_smp_processor_id()
Jan Beulich [Mon, 16 Nov 2015 12:12:20 +0000 (13:12 +0100)]
x86: drop hard_smp_processor_id()

... and use what it aliased to directly.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years ago x86/IO-APIC: adjust setting of destinations
Jan Beulich [Mon, 16 Nov 2015 12:11:59 +0000 (13:11 +0100)]
x86/IO-APIC: adjust setting of destinations

setup_IO_APIC_irqs() runs before APs get brought up, so using
desc->arch.cpu_mask at best risks it being either empty or having bits
for CPUs other than the BP set. Just use the APIC ID of the only
online CPU directly.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/IO-APIC: fix setup of Xen internally used IRQs (take 2)
Jan Beulich [Mon, 16 Nov 2015 12:11:08 +0000 (13:11 +0100)]
x86/IO-APIC: fix setup of Xen internally used IRQs (take 2)

..., i.e. namely that of a PCI serial card with an IRQ above the
legacy range. This had got broken by the switch to cpumask_any() in
cpu_mask_to_apicid_phys(). Fix this by allowing all CPUs for that IRQ
(via setup_vector_irq() properly updating a booting CPU's vector_irq[],
thus avoiding "No irq handler for vector" messages and the interrupt
not working).

Cleanup coding style and types there at once.

While doing this I also noticed that io_apic_set_pci_routing() can't
be quite right: It sets up the destination _before_ getting a vector
allocated (which on other than systems using the flat APIC mode
affects the possible destinations), and also didn't restrict affinity
to ->arch.cpu_mask (as established by assign_irq_vector()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoMINIOS_UPSTREAM_REVISION Update
Ian Campbell [Mon, 16 Nov 2015 11:29:45 +0000 (11:29 +0000)]
MINIOS_UPSTREAM_REVISION Update

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools/ocaml/xb: Correct calculations of data/space in the ring
Andrew Cooper [Tue, 10 Nov 2015 10:46:44 +0000 (10:46 +0000)]
tools/ocaml/xb: Correct calculations of data/space in the ring

ml_interface_{read,write}() would miscalculate the quantity of
data/space in the ring if it crossed the ring boundary, and incorrectly
return a short read/write.

This causes a protocol stall, as either side of the ring ends up waiting
for what they believe to be the other side needing to take the next
action.

Correct the calculations to cope with crossing the ring boundary.

In addition, correct the error detection.  It is a hard error if the
producer index gets more than a ring size ahead of the consumer, or if
the consumer ever overtakes the producer.
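
As an illustration only (this is not the stub code changed by the patch),
the corrected read-side calculation can be sketched in Python as below.
The function name, the 32-bit free-running indices and the power-of-two
ring size are assumptions made for the sketch.

```python
# Illustrative sketch only; not the code touched by this patch.
INDEX_MASK = 0xFFFFFFFF  # assumption: indices are free-running 32-bit counters


def ring_read(ring, cons, prod, want):
    """Return (data, new_cons) for a read of up to `want` bytes."""
    size = len(ring)  # assumed to be a power of two
    avail = (prod - cons) & INDEX_MASK
    # Hard error: the producer is more than a ring size ahead, or the
    # consumer overtook the producer (which wraps to a huge value here).
    if avail > size:
        raise ValueError("corrupt ring indices")
    count = min(avail, want)
    start = cons & (size - 1)
    first = min(count, size - start)  # bytes available up to the ring end
    # Copy in two parts so a read crossing the boundary is not cut short.
    data = bytes(ring[start:start + first]) + bytes(ring[:count - first])
    return data, (cons + count) & INDEX_MASK
```

With an 8 byte ring, cons=6 and prod=10, the read returns all four bytes
(two from the end of the ring, two from the start) rather than a short
read of two.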

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: David Scott <dave@recoil.org>
9 years agolibxl: relax readonly check introduced by XSA-142 fix
Jim Fehlig [Fri, 13 Nov 2015 02:40:46 +0000 (19:40 -0700)]
libxl: relax readonly check introduced by XSA-142 fix

The fix for XSA-142 is quite a big hammer, rejecting readonly
disk configuration even when the requested backend is known to
support readonly. While it is true that qemu doesn't support
readonly for emulated IDE or AHCI disks

$ /usr/lib/xen/bin/qemu-system-i386 \
 -drive file=/tmp/disk.raw,if=ide,media=disk,format=raw,readonly=on
qemu-system-i386: Can't use a read-only drive

$ /usr/lib/xen/bin/qemu-system-i386 -device ahci,id=ahci0 \
 -drive file=/tmp/disk.raw,if=none,id=ahcidisk-0,format=raw,readonly=on \
 -device ide-hd,bus=ahci0.0,unit=0,drive=ahcidisk-0
qemu-system-i386: -device ide-hd,bus=ahci0.0,unit=0,drive=ahcidisk-0:
Can't use a read-only drive

It does support readonly SCSI disks

$ /usr/lib/xen/bin/qemu-system-i386 \
 -drive file=/tmp/disk.raw,if=scsi,media=disk,format=raw,readonly=on
[ok]

Inside a guest using such a disk, the SCSI kernel driver sees write
protect on

[   7.339232] sd 2:0:1:0: [sdb] Write Protect is on

Also, PV drivers support readonly, but the patch rejects such
configuration even when PV drivers (vdev=xvd*) have been explicitly
specified and creation of an emulated twin is skipped.

This follow-up patch loosens the restriction to reject readonly when
creating an emulated IDE or AHCI disk, but allows it when the backend
is known to support readonly.
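
The relaxed rule can be summarised with the following Python sketch. It
is not the libxl code; the function name and the way the emulated bus is
passed in are hypothetical and only serve to show the decision.

```python
# Illustrative sketch only; names are hypothetical, not libxl's.
def readonly_allowed(emulated_bus):
    """Decide whether a readonly disk configuration is acceptable.

    emulated_bus is "ide", "ahci" or "scsi" for an emulated disk, or
    None when only PV drivers are used and no emulated twin is created.
    """
    if emulated_bus is None:
        return True  # PV backends support readonly
    # qemu refuses read-only drives on emulated IDE and AHCI disks.
    return emulated_bus not in ("ide", "ahci")
```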

Signed-off-by: Jim Fehlig <jfehlig@suse.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxc: remove xc_get_bit_size() from tools/libxc/xc_dom_compat_linux.c
Juergen Gross [Fri, 23 Oct 2015 13:05:01 +0000 (15:05 +0200)]
libxc: remove xc_get_bit_size() from tools/libxc/xc_dom_compat_linux.c

xc_get_bit_size() is being used by the unused python wrapper
xc.getBitSize() only. Remove the wrapper and xc_get_bit_size().

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agolibxc: remove most of tools/libxc/xc_dom_compat_linux.c
Juergen Gross [Fri, 23 Oct 2015 13:05:00 +0000 (15:05 +0200)]
libxc: remove most of tools/libxc/xc_dom_compat_linux.c

In tools/libxc/xc_dom_compat_linux.c, xc_linux_build() is the only
domain building function which is really necessary, as it is the only
one used by an in-tree component (qemu-xen).

Remove the other domain building functions and the unused python
wrapper xc.linux_build() referencing one of the to be removed
functions.

Suggested-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoConfig.mk: update OVMF changeset
Wei Liu [Thu, 12 Nov 2015 10:06:58 +0000 (10:06 +0000)]
Config.mk: update OVMF changeset

The new osstest tested head contains a fix for gcc-4.4 toolchain.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agooxenstored: Quota.merge: don't assume domain already exists
Jonathan Davies [Wed, 11 Nov 2015 11:21:53 +0000 (11:21 +0000)]
oxenstored: Quota.merge: don't assume domain already exists

In Quota.merge, we merge two quota hashtables, orig_quota and mod_quota, putting
the results into dest_quota. These hashtables map domids to the number of
entries currently owned by that domain.

When mod_quota contains an entry for a domid that was not present in orig_quota
(or dest_quota), the call to get_entry caused Quota.merge to raise a Not_found
exception. This propagates back to the client as an ENOENT error, which is not
an appropriate return value from some operations, such as transaction_end.

This situation can arise when a transaction that introduces a domain (hence
calling Quota.add_entry) needs to be coalesced due to concurrent xenstore
activity.

This patch handles the merge in the case where mod_quota contains an entry not
present in orig_quota (or in dest_quota) by treating that hashtable as having
existing value 0.
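
For illustration, the intended merge semantics look roughly like the
Python sketch below, with dicts standing in for the OCaml hashtables.
This is an assumption-laden model of the behaviour described above, not
a translation of the actual Quota.merge.

```python
# Illustrative sketch only; dicts stand in for the OCaml hashtables.
def merge(orig_quota, mod_quota, dest_quota):
    """Apply the per-domain change (mod - orig) on top of dest."""
    for domid, count in mod_quota.items():
        # A domid missing from orig_quota or dest_quota counts as 0
        # entries instead of raising a lookup error.
        diff = count - orig_quota.get(domid, 0)
        if diff != 0:
            dest_quota[domid] = dest_quota.get(domid, 0) + diff
```

A transaction that introduces domain 3 then merges cleanly even though
neither orig_quota nor dest_quota has seen that domid before.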

Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoxen/serial: Return actual bytes stored in TX FIFO for OMAP
Oleksandr Tyshchenko [Thu, 5 Nov 2015 17:53:07 +0000 (19:53 +0200)]
xen/serial: Return actual bytes stored in TX FIFO for OMAP

This is intended to decrease the time spent in the transmitter
while waiting for free space in the TX FIFO, and as a result to
reduce the impact of hvc on the entire system running on
OMAP5/DRA7XX based platforms.

Signed-off-by: Oleksandr Tyshchenko <oleksandr.tyshchenko@globallogic.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Julien Grall <julien.grall@citrix.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoxen/serial: Move any OMAP specific things to OMAP UART driver
Oleksandr Tyshchenko [Thu, 5 Nov 2015 17:53:06 +0000 (19:53 +0200)]
xen/serial: Move any OMAP specific things to OMAP UART driver

The 8250-uart.h contains extra serial register definitions
for the internal UARTs in TI OMAP SoCs which are used in
OMAP UART driver only.
In order to clean up code move these definitions to omap-uart.c.
Also rename some definitions to use the UART_OMAP* prefix.

Signed-off-by: Oleksandr Tyshchenko <oleksandr.tyshchenko@globallogic.com>
CC: Ian Campbell <ian.campbell@citrix.com>
CC: Julien Grall <julien.grall@citrix.com>
CC: Stefano Stabellini <stefano.stabellini@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoocaml/xc: correct shutdown_reason enumeration
Simon Rowe [Thu, 5 Nov 2015 11:39:05 +0000 (11:39 +0000)]
ocaml/xc: correct shutdown_reason enumeration

As defined by the Xen public header the fifth value of
shutdown_reason is watchdog.

Signed-off-by: Simon Rowe <simon.rowe@eu.citrix.com>
Acked-by: David Scott <dave@recoil.org>
9 years agorun QEMU as non-root
Stefano Stabellini [Thu, 5 Nov 2015 12:47:26 +0000 (12:47 +0000)]
run QEMU as non-root

Try to use "xen-qemuuser-domid$domid" first, then
"xen-qemuuser-shared" and root if everything else fails.

The uids need to be manually created by the user or, more likely, by the
xen package maintainer.

Expose a device_model_user setting in libxl_domain_build_info, so that
opinionated callers, such as libvirt, can set any user they like. Do not
fall back to root if device_model_user is set. Users can also set
device_model_user by hand in the xl domain config file.
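
The lookup order described above can be sketched as follows. This is a
simplified Python model, not the libxl implementation; the function name
and the user_exists callback are hypothetical.

```python
# Illustrative sketch only; function and parameter names are hypothetical.
def pick_qemu_user(domid, user_exists, device_model_user=None):
    """Return the user name QEMU should run as for domain `domid`."""
    if device_model_user is not None:
        # An explicit setting is used as given: no fallback to root.
        # What happens if that user is missing is outside this sketch.
        return device_model_user
    for name in ("xen-qemuuser-domid%d" % domid, "xen-qemuuser-shared"):
        if user_exists(name):
            return name
    return "root"  # last resort when neither user has been created
```

A caller on a Unix system could implement user_exists with
pwd.getpwnam(), which raises KeyError for unknown users.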

QEMU is going to setuid and setgid to the user ID and the group ID of
the specified user, soon after initialization, before starting to deal
with any guest IO.

To actually secure QEMU when running in Dom0, we need at least to
deprivilege the privcmd and xenstore interfaces, this is just the first
step in that direction.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotools: pygrub: if partition table is empty, try treating as a whole disk
Ian Campbell [Thu, 5 Nov 2015 14:46:12 +0000 (14:46 +0000)]
tools: pygrub: if partition table is empty, try treating as a whole disk

pygrub (in identify_disk_image()) detects a DOS style partition table
via the presence of the 0xaa55 signature at the end of the first
sector of the disk.

However this signature is also present in whole-disk configurations
when there is an MBR on the disk. Many filesystems (e.g. ext[234])
include leading padding in their on disk format specifically to enable
this.

So if we think we have a DOS partition table but do not find any
actual partition table entries we may as well try looking at it as a
whole disk image. Worst case is we probe and find there isn't anything
there.
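
A minimal sketch of the check, assuming the standard MBR layout
(signature at offset 510, four 16 byte partition entries starting at
offset 446, type byte at offset 4 of each entry). The helper names are
made up and treating a zero type byte as "empty" is an assumption; this
is not the code in identify_disk_image().

```python
# Illustrative sketch only; helper names are hypothetical.
MBR_SIGNATURE = b"\x55\xaa"  # 0xaa55, stored little-endian at offset 510
PART_TABLE_OFFSET = 446
PART_ENTRY_SIZE = 16


def has_mbr_signature(sector):
    return len(sector) >= 512 and bytes(sector[510:512]) == MBR_SIGNATURE


def partition_types(sector):
    """Return the type bytes of the non-empty primary partition entries."""
    types = []
    for i in range(4):
        entry = PART_TABLE_OFFSET + i * PART_ENTRY_SIZE
        ptype = sector[entry + 4]  # type byte; 0 is taken to mean unused
        if ptype != 0:
            types.append(ptype)
    return types


def try_as_whole_disk(sector):
    """Signature present but no partition entries: probe as a whole disk."""
    return has_mbr_signature(sector) and not partition_types(sector)
```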

This was reported by Sjors Gielen in Debian bug #745419. The fix was
inspired by a patch by Adi Kriegisch in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=745419#27

Tested by genext2fs'ing my /boot into a new raw image (works) and
then:
   dd if=/usr/lib/grub/i386-pc/g2ldr.mbr of=img conv=notrunc bs=512 count=1

to add an MBR (with 0xaa55 signature) to it, which after this patch
also works.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: 745419-forwarded@bugs.debian.org
9 years agotools: migration: Use PRIpfn when printing frame numbers.
Ian Campbell [Wed, 11 Nov 2015 13:33:46 +0000 (13:33 +0000)]
tools: migration: Use PRIpfn when printing frame numbers.

This avoids various printf formatting warnings when building on arm32.

While touching the affected lines make them consistently use %#.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agons16550: misc minor adjustments
Jan Beulich [Fri, 13 Nov 2015 14:41:47 +0000 (15:41 +0100)]
ns16550: misc minor adjustments

First and foremost: fix documentation: The use of "clock_hz", when
"base_baud" was meant, has taken me several hours (suspecting a more
complicated problem with the PCIe card I've been trying to get
working). At once correct the "gdb" option, which is more like
"console", not like "com<N>".

Next, fix the types of ns_{read,write}_reg(): Especially the former
having had a signed return type so far caused quite interesting effects
when determining the baud rate if "auto" was specified. In that same
code, also avoid dividing by zero when in fact the baud rate was not
previously set up.

Further, accept I/O port based serial PCI cards with a port range wider
than 8 bytes.

Finally, slightly rearrange struct ns16550 to reduce holes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoRevert "x86/IO-APIC: fix setup of Xen internally used IRQs"
Jan Beulich [Fri, 13 Nov 2015 14:39:57 +0000 (15:39 +0100)]
Revert "x86/IO-APIC: fix setup of Xen internally used IRQs"

This reverts commit 1126b40892ab56cb13c3cae5822bf3a18a689ffb,
as it breaks (at least) x2apic systems.

9 years agox86/IO-APIC: make SET_DEST() easier to use
Jan Beulich [Thu, 12 Nov 2015 16:04:31 +0000 (17:04 +0100)]
x86/IO-APIC: make SET_DEST() easier to use

There has been quite a bit of redundancy between the various use sites.
Eliminate that. No change of generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/IO-APIC: fix setup of Xen internally used IRQs
Jan Beulich [Thu, 12 Nov 2015 16:04:10 +0000 (17:04 +0100)]
x86/IO-APIC: fix setup of Xen internally used IRQs

..., i.e. namely that of a PCI serial card with an IRQ above the
legacy range. This had got broken by the switch to cpumask_any() in
cpu_mask_to_apicid_phys(). Fix this by allowing all CPUs for that IRQ
(such that __setup_vector_irq() will properly update a booting CPU's
vector_irq[], avoiding "No irq handler for vector" messages and the
interrupt not working).

While doing this I also noticed that io_apic_set_pci_routing() can't
be quite right: It sets up the destination _before_ getting a vector
allocated (which on other than systems using the flat APIC mode
affects the possible destinations), and also didn't restrict affinity
to ->arch.cpu_mask (as established by assign_irq_vector()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/event: correct debug event generation
Jan Beulich [Thu, 12 Nov 2015 16:03:20 +0000 (17:03 +0100)]
x86/event: correct debug event generation

RIP is not a linear address, and hence should not on its own be subject
to GVA -> GFN translation. While at it, move all of the (perhaps
expensive) operations in the two functions into their main if()'s body,
and improve the error code passed to the translation function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: #PF error code adjustments
Jan Beulich [Thu, 12 Nov 2015 16:02:35 +0000 (17:02 +0100)]
x86: #PF error code adjustments

Add a definition for the (for now unused) protection key related error
code bit, moving our own custom ones out of the way. In the course of
checking the uses of the latter I realized that while right now they
can only get set on their own, callers would better not depend on that
property and check just for the bit rather than matching the entire
value.
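
To illustrate the difference between the two styles of check, here is a
small Python sketch. PFEC_WRITE is the architectural write bit (bit 1);
PFEC_CUSTOM is a hypothetical stand-in for one of the custom bits and
its position is made up.

```python
# Illustrative sketch only; PFEC_CUSTOM and its position are hypothetical.
PFEC_WRITE = 1 << 1    # architectural W/R bit of the #PF error code
PFEC_CUSTOM = 1 << 16  # stand-in for one of the custom (software) bits


def is_custom_fragile(error_code):
    # Breaks as soon as any other bit is set alongside the custom one.
    return error_code == PFEC_CUSTOM


def is_custom(error_code):
    # Tests just the bit, so other bits may be set at the same time.
    return bool(error_code & PFEC_CUSTOM)
```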

Signed-off-by: Jan Beulich <jbeulich@suse.com>
9 years agox86/traps: honor EXT bit in error codes
Jan Beulich [Thu, 12 Nov 2015 16:01:53 +0000 (17:01 +0100)]
x86/traps: honor EXT bit in error codes

The specification does not explicitly limit the use of this bit to
exceptions that can have selector style error codes, so to be on the
safe side we should deal with it being set even on error codes formally
documented to be always zero (if they're indeed always zero, the change
is simply dead code in those cases).

Introduce and use (where suitable) X86_XEC_* constants to make the code
easier to read.

To match the placement of the "hardware_trap" label, the "hardware_gp"
one gets moved slightly too.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/SVM: don't exceed segment limit when fetching instruction bytes
Jan Beulich [Thu, 12 Nov 2015 16:01:04 +0000 (17:01 +0100)]
x86/SVM: don't exceed segment limit when fetching instruction bytes

Also consistently use the vmcb local variable whenever possible.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/HVM: unify and fix #UD intercept
Jan Beulich [Thu, 12 Nov 2015 16:00:31 +0000 (17:00 +0100)]
x86/HVM: unify and fix #UD intercept

The SVM and VMX versions really were identical, so instead of fixing
the same issue in two places, fold them at once. The issue fixed is the
missing seg:off -> linear translation of the current code address.
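
As a reminder of what the translation involves, a minimal Python sketch
follows. It only adds the segment base to the offset and truncates to
the address width; limit and permission checks are left out, and the
function name is made up.

```python
# Illustrative sketch only; no limit or access-rights checks are modelled.
def linear_address(seg_base, offset, addr_bits=32):
    """Translate seg:off into a linear address of `addr_bits` bits."""
    return (seg_base + offset) & ((1 << addr_bits) - 1)
```

For example, real-mode code at F000:FFF0 has a segment base of
0xF000 << 4, so its linear address is 0xFFFF0 rather than 0xFFF0.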

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/HVM: don't inject #DB with error code
Jan Beulich [Thu, 12 Nov 2015 15:59:18 +0000 (16:59 +0100)]
x86/HVM: don't inject #DB with error code

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com>
9 years agoelfnotes: introduce a new PHYS_ENTRY elfnote
Roger Pau Monné [Thu, 12 Nov 2015 15:58:07 +0000 (16:58 +0100)]
elfnotes: introduce a new PHYS_ENTRY elfnote

This new elfnote contains the 32bit entry point into the kernel. Xen will
use this entry point in order to launch the guest kernel in 32bit protected
mode with paging disabled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoefi: fix booting failure with UEFI on ARM
Shannon Zhao [Tue, 10 Nov 2015 11:08:29 +0000 (12:08 +0100)]
efi: fix booting failure with UEFI on ARM

Commit 9fd08b4 (efi: split out efi_get_gop()) splits out the code
getting the pointer to the GOP as efi_get_gop(), but it doesn't
initialize the variables handles and gop to NULL like the original
code did. This causes a boot failure on ARM, with the following
logs printed:
Xen 4.7-unstable (c/s Tue Oct 13 14:40:28 2015 +0100 git:7a92036) EFI loader
Synchronous Exception at 0x00000000FECB021C

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
9 years agosymbols.c: avoid warn_unused_result build failure on fgets()
Riku Voipio [Tue, 10 Nov 2015 11:07:55 +0000 (12:07 +0100)]
symbols.c: avoid warn_unused_result build failure on fgets()

In commit:

d37d63d symbols: prefix static symbols with their source file names

An unchecked fgets was added. This causes a compile error at least
on ubuntu utopic:

symbols.c: In function 'read_symbol':
symbols.c:181:3: error: ignoring return value of 'fgets', declared with
attribute warn_unused_result [-Werror=unused-result]
   fgets(str, 500, in); /* discard rest of line */
   ^

Paper over the warning by checking the return value in the if statement.

Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years agox86: allow disabling the emulated VGA
Roger Pau Monné [Tue, 10 Nov 2015 11:07:32 +0000 (12:07 +0100)]
x86: allow disabling the emulated VGA

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: allow disabling the emulated RTC
Roger Pau Monné [Tue, 10 Nov 2015 11:07:03 +0000 (12:07 +0100)]
x86: allow disabling the emulated RTC

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: allow disabling power management
Roger Pau Monné [Tue, 10 Nov 2015 11:06:48 +0000 (12:06 +0100)]
x86: allow disabling power management

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: allow disabling the emulated PIT
Roger Pau Monné [Tue, 10 Nov 2015 11:06:28 +0000 (12:06 +0100)]
x86: allow disabling the emulated PIT

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
9 years agox86: allow disabling the emulated PIC
Roger Pau Monné [Tue, 10 Nov 2015 11:06:09 +0000 (12:06 +0100)]
x86: allow disabling the emulated PIC

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: allow disabling the emulated IOMMU
Roger Pau Monné [Tue, 10 Nov 2015 11:05:35 +0000 (12:05 +0100)]
x86: allow disabling the emulated IOMMU

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>
9 years agox86: allow disabling the emulated IO APIC
Roger Pau Monné [Tue, 10 Nov 2015 11:05:18 +0000 (12:05 +0100)]
x86: allow disabling the emulated IO APIC

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: allow disabling the emulated HPET
Roger Pau Monné [Tue, 10 Nov 2015 11:04:57 +0000 (12:04 +0100)]
x86: allow disabling the emulated HPET

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: add bitmap of enabled emulated devices
Roger Pau Monné [Tue, 10 Nov 2015 11:04:04 +0000 (12:04 +0100)]
x86: add bitmap of enabled emulated devices

Introduce a bitmap in x86 xen_arch_domainconfig that allows enabling or
disabling specific devices emulated inside of Xen for HVM guests.
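
The idea can be illustrated with the Python sketch below. The flag names
and bit positions here are made up for the example and are not taken
from the public headers.

```python
# Illustrative sketch only; names and bit positions are hypothetical.
EMU_HPET = 1 << 0
EMU_RTC = 1 << 1
EMU_VGA = 1 << 2
EMU_ALL = EMU_HPET | EMU_RTC | EMU_VGA


def disable(bitmap, device):
    """Return the bitmap with `device` cleared."""
    return bitmap & ~device


def is_enabled(bitmap, device):
    return bool(bitmap & device)
```

A toolstack wanting everything except the emulated VGA would pass
disable(EMU_ALL, EMU_VGA) in the domain configuration.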

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agox86/HVM: always intercept #AC and #DB
Jan Beulich [Tue, 10 Nov 2015 11:03:08 +0000 (12:03 +0100)]
x86/HVM: always intercept #AC and #DB

Both being benign exceptions, and both being possible to get triggered
by exception delivery, this is required to prevent a guest from locking
up a CPU (resulting from no other VM exits occurring once getting into
such a loop).

The specific scenarios:

1) #AC may be raised during exception delivery if the handler is set to
be a ring-3 one by a 32-bit guest, and the stack is misaligned.

This is CVE-2015-5307 / XSA-156.

Reported-by: Benjamin Serebrin <serebrin@google.com>

2) #DB may be raised during exception delivery when a breakpoint got
placed on a data structure involved in delivering the exception. This
can result in an endless loop when a 64-bit guest uses a non-zero IST
for the vector 1 IDT entry, but even without use of IST the time it
takes until a contributory fault would get raised (results depending
on the handler) may be quite long.

This is CVE-2015-8104 / XSA-156.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86/hvm: make sure stdvga cache cannot be re-enabled
Paul Durrant [Fri, 6 Nov 2015 14:17:00 +0000 (15:17 +0100)]
x86/hvm: make sure stdvga cache cannot be re-enabled

As soon as the cache is disabled, it will become out-of-sync with the
VGA device model and since no mechanism exists to acquire current VRAM
state from the device model, re-enabling it leads to stale data
being seen by the guest.

The problem was introduced by commit 3bbaaec0 ("x86/hvm: unify stdvga
mmio intercept with standard mmio intercept") and can be seen by
deliberately crashing a Windows guest; the BSOD output is corrupted.

This patch changes the existing 'cache' boolean in hvm_hw_stdvga into a
tri-state enum and only allows the state to move from 'uninitialized' to
'enabled'. Once the cache state becomes 'disabled' it will remain so for
the lifetime of the VM.
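
The allowed transitions can be modelled as in the Python sketch below.
The enum and function names are hypothetical; only the transition rules
come from the description above.

```python
# Illustrative sketch only; names are hypothetical.
from enum import Enum


class CacheState(Enum):
    UNINITIALIZED = 0
    ENABLED = 1
    DISABLED = 2


def try_enable(state):
    """Only an uninitialized cache may become enabled."""
    if state is CacheState.UNINITIALIZED:
        return CacheState.ENABLED
    return state  # enabled stays enabled, disabled stays disabled


def disable(state):
    """Disabling is always possible and is permanent."""
    return CacheState.DISABLED
```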

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agosched: fix locking of remove_vcpu() in credit1
Dario Faggioli [Fri, 6 Nov 2015 14:16:38 +0000 (15:16 +0100)]
sched: fix locking of remove_vcpu() in credit1

In fact, csched_vcpu_remove() (i.e., the credit1
implementation of remove_vcpu()) manipulates runqueues,
so holding the runqueue lock is necessary.

However, the vCPU just can't be on the runqueue, when
the function is called. We can therefore ASSERT() that,
and avoid doing any runqueue manipulations (rather than
adding the runqueue locking around it).

Also, while there, *_lock_irq() (for the private lock) is
enough, there is no need to *_lock_irqsave().

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
9 years agocpufreq: allow ordinary boolean options to be passed on the command line
Jan Beulich [Fri, 6 Nov 2015 14:15:32 +0000 (15:15 +0100)]
cpufreq: allow ordinary boolean options to be passed on the command line

I was quite surprised to find "cpufreq=off" not doing what one would
expect it to do. Fix this.
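
A rough Python sketch of the intended behaviour: try the option value as
an ordinary boolean first and only fall through to the cpufreq-specific
parsing when it is not one. The accepted spellings and the function name
are assumptions for the example, not a copy of the hypervisor's parser.

```python
# Illustrative sketch only; the accepted spellings are an assumption.
_TRUE = {"yes", "on", "true", "enable", "1"}
_FALSE = {"no", "off", "false", "disable", "0"}


def parse_bool(value):
    """Return True/False for a boolean spelling, or None otherwise."""
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    return None  # not a boolean: caller continues with other parsing
```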

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agox86: cleanup of early cpuid handling
Andrew Cooper [Wed, 4 Nov 2015 16:47:17 +0000 (17:47 +0100)]
x86: cleanup of early cpuid handling

Use register names for variables, rather than their content for leaf 1.
Reduce the number of cpuid instructions issued.  Also drop some trailing
whitespace.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
9 years agocredit: remove cpu argument to __runq_insert()
Harmandeep Kaur [Wed, 4 Nov 2015 16:46:46 +0000 (17:46 +0100)]
credit: remove cpu argument to __runq_insert()

__runq_insert() takes two arguments, cpu and svc. However,
the cpu argument is redundant because we can get all the
information we need about cpu from svc.

Signed-off-by: Harmandeep Kaur <write.harmandeep@gmail.com>
Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
9 years agoblkif: document blkif multi-queue/ring extension
Bob Liu [Wed, 4 Nov 2015 16:46:24 +0000 (17:46 +0100)]
blkif: document blkif multi-queue/ring extension

Document the multi-queue/ring feature in terms of XenStore keys to be written
by the backend and by the frontend.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
9 years agoxenconsoled: Remove unexpected daemonize behavior
Ross Lagerwall [Mon, 2 Nov 2015 11:17:38 +0000 (11:17 +0000)]
xenconsoled: Remove unexpected daemonize behavior

Previously, xenconsoled's daemonize function would do nothing if its
parent process is init (as it is under systemd but not sysv init).
This is confusing. Instead, always daemonize when asked to, but use the
"interactive" switch when running from the systemd service.

Because a pidfile is only written when daemonizing, drop the pidfile
parameters from the service file (systemd keeps track of the pids
anyway).

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
9 years agoxl: log an error if libxl_cpupool_destroy() fails
Dario Faggioli [Wed, 4 Nov 2015 10:48:24 +0000 (11:48 +0100)]
xl: log an error if libxl_cpupool_destroy() fails

In fact, right now, failing at destroying a cpupool is just
not reported to the user in any explicit way.

Let's log an error, as it is customary for xl in these cases.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>