xenbits.xensource.com Git - xen.git/log
11 years agox86/ioapic: avoid trying to access the -1th ioapic
Andrew Cooper [Tue, 10 Sep 2013 14:40:34 +0000 (16:40 +0200)]
x86/ioapic: avoid trying to access the -1th ioapic

Discovered by Coverity, CID 1055743

Depending on the contents of the mp_irqs/mp_ioapics from the MP table,
find_isa_irq_apic() might return -1, at which point calling
ioapic_read_entry() with it is bad.

In addition to bailing if pin is -1, bail if apic is -1.
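
A minimal sketch of the guard described above. The helper name is
hypothetical and the body is a stand-in, not the actual Xen code; it only
shows that both lookup results are checked before any entry is read.

```c
#include <stdbool.h>

/*
 * Illustrative only: returns true when both the apic and the pin lookup
 * succeeded, i.e. when it would be safe to go on and read the entry.
 */
bool isa_irq_lookup_ok(int apic, int pin)
{
    /* Bail if either lookup failed, not just the pin lookup. */
    if (apic == -1 || pin == -1)
        return false;
    return true;
}
```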

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoconsole: buffer and show origin of guest PV writes
Daniel De Graaf [Tue, 10 Sep 2013 14:39:46 +0000 (16:39 +0200)]
console: buffer and show origin of guest PV writes

Guests other than domain 0 using the console output have previously been
controlled by the VERBOSE #define, but with no designation of which
guest's output was on the console. This patch converts the HVM output
buffering to be used by all domains except the hardware domain (dom0):
stripping non-printable characters, line buffering the output, and
prefixing it with the domain ID. This is especially useful for debugging
stub domains during early boot.
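
The buffering described above could look roughly like the following sketch.
The struct, the buffer size and the "(d<ID>) " prefix format are assumptions
made for illustration, not taken from the patch.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define EXAMPLE_LINE_LEN 128     /* hypothetical per-domain buffer size */

struct guest_console {
    int domid;                   /* domain ID used as the prefix */
    char buf[EXAMPLE_LINE_LEN];  /* characters not yet emitted */
    size_t len;
};

/* Append the buffered line as "(d<ID>) text\n" to out and reset the buffer. */
static void flush_line(struct guest_console *c, char *out, size_t outsz)
{
    size_t used = strlen(out);

    c->buf[c->len] = '\0';
    snprintf(out + used, outsz - used, "(d%d) %s\n", c->domid, c->buf);
    c->len = 0;
}

/* Feed guest bytes: drop non-printables, flush on newline or full buffer. */
void guest_console_write(struct guest_console *c, const char *data,
                         size_t n, char *out, size_t outsz)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char ch = (unsigned char)data[i];

        if (ch == '\n') {
            flush_line(c, out, outsz);
        } else if (isprint(ch)) {
            c->buf[c->len++] = (char)ch;
            if (c->len == EXAMPLE_LINE_LEN - 1)
                flush_line(c, out, outsz);
        }
    }
}
```

Here out must start as a NUL-terminated string; partial lines stay in the
per-domain buffer until a newline arrives.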

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Keir Fraser <keir@xen.org>
11 years agolibelf: add hvm callback vector feature
Mukesh Rathor [Tue, 10 Sep 2013 14:38:43 +0000 (16:38 +0200)]
libelf: add hvm callback vector feature

Add XENFEAT_hvm_callback_vector to elf_xen_feature_names so we can
ensure the kernel supports all features required for PVH mode when
building a PVH domU here. Note, hvm callback is required for PVH.

Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agolibxl: ocaml: fix code intended to output comments before definitions
Rob Hoes [Thu, 22 Aug 2013 10:50:53 +0000 (11:50 +0100)]
libxl: ocaml: fix code intended to output comments before definitions

I'm not sure how useful these comments actually are but erred on the
side of fixing rather than removing.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxl: idl: complete some enums in the IDL with their defaults
Rob Hoes [Thu, 22 Aug 2013 10:50:52 +0000 (11:50 +0100)]
libxl: idl: complete some enums in the IDL with their defaults

There are several enums in the IDL that are initialised to 0, while
the value 0 is not part of the enum itself. This creates problems for
language bindings generated from the IDL, such as the OCaml ones.

Added an explicit (0, "UNKNOWN") enum value where appropriate, or used
init_val to default to a sensible value.
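
To illustrate why a named zero value helps, here is a sketch in C. All type
and function names are hypothetical and do not correspond to libxl
definitions.

```c
#include <string.h>

/* Hypothetical enum mirroring the pattern described above. */
enum example_kind {
    EXAMPLE_KIND_UNKNOWN = 0,  /* explicit name for the zero value */
    EXAMPLE_KIND_FOO = 1,
    EXAMPLE_KIND_BAR = 2,
};

struct example_info {
    enum example_kind kind;
};

/* Zero-initialise, as a generated init function might do. */
void example_info_init(struct example_info *info)
{
    memset(info, 0, sizeof(*info));
}

/* A binding can now map every value, including the default, to a name. */
const char *example_kind_to_string(enum example_kind k)
{
    switch (k) {
    case EXAMPLE_KIND_UNKNOWN: return "unknown";
    case EXAMPLE_KIND_FOO:     return "foo";
    case EXAMPLE_KIND_BAR:     return "bar";
    }
    return NULL;  /* value outside the enum */
}
```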

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxl: idl: add domain_type field to libxl_dominfo struct
Rob Hoes [Thu, 22 Aug 2013 10:50:51 +0000 (11:50 +0100)]
libxl: idl: add domain_type field to libxl_dominfo struct

This allows a toolstack to find out whether a VM has booted as PV or HVM.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxl: Add LIBXL_SHUTDOWN_REASON_UNKNOWN
Rob Hoes [Thu, 22 Aug 2013 10:50:49 +0000 (11:50 +0100)]
libxl: Add LIBXL_SHUTDOWN_REASON_UNKNOWN

libxl_dominfo.shutdown_reason is valid iff (shutdown||dying). This is a bit
annoying when generating language bindings since it needs all sorts of special
casing. Just introduce an explicit value instead.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxc/pm: Fix NULL pointer checks.
Andrew Cooper [Tue, 10 Sep 2013 09:29:39 +0000 (10:29 +0100)]
libxc/pm: Fix NULL pointer checks.

Discovered by Coverity,
CIDs 1054968 1054969 1054970 1054971 1054972 1054973 10549704

This was broken by c/s 5cc436c1d2b3b0 which did a blanket change of 'int
xc_handle' -> 'xc_interface *xch'.  The types got updated, but error
conditions were left as they were.  (I suspect some sed was involved originally.)
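
A sketch of the kind of check involved, using a hypothetical handle type in
place of xc_interface. With an integer handle a "less than zero" test made
sense; with a pointer the equivalent validity test is against NULL.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical opaque handle standing in for the real interface type. */
struct fake_interface { int fd; };

/* Returns 0 on success, -1 with errno set on invalid arguments. */
int fake_pm_op(struct fake_interface *xch, int *out)
{
    if (!xch || !out) {      /* pointer checks, not "xch < 0" */
        errno = EINVAL;
        return -1;
    }
    *out = xch->fd;
    return 0;
}
```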

Also while playing around in this area, fix up some of the bracketing style to
match the Xen coding style.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
11 years agoxen: Add new string function
Julien Grall [Wed, 28 Aug 2013 14:47:20 +0000 (15:47 +0100)]
xen: Add new string function

Add strcasecmp. The code is copied from Linux.
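
For reference, a case-insensitive comparison of this kind can be sketched as
follows. This is an independent illustration, not the code copied from
Linux, and it is named differently to avoid clashing with the libc function.

```c
#include <ctype.h>

/* Compare two strings ignoring ASCII case; result is <0, 0 or >0. */
int example_strcasecmp(const char *s1, const char *s2)
{
    int c1, c2;

    do {
        c1 = tolower((unsigned char)*s1++);
        c2 = tolower((unsigned char)*s2++);
    } while (c1 == c2 && c1 != 0);

    return c1 - c2;
}
```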

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/arm: Move __PSCI* from traps.c to the header
Julien Grall [Wed, 28 Aug 2013 14:47:19 +0000 (15:47 +0100)]
xen/arm: Move __PSCI* from traps.c to the header

These defines will be used to create the fake PSCI node in dom0 device tree.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/dts: Don't check the number of address and size cells in process_cpu_node
Julien Grall [Wed, 28 Aug 2013 14:47:17 +0000 (15:47 +0100)]
xen/dts: Don't check the number of address and size cells in process_cpu_node

CPU nodes are not required to have #address-cells == 1 and #size-cells == 0, so
don't check for that (see Linux Documentation/devicetree/booting-without-of.txt
Section III.5.a).

In some OMAP5 device trees, these 2 properties are not correctly set, so
Xen was only able to handle 1 CPU.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: andrii.anisov@globallogic.com
CC: baozich@gmail.com
11 years agoxen: Introduce __initconst to store initial const data
Julien Grall [Wed, 28 Aug 2013 14:47:16 +0000 (15:47 +0100)]
xen: Introduce __initconst to store initial const data

It's possible to have 2 types (const and non-const) of data in the same
compilation unit. Using only __initdata will result in a compilation error:

    error: $variablename causes a section type conflict with $variablename2

because a section containing const variables is marked read only and so cannot
contain non-const variables.
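
A sketch of the idea for GCC-compatible compilers targeting ELF. The macro
and section names are assumptions for illustration, not copied from Xen's
headers: writable and const init data each get their own section, so no
section has to be both read-only and writable.

```c
/* Hypothetical macros; the section names are illustrative. */
#define example_initdata  __attribute__((__section__(".example.init.data")))
#define example_initconst __attribute__((__section__(".example.init.rodata")))

/* Writable init-time data goes in one section... */
int boot_counter example_initdata = 1;

/* ...and const init-time data in another, avoiding the type conflict. */
const int boot_limit example_initconst = 4;

int boot_may_continue(void)
{
    return boot_counter < boot_limit;
}
```

Marking boot_limit with example_initdata instead would put a const and a
non-const object in the same section, which is the situation the error
message above describes.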

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: Jan Beulich <JBeulich@suse.com>
CC: Keir Fraser <keir@xen.org>
11 years agominios: fix xenbus_rm() calls in frontend drivers
Ben Cressey [Fri, 6 Sep 2013 19:52:07 +0000 (12:52 -0700)]
minios: fix xenbus_rm() calls in frontend drivers

The commit "minios: refactor xenbus state machine" caused "/state" to
be appended to the local value of nodename. Previously the nodename
variable pointed to dev->nodename.

The xenbus_rm() calls were not updated to reflect this change, and
refer to paths that do not exist.

For example, shutdown_blkfront() for vbd 2049 would issue these calls:
    xenbus_rm(XBT_NIL, "device/vbd/2049/state/ring-ref");
    xenbus_rm(XBT_NIL, "device/vbd/2049/state/event-channel");

This patch restores the previous behavior, issuing these calls
instead:
    xenbus_rm(XBT_NIL, "device/vbd/2049/ring-ref");
    xenbus_rm(XBT_NIL, "device/vbd/2049/event-channel");

Without this fix, frontend drivers are not properly reset when PV-GRUB
exits. Some PV Linux drivers fail to re-initialize frontend devices
if PV-GRUB leaves them in this state.
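
The path construction at issue can be sketched like this. The helper is
hypothetical and not part of mini-os; the point is that the key must be
appended to the device node name, not to a name that already ends in
"/state".

```c
#include <stdio.h>
#include <string.h>

/*
 * Build "<nodename>/<key>" into out. Returns 0 on success, -1 if the
 * result would not fit. nodename must be the device node itself.
 */
int build_xenbus_path(char *out, size_t outsz,
                      const char *nodename, const char *key)
{
    /* +1 for the '/' separator, +1 for the terminating NUL. */
    if (strlen(nodename) + 1 + strlen(key) + 1 > outsz)
        return -1;
    snprintf(out, outsz, "%s/%s", nodename, key);
    return 0;
}
```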

Signed-off-by: Ben Cressey <bcressey@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[msw: adjusted commit message to include consequences, split out
 changes into separate patches]
Signed-off-by: Matt Wilson <msw@amazon.com>
11 years agominios: clean up unneeded "err = NULL" in frontend drivers
Ben Cressey [Fri, 6 Sep 2013 19:52:06 +0000 (12:52 -0700)]
minios: clean up unneeded "err = NULL" in frontend drivers

This patch removes cases where the error message pointer is already
NULL and is then set to NULL. These are harmless, but suggest
incorrect practice: the pointer should be passed to free() to
deallocate memory prior to reassignment. There are no functional
changes in this patch.

Signed-off-by: Ben Cressey <bcressey@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[msw: split a larger patch from Ben into this cleanup patch]
Signed-off-by: Matt Wilson <msw@amazon.com>
11 years agominios: clean up allocation of char arrays used for xenbus paths
Matt Wilson [Fri, 6 Sep 2013 19:52:05 +0000 (12:52 -0700)]
minios: clean up allocation of char arrays used for xenbus paths

This patch cleans up instances of char array allocation where string
lengths were manually counted to use strlen() instead. There are no
functional changes in this patch.

Signed-off-by: Matt Wilson <msw@amazon.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
11 years agominios: correct char array allocation for xenbus paths
Matt Wilson [Fri, 6 Sep 2013 19:52:04 +0000 (12:52 -0700)]
minios: correct char array allocation for xenbus paths

The char arrays used to hold xenbus paths have historically been
allocated by manually counting the length of the longest string constants
included in constructing the path. This has led to improperly sized
buffers, both too large (with little consequence) and too small (which
obviously causes problems). This patch corrects the instances where
the length was incorrectly calculated by using strlen() on the longest
string constant used in building a xenbus path.
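
A sketch of sizing such a buffer with strlen() rather than a hand-counted
constant. The helper name and signature are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Allocate a buffer big enough for "<nodename>/<longest_key>" plus the
 * terminating NUL. The caller frees the result.
 */
char *alloc_path_buffer(const char *nodename, const char *longest_key,
                        size_t *size_out)
{
    /* +1 for the '/' separator, +1 for the trailing NUL. */
    size_t size = strlen(nodename) + 1 + strlen(longest_key) + 1;
    char *path = malloc(size);

    if (path && size_out)
        *size_out = size;
    return path;
}
```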

A follow-on clean-up patch will change all instances to use strlen().

Signed-off-by: Ben Cressey <bcressey@amazon.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
[msw: split this patch from a larger patch from Ben, reworked to use
 strlen()]
Signed-off-by: Matt Wilson <msw@amazon.com>
11 years agoconfigure: Regenerate with autoconf 2.69
Ian Campbell [Mon, 9 Sep 2013 13:52:35 +0000 (14:52 +0100)]
configure: Regenerate with autoconf 2.69

This is the version from Debian Wheezy which is what both Ian Jackson and
myself run on our workstations. As committers it is useful to minimise
regeneration noise.

This is purely a run of autogen.sh. I have not tried to build the result.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Ian Jackson <ian.jackson@citrix.com>
11 years agotools: allow user to specify a system seabios binary
Fabio Fantoni [Thu, 5 Sep 2013 10:40:01 +0000 (12:40 +0200)]
tools: allow user to specify a system seabios binary

If this option is given don't bother building seabios ourselves.
Likely to be handy for distros who have an existing seabios
package which they want to reuse.

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/dts: fix DT_ROOT_NODE_ADDR_CELLS_DEFAULT
Julien Grall [Mon, 9 Sep 2013 11:59:07 +0000 (12:59 +0100)]
xen/dts: fix DT_ROOT_NODE_ADDR_CELLS_DEFAULT

The commit dbd1243 "xen/arm: Add helpers to use the device tree" introduced
DT_ROOT_NODE_ADDR_CELLS_DEFAULT, which is used as the default value when the
property is missing, with a wrong value due to a bad copy from Linux code.

The ePAPR (section 2.3.5) says: "If missing, a client program should assume a
default value of 2 for #address-cells, and a value of 1 for #size-cells."
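
The fallback rule quoted above can be sketched as follows. The macro and
function names are illustrative, not the Xen definitions.

```c
#include <stddef.h>

/* Defaults taken from the specification text quoted above. */
#define EXAMPLE_ROOT_ADDR_CELLS_DEFAULT 2
#define EXAMPLE_ROOT_SIZE_CELLS_DEFAULT 1

/*
 * Return the property value if present, otherwise the default.
 * prop == NULL models a missing #address-cells or #size-cells property.
 */
unsigned int cells_or_default(const unsigned int *prop, unsigned int dflt)
{
    return prop ? *prop : dflt;
}
```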

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/arm: Panic if we can't copy the DTB to dom0 memory
Julien Grall [Wed, 4 Sep 2013 15:11:57 +0000 (16:11 +0100)]
xen/arm: Panic if we can't copy the DTB to dom0 memory

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/arm: Introduce MPIDR_HWID_MASK
Julien Grall [Fri, 30 Aug 2013 13:30:27 +0000 (14:30 +0100)]
xen/arm: Introduce MPIDR_HWID_MASK

This define will be used later to retrieve the correct hardware CPU ID.
Also replace hardcoded mask in arm32/head.S by this define.
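
As an illustration of what such a mask is for, the sketch below extracts the
affinity fields from an MPIDR value. The mask value (bits [23:0]) is an
assumption made here, and the names are hypothetical.

```c
#include <stdint.h>

/* Assumed value: the three 8-bit affinity fields, bits [23:0]. */
#define EXAMPLE_MPIDR_HWID_MASK UINT32_C(0x00ffffff)

/* Strip the non-affinity bits to obtain the hardware CPU ID. */
uint32_t mpidr_to_hwid(uint32_t mpidr)
{
    return mpidr & EXAMPLE_MPIDR_HWID_MASK;
}
```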

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agotools: build debug qemu-xen in debug tools builds
Matthew Daley [Tue, 3 Sep 2013 13:12:59 +0000 (01:12 +1200)]
tools: build debug qemu-xen in debug tools builds

When building tools in debug mode (debug=y), pass --enable-debug when
configuring qemu-xen to enable some debug support (namely, to prevent
symbols from being stripped).

Signed-off-by: Matthew Daley <mattjd@gmail.com>
11 years agohotplug/Linux: add sysconfig tags to xencommons
Olaf Hering [Tue, 27 Aug 2013 13:43:43 +0000 (15:43 +0200)]
hotplug/Linux: add sysconfig tags to xencommons

YaST2 sysconfig can logically group the various sysconfig settings if the
files are tagged. Add the missing (YaST specific) tags to xencommons.
For a description, see
http://old-en.opensuse.org/Packaging/SUSE_Package_Conventions/Sysconfig

Signed-off-by: Olaf Hering <olaf@aepfle.de>
11 years agox86/xsave: fix migration from xsave-capable to xsave-incapable host
Jan Beulich [Mon, 9 Sep 2013 12:36:54 +0000 (14:36 +0200)]
x86/xsave: fix migration from xsave-capable to xsave-incapable host

With CPUID features suitably masked this is supposed to work, but was
completely broken (i.e. the case wasn't even considered when the
original xsave save/restore code was written).

First of all, xsave_enabled() wrongly returned the value of
cpu_has_xsave, i.e. not even taking into consideration attributes of
the vCPU in question. Instead this function ought to check whether the
guest ever enabled xsave support (by writing a [non-zero] value to
XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no longer
be initialized to XSTATE_FP_SSE (since that's a valid value a guest
could write to XCR0), and the xsave/xrstor as well as the context
switch code need to suitably account for this (by always enforcing at
least this part of the state to be saved/loaded).
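
A sketch of the distinction drawn above between hardware capability and
per-vCPU usage. The struct and function names are hypothetical; only the
xcr0 and xcr0_accum field names come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

struct example_vcpu {
    uint64_t xcr0;        /* current XCR0 value, 0 until the guest writes it */
    uint64_t xcr0_accum;  /* union of all XCR0 values the guest has used */
};

/* The guest counts as using xsave once it has written a non-zero XCR0. */
bool example_xsave_enabled(const struct example_vcpu *v)
{
    return v->xcr0_accum != 0;
}

/* Record a guest write to XCR0. */
void example_set_xcr0(struct example_vcpu *v, uint64_t val)
{
    v->xcr0 = val;
    v->xcr0_accum |= val;
}
```

Because both fields start at zero, a vCPU that never touches XCR0 is
reported as not using xsave regardless of what the host CPU supports.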

This involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add
strictly sanity check for XSAVE/XRSTOR") - we need to cleanly
distinguish between hardware capabilities and vCPU used features.

Next both HVM and PV save code needed tweaking to not always save the
full state supported by the underlying hardware, but just the parts
that the guest actually used. Similarly the restore code should bail
not just on state being restored that the hardware cannot handle, but
also on inconsistent save state (inconsistent XCR0 settings or size of
saved state not in line with XCR0).

And finally the PV extended context get/set code needs to use slightly
different logic than the HVM one, as here we can't just key off of
xsave_enabled() (i.e. avoid doing anything if a guest doesn't use
xsave) because the tools use this function to determine host
capabilities as well as read/write vCPU state. The set operation in
particular needs to be capable of cleanly dealing with input that
consists of only the xcr0 and xcr0_accum values (if they're both zero
then no further data is required).

While for things to work correctly both sides (saving _and_ restoring
host) need to run with the fixed code, afaict no breakage should occur
if either side isn't up to date (other than the breakage that this
patch attempts to fix).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoEFI: fix tool chain capabilities detection
Jan Beulich [Mon, 9 Sep 2013 12:35:29 +0000 (14:35 +0200)]
EFI: fix tool chain capabilities detection

Commit f5a54e92 ("xen: move some arch CFLAGS into the common Rules.mk")
transformed CFLAGS assignments to CFLAGS-y ones, which collides with
the way xen/arch/x86/efi/Makefile determines whether the tool chain is
usable for an EFI build. Transform the block back to using CFLAGS.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoxmalloc: make whole pages xfree() clear the order field (ab)used by xmalloc()
Jan Beulich [Mon, 9 Sep 2013 12:34:12 +0000 (14:34 +0200)]
xmalloc: make whole pages xfree() clear the order field (ab)used by xmalloc()

Not doing this was found to cause problems with sequences of allocation
(multi-page), freeing, and then again allocation of the same page upon
boot when interrupts are still disabled (causing the owner field to be
non-zero, thus making the allocator attempt a TLB flush and, in its
processing, triggering an assertion).

Reported-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86: allow guest to set/clear MSI-X mask bit (try 2)
Joby Poriyath [Mon, 9 Sep 2013 08:43:11 +0000 (10:43 +0200)]
x86: allow guest to set/clear MSI-X mask bit (try 2)

A guest needs the ability to enable and disable MSI-X interrupts
by setting the MSI-X control bit, for a passed-through device.
The guest is allowed to write the MSI-X mask bit only if Xen *thinks*
that the mask is clear (interrupts enabled). If the mask is set by
Xen (interrupts disabled), writes to the mask bit by the guest are
ignored.

Currently, a write to MSI-X mask bit by the guest is silently
ignored.

A likely scenario is where we have an 82599 SR-IOV NIC passed
through to a guest. From the guest if you do

  ifconfig <ETH_DEV> down
  ifconfig <ETH_DEV> up

the interrupts remain masked. On VF reset, the mask bit is set
by the controller. At this point, Xen is not aware that mask is set.
However, interrupts are enabled by VF driver by clearing the mask
bit by writing directly to BAR3 region containing the MSI-X table.

From dom0, we can verify that
interrupts are being masked using 'xl debug-keys M'.

Initially, guest was allowed to modify MSI-X bit.
Later this behaviour was changed.
See changeset 74c213c506afcd74a8556dd092995fd4dc38b225.

Signed-off-by: Joby Poriyath <joby.poriyath@citrix.com>
11 years agox86/EFI: properly handle run time memory regions outside the 1:1 map
Jan Beulich [Mon, 9 Sep 2013 08:40:11 +0000 (10:40 +0200)]
x86/EFI: properly handle run time memory regions outside the 1:1 map

Namely with PFN compression, MMIO ranges that the firmware may need
runtime access to can live in the holes that get shrunk/eliminated by
PFN compression, and hence no mappings would result from simply
copying Xen's direct mapping table's L3 page table entries. Build
mappings for this "manually" in the EFI runtime call 1:1 page tables.

Use the opportunity to also properly identify (via a forcibly undefined
manifest constant) all the disabled code regions associated with it not
being acceptable for us to call SetVirtualAddressMap().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
11 years agox86: Introduce and use GLOBAL() in asm code
Andrew Cooper [Mon, 9 Sep 2013 08:25:40 +0000 (10:25 +0200)]
x86: Introduce and use GLOBAL() in asm code

Also clean up some cases of misused/opencoded ENTRY()

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agoSVM: streamline entry.S code
Jan Beulich [Mon, 9 Sep 2013 08:24:21 +0000 (10:24 +0200)]
SVM: streamline entry.S code

- fix a bogus "test" with zero immediate
- move stuff easily/better done in C into C code
- re-arrange code paths so that no redundant GET_CURRENT() would remain
  on the fast paths
- move long latency operations earlier
- slightly defer disabling global interrupts on the VM entry path

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
11 years agoVMX: use proper instruction mnemonics if assembler supports them
Jan Beulich [Mon, 9 Sep 2013 08:23:32 +0000 (10:23 +0200)]
VMX: use proper instruction mnemonics if assembler supports them

With the hex byte emission we were taking away a good part of
flexibility from the compiler, as for simplicity reasons these were
built using fixed operands. All half way modern build environments
would allow using the mnemonics (but we can't disable the hex variants
yet, since the binutils around at the time gcc 4.1 got released didn't
support these yet).

I didn't convert __vmread() yet because that would, just like for
__vmread_safe(), imply converting to a macro so that the output operand
can be the caller supplied variable rather than an intermediate one. As
that would require touching all invocation points of __vmread() (of
which there are quite a few), I'd first like to be certain the approach
is acceptable; the main question being whether the now conditional code
might be considered to cause future maintenance issues, and the second
being that of parameter/argument ordering (here I made __vmread_safe()
match __vmwrite(), but one could also take the position that read and
write should use the inverse order of one another, in line with the
actual instruction operands).

Additionally I was quite puzzled to find that all the asm()-s involved
here have memory clobbers - what are they needed for? Or can they be
dropped at least in some cases?

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
11 years agoVMX: move various uses of UD2 out of fast paths
Jan Beulich [Mon, 9 Sep 2013 08:22:23 +0000 (10:22 +0200)]
VMX: move various uses of UD2 out of fast paths

... at once making conditional forward jumps, which are statically
predicted to be not taken, only used for the unlikely (error) cases.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
11 years agoVMX: streamline entry.S code
Jan Beulich [Mon, 9 Sep 2013 08:20:52 +0000 (10:20 +0200)]
VMX: streamline entry.S code

- move stuff easily/better done in C into C code
- re-arrange code paths so that no redundant GET_CURRENT() would remain
  on the fast paths
- move long latency operations earlier
- slightly defer disabling interrupts on the VM entry path
- use ENTRY() instead of open coding it

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
11 years agoxen/char: dt-uart: Allow the user to give a path to the node
Julien Grall [Wed, 28 Aug 2013 14:47:15 +0000 (15:47 +0100)]
xen/char: dt-uart: Allow the user to give a path to the node

On some boards, there is no alias to the UART. To avoid modification in
the device tree, dt-uart should also search for the device by path.

To distinguish an alias from a path, dt-uart will check the first character.
If it's a / then it's a path, otherwise it's an alias.
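
The first-character test described above amounts to something like this
sketch (the function name is hypothetical):

```c
#include <stdbool.h>
#include <stddef.h>

/* A leading '/' marks a full device tree path; anything else is an alias. */
bool dt_name_is_path(const char *name)
{
    return name != NULL && name[0] == '/';
}
```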

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agohvmloader: fix SeaBIOS interface
Jan Beulich [Thu, 5 Sep 2013 09:47:03 +0000 (11:47 +0200)]
hvmloader: fix SeaBIOS interface

The SeaBIOS ROM image may validly exceed 128k in size, it's only our
interface code that so far assumed that it wouldn't. Remove that
restriction by setting the base address depending on image size.

Add a check to HVM loader so that too big images won't result in silent
guest failure anymore.

Uncomment the intended build-time size check for rombios, moving it
into a function so that it would actually compile.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/docs: Correct documentation for the conswitch parameter
Andrew Cooper [Wed, 28 Aug 2013 10:19:31 +0000 (11:19 +0100)]
xen/docs: Correct documentation for the conswitch parameter

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
11 years agolibvhd: use UTC for VHD timestamp
Wei Liu [Fri, 30 Aug 2013 11:33:33 +0000 (12:33 +0100)]
libvhd: use UTC for VHD timestamp

[ported from xapi-project/blktap a79ac2c05f9 ("XOP-289: use UTC for VHD
timestamps")]

Currently, the local timezone is factored into VHD timestamps due to the
use of mktime(). This breaks "vhd-util check" for VHDs created in one
timezone and then moved westward, which results in "primary footer
invalid: creation time in future" errors.
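
A sketch of a timezone-independent conversion. It assumes VHD timestamps
count seconds from 2000-01-01 00:00:00 UTC; the names are hypothetical and
this is not the code from the patch.

```c
#include <stdint.h>
#include <time.h>

/* Seconds from the Unix epoch to 2000-01-01 00:00:00 UTC. */
#define EXAMPLE_VHD_EPOCH_START 946684800LL

/* Convert Unix time to a VHD timestamp without consulting the timezone. */
uint32_t example_vhd_time(time_t unix_time)
{
    return (uint32_t)((long long)unix_time - EXAMPLE_VHD_EPOCH_START);
}
```

Since no local-time conversion such as mktime() is involved, the result is
the same whichever timezone the host is configured for.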

Signed-off-by: Andrei Lifchits <andrei.lifchits@citrix.com>
Andrei no longer works for Citrix but Germano Percossi (ex-colleague of
Andrei) contacted Andrei and confirmed that
  1) this work was written on Citrix time,
  2) it is OK to have Andrei's SoB.

Remove unused variable "tm".

Signed-off-by: Germano Percossi <germano.percossi@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxl: prefer qdisk over blktap when choosing disk backend
Wei Liu [Tue, 27 Aug 2013 14:22:43 +0000 (15:22 +0100)]
libxl: prefer qdisk over blktap when choosing disk backend

There are some disk formats commonly supported by both qdisk and blktap.
As qdisk is better supported and blktap is unmaintained, we choose qdisk
over blktap whenever possible.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agoxend: fix file descriptor leak in pci utilities
Xi Xiong [Fri, 30 Aug 2013 19:21:56 +0000 (12:21 -0700)]
xend: fix file descriptor leak in pci utilities

A file descriptor leak was detected after creating multiple domUs with
pass-through PCI devices. This patch fixes the issue.

Signed-off-by: Xi Xiong <xixiong@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message]
Signed-off-by: Matt Wilson <msw@amazon.com>
11 years agoxend: handle extended PCI configuration space when saving state
Steven Noonan [Fri, 30 Aug 2013 23:40:42 +0000 (16:40 -0700)]
xend: handle extended PCI configuration space when saving state

Newer PCI standards (e.g., PCI-X 2.0 and PCIe) introduce extended
configuration space which is larger than 256 bytes. This patch uses
stat() to determine the amount of space used to correctly save all of
the PCI configuration space. Resets handled by the xen-pciback driver
don't have this problem, as that code correctly handles saving
extended configuration space.

Signed-off-by: Steven Noonan <snoonan@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message]
Signed-off-by: Matt Wilson <msw@amazon.com>
11 years agopl011: preserve RTS and DTR signal on UART init
Andre Przywara [Tue, 3 Sep 2013 14:00:52 +0000 (16:00 +0200)]
pl011: preserve RTS and DTR signal on UART init

Although we do not support hardware flow control in the Xen driver
for the PL011 UART, the other end may be configured to use it.
In this case it waits in vain for the RTS signal to be asserted by
the host and will never transmit any characters.
So we leave RTS and DTR as they had been set up before.
This fixes the UART input on Calxeda Midway, which uses hardware
flow control for the serial-over-LAN functionality.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/x86: don't use '.ifnes' in bug frame construction.
Tim Deegan [Thu, 29 Aug 2013 15:47:12 +0000 (16:47 +0100)]
xen/x86: don't use '.ifnes' in bug frame construction.

Spotted because it breaks the clang build for LLVM <3.2.  .ifnes is
not right here as it will choke on a string with embedded quotes.

.ifnb would be better except that LLVM <3.2 doesn't support that either
(and nor does binutils 2.16).

It should be possible to use something like !!msg or !!msg[0] instead
of a separate flag, but I gave up trying to find something that would
make it through CPP, asm() and gas as a usable constant. :|

Signed-off-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
11 years agox86/mwait_idle: initial C8, C9, C10 support
Len Brown [Fri, 30 Aug 2013 09:00:07 +0000 (11:00 +0200)]
x86/mwait_idle: initial C8, C9, C10 support

Allow mwait_idle to utilize C8, C9, C10 when they are present on...
"Fourth Generation Intel(R) Core(TM) Processors", which are based on
Intel(R) microarchitecture code name Haswell.

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86/mwait_idle: export both C1 and C1E
Len Brown [Fri, 30 Aug 2013 08:59:09 +0000 (10:59 +0200)]
x86/mwait_idle: export both C1 and C1E

Here we disable HW promotion of C1 to C1E and export both C1 and C1E
as distinct C-states.

This allows a cpuidle governor to choose a lower latency C-state than
C1E when necessary to satisfy performance and QOS constraints -- and
still save power versus polling.
This also corrects the erroneous latency previously reported for C1E
-- it is 10usec, not 1usec.

Signed-off-by: Len Brown <len.brown@intel.com>
Avoided the effect of changing the meaning of "max_cstate=".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86/mwait_idle: remove assumption of one C-state per MWAIT flag
Len Brown [Fri, 30 Aug 2013 08:58:21 +0000 (10:58 +0200)]
x86/mwait_idle: remove assumption of one C-state per MWAIT flag

Remove the assumption that cstate_tables are indexed by MWAIT flag
values. Each entry identifies itself via its own flags value. This
change is needed to support multiple states that share the same MWAIT
flags.
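
A sketch of a table whose entries carry their own flags value, so that two
states may share one. All names, hint values and latencies here are made up
for illustration; the table is assumed to be ordered from shallowest to
deepest state.

```c
#include <stddef.h>

/* Hypothetical table entry. */
struct example_cstate {
    const char *name;
    unsigned int flags;         /* MWAIT hint carried by the entry itself */
    unsigned int exit_latency;  /* microseconds, illustrative values */
};

/* Entries "A" and "B" share a flags value, so flags cannot be the index. */
static const struct example_cstate example_table[] = {
    { "A", 0x00, 1 },
    { "B", 0x00, 10 },
    { "C", 0x10, 50 },
};

/* Pick the deepest state whose exit latency fits the limit, or NULL. */
const struct example_cstate *pick_state(unsigned int max_latency)
{
    const struct example_cstate *best = NULL;
    size_t n = sizeof(example_table) / sizeof(example_table[0]);

    for (size_t i = 0; i < n; i++)
        if (example_table[i].exit_latency <= max_latency)
            best = &example_table[i];
    return best;
}
```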

Note that this can have an effect on what state is described by 'N' on
cmdline max_cstate=N on some systems.

Signed-off-by: Len Brown <len.brown@intel.com>
Avoided the effect of changing the meaning of "max_cstate=".
Drop MWAIT_MAX_NUM_CSTATES (done differently in a prior patch on
Linux).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86/xsave: initialization improvements
Jan Beulich [Fri, 30 Aug 2013 08:56:07 +0000 (10:56 +0200)]
x86/xsave: initialization improvements

- properly validate available feature set on APs
- also validate xsaveopt availability on APs
- properly indicate whether the initialization is on the BSP (we
  shouldn't be using "cpu == 0" checks for this)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86: remove PentiumPro check
Matt Wilson [Fri, 30 Aug 2013 08:54:32 +0000 (10:54 +0200)]
x86: remove PentiumPro check

... as it's not a supported processor

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86: remove X86_INTEL_USERCOPY code
Matt Wilson [Fri, 30 Aug 2013 08:54:00 +0000 (10:54 +0200)]
x86: remove X86_INTEL_USERCOPY code

Nothing defines CONFIG_X86_INTEL_USERCOPY, and as far as I can tell it
was never used even when Xen supported 32-bit x86.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agox86/apic: remove DMI checks in bigsmp driver for obsolete systems
Matt Wilson [Fri, 30 Aug 2013 08:53:33 +0000 (10:53 +0200)]
x86/apic: remove DMI checks in bigsmp driver for obsolete systems

The DMI checks that force the use of the bigsmp APIC driver are for
systems that are no longer supported by Xen (32-bit x86).

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Keir Fraser <keir@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agox86: remove references to unimplemented BIOS reboot option
Matt Wilson [Fri, 30 Aug 2013 08:49:59 +0000 (10:49 +0200)]
x86: remove references to unimplemented BIOS reboot option

The BIOS reboot option was never implemented for x86_64, and retaining
it is somewhat false advertising.

Signed-off-by: Matt Wilson <msw@amazon.com>
Acked-by: Keir Fraser <keir@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agopublic/hvm_xs_strings.h: Fix ABI regression for OEM SMBios strings
Andrew Cooper [Fri, 30 Aug 2013 08:40:48 +0000 (10:40 +0200)]
public/hvm_xs_strings.h: Fix ABI regression for OEM SMBios strings

The old code for OEM SMBios strings was:

        char path[20] = "bios-strings/oem-XX";
        path[(sizeof path) - 3] = '0' + ((i < 10) ? i : i / 10);
        path[(sizeof path) - 2] = (i < 10) ? '\0' : '0' + (i % 10);

Where oem-1 thru 9 specifically had no leading 0.

However, the definition of HVM_XS_OEM_STRINGS specifically requires leading
0s.
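
To illustrate the mismatch, here is a minimal sketch. The helper names are hypothetical, and the "%02d" format string is only an assumed stand-in for a definition that requires leading 0s; it is not quoted from the header.

```c
#include <stdio.h>
#include <string.h>

/* Old hand-rolled construction, as quoted above: no leading zero for 1..9. */
static void sketch_old_oem_path(int i, char out[20])
{
    char path[20] = "bios-strings/oem-XX";

    path[(sizeof path) - 3] = '0' + ((i < 10) ? i : i / 10);
    path[(sizeof path) - 2] = (i < 10) ? '\0' : '0' + (i % 10);
    memcpy(out, path, sizeof path);
}

/* Assumed leading-zero format, for comparison only. */
static void sketch_padded_oem_path(int i, char out[20])
{
    snprintf(out, 20, "bios-strings/oem-%02d", i);
}
```

For single-digit indexes the two helpers produce different xenstore keys, which is the kind of mismatch a toolstack embedding the old string would hit.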

This regression was introduced by the combination of c/s 4d23036e709627 and
e64c3f71ceb662

I realise that this patch causes a change to the public headers.  However I
feel it is justified as:

* All toolstacks used to have to embed the magic string (and almost certainly
  still do)
* If by some miracle a new toolstack has started using the new define, it will
  continue to work.
* The only intree consumer of the define is hvmloader itself.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agohvmloader/smbios: Correctly count the number of tables written
Andrew Cooper [Fri, 30 Aug 2013 08:40:29 +0000 (10:40 +0200)]
hvmloader/smbios: Correctly count the number of tables written

Fixes regression indirectly introduced by c/s 4d23036e709627

That changeset added some smbios tables which were optional, based on the
toolstack providing appropriate xenstore keys.  The do_struct() macro would
unconditionally increment nr_structs, even if a table was not actually
written.
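
A minimal sketch of the idea, not the actual hvmloader code: count a table only when the writer advanced the output pointer. The writer function and macro names below are hypothetical.

```c
#include <stddef.h>

/* Hypothetical writer: emits a table only when 'present' is non-zero and
 * returns the new end-of-buffer pointer (unchanged if nothing was written). */
static char *sketch_write_table(char *p, int present)
{
    if (!present)
        return p;
    *p = 0x7f;          /* stand-in for real table contents */
    return p + 1;
}

/* Count a structure only if the writer actually advanced the pointer. */
#define SKETCH_DO_STRUCT(p, nr, call) do { \
    char *q_ = (call);                     \
    if (q_ != (p))                         \
        (nr)++;                            \
    (p) = q_;                              \
} while (0)

static unsigned int sketch_count_tables(char *buf)
{
    char *p = buf;
    unsigned int nr_structs = 0;

    SKETCH_DO_STRUCT(p, nr_structs, sketch_write_table(p, 1));
    SKETCH_DO_STRUCT(p, nr_structs, sketch_write_table(p, 0)); /* optional, absent */
    SKETCH_DO_STRUCT(p, nr_structs, sketch_write_table(p, 1));
    return nr_structs;
}
```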

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoAMD IOMMU: allow command line overrides for broken IVRS tables
Jan Beulich [Thu, 29 Aug 2013 07:53:07 +0000 (09:53 +0200)]
AMD IOMMU: allow command line overrides for broken IVRS tables

With there being so many systems with broken ACPI tables, and with it
generally being known what's wrong with those tables, give people a
handle to overcome the resulting disabling of their IOMMUs.

Inspired by Linux side patches providing similar functionality.

Suggested-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-By: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Suravee Suthikulpanit <suravee.suthikulapanit@amd.com>
11 years agoAMD IOMMU: add missing checks
Jan Beulich [Thu, 29 Aug 2013 07:31:37 +0000 (09:31 +0200)]
AMD IOMMU: add missing checks

For one we shouldn't accept IVHD tables specifying IO-APIC IDs beyond
the limit we support (MAX_IO_APICS, currently 128).

And then we shouldn't memset() a pointer allocation of which failed.
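
Both checks can be sketched as follows. This is an illustration under assumptions, not the IOMMU code itself: the helper names are hypothetical and malloc() stands in for the hypervisor's allocator.

```c
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_IO_APICS 128   /* limit quoted in the commit message */

/* Reject firmware-provided IDs that would index past the supported range. */
static int sketch_ioapic_id_ok(unsigned int id)
{
    return id < SKETCH_MAX_IO_APICS;
}

/* Allocate and zero a table, touching the memory only if allocation worked. */
static void *sketch_alloc_zeroed(size_t size)
{
    void *p = malloc(size);

    if (p != NULL)
        memset(p, 0, size);
    return p;
}
```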

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulapanit@amd.com>
11 years agox86: AVX instruction emulation fixes
Jan Beulich [Wed, 28 Aug 2013 15:03:50 +0000 (17:03 +0200)]
x86: AVX instruction emulation fixes

- we used the C4/C5 (first prefix) byte instead of the apparent ModR/M
  one as the second prefix byte
- early decoding normalized vex.reg, thus corrupting it for the main
  consumer (copy_REX_VEX()), resulting in #UD on the two-operand
  instructions we emulate

Also add respective test cases to the testing utility plus
- fix get_fpu() (the fall-through order was inverted)
- add cpu_has_avx2, even if it's currently unused (as in the new test
  cases I decided to refrain from using AVX2 instructions in order to
  be able to actually run all the tests on the hardware I have)
- slightly tweak cpu_has_avx to more consistently express the outputs
  we don't care about (sinking them all into the same variable)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoxen: update tx_ready callback for ARM serial drivers
Tomasz Wroblewski [Wed, 28 Aug 2013 11:36:18 +0000 (13:36 +0200)]
xen: update tx_ready callback for ARM serial drivers

The type of the tx_ready callback was changed to int to allow signalling an
error condition, but the ARM serial drivers were not modified, thus breaking
compilation.

Reported-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
11 years agoPCI UART: better cope with UART being temporarily unavailable
Tomasz Wroblewski [Wed, 28 Aug 2013 08:19:42 +0000 (10:19 +0200)]
PCI UART: better cope with UART being temporarily unavailable

This happens for example when dom0 disables ioport responses during PCI
subsystem initialisation. If a __ns16550_poll() happens to be scheduled
during that time, Xen hangs. Detect and exit that condition.

Amended ns16550_ioport_invalid function to only check IER register,
which contains 3 reserved (always 0) bits, therefore it's sufficient for
that test.
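
A minimal sketch of the test, assuming that a non-responding ioport reads back as all ones (0xff). The function name and the value-based interface are illustrative only; the real driver reads the register itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* A read of all ones cannot be a genuine IER value, because the register
 * has reserved bits that always read as zero. */
static bool sketch_ioport_invalid(uint8_t ier)
{
    return ier == 0xff;
}
```

A poll routine would return early when this predicate is true instead of spinning on a device that is temporarily not decoding its ports.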

Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoFix inactive timer list corruption on second S3 resume
Tomasz Wroblewski [Wed, 28 Aug 2013 08:18:39 +0000 (10:18 +0200)]
Fix inactive timer list corruption on second S3 resume

init_timer cannot be safely called multiple times on the same timer since it does
memset(0) on the structure, erasing the auxiliary member used by the linked list
code. This breaks the inactive timer list in common/timer.c.

Moved resume_timer initialisation to ns16550_init_postirq, so it's only done once.
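
The hazard and the fix can be sketched with a simplified, hypothetical timer structure (none of the names below are Xen's). The initialiser wipes the structure and links it into a list, so it has to run exactly once per timer.

```c
#include <stddef.h>
#include <string.h>

struct sketch_timer {
    void (*fn)(void *);
    void *data;
    struct sketch_timer *inactive_next;   /* list linkage owned by the timer core */
};

static struct sketch_timer *sketch_inactive_list;
static struct sketch_timer sketch_resume_timer;

/* Wipes the whole structure first, then links it into the inactive list.
 * Calling this twice on one timer would make the entry point at itself. */
static void sketch_init_timer(struct sketch_timer *t, void (*fn)(void *), void *data)
{
    memset(t, 0, sizeof(*t));
    t->fn = fn;
    t->data = data;
    t->inactive_next = sketch_inactive_list;
    sketch_inactive_list = t;
}

static void sketch_noop(void *unused) { (void)unused; }

/* One-time setup path: the only place the timer is initialised. */
static void sketch_init_postirq(void)
{
    sketch_init_timer(&sketch_resume_timer, sketch_noop, NULL);
}

/* Resume path: reuses the already initialised timer and never re-inits it. */
static void sketch_resume(void)
{
    /* the timer would be armed here; no sketch_init_timer() call */
}

/* Bounded walk so that a corrupted (cyclic) list would be detectable. */
static int sketch_inactive_count(int bound)
{
    int n = 0;

    for (struct sketch_timer *t = sketch_inactive_list; t != NULL && n < bound;
         t = t->inactive_next)
        n++;
    return n;
}
```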

Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoPCI: centralize parsing of device coordinates in command line options
Jan Beulich [Wed, 28 Aug 2013 08:12:36 +0000 (10:12 +0200)]
PCI: centralize parsing of device coordinates in command line options

With yet another case to come in a subsequent patch, it seems time to
do this in a single place rather than hand crafting it in various
scattered around locations.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoAMD IOMMU: also allocate IRTEs for HPET MSI
Jan Beulich [Wed, 28 Aug 2013 08:11:19 +0000 (10:11 +0200)]
AMD IOMMU: also allocate IRTEs for HPET MSI

Omitting this was a blatant oversight of mine in commit 2ca9fbd7 ("AMD
IOMMU: allocate IRTE entries instead of using a static mapping").

This also changes a bogus inequality check into a sensible one, even
though it is already known that this will make HPET MSI unusable on
certain systems (having respective broken firmware). This, however,
seems better than failing on systems with consistent ACPI tables.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
11 years agotools: drop VT-i example
Jan Beulich [Tue, 27 Aug 2013 13:23:07 +0000 (14:23 +0100)]
tools: drop VT-i example

... as being another IA64 leftover.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
11 years agoxen/arm: use defines for boot module indexes instead of open coded numbers
Ian Campbell [Thu, 22 Aug 2013 15:24:46 +0000 (16:24 +0100)]
xen/arm: use defines for boot module indexes instead of open coded numbers

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
11 years agoxen: arm: indicate when we have early panicked
Ian Campbell [Thu, 22 Aug 2013 16:01:57 +0000 (17:01 +0100)]
xen: arm: indicate when we have early panicked

Otherwise the hypervisor simply appears to stop after a message which may or
may not look all that severe.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org>
11 years agopl011: early_panic if baud rate not set in hardware
Ian Campbell [Thu, 22 Aug 2013 16:01:59 +0000 (17:01 +0100)]
pl011: early_panic if baud rate not set in hardware

Now that the driver defaults to BAUD_AUTO this can happen if the early uart !=
console or if early printk isn't in use.

The following division by zero causes a trap but that uses regular printk and
not early_printk, so it is never seen.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years agoxen/arm: add lower-bound check in mfn_valid
Jaeyong Yoo [Fri, 23 Aug 2013 09:08:41 +0000 (18:08 +0900)]
xen/arm: add lower-bound check in mfn_valid

mfn_valid only checks the upper-bound of mfn (max_page).
Add the lower-bound check of mfn (frametable_base_mfn).
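
As a sketch, the two-sided check looks like this. The variable values are made up for illustration, and the real macro may perform further checks.

```c
#include <stdbool.h>

/* Hypothetical values: the frametable covers frames 0x80000 up to,
 * but not including, 0xc0000. */
static unsigned long sketch_frametable_base_mfn = 0x80000UL;
static unsigned long sketch_max_page = 0xc0000UL;

/* Valid only if the frame number lies inside [base, max_page). */
static bool sketch_mfn_valid(unsigned long mfn)
{
    return mfn >= sketch_frametable_base_mfn && mfn < sketch_max_page;
}
```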

Signed-off-by: Jaeyong Yoo <jaeyong.yoo@samsung.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years agoxen/arm: Introduce and use GLOBAL() in asm code.
Andrew Cooper [Mon, 26 Aug 2013 19:18:33 +0000 (20:18 +0100)]
xen/arm: Introduce and use GLOBAL() in asm code.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agodrivers/char: pl011: Enable receive timeout interrupt
Julien Grall [Tue, 27 Aug 2013 12:13:35 +0000 (13:13 +0100)]
drivers/char: pl011: Enable receive timeout interrupt

The commit 874f76a "PL011: fix reverse logic for interrupt mask register"
introduced a regression on the Versatile Express: the board did not receive
input correctly.

The timeout interrupt may be asserted when the FIFO is not empty and no further
data is received over a 32-bit period.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoRevert "x86/boot: Explicitly clean pcpu stacks in debug builds"
Jan Beulich [Tue, 27 Aug 2013 13:13:20 +0000 (15:13 +0200)]
Revert "x86/boot: Explicitly clean pcpu stacks in debug builds"

This reverts commit 8a3c4acc9907cfec9aae9f1bc251fbf50af6828e.
It's reportedly broken.

11 years agopygrub: add Debian extlinux.conf path
Ian Campbell [Fri, 16 Aug 2013 14:21:05 +0000 (15:21 +0100)]
pygrub: add Debian extlinux.conf path

This is Debian bug #697407.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=697407

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agofix gdbstub build c/s c8177e691f
Andrew Cooper [Tue, 27 Aug 2013 09:29:03 +0000 (11:29 +0200)]
fix gdbstub build c/s c8177e691f

That changeset moved the watchdog functions from nmi.h to their own
watchdog.h.  I thought I had updated all relevant header files and the
compiler was happy as well.  However, gdbstub is not even compiled by default,
and I accidentally missed it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agox86/boot: Explicitly clean pcpu stacks in debug builds
Andrew Cooper [Tue, 27 Aug 2013 09:28:26 +0000 (11:28 +0200)]
x86/boot: Explicitly clean pcpu stacks in debug builds

This reduces confusion when looking at a hexdump of the pcpu stacks and
wondering where on earth some of the junk was coming from.  Also leave some
grep fodder for finding where the BSP switches stack (because it took me
far longer to find than I care to admit to).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agox86/time: remove Cyclone as a platform timer
Matt Wilson [Tue, 27 Aug 2013 09:23:09 +0000 (11:23 +0200)]
x86/time: remove Cyclone as a platform timer

The Cyclone time source was part of IBM's Summit chipset, which was
only used for 32-bit only ccNUMA and IA-64 machines. Neither of these
are supported by Xen anymore.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agox86/apic: remove Summit support
Matt Wilson [Tue, 27 Aug 2013 09:20:17 +0000 (11:20 +0200)]
x86/apic: remove Summit support

IBM's Summit chipset was only used for 32-bit only Intel ccNUMA and
IA-64 machines, neither of which are supported by Xen anymore.

Signed-off-by: Matt Wilson <msw@amazon.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agox86/Intel: add support for Haswell CPU models
Jan Beulich [Tue, 27 Aug 2013 09:15:15 +0000 (11:15 +0200)]
x86/Intel: add support for Haswell CPU models

... according to their most recent public documentation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoVMX: convert EOI exit bitmap to a proper bitmap
Jan Beulich [Tue, 27 Aug 2013 09:13:50 +0000 (11:13 +0200)]
VMX: convert EOI exit bitmap to a proper bitmap

... allowing bitmap operations to be used on it, making things
consistent with struct pi_desc's pir field, and shrinking overall
source code size.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86: don't allow Dom0 access to the HT address range
Jan Beulich [Tue, 27 Aug 2013 09:12:12 +0000 (11:12 +0200)]
x86: don't allow Dom0 access to the HT address range

In particular, MMIO assignments should not be done using this area.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
11 years agox86: don't allow Dom0 access to the MSI address range
Jan Beulich [Tue, 27 Aug 2013 09:11:38 +0000 (11:11 +0200)]
x86: don't allow Dom0 access to the MSI address range

In particular, MMIO assignments should not be done using this area.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>

11 years agoRevert "interrupts: allow guest to set/clear MSI-X mask bit"
Jan Beulich [Mon, 26 Aug 2013 10:40:44 +0000 (12:40 +0200)]
Revert "interrupts: allow guest to set/clear MSI-X mask bit"

This reverts commit 54a46bce768033b1c36e25eace15f7abde972389.
It's not fully cooked yet.

11 years agodomctl: replace cpumask_weight() uses
Jan Beulich [Fri, 23 Aug 2013 13:07:00 +0000 (15:07 +0200)]
domctl: replace cpumask_weight() uses

In one case it could easily be replaced by range checking the result of
a subsequent operation, and in general cpumask_next(), not always
needing to scan the whole bitmap, is more efficient than the specific
uses of cpumask_weight() here. (When running on big systems, operations
on CPU masks aren't cheap enough to use them carelessly.)
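
The efficiency argument can be sketched with a toy multi-word mask (all names are hypothetical and this is not Xen's cpumask implementation). A "more than one CPU set" question is answered by a next-style scan that stops at the second set bit, whereas a weight computation always visits every word.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 256u
#define SKETCH_BITS_PER_WORD (8u * (unsigned int)sizeof(unsigned long))
#define SKETCH_WORDS (SKETCH_NR_CPUS / SKETCH_BITS_PER_WORD)

typedef struct { unsigned long bits[SKETCH_WORDS]; } sketch_cpumask;

/* First set bit at or after 'start', or SKETCH_NR_CPUS if there is none. */
static unsigned int sketch_find_from(unsigned int start, const sketch_cpumask *m)
{
    for (unsigned int cpu = start; cpu < SKETCH_NR_CPUS; cpu++)
        if (m->bits[cpu / SKETCH_BITS_PER_WORD] &
            (1UL << (cpu % SKETCH_BITS_PER_WORD)))
            return cpu;
    return SKETCH_NR_CPUS;
}

/* True if at least two bits are set; the scan ends at the second one. */
static bool sketch_more_than_one(const sketch_cpumask *m)
{
    unsigned int first = sketch_find_from(0, m);

    return first < SKETCH_NR_CPUS &&
           sketch_find_from(first + 1, m) < SKETCH_NR_CPUS;
}
```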

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agocredit1: replace cpumask_empty() uses
Jan Beulich [Fri, 23 Aug 2013 13:06:21 +0000 (15:06 +0200)]
credit1: replace cpumask_empty() uses

In one case it was redundant with the operation it got combined with,
and in the other it could easily be replaced by range checking the
result of a subsequent operation. (When running on big systems,
operations on CPU masks aren't cheap enough to use them carelessly.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
11 years agocredit2: replace cpumask_first() uses
Jan Beulich [Fri, 23 Aug 2013 13:05:39 +0000 (15:05 +0200)]
credit2: replace cpumask_first() uses

... with cpumask_any() or cpumask_cycle().

In one case this also allows elimination of a cpumask_empty() call,
and while doing this I also spotted a redundant use of
cpumask_weight(). (When running on big systems, operations on CPU masks
aren't cheap enough to use them carelessly.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
11 years agox86: use cpumask_any() in mask-to-APIC-ID conversions
Jan Beulich [Fri, 23 Aug 2013 13:04:17 +0000 (15:04 +0200)]
x86: use cpumask_any() in mask-to-APIC-ID conversions

This is to avoid picking CPU0 for almost any such operation, resulting
in very uneven distribution of interrupt load.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoun-alias cpumask_any() from cpumask_first()
Jan Beulich [Fri, 23 Aug 2013 13:01:53 +0000 (15:01 +0200)]
un-alias cpumask_any() from cpumask_first()

In order to achieve more symmetric distribution of certain things,
cpumask_any() shouldn't always pick the first CPU (which frequently
will end up being CPU0). To facilitate that, introduce a library-like
function to obtain random numbers.

The per-architecture function is supposed to return zero if no valid
random number can be obtained (implying that if occasionally zero got
produced as random number, it wouldn't be considered such).

As fallback this uses the trivial algorithm from the C standard,
extended to produce "unsigned int" results.
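
A sketch of what such a fallback could look like, assuming a 32-bit "unsigned int" and 32-bit generator state; the function names are hypothetical and the exact way the patch combines the bits is not shown here. The step function uses the constants of the example generator given in the C standard's description of rand().

```c
/* One step of the example generator: returns 15 pseudo-random bits
 * and advances *state. */
static unsigned int sketch_std_rand15(unsigned int *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Concatenate three 15-bit outputs to fill an "unsigned int";
 * the oldest bits are shifted out at the top. */
static unsigned int sketch_rand_uint(unsigned int *state)
{
    unsigned int r = sketch_std_rand15(state);

    r = (r << 15) | sketch_std_rand15(state);
    r = (r << 15) | sketch_std_rand15(state);
    return r;
}

/* Pick a pseudo-randomly chosen set bit of a 32-bit mask, or -1 if empty. */
static int sketch_mask_any(unsigned int mask, unsigned int *state)
{
    unsigned int count = 0, k;

    for (unsigned int m = mask; m != 0; m &= m - 1)
        count++;                       /* number of set bits */
    if (count == 0)
        return -1;
    k = sketch_rand_uint(state) % count;
    for (int bit = 0; bit < 32; bit++) {
        if ((mask >> bit) & 1u) {
            if (k == 0)
                return bit;
            k--;
        }
    }
    return -1;
}
```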

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
11 years agox86: correct public header's documentation of PAT MSR settings
Jan Beulich [Fri, 23 Aug 2013 07:23:24 +0000 (09:23 +0200)]
x86: correct public header's documentation of PAT MSR settings

The first (PAT6) column was wrong across the board, and the column for
PAT7 was missing altogether.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoPCI: break MSI-X data out of struct pci_dev_info
Jan Beulich [Fri, 23 Aug 2013 07:22:08 +0000 (09:22 +0200)]
PCI: break MSI-X data out of struct pci_dev_info

Considering that a significant share of PCI devices out there (not the
least the myriad of CPU-exposed ones) don't support MSI-X at all, and
that the amount of data is well beyond a handful of bytes, break this
out of the common structure, at once allowing the actual data to be
tracked to become architecture specific.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86: move struct bug_frame instances out of line
Jan Beulich [Fri, 23 Aug 2013 07:19:29 +0000 (09:19 +0200)]
x86: move struct bug_frame instances out of line

Just like Linux did many years ago, move them into a separate (data)
section, such that they no longer pollute instruction caches and TLBs.

Assertion frames, requiring two pointers to be stored, occupy two slots
in the array, with the second slot mimicking a frame the location
pointer of which doesn't match any address within .text or .init.text
(it effectively points back to the slot itself, which - being in a data
section - can't be reached by non-buggy execution).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoxen: arm: retry trylock if strex fails on free lock.
Ian Campbell [Fri, 19 Jul 2013 15:20:10 +0000 (16:20 +0100)]
xen: arm: retry trylock if strex fails on free lock.

This comes from the Linux patches 15e7e5c1ebf5 for arm32 and 4ecf7ccb1973 for
arm64 by Will Deacon and Catalin Marinas respectively. The Linux commit message
says:

    An exclusive store instruction may fail for reasons other than lock
    contention (e.g. a cache eviction during the critical section) so, in
    line with other architectures using similar exclusive instructions
    (alpha, mips, powerpc), retry the trylock operation if the lock appears
    to be free but the strex reported failure.

I have observed this due to register_cpu_notifier containing:
    if ( !spin_trylock(&cpu_add_remove_lock) )
        BUG(); /* Should never fail as we are called only during boot. */
which was spuriously failing.
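
The same retry idea can be sketched in portable C11 rather than the ARM assembly the patch touches. This is an analogy under stated assumptions: atomic_compare_exchange_weak is allowed to fail spuriously, much like a failed exclusive store, and the function name is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Returns true if the lock was taken, false only if it was really held. */
static bool sketch_trylock(atomic_int *lock)
{
    int expected = 0;

    while (!atomic_compare_exchange_weak_explicit(lock, &expected, 1,
                                                  memory_order_acquire,
                                                  memory_order_relaxed)) {
        if (expected != 0)
            return false;   /* genuinely contended: report failure */
        /* The lock looked free but the store failed: retry. */
    }
    return true;
}
```

Without the inner check and retry, a spurious store failure would be reported as contention, which is the behaviour described above.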

The ARMv8 variant is taken directly from the Linux patch. For v7 I had to
reimplement since we don't currently use ticket locks.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
11 years agoxen/arm64: resync atomics and spinlock asm with Linux
Ian Campbell [Fri, 19 Jul 2013 15:20:09 +0000 (16:20 +0100)]
xen/arm64: resync atomics and spinlock asm with Linux

This picks up the changes from Linux commit 3a0310eb369a:
    arm64: atomics: fix grossly inconsistent asm constraints for exclusives

    Our uses of inline asm constraints for atomic operations are fairly
    wild and varied. We basically need to guarantee the following:

      1. Any instructions with barrier implications
         (load-acquire/store-release) have a "memory" clobber

      2. When performing exclusive accesses, the addressing mode is generated
         using the "Q" constraint

      3. Atomic blocks which use the condition flags, have a "cc" clobber

    This patch addresses these concerns which, as well as fixing the
    semantics of the code, stops GCC complaining about impossible asm
    constraints.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
11 years agoxen/arm64: Assembly optimized bitops from Linux
Ian Campbell [Fri, 19 Jul 2013 15:20:08 +0000 (16:20 +0100)]
xen/arm64: Assembly optimized bitops from Linux

This patch replaces the previous hashed lock implementation of bitops with
assembly optimized ones taken from Linux v3.10-rc4.

The Linux derived ASM only supports 8 byte aligned bitmaps (which under Linux
are unsigned long * rather than our void *). We do actually have uses of 4
byte alignment (i.e. the bitmaps in struct xmem_pool) which trigger alignment
faults.

Therefore adjust the assembly to work in 4 byte increments, which involved:
    - bit offset now bits 4:0 => mask #31 not #63
    - use wN register not xN for load/modify/store loop.

There is no need to adjust the shift used to calculate the word offset, the
difference is already accounted for in the #63->#31 change.
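
The arithmetic can be sketched in plain, non-atomic C. This assumes the assembly derives the word address by clearing the low bits of the bit number and then shifting right by 3; the function names are hypothetical.

```c
#include <stdint.h>

/* Bit position within a 32-bit word: bits 4:0 of nr (mask #31). */
static unsigned int sketch_bit_in_word(unsigned int nr)
{
    return nr & 31u;
}

/* Byte offset of the containing 32-bit word: clear the low five bits of nr,
 * then divide by 8 bits per byte. The shift by 3 is the same one a 64-bit
 * variant would use after clearing the low six bits (mask #63) instead. */
static unsigned int sketch_word_byte_offset(unsigned int nr)
{
    return (nr & ~31u) >> 3;
}

/* Non-atomic illustration of set_bit on a 4-byte-aligned bitmap. */
static void sketch_set_bit(unsigned int nr, uint32_t *bitmap)
{
    bitmap[sketch_word_byte_offset(nr) / 4] |= UINT32_C(1) << sketch_bit_in_word(nr);
}
```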

NB: Xen's build system cannot cope with the change from .c to .S file,
remove xen/arch/arm/arm64/lib/.bitops.o.d or clean your build tree.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
11 years agoxen/arm: Specific mapping for dom0 on OMAP5 platform
Chen Baozi [Thu, 15 Aug 2013 13:19:48 +0000 (21:19 +0800)]
xen/arm: Specific mapping for dom0 on OMAP5 platform

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/arm: Platform recognition and initialize arch_timer for the OMAP5
Chen Baozi [Tue, 13 Aug 2013 11:14:26 +0000 (19:14 +0800)]
xen/arm: Platform recognition and initialize arch_timer for the OMAP5

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/arm: Add support for device tree specified arch_timer clock frequency.
Chen Baozi [Tue, 13 Aug 2013 11:14:25 +0000 (19:14 +0800)]
xen/arm: Add support for device tree specified arch_timer clock frequency.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen/arm: Add the new OMAP UART driver.
Chen Baozi [Tue, 13 Aug 2013 11:14:24 +0000 (19:14 +0800)]
xen/arm: Add the new OMAP UART driver.

TI OMAP UART introduces some features such as register access modes, which
make its configuration and interrupt handling differ from 8250 compatible
UART. Thus, we separate this driver from ns16550's implementation.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen: Introduce a helper to read a u32 property in device tree.
Chen Baozi [Tue, 13 Aug 2013 11:14:23 +0000 (19:14 +0800)]
xen: Introduce a helper to read a u32 property in device tree.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years agoxen/arm: add 8250 compatible UART support for early_printk
Chen Baozi [Tue, 13 Aug 2013 11:14:22 +0000 (19:14 +0800)]
xen/arm: add 8250 compatible UART support for early_printk

Both OMAP5 and sun6i/sun7i SoCs share this UART driver for early_printk.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen: rename ns16550-uart.h to 8250-uart.h and fix some typos
Chen Baozi [Tue, 13 Aug 2013 11:14:21 +0000 (19:14 +0800)]
xen: rename ns16550-uart.h to 8250-uart.h and fix some typos

Since UARTs on OMAP5 & Allwinner's SoC are not ns16550 but only 8250
compatible, rename ns16550-uart.h to 8250-uart.h, which is a more pervasive
name. At the same time, fix some typos: redundant UART_ prefixes in some
macros.

Signed-off-by: Chen Baozi <baozich@gmail.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoARM: fix const declaration of platform struct
Andre Przywara [Thu, 22 Aug 2013 07:40:54 +0000 (09:40 +0200)]
ARM: fix const declaration of platform struct

As Julien pointed out the other day, the data type for the platform
DT name match struct is wrong.
To be really immutable, we have to use "const char * const".
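
A short sketch of the declaration, with made-up compatible strings and a hypothetical helper. With only one const the array entries could still be reassigned; the second const makes the pointers themselves read-only.

```c
#include <stddef.h>

/* Both the pointers in the table and the characters they point to are
 * const, so neither can be modified after initialisation. */
static const char * const sketch_dt_compat[] = {
    "example,board-a",   /* hypothetical compatible strings */
    "example,board-b",
    NULL
};

/* Count entries in a NULL-terminated match list. */
static size_t sketch_count_compat(const char * const *list)
{
    size_t n = 0;

    while (list[n] != NULL)
        n++;
    return n;
}
```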

Fix it on the three currently existing platforms.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoCorrect X2-APIC HVM emulation
Juergen Gross [Thu, 22 Aug 2013 09:24:00 +0000 (11:24 +0200)]
Correct X2-APIC HVM emulation

commit 6859874b61d5ddaf5289e72ed2b2157739b72ca5 ("x86/HVM: fix x2APIC
APIC_ID read emulation") introduced an error for the hvm emulation of
x2apic. Any attempt to write to APIC_ICR MSR will result in a GP fault.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
11 years agoNested VMX: Update APIC-v(RVI/SVI) when vmexit to L1
Yang Zhang [Thu, 22 Aug 2013 08:59:01 +0000 (10:59 +0200)]
Nested VMX: Update APIC-v(RVI/SVI) when vmexit to L1

When APIC-v is enabled, all interrupts to L1 are delivered through APIC-v.
But when L2 is running, an external interrupt will cause an L1 vmexit with
reason "external interrupt". L1 then picks up the interrupt through
vmcs12. When L1 acks the interrupt, the APIC-v hardware will still perform
the vEOI update, since APIC-v is enabled while L1 is running. The problem
is that the interrupt was not delivered through the APIC-v hardware, which
means SVI/RVI/vPPR are not set, yet the hardware requires them when doing
the vEOI update. The solution is that, when L1 tries to pick up the
interrupt from vmcs12, the hypervisor updates SVI/RVI/vPPR so that the
subsequent vEOI and vPPR updates are performed correctly.

Also, since the interrupt is delivered through vmcs12, the APIC-v hardware
will not clear vIRR, and the hypervisor needs to clear it before L1 runs.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: "Dong, Eddie" <eddie.dong@intel.com>
11 years agoNested VMX: Clear APIC-v control bit in vmcs02
Yang Zhang [Thu, 22 Aug 2013 08:52:05 +0000 (10:52 +0200)]
Nested VMX: Clear APIC-v control bit in vmcs02

There is no vAPIC-v support, so mask APIC-v control bit when
constructing vmcs02.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: "Dong, Eddie" <eddie.dong@intel.com>