Depending on the contents of the mp_irqs/mp_ioapics from the MP table,
find_isa_irq_apic() might return -1, at which point calling
ioapic_read_entry() with it is bad.
In addition to bailing if pin is -1, bail if apic is -1.
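The guard described above can be sketched roughly as follows. The helper names and return values are hypothetical stand-ins (the real lookup functions live in Xen's IO-APIC code and have different signatures); only the "bail if either lookup returned -1" logic is taken from the description.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the MP-table lookups; -1 means "not found". */
static int sketch_find_isa_irq_pin(int irq)
{
    return irq == 0 ? 2 : irq == 1 ? 3 : -1;
}

static int sketch_find_isa_irq_apic(int irq)
{
    return irq == 0 ? 0 : -1;
}

/* Returns 1 if the routing entry would be read, 0 if we bailed out early. */
static int sketch_touch_isa_irq(int irq)
{
    int pin = sketch_find_isa_irq_pin(irq);
    int apic = sketch_find_isa_irq_apic(irq);

    /* Bail if either lookup failed, not just the pin lookup. */
    if (pin == -1 || apic == -1)
        return 0;

    printf("would read entry for apic %d pin %d\n", apic, pin);
    return 1;
}
```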
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Tue, 10 Sep 2013 14:39:46 +0000 (16:39 +0200)]
console: buffer and show origin of guest PV writes
Guests other than domain 0 using the console output have previously been
controlled by the VERBOSE #define, but with no designation of which
guest's output was on the console. This patch converts the HVM output
buffering to be used by all domains except the hardware domain (dom0):
stripping non-printable characters, line buffering the output, and
prefixing it with the domain ID. This is especially useful for debugging
stub domains during early boot.
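As an illustration only, the buffering described above (strip non-printable characters, buffer until newline, prefix with the domain ID) might look like the sketch below. The structure, the buffer size and the "(d%d)" prefix format are assumptions made for this example, not the code added by the patch.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_BUF_SIZE 128

struct sketch_console {
    int domid;
    char buf[SKETCH_BUF_SIZE];
    size_t idx;
};

/* Emit the buffered line with a domain ID prefix, then reset the buffer. */
static void sketch_flush(struct sketch_console *c, char *out, size_t outsz)
{
    c->buf[c->idx] = '\0';
    snprintf(out, outsz, "(d%d) %s\n", c->domid, c->buf);
    c->idx = 0;
}

/*
 * Feed guest bytes into the per-domain buffer. Returns the number of
 * complete lines emitted; the most recent line is left in 'out'.
 */
static int sketch_write(struct sketch_console *c, const char *s, size_t len,
                        char *out, size_t outsz)
{
    int lines = 0;

    for (size_t i = 0; i < len; i++) {
        unsigned char ch = (unsigned char)s[i];

        if (ch == '\n') {
            sketch_flush(c, out, outsz);
            lines++;
            continue;
        }
        if (!isprint(ch))
            continue;               /* strip non-printable characters */
        c->buf[c->idx++] = (char)ch;
        if (c->idx == SKETCH_BUF_SIZE - 1) {
            sketch_flush(c, out, outsz);   /* buffer full: force a line */
            lines++;
        }
    }
    return lines;
}
```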
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Keir Fraser <keir@xen.org>
Add XENFEAT_hvm_callback_vector to elf_xen_feature_names so we can
ensure the kernel supports all features required for PVH mode when
building a PVH domU here. Note, hvm callback is required for PVH.
Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Rob Hoes [Thu, 22 Aug 2013 10:50:52 +0000 (11:50 +0100)]
libxl: idl: complete some enums in the IDL with their defaults
There are several enums in the IDL that are initialised to 0, while
the value 0 is not part of the enum itself. This creates problems for
language bindings generated from the IDL, such as the OCaml ones.
Added an explicit (0, "UNKNOWN") enum value where appropriate, or used
init_val to default to a sensible value.
Signed-off-by: Rob Hoes <rob.hoes@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Rob Hoes [Thu, 22 Aug 2013 10:50:49 +0000 (11:50 +0100)]
libxl: Add LIBXL_SHUTDOWN_REASON_UNKNOWN
libxl_dominfo.shutdown_reason is valid iff (shutdown||dying). This is a bit
annoying when generating language bindings since it needs all sorts of special
casing. Just introduce an explicit value instead.
Signed-off-by: Ian Campbell <ian.cambell@citrix.com> Signed-off-by: Rob Hoes <rob.hoes@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
This was broken by c/s 5cc436c1d2b3b0 which did a blanket change of 'int
xc_handle' -> 'xc_interface *xch'. The types got updated, but error
conditions were left as-were. (I suspect some sed was involved originally)
Also while playing around in this area, fix up some of the bracketing style to
match the Xen coding style.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Julien Grall [Wed, 28 Aug 2013 14:47:17 +0000 (15:47 +0100)]
xen/dts: Don't check the number of address and size cells in process_cpu_node
CPU nodes are not required to have #address-cells == 1 and #size-cells == 0, so
don't check for that (see Linux Documentation/devicetree/booting-without-of.txt
Section III.5.a).
In some OMAP5 device trees, these 2 properties are not correctly set. Therefore,
Xen will only be able to handle 1 CPU.
Ben Cressey [Fri, 6 Sep 2013 19:52:07 +0000 (12:52 -0700)]
minios: fix xenbus_rm() calls in frontend drivers
The commit "minios: refactor xenbus state machine" caused "/state" to
be appended to the local value of nodename. Previously the nodename
variable pointed to dev->nodename.
The xenbus_rm() calls were not updated to reflect this change, and
refer to paths that do not exist.
For example, shutdown_blkfront() for vbd 2049 would issue these calls:
    xenbus_rm(XBT_NIL, "device/vbd/2049/state/ring-ref");
    xenbus_rm(XBT_NIL, "device/vbd/2049/state/event-channel");
This patch restores the previous behavior, issuing these calls
instead:
    xenbus_rm(XBT_NIL, "device/vbd/2049/ring-ref");
    xenbus_rm(XBT_NIL, "device/vbd/2049/event-channel");
This causes frontend drivers to not be properly reset when PV-GRUB
exists. Some PV Linux drivers fail to re-initialize frontend devices
if PV-GRUB leaves them in this state.
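To make the path mix-up concrete, here is a small sketch of building a xenbus path from a node name and a key. The helper name is hypothetical; the point is only that a base name which already ends in "/state" yields a path that does not exist in xenstore.

```c
#include <stdio.h>
#include <string.h>

/* Build "<nodename>/<key>" into out; returns 0 on success, -1 on truncation. */
static int sketch_path(char *out, size_t outsz,
                       const char *nodename, const char *key)
{
    int n = snprintf(out, outsz, "%s/%s", nodename, key);

    return (n < 0 || (size_t)n >= outsz) ? -1 : 0;
}
```

Passing "device/vbd/2049" gives the intended "device/vbd/2049/ring-ref", whereas passing the local copy with "/state" appended gives "device/vbd/2049/state/ring-ref".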
Signed-off-by: Ben Cressey <bcressey@amazon.com> Reviewed-by: Matt Wilson <msw@amazon.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[msw: adjusted commit message to include consequences, split out
changes into separate patches] Signed-off-by: Matt Wilson <msw@amazon.com>
Ben Cressey [Fri, 6 Sep 2013 19:52:06 +0000 (12:52 -0700)]
minios: clean up unneeded "err = NULL" in frontend drivers
This patch removes cases where the error message pointer is already
NULL and is then set to NULL. These are harmless, but suggest
incorrect practice: the pointer should be passed to free() to
deallocate memory prior to reassignment. There are no functional
changes in this patch.
Signed-off-by: Ben Cressey <bcressey@amazon.com> Reviewed-by: Matt Wilson <msw@amazon.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[msw: split a larger patch from Ben into this cleanup patch] Signed-off-by: Matt Wilson <msw@amazon.com>
Matt Wilson [Fri, 6 Sep 2013 19:52:05 +0000 (12:52 -0700)]
minios: clean up allocation of char arrays used for xenbus paths
This patch cleans up instances of char array allocation where string
lengths were manually counted to use strlen() instead. There are no
functional changes in this patch.
Signed-off-by: Matt Wilson <msw@amazon.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
Matt Wilson [Fri, 6 Sep 2013 19:52:04 +0000 (12:52 -0700)]
minios: correct char array allocation for xenbus paths
The char arrays used to hold xenbus paths have historically been
allocated by manually counting the length of the longest string constants
included in constructing the path. This has led to improperly sized
buffers, both too large (with little consequence) and too small (which
obviously causes problems). This patch corrects the instances where
the length was incorrectly calculated by using strlen() on the longest
string constant used in building a xenbus path.
A follow-on clean-up patch will change all instances to use strlen().
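A minimal sketch of the sizing approach, assuming for illustration that "/event-channel" is the longest suffix ever appended (the macro and function names here are hypothetical, not minios identifiers):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption for this sketch: the longest suffix appended to the node name. */
#define SKETCH_LONGEST_SUFFIX "/event-channel"

/*
 * Allocate a buffer big enough for nodename plus the longest suffix,
 * letting strlen() do the counting instead of a hand-computed constant.
 */
static char *sketch_alloc_path(const char *nodename, const char *suffix)
{
    size_t size = strlen(nodename) + strlen(SKETCH_LONGEST_SUFFIX) + 1;
    char *path;

    if (strlen(suffix) > strlen(SKETCH_LONGEST_SUFFIX))
        return NULL;                /* would not fit: refuse */

    path = malloc(size);
    if (path)
        snprintf(path, size, "%s%s", nodename, suffix);
    return path;
}
```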
Signed-off-by: Ben Cressey <bcressey@amazon.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-By: Samuel Thibault <samuel.thibault@ens-lyon.org>
[msw: split this patch from a larger patch from Ben, reworked to use
strlen()] Signed-off-by: Matt Wilson <msw@amazon.com>
Ian Campbell [Mon, 9 Sep 2013 13:52:35 +0000 (14:52 +0100)]
configure: Regenerate with autoconf 2.69
This is the version from Debian Wheezy which is what both Ian Jackson and
myself run on our workstations. As committers it is useful to minimise
regeneration noise.
This is purely a run of autogen.sh. I have not tried to build the result.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Jackson <ian.jackson@citrix.com>
tools: allow user to specify a system seabios binary
If this option is given don't bother building seabios ourselves.
Likely to be handy for distros who have an existing seabios
package which they want to reuse.
Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz> Acked-by: Ian Campbell <ian.campbell@citrix.com>
The commit dbd1243 "xen/arm: Add helpers to use the device tree" introduced
DT_ROOT_NODE_ADDR_CELLS_DEFAULT which is used for default value when
bad copy from Linux code.
The ePAPR (section 2.3.5) says: "If missing, a client program should assume a
default value of 2 for #address-cells, and a value of 1 for #size-cells."
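The defaulting rule quoted above can be sketched as follows. The macro and function names are hypothetical and the property is assumed to have been converted to host byte order already; only the default values 2 and 1 come from the quoted text.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ADDR_CELLS_DEFAULT 2
#define SKETCH_SIZE_CELLS_DEFAULT 1

/*
 * 'prop' points at the property value, or is NULL when the property is
 * missing from the node, in which case the default applies.
 */
static uint32_t sketch_cells(const uint32_t *prop, uint32_t dflt)
{
    return prop ? *prop : dflt;
}
```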
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Matthew Daley [Tue, 3 Sep 2013 13:12:59 +0000 (01:12 +1200)]
tools: build debug qemu-xen in debug tools builds
When building tools in debug mode (debug=y), pass --enable-debug when
configuring qemu-xen to enable some debug support (namely, to prevent
symbols from being stripped).
Olaf Hering [Tue, 27 Aug 2013 13:43:43 +0000 (15:43 +0200)]
hotplug/Linux: add sysconfig tags to xencommons
YaST2 sysconfig can logically group the various sysconfig settings if the
files are tagged. Add the missing (YaST specific) tags to xencommons.
For a description, see
http://old-en.opensuse.org/Packaging/SUSE_Package_Conventions/Sysconfig
Jan Beulich [Mon, 9 Sep 2013 12:36:54 +0000 (14:36 +0200)]
x86/xsave: fix migration from xsave-capable to xsave-incapable host
With CPUID features suitably masked this is supposed to work, but was
completely broken (i.e. the case wasn't even considered when the
original xsave save/restore code was written).
First of all, xsave_enabled() wrongly returned the value of
cpu_has_xsave, i.e. not even taking into consideration attributes of
the vCPU in question. Instead this function ought to check whether the
guest ever enabled xsave support (by writing a [non-zero] value to
XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no longer
be initialized to XSTATE_FP_SSE (since that's a valid value a guest
could write to XCR0), and the xsave/xrstor as well as the context
switch code need to suitably account for this (by always enforcing at
least this part of the state to be saved/loaded).
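A rough sketch of the distinction drawn above, under the assumption that "the guest ever enabled xsave" can be read off a non-zero accumulated XCR0 value. The structure and function names are invented for this example; the XCR0 bit positions (x87 at bit 0, SSE at bit 1) are architectural.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_XSTATE_FP     (1ULL << 0)
#define SKETCH_XSTATE_SSE    (1ULL << 1)
#define SKETCH_XSTATE_FP_SSE (SKETCH_XSTATE_FP | SKETCH_XSTATE_SSE)

struct sketch_vcpu {
    uint64_t xcr0;        /* current guest XCR0, 0 until the guest writes it */
    uint64_t xcr0_accum;  /* OR of every value the guest has written */
};

/* A per-vCPU property, not a statement about what the host CPU supports. */
static bool sketch_xsave_enabled(const struct sketch_vcpu *v)
{
    return v->xcr0_accum != 0;
}

/* FP/SSE state is always saved and loaded, even while xcr0_accum is zero. */
static uint64_t sketch_save_mask(const struct sketch_vcpu *v)
{
    return v->xcr0_accum | SKETCH_XSTATE_FP_SSE;
}
```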
This involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add
strictly sanity check for XSAVE/XRSTOR") - we need to cleanly
distinguish between hardware capabilities and vCPU used features.
Next both HVM and PV save code needed tweaking to not always save the
full state supported by the underlying hardware, but just the parts
that the guest actually used. Similarly the restore code should bail
not just on state being restored that the hardware cannot handle, but
also on inconsistent save state (inconsistent XCR0 settings or size of
saved state not in line with XCR0).
And finally the PV extended context get/set code needs to use slightly
different logic than the HVM one, as here we can't just key off of
xsave_enabled() (i.e. avoid doing anything if a guest doesn't use
xsave) because the tools use this function to determine host
capabilities as well as read/write vCPU state. The set operation in
particular needs to be capable of cleanly dealing with input that
consists of only the xcr0 and xcr0_accum values (if they're both zero
then no further data is required).
While for things to work correctly both sides (saving _and_ restoring
host) need to run with the fixed code, afaict no breakage should occur
if either side isn't up to date (other than the breakage that this
patch attempts to fix).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Yang Zhang <yang.z.zhang@intel.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 9 Sep 2013 12:35:29 +0000 (14:35 +0200)]
EFI: fix tool chain capabilities detection
Commit f5a54e92 ("xen: move some arch CFLAGS into the common Rules.mk")
transformed CFLAGS assignments to CFLAGS-y ones, which collides with
the way xen/arch/x86/efi/Makefile determines whether the tool chain is
usable for an EFI build. Transform the block back to using CFLAGS.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 9 Sep 2013 12:34:12 +0000 (14:34 +0200)]
xmalloc: make whole pages xfree() clear the order field (ab)used by xmalloc()
Not doing this was found to cause problems with sequences of allocation
(multi-page), freeing, and then again allocation of the same page upon
boot when interrupts are still disabled (causing the owner field to be
non-zero, thus making the allocator attempt a TLB flush and, in its
processing, triggering an assertion).
Reported-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
x86: allow guest to set/clear MSI-X mask bit (try 2)
Guest needs the ability to enable and disable MSI-X interrupts
by setting the MSI-X control bit, for a passed-through device.
Guest is allowed to write MSI-X mask bit only if Xen *thinks*
that mask is clear (interrupts enabled). If the mask is set by
Xen (interrupts disabled), writes to mask bit by the guest are
ignored.
Currently, a write to MSI-X mask bit by the guest is silently
ignored.
A likely scenario is where we have an 82599 SR-IOV NIC passed
through to a guest. From the guest if you do
    ifconfig <ETH_DEV> down
    ifconfig <ETH_DEV> up
the interrupts remain masked. On VF reset, the mask bit is set
by the controller. At this point, Xen is not aware that mask is set.
However, interrupts are enabled by VF driver by clearing the mask
bit by writing directly to BAR3 region containing the MSI-X table.
From dom0, we can verify that
interrupts are being masked using 'xl debug-keys M'.
Jan Beulich [Mon, 9 Sep 2013 08:40:11 +0000 (10:40 +0200)]
x86/EFI: properly handle run time memory regions outside the 1:1 map
Namely with PFN compression, MMIO ranges that the firmware may need
runtime access to can live in the holes that get shrunk/eliminated by
PFN compression, and hence no mappings would result from simply
copying Xen's direct mapping table's L3 page table entries. Build
mappings for this "manually" in the EFI runtime call 1:1 page tables.
Use the opportunity to also properly identify (via a forcibly undefined
manifest constant) all the disabled code regions associated with it not
being acceptable for us to call SetVirtualAddressMap().
Jan Beulich [Mon, 9 Sep 2013 08:24:21 +0000 (10:24 +0200)]
SVM: streamline entry.S code
- fix a bogus "test" with zero immediate
- move stuff easily/better done in C into C code
- re-arrange code paths so that no redundant GET_CURRENT() would remain
on the fast paths
- move long latency operations earlier
- slightly defer disabling global interrupts on the VM entry path
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Mon, 9 Sep 2013 08:23:32 +0000 (10:23 +0200)]
VMX: use proper instruction mnemonics if assembler supports them
With the hex byte emission we were taking away a good part of
flexibility from the compiler, as for simplicity reasons these were
built using fixed operands. All half way modern build environments
would allow using the mnemonics (but we can't disable the hex variants
yet, since the binutils around at the time gcc 4.1 got released didn't
support these yet).
I didn't convert __vmread() yet because that would, just like for
__vmread_safe(), imply converting to a macro so that the output operand
can be the caller supplied variable rather than an intermediate one. As
that would require touching all invocation points of __vmread() (of
which there are quite a few), I'd first like to be certain the approach
is acceptable; the main question being whether the now conditional code
might be considered to cause future maintenance issues, and the second
being that of parameter/argument ordering (here I made __vmread_safe()
match __vmwrite(), but one could also take the position that read and
write should use the inverse order of one another, in line with the
actual instruction operands).
Additionally I was quite puzzled to find that all the asm()-s involved
here have memory clobbers - what are they needed for? Or can they be
dropped at least in some cases?
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Mon, 9 Sep 2013 08:20:52 +0000 (10:20 +0200)]
VMX: streamline entry.S code
- move stuff easily/better done in C into C code
- re-arrange code paths so that no redundant GET_CURRENT() would remain
on the fast paths
- move long latency operations earlier
- slightly defer disabling interrupts on the VM entry path
- use ENTRY() instead of open coding it
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Thu, 5 Sep 2013 09:47:03 +0000 (11:47 +0200)]
hvmloader: fix SeaBIOS interface
The SeaBIOS ROM image may validly exceed 128k in size, it's only our
interface code that so far assumed that it wouldn't. Remove that
restriction by setting the base address depending on image size.
Add a check to HVM loader so that too big images won't result in silent
guest failure anymore.
Uncomment the intended build-time size check for rombios, moving it
into a function so that it would actually compile.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Fri, 30 Aug 2013 11:33:33 +0000 (12:33 +0100)]
libvhd: use UTC for VHD timestamp
[ported from xapi-project/blktap a79ac2c05f9 ("XOP-289: use UTC for VHD
timestamps")]
Currently, the local timezone is factored into VHD timestamps due to the
use of mktime(). This breaks "vhd-util check" for VHDs created in one
timezone and then moved westward, which results in "primary footer
invalid: creation time in future" errors.
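One timezone-independent way to compute such a timestamp is sketched below. It assumes time_t counts seconds since the Unix epoch in UTC (as on POSIX systems) and relies on VHD timestamps counting seconds since 2000-01-01 00:00:00 UTC; the names are illustrative and this is not necessarily how the ported patch does it.

```c
#include <stdint.h>
#include <time.h>

/* 2000-01-01 00:00:00 UTC expressed as seconds since the Unix epoch. */
#define SKETCH_VHD_EPOCH_START 946684800LL

/* Convert a UTC Unix time to a VHD timestamp without consulting the local
 * timezone (which is what mktime() would do). */
static uint32_t sketch_vhd_time(time_t unix_utc)
{
    return (uint32_t)((long long)unix_utc - SKETCH_VHD_EPOCH_START);
}
```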
Signed-off-by: Andrei Lifchits <andrei.lifchits@citrix.com>
Andrei no longer works for Citrix but Germano Percossi (ex-colleague of
Andrei) contacted Andrei and confirmed that
1) this work was written on Citrix time,
2) it is OK to have Andrei's SoB.
Remove unused variable "tm".
Signed-off-by: Germano Percossi <germano.percossi@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Tue, 27 Aug 2013 14:22:43 +0000 (15:22 +0100)]
libxl: prefer qdisk over blktap when choosing disk backend
There are some disk formats commonly supported by both qdisk and blktap.
As qdisk is better supported and blktap is unmaintained, we choose qdisk
over blktap whenever possible.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Xi Xiong [Fri, 30 Aug 2013 19:21:56 +0000 (12:21 -0700)]
xend: fix file descriptor leak in pci utilities
A file descriptor leak was detected after creating multiple domUs with
pass-through PCI devices. This patch fixes the issue.
Signed-off-by: Xi Xiong <xixiong@amazon.com> Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message] Signed-off-by: Matt Wilson <msw@amazon.com>
Steven Noonan [Fri, 30 Aug 2013 23:40:42 +0000 (16:40 -0700)]
xend: handle extended PCI configuration space when saving state
Newer PCI standards (e.g., PCI-X 2.0 and PCIe) introduce extended
configuration space which is larger than 256 bytes. This patch uses
stat() to determine the amount of space used to correctly save all of
the PCI configuration space. Resets handled by the xen-pciback driver
don't have this problem, as that code correctly handles saving
extended configuration space.
Signed-off-by: Steven Noonan <snoonan@amazon.com> Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message] Signed-off-by: Matt Wilson <msw@amazon.com>
Andre Przywara [Tue, 3 Sep 2013 14:00:52 +0000 (16:00 +0200)]
pl011: preserve RTS and DTR signal on UART init
Although we do not support hardware flow control in the Xen driver
for the PL011 UART, the other end may be configured to use it.
In this case it waits in vain for the RTS signal to be asserted by
the host and will never transmit any characters.
So we leave RTS and DTR as they had been setup before.
This fixes the UART input on Calxeda Midway, which uses hardware
flow control for the serial-over-LAN functionality.
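The idea of preserving the two modem-control outputs can be sketched as a pure function on the control register value. The bit positions follow the PL011 UARTCR layout (UARTEN bit 0, TXE bit 8, RXE bit 9, DTR bit 10, RTS bit 11); the macro and function names are made up for this example, and the driver's actual init sequence may differ.

```c
#include <stdint.h>

#define SKETCH_CR_UARTEN (1u << 0)   /* UART enable */
#define SKETCH_CR_TXE    (1u << 8)   /* transmit enable */
#define SKETCH_CR_RXE    (1u << 9)   /* receive enable */
#define SKETCH_CR_DTR    (1u << 10)  /* data terminal ready */
#define SKETCH_CR_RTS    (1u << 11)  /* request to send */

/* Compute the control register value for init, carrying over whatever
 * RTS/DTR state was set up before and dropping everything else. */
static uint32_t sketch_pl011_init_cr(uint32_t old_cr)
{
    uint32_t cr = old_cr & (SKETCH_CR_RTS | SKETCH_CR_DTR);

    return cr | SKETCH_CR_RXE | SKETCH_CR_TXE | SKETCH_CR_UARTEN;
}
```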
Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Tim Deegan [Thu, 29 Aug 2013 15:47:12 +0000 (16:47 +0100)]
xen/x86: don't use '.ifnes' in bug frame construction.
Spotted because it breaks the clang build for LLVM <3.2. .ifnes is
not right here as it will choke on a string with embedded quotes.
.ifnb would be better except that LLVM <3.2 doesn't support that either
(and nor does binutils 2.16).
It should be possible to use something like !!msg or !!msg[0] instead
of a separate flag, but I gave up trying to find something that would
make it through CPP, asm() and gas as a usable constant. :|
Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com>
Len Brown [Fri, 30 Aug 2013 09:00:07 +0000 (11:00 +0200)]
x86/mwait_idle: initial C8, C9, C10 support
Allow mwait_idle to utilize C8, C9, C10 when they are present on...
"Fourth Generation Intel(R) Core(TM) Processors", which are based on
Intel(R) microarchitecture code name Haswell.
Signed-off-by: Len Brown <len.brown@intel.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Len Brown [Fri, 30 Aug 2013 08:59:09 +0000 (10:59 +0200)]
x86/mwait_idle: export both C1 and C1E
Here we disable HW promotion of C1 to C1E and export both C1 and C1E
as distinct C-states.
This allows a cpuidle governor to choose a lower latency C-state than
C1E when necessary to satisfy performance and QOS constraints -- and
still save power versus polling.
This also corrects the erroneous latency previously reported for C1E
-- it is 10usec, not 1usec.
Signed-off-by: Len Brown <len.brown@intel.com>
Avoided the effect of changing the meaning of "max_cstate=".
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Len Brown [Fri, 30 Aug 2013 08:58:21 +0000 (10:58 +0200)]
x86/mwait_idle: remove assumption of one C-state per MWAIT flag
Remove the assumption that cstate_tables are indexed by MWAIT flag
values. Each entry identifies itself via its own flags value. This
change is needed to support multiple states that share the same MWAIT
flags.
Note that this can have an effect on what state is described by 'N' on
cmdline max_cstate=N on some systems.
Signed-off-by: Len Brown <len.brown@intel.com>
Avoided the effect of changing the meaning of "max_cstate=".
Drop MWAIT_MAX_NUM_CSTATES (done differently in a prior patch on
Linux).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 30 Aug 2013 08:56:07 +0000 (10:56 +0200)]
x86/xsave: initialization improvements
- properly validate available feature set on APs
- also validate xsaveopt availability on APs
- properly indicate whether the initialization is on the BSP (we
shouldn't be using "cpu == 0" checks for this)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
I realise that this patch causes a change to the public headers. However I
feel it is justified as:
* All toolstacks used to have to embed the magic string (and almost certainly
still do)
* If by some miracle a new toolstack has started using the new define, it will
continue to work.
* The only intree consumer of the define is hvmloader itself.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Fri, 30 Aug 2013 08:40:29 +0000 (10:40 +0200)]
hvmloader/smbios: Correctly count the number of tables written
Fixes regression indirectly introduced by c/s 4d23036e709627
That changeset added some smbios tables which were optional based on the
toolstack providing appropriate xenstore keys. The do_struct() macro would
unconditionally increment nr_structs, even if a table was not actually
written.
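A sketch of the counting fix: only bump the counter when a writer actually advanced the output pointer. The writer functions and the macro below are hypothetical simplifications, not the hvmloader code.

```c
#include <stddef.h>

/* Hypothetical table writers: each returns the pointer just past what it
 * wrote. An unchanged pointer means the (optional) table was skipped. */
static char *sketch_write_present(char *p) { *p = 1; return p + 1; }
static char *sketch_write_absent(char *p)  { return p; }

static unsigned int sketch_write_tables(char *start)
{
    char *p = start, *next;
    unsigned int nr_structs = 0;

/* Count a structure only if the writer produced output. */
#define sketch_do_struct(fn) do {   \
        next = (fn)(p);             \
        if (next != p)              \
            nr_structs++;           \
        p = next;                   \
    } while (0)

    sketch_do_struct(sketch_write_present);
    sketch_do_struct(sketch_write_absent);
    sketch_do_struct(sketch_write_present);
#undef sketch_do_struct

    return nr_structs;
}
```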
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Thu, 29 Aug 2013 07:53:07 +0000 (09:53 +0200)]
AMD IOMMU: allow command line overrides for broken IVRS tables
With there being so many systems with broken ACPI tables, and with it
generally being known what's wrong with those tables, give people a
handle to overcome the resulting disabling of their IOMMUs.
Inspired by Linux side patches providing similar functionality.
Jan Beulich [Thu, 29 Aug 2013 07:31:37 +0000 (09:31 +0200)]
AMD IOMMU: add missing checks
For one we shouldn't accept IVHD tables specifying IO-APIC IDs beyond
the limit we support (MAX_IO_APICS, currently 128).
And then we shouldn't memset() a pointer allocation of which failed.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Suravee Suthikulpanit <suravee.suthikulapanit@amd.com>
Jan Beulich [Wed, 28 Aug 2013 15:03:50 +0000 (17:03 +0200)]
x86: AVX instruction emulation fixes
- we used the C4/C5 (first prefix) byte instead of the apparent ModR/M
one as the second prefix byte
- early decoding normalized vex.reg, thus corrupting it for the main
consumer (copy_REX_VEX()), resulting in #UD on the two-operand
instructions we emulate
Also add respective test cases to the testing utility plus
- fix get_fpu() (the fall-through order was inverted)
- add cpu_has_avx2, even if it's currently unused (as in the new test
cases I decided to refrain from using AVX2 instructions in order to
be able to actually run all the tests on the hardware I have)
- slightly tweak cpu_has_avx to more consistently express the outputs
we don't care about (sinking them all into the same variable)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
xen: update tx_ready callback for ARM serial drivers
Type of tx_ready callback got changed to int to facilitate error condition,
but the ARM serial drivers were not modified thus breaking the compilation.
Reported-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
PCI UART: better cope with UART being temporarily unavailable
This happens for example when dom0 disables ioport responses during PCI
subsystem initialisation. If a __ns16550_poll() happens to be scheduled
during that time, Xen hangs. Detect and exit that condition.
Amended ns16550_ioport_invalid function to only check IER register,
which contains 3 reserved (always 0) bits, therefore it's sufficient for
that test.
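The check can be sketched as below, assuming that a port which does not respond reads back as all ones. Since IER has reserved bits that always read as zero on a live UART, a value of 0xff cannot come from a working device. The function name and the register value being passed in directly are simplifications for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* 'ier' is the value just read from the UART's interrupt enable register. */
static bool sketch_ioport_invalid(uint8_t ier)
{
    /* Reserved bits read as zero on a live UART, so all ones means the
     * device is not responding right now. */
    return ier == 0xff;
}
```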
Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Fix inactive timer list corruption on second S3 resume
init_timer cannot be safely called multiple times on same timer since it does memset(0)
on the structure, erasing the auxiliary member used by linked list code. This breaks
inactive timer list in common/timer.c.
Moved resume_timer initialisation to ns16550_init_postirq, so it's only done once.
Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 28 Aug 2013 08:12:36 +0000 (10:12 +0200)]
PCI: centralize parsing of device coordinates in command line options
With yet another case to come in a subsequent patch, it seems time to
do this in a single place rather than hand crafting it in various
scattered around locations.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 28 Aug 2013 08:11:19 +0000 (10:11 +0200)]
AMD IOMMU: also allocate IRTEs for HPET MSI
Omitting this was a blatant oversight of mine in commit 2ca9fbd7 ("AMD
IOMMU: allocate IRTE entries instead of using a static mapping").
This also changes a bogus inequality check into a sensible one, even
though it is already known that this will make HPET MSI unusable on
certain systems (having respective broken firmware). This, however,
seems better than failing on systems with consistent ACPI tables.
The commit 874f76a "PL011: fix reverse logic for interrupt mask register"
introduced a regression on the Versatile Express. The board didn't receive
input correctly.
The timeout interrupt may be asserted when the FIFO is not empty, and no further
data is received over a 32-bit period.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
That changeset moved the watchdog functions from nmi.h to their own
watchdog.h. I thought I had updated all relevant header files and the
compiler was happy as well. However, gdbstub is not even compiled by default,
and I accidentally missed it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 27 Aug 2013 09:28:26 +0000 (11:28 +0200)]
x86/boot: Explicitly clean pcpu stacks in debug builds
This reduces confusion when looking at a hexdump of the pcpu stacks and
wondering where on earth some of the junk was coming from. Also leave some
grep fodder for finding where the BSP switches stack (because it took me
far longer to find than I care to admit to).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Matt Wilson [Tue, 27 Aug 2013 09:23:09 +0000 (11:23 +0200)]
x86/time: remove Cyclone as a platform timer
The Cyclone time source was part of IBM's Summit chipset, which was
only used for 32-bit only ccNUMA and IA-64 machines. Neither of these
are supported by Xen anymore.
Signed-off-by: Matt Wilson <msw@amazon.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 23 Aug 2013 13:07:00 +0000 (15:07 +0200)]
domctl: replace cpumask_weight() uses
In one case it could easily be replaced by range checking the result of
a subsequent operation, and in general cpumask_next(), not always
needing to scan the whole bitmap, is more efficient than the specific
uses of cpumask_weight() here. (When running on big systems, operations
on CPU masks aren't cheap enough to use them carelessly.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 23 Aug 2013 13:06:21 +0000 (15:06 +0200)]
credit1: replace cpumask_empty() uses
In one case it was redundant with the operation it got combined with,
and in the other it could easily be replaced by range checking the
result of a subsequent operation. (When running on big systems,
operations on CPU masks aren't cheap enough to use them carelessly.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Fri, 23 Aug 2013 13:05:39 +0000 (15:05 +0200)]
credit2: replace cpumask_first() uses
... with cpumask_any() or cpumask_cycle().
In one case this also allows elimination of a cpumask_empty() call,
and while doing this I also spotted a redundant use of
cpumask_weight(). (When running on big systems, operations on CPU masks
aren't cheap enough to use them carelessly.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Fri, 23 Aug 2013 13:01:53 +0000 (15:01 +0200)]
un-alias cpumask_any() from cpumask_first()
In order to achieve more symmetric distribution of certain things,
cpumask_any() shouldn't always pick the first CPU (which frequently
will end up being CPU0). To facilitate that, introduce a library-like
function to obtain random numbers.
The per-architecture function is supposed to return zero if no valid
random number can be obtained (implying that if occasionally zero got
produced as random number, it wouldn't be considered such).
As fallback this uses the trivial algorithm from the C standard,
extended to produce "unsigned int" results.
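For reference, the example generator from the C standard yields 15 bits per call. One possible way to widen it to 32 bits is to concatenate several draws, as sketched below; the widening scheme, the 32-bit state and all names are assumptions of this sketch, not a description of the function added by the patch.

```c
#include <stdint.h>

static uint32_t sketch_next = 1;   /* generator state (seed) */

/* The portable example generator from the C standard: 15 bits per call.
 * State is kept in 32 bits here, so arithmetic wraps modulo 2^32. */
static uint32_t sketch_rand15(void)
{
    sketch_next = sketch_next * 1103515245u + 12345u;
    return (sketch_next / 65536u) % 32768u;
}

/* Widen to 32 bits by concatenating three draws (15 + 15 + 2 bits). */
static uint32_t sketch_rand32(void)
{
    uint32_t r = sketch_rand15();

    r = (r << 15) | sketch_rand15();
    r = (r << 2) | (sketch_rand15() & 3u);
    return r;
}
```

A caller wanting "any" CPU out of n candidates could then index with sketch_rand32() % n rather than always taking the first one.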
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Jan Beulich [Fri, 23 Aug 2013 07:22:08 +0000 (09:22 +0200)]
PCI: break MSI-X data out of struct pci_dev_info
Considering that a significant share of PCI devices out there (not the
least the myriad of CPU-exposed ones) don't support MSI-X at all, and
that the amount of data is well beyond a handful of bytes, break this
out of the common structure, at once allowing the actual data to be
tracked to become architecture specific.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 23 Aug 2013 07:19:29 +0000 (09:19 +0200)]
x86: move struct bug_frame instances out of line
Just like Linux did many years ago, move them into a separate (data)
section, such that they no longer pollute instruction caches and TLBs.
Assertion frames, requiring two pointers to be stored, occupy two slots
in the array, with the second slot mimicking a frame the location
pointer of which doesn't match any address within .text or .init.text
(it effectively points back to the slot itself, which - being in a data
section - can't be reached by non-buggy execution).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Fri, 19 Jul 2013 15:20:10 +0000 (16:20 +0100)]
xen: arm: retry trylock if strex fails on free lock.
This comes from the Linux patches 15e7e5c1ebf5 for arm32 and 4ecf7ccb1973 for
arm64 by Will Deacon and Catalin Marinas respectively. The Linux commit message
says:
An exclusive store instruction may fail for reasons other than lock
contention (e.g. a cache eviction during the critical section) so, in
line with other architectures using similar exclusive instructions
(alpha, mips, powerpc), retry the trylock operation if the lock appears
to be free but the strex reported failure.
I have observed this due to register_cpu_notifier containing:
    if ( !spin_trylock(&cpu_add_remove_lock) )
        BUG(); /* Should never fail as we are called only during boot. */
which was spuriously failing.
The ARMv8 variant is taken directly from the Linux patch. For v7 I had to
reimplement since we don't currently use ticket locks.
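The retry rule can be sketched in portable C11, assuming a weak
compare-and-exchange as a stand-in for the ldrex/strex pair (both may fail
spuriously). This is an illustration only, not the assembly from the patch;
sketch_lock_t, sketch_trylock and sketch_unlock are hypothetical names:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock type: 0 means free, 1 means held. */
typedef struct {
    atomic_int locked;
} sketch_lock_t;

bool sketch_trylock(sketch_lock_t *l)
{
    int expected;

    do {
        expected = 0;
        /* Like strex, the weak variant may fail although the lock is free. */
        if ( atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
                                                   memory_order_acquire,
                                                   memory_order_relaxed) )
            return true;
        /*
         * On failure "expected" holds the value that was observed. If it is
         * still 0 the lock looked free and the failure was spurious: retry.
         */
    } while ( expected == 0 );

    return false; /* The lock is genuinely held by someone else. */
}

void sketch_unlock(sketch_lock_t *l)
{
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

With this shape, a trylock on an uncontended lock can only report failure when
the lock was actually observed as held.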
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Fri, 19 Jul 2013 15:20:08 +0000 (16:20 +0100)]
xen/arm64: Assembly optimized bitops from Linux
This patch replaces the previous hashed lock implementation of bitops with
assembly optimized ones taken from Linux v3.10-rc4.
The Linux derived ASM only supports 8 byte aligned bitmaps (which under Linux
are unsigned long * rather than our void *). We do actually have uses of 4
byte alignment (i.e. the bitmaps in struct xmem_pool) which trigger alignment
faults.
Therefore adjust the assembly to work in 4 byte increments, which involved:
- bit offset now bits 4:0 => mask #31 not #63
- use wN register not xN for load/modify/store loop.
There is no need to adjust the shift used to calculate the word offset, the
difference is already accounted for in the #63->#31 change.
NB: Xen's build system cannot cope with the change from .c to .S file,
remove xen/arch/arm/arm64/lib/.bitops.o.d or clean your build tree.
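The 4-byte addressing described above can be sketched in plain C. This is a
non-atomic illustration of the arithmetic only (the real code is an exclusive
load/store loop in assembly), and sketch_set_bit and sketch_test_bit are
hypothetical names:

```c
#include <stdint.h>

/* Set bit "nr" in a bitmap that is only guaranteed to be 4-byte aligned. */
void sketch_set_bit(unsigned int nr, void *addr)
{
    /* Bit offset within the word is bits 4:0 of nr (mask 31, not 63). */
    unsigned int bit = nr & 31;
    /*
     * Byte offset of the containing 32-bit word: clear the bit offset, then
     * shift right by 3 (bits to bytes). The shift is the same as in the
     * 8-byte variant; only the mask differs.
     */
    unsigned int byte_off = (nr & ~31u) >> 3;
    uint32_t *word = (uint32_t *)((unsigned char *)addr + byte_off);

    *word |= UINT32_C(1) << bit;
}

/* Return non-zero if bit "nr" is set. */
int sketch_test_bit(unsigned int nr, const void *addr)
{
    unsigned int byte_off = (nr & ~31u) >> 3;
    const uint32_t *word =
        (const uint32_t *)((const unsigned char *)addr + byte_off);

    return (*word >> (nr & 31)) & 1;
}
```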
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Chen Baozi [Tue, 13 Aug 2013 11:14:24 +0000 (19:14 +0800)]
xen/arm: Add the new OMAP UART driver.
TI OMAP UART introduces some features such as register access modes, which
make its configuration and interrupt handling differ from an 8250 compatible
UART. Thus, we separate this driver from ns16550's implementation.
Signed-off-by: Chen Baozi <baozich@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Chen Baozi [Tue, 13 Aug 2013 11:14:21 +0000 (19:14 +0800)]
xen: rename ns16550-uart.h to 8250-uart.h and fix some typos
Since UARTs on OMAP5 & Allwinner's SoC are not ns16550 but only 8250
compatible, rename ns16550-uart.h to 8250-uart.h, which is a more generic
name. At the same time, fix some typos, namely redundant UART_ prefixes
in some macros.
Andre Przywara [Thu, 22 Aug 2013 07:40:54 +0000 (09:40 +0200)]
ARM: fix const declaration of platform struct
As Julien pointed out the other day, the data type for the platform
DT name match struct is wrong.
To be really immutable, we have to use "const char * const".
Fix it on the three currently existing platforms.
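To illustrate the difference, here is a small sketch with made-up compatible
strings and hypothetical names (sketch_dt_compat, sketch_is_compatible); it is
not the platform code touched by the patch:

```c
#include <stddef.h>
#include <string.h>

/*
 * "const char *tbl[]" only protects the characters; the pointers in the
 * array could still be reassigned. "const char * const" makes both the
 * strings and the array elements read-only.
 */
static const char * const sketch_dt_compat[] = {
    "vendor,board-a", /* made-up compatible strings */
    "vendor,board-b",
    NULL
};

/* Return 1 if "name" matches one of the table entries, 0 otherwise. */
int sketch_is_compatible(const char *name)
{
    const char * const *c;

    for ( c = sketch_dt_compat; *c != NULL; c++ )
        if ( strcmp(*c, name) == 0 )
            return 1;

    return 0;
}
```

With the second const in place, an assignment such as
sketch_dt_compat[0] = "x"; is rejected at compile time.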
Signed-off-by: Andre Przywara <andre.przywara@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Juergen Gross [Thu, 22 Aug 2013 09:24:00 +0000 (11:24 +0200)]
Correct X2-APIC HVM emulation
commit 6859874b61d5ddaf5289e72ed2b2157739b72ca5 ("x86/HVM: fix x2APIC
APIC_ID read emulation") introduced an error for the hvm emulation of
x2apic. Any attempt to write to the APIC_ICR MSR will result in a GP fault.
Yang Zhang [Thu, 22 Aug 2013 08:59:01 +0000 (10:59 +0200)]
Nested VMX: Update APIC-v(RVI/SVI) when vmexit to L1
If APIC-v is enabled, all interrupts to L1 are delivered through APIC-v.
But while L2 is running, an external interrupt will cause an L1 vmexit with
reason "external interrupt". L1 will then pick up the interrupt through
vmcs12. When L1 acks the interrupt, the APIC-v hardware will still perform
the vEOI update, since APIC-v is enabled while L1 is running. The problem
is that the interrupt was not delivered through the APIC-v hardware, which
means SVI/RVI/vPPR are not set, yet the hardware requires them when doing
the vEOI update. The solution is that, when L1 tries to pick up the
interrupt from vmcs12, the hypervisor updates SVI/RVI/vPPR so that the
subsequent vEOI and vPPR updates work correctly.
Also, since the interrupt is delivered through vmcs12, the APIC-v hardware
will not clear vIRR, so the hypervisor needs to clear it before L1 runs.
Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com> Acked-by: "Dong, Eddie" <eddie.dong@intel.com>