Ian Campbell [Wed, 14 Jan 2015 16:16:50 +0000 (16:16 +0000)]
xen: arm: enable perf counters
As well as the existing common perf counters add a bunch of ARM
specifics, including the various trap types, vuart/vgic/vtimer
accesses and different types of interrupt.
Adjust the common code so that the columns line up again, not sure
when/where this went wrong.
This is mostly the set of stuff I happened to be interested in, it can
be made more (or less) fine grained as the need arises.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Julien Grall <julien.grall@linaro.org>
Julien Grall [Wed, 14 Jan 2015 18:00:43 +0000 (18:00 +0000)]
xen/arm: Blacklist the memory mapped timer (armv7-timer-mem)
Some platforms (such as the VFP Base AEMv8 model) have a memory mapped
timer. We don't want DOM0 to use this timer rather than the generic ARM
timer, so blacklist it for all platforms.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Tue, 13 Jan 2015 18:17:21 +0000 (18:17 +0000)]
xen/arm: grant-table: Increased the initial number of grant frame to 4
When a domain is created on ARM, the grant table code initializes only one
grant frame. With a basic load (e.g. disk usage), Xen quickly has to
expand the number of frames:
(XEN) grant_table.c:1305:d2v0 Expanding dom (2) grant table from (1) to (2) frames.
(XEN) grant_table.c:1305:d2v0 Expanding dom (2) grant table from (2) to (3) frames.
(XEN) grant_table.c:311:d0v0 Increased maptrack size to 2 frames
(XEN) grant_table.c:311:d0v0 Increased maptrack size to 3 frames
The x86 code initializes 4 frames (I didn't find the exact reason). I
think we could use the same default value.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Martin Lucina [Thu, 4 Dec 2014 13:33:53 +0000 (14:33 +0100)]
Mini-OS: netfront: Fix rx ring starvation in network_rx
In network_rx() we must push the same amount of requests back onto the
ring in the second loop that we consumed in the first loop. Otherwise
the ring will eventually starve itself of free request slots and no
packets will be delivered.
Further, we make the HAVE_LIBC codepath clearer to follow by removing
the "some" variable from the for loop initialisation and conditions.
Signed-off-by: Martin Lucina <martin@lucina.net> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
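The invariant described in this commit (refill exactly as many request slots as were consumed) can be illustrated with a toy model. This is only a sketch: the struct and function names are hypothetical and are not taken from the Mini-OS netfront code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of an rx ring; not the Mini-OS netfront code. */
struct toy_ring {
    size_t size;    /* total request slots in the ring */
    size_t posted;  /* requests currently handed to the backend */
};

/* First loop: consume up to 'avail' responses; returns how many were taken. */
static size_t toy_consume(struct toy_ring *r, size_t avail)
{
    size_t n = avail < r->posted ? avail : r->posted;

    r->posted -= n;
    return n;
}

/* Second loop: push back exactly as many requests as were consumed. */
static void toy_refill(struct toy_ring *r, size_t consumed)
{
    assert(r->posted + consumed <= r->size);
    r->posted += consumed;
}

/* One receive pass; returns the number of packets delivered. */
static size_t toy_rx_pass(struct toy_ring *r, size_t avail)
{
    size_t consumed = toy_consume(r, avail);

    toy_refill(r, consumed);
    return consumed;
}
```

If the refill count were smaller than the consumed count, `posted` would shrink on every pass until no slots were left for the backend to fill.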
tools: xl: refactor code to parse network device options
This patch removes duplicate code in /tools/libxl/xl_cmdimpl.c by
adding parse_nic_config function. This function parses configuration
data and adds the information into libxl_device_nic struct. It is
called in both main_networkattach and parse_config_data functions
to replace duplicate code.
Signed-off-by: Alexandra Sandulescu <alecsandra.sandulescu@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Uma Sharma [Mon, 20 Oct 2014 21:45:10 +0000 (03:15 +0530)]
tools/xl: Call init function for libxl_bitmap
This patch calls the init function for libxl_bitmap in
main_cpupoolnumasplit() and vcpuset() in
tools/libxl/xl_cmdimpl.c.
IDL generated libxl types should be used only after calling the init
function even if the variable is simply being passed by reference as
an output parameter to a libxl function
Signed-off-by: Uma Sharma <uma.sharma523@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Uma Sharma [Mon, 20 Oct 2014 21:42:11 +0000 (03:12 +0530)]
tools/xl: Call init function for libxl_domain_sched_params
This patch calls init function for libxl_domain_sched_params before
passing it as reference to sched_domain_get() function in
tools/libxl/xl_cmdimpl.c
IDL generated libxl types should be used only after calling the init
function even if the variable is simply being passed by reference as
an output parameter to a libxl function
Signed-off-by: Uma Sharma <uma.sharma523@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Mon, 1 Dec 2014 15:33:28 +0000 (15:33 +0000)]
libxl: add emacs local variables in libxl_{x86, arm}.c
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Thu, 8 Jan 2015 13:34:22 +0000 (13:34 +0000)]
tools/misc: Cleanup makefile
The existing makefile was awkward with needing to express conditional
inclusion for both the build and install rules, and contained both split and
unsplit long lines.
The INSTALL_* rules now contain the conditional inclusion information, while
the TARGET_* rules generate the build list from the complete install list,
less the minority of scripts which simply need copying into place. Comments
are introduced to aid clarity.
In addition, collect the CFLAGS expressions, remove the unreferenced and empty
HDRS list, and restrict the libxc internals build fix to the offending
binaries.
No functional change as a result of this patch, but it is now rather simpler
to add or remove utilities.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Euan Harris [Mon, 1 Dec 2014 14:21:05 +0000 (14:21 +0000)]
tools/Rules.mk: Don't optimize debug builds; add macro debugging information
Tools debug builds are built with optimization level -O1, inherited from
the CFLAGS definition in StdGNU.mk. Optimizations confuse the debugger,
and the comment justifying -O1 in StdGNU.mk should not apply for a
userspace library. Disable optimization by appending -O0 to CFLAGS,
which overrides the -O1 flag specified earlier.
Also specify -g3, to add macro debugging information which allows
gdb to expand macro invocations. This is useful as libxl uses many
non-trivial macros.
Signed-off-by: Euan Harris <euan.harris@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- inserted a clarifying "enable" into comment ]
Julien Grall [Fri, 9 Jan 2015 15:56:45 +0000 (15:56 +0000)]
libxl: Don't ignore error when we fail to give access to ioport/irq/iomem
If we fail to grant the access, the domain is unlikely to work correctly,
so we should bail out at the first error.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Fri, 9 Jan 2015 16:13:02 +0000 (16:13 +0000)]
libxl/arm: Correctly spelled FDT_ERR_* in a comment
Signed-off-by: Julien Grall <julien.grall@linaro.org> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Thu, 8 Jan 2015 13:46:53 +0000 (13:46 +0000)]
libxl, hotplug/Linux: default to phy backend for raw format file, take 2
This patch resurrects 11a63a166. The previous patch had a bug: the
wrong "physical-device" was written to xenstore, causing block script
execution to fail. This patch fixes that problem.
Following configurations have been tested:
1. Raw file and PV
2. Raw file and HVM
3. Block device and PV
4. Block device and HVM
Creation / destruction / local migration all worked.
Modify libxl and hotplug script to allow raw format file to use phy
backend.
The block script now tests the path to determine the actual type of
file (block device or regular file), then uses that type to
determine which branch to run.
With these changes, plus the current ordering of backend preference (phy
> qdisk > tap), we will use phy backend for raw format file by default.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Thu, 8 Jan 2015 11:53:56 +0000 (11:53 +0000)]
dt-uart: support /chosen/stdout-path property.
ePAPR v1.1 section 3.5 defines the /chosen/stdout-path property to
refer to the device to be used for boot console output, so if no
dtuart property is given try to use that instead. This will make Xen
find a suitable console by default on DT platforms which include this
property.
As it happens the dtuart option has the exact same syntax as
stdout-path, so we can just copy the value into that buffer if it is
empty. If the string is too large for the buffer we truncate and warn
but continue in the hopes that enough of the path survived (i.e. only
the options part was dropped) to get something out.
FWIW support for this was added to Linux in v3.19-rc1 (7914a7c5651a
"of: support passing console options with stdout-path") and a fairly
large number of the dts files shipped with Linux have already included
a stdout-path property for quite a while now.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Julien Grall <julien.grall@linaro.org>
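The "copy if empty, truncate and warn" behaviour described in this commit could look roughly like the sketch below. The helper name and its signature are hypothetical; the real Xen code may be structured quite differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if 'opt' is empty, copy 'stdout_path' into it.
 * Returns true when the value had to be truncated to fit, so that the
 * caller can warn and carry on.
 */
static bool fallback_to_stdout_path(char *opt, size_t opt_len,
                                    const char *stdout_path)
{
    int needed;

    assert(opt_len > 0);
    if (opt[0] != '\0' || stdout_path == NULL)
        return false;   /* an explicit option wins, or nothing to copy */

    /* snprintf() always NUL-terminates and reports the untruncated length. */
    needed = snprintf(opt, opt_len, "%s", stdout_path);
    return needed < 0 || (size_t)needed >= opt_len;
}
```

Truncation only loses the tail of the string, which is why there is some hope that the path itself survives and only the options are dropped.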
Ian Campbell [Thu, 8 Jan 2015 11:53:55 +0000 (11:53 +0000)]
dt-uart: use ':' as separator between path and options
',' is a valid character in a device-tree path (see ePAPR v1.1 Table
2-1), in fact ',' is actually pretty common in node names.
Using ',' as a separator breaks for example on fast models. If you use
the full path (/smb/motherboard/iofpga@3,00000000/uart@090000) rather
than the alias then earlyprintk gives:
(XEN) Looking for UART console /smb/motherboard/iofpga@3
(XEN) Unable to find device "/smb/motherboard/iofpga@3"
(XEN) Bad console= option 'dtuart'
I actually noticed this on Jetson where the uart is
"/serial@0,70006300" and there happened to be no alias defined.
Instead use ':' as the separator, it is defined to terminate the path
in the context of /chosen/stdout-path (Table 3-4) which is pretty
closely analogous to the dtuart= option and so makes a pretty good
choice (especially since the next patch adds support for stdout-path).
Since no DT aware driver currently supports any options there is no
point in retaining support for ',' for backwards compatibility.
Additionally, expand the buffer for the dtuart option, a path can be
far longer than 30 characters (in fact the maximum size of a single
node name is 31, so it's not even necessarily enough for an alias).
256 is completely arbitrary and allows for paths at least 8 deep even
with worst case node names.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Julien Grall <julien.grall@linaro.org>
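A minimal sketch of splitting a dtuart-style argument at ':' rather than ',' follows. The function name is hypothetical and this is not the actual Xen parser; it only shows why a ',' inside a node name no longer cuts the path short.

```c
#include <assert.h>
#include <string.h>

/*
 * Split "path:options" in place at the first ':'.
 * Returns a pointer to the options, or NULL if there are none.
 * Any ',' in the path is left alone, as it is valid in node names.
 */
static char *split_dtuart(char *arg)
{
    char *sep;

    assert(arg != NULL);
    sep = strchr(arg, ':');
    if (sep == NULL)
        return NULL;
    *sep = '\0';        /* terminate the path part */
    return sep + 1;     /* options start after the separator */
}
```

With this split, "/serial@0,70006300" is kept whole, whereas a ',' separator would have stopped at "/serial@0".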
Ian Campbell [Thu, 8 Jan 2015 11:53:54 +0000 (11:53 +0000)]
dt-uart: Clarify log messages at init time.
- Don't log at all if console=dtuart (the default) was not present, in
that case the user has asked for something else, no need for every
other driver to tell them this.
- Use "dtuart" in all other messages, rather than just "console" or
"uart".
- Be more explicit if we are exiting because dtuart= wasn't given.
- Log the options which we've parsed.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Thu, 8 Jan 2015 10:57:47 +0000 (10:57 +0000)]
tools: libxl: directly initialise saved_* in _libxl_types.c
Coverity complains:
> /tools/libxl/_libxl_types.c: 9194 in libxl__device_channel_parse_json()
> 9188 }
> 9189 x = libxl__json_map_get("connection.socket", o, JSON_MAP);
> 9190 if (x) {
> 9191 libxl_device_channel_init_connection(p, LIBXL_CHANNEL_CONNECTION_SOCKET);
> 9192 {
> 9193 const libxl__json_object *saved_path = NULL;
> >>> CID 1261758: Unused value (UNUSED_VALUE)
> >>> Value from "x" is assigned to "saved_path" here, but that
> >>> stored value is not used before it is overwritten.
> 9194 saved_path = x;
> 9195 x = libxl__json_map_get("path", x, JSON_STRING | JSON_NULL);
> 9196 if (x) {
> 9197 rc = libxl__string_parse_json(gc, x, &p->u.socket.path);
> 9198 if (rc)
> 9199 goto out;
Which we can avoid by initialising saved_%s as we define it. Resulting
in numerous instances of the generated code changing like this:
if (x) {
libxl_channelinfo_init_connection(p, LIBXL_CHANNEL_CONNECTION_PTY);
{
- const libxl__json_object *saved_path = NULL;
- saved_path = x;
+ const libxl__json_object *saved_path = x;
x = libxl__json_map_get("path", x, JSON_STRING | JSON_NULL);
if (x) {
rc = libxl__string_parse_json(gc, x, &p->u.pty.path);
CID: 1261758, 1261759 (and I would have expected others, but not
seeing them for some reason).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Mon, 12 Jan 2015 14:42:53 +0000 (15:42 +0100)]
arm64/EFI: minor corrections
- don't bail when using the last slot of bootinfo.mem.bank[] (due to
premature incrementing of the array index)
- GUIDs should be static const (and placed into .init.* whenever
possible)
- PrintErrMsg() issues a CR/LF pair itself - no need to explicitly
append one to the message passed to the function
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
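The first item above (the last array slot being rejected because the index was incremented too early) is a common off-by-one pattern. A sketch of the corrected ordering, using a hypothetical toy structure rather than the real bootinfo layout:

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NR_BANKS 4  /* hypothetical capacity */

struct toy_banks {
    unsigned int nr;                    /* number of slots in use */
    unsigned long start[TOY_NR_BANKS];
};

/* Returns 0 on success, -1 when every slot is already in use. */
static int toy_add_bank(struct toy_banks *b, unsigned long start)
{
    assert(b != NULL);
    if (b->nr >= TOY_NR_BANKS)
        return -1;              /* check before touching the slot */
    b->start[b->nr] = start;
    b->nr++;                    /* increment only after the store */
    return 0;
}
```

Incrementing `nr` before the bounds check would make the check fail one entry early, leaving the last slot unusable.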
Jan Beulich [Mon, 12 Jan 2015 14:41:39 +0000 (15:41 +0100)]
x86/MCE: allow overriding the CMCI threshold
We've had reports of systems where CMCIs would surface at a relatively
high rate during certain periods of time, without them apparently
causing subsequent more severe problems (see Xeon E7-8800/4800/2800
specification clarification SC1). Give the admin a knob to lower the
impact on the system logs.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Christoph Egger <chegger@amazon.de> Acked-by: Liu Jinsong <jinsong.liu@alibaba-inc.com>
Jan Beulich [Mon, 12 Jan 2015 14:41:12 +0000 (15:41 +0100)]
x86emul: tighten CLFLUSH emulation
While for us it's not as bad as it was for Linux, their commit 13e457e0ee ("KVM: x86: Emulator does not decode clflush well", by
Nadav Amit <namit@cs.technion.ac.il>) nevertheless points out two
shortcomings in our code: opcode 0F AE /7 is clflush only when it uses
a memory mode (otherwise it's SFENCE) and when there's no REP prefix
(an operand size prefix is fine, as that's CLFLUSHOPT).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
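The decode rule stated in this commit can be written down as a small classifier. This is a sketch only: the enum, the function and the two prefix flags are hypothetical and do not mirror the x86emul source, and the register form is reported as SFENCE regardless of prefixes purely for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum toy_insn { TOY_CLFLUSH, TOY_CLFLUSHOPT, TOY_SFENCE, TOY_OTHER };

/*
 * Classify opcode 0F AE with ModRM.reg == 7.
 * 'has_66' and 'has_rep' are hypothetical flags that a decoder would
 * have set while scanning prefixes.
 */
static enum toy_insn classify_0fae_7(uint8_t modrm, bool has_66, bool has_rep)
{
    assert(((modrm >> 3) & 7) == 7);

    if ((modrm >> 6) == 3)
        return TOY_SFENCE;      /* register form: not a cache flush */
    if (has_rep)
        return TOY_OTHER;       /* REP prefix: not CLFLUSH */
    return has_66 ? TOY_CLFLUSHOPT : TOY_CLFLUSH;
}
```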
Jan Beulich [Mon, 12 Jan 2015 14:40:06 +0000 (15:40 +0100)]
x86: also allow REP STOS emulation acceleration
While the REP MOVS acceleration appears to have helped qemu-traditional
based guests, qemu-upstream (or really the respective video BIOSes)
doesn't appear to benefit from that. Instead the acceleration added
here provides a visible performance improvement during very early HVM
guest boot.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Chao Peng [Fri, 9 Jan 2015 16:35:43 +0000 (17:35 +0100)]
x86: expose CMT L3 event mask to user space
L3 event mask indicates the event types supported in host, including
cache occupancy event as well as local/total memory bandwidth events
for Memory Bandwidth Monitoring(MBM). Expose it so all these events
can be monitored in user space.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 9 Jan 2015 16:32:54 +0000 (17:32 +0100)]
hvmloader: avoid named helper symbols
Newer iasl validly complains that such routines would otherwise need to
be marked Serialized (in the SSDT case it can't know that explicit
serialization is being enforced), which is undesirable. Use Local<N>
instead.
Reported-by: Ian Campbell <Ian.Campbell@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Fri, 9 Jan 2015 16:29:44 +0000 (17:29 +0100)]
x86/HVM: vMSI simplification
- struct msixtbl_entry's table_len field can be unsigned int, and by
moving it down a little the structure size can be reduced slightly
- a disjoint xmalloc()/memset() pair can be converted to xzalloc()
- a pointless local variable can be dropped
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
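The second item above replaces a separate allocate-then-clear pair with a single zeroing allocation. A userspace analogue is sketched below with malloc()/memset() versus calloc(); the hypervisor's own xmalloc()/xzalloc() are not used here, and the struct is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct toy_entry {          /* hypothetical stand-in structure */
    void *ptr;
    unsigned int table_len;
};

/* Before: allocation and zeroing as two separate steps. */
static struct toy_entry *alloc_two_step(void)
{
    struct toy_entry *e = malloc(sizeof(*e));

    if (e != NULL)
        memset(e, 0, sizeof(*e));
    return e;
}

/* After: one call that returns zeroed memory. */
static struct toy_entry *alloc_zeroed(void)
{
    return calloc(1, sizeof(struct toy_entry));
}

/* Both variants yield byte-for-byte identical objects. */
static void check_equivalent(void)
{
    struct toy_entry *a = alloc_two_step();
    struct toy_entry *b = alloc_zeroed();

    assert(a != NULL && b != NULL);
    assert(memcmp(a, b, sizeof(*a)) == 0);
    free(a);
    free(b);
}
```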
Jan Beulich [Fri, 9 Jan 2015 16:26:31 +0000 (17:26 +0100)]
x86/HVM: clobber hypercall arguments just like for PV
Unused arguments get clobbered before the call (not affecting caller
visible state), while used arguments get clobbered afterwards unless
a continuation is needed (affecting caller visible state).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 9 Jan 2015 16:25:55 +0000 (17:25 +0100)]
x86: streamline hypercall_create_continuation()
- drop clearing of excessive multicall arguments in compat case (no
longer needed now that hypercall_xlat_continuation() only checks the
actual arguments)
- latch current into a local variable
- use the cached value of hvm_guest_x86_mode() instead of re-executing
it
- scope restrict "regs"
- while at it, convert the remaining two argument checking BUG_ON()s in
hypercall_xlat_continuation() to ASSERT()s
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 9 Jan 2015 16:24:23 +0000 (17:24 +0100)]
x86/stack: avoid peeking into unmapped guard pages when dumping Xens stack
Currently, Xen's stack tracing and dumping of its own stacks will always
attempt to continue to the top of the primary stack. While this is fine for
99% of cases, it is incorrect when the stack pointer starts on an IST stack.
In particular, the stack analysis functions will wander up from the IST
stacks, through the syscall trampolines and then onto the primary stack. If
MEMORY_GUARD is enabled, this will cause a pagefault when attempting to read
from the guard page. Being an unhandled hypervisor fault, the pagefault
handler will then attempt to dump the stacks, and fall over the same problem.
This change introduces more fine-grained knowledge of the cpu stack layouts,
and introduces different boundaries for whether the stack pointer is on an IST
stack or the primary stack. Stack analysis starting from an IST stack will
now never exceed the stack they start on, and specifically not spill over into
an adjacent IST stack, or the syscall trampoline area.
Juergen Gross [Fri, 9 Jan 2015 16:21:26 +0000 (17:21 +0100)]
x86: expand arch_shared_info to support linear p2m list
The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
currently contains the mfn of the top level page frame of the 3 level
p2m tree, which is used by the Xen tools during saving and restoring
(and live migration) of pv domains and for crash dump analysis. With
three levels of the p2m tree it is possible to support up to 512 GB of
RAM for a 64 bit pv domain.
A 32 bit pv domain can support more, as each memory page can hold 1024
instead of 512 entries, leading to a limit of 4 TB.
To be able to support more RAM on x86-64 switch to an additional
virtual mapped p2m list.
This patch expands struct arch_shared_info with a new p2m list virtual
address, the root of the page table root and a p2m generation count.
The new information is indicated by the domain to be valid by storing
a non-zero value into the page table root member.
To avoid build failures in the tools directory the checked structure
sizes must be adapted, too.
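The 512 GB and 4 TB limits quoted in this commit follow from simple arithmetic. The sketch below assumes 4 KiB pages and p2m entries of 8 bytes (64 bit) or 4 bytes (32 bit), which is what the 512 and 1024 entries per page imply; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096ULL   /* assumed page size in bytes */

/*
 * Bytes of guest RAM covered by a 3-level p2m tree in which every page
 * of the tree holds TOY_PAGE_SIZE / entry_bytes entries.
 */
static uint64_t p2m_tree_capacity(unsigned int entry_bytes)
{
    uint64_t per_page;

    assert(entry_bytes == 4 || entry_bytes == 8);
    per_page = TOY_PAGE_SIZE / entry_bytes;     /* 1024 or 512 */
    return per_page * per_page * per_page * TOY_PAGE_SIZE;
}
```

For 8-byte entries this gives 512^3 pages of 4 KiB, i.e. 2^39 bytes; for 4-byte entries it gives 1024^3 pages, i.e. 2^42 bytes.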
Wei Liu [Wed, 7 Jan 2015 15:23:00 +0000 (15:23 +0000)]
libxl_internal: comment on domain userdata unlock function
Discuss why we need to unlink the file path before closing the fd.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- s/to avoid such/to avoid the following/ as requested by Ian ]
Wei Liu [Wed, 7 Jan 2015 15:22:59 +0000 (15:22 +0000)]
libxl_internal: lock_carefd -> carefd
lock_ prefix is redundant.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Thu, 8 Jan 2015 13:45:33 +0000 (13:45 +0000)]
tools/misc: Remove sbdf2devicepath
This script has become orphaned from the build system, and depends on removed
Xend functionality (xen.util.pci) so can't possibly function now.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Thomas Leonard [Fri, 3 Oct 2014 09:20:51 +0000 (10:20 +0100)]
mini-os: arm: show registers, stack and exception vector on fault
Signed-off-by: Thomas Leonard <talex5@gmail.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
[ ijc -- dropped change to extras/mini-os/ARM-TODO.txt, since the
patch which creates it hasn't been applied yet. ]
Thomas Leonard [Fri, 3 Oct 2014 09:20:48 +0000 (10:20 +0100)]
mini-os: arm: time
Based on an initial patch by Karim Raslan.
Signed-off-by: Karim Allah Ahmed <karim.allah.ahmed@gmail.com> Signed-off-by: Thomas Leonard <talex5@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Liang Li [Fri, 28 Nov 2014 10:52:05 +0000 (18:52 +0800)]
libxc: Expose the 1GB pages cpuid flag to guest
If the hardware supports 1GB pages, expose the feature to the guest by
default. Users don't have to use a 'cpuid= ' option in the config file
to turn it on.
If the guest uses shadow mode, the 1GB pages feature will be hidden from
the guest; this is done in the function hvm_cpuid(). So the change is
okay for the shadow mode case.
Signed-off-by: Liang Li <liang.z.li@intel.com> Signed-off-by: Yang Zhang <yang.z.zhang@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Vijaya Kumar K [Tue, 9 Dec 2014 04:39:55 +0000 (10:09 +0530)]
xen/arm: Manage pl011 uart TX interrupt correctly
In pl011.c, when TX interrupt is received
serial_tx_interrupt() is called to push next
characters. If TX buffer is empty, serial_tx_interrupt()
does not disable TX interrupt and hence pl011 UART
irq handler pl011_interrupt() always sees TX interrupt
status set in MIS register and cpu does not come out of
UART irq handler.
With this patch, mask TX interrupt by writing 0 to
IMSC register when TX buffer is empty and unmask by
writing 1 to IMSC register before sending characters.
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com> Reviewed-by: Tim Deegan <tim@xen.org>
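The mask/unmask discipline described in this commit can be modelled with a plain variable standing in for the IMSC register. Everything below is a hypothetical simulation, not the pl011.c driver; the bit position chosen for the TX interrupt is an assumption made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_TXI (1u << 5)   /* assumed TX interrupt bit in IMSC */

struct toy_uart {
    uint32_t imsc;              /* simulated interrupt mask register */
    unsigned int tx_pending;    /* characters waiting in the TX buffer */
};

/* Queue characters and unmask the TX interrupt before sending. */
static void toy_start_tx(struct toy_uart *u, unsigned int nchars)
{
    u->tx_pending += nchars;
    if (u->tx_pending != 0)
        u->imsc |= TOY_TXI;
}

/* TX interrupt handler: push one character, mask when the buffer drains. */
static void toy_tx_interrupt(struct toy_uart *u)
{
    assert(u->imsc & TOY_TXI);  /* only reachable while unmasked */
    if (u->tx_pending != 0)
        u->tx_pending--;
    if (u->tx_pending == 0)
        u->imsc &= ~TOY_TXI;    /* nothing left: stop further TX interrupts */
}
```

Without the final masking step the interrupt would stay asserted with an empty buffer, which is the livelock the commit describes.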
Paul Durrant [Wed, 7 Jan 2015 10:28:57 +0000 (11:28 +0100)]
x86/viridian: add Partition Reference Time enlightenment
The presence of the partition reference time enlightenment persuades newer
versions of Windows to prefer the TSC as their primary time source. Hence,
if rdtsc is not being emulated and is invariant then many vmexits (for
alternative time sources such as the HPET or reference counter MSR) can
be avoided.
The implementation is not yet complete as no attempt is made to prevent
emulation of rdtsc if the enlightenment is active and guest and host
TSC frequencies differ. To do that requires invasive changes in the core
x86 time code and hence a lot more testing.
This patch avoids the issue by disabling the enlightenment if rdtsc is
being emulated, causing Windows to choose another time source. This is
safe, but may cause a big variation in performance of guests migrated
between hosts of differing TSC frequency. Thus the enlightenment is not
enabled in the default set, but may be enabled to improve guest performance
where such migrations are not a concern.
See section 15.4 of the Microsoft Hypervisor Top Level Functional
Specification v4.0a for details.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Christoph Egger <chegger@amazon.de>
Yu Zhang [Wed, 7 Jan 2015 10:26:44 +0000 (11:26 +0100)]
x86: add a new p2m type - p2m_mmio_write_dm
A new p2m type, p2m_mmio_write_dm, is added to trap and emulate
the write operations on GPU's page tables. Handling of this new
p2m type is similar to that of the existing p2m_ram_ro in most condition
checks, the only difference being the final policy of emulation vs. drop.
For p2m_ram_ro types, write operations will not trigger the device
model, and will be discarded later in __hvm_copy(); while for the
p2m_mmio_write_dm type pages, writes will go to the device model
via ioreq-server.
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com> Signed-off-by: Wei Ye <wei.ye@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org>
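The difference in write handling between the two types can be summarised as a small policy function. The enums and the function are hypothetical illustrations of the behaviour described in this commit, not the hypervisor's own definitions.

```c
#include <assert.h>

enum toy_p2m_type { TOY_RAM_RW, TOY_RAM_RO, TOY_MMIO_WRITE_DM };

enum toy_write_action {
    TOY_WRITE_ALLOW,            /* ordinary writable RAM */
    TOY_WRITE_DISCARD,          /* write is silently dropped */
    TOY_WRITE_TO_DEVICE_MODEL   /* write is forwarded for emulation */
};

/* What happens to a guest write, depending on the page's p2m type. */
static enum toy_write_action toy_write_policy(enum toy_p2m_type t)
{
    switch (t) {
    case TOY_RAM_RW:
        return TOY_WRITE_ALLOW;
    case TOY_RAM_RO:
        return TOY_WRITE_DISCARD;
    case TOY_MMIO_WRITE_DM:
        return TOY_WRITE_TO_DEVICE_MODEL;
    }
    assert(!"unknown p2m type");
    return TOY_WRITE_DISCARD;
}
```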
Yu Zhang [Wed, 7 Jan 2015 10:25:55 +0000 (11:25 +0100)]
x86: add a new p2m type class - P2M_DISCARD_WRITE_TYPES
Currently, the P2M_RO_TYPES bears 2 meanings: one is
"_PAGE_RW bit is clear in their PTEs", and another is
to discard the write operations on these pages. This
patch adds a p2m type class, P2M_DISCARD_WRITE_TYPES,
to bear the second meaning, so we can use this type
class instead of the P2M_RO_TYPES, to decide if a write
operation is to be ignored.
Signed-off-by: Yu Zhang <yu.c.zhang@linux.intel.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Jan Beulich <jbeulich@suse.com>
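Splitting one type class into two is easiest to see with bitmasks. In the sketch below the type names and the membership of each class are assumptions chosen for illustration; only the idea of separating "mapped without write permission" from "writes are discarded" comes from the commit.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_type {
    TOY_T_RAM_RW,
    TOY_T_RAM_LOGDIRTY, /* mapped read-only, but writes must not be lost */
    TOY_T_RAM_RO,       /* mapped read-only, writes are dropped */
    TOY_T_NR
};

#define toy_to_mask(t) (1u << (t))

/* Types whose PTEs have the write bit clear. */
#define TOY_RO_TYPES \
    (toy_to_mask(TOY_T_RAM_LOGDIRTY) | toy_to_mask(TOY_T_RAM_RO))

/* Types for which a write is simply ignored: a subset of the above. */
#define TOY_DISCARD_WRITE_TYPES (toy_to_mask(TOY_T_RAM_RO))

static bool toy_is_readonly(enum toy_type t)
{
    assert(t < TOY_T_NR);
    return (toy_to_mask(t) & TOY_RO_TYPES) != 0;
}

static bool toy_is_discard_write(enum toy_type t)
{
    assert(t < TOY_T_NR);
    return (toy_to_mask(t) & TOY_DISCARD_WRITE_TYPES) != 0;
}
```

A caller deciding whether to drop a write would test the second class, so adding a new read-only-mapped type no longer implies that its writes are discarded.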
Jan Beulich [Wed, 7 Jan 2015 10:13:58 +0000 (11:13 +0100)]
VT-d: don't crash when PTE bits 52 and up are non-zero
This can (and will) be legitimately the case when sharing page tables
with EPT (more of a problem before p2m_access_rwx became zero, but
still possible even now when other than that is the default for a
guest), leading to an unconditional crash (in print_vtd_entries())
when a DMA remapping fault occurs.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Boris Ostrovsky [Wed, 7 Jan 2015 10:12:27 +0000 (11:12 +0100)]
x86/VPMU: Clear last_vcpu when destroying VPMU
We need to make sure that last_vcpu is not pointing to VCPU whose
VPMU is being destroyed. Otherwise we may try to dereference it in
the future, when VCPU is gone.
We have to do this via IPI since otherwise there is a (somewhat
theoretical) chance that between test and subsequent clearing
of last_vcpu the remote processor (i.e. vpmu->last_pcpu) might do
both vpmu_load() and then vpmu_save() for another VCPU. The former
will clear last_vcpu and the latter will set it to something else.
Performing this operation via IPI will guarantee that nothing can
happen on the remote processor between testing and clearing of
last_vcpu.
We should also check for VPMU_CONTEXT_ALLOCATED in vpmu_destroy() to
avoid unnecessary percpu tests and arch-specific destroy ops. Thus
checks in AMD and Intel routines are no longer needed.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Mihai Donțu [Wed, 7 Jan 2015 10:11:27 +0000 (11:11 +0100)]
console: const-ify the arguments for __warn() and __bug()
Both __warn() and __bug() take as first parameter the file name of the
current compilation unit (__FILE__). Mark that parameter as constant to
better reflect that.
Signed-off-by: Mihai Donțu <mdontu@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Wed, 7 Jan 2015 10:10:28 +0000 (11:10 +0100)]
expand x86 arch_shared_info to support linear p2m list
The x86 struct arch_shared_info field pfn_to_mfn_frame_list_list
currently contains the mfn of the top level page frame of the 3 level
p2m tree, which is used by the Xen tools during saving and restoring
(and live migration) of pv domains and for crash dump analysis. With
three levels of the p2m tree it is possible to support up to 512 GB of
RAM for a 64 bit pv domain.
A 32 bit pv domain can support more, as each memory page can hold 1024
instead of 512 entries, leading to a limit of 4 TB.
To be able to support more RAM on x86-64 switch to an additional
virtual mapped p2m list.
This patch expands struct arch_shared_info with a new p2m list virtual
address, the root of the page table root and a p2m generation count.
The new information is indicated by the domain to be valid by storing
a non-zero value into the page table root member.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Olaf Hering [Wed, 7 Jan 2015 10:09:50 +0000 (11:09 +0100)]
use more fixed strings to build the hypervisor
It should be possible to repeatedly build identical sources and get
identical binaries, even on different hosts at different build times.
This fails for xen.gz and xen.efi because current time and buildhost
get included in the binaries.
Provide variables XEN_BUILD_DATE, XEN_BUILD_TIME and XEN_BUILD_HOST
which the build environment can set to fixed strings to get a
reproducible build.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Paul Durrant [Wed, 7 Jan 2015 10:08:49 +0000 (11:08 +0100)]
x86/hvm: extend HVM cpuid leaf with vcpu id
To perform certain hypercalls HVM guests need to use Xen's idea of
vcpu id, which may well not match the guest OS idea of CPU id.
This patch adds vcpu id to the HVM cpuid leaf allowing the guest
to build a mapping.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Jackson [Tue, 6 Jan 2015 16:21:21 +0000 (16:21 +0000)]
configure: Rerun autogen.sh
Various configure scripts have the Xen version built into them by
autoconf. Rerun autogen.sh (on Debian wheezy) so that they all say
4.6. There are no changes other than to doc comments, usage messages,
and so forth.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Tue, 6 Jan 2015 16:18:42 +0000 (16:18 +0000)]
Open Xen 4.6.
* Update README's figlet.
* Remove obsolete 4.3 features paragraph from README (!)
* Update QEMU_UPSTREAM_REVISION to refer simply to `master'
* Update QEMU_TRADITIONAL_REVISION to refer to the actual commit hash.
* Change version in xen/Makefile to 4.6.0-unstable
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:32 +0000 (12:25 +0100)]
tools/hotplug: remove EnvironmentFile from xen-qemu-dom0-disk-backend.service
The referenced Environment file does not exist, and the service file
does not make use of variables anyway.
N.B. If we start honouring env settings for any reason this will
have to be changed.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:31 +0000 (12:25 +0100)]
tools/hotplug: use XENCONSOLED_TRACE in xenconsoled.service
Instead of inventing a new XENCONSOLED_LOG= variable reuse the
existing XENCONSOLED_TRACE= variable in xenconsoled.service.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:30 +0000 (12:25 +0100)]
tools/hotplug: use xencommons as EnvironmentFile in xenconsoled.service
The referenced sysconfig/xenconsoled does not exist. If anything
needs to be specified it has to go into the existing
sysconfig/xencommons file.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:29 +0000 (12:25 +0100)]
tools/hotplug: xendomains.service depends on network
Starting domains during boot will most likely require network for
the local bridge and it may need access to remote filesystems. Add
ordering tags to systemd service file.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:28 +0000 (12:25 +0100)]
tools/hotplug: remove XENSTORED_ROOTDIR from xenstored.service
There is no need to export XENSTORED_ROOTDIR. This variable can be
enabled in sysconfig/xencommons. If the variable is unset xenstored
will automatically use @XEN_LIB_STORED@.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:27 +0000 (12:25 +0100)]
tools/hotplug: remove SELinux options from var-lib-xenstored.mount
Using SELinux mount options by default breaks several systems.
Either the context= mount option is not known at all to the kernel,
as reported for ArchLinux. Or the default value "none" is unknown to
SELinux, as reported for Fedora. In both cases the unit will fail.
The proper place to specify mount options is /etc/fstab. Apparently
systemd is kind enough to use values from there even if Options= or
What= is specified in a .mount file.
Remove XENSTORED_MOUNT_CTX, the reference to a non-existent
EnvironmentFile and trim default Options= for the mount point.
The removed code was first mentioned in the patch referenced below,
with the following description:
...
* Some systems define the selinux context in the systemd Option for
the /var/lib/xenstored tmpfs:
Options=mode=755,context="system_u:object_r:xenstored_var_lib_t:s0"
For the upstream version we remove that and let systems specify
the context on their system /etc/default/xenstored or
/etc/sysconfig/xenstored $XENSTORED_MOUNT_CTX variable
...
It is nowhere stated (on xen-devel) what "Some systems" means, which
is unfortunately common practice in nearly all opensource projects.
http://lists.xenproject.org/archives/html/xen-devel/2014-03/msg02462.html
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Anthony PERARD <anthony.perard@citrix.com> Cc: M A Young <m.a.young@durham.ac.uk> Cc: Luis R. Rodriguez <mcgrof@do-not-panic.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ed Swierk [Tue, 6 Jan 2015 15:21:07 +0000 (15:21 +0000)]
libxl: Fix building libxlu_cfg_y.y with bison 3.0
- Use %lex-param instead of obsolete YYLEX_PARAM to override lex scanner
parameter
- Change deprecated %name-prefix= to %name-prefix
Tested against bison 2.4.1 and 3.0.2.
This is expected to sometimes (depending on timestamps and whether the
bison input files are edited) break building on systems with ancient
versions of bison. Bison 2.4.1 is known to work and was released in
December 2008.
Also, consequently, regenerate bison output files with bison
1:2.5.dfsg-2.1 from Debian wheezy.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Tue, 6 Jan 2015 15:15:15 +0000 (15:15 +0000)]
libxl: Regenerate flex output files
Regenerate libxl_*_l.* with flex 2.5.35-10.1 as in current Debian
wheezy. The differences are trivial: addition of declarations of
xlu__cfg_yyget_column and xlu__cfg_yyset_column, but no code body
changes.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Fri, 19 Dec 2014 11:17:02 +0000 (11:17 +0000)]
EFI: suppress bogus loader warning
This was accidentally lost in commit fbc3d9a220 ("EFI: add
efi_arch_handle_cmdline() for processing commandline"), leading to the
"Unknown command line option" warning being printed whenever options
get passed to the core hypervisor or the Dom0 kernel.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Mihai Donțu [Tue, 6 Jan 2015 12:49:52 +0000 (12:49 +0000)]
x86/HVM: prevent use-after-free when destroying a domain
hvm_domain_relinquish_resources() can free certain domain resources
which can still be accessed, e.g. by HVMOP_set_param, while the domain
is being cleaned up.
This is CVE-2015-0361 / XSA-116.
Signed-off-by: Mihai Donțu <mdontu@bitdefender.com> Tested-by: Răzvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
reset PCI devices on force removal even when QEMU returns error
On do_pci_remove when QEMU returns error, we just bail out early without
resetting the device. On domain shutdown we are racing with QEMU exiting
and most often QEMU closes the QMP connection before executing the
requested command.
In these cases if force=1, it makes sense to go ahead with rest of the
PCI device removal, that includes resetting the device and calling
xc_deassign_device. Otherwise we risk not resetting the device properly
on domain shutdown.
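The control flow described above can be sketched roughly as follows. This is an illustration only: `remove_pci_device`, `struct removal_result` and its fields are hypothetical names, not libxl's, and whether the QEMU error is still reported to the caller after a forced removal is an assumption made here.

```c
#include <stdbool.h>

/* Hypothetical result type: the return code plus whether the reset ran. */
struct removal_result {
    int rc;
    bool device_reset;
};

/*
 * qemu_rc: result of asking QEMU to unplug the device (0 means success).
 * force:   true when the caller asked for a forced removal.
 */
struct removal_result remove_pci_device(int qemu_rc, bool force)
{
    struct removal_result res = { .rc = 0, .device_reset = false };

    if (qemu_rc != 0) {
        res.rc = qemu_rc;      /* assumption: keep reporting the error */
        if (!force)
            return res;        /* old behaviour: bail out, no reset */
    }

    /* The device reset and deassignment would happen here. */
    res.device_reset = true;
    return res;
}
```

With `force` set, a QEMU error no longer prevents the reset step from running; without it, the early return is unchanged.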
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Sun, 21 Dec 2014 11:18:53 +0000 (11:18 +0000)]
xen: arm: correct off-by-one error in consider_modules
By iterating up to <= mi->nr_mods we are running off the end of the boot
modules, but more importantly it causes us to then skip the first FDT reserved
region, meaning we might clobber it.
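A minimal sketch of how such an off-by-one can skip the first reserved region, assuming (as the message implies) that one shared index walks the boot modules first and the reserved regions second. `first_reserved_index` is a hypothetical helper, not the Xen function.

```c
#include <stddef.h>

/*
 * Returns the index of the first reserved region that a second pass would
 * examine, depending on how the bound of the first pass is written.
 */
size_t first_reserved_index(size_t nr_mods, int use_inclusive_bound)
{
    size_t i = 0;

    if (use_inclusive_bound) {
        for (; i <= nr_mods; i++) {
            /* buggy: the last iteration is one past the final module */
        }
    } else {
        for (; i < nr_mods; i++) {
            /* fixed: stops after the last valid module */
        }
    }

    /* The second pass would start at reserved region (i - nr_mods). */
    return i - nr_mods;
}
```

With the inclusive bound the second pass starts at region 1 rather than region 0, so region 0 is never considered.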
Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Reviewed-by: Julien Grall <julien.grall@linaro.org>
Andrew Cooper [Mon, 5 Jan 2015 14:19:58 +0000 (14:19 +0000)]
tools/libxl: Use of init()/dispose() to avoid leaking libxl_dominfo.ssid_label
libxl_dominfo contains a ssid_label pointer which will have memory allocated
for it in libxl_domain_info() if the hypervisor has CONFIG_XSM compiled.
However, the lack of appropriate use of libxl_dominfo_{init,dispose}() will
cause the label string to be leaked, even in success cases.
This was discovered by XenServer's Coverity scanning, and is an issue not
identified by the upstream Coverity Scan.
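The init/dispose pairing can be illustrated with a stand-alone sketch. The types and functions below (`struct dominfo`, `dominfo_init`, `dominfo_fill`, `dominfo_dispose`, `label_length`) are hypothetical stand-ins that only mimic the shape of the libxl pattern; they are not the libxl API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical info struct that owns a heap-allocated label. */
struct dominfo {
    char *ssid_label;
};

void dominfo_init(struct dominfo *info)
{
    info->ssid_label = NULL;
}

void dominfo_dispose(struct dominfo *info)
{
    free(info->ssid_label);          /* free(NULL) is a no-op */
    info->ssid_label = NULL;
}

/* Stand-in for the query that may allocate the label. */
int dominfo_fill(struct dominfo *info, const char *label)
{
    size_t n = strlen(label) + 1;

    info->ssid_label = malloc(n);
    if (!info->ssid_label)
        return -1;
    memcpy(info->ssid_label, label, n);
    return 0;
}

/* Caller: init before use, dispose on every path, including success. */
int label_length(const char *label)
{
    struct dominfo info;
    int len = -1;

    dominfo_init(&info);
    if (dominfo_fill(&info, label) == 0)
        len = (int)strlen(info.ssid_label);
    dominfo_dispose(&info);
    return len;
}
```

Skipping the dispose call on the success path is exactly the kind of leak described: the function returns the right answer while the label string is never freed.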
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Fri, 14 Nov 2014 14:41:38 +0000 (14:41 +0000)]
libxl: Fix if{} nesting in do_pci_remove
do_pci_remove contained this:
if (type == LIBXL_DOMAIN_TYPE_HVM) {
[stuff]
} else if (type != LIBXL_DOMAIN_TYPE_PV)
abort();
{
This is bizarre, and not correct. The effect is that HVM guests end
up running both the proper code and that intended for PV guests. This
causes (amongst other things) trouble when PCI devices are
hot-unplugged from HVM guests.
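A self-contained reproduction of the effect, using hypothetical names (`enum domain_type`, `RAN_HVM`, `RAN_PV`) rather than the libxl ones. The "fixed" variant is one possible corrected shape, not necessarily the one the patch uses.

```c
#include <stdlib.h>

enum domain_type { TYPE_HVM, TYPE_PV };

#define RAN_HVM 1   /* bit set when the HVM path executed */
#define RAN_PV  2   /* bit set when the PV path executed */

/* Same shape as the snippet quoted above. */
int paths_run_buggy(enum domain_type type)
{
    int ran = 0;

    if (type == TYPE_HVM) {
        ran |= RAN_HVM;
    } else if (type != TYPE_PV)
        abort();
    {
        /* A bare block: not attached to any branch, so it always runs. */
        ran |= RAN_PV;
    }
    return ran;
}

/* One possible corrected shape: the PV code sits in its own branch. */
int paths_run_fixed(enum domain_type type)
{
    int ran = 0;

    if (type == TYPE_HVM) {
        ran |= RAN_HVM;
    } else if (type == TYPE_PV) {
        ran |= RAN_PV;
    } else {
        abort();
    }
    return ran;
}
```

For an HVM guest the buggy variant returns both bits set, which matches the description: the proper code runs, and then the PV code runs as well.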
This bug was introduced in abfb006f "tools/libxl: explicitly grant
access to needed I/O-memory ranges".
This is a clear candidate for Xen 4.5, being a bugfix to an important
feature.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Robert Hu <robert.hu@intel.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: Sander Eikelenboom <linux@eikelenboom.it> CC: George Dunlap <George.Dunlap@eu.citrix.com>
Ian Jackson [Mon, 5 Jan 2015 14:31:00 +0000 (14:31 +0000)]
libxl: Initialise CTX->xce in domain suspend, as needed
When executing xl migrate/Remus, the following error can occur:
[root@master xen]# xl migrate 5 slaver
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x1/0x0/1225)
Loading new save file <incoming migration stream> (new xl fmt info 0x1/0x0/12\
)
Savefile contains xl domain config in JSON format
Parsing config from <saved>
Segmentation fault (core dumped)
This is because CTX->xce is used without having been initialized.
The bug was introduced by commit 2ffeb5d7f5d8
libxl: events: Deregister evtchn fd when not needed
which removed the initialization of xce from libxl__ctx_alloc.
In this patch we initialise the CTX->xce before using it. Also, we
adjust the doc comment for libxl__ev_evtchn_* to mention the need to
do so.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Wei Liu <wei.liu2@citrix.com>
Avoid emitting an error message referring to an incorrect or corrupt
container file just because no entry was found for the running CPU.
Additionally switch the order of data validation and consumption in
cpu_request_microcode()'s first loop, and also check the types of
skipped blocks in container_fast_forward().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Fri, 12 Dec 2014 10:24:13 +0000 (10:24 +0000)]
domctl: fix IRQ permission granting/revocation
Commit 545607eb3c ("x86: fix various issues with handling guest IRQs")
wasn't really consistent in one respect: The granting of access to an
IRQ shouldn't assume the pIRQ->IRQ translation to be the same in both
domains. In fact it is wrong to assume that a translation is already/
still in place at the time access is being granted/revoked.
What is wanted is to translate the incoming pIRQ to an IRQ for
the invoking domain (as the pIRQ is the only notion the invoking
domain has of the IRQ), and grant the subject domain access to
the resulting IRQ.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Thu, 11 Dec 2014 10:47:21 +0000 (10:47 +0000)]
x86: don't deliver NMI to PVH Dom0
... for the time being: The mechanism used depends on the domain's use
of the IRET hypercall - which PVH is not using. HVM code (which PVH
uses) will deliver an NMI if it sees v->nmi_pending; however, that
temporary affinity adjustment gets undone in the HYPERVISOR_iret
handler, yet PVH can't call that hypercall.
Also drop two bogus code lines spotted while going through the involved
code paths: Addresses of per-CPU variables can't possibly be NULL, and
the setting of st->vcpu in send_guest_trap()'s MCE case is redundant
with an earlier cmpxchgptr().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
M A Young [Thu, 18 Dec 2014 10:02:16 +0000 (10:02 +0000)]
tools/xl: fix segfault in xl migrate --debug
If differences are found during the verification phase of xl migrate
--debug then it is likely to crash with a segfault because the bogus
pagebuf->pfn_types[pfn] is used in a print statement instead of
pfn_type[pfn] .
Signed-off-by: Michael Young <m.a.young@durham.ac.uk> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
George Dunlap [Tue, 9 Dec 2014 14:04:19 +0000 (14:04 +0000)]
libxl: Tell qemu to use raw format when using a tapdisk
At the moment libxl unconditionally passes the underlying file format
to qemu in the device string. However, when tapdisk is in use,
tapdisk handles the underlying format and presents qemu with
effectively a raw disk. When qemu looks at the tapdisk block device
and doesn't find the image format it was looking for, it will fail.
This effectively means that tapdisk cannot be used with HVM domains at
the moment except for raw files.
Instead, if we're using a tapdisk backend, tell qemu to use a raw file
format.
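The selection rule amounts to something like the following sketch. `enum disk_backend` and `qemu_format_for` are hypothetical names chosen for illustration; the real libxl code builds a longer device string.

```c
/* Hypothetical backend kinds; only the tapdisk case matters here. */
enum disk_backend { BACKEND_QDISK, BACKEND_TAP };

/*
 * Returns the format name to hand to qemu. With a tapdisk backend,
 * tapdisk has already decoded the image, so qemu sees a raw device.
 */
const char *qemu_format_for(enum disk_backend backend,
                            const char *image_format)
{
    if (backend == BACKEND_TAP)
        return "raw";
    return image_format;
}
```

So a VHD image behind tapdisk would be described to qemu as "raw", while the same image behind another backend keeps its own format name.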
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
[ ijc -- nuked extra blank line ]
Wei Liu [Mon, 15 Dec 2014 10:56:24 +0000 (10:56 +0000)]
xl: print message to stdout when (!debug && dryrun)
In commit d36a3734a ("xl: fix migration failure with xl migrate
--debug"), the message is printed to stderr for both debug mode
and dryrun mode. That caused rdname() in xendomains to fail to parse
the domain name, since it expects input from xl's stdout.
So this patch separates those two cases. If xl is running in debug mode,
then the message is printed to stderr; if xl is running in dryrun mode and
debug is not enabled, the message is printed to stdout. This will fix
xendomains and other scripts that use "xl create --dryrun", as well as
not re-introducing the old bug fixed in d36a3734a.
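The two cases reduce to a small decision, sketched here with a hypothetical helper `config_dump_stream`. Returning NULL when neither flag is set is an assumption of this sketch, meaning "print nothing".

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Picks the stream for the config dump:
 *   debug            -> stderr (keeps stdout clean for the migration stream)
 *   !debug && dryrun -> stdout (what scripts parsing xl's output expect)
 *   neither          -> NULL, nothing is printed (assumption)
 */
FILE *config_dump_stream(bool debug, bool dryrun)
{
    if (debug)
        return stderr;
    if (dryrun)
        return stdout;
    return NULL;
}
```

Debug takes precedence when both flags are set, which matches the "(!debug && dryrun)" condition in the subject line.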
Reported-by: Mark Pryor <tlviewer@yahoo.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: M A Young <m.a.young@durham.ac.uk> Cc: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Fri, 12 Dec 2014 18:26:02 +0000 (18:26 +0000)]
docs/commandline: Minor formatting fixes and clarifications
`font` had a trailing single quote which was out of place.
`gnttab_max_frames` was missing escapes for the underscores which caused the
underscores to take their markdown meaning, causing 'max' in the middle to be
italicised. Escape the underscores, and make all command line parameters
bold, to be consistent with the existing style.
Clarify how the default for `nmi` changes between debug and non debug builds
of Xen.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 9 Dec 2014 16:43:22 +0000 (16:43 +0000)]
python/xc: Fix multiple issues in pyflask_context_to_sid()
The error handling from a failed memory allocation should return
PyErr_SetFromErrno(xc_error_obj); rather than simply calling it and continuing
to the memcpy() below, with the dest pointer being NULL.
Coverity also complains about passing a non-NUL terminated string to
xc_flask_context_to_sid(). xc_flask_context_to_sid() doesn't actually take a
NUL terminated string, but it does take a char* which, in context, used to be
a string, which is why Coverity complains.
One solution would be to use strdup(ctx) which is simpler than a
strlen()/malloc()/memcpy() combo, which would result in a NUL-terminated
string being used with xc_flask_context_to_sid().
However, ctx is strictly an input to the hypercall and is not mutated along
the way. Both these issues can be fixed, and the error logic simplified, by
not duplicating ctx in the first place.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Coverity-IDs: 1055305 1055721 Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Xen Coverity Team <coverity@xen.org>
The UART is not able to receive bytes when idle mode is not configured
properly, therefore setup the UART with autoidle and wakeup enabled.
Older Linux kernels (for example 3.8) configure hwmods for all devices
even if the device tree nodes for those devices are absent from the
device tree, thus UART idle mode is configured too. With such kernels we
can work around the issue by adding a fake node in the UART containing this
MMIO range, which is therefore mapped by Xen to dom0, which
reconfigures the UART, causing things to work normally.
Newer Linux Kernels (3.12 and beyond) do not configure idle mode for
UART and so this hack no longer works.
Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- updated commit message as discussed ]
Jan Beulich [Mon, 15 Dec 2014 08:30:05 +0000 (09:30 +0100)]
console: allocate ring buffer earlier
... when "conring_size=" was specified on the command line. We can't
really do this as early as we would want to when the option was not
specified, as the default depends on knowing the system CPU count. Yet
the parsing of the ACPI tables is one of the things that generates a
lot of output especially on large systems.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> (ARM and generic bits) Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Thu, 11 Dec 2014 16:14:07 +0000 (17:14 +0100)]
have architectures specify the number of PIRQs a hardware domain gets
The current value of nr_static_irqs + 256 is often too small for larger
systems. Make it dependent on CPU count and number of IO-APIC pins on
x86, and (until it obtains PCI support) simply NR_IRQS on ARM.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Jan Beulich [Thu, 11 Dec 2014 16:13:04 +0000 (17:13 +0100)]
lock down hypercall continuation encoding masks
Andrew validly points out that even if these masks aren't a formal part
of the hypercall interface, we aren't free to change them: A guest
suspended for migration in the middle of a continuation would fail to
work if resumed on a hypervisor using a different value. Hence add
respective comments to their definitions.
Additionally, to help future extensibility as well as in the spirit of
reducing undefined behavior as much as possible, refuse hypercalls made
with the respective bits non-zero when the respective sub-ops don't
make use of those bits.
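A rough sketch of that kind of check follows. The 6-bit split, the names `CMD_BITS`, `CMD_MASK` and `validate_cmd`, and the choice of -EINVAL as the error are all assumptions made for illustration; they are not taken from the hypervisor headers.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical layout: low bits select the sub-op, upper bits carry
 * continuation state for the sub-ops that need it. */
#define CMD_BITS 6UL
#define CMD_MASK ((1UL << CMD_BITS) - 1)

/*
 * Rejects a command whose upper bits are set when the selected sub-op
 * has no use for them. Returns 0 if acceptable, -EINVAL otherwise.
 */
int validate_cmd(unsigned long cmd, bool subop_uses_upper_bits)
{
    unsigned long upper = cmd & ~CMD_MASK;

    if (upper != 0 && !subop_uses_upper_bits)
        return -EINVAL;
    return 0;
}
```

Refusing stray bits up front leaves them free to be given a meaning later without breaking guests that happened to pass garbage there.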
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Release-Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Ian Jackson [Wed, 26 Nov 2014 17:28:18 +0000 (17:28 +0000)]
libxl: events: Document and enforce actual callbacks restriction
libxl_event_register_callbacks cannot reasonably be called while libxl
is busy (has outstanding operations and/or enabled events).
This is because the previous spec implied (although not entirely
clearly) that event hooks would not be called for existing fd and
timeout interests. There is thus no way to reliably ensure that libxl
would get told about fds and timeouts which it became interested in
beforehand.
So there have to be no such fds or timeouts, which means that the
callbacks must only be registered or changed when the ctx is idle.
Document this restriction, and enforce it with a pair of asserts.
(It would be nicer, perhaps, to say that the application may not call
libxl_osevent_register_hooks other than right after creating the ctx.
But there are existing callers, including libvirt, who do it later -
even after doing major operations such as domain creation.)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Wed, 26 Nov 2014 17:27:27 +0000 (17:27 +0000)]
libxl: events: Deregister evtchn fd when not needed
We want to have no fd events registered when we are idle.
In this patch, deal with the evtchn fd:
* Defer setup of the evtchn handle to the first use.
* Defer registration of the evtchn fd; register as needed on use.
* When cancelling an evtchn wait, or when wait setup fails, check
whether there are now no evtchn waits and if so deregister the fd.
* On libxl teardown, the evtchn fd should therefore be unregistered.
Assert that this is the case.
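The lifecycle in the list above can be modelled in a few lines. Everything here (`struct ctx`, `wait_begin`, `wait_cancel`, `ctx_teardown`) is a hypothetical model that tracks state with booleans; it does not call any real libxl or libxc function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context: tracks the handle, the fd registration and the
 * number of outstanding evtchn waits. */
struct ctx {
    bool handle_open;
    bool fd_registered;
    int nr_waits;
};

void wait_begin(struct ctx *c)
{
    if (!c->handle_open)
        c->handle_open = true;      /* setup deferred to first use */
    if (!c->fd_registered)
        c->fd_registered = true;    /* never register a second time */
    c->nr_waits++;
}

void wait_cancel(struct ctx *c)
{
    assert(c->nr_waits > 0);
    if (--c->nr_waits == 0)
        c->fd_registered = false;   /* last wait gone: deregister */
}

void ctx_teardown(struct ctx *c)
{
    assert(!c->fd_registered);      /* an idle ctx has no fd registered */
    c->handle_open = false;
}
```

The registration check in `wait_begin` corresponds to the v2 bugfix note about not registering the fd multiple times.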
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2: Do not bother putting evtchn_fd in the ctx; instead, get it
from xc_evtchn_fd when we need it. (Cosmetic.)
Do not register the evtchn fd multiple times: check it's not
registered before we call libxl__ev_fd_register. (Bugfix.)