Wei Liu [Mon, 16 Mar 2015 09:52:34 +0000 (09:52 +0000)]
libxl: define LIBXL_HAVE_VNUMA
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:33 +0000 (09:52 +0000)]
libxl: disallow memory relocation when vNUMA is enabled
Disallow memory relocation when vNUMA is enabled, because relocated
memory ends up off node. Furthermore, even if we dynamically expand
node coverage in hvmloader, low memory and high memory may reside
in different physical nodes, so blindly relocating low memory to high
memory gives us a sub-optimal configuration.
Introduce a function called libxl__vnuma_configured and use it.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Konrad Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:32 +0000 (09:52 +0000)]
libxl: build, check and pass vNUMA info to Xen for HVM guest
Transform the user-supplied vNUMA configuration into libxl internal
representations, then into libxc representations. Check validity along
the way.
Libxc has more involvement in building vmemranges in HVM case compared
to PV case. The building of vmemranges is placed after xc_hvm_build
returns, because it relies on memory hole information provided by
xc_hvm_build.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:31 +0000 (09:52 +0000)]
libxc: allocate memory with vNUMA information for HVM guest
The algorithm is more or less the same as the one used for PV guest.
Libxc gets hold of the mapping of vnode to pnode and the size of each
vnode, then allocates memory accordingly.
The function then returns the low memory end, high memory end and mmio
start to the caller. Libxl needs those values to construct vmemranges for
that guest.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:30 +0000 (09:52 +0000)]
libxc: indentation change to xc_hvm_build_x86.c
Move a while loop in xc_hvm_build_x86 one block to the right. No
functional change introduced.
Functional changes will be introduced in the next patch.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:29 +0000 (09:52 +0000)]
libxl: build, check and pass vNUMA info to Xen for PV guest
Transform the user-supplied vNUMA configuration into libxl internal
representations, and finally libxc representations. Check validity of
the configuration along the way.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:28 +0000 (09:52 +0000)]
libxl: functions to build vmemranges for PV guest
Introduce an arch-independent routine to generate one vmemrange per
vnode. Also introduce arch-dependent routines for different
architectures because part of the process is arch-specific -- ARM does
not yet have NUMA support and E820 is x86 only.
For those x86 guests that care about the machine E820 map (i.e. with
e820_host=1), vnode is further split into several vmemranges to
accommodate memory holes. A few stubs for libxl_arm.c are created.
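The splitting of a vnode around a memory hole can be pictured with a small sketch. This is an illustration only, not the libxl implementation: the struct vmemrange layout, the function name split_around_hole and the single-hole assumption are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical range type for illustration; not the libxl/libxc struct. */
struct vmemrange {
    uint64_t start;   /* inclusive */
    uint64_t end;     /* exclusive */
    unsigned int nid; /* owning vnode */
};

/*
 * Lay out `size` bytes for vnode `nid` beginning at `start`, skipping a
 * single hole [hole_start, hole_end). Writes one or two ranges to out[]
 * and returns how many were written.
 */
static int split_around_hole(uint64_t start, uint64_t size,
                             uint64_t hole_start, uint64_t hole_end,
                             unsigned int nid, struct vmemrange out[2])
{
    assert(hole_start <= hole_end);

    if (size == 0)
        return 0;

    /* No hole, or the vnode ends before it: a single range suffices. */
    if (hole_start == hole_end || start + size <= hole_start) {
        out[0] = (struct vmemrange){ start, start + size, nid };
        return 1;
    }

    /* The vnode begins inside or above the hole: never start inside it. */
    if (start >= hole_start) {
        if (start < hole_end)
            start = hole_end;
        out[0] = (struct vmemrange){ start, start + size, nid };
        return 1;
    }

    /* The vnode straddles the hole: the remainder resumes after it. */
    uint64_t below = hole_start - start;
    out[0] = (struct vmemrange){ start, hole_start, nid };
    out[1] = (struct vmemrange){ hole_end, hole_end + (size - below), nid };
    return 2;
}
```

With a 4 GiB vnode at address 0 and a hole covering 3 GiB to 4 GiB, the sketch yields two ranges: 0 to 3 GiB and 4 GiB to 5 GiB, both belonging to the same vnode.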
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:27 +0000 (09:52 +0000)]
libxl: x86: factor out e820_host_sanitize
This function gets the machine E820 map and sanitizes it according to
the PV guest configuration.
This will be used in a later patch. No functional change introduced in
this patch.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:26 +0000 (09:52 +0000)]
libxl: introduce libxl__vnuma_config_check
This function is used to check whether vNUMA configuration (be it
auto-generated or supplied by user) is valid.
Define a new error code ERROR_VNUMA_CONFIG_INVALID.
The checks performed can be found in the comment of the function.
This vNUMA function (and future ones) is placed in a new file called
libxl_vnuma.c
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:25 +0000 (09:52 +0000)]
libxl: add vmemrange to libxl__domain_build_state
A vnode consists of one or more vmemranges (virtual memory range). One
example of multiple vmemranges is that there is a hole in one vnode.
Currently we haven't exported vmemrange interface to libxl user.
Vmemranges are generated during domain build, so we have relevant
structures in domain build state.
Later if we discover we need to export the interface, those structures
can be moved to libxl_domain_build_info as well.
These new fields (along with other fields in that struct) are set to 0
at start of day so we don't need to explicitly initialise them. A
following patch which introduces an independent checking function will
need to access these fields. I don't feel very comfortable squashing
this change into that one so I didn't use a single commit.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:24 +0000 (09:52 +0000)]
libxl: introduce vNUMA types
A domain can contain several virtual NUMA nodes, hence we introduce an
array in libxl_domain_build_info.
libxl_vnode_info contains the size of memory in that node, the distance
from that node to every node, the underlying pnode and a bitmap of
vcpus.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:23 +0000 (09:52 +0000)]
libxc: allocate memory with vNUMA information for PV guest
From libxc's point of view, it only needs to know vnode to pnode mapping
and size of each vnode to allocate memory accordingly. Add these fields
to xc_dom structure.
The caller might not pass in vNUMA information. In that case, a dummy
layout is generated for the convenience of libxc's allocation code. The
upper layer (libxl etc) still sees the domain has no vNUMA
configuration.
Note that from this patch on, a PV x86 guest can have multiple regions
of RAM allocated.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Elena Ufimtseva <ufimtseva@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:22 +0000 (09:52 +0000)]
libxc: add p2m_size to xc_dom_image
Add a new field p2m_size to keep track of the number of pages covered by
p2m. Change total_pages to p2m_size in functions which in fact need
the size of p2m.
This is needed because we are going to ditch the assumption that PV x86
has only one contiguous ram region. Originally the p2m size was always
equal to total_pages, but a later patch will change that.
This patch doesn't change the behaviour of libxc.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 16 Mar 2015 09:52:21 +0000 (09:52 +0000)]
libxc: duplicate snippet to allocate p2m_host array
Currently no in-tree code sets the superpage flag. I would just
remove superpage support if I could, but Konrad wants it retained for the
moment.
As I'm going to change the p2m_host array allocation, duplicate the code
snippet to allocate p2m_host array in this patch, so that we retain the
behaviour in superpage case.
This patch introduces no functional change and it will make future
patches easier to review. Also removed one stray tab while I was there.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Wilk <konrad.wilk@oracle.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Quan Xu [Tue, 17 Mar 2015 01:00:03 +0000 (21:00 -0400)]
stubdom: fix vtpm build failure due to duplicated typedefs.
Typedefs are duplicated in stubdom/vtpmmgr/tcg.h and supported compilers
do not cope with current staging branch.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Quan Xu <quan.xu@intel.com> Tested-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- reworded subject line ]
Andrew Cooper [Mon, 16 Mar 2015 13:29:54 +0000 (13:29 +0000)]
tools/libxl: Adjust datacopiers POLLHUP handling when the fd is also readable
POLLHUP|POLLIN is a valid revent to receive when there is readable data in a
pipe, but the writable fd has been closed. This occurs in migration v2 when
the legacy conversion process (which transforms the data inline) completes and
exits successfully.
In the case that there is data to read, suppress the POLLHUP. POSIX states
that the hangup state is latched[1], which means it will reoccur on subsequent
poll() calls. The datacopier is thus provided the opportunity to read until
EOF, if possible.
A POLLHUP on its own is treated exactly as before, indicating a different
error with the fd.
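The revents rule described above can be sketched in isolation. This is not the libxl datacopier code; the enum, classify_revents and revents_after_writer_close are hypothetical names used only to show that a pipe with buffered data and a closed write end is still readable.

```c
#include <poll.h>
#include <unistd.h>

/* What the caller should do with the fd, given the reported revents. */
enum fd_action { FD_READ, FD_ERROR, FD_IDLE };

/*
 * Hypothetical helper: if the fd is readable, ignore any accompanying
 * POLLHUP for now (the hangup is reported again by later poll() calls).
 * A hangup with nothing to read is treated as an error.
 */
static enum fd_action classify_revents(short revents)
{
    if (revents & POLLIN)
        return FD_READ;
    if (revents & (POLLHUP | POLLERR))
        return FD_ERROR;
    return FD_IDLE;
}

/*
 * Write a few bytes into a pipe, close the write end, and return the
 * revents that poll() reports for the read end (-1 on failure).
 */
static int revents_after_writer_close(void)
{
    int fds[2];
    struct pollfd pfd;
    int rc;

    if (pipe(fds) < 0)
        return -1;
    if (write(fds[1], "data", 4) != 4) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                 /* the writer "exits" */

    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    rc = poll(&pfd, 1, 1000);
    close(fds[0]);
    return rc == 1 ? pfd.revents : -1;
}
```

Feeding the result of revents_after_writer_close() to classify_revents() gives FD_READ, so the buffered data is read before the hangup is acted upon.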
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ross Lagerwall [Mon, 16 Mar 2015 13:29:53 +0000 (13:29 +0000)]
tools/libxl: Extend datacopier to support reading into a buffer
Currently a datacopier may source its data from an fd or local buffer, but its
destination must be an fd. For migration v2, libxl needs to read from the
migration stream into a local buffer.
Implement a "read into local buffer" mode, invoked when readbuf is set and
writefd is -1. On success, the callback passes the number of bytes read.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
[Rewrite commit message] Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ross Lagerwall [Mon, 16 Mar 2015 13:29:52 +0000 (13:29 +0000)]
tools/libxl: Allow limiting amount copied by datacopier
Currently, a datacopier will unconditionally read until EOF on its read fd.
For migration v2, libxl needs to read records of a specific length out of the
migration stream, without reading any further data.
Introduce a parameter, bytes_to_read, which may be used to stop the datacopier
ahead of reaching EOF. If bytes_to_read is set to -1, then the datacopier will
read until EOF.
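A minimal sketch of how such a limit might bound each read request, assuming a signed byte count where a negative value means "no limit". The helper name next_read_len and its parameters are hypothetical, not the libxl interface.

```c
#include <stddef.h>

/*
 * Hypothetical helper: how many bytes to request from the next read(),
 * given the remaining limit and the free space in the current buffer.
 * A negative bytes_to_read (e.g. -1) means "read until EOF".
 */
static size_t next_read_len(long long bytes_to_read, size_t buf_space)
{
    if (bytes_to_read < 0)
        return buf_space;                       /* unlimited */
    if ((unsigned long long)bytes_to_read < buf_space)
        return (size_t)bytes_to_read;           /* stop at the limit */
    return buf_space;                           /* fill the buffer */
}
```

A caller would subtract the number of bytes actually read from the remaining limit and stop once it reaches zero.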
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
[Rewrite commit message] Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ross Lagerwall [Mon, 16 Mar 2015 13:29:51 +0000 (13:29 +0000)]
tools/libxl: Avoid overrunning static buffer with prefixdata
An individual datacopier_buf contains a static buffer of 1000 bytes.
Attempting to add prefixdata of more than 1000 bytes would overrun the buffer
and cause heap corruption.
Instead, split the prefixdata and chain together multiple datacopier buffers.
This allows for an arbitrary quantity of prefixdata to be added to a
datacopier.
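The chaining idea can be sketched as follows. The struct chunk type and the chain_* helpers are hypothetical stand-ins, not the libxl datacopier structures; only the 1000-byte capacity is taken from the text above.

```c
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 1000   /* fixed per-buffer capacity, as in the text */

/* Hypothetical stand-in for a datacopier buffer; not the libxl struct. */
struct chunk {
    struct chunk *next;
    size_t used;
    char data[CHUNK_SIZE];
};

/*
 * Append len bytes to the chain at *head, allocating as many fixed-size
 * chunks as needed. Returns 0 on success, -1 on allocation failure
 * (chunks already linked stay in the chain for the caller to free).
 */
static int chain_append(struct chunk **head, const void *src, size_t len)
{
    const char *p = src;
    struct chunk **link = head;

    while (*link)                  /* find the end of the chain */
        link = &(*link)->next;

    while (len > 0) {
        size_t n = len < CHUNK_SIZE ? len : CHUNK_SIZE;
        struct chunk *c = calloc(1, sizeof(*c));

        if (!c)
            return -1;
        memcpy(c->data, p, n);     /* never more than CHUNK_SIZE bytes */
        c->used = n;
        *link = c;
        link = &c->next;
        p += n;
        len -= n;
    }
    return 0;
}

/* Number of chunks in the chain. */
static size_t chain_count(const struct chunk *c)
{
    size_t n = 0;

    for (; c; c = c->next)
        n++;
    return n;
}

/* Release every chunk in the chain. */
static void chain_free(struct chunk *c)
{
    while (c) {
        struct chunk *next = c->next;

        free(c);
        c = next;
    }
}
```

Appending 2500 bytes to an empty chain produces three chunks holding 1000, 1000 and 500 bytes, so no single buffer is ever overrun.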
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com>
Wen Congyang [Mon, 16 Mar 2015 13:29:50 +0000 (13:29 +0000)]
tools/libxl: Update datacopier to support sending data only
Currently, starting a datacopier requires a valid read and write fd, but this
is a problem when purely sending data from a local buffer to a writable fd.
The prefixdata mechanism already exists and works for inserting data from a
local buffer ahead of reading from the read fd.
Make the lack of a read fd non-fatal. A datacopier with no read fd, but some
prefixdata will write the prefixdata to the write fd and complete successfully.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
[Rewrite commit message] Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Mon, 16 Mar 2015 13:29:49 +0000 (13:29 +0000)]
tools/libxl: Introduce min and max macros
This is the same set used by libxc.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com>
Dario Faggioli [Tue, 17 Mar 2015 14:11:33 +0000 (15:11 +0100)]
make dumping vcpu info look better
and more consistent. In fact, before this change, it looks
like this:
(XEN) VCPU information and callbacks for domain 0:
(XEN) VCPU0: CPU4 [has=F] poll=0 upcall_pend = 00, upcall_mask = 00 dirty_cpus={4} cpu_affinity={0-15}
(XEN) cpu_soft_affinity={0-15}
(XEN) pause_count=0 pause_flags=1
(XEN) No periodic timer
After, it looks like this:
(XEN) VCPU information and callbacks for domain 0:
(XEN) VCPU0: CPU4 [has=F] poll=0 upcall_pend=00 upcall_mask=00 dirty_cpus={4}
(XEN) cpu_hard_affinity={0-15} cpu_soft_affinity={0-15}
(XEN) pause_count=0 pause_flags=1
(XEN) No periodic timer
So, consistently _not_ put space between fields and '=',
and consistently _not_ use ',' as separator. Also, put the
info about affinity on the same line, properly indented.
Dario Faggioli [Tue, 17 Mar 2015 14:11:05 +0000 (15:11 +0100)]
sched_rt: implement the .free_pdata hook
which is called by cpu_schedule_down(), and is necessary
for resetting the spinlock pointers in schedule_data from
the RTDS global runqueue lock, back to the default _lock
fields in the struct.
Not doing so causes Xen to explode, e.g., when removing
pCPUs from an RTDS cpupool and assigning them to another
one.
Daniel De Graaf [Tue, 17 Mar 2015 09:58:40 +0000 (10:58 +0100)]
xsm: add device tree labeling support
This adds support in the hypervisor and policy build toolchain for
Xen/Flask policy version 30, which adds the ability to label ARM device
tree nodes and expands the IOMEM ocontext entries to 64 bits.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Tested-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Tue, 17 Mar 2015 09:55:49 +0000 (10:55 +0100)]
sched: honour generic perf counters in the RTDS scheduler
more specifically, about vCPU initialization and destruction events,
in line with adb26c09f26e ("xen: sched: introduce a couple of counters
in credit2 and SEDF").
EACCES cannot be distinguished against an incorrect DOMCTL_INTERFACE_VERSION,
and will cause an incorrect "need to rebuild the user-space tool set?" message
from libxc.
On the libxc side, put the useful piece of information in the error message,
rather than the -1 from do_domctl().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Ross Lagerwall [Fri, 13 Mar 2015 11:41:51 +0000 (12:41 +0100)]
x86: don't apply reboot quirks if reboot set by user
If reboot= is specified on the command-line, don't apply reboot quirks,
so that the command-line option takes precedence.
This is a port of Linux commit 5955633e91bf ("x86/reboot: Skip DMI
checks if reboot set by user").
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Leverage (and make apply on top of) c643fb110a ("x86/EFI: allow
reboot= overrides when running under EFI").
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
At the point this patch calls domain_update_node_affinity(), the vcpu
hard affinities have not yet been updated; so calling it at this point
can in some circumstances trigger an ASSERT().
domain_update_node_affinity() is already called in
cpu_disable_scheduler(), so adding it to cpupool_unassign_cpu() is
redundant. Simply reverting the patch is sufficient.
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
x86/EFI: allow reboot= overrides when running under EFI
By default we will always use the EFI reboot mechanism when
running on EFI platforms. However some EFI platforms
are buggy and need to use the ACPI mechanism to
reboot (such as Lenovo ThinkCentre M57). As such,
respect the 'reboot=' override and DMI overrides
on EFI platforms.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
- BOOT_INVALID is just zero
- also consider acpi_disabled in BOOT_INVALID resolution
- duplicate BOOT_INVALID resolution in machine_restart()
- don't fall back from BOOT_ACPI to BOOT_EFI (if it was overridden, it
surely was for a reason)
- adjust doc change formatting
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 13 Mar 2015 10:23:14 +0000 (11:23 +0100)]
x86emul: simplify asm() constraints
Use + on outputs instead of = and a matching input. Allow not just
memory for the _eflags operand (it turns out that recent gcc produces
worse code when also doing this for _dst.val, so the latter is being
avoided).
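The two constraint styles can be compared on a toy example. This is not code from x86emul; add_matching and add_readwrite are hypothetical functions, and the inline assembly is only built for GCC-compatible compilers targeting x86, with a plain C fallback elsewhere.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))

/* Old style: an "=" output plus a matching ("0") input for the same variable. */
static unsigned int add_matching(unsigned int dst, unsigned int src)
{
    __asm__ ("addl %2, %0" : "=r" (dst) : "0" (dst), "r" (src) : "cc");
    return dst;
}

/* New style: a single read-write ("+") operand replaces the pair. */
static unsigned int add_readwrite(unsigned int dst, unsigned int src)
{
    __asm__ ("addl %1, %0" : "+r" (dst) : "r" (src) : "cc");
    return dst;
}

#else

/* Fallback so the sketch still builds on other compilers and targets. */
static unsigned int add_matching(unsigned int dst, unsigned int src)
{
    return dst + src;
}

static unsigned int add_readwrite(unsigned int dst, unsigned int src)
{
    return dst + src;
}

#endif
```

Both functions compute the same sum; the "+" form simply states once that the operand is both read and written.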
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Campbell [Fri, 20 Feb 2015 14:41:09 +0000 (14:41 +0000)]
tools: libxl: Explicitly disable graphics backends on qemu cmdline
By default qemu will try to create some sort of backend for the
emulated VGA device, either SDL or VNC.
However when the user specifies sdl=0 and vnc=0 in their configuration
libxl was not explicitly disabling either backend, which could lead to
one unexpectedly running.
If either sdl=1 or vnc=1 is configured then both before and after this
change only the backends which are explicitly enabled are configured,
i.e. this issue only occurs when all backends are supposed to have
been disabled.
This affects qemu-xen and qemu-xen-traditional differently.
If qemu-xen was compiled with SDL support then this would result in an
SDL window being opened if $DISPLAY is valid, or a failure to start
the guest if not. Passing "-display none" to qemu before any further
-sdl options disables this default behaviour and ensures that SDL is
only started if the libxl configuration demands it.
If qemu-xen was compiled without SDL support then qemu would instead
start a VNC server listening on ::1 (IPv6 localhost) or 127.0.0.1
(IPv4 localhost) with IPv6 preferred if available. Explicitly pass
"-vnc none" when vnc is not enabled in the libxl configuration to
remove this possibility.
qemu-xen-traditional would never start a vnc backend unless asked.
However by default it will start an SDL backend, the way to disable
this is to pass a -vnc option. In other words passing "-vnc none" will
disable both vnc and sdl by default. sdl can then be reenabled if
configured by subsequent use of the -sdl option.
Tested with both qemu-xen and qemu-xen-traditional built with SDL
support and:
xl cr # defaults
xl cr sdl=0 vnc=0
xl cr sdl=1 vnc=0
xl cr sdl=0 vnc=1
xl cr sdl=0 vnc=0 vga=\"none\"
xl cr sdl=0 vnc=0 nographic=1
with both valid and invalid $DISPLAY.
This is XSA-119 / CVE-2015-2152.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Wed, 11 Mar 2015 16:48:36 +0000 (16:48 +0000)]
tools/firmware: fix OVMF clean and distclean
They should have used "-ovmf-dir" suffix instead of "-ovmf", as the
directory in question is ovmf-dir.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 2 Mar 2015 10:52:20 +0000 (10:52 +0000)]
tools: OVMF parallel build
Though it doesn't work with make's "-j" option, the build system of OVMF
has an option to specify parallel threads used to run the build.
Using 4 threads to build OVMF looks like a sensible default.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Anthony Perard <anthony.perard@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Fri, 6 Mar 2015 17:21:07 +0000 (18:21 +0100)]
docs: fix `xl list' manpage entry
as it was not covering the '-n' option, which is present
since d743a223 ("xl: add node-affinity to the output of
`xl list`").
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Wed, 11 Mar 2015 13:05:25 +0000 (14:05 +0100)]
passthrough: share_p2m: fix build failure on ARM
The commit 7978429 "iommu: fix usage of shared EPT/IOMMU page tables on
PVH guests" breaks the hypervisor compilation on ARM.
This is because the macro hap_enabled is not defined on ARM.
On x86, the P2M can only be shared when hap is enabled and the user
didn't deny it (via the command line). Those checks are done by
iommu_use_hap_pt().
On ARM, the macro iommu_use_hap_pt() is also defined. So move the
if ( iommu_use_hap_pt(d) ) from the IOMMU drivers up to
iommu_share_p2m_table.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Julien Grall <julien.grall@linaro.org> Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Mon, 9 Mar 2015 12:48:56 +0000 (12:48 +0000)]
tools: xl: handle unspecified extra= when dealing with root=
If the cfg file includes root= but not extra= (nor cmdline=, which
supersedes both) then the command line will end up with an extra
"(null)" on it (at least with glibc's implementation of asprintf).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Chunyan Liu <cyliu@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
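The root=/extra= problem described in the commit above comes from handing a NULL pointer to a "%s" conversion, which glibc happens to render as "(null)". A minimal sketch of the guard, assuming a hypothetical build_cmdline helper and a simplified format string (the real xl code and its exact format are not shown here):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: build "root=<root>[ <extra>]" into out.
 * extra may be NULL, in which case nothing is appended for it rather
 * than passing NULL to "%s". Returns what snprintf() returns.
 */
static int build_cmdline(char *out, size_t out_len,
                         const char *root, const char *extra)
{
    assert(root != NULL);

    if (extra)
        return snprintf(out, out_len, "root=%s %s", root, extra);
    return snprintf(out, out_len, "root=%s", root);
}
```

With root set to /dev/xvda1 and no extra, the result is "root=/dev/xvda1" with no trailing "(null)".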
xen/arm: Remove warning for platforms without platform specific code
Replace the warning with an info message stating that the platform
is generic.
Suggested-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- resorted list in Makefile, moving thunderx entry ]
Philipp Hahn [Sun, 8 Mar 2015 10:54:17 +0000 (11:54 +0100)]
VHD: Fix locale aware character encoding handling
ASCII is 7 bit only, which does not work in UTF-8 environments:
> failed to read parent name
Set up the locale in vhd-util to parse LC_CTYPE and use the right codeset
when doing file name encoding and decoding.
Increase allocation for UTF-8 buffer as one UTF-16 character might use
twice as much space in UTF-8 (or more).
Don't check outbytesleft==0 as one UTF-8 character gets encoded into
1..8 bytes, so it's perfectly fine (and expected) for the output to have
remaining bytes left.
libxl_wait_for_memory_target: wait for 2 sec at a time
Use a 2 sec sleep time in the loop to allow the guest to release a
decent amount of memory in an iteration (empirical tests show ballooning
speed to be 512MB/sec on recent boxes).
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Mike Latimer <mlatimer@suse.com> Tested-by: Mike Latimer <mlatimer@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
freemem: remove call to libxl_wait_for_free_memory
Now that libxl_wait_for_memory_target is capable of waiting until dom0
reaches its target, we can remove the other wait function call:
libxl_wait_for_free_memory. No need to wait twice. Once dom0 has met its
target, simply loop again and recalculate free_memkb.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Reviewed-by: Mike Latimer <mlatimer@suse.com> Tested-by: Mike Latimer <mlatimer@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Fri, 6 Mar 2015 11:33:48 +0000 (11:33 +0000)]
libxc: use xc_dom_panic when decompressor is not supported
State explicitly that the specific decompressor is not supported by libxc.
Without this change, the libxc error message only says the provided kernel
is invalid, which is misleading.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: svenvan.van@gmail.com Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ross Lagerwall [Tue, 10 Mar 2015 12:52:01 +0000 (13:52 +0100)]
EFI: fix getting EFI variable list on some systems
Copy the entire output buffer to the guest because some firmwares update
size on successful calls (contrary to the spec) and the buffer may
contain data beyond the output size that the firmware requires on a
subsequent GetNextVariableName() call (e.g. a NULL character).
Note that this shouldn't change the amount of data copied because on success, a
compliant firmware does not change size and so the entire buffer is copied
anyway. If size is changed, Xen does not copy the buffer.
Without this change, the following (simplified) sequence would occur:
GetNextVariableName: in \0, size 1024 || out AdminPw\0, size 7
GetNextVariableName: in AdminPw\0, size 1024 || out UserPw\0, size 6
GetNextVariableName: in UserPww\0, size 1024 || NOT FOUND
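The sequence above can be reproduced with a small simulation. It is a sketch under simplifying assumptions: names are 8-bit strings as in the simplified trace (real UEFI variable names are UTF-16), and copy_back and simulate are hypothetical helpers, not Xen code.

```c
#include <string.h>

#define BUF_SIZE 1024

/*
 * Copy a firmware result back into the guest buffer.
 * copy_whole_buffer == 0 models copying only the `fw_size` bytes the
 * firmware reported; non-zero models copying the entire buffer.
 */
static void copy_back(char *guest, const char *fw_out, size_t fw_size,
                      int copy_whole_buffer)
{
    memcpy(guest, fw_out, copy_whole_buffer ? BUF_SIZE : fw_size);
}

/*
 * Replay the two successful calls from the trace and leave in `guest`
 * the name that would be passed as input to the third call.
 */
static void simulate(char guest[BUF_SIZE], int copy_whole_buffer)
{
    char fw[BUF_SIZE];

    memset(guest, 0, BUF_SIZE);          /* first call: empty name in */

    memset(fw, 0, BUF_SIZE);
    strcpy(fw, "AdminPw");               /* firmware output, size 7 */
    copy_back(guest, fw, 7, copy_whole_buffer);

    memset(fw, 0, BUF_SIZE);
    strcpy(fw, "UserPw");                /* firmware output, size 6 */
    copy_back(guest, fw, 6, copy_whole_buffer);
}
```

With the partial copy, the guest buffer ends up holding "UserPww", because the trailing "w" of the previous name is never overwritten. With the whole-buffer copy it holds "UserPw".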
This was seen on an Intel S1200RP_SE with firmware
S1200RP.86B.02.02.0005.102320140911, version 4.6, date 2014-10-23.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Daniel De Graaf [Tue, 10 Mar 2015 12:50:24 +0000 (13:50 +0100)]
flask: create unified "flask=" boot parameter
This unifies the flask_enforcing and flask_enabled boot parameters into
a single parameter with additional states. Defined options are:
enforcing - require policy to be loaded at boot time and enforce it
permissive - a missing or broken policy does not panic
disabled - revert to dummy (no XSM) policy. Was flask_enabled=0
late - bootloader policy is not used; later loadpolicy is enforcing
The default mode remains "permissive" and the flask_enforcing boot
parameter is retained for compatibility. If flask_enforcing=1 is
specified and flask= is not, the bootloader policy will be loaded in
enforcing mode if present, but errors will disable access controls until
a successful loadpolicy instead of causing a panic at boot.
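Mapping the four option strings to a mode could look like the sketch below. The enum, its constant names and parse_flask_mode are hypothetical and do not reproduce the hypervisor's parser; only the option strings and the "permissive" default come from the text above.

```c
#include <string.h>

/* Hypothetical mode constants for the four documented options. */
enum flask_mode {
    FLASK_MODE_ENFORCING,
    FLASK_MODE_PERMISSIVE,
    FLASK_MODE_DISABLED,
    FLASK_MODE_LATE,
    FLASK_MODE_UNKNOWN      /* unrecognised value */
};

/*
 * Hypothetical parser for the value of "flask=". A missing parameter
 * (NULL) maps to the documented default, "permissive".
 */
static enum flask_mode parse_flask_mode(const char *value)
{
    if (!value)
        return FLASK_MODE_PERMISSIVE;
    if (!strcmp(value, "enforcing"))
        return FLASK_MODE_ENFORCING;
    if (!strcmp(value, "permissive"))
        return FLASK_MODE_PERMISSIVE;
    if (!strcmp(value, "disabled"))
        return FLASK_MODE_DISABLED;
    if (!strcmp(value, "late"))
        return FLASK_MODE_LATE;
    return FLASK_MODE_UNKNOWN;
}
```

How an unrecognised value should be handled is a policy decision the sketch leaves to the caller.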
Suggested-by: Julien Grall <julien.grall@linaro.org> Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 10 Mar 2015 12:45:51 +0000 (13:45 +0100)]
x86emul: fully ignore segment override for register-only operations
For ModRM encoded instructions with register operands we must not
overwrite ea.mem.seg (if a - bogus in that case - segment override was
present) as it aliases with ea.reg.
This is CVE-2015-2151 / XSA-123.
Reported-by: Felix Wilhelm <fwilhelm@ernw.de> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Mon, 9 Mar 2015 13:04:55 +0000 (14:04 +0100)]
flask: clean up initialization and #defines
This removes the FLASK_DEVELOP and FLASK_BOOTPARAM configuration
parameters which have never been settable by users. Disabling the
FLASK_DEVELOP configuration option has not produced a compiling
hypervisor for some time, and the FLASK_BOOTPARAM option will be
replaced with a more flexible boot parameter.
This also changes the return type of xsm_initcall_t to void to properly
reflect the fact that the caller ignores the return value.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Mon, 9 Mar 2015 13:01:40 +0000 (14:01 +0100)]
iommu: fix usage of shared EPT/IOMMU page tables on PVH guests
iommu_share_p2m_table should not prevent PVH guests from using a shared page
table between the IOMMU and EPT. Clean the code by removing the asserts in
the vendor specific implementations (amd_iommu_share_p2m, iommu_set_pgd),
and moving the hap_enabled assert to the caller (iommu_share_p2m_table).
Also fix another incorrect usage of is_hvm_domain in
arch_iommu_populate_page_table. This has not given problems so far because
all the pages in PVH guests are of type PGT_writable_page.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Tested-by: David Vrabel <david.vrabel@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Mon, 9 Mar 2015 13:00:19 +0000 (14:00 +0100)]
VT-d: print_vtd_entries() should cope with superpages
Even if VT-d code alone (i.e. when not sharing tables with EPT) still
doesn't support superpages, this function - invoked upon DMA remapping
faults - needs to cope with such.
While at it also replace a few more plain numbers with suitable named
constants.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Fri, 6 Mar 2015 16:28:54 +0000 (17:28 +0100)]
x86: widen NUMA nodes to be allocated from
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 6 Mar 2015 16:27:33 +0000 (17:27 +0100)]
allow domain heap allocations to specify more than one NUMA node
... using struct domain as a container for passing the respective
affinity mask: Quite a number of allocations are domain specific, yet
not to be accounted for that domain. Introduce a flag suppressing the
accounting altogether (i.e. going beyond MEMF_no_refcount) and use it
right away in common code (x86 and IOMMU code will get adjusted
subsequently).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 6 Mar 2015 16:26:30 +0000 (17:26 +0100)]
x86: allow specifying the NUMA nodes Dom0 should run on
... by introducing a "dom0_nodes" option augmenting the "dom0_mem" and
"dom0_max_vcpus" ones.
Note that this gives meaning to MEMF_exact_node specified alone (i.e.
implicitly combined with NUMA_NO_NODE): In such a case any node inside
the domain's node mask is acceptable, but no other node. This changed
behavior is (implicitly) being exposed through the memop hypercalls.
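The rule in the preceding paragraph can be sketched as a small predicate. This is an illustration under assumptions, not the allocator's code: DEMO_NO_NODE stands in for NUMA_NO_NODE, the boolean "exact" stands in for MEMF_exact_node, the node mask is modelled as a 64-bit bitmap, and the non-exact case is assumed to permit fallback to any node.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_NO_NODE 0xFFu /* hypothetical stand-in for NUMA_NO_NODE */

/*
 * Decide whether `candidate` may satisfy an allocation request.
 *   node_mask: bit n set means node n is in the domain's node mask
 *   requested: a specific node, or DEMO_NO_NODE for "no node given"
 *   exact:     stand-in for MEMF_exact_node
 */
static bool demo_node_acceptable(uint64_t node_mask, unsigned int requested,
                                 bool exact, unsigned int candidate)
{
    if (candidate >= 64)
        return false; /* outside the modelled bitmap */

    if (!exact)
        return true; /* assumption: nodes are only a preference here */

    if (requested != DEMO_NO_NODE)
        return candidate == requested; /* exact node explicitly named */

    /* Exact flag alone: any node inside the mask, but no other node. */
    return (node_mask >> candidate) & 1;
}
```

With a mask covering nodes 1 and 2, an exact request without a node accepts node 1 or 2 and rejects node 3.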
Note further that this change doesn't take care of moving the initrd
image into memory matching Dom0's affinity when the initrd doesn't get
copied (because of being part of the initial mapping) anyway.
And note finally that this doesn't get us meaningfully closer to
handing vNUMA information to Dom0 (which will require the current
striping of allocations to become node-specific in order for the passed
on information to be meaningful).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 6 Mar 2015 15:56:53 +0000 (16:56 +0100)]
credit: generalize __vcpu_has_soft_affinity()
As pointed out in the discussion of the patch at
http://lists.xenproject.org/archives/html/xen-devel/2015-02/msg03256.html
generalizing the conditions here means code elsewhere doesn't need to
take into consideration internals of how load balancing in the credit
scheduler works.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Jan Beulich [Fri, 6 Mar 2015 15:56:16 +0000 (16:56 +0100)]
test_x86_emulate: fix inline assembly in blowfish code
With certain gcc versions, commit 1166ecf781 ("tools/Rules.mk: Don't
optimize debug builds; add macro debugging information") results in the
file scope inline assembly no longer being emitted to the .text section
without explicitly switching to it, which causes the blowfish test to
signal SEGV.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 6 Mar 2015 15:54:53 +0000 (16:54 +0100)]
do_xen_version() cleanup
- use existing latched value of current->domain where available
- use __copy_to_guest() instead of copy_to_guest() where possible
- drop redundant inclusion of xen/config.h
- drop pointless braces
- consistently use typedef names
- formatting
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Campbell [Tue, 3 Mar 2015 17:02:22 +0000 (17:02 +0000)]
libxl: Correct license header on libxl_flask.c to be LGPL
libxl is intended to be an LGPL 2.1 licensed library, however this
file inadvertently got given a GPL header.
The following people have touched this file, although all but Machon's
contributions are trivial and/or mechanical an Ack from each would be
unambiguous:
$ git log --format='%an <%aE>' tools/libxl/libxl_flask.c | sort -u
Ian Campbell <ian.campbell@citrix.com>
Ian Jackson <ian.jackson@eu.citrix.com>
Machon Gregory <mbgrego@tycho.ncsc.mil>
Wei Liu <liuw@liuw.name>
$
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Machon Gregory <mbgrego@tycho.ncsc.mil> Cc: Wei Liu <liuw@liuw.name> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Wei Liu <liuw@liuw.name> Acked-by: M. Gregory <mbgrego@tycho.ncsc.mil>
Wei Liu [Tue, 3 Mar 2015 12:44:38 +0000 (12:44 +0000)]
xsm/policy: remove gawk-ism line in Makefile
Translate gawk regex to mawk regex to allow using mawk. The new regex
works on both gawk and mawk.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Vijaya Kumar K [Wed, 4 Mar 2015 06:06:25 +0000 (11:36 +0530)]
xen/arm: Don't pass the PSCI-0.2 node to DOM0
The PSCI node is generated by Xen for dom0.
If the host device tree has a psci-0.2 node, skip parsing it
and avoid copying it from the host device tree to the dom0 device tree.
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Frediano Ziglio [Tue, 3 Mar 2015 15:41:14 +0000 (15:41 +0000)]
xen/arm: Make gic-v2 code handle hip04-d01 platform
The GIC in this platform is mostly compatible with the standard
GICv2, except for:
- ITARGET is extended to 16 bit to support 16 CPUs;
- SGI mask is extended to support 16 CPUs;
- maximum supported interrupt is 510;
- GICH APR and LR register offsets.
Signed-off-by: Frediano Ziglio <frediano.ziglio@huawei.com> Signed-off-by: Zoltan Kiss <zoltan.kiss@huawei.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Frediano Ziglio [Tue, 3 Mar 2015 15:41:12 +0000 (15:41 +0000)]
xen/arm: Duplicate gic-v2.c file to support hip04 platform version
The HiSilicon Hip04 platform uses a slightly different version.
This is just a verbatim copy of the file to work around git
not fully supporting the copy operation.
Signed-off-by: Frediano Ziglio <frediano.ziglio@huawei.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Add Memory Bandwidth Monitoring (MBM) for VMs. Two types of monitoring
are supported: total and local memory bandwidth monitoring. To use it,
CMT should be enabled in the hypervisor.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Wed, 4 Mar 2015 09:02:50 +0000 (10:02 +0100)]
domctl: cleanup
- drop redundant "ret = 0" statements
- drop unnecessary braces
- eliminate a few single use local variables
- move break statements inside case-specific braced scopes
- eliminate trailing whitespace
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 4 Mar 2015 09:01:41 +0000 (10:01 +0100)]
vNUMA: validate XEN_DOMCTL_setvnumainfo input
As we get ready to use the information set for a domain here we should
make sure it is actually valid: Both vNode and pNode numbers should be
in range. Do a little bit of other cleanup so the code ends up looking
reasonably consistent in style.
Along with this goes that we don't need an array of unsigned int to
store the pNode number - a nodeid_t one (a quarter the size) suffices.
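The range checks described above might look roughly like the following sketch. The names (demo_nodeid_t, demo_vnuma_valid) and the argument layout are assumptions for illustration, not the domctl's actual code, and any special sentinel values the real handler may accept are ignored. The one-byte node id type shows the "quarter the size" point on platforms where unsigned int is 4 bytes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nodeid_t: one byte per entry. */
typedef uint8_t demo_nodeid_t;

/*
 * Validate a vNUMA description:
 *   every vcpu_to_vnode[i] must name a vNode below nr_vnodes,
 *   every vnode_to_pnode[j] must name a pNode below nr_pnodes.
 */
static bool demo_vnuma_valid(const unsigned int *vcpu_to_vnode, size_t nr_vcpus,
                             const demo_nodeid_t *vnode_to_pnode,
                             size_t nr_vnodes, size_t nr_pnodes)
{
    for (size_t i = 0; i < nr_vcpus; i++)
        if (vcpu_to_vnode[i] >= nr_vnodes)
            return false; /* vNode number out of range */

    for (size_t j = 0; j < nr_vnodes; j++)
        if (vnode_to_pnode[j] >= nr_pnodes)
            return false; /* pNode number out of range */

    return true;
}
```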
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 4 Mar 2015 08:59:47 +0000 (09:59 +0100)]
x86/tboot: invalidate FIX_TBOOT_MAP_ADDRESS mapping after use
In order for commit cbeeaa7d ("x86/nmi: fix shootdown of pcpus
running in VMX non-root mode")'s re-use of that fixmap entry to not
cause undesirable (in crash context) cross-CPU TLB flushes, invalidate
the fixmap entry right after use.
Ian Campbell [Wed, 25 Feb 2015 13:39:48 +0000 (13:39 +0000)]
netif.h: describe request/response structures in terms of binary layout
In RFC style, rather than relying on the implicit assumptions of a
particular C ABI.
I have also confirmed, using the Python gdb extension technique in
[0], that the struct offsets (in a Linux binary at least) are the same
as described here.
I took the opportunity to also confirm that x86_32, x86_64, arm32 and
arm64 are all the same.
This highlighted that struct netif_rx_request was missing some
explicit padding, which is added here.
Lastly, fixup some struct names to allow the generated docs to
properly hyperlink, mainly by adding the _t to type names where
appropriate, but also s/netif_tx_extra/netif_extra_info_t/.
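As an illustration of pinning a layout down in octets rather than trusting one ABI, the sketch below uses compile-time checks on a small request structure with explicit padding. The struct name, field names and field order are assumptions made for this example and are not quoted from netif.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative request layout (hypothetical; not copied from netif.h):
 *   octets 0-1: id, octets 2-3: explicit pad, octets 4-7: gref */
struct demo_rx_request {
    uint16_t id;   /* identifier echoed back in the response */
    uint16_t pad;  /* explicit padding instead of compiler-inserted */
    uint32_t gref; /* grant reference for the receive buffer */
};

/* Fail the build if any ABI lays the struct out differently. */
static_assert(offsetof(struct demo_rx_request, id) == 0, "id at octet 0");
static_assert(offsetof(struct demo_rx_request, pad) == 2, "pad at octet 2");
static_assert(offsetof(struct demo_rx_request, gref) == 4, "gref at octet 4");
static_assert(sizeof(struct demo_rx_request) == 8, "request is 8 octets");

/* Run-time view of the same fact, handy in a test harness. */
static size_t demo_rx_request_size(void)
{
    return sizeof(struct demo_rx_request);
}
```

Making the padding a named field means the documented octet offsets hold by construction, instead of depending on how a particular compiler aligns the 32-bit member.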
Wei Liu [Wed, 25 Feb 2015 14:56:06 +0000 (14:56 +0000)]
libxl: update libxl.h to say _dispose is idempotent
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Wed, 25 Feb 2015 14:56:05 +0000 (14:56 +0000)]
testidl: call _init and _dispose several times
Call _init and _dispose between 1 and 10 times on a type to test if _init
and _dispose are idempotent.
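The idea can be sketched with a toy type. demo_type and its _init/_dispose functions are hypothetical and only mimic the naming pattern; this is not the generated testidl code. It assumes dispose leaves the object in a state where a further dispose is harmless.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical type owning one heap allocation. */
typedef struct {
    char *name;
} demo_type;

/* Put the object into a known empty state; safe to repeat. */
static void demo_type_init(demo_type *t)
{
    memset(t, 0, sizeof(*t));
}

/* Release resources and reset, so a repeated call is a no-op. */
static void demo_type_dispose(demo_type *t)
{
    free(t->name); /* free(NULL) is defined to do nothing */
    t->name = NULL;
}

/* Call _init n_init times (n_init >= 1), populate the object, then call
 * _dispose n_dispose times. Returns 1 if the object ends up empty. */
static int demo_check_idempotent(int n_init, int n_dispose)
{
    demo_type t;

    for (int i = 0; i < n_init; i++)
        demo_type_init(&t);

    t.name = malloc(4);
    if (t.name == NULL)
        return 0;
    memcpy(t.name, "abc", 4);

    for (int i = 0; i < n_dispose; i++)
        demo_type_dispose(&t);

    return t.name == NULL;
}
```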
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>