Doug Goldstein [Tue, 25 Aug 2015 13:49:49 +0000 (13:49 +0000)]
build: use correct qemu emulator binary
Per http://wiki.qemu.org/ChangeLog/1.0 and the fact that no currently
supported distro ships the x86 system emulator binary as 'qemu', this
changes the default when a user specifies --with-system-qemu without a
PATH to 'qemu-system-i386', otherwise the default results in a
non-functional setup.
[ Reran autogen.sh -iwj ]
Signed-off-by: Doug Goldstein <cardoe@cardoe.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Thu, 27 Aug 2015 15:54:01 +0000 (16:54 +0100)]
build: fix tarball stubdom build
When we create a source code tarball, mini-os is extracted to
extras/mini-os directory. When building a source code tarball, we
shouldn't clone mini-os again.
Only clone mini-os when that directory doesn't exist. This fixes tarball
build and doesn't affect non-tarball build.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Thu, 27 Aug 2015 15:40:38 +0000 (17:40 +0200)]
IOMMU: skip domains without page tables when dumping
Reported-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Specifically we are pulling in the upstream patch (commit 1b56452121672e6408c38ac8926bdd6998a39004):
[ath9k] Remove confusing logic inversion in an ANI variable
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Tue, 25 Aug 2015 14:18:31 +0000 (16:18 +0200)]
x86/IO-APIC: don't create pIRQ mapping from masked RTE
While moving our XenoLinux patches to 4.2-rc I noticed bogus "already
mapped" messages resulting from Linux (legitimately) writing RTEs with
only the mask bit set. Clearly we shouldn't even attempt to create a
pIRQ <-> IRQ mapping from such RTEs.
In the course of this I also found that the respective message isn't
really useful without also printing the pre-existing mapping. And I
noticed that map_domain_pirq() allowed IRQ0 to get through, despite us
never allowing a domain to control that interrupt.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
x86, amd_ucode: skip microcode updates for final levels
Some older [Fam10h] systems require that certain applied microcode
patch levels not be overwritten by the microcode loader. Otherwise,
system hangs are known to occur.
The 'final_levels' of patch ids have been obtained empirically.
Refer to bug https://bugzilla.suse.com/show_bug.cgi?id=913996
for details of the issue.
The short version is that people have predominantly noticed
system hang issues when trying to update microcode levels
beyond the patch IDs below.
[0x01000098, 0x0100009f, 0x010000af]
From internal discussions, we gathered that OS/hypervisor
cannot reliably perform microcode updates beyond these levels
due to hardware issues. Therefore, we need to abort microcode
update process if we hit any of these levels.
In this patch, we check for those microcode versions and abort
if the current core has one of those final patch levels applied
by the BIOS.
A Linux version of the patch has already made it into tip:
http://marc.info/?l=linux-kernel&m=143703405627170
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
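The check described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: the three patch IDs come from the message above, but the function name is_final_patch_level() and the way the current level is passed in are hypothetical and are not taken from the actual patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Patch levels quoted in the commit message above. */
static const uint32_t final_levels[] = {
    0x01000098,
    0x0100009f,
    0x010000af,
};

/*
 * Hypothetical helper: returns true when the patch level currently
 * applied on a core is one of the "final" levels, in which case the
 * loader should abort the update instead of overwriting it.
 */
bool is_final_patch_level(uint32_t current_level)
{
    size_t i;

    for (i = 0; i < sizeof(final_levels) / sizeof(final_levels[0]); i++)
        if (final_levels[i] == current_level)
            return true;

    return false;
}
```

A caller would read the current patch level of the core first and skip the update when this returns true.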
Wei Liu [Mon, 17 Aug 2015 18:57:01 +0000 (19:57 +0100)]
libxc: fix vNUMA memory allocation
Only 4KB allocation was using new_memflags. We should use new_memflags
for 2MB and 1GB allocation as well because that variable contains
node information.
Without this patch, when creating a HVM guest with vNUMA, because the
node information was not present in the flags passed to libxc, actual
memory allocation didn't comply with what user specified. With this
patch the behaviour is correct.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 17 Aug 2015 18:57:00 +0000 (19:57 +0100)]
xl: error out if vNUMA specifies more vcpus than pcpus
... but allow user to override that check by specifying maxvcpus= in xl
configuration file.
Note that the code is constructed such that the fallout is dealt with
after parsing. We can live with that because, although it wastes a few
cpu cycles, it is still functionally correct, and I would like to have
a clear split between parsing and dealing with fallout.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Before this change, vdistance from node 0 to all nodes (including
itself) was 10 and vdistance from node 1 to all nodes was 20.
After this change, vdistance from node 0 to itself is 10, to node 1 is
20 and vdistance from node 1 to node 0 is 20, to itself is 10. That's
the correct vdistance settings we expect.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
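The expected default described above (10 on the diagonal, 20 everywhere else) can be sketched as below. The function name, the row-major layout and the two constants are assumptions made for illustration; the actual xl code is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_LOCAL_DISTANCE  10   /* node to itself */
#define SKETCH_REMOTE_DISTANCE 20   /* node to any other node */

/*
 * Fill an nr_nodes x nr_nodes distance matrix stored row-major in
 * 'vdistance'. Entry [i][j] is the distance from node i to node j.
 */
void fill_default_vdistance(unsigned int *vdistance, size_t nr_nodes)
{
    size_t i, j;

    assert(vdistance != NULL);

    for (i = 0; i < nr_nodes; i++)
        for (j = 0; j < nr_nodes; j++)
            vdistance[i * nr_nodes + j] =
                (i == j) ? SKETCH_LOCAL_DISTANCE : SKETCH_REMOTE_DISTANCE;
}
```

The behaviour described as wrong corresponds to choosing the value from the row index alone, which gives every entry in a row the same distance.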
Boris Ostrovsky [Fri, 14 Aug 2015 16:18:52 +0000 (12:18 -0400)]
libxc: allow empty memory nodes in vNUMA
The test for 'nr_vmemranges < nr_vnodes' in xc_domain_setvnuma() was
originally written with the idea that the number of memory ranges would
at least be equal to the number of nodes.
We may want to specify nodes with no memory, however, and thus this
check should be removed.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 7 Aug 2015 10:17:38 +0000 (12:17 +0200)]
libxl: fix libxl__build_hvm error handling
With the current code in libxl__build_hvm it is possible for the function to
fail and still return 0.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
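The message does not say how the function could fail and still return 0. One common way for this class of bug to arise, shown purely as an assumed illustration, is a return code that starts at 0 and is never updated on an error path. All names below (build_buggy, build_fixed, step_ok, step_fail, SKETCH_ERROR_FAIL) are hypothetical and not taken from libxl.

```c
#define SKETCH_ERROR_FAIL (-3)   /* placeholder for a libxl-style error code */

typedef int (*step_fn)(void);

/* Demo stand-ins for a build step that succeeds or fails. */
int step_ok(void)   { return 0; }
int step_fail(void) { return -1; }

/* Buggy shape: rc starts at 0 and the failure path never changes it. */
int build_buggy(step_fn step)
{
    int rc = 0;

    if (step() != 0)
        goto out;            /* failure, yet rc is still 0 */

out:
    return rc;
}

/* Fixed shape: start pessimistic, set rc to 0 only on success. */
int build_fixed(step_fn step)
{
    int rc = SKETCH_ERROR_FAIL;

    if (step() != 0)
        goto out;

    rc = 0;                  /* reached only when every step succeeded */

out:
    return rc;
}
```

With a failing step, build_buggy() reports success while build_fixed() reports the error.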
Jan Beulich [Fri, 14 Aug 2015 10:36:26 +0000 (12:36 +0200)]
x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()
... and its callers.
While all non-nested users are made to fully honor the semantics of that
type, doing so in the nested case seemed insane (if doable at all,
considering VMCS shadowing), and hence there the respective operations
are simply made to fail.
One case not (yet) taken care of is that of a page getting transitioned
to this type after a mapping got established.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jean Delvare [Thu, 13 Aug 2015 12:48:40 +0000 (14:48 +0200)]
x86/dmi_scan: only honor end-of-table for 64-bit tables
A 32-bit entry point to a DMI table says how many structures the table
contains. The SMBIOS specification explicitly says that end-of-table
markers should be ignored if they are not actually at the end of the
DMI table. So only honor the end-of-table marker for tables accessed
through 64-bit entry points, as they do not specify a structure count.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
[Linux commit 17cd5bd5391e6e7b363d66335e1bc6760ae969b9] Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
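The rule above can be illustrated with a simplified table walk. This is a sketch, not the Xen or Linux code: the function name and the convention that num == 0 means "no structure count available" (the 64-bit entry point case) are assumptions. It relies only on the SMBIOS layout of a structure: a header whose second byte is the formatted length, followed by a string set ending in two NUL bytes, with type 127 marking end-of-table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_DMI_END_OF_TABLE 127
#define SKETCH_DMI_HEADER_LEN   4    /* type, length, 16-bit handle */

/*
 * Count the structures that would be visited in 'buf'.
 * num != 0: 32-bit entry point, visit exactly 'num' structures and
 *           ignore any end-of-table marker in the middle.
 * num == 0: 64-bit entry point, stop at the end-of-table marker.
 */
size_t dmi_count_structures(const uint8_t *buf, size_t len, size_t num)
{
    size_t pos = 0, seen = 0;

    assert(buf != NULL);

    while ((num == 0 || seen < num) && len - pos >= SKETCH_DMI_HEADER_LEN) {
        uint8_t type = buf[pos];
        uint8_t hdr_len = buf[pos + 1];
        size_t next;

        if (hdr_len < SKETCH_DMI_HEADER_LEN || hdr_len > len - pos)
            break;                       /* malformed header */

        /* Skip the formatted area, then the string set (ends in "\0\0"). */
        next = pos + hdr_len;
        while (next + 1 < len && (buf[next] || buf[next + 1]))
            next++;
        if (next + 1 >= len)
            break;                       /* string set not terminated */
        next += 2;

        seen++;

        if (num == 0 && type == SKETCH_DMI_END_OF_TABLE)
            break;                       /* honored only without a count */

        pos = next;
    }

    return seen;
}
```

With a table that has an end-of-table structure in the middle, a caller passing the structure count walks past the marker, while a caller passing 0 stops at it.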
Jan Beulich [Thu, 13 Aug 2015 12:47:06 +0000 (14:47 +0200)]
add page_get_owner_and_reference() related ASSERT()s
The function shouldn't return NULL after having obtained a reference,
or else the caller won't know to drop it.
Also its result shouldn't be ignored - if calling code is certain that
a page already has a non-zero refcount, it better ASSERT()s so.
Finally this as well as get_page() and put_page() are required to be
available on all architectures - move the declarations to xen/mm.h.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
hvm_event_crX() already returns a bool_t to tell us whether an
event will be sent out or not, so the extra check that value != old
is not only useless, but also prevents non-onchangeonly events from
being sent.
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Thu, 13 Aug 2015 12:44:21 +0000 (14:44 +0200)]
x86/p2m: clear_identity_p2m_entry() must cope with 'relaxed' RDM mode
Tearing down a 1:1 mapping that was never established isn't really nice
(and in fact hits an ASSERT() in p2m_remove_page()). Convert from a
wrapper macro to a proper function which then can take care of the
situation.
Also take the opportunity to remove the 'page_order' parameter of
clear_identity_p2m_entry(), to make it match set_identity_p2m_entry().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Thu, 13 Aug 2015 12:41:09 +0000 (14:41 +0200)]
mm: populate_physmap: validate correctly the gfn for direct mapped domain
A direct mapped domain already has its memory allocated 1:1, so we
directly use the gfn as the mfn to map the RAM in the guest.
While we validate that the page associated with the first mfn belongs to
the domain, the subsequent MFNs are not validated when the extent_order
is > 0.
This may result in mapping a memory region (MMIO, RAM) which doesn't
belong to the domain.
However, only DOM0 on ARM uses direct memory mapping, so this doesn't
affect any guest (at least on the upstream version), nor x86.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
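The missing validation amounts to checking every frame of the extent rather than only the first. A minimal sketch follows, assuming a hypothetical flat ownership table in place of Xen's real page ownership lookup; extent_owned_by() and its parameters are illustrative names only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: owner[mfn] holds the id of the domain owning
 * frame 'mfn', for mfn in [0, nr_frames).
 * Returns true only if all 2^extent_order frames starting at
 * first_mfn exist and belong to 'domid'.
 */
bool extent_owned_by(const int *owner, size_t nr_frames,
                     size_t first_mfn, unsigned int extent_order, int domid)
{
    size_t count, i;

    assert(owner != NULL);

    if (extent_order >= sizeof(size_t) * CHAR_BIT)
        return false;                    /* shift would overflow */
    count = (size_t)1 << extent_order;

    if (first_mfn >= nr_frames || count > nr_frames - first_mfn)
        return false;                    /* extent runs off the table */

    for (i = 0; i < count; i++)
        if (owner[first_mfn + i] != domid)
            return false;                /* a later frame is foreign */

    return true;
}
```

Checking only owner[first_mfn] would accept an extent whose later frames belong to someone else, which is the situation the commit describes.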
Ian Campbell [Wed, 12 Aug 2015 13:56:01 +0000 (14:56 +0100)]
tools: libxl: Remove unnecessary trailing \n from log messages.
Both xl's LOG and the various libxl logging mechanisms automatically
include a trailing \n.
Remove all unnecessary \n's from the log messages with the following
semantic patch.
spatch also reindents (I couldn't see how to make it stop). In general
it has improved matters but in 1 case it has introduced a long line,
this will be fixed in the next patch.
Semantic patch, run as
spatch --in-place --no-includes --include-headers \
--sp-file libxl-log-nl.spatch \
tools/libxl/libxl*.[ch] tools/libxl/xl*.[ch]
=========
// Heavily inspired by https://lkml.org/lkml/2014/9/12/134
Ian Campbell [Wed, 12 Aug 2015 09:07:37 +0000 (10:07 +0100)]
gitignore: Don't ignore *.rej
These indicate a patch application went wrong, and I want to see them in
"git status". This appears to have been imported from .hgignore where
it has been since 2005.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Andrew Cooper [Fri, 7 Aug 2015 13:51:59 +0000 (14:51 +0100)]
tools/xenstore: Correct use of va_end() after va_copy()
C requires that every use of va_copy() is matched with a va_end() call.
This is especially important for x86_64 as va_{start,copy}() may need to
allocate memory to generate a va_list containing parameters which were
previously in registers.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
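As an illustration of the pairing rule, here is a small formatting helper that walks its arguments twice. It is not the xenstore code; format_alloc() is a hypothetical name chosen for this sketch. The point is that the copy made with va_copy() gets its own va_end(), separate from the one matching va_start().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly allocated buffer. Returns NULL on failure;
 * the caller frees the result.
 */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap_copy;
    char *buf;
    int len;

    assert(fmt != NULL);

    va_start(ap, fmt);

    /* First pass: measure, consuming a copy so 'ap' stays usable. */
    va_copy(ap_copy, ap);
    len = vsnprintf(NULL, 0, fmt, ap_copy);
    va_end(ap_copy);                 /* matches va_copy() */

    if (len < 0) {
        va_end(ap);                  /* matches va_start() on this path */
        return NULL;
    }

    /* Second pass: format for real using the original list. */
    buf = malloc((size_t)len + 1);
    if (buf)
        vsnprintf(buf, (size_t)len + 1, fmt, ap);

    va_end(ap);                      /* matches va_start() */
    return buf;
}
```

Every exit path releases both lists, including the early return.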
Andrew Cooper [Fri, 7 Aug 2015 14:06:24 +0000 (15:06 +0100)]
tools/libxl: Alter the use of rand() in testidl
Coverity warns for every occurrence of rand(), which is made worse
because each time the IDL changes, some of the calls get re-flagged.
Collect all calls to rand() in a single function, test_rand(), which
takes a modulo parameter for convenience. This turns 40 defects
currently into 1, which won't get re-flagged when the IDL changes.
In addition, fix the erroneous random choice for libxl_defbool_set().
"!!rand() % 1" is unconditionally 0, and even without the "% 1" would
still be very heavily skewed in one direction.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
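The precedence problem is easy to reproduce in isolation. In the sketch below, test_rand() follows the description above (a single wrapper taking a modulo), while broken_choice() and fair_choice() are hypothetical names, and fair_choice() is just one plausible replacement, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Single wrapper so that rand() appears in exactly one place. */
int test_rand(unsigned int modulo)
{
    assert(modulo > 0);
    return (int)((unsigned int)rand() % modulo);
}

/*
 * The criticised expression: '!' binds tighter than '%', so this is
 * (!!rand()) % 1, and any integer modulo 1 is 0.
 */
int broken_choice(void)
{
    return !!rand() % 1;
}

/* A roughly even true/false choice instead. */
bool fair_choice(void)
{
    return test_rand(2) != 0;
}
```

Dropping the "% 1" alone would not help much either, since !!rand() is 1 for every nonzero return value of rand().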
Andrew Cooper [Fri, 7 Aug 2015 14:06:23 +0000 (15:06 +0100)]
tools/libxl: Assert success of memory allocation in testidl
The chances of an allocation failing are slim but nonzero. Assert
success of each allocation to quieten Coverity, which re-notices defects
each time the IDL changes.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 7 Aug 2015 18:53:55 +0000 (19:53 +0100)]
tools/libxc: linux: Don't use getpagesize() when unmapping the grants
The grants are based on the Xen granularity (i.e. 4KB). While the function
to map grants for Linux (linux_gnttab_grant_map) is using the correct
size (XC_PAGE_SIZE), the unmap one (linux_gnttab_munmap) is using
getpagesize().
On a domain using a page granularity different from Xen's (this is the
case for an AARCH64 guest using 64KB pages), the unmap will be called
with the wrong size.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
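The size mismatch can be shown with plain arithmetic. The macros and function names below are stand-ins invented for this sketch (SKETCH_XEN_PAGE_SIZE plays the role of the 4KB Xen granularity); they are not the libxc definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_XEN_PAGE_SHIFT 12     /* 4KB Xen granularity, as stated above */
#define SKETCH_XEN_PAGE_SIZE  ((size_t)1 << SKETCH_XEN_PAGE_SHIFT)

/* Length in bytes of a mapping of 'count' grants, as the map path sizes it. */
size_t grant_mapping_len(uint32_t count)
{
    return (size_t)count * SKETCH_XEN_PAGE_SIZE;
}

/* Length an unmap path would compute if it used the OS page size instead. */
size_t os_page_unmap_len(uint32_t count, size_t os_page_size)
{
    return (size_t)count * os_page_size;
}
```

For 4 grants the mapping is 16KB long, but with a 64KB OS page size the second function yields 256KB; the two agree only when the OS page size is also 4KB.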
Wei Liu [Mon, 10 Aug 2015 08:00:18 +0000 (09:00 +0100)]
oxenstored: fix systemd socket activation
Use the correct API sd_listen_fds to determine whether the process is
started by systemd.
Change sd_booted to launched_by_systemd to avoid confusion with
systemd's API.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Dave Scott <dave.scott@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Mon, 10 Aug 2015 08:00:16 +0000 (09:00 +0100)]
cxenstored: fix systemd socket activation
There were two problems with original code:
1. sd_booted() was used to determine if the process was started by
systemd, which was wrong.
2. Exit with error if pidfile was specified, which was too harsh.
These two combined made cxenstored unable to start by hand if it ran
on a system which had systemd.
Fix issues with following changes:
1. Use sd_listen_fds to determine if the process is started by systemd.
2. Don't exit if pidfile is specified.
Rename function and restructure code to make things clearer.
A side effect of this patch is that gcc 4.8 with -Wmaybe-uninitialized
in a non-debug build emits a spurious warning that sock and ro_sock
might be uninitialized. Since CentOS 7 ships gcc 4.8, we need to work
around that by setting sock and ro_sock to NULL at the beginning of
main.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Tested-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
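The distinction matters because sd_booted() only says that the system runs systemd, whereas socket activation is a property of this particular process. The sketch below is a simplified stand-in for the kind of check sd_listen_fds() performs, based on the documented LISTEN_PID and LISTEN_FDS environment variables. It is not the libsystemd implementation and not the cxenstored code; socket_activated() is a hypothetical name.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Returns true if the environment strings indicate that file
 * descriptors were passed to the process with pid 'self_pid'.
 * 'listen_pid' and 'listen_fds' are the values of $LISTEN_PID and
 * $LISTEN_FDS, or NULL when unset.
 */
bool socket_activated(const char *listen_pid, const char *listen_fds,
                      long self_pid)
{
    char *end;
    long pid, fds;

    if (!listen_pid || !listen_fds)
        return false;                /* not started via socket activation */

    pid = strtol(listen_pid, &end, 10);
    if (*listen_pid == '\0' || *end != '\0' || pid != self_pid)
        return false;                /* fds were meant for another process */

    fds = strtol(listen_fds, &end, 10);
    if (*listen_fds == '\0' || *end != '\0')
        return false;

    return fds > 0;
}
```

A daemon started by hand on a systemd machine has neither variable set, so this returns false and the daemon can create its own sockets. A real caller would pass getenv("LISTEN_PID"), getenv("LISTEN_FDS") and getpid().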
Ian Jackson [Tue, 11 Aug 2015 14:00:20 +0000 (15:00 +0100)]
Update QEMU_UPSTREAM_REVISION for 4.6 RC1
When we make RC1 we arrange to get a specific version of
qemu-xen-upstream.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Ian Jackson [Tue, 11 Aug 2015 13:51:43 +0000 (14:51 +0100)]
Update version to Xen 4.6 RC
* Change README to say `Xen 4.6-rc'
* Change XEN_EXTRAVERSION so that we are `4.6.0-rc'
Note that the RC number (eg, 1 for rc1) is not in the version string,
so that we do not need to update this again when we cut the next RC.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: Jan Beulich <jbeulich@suse.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Tue, 11 Aug 2015 13:41:23 +0000 (14:41 +0100)]
tools: Update sonames for 4.6 RCs
Update libxc to 4.6.
Update libxl to 4.6.
Update libxlu to 4.6.
I did
git-grep 'MAJOR.*='
and also to check I had everything
git-grep 'SONAME_LDFLAG' | egrep -v 'MAJOR' |less
The other, un-updated, libraries are:
blktap2 (control, libvhd) 1.0 in-tree users only, no ABI changes
libfsimage 1.0 no ABI changes
libvchan 1.0 no ABI changes
libxenstat 0.0 (!) no ABI changes
libxenstore 3.0 no ABI changes
My assertions "no ABI changes" are based on the output of
git-diff origin/stable-4.5..staging .
and similar runes, sometimes limited to .h files.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
---
v2: Bump libxlu too. [ Reported by Wei Liu. ]
Ian Campbell [Thu, 6 Aug 2015 10:55:57 +0000 (11:55 +0100)]
libxl: use correct command line for arm guests.
We need to use libxl__domain_build_state.pv_cmdline in order to pickup
the correct args when using pygrub. libxl_domain_build_info.cmdline is
any args statically configured by the user.
This is consistent with the call to xc_domain_allocate, which takes
the cmdline too (in that case for x86/PV usage).
state->pv_cmdline is also set for non-pygrub guests, since
libxl__bootloader_run propagates info->cmdline if no bootloader is
configured.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Anshul Makkar [Wed, 5 Aug 2015 15:47:59 +0000 (16:47 +0100)]
x86/mm: Make {hap, shadow}_teardown() preemptible
A domain with sufficient shadow allocation can cause a watchdog timeout
during domain destruction. Expand the existing -ERESTART logic in
paging_teardown() to allow {hap/sh}_set_allocation() to become
restartable during the DOMCTL_destroydomain hypercall.
Signed-off-by: Anshul Makkar <anshul.makkar@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Ting-Wei Lan [Wed, 5 Aug 2015 17:10:06 +0000 (01:10 +0800)]
VT-d: add iommu=igfx option to workaround graphics issues
When using Linux >= 3.19 (commit 47591df) as dom0 on some Intel Ironlake
devices, it is possible to encounter graphics issues that make the screen
unreadable or crash the system. It was reported in freedesktop bugzilla:
As we still cannot find a proper fix for this problem, this patch adds
iommu=igfx option to control whether Intel graphics IOMMU is enabled.
Running Xen with iommu=no-igfx is similar to running Linux with
intel_iommu=igfx_off, which disables IOMMU for Intel GPU. This can be
used by users to manually workaround the problem before a fix is
available for i915 driver.
Signed-off-by: Ting-Wei Lan <lantw44@gmail.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Tue, 4 Aug 2015 17:16:34 +0000 (18:16 +0100)]
tools/libxl: Prepare to write multiple records with EMULATOR headers
With the newly specified EMULATOR_XENSTORE_DATA record, there are two
libxl records with an emulator subheader. Refactor the existing code to
make future additions easier, and rename some functions for consistency
with the new scheme.
* Calculate the subheader at stream start time, rather than on the fly.
Its contents are not going to change.
* Introduce a new setup_emulator_write() to insert a sub header in the
appropriate place before a blob of data.
* Rename *toolstack_* to *emulator_xenstore_*
* Rename *emulator_* to *emulator_context_*
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Andrew Cooper [Tue, 4 Aug 2015 17:16:32 +0000 (18:16 +0100)]
docs/libxl: Re-specify XENSTORE_DATA as EMULATOR_XENSTORE_DATA
The legacy "toolstack" record as implemented in libxl turns out not to
be 32/64bit safe. As migration v2 has not shipped yet, take this
opportunity to adjust the specification and fix the incompatibility.
Libxl shall lose all knowledge of the old "toolstack" blob and use this
EMULATOR_XENSTORE_DATA record instead. Compatibility shall be handled
by the legacy conversion script.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Tue, 4 Aug 2015 17:16:31 +0000 (18:16 +0100)]
tools/libxl: Make libxl__conversion_helper_abort() safe to use
Previously, in the case of an error causing a call to
libxl__conversion_helper_abort() on a stream without legacy conversion,
libxl would fall over a NULL pointer because chs->ao was not set up.
Arrange for all ->ao's to be set up at _init() time, by having each
_init() function assert that their caller has done the right thing.
While doing so, introduce a previously-missing save_helper_init() in
stream_read_init().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Roger Pau Monne [Tue, 4 Aug 2015 10:02:55 +0000 (12:02 +0200)]
libxl: increase hotplug timeout to 40s
The default libxl timeout for hotplug script execution is too low: when
launching 40 HVM guests in parallel, all using the same file as disk,
execution times of ~20s are expected. Increase the timeout to 40s in order
to be sure hotplug scripts have enough time to execute.
This is a short term solution.
Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Mon, 3 Aug 2015 17:05:43 +0000 (18:05 +0100)]
x86/gdt: Drop write-only, xalloc()'d array from set_gdt()
It is not used, and can cause a spurious failure of the set_gdt() hypercall in
low memory situations.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Paul Durrant [Fri, 31 Jul 2015 15:34:22 +0000 (16:34 +0100)]
x86/hvm: don't rely on shared ioreq state for completion handling
Both hvm_io_pending() and hvm_wait_for_io() use the shared (with emulator)
ioreq structure to determine whether there is a pending I/O. The latter will
misbehave if the shared state is driven to STATE_IOREQ_NONE by the emulator,
or when the shared ioreq page is cleared for re-insertion into the guest
P2M when the ioreq server is disabled (STATE_IOREQ_NONE == 0) because it
will terminate its wait without calling hvm_io_assist() to adjust Xen's
internal I/O emulation state. This may then lead to an io completion
handler finding incorrect internal emulation state and calling
domain_crash().
This patch fixes the problem by adding a pending flag to the ioreq server's
per-vcpu structure which cannot be directly manipulated by the emulator
and thus can be used to determine whether an I/O is actually pending for
that vcpu on that ioreq server. If an I/O is pending and the shared state
is seen to go to STATE_IOREQ_NONE then it can be treated as an abnormal
completion of emulation (hence the data placed in the shared structure
is not used) and the internal state is adjusted as for a normal completion.
Thus, when a completion handler subsequently runs, the internal state is as
expected and domain_crash() will not be called.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Keir Fraser <keir@xen.org> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ting-Wei Lan [Thu, 30 Jul 2015 06:51:10 +0000 (14:51 +0800)]
build: use correct qemu path in systemd service file and init script
When --with-system-qemu is used, it is possible that we cannot find
qemu-system-i386 in LIBEXEC_BIN, which can cause error in xencommons
init script and xen-qemu-dom0-disk-backend.service systemd service.
Signed-off-by: Ting-Wei Lan <lantw44@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ravi Sahita [Wed, 29 Jul 2015 16:39:22 +0000 (09:39 -0700)]
x86/hvm.c: Don't tear down altp2m state if it was never set up
Reported-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Wei Liu <wei.liu2@citrix.com>
[ ijc -- replacement subject from Andy ]
Andrew Cooper [Tue, 28 Jul 2015 21:44:37 +0000 (22:44 +0100)]
tools/libxl: Assert that libxl__ao_inprogress_gc() is not called with NULL
libxl__ao_inprogress_gc() is hidden behind various macros used to
construct local variables. Assert() that NULL is not passed, to make
such an error very obvious, rather than a plain segfault at 0.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Tue, 28 Jul 2015 21:44:36 +0000 (22:44 +0100)]
tools/libxl: Only continue stream operations if the stream is still in progress
Part of the callback contract with check_all_finished() is that each
running parallel task shall call it exactly once.
Previously, it was possible for stream_continue() or
write_toolstack_record() to fail and call into check_all_finished(). As
the save helpers callback has fired, it no longer counts as in use,
which causes check_all_finished() to fire the stream callback. Then,
unwinding the stack back and calling check_all_finished() a second time
results in the same conditions being observed, and the stream callback
being fired a second time.
To avoid this, check_all_finished() is called before any other actions
which continue the stream functionality, and the stream is only
continued if it has not been torn down. This guarantees not to continue
stream operations if the stream does not owe a callback to
check_all_finished().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Campbell [Wed, 29 Jul 2015 10:00:36 +0000 (11:00 +0100)]
Replace FSF street address with canonical URL
As recommended in http://www.gnu.org/licenses/gpl-howto.en.html.
This is the result of:
$ git grep -El Mass\|Temple\|Franklin | xargs ./fsf.pl
Where fsf.pl is:
#!/usr/bin/perl -w -pi.bak -0777
my $repl = 'If not, see <http://www.gnu.org/licenses/>.';
my $br = qr/(?:\s*\n\s*(?:[\*\#]|\/\/|\.\\" )?\s*|\s+)/;
my $inwt = qr/[Ii]f${br}not,${br}write${br}(?:to${br})?the${br}Free${br}Software${br}Foundation,(?:${br}Inc\.,)?/;
my $mass = qr/675${br}Mass${br}Ave,?${br}Cambridge,?${br}MA${br}02139,?${br}USA,?\.?/;
my $franklin = qr/51${br}Franklin${br}St(?:reet)?(?:,${br}| - )Fifth${br}Floor,?${br}Boston,?${br}MA,?${br}02110-1301,?${br}USA,?\.?/;
my $temple = qr/59${br}Temple${br}Place(?:,${br}| - )Suite${br}330,?${br}Boston,?${br}MA,?${br}021110?-1307,?${br}USA,?\.?/;
The only remaining mentions of these addresses are in COPYING files which I
haven't touched.
Some of the changed files are imports from elsewhere, however
filtering them out is tricky, I think it is tolerable to have these
files be modified here and then perhaps reverted on the next sync,
since it's only 1-2 lines and obvious what is going on.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Mon, 27 Jul 2015 16:47:26 +0000 (17:47 +0100)]
tools/libxl: Do not fire the stream callback multiple times
Avoid stacking of check_all_finished() via synchronous teardown of
tasks. If the _abort() functions call back synchronously,
stream->completion_callback() ends up getting called twice, as first
and last check_all_finished() frames observe each task being finished.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Mon, 27 Jul 2015 16:47:25 +0000 (17:47 +0100)]
tools/libxl: Do not set stream->rc in stream_complete()
Only ever set stream->rc in check_all_finished(). The first version of
the migration v2 series had separate rc and joined_rc parameters, where
this logic worked. However when combining the two, the teardown path
fails to trigger if stream_complete() records stream->rc itself. A side
effect of this is that stream_done() needs to take an rc parameter.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Currently we always use memory map[] to help hvmloader construct e820 table
but hvmloader may have relocated RAM to support mmio allocation or just
populated ram to ensure we can have enough room to load ovmf. Anyway we
need to sync these changes into memory map[].
CC: Keir Fraser <keir@xen.org> CC: Jan Beulich <jbeulich@suse.com> CC: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Jackson <ian.jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: Ian Campbell <ian.campbell@citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: George Dunlap <george.dunlap@eu.citrix.com> Signed-off-by: Tiejun Chen <tiejun.chen@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Mon, 27 Jul 2015 17:45:08 +0000 (18:45 +0100)]
python/xc: reinstate original implementation of next_bdf
I missed the fact that next_bdf is used to parse user supplied
strings when reviewing. The user supplied string is a NUL-terminated
string separated by commas. The user can supply several PCI devices in
that string. There is, however, no delimiter for different devices, hence
we can't change the syntax of that string.
This patch reinstates the original implementation of next_bdf to
preserve the original syntax. The last argument for xc_assign_device
is always 0.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 27 Jul 2015 17:45:02 +0000 (18:45 +0100)]
libxl: properly clean up array in libxl_list_cpupool failure path
Document how cpupool_info works. Distinguish success (ERROR_FAIL +
ENOENT) vs failure in libxl_list_cpupool and properly clean up the array
in failure path.
Also switch to libxl__realloc and call libxl_cpupool_{init,dispose}
where appropriate.
There is a change of behaviour. Previously, if memory allocation failed,
the said function returned NULL. Now memory allocation failure is fatal.
This is in line with how we deal with memory allocation failure in other
places in libxl though.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Mon, 20 Jul 2015 10:37:59 +0000 (11:37 +0100)]
tools/libx{l, c}: Fix trivial Coverity defects in migration v2 code
All of these are UNUSED_VALUE defects where a default value is
unconditionally overwritten. They are not particularly interesting,
bug wise, but keeping these defects at bay helps prevent real bugs
going unnoticed in the volume.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Mon, 20 Jul 2015 10:37:58 +0000 (11:37 +0100)]
docs: Migration v2 is now no longer draft
Add further instructions to the libxc "Future Extensions" section, and
provide such a section for libxl.
In addition, drop the "In experimental __func__" IPRINTF()s from the
libxc implementations.
Finally, a correction to libxl's "Not Yet Included" section which
should have been amended in c/s 7eaec00 when libxl Remus support was
introduced into the protocol.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Mon, 20 Jul 2015 10:37:57 +0000 (11:37 +0100)]
tools/libx{l, c}: Drop '2' suffixes from xc_domain_{save, restore}2() functions
As there is now only the one implementation.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
All handling of device model files is now at the libxl level. Remove
XC_DEVICE_MODEL_RESTORE_FILE and introduce LIBXL_DEVICE_MODEL_RESTORE_FILE in
its place.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Mon, 20 Jul 2015 10:37:55 +0000 (11:37 +0100)]
tools/libx{l, c}: Remove the toolstack_{save, restore} callbacks
Update the libxc spec to indicate more sternly that TOOLSTACK records
should no longer be used.
Also, trim further toolstack infrastructure which should have gone in
c/s 39bf4e9 "tools/libxl: Drop all knowledge of toolstack callbacks"
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
One complication is that xc_map_m2p() has users in xc_offline_page.c,
xen-mfndump and xen-mceinj. Move its implementation into
xc_offline_page (for want of a better location) beside its current
user.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- drop mentions of removed files from MAINTAINERS ]
Wei Liu [Mon, 27 Jul 2015 14:01:32 +0000 (15:01 +0100)]
libxl: check nestedhvm and altp2m in libxl
In ea214001 ("x86/altp2m: add altp2mhvm HVM domain parameter"), a
check was added to ensure nestedhvm and altp2m cannot be enabled at
the same time. That check was added in xl, but in fact it should be in
libxl because it should be the entity that decides whether
the provided configuration is valid.
This patch moves the check to libxl. The code snippet is moved after
calling libxl__domain_build_info_setdefault so that we can:
1. remove libxl_defbool_is_default in `if()';
2. detect mistakes in libxl__domain_build_info_setdefault.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
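The ordering argument in the nestedhvm/altp2m commit above (validate only after defaults have been filled in) can be sketched as follows. The tri-state type and all function names here are hypothetical stand-ins rather than the libxl definitions; the point is only that once defaults are applied the check needs no "is default" test, and an accessor that asserts on an unset value will expose a default-setting routine that missed a field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state boolean: unset, false or true. */
typedef enum {
    DEFBOOL_DEFAULT = 0,
    DEFBOOL_FALSE = -1,
    DEFBOOL_TRUE = 1
} defbool;

/* Hypothetical subset of a domain build configuration. */
struct build_info {
    defbool nested_hvm;
    defbool altp2m;
};

/* Fill in every field the caller left unset. */
static void build_info_setdefault(struct build_info *b)
{
    if (b->nested_hvm == DEFBOOL_DEFAULT)
        b->nested_hvm = DEFBOOL_FALSE;
    if (b->altp2m == DEFBOOL_DEFAULT)
        b->altp2m = DEFBOOL_FALSE;
}

/* Reading an unset value is a programming error, so assert on it. */
static bool defbool_val(defbool d)
{
    assert(d != DEFBOOL_DEFAULT);
    return d == DEFBOOL_TRUE;
}

/* Returns 0 if the configuration is valid, -1 otherwise. */
int validate_build_info(struct build_info *b)
{
    build_info_setdefault(b);

    /* Both values are now definite; no "is default" test is needed. */
    if (defbool_val(b->nested_hvm) && defbool_val(b->altp2m))
        return -1;
    return 0;
}
```

Placing the check in the library rather than in the command-line front end means every caller gets the same validation.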
Martin Lucina [Fri, 24 Jul 2015 15:29:41 +0000 (17:29 +0200)]
xenconsole: Ensure exclusive access to console using locks
If more than one instance of xenconsole is run against the same DOMID
then each instance will only get some data. This change ensures
exclusive access to the console by obtaining an exclusive lock on
<XEN_LOCK_DIR>/xenconsole.<DOMID>.
The locking strategy used is based on
tools/libxl/libxl_internal.c:libxl__lock_domain_userdata().
Signed-off-by: Martin Lucina <martin@lucina.net> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
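A simplified sketch of the kind of lock-file handling the xenconsole locking commit above describes is shown below. It is an illustration under assumptions, not the patch itself: console_lock() and console_unlock() are hypothetical names, the lock path is supplied by the caller, and flock(2) in non-blocking mode is assumed as the locking primitive. After taking the lock, the sketch compares the inode of the open descriptor with the inode currently at the path, so that a lock on a file that was unlinked in the meantime is not mistaken for a valid one.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns a locked file descriptor, or -1 if the lock is held
 * elsewhere or an error occurred. */
int console_lock(const char *path)
{
    for (;;) {
        struct stat fd_st, path_st;
        int fd = open(path, O_RDWR | O_CREAT, 0600);

        if (fd < 0)
            return -1;

        /* Exclusive, non-blocking: fail at once if someone holds it. */
        if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
            close(fd);
            return -1;
        }

        if (fstat(fd, &fd_st) < 0) {
            close(fd);
            return -1;
        }

        if (stat(path, &path_st) == 0) {
            if (fd_st.st_dev == path_st.st_dev &&
                fd_st.st_ino == path_st.st_ino)
                return fd;      /* we locked the file now at the path */
        } else if (errno != ENOENT) {
            close(fd);
            return -1;
        }

        /* The file was unlinked or replaced between open() and
         * flock(); our lock is on a stale inode, so try again. */
        close(fd);
    }
}

/* Remove the lock file while still holding the lock, then release. */
void console_unlock(int fd, const char *path)
{
    unlink(path);
    close(fd);
}
```

A second instance calling console_lock() on the same path fails immediately instead of silently sharing the console data with the first.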
Wei Liu [Sun, 26 Jul 2015 21:34:54 +0000 (22:34 +0100)]
libxc: fix memory leak in migration v2
Originally there was only one counter to keep track of pages. It was
used erroneously to keep track of how many pages were mapped and how
many pages needed to be sent. In the end munmap(2) always had 0 as the
length argument, which resulted in leaking the mapping.
This problem was discovered on a 32-bit toolstack because 32-bit
applications have a notably smaller address space. In fact, this bug
affects 64-bit toolstacks too.
Use a separate counter to keep track of the number of mapped pages to
solve this problem.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tamas K Lengyel [Fri, 24 Jul 2015 11:42:24 +0000 (13:42 +0200)]
xen-access: altp2m testcases
Working altp2m test-case. Extended the test tool to support singlestepping
to better highlight the core feature of altp2m view switching.
Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com> Signed-off-by: Ed White <edmund.h.white@intel.com> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Tamas K Lengyel [Fri, 24 Jul 2015 11:42:12 +0000 (13:42 +0200)]
libxc: add support to altp2m hvmops
Wrappers to issue altp2m hvmops.
Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Martin Lucina [Fri, 24 Jul 2015 11:30:48 +0000 (13:30 +0200)]
xenconsole: Allow non-interactive use
If xenconsole is run with stdin closed or redirected to /dev/null,
console_loop() will return immediately due to failure to read from
STDIN_FILENO. This patch tests if stdin and stdout are both connected to
a TTY and, if not, xenconsole will not attempt to read from stdin or
modify stdout terminal attributes.
Existing behaviour when xenconsole is run from a terminal does not
change.
This allows for non-interactive use, e.g. running "xl create -c" under
systemd or piping the output of "xl console" to another command.
Signed-off-by: Martin Lucina <martin@lucina.net> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
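The interactivity test described in the non-interactive xenconsole commit above can be expressed with isatty(3). The helper name below is hypothetical; the sketch only shows the decision, not the console loop that would act on it.

```c
#include <stdbool.h>
#include <unistd.h>

/* True only when both descriptors refer to a terminal. A caller
 * would skip reading from in_fd and leave out_fd's terminal
 * attributes alone when this returns false. */
bool console_is_interactive(int in_fd, int out_fd)
{
    return isatty(in_fd) == 1 && isatty(out_fd) == 1;
}
```

A caller would typically evaluate console_is_interactive(STDIN_FILENO, STDOUT_FILENO) once at start-up and branch on the result.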
Ravi Sahita [Fri, 24 Jul 2015 11:39:33 +0000 (13:39 +0200)]
x86/altp2m: XSM hooks for altp2m HVM ops
Signed-off-by: Ravi Sahita <ravi.sahita@intel.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Drop now bogus conditional expression from xsm_hvm_altp2mhvm_op()
invocation.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Ed White [Fri, 24 Jul 2015 11:38:28 +0000 (13:38 +0200)]
x86/altp2m: add altp2mhvm HVM domain parameter
The altp2mhvm and nestedhvm parameters are mutually
exclusive and cannot be set together.
Signed-off-by: Ed White <edmund.h.white@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Ed White [Fri, 24 Jul 2015 11:36:15 +0000 (13:36 +0200)]
x86/altp2m: add remaining support routines
Add the remaining routines required to support enabling the alternate
p2m functionality.
Signed-off-by: Ed White <edmund.h.white@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Fix off-by-one in various checks against MAX_ALTP2M. Adjust error code
in p2m_destroy_altp2m_by_id(). Cosmetic adjustments.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
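The log does not show which checks against MAX_ALTP2M were off by one, so the following is only a generic illustration of that class of bug. MAX_VIEWS and both function names are hypothetical; the value 10 merely mirrors the per-domain limit mentioned elsewhere in this series.

```c
#include <stdbool.h>

#define MAX_VIEWS 10   /* hypothetical table size */

/* Correct bound for indexing a table of MAX_VIEWS entries:
 * valid indices are 0 .. MAX_VIEWS - 1. */
bool view_index_valid(unsigned int idx)
{
    return idx < MAX_VIEWS;
}

/* Off-by-one variant: accepts idx == MAX_VIEWS, which is one past
 * the last element of the table. */
bool view_index_valid_off_by_one(unsigned int idx)
{
    return idx <= MAX_VIEWS;
}
```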
Ed White [Fri, 24 Jul 2015 11:34:46 +0000 (13:34 +0200)]
x86/altp2m: alternate p2m memory events
Add a flag to indicate that a memory event occurred in an alternate p2m
and a field containing the p2m index. Allow any event response to switch
to a different alternate p2m using the same flag and field.
Modify p2m_mem_access_check() to handle alternate p2m's.
Signed-off-by: Ed White <edmund.h.white@intel.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> for the x86 bits. Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Tamas K Lengyel <tlengyel@novetta.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
George Dunlap [Fri, 24 Jul 2015 11:30:44 +0000 (13:30 +0200)]
x86/altp2m: add control of suppress_ve
The existing ept_set_entry() and ept_get_entry() routines are extended
to optionally set/get suppress_ve. Passing -1 will set suppress_ve on
new p2m entries, or retain suppress_ve flag on existing entries.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Signed-off-by: Ravi Sahita <ravi.sahita@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Also adjust the caller in set_identity_p2m_entry().
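The -1 convention described in the suppress_ve commit above (set the flag on new entries, keep it unchanged on existing ones) can be sketched as below. struct entry and entry_set() are hypothetical stand-ins for illustration and do not reflect the EPT entry layout or the real ept_set_entry() signature.

```c
#include <stdbool.h>

/* Hypothetical, much-simplified stand-in for a p2m entry. */
struct entry {
    bool present;
    bool suppress_ve;
};

/* sve ==  1: set suppress_ve
 * sve ==  0: clear suppress_ve
 * sve == -1: set it on a new entry, retain it on an existing one */
void entry_set(struct entry *e, int sve)
{
    if (sve == -1) {
        if (!e->present)
            e->suppress_ve = true;
        /* existing entry: leave the flag as it is */
    } else {
        e->suppress_ve = (sve != 0);
    }
    e->present = true;
}
```

With this convention, callers that do not care about #VE can pass -1 everywhere and still get the safe default on freshly created entries.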
Ed White [Fri, 24 Jul 2015 11:29:18 +0000 (13:29 +0200)]
VMX/altp2m: add code to support EPTP switching and #VE
Implement and hook up the code to enable VMX support of VMFUNC and #VE.
VMFUNC leaf 0 (EPTP switching) emulation is added in a later patch.
Signed-off-by: Ed White <edmund.h.white@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Ed White [Fri, 24 Jul 2015 11:28:00 +0000 (13:28 +0200)]
x86/altp2m: basic data structures and support routines
Add the basic data structures needed to support alternate p2m's and
the functions to initialise them and tear them down.
Although Intel hardware can handle 512 EPTP's per hardware thread
concurrently, only 10 per domain are supported in this patch for
performance reasons.
This change also splits the p2m lock into one lock type for altp2m's
and another type for all other p2m's. The purpose of this is to place
the altp2m list lock between the types, so the list lock can be
acquired whilst holding the host p2m lock.
Signed-off-by: Ed White <edmund.h.white@intel.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Cosmetic adjustments.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
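The lock split described in the data-structures commit above relies on a fixed acquisition order: host p2m lock, then the altp2m list lock, then an altp2m lock. A minimal sketch of order checking by level follows. The level numbers, struct lock_tracker and both functions are hypothetical and show only the ordering rule, not actual locking and not Xen's lock-ordering implementation.

```c
/* Hypothetical levels; a lock may only be taken if its level is
 * strictly higher than that of the most recently taken lock. */
enum lock_level {
    LVL_HOST_P2M = 1,
    LVL_ALTP2M_LIST = 2,
    LVL_ALTP2M = 3
};

#define MAX_HELD 8

struct lock_tracker {
    int held[MAX_HELD];
    int depth;
};

/* Returns 0 if taking a lock at this level respects the order,
 * -1 if it would violate the order or the tracker is full. */
int ordered_acquire(struct lock_tracker *t, int level)
{
    if (t->depth > 0 && level <= t->held[t->depth - 1])
        return -1;
    if (t->depth == MAX_HELD)
        return -1;
    t->held[t->depth++] = level;
    return 0;
}

/* Release the most recently acquired lock. */
void ordered_release(struct lock_tracker *t)
{
    if (t->depth > 0)
        t->depth--;
}
```

Because the list lock's level sits between the two p2m lock types, it can be taken while the host p2m lock is held, whereas taking the host p2m lock while holding an altp2m lock is rejected.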
Ed White [Fri, 24 Jul 2015 11:26:02 +0000 (13:26 +0200)]
x86/HVM: hardware alternate p2m support detection
As implemented here, only supported on platforms with VMX HAP.
By default this functionality is force-disabled; it can be enabled
by specifying altp2m=1 on the Xen command line.
Signed-off-by: Ed White <edmund.h.white@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Ed White [Fri, 24 Jul 2015 11:25:29 +0000 (13:25 +0200)]
VMX: implement suppress #VE
In preparation for selectively enabling #VE in a later patch, set
suppress #VE on all EPTE's.
Suppress #VE should always be the default condition for two reasons:
it is generally not safe to deliver #VE into a guest unless that guest
has been modified to receive it; and even then for most EPT violations only
the hypervisor is able to handle the violation.
Signed-off-by: Ed White <edmund.h.white@intel.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Ed White [Fri, 24 Jul 2015 11:24:51 +0000 (13:24 +0200)]
VMX: VMFUNC and #VE definitions and detection
Currently, neither is enabled globally but may be enabled on a per-VCPU
basis by the altp2m code.
Remove the check for EPTE bit 63 == zero in ept_split_super_page(), as
that bit is now hardware-defined.
Signed-off-by: Ed White <edmund.h.white@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Jun Nakajima <jun.nakajima@intel.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>