Karim Raslan [Thu, 26 Jun 2014 11:28:23 +0000 (12:28 +0100)]
mini-os: switched initial C entry point to arch_init
Signed-off-by: Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
[talex5@gmail.com: separated from big ARM commit]
[talex5@gmail.com: restored comment, moved prototypes to headers] Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[talex5@gmail.com: restored stack address printk on x86]
[talex5@gmail.com: moved first printk's after start_info setup on x86] Signed-off-by: Thomas Leonard <talex5@gmail.com>
Thomas Leonard [Thu, 26 Jun 2014 11:28:22 +0000 (12:28 +0100)]
mini-os: made off_t type signed
POSIX requires this.
Signed-off-by: Thomas Leonard <talex5@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Thomas Leonard [Thu, 26 Jun 2014 11:28:21 +0000 (12:28 +0100)]
mini-os: use unbind_evtchn in unbind_all_ports
This marks the channel as closed, in case someone tries to use it again.
Signed-off-by: Thomas Leonard <talex5@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Thomas Leonard [Thu, 26 Jun 2014 11:28:20 +0000 (12:28 +0100)]
mini-os: fixed format string error in unbind_evtchn
Would crash if HYPERVISOR_event_channel_op returned an error code.
The other changes in this commit are just fixing indentation.
Signed-off-by: Thomas Leonard <talex5@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Thomas Leonard [Thu, 26 Jun 2014 11:28:19 +0000 (12:28 +0100)]
mini-os: fixed shutdown thread
Before, it read "" and started a shutdown immediately. Now, it waits for
a non-empty value and then actually shuts down.
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[talex5@gmail.com: avoid declaration-after-statement in kernel.c] Signed-off-by: Thomas Leonard <talex5@gmail.com>
Thomas Leonard [Thu, 26 Jun 2014 11:28:18 +0000 (12:28 +0100)]
mini-os: build fixes
Make .o rules depend on the includes. Before, only the final link step
depended on setting up the includes directory, making parallel builds
unreliable.
Make symlinks use explicit make rules instead of using a phony target.
Avoids unnecessary rebuilds.
[talex5@gmail.com: bring back "make links", for stubdom] Signed-off-by: Thomas Leonard <talex5@gmail.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Wei Liu [Fri, 20 Jun 2014 16:19:37 +0000 (18:19 +0200)]
libxl/xl: push VCPU affinity pinning down to libxl
This patch introduces an array of libxl_bitmap called "vcpu_hard_affinity"
in libxl IDL to preserve VCPU to PCPU mapping. This is necessary for libxl
to preserve all information to construct a domain.
The array accommodates at most max_vcpus elements, each containing the
affinity of the respective VCPU. If fewer than max_vcpus bitmaps are
present, the VCPUs associated to the missing elements will just stay with
their default affinity (they'll be free to execute on every PCPU).
If both this new field and the already existing cpumap field are
used, the content of the array will override what's set in cpumap. (xl
makes sure that this never happens, by using only one of the two at
any given time.)
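A minimal sketch of the selection rule described above, using a 64-bit mask as a stand-in for libxl_bitmap. The names cpu_mask, ALL_PCPUS and pick_hard_affinity are hypothetical, not libxl API, and falling back to cpumap for VCPUs beyond the end of the array is an assumption of this sketch rather than something the commit message states:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for libxl_bitmap: one bit per PCPU. */
typedef uint64_t cpu_mask;
#define ALL_PCPUS UINT64_MAX

/* Pick the hard affinity of one VCPU from the two optional sources. */
cpu_mask pick_hard_affinity(size_t vcpu,
                            const cpu_mask *per_vcpu, size_t n_per_vcpu,
                            const cpu_mask *cpumap)
{
    /* An element of the per-VCPU array overrides cpumap. */
    if (per_vcpu != NULL && vcpu < n_per_vcpu)
        return per_vcpu[vcpu];

    /* Assumption: with no array element, use cpumap if it is set. */
    if (cpumap != NULL)
        return *cpumap;

    /* Neither source present: free to run on every PCPU. */
    return ALL_PCPUS;
}
```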
The proper macro to mark the API change (called
LIBXL_HAVE_BUILDINFO_VCPU_AFFINITY_ARRAYS) is added but commented out.
It will be uncommented by the patch in the series that completes the
process, by adding the "vcpu_soft_affinity" array. This is because
these two fields are being added more or less together, and are very
similar in both meaning and usage, so it makes sense for them to
share the same marker.
This patch was originally part of Wei's series about pushing as much
information as possible on domain configuration in libxl, rather than
xl. See here, for more details:
http://lists.xen.org/archives/html/xen-devel/2014-06/msg01026.html
http://lists.xen.org/archives/html/xen-devel/2014-06/msg01031.html
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Fri, 20 Jun 2014 16:19:29 +0000 (18:19 +0200)]
libxl: Change default for b_info->{cpu, node}map to "not allocated"
by avoiding allocating them in libxl__domain_build_info_setdefault.
In fact, back in 7e449837 ("libxl: provide _init and _setdefault for
libxl_domain_build_info") and a5d30c23 ("libxl: allow for explicitly
specifying node-affinity"), it was decided that the default for these
fields was for them to be allocated and filled.
That is now causing problems whenever we have to figure out whether
the caller is using one of those fields. In fact, when we see
a full bitmap, is it just the default value, or is it the user who
wants it that way?
Since that kind of knowledge has become important, change the default
to be "bitmap not allocated". It then becomes easy to know whether a
libxl caller is using one of the fields, just by checking whether the
bitmap is actually there with a non-zero size.
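The check this enables can be sketched as follows. The struct below is a hypothetical stand-in for a size-plus-storage bitmap, and bitmap_set_by_caller is an illustrative name, not a libxl function:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bitmap: storage pointer plus its size in bytes. */
struct bitmap {
    uint32_t size;
    uint8_t *map;
};

/*
 * With "not allocated" as the default, a bitmap that is present with a
 * non-zero size can only have come from the caller.
 */
bool bitmap_set_by_caller(const struct bitmap *b)
{
    return b->map != NULL && b->size > 0;
}
```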
This is very important for the following patches introducing new ways
of specifying hard and soft affinity. It also allows us to improve
the checks around NUMA automatic placement, during domain creation
(and that bit is done in this very patch).
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Fri, 20 Jun 2014 16:19:12 +0000 (18:19 +0200)]
libxl: get and set soft affinity
Make space for a new cpumap in vcpu_info, called cpumap_soft,
for retrieving soft affinity, and amend the relevant API
accordingly.
libxl_set_vcpuaffinity() now takes two cpumaps, one for hard
and one for soft affinity (LIBXL_API_VERSION is exploited to
retain source level backward compatibility). Either of the
two cpumaps can be NULL, in which case only the affinity
corresponding to the non-NULL cpumap will be affected.
Getting soft affinity happens indirectly (see, e.g.,
`xl vcpu-list'), as it already does for hard affinity.
This commit also introduces some logic to check whether the
affinity which will be used by Xen to schedule the vCPU(s)
does actually match with the cpumaps provided. In fact, we
want to allow every possible combination of hard and soft
affinity to be set, but we warn the user upon particularly
weird situations (e.g., hard and soft being disjoint sets
of pCPUs).
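A rough sketch of the two behaviours described above: a NULL cpumap leaves that affinity untouched, and disjoint hard and soft sets are detected so a warning can be issued. All names here (cpu_mask, vcpu_affinity, set_affinity, affinity_is_disjoint) are hypothetical and do not reflect the real libxl signatures:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cpu_mask;               /* one bit per pCPU */

struct vcpu_affinity {
    cpu_mask hard;
    cpu_mask soft;
};

/* Update only the affinity whose cpumap argument is non-NULL. */
void set_affinity(struct vcpu_affinity *v,
                  const cpu_mask *hard, const cpu_mask *soft)
{
    if (hard != NULL)
        v->hard = *hard;
    if (soft != NULL)
        v->soft = *soft;
}

/* True for the "weird" case where hard and soft share no pCPU. */
bool affinity_is_disjoint(const struct vcpu_affinity *v)
{
    return (v->hard & v->soft) == 0;
}
```

The combination is still accepted in this sketch; the predicate only tells the caller when a warning would be appropriate.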
This very change also updates the error handling for calls
to libxl_set_vcpuaffinity() in xl, as that can now be any
libxl error code, not just -1.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Fri, 20 Jun 2014 16:19:01 +0000 (18:19 +0200)]
libxc: get and set soft and hard affinity
by using the flag and the new cpumap arguments introduced in
the parameters of the DOMCTL_{get,set}_vcpuaffinity hypercalls.
Now, both xc_vcpu_setaffinity() and xc_vcpu_getaffinity() have
a new flag parameter, to specify whether the user wants to
set/get hard affinity, soft affinity or both. They also have
two cpumap parameters instead of only one. This way, it is
possible to set/get both hard and soft affinity at the same
time (and, in case of set, each one to its own value).
In xc_vcpu_setaffinity(), the cpumaps are IN/OUT parameters,
as it is for the corresponding arguments of the
DOMCTL_set_vcpuaffinity hypercall. What Xen puts there is the
hard and soft effective affinity, that is what Xen will actually
use for scheduling.
In-tree callers are also fixed to cope with the new interface.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Thu, 26 Jun 2014 08:53:42 +0000 (09:53 +0100)]
xen: arm: take FIQ exceptions to Xen not guest by setting HCR_EL2.FMO
As with HCR_EL2.{IMO,AMO} we want to route FIQs to Xen not the guest. See ARM
ARM DDI 0406C.b B1.8.4.
So far none of the platforms which we support use FIQ for anything, but when we
end up supporting one it would be far better to surprise Xen with them than
whatever guest happens to be running...
Jan Beulich [Wed, 25 Jun 2014 12:42:15 +0000 (14:42 +0200)]
VT-d/qinval: eliminate redundant locking
The qinval-specific lock would only ever get used with the IOMMU's
register lock already held. Along with dropping the lock also drop
another unused field from struct qi_ctrl.
Furthermore the gen_*_dsc() helpers become pretty pointless with the
lock dropped - being each used only in a single place, simply fold
them into their callers.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Wed, 25 Jun 2014 12:40:34 +0000 (14:40 +0200)]
x86/HVM: consolidate and sanitize CR4 guest reserved bit determination
First of all, this is needed by just a single source file, so it gets
moved there instead of getting fed to the compiler for most other
source files too. With that it becomes sensible for this to no longer
be a macro, allowing elimination of the mostly redundant helpers
hvm_vcpu_has_{smep,smap}(). And finally, following the model SMEP and
SMAP already used, tie the determination of reserved bits to the
features the guest is shown rather than the host's.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 18 Jun 2014 18:04:14 +0000 (19:04 +0100)]
tools/libxl: Fix free() of wild pointer in libxl__initiate_device_remove()
libxl__initiate_device_remove() had a preexisting error path issue where
libxl_dominfo_dispose() could be called on a libxl_dominfo object before it
had been initialised with libxl_dominfo_init().
This was safe until c/s ab44401 added the pointer ssid_label, which
libxl_dominfo_dispose() free()s.
Unconditionally initialise info in libxl__initiate_device_remove() before
taking an error path which will free it.
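The pattern can be illustrated with a small stand-alone sketch. The struct and functions below are hypothetical stand-ins for libxl_dominfo and its init/dispose pair, not the libxl code itself:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a structure owning a heap pointer. */
struct info {
    char *label;
};

void info_init(struct info *i)    { memset(i, 0, sizeof(*i)); }
void info_dispose(struct info *i) { free(i->label); i->label = NULL; }

int remove_device(int fail_early)
{
    struct info info;
    int rc = 0;

    /* Initialise unconditionally, before any path can reach "out". */
    info_init(&info);

    if (fail_early) {
        rc = -1;
        goto out;
    }

    info.label = malloc(16);
    if (info.label == NULL)
        rc = -1;

out:
    /* Safe on every path: label is either NULL or a valid allocation. */
    info_dispose(&info);
    return rc;
}
```

Had info_init() been called only after the early exit, the dispose call on that path would pass an indeterminate pointer to free().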
Coverity-ID: 1223212 Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Andrew Cooper [Wed, 18 Jun 2014 17:44:44 +0000 (18:44 +0100)]
tools/libxc: Fix missing break in xc_domain_bind_pt_irq()
c/s 568da4f8 "pt-irq fixes and improvements" accidentally forgot a break when
refactoring xc_domain_bind_pt_irq() which results in bind->u.pci.bus being
clobbered by isa_irq for PCI and MSI_TRANSLATE interrupts.
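A simplified illustration of why the missing break matters when the fields live in a union. The types and names below are hypothetical and only loosely modelled on the description above:

```c
#include <stdint.h>

enum irq_type { IRQ_PCI, IRQ_MSI_TRANSLATE, IRQ_ISA };

struct bind {
    enum irq_type type;
    union {
        struct { uint8_t bus, device, intx; } pci;
        struct { uint8_t isa_irq; } isa;   /* shares storage with pci.bus */
    } u;
};

void fill_bind(struct bind *b, enum irq_type type, uint8_t bus,
               uint8_t device, uint8_t intx, uint8_t isa_irq)
{
    b->type = type;
    switch (type) {
    case IRQ_PCI:
    case IRQ_MSI_TRANSLATE:
        b->u.pci.bus = bus;
        b->u.pci.device = device;
        b->u.pci.intx = intx;
        break;  /* without this, the ISA case below overwrites pci.bus */
    case IRQ_ISA:
        b->u.isa.isa_irq = isa_irq;
        break;
    }
}
```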
Coverity-ID: 1223210 Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Jan Beulich <JBeulich@suse.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
Dario Faggioli [Fri, 20 Jun 2014 14:09:00 +0000 (16:09 +0200)]
blktap2: Fix two 'maybe uninitialized' variables
about which gcc 4.9.0 complains, like this:
block-qcow.c: In function `get_cluster_offset':
block-qcow.c:431:3: error: `tmp_ptr' may be used uninitialized in this function
[-Werror=maybe-uninitialized]
memcpy(tmp_ptr, l1_ptr, 4096);
^
block-qcow.c:606:7: error: `tmp_ptr2' may be used uninitialized in this
function [-Werror=maybe-uninitialized]
if (write(s->fd, tmp_ptr2, 4096) != 4096) {
^
cc1: all warnings being treated as errors
/home/dario/Sources/xen/xen/xen.git/tools/blktap2/drivers/../../../tools/Rules.mk:89:
recipe for target 'block-qcow.o' failed
make[5]: *** [block-qcow.o] Error 1
The proper behavior is to return upon allocation failure.
About what to return, 0 seems the best option, looking
at both the function and the call sites.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Yang Hongyang [Fri, 20 Jun 2014 06:59:34 +0000 (14:59 +0800)]
libxl: Rewind toolstack_save_fd in libxl_save_helper when using remus
Commit b327a3f421bb57d262b7d1fb3c43b710852b103b moved the rewinding of
toolstack_save_fd to libxl. This breaks remus, because in remus mode,
toolstack_save_cb will be called in every checkpoint, and if we don't
rewind it in libxl_save_helper, it will surely fail.
This fix is just a hack: in fact the whole toolstack save thing should
be done in libxl. But for now (until migration v2) this fix should
solve both remus and Jason Andryuk's use case.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Tested-by: Jason Andryuk <andryuk@aero.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Julien Grall [Mon, 23 Jun 2014 13:27:58 +0000 (14:27 +0100)]
libxc: Fix xc_mem_event.c compilation for ARM
The commit 6ae2df9 "mem_access: Add helper API to setup ring and enable
mem_access" breaks libxc compilation for ARM.
This is because xc_map_foreign_map and xc_domain_decrease_reservation_exact
take a xen_pfn_t in parameters. On ARM, xen_pfn_t is always a uint64_t.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Cc: Aravindh Puthiyaparambil <aravindp@cisco.com> Cc: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
mem_access: Add helper API to setup ring and enable mem_access
tools/libxc: Add helper function to setup ring for mem events
This patch adds a helper function that maps the ring, enables mem_event
and removes the ring from the guest physmap while the domain is paused.
This can be used by all mem_events but is only enabled for mem_access at
the moment.
tests/xen-access: Use helper API to setup ring and enable mem_access
Prior to this patch, xen-access was setting up the ring page in a way
that would give a malicious guest a window to write into the shared ring
page. This patch fixes this by using the helper API that does it safely
on behalf of xen-access.
This is XSA-99.
Signed-off-by: Aravindh Puthiyaparambil <aravindp@cisco.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 20 Jun 2014 12:47:55 +0000 (14:47 +0200)]
VT-d/qinval: clean up error handling
- neither qinval_update_qtail() nor qinval_next_index() can fail: make
the former return "void", and drop caller error checks for the latter
(all of which would otherwise return with a spin lock still held)
- or-ing together error codes is a bad idea
At once drop bogus initializers.
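To see why OR-ing error codes is a bad idea, consider this small sketch. The ERR_* constants are hypothetical and merely mirror common errno values; combine_or and combine_first are illustrative helpers, not code from the patch:

```c
/* Hypothetical error numbers, mirroring common errno values. */
#define ERR_NOMEM 12
#define ERR_INVAL 22

/* Bitwise OR of two negative codes yields an unrelated value. */
int combine_or(int a, int b)
{
    return a | b;
}

/* One alternative: report the first failure, ignore later ones. */
int combine_first(int a, int b)
{
    return a != 0 ? a : b;
}
```

With two's complement integers, combine_or(-22, -12) evaluates to -2, which is neither of the original errors, whereas combine_first() always returns a code that was actually produced.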
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Yang Zhang <yang.z.zhang@intel.com>
Roger Pau Monné [Fri, 20 Jun 2014 08:38:07 +0000 (10:38 +0200)]
x86/PVH: allow guest_remove_page to remove p2m_mmio_direct pages
If a guest tries to do a foreign/grant mapping in a memory region
marked as p2m_mmio_direct Xen will complain with the following
message:
(XEN) memory.c:241:d0v0 Bad page free for domain 0
The mapping will succeed nonetheless. This is especially problematic for PVH
Dom0, in which we map all the e820 holes and memory up to 4GB as
p2m_mmio_direct.
In order to deal with it, add a special casing for p2m_mmio_direct
regions in guest_remove_page if the domain is a hardware domain, that
calls clear_mmio_p2m_entry in order to remove the mappings.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Fri, 20 Jun 2014 08:26:37 +0000 (10:26 +0200)]
VT-d: drop redundant calls to invalidate_sync()
The call tree iommu_flush_iec_index() -> __iommu_flush_iec() already
invokes invalidate_sync(). Removing the superfluous instances at once
allows the function to become static.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Yang Zhang <yang.z.zhang@intel.com>
Roger Pau Monne [Mon, 2 Jun 2014 15:08:23 +0000 (17:08 +0200)]
build: export linker emulation parameter to SeaBIOS
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Mon, 2 Jun 2014 15:08:22 +0000 (17:08 +0200)]
hvmloader: remove size_t typedef and include stddef.h
The open coded typedef of size_t was clashing with the typedef in
FreeBSD headers. Remove the typedef and include the proper header
where size_t is defined (stddef.h).
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Mon, 2 Jun 2014 15:08:20 +0000 (17:08 +0200)]
libxl: only include utmp.h if it's present
Add a configure check for utmp.h presence, and gate the usage of
utmp.h in libxl to the result of the test.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- resolved minor conflict in configure.ac and reran autogen ]
Roger Pau Monne [Mon, 2 Jun 2014 15:08:14 +0000 (17:08 +0200)]
xenstored: add FreeBSD xenstored device paths
Add the path to FreeBSD special xenstored device, this is all that's
needed to get xenstored working on FreeBSD after the unification of
the implementations.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Mon, 2 Jun 2014 15:08:12 +0000 (17:08 +0200)]
libxc: remove broken endianness gate on lz4 decompressor
The lz4 decompressor had wrongly implemented a gate between
little-endian and big-endian versions of get_unaligned_le{16/32},
which turns out to be broken on all architectures supported by Xen,
because __LITTLE_ENDIAN is not defined. Instead of trying to fix
this, just implement the little-endian version and remove the switch.
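For illustration, little-endian unaligned loads can be written so that they need no endianness gate at all. This byte-wise variant is a sketch and not necessarily the form the patch uses; only the function names are taken from the text above:

```c
#include <stdint.h>

/* Read a 16-bit little-endian value from a possibly unaligned address. */
static inline uint16_t get_unaligned_le16(const void *p)
{
    const uint8_t *b = p;
    return (uint16_t)(b[0] | (b[1] << 8));
}

/* Read a 32-bit little-endian value from a possibly unaligned address. */
static inline uint32_t get_unaligned_le32(const void *p)
{
    const uint8_t *b = p;
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Because the bytes are assembled explicitly, the result is the same on any host byte order.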
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Mon, 2 Jun 2014 15:08:11 +0000 (17:08 +0200)]
libxc: add support for FreeBSD
Add the FreeBSD implementation of the privcmd and evtchn devices
interface.
The evtchn device interface is the same as the Linux one, while the
privcmd map interface is simplified because FreeBSD only supports
IOCTL_PRIVCMD_MMAPBATCH.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Mon, 2 Jun 2014 15:08:10 +0000 (17:08 +0200)]
include: import FreeBSD headers for evtchn and privcmd devices
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Mon, 2 Jun 2014 15:08:09 +0000 (17:08 +0200)]
configure: disable ROMBIOS if qemu-trad is disabled
ROMBIOS only works with qemu-traditional, so if it is disabled,
disable ROMBIOS also.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- ran autogen.sh ]
Roger Pau Monne [Mon, 2 Jun 2014 15:08:08 +0000 (17:08 +0200)]
configure: disable qemu-trad on FreeBSD systems by default
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- ran autogen.sh ]
Fabio Fantoni [Tue, 27 May 2014 15:01:39 +0000 (17:01 +0200)]
libxl: disable usbredirection if spice is disabled
Currently, if usbredirection is enabled in a domU's xl cfg, it is added
even if spice is disabled, in which case usbredirection remains unused.
With this patch, if usbredirection is enabled but spice is not,
usbredirection is disabled and a warning is shown.
Andrew Cooper [Wed, 18 Jun 2014 12:57:58 +0000 (13:57 +0100)]
tools/libxc: rename pfn_to_mfn to xc_pfn_to_mfn
Also refactor the contents of xc_pfn_to_mfn(). It is functionally identical,
but contains less lisp, fewer magic numbers, and more description of why 32bit
guests are treated differently.
Note that this does not affect pfn_to_mfn() in xc_domain_save.c. That was
already a macro which aliased pfn_to_mfn() in xg_private.h but without
actually using it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Introduce vgic_rank_irq: a new helper function that gives you the struct
vgic_irq_rank corresponding to a given irq number.
Use it in vgic_vcpu_inject_irq.
Andrew Cooper [Wed, 11 Jun 2014 18:31:55 +0000 (19:31 +0100)]
tools/pygrub: Fix extlinux when /boot is a separate partition from /
Grub and Grub2 already cope with this.
Reported-by: Joseph Hom <jhom@softlayer.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Tue, 17 Jun 2014 09:32:21 +0000 (10:32 +0100)]
xl / libxl: push parsing of SSID and CPU pool ID down to libxl
This patch pushes parsing of "init_seclabel", "seclabel",
"device_model_stubdomain_seclabel" and "pool" down to libxl level.
Originally the parsing was done at xl level, which is not ideal because
libxl won't have the truly relevant information. With this patch libxl
holds important information by itself.
The libxl IDL is extended to hold the string of labels and pool name.
If those strings are present they take precedence over the
numeric representations.
As all relevant structures (libxl_dominfo etc) have a field called
X_name / X_label now, a string is also copied there so that callers
won't have to do ID to name / label translation.
In order to be compatible with users of older versions of libxl, this
patch also defines LIBXL_HAVE_SSID_LABEL and LIBXL_HAVE_CPUPOOL_NAME. If
they are defined, the respective strings are available. And if those
strings are not NULL, libxl will do the parsing and ignore the numeric
values.
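The precedence rule can be sketched as below. The pool table, its entries and resolve_poolid are all hypothetical; the real lookup in libxl is not shown in this log:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical name-to-ID table standing in for a real lookup. */
struct pool { const char *name; uint32_t id; };
static const struct pool pools[] = {
    { "Pool-0", 0 },
    { "Pool-node1", 1 },
};

/*
 * If a name is given it is parsed and the numeric value is ignored;
 * otherwise the numeric value is used as-is. Returns 0 on success.
 */
int resolve_poolid(const char *pool_name, uint32_t poolid, uint32_t *out)
{
    if (pool_name != NULL) {
        for (size_t i = 0; i < sizeof(pools) / sizeof(pools[0]); i++) {
            if (strcmp(pools[i].name, pool_name) == 0) {
                *out = pools[i].id;
                return 0;
            }
        }
        return -1;  /* unknown name: do not fall back to the number */
    }
    *out = poolid;
    return 0;
}
```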
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Dario Faggioli <dario.faggioli@citrix.com> Cc: Juergen Gross <jgross@suse.com> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Tue, 17 Jun 2014 20:44:28 +0000 (21:44 +0100)]
xen/arm: Panic when we receive an unexpected trap
The current implementation of do_unexpected_trap makes Xen spin forever
on the current physical CPU. This may stall guest VCPUs and print
unhelpful messages (RCU stall...).
Usually when Xen receives an unexpected trap, it means that something has
gone wrong either in the hypervisor or in the CPU. In this case we should
panic directly, which also stops the other CPUs.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 17 Jun 2014 17:26:18 +0000 (18:26 +0100)]
tools/python: Remove some legacy scripts
Nothing in scripts/ is referenced by the current Xen build system. It is a
legacy version of the XenAPI bindings, other parts of which have already been
removed from the tree.
Additionally, prevent the install target from creating an $(SBINDIR) directory
but putting nothing in it. This appears to be something missed when removing
Xend.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Wed, 18 Jun 2014 13:53:27 +0000 (15:53 +0200)]
x86/EFI: allow FPU/XMM use in runtime service functions
UEFI spec update 2.4B developed a requirement to enter runtime service
functions with CR0.TS (and CR0.EM) clear, thus making feasible the
already previously stated permission for these functions to use some of
the XMM registers. Enforce this requirement (along with the connected
ones on FPU control word and MXCSR) by going through a full FPU save
cycle (if the FPU was dirty) in efi_rs_enter() (along with loading the
specified values into the other two registers).
Note that the UEFI spec mandates that extension registers other than
XMM ones (for our purposes all that get restored eagerly) are preserved
across runtime function calls, hence there's nothing we need to restore
in efi_rs_leave() (they do get saved, but just for simplicity's sake).
Roger Pau Monné [Wed, 18 Jun 2014 13:52:25 +0000 (15:52 +0200)]
x86: prevent PVH Dom0 from having pages with more than one ref
On PV guests a reference is taken when a page gets added to the page
tables, which makes pages added to the page tables have two
references, but this is not suitable for PVH that doesn't use the
PVMMU. In the PVH case only one reference has to be taken or else the
page would not be freed when the memory of the domain is decreased.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 18 Jun 2014 13:51:28 +0000 (15:51 +0200)]
x86/mce: sanitise the #MC entry path
The 'error_code' function parameter is not used at all; drop it from the
call chain. If it is needed at some point in the future, it is available via
cpu_user_regs.
Having do_machine_check() call the non-inlineable machine_check_vector() just
to get at the static function pointer '_machine_check_vector' is silly. Move
do_machine_check() from traps.c to mce.c and do away with
machine_check_vector() entirely.
Both {intel,amd}_init_mce() register their own local function as the #MC
handler, each of which calls mcheck_cmn_handler() in an identical way. Fix
this craziness by actually turning mcheck_cmn_handler() into a valid #MC
handler (as its comments already state), and have {intel,amd}_init_mce()
register it instead of their own private handlers.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Christoph Egger <chegger@amazon.de>
Malcolm Crossley [Wed, 18 Jun 2014 13:50:02 +0000 (15:50 +0200)]
IOMMU: prevent VT-d device IOTLB operations on wrong IOMMU
PCIe ATS allows for devices to contain IOTLBs, the VT-d code was iterating
around all ATS capable devices and issuing IOTLB operations for all IOMMUs,
even though each ATS device is only accessible via one particular IOMMU.
Issuing an IOMMU operation to a device not accessible via that IOMMU results
in an IOMMU timeout because the device does not reply. VT-d IOMMU timeouts
result in a Xen panic.
Therefore this bug prevents any Intel system with 2 or more ATS enabled IOMMUs,
each with an ATS device connected to them, from booting Xen.
The patch adds an IOMMU pointer to the ATS device struct so the VT-d code can
ensure it does not issue IOMMU ATS operations on the wrong IOMMU. A void
pointer has to be used because AMD and Intel IOMMU implementations do not have
a common IOMMU structure or indexing mechanism.
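The filtering idea can be sketched as follows. The ats_dev struct and select_ats_devs are hypothetical; they only show how an opaque IOMMU pointer stored per device lets the caller skip devices behind a different IOMMU:

```c
#include <stddef.h>

/* Hypothetical ATS device record with an opaque owner pointer. */
struct ats_dev {
    int id;
    void *iommu;    /* void * because there is no common IOMMU type */
};

/*
 * Collect the IDs of devices reachable through the given IOMMU.
 * out_ids must have room for n entries. Returns the number selected.
 */
size_t select_ats_devs(const struct ats_dev *devs, size_t n,
                       const void *iommu, int *out_ids)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].iommu != iommu)
            continue;               /* device sits behind another IOMMU */
        out_ids[count++] = devs[i].id;
    }
    return count;
}
```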
Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
xen/arm: gic_events_need_delivery and irq priorities
Introduce GIC_IRQ_GUEST_ACTIVE to track which irqs are currently
active in the guest.
gic_events_need_delivery should only return positive if an outstanding
pending irq has a higher group priority than the currently active group
priority and the priority mask.
Read GICH_APR to find the active group priority.
Read GICH_VMCR to find the priority mask.
Find the highest priority non-active enabled irq by going through the
inflight list.
In gic_restore_pending_irqs replace lower priority pending (and not
active) irqs in GICH_LRs with higher priority irqs if no more GICH_LRs
are available.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
GICH_LR registers and GICH_VMCR only support 5 bits for guest irq
priorities.
Introduce a macro to reduce the 8-bit priority fields to 5 bits; use it
in gic.c.
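One plausible shape for such a macro is shown below. The name PRIORITY_8_TO_5 is hypothetical, and keeping the five most significant bits is an assumption of this sketch; the macro actually introduced in gic.c may differ:

```c
#include <stdint.h>

/*
 * Hypothetical helper: keep the 5 most significant bits of an 8-bit
 * priority, so the relative ordering of priorities is preserved.
 */
#define PRIORITY_8_TO_5(pri) ((uint8_t)(((pri) >> 3) & 0x1f))
```

Under this assumption, priorities that differ only in their low three bits collapse to the same 5-bit value.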
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: don't protect GICH and lr_queue accesses with gic.lock
GICH is banked, protect accesses by disabling interrupts.
Protect lr_queue accesses with the vgic.lock only.
gic.lock only protects accesses to GICD now.
xen/arm: second irq injection while the first irq is still inflight
Set GICH_LR_PENDING in the corresponding GICH_LR to inject a second irq
while the first one is still active.
If the first irq is already pending (not active), clear
GIC_IRQ_GUEST_QUEUED because the guest doesn't need a second
notification. If the irq has already been EOI'ed then just clear the
GICH_LR right away and move the interrupt to lr_pending so that it is
going to be reinjected by gic_restore_pending_irqs on return to guest.
If the target cpu is not the current cpu, then set GIC_IRQ_GUEST_QUEUED
and send an SGI. The target cpu is going to be interrupted and call
gic_clear_lrs, that is going to take the same actions.
Do not call vgic_vcpu_inject_irq from gic_inject if
evtchn_upcall_pending is set. If we remove that call, we don't need to
special case evtchn_irq in vgic_vcpu_inject_irq anymore.
We need to force the first injection of evtchn_irq (call
gic_vcpu_inject_irq) from vgic_enable_irqs because evtchn_upcall_pending
is already set by common code on vcpu creation.
A later patch is going to use uint8_t to keep track of LRs.
Neither GICv3 nor GICv2 needs more than a uint8_t to keep track
of the number of LRs.
xen/arm: support HW interrupts, do not request maintenance_interrupts
If the irq to be injected is a hardware irq (p->desc != NULL), set
GICH_LR_HW. Do not set GICH_LR_MAINTENANCE_IRQ.
Remove the code to EOI a physical interrupt on behalf of the guest
because it has become unnecessary.
Introduce a new function, gic_clear_lrs, that goes over the GICH_LR
registers, clears the invalid ones and frees the corresponding interrupts
from the inflight queue if appropriate. Add the interrupt to lr_pending
if the GIC_IRQ_GUEST_PENDING is still set.
Call gic_clear_lrs on entry to the hypervisor if we are coming from
guest mode to make sure that the calculation in Xen of the highest
priority interrupt currently inflight is correct and accurate and not
based on stale data.
In vgic_vcpu_inject_irq, if the target is a vcpu running on another
pcpu, we are already sending an SGI to the other pcpu so that it would
pick up the new IRQ to inject. Now also send an SGI to the other pcpu
even if the IRQ is already inflight, so that it can clear the LR
corresponding to the previous injection as well as injecting the new
interrupt.
xen/arm: set GICH_HCR_UIE if all the LRs are in use
On return to guest, if there are no free LRs and we still have more
interrupts to inject, set GICH_HCR_UIE so that we are going to receive a
maintenance interrupt when no pending interrupts are present in the LR
registers.
The maintenance interrupt handler won't do anything anymore, but
receiving the interrupt is going to cause gic_inject to be called on
return to guest that is going to clear the old LRs and inject new
interrupts.
xen/arm: no need to set HCR_VI when using the vgic to inject irqs
HCR_VI forces the guest to resume execution in IRQ mode and can actually
cause spurious interrupt injections.
The GIC is capable of injecting interrupts into the guest and causing it
to switch to IRQ mode automatically, without any need for the hypervisor
to set HCR_VI manually.
See ARM ARM B1.8.11 and chapter 5.4 of the Generic Interrupt Controller
Architecture Specification.
Olaf Hering [Tue, 17 Jun 2014 08:44:40 +0000 (10:44 +0200)]
libxl: properly set default of discard_enable
Initialize discard_enable properly. This avoids a crash if a
libxl_device_disk with an uninitialized discard_enable is passed to
device_disk_add. Up to now only xl initialized discard_enable in its
config parser. External users of libxl, such as libvirt, do not need to
provide a default value.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Mon, 16 Jun 2014 10:13:25 +0000 (12:13 +0200)]
sched: DOMCTL_*vcpuaffinity works with hard and soft affinity
by adding a flag for the caller to specify which one he cares about.
At the same time, enable the caller to get back the "effective affinity"
of the vCPU. That is the intersection between cpupool's cpus, the (new)
hard affinity and, for soft affinity, the (new) soft affinity. In fact,
despite what has been successfully set with the DOMCTL_setvcpuaffinity
hypercall, the Xen scheduler will never run a vCPU outside of its hard
affinity or of its domain's cpupool.
This happens by adding another cpumap to the interface and making both
the cpumaps IN/OUT parameters (for DOMCTL_setvcpuaffinity, they're of
course out-only for DOMCTL_getvcpuaffinity).
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Dario Faggioli [Mon, 16 Jun 2014 10:13:03 +0000 (12:13 +0200)]
derive NUMA node affinity from hard and soft CPU affinity
if a domain's NUMA node-affinity (which is what controls
memory allocations) is provided by the user/toolstack, it
just is not touched. However, if the user does not say
anything, leaving it all to Xen, let's compute it in the
following way:
1. cpupool's cpus & hard-affinity & soft-affinity
2. if (1) is empty: cpupool's cpus & hard-affinity
This guarantees memory to be allocated from the narrowest
possible set of NUMA nodes, and makes it relatively easy to
set up NUMA-aware scheduling on top of soft affinity.
Note that such 'narrowest set' is guaranteed to be non-empty.
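The two-step rule above can be sketched with plain bitmasks. This is a minimal illustration, not the hypervisor code: the type and function names are hypothetical, a 64-bit integer stands in for a real cpumask, and the final translation from pCPUs to NUMA nodes is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: bit n set means pCPU n is included. */
typedef uint64_t cpumask_sketch_t;

/*
 * Return the set of pCPUs the node-affinity would be derived from:
 *   1. cpupool's cpus & hard-affinity & soft-affinity
 *   2. if (1) is empty: cpupool's cpus & hard-affinity
 */
static cpumask_sketch_t sketch_numa_affinity_cpus(cpumask_sketch_t pool,
                                                  cpumask_sketch_t hard,
                                                  cpumask_sketch_t soft)
{
    cpumask_sketch_t allowed = pool & hard;
    cpumask_sketch_t preferred = allowed & soft;

    /*
     * Assumption of this sketch: hard affinity always overlaps the
     * cpupool, which is what makes the result non-empty.
     */
    assert(allowed != 0);

    return preferred ? preferred : allowed;
}
```

For example, with a pool of pCPUs 0-3, hard affinity 2-5 and soft affinity 4-5, step 1 is empty, so the result falls back to pCPUs 2-3.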
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli [Mon, 16 Jun 2014 10:12:28 +0000 (12:12 +0200)]
sched: introduce soft-affinity and use it instead d->node-affinity
Before this change, each vcpu had its own vcpu-affinity
(in v->cpu_affinity), representing the set of pcpus where
the vcpu is allowed to run. Since NUMA-aware scheduling
was introduced, the (credit1 only, for now) scheduler also
tries as much as it can to run all the vcpus of a domain
on one of the nodes that constitute the domain's
node-affinity.
The idea here is making the mechanism more general by:
* allowing for this 'preference' for some pcpus/nodes to be
expressed on a per-vcpu basis, rather than for the domain
as a whole. That is to say, each vcpu should have its own
set of preferred pcpus/nodes, rather than it being the
very same for all the vcpus of the domain;
* generalizing the idea of 'preferred pcpus' to not only NUMA
awareness and support. That is to say, independently from
it being or not (mostly) useful on NUMA systems, it should
be possible to specify, for each vcpu, a set of pcpus where
it prefers to run (in addition, and possibly unrelated to,
the set of pcpus where it is allowed to run).
We will be calling this set of *preferred* pcpus the vcpu's
soft affinity, and this change introduces it and starts using it
for scheduling, replacing the indirect use of the domain's NUMA
node-affinity. This is more general, as soft affinity does not
have to be related to NUMA. Nevertheless, it allows achieving the
same results as NUMA-aware scheduling, just by making soft affinity
equal to the domain's node affinity, for all the vCPUs (e.g.,
from the toolstack).
This also means renaming most of the NUMA-aware scheduling related
functions, in credit1, to something more generic, hinting toward
the concept of soft affinity rather than directly to NUMA awareness.
As a side effect, this simplifies the code quite a bit. In fact,
prior to this change, we needed to cache the translation of
d->node_affinity (which is a nodemask_t) to a cpumask_t, since that
is what scheduling decisions require (we used to keep it in
node_affinity_cpumask). This, and all the complicated logic
required to keep it updated, is not necessary any longer.
The high level description of NUMA placement and scheduling in
docs/misc/xl-numa-placement.markdown is being updated too, to match
the new architecture.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Dario Faggioli [Mon, 16 Jun 2014 10:11:52 +0000 (12:11 +0200)]
sched: rename v->cpu_affinity into v->cpu_hard_affinity
in order to distinguish it from the cpu_soft_affinity which will
be introduced in a later commit ("xen: sched: introduce soft-affinity
and use it instead d->node-affinity").
This patch does not imply any functional change, it is basically
the result of something like the following:
Malcolm Crossley [Mon, 16 Jun 2014 10:02:00 +0000 (12:02 +0200)]
spread boot time page scrubbing across all available CPU's
The page scrubbing is done in 128MB chunks in lockstep across all the
non-SMT CPUs. This allows the boot CPU to hold the heap_lock whilst each
chunk is being scrubbed and then release the heap_lock when the CPUs have
finished scrubbing their individual chunks. As a result the heap_lock is
not held continuously and pending softirqs can be serviced
periodically across the CPUs.
The page scrub memory chunks are allocated to the CPUs in a NUMA-aware
fashion to reduce socket interconnect overhead and improve performance.
Specifically in the first phase we scrub at the same time on all the
NUMA nodes that have CPUs - we also weed out the SMT threads so that
we only use cores (that gives a 50% boost). The second phase is for NUMA
nodes that have no CPUs - for that we use the closest NUMA node's CPUs
(non-SMT again) to do the job.
This patch reduces the boot page scrub time on a 128GB 64 core AMD Opteron
6386 machine from 49 seconds to 3 seconds.
On an IvyBridge-EX 8 socket box with 1.5TB it cuts it down from 15 minutes
to 63 seconds.
Signed-off-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
x86/mce: don't spam the console with "CPUx: Temperature z"
If the machine has been quite busy it ends up with these messages
printed on the hypervisor console:
(XEN) CPU3: Temperature/speed normal
(XEN) CPU1: Temperature/speed normal
(XEN) CPU0: Temperature/speed normal
(XEN) CPU1: Temperature/speed normal
(XEN) CPU0: Temperature/speed normal
(XEN) CPU2: Temperature/speed normal
(XEN) CPU3: Temperature/speed normal
(XEN) CPU0: Temperature/speed normal
(XEN) CPU2: Temperature/speed normal
(XEN) CPU3: Temperature/speed normal
(XEN) CPU1: Temperature/speed normal
(XEN) CPU0: Temperature above threshold
(XEN) CPU0: Running in modulated clock mode
(XEN) CPU1: Temperature/speed normal
(XEN) CPU2: Temperature/speed normal
(XEN) CPU3: Temperature/speed normal
While the state changes are important, the non-altered state
information is not needed. As such add a latch mechanism to only print
the information if it has changed since the last update (and the
hardware doesn't properly suppress redundant notifications).
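A latch of this kind can be sketched as below. The names, the state enum and the fixed CPU count are all hypothetical and chosen only for illustration; the sketch remembers the last state reported per CPU and prints only when it changes.

```c
#include <stdbool.h>
#include <stdio.h>

#define SKETCH_NR_CPUS 4 /* hypothetical CPU count for the sketch */

enum sketch_thermal_state {
    SKETCH_THERMAL_UNKNOWN = 0, /* nothing reported yet */
    SKETCH_THERMAL_NORMAL,
    SKETCH_THERMAL_HOT,
};

/* Last state printed for each CPU; zero-initialised to UNKNOWN. */
static enum sketch_thermal_state sketch_last_state[SKETCH_NR_CPUS];

/* Print the state only if it differs from the last one printed. */
static bool sketch_report_thermal(unsigned int cpu,
                                  enum sketch_thermal_state now)
{
    if (cpu >= SKETCH_NR_CPUS || sketch_last_state[cpu] == now)
        return false; /* redundant notification: stay quiet */

    sketch_last_state[cpu] = now;
    printf("CPU%u: %s\n", cpu,
           now == SKETCH_THERMAL_HOT ? "Temperature above threshold"
                                     : "Temperature/speed normal");
    return true;
}
```

Repeated "normal" notifications for the same CPU are then dropped, while a transition to "above threshold" and back is still printed.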
This was observed on Intel DQ67SW,
BIOS SWQ6710H.86A.0066.2012.1105.1504 11/05/2012
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Christoph Egger <chegger@amazon.de>
Ross Lagerwall [Mon, 16 Jun 2014 09:59:05 +0000 (11:59 +0200)]
cpuidle: improve perf for certain workloads
The existing mechanism of using interrupt frequency as a heuristic does
not work well for certain workloads. As an example, synchronous dd on a
small block size uses deep C-states because much of the time is spent
doing processing so the interrupt frequency is not too high, but when an
IOP is submitted, the interrupt occurs soon after going idle. This
causes exit latency to be a significant factor.
To fix this, add a new factor which limits the exit latency to be no
more than 10% of the decaying measured idle time. This improves
performance for workloads with a medium interrupt frequency but a short
idle duration.
In the workload given previously, throughput improves by 20% with this
patch.
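The new factor can be sketched as follows. This is an illustration under stated assumptions rather than the governor code: the structure, the example table, the decay weight of 8 and the function names are hypothetical; only the 10% bound comes from the text above.

```c
#include <stddef.h>
#include <stdint.h>

/* Exit latency may be at most 1/10 (10%) of the measured idle time. */
#define SKETCH_LATENCY_FACTOR 10
/* Hypothetical weight for the decaying average. */
#define SKETCH_DECAY 8

struct sketch_cstate {
    uint32_t exit_latency_us;
};

/* Hypothetical table, ordered from shallowest to deepest state. */
static const struct sketch_cstate sketch_states[] = {
    { 1 }, { 20 }, { 100 },
};

/* Fold a new idle-time sample into the decaying average. */
static uint32_t sketch_update_avg_idle(uint32_t avg_us, uint32_t sample_us)
{
    return (uint32_t)(((uint64_t)avg_us * (SKETCH_DECAY - 1) + sample_us)
                      / SKETCH_DECAY);
}

/* Pick the deepest state whose exit latency stays within the bound. */
static size_t sketch_pick_state(const struct sketch_cstate *states,
                                size_t nr_states, uint32_t avg_idle_us)
{
    size_t i, chosen = 0;

    for (i = 1; i < nr_states; i++) {
        if ((uint64_t)states[i].exit_latency_us * SKETCH_LATENCY_FACTOR
            > avg_idle_us)
            break;
        chosen = i;
    }

    return chosen;
}
```

With the example table, an average idle time of 500us allows the 20us state but not the 100us one, so short idle periods no longer select the deepest C-state.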
This is not ported from the Linux menu governor since that uses load
average and number of IO wait processes to satisfy latency constraints.
If a process is in IO wait state, it compares the exit latency with the
predicted residency reduced by a factor of 10, which is somewhat similar
to what this patch does.
A side effect of this patch is to correctly limit the maximum idle time
used in the correction factor calculation. Previously data->measured_us
was used, and it was never set.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Jan Beulich [Mon, 16 Jun 2014 09:52:34 +0000 (11:52 +0200)]
x86/EFI: improve boot time diagnostics (try 2)
To aid analysis of possible errors, print EFI status codes with error
messages where available. Also remove a case where the status gets
stored into a local variable without being examined (which
misguided me to add an error check there in try 1 of this patch).
Jan Beulich [Mon, 16 Jun 2014 09:50:44 +0000 (11:50 +0200)]
pt-irq fixes and improvements
Tools side:
- don't silently ignore unrecognized PT_IRQ_TYPE_* values
- respect that the interface type contains a union (making the code at
once no longer depend on the hypervisor ignoring the bus field of the
PCI portion of the interface structure)
Hypervisor side:
- don't ignore the PCI bus number passed in
- don't store values (gsi, link) calculated from other stored values
- avoid calling xfree() with a spin lock held where easily possible
- have pt_irq_destroy_bind() respect the passed in type
- scope reduction and constification of various variables
- use switch instead of if/else-if chains
- formatting
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Yang Zhang <yang.z.zhang@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: Implement a dummy debug monitor for ARM32
XSA-93 (commit 0b18220 "xen/arm: Don't let guess access to Debug and Performance
Monitors registers") disabled access to the Debug Registers.
When CONFIG_PERF_EVENTS is enabled in the Linux kernel, it will try to
initialize the debug monitors. If an error occurs, Linux won't use this
feature.
This implementation makes Xen expose a minimal set of registers, which lets
the guest conclude that HW debug won't work.
Signed-off-by: Julien Grall <julien.grall@linaro.org>
[ ijc -- s/DBGCR/DBGBCR/ to use correct register name ] Acked-by: Ian Campbell <ian.campbell@citrix.com>
xen/arm: Implement a dummy Performance Monitor for ARM32
XSA-93 (commit 0b18220 "xen/arm: Don't let guess access to Debug and Performance
Monitor registers") disabled the Performance Monitor.
When CONFIG_PERF_EVENTS is enabled in the Linux kernel, regardless of
ID_DFR0 (which tells whether the Performance Monitors Extension is implemented) the
kernel will try to access PMCR.
Therefore we tell the guest we have 0 counters. Unfortunately we must always
support PMCCNTR (the cycle counter): we just RAZ/WI for all PM registers,
which doesn't crash the kernel at least.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Karim Raslan [Wed, 11 Jun 2014 10:30:15 +0000 (11:30 +0100)]
mini-os: moved events code under arch
This is all code motion, except that we now initialise
the ev_actions array before calling the arch-specific code
to make it more robust against future changes.
Signed-off-by: Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
[talex5@gmail.com: separated from big ARM commit] Signed-off-by: Thomas Leonard <talex5@gmail.com> Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Karim Raslan [Wed, 11 Jun 2014 10:30:14 +0000 (11:30 +0100)]
mini-os: tidied up code
Signed-off-by: Karim Allah Ahmed <karim.allah.ahmed@gmail.com>
[talex5@gmail.com: separated from big ARM commit] Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
[talex5@gmail.com: use __func__ in DEBUG macro]
[talex5@gmail.com: drop text about "xm create"] Signed-off-by: Thomas Leonard <talex5@gmail.com>
Client requests are safe to compile into code for running outside of
valgrind. Therefore, enable client requests whenever autoconf can find
memcheck.h and debug builds are enabled.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- reran autogen.sh ]
Boris Ostrovsky [Wed, 11 Jun 2014 08:55:43 +0000 (10:55 +0200)]
x86/VPMU: mark context LOADED before registers are loaded
Because a PMU interrupt may be generated as soon as PMU registers are
loaded (or, more precisely, as soon as HW PMU is "armed") we don't want
to delay marking context as LOADED until after registers are loaded.
Otherwise during interrupt handling VPMU_CONTEXT_LOADED may not be set
and this could be confusing.
(Technically, only SVM needs this change right now since VMX will "arm"
PMU later, during VMRUN when global control register is loaded from
VMCS. However, both AMD and Intel code will require this patch when we
introduce PV VPMU.)
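The ordering issue can be illustrated with a small simulation. Everything here is hypothetical (the flag value, the structure and the function names); the "interrupt" is simply called from the register-load step to model it firing as soon as the PMU is armed.

```c
#include <stdbool.h>

#define SKETCH_CONTEXT_LOADED 0x1u /* hypothetical flag value */

struct sketch_vpmu {
    unsigned int flags;
    bool irq_saw_loaded; /* what the simulated handler observed */
};

/* Simulated PMU interrupt handler: records whether LOADED was set. */
static void sketch_pmu_interrupt(struct sketch_vpmu *vpmu)
{
    vpmu->irq_saw_loaded = (vpmu->flags & SKETCH_CONTEXT_LOADED) != 0;
}

/* Simulated register load: the interrupt may fire as soon as it is armed. */
static void sketch_load_registers(struct sketch_vpmu *vpmu)
{
    sketch_pmu_interrupt(vpmu);
}

/* Mark the context LOADED first, then load the registers. */
static void sketch_vpmu_load(struct sketch_vpmu *vpmu)
{
    vpmu->flags |= SKETCH_CONTEXT_LOADED;
    sketch_load_registers(vpmu);
}

/* Returns true if the handler saw a consistent (LOADED) context. */
static bool sketch_vpmu_demo(void)
{
    struct sketch_vpmu vpmu = { 0, false };

    sketch_vpmu_load(&vpmu);
    return vpmu.irq_saw_loaded;
}
```

Swapping the two statements in sketch_vpmu_load() would make the simulated handler run with the flag still clear, which is the confusing case the message describes.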
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> Tested-by: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Tue, 10 Jun 2014 21:21:40 +0000 (22:21 +0100)]
libxl: move some internal functions to libxl_internal.h
In 752f181f ("libxl_json: introduce parser functions for builtin types")
a bunch of parser functions are added to libxl_json.h, which breaks
GCC < 4.6.
These functions are internal and libxl_json.h is public header, so move
them to libxl_internal.h.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 10 Jun 2014 14:07:59 +0000 (15:07 +0100)]
tools/libxc: Introduce ARRAY_SIZE() and replace handrolled examples
xen-hptool and xen-mfndump include xc_private.h. This is bad, but not trivial
to fix, so they gain a protective #undef and a stern comment.
MiniOS leaks ARRAY_SIZE into the libxc namespace as part of a stubdom build.
Therefore, xc_private.h gains an #ifndef until MiniOS is fixed.
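A guarded definition of this kind might look like the sketch below. The macro body is the common C idiom and is not necessarily the exact definition used in xc_private.h; the example array and helper are hypothetical.

```c
#include <stddef.h>

/* Only define the macro if another header has not already leaked one. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#endif

/* Hypothetical data, used only to show the macro replacing a handrolled count. */
static const int sketch_values[] = { 3, 1, 4, 1, 5 };

static int sketch_sum(void)
{
    size_t i;
    int sum = 0;

    for (i = 0; i < ARRAY_SIZE(sketch_values); i++)
        sum += sketch_values[i];

    return sum;
}
```

The macro only works on true arrays; applied to a pointer it silently yields the wrong count, which is one reason to keep a single shared definition instead of handrolled copies.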
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Despite my 'Reviewed-by' tag on c/s 65e3554908 "x86/PV: support data
breakpoint extension registers", I have re-evaluated my position as far as the
hypercall interface is concerned.
Previously, for the sake of not modifying the migration code in libxc,
XEN_DOMCTL_get_ext_vcpucontext would jump through hoops to return -ENOBUFS if
and only if MSRs were in use and no buffer was present.
This is fragile, and awkward from a toolstack point-of-view when actually
sending MSR content in the migration stream. It also complicates fixing a
further race condition, between querying the number of MSRs for a vcpu, and
the vcpu touching a new one.
As this code is still only in unstable, take this opportunity to redesign the
interface. This patch introduces the brand new XEN_DOMCTL_{get,set}_vcpu_msrs
subops.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>