public/platform.h: replace unsigned long with xen_ulong_t
Replace unsigned long with xen_ulong_t in public/platform.h.
Also replace unsigned int with uint32_t for clarity. It is safe because
unsigned int is 4 bytes in size and 4-byte aligned on all the supported
architectures.
Jan Beulich [Fri, 28 Mar 2014 12:44:44 +0000 (13:44 +0100)]
x86/vMTRR: pass domain to mtrr_*_msr_set()
This is in preparation for the next patch, and brings mtrr_def_type_msr_set()
and mtrr_fix_range_msr_set() in sync with mtrr_var_range_msr_set() in
this regard.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Fri, 28 Mar 2014 12:37:10 +0000 (13:37 +0100)]
x86/EPT: simplification and cleanup
- drop rsvd*_ prefixes from fields not really reserved anymore
- replace odd uses of <expr> ? 1 : 0
- drop pointless variables from ept_set_entry()
- streamline IOMMU mirroring code in ept_set_entry()
- don't open code is_epte_valid() (and properly use it when dumping)
- streamline entry cloning in ept_split_super_page()
- compact dumping code and output
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org>
CPUID[80000008].EAX[23:16] have been given the meaning of the guest
physical address restriction (in case it needs to be smaller than the
host's), hence we need to mirror that into vCPUID[80000008].EAX[7:0].
Enforce a lower limit at the same time, as well as a fixed value for
the virtual address bits, and zero for the guest physical address ones.
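The adjustment described above can be sketched roughly as follows. This is not the code from the patch: the function name, the fallback when EAX[23:16] is zero, and the fixed virtual address width of 48 are assumptions made for illustration; only the 36-bit minimum is taken from the message.

```c
#include <stdint.h>

/* Assumed values, for illustration only. */
#define EXAMPLE_MIN_PHYS_BITS 36u   /* the minimum named in the message */
#define EXAMPLE_VIRT_BITS     48u   /* hypothetical fixed virtual address width */

/*
 * host_eax:   CPUID[80000008].EAX as read on the host
 * policy_eax: the value the guest's CPUID policy would otherwise report
 * Returns the EAX value a guest would see in this sketch.
 */
static uint32_t example_guest_leaf8_eax(uint32_t host_eax, uint32_t policy_eax)
{
    uint32_t limit = (host_eax >> 16) & 0xff;   /* EAX[23:16]: guest restriction */
    uint32_t phys = policy_eax & 0xff;          /* EAX[7:0]: physical address bits */

    if (limit == 0)                  /* assumption: no restriction given, */
        limit = host_eax & 0xff;     /* so fall back to the host's own width */
    if (phys > limit)
        phys = limit;                /* mirror the restriction into EAX[7:0] */
    if (phys < EXAMPLE_MIN_PHYS_BITS)
        phys = EXAMPLE_MIN_PHYS_BITS;            /* enforce the lower limit */

    /* Fixed virtual address bits in EAX[15:8]; EAX[23:16] is left at zero. */
    return phys | (EXAMPLE_VIRT_BITS << 8);
}
```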
In order for the vMTRR code to see these overrides we need to make it
call hvm_cpuid() instead of domain_cpuid(), which in turn requires
special casing (and relaxing) the controlling domain.
This additionally should hide an ordering problem in the tools: Both
xend and xl appear to be restoring a guest from its image before
setting up the CPUID policy in the hypervisor, resulting in
domain_cpuid() returning all zeros and hence the check in
mtrr_var_range_msr_set() failing if the guest previously had more than
the minimum 36 physical address bits.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Fri, 28 Mar 2014 12:31:23 +0000 (13:31 +0100)]
x86/HVM: fix preemption handling in do_hvm_op()
Just like previously done for some mem-op hypercalls, undo preemption
using the interface structures (altering it in ways the caller may not
expect) and replace it by storing the continuation point in the high
bits of sub-operation argument.
This also changes the "nr" fields of struct xen_hvm_track_dirty_vram
(operation already limited to 1Gb worth of pages) and struct
xen_hvm_modified_memory to be only 32 bits wide, consistent with those
of struct xen_set_mem{type,access}. If that's not acceptable for some
reason, we'd need to shrink the HVMOP_op_bits (while still enforcing
the [then higher] limit resulting from the need to be able to encode
the continuation).
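A minimal sketch of keeping the continuation point in the high bits of the sub-operation argument. The width chosen here (8 bits for the operation) and all names are hypothetical; the message only says that HVMOP_op_bits bounds what can be encoded.

```c
/* Hypothetical split: low bits carry the operation, high bits the restart point. */
#define EXAMPLE_OP_BITS 8
#define EXAMPLE_OP_MASK ((1UL << EXAMPLE_OP_BITS) - 1)

/* Pack the operation and the iteration to resume from into one argument. */
static unsigned long example_encode(unsigned long op, unsigned long start_iter)
{
    return (op & EXAMPLE_OP_MASK) | (start_iter << EXAMPLE_OP_BITS);
}

/* Recover the operation number from a (possibly continued) argument. */
static unsigned long example_decode_op(unsigned long arg)
{
    return arg & EXAMPLE_OP_MASK;
}

/* Recover the iteration at which a preempted operation should resume. */
static unsigned long example_decode_start(unsigned long arg)
{
    return arg >> EXAMPLE_OP_BITS;
}
```

With this layout the interface structure itself stays untouched across a preemption, which is the point of the change; the cost is that the number of iterations must fit in the remaining high bits.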
Whether (and if so how) to adjust xc_hvm_track_dirty_vram(),
xc_hvm_modified_memory(), xc_hvm_set_mem_type(), and
xc_hvm_set_mem_access() to reflect the 32-bit restriction on "nr" is
unclear: If the APIs need to remain stable, all four functions should
probably check that there was no truncation. Preferably their
parameters would be changed to uint32_t or unsigned int, though.
As a minor cleanup, along with introducing the switch-wide "pfn" the
redundant "d" is also being converted to a switch-wide one.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Fri, 28 Mar 2014 12:30:10 +0000 (13:30 +0100)]
x86/HVM: simplify do_hvm_op()
- boundary checks in HVMOP_modified_memory, HVMOP_set_mem_type, and
HVMOP_set_mem_access: all of these already check for overflow, so
there's no need to range check the first _and_ last PFN (checking
the last one suffices)
- copying back interface structures that were previously copied from
guest memory can use __copy_to_...(), since copy_from_...() already
did the address range validation
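The first point can be illustrated with a small sketch. The helper name and the choice to reject an empty range are assumptions of the example, not taken from the patch: once the overflow check has passed, first_pfn <= last_pfn holds, so comparing only the last PFN against the maximum is enough.

```c
#include <stdint.h>

/* Returns 1 if [first_pfn, first_pfn + nr - 1] lies within [0, max_pfn]. */
static int example_pfn_range_ok(uint64_t first_pfn, uint32_t nr, uint64_t max_pfn)
{
    uint64_t last_pfn;

    if (nr == 0)
        return 0;                    /* sketch's choice: reject empty ranges */

    last_pfn = first_pfn + nr - 1;
    if (last_pfn < first_pfn)
        return 0;                    /* the addition wrapped around */

    /* No overflow, so first_pfn <= last_pfn: checking the last one suffices. */
    return last_pfn <= max_pfn;
}
```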
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Tim Deegan <tim@xen.org>
Olaf Hering [Tue, 11 Feb 2014 14:27:24 +0000 (15:27 +0100)]
xend/pvscsi: recognize also SCSI CDROM devices
Attaching a CDROM device with 'xm scsi-attach domU /dev/sr0 0:0:0:0'
fails because for some reason the sr driver was not handled at all in
the match list. With the change the above command succeeds and the
device is attached.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Matt Wilson <msw@amazon.com>
Olaf Hering [Mon, 10 Feb 2014 07:57:34 +0000 (08:57 +0100)]
tools/xend: move assert to exception block
The two asserts in restore sometimes trigger after hundreds of
migrations. If they trigger, the destination host will not destroy the
newly created, yet empty guest. After a second migration attempt to this
host there will be two guests with the same name and uuid. This situation
is poorly handled by the xm tools.
With this change the empty guest will be destroyed.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Matt Wilson <msw@amazon.com>
Julien Grall [Fri, 21 Mar 2014 15:22:13 +0000 (15:22 +0000)]
xen/xsm: Add support for device tree
This patch adds a new module "xen,xsm-policy" to allow the user to load the XSM
policy when Xen is booting.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 21 Mar 2014 15:22:12 +0000 (15:22 +0000)]
xen/xsm: Add xsm_core_init function
This function contains non-specific architecture code (mostly the tail of
xsm_multiboot_init). It will be used later to avoid code duplication.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Andrew Cooper [Wed, 26 Mar 2014 14:36:13 +0000 (15:36 +0100)]
x86: identify which vcpu's CR4 is being badly modified
When the toolstack is setting vcpu state on behalf of a migrating guest, the
domain/vcpu reference from gdprintk() identifies the toolstack, not the
affected domain.
Daniel De Graaf [Mon, 24 Mar 2014 09:55:26 +0000 (10:55 +0100)]
evtchn: optimize XSM ssid field
When FLASK is the only enabled implementation of the XSM hooks in Xen,
some of the abstractions required to handle multiple XSM providers are
redundant and only produce unneeded overhead. This patch reduces the
memory overhead of enabling XSM on event channels by replacing the
untyped ssid pointer from struct evtchn with a union containing the
contents of the structure. This avoids an additional heap allocation
for every event channel, and on 64-bit systems, reduces the size of
struct evtchn by 4 bytes. If an out-of-tree XSM module needs the full
flexibility of the generic evtchn ssid pointer, defining the symbol
XSM_NEED_GENERIC_EVTCHN_SSID will include a suitable pointer field.
This also cleans up the unused selinux_checkreqprot declaration left
from the Linux port.
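The layout change can be pictured with the following sketch. The structure and its other members are invented for the example; only the XSM_NEED_GENERIC_EVTCHN_SSID symbol comes from the message, and the member names inside the union are assumptions.

```c
#include <stdint.h>

/* Hypothetical, heavily trimmed stand-in for struct evtchn. */
struct example_evtchn {
    uint8_t  state;
    uint16_t notify_vcpu_id;
    union {
#ifdef XSM_NEED_GENERIC_EVTCHN_SSID
        void *generic;       /* only for modules needing the untyped pointer */
#endif
        uint32_t flask_sid;  /* stored inline: no separate heap allocation */
    } ssid;
};
```

Without the symbol defined, the union is only as large as the 32-bit identifier, which is where the saving over a pointer on 64-bit systems comes from.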
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Mon, 24 Mar 2014 09:54:27 +0000 (10:54 +0100)]
xsm: Reduce compiler command line clutter
Move the preprocessor definitions for all FLASK parameters other than
the enable flag off the compiler command line and into config.h, which
is the preferred location for such definitions.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 24 Mar 2014 09:49:19 +0000 (10:49 +0100)]
sysctl: annotate struct pm_cx_stat's pc[]/cc[]
Suggested-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 24 Mar 2014 09:48:03 +0000 (10:48 +0100)]
x86: fix determination of bit count for struct domain allocations
We can't just add in the hole shift value, as the hole may be at or
above the 44-bit boundary. Instead we need to determine the total bit
count until reaching 32 significant (not squashed out) bits in PFN
representations.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Mukesh Rathor [Mon, 24 Mar 2014 08:47:59 +0000 (09:47 +0100)]
x86/pvh: disallow PHYSDEVOP_pirq_eoi_gmfn_v2/v1
A call to do_physdev_op with PHYSDEVOP_pirq_eoi_gmfn_v2/v1 will corrupt
struct hvm_domain when it writes to domain->arch.pv_domain.pirq_eoi_map.
Disallow that. Currently, such a path exists for linux dom0 pvh.
Ian Jackson [Mon, 14 Oct 2013 15:13:19 +0000 (16:13 +0100)]
xl: Introduce children[].description and xl_report_child_exitstatus
We record the descriptive string for the child in the children[]
array, and use it when reporting the exit status.
The only functional change is that the message reported for the
migration child is changed from "migration target process" to
"migration transport process".
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Samuel Thibault [Fri, 21 Mar 2014 01:56:56 +0000 (02:56 +0100)]
PV-GRUB: fix blk access at end of disk
GRUB usually loads a whole disk track, even if that means going
beyond the end of the disk. We thus have to gracefully return an error,
instead of letting blkfront panic.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
/local/home/julien/works/arndale/xen/xen/include/asm/platforms/exynos5.h:1:9: error: '__ASM_ARM_PLATFORMS_EXYNOS5_H' is used as a header guard here, followed by #define of a different macro [-Werror,-Wheader-guard]
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/local/home/julien/works/arndale/xen/xen/include/asm/platforms/exynos5.h:2:9: note: '__ASM_ASM_PLATFORMS_EXYSNO5_H' is defined here; did you mean '__ASM_ARM_PLATFORMS_EXYNOS5_H'?
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__ASM_ARM_PLATFORMS_EXYNOS5_H
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Roger Pau Monne [Fri, 7 Mar 2014 12:22:58 +0000 (13:22 +0100)]
xenstore: set READ_THREAD_STACKSIZE to a sane value
On FreeBSD PTHREAD_STACK_MIN is 2048 by default, which is obviously
too low. Set the default back to the previous value (16 * 1024), or if
that's too low set it to PTHREAD_STACK_MIN.
Julien Grall [Mon, 17 Mar 2014 14:06:01 +0000 (14:06 +0000)]
xen/xsm: flask: Add missing header in hooks.c
nr_static_irqs and nr_irqs are defined in asm/irq.h (on both x86 and ARM).
Include the header directly in hooks.c to avoid compilation failure on ARM.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 17 Mar 2014 14:06:00 +0000 (14:06 +0000)]
xen/xsm: flask: flask_copying_string is taking a XEN_GUEST_HANDLE as first param
Unlike on x86, on ARM XEN_GUEST_HANDLE and XEN_GUEST_HANDLE_PARAM are
not compatible. This will result in a compilation failure on ARM when XSM
is enabled.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 17 Mar 2014 14:05:59 +0000 (14:05 +0000)]
xen/xsm: flask: MSI is PCI specific
MSI is not yet supported on ARM and will break the compilation when XSM_ENABLE=y.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 17 Mar 2014 14:05:58 +0000 (14:05 +0000)]
xen/xsm: flask: Rename variable "bool" in "b"
On ARM, the compilation is failing with the following error:
In file included from flask_op.c:21:0:
./include/conditional.h:24:43: error: two or more data types in declaration specifiers
./include/conditional.h:25:42: error: two or more data types in declaration specifiers
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 17 Mar 2014 14:05:57 +0000 (14:05 +0000)]
xen/xsm: flask: Fix compilation when CONFIG_COMPAT=n
The commit f7d29f7b "flask: add compat mode guest support" introduces
build breakage on ARM when XSM is enabled. It's because ARM doesn't use
compat mode.
flask_op.c:794:34: fatal error: compat/event_channel.h: No such file or directory
#include <compat/event_channel.h>
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 17 Mar 2014 14:05:56 +0000 (14:05 +0000)]
xen/xsm: xsm_do_mca is x86 specific
xsm_do_mca is only used by x86. Only define the function for x86 to
avoid usage on ARM.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 17 Mar 2014 14:05:55 +0000 (14:05 +0000)]
xen/xsm: xsm functions for PCI passthrough is not x86 specific
Protect xsm functions for PCI passthrough by HAS_PASSTHROUGH && HAS_PCI
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Mon, 17 Mar 2014 14:05:54 +0000 (14:05 +0000)]
xen/arm: next_module: Skip module if the size is 0
When the module size is 0, it means that the module was not provided by
the user. This can happen if the user chooses to boot without an initrd.
In this case, both fields (start and size) are zeroed. Therefore, next_module
will return 0 every time, even if there is another non-zero module after this
one. This can happen when the XSM module is added.
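A simplified sketch of the idea, not the actual next_module: the structure, the function name and the search rule (lowest start address at or above a given point) are assumptions of the example. It shows why an unused slot with start and size both zero has to be skipped.

```c
#include <stdint.h>

/* Hypothetical module descriptor; an unused slot has start == size == 0. */
struct example_module {
    uint64_t start;
    uint64_t size;
};

/*
 * Return the lowest start address >= from among the populated modules,
 * or UINT64_MAX if there is none.
 */
static uint64_t example_next_module(const struct example_module *mods,
                                    int nr, uint64_t from)
{
    uint64_t lowest = UINT64_MAX;
    int i;

    for (i = 0; i < nr; i++) {
        if (mods[i].size == 0)
            continue;            /* not provided by the user: skip it */
        if (mods[i].start < from)
            continue;
        if (mods[i].start < lowest)
            lowest = mods[i].start;
    }
    return lowest;
}
```

Without the size check, a search starting at 0 would keep landing on the empty slot's start address of 0 instead of the first real module.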
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 7 Jan 2014 18:40:05 +0000 (18:40 +0000)]
xl: Pass -v options on to migration receiver
Compute a -v option to pass to the migration receiver.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
---
v2: Use minmsglevel_default to initialise minmsglevel.
Ian Jackson [Tue, 7 Jan 2014 18:23:04 +0000 (18:23 +0000)]
xl: migration: pass -t to xl migrate-receive
If we ourselves are using cr-based overwriting for logging to stderr,
pass -t to the migration receiver so that it knows to do the same
(since its stderr is normally the pipe from sshd).
This requires, of course, that the receiver support that option. This
is OK from a compatibility point of view because we support migration
to newer, but not necessarily to older, versions. (If unsupported
backwards migration is still desired the use of -s "" allows the
remote invocation rune to be overridden by a command of one's choice.)
This fixes a regression introduced in 2f80ac9c0e8f, where migration
messages from the receiver would not use the overwriting protocol.
CC: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Tue, 7 Jan 2014 18:07:30 +0000 (18:07 +0000)]
xentoollog: provide XTL_STDIOSTREAM_PROGRESS_USE_CR
Provide flags
XTL_STDIOSTREAM_PROGRESS_USE_CR
XTL_STDIOSTREAM_PROGRESS_NO_CR
to allow the caller to force, or disable, the use of \r-based
overwriting of progress messages.
In the implementation, rename the variable "tty" to "progress_use_cr".
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 18 Mar 2014 16:37:18 +0000 (16:37 +0000)]
libxl: hotplug scripts: stdin < /dev/null
Give hotplug scripts /dev/null for stdin. That way if they try to read
anything (which really they shouldn't), nothing odd will
happen.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Roger Pau Monne <roger.pau@citrix.com> CC: Vasiliy Tolstov <v.tolstov@selfip.ru> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Ian Jackson [Tue, 18 Mar 2014 16:33:06 +0000 (16:33 +0000)]
libxl: hotplug scripts: stdout >& stderr
Plumb hotplug scripts' stdout to stderr. That way if they print
anything (which really they shouldn't), it won't get mixed up with
the application's stdout. (Eg, perhaps with an xl migration
stream...)
Ian Jackson [Tue, 18 Mar 2014 17:04:36 +0000 (17:04 +0000)]
libxl: Make libxl_exec tolerate foofd<=2
Make passing 0, 1, or 2 as stdinfd, stdoutfd or stderrfd work
properly.
Also, document the meaning of the fd arguments.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Roger Pau Monne <roger.pau@citrix.com> CC: Vasiliy Tolstov <v.tolstov@selfip.ru> Acked-by: Ian Campbell <Ian.Campbell@citrix.com>
Ian Jackson [Thu, 27 Feb 2014 17:46:49 +0000 (17:46 +0000)]
tools/console: xenconsole tolerate tty errors
Since 28d386fc4341 (XSA-57), libxl writes an empty value for the
console tty node, with read-only permission for the guest, when
setting up pv console "frontends". (The actual tty value is later set
by xenconsoled.) Writing an empty node is not strictly necessary to
stop the frontend from writing dangerous values here, but it is a good
belt-and-braces approach.
Unfortunately this confuses xenconsole. It reads the empty value, and
tries to open it as the tty. xenconsole then exits.
Fix this by having xenconsole treat an empty value the same way as no
value at all.
Also, make the error opening the tty be nonfatal: we just print a
warning, but do not exit. I think this is helpful in theoretical
situations where xenconsole is racing with libxl and/or xenconsoled.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: George Dunlap <george.dunlap@eu.citrix.com>
---
v2: Combine two conditions and move the free
Ian Jackson [Mon, 24 Feb 2014 15:16:19 +0000 (15:16 +0000)]
tools/console: reset tty when xenconsole fails
If xenconsole (the client program) fails, it calls err. This would
previously neglect to reset the user's terminal to sanity. Use atexit
to do so.
This routinely happens in Xen 4.4 RC5 with pygrub because libxl
writes the value "" to the tty xenstore key when using xenconsole.
After this patch this just results in a harmless error message.
Reported-by: M A Young <m.a.young@durham.ac.uk> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: M A Young <m.a.young@durham.ac.uk> CC: Ian Campbell <Ian.Campbell@citrix.com> Acked-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix whitespace error (reintroduce hard tab)
Fix commit message not to claim ignorance about root cause
Ian Campbell [Mon, 17 Mar 2014 17:27:40 +0000 (17:27 +0000)]
xen: arm: make stage 2 page tables walks inner-shareable
The comment was previously incorrect and indicated that these mappings were
unshared (00) when in reality the register was set for outer-shareable (01).
Clarify ORGN0/IRGN0 in the comments while at it.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Campbell [Mon, 17 Mar 2014 14:53:29 +0000 (14:53 +0000)]
xen: arm: weaken SMP barriers to inner shareable.
Since all processors are in the inner-shareable domain and we map everything
that way this is sufficient.
The non-SMP barriers remain full system. Although in principle they could
become outer shareable barriers for some hardware this would require us to
know which class a given device is. Given the small number of device drivers
in Xen itself it's probably not worth worrying over, although maybe someone
will benchmark at some point.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Tim Deegan <tim@xen.org>
Ian Campbell [Mon, 17 Mar 2014 14:53:23 +0000 (14:53 +0000)]
xen: arm: map memory as inner shareable.
The inner shareable domain contains all SMP processors, including different
clusters (e.g. big.LITTLE). Therefore this is the correct thing to use for Xen
memory mappings. The outer shareable domain is for devices on busses which are
coherent and barrier-aware (e.g. AMBA4 AXI with ACE), while the system domain
is for things behind bridges which are not.
One wrinkle is that Normal memory with attributes Inner Non-cacheable, Outer
Non-cacheable (which we call BUFFERABLE) must be mapped Outer Shareable on
ARMv7. Therefore change the prototype of mfn_to_xen_entry to take the attribute
index so we can DTRT. On ARMv8 the shareability is ignored and considered to
always be Outer Shareable.
Don't adjust the barriers, flushes etc, those remain as they were (which is
more than is now required). I'll change those in a later patch.
Many thanks to Leif for explaining the difference between Inner- and
Outer-Shareable in words of two or less syllables, I hope I've replicated that
explanation properly above!
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Julien Grall <julien.grall@linaro.org>
Ian Jackson [Tue, 18 Mar 2014 13:45:25 +0000 (13:45 +0000)]
libxc: Fix buffer length for get_suspend_file
Declaring a formal parameter to have an array type doesn't result in
the parameter actually having an array type. The type is "adjusted"
to a pointer. (C99 6.9.1(7), 6.7.5.3.)
So the use of sizeof in xc_suspend.c:get_suspend_file was wrong.
Instead, use the #define. Also get rid of the array size, as it is
misleading.
Newer versions of gcc warn about the erroneous code:
xc_suspend.c:39:25: error: argument to 'sizeof' in 'snprintf' call
is the same expression as the destination; did you mean to provide
an explicit length? [-Werror=sizeof-pointer-memaccess]
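A sketch of the corrected pattern. The constant name, the function name and the file name format are placeholders, not the ones used in xc_suspend.c: the parameter is declared as the pointer it really is, and the length comes from the #define rather than from sizeof.

```c
#include <stdio.h>

/* Placeholder for the buffer-length #define the message refers to. */
#define EXAMPLE_SUSPEND_FILE_BUFLEN 64

/*
 * Writing "char buf[EXAMPLE_SUSPEND_FILE_BUFLEN]" in the parameter list
 * would still declare a char *, so sizeof(buf) would be the size of a
 * pointer. Declare the pointer explicitly and pass the real length.
 */
static int example_get_suspend_file(char *buf, int domid)
{
    return snprintf(buf, EXAMPLE_SUSPEND_FILE_BUFLEN,
                    "example-suspend-%d.lock", domid);
}
```

Callers are expected to pass a buffer of at least EXAMPLE_SUSPEND_FILE_BUFLEN bytes.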
Reported-By: Julien Grall <julien.grall@linaro.org> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Ian Campbell <ian.campbell@citrix.com> CC: Julien Grall <julien.grall@linaro.org>
--
v2: Actually change the declaration of buf.
Jan Beulich [Tue, 18 Mar 2014 10:52:34 +0000 (11:52 +0100)]
x86/idle: update to include further package/core residency MSRs
With the number of these growing it becomes increasingly desirable to
not repeatedly alter the sysctl interface to accommodate them. Replace
the explicit listing of numbered states by arrays, unused fields of
which will remain untouched by the hypercall.
The adjusted sysctl interface at once fixes an unrelated shortcoming
of the original one: The "nr" field, specifying the size of the
"triggers" and "residencies" arrays, has to be an input (along with
being an output), which the previous implementation didn't obey.
Note that the bouncing direction in the libxc interface at once gets
corrected to OUT (was BOTH).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Ian Jackson [Thu, 12 Dec 2013 19:17:03 +0000 (19:17 +0000)]
libxl: suspend: Apply guest timeout in evtchn case
When negotiating guest suspend via the evtchn ("fast") protocol,
the guest may still fail to respond.
So set the timeout. The existing error path will already properly
tear down our (event channel) wait.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 10 Dec 2013 17:40:49 +0000 (17:40 +0000)]
libxl: suspend: Async evtchn wait
When negotiating guest suspend via the evtchn ("fast") protocol,
abolish synchronous wait for domain suspend.
If the guest supports the event channel suspend protocol, we used to
sit in a loop in xc_await_suspend waiting (perhaps indefinitely) for
it to suspend.
Instead, use the new libxl event channel event facility. When we see
that the event is signaled, we look at the domain to see if it has
suspended. (In this patch we do not yet set a timeout; that will come
next.)
So the suspend operation no longer blocks with the libxl ctx lock
held, and instead returns to the event loop. Additionally, domains
which signal the event channel themselves, or undergo other state
changes, will be handled more correctly.
We end up making a few more hypercalls.
Also, if we encounter errors setting up the suspend event channel
(which should not happen), abort the operation rather than falling
back to the xenstore protocol.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Improve commit message.
Ian Jackson [Fri, 6 Dec 2013 16:40:08 +0000 (16:40 +0000)]
libxl: suspend: Fix suspend wait corner cases
When we are waiting for a guest to suspend, this suspend operation
would continue to wait (until the timeout) if the guest was destroyed
or shut down for another reason, or if xc_domain_getinfolist failed.
Handle these cases correctly, as errors.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Remove unwanted error call after a new "goto err".
Ian Jackson [Fri, 6 Dec 2013 16:12:44 +0000 (16:12 +0000)]
libxl: suspend: Abolish usleeps in domain suspend wait
Replace the use of a loop with usleep().
Instead, use a xenstore watch and an event system timeout. (xenstore
fires watches on @releaseDomain when a domain shuts down.)
The logic which checks for the state of the domain is unchanged, and
not ideal, but we will leave that for the next patch.
There is not intended to be any semantic change, other than to make
the algorithm properly asynchronous and the consequential waiting be
on xenstore, rather than polling.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Remove some trailing whitespace
Improve commit message.
v3X: Do NOT use an xswait instead of separate watch and timeout.
Ian Jackson [Thu, 5 Dec 2013 18:50:55 +0000 (18:50 +0000)]
libxl: suspend: Async xenstore pvcontrol wait
When negotiating guest suspend via the xenstore pvcontrol protocol
(ie when the guest does NOT support the evtchn fast suspend protocol):
Replace the use of loops and usleep with a call to libxl__xswait.
Also, replace the xenstore transaction loop with one using
libxl__xs_transaction_start et al.
There is not intended to be any semantic change, other than to make
the algorithm properly asynchronous.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Add a comment to clarify last minute ack.
v3X: Do NOT rename "pvcontrol" xswait state struct to "guest_wait"
(because we're NOT going to use it for the event channel based wait
too).
In domain_suspend_callback_common, use libxl__xs_transaction_start in
a loop, rather than xs_transaction_start and a goto label.
This will improve the error handling, but have no other semantic
effect.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 5 Dec 2013 18:48:21 +0000 (18:48 +0000)]
libxl: suspend: New domain_suspend_pvcontrol_acked
Factor out domain_suspend_pvcontrol_acked.
This replaces a bunch of open-coded strcmp()s and makes the code
clearer. It also eliminates the need to check for state==NULL each
time it's read, because we can check for NULL once before the strcmp.
No functional change.
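The shape of such a helper might look like this sketch. The function name, the "suspend" request string and the treatment of a missing node as acknowledged are assumptions for illustration, not taken from the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Returns nonzero once the guest has acknowledged the request, i.e. the
 * control node no longer holds the value we wrote.
 * Assumption: a missing node (state == NULL) counts as acknowledged.
 */
static int example_pvcontrol_acked(const char *state)
{
    if (state == NULL)
        return 1;                       /* single NULL check, done once here */
    return strcmp(state, "suspend") != 0;
}
```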
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Improve comment re xswatch state ENOENT
Ian Jackson [Thu, 5 Dec 2013 18:27:30 +0000 (18:27 +0000)]
libxl: suspend: New libxl__domain_pvcontrol_xspath
Factor out the pv control node xenstore path calculation into
libxl__domain_pvcontrol_xspath.
This xs path calculation was open coded in
libxl__domain_pvcontrol_read and _write. This is undesirable because
it duplicates the code and because it makes the path inaccessible to
other parts of libxl (which are soon going to want it).
No functional change.
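As an illustration of factoring the path calculation into one place, a sketch follows. The function name and the path layout ("/local/domain/<domid>/control/shutdown") are assumptions of the example; the real helper works with libxl's own allocation and domain path facilities rather than a caller-supplied buffer.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format the (assumed) pv control node path for a domain into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int example_pvcontrol_xspath(char *buf, size_t len, unsigned int domid)
{
    int n = snprintf(buf, len, "/local/domain/%u/control/shutdown", domid);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Both the read and the write side can then call the one helper instead of each building the string themselves.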
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Make domain_suspend_callback_common more callback-oriented:
* Turn the functionality behind the goto labels "err" and
"guest_suspended" into functions which can be called just before
"return".
* Deindent the "issuing %s suspend request via XenBus control node"
branch; it is going to be split up into various functions as the
xenstore work becomes callback-based.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Make domain_suspend_callback_common do its work and then call
dss->callback_common_done, rather than simply returning its answer.
This is preparatory to abolishing the usleeps in this function and
replacing them with use of the event machinery.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Remove some trailing whitespace
Mark the suspend callback libxl__domain_suspend_callback as
asynchronous in the helper stub generator (libxl_save_msgs_gen.pl).
We are going to want to provide an asynchronous version of this
function to get rid of the usleeps and waiting loops in the suspend
code.
libxl__domain_suspend_common_callback, the common synchronous core,
which used to be provided directly as the callback function for the
helper machinery, becomes libxl__domain_suspend_callback_common. It
can now take a typesafe parameter.
For now, provide two very similar asynchronous wrappers for it
(normal, and remus). Each is a simple function which contains only
boilerplate, calls the common synchronous core, and returns the
asynchronous response.
Essentially, we have just moved (in the case of suspend callbacks) the
call site of libxl__srm_callout_sendreply. It was in the switch
statement in the autogenerated _libxl_save_msgs_callout.c, and is now
in the handwritten libxl_dom.c.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Cc: Shriram Rajagopalan <rshriram@cs.ubc.ca> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Clarify commit message.
Fix a misformatted CC in the commit message
Do not introduce a whitespace error in libxl_save_msgs_gen.pl
v2: Commit message mentions usleeps, not Remus, as motivation.
Ian Jackson [Wed, 11 Dec 2013 16:29:38 +0000 (16:29 +0000)]
libxc: suspend: Fix suspend event channel locking
Use fcntl F_SETLK, rather than writing our pid into a "lock" file.
That way if we crash we don't leave the lockfile lying about. Callers
now need to keep the fd for our lockfile. (We don't use flock because
we don't want anyone who inherits this fd across fork to end up with a
handle onto the lock.)
While we are here:
* Move the lockfile to /var/run/xen
* De-duplicate the calculation of the pathname
* Compute the buffer size for the pathname so that it will definitely
not overrun (and use the computed value everywhere)
* Fix various error handling bugs
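The locking scheme can be sketched as follows, using only POSIX calls. The function name is made up and the error handling is reduced to the minimum; it is not the libxc implementation.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Open (creating if necessary) a lock file and take an exclusive,
 * non-blocking fcntl lock on it. Returns the fd, which the caller must
 * keep open for as long as the lock is needed, or -1 on failure.
 */
static int example_lock_file(const char *path)
{
    struct flock fl;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = F_WRLCK;        /* exclusive lock */
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file": the whole file */

    if (fcntl(fd, F_SETLK, &fl) < 0) {
        close(fd);              /* held by another process, or an error */
        return -1;
    }
    return fd;
}
```

The kernel drops the lock when the owning process exits, so a crash leaves no stale lock behind, and fcntl locks are not inherited by children created with fork, which is the property the message contrasts with flock.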
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Shriram Rajagopalan <rshriram@cs.ubc.ca>
xc_suspend_evtchn_init expects to eat the first event on the xce. If
the xce is used for any other purpose then this can break. Document
this fact and rename the function to xc_suspend_evtchn_init_exclusive.
(I haven't checked the call sites for improper shared use of the xce.)
Provide a corresponding xc_suspend_evtchn_init_sane which doesn't try
to eat an event, and instead leaves the caller the ability to
demultiplex.
Also document that xc_await_suspend needs exclusive use of the xce.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: Shriram Rajagopalan <rshriram@cs.ubc.ca> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Drop spurious addition of #include <assert.h>
Ian Jackson [Wed, 11 Dec 2013 14:06:02 +0000 (14:06 +0000)]
libxl: events: Provide libxl__ev_evtchn*
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix commit message not to refer to libxl_ctx_alloc's gc, done earlier
Change type of port in evtchn_fd_callback to evtchn_port_or_error_t
Clarify comment about use of ctx->xce.
Fix typo in comment.
Ian Jackson [Fri, 6 Dec 2013 15:31:02 +0000 (15:31 +0000)]
libxl: events: Use libxl__xswait_* in spawn code
Replace open-coded use of ev_time and ev_xswatch with xswait.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Fri, 14 Mar 2014 17:38:38 +0000 (17:38 +0000)]
libxl: events: libxl__xswait* support @paths
Special-case paths starting with '@' in libxl__xswait. Attempting to
read these from xenstore gives EINVAL. Callers waiting for (say)
@releaseDomain will be checking for some condition which can be
observed other than by looking at xenstore.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: New patch in this version of the series.
Ian Jackson [Thu, 5 Dec 2013 18:49:12 +0000 (18:49 +0000)]
libxl: events: Provide libxl__xswait_*
This is an ao utility for conveniently doing a timed wait on
xenstore. It handles setting up and cancelling the timeout, and also
conveniently reads the key for you.
No callers yet in this patch.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix doc comments to refer to correct rc values
The comments for libxl__ev_time_isregistered and the corresponding
watch function even say that these should be const. Make it so.
Also fix libxl__ev_child_inuse and libxl__ev_spawn_inuse.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 17 Dec 2013 15:20:25 +0000 (15:20 +0000)]
libxl: init: libxl__poller_init and _get take gc
Change libxl__poller_init and libxl__poller_get to take a libxl__gc*
rather than a libxl_ctx*. The gc is not used for memory allocation
but simply to provide the standard local variable "gc" expected by the
convenience macros. Doing this makes the error logging more
convenient.
Hence, convert the logging calls to use the LOG* convenience macros.
And consequently, change the call sites, and the function bodies to
use CTX rather than ctx.
Also convert a call to malloc() (with error check) in
libxl__poller_get, to libxl__zalloc (no error check needed).
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> CC: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>