Fu Wei [Thu, 21 Apr 2016 11:07:09 +0000 (19:07 +0800)]
docs/arm64: update the documentation for loading XSM support
This patch updates the documentation for allowing detection of an XSM
module that lacks a specific compatible string.
This mechanism has been added by the commit ca32012341f3de7d3975407fb963e6028f0d0c8b.
Signed-off-by: Fu Wei <fu.wei@linaro.org> Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Julien Grall <julien.grall@arm.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Xen needs to blacklist any PSCI node as it will be recreated for DOM0.
Up to now, this was done only for arm,psci and arm,psci-0.2 compatible
nodes. Add PSCI 1.0 compatibility to make device tree nodes with
George Dunlap [Fri, 22 Apr 2016 11:19:23 +0000 (12:19 +0100)]
committers to be REST maintainers
As proposed at the hackathon.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
David Vrabel [Tue, 12 Apr 2016 16:19:43 +0000 (17:19 +0100)]
x86/ept: defer the invalidation until the p2m lock is released
Holding the p2m lock while calling ept_sync_domain() is very expensive
since it does an on_selected_cpus() call. IPIs on many-socket
machines can be very slow and on_selected_cpus() is serialized.
It is safe to defer the invalidate until the p2m lock is released
except for two cases:
1. When freeing a page table page (since partial translations may be
cached).
2. When reclaiming a zero page as part of PoD.
For these cases, add p2m_tlb_flush_sync() calls which will immediately
perform the invalidate before the page is freed or reclaimed.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: George Dunlap <geroge.dunlap@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
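The deferral pattern described in the commit above can be sketched as follows. This is a minimal illustration, not the Xen code: struct p2m_sketch and every sketch_* function are hypothetical stand-ins, and the "flush" is only a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a p2m: a lock flag, a pending-invalidate flag,
 * and a count of how many (expensive) invalidations were issued. */
struct p2m_sketch {
    bool locked;
    bool need_flush;
    unsigned int flushes;
};

static void sketch_flush(struct p2m_sketch *p)
{
    p->flushes++;
    p->need_flush = false;
}

static void sketch_lock(struct p2m_sketch *p)
{
    assert(!p->locked);
    p->locked = true;
}

/* Record that an entry changed; do not invalidate while the lock is held. */
static void sketch_change_entry(struct p2m_sketch *p)
{
    assert(p->locked);
    p->need_flush = true;
}

/* Immediate invalidate, for the two cases that cannot wait: freeing a
 * page table page, or reclaiming a zero page as part of PoD. */
static void sketch_flush_sync(struct p2m_sketch *p)
{
    assert(p->locked);
    if (p->need_flush)
        sketch_flush(p);
}

/* Drop the lock first, then perform any invalidate that is still pending. */
static void sketch_unlock(struct p2m_sketch *p)
{
    assert(p->locked);
    p->locked = false;
    if (p->need_flush)
        sketch_flush(p);
}
```

Several changes made under one lock hold cost a single invalidate at unlock time, and a synchronous flush leaves nothing pending for the unlock path.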
Tim Deegan [Mon, 14 Mar 2016 11:05:48 +0000 (11:05 +0000)]
x86: limit GFNs to 32 bits for shadowed superpages.
Superpage shadows store the shadowed GFN in the backpointer field,
which for non-BIGMEM builds is 32 bits wide. Shadowing a superpage
mapping of a guest-physical address above 2^44 would lead to the GFN
being truncated there, and a crash when we come to remove the shadow
from the hash table.
Track the valid width of a GFN for each guest, including reporting it
through CPUID, and enforce it in the shadow pagetables. Set the
maximum width to 32 for guests where this truncation could occur.
This is XSA-173.
Reported-by: Ling Liu <liuling-it@360.cn> Signed-off-by: Tim Deegan <tim@xen.org> Signed-off-by: Jan Beulich <jbeulich@suse.com>
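The arithmetic behind the truncation in the commit above can be illustrated as follows. This is a sketch that assumes 4 KiB pages; the sketch_* names are hypothetical and are not the helpers the patch adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Guest frame number of a guest-physical address. */
static uint64_t sketch_gfn_of(uint64_t gpa)
{
    return gpa >> SKETCH_PAGE_SHIFT;
}

/* Does 'gfn' fit in a field that is 'width' bits wide? */
static bool sketch_gfn_valid(uint64_t gfn, unsigned int width)
{
    assert(width >= 1 && width <= 64);
    return width == 64 || (gfn >> width) == 0;
}

/* What a 32-bit backpointer field ends up holding. */
static uint32_t sketch_store_backpointer(uint64_t gfn)
{
    return (uint32_t)gfn;
}
```

With these assumptions, the address 2^44 has GFN 2^32, which no longer fits in 32 bits and is stored as 0, while the last address below 2^44 still has a GFN that fits.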
Andrew Cooper [Tue, 19 Apr 2016 17:27:05 +0000 (18:27 +0100)]
tools/libxc: Correct use of X86_XSS_MASK in guest xstate generation
c/s 75f9455e "tools/libxc: Calculate xstate cpuid leaf from guest information"
incorrectly inverted the shift and mask when using X86_XSS_MASK. Luckily, the
mask is currently zero, avoiding incorrect calculations.
While adjusting this, use an explicit uint32_t cast rather than masking against
0xffffffff.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
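Why the order of shift and mask matters when splitting a 64-bit mask into two 32-bit register values can be sketched as below. The exact form of the original bug is not shown in this log, so sketch_mask_hi_inverted() is only one hypothetical way such an inversion can look; all names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Low and high halves of a 64-bit mask, using explicit uint32_t casts. */
static uint32_t sketch_mask_lo(uint64_t mask)
{
    return (uint32_t)mask;
}

static uint32_t sketch_mask_hi(uint64_t mask)
{
    return (uint32_t)(mask >> 32);
}

/* Hypothetical inverted order: masking first discards the upper half,
 * so the subsequent shift always yields zero. */
static uint32_t sketch_mask_hi_inverted(uint64_t mask)
{
    return (uint32_t)((mask & 0xffffffffULL) >> 32);
}
```

For a mask of zero both orders agree, which is consistent with the commit's remark that a zero mask avoided incorrect calculations.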
Requested-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
libxc: cpupools: adjust retry loop in xc_cpupool_removecpu()
Commit 1ef6beea187b ("libxc: do some retries in xc_cpupool_removecpu()
for EBUSY case") added a retry loop in xc_cpupool_removecpu() for the
EBUSY case. As EBUSY was returned in multiple error situations the
loop would have been executed in situations where a retry would not
be successful. Additionally, calling sleep(1) between the retries is a
bad idea when being called in a daemon.
The hypervisor has been changed to return different error values now.
The retry added in above mentioned commit should be done in the
EADDRINUSE case now. As the error condition should last only for a
very short time, the sleep(1) call can be removed.
Requested-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: Alan Robinson <alan.robinson@ts.fujitsu.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
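A retry loop of the shape described in the commit above might look like the sketch below. The retry bound, the callback type and the two demo operations are assumptions made for illustration; this is not the libxc implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_RETRIES 10 /* hypothetical bound, not taken from libxc */

/* Hypothetical operation: returns 0 on success, or -1 with errno set. */
typedef int (*sketch_op_fn)(void *ctx);

/* Retry only while the failure is EADDRINUSE, and without sleeping. */
static int sketch_retry_on_eaddrinuse(sketch_op_fn op, void *ctx)
{
    int rc, tries = 0;

    assert(op != NULL);
    do {
        rc = op(ctx);
    } while (rc < 0 && errno == EADDRINUSE && ++tries < SKETCH_MAX_RETRIES);

    return rc;
}

/* Demo operation: fails with EADDRINUSE until the counter reaches zero. */
static int sketch_transient_op(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        errno = EADDRINUSE;
        return -1;
    }
    return 0;
}

/* Demo operation: always fails with EBUSY and counts how often it ran. */
static int sketch_busy_op(void *ctx)
{
    int *calls = ctx;

    (*calls)++;
    errno = EBUSY;
    return -1;
}
```

A transient EADDRINUSE is retried until the operation succeeds, whereas any other error such as EBUSY is returned after a single attempt.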
xen: cpupools: return different error values for cpupool operations
Today there are several different situations in which moving a cpu
from or to a cpupool will return -EBUSY. This makes it hard for the
user to know what went wrong, as the Xen tools are not capable of
printing a detailed error message.
Depending on the situation return different error codes in order to
enable the tools to print useful messages.
Requested-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Sun, 17 Apr 2016 22:36:53 +0000 (23:36 +0100)]
libxl: fix old style declarations
Fix errors like:
/local/work/xen.git/dist/install/usr/local/include/libxl_uuid.h:59:1: error: 'static' is not at beginning of declaration [-Werror=old-style-declaration]
void static inline libxl_uuid_copy_0x040400(libxl_uuid *dst,
^
/local/work/xen.git/dist/install/usr/local/include/libxl_uuid.h:59:1: error: 'inline' is not at beginning of declaration [-Werror=old-style-declaration]
/local/work/xen.git/dist/install/usr/local/include/libxl.h:1233:1: error: 'static' is not at beginning of declaration [-Werror=old-style-declaration]
int static inline libxl_domain_create_restore_0x040200(
^
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
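The shape of the fix for the errors quoted above is to put the storage-class and function specifiers first. The function below is a made-up example, not one of the libxl declarations.

```c
#include <assert.h>

/* Old style, rejected under -Werror=old-style-declaration:
 *     int static inline sketch_add(int a, int b) { ... }
 * Accepted: 'static' and 'inline' come before the return type. */
static inline int sketch_add(int a, int b)
{
    return a + b;
}
```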
Wei Liu [Wed, 13 Apr 2016 17:02:36 +0000 (18:02 +0100)]
hotplug/Linux: fix same_vm check in block script
The original same_vm check has two bugs. One is that it fails when a
stubdom is in use, because it relies on the numeric domid to check whether
two domains are in fact the same one. The other is that the check fails
when two stubdoms are checked against each other.
The first bug is fixed by using uuid to identify a domain. The second
bug is fixed by comparing the domains two stubdoms serve.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
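The logic of the fixed check in the commit above can be sketched in C as below. The real check lives in a shell hotplug script; the struct, its fields and the function names here are hypothetical, and resolving a stubdom to the domain it serves before comparing UUIDs is one assumed way of combining the two fixes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical view of a domain as seen by the check. */
struct sketch_dom {
    const char *uuid;        /* this domain's UUID */
    const char *target_uuid; /* UUID of the domain a stubdom serves, else NULL */
};

/* A stubdom is identified with the domain it serves. */
static const char *sketch_effective_uuid(const struct sketch_dom *d)
{
    assert(d != NULL && d->uuid != NULL);
    return d->target_uuid ? d->target_uuid : d->uuid;
}

/* Compare by UUID rather than by numeric domid. */
static bool sketch_same_vm(const struct sketch_dom *a,
                           const struct sketch_dom *b)
{
    return strcmp(sketch_effective_uuid(a), sketch_effective_uuid(b)) == 0;
}
```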
Andrew Cooper [Thu, 14 Apr 2016 19:54:15 +0000 (20:54 +0100)]
tools/libxl: Fix legacy migration following COLO backchannel breakage
c/s f5d947bf1b "tools/libxl: add back channel support to read stream"
made a bogus adjustment to libxl__stream_read_start(), including
removing the comment hinting at what was going on, which breaks
conversion of a legacy migration stream.
Symptoms look like:
root@anonymi:~ # xl migrate domU host
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x1/0x0/2677)
xc: error: error polling suspend notification channel: -1: Internal error
Loading new save file <incoming migration stream> (new xl fmt info 0x1/0x0/2677)
Savefile contains xl domain config in JSON format
Parsing config from <saved>
libxl: error: libxl_stream_read.c:327:stream_header_done: Invalid ident: expected 0x4c6962786c466d74, got 0x01f00f0000000000
libxl: error: libxl_utils.c:430:libxl_read_exactly: file/stream truncated reading ipc msg header from domain 1 save/restore helper stdout pipe
The adjustment is not required for backchannel support (as there is no
interaction between back channels and legacy conversion), and caused
stream->fd to be latched in the datacopier before legacy conversion
substitutes it for the fd which is the output of the conversion script.
This causes libxl to consume data from the legacy stream rather than the
v2 stream, and for the conversion script to encounter an error as the
legacy stream appears to skip ahead.
Undo the adjustments to libxl__stream_read_start(), and introduce a
better description of what is going on. Introduce some extra assertions
to try and catch similar breakage in the future.
Reported-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wen Congyang <wency@cn.fujitsu.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com> Tested-by: Olaf Hering <olaf@aepfle.de>
xen: change the sizes of memory fields in the HVM start info to be 64bits
At the moment the only consumer of this structure is x86, but other arches
might also use it, so make all the fields 64bits. On x86 Xen will still try
to place everything below the 4GiB boundary, but that might not be feasible
in other arches.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Requested-by: Jan Beulich <jbeulich@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
libxl/save: set domain_suspend_state->domid in do_domain_soft_reset()
c/s d5c693d "libxl/save: Refactor libxl__domain_suspend_state" broke soft
reset as libxl__domain_suspend_device_model() now fails when domid is not set
in libxl__domain_suspend_state.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
drivers/pl011: ACPI: The interrupt should always be high level triggered
The SPCR does not specify if the interrupt is edge or level triggered.
So the configuration needs to be hardcoded in the code.
Based on the PL011 TRM (see 2.2.8 in ARM DDI 0183G), the interrupt generated
will be active high. Whilst the wording may be interpreted differently,
the SBSA (section 4.3.2 in ARM-DEN-0029 v2.3) states the PL011 is
implemented with a level triggered interrupt.
So the driver should configure the interrupt as high level triggered.
xen: sched: fix spinlock issue in schedule_cpu_switch().
Commit 94734ab7c3f5 ("xen: sched: close potential races
when switching scheduler to CPUs") buggily replaced a call
to pcpu_schedule_lock_irq() with just pcpu_schedule_lock(),
causing the relevant irq_safe vs. non-irq_safe ASSERT()
in check_lock() to trigger.
Fix that.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Mon, 11 Apr 2016 09:03:55 +0000 (10:03 +0100)]
x86/pv: Correctly fold vIOPL back into vcpu_guest_context
c/s f71ecb6 "x86: introduce a new VMASSIST for architectural behaviour of
iopl" shifted the vcpu iopl field by 12, but didn't update the logic which
reconstructs the guest's eflags for migration.
Existing guest kernels set a vIOPL of 1, to prevent them from faulting when
accessing IO ports. This bug manifests as a crash after migrate, as the vIOPL
reverts back to the default of 0, and the guest suffers an unexpected #GP
fault.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
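Folding a pre-shifted IOPL value back into eflags, as the commit above describes, can be sketched like this. IOPL occupies eflags bits 12-13 on x86; the function and macro names are hypothetical and this is not the Xen code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_EFLAGS_IOPL 0x3000u /* IOPL field: eflags bits 12-13 */

/* 'stored_iopl' is kept pre-shifted, i.e. already in eflags bit positions,
 * so it is merged in as-is rather than shifted a second time. */
static uint32_t sketch_fold_iopl(uint32_t eflags, uint32_t stored_iopl)
{
    assert((stored_iopl & ~SKETCH_EFLAGS_IOPL) == 0);
    return (eflags & ~SKETCH_EFLAGS_IOPL) | stored_iopl;
}

/* Privilege level (0-3) encoded in an eflags value. */
static unsigned int sketch_iopl_level(uint32_t eflags)
{
    return (eflags & SKETCH_EFLAGS_IOPL) >> 12;
}
```

A guest that set a vIOPL of 1 then has level 1 in the reconstructed eflags, instead of reverting to 0.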
xen/arm: acpi: The boot CPU does not always match the first entry in the MADT
Since the ACPI 6.0 errata document [1], the first entry in the MADT
does not have to correspond to the boot CPU.
Introduce a new variable to know if a MADT entry matching the boot CPU
is found. Furthermore, it's not necessary to check if the MPIDR is
duplicated for the boot CPU. So the rest of the function can be skipped.
[1] 1380 Unnecessary restrictions to FW vendors in ordering of GIC structures
in MADT
Andrew Cooper [Tue, 24 Nov 2015 14:49:49 +0000 (14:49 +0000)]
tools/libxc: Calculate xstate cpuid leaf from guest information
The existing logic is broken for heterogeneous migration. By always
advertising the host maximum xstate, a migration to a less capable host always
fails as Xen cannot accommodate the xcr0_accum in the migration stream.
By calculating xstate from the feature information (which a multi-host
toolstack will have levelled appropriately), the guest will have the current
host's maximum xstate advertised, allowing for correct migration to less
capable hosts.
In addition, some further improvements and corrections:
- don't discard the known flags in sub-leaves 2..63 ECX
- zap sub-leaves beyond 62
- zap all bits in leaf 1, EBX/ECX. No XSS features are currently supported.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Tue, 6 Oct 2015 15:01:37 +0000 (16:01 +0100)]
tools/libxc: Wire a featureset through to cpuid policy logic
Later changes (Patch titled "tools/libxc: Use featuresets rather than
guesswork") will cause the cpuid generation logic to seed their
information from a featureset. This patch adds the infrastructure to
specify a featureset, and will obtain the appropriate defaults from Xen
if omitted.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Thu, 4 Feb 2016 22:42:50 +0000 (22:42 +0000)]
tools: Utility for dealing with featuresets
It is able to report the current featuresets; both the static masks and
dynamic featuresets from Xen, or to decode an arbitrary featureset into
`/proc/cpuinfo` style strings.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Mon, 25 Jan 2016 17:07:13 +0000 (17:07 +0000)]
tools/libxc: Expose the automatically generated cpu featuremask information
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Tue, 17 Nov 2015 18:11:18 +0000 (18:11 +0000)]
tools/libxc: Use public/featureset.h for cpuid policy generation
Rather than having a different local copy of some of the feature
definitions.
Modify the xc_cpuid_x86.c cpumask helpers to appropriately truncate the
new values.
As some of the features have been renamed in the public API, similar renames
are made here.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Thu, 21 Jan 2016 14:45:24 +0000 (14:45 +0000)]
tools/libxc: Modify bitmap operations to take void pointers
The type of the pointer to a bitmap is not interesting; it does not affect the
representation of the block of bits being pointed to.
Make the libxc functions consistent with those in Xen, so they can work just
as well with 'unsigned int *' based bitmaps.
As part of doing so, change the implementation to be in terms of char rather
than unsigned long. This fixes alignment concerns with ARM.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
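Bitmap helpers that take void pointers and work in terms of char, as in the commit above, might look like the following sketch. The names and the bit numbering within a byte are assumptions; these are not the libxc functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* The pointer type is irrelevant; the bitmap is addressed byte by byte,
 * which imposes no alignment requirement beyond that of char. */
static void sketch_set_bit(unsigned int nr, void *addr)
{
    unsigned char *p = addr;

    assert(addr != NULL);
    p[nr / CHAR_BIT] |= (unsigned char)(1u << (nr % CHAR_BIT));
}

static void sketch_clear_bit(unsigned int nr, void *addr)
{
    unsigned char *p = addr;

    assert(addr != NULL);
    p[nr / CHAR_BIT] &= (unsigned char)~(1u << (nr % CHAR_BIT));
}

static bool sketch_test_bit(unsigned int nr, const void *addr)
{
    const unsigned char *p = addr;

    assert(addr != NULL);
    return (p[nr / CHAR_BIT] >> (nr % CHAR_BIT)) & 1u;
}
```

The same helpers accept an 'unsigned int *' or an 'unsigned long *' bitmap without casts at the call site.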
Andrew Cooper [Tue, 4 Aug 2015 14:37:43 +0000 (15:37 +0100)]
xen+tools: Export maximum host and guest cpu featuresets via SYSCTL
And provide stubs for toolstack use.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: David Scott <dave@recoil.org> Acked-by: Jan Beulich <JBeulich@suse.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Fri, 27 Nov 2015 18:34:57 +0000 (18:34 +0000)]
x86/domctl: Update PV domain cpumasks when setting cpuid policy
This allows PV domains with different featuresets to observe different values
from a native cpuid instruction, on supporting hardware.
It is important to leak the host view of X2APIC, HTT and CMP_LEGACY through to
guests, even though they could be hidden. These flags affect how to interpret
other cpuid leaves which are not maskable.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Thu, 26 Nov 2015 18:56:43 +0000 (18:56 +0000)]
x86/pv: Provide custom cpumasks for PV domains
And use them in preference to cpumask_defaults on context switch. HVM domains
must not be masked (to avoid interfering with cpuid calls within the guest),
so always lazily context switch to the host default.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Thu, 26 Nov 2015 18:36:52 +0000 (18:36 +0000)]
x86/cpu: Context switch cpuid masks and faulting state in context_switch()
A single ctxt_switch_levelling() function pointer is provided
(defaulting to an empty nop), which is overridden in the appropriate
$VENDOR_init_levelling().
set_cpuid_faulting() is made private and included within
intel_ctxt_switch_levelling().
One (attempted) functional change is that the faulting configuration should
not be special cased for dom0. It turns out that the toolstack relies on the
special case (and indeed, on being a PV domain in the first place) to
correctly build HVM domains.
For now, the control domain is left as a special case, until further work can
be completed to remove the restriction.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Fri, 31 Jul 2015 19:38:13 +0000 (20:38 +0100)]
x86/cpu: Rework Intel masking/faulting setup
This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.
On the BSP, cpuid information is used to evaluate the potential available set
of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults. A side effect of this is that
probe_intel_cpuid_faulting() can move to being __init.
The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Fri, 31 Jul 2015 19:18:22 +0000 (20:18 +0100)]
x86/cpu: Rework AMD masking MSR setup
This patch is best reviewed as its end result rather than as a diff, as it
rewrites almost all of the setup.
On the BSP, cpuid information is used to evaluate the potential available set
of masking MSRs, and they are unconditionally probed, filling in the
availability information and hardware defaults.
The command line parameters are then combined with the hardware defaults to
further restrict the Xen default masking level.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Fri, 31 Jul 2015 14:24:03 +0000 (15:24 +0100)]
x86/cpu: Sysctl and common infrastructure for levelling context switching
A toolstack needs to know how much control Xen has over the visible cpuid
values in PV guests. Provide an explicit mechanism to query what Xen is
capable of.
This interface will currently report no capabilities. This change is
scaffolding for future patches, which will introduce detection and switching
logic, after which the interface will report hardware capabilities correctly.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <JBeulich@suse.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Thu, 26 Nov 2015 16:02:10 +0000 (16:02 +0000)]
x86/cpu: Move set_cpumask() calls into c_early_init()
Before c/s 44e24f8567 "x86: don't call generic_identify() redundantly", the
commandline-provided masks would take effect in Xen's view of the processor
features.
As the masks got applied after the query for features, the redundant call to
generic_identify() would clobber the pre-masking feature information with the
post-masking information.
Move the set_cpumask() calls into c_early_init() so the effects of the command
line parameters take place before the main query for features in
generic_identify().
The cpuid_mask_* command line parameters now limit the entire system.
Subsequent changes will cause the mask MSRs to be context switched per-domain,
removing the need to use the command line parameters for heterogeneous
levelling purposes.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 1 Dec 2015 14:35:17 +0000 (14:35 +0000)]
xen/x86: Improvements to in-hypervisor cpuid sanity checks
Currently, {pv,hvm}_cpuid() has a large quantity of essentially-static logic
for modifying the features visible to a guest. A lot of this can be subsumed
by {pv,hvm}_featuremask, which identify the features available on this
hardware which could be given to a PV or HVM guest.
This is a step in the direction of full per-domain cpuid policies, but lots
more development is needed for that. As a result, the static checks are
simplified, but the dynamic checks need to remain for now.
As a side effect, some of the logic for special features can be improved.
OSXSAVE and OSPKE will be automatically cleared because of being absent in the
featuremask. This allows the fast-forward logic to be more simple.
In addition, there are some corrections to the existing logic:
* Hiding PSE36 out of PAE mode is architecturally wrong. It turns out that
it was a bugfix for running HyperV under Xen, which wanted to see PSE36
even after choosing to use PAE paging. PSE36 is not supported by shadow
paging, so is hidden from non-HAP guests, but is still visible for HAP
guests. It is also leaked into non-HAP guests when the guest is already
running in PAE mode.
* Changing the visibility of RDTSCP based on host TSC stability or virtual
TSC mode is bogus, so dropped.
* When emulating Intel to a guest, the common features in e1d should be
cleared.
* The APIC bit in e1d (on non-Intel) is also a fast-forward from the
APIC_BASE MSR.
* A guest with XSAVES and no xcr0|xss features should see
XSTATE_AREA_MIN_SIZE in %ebx (bug in c/s 9d313bde "x86/xsaves: ebx may
return wrong value using CPUID eax=0xd,ecx =1").
As a small improvement, use compiler-visible &'s and |'s, rather than
{clear,set}_bit().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Wed, 18 Nov 2015 11:43:01 +0000 (11:43 +0000)]
xen/x86: Improve disabling of features which have dependencies
APIC and XSAVE have dependent features, which also need disabling if Xen
chooses to disable a feature.
Use setup_clear_cpu_cap() rather than clear_bit(), as it takes care of
dependent features as well.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Wed, 18 Nov 2015 12:51:20 +0000 (12:51 +0000)]
xen/x86: Clear dependent features when clearing a cpu cap
When clearing a cpu cap, clear all dependent features. This avoids having a
featureset with intermediate features disabled, but leaf features enabled.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <JBeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Andrew Cooper [Sat, 30 Jan 2016 15:52:41 +0000 (15:52 +0000)]
xen/x86: Generate deep dependencies of features
Some features depend on other features. Working out and maintaining the exact
dependency tree is complicated, so it is expressed in the automatic generation
script.
At runtime, Xen needs to disable all features which are dependent on a
feature being disabled. Because of the flattening performed at compile time,
runtime can use a single mask to disable all eventual features.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
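The effect of flattened dependencies described in the commits above can be sketched as follows. The four features and their dependency chain are invented for illustration, and the flattening is computed by a function here, whereas the commit describes it being done at compile time by the generation script.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical features: B depends on A, C depends on B, D is independent. */
enum { F_A, F_B, F_C, F_D, F_COUNT };

/* Direct dependents of each feature, as one would write them by hand. */
static const uint32_t sketch_direct_deps[F_COUNT] = {
    [F_A] = 1u << F_B,
    [F_B] = 1u << F_C,
};

/* Flatten: every feature that directly or indirectly depends on 'feat'. */
static uint32_t sketch_deep_deps(unsigned int feat)
{
    uint32_t deep, prev;

    assert(feat < F_COUNT);
    deep = sketch_direct_deps[feat];
    do {
        prev = deep;
        for (unsigned int i = 0; i < F_COUNT; i++)
            if (deep & (1u << i))
                deep |= sketch_direct_deps[i];
    } while (deep != prev);

    return deep;
}

/* Clear a feature and all eventual dependents with a single mask. */
static uint32_t sketch_clear_cap(uint32_t featureset, unsigned int feat)
{
    return featureset & ~((1u << feat) | sketch_deep_deps(feat));
}
```

Clearing A from a full featureset leaves only D, so no leaf feature stays enabled once an intermediate feature is gone.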
Andrew Cooper [Fri, 8 Apr 2016 20:34:09 +0000 (22:34 +0200)]
x86: introduce a new VMASSIST for architectural behaviour of iopl
The existing vIOPL interface is hard to use, and need not be.
Introduce a VMASSIST with which a guest can opt-in to having vIOPL behave
consistently with native hardware.
Specifically:
- virtual iopl updated from do_iret() hypercalls.
- virtual iopl reported in bounce frames.
- guest kernels assumed to be level 0 for the purpose of iopl checks.
v->arch.pv_vcpu.iopl is altered to store IOPL shifted as it would exist in
eflags, for the benefit of the assembly code.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 8 Apr 2016 20:33:17 +0000 (22:33 +0200)]
x86/vMSI-X: fix qword write covering vector control field
Along with using the upper 32 bits of the written value, the address
also needs advancing, so that msix_write_completion() will use the
correct address for re-invocation of msixtbl_write().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
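The address adjustment in the commit above can be sketched as below. The 16-byte MSI-X table entry with the vector control dword at offset 12 follows the PCI layout; the sketch additionally assumes a 16-byte-aligned table base, and the struct and function names are hypothetical rather than Xen's.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ENTRY_SIZE       16u /* bytes per MSI-X table entry */
#define SKETCH_VECTOR_CTRL_OFF  12u /* vector control dword within an entry */

struct sketch_ctrl_write {
    uint64_t address; /* address of the vector control dword */
    uint32_t value;   /* value destined for it */
};

/* Returns 1 and fills 'out' if a write of 'len' bytes at 'address' covers
 * the vector control field, 0 otherwise. */
static int sketch_vector_ctrl_write(uint64_t address, unsigned int len,
                                    uint64_t val,
                                    struct sketch_ctrl_write *out)
{
    unsigned int off = (unsigned int)(address % SKETCH_ENTRY_SIZE);

    assert(len == 4 || len == 8);
    assert(address % len == 0);

    if (len == 8 && off == SKETCH_VECTOR_CTRL_OFF - 4) {
        /* Qword write over data + vector control: take the upper 32 bits
         * and advance the address past the data dword. */
        out->address = address + 4;
        out->value = (uint32_t)(val >> 32);
        return 1;
    }
    if (len == 4 && off == SKETCH_VECTOR_CTRL_OFF) {
        out->address = address;
        out->value = (uint32_t)val;
        return 1;
    }
    return 0;
}
```

Both the qword and the dword form end up describing the same dword address, so a later re-invocation sees the vector control field rather than the data field.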
mwait-idle: support for Intel Xeon Phi Processor x200 Product Family
Enables "Intel(R) Xeon Phi(TM) Processor x200 Product Family" support,
formerly code-named KNL. It is based on modified Intel Atom Silvermont
microarchitecture.
Signed-off-by: Dasaratharaman Chandramouli <dasaratharaman.chandramouli@intel.com>
[micah.barany@intel.com: adjusted values of residency and latency] Signed-off-by: Micah Barany <micah.barany@intel.com>
[Linux commit: 281baf7a702693deaa45c98ef0c5161006b48257] Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 8 Apr 2016 20:30:44 +0000 (22:30 +0200)]
x86: calculate maximum host and guest featuresets
All of this information will be used by the toolstack to make informed
levelling decisions for VMs, and by Xen to sanity check toolstack-provided
information.
The split between the shadow and hap HVM masks is necessary due to the lack of
a "get cpuid policy" hypercall. Multi-host toolstacks (i.e. not libxl)
dealing with hap and non-hap capable hosts need to be able to calculate that
migrating a shadow guest is safe.
Future planned development work will implement proper cpuid policy handling in
Xen, including a "get policy" hypercall, but until then, the difference is
made available for toolstack use via a non-stable interface.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 8 Apr 2016 20:29:44 +0000 (22:29 +0200)]
x86: annotate VM applicability in featureset
Use attributes to specify whether a feature is applicable to be exposed to:
1) All guests
2) HVM guests
3) HVM HAP guests
and, via absence of an attribute, to no guests.
There is no current need for other categories (e.g. PV-only features), and
such categories should not be introduced if possible. These categories follow
from the fact that, with increased hardware support, a guest gets more
features to use.
These settings are derived from the existing code in {pv,hvm}_cpuid(), and
xc_cpuid_x86.c. One notable exception is EXTAPIC which was previously
erroneously exposed to guests. PV guests don't get to use the APIC and the
HVM APIC emulation doesn't support extended space.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
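The nesting of the three categories listed in the commit above can be sketched as follows. The enum names and the function are hypothetical; the actual annotations are attributes in the featureset definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Applicability annotation of a feature (absence means no guests). */
enum sketch_applic { APPLIC_NONE, APPLIC_ALL, APPLIC_HVM, APPLIC_HVM_HAP };

/* Kind of guest, in order of increasing hardware support. */
enum sketch_guest { GUEST_PV, GUEST_HVM_SHADOW, GUEST_HVM_HAP };

/* May a feature with annotation 'a' be exposed to a guest of kind 'g'? */
static bool sketch_feature_exposable(enum sketch_applic a,
                                     enum sketch_guest g)
{
    switch (a) {
    case APPLIC_ALL:
        return true;
    case APPLIC_HVM:
        return g != GUEST_PV;
    case APPLIC_HVM_HAP:
        return g == GUEST_HVM_HAP;
    case APPLIC_NONE:
        return false;
    }
    assert(!"unknown applicability");
    return false;
}
```

Each step up in hardware support only ever adds features, which is why no PV-only category is needed in this scheme.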
And already applied as I see. I think rushing things in like this is
not a solution, no matter that we want to freeze the tree today.
Changes like this should be free to go in as de-facto bug fixes
after the freeze date.
libxl: remove code added to use the 'phy' backend with CDROM devices
This is a partial revert of 612f15, that allowed CDROM devices to use the
'phy' PV backend. Due to limitations in the current implementation of the
libxl_cdrom_insert function, the PV backend used in conjunction with an
emulated CDROM device must always be Qdisk at the moment. This is due to
libxl_cdrom_insert not running disk hotplug scripts on plug and unplug of PV
CDROM backends (and possibly other yet to be identified issues).
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
libxl: set the backend type to Qdisk for CDROM devices on DM HVM guests
This is needed because the cd-{insert/eject} functions are not prepared to
deal with blkback, which would be used by default if no backend was
specified.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
libxl: set the device model version earlier in xenstore
So libxl doesn't have to pass the build info around just to get the device
model used by the guest. This allows libxl__device_nic_setdefault to be
simplified.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
xen/arm64: correctly emulate the {w, x}zr registers
On AArch64, encoding 31 for an R<n> in the HSR is used to represent
either {w,x}sp or {w,x}zr (See C1.2.4 in ARM DDI 0486A.d) depending on
how the register field is interpreted by the instruction.
All the instructions trapped by Xen (either via a sysreg access or
data abort) interpret encoding 31 as {w,x}zr. Therefore we don't have
to worry about the possibility that a trap could refer to sp or about
decoding the instruction.
For example AArch64 LDR and STR can have zr in the source/target
register <Xt>, but never sp. sp can be present in the destination
pointer (i.e. "[sp]"), but that would be represented by the value of
FAR_EL2, not in the HSR.
For AArch32 it is possible for a LDR to target the PC, but this would
not result in a valid ISS in the HSR register. However this could only
occur if loading or storing the PC to MMIO, which we simply choose not
to support for now.
Finally, features such as xenaccess can lead to us trapping on
arbitrary instructions accessing RAM and not just for MMIO. However in
many such cases HSR.ISS is not valid and in general features such as
xenaccess do not rely on the nature of the specific instruction, they
resolve the fault (via information found elsewhere e.g. FAR_EL2)
without needing to know anything about the instruction which triggered
the trap.
The register zr represents the zero register, i.e. it will always
return 0 and writes to it are ignored. To properly handle this property,
2 new helpers have been introduced {get,set}_user_reg to read/write a
value from/to a register. All the calls to select_user_reg have been
replaced by these 2 helpers.
Furthermore, the code to emulate encoding 31 in select_user_reg has been
dropped because it was invalid. For AArch64 context, the encoding is
used for sp or zr. For AArch32 context, the ISS won't be valid for data
abort from AArch32 using r15 (i.e. pc) as source/destination (See D7-1881
ARM DDI 0487A.d, note the validity is more restrictive than on ARMv7).
It's also not possible to use r15 in co-processor instructions.
This patch fixes MMIO registers and sysregs being set to a random value
(actually PC) instead of zero by something like:
*((volatile int*)reg) = 0;
compilers tend to generate "str wzr, [xx]" here.
[ian: added BUG_ON to select_user_reg and clarified bits of the commit message] Reported-by: Marc Zyngier <Marc.Zyngier@arm.com> Signed-off-by: Julien Grall <julien.grall@arm.com> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
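The zero-register behaviour described in the commit above can be sketched as below. The struct layout and signatures are assumptions made for illustration; only the idea of a get/set helper pair comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ZR 31u /* encoding 31 is treated as {w,x}zr */

/* Hypothetical register file: x0..x30 only, zr has no storage. */
struct sketch_regs {
    uint64_t x[31];
};

/* Reading the zero register always yields 0. */
static uint64_t sketch_get_user_reg(const struct sketch_regs *r,
                                    unsigned int idx)
{
    assert(idx <= SKETCH_ZR);
    return idx == SKETCH_ZR ? 0 : r->x[idx];
}

/* Writing the zero register is silently ignored. */
static void sketch_set_user_reg(struct sketch_regs *r, unsigned int idx,
                                uint64_t val)
{
    assert(idx <= SKETCH_ZR);
    if (idx != SKETCH_ZR)
        r->x[idx] = val;
}
```

An emulated "str wzr, [xx]" would then store the value returned for index 31, which is 0 rather than the content of some other register.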
Fu Wei [Tue, 5 Apr 2016 16:46:36 +0000 (00:46 +0800)]
xen/arm64: check XSM Magic from the second unknown module.
This patch adds a has_xsm_magic helper function for detecting XSM
from the second unknown module.
If Xen can't determine the kind of a module from its compatible string, we
guess the kind of each unknown module in turn:
(1) The first unknown must be kernel.
(2) Detect the XSM Magic from the 2nd unknown:
a. If it's XSM, set the kind as XSM, and that also means we
won't load ramdisk;
b. if it's not XSM, set the kind as ramdisk.
So if the user wants to load a ramdisk, it must be the 2nd unknown.
We also check for the XSM Magic in any following unknowns, and set their
kind according to the return value of has_xsm_magic.
This way, arm64 behavior is compatible with x86, which simplifies
multi-arch bootloaders such as GRUB.
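The guessing rules above can be sketched as follows. This is an illustration under stated assumptions: the magic value is a placeholder (not the real XSM magic), guess_kind() and the enum are hypothetical names, and treating later non-XSM unknowns as "still unknown" is one reading of the text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum module_kind { KIND_UNKNOWN, KIND_KERNEL, KIND_RAMDISK, KIND_XSM };

/* Placeholder magic for illustration only; NOT the real XSM magic. */
#define EXAMPLE_XSM_MAGIC UINT32_C(0x0badf00d)

/* True if the blob starts with the (placeholder) magic number. */
static bool has_xsm_magic(const void *blob, size_t len)
{
    uint32_t magic;

    if (len < sizeof(magic))
        return false;
    memcpy(&magic, blob, sizeof(magic));
    return magic == EXAMPLE_XSM_MAGIC;
}

/* Guess the kind of the n-th (0-based) module lacking a compatible string. */
static enum module_kind guess_kind(unsigned int n, const void *blob,
                                   size_t len)
{
    if (n == 0)
        return KIND_KERNEL;          /* (1) first unknown is the kernel */
    if (has_xsm_magic(blob, len))
        return KIND_XSM;             /* (2a) and any later XSM blob */
    if (n == 1)
        return KIND_RAMDISK;         /* (2b) only the 2nd can be ramdisk */
    return KIND_UNKNOWN;             /* assumed: later non-XSM stay unknown */
}
```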
Signed-off-by: Fu Wei <fu.wei@linaro.org> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Julien Grall <julien.grall@arm.com>
xen: change the sizes of fields in the HVM start info layout to be 64bits
At the moment the only consumer of this structure is x86, but other arches
might also use it, so make all the fields 64bits. On x86 Xen will still try
to place everything below the 4GiB boundary, but that might not be feasible
in other arches.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Requested-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Daniel De Graaf [Wed, 6 Apr 2016 19:35:59 +0000 (15:35 -0400)]
flask: change default state to enforcing
The previous default of "permissive" is meant for developing or
debugging a disaggregated system. However, this default makes it too
easy to accidentally boot a machine in this state, which does not place
any restrictions on guests. This is not suitable for normal systems
because any guest can perform any operation (including operations like
rebooting the machine, kexec, and reading or writing another domain's
memory).
This change will cause the boot to fail if you do not specify an XSM
policy during boot; if you need to load a policy from dom0, use the
"flask=late" boot parameter.
Original patch by Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>; modified
to also change the default value of flask_enforcing so that the policy
is not still in permissive mode. This also removes the (no longer
documented) command line argument directly changing that variable since
it has been superseded by the flask= parameter.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Vikram Sethi [Tue, 29 Mar 2016 04:46:12 +0000 (23:46 -0500)]
arm: Fix asynchronous aborts (SError exceptions) due to bogus PTEs
The ARMv8 architecture allows data and instructions to be prefetched
from memory locations marked as normal memory. A prefetch does not
mean that the data/instruction has to be used/executed in the code
flow. All PTEs that appear valid to the MMU must contain a valid
physical address with proper attributes; otherwise an MMU table walk
might cause imprecise asynchronous aborts.
The way the current Xen code prepares page tables for frametable
and xenheap memory can create bogus PTEs. This patch fixes the
issue by clearing page table memory before populating EL2 L0/L1
PTEs. Without this patch Xen crashes on Qualcomm Technologies
server chips due to asynchronous aborts.
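As a rough illustration of "clear first, then populate" (not the Xen implementation): the table size, the valid bit, and populate_table() below are assumptions made up for this sketch.

```c
#include <stdint.h>
#include <string.h>

#define PT_ENTRIES 512u
#define PTE_VALID  UINT64_C(1)   /* illustrative valid bit */

/*
 * Fill 'count' entries starting at 'first'. Every other slot is left
 * zeroed, i.e. invalid, so a table walk never sees a stale entry that
 * merely looks valid.
 */
static void populate_table(uint64_t table[PT_ENTRIES], unsigned int first,
                           unsigned int count, uint64_t base, uint64_t step)
{
    unsigned int i;

    memset(table, 0, PT_ENTRIES * sizeof(table[0]));
    for (i = 0; i < count && first + i < PT_ENTRIES; i++)
        table[first + i] = (base + i * step) | PTE_VALID;
}
```

Without the memset(), any slot outside the populated range keeps whatever the freshly allocated page contained, which is exactly the kind of bogus PTE a prefetch could trip over.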
The explanation of speculation/prefetching is scattered throughout the
ARM specification, but the two sections below have useful information.
E2.8 Memory types and attributes (ver DDI0487A_h)
G4.12.6 External abort on a translation table walk (ver DDI0487A_h)
Justin Weaver [Fri, 18 Mar 2016 15:40:09 +0000 (16:40 +0100)]
xen: sched: implement vcpu hard affinity in Credit2
as it was still missing.
Note that this patch "only" implements hard affinity,
i.e., the possibility of specifying on what pCPUs a
certain vCPU can run. Soft affinity (which expresses a
preference for vCPUs to run on certain pCPUs) is still
not supported by Credit2, even after this patch.
Dario Faggioli [Fri, 18 Mar 2016 16:03:51 +0000 (17:03 +0100)]
xen: sched: provide some scratch space for not putting cpumasks on stack
directly, from schedule.c, for any scheduler that needs
it to use it.
In fact, Credit1 and RTDS need this already. Credit2 is
also going to need it, for supporting hard affinity
(which is, typically, what requires a lot of cpumask
manipulations, inside various functions).
Therefore, let's define the scratch space at a broader
scope, to limit code duplication in handling it.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
xen: sched: allow for choosing credit2 runqueues configuration at boot
In fact, credit2 uses CPU topology to decide how to arrange
its internal runqueues. Before this change, only 'one runqueue
per socket' was allowed. However, experiments have shown that,
for instance, having one runqueue per physical core improves
performance, especially in case hyperthreading is available.
In general, it makes sense to allow users to pick one runqueue
arrangement at boot time, so that:
- more experiments can be easily performed to even better
assess and improve performance;
- one can select the best configuration for their specific
use case and/or hardware.
This patch enables the above.
Note that, for correctly arranging runqueues to be per-core,
just checking cpu_to_core() on the host CPUs is not enough.
In fact, cores (and hyperthreads) on different sockets, can
have the same core (and thread) IDs! We, therefore, need to
check whether the full topology of two CPUs matches, for
them to be put in the same runqueue.
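A small sketch of why the core ID alone is not enough. The struct and function names are hypothetical; the IDs stand for whatever the topology helpers report on the host.

```c
#include <stdbool.h>

/* Hypothetical per-CPU topology record; field names are illustrative. */
struct cpu_topo {
    unsigned int socket_id;
    unsigned int core_id;
};

/* Per-socket arrangement: only the socket has to match. */
static bool same_socket(const struct cpu_topo *a, const struct cpu_topo *b)
{
    return a->socket_id == b->socket_id;
}

/* Per-core arrangement: socket AND core must match, not just the core. */
static bool same_core(const struct cpu_topo *a, const struct cpu_topo *b)
{
    return same_socket(a, b) && a->core_id == b->core_id;
}
```

Core 1 on socket 0 and core 1 on socket 1 have equal core IDs, yet same_core() keeps them in different runqueues.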
Note also that the default (although not functional) for
credit2, until now, has been per-socket runqueue. This patch
leaves things that way, to avoid mixing policy and technical
changes.
Finally, it would be a nice feature to be able to select
a particular runqueue arrangement, even when creating a
Credit2 cpupool. This is left as future work.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Signed-off-by: Uma Sharma <uma.sharma523@gmail.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
xen: sched: fix per-socket runqueue creation in credit2
The credit2 scheduler tries to set up runqueues in such
a way that there is one of them per socket. However,
that does not work. The issue is described in bug #36
"credit2 only uses one runqueue instead of one runq per
socket" (http://bugs.xenproject.org/xen/bug/36), and a
solution has been attempted by an old patch series:
Here, we take advantage of the fact that now initialization
happens (for all schedulers) during CPU_STARTING, so we
have all the topology information available when necessary.
This is true for all the pCPUs _except_ the boot CPU. That
is not an issue, though. In fact, no runqueue exists yet
when the boot CPU is initialized, so we can just create
one and put the boot CPU in there.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Fri, 18 Mar 2016 17:32:50 +0000 (18:32 +0100)]
xen: sched: on credit2, don't reprogram the timer if idle
As other schedulers are doing already: if the idle vcpu
is picked and scheduled, there is no need to reprogram the
scheduler timer to fire and invoke csched2_schedule()
again in future.
Tickling or external events will serve as pokes, when
necessary, but until we can, we should just stay idle.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reported-by: Tianyang Chen <tiche@seas.upenn.edu> Suggested-by: George Dunlap <george.dunlap@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
xen: sched: close potential races when switching scheduler to CPUs
In short, the point is making sure that the actual switch
of scheduler and the remapping of the scheduler's runqueue
lock occur in the same critical section, protected by the
"old" scheduler's lock (and not, e.g., in the free_pdata
hook, as it is now for Credit2 and RTDS).
Not doing so is (at least) racy. For instance,
if we switch cpu X from Credit2 to Credit, we do:
So, the first problem is that, if anything related to
scheduling, and involving CPU, happens at [1] or [2], we:
- take csched2_lock,
- operate on Credit1 functions and data structures,
which is no good!
The second problem is that the ASSERT at [3] triggers, and
the third is that, at [4], we screw up the lock remapping we've
done for ourselves in csched2_init_pdata()!
The first problem arises because there is a window during
which the lock is already the new one, but the scheduler is
still the old one. The other two arise because we let schedulers
mess with the lock (re)mapping done by others.
This patch, therefore, introduces a new hook in the scheduler
interface, called switch_sched, meant to be used when
switching scheduler on a CPU, and implements it for the
various schedulers, so that things are done in the proper
order and under the protection of the best suited (set of)
lock(s). It is necessary to add the hook (as opposed to
continuing to do things in generic code), because different
schedulers may have different locking schemes.
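The core idea, swapping the scheduler and remapping the lock inside one critical section held under the old lock, can be sketched like this. It is a toy model: pthread mutexes stand in for Xen's spinlocks, and the struct layout and the switch_sched() signature are assumptions, not the real hook.

```c
#include <pthread.h>

/* Hypothetical stand-ins for a scheduler and a CPU's scheduling state. */
struct sched_ops {
    const char *name;
};

struct cpu_sched {
    const struct sched_ops *ops;   /* scheduler currently driving this CPU */
    pthread_mutex_t *lock;         /* runqueue lock this CPU maps to */
};

/*
 * Switch both the scheduler and the lock mapping while holding the OLD
 * lock, so no one can observe "new lock, old scheduler" or vice versa.
 */
static void switch_sched(struct cpu_sched *cpu,
                         const struct sched_ops *new_ops,
                         pthread_mutex_t *new_lock)
{
    pthread_mutex_t *old_lock = cpu->lock;

    pthread_mutex_lock(old_lock);
    cpu->ops = new_ops;
    cpu->lock = new_lock;
    pthread_mutex_unlock(old_lock);   /* release what we actually took */
}
```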
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Robert VanVossen <robert.vanvossen@dornerworks.com>
xl: make return type of create_domain() more consistent.
create_domain() is of uint32_t return type, because on
success it returns the domid of the new domain, and
uint32_t is what we typically use for domid-s.
However, on failure, it returns ERROR_FAIL or ERROR_INVAL,
which are -3 and -6. Callers assign the return value to an
'int rc' variable and then check for '(rc < 0)'.
Although things work, and no tool (compiler, Coverity, etc.)
complains, using 'int' as the return type seems better.
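To illustrate the mismatch (a standalone sketch, not the xl code): the function names and the domid 5 are made up, while ERROR_FAIL being -3 is taken from the text above. The old shape only works because callers convert the unsigned value back to int, a conversion the C standard leaves implementation-defined even though common compilers wrap as expected.

```c
#include <stdint.h>

#define ERROR_FAIL (-3)

/* Old shape: the error code is squeezed through an unsigned return type. */
static uint32_t create_domain_u32(int fail)
{
    return fail ? (uint32_t)ERROR_FAIL : 5u;   /* 5 is a made-up domid */
}

/* New shape: the return type matches the 'int rc' callers store it in. */
static int create_domain_int(int fail)
{
    return fail ? ERROR_FAIL : 5;
}
```

Comparing the uint32_t result directly against 0 can never detect a failure; only the round trip through 'int rc' makes the old version behave.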
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
xl: improve return and exit codes of memory related functions
by making them more consistent with other examples in xl.
While there, give freemem() a boolean return type, which
looks more natural, and add a comment explaining why
parse_mem_size_kb() needs to diverge from the pattern.
libxl: replace the usage of uuid_t with a char array
The internals of the uuid_t struct don't match a big endian octet stream on
*BSD systems, which means that it cannot be cast directly to a
uint8_t[16].
To solve that, change the type to an unsigned char[16], which
doesn't imply any other change on Linux. On *BSDs, change the helpers so that
the uuid is always stored as a big endian byte stream.
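A sketch of what storing a field-based UUID as a big endian byte stream involves. The struct below is a stand-in modelled on the usual DCE field layout, not the system's uuid_t, and uuid_to_bytes() is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Field-based UUID in host byte order; a stand-in, not the real uuid_t. */
struct example_uuid {
    uint32_t time_low;
    uint16_t time_mid;
    uint16_t time_hi_and_version;
    uint8_t  clock_seq_hi_and_reserved;
    uint8_t  clock_seq_low;
    uint8_t  node[6];
};

/* Serialise to 16 bytes, most significant byte first, on any host. */
static void uuid_to_bytes(const struct example_uuid *u, unsigned char out[16])
{
    out[0] = (unsigned char)(u->time_low >> 24);
    out[1] = (unsigned char)(u->time_low >> 16);
    out[2] = (unsigned char)(u->time_low >> 8);
    out[3] = (unsigned char)u->time_low;
    out[4] = (unsigned char)(u->time_mid >> 8);
    out[5] = (unsigned char)u->time_mid;
    out[6] = (unsigned char)(u->time_hi_and_version >> 8);
    out[7] = (unsigned char)u->time_hi_and_version;
    out[8] = u->clock_seq_hi_and_reserved;
    out[9] = u->clock_seq_low;
    memcpy(&out[10], u->node, sizeof(u->node));
}
```

Casting such a struct to a byte array on a little endian host would yield the first three fields byte-swapped, which is why explicit shifts are used instead.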
NB: tested on FreeBSD and Linux only.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Discussed-with: Ian Jackson <Ian.Jackson@eu.citrix.com>
Discussed-with: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Thu, 7 Apr 2016 16:31:01 +0000 (18:31 +0200)]
tools: handle xl migrate --debug in legacy stream
Doing an 'xl migrate --debug domU host' on xen-4.5 adds an
XC_SAVE_ID_ENABLE_VERIFY_MODE marker, which is not handled.
Since using --debug is valid usage, handle it by logging the fact
instead of aborting the migration.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com> CC: Simon Cao <caobosimon@gmail.com> CC: George Dunlap <george.dunlap@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Mon, 4 Apr 2016 14:18:03 +0000 (15:18 +0100)]
libxl: Set rc on failure of usbdev_busaddr_to_busid
We must set rc before using `goto out'.
Bug introduced in bf7628f0 "libxl: add pvusb API".
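The pattern at stake, shown on a toy function rather than the libxl one: lookup_busid() and bind_device() are hypothetical, and ERROR_FAIL is given the libxl value -3.

```c
#include <stddef.h>

#define ERROR_FAIL (-3)

/* Hypothetical lookup that fails by returning NULL. */
static const char *lookup_busid(int bus, int addr)
{
    return (bus == 1 && addr == 2) ? "1-2" : NULL;
}

static int bind_device(int bus, int addr)
{
    int rc = 0;
    const char *busid = lookup_busid(bus, addr);

    if (!busid) {
        rc = ERROR_FAIL;   /* without this, 'out' would report success */
        goto out;
    }

    /* ... real work using busid would go here ... */

out:
    return rc;
}
```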
CID: 1358113 Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: coverity@xenproject.org CC: Simon Cao <caobosimon@gmail.com> CC: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Chunyan Liu <cyliu@suse.com>
Chunyan Liu [Thu, 7 Apr 2016 09:40:25 +0000 (17:40 +0800)]
libxl: fix rc handling in libxl_device_usbdev_list
While testing libvirt pvusb functionality, an rc check
error was found in libxl_device_usbdev_list. Correct it. This function
is not used by xl.
Signed-off-by: Chunyan Liu <cyliu@suse.com> CC: Simon Cao <caobosimon@gmail.com> CC: George Dunlap <george.dunlap@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Thu, 7 Apr 2016 22:05:01 +0000 (00:05 +0200)]
x86: remove the use of vm86_mode()
Xen, being 64bit only, cannot run PV guests in vm86 mode. HVM guests however
can be running in vm86 mode, and common codepaths need to be able to cope.
The definition of vm86_mode() in x86_64/regs.h is incorrect, as the predicate
is used by non-PV codepaths.
One buggy use is in hvm/emulate.c. An HVM guest can be in vm86 mode, and
vm86_mode() silently omits the check. Luckily, due to the VEX prefix decoding
logic in x86_emulate(), there is no path to the erroneous use with EFLAGS_VM
set.
Another potentially problematic use is in show_guest_stack(). In principle,
show_guest_stack() is common code called for both PV and HVM vcpus. HVM vcpus
exit early (with no reasonable way of making the code generic), making this
part a PV-only codepath.
Open-code its use in emulate.c, matching the surrounding code. This causes
all other uses to be in PV-only codepaths, making the code logically
dead. Drop it completely, to avoid future misuse.
Part of the resulting cleanup removes vm86attr from read_descriptor(), while
retaining one relevant piece of information, i.e. whether we are reading a
selector for an instruction fetch or a data fetch.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
x86/xsaves: ebx may return wrong value using CPUID eax=0xd,ecx =1
Refer to SDM Volume 1 Extended Region of an XSAVE Area. The value returned
by ecx[1] with cpuid function 0xd and sub-function i (i>1) indicates
the alignment of the state component i when the compacted format of the
extended region of an xsave area is used.
So when an HVM guest uses CPUID eax=0xd, ecx=1 to get the size of the area
used for the compacted format, we need to take alignment into consideration.
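A sketch of the size calculation with alignment taken into account. It assumes the usual layout of a 512-byte legacy region plus a 64-byte header before the extended region; struct xstate_comp and compacted_size() are hypothetical names, and the per-component values would come from CPUID leaf 0xd, sub-leaf i.

```c
#include <stdint.h>

/* 512-byte legacy region plus 64-byte XSAVE header. */
#define XSAVE_FIXED_BYTES 576u

/* Per-component data as reported by CPUID leaf 0xd, sub-leaf i (i > 1). */
struct xstate_comp {
    uint32_t size;    /* EAX: size of the component in bytes */
    int align64;      /* ECX bit 1: start on a 64-byte boundary */
};

/*
 * Size of the compacted area holding every component whose bit is set in
 * 'enabled'. 'comps' is indexed by component number; 'nr' must be <= 64.
 */
static uint32_t compacted_size(uint64_t enabled,
                               const struct xstate_comp *comps,
                               unsigned int nr)
{
    uint32_t offset = XSAVE_FIXED_BYTES;
    unsigned int i;

    for (i = 2; i < nr; i++) {
        if (!(enabled & (UINT64_C(1) << i)))
            continue;
        if (comps[i].align64)
            offset = (offset + 63u) & ~63u;   /* round up to 64 bytes */
        offset += comps[i].size;
    }
    return offset;
}
```

Simply summing the component sizes would under-report the area whenever an aligned component follows one whose end is not a multiple of 64.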
tools side is fixed by
"tools/libxc: Calculate xstate cpuid leaf from guest information"
by Andrew Cooper
Signed-off-by: Shuai Ruan <shuai.ruan@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 7 Apr 2016 22:03:43 +0000 (00:03 +0200)]
build: fix build with Clang
c/s 607044bf9 "build: avoid putting local absolute symbols in symbol tables"
breaks the build with Clang, as the command line argument isn't understood.
Clang does not appear to have any equivalent option, and already has
outstanding issues with duplicate symbols. Excluding this option makes the
problem no worse.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
That will turn out to be useful in the following patches, where such
code will need to be called more than just once. Create a
helper now, and move the code there, to avoid mixing code
motion and functional changes later.
In Credit2, some style cleanup is also done.
No functional change intended.
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com> Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
sched: implement .init_pdata in Credit, Credit2 and RTDS
In fact, if a scheduler needs per-pCPU information,
that needs to be initialized appropriately. So, we take
the code that is performing initializations from (right
now) .alloc_pdata, and use it for .init_pdata, leaving
only the actual allocations in the former, if any (which
is the case in RTDS and Credit1).
On the other hand, in Credit2, since we don't really
need any per-pCPU data allocation, everything that was
being done in .alloc_pdata, is now done in .init_pdata.
And the fact that now .alloc_pdata can be left undefined,
allows us to just get rid of it.
Still for Credit2, the fact that .init_pdata is called
during CPU_STARTING (rather than CPU_UP_PREPARE) kills
the need for the scheduler to setup a similar callback
itself, simplifying the code.
And thanks to this simplification, it is now also ok to
move some of the logic meant to double check that a
cpu was (or was not) initialized into ASSERTs (rather
than an if() and a BUG_ON).
The .alloc_pdata scheduler hook must, before this change,
be implemented by all schedulers, even those that
don't need to allocate anything.
Make it possible to just use the SCHED_OP(), like for
the other hooks, by using ERR_PTR() and IS_ERR() for
error reporting. This:
- makes NULL a variant of success;
- allows for errors other than ENOMEM to be properly
communicated (if ever necessary).
This, in turn, means that schedulers not needing to
allocate any per-pCPU data, can avoid implementing the
hook. In fact, the artificial implementation of
.alloc_pdata in the ARINC653 is removed (and, while there,
nuke .free_pdata too, as it is equally useless).
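A self-contained sketch of the ERR_PTR()/IS_ERR() convention, with NULL counting as success. The helper definitions below are modelled on the well-known Linux-style ones and are assumptions, as are PTR_ERR(), MAX_ERRNO, and the cpu_schedule_up() caller shape; they are not copied from Xen.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, and decode it again. */
static void *ERR_PTR(long err)        { return (void *)(intptr_t)err; }
static long PTR_ERR(const void *p)    { return (long)(intptr_t)p; }

/* Only the top MAX_ERRNO addresses count as errors; NULL does not. */
static int IS_ERR(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* A scheduler with nothing to allocate can just return NULL. */
static void *alloc_pdata_none(void) { return NULL; }

/* A failing allocation reports its errno through the pointer. */
static void *alloc_pdata_fail(void) { return ERR_PTR(-ENOMEM); }

/* Generic caller: 0 on success (NULL included), negative errno otherwise. */
static int cpu_schedule_up(void *(*alloc_pdata)(void))
{
    /* A scheduler that does not implement the hook counts as success. */
    void *pdata = alloc_pdata ? alloc_pdata() : NULL;

    if (IS_ERR(pdata))
        return (int)PTR_ERR(pdata);
    return 0;
}
```

Because success and failure are no longer distinguished by NULL, a scheduler that has no per-pCPU data can leave the hook out entirely.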
Chunyan Liu [Thu, 7 Apr 2016 09:40:28 +0000 (17:40 +0800)]
libxl: pvusb: Correctly check the controller type
A check of the controller type was missing.
Signed-off-by: Chunyan Liu <cyliu@suse.com> CC: Simon Cao <caobosimon@gmail.com> CC: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com> CC: Simon Cao <caobosimon@gmail.com> CC: George Dunlap <george.dunlap@citrix.com> Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com> CC: Simon Cao <caobosimon@gmail.com> CC: George Dunlap <george.dunlap@citrix.com> Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Chong Li [Wed, 6 Apr 2016 20:30:38 +0000 (15:30 -0500)]
libxc: fix uninitialized variable when changing rtds scheduling parameters
Commit 046c2b503a89d21b41e4d555a9f75d02af00dbc6 introduces a build
failure: in some cases (e.g., num_vcpus <=0),
xc_sched_rtds_vcpu_get/set returns an uninitialized variable.
Fix it.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Chong Li <chong.li@wustl.edu> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Chong Li [Fri, 1 Apr 2016 18:33:27 +0000 (18:33 +0000)]
xl: enable per-VCPU parameter for RTDS
Change main_sched_rtds and related output functions to support
per-VCPU settings.
Signed-off-by: Chong Li <chong.li@wustl.edu> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu> Signed-off-by: Sisu Xi <xisisu@gmail.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Chong Li [Fri, 1 Apr 2016 14:39:05 +0000 (14:39 +0000)]
libxl: enable per-VCPU parameter for RTDS
Add libxl_vcpu_sched_params_get/set and sched_rtds_vcpu_get/set
functions to support per-VCPU settings.
Signed-off-by: Chong Li <chong.li@wustl.edu> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu> Signed-off-by: Sisu Xi <xisisu@gmail.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Chong Li [Fri, 1 Apr 2016 15:14:42 +0000 (15:14 +0000)]
libxc: enable per-VCPU parameter for RTDS
Add xc_sched_rtds_vcpu_get/set functions to interact with
Xen to get/set a domain's per-VCPU parameters.
Signed-off-by: Chong Li <chong.li@wustl.edu> Signed-off-by: Meng Xu <mengxu@cis.upenn.edu> Signed-off-by: Sisu Xi <xisisu@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Roger Pau Monne [Thu, 31 Mar 2016 12:56:41 +0000 (14:56 +0200)]
libxl: properly use vdev vs local device
The current code in libxl assumed that vdev is equal to local device, but
this is only true for Linux systems. In other OSes the local device can use
a nomenclature completely different from the virtual device one.
Move the current libxl__devid_to_localdev Linux implementation out of the
OS-specific file and rename it to libxl__devid_to_vdev, and then make sure
local_device_attach_cb returns the local device in the diskpath field.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>