xenbits.xensource.com Git - xen.git/log
8 years ago x86/cpuid: Move all xstate leaf handling into guest_cpuid()
Andrew Cooper [Fri, 16 Dec 2016 16:21:20 +0000 (16:21 +0000)]
x86/cpuid: Move all xstate leaf handling into guest_cpuid()

The xstate union now contains sanitised values, so it can be handled fully in
the non-legacy path.

c/s 1c0bc709d "x86/cpuid: Perform max_leaf calculations in guest_cpuid()"
accidentally introduced a boundary error for the subleaf check, although it
was masked by the correct logic in the legacy path.

Two dynamic adjustments need making, but a TODO and BUILD_BUG_ON() are left to
cover a latent bug which will present itself when Xen starts supporting XSS
states for guests.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Introduce recalculate_xstate()
Andrew Cooper [Wed, 4 Jan 2017 15:00:23 +0000 (15:00 +0000)]
x86/cpuid: Introduce recalculate_xstate()

All data in the xstate union, other than the Da1 feature word, is derived from
other state; either feature bits from other words, or layout information which
has already been collected by Xen's xstate driver.

Recalculate the xstate information for each policy object when the feature
bits may have changed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Move x86_vendor from arch_domain to cpuid_policy
Andrew Cooper [Thu, 12 Jan 2017 11:45:10 +0000 (11:45 +0000)]
x86/cpuid: Move x86_vendor from arch_domain to cpuid_policy

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years ago x86/cpuid: Drop a guests cached x86 family and model information
Andrew Cooper [Thu, 12 Jan 2017 11:45:10 +0000 (11:45 +0000)]
x86/cpuid: Drop a guests cached x86 family and model information

The model information isn't used at all, and the family information is only
used once.

Make get_cpu_family() a static inline (as it is just basic calculation, and
the function call is probably more expensive than the function itself) and
rearrange the logic to avoid calculating model entirely if the caller doesn't
want it.
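
As a rough illustration of the calculation described above (a sketch only, not
the actual Xen helper): the name cpu_family() is hypothetical, the input is
CPUID leaf 1 EAX, and the extended-model rule shown (applied for family 6 and
above) is an assumption, since vendors differ on when the extended model bits
are significant.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode the display family from CPUID.1:EAX.
 * The model is only computed when the caller passes a non-NULL pointer.
 */
static inline unsigned int cpu_family(uint32_t raw, unsigned int *model)
{
    unsigned int fam = (raw >> 8) & 0xf;      /* base family, bits 11:8 */

    if (fam == 0xf)
        fam += (raw >> 20) & 0xff;            /* extended family, bits 27:20 */

    if (model) {
        unsigned int mod = (raw >> 4) & 0xf;  /* base model, bits 7:4 */

        if (fam >= 0x6)                       /* assumption: vendor-neutral rule */
            mod |= (raw >> 12) & 0xf0;        /* extended model, bits 19:16 */

        *model = mod;
    }

    return fam;
}
```

A caller which only wants the family passes NULL and skips the model work.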

Calculate a guest's family only when necessary in hvm_select_ioreq_server().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago kexec: implement STATUS hypercall to check if image is loaded
Eric DeVolder [Tue, 17 Jan 2017 17:29:16 +0000 (11:29 -0600)]
kexec: implement STATUS hypercall to check if image is loaded

The tools that use kexec are asynchronous in nature and do not keep
state changes. As such, provide a hypercall to find out whether an
image has been loaded for either type.

Note: No need to modify XSM as it has a one-size-fits-all check and
does not check for subcommands.

Note: No need to check KEXEC_FLAG_IN_PROGRESS (and error out of
kexec_status()) as this flag is set only once by the first/only
cpu on the crash path.

Note: This is just the Xen side of the hypercall, kexec-tools patch
to come separately.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xen/arm: Don't mix GFN and MFN when using iomem_deny_access
Julien Grall [Tue, 17 Jan 2017 15:52:53 +0000 (15:52 +0000)]
xen/arm: Don't mix GFN and MFN when using iomem_deny_access

iomem_deny_access operates on MFNs, not GFNs. Make this clear by
renaming the local variables.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago xen/arm: bootfdt.c is only used during initialization
Julien Grall [Tue, 17 Jan 2017 15:53:24 +0000 (15:53 +0000)]
xen/arm: bootfdt.c is only used during initialization

This file contains data and code only used at initialization. Mark the
file as such in the build system and correct kind_guess.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago x86emul: support ADCX/ADOX
Jan Beulich [Tue, 17 Jan 2017 09:33:25 +0000 (10:33 +0100)]
x86emul: support ADCX/ADOX

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86emul: support POPCNT
Jan Beulich [Tue, 17 Jan 2017 09:32:54 +0000 (10:32 +0100)]
x86emul: support POPCNT

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86emul: VEX.B is ignored in compatibility mode
Jan Beulich [Tue, 17 Jan 2017 09:32:25 +0000 (10:32 +0100)]
x86emul: VEX.B is ignored in compatibility mode

While VEX.R and VEX.X are guaranteed to be 1 in compatibility mode
(and hence a respective mode_64bit() check can be dropped), VEX.B can
be encoded as zero, but would be ignored by the processor. Since we
emulate instructions in 64-bit mode (except possibly in the test
harness), we need to force the bit to 1 in order to not act on the
wrong {X,Y,Z}MM register (which has no bad effect on 32-bit test
harness builds, as there the bit would again be ignored by the
hardware, and would by default be expected to be 1 anyway).

We must not, however, fiddle with the high bit of VEX.VVVV in the
decode phase, as that would undermine the checking of instructions
requiring the field to be all ones independent of mode. This is
being enforced in copy_REX_VEX() instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86emul: suppress memory writes after faulting FPU insns
Jan Beulich [Tue, 17 Jan 2017 09:31:39 +0000 (10:31 +0100)]
x86emul: suppress memory writes after faulting FPU insns

FPU insns writing to memory must not touch memory if they latch #MF (to
be delivered on the next waiting FPU insn). Note that inspecting FSW.ES
needs to be avoided for all FNST* insns, as they don't raise exceptions
themselves, but may instead be invoked with the bit already set.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago Add XENV to docs/misc
Stefano Stabellini [Mon, 16 Jan 2017 18:46:16 +0000 (10:46 -0800)]
Add XENV to docs/misc

Add the latest version of the XEN Environment table specification for
ACPI to docs/misc.

The original authors are:
  Parth Dixit <parth.dixit@linaro.org>
  Julien Grall <julien.grall@citrix.com>

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago Add STAO spec to docs/misc
Stefano Stabellini [Mon, 16 Jan 2017 18:42:18 +0000 (10:42 -0800)]
Add STAO spec to docs/misc

Add the latest version of the STAtus Override table specification for
ACPI to docs/misc.

The original authors are:

  Al Stone <al.stone@linaro.org>
  Graeme Gregory <graeme.gregory@linaro.org>
  Parth Dixit <parth.dixit@linaro.org>

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago x86/xstate: Fix array overrun on hardware with LWP
Andrew Cooper [Fri, 13 Jan 2017 18:51:04 +0000 (18:51 +0000)]
x86/xstate: Fix array overrun on hardware with LWP

c/s da62246e4c "x86/xsaves: enable xsaves/xrstors/xsavec in xen" introduced
setup_xstate_features() to allocate and fill xstate_offsets[] and
xstate_sizes[].

However, fls() casts xfeature_mask to 32bits which truncates LWP out of the
calculation.  As a result, the arrays are allocated too short, and the cpuid
infrastructure reads off the end of them when calculating xstate_size for the
guest.

On one test system, this results in 0x3fec83c0 being returned as the maximum
size of an xsave area, which surprisingly appears not to bother Windows or
Linux too much.  I suspect they both use current size based on xcr0, which Xen
forwards from real hardware.
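
The truncation can be reproduced in isolation.  The following is a minimal
sketch, not Xen's code: fls32() and fls64() are hypothetical stand-ins for a
find-last-set helper, and the nr_components_*() names are invented here to show
how an array length derived from a 32-bit view of the mask misses LWP (bit 62
of the feature mask).

```c
#include <stdint.h>

/* Hypothetical helper: 1-based index of the highest set bit, 0 if none. */
static unsigned int fls32(uint32_t x)
{
    unsigned int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Same as fls32(), but considers all 64 bits of the mask. */
static unsigned int fls64(uint64_t x)
{
    unsigned int n = 0;

    while (x) {
        n++;
        x >>= 1;
    }
    return n;
}

/* Buggy sizing: the cast drops bits 32..63, so LWP never counts. */
static unsigned int nr_components_truncated(uint64_t xfeature_mask)
{
    return fls32((uint32_t)xfeature_mask);
}

/* Intended sizing: every component up to the highest set bit gets a slot. */
static unsigned int nr_components_full(uint64_t xfeature_mask)
{
    return fls64(xfeature_mask);
}
```

With x87, SSE, AVX and LWP enabled, the truncated variant sizes the arrays for
3 components while index 62 is later read.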

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/pv: Check that emulate_privileged_op() don't change any unexpected flags
Andrew Cooper [Fri, 6 Jan 2017 20:05:36 +0000 (20:05 +0000)]
x86/pv: Check that emulate_privileged_op() don't change any unexpected flags

No bits, other than arithmetic ones and the resume flag (which will most
likely change from 1 to 0), can be changed by the instructions we permit.
Extend the check to cover other flags.
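
A check of this shape might look as follows.  This is a sketch under
assumptions: the macro and function names are hypothetical, and only the
architectural EFLAGS bit positions are taken as given.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural EFLAGS bit positions. */
#define FLAG_CF (1u << 0)
#define FLAG_PF (1u << 2)
#define FLAG_AF (1u << 4)
#define FLAG_ZF (1u << 6)
#define FLAG_SF (1u << 7)
#define FLAG_OF (1u << 11)
#define FLAG_RF (1u << 16)

#define FLAGS_ARITH (FLAG_CF | FLAG_PF | FLAG_AF | FLAG_ZF | FLAG_SF | FLAG_OF)

/*
 * Hypothetical helper: true if the only bits differing between the two
 * EFLAGS values are arithmetic flags or the resume flag.
 */
static bool only_expected_flags_changed(uint32_t before, uint32_t after)
{
    uint32_t permitted = FLAGS_ARITH | FLAG_RF;

    return ((before ^ after) & ~permitted) == 0;
}
```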

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/emul: Calculate not_64bit during instruction decode
Andrew Cooper [Fri, 13 Jan 2017 13:23:42 +0000 (13:23 +0000)]
x86/emul: Calculate not_64bit during instruction decode

... rather than repeating "generate_exception_if(mode_64bit(), EXC_UD);" in
the emulation switch statement.

Bloat-o-meter shows:

  add/remove: 0/0 grow/shrink: 1/2 up/down: 8/-495 (-487)
  function                                     old     new   delta
  per_cpu__state                                98     106      +8
  x86_decode                                  6782    6726     -56
  x86_emulate                                57160   56721    -439

The reason for x86_decode() getting smaller is that this change alters the
x86_decode_onebyte() switch statement from a chain of if()/else's to a jump
table.  The jump table adds 250 bytes of data which bloat-o-meter clearly
can't see.
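
The idea of recording the property once at decode time can be sketched as
below.  The structure and function names are hypothetical and the opcode list
is only a sample of one-byte opcodes which are invalid in 64-bit mode; this is
not the emulator's actual decode logic.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decode state: just enough to carry the flag. */
struct decode_state {
    uint8_t opcode;
    bool not_64bit;
};

/* Decode phase: note once whether the opcode is invalid in 64-bit mode. */
static void decode_onebyte(struct decode_state *s, uint8_t opcode)
{
    s->opcode = opcode;
    s->not_64bit = false;

    switch (opcode) {
    case 0x06: case 0x07: /* push/pop %es */
    case 0x27:            /* daa */
    case 0x2f:            /* das */
    case 0x37:            /* aaa */
    case 0x3f:            /* aas */
    case 0x60: case 0x61: /* pusha/popa */
    case 0xd4: case 0xd5: /* aam/aad */
        s->not_64bit = true;
        break;
    }
}

/* Execution phase: a single check replaces the per-case ones. */
static bool raises_ud(const struct decode_state *s, bool mode_64bit)
{
    return s->not_64bit && mode_64bit;
}
```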

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago tools/misc: add AVX512 vpopcntdq in xen-cpuid.c
He Chen [Mon, 16 Jan 2017 08:05:03 +0000 (16:05 +0800)]
tools/misc: add AVX512 vpopcntdq in xen-cpuid.c

Add AVX512 vpopcntdq information in xen-cpuid.c

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xenstored: remove -L option
Wei Liu [Fri, 13 Jan 2017 12:13:39 +0000 (12:13 +0000)]
xenstored: remove -L option

The only user of this option was removed in 388d3011.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
8 years ago x86emul: improve CR/DR access handling
Jan Beulich [Fri, 13 Jan 2017 14:28:31 +0000 (15:28 +0100)]
x86emul: improve CR/DR access handling

- don't accept LOCK for DR accesses (it's undefined in the manuals)
- only accept LOCK for CR accesses when the respective feature flag is
  set (which would not normally be the case for Intel)
- add (rather than or) 8 when LOCK is present; real hardware #UDs
  when both REX.W and LOCK are present, implying that these would
  rather access hypothetical CR16...23
- eliminate explicit decode_register() calls
- streamline remaining read/write code

No further functional change, i.e. not addressing the missing exception
generation (#UD for invalid CR/DR encodings, #GP(0) for invalid write
values, #DB for DR accesses with DR7.GD set).
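
To see why adding 8 differs from or-ing 8, consider a register number that is
already in the range 8-15.  The sketch below uses hypothetical function names
and a plain integer for the decoded register number; it only demonstrates the
arithmetic, not the emulator's code.

```c
#include <stdbool.h>

/* Old behaviour: or-ing 8 maps 8..15 back onto 8..15 when LOCK is present. */
static unsigned int cr_index_or(unsigned int reg, bool lock)
{
    return lock ? (reg | 8) : reg;
}

/* New behaviour: adding 8 maps 8..15 onto the hypothetical CR16..23. */
static unsigned int cr_index_add(unsigned int reg, bool lock)
{
    return lock ? reg + 8 : reg;
}
```

For register numbers 0-7 the two variants agree, so the common LOCK MOV CR0
encoding of CR8 is unaffected.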

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86emul: use switch()-wide local variable 'cr4'
Jan Beulich [Fri, 13 Jan 2017 14:27:53 +0000 (15:27 +0100)]
x86emul: use switch()-wide local variable 'cr4'

... rather than various smaller scope ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86emul: support VME and PVI
Jan Beulich [Fri, 13 Jan 2017 14:25:52 +0000 (15:25 +0100)]
x86emul: support VME and PVI

... affecting PUSHF, POPF, CLI, and STI.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86emul: conditionally clear BNDn for branches
Jan Beulich [Fri, 13 Jan 2017 14:24:45 +0000 (15:24 +0100)]
x86emul: conditionally clear BNDn for branches

Considering that we surface MPX to HVM guests, instructions we emulate
should also correctly deal with MPX state. While for now BND*
instructions don't get emulated, the effect of branches (which we do
emulate) without BND prefix should be taken care of.

No need to alter XABORT behavior: While not mentioned in the SDM so
far, this restores BNDn as they were at the XBEGIN, and since we make
XBEGIN abort right away, XABORT in the emulator is only a no-op.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86/cpuid: Move the legacy cpuids array into struct cpuid_policy
Andrew Cooper [Wed, 4 Jan 2017 13:31:53 +0000 (13:31 +0000)]
x86/cpuid: Move the legacy cpuids array into struct cpuid_policy

This hides the legacy details inside the cpuid subsystem, where they will
eventually be dropped entirely.

While altering the line containing paging_initialised, change its type to bool
to match its use.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Effectively remove domain_cpuid()
Andrew Cooper [Wed, 4 Jan 2017 12:46:09 +0000 (12:46 +0000)]
x86/cpuid: Effectively remove domain_cpuid()

The only callers of domain_cpuid() are the legacy cpuid path via
{pv,hvm}_cpuid().  Move domain_cpuid() to being private in cpuid.c, with an
adjusted API to use struct cpuid_leaf rather than individual pointers.

The ITSC clobbering logic is dropped.  It is no longer necessary now that the
logic has moved into recalculate_cpuid_policy()

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Store the toolstacks choice of hypervisor max leaf
Andrew Cooper [Wed, 4 Jan 2017 12:20:51 +0000 (12:20 +0000)]
x86/cpuid: Store the toolstacks choice of hypervisor max leaf

This removes all dependencies on the legacy cpuids[] array from
cpuid_hypervisor_leaves().  Swap a BUG() to an ASSERT_UNREACHABLE(), because
in the unlikely case that we hit it, returning all zeros to the guest is fine.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/domctl: Move all CPUID update logic into update_domain_cpuid_info()
Andrew Cooper [Wed, 4 Jan 2017 12:43:57 +0000 (12:43 +0000)]
x86/domctl: Move all CPUID update logic into update_domain_cpuid_info()

This simplifies the XEN_DOMCTL_set_cpuid handling, splitting the safety logic
away from the internals of how an update is completed.

The legacy cpuids[] logic is left alone in a function, as it won't survive
very long.  update_domain_cpuid_info() gains a small performance optimisation
to skip all update activities for leaves which won't have any impact on the
guest.  This is temporary until the new hypercall API is completed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Fix feature flags reported to dom0
Andrew Cooper [Thu, 12 Jan 2017 16:14:56 +0000 (16:14 +0000)]
x86/cpuid: Fix feature flags reported to dom0

c/s a11e8c9 "x86/pv: Use per-domain policy information in pv_cpuid()" switched
PV domains from using a (hardware for dom0, toolstack-chosen for domU) value
masked against pv_featureset[], to actually using the value calculated by
recalculate_cpuid_policy().

For domU, this is no practical change as the content is still chosen by the
toolstack.  For dom0 however, we no longer have two sources of information
potentially clearing bits.  Modern Linux seems to care about having CMP_LEGACY
set in its view of CPUID on an Intel box.

The deliberate setting of HTT, X2APIC and CMP_LEGACY in {pv,hvm}_featureset[]
is necessary for domUs, as the toolstack may have (tried to) set up topology
information in a different representation than the hardware uses.  The bits
therefore needed to be set in the masks used in the older logic, to avoid
clobbering the toolstack's information.

Move the HTT/X2APIC/CMP_LEGACY logic from calculate_{pv,hvm}_max_policy()
(where the meaning of {pv,hvm}_featureset[] has changed subtly) to
recalculate_cpuid_policy() where the masking logic now lives.

This will cause {pv,hvm}_max_policy to actually contain real hardware values
(so dom0 sees real hardware values), but still allows the toolstack to set
bits not present in real hardware for domUs.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/sysctl: Fix NULL pointer dereference in error path
Andrew Cooper [Wed, 11 Jan 2017 17:51:44 +0000 (17:51 +0000)]
x86/sysctl: Fix NULL pointer dereference in error path

This was introduced by c/s c38869e711 "x86/cpuid: Drop the temporary linear
feature bitmap from struct cpuid_policy", and caught by Coverity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago tools: don't remove tdb data base file before starting xenstored
Juergen Gross [Tue, 10 Jan 2017 16:13:39 +0000 (17:13 +0100)]
tools: don't remove tdb data base file before starting xenstored

As xenstored now always starts with an empty tdb data base, there
is no longer any need to remove the file before starting xenstored.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago tools/xenstore: start with empty data base
Juergen Gross [Tue, 10 Jan 2017 16:13:38 +0000 (17:13 +0100)]
tools/xenstore: start with empty data base

Today xenstored tries to open a tdb data base file on disk when it is
started. As this is problematic in most cases the scripts used to start
xenstored ensure xenstored won't find such a file in order to start
with an empty xenstore.

A tdb data base file can't be used to restore all Xenstore state as
e.g. Xenstore watches are not kept in the tdb data base. The file is
meant to be used for debugging purposes after a xenstored crash only.

Instead of opening a Xenstore data base file found on disk always start
with an empty data base. This will avoid problems in case someone is
testing multiple xenstored versions without rebooting (which is not
supported but helps debugging in some cases).

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago docs: add Xen PV Drivers Lifecycle
Stefano Stabellini [Fri, 13 Jan 2017 01:47:14 +0000 (17:47 -0800)]
docs: add Xen PV Drivers Lifecycle

Add a document that details the lifecycle of new PV drivers.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago tools/libxc: Fix the reported max_leaf values for PV guests
Andrew Cooper [Mon, 9 Jan 2017 13:17:01 +0000 (13:17 +0000)]
tools/libxc: Fix the reported max_leaf values for PV guests

When iterating through CPUID leaves to generate a policy, libxc will clip
itself at the hardcoded maxima, meaning that no data outside of the hardcoded
maxima are provided to Xen (in turn, causing Xen to return zeros if these
leaves are requested.)

The HVM code also clips the max_leaf data reported to the guest, but the PV
side didn't.

This results in a PV guest using the emulated CPUID, or via Xen using CPUID
faulting, to observe a max_leaf higher than the toolstack wants, although with
zeros being returned in the intervening leaves.

Fix the PV side to behave like the HVM side, and clip the max_leaf values in
leaf 0 and 0x80000000.
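
The clipping itself amounts to a minimum against the toolstack's limit.  A
minimal sketch, assuming hypothetical names and illustrative limit values (the
real maxima are whatever libxc hardcodes):

```c
#include <stdint.h>

/* Illustrative limits only; not necessarily the values libxc uses. */
#define EXAMPLE_MAX_BASIC 0x0000000du
#define EXAMPLE_MAX_EXTD  0x80000008u

/* Hypothetical helper: never report a max_leaf above the given limit. */
static uint32_t clip_max_leaf(uint32_t reported, uint32_t limit)
{
    return reported < limit ? reported : limit;
}
```

Applied to EAX of leaf 0 and of leaf 0x80000000, the guest then never sees a
max_leaf beyond the range for which data was actually provided.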

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago x86emul: correct EFLAGS.TF handling
Jan Beulich [Wed, 11 Jan 2017 12:43:04 +0000 (13:43 +0100)]
x86emul: correct EFLAGS.TF handling

For repeated string instructions we should not emulate multiple
iterations in one go when a single step trap needs injecting (which
needs to happen after every iteration).

For all non-branch instructions as well as not taken conditional
branches we additionally need to take DebugCtl.BTF into consideration.

For mov-to/pop-into %ss there should be no #DB at all (EFLAGS.TF
remaining set means there'll be #DB after the next instruction).

Additionally retire.sti should remain clear when retire.singlestep gets
set to true.
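
For the repeated string instruction case, the restriction can be pictured as
limiting the batch size to one iteration whenever single-stepping is active.
The helper name below is hypothetical; only the position of EFLAGS.TF (bit 8)
is architectural.

```c
#define EFLAGS_TF (1ul << 8)

/*
 * Hypothetical helper: how many iterations of a REP-prefixed string
 * instruction may be emulated in one pass.
 */
static unsigned long rep_batch(unsigned long remaining, unsigned long eflags)
{
    /* A trap is due after every iteration, so do not batch under TF. */
    if (remaining > 1 && (eflags & EFLAGS_TF))
        return 1;

    return remaining;
}
```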

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86emul: support CLWB
Jan Beulich [Wed, 11 Jan 2017 12:41:45 +0000 (13:41 +0100)]
x86emul: support CLWB

Just like for CLFLUSH{,OPT} back it by the wbinvd() hook for now.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86/HVM: restrict permitted instructions during special purpose emulation
Jan Beulich [Wed, 11 Jan 2017 12:40:49 +0000 (13:40 +0100)]
x86/HVM: restrict permitted instructions during special purpose emulation

Most invocations of the instruction emulator are for VM exits where the
set of legitimate instructions (i.e. ones capable of causing the
respective exit) is rather small. Restrict the permitted sets via a new
callback, at once eliminating the abuse of handle_mmio() for non-MMIO
operations.

A seemingly unrelated comment adjustment is being done here to keep
x86_emulate() in sync with x86_insn_is_mem_write() (in the context of
which this was found to be wrong).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
8 years ago x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Alter the legacy-path prototypes to match guest_cpuid()

This allows the compiler to have a far easier time inlining the legacy paths
into guest_cpuid(), and avoids the need to have a full struct cpu_user_regs in
the guest_cpuid() stack frame.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
8 years ago x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Effectively remove pv_cpuid() and hvm_cpuid()

All callers of pv_cpuid() and hvm_cpuid() (other than guest_cpuid() legacy
path) have been removed from the codebase.  Move them into cpuid.c to avoid
any further use, leaving guest_cpuid() as the sole API to use.

This is purely code motion, with no functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/svm: Use guest_cpuid() rather than hvm_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/svm: Use guest_cpuid() rather than hvm_cpuid()

More work is required before LWP details can be read straight out of the
cpuid_policy block, but in the meantime hvm_cpuid() wants to disappear so
update the code to use the newer interface.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/hvm: Use guest_cpuid() rather than hvm_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/hvm: Use guest_cpuid() rather than hvm_cpuid()

More work is required before maxphysaddr can be read straight out of the
cpuid_policy block, but in the meantime hvm_cpuid() wants to disappear so
update the code to use the newer interface.

Use the behaviour of max_leaf handling (returning all zeros) to avoid a double
call into guest_cpuid().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Move all leaf 7 handling into guest_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Move all leaf 7 handling into guest_cpuid()

All per-domain policy data concerning leaf 7 is accurate.  Handle it all in
guest_cpuid() by reading out of the raw array block, and introducing a dynamic
adjustment for OSPKE.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Perform max_leaf calculations in guest_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Perform max_leaf calculations in guest_cpuid()

Clamp the toolstack-provided max_leaf values in recalculate_cpuid_policy(),
causing the per-domain policy to have guest-accurate data.

Have guest_cpuid() exit early if a requested leaf is out of range, rather than
falling into the legacy path.
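
The early exit can be sketched as follows.  The types and names here are
hypothetical simplifications (a single array of basic leaves, no subleaves),
not the real struct cpuid_policy layout.

```c
#include <stdint.h>
#include <string.h>

#define NR_BASIC_LEAVES 8u        /* illustrative array size */

struct leaf {
    uint32_t a, b, c, d;
};

/* Hypothetical, simplified policy object. */
struct policy {
    uint32_t max_leaf;            /* expected to be below NR_BASIC_LEAVES */
    struct leaf basic[NR_BASIC_LEAVES];
};

/* Zero the result first, then return early for out-of-range leaves. */
static void lookup_basic(const struct policy *p, uint32_t leaf,
                         struct leaf *res)
{
    memset(res, 0, sizeof(*res));

    if (leaf > p->max_leaf || leaf >= NR_BASIC_LEAVES)
        return;

    *res = p->basic[leaf];
}
```

Note the comparison is "leaf > max_leaf": max_leaf itself is a valid leaf, so
an off-by-one in either direction changes what the guest can read.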

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Calculate appropriate max_leaf values for the global policies
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Calculate appropriate max_leaf values for the global policies

Derive host_policy from raw_policy, and {pv,hvm}_max_policy from host_policy.
Clamp the raw values to the maximum we will offer to guests.

This simplifies the PV and HVM policy calculations, removing the need for an
intermediate linear host_featureset bitmap.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Drop the temporary linear feature bitmap from struct cpuid_policy

With most uses of the *_featureset API removed, the remaining uses are only
during XEN_SYSCTL_get_cpu_featureset, init_guest_cpuid(), and
recalculate_cpuid_policy(), none of which are hot paths.

Drop the temporary infrastructure, and have the current users recreate the
linear bitmap using cpuid_policy_to_featureset().  This avoids storing
duplicated information in struct cpuid_policy.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/hvm: Use per-domain policy information in hvm_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/hvm: Use per-domain policy information in hvm_cpuid()

... rather than performing runtime adjustments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/pv: Use per-domain policy information in pv_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/pv: Use per-domain policy information in pv_cpuid()

... rather than performing runtime adjustments.  This is safe now that
recalculate_cpuid_policy() performs suitable sanitisation when the policy data
is loaded.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/pv: Use per-domain policy information when calculating the cpumasks
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/pv: Use per-domain policy information when calculating the cpumasks

... rather than dynamically clamping against the PV maximum policy.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/svm: Improvements using named features
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/svm: Improvements using named features

This avoids calling into hvm_cpuid() to obtain information which is directly
available.  In particular, this avoids the need to overload flag_dr_dirty
because of hvm_cpuid() being unavailable in svm_save_dr().

flag_dr_dirty is returned to a boolean (as it was before c/s c097f549 which
introduced the need to overload it).  While returning it to type bool, remove
the use of bool_t for the adjacent fields.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years ago x86/hvm: Improve CPUID and MSR handling using named features
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/hvm: Improve CPUID and MSR handling using named features

This avoids hvm_cpuid() recursing into itself, and the MSR paths using
hvm_cpuid() to obtain information which is directly available.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/pv: Improve pv_cpuid() using named features
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/pv: Improve pv_cpuid() using named features

This avoids referring back to domain_cpuid() or native CPUID to obtain
information which is directly available.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/vvmx: Use hvm_cr4_guest_valid_bits() to calculate MSR_IA32_VMX_CR4_FIXED1

Reuse the logic in hvm_cr4_guest_valid_bits() instead of duplicating it.

This fixes a bug to do with the handling of X86_CR4_PCE.  The RDPMC
instruction predates the architectural performance feature, and has been around
since the P6.  X86_CR4_PCE is like X86_CR4_TSD and only controls whether RDPMC
is available at cpl!=0, not whether RDPMC is generally unavailable.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/hvm: Improve CR4 verification using named features
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/hvm: Improve CR4 verification using named features

Alter the function to return the valid CR4 bits, rather than the invalid CR4
bits.  This will allow reuse in other areas of code.
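
Returning the valid mask makes the result usable both for validating a write
and for reporting which bits a guest may set.  A minimal sketch, assuming a
hypothetical two-field feature structure and covering only three CR4 bits (the
bit positions are architectural, everything else is illustrative):

```c
#include <stdbool.h>

#define CR4_PCE  (1ul << 8)
#define CR4_SMEP (1ul << 20)
#define CR4_SMAP (1ul << 21)

/* Hypothetical stand-in for the relevant named policy features. */
struct features {
    bool smep, smap;
};

/* Build the mask of bits the guest is allowed to set. */
static unsigned long cr4_valid_bits(const struct features *f)
{
    unsigned long valid = CR4_PCE;   /* not dependent on a feature here */

    if (f->smep)
        valid |= CR4_SMEP;
    if (f->smap)
        valid |= CR4_SMAP;

    return valid;
}

/* A write is acceptable if it sets nothing outside the valid mask. */
static bool cr4_value_ok(unsigned long value, unsigned long valid)
{
    return !(value & ~valid);
}
```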

Pick the appropriate cpuid_policy object rather than using hvm_cpuid() or
boot_cpu_data.  This breaks the dependency on current.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/hvm: Improve hvm_efer_valid() using named features
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/hvm: Improve hvm_efer_valid() using named features

Pick the appropriate cpuid_policy object rather than using hvm_cpuid() or
boot_cpu_data.  This breaks the dependency on current.

As data is read straight out of cpuid_policy, there is no need to work around
the fact that X86_FEATURE_SYSCALL might be clear because of the dynamic
adjustment in hvm_cpuid().  This simplifies the SCE handling, as EFER.SCE can
be set in isolation in 32bit mode on Intel hardware.

Alter nestedhvm_enabled() to be const-correct, allowing hvm_efer_valid() to be
properly const-correct.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Introduce named feature bitfields
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Introduce named feature bitfields

It greatly aids the readability of code to express feature checks with their
direct name (e.g. p->basic.mtrr or p->extd.lm), rather than by a field and a
bitmask.  gen-cpuid.py is augmented to calculate a suitable declaration to
live in a union with the underlying feature word.

gen-cpuid.py doesn't know Xen's choice of naming for the feature word indices
(and arguably shouldn't care), so provides the declarations in terms of their
numeric feature word index.  The DECL_BITFIELD() macro (local to cpuid_policy)
takes a feature word index name and chooses the right declaration, to aid
clarity.

All X86_FEATURE_*'s are included in the naming, other than the features
fast-forwarded from other state (APIC, OSXSAVE, OSPKE), whose value cannot be
read out of the feature word.
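
The shape of such a union can be sketched by hand for the low bits of leaf 1
EDX.  This is an illustration, not generated output: the union and function
names are hypothetical, it relies on C11 anonymous structs and on the
compiler allocating bitfields from the least significant bit (as GCC and Clang
do on x86), and APIC is left unnamed in line with the exclusion described
above.

```c
#include <stdint.h>

/* Hypothetical feature word: raw value overlaid with named bits. */
union basic_1d {
    uint32_t raw;
    struct {
        uint32_t fpu:1, vme:1, de:1, pse:1, tsc:1, msr:1, pae:1, mce:1,
                 cx8:1,
                 :2,          /* bit 9 (APIC, not named) and reserved bit 10 */
                 sep:1, mtrr:1;
    };
};

/* Feature check by field and bitmask. */
static unsigned int has_mtrr_mask(uint32_t edx)
{
    return !!(edx & (1u << 12));
}

/* The same check by name. */
static unsigned int has_mtrr_named(uint32_t edx)
{
    union basic_1d w = { .raw = edx };

    return w.mtrr;
}
```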

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpuid: Dispatch cpuid_hypervisor_leaves() from guest_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Dispatch cpuid_hypervisor_leaves() from guest_cpuid()

... rather than from the legacy path.  Update the API to match guest_cpuid(),
and remove its dependence on current.

Make use of guest_cpuid() unconditionally zeroing res to avoid repeated
re-zeroing.  To use a const struct domain, domain_cpuid() needs to be
const-corrected.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid()
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/hvm: Dispatch cpuid_viridian_leaves() from guest_cpuid()

... rather than from the legacy path.  Update the API to match guest_cpuid(),
and remove its dependence on current.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpuid: Recalculate a domains CPUID policy when appropriate
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Recalculate a domains CPUID policy when appropriate

Introduce recalculate_cpuid_policy() which clamps a CPUID policy based on the
domain's current restrictions.

Each adjustment introduced here mirrors what currently happens in
{pv,hvm}_cpuid(), although some logic is expressed differently.

 * When clearing X86_FEATURE_LM for 32bit PV guests, sanitise_featureset()
   takes out all 64bit-dependent features in one go.

 * The toolstack's choice of X86_FEATURE_ITSC is (by default) clobbered in
   domain_cpuid(), but {pv,hvm}_cpuid() needed to account for the host ITSC
   value when masking the toolstack value.

This now requires sanitise_featureset(), lookup_deep_deps() and associated
data to be available at runtime, so they move out of __init.

Recalculate the cpuid policy when:

 * The domain is first created
 * Switching a PV guest to being compat
 * Setting disable_migrate or vTSC modes
 * The toolstack sets new policy data

The disable_migrate code was previously common.  To compensate, move the code
to each arch's arch_do_domctl(), as the implementations now differ.

From this point on, domains have full and correct feature-leaf information in
their CPUID policies, allowing for substantial cleanup and improvements.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agox86/cpuid: Allocate a CPUID policy for every domain
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Allocate a CPUID policy for every domain

Introduce init_domain_cpuid_policy() to allocate an appropriate cpuid policy
for the domain (currently the domain's maximum applicable policy), and call it
during domain construction.

init_guest_cpuid() now needs calling before dom0 is constructed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpuid: Move featuresets into struct cpuid_policy
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Move featuresets into struct cpuid_policy

Featuresets will eventually live only once in a struct cpuid_policy, but lots
of code currently uses the global featuresets as a linear bitmap.  Remove the
existing global *_featureset bitmaps, replacing them with *_policy objects
containing named featureset words and a fs[] linear bitmap.

Two new helpers are introduced to scatter/gather a linear featureset bitmap
to/from the fixed word locations in struct cpuid_policy.
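
The scatter/gather idea can be sketched as below.  This is only an
illustration: struct example_policy, its three named words, EXAMPLE_FS_WORDS
and both function names are hypothetical, and the real featureset has more
words than shown here.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_FS_WORDS 3   /* hypothetical length of the linear bitmap */

/* Hypothetical policy object with fixed, named word locations. */
struct example_policy {
    uint32_t basic_1d;
    uint32_t basic_1c;
    uint32_t extd_e1d;
};

/* Scatter: copy each word of the linear bitmap to its named location. */
void example_fs_to_policy(const uint32_t fs[EXAMPLE_FS_WORDS],
                          struct example_policy *p)
{
    p->basic_1d = fs[0];
    p->basic_1c = fs[1];
    p->extd_e1d = fs[2];
}

/* Gather: collect the named words back into a linear bitmap. */
void example_policy_to_fs(const struct example_policy *p,
                          uint32_t fs[EXAMPLE_FS_WORDS])
{
    fs[0] = p->basic_1d;
    fs[1] = p->basic_1c;
    fs[2] = p->extd_e1d;
}
```

Code that still wants a linear bitmap can gather, operate on the bitmap, and
scatter the result back, which is the pattern the message describes for the
existing max-policy calculations.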

The existing calculate_raw_policy() already obtains the scattered raw
featureset.  Gather the raw featureset into raw_policy.fs in
calculate_raw_policy() and drop calculate_raw_featureset() entirely.

Now that host_featureset can't be a straight define of
boot_cpu_data.x86_capability, introduce calculate_host_policy() to suitably
fill in host_policy from boot_cpu_data.x86_capability.  (Future changes will
have additional sanitization logic in this function.)

The PV and HVM policy objects and calculation functions have max introduced to
their names, as there will eventually be a distinction between max and default
policies for each domain type.  The existing logic works in terms of linear
bitmaps, so scatter the result back into the policy objects.

Leave some compatibility defines providing the old *_featureset API.  This
results in no observed change in the *_featureset values, which are still used
at the hypercall and guest_cpuid() interfaces.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpuid: Introduce struct cpuid_policy
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Introduce struct cpuid_policy

struct cpuid_policy will eventually be a complete replacement for the cpuids[]
array, with a fixed layout and named fields to allow O(1) access to specific
information.

For now, the CPUID content is capped at the 0xd and 0x8000001c leaves, which
matches the maximum policy that the toolstack will generate for a domain.  The
xstate leaves extend up to LWP, and the structured features leaf is
implemented with subleaf properties (in anticipation of subleaf 1 appearing
soon), although only subleaf 0 is currently implemented.

Introduce calculate_raw_policy() which fills raw_policy with information,
making use of the new helpers, cpuid_{,count_}leaf().

Finally, rename calculate_featuresets() to init_guest_cpuid(), as it is going
to perform rather more work.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf
Andrew Cooper [Wed, 11 Jan 2017 11:59:02 +0000 (11:59 +0000)]
x86/cpuid: Introduce guest_cpuid() and struct cpuid_leaf

Long term, pv_cpuid() and hvm_cpuid() will be merged into a single
guest_cpuid(), which is also capable of working outside of current context.

To aid this transition, introduce guest_cpuid() with the intended API, which
simply defers back to pv_cpuid() or hvm_cpuid() as appropriate.

Introduce struct cpuid_leaf which is used to represent the results of a CPUID
query in a more efficient manner than passing four pointers through the
calltree.
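
A minimal sketch of the difference between the two calling styles follows.
The names example_cpuid_leaf, example_query_ptrs() and example_query_leaf()
are hypothetical, and the returned values are made-up placeholder data rather
than anything a real policy contains.

```c
#include <assert.h>
#include <stdint.h>

/* One object carrying all four result registers. */
struct example_cpuid_leaf {
    uint32_t a, b, c, d;
};

/* Old style: four output pointers threaded through every caller. */
void example_query_ptrs(uint32_t leaf, uint32_t *eax, uint32_t *ebx,
                        uint32_t *ecx, uint32_t *edx)
{
    /* Placeholder data: pretend leaf 0 reports a maximum leaf of 0xd. */
    *eax = (leaf == 0) ? 0xd : 0;
    *ebx = *ecx = *edx = 0;
}

/* New style: a single result pointer, zeroed up front. */
void example_query_leaf(uint32_t leaf, struct example_cpuid_leaf *res)
{
    *res = (struct example_cpuid_leaf){ 0 };

    /* As in the transition described, defer to the older helper. */
    example_query_ptrs(leaf, &res->a, &res->b, &res->c, &res->d);
}
```

Passing one pointer keeps intermediate function signatures short and lets the
result be zeroed in a single place.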

Update all codepaths which should use the new guest_cpuid() API.  These are
the codepaths which have variable inputs, and (other than some specific
x86_emulate() cases) all pertain to servicing a CPUID instruction from a
guest.

The other codepaths using {pv,hvm}_cpuid() with fixed inputs will later be
adjusted to read their data straight from the policy block.

No intended functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/HVM: Fix teardown ordering in hvm_vcpu_destroy()
Suravee Suthikulpanit [Tue, 10 Jan 2017 14:03:02 +0000 (08:03 -0600)]
x86/HVM: Fix teardown ordering in hvm_vcpu_destroy()

The order of destroy function calls in hvm_vcpu_destroy() should be
the reverse of init calls in hvm_vcpu_initialise().
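
The ordering rule can be illustrated with a small, generic C sketch.  The
three steps a, b and c and every function name here are hypothetical
placeholders, not the actual calls made by hvm_vcpu_initialise() or
hvm_vcpu_destroy().

```c
#include <assert.h>

/* Records call order: positive values are inits, negative are destroys. */
int example_log[8];
int example_log_len;

static void record(int step)
{
    assert(example_log_len < 8);
    example_log[example_log_len++] = step;
}

static int init_a(void) { record(1); return 0; }
static int init_b(void) { record(2); return 0; }
static int init_c(void) { record(3); return 0; }

static void destroy_a(void) { record(-1); }
static void destroy_b(void) { record(-2); }
static void destroy_c(void) { record(-3); }

/* Initialise in order a, b, c, unwinding in reverse on failure. */
int example_initialise(void)
{
    int rc;

    if ((rc = init_a()) != 0)
        goto fail_a;
    if ((rc = init_b()) != 0)
        goto fail_b;
    if ((rc = init_c()) != 0)
        goto fail_c;
    return 0;

 fail_c:
    destroy_b();
 fail_b:
    destroy_a();
 fail_a:
    return rc;
}

/* Tear down in exactly the reverse order: c, b, a. */
void example_destroy(void)
{
    destroy_c();
    destroy_b();
    destroy_a();
}
```

Keeping the destroy path a mirror image of the init path means a later step
never outlives something it depends on.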

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
[ Fix up tasklet_kill() position ]
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/emul: Replace opencoded extraction of IOPL from eflags
Andrew Cooper [Fri, 6 Jan 2017 20:03:08 +0000 (20:03 +0000)]
x86/emul: Replace opencoded extraction of IOPL from eflags

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agoxenstore: bump TDB_VERSION
Jan Beulich [Tue, 10 Jan 2017 10:46:59 +0000 (10:46 +0000)]
xenstore: bump TDB_VERSION

Commit 9e49dcf67f ("xenstore: add per-node generation counter") changed
the TDB layout, which - in order to not break older xenstored running
on the same system - needs to be accompanied by a version bump.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoget_maintainer.pl: Teach brace expansion
Anthony PERARD [Mon, 9 Jan 2017 15:22:32 +0000 (15:22 +0000)]
get_maintainer.pl: Teach brace expansion

Simpler non-nested brace expansion.

Some entries in MAINTAINERS are not understood by the script: the
ones that contain {,}. This patch fixes that.

This converts brace expansion style use in MAINTAINERS into a regex
that get_maintainer.pl can use to match a path against a maintainer
section.

It is done using two different regexes. The first takes care of
converting ',' inside '{}' to a '|', one by one, as long as there are at
least two commas. The second does the final conversion of '{,}'
to '(|)'.
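
The effect of the conversion can be sketched in C as below.  Note that this
is not the Perl from the patch: instead of two regexes it uses a single
character scan, and the function name example_braces_to_regex() is
hypothetical.  Like the patch, it handles only non-nested braces.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Rewrite non-nested brace expansion as regex alternation:
 * '{' becomes '(', a matching '}' becomes ')', and ',' becomes '|'
 * only while inside braces.  Returns 0 on success, or -1 if 'out'
 * is too small to hold the result and its terminator.
 */
int example_braces_to_regex(const char *pattern, char *out, size_t out_len)
{
    int in_braces = 0;
    size_t i;

    for (i = 0; pattern[i] != '\0'; i++) {
        char c = pattern[i];

        /* Need room for this character plus the terminator. */
        if (i + 1 >= out_len)
            return -1;

        if (c == '{') {
            in_braces = 1;
            c = '(';
        } else if (c == '}' && in_braces) {
            in_braces = 0;
            c = ')';
        } else if (c == ',' && in_braces) {
            c = '|';
        }

        out[i] = c;
    }

    if (i >= out_len)
        return -1;
    out[i] = '\0';
    return 0;
}
```

Commas outside braces are left alone, so only the alternatives inside '{}'
are turned into a regex group.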

With the patch, the right maintainers are displayed, instead of "THE
REST" maintainers, when using, for example, the following command:
$ ./scripts/get_maintainer.pl -f docs/misc/kconfig.txt

The patch also gets rid of the warnings with recent perl:
Unescaped left brace in regex is deprecated, passed through in regex; marked by <-- HERE in m/^docs/misc/kconfig{ <-- HERE ,-language}\.txt/ at ./scripts/get_maintainer.pl line 731.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Tested-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agolibxl/xc_kexec.c: convert tabs into spaces; preserving indentation
Eric DeVolder [Mon, 9 Jan 2017 15:42:41 +0000 (07:42 -0800)]
libxl/xc_kexec.c: convert tabs into spaces; preserving indentation

Convert tabs into spaces; preserving indentation

No functional changes

Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/cpuid: Add AVX512_VPOPCNTDQ support
He Chen [Tue, 10 Jan 2017 09:19:54 +0000 (17:19 +0800)]
x86/cpuid: Add AVX512_VPOPCNTDQ support

AVX512_VPOPCNTDQ: Vector POPCNT instructions for dwords and qwords
(variable precision).

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: convert tscmode.txt into man page
Cédric Bosdonnat [Fri, 9 Dec 2016 16:07:31 +0000 (17:07 +0100)]
docs: convert tscmode.txt into man page

tscmode.txt is referenced in xl.cfg(5). Convert it into a pod
formatted man page.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: move pci-device-reservations from misc to man
Cédric Bosdonnat [Fri, 9 Dec 2016 15:49:31 +0000 (16:49 +0100)]
docs: move pci-device-reservations from misc to man

pci-device-reservations is referenced in xl.cfg(5); convert it into a man
page in pod format. The name is now prefixed with 'xen-' to avoid
possible name conflicts.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: convert misc/channel.txt into xen-pv-channel man page
Cédric Bosdonnat [Fri, 9 Dec 2016 15:38:06 +0000 (16:38 +0100)]
docs: convert misc/channel.txt into xen-pv-channel man page

channel.txt is referenced in xl.cfg(5). Move it to man pages, section 7.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: convert vtpmmgr into a pod man page
Cédric Bosdonnat [Fri, 9 Dec 2016 15:19:00 +0000 (16:19 +0100)]
docs: convert vtpmmgr into a pod man page

vtpmmgr.txt is referenced in a man page, convert it to a man page.
The man page is named xen-vtpmmgr to avoid any conflict with other
potential vtpm docs.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: move vtpm from misc to man
Cédric Bosdonnat [Fri, 9 Dec 2016 14:49:54 +0000 (15:49 +0100)]
docs: move vtpm from misc to man

vtpm.txt is referenced in xl.cfg man page. Convert it to pod,
move it to the man folder and update the reference. The man page
is named xen-vtpm to avoid any potential conflict with other
VTPM documentation.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: move xl-numa-placement.markdown to man7
Cédric Bosdonnat [Fri, 9 Dec 2016 13:59:08 +0000 (14:59 +0100)]
docs: move xl-numa-placement.markdown to man7

docs/misc/xl-numa-placement.markdown is referenced by xl.cfg.5 man page,
move it to a man page, section 7.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: move vbd-interface from misc to man
Cédric Bosdonnat [Fri, 9 Dec 2016 13:45:40 +0000 (14:45 +0100)]
docs: move vbd-interface from misc to man

Make vbd-interface a man page, section 7, as this document is
referenced in other man pages (xl-disk-configuration)

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: convert xl-disk-configuration into a man page
Cédric Bosdonnat [Fri, 9 Dec 2016 13:38:45 +0000 (14:38 +0100)]
docs: convert xl-disk-configuration into a man page

Convert xl-disk-configuration.txt from plain text file to a POD file
to get it as a man page. The references to it in the other man pages
are also updated.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: xl-network-configuration turns into a man
Cédric Bosdonnat [Fri, 9 Dec 2016 13:33:22 +0000 (14:33 +0100)]
docs: xl-network-configuration turns into a man

Move docs/misc/xl-network-configuration.markdown to docs/man and
update the references to it in the other man pages.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: add rules for man 7 section
Cédric Bosdonnat [Fri, 9 Dec 2016 13:57:35 +0000 (14:57 +0100)]
docs: add rules for man 7 section

Some of the docs/misc documents will need to go in man 7 section,
prepare docs/Makefile for it.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agodocs: allow writing man pages in markdown
Cédric Bosdonnat [Fri, 9 Dec 2016 13:25:53 +0000 (14:25 +0100)]
docs: allow writing man pages in markdown

Some of the docs/misc documents are written in markdown.
As part of an effort to clean up man pages, these documents will be
converted into man pages. To avoid some more conversion, add rules to
the docs/Makefile to generate man pages out of markdown files as well
as pod ones.

However, pandoc doesn't know how to convert man page links. Thus the
man links in markdown pages won't work.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxen/x86: Fix CONFIG_CRASH_DEBUG build following c/s 897129dea
Andrew Cooper [Fri, 6 Jan 2017 14:33:54 +0000 (14:33 +0000)]
xen/x86: Fix CONFIG_CRASH_DEBUG build following c/s 897129dea

Found by a Travis RANDCONFIG run.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
8 years agox86/domctl: Make XEN_DOMCTL_set_address_size singleshot
Andrew Cooper [Wed, 7 Dec 2016 17:48:27 +0000 (17:48 +0000)]
x86/domctl: Make XEN_DOMCTL_set_address_size singleshot

Toolstacks (including some out-of-tree ones) use XEN_DOMCTL_set_address_size
at most once per domain, and it ends up having a destructive effect on the
available CPUID policy for a domain.

To avoid ordering issues between altering the policy via domctl, and the
constructive effects which would have to happen from switching back to native,
explicitly reject this case.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86: fix build with older versions of GCC following e34bc403c3
Andrew Cooper [Fri, 6 Jan 2017 14:08:09 +0000 (15:08 +0100)]
x86: fix build with older versions of GCC following e34bc403c3

GCCs of at least 4.4 and earlier do not tolerate the initialisation of the
$VENDOR_cpu_dev structures, because of c_ident becoming an anonymous union.

Instead of using an anonymous union, reinterpret c_ident[] in its CPUID form
just in get_cpu_vendor().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86: use unambiguous register names
Jan Beulich [Fri, 6 Jan 2017 14:07:31 +0000 (15:07 +0100)]
x86: use unambiguous register names

Eliminate the mis-naming of 64-bit fields with 32-bit register names
(eflags instead of rflags etc). To ensure no piece of code was missed,
transiently use the underscore prefixed names only for 32-bit register
accesses. This will be cleaned up subsequently.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: drop cpu_has_sse{,2}
Jan Beulich [Fri, 6 Jan 2017 14:06:09 +0000 (15:06 +0100)]
x86: drop cpu_has_sse{,2}

Commit dc88221c97 ("x86: rename XMM* features to SSE*") pointlessly
added them - these features are always available on 64-bit CPUs. (Let's
not assume this for MMX though in at least the insn emulator.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support fencing insns
Jan Beulich [Fri, 6 Jan 2017 14:04:22 +0000 (15:04 +0100)]
x86emul: support fencing insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/mtrr: use stdbool instead of int + define
Doug Goldstein [Thu, 5 Jan 2017 16:26:09 +0000 (10:26 -0600)]
x86/mtrr: use stdbool instead of int + define

Instead of using an int and providing a define for TRUE and FALSE,
change the code to use stdbool that Xen provides.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Minor style tweaks]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agolibxl: Update xenstore on VCPU hotplug for all guest types
Boris Ostrovsky [Tue, 3 Jan 2017 14:04:12 +0000 (09:04 -0500)]
libxl: Update xenstore on VCPU hotplug for all guest types

Currently HVM guests that use upstream qemu do not update xenstore's
availability entry for VCPUs. While it is not strictly necessary for
hotplug to work, xenstore ends up not reflecting actual status of
VCPUs. We should fix this.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agobuild: use debug_symbols to add -g3
Wei Liu [Thu, 5 Jan 2017 16:36:51 +0000 (16:36 +0000)]
build: use debug_symbols to add -g3

While doing archeology I found 38ce7ce3; we should make sure
debug_symbols is responsible for adding "-g" to CFLAGS.

Move adding "-g3" from being guarded by debug to being guarded by
debug_symbols.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agobuild: move debug{,_symbols} to tools/Rules.mk
Wei Liu [Fri, 23 Dec 2016 12:24:16 +0000 (12:24 +0000)]
build: move debug{,_symbols} to tools/Rules.mk

31d41d7b tried to make debug affect tools build only but failed to take
care of debug_symbols (which appends "-g" to CFLAGS).

Move both to tools/Rules.mk at once in this patch.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agobuild: move setting LTO options to xen/Rules.mk
Wei Liu [Fri, 23 Dec 2016 12:12:36 +0000 (12:12 +0000)]
build: move setting LTO options to xen/Rules.mk

Having them in StdGNU.mk would affect both hypervisor and tools build.
However judging from the commit message of e4cdd74f LTO was only meant
to affect hypervisor build.

Move the relevant bits to xen/Rules.mk.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools/libxl: include scheduler parameters in the output of xl list -l
Roger Pau Monne [Thu, 5 Jan 2017 10:08:34 +0000 (10:08 +0000)]
tools/libxl: include scheduler parameters in the output of xl list -l

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Fatih Acar <fatih@gandi.net>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/pv: Defer I/O bitmap checks even in 64bit mode for emulate_privilege_op()
Andrew Cooper [Thu, 5 Jan 2017 11:41:50 +0000 (11:41 +0000)]
x86/pv: Defer I/O bitmap checks even in 64bit mode for emulate_privilege_op()

The I/O bitmap doesn't change function depending on mode.  64bit userspace
such as an X server still needs to enter guest_io_okay() to find that the PV
kernel did set up an appropriate virtual I/O bitmap to permit access.

While moving the check, alter its representation to be easier to read.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/pv: Fix determination of 64bit mode in emulate_privilege_op()
Andrew Cooper [Thu, 5 Jan 2017 11:23:15 +0000 (11:23 +0000)]
x86/pv: Fix determination of 64bit mode in emulate_privilege_op()

ctxt->addr_size is expressed in bits rather than bytes, and has the value 16,
32 or 64.  Comparing < 8 made the intended non-64bit paths dead.
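
The unit mix-up can be shown with a small C sketch.  The function names are
hypothetical, and the "fixed" comparison against 64 is an assumption about
the shape of the fix rather than a quote from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Buggy form: compares as if addr_size were a byte count.  With values
 * of 16, 32 or 64 (bits) this is never true, so the non-64bit paths
 * it guards are dead code.
 */
bool example_is_not_64bit_buggy(unsigned int addr_size_bits)
{
    return addr_size_bits < 8;
}

/* Assumed corrected form: compare against the width in bits. */
bool example_is_not_64bit_fixed(unsigned int addr_size_bits)
{
    return addr_size_bits < 64;
}
```

With the bit-based comparison, 16 and 32 select the non-64bit path and 64
does not.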

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/vvmx: Drop sreg_to_index[]
Andrew Cooper [Tue, 3 Jan 2017 11:55:54 +0000 (11:55 +0000)]
x86/vvmx: Drop sreg_to_index[]

Since c/s 0888d36b "x86/emul: Correct the decoding of SReg3 operands",
x86_seg_* have followed hardware encodings, meaning that this translation
table is now an identity transform.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/VMX: use unambiguous register names
Jan Beulich [Thu, 5 Jan 2017 10:11:19 +0000 (11:11 +0100)]
x86/VMX: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc). Use the
guaranteed 32-bit underscore prefixed names for now where appropriate.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/apicv: fix RTC periodic timer and apicv issue
Quan Xu [Thu, 5 Jan 2017 10:10:01 +0000 (11:10 +0100)]
x86/apicv: fix RTC periodic timer and apicv issue

When Xen apicv is enabled, wall clock time runs fast on a Windows7-32
guest under high load (with 2 vCPUs; xentrace shows that under high
load the count of IPI interrupts between these vCPUs increases
rapidly).

If an IPI interrupt (vector 0xe1) and a periodic timer interrupt (vector
0xd1) are both pending (index of bit set in vIRR), unfortunately the IPI
interrupt has higher priority than the periodic timer interrupt. Xen
updates the IPI interrupt bit set in vIRR to guest interrupt status
(RVI) as the highest priority, and apicv (Virtual-Interrupt Delivery)
delivers the IPI interrupt within VMX non-root operation without a
VM-Exit. Within VMX non-root operation, if the periodic timer interrupt
bit is set in vIRR and is the highest, apicv delivers the periodic timer
interrupt within VMX non-root operation as well.

But in the current code, if Xen doesn't update the periodic timer
interrupt bit set in vIRR to guest interrupt status (RVI) directly, Xen
is not aware of this case and does not decrease the count
(pending_intr_nr) of pending periodic timer interrupts, so Xen will
deliver a periodic timer interrupt again.

Also, because we update the periodic timer interrupt on every VM-entry,
there is a chance that an already-injected instance (before the
EOI-induced exit happens) will incur another pending IRR setting if a
VM-exit happens between virtual interrupt injection (vIRR->0, vISR->1)
and the EOI-induced exit (vISR->0), since pt_intr_post hasn't been
invoked yet. The guest then receives extra periodic timer interrupts.

So we set eoi_exit_bitmap for intack.vector, giving a chance to post
periodic timer interrupts when periodic timer interrupts become the
highest one.

Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
8 years agox86/cpuid: Untangle the <asm/cpufeature.h> include hierachy
Andrew Cooper [Thu, 8 Dec 2016 08:46:42 +0000 (08:46 +0000)]
x86/cpuid: Untangle the <asm/cpufeature.h> include hierachy

The use of X86_FEATURES_ONLY was short-lived in Linux for the same problem
encountered here.  The following series needs to add extra includes to
asm/cpuid.h, which breaks the build elsewhere given the current hierarchy.

Move the feature definitions into a separate header file, which also matches
the solution Linux used.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/svm: Replace opencoded 1GB superpage check
Andrew Cooper [Tue, 3 Jan 2017 17:46:58 +0000 (17:46 +0000)]
x86/svm: Replace opencoded 1GB superpage check

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agox86/mwait-idle: add Knights Mill CPUID
Piotr Luc [Wed, 4 Jan 2017 13:29:30 +0000 (14:29 +0100)]
x86/mwait-idle: add Knights Mill CPUID

Add Knights Mill (KNM) to the list of CPUIDs supported by mwait-idle.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
[Linux commit: a2c1bc645e87346150516b3abf1933ed29d0f48b]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/mwait-idle: add CPU model 0x4a (Atom Z34xx series)
Andy Shevchenko [Wed, 4 Jan 2017 13:29:08 +0000 (14:29 +0100)]
x86/mwait-idle: add CPU model 0x4a (Atom Z34xx series)

Add CPU ID for Atom Z34xx processors. Datasheets indicate support for this,
though detailed information about potential quirks or limitations is missing.
So we just reuse the definition from official BSP code.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
[Linux commit: 5e7ec268fd48d63cfd0e3a9be6c6443f01673bd4]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: use unambiguous register names
Jan Beulich [Wed, 4 Jan 2017 13:28:32 +0000 (14:28 +0100)]
x86emul: use unambiguous register names

This is in preparation of eliminating the mis-naming of 64-bit fields
with 32-bit register names (eflags instead of rflags etc).

Note that the result is not fully consistent until after at least one
more patch is in place, primarily to limit patch size (by trying to not
touch the same line twice).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: make _PRE_EFLAGS() tolerate first argument being 32-bit
Jan Beulich [Wed, 4 Jan 2017 13:28:02 +0000 (14:28 +0100)]
x86emul: make _PRE_EFLAGS() tolerate first argument being 32-bit

While this may appear to introduce a truncation issue, the high 32 bits
get zapped already anyway (early in _PRE_EFLAGS() as well as in
_POST_EFLAGS()). Once a subsequent patch switches to use proper 32-bit
EFLAGS operands, we'll in fact end up with more correct code, as that
zeroing of the upper halves will then go away.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>