xenbits.xensource.com Git - xen.git/log
8 years ago x86/Intel: Expose cpuid_faulting_enabled so it can be used elsewhere
Kyle Huey [Thu, 20 Oct 2016 13:44:27 +0000 (06:44 -0700)]
x86/Intel: Expose cpuid_faulting_enabled so it can be used elsewhere

While we're here, use bool instead of bool_t.

Signed-off-by: Kyle Huey <khuey@kylehuey.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago Config.mk: use non-debug build for 4.8
Wei Liu [Thu, 20 Oct 2016 13:00:47 +0000 (14:00 +0100)]
Config.mk: use non-debug build for 4.8

Set debug ?= n in preparation for late RCs and eventual release.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years ago x86/svm: Drop adjustment of X86_FEATURE_APIC
Andrew Cooper [Thu, 1 Sep 2016 09:38:27 +0000 (10:38 +0100)]
x86/svm: Drop adjustment of X86_FEATURE_APIC

The common hvm_cpuid() code already does this.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xen/sm{e, a}p: allow disabling sm{e, a}p for Xen itself
He Chen [Wed, 19 Oct 2016 08:03:24 +0000 (16:03 +0800)]
xen/sm{e, a}p: allow disabling sm{e, a}p for Xen itself

SMEP/SMAP are security features that prevent the kernel from
inadvertently executing/accessing user addresses; any such attempt leads
to a page fault.

Earlier code enabled SMEP/SMAP (in CR4) for both Xen and HVM guests.
With the SMEP/SMAP bits set in Xen's CR4, the checks are also enforced
for 32-bit PV guests, which then suffer unexpected SMEP/SMAP page faults
when the guest kernel attempts to access user addresses, even though
SMEP/SMAP is disabled for PV guests.

This patch introduces a new boot option value "hvm" for "sm{e,a}p". It
disables SMEP/SMAP for the Xen hypervisor itself while still enabling
them for HVM guests. This way, 32-bit PV guests no longer suffer the
SMEP/SMAP faults. Users can choose whether to enable SMEP/SMAP for Xen
itself, especially when they are going to run 32-bit PV guests.

Signed-off-by: He Chen <he.chen@linux.intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
[Fixed up command line docs]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
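
The tri-state option described in the commit above can be illustrated with a minimal C sketch of parsing a value such as "smep=hvm". The enum, the function names, and the accepted boolean spellings are assumptions made for this example; this is not Xen's actual command line parser.

```c
#include <string.h>

/* Hypothetical tri-state for an "sm{e,a}p"-style boot option. */
enum smxp_mode {
    SMXP_OFF,       /* disabled for Xen and for HVM guests (assumed) */
    SMXP_ON,        /* enabled for Xen and for HVM guests */
    SMXP_HVM_ONLY,  /* disabled for Xen itself, enabled for HVM guests */
    SMXP_INVALID    /* unrecognised value */
};

/* Parse the text that follows "smep=" or "smap=". */
enum smxp_mode parse_smxp(const char *s)
{
    if (!strcmp(s, "hvm"))
        return SMXP_HVM_ONLY;
    if (!strcmp(s, "1") || !strcmp(s, "on") || !strcmp(s, "true"))
        return SMXP_ON;
    if (!strcmp(s, "0") || !strcmp(s, "off") || !strcmp(s, "false"))
        return SMXP_OFF;
    return SMXP_INVALID;
}

/* Would the hypervisor set the CR4 bit for itself in this mode? */
int smxp_for_xen(enum smxp_mode m)
{
    return m == SMXP_ON;
}

/* Would HVM guests be allowed to use the feature in this mode? */
int smxp_for_hvm(enum smxp_mode m)
{
    return m == SMXP_ON || m == SMXP_HVM_ONLY;
}
```

With this sketch, "hvm" is the only value for which the two predicates disagree, which is the case the commit adds for hosts running 32-bit PV guests.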
8 years ago x86/vmx: Reduce the verbosity of the vmentry failure error reporting
Andrew Cooper [Thu, 13 Oct 2016 11:12:20 +0000 (12:12 +0100)]
x86/vmx: Reduce the verbosity of the vmentry failure error reporting

Identify the affected vcpu at the start of the message.  While tweaking this
area, add extra newlines between cases.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago x86/vmx: Print the problematic MSR if a vmentry fails
Andrew Cooper [Thu, 13 Oct 2016 10:46:58 +0000 (11:46 +0100)]
x86/vmx: Print the problematic MSR if a vmentry fails

Sample error looks like:

  (XEN) Failed vm entry (exit reason 0x80000022) caused by MSR loading (entry 13).
  (XEN)   msr 0000068a val 1fff800000102af0 (mbz 0)
  (XEN) ************* VMCS Area **************

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago libxl: remove explicit rule for libxl_arm_acpi.o (tag: 4.8.0-rc3)
Wei Liu [Tue, 18 Oct 2016 12:43:07 +0000 (13:43 +0100)]
libxl: remove explicit rule for libxl_arm_acpi.o

After 9c635883 ("ARM64: fix libxl build, do not include
../../xen/include") there is nothing special needed to build
libxl_arm_acpi.o. Remove the explicit rule and use the predefined one.

Build tested on ARM64.

Suggested-by: Steve Capper <steve.capper@linaro.org>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago ARM64: fix libxl build, do not include ../../xen/include
Stefano Stabellini [Tue, 18 Oct 2016 11:32:50 +0000 (12:32 +0100)]
ARM64: fix libxl build, do not include ../../xen/include

Do not include ../../xen/include/ to build libxl_arm_acpi.c: header
files clashing against default headers under /usr/include are present in
that directory.

Link only $(XEN_ROOT)/xen/include/acpi under tools/include instead.

Build tested on ARM64 and x86_64.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago tools/xl: Use %u for uint32_t domids
Ronald Rojas [Mon, 17 Oct 2016 00:16:32 +0000 (20:16 -0400)]
tools/xl: Use %u for uint32_t domids

domid is normally represented by uint32_t, but many format
strings in xl_cmdimpl.c use %d when printing, which is signed.
Use %u instead to print the unsigned integer domid.

Signed-off-by: Ronald Rojas <ronladred@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
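
To make the signed/unsigned mismatch in the commit above concrete, here is a small C sketch. The function names are invented for this example and do not appear in xl_cmdimpl.c. The sketch uses PRIu32, the strictly portable conversion for uint32_t; the commit itself uses %u, which is equivalent wherever uint32_t is unsigned int.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Old pattern: the domid is printed with a signed conversion. The cast is
 * written out here; values above INT_MAX would come out negative on typical
 * two's complement platforms.
 */
int format_domid_signed(char *buf, size_t len, uint32_t domid)
{
    return snprintf(buf, len, "%d", (int)domid);
}

/* New pattern: the conversion matches the unsigned 32-bit type. */
int format_domid_unsigned(char *buf, size_t len, uint32_t domid)
{
    return snprintf(buf, len, "%" PRIu32, domid);
}
```

For small domids both functions produce the same text; they only diverge for values that do not fit in a signed int.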
8 years ago libacpi: add back the "G" in "GNU" in licence header
Wei Liu [Fri, 14 Oct 2016 17:02:32 +0000 (18:02 +0100)]
libacpi: add back the "G" in "GNU" in licence header

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago libacpi: fix arm64 build
Wei Liu [Fri, 14 Oct 2016 17:02:30 +0000 (18:02 +0100)]
libacpi: fix arm64 build

The arm64 build for libacpi was broken for two reasons:

1. ACPI_BUILD_DIR was appended twice to dsdt_anycpu_arm.c.
2. The inclusion of firmware/Rules.mk overrode XEN_TARGET_ARCH, which
   made CONFIG_ARM disappear.

Fix those by:

1. Correctly generate full path for dsdt_anycpu_arm.c.
2. Include tools/Rules.mk instead, because libacpi/Makefile doesn't rely
   on settings in firmware/Rules.mk.

While at it, use CONFIG_ARM_64 instead of CONFIG_ARM as it is more
accurate.

Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago docs: RTDS feature document.
Dario Faggioli [Fri, 14 Oct 2016 10:02:25 +0000 (11:02 +0100)]
docs: RTDS feature document.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago docs: Credit2 feature document.
Dario Faggioli [Fri, 14 Oct 2016 10:01:40 +0000 (11:01 +0100)]
docs: Credit2 feature document.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago docs: Credit1 feature document.
Dario Faggioli [Fri, 14 Oct 2016 10:00:55 +0000 (11:00 +0100)]
docs: Credit1 feature document.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago x86/Viridian: don't depend on undefined register state
Jan Beulich [Fri, 14 Oct 2016 12:09:42 +0000 (14:09 +0200)]
x86/Viridian: don't depend on undefined register state

The high halves of all GPRs are undefined in 32-bit and compat modes,
and the dependency is being obfuscated by our structure field names not
matching architectural register names (it was actually while putting
together a patch to correct this when I noticed the issue here).

For consistency also use the architecturally correct names on the
output side.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
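
The point about undefined high halves in the commit above can be sketched in C as follows. The function name and the in_64bit_mode flag are assumptions for illustration only; the real change is expressed in terms of Xen's register structures.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Return the value of a guest general purpose register as an input
 * parameter. Outside 64-bit mode only the low 32 bits are architecturally
 * defined, so the high half is discarded rather than trusted.
 */
uint64_t guest_gpr_value(uint64_t raw, bool in_64bit_mode)
{
    return in_64bit_mode ? raw : (uint32_t)raw;
}
```

A caller in 32-bit or compat mode therefore never sees stale bits from the upper half of the saved register.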
8 years ago x86emul: fix pushing of selector registers
Jan Beulich [Fri, 14 Oct 2016 12:09:16 +0000 (14:09 +0200)]
x86emul: fix pushing of selector registers

Both explicit PUSH and far CALL currently push unrelated data (the
segment attributes word) in the high half (attributes and limit in the
64-bit case in the high 48 bits) instead of zero. To avoid having to
apply this and further changes in multiple places, also fold the two
(respectively) far call/jmp instances into one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
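
As a rough illustration of the bug class fixed by the commit above, the following C sketch contrasts reading too many bytes from a segment register structure with zero-extending only the 16-bit selector. The structure layout and function names are hypothetical and chosen for this example; they are not the emulator's actual types.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: the selector is followed by unrelated fields. */
struct seg_sketch {
    uint16_t sel;    /* the value that should be pushed */
    uint16_t attr;   /* segment attributes word */
    uint32_t limit;
};

/* Buggy pattern: copies 32 bits, so the attributes leak into the result. */
uint32_t push_value_overread(const struct seg_sketch *seg)
{
    uint32_t raw;

    memcpy(&raw, seg, sizeof(raw));
    return raw;
}

/* Fixed pattern: only the selector is used; the upper bits are zero. */
uint32_t push_value_fixed(const struct seg_sketch *seg)
{
    return seg->sel;
}
```

The fixed variant always yields a value whose upper half is zero, which is the behaviour the commit message says PUSH and far CALL should have.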
8 years ago x86emul: honor MXCSR.MM
Jan Beulich [Fri, 14 Oct 2016 12:08:29 +0000 (14:08 +0200)]
x86emul: honor MXCSR.MM

Commit 6dc9ac9f52 ("x86emul: check alignment of SSE and AVX memory
operands") didn't consider a specific AMD mode: Mis-alignment #GP
faults can be masked on some of their hardware.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
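
The alignment rule from the commit above can be written as a small predicate. This is a sketch under stated assumptions: the function and macro names are invented, the operand size is assumed to be a power of two, and the position of MXCSR.MM (bit 17) is taken from AMD's documentation of the misaligned SSE mode rather than from this log.

```c
#include <stdbool.h>
#include <stdint.h>

/* MXCSR.MM: AMD's misaligned exception mask, assumed here to be bit 17. */
#define MXCSR_MM_SKETCH (UINT32_C(1) << 17)

/*
 * Decide whether an alignment-checked SSE/AVX memory operand should raise
 * #GP. 'bytes' must be a power of two (for example 16 or 32).
 */
bool misaligned_sse_access_faults(uint64_t addr, unsigned int bytes,
                                  uint32_t mxcsr)
{
    bool misaligned = (addr & (bytes - 1)) != 0;

    return misaligned && !(mxcsr & MXCSR_MM_SKETCH);
}
```

With the mask bit clear, a misaligned operand faults as before; with it set, the access is allowed to proceed.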
8 years ago x86/hvm: Clobber %cs.L when LME becomes set
Andrew Cooper [Thu, 13 Oct 2016 12:16:47 +0000 (12:16 +0000)]
x86/hvm: Clobber %cs.L when LME becomes set

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/hvm: Correct the position of the %cs L/D checks
Andrew Cooper [Thu, 13 Oct 2016 10:27:28 +0000 (11:27 +0100)]
x86/hvm: Correct the position of the %cs L/D checks

Contrary to the description in the software manuals, in Long Mode, attempts to
load %cs check that D is not set in combination with L before the present flag
is checked.

This can be observed because the L/D check fails with #GP before the presence
check fails with #NP.

This change partially reverts c/s 78ff18c90 "x86: defer not-present segment
checks", taking it back to how it was in the v1 submission.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
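
The ordering described in the commit above (L/D check first, presence check second) can be sketched as below. The function name, the boolean parameters, and the use of -1 for "no fault" are assumptions for this example; all other descriptor checks are left out.

```c
#include <stdbool.h>

enum { EXC_NP_SKETCH = 11, EXC_GP_SKETCH = 13 };

/*
 * Return the exception vector raised by a %cs load, or -1 if neither of
 * the two checks modelled here fails.
 */
int load_cs_fault(bool long_mode, bool l, bool d, bool present)
{
    /* In Long Mode, L and D both set is rejected before presence. */
    if (long_mode && l && d)
        return EXC_GP_SKETCH;

    if (!present)
        return EXC_NP_SKETCH;

    return -1;
}
```

A not-present descriptor with both L and D set therefore reports #GP, which matches the observation in the commit message.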
8 years ago tools: check liblzma in configure for rombios
Wei Liu [Thu, 13 Oct 2016 11:03:17 +0000 (12:03 +0100)]
tools: check liblzma in configure for rombios

We upgraded ipxe in 38ab99b2 ("ipxe: update to newer commit"). That
version of ipxe requires liblzma to build.

Check that in configure and document this in README.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago x86emul: correct {,F}CMOV and F{,U}COMI{,P} emulation
Jan Beulich [Thu, 13 Oct 2016 11:07:25 +0000 (13:07 +0200)]
x86emul: correct {,F}CMOV and F{,U}COMI{,P} emulation

The FPU ones need to be executed with guest EFLAGS.{C,P,Z}F in context.

We also can't exclude someone wanting to hide the feature from (32-bit)
guests.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago keyhandler: rework process of nonirq keyhandler
Lan Tianyu [Thu, 13 Oct 2016 11:06:28 +0000 (13:06 +0200)]
keyhandler: rework process of nonirq keyhandler

A keyhandler may run for a long time in the serial port driver's
timer handler on a large machine with a lot of physical
CPUs (e.g. dump_timerq()) when the serial port driver works in
poll mode (via the exception mechanism).

If a timer handler runs for a long time, it blocks nmi_timer_fn()
from feeding the NMI watchdog and causes a Xen hypervisor panic. Inserting
process_pending_softirqs() in the timer handler will not help: when a timer
interrupt arrives, the timer subsystem calls all expired timer handlers
before programming the next timer interrupt, so no timer interrupt
arrives to trigger the timer softirq while a timer handler is running.

This patch fixes the issue by making nonirq keyhandlers run in
a tasklet when a debug key is received from the serial port.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago ipxe: update to newer commit
Wei Liu [Mon, 10 Oct 2016 12:50:58 +0000 (13:50 +0100)]
ipxe: update to newer commit

The current commit in tree is rather old. It has come to a point that
cherry-picking commits from upstream isn't trivial anymore.

There is a long-term plan to track ipxe upstream, but for 4.8 release, we
should just update ipxe to a newer commit (they are using rolling
release model now).

Forward-port the one boot prompt patch that is still relevant and retire
the rest which are already in upstream.

Reported-by: Juergen Schinker <ba1020@homie.homelinux.net>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xen/arm: Disable the Cortex-a53-edac
Edgar E. Iglesias [Thu, 6 Oct 2016 16:36:31 +0000 (18:36 +0200)]
xen/arm: Disable the Cortex-a53-edac

Disable the Cortex-a53-edac. Xen does not yet
handle reads/writes to the implementation defined CPUMERRSR
register.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago xen/trace: Fix trace metadata page count calculation (revert fbf96e6)
George Dunlap [Fri, 30 Sep 2016 14:42:56 +0000 (15:42 +0100)]
xen/trace: Fix trace metadata page count calculation (revert fbf96e6)

Changeset fbf96e6, "xentrace: correct formula to calculate
t_info_pages", broke the trace metadata page count calculation, by
mistaking t_info_first_offset as denominated in bytes, when in fact it
is denominated in words (uint32_t).

Effectively revert that change, and put a comment there to reduce the
chance that someone will make that mistake in the future.

Reviewed-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Tested-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
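
The unit confusion described in the commit above (words versus bytes) can be shown with a short C sketch. Everything here is an assumption made for illustration: the function names, the 4096-byte page size, and the premise that each per-CPU trace page needs one uint32_t entry in the metadata. It is not the code from Xen's trace implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u

/* Correct: the first offset is counted in uint32_t words, like the rest. */
size_t tinfo_pages(size_t cpus, size_t pages_per_cpu,
                   size_t first_offset_words)
{
    size_t words = cpus * pages_per_cpu + first_offset_words;
    size_t bytes = words * sizeof(uint32_t);

    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Mistaken: the first offset is treated as if it were already in bytes. */
size_t tinfo_pages_mistaken(size_t cpus, size_t pages_per_cpu,
                            size_t first_offset_words)
{
    size_t bytes = cpus * pages_per_cpu * sizeof(uint32_t)
                   + first_offset_words;

    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

For 4 CPUs, 255 pages per CPU and a first offset of 8 words, the first function needs 1028 words (4112 bytes, two pages), while the mistaken variant computes 4088 bytes and allocates only one page.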
8 years ago Makefile: fix (again) EFI part of "symbols: Generate an xen-sym.map (tag: 4.8.0-rc2)
Konrad Rzeszutek Wilk [Mon, 10 Oct 2016 18:10:56 +0000 (11:10 -0700)]
Makefile: fix (again) EFI part of "symbols: Generate an xen-sym.map

This is a follow-up to commit d14fffcc6a7c054db9e337026a3c850152244ac4
"fix EFI part of "symbols: Generate an xen-sym.map" which fixed most of
the issues.

However we still have an issue - The file being installed (xen.efi.map)
does not exist in an ARM64 build (the xen.efi is linked against xen).

The fix can be done two ways:
 a) See if xen.efi.map exists and then copy it
 b) Or link xen.efi.map to xen-syms.map (similar to how xen.efi is linked
    against xen).

The patch chooses the former.

Reported-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago Kconfig: use tab instead of space
Wei Liu [Mon, 10 Oct 2016 09:40:30 +0000 (10:40 +0100)]
Kconfig: use tab instead of space

Previously in d6be2cfc ("xen: make clear gcov support limitation in
Kconfig") and db6c2264 ("xen: add a gcov Kconfig option"), space was
used to indent Kconfig text. Change that to use tab instead.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86: defer not-present segment checks
Jan Beulich [Mon, 10 Oct 2016 10:16:49 +0000 (12:16 +0200)]
x86: defer not-present segment checks

Following on from commits 5602e74c60 ("x86emul: correct loading of
%ss") and bdb860d01c ("x86/HVM: correct segment register loading during
task switch") the point of the non-.present checks needs to be refined:
#NP (and its #SS companion), other than suggested by the various
instruction pages in Intel's SDM, gets checked for only after all type
and permission checks. The only checks getting done even later are the
long mode specific ones for system descriptors (which we don't support
yet) and 64-bit code segments (i.e. anything touching other than the
attribute byte).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86: replace redundant MTRR MSR definitions
Jan Beulich [Mon, 10 Oct 2016 10:16:06 +0000 (12:16 +0200)]
x86: replace redundant MTRR MSR definitions

We really should have only one set of #define-s for them.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86/hvm: remove emulation context setting from hvmemul_cmpxchg()
Razvan Cojocaru [Fri, 7 Oct 2016 09:35:58 +0000 (11:35 +0200)]
x86/hvm: remove emulation context setting from hvmemul_cmpxchg()

hvmemul_cmpxchg() sets the read emulation context in p_new instead
of p_old, which is inconsistent (and wrong). Since p_old is
unused in any case and cmpxchg() semantics would be altered even
if it wasn't, remove the emulation context setting code.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
8 years ago timer: process softirq during dumping timer info
Lan Tianyu [Fri, 7 Oct 2016 09:35:26 +0000 (11:35 +0200)]
timer: process softirq during dumping timer info

Dumping timer info may run for a long time on a huge machine with
a lot of physical cpus. To avoid triggering NMI watchdog, add
process_pending_softirqs() in the loop of dumping timer info.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago x86emul: check for FPU availability
Jan Beulich [Wed, 5 Oct 2016 12:20:10 +0000 (14:20 +0200)]
x86emul: check for FPU availability

We can't exclude someone wanting to hide the FPU from guests.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com>
8 years ago x86emul: deliver correct math exceptions
Jan Beulich [Wed, 5 Oct 2016 12:19:43 +0000 (14:19 +0200)]
x86emul: deliver correct math exceptions

#MF only applies to x87 instructions. SSE and AVX ones need #XM to be
raised instead, unless CR4.OSXMMEXCPT is clear, in which case #UD needs
to result. (But note that this is only a latent issue - we don't
emulate any instructions so far which could result in #XM.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
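
The exception selection rule stated in the commit above fits in a few lines of C. The function name and parameters are made up for this sketch; the vector numbers are the architectural ones (#UD is 6, #MF is 16, #XM is 19).

```c
#include <stdbool.h>

enum { EXC_UD_SKETCH = 6, EXC_MF_SKETCH = 16, EXC_XM_SKETCH = 19 };

/*
 * Pick the vector for an unmasked floating point exception.
 * is_x87 distinguishes x87 instructions from SSE/AVX ones.
 */
int fp_exception_vector(bool is_x87, bool cr4_osxmmexcpt)
{
    if (is_x87)
        return EXC_MF_SKETCH;

    /* SSE/AVX: #XM if the OS opted in via CR4.OSXMMEXCPT, else #UD. */
    return cr4_osxmmexcpt ? EXC_XM_SKETCH : EXC_UD_SKETCH;
}
```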
8 years ago x86emul: honor guest CR4.OSFXSR and CR4.OSXSAVE
Jan Beulich [Wed, 5 Oct 2016 12:18:42 +0000 (14:18 +0200)]
x86emul: honor guest CR4.OSFXSR and CR4.OSXSAVE

These checks belong into the emulator instead of hvmemul_get_fpu().

The CR0.PE/EFLAGS.VM ones can actually just be ASSERT()ed, as decoding
should make it impossible to get into get_fpu() with them in the wrong
state.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago Fix to be error handled when 10ms delayed for cpu_on
casionwoo [Tue, 4 Oct 2016 11:04:08 +0000 (20:04 +0900)]
Fix to be error handled when 10ms delayed for cpu_on

The comment in the original code says "wait max 10 ms until cpu is on".
The original code is expected to print "CPU%d power enable failed" if the CPU does not come on within 10 ms.
But the actual code does not reach the print even after waiting 10 ms (it actually waits 11 ms, not 10 ms),
because the comparison is as below:
"if ( timeout-- == 0 )"
So I modified the code to wait 10 ms and then print the error statement.
Let me simulate the original code and the modified code.

Original code)

timeout           delayed time    timeout
(before while)    (mdelay(1))     (timeout--)
  10                 1               9
   9                 2               8
   8                 3               7
   7                 4               6
   6                 5               5
   5                 6               4
   4                 7               3
   3                 8               2
   2                 9               1
   1                10               0
   0                11              -1

Modified code)

timeout           delayed time    timeout
(before while)    (mdelay(1))     (--timeout)
  10                 1               9
   9                 2               8
   8                 3               7
   7                 4               6
   6                 5               5
   5                 6               4
   4                 7               3
   3                 8               2
   2                 9               1
   1                10               0

Signed-off-by: JEUNGWOO, YOO <casionwoo@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
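
The two tables in the commit above can be reproduced with a small C simulation. This is a sketch, not the platform code: the function names are invented, the mdelay(1) call is replaced by a counter, and the CPU is assumed never to come on so that the timeout path is always taken.

```c
/*
 * Count the 1 ms delays performed before the timeout check fires when the
 * comparison uses post-decrement, as in the original code.
 */
int delays_post_decrement(int timeout)
{
    int ms = 0;

    for (;;) {
        ms++;                   /* stands in for mdelay(1) */
        if (timeout-- == 0)     /* compares the old value, then decrements */
            return ms;
    }
}

/*
 * Same loop with pre-decrement, as in the modified code.
 * The initial timeout must be at least 1.
 */
int delays_pre_decrement(int timeout)
{
    int ms = 0;

    for (;;) {
        ms++;                   /* stands in for mdelay(1) */
        if (--timeout == 0)     /* decrements first, then compares */
            return ms;
    }
}
```

Starting from a timeout of 10, the post-decrement loop delays 11 times before reporting the failure, while the pre-decrement loop delays exactly 10 times, matching the last rows of the two tables.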
8 years ago arm: fix build with gcc6
Jan Beulich [Tue, 4 Oct 2016 10:26:14 +0000 (04:26 -0600)]
arm: fix build with gcc6

Commit e170622f95 ("xen/arm: p2m: Re-implement p2m_set_mem_access using
p2m_{set,get}_entry") eliminated the only user of level_sizes[],
causing gcc6 to warn about the unused variable (as it's a const one
older gcc versions apparently don't care to emit a warning).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago x86emul: honor guest CR0.TS and CR0.EM
Jan Beulich [Tue, 4 Oct 2016 13:04:46 +0000 (14:04 +0100)]
x86emul: honor guest CR0.TS and CR0.EM

We must not emulate any instructions accessing respective registers
when either of these flags is set in the guest view of the register, or
else we may do so on data not belonging to the guest's current task.

Being architecturally required behavior, the logic gets placed in the
instruction emulator instead of hvmemul_get_fpu(). It should be noted,
though, that hvmemul_get_fpu() being the only current handler for the
get_fpu() callback, we don't have an active problem with CR4: Both
CR4.OSFXSR and CR4.OSXSAVE get handled as necessary by that function.

This is XSA-190.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
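
The check described in the commit above can be sketched as a function that maps the guest's CR0 flags to the fault an FPU-using instruction should raise. The names and the two-way instruction classification are assumptions for this example, and the rules encoded (CR0.EM gives #NM for x87 but #UD for MMX/SSE, CR0.TS gives #NM) are a simplified reading of the architecture manuals rather than a copy of the emulator's logic.

```c
#define CR0_EM_SKETCH (1ul << 2)   /* emulation */
#define CR0_TS_SKETCH (1ul << 3)   /* task switched */

enum { EXC_UD_FPU_SKETCH = 6, EXC_NM_SKETCH = 7 };

enum fpu_class_sketch {
    FPU_CLASS_X87,   /* x87 instructions */
    FPU_CLASS_SIMD   /* MMX and SSE instructions */
};

/*
 * Return the vector to raise before touching FPU/SIMD register state,
 * or -1 if the guest's CR0 permits the access.
 */
int fpu_access_fault(enum fpu_class_sketch cls, unsigned long guest_cr0)
{
    if (guest_cr0 & CR0_EM_SKETCH)
        return cls == FPU_CLASS_X87 ? EXC_NM_SKETCH : EXC_UD_FPU_SKETCH;

    if (guest_cr0 & CR0_TS_SKETCH)
        return EXC_NM_SKETCH;

    return -1;
}
```

Raising the fault instead of emulating means the emulator never operates on register contents that may belong to a different guest task.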
8 years ago init-xenstore-domain: remove an unused variable
Jan Beulich [Tue, 4 Oct 2016 10:27:07 +0000 (04:27 -0600)]
init-xenstore-domain: remove an unused variable

Introduced by commit 80dd5b401e ("tools: add --maxmem parameter to
init-xenstore-domain").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago libxl: Mark libxl_retrieve_domain_configuration as for external callers only
Ian Jackson [Tue, 4 Oct 2016 09:19:36 +0000 (10:19 +0100)]
libxl: Mark libxl_retrieve_domain_configuration as for external callers only

This function takes the userdata lock.  Incautious use inside libxl
can result in nested acquisition of that lock, and deadlock.

There is no good reason to use this function inside libxl, but it is a
superficially attractive option.  Make future regressions easier to
spot by marking the function for external use only.

Similar arguments apply for the application-facing userdata accessors,
so do those too.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago libxl: fix issues in 38cd0664
Wei Liu [Mon, 3 Oct 2016 14:46:02 +0000 (15:46 +0100)]
libxl: fix issues in 38cd0664

A few issues were introduced in 38cd0664 ("libxl/arm: Add the size of
ACPI tables to maxmem"):

1. d_config was not properly initialised and disposed of.
2. using libxl_retrieve_domain_configuration caused thread to
   deadlock itself.

Fix those issues by:

1. properly initialise and dispose of d_config.
2. switch to use libxl__get_domain_configuration.

Note that in theory we can refactor libxl_retrieve_domain_configuration
a bit to get a function without locking, but for now the calculation of
extra memory relies only on static configuration, hence we use the
stored configuration only.

Reported-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago Xen 4.8.0-rc1 preparation (tag: 4.8.0-rc1)
Ian Jackson [Mon, 3 Oct 2016 10:55:26 +0000 (11:55 +0100)]
Xen 4.8.0-rc1 preparation

* Change QEMU_UPSTREAM_REVISION MINIOS_UPSTREAM_REVISION and
  QEMU_TRADITIONAL_REVISION to refer to the Xen 4.8.0-rc1 tags.

* Change README and xen/Makefile to refer to Xen 4.8.0-rc (note, the
  RC number is not included, so we do not have to update these again).

I reran autogen.sh as per the release checklist and this produced no
changes, as expected.  (Debian jessie i386.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago tmem: Batch and squash XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_[FLAGS,NPAGES,UUID]
Konrad Rzeszutek Wilk [Fri, 30 Sep 2016 19:10:22 +0000 (15:10 -0400)]
tmem: Batch and squash XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_[FLAGS,NPAGES,UUID]
in one sub-call: XEN_SYSCTL_TMEM_OP_GET_POOLS.

These operations are used during the save process of migration.
Instead of doing 64 hypercalls, let's do just one. We modify
the 'struct xen_tmem_client' structure (used in
XEN_SYSCTL_TMEM_OP_[GET|SET]_CLIENT_INFO) to have an extra field
'nr_pools'. Armed with that the code slurping up pages from the
hypervisor can allocate a big enough structure (struct tmem_pool_info)
to contain all the active pools. And then just iterate over each
one and save it in the stream.

We are also re-using one of the subcommands numbers for this,
as such the XEN_SYSCTL_INTERFACE_VERSION should be incremented
and that was done in the patch titled:
"tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].."

In the xc_tmem_[save|restore] we also added proper memory handling
of the 'buf' and 'pools'. Because of the loops and to make it as
easy as possible to review we add a goto label and for almost
all error conditions jump to it.

The include for inttypes is required for the PRId64 macro to
work (which is needed to compile this code under 32-bit).

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg.
Konrad Rzeszutek Wilk [Fri, 30 Sep 2016 19:10:01 +0000 (15:10 -0400)]
tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg.

That is what they are used for. Let's make it clearer.

Of all the various sub-commands, the only one that needed
semantic change is XEN_SYSCTL_TMEM_OP_SAVE_BEGIN. That in the
past used 'arg1', and now we are moving it to use 'arg'.
Since that code is only used during migration which is tied
to the toolstack it is OK to change it.

We should increment the XEN_SYSCTL_INTERFACE_VERSION because
of this, and that was fortunately done in the patch titled:
"tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].."

While at it, also fix xc_tmem_control_oid to properly handle
the 'buf' and bounce it as appropriate.

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago tmem: Unify XEN_SYSCTL_TMEM_OP_[[SAVE_[BEGIN|END]|RESTORE_BEGIN]
Konrad Rzeszutek Wilk [Mon, 26 Sep 2016 15:05:09 +0000 (11:05 -0400)]
tmem: Unify XEN_SYSCTL_TMEM_OP_[[SAVE_[BEGIN|END]|RESTORE_BEGIN]

return values. For success they used to be 1 ([SAVE,RESTORE]_BEGIN),
0 if guest did not have any tmem (but only for SAVE_BEGIN), and
-1 for any type of failure.

And SAVE_END (which you would think would mirror SAVE_BEGIN)
had 0 for success and -1 if the guest did not have any tmem enabled for it.

This is confusing. Now the code will return 0 if the operation
succeeded.  Various XEN_EXX values are returned if tmem is not enabled
or the operation could not be performed.

The xc_tmem.c code only needs one place to check - where we use
SAVE_BEGIN. The place where RESTORE_BEGIN is used will have errno
with the proper error value and return will be -1, so will still
fail properly.

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]..
Konrad Rzeszutek Wilk [Fri, 30 Sep 2016 14:53:01 +0000 (10:53 -0400)]
tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]..

Specifically:

XEN_SYSCTL_TMEM_OP_SET_[WEIGHT,COMPRESS] are now done via:

 XEN_SYSCTL_TMEM_SET_CLIENT_INFO

and XEN_SYSCTL_TMEM_OP_SAVE_GET_[VERSION,MAXPOOLS,
CLIENT_WEIGHT, CLIENT_FLAGS] can now be retrieved via:

 XEN_SYSCTL_TMEM_GET_CLIENT_INFO

All this information is now in 'struct xen_tmem_client' and
that is what we pass around.

We also rev up the XEN_SYSCTL_INTERFACE_VERSION as we are
re-using the value number of the deleted ones (and henceforth
the information is retrieved differently).

On the toolstack, prior to this patch, the xc_tmem_control
would use the bounce buffer only when arg1 was set and the cmd
was to list. With the 'XEN_SYSCTL_TMEM_OP_SET_[WEIGHT|COMPRESS]'
that made sense as the 'arg1' would have the value. However
for the other ones (say XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_UUID)
the 'arg1' would be the length of the 'buf'. If this is
confusing, don't despair: the patch titled
tmem/xc_tmem_control: Rename 'arg1' to 'len' and 'arg2' to arg.
takes care of that.

The astute reader of the toolstack code will discover that
we only used the bounce buffer for LIST, not for any other
subcommands that used 'buf'!?! Which means that the contents
of 'buf' would never be copied back to the caller's 'buf'!

The author is not sure how this could possibly work, perhaps Xen 4.1
(when this was introduced) was more relaxed about the bounce buffer
being enabled. Anyhow this fixes xc_tmem_control to do it for
any subcommand that has 'arg1'.

Lastly some of the checks in xc_tmem_[restore|save] are removed
as they can't ever be reached (not even sure how they could
have been reached in the original submission). One of them
is the check for the weight against -1 when in fact the
hypervisor would never have provided that value.

Now the checks are simple - as the hypercall always returns
->version and ->maxpools (which is mirroring how it was done
prior to this patch). But if one wants to check whether a guest
has any tmem activity then the patch titled
"tmem: Batch and squash XEN_SYSCTL_TMEM_OP_SAVE_GET_POOL_
[FLAGS,NPAGES,UUID] in one sub-call: XEN_SYSCTL_TMEM_OP_GET_POOLS."
adds an ->nr_pools to check for that.

Also we add the check for ->version and ->maxpools and remove
the TODO.

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago tmem/sysctl: Add union in struct xen_sysctl_tmem_op
Konrad Rzeszutek Wilk [Fri, 30 Sep 2016 14:50:32 +0000 (10:50 -0400)]
tmem/sysctl: Add union in struct xen_sysctl_tmem_op

No functional change. We do this to prepare for another
entry to be added in the union. See patch titled:
"tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE]"

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago tmem: Move client weight, frozen, live_migrating, and compress
Konrad Rzeszutek Wilk [Fri, 30 Sep 2016 14:10:42 +0000 (10:10 -0400)]
tmem: Move client weight, frozen, live_migrating, and compress

in its own structure. This paves the way to make only one hypercall
to retrieve/set this information instead of multiple ones.

Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago tmem: Delete deduplication (and tze) code.
Konrad Rzeszutek Wilk [Tue, 27 Sep 2016 13:40:22 +0000 (09:40 -0400)]
tmem: Delete deduplication (and tze) code.

Couple of reasons:
 - It can lead to security issues (see row-hammer, KSM and such
   attacks).
 - Code is quite complex.
 - Deduplication is good if the pages themselves are the same
   but that is hardly guaranteed.
 - We got some gains (if pages are deduped) but at the cost of
   making code less maintainable.
 - tze depends on deduplication code.

As such, deleting it.

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago tmem: Retire XEN_SYSCTL_TMEM_OP_[SET_CAP|SAVE_GET_CLIENT_CAP]
Konrad Rzeszutek Wilk [Wed, 21 Sep 2016 20:53:51 +0000 (16:53 -0400)]
tmem: Retire XEN_SYSCTL_TMEM_OP_[SET_CAP|SAVE_GET_CLIENT_CAP]

It is not used by anything. Its intent was to complement
the 'weight' attribute but there hadn't been any request for this.

If there is a need to resurface it, it can be integrated back
via the XEN_SYSCTL_TMEM_SET_CLIENT_INFO introduced in
"tmem/libxc: Squash XEN_SYSCTL_TMEM_OP_[SET|SAVE].."

Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago libxc/tmem/restore: Remove call to XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION
Konrad Rzeszutek Wilk [Thu, 22 Sep 2016 01:18:57 +0000 (21:18 -0400)]
libxc/tmem/restore: Remove call to XEN_SYSCTL_TMEM_OP_SAVE_GET_VERSION

The only thing this hypercall returns is TMEM_SPEC_VERSION.

The comment around it is also misleading - this call does not
do any domain operation.

Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years ago svm/emulate: remove duplicated const specifier
Wei Liu [Fri, 30 Sep 2016 15:47:17 +0000 (16:47 +0100)]
svm/emulate: remove duplicated const specifier

Clang complains:

emulate.c:65:3: error: duplicate 'const' declaration specifier
      [-Werror,-Wduplicate-decl-specifier]
} const opc_tab[INSTR_MAX_COUNT] = {
  ^

Remove that const to fix the issue.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
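
As an illustration of the diagnostic quoted in the svm/emulate commit above: Clang warns when `const` appears both before the `struct` keyword and after the closing brace of the same declaration. The sketch below shows the corrected shape only. The table fields, entries, and the `opc_len()` helper are hypothetical and are not taken from emulate.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction indices; only INSTR_MAX_COUNT appears in the log. */
enum { INSTR_A, INSTR_B, INSTR_MAX_COUNT };

/*
 * Writing "static const struct { ... } const opc_tab[...]" repeats the
 * qualifier and triggers -Wduplicate-decl-specifier. A single "const"
 * before the struct is enough to make every element read-only.
 */
static const struct {
    unsigned int opcode;   /* illustrative field */
    size_t len;            /* illustrative field */
} opc_tab[INSTR_MAX_COUNT] = {
    [INSTR_A] = { 0x0f01u, 2 },
    [INSTR_B] = { 0x0f09u, 3 },
};

/* Look up the (made-up) length of a table entry. */
size_t opc_len(unsigned int idx)
{
    assert(idx < INSTR_MAX_COUNT);
    return opc_tab[idx].len;
}
```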
8 years agoxen: credit2: "relax" CSCHED2_MAX_TIMER
Dario Faggioli [Thu, 15 Sep 2016 11:35:05 +0000 (12:35 +0100)]
xen: credit2: "relax" CSCHED2_MAX_TIMER

Credit2 is already event based, rather than tick
based. This means the time at which the (i+1)-th
scheduling decision needs to happen is computed
during the i-th scheduling decision, and a timer
is set accordingly.

If there's nothing imminent (or, the most imminent
event is really really really far away), it is
ok to say "well, let's double-check things in
a little bit anyway", but such 'little bit' does
not need to be too little, as, most likely, it's
just pure overhead.

The current period, for this "safety catch"-alike
timer is 2ms, which indeed is high, but it can
well be higher. In fact, benchmarks show that
setting it to 10ms --combined with other
optimizations-- does actually improve performance.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
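
The "safety catch" timer described in the CSCHED2_MAX_TIMER commit above amounts to clamping the next scheduling timer. The following is a minimal sketch under stated assumptions: the function name, the microsecond units, and the lower bound are invented for illustration, and only the 10ms upper bound comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values in microseconds. */
#define MIN_TIMER_US   500      /* assumed lower bound, not from the commit */
#define MAX_TIMER_US 10000      /* "safety catch" period: 10ms */

/*
 * Return the delay until the next scheduler invocation.
 * A negative argument means "no event is pending at all".
 */
int64_t next_timer_us(int64_t time_to_next_event_us)
{
    assert(MIN_TIMER_US <= MAX_TIMER_US);

    /* Nothing imminent: double-check after the safety-catch period. */
    if (time_to_next_event_us < 0 || time_to_next_event_us > MAX_TIMER_US)
        return MAX_TIMER_US;

    /* Avoid programming the timer for very short intervals. */
    if (time_to_next_event_us < MIN_TIMER_US)
        return MIN_TIMER_US;

    return time_to_next_event_us;
}
```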
8 years agoxen: tracing: add trace records for schedule and rate-limiting.
Dario Faggioli [Fri, 30 Sep 2016 14:21:34 +0000 (16:21 +0200)]
xen: tracing: add trace records for schedule and rate-limiting.

As far as {csched, csched2, rt}_schedule() are concerned,
an "empty" event, would already make it easier to read and
understand a trace.

But while there, add some useful information, such as
whether the cpu that is going through the scheduler has
been tickled or not, whether it is currently idle, etc.
(they vary on a per-scheduler basis).

For Credit1 and Credit2, add a record about when
rate-limiting kicks in too.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: implement yield()
Dario Faggioli [Fri, 30 Sep 2016 14:21:27 +0000 (16:21 +0200)]
xen: credit2: implement yield()

When a vcpu explicitly yields, it is usually giving
us a hint: "let someone else run and come back
to me in a bit."

Credit2 isn't, so far, doing anything when a vcpu
yields, which means a yield is basically a NOP (well,
actually, it's pure overhead, as it causes the scheduler
to kick in, but the result is --at least 99% of the time--
that the very same vcpu that yielded continues to run).

With this patch, when a vcpu yields, we go and try
picking the next vcpu on the runqueue that can run on
the pcpu where the yielding vcpu is running. Of course,
if we don't find any other vcpu that wants and can run
there, the yielding vcpu will continue.

Also, add a yield performance counter, and fix the
style of a couple of comments.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
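
The selection rule described in the Credit2 yield() commit above can be sketched as follows. This is not the scheduler's code: the `struct vcpu` layout, the bitmask affinity, and the array-based runqueue are simplifying assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified vcpu: bit n of affinity set means "may run on pcpu n". */
struct vcpu {
    int id;
    unsigned int affinity;
};

/*
 * runq[] is ordered by priority and does not contain the yielder.
 * Return the first entry that is allowed to run on this pcpu; if there
 * is none, the yielding vcpu simply continues.
 */
const struct vcpu *pick_after_yield(const struct vcpu *yielder,
                                    const struct vcpu *runq, size_t n,
                                    unsigned int pcpu)
{
    assert(pcpu < 32);

    for (size_t i = 0; i < n; i++)
        if (runq[i].affinity & (1u << pcpu))
            return &runq[i];

    return yielder;
}
```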
8 years agoSVM: use generic instruction decoding
Jan Beulich [Fri, 30 Sep 2016 15:11:56 +0000 (17:11 +0200)]
SVM: use generic instruction decoding

... instead of custom handling. To facilitate this break out init code
from _hvm_emulate_one() into the new hvm_emulate_init(), and make
hvmemul_insn_fetch() globally available.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agox86/32on64: don't modify guest descriptors without need
Jan Beulich [Fri, 30 Sep 2016 14:45:46 +0000 (16:45 +0200)]
x86/32on64: don't modify guest descriptors without need

System gates with type 0 shouldn't have what might be their DPL altered
- such descriptors can't be used anyway without incurring a #GP, and
hence adjusting its DPL is only risking to confuse the guest.

Also bail right away for non-present descriptors - no need to write
back anything in that case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support RTM instructions
Jan Beulich [Fri, 30 Sep 2016 14:44:49 +0000 (16:44 +0200)]
x86emul: support RTM instructions

Minimal emulation: XBEGIN aborts right away, hence
- XABORT is just a no-op,
- XEND always raises #GP,
- XTEST always signals neither RTM nor HLE are active.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
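
The four behaviours listed in the RTM commit above can be modelled on a toy register file. This is a hedged sketch, not the emulator's code: `struct cpu`, the `emul_*` names, the explicit instruction lengths, and the choice of an all-zero abort status in EAX are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy CPU state; rip holds the address of the instruction being emulated. */
struct cpu {
    uint64_t rip;
    uint32_t eax;
    bool zf;
    bool gp_fault;   /* set when #GP would be raised */
};

/* XBEGIN rel: abort right away, resuming at the fallback address. */
void emul_xbegin(struct cpu *c, unsigned int insn_len, int32_t rel)
{
    c->rip += insn_len + (int64_t)rel;
    c->eax = 0;      /* abort status with no cause bits set (assumed) */
}

/* XABORT: no transaction can be active, so only RIP advances. */
void emul_xabort(struct cpu *c, unsigned int insn_len)
{
    c->rip += insn_len;
}

/* XEND: always faults, since no transaction is ever active. */
void emul_xend(struct cpu *c)
{
    c->gp_fault = true;
}

/* XTEST: ZF set signals that neither RTM nor HLE is active. */
void emul_xtest(struct cpu *c, unsigned int insn_len)
{
    c->zf = true;
    c->rip += insn_len;
}
```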
8 years agoxl: allow to set the ratelimit value online for Credit2
Dario Faggioli [Fri, 30 Sep 2016 02:54:28 +0000 (04:54 +0200)]
xl: allow to set the ratelimit value online for Credit2

Last part of the wiring necessary for allowing to
change the value of the ratelimit_us parameter online,
for Credit2 (like it is already for Credit1).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agolibxl: allow to set the ratelimit value online for Credit2
Dario Faggioli [Fri, 30 Sep 2016 02:54:21 +0000 (04:54 +0200)]
libxl: allow to set the ratelimit value online for Credit2

This is the remaining part of the plumbing (the libxl
one) necessary to be able to change the value of the
ratelimit_us parameter online, for Credit2 (like it is
already for Credit1).

Note that, so far, we were rejecting (for Credit1) a
new value of zero, even though it is a pretty nice way to
ask for the rate limiting to be disabled, and the
hypervisor is already capable of dealing with it in
that way.

Therefore, we change things so that it is possible to
do so, both for Credit1 and Credit2.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
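
The validation change described in the libxl ratelimit commit above (zero is accepted and means "disabled") could look roughly like this. The bounds and the function name are assumptions for the sketch. The real limits are defined in Xen's public headers and are not stated in the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bounds in microseconds, for illustration only. */
#define RATELIMIT_MIN_US    100
#define RATELIMIT_MAX_US 500000

/* Accept zero (rate limiting disabled) or any value within the bounds. */
bool ratelimit_us_valid(int us)
{
    assert(RATELIMIT_MIN_US < RATELIMIT_MAX_US);

    if (us == 0)
        return true;

    return us >= RATELIMIT_MIN_US && us <= RATELIMIT_MAX_US;
}
```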
8 years agolibxl: fix coding style of credit1 parameters related functions
Dario Faggioli [Fri, 30 Sep 2016 02:54:14 +0000 (04:54 +0200)]
libxl: fix coding style of credit1 parameters related functions

More specifically, the error handling path is
made compliant with libxl's coding style.

No functional change intended.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools: tracing: handle more scheduling related events.
Dario Faggioli [Fri, 30 Sep 2016 02:54:07 +0000 (04:54 +0200)]
tools: tracing: handle more scheduling related events.

There are some scheduling related trace records that
are not being taken care of (and hence only dumped as
raw records).

Some of them are being introduced in this series, while
others were just neglected by previous patches.

Add support for them.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
8 years agoxen: credit2: only reset credit on reset condition
Dario Faggioli [Fri, 30 Sep 2016 02:53:46 +0000 (04:53 +0200)]
xen: credit2: only reset credit on reset condition

The condition for a Credit2 scheduling epoch coming to an
end is that the vcpu at the front of the runqueue has negative
credits. However, it is possible that runq_candidate() does
not actually return to the scheduler the first vcpu in the
runqueue (e.g., because such vcpu can't run on the cpu that
is going through the scheduler, because of hard-affinity).

If that happens, we should not trigger a credit reset, or we
risk altering the length of a scheduler epoch, wrt what the
original idea of the algorithm was.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
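
One way to express the reset condition from the Credit2 credit-reset commit above is to track how many runqueue entries were skipped before the chosen vcpu. This is a sketch under that assumption: the `skipped` counter and the function name are hypothetical, and the commit message does not say how the patch detects that the front of the runqueue was bypassed.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * chosen_credit: credits of the vcpu that was actually selected.
 * skipped:       number of runqueue entries ahead of it that were passed
 *                over (for example because of hard affinity).
 *
 * Only a vcpu taken from the very front of the runqueue, and holding
 * negative credits, ends the scheduling epoch.
 */
bool should_reset_credit(int chosen_credit, unsigned int skipped)
{
    return skipped == 0 && chosen_credit < 0;
}
```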
8 years agoxen: credit2: make tickling more deterministic
Dario Faggioli [Fri, 30 Sep 2016 02:53:39 +0000 (04:53 +0200)]
xen: credit2: make tickling more deterministic

Right now, the following scenario can occur:
 - upon vcpu v wakeup, v itself is put in the runqueue,
   and pcpu X is tickled;
 - pcpu Y schedules (for whatever reason), sees v in
   the runqueue and picks it up.

This may seem ok (or even a good thing), but it's not.
In fact, if runq_tickle() decided X is where v should
run, it did it for a reason (load distribution, SMT
support, cache hotness, affinity, etc), and we really
should try as hard as possible to stick to that.

Of course, we can't be too strict, or we risk leaving
vcpus in the runqueue while there is available CPU
capacity. So, we only leave v in the runqueue --for X to
pick it up-- if we see that X has been tickled and
has not scheduled yet, i.e., it will have a real chance
of actually selecting and scheduling v.

If that is not the case, we schedule it on Y (or, at
least, we consider that), as running somewhere non-ideal
is better than not running at all.

The commit also adds performance counters for each of
the possible situations.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
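
The decision described in the Credit2 tickling commit above can be sketched as a predicate evaluated by pcpu Y when it finds v in the runqueue. The `tickled_cpu` field, the bitmask of tickled pcpus, and the function name are assumptions made for the example rather than the scheduler's data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified vcpu. */
struct vcpu {
    int tickled_cpu;   /* pcpu tickled on this vcpu's behalf, or -1 */
};

/*
 * tickled_mask has bit n set while pcpu n has been tickled but has not
 * yet gone through the scheduler. Return true if this_cpu should leave
 * v in the runqueue for the pcpu that was tickled for it.
 */
bool leave_for_tickled_cpu(const struct vcpu *v, int this_cpu,
                           unsigned int tickled_mask)
{
    assert(this_cpu >= 0 && this_cpu < 32);
    assert(v->tickled_cpu < 32);

    /* Nobody was tickled for v, or we are the tickled pcpu: take it. */
    if (v->tickled_cpu < 0 || v->tickled_cpu == this_cpu)
        return false;

    /* Leave v alone only while the tickled pcpu still has a real chance. */
    return (tickled_mask & (1u << v->tickled_cpu)) != 0;
}
```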
8 years agoxen: credit1: don't rate limit context switches in case of yields
Dario Faggioli [Fri, 30 Sep 2016 02:53:32 +0000 (04:53 +0200)]
xen: credit1: don't rate limit context switches in case of yields

Rate limiting has been primarily introduced to avoid too
heavy context switch rate due to interrupts, and, in
general, asynchronous events.

If a vcpu "voluntarily" yields, we really should let it
give up the cpu for a while.

In fact, it may be that it is yielding because it's about
to start spinning, and there's little point in forcing a vcpu
to spin for (potentially) the entire rate-limiting period.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
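
The rule from the Credit1 yield commit above, written as a predicate. This is a minimal sketch: the function and parameter names are hypothetical, and treating a ratelimit of zero as "disabled" is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Decide whether the current vcpu must keep running because of rate
 * limiting. A voluntary yield always overrides the rate limit.
 */
bool ratelimit_applies(bool yielded, unsigned int ratelimit_us,
                       unsigned int ran_us)
{
    if (yielded || ratelimit_us == 0)
        return false;

    return ran_us < ratelimit_us;
}
```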
8 years agoxen: credit1: return the 'time remaining to the limit' as next timeslice.
Dario Faggioli [Fri, 30 Sep 2016 02:53:25 +0000 (04:53 +0200)]
xen: credit1: return the 'time remaining to the limit' as next timeslice.

If vcpu x has run for 200us, and sched_ratelimit_us is
1000us, continue running x _but_ return 1000us-200us as
the next time slice. This way, next scheduling point will
happen in 800us, i.e., exactly at the point when x crosses
the threshold, and can be descheduled (if appropriate).

Right now (without this patch), we're always returning
sched_ratelimit_us (1000us, in the example above), which
means we're (potentially) allowing x to run more than
it should have been able to.

Note that, however, in order to avoid setting timers to very
short intervals, which is part of the purpose of rate limiting,
we never use a time slice smaller than a well defined threshold.
Such threshold (CSCHED_MIN_TIMER defined in this patch) is, in
general, independent from rate limiting, but it looks like a good idea
to set it to the minimum possible ratelimiting value.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
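
The arithmetic in the Credit1 timeslice commit above (1000us - 200us = 800us, with a floor) can be checked with a short sketch. The function name and the value chosen for the floor are assumptions. The commit names the floor CSCHED_MIN_TIMER but does not give its value.

```c
#include <assert.h>

/* Assumed floor in microseconds, standing in for CSCHED_MIN_TIMER. */
#define MIN_TIMER_US 100

/*
 * The vcpu has run for ran_us and is still within the rate-limit window.
 * Return the time remaining until it crosses the limit, but never less
 * than the floor, so that timers are not set to very short intervals.
 */
unsigned int ratelimited_timeslice_us(unsigned int ratelimit_us,
                                      unsigned int ran_us)
{
    assert(ran_us < ratelimit_us);

    unsigned int remaining = ratelimit_us - ran_us;

    return remaining < MIN_TIMER_US ? MIN_TIMER_US : remaining;
}
```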
8 years agox86emul: consolidate segment register handling
Jan Beulich [Fri, 30 Sep 2016 13:37:34 +0000 (15:37 +0200)]
x86emul: consolidate segment register handling

Use a single set of variables throughout the huge switch() statement,
allowing SLDT/STR to be funnelled into the mov-from-sreg code path.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support UMIP
Jan Beulich [Fri, 30 Sep 2016 13:37:00 +0000 (15:37 +0200)]
x86emul: support UMIP

To make this complete, also add support for SLDT and STR. Note that by
just looking at the guest CR4 bit, this is independent of actually
making available the UMIP feature to guests.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: sort opcode 0f01 special case switch() statement
Jan Beulich [Fri, 30 Sep 2016 13:06:40 +0000 (15:06 +0200)]
x86emul: sort opcode 0f01 special case switch() statement

Sort the special case opcode 0f01 entries numerically, insert blank
lines between each of the cases, and properly place opening braces.

No functional change.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/emulate: add support for {,v}movd {,x}mm,r/m32 and {,v}movq {,x}mm,r/m64
Zhi Wang [Fri, 30 Sep 2016 13:01:23 +0000 (15:01 +0200)]
x86/emulate: add support for {,v}movd {,x}mm,r/m32 and {,v}movq {,x}mm,r/m64

Found that Windows driver was using a SSE2 instruction MOVD.

Signed-off-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Mihai Donțu <mdontu@bitdefender.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/emulate: add support for {,v}movq xmm,xmm/m64
Mihai Donțu [Fri, 30 Sep 2016 13:00:29 +0000 (15:00 +0200)]
x86/emulate: add support for {,v}movq xmm,xmm/m64

From: Mihai Donțu <mdontu@bitdefender.com>

Signed-off-by: Mihai Donțu <mdontu@bitdefender.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: defer injection of #DB
Jan Beulich [Fri, 30 Sep 2016 12:58:48 +0000 (14:58 +0200)]
x86emul: defer injection of #DB

Move the raising of the single step trap until after registers were
updated. This should probably have been that way from the beginning,
to allow the inject_hw_exception() hook to see updated register state
(in case it cares) - it's a trap, after all.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support XSETBV
Jan Beulich [Fri, 30 Sep 2016 12:57:59 +0000 (14:57 +0200)]
x86emul: support XSETBV

This is a prereq for switching PV privileged op emulation to the
generic instruction emulator. Since handle_xsetbv() is already capable
of dealing with all guest kinds, avoid introducing another hook here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/emulate: Resolve MISSING_BREAK issue in x86_decode()
Andrew Cooper [Fri, 30 Sep 2016 10:01:04 +0000 (11:01 +0100)]
x86/emulate: Resolve MISSING_BREAK issue in x86_decode()

Coverity doesn't appear to be able to spot that this is a terminal error path,
but leave a comment to "fix" MISSING_BREAK.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agotools/libxc: Don't leak foreign mappings when loading modules
Andrew Cooper [Fri, 30 Sep 2016 10:01:03 +0000 (11:01 +0100)]
tools/libxc: Don't leak foreign mappings when loading modules

Spotted by Coverity

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoAdded COPYING and README.patch files to xen/common and xen/tools
Lars Kurth [Mon, 26 Sep 2016 16:06:49 +0000 (17:06 +0100)]
Added COPYING and README.patch files to xen/common and xen/tools

This patch adds information related to non-GPL licenses and code
imports from 3rd party projects. The aim of this patch is to
make it easier for future contributors to perform a review
of the codebase.

Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: remove all trailing whitespaces ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agoblktap2: Added COPYING file
Lars Kurth [Mon, 26 Sep 2016 12:16:34 +0000 (13:16 +0100)]
blktap2: Added COPYING file

Blktap2 has some complexity, as some files do not have (c) headers
and the directory did not have a COPYING file. At this stage, we
have not verified the intention of (c) holders. We may do this in
future, if the need arises.

Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: delete all trailing whitespaces ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agoAdded COPYING files and README.source files
Lars Kurth [Mon, 26 Sep 2016 12:16:33 +0000 (13:16 +0100)]
Added COPYING files and README.source files

Added a COPYING file as a boilerplate to explain license oddities in
this directory

Added a vtpm/COPYING file which contains MIT licensed files only

Added a vtpmmgr/README.source file which contains many BSD-3-Clause
files that originally came from tools/vtpm_manager

Signed-off-by: Lars Kurth <lars.kurth@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: delete all trailing whitespaces ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Add the size of ACPI tables to maxmem
Shannon Zhao [Thu, 29 Sep 2016 01:19:02 +0000 (18:19 -0700)]
libxl/arm: Add the size of ACPI tables to maxmem

Add the size of the ACPI tables to the target maxmem, so that the
guest is not left with less available memory.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Initialize domain param HVM_PARAM_CALLBACK_IRQ
Shannon Zhao [Thu, 29 Sep 2016 01:19:01 +0000 (18:19 -0700)]
libxl/arm: Initialize domain param HVM_PARAM_CALLBACK_IRQ

The guest kernel will get the event channel interrupt information via
domain param HVM_PARAM_CALLBACK_IRQ. Initialize it here.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopublic/hvm/params.h: Add macros for HVM_PARAM_CALLBACK_TYPE_PPI
Shannon Zhao [Thu, 29 Sep 2016 01:19:00 +0000 (18:19 -0700)]
public/hvm/params.h: Add macros for HVM_PARAM_CALLBACK_TYPE_PPI

Add macros for HVM_PARAM_CALLBACK_TYPE_PPI operation values and update
them in evtchn_fixup().

Also use HVM_PARAM_CALLBACK_IRQ_TYPE_MASK in hvm_set_callback_via().

Cc: Jan Beulich <jbeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agolibxl/arm: Add ACPI module
Shannon Zhao [Thu, 29 Sep 2016 01:18:59 +0000 (18:18 -0700)]
libxl/arm: Add ACPI module

Add the ARM Multiboot module for ACPI, so UEFI or DomU can get the base
address of ACPI tables from it.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Factor finalise_one_memory_node as a generic function
Shannon Zhao [Thu, 29 Sep 2016 01:18:58 +0000 (18:18 -0700)]
libxl/arm: Factor finalise_one_memory_node as a generic function

Rename finalise_one_memory_node to finalise_one_node and pass the node
name via function parameter.

This is useful for the ACPI module, which will be added by a later
patch.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Construct ACPI DSDT table
Shannon Zhao [Thu, 29 Sep 2016 01:18:57 +0000 (18:18 -0700)]
libxl/arm: Construct ACPI DSDT table

Copy the static DSDT table into ACPI blob.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Construct ACPI FADT table
Shannon Zhao [Thu, 29 Sep 2016 01:18:56 +0000 (18:18 -0700)]
libxl/arm: Construct ACPI FADT table

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Construct ACPI MADT table
Shannon Zhao [Thu, 29 Sep 2016 01:18:55 +0000 (18:18 -0700)]
libxl/arm: Construct ACPI MADT table

According to the GIC version, construct the MADT table.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Factor MPIDR computing codes out as a helper
Shannon Zhao [Thu, 29 Sep 2016 01:18:54 +0000 (18:18 -0700)]
libxl/arm: Factor MPIDR computing codes out as a helper

Factor MPIDR computing codes out as a helper, so it could be shared
between DT and ACPI.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
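
A helper of the kind described in the MPIDR commit above maps a vcpu index to MPIDR affinity fields. The sketch below assumes 16 vcpus per cluster, with Aff0 in bits [7:0] and Aff1 in bits [15:8]. Both the layout and the function name are assumptions, since the commit message does not spell them out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed mapping: the low 4 bits of the vcpu index go to Aff0 and the
 * next 8 bits go to Aff1 (shifted into bits [15:8]).
 */
uint64_t vcpu_to_mpidr_aff(unsigned int vcpu)
{
    assert(vcpu < 16 * 256);

    return (uint64_t)(vcpu & 0x0f) | ((uint64_t)((vcpu >> 4) & 0xff) << 8);
}
```

Sharing one such helper means the device tree and the ACPI MADT describe each vcpu with the same affinity value.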
8 years agolibxl/arm: Construct ACPI GTDT table
Shannon Zhao [Thu, 29 Sep 2016 01:18:53 +0000 (18:18 -0700)]
libxl/arm: Construct ACPI GTDT table

Construct GTDT table with the interrupt information of timers.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Construct ACPI XSDT table
Shannon Zhao [Thu, 29 Sep 2016 01:18:52 +0000 (18:18 -0700)]
libxl/arm: Construct ACPI XSDT table

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Construct ACPI RSDP table
Shannon Zhao [Thu, 29 Sep 2016 01:18:51 +0000 (18:18 -0700)]
libxl/arm: Construct ACPI RSDP table

Construct ACPI RSDP table and add a helper to calculate the ACPI table
checksum.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
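
The checksum helper mentioned in the RSDP commit above follows the usual ACPI rule: all bytes of a table, including the checksum byte, must sum to zero modulo 256. A minimal sketch follows. The function name and calling convention are hypothetical and not taken from libxl.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the value to store in the checksum field so that all bytes of
 * the table sum to 0 modulo 256. The checksum field itself must hold
 * zero when this is called.
 */
uint8_t acpi_checksum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;

    assert(table != NULL || len == 0);

    for (size_t i = 0; i < len; i++)
        sum = (uint8_t)(sum + table[i]);

    /* Two's complement of the running sum, truncated to one byte. */
    return (uint8_t)-sum;
}
```

Calling the function again on a table whose checksum byte has been filled in returns 0, which is a convenient validity check.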
8 years agolibxl/arm: Estimate the size of ACPI tables
Shannon Zhao [Thu, 29 Sep 2016 01:18:50 +0000 (18:18 -0700)]
libxl/arm: Estimate the size of ACPI tables

Estimate the size of ACPI tables and reserve a memory map space for ACPI
tables.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: Generate static ACPI DSDT table
Shannon Zhao [Thu, 29 Sep 2016 01:18:49 +0000 (18:18 -0700)]
libxl/arm: Generate static ACPI DSDT table

Use a static DSDT table, as x86 does. Currently the DSDT
table only contains processor device objects, and it generates the
maximum number of objects, which so far is 128.

Since GUEST_MAX_VCPUS is only defined under __XEN__ or __XEN_TOOLS__,
-D__XEN_TOOLS__ must be added to compile mk_dsdt.c.

Also only check iasl for aarch64 in configure since ACPI on ARM32 is not
supported.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: run autogen.sh and fix compilation on x86 ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl/arm: prepare for constructing ACPI tables
Shannon Zhao [Thu, 29 Sep 2016 01:18:48 +0000 (18:18 -0700)]
libxl/arm: prepare for constructing ACPI tables

Only construct the ACPI tables for a 64-bit ARM DomU when the user enables
acpi, because a 32-bit DomU doesn't support ACPI. The generation code
is only built for a 64-bit toolstack.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/libxl: Add an unified configuration option for ACPI
Shannon Zhao [Thu, 29 Sep 2016 01:18:47 +0000 (18:18 -0700)]
tools/libxl: Add an unified configuration option for ACPI

Since the existing configuration option "u.hvm.acpi" is x86 specific and
we want to reuse it on ARM as well, add a unified option "acpi" for
x86 and ARM, and for ARM it's disabled by default.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopub-headers: reduce C99 dependencies
Jan Beulich [Wed, 28 Sep 2016 12:00:31 +0000 (06:00 -0600)]
pub-headers: reduce C99 dependencies

For consumers not using (fully) C99-aware compilers, limit the number
of places where tweaking of the headers would be necessary: Introduce
and use xen_mk_ullong(), allowing its helper macro to be overridden at
once.

For now don't touch public/io/, which also has a few offenders.

The need to include xen.h in hvm/e820.h demonstrates that it is a bad
idea to include public headers first thing - arch/x86/hvm/mtrr.c needs
adjustment just because of this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
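
The override pattern described in the pub-headers commit above can be sketched with two macros. Only the name xen_mk_ullong() comes from the commit message. The helper name `my_mk_ullong_helper`, the example constant, and the accessor function are hypothetical.

```c
#include <assert.h>

/*
 * A consumer whose compiler lacks the ULL suffix could define the helper
 * before this point, for example to paste a different suffix.
 */
#ifndef my_mk_ullong_helper
# define my_mk_ullong_helper(x) x ## ULL
#endif

/* Every 64-bit constant in the headers goes through this one macro. */
#define xen_mk_ullong(x) my_mk_ullong_helper(x)

/* Illustrative constant, not a real Xen definition. */
#define EXAMPLE_BASE xen_mk_ullong(0x100000000)

unsigned long long example_base(void)
{
    return EXAMPLE_BASE;
}
```

With this shape, adapting the headers to a non-C99 compiler means overriding one helper instead of editing every constant.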
8 years agox86emul: simplify LEAVE handling
Jan Beulich [Fri, 30 Sep 2016 08:01:14 +0000 (10:01 +0200)]
x86emul: simplify LEAVE handling

There's no 1-byte operand size case to take care of here, and there's
no point doing the first writeback using dst fields - we can read rBP
and write rSP directly.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
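
The simplification described in the LEAVE commit above relies on the instruction's architectural behaviour: copy rBP to rSP, then pop rBP. Below is a toy model for the 64-bit operand size only. `struct cpu`, the flat `mem[]` array standing in for guest memory, and the function name are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy register file. */
struct cpu {
    uint64_t rsp, rbp;
};

/* mem[] stands in for guest memory and is indexed directly by address. */
void emul_leave(struct cpu *c, const uint8_t *mem, size_t mem_size)
{
    assert(c->rbp + sizeof(uint64_t) <= mem_size);

    c->rsp = c->rbp;                                  /* read rBP, write rSP */
    memcpy(&c->rbp, mem + c->rsp, sizeof(uint64_t));  /* pop rBP */
    c->rsp += sizeof(uint64_t);
}
```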
8 years agox86/PV: split out dealing with MSRs from privileged instruction handling
Jan Beulich [Fri, 30 Sep 2016 07:55:32 +0000 (09:55 +0200)]
x86/PV: split out dealing with MSRs from privileged instruction handling

This is in preparation for using the generic emulator here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/PV: split out dealing with DRn from privileged instruction handling
Jan Beulich [Fri, 30 Sep 2016 07:55:08 +0000 (09:55 +0200)]
x86/PV: split out dealing with DRn from privileged instruction handling

This is in preparation for using the generic emulator here.

Some care is needed temporarily to not unduly alter guest register
state: The local variable "res" can only go away once this code got
fully switched over to using x86_emulate().

Also switch to IS_ERR_VALUE() instead of (incorrectly) open coding it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
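
For readers unfamiliar with IS_ERR_VALUE(), mentioned in the DRn commit above: the macro conventionally tests whether a value, viewed as unsigned, falls in the top few thousand values of the range, where negated errno codes live. The sketch below follows the Linux-style definition. The bound of 4095 and the names ending in `_SKETCH` are assumptions and are not copied from Xen.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Assumed bound on errno values, as in Linux-style helpers. */
#define MAX_ERRNO_SKETCH 4095

/* True when x, viewed as unsigned long, lies in the top MAX_ERRNO values. */
#define IS_ERR_VALUE_SKETCH(x) \
    ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO_SKETCH)

/* A return value that may be either a payload or a negated errno code. */
bool lookup_failed(long ret)
{
    return IS_ERR_VALUE_SKETCH(ret);
}
```

Open-coded checks such as `ret < 0` can misclassify large valid values, which is one reason to prefer a single shared macro.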
8 years agox86/PV: split out dealing with CRn from privileged instruction handling
Jan Beulich [Fri, 30 Sep 2016 07:54:43 +0000 (09:54 +0200)]
x86/PV: split out dealing with CRn from privileged instruction handling

This is in preparation for using the generic emulator here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: generate and make use of a canonical opcode representation
Jan Beulich [Fri, 30 Sep 2016 07:53:40 +0000 (09:53 +0200)]
x86emul: generate and make use of a canonical opcode representation

This representation is then being made available to interested callers,
to facilitate replacing their custom decoding.

This entails combining the three main switch statements into one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: fix {,i}mul and {,i}div
Jan Beulich [Fri, 30 Sep 2016 07:52:52 +0000 (09:52 +0200)]
x86emul: fix {,i}mul and {,i}div

Commit a3db233ede ("x86emul: use DstEax also for {,I}{MUL,DIV}") went
a little too far: DstEax and SrcEax weren't really meant to be used
together with ModRM - they assume modrm_reg remains zero by the time
the destination / source register pointer gets calculated. Don't fully
undo that commit though, but instead just correct the register pointer,
and don't use dst.val as input for mul and imul (div and idiv did avoid
that already).

Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>