xenbits.xensource.com Git - people/royger/xen.git/log
8 years ago x86/layout: Correct Xen's idea of its own memory layout
Andrew Cooper [Tue, 28 Feb 2017 15:17:17 +0000 (15:17 +0000)]
x86/layout: Correct Xen's idea of its own memory layout

c/s b4cd59fe "x86: reorder .data and .init when linking" had an unintended
side effect, where xen_in_range() and the tboot S3 MAC were no longer correct.

In practice, it means that Xen's .data section is excluded from consideration,
which means:
 1) Default IOMMU construction for the hardware domain could create mappings
    of Xen's .data.
 2) .data isn't included in the tboot MAC checked on resume from S3.

Adjust the comments and virtual address anchors used to define the regions.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago xenstore: remove memory report command line support
Juergen Gross [Fri, 24 Feb 2017 06:21:45 +0000 (07:21 +0100)]
xenstore: remove memory report command line support

As a memory report can now be triggered via XS_CONTROL, support via the
command line and signal handler is no longer needed. Remove it.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xenstore: make memory report available via XS_CONTROL
Juergen Gross [Fri, 24 Feb 2017 06:21:44 +0000 (07:21 +0100)]
xenstore: make memory report available via XS_CONTROL

Add an XS_CONTROL command to xenstored for doing a talloc report to a
file. Right now this is supported by specifying a command line option
when starting xenstored and sending a signal to the daemon to trigger
the report.

To dump the report to the standard log file call:

xenstore-control memreport

To dump the report to a new file call:

xenstore-control memreport <file>

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xenstore: add support for changing log functionality dynamically
Juergen Gross [Fri, 24 Feb 2017 06:21:43 +0000 (07:21 +0100)]
xenstore: add support for changing log functionality dynamically

Today Xenstore supports logging only if specified at start of the
Xenstore daemon. As it can't be disabled during runtime it is not
recommended to start xenstored with logging enabled.

Add support for switching logging on and off at runtime and to
specify a (new) logfile. This is done via the XS_CONTROL wire command
which can be sent with xenstore-control.

To switch logging on just use:

xenstore-control log on

To switch it off again:

xenstore-control log off

To specify a (new) logfile:

xenstore-control logfile <file>

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xenstore: enhance control command support
Juergen Gross [Fri, 24 Feb 2017 06:21:42 +0000 (07:21 +0100)]
xenstore: enhance control command support

The Xenstore protocol supports the XS_CONTROL command for triggering
various actions in the Xenstore daemon. Enhance that support by using
a command table and adding a help function.

Support multiple control commands in the associated xenstore-control
program used to issue XS_CONTROL commands.
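
A minimal sketch of what a command table with a generated help function
can look like. This is not the xenstored code: the handler signature, the
"ping" command and the name dispatch_control() are hypothetical, chosen
only to illustrate the pattern.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical handler signature: returns 0 on success, -1 on failure. */
typedef int (*control_fn)(int argc, const char **argv);

struct control_cmd {
    const char *name;   /* command word sent by the client */
    control_fn func;    /* handler run for that word */
    const char *usage;  /* argument summary shown by "help" */
};

static int do_ping(int argc, const char **argv);
static int do_help(int argc, const char **argv);

/* One row per command; adding a command means adding a row. */
static const struct control_cmd cmds[] = {
    { "ping", do_ping, "" },
    { "help", do_help, "" },
};

#define NR_CMDS (sizeof(cmds) / sizeof(cmds[0]))

static int do_ping(int argc, const char **argv)
{
    (void)argc;
    (void)argv;
    return 0;
}

/* The help text is generated from the table itself. */
static int do_help(int argc, const char **argv)
{
    size_t i;

    (void)argc;
    (void)argv;
    for (i = 0; i < NR_CMDS; i++)
        printf("%s %s\n", cmds[i].name, cmds[i].usage);
    return 0;
}

/* Look the command word up in the table and run its handler. */
int dispatch_control(const char *name, int argc, const char **argv)
{
    size_t i;

    for (i = 0; i < NR_CMDS; i++)
        if (!strcmp(name, cmds[i].name))
            return cmds[i].func(argc, argv);
    return -1; /* unknown command */
}
```

With such a table, an unknown command word falls through to the error
return, and the help output cannot drift out of sync with the set of
supported commands.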

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xenstore: Split out XS_CONTROL action to dedicated source file
Juergen Gross [Fri, 24 Feb 2017 06:21:41 +0000 (07:21 +0100)]
xenstore: Split out XS_CONTROL action to dedicated source file

Move the XS_CONTROL handling of xenstored to a new source file
xenstored_control.c.

In order to avoid making get_string() in xenstored_core.c globally
visible, use strlen() instead, which is safe in this context because
xs_count_strings() has already returned a value > 1.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xenstore: rename XS_DEBUG wire command
Juergen Gross [Fri, 24 Feb 2017 06:21:40 +0000 (07:21 +0100)]
xenstore: rename XS_DEBUG wire command

In preparation for supporting more than pure debug functionality via the
Xenstore XS_DEBUG wire command, rename it to XS_CONTROL and make
XS_DEBUG an alias of it.

Add an alias xs_control_command for the associated xs_debug_command,
too.
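
The renaming pattern can be sketched as below. All identifiers are
hypothetical stand-ins (MSG_* rather than the real XS_* values, and
simplified function signatures); the point is only that the old name
stays usable and maps onto the same value and the same code path.

```c
/* Sketch only: names and values are illustrative, not the wire protocol. */
enum wire_msg_type {
    MSG_CONTROL,
    MSG_DEBUG = MSG_CONTROL,  /* old name kept as an alias */
    MSG_DIRECTORY,            /* numbering continues after the alias */
};

/* Hypothetical stand-in for the renamed library call. */
int control_command(const char *cmd)
{
    return cmd ? 0 : -1;
}

/* The old entry point simply forwards to the new one. */
int debug_command(const char *cmd)
{
    return control_command(cmd);
}
```

Existing callers that still use the old names keep compiling and behave
exactly as callers of the new names do.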

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xl: merge xl_cmdimpl.c into xl.c
Wei Liu [Fri, 24 Feb 2017 16:01:45 +0000 (16:01 +0000)]
xl: merge xl_cmdimpl.c into xl.c

After splitting out all the meaty bits, xl_cmdimpl.c doesn't contain
much. Merge the rest into xl.c and delete the file.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out migration related code
Wei Liu [Fri, 24 Feb 2017 15:59:32 +0000 (15:59 +0000)]
xl: split out migration related code

Include COLO / Remus code because they are built on top of the existing
migration protocol.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out save/restore related code
Wei Liu [Fri, 24 Feb 2017 15:54:52 +0000 (15:54 +0000)]
xl: split out save/restore related code

Add some function declarations to xl.h because they are now needed in
multiple files.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out vm lifecycle control functions
Wei Liu [Fri, 24 Feb 2017 15:21:05 +0000 (15:21 +0000)]
xl: split out vm lifecycle control functions

Including create, reboot, shutdown, pause, unpause and destroy.

Lift a bunch of core data structures and function declarations to xl.h
because they are needed in both xl_cmdimpl.c and xl_vmcontrol.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out miscellaneous functions
Wei Liu [Fri, 24 Feb 2017 14:58:29 +0000 (14:58 +0000)]
xl: split out miscellaneous functions

A collection of functions that don't warrant their own files.

Moving main_devd there requires lifting do_daemonize to xl_utils.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out vnc and console related code
Wei Liu [Fri, 24 Feb 2017 14:45:36 +0000 (14:45 +0000)]
xl: split out vnc and console related code

The new file also contains code for channel, which is just a console
in disguise.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: call libxl_vncviewer_exec in main_vncviewer
Wei Liu [Mon, 27 Feb 2017 17:35:32 +0000 (17:35 +0000)]
xl: call libxl_vncviewer_exec in main_vncviewer

We will need to move main_vncviewer to a different file where it has no
access to the helper vncviewer.

Call libxl_vncviewer_exec directly.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out functions to print out information
Wei Liu [Fri, 24 Feb 2017 14:25:30 +0000 (14:25 +0000)]
xl: split out functions to print out information

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out psr related code
Wei Liu [Fri, 24 Feb 2017 14:15:46 +0000 (14:15 +0000)]
xl: split out psr related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out memory related code
Wei Liu [Fri, 24 Feb 2017 14:13:05 +0000 (14:13 +0000)]
xl: split out memory related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out cdrom related code
Wei Liu [Fri, 24 Feb 2017 14:08:53 +0000 (14:08 +0000)]
xl: split out cdrom related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out vcpu related code
Wei Liu [Fri, 24 Feb 2017 14:02:19 +0000 (14:02 +0000)]
xl: split out vcpu related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out pci related code
Wei Liu [Fri, 24 Feb 2017 13:57:08 +0000 (13:57 +0000)]
xl: split out pci related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out scheduler related code
Wei Liu [Fri, 24 Feb 2017 13:54:43 +0000 (13:54 +0000)]
xl: split out scheduler related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out usb related code
Wei Liu [Fri, 24 Feb 2017 13:50:42 +0000 (13:50 +0000)]
xl: split out usb related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out network related code
Wei Liu [Fri, 24 Feb 2017 13:46:11 +0000 (13:46 +0000)]
xl: split out network related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out block related code
Wei Liu [Fri, 24 Feb 2017 13:43:37 +0000 (13:43 +0000)]
xl: split out block related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out vtpm related code
Wei Liu [Fri, 24 Feb 2017 13:39:42 +0000 (13:39 +0000)]
xl: split out vtpm related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out flask related code
Wei Liu [Fri, 24 Feb 2017 13:34:54 +0000 (13:34 +0000)]
xl: split out flask related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out cpupool related code
Wei Liu [Fri, 24 Feb 2017 13:19:48 +0000 (13:19 +0000)]
xl: split out cpupool related code

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out xl_parse.[ch]
Wei Liu [Fri, 24 Feb 2017 13:10:14 +0000 (13:10 +0000)]
xl: split out xl_parse.[ch]

Move all parsing code into xl_parse.c. Export the ones needed in
xl_parse.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: introduce a function to get shutdown action name
Wei Liu [Mon, 27 Feb 2017 17:28:17 +0000 (17:28 +0000)]
xl: introduce a function to get shutdown action name

The array to map libxl_shutdown_action_to_shutdown to string is going to
be moved to a dedicated file.

Provide a function to do the translation so that we don't need to make
the array globally visible.
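
The idea of hiding a lookup array behind an accessor can be sketched as
follows. The enum, the strings and the function name get_action_name()
are hypothetical stand-ins, not the libxl or xl definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the real action enumeration. */
enum action {
    ACTION_DESTROY,
    ACTION_RESTART,
    ACTION_PRESERVE,
    ACTION_COUNT
};

/* File-local table: other translation units cannot reference it. */
static const char *const action_names[ACTION_COUNT] = {
    [ACTION_DESTROY]  = "destroy",
    [ACTION_RESTART]  = "restart",
    [ACTION_PRESERVE] = "preserve",
};

/* The only exported symbol; returns NULL for out-of-range values. */
const char *get_action_name(enum action a)
{
    if ((unsigned int)a >= ACTION_COUNT)
        return NULL;
    return action_names[a];
}
```

Keeping the array static means the bounds check lives in exactly one
place, and the table can later move or change layout without touching
its users.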

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: rename cpurange_parse to parse_cpurange
Wei Liu [Mon, 27 Feb 2017 17:22:06 +0000 (17:22 +0000)]
xl: rename cpurange_parse to parse_cpurange

We want to consistently prefix functions to parse input with "parse_".

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: split out tmem related code to xl_tmem.c
Wei Liu [Fri, 24 Feb 2017 12:14:58 +0000 (12:14 +0000)]
xl: split out tmem related code to xl_tmem.c

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: move some helper functions to xl_utils.c
Wei Liu [Fri, 24 Feb 2017 11:40:41 +0000 (11:40 +0000)]
xl: move some helper functions to xl_utils.c

Move some commonly used functions to a new file.

find_domain requires access to global variable common_domname. Make that
non-static.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago x86/PVHv2: fix dereference of native RSDP table mapping
Roger Pau Monne [Mon, 27 Feb 2017 12:14:38 +0000 (12:14 +0000)]
x86/PVHv2: fix dereference of native RSDP table mapping

Check that the RSDP is mapped before trying to access it.

Spotted-by: Coverity
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago libs/devicemodel: free xencall handle in error path in _open()
Wei Liu [Mon, 27 Feb 2017 12:20:26 +0000 (12:20 +0000)]
libs/devicemodel: free xencall handle in error path in _open()

Change the allocation to use calloc to get a zeroed structure. Free the
xencall handle in the error path.
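
A minimal sketch of why a zeroed allocation makes the error path simple.
The types and functions here (struct handle, inner_open(), the fail_late
switch) are hypothetical and only model the shape of an _open() routine
that owns a nested handle.

```c
#include <stdlib.h>

struct inner { int fd; };                      /* stand-in for the nested handle */
struct handle { struct inner *call; int flags; };

static int inner_frees;  /* counts releases so the error path can be observed */

static struct inner *inner_open(void)
{
    return calloc(1, sizeof(struct inner));
}

static void inner_close(struct inner *i)
{
    if (!i)
        return;           /* nothing was opened */
    inner_frees++;
    free(i);
}

/* fail_late simulates a failure after the nested handle was acquired. */
struct handle *handle_open(int fail_late)
{
    /* calloc() zeroes the structure, so h->call starts out NULL. */
    struct handle *h = calloc(1, sizeof(*h));

    if (!h)
        return NULL;

    h->call = inner_open();
    if (!h->call || fail_late) {
        inner_close(h->call);  /* safe in both cases: NULL or a live handle */
        free(h);
        return NULL;
    }
    return h;
}

void handle_close(struct handle *h)
{
    if (!h)
        return;
    inner_close(h->call);
    free(h);
}
```

Because every pointer member is NULL until it is assigned, the same
cleanup sequence is correct no matter how far the open routine got.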

Spotted by Coverity.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: lift a bunch of macros to xl_utils.h
Wei Liu [Thu, 23 Feb 2017 18:34:15 +0000 (18:34 +0000)]
xl: lift a bunch of macros to xl_utils.h

We're going to split xl_cmdimpl.c into multiple files. Lift the commonly
used macros to xl_utils.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: remove trailing spaces in xl_cmdimpl.c
Wei Liu [Thu, 23 Feb 2017 18:41:13 +0000 (18:41 +0000)]
xl: remove trailing spaces in xl_cmdimpl.c

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: generate _paths.h
Wei Liu [Thu, 23 Feb 2017 18:28:02 +0000 (18:28 +0000)]
xl: generate _paths.h

It is included by xl.h. Previously it was using _paths.h from some other
place. We'd better generate one for xl as well.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: use <> variant to include Xen tools library headers
Wei Liu [Thu, 23 Feb 2017 18:06:14 +0000 (18:06 +0000)]
xl: use <> variant to include Xen tools library headers

They should be treated like any other libraries installed on the build
host. Compiler options are set correctly to point to their locations.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: remove inclusion of libxl_osdeps.h
Wei Liu [Thu, 23 Feb 2017 18:03:11 +0000 (18:03 +0000)]
xl: remove inclusion of libxl_osdeps.h

There is no reason for a client to include a private header from libxl.
Remove the inclusion and define _GNU_SOURCE for {v,}asprintf in
xl_cmdimpl.c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: update copyright information
Wei Liu [Thu, 23 Feb 2017 18:38:13 +0000 (18:38 +0000)]
xl: update copyright information

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: remove accidentally committed hunk from Makefile
Wei Liu [Thu, 23 Feb 2017 18:21:52 +0000 (18:21 +0000)]
xl: remove accidentally committed hunk from Makefile

It was never intended to be committed. Luckily the high-level Makefile was
correct, so it didn't cause us problems when building xl.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago x86/shadow: Fix build with CONFIG_SHADOW_PAGING=n following c/s 45ac805
Andrew Cooper [Mon, 27 Feb 2017 11:47:10 +0000 (11:47 +0000)]
x86/shadow: Fix build with CONFIG_SHADOW_PAGING=n following c/s 45ac805

c/s 45ac805 "x86/paging: Package up the log dirty function pointers" neglected
the case when CONFIG_SHADOW_PAGING is disabled.  Make a similar adjustment to
the none stubs.

Spotted by a Travis RANDCONFIG run.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years ago x86: fix memory leak in pvh_setup_acpi_xsdt
Wei Liu [Sun, 26 Feb 2017 15:49:32 +0000 (15:49 +0000)]
x86: fix memory leak in pvh_setup_acpi_xsdt

Switch to use goto style error handling to avoid leaking xsdt.
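
Goto-style error handling, in general terms, funnels every exit through
one label that releases whatever is still owned. The sketch below is not
pvh_setup_acpi_xsdt(); build_table() and its step_fails switch are
hypothetical and only show the pattern.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy len bytes of src into a freshly allocated table.
 * step_fails simulates an error occurring after the allocation.
 */
int build_table(const char *src, size_t len, int step_fails, char **result)
{
    int rc;
    char *table = malloc(len);

    if (!table)
        return -ENOMEM;

    memcpy(table, src, len);

    if (step_fails) {
        rc = -EINVAL;
        goto done;        /* table is still owned here and gets freed */
    }

    *result = table;
    table = NULL;         /* ownership handed to the caller */
    rc = 0;

 done:
    free(table);          /* free(NULL) is a no-op on the success path */
    return rc;
}
```

Adding a further failing step later only needs another "goto done"; it
cannot forget the free.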

Coverity-ID: 1401535

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86: fix memory leak in pvh_setup_acpi_madt
Wei Liu [Sun, 26 Feb 2017 15:49:31 +0000 (15:49 +0000)]
x86: fix memory leak in pvh_setup_acpi_madt

Switch to use goto style error handling to avoid leaking madt.

Coverity-ID: 1401534

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago build: add --with-rundir option to configure
Juergen Gross [Thu, 16 Feb 2017 07:47:07 +0000 (08:47 +0100)]
build: add --with-rundir option to configure

There have been reports that Fedora 25 uses /run instead of /var/run.

Add a --with-rundir option to configure to be able to specify that
directory. Default is still /var/run.

A re-run of autogen.sh is required.

Signed-off-by: Juergen Gross <jgross@suse.com>
[ wei: run autogen.sh ]
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago tools/xen-mceinj: fix the type of cpu number
Haozhong Zhang [Fri, 24 Feb 2017 10:52:56 +0000 (18:52 +0800)]
tools/xen-mceinj: fix the type of cpu number

Use "unsigned int" rather than "int" to align to the type "uint32_t"
of xen_mc_physcpuinfo.ncpus.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago x86/vmx: fix vmentry failure with TSX bits in LBR
Sergey Dyasli [Thu, 23 Feb 2017 09:33:27 +0000 (09:33 +0000)]
x86/vmx: fix vmentry failure with TSX bits in LBR

During VM entry, H/W will automatically load guest's MSRs from MSR-load
area in the same way as they would be written by WRMSR.

However, under the following conditions:

    1. LBR (Last Branch Record) MSRs were placed in the MSR-load area
    2. Address format of LBR includes TSX bits 61:62
    3. CPU has TSX support disabled

VM entry will fail with a message in the log similar to:

    (XEN) [   97.239514] d1v0 vmentry failure (reason 0x80000022): MSR loading (entry 3)
    (XEN) [   97.239516]   msr 00000680 val 1fff800000102e60 (mbz 0)

This happens because of the following behaviour:

    - When capturing branches, LBR H/W will always clear bits 61:62
      regardless of the sign extension
    - For WRMSR, bits 61:62 are considered part of the sign extension

This bug affects only certain pCPUs (e.g. Haswell) with vCPUs that
use LBR.  Fix it by sign-extending TSX bits in all LBR entries during
VM entry in affected cases.
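
One way to picture the fix-up, using the value from the log above. This
is a sketch under the assumption that bits 59:60 of the saved value still
hold the address's sign extension, so copying them into bits 61:62
restores what WRMSR expects. The macro and function names are
hypothetical.

```c
#include <stdint.h>

/* Bits 59:60, assumed to still carry the sign extension of the address. */
#define LBR_SIGNEXT_SRC ((UINT64_C(1) << 59) | (UINT64_C(1) << 60))

/* Copy bits 59:60 into bits 61:62; all other bits are left untouched. */
uint64_t lbr_fixup_tsx_bits(uint64_t val)
{
    return val | ((val & LBR_SIGNEXT_SRC) << 2);
}
```

Applied to 0x1fff800000102e60 this yields 0x7fff800000102e60, while a
value with bits 59:60 clear passes through unchanged.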

LBR MSRs are currently not Live Migrated. In order to implement such
functionality, the MSR levelling work has to be done first because
hosts can have different LBR formats.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years ago x86/vmx: optimize vmx_read/write_guest_msr()
Sergey Dyasli [Thu, 23 Feb 2017 09:33:26 +0000 (09:33 +0000)]
x86/vmx: optimize vmx_read/write_guest_msr()

Replace linear scan with vmx_find_msr().  This way the time complexity
of searching for required MSR reduces from linear to logarithmic.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/vmx: introduce vmx_find_msr()
Sergey Dyasli [Thu, 23 Feb 2017 09:33:25 +0000 (09:33 +0000)]
x86/vmx: introduce vmx_find_msr()

Modify vmx_add_msr() to use a variation of insertion sort algorithm:
find a place for the new entry and shift all subsequent elements before
insertion.

The new vmx_find_msr() exploits the fact that MSR list is now sorted
and reuses the existing code for binary search.
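
The combination of sorted insertion and binary search can be sketched as
below. It is not the Xen implementation: the fixed-size list, the names
list_add() and list_find(), and the use of the C library's bsearch() in
place of Xen's own binary search are all assumptions made for a
self-contained example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct msr_entry { uint32_t index; uint64_t data; };

#define MAX_ENTRIES 8

struct msr_list {
    struct msr_entry e[MAX_ENTRIES];  /* kept sorted by index */
    size_t n;
};

/* bsearch() comparator: key is a uint32_t index, elem is a list entry. */
static int cmp_entry(const void *key, const void *elem)
{
    uint32_t k = *(const uint32_t *)key;
    uint32_t i = ((const struct msr_entry *)elem)->index;

    return (k > i) - (k < i);
}

/* Logarithmic lookup; valid only because the list is sorted. */
struct msr_entry *list_find(struct msr_list *l, uint32_t index)
{
    return bsearch(&index, l->e, l->n, sizeof(l->e[0]), cmp_entry);
}

/* Insertion-sort step: find the slot, shift the tail up, then insert. */
int list_add(struct msr_list *l, uint32_t index)
{
    size_t pos;

    for (pos = 0; pos < l->n; pos++) {
        if (l->e[pos].index == index)
            return 0;                 /* already present */
        if (l->e[pos].index > index)
            break;
    }
    if (l->n == MAX_ENTRIES)
        return -1;                    /* list full */

    memmove(&l->e[pos + 1], &l->e[pos], (l->n - pos) * sizeof(l->e[0]));
    l->e[pos].index = index;
    l->e[pos].data = 0;
    l->n++;
    return 0;
}
```

Insertion stays linear because of the shift, but it happens rarely,
whereas lookups, which are the frequent operation, become logarithmic.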

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years ago x86/emul: Fix sarx emulation test
Andrew Cooper [Fri, 24 Feb 2017 18:12:19 +0000 (18:12 +0000)]
x86/emul: Fix sarx emulation test

The emulation tests run `sarx %edx,(%ecx),%ebx` with 0xfedcba98 pointed at by
%ecx, and 0xff13 in %rdx.

As the instruction uses a 32bit operand size, the expected result is
0x00000000ffffffdb in %rbx (rather than 0xffffffffffffffdb), due to usual
behaviour of 32bit operations on 64bit registers.

The test harness was incorrectly sign extending from 32 bits to 64 bits rather
than zero extending when checking the result of emulation, causing a false
negative failure.
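
The arithmetic can be checked with a small sketch. The helper names are
hypothetical; sar32() models a 32-bit arithmetic right shift with the
count masked to 5 bits, and the two extend helpers show the correct and
the incorrect widening to 64 bits.

```c
#include <stdint.h>

/*
 * 32-bit arithmetic right shift, written with unsigned operations so it
 * does not rely on how >> treats negative signed values.
 */
uint32_t sar32(uint32_t val, unsigned int count)
{
    uint32_t res;

    count &= 31;                      /* 32-bit operand size: 5-bit count */
    res = val >> count;
    if ((val & 0x80000000u) && count)
        res |= ~(0xffffffffu >> count);   /* replicate the sign bit */
    return res;
}

/* What a 32-bit write to a 64-bit register does: upper half cleared. */
uint64_t zero_extend32(uint32_t v)
{
    return (uint64_t)v;
}

/* What the test harness did by mistake: upper half copies bit 31. */
uint64_t sign_extend32(uint32_t v)
{
    return (v & 0x80000000u) ? (UINT64_C(0xffffffff00000000) | v)
                             : (uint64_t)v;
}
```

With 0xfedcba98 as the operand and 0xff13 as the count register, the
effective shift is 19, giving 0xffffffdb; zero extension produces
0x00000000ffffffdb and sign extension produces 0xffffffffffffffdb.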

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/paging: Package up the log dirty function pointers
Andrew Cooper [Thu, 16 Feb 2017 16:42:16 +0000 (16:42 +0000)]
x86/paging: Package up the log dirty function pointers

They depend solely on paging mode, so don't need to be repeated per domain, and
can live in .rodata.  While making this change, drop the redundant log_dirty
from the function pointer names.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years ago x86/cpuid: Handle leaf 0x1 in guest_cpuid()
Andrew Cooper [Fri, 17 Feb 2017 17:10:50 +0000 (17:10 +0000)]
x86/cpuid: Handle leaf 0x1 in guest_cpuid()

The feature words, ecx and edx, are already audited as part of the featureset
logic.  The existing leaf 0x80000001 dynamic logic has its SYSCALL adjustment
split out, as the rest of the adjustments are common with leaf 0x1.  The
existing leaf 0x1 feature adjustments from {pv,hvm}_cpuid() are moved
wholesale into guest_cpuid(), although deduped against the common adjustments.

The eax word is family/model/stepping information, and is fine to use as
provided by the toolstack, although with reserved bits cleared.

The ebx word is more problematic.  The low 8 bits are the brand ID and safe to
pass straight through.  The next 8 bits are the CLFLUSH line size.  This value
is forwarded straight from hardware, as nothing good can possibly come of
providing an alternative value to the guest.

The next 8 bits are slightly different between Intel and AMD, but are both
some property of the number of logical cores in the current physical package.
For now, the toolstack value is used unchanged until better topology support
is available.

The final 8 bits are the initial legacy APIC ID.  For HVM guests, this was
overridden to vcpu_id * 2.  The same logic is now applied to PV guests, so
guests don't observe a constant number on all vcpus via their emulated or
faulted view.
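
The four byte-wide fields of the ebx word described above can be combined
as in the following sketch. The function name and parameters are
hypothetical, and truncating vcpu_id * 2 to 8 bits is an assumption made
to keep the value inside its field.

```c
#include <stdint.h>

/*
 * Build the guest-visible leaf 0x1 ebx word from the toolstack-provided
 * value, the hardware value and the vcpu number.
 */
uint32_t leaf1_ebx(uint32_t toolstack_ebx, uint32_t hw_ebx,
                   unsigned int vcpu_id)
{
    uint32_t ebx = 0;

    ebx |= toolstack_ebx & 0x000000ffu;   /* bits 0-7: brand ID, passed through */
    ebx |= hw_ebx        & 0x0000ff00u;   /* bits 8-15: CLFLUSH line size, from hardware */
    ebx |= toolstack_ebx & 0x00ff0000u;   /* bits 16-23: logical core count, toolstack value */
    ebx |= ((vcpu_id * 2) & 0xffu) << 24; /* bits 24-31: initial APIC ID */

    return ebx;
}
```

For example, a toolstack value of 0x11223344, a hardware value of
0xaabbccdd and vcpu 3 combine to 0x0622cc44.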

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies
Andrew Cooper [Fri, 13 Jan 2017 17:54:24 +0000 (17:54 +0000)]
x86/gen-cpuid: Clarify the intended meaning of AVX wrt feature dependencies

Also update the AVX512 text similarly for EVEX, even if there are no
EVEX-encoded GPR instructions currently.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/cpuid: Disallow policy updates once the domain is running
Andrew Cooper [Fri, 17 Feb 2017 15:47:31 +0000 (15:47 +0000)]
x86/cpuid: Disallow policy updates once the domain is running

On real hardware, the bulk of CPUID data is system-specific and constant.
Hold the toolstack to the same behaviour when constructing domains.

Values which are expected to change dynamically (e.g. OSXSAVE) are unaffected
and continue to function as before.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86emul/test: split generic and testcase specific parts
Jan Beulich [Fri, 24 Feb 2017 16:22:13 +0000 (17:22 +0100)]
x86emul/test: split generic and testcase specific parts

Both the build logic and the invocation have their blowfish specific
aspects abstracted out here. Additionally
- run native execution (if suitable) first (as that one failing
  suggests a problem with the to be tested code itself, in which case
  having the emulator have a go over it is kind of pointless)
- move the 64-bit tests up in blobs[] so 64-bit native execution will
  also precede 32-bit emulation (on 64-bit systems only of course)
- instead of -msoft-float (we'd rather not have the compiler generate
  such code), pass -fno-asynchronous-unwind-tables and -g0 (reducing
  binary size of the helper images as well as [slightly] compilation
  time)
- skip tests with zero length blobs (these can result from failed
  compilation, but not failing the build in this case seems desirable:
  it may allow partial testing - e.g. with older compilers - and
  permits manually removing certain tests from the generated headers
  without having to touch actual source code)
- constrain rIP to the actual blob range rather than looking for the
  specific (fake) return address put on the stack
- also print the opcode when x86_emulate() fails
- print at least three progress dots (for relatively short tests)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86: setup PVHv2 Dom0 ACPI tables
Roger Pau Monné [Fri, 24 Feb 2017 14:49:19 +0000 (15:49 +0100)]
x86: setup PVHv2 Dom0 ACPI tables

Create a new MADT table that contains the topology exposed to the guest. A
new XSDT table is also created, in order to filter the tables that we want
to expose to the guest, plus the Xen crafted MADT. This in turn requires Xen
to also create a new RSDP in order to make it point to the custom XSDT.

Also, regions marked as E820_ACPI or E820_NVS are identity mapped into Dom0
p2m, plus any top-level ACPI tables that should be accessible to Dom0 and
reside in reserved regions. This is needed because some memory maps don't
properly account for all the memory used by ACPI, so it's common to find ACPI
tables in reserved regions.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86: setup PVHv2 Dom0 CPUs
Roger Pau Monné [Fri, 24 Feb 2017 14:48:59 +0000 (15:48 +0100)]
x86: setup PVHv2 Dom0 CPUs

Initialize Dom0 BSP/APs and setup the memory and IO permissions. This also sets
the initial BSP state in order to match the protocol specified in
docs/misc/hvmlite.markdown.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86: parse Dom0 kernel for PVHv2
Roger Pau Monné [Fri, 24 Feb 2017 14:48:43 +0000 (15:48 +0100)]
x86: parse Dom0 kernel for PVHv2

Introduce a helper to parse the Dom0 kernel.

A new helper is also introduced to libelf, that's used to store the destination
vcpu of the domain. This parameter is needed when loading the kernel on a HVM
domain (PVHv2), since hvm_copy_to_guest_phys requires passing the destination
vcpu.

While there also fix image_base and image_start to be of type "void *", and do
the necessary fixup of related functions.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/libelf: pass the destination vCPU to libelf for Dom0 build
Roger Pau Monné [Fri, 24 Feb 2017 14:47:55 +0000 (15:47 +0100)]
x86/libelf: pass the destination vCPU to libelf for Dom0 build

Allow setting the destination vCPU for libelf, so that elf_load_image can take
it into account when loading the kernel for Dom0. This is needed for PVHv2 Dom0
build, so that hvm_copy_to_guest_phys can be called with a Dom0 vCPU instead of
current (that contains the idle vCPU at this point).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/bzimage: change the types from char * to void *
Roger Pau Monné [Fri, 24 Feb 2017 14:47:36 +0000 (15:47 +0100)]
x86/bzimage: change the types from char * to void *

This also allows changing the types of image_base and image_start in the Dom0
builder from char * to void *.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86: populate PVHv2 Dom0 physical memory map
Roger Pau Monné [Fri, 24 Feb 2017 14:47:03 +0000 (15:47 +0100)]
x86: populate PVHv2 Dom0 physical memory map

Craft the Dom0 e820 memory map and populate it. Introduce a helper to remove
memory pages that are shared between Xen and a domain, and use it in order to
remove low 1MB RAM regions from dom_io in order to assign them to a PVHv2 Dom0.

On hardware lacking support for unrestricted mode also craft the identity page
tables and the TSS used for virtual 8086 mode.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86: remove XENFEAT_hvm_pirqs for PVHv2 guests
Roger Pau Monné [Fri, 24 Feb 2017 14:46:10 +0000 (15:46 +0100)]
x86: remove XENFEAT_hvm_pirqs for PVHv2 guests

PVHv2 guests, unlike HVM guests, won't have the option to route interrupts
from physical or emulated devices over event channels using PIRQs. This
applies to both DomU and Dom0 PVHv2 guests.

Introduce a new XEN_X86_EMU_USE_PIRQ to notify Xen whether a HVM guest can
route physical interrupts (even from emulated devices) over event channels,
and is thus allowed to use some of the PHYSDEV ops.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago x86/hvm: Don't let hvm_set_efer() raise #GP itself
Andrew Cooper [Fri, 24 Feb 2017 09:22:09 +0000 (09:22 +0000)]
x86/hvm: Don't let hvm_set_efer() raise #GP itself

c/s 49de10f3c "x86/hvm: Don't raise #GP behind the emulators back for MSR
accesses" missed an edge case.

hvm_set_efer() raises #GP itself, so deliberately avoided the goto gp_fault
path in hvm_msr_write_intercept().

With the above change, guest updates to MSR_EFER which end up faulting raise
#GP in hvm_set_efer(), and a second one as a result of
hvm_msr_write_intercept() returning X86EMUL_EXCEPTION.  The second #GP gets
combined into #DF and handed back to the guest.

Update hvm_set_efer() to avoid raising #GP, requiring its callers to do so.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago libxl/libxl_pci.c: Fix reverse logic when detaching device
Chao Gao [Thu, 23 Feb 2017 23:12:10 +0000 (07:12 +0800)]
libxl/libxl_pci.c: Fix reverse logic when detaching device

Commit 20b75251d97 ("libxl/libxl_pci.c: used LOG*D functions") reverses the
logic to call xc_deassign_device(). It makes the device unusable.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago arm/p2m: remove the page from p2m->pages list before freeing it
Julien Grall [Fri, 24 Feb 2017 08:58:50 +0000 (09:58 +0100)]
arm/p2m: remove the page from p2m->pages list before freeing it

The p2m code is using the page list field to link all the pages used
for the stage-2 page tables. The page is added into the p2m->pages
list just after the allocation but never removed from the list.

The page list field is also used by the allocator, so not removing the page
may result in a later Xen crash due to inconsistency (see [1]).
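
The hazard and its fix can be modelled with a plain doubly linked list.
This sketch does not use Xen's page_list API; struct page and the helper
names are hypothetical. The point is that a node must be unlinked before
its memory, including the link field, is handed back to the allocator.

```c
#include <stddef.h>
#include <stdlib.h>

/* The link field stands in for the page list field shared with the allocator. */
struct page { struct page *prev, *next; };

/* Circular list with a sentinel head. */
void list_init(struct page *head)
{
    head->prev = head->next = head;
}

/* Allocate a page and link it at the front, as done right after allocation. */
struct page *alloc_table_page(struct page *head)
{
    struct page *pg = malloc(sizeof(*pg));

    if (!pg)
        return NULL;
    pg->next = head->next;
    pg->prev = head;
    head->next->prev = pg;
    head->next = pg;
    return pg;
}

/* Unlink first, then free: the list never points at freed memory. */
void free_table_page(struct page *pg)
{
    pg->prev->next = pg->next;
    pg->next->prev = pg->prev;
    free(pg);
}

size_t list_count(const struct page *head)
{
    size_t n = 0;
    const struct page *p;

    for (p = head->next; p != head; p = p->next)
        n++;
    return n;
}
```

Skipping the two unlink assignments would leave the neighbours pointing
at a freed node whose link field the allocator is free to overwrite.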

This bug was introduced by the reworking of p2m code in commit 2ef3e36ec7
"xen/arm: p2m: Introduce p2m_set_entry and __p2m_set_entry".

[1] https://lists.xenproject.org/archives/html/xen-devel/2017-02/msg00524.html

Reported-by: Vijaya Kumar K <Vijaya.Kumar@cavium.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago tools: move xl to a dedicated directory
Wei Liu [Tue, 21 Feb 2017 14:52:46 +0000 (14:52 +0000)]
tools: move xl to a dedicated directory

It makes clear distinction between the client (xl) and library (libxl),
which should help design better APIs.  This will also help reduce the
code size in libxl directory.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago tools: provide libxlutil compiling and linking options
Wei Liu [Tue, 21 Feb 2017 14:40:48 +0000 (14:40 +0000)]
tools: provide libxlutil compiling and linking options

We are about to split out xl (which depends on libxlutil) to a different
directory. Provide the proper options for compiling and linking in
Rules.mk, and replace the hardcoded string in libxl/Makefile.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agoxen-access: request compat devicemodel API
Wei Liu [Thu, 23 Feb 2017 16:46:45 +0000 (16:46 +0000)]
xen-access: request compat devicemodel API

xc_hvm_inject_trap is moved to the new libdevicemodel. Request the
compat layer from libxenctrl for now to make xen-access build again.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
8 years agolibs/devicemodel: initialise op_bufs in xendevicemodel_xcall
Wei Liu [Thu, 23 Feb 2017 15:18:20 +0000 (15:18 +0000)]
libs/devicemodel: initialise op_bufs in xendevicemodel_xcall

This avoids freeing an uninitialised buffer when taking the first error
exit path.
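
A minimal sketch of the pattern, not the actual xendevicemodel code: the
function name and the zero-length check are hypothetical stand-ins. The point
is only that the pointer is set to NULL before the first "goto out", so the
shared cleanup never frees an indeterminate value.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a call that allocates a buffer table. */
static int sketch_call(size_t nr_bufs)
{
    void **bufs = NULL;   /* initialised up front so 'out' is always safe */
    int rc = -1;

    if (nr_bufs == 0)
        goto out;         /* first error exit: bufs is still NULL */

    bufs = calloc(nr_bufs, sizeof(*bufs));
    if (bufs == NULL)
        goto out;

    rc = 0;

 out:
    free(bufs);           /* free(NULL) is defined to be a no-op */
    return rc;
}
```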

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agopython: handle long type in scripts
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:28 +0000 (11:48 +0100)]
python: handle long type in scripts

In Python3 the 'long' type has been merged into 'int', and the '1L' syntax
is no longer valid. Assign the 'int' type to a 'long' variable in Python3, so
'long(1)' will give the correct result in both Python2 and Python3.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: adjust module initalization for Python3
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:27 +0000 (11:48 +0100)]
python: adjust module initalization for Python3

In Python3, PyTypeObject looks slightly different, and module
initialization is different as well. Instead of Py_InitModule,
PyModule_Create should be called on an already defined PyModuleDef
structure, and the initialization function should then return that module.

Additionally, the initialization function should be named PyInit_<name>
instead of init<name>.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: use PyLong_* for constructing 'int' type in Python3
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:26 +0000 (11:48 +0100)]
python: use PyLong_* for constructing 'int' type in Python3

In Python3 the 'int' and 'long' types are the same, so there are no longer
separate PyInt_* functions.  Provide convenient #defines to limit #if in
code.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: use PyBytes/PyUnicode instead of PyString
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:25 +0000 (11:48 +0100)]
python: use PyBytes/PyUnicode instead of PyString

In Python2 PyBytes is the same as PyString, but in Python3 PyString is
gone and 'str' is really PyUnicode in the C-API.
When handling arbitrary data, use PyBytes, which is the right thing to
do in Python3 and poses no API change in Python2. When handling
xenstore paths and transaction ids, which have a well-defined format, use
PyUnicode to ease API usage: there is no need to prefix all xenstore paths
with 'b' when migrating scripts to Python3.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: initialize specific fields of PyTypeObject
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:24 +0000 (11:48 +0100)]
python: initialize specific fields of PyTypeObject

Fields not named here will be zero-initialized anyway, but initializing
fields by name makes it much easier to support both Python2 and Python3.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: use Py_TYPE instead of looking directly into PyObject_HEAD
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:23 +0000 (11:48 +0100)]
python: use Py_TYPE instead of looking directly into PyObject_HEAD

Py_TYPE works on both Python2 and Python3, while internals of
PyObject_HEAD have changed.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: drop tp_getattr implementation
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:22 +0000 (11:48 +0100)]
python: drop tp_getattr implementation

tp_getattr method of type object is deprecated already in Python2 and
gone in Python3. Default implementation does the same as this custom one.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agopython: check return value of PyErr_NewException
Marek Marczykowski-Górecki [Thu, 23 Feb 2017 10:48:21 +0000 (11:48 +0100)]
python: check return value of PyErr_NewException

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/libxendevicemodel: Add headers.chk to .gitignore
Ian Jackson [Thu, 23 Feb 2017 12:09:47 +0000 (12:09 +0000)]
tools/libxendevicemodel: Add headers.chk to .gitignore

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
8 years agotools/libxendevicemodel: add a call to restrict the handle
Paul Durrant [Fri, 10 Feb 2017 14:34:15 +0000 (14:34 +0000)]
tools/libxendevicemodel: add a call to restrict the handle

My recent patch [1] to the Linux privcmd module introduced a mechanism
to restrict an open file handle to subsequently only accept operations for
a specified domain.

This patch extends the libxendevicemodel API and makes use of the
mechanism in the Linux-specific code to restrict operations on the
interface handle.

[1] https://git.kernel.org/cgit/linux/kernel/git/ostr/linux.git/commit/?id=4610d240

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools/libxendevicemodel: introduce a Linux-specific implementation
Paul Durrant [Wed, 15 Feb 2017 15:22:29 +0000 (15:22 +0000)]
tools/libxendevicemodel: introduce a Linux-specific implementation

My recent patch [1] to the Linux privcmd module introduced a dedicated
mechanism for making dm_op hypercalls.

This patch adds the necessary code to libxendevicemodel to take
advantage of that mechanism if it is implemented in the tools domain
kernel.

[1] https://git.kernel.org/cgit/linux/kernel/git/ostr/linux.git/commit/?id=ab520be8

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools/libxendevicemodel: extract functions and add a compat layer
Paul Durrant [Wed, 15 Feb 2017 14:44:16 +0000 (14:44 +0000)]
tools/libxendevicemodel: extract functions and add a compat layer

This patch extracts all functions resulting in a dm_op hypercall from
libxenctrl and moves them into libxendevicemodel. It also adds a compat
layer into libxenctrl, which can be selected by defining
XC_WANT_COMPAT_DEVICEMODEL_API to 1 before including xenctrl.h.

With this patch the core of libxendevicemodel still uses libxencall to
issue the dm_op hypercalls, but this is done by calling through code that
can be modified on a per-OS basis. A subsequent patch will add a Linux-
specific variant.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
8 years agotools/libxendevicemodel: introduce the new library
Paul Durrant [Wed, 15 Feb 2017 13:54:25 +0000 (13:54 +0000)]
tools/libxendevicemodel: introduce the new library

The new xendevicemodel library is intended to be used by all Xen device
models such that the only hypercall they use will be the dm_op hypercall
added by commit 524a98c2.

This patch adds the boilerplate for the new library, with only open() and
close() entry points, and calls to those from libxenctrl in preparation
for the compat layer added by a subsequent patch.

[ Also: update MINIOS_UPSTREAM_REVISION and QEMU_TRADITIONAL_REVISION
  to the commits with the corresponding changes to those other trees
  - Ian Jackson ]

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agodoc: fix typo in bios_path_override
Olaf Hering [Wed, 22 Feb 2017 16:35:20 +0000 (16:35 +0000)]
doc: fix typo in bios_path_override

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxenstore: correct test for opened logfile in reopen_log()
Juergen Gross [Wed, 22 Feb 2017 15:28:45 +0000 (16:28 +0100)]
xenstore: correct test for opened logfile in reopen_log()

As 0 is a valid file descriptor, testing whether a descriptor is valid
should be done via >= 0 instead of > 0.
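
The corrected test can be sketched as follows. The helper name is
hypothetical and not taken from xenstored; it only illustrates the comparison.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not taken from xenstored: open() and friends return
 * -1 on failure, so every non-negative value, including 0, is a valid
 * descriptor.
 */
static bool fd_is_valid(int fd)
{
    return fd >= 0;
}
```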

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoQEMU_TAG update
Ian Jackson [Wed, 22 Feb 2017 16:25:42 +0000 (16:25 +0000)]
QEMU_TAG update

8 years agotools/libxenctrl: fix error check after opening libxenforeignmemory
Paul Durrant [Wed, 22 Feb 2017 13:27:34 +0000 (13:27 +0000)]
tools/libxenctrl: fix error check after opening libxenforeignmemory

Checking the value of xch->xcall is clearly incorrect. The code should be
checking xch->fmem (i.e. the return of the previously called function).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agox86/mm: Swap mfn_valid() to use mfn_t
Andrew Cooper [Wed, 18 Jan 2017 15:05:24 +0000 (15:05 +0000)]
x86/mm: Swap mfn_valid() to use mfn_t

Replace one opencoded mfn_eq() and fix some coding style issues on altered
lines.  Swap __mfn_valid() to being bool, although it can't be updated to
take mfn_t because of include dependencies.

No functional change.
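
For readers unfamiliar with the typesafe wrapper, a simplified sketch
follows. The struct and helpers are stand-ins (Xen generates its own with
macros), and the validity rule with SKETCH_MAX_PAGE is purely an assumption
for illustration.

```c
#include <stdbool.h>

/* Simplified stand-ins; Xen generates its typesafe wrappers with macros. */
typedef struct { unsigned long mfn; } mfn_t;

static inline mfn_t _mfn(unsigned long n) { mfn_t m = { n }; return m; }
static inline unsigned long mfn_x(mfn_t m) { return m.mfn; }

/* Comparing through a helper avoids opencoding mfn_x(a) == mfn_x(b). */
static inline bool mfn_eq(mfn_t a, mfn_t b) { return mfn_x(a) == mfn_x(b); }

/* Hypothetical validity rule for the sketch: frames below a fixed bound. */
#define SKETCH_MAX_PAGE 0x100000UL

static inline bool sketch_mfn_valid(mfn_t m)
{
    return mfn_x(m) < SKETCH_MAX_PAGE;
}
```

Because mfn_t is a distinct struct type, passing a plain integer or a
different frame-number type where an mfn_t is expected fails to compile.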

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agox86: add multiboot2 protocol support for EFI platforms
Daniel Kiper [Wed, 22 Feb 2017 13:38:54 +0000 (14:38 +0100)]
x86: add multiboot2 protocol support for EFI platforms

This way Xen can be loaded on EFI platforms using GRUB2 and
other boot loaders which support multiboot2 protocol.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agoefi: create new early memory allocator
Daniel Kiper [Wed, 22 Feb 2017 13:38:06 +0000 (14:38 +0100)]
efi: create new early memory allocator

There is a problem with place_string(), which is used as the early memory
allocator. It gets memory chunks starting from the start symbol and goes
down. Sadly this does not work when Xen is loaded using the multiboot2
protocol, because then start lives at the 1 MiB address and we should
not allocate memory below it. So, I tried to use the mem_lower
address calculated by GRUB2. However, this solution works only on some
machines. There are machines in the wild (e.g. Dell PowerEdge R820)
which use the first ~640 KiB for boot services code or data... :-(((
Hence, we need a new memory allocator for Xen EFI boot code which is
quite simple and generic and could be used by place_string() and
efi_arch_allocate_mmap_buffer(). I considered the following solutions:

1) We could use native EFI allocation functions (e.g. AllocatePool()
   or AllocatePages()) to get a memory chunk. However, later (somewhere
   in __start_xen()) we must copy its contents to a safe place or reserve
   it in the e820 memory map and map it in the Xen virtual address space.
   This means that the code referring to the Xen command line, loaded modules
   and the EFI memory map, mostly in __start_xen(), will be further complicated
   and diverge from legacy BIOS cases. Additionally, both former things
   have to be placed below 4 GiB because their addresses are stored in
   the multiboot_info_t structure, which has 32-bit relevant members.

2) We may allocate a memory area statically somewhere in Xen code which
   could be used as a memory pool for early dynamic allocations. This looks
   quite simple. Additionally, it would not depend on EFI at all and
   could be used on legacy BIOS platforms if we need it. However, we
   must carefully choose the size of this pool. We do not want to increase the
   Xen binary size too much and waste too much memory, but we must also fit
   at least the memory map on x86 EFI platforms. As I saw on a small machine,
   e.g. IBM System x3550 M2 with 8 GiB RAM, the memory map may contain more
   than 200 entries. Every entry on the x86-64 platform is 40 bytes in size.
   So, it means that we need more than 8 KiB for the EFI memory map only.
   Additionally, if we use this memory pool for Xen and modules command
   line storage (it would be used when xen.efi is executed as an EFI
   application) then we should add, I think, about 1 KiB. In this case, to be
   on the safe side, we should assume at least a 64 KiB pool for early memory
   allocations, which is about 4 times our earlier calculations. However,
   during discussion on Xen-devel Jan Beulich suggested that just in case we
   should use a 1 MiB memory pool like in the original place_string()
   implementation. So, let's use 1 MiB as it was proposed. If we think that we
   should not waste unallocated memory in the pool on a running system then we
   can mark this region as __initdata and move all required data to dynamically
   allocated places somewhere in __start_xen().

2a) We could put the memory pool into the .bss.page_aligned section and then
    allocate memory chunks starting from the lowest address. After the init
    phase we can free the unused portion of the memory pool, as is done for the
    .init.text or .init.data sections. This way we do not need to allocate any
    space in the image file, and freeing the unused area of the memory pool is
    very simple.

Solution #2a is implemented here because it is quite simple and requires
a limited number of changes, especially in __start_xen().

The new allocator is quite generic and can be used on ARM platforms too,
though it is not enabled on ARM yet because some prerequisites are missing.
They are listed before the ebmalloc code.
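
The idea behind solution #2a can be sketched as a bump allocator over a
static pool. This is an illustration only, not the ebmalloc code: the names
early_alloc() and early_pool_unused(), the 8-byte allocation alignment and
the use of C11 _Alignas in place of a linker section are all assumptions.

```c
#include <stddef.h>

#define POOL_SIZE   (1u << 20)  /* 1 MiB, the size chosen in the text above */
#define ALLOC_ALIGN 8u          /* assumed alignment of each allocation */

/* Zero-initialised static storage, so it takes no space in the image file. */
static _Alignas(4096) unsigned char pool[POOL_SIZE];
static size_t pool_used;        /* bytes handed out so far */

/* Hand out chunks starting from the lowest address of the pool. */
static void *early_alloc(size_t size)
{
    size_t aligned = (size + ALLOC_ALIGN - 1) & ~(size_t)(ALLOC_ALIGN - 1);
    void *p;

    /* Reject rounding overflow and requests that do not fit any more. */
    if (aligned < size || aligned > POOL_SIZE - pool_used)
        return NULL;

    p = &pool[pool_used];
    pool_used += aligned;
    return p;
}

/* Size of the tail that could be released once the init phase is over. */
static size_t early_pool_unused(void)
{
    return POOL_SIZE - pool_used;
}
```

Since allocations only ever grow upwards from the start of the pool, the
unused part is always a single contiguous tail, which is what makes freeing
it after init simple.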

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Tested-by: Doug Goldstein <cardoe@cardoe.com>
8 years agoefi: build xen.gz with EFI code
Daniel Kiper [Wed, 22 Feb 2017 13:36:56 +0000 (14:36 +0100)]
efi: build xen.gz with EFI code

Build xen.gz with EFI code. We need this to support multiboot2
protocol on EFI platforms.

If we wish to load a non-ELF file using multiboot (v1) or multiboot2 then
it must contain a "linear" (or "flat") representation of code and data.
This is a requirement of both boot protocols. Currently, the PE file contains
many sections which are not "linear" (one after another without any holes)
or do not even have a representation in the file (e.g. BSS). From the EFI
point of view everything is OK and works. However, this file layout cannot be
properly interpreted by the multiboot protocol family. In theory there is
a chance that we could build a proper PE file (from the multiboot protocols'
POV) using the current build system. However, it means that xen.efi would
further diverge from the Xen ELF file (in terms of contents and build method).
On the other hand, ELF has all the needed properties, so it is a good starting
point for further development. Additionally, I think that this is also a good
starting point for further xen.efi code and build optimizations. It looks
like there is a chance that we can finally generate xen.efi directly from
the Xen ELF using just a simple objcopy or other tool. This way we will have
one Xen binary which can be loaded by three boot protocols: EFI native loader,
multiboot (v1) and multiboot2.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
8 years agox86: add multiboot2 protocol support
Daniel Kiper [Wed, 22 Feb 2017 13:35:05 +0000 (14:35 +0100)]
x86: add multiboot2 protocol support

Add multiboot2 protocol support. Alter min memory limit handling as we
now may not find it from either multiboot (v1) or multiboot2.

This way we are laying the foundation for EFI + GRUB2 + Xen development.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Doug Goldstein <cardoe@cardoe.com>
8 years agoMAINTAINERS: update VT-d maintainers
Kevin Tian [Wed, 22 Feb 2017 11:37:22 +0000 (12:37 +0100)]
MAINTAINERS: update VT-d maintainers

Feng just left Intel.  So remove him from the list.

Signed-off-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/VMX: sanitize VM86 TSS handling
Jan Beulich [Wed, 22 Feb 2017 11:36:36 +0000 (12:36 +0100)]
x86/VMX: sanitize VM86 TSS handling

The present way of setting this up is flawed: Leaving the I/O bitmap
pointer at zero means that the interrupt redirection bitmap lives
outside (ahead of) the allocated space of the TSS. Similarly setting a
TSS limit of 255 when only 128 bytes get allocated means that 128 extra
bytes may be accessed by the CPU during I/O port access processing.

Introduce a new HVM param to set the allocated size of the TSS, and
have the hypervisor actually take care of setting, in particular, the I/O
bitmap pointer. Both this and the segment limit now take the allocated size
into account.
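
A simplified sketch of deriving both values from the allocated size. It
assumes the architectural layout of a 104-byte 32-bit TSS with the 32-byte
interrupt redirection bitmap placed directly in front of the I/O bitmap. The
function name and the policy of rejecting small areas are hypothetical; the
actual patch may treat small areas differently.

```c
#include <stdint.h>

#define TSS32_SIZE      104u  /* size of the 32-bit TSS base structure */
#define INTR_REDIR_SIZE 32u   /* 256-bit interrupt redirection bitmap */

struct vm86_tss_layout {
    uint16_t iomap_base;      /* value for the TSS I/O map base field */
    uint32_t limit;           /* segment limit: offset of the last valid byte */
};

/* Returns 0 on success, -1 if the area cannot hold even one bitmap byte. */
static int sketch_vm86_tss_layout(uint32_t allocated,
                                  struct vm86_tss_layout *out)
{
    if (allocated < TSS32_SIZE + INTR_REDIR_SIZE + 1)
        return -1;

    /*
     * Placing the I/O bitmap after the redirection bitmap keeps the
     * redirection bitmap (iomap_base - 32 .. iomap_base - 1) inside the area.
     */
    out->iomap_base = TSS32_SIZE + INTR_REDIR_SIZE;
    out->limit = allocated - 1;   /* never covers bytes past the allocation */
    return 0;
}
```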

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoMAINTAINERS: drop Jinsong Liu
Jan Beulich [Wed, 22 Feb 2017 11:35:58 +0000 (12:35 +0100)]
MAINTAINERS: drop Jinsong Liu

Mails to his listed address are bouncing.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/emul: Support CPUID faulting via a speculative MSR read
Andrew Cooper [Wed, 2 Nov 2016 15:50:23 +0000 (15:50 +0000)]
x86/emul: Support CPUID faulting via a speculative MSR read

This removes the need for every cpuid() emulation hook to individually support
CPUID faulting.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/emul: Introduce common msr_val for emulation
Andrew Cooper [Wed, 2 Nov 2016 15:50:23 +0000 (15:50 +0000)]
x86/emul: Introduce common msr_val for emulation

Use it consistently in place of local tsc_aux, msr_content and val
declarations, and replace opencoded uses of X86EMUL_OKAY.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/hvm: Don't raise #GP behind the emulators back for MSR accesses
Andrew Cooper [Wed, 2 Nov 2016 14:36:49 +0000 (14:36 +0000)]
x86/hvm: Don't raise #GP behind the emulators back for MSR accesses

The current hvm_msr_{read,write}_intercept() infrastructure calls
hvm_inject_hw_exception() directly to latch a fault, and returns
X86EMUL_EXCEPTION to its caller.

This behaviour is problematic for the hvmemul_{read,write}_msr() paths, as the
fault is raised behind the back of the x86 emulator.

Alter the behaviour so hvm_msr_{read,write}_intercept() simply returns
X86EMUL_EXCEPTION, leaving the callers to actually inject the #GP fault.
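
The division of labour can be sketched as below. All names, the return
codes and the single readable MSR are hypothetical stand-ins, not the Xen
functions: the intercept only reports the failure, and the caller decides
whether to inject the fault.

```c
#include <stdint.h>

enum { SKETCH_OKAY, SKETCH_EXCEPTION };   /* stand-ins for X86EMUL_* codes */

/* Hypothetical intercept: only reports the failure, injects nothing. */
static int sketch_msr_read_intercept(uint32_t msr, uint64_t *val)
{
    if (msr != 0x10)            /* pretend only MSR 0x10 is readable */
        return SKETCH_EXCEPTION;

    *val = 42;
    return SKETCH_OKAY;
}

static unsigned int gp_injected;   /* counts faults raised by the caller */

/* A caller outside the emulator: it is the one that injects the fault. */
static int sketch_guest_rdmsr(uint32_t msr, uint64_t *val)
{
    int rc = sketch_msr_read_intercept(msr, val);

    if (rc == SKETCH_EXCEPTION)
        gp_injected++;          /* stands in for the real injection call */

    return rc;
}
```

An emulator path would call the intercept the same way but hand the
exception result back to the emulator instead of injecting the fault itself.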

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agox86/vmx: Drop vmx_msr_state infrastructure
Andrew Cooper [Sun, 18 Dec 2016 14:56:28 +0000 (14:56 +0000)]
x86/vmx: Drop vmx_msr_state infrastructure

To avoid leaking host MSR state into guests, guest LSTAR, STAR and
SYSCALL_MASK state is unconditionally loaded when switching into guest
context.

Attempting to dirty-track the state is pointless; host state is always
restored upon exit from guest context, meaning that guest state is always
considered dirty.

Drop struct vmx_msr_state, enum VMX_INDEX_MSR_* and msr_index[].  The guest's
MSR values are stored plainly in arch_vmx_struct, in the same way as shadow_gs
and cstar are.  vmx_restore_guest_msrs() and long_mode_do_msr_write() ensure
that the hardware MSR values are always up-to-date.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/vmx: Remove vmx_save_host_msrs() and host_msr_state
Andrew Cooper [Sun, 18 Dec 2016 14:56:28 +0000 (14:56 +0000)]
x86/vmx: Remove vmx_save_host_msrs() and host_msr_state

A pcpu's LSTAR, STAR and SYSCALL_MASK MSRs are unconditionally switched when
moving in and out of HVM vcpu context.  Two of these values are compile time
constants, and the third is directly available in an existing per-cpu
variable.

There is no need to save host state in vmx_cpu_up() into a different per-cpu
structure, so drop all the infrastructure.  vmx_restore_host_msrs() is
simplified to 3 plain WRMSR instructions.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/setup: Intoduce XEN_MSR_STAR
Andrew Cooper [Sun, 18 Dec 2016 14:56:28 +0000 (14:56 +0000)]
x86/setup: Intoduce XEN_MSR_STAR

Xen's choice of the MSR_STAR value is constant across all pcpus.  Introduce a
new define and use it to avoid the opencoding in subarch_percpu_traps_init()
and restore_rest_processor_state().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>