xenbits.xensource.com Git - people/dwmw2/xen.git/log
6 years ago xenforeignmemory: work around bug in older privcmd
Paul Durrant [Fri, 24 Aug 2018 12:16:26 +0000 (13:16 +0100)]
xenforeignmemory: work around bug in older privcmd

Versions of linux privcmd prior to commit dc9eab6fd94d ("return -ENOTTY
for unimplemented IOCTLs") will return -EINVAL rather than the conventional
-ENOTTY for unimplemented codes. This breaks the error path in
libxenforeignmemory resource mapping, which only translates ENOTTY into
EOPNOTSUPP to inform callers of the need to use an alternative (legacy)
mechanism.

This patch adds a new 'unimplemented' [1] ioctl code into the local
privcmd header which is then used to probe for the appropriate errno to
translate in the resource mapping error path.

[1] this is a code that has, so far, never been used in any version of
    privcmd and will be added to future versions of the header in the
    linux source, to make sure it stays unimplemented.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
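The probing idea described in this commit can be sketched roughly as below. This is an illustration only, not the libxenforeignmemory code: `probe_unimplemented_errno`, `translate_map_resource_errno` and the two stand-in "driver" functions are hypothetical names, and the real ioctl call is replaced by a function pointer so the sketch runs without a privcmd device.

```c
#include <errno.h>

/* Hypothetical stand-in for issuing the reserved, never-implemented ioctl:
 * returns the errno value the driver reports for an unknown code. */
typedef int (*unimpl_ioctl_fn)(void);

/* Older privcmd reports EINVAL for unknown ioctls, fixed versions ENOTTY. */
int old_privcmd_unimpl(void) { return EINVAL; }
int new_privcmd_unimpl(void) { return ENOTTY; }

/* Probe once to learn which errno means "ioctl not implemented". */
int probe_unimplemented_errno(unimpl_ioctl_fn do_ioctl)
{
    return do_ioctl();
}

/* Translate the driver's "not implemented" errno into EOPNOTSUPP so that
 * callers know to fall back to the legacy mechanism. Any other errno is
 * passed through unchanged. */
int translate_map_resource_errno(int err, int unimpl_errno)
{
    return (err == unimpl_errno) ? EOPNOTSUPP : err;
}
```

One consequence of this approach, visible in the sketch, is that on an older driver a genuine EINVAL cannot be told apart from "not implemented" and is also reported as EOPNOTSUPP.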
6 years ago tools: building IPXE should be determined by CONFIG_IPXE
Wei Liu [Fri, 24 Aug 2018 10:54:04 +0000 (11:54 +0100)]
tools: building IPXE should be determined by CONFIG_IPXE

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago QEMU_TAG update
Ian Jackson [Thu, 23 Aug 2018 14:10:11 +0000 (15:10 +0100)]
QEMU_TAG update

6 years ago xen/arm: p2m: Introduce a new variable removing_mapping in __p2m_set_entry
Julien Grall [Mon, 16 Jul 2018 17:27:10 +0000 (18:27 +0100)]
xen/arm: p2m: Introduce a new variable removing_mapping in __p2m_set_entry

This is making the code slightly easier to understand.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: p2m: Rename ret to mfn in p2m_lookup
Julien Grall [Mon, 16 Jul 2018 17:27:09 +0000 (18:27 +0100)]
xen/arm: p2m: Rename ret to mfn in p2m_lookup

Cosmetic change to make clearer what is returned ('ret' is a bit
too generic).

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: guest_walk: Use lpae_is_mapping to simplify the code
Julien Grall [Mon, 16 Jul 2018 17:27:06 +0000 (18:27 +0100)]
xen/arm: guest_walk: Use lpae_is_mapping to simplify the code

!lpae_is_page(pte, level) && !lpae_is_superpage(pte, level) is
equivalent to !lpae_is_mapping(pte, level).

At the same time drop lpae_is_page(pte, level) that is now unused.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
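The equivalence claimed in this commit can be checked exhaustively on a toy model. The sketch below is an assumption-laden simplification: `struct pte` and the three helpers are hypothetical stand-ins that only model a valid bit and a table bit, not the real LPAE entry layout or the Xen helpers themselves.

```c
#include <stdbool.h>

/* Simplified, hypothetical page-table entry: only the two bits that matter. */
struct pte {
    bool valid;  /* entry is valid */
    bool table;  /* levels 0-2: points to a next-level table; level 3: page */
};

/* A mapping is a valid entry that maps memory instead of pointing to a table. */
bool is_mapping(struct pte p, unsigned int level)
{
    if (!p.valid)
        return false;
    return (level == 3) ? p.table : !p.table;
}

bool is_superpage(struct pte p, unsigned int level)
{
    return level < 3 && is_mapping(p, level);
}

bool is_page(struct pte p, unsigned int level)
{
    return level == 3 && is_mapping(p, level);
}

/* Returns true if both forms agree for every level and bit combination. */
bool forms_agree(void)
{
    for (unsigned int level = 0; level <= 3; level++)
        for (int v = 0; v <= 1; v++)
            for (int t = 0; t <= 1; t++) {
                struct pte p = { .valid = v, .table = t };
                bool long_form = !is_page(p, level) && !is_superpage(p, level);

                if (long_form != !is_mapping(p, level))
                    return false;
            }
    return true;
}
```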
6 years ago xen/arm: Rename lpae_valid to lpae_is_valid
Julien Grall [Mon, 16 Jul 2018 17:27:05 +0000 (18:27 +0100)]
xen/arm: Rename lpae_valid to lpae_is_valid

This will help to keep the naming consistent across all lpae helpers.

No functional change intended.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: Rework lpae_table
Julien Grall [Mon, 16 Jul 2018 17:27:04 +0000 (18:27 +0100)]
xen/arm: Rework lpae_table

Currently, lpae_table can only work on entry from any level other than
3. Make it work with any level by extending the prototype to pass the
level.

At the same time, rename the function to lpae_is_table so naming stays
consistent across all lpae_* helpers.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: Rework lpae_mapping
Julien Grall [Mon, 16 Jul 2018 17:27:03 +0000 (18:27 +0100)]
xen/arm: Rework lpae_mapping

Currently, lpae_mapping can only work on entry from any level other than
3. Make it work with any level by extending the prototype to pass the
level.

At the same time, rename the function to lpae_is_mapping so naming stays
consistent across lpae_* helpers.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: p2m: Limit call to mem access code use in get_page_from_gva
Julien Grall [Mon, 16 Jul 2018 17:27:02 +0000 (18:27 +0100)]
xen/arm: p2m: Limit call to mem access code use in get_page_from_gva

Mem access only has an impact on the hardware translation between a
guest virtual address and the machine physical address. So it is not
necessary to fall back to memaccess in all the other cases (e.g. when it
is not possible to acquire the page behind the MFN).

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: p2m: Reduce the locking section in get_page_from_gva
Julien Grall [Mon, 16 Jul 2018 17:27:01 +0000 (18:27 +0100)]
xen/arm: p2m: Reduce the locking section in get_page_from_gva

The p2m lock is only necessary to prevent gvirt_to_maddr failing when a
break-before-make sequence is used in the P2M update concurrently on
another pCPU. So reduce the locking section.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: cpregs: Fix typo in the documentation of TTBCR
Julien Grall [Mon, 16 Jul 2018 17:26:59 +0000 (18:26 +0100)]
xen/arm: cpregs: Fix typo in the documentation of TTBCR

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago xen/arm: cpregs: Allow HSR_CPREG* to receive more than 1 parameter
Julien Grall [Mon, 16 Jul 2018 17:26:58 +0000 (18:26 +0100)]
xen/arm: cpregs: Allow HSR_CPREG* to receive more than 1 parameter

At the moment, HSR_CPREG is expected to receive only the co-processor
register name as its parameter. Because the name is actually a define, it
may already have been expanded by a previous macro.

Rather than imposing the use of _HSR_CPREG* in such cases, allow
HSR_CPREG to receive more than 1 parameter.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
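The preprocessor behaviour behind this change can be shown with a small stand-alone sketch. All macro names here (`CPREG_ENCODE`, `CPREG`, `MY_REG`, `VIA_WRAPPER`) and the two-field encoding are hypothetical and chosen only for illustration; they are not the Xen definitions.

```c
/* Hypothetical encoding of a co-processor register from two fields. */
#define CPREG_ENCODE(crn, crm)  (((crn) << 4) | (crm))

/* Variadic front end: whatever arrives, already expanded or not, is
 * forwarded as a comma-separated argument list. */
#define CPREG(...)              CPREG_ENCODE(__VA_ARGS__)

/* Hypothetical register name that is itself a define with two fields. */
#define MY_REG                  2, 5

/* A wrapper macro expands MY_REG to "2, 5" before CPREG is invoked, so
 * CPREG sees two arguments. A single-parameter CPREG(name) would fail to
 * compile here with "macro passed 2 arguments, but takes just 1". */
#define VIA_WRAPPER(reg)        CPREG(reg)

unsigned int my_reg_encoding(void)
{
    return VIA_WRAPPER(MY_REG);  /* ends up as CPREG_ENCODE(2, 5) */
}
```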
6 years ago xen/arm: rename acpi_make_chosen_node to make_chosen_node
Stefano Stabellini [Tue, 31 Jul 2018 23:27:50 +0000 (16:27 -0700)]
xen/arm: rename acpi_make_chosen_node to make_chosen_node

acpi_make_chosen_node is actually generic and can be reused. Rename it
to make_chosen_node and make it available to non-ACPI builds.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago xen/arm: move evtchn_allocate call out of make_hypervisor_node
Stefano Stabellini [Tue, 31 Jul 2018 23:27:49 +0000 (16:27 -0700)]
xen/arm: move evtchn_allocate call out of make_hypervisor_node

In the case of domUs, evtchn_irq is allocated by arch_domain_create and
set to GUEST_EVTCHN_PPI.

To make make_hypervisor_node more reusable, move the call to
evtchn_allocate out of make_hypervisor_node, to the dom0 specific caller
(handle_node).

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years ago xen/arm: move a few DT related defines to public/device_tree_defs.h
Stefano Stabellini [Tue, 31 Jul 2018 23:27:45 +0000 (16:27 -0700)]
xen/arm: move a few DT related defines to public/device_tree_defs.h

Move a few constants defined by libxl_arm.c to
xen/include/public/device_tree_defs.h, so that they can be used from Xen
and libxl. Prepend GUEST_ to avoid conflicts.

Move the DT_IRQ_TYPE* definitions from libxl_arm.c to
public/device_tree_defs.h. Use them in Xen where appropriate.

Re-define the existing Xen internal IRQ_TYPEs as DT_IRQ_TYPEs: they
already happen to be the same, let's make it clear.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
CC: ian.jackson@eu.citrix.com
6 years ago xen/arm: do not pass dt_host to make_memory_node and make_hypervisor_node
Stefano Stabellini [Tue, 31 Jul 2018 23:27:48 +0000 (16:27 -0700)]
xen/arm: do not pass dt_host to make_memory_node and make_hypervisor_node

In order to make make_memory_node and make_hypervisor_node more
reusable, do not pass them dt_host. As they only use it to calculate
addrcells and sizecells, pass addrcells and sizecells directly.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago xen/arm: drivers: scif: Remove unused #define-s
Oleksandr Tyshchenko [Mon, 6 Aug 2018 18:35:49 +0000 (21:35 +0300)]
xen/arm: drivers: scif: Remove unused #define-s

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
6 years ago x86/oprofile: put SVM only code under CONFIG_HVM
Wei Liu [Fri, 17 Aug 2018 15:12:41 +0000 (16:12 +0100)]
x86/oprofile: put SVM only code under CONFIG_HVM

The code snippet in question is to detect NMI held by SVM until STGI
is called. When Xen doesn't even support HVM guests there is no need
to check svm_stgi_label.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/mtrr: move is_var_mtrr_overlapped
Wei Liu [Fri, 17 Aug 2018 15:12:38 +0000 (16:12 +0100)]
x86/mtrr: move is_var_mtrr_overlapped

Move it to x86 generic code. While at it, use proper boolean type and
fix some cosmetic issues.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/amd: skip OSVW function calls if !CONFIG_HVM
Wei Liu [Fri, 17 Aug 2018 15:12:36 +0000 (16:12 +0100)]
x86/amd: skip OSVW function calls if !CONFIG_HVM

The two functions are not needed when HVM is not supported in the
hypervisor.

Note that using hvm_enabled won't work because early_microcode_init
gets to cpu_request_microcode before hvm_enabled is set in presmp init
call stage.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/vmce: enclose HVM load / save code in CONFIG_HVM
Wei Liu [Fri, 17 Aug 2018 15:12:35 +0000 (16:12 +0100)]
x86/vmce: enclose HVM load / save code in CONFIG_HVM

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/pt: add HVM check to XEN_DOMCTL_unbind_pt_irq
Wei Liu [Fri, 17 Aug 2018 15:12:32 +0000 (16:12 +0100)]
x86/pt: add HVM check to XEN_DOMCTL_unbind_pt_irq

Its counterpart is HVM only. Add the check to help dead code
elimination to figure out the call to pt_irq_destroy_bind is not
needed when HVM is not enabled.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/mm: don't reference hvm_funcs directly
Wei Liu [Fri, 17 Aug 2018 15:12:22 +0000 (16:12 +0100)]
x86/mm: don't reference hvm_funcs directly

It is generally not a good idea to reference the internal data
structure of another subsystem directly. Introduce a wrapper
function for the invlpg hook.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
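The wrapper pattern this commit refers to might look roughly like the sketch below. Everything in it is hypothetical (`struct vm_ops`, `vm_invlpg`, `record_invlpg`); in particular, the NULL check is an assumption of this sketch and is not taken from the Xen change, which states that it makes no functional change.

```c
#include <stddef.h>

/* Hypothetical hook table owned by one subsystem. */
struct vm_ops {
    void (*invlpg)(unsigned long linear);
};

/* Internal detail of the owning subsystem: other code should not touch it. */
struct vm_ops vm_ops;

/* Wrapper that callers use instead of reaching into vm_ops directly. */
void vm_invlpg(unsigned long linear)
{
    if (vm_ops.invlpg != NULL)
        vm_ops.invlpg(linear);
}

/* Example hook that records the last address it was asked to flush. */
unsigned long last_flushed;

void record_invlpg(unsigned long linear)
{
    last_flushed = linear;
}
```

Callers then depend only on `vm_invlpg()`, so the layout of the hook table can change without touching them.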
6 years ago x86/vvmx: make get_shadow_eptp static function
Wei Liu [Fri, 17 Aug 2018 15:12:20 +0000 (16:12 +0100)]
x86/vvmx: make get_shadow_eptp static function

Its callers live within the same file.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: HVM_FEP should depend on HVM
Wei Liu [Fri, 17 Aug 2018 15:12:21 +0000 (16:12 +0100)]
x86: HVM_FEP should depend on HVM

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago xen: fix building !CONFIG_LOCK_PROFILE
Wei Liu [Fri, 17 Aug 2018 15:12:19 +0000 (16:12 +0100)]
xen: fix building !CONFIG_LOCK_PROFILE

The init function shouldn't be built or called at all when
!CONFIG_LOCK_PROFILE.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago libxl_qmp: Disable beautify for QMP generated cmd
Anthony PERARD [Thu, 31 May 2018 13:50:28 +0000 (14:50 +0100)]
libxl_qmp: Disable beautify for QMP generated cmd

There is no need for it.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl_qmp: Simplify qmp_response_type() prototype
Anthony PERARD [Fri, 25 May 2018 15:18:45 +0000 (16:18 +0100)]
libxl_qmp: Simplify qmp_response_type() prototype

Remove the libxl__qmp_handler* argument so the function can be reused
later in a different context.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl_json: libxl__json_object_to_json
Anthony PERARD [Fri, 25 May 2018 14:07:14 +0000 (15:07 +0100)]
libxl_json: libxl__json_object_to_json

Allow generating a JSON string from a libxl__json_object, which is
useful for debugging.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl_json: Enable yajl_allow_trailing_garbage
Anthony PERARD [Thu, 31 May 2018 10:50:03 +0000 (11:50 +0100)]
libxl_json: Enable yajl_allow_trailing_garbage

This allows parsing a string that is not NUL-terminated. With that
option disabled, YAJL v2 would look ahead on completion to find out if
there is more to parse.

YAJL v1 doesn't have this behavior.

Any function that allocates a yajl_handle via this function either parses
a NUL-terminated string or provides a proper length. So change the
default and allow garbage (like a different JSON document) after the end
of the data to parse.

This is important for the QMP client, as there could be more than one
message to parse, and YAJL would consider the next message to be garbage
and throw an error.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl_dm: Add libxl__qemu_qmp_path()
Anthony PERARD [Mon, 23 Jul 2018 11:20:24 +0000 (12:20 +0100)]
libxl_dm: Add libxl__qemu_qmp_path()

... which generates the path to a QMP socket that libxl uses.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl_json: constify libxl__json_object_to_yajl_gen arguments
Anthony PERARD [Fri, 25 May 2018 14:02:34 +0000 (15:02 +0100)]
libxl_json: constify libxl__json_object_to_yajl_gen arguments

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago libxl_qmp: Remove unused yajl_ctx from handler
Anthony PERARD [Fri, 25 May 2018 15:49:24 +0000 (16:49 +0100)]
libxl_qmp: Remove unused yajl_ctx from handler

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl: Add libxl__prepare_sockaddr_un() helper
Anthony PERARD [Tue, 24 Jul 2018 11:31:58 +0000 (12:31 +0100)]
libxl: Add libxl__prepare_sockaddr_un() helper

There are going to be a few more users that want to use a UNIX socket.
This helper prepares the `struct sockaddr_un` and checks that the path
isn't too long.

Also start to use it in libxl_qmp.c.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
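A helper of this kind could be sketched as follows. This is not the libxl implementation: `prepare_sockaddr_un` is a hypothetical name, and the choice to reject paths that do not fit together with their terminating NUL, and to report ENAMETOOLONG, is an assumption of the sketch.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper: fill in *un for `path`, or fail if the path
 * (including its terminating NUL) does not fit in sun_path. */
int prepare_sockaddr_un(struct sockaddr_un *un, const char *path)
{
    size_t len = strlen(path);

    if (len >= sizeof(un->sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    memset(un, 0, sizeof(*un));
    un->sun_family = AF_UNIX;
    memcpy(un->sun_path, path, len + 1);  /* copy the NUL as well */
    return 0;
}
```

Checking the length up front avoids the silent truncation that a bare strncpy() into sun_path would produce.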
6 years ago libxl_qmp: Move struct sockaddr_un variable to qmp_open()
Anthony PERARD [Fri, 25 May 2018 15:17:01 +0000 (16:17 +0100)]
libxl_qmp: Move struct sockaddr_un variable to qmp_open()

This variable is only used once, no need to keep it in the handler.

Also fix coding style (remove space after sizeof).
And allow strncpy to use all the space in sun_path.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools: fix uninstall: tests/x86_emulator, Linux hotplug
Christopher Clark [Mon, 20 Aug 2018 18:42:30 +0000 (11:42 -0700)]
tools: fix uninstall: tests/x86_emulator, Linux hotplug

Fixing top-level "make uninstall":

tools/tests/x86_emulator is missing an uninstall target, which causes
failure. Trivial to add one since it installs nothing, so do that.

Linux hotplug uninstall returns success but doesn't actually remove what
it installed. The Makefile variables are obfuscating incorrect logic, so
strip them out and match existing code for xen-watchdog which does work.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago libgnttab: Add support for Linux dma-buf
Oleksandr Andrushchenko [Tue, 21 Aug 2018 06:44:01 +0000 (09:44 +0300)]
libgnttab: Add support for Linux dma-buf

Add support for the Linux grant device driver extension which allows
converting existing dma-bufs into an array of grant references
and vice versa. This is only implemented for Linux as other OSes
have no Linux dma-buf support.

Bump gnttab library minor version to 3.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago MAINTAINERS: add myself as a reviewer for x86 patches
Wei Liu [Mon, 20 Aug 2018 15:25:44 +0000 (16:25 +0100)]
MAINTAINERS: add myself as a reviewer for x86 patches

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago automation: build with debian unstable
Wei Liu [Mon, 20 Aug 2018 13:05:11 +0000 (14:05 +0100)]
automation: build with debian unstable

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years ago tools/tests: fix an xs-test.c issue
Wei Liu [Mon, 20 Aug 2018 08:38:18 +0000 (09:38 +0100)]
tools/tests: fix an xs-test.c issue

The ret variable can be used uninitialised when iters is 0. Initialise
ret at the beginning to fix this issue.

Reported-by: Steven Haigh <netwiz@crc.id.au>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
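The class of bug fixed here can be illustrated with a small sketch. The function and its parameters are hypothetical and do not reproduce xs-test.c; the point is only that `ret` must have a defined value when the loop body never executes.

```c
/* Hypothetical test driver: run `step` up to `iters` times and stop at
 * the first failure. */
int run_iterations(unsigned int iters, int (*step)(unsigned int))
{
    int ret = 0;  /* initialised: stays 0 if the loop body never runs */

    for (unsigned int i = 0; i < iters; i++) {
        ret = step(i);
        if (ret)
            break;
    }

    return ret;
}

/* Example step that always succeeds. */
int step_ok(unsigned int i)
{
    (void)i;
    return 0;
}

/* Example step that fails on the third iteration. */
int step_fail_at_2(unsigned int i)
{
    return (i == 2) ? -1 : 0;
}
```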
6 years ago tools/kdd: work around gcc 8.1 bug
Wei Liu [Mon, 6 Aug 2018 10:35:18 +0000 (11:35 +0100)]
tools/kdd: work around gcc 8.1 bug

Gcc 8.1 has a bug that causes kdd to fail to build. Rewrite the code to
work around that bug.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86827

Signed-off-by: Tim Deegan <tim@xen.org>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago xenpmd: make 32 bit gcc 8.1 non-debug build work
Wei Liu [Thu, 26 Jul 2018 14:58:54 +0000 (15:58 +0100)]
xenpmd: make 32 bit gcc 8.1 non-debug build work

32 bit gcc 8.1 non-debug build yields:

xenpmd.c:354:23: error: '%02x' directive output may be truncated writing between 2 and 8 bytes into a region of size 3 [-Werror=format-truncation=]
     snprintf(val, 3, "%02x",
                       ^~~~
xenpmd.c:354:22: note: directive argument in the range [40, 2147483778]
     snprintf(val, 3, "%02x",
                      ^~~~~~
xenpmd.c:354:5: note: 'snprintf' output between 3 and 9 bytes into a destination of size 3
     snprintf(val, 3, "%02x",
     ^~~~~~~~~~~~~~~~~~~~~~~~
              (unsigned int)(9*4 +
              ~~~~~~~~~~~~~~~~~~~~
                             strlen(info->model_number) +
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             strlen(info->serial_number) +
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             strlen(info->battery_type) +
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                             strlen(info->oem_info) + 4));
                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All info->* fields used in the calculation are 32 bytes long, and the
parsing code makes sure they are null-terminated, so the result of the
expression won't exceed 255, which fits into 3 bytes in hexadecimal
format (two digits plus the terminating NUL).

Add an assertion to make gcc happy.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
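The bound argued in this commit message can be made explicit in a sketch like the one below. `struct battery_info` and `format_length` are hypothetical cut-down stand-ins for the xenpmd code; whether an assertion of this form actually silences the warning depends on the compiler version and build flags, so treat that part as an assumption.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cut-down battery info: each field is a NUL-terminated
 * string in a 32-byte buffer, so strlen() is at most 31. */
struct battery_info {
    char model_number[32];
    char serial_number[32];
    char battery_type[32];
    char oem_info[32];
};

/* Write the record length as two hex digits into val (3 bytes incl. NUL). */
void format_length(char val[3], const struct battery_info *info)
{
    unsigned int len = (unsigned int)(9 * 4 +
                                      strlen(info->model_number) +
                                      strlen(info->serial_number) +
                                      strlen(info->battery_type) +
                                      strlen(info->oem_info) + 4);

    /* At most 36 + 4 * 31 + 4 = 164, so two hex digits always suffice. */
    assert(len < 256);
    snprintf(val, 3, "%02x", len);
}
```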
6 years ago tools: update ipxe changeset
Wei Liu [Thu, 26 Jul 2018 14:58:53 +0000 (15:58 +0100)]
tools: update ipxe changeset

This placates gcc 8.1. The commit comes from ipxe master branch as of
July 25, 2018.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago libxl/arm: Fix build on arm64 + acpi w/ gcc 8.2
Christopher Clark [Thu, 16 Aug 2018 20:22:41 +0000 (13:22 -0700)]
libxl/arm: Fix build on arm64 + acpi w/ gcc 8.2

Add zero-padding to #defined ACPI table strings that are copied.
Provides sufficient characters to satisfy the length required to
fully populate the destination and prevent array-bounds warnings.
Add BUILD_BUG_ON sizeof checks for compile-time length checking.

Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
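The padding technique described above can be sketched as follows. The structure, the `OEM_ID` value and `set_oem_id` are hypothetical; C11 `_Static_assert` is used here as a stand-in for BUILD_BUG_ON, which is an assumption of this sketch.

```c
#include <string.h>

/* Hypothetical 6-byte identifier field, as found in ACPI table headers. */
struct table_header {
    char oem_id[6];
};

/* "Xen" padded with explicit NULs: 3 characters + 2 written NULs + the
 * implicit terminator = 6 bytes, exactly the size of the field. */
#define OEM_ID "Xen\0\0"

/* Compile-time length check, in the spirit of BUILD_BUG_ON. */
_Static_assert(sizeof(OEM_ID) == sizeof(((struct table_header *)0)->oem_id),
               "OEM_ID must fill oem_id exactly");

void set_oem_id(struct table_header *h)
{
    /* Copies exactly sizeof(OEM_ID) bytes, so it never reads past the
     * end of the string literal. */
    memcpy(h->oem_id, OEM_ID, sizeof(h->oem_id));
}
```

Without the padding, copying 6 bytes from the 4-byte literal "Xen" would read past its end, which is the kind of access an array-bounds warning flags.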
6 years ago x86/mmcfg: rename pt_pci_init() and call it from acpi_mmcfg_init()
Zhenzhong Duan [Fri, 17 Aug 2018 13:04:27 +0000 (15:04 +0200)]
x86/mmcfg: rename pt_pci_init() and call it from acpi_mmcfg_init()

Given what pt_pci_init() actually does, rename it properly and move its
declaration to pci.h. Move the only call into acpi_mmcfg_init().

No functional change.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Tested-by: Gopalasetty, Manoj <manoj.gopalasetty@hpe.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago libxc: copy back the result of XEN_DOMCTL_createdomain
Roger Pau Monné [Fri, 17 Aug 2018 11:59:35 +0000 (13:59 +0200)]
libxc: copy back the result of XEN_DOMCTL_createdomain

Fixes the ARM guest boot breakage introduced by 54ed251dc7.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago rangeset: make inquiry functions tolerate NULL inputs
Jan Beulich [Fri, 17 Aug 2018 11:54:40 +0000 (13:54 +0200)]
rangeset: make inquiry functions tolerate NULL inputs

Rather than special casing the ->iomem_caps check in x86's
get_page_from_l1e() for the dom_xen case, let's be more tolerant in
general, along the lines of rangeset_is_empty(): A never allocated
rangeset can't possibly contain or overlap any range.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
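The NULL-tolerant behaviour described here can be sketched on a toy range set. The type and the `rs_*` functions below are hypothetical and far simpler than Xen's rangeset (a single inclusive range instead of a list); they only illustrate treating a never allocated set as empty.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified range set: a single inclusive range. */
struct rangeset {
    unsigned long start, end;
    bool empty;
};

/* A NULL (never allocated) set counts as empty. */
bool rs_is_empty(const struct rangeset *r)
{
    return r == NULL || r->empty;
}

/* A NULL set cannot contain any range. */
bool rs_contains_range(const struct rangeset *r,
                       unsigned long s, unsigned long e)
{
    if (rs_is_empty(r))
        return false;
    return r->start <= s && e <= r->end;
}

/* A NULL set cannot overlap any range. */
bool rs_overlaps_range(const struct rangeset *r,
                       unsigned long s, unsigned long e)
{
    if (rs_is_empty(r))
        return false;
    return s <= r->end && r->start <= e;
}
```

With the inquiry functions behaving this way, callers no longer need their own NULL special case before asking the question.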
6 years ago dom0/pvh: change the order of the MMCFG initialization
Roger Pau Monné [Fri, 17 Aug 2018 11:54:02 +0000 (13:54 +0200)]
dom0/pvh: change the order of the MMCFG initialization

So it's done before the iommu is initialized. This is required in
order to be able to fetch the MMCFG regions from the domain struct.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: remove page.h and processor.h inclusion from asm_defns.h
Jan Beulich [Fri, 17 Aug 2018 11:52:55 +0000 (13:52 +0200)]
x86: remove page.h and processor.h inclusion from asm_defns.h

Subsequent changes require this (too wide anyway imo) dependency to be
dropped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/HVM: correct an inverted check in hvm_load()
Jan Beulich [Fri, 17 Aug 2018 11:52:20 +0000 (13:52 +0200)]
x86/HVM: correct an inverted check in hvm_load()

Clearly we want to put a vCPU to sleep if it is _not_ already down.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: make arch_set_info_guest() match comments in load_segments()
Jan Beulich [Fri, 17 Aug 2018 11:51:27 +0000 (13:51 +0200)]
x86: make arch_set_info_guest() match comments in load_segments()

For both fs_base and gs_base_user, there are comments saying "This can
only be non-zero if selector is NULL." While save_segments() ensures
this, so far arch_set_info_guest() didn't. Make behavior consistent
(attaching comments identical to those in save_segments()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/setup: Avoid OoB E820 lookup when calculating the L1TF safe address
Andrew Cooper [Thu, 16 Aug 2018 15:26:22 +0000 (16:26 +0100)]
x86/setup: Avoid OoB E820 lookup when calculating the L1TF safe address

A number of corner cases (most obviously, no-real-mode and no Multiboot memory
map) can end up with e820_raw.nr_map being 0, at which point the L1TF
calculation will underflow.

Spotted by Coverity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
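The underflow mentioned above is the usual unsigned `count - 1` hazard. The sketch below uses hypothetical types and a hypothetical `last_entry_end` function; that the calculation looks at the last map entry, and that 0 is a suitable result for an empty map, are assumptions of this sketch and not details taken from the commit.

```c
#include <stdint.h>

/* Hypothetical cut-down memory map. */
struct e820_entry {
    uint64_t addr, size;
};

struct e820_map {
    unsigned int nr_map;
    struct e820_entry map[8];
};

/* Return the end address of the last entry, or 0 for an empty map.
 * Without the nr_map check, nr_map - 1 would wrap around to UINT_MAX
 * and index far outside map[]. */
uint64_t last_entry_end(const struct e820_map *m)
{
    if (m->nr_map == 0)
        return 0;

    const struct e820_entry *last = &m->map[m->nr_map - 1];

    return last->addr + last->size;
}
```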
6 years ago libxl: fix ARM build after 54ed251dc7
Jan Beulich [Thu, 16 Aug 2018 06:49:29 +0000 (00:49 -0600)]
libxl: fix ARM build after 54ed251dc7

Commit "tools: Rework xc_domain_create() to take a full
xen_domctl_createdomain"  failed to replace one further instance of
xc_config in libxl__arch_domain_save_config().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago x86/mmcfg: remove redundant code in pci_mmcfg_reject_broken()
Zhenzhong Duan [Thu, 16 Aug 2018 07:31:57 +0000 (09:31 +0200)]
x86/mmcfg: remove redundant code in pci_mmcfg_reject_broken()

No functional change.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago gnttab/ARM: properly implement gnttab_create_status_page()
Jan Beulich [Thu, 16 Aug 2018 07:30:59 +0000 (09:30 +0200)]
gnttab/ARM: properly implement gnttab_create_status_page()

Prevent the "BUG_ON(page_get_owner(pg) != d)" in
gnttab_unpopulate_status_frames() from triggering.

Reported-by: 王磊 <lei19.wang@samsung.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries
Paul Durrant [Thu, 16 Aug 2018 07:27:30 +0000 (09:27 +0200)]
x86/hvm/emulate: make sure rep I/O emulation does not cross GFN boundaries

When emulating a rep I/O operation it is possible that the ioreq will
describe a single operation that spans multiple GFNs. This is fine as long
as all those GFNs fall within an MMIO region covered by a single device
model, but unfortunately the higher levels of the emulation code do not
guarantee that. This is something that should almost certainly be fixed,
but in the meantime this patch makes sure that MMIO is truncated at GFN
boundaries and hence the appropriate device model is re-evaluated for each
target GFN.

NOTE: This patch does not deal with the case of a single MMIO operation
      spanning a GFN boundary. That is more complex to deal with and is
      deferred to a subsequent patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Convert calculations to be 32-bit only.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
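Truncating a rep operation at a GFN boundary amounts to limiting the repeat count to what fits in the current page. The sketch below covers only the forward case (direction flag clear) and uses a hypothetical helper name and a 4 KiB page size; it is an illustration of the arithmetic, not the Xen emulation code.

```c
#include <stdint.h>

#define GFN_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/* Hypothetical helper: limit a forward (direction flag clear) rep
 * operation so that all accesses stay within the page containing gpa.
 * Returns 0 if not even one access fits, i.e. a single operation
 * straddles the boundary, which is the case the NOTE above defers. */
unsigned long clamp_reps_to_page(uint64_t gpa, unsigned int bytes_per_rep,
                                 unsigned long reps)
{
    unsigned int offset = (unsigned int)(gpa & (GFN_PAGE_SIZE - 1));
    unsigned long fit = (GFN_PAGE_SIZE - offset) / bytes_per_rep;

    return reps < fit ? reps : fit;
}
```

The remaining repetitions are then handled in a later pass, at which point the target device model can be looked up again for the next GFN.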
6 years ago xen/evtchn: Pass max_evtchn_port into evtchn_init()
Andrew Cooper [Fri, 16 Mar 2018 18:27:24 +0000 (18:27 +0000)]
xen/evtchn: Pass max_evtchn_port into evtchn_init()

... rather than setting it up once domain_create() has completed.  This
involves constructing a default value for dom0.

No practical change in functionality.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago xen/domctl: Merge set_max_evtchn into createdomain
Andrew Cooper [Tue, 27 Feb 2018 17:39:37 +0000 (17:39 +0000)]
xen/domctl: Merge set_max_evtchn into createdomain

set_max_evtchn is somewhat weird.  It was introduced with the event_fifo work,
but has never been used.  Still, it is a bound on resources consumed by the
event channel infrastructure, and should be part of createdomain, rather than
editable after the fact.

Drop XEN_DOMCTL_set_max_evtchn completely (including XSM hooks and libxc
wrappers), and retain the functionality in XEN_DOMCTL_createdomain.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago tools: Rework xc_domain_create() to take a full xen_domctl_createdomain
Andrew Cooper [Fri, 9 Mar 2018 14:38:35 +0000 (14:38 +0000)]
tools: Rework xc_domain_create() to take a full xen_domctl_createdomain

In future patches, the structure will be extended with further information,
and this is far cleaner than adding extra parameters.

The python stubs are the only user which passes NULL for the existing config
option (which is actually the arch substructure).  Therefore, the #ifdefary
moves to compensate.

For libxl, pass the full config object down into
libxl__arch_domain_{prepare,save}_config(), as there are in practice arch
specific settings in the common part of the structure (flags s3_integrity and
oos_off specifically).

No practical change in behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/ocaml: Pass a full domctl_create_config into stub_xc_domain_create()
Andrew Cooper [Mon, 12 Mar 2018 10:40:33 +0000 (10:40 +0000)]
tools/ocaml: Pass a full domctl_create_config into stub_xc_domain_create()

The underlying C function is about to make the same change, and the structure
is going to gain extra fields.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
6 years ago x86/pv: Use xmemdup() for cpuidmasks, rather than opencoding it
Andrew Cooper [Wed, 15 Aug 2018 09:53:53 +0000 (10:53 +0100)]
x86/pv: Use xmemdup() for cpuidmasks, rather than opencoding it

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/ptwr: Misc cleanup to ptwr_emulated_update()
Andrew Cooper [Fri, 10 Aug 2018 17:05:24 +0000 (18:05 +0100)]
x86/ptwr: Misc cleanup to ptwr_emulated_update()

All but one user wants mfn as mfn_t, so switch its type.  offset is only ever
used when multiplied by 8, so fold that into its initial calculation.  Fold all
the pointer arithmetic on pl1e together, to avoid needless casts.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: write to correct variable in parse_pv_l1tf()
Jan Beulich [Wed, 15 Aug 2018 12:15:30 +0000 (14:15 +0200)]
x86: write to correct variable in parse_pv_l1tf()

Apparently a copy-and-paste mistake.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/hvm/ioreq: MMIO range checking completely ignores direction flag
Paul Durrant [Wed, 15 Aug 2018 12:14:06 +0000 (14:14 +0200)]
x86/hvm/ioreq: MMIO range checking completely ignores direction flag

hvm_select_ioreq_server() is used to route an ioreq to the appropriate
ioreq server. For MMIO this is done by comparing the range of the ioreq
to the ranges registered by the device models of each ioreq server.
Unfortunately the calculation of the range of the ioreq completely ignores
the direction flag and thus may calculate the wrong range for comparison.
Thus the ioreq may either be routed to the wrong server or erroneously
terminated by null_ops.

NOTE: The patch also fixes whitespace in the switch statement to make it
      style compliant.
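
As an illustration of why the direction flag matters for the range
comparison, here is a minimal Python model. It is a sketch, not the Xen C
code: the IoReq class and the helper names are hypothetical, and the fields
(addr, size, count, df) simply mirror the terms used above.

```python
from dataclasses import dataclass


@dataclass
class IoReq:
    addr: int    # guest physical address of the first repetition
    size: int    # bytes accessed per repetition
    count: int   # number of repetitions
    df: bool     # direction flag: True means addresses decrease


def mmio_range(req):
    """Return the inclusive (first_byte, last_byte) touched by req."""
    if req.df:
        # Repetitions walk downwards from addr, but the first repetition
        # still covers addr .. addr + size - 1.
        first = req.addr - (req.count - 1) * req.size
        last = req.addr + req.size - 1
    else:
        first = req.addr
        last = req.addr + req.size * req.count - 1
    return first, last


def in_registered_range(req, start, end):
    """True if every byte of req lies within the inclusive [start, end]."""
    first, last = mmio_range(req)
    return start <= first and last <= end
```

Ignoring df would treat a descending access at 0x1000 as if it covered
0x1000 upwards, so it could match a range that it mostly falls outside of.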

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agolibs/foreignmemory: Avoid printing an error for ENOTSUPP
Julien Grall [Mon, 13 Aug 2018 17:33:25 +0000 (18:33 +0100)]
libs/foreignmemory: Avoid printing an error for ENOTSUPP

Resource mapping is not supported on Arm and results in an error message
at every guest boot:

xenforeignmemory: error: ioctl failed: Operation not supported

Hide the error message when errno is ENOTSUPP.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agolibxl: start pvqemu when 9pfs is requested
Stefano Stabellini [Tue, 14 Aug 2018 22:13:09 +0000 (15:13 -0700)]
libxl: start pvqemu when 9pfs is requested

PV 9pfs requires the PV backend in QEMU. Make sure that libxl knows it.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agopygrub: fix package version
Simon Rowe [Wed, 15 Aug 2018 08:08:07 +0000 (09:08 +0100)]
pygrub: fix package version

Make the version in setup.py agree with PYGRUB_VER.

Signed-off-by: Simon Rowe <simon.rowe@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen: fix stale PVH comment
Roger Pau Monne [Tue, 14 Aug 2018 14:03:24 +0000 (16:03 +0200)]
xen: fix stale PVH comment

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxl.conf: Add global affinity masks
Wei Liu [Tue, 7 Aug 2018 14:35:34 +0000 (15:35 +0100)]
xl.conf: Add global affinity masks

XSA-273 involves one hyperthread being able to use Spectre-like
techniques to "spy" on another thread.  The details are somewhat
complicated, but the upshot is that after all Xen-based mitigations
have been applied:

* PV guests cannot spy on sibling threads
* HVM guests can spy on sibling threads

(NB that for purposes of this vulnerability, PVH and HVM guests are
identical.  Whenever this comment refers to 'HVM', this includes PVH.)

There are many possible mitigations to this, including disabling
hyperthreading entirely.  But another solution would be:

* Specify some cores as PV-only, others as PV or HVM
* Allow HVM guests to only run on thread 0 of the "HVM-or-PV" cores
* Allow PV guests to run on the above cores, as well as any thread of the PV-only cores.

For example, suppose you had 16 threads across 8 cores (0-7).  You
could specify 0-3 as PV-only, and 4-7 as HVM-or-PV.  Then you'd set
the affinity of the HVM guests as follows (binary representation):

0000000010101010

And the affinity of the PV guests as follows:

1111111110101010

In order to make this easy, this patch introduces three "global affinity
masks", placed in xl.conf:

    vm.cpumask
    vm.hvm.cpumask
    vm.pv.cpumask

These are parsed just like the 'cpus' and 'cpus_soft' options in the
per-domain xl configuration files.  The resulting mask is AND-ed with
whatever mask results at the end of the xl configuration file.
`vm.cpumask` would be applied to all guest types, `vm.hvm.cpumask`
would be applied to HVM and PVH guest types, and `vm.pv.cpumask`
would be applied to PV guest types.

The idea would be that to implement the above mask across all your
VMs, you'd simply add the following two lines to the configuration
file:

    vm.hvm.cpumask=8,10,12,14
    vm.pv.cpumask=0-8,10,12,14

See xl.conf manpage for details.
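
To check that the two example lines match the binary masks given earlier,
here is a small Python sketch. It is not xl's parser: parse_cpu_list and
as_bits are hypothetical helpers, and they handle only plain numbers and
ranges, which is a subset of what the real 'cpus' syntax accepts.

```python
def parse_cpu_list(spec):
    """Parse a list such as "0-8,10,12,14" into a set of CPU numbers."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def as_bits(cpus, nr_cpus=16):
    """Render a CPU set as a bit string with thread 0 leftmost."""
    return "".join("1" if i in cpus else "0" for i in range(nr_cpus))


hvm_global = parse_cpu_list("8,10,12,14")
pv_global = parse_cpu_list("0-8,10,12,14")

# A per-domain "cpus" setting is AND-ed with the matching global mask.
domain_cpus = parse_cpu_list("0-15")
effective_hvm = domain_cpus & hvm_global
```

With these definitions, as_bits(hvm_global) and as_bits(pv_global) give
the HVM and PV masks shown in the example.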

This is part of XSA-273 / CVE-2018-3646.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
6 years agox86: Make "spec-ctrl=no" a global disable of all mitigations
Jan Beulich [Mon, 13 Aug 2018 11:07:23 +0000 (05:07 -0600)]
x86: Make "spec-ctrl=no" a global disable of all mitigations

In order to have a simple and easy to remember means to suppress all the
more or less recent workarounds for hardware vulnerabilities, force
settings not controlled by "spec-ctrl=" also to their original defaults,
unless they've been forced to specific values already by earlier command
line options.

This is part of XSA-273.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/spec-ctrl: Introduce an option to control L1D_FLUSH for HVM HAP guests
Andrew Cooper [Tue, 29 May 2018 17:44:16 +0000 (18:44 +0100)]
x86/spec-ctrl: Introduce an option to control L1D_FLUSH for HVM HAP guests

This mitigation requires up-to-date microcode, and is enabled by default on
affected hardware if available, and is used for HVM guests.

The default for SMT/Hyperthreading is far more complicated to reason about,
not least because we don't know if the user is going to want to run any HVM
guests to begin with.  If an explicit default isn't given, nag the user to
perform a risk assessment and choose an explicit default, and leave other
configuration to the toolstack.

This is part of XSA-273 / CVE-2018-3620.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/msr: Virtualise MSR_FLUSH_CMD for guests
Andrew Cooper [Fri, 13 Apr 2018 15:34:01 +0000 (15:34 +0000)]
x86/msr: Virtualise MSR_FLUSH_CMD for guests

Guests (outside of the nested virt case, which isn't supported yet) don't need
L1D_FLUSH for their L1TF mitigations, but offering/emulating MSR_FLUSH_CMD is
easy and doesn't pose an issue for Xen.

The MSR is offered to HVM guests only.  PV guests attempting to use it would
trap for emulation, and the L1D cache would fill long before the return to
guest context.  As such, PV guests can't make any use of the L1D_FLUSH
functionality.

This is part of XSA-273 / CVE-2018-3646.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/spec-ctrl: CPUID/MSR definitions for L1D_FLUSH
Andrew Cooper [Wed, 28 Mar 2018 14:21:39 +0000 (15:21 +0100)]
x86/spec-ctrl: CPUID/MSR definitions for L1D_FLUSH

This is part of XSA-273 / CVE-2018-3646.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/pv: Force a guest into shadow mode when it writes an L1TF-vulnerable PTE
Juergen Gross [Mon, 23 Jul 2018 06:11:40 +0000 (08:11 +0200)]
x86/pv: Force a guest into shadow mode when it writes an L1TF-vulnerable PTE

See the comment in shadow.h for an explanation of L1TF and the safety
consideration of the PTEs.

In the case that CONFIG_SHADOW_PAGING isn't compiled in, crash the domain
instead.  This allows well-behaved PV guests to function, while preventing
L1TF from being exploited.  (Note: PV guest kernels which haven't been updated
with L1TF mitigations will likely be crashed as soon as they try paging a
piece of userspace out to disk.)

This is part of XSA-273 / CVE-2018-3620.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/mm: Plumbing to allow any PTE update to fail with -ERESTART
Andrew Cooper [Mon, 23 Jul 2018 06:11:40 +0000 (08:11 +0200)]
x86/mm: Plumbing to allow any PTE update to fail with -ERESTART

Switching to shadow mode is performed in tasklet context.  To facilitate this,
we schedule the tasklet, then create a hypercall continuation to allow the
switch to take place.

As a consequence, the x86 mm code needs to cope with an L1e operation being
continuable.  do_mmu{,ext}_op() may no longer assert that a continuation
doesn't happen on the final iteration.

To handle the arguments correctly on continuation, compat_update_va_mapping*()
may no longer call into their non-compat counterparts.  Move the compat
functions into mm.c rather than exporting __do_update_va_mapping() and
{get,put}_pg_owner(), and fix an unsigned long/int inconsistency with
compat_update_va_mapping_otherdomain().

This is part of XSA-273 / CVE-2018-3620.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/shadow: Infrastructure to force a PV guest into shadow mode
Juergen Gross [Mon, 23 Jul 2018 06:11:40 +0000 (07:11 +0100)]
x86/shadow: Infrastructure to force a PV guest into shadow mode

To mitigate L1TF, we cannot alter an architecturally-legitimate PTE a PV guest
chooses to write, but we can force the PV domain into shadow mode so Xen
controls the PTEs which are reachable by the CPU pagewalk.

Introduce new shadow mode, PG_SH_forced, and a tasklet to perform the
transition.  Later patches will introduce the logic to enable this mode at the
appropriate time.

To simplify vcpu cleanup, make tasklet_kill() idempotent with respect to
tasklet_init(), which involves adding a helper to check for an uninitialised
list head.

This is part of XSA-273 / CVE-2018-3620.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/spec-ctrl: Introduce an option to control L1TF mitigation for PV guests
Andrew Cooper [Mon, 23 Jul 2018 13:46:10 +0000 (13:46 +0000)]
x86/spec-ctrl: Introduce an option to control L1TF mitigation for PV guests

Shadowing a PV guest is only available when shadow paging is compiled in.
When shadow paging isn't available, guests can be crashed instead as
mitigation from Xen's point of view.

Ideally, dom0 would also be potentially-shadowed-by-default, but dom0 has
never been shadowed before, and there are some stability issues under
investigation.

This is part of XSA-273 / CVE-2018-3620.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/spec-ctrl: Calculate safe PTE addresses for L1TF mitigations
Andrew Cooper [Wed, 25 Jul 2018 12:10:19 +0000 (12:10 +0000)]
x86/spec-ctrl: Calculate safe PTE addresses for L1TF mitigations

Safe PTE addresses for L1TF mitigations are ones which are within the L1D
address width (may be wider than reported in CPUID), and above the highest
cacheable RAM/NVDIMM/BAR/etc.

All logic here is best-effort heuristics, which should in practice be fine for
most hardware.  Future work will see about disentangling the SRAT handling
further, as well as having L0 pass this information down to lower levels when
virtualised.

This is part of XSA-273 / CVE-2018-3620.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
6 years agotools/oxenstored: Make evaluation order explicit
Christian Lindig [Mon, 13 Aug 2018 16:26:56 +0000 (17:26 +0100)]
tools/oxenstored: Make evaluation order explicit

In Store.path_write(), Path.apply_modify() updates the node_created
reference and both the value of apply_modify() and node_created are
returned by path_write().

At least with OCaml 4.06.1 this leads to the value of node_created being
returned *before* it is updated by apply_modify().  This in turn leads
to the quota for a domain not being updated in Store.write().  Hence, a
guest can create an unlimited number of entries in xenstore.

The fix is to make evaluation order explicit.
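
The ordering problem can be modelled in a few lines of Python. This is a
sketch only: Python itself evaluates left to right, so the two orders are
written out by hand, and the Store class and function bodies are
hypothetical stand-ins that borrow the names from the description above.

```python
class Store:
    def __init__(self):
        self.node_created = False
        self.nodes = {}

    def apply_modify(self, path, value):
        """Write value at path, recording whether a new node was created."""
        if path not in self.nodes:
            self.node_created = True
        self.nodes[path] = value
        return self.nodes


def path_write_unordered(store, path, value):
    # Models the problematic order: the flag is read before the call
    # that updates it, so the caller sees a stale value.
    created = store.node_created
    nodes = store.apply_modify(path, value)
    return nodes, created


def path_write_ordered(store, path, value):
    # Models the fix: run the update first, then read the flag.
    nodes = store.apply_modify(path, value)
    created = store.node_created
    return nodes, created
```

In the first variant a caller that charges quota based on the returned
flag would never see a newly created node.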

This is XSA-272.

Signed-off-by: Christian Lindig <christian.lindig@citrix.com>
Reviewed-by: Rob Hoes <rob.hoes@citrix.com>
6 years agox86/vtx: Fix the checking for unknown/invalid MSR_DEBUGCTL bits
Andrew Cooper [Mon, 13 Aug 2018 16:26:21 +0000 (17:26 +0100)]
x86/vtx: Fix the checking for unknown/invalid MSR_DEBUGCTL bits

The VPMU_MODE_OFF early-exit in vpmu_do_wrmsr() introduced by c/s
11fe998e56 bypasses all reserved bit checking in the general case.  As a
result, a guest can enable BTS when it shouldn't be permitted to, and
lock up the entire host.

With vPMU active (not a security supported configuration, but useful for
debugging), the reserved bit checking is broken, caused by the original
BTS changeset 1a8aa75ed.

From a correctness standpoint, it is not possible to have two different
pieces of code responsible for different parts of value checking, if
there isn't an accumulation of bits which have been checked.  A
practical upshot of this is that a guest can set any value it
wishes (usually resulting in a vmentry failure for bad guest state).

Therefore, fix this by implementing all the reserved bit checking in the
main MSR_DEBUGCTL block, and removing all handling of DEBUGCTL from the
vPMU MSR logic.

This is XSA-269.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoARM: disable grant table v2
Stefano Stabellini [Mon, 13 Aug 2018 16:25:51 +0000 (17:25 +0100)]
ARM: disable grant table v2

It was never expected to work; the implementation is incomplete.

As a side effect, it also prevents guests from triggering a
"BUG_ON(page_get_owner(pg) != d)" in gnttab_unpopulate_status_frames().

This is XSA-268.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/spec-ctrl: Yet more fixes for xpti= parsing
Andrew Cooper [Thu, 9 Aug 2018 16:22:17 +0000 (17:22 +0100)]
x86/spec-ctrl: Yet more fixes for xpti= parsing

As it currently stands, 'xpti=dom0' is indistinguishable from the default
value, which means it will be overridden by ARCH_CAPABILITIES_RDCL_NO on fixed
hardware.

Switch opt_xpti to use -1 as a default like all our other related options, and
clobber it as soon as we have a string to parse.

In addition, 'xpti' alone should be interpreted in its positive boolean form,
rather than resulting in a parse error.

  (XEN) parameter "xpti" has invalid value "", rc=-22!
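
The intended parsing behaviour can be sketched in Python as follows. This
is an illustrative model under stated assumptions, not the Xen parser:
parse_bool, parse_xpti and resolve_xpti are hypothetical names, the accepted
boolean spellings are assumed, and the hardware default (enabled unless
RDCL_NO is set) is an assumption made for the example.

```python
OPT_XPTI_DEFAULT = -1  # sentinel: no xpti option was given


def parse_bool(text):
    """Map a boolean spelling to True/False; return None if unrecognised."""
    if text in ("", "1", "yes", "on", "true", "enable"):
        return True
    if text in ("0", "no", "off", "false", "disable"):
        return False
    return None


def parse_xpti(arg):
    """Parse the text after 'xpti' into explicit dom0/domU choices."""
    whole = parse_bool(arg)  # a bare 'xpti' arrives here as ""
    if whole is not None:
        return {"dom0": whole, "domu": whole}
    # Clobber the default as soon as there is a string to parse.
    opts = {"dom0": False, "domu": False}
    for item in arg.split(","):
        name, _, value = item.partition("=")
        flag = parse_bool(value)
        if name not in opts or flag is None:
            raise ValueError(f"invalid xpti value: {arg!r}")
        opts[name] = flag
    return opts


def resolve_xpti(opt, rdcl_no):
    """Apply the hardware-based default only when nothing was requested."""
    if opt == OPT_XPTI_DEFAULT:
        enabled = not rdcl_no
        return {"dom0": enabled, "domu": enabled}
    return opt
```

Because an explicit 'xpti=dom0' no longer compares equal to the sentinel,
it survives on hardware that reports RDCL_NO.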

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agotools/libxenctrl: use new xenforeignmemory API to seed grant table
Paul Durrant [Thu, 9 Aug 2018 09:59:41 +0000 (10:59 +0100)]
tools/libxenctrl: use new xenforeignmemory API to seed grant table

A previous patch added support for priv-mapping guest resources directly
(rather than having to foreign-map, which requires P2M modification for
HVM guests).

This patch makes use of the new API to seed the guest grant table unless
the underlying infrastructure (i.e. privcmd) doesn't support it, in which
case the old scheme is used.

NOTE: The call to xc_dom_gnttab_hvm_seed() in hvm_build_set_params() was
      actually unnecessary, as the grant table has already been seeded
      by a prior call to xc_dom_gnttab_init() made by libxl__build_dom().

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agocommon: add a new mappable resource type: XENMEM_resource_grant_table
Paul Durrant [Thu, 9 Aug 2018 09:59:40 +0000 (10:59 +0100)]
common: add a new mappable resource type: XENMEM_resource_grant_table

This patch allows grant table frames to be mapped using the
XENMEM_acquire_resource memory op.

NOTE: This patch expands the on-stack mfn_list array in acquire_resource()
      but it is still small enough to remain on-stack.

NOTE: This patch also removes a bogus comment above the
      grant_to_status_frames() function.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
[Rebase over "Explicitly default to gnttab v1 during domain creation"]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agocommon/gnttab: Explicitly default to gnttab v1 during domain creation
Andrew Cooper [Wed, 8 Aug 2018 14:54:30 +0000 (15:54 +0100)]
common/gnttab: Explicitly default to gnttab v1 during domain creation

For reasons which appear to be exclusively down to poor review of the grant
table v2 code, a grant table's version field wasn't initialised during
creation.

A number of problems (including XSAs) have occurred in the past trying
to use a grant table which hasn't been properly set up, and various areas of
the code cope with v0 by defaulting to v1.

In particular, the toolstack using GNTTABOP_setup_table to be able to fill in
the store/console grants has a side effect of switching to v1.

In hindsight however, this "fixup if we see 0" is very poor, with a
substantial degree of risk.  Explicitly default to grant table v1 during
domain create, and let the rest of the code work safely in the knowledge that
the version is sensibly set.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years agox86/vlapic: Bugfixes and improvements to vlapic_{read,write}()
Andrew Cooper [Mon, 6 Aug 2018 09:11:00 +0000 (09:11 +0000)]
x86/vlapic: Bugfixes and improvements to vlapic_{read,write}()

Firstly, there is no 'offset' boundary check on the non-32-bit write path
before the call to vlapic_read_aligned(), which allows an attacker to read
beyond the end of vlapic->regs->data[], which is only 1024 bytes long.

However, as the backing memory is a domheap page, and misaligned accesses get
chunked down to single bytes across page boundaries, I can't spot any
XSA-worthy problems which occur from the overrun.

On real hardware, bad accesses don't instantly crash the machine.  Their
behaviour is undefined, but the domain_crash() prohibits sensible testing.
Behave more like other x86 MMIO and terminate bad accesses with appropriate
defaults.
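
A rough Python model of terminating bad accesses instead of crashing the
domain is shown below. It is a sketch under assumptions: mmio_read is a
hypothetical helper, registers are modelled as 32-bit values on 16-byte
boundaries, and returning 0 for a bad read is an assumed default, since the
text above only says "appropriate defaults".

```python
REGS_SIZE = 1024  # size of the register block, per the description above


def mmio_read(regs, offset, length):
    """Read length bytes at offset, terminating bad accesses with 0."""
    alignment = offset & 0xF
    # Reject accesses that spill past the 32-bit register within its
    # 16-byte slot, or that run off the end of the register block.
    if alignment + length > 4 or offset + length > len(regs):
        return 0  # assumed default for a bad access
    return int.from_bytes(regs[offset:offset + length], "little")


regs = bytearray(REGS_SIZE)
regs[0x20:0x24] = (0x12345678).to_bytes(4, "little")
```

The bounds check happens before any data is touched, so an out-of-range
offset never indexes past the end of the backing array.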

While making these changes, clean up and simplify the smaller-access
handling.  In particular, avoid pointer based mechanisms for 1/2-byte reads so
as to avoid forcing the value to be spilled to the stack.

  add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-175 (-175)
  function                                     old     new   delta
  vlapic_read                                  211     142     -69
  vlapic_write                                 304     198    -106

Finally, there are a plethora of read/write functions in the vlapic namespace,
so rename these to vlapic_mmio_{read,write}() to make their purpose more
clear.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years agox86: move arch_evtchn_inject to x86 common code
Wei Liu [Tue, 7 Aug 2018 10:00:50 +0000 (11:00 +0100)]
x86: move arch_evtchn_inject to x86 common code

It is not specific to HVM. It just so happens that PV doesn't need
special handling.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86: add missing "inline" keyword
Wei Liu [Tue, 7 Aug 2018 10:00:45 +0000 (11:00 +0100)]
x86: add missing "inline" keyword

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86: put compat.o and x86_64/compat.o under CONFIG_PV
Wei Liu [Tue, 7 Aug 2018 10:00:44 +0000 (11:00 +0100)]
x86: put compat.o and x86_64/compat.o under CONFIG_PV

They contain code for compat hypercall for PV guests.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agodrop {,acpi_}reserve_bootmem()
Jan Beulich [Fri, 3 Aug 2018 15:40:31 +0000 (17:40 +0200)]
drop {,acpi_}reserve_bootmem()

Both are entirely unused (to be fair, reserve_bootmem() has a use inside
an "#if 0" section in x86's mpparse.c, but if we were to re-enable that
code, it would need doing differently anyway).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/hvm: Drop hvm_sr_handlers initializer
Alexandru Isaila [Fri, 3 Aug 2018 15:39:31 +0000 (17:39 +0200)]
x86/hvm: Drop hvm_sr_handlers initializer

This initializer is flawed and only sets .name of array entry 0
to a non-NULL string.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoautomation: ensure created are not owned as root
Doug Goldstein [Fri, 3 Aug 2018 14:46:49 +0000 (09:46 -0500)]
automation: ensure created are not owned as root

By default the container runs as the root user and since the source tree
is bind mounted into the container, any file is created and owned by the
root user which harms ergonomics when working outside of the container
environment. This maps the root user within the container to the uid of
the user outside of the container so files are not owned by root.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years agoautomation: remove dead code from containerize
Doug Goldstein [Fri, 3 Aug 2018 14:46:48 +0000 (09:46 -0500)]
automation: remove dead code from containerize

This is more dead code.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years agoautomation: drop container name from containerize
Doug Goldstein [Fri, 3 Aug 2018 14:46:47 +0000 (09:46 -0500)]
automation: drop container name from containerize

This was something that existed for some scripting support for a totally
unrelated project and when I copied this script I failed to remove it so
this removes it. Build containers for Xen are best as ephemeral
environments and should just utilize Docker's default container naming
behavior.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years agoautomation: standardize containerize env names
Doug Goldstein [Fri, 3 Aug 2018 14:46:46 +0000 (09:46 -0500)]
automation: standardize containerize env names

Standardized all the environment variable names that the containerize
script uses to start with CONTAINER_

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen: specify support for EXPERT and DEBUG Kconfig options
Stefano Stabellini [Tue, 31 Jul 2018 15:24:01 +0000 (08:24 -0700)]
xen: specify support for EXPERT and DEBUG Kconfig options

Add a clear statement about them, reflecting the current security
support status of Kconfig options (no changes to current policies).

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
CC: George.Dunlap@eu.citrix.com
CC: Ian.Jackson@eu.citrix.com
CC: jbeulich@suse.com
CC: andrew.cooper3@citrix.com
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Tim Deegan <tim@xen.org>
CC: Wei Liu <wei.liu2@citrix.com>
---
Changes in v7:
- talk about EXPERT and DEBUG rather than CONFIG_EXPERT and CONFIG_DEBUG

6 years agoxen: add cloc target
Stefano Stabellini [Tue, 31 Jul 2018 15:23:01 +0000 (08:23 -0700)]
xen: add cloc target

Add a Xen build target to count the lines of code of the source files
built. Uses `cloc' to do the job.

With Xen on ARM taking off in embedded, IoT, and automotive, we are
seeing more and more uses of Xen in constrained environments. Users and
system integrators want the smallest Xen and Dom0 configurations. Some
of these deployments require certifications, where you definitely want
the smallest lines of code count. I provided this patch to give us the
lines of code count for that purpose.

Use the .o.d files to account for all the built source files. Generate a
list for the `cloc' utility and invoke `cloc'.
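
The idea of recovering the built source files from dependency files can be
sketched in Python. This is not the Makefile rule from the patch:
sources_from_depfile and the regular expression are hypothetical, and the
sample text only imitates the usual make dependency format.

```python
import re

# Match tokens ending in .c, .h or .S; object files such as foo.o are skipped.
SOURCE_RE = re.compile(r"[^\s\\:]+\.(?:c|h|S)\b")


def sources_from_depfile(text):
    """Return the sorted, de-duplicated source files named in a .d file."""
    return sorted(set(SOURCE_RE.findall(text)))


sample = "foo.o: foo.c include/foo.h \\\n include/bar.h\n"
file_list = sources_from_depfile(sample)
```

The resulting list could then be written to a file and handed to the line
counter.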

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
CC: jbeulich@suse.com
CC: andrew.cooper3@citrix.com
---
Changes in v4:
- use grep regex to get multiple source files from .d files

Changes in v3:
- remove build as dependency for the cloc target

Changes in v2:
- change implementation to use .o.d to find built source files

6 years agoxen: add per-platform defaults for NR_CPUS
Stefano Stabellini [Tue, 31 Jul 2018 15:22:01 +0000 (08:22 -0700)]
xen: add per-platform defaults for NR_CPUS

Add specific per-platform defaults for NR_CPUS. Note that the order of
the defaults matters: they need to go first, otherwise the generic
defaults will be applied.

This is done so that Xen builds customized for a specific hardware
platform can have the right NR_CPUS number.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
CC: JBeulich@suse.com
CC: andrew.cooper3@citrix.com
---

Changes in v6:
- remove useless additional default for ALL

6 years agoarm: add ALL_PLAT, QEMU, Rcar3 and MPSoC configs
Stefano Stabellini [Tue, 31 Jul 2018 15:21:01 +0000 (08:21 -0700)]
arm: add ALL_PLAT, QEMU, Rcar3 and MPSoC configs

Add a "Platform Support" choice with four kconfig options: QEMU, RCAR3,
MPSOC and ALL_PLAT. They enable the required options for their hardware
platform. ALL_PLAT enables all available platforms and it's the default.
It doesn't automatically select any of the related drivers, otherwise
they cannot be disabled. ALL_PLAT is implemented by using hidden options
with default values depending on ALL_PLAT.

In the case of the MPSOC that has a platform file under
arch/arm/platforms/, build the file if MPSOC.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
CC: artem_mygaiev@epam.com
CC: volodymyr_babchuk@epam.com
---
Changes in v8:
- remove QEMU_PLATFORM and RCAR3_PLATFORM that are currently unused
- remove selects from ALL
- rename ALL to ALL_PLAT
- introduce ALL64_PLAT and ALL32_PLAT

Changes in v5:
- turn platform support into a choice
- add ALL

Changes in v4:
- fix GICv3/GICV3
- default y to all options
- build xilinx-zynqmp if MPSOC