xenbits.xensource.com Git - people/sstabellini/xen-unstable.git/.git/log
5 years agoxen/arm: add coloring basic initialization
Luca Miccio [Wed, 21 Aug 2019 14:05:56 +0000 (16:05 +0200)]
xen/arm: add coloring basic initialization

Introduce a first, simple initialization function for the cache
coloring support. A helper function computes 'addr_col_mask', the
platform-dependent bitmask whose set bits are the memory address bits
that can be used for the coloring mechanism. This, in turn, is used to
determine the total number of available colors.

Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
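
Note (illustrative, not part of the patch): a minimal Python model of how
a color mask and the number of colors can be derived from the LLC way
size. It assumes 4 KiB pages and a power-of-two way size; the function
names other than 'addr_col_mask' are hypothetical, and the actual Xen
code may compute these values differently.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages


def addr_col_mask(way_size, page_shift=PAGE_SHIFT):
    """Return the mask of physical address bits that select a color.

    The usable bits lie above the page offset and below the way size.
    """
    page_size = 1 << page_shift
    if way_size < page_size or way_size & (way_size - 1):
        raise ValueError("way size must be a power of two >= page size")
    return (way_size - 1) & ~(page_size - 1)


def max_colors(way_size, page_shift=PAGE_SHIFT):
    """Number of distinct colors implied by the mask (hypothetical helper)."""
    return (addr_col_mask(way_size, page_shift) >> page_shift) + 1


def color_of(paddr, way_size, page_shift=PAGE_SHIFT):
    """Color of the page containing physical address paddr (hypothetical helper)."""
    return (paddr & addr_col_mask(way_size, page_shift)) >> page_shift
```

With a 64 KiB way and 4 KiB pages, the mask covers address bits 12-15,
which gives 16 colors.
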
5 years agoxen/arm: compute LLC way size by hardware inspection
Luca Miccio [Wed, 21 Aug 2019 13:55:52 +0000 (15:55 +0200)]
xen/arm: compute LLC way size by hardware inspection

The size of an LLC way is a crucial parameter for the cache coloring
support, since it determines the maximum number of available colors on
the platform.  This parameter can currently be retrieved only from
the way_size bootarg, which is prone to misconfiguration that nullifies
the coloring mechanism and breaks cache isolation.

Add an alternative, safer method to retrieve the way size by
directly asking the hardware, namely using the CCSIDR_EL1 and CSSELR_EL1
registers.

This method also has to check that at least an L2 cache is implemented
in the hardware, since there are scenarios where only an L1 cache is
available, e.g. QEMU.

Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
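
Note (illustrative, not part of the patch): a small Python model of how a
way size can be derived from a raw CCSIDR_EL1 value. It assumes the 32-bit
CCSIDR_EL1 layout used when FEAT_CCIDX is not implemented (LineSize in
bits [2:0], Associativity in bits [12:3], NumSets in bits [27:13]). The
function names are hypothetical. Selecting the cache level through
CSSELR_EL1 and reading the register are not modelled.

```python
def decode_ccsidr(ccsidr):
    """Decode a raw CCSIDR_EL1 value (layout without FEAT_CCIDX assumed).

    Returns (line_size_bytes, ways, sets).
    """
    line_size = 1 << ((ccsidr & 0x7) + 4)   # LineSize = log2(bytes) - 4
    ways = ((ccsidr >> 3) & 0x3FF) + 1      # Associativity = ways - 1
    sets = ((ccsidr >> 13) & 0x7FFF) + 1    # NumSets = sets - 1
    return line_size, ways, sets


def llc_way_size(ccsidr):
    """Size in bytes of one cache way: line size times number of sets."""
    line_size, _ways, sets = decode_ccsidr(ccsidr)
    return line_size * sets
```

For example, a cache with 64-byte lines, 16 ways and 1024 sets has a
64 KiB way (and a total size of 1 MiB).
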
5 years agoxen/arm: add debug logging interface for coloring
Luca Miccio [Tue, 20 Aug 2019 14:49:13 +0000 (16:49 +0200)]
xen/arm: add debug logging interface for coloring

Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Acked-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen: add parsing function for cache coloring configuration
Luca Miccio [Tue, 20 Aug 2019 14:08:10 +0000 (16:08 +0200)]
xen: add parsing function for cache coloring configuration

Add three new bootargs allowing configuration of cache coloring support
for Xen:
- way_size: The size of a LLC way in bytes. This value is mainly used
  to calculate the maximum available colors on the platform.
- dom0_colors: The coloring configuration for Dom0, which also acts as
  default configuration for any DomU without an explicit configuration.
- xen_colors: The coloring configuration for the Xen hypervisor itself.

A cache coloring configuration consists of a selection of colors to be
assigned to a VM or to the hypervisor. It is represented by a set of
ranges. Add a common function that parses a string with a
comma-separated set of hyphen-separated ranges like "0-7,15-16" and
returns both: the number of chosen colors, and an array containing their
ids.

Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
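
Note (illustrative, not part of the patch): a Python sketch of the range
parsing described above. The function name is hypothetical. Accepting a
single id such as "3" as a one-element range is an assumption, and the
sketch does not check ids against the number of available colors or
remove duplicates.

```python
def parse_color_config(config):
    """Parse a string like "0-7,15-16" into (count, list_of_color_ids)."""
    colors = []
    for part in config.split(","):
        part = part.strip()
        if not part:
            raise ValueError("empty range in %r" % config)
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        # Assumption: a bare id is treated as a range of length one.
        end = int(end_text) if sep else start
        if start > end:
            raise ValueError("descending range %r" % part)
        colors.extend(range(start, end + 1))
    return len(colors), colors
```

Parsing "0-7,15-16" this way yields ten colors: ids 0 through 7, then 15
and 16.
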
5 years agoRestore setup_pagetables xen_paddr argument
Luca Miccio [Mon, 6 Jan 2020 13:46:12 +0000 (14:46 +0100)]
Restore setup_pagetables xen_paddr argument

Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
5 years agoRevert "xen/arm: mm: Initialize page-tables earlier"
Luca Miccio [Mon, 6 Jan 2020 13:27:55 +0000 (14:27 +0100)]
Revert "xen/arm: mm: Initialize page-tables earlier"

This reverts commit 3a5d341681af650825bbe3bee9be5d187da35080.

5 years agoRevert "xen/arm: setup: Add Xen as boot module before printing all boot modules"
Luca Miccio [Mon, 6 Jan 2020 13:25:29 +0000 (14:25 +0100)]
Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules"

This reverts commit 48fb2a9deba11ee48dde21c5c1aa93b4d4e1043b.

5 years agoarch: arm: vgic-v3: fix GICD_ISACTIVER range
Peng Fan [Thu, 5 Dec 2019 00:31:25 +0000 (16:31 -0800)]
arch: arm: vgic-v3: fix GICD_ISACTIVER range

The end should be GICD_ISACTIVERN not GICD_ISACTIVER,
and also print a warning for the unhandled read.

Signed-off-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoxen/arm: allow domUs to iomap reserved-memory regions
Stefano Stabellini [Tue, 3 Dec 2019 02:32:19 +0000 (18:32 -0800)]
xen/arm: allow domUs to iomap reserved-memory regions

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoplatform: versal: add EEMI layer support
Izhar Ameer Shaikh [Tue, 3 Dec 2019 02:40:40 +0000 (18:40 -0800)]
platform: versal: add EEMI layer support

This patch adds support for the PM EEMI API mediate layer.

Mapping between device, clock and reset nodes and corresponding base
addresses is derived from topology information. Similar to ZU+, certain
device nodes do not allow any operations such as turning off ACPU cores,
LPD etc.

Since there are a few significant changes to the handling of PM commands
for Versal, for various reasons (node value representations,
addition/removal of commands etc.), there is a separate handler for
the Versal platform.

Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoplatform: zynqmp: add a common EEMI header
Izhar Ameer Shaikh [Tue, 3 Dec 2019 02:37:14 +0000 (18:37 -0800)]
platform: zynqmp: add a common EEMI header

This patch adds a new common header to be used for generic PM
EEMI definitions. In addition, header guards are also added to
xilinx-zynqmp-mm.h and xilinx-zynqmp-eemi.h files.

The following unused enums are also removed:
 - pm_node_id
 - pm_request_ack
 - pm_abort_reason
 - pm_suspend_reason
 - pm_ram_state
 - pm_opchar_type

Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoplatform: zynqmp: correct typos in comments
Izhar Ameer Shaikh [Fri, 30 Aug 2019 23:32:33 +0000 (16:32 -0700)]
platform: zynqmp: correct typos in comments

Fixed minor typos in comments.

Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoplatform: zynqmp: rename clock node macros
Izhar Ameer Shaikh [Fri, 30 Aug 2019 23:32:32 +0000 (16:32 -0700)]
platform: zynqmp: rename clock node macros

To maintain future compatibility, rename clock node macros to have
PM_CLK_* prefix instead of previously used PM_CLOCK_* prefix.

Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoplatform: zynqmp: rename reset node macros
Izhar Ameer Shaikh [Fri, 30 Aug 2019 23:32:31 +0000 (16:32 -0700)]
platform: zynqmp: rename reset node macros

To maintain future compatibility, rename reset node macros to have
PM_RST_* prefix instead of previously used PM_RESET_* prefix.

Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoplatform: zynqmp: rename device node macros
Izhar Ameer Shaikh [Fri, 30 Aug 2019 23:32:30 +0000 (16:32 -0700)]
platform: zynqmp: rename device node macros

To maintain future compatibility, rename device node macros to have
PM_DEV_* prefix instead of previously used NODE_* prefix.

Signed-off-by: Izhar Ameer Shaikh <izhar.ameer.shaikh@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoxen: add a separate platform file for Versal
Stefano Stabellini [Mon, 15 Jul 2019 19:39:59 +0000 (12:39 -0700)]
xen: add a separate platform file for Versal

Let all the EEMI calls go through for Dom0. Block access for domUs.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen: mediate EEMI TCM calls
Stewart Hildebrand [Fri, 31 May 2019 20:26:13 +0000 (13:26 -0700)]
xen: mediate EEMI TCM calls

It is necessary to allow a DomU to issue EEMI power management
operations on TCM nodes when running OpenAMP in a DomU. Introduce the
TCM nodes in xilinx-zynqmp-eemi.c, so that they are allowed to do so
when the TCM regions are assigned to the domU (they are subject to the
usual permissions checks.)

Signed-off-by: Stewart Hildebrand <Stewart.Hildebrand@dornerworks.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoplatform: zynqmp: Map missing clocks to respective node
Tejas Patel [Mon, 25 Mar 2019 08:59:42 +0000 (01:59 -0700)]
platform: zynqmp: Map missing clocks to respective node

Dom0 requires access to the AMS_REF, TOPSW_LSBUS and LPD_LSBUS clocks.
Map these clocks to their respective nodes, so that access is provided
if Dom0 has permission to access those nodes.

Signed-off-by: Tejas Patel <tejas.patel@xilinx.com>
Reviewed-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agos/xen,shared-memory/xen,shared-memory-v1/g
Stefano Stabellini [Wed, 13 Mar 2019 19:33:08 +0000 (12:33 -0700)]
s/xen,shared-memory/xen,shared-memory-v1/g

The shared memory device tree binding went upstream as
"xen,shared-memory-v1". So, rename all occurrences of
"xen,shared-memory" to "xen,shared-memory-v1" in the docs.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/docs: improve reserved-memory doc
Stefano Stabellini [Thu, 7 Mar 2019 19:27:00 +0000 (11:27 -0800)]
xen/docs: improve reserved-memory doc

Extend the device tree snippet example in the docs to have a memory
node that covers the reserved-memory range as required by the device
tree spec.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/libxc: don't change xc_domain_memory_mapping
Stefano Stabellini [Fri, 1 Mar 2019 17:28:28 +0000 (09:28 -0800)]
xen/libxc: don't change xc_domain_memory_mapping

Although libxc doesn't promise compatibility, xc_domain_memory_mapping
has been used by QEMU for years. Instead of changing the signature of
the function, introduce a new xc_domain_memory_mapping_cache which takes
the additional cacheability parameter. Leave the original
xc_domain_memory_mapping unmodified.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/docs: how to map a page between dom0 and domU using iomem
Stefano Stabellini [Tue, 26 Feb 2019 23:00:44 +0000 (15:00 -0800)]
xen/docs: how to map a page between dom0 and domU using iomem

Document how to use the iomem option to share a page between Dom0 and a
DomU.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agolibxl/xl: add cacheability option to iomem
Stefano Stabellini [Tue, 26 Feb 2019 23:00:28 +0000 (15:00 -0800)]
libxl/xl: add cacheability option to iomem

Parse a new cacheability option for the iomem parameter. It can be
"devmem" for device memory mappings, which is the default, or "memory"
for normal memory mappings.

Store the parameter in a new field in libxl_iomem_range.

Pass the cacheability option to xc_domain_memory_mapping.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
CC: ian.jackson@eu.citrix.com
CC: wei.liu2@citrix.com
5 years agolibxc: xc_domain_memory_mapping, handle cacheability
Stefano Stabellini [Tue, 26 Feb 2019 23:00:25 +0000 (15:00 -0800)]
libxc: xc_domain_memory_mapping, handle cacheability

Add an additional parameter to xc_domain_memory_mapping to pass
cacheability information. The parameter values are the same as for the
XEN_DOMCTL_memory_mapping hypercall (0 is device memory, 1 is normal
memory). Pass CACHEABILITY_DEVMEM by default -- no changes in behavior.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
CC: ian.jackson@eu.citrix.com
CC: wei.liu2@citrix.com
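
Note (illustrative, not part of the patch): a Python sketch of how the
"devmem" and "memory" option strings could map to the numeric values
passed to the hypercall (0 for device memory, 1 for normal memory). The
constant name CACHEABILITY_MEMORY and the function name are assumptions;
only CACHEABILITY_DEVMEM is named in the commit messages.

```python
CACHEABILITY_DEVMEM = 0  # device memory, the default
CACHEABILITY_MEMORY = 1  # normal memory (constant name assumed)

_CACHEABILITY_BY_NAME = {
    "devmem": CACHEABILITY_DEVMEM,
    "memory": CACHEABILITY_MEMORY,
}


def parse_cacheability(option=None):
    """Translate the option string into its numeric value.

    A missing option falls back to device memory, so existing
    configurations keep their behavior.
    """
    if option is None:
        return CACHEABILITY_DEVMEM
    try:
        return _CACHEABILITY_BY_NAME[option]
    except KeyError:
        raise ValueError("unknown cacheability option %r" % option)
```
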
5 years agoxen: extend XEN_DOMCTL_memory_mapping to handle cacheability
Stefano Stabellini [Fri, 4 Jan 2019 20:47:02 +0000 (12:47 -0800)]
xen: extend XEN_DOMCTL_memory_mapping to handle cacheability

Reuse the existing padding field to pass cacheability information about
the memory mapping, specifically, whether the memory should be mapped as
normal memory or as device memory (this is what we have today).

Add a cacheability parameter to map_mmio_regions. 0 means device
memory, which is what we have today.

On ARM, map device memory as p2m_mmio_direct_dev (as it is already done
today) and normal memory as p2m_ram_rw.

On x86, return error if the cacheability requested is not device memory.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/arm: export shared memory regions as reserved-memory on device tree
Stefano Stabellini [Tue, 29 Jan 2019 18:58:06 +0000 (10:58 -0800)]
xen/arm: export shared memory regions as reserved-memory on device tree

Shared memory regions need to be advertised to the guest. Fortunately, a
device tree binding for special memory regions already exists:
reserved-memory.

Add a reserved-memory node for each shared memory region, for both
owners and borrowers.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/arm: zynqmp: Add RPLL and VPLL-related clocks to pm_clock2node map
Mirela Simonovic [Tue, 23 Oct 2018 14:51:24 +0000 (16:51 +0200)]
xen/arm: zynqmp: Add RPLL and VPLL-related clocks to pm_clock2node map

The current clock driver in Linux for Zynq MPSoC controls the PLLs as if they
are clocks (using the clock rather than the PLL EEMI API). Only RPLL and VPLL
may be directly controlled by a guest that owns the display port, because
the display port driver in Linux requires special clock frequencies for video
and audio, which in turn require VPLL and RPLL to be locked in fractional
mode (for video and audio respectively). Therefore, we need to allow a guest
that owns the display port to directly control these PLL-related clocks.

In the future, the Linux driver should switch to using the PLL EEMI API for
controlling PLLs; support for that is already added in the EEMI mediator in Xen.
Once that happens, this patch can be reverted.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/arm: zynqmp: Remove direct accesses to PLLs and their resets
Mirela Simonovic [Tue, 23 Oct 2018 14:51:23 +0000 (16:51 +0200)]
xen/arm: zynqmp: Remove direct accesses to PLLs and their resets

Only a limited number of PLLs can be controlled by guests, and that
has to be done using PLL EEMI APIs. Clean up the direct access options
for PLLs and their resets.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
5 years agoxen/arm: zynqmp: Remove MMIO r/w accesses to clock and PLL control
Mirela Simonovic [Tue, 23 Oct 2018 14:51:22 +0000 (16:51 +0200)]
xen/arm: zynqmp: Remove MMIO r/w accesses to clock and PLL control

Guests need to use clock/PLL EEMI API calls to query and control the
states of clocks/PLLs rather than MMIO read/write accesses. Therefore,
the gate for MMIO read/write accesses to clock/PLL control registers
has to be closed (done in this patch).

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
Acked-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/arm: zynqmp: Add PLL set mode/parameter EEMI API
Mirela Simonovic [Tue, 23 Oct 2018 14:51:21 +0000 (16:51 +0200)]
xen/arm: zynqmp: Add PLL set mode/parameter EEMI API

PLL set mode/parameter should be allowed only for VPLL and RPLL, and only
to a guest which uses the display port. This is the case because the display
port driver requires some very specific frequencies for video and audio,
so it relies on configuring VPLL and RPLL in fractional mode (for video
and audio respectively). These two PLLs are reserved for exclusive use
by the display port, or to be more specific: the clock framework of the guest
that owns the display port will need to directly control the modes of these
two PLLs and the power management framework should allow that.
The check is implemented using the domain_has_node_access() function, which
covers this use case because access to NODE_VPLL and NODE_RPLL is granted to
a guest which has access to the display port via the newly added entries
in the pm_node_access map.
If a guest is allowed to control a PLL, the request is passed through to
EL3. Otherwise, an error is returned.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
Reviewed-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/arm: zynqmp: Add PLL EEMI API definitions and passthrough get functions
Mirela Simonovic [Tue, 23 Oct 2018 14:51:20 +0000 (16:51 +0200)]
xen/arm: zynqmp: Add PLL EEMI API definitions and passthrough get functions

PLL get functions should be allowed to every guest because guests may
need to use these APIs to calculate the PLL output frequency. Thereby,
allow passthrough of get functions to every guest.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
Reviewed-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/arm: zynqmp: Implement checking and passthrough for clock control APIs
Mirela Simonovic [Tue, 23 Oct 2018 14:51:19 +0000 (16:51 +0200)]
xen/arm: zynqmp: Implement checking and passthrough for clock control APIs

The clock enable, disable, set parent and set divider EEMI APIs affect the
frequency of the target clock, so there has to be a permission check
to filter the calls that should not be permitted to a guest. To implement
the check, the clock-to-node map is introduced and implemented using the
pm_clock2node array (note that a clock can drive several nodes). Elements
of the pm_clock2node array have to be defined in order of increasing clock ID,
because the search algorithm relies on this assumption. Only clocks that
a guest could be allowed to control need to be represented in the pm_clock2node
array. Clocks that are not represented in the array should not be controllable
by any guest.
A guest will be granted the permission to control a clock only if all the nodes
driven by the target clock are owned by the guest. If the permission is granted,
the call is passed through to EL3. Otherwise, an error is returned.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
Reviewed-by: Stefano Stabellini <stefanos@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
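
Note (illustrative, not part of the patch): a Python model of the
clock-to-node permission check described above. The clock and node IDs
are made up, and the function names are hypothetical. Binary search is
used here as one search algorithm that depends on the table being sorted
by clock ID; the commit message does not say which algorithm the patch
uses.

```python
from bisect import bisect_left

# Hypothetical table: (clock_id, nodes driven by that clock),
# sorted by increasing clock ID as the search below requires.
PM_CLOCK2NODE = [
    (3, (10, 11)),
    (7, (12,)),
    (9, (10,)),
]
_CLOCK_IDS = [clock_id for clock_id, _nodes in PM_CLOCK2NODE]


def nodes_for_clock(clock_id):
    """Return the nodes driven by clock_id, or None if it is not listed."""
    i = bisect_left(_CLOCK_IDS, clock_id)
    if i < len(_CLOCK_IDS) and _CLOCK_IDS[i] == clock_id:
        return PM_CLOCK2NODE[i][1]
    return None


def domain_can_control_clock(owned_nodes, clock_id):
    """Grant control only if the guest owns every node the clock drives."""
    nodes = nodes_for_clock(clock_id)
    if nodes is None:
        # Clocks missing from the table are not controllable by any guest.
        return False
    return all(node in owned_nodes for node in nodes)
```
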
5 years agoxen/arm: zynqmp: Clock get EEMI API functions are allowed to each guest
Mirela Simonovic [Tue, 23 Oct 2018 14:51:18 +0000 (16:51 +0200)]
xen/arm: zynqmp: Clock get EEMI API functions are allowed to each guest

Each guest is allowed to query clock related information (get divisor
value, current clock parent or clock status). Guests may need to use
these APIs to construct the information about the partial clock tree
that they control or depend on - e.g. although a guest may not control
a clock it may need to calculate its frequency and these APIs are
necessary to enable the calculation.
If the provided clock ID is valid, the call is passed through to the
EL3. Otherwise, an error is returned.

The clock ID definitions are added in this patch. Although this patch
requires only the clock ID min and max values to check if a clock ID is valid,
the clock ID definitions are in general needed in Xen. This is because the
Xilinx clock driver implementation in Linux queries the clock tree topology
at runtime from firmware (ATF). The device tree does contain some information
about the clocks, but only leaf clock ID numbers and their binding to device
interfaces. The underlying software layers need to know everything else about
the clock tree. Since the clock topology resides in ATF and querying calls
are passed through by Xen, Xen at least needs to know about clock IDs to
be able to map them to nodes in order to determine clock-control permissions.
This clock-control permission checking will be added in a following patch.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
5 years agoxen/arm: zynqmp: Return not supported error for clock get/set rate API
Mirela Simonovic [Tue, 23 Oct 2018 14:51:17 +0000 (16:51 +0200)]
xen/arm: zynqmp: Return not supported error for clock get/set rate API

Clock get/set rate EEMI API should be implemented and mapped to clock
divisor, multiplexer, and gate related EEMI APIs by guests.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
Reviewed-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/arm: zynqmp: Fix power management status/error codes
Mirela Simonovic [Tue, 23 Oct 2018 14:51:16 +0000 (16:51 +0200)]
xen/arm: zynqmp: Fix power management status/error codes

Power management error codes were recently fixed in ATF and aligned
with PMU-FW definitions. Do the same for Xen.

Signed-off-by: Mirela Simonovic <mirela.simonovic@aggios.com>
Reviewed-by: Saeed Nowshadi <saeedn@xilinx.com>
Acked-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/eemi: proper bounds checks
Stefano Stabellini [Mon, 24 Sep 2018 23:07:33 +0000 (16:07 -0700)]
xen/eemi: proper bounds checks

ARRAY_SIZE(pm_node_access) and ARRAY_SIZE(pm_reset_access) are out of
bounds for indexes. addr == end in pm_mmio_access is also not valid.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
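
Note (illustrative, not part of the patch): the two bounds checks, written
as a Python sketch with hypothetical function names. An index equal to the
table length is rejected, and a region is treated as half-open, so an
address equal to its end is rejected too.

```python
def index_in_bounds(index, table):
    """Valid indexes run from 0 to len(table) - 1 inclusive."""
    return 0 <= index < len(table)


def addr_in_region(addr, start, end):
    """A region covers [start, end): addr == end is outside it."""
    return start <= addr < end
```
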
5 years agoxen: match VTCR_EL2 SL0 attribute in TTBCR
Stefano Stabellini [Thu, 6 Sep 2018 17:52:41 +0000 (10:52 -0700)]
xen: match VTCR_EL2 SL0 attribute in TTBCR

The SL0 attribute in TTBCR, which specifies the entry level in the page
table lookup, should be the same as the SL0 attribute in VTCR_EL2, given
that pagetables are shared between MMU and SMMU.

Make it so, by reading the value from VTCR_EL2, and setting reg
accordingly.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Tested-by: Upender Cherukupally <upender@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
5 years agoxen: platform: zynqmp: Add new eemi api IDs
Tejas Patel [Sat, 24 Feb 2018 15:47:11 +0000 (07:47 -0800)]
xen: platform: zynqmp: Add new eemi api IDs

New EEMI API IDs are added in ATF and Linux.
Sync EEMI API IDs of xen with Linux and ATF.

Signed-off-by: Tejas Patel <tejasp@xilinx.com>
Acked-by: Jolly Shah <jollys@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
5 years agoarch/arm64: zynqmp: Allow MMIO access to the CRF audio register
Alistair Francis [Thu, 16 Nov 2017 22:27:15 +0000 (14:27 -0800)]
arch/arm64: zynqmp: Allow MMIO access to the CRF audio register

Allow the guest to access the R_CRF_DP_AUDIO_REF_CTRL register. This
fixes warm reboot issues with newer kernels.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
5 years agoxen/arm: zynqmp: Use the USB XHCI areas to determine EEMI perms
Edgar E. Iglesias [Sun, 26 Mar 2017 21:37:13 +0000 (23:37 +0200)]
xen/arm: zynqmp: Use the USB XHCI areas to determine EEMI perms

Use the USB XHCI areas to determine EEMI perms.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
5 years agoxen/arm64: zynqmp: Regenerate LPD memmap
Edgar E. Iglesias [Sun, 26 Mar 2017 21:00:38 +0000 (23:00 +0200)]
xen/arm64: zynqmp: Regenerate LPD memmap

Regenerate LPD memmap.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
5 years agoxen/arm: zynqmp: Forward plaform specific firmware calls
Edgar E. Iglesias [Mon, 30 Jan 2017 16:36:42 +0000 (17:36 +0100)]
xen/arm: zynqmp: Forward plaform specific firmware calls

Implement an EEMI mediator and forward platform specific
firmware calls from guests to firmware.

The EEMI mediator is responsible for implementing access
controls, modifying or blocking calls that try to operate
on the setup of devices that are not under the calling guest's
control.

EEMI:
https://www.xilinx.com/support/documentation/user_guides/ug1200-eemi-api.pdf

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
5 years agodocs: documentation about static shared memory regions
Zhongze Liu [Wed, 5 Dec 2018 22:16:02 +0000 (14:16 -0800)]
docs: documentation about static shared memory regions

Author: Zhongze Liu <blackskygg@gmail.com>

Add docs to document the motivation, usage, use cases and other
relevant information about the static shared memory feature.

This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file". See:

  https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

The corresponding device tree binding is described by
Documentation/devicetree/bindings/reserved-memory/xen,shared-memory.txt.

Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: xen-devel@lists.xen.org
5 years agolibxl:xl: add parsing code to parse "libxl_static_sshm" from xl config files
Zhongze Liu [Wed, 5 Dec 2018 22:16:01 +0000 (14:16 -0800)]
libxl:xl: add parsing code to parse "libxl_static_sshm" from xl config files

Add the parsing utils for the newly introduced libxl_static_sshm struct
to the libxl/libxlu_* family, and add related parsing code in xl to
parse the struct from xl config files. This is for the proposal "Allow
setting up shared memory areas between VMs from xl config file" (see [1]).

[1] https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: xen-devel@lists.xen.org
5 years agolibxl: support unmapping static shared memory areas during domain destruction
Zhongze Liu [Wed, 5 Dec 2018 22:16:00 +0000 (14:16 -0800)]
libxl: support unmapping static shared memory areas during domain destruction

Add libxl__sshm_del to unmap static shared memory areas mapped by
libxl__sshm_add during domain creation. The unmapping process is:

* For an owner: decrease the refcount of the sshm region. If the refcount
  reaches 0, clean up the whole sshm path.

* For a borrower:
  1) Unmap the shared pages, and clean up related xs entries. If the
     system works normally, all the shared pages will be unmapped, so there
     won't be page leaks. In case of errors, the unmapping process will go
     on and unmap all the other pages that can be unmapped, so the other
     pages won't be leaked, either.
  2) Decrease the refcount of the sshm region. If the refcount reaches
     0, clean up the whole sshm path.

This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).

[1] https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: xen-devel@lists.xen.org
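
Note (illustrative, not part of the patch): a toy Python model of the
unmapping process described above. The class, the function names and the
use of OSError are hypothetical. The xenstore state is reduced to a
refcount and a flag.

```python
class SshmRegion:
    """Toy stand-in for the xenstore state of one shared region."""

    def __init__(self, refcount):
        self.refcount = refcount
        self.cleaned_up = False


def sshm_del(region, role, pages=(), unmap_page=None):
    """Tear down one domain's use of a region; return pages that failed."""
    failed = []
    if role == "borrower":
        for page in pages:
            try:
                unmap_page(page)
            except OSError:
                # Keep going so the remaining pages are still unmapped.
                failed.append(page)
    region.refcount -= 1
    if region.refcount == 0:
        # Last user gone: clean up the whole sshm path.
        region.cleaned_up = True
    return failed
```
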
5 years agolibxl: support mapping static shared memory areas during domain creation
Zhongze Liu [Tue, 29 Jan 2019 18:57:11 +0000 (10:57 -0800)]
libxl: support mapping static shared memory areas during domain creation

Add libxl__sshm_add to map shared pages from one DomU to another. The mapping
process involves the following steps:

  * Set defaults and check for further errors in the static_shm configs:
    overlapping areas, invalid ranges, duplicated owner domain,
    not page aligned, no owner domain etc.
  * Use xc_domain_add_to_physmap_batch to map the shared pages to borrowers
  * When some of the pages can't be successfully mapped, roll back any
    successfully mapped pages so that the system stays in a consistent state.
  * Write information about static shared memory areas into the appropriate
    xenstore paths and set the refcount of the shared region accordingly.

Temporarily mark this as unsupported on x86 because calling p2m_add_foreign on
two domU's is currently not allowed on x86 (see the comments in
x86/mm/p2m.c:p2m_add_foreign for more details).

This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).

[1] https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
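
Note (illustrative, not part of the patch): a Python sketch of two of the
steps above, the range checks and the rollback on partial failure. All
names are hypothetical, 4 KiB pages are assumed, and the map/unmap
callbacks stand in for the real mapping calls.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages


def check_regions(regions):
    """Validate (begin, end) byte ranges, end exclusive."""
    for begin, end in regions:
        if begin % PAGE_SIZE or end % PAGE_SIZE:
            raise ValueError("range is not page aligned")
        if begin >= end:
            raise ValueError("invalid range")
    ordered = sorted(regions)
    for (_, prev_end), (next_begin, _) in zip(ordered, ordered[1:]):
        if next_begin < prev_end:
            raise ValueError("overlapping areas")


def map_pages(pages, map_page, unmap_page):
    """Map every page, undoing the successful ones if any mapping fails."""
    mapped = []
    try:
        for page in pages:
            map_page(page)
            mapped.append(page)
    except Exception:
        # Roll back so the system stays in a consistent state.
        for page in reversed(mapped):
            unmap_page(page)
        raise
    return mapped
```
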
5 years agolibxl: introduce a new structure to represent static shared memory regions
Zhongze Liu [Tue, 29 Jan 2019 18:56:08 +0000 (10:56 -0800)]
libxl: introduce a new structure to represent static shared memory regions

Add a new structure to the IDL family to represent static shared memory regions
as proposed in the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).

[1] https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: xen-devel@lists.xen.org
5 years agoxen: xsm: flask: introduce XENMAPSPACE_gmfn_share for memory sharing
Zhongze Liu [Wed, 5 Dec 2018 22:15:57 +0000 (14:15 -0800)]
xen: xsm: flask: introduce XENMAPSPACE_gmfn_share for memory sharing

The existing XENMAPSPACE_gmfn_foreign subop of XENMEM_add_to_physmap forbids
Dom0 from mapping memory pages from one DomU to another, which rules out some
useful yet not dangerous use cases, such as sharing pages among DomU's so that
they can do shm-based communication.

This patch introduces XENMAPSPACE_gmfn_share to address this inconvenience,
which is mostly the same as XENMAPSPACE_gmfn_foreign but has its own xsm check.

Specifically, the patch:

* Introduces a new av permission MMU__SHARE_MEM to denote if two domains can
  share memory by using the new subop;
* Introduces xsm_map_gmfn_share() to check if (current) has proper permission
  over (t) AND MMU__SHARE_MEM is allowed between (d) and (t);
* Modifies the default xen.te to allow MMU__SHARE_MEM for normal domains that
  allow grant mapping/event channels.

The new subop is marked unsupported for x86 because calling p2m_add_foreign
on two DomU's is currently not supported on x86.

This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).

[1] https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Tim Deegan <tim@xen.org>
Cc: xen-devel@lists.xen.org
5 years agolibxc/restore: Fix error message for unrecognised stream version
Andrew Cooper [Tue, 17 Dec 2019 13:49:56 +0000 (13:49 +0000)]
libxc/restore: Fix error message for unrecognised stream version

The Expected and Got values are rendered in the wrong order.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
5 years agogolang/xenlight: implement keyed union C to Go marshaling
Nick Rosbrook [Mon, 16 Dec 2019 18:08:10 +0000 (18:08 +0000)]
golang/xenlight: implement keyed union C to Go marshaling

Switch over union key to determine how to populate 'union' in Go struct.

Since the unions of C types cannot be directly accessed in cgo, use a
typeof trick to typedef a struct in the cgo preamble that is analogous
to each inner struct of a keyed union. For example, to define a struct
for the hvm inner struct of libxl_domain_build_info, do:

  typedef typeof(((struct libxl_domain_build_info *)NULL)->u.hvm) libxl_domain_build_info_type_union_hvm;
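
The same trick can be tried in plain C outside of cgo. The sketch below is
illustrative only: `struct sketch_build_info` and its fields are hypothetical
stand-ins, not the libxl definitions, and `__typeof__` is the GCC/Clang
spelling of the extension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical keyed union, loosely shaped like an IDL-generated struct. */
struct sketch_build_info {
    int type;                               /* the union key */
    union {
        struct { int firmware; int acpi; } hvm;
        struct { int kernel; } pv;
    } u;
};

/*
 * Give the anonymous inner struct a name.  The NULL pointer is never
 * dereferenced: __typeof__ only inspects the type of the expression.
 */
typedef __typeof__(((struct sketch_build_info *)NULL)->u.hvm)
    sketch_build_info_hvm;

/* Code that wants to handle "just the hvm part" can now spell its type. */
static int sketch_read_acpi(const struct sketch_build_info *bi)
{
    assert(bi != NULL);

    const sketch_build_info_hvm *hvm = &bi->u.hvm;

    return hvm->acpi;
}
```

The typedef names the very same anonymous type, so no copy or cast is needed
to take a pointer to the inner struct.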

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: begin C to Go type marshaling
Nick Rosbrook [Mon, 16 Dec 2019 18:08:09 +0000 (18:08 +0000)]
golang/xenlight: begin C to Go type marshaling

Begin implementation of fromC marshaling functions for generated struct
types. This includes support for converting fields that are basic
primitive types such as string and integer types, nested anonymous
structs, nested libxl structs, and libxl built-in types.

This patch does not implement conversion of arrays or keyed unions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: remove no-longer used type MemKB
Nick Rosbrook [Mon, 16 Dec 2019 18:08:08 +0000 (18:08 +0000)]
golang/xenlight: remove no-longer used type MemKB

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: generate structs from the IDL
Nick Rosbrook [Mon, 16 Dec 2019 18:08:08 +0000 (18:08 +0000)]
golang/xenlight: generate structs from the IDL

Add struct and keyed union generation to gengotypes.py. For keyed unions,
use a method similar to gRPC's oneof to interpret C unions as Go types.
Meaning, for a given struct with a union field, generate a struct for
each sub-struct defined in the union. Then, define an interface of one
method which is implemented by each of the defined sub-structs. For
example:

  type domainBuildInfoTypeUnion interface {
          isdomainBuildInfoTypeUnion()
  }

  type DomainBuildInfoTypeUnionHvm struct {
      // HVM-specific fields...
  }

  func (x DomainBuildInfoTypeUnionHvm) isdomainBuildInfoTypeUnion() {}

  type DomainBuildInfoTypeUnionPv struct {
      // PV-specific fields...
  }

  func (x DomainBuildInfoTypeUnionPv) isdomainBuildInfoTypeUnion() {}

  type DomainBuildInfoTypeUnionPvh struct {
      // PVH-specific fields...
  }

  func (x DomainBuildInfoTypeUnionPvh) isdomainBuildInfoTypeUnion() {}

Then, remove existing struct definitions in xenlight.go that conflict
with the generated types, and modify existing marshaling functions to
align with the new type definitions. Notably, drop "time" package since
fields of type time.Duration are now of type uint64.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: re-factor Hwcap type implementation
Nick Rosbrook [Mon, 16 Dec 2019 18:08:07 +0000 (18:08 +0000)]
golang/xenlight: re-factor Hwcap type implementation

Re-define Hwcap as [8]uint32, and implement toC function. Also, re-name and
modify signature of toGo function to fromC.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: re-factor Uuid type implementation
Nick Rosbrook [Mon, 16 Dec 2019 18:08:06 +0000 (18:08 +0000)]
golang/xenlight: re-factor Uuid type implementation

Re-define Uuid as [16]byte and implement fromC, toC, and String functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define CpuidPolicyList builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:05 +0000 (18:08 +0000)]
golang/xenlight: define CpuidPolicyList builtin type

Define CpuidPolicyList as a string so that libxl_cpuid_parse_config can
be used in the toC function.

For now, fromC is a no-op since libxl does not support a way to read a
policy, modify it, and then give it back to libxl.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define EvLink builtin as empty struct
Nick Rosbrook [Mon, 16 Dec 2019 18:08:05 +0000 (18:08 +0000)]
golang/xenlight: define EvLink builtin as empty struct

Define EvLink as an empty struct, as there is currently no reason for the
internals of this type to be used in Go.

Implement fromC and toC functions as no-ops.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define MsVmGenid builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:04 +0000 (18:08 +0000)]
golang/xenlight: define MsVmGenid builtin type

Define MsVmGenid as [int(C.LIBXL_MS_VM_GENID_LEN)]byte and implement fromC and toC functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define Mac builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:03 +0000 (18:08 +0000)]
golang/xenlight: define Mac builtin type

Define Mac as [6]byte and implement fromC, toC, and String functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define StringList builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:08:02 +0000 (18:08 +0000)]
golang/xenlight: define StringList builtin type

Define StringList as []string and implement fromC and toC functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: re-name Bitmap marshaling functions
Nick Rosbrook [Mon, 16 Dec 2019 18:08:01 +0000 (18:08 +0000)]
golang/xenlight: re-name Bitmap marshaling functions

Re-name and modify signature of toGo function to fromC. The reason for
using 'fromC' rather than 'toGo' is that it is not a good idea to define
methods on the C types. Also, add error return type to Bitmap's toC function.

Finally, as code-cleanup, re-organize the Bitmap type's comments as per
Go conventions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
--
Changes in v2:
- Use consistent variable naming for slice created from
  libxl_bitmap.

5 years agogolang/xenlight: define KeyValueList as empty struct
Nick Rosbrook [Mon, 16 Dec 2019 18:08:01 +0000 (18:08 +0000)]
golang/xenlight: define KeyValueList as empty struct

Define KeyValueList as empty struct as there is currently no reason for
this type to be available in the Go package.

Implement fromC and toC functions as no-ops.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define Devid type as int
Nick Rosbrook [Mon, 16 Dec 2019 18:08:00 +0000 (18:08 +0000)]
golang/xenlight: define Devid type as int

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: define Defbool builtin type
Nick Rosbrook [Mon, 16 Dec 2019 18:07:59 +0000 (18:07 +0000)]
golang/xenlight: define Defbool builtin type

Define Defbool as a struct analogous to the C type, and define the type
'defboolVal' that represents true, false, and default defbool values.

Implement Set, Unset, SetIfDefault, IsDefault, Val, and String functions
on Defbool so that the type can be used in Go analogously to how it's
used in C.

Finally, implement fromC and toC functions.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agogolang/xenlight: generate enum types from IDL
Nick Rosbrook [Mon, 16 Dec 2019 18:07:59 +0000 (18:07 +0000)]
golang/xenlight: generate enum types from IDL

Introduce gengotypes.py to generate Go code from the IDL. As a first step,
implement 'enum' type generation.

As a result of the newly-generated code, remove the existing, and now
conflicting definitions in xenlight.go. In the case of the Error type,
rename the slice 'errors' to 'libxlErrors' so that it does not conflict
with the standard library package 'errors.' And, negate the values used
in 'libxlErrors' since the generated error values are negative.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agox86emul: correct far branch handling for 64-bit mode
Jan Beulich [Mon, 16 Dec 2019 16:37:09 +0000 (17:37 +0100)]
x86emul: correct far branch handling for 64-bit mode

AMD and friends explicitly specify that 64-bit operands aren't possible
for these insns. Nevertheless REX.W isn't fully ignored: It still
cancels a possible operand size override (0x66). Intel otoh explicitly
provides for 64-bit operands on the respective insn page of the SDM.
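
As a rough illustration of the vendor difference described above, the
following sketch computes the operand size of such an insn in 64-bit mode.
The enum, the function name and the 32-bit default are assumptions made for
the example; this is not the emulator's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical vendor split: Intel vs. "AMD and friends". */
enum sketch_vendor { SKETCH_INTEL, SKETCH_AMD_LIKE };

/* Operand size in bytes for a far branch style insn in 64-bit mode. */
static unsigned int sketch_far_op_bytes(enum sketch_vendor v,
                                        bool rex_w, bool opsz_66)
{
    assert(v == SKETCH_INTEL || v == SKETCH_AMD_LIKE);

    if (rex_w)
        /*
         * REX.W cancels a 0x66 override on both.  Only Intel then widens
         * the operand to 64 bits; AMD-like parts stay at 32 bits.
         */
        return v == SKETCH_INTEL ? 8 : 4;

    /* Without REX.W, 0x66 selects 16 bits, otherwise 32 bits. */
    return opsz_66 ? 2 : 4;
}
```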

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agopublic/io/ring.h: add FRONT/BACK_RING_ATTACH macros
Paul Durrant [Mon, 16 Dec 2019 16:36:37 +0000 (17:36 +0100)]
public/io/ring.h: add FRONT/BACK_RING_ATTACH macros

The version of this header present in the Linux source tree has contained
such macros for some time. These macros, as the names imply, allow front
or back rings to be set up for existent (rather than freshly created and
zeroed) shared rings.

This patch is to update this, the canonical version of the header, to
match the latest definition of these macros in the Linux source.

NOTE: The way the new macros are defined allows the FRONT/BACK_RING_INIT
      macros to be re-defined in terms of them, thereby reducing
      duplication.
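
The relationship between the two macro families can be sketched as follows.
The struct layouts and SKETCH_ names are simplified, hypothetical stand-ins
chosen for the example and are not the definitions in ring.h.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a shared ring page. */
struct sketch_sring {
    unsigned int req_prod, rsp_prod;
};

/* Simplified stand-in for the frontend's private view of the ring. */
struct sketch_front_ring {
    unsigned int req_prod_pvt;
    unsigned int rsp_cons;
    unsigned int nr_ents;
    struct sketch_sring *sring;
};

/* Attach to an existing shared ring, resuming at index 'idx'. */
#define SKETCH_FRONT_RING_ATTACH(r, s, idx, ents) do { \
    (r)->req_prod_pvt = (idx);                         \
    (r)->rsp_cons = (idx);                             \
    (r)->nr_ents = (ents);                             \
    (r)->sring = (s);                                  \
} while (0)

/* Initialising a private view of a fresh ring is attaching at index 0. */
#define SKETCH_FRONT_RING_INIT(r, s, ents) \
    SKETCH_FRONT_RING_ATTACH(r, s, 0, ents)

/* Function wrapper so that the arguments can be sanity checked. */
static void sketch_front_attach(struct sketch_front_ring *r,
                                struct sketch_sring *s,
                                unsigned int idx, unsigned int ents)
{
    assert(r != NULL && s != NULL);
    SKETCH_FRONT_RING_ATTACH(r, s, idx, ents);
}
```

A frontend re-attaching to a live ring would pass the index it wants to
resume from, whereas a freshly zeroed ring always starts at 0.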

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
5 years agox86emul: correct LFS et al handling for 64-bit mode
Jan Beulich [Mon, 16 Dec 2019 16:35:50 +0000 (17:35 +0100)]
x86emul: correct LFS et al handling for 64-bit mode

AMD and friends explicitly specify that 64-bit operands aren't possible
for these insns. Nevertheless REX.W isn't fully ignored: It still
cancels a possible operand size override (0x66). Intel otoh explicitly
provides for 64-bit operands on the respective insn page of the SDM.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: correct segment override decode for 64-bit mode
Jan Beulich [Mon, 16 Dec 2019 16:34:46 +0000 (17:34 +0100)]
x86emul: correct segment override decode for 64-bit mode

The legacy / compatibility mode ES, CS, SS, and DS overrides are fully
ignored prefixes in 64-bit mode, i.e. they in particular don't cancel an
earlier FS or GS one. (They don't violate the REX-prefix-must-be-last
rule though.)
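
A minimal sketch of the decode rule, assuming the prefix bytes have already
been collected into an array. The enum and function names are hypothetical;
only the prefix byte values are architectural.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum sketch_seg { SEG_NONE, SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_FS, SEG_GS };

/* Return the effective segment override after scanning all prefixes. */
static enum sketch_seg sketch_decode_override(const uint8_t *pfx, size_t n,
                                              bool mode_64bit)
{
    enum sketch_seg seg = SEG_NONE;

    assert(pfx != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        enum sketch_seg s;

        switch (pfx[i]) {
        case 0x26: s = SEG_ES; break;
        case 0x2e: s = SEG_CS; break;
        case 0x36: s = SEG_SS; break;
        case 0x3e: s = SEG_DS; break;
        case 0x64: s = SEG_FS; break;
        case 0x65: s = SEG_GS; break;
        default:   continue;            /* not a segment override prefix */
        }

        /*
         * In 64-bit mode ES/CS/SS/DS overrides are ignored outright, so
         * they must not replace an earlier FS or GS override.
         */
        if (mode_64bit && s != SEG_FS && s != SEG_GS)
            continue;

        seg = s;                        /* otherwise the last one wins */
    }

    return seg;
}
```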

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/time: drop vtsc_{kern, user}count debug counters
Igor Druzhinin [Fri, 13 Dec 2019 22:48:01 +0000 (22:48 +0000)]
x86/time: drop vtsc_{kern, user}count debug counters

They either need to be transformed to atomics to work correctly
(currently they are left unprotected for HVM domains) or dropped entirely
as taking a per-domain spinlock is too expensive for high-vCPU count
domains even for debug build given this lock is taken too often.

Choose the latter as they are not extremely important anyway.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/pv: Fix `global-pages` to match the documentation
Andrew Cooper [Mon, 16 Dec 2019 13:58:45 +0000 (13:58 +0000)]
x86/pv: Fix `global-pages` to match the documentation

c/s 5de961d9c09 "x86: do not enable global pages when virtualized on AMD or
Hygon hardware" in fact does.  Fix the calculation in pge_init().

While fixing this, adjust the command line documentation, first to use the
newer style, and to expand the description to discuss cases where the option
might be useful to use, but Xen can't account for by default.

Fixes: 5de961d9c09 ('x86: do not enable global pages when virtualized on AMD or Hygon hardware')
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: More descriptive names for page de/validation functions
George Dunlap [Thu, 12 Dec 2019 15:57:51 +0000 (15:57 +0000)]
x86/mm: More descriptive names for page de/validation functions

The functions alloc_page_type(), alloc_lN_table(), free_page_type()
and free_lN_table() are confusingly named: nothing is being allocated
or freed.  Rather, the page being passed in is being either validated
or devalidated for use as the specific type; in the specific case of
pagetables, these may be promoted or demoted (i.e., grab appropriate
references for PTEs).

Rename alloc_page_type() and free_page_type() to validate_page() and
devalidate_page().  Also rename alloc_segdesc_page() to
validate_segdesc_page(), since this is what it's doing.

Rename alloc_lN_table() and free_lN_table() to promote_lN_table() and
demote_lN_table(), respectively.

After this change:
- get / put type consistently refers to increasing or decreasing the count
- validate / devalidate consistently refers to actions done when a
type count goes 0 -> 1 or 1 -> 0
- promote / demote consistently refers to acquiring or freeing
resources (in the form of type refs and general references) in order
to allow a page to be used as a pagetable.

No functional change.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/mm: Use mfn_t in type get / put call tree
George Dunlap [Fri, 13 Dec 2019 14:09:46 +0000 (14:09 +0000)]
x86/mm: Use mfn_t in type get / put call tree

Replace `unsigned long` with `mfn_t` as appropriate throughout
alloc/free_lN_table, get/put_page_from_lNe, and
get_lN_linear_pagetable.  This obviates the need for a load of
`mfn_x()` and `_mfn()` casts.
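
For readers unfamiliar with the pattern, a typesafe frame number wrapper can
be sketched as below. The sketch_ names are hypothetical and the real mfn_t
is generated differently; the point is only that helpers which take and
return the wrapped type need no conversions at each call site.

```c
#include <assert.h>
#include <limits.h>

/* A struct wrapper makes mixing up frame numbers and plain integers an error. */
typedef struct { unsigned long m; } sketch_mfn_t;

/* Wrap a raw value (the role played by _mfn() in the text above). */
static inline sketch_mfn_t sketch_mfn(unsigned long m)
{
    sketch_mfn_t r = { m };

    return r;
}

/* Unwrap to a raw value (the role played by mfn_x() in the text above). */
static inline unsigned long sketch_mfn_x(sketch_mfn_t m)
{
    return m.m;
}

/* A helper working on the wrapped type keeps callers free of conversions. */
static inline sketch_mfn_t sketch_mfn_add(sketch_mfn_t m, unsigned long i)
{
    assert(i <= ULONG_MAX - m.m);       /* no wrap-around */

    return sketch_mfn(m.m + i);
}
```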

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/mm: Use a more descriptive name for pagetable mfns
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Use a more descriptive name for pagetable mfns

In many places, a PTE being modified is accompanied by the pagetable
mfn which contains the PTE (primarily in order to be able to maintain
linear mapping counts).  In many cases, this mfn is stored in the
nondescript variable (or argument) "pfn".

Replace these names with lNmfn, to indicate that 1) this is a
pagetable mfn, and 2) that it is the same level as the PTE in
question.  This should be enough to remind readers that it's the mfn
containing the PTE.

No functional change.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/mm: Implement common put_data_pages for put_page_from_l[23]e
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Implement common put_data_pages for put_page_from_l[23]e

Both put_page_from_l2e and put_page_from_l3e handle having superpage
entries by looping over each page and "put"-ing each one individually.
As with putting page table entries, this code is functionally
identical, but for some reason different.  Moreover, there is already
a common function, put_data_page(), to handle automatically swapping
between put_page() (for read-only pages) or put_page_and_type() (for
read-write pages).

Replace this with put_data_pages() (plural), which does the entire
loop, as well as the put_page / put_page_and_type switch.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: Refactor put_page_from_l*e to reduce code duplication
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Refactor put_page_from_l*e to reduce code duplication

put_page_from_l[234]e have identical functionality for devalidating an
entry pointing to a pagetable.  But mystifyingly, they duplicate the
code in slightly different arrangements that make it hard to tell that
it's the same.

Create a new function, put_pt_page(), which handles the common
functionality; and refactor all the functions to be symmetric,
differing only in the level of pagetable expected (and in whether they
handle superpages).

Other than put_page_from_l2e() gaining an ASSERT it probably should
have had already, no functional changes.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agopublic/io/netif.h: document a mechanism to advertise carrier state
Paul Durrant [Fri, 13 Dec 2019 16:39:44 +0000 (16:39 +0000)]
public/io/netif.h: document a mechanism to advertise carrier state

This patch adds a specification for a 'carrier' node in xenstore to allow
a backend to notify a frontend of its virtual carrier/link state. E.g.
a backend that is unable to forward packets from the guest because it is
not attached to a bridge may wish to advertise 'no carrier'.

While in the area also fix an erroneous backend path description.

NOTE: This is purely a documentation patch. No functional change.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
5 years agoConfig.mk: Remove stray comment
Anthony PERARD [Thu, 12 Dec 2019 18:27:34 +0000 (18:27 +0000)]
Config.mk: Remove stray comment

This comment isn't about CONFIG_TESTS, but about SEABIOS_DIR that has
been removed.

Originally, the comment was added by 5f82d0858de1 ("tools: support
SeaBIOS. Use by default when upstream qemu is configured."), then
later the SEABIOS_DIR was removed by 14ee3c05f3ef ("Clone and build
Seabios by default") but that comment about the pain was left behind.
The commit that made CONFIG_TESTS painful was 85896a7c4dc7 ("build:
add autoconf to replace custom checks in tools/check").

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoConfig.mk: Remove unused setvar_dir macro
Anthony PERARD [Thu, 12 Dec 2019 18:27:33 +0000 (18:27 +0000)]
Config.mk: Remove unused setvar_dir macro

And remove all mention of it in docs. It hasn't been used since
9ead9afcb935 ("Add configure --with-sysconfig-leaf-dir=SUBDIR to set
CONFIG_LEAF_DIR").

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agobuild: fix tools/configure in case only python3 exists
Juergen Gross [Wed, 11 Dec 2019 16:56:59 +0000 (17:56 +0100)]
build: fix tools/configure in case only python3 exists

Calling ./configure with python3 being there but no python,
tools/configure will fail. Fix that by defaulting to python and
falling back to python3 or python2.

While at it fix the use of non-portable "type -p" by replacing it by
AC_PATH_PROG().

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wl@xen.org>
[ wei: run autogen.sh ]
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agoAMD/IOMMU: Cease using a dynamic height for the IOMMU pagetables
Andrew Cooper [Wed, 11 Dec 2019 13:55:32 +0000 (14:55 +0100)]
AMD/IOMMU: Cease using a dynamic height for the IOMMU pagetables

update_paging_mode() has multiple bugs:

 1) Booting with iommu=debug will cause it to inform you that it was called
    without the pdev_list lock held.
 2) When growing by more than a single level, it leaks the newly allocated
    table(s) in the case of a further error.

Furthermore, the choice of default level for a domain has issues:

 1) All HVM guests grow from 2 to 3 levels during construction because of the
    position of the VRAM just below the 4G boundary, so defaulting to 2 is a
    waste of effort.
 2) The limit for PV guests doesn't take memory hotplug into account, and
    isn't dynamic at runtime like HVM guests.  This means that a PV guest may
    get RAM which it can't map in the IOMMU.

The dynamic height is a property unique to AMD, and adds a substantial
quantity of complexity for what is a marginal performance improvement.  Remove
the complexity by removing the dynamic height.

PV guests now get 3 or 4 levels based on any hotplug regions in the host.
This only makes a difference for hardware which previously had all RAM below
the 512G boundary, and a hotplug region above.

HVM guests now get 4 levels (which will be sufficient until 256TB guests
become a thing), because we don't currently have the information to know when
3 would be safe to use.
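
The 512G and 256TB figures follow from simple arithmetic, assuming 4k pages
and 512 entries (9 address bits) per pagetable level. The helpers below are
a hypothetical sketch of that calculation, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT    12 /* 4k pages */
#define SKETCH_BITS_PER_LVL   9 /* 512 entries per table */

/* Bytes of address space reachable with 'levels' levels of pagetables. */
static uint64_t sketch_reach(unsigned int levels)
{
    assert(levels >= 1 && levels <= 5);

    return (uint64_t)1 << (SKETCH_PAGE_SHIFT + SKETCH_BITS_PER_LVL * levels);
}

/* Smallest number of levels able to map every address below 'top'. */
static unsigned int sketch_levels_for(uint64_t top)
{
    unsigned int levels = 1;

    while (levels < 5 && sketch_reach(levels) < top)
        levels++;

    return levels;
}
```

With these assumptions, 3 levels reach 2^39 bytes (512G) and 4 levels reach
2^48 bytes (256TB), so any RAM or hotplug region above 512G needs the fourth
level.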

The overhead of this extra level is not expected to be noticeable.  It costs
one page (4k) per domain, and one extra IO-TLB paging structure cache entry
which is very hot and less likely to be evicted.

This is XSA-311.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: relinquish_memory: Grab an extra type ref when setting PGT_partial
George Dunlap [Mon, 28 Oct 2019 14:33:51 +0000 (14:33 +0000)]
x86/mm: relinquish_memory: Grab an extra type ref when setting PGT_partial

The PGT_partial bit in page->type_info holds both a type count and a
general ref count.  During domain tear-down, when free_page_type()
returns -ERESTART, relinquish_memory() correctly handles the general
ref count, but fails to grab an extra type count when setting
PGT_partial.  When this bit is eventually cleared, type_count underflows
and triggers the following BUG in page_alloc.c:free_domheap_pages():

    BUG_ON((pg[i].u.inuse.type_info & PGT_count_mask) != 0);

As far as we can tell, this page underflow cannot be exploited in any
other way: The page can't be used as a pagetable by the dying domain
because it's dying; it can't be used as a pagetable by any other
domain since it belongs to the dying domain; and ownership can't
transfer to any other domain without hitting the BUG_ON() in
free_domheap_pages().

(steal_page() won't work on a page in this state, since it requires
PGC_allocated to be set, and PGC_allocated will already have been
cleared.)

Fix this by grabbing an extra type ref if setting PGT_partial in
relinquish_memory.

This is part of XSA-310.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: alloc/free_lN_table: Retain partial_flags on -EINTR
George Dunlap [Thu, 31 Oct 2019 11:17:38 +0000 (11:17 +0000)]
x86/mm: alloc/free_lN_table: Retain partial_flags on -EINTR

When validating or de-validating pages (in alloc_lN_table and
free_lN_table respectively), the `partial_flags` local variable is
used to keep track of whether the "current" PTE started the entire
operation in a "may be partial" state.

One of the patches in XSA-299 addressed the fact that it is possible
for a previously-partially-validated entry to subsequently be found to
have invalid entries (indicated by returning -EINVAL); in which case
page->partial_flags needs to be set to indicate that the current PTE
may have the partial bit set (and thus _put_page_type() should be
called with PTF_partial_set).

Unfortunately, the patches in XSA-299 assumed that once
put_page_from_lNe() returned -ERESTART on a page, it was not possible
for it to return -EINTR.  This turns out to be true for
alloc_lN_table() and free_lN_table, but not for _get_page_type() and
_put_page_type(): both can return -EINTR when called on pages with
PGT_partial set.  In these cases, the page's PGT_partial will still be
set; failing to set partial_flags appropriately may allow an attacker
to do a privilege escalation similar to those described in XSA-299.

Fix this by always copying the local partial_flags variable into
page->partial_flags when exiting early.

NB that on the "get" side, no adjustment to nr_validated_entries is
needed: whether pte[i] is partially validated or entirely
un-validated, we want nr_validated_entries = i.  On the "put" side,
however, we need to adjust nr_validated_entries appropriately: if
pte[i] is entirely validated, we want nr_validated_entries = i + 1; if
pte[i] is partially validated, we want nr_validated_entries = i.
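
The bookkeeping rule can be written down as two tiny helpers. The names and
the 512-entry bound are assumptions made for this sketch; 'i' is the index
of the entry being processed when the operation was interrupted.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_ENTRIES 512u     /* entries per pagetable, assumed */

/* "get" side: entries [0, i) are validated whether or not pte[i] is partial. */
static unsigned int sketch_get_side_count(unsigned int i, bool partial)
{
    assert(i < SKETCH_ENTRIES);
    (void)partial;              /* makes no difference on this side */

    return i;
}

/* "put" side: a still fully validated pte[i] must be counted as well. */
static unsigned int sketch_put_side_count(unsigned int i, bool partial)
{
    assert(i < SKETCH_ENTRIES);

    return partial ? i : i + 1;
}
```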

This is part of XSA-310.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: Set old_guest_table when destroying vcpu pagetables
George Dunlap [Tue, 19 Nov 2019 11:40:34 +0000 (11:40 +0000)]
x86/mm: Set old_guest_table when destroying vcpu pagetables

Changeset 6c4efc1eba ("x86/mm: Don't drop a type ref unless you held a
ref to begin with"), part of XSA-299, changed the calling discipline
of put_page_type() such that if put_page_type() returned -ERESTART
(indicating a partially de-validated page), subsequent calls to
put_page_type() must be called with PTF_partial_set.  If called on a
partially de-validated page but without PTF_partial_set, Xen will
BUG(), because to do otherwise would risk opening up the kind of
privilege escalation bug described in XSA-299.

One place this was missed was in vcpu_destroy_pagetables().
put_page_and_type_preemptible() is called, but on -ERESTART, the
entire operation is simply restarted, causing put_page_type() to be
called on a partially de-validated page without PTF_partial_set.  The
result was that if such an operation were interrupted, Xen would hit a
BUG().

Fix this by having vcpu_destroy_pagetables() consistently pass off
interrupted de-validations to put_old_page_type():
- Unconditionally clear references to the page, even if
  put_page_and_type failed
- Set old_guest_table and old_guest_table_partial appropriately

While here, do some refactoring:

 - Move clearing of arch.cr3 to the top of the function

 - Now that clearing is unconditional, move the unmap to the same
   conditional as the l4tab mapping.  This also allows us to reduce
   the scope of the l4tab variable.

 - Avoid code duplication by looping to drop references on
   guest_table_user

This is part of XSA-310.

Reported-by: Sarah Newman <srn@prgmr.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/mm: Don't reset linear_pt_count on partial validation
George Dunlap [Wed, 30 Oct 2019 17:05:28 +0000 (17:05 +0000)]
x86/mm: Don't reset linear_pt_count on partial validation

"Linear pagetables" is a technique which involves either pointing a
pagetable at itself, or to another pagetable of the same or higher level.
Xen has limited support for linear pagetables: A page may either point
to itself, or point to another page of the same level (i.e., L2 to L2,
L3 to L3, and so on).

XSA-240 introduced an additional restriction that limited the "depth"
of such chains by allowing pages to either *point to* other pages of
the same level, or *be pointed to* by other pages of the same level,
but not both.  To implement this, we keep track of the number of
outstanding times a page points to or is pointed to by another page
table, to prevent both from happening at the same time.

Unfortunately, the original commit introducing this reset this count
when resuming validation of a partially-validated pagetable, dropping
some "linear_pt_entry" counts.

On debug builds on systems where guests used this feature, this might
lead to crashes that look like this:

    Assertion 'oc > 0' failed at mm.c:874

Worse, if an attacker could engineer such a situation to occur, they
might be able to make loops or other arbitrary chains of linear
pagetables, leading to the denial-of-service situation outlined in
XSA-240.

This is XSA-309.

Reported-by: Manuel Bouyer <bouyer@antioche.eu.org>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/vtx: Work around SingleStep + STI/MovSS VMEntry failures
Andrew Cooper [Wed, 11 Dec 2019 13:09:30 +0000 (14:09 +0100)]
x86/vtx: Work around SingleStep + STI/MovSS VMEntry failures

See patch comment for technical details.

Concerning the timeline, this was first discovered in the aftermath of
XSA-156 which caused #DB to be intercepted unconditionally, but only in
its SingleStep + STI form which is restricted to privileged software.

After working with Intel and identifying the problematic vmentry check,
this workaround was suggested, and the patch was posted in an RFC
series.  Outstanding work for that series (not breaking Introspection)
is still pending, and this fix from it (which wouldn't have been good
enough in its original form) wasn't committed.

A vmentry failure was reported to xen-devel, and debugging identified
this bug in its SingleStep + MovSS form by way of INT1, which does not
involve the use of any privileged instructions, and proving this to be a
security issue.

This is XSA-308.

Reported-by: Håkon Alstadheim <hakon@alstadheim.priv.no>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
5 years agox86+Arm32: make find_next_{,zero_}bit() have well defined behavior
Jan Beulich [Wed, 11 Dec 2019 13:06:18 +0000 (14:06 +0100)]
x86+Arm32: make find_next_{,zero_}bit() have well defined behavior

These functions getting used with the 2nd and 3rd arguments being equal
wasn't well defined: Arm64 reliably returns the value of the 2nd
argument in this case, while on x86 for bitmaps up to 64 bits wide the
return value was undefined (due to the undefined behavior of a shift of
a value by the number of bits it's wide) when the incoming value was 64.
On Arm32 an actual out of bounds access would happen when the
size/offset value is a multiple of 32; if this access doesn't fault, the
return value would have been sufficiently correct afaict.

Make the functions consistently tolerate the last two arguments being
equal (and in fact the 3rd argument being greater or equal to the 2nd),
in favor of finding and fixing all the use sites that violate the
original more strict assumption.
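
A portable sketch of the tolerant behaviour, for illustration only. The
function name and the bit-at-a-time loop are assumptions of the example and
bear no relation to the optimised per-architecture implementations.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Return the index of the first set bit at or after 'offset', or 'size'
 * if there is none.  offset >= size is tolerated and yields 'size'.
 */
static size_t sketch_find_next_bit(const unsigned long *map, size_t size,
                                   size_t offset)
{
    assert(map != NULL);

    /* Early exit: no shift by the full word width, no out-of-bounds read. */
    if (offset >= size)
        return size;

    for (size_t bit = offset; bit < size; bit++) {
        size_t word = bit / SKETCH_BITS_PER_LONG;
        size_t pos = bit % SKETCH_BITS_PER_LONG;    /* always < word width */

        if (map[word] & (1UL << pos))
            return bit;
    }

    return size;
}
```

Callers that loop with "bit = find(map, size, bit + 1)" then terminate
cleanly when the last bit of the bitmap is set, without special casing.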

This is XSA-307.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien@xen.org>
5 years agoConfig.mk: update seabios to 1.13.0
Wei Liu [Wed, 11 Dec 2019 12:02:26 +0000 (12:02 +0000)]
Config.mk: update seabios to 1.13.0

Signed-off-by: Wei Liu <wl@xen.org>
5 years agox86: add a comment regarding the location of hypervisor_probe
Wei Liu [Wed, 11 Dec 2019 11:33:03 +0000 (11:33 +0000)]
x86: add a comment regarding the location of hypervisor_probe

Signed-off-by: Wei Liu <liuwe@microsoft.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoSUPPORT.md: add core scheduling
Juergen Gross [Wed, 11 Dec 2019 08:45:49 +0000 (09:45 +0100)]
SUPPORT.md: add core scheduling

Add core scheduling feature to SUPPORT.md.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agodocs/sphinx: How Xen Boots on x86
Andrew Cooper [Sat, 19 Oct 2019 19:12:44 +0000 (12:12 -0700)]
docs/sphinx: How Xen Boots on x86

Begin to document how the x86 build of Xen boots.  It is by no means complete,
but is a start.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/build: Automatically locate a suitable python interpreter
Andrew Cooper [Sat, 7 Dec 2019 15:50:22 +0000 (15:50 +0000)]
xen/build: Automatically locate a suitable python interpreter

Needing to pass PYTHON=python3 into hypervisor builds is irritating and
unnecessary.  Locate a suitable interpreter automatically, defaulting to Py3
if it is available.

Reported-by: Steven Haigh <netwiz@crc.id.au>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
5 years agoxen/banner: Drop the fig-to-oct.py script
Andrew Cooper [Sat, 7 Dec 2019 17:45:10 +0000 (17:45 +0000)]
xen/banner: Drop the fig-to-oct.py script

The script is 664 rather than 775, so the banner conversion doesn't actually
work if $(PYTHON) is empty:

  /bin/sh: tools/fig-to-oct.py: Permission denied
  make[3]: *** [include/xen/compile.h] Error 126
  make[3]: Leaving directory `/builds/xen-project/people/andyhhp/xen/xen'

Fixing this is easy, but using python here is wasteful.  compile.h doesn't
need XEN_BANNER rendering in octal, and text is much simpler to handle.
Replace fig-to-oct.py with a smaller sed script.  This could be a shell
one-liner, but a separate script is much simpler to comment sensibly, and
doesn't add the cognitive load of makefile and shell escaping.

While changing this logic, take the opportunity to optimise the banner
space (and time on the serial port) by dropping trailing whitespace, which is
84 characters for current staging.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agoxen/flask: Drop the gen-policy.py script
Andrew Cooper [Sat, 7 Dec 2019 16:20:55 +0000 (16:20 +0000)]
xen/flask: Drop the gen-policy.py script

The script is Python 2 specific, and fails with string/binary issues with
Python 3:

  Traceback (most recent call last):
    File "gen-policy.py", line 14, in <module>
      for char in sys.stdin.read():
    File "/usr/lib/python3.5/codecs.py", line 321, in decode
      (result, consumed) = self._buffer_decode(data, self.errors, final)
  UnicodeDecodeError: 'utf-8' codec can't decode byte 0x8c in position 0: invalid start byte

Fixing the script to be compatible isn't hard, but using python here is
wasteful.  Drop the script entirely, and write an equivalent flask-policy.S
instead.  This removes the need for a $(PYTHON) and $(CC) pass.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Julien Grall <julien@xen.org>
Release-acked-by: Juergen Gross <jgross@suse.com>
5 years agoremove myself as vm_event maintainer
Razvan Cojocaru [Tue, 10 Dec 2019 10:34:33 +0000 (11:34 +0100)]
remove myself as vm_event maintainer

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
5 years agox86: do not enable global pages when virtualized on AMD or Hygon hardware
Roger Pau Monné [Tue, 10 Dec 2019 10:34:00 +0000 (11:34 +0100)]
x86: do not enable global pages when virtualized on AMD or Hygon hardware

When using global pages, a full TLB flush can only be performed by
toggling the PGE bit in CR4, which is usually quite expensive in terms
of performance when running virtualized. This is especially relevant on
AMD or Hygon hardware, which doesn't have the ability to do selective
CR4 trapping, but can also be relevant on e.g. Intel if the underlying
hypervisor also traps accesses to the PGE CR4 bit.

In order to avoid this performance penalty, do not use global pages
when running virtualized on AMD or Hygon hardware. A command line option
'global-pages' is provided in order to allow the user to select whether
global pages will be enabled for PV guests.
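
The default described above can be sketched as a small decision function. This is an illustration only, not the code from the patch: the enum names, the tri-state shape of the override and the function name `use_global_pages` are all assumptions made for the example.

```c
#include <stdbool.h>

/* Hypothetical vendor identifiers, for illustration only. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_HYGON };

/* Assumed tri-state view of a 'global-pages' command line override. */
enum gp_option { GP_DEFAULT, GP_FORCE_OFF, GP_FORCE_ON };

/*
 * Decide whether PV guests get global pages.  An explicit user choice
 * wins; otherwise global pages are disabled only when running
 * virtualized on AMD or Hygon hardware.
 */
static bool use_global_pages(enum gp_option opt, bool virtualized,
                             enum cpu_vendor vendor)
{
    if (opt != GP_DEFAULT)
        return opt == GP_FORCE_ON;

    return !(virtualized &&
             (vendor == VENDOR_AMD || vendor == VENDOR_HYGON));
}
```

Under these assumptions, a virtualized AMD or Hygon system defaults to no global pages, while bare metal and virtualized Intel systems keep them unless the user overrides the choice.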

The figures below are from a PV shim running on AMD hardware with
32 vCPUs:

PGE enabled, x2APIC mode:

(XEN) Global lock flush_lock: addr=ffff82d0804b01c0, lockval=1adb1adb, not locked
(XEN)   lock:1841883(1375128998543), block:1658716(10193054890781)

Average lock time:   746588ns
Average block time: 6145147ns

PGE disabled, x2APIC mode:

(XEN) Global lock flush_lock: addr=ffff82d0804af1c0, lockval=a8bfa8bf, not locked
(XEN)   lock:2730175(657505389886), block:2039716(2963768247738)

Average lock time:   240829ns
Average block time: 1453029ns

As seen from the above figures, the lock time of the flush lock is
reduced to approximately 1/3 of the original value, and the block time
to approximately 1/4.
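
The averages quoted above can be reproduced with a short calculation. The sketch assumes that each `lock:N(T)` and `block:N(T)` pair in the lock profiling output is an event count N followed by a cumulative time T in nanoseconds; that reading is an inference from the numbers, and `average_ns` is a hypothetical helper name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: integer average of a cumulative time over a
 * number of events, returning 0 when no events were recorded.
 */
static uint64_t average_ns(uint64_t total_ns, uint64_t count)
{
    return count ? total_ns / count : 0;
}
```

With PGE enabled, `average_ns(1375128998543, 1841883)` gives 746588 and `average_ns(10193054890781, 1658716)` gives 6145147. With PGE disabled, `average_ns(657505389886, 2730175)` gives 240829 and `average_ns(2963768247738, 2039716)` gives 1453029. The resulting ratios are about 0.32 for lock time and about 0.24 for block time.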

Note that XEN_MINIMAL_CR4 and mmu_cr4_features are not modified, and
thus global pages are left enabled for the hypervisor. This is not an
issue because the code to switch the control registers (cr3 and cr4)
already takes such a situation into account and performs the necessary
flushes. The same already happens when using XPTI or PCIDE, as the
guest cr4 doesn't have global pages enabled in that case either.

Also note that the suspend and resume code is correct in writing
mmu_cr4_features into cr4 on resume, since that's the cr4 used by the
idle vCPU which is the context used by the suspend and resume routine.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/AMD: unbreak CPU hotplug on AMD systems without RstrFpErrPtrs
Igor Druzhinin [Tue, 10 Dec 2019 10:07:22 +0000 (11:07 +0100)]
x86/AMD: unbreak CPU hotplug on AMD systems without RstrFpErrPtrs

If the feature is not present, Xen will try to force the X86_BUG_FPU_PTRS
feature at CPU identification time. This is especially noticeable in
PV-shim, which usually hotplugs its vCPUs. We either need to restrict this
action to the boot CPU only or allow secondary CPUs to modify
forced CPU capabilities at runtime. Choose the former, since modifying
forced capabilities outside the boot path leaves the system in a
potentially inconsistent state.
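
The chosen approach can be modelled in a few lines. This is a simplified illustration, not the actual patch: `struct cpu_caps`, `maybe_force_fpu_ptrs_bug` and its parameters are hypothetical stand-ins for the real CPU identification code and capability bitmaps.

```c
#include <stdbool.h>

/* Hypothetical model of the set of forced CPU capabilities. */
struct cpu_caps {
    bool forced_fpu_ptrs_bug;
};

/*
 * Force the FPU pointers bug flag only while identifying the boot CPU.
 * Secondary (for example hotplugged) CPUs never touch the forced set.
 * Returns true if the forced capabilities were modified.
 */
static bool maybe_force_fpu_ptrs_bug(struct cpu_caps *forced,
                                     bool is_boot_cpu,
                                     bool has_rstr_fp_err_ptrs)
{
    if (!is_boot_cpu || has_rstr_fp_err_ptrs)
        return false;

    forced->forced_fpu_ptrs_bug = true;
    return true;
}
```

In this model, identifying a hotplugged CPU that lacks the feature leaves the forced set untouched, so the forced capabilities only ever change on the boot path.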

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/i8259A: don't open-code LEGACY_VECTOR()
Jan Beulich [Mon, 9 Dec 2019 13:03:01 +0000 (14:03 +0100)]
x86/i8259A: don't open-code LEGACY_VECTOR()

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agolz4: fix system halt at boot kernel on x86_64
Krzysztof Kolasa [Mon, 9 Dec 2019 13:02:35 +0000 (14:02 +0100)]
lz4: fix system halt at boot kernel on x86_64

Sometimes, on x86_64, decompression fails with the following
error:

Decompressing Linux...

Decoding failed

 -- System halted

This condition is not needed for a 64-bit kernel (from commit d5e7caf):

if( ... ||
    (op + COPYLENGTH) > oend)
    goto _output_error

The LZ4_SECURE_COPY() macro tests op and does not copy any data
when op exceeds the value.

This was added by analogy to lz4_uncompress_unknownoutputsize(...).

Signed-off-by: Krzysztof Kolasa <kkolasa@winsoft.pl>
[Linux commit 99b7e93c95c78952724a9783de6c78def8fbfc3f]

The offending commit in our case is fcc17f96c277 ("LZ4 : fix the data
abort issue").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agolz4: pull out constant tables
Rasmus Villemoes [Mon, 9 Dec 2019 13:01:56 +0000 (14:01 +0100)]
lz4: pull out constant tables

There's no reason to allocate the dec{32,64}table on the stack; it
just wastes a bunch of instructions setting them up and, of course,
also consumes quite a bit of stack. Using size_t for such small
integers is a little excessive.

$ scripts/bloat-o-meter /tmp/built-in.o lib/built-in.o
add/remove: 2/2 grow/shrink: 2/0 up/down: 1304/-1548 (-244)
function                                     old     new   delta
lz4_decompress_unknownoutputsize              55     718    +663
lz4_decompress                                55     632    +577
dec64table                                     -      32     +32
dec32table                                     -      32     +32
lz4_uncompress                               747       -    -747
lz4_uncompress_unknownoutputsize             801       -    -801

The now inlined lz4_uncompress functions used to have a stack
footprint of 176 bytes (according to -fstack-usage); their inlinees
have increased their stack use from 32 bytes to 48 and 80 bytes,
respectively.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
[Linux commit bea2b592fd18eb8ffa3fc4ad380610632d03a38f]

Use {,u}int8_t instead of plain "int" for the tables.
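
As a rough sketch of the pattern, the tables can live in read-only data at file scope with narrow element types, so that no per-call stack setup is needed. The names with a `_sketch` suffix and the accessor functions are hypothetical, and the table contents are shown for illustration only; they should be checked against the actual source before being relied upon.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * File-scope constant tables: placed in read-only data once instead of
 * being rebuilt on the stack on every call.  Contents are illustrative.
 */
static const uint8_t dec32table_sketch[8] = { 0, 3, 2, 3, 0, 0, 0, 0 };
static const int8_t  dec64table_sketch[8] = { 0, 0, 0, -1, 0, 1, 2, 3 };

/* Hypothetical accessors; an out-of-range index yields 0. */
static unsigned int dec32_step(size_t idx)
{
    return idx < 8 ? dec32table_sketch[idx] : 0;
}

static int dec64_step(size_t idx)
{
    return idx < 8 ? dec64table_sketch[idx] : 0;
}
```

Only one of the two tables needs a signed element type, which is why a mix of `uint8_t` and `int8_t` is enough; each table then occupies 8 bytes in this sketch.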

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>