Luca Miccio [Thu, 22 Aug 2019 09:57:14 +0000 (11:57 +0200)]
xen/color_alloc: lower default for buddy reservation
When cache coloring is enabled, a certain amount of memory is reserved
for buddy allocation. The current default value of 512 MB was set to
enable full interchangeability between the two allocators, to allow
gradual introduction of the coloring support patchset. As of this commit,
the colored allocator is used for dom0, domUs and Xen, while the buddy
manages only Xen heap pages. The memory reserved to the buddy could be
thus lowered to a reasonably small value.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Thu, 22 Aug 2019 09:56:25 +0000 (11:56 +0200)]
xen/common vmap: fix alternative with coloring support
The alternative module remaps Xen code (.text) with a physically contiguous
mapping. If Xen code is colored, the current remap will break the
alternative system. Fix this problem by remapping Xen code using its
color selection.
Notes:
A more desirable way to solve this issue is to create a common function
that can be used both with and without coloring support. This avoids
having code inside #ifdefs. A first implementation is a WIP, but early
testing shows that in some cases Linux has booting issues.
Keep this working code as it is now until we have a more stable solution.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Fri, 4 Oct 2019 14:43:34 +0000 (16:43 +0200)]
xen/arch: add coloring support for Xen
Introduce a new implementation of setup_pagetables that uses coloring
logic in order to isolate Xen code using its color selection.
Page tables construction is essentially copied, except for the xenmap
table, where coloring logic is needed. Given the absence of a contiguous
physical mapping, pointers to next level tables need to be manually
calculated.
Xen code is relocated in strided mode using the same coloring logic as
the one in xenmap table by using a temporary colored mapping that will
be destroyed after switching the TTBR register.
Keep Xen text section mapped in the newly created pagetables.
The boot process relies on computing needed physical addresses of Xen
code by using a shift, but colored mapping is not linear and not easily
computable. Therefore, the old Xen code is temporarily kept and used to
boot secondary CPUs until they switch to the colored mapping, which is
accessed using the handy macro virt_old. After the boot process, the old
Xen code memory is reset and its mapping is destroyed.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Thu, 10 Oct 2019 14:51:01 +0000 (16:51 +0200)]
xen/arm: add argument to remove_early_mappings
Upcoming patches will need to remove temporary mappings created during
Xen coloring process. The function remove_early_mappings does what we
need but it is case-specific. Parametrize the function to avoid code
replication.
Luca Miccio [Thu, 22 Aug 2019 08:21:56 +0000 (10:21 +0200)]
xen/arm: add size parameter to get_xen_paddr
In order to efficiently relocate Xen while coloring it, instead of many
memory regions, a single one can be mapped as the target, as long as it
includes all the pages of the selected colors. This means that in the
worst case the target region must be greater than (Xen code size) *
(number of available colors). To place this region as high as possible in
RAM, we use the get_xen_paddr function. However, the latter assumes that
the region to place has the size of the Xen code region only. Add a new
"size" parameter to also handle the coloring case.
During Xen coloring procedure, we need to manually calculate consecutive
physical addresses that conform to the color selection. Add a helper
function that does this operation. The latter will return the next
address that conforms to Xen color selection.
The next_colored function is architecture dependent and the provided
implementation is for ARMv8.
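
As an illustration only, such a helper could be sketched as follows. The
signature, the 4 KiB page size and the convention of returning the address
itself when it already conforms are assumptions made for this example, not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                           /* assumed 4 KiB pages */
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/*
 * Hypothetical sketch: return the lowest page-aligned address >= addr
 * whose color (the address bits selected by mask) is in colors[].
 * Returns UINT64_MAX if no selected color is reachable with this mask.
 */
uint64_t next_colored(uint64_t addr, uint64_t mask,
                      const unsigned int *colors, unsigned int n)
{
    uint64_t page = (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);
    unsigned int total = (unsigned int)(mask >> PAGE_SHIFT) + 1;

    assert(colors != 0 && n > 0);

    /* Colors repeat every 'total' pages, so one full period is enough. */
    for (unsigned int i = 0; i < total; i++, page += PAGE_SIZE) {
        unsigned int c = (unsigned int)((page & mask) >> PAGE_SHIFT);

        for (unsigned int j = 0; j < n; j++)
            if (colors[j] == c)
                return page;
    }
    return UINT64_MAX;
}
```

With 16 colors (mask 0xF000) and the selection {0, 1}, an address of color 2
is advanced to the start of the next color period.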
Signed-off-by: Luca Miccio <206496@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Fri, 22 Nov 2019 19:27:00 +0000 (20:27 +0100)]
xen/arch: init cache coloring conf for Xen
Add initialization for Xen coloring data. By default, use the lowest
color index available.
Benchmarking the VM interrupt response time provides an estimation of
LLC usage by Xen's most latency-critical runtime task. Results on Arm
Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
reserves 64 KiB of L2, is enough to attain best responsiveness.
More colors are instead very likely to be needed on processors whose L1
cache is physically-indexed and physically-tagged, such as Cortex-A57.
In such cases, coloring applies to L1 also, and there typically are two
distinct L1-colors. Therefore, reserving only one color for Xen would
senselessly partition a cache memory that is already private, i.e.
underutilize it. The default amount of Xen colors is thus set to one.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 17:10:06 +0000 (19:10 +0200)]
tools: add support for cache coloring configuration
Add a new "colors" parameter that defines the color assignment for a
domain. The user can specify one or more color ranges using the same
syntax as the command line color selection (e.g. 0-4).
The parameter is defined as a list of strings that represent the
color ranges.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Thu, 3 Oct 2019 17:04:57 +0000 (19:04 +0200)]
xen/arm: initialize cache coloring data for Dom0/U
Initialize cache coloring configuration during domain creation. If no
color assignment is provided by the user, use the default one.
The default configuration is the one assigned to Dom0. The latter is
configured as a standard domain with default configuration.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Thu, 3 Oct 2019 17:01:08 +0000 (19:01 +0200)]
xen/include: define new handle param
During the domU creation process the color selection has to be passed to
the Xen hypercall. This selection is defined as an array. We thus need
to pass the pointer to Xen and make it accessible in its memory space.
This is generally done using what Xen calls GUEST_HANDLE_PARAMS. Add a
new parameter that allows us to pass both the colors array and the
number of elements in it.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 15:33:26 +0000 (17:33 +0200)]
xen/arch: check color selection function
Dom0 color configuration is parsed in the Xen command line. Add a
helper function to check the user selection. If no configuration is
provided by the user, all the available colors supported by the
hardware will be assigned to dom0.
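
A minimal sketch of the check and of the fallback, with hypothetical function
names and under the assumption that a selection is valid when it is non-empty
and every id is below the number of colors the hardware supports:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical check: non-empty selection, every id below max_colors. */
bool check_colors(const unsigned int *colors, unsigned int n,
                  unsigned int max_colors)
{
    if (n == 0 || n > max_colors)
        return false;
    for (unsigned int i = 0; i < n; i++)
        if (colors[i] >= max_colors)
            return false;
    return true;
}

/* Hypothetical fallback: assign every available color, 0..max_colors-1. */
unsigned int default_dom0_colors(unsigned int *colors,
                                 unsigned int max_colors)
{
    assert(colors != 0);
    for (unsigned int i = 0; i < max_colors; i++)
        colors[i] = i;
    return max_colors;
}
```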
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 16:13:27 +0000 (18:13 +0200)]
xen/arch: add default colors selection function
When cache coloring support is enabled, a color assignment is needed for
every domain. Introduce a function computing a default configuration
with a safe and common value -- the dom0 color selection.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 15:23:25 +0000 (17:23 +0200)]
xen/common: use colored allocator and reverse domains page list
Add colored heap when cache coloring is enabled.
Manage domain page lists as a queue instead of a stack -- pages are
extracted in the same order as they were previously inserted in the
list. This allows quick insertion of freed pages into the colored
allocator internal data structures -- sorted lists.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 15:13:41 +0000 (17:13 +0200)]
xen/color alloc: implement color_from_page for ARM64
The colored allocator does not make any assumptions on how a color is
defined, since the definition may change depending on the architecture.
Add a definition for ARMv8 architectures.
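
For illustration, one plausible definition, assuming 4 KiB pages and a color
bitmask that starts at the page offset boundary (the function name and
signature here are hypothetical and work on a physical address rather than on
a page structure):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumed 4 KiB pages */

/*
 * Hypothetical sketch: the color is the value of the physical address
 * bits selected by addr_col_mask, shifted down to start at bit 0.
 */
unsigned int color_from_addr(uint64_t paddr, uint64_t addr_col_mask)
{
    assert((addr_col_mask & ((UINT64_C(1) << PAGE_SHIFT) - 1)) == 0);
    return (unsigned int)((paddr & addr_col_mask) >> PAGE_SHIFT);
}
```

With a mask of 0xF000 (16 colors), addresses 0x40003000 and 0x40013000 both
map to color 3: pages one way size apart share a color.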
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 28 Aug 2019 19:50:03 +0000 (21:50 +0200)]
xen/common alloc: release boot regions in order
Release the boot page regions in address-increasing order. This allows
quick insertion of freed pages into the colored allocator's internal
data structures -- sorted lists.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 15:09:56 +0000 (17:09 +0200)]
xen/arm: add colored allocator initialization
Initialize colored heap and allocator data structures. It is assumed
that pages are given to the init function in ascending order. To
ensure that, pages are retrieved from bootmem_regions starting from the
first one.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 14:44:10 +0000 (16:44 +0200)]
xen/arch: introduce cache-coloring allocator
Introduce a new memory page allocator that implements the cache coloring
mechanism. The allocation algorithm follows the given coloring scheme
specified for each guest, and maximizes contiguity in the page
selection.
Pages are stored by color in separate, address-ordered lists that
are collectively called the colored heap. These lists will be populated
by a simple initialisation function, which, for any available page,
computes its color and inserts it in the corresponding list. When a
domain requests a page, the allocator takes one from the subset of lists
whose colors equal the domain configuration. It chooses the highest
page element among the last elements of such lists. This ordering
guarantees that contiguous pages are sequentially allocated, if this is
made possible by a color assignment which includes adjacent ids.
The allocator can handle only requests with order equal to 0 since the
single color granularity is represented in memory by one page.
A dump function is added to allow inspection of colored heap
information.
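
The selection policy can be sketched with fixed-size arrays standing in for
the sorted lists. All names, the tiny sizes and the use of frame numbers
instead of page structures are assumptions made for the example; the color of
a frame is taken as pfn modulo the number of colors.

```c
#include <assert.h>
#include <stdint.h>

#define NR_COLORS 4     /* hypothetical, tiny on purpose */
#define MAX_PAGES 16

struct color_list {
    uint64_t pfn[MAX_PAGES];    /* kept in ascending order */
    unsigned int count;
};

struct colored_heap {
    struct color_list lists[NR_COLORS];
};

/* Init path: frames arrive in ascending order, so appending keeps order. */
void heap_add_page(struct colored_heap *h, uint64_t pfn)
{
    struct color_list *l = &h->lists[pfn % NR_COLORS];

    assert(l->count < MAX_PAGES);
    assert(l->count == 0 || l->pfn[l->count - 1] < pfn);
    l->pfn[l->count++] = pfn;
}

/*
 * Order-0 allocation: among the lists of the domain's colors, take the
 * highest of the last elements. Returns 0 on success, -1 if no page left.
 */
int heap_alloc_page(struct colored_heap *h, const unsigned int *colors,
                    unsigned int n, uint64_t *pfn)
{
    struct color_list *best = 0;

    for (unsigned int i = 0; i < n; i++) {
        struct color_list *l = &h->lists[colors[i]];

        if (l->count == 0)
            continue;
        if (!best || l->pfn[l->count - 1] > best->pfn[best->count - 1])
            best = l;
    }
    if (!best)
        return -1;
    *pfn = best->pfn[--best->count];
    return 0;
}
```

With frames 0..7 in the heap and a domain owning the adjacent colors {1, 2},
successive allocations return frames 6, 5, 2, 1: physically adjacent frames
come out back to back.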
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 14:18:45 +0000 (16:18 +0200)]
xen/arm: add colored flag to page struct
A new allocator enforcing a cache-coloring configuration is going to be
introduced. We thus need to distinguish the memory pages assigned to,
and managed by, such colored allocator from the ordinary buddy
allocator's ones. Add a color flag to the page structure.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 14:15:37 +0000 (16:15 +0200)]
xen/arm: add coloring data to domains
We want to be able to associate an assignment of cache colors to each
domain. Add a configurable-length array containing a set of color
indices in the domain data.
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 14:05:56 +0000 (16:05 +0200)]
xen/arm: add coloring basic initialization
Introduce a first and simple initialization function for the cache
coloring support. A helper function computes 'addr_col_mask', the
platform-dependent bitmask asserting the bits in memory addresses that
can be used for the coloring mechanism. This, in turn, is used to
determine the total amount of available colors.
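
A sketch of how such a mask might be derived, assuming 4 KiB pages and a
power-of-two LLC way size. Only the name 'addr_col_mask' comes from the patch;
the function names are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12                           /* assumed 4 KiB pages */
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/*
 * Address bits that index into one LLC way but lie above the page
 * offset: these are the bits software can choose via page placement.
 */
uint64_t calc_addr_col_mask(uint64_t way_size)
{
    assert(way_size >= PAGE_SIZE && (way_size & (way_size - 1)) == 0);
    return (way_size - 1) & ~(PAGE_SIZE - 1);
}

/* One color per page that fits in a way. */
unsigned int calc_max_colors(uint64_t way_size)
{
    return (unsigned int)(way_size >> PAGE_SHIFT);
}
```

For a 64 KiB way this gives a mask of 0xF000, i.e. 16 colors.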
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Wed, 21 Aug 2019 13:55:52 +0000 (15:55 +0200)]
xen/arm: compute LLC way size by hardware inspection
The size of the LLC way is a crucial parameter for the cache coloring
support, since it determines the maximum number of available colors on
the platform. This parameter can currently be retrieved only from
the way_size bootarg, and it is prone to misconfiguration, nullifying the
coloring mechanism and breaking cache isolation.
Add an alternative and safer method to retrieve the way size by
directly asking the hardware, namely using CCSIDR_EL1 and CSSELR_EL1
registers.
This method also has to check whether at least L2 is implemented in the
hardware, since there are scenarios where only L1 cache is available, e.g.
QEMU.
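
The decoding step can be sketched on raw register values. Reading the
registers themselves requires privileged code (select the level through
CSSELR_EL1, then read CCSIDR_EL1), so the functions take the values as
arguments. The field layout assumed is the 32-bit CCSIDR_EL1 format without
FEAT_CCIDX; using CLIDR_EL1 for the L2 presence check is an assumption of this
example, as are the function names.

```c
#include <stdbool.h>
#include <stdint.h>

/* CCSIDR_EL1.LineSize, bits [2:0]: log2(bytes per line) - 4. */
uint32_t ccsidr_line_bytes(uint32_t ccsidr)
{
    return UINT32_C(1) << ((ccsidr & 0x7) + 4);
}

/* CCSIDR_EL1.NumSets, bits [27:13]: number of sets - 1. */
uint32_t ccsidr_num_sets(uint32_t ccsidr)
{
    return ((ccsidr >> 13) & 0x7fff) + 1;
}

/* A way holds one line per set. */
uint32_t llc_way_size(uint32_t ccsidr)
{
    return ccsidr_line_bytes(ccsidr) * ccsidr_num_sets(ccsidr);
}

/* CLIDR_EL1.Ctype2, bits [5:3]: zero means no cache at level 2. */
bool clidr_has_l2(uint64_t clidr)
{
    return ((clidr >> 3) & 0x7) != 0;
}
```

For a 1 MiB, 16-way L2 with 64-byte lines (CCSIDR_EL1 = 0x7FE07A) the way size
is 64 KiB, which with 4 KiB pages corresponds to 16 colors.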
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
Luca Miccio [Tue, 20 Aug 2019 14:08:10 +0000 (16:08 +0200)]
xen: add parsing function for cache coloring configuration
Add three new bootargs allowing configuration of cache coloring support
for Xen:
- way_size: The size of a LLC way in bytes. This value is mainly used
to calculate the maximum available colors on the platform.
- dom0_colors: The coloring configuration for Dom0, which also acts as
default configuration for any DomU without an explicit configuration.
- xen_colors: The coloring configuration for the Xen hypervisor itself.
A cache coloring configuration consists of a selection of colors to be
assigned to a VM or to the hypervisor. It is represented by a set of
ranges. Add a common function that parses a string with a
comma-separated set of hyphen-separated ranges like "0-7,15-16" and
returns both the number of chosen colors and an array containing their
ids.
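
A minimal sketch of such a parser, for illustration. The function name, the
signature and the error convention (-1 on malformed input or when the output
array is too small) are assumptions, not the interface added by the patch.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical sketch: expand a string such as "0-7,15-16" into ids[]
 * and return the number of ids, or -1 on error.
 */
int parse_color_ranges(const char *s, unsigned int *ids, int max_ids)
{
    int n = 0;

    assert(s != 0 && ids != 0);

    while (*s != '\0') {
        char *end;
        unsigned long lo, hi;

        if (*s < '0' || *s > '9')
            return -1;
        lo = strtoul(s, &end, 10);
        hi = lo;                        /* single id: range of one */
        s = end;

        if (*s == '-') {
            s++;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &end, 10);
            s = end;
        }
        if (hi < lo)
            return -1;

        for (unsigned long c = lo; c <= hi; c++) {
            if (n >= max_ids)
                return -1;
            ids[n++] = (unsigned int)c;
        }

        if (*s == ',') {
            s++;
            if (*s == '\0')
                return -1;              /* trailing comma */
        } else if (*s != '\0') {
            return -1;                  /* unexpected character */
        }
    }
    return n;
}
```

Parsing "0-7,15-16" yields ten ids: 0 through 7, then 15 and 16.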
Signed-off-by: Luca Miccio <206497@studenti.unimore.it> Signed-off-by: Marco Solieri <marco.solieri@unimore.it>
This patch adds support for the PM EEMI API mediation layer.
Mapping between device, clock and reset nodes and corresponding base
addresses is derived from topology information. Similar to ZU+, certain
device nodes do not allow any operations such as turning off ACPU cores,
LPD etc.
Since there are a few significant changes to the handling of PM commands
for Versal, for various reasons (node value representations,
additions/removal of commands etc.), there is a separate handler for
the Versal platform.
This patch adds a new common header to be used for generic PM
EEMI definitions. In addition, header guards are also added to
xilinx-zynqmp-mm.h and xilinx-zynqmp-eemi.h files.
The following unused enums are also removed:
- pm_node_id
- pm_request_ack
- pm_abort_reason
- pm_suspend_reason
- pm_ram_state
- pm_opchar_type
It is necessary to allow a DomU to issue EEMI power management
operations on TCM nodes when running OpenAMP in a DomU. Introduce the
TCM nodes in xilinx-zynqmp-eemi.c, so that they are allowed to do so
when the TCM regions are assigned to the domU (they are subject to the
usual permissions checks.)
Tejas Patel [Mon, 25 Mar 2019 08:59:42 +0000 (01:59 -0700)]
platform: zynqmp: Map missing clocks to respective node
Dom0 requires access to the AMS_REF, TOPSW_LSBUS and LPD_LSBUS clocks.
Map these clocks to their respective nodes to provide access
if Dom0 has permission to access those nodes.
The shared memory device tree binding went upstream as
"xen,shared-memory-v1". So, rename all occurrences of
"xen,shared-memory" to "xen,shared-memory-v1" in the docs.
Although libxc doesn't promise compatibility, xc_domain_memory_mapping
has been used by QEMU for years. Instead of changing the signature of
the function, introduce a new xc_domain_memory_mapping_cache which takes
the additional cacheability parameter. Leave the original
xc_domain_memory_mapping unmodified.
Parse a new cacheability option for the iomem parameter. It can be
"devmem" for device memory mappings, which is the default, or "memory"
for normal memory mappings.
Store the parameter in a new field in libxl_iomem_range.
Pass the cacheability option to xc_domain_memory_mapping.
Add an additional parameter to xc_domain_memory_mapping to pass
cacheability information. The parameter values are the same as for the
XEN_DOMCTL_memory_mapping hypercall (0 is device memory, 1 is normal
memory). Pass CACHEABILITY_DEVMEM by default -- no changes in behavior.
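
The option-to-value mapping described for iomem can be sketched as follows.
CACHEABILITY_DEVMEM and the values 0 and 1 come from the text above; the name
CACHEABILITY_MEMORY, the function name and the -1 error return are assumptions
made for the example.

```c
#include <string.h>

#define CACHEABILITY_DEVMEM 0   /* device memory, the default */
#define CACHEABILITY_MEMORY 1   /* normal memory (name assumed) */

/*
 * Hypothetical sketch: translate the textual iomem option into the
 * numeric value passed down to the hypercall. A missing or empty
 * option selects the default.
 */
int parse_cacheability(const char *opt)
{
    if (opt == 0 || *opt == '\0' || strcmp(opt, "devmem") == 0)
        return CACHEABILITY_DEVMEM;
    if (strcmp(opt, "memory") == 0)
        return CACHEABILITY_MEMORY;
    return -1;                  /* unknown option */
}
```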
xen: extend XEN_DOMCTL_memory_mapping to handle cacheability
Reuse the existing padding field to pass cacheability information about
the memory mapping, specifically, whether the memory should be mapped as
normal memory or as device memory (this is what we have today).
Add a cacheability parameter to map_mmio_regions. 0 means device
memory, which is what we have today.
On ARM, map device memory as p2m_mmio_direct_dev (as it is already done
today) and normal memory as p2m_ram_rw.
On x86, return error if the cacheability requested is not device memory.
xen/arm: export shared memory regions as reserved-memory on device tree
Shared memory regions need to be advertised to the guest. Fortunately, a
device tree binding for special memory regions already exists:
reserved-memory.
Add a reserved-memory node for each shared memory region, for both
owners and borrowers.
Mirela Simonovic [Tue, 23 Oct 2018 14:51:24 +0000 (16:51 +0200)]
xen/arm: zynqmp: Add RPLL and VPLL-related clocks to pm_clock2node map
The current clock driver in Linux for Zynq MPSoC controls the PLLs as if they
are clocks (using the clock rather than PLL EEMI API). Only RPLL and VPLL
could be directly controlled by a guest that owns the display port, because
the display port driver in Linux requires some special clock frequencies for
video and audio, which in turn require VPLL and RPLL to be locked in fractional
mode (for video and audio respectively). Therefore, we need to allow a guest
that owns the display port to directly control these PLL-related clocks.
In the future, the Linux driver should switch to using the PLL EEMI API for
controlling PLLs; support for that is already added in the EEMI mediator in Xen.
Once that happens, this patch can be reverted.
Mirela Simonovic [Tue, 23 Oct 2018 14:51:23 +0000 (16:51 +0200)]
xen/arm: zynqmp: Remove direct accesses to PLLs and their resets
Only a limited number of PLLs can be controlled by guests, and that
has to be done using PLL EEMI APIs. Clean-up the direct access options
for PLLs and their resets.
Mirela Simonovic [Tue, 23 Oct 2018 14:51:22 +0000 (16:51 +0200)]
xen/arm: zynqmp: Remove MMIO r/w accesses to clock and PLL control
Guests need to use clock/PLL EEMI API calls to query and control
states of clocks/PLLs rather than MMIO read/write accesses. Therefore,
the gate for MMIO read/write accesses to clock/PLL control registers
has to be closed (done in this patch).
Mirela Simonovic [Tue, 23 Oct 2018 14:51:21 +0000 (16:51 +0200)]
xen/arm: zynqmp: Add PLL set mode/parameter EEMI API
PLL set mode/parameter should be allowed only for VPLL and RPLL to
a guest which uses the display port. This is the case because the display
port driver requires some very specific frequencies for video and audio,
so it relies on configuring VPLL and RPLL in fractional mode (for video
and audio respectively). These two PLLs are reserved for exclusive use
by the display port, or to be more specific - the clock framework of the guest
that owns the display port will need to directly control the modes of these
two PLLs and the power management framework should allow that.
The check is implemented using the domain_has_node_access() function, which
covers this use-case because access to NODE_VPLL and NODE_RPLL is granted to
a guest which has access to the display port via the newly added entries
in pm_node_access map.
If a guest is allowed to control a PLL the request is passed through to
the EL3. Otherwise, an error is returned.
Mirela Simonovic [Tue, 23 Oct 2018 14:51:20 +0000 (16:51 +0200)]
xen/arm: zynqmp: Add PLL EEMI API definitions and passthrough get functions
PLL get functions should be allowed to every guest because guests may
need to use these APIs to calculate the PLL output frequency. Therefore,
allow passthrough of get functions to every guest.
Mirela Simonovic [Tue, 23 Oct 2018 14:51:19 +0000 (16:51 +0200)]
xen/arm: zynqmp: Implement checking and passthrough for clock control APIs
Clock enable, disable, set parent and set divider EEMI APIs affect
the frequency of the target clock, so there has to be a permission check
to filter the calls that should not be permitted to a guest. To implement
the check, the clock-to-node map is introduced and implemented using the
pm_clock2node array (note that a clock can drive several nodes). Elements
of the pm_clock2node array have to be defined in increasing clock ID order
because the search algorithm relies on this assumption. Only clocks that
a guest could be allowed to control need to be represented in the pm_clock2node
array. Clocks that are not represented in the array should not be controllable
by any guest.
A guest will be granted the permission to control a clock only if all the nodes
driven by the target clock are owned by the guest. If the permission is granted
the call is passed through to the EL3. Otherwise, an error is returned.
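
A sketch of the check under simplifying assumptions: the element layout, the
function name and the representation of node ownership as a bitmask (bit k set
means the guest owns node k) are made up for the example. Only the array name
pm_clock2node and the sorted-by-clock-ID requirement come from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element: one entry per (clock, driven node) pair. */
struct pm_clk2node {
    unsigned int clk;
    unsigned int node;          /* < 64 in this sketch */
};

/*
 * Binary-search the first entry for clk (the map is sorted by clock ID),
 * then require that every node driven by that clock is owned.
 */
bool domain_may_control_clock(const struct pm_clk2node *map, size_t n,
                              unsigned int clk, uint64_t owned_nodes)
{
    size_t lo = 0, hi = n;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (map[mid].clk < clk)
            lo = mid + 1;
        else
            hi = mid;
    }

    /* Clocks absent from the map are not controllable by any guest. */
    if (lo == n || map[lo].clk != clk)
        return false;

    for (size_t i = lo; i < n && map[i].clk == clk; i++) {
        assert(map[i].node < 64);
        if (!((owned_nodes >> map[i].node) & 1))
            return false;
    }
    return true;
}
```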
Mirela Simonovic [Tue, 23 Oct 2018 14:51:18 +0000 (16:51 +0200)]
xen/arm: zynqmp: Clock get EEMI API functions are allowed to each guest
Each guest is allowed to query clock related information (get divisor
value, current clock parent or clock status). Guests may need to use
these APIs to construct the information about the partial clock tree
that they control or depend on - e.g. although a guest may not control
a clock it may need to calculate its frequency and these APIs are
necessary to enable the calculation.
If the provided clock ID is valid, the call is passed through to the
EL3. Otherwise, an error is returned.
The clock id definitions are added in this patch. Although this patch
requires only clock id min and max values to check if clock id is valid,
the clock id definitions are in general needed in Xen. This is because
Xilinx clock driver implementation in Linux queries the clock tree topology
at runtime from firmware (ATF). Device tree does contain some information
about the clocks - but only leaf clock ID numbers and their binding to device
interfaces. The underlying software layers need to know everything else about
the clock tree. Since the clock topology resides in ATF and querying calls
are passed through by Xen, Xen at least needs to know about clock IDs to
be able to map them to nodes in order to determine clock-control permissions.
This clock-control permission checking will be added in a following patch.
The SL0 attribute in TTBCR, which specifies the entry level in the page
table lookup, should be the same as the SL0 attribute in VTCR_EL2, given
that pagetables are shared between MMU and SMMU.
Make it so, by reading the value from VTCR_EL2, and setting reg
accordingly.
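
The register manipulation can be sketched on plain values. The sketch assumes
that SL0 occupies bits [7:6] in both VTCR_EL2 and the SMMU stage 2 TTBCR; the
macro and function names are made up for the example.

```c
#include <stdint.h>

#define SL0_SHIFT 6
#define SL0_MASK  (UINT32_C(0x3) << SL0_SHIFT)  /* assumed bits [7:6] */

/* Replace the SL0 field of reg with the one found in VTCR_EL2. */
uint32_t ttbcr_set_sl0_from_vtcr(uint32_t reg, uint64_t vtcr_el2)
{
    return (reg & ~SL0_MASK) | ((uint32_t)vtcr_el2 & SL0_MASK);
}
```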
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Tested-by: Upender Cherukupally <upender@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tejas Patel [Sat, 24 Feb 2018 15:47:11 +0000 (07:47 -0800)]
xen: platform: zynqmp: Add new eemi api IDs
New EEMI API IDs are added in ATF and Linux.
Sync EEMI API IDs of xen with Linux and ATF.
Signed-off-by: Tejas Patel <tejasp@xilinx.com> Acked-by: Jolly Shah <jollys@xilinx.com> Reviewed-by: Alistair Francis <alistair.francis@xilinx.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
xen/arm: zynqmp: Forward platform specific firmware calls
Implement an EEMI mediator and forward platform specific
firmware calls from guests to firmware.
The EEMI mediator is responsible for implementing access
controls modifying or blocking calls that try to operate
on setup for devices that are not under the calling guest's
control.
Zhongze Liu [Wed, 5 Dec 2018 22:16:01 +0000 (14:16 -0800)]
libxl:xl: add parsing code to parse "libxl_static_sshm" from xl config files
Add the parsing utils for the newly introduced libxl_static_sshm struct
to the libxl/libxlu_* family, and add related parsing code in xl to
parse the struct from xl config files. This is for the proposal "Allow
setting up shared memory areas between VMs from xl config file" (see [1]).
Zhongze Liu [Wed, 5 Dec 2018 22:16:00 +0000 (14:16 -0800)]
libxl: support unmapping static shared memory areas during domain destruction
Add libxl__sshm_del to unmap static shared memory areas mapped by
libxl__sshm_add during domain creation. The unmapping process is:
* For an owner: decrease the refcount of the sshm region, if the refcount
reaches 0, cleanup the whole sshm path.
* For a borrower:
1) unmap the shared pages, and cleanup related xs entries. If the
system works normally, all the shared pages will be unmapped, so there
won't be page leaks. In case of errors, the unmapping process will go
on and unmap all the other pages that can be unmapped, so the other
pages won't be leaked, either.
2) Decrease the refcount of the sshm region, if the refcount reaches
0, cleanup the whole sshm path.
This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).
Zhongze Liu [Tue, 29 Jan 2019 18:57:11 +0000 (10:57 -0800)]
libxl: support mapping static shared memory areas during domain creation
Add libxl__sshm_add to map shared pages from one DomU to another. The mapping
process involves the following steps:
* Set defaults and check for further errors in the static_shm configs:
overlapping areas, invalid ranges, duplicated owner domain,
not page aligned, no owner domain etc.
* Use xc_domain_add_to_physmap_batch to map the shared pages to borrowers
* When some of the pages can't be successfully mapped, roll back any
successfully mapped pages so that the system stays in a consistent state.
* Write information about static shared memory areas into the appropriate
xenstore paths and set the refcount of the shared region accordingly.
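
Two of the configuration checks in the first step (page alignment and
overlapping areas) can be sketched as follows. The structure, the function
name and the 4 KiB page size are assumptions made for the example and do not
reflect the libxl types.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE UINT64_C(4096)        /* assumed page size */

/* Hypothetical description of one shared area in guest address space. */
struct sshm_range {
    uint64_t begin;
    uint64_t size;
};

/* Return 0 if all ranges are non-empty, page aligned and disjoint. */
int sshm_check_ranges(const struct sshm_range *r, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (r[i].size == 0 ||
            ((r[i].begin | r[i].size) & (PAGE_SIZE - 1)) != 0 ||
            r[i].begin + r[i].size < r[i].begin)    /* wraps around */
            return -1;

        for (size_t j = 0; j < i; j++)
            if (r[i].begin < r[j].begin + r[j].size &&
                r[j].begin < r[i].begin + r[i].size)
                return -1;                          /* overlap */
    }
    return 0;
}
```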
Temporarily mark this as unsupported on x86 because calling p2m_add_foreign on
two domU's is currently not allowed on x86 (see the comments in
x86/mm/p2m.c:p2m_add_foreign for more details).
This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).
Zhongze Liu [Tue, 29 Jan 2019 18:56:08 +0000 (10:56 -0800)]
libxl: introduce a new structure to represent static shared memory regions
Add a new structure to the IDL family to represent static shared memory regions
as proposed in the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).
Zhongze Liu [Wed, 5 Dec 2018 22:15:57 +0000 (14:15 -0800)]
xen: xsm: flask: introduce XENMAPSPACE_gmfn_share for memory sharing
The existing XENMAPSPACE_gmfn_foreign subop of XENMEM_add_to_physmap forbids
Dom0 from mapping memory pages from one DomU to another, which restricts some useful
yet not dangerous use cases -- such as sharing pages among DomU's so that they
can do shm-based communication.
This patch introduces XENMAPSPACE_gmfn_share to address this inconvenience,
which is mostly the same as XENMAPSPACE_gmfn_foreign but has its own xsm check.
Specifically, the patch:
* Introduces a new av permission MMU__SHARE_MEM to denote if two domains can
share memory by using the new subop;
* Introduces xsm_map_gmfn_share() to check if (current) has proper permission
over (t) AND MMU__SHARE_MEM is allowed between (d) and (t);
* Modifies the default xen.te to allow MMU__SHARE_MEM for normal domains that
allow grant mapping/event channels.
The new subop is marked unsupported for x86 because calling p2m_add_foreign
on two DomU's is currently not supported on x86.
This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file" (see [1]).
Signed-off-by: Zhongze Liu <blackskygg@gmail.com> Signed-off-by: Stefano Stabellini <stefanos@xilinx.com> Cc: Daniel De Graaf <dgdegra@tycho.nsa.gov> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Julien Grall <julien.grall@arm.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: George Dunlap <George.Dunlap@eu.citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Tim Deegan <tim@xen.org> Cc: xen-devel@lists.xen.org
Nick Rosbrook [Mon, 16 Dec 2019 18:08:10 +0000 (18:08 +0000)]
golang/xenlight: implement keyed union C to Go marshaling
Switch over union key to determine how to populate 'union' in Go struct.
Since the unions of C types cannot be directly accessed in cgo, use a
typeof trick to typedef a struct in the cgo preamble that is analogous
to each inner struct of a keyed union. For example, to define a struct
for the hvm inner struct of libxl_domain_build_info, do:
Nick Rosbrook [Mon, 16 Dec 2019 18:08:09 +0000 (18:08 +0000)]
golang/xenlight: begin C to Go type marshaling
Begin implementation of fromC marshaling functions for generated struct
types. This includes support for converting fields that are basic
primitive types such as string and integer types, nested anonymous
structs, nested libxl structs, and libxl built-in types.
This patch does not implement conversion of arrays or keyed unions.
Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Nick Rosbrook [Mon, 16 Dec 2019 18:08:08 +0000 (18:08 +0000)]
golang/xenlight: generate structs from the IDL
Add struct and keyed union generation to gengotypes.py. For keyed unions,
use a method similar to gRPC's oneof to interpret C unions as Go types.
Meaning, for a given struct with a union field, generate a struct for
each sub-struct defined in the union. Then, define an interface of one
method which is implemented by each of the defined sub-structs. For
example:
  type domainBuildInfoTypeUnion interface {
          isdomainBuildInfoTypeUnion()
  }

  type DomainBuildInfoTypeUnionHvm struct {
          // HVM-specific fields...
  }
Then, remove existing struct definitions in xenlight.go that conflict
with the generated types, and modify existing marshaling functions to
align with the new type definitions. Notably, drop "time" package since
fields of type time.Duration are now of type uint64.
Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Re-name and modify signature of toGo function to fromC. The reason for
using 'fromC' rather than 'toGo' is that it is not a good idea to define
methods on the C types. Also, add error return type to Bitmap's toC function.
Finally, as code-cleanup, re-organize the Bitmap type's comments as per
Go conventions.
Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
--
Changes in v2:
- Use consistent variable naming for slice created from
libxl_bitmap.
Nick Rosbrook [Mon, 16 Dec 2019 18:07:59 +0000 (18:07 +0000)]
golang/xenlight: define Defbool builtin type
Define Defbool as a struct analogous to the C type, and define the type
'defboolVal' that represents true, false, and default defbool values.
Implement Set, Unset, SetIfDefault, IsDefault, Val, and String functions
on Defbool so that the type can be used in Go analogously to how it's
used in C.
Finally, implement fromC and toC functions.
Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Nick Rosbrook [Mon, 16 Dec 2019 18:07:59 +0000 (18:07 +0000)]
golang/xenlight: generate enum types from IDL
Introduce gengotypes.py to generate Go code from the IDL. As a first step,
implement 'enum' type generation.
As a result of the newly-generated code, remove the existing, and now
conflicting definitions in xenlight.go. In the case of the Error type,
rename the slice 'errors' to 'libxlErrors' so that it does not conflict
with the standard library package 'errors.' And, negate the values used
in 'libxlErrors' since the generated error values are negative.
Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Mon, 16 Dec 2019 16:37:09 +0000 (17:37 +0100)]
x86emul: correct far branch handling for 64-bit mode
AMD and friends explicitly specify that 64-bit operands aren't possible
for these insns. Nevertheless REX.W isn't fully ignored: It still
cancels a possible operand size override (0x66). Intel otoh explicitly
provides for 64-bit operands on the respective insn page of the SDM.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
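The operand-size rule described above can be sketched as a small helper. This is only an illustration of the message text, not the emulator's code: the function name and the `amd_like` flag are hypothetical, and the 32-bit default for these instructions in 64-bit mode is an assumption of the sketch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: operand size in bytes for a far branch (or an
 * LFS-style load) in 64-bit mode.
 *  - REX.W cancels an operand size override (0x66) on all vendors.
 *  - AMD-like CPUs have no 64-bit operand form, so REX.W leaves the
 *    operand at 32 bits; Intel widens it to 64 bits.
 *  - Without REX.W, 0x66 selects 16 bits, else the assumed 32-bit default.
 */
static unsigned int far_operand_bytes(bool rex_w, bool opsize_override,
                                      bool amd_like)
{
    if (rex_w)
        return amd_like ? 4 : 8;

    return opsize_override ? 2 : 4;
}
```

For example, `far_operand_bytes(true, true, true)` yields 4: REX.W has cancelled the 0x66 prefix without widening the operand.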
The version of this header present in the Linux source tree has contained
such macros for some time. These macros, as the names imply, allow front
or back rings to be set up for existent (rather than freshly created and
zeroed) shared rings.
This patch is to update this, the canonical version of the header, to
match the latest definition of these macros in the Linux source.
NOTE: The way the new macros are defined allows the FRONT/BACK_RING_INIT
macros to be re-defined in terms of them, thereby reducing
duplication.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com>
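The relationship between "init" and "attach" described in the NOTE can be sketched with a toy ring. All type and function names below are hypothetical stand-ins and do not reproduce the header's macros; the point is only that initialisation is attachment at index zero.

```c
/* Toy shared ring: only the producer indices matter for this sketch. */
struct toy_sring {
    unsigned int req_prod;
    unsigned int rsp_prod;
};

/* Toy front ring: private indices plus a pointer to the shared ring. */
struct toy_front_ring {
    unsigned int req_prod_pvt;
    unsigned int rsp_cons;
    struct toy_sring *sring;
};

/* Attach to an existing shared ring, resuming from index i. */
static void toy_front_ring_attach(struct toy_front_ring *r,
                                  struct toy_sring *s, unsigned int i)
{
    r->req_prod_pvt = i;
    r->rsp_cons = i;
    r->sring = s;
}

/* Initialising a fresh ring is just attaching at index 0. */
static void toy_front_ring_init(struct toy_front_ring *r,
                                struct toy_sring *s)
{
    toy_front_ring_attach(r, s, 0);
}
```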
Jan Beulich [Mon, 16 Dec 2019 16:35:50 +0000 (17:35 +0100)]
x86emul: correct LFS et al handling for 64-bit mode
AMD and friends explicitly specify that 64-bit operands aren't possible
for these insns. Nevertheless REX.W isn't fully ignored: It still
cancels a possible operand size override (0x66). Intel otoh explicitly
provides for 64-bit operands on the respective insn page of the SDM.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 16 Dec 2019 16:34:46 +0000 (17:34 +0100)]
x86emul: correct segment override decode for 64-bit mode
The legacy / compatibility mode ES, CS, SS, and DS overrides are fully
ignored prefixes in 64-bit mode, i.e. they in particular don't cancel an
earlier FS or GS one. (They don't violate the REX-prefix-must-be-last
rule though.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
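A minimal sketch of the decode rule above, assuming a simplified decoder that only looks at segment override bytes and stops at the first other byte. The enum and function names are hypothetical; the prefix byte values are the architectural ones.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum seg_override {
    SEG_NONE, SEG_ES, SEG_CS, SEG_SS, SEG_DS, SEG_FS, SEG_GS
};

/*
 * Scan leading prefix bytes and return the effective segment override.
 * In 64-bit mode the ES/CS/SS/DS prefixes are skipped entirely, so they
 * do not cancel an earlier FS or GS override.  Outside 64-bit mode the
 * sketch assumes the last override seen wins.
 */
static enum seg_override decode_seg_override(const uint8_t *prefixes,
                                             size_t len, bool mode_64bit)
{
    enum seg_override seg = SEG_NONE;
    size_t i;

    for (i = 0; i < len; i++) {
        enum seg_override s;

        switch (prefixes[i]) {
        case 0x26: s = SEG_ES; break;
        case 0x2e: s = SEG_CS; break;
        case 0x36: s = SEG_SS; break;
        case 0x3e: s = SEG_DS; break;
        case 0x64: s = SEG_FS; break;
        case 0x65: s = SEG_GS; break;
        default:   return seg;   /* not a segment prefix: stop scanning */
        }

        if (mode_64bit && s != SEG_FS && s != SEG_GS)
            continue;            /* fully ignored in 64-bit mode */

        seg = s;
    }

    return seg;
}
```

With the byte sequence `0x64 0x3e` (FS, then DS), the sketch returns `SEG_FS` in 64-bit mode and `SEG_DS` otherwise.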
Igor Druzhinin [Fri, 13 Dec 2019 22:48:01 +0000 (22:48 +0000)]
x86/time: drop vtsc_{kern, user}count debug counters
They either need to be transformed to atomics to work correctly
(currently they are left unprotected for HVM domains) or dropped entirely,
as taking a per-domain spinlock is too expensive for high-vCPU count
domains even for debug builds, given this lock is taken too often.
Choose the latter as they are not extremely important anyway.
Andrew Cooper [Mon, 16 Dec 2019 13:58:45 +0000 (13:58 +0000)]
x86/pv: Fix `global-pages` to match the documentation
c/s 5de961d9c09 "x86: do not enable global pages when virtualized on AMD or
Hygon hardware" in fact does. Fix the calculation in pge_init().
While fixing this, adjust the command line documentation, first to use the
newer style, and second to expand the description to discuss cases where the
option might be useful but which Xen can't account for by default.
George Dunlap [Thu, 12 Dec 2019 15:57:51 +0000 (15:57 +0000)]
x86/mm: More descriptive names for page de/validation functions
The functions alloc_page_type(), alloc_lN_table(), free_page_type()
and free_lN_table() are confusingly named: nothing is being allocated
or freed. Rather, the page being passed in is being either validated
or devalidated for use as the specific type; in the specific case of
pagetables, these may be promoted or demoted (i.e., grab appropriate
references for PTEs).
Rename alloc_page_type() and free_page_type() to validate_page() and
devalidate_page(). Also rename alloc_segdesc_page() to
validate_segdesc_page(), since this is what it's doing.
Rename alloc_lN_table() and free_lN_table() to promote_lN_table() and
demote_lN_table(), respectively.
After this change:
- get / put type consistently refers to increasing or decreasing the count
- validate / devalidate consistently refers to actions done when a
type count goes 0 -> 1 or 1 -> 0
- promote / demote consistently refers to acquiring or freeing
resources (in the form of type refs and general references) in order
to allow a page to be used as a pagetable.
No functional change.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
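The naming scheme above can be illustrated with a toy model: get / put adjust the type count, and validate / devalidate run only on the 0 -> 1 and 1 -> 0 transitions. The structure and the `toy_` functions are hypothetical and greatly simplified; they are not the hypervisor's implementations.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page: a type count plus a flag recording validation state. */
struct toy_page {
    unsigned int type_count;
    bool validated;
};

/* Stand-ins for the work done when a page starts or stops being typed. */
static void toy_validate_page(struct toy_page *pg)   { pg->validated = true; }
static void toy_devalidate_page(struct toy_page *pg) { pg->validated = false; }

/* "get" increases the count; validation happens only on 0 -> 1. */
static void toy_get_page_type(struct toy_page *pg)
{
    if (pg->type_count++ == 0)
        toy_validate_page(pg);
}

/* "put" decreases the count; devalidation happens only on 1 -> 0. */
static void toy_put_page_type(struct toy_page *pg)
{
    assert(pg->type_count > 0);
    if (--pg->type_count == 0)
        toy_devalidate_page(pg);
}
```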
George Dunlap [Fri, 13 Dec 2019 14:09:46 +0000 (14:09 +0000)]
x86/mm: Use mfn_t in type get / put call tree
Replace `unsigned long` with `mfn_t` as appropriate throughout
alloc/free_lN_table, get/put_page_from_lNe, and
get_lN_linear_pagetable. This obviates the need for a load of
`mfn_x()` and `_mfn()` casts.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
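To illustrate why carrying `mfn_t` through a call tree removes conversions, here is a self-contained sketch. The wrapper below is an illustrative stand-in for a typesafe frame number, and the two `next_table_*` functions are hypothetical.

```c
/* Illustrative typesafe wrapper; a stand-in, not the real definition. */
typedef struct { unsigned long m; } mfn_t;

static inline mfn_t _mfn(unsigned long m)  { mfn_t r = { m }; return r; }
static inline unsigned long mfn_x(mfn_t m) { return m.m; }

/* Before: a raw integer parameter forces callers to unwrap and rewrap. */
static unsigned long next_table_old(unsigned long pfn)
{
    return pfn + 1;
}

/* After: the typed value flows through the signature unchanged. */
static mfn_t next_table_new(mfn_t mfn)
{
    return _mfn(mfn_x(mfn) + 1);
}

/* A caller that already holds an mfn_t, written against both variants. */
static mfn_t caller_old(mfn_t mfn) { return _mfn(next_table_old(mfn_x(mfn))); }
static mfn_t caller_new(mfn_t mfn) { return next_table_new(mfn); }
```

The `caller_old` variant needs one `mfn_x()` and one `_mfn()` per call; `caller_new` needs neither, and passing a plain integer to `next_table_new` would fail to compile.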
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Use a more descriptive name for pagetable mfns
In many places, a PTE being modified is accompanied by the pagetable
mfn which contains the PTE (primarily in order to be able to maintain
linear mapping counts). In many cases, this mfn is stored in the
nondescript variable (or argument) "pfn".
Replace these names with lNmfn, to indicate that 1) this is a
pagetable mfn, and 2) that it is the same level as the PTE in
question. This should be enough to remind readers that it's the mfn
containing the PTE.
No functional change.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Implement common put_data_pages for put_page_from_l[23]e
Both put_page_from_l2e and put_page_from_l3e handle having superpage
entries by looping over each page and "put"-ing each one individually.
As with putting page table entries, this code is functionally
identical, but for some reason different. Moreover, there is already
a common function, put_data_page(), to handle automatically swapping
between put_page() (for read-only pages) or put_page_and_type() (for
read-write pages).
Replace this with put_data_pages() (plural), which does the entire
loop, as well as the put_page / put_page_and_type switch.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
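A rough sketch of the idea: one helper owns both the loop over a superpage's constituent pages and the read-only / read-write switch. The structure, the `order` parameter, and the function bodies are assumptions made for illustration and do not mirror the real signatures.

```c
#include <stdbool.h>

/* Toy page with a general reference count and a type reference count. */
struct data_page {
    int count;
    int type_count;
};

/* Drop a general reference (read-only mapping). */
static void put_page(struct data_page *pg)
{
    pg->count--;
}

/* Drop both the type and the general reference (read-write mapping). */
static void put_page_and_type(struct data_page *pg)
{
    pg->type_count--;
    pg->count--;
}

/*
 * Plural helper: walks all 2^order pages of a superpage mapping and
 * chooses the matching "put" for each, so callers no longer open-code
 * the loop.
 */
static void put_data_pages(struct data_page *pg, bool writeable,
                           unsigned int order)
{
    unsigned int i, nr = 1u << order;

    for (i = 0; i < nr; i++, pg++) {
        if (writeable)
            put_page_and_type(pg);
        else
            put_page(pg);
    }
}
```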
George Dunlap [Fri, 13 Dec 2019 12:53:04 +0000 (12:53 +0000)]
x86/mm: Refactor put_page_from_l*e to reduce code duplication
put_page_from_l[234]e have identical functionality for devalidating an
entry pointing to a pagetable. But mystifyingly, they duplicate the
code in slightly different arrangements that make it hard to tell that
it's the same.
Create a new function, put_pt_page(), which handles the common
functionality; and refactor all the functions to be symmetric,
differing only in the level of pagetable expected (and in whether they
handle superpages).
Other than put_page_from_l2e() gaining an ASSERT it probably should
have had already, no functional changes.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Paul Durrant [Fri, 13 Dec 2019 16:39:44 +0000 (16:39 +0000)]
public/io/netif.h: document a mechanism to advertise carrier state
This patch adds a specification for a 'carrier' node in xenstore to allow
a backend to notify a frontend of its virtual carrier/link state. E.g.
a backend that is unable to forward packets from the guest because it is
not attached to a bridge may wish to advertise 'no carrier'.
While in the area also fix an erroneous backend path description.
NOTE: This is purely a documentation patch. No functional change.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com>