xenbits.xensource.com Git - people/dwmw2/xen.git/log
5 years ago  x86/boot: Do not use trampoline for no-real-mode boot paths  [bootcleanup]
David Woodhouse [Wed, 1 May 2019 10:37:09 +0000 (13:37 +0300)]
x86/boot: Do not use trampoline for no-real-mode boot paths

Where booted from EFI or with no-real-mode, there is no need to stomp
on low memory with the 16-bit boot code. Instead, just go straight to
trampoline_protmode_entry() at its physical location within the Xen
image, having applied suitable relocations.

This means that the GDT has to be loaded with lgdtl because the 16-bit
lgdt instruction would drop the high 8 bits of the gdt_trampoline
address, causing failures if the Xen image was loaded above 16MiB.
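
The truncation can be illustrated with a small C model. This is only a sketch: the function names and the sample addresses are made up for illustration, and the model assumes nothing beyond the fact that a 16-bit lgdt loads a 24-bit base.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the base address the CPU ends up with.
 * With a 16-bit operand size, lgdt loads only a 24-bit base.
 */
static uint32_t lgdt16_effective_base(uint32_t base)
{
    return base & 0x00FFFFFFu;
}

/* Self-check with made-up addresses on either side of the 16MiB boundary. */
static void lgdt_truncation_example(void)
{
    uint32_t below = 0x00100000u; /* 1MiB: fits in 24 bits */
    uint32_t above = 0x01200000u; /* 18MiB: needs more than 24 bits */

    assert(lgdt16_effective_base(below) == below);
    assert(lgdt16_effective_base(above) == 0x00200000u); /* wrong GDT address */
}
```

lgdtl uses a 32-bit operand size, so the full base is loaded and the second case no longer goes wrong.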

For now, the boot code (including the EFI loader path) still determines
what the trampoline_phys address should be for the permanent 16-bit
trampoline (used for AP startup, and wakeup). But that trampoline is
now only relocated for that address and copied into low memory later,
from a relocate_trampoline() call made from __start_xen().

The permanent trampoline can't trivially use the 32-bit code in place in
its physical location in the Xen image, as idle_pg_table doesn't have a
full physical mapping. Fixing that is theoretically possible but it's
actually much simpler just to relocate the 32-bit code (again) into low
memory, alongside the 16-bit code for the permanent trampoline.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
5 years ago  x86/boot: Copy 16-bit boot variables back up to Xen image
David Woodhouse [Tue, 30 Apr 2019 15:54:19 +0000 (18:54 +0300)]
x86/boot: Copy 16-bit boot variables back up to Xen image

Ditch the bootsym() access from C code for the variables populated by
16-bit boot code. As well as being cleaner this also paves the way for
not having the 16-bit boot code in low memory for no-real-mode or EFI
loader boots at all.

These variables are put into a separate .data.boot16 section and
accessed in low memory during the real-mode boot, then copied back to
their native location in the Xen image when real mode has finished.

Fix the limit in gdt_48 to admit that trampoline_gdt actually includes
7 entries, since we do now use the seventh (BOOT_FS) in late code so it
matters. Andrew has a patch to further tidy up the GDT and initialise
accessed bits etc., so I won't go overboard with more than the trivial
size fix for now.
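
For reference, the arithmetic behind the limit can be sketched in C. The helper name gdt_limit is hypothetical; the only facts relied on are that an x86 segment descriptor is 8 bytes and that the limit field holds the offset of the last valid byte.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_ENTRY_SIZE 8u /* each x86 segment descriptor is 8 bytes */

/* Limit field of a GDT pseudo-descriptor: offset of the last valid byte. */
static uint16_t gdt_limit(unsigned int nr_entries)
{
    assert(nr_entries > 0 && nr_entries <= 8192);
    return (uint16_t)(nr_entries * GDT_ENTRY_SIZE - 1);
}
```

Seven entries give a limit of 55. A limit sized for only six entries would be 47, which puts the seventh descriptor (bytes 48 to 55) outside the table.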

The bootsym() macro remains in C code purely for the variables which
are written for the later AP startup and wakeup trampoline to use.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
5 years ago  x86/boot: Rename trampoline_{start,end} to boot_trampoline_{start,end}
David Woodhouse [Tue, 30 Apr 2019 13:27:13 +0000 (16:27 +0300)]
x86/boot: Rename trampoline_{start,end} to boot_trampoline_{start,end}

In preparation for splitting the boot and permanent trampolines from
each other. Some of these will change back, but most are boot so do the
plain search/replace that way first, then a subsequent patch will extract
the permanent trampoline code.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
5 years ago  x86/boot: Split bootsym() into four types of relocations
David Woodhouse [Sun, 28 Apr 2019 14:48:22 +0000 (17:48 +0300)]
x86/boot: Split bootsym() into four types of relocations

As a first step toward using the low-memory trampoline only when necessary
for a legacy boot without no-real-mode, clean up the relocations into
four separate groups.

 • bootsym() is now used only at boot time when no-real-mode isn't set.

 • bootdatasym() is for variables containing information discovered by
   the 16-bit boot code. This is currently accessed directly in place
   in low memory by Xen at runtime, but will be copied back to its
   location in high memory to avoid the pointer gymnastics (and because
   a subsequent patch will stop copying the 16-bit boot code into low
   memory at all when it isn't being used).

 • trampsym() is for the permanent 16-bit trampoline used for AP startup
   and for wake from sleep. This is not used at boot, and can be copied
   into (properly allocated) low memory once the system is running.

 • tramp32sym() is used both at boot and for AP startup/wakeup. During
   boot it can be used in-place, running from the physical address of
   the Xen image. For AP startup it can't, because at that point there
   isn't a full 1:1 mapping of all memory; only the low trampoline page
   is mapped.

No (intentional) functional change yet; just a "cleanup" to allow the
various parts to be treated separately in subsequent patches.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
5 years ago  x86/boot: Only jump into low trampoline code for real-mode boot
David Woodhouse [Sun, 28 Apr 2019 15:38:37 +0000 (18:38 +0300)]
x86/boot: Only jump into low trampoline code for real-mode boot

If the no-real-mode flag is set, don't go there at all. This is a prelude
to not even putting it there in the first place.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
5 years ago  include/public/memory.h: remove the XENMEM_rsrc_acq_caller_owned flag
Paul Durrant [Fri, 19 Jul 2019 12:25:45 +0000 (13:25 +0100)]
include/public/memory.h: remove the XENMEM_rsrc_acq_caller_owned flag

When commit 3f8f1228 "x86/mm: add HYPERVISOR_memory_op to acquire guest
resources" introduced the concept of directly mapping some guest resources,
it was envisaged that the memory for some resources associated with a guest
may not actually be assigned to that guest, specifically the IOREQ server
resource introduced in commit 6e387461 "x86/hvm/ioreq: add a new mappable
resource type...". Such resources were dubbed "caller owned" and acquiring
them resulted in the
XENMEM_rsrc_acq_caller_owned flag being passed back to the caller of the
memory op.

Unfortunately the implementation led to XSA-276, which was mitigated
by commit f6b6ae78 "x86/hvm/ioreq: fix page referencing" and then a related
memory accounting problem was worked around by commit e862e6ce
"x86/hvm/ioreq: use ref-counted target-assigned shared pages". This latter
commit removed the only instance of a "caller owned" resource, but the
flag was left in the header and checked in one place in the core code.
This patch removes that now redundant check and removes the definition of
XENMEM_rsrc_acq_caller_owned from the public header. Also, since this was
the only flag defined for the XENMEM_acquire_resource memory op, it removes
the 'flags' field of struct xen_mem_acquire_resource and replaces it with
an equivalently sized 'pad' field.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago  python: do not report handled EAGAIN error
Marek Marczykowski-Górecki [Tue, 20 Aug 2019 02:12:41 +0000 (04:12 +0200)]
python: do not report handled EAGAIN error

When match_watch_by_token() returns an error, it also sets an exception within
Python. This is generally the right thing to do, but when
xspy_read_watch() handles the EAGAIN error internally, the exception needs to
be cleared. Otherwise it will fail like this:

    xen.lowlevel.xs.Error: (11, 'Resource temporarily unavailable')

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      (...)
        result = self.handle.read_watch()
    SystemError: <method 'read_watch' of 'xen.lowlevel.xs.xs' objects> returned a result with an error set

Fixes f6e1023412 "python: Extract registered watch search logic from xspy_read_watch()"
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wl@xen.org>
5 years ago  viridian: make viridian_time_domain_freeze() safe to call...
Paul Durrant [Wed, 21 Aug 2019 08:22:58 +0000 (09:22 +0100)]
viridian: make viridian_time_domain_freeze() safe to call...

...on a partially destroyed domain.

viridian_time_domain_freeze() and viridian_time_vcpu_freeze() rely
(respectively) on the dynamically allocated per-domain and per-vcpu viridian
areas [1], which are freed during domain_relinquish_resources().
Because arch_domain_pause() can call viridian_time_domain_freeze() this
can lead to host crashes if e.g. a XEN_DOMCTL_pausedomain is issued after
domain_relinquish_resources() has run.

To prevent such crashes, this patch adds a check of is_dying into
viridian_time_domain_freeze(), and viridian_time_domain_thaw() which is
similarly vulnerable to indirection into freed memory.
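
The shape of such a guard can be sketched as follows. The structures here are hypothetical, heavily simplified stand-ins (for example, is_dying is reduced to a boolean), so this shows the idea of the check rather than the actual patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for Xen's structures. */
struct viridian_area {
    bool frozen;
};

struct domain {
    bool is_dying;                  /* set once teardown has begun */
    struct viridian_area *viridian; /* freed when resources are relinquished */
};

/* Freeze only if the domain is still alive, so freed memory is never touched. */
static bool time_domain_freeze(struct domain *d)
{
    assert(d != NULL);

    if (d->is_dying)
        return false;

    d->viridian->frozen = true;
    return true;
}
```

The early return is the whole point: once the domain is dying, the per-domain area may already have been freed, so it must not be dereferenced.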

NOTE: The patch also makes viridian_time_vcpu_freeze/thaw() static, since
      they have no callers outside of the same source module.

[1] See commit e7a9b5e72f26 "viridian: separately allocate domain and vcpu
    structures".

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
5 years ago  x86/p2m: fix non-translated handling of iommu mappings
Roger Pau Monne [Tue, 23 Jul 2019 12:43:43 +0000 (14:43 +0200)]
x86/p2m: fix non-translated handling of iommu mappings

The current usage of need_iommu_pt_sync in p2m for non-translated
guests is wrong because it doesn't correctly handle a relaxed PV
hardware domain, which has need_sync set to false but still needs
entries to be added from calls to {set/clear}_identity_p2m_entry.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Tested-by: Roman Shaposhnik <roman@zededa.com>
5 years ago  python: Add XC binding for Xen build ID
Pawel Wieczorkiewicz [Tue, 20 Aug 2019 12:51:08 +0000 (12:51 +0000)]
python: Add XC binding for Xen build ID

Extend the list of xc() object methods with an additional one to display
Xen's buildid. The implementation follows the libxl implementation
(e.g. max buildid size assumption being XC_PAGE_SIZE minus
sizeof(buildid->len)).

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Martin Mazein <amazein@amazon.de>
Reviewed-by: Andra-Irina Paraschiv <andraprs@amazon.com>
Reviewed-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
5 years ago  xen/arm: add reserved-memory regions to the dom0 memory node
Stefano Stabellini [Mon, 19 Aug 2019 17:43:38 +0000 (10:43 -0700)]
xen/arm: add reserved-memory regions to the dom0 memory node

Reserved memory regions are automatically remapped to dom0. Their device
tree nodes are also added to the dom0 device tree. However, the dom0 memory
node is not currently extended to cover the reserved memory region
ranges, as required by the spec. This commit fixes it.

Change make_memory_node to take a struct meminfo * instead of a
kernel_info. Call it twice for dom0, once to create the first regular
memory node, and the second time to create a second memory node with the
ranges covering reserved-memory regions.

Also, make a small code style fix in make_memory_node.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  xen/arm: don't iomem_permit_access for reserved-memory regions
Stefano Stabellini [Mon, 19 Aug 2019 17:43:37 +0000 (10:43 -0700)]
xen/arm: don't iomem_permit_access for reserved-memory regions

Don't allow reserved-memory regions to be remapped into any unprivileged
guests, until reserved-memory regions are properly supported in Xen. For
now, do not call iomem_permit_access on them, because giving
iomem_permit_access to dom0 means that the toolstack will be able to
assign the region to a domU.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  xen/arm: handle reserved-memory in consider_modules and dt_unreserved_regions
Stefano Stabellini [Mon, 19 Aug 2019 17:43:36 +0000 (10:43 -0700)]
xen/arm: handle reserved-memory in consider_modules and dt_unreserved_regions

reserved-memory regions overlap with memory nodes. The overlapping
memory is reserved-memory and should be handled accordingly:
consider_modules and dt_unreserved_regions should skip these regions the
same way they are already skipping mem-reserve regions.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  xen/arm: early_print_info print reserved_mem
Stefano Stabellini [Mon, 19 Aug 2019 17:43:35 +0000 (10:43 -0700)]
xen/arm: early_print_info print reserved_mem

Improve early_print_info to also print the banks saved in
bootinfo.reserved_mem. Print them right after RESVD, increasing the same
index.

Since we are at it, also switch the existing RESVD print to use unsigned
int.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Volodymyr Babchuk <volodymyr.babchuk@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  xen/arm: fix indentation in early_print_info
Stefano Stabellini [Mon, 19 Aug 2019 17:43:34 +0000 (10:43 -0700)]
xen/arm: fix indentation in early_print_info

No functional changes.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  xen/arm: keep track of reserved-memory regions
Stefano Stabellini [Mon, 19 Aug 2019 17:43:33 +0000 (10:43 -0700)]
xen/arm: keep track of reserved-memory regions

As we parse the device tree in Xen, keep track of the reserved-memory
regions as they need special treatment (follow-up patches will make use
of the stored information.)

Reuse process_memory_node to add reserved-memory regions to the
bootinfo.reserved_mem array.

Refuse to continue once we reach the max number of reserved memory
regions to avoid accidentally mapping any portions of them into a VM.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  xen/arm: make process_memory_node a device_tree_node_func
Stefano Stabellini [Mon, 19 Aug 2019 17:43:32 +0000 (10:43 -0700)]
xen/arm: make process_memory_node a device_tree_node_func

Change the signature of process_memory_node to match
device_tree_node_func. Thanks to this change, the next patch will be
able to use device_tree_for_each_node to call process_memory_node on all
the children of a provided node.

Return error if there is no reg property or if nr_banks is reached. Let
the caller deal with the error.
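
A minimal sketch of the "return an error once the bank array is full" behaviour is shown below. The struct layout, the MAX_BANKS value, the add_bank name and the -ENOSPC error code are all assumptions made for illustration; only the name struct meminfo and the nr_banks counter come from the commit messages.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BANKS 2 /* made-up capacity, chosen small for the sketch */

struct membank {
    uint64_t start, size;
};

struct meminfo {
    unsigned int nr_banks;
    struct membank bank[MAX_BANKS];
};

/* Append one bank, or report an error and leave the array untouched. */
static int add_bank(struct meminfo *mem, uint64_t start, uint64_t size)
{
    assert(mem != NULL);

    if (mem->nr_banks >= MAX_BANKS)
        return -ENOSPC; /* assumed error code, not taken from the patch */

    mem->bank[mem->nr_banks].start = start;
    mem->bank[mem->nr_banks].size = size;
    mem->nr_banks++;
    return 0;
}
```

Returning the error instead of silently dropping the bank lets the caller decide whether it is safe to continue.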

Add a printk when device tree parsing fails.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  xen/arm: pass node to device_tree_for_each_node
Stefano Stabellini [Mon, 19 Aug 2019 17:43:31 +0000 (10:43 -0700)]
xen/arm: pass node to device_tree_for_each_node

Add a new parameter to device_tree_for_each_node: node, the node to
start the search from.

To avoid scanning the device tree, and given that we only care about
relative increments of depth compared to the depth of the initial node,
we set the initial depth to 0. Then, we call func() for every node with
depth > 0.

Don't call func() on the parent node passed as an argument. Clarify the
change in the comment on top of the function. The current callers pass
the root node as argument: it is OK to skip the root node because no
relevant properties are in it, only subnodes.
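
The relative-depth idea can be sketched on a toy tree. This is not the libfdt-based implementation: the flattened struct node array, the callback type and all names below are hypothetical, and only the rule "start node counts as depth 0, call func() for every node with depth > 0" is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flattened tree: nodes in depth-first order with absolute depth. */
struct node {
    const char *name;
    int depth;
};

typedef int (*node_func)(const struct node *n, int rel_depth, void *data);

/*
 * Call func() on every descendant of nodes[start], but not on nodes[start]
 * itself. Depth is tracked relative to the start node, which counts as 0.
 */
static int for_each_node(const struct node *nodes, size_t nr, size_t start,
                         node_func func, void *data)
{
    assert(nodes != NULL && func != NULL && start < nr);

    for (size_t i = start + 1; i < nr; i++) {
        int rel_depth = nodes[i].depth - nodes[start].depth;
        int ret;

        if (rel_depth <= 0)
            break; /* left the subtree of the start node */

        ret = func(&nodes[i], rel_depth, data);
        if (ret)
            return ret;
    }

    return 0;
}

/* Example callback: counts how many nodes were visited. */
static int count_node(const struct node *n, int rel_depth, void *data)
{
    (void)n;
    (void)rel_depth;
    ++*(unsigned int *)data;
    return 0;
}
```

Starting from the root visits every other node, while starting from an inner node visits only that node's children and their descendants.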

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
[julien: Remove min_depth variable]
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago  livepatch: always print XENLOG_ERR information
Pawel Wieczorkiewicz [Wed, 14 Aug 2019 12:23:05 +0000 (12:23 +0000)]
livepatch: always print XENLOG_ERR information

A lot of legitimate error messages were hidden behind debug printk
only. Most of these messages can be triggered by loading a malformed
hotpatch payload and are priceless for understanding issues with such
payloads.
Thus, always display all relevant XENLOG_ERR messages.

Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Amit Shah <aams@amazon.de>
Reviewed-by: Martin Mazein <amazein@amazon.de>
Reviewed-by: Bjoern Doebel <doebel@amazon.de>
Reviewed-by: Ross Lagerwall <ross.lagerwall@citrix.com>
[Fix indentation and double LIVEPATCH prefixes, drop gratuitous punctuation]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago  xen/x86: pv: Convert update_intpte() to use typesafe MFN
Julien Grall [Tue, 30 Apr 2019 17:43:25 +0000 (18:43 +0100)]
xen/x86: pv: Convert update_intpte() to use typesafe MFN

The third parameter of update_intpte() is an MFN, so it can be switched
to use the typesafe MFN type.

At the same time, the typesafe MFN is propagated as far as possible without
major modifications.
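
For readers unfamiliar with the pattern, a typesafe frame number can be sketched as a one-member struct. The definitions below are a simplified illustration written for this example, not a copy of Xen's headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative wrapper types: distinct structs, so they cannot be mixed up. */
typedef struct { unsigned long mfn; } mfn_t;
typedef struct { unsigned long gfn; } gfn_t;

static inline mfn_t _mfn(unsigned long n) { return (mfn_t){ n }; }
static inline unsigned long mfn_x(mfn_t m) { return m.mfn; }

static inline gfn_t _gfn(unsigned long n) { return (gfn_t){ n }; }
static inline unsigned long gfn_x(gfn_t g) { return g.gfn; }

static inline bool mfn_eq(mfn_t a, mfn_t b) { return mfn_x(a) == mfn_x(b); }

static void typesafe_example(void)
{
    mfn_t m = _mfn(0x1234);
    gfn_t g = _gfn(0x1234);

    assert(mfn_x(m) == 0x1234);
    assert(mfn_eq(m, _mfn(0x1234)));
    assert(gfn_x(g) == mfn_x(m)); /* comparing raw values must be explicit */
    /* "m = g;" or "mfn_eq(m, g)" would be rejected by the compiler. */
}
```

The benefit is purely at compile time: passing a GFN where an MFN is expected becomes a type error instead of a silent bug.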

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago  xen: Convert is_xen_fixed_mfn to use typesafe MFN
Julien Grall [Sat, 26 Jan 2019 16:38:47 +0000 (16:38 +0000)]
xen: Convert is_xen_fixed_mfn to use typesafe MFN

No functional changes.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years ago  xen: Convert is_xen_heap_mfn to use typesafe MFN
Julien Grall [Sat, 26 Jan 2019 16:51:42 +0000 (16:51 +0000)]
xen: Convert is_xen_heap_mfn to use typesafe MFN

No functional changes.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years ago  xen: Convert hotplug page function to use typesafe MFN
Julien Grall [Sat, 26 Jan 2019 16:31:55 +0000 (16:31 +0000)]
xen: Convert hotplug page function to use typesafe MFN

Convert online_page, offline_page and query_page_offline to use
typesafe MFN.

At the same time, the typesafe MFN is propagated as far as possible without
major modifications.

Note, for clarity, the words have been re-ordered in the error message
updated by this patch.

No functional changes.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years ago  xen/grant-table: Make arch specific macros typesafe
Julien Grall [Sat, 26 Jan 2019 16:14:22 +0000 (16:14 +0000)]
xen/grant-table: Make arch specific macros typesafe

This patch reworks all the arch specific macros in grant_table.h to use
the typesafe MFN/GFN.

At the same time, some functions are renamed s/gmfn/gfn/ to match the
current naming scheme (see include/mm.h).

No functional changes intended.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago  xen/x86: Use mfn_to_gfn rather than mfn_to_gmfn
Julien Grall [Sat, 26 Jan 2019 15:58:48 +0000 (15:58 +0000)]
xen/x86: Use mfn_to_gfn rather than mfn_to_gmfn

mfn_to_gfn and mfn_to_gmfn do exactly the same thing, except that the former
uses mfn_t and gfn_t (return type).

Furthermore, the naming of the former is more consistent with the
current naming scheme (GFN/MFN). So replace mfn_to_gmfn with
mfn_to_gfn in x86 code.

Take the opportunity to convert some of the callers to use typesafe GFN and
format the message correctly.

No functional changes.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
--
    Changes in v3:
        - The hunk in x86/mm.c is not necessary anymore
        - Update printk message to use GFN rather than frame when suitable
        - Update commit message with some NITs
        - Add Jan's reviewed-by

    Changes in v2:
        - mfn_to_gfn now returns a gfn_t
        - Use %pd and PRI_gfn when possible in the message
        - Don't split format string to help grep/ack.

5 years ago  xen/x86: Make mfn_to_gfn typesafe
Julien Grall [Wed, 13 Mar 2019 15:36:40 +0000 (15:36 +0000)]
xen/x86: Make mfn_to_gfn typesafe

No functional changes intended.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years ago  x86: Restore IA32_MISC_ENABLE on wakeup
Michał Kowalczyk [Mon, 19 Aug 2019 02:23:33 +0000 (04:23 +0200)]
x86: Restore IA32_MISC_ENABLE on wakeup

Code in intel.c:early_init_intel() modifies IA32_MISC_ENABLE MSR. Those
modifications must be restored after resuming from S3 (see e.g. Linux wakeup
code), otherwise bad things may happen (e.g. wakeup code may cause #GP when
trying to set IA32_EFER.NXE [1]).

This bug was noticed on a ThinkPad x230 with NX disabled in the BIOS:
Xen could correctly boot, but crashed when resuming from suspend.
Applying this patch fixed the problem.

[1] Intel SDM vol 3: "If the execute-disable capability is not
available, a write to set IA32_EFER.NXE produces a #GP exception."

Signed-off-by: Michał Kowalczyk <mkow@invisiblethingslab.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago  xen/console: Simplify domU console handling in guest_console_write
Julien Grall [Tue, 2 Apr 2019 14:30:21 +0000 (15:30 +0100)]
xen/console: Simplify domU console handling in guest_console_write

2 paths in the domU console handling are now the same. So they can be
merged to make the code simpler.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
5 years ago  xen/public: Document HYPERCALL_console_io()
Julien Grall [Fri, 1 Mar 2019 15:39:21 +0000 (15:39 +0000)]
xen/public: Document HYPERCALL_console_io()

Currently, OS developers have to look at the Xen code in order to know
the parameters of a hypercall and how it is meant to work.

This is not a trivial task as you may need to have a deep understanding
of Xen internals.

This patch attempts to document the behavior of HYPERCALL_console_io() to
help OS developers.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years ago  xen/console: Rework HYPERCALL_console_io interface
Julien Grall [Mon, 5 Aug 2019 10:19:03 +0000 (11:19 +0100)]
xen/console: Rework HYPERCALL_console_io interface

At the moment, HYPERCALL_console_io is using signed int to describe the
command (@cmd) and the size of the buffer (@count).
    * @cmd does not need to be signed: it is used as a set of named values.
    None of them are negative. If new ones are introduced they can be
    positive.
    * @count is used to know the size of the buffer. It makes little
    sense to have a negative value here.

So both variables are now switched to use unsigned int.

Changing @count to unsigned type will result in a change of behavior for
the existing commands:
    - write: Any buffer bigger than 2GB will now be printed rather than
      being ignored (the command returned 0).
    - read: The return value is a signed 32-bit value for 32-bit Xen.
      To keep compatibility between 32-bit and 64-bit ABI, it
      effectively means the return value is 32-bit (despite being long
      on 64-bit). Negative values are used for errors and positive values
      for the number of characters read. To avoid clash between the two
      sets, the buffer is still limited to 2GB. The only difference is
      an error is returned rather than claiming there are no characters.
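
The read-side limit described above can be sketched as follows. The function name and the -E2BIG error code are assumptions made for this example; the sketch only models "reject a count that could not be reported back as a positive signed 32-bit value".

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/*
 * Hypothetical read-side check. The result shares one signed 32-bit value
 * between error codes (negative) and character counts (positive), so a
 * request that could not be reported as a positive int is rejected.
 */
static long console_read_limit(unsigned int count)
{
    if (count > (unsigned int)INT_MAX)
        return -E2BIG; /* assumed error code, not taken from the patch */

    return (long)count; /* pretend the whole buffer was filled */
}

static void console_read_example(void)
{
    assert(console_read_limit(80u) == 80);
    assert(console_read_limit((unsigned int)INT_MAX) == INT_MAX);
    assert(console_read_limit((unsigned int)INT_MAX + 1u) == -E2BIG);
}
```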

The behavior is only affecting unlikely use of the current interface, so
this is not a big concern regarding backward compatibility.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago  xen/console: Don't treat NUL character as the end of the buffer
Julien Grall [Tue, 26 Feb 2019 21:39:58 +0000 (21:39 +0000)]
xen/console: Don't treat NUL character as the end of the buffer

After upgrading Debian to Buster, I have begun to notice console
mangling when using zsh in Dom0. This is happening because output sent by
zsh to the console may contain NULs in the middle of the buffer.

The current implementation of CONSOLEIO_write considers that a buffer
always terminates with a NUL and therefore will ignore anything after it.

In general, NULs are perfectly legitimate in terminal streams. For
instance, this could be used for padding slow terminals. See terminfo(5)
section `Delays and Padding`, or search for the pcre '\bpad'.

Other use cases include using the console for dumping non-human
readable information (e.g. debugger, file if no network...). With the
current behavior, the resulting stream will end up corrupted.

The documentation for CONSOLEIO_write is pretty limited (not to say
nonexistent). From the declaration, the hypercall takes a buffer and size.
So this could lead one to think the NUL character is allowed in the middle of
the buffer.

This patch updates the console API to pass the size down along with the
buffer, so we can remove the reliance on the buffer being terminated by a NUL
character.
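
The difference between the two behaviours can be sketched with two small helpers. Both function names are hypothetical and the helpers only compute how many bytes would be emitted; they are not the console code itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sketch of the old behaviour: output stops at the first NUL in the buffer. */
static size_t len_until_nul(const char *buf, size_t count)
{
    const char *nul;

    assert(buf != NULL);
    nul = memchr(buf, '\0', count);
    return nul ? (size_t)(nul - buf) : count;
}

/* Sketch of the new behaviour: the caller-supplied size is authoritative. */
static size_t len_from_count(const char *buf, size_t count)
{
    assert(buf != NULL);
    return count;
}
```

For a 5-byte buffer holding 'a', 'b', NUL, 'c', 'd', the first helper reports 2 bytes and the second reports all 5, which is the case that was being mangled.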

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years ago  x86/mm: Clean IOMMU flags from p2m-pt code
Alexandru Stefan ISAILA [Wed, 14 Aug 2019 14:41:23 +0000 (15:41 +0100)]
x86/mm: Clean IOMMU flags from p2m-pt code

At this moment IOMMU pt sharing is disabled by commit [1].

This patch aims to clear the IOMMU hap share support as it will not be
used in the future. By doing this the IOMMU bits used in pte[52:58] can
be used in other ways.

[1] c2ba3db31ef2d9f1e40e7b6c16cf3be3d671d555

Suggested-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years ago  xen/arm: setup: Add Xen as boot module before printing all boot modules
Julien Grall [Mon, 12 Aug 2019 11:23:43 +0000 (12:23 +0100)]
xen/arm: setup: Add Xen as boot module before printing all boot modules

Since commit f60658c6ae "xen/arm: Stop relocating Xen", the position of
Xen in memory is not printed anymore. This can make it difficult to debug
early code.

As Xen is not relocated anymore, we can add Xen as boot module before
calling boot_fdt_info(). With that, the function will print Xen module
information along with all the other modules.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago  tools/pygrub: Failing to set value to 0 in Grub2ConfigFile
Michael Young [Tue, 13 Aug 2019 20:15:02 +0000 (21:15 +0100)]
tools/pygrub: Failing to set value to 0 in Grub2ConfigFile

In Grub2ConfigFile the code to handle ${saved_entry} and ${next_entry}
sets arg = "0" but this now does nothing following c/s d1b93ea2615bd
"tools/pygrub: Make pygrub understand default entry in string format"
which replaced arg.strip() with arg_strip in the following line.  This
patch restores the previous behaviour.

Signed-off-by: Michael Young <m.a.young@durham.ac.uk>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago  tools/xenstat: Fix -Wformat-truncation= issue
Andrew Cooper [Tue, 13 Aug 2019 13:46:00 +0000 (14:46 +0100)]
tools/xenstat: Fix -Wformat-truncation= issue

Building with GCC 8.3 on Buster identifies:

  src/xenstat_linux.c: In function 'xenstat_collect_networks':
  src/xenstat_linux.c:307:32: warning: 'snprintf' output may be truncated before
  the last format character [-Wformat-truncation=]
    snprintf(devNoBridge, 16, "p%s", devBridge);
                                  ^
  src/xenstat_linux.c:307:2: note: 'snprintf' output between 2 and 17 bytes into
  a destination of size 16
    snprintf(devNoBridge, 16, "p%s", devBridge);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

devNoBridge[] needs one character more than devBridge[], so allocate one byte
more.  Replace a raw 16 in the snprintf() call with a sizeof() expression
instead.
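
A sketch of the resulting pattern is shown below. The array names follow the warning text; the helper format_no_bridge, the example function and the sample device name are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Prefix the name with 'p'. Returns nonzero if the result was not truncated. */
static int format_no_bridge(char *out, size_t out_size, const char *devBridge)
{
    int n;

    assert(out != NULL && devBridge != NULL);
    n = snprintf(out, out_size, "p%s", devBridge);
    return n >= 0 && (size_t)n < out_size;
}

static void bridge_name_example(void)
{
    char devBridge[16] = "abcdefghijklmno";  /* longest name: 15 chars + NUL */
    char devNoBridge[sizeof(devBridge) + 1]; /* one extra byte for the 'p' */

    /* sizeof() tracks the array, so the size cannot drift out of sync. */
    assert(format_no_bridge(devNoBridge, sizeof(devNoBridge), devBridge));
    assert(strcmp(devNoBridge, "pabcdefghijklmno") == 0);
}
```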

Finally, libxenstat, unlike most of the rest of Xen, doesn't use -Werror
which is why this issue went unnoticed in CI.  Fix this.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
5 years ago  tools/xenstat: Fix -Wunused-function issue
Andrew Cooper [Tue, 13 Aug 2019 14:14:19 +0000 (15:14 +0100)]
tools/xenstat: Fix -Wunused-function issue

When compiling xenstat with -Werror, Clang complains:

  src/xenstat.c:134:34: error: unused function 'parse' [-Werror,-Wunused-function]
  static inline unsigned long long parse(char *s, char *match)
                                   ^
  1 error generated.

Drop the function.  It really is unused.

Spotted by Travis-CI.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
5 years ago  x86/tss: Fix clang build following c/s 7888440625
Andrew Cooper [Tue, 13 Aug 2019 11:53:15 +0000 (12:53 +0100)]
x86/tss: Fix clang build following c/s 7888440625

Clang-3.5 from Debian Jessie fails with:

  smpboot.c:829:29: error: statement expression not allowed at file scope
          BUILD_BUG_ON(sizeof(this_cpu(tss_page)) != PAGE_SIZE);
                              ^
  /local/xen.git/xen/include/asm/percpu.h:14:7: note: expanded from macro
          'this_cpu'
      (*RELOC_HIDE(&per_cpu__##var, get_cpu_info()->per_cpu_offset))
        ^
  /local/xen.git/xen/include/xen/compiler.h:98:3: note: expanded from macro
          'RELOC_HIDE'
    ({ unsigned long __ptr;                       \
    ^
  /local/xen.git/xen/include/xen/lib.h:26:53: note: expanded from macro
          'BUILD_BUG_ON'
  #define BUILD_BUG_ON(cond) ((void)BUILD_BUG_ON_ZERO(cond))
                                                      ^
  /local/xen.git/xen/include/xen/lib.h:25:57: note: expanded from macro
          'BUILD_BUG_ON_ZERO'
  #define BUILD_BUG_ON_ZERO(cond) sizeof(struct { int:-!!(cond); })
                                                          ^
  1 error generated.
  /local/xen.git/xen/Rules.mk:202: recipe for target 'smpboot.o' failed

This is obviously a compiler bug because the BUILD_BUG_ON() is not at file
scope.  However, it can be worked around by using a local variable.

Spotted by Gitlab CI.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wl@xen.org>
5 years ago  x86/AMD-Vi: Fold exit paths of {enable,disable}_iommu()
Andrew Cooper [Mon, 12 Aug 2019 17:19:59 +0000 (18:19 +0100)]
x86/AMD-Vi: Fold exit paths of {enable,disable}_iommu()

... to avoid having multiple spin_unlock_irqrestore() calls.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>
5 years ago  x86/boot: Simplify %fs setup in trampoline_setup
Andrew Cooper [Thu, 8 Aug 2019 16:18:10 +0000 (17:18 +0100)]
x86/boot: Simplify %fs setup in trampoline_setup

mov/shr is easier to follow than shld, and doesn't have a merge dependency on
the previous value of %edx.  Shorten the rest of the code by streamlining the
comments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wl@xen.org>
5 years ago  xen/arm: domain_build: Consolidate make_timer_node() and make_timer_domU_node()
Viktor Mitin [Wed, 7 Aug 2019 10:10:28 +0000 (13:10 +0300)]
xen/arm: domain_build: Consolidate make_timer_node() and make_timer_domU_node()

At the moment, the hwdom and domUs are creating the timer node
differently.

Technically the timer is exposed the same way for any domain, the only
difference should be the interrupts used. The two current other
differences are:
    - compatible: The hwdom DT will use the same as the one provided
      by the host. The domUs DT will use "arm,armv7-timer" for
      32-bit domain and "arm,armv8-timer" for 64-bit domain. The latter
      matches the behavior of libxl when guests are created from
      userspace.

    - clock-frequency: The property is used on platforms with
      broken firmware to indicate the clock frequency. This should
      be used by all the domains, however this is not yet the case
      for domUs created by Xen.

To avoid further discrepancies, the two functions are now consolidated into
one place: make_timer_node().

For simplicity, the compatible will now be based on the bitness even
for the hwdom. This means the compatible exposed for the hwdom may
differ. This should only have an impact on 32-bit hwdom booting on
Armv8 hardware.

Suggested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Viktor Mitin <viktor_mitin@epam.com>
[julien: Reword commit message]
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agoxen/arm: extend fdt_property_interrupts to support DomU
Viktor Mitin [Wed, 7 Aug 2019 10:10:27 +0000 (13:10 +0300)]
xen/arm: extend fdt_property_interrupts to support DomU

The domain and fdt can be found in the structure kinfo.
Rather than adding an extra argument for the domain, pass kinfo
directly.
This also requires adapting the fdt_property_interrupts() prototype.
A follow-up patch will need to create the interrupts for either Dom0 or
DomU.

Suggested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Viktor Mitin <viktor_mitin@epam.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agox86/pv: Clean up cr3 handling in arch_set_info_guest()
Andrew Cooper [Fri, 21 Dec 2018 13:44:30 +0000 (13:44 +0000)]
x86/pv: Clean up cr3 handling in arch_set_info_guest()

All of this code lives inside CONFIG_PV which means gfn == mfn, and the
fill_ro_mpt() calls clearly show that the value is used untranslated.

Change cr3_gfn to a suitably typed cr3_mfn, and replace get_page_from_gfn()
with get_page_from_mfn().

No functional change.
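
As an illustration of why a suitably typed cr3_mfn helps, a minimal
sketch of a type-safe frame number wrapper follows. The demo_* names and
the DEMO_PAGE_SHIFT value are assumptions made for this example, not
Xen's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12

/* Hypothetical wrappers: distinct struct types cannot be mixed up silently. */
typedef struct { uint64_t m; } demo_mfn_t;
typedef struct { uint64_t g; } demo_gfn_t;

/* A single-member struct costs nothing over the raw integer. */
static_assert(sizeof(demo_mfn_t) == sizeof(uint64_t),
              "wrapper adds no storage");

static inline demo_mfn_t demo_mfn(uint64_t m) { return (demo_mfn_t){ m }; }
static inline uint64_t demo_mfn_x(demo_mfn_t mfn) { return mfn.m; }

/* Accepts only an MFN; passing a demo_gfn_t is a compile-time error. */
static inline uint64_t demo_mfn_to_maddr(demo_mfn_t mfn)
{
    return demo_mfn_x(mfn) << DEMO_PAGE_SHIFT;
}
```

With such types, a function taking an MFN documents in its prototype that
the value is used untranslated.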

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/boot: Remove gratuitous call back into low-memory code
David Woodhouse [Tue, 13 Aug 2019 17:28:39 +0000 (18:28 +0100)]
x86/boot: Remove gratuitous call back into low-memory code

We appear to have implemented a memcpy() in the low-memory trampoline
which we then call into from __start_xen(), for no adequately defined
reason.

Kill it with fire.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/vvmx: Fix nested virt on VMCS-Shadow capable hardware
Andrew Cooper [Tue, 30 Jul 2019 14:19:04 +0000 (15:19 +0100)]
x86/vvmx: Fix nested virt on VMCS-Shadow capable hardware

c/s e9986b0dd "x86/vvmx: Simplify per-CPU memory allocations" had the wrong
indirection on its pointer check in nvmx_cpu_up_prepare(), causing the
VMCS-shadowing buffer to never be allocated.  Fix it.

This in turn results in a massive quantity of logspam, as every virtual
vmentry/exit hits both gdprintk()s in the *_bulk() functions.

Switch these to using printk_once(), but still only in debug builds.  The size
of the buffer is chosen at compile time, so complaining about it repeatedly is
of no benefit.
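
The print-once idea can be sketched roughly as below. This is a
standalone approximation: demo_printk_once(), demo_bulk_op() and the
demo_emitted counter are hypothetical names, printf() stands in for the
hypervisor's printk(), and the debug-build restriction is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int demo_emitted; /* number of messages actually printed */

/* Hypothetical print-once helper: each call site gets its own static flag. */
#define demo_printk_once(...)          \
    do {                               \
        static bool once_;             \
        if ( !once_ )                  \
        {                              \
            once_ = true;              \
            demo_emitted++;            \
            printf(__VA_ARGS__);       \
        }                              \
    } while ( 0 )

static void demo_bulk_op(unsigned int n, unsigned int buf_entries)
{
    assert(buf_entries > 0); /* buffer size is fixed at build time */

    if ( n > buf_entries )
        demo_printk_once("bulk op of %u entries exceeds buffer of %u\n",
                         n, buf_entries);
}
```

However many oversized requests arrive, the complaint appears once per
call site.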

Finally, drop the runtime NULL pointer checks.  It is not terribly appropriate
to be repeatedly checking infrastructure which is set up from start-of-day,
and in this case, actually hid the above bug.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/atomic: Improvements and simplifications to assembly constraints
Andrew Cooper [Wed, 21 Nov 2018 13:50:21 +0000 (13:50 +0000)]
x86/atomic: Improvements and simplifications to assembly constraints

 * Constraints in the form "=r" (x) : "0" (x) can be folded to just "+r" (x)
 * Switch to using named parameters (mostly for legibility) which in
   particular helps with...
 * __xchg(), __cmpxchg() and __cmpxchg_user() modify their memory operand, so
   must list it as an output operand.  This only works because they each have
   a memory clobber to give the construct full compiler-barrier properties.
 * Every memory operand has an explicit known size.  Letting the compiler see
   the real size rather than obscuring it with __xg() allows for the removal
   of the instruction size suffixes without introducing ambiguity.
 * Drop semicolons after lock prefixes.
 * Other misc style changes.
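
A rough sketch of the constraint style described in the list above,
using a hypothetical demo_xchg32() helper rather than Xen's real
__xchg(). The non-x86 branch is only there so the example builds
elsewhere and is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not Xen's __xchg(). */
static inline uint32_t demo_xchg32(volatile uint32_t *ptr, uint32_t val)
{
    assert(ptr != NULL);

#if defined(__x86_64__) || defined(__i386__)
    /*
     * [val] is both read and written, so "+r" replaces the older
     * "=r" (val) : "0" (val) pair.  [mem] is modified as well, hence "+m".
     * No size suffix is needed: the register operand fixes the width.
     */
    __asm__ __volatile__ ( "xchg %[val], %[mem]"
                           : [val] "+r" (val), [mem] "+m" (*ptr)
                           :
                           : "memory" );
    return val;
#else
    /* Fallback for other architectures, using a compiler builtin. */
    return __atomic_exchange_n(ptr, val, __ATOMIC_SEQ_CST);
#endif
}
```

The named operands make it obvious which operand is the memory location
and which is the register being swapped.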

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/percpu: Make DECLARE_PER_CPU() and __DEFINE_PER_CPU() common
Andrew Cooper [Fri, 26 Jul 2019 18:48:48 +0000 (19:48 +0100)]
xen/percpu: Make DECLARE_PER_CPU() and __DEFINE_PER_CPU() common

These macros are identical across the architectures, and shouldn't be separate
from the DEFINE_PER_CPU*() infrastructure.

This converts the final asm/percpu.h includes, which were all using
DECLARE_PER_CPU(), to include xen/percpu.h instead.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
5 years agox86/xpti: Don't leak TSS-adjacent percpu data via Meltdown
Jan Beulich [Fri, 9 Aug 2019 13:16:06 +0000 (14:16 +0100)]
x86/xpti: Don't leak TSS-adjacent percpu data via Meltdown

The XPTI work restricted the visibility of most of memory, but missed a few
aspects when it came to the TSS.

Given that the TSS is just an object in percpu data, the 4k mapping for it
created in setup_cpu_root_pgt() maps adjacent percpu data, making it all
leakable via Meltdown, even when XPTI is in use.

Furthermore, no care is taken to check that the TSS doesn't cross a page
boundary.  As it turns out, struct tss_struct is aligned on its size which
does prevent it straddling a page boundary.

Rework the TSS types while making this change.  Rename tss_struct to tss64, to
mirror the existing tss32 structure we have in HVM's Task Switch logic.  Drop
tss64's alignment and __cacheline_filler[] field.

Introduce tss_page which contains a single tss64 and keeps the rest of the
page clear, so no adjacent data can be leaked.  Move the definition from
setup.c to traps.c, which is a more appropriate place for it to live.
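
The layout idea can be sketched as follows. The demo_* names are
hypothetical and the TSS is reduced to an opaque 104-byte blob (the
minimum size of a 64-bit TSS); this is not the actual Xen definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096

/* Hypothetical stand-in: 104 bytes is the minimum 64-bit TSS size. */
struct demo_tss64 {
    uint8_t raw[104];
};

/* The alignment attribute also rounds sizeof() up to a whole page. */
struct demo_tss_page {
    struct demo_tss64 tss;
} __attribute__((__aligned__(DEMO_PAGE_SIZE)));

static_assert(sizeof(struct demo_tss_page) == DEMO_PAGE_SIZE,
              "TSS must own its page exclusively");

static struct demo_tss_page demo_tss_page;

/* Bytes of zeroed padding sharing the page with the TSS. */
static inline size_t demo_tss_padding(void)
{
    return sizeof(demo_tss_page) - sizeof(demo_tss_page.tss);
}
```

Because the object fills the page, a 4k mapping of it cannot expose any
neighbouring data.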

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/desc: Drop __HYPERVISOR_CS32
Andrew Cooper [Fri, 9 Aug 2019 12:25:10 +0000 (13:25 +0100)]
x86/desc: Drop __HYPERVISOR_CS32

Xen, being 64bit only these days, does not use a 32bit Ring 0 code segment.

Delete __HYPERVISOR_CS32 and remove it from the GDTs.  Also delete
__HYPERVISOR_CS64 and use __HYPERVISOR_CS uniformly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/desc: Build boot_{,compat_}gdt[] in C
Andrew Cooper [Mon, 12 Aug 2019 07:17:01 +0000 (09:17 +0200)]
x86/desc: Build boot_{,compat_}gdt[] in C

... where we can at least get the compiler to fill in the surrounding space
without having to do it manually.  This also results in the symbols having
proper type/size information in the debug symbols.

Reorder 'raw' in the seg_desc_t union to allow for easier initialisation.

Leave a comment explaining the various restrictions we have on altering the
GDT layout.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Introduce SEL2GDT(). Correct GDT indices in public header comments.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoxen/page_alloc: Keep away MFN 0 from the buddy allocator
Julien Grall [Fri, 9 Aug 2019 12:14:40 +0000 (13:14 +0100)]
xen/page_alloc: Keep away MFN 0 from the buddy allocator

Combining of buddies happens only such that the resulting larger buddy
is still order-aligned. To cross a zone boundary while merging, the
implication is that both the buddy [0, 2^n-1] and the buddy
[2^n, 2^(n+1)-1] are free.

Ideally we want to fix the allocator, but for now we can just prevent
adding the MFN 0 in the allocator to avoid merging across zone
boundaries.
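
To see why MFN 0 is the special case, consider the textbook buddy
computation sketched below. It is an illustration only: the demo_*
helpers are hypothetical, and Xen's allocator is not necessarily written
this way.

```c
#include <assert.h>

/* Start MFN of the buddy of the order-aligned chunk beginning at mfn. */
static inline unsigned long demo_buddy_of(unsigned long mfn,
                                          unsigned int order)
{
    assert((mfn & ((1UL << order) - 1)) == 0); /* must be order-aligned */
    return mfn ^ (1UL << order);
}

/* Start MFN of the chunk produced by merging mfn with its buddy. */
static inline unsigned long demo_merged_start(unsigned long mfn,
                                              unsigned int order)
{
    return mfn & ~(1UL << order);
}
```

The buddy of the chunk starting at 2^n (order n) is the chunk starting at
0, so that merge can only happen if [0, 2^n-1] is entirely free. With
MFN 0 never handed to the allocator, it never is.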

On x86, the MFN 0 is already kept away from the buddy allocator. So the
bug can only happen on Arm platforms where the first memory bank
starts at 0.

As this is specific to the allocator, the MFN 0 is removed in the common code
to cater for all the architectures (current and future).

[Stefano: improve commit message]

Reported-by: Jeff Kubascik <jeff.kubascik@dornerworks.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoxen/link: Introduce .bss.percpu.page_aligned
Andrew Cooper [Fri, 9 Aug 2019 14:36:58 +0000 (16:36 +0200)]
xen/link: Introduce .bss.percpu.page_aligned

Future changes are going to need to page align some percpu data.

Shuffle the exact link order of items within the BSS to give
.bss.percpu.page_aligned appropriate alignment, even on CPU0, which uses
.bss.percpu itself.

Insert explicit alignment such that there won't be a gap between
__per_cpu_start and the first actual per-CPU object.  The POINTER_ALIGN
for __bss_end is to cover the lack of SMP_CACHE_BYTES alignment, as the
loops which zero the BSS use pointer-sized stores on all architectures.

Rework __DEFINE_PER_CPU() so the caller passes in all attributes, and
adjust DEFINE_PER_CPU{,_READ_MOSTLY}() to match.  This has the added bonus
that it is now possible to grep for .bss.percpu and find all the users.

Finally, introduce DEFINE_PER_CPU_PAGE_ALIGNED() which specifies the
section attribute and verifies the type's alignment.
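
A minimal sketch of a definition macro that verifies, rather than
imposes, the alignment. DEMO_DEFINE_PAGE_ALIGNED() and struct
demo_page_obj are hypothetical names, and the section placement done by
the real macro is left out.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096

/* Hypothetical macro: checks the type's alignment instead of imposing it. */
#define DEMO_DEFINE_PAGE_ALIGNED(type, name)                        \
    static_assert(alignof(type) == DEMO_PAGE_SIZE,                  \
                  #type " must be page aligned");                   \
    type name

/* The type itself carries the alignment. */
struct demo_page_obj {
    uint8_t data[64];
} __attribute__((__aligned__(DEMO_PAGE_SIZE)));

DEMO_DEFINE_PAGE_ALIGNED(struct demo_page_obj, demo_obj);
```

Using the macro with a type that lacks the alignment attribute fails at
build time instead of silently producing a misplaced object.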

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Make DEFINE_PER_CPU_PAGE_ALIGNED() verify the alignment rather than
specifying it. It is the underlying type which should be suitably aligned.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86: define a few selector values
Jan Beulich [Fri, 9 Aug 2019 14:35:42 +0000 (16:35 +0200)]
x86: define a few selector values

TSS, LDT, and per-CPU entries all can benefit a little from also having
their selector values defined.
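
For reference, the relationship between a GDT index and its selector
value can be sketched as below. The helper names are hypothetical, and
the index used in any example is illustrative rather than Xen's actual
GDT layout.

```c
#include <assert.h>
#include <stdint.h>

/* x86 selector: bits 15..3 index, bit 2 table indicator, bits 1..0 RPL. */
static inline uint16_t demo_gdt_selector(unsigned int index,
                                         unsigned int rpl)
{
    assert(index < 8192 && rpl < 4);
    return (uint16_t)((index << 3) | rpl);
}

/* Inverse direction: recover the descriptor index from a selector. */
static inline unsigned int demo_sel_to_index(uint16_t sel)
{
    return sel >> 3;
}
```

For example, index 7 at RPL 0 gives selector 0x38, and the same index at
RPL 3 gives 0x3b.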

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agopython: fix -Wsign-compare warnings
Marek Marczykowski-Górecki [Fri, 9 Aug 2019 02:01:36 +0000 (03:01 +0100)]
python: fix -Wsign-compare warnings

Specifically:
xen/lowlevel/xc/xc.c: In function ‘pyxc_domain_create’:
xen/lowlevel/xc/xc.c:147:24: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare]
  147 |         for ( i = 0; i < sizeof(xen_domain_handle_t); i++ )
      |                        ^
xen/lowlevel/xc/xc.c: In function ‘pyxc_domain_sethandle’:
xen/lowlevel/xc/xc.c:312:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare]
  312 |     for ( i = 0; i < sizeof(xen_domain_handle_t); i++ )
      |                    ^
xen/lowlevel/xc/xc.c: In function ‘pyxc_domain_getinfo’:
xen/lowlevel/xc/xc.c:391:24: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare]
  391 |         for ( j = 0; j < sizeof(xen_domain_handle_t); j++ )
      |                        ^
xen/lowlevel/xc/xc.c: In function ‘pyxc_get_device_group’:
xen/lowlevel/xc/xc.c:677:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘uint32_t’ {aka ‘unsigned int’} [-Werror=sign-compare]
  677 |     for ( i = 0; i < num_sdevs; i++ )
      |                    ^
xen/lowlevel/xc/xc.c: In function ‘pyxc_physinfo’:
xen/lowlevel/xc/xc.c:988:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare]
  988 |     for ( i = 0; i < sizeof(pinfo.hw_cap)/4; i++ )
      |                    ^
xen/lowlevel/xc/xc.c:994:20: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare]
  994 |     for ( i = 0; i < ARRAY_SIZE(virtcaps_bits); i++ )
      |                    ^
xen/lowlevel/xc/xc.c:998:24: error: comparison of integer expressions of different signedness: ‘int’ and ‘long unsigned int’ [-Werror=sign-compare]
  998 |         for ( i = 0; i < ARRAY_SIZE(virtcaps_bits); i++ )
      |                        ^
xen/lowlevel/xs/xs.c: In function ‘xspy_ls’:
xen/lowlevel/xs/xs.c:191:23: error: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Werror=sign-compare]
  191 |         for (i = 0; i < xsval_n; i++)
      |                       ^
xen/lowlevel/xs/xs.c: In function ‘xspy_get_permissions’:
xen/lowlevel/xs/xs.c:297:23: error: comparison of integer expressions of different signedness: ‘int’ and ‘unsigned int’ [-Werror=sign-compare]
  297 |         for (i = 0; i < perms_n; i++) {
      |                       ^
cc1: all warnings being treated as errors

Use size_t for loop iterators where it's compared with sizeof() or
similar construct.
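
The pattern of the fix looks roughly like this. demo_handle_t and
demo_sum_handle() are hypothetical; the 16-byte array is an assumption
made to keep the example standalone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handle type, modelled as a 16-byte array. */
typedef uint8_t demo_handle_t[16];

static_assert(sizeof(demo_handle_t) == 16, "handle is 16 bytes");

static unsigned int demo_sum_handle(const demo_handle_t h)
{
    unsigned int sum = 0;

    /* size_t matches the type of sizeof(), so no signedness mismatch. */
    for ( size_t i = 0; i < sizeof(demo_handle_t); i++ )
        sum += h[i];

    return sum;
}
```

Declaring the iterator as int instead would reproduce the
-Wsign-compare diagnostic quoted above.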

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agoxen/arm: unbreak arm64 build for older toolchains
Stefano Stabellini [Wed, 7 Aug 2019 16:49:15 +0000 (09:49 -0700)]
xen/arm: unbreak arm64 build for older toolchains

Commit 4941bfb "xen/arm64: macros: Introduce an assembly macro to alias
x30" moved

  lr      .req    x30

to macros.h. A later patch (1396dab "xen/arm64: head: Don't clobber
x30/lr in the macro PRINT") started to use "lr" in head.S, however, it
didn't add an #include macros.h to head.S. This commit fixes it.

The lack of alias breaks the build with
gcc-linaro-5.3.1-2016.05-x86_64_aarch64-linux-gnu. The alias was added
later to binutils 2.29 in 2017.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
5 years agoxen/sched: fix memory leak in credit2
Juergen Gross [Wed, 7 Aug 2019 11:04:49 +0000 (13:04 +0200)]
xen/sched: fix memory leak in credit2

csched2_deinit() is leaking the run-queue memory.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Dario Faggioli <dfaggioli@suse.com>
5 years agoxen/percpu: Drop unused asm/percpu.h includes
Andrew Cooper [Fri, 26 Jul 2019 18:48:48 +0000 (19:48 +0100)]
xen/percpu: Drop unused asm/percpu.h includes

These files either don't use any PER_CPU() infrastructure at all, or use
DEFINE_PER_CPU_*().  This is declared in xen/percpu.h, not asm/percpu.h, which
means that xen/percpu.h is included via a different path.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/percpu: Drop unused xen/percpu.h includes
Andrew Cooper [Fri, 26 Jul 2019 19:26:24 +0000 (20:26 +0100)]
xen/percpu: Drop unused xen/percpu.h includes

None of these headers use any PER_CPU() infrastructure.

xen/rwlock.h however does, and picked it up transitively via xen/spinlock.h,
so include it properly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoarm/percpu: Move {get,set}_processor_id() into current.h
Andrew Cooper [Fri, 26 Jul 2019 19:41:03 +0000 (20:41 +0100)]
arm/percpu: Move {get,set}_processor_id() into current.h

For cleanup purposes, it is necessary for asm/percpu.h to not use
DECLARE_PER_CPU() itself.  asm/current.h is arguably a better place for this
functionality to live anyway.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agox86/desc: Shorten boot_{,compat_}gdt[] variable names
Andrew Cooper [Mon, 5 Aug 2019 10:17:46 +0000 (11:17 +0100)]
x86/desc: Shorten boot_{,compat_}gdt[] variable names

The current names, boot_cpu_{,compat_}gdt_table, have a table suffix which is
redundant with the T of GDT, and the cpu infix doesn't provide any meaningful
context.  Drop them both.

Likewise, shorten the {,compat_}gdt{,_l1e} variables.

Finally, rename gdt_descr to boot_gdtr to more clearly identify its purpose.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/boot: Set Accessed bits in boot_cpu_{,compat_}gdt_table[]
Andrew Cooper [Wed, 7 Aug 2019 11:29:01 +0000 (12:29 +0100)]
x86/boot: Set Accessed bits in boot_cpu_{,compat_}gdt_table[]

There is no point causing the CPU to perform a locked update of the
descriptors on first use.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/asm: Include msr-index.h rather than msr.h
Andrew Cooper [Fri, 2 Aug 2019 12:35:14 +0000 (13:35 +0100)]
x86/asm: Include msr-index.h rather than msr.h

There is nothing interesting for assembly code in msr.h.  Include msr-index.h
instead, and drop the __ASSEMBLY__ guards in msr.h.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agolibxl: 9pfs has a QEMU backend
Stefano Stabellini [Tue, 6 Aug 2019 17:25:00 +0000 (18:25 +0100)]
libxl: 9pfs has a QEMU backend

Add 9pfs to the kinds of PV drivers that have a QEMU backend, specifically
to the macro QEMU_BACKEND.

This is needed otherwise upon domain destroy we get a timeout error:

libxl: error: libxl_device.c:1132:device_backend_callback: Domain 1:unable to remove device with path /local/domain/0/backend/9pfs/1/0
libxl: error: libxl_domain.c:1129:devices_destroy_cb: Domain 1:libxl__devices_destroy failed

This change should have been part of b53b4037cef6 "libxl/xl: add support
for Xen 9pfs".

Also add a comment in libxl_types_internal.idl to help remember changing
QEMU_BACKEND going forward.

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
5 years agoAMD/IOMMU: drop stray "else"
Jan Beulich [Wed, 7 Aug 2019 10:12:00 +0000 (12:12 +0200)]
AMD/IOMMU: drop stray "else"

The blank line between it and the prior if() clearly indicates that this
was meant to be a standalone if().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoAMD/IOMMU: miscellaneous DTE handling adjustments
Jan Beulich [Wed, 7 Aug 2019 10:11:22 +0000 (12:11 +0200)]
AMD/IOMMU: miscellaneous DTE handling adjustments

First and foremost switch boolean fields to bool. Adjust a few related
function parameters as well. Then
- in amd_iommu_set_intremap_table() don't use literal numbers,
- in iommu_dte_add_device_entry() use a compound literal instead of many
  assignments,
- in amd_iommu_setup_domain_device()
  - eliminate a pointless local variable,
  - use || instead of && when deciding whether to clear an entry,
  - clear the I field without any checking of ATS / IOTLB state,
- leave reserved fields unnamed.
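
The compound literal approach can be sketched as follows. struct
demo_dte is a hypothetical, much simplified entry and does not reflect
the real DTE layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, much simplified device table entry. */
struct demo_dte {
    bool valid;
    bool translation_valid;
    bool iotlb_enable;
    uint16_t domain_id;
    uint64_t pt_root;
};

static void demo_dte_init(struct demo_dte *dte, uint16_t domid,
                          uint64_t root)
{
    assert(dte != NULL);

    /* One assignment; members not named here are zero-initialised. */
    *dte = (struct demo_dte){
        .valid = true,
        .translation_valid = true,
        .domain_id = domid,
        .pt_root = root,
    };
}
```

Compared with a run of individual assignments, stale values in fields
that are not mentioned cannot survive the initialisation.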

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>
5 years agox86/apic: enable x2APIC mode before doing any setup
Roger Pau Monné [Wed, 7 Aug 2019 10:09:51 +0000 (12:09 +0200)]
x86/apic: enable x2APIC mode before doing any setup

Current code calls apic_x2apic_probe which does some initialization
and setup before having enabled x2APIC mode (if it's not already
enabled by the firmware).

This can lead to issues if the APIC ID doesn't match the x2APIC ID, as
apic_x2apic_probe calls init_apic_ldr_x2apic_cluster which depending
on the APIC mode might set cpu_2_logical_apicid using the APIC ID
instead of the x2APIC ID (because x2APIC might not be enabled yet).

Fix this by enabling x2APIC before calling apic_x2apic_probe.

As a remark, this was discovered while I was trying to figure out why
one of my test boxes didn't report any iommu faults. The root cause
was that the iommu MSI address field was set using the stale value in
cpu_2_logical_apicid, and thus the iommu fault interrupt would get
lost. Even if the MSI address field gets set to a correct value
afterwards, as soon as a single iommu fault is pending no further
interrupts would get injected, so losing a single iommu fault
interrupt is fatal.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoIntel TXT: add reviewer, move to Odd Fixes state
Lukasz Hawrylko [Wed, 7 Aug 2019 10:09:31 +0000 (12:09 +0200)]
Intel TXT: add reviewer, move to Odd Fixes state

Support for Intel TXT has orphaned status right now because
no active maintainer is listed. Adding myself as reviewer
and moving it to Odd Fixes state.

Signed-off-by: Lukasz Hawrylko <lukasz.hawrylko@linux.intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoCODING_STYLE: document intended usage of types
Jan Beulich [Wed, 7 Aug 2019 10:08:38 +0000 (12:08 +0200)]
CODING_STYLE: document intended usage of types

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agopassthrough/amd: Drop "IOMMU not found" message
Andrew Cooper [Mon, 5 Aug 2019 16:40:36 +0000 (17:40 +0100)]
passthrough/amd: Drop "IOMMU not found" message

Since c/s 9fa94e10585 "x86/ACPI: also parse AMD IOMMU tables early", this
function is unconditionally called in all cases where a DMAR ACPI table
doesn't exist.

As a consequence, "AMD-Vi: IOMMU not found!" is printed in all cases where an
IOMMU isn't present, even on non-AMD systems.  Drop the message - it isn't
terribly interesting anyway, and is now misleading in a number of common
cases.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
5 years agomm: Safe to clear PGC_allocated on xenheap pages without an extra reference
George Dunlap [Tue, 6 Aug 2019 11:19:55 +0000 (12:19 +0100)]
mm: Safe to clear PGC_allocated on xenheap pages without an extra reference

Commits ec83f825627 "mm.h: add helper function to test-and-clear
_PGC_allocated" (and subsequent fix-up 44a887d021d "mm.h: fix BUG_ON()
condition in put_page_alloc_ref()") introduced a BUG_ON() to detect
unsafe behavior of callers.

Unfortunately this condition still turns out to be too strict.
xenheap pages are somewhat "magic": calling free_domheap_pages() on
them will not cause free_heap_pages() to be called: whichever part of
Xen allocated them specially must call free_xenheap_pages()
specifically.  (They'll also be handled appropriately at domain
destruction time.)

Only crash Xen when put_page_alloc_ref() finds only a single refcount
if the page is not a xenheap page.
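
The relaxed condition can be expressed as a predicate along these lines.
The DEMO_PGC_* bit positions and the function name are assumptions for
illustration and do not match Xen's real count_info layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag layout for a page's count_info word. */
#define DEMO_PGC_ALLOCATED   (1u << 31)
#define DEMO_PGC_XEN_HEAP    (1u << 30)
#define DEMO_PGC_COUNT_MASK  ((1u << 30) - 1)

static_assert((DEMO_PGC_COUNT_MASK &
               (DEMO_PGC_ALLOCATED | DEMO_PGC_XEN_HEAP)) == 0,
              "flags must not overlap the reference count");

/* True when dropping the allocation reference should be treated as a bug. */
static inline bool demo_alloc_ref_drop_is_bug(uint32_t count_info)
{
    bool single_ref = (count_info & DEMO_PGC_COUNT_MASK) <= 1;
    bool xenheap = count_info & DEMO_PGC_XEN_HEAP;

    return single_ref && !xenheap;
}
```

A xenheap page holding only one reference is therefore accepted, while a
domheap page in the same state still trips the check.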

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agotests/x86emul: Annotate test blobs as executable code
Andrew Cooper [Fri, 24 May 2019 15:14:53 +0000 (16:14 +0100)]
tests/x86emul: Annotate test blobs as executable code

This causes objdump to disassemble them, rather than rendering them as
straight hex data.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/shim: Fix parallel build following c/s 32b1d62887d0 smoke staging
Andrew Cooper [Mon, 5 Aug 2019 13:48:21 +0000 (14:48 +0100)]
x86/shim: Fix parallel build following c/s 32b1d62887d0

Unfortunately, a parallel build from clean can fail in the following manner:

  xen.git$ make -j4 -C tools/firmware/xen-dir/
  make: Entering directory '/local/xen.git/tools/firmware/xen-dir'
  mkdir -p xen-root
  make: *** No rule to make target 'xen-root/xen/arch/x86/configs/pvshim_defconfig', needed by 'xen-root/xen/.config'.  Stop.
  make: *** Waiting for unfinished jobs....

The rule for pvshim_defconfig needs to depend on the linkfarm, rather than
$(D)/xen/.config specifically.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/shim: Refresh pvshim_defconfig
Andrew Cooper [Fri, 26 Jul 2019 09:54:41 +0000 (10:54 +0100)]
x86/shim: Refresh pvshim_defconfig

* Add a dependency so the shim gets rebuilt when pvshim_defconfig changes.
* Default to the NULL scheduler now that it works with vcpu online/offline.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen: sched: refactor the ASSERTs around vcpu_deassign()
Dario Faggioli [Mon, 5 Aug 2019 10:50:57 +0000 (11:50 +0100)]
xen: sched: refactor the ASSERTs around vcpu_deassign()

Whenever we call vcpu_deassign(), the vcpu _must_ be assigned to a
pCPU, and hence that pCPU can't be free.

Therefore, move the ASSERT-s which check for these properties into that
function, where they fit better.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citix.com>
Message-Id: <156412236781.2385.9110155201477198899.stgit@Palanthas>

5 years agoxen: sched: reassign vCPUs to pCPUs, when they come back online
Dario Faggioli [Mon, 5 Aug 2019 10:50:56 +0000 (11:50 +0100)]
xen: sched: reassign vCPUs to pCPUs, when they come back online

When a vcpu that was offline comes back online, we do want it to either
be assigned to a pCPU, or go into the wait list.

Detecting that a vcpu is coming back online is a bit tricky. Basically,
if the vcpu is waking up, and is neither assigned to a pCPU, nor in the
wait list, it must be coming back from offline.

When this happens, we put it in the waitqueue, and we "tickle" an idle
pCPU (if any), to go pick it up.
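
The detection described above can be sketched as below. struct demo_vcpu
and demo_vcpu_wake() are hypothetical simplifications; the real
scheduler tracks this state differently and also handles locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NO_PCPU (-1)

struct demo_vcpu {
    int pcpu;        /* assigned pCPU, or DEMO_NO_PCPU */
    bool in_waitq;   /* true while parked in the wait list */
};

/* Returns true when the caller should tickle an idle pCPU. */
static bool demo_vcpu_wake(struct demo_vcpu *v)
{
    assert(v != NULL);

    /* Neither assigned nor waiting: it must be coming back online. */
    if ( v->pcpu == DEMO_NO_PCPU && !v->in_waitq )
    {
        v->in_waitq = true;
        return true;
    }

    return false;
}
```

In the common case the function costs a single comparison and returns
false.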

Looking at the patch, it seems that the vcpu wakeup code is getting
complex, and hence that it could potentially introduce latencies.
However, all this new logic is triggered only by the case of a vcpu
coming online, so, basically, the overhead during normal operations is
just an additional 'if()'.

Signed-off-by: Dario Faggioli <dario.faggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Message-Id: <156412236222.2385.236340632846050170.stgit@Palanthas>

5 years agoxen: sched: deal with vCPUs being or becoming online or offline
Dario Faggioli [Mon, 5 Aug 2019 10:50:55 +0000 (11:50 +0100)]
xen: sched: deal with vCPUs being or becoming online or offline

If a vCPU is, or is going, offline we want it to be neither
assigned to a pCPU, nor in the wait list, so:
- if an offline vcpu is inserted (or migrated) it must not
  go on a pCPU, nor in the wait list;
- if an offline vcpu is removed, we are sure that it is
  neither on a pCPU nor in the wait list already, so we
  should just bail, avoiding doing any further action;
- if a vCPU goes offline we need to remove it either from
  its pCPU or from the wait list.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Message-Id: <156412235656.2385.13861979113936528474.stgit@Palanthas>

5 years agoxen: sched: refactor code around vcpu_deassign() in null scheduler
Dario Faggioli [Mon, 5 Aug 2019 10:50:54 +0000 (11:50 +0100)]
xen: sched: refactor code around vcpu_deassign() in null scheduler

vcpu_deassign() is called only once (in _vcpu_remove()).

Let's consolidate the two functions into one.

No functional change intended.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Message-Id: <156412235104.2385.3911161728130674771.stgit@Palanthas>

5 years agoautomation: add openSUSE Tumbleweed CI image coverity-tested/smoke master
Dario Faggioli [Wed, 31 Jul 2019 16:58:46 +0000 (18:58 +0200)]
automation: add openSUSE Tumbleweed CI image

openSUSE comes in two flavours: Leap, which is non-rolling and released
annually, and Tumbleweed, which is rolling.

Reasons why it makes sense to have both (despite both being openSUSE,
package lists in dockerfiles being quite similar, etc) are:
- Leap shares a lot with SUSE Linux Enterprise. So, a regression on Leap
  not only means a regression for all openSUSE Leap users; catching it
  also helps prevent/catch regressions on SLE;
- Tumbleweed often has the most bleeding-edge software, so it will help
  us prevent/catch regressions with newly released versions of
  libraries, compilers, etc (e.g., at the time of writing this commit,
  some build issues with GCC9 were discovered while trying to build
  in a Tumbleweed image).

Note that, considering the rolling nature of Tumbleweed, the container
would need to be rebuilt (e.g., periodically), even if the docker file
does not change.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
5 years agoautomation: try to keep openSUSE Leap image a little smaller
Dario Faggioli [Wed, 31 Jul 2019 16:58:40 +0000 (18:58 +0200)]
automation: try to keep openSUSE Leap image a little smaller

Using `--no-recommends` with the update and install commands should
prevent packages that are not strictly necessary from being installed.

Doing a `clean -a` after installing all the packages should, in
theory, free more space (as opposed to using just `clean`).

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
5 years agoautomation: add info about container pushes
Doug Goldstein [Sat, 3 Aug 2019 14:44:17 +0000 (09:44 -0500)]
automation: add info about container pushes

To be able to push a container, users must have access and have logged
into the container registry. The docs did not explain this fully so this
documents the steps better.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoci: install C++ in opensuse-leap CI container
Dario Faggioli [Fri, 26 Jul 2019 10:03:25 +0000 (12:03 +0200)]
ci: install C++ in opensuse-leap CI container

The openSUSE Leap container image, built from
opensuse-leap.dockerfile, was missing gcc-c++,
which is necessary, e.g., for building OVMF.

Add it.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
5 years agox86/microcode: always collect_cpu_info() during boot
Sergey Dyasli [Thu, 1 Aug 2019 10:22:37 +0000 (18:22 +0800)]
x86/microcode: always collect_cpu_info() during boot

Currently cpu_sig struct is not updated during boot if no microcode blob
is specified by "ucode=[<integer>| scan]".

It will result in cpu_sig.rev being 0 which affects APIC's
check_deadline_errata() and retpoline_safe() functions.

Fix this by getting ucode revision early during boot and SMP bring up.
While at it, protect early_microcode_update_cpu() for cases when
microcode_ops is NULL.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agotools/xen-ucode: Upload a microcode blob to the hypervisor
Chao Gao [Thu, 1 Aug 2019 10:22:36 +0000 (18:22 +0800)]
tools/xen-ucode: Upload a microcode blob to the hypervisor

This patch provides a tool for late microcode update.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Use consistent style.  Add to gitignore.]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoxen/arm64: head: Don't setup the fixmap on secondary CPUs
Julien Grall [Sun, 9 Jun 2019 17:16:38 +0000 (18:16 +0100)]
xen/arm64: head: Don't setup the fixmap on secondary CPUs

setup_fixmap() will set up the fixmap in the boot page tables in order to
use earlyprintk and also update the register x23 holding the address to
the UART.

However, secondary CPUs are not using earlyprintk between turning the
MMU on and switching to the runtime page table. So setting up the
fixmap in the boot page tables is pointless.

This means most of setup_fixmap() is not necessary for the secondary
CPUs. The update of UART address is now moved out of setup_fixmap() and
duplicated in the CPU boot and secondary CPUs boot. Additionally, the
call to setup_fixmap() is removed from secondary CPUs boot.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm64: head: Move assembly switch to the runtime PT in secondary CPUs path
Julien Grall [Mon, 15 Apr 2019 11:14:38 +0000 (12:14 +0100)]
xen/arm64: head: Move assembly switch to the runtime PT in secondary CPUs path

The assembly switch to the runtime PT is only necessary for the
secondary CPUs. So move the code into the secondary CPUs path.

This is definitely not compliant with the Arm Arm, as we are switching
between two different sets of page-tables without turning off the MMU.
However, turning off the MMU is impossible here as the ID map may clash
with other mappings in the runtime page-tables. This will require more
rework to avoid the problem. So for now add a TODO in the code.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm64: head: Document enable_mmu()
Julien Grall [Fri, 7 Jun 2019 21:07:19 +0000 (22:07 +0100)]
xen/arm64: head: Document enable_mmu()

Document the behavior and the main registers usage within enable_mmu().

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm64: head: Improve coding style and document create_pages_tables()
Julien Grall [Fri, 7 Jun 2019 20:53:37 +0000 (21:53 +0100)]
xen/arm64: head: Improve coding style and document create_pages_tables()

Adjust the coding style used in the comments within create_pages_tables().

Lastly, document the behavior and the main registers usage within the
function. Note that x25 is now only used within the function, so it does
not need to be part of the common register.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm64: head: Improve coding style and document cpu_init()
Julien Grall [Fri, 7 Jun 2019 19:03:46 +0000 (20:03 +0100)]
xen/arm64: head: Improve coding style and document cpu_init()

Adjust the coding style used in the comments within cpu_init(). Take the
opportunity to alter the early print to match the function name.

Lastly, document the behavior and the main registers usage within the
function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm64: head: Rework and document zero_bss()
Julien Grall [Fri, 7 Jun 2019 18:59:15 +0000 (19:59 +0100)]
xen/arm64: head: Rework and document zero_bss()

On secondary CPUs, zero_bss() will be a NOP because BSS only needs to be
zeroed once at boot. So the call in the secondary CPUs path can be
removed. It also means that x26 does not need to be set for secondary
CPUs.

Note that we will need to keep x26 around for the boot CPU as BSS should
not be reset when booting via UEFI.

Lastly, document the behavior and the main registers usage within the
function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm64: head: Rework and document check_cpu_mode()
Julien Grall [Fri, 7 Jun 2019 18:29:03 +0000 (19:29 +0100)]
xen/arm64: head: Rework and document check_cpu_mode()

A branch in the success case can be avoided by inverting the branch
condition. At the same time, remove a pointless comment as Xen can only
run at EL2.

Lastly, document the behavior and the main registers usage within the
function.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm64: head: Introduce distinct paths for the boot CPU and secondary CPUs
Julien Grall [Fri, 7 Jun 2019 18:28:51 +0000 (19:28 +0100)]
xen/arm64: head: Introduce distinct paths for the boot CPU and secondary CPUs

The boot code is currently quite difficult to go through because of the
lack of documentation and a number of indirections to avoid executing
some paths in either the boot CPU or secondary CPUs.

In an attempt to make the boot code easier to follow, each part of the
boot is now in a separate function. Furthermore, the paths for the boot
CPU and secondary CPUs are now distinct and for now will call each
function.

Follow-ups will remove unnecessary calls and make further improvements
(such as adding documentation and reshuffling).

Note that the switch from using the 1:1 mapping to the runtime mapping
is duplicated for each path. This is because in the future we will need
to stay longer in the 1:1 mapping for the boot CPU.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm: platform: Add Raspberry Pi platform
Stewart Hildebrand [Mon, 29 Jul 2019 13:19:20 +0000 (09:19 -0400)]
xen/arm: platform: Add Raspberry Pi platform

The aux peripherals (uart1, spi1, and spi2) share an IRQ and a page of
memory. For debugging, it is helpful to use the aux UART in Xen. In
this case, Xen would try to assign spi1 and spi2 to dom0, but this
results in an error since the shared IRQ was already assigned to Xen.
Blacklist aux devices other than the UART to prevent mapping the shared
IRQ and memory range to dom0.

Blacklisting spi1 and spi2 unfortunately makes those peripherals
unavailable for use in the system. Future work could include forwarding
the IRQ for spi1 and spi2, and trapping and mediating access to the memory
range for spi1 and spi2.

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago xen/arm: types: Specify the zero padding in the definition of PRIregister
Julien Grall [Thu, 16 May 2019 22:39:36 +0000 (23:39 +0100)]
xen/arm: types: Specify the zero padding in the definition of PRIregister

The definition of PRIregister varies between Arm32 and Arm64 (32-bit vs
64-bit). However, some of the users use the wrong padding and others
are not using padding at all.

For more consistency, the padding is now moved into the PRIregister and
varies depending on the architecture.
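
A minimal sketch of a format macro that carries its own zero padding is
shown here. It is not the definition from the patch: the reg_t typedef,
the pointer-width test used to pick the variant and the format_reg()
helper are assumptions made so the example builds on an ordinary host.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Assumption: pick the register width from the host pointer size so the
 * sketch compiles anywhere. Xen selects it per architecture instead.
 */
#if UINTPTR_MAX > 0xffffffffu
typedef uint64_t reg_t;
#define PRIregister "016" PRIx64   /* 64-bit: 16 zero-padded hex digits */
#else
typedef uint32_t reg_t;
#define PRIregister "08" PRIx32    /* 32-bit: 8 zero-padded hex digits */
#endif

/* Callers no longer spell out any padding themselves. */
static int format_reg(char *buf, size_t len, reg_t val)
{
    return snprintf(buf, len, "0x%" PRIregister, val);
}
```

With the padding inside the macro, every user prints a register with the
same width for a given build.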

Signed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm: vsmc: The function identifier is always 32-bit
Julien Grall [Thu, 16 May 2019 22:31:46 +0000 (23:31 +0100)]
xen/arm: vsmc: The function identifier is always 32-bit

On Arm64, the SMCCC function identifier is always stored in the first 32 bits
of the x0 register. The rest of the bits are not defined and should be
ignored.

This means the variable funcid should be a uint32_t rather than
register_t.
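
The effect of narrowing the type can be sketched as below. The helper
name smccc_funcid() is hypothetical and only illustrates the truncation;
it is not taken from the patch.

```c
#include <stdint.h>

/*
 * Keep only bits [31:0] of x0. The upper bits are not defined for the
 * function identifier, so they must not influence the dispatch.
 */
static inline uint32_t smccc_funcid(uint64_t x0)
{
    return (uint32_t)x0;
}
```

Two x0 values that differ only in their upper 32 bits then map to the
same function identifier.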

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm: traps: Avoid BUG_ON() in do_trap_brk()
Julien Grall [Wed, 15 May 2019 16:48:04 +0000 (17:48 +0100)]
xen/arm: traps: Avoid BUG_ON() in do_trap_brk()

At the moment, do_trap_brk() is using a BUG_ON() to check the hardware
has been correctly configured during boot.

Any error when configuring the hardware could result in a guest 'brk'
trapping into the hypervisor and crashing it.

It is pretty harsh to kill Xen when actually killing the guest would
be enough, as misconfiguring this trap would not lead to exposing
sensitive data. Replace the BUG_ON() with crashing the guest.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm: traps: Avoid using BUG_ON() in _show_registers()
Julien Grall [Wed, 15 May 2019 16:16:13 +0000 (17:16 +0100)]
xen/arm: traps: Avoid using BUG_ON() in _show_registers()

At the moment, _show_registers() is using a BUG_ON() to assert only
userspace will run 32-bit code in a 64-bit domain.

Such extra precaution is not necessary and could be avoided by only
checking the CPU mode to decide whether show_registers_64() or
show_registers_32() should be called.

This also has the nice advantage of avoiding nested ifs in the code.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/arm: Rework psr_mode_is_32bit()
Julien Grall [Wed, 15 May 2019 13:34:55 +0000 (14:34 +0100)]
xen/arm: Rework psr_mode_is_32bit()

psr_mode_is_32bit() prototype does not match the rest of the helpers for
the process state. Looking at the callers, most of them will access
struct cpu_user_regs just for calling psr_mode_is_32bit().

The macro is now reworked to take a struct cpu_user_regs as a parameter.
At the same time take the opportunity to switch to a static inline
helper.

Lastly, when compiled for 32-bit, Xen will only support 32-bit guests. So
it is pointless to check whether the register state corresponds to 64-bit
or not.
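
The shape of such a helper can be sketched as follows. The cut-down
register structure, the *_sketch names and the SKETCH_ARM_32 switch are
assumptions for illustration; the mode bit relies on bit 4 of the saved
program status being set for AArch32 modes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 4 of the saved program status is set when the mode is AArch32. */
#define PSR_MODE_BIT 0x10u

/* Hypothetical, cut-down register frame; the real one has more fields. */
struct cpu_user_regs_sketch {
    uint64_t cpsr;
};

static inline bool
psr_mode_is_32bit_sketch(const struct cpu_user_regs_sketch *regs)
{
#ifdef SKETCH_ARM_32
    /* A 32-bit build only runs 32-bit guests, so no check is needed. */
    (void)regs;
    return true;
#else
    return (regs->cpsr & PSR_MODE_BIT) != 0;
#endif
}
```

Callers that already hold the register frame can pass it straight in
rather than dereferencing it just to extract the status word.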

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years ago xen/doc: Improve Dom0-less documentation
Viktor Mitin [Wed, 31 Jul 2019 08:10:41 +0000 (11:10 +0300)]
xen/doc: Improve Dom0-less documentation

- Replaced unprintable characters (0xA0) using %s/\%xA0/ /g,
  so all the spaces are 0x20 now.

- Added address-cells and size-cells to configuration example.
  This resolves the dom0less boot issue in case of arm64.

- Added some notes about xl tools usage in case of dom0less.

Signed-off-by: Viktor Mitin <viktor_mitin@epam.com>
[julien: Remove newline at the end of the file]
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago ns16550: Add compatible string for Raspberry Pi 4
Stewart Hildebrand [Mon, 29 Jul 2019 13:19:19 +0000 (09:19 -0400)]
ns16550: Add compatible string for Raspberry Pi 4

Per the BCM2835 peripherals datasheet [1] page 10:
"The UART core is build to emulate 16550 behaviour ... The implemented
UART is not a 16650 compatible UART However as far as possible the
first 8 control and status registers are laid out like a 16550 UART. Al
16550 register bits which are not supported can be written but will be
ignored and read back as 0. All control bits for simple UART receive/
transmit operations are available."

Additionally, Linux uses the 8250/16550 driver for the aux UART [2].

Unfortunately the brcm,bcm2835-aux-uart device tree binding doesn't
have the reg-shift and reg-io-width properties [3]. Thus, the reg-shift
and reg-io-width values are treated as inherent properties of this UART.
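
As an illustration of what reg-shift means for a 16550-style driver, the
sketch below turns a register index into a byte offset. The structure and
helper names are hypothetical, and the values used in the usage note
(a shift of 2 and a width of 4 bytes) are assumptions for the example,
not values quoted from this message.

```c
#include <stdint.h>

/* Hypothetical per-UART layout description. */
struct uart_layout_sketch {
    uint8_t reg_shift;   /* log2 of the spacing between registers */
    uint8_t reg_width;   /* access width in bytes */
};

/* Byte offset of 16550 register number 'reg' from the UART base. */
static inline uint32_t
uart_reg_offset(const struct uart_layout_sketch *u, unsigned int reg)
{
    return (uint32_t)reg << u->reg_shift;
}
```

With an assumed shift of 2, register 5 (the line status register on a
16550) lands at byte offset 0x14 from the base.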

Thanks to Andre Przywara for contributing the reg-shift and
reg-io-width setting snippet.

In my testing, I have relied on enable_uart=1 being set in config.txt,
a configuration file read by the Raspberry Pi's firmware. With
enable_uart=1, the firmware performs UART initialization.

[1] https://www.raspberrypi.org/documentation/hardware/raspberrypi/bcm2835/BCM2835-ARM-Peripherals.pdf
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/tty/serial/8250/8250_bcm2835aux.c
[3] https://www.kernel.org/doc/Documentation/devicetree/bindings/serial/brcm,bcm2835-aux-uart.txt

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years ago xen/spec-ctrl: Speculative mitigation facilities report wrong status
Jin Nan Wang [Wed, 31 Jul 2019 13:33:44 +0000 (13:33 +0000)]
xen/spec-ctrl: Speculative mitigation facilities report wrong status

Booting with spec-ctrl=0 results in Xen printing "None MD_CLEAR".

  (XEN)   Support for HVM VMs: None MD_CLEAR
  (XEN)   Support for PV VMs: None MD_CLEAR

Add a check for X86_FEATURE_MD_CLEAR to avoid printing "None".
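
The logic of the fix can be sketched as below: "None" is printed only
when no facility at all is available, MD_CLEAR included. The helper name
and its boolean parameters are simplified stand-ins chosen for this
example, not the actual feature tests in Xen.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Build the feature list for one "Support for ... VMs:" line.
 * The three flags are hypothetical stand-ins for the real feature checks.
 */
static int format_support(char *buf, size_t len,
                          bool msr_spec_ctrl, bool rsb, bool md_clear)
{
    return snprintf(buf, len, "%s%s%s%s",
                    (msr_spec_ctrl || rsb || md_clear) ? "" : " None",
                    msr_spec_ctrl ? " MSR_SPEC_CTRL" : "",
                    rsb ? " RSB" : "",
                    md_clear ? " MD_CLEAR" : "");
}
```

Leaving md_clear out of the first condition reproduces the reported
"None MD_CLEAR" output.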

Signed-off-by: James Wang <jnwang@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years ago x86/ubsan: Don't perform alignment checking on supporting compilers
Andrew Cooper [Mon, 24 Jun 2019 09:43:34 +0000 (10:43 +0100)]
x86/ubsan: Don't perform alignment checking on supporting compilers

GCC 5 introduced -fsanitize=alignment which is enabled by default by
CONFIG_UBSAN.  This trips a load of wont-fix cases in the ACPI tables and the
hypercall page and stubs writing logic.

It also causes the native Xen boot to crash before the console is set up, for
an as-yet unidentified reason (most likely a wont-fix case earlier in boot).

Disable alignment sanitisation on compilers which would try using it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>