xenbits.xensource.com Git - xen.git/log
4 years ago x86: split __copy_{from,to}_user() into "guest" and "unsafe" variants
Jan Beulich [Fri, 19 Feb 2021 16:19:19 +0000 (17:19 +0100)]
x86: split __copy_{from,to}_user() into "guest" and "unsafe" variants

The "guest" variants are intended to work with (potentially) fully guest
controlled addresses, while the "unsafe" variants are intended to be
used in order to access addresses not (directly) under guest control,
within Xen's part of virtual address space. Subsequently we will want
them to have distinct behavior, so as a first step identify which one is
which. For now, both groups of constructs alias one another.
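
A rough illustration of the aliasing described above. This is a sketch only:
raw_copy_sketch() and the two *_sketch wrappers are hypothetical user-space
stand-ins built on memcpy(), not Xen's actual accessors. Both groups of names
resolve to one worker until their behavior is made to diverge:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical common worker. It returns the number of bytes NOT copied,
 * following the usual copy_{from,to}_user() convention. This user-space
 * model cannot fault, so it always returns 0.
 */
static inline unsigned long raw_copy_sketch(void *to, const void *from,
                                            unsigned long n)
{
    memcpy(to, from, n);
    return 0;
}

/* "guest" flavour: the source address may be fully guest controlled. */
#define copy_from_guest_sketch(to, from, n)  raw_copy_sketch(to, from, n)

/* "unsafe" flavour: the source lies within the hypervisor's own range. */
#define copy_from_unsafe_sketch(to, from, n) raw_copy_sketch(to, from, n)
```

Keeping two names at the call sites means a later change can alter one
flavour without having to re-audit every caller.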

Double underscore prefixes are retained only on
__copy_{from,to}_guest_pv(), to allow still distinguishing them from
their "checking" counterparts once they also get renamed (to
copy_{from,to}_guest_pv()).

Add previously missing __user at some call sites.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org> [shadow]
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago x86: split __{get,put}_user() into "guest" and "unsafe" variants
Jan Beulich [Fri, 19 Feb 2021 16:18:27 +0000 (17:18 +0100)]
x86: split __{get,put}_user() into "guest" and "unsafe" variants

The "guest" variants are intended to work with (potentially) fully guest
controlled addresses, while the "unsafe" variants are intended to be
used in order to access addresses not (directly) under guest control,
within Xen's part of virtual address space. (For linear page table and
descriptor table accesses the low bits of the addresses may still be
guest controlled, but this still won't allow speculation to "escape"
into unwanted areas.) Subsequently we will want them to have distinct
behavior, so as a first step identify which one is which. For now, both
groups of constructs alias one another.

Double underscore prefixes are retained only on __{get,put}_guest(), to
allow still distinguishing them from their "checking" counterparts once
they also get renamed (to {get,put}_guest()).

Since for them it's almost a full re-write, move what becomes
{get,put}_unsafe_size() into the "common" uaccess.h (x86_64/*.h should
disappear at some point anyway).

In __copy_to_user() one of the two casts in each put_guest_size()
invocation gets dropped. They're not needed and did break symmetry with
__copy_from_user().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org> [shadow]
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago xen/arm : smmuv3: Fix to handle multiple StreamIds per device.
Rahul Singh [Wed, 17 Feb 2021 10:05:14 +0000 (10:05 +0000)]
xen/arm : smmuv3: Fix to handle multiple StreamIds per device.

The SMMUv3 driver does not handle multiple StreamIDs if the master device
supports more than one StreamID.

This bug was introduced when the driver was ported from Linux to Xen.
dt_device_set_protected(..) should be called from add_device(..) not
from the dt_xlate(..).

Move dt_device_set_protected(..) from dt_xlate(..) to add_device().

Signed-off-by: Rahul Singh <rahul.singh@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years ago gnttab: bypass IOMMU (un)mapping when a domain is (un)mapping its own grant
Jan Beulich [Thu, 18 Feb 2021 12:16:59 +0000 (13:16 +0100)]
gnttab: bypass IOMMU (un)mapping when a domain is (un)mapping its own grant

Mappings for a domain's own pages should already be present in the
IOMMU. While installing the same mapping again is merely redundant (and
inefficient), removing the mapping when the grant mapping gets removed
is outright wrong in this case: The mapping was there before the map, so
should remain in place after unmapping.

This affects
- Arm Dom0 in the direct mapped case,
- x86 PV Dom0 in the "iommu=dom0-strict" / "dom0-iommu=strict" case,
- all x86 PV DomU-s, including driver domains.

See the code comment for why it's the original domain and not the page
owner that gets compared against.

Reported-by: Rahul Singh <Rahul.Singh@arm.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years ago gnttab: never permit mapping transitive grants
Jan Beulich [Thu, 18 Feb 2021 12:16:12 +0000 (13:16 +0100)]
gnttab: never permit mapping transitive grants

Transitive grants allow an intermediate domain I to grant a target
domain T access to a page which origin domain O did grant I access to.
As an implementation restriction, T is not allowed to map such a grant.
Enforcement of this restriction is currently attempted by marking active
entries resulting from transitive grants as is-sub-page; sub-page grants
for obvious reasons don't allow mapping. However, marking (and checking)
only active entries is insufficient, as a map attempt may also occur on
a grant not otherwise in use. When not presently in use (pin count zero)
the grant type itself needs checking. Otherwise T may be able to map an
unrelated page owned by I. This is because the "transitive" sub-
structure of the v2 union would end up being interpreted as "full_page"
sub-structure instead. The low 32 bits of the GFN used would match the
grant reference specified in I's transitive grant entry, while the upper
32 bits could be random (depending on how exactly I sets up its grant
table entries).
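
The aliasing described above can be modelled in a few lines. This is a sketch
under stated assumptions: the union below is a simplified mirror of the v2
grant entry, the *_SKETCH flag values and both helper names are illustrative
rather than taken from the public header, and a little-endian host is assumed:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified, assumed layout of a v2 grant entry (illustration only). */
struct hdr_sketch { uint16_t flags; uint16_t domid; };

union grant_entry_sketch {
    struct hdr_sketch hdr;
    struct {
        struct hdr_sketch hdr;
        uint32_t pad0;
        uint64_t frame;          /* GFN, at byte offset 8 */
    } full_page;
    struct {
        struct hdr_sketch hdr;
        uint16_t trans_domid;
        uint16_t pad0;
        uint32_t gref;           /* also at byte offset 8 */
    } transitive;
    uint32_t spacer[4];
};

/* Assumed grant type encoding; names are hypothetical. */
#define GTF_TYPE_MASK_SKETCH     3u
#define GTF_PERMIT_ACCESS_SKETCH 1u
#define GTF_TRANSITIVE_SKETCH    3u

/* Low 32 bits of the GFN a "full_page" reader would see in this entry. */
static uint32_t misread_frame_low32(const union grant_entry_sketch *e)
{
    return (uint32_t)e->full_page.frame;
}

/* Type check needed when the entry is not yet active (pin count zero). */
static bool grant_type_permits_mapping(uint16_t flags)
{
    return (flags & GTF_TYPE_MASK_SKETCH) == GTF_PERMIT_ACCESS_SKETCH;
}
```

With an entry whose transitive.gref is 0x1234, misread_frame_low32() yields
0x1234 on a little-endian host, which is why the type has to be checked
before the "full_page" view is trusted.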

Note that if one mapping already exists and the granting domain _then_
changes the grant to GTF_transitive (which the domain is not supposed to
do), the changed type will only be honored after the pin count has gone
back to zero. This is no different from e.g. GTF_readonly or
GTF_sub_page becoming set when a grant is already in use.

While adjusting the implementation, also adjust commentary in the public
header to better reflect reality.

Fixes: 3672ce675c93 ("Transitive grant support")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago IOREQ: refine when to send mapcache invalidation request
Jan Beulich [Thu, 18 Feb 2021 12:11:19 +0000 (13:11 +0100)]
IOREQ: refine when to send mapcache invalidation request

XENMEM_decrease_reservation isn't the only means by which pages can get
removed from a guest, yet all removals ought to be signaled to qemu. Put
setting of the flag into the central p2m_remove_page() underlying all
respective hypercalls as well as a few similar places, mainly in PoD
code.

Additionally there's no point sending the request for the local domain
when the domain acted upon is a different one. The latter domain's ioreq
server mapcaches need invalidating. We assume that domain to be paused
at the point the operation takes place, so sending the request in this
case happens from the hvm_do_resume() path, which as one of its first
steps calls handle_hvm_io_completion().

Even without the remote operation aspect a single domain-wide flag
doesn't do: Guests may e.g. decrease-reservation on multiple vCPU-s in
parallel. Each of them needs to issue an invalidation request in due
course, in particular because exiting to guest context should not happen
before the request was actually seen by (all) the emulator(s).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago stubdom/xenstored: Fix uninitialised variables in lu_read_state()
Andrew Cooper [Thu, 11 Feb 2021 21:10:51 +0000 (21:10 +0000)]
stubdom/xenstored: Fix uninitialised variables in lu_read_state()

Various versions of gcc, when compiling with -Og, complain:

  xenstored_control.c: In function ‘lu_read_state’:
  xenstored_control.c:540:11: error: ‘state.size’ is used uninitialized in this
  function [-Werror=uninitialized]
    if (state.size == 0)
        ~~~~~^~~~~
  xenstored_control.c:543:6: error: ‘state.buf’ may be used uninitialized in
  this function [-Werror=maybe-uninitialized]
    pre = state.buf;
    ~~~~^~~~~~~~~~~
  xenstored_control.c:550:23: error: ‘state.buf’ may be used uninitialized in
  this function [-Werror=maybe-uninitialized]
     (void *)head - state.buf < state.size;
                    ~~~~~^~~~
  xenstored_control.c:550:35: error: ‘state.size’ may be used uninitialized in
  this function [-Werror=maybe-uninitialized]
     (void *)head - state.buf < state.size;
                                ~~~~~^~~~~

for the stubdom build.  This is because lu_get_dump_state() is a no-op stub in
MiniOS, and state really is operated on uninitialised.
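
A minimal model of the problem and one way to address it. The struct and
function names ending in _sketch or _stub are hypothetical, and the real
change may differ in detail:

```c
#include <stddef.h>

struct lu_dump_state_sketch {
    void *buf;
    unsigned int size;
};

/* Models a platform where the helper is a no-op stub: *state is untouched. */
static void lu_get_dump_state_stub(struct lu_dump_state_sketch *state)
{
    (void)state;
}

/* Returns 0 when there is no state to read, 1 when a buffer is present. */
static int lu_read_state_sketch(void)
{
    /* Initialise explicitly so the checks below are well defined. */
    struct lu_dump_state_sketch state = { NULL, 0 };

    lu_get_dump_state_stub(&state);

    if (state.size == 0)
        return 0;

    return state.buf != NULL;
}
```

With the initialiser in place, the no-op stub leaves size at 0 and the reader
bails out early instead of acting on indeterminate values.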

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/libxl: Fix uninitialised variable in libxl__write_stub_dmargs()
Andrew Cooper [Thu, 11 Feb 2021 17:44:36 +0000 (17:44 +0000)]
tools/libxl: Fix uninitialised variable in libxl__write_stub_dmargs()

Various versions of gcc, when compiling with -Og, complain:

  libxl_dm.c: In function ‘libxl__write_stub_dmargs’:
  libxl_dm.c:2166:16: error: ‘dmargs’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
               rc = libxl__xs_write_checked(gc, t, path, dmargs);
               ~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It isn't actually used while uninitialised, but only because of how the
is_linux_stubdom checks line up.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/libxg: Drop stale p2m logic from ARM's meminit()
Andrew Cooper [Thu, 11 Feb 2021 17:45:21 +0000 (17:45 +0000)]
tools/libxg: Drop stale p2m logic from ARM's meminit()

Various versions of gcc, when compiling with -Og, complain:

  xg_dom_arm.c: In function 'meminit':
  xg_dom_arm.c:420:19: error: 'p2m_size' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    420 |     dom->p2m_size = p2m_size;
        |     ~~~~~~~~~~~~~~^~~~~~~~~~

This is actually entirely stale code since ee21f10d70^..97e34ad22d which
removed the 1:1 identity p2m for translated domains.

Drop the write of dom->p2m_size, and the p2m_size local variable.  Reposition
the p2m_size field in struct xc_dom_image and correct some stale
documentation.

This change really ought to have been part of the original cleanup series.

No actual change to how ARM domains are constructed.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/libxg: Fix uninitialised variable in write_x86_cpu_policy_records()
Andrew Cooper [Thu, 11 Feb 2021 14:25:57 +0000 (14:25 +0000)]
tools/libxg: Fix uninitialised variable in write_x86_cpu_policy_records()

Various versions of gcc, when compiling with -Og, complain:

  xg_sr_common_x86.c: In function 'write_x86_cpu_policy_records':
  xg_sr_common_x86.c:92:12: error: 'rc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     92 |     return rc;
        |            ^~

The complaint is legitimate, and can occur with unexpected behaviour of two
related hypercalls in combination with a libc which permits zero-length
malloc()s.

Have an explicit rc = 0 on the success path, and make the MSRs record error
handling consistent with the CPUID record before it.

Fixes: f6b2b8ec53d ("libxc/save: Write X86_{CPUID,MSR}_DATA records")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/xl: Fix exit code for `xl vkbattach`
Andrew Cooper [Thu, 11 Feb 2021 18:49:23 +0000 (18:49 +0000)]
tools/xl: Fix exit code for `xl vkbattach`

Various versions of gcc, when compiling with -Og, complain:

  xl_vkb.c: In function 'main_vkbattach':
  xl_vkb.c:79:12: error: 'rc' may be used uninitialized in this function [-Werror=maybe-uninitialized]
     79 |     return rc;
        |            ^~

The dryrun_only path really does leave rc uninitialised.  Introduce a done
label for success paths to use.
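
The control flow can be sketched as follows. All names here are hypothetical
stand-ins, and the real function does considerably more work:

```c
#include <stdbool.h>

#define EXIT_OK_SKETCH   0
#define EXIT_FAIL_SKETCH 1

/* Hypothetical stand-in for the real attach call; fails on request. */
static int attach_device_sketch(bool should_fail)
{
    return should_fail ? -1 : 0;
}

static int main_attach_sketch(bool dryrun_only, bool should_fail)
{
    int rc;

    if (dryrun_only)
        goto done;              /* nothing to attach, still a success */

    if (attach_device_sketch(should_fail)) {
        rc = EXIT_FAIL_SKETCH;
        goto out;
    }

done:
    rc = EXIT_OK_SKETCH;        /* every success path sets rc here */
out:
    return rc;
}
```

Routing all success paths through one label means rc is assigned on every
path that reaches the return statement.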

Fixes: a15166af7c3 ("xl: add vkb config parser and CLI")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago xen/iommu: Check if the IOMMU was initialized before tearing down
Julien Grall [Thu, 17 Dec 2020 12:27:21 +0000 (12:27 +0000)]
xen/iommu: Check if the IOMMU was initialized before tearing down

is_iommu_enabled() will return true even if the IOMMU has not been
initialized (e.g. the ops are not set).

In the case of an early failure in arch_domain_init(), the function
iommu_destroy_domain() will be called even if the IOMMU is not
initialized.

This results in a dereference of the ops, which will be NULL, and a host
crash.

Fix the issue by checking that ops has been set before accessing it.
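
In outline, the guard looks like the sketch below. The struct layout, the
field name platform_ops and the function names are assumptions made for
illustration, not the real definitions:

```c
#include <stddef.h>

struct iommu_ops_sketch {
    void (*teardown)(int *calls);
};

struct domain_iommu_sketch {
    /* Stays NULL until the IOMMU has been initialised for the domain. */
    const struct iommu_ops_sketch *platform_ops;
};

/* Example hook; counts invocations so the effect is observable. */
static void teardown_sketch(int *calls)
{
    ++*calls;
}

static void iommu_teardown_sketch(struct domain_iommu_sketch *hd, int *calls)
{
    /* An early domain-creation failure can get here with no ops set. */
    if (hd->platform_ops == NULL)
        return;

    hd->platform_ops->teardown(calls);
}
```

The "enabled" predicate alone cannot serve as the guard, since it may be true
before any ops have been installed.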

Fixes: 71e617a6b8f6 ("use is_iommu_enabled() where appropriate...")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years ago xen/page_alloc: Only flush the page to RAM once we know they are scrubbed
Julien Grall [Thu, 21 Jan 2021 10:16:08 +0000 (10:16 +0000)]
xen/page_alloc: Only flush the page to RAM once we know they are scrubbed

At the moment, each page is flushed to RAM just after the allocator
finds some free pages. However, this happens before checking whether the
page was scrubbed.

As a consequence, on Arm, a guest may be able to access the old content
of the scrubbed pages if it has cache disabled (default at boot) and
the content didn't reach the Point of Coherency.

The flush is now moved after we know the content of the page will not
change. This also has the benefit of reducing the amount of work happening
with the heap_lock held.

This is XSA-364.

Fixes: 307c3be3ccb2 ("mm: Don't scrub pages while holding heap lock in alloc_heap_pages()")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years ago SUPPORT.md: PV display frontend is unsupported in "backend allocation" mode
Jan Beulich [Tue, 16 Feb 2021 14:31:59 +0000 (15:31 +0100)]
SUPPORT.md: PV display frontend is unsupported in "backend allocation" mode

This wasn't meant to be supported, but wasn't stated this way.

This is XSA-363.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
4 years ago xen/arm: fix gnttab_need_iommu_mapping
Stefano Stabellini [Mon, 8 Feb 2021 18:49:32 +0000 (10:49 -0800)]
xen/arm: fix gnttab_need_iommu_mapping

Commit 91d4eca7add broke gnttab_need_iommu_mapping on ARM.
The offending chunk is:

 #define gnttab_need_iommu_mapping(d)                    \
-    (is_domain_direct_mapped(d) && need_iommu(d))
+    (is_domain_direct_mapped(d) && need_iommu_pt_sync(d))

On ARM we need gnttab_need_iommu_mapping to be true for dom0 when it is
directly mapped and IOMMU is enabled for the domain, like the old check
did, but the new check is always false.

In fact, need_iommu_pt_sync is defined as dom_iommu(d)->need_sync and
need_sync is set as:

    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
        hd->need_sync = !iommu_use_hap_pt(d);

iommu_use_hap_pt(d) means that the page-table used by the IOMMU is the
P2M. It is true on ARM. need_sync means that you have a separate IOMMU
page-table and it needs to be updated for every change. need_sync is set
to false on ARM. Hence, gnttab_need_iommu_mapping(d) is false too,
which is wrong.

As a consequence, when using PV network from a domU on a system where
IOMMU is on from Dom0, I get:

(XEN) smmu: /smmu@fd800000: Unhandled context fault: fsr=0x402, iova=0x8424cb148, fsynr=0xb0001, cb=0
[   68.290307] macb ff0e0000.ethernet eth0: DMA bus error: HRESP not OK

The fix is to go back to something along the lines of the old
implementation of gnttab_need_iommu_mapping.
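
The reasoning above can be checked with a small boolean model. Everything
here is a sketch: struct dom_sketch and the three functions are hypothetical,
and the "fixed" predicate is only one plausible shape of the old-style check,
not necessarily the expression the fix uses:

```c
#include <stdbool.h>

/* Boolean model of the per-domain properties discussed above. */
struct dom_sketch {
    bool direct_mapped;    /* models is_domain_direct_mapped(d) */
    bool iommu_enabled;    /* IOMMU enabled for the domain */
    bool hardware_domain;  /* models is_hardware_domain(d) */
    bool use_hap_pt;       /* IOMMU shares the P2M; true on ARM */
};

/* need_sync as set by the quoted snippet; false if the branch is skipped. */
static bool need_sync_sketch(const struct dom_sketch *d, bool hwdom_strict)
{
    bool need_sync = false;

    if (!d->hardware_domain || hwdom_strict)
        need_sync = !d->use_hap_pt;

    return need_sync;
}

/* The check as broken by the split: always false when use_hap_pt is set. */
static bool gnttab_need_mapping_broken(const struct dom_sketch *d,
                                       bool hwdom_strict)
{
    return d->direct_mapped && need_sync_sketch(d, hwdom_strict);
}

/* Assumed old-style check: direct mapped and IOMMU enabled. */
static bool gnttab_need_mapping_fixed(const struct dom_sketch *d)
{
    return d->direct_mapped && d->iommu_enabled;
}
```

For a direct-mapped ARM dom0 with the IOMMU enabled, the broken variant is
false regardless of the strict setting, while the old-style variant is true.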

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Fixes: 91d4eca7add ("mm / iommu: split need_iommu() into has_iommu_pt() and need_iommu_pt_sync()")
Backport: 4.13+

4 years ago xen: workaround missing device_type property in pci/pcie nodes
Stefano Stabellini [Tue, 9 Feb 2021 19:53:34 +0000 (11:53 -0800)]
xen: workaround missing device_type property in pci/pcie nodes

PCI buses differ from default buses in a few important ways, so it is
important to detect them properly. Normally, PCI buses are expected to
have the following property:

    device_type = "pci"

In reality, it is not always the case. To handle PCI bus nodes that
don't have the device_type property, also consider the node name: if the
node name is "pcie" or "pci" then consider the bus as a PCI bus.
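
As a sketch of that rule (the helper names are hypothetical, and the node
name is assumed to be passed with its optional "@unit-address" suffix, which
is ignored for the comparison):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compare the node name up to, but not including, any '@' suffix. */
static bool node_name_is(const char *node_name, const char *want)
{
    size_t len = strcspn(node_name, "@");

    return strlen(want) == len && strncmp(node_name, want, len) == 0;
}

/* device_type may be NULL when the property is absent from the node. */
static bool is_pci_bus_sketch(const char *device_type, const char *node_name)
{
    if (device_type != NULL && strcmp(device_type, "pci") == 0)
        return true;

    /* Fallback for nodes lacking device_type: go by the node name. */
    return node_name_is(node_name, "pcie") || node_name_is(node_name, "pci");
}
```

The comparison is exact on the name part, so a node called "pcix" would not
be treated as a PCI bus by this fallback.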

This commit is based on the Linux kernel commit
d1ac0002dd29 "of: address: Work around missing device_type property in
pcie nodes".

This fixes Xen boot on RPi4. Some RPi4 kernels have the following node
on their device trees:

    &pcie0 {
        pci@1,0 {
            #address-cells = <3>;
            #size-cells = <2>;
            ranges;

            reg = <0 0 0 0 0>;

            usb@1,0 {
                reg = <0x10000 0 0 0 0>;
                resets = <&reset RASPBERRYPI_FIRMWARE_RESET_ID_USB>;
            };
        };
    };

The pci@1,0 node is a PCI bus. If we parse the node and its children as
a default bus, the reg property under usb@1,0 would have to be
interpreted as an address range mappable by the CPU, which is not the
case and would break.

Link: https://lore.kernel.org/xen-devel/YBmQQ3Tzu++AadKx@mattapan.m5p.com/
[fix style on commit]
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Tested-by: Elliott Mitchell <ehem+xen@m5p.com>
Tested-by: Jukka Kaartinen <jukka.kaartinen@unikie.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago automation: Add Ubuntu Focal builds
Andrew Cooper [Thu, 11 Feb 2021 13:25:58 +0000 (13:25 +0000)]
automation: Add Ubuntu Focal builds

Logical continuation of c/s eb52442d7f "automation: Add Ubuntu:focal
container".

No further changes required.  Everything builds fine.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years ago tools/libxl: Document where the magic MAC numbers come from
Andrew Cooper [Wed, 10 Feb 2021 13:51:21 +0000 (13:51 +0000)]
tools/libxl: Document where the magic MAC numbers come from

Matches the comment in the xl-network-configuration manpage.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years ago x86emul: fix SYSENTER/SYSCALL switching into 64-bit mode
Jan Beulich [Thu, 11 Feb 2021 16:53:10 +0000 (17:53 +0100)]
x86emul: fix SYSENTER/SYSCALL switching into 64-bit mode

When invoked by compat mode, mode_64bit() will be false at the start of
emulation. The logic after complete_insn, however, needs to consider the
mode switched into, in particular to avoid truncating RIP.

Inspired by / paralleling and extending Linux commit 943dea8af21b ("KVM:
x86: Update emulator context mode if SYSENTER xfers to 64-bit mode").

While there, tighten a related assertion in x86_emulate_wrapper() - we
want to be sure to not switch into an impossible mode when the code gets
built for 32-bit only (as is possible for the test harness).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago tools: rerun autoconf again
Ian Jackson [Wed, 10 Feb 2021 15:30:59 +0000 (15:30 +0000)]
tools: rerun autoconf again

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years ago x86/ucode/amd: Fix microcode payload size for Fam19 processors
Andrew Cooper [Tue, 9 Feb 2021 15:28:57 +0000 (15:28 +0000)]
x86/ucode/amd: Fix microcode payload size for Fam19 processors

The original limit provided wasn't accurate.  Blobs are in fact rather larger.

Fixes: fe36a173d1 ("x86/amd: Initial support for Fam19h processors")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago x86/ucode/amd: Handle length sanity check failures more gracefully
Andrew Cooper [Tue, 9 Feb 2021 20:49:07 +0000 (20:49 +0000)]
x86/ucode/amd: Handle length sanity check failures more gracefully

Currently, a failure of verify_patch_size() causes an early abort of the
microcode blob loop, which in turn causes a second go around the main
container loop, ultimately failing the UCODE_MAGIC check.

First, check for errors after the blob loop.  An error here is unrecoverable,
so avoid going around the container loop again and printing an
unhelpful-at-best error concerning bad UCODE_MAGIC.

Second, split the verify_patch_size() check out of the microcode blob header
check.  In the case that the sanity check fails, we can still use the
known-to-be-plausible header length to continue walking the container to
potentially find other applicable microcode blobs.

Before:
  (XEN) microcode: Bad microcode data
  (XEN) microcode: Wrong microcode patch file magic
  (XEN) Parsing microcode blob error -22

After:
  (XEN) microcode: Bad microcode length 0x000015c0 for cpu 0xa000
  (XEN) microcode: Bad microcode length 0x000015c0 for cpu 0xa010
  (XEN) microcode: Bad microcode length 0x000015c0 for cpu 0xa011
  (XEN) microcode: Bad microcode length 0x000015c0 for cpu 0xa200
  (XEN) microcode: Bad microcode length 0x000015c0 for cpu 0xa210
  (XEN) microcode: Bad microcode length 0x000015c0 for cpu 0xa500
  (XEN) microcode: couldn't find any matching ucode in the provided blob!

Fixes: 4de936a38a ("x86/ucode/amd: Rework parsing logic in cpu_request_microcode()")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years ago x86/ucode/amd: Fix OoB read in cpu_request_microcode()
Andrew Cooper [Tue, 9 Feb 2021 22:10:54 +0000 (22:10 +0000)]
x86/ucode/amd: Fix OoB read in cpu_request_microcode()

verify_patch_size() is a maximum size check, and doesn't have a minimum bound.

If the microcode container encodes a blob with a length less than 64 bytes,
the subsequent calls to microcode_fits()/compare_header() may read off the end
of the buffer.
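
A sketch of a walk that applies both bounds. The container layout used here
(a 32-bit length followed by that many bytes, repeated) is a simplified
assumption, and PATCH_HDR_MIN_SKETCH and count_usable_blobs() are
hypothetical names:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Minimum blob size: the header must be fully present before it is read. */
#define PATCH_HDR_MIN_SKETCH 64u

/*
 * Walk records of the form { uint32_t len; uint8_t data[len]; } and count
 * blobs that pass both the minimum and the maximum bound. Returns -1 if a
 * record claims more bytes than remain in the buffer. Fewer than four
 * trailing bytes are ignored.
 */
static int count_usable_blobs(const uint8_t *buf, size_t size,
                              uint32_t max_len)
{
    int usable = 0;

    while (size >= sizeof(uint32_t)) {
        uint32_t len;

        memcpy(&len, buf, sizeof(len));
        buf += sizeof(len);
        size -= sizeof(len);

        if (len > size)
            return -1;      /* truncated container: unrecoverable */

        /* Too short or too long: skip this blob, but keep walking. */
        if (len >= PATCH_HDR_MIN_SKETCH && len <= max_len)
            usable++;

        buf += len;
        size -= len;
    }

    return usable;
}
```

Because a blob's length is validated against the remaining buffer before it
is skipped, an out-of-range blob does not prevent later ones from being found.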

Fixes: 4de936a38a ("x86/ucode/amd: Rework parsing logic in cpu_request_microcode()")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years ago tools/configure: add bison as mandatory
Roger Pau Monne [Fri, 5 Feb 2021 11:53:27 +0000 (12:53 +0100)]
tools/configure: add bison as mandatory

Bison is now mandatory when the pvshim build is enabled in order to
generate the Kconfig.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
4 years ago autoconf: check endian.h include path
Roger Pau Monne [Thu, 4 Feb 2021 09:38:33 +0000 (10:38 +0100)]
autoconf: check endian.h include path

Introduce an autoconf macro to check for the include path of certain
headers that can be different between OSes.

Use such macro to find the correct path for the endian.h header, and
modify the users of endian.h to use the output of such check.

Suggested-by: Ian Jackson <iwj@xenproject.org>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago xl: optionally print timestamps when running xl commands
Olaf Hering [Tue, 9 Feb 2021 15:45:35 +0000 (16:45 +0100)]
xl: optionally print timestamps when running xl commands

Add a global option "-T" to xl to enable timestamps in the output from
libxl and libxc. This is most useful with long running commands such
as "migrate".

During 'xl -v.. migrate domU host' a large amount of debug output is generated.
It is difficult to map each line to the sending and receiving side.
Also the time spent for migration is not reported.

With 'xl -T migrate domU host' both sides will print timestamps and
also the pid of the invoked xl process to make it more obvious which
side produced a given log line.

Note: depending on the command, xl itself also produces other output
which does not go through libxentoollog. As a result such output will
not have timestamps prepended.

This change also adds the missing "-t" flag to "xl help" output.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools: add with-xen-scriptdir configure option
Olaf Hering [Tue, 9 Feb 2021 15:45:34 +0000 (16:45 +0100)]
tools: add with-xen-scriptdir configure option

Some distros plan for fresh installations to have an empty /etc, whose
content will no longer be controlled by the package manager.

To make this possible, add a knob to configure to allow storing the
hotplug scripts in libexec instead of /etc/xen/scripts.

The current default remains unchanged, which is /etc/xen/scripts.

[autoconf rerun -iwj]

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools: move CONFIG_DIR and XEN_CONFIG_DIR in paths.m4
Olaf Hering [Tue, 9 Feb 2021 15:45:33 +0000 (16:45 +0100)]
tools: move CONFIG_DIR and XEN_CONFIG_DIR in paths.m4

Upcoming changes need to reuse XEN_CONFIG_DIR.

In its current location the assignment happens too late. Move it up
in the file, along with CONFIG_DIR. Their only dependency is
sysconfdir, which may also be adjusted in this file.

No functional change intended.

[autoconf rerun -iwj]

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools: Regenerate autoconf
Ian Jackson [Tue, 9 Feb 2021 17:05:54 +0000 (17:05 +0000)]
tools: Regenerate autoconf

This seems to have been omitted in many recent commits.  The earliest
of these are, according to git-bisect:
  154137dfdba3  stubdom/configure      stubdom: add xenstore pvh stubdom
  cc83ee4c6c37  all configure scripts  NetBSD: Fix lock directory path
but it seems that this is true of several later commits too.

Release status: I consider this discrepancy a release critical bug.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
Release-acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools: remove tabs from code produced by libxl_save_msgs_gen.pl
Olaf Hering [Mon, 11 Jan 2021 17:42:17 +0000 (18:42 +0100)]
tools: remove tabs from code produced by libxl_save_msgs_gen.pl

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago Use XEN_SCRIPT_DIR to refer to /etc/xen/scripts
Olaf Hering [Mon, 11 Jan 2021 17:41:51 +0000 (18:41 +0100)]
Use XEN_SCRIPT_DIR to refer to /etc/xen/scripts

Replace all hardcoded paths to use XEN_SCRIPT_DIR to expand the actual
location.

[ .gitignore change split out -iwj ]
[ dropped erroneous hunk for docs/misc/block-scripts.txt iwj ]

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago Update .gitignore for some docs files
Olaf Hering [Mon, 8 Feb 2021 16:07:32 +0000 (16:07 +0000)]
Update .gitignore for some docs files

[ split out of a larger commit -iwj ]

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago docs: substitute XEN_CONFIG_DIR in xl.conf.5
Olaf Hering [Mon, 11 Jan 2021 17:41:49 +0000 (18:41 +0100)]
docs: substitute XEN_CONFIG_DIR in xl.conf.5

xl(1) opens xl.conf in XEN_CONFIG_DIR.
Substitute this variable also in the man page.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago docs: remove stale create example from xl.1
Olaf Hering [Mon, 11 Jan 2021 17:41:48 +0000 (18:41 +0100)]
docs: remove stale create example from xl.1

Maybe xm create had a feature to create a domU based on a configuration
file. xl create requires the '-f' option to refer to a file.
There is no code to look into XEN_CONFIG_DIR, so remove the example.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/libxl: Fix ARM build
Andrew Cooper [Mon, 8 Feb 2021 14:36:32 +0000 (14:36 +0000)]
tools/libxl: Fix ARM build

Fixes: 804fe751375 ("tools/libxl: pass libxl__domain_build_state to libxl__arch_domain_create")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago tools/libxl: only set viridian flags on new domains
Igor Druzhinin [Wed, 3 Feb 2021 20:07:04 +0000 (20:07 +0000)]
tools/libxl: only set viridian flags on new domains

Domains migrating or restoring should have the viridian HVM param key in
the migration stream already, and setting it twice results in Xen
returning -EEXIST on the second attempt later (during migration stream parsing)
in case the values don't match. That causes the migration/restore operation
to fail on the destination side.

That issue has now resurfaced due to the latest commits (983524671 and 7e5cffcd1e),
which extend the default viridian feature set, making the values from previous
migration streams differ from those set at domain construction.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/libxl: pass libxl__domain_build_state to libxl__arch_domain_create
Igor Druzhinin [Wed, 3 Feb 2021 20:07:03 +0000 (20:07 +0000)]
tools/libxl: pass libxl__domain_build_state to libxl__arch_domain_create

No functional change.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago x86/vm_event: add response flag to reset vmtrace buffer
Tamas K Lengyel [Sat, 30 Jan 2021 13:36:37 +0000 (08:36 -0500)]
x86/vm_event: add response flag to reset vmtrace buffer

Allow resetting the vmtrace buffer in response to a vm_event. This can be used
to optimize a use-case where detecting a looped vmtrace buffer is important.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago x86/vm_event: Carry the vmtrace buffer position in vm_event
Tamas K Lengyel [Mon, 18 Jan 2021 17:46:37 +0000 (12:46 -0500)]
x86/vm_event: Carry the vmtrace buffer position in vm_event

Add vmtrace_pos field to x86 regs in vm_event. Initialized to ~0 if
vmtrace is not in use.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago xen/vmtrace: support for VM forks
Tamas K Lengyel [Fri, 11 Sep 2020 18:14:00 +0000 (20:14 +0200)]
xen/vmtrace: support for VM forks

Implement vmtrace_reset_pt function. Properly set IPT
state for VM forks.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/misc: Add xen-vmtrace tool
Michał Leszczyński [Tue, 16 Jun 2020 13:35:07 +0000 (15:35 +0200)]
tools/misc: Add xen-vmtrace tool

Add a demonstration tool that uses xc_vmtrace_* calls in order
to manage external IPT monitoring for DomU.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years ago tools/libxc: Add xc_vmtrace_* functions
Michał Leszczyński [Tue, 16 Jun 2020 13:33:25 +0000 (15:33 +0200)]
tools/libxc: Add xc_vmtrace_* functions

Add functions in libxc that use the new XEN_DOMCTL_vmtrace interface.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoxen/domctl: Add XEN_DOMCTL_vmtrace_op
Michał Leszczyński [Sun, 28 Jun 2020 21:48:09 +0000 (23:48 +0200)]
xen/domctl: Add XEN_DOMCTL_vmtrace_op

Implement an interface to configure and control tracing operations.  Reuse the
existing SETDEBUGGING flask vector rather than inventing a new one.

Userspace using this interface is going to need platform specific knowledge
anyway to interpret the contents of the trace buffer.  While some operations
(e.g. enable/disable) can reasonably be generic, others cannot.  Provide an
explicitly-platform specific pair of get/set operations to reduce API churn as
new options get added/enabled.

For the VMX specific Processor Trace implementation, tolerate reading and
modifying a safe subset of bits in CTL, STATUS and OUTPUT_MASK.  This permits
userspace to control the content which gets logged, but prevents modification
of details such as the position/size of the output buffer.
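
The "safe subset of bits" idea can be sketched as a masked update: a new value
is accepted only if every bit outside an allowed mask is left unchanged.  This
is an illustration only; set_masked() and CTL_USER_MASK are hypothetical names,
and the mask value is made up rather than taken from the VMX code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mask: bits of a control register that userspace may change. */
#define CTL_USER_MASK UINT64_C(0x0000000000003f0c)

/*
 * Apply a userspace-requested value to *reg, but only if it leaves every
 * bit outside 'allowed' exactly as it was.  Returns false (and leaves
 * *reg untouched) otherwise.
 */
static bool set_masked(uint64_t *reg, uint64_t val, uint64_t allowed)
{
    if ( (*reg ^ val) & ~allowed )
        return false;

    *reg = val;
    return true;
}
```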

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agox86/vmx: Add Intel Processor Trace support
Michał Leszczyński [Tue, 16 Jun 2020 13:20:18 +0000 (15:20 +0200)]
x86/vmx: Add Intel Processor Trace support

Add CPUID/MSR enumeration details for Processor Trace.  For now, we will only
support its use inside VMX operation.  Fill in the vmtrace_available boolean
to activate the newly introduced common infrastructure for allocating trace
buffers.

For now, Processor Trace is going to be operated in Single Output mode behind
the guest's back.  Add the MSRs to struct vcpu_msrs, and set up the buffer
limit in vmx_init_ipt() as it is fixed for the lifetime of the domain.

Context switch most of the MSRs in and out of vCPU context, but the main
control register needs to reside in the MSR load/save lists.  Explicitly pull
the msrs pointer out into a local variable, because the optimiser cannot keep
it live across the memory clobbers in the MSR accesses.

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoxen/memory: Add a vmtrace_buf resource type
Michał Leszczyński [Sun, 28 Jun 2020 22:05:51 +0000 (00:05 +0200)]
xen/memory: Add a vmtrace_buf resource type

Allow mapping the processor trace buffer using acquire_resource().

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agotools/[lib]xl: Add vmtrace_buf_size parameter
Michał Leszczyński [Thu, 18 Jun 2020 22:31:24 +0000 (00:31 +0200)]
tools/[lib]xl: Add vmtrace_buf_size parameter

Allow specifying the size of the per-vCPU trace buffer upon
domain creation. This is zero by default (meaning: not enabled).

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoxen/domain: Add vmtrace_size domain creation parameter
Michał Leszczyński [Thu, 2 Jul 2020 23:16:10 +0000 (01:16 +0200)]
xen/domain: Add vmtrace_size domain creation parameter

To use vmtrace, buffers of a suitable size need allocating, and different
tasks will want different sizes.

Add a domain creation parameter, and audit it appropriately in the
{arch_,}sanitise_domain_config() functions.

For now, the x86 specific auditing is tuned to Processor Trace running in
Single Output mode, which requires a single contiguous range of memory.

The size is given an arbitrary limit of 64M which is expected to be enough for
anticipated usecases, but not large enough to get into long-running-hypercall
problems.
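
A minimal sketch of what such an audit might look like follows.  Only the 64M
cap comes from the text above; the power-of-two and minimum-size checks, the
name vmtrace_size_ok() and the page size constant are assumptions made for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TRACE_PAGE_SIZE  4096u                  /* assumed page size */
#define VMTRACE_MAX_SIZE (64u * 1024u * 1024u)  /* the 64M limit from the text */

/* Returns true if 'size' is acceptable; 0 means "vmtrace not enabled". */
static bool vmtrace_size_ok(uint32_t size)
{
    if ( size == 0 )
        return true;

    /* Assumption: a power-of-two size no smaller than one page. */
    if ( size < TRACE_PAGE_SIZE || (size & (size - 1)) )
        return false;

    return size <= VMTRACE_MAX_SIZE;
}
```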

Signed-off-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoxen/memory: Fix mapping grant tables with XENMEM_acquire_resource
Andrew Cooper [Mon, 27 Jul 2020 16:24:11 +0000 (17:24 +0100)]
xen/memory: Fix mapping grant tables with XENMEM_acquire_resource

A guest's default number of grant frames is 64, and XENMEM_acquire_resource
will reject an attempt to map more than 32 frames.  This limit is caused by
the size of mfn_list[] on the stack.

Fix mapping of arbitrary size requests by looping over batches of 32 in
acquire_resource(), and using hypercall continuations when necessary.

To start with, break _acquire_resource() out of acquire_resource() to cope
with type-specific dispatching, and update the return semantics to indicate
the number of mfns returned.  Update gnttab_acquire_resource() and x86's
arch_acquire_resource() to match these new semantics.

Have do_memory_op() pass start_extent into acquire_resource() so it can pick
up where it left off after a continuation, and loop over batches of 32 until
all the work is done, or a continuation needs to occur.

compat_memory_op() is a bit more complicated, because it also has to marshal
frame_list in the XLAT buffer.  Have it account for continuation information
itself and hide details from the upper layer, so it can marshal the buffer in
chunks if necessary.

With these fixes in place, it is now possible to map the whole grant table for
a guest.
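
The batching-with-continuation pattern described above can be sketched roughly
as below.  This is not the code from memory.c: acquire_batched(), preempt_fn
and always_preempt() are hypothetical names, and the actual mapping work is
elided.

```c
#include <stdbool.h>
#include <stddef.h>

#define BATCH 32u   /* frames handled per iteration, per the text */

/* Hypothetical stand-in for a "should we yield?" check. */
typedef bool (*preempt_fn)(void);

/* Example check that always asks for a continuation. */
static bool always_preempt(void)
{
    return true;
}

/*
 * Process frames [start, total) in batches of at most BATCH.
 * Returns the number of frames completed so far: equal to 'total' when
 * finished, or smaller if a continuation is needed, in which case the
 * caller re-invokes with the returned value as 'start'.
 */
static unsigned int acquire_batched(unsigned int start, unsigned int total,
                                    preempt_fn preempt)
{
    unsigned int done = start;

    while ( done < total )
    {
        unsigned int todo = total - done;

        if ( todo > BATCH )
            todo = BATCH;

        /* ... map 'todo' frames starting at index 'done' here ... */
        done += todo;

        /* Yield between batches, never after the final one. */
        if ( done < total && preempt && preempt() )
            break;
    }

    return done;
}
```

With always_preempt(), a 64 frame request completes in two calls: the first
returns 32, and passing that back in as 'start' returns 64.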

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agox86/EFI: work around GNU ld 2.36 issue
Jan Beulich [Fri, 5 Feb 2021 13:09:42 +0000 (14:09 +0100)]
x86/EFI: work around GNU ld 2.36 issue

Our linker capability check fails with the recent binutils release's ld:

.../check.o:(.debug_aranges+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_info'
.../check.o:(.debug_info+0x6): relocation truncated to fit: R_X86_64_32 against `.debug_abbrev'
.../check.o:(.debug_info+0xc): relocation truncated to fit: R_X86_64_32 against `.debug_str'+76
.../check.o:(.debug_info+0x11): relocation truncated to fit: R_X86_64_32 against `.debug_str'+d
.../check.o:(.debug_info+0x15): relocation truncated to fit: R_X86_64_32 against `.debug_str'+2b
.../check.o:(.debug_info+0x29): relocation truncated to fit: R_X86_64_32 against `.debug_line'
.../check.o:(.debug_info+0x30): relocation truncated to fit: R_X86_64_32 against `.debug_str'+19
.../check.o:(.debug_info+0x37): relocation truncated to fit: R_X86_64_32 against `.debug_str'+71
.../check.o:(.debug_info+0x3e): relocation truncated to fit: R_X86_64_32 against `.debug_str'
.../check.o:(.debug_info+0x45): relocation truncated to fit: R_X86_64_32 against `.debug_str'+5e
.../check.o:(.debug_info+0x4c): additional relocation overflows omitted from the output

Tell the linker to strip debug info as a workaround. Debug info has been
getting stripped already anyway when linking the actual xen.efi.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/tests: fix resource test build on FreeBSD
Roger Pau Monne [Fri, 5 Feb 2021 12:19:38 +0000 (13:19 +0100)]
tools/tests: fix resource test build on FreeBSD

error.h is not a standard header, and none of the functions declared
there are actually used by the code. This fixes the build on FreeBSD,
which doesn't have error.h.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/tests: Introduce a test for acquire_resource
Andrew Cooper [Thu, 23 Jul 2020 16:26:16 +0000 (17:26 +0100)]
tools/tests: Introduce a test for acquire_resource

For now, simply try to map 40 frames of grant table.  This catches most of the
basic errors with resource sizes found and fixed through the 4.15 dev window.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agodocs/man: Document qemu-ifup on NetBSD
Manuel Bouyer [Wed, 3 Feb 2021 16:54:20 +0000 (17:54 +0100)]
docs/man: Document qemu-ifup on NetBSD

Document that on NetBSD, the tap interface will be configured by the
qemu-ifup script.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agotools/xenstored: close socket connections on error
Manuel Bouyer [Wed, 3 Feb 2021 16:54:19 +0000 (17:54 +0100)]
tools/xenstored: close socket connections on error

On error, don't keep socket connections in ignored state but close them.
When the remote end of a socket is closed, xenstored will flag it as an
error and switch the connection to ignored. But on some OSes (e.g.
NetBSD), poll(2) will return only POLLIN in this case, so sockets in ignored
state will stay open forever in xenstored (and it will loop with CPU 100%
busy).

Fixes: d2fa370d3ef9 ("tools/xenstore: Preserve bad client until they are destroyed")
Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agotools/hotplug: Add a qemu-ifup script on NetBSD
Manuel Bouyer [Wed, 3 Feb 2021 16:54:18 +0000 (17:54 +0100)]
tools/hotplug: Add a qemu-ifup script on NetBSD

On NetBSD, qemu-xen will use a qemu-ifup script to set up the tap interfaces
(as qemu-xen-traditional used to). Copy the script from qemu-xen-traditional,
and install it on NetBSD. While there, document parameters and environment
variables.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agolibs/devicemodel: Fix ABI breakage from xendevicemodel_set_irq_level()
Andrew Cooper [Thu, 4 Feb 2021 15:50:16 +0000 (15:50 +0000)]
libs/devicemodel: Fix ABI breakage from xendevicemodel_set_irq_level()

It is not permitted to edit the VERS clause for a version in a release of Xen.

Revert xendevicemodel_set_irq_level()'s inclusion in .so.1.2 and bump the
library minor version to .so.1.4 instead.

Fixes: 5d752df85f ("xen/dm: Introduce xendevicemodel_set_irq_level DM op")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agotools/oxenstored: mkdir conflicts were sometimes missed
Edwin Török [Fri, 15 Jan 2021 19:38:58 +0000 (19:38 +0000)]
tools/oxenstored: mkdir conflicts were sometimes missed

Due to how set_write_lowpath was used here it didn't detect create/delete
conflicts.  When we create an entry we must mark our parent as modified
(this is what creating a new node via write does).

Otherwise we can have two transactions, one creating and another deleting a
node, both succeeding depending on timing.  Or one transaction can read an
entry, conclude it doesn't exist, do some other work based on that information
and successfully commit even if another transaction creates the node via mkdir
meanwhile.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agotools/oxenstored: Reject invalid watch paths early
Edwin Török [Fri, 15 Jan 2021 19:28:37 +0000 (19:28 +0000)]
tools/oxenstored: Reject invalid watch paths early

Watches on invalid paths were accepted, but they would never trigger.  The
client also got no notification that its watch is bad and would never trigger.

Found again by the structured fuzzer, due to an error on live update reload:
the invalid watch paths would get rejected during live update and the list of
watches would be different pre/post live update.

The testcase is watch on `//`, which is an invalid path.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agotools/oxenstored: Fix quota calculation for mkdir EEXIST
Edwin Török [Fri, 15 Jan 2021 19:11:32 +0000 (19:11 +0000)]
tools/oxenstored: Fix quota calculation for mkdir EEXIST

We increment the domain's quota on mkdir even when the node already exists.
This results in a quota inconsistency after live update, where reconstructing
the tree from scratch results in a different quota.

Not a security issue because the domain uses up quota faster, so it will only
get a Quota error sooner than it should.

Found by the structured fuzzer.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agox86/efi: enable MS ABI attribute on clang
Roger Pau Monné [Thu, 4 Feb 2021 13:02:32 +0000 (14:02 +0100)]
x86/efi: enable MS ABI attribute on clang

Or else the EFI service calls will use the wrong calling convention.

The __ms_abi__ attribute is available on all supported versions of
clang. Add a specific Clang check because the GCC version reported by
Clang is below the required 4.4 to use the __ms_abi__ attribute.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoIOREQ: fix waiting for broadcast completion
Jan Beulich [Thu, 4 Feb 2021 13:01:21 +0000 (14:01 +0100)]
IOREQ: fix waiting for broadcast completion

Checking just a single server is not enough - all of them must have
signaled that they're done processing the request.
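
As a rough illustration of the rule, completion of a broadcast has to be a
check over every server rather than just one.  struct server and
broadcast_complete() below are hypothetical and do not mirror Xen's IOREQ
data structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-server state; not Xen's actual structure. */
struct server {
    bool enabled;
    bool pending;   /* still processing the broadcast request */
};

/* A broadcast is complete only when no enabled server is still pending. */
static bool broadcast_complete(const struct server *s, size_t nr)
{
    for ( size_t i = 0; i < nr; i++ )
        if ( s[i].enabled && s[i].pending )
            return false;

    return true;
}
```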

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agox86/string: correct memmove()'s forwarding to memcpy()
Jan Beulich [Thu, 4 Feb 2021 12:59:56 +0000 (13:59 +0100)]
x86/string: correct memmove()'s forwarding to memcpy()

With memcpy() expanding to the compiler builtin, we may not hand it
overlapping source and destination. We strictly mean to forward to our
own implementation (a few lines up in the same source file).
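
To illustrate why the forwarding target matters: a memmove() may only hand the
non-overlapping or "destination below source" case to a forward copy whose
behaviour it controls.  The sketch below uses a plain byte loop; copy_fwd()
and my_memmove() are hypothetical names, not the routines in x86/string.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wise forward copy with well-defined low-to-high ordering. */
static void *copy_fwd(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while ( n-- )
        *d++ = *s++;

    return dest;
}

static void *my_memmove(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    /*
     * Unsigned difference trick: the result is >= n both when dest is
     * below src (the subtraction wraps) and when dest is at or beyond
     * src + n.  A forward copy is safe in either case.
     */
    if ( (uintptr_t)d - (uintptr_t)s >= n )
        return copy_fwd(dest, src, n);

    /* dest lies inside [src, src + n): copy from the end backwards. */
    while ( n-- )
        d[n] = s[n];

    return dest;
}
```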

Fixes: 78825e1c60fa ("x86/string: Clean up x86/string.h")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolibs/foreignmem: Fix/simplify errno handling for map_resource
Andrew Cooper [Wed, 3 Feb 2021 15:43:35 +0000 (15:43 +0000)]
libs/foreignmem: Fix/simplify errno handling for map_resource

Simplify the FreeBSD and Linux logic, left in this state by the previous
change.  No functional change.

Duplicate the FreeBSD logic for NetBSD, to maintain the uniform ABI for
callers that EOPNOTSUPP covers all Xen/Kernel support.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agolibs/foreignmem: Drop useless and/or misleading logging
Andrew Cooper [Wed, 3 Feb 2021 15:41:55 +0000 (15:41 +0000)]
libs/foreignmem: Drop useless and/or misleading logging

These log lines are all in response to single system calls, and do not provide
any information which the immediate caller can't determine themselves.  It is
however rude to put junk like this onto stderr, especially as system call
failures are not even error conditions in certain circumstances.

The FreeBSD logging has stale function names in, and Solaris shouldn't have
passed code review to start with.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agox86/build: correctly record dependencies of asm-offsets.s
Jan Beulich [Tue, 2 Feb 2021 10:36:50 +0000 (11:36 +0100)]
x86/build: correctly record dependencies of asm-offsets.s

Going through an intermediate *.new file requires telling the compiler
what the real target is, so that the inclusion of the resulting .*.d
file will actually be useful.

Fixes: 7d2d7a43d014 ("x86/build: limit rebuilding of asm-offsets.h")
Reported-by: Julien Grall <julien@xen.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoMerge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Jan Beulich [Tue, 2 Feb 2021 10:36:28 +0000 (11:36 +0100)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging

4 years agomemory: fix build with COVERAGE but !HVM
Julien Grall [Tue, 2 Feb 2021 10:35:42 +0000 (11:35 +0100)]
memory: fix build with COVERAGE but !HVM

Xen is heavily relying on the DCE stage to remove unused code so the
linker doesn't throw an error because a function is not implemented
yet we defined a prototype for it.

On some GCC versions (such as 9.4 provided by Debian sid), the compiler
DCE stage will not manage to figure that out for
xenmem_add_to_physmap_batch():

ld: ld: prelink.o: in function `xenmem_add_to_physmap_batch':
/xen/xen/common/memory.c:942: undefined reference to `xenmem_add_to_physmap_one'
/xen/xen/common/memory.c:942:(.text+0x22145): relocation truncated
to fit: R_X86_64_PLT32 against undefined symbol `xenmem_add_to_physmap_one'
prelink-efi.o: in function `xenmem_add_to_physmap_batch':
/xen/xen/common/memory.c:942: undefined reference to `xenmem_add_to_physmap_one'
make[2]: *** [Makefile:215: /root/xen/xen/xen.efi] Error 1
make[2]: *** Waiting for unfinished jobs....
ld: /xen/xen/.xen-syms.0: hidden symbol `xenmem_add_to_physmap_one' isn't defined
ld: final link failed: bad value

It is not entirely clear why the compiler DCE is not detecting the
unused code. However, cloning the check introduced by the commit below
into xenmem_add_to_physmap_batch() does the trick.

No functional change intended.

Fixes: d4f699a0df6c ("x86/mm: p2m_add_foreign() is HVM-only")
Reported-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wl@xen.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoxenstore: Fix all builds
Andrew Cooper [Mon, 1 Feb 2021 23:30:51 +0000 (23:30 +0000)]
xenstore: Fix all builds

This diff is easier viewed through `cat -A`

  diff --git a/tools/xenstore/include/xenstore_state.h b/tools/xenstore/include/xenstore_state.h$
  index 1bd443f61a..f7e4da2b2c 100644$
  --- a/tools/xenstore/include/xenstore_state.h$
  +++ b/tools/xenstore/include/xenstore_state.h$
  @@ -21,7 +21,7 @@$
   #ifndef XENSTORE_STATE_H$
   #define XENSTORE_STATE_H$
   $
  -#if defined(__FreeBSD__) ||M-BM- defined(__NetBSD__)$
  +#if defined(__FreeBSD__) || defined(__NetBSD__)$
   #include <sys/endian.h>$
   #else$
   #include <endian.h>$

A non-breaking space isn't a valid C preprocessor token.

Fixes: ffbb8aa282de ("xenstore: fix build on {Net/Free}BSD")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxenstore: Fix all builds
Andrew Cooper [Tue, 2 Feb 2021 10:33:26 +0000 (11:33 +0100)]
xenstore: Fix all builds

This diff is easier viewed through `cat -A`

  diff --git a/tools/xenstore/include/xenstore_state.h b/tools/xenstore/include/xenstore_state.h$
  index 1bd443f61a..f7e4da2b2c 100644$
  --- a/tools/xenstore/include/xenstore_state.h$
  +++ b/tools/xenstore/include/xenstore_state.h$
  @@ -21,7 +21,7 @@$
   #ifndef XENSTORE_STATE_H$
   #define XENSTORE_STATE_H$
   $
  -#if defined(__FreeBSD__) ||M-BM- defined(__NetBSD__)$
  +#if defined(__FreeBSD__) || defined(__NetBSD__)$
   #include <sys/endian.h>$
   #else$
   #include <endian.h>$

A non-breaking space isn't a valid C preprocessor token.

Fixes: ffbb8aa282de ("xenstore: fix build on {Net/Free}BSD")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxenstore: fix build on {Net/Free}BSD
Roger Pau Monne [Mon, 1 Feb 2021 15:53:17 +0000 (16:53 +0100)]
xenstore: fix build on {Net/Free}BSD

The endian.h header is in sys/ on NetBSD and FreeBSD.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agoxenpmd.c: Remove hard tab
Ian Jackson [Mon, 1 Feb 2021 15:18:36 +0000 (15:18 +0000)]
xenpmd.c: Remove hard tab

bbed98e7cedc "xenpmd.c: use dynamic allocation" had a hard tab.
I thought we had fixed that and I thought I had checked.
Remove it now.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agoxenpmd.c: use dynamic allocation
Manuel Bouyer [Sat, 30 Jan 2021 18:27:10 +0000 (19:27 +0100)]
xenpmd.c: use dynamic allocation

On NetBSD, d_name is larger than 256, so file_name[284] may not be large
enough (and gcc emits a format-truncation error).
Use asprintf() instead of snprintf() on a static on-stack buffer.
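
For readers without asprintf(), the same "size the buffer to the data" idea
can be sketched portably with two vsnprintf() passes.  This is a swapped-in
equivalent, not what the patch does; format_alloc() is a hypothetical helper.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Format into a freshly malloc()ed string sized to fit.
 * Returns NULL on failure; the caller frees the result.
 */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    char *buf;
    int len;

    va_start(ap, fmt);
    va_copy(ap2, ap);

    len = vsnprintf(NULL, 0, fmt, ap);   /* first pass: measure only */
    va_end(ap);

    buf = len < 0 ? NULL : malloc((size_t)len + 1);
    if ( buf )
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);
    va_end(ap2);

    return buf;
}
```

A caller would build a path with something like
format_alloc("%s/%s", dir, entry->d_name) and free() it afterwards, so no
fixed file_name[] size has to be guessed.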

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Plus

define _GNU_SOURCE for asprintf()

Harmless on NetBSD.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Release-Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agox86/debug: fix page-overflow bug in dbg_rw_guest_mem
Tamas K Lengyel [Sat, 30 Jan 2021 01:59:53 +0000 (20:59 -0500)]
x86/debug: fix page-overflow bug in dbg_rw_guest_mem

When using gdbsx, dbg_rw_guest_mem is used to read/write guest memory. When the
buffer being accessed is on a page-boundary, the next page needs to be grabbed
to access the correct memory for the buffer's overflown parts. While
dbg_rw_guest_mem has logic to handle that, it broke with 229492e210a. Instead
of grabbing the next page the code right now is looping back to the
start of the first page. This results in errors like the following while trying
to use gdb with Linux' lx-dmesg:

[    0.114457] PM: hibernation: Registered nosave memory: [mem
0xfdfff000-0xffffffff]
[    0.114460] [mem 0x90000000-0xfbffffff] available for PCI demem 0
[    0.114462] f]f]
Python Exception <class 'ValueError'> embedded null character:
Error occurred in Python: embedded null character

Fix this bug by taking the variable assignment outside the loop.
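
The general shape of a copy that crosses a page boundary is sketched below:
the address, buffer pointer and remaining length all advance on every
iteration, so the second chunk comes from the next page rather than the start
of the first.  This is a toy model with hypothetical names (read_guest(),
GUEST_PAGE_SIZE, a pages[] array) and does not reproduce the actual change to
dbg_rw_guest_mem.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GUEST_PAGE_SIZE 4096u   /* assumed page size */

/* Toy "guest memory": page i of the guest lives at pages[i]. */
static size_t read_guest(unsigned char *const pages[], uint64_t addr,
                         unsigned char *buf, size_t len)
{
    size_t copied = 0;

    while ( len )
    {
        /* Recomputed every iteration from the advancing address. */
        uint64_t pfn = addr / GUEST_PAGE_SIZE;
        size_t off = addr % GUEST_PAGE_SIZE;
        size_t chunk = GUEST_PAGE_SIZE - off;

        if ( chunk > len )
            chunk = len;

        memcpy(buf, pages[pfn] + off, chunk);

        addr += chunk;
        buf += chunk;
        len -= chunk;
        copied += chunk;
    }

    return copied;
}
```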

Fixes: 229492e210a ("x86/debugger: use copy_to/from_guest() in dbg_rw_guest_mem()")
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoxen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace
Andrew Cooper [Wed, 20 Jan 2021 19:06:19 +0000 (19:06 +0000)]
xen+tools: Introduce XEN_SYSCTL_PHYSCAP_vmtrace

We're about to introduce support for Intel Processor Trace, but similar
functionality exists in other platforms.

Aspects of vmtrace can reasonably be common, so start with
XEN_SYSCTL_PHYSCAP_vmtrace and plumb the signal from Xen all the way down into
`xl info`.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoxen/memory: Indent part of acquire_resource()
Andrew Cooper [Mon, 27 Jul 2020 16:24:11 +0000 (17:24 +0100)]
xen/memory: Indent part of acquire_resource()

Indent the middle of acquire_resource() inside a do {} while ( 0 ) loop.  This
is broken out specifically to make the following change readable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agoxen/memory: Improve compat XENMEM_acquire_resource handling
Andrew Cooper [Tue, 28 Jul 2020 15:30:12 +0000 (16:30 +0100)]
xen/memory: Improve compat XENMEM_acquire_resource handling

The frame_list is an input, or an output, depending on whether the calling
domain is translated or not.  The array does not need marshalling in both
directions.

Furthermore, the copy-in loop was very inefficient, copying 4 bytes at a
time.  Rewrite it to copy in all nr_frames at once, and then expand
compat_pfn_t to xen_pfn_t in place.
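
Expanding 32-bit entries to 64-bit ones within the same buffer only works if
the elements are processed from the last to the first.  A minimal sketch,
assuming compat_pfn_t is 32 bits and xen_pfn_t is 64 bits; expand_pfns() is a
hypothetical name:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t compat_pfn_t;   /* assumed 32-bit compat frame number */
typedef uint64_t xen_pfn_t;      /* assumed 64-bit native frame number */

/*
 * 'buf' holds nr compat_pfn_t values packed at its start and has room for
 * nr xen_pfn_t values.  Widen them in place.  Working from the last element
 * down ensures no 32-bit input is overwritten before it has been read.
 */
static void expand_pfns(void *buf, size_t nr)
{
    unsigned char *p = buf;

    while ( nr-- )
    {
        compat_pfn_t c;
        xen_pfn_t x;

        memcpy(&c, p + nr * sizeof(c), sizeof(c));
        x = c;
        memcpy(p + nr * sizeof(x), &x, sizeof(x));
    }
}
```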

Re-position the copy-in loop to simplify continuation support in a future
patch, and reduce the scope of certain variables.

No change in guest observed behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agoxen/memory: Fix acquire_resource size semantics
Andrew Cooper [Thu, 23 Jul 2020 14:18:33 +0000 (15:18 +0100)]
xen/memory: Fix acquire_resource size semantics

Calling XENMEM_acquire_resource with a NULL frame_list is a request for the
size of the resource, but the returned 32 is bogus.

If someone tries to follow it for XENMEM_resource_ioreq_server, the acquire
call will fail as IOREQ servers currently top out at 2 frames, and it is only
half the size of the default grant table limit for guests.

Also, no users actually request a resource size, because it was never wired up
in the sole implementation of resource acquisition in Linux.

Introduce a new resource_max_frames() to calculate the size of a resource, and
implement it in the IOREQ and grant subsystems.

It is impossible to guarantee that a mapping call following a successful size
call will succeed (e.g. the target IOREQ server gets destroyed, or the domain
switches from grant v2 to v1).  Document the restriction, and use the
flexibility to simplify the paths to be lockless.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/gnttab: Rework resource acquisition
Andrew Cooper [Mon, 27 Jul 2020 12:40:06 +0000 (13:40 +0100)]
xen/gnttab: Rework resource acquisition

The existing logic doesn't function in the general case for mapping a guest's
grant table, due to an arbitrary 32 frame limit, and the default grant table
limit being 64.

In order to start addressing this, rework the existing grant table logic by
implementing a single gnttab_acquire_resource().  This is far more efficient
than the previous acquire_grant_table() in memory.c because it doesn't take
the grant table write lock, and attempt to grow the table, for every single
frame.

The new gnttab_acquire_resource() function subsumes the previous two
gnttab_get_{shared,status}_frame() helpers.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoxen/memory: Reject out-of-range resource 'frame' values
Andrew Cooper [Thu, 28 Jan 2021 14:39:25 +0000 (14:39 +0000)]
xen/memory: Reject out-of-range resource 'frame' values

The ABI is unfortunate, and frame being 64 bits leads to all kinds of problems
performing correct overflow checks.

Reject out-of-range values, and combinations which overflow, and use unsigned
int consistently elsewhere.  This fixes several truncation bugs in the grant
call tree, as the underlying limits are expressed with unsigned int to begin
with.
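
One way to write such a check without any intermediate overflow is to compare
against the limit using subtraction only.  The sketch below is an assumption
about the general technique, not the code in memory.c; frame_range_ok() and
its parameters are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Validate a request for nr_frames starting at a 64-bit 'frame' against a
 * resource of max_frames.  On success, *first receives the narrowed index.
 */
static bool frame_range_ok(uint64_t frame, unsigned int nr_frames,
                           unsigned int max_frames, unsigned int *first)
{
    /* Reject values that would be truncated by the narrowing below. */
    if ( frame > UINT_MAX )
        return false;

    /* Written with a subtraction so that nothing can wrap. */
    if ( frame > max_frames || nr_frames > max_frames - (unsigned int)frame )
        return false;

    *first = (unsigned int)frame;
    return true;
}
```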

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agolibs/store: make build without PTHREAD_STACK_MIN
Manuel Bouyer [Tue, 26 Jan 2021 22:47:59 +0000 (23:47 +0100)]
libs/store: make build without PTHREAD_STACK_MIN

On NetBSD, PTHREAD_STACK_MIN is not available.
If PTHREAD_STACK_MIN is not defined, define it to 0 so that we fall back to
DEFAULT_THREAD_STACKSIZE.
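
The fallback might look roughly like the sketch below.  The value given to
DEFAULT_THREAD_STACKSIZE and the helper thread_stacksize() are assumptions for
illustration; only the "define it to 0 when missing" step comes from the text.

```c
#include <limits.h>
#include <pthread.h>
#include <stddef.h>

/* Some platforms do not provide PTHREAD_STACK_MIN at all. */
#ifndef PTHREAD_STACK_MIN
# define PTHREAD_STACK_MIN 0
#endif

#define DEFAULT_THREAD_STACKSIZE (16 * 1024)   /* value assumed for illustration */

/* Use the larger of the platform minimum and our default. */
static size_t thread_stacksize(void)
{
    size_t min = (size_t)PTHREAD_STACK_MIN;

    return min > DEFAULT_THREAD_STACKSIZE ? min : DEFAULT_THREAD_STACKSIZE;
}
```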

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agolibs/light: pass some infos to qemu
Manuel Bouyer [Tue, 26 Jan 2021 22:47:58 +0000 (23:47 +0100)]
libs/light: pass some infos to qemu

Pass the bridge name to qemu as a command line option.
When starting qemu, set an environment variable XEN_DOMAIN_ID,
to be used by qemu helper scripts.
The only functional difference of using the br parameter is that the
bridge name gets passed to the QEMU script.
NetBSD doesn't have the ioctl to rename network interfaces implemented, and
thus cannot rename the interface from tapX to vifX.Y-emu. Only qemu knows
the tap interface name, so we need to use the qemu script from qemu itself.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agolibs/light: make it build without setresuid()
Manuel Bouyer [Tue, 26 Jan 2021 22:47:57 +0000 (23:47 +0100)]
libs/light: make it build without setresuid()

NetBSD doesn't have setresuid(). Introduce libxl__setresuid(),
which on NetBSD assert()s that it's never called (it should not be called when
dm restriction is off, and NetBSD doesn't support dm restriction at
this time).
On Linux and FreeBSD it just calls setresuid().

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agolibs/light: fix uuid on NetBSD
Manuel Bouyer [Tue, 26 Jan 2021 22:47:56 +0000 (23:47 +0100)]
libs/light: fix uuid on NetBSD

NetBSD uses the same uuid library as FreeBSD. As this is in a
__FreeBSD__ || __NetBSD__ block, just drop the #ifdef __FreeBSD__
and dead code.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agolibs/light: fix tv_sec printf format
Manuel Bouyer [Tue, 26 Jan 2021 22:47:55 +0000 (23:47 +0100)]
libs/light: fix tv_sec printf format

Don't assume tv_sec is an unsigned long; it is 64 bits wide even on 32-bit
NetBSD. Use %jd and cast to (intmax_t) instead.
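
A small sketch of the portable formatting idiom follows; format_tv() is a
hypothetical helper, not a libxl function.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/time.h>

/*
 * Format a timeval as "seconds.microseconds" without assuming the width
 * of tv_sec or tv_usec: both are cast to intmax_t and printed with %jd.
 */
static int format_tv(char *buf, size_t len, const struct timeval *tv)
{
    return snprintf(buf, len, "%jd.%06jd",
                    (intmax_t)tv->tv_sec, (intmax_t)tv->tv_usec);
}
```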

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agolibs/light: Switch NetBSD to QEMU_XEN
Manuel Bouyer [Tue, 26 Jan 2021 22:47:54 +0000 (23:47 +0100)]
libs/light: Switch NetBSD to QEMU_XEN

Switch NetBSD to QEMU_XEN.
All 3 versions of libxl__default_device_model() now return
LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN, so remove it and just set
b_info->device_model_version to LIBXL_DEVICE_MODEL_VERSION_QEMU_XEN in
libxl__domain_build_info_setdefault().

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agolibs/gnttab: implement on NetBSD
Manuel Bouyer [Tue, 26 Jan 2021 22:47:53 +0000 (23:47 +0100)]
libs/gnttab: implement on NetBSD

Implement gnttab interface on NetBSD.
The kernel interface is different from FreeBSD so we can't use the FreeBSD
version

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agolibs/call: fix build on NetBSD
Manuel Bouyer [Tue, 26 Jan 2021 22:47:51 +0000 (23:47 +0100)]
libs/call: fix build on NetBSD

Define PAGE_* if not already defined
Catch up with osdep interface change.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoNetBSD hotplug: fix block unconfigure on destroy
Manuel Bouyer [Tue, 26 Jan 2021 22:47:49 +0000 (23:47 +0100)]
NetBSD hotplug: fix block unconfigure on destroy

When a domain is destroyed, xparams may not be available any more when
the block script is called to unconfigure the vnd.
Check xparam only at configure time, and just unconfigure any vnd present
in the xenstore.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoNetBSD hotplug: Introduce locking functions
Manuel Bouyer [Tue, 26 Jan 2021 22:47:48 +0000 (23:47 +0100)]
NetBSD hotplug: Introduce locking functions

On NetBSD, some block device configuration requires serialisation.
Introduce locking functions (derived from the Linux version), and use them
in the block script where appropriate.

Signed-off-by: Manuel Bouyer <bouyer@netbsd.org>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoxen/ioreq: Make the IOREQ feature selectable on Arm
Oleksandr Tyshchenko [Fri, 29 Jan 2021 16:39:25 +0000 (18:39 +0200)]
xen/ioreq: Make the IOREQ feature selectable on Arm

This patch lets the user select IOREQ support on Arm (where it is
disabled by default) while retaining the current behaviour on x86
(where it is selected by HVM and its prompt is not visible).

Also make IOREQ depend on CONFIG_EXPERT on Arm, since it is
considered a Technological Preview feature, and update SUPPORT.md.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/ioreq: Do not let bufioreq to be used on other than x86 arches
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:51 +0000 (03:48 +0200)]
xen/ioreq: Do not let bufioreq to be used on other than x86 arches

This patch prevents a device model running on a non-x86 system from
using the buffered I/O feature for now.

Please note that no caller currently needs to send buffered I/O
requests on Arm; the purpose of this check is to catch any future
user of bufioreq.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Acked-by: Paul Durrant <paul@xen.org>
4 years agoxen/arm: Add mapcache invalidation handling
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:50 +0000 (03:48 +0200)]
xen/arm: Add mapcache invalidation handling

We need to send a mapcache invalidation request to qemu/demu every time
a page gets removed from a guest.

At the moment, the Arm code doesn't explicitly remove the existing
mapping before inserting the new mapping. Instead, this is done
implicitly by __p2m_set_entry().

First of all we need to recognize the case where the "freed" entry
contains a RAM page, in order to set the corresponding flag.
The most suitable place to do this is p2m_free_entry(), where we can
find the correct leaf type. The invalidation request will be sent
in do_trap_hypercall() later on.

do_trap_hypercall() is the best place to send the invalidation request,
taking the following into account:
 - The only way a guest can modify its P2M on Arm is via a hypercall
 - When sending the invalidation request, the vCPU will be blocked
   until all the IOREQ servers have acknowledged the invalidation
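
The two-step scheme (set a flag when a RAM entry is freed, send the request once at the end of the hypercall) might be sketched as follows. All type and function names here are hypothetical stand-ins; this is not the Xen code:

```c
#include <stdbool.h>

/* Hypothetical leaf types; the real P2M has many more. */
enum sketch_p2m_type { SKETCH_P2M_INVALID, SKETCH_P2M_RAM, SKETCH_P2M_MMIO };

/* Hypothetical per-domain state for this sketch. */
struct sketch_dom {
    bool has_ioreq_server;      /* is any emulator attached? */
    bool mapcache_invalidate;   /* set when a RAM page was removed */
    unsigned int requests_sent; /* counts invalidation requests */
};

/* Step 1: called when a P2M entry is freed; only records the event. */
static void sketch_p2m_free_entry(struct sketch_dom *d,
                                  enum sketch_p2m_type type)
{
    if (type == SKETCH_P2M_RAM && d->has_ioreq_server)
        d->mapcache_invalidate = true;
}

/* Step 2: called once at the end of a hypercall; sends at most one request. */
static void sketch_after_hypercall(struct sketch_dom *d)
{
    if (d->mapcache_invalidate) {
        d->mapcache_invalidate = false;
        d->requests_sent++; /* stands in for signalling the emulator */
    }
}
```

Deferring the request means that a hypercall removing many pages results in a single invalidation rather than one per page.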

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoxen/ioreq: Make x86's send_invalidate_req() common
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:49 +0000 (03:48 +0200)]
xen/ioreq: Make x86's send_invalidate_req() common

As IOREQ is a common feature now and we also need to
invalidate the qemu/demu mapcache on Arm when the required condition
occurs, this patch moves this function to the common code
(and renames it to ioreq_signal_mapcache_invalidate).
This patch also moves the per-domain qemu_mapcache_invalidate
variable out of the arch sub-struct (and drops the "qemu" prefix).

We don't put this variable inside the #ifdef CONFIG_IOREQ_SERVER
at the end of struct domain, but in the hole next to the group
of 5 bools further up, which is more efficient.

The subsequent patch will add mapcache invalidation handling on Arm.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoxen/arm: io: Harden sign extension check
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:48 +0000 (03:48 +0200)]
xen/arm: io: Harden sign extension check

In an ideal world we would never get undefined behavior when
propagating the sign bit, since that bit can only be set for an access
size smaller than the register size (i.e. byte/half-word for aarch32,
byte/half-word/word for aarch64).

In the real world we need to care about *possible* hardware bugs, such as
advertising a sign extension for a 64-bit (resp. 32-bit) access on Arm64
(resp. Arm32).

So harden the code a bit more to prevent undefined behavior when
propagating the sign bit in case of buggy hardware.
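
The hardening can be illustrated with a standalone sketch. The function name and the fixed 64-bit register width are assumptions made here; the point is the extra size check, which avoids shifting by the full register width (undefined behavior in C):

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Sign-extend a value read with an access size of (1 << size_log2) bytes
 * into a 64-bit register. Hypothetical helper for illustration only.
 */
static uint64_t sketch_sign_extend(bool sign, unsigned int size_log2,
                                   uint64_t r)
{
    unsigned int bits = (1u << size_log2) * 8;

    /*
     * The "bits < 64" test is the hardening: if buggy hardware asks for
     * sign extension of a full-width access, the shift below would be by
     * 64 bits, which is undefined behavior.
     */
    if (sign && bits < 64 && (r & (UINT64_C(1) << (bits - 1))))
        r |= ~UINT64_C(0) << bits;

    return r;
}
```

For a byte access with the top bit set, the upper 56 bits are filled with ones; a full-width access is returned unchanged regardless of the sign flag.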

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
CC: Julien Grall <julien.grall@arm.com>
4 years agoxen/arm: io: Abstract sign-extension
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:47 +0000 (03:48 +0200)]
xen/arm: io: Abstract sign-extension

In order to avoid code duplication (both handle_read() and
handle_ioserv() contain the same code for the sign-extension),
move this code into a common helper used by both.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <jgrall@amazon.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoxen/dm: Introduce xendevicemodel_set_irq_level DM op
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:46 +0000 (03:48 +0200)]
xen/dm: Introduce xendevicemodel_set_irq_level DM op

This patch adds the ability for the device emulator to notify the other
end (some entity running in the guest) using an SPI, and implements the
Arm-specific bits for it. The proposed interface allows the emulator to
set the logical level of one of a domain's IRQ lines.

We can't reuse the existing DM op (xen_dm_op_set_isa_irq_level)
to inject an interrupt as the "isa_irq" field is only 8-bit and
able to cover IRQ 0 - 255, whereas we need a wider range (0 - 1020).

Please note, for an edge-triggered interrupt (which is used for
the virtio-mmio emulation) we only trigger the interrupt on Arm
if the level is asserted (rising edge) and do nothing if the level
is deasserted (falling edge), so the call could be named "trigger_irq"
(without the level parameter). But, in order to model the line closely
(to be able to support level-triggered interrupts) we need to know whether
the line is low or high, so the proposed interface has been chosen.
However, it is worth mentioning that in the case of a level-triggered
interrupt, we should keep injecting the interrupt to the guest until
the line is deasserted (this is not covered by the current patch).
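
Two points from the description can be sketched in isolation: why an 8-bit field is too narrow, and how an edge-triggered line reacts to the level argument. The names below are hypothetical, and the upper bound simply reuses the 0 - 1020 range quoted above:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_MAX_IRQ 1020u /* upper bound quoted in the description */

/* An 8-bit field such as "isa_irq" can only represent IRQ 0 - 255. */
static bool sketch_fits_in_isa_irq(uint32_t irq)
{
    return irq <= UINT8_MAX;
}

/*
 * Edge-triggered behaviour as described: inject on assertion (rising
 * edge), do nothing on deassertion (falling edge).
 */
static bool sketch_should_inject_edge(uint32_t irq, unsigned int level)
{
    if (irq > SKETCH_MAX_IRQ)
        return false; /* outside the range the interface must cover */

    return level != 0;
}
```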

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoxen/ioreq: Introduce domain_has_ioreq_server()
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:45 +0000 (03:48 +0200)]
xen/ioreq: Introduce domain_has_ioreq_server()

This patch introduces a helper whose main purpose is to check
if a domain is using IOREQ server(s).

On Arm the current benefit is to avoid calling vcpu_ioreq_handle_completion()
(which implies iterating over all possible IOREQ servers anyway)
on every return in leave_hypervisor_to_guest() if there are no active
servers for the particular domain.
This helper will also be used by one of the subsequent patches on Arm.
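
The benefit can be shown with a small sketch in which a cheap check guards the full walk over the server table. The structure layout, the table size and all names are assumptions for illustration, not the actual Xen data structures:

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_SERVERS 8 /* assumed table size */

struct sketch_domain {
    void *server[SKETCH_MAX_SERVERS]; /* registered servers, NULL if unused */
    unsigned int nr_servers;          /* number of non-NULL entries */
    unsigned int scans;               /* counts full table walks */
};

/* Cheap check: no iteration needed. */
static bool sketch_domain_has_ioreq_server(const struct sketch_domain *d)
{
    return d->nr_servers != 0;
}

/* Expensive path: walks every possible server slot. */
static bool sketch_handle_completion(struct sketch_domain *d)
{
    d->scans++;
    for (size_t i = 0; i < SKETCH_MAX_SERVERS; i++) {
        if (d->server[i] != NULL) {
            /* A real implementation would complete pending I/O here. */
        }
    }
    return true;
}

/* Called on every return to the guest; skips the walk when possible. */
static bool sketch_on_return_to_guest(struct sketch_domain *d)
{
    if (!sketch_domain_has_ioreq_server(d))
        return true;

    return sketch_handle_completion(d);
}
```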

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Paul Durrant <paul@xen.org>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoxen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:44 +0000 (03:48 +0200)]
xen/mm: Handle properly reference in set_foreign_p2m_entry() on Arm

This patch implements reference counting of foreign entries in
set_foreign_p2m_entry() on Arm. This is mandatory if
we want to run the emulator (IOREQ server) in a domain other than dom0,
as we can't trust it to do the right thing if it is not running
in dom0. So we need to grab a reference on the page to prevent it
from disappearing.

It is valid to always pass the "p2m_map_foreign_rw" type to
guest_physmap_add_entry() since the current and foreign domains
will always be different. A case where they are equal would be
rejected by rcu_lock_remote_domain_by_id(). In addition to a similar
comment in the code, add a corresponding ASSERT() to catch incorrect
usage in the future.

It was tested with IOREQ feature to confirm that all the pages given
to this function belong to a domain, so we can use the same approach
as for XENMAPSPACE_gmfn_foreign handling in xenmem_add_to_physmap_one().

This involves adding an extra parameter for the foreign domain to
set_foreign_p2m_entry(), and a helper to indicate whether the arch
supports reference counting of foreign entries, in which case the
restriction for the hardware domain in the common code can be skipped.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
CC: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoxen/arm: Call vcpu_ioreq_handle_completion() in check_for_vcpu_work()
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:43 +0000 (03:48 +0200)]
xen/arm: Call vcpu_ioreq_handle_completion() in check_for_vcpu_work()

This patch adds the remaining bits needed for IOREQ support on Arm.
Besides just calling vcpu_ioreq_handle_completion() we need to handle
its return value to make sure that all the vCPU work is done before
we return to the guest (vcpu_ioreq_handle_completion() may return
false if there is vCPU work to do or the IOREQ state is invalid).
For that reason we use an unbounded loop in leave_hypervisor_to_guest().

The worst that can happen here is that the vCPU will never run again
(the I/O will never complete). But, in Xen's case, if the I/O never
completes then it most likely means that something went horribly
wrong with the Device Emulator, and it is most likely not safe
to continue. So letting the vCPU spin forever if the I/O never
completes is safer than letting it continue and leaving
the guest in an unclear state, and is the best we can do for now.

Please note that with this loop we will not spin forever on a pCPU,
preventing other vCPUs from being scheduled. On every iteration
we call check_for_pcpu_work(), which processes pending
softirqs. In case of failure, the guest will crash and the vCPU
will be unscheduled. In the normal case, if rescheduling is necessary
the vCPU will be rescheduled to give place to someone else.
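
The shape of the loop described above can be sketched as follows. The vCPU state and the "sketch_" functions are hypothetical stand-ins that only model the control flow: vCPU work is retried until none is left, and pCPU work (softirqs) is processed on every iteration so the physical CPU is never monopolised:

```c
#include <stdbool.h>

/* Hypothetical state: how many more polls until the I/O completes. */
struct sketch_vcpu {
    unsigned int io_pending_polls;
    unsigned int softirq_runs; /* counts calls to the pCPU work handler */
};

/* Returns true while there is still vCPU work (e.g. I/O not complete). */
static bool sketch_check_for_vcpu_work(struct sketch_vcpu *v)
{
    if (v->io_pending_polls > 0) {
        v->io_pending_polls--;
        return true;
    }
    return false;
}

/* Stands in for processing softirqs, which may reschedule the vCPU. */
static void sketch_check_for_pcpu_work(struct sketch_vcpu *v)
{
    v->softirq_runs++;
}

static void sketch_leave_hypervisor_to_guest(struct sketch_vcpu *v)
{
    /* Unbounded: only exits once no vCPU work remains. */
    while (sketch_check_for_vcpu_work(v))
        sketch_check_for_pcpu_work(v);

    /* Final pass before actually returning to the guest. */
    sketch_check_for_pcpu_work(v);
}
```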

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <jgrall@amazon.com>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoarm/ioreq: Introduce arch specific bits for IOREQ/DM features
Julien Grall [Fri, 29 Jan 2021 01:48:42 +0000 (03:48 +0200)]
arm/ioreq: Introduce arch specific bits for IOREQ/DM features

This patch adds basic IOREQ/DM support on Arm. The subsequent
patches will improve functionality and add remaining bits.

The IOREQ/DM features are supposed to be built with IOREQ_SERVER
option enabled, which is disabled by default on Arm for now.

Please note, the "PIO handling" TODO is expected to be left unaddressed
for the current series. It is not a big issue for now while Xen
doesn't have support for vPCI on Arm. On Arm64, PIO accesses are only
used for the PCI IO BAR and we would probably want to expose them to the
emulator as PIO accesses to make a DM completely arch-agnostic. So
"PIO handling" should be implemented when we add support for vPCI.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>
4 years agoxen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()
Oleksandr Tyshchenko [Fri, 29 Jan 2021 01:48:41 +0000 (03:48 +0200)]
xen/ioreq: Use guest_cmpxchg64() instead of cmpxchg()

The cmpxchg() in ioreq_send_buffered() operates on memory shared
with the emulator domain (and the target domain if the legacy
interface is used).

In order to be on the safe side we need to switch
to guest_cmpxchg64() to prevent a domain from DoSing Xen on Arm.
The 64-bit version of the helper is used to support Arm32,
since the IOREQ code uses cmpxchg() with a 64-bit value.

As there is no plan to support the legacy interface on Arm,
a page will be mapped in a single domain at a time,
so we can use s->emulator in guest_cmpxchg64() safely.

Thankfully the only user of the legacy interface is x86 so far
and there is no concern regarding the atomic operations.

Please note that the legacy interface *must* not be used on Arm
without revisiting the code.
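
To illustrate why a 64-bit compare-and-exchange is involved, here is a sketch of an atomic update of two 32-bit ring pointers packed into one 64-bit word. It uses plain C11 atomics as a stand-in for guest_cmpxchg64(); the packing layout, the slot count and all names are assumptions made for this example:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SLOT_NUM 511u /* assumed ring size */

/* Pack/unpack a read and a write pointer into one 64-bit word. */
static uint64_t sketch_pack(uint32_t rd, uint32_t wr)
{
    return ((uint64_t)wr << 32) | rd;
}

static uint32_t sketch_read_ptr(uint64_t full)
{
    return (uint32_t)full;
}

static uint32_t sketch_write_ptr(uint64_t full)
{
    return (uint32_t)(full >> 32);
}

/*
 * Pull both pointers back into range in one atomic step. Returns false
 * if the other side changed the word concurrently, in which case the
 * caller may retry.
 */
static bool sketch_canonicalize(_Atomic uint64_t *ptrs)
{
    uint64_t old = atomic_load(ptrs);
    uint32_t rd = sketch_read_ptr(old);
    uint32_t wr = sketch_write_ptr(old);
    uint32_t n = rd / SKETCH_SLOT_NUM;

    if (n == 0)
        return true; /* already canonical */

    uint64_t desired = sketch_pack(rd - n * SKETCH_SLOT_NUM,
                                   wr - n * SKETCH_SLOT_NUM);

    /* Both halves change together or not at all. */
    return atomic_compare_exchange_strong(ptrs, &old, desired);
}
```

Because both pointers live in the same 64-bit word, a 32-bit exchange would not be enough, which is why a 64-bit helper is needed even on a 32-bit architecture.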

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Paul Durrant <paul@xen.org>
CC: Julien Grall <julien.grall@arm.com>
[On Arm only]
Tested-by: Wei Chen <Wei.Chen@arm.com>