xenbits.xensource.com Git - xen.git/log
xen.git
4 years ago xen/iommu: smmu: Use 1U << 31 rather than 1 << 31
Julien Grall [Thu, 24 Dec 2020 15:24:19 +0000 (15:24 +0000)]
xen/iommu: smmu: Use 1U << 31 rather than 1 << 31

Replace all uses of 1 << 31 with 1U << 31 to prevent undefined
behavior in the SMMU driver.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
[stefano: fix title and description]
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
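
To illustrate the SMMU change above: with a 32-bit int, 1 << 31 shifts a one into the sign bit, which is undefined for signed types in C, whereas 1U << 31 is well defined. A minimal sketch follows; the names EXAMPLE_BIT31 and example_bit31_is_set() are hypothetical and not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical register bit, for illustration only. */
#define EXAMPLE_BIT31 (1U << 31)   /* unsigned shift: value 0x80000000 */

/* Returns nonzero when bit 31 is set in a 32-bit register value. */
static inline int example_bit31_is_set(uint32_t reg)
{
    return (reg & EXAMPLE_BIT31) != 0;
}
```
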
4 years ago x86/acpi: remove dead code
Roger Pau Monné [Mon, 11 Jan 2021 13:58:00 +0000 (14:58 +0100)]
x86/acpi: remove dead code

After the recent changes to acpi_fadt_parse_sleep_info the bad label
can never be called with facs mapped, and hence the unmap can be
removed.

Additionally remove the whole label, since it was used by a
single caller. Move the relevant code from the label.

No functional change intended.

CID: 1471722
Fixes: 16ca5b3f873 ('x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years ago x86: drop fake CONFIG_{HPET,X86_PM}_TIMER
Jan Beulich [Mon, 11 Jan 2021 13:56:53 +0000 (14:56 +0100)]
x86: drop fake CONFIG_{HPET,X86_PM}_TIMER

I don't think we mean to ever make them real Kconfig options, so let's
just do away with them.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago ACPI: replace casts by container_of()
Jan Beulich [Mon, 11 Jan 2021 13:56:23 +0000 (14:56 +0100)]
ACPI: replace casts by container_of()

The latter is slightly more type-safe. Also add const where possible,
including without need to touch further code. Additionally replace an
adjacent unnecessary use of u16.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
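
As a rough illustration of why container_of() is preferable to a plain cast: the macro derives the outer pointer from the member's real offset, so it stays correct even when the embedded member is not first. The sketch below uses a simplified macro and hypothetical struct names; Xen's own container_of() additionally checks the member's type.

```c
#include <stddef.h>

/* Simplified container_of, for illustration only (no type check). */
#define example_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct example_header { unsigned int length; };

struct example_table {
    unsigned int flags;
    struct example_header header;   /* embedded, deliberately not first */
};

/* Recover the enclosing table from a pointer to its embedded header. */
static struct example_table *example_from_header(struct example_header *h)
{
    return example_container_of(h, struct example_table, header);
}
```

A plain cast of the header pointer would only yield the right address if header were the first member.
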
4 years ago x86/ACPI: don't overwrite FADT
Jan Beulich [Mon, 11 Jan 2021 13:55:52 +0000 (14:55 +0100)]
x86/ACPI: don't overwrite FADT

When marking fields invalid for our own purposes, we should do so in our
local copy (so we will notice later on), not in the firmware provided
one (which another entity may want to look at again, e.g. after kexec).
Also mark the function parameter const to notice such issues right away.

Instead use the pointer at the firmware copy for specifying an adjacent
printk()'s arguments. If nothing else this at least reduces the number
of relocations the assembler has to emit and the linker has to process.

Fixes: 62d1a69a4e9f ("ACPI: support v5 (reduced HW) sleep interface")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago ACPI: reduce verbosity by default
Jan Beulich [Mon, 11 Jan 2021 13:55:16 +0000 (14:55 +0100)]
ACPI: reduce verbosity by default

While they're KERN_INFO messages and hence not visible by default, we
still have had reports that the amount of output is too large, not the
least because
- the command line controlled resizing of the console ring buffer
  happens only after SRAT parsing (which may alone produce more than 16k
  of output),
- the default resizing of the console ring buffer happens only after
  ACPI table parsing, since the default size gets calculated depending
  on the number of processors found.

Gate all per-processor logging behind a new "acpi=verbose", making sure
we wouldn't unintentionally pass this on to Dom0.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago evtchn: closing of vIRQ-s doesn't require looping over all vCPU-s
Jan Beulich [Mon, 11 Jan 2021 13:53:55 +0000 (14:53 +0100)]
evtchn: closing of vIRQ-s doesn't require looping over all vCPU-s

Global vIRQ-s have their event channel association tracked on vCPU 0.
Per-vCPU vIRQ-s can't have their notify_vcpu_id changed. Hence it is
well-known which vCPU's virq_to_evtchn[] needs updating.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years ago evtchn: don't call Xen consumer callback with per-channel lock held
Jan Beulich [Mon, 11 Jan 2021 13:53:02 +0000 (14:53 +0100)]
evtchn: don't call Xen consumer callback with per-channel lock held

While there don't look to be any problems with this right now, the lock
order implications from holding the lock can be very difficult to follow
(and may be easy to violate unknowingly). The present callbacks don't
(and no such callback should) have any need for the lock to be held.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago x86/PV: fold redundant calls to adjust_guest_l<N>e()
Jan Beulich [Mon, 11 Jan 2021 13:51:39 +0000 (14:51 +0100)]
x86/PV: fold redundant calls to adjust_guest_l<N>e()

At least from an abstract perspective it is quite odd for us to compare
adjusted old and unadjusted new page table entries when determining
whether the fast path can be used. This is largely benign because
FASTPATH_FLAG_WHITELIST covers most of the flags which the adjustments
may set, and the flags getting set don't affect the outcome of
get_page_from_l<N>e(). There's one exception: 32-bit L3 entries get
_PAGE_RW set, but get_page_from_l3e() doesn't allow linear page tables
to be created at this level for such guests. Apart from this _PAGE_RW
is unused by get_page_from_l<N>e() (for N > 1), and hence forcing the
bit on early has no functional effect.

The main reason for the change, however, is that adjust_guest_l<N>e()
aren't exactly cheap - both in terms of pure code size and because each
one has at least one evaluate_nospec() by way of containing
is_pv_32bit_domain() conditionals.

Call the functions once ahead of the fast path checks, instead of twice
after.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/PV: consistently inline {,un}adjust_guest_l<N>e()
Jan Beulich [Mon, 11 Jan 2021 13:50:38 +0000 (14:50 +0100)]
x86/PV: consistently inline {,un}adjust_guest_l<N>e()

Commit 8a74707a7c ("x86/nospec: Use always_inline to fix code gen for
evaluate_nospec") converted inline to always_inline for
adjust_guest_l[134]e(), but left adjust_guest_l2e() and
unadjust_guest_l3e() alone without saying why these two would differ in
the needed / wanted treatment. Adjust these two as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago xen/arm: do not read MVFR2 when is not defined
Stefano Stabellini [Tue, 5 Jan 2021 19:05:48 +0000 (11:05 -0800)]
xen/arm: do not read MVFR2 when is not defined

MVFR2 is not available on ARMv7. It is available on ARMv8 aarch32 and
aarch64. If Xen reads MVFR2 on ARMv7 it could crash.

Avoid the issue by doing the following:

- define MVFR2_MAYBE_UNDEFINED on arm32
- if MVFR2_MAYBE_UNDEFINED, do not attempt to read MVFR2 in Xen
- keep the 3rd register_t in struct cpuinfo_arm.mvfr on arm32 so that a
  guest read to the register returns '0' instead of crashing the guest.

'0' is an appropriate value to return to the guest because it is defined
as "no support for miscellaneous features".

Aarch64 Xen is not affected by this patch.

Fixes: 9cfdb489af81 ("xen/arm: Add ID registers and complete cpuinfo")
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago x86/hypercall: fix gnttab hypercall args conditional build on pvshim
Roger Pau Monné [Fri, 8 Jan 2021 15:51:52 +0000 (16:51 +0100)]
x86/hypercall: fix gnttab hypercall args conditional build on pvshim

A pvshim build doesn't require the grant table functionality built in,
but it does require knowing the number of arguments the hypercall has
so the hypercall parameter clobbering works properly.

Instead of also setting the argument count for the gnttab case if PV
shim functionality is enabled, just drop all of the conditionals from
hypercall_args_table, as a hypercall having a NULL handler won't get
to use that information anyway.

Note this hasn't been detected by osstest because the tools pvshim
build is done without debug enabled, so the hypercall parameter
clobbering doesn't happen.

Fixes: d2151152dd2 ('xen: make grant table support configurable')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years ago x86/shadow: adjust TLB flushing in sh_unshadow_for_p2m_change()
Jan Beulich [Fri, 8 Jan 2021 15:51:19 +0000 (16:51 +0100)]
x86/shadow: adjust TLB flushing in sh_unshadow_for_p2m_change()

Accumulating transient state of d->dirty_cpumask in a local variable is
unnecessary here: The flush is fine to make with the dirty set at the
time of the call. With this, move the invocation to a central place at
the end of the function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years ago x86/shadow: cosmetics to sh_unshadow_for_p2m_change()
Jan Beulich [Fri, 8 Jan 2021 15:50:47 +0000 (16:50 +0100)]
x86/shadow: cosmetics to sh_unshadow_for_p2m_change()

Besides the adjustments for style
- use switch(),
- widen scope of commonly used variables,
- narrow scope of other variables.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years ago x86/p2m: pass old PTE directly to write_p2m_entry_pre() hook
Jan Beulich [Fri, 8 Jan 2021 15:50:11 +0000 (16:50 +0100)]
x86/p2m: pass old PTE directly to write_p2m_entry_pre() hook

In no case is a pointer to non-const needed. Since no pointer arithmetic
is done by the sole user of the hook, passing in the PTE itself is quite
fine.

While doing this adjustment also
- drop the intermediate sh_write_p2m_entry_pre():
  sh_unshadow_for_p2m_change() can itself be used as the hook function,
  moving the conditional into there,
- introduce a local variable holding the flags of the old entry.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years ago x86/p2m: avoid unnecessary calls of write_p2m_entry_pre() hook
Jan Beulich [Fri, 8 Jan 2021 15:49:23 +0000 (16:49 +0100)]
x86/p2m: avoid unnecessary calls of write_p2m_entry_pre() hook

When shattering a large page, we first construct the new page table page
and only then hook it up. The "pre" hook in this case does nothing, for
the page starting out all blank. Avoid 512 calls into shadow code in
this case by passing in INVALID_GFN, indicating the page being updated
is (not yet) associated with any GFN. (The alternative to this change
would be to actually pass in a correct GFN, which can't be all the same
on every loop iteration.)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/mem_sharing: resolve mm-lock order violations when forking VMs with nested p2m
Tamas K Lengyel [Fri, 8 Jan 2021 10:51:36 +0000 (11:51 +0100)]
x86/mem_sharing: resolve mm-lock order violations when forking VMs with nested p2m

Several lock-order violations have been encountered while attempting to fork
VMs with nestedhvm=1 set. This patch resolves the issues.

The order violations stem from a call to p2m_flush_nestedp2m being performed
whenever the hostp2m changes. This function always takes the p2m lock for the
nested_p2m. However, with sharing the p2m locks always have to be taken before
the sharing lock. To resolve this issue we avoid taking the sharing lock where
possible (it was actually unnecessary to begin with). But we also make
p2m_flush_nestedp2m aware that the p2m lock may have already been taken and
preemptively take all nested_p2m locks before unsharing a page where taking the
sharing lock is necessary.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years ago x86: fold indirect_thunk_asm.h into asm-defns.h
Jan Beulich [Fri, 8 Jan 2021 10:50:32 +0000 (11:50 +0100)]
x86: fold indirect_thunk_asm.h into asm-defns.h

There's little point in having two separate headers both getting
included by asm_defns.h. This in particular reduces the number of
instances of guarding asm(".include ...") suitably in such dual use
headers.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86: drop ASM_{CL,ST}AC
Jan Beulich [Fri, 8 Jan 2021 10:48:09 +0000 (11:48 +0100)]
x86: drop ASM_{CL,ST}AC

Use ALTERNATIVE directly, such that at the use sites it is visible that
alternative code patching is in use. Similarly avoid hiding the fact in
SAVE_ALL.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago x86: replace __ASM_{CL,ST}AC
Jan Beulich [Fri, 8 Jan 2021 10:45:07 +0000 (11:45 +0100)]
x86: replace __ASM_{CL,ST}AC

Introduce proper assembler macros instead, enabled only when the
assembler itself doesn't support the insns. To avoid duplicating the
macros for assembly and C files, have them processed into asm-macros.h.
This in turn requires adding a multiple inclusion guard when generating
that header.

No change to generated code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago xen/arm: optee: The function identifier is always 32-bit
Roman Skakun [Wed, 6 Jan 2021 11:26:57 +0000 (13:26 +0200)]
xen/arm: optee: The function identifier is always 32-bit

Per the SMCCC specification (see section 3.1 in ARM DEN 0028D), the
function identifier is only stored in the least significant 32-bits.
The most significant 32-bits should be ignored.

Signed-off-by: Roman Skakun <roman_skakun@epam.com>
Acked-by: Volodymyr Babchyk <volodymyr_babchuk@epam.com>
[jgrall: Reword the commit message and comment]
Acked-by: Julien Grall <jgrall@amazon.com>
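
The rule described in the OP-TEE commit above amounts to truncating the first argument register to 32 bits before interpreting it. A minimal sketch, assuming the register value arrives as a 64-bit integer; the helper name is hypothetical and not part of Xen:

```c
#include <stdint.h>

/* Keep only the least significant 32 bits of the first SMC argument. */
static inline uint32_t example_smccc_function_id(uint64_t x0)
{
    return (uint32_t)x0;
}
```
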
4 years ago xsm/dummy: harden against speculative abuse
Jan Beulich [Thu, 7 Jan 2021 14:11:25 +0000 (15:11 +0100)]
xsm/dummy: harden against speculative abuse

First of all don't open-code is_control_domain(), which is already
suitably using evaluate_nospec(). Then also apply this construct to the
other paths of xsm_default_action(). Also guard two paths not using this
function.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
4 years ago x86/dpci: EOI interrupt regardless of its masking status
Roger Pau Monné [Thu, 7 Jan 2021 14:10:29 +0000 (15:10 +0100)]
x86/dpci: EOI interrupt regardless of its masking status

Modify hvm_pirq_eoi to always EOI the interrupt if required, instead
of not doing such EOI if the interrupt is routed through the vIO-APIC
and the entry is masked at the time the EOI is performed.

Further unmask of the vIO-APIC pin won't EOI the interrupt, and thus
the guest OS has to wait for the timeout to expire and the automatic
EOI to be performed.

This allows simplifying the helpers and dropping the vioapic_redir_entry
parameter from all of them.

Fixes: ccfe4e08455 ('Intel vt-d specific changes in arch/x86/hvm/vmx/vtd.')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years ago x86: drop use of E801 memory "map" (and alike)
Jan Beulich [Thu, 7 Jan 2021 14:09:47 +0000 (15:09 +0100)]
x86: drop use of E801 memory "map" (and alike)

ACPI mandates use of E820 (or newer, e.g. EFI), and in fact firmware
has been observed to include E820_ACPI ranges in what E801 reports as
available (really "configured") memory. Since all 64-bit systems ought
to support ACPI, drop our use of older BIOS and boot loader interfaces.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago x86/mem-sharing: don't pointlessly use get_domain_by_id()
Jan Beulich [Thu, 7 Jan 2021 14:09:20 +0000 (15:09 +0100)]
x86/mem-sharing: don't pointlessly use get_domain_by_id()

For short-lived references rcu_lock_domain_by_id() is the better
(slightly cheaper) alternative.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago x86: don't pointlessly use get_domain_by_id()
Jan Beulich [Thu, 7 Jan 2021 14:08:51 +0000 (15:08 +0100)]
x86: don't pointlessly use get_domain_by_id()

For short-lived references rcu_lock_domain_by_id() is the better
(slightly cheaper) alternative.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago common: don't (kind of) open-code rcu_lock_domain_by_any_id()
Jan Beulich [Thu, 7 Jan 2021 14:06:15 +0000 (15:06 +0100)]
common: don't (kind of) open-code rcu_lock_domain_by_any_id()

Even more so when using rcu_lock_domain_by_id() in place of the more
efficient rcu_lock_current_domain().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago vPCI/MSI-X: fold clearing of entry->updated
Jan Beulich [Thu, 7 Jan 2021 14:03:17 +0000 (15:03 +0100)]
vPCI/MSI-X: fold clearing of entry->updated

Both call sites clear the flag after a successful call to
update_entry(). This can be simplified by moving the clearing into the
function, onto its success path.

As a result of neither caller caring about update_entry()'s return value
anymore, the function gets switched to return void.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/vm_event: transfer nested p2m base info
Tamas K Lengyel [Sun, 3 Jan 2021 18:41:17 +0000 (11:41 -0700)]
x86/vm_event: transfer nested p2m base info

Required to introspect events originating from nested VMs.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago x86/mem_sharing: Copy CPUID and MSR configuration during vm forking
Tamas K Lengyel [Tue, 5 Jan 2021 21:58:23 +0000 (13:58 -0800)]
x86/mem_sharing: Copy CPUID and MSR configuration during vm forking

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago tools/libxenguest: move M2P macros to xg_private.h
Olaf Hering [Tue, 5 Jan 2021 15:13:56 +0000 (16:13 +0100)]
tools/libxenguest: move M2P macros to xg_private.h

Just code movement as a preparatory change before xg_sr_* will be moved.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago tools/libxenguest: remove FOLD_CR3 from xg_save_restore.h
Olaf Hering [Tue, 5 Jan 2021 15:05:36 +0000 (16:05 +0100)]
tools/libxenguest: remove FOLD_CR3 from xg_save_restore.h

The last user was removed with commit b15bc4345e772df92e5ffdbc4c1e9ae2a6206617

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago tools/libxenguest: remove get_platform_info from xg_save_restore.h
Olaf Hering [Tue, 5 Jan 2021 15:02:47 +0000 (16:02 +0100)]
tools/libxenguest: remove get_platform_info from xg_save_restore.h

Last user was removed with commit 4ddf474e2b7c045fadeaf765ac6157de745e84d6
Previously it was also used in migration code, which was removed with commit
b15bc4345e772df92e5ffdbc4c1e9ae2a6206617

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago libxl: cleanup remaining backend xs dirs after driver domain
Marek Marczykowski-Górecki [Sun, 8 Nov 2020 14:59:42 +0000 (15:59 +0100)]
libxl: cleanup remaining backend xs dirs after driver domain

When device is removed, backend domain (which may be a driver domain) is
responsible for removing backend entries from xenstore. But in case of
driver domain, it has no access to remove all of them - specifically the
directory named after frontend-id remains. This may accumulate enough to
exceed xenstore quota of the driver domain, breaking further devices.

Fix this by calling libxl__xs_path_cleanup() on the backend path from
libxl__device_destroy() in the toolstack domain too. Note
libxl__device_destroy() is called when the driver domain already removed
what it can (see device_destroy_be_watch_cb()->device_hotplug_done()).

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wl@xen.org>
4 years ago tools: ipxe: update for fixing build with GCC10
Olaf Hering [Mon, 4 Jan 2021 11:52:23 +0000 (12:52 +0100)]
tools: ipxe: update for fixing build with GCC10

Update to v1.21.1 to fix build in Tumbleweed, which has been broken
for months due to the lack of a new release.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wl@xen.org>
4 years ago tools/libxenguest: handle more than 16T in precopy_stats
Olaf Hering [Tue, 5 Jan 2021 08:30:48 +0000 (09:30 +0100)]
tools/libxenguest: handle more than 16T in precopy_stats

total_written tracks the number of transferred dirty pages.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wl@xen.org>
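
The 16T figure in the title above follows from simple arithmetic, assuming 4 KiB pages and a page counter that was previously 32 bits wide (both are assumptions, not stated in the message): 2^32 pages of 2^12 bytes each is 2^44 bytes, i.e. 16 TiB. The helper names below are hypothetical.

```c
#include <stdint.h>

#define EXAMPLE_PAGE_SHIFT 12   /* 4 KiB pages (assumption) */

/* Bytes represented by a page count, computed in 64 bits. */
static inline uint64_t example_pages_to_bytes(uint64_t pages)
{
    return pages << EXAMPLE_PAGE_SHIFT;
}

/* What a 32-bit page counter would hold after counting 'pages' pages. */
static inline uint32_t example_truncated_count(uint64_t pages)
{
    return (uint32_t)pages;
}
```
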
4 years ago libs/devicemodel: add dm_op support for FreeBSD
Roger Pau Monne [Tue, 5 Jan 2021 10:25:46 +0000 (11:25 +0100)]
libs/devicemodel: add dm_op support for FreeBSD

The FreeBSD ioctls have the same fields as the Linux ones, so the
same file can be shared between both OSes.

No functional change for OSes other than FreeBSD.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years ago libs/foreignmemory: implement the missing functions on FreeBSD
Roger Pau Monne [Tue, 5 Jan 2021 10:25:45 +0000 (11:25 +0100)]
libs/foreignmemory: implement the missing functions on FreeBSD

Implement restrict, map resource and unmap resource helpers on
FreeBSD.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years ago lib/sort: adjust types
Jan Beulich [Tue, 5 Jan 2021 12:20:54 +0000 (13:20 +0100)]
lib/sort: adjust types

First and foremost do away with the use of plain int for sizes or size-
derived values. Use size_t, despite this requiring some adjustment to
the logic. Also replace u32 by uint32_t.

While not directly related also drop a leftover #ifdef from x86's
swap_ex - this was needed only back when 32-bit Xen was still a thing.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
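
One typical kind of logic adjustment needed when moving from int to size_t is in downward-counting loops, where a test like i >= 0 is always true for an unsigned index. The sketch below shows that general pattern only; it is not taken from the lib/sort patch, and the function name is hypothetical.

```c
#include <stddef.h>

/* Sum elements from last to first using an unsigned index. */
static unsigned long example_sum_reverse(const unsigned int *a, size_t n)
{
    unsigned long sum = 0;
    size_t i;

    /* 'i >= 0' cannot terminate for size_t, so test before decrementing. */
    for ( i = n; i-- > 0; )
        sum += a[i];

    return sum;
}
```
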
4 years ago vPCI/MSI-X: tidy init_msix()
Jan Beulich [Tue, 5 Jan 2021 12:20:13 +0000 (13:20 +0100)]
vPCI/MSI-X: tidy init_msix()

First of all introduce a local variable for the to be allocated struct.
The compiler can't CSE all the occurrences (I'm observing 80 bytes of
code saved with gcc 10). Additionally, while the caller can cope and
there was no memory leak, globally "announce" the struct only once done
initializing it. This also removes the dependency of the function on
the caller cleaning up after it in case of an error.

Also prefer a local variable over using a structure field previously
set from this very variable.

Finally move the call to vpci_add_register() ahead of all further
initialization of the struct, to bail early in case of error.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago vPCI/MSI-X: make use of xzalloc_flex_struct()
Jan Beulich [Tue, 5 Jan 2021 12:19:28 +0000 (13:19 +0100)]
vPCI/MSI-X: make use of xzalloc_flex_struct()

... instead of effectively open-coding it in a type-unsafe way.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
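
For readers unfamiliar with the idea, a flexible-struct allocator computes the size from the struct type and its trailing array member rather than from hand-written arithmetic on raw bytes. The sketch below is a user-space stand-in built on calloc(); the macro example_zalloc_flex() and the struct are hypothetical and do not reproduce Xen's xzalloc_flex_struct().

```c
#include <stddef.h>
#include <stdlib.h>

struct example_msix {
    unsigned int max_entries;
    unsigned int entries[];   /* flexible array member */
};

/* Hypothetical stand-in: zeroed allocation sized for 'nr' trailing elements. */
#define example_zalloc_flex(type, field, nr) \
    ((type *)calloc(1, offsetof(type, field) + \
                       (nr) * sizeof(((type *)0)->field[0])))

static struct example_msix *example_alloc_msix(unsigned int nr)
{
    struct example_msix *m =
        example_zalloc_flex(struct example_msix, entries, nr);

    if ( m )
        m->max_entries = nr;

    return m;
}
```

Because the macro result is cast to the requested type, assigning it to a pointer of a different struct type draws a compiler diagnostic, which an open-coded byte-count allocation would not.
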
4 years ago x86/vPCI: check address in vpci_msi_update()
Jan Beulich [Tue, 5 Jan 2021 12:18:26 +0000 (13:18 +0100)]
x86/vPCI: check address in vpci_msi_update()

If the upper address bits don't match the interrupt delivery address
space window, entirely different behavior would need to be implemented.
Refuse such requests for the time being.

Replace adjacent hard tabs while introducing MSI_ADDR_BASE_MASK.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
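
The check described above can be pictured as masking off the low address bits and comparing the rest against the x86 interrupt delivery window at 0xfee00000. This is only a sketch: the constant values and names below are assumptions for illustration, and the real MSI_ADDR_BASE_MASK definition should be taken from the Xen headers.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_MSI_ADDR_HEADER    0xfee00000u    /* delivery window base */
#define EXAMPLE_MSI_ADDR_BASE_MASK (~0xfffffull)  /* assumed: bits 20 and up */

/* True when the upper address bits match the interrupt delivery window. */
static inline bool example_msi_addr_valid(uint64_t addr)
{
    return (addr & EXAMPLE_MSI_ADDR_BASE_MASK) == EXAMPLE_MSI_ADDR_HEADER;
}
```
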
4 years ago x86/vPCI: tolerate (un)masking a disabled MSI-X entry
Jan Beulich [Tue, 5 Jan 2021 12:17:54 +0000 (13:17 +0100)]
x86/vPCI: tolerate (un)masking a disabled MSI-X entry

None of the four reasons causing vpci_msix_arch_mask_entry() to get
called (there's just a single call site) are impossible or illegal prior
to an entry actually having got set up:
- the entry may remain masked (in this case, however, a prior masked ->
  unmasked transition would already not have worked),
- MSI-X may not be enabled,
- the global mask bit may be set,
- the entry may not otherwise have been updated.
Hence the function asserting that the entry was previously set up was
simply wrong. Since the caller tracks the masked state (and setting up
of an entry would only be effected when that software bit is clear),
it's okay to skip both masking and unmasking requests in this case.

Fixes: d6281be9d0145 ('vpci/msix: add MSI-X handlers')
Reported-by: Manuel Bouyer <bouyer@antioche.eu.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Manuel Bouyer <bouyer@antioche.eu.org>
4 years ago x86: hypercall vector is unused when !PV32
Jan Beulich [Tue, 5 Jan 2021 12:17:02 +0000 (13:17 +0100)]
x86: hypercall vector is unused when !PV32

This vector can be used as an ordinary interrupt handling one in this
case. To be sure no references are left, make the #define itself
conditional.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/build: restrict contents of asm-offsets.h when !HVM / !PV
Jan Beulich [Tue, 5 Jan 2021 12:13:18 +0000 (13:13 +0100)]
x86/build: restrict contents of asm-offsets.h when !HVM / !PV

This file has a long dependencies list (through asm-offsets.[cs]) and a
long list of dependents. IOW if any of the former changes, all of the
latter will be rebuilt, even if there's no actual change to the
generated file. Therefore avoid producing symbols we don't actually
need, depending on configuration.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/build: limit #include-ing by asm-offsets.c
Jan Beulich [Tue, 5 Jan 2021 12:12:37 +0000 (13:12 +0100)]
x86/build: limit #include-ing by asm-offsets.c

This file has a long dependencies list and asm-offsets.h, generated from
it, has a long list of dependents. IOW if any of the former changes, all
of the latter will be rebuilt, even if there's no actual change to the
generated file. Therefore avoid including headers we don't actually need
(generally or configuration dependent).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/build: limit rebuilding of asm-offsets.h
Jan Beulich [Tue, 5 Jan 2021 12:12:15 +0000 (13:12 +0100)]
x86/build: limit rebuilding of asm-offsets.h

This file has a long dependencies list (through asm-offsets.[cs]) and a
long list of dependents. IOW if any of the former changes, all of the
latter will be rebuilt, even if there's no actual change to the
generated file. This is the primary scenario we have the move-if-changed
macro for.

Since debug information may easily cause the file contents to change in
benign ways, also avoid emitting this into the output file.

Finally, even before this change, *.new files should have been included
in what gets removed by the "clean" target.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined
Jan Beulich [Tue, 5 Jan 2021 12:11:04 +0000 (13:11 +0100)]
x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined

We can be more tolerant as long as the data collected from FACS is only
needed to enter S3. A prior change already added suitable checking to
acpi_enter_sleep().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago x86/ACPI: fix S3 wakeup vector mapping
Jan Beulich [Tue, 5 Jan 2021 12:09:55 +0000 (13:09 +0100)]
x86/ACPI: fix S3 wakeup vector mapping

Use of __acpi_map_table() here was at least close to an abuse already
before, but it will now consistently return NULL here. Drop the layering
violation and use set_fixmap() directly. Re-use of the ACPI fixmap area
is hopefully going to remain "fine" for the time being.

Add checks to acpi_enter_sleep(): The vector now needs to be contained
within a single page, but the ACPI spec requires 64-byte alignment of
FACS anyway. Also bail if no wakeup vector was determined in the first
place, in part as preparation for a subsequent relaxation change.

Fixes: 1c4aa69ca1e1 ("xen/acpi: Rework acpi_os_map_memory() and acpi_os_unmap_memory()")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago xen/arm: Activate TID3 in HCR_EL2
Bertrand Marquis [Thu, 17 Dec 2020 15:38:08 +0000 (15:38 +0000)]
xen/arm: Activate TID3 in HCR_EL2

Activate TID3 bit in HCR register when starting a guest.
This will trap all coprocessor ID registers so that we can give guests
values corresponding to what they can actually use and mask some
features to guests even though they would be supported by the underlying
hardware (like SVE or MPAM).

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years ago xen/arm: Add CP10 exception support to handle MVFR
Bertrand Marquis [Thu, 17 Dec 2020 15:38:07 +0000 (15:38 +0000)]
xen/arm: Add CP10 exception support to handle MVFR

Add support for cp10 exceptions decoding to be able to emulate the
values for MVFR0, MVFR1 and MVFR2 when TID3 bit of HCR is activated.
This is required for aarch32 guests accessing MVFR registers using
vmrs and vmsr instructions.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years ago xen/arm: Add handler for cp15 ID registers
Bertrand Marquis [Thu, 17 Dec 2020 15:38:06 +0000 (15:38 +0000)]
xen/arm: Add handler for cp15 ID registers

Add support for emulation of cp15 based ID registers (on arm32 or when
running a 32bit guest on arm64).
The handlers are returning the values stored in the guest_cpuinfo
structure for known registers and RAZ for all reserved registers.
In the current status the MVFR registers are not supported.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
[Stefano: fix code style]
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years ago xen/arm: Add handler for ID registers on arm64
Bertrand Marquis [Thu, 17 Dec 2020 15:38:05 +0000 (15:38 +0000)]
xen/arm: Add handler for ID registers on arm64

Add vsysreg emulation for registers trapped when TID3 bit is activated
in HCR.
The emulation is returning the value stored in cpuinfo_guest structure
for known registers and is handling reserved registers as RAZ.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years ago xen/arm: create a cpuinfo structure for guest
Bertrand Marquis [Thu, 17 Dec 2020 15:38:04 +0000 (15:38 +0000)]
xen/arm: create a cpuinfo structure for guest

Create a cpuinfo structure for guest and mask into it the features that
we do not support in Xen or that we do not want to publish to guests.

Modify some values in the guest cpuinfo structure to hide some
processor features:
- SVE, as this is not supported by Xen and guests are not allowed to use
  this feature (ZEN is set to 0 in CPTR_EL2).
- AMU, as HCPTR_TAM is set in CPTR_EL2 so AMU cannot be used by guests.
- RAS, as this is not supported by Xen.
All other bits are left untouched.

The code is trying to group together registers modifications for the
same feature to be able in the long term to easily enable/disable a
feature depending on user parameters or add other registers modification
in the same place (like enabling/disabling HCR bits).
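
A minimal sketch of the masking idea, assuming the three features are
advertised as 4-bit fields of ID_AA64PFR0_EL1 at the bit positions given
below (taken from the Arm architecture documentation as I understand it;
please verify before relying on them). The helper names are hypothetical
and this is not the code from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_WIDTH     4
/* Assumed field positions in ID_AA64PFR0_EL1. */
#define PFR0_RAS_SHIFT  28
#define PFR0_SVE_SHIFT  32
#define PFR0_AMU_SHIFT  44

/* Clear one 4-bit feature field so the feature reads as not implemented. */
static uint64_t hide_field(uint64_t reg, unsigned int shift)
{
    assert(shift <= 64 - FIELD_WIDTH);
    return reg & ~(UINT64_C(0xf) << shift);
}

/* Derive the guest view from the host value; other bits are untouched. */
static uint64_t make_guest_pfr0(uint64_t host_pfr0)
{
    uint64_t v = host_pfr0;

    v = hide_field(v, PFR0_SVE_SHIFT);
    v = hide_field(v, PFR0_AMU_SHIFT);
    v = hide_field(v, PFR0_RAS_SHIFT);

    return v;
}
```

Grouping the three calls in one function mirrors the stated goal of
keeping all modifications for the guest view in a single place.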

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoxen/arm: Add arm64 ID registers definitions
Bertrand Marquis [Thu, 17 Dec 2020 15:38:03 +0000 (15:38 +0000)]
xen/arm: Add arm64 ID registers definitions

Add coprocessor register definitions for all ID registers trapped
through the TID3 bit of HCR.
Those are the ones that will be emulated in Xen, so that only the
features that are supported by Xen and accessible to guests are
published to guests.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoxen/arm: Add ID registers and complete cpuinfo
Bertrand Marquis [Thu, 17 Dec 2020 15:38:02 +0000 (15:38 +0000)]
xen/arm: Add ID registers and complete cpuinfo

Add definitions and entries in cpuinfo for ID registers introduced in
newer versions of the Arm Architecture Reference Manual:
- ID_PFR2: processor feature register 2
- ID_DFR1: debug feature register 1
- ID_MMFR4 and ID_MMFR5: Memory model feature registers 4 and 5
- ID_ISAR6: ISA Feature register 6
Add more bitfield definitions in PFR fields of cpuinfo.
Add MVFR2 register definition for aarch32.
Add MVFRx_EL1 defines for aarch32.
Add mvfr values in cpuinfo.
Add some register definitions for arm64 in sysregs as some are not
always known by compilers.
Initialize the new values added in cpuinfo in identify_cpu during init.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoxen/arm: Use READ_SYSREG instead of 32/64 versions
Bertrand Marquis [Thu, 17 Dec 2020 15:38:01 +0000 (15:38 +0000)]
xen/arm: Use READ_SYSREG instead of 32/64 versions

Modify identify_cpu function to use READ_SYSREG instead of READ_SYSREG32
or READ_SYSREG64.

All aarch32 specific registers (for example ID_PFR0_EL1) are 64bit when
accessed from aarch64 with upper bits read as 0, so it is right to
access them as 64bit registers on a 64bit platform.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agox86/p2m: Fix paging_gva_to_gfn() for nested virt
Andrew Cooper [Thu, 31 Dec 2020 16:55:20 +0000 (16:55 +0000)]
x86/p2m: Fix paging_gva_to_gfn() for nested virt

nestedhap_walk_L1_p2m() takes guest physical addresses, not frame numbers.
This means the l2 input is off-by-PAGE_SHIFT, as is the l1 value eventually
returned to the caller.

Delete the misleading comment as well.
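
To make the units explicit, here is a small sketch of the conversion a
caller has to perform when a walker works on addresses rather than frame
numbers. The walker below is a made-up placeholder, not
nestedhap_walk_L1_p2m(), and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* 4KiB pages */

/* Frame number -> guest physical address of the start of that frame. */
static uint64_t gfn_to_gaddr_sketch(uint64_t gfn)
{
    assert((gfn >> (64 - PAGE_SHIFT)) == 0); /* must not overflow */
    return gfn << PAGE_SHIFT;
}

/* Guest physical address -> number of the frame containing it. */
static uint64_t gaddr_to_gfn_sketch(uint64_t gaddr)
{
    return gaddr >> PAGE_SHIFT;
}

/* Placeholder walker: takes an address and returns an address. */
static uint64_t walk_by_address(uint64_t l2_gaddr)
{
    return l2_gaddr + 0x100000; /* made-up translation */
}

/* Convert on the way in and again on the way out. */
static uint64_t translate_gfn(uint64_t l2_gfn)
{
    uint64_t l1_gaddr = walk_by_address(gfn_to_gaddr_sketch(l2_gfn));

    return gaddr_to_gfn_sketch(l1_gaddr);
}
```

Passing the frame number straight to the walker, or returning its result
unshifted, would be wrong by a factor of 1 << PAGE_SHIFT.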

Fixes: bab2bd8e222de ("xen/nested_p2m: Don't walk EPT tables with a regular PT walker")
Reported-by: Tamas K Lengyel <tamas@tklengyel.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: Tamas K Lengyel <tamas@tklengyel.com>
4 years agox86/p2m: fix p2m_add_foreign error path
Roger Pau Monné [Mon, 4 Jan 2021 09:03:23 +0000 (10:03 +0100)]
x86/p2m: fix p2m_add_foreign error path

One of the error paths in p2m_add_foreign could call put_page with a
NULL page, thus triggering a fault.

Split the checks into two different if statements, so the appropriate
error path can be taken.
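
The shape of such a fix can be sketched as follows. This is not the code
from p2m_add_foreign(); the types and function names are hypothetical,
and the point is only that the NULL case and the "page held but
unsuitable" case need different cleanup.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct page { int refcount; };

/* Drop a reference; calling this with NULL is the fault to avoid. */
static void put_page_sketch(struct page *pg)
{
    assert(pg != NULL);
    pg->refcount--;
}

/* Returns 0 on success or a negative errno value. */
static int add_foreign_sketch(struct page *pg, int type_ok)
{
    if ( pg == NULL )
        return -EINVAL;      /* nothing acquired, nothing to release */

    if ( !type_ok )
    {
        put_page_sketch(pg); /* a reference is held: drop it first */
        return -EINVAL;
    }

    /* ... the page would be used here ... */
    put_page_sketch(pg);

    return 0;
}
```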

Fixes: 173ae325026bd ('x86/p2m: tidy p2m_add_foreign() a little')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoxen: remove the usage of the P ar option
Roger Pau Monne [Wed, 30 Dec 2020 17:34:46 +0000 (18:34 +0100)]
xen: remove the usage of the P ar option

It's not part of the POSIX standard [0] and as such non GNU ar
implementations don't usually have it.

It's not relevant for the use case here anyway, as the archive file is
recreated every time due to the rm invocation before the ar call. No
file name matching should happen so matching using the full path name
or a relative one should yield the same result.

This fixes the build on FreeBSD.

While there also drop the s option, as ar will already generate a
symbol table by default when creating the archive.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/ar.html

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/svm: Clean up MSR_K8_VM_CR definitions
Andrew Cooper [Wed, 30 Dec 2020 19:26:14 +0000 (19:26 +0000)]
x86/svm: Clean up MSR_K8_VM_CR definitions

Drop the unused shift number, and reposition the constants into the cleaned-up
section.  Rename VM_CR_SVM_DISABLE to be closer to its APM definition.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/hpet: Fix return value of hpet_setup()
Andrew Cooper [Tue, 29 Dec 2020 17:51:23 +0000 (17:51 +0000)]
x86/hpet: Fix return value of hpet_setup()

hpet_setup() is idempotent if the rate has already been calculated, and
returns the cached value.  However, this only works correctly when the return
statements are identical.

Use a sensibly named local variable, rather than a dead one with a bad name.
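
A minimal sketch of the caching pattern, assuming a probe function and
an arbitrary example rate (both hypothetical, not the real hpet_setup()
internals). Both return statements hand back the same variable, which is
what keeps repeated calls consistent.

```c
#include <assert.h>
#include <stdint.h>

static unsigned int probe_calls; /* how often the "hardware" was probed */

/* Hypothetical probe returning an arbitrary example rate in Hz. */
static uint64_t probe_rate(void)
{
    probe_calls++;
    return 14318180;
}

static uint64_t setup_sketch(void)
{
    static uint64_t cached_rate;

    if ( cached_rate )
        return cached_rate; /* already calculated: same value as below */

    cached_rate = probe_rate();
    assert(cached_rate != 0);

    return cached_rate;
}
```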

Fixes: a60bb68219 ("x86/time: reduce rounding errors in calculations")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoxen/domain: Introduce domain_teardown()
Andrew Cooper [Mon, 28 Sep 2020 17:14:53 +0000 (18:14 +0100)]
xen/domain: Introduce domain_teardown()

There is no common equivalent of domain_relinquish_resources(), which has
caused various pieces of common cleanup to live in inappropriate
places.

Perhaps most obviously, evtchn_destroy() is called for every continuation of
domain_relinquish_resources(), which can easily be thousands of times.

Create domain_teardown() to be a new top level facility, and call it from the
appropriate positions in domain_kill() and domain_create()'s error path.  The
intention is for this to supersede domain_relinquish_resources() in due course.

No change in behaviour yet.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/domain: Reorder trivial initialisation in early domain_create()
Andrew Cooper [Mon, 28 Sep 2020 15:47:58 +0000 (16:47 +0100)]
xen/domain: Reorder trivial initialisation in early domain_create()

This improves the robustness of the error paths.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agodocs: use predictable ordering in generated documentation
Maximilian Engelhardt [Fri, 18 Dec 2020 20:42:34 +0000 (21:42 +0100)]
docs: use predictable ordering in generated documentation

When the seq number is equal, sort by the title to get predictable
output ordering. This is useful for reproducible builds.
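
The tie-breaking idea can be sketched in C as a two-key comparator. The
real change lives in the documentation build scripts, so the structure
and names below are purely illustrative assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct doc_entry {
    int seq;
    const char *title;
};

/* Order by seq first; fall back to the title when seq is equal. */
static int cmp_doc_entry(const void *a, const void *b)
{
    const struct doc_entry *x = a, *y = b;

    if ( x->seq != y->seq )
        return (x->seq > y->seq) - (x->seq < y->seq);

    return strcmp(x->title, y->title);
}

static void sort_docs(struct doc_entry *docs, size_t n)
{
    assert(docs != NULL);
    qsort(docs, n, sizeof(*docs), cmp_doc_entry);
}
```

Without the second key, entries with the same seq could come out in any
order, since qsort() is not guaranteed to be stable.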

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/mm: p2m_add_foreign() is HVM-only
Jan Beulich [Tue, 22 Dec 2020 11:01:12 +0000 (12:01 +0100)]
x86/mm: p2m_add_foreign() is HVM-only

This is the case also for xenmem_add_to_physmap_one(), as it's the only
caller of the function. Move the latter next to p2m_add_foreign(),
allowing that one to become static at the same time. While moving, adjust
indentation of the body of the main switch().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/Intel: insert Tiger Lake model numbers
Jan Beulich [Tue, 22 Dec 2020 08:00:03 +0000 (09:00 +0100)]
x86/Intel: insert Tiger Lake model numbers

Both match prior generation processors as far as LBR and C-state MSRs
go (SDM rev 073). The if_pschange_mc erratum, according to the spec
update, is not applicable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/EFI: don't insert timestamp when SOURCE_DATE_EPOCH is defined
Maximilian Engelhardt [Tue, 22 Dec 2020 07:59:14 +0000 (08:59 +0100)]
x86/EFI: don't insert timestamp when SOURCE_DATE_EPOCH is defined

By default a timestamp gets added to the xen efi binary. Unfortunately
ld doesn't seem to provide a way to set a custom date, like from
SOURCE_DATE_EPOCH, so set a zero value for the timestamp (option
--no-insert-timestamp) if SOURCE_DATE_EPOCH is defined. This makes
reproducible builds possible.

This is an alternative to the patch suggested in [1]. This patch only
omits the timestamp when SOURCE_DATE_EPOCH is defined.

[1] https://lists.xenproject.org/archives/html/xen-devel/2020-10/msg02161.html

Signed-off-by: Maximilian Engelhardt <maxi@daemonizer.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86: verify function type (and maybe attribute) in switch_stack_and_jump()
Jan Beulich [Tue, 22 Dec 2020 07:57:19 +0000 (08:57 +0100)]
x86: verify function type (and maybe attribute) in switch_stack_and_jump()

It is imperative that the functions passed here take no arguments,
return no values, and don't return in the first place. While the type
can be checked uniformly, the attribute check is limited to gcc 9 and
newer (no clang support for this so far afaict).

Note that I didn't want to have the "true" fallback "implementation" of
__builtin_has_attribute(..., __noreturn__) generally available, as
"true" may not be a suitable fallback in other cases.

Note further that the noreturn addition to startup_cpu_idle_loop()'s
declaration requires adding unreachable() to Arm's
switch_stack_and_jump(), or else the build would break. I suppose this
should have been there already.

For vmx_asm_do_vmentry() along with adding the attribute, also restrict
its scope.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoxen: Rework WARN_ON() to return whether a warning was triggered
Julien Grall [Fri, 18 Dec 2020 13:30:54 +0000 (13:30 +0000)]
xen: Rework WARN_ON() to return whether a warning was triggered

So far, our implementation of WARN_ON() cannot be used in the following
situation:

if ( WARN_ON() )
    ...

This is because WARN_ON() doesn't return whether a warning has been
triggered. Such a construction can be handy if you want to print more
information and also dump the stack trace.

Therefore, rework the WARN_ON() implementation to return whether a
warning was triggered. The idea was borrowed from Linux.
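
One way to build such a macro is with a GNU C statement expression, as
sketched below. This is an assumption about the general technique, not
the macro from the patch; WARN_ON_SKETCH, warn_count and checked_div()
are hypothetical, and the real macro dumps a stack trace instead of
printing to stderr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

static unsigned int warn_count; /* number of warnings emitted so far */

/* Relies on the statement-expression extension of gcc and clang. */
#define WARN_ON_SKETCH(cond) ({                                      \
    bool warned_ = !!(cond);                                         \
    if ( warned_ )                                                   \
    {                                                                \
        warn_count++;                                                \
        fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__);   \
    }                                                                \
    warned_;                                                         \
})

/* The macro's value lets the caller react when the warning fired. */
static int checked_div(int a, int b)
{
    if ( WARN_ON_SKETCH(b == 0) )
        return 0; /* extra handling only on the warning path */

    return a / b;
}
```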

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/shadow: Fix build with !CONFIG_SHADOW_PAGING
Andrew Cooper [Mon, 21 Dec 2020 14:52:26 +0000 (14:52 +0000)]
x86/shadow: Fix build with !CONFIG_SHADOW_PAGING

Implement a stub for shadow_vcpu_teardown().

Fixes: d162f36848c4 ("xen/x86: Fix memory leak in vcpu_create() error path")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/x86: Fix memory leak in vcpu_create() error path
Andrew Cooper [Mon, 28 Sep 2020 14:25:44 +0000 (15:25 +0100)]
xen/x86: Fix memory leak in vcpu_create() error path

Various paths in vcpu_create() end up calling paging_update_paging_modes(),
which eventually allocates a monitor pagetable if one doesn't exist.

However, an error in vcpu_create() results in the vcpu being cleaned up
locally, and not put onto the domain's vcpu list.  Therefore, the monitor
table is not freed by {hap,shadow}_teardown()'s loop.  This is caught by
assertions later that we've successfully freed the entire hap/shadow memory
pool.

The per-vcpu loops in the domain teardown logic are conceptually wrong, but
exist due to insufficient structure in the existing logic.

Break paging_vcpu_teardown() out of paging_teardown(), with mirrored breakouts
in the hap/shadow code, and use it from arch_vcpu_create()'s error path.  This
fixes the memory leak.

The new {hap,shadow}_vcpu_teardown() must be idempotent, and are written to be
as tolerant as possible, with the minimum number of safety checks possible.
In particular, drop the mfn_valid() check - if these fields are junk, then Xen
is going to explode anyway.

Reported-by: Michał Leszczyński <michal.leszczynski@cert.pl>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/Kconfig: Correct the NR_CPUS description
Andrew Cooper [Fri, 18 Dec 2020 23:30:04 +0000 (23:30 +0000)]
xen/Kconfig: Correct the NR_CPUS description

The description "physical CPUs" is especially wrong, as it implies the number
of sockets, which tops out at 8 on all but the very biggest servers.

NR_CPUS is the number of logical entities the scheduler can use.

Reported-by: hanetzer@startmail.com
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agoRevert "x86/mm: p2m_add_foreign() is HVM-only"
Andrew Cooper [Fri, 18 Dec 2020 17:53:13 +0000 (17:53 +0000)]
Revert "x86/mm: p2m_add_foreign() is HVM-only"

This reverts commit 8009c33b5179536e2ecce54462fe4cd069060f77.  It breaks the
PV-Shim build.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/mm: p2m_add_foreign() is HVM-only
Jan Beulich [Fri, 18 Dec 2020 12:29:14 +0000 (13:29 +0100)]
x86/mm: p2m_add_foreign() is HVM-only

This is the case also for xenmem_add_to_physmap_one(), as it's the only
caller of the function. Move the latter next to p2m_add_foreign(),
allowing that one to become static at the same time. While moving, adjust
indentation of the body of the main switch().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/p2m: tidy p2m_add_foreign() a little
Jan Beulich [Fri, 18 Dec 2020 12:28:30 +0000 (13:28 +0100)]
x86/p2m: tidy p2m_add_foreign() a little

Drop a bogus ASSERT() - we don't typically assert incoming domain
pointers to be non-NULL, and there's no particular reason to do so here.

Replace the open-coded DOMID_SELF check by use of
rcu_lock_remote_domain_by_id(), at the same time covering the request
being made with the current domain's actual ID.

Move the "both domains same" check into just the path where it really
is meaningful.

Swap the order of the two puts, such that
- the p2m lock isn't needlessly held across put_page(),
- a separate put_page() on an error path can be avoided,
- they're inverse to the order of the respective gets.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolib: move sort code
Jan Beulich [Fri, 18 Dec 2020 12:25:40 +0000 (13:25 +0100)]
lib: move sort code

Build this code into an archive, partly paralleling bsearch().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agolib: move bsearch code
Jan Beulich [Fri, 18 Dec 2020 12:23:42 +0000 (13:23 +0100)]
lib: move bsearch code

Convert this code to an inline function (backed by an instance in an
archive in case the compiler decides against inlining), which results
in not having it in x86 final binaries. This saves a little bit of dead
code.
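
For illustration, a binary search written as an inline function might
look like the sketch below. It is a generic textbook version under
hypothetical names, not the Xen implementation, and it omits the
out-of-line instance kept in the archive.

```c
#include <assert.h>
#include <stddef.h>

/* Search a sorted array of num elements of the given size for key. */
static inline void *bsearch_sketch(const void *key, const void *base,
                                   size_t num, size_t size,
                                   int (*cmp)(const void *key,
                                              const void *elt))
{
    size_t lo = 0, hi = num;

    assert(cmp != NULL);

    while ( lo < hi )
    {
        size_t mid = lo + (hi - lo) / 2;
        const char *elt = (const char *)base + mid * size;
        int res = cmp(key, elt);

        if ( res == 0 )
            return (void *)elt;
        if ( res < 0 )
            hi = mid;      /* key sorts before elt */
        else
            lo = mid + 1;  /* key sorts after elt */
    }

    return NULL;
}

/* Example comparator for arrays of int. */
static int cmp_int(const void *key, const void *elt)
{
    int k = *(const int *)key, e = *(const int *)elt;

    return (k > e) - (k < e);
}
```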

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agolib: move rbtree code
Jan Beulich [Fri, 18 Dec 2020 12:22:54 +0000 (13:22 +0100)]
lib: move rbtree code

Build this code into an archive, which results in not linking it into
x86 final binaries. This saves about 1.5k of dead code.

While moving the source file, take the opportunity and drop the
pointless EXPORT_SYMBOL() and an instance of trailing whitespace.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agolib: move init_constructors()
Jan Beulich [Fri, 18 Dec 2020 12:22:10 +0000 (13:22 +0100)]
lib: move init_constructors()

... into its own CU, for being unrelated to other things in
common/lib.c.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agolib: move parse_size_and_unit()
Jan Beulich [Fri, 18 Dec 2020 12:21:25 +0000 (13:21 +0100)]
lib: move parse_size_and_unit()

... into its own CU, to build it into an archive.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agolib: move list sorting code
Jan Beulich [Fri, 18 Dec 2020 12:20:42 +0000 (13:20 +0100)]
lib: move list sorting code

Build the source file always, as by putting it into an archive it still
won't be linked into final binaries when not needed. This way possible
build breakage will be easier to notice, and it's more consistent with
us unconditionally building other library kind of code (e.g. sort() or
bsearch()).

While moving the source file, take the opportunity and drop the
pointless EXPORT_SYMBOL() and an unnecessary #include.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agolib: collect library files in an archive
Jan Beulich [Fri, 18 Dec 2020 12:17:57 +0000 (13:17 +0100)]
lib: collect library files in an archive

In order to (subsequently) drop odd things like CONFIG_NEEDS_LIST_SORT
just to avoid bloating binaries when only some arch-es and/or
configurations need generic library routines, combine objects under lib/
into an archive, which the linker then can pick the necessary objects
out of.

Note that we can't use thin archives just yet, until we've raised the
minimum required binutils version suitably.

Note further that --start-group / --end-group get put in place right
away to allow for symbol resolution across all archives, once we gain
multiple ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoautomation: add domU creation to dom0 alpine linux test
Stefano Stabellini [Tue, 24 Nov 2020 21:33:14 +0000 (13:33 -0800)]
automation: add domU creation to dom0 alpine linux test

Add a trivial Busybox based domU.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: use the tests-artifacts kernel for qemu-smoke-arm64-gcc
Stefano Stabellini [Tue, 24 Nov 2020 21:22:17 +0000 (13:22 -0800)]
automation: use the tests-artifacts kernel for qemu-smoke-arm64-gcc

Use the tests-artifacts kernel, instead of the Debian kernel, for the
qemu-smoke-arm64-gcc job.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: create an alpine linux arm64 test job
Stefano Stabellini [Tue, 24 Nov 2020 21:15:51 +0000 (13:15 -0800)]
automation: create an alpine linux arm64 test job

Create a test job that starts Xen and Dom0 on QEMU based on the alpine
linux rootfs. Use the Linux kernel and rootfs from the tests-artifacts
containers. Add the Xen tools binaries from the Alpine Linux build job.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: make available the tests artifacts to the pipeline
Stefano Stabellini [Tue, 24 Nov 2020 21:13:50 +0000 (13:13 -0800)]
automation: make available the tests artifacts to the pipeline

In order to make available the pre-built binaries of the
automation/tests-artifacts containers to the gitlab-ci pipeline we need
to export them as gitlab artifacts.

To do that, we create two "fake" jobs that simply export the required
binaries as artifacts and do nothing else.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add tests artifacts
Stefano Stabellini [Tue, 24 Nov 2020 21:08:20 +0000 (13:08 -0800)]
automation: add tests artifacts

Some tests (soon to come) will require pre-built binaries to run, such
as the Linux kernel binary. We don't want to rebuild the Linux kernel
for each gitlab-ci run: these builds should not be added to the current
list of build jobs.

Instead, create additional containers that today are built and uploaded
manually, but could be re-built automatically. The containers build the
required binaries during the "docker build" step and store them inside
the container itself.

gitlab-ci will be able to fetch these pre-built binaries during the
regular test runs, saving cycles.

Add two tests artifacts containers:
- one to build the Linux kernel ARM64
- one to create an Alpine Linux ARM64 rootfs for Dom0

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add alpine linux x86 build jobs
Stefano Stabellini [Fri, 20 Nov 2020 17:56:25 +0000 (09:56 -0800)]
automation: add alpine linux x86 build jobs

Allow failure for these jobs. Currently they fail because hvmloader
doesn't build with musl. The failures don't block the pipeline.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add alpine linux 3.12 x86 build container
Stefano Stabellini [Fri, 20 Nov 2020 17:54:01 +0000 (09:54 -0800)]
automation: add alpine linux 3.12 x86 build container

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add alpine linux arm64 build test
Stefano Stabellini [Wed, 18 Nov 2020 01:07:43 +0000 (17:07 -0800)]
automation: add alpine linux arm64 build test

Based on the arm64 3.12 build container

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add alpine linux 3.12 arm64 build container
Stefano Stabellini [Wed, 18 Nov 2020 01:03:55 +0000 (17:03 -0800)]
automation: add alpine linux 3.12 arm64 build container

The build container will be used for a new Alpine Linux 3.12 arm64 build
test.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: special configure flags for musl-based systems
Stefano Stabellini [Fri, 20 Nov 2020 03:20:15 +0000 (19:20 -0800)]
automation: special configure flags for musl-based systems

QEMU upstream builds with warnings when libc is musl:

  #warning redirecting incorrect #include <sys/signal.h> to <signal.h>

Disable -Werror by passing --disable-werror to the QEMUU config script
if libc is musl.

hvmloader doesn't build on musl systems today. Disable any guest
firmware build.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add dom0less to the QEMU aarch64 smoke test
Stefano Stabellini [Fri, 13 Nov 2020 23:22:41 +0000 (15:22 -0800)]
automation: add dom0less to the QEMU aarch64 smoke test

Add a trivial dom0less test:
- fetch the Debian arm64 kernel and use it as dom0/U kernel
- use busybox-static to create a trivial dom0/U ramdisk
- use ImageBuilder to generate the uboot boot script automatically
- install and use u-boot from the Debian package to start the test
- binaries are loaded from uboot via tftp

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoautomation: add a QEMU aarch64 smoke test
Stefano Stabellini [Fri, 13 Nov 2020 02:30:33 +0000 (18:30 -0800)]
automation: add a QEMU aarch64 smoke test

Use QEMU to start Xen (just the hypervisor) up until it stops because
there is no dom0 kernel to boot.

It is based on the existing build job unstable-arm64v8.

Also use make -j$(nproc) to build Xen.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoxen/hypfs: add new enter() and exit() per node callbacks
Juergen Gross [Thu, 17 Dec 2020 15:50:21 +0000 (16:50 +0100)]
xen/hypfs: add new enter() and exit() per node callbacks

In order to better support resource allocation and locking for dynamic
hypfs nodes add enter() and exit() callbacks to struct hypfs_funcs.

The enter() callback is called when entering a node during hypfs user
actions (traversing, reading or writing it), while the exit() callback
is called when leaving a node (accessing another node at the same or a
higher directory level, or when returning to the user).

To avoid recursion this requires a parent pointer in each node.
Let the enter() callback return the entry address which is stored as
the last accessed node in order to be able to use a template entry for
that purpose in case of dynamic entries.
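
The callback scheme can be pictured roughly as below. This is a
simplified sketch under hypothetical names, not the hypfs code: enter()
returns the entry that is remembered as the last accessed node, and the
parent pointers allow all entered nodes to be left with a loop rather
than recursion.

```c
#include <assert.h>
#include <stddef.h>

struct node;

struct node_funcs {
    const struct node *(*enter)(const struct node *n);
    void (*exit)(const struct node *n);
};

struct node {
    const char *name;
    struct node *parent;            /* NULL for the root */
    const struct node_funcs *funcs;
};

static int depth;                       /* nodes currently entered */
static const struct node *last_entered; /* entry returned by enter() */

/* Example callbacks that only count; real ones might take locks. */
static const struct node *count_enter(const struct node *n)
{
    depth++;
    return n;
}

static void count_exit(const struct node *n)
{
    (void)n;
    depth--;
}

static const struct node_funcs counting_funcs = { count_enter, count_exit };

static const struct node *enter_node(const struct node *n)
{
    assert(n != NULL && n->funcs != NULL);
    last_entered = n->funcs->enter(n);

    return last_entered;
}

/* Leave every entered node, walking up via the parent pointers. */
static void exit_all(void)
{
    const struct node *n = last_entered;

    while ( n )
    {
        n->funcs->exit(n);
        n = n->parent;
    }
    last_entered = NULL;
}
```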

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/hypfs: switch write function handles to const
Juergen Gross [Thu, 17 Dec 2020 15:49:49 +0000 (16:49 +0100)]
xen/hypfs: switch write function handles to const

The node specific write functions take a void user address handle as
parameter. As a write won't change the user memory use a const_void
handle instead.

This requires a new macro for casting a guest handle to a const type.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/cpupool: support moving domain between cpupools with different granularity
Juergen Gross [Thu, 17 Dec 2020 15:49:11 +0000 (16:49 +0100)]
xen/cpupool: support moving domain between cpupools with different granularity

When moving a domain between cpupools with different scheduling
granularity the sched_units of the domain need to be adjusted.

Do that by allocating new sched_units and throwing away the old ones
in sched_move_domain().

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
4 years agotools/xenstore: remove unused cruft from xenstored_domain.c
Juergen Gross [Tue, 15 Dec 2020 16:35:41 +0000 (17:35 +0100)]
tools/xenstore: remove unused cruft from xenstored_domain.c

domain->remote_port and restore_existing_connections() are useless and
can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agotools/xenstore: make set_tdb_key() non-static
Juergen Gross [Tue, 15 Dec 2020 16:35:40 +0000 (17:35 +0100)]
tools/xenstore: make set_tdb_key() non-static

set_tdb_key() can be used by destroy_node(), too. So remove the static
attribute and move it to xenstored_core.c.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>