xenbits.xensource.com Git - people/jgross/xen.git/log
Marek Marczykowski-Górecki [Wed, 19 Aug 2020 02:00:35 +0000 (04:00 +0200)]
libxl: workaround gcc 10.2 maybe-uninitialized warning

It seems xlu_pci_parse_bdf has a state machine that is too complex for
gcc to understand. The build fails with:

    libxlu_pci.c: In function 'xlu_pci_parse_bdf':
    libxlu_pci.c:32:18: error: 'func' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       32 |     pcidev->func = func;
          |     ~~~~~~~~~~~~~^~~~~~
    libxlu_pci.c:51:29: note: 'func' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |                             ^~~~
    libxlu_pci.c:31:17: error: 'dev' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       31 |     pcidev->dev = dev;
          |     ~~~~~~~~~~~~^~~~~
    libxlu_pci.c:51:24: note: 'dev' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |                        ^~~
    libxlu_pci.c:30:17: error: 'bus' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       30 |     pcidev->bus = bus;
          |     ~~~~~~~~~~~~^~~~~
    libxlu_pci.c:51:19: note: 'bus' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |                   ^~~
    libxlu_pci.c:29:20: error: 'dom' may be used uninitialized in this function [-Werror=maybe-uninitialized]
       29 |     pcidev->domain = domain;
          |     ~~~~~~~~~~~~~~~^~~~~~~~
    libxlu_pci.c:51:14: note: 'dom' was declared here
       51 |     unsigned dom, bus, dev, func, vslot = 0;
          |              ^~~
    cc1: all warnings being treated as errors

Work around it by setting the initial value to an invalid value (0xffffffff)
and then asserting that each value has been set. This way we mute the gcc
warning, while still detecting bugs in the parse code.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
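
The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the code from libxlu_pci.c: the `parse_bdf` function, the `struct bdf` type, the `BDF_UNSET` name and the sscanf-based parsing are all assumptions made for the example. Only the sentinel value 0xffffffff and the assert-after-parse idea come from the commit message.

```c
#include <assert.h>
#include <stdio.h>

#define BDF_UNSET 0xffffffffu   /* sentinel: no valid field can take this value */

struct bdf { unsigned dom, bus, dev, func; };

/* Hypothetical stand-in for the real parser; accepts "DDDD:BB:DD.F" in hex. */
static int parse_bdf(const char *s, struct bdf *out)
{
    /* Initialising to the sentinel silences -Wmaybe-uninitialized. */
    unsigned dom = BDF_UNSET, bus = BDF_UNSET, dev = BDF_UNSET, func = BDF_UNSET;

    if (sscanf(s, "%x:%x:%x.%x", &dom, &bus, &dev, &func) != 4)
        return -1;
    if (dom > 0xffff || bus > 0xff || dev > 0x1f || func > 0x7)
        return -1;

    /* Every field must have been written by the parse step above. */
    assert(dom != BDF_UNSET && bus != BDF_UNSET);
    assert(dev != BDF_UNSET && func != BDF_UNSET);

    out->dom = dom;
    out->bus = bus;
    out->dev = dev;
    out->func = func;
    return 0;
}
```

The asserts keep the diagnostic value that the compiler warning would otherwise have provided: a parse path that forgets to assign a field trips the assertion at run time instead of silently storing the sentinel.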
Hubert Jasudowicz [Tue, 18 Aug 2020 19:29:48 +0000 (21:29 +0200)]
tools/firmware: Fix typo in uninstall target

When ipxe.bin is missing, make uninstall will fail due to the
wrong switch (-r) being passed to the rm command. Replace it with -f.

Signed-off-by: Hubert Jasudowicz <hubert.jasudowicz@cert.pl>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Wei Liu <wl@xen.org>
Ian Jackson [Wed, 26 Aug 2020 14:47:19 +0000 (15:47 +0100)]
MAINTAINERS: Update my email address

I am changing my email address.  (My affiliation to Citrix remains
unchanged.)  See
   https://xenbits.xen.org/people/iwj/2020/email-transition.txt
for a signed confirmation with full details.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Thu, 27 Aug 2020 07:53:46 +0000 (09:53 +0200)]
x86: use constant flags for section .init.rodata

LLVM 11 complains with:

<instantiation>:1:1: error: changed section flags for .init.rodata, expected: 0x2
.pushsection .init.rodata
^
<instantiation>:30:9: note: while in macro instantiation
        entrypoint 0
        ^
entry.S:979:9: note: while in macro instantiation
        .rept 256
        ^

And:

entry.S:1015:9: error: changed section flags for .init.rodata, expected: 0x2
        .section .init.rodata
        ^

Fix it by explicitly using the same flags and type in all the
instances.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Thu, 27 Aug 2020 07:52:45 +0000 (09:52 +0200)]
x86: don't include domctl and alike in shim-exclusive builds

There is no need for platform-wide, system-wide, or per-domain control
in this case. Hence avoid including this dead code in the build.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Thu, 27 Aug 2020 07:52:01 +0000 (09:52 +0200)]
bitmap: move to/from xenctl_bitmap conversion helpers

A subsequent change will exclude domctl.c from getting built for a
particular configuration, yet the two functions get used from elsewhere.

While moving the code
- drop unmotivated uses of min_t(),
- fix style violations in the moved code,
- xfree() as early as possible.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Thu, 27 Aug 2020 07:51:07 +0000 (09:51 +0200)]
x86: don't build with EFI support in shim-exclusive mode

There's no need for xen.efi at all, and there's also no need for EFI
support in xen.gz since the shim runs in PVH mode, i.e. without any
firmware (and hence by implication also without EFI one).

The slightly odd looking use of $(space) is to ensure the new ifneq()
evaluates consistently between "build" and "install" invocations of
make.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Thu, 27 Aug 2020 07:46:55 +0000 (09:46 +0200)]
build: also check for empty .bss.* in .o -> .init.o conversion

We're gaining such sections, and like .text.* and .data.* they shouldn't
be present in objects subject to automatic to-init conversion. Oddly
enough, for quite some time we did have an instance breaking this rule,
which gets fixed on this occasion by breaking out the EFI boot
allocator functions into their own translation unit.

Fixes: c5b9805bc1f7 ("efi: create new early memory allocator")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Tue, 25 Aug 2020 15:47:27 +0000 (17:47 +0200)]
make better use of mfn local variable in free_heap_pages()

Besides the one use that there is in the function (of the value
calculated at function entry), there are two more places where the
redundant page-to-address conversion can be avoided.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Wei Liu <wl@xen.org>
Jan Beulich [Tue, 25 Aug 2020 15:46:27 +0000 (17:46 +0200)]
x86: don't maintain compat M2P when !PV32

It's effectively unused in this case (as well as when "pv=no-32").

While touching their definitions anyway, also adjust section placement
of m2p_compat_vstart and compat_idle_pg_table_l2. Similarly, while
putting init_xen_pae_l2_slots() inside #ifdef, also move it to a PV-only
source file.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 25 Aug 2020 15:43:52 +0000 (17:43 +0200)]
x86/EFI: sanitize build logic

With changes done over time and as far as linking goes, the only special
things about building with EFI support enabled are
- the need for the dummy relocations object (for xen.gz uniformly in all
  build stages, for xen.efi in stage 1),
- the special efi/buildid.o file, which can't be made part of
  efi/built_in.o, due to the extra linker options required for it.
All other efi/*.o can be consumed from the built_in*.o files.

In efi/Makefile, besides moving relocs-dummy.o to "extra", also properly
split between obj-y and obj-bin-y.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 24 Aug 2020 13:38:48 +0000 (15:38 +0200)]
x86/PV: also check kernel endianness when building Dom0

While big endian x86 images are pretty unlikely to appear, merely
logging endianness isn't of much use.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 24 Aug 2020 13:38:03 +0000 (15:38 +0200)]
x86: convert set_gpfn_from_mfn() to a function

It is already a little too heavy for a macro, and more logic is about to
get added to it.

This also allows reducing the scope of compat_machine_to_phys_mapping.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Mon, 24 Aug 2020 13:36:44 +0000 (15:36 +0200)]
x86/vpic: rename irq to pin in vpic_ioport_write

The irq variable is wrongly named, as it's used to store the pin on
the 8259 chip, not the global irq value. While renaming, reduce
its scope and make it unsigned.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Mon, 24 Aug 2020 13:35:49 +0000 (15:35 +0200)]
x86/vpic: fix vpic_elcr_mask macro parameter usage

vpic_elcr_mask wasn't using the v parameter, and instead worked
because in the context of the callers v would be vpic. Fix this by
correctly using the parameter. While there also remove the unneeded
casts to uint8_t and the ending semicolon.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
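
The class of bug fixed here, a macro that ignores its parameter and captures a same-named variable from the caller's scope, can be shown with a small sketch. The struct, macro names and mask values below are illustrative assumptions, not the Xen definitions.

```c
#include <stdbool.h>

struct pic { bool is_master; };

/* Buggy shape: ignores `v` and uses whatever `vpic` is in scope at the call site. */
#define ELCR_MASK_BUGGY(v) (vpic->is_master ? 0xf8 : 0xde)

/* Fixed shape: uses the parameter, parenthesised, with no trailing semicolon. */
#define ELCR_MASK(v) ((v)->is_master ? 0xf8 : 0xde)

static unsigned mask_via_buggy(struct pic *vpic, struct pic *other)
{
    (void)other;                     /* the macro never looks at its argument */
    return ELCR_MASK_BUGGY(other);   /* silently evaluates vpic instead */
}

static unsigned mask_via_fixed(struct pic *vpic, struct pic *other)
{
    (void)vpic;
    return ELCR_MASK(other);         /* evaluates the argument that was passed */
}
```

The buggy form only compiles where a variable called `vpic` happens to exist, which is why such a macro can appear to work for a long time before a new caller exposes it.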
George Dunlap [Fri, 21 Aug 2020 14:32:01 +0000 (15:32 +0100)]
MAINTAINERS: Add Roger Pau Monné as x86 maintainer

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
(master, origin/HEAD, origin/master, origin/smoke)
Bertrand Marquis [Tue, 18 Aug 2020 13:47:39 +0000 (14:47 +0100)]
xen/arm: Enable CPU Erratum 1165522 for Neoverse

Enable CPU erratum of Speculative AT on the Neoverse N1 processor
versions r0p0 to r2p0.
Also fix the Cortex A76 erratum string, which had the wrong erratum number.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Bertrand Marquis [Tue, 18 Aug 2020 13:47:38 +0000 (14:47 +0100)]
arm: Add Neoverse N1 processor identification

Add MIDR and CPU part numbers for Neoverse N1

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Wed, 19 Aug 2020 09:09:38 +0000 (11:09 +0200)]
x86: move cpu_{up,down}_helper()

This is in preparation for making the building of sysctl.c conditional.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 19 Aug 2020 09:08:46 +0000 (11:08 +0200)]
x86: move domain_cpu_policy_changed()

This is in preparation for making the building of domctl.c conditional.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Mon, 17 Aug 2020 15:57:54 +0000 (17:57 +0200)]
x86/pv: allow reading APIC_BASE MSR

Linux PV guests will attempt to read the APIC_BASE MSR, so just report
a default value to make Linux happy.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Mon, 17 Aug 2020 15:57:53 +0000 (17:57 +0200)]
x86/pv: handle reads to the PAT MSR

The value in the PAT MSR is part of the ABI between Xen and PV guests,
and there's no reason to not allow a PV guest to read it.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monne [Mon, 17 Aug 2020 15:57:52 +0000 (17:57 +0200)]
x86/pv: handle writes to the EFER MSR

Silently drop writes to the EFER MSR for PV guests if the value is
unchanged from what is being reported. Current PV Linux will attempt
to write to the MSR with the same value that's been read, and raising
a fault will result in a guest crash.

As part of this work introduce a helper to easily get the EFER value
reported to guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
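
The write policy described above amounts to a single comparison. The sketch below is an assumption-laden illustration: `pv_write_efer`, its return codes and the way the reported value is passed in are invented for the example and do not reflect the actual Xen handler or helper.

```c
#include <stdint.h>

enum { WRITE_OK = 0, WRITE_FAULT = -1 };

/*
 * `reported` is the EFER value the guest would see on a read.
 * A write of that same value is accepted and dropped; any attempt
 * to change the value is refused, which the caller would turn
 * into a fault for the guest.
 */
static int pv_write_efer(uint64_t reported, uint64_t val)
{
    return val == reported ? WRITE_OK : WRITE_FAULT;
}
```

Comparing against the value reported to the guest, rather than the hardware value, is what makes a read-then-write-back sequence harmless.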
Edwin Török [Mon, 17 Aug 2020 18:45:47 +0000 (19:45 +0100)]
tools/ocaml/xenstored: drop select based socket watching

Poll has been the default since 2014; I think we can safely say by now
that poll() works and we don't need to fall back to select().

This will allow fixing up the way we call poll to be more efficient
(and pave the way for introducing epoll support):
currently poll wraps the select API, which is inefficient.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
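
For readers unfamiliar with the interface being standardised on, here is a minimal C sketch of waiting on one descriptor with poll(). The xenstored code in question is OCaml, so this is an illustration of the system call only, and the `fd_readable` helper is a name made up for the example.

```c
#include <poll.h>

/* Returns 1 if fd has data to read, 0 on timeout, -1 on error. */
static int fd_readable(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    return (rc > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Unlike select(), poll() takes an array of descriptors rather than fixed-size bit sets, so it is not bounded by FD_SETSIZE.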
Edwin Török [Mon, 17 Aug 2020 18:45:44 +0000 (19:45 +0100)]
tools/ocaml/libs/xc: Fix ambiguous documentation comment

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Ian Jackson [Tue, 18 Aug 2020 15:00:10 +0000 (16:00 +0100)]
QEMU_TRADITIONAL_REVISION update

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 15 Jul 2020 15:39:18 +0000 (16:39 +0100)]
docs/process/branching-checklist: Get osstest branch right

The runes for this manual osstest were wrong.  It needs to run as
osstest, and cr-for-branches should be run from testing.git.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Wed, 12 Aug 2020 09:55:34 +0000 (09:55 +0000)]
tools: bump library version numbers

Signed-off-by: Wei Liu <wl@xen.org>
Wei Liu [Wed, 12 Aug 2020 09:55:11 +0000 (09:55 +0000)]
Config.mk: update OVMF changeset

Signed-off-by: Wei Liu <wl@xen.org>
Julien Grall [Wed, 29 Jul 2020 13:50:37 +0000 (14:50 +0100)]
xen/arm: cmpxchg: Add missing memory barriers in __cmpxchg_mb_timeout()

The function __cmpxchg_mb_timeout() was intended to have the same
semantics as __cmpxchg_mb(). Unfortunately, the memory barriers were
not added when first implemented.

There is no known issue with the existing callers, but the barriers are
added given this is the expected semantics in Xen.

The issue was introduced by XSA-295.

Backport: 4.8+
Fixes: 86b0bc958373 ("xen/arm: cmpxchg: Provide a new helper that can timeout")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
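
The difference between a bare compare-and-exchange and one with "_mb" semantics can be sketched with C11 atomics. This is a portable illustration under assumptions, not the Arm implementation: the function names are invented, and the assumption is that "_mb" means a full barrier both before and after the exchange.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Compare-and-exchange with no ordering guarantees beyond atomicity. */
static bool cmpxchg_relaxed(_Atomic unsigned long *p,
                            unsigned long *expected, unsigned long desired)
{
    return atomic_compare_exchange_strong_explicit(
        p, expected, desired, memory_order_relaxed, memory_order_relaxed);
}

/* Fully ordered variant: a full fence on both sides of the exchange. */
static bool cmpxchg_mb(_Atomic unsigned long *p,
                       unsigned long *expected, unsigned long desired)
{
    bool ok;

    atomic_thread_fence(memory_order_seq_cst);
    ok = cmpxchg_relaxed(p, expected, desired);
    atomic_thread_fence(memory_order_seq_cst);
    return ok;
}
```

A variant that can time out needs the same fences around it if callers are to rely on the same ordering, which is the omission this commit addresses.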
Julien Grall [Sat, 4 Apr 2020 11:07:17 +0000 (12:07 +0100)]
xen/arm: guestcopy: Re-order the includes

We usually have xen/ includes first and then asm/. They are also ordered
alphabetically among themselves.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Julien Grall [Sat, 4 Apr 2020 11:06:04 +0000 (12:06 +0100)]
xen/arm: decode: Re-order the includes

We usually have xen/ includes first and then asm/. They are also ordered
alphabetically among themselves.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sat, 4 Apr 2020 11:03:22 +0000 (12:03 +0100)]
xen/arm: kernel: Re-order the includes

We usually have xen/ includes first and then asm/. They are also ordered
alphabetically among themselves.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Sat, 4 Apr 2020 10:50:18 +0000 (11:50 +0100)]
xen/guest_access: Add emacs magics

Add emacs magics for xen/guest_access.h and
asm-x86/guest_access.h.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Wed, 12 Aug 2020 12:47:05 +0000 (14:47 +0200)]
x86/hvm: change EOI exit bitmap helper parameter

Change the last parameter of the update_eoi_exit_bitmap helper to be a
set/clear boolean instead of a triggering field. This is already
in line with how the function is implemented, and will allow deciding
whether an exit is required by the higher layers that call into
update_eoi_exit_bitmap. Note that the current behavior is not changed
by this patch.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
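
A helper with a set/clear boolean as its last parameter might look like the sketch below. The bitmap layout, type and function name are assumptions for illustration and are not taken from the Xen sources; the point is only that the caller, not the helper, decides whether the bit is set.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_VECTORS 256

struct eoi_bitmap { uint64_t bits[NR_VECTORS / 64]; };

/* Set or clear the bit for `vector`; the policy decision stays with the caller. */
static void update_eoi_bitmap(struct eoi_bitmap *b, uint8_t vector, bool set)
{
    uint64_t mask = UINT64_C(1) << (vector % 64);

    if (set)
        b->bits[vector / 64] |= mask;
    else
        b->bits[vector / 64] &= ~mask;
}
```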
Don Slutz [Sun, 9 Aug 2020 18:22:34 +0000 (14:22 -0400)]
rpmball: Adjust to new rpm, do not require --force

Also prevent warning: directory /boot: remove failed

Before:

[root@TestCloud1 xen]# rpm -hiv dist/xen*rpm
Preparing...                          ################################# [100%]
        file /boot from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/bin from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/lib from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/lib64 from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/sbin from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
[root@TestCloud1 xen]# rpm -e xen
warning: directory /boot: remove failed: Device or resource busy

After:

[root@TestCloud1 xen]# rpm -hiv dist/xen*rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:xen-4.15-unstable                ################################# [100%]
[root@TestCloud1 xen]# rpm -e xen
[root@TestCloud1 xen]#

Signed-off-by: Don Slutz <Don.Slutz@Gmail.com>
Acked-by: Wei Liu <wl@xen.org>
Paul Durrant [Tue, 4 Aug 2020 13:41:59 +0000 (14:41 +0100)]
x86/iommu: convert AMD IOMMU code to use new page table allocator

This patch converts the AMD IOMMU code to use the new page table allocator
function. This allows all the freeing code to be removed (since it is now
handled by the general x86 code) which reduces TLB and cache thrashing as well
as shortening the code.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Tue, 4 Aug 2020 13:41:57 +0000 (14:41 +0100)]
x86/iommu: add common page-table allocator

Instead of having separate page table allocation functions in VT-d and AMD
IOMMU code, we could use a common allocation function in the general x86 code.

This patch adds a new allocation function, iommu_alloc_pgtable(), for this
purpose. The function adds the page table pages to a list. The pages in this
list are then freed by iommu_free_pgtables(), which is called by
domain_relinquish_resources() after PCI devices have been de-assigned.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
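
The allocate-onto-a-list, free-in-one-pass scheme described above can be sketched in plain C. Everything here is a simplified assumption: the real functions operate on Xen page structures and domains, whereas this example uses malloc-backed pages and invented names (`alloc_pgtable`, `free_pgtables`, `struct domain_iommu`).

```c
#include <stdlib.h>

struct pgtable_page {
    struct pgtable_page *next;
    void *mem;                       /* the zeroed page-table page itself */
};

struct domain_iommu {
    struct pgtable_page *pgtables;   /* every page allocated for this domain */
    unsigned count;
};

/* Allocate one zeroed page-table page and record it on the domain's list. */
static void *alloc_pgtable(struct domain_iommu *d)
{
    struct pgtable_page *pg = malloc(sizeof(*pg));

    if (!pg)
        return NULL;
    pg->mem = calloc(1, 4096);
    if (!pg->mem) {
        free(pg);
        return NULL;
    }
    pg->next = d->pgtables;
    d->pgtables = pg;
    d->count++;
    return pg->mem;
}

/* Free everything on the list in one pass; returns the number of pages freed. */
static unsigned free_pgtables(struct domain_iommu *d)
{
    unsigned freed = 0;

    while (d->pgtables) {
        struct pgtable_page *pg = d->pgtables;

        d->pgtables = pg->next;
        free(pg->mem);
        free(pg);
        freed++;
    }
    d->count = 0;
    return freed;
}
```

Because every page is tracked at allocation time, teardown does not need to walk the page tables to find what to free.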
Paul Durrant [Tue, 4 Aug 2020 13:41:56 +0000 (14:41 +0100)]
x86/iommu: re-arrange arch_iommu to separate common fields...

... from those specific to VT-d or AMD IOMMU, and put the latter in a union.

There is no functional change in this patch, although the initialization of
the 'mapped_rmrrs' list occurs slightly later in iommu_domain_init() since
it is now done (correctly) in VT-d specific code rather than in general x86
code.

NOTE: I have not combined the AMD IOMMU 'root_table' and VT-d 'pgd_maddr'
      fields even though they perform essentially the same function. The
      concept of 'root table' in the VT-d code is different from that in the
      AMD code so attempting to use a common name will probably only serve
      to confuse the reader.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
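
The resulting layout, common fields followed by a union of vendor-specific ones, might be sketched as below. Only the `pgd_maddr` and `root_table` field names come from the commit message; the struct name, the common field and the union member names are assumptions for illustration.

```c
#include <stdint.h>

struct arch_iommu_sketch {
    /* Common to both vendors (this field is illustrative). */
    unsigned int pgtable_count;

    /* Vendor-specific state: only one arm is in use for a given system. */
    union {
        struct { uint64_t pgd_maddr; } vtd;   /* Intel VT-d */
        struct { void *root_table; } amd;     /* AMD IOMMU */
    } u;
};
```

Keeping the two vendor fields under separate names, as the NOTE above argues, avoids implying that 'root table' means the same thing in both implementations.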
Roger Pau Monné [Thu, 30 Jul 2020 14:03:09 +0000 (16:03 +0200)]
x86/vmx: reorder code in vmx_deliver_posted_intr

Remove the unneeded else branch, which allows reducing the
indentation of a larger block of code, while making the flow of the
function more obvious.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
David Woodhouse [Thu, 19 Mar 2020 20:40:24 +0000 (20:40 +0000)]
tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating

The do_ls() function has somewhat inconsistent handling of errors.

If reading the node's contents with xs_read() fails, then do_ls() will
just quietly not display the contents.

If reading the node's permissions with xs_get_permissions() fails, then
do_ls() will print a warning, continue, and ultimately won't exit with
an error code (unless another error happens).

If recursing into the node with xs_directory() fails, then do_ls() will
abort immediately, not printing any further nodes.

For persistent failure modes (such as ENOENT because a node has been
removed, or EACCES because it has had its permissions changed since the
xs_directory() on the parent directory returned its name) it's
obviously quite likely that if either of the first two errors occurs for
a given node, then so will the third, and thus xenstore-ls will abort.

The ENOENT one is actually a fairly common case, and has caused tools to
fail to clean up a network device because it *apparently* already
doesn't exist in xenstore.

There is a school of thought that says, "Well, xenstore-ls returned an
error. So the tools should not trust its output."

The natural corollary of this would surely be that the tools must re-run
xenstore-ls as many times as is necessary until it manages to exit
without hitting the race condition. I am not keen on that conclusion.

For the specific case of ENOENT it seems reasonable to declare that,
but for the timing, we might as well just not have seen that node at
all when calling xs_directory() for the parent. By ignoring the error,
we give acceptable output.

The issue can be reproduced as follows:

(dom0) # for a in `seq 1 1000` ; do
              xenstore-write /local/domain/2/foo/$a $a ;
         done

Now simultaneously:

(dom0) # for a in `seq 1 999` ; do
              xenstore-rm /local/domain/2/foo/$a ;
         done
(dom2) # while true ; do
              ./xenstore-ls -p /local/domain/2/foo | grep -c 1000 ;
         done

We should expect to see node 1000 in the output, every time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
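
The policy argued for above (skip a node that has vanished, still abort on other errors) can be sketched as follows. This does not use the libxenstore API: `list_fn` is a hypothetical callback standing in for the directory call, and the three stub listers exist only to exercise the logic.

```c
#include <errno.h>

/* Hypothetical lister: returns the child count, or -1 with errno set. */
typedef int (*list_fn)(const char *path);

/* Returns 0 if the walk should continue, -1 if it should abort. */
static int visit_node(const char *path, list_fn list)
{
    if (list(path) >= 0)
        return 0;
    /* A node removed since the parent was listed is treated as never seen. */
    return errno == ENOENT ? 0 : -1;
}

/* Stand-in listers for demonstration only. */
static int list_vanished(const char *path) { (void)path; errno = ENOENT; return -1; }
static int list_denied(const char *path)   { (void)path; errno = EACCES; return -1; }
static int list_ok(const char *path)       { (void)path; return 3; }
```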
(origin/coverity-tested/smoke)
Paul Durrant [Thu, 13 Aug 2020 10:35:53 +0000 (11:35 +0100)]
x86/viridian: remove the viridian_vcpu msg_pending bit mask

The mask does not actually serve a useful purpose as we only use the SynIC
for timer messages. Dropping the mask means that the EOM MSR handler
essentially becomes a no-op. This means we can avoid setting 'message_pending'
for timer messages and hence avoid a VMEXIT for the EOM.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
Trammell Hudson [Wed, 12 Aug 2020 17:42:48 +0000 (17:42 +0000)]
x86/setup: Ignore early boot parameters like no-real-mode

There are parameters in xen/arch/x86/boot/cmdline.c that
are only used early in the boot process, so handlers are
necessary to avoid an "Unknown command line option" in
dmesg.

This also updates ignore_param() to generate a temporary
variable name so that the macro can be used more than once
per file.

Signed-off-by: Trammell Hudson <hudson@trmm.net>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Leave note to stop TEMP_NAME() finding more general use]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
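
One common way to generate a per-use variable name, so that a declaring macro can appear more than once in a file, is to paste `__LINE__` onto a base name. The sketch below assumes that approach; apart from the `TEMP_NAME` name mentioned in the note above, the macros and the `struct param` type are invented for the example and are not the Xen definitions.

```c
/* Two-level concatenation so that __LINE__ is expanded before pasting. */
#define CONCAT_(a, b) a##b
#define CONCAT(a, b)  CONCAT_(a, b)
#define TEMP_NAME(base) CONCAT(base, __LINE__)

/* Stringification helpers, used only to inspect the generated name. */
#define STR_(x) #x
#define STR(x)  STR_(x)

struct param { const char *name; };

/* Each use declares a distinct object, provided uses sit on different lines. */
#define IGNORE_PARAM(str) const struct param TEMP_NAME(ignored_) = { str }

IGNORE_PARAM("no-real-mode");
IGNORE_PARAM("edd");
```

The limitation is that two uses on the same source line would collide, which is one reason to keep such a helper from finding more general use.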
Wei Liu [Wed, 12 Aug 2020 09:21:39 +0000 (09:21 +0000)]
Config.mk: update seabios to 1.14.0

Signed-off-by: Wei Liu <wl@xen.org>
Andrew Cooper [Mon, 10 Aug 2020 14:45:46 +0000 (15:45 +0100)]
Revert "x86/EFI: sanitize build logic"

This reverts commit 90c7eee53fcc0b48bd51aa3a7d1d0d9980ce1a7a.

It breaks the build in some configurations with CONFIG_LIVEPATCH enabled.

  make[2]: *** No rule to make target 'efi/buildid.o', needed by '/local/xen.git/xen/xen.efi'.  Stop.
  make[2]: *** Waiting for unfinished jobs....

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 7 Aug 2020 11:32:11 +0000 (13:32 +0200)]
x86/EFI: sanitize build logic

With changes done over time and as far as linking goes, the only special
thing about building with EFI support enabled is the need for the dummy
relocations object for xen.gz uniformly in all build stages. All other
efi/*.o can be consumed from the built_in*.o files.

In efi/Makefile, besides moving relocs-dummy.o to "extra", also properly
split between obj-y and obj-bin-y.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 7 Aug 2020 11:14:02 +0000 (13:14 +0200)]
x86: slightly re-arrange 32-bit handling in dom0_construct_pv()

Add #ifdef-s (the 2nd one will be needed in particular, to guard the
uses of m2p_compat_vstart and HYPERVISOR_COMPAT_VIRT_START()) and fold
duplicate uses of elf_32bit().

Also adjust what gets logged: Avoid "compat32" when support isn't built
in, and don't assume ELF class <> ELFCLASS64 means ELFCLASS32.

While doing this, in code getting touched anyway:
- use ROUNDUP() instead of open-coding it,
- drop a stale (dead) BUG_ON(),
- replace panic() by printk() plus error return, for being consistent
  with other code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
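
As an illustration of the first bullet, a round-up macro for power-of-two alignments and the open-coded expression it replaces are equivalent. The definition below is the conventional form and is given as an assumption, not quoted from the Xen headers; `roundup_open_coded` is a name made up for the comparison.

```c
/* Round x up to the next multiple of a; a must be a power of two. */
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* The open-coded equivalent that such a macro typically replaces. */
static unsigned long roundup_open_coded(unsigned long x, unsigned long a)
{
    return (x + a - 1) & ~(a - 1);
}
```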
Jan Beulich [Fri, 7 Aug 2020 11:12:21 +0000 (13:12 +0200)]
build: correctly report non-empty section sizes upon .o -> .init.o conversion

The originally used sed expression converted not just multiple leading
zeroes (as intended), but also trailing ones, rendering the error
message somewhat confusing. Collapse zeroes in just the one place where
we need them collapsed, and leave objdump's output as is for all other
purposes.

Fixes: 48115d14743e ("Move more kernel decompression bits to .init.* sections")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 7 Aug 2020 11:12:00 +0000 (13:12 +0200)]
build: work around bash issue

Older bash (observed with 3.2.57(2)) fails to honor "set -e" for certain
built-in commands ("while" here), despite the command's status correctly
being non-zero. The subsequent objcopy invocation now being separated by
a semicolon results in no failure. Insert an explicit "exit" (replacing
; by && ought to be another possible workaround).

Fixes: e321576f4047 ("xen/build: start using if_changed")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 5 Aug 2020 13:56:11 +0000 (14:56 +0100)]
x86/tsc: Fix diagnostics for TSC frequency

A Gemini Lake platform prints:

  (XEN) CPU0: TSC: 19200000MHz * 279 / 3 = 1785600000MHz
  (XEN) CPU0: 800..1800 MHz

during boot.  The units on the first line are Hz, not MHz, so correct that and
add a space for clarity.

Also, for the min/max line, use three dots instead of two and add more spaces
so that the line can't be mistaken for being a double decimal point typo.

Boot now reads:

  (XEN) CPU0: TSC: 19200000 Hz * 279 / 3 = 1785600000 Hz
  (XEN) CPU0: 800 ... 1800 MHz

Extend these changes to the other TSC diagnostics.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
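
The arithmetic in the corrected message can be checked with a short sketch. The function names and the exact format string are assumptions for illustration; the inputs (a 19200000 Hz crystal scaled by 279 / 3) are the ones quoted in the log lines above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* TSC frequency in Hz from the crystal frequency and a ratio; den must be non-zero. */
static uint64_t tsc_hz(uint64_t crystal_hz, uint32_t num, uint32_t den)
{
    return crystal_hz * num / den;
}

/* Format the diagnostic with explicit Hz units and a space before each unit. */
static int format_tsc(char *buf, size_t len,
                      uint64_t crystal_hz, uint32_t num, uint32_t den)
{
    return snprintf(buf, len,
                    "TSC: %" PRIu64 " Hz * %" PRIu32 " / %" PRIu32 " = %" PRIu64 " Hz",
                    crystal_hz, num, den, tsc_hz(crystal_hz, num, den));
}
```

With these inputs the result is 1785600000 Hz, roughly 1.79 GHz, which is consistent with the 800 ... 1800 MHz range printed on the second line.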
Andrew Cooper [Wed, 5 Aug 2020 13:35:16 +0000 (14:35 +0100)]
x86/ioapic: Improve code generation for __io_apic_{read,write}()

The write into REGSEL prevents the optimiser from reusing the address
calculation, forcing it to be calculated twice.

The calculation itself is quite expensive.  Pull it out into a local variable.

Bloat-o-meter reports:
  add/remove: 0/0 grow/shrink: 0/26 up/down: 0/-1527 (-1527)

Also correct the register type, which is uint32_t, not int.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 5 Aug 2020 10:49:15 +0000 (11:49 +0100)]
x86/ioapic: Fix style in io_apic.h

This file is a mix of Xen and Linux styles.  Switch it fully to Xen style.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 5 Aug 2020 11:05:27 +0000 (12:05 +0100)]
x86/ioapic: Fix fixmap error path logic in ioapic_init_mappings()

In the case that bad_ioapic_register() fails, the current position of idx++
means that clear_fixmap(idx) will be called with the wrong index, and not
clean up the mapping just created.

Increment idx as part of the loop, rather than midway through the loop body.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
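
The effect of incrementing the index midway through the loop body can be reproduced in a small model. This is not the ioapic_init_mappings() code: the `mapped` array stands in for fixmap slots, `bad[]` marks entries whose post-mapping check fails, and all names are assumptions for illustration.

```c
#include <stdbool.h>

enum { NR_SLOTS = 4 };

static bool mapped[NR_SLOTS + 1];   /* +1 so the buggy variant stays in bounds */

static void reset_slots(void)
{
    for (int i = 0; i <= NR_SLOTS; i++)
        mapped[i] = false;
}

static int count_mapped(void)
{
    int n = 0;

    for (int i = 0; i < NR_SLOTS; i++)
        n += mapped[i];
    return n;
}

/* Buggy: idx is bumped before the error check, so cleanup hits the next slot. */
static void init_buggy(const bool bad[NR_SLOTS])
{
    int idx = 0;

    reset_slots();
    for (int i = 0; i < NR_SLOTS; i++) {
        mapped[idx] = true;
        idx++;
        if (bad[i])
            mapped[idx] = false;    /* wrong slot: the mapping just made is leaked */
    }
}

/* Fixed: idx advances as part of the loop, so cleanup sees the current slot. */
static void init_fixed(const bool bad[NR_SLOTS])
{
    reset_slots();
    for (int i = 0, idx = 0; i < NR_SLOTS; i++, idx++) {
        mapped[idx] = true;
        if (bad[i])
            mapped[idx] = false;    /* clears the mapping just created */
    }
}
```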
Jan Beulich [Wed, 5 Aug 2020 08:30:18 +0000 (10:30 +0200)]
x86emul: correct AVX512_BF16 insn names in EVEX Disp8 test

The leading 'v' ought to be omitted from the table entries.

Fixes: 7ff66809ccd5 ("x86emul: support AVX512_BF16 insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 5 Aug 2020 08:29:55 +0000 (10:29 +0200)]
x86emul: extend decoding / mem access testing to EVEX-encoded insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: AVX512PF insns aren't memory accesses
Jan Beulich [Wed, 5 Aug 2020 08:29:18 +0000 (10:29 +0200)]
x86emul: AVX512PF insns aren't memory accesses

These are prefetches, so should be treated just like other prefetches.

Fixes: 467e91bde720 ("x86emul: support AVX512PF insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: AVX512F scatter insns are memory writes
Jan Beulich [Wed, 5 Aug 2020 08:28:40 +0000 (10:28 +0200)]
x86emul: AVX512F scatter insns are memory writes

While the custom handling renders the "to_mem" field generally unused,
x86_insn_is_mem_write() still (indirectly) consumes that information,
and hence the table entries want to be correct.

Fixes: 7d569b848036 ("x86emul: support AVX512F scatter insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: AVX512{F,BW} down conversion moves are memory writes
Jan Beulich [Wed, 5 Aug 2020 08:28:01 +0000 (10:28 +0200)]
x86emul: AVX512{F,BW} down conversion moves are memory writes

For this to be properly reported, the case labels need to move to a
different switch() block.

Fixes: 30e0bdf79828 ("x86emul: support AVX512{F,BW} down conversion moves")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to XOP-encoded insns
Jan Beulich [Wed, 5 Aug 2020 08:27:31 +0000 (10:27 +0200)]
x86emul: extend decoding / mem access testing to XOP-encoded insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to VEX-encoded insns
Jan Beulich [Wed, 5 Aug 2020 08:27:23 +0000 (10:27 +0200)]
x86emul: extend decoding / mem access testing to VEX-encoded insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to MMX / SSE insns
Jan Beulich [Wed, 5 Aug 2020 08:27:11 +0000 (10:27 +0200)]
x86emul: extend decoding / mem access testing to MMX / SSE insns

IOW just legacy encoded ones. For 3dNow! just one example is used, as
they're all similar in nature both encoding- and operand-wise.

Rename pfx_none to pfx_no, so it can be used to improve readability /
column alignment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to FPU insns
Jan Beulich [Wed, 5 Aug 2020 08:26:59 +0000 (10:26 +0200)]
x86emul: extend decoding / mem access testing to FPU insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: adjustments to mem access / write logic testing
Jan Beulich [Wed, 5 Aug 2020 08:26:11 +0000 (10:26 +0200)]
x86emul: adjustments to mem access / write logic testing

The combination of specifying a ModR/M byte with the upper two bits set
and the modrm field set to T is pointless - the same test will be
executed twice, i.e. overall things will be slower for no extra gain. I
can only assume this was a copy-and-paste-without-enough-editing mistake
of mine.

Furthermore adjust the base type of a few bit fields to shrink table
size, as subsequently quite a few new entries will get added to the
tables using this type.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86: comment update after "drop high compat r/o M2P table address range"
Jan Beulich [Wed, 5 Aug 2020 08:21:22 +0000 (10:21 +0200)]
x86: comment update after "drop high compat r/o M2P table address range"

Commit 5af040ef8b57 clearly should also have updated the comment, not
just the #define-s.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: further FPU env testing relaxation for AMD-like CPUs
Jan Beulich [Wed, 5 Aug 2020 08:20:59 +0000 (10:20 +0200)]
x86emul: further FPU env testing relaxation for AMD-like CPUs

See the code comment that's being extended. Additionally a few more
zap_fpsel() invocations are needed - whenever we stored state after
there potentially having been a context switch behind our backs.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: replace further UB shifts
Jan Beulich [Wed, 5 Aug 2020 08:19:29 +0000 (10:19 +0200)]
x86emul: replace further UB shifts

I have no explanation how I managed to overlook these while putting
together what is now b6a907f8c83d ("x86emul: replace UB shifts").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoRevert "libxl: avoid golang building without CONFIG_GOLANG=y"
Wei Liu [Tue, 4 Aug 2020 15:53:48 +0000 (15:53 +0000)]
Revert "libxl: avoid golang building without CONFIG_GOLANG=y"

This reverts commit fe49938f21c26f0ce630c69af055f927dd0ed75f.

We have an on-going discussion regarding this patch.

Signed-off-by: Wei Liu <wl@xen.org>
4 years agolibxl: avoid golang building without CONFIG_GOLANG=y
Jan Beulich [Mon, 3 Aug 2020 08:06:32 +0000 (10:06 +0200)]
libxl: avoid golang building without CONFIG_GOLANG=y

While this doesn't address the real problem I've run into (attempting to
update r/o source files), not recursing into tools/golang/xenlight/ is
enough to fix the build for me for the moment. I don't currently see why
60db5da62ac0 ("libxl: Generate golang bindings in libxl Makefile") found
it necessary to invoke this build step unconditionally.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agox86emul: avoid assembler warning about .type not taking effect in test harness
Jan Beulich [Mon, 3 Aug 2020 14:27:22 +0000 (16:27 +0200)]
x86emul: avoid assembler warning about .type not taking effect in test harness

gcc re-orders top level blocks by default when optimizing. This
re-ordering results in all our .type directives being emitted to the
assembly file first, followed by gcc's. The assembler warns about
attempts to change the type of a symbol when it was already set (and
when there's no intervening setting to "notype").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/hvm: simplify 'mmio_direct' check in epte_get_entry_emt()
Paul Durrant [Fri, 31 Jul 2020 15:43:31 +0000 (17:43 +0200)]
x86/hvm: simplify 'mmio_direct' check in epte_get_entry_emt()

Re-factor the code to take advantage of the fact that the APIC access page is
a 'special' page. The VMX code is left alone and hence the APIC access page is
still inserted into the P2M with type p2m_mmio_direct. This is left alone as it
is not obvious there is another suitable type to use, and the necessary
re-ordering in epte_get_entry_emt() is straightforward.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/hvm: set 'ipat' in EPT for special pages
Paul Durrant [Fri, 31 Jul 2020 15:42:47 +0000 (17:42 +0200)]
x86/hvm: set 'ipat' in EPT for special pages

All non-MMIO ranges (i.e. those not mapping real device MMIO regions) that
map valid MFNs are normally marked MTRR_TYPE_WRBACK and 'ipat' is set. Hence
when PV drivers running in a guest populate the BAR space of the Xen Platform
PCI Device with pages such as the Shared Info page or Grant Table pages,
accesses to these pages will be cachable.

However, should IOMMU mappings be enabled for the guest then these
accesses become uncachable. This has a substantial negative effect on I/O
throughput of PV devices. Arguably PV drivers should not be using BAR space to
host the Shared Info and Grant Table pages but it is currently commonplace for
them to do this and so this problem needs mitigation. Hence this patch makes
sure the 'ipat' bit is set for any special page regardless of where in GFN
space it is mapped.

NOTE: Clearly this mitigation only applies to Intel EPT. It is not obvious
      that there is any similar mitigation possible for AMD NPT. Downstreams
      such as Citrix XenServer have been carrying a patch similar to this for
      several releases though.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86emul: replace UB shifts
Jan Beulich [Fri, 31 Jul 2020 15:41:58 +0000 (17:41 +0200)]
x86emul: replace UB shifts

Displacement values can be negative, hence we shouldn't left-shift them.
Or else we get

(XEN) UBSAN: Undefined behaviour in x86_emulate/x86_emulate.c:3482:55
(XEN) left shift of negative value -2

While auditing shifts, I noticed a pair of missing parentheses, which
also get added right here.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
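
To make the problem in the "replace UB shifts" commit above concrete: in C, left-shifting a negative signed value is undefined behaviour. One common way to avoid it, shown here as a minimal sketch, is to multiply by a power of two instead. The function name scale_disp8 is hypothetical, and this is not necessarily the exact replacement used in the emulator.

```c
#include <stdint.h>

/*
 * Scale a signed 8-bit displacement by 2^shift.
 * "disp8 << shift" would be undefined for negative disp8, so multiply.
 */
static int32_t scale_disp8(int8_t disp8, unsigned int shift)
{
    return disp8 * (int32_t)(1u << shift);
}
```

The shift that remains operates on an unsigned constant, which is well defined for shift counts below the width of the type.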
4 years agotools/xen-cpuid: show enqcmd
Olaf Hering [Fri, 31 Jul 2020 15:41:27 +0000 (17:41 +0200)]
tools/xen-cpuid: show enqcmd

Translate <29> into a feature string.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
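
A minimal sketch of the kind of lookup the xen-cpuid commit above refers to, assuming that bits without a name are printed as "<N>". The table and function names are hypothetical; only the bit number 29 and the string "enqcmd" come from the commit.

```c
#include <stdio.h>

/* Hypothetical name table for one 32-bit feature word; only bit 29 is filled in. */
static const char *const feat_names_sketch[32] = {
    [29] = "enqcmd",
};

/* Writes the feature name, or "<N>" when the bit has no name yet. */
static int feature_str(char *buf, size_t len, unsigned int bit)
{
    if ( bit < 32 && feat_names_sketch[bit] )
        return snprintf(buf, len, "%s", feat_names_sketch[bit]);

    return snprintf(buf, len, "<%u>", bit);
}
```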
4 years agox86/PV: drop a few misleading paging_mode_refcounts() checks
Jan Beulich [Fri, 31 Jul 2020 15:40:13 +0000 (17:40 +0200)]
x86/PV: drop a few misleading paging_mode_refcounts() checks

The filling and cleaning up of v->arch.guest_table in new_guest_cr3()
was apparently inconsistent so far: There was a type ref acquired
unconditionally for the new top level page table, but the dropping of
the old type ref was conditional upon !paging_mode_refcounts(). Mirror
this also to arch_set_info_guest().

Also move new_guest_cr3()'s #ifdef to around the function - both callers
now get built only when CONFIG_PV, i.e. no need to retain a stub.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/configure: drop BASH configure variable
Andrew Cooper [Fri, 26 Jun 2020 16:46:38 +0000 (17:46 +0100)]
tools/configure: drop BASH configure variable

This is a weird variable to have in the first place.  The only user of it is
XSM's CONFIG_SHELL, which opencodes a fallback to sh.  The scripts are shebang
sh, which is already necessary to support non-Linux build environments.

Make the mkflask.sh and mkaccess_vector.sh scripts executable, drop the
CONFIG_SHELL, and drop the $BASH variable to prevent further use.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoxen/spinlock: move debug helpers inside the locked regions
Roger Pau Monne [Wed, 29 Jul 2020 11:13:30 +0000 (13:13 +0200)]
xen/spinlock: move debug helpers inside the locked regions

Debug helpers such as lock profiling or the invariant pCPU assertions
must strictly be performed inside the exclusive locked region, or else
races might happen.

Note the issue was not strictly introduced by the pointed commit in
the Fixes tag, since lock stats were already incremented before the
barrier, but that commit made it more apparent as manipulating the cpu
field could happen outside of the locked regions and thus trigger the
BUG_ON on rel_lock(). This is only enabled on debug builds, and thus
releases are not affected.

Fixes: 80cba391a35 ('spinlocks: in debug builds store cpu holding the lock')
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
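
The ordering requirement from the spinlock commit above can be sketched with C11 atomics. This is a simplified model with hypothetical names (dbg_lock_sketch, NO_CPU_SKETCH), not Xen's spinlock implementation: the debug bookkeeping is touched only after the lock is owned and before it is given up.

```c
#include <assert.h>
#include <stdatomic.h>

#define NO_CPU_SKETCH (-1)

struct dbg_lock_sketch {
    atomic_flag held;
    int cpu;            /* debug only: which CPU holds the lock */
};

static void dbg_lock_acquire(struct dbg_lock_sketch *l, int cpu)
{
    while ( atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire) )
        ; /* spin */

    /* Debug bookkeeping happens only once the lock is owned. */
    assert(l->cpu == NO_CPU_SKETCH);
    l->cpu = cpu;
}

static void dbg_lock_release(struct dbg_lock_sketch *l, int cpu)
{
    /* ... and is undone before the lock is released. */
    assert(l->cpu == cpu);
    l->cpu = NO_CPU_SKETCH;

    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

If the cpu field were reset after the flag is cleared, another CPU could acquire the lock in between and have its ownership record overwritten, which is the kind of race the commit message describes.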
4 years agox86/cpuid: Fix APIC bit clearing
Fam Zheng [Wed, 29 Jul 2020 17:51:45 +0000 (18:51 +0100)]
x86/cpuid: Fix APIC bit clearing

The bug is obvious here, other places in this function used
"cpufeat_mask" correctly.

Fixes: b648feff8ea2 ("xen/x86: Improvements to in-hypervisor cpuid sanity checks")
Signed-off-by: Fam Zheng <famzheng@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
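
The class of bug fixed by the APIC bit clearing commit above is presumably the confusion between a feature's index and its bit mask. The sketch below uses hypothetical macros (FEAT_APIC_SKETCH, feat_mask_sketch) with an illustrative "word * 32 + bit" encoding; it is not Xen's cpufeat_mask() definition.

```c
#include <stdint.h>

/* Illustrative encoding: feature index = word * 32 + bit. */
#define FEAT_APIC_SKETCH      (0 * 32 + 9)
#define feat_mask_sketch(idx) (1u << ((idx) & 31))

/* Correct: clear only the bit that the index names. */
static uint32_t clear_feature(uint32_t word, unsigned int idx)
{
    return word & ~feat_mask_sketch(idx);
}

/* Buggy pattern: treats the index itself as if it were a mask. */
static uint32_t clear_feature_buggy(uint32_t word, unsigned int idx)
{
    return word & ~idx;
}
```

With index 9, the buggy variant clears bits 0 and 3 and leaves bit 9 set, which is why the mistake is easy to spot once the two forms are side by side.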
4 years agox86/hvm: Clean up track_dirty_vram() calltree
Andrew Cooper [Fri, 20 Jul 2018 17:22:25 +0000 (17:22 +0000)]
x86/hvm: Clean up track_dirty_vram() calltree

 * Rename nr to nr_frames.  A plain 'nr' is confusing to follow in the
   lower levels.
 * Use DIV_ROUND_UP() rather than opencoding it in several different ways
 * The hypercall input is capped at uint32_t, so there is no need for
   nr_frames to be unsigned long in the lower levels.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
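
For readers unfamiliar with the idiom named in the track_dirty_vram() commit above, a round-up division macro can be sketched like this. The _SKETCH suffix marks the names as local to this example, and bitmap_bytes_sketch() is a hypothetical caller, not a function from the tree.

```c
#include <stdint.h>

#define DIV_ROUND_UP_SKETCH(n, d) (((n) + (d) - 1) / (d))

/* Bytes needed for a bitmap with one bit per frame. */
static uint32_t bitmap_bytes_sketch(uint32_t nr_frames)
{
    /* Widen first so that the "+ 7" cannot wrap for large inputs. */
    return (uint32_t)DIV_ROUND_UP_SKETCH((uint64_t)nr_frames, 8);
}
```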
4 years agox86/hvm: only translate ISA interrupts to GSIs in virtual timers
Roger Pau Monne [Mon, 27 Jul 2020 17:05:39 +0000 (19:05 +0200)]
x86/hvm: only translate ISA interrupts to GSIs in virtual timers

Only call hvm_isa_irq_to_gsi for ISA interrupts; interrupts
originating from an IO APIC pin already use a GSI and don't need to be
translated.

I haven't observed any issues from this, but I think it's better to
use it correctly.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/vpt: only try to resume timers belonging to enabled devices
Roger Pau Monne [Mon, 27 Jul 2020 17:05:38 +0000 (19:05 +0200)]
x86/vpt: only try to resume timers belonging to enabled devices

Check whether the emulated device is actually enabled before trying to
resume the associated timers.

Thankfully all those structures are zeroed at initialization, and
since the devices are not enabled they are never populated, which
triggers the pt->vcpu check at the beginning of pt_resume forcing an
exit from the function.

While there, limit the scope of i and make it unsigned.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/hvm: fix ISA IRQ 0 handling when set as lowest priority mode in IO APIC
Roger Pau Monne [Mon, 27 Jul 2020 17:05:37 +0000 (19:05 +0200)]
x86/hvm: fix ISA IRQ 0 handling when set as lowest priority mode in IO APIC

Lowest priority destination mode does allow the vIO APIC code to
select a vCPU to inject the interrupt to, but the selected vCPU must
be part of the possible destinations configured for such IO APIC pin.

Fix the code in order to only force vCPU 0 if it's part of the
listed destinations.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
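
The rule stated in the lowest priority commit above can be sketched as a small selection function. The bitmap representation and the names (irq0_target_sketch, dest_mask, arbitrated) are assumptions made for illustration; the real vIO-APIC code is structured differently.

```c
#include <stdint.h>

/*
 * dest_mask: bit N set means vCPU N is a configured destination.
 * arbitrated: the vCPU that ordinary lowest priority arbitration picked.
 */
static unsigned int irq0_target_sketch(uint32_t dest_mask,
                                       unsigned int arbitrated)
{
    /* Only force vCPU 0 when it is among the listed destinations. */
    if ( dest_mask & 1u )
        return 0;

    return arbitrated;
}
```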
4 years agox86/hvm: don't force vCPU 0 for IRQ 0 when using fixed destination mode
Roger Pau Monne [Mon, 27 Jul 2020 17:05:36 +0000 (19:05 +0200)]
x86/hvm: don't force vCPU 0 for IRQ 0 when using fixed destination mode

When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
use fixed delivery mode, do not forcefully route interrupts to vCPU 0,
as the OS might have set up those interrupts to be injected to a
different vCPU, and injecting to vCPU 0 can cause the OS to miss such
interrupts or errors to happen due to unexpected vectors being
injected on vCPU 0.

In order to fix this, remove such handling altogether for fixed destination
mode pins and just inject them according to the data set up in the
IO-APIC entry.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/hvm: fix vIO-APIC build without IRQ0_SPECIAL_ROUTING
Roger Pau Monne [Mon, 27 Jul 2020 17:05:35 +0000 (19:05 +0200)]
x86/hvm: fix vIO-APIC build without IRQ0_SPECIAL_ROUTING

pit_channel0_enabled needs to be guarded with IRQ0_SPECIAL_ROUTING
since it's only used when the special handling of ISA IRQ 0 is
enabled. However, since the helper is a single line, it's better to just
inline it directly in vioapic_deliver where it's used.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoprint: introduce a format specifier for pci_sbdf_t
Roger Pau Monne [Mon, 27 Jul 2020 10:31:36 +0000 (12:31 +0200)]
print: introduce a format specifier for pci_sbdf_t

The new format specifier is '%pp', and prints a pci_sbdf_t using the
seg:bus:dev.func format. Replace all SBDFs printed using
'%04x:%02x:%02x.%u' to use the new format specifier.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Julien Grall <julien.grall@arm.com>
For just the pieces where Jan is the only maintainer:
Acked-by: Jan Beulich <jbeulich@suse.com>
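
Standard C has no '%pp' conversion, so the output format named in the pci_sbdf_t commit above is shown here with a plain helper instead. The sbdf_sketch_t type and format_sbdf() are hypothetical; only the '%04x:%02x:%02x.%u' layout is taken from the commit message.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for pci_sbdf_t, with the four fields spelled out. */
typedef struct {
    uint16_t seg;
    uint8_t bus;
    uint8_t dev;
    uint8_t fn;
} sbdf_sketch_t;

/* Formats an SBDF as seg:bus:dev.func in a single place. */
static int format_sbdf(char *buf, size_t len, sbdf_sketch_t s)
{
    return snprintf(buf, len, "%04x:%02x:%02x.%u",
                    (unsigned int)s.seg, (unsigned int)s.bus,
                    (unsigned int)s.dev, (unsigned int)s.fn);
}
```

Centralising the format in one helper (or, in the hypervisor, one format specifier) keeps every log line consistent and avoids repeating four arguments at each call site.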
4 years agopublic/domctl: Fix the struct xen_domctl ABI in 32bit builds
Andrew Cooper [Mon, 27 Jul 2020 18:21:09 +0000 (19:21 +0100)]
public/domctl: Fix the struct xen_domctl ABI in 32bit builds

The Xen domctl ABI currently relies on the union containing a field with
alignment of 8.

32bit projects which only copy the used subset of functionality end up with an
ABI breakage if they don't have at least one uint64_aligned_t field copied.

Insert explicit padding, and some build assertions to ensure it never changes
moving forwards.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
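
The approach described in the domctl ABI commit above, explicit padding plus build-time assertions, can be sketched as below. The structure, field names and sizes are hypothetical and chosen only to make the arithmetic easy to follow; they are not the real struct xen_domctl.

```c
#include <stddef.h>
#include <stdint.h>

struct op_small_sketch {
    uint32_t val;       /* no 8-byte aligned member in this subset */
};

struct ctl_sketch {
    uint32_t cmd;
    uint32_t version;
    uint16_t domain;
    uint16_t pad[3];    /* explicit padding up to offset 16 */
    union {
        struct op_small_sketch small;
        uint8_t raw[128];
    } u;
};

/* Fail the build if the layout ever drifts. */
_Static_assert(offsetof(struct ctl_sketch, u) == 16,
               "union offset is part of the ABI");
_Static_assert(sizeof(struct ctl_sketch) == 144,
               "structure size is part of the ABI");
```

Without the pad field, the union in this sketch would start at offset 12 whenever none of its members requires 8-byte alignment, which is the kind of silent layout change the commit guards against.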
4 years agoxen/displif: Protocol version 2
Oleksandr Andrushchenko [Wed, 1 Jul 2020 07:19:23 +0000 (10:19 +0300)]
xen/displif: Protocol version 2

1. Add protocol version as an integer

Version string, which is in fact an integer, is hard to handle in the
code that supports different protocol versions. To simplify that
also add the version as an integer.

2. Pass buffer offset with XENDISPL_OP_DBUF_CREATE

There are cases when a display data buffer is created with a non-zero
offset to the data start. Handle such cases and provide that offset
while creating a display buffer.

3. Add XENDISPL_OP_GET_EDID command

Add an optional request for reading Extended Display Identification
Data (EDID) structure which allows better configuration of the
display connectors over the configuration set in XenStore.
With this change connectors may have multiple resolutions defined
with respect to detailed timing definitions and additional properties
normally provided by displays.

If this request is not supported by the backend then the visible area
is defined by the relevant XenStore's "resolution" property.

If the backend provides extended display identification data (EDID) with
the XENDISPL_OP_GET_EDID request then EDID values must take precedence
over the resolutions defined in XenStore.

4. Bump protocol version to 2.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agox86/pv: Make the PV default WRMSR path match the HVM default
Andrew Cooper [Thu, 23 Jul 2020 17:33:51 +0000 (18:33 +0100)]
x86/pv: Make the PV default WRMSR path match the HVM default

The current HVM default for writes to unknown MSRs is to inject #GP if the MSR
is unreadable, and discard writes otherwise. While this behaviour isn't great,
the PV default is even worse, because it swallows writes even to non-readable
MSRs.  i.e. A PV guest doesn't even get a #GP fault for a write to a totally
bogus index.

Update PV to make it consistent with HVM, which will simplify the task of
making other improvements to the default MSR behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
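
The default policy described in the WRMSR commit above reduces to a two-way decision. The sketch below models it with hypothetical names; in particular msr_readable_sketch() is a placeholder probe that treats a single arbitrary index as readable, and does not reflect how Xen actually tests readability.

```c
#include <stdbool.h>
#include <stdint.h>

enum wrmsr_result_sketch { WRMSR_OKAY_SKETCH, WRMSR_GP_FAULT_SKETCH };

/* Placeholder probe: pretend only index 0x10 is readable. */
static bool msr_readable_sketch(uint32_t msr)
{
    return msr == 0x10;
}

/* Default path for writes to MSRs with no dedicated handler. */
static enum wrmsr_result_sketch default_wrmsr_sketch(uint32_t msr, uint64_t val)
{
    (void)val;  /* the value is never stored on this path */

    if ( !msr_readable_sketch(msr) )
        return WRMSR_GP_FAULT_SKETCH;   /* unreadable: inject #GP */

    return WRMSR_OKAY_SKETCH;           /* readable: silently discard */
}
```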
4 years agolockprof: don't pass name into registration function
Jan Beulich [Fri, 24 Jul 2020 08:19:25 +0000 (10:19 +0200)]
lockprof: don't pass name into registration function

The type uniquely identifies the associated name, hence the name fields
can be statically initialized.

Also constify not just the involved struct field, but also struct
lock_profile's. Rather than specifying lock_profile_ancs[]' dimension at
definition time, add a suitable build time check, such that at least
missing tail additions to the initializer can be spotted easily.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolockprof: don't leave locks uninitialized upon allocation failure
Jan Beulich [Fri, 24 Jul 2020 08:18:30 +0000 (10:18 +0200)]
lockprof: don't leave locks uninitialized upon allocation failure

Even if a specific struct lock_profile instance can't be allocated, the
lock itself should still be functional. As this isn't a production use
feature, also log a message in the event that the profiling struct can't
be allocated.

Fixes: d98feda5c756 ("Make lock profiling usable again")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
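
The behaviour described in the lockprof allocation failure commit above can be sketched as follows: initialise the lock unconditionally, then treat a missing profiling structure as a logged but non-fatal condition. All types and names here are hypothetical, and the caller is assumed to pass in the result of its own allocation attempt.

```c
#include <stdbool.h>
#include <stdio.h>

struct prof_sketch {
    unsigned long lock_cnt;
};

struct lock_sketch {
    bool initialised;
    struct prof_sketch *profile;    /* NULL when profiling is unavailable */
};

/* prof may be NULL if the caller's allocation failed. */
static void lock_init_prof_sketch(struct lock_sketch *l,
                                  struct prof_sketch *prof)
{
    /* The lock itself is always made usable first. */
    l->initialised = true;
    l->profile = prof;

    if ( !prof )
        fprintf(stderr, "lock profiling unavailable for this lock\n");
}
```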
4 years agox86/S3: put data segment registers into known state upon resume
Jan Beulich [Fri, 24 Jul 2020 08:17:26 +0000 (10:17 +0200)]
x86/S3: put data segment registers into known state upon resume

wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
wakeup_start did set it to, and %gs at whatever BIOS did load into it.
All of this may end up confusing the first load_segments() to run on
the BSP after resume, in particular allowing a non-null selector value
to be left in %fs.

Alongside %ss, also put all other data segment registers into the same
state that the boot and CPU bringup paths put them in.

Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/vmce: Dispatch vmce_{rd,wr}msr() from guest_{rd,wr}msr()
Andrew Cooper [Tue, 21 Jul 2020 17:25:15 +0000 (18:25 +0100)]
x86/vmce: Dispatch vmce_{rd,wr}msr() from guest_{rd,wr}msr()

... rather than from the default clauses of the PV and HVM MSR handlers.

This means that we no longer take the vmce lock for any unknown MSR, and
accesses to architectural MCE banks outside of the subset implemented for the
guest no longer fall further through the unknown MSR path.

The bank limit of 32 isn't stated anywhere I can locate, but is a consequence
of the MSR layout described in SDM Volume 4.

With the vmce calls removed, the hvm alternative_call()'s expression can be
simplified substantially.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
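
The bank limit mentioned in the vmce commit above follows from simple arithmetic on the MSR indices. The sketch assumes the architectural layout as commonly documented (bank registers starting at 0x400, four MSRs per bank); the macro and function names are hypothetical.

```c
#include <stdint.h>

#define MSR_MC0_CTL_SKETCH   0x400u  /* first bank register (assumed layout) */
#define MC_REGS_PER_BANK     4u      /* CTL, STATUS, ADDR, MISC */
#define MC_MAX_BANKS_SKETCH  32u

/* Returns the bank an MSR index belongs to, or -1 if it is not a bank MSR. */
static int mce_bank_of_sketch(uint32_t msr)
{
    uint32_t end = MSR_MC0_CTL_SKETCH + MC_REGS_PER_BANK * MC_MAX_BANKS_SKETCH;

    if ( msr < MSR_MC0_CTL_SKETCH || msr >= end )
        return -1;

    return (int)((msr - MSR_MC0_CTL_SKETCH) / MC_REGS_PER_BANK);
}
```

With 32 banks of four registers each, the range ends at 0x47f, immediately below index 0x480, which is used for an unrelated MSR.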
4 years agox86/svm: Misc coding style corrections
Andrew Cooper [Fri, 7 Feb 2020 15:35:54 +0000 (15:35 +0000)]
x86/svm: Misc coding style corrections

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/svm: Fold nsvm_{wr,rd}msr() into svm_msr_{read,write}_intercept()
Andrew Cooper [Mon, 10 Dec 2018 11:58:03 +0000 (11:58 +0000)]
x86/svm: Fold nsvm_{wr,rd}msr() into svm_msr_{read,write}_intercept()

... to simplify the default cases.

There are multiple errors with the handling of these three MSRs, but they are
deliberately not addressed at this point.

This removes the dance converting -1/0/1 into X86EMUL_*, allowing for the
removal of the 'ret' variable.

While cleaning this up, drop the gdprintk()'s for #GP conditions, and the
'result' variable from svm_msr_write_intercept() as it is never modified.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agotools/ocaml: Default to useful build output
Elliott Mitchell [Sat, 18 Jul 2020 03:32:42 +0000 (20:32 -0700)]
tools/ocaml: Default to useful build output

While hiding details of build output looks pretty to some, defaulting to
doing so deviates from the rest of Xen.  Switch the OCAML tools to match
everything else.

Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 years agotools: Partially revert "Cross-compilation fixes."
Elliott Mitchell [Sat, 18 Jul 2020 03:31:21 +0000 (20:31 -0700)]
tools: Partially revert "Cross-compilation fixes."

This partially reverts commit 16504669c5cbb8b195d20412aadc838da5c428f7.

Doesn't look like much of 16504669c5cbb8b195d20412aadc838da5c428f7
actually remains due to passage of time.

Of the 3, both Python and pygrub appear to mostly be building just fine
cross-compiling.  The OCAML portion is being troublesome, and this is going
to cause bug reports elsewhere soon.  The OCAML portion though can
already be disabled by setting OCAML_TOOLS=n and shouldn't have this
extra form of disabling.

Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agotools/xen-cpuid: use dashes consistently in feature names
Jan Beulich [Tue, 21 Jul 2020 12:04:59 +0000 (14:04 +0200)]
tools/xen-cpuid: use dashes consistently in feature names

We've grown to a mix of dashes and underscores - switch to consistent
naming in the hope that future additions will play by this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agooxenstored: fix ABI breakage introduced in Xen 4.9.0
Edwin Török [Wed, 15 Jul 2020 15:10:56 +0000 (16:10 +0100)]
oxenstored: fix ABI breakage introduced in Xen 4.9.0

dbc84d2983969bb47d294131ed9e6bbbdc2aec49 (Xen >= 4.9.0) deleted XS_RESTRICT
from oxenstored, which caused all the following opcodes to be shifted by 1:
reset_watches became off-by-one compared to the C version of xenstored.

Looking at the C code the opcode for reset watches needs:
XS_RESET_WATCHES = XS_SET_TARGET + 2

So add the placeholder `Invalid` in the OCaml<->C mapping list.
(Note that the code here doesn't simply convert the OCaml constructor to
 an integer, so we don't need to introduce a dummy constructor).

Igor says that with a suitably patched xenopsd to enable watch reset,
we now see `reset watches` during kdump of a guest in xenstored-access.log.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Tested-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
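
The off-by-one described in the oxenstored commit above is easiest to see in a positional table. The actual fix is in OCaml; the sketch below restates the idea in C with relative numbering only, and all identifiers are hypothetical. Keeping a placeholder entry where the removed opcode used to be preserves the relation XS_RESET_WATCHES = XS_SET_TARGET + 2.

```c
/* Relative numbering only; the real opcode values are not reproduced here. */
enum xs_op_sketch {
    OP_SET_TARGET_SKETCH,
    OP_INVALID_SKETCH,          /* placeholder for the removed opcode */
    OP_RESET_WATCHES_SKETCH,
    OP_COUNT_SKETCH
};

static const char *const op_names_sketch[OP_COUNT_SKETCH] = {
    [OP_SET_TARGET_SKETCH]    = "set_target",
    [OP_INVALID_SKETCH]       = "invalid",
    [OP_RESET_WATCHES_SKETCH] = "reset_watches",
};

static const char *op_name_sketch(unsigned int op)
{
    return op < OP_COUNT_SKETCH ? op_names_sketch[op] : "unknown";
}
```

Dropping the placeholder line would shift every later entry down by one, so a peer sending the reset watches opcode would be decoded as something else.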
4 years agogolang/xenlight: fix code generation for python 2.6
Nick Rosbrook [Mon, 20 Jul 2020 23:54:40 +0000 (19:54 -0400)]
golang/xenlight: fix code generation for python 2.6

Before python 2.7, str.format() calls required that the format fields
were explicitly enumerated, e.g.:

  '{0} {1}'.format(foo, bar)

  vs.

  '{} {}'.format(foo, bar)

Currently, gengotypes.py uses the latter pattern everywhere, which means
the Go bindings do not build on python 2.6. Use the 2.6 syntax for
format() in order to support python 2.6 for now.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoMAINTAINERS: add myself as a golang bindings maintainer
Nick Rosbrook [Thu, 16 Jul 2020 16:00:26 +0000 (12:00 -0400)]
MAINTAINERS: add myself as a golang bindings maintainer

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoSUPPORT.md: Spell Experimental correctly
Julien Grall [Mon, 20 Jul 2020 17:35:55 +0000 (18:35 +0100)]
SUPPORT.md: Spell Experimental correctly

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>