]> xenbits.xensource.com Git - people/jgross/xen.git/log
people/jgross/xen.git
4 years agoxen/arm: Enable CPU Erratum 1165522 for Neoverse master origin/HEAD origin/master origin/smoke
Bertrand Marquis [Tue, 18 Aug 2020 13:47:39 +0000 (14:47 +0100)]
xen/arm: Enable CPU Erratum 1165522 for Neoverse

Enable CPU erratum of Speculative AT on the Neoverse N1 processor
versions r0p0 to r2p0.
Also Fix Cortex A76 Erratum string which had a wrong errata number.

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoarm: Add Neoverse N1 processor identification
Bertrand Marquis [Tue, 18 Aug 2020 13:47:38 +0000 (14:47 +0100)]
arm: Add Neoverse N1 processor identification

Add MIDR and CPU part numbers for Neoverse N1

Signed-off-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agox86: move cpu_{up,down}_helper()
Jan Beulich [Wed, 19 Aug 2020 09:09:38 +0000 (11:09 +0200)]
x86: move cpu_{up,down}_helper()

This is in preparation of making the building of sysctl.c conditional.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86: move domain_cpu_policy_changed()
Jan Beulich [Wed, 19 Aug 2020 09:08:46 +0000 (11:08 +0200)]
x86: move domain_cpu_policy_changed()

This is in preparation of making the building of domctl.c conditional.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/pv: allow reading APIC_BASE MSR
Roger Pau Monne [Mon, 17 Aug 2020 15:57:54 +0000 (17:57 +0200)]
x86/pv: allow reading APIC_BASE MSR

Linux PV guests will attempt to read the APIC_BASE MSR, so just report
a default value to make Linux happy.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/pv: handle reads to the PAT MSR
Roger Pau Monne [Mon, 17 Aug 2020 15:57:53 +0000 (17:57 +0200)]
x86/pv: handle reads to the PAT MSR

The value in the PAT MSR is part of the ABI between Xen and PV guests,
and there's no reason to not allow a PV guest to read it.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/pv: handle writes to the EFER MSR
Roger Pau Monne [Mon, 17 Aug 2020 15:57:52 +0000 (17:57 +0200)]
x86/pv: handle writes to the EFER MSR

Silently drop writes to the EFER MSR for PV guests if the value is not
changed from what it's being reported. Current PV Linux will attempt
to write to the MSR with the same value that's been read, and raising
a fault will result in a guest crash.

As part of this work introduce a helper to easily get the EFER value
reported to guests.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/ocaml/xenstored: drop select based socket watching
Edwin Török [Mon, 17 Aug 2020 18:45:47 +0000 (19:45 +0100)]
tools/ocaml/xenstored: drop select based socket watching

Poll has been the default since 2014, I think we can safely say by now
that poll() works and we don't need to fall back to select().

This will allow fixing up the way we call poll to be more efficient
(and pave the way for introducing epoll support):
currently poll wraps the select API, which is inefficient.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 years agotools/ocaml/libs/xc: Fix ambiguous documentation comment
Edwin Török [Mon, 17 Aug 2020 18:45:44 +0000 (19:45 +0100)]
tools/ocaml/libs/xc: Fix ambiguous documentation comment

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 years agoQEMU_TRADITIONAL_REVISION update
Ian Jackson [Tue, 18 Aug 2020 15:00:10 +0000 (16:00 +0100)]
QEMU_TRADITIONAL_REVISION update

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agodocs/process/branching-checklist: Get osstest branch right
Ian Jackson [Wed, 15 Jul 2020 15:39:18 +0000 (16:39 +0100)]
docs/process/branching-checklist: Get osstest branch right

The runes for this manual osstest were wrong.  It needs to run as
osstest, and cr-for-branches should be run from testing.git.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agotools: bump library version numbers
Wei Liu [Wed, 12 Aug 2020 09:55:34 +0000 (09:55 +0000)]
tools: bump library version numbers

Signed-off-by: Wei Liu <wl@xen.org>
4 years agoConfig.mk: update OVMF changeset
Wei Liu [Wed, 12 Aug 2020 09:55:11 +0000 (09:55 +0000)]
Config.mk: update OVMF changeset

Signed-off-by: Wei Liu <wl@xen.org>
4 years agoxen/arm: cmpxchg: Add missing memory barriers in __cmpxchg_mb_timeout()
Julien Grall [Wed, 29 Jul 2020 13:50:37 +0000 (14:50 +0100)]
xen/arm: cmpxchg: Add missing memory barriers in __cmpxchg_mb_timeout()

The function __cmpxchg_mb_timeout() was intended to have the same
semantics as __cmpxchg_mb(). Unfortunately, the memory barriers were
not added when first implemented.

There is no known issue with the existing callers, but the barriers are
added given this is the expected semantics in Xen.

The issue was introduced by XSA-295.

Backport: 4.8+
Fixes: 86b0bc958373 ("xen/arm: cmpxchg: Provide a new helper that can timeout")
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agoxen/arm: guestcopy: Re-order the includes
Julien Grall [Sat, 4 Apr 2020 11:07:17 +0000 (12:07 +0100)]
xen/arm: guestcopy: Re-order the includes

We usually have xen/ includes first and then asm/. They are also ordered
alphabetically among themselves.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agoxen/arm: decode: Re-order the includes
Julien Grall [Sat, 4 Apr 2020 11:06:04 +0000 (12:06 +0100)]
xen/arm: decode: Re-order the includes

We usually have xen/ includes first and then asm/. They are also ordered
alphabetically among themselves.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoxen/arm: kernel: Re-order the includes
Julien Grall [Sat, 4 Apr 2020 11:03:22 +0000 (12:03 +0100)]
xen/arm: kernel: Re-order the includes

We usually have xen/ includes first and then asm/. They are also ordered
alphabetically among themselves.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoxen/guest_access: Add emacs magics
Julien Grall [Sat, 4 Apr 2020 10:50:18 +0000 (11:50 +0100)]
xen/guest_access: Add emacs magics

Add emacs magics for xen/guest_access.h and
asm-x86/guest_access.h.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/hvm: change EOI exit bitmap helper parameter
Roger Pau Monné [Wed, 12 Aug 2020 12:47:05 +0000 (14:47 +0200)]
x86/hvm: change EOI exit bitmap helper parameter

Change the last parameter of the update_eoi_exit_bitmap helper to be a
set/clear boolean instead of a triggering field. This is already
inline with how the function is implemented, and will allow deciding
whether an exit is required by the higher layers that call into
update_eoi_exit_bitmap. Note that the current behavior is not changed
by this patch.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agorpmball: Adjust to new rpm, do not require --force
Don Slutz [Sun, 9 Aug 2020 18:22:34 +0000 (14:22 -0400)]
rpmball: Adjust to new rpm, do not require --force

Also prevent warning: directory /boot: remove failed

Before:

[root@TestCloud1 xen]# rpm -hiv dist/xen*rpm
Preparing...                          ################################# [100%]
        file /boot from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/bin from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/lib from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/lib64 from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
        file /usr/sbin from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
[root@TestCloud1 xen]# rpm -e xen
warning: directory /boot: remove failed: Device or resource busy

After:

[root@TestCloud1 xen]# rpm -hiv dist/xen*rpm
Preparing...                          ################################# [100%]
Updating / installing...
   1:xen-4.15-unstable                ################################# [100%]
[root@TestCloud1 xen]# rpm -e xen
[root@TestCloud1 xen]#

Signed-off-by: Don Slutz <Don.Slutz@Gmail.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agox86/iommu: convert AMD IOMMU code to use new page table allocator
Paul Durrant [Tue, 4 Aug 2020 13:41:59 +0000 (14:41 +0100)]
x86/iommu: convert AMD IOMMU code to use new page table allocator

This patch converts the AMD IOMMU code to use the new page table allocator
function. This allows all the free-ing code to be removed (since it is now
handled by the general x86 code) which reduces TLB and cache thrashing as well
as shortening the code.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/iommu: add common page-table allocator
Paul Durrant [Tue, 4 Aug 2020 13:41:57 +0000 (14:41 +0100)]
x86/iommu: add common page-table allocator

Instead of having separate page table allocation functions in VT-d and AMD
IOMMU code, we could use a common allocation function in the general x86 code.

This patch adds a new allocation function, iommu_alloc_pgtable(), for this
purpose. The function adds the page table pages to a list. The pages in this
list are then freed by iommu_free_pgtables(), which is called by
domain_relinquish_resources() after PCI devices have been de-assigned.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/iommu: re-arrange arch_iommu to separate common fields...
Paul Durrant [Tue, 4 Aug 2020 13:41:56 +0000 (14:41 +0100)]
x86/iommu: re-arrange arch_iommu to separate common fields...

... from those specific to VT-d or AMD IOMMU, and put the latter in a union.

There is no functional change in this patch, although the initialization of
the 'mapped_rmrrs' list occurs slightly later in iommu_domain_init() since
it is now done (correctly) in VT-d specific code rather than in general x86
code.

NOTE: I have not combined the AMD IOMMU 'root_table' and VT-d 'pgd_maddr'
      fields even though they perform essentially the same function. The
      concept of 'root table' in the VT-d code is different from that in the
      AMD code so attempting to use a common name will probably only serve
      to confuse the reader.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agox86/vmx: reorder code in vmx_deliver_posted_intr
Roger Pau Monné [Thu, 30 Jul 2020 14:03:09 +0000 (16:03 +0200)]
x86/vmx: reorder code in vmx_deliver_posted_intr

Remove the unneeded else branch, which allows to reduce the
indentation of a larger block of code, while making the flow of the
function more obvious.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agotools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
David Woodhouse [Thu, 19 Mar 2020 20:40:24 +0000 (20:40 +0000)]
tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating

The do_ls() function has somewhat inconsistent handling of errors.

If reading the node's contents with xs_read() fails, then do_ls() will
just quietly not display the contents.

If reading the node's permissions with xs_get_permissions() fails, then
do_ls() will print a warning, continue, and ultimately won't exit with
an error code (unless another error happens).

If recursing into the node with xs_directory() fails, then do_ls() will
abort immediately, not printing any further nodes.

For persistent failure modes — such as ENOENT because a node has been
removed, or EACCES because it has had its permisions changed since the
xs_directory() on the parent directory returned its name — it's
obviously quite likely that if either of the first two errors occur for
a given node, then so will the third and thus xenstore-ls will abort.

The ENOENT one is actually a fairly common case, and has caused tools to
fail to clean up a network device because it *apparently* already
doesn't exist in xenstore.

There is a school of thought that says, "Well, xenstore-ls returned an
error. So the tools should not trust its output."

The natural corollary of this would surely be that the tools must re-run
xenstore-ls as many times as is necessary until its manages to exit
without hitting the race condition. I am not keen on that conclusion.

For the specific case of ENOENT it seems reasonable to declare that,
but for the timing, we might as well just not have seen that node at
all when calling xs_directory() for the parent. By ignoring the error,
we give acceptable output.

The issue can be reproduced as follows:

(dom0) # for a in `seq 1 1000` ; do
              xenstore-write /local/domain/2/foo/$a $a ;
         done

Now simultaneously:

(dom0) # for a in `seq 1 999` ; do
              xenstore-rm /local/domain/2/foo/$a ;
         done
(dom2) # while true ; do
              ./xenstore-ls -p /local/domain/2/foo | grep -c 1000 ;
         done

We should expect to see node 1000 in the output, every time.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agox86/viridian: remove the viridian_vcpu msg_pending bit mask origin/coverity-tested/smoke
Paul Durrant [Thu, 13 Aug 2020 10:35:53 +0000 (11:35 +0100)]
x86/viridian: remove the viridian_vcpu msg_pending bit mask

The mask does not actually serve a useful purpose as we only use the SynIC
for timer messages. Dropping the mask means that the EOM MSR handler
essentially becomes a no-op. This means we can avoid setting 'message_pending'
for timer messages and hence avoid a VMEXIT for the EOM.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agox86/setup: Ignore early boot parameters like no-real-mode
Trammell Hudson [Wed, 12 Aug 2020 17:42:48 +0000 (17:42 +0000)]
x86/setup: Ignore early boot parameters like no-real-mode

There are parameters in xen/arch/x86/boot/cmdline.c that
are only used early in the boot process, so handlers are
necessary to avoid an "Unknown command line option" in
dmesg.

This also updates ignore_param() to generate a temporary
variable name so that the macro can be used more than once
per file.

Signed-off-by: Trammell hudson <hudson@trmm.net>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Leave note to stop TEMP_NAME() finding more general use]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoConfig.mk: update seabios to 1.14.0
Wei Liu [Wed, 12 Aug 2020 09:21:39 +0000 (09:21 +0000)]
Config.mk: update seabios to 1.14.0

Signed-off-by: Wei Liu <wl@xen.org>
4 years agoRevert "x86/EFI: sanitize build logic"
Andrew Cooper [Mon, 10 Aug 2020 14:45:46 +0000 (15:45 +0100)]
Revert "x86/EFI: sanitize build logic"

This reverts commit 90c7eee53fcc0b48bd51aa3a7d1d0d9980ce1a7a.

It breaks the build in some configurations with CONFIG_LIVEPATCH enabled.

  make[2]: *** No rule to make target 'efi/buildid.o', needed by '/local/xen.git/xen/xen.efi'.  Stop.
  make[2]: *** Waiting for unfinished jobs....

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/EFI: sanitize build logic
Jan Beulich [Fri, 7 Aug 2020 11:32:11 +0000 (13:32 +0200)]
x86/EFI: sanitize build logic

With changes done over time and as far as linking goes, the only special
thing about building with EFI support enabled is the need for the dummy
relocations object for xen.gz uniformly in all build stages. All other
efi/*.o can be consumed from the built_in*.o files.

In efi/Makefile, besides moving relocs-dummy.o to "extra", also properly
split between obj-y and obj-bin-y.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86: slightly re-arrange 32-bit handling in dom0_construct_pv()
Jan Beulich [Fri, 7 Aug 2020 11:14:02 +0000 (13:14 +0200)]
x86: slightly re-arrange 32-bit handling in dom0_construct_pv()

Add #ifdef-s (the 2nd one will be needed in particular, to guard the
uses of m2p_compat_vstart and HYPERVISOR_COMPAT_VIRT_START()) and fold
duplicate uses of elf_32bit().

Also adjust what gets logged: Avoid "compat32" when support isn't built
in, and don't assume ELF class <> ELFCLASS64 means ELFCLASS32.

While doing this, in code getting touched anyway:
- use ROUNDUP() instead of open-coding it,
- drop a stale (dead) BUG_ON(),
- replace panic() by printk() plus error return, for being consistent
  with other code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agobuild: correctly report non-empty section sizes upon .o -> .init.o conversion
Jan Beulich [Fri, 7 Aug 2020 11:12:21 +0000 (13:12 +0200)]
build: correctly report non-empty section sizes upon .o -> .init.o conversion

The originally used sed expression converted not just multiple leading
zeroes (as intended), but also trailing ones, rendering the error
message somewhat confusing. Collapse zeroes in just the one place where
we need them collapsed, and leave objdump's output as is for all other
purposes.

Fixes: 48115d14743e ("Move more kernel decompression bits to .init.* sections")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agobuild: work around bash issue
Jan Beulich [Fri, 7 Aug 2020 11:12:00 +0000 (13:12 +0200)]
build: work around bash issue

Older bash (observed with 3.2.57(2)) fails to honor "set -e" for certain
built-in commands ("while" here), despite the command's status correctly
being non-zero. The subsequent objcopy invocation now being separated by
a semicolon results in no failure. Insert an explicit "exit" (replacing
; by && ought to be another possible workaround).

Fixes: e321576f4047 ("xen/build: start using if_changed")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/tsc: Fix diagnostics for TSC frequency
Andrew Cooper [Wed, 5 Aug 2020 13:56:11 +0000 (14:56 +0100)]
x86/tsc: Fix diagnostics for TSC frequency

A Gemini Lake platform prints:

  (XEN) CPU0: TSC: 19200000MHz * 279 / 3 = 1785600000MHz
  (XEN) CPU0: 800..1800 MHz

during boot.  The units on the first line are Hz, not MHz, so correct that and
add a space for clarity.

Also, for the min/max line, use three dots instead of two and add more spaces
so that the line can't be mistaken for being a double decimal point typo.

Boot now reads:

  (XEN) CPU0: TSC: 19200000 Hz * 279 / 3 = 1785600000 Hz
  (XEN) CPU0: 800 ... 1800 MHz

Extend these changes to the other TSC diagnostics.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/ioapic: Improve code generation for __io_apic_{read,write}()
Andrew Cooper [Wed, 5 Aug 2020 13:35:16 +0000 (14:35 +0100)]
x86/ioapic: Improve code generation for __io_apic_{read,write}()

The write into REGSEL prevents the optimiser from reusing the address
calculation, forcing it to be calcualted twice.

The calculation itself is quite expensive.  Pull it out into a local varaible.

Bloat-o-meter reports:
  add/remove: 0/0 grow/shrink: 0/26 up/down: 0/-1527 (-1527)

Also correct the register type, which is uint32_t, not int.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/ioapic: Fix style in io_apic.h
Andrew Cooper [Wed, 5 Aug 2020 10:49:15 +0000 (11:49 +0100)]
x86/ioapic: Fix style in io_apic.h

This file is a mix of Xen and Linux styles.  Switch it fully to Xen style.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/ioapic: Fix fixmap error path logic in ioapic_init_mappings()
Andrew Cooper [Wed, 5 Aug 2020 11:05:27 +0000 (12:05 +0100)]
x86/ioapic: Fix fixmap error path logic in ioapic_init_mappings()

In the case that bad_ioapic_register() fails, the current position of idx++
means that clear_fixmap(idx) will be called with the wrong index, and not
clean up the mapping just created.

Increment idx as part of the loop, rather than midway through the loop body.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86emul: correct AVX512_BF16 insn names in EVEX Disp8 test
Jan Beulich [Wed, 5 Aug 2020 08:30:18 +0000 (10:30 +0200)]
x86emul: correct AVX512_BF16 insn names in EVEX Disp8 test

The leading 'v' ought to be omitted from the table entries.

Fixes: 7ff66809ccd5 ("x86emul: support AVX512_BF16 insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to EVEX-encoded insns
Jan Beulich [Wed, 5 Aug 2020 08:29:55 +0000 (10:29 +0200)]
x86emul: extend decoding / mem access testing to EVEX-encoded insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: AVX512PF insns aren't memory accesses
Jan Beulich [Wed, 5 Aug 2020 08:29:18 +0000 (10:29 +0200)]
x86emul: AVX512PF insns aren't memory accesses

These are prefetches, so should be treated just like other prefetches.

Fixes: 467e91bde720 ("x86emul: support AVX512PF insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: AVX512F scatter insns are memory writes
Jan Beulich [Wed, 5 Aug 2020 08:28:40 +0000 (10:28 +0200)]
x86emul: AVX512F scatter insns are memory writes

While the custom handling renders the "to_mem" field generally unused,
x86_insn_is_mem_write() still (indirectly) consumes that information,
and hence the table entries want to be correct.

Fixes: 7d569b848036 ("x86emul: support AVX512F scatter insns")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: AVX512{F,BW} down conversion moves are memory writes
Jan Beulich [Wed, 5 Aug 2020 08:28:01 +0000 (10:28 +0200)]
x86emul: AVX512{F,BW} down conversion moves are memory writes

For this to be properly reported, the case labels need to move to a
different switch() block.

Fixes: 30e0bdf79828 ("x86emul: support AVX512{F,BW} down conversion moves")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to XOP-encoded insns
Jan Beulich [Wed, 5 Aug 2020 08:27:31 +0000 (10:27 +0200)]
x86emul: extend decoding / mem access testing to XOP-encoded insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to VEX-encoded insns
Jan Beulich [Wed, 5 Aug 2020 08:27:23 +0000 (10:27 +0200)]
x86emul: extend decoding / mem access testing to VEX-encoded insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to MMX / SSE insns
Jan Beulich [Wed, 5 Aug 2020 08:27:11 +0000 (10:27 +0200)]
x86emul: extend decoding / mem access testing to MMX / SSE insns

IOW just legacy encoded ones. For 3dNow! just one example is used, as
they're all similar in nature both encoding- and operand-wise.

Rename pfx_none to pfx_no, so it can be used to improve readability /
column alignment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: extend decoding / mem access testing to FPU insns
Jan Beulich [Wed, 5 Aug 2020 08:26:59 +0000 (10:26 +0200)]
x86emul: extend decoding / mem access testing to FPU insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: adjustments to mem access / write logic testing
Jan Beulich [Wed, 5 Aug 2020 08:26:11 +0000 (10:26 +0200)]
x86emul: adjustments to mem access / write logic testing

The combination of specifying a ModR/M byte with the upper two bits set
and the modrm field set to T is pointless - the same test will be
executed twice, i.e. overall things will be slower for no extra gain. I
can only assume this was a copy-and-paste-without-enough-editing mistake
of mine.

Furthermore adjust the base type of a few bit fields to shrink table
size, as subsequently quite a few new entries will get added to the
tables using this type.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86: comment update after "drop high compat r/o M2P table address range"
Jan Beulich [Wed, 5 Aug 2020 08:21:22 +0000 (10:21 +0200)]
x86: comment update after "drop high compat r/o M2P table address range"

Commit 5af040ef8b57 clearly should also have updated the comment, not
just the #define-s.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: further FPU env testing relaxation for AMD-like CPUs
Jan Beulich [Wed, 5 Aug 2020 08:20:59 +0000 (10:20 +0200)]
x86emul: further FPU env testing relaxation for AMD-like CPUs

See the code comment that's being extended. Additionally a few more
zap_fpsel() invocations are needed - whenever we stored state after
there potentially having been a context switch behind our backs.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: replace further UB shifts
Jan Beulich [Wed, 5 Aug 2020 08:19:29 +0000 (10:19 +0200)]
x86emul: replace further UB shifts

I have no explanation how I managed to overlook these while putting
together what is now b6a907f8c83d ("x86emul: replace UB shifts").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoRevert "libxl: avoid golang building without CONFIG_GOLANG=y"
Wei Liu [Tue, 4 Aug 2020 15:53:48 +0000 (15:53 +0000)]
Revert "libxl: avoid golang building without CONFIG_GOLANG=y"

This reverts commit fe49938f21c26f0ce630c69af055f927dd0ed75f.

We have an on-going discussion regarding this patch.

Signed-off-by: Wei Liu <wl@xen.org>
4 years agolibxl: avoid golang building without CONFIG_GOLANG=y
Jan Beulich [Mon, 3 Aug 2020 08:06:32 +0000 (10:06 +0200)]
libxl: avoid golang building without CONFIG_GOLANG=y

While this doesn't address the real problem I've run into (attempting to
update r/o source files), not recursing into tools/golang/xenlight/ is
enough to fix the build for me for the moment. I don't currently see why
60db5da62ac0 ("libxl: Generate golang bindings in libxl Makefile") found
it necessary to invoke this build step unconditionally.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agox86emul: avoid assembler warning about .type not taking effect in test harness
Jan Beulich [Mon, 3 Aug 2020 14:27:22 +0000 (16:27 +0200)]
x86emul: avoid assembler warning about .type not taking effect in test harness

gcc re-orders top level blocks by default when optimizing. This
re-ordering results in all our .type directives to get emitted to the
assembly file first, followed by gcc's. The assembler warns about
attempts to change the type of a symbol when it was already set (and
when there's no intervening setting to "notype").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/hvm: simplify 'mmio_direct' check in epte_get_entry_emt()
Paul Durrant [Fri, 31 Jul 2020 15:43:31 +0000 (17:43 +0200)]
x86/hvm: simplify 'mmio_direct' check in epte_get_entry_emt()

Re-factor the code to take advantage of the fact that the APIC access page is
a 'special' page. The VMX code is left alone and hence the APIC access page is
still inserted into the P2M with type p2m_mmio_direct. This is left alone as it
is not obvious there is another suitable type to use, and the necessary
re-ordering in epte_get_entry_emt() is straightforward.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/hvm: set 'ipat' in EPT for special pages
Paul Durrant [Fri, 31 Jul 2020 15:42:47 +0000 (17:42 +0200)]
x86/hvm: set 'ipat' in EPT for special pages

All non-MMIO ranges (i.e those not mapping real device MMIO regions) that
map valid MFNs are normally marked MTRR_TYPE_WRBACK and 'ipat' is set. Hence
when PV drivers running in a guest populate the BAR space of the Xen Platform
PCI Device with pages such as the Shared Info page or Grant Table pages,
accesses to these pages will be cachable.

However, should IOMMU mappings be enabled be enabled for the guest then these
accesses become uncachable. This has a substantial negative effect on I/O
throughput of PV devices. Arguably PV drivers should bot be using BAR space to
host the Shared Info and Grant Table pages but it is currently commonplace for
them to do this and so this problem needs mitigation. Hence this patch makes
sure the 'ipat' bit is set for any special page regardless of where in GFN
space it is mapped.

NOTE: Clearly this mitigation only applies to Intel EPT. It is not obvious
      that there is any similar mitigation possible for AMD NPT. Downstreams
      such as Citrix XenServer have been carrying a patch similar to this for
      several releases though.

Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86emul: replace UB shifts
Jan Beulich [Fri, 31 Jul 2020 15:41:58 +0000 (17:41 +0200)]
x86emul: replace UB shifts

Displacement values can be negative, hence we shouldn't left-shift them.
Or else we get

(XEN) UBSAN: Undefined behaviour in x86_emulate/x86_emulate.c:3482:55
(XEN) left shift of negative value -2

While auditing shifts, I noticed a pair of missing parentheses, which
also get added right here.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/xen-cpuid: show enqcmd
Olaf Hering [Fri, 31 Jul 2020 15:41:27 +0000 (17:41 +0200)]
tools/xen-cpuid: show enqcmd

Translate <29> into a feature string.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/PV: drop a few misleading paging_mode_refcounts() checks
Jan Beulich [Fri, 31 Jul 2020 15:40:13 +0000 (17:40 +0200)]
x86/PV: drop a few misleading paging_mode_refcounts() checks

The filling and cleaning up of v->arch.guest_table in new_guest_cr3()
was apparently inconsistent so far: There was a type ref acquired
unconditionally for the new top level page table, but the dropping of
the old type ref was conditional upon !paging_mode_refcounts(). Mirror
this also to arch_set_info_guest().

Also move new_guest_cr3()'s #ifdef to around the function - both callers
now get built only when CONFIG_PV, i.e. no need to retain a stub.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agotools/configure: drop BASH configure variable
Andrew Cooper [Fri, 26 Jun 2020 16:46:38 +0000 (17:46 +0100)]
tools/configure: drop BASH configure variable

This is a weird variable to have in the first place.  The only user of it is
XSM's CONFIG_SHELL, which opencodes a fallback to sh.  The scripts are shebang
sh, which is already necessary to support non-Linux build environments.

Make the mkflask.sh and mkaccess_vector.sh scripts executable, drop the
CONFIG_SHELL, and drop the $BASH variable to prevent further use.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoxen/spinlock: move debug helpers inside the locked regions
Roger Pau Monne [Wed, 29 Jul 2020 11:13:30 +0000 (13:13 +0200)]
xen/spinlock: move debug helpers inside the locked regions

Debug helpers such as lock profiling or the invariant pCPU assertions
must strictly be performed inside the exclusive locked region, or else
races might happen.

Note the issue was not strictly introduced by the pointed commit in
the Fixes tag, since lock stats where already incremented before the
barrier, but that commit made it more apparent as manipulating the cpu
field could happen outside of the locked regions and thus trigger the
BUG_ON on rel_lock(). This is only enabled on debug builds, and thus
releases are not affected.

Fixes: 80cba391a35 ('spinlocks: in debug builds store cpu holding the lock')
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years agox86/cpuid: Fix APIC bit clearing
Fam Zheng [Wed, 29 Jul 2020 17:51:45 +0000 (18:51 +0100)]
x86/cpuid: Fix APIC bit clearing

The bug is obvious here, other places in this function used
"cpufeat_mask" correctly.

Fixed: b648feff8ea2 ("xen/x86: Improvements to in-hypervisor cpuid sanity checks")
Signed-off-by: Fam Zheng <famzheng@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/hvm: Clean up track_dirty_vram() calltree
Andrew Cooper [Fri, 20 Jul 2018 17:22:25 +0000 (17:22 +0000)]
x86/hvm: Clean up track_dirty_vram() calltree

 * Rename nr to nr_frames.  A plain 'nr' is confusing to follow in the the
   lower levels.
 * Use DIV_ROUND_UP() rather than opencoding it in several different ways
 * The hypercall input is capped at uint32_t, so there is no need for
   nr_frames to be unsigned long in the lower levels.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/hvm: only translate ISA interrupts to GSIs in virtual timers
Roger Pau Monne [Mon, 27 Jul 2020 17:05:39 +0000 (19:05 +0200)]
x86/hvm: only translate ISA interrupts to GSIs in virtual timers

Only call hvm_isa_irq_to_gsi for ISA interrupts, interrupts
originating from an IO APIC pin already use a GSI and don't need to be
translated.

I haven't observed any issues from this, but I think it's better to
use it correctly.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/vpt: only try to resume timers belonging to enabled devices
Roger Pau Monne [Mon, 27 Jul 2020 17:05:38 +0000 (19:05 +0200)]
x86/vpt: only try to resume timers belonging to enabled devices

Check whether the emulated device is actually enabled before trying to
resume the associated timers.

Thankfully all those structures are zeroed at initialization, and
since the devices are not enabled they are never populated, which
triggers the pt->vcpu check at the beginning of pt_resume forcing an
exit from the function.

While there limit the scope of i and make it unsigned.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/hvm: fix ISA IRQ 0 handling when set as lowest priority mode in IO APIC
Roger Pau Monne [Mon, 27 Jul 2020 17:05:37 +0000 (19:05 +0200)]
x86/hvm: fix ISA IRQ 0 handling when set as lowest priority mode in IO APIC

Lowest priority destination mode does allow the vIO APIC code to
select a vCPU to inject the interrupt to, but the selected vCPU must
be part of the possible destinations configured for such IO APIC pin.

Fix the code in order to only force vCPU 0 if it's part of the
listed destinations.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/hvm: don't force vCPU 0 for IRQ 0 when using fixed destination mode
Roger Pau Monne [Mon, 27 Jul 2020 17:05:36 +0000 (19:05 +0200)]
x86/hvm: don't force vCPU 0 for IRQ 0 when using fixed destination mode

When the IO APIC pin mapped to the ISA IRQ 0 has been configured to
use fixed delivery mode, do not forcefully route interrupts to vCPU 0,
as the OS might have setup those interrupts to be injected to a
different vCPU, and injecting to vCPU 0 can cause the OS to miss such
interrupts or errors to happen due to unexpected vectors being
injected on vCPU 0.

In order to fix remove such handling altogether for fixed destination
mode pins and just inject them according to the data setup in the
IO-APIC entry.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/hvm: fix vIO-APIC build without IRQ0_SPECIAL_ROUTING
Roger Pau Monne [Mon, 27 Jul 2020 17:05:35 +0000 (19:05 +0200)]
x86/hvm: fix vIO-APIC build without IRQ0_SPECIAL_ROUTING

pit_channel0_enabled needs to be guarded with IRQ0_SPECIAL_ROUTING
since it's only used when the special handling of ISA IRQ 0 is
enabled. However such helper being a single line it's better to just
inline it directly in vioapic_deliver where it's used.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoprint: introduce a format specifier for pci_sbdf_t
Roger Pau Monne [Mon, 27 Jul 2020 10:31:36 +0000 (12:31 +0200)]
print: introduce a format specifier for pci_sbdf_t

The new format specifier is '%pp', and prints a pci_sbdf_t using the
seg:bus:dev.func format. Replace all SBDFs printed using
'%04x:%02x:%02x.%u' to use the new format specifier.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Julien Grall <julien.grall@arm.com>
For just the pieces where Jan is the only maintainer:
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agopublic/domctl: Fix the struct xen_domctl ABI in 32bit builds
Andrew Cooper [Mon, 27 Jul 2020 18:21:09 +0000 (19:21 +0100)]
public/domctl: Fix the struct xen_domctl ABI in 32bit builds

The Xen domctl ABI currently relies on the union containing a field with
alignment of 8.

32bit projects which only copy the used subset of functionality end up with an
ABI breakage if they don't have at least one uint64_aligned_t field copied.

Insert explicit padding, and some build assertions to ensure it never changes
moving forwards.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years agoxen/displif: Protocol version 2
Oleksandr Andrushchenko [Wed, 1 Jul 2020 07:19:23 +0000 (10:19 +0300)]
xen/displif: Protocol version 2

1. Add protocol version as an integer

Version string, which is in fact an integer, is hard to handle in the
code that supports different protocol versions. To simplify that
also add the version as an integer.

2. Pass buffer offset with XENDISPL_OP_DBUF_CREATE

There are cases when display data buffer is created with non-zero
offset to the data start. Handle such cases and provide that offset
while creating a display buffer.

3. Add XENDISPL_OP_GET_EDID command

Add an optional request for reading Extended Display Identification
Data (EDID) structure which allows better configuration of the
display connectors over the configuration set in XenStore.
With this change connectors may have multiple resolutions defined
with respect to detailed timing definitions and additional properties
normally provided by displays.

If this request is not supported by the backend then visible area
is defined by the relevant XenStore's "resolution" property.

If backend provides extended display identification data (EDID) with
XENDISPL_OP_GET_EDID request then EDID values must take precedence
over the resolutions defined in XenStore.

4. Bump protocol version to 2.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agox86/pv: Make the PV default WRMSR path match the HVM default
Andrew Cooper [Thu, 23 Jul 2020 17:33:51 +0000 (18:33 +0100)]
x86/pv: Make the PV default WRMSR path match the HVM default

The current HVM default for writes to unknown MSRs is to inject #GP if the MSR
is unreadable, and discard writes otherwise. While this behaviour isn't great,
the PV default is even worse, because it swallows writes even to non-readable
MSRs.  i.e. A PV guest doesn't even get a #GP fault for a write to a totally
bogus index.

Update PV to make it consistent with HVM, which will simplify the task of
making other improvements to the default MSR behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agolockprof: don't pass name into registration function
Jan Beulich [Fri, 24 Jul 2020 08:19:25 +0000 (10:19 +0200)]
lockprof: don't pass name into registration function

The type uniquely identifies the associated name, hence the name fields
can be statically initialized.

Also constify not just the involved struct field, but also struct
lock_profile's. Rather than specifying lock_profile_ancs[]' dimension at
definition time, add a suitable build time check, such that at least
missing tail additions to the initializer can be spotted easily.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agolockprof: don't leave locks uninitialized upon allocation failure
Jan Beulich [Fri, 24 Jul 2020 08:18:30 +0000 (10:18 +0200)]
lockprof: don't leave locks uninitialized upon allocation failure

Even if a specific struct lock_profile instance can't be allocated, the
lock itself should still be functional. As this isn't a production use
feature, also log a message in the event that the profiling struct can't
be allocated.

Fixes: d98feda5c756 ("Make lock profiling usable again")
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/S3: put data segment registers into known state upon resume
Jan Beulich [Fri, 24 Jul 2020 08:17:26 +0000 (10:17 +0200)]
x86/S3: put data segment registers into known state upon resume

wakeup_32 sets %ds and %es to BOOT_DS, while leaving %fs at what
wakeup_start did set it to, and %gs at whatever BIOS did load into it.
All of this may end up confusing the first load_segments() to run on
the BSP after resume, in particular allowing a non-nul selector value
to be left in %fs.

Alongside %ss, also put all other data segment registers into the same
state that the boot and CPU bringup paths put them in.

Reported-by: M. Vefa Bicakci <m.v.b@runbox.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/vmce: Dispatch vmce_{rd,wr}msr() from guest_{rd,wr}msr()
Andrew Cooper [Tue, 21 Jul 2020 17:25:15 +0000 (18:25 +0100)]
x86/vmce: Dispatch vmce_{rd,wr}msr() from guest_{rd,wr}msr()

... rather than from the default clauses of the PV and HVM MSR handlers.

This means that we no longer take the vmce lock for any unknown MSR, and
accesses to architectural MCE banks outside of the subset implemented for the
guest no longer fall further through the unknown MSR path.

The bank limit of 32 isn't stated anywhere I can locate, but is a consequence
of the MSR layout described in SDM Volume 4.

With the vmce calls removed, the hvm alternative_call()'s expression can be
simplified substantially.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/svm: Misc coding style corrections
Andrew Cooper [Fri, 7 Feb 2020 15:35:54 +0000 (15:35 +0000)]
x86/svm: Misc coding style corrections

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/svm: Fold nsvm_{wr,rd}msr() into svm_msr_{read,write}_intercept()
Andrew Cooper [Mon, 10 Dec 2018 11:58:03 +0000 (11:58 +0000)]
x86/svm: Fold nsvm_{wr,rd}msr() into svm_msr_{read,write}_intercept()

... to simplify the default cases.

There are multiple errors with the handling of these three MSRs, but they are
deliberately not addressed at this point.

This removes the dance converting -1/0/1 into X86EMUL_*, allowing for the
removal of the 'ret' variable.

While cleaning this up, drop the gdprintk()'s for #GP conditions, and the
'result' variable from svm_msr_write_intercept() as it is never modified.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agotools/ocaml: Default to useful build output
Elliott Mitchell [Sat, 18 Jul 2020 03:32:42 +0000 (20:32 -0700)]
tools/ocaml: Default to useful build output

While hiding details of build output looks pretty to some, defaulting to
doing so deviates from the rest of Xen.  Switch the OCAML tools to match
everything else.

Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 years agotools: Partially revert "Cross-compilation fixes."
Elliott Mitchell [Sat, 18 Jul 2020 03:31:21 +0000 (20:31 -0700)]
tools: Partially revert "Cross-compilation fixes."

This partially reverts commit 16504669c5cbb8b195d20412aadc838da5c428f7.

Doesn't look like much of 16504669c5cbb8b195d20412aadc838da5c428f7
actually remains due to passage of time.

Of the 3, both Python and pygrub appear to mostly be building just fine
cross-compiling.  The OCAML portion is being troublesome, this is going
to cause bug reports elsewhere soon.  The OCAML portion though can
already be disabled by setting OCAML_TOOLS=n and shouldn't have this
extra form of disabling.

Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agotools/xen-cpuid: use dashes consistently in feature names
Jan Beulich [Tue, 21 Jul 2020 12:04:59 +0000 (14:04 +0200)]
tools/xen-cpuid: use dashes consistently in feature names

We've grown to a mix of dashes and underscores - switch to consistent
naming in the hope that future additions will play by this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agooxenstored: fix ABI breakage introduced in Xen 4.9.0
Edwin Török [Wed, 15 Jul 2020 15:10:56 +0000 (16:10 +0100)]
oxenstored: fix ABI breakage introduced in Xen 4.9.0

dbc84d2983969bb47d294131ed9e6bbbdc2aec49 (Xen >= 4.9.0) deleted XS_RESTRICT
from oxenstored, which caused all the following opcodes to be shifted by 1:
reset_watches became off-by-one compared to the C version of xenstored.

Looking at the C code the opcode for reset watches needs:
XS_RESET_WATCHES = XS_SET_TARGET + 2

So add the placeholder `Invalid` in the OCaml<->C mapping list.
(Note that the code here doesn't simply convert the OCaml constructor to
 an integer, so we don't need to introduce a dummy constructor).

Igor says that with a suitably patched xenopsd to enable watch reset,
we now see `reset watches` during kdump of a guest in xenstored-access.log.

Signed-off-by: Edwin Török <edvin.torok@citrix.com>
Tested-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
4 years agogolang/xenlight: fix code generation for python 2.6
Nick Rosbrook [Mon, 20 Jul 2020 23:54:40 +0000 (19:54 -0400)]
golang/xenlight: fix code generation for python 2.6

Before python 2.7, str.format() calls required that the format fields
were explicitly enumerated, e.g.:

  '{0} {1}'.format(foo, bar)

  vs.

  '{} {}'.format(foo, bar)

Currently, gengotypes.py uses the latter pattern everywhere, which means
the Go bindings do not build on python 2.6. Use the 2.6 syntax for
format() in order to support python 2.6 for now.

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoMAINTAINERS: add myself as a golang bindings maintainer
Nick Rosbrook [Thu, 16 Jul 2020 16:00:26 +0000 (12:00 -0400)]
MAINTAINERS: add myself as a golang bindings maintainer

Signed-off-by: Nick Rosbrook <rosbrookn@ainfosec.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoSUPPORT.md: Spell Experimental correctly
Julien Grall [Mon, 20 Jul 2020 17:35:55 +0000 (18:35 +0100)]
SUPPORT.md: Spell Experimental correctly

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86emul: support AVX512_VP2INTERSECT insns
Jan Beulich [Tue, 21 Jul 2020 12:00:25 +0000 (14:00 +0200)]
x86emul: support AVX512_VP2INTERSECT insns

The standard memory access pattern once again should allow us to go
without a test harness addition beyond the EVEX Disp8-scaling one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/shadow: l3table[] and gl3e[] are HVM only
Jan Beulich [Tue, 21 Jul 2020 11:59:28 +0000 (13:59 +0200)]
x86/shadow: l3table[] and gl3e[] are HVM only

... by the very fact that they're 3-level specific, while PV always gets
run in 4-level mode. This requires adding some seemingly redundant
#ifdef-s - some of them will be possible to drop again once 2- and
3-level guest code doesn't get built anymore in !HVM configs, but I'm
afraid there's still quite a bit of disentangling work to be done to
make this possible.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: have just a single instance of sh_set_toplevel_shadow()
Jan Beulich [Tue, 21 Jul 2020 11:58:56 +0000 (13:58 +0200)]
x86/shadow: have just a single instance of sh_set_toplevel_shadow()

The only guest/shadow level dependent piece here is the call to
sh_make_shadow(). Make a pointer to the respective function an
argument of sh_set_toplevel_shadow(), allowing it to be moved to
common.c.

This implies making get_shadow_status() available to common.c; its set
and delete counterparts are moved along with it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: shadow_table[] needs only one entry for PV-only configs
Jan Beulich [Tue, 21 Jul 2020 11:58:15 +0000 (13:58 +0200)]
x86/shadow: shadow_table[] needs only one entry for PV-only configs

Furthermore the field isn't needed at all with shadow support disabled -
move it into struct shadow_vcpu.

Introduce for_each_shadow_table(), shortening loops for the 4-level case
at the same time.

Adjust loop variables and a function parameter to be "unsigned int"
where applicable at the same time. Also move a comment that ended up
misplaced due to incremental additions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: dirty VRAM tracking is needed for HVM only
Jan Beulich [Tue, 21 Jul 2020 11:57:06 +0000 (13:57 +0200)]
x86/shadow: dirty VRAM tracking is needed for HVM only

Move shadow_track_dirty_vram() into hvm.c (requiring two static
functions to become non-static). More importantly though make sure we
don't de-reference d->arch.hvm.dirty_vram for a non-HVM guest. This was
a latent issue only just because the field lives far enough into struct
hvm_domain to be outside the part overlapping with struct pv_domain.

While moving shadow_track_dirty_vram() some purely typographic
adjustments are being made, like inserting missing blanks or putting
breaces on their own lines.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agodocs: Replace non-UTF-8 character in hypfs-paths.pandoc
Andrew Cooper [Mon, 20 Jul 2020 16:54:52 +0000 (17:54 +0100)]
docs: Replace non-UTF-8 character in hypfs-paths.pandoc

From the docs cronjob on xenbits:

  /usr/bin/pandoc --number-sections --toc --standalone misc/hypfs-paths.pandoc --output html/misc/hypfs-paths.html
  pandoc: Cannot decode byte '\x92': Data.Text.Internal.Encoding.decodeUtf8: Invalid UTF-8 stream
  make: *** [Makefile:236: html/misc/hypfs-paths.html] Error 1

Fixes: 5a4a411bde4 ("docs: specify stability of hypfs path documentation")
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: Paul Durrant <paul@xen.org>
4 years agoArm: prune #include-s needed by domain.h
Jan Beulich [Wed, 15 Jul 2020 10:39:06 +0000 (12:39 +0200)]
Arm: prune #include-s needed by domain.h

asm/domain.h is a dependency of xen/sched.h, and hence should not itself
include xen/sched.h. Nor should any of the other #include-s used by it.
While at it, also drop two other #include-s that aren't needed by this
particular header.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agodocs: specify stability of hypfs path documentation
Juergen Gross [Mon, 20 Jul 2020 11:38:00 +0000 (13:38 +0200)]
docs: specify stability of hypfs path documentation

In docs/misc/hypfs-paths.pandoc the supported paths in the hypervisor
file system are specified. Make it more clear that path availability
might change, e.g. due to scope widening or narrowing (e.g. being
limited to a specific architecture).

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: Paul Durrant <paul@xen.org>
4 years agox86/mtrr: Drop workaround for old 32bit CPUs
Andrew Cooper [Wed, 8 Jul 2020 10:12:51 +0000 (11:12 +0100)]
x86/mtrr: Drop workaround for old 32bit CPUs

This logic is dead as Xen is 64bit-only these days.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/vmx: add Intel PT MSR definitions
Michal Leszczynski [Tue, 30 Jun 2020 12:33:44 +0000 (14:33 +0200)]
x86/vmx: add Intel PT MSR definitions

Define constants related to Intel Processor Trace features.

Signed-off-by: Michal Leszczynski <michal.leszczynski@cert.pl>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agocompat: add a little bit of description to xlat.lst
Jan Beulich [Fri, 17 Jul 2020 15:52:14 +0000 (17:52 +0200)]
compat: add a little bit of description to xlat.lst

Requested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/HVM: fold both instances of looking up a hvm_ioreq_vcpu with a request pending
Jan Beulich [Fri, 17 Jul 2020 15:51:07 +0000 (17:51 +0200)]
x86/HVM: fold both instances of looking up a hvm_ioreq_vcpu with a request pending

It seems pretty likely that the "break" in the loop getting replaced in
handle_hvm_io_completion() was meant to exit both nested loops at the
same time. Re-purpose what has been hvm_io_pending() to hand back the
struct hvm_ioreq_vcpu instance found, and use it to replace the two
nested loops.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agox86/HVM: re-work hvm_wait_for_io() a little
Jan Beulich [Fri, 17 Jul 2020 15:50:40 +0000 (17:50 +0200)]
x86/HVM: re-work hvm_wait_for_io() a little

Convert the function's main loop to a more ordinary one, without goto
and without initial steps not needing to be inside a loop at all.

Take the opportunity and add blank lines between case blocks.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agox86/HVM: fold hvm_io_assist() into its only caller
Jan Beulich [Fri, 17 Jul 2020 15:50:09 +0000 (17:50 +0200)]
x86/HVM: fold hvm_io_assist() into its only caller

While there are two call sites, the function they're in can be slightly
re-arranged such that the code sequence can be added at its bottom. Note
that the function's only caller has already checked sv->pending, and
that the prior while() loop was just a slightly more fancy if()
(allowing an early break out of the construct).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul@xen.org>
4 years agoVT-d: use clear_page() in alloc_pgtable_maddr()
Jan Beulich [Fri, 17 Jul 2020 15:49:29 +0000 (17:49 +0200)]
VT-d: use clear_page() in alloc_pgtable_maddr()

For full pages this is (meant to be) more efficient. Also change the
type and reduce the scope of the involved local variable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: install sync_cache hook on demand
Jan Beulich [Fri, 17 Jul 2020 15:48:42 +0000 (17:48 +0200)]
VT-d: install sync_cache hook on demand

Instead of checking inside the hook whether any non-coherent IOMMUs are
present, simply install the hook only when this is the case.

To prove that there are no other references to the now dynamically
updated ops structure (and hence that its updating happens early
enough), make it static and rename it at the same time.

Note that this change implies that sync_cache() shouldn't be called
directly unless there are unusual circumstances, like is the case in
alloc_pgtable_maddr(), which gets invoked too early for iommu_ops to
be set already (and therefore we also need to be careful there to
avoid accessing vtd_ops later on, as it lives in .init).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>