Juergen Gross [Sun, 23 Aug 2020 08:00:17 +0000 (10:00 +0200)]
tools: add support for library names other than libxen*
All Xen libraries but one (libxlutil) are named libxen...
Add support in the generic library build framework for that different
naming by adding another indirection layer. For a library
LIB_PREFIX_<lib> can be set in tools/libs/uselibs.mk. The default is
"xen", assuming that all libraries are starting with "lib".
For now don't expand this support to stubdoms, as it isn't needed
there yet.
Juergen Gross [Sun, 23 Aug 2020 08:00:17 +0000 (10:00 +0200)]
tools/libxl: fix dependencies of libxl tests
Today building the libxl internal tests depends on libxlutil having
been built, in spite of the tests not using any functionality of
libxlutil. Fix this by dropping the dependency.
Juergen Gross [Sun, 23 Aug 2020 08:00:17 +0000 (10:00 +0200)]
tools: split libxenstat into new tools/libs/stat directory
There is no reason why libxenstat is not placed in the tools/libs
directory.
At the same time move xenstat.h to a dedicated include directory
in tools/libs/stat in order to follow the same pattern as the other
libraries in tools/libs.
As xentop is now the only directory left in xenstat, move it directly
under tools and get rid of tools/xenstat.
Fix some missing prototype errors (add one prototype and make two
functions static).
Juergen Gross [Sun, 23 Aug 2020 08:00:16 +0000 (10:00 +0200)]
tools: split libxenvchan into new tools/libs/vchan directory
There is no reason why libvchan is not placed in the tools/libs
directory.
At the same time move libxenvchan.h to a dedicated include directory
in tools/libs/vchan in order to follow the same pattern as the other
libraries in tools/libs.
As tools/libvchan now contains no library any longer rename it to
tools/vchan.
Juergen Gross [Sun, 23 Aug 2020 08:00:16 +0000 (10:00 +0200)]
tools: split libxenstore into new tools/libs/store directory
There is no reason why libxenstore is not placed in the tools/libs
directory.
The common files between libxenstore and xenstored are kept in the
tools/xenstore directory to be easily accessible by xenstore-stubdom
which needs the xenstored files to be built.
Juergen Gross [Sun, 23 Aug 2020 08:00:15 +0000 (10:00 +0200)]
tools: move libxenctrl below tools/libs
Today tools/libxc needs to be built after tools/libs as libxenctrl is
depending on some libraries in tools/libs. This in turn blocks moving
other libraries depending on libxenctrl below tools/libs.
So carve out libxenctrl from tools/libxc and move it into
tools/libs/ctrl.
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org> (stubdom parts)
Juergen Gross [Sun, 23 Aug 2020 08:00:15 +0000 (10:00 +0200)]
tools/libxc: untangle libxenctrl from libxenguest
Sources of libxenctrl and libxenguest are completely entangled. In
practice libxenguest is a user of libxenctrl, so don't let any source
of libxenctrl include xg_private.h.
This can be achieved by moving all definitions used by libxenctrl from
xg_private.h to xc_private.h.
Export xenctrl_dom.h as it will now be included by other public
headers.
Juergen Gross [Sun, 23 Aug 2020 08:00:13 +0000 (10:00 +0200)]
tools/misc: drop all libxc internals from xen-mfndump.c
The last libxc internal used by xen-mfndump.c is the ERROR() macro.
Add a simple definition for that macro to xen-mfndump.c and replace
the libxc private header includes by official ones.
Juergen Gross [Sun, 23 Aug 2020 08:00:13 +0000 (10:00 +0200)]
tools/misc: replace PAGE_SIZE with XC_PAGE_SIZE in xen-mfndump.c
The definition of PAGE_SIZE comes from xc_private.h, which shouldn't be
used by xen-mfndump.c. Replace PAGE_SIZE by XC_PAGE_SIZE, as
xc_private.h contains:
#define PAGE_SIZE XC_PAGE_SIZE
For the same reason PAGE_SHIFT_X86 needs to be replaced with
XC_PAGE_SHIFT.
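As a rough illustration of the substitution, the sketch below uses XC_PAGE_SIZE and XC_PAGE_SHIFT where code would previously have relied on the private PAGE_SIZE and PAGE_SHIFT_X86. The two macros are redefined locally (assuming 4 KiB pages) only so that the fragment compiles without xenctrl.h; page_offset() and page_frame() are hypothetical helpers, not functions taken from xen-mfndump.c.

```c
#include <stdint.h>

/* Local stand-ins (assumption: 4 KiB pages); real code gets these from <xenctrl.h>. */
#define XC_PAGE_SHIFT 12
#define XC_PAGE_SIZE  (1UL << XC_PAGE_SHIFT)

/* Hypothetical helper: byte offset of an address within its page. */
static unsigned long page_offset(uint64_t addr)
{
    return (unsigned long)(addr & (XC_PAGE_SIZE - 1));
}

/* Hypothetical helper: number of the frame containing an address. */
static uint64_t page_frame(uint64_t addr)
{
    return addr >> XC_PAGE_SHIFT;
}
```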
Juergen Gross [Sun, 23 Aug 2020 08:00:13 +0000 (10:00 +0200)]
tools/misc: don't include xg_save_restore.h from xen-mfndump.c
xen-mfndump.c is including the libxc private header xg_save_restore.h.
Avoid that by moving the definition of is_mapped() to xen-mfndump.c
(it is used there only) and by duplicating the definition of
M2P_SIZE() in xen-mfndump.c.
Juergen Gross [Sun, 23 Aug 2020 08:00:12 +0000 (10:00 +0200)]
tools: tweak tools/libs/libs.mk for being able to support libxenctrl
tools/libs/libs.mk needs to be modified for being able to support
building libxenctrl, as the pkg-config file of that library is not
following the same conventions as those of the other libraries.
So add support for specifying PKG_CONFIG before including libs.mk.
In order to make life easier for unstable libraries like libxenctrl
set MAJOR and MINOR automatically to the Xen-version and 0 when not
specified. This removes the need to bump the versions of unstable
libraries when switching to a new Xen version.
As all libraries built via libs.mk require a map file, generate a dummy
one in case none exists. This again will help avoid the
need to bump the library version in the map file of an unstable
library in case it is exporting all symbols.
The clean target is missing the removal of _paths.h.
Finally drop the foreach loop when setting PKG_CONFIG_LOCAL, as there
is always only one element in PKG_CONFIG.
Juergen Gross [Sun, 23 Aug 2020 08:00:12 +0000 (10:00 +0200)]
tools: drop explicit path specifications for qemu build
For more than three years now qemu has been able to set the needed
include and library paths for the Xen libraries via pkg-config.
So drop the specification of those paths in tools/Makefile. This will
make it possible to move libxenctrl away from tools/libxc, as qemu's
configure script has special treatment of this path.
Juergen Gross [Sun, 23 Aug 2020 08:00:12 +0000 (10:00 +0200)]
stubdom: simplify building xen libraries for stubdoms
The pattern for building a Xen library with sources under tools/libs
is always the same. Simplify stubdom/Makefile by defining a callable
make program for those libraries.
Even if not needed right now add the possibility for defining
additional dependencies for a library.
Juergen Gross [Sun, 23 Aug 2020 08:00:11 +0000 (10:00 +0200)]
tools: generate most contents of library make variables
Library related make variables (CFLAGS_lib*, SHDEPS_lib*, LDLIBS_lib*
and SHLIB_lib*) mostly have a common pattern for their values. Generate
most of this content automatically by adding a new per-library variable
defining on which other libraries a lib is depending. Those definitions
are put into an own file in order to make it possible to include it
from various Makefiles, especially for stubdom.
This in turn makes it possible to drop the USELIB variable from each
library Makefile.
The LIBNAME variable can be dropped, too, as it can be derived from the
directory name the library is residing in.
Juergen Gross [Sun, 23 Aug 2020 08:00:11 +0000 (10:00 +0200)]
tools: add a copy of library headers in tools/include
The headers.chk target in tools/Rules.mk tries to compile all headers
stand-alone in order to test that they don't include any internal header.
Unfortunately the headers tested against are not complete, as the
headers of Xen libraries are not in the include path of the
test compile run, resulting in a failure in case any of the tested
headers is including an official Xen library header.
Fix that by copying the official headers located in
tools/libs/*/include to tools/include.
In order to support libraries with header name other than xen<lib>.h
or with multiple headers add a LIBHEADER make variable a lib specific
Makefile can set in that case.
Move the headers.chk target from Rules.mk to libs.mk as it is used
for libraries in tools/libs only.
Add NO_HEADERS_CHK variable to skip checking headers as this will be
needed e.g. for libxenctrl.
Juergen Gross [Sun, 23 Aug 2020 08:00:11 +0000 (10:00 +0200)]
tools: switch XEN_LIBXEN* make variables to lower case (XEN_libxen*)
In order to harmonize names of library related make variables switch
XEN_LIBXEN* names to XEN_libxen*, as all other related variables (e.g.
CFLAGS_libxen*, SHDEPS_libxen*, ...) already use this pattern.
Rename XEN_LIBXC to XEN_libxenctrl, XEN_XENSTORE to XEN_libxenstore,
XEN_XENLIGHT to XEN_libxenlight, XEN_XLUTIL to XEN_libxlutil, and
XEN_LIBVCHAN to XEN_libxenvchan for the same reason.
Introduce XEN_libxenguest with the same value as XEN_libxenctrl.
Enable CPU erratum of Speculative AT on the Neoverse N1 processor
versions r0p0 to r2p0.
Also fix the Cortex A76 erratum string, which had a wrong erratum number.
Roger Pau Monne [Mon, 17 Aug 2020 15:57:52 +0000 (17:57 +0200)]
x86/pv: handle writes to the EFER MSR
Silently drop writes to the EFER MSR for PV guests if the value is
unchanged from what is being reported. Current PV Linux will attempt
to write to the MSR with the same value that's been read, and raising
a fault will result in a guest crash.
As part of this work introduce a helper to easily get the EFER value
reported to guests.
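The policy described above can be modelled in a few lines. This is only a sketch of the decision, not the hypervisor code: pv_efer_write_ok() and its parameters are hypothetical names, and reported_efer stands for whatever value the new helper would return to the guest on a read.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the write path: returns true when the write can be
 * silently dropped (the guest wrote back what it read), false when a fault
 * should be raised because the guest tried to change the value.
 */
static bool pv_efer_write_ok(uint64_t reported_efer, uint64_t new_val)
{
    return new_val == reported_efer;
}
```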
Edwin Török [Mon, 17 Aug 2020 18:45:47 +0000 (19:45 +0100)]
tools/ocaml/xenstored: drop select based socket watching
Poll has been the default since 2014, I think we can safely say by now
that poll() works and we don't need to fall back to select().
This will allow fixing up the way we call poll to be more efficient
(and pave the way for introducing epoll support):
currently poll wraps the select API, which is inefficient.
Signed-off-by: Edwin Török <edvin.torok@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com>
xen/arm: cmpxchg: Add missing memory barriers in __cmpxchg_mb_timeout()
The function __cmpxchg_mb_timeout() was intended to have the same
semantics as __cmpxchg_mb(). Unfortunately, the memory barriers were
not added when first implemented.
There is no known issue with the existing callers, but the barriers are
added given this is the expected semantics in Xen.
The issue was introduced by XSA-295.
Backport: 4.8+ Fixes: 86b0bc958373 ("xen/arm: cmpxchg: Provide a new helper that can timeout") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
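To illustrate what "same semantics as __cmpxchg_mb()" means for a bounded compare-and-exchange, here is a user-space analogue written with C11 atomics rather than Arm assembly. It is a sketch under stated assumptions: cmpxchg_mb_timeout_sketch() is a hypothetical name, and the full fences before and after the attempts stand in for the barriers the commit adds.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical analogue: try to replace *old with new_val at most max_try
 * times. Returns true if an attempt completed (the exchange happened, or a
 * different value was observed), false on timeout. *old receives the value
 * that was observed when an attempt completed.
 */
static bool cmpxchg_mb_timeout_sketch(atomic_ulong *ptr, unsigned long *old,
                                      unsigned long new_val,
                                      unsigned int max_try)
{
    unsigned long expected = *old;
    bool done = false;

    atomic_thread_fence(memory_order_seq_cst);   /* barrier before */

    while (max_try-- > 0) {
        unsigned long seen = expected;

        if (atomic_compare_exchange_weak_explicit(ptr, &seen, new_val,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed) ||
            seen != expected) {
            *old = seen;
            done = true;
            break;
        }
        /* Spurious failure: the value still matched, so retry. */
    }

    atomic_thread_fence(memory_order_seq_cst);   /* barrier after */
    return done;
}
```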
Change the last parameter of the update_eoi_exit_bitmap helper to be a
set/clear boolean instead of a triggering field. This is already
in line with how the function is implemented, and will allow deciding
whether an exit is required by the higher layers that call into
update_eoi_exit_bitmap. Note that the current behavior is not changed
by this patch.
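A helper with a set/clear boolean as its last parameter might look roughly like the following. The bitmap layout (256 bits, one per vector) and the names eoi_exit_bitmap_sketch and update_bitmap_sketch() are assumptions made for illustration; the real update_eoi_exit_bitmap() operates on vCPU state that is not shown here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 256-bit bitmap, one bit per interrupt vector. */
struct eoi_exit_bitmap_sketch {
    uint64_t word[4];
};

/* Set or clear the bit for 'vector'; the caller decides which via 'set'. */
static void update_bitmap_sketch(struct eoi_exit_bitmap_sketch *b,
                                 uint8_t vector, bool set)
{
    uint64_t mask = UINT64_C(1) << (vector % 64);

    if (set)
        b->word[vector / 64] |= mask;
    else
        b->word[vector / 64] &= ~mask;
}
```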
Don Slutz [Sun, 9 Aug 2020 18:22:34 +0000 (14:22 -0400)]
rpmball: Adjust to new rpm, do not require --force
Also prevent warning: directory /boot: remove failed
Before:
[root@TestCloud1 xen]# rpm -hiv dist/xen*rpm
Preparing... ################################# [100%]
file /boot from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
file /usr/bin from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
file /usr/lib from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
file /usr/lib64 from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
file /usr/sbin from install of xen-4.15-unstable.x86_64 conflicts with file from package filesystem-3.2-25.el7.x86_64
[root@TestCloud1 xen]# rpm -e xen
warning: directory /boot: remove failed: Device or resource busy
Paul Durrant [Tue, 4 Aug 2020 13:41:59 +0000 (14:41 +0100)]
x86/iommu: convert AMD IOMMU code to use new page table allocator
This patch converts the AMD IOMMU code to use the new page table allocator
function. This allows all the free-ing code to be removed (since it is now
handled by the general x86 code) which reduces TLB and cache thrashing as well
as shortening the code.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Tue, 4 Aug 2020 13:41:57 +0000 (14:41 +0100)]
x86/iommu: add common page-table allocator
Instead of having separate page table allocation functions in VT-d and AMD
IOMMU code, we could use a common allocation function in the general x86 code.
This patch adds a new allocation function, iommu_alloc_pgtable(), for this
purpose. The function adds the page table pages to a list. The pages in this
list are then freed by iommu_free_pgtables(), which is called by
domain_relinquish_resources() after PCI devices have been de-assigned.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
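The allocate-onto-a-list, free-everything-at-teardown pattern described above can be sketched in user space as follows. All names and types here (domain_iommu_sketch, alloc_pgtable_sketch(), free_pgtables_sketch()) are hypothetical, and malloc/calloc stand in for the hypervisor's page allocation; this is not the code of iommu_alloc_pgtable() or iommu_free_pgtables().

```c
#include <stdlib.h>

/* Hypothetical bookkeeping node for one page-table page. */
struct pgtable_sketch {
    struct pgtable_sketch *next;
    void *page;
};

/* Hypothetical per-domain state: just the list of allocated pages. */
struct domain_iommu_sketch {
    struct pgtable_sketch *pgtables;
};

/* Allocate a zeroed page-table page and record it on the domain's list. */
static void *alloc_pgtable_sketch(struct domain_iommu_sketch *d)
{
    struct pgtable_sketch *pt = malloc(sizeof(*pt));

    if (!pt)
        return NULL;
    pt->page = calloc(1, 4096);
    if (!pt->page) {
        free(pt);
        return NULL;
    }
    pt->next = d->pgtables;
    d->pgtables = pt;
    return pt->page;
}

/* Free every recorded page in one pass; returns how many were freed. */
static unsigned int free_pgtables_sketch(struct domain_iommu_sketch *d)
{
    unsigned int n = 0;

    while (d->pgtables) {
        struct pgtable_sketch *pt = d->pgtables;

        d->pgtables = pt->next;
        free(pt->page);
        free(pt);
        n++;
    }
    return n;
}
```

Because the list owns every page, callers need no per-table free paths, which is the property the AMD conversion relies on to drop its freeing code.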
Paul Durrant [Tue, 4 Aug 2020 13:41:56 +0000 (14:41 +0100)]
x86/iommu: re-arrange arch_iommu to separate common fields...
... from those specific to VT-d or AMD IOMMU, and put the latter in a union.
There is no functional change in this patch, although the initialization of
the 'mapped_rmrrs' list occurs slightly later in iommu_domain_init() since
it is now done (correctly) in VT-d specific code rather than in general x86
code.
NOTE: I have not combined the AMD IOMMU 'root_table' and VT-d 'pgd_maddr'
fields even though they perform essentially the same function. The
concept of 'root table' in the VT-d code is different from that in the
AMD code so attempting to use a common name will probably only serve
to confuse the reader.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
David Woodhouse [Thu, 19 Mar 2020 20:40:24 +0000 (20:40 +0000)]
tools/xenstore: Do not abort xenstore-ls if a node disappears while iterating
The do_ls() function has somewhat inconsistent handling of errors.
If reading the node's contents with xs_read() fails, then do_ls() will
just quietly not display the contents.
If reading the node's permissions with xs_get_permissions() fails, then
do_ls() will print a warning, continue, and ultimately won't exit with
an error code (unless another error happens).
If recursing into the node with xs_directory() fails, then do_ls() will
abort immediately, not printing any further nodes.
For persistent failure modes — such as ENOENT because a node has been
removed, or EACCES because it has had its permissions changed since the
xs_directory() on the parent directory returned its name — it's
obviously quite likely that if either of the first two errors occur for
a given node, then so will the third and thus xenstore-ls will abort.
The ENOENT one is actually a fairly common case, and has caused tools to
fail to clean up a network device because it *apparently* already
doesn't exist in xenstore.
There is a school of thought that says, "Well, xenstore-ls returned an
error. So the tools should not trust its output."
The natural corollary of this would surely be that the tools must re-run
xenstore-ls as many times as is necessary until it manages to exit
without hitting the race condition. I am not keen on that conclusion.
For the specific case of ENOENT it seems reasonable to declare that,
but for the timing, we might as well just not have seen that node at
all when calling xs_directory() for the parent. By ignoring the error,
we give acceptable output.
The issue can be reproduced as follows:
(dom0) # for a in `seq 1 1000` ; do
xenstore-write /local/domain/2/foo/$a $a ;
done
Now simultaneously:
(dom0) # for a in `seq 1 999` ; do
xenstore-rm /local/domain/2/foo/$a ;
done
(dom2) # while true ; do
./xenstore-ls -p /local/domain/2/foo | grep -c 1000 ;
done
We should expect to see node 1000 in the output, every time.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
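The resulting error policy can be modelled without a running xenstore. In the sketch below, each array element stands for the result of listing one child (0 or more on success, a negative errno value on failure); walk_children_sketch() is a hypothetical function and does not reproduce do_ls() or call xs_directory().

```c
#include <errno.h>

/*
 * Hypothetical model: returns the number of children listed, or -1 if a
 * failure other than ENOENT was encountered.
 */
static int walk_children_sketch(const int *results, int n)
{
    int shown = 0;

    for (int i = 0; i < n; i++) {
        if (results[i] == -ENOENT)
            continue;           /* node vanished meanwhile: treat as never seen */
        if (results[i] < 0)
            return -1;          /* any other failure still aborts the listing */
        shown++;
    }
    return shown;
}
```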
Paul Durrant [Thu, 13 Aug 2020 10:35:53 +0000 (11:35 +0100)]
x86/viridian: remove the viridian_vcpu msg_pending bit mask
The mask does not actually serve a useful purpose as we only use the SynIC
for timer messages. Dropping the mask means that the EOM MSR handler
essentially becomes a no-op. This means we can avoid setting 'message_pending'
for timer messages and hence avoid a VMEXIT for the EOM.
Trammell Hudson [Wed, 12 Aug 2020 17:42:48 +0000 (17:42 +0000)]
x86/setup: Ignore early boot parameters like no-real-mode
There are parameters in xen/arch/x86/boot/cmdline.c that
are only used early in the boot process, so handlers are
necessary to avoid an "Unknown command line option" in
dmesg.
This also updates ignore_param() to generate a temporary
variable name so that the macro can be used more than once
per file.
Signed-off-by: Trammell hudson <hudson@trmm.net> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Leave note to stop TEMP_NAME() finding more general use] Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
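Generating a fresh identifier per macro expansion is commonly done by pasting a base name with a counter. The sketch below assumes a compiler that provides __COUNTER__ and __attribute__((unused)) (both GCC/Clang extensions); IGNORE_PARAM_SKETCH, struct ignored_param_sketch and the second option name are hypothetical and only mirror the idea behind ignore_param() and TEMP_NAME().

```c
/* Two-level paste so that __COUNTER__ is expanded before concatenation. */
#define PASTE_SKETCH_(a, b) a##b
#define PASTE_SKETCH(a, b)  PASTE_SKETCH_(a, b)
#define TEMP_NAME_SKETCH(base) PASTE_SKETCH(base, __COUNTER__)

/* Hypothetical record describing a parameter that is parsed elsewhere. */
struct ignored_param_sketch {
    const char *name;
};

/* Each expansion defines a distinctly named object, so repeated use is fine. */
#define IGNORE_PARAM_SKETCH(str) \
    static const struct ignored_param_sketch TEMP_NAME_SKETCH(ignored_) \
        __attribute__((unused)) = { str }

IGNORE_PARAM_SKETCH("no-real-mode");
IGNORE_PARAM_SKETCH("example-early-option");   /* hypothetical option name */
```

With a fixed variable name, the second expansion in the same file would be a redefinition error; the counter avoids that.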
Jan Beulich [Fri, 7 Aug 2020 11:32:11 +0000 (13:32 +0200)]
x86/EFI: sanitize build logic
With changes done over time and as far as linking goes, the only special
thing about building with EFI support enabled is the need for the dummy
relocations object for xen.gz uniformly in all build stages. All other
efi/*.o can be consumed from the built_in*.o files.
In efi/Makefile, besides moving relocs-dummy.o to "extra", also properly
split between obj-y and obj-bin-y.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 7 Aug 2020 11:14:02 +0000 (13:14 +0200)]
x86: slightly re-arrange 32-bit handling in dom0_construct_pv()
Add #ifdef-s (the 2nd one will be needed in particular, to guard the
uses of m2p_compat_vstart and HYPERVISOR_COMPAT_VIRT_START()) and fold
duplicate uses of elf_32bit().
Also adjust what gets logged: Avoid "compat32" when support isn't built
in, and don't assume ELF class <> ELFCLASS64 means ELFCLASS32.
While doing this, in code getting touched anyway:
- use ROUNDUP() instead of open-coding it,
- drop a stale (dead) BUG_ON(),
- replace panic() by printk() plus error return, for being consistent
with other code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
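The point about not treating "not ELFCLASS64" as "ELFCLASS32" can be shown with an explicit three-way check. The class values 1 and 2 are those defined by the ELF specification for e_ident[EI_CLASS]; the enum names and elf_class_bits_sketch() are hypothetical and are not the helpers used in dom0_construct_pv().

```c
/* ELF class values for e_ident[EI_CLASS], per the ELF specification. */
enum {
    SK_ELFCLASSNONE = 0,
    SK_ELFCLASS32   = 1,
    SK_ELFCLASS64   = 2
};

/* Hypothetical helper: word size in bits, or 0 for an unrecognised class. */
static unsigned int elf_class_bits_sketch(unsigned char ei_class)
{
    switch (ei_class) {
    case SK_ELFCLASS32:
        return 32;
    case SK_ELFCLASS64:
        return 64;
    default:
        return 0;   /* neither class: do not guess */
    }
}
```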
The originally used sed expression converted not just multiple leading
zeroes (as intended), but also trailing ones, rendering the error
message somewhat confusing. Collapse zeroes in just the one place where
we need them collapsed, and leave objdump's output as is for all other
purposes.
Fixes: 48115d14743e ("Move more kernel decompression bits to .init.* sections") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 7 Aug 2020 11:12:00 +0000 (13:12 +0200)]
build: work around bash issue
Older bash (observed with 3.2.57(2)) fails to honor "set -e" for certain
built-in commands ("while" here), despite the command's status correctly
being non-zero. The subsequent objcopy invocation now being separated by
a semicolon results in no failure. Insert an explicit "exit" (replacing
; by && ought to be another possible workaround).
Fixes: e321576f4047 ("xen/build: start using if_changed") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
during boot. The units on the first line are Hz, not MHz, so correct that and
add a space for clarity.
Also, for the min/max line, use three dots instead of two and add more spaces
so that the line can't be mistaken for being a double decimal point typo.
Andrew Cooper [Wed, 5 Aug 2020 11:05:27 +0000 (12:05 +0100)]
x86/ioapic: Fix fixmap error path logic in ioapic_init_mappings()
In the case that bad_ioapic_register() fails, the current position of idx++
means that clear_fixmap(idx) will be called with the wrong index, and not
clean up the mapping just created.
Increment idx as part of the loop, rather than midway through the loop body.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
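The effect of moving the increment into the loop header can be seen in a small model. Here set_fixmap_sketch() and clear_fixmap_sketch() are local stand-ins operating on a boolean array, and the bad[] input is a hypothetical substitute for the bad_ioapic_register() check; this is not the body of ioapic_init_mappings().

```c
#include <stdbool.h>

#define NR_SLOTS_SKETCH 4

/* Model of the fixmap: true means the slot is currently mapped. */
static bool mapped_sketch[NR_SLOTS_SKETCH];

static void set_fixmap_sketch(int idx)   { mapped_sketch[idx] = true; }
static void clear_fixmap_sketch(int idx) { mapped_sketch[idx] = false; }

static void init_mappings_sketch(const bool bad[NR_SLOTS_SKETCH])
{
    /* idx advances in the loop header, never midway through the body. */
    for (int i = 0, idx = 0; i < NR_SLOTS_SKETCH; i++, idx++) {
        set_fixmap_sketch(idx);
        if (bad[i])
            clear_fixmap_sketch(idx);   /* idx still names the slot just mapped */
    }
}
```

Had idx been incremented between the two calls, the error path would have cleared the next slot and left the bad mapping in place.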
Jan Beulich [Wed, 5 Aug 2020 08:30:18 +0000 (10:30 +0200)]
x86emul: correct AVX512_BF16 insn names in EVEX Disp8 test
The leading 'v' ought to be omitted from the table entries.
Fixes: 7ff66809ccd5 ("x86emul: support AVX512_BF16 insns") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 5 Aug 2020 08:29:18 +0000 (10:29 +0200)]
x86emul: AVX512PF insns aren't memory accesses
These are prefetches, so should be treated just like other prefetches.
Fixes: 467e91bde720 ("x86emul: support AVX512PF insns") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 5 Aug 2020 08:28:40 +0000 (10:28 +0200)]
x86emul: AVX512F scatter insns are memory writes
While the custom handling renders the "to_mem" field generally unused,
x86_insn_is_mem_write() still (indirectly) consumes that information,
and hence the table entries want to be correct.
Fixes: 7d569b848036 ("x86emul: support AVX512F scatter insns") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 5 Aug 2020 08:28:01 +0000 (10:28 +0200)]
x86emul: AVX512{F,BW} down conversion moves are memory writes
For this to be properly reported, the case labels need to move to a
different switch() block.
Fixes: 30e0bdf79828 ("x86emul: support AVX512{F,BW} down conversion moves") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 5 Aug 2020 08:26:11 +0000 (10:26 +0200)]
x86emul: adjustments to mem access / write logic testing
The combination of specifying a ModR/M byte with the upper two bits set
and the modrm field set to T is pointless - the same test will be
executed twice, i.e. overall things will be slower for no extra gain. I
can only assume this was a copy-and-paste-without-enough-editing mistake
of mine.
Furthermore adjust the base type of a few bit fields to shrink table
size, as subsequently quite a few new entries will get added to the
tables using this type.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 5 Aug 2020 08:20:59 +0000 (10:20 +0200)]
x86emul: further FPU env testing relaxation for AMD-like CPUs
See the code comment that's being extended. Additionally a few more
zap_fpsel() invocations are needed - whenever we stored state after
there potentially having been a context switch behind our backs.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 3 Aug 2020 08:06:32 +0000 (10:06 +0200)]
libxl: avoid golang building without CONFIG_GOLANG=y
While this doesn't address the real problem I've run into (attempting to
update r/o source files), not recursing into tools/golang/xenlight/ is
enough to fix the build for me for the moment. I don't currently see why
60db5da62ac0 ("libxl: Generate golang bindings in libxl Makefile") found
it necessary to invoke this build step unconditionally.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wl@xen.org>
Jan Beulich [Mon, 3 Aug 2020 14:27:22 +0000 (16:27 +0200)]
x86emul: avoid assembler warning about .type not taking effect in test harness
gcc re-orders top level blocks by default when optimizing. This
re-ordering results in all our .type directives getting emitted to the
assembly file first, followed by gcc's. The assembler warns about
attempts to change the type of a symbol when it was already set (and
when there's no intervening setting to "notype").
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Paul Durrant [Fri, 31 Jul 2020 15:43:31 +0000 (17:43 +0200)]
x86/hvm: simplify 'mmio_direct' check in epte_get_entry_emt()
Re-factor the code to take advantage of the fact that the APIC access page is
a 'special' page. The VMX code is left alone and hence the APIC access page is
still inserted into the P2M with type p2m_mmio_direct. This is left alone as it
is not obvious there is another suitable type to use, and the necessary
re-ordering in epte_get_entry_emt() is straightforward.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Fri, 31 Jul 2020 15:42:47 +0000 (17:42 +0200)]
x86/hvm: set 'ipat' in EPT for special pages
All non-MMIO ranges (i.e. those not mapping real device MMIO regions) that
map valid MFNs are normally marked MTRR_TYPE_WRBACK and 'ipat' is set. Hence
when PV drivers running in a guest populate the BAR space of the Xen Platform
PCI Device with pages such as the Shared Info page or Grant Table pages,
accesses to these pages will be cachable.
However, should IOMMU mappings be enabled for the guest then these
accesses become uncachable. This has a substantial negative effect on I/O
throughput of PV devices. Arguably PV drivers should not be using BAR space to
host the Shared Info and Grant Table pages but it is currently commonplace for
them to do this and so this problem needs mitigation. Hence this patch makes
sure the 'ipat' bit is set for any special page regardless of where in GFN
space it is mapped.
NOTE: Clearly this mitigation only applies to Intel EPT. It is not obvious
that there is any similar mitigation possible for AMD NPT. Downstreams
such as Citrix XenServer have been carrying a patch similar to this for
several releases though.
Signed-off-by: Paul Durrant <pdurrant@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 31 Jul 2020 15:41:58 +0000 (17:41 +0200)]
x86emul: replace UB shifts
Displacement values can be negative, hence we shouldn't left-shift them.
Or else we get
(XEN) UBSAN: Undefined behaviour in x86_emulate/x86_emulate.c:3482:55
(XEN) left shift of negative value -2
While auditing shifts, I noticed a pair of missing parentheses, which
also get added right here.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
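Left-shifting a negative signed value is undefined behaviour in C, whereas multiplying by a power of two is well defined as long as the result fits. The sketch below shows that replacement for a scaled 8-bit displacement; scale_disp_sketch() is a hypothetical helper and not the expression used in x86_emulate.c.

```c
#include <stdint.h>

/*
 * Scale a signed displacement by 2^shift without shifting a negative value.
 * Assumes shift is small enough (here: below 24) for the result to fit.
 */
static int32_t scale_disp_sketch(int8_t disp8, unsigned int shift)
{
    return disp8 * (int32_t)(1u << shift);
}
```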
Jan Beulich [Fri, 31 Jul 2020 15:40:13 +0000 (17:40 +0200)]
x86/PV: drop a few misleading paging_mode_refcounts() checks
The filling and cleaning up of v->arch.guest_table in new_guest_cr3()
was apparently inconsistent so far: There was a type ref acquired
unconditionally for the new top level page table, but the dropping of
the old type ref was conditional upon !paging_mode_refcounts(). Mirror
this also to arch_set_info_guest().
Also move new_guest_cr3()'s #ifdef to around the function - both callers
now get built only when CONFIG_PV, i.e. no need to retain a stub.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 26 Jun 2020 16:46:38 +0000 (17:46 +0100)]
tools/configure: drop BASH configure variable
This is a weird variable to have in the first place. The only user of it is
XSM's CONFIG_SHELL, which opencodes a fallback to sh. The scripts are shebang
sh, which is already necessary to support non-Linux build environments.
Make the mkflask.sh and mkaccess_vector.sh scripts executable, drop the
CONFIG_SHELL, and drop the $BASH variable to prevent further use.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
xen/spinlock: move debug helpers inside the locked regions
Debug helpers such as lock profiling or the invariant pCPU assertions
must strictly be performed inside the exclusive locked region, or else
races might happen.
Note the issue was not strictly introduced by the pointed commit in
the Fixes tag, since lock stats were already incremented before the
barrier, but that commit made it more apparent as manipulating the cpu
field could happen outside of the locked regions and thus trigger the
BUG_ON on rel_lock(). This is only enabled on debug builds, and thus
releases are not affected.
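The ordering requirement can be illustrated with a tiny test-and-set lock built on C11 atomics. This is a sketch only: struct dbg_lock_sketch and its owner field are hypothetical stand-ins for the debug bookkeeping, and the code is not Xen's spinlock implementation. The point is that the owner field is written only after the lock is taken and checked and reset before it is released.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical lock with a debug-only owner field (-1 means unowned). */
struct dbg_lock_sketch {
    atomic_flag locked;
    int owner;
};

static void dbg_lock_acquire_sketch(struct dbg_lock_sketch *l, int cpu)
{
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;                   /* spin until the flag was previously clear */
    l->owner = cpu;         /* debug bookkeeping happens inside the locked region */
}

static void dbg_lock_release_sketch(struct dbg_lock_sketch *l, int cpu)
{
    assert(l->owner == cpu);    /* checked while still holding the lock */
    l->owner = -1;              /* reset before the lock becomes available */
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```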
Andrew Cooper [Fri, 20 Jul 2018 17:22:25 +0000 (17:22 +0000)]
x86/hvm: Clean up track_dirty_vram() calltree
* Rename nr to nr_frames. A plain 'nr' is confusing to follow in the
lower levels.
* Use DIV_ROUND_UP() rather than opencoding it in several different ways
* The hypercall input is capped at uint32_t, so there is no need for
nr_frames to be unsigned long in the lower levels.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org>
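For reference, a rounding-up division macro and one plausible use with a 32-bit frame count are sketched below. The macro body is the conventional definition and is assumed to match Xen's DIV_ROUND_UP(); dirty_bitmap_bytes_sketch() is a hypothetical helper, and the widening cast is there so that adding the divisor cannot wrap for large counts.

```c
#include <stdint.h>

/* Conventional rounding-up division (assumed to match Xen's macro). */
#define DIV_ROUND_UP_SKETCH(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: bytes needed for a bitmap with one bit per frame. */
static uint32_t dirty_bitmap_bytes_sketch(uint32_t nr_frames)
{
    return (uint32_t)DIV_ROUND_UP_SKETCH((uint64_t)nr_frames, 8);
}
```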