Andrew Cooper [Tue, 26 Mar 2019 14:23:03 +0000 (14:23 +0000)]
CI: Add a CentOS 6 container and build jobs
CentOS 6 is probably the most frequently broken build, so adding it to CI
would be a very good move.
One problem is that CentOS 6 comes with Python 2.6, and Qemu requires 2.7.
There appear to be no sensible ways to get Python 2.7 into a CentOS 6
environment, so modify the build script to skip the Qemu upstream build
instead. Additionally, SeaBIOS requires GCC 4.6 or later, so skip it as well.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Mon, 18 Mar 2019 16:22:29 +0000 (16:22 +0000)]
docs/admin-guide: Boot time microcode loading
Recent discussion on xen-devel has demonstrated that Xen's existing microcode
loading support isn't adequately documented. Take the opportunity to address
this, and start some end-user-focused documentation.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 21 Nov 2018 17:03:50 +0000 (17:03 +0000)]
docs/rst: Use pandoc to render ReStructuredText
Sphinx uses ReStructuredText as its markup format. Although the project-wide
integration is still missing, individual *.rst files can be rendered by pandoc
to supplement our existing ad-hoc documentation.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 21 Nov 2018 17:03:50 +0000 (17:03 +0000)]
docs/sphinx: Skeleton setup
Sphinx is a documentation system, which is popular for technical writing. It
uses ReStructuredText as its markup syntax, and is designed for whole-project
documentation, rather than the miscellaneous assortment of individual files that we
currently have.
This is a skeleton setup with just enough infrastructure to render an empty
set of pages. It will become better integrated into Xen's docs system when it
becomes less WIP.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Thu, 21 Mar 2019 19:36:48 +0000 (19:36 +0000)]
passthrough/vtd: Drop the "workaround_bios_bug" logic entirely
It turns out that this code was previously dead.
c/s dcf41790 "x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling
acpi_parse_dmar()" resulted in PCI segment 0 now having been initialised
enough for acpi_parse_one_drhd() to not take the
/* Skip checking if segment is not accessible yet. */
path unconditionally. However, some systems have DMAR tables which list
devices which are disabled by user choice (in particular, Dell PowerEdge R740
with I/O AT DMA disabled), and turning off all IOMMU functionality in this
case is entirely unhelpful behaviour.
Leave the warning which identifies the problematic devices, but drop the
remaining logic. This leaves the system in better overall state, and working
in the same way that it did in previous releases.
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Wei Liu [Wed, 20 Mar 2019 15:43:38 +0000 (15:43 +0000)]
libxc: fix HVM core dump
f969bc9fc96 forbade the get_address_size call on HVM guests, because it
didn't make sense there. This broke core dump functionality on HVM guests,
because libxc unconditionally asked for the guest width.
Force guest_width to a sensible value.
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Tue, 19 Mar 2019 13:57:06 +0000 (13:57 +0000)]
x86: decouple xen alignment setting from EFI/ELF build
Introduce a new Kconfig option to pick the alignment for the xen binary.
To retain the original behaviour, the default for the EFI build is 2M and
for the ELF build 4K.
Make the pvshim build use 2M alignment for potentially better
performance.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 15 Mar 2019 22:08:41 +0000 (22:08 +0000)]
x86/spec-ctrl: Extend retpoline safety calculations for eIBRS and Atom parts
All currently-released Atom processors are in practice retpoline-safe, because
they don't fall back to a BTB prediction on RSB underflow.
However, an additional meaning of Enhanced IBRS is that the processor may not
be retpoline-safe. The Gemini Lake platform, based on the Goldmont Plus
microarchitecture, is the first Atom processor to support eIBRS.
Until Xen gets full eIBRS support, Gemini Lake will still be safe using
regular IBRS.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Igor Druzhinin [Mon, 18 Mar 2019 15:29:21 +0000 (16:29 +0100)]
x86/hvm: finish IOREQs correctly on completion path
Since the introduction of linear_{read,write}() helpers in 3bdec530a5
(x86/HVM: split page straddling emulated accesses in more cases) the
completion path for IOREQs has been broken: if there is an IOREQ in
progress but hvm_copy_{to,from}_guest_linear() returns HVMTRANS_okay
(e.g. when P2M type of source/destination has been changed by IOREQ
handler) the execution will never re-enter hvmemul_do_io() where
IOREQs are completed. This usually results in a domain crash upon
the execution of the next IOREQ entering hvmemul_do_io() and finding
the remnants of the previous IOREQ in the state machine.
This particular issue has been discovered in relation to the p2m_ioreq_server
type, where an emulator changed the memory type between p2m_ioreq_server
and p2m_ram_rw in the process of responding to an IOREQ, which made
hvm_copy_..() behave differently on the way back.
Fix it for now by checking if IOREQ completion is required (which
can be identified by querying the MMIO cache) before trying to finish
a memory access immediately through hvm_copy_..(), and re-entering
hvmemul_do_io() otherwise. This change alone only addresses the IOREQ
completion issue for a P2M type changing from MMIO to RAM in the
middle of emulation, but leaves a case where new IOREQs might be
introduced by P2M changes from RAM to MMIO (which is less likely
to be hit in practice) and which requires more substantial changes in
the MMIO emulation code.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Igor Druzhinin [Mon, 18 Mar 2019 15:28:45 +0000 (16:28 +0100)]
x86/hvm: split all linear reads and writes at page boundary
Ruling out page straddling at the linear level makes it easier to
distinguish chunks that require proper handling as MMIO accesses,
rather than completing them prematurely as page-straddling memory
transactions. This doesn't change the general behavior.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Juergen Gross [Mon, 18 Mar 2019 10:40:32 +0000 (11:40 +0100)]
xen/debug: make debugtrace more clever regarding repeating entries
In case debugtrace is writing to memory and the last entry is repeated,
don't fill up the trace buffer; instead, modify the count prefix to "x-y "
style.
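As an illustration of the idea only (a minimal, self-contained sketch with made-up buffer handling, not the actual console.c change), a repeated message rewinds the previous record and re-emits it with a range prefix:

    #include <stdio.h>
    #include <string.h>

    static char buf[4096];           /* trace buffer (wrap handling omitted) */
    static size_t used;              /* bytes used in buf */
    static size_t last_start;        /* offset of the most recent record */
    static char last_msg[256];       /* copy of the most recent message */
    static unsigned long count;      /* running entry number */
    static unsigned long first_rep;  /* entry number of the first repeat */

    static void trace(const char *msg)
    {
        count++;
        if ( !strcmp(msg, last_msg) )
        {
            /* Repeat: overwrite the previous record with an "x-y " prefix. */
            used = last_start;
            used += snprintf(buf + used, sizeof(buf) - used,
                             "%lu-%lu %s\n", first_rep, count, msg);
            return;
        }
        /* New message: start a fresh record with a plain count prefix. */
        last_start = used;
        first_rep = count;
        snprintf(last_msg, sizeof(last_msg), "%s", msg);
        used += snprintf(buf + used, sizeof(buf) - used, "%lu %s\n", count, msg);
    }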
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Mon, 18 Mar 2019 10:38:36 +0000 (11:38 +0100)]
x86/e820: fix build with gcc9
e820.c: In function ‘clip_to_limit’:
.../xen/include/asm/string.h:10:26: error: ‘__builtin_memmove’ offset [-16, -36] is out of the bounds [0, 20484] of object ‘e820’ with type ‘struct e820map’ [-Werror=array-bounds]
10 | #define memmove(d, s, n) __builtin_memmove(d, s, n)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
e820.c:404:13: note: in expansion of macro ‘memmove’
404 | memmove(&e820.map[i], &e820.map[i+1],
| ^~~~~~~
e820.c:36:16: note: ‘e820’ declared here
36 | struct e820map e820;
| ^~~~
While I can't see where the negative offsets would come from, converting
the loop index to an unsigned type helps. Take the opportunity and also
convert several other local variables and copy_e820_map()'s second
parameter to unsigned int (and bool in one case).
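The construct involved is roughly of the following shape (simplified sketch, not the exact e820.c code); with a plain int index, gcc 9's -Warray-bounds concludes that &map[i] might lie below the array, and making the index unsigned silences it:

    #include <string.h>

    struct entry { unsigned long start, size; };
    static struct entry map[128];
    static unsigned int nr_map;

    /* Remove entry i by shifting the tail down; "i" used to be plain int. */
    static void drop_entry(unsigned int i)
    {
        memmove(&map[i], &map[i + 1],
                (nr_map - i - 1) * sizeof(*map));
        nr_map--;
    }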
Reported-by: Charles Arnold <carnold@suse.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 12 Feb 2019 18:33:30 +0000 (18:33 +0000)]
x86/svm: Improve code generation from cpu_has_svm_feature()
Taking svm_feature_flags by pointer and using test_bit() results in generated
code which loads svm_feature_flags into a 32bit register, then does a bitwise
operation.
The logic can be expressed as a straight bitwise operation against the
variable directly, resulting in a minor improvement to the generated code.
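The change is roughly of the following shape (an illustrative sketch, not the verbatim patch):

    extern unsigned int svm_feature_flags;

    /* Before (roughly): test_bit(feat, &svm_feature_flags) -- taking the
     * address forces the flags through a pointer before the bit test. */

    /* After: a straight bitwise test the compiler can fold into one insn. */
    static inline bool cpu_has_svm_feature(unsigned int feat)
    {
        return svm_feature_flags & (1u << feat);
    }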
Andrew Cooper [Thu, 14 Feb 2019 11:10:09 +0000 (11:10 +0000)]
x86/pv: Fix construction of 32bit dom0's
dom0_construct_pv() has logic to transition dom0 into a compat domain when
booting an ELF32 image.
One aspect which is missing is the CPUID policy recalculation, meaning that a
32bit dom0 sees a 64bit policy, which differs by the Long Mode feature flag in
particular. Another missing item is the x87_fip_width initialisation.
Update dom0_construct_pv() to use switch_compat(), rather than retaining the
opencoding. Position the call to switch_compat() such that the compat32 local
variable can disappear entirely.
The 32bit monitor table is now created by setup_compat_l4(), avoiding the need
for manual creation later.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 15 Mar 2019 12:28:46 +0000 (12:28 +0000)]
xen/x86: Fix clflush()'s parameter tracking
Forcing a register operand hides (from the compiler) the fact that clflush
behaves as a read from the memory operand (wrt memory order, faults, etc.).
It also reduces the compiler's flexibility with register scheduling.
Re-implement clflush() (and wbinvd() for consistency) as a static inline rather
than a macro, and have it take a const void pointer.
In practice, the only generated code which gets modified by this is in
mwait_idle_with_hints(), where a disp8 encoding now gets used.
While here, I noticed that &mwait_wakeup(cpu) was being calculated twice.
This is caused by the memory clobber in mb(), so take the opportunity to help
the optimiser by calculating it once, ahead of time. bloat-o-meter reports a
delta of -26 as a result of this change.
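A minimal sketch of the resulting helper (assumed shape, not the verbatim patch); expressing the target as a memory operand tells the compiler that clflush reads *p and lets it pick the addressing mode, hence the disp8 encoding mentioned above:

    static inline void clflush(const void *p)
    {
        /* "m" operand: the compiler knows this reads *p. */
        asm volatile ( "clflush %0" :: "m" (*(const char *)p) );
    }

    static inline void wbinvd(void)
    {
        asm volatile ( "wbinvd" ::: "memory" );
    }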
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 15 Mar 2019 13:18:04 +0000 (13:18 +0000)]
xen: drop the nop() macro
There isn't a plausible reason to insert nops into code in this manner.
The sole use is in do_debug_key(), and exists to prevent the compiler from
optimising the tail of the function into 'jmp debugger_trap_fatal'.
In practice, a compiler barrier suffices just as well to prevent the tailcall,
and doesn't involve inserting unnecessary instructions.
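For illustration only (a sketch; the real call site and definitions differ in detail), the difference between the two is:

    /* nop(): emits an actual NOP instruction into the code stream. */
    #define nop()     asm volatile ( "nop" )

    /* barrier(): a compiler-only memory barrier, emits no instructions. */
    #define barrier() asm volatile ( "" ::: "memory" )

    extern void fatal_handler(void *regs);   /* stand-in for the real call */

    void keyhandler_like(void *regs)
    {
        fatal_handler(regs);
        barrier();   /* stops the call above being optimised into a tailcall */
    }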
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <julien.grall@arm.com>
Wei Liu [Fri, 24 Aug 2018 15:22:47 +0000 (16:22 +0100)]
automation: enable building rombios with clang
Previously it was disabled because the embedded ipxe couldn't be built
with clang. Now that ipxe is split out, we can use --with-system-ipxe
to work around the issue.
Juergen Gross [Thu, 14 Mar 2019 15:42:05 +0000 (16:42 +0100)]
introduce a cpumask with all bits set
There are several places in Xen allocating a cpumask on the stack and
setting all bits in it just to use it as an initial mask for allowing
all cpus.
Save the stack space and avoid the need for runtime initialization by
defining a globally accessible cpumask_all variable.
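The new global is roughly of the following shape (illustrative; the exact definition in the patch may differ), so callers can pass &cpumask_all instead of building a fully-set mask on the stack with cpumask_setall():

    /* All NR_CPUS bits set, initialised at build time rather than runtime. */
    const cpumask_t cpumask_all = {
        .bits[0 ... BITS_TO_LONGS(NR_CPUS) - 1] = ~0UL
    };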
Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Thu, 14 Mar 2019 15:38:39 +0000 (16:38 +0100)]
tools: re-sync CPUID leaf 7 tables
Bring libxl's in line with the public header, and update xen-cpuid's to
the latest information available in Intel's documentation (SDM ver 068
and ISA extensions ver 035), with (as before) the exception of MAWAU.
Some pre-existing strings get changed to match SDM naming. This should
be benign in xen-cpuid, and I hope it's also acceptable in libxl, where
people actually using the slightly wrong names would have to update
their guest config files.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Wed, 13 Mar 2019 13:54:46 +0000 (13:54 +0000)]
automation: use python-dev in Debian and Ubuntu
... instead of python2.7-dev.
We installed python2.7-dev because xen only worked with 2.7.
Installing python2.7-dev only gives python2.7-config, which causes
configure to fail because it wants python-config by default. Now that xen
works with 2.6 and above, we can install python-dev and let distros pick
the default python.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Wed, 13 Mar 2019 10:26:38 +0000 (11:26 +0100)]
MAINTAINERS: add myself as maintainer for public I/O interfaces
The "PUBLIC I/O INTERFACES AND PV DRIVERS DESIGNS" section of the
MAINTAINERS file lists Konrad as the only maintainer. Add myself to
help him review patches.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Wed, 13 Mar 2019 10:25:49 +0000 (11:25 +0100)]
string: fix type use in strstr()
Using plain int for string lengths, while okay for all practical
purposes, is undesirable in a generic library function.
Take the opportunity and also move the function from the middle of the
mem*() ones to the set of str*() ones, convert its loop from while()
to for(), and correct style.
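The resulting function is roughly of the following shape (a sketch; the committed version may differ in detail):

    #include <string.h>

    char *(strstr)(const char *s1, const char *s2)
    {
        size_t l1, l2 = strlen(s2);

        if ( !l2 )
            return (char *)s1;

        for ( l1 = strlen(s1); l1 >= l2; --l1, ++s1 )
            if ( !memcmp(s1, s2, l2) )
                return (char *)s1;

        return NULL;
    }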
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 13 Mar 2019 10:25:04 +0000 (11:25 +0100)]
string: remove memscan()
It has no users, so rather than fixing its use of types (first and
foremost c would need to be cast to unsigned char in the comparison
expression), drop it altogether. memchr() ought to be fine for all
purposes.
Take the opportunity and also do some stylistic adjustments to its
surviving sibling function memchr().
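For reference, memchr() covers the same ground memscan() did, modulo returning NULL rather than the end pointer on a miss; a typical implementation (sketch, note the unsigned char cast mentioned above) looks like:

    #include <string.h>

    void *(memchr)(const void *s, int c, size_t n)
    {
        const unsigned char *p = s;

        for ( ; n--; p++ )
            if ( *p == (unsigned char)c )
                return (void *)p;

        return NULL;
    }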
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 13 Mar 2019 10:23:28 +0000 (11:23 +0100)]
string: avoid undefined behavior in strrchr()
The pre-decrement would not only cause misbehavior when wrapping (benign
because there shouldn't be any NULL pointers passed in), but may also
create a pointer pointing outside the object that the passed-in pointer
points to (it won't be de-referenced though).
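One way to express the loop without ever forming a pointer below s (illustrative; the committed version may differ in detail):

    #include <string.h>

    char *(strrchr)(const char *s, int c)
    {
        const char *p = s + strlen(s);   /* start at the terminating NUL */

        for ( ; ; --p )
        {
            if ( *p == (char)c )
                return (char *)p;
            if ( p == s )                /* stop before decrementing past s */
                return NULL;
        }
    }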
Take the opportunity and also
- convert bogus space indentation (partly 7 spaces) to Linux-style tab
  indentation,
- add two blank lines.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Juergen Gross <jgross@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 12 Mar 2019 18:11:35 +0000 (18:11 +0000)]
SVM: fix build after "make nested page-fault tracing and logging consistent"
Some compiler versions don't recognize that "mfn" can't really be used
uninitialized in svm_do_nested_pgfault(). To be on the safe side, add an
initializer for p2mt as well.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/tsx: Implement controls for RTM force-abort mode
The CPUID bit and MSR are deliberately not exposed to guests, because they
won't exist on newer processors. As vPMU isn't security supported, the
misbehaviour of PCR3 isn't expected to impact production deployments.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Tue, 5 Mar 2019 14:13:17 +0000 (14:13 +0000)]
pygrub/fsimage: make it work with python 3
With the help of two porting guides and cpython source code:
1. Use PyBytes to replace PyString counterparts.
2. Use PyVarObject_HEAD_INIT.
3. Remove usage of Py_FindMethod.
4. Use new module initialisation routine.
For #3, Py_FindMethod was removed, yet an alternative wasn't
documented. The code is the result of reverse-engineering cpython
commit 6116d4a1d1.
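For reference, the Py2/Py3-compatible module initialisation pattern looks roughly like this (generic sketch; the module and method names here are placeholders, not the actual fsimage.c code):

    #include <Python.h>

    static PyMethodDef example_methods[] = {
        { NULL, NULL, 0, NULL }          /* sentinel */
    };

    #if PY_MAJOR_VERSION >= 3
    static struct PyModuleDef example_module = {
        PyModuleDef_HEAD_INIT, "example", NULL, -1, example_methods
    };

    PyMODINIT_FUNC PyInit_example(void)
    {
        return PyModule_Create(&example_module);
    }
    #else
    PyMODINIT_FUNC initexample(void)
    {
        Py_InitModule("example", example_methods);
    }
    #endif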
Wei Liu [Thu, 7 Mar 2019 12:45:47 +0000 (12:45 +0000)]
pygrub: make python scripts work with 2.6 and up
Run 2to3 and pick the sensible suggestions.
Import print_function and absolute_import so 2.6 can work.
There has never been a curses.wrapper module according to the 2.x and 3.x
docs, only a function, so "import curses.wrapper" is not correct. It
happened to work because 2.x implemented an (undocumented) module.
We only need to import curses to make curses.wrapper available to
pygrub.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 21 Jun 2018 14:35:49 +0000 (16:35 +0200)]
libx86: Introduce a helper to deserialise cpuid_policy objects
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Thu, 21 Jun 2018 14:35:50 +0000 (16:35 +0200)]
libx86: introduce a helper to deserialise msr_policy objects
As with the serialise side, Xen's copy_from_guest API is used, with a
compatibility wrapper for the userspace build.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 27 Nov 2018 17:21:17 +0000 (17:21 +0000)]
x86/vtd: Don't include control register state in the table pointers
iremap_maddr and qinval_maddr point to the base of a block of contiguous RAM,
allocated by the driver, holding the Interrupt Remapping table, and the Queued
Invalidation ring.
Despite their names, they are actually the values of the hardware registers,
including control metadata in the lower 12 bits. While uses of these fields
do appear to correctly shift out the metadata, this is very subtle behaviour
and confusing to follow.
Nothing uses the metadata, so make the fields actually point at the base of
the relevant tables.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Fri, 22 Feb 2019 13:28:16 +0000 (13:28 +0000)]
x86/xstate: Don't special case feature collection
The logic in xstate_init() is a remnant of the pre-featuremask days.
Collect the xstate features in generic_identify(), like all other feature
leaves, after which identify_cpu() will apply the known_feature[] mask derived
from the automatically generated CPUID information.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 12 Mar 2019 13:40:24 +0000 (14:40 +0100)]
events: drop arch_evtchn_inject()
Have the only user call vcpu_mark_events_pending() instead, at the same
time arranging for correct ordering of the writes (evtchn_pending_sel
should be written before evtchn_upcall_pending).
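Conceptually, the ordering requirement looks like this (field and helper names simplified for illustration, not the actual Xen structures): the selector word must be visible before the global upcall-pending flag, so a consumer that sees the flag also finds a non-empty selector.

    struct vcpu_info_like {
        unsigned long evtchn_pending_sel;    /* which port group is pending */
        unsigned char evtchn_upcall_pending; /* "some event is pending" flag */
    };

    static void mark_event_pending(struct vcpu_info_like *vi, unsigned int group)
    {
        __atomic_or_fetch(&vi->evtchn_pending_sel, 1UL << group, __ATOMIC_RELAXED);
        __atomic_thread_fence(__ATOMIC_RELEASE);   /* selector before flag */
        __atomic_store_n(&vi->evtchn_upcall_pending, 1, __ATOMIC_RELAXED);
    }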
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com>
Jan Beulich [Tue, 12 Mar 2019 13:39:46 +0000 (14:39 +0100)]
x86/HVM: don't crash guest in hvmemul_find_mmio_cache()
Commit 35a61c05ea ("x86emul: adjust handling of AVX2 gathers") builds
upon the fact that the domain will actually survive running out of MMIO
result buffer space. Drop the domain_crash() invocation. Also delay
incrementing of the usage counter, such that the function can't possibly
use/return an out-of-bounds slot/pointer in case execution subsequently
makes it into the function again without a prior reset of state.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Igor Druzhinin [Tue, 12 Mar 2019 13:38:12 +0000 (14:38 +0100)]
iommu: leave IOMMU enabled by default during kexec crash transition
It's unsafe to disable the IOMMU on a live system, which is the case
if we're crashing, since remapping hardware doesn't usually know what
to do with ongoing bus transactions and frequently raises NMI/MCE/SMI,
etc. (depending on the firmware configuration) to signal these abnormalities.
This, in turn, doesn't play well with the kexec transition process, as there
is no handling available at the moment for this kind of event, resulting
in failures to enter the kernel.
Modern Linux kernels have been taught to copy all the necessary DMAR/IR
tables across kexec from the previous kernel (Xen in our case) - so it's
currently normal to keep the IOMMU enabled. It might require minor changes
to the kdump command line to enable the IOMMU drivers (e.g. intel_iommu=on /
intremap=on), but recent kernels don't require any additional changes for
the transition to be transparent.
A fallback option is still left for compatibility with ancient crash
kernels which didn't like to have the IOMMU active under their feet on boot.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 12 Mar 2019 13:36:56 +0000 (14:36 +0100)]
SVM: make nested page-fault tracing and logging consistent
Don't call __get_gfn_type_access() more than once, to make sure data
recorded for xentrace matches up with what gets logged in case of the
domain getting crashed.
As a side effect this also eliminates a type mismatch reported by
Norbert Manthey, as the first call now also needs to update the local
variable "p2mt".
Do a few cosmetics at the same time: Move a comment up a little, drop
the pointless "case 0" (seeing in particular the comment's wording),
and correct formatting.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Olaf Hering [Fri, 8 Mar 2019 12:24:15 +0000 (13:24 +0100)]
libxl: prepare environment for domcreate_stream_done
The function domcreate_bootloader_done may branch early to
domcreate_stream_done, in case some error occurred. Here srs->dcs will be
NULL, which leads to a crash.
It is unclear what the purpose of that backpointer is. Perhaps it can be
removed, and domcreate_stream_done could use CONTAINER_OF.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: fold in comment required by Ian ] Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Mon, 11 Mar 2019 19:18:40 +0000 (19:18 +0000)]
tools/xen-foreign: Update python scripts to be Py3 compatible
The issues are:
* dict.has_key() was completely removed in Py3
* dict.keys() is an iterable rather than a list in Py3, so .sort() doesn't work.
* list.sort(cmp=) was deprecated in Py2.4 and removed in Py3.
The has_key() issue is trivially fixed by switching to using the in keyword.
The sorting issue could be trivially fixed, but take the opportunity to
improve the code.
The reason for the sorting is to ensure that "unsigned long" gets replaced
before "long", and the only reason sorting is necessary is because
inttypes[arch] is needlessly a dictionary. Update inttypes[arch] to be a list
of tuples rather than a dictionary, and process them in list order.
Reported-by: George Dunlap <george.dunlap@eu.citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Thu, 21 Feb 2019 17:36:13 +0000 (18:36 +0100)]
tools: add link path flag for local build to pkg-config files
The qemu build process requires the link path of the Xen libraries
to be specified both with -L and -Wl,-rpath-link. Add the -L flag
to the local pkg-config files.
At the same time let the pkg-config files depend on the Makefile
creating them, too.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Petre Pircalabu [Thu, 14 Feb 2019 14:18:11 +0000 (16:18 +0200)]
vm_event: Add a new opcode to get VM_EVENT_INTERFACE_VERSION
Currently, the VM_EVENT_INTERFACE_VERSION is determined at runtime, by
inspecting the corresponding field in a vm_event_request. This helper
opcode will query the hypervisor-supported version before the vm_event
related structures and layout are set up.
Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
George Dunlap [Tue, 5 Mar 2019 12:48:52 +0000 (12:48 +0000)]
README: Document python2 dependency
Much of the tools and configure makefiles actually have a python2
dependency; specify this. It is also assumed that `python` points to
`python2`; document how to work around this on systems where that is false.
Also update the second version requirement listed to match the first.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Tue, 5 Mar 2019 17:04:23 +0000 (18:04 +0100)]
x86/cpuid: add missing PCLMULQDQ dependency
Since we can't seem to be able to settle our discussion for the wider
adjustment previously posted, let's at least add the missing dependency
for 4.12. I'm not convinced though that attaching it to SSE is correct.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Roger Pau Monné [Tue, 5 Mar 2019 16:41:14 +0000 (17:41 +0100)]
x86/dom0: propagate PVH vlapic EOIs to hardware
The current check for MSI EOI is missing a special case for PVH Dom0,
which doesn't have an hvm_irq_dpci struct but requires EOIs to be
forwarded to the physical lapic for passed-through devices.
Add a short-circuit to allow EOIs from PVH Dom0 to be propagated.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Ian Jackson [Tue, 5 Mar 2019 15:31:32 +0000 (15:31 +0000)]
tools/libfsimage: Add `XEN' to environment variable name
This library, which is private to Xen and was properly namespaced in
1a814711881beb17f073f5f57e27e5bd4da1b956 ("tools/libfsimage: Add `xen' to
.h names and principal .so name"), honours an environment variable to
override the directory where shared objects (i.e. filesystem plugins) are
to be loaded from.
Rename that variable from FSIMAGE_FSDIR to XEN_FSIMAGE_FSDIR, to give
it a proper namespace prefix.
Nothing in xen.git sets this variable. The three hits for the string
`FSIMAGE_FSDIR' are this getenv, and two references to a compile-time
manifest constant which provides the default value (the -D which sets
it, and the place it is used).
I have also checked the current Debian Xen package in buster and the
variable is not set there either.
CC: Andrew Cooper <andrew.cooper3@citrix.com> CC: Jan Beulich <JBeulich@suse.com> CC: George Dunlap <george.dunlap@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Tue, 5 Mar 2019 16:02:36 +0000 (17:02 +0100)]
x86/mm: fix #GP(0) in switch_cr3_cr4()
With "pcid=no-xpti" and opposite XPTI settings in two 64-bit PV domains
(achievable with one of "xpti=no-dom0" or "xpti=no-domu"), switching
from a PCID-disabled to a PCID-enabled 64-bit PV domain fails to set
CR4.PCIDE in time, as CR4.PGE would not be set in either (see
pv_fixup_guest_cr4(), in particular as used by write_ptbase()), and
hence the early CR4 write would be skipped.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Tue, 5 Mar 2019 12:54:42 +0000 (13:54 +0100)]
x86/pv: _toggle_guest_pt() may not skip TLB flush for shadow mode guests
For shadow mode guests (e.g. PV ones forced into that mode as L1TF
mitigation, or during migration) update_cr3() -> sh_update_cr3() may
result in a change to the (shadow) root page table (compared to the
previous one when running the same vCPU with the same PCID). This can,
first and foremost, be a result of memory pressure on the shadow memory
pool of the domain. Shadow code legitimately relies on the original
(prior to commit 5c81d260c2 ["xen/x86: use PCID feature"]) behavior of
the subsequent CR3 write to flush the TLB of entries still left from
walks with an earlier, different (shadow) root page table.
Restore the flushing behavior, also for the second CR3 write on the exit
path to guest context when XPTI is active. For the moment accept that
this will introduce more flushes than are strictly necessary - no flush
would be needed when the (shadow) root page table doesn't actually
change, but this information isn't readily (i.e. without introducing a
layering violation) available here.
This is XSA-294.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 5 Mar 2019 12:54:05 +0000 (13:54 +0100)]
x86/pv: Don't have %cr4.fsgsbase active behind a guest kernels back
Currently, a 64bit PV guest can appear to set and clear FSGSBASE in %cr4, but
the bit remains set in hardware. Therefore, the {RD,WR}{FS,GS}BASE are usable
even when the guest kernel believes that they are disabled.
The FSGSBASE feature isn't currently supported in Linux, and its context
switch path has some optimisations which rely on userspace being unable to use
the WR{FS,GS}BASE instructions. Xen's current behaviour undermines this
expectation.
In 64bit PV guest context, always load the guest kernel's setting of FSGSBASE
into %cr4. This requires adjusting how Xen uses the {RD,WR}{FS,GS}BASE
instructions.
* Delete the cpu_has_fsgsbase helper. It is no longer safe, as users need to
check %cr4 directly.
* The raw __rd{fs,gs}base() helpers are only safe to use when %cr4.fsgsbase
is set. Comment this property.
* The {rd,wr}{fs,gs}{base,shadow}() and read_msr() helpers are updated to use
the current %cr4 value to determine which mechanism to use.
* toggle_guest_mode() and save_segments() are updated to avoid reading
fs/gsbase if the values in hardware cannot be stale WRT struct vcpu. A
consequence of this is that the write_cr() path needs to cache the current
bases, as subsequent context switches will skip saving the values.
* write_cr4() is updated to ensure that the shadow %cr4.fsgsbase value is
observed in a safe way WRT the hardware setting, if an interrupt happens to
hit in the middle.
* load_segments() is updated to use the VMLOAD optimisation if FSGSBASE is
unavailable, even if only gs_shadow needs updating. As a minor perf
improvement, check cpu_has_svm first to short-circuit a context-dependent
conditional on Intel hardware.
* pv_make_cr4() is updated for 64bit PV guests to use the guest kernel's
choice of FSGSBASE.
This is part of XSA-293.
Reported-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 5 Mar 2019 12:53:32 +0000 (13:53 +0100)]
x86/pv: Rewrite guest %cr4 handling from scratch
The PV cr4 logic is almost impossible to follow, and leaks bits into guest
context which definitely shouldn't be visible (in particular, VMXE).
The biggest problem however, and source of the complexity, is that it derives
new real and guest cr4 values from the current value in hardware - this is
context dependent and an inappropriate source of information.
Rewrite the cr4 logic to be invariant of the current value in hardware.
First of all, modify write_ptbase() to always use mmu_cr4_features for IDLE
and HVM contexts. mmu_cr4_features *is* the correct value to use, and makes
the ASSERT() obviously redundant.
For PV guests, curr->arch.pv.ctrlreg[4] remains the guest's view of cr4, but
all logic gets reworked in terms of this and mmu_cr4_features only.
Two masks are introduced; bits which the guest has control over, and bits
which are forwarded from Xen's settings. One guest-visible change here is
that Xen's VMXE setting is no longer visible at all.
pv_make_cr4() follows fairly closely from pv_guest_cr4_to_real_cr4(), but
deliberately starts with mmu_cr4_features, and only alters the minimal subset
of bits.
The boot-time {compat_,}pv_cr4_mask variables are removed, as they are a
remnant of the pre-CPUID policy days. pv_fixup_guest_cr4() gains a related
derivation from the policy.
Another guest-visible change here is that a 32bit PV guest can now flip
FSGSBASE in its view of CR4. While the {RD,WR}{FS,GS}BASE instructions are
unusable outside of a 64bit code segment, the ability to modify FSGSBASE
matches real hardware behaviour, and avoids the need for any 32bit/64bit
differences in the logic.
Overall, this patch shouldn't have a practical change in guest behaviour.
VMXE will disappear from view, and an inquisitive 32bit kernel can now see
FSGSBASE changing, but this new logic is otherwise bug-compatible with before.
This is part of XSA-293.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 5 Mar 2019 12:52:44 +0000 (13:52 +0100)]
x86/mm: properly flush TLB in switch_cr3_cr4()
The CR3 values used for contexts run with PCID enabled uniformly have
CR3.NOFLUSH set, resulting in the CR3 write itself to not cause any
flushing at all. When the second CR4 write is skipped or doesn't do any
flushing, there's nothing so far which would purge TLB entries which may
have accumulated again if the PCID doesn't change; the "just in case"
flush only affects the case where the PCID actually changes. (There may
be particularly many TLB entries re-accumulated in case of a watchdog
NMI kicking in during the critical time window.)
Suppress the no-flush behavior of the CR3 write in this particular case.
Similarly the second CR4 write may not cause any flushing of TLB entries
established again while the original PCID was still in use - it may get
performed because of unrelated bits changing. The flush of the old PCID
needs to happen nevertheless.
At the same time also eliminate a possible race with lazy context
switch: Just like for CR4, CR3 may change at any time while interrupts
are enabled, due to the __sync_local_execstate() invocation from the
flush IPI handler. It is for that reason that the CR3 read, just like
the CR4 one, must happen only after interrupts have been turned off.
This is XSA-292.
Reported-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Jan Beulich [Tue, 5 Mar 2019 12:52:15 +0000 (13:52 +0100)]
x86/mm: don't retain page type reference when IOMMU operation fails
The IOMMU update in _get_page_type() happens between recording of the
new reference and validation of the page for its new type (if
necessary). If the IOMMU operation fails, there's no point in actually
carrying out validation. Furthermore, with this resulting in failure
getting indicated to the caller, the recorded type reference also needs
to be dropped again.
Note that in case of failure of alloc_page_type() there's no need to
undo the IOMMU operation: Only special types get handed to the function.
The function, upon failure, clears ->u.inuse.type_info, effectively
converting the page to PGT_none. The IOMMU mapping, however, solely
depends on whether the type is PGT_writable_page.
This is XSA-291.
Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 5 Mar 2019 12:51:46 +0000 (13:51 +0100)]
x86/mm: add explicit preemption checks to L3 (un)validation
When recursive page tables are used at the L3 level, unvalidation of a
single L4 table may incur unvalidation of two levels of L3 tables, i.e.
a maximum iteration count of 512^3 for unvalidating an L4 table. The
preemption check in free_l2_table() as well as the one in
_put_page_type() may never be reached, so explicit checking is needed in
free_l3_table().
When recursive page tables are used at the L4 level, the iteration count
at L4 alone is capped at 512^2. As soon as a present L3 entry is hit
which itself needs unvalidation (and hence requiring another nested loop
with 512 iterations), the preemption checks added here kick in, so no
further preemption checking is needed at L4 (until we decide to permit
5-level paging for PV guests).
The validation side additions are done just for symmetry.
This is part of XSA-290.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 5 Mar 2019 12:51:18 +0000 (13:51 +0100)]
x86/mm: also allow L2 (un)validation to be fully preemptible
Commit c612481d1c ("x86/mm: Plumbing to allow any PTE update to fail
with -ERESTART") added assertions next to the {alloc,free}_l2_table()
invocations to document (and validate in debug builds) that L2
(un)validations are always preemptible.
The assertion in free_page_type() has now been observed to trigger when
recursive L2 page tables get cleaned up.
In particular put_page_from_l2e()'s assumption that _put_page_type()
would always succeed is now wrong, resulting in a partially un-validated
page left in a domain, which has no other means of getting cleaned up
later on. If not causing any problems earlier, this would ultimately
trigger the check for ->u.inuse.type_info having a zero count when
freeing the page during cleanup after the domain has died.
As a result it should be considered a mistake to not have extended
preemption fully to L2 when it was added to L3/L4 table handling, which
this change aims to correct.
The validation side additions are done just for symmetry.
This is part of XSA-290.
Reported-by: Manuel Bouyer <bouyer@antioche.eu.org> Tested-by: Manuel Bouyer <bouyer@antioche.eu.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
George Dunlap [Wed, 23 Jan 2019 11:57:46 +0000 (11:57 +0000)]
xen: Make coherent PV IOMMU discipline
In order for a PV domain to set up DMA from a passed-through device to
one of its pages, the page must be mapped in the IOMMU. On the other
hand, before a PV page may be used as a "special" page type (such as a
pagetable or descriptor table), it _must not_ be writable in the IOMMU
(otherwise a malicious guest could DMA arbitrary page tables into the
memory, bypassing Xen's safety checks); and Xen's current rule is to
have such pages not in the IOMMU at all.
At the moment, in order to accomplish this, the code borrows HVM
domain's "physmap" concept: When a page is assigned to a guest,
guest_physmap_add_entry() is called, which, for PV guests, will create
a writable IOMMU mapping; and when a page is removed,
guest_physmap_remove_entry() is called, which will remove the mapping.
Additionally, when a page gains the PGT_writable page type, the page
will be added into the IOMMU; and when the page changes away from a
PGT_writable type, the page will be removed from the IOMMU.
Unfortunately, borrowing the "physmap" concept from HVM domains is
problematic. HVM domains have a lock on their p2m tables, ensuring
synchronization between modifications to the p2m; and all hypercall
parameters must first be translated through the p2m before being used.
Trying to mix this locked-and-gated approach with PV's lock-free
approach leads to several races and inconsistencies:
* A race between a page being assigned and it being put into the
physmap; for example:
- P1: call populate_physmap() { A = allocate_domheap_pages() }
- P2: Guess page A's mfn, and call decrease_reservation(A). A is owned by the domain,
and so Xen will clear the PGC_allocated bit and free the page
- P1: finishes populate_physmap() { guest_physmap_add_entry() }
Now the domain has a writable IOMMU mapping to a page it no longer owns.
* Pages start out as type PGT_none, but with a writable IOMMU mapping.
If a guest uses a page as a page table without ever having created a
writable mapping, the IOMMU mapping will not be removed; the guest
will have a writable IOMMU mapping to a page it is currently using
as a page table.
* A newly-allocated page can be DMA'd into with no special actions on
the part of the guest; however, if a page is promoted to a
non-writable type, the page must be mapped with a writable type before
DMA'ing to it again, or the transaction will fail.
To fix this, do away with the "PV physmap" concept entirely, and
replace it with the following IOMMU discipline for PV guests:
- (type == PGT_writable) <=> in iommu (even if type_count == 0)
- Upon a final put_page(), check to see if type is PGT_writable; if so,
iommu_unmap.
In order to achieve that:
- Remove PV IOMMU related code from guest_physmap_*
- Repurpose cleanup_page_cacheattr() into a general
cleanup_page_mappings() function, which will both fix up Xen
mappings for pages with special cache attributes, and also check for
a PGT_writable type and remove pages if appropriate.
- For compatibility with current guests, grab-and-release a
PGT_writable_page type for PV guests in guest_physmap_add_entry().
This will cause most "normal" guest pages to start out life with
PGT_writable_page type (and thus an IOMMU mapping), but no type
count (so that they can be used as special cases at will).
Also, note that there is one exception to the "PGT_writable => in
iommu" rule: xenheap pages shared with guests may be given a
PGT_writable type with one type reference. This reference prevents
the type from changing, which in turn prevents page from gaining an
IOMMU mapping in get_page_type(). It's not clear whether this was
intentional or not, but it's not something to change in a security
update.
This is XSA-288.
Reported-by: Paul Durrant <paul.durrant@citrix.com> Signed-off-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com>
steal_page, in a misguided attempt to protect against unknown races,
violates both of these rules, thus introducing other races:
- Temporarily, the count_info has the refcount go to 0 while
PGC_allocated is set
- It explicitly returns the page PGC_allocated set, but owner == NULL
and page not on the page_list.
The second one meant that page_get_owner_and_reference() could return
NULL even after having successfully grabbed a reference on the page,
leading the caller to leak the reference (since "couldn't get ref" and
"got ref but no owner" look the same).
Furthermore, rather than grabbing a page reference to ensure that the
owner doesn't change under its feet, it appears to rely on holding
d->page_alloc lock to prevent this.
Unfortunately, this is ineffective: page->owner remains non-NULL for
some time after the count has been set to 0; meaning that it would be
entirely possible for the page to be freed and re-allocated to a
different domain between the page_get_owner() check and the count_info
check.
Modify steal_page to instead follow the appropriate access discipline,
taking the page through series of states similar to being freed and
then re-allocated with MEMF_no_owner:
- Grab an extra reference to make sure we don't race with anyone else
freeing the page
- Drop both references and PGC_allocated atomically, so that (if
successful), anyone else trying to grab a reference will fail
- Attempt to reset Xen's mappings
- Reset the rest of the state.
Then, modify the two callers appropriately:
- Leave count_info alone (it's already been cleared)
- Call free_domheap_page() directly if appropriate
- Call assign_pages() rather than open-coding a partial assign
With all callers to assign_pages() now passing in pages with the
type_info field clear, tighten the respective assertion there.
This is XSA-287.
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 5 Mar 2019 12:47:36 +0000 (13:47 +0100)]
IOMMU/x86: fix type ref-counting race upon IOMMU page table construction
When arch_iommu_populate_page_table() gets invoked for an already
running guest, simply looking at page types once isn't enough, as they
may change at any time. Add logic to re-check the type after having
mapped the page, unmapping it again if needed.
This is XSA-285.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Tentatively-Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 5 Mar 2019 12:45:58 +0000 (13:45 +0100)]
gnttab: set page refcount for copy-on-grant-transfer
Commit 5cc77f9098 ("32-on-64: Fix domain address-size clamping,
implement"), which introduced this functionality, took care of clearing
the old page's PGC_allocated, but failed to set the bit (and install the
associated reference) on the newly allocated one. Furthermore the "mfn"
local variable was never updated, and hence the wrong MFN was passed to
guest_physmap_add_page() (and back to the destination domain) in this
case, leading to an IOMMU mapping into an unowned page.
Ideally the code would use assign_pages(), but the call to
gnttab_prepare_for_transfer() sits in the middle of the actions
mirroring that function.
This is XSA-284.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Mon, 25 Feb 2019 13:06:22 +0000 (13:06 +0000)]
tools/tests: Drop obsolete test infrastructure
The regression/ directory was identified as already broken in 2012 (c/s 953953cc5). The logic is intended to test *.py files in the Xen tree against
different versions of python, but every python version it targets is obsolete as well.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
The current npt and shadow code to get an entry will always return
INVALID_MFN for foreign entries. Allow returning the entry mfn for
foreign entries, as is done for grant table entries.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
Roger Pau Monne [Wed, 27 Feb 2019 11:09:04 +0000 (12:09 +0100)]
x86/mm: handle foreign mappings in p2m_entry_modify
So that the specific handling can be removed from
atomic_write_ept_entry and be shared with npt and shadow code.
This commit also removes the check that prevents a non-ept PVH dom0 from
mapping foreign pages.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Roger Pau Monne [Wed, 27 Feb 2019 11:09:02 +0000 (12:09 +0100)]
x86/mm: split p2m ioreq server pages special handling into helper
So that it can be shared by ept, npt and shadow code, instead of
being duplicated.
No change in functionality intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Roger Pau Monne [Wed, 27 Feb 2019 11:09:01 +0000 (12:09 +0100)]
x86/p2m: pass the p2m to write_p2m_entry handlers
Current callers pass the p2m to paging_write_p2m_entry, but, due to the
handling done in paging_write_p2m_entry, the implementation-specific
handlers of the write_p2m_entry hook get a domain struct instead of
a p2m.
Change the code so that the implementations of write_p2m_entry take a
p2m instead of a domain.
This is a non-functional change, but will be used by follow-up
patches.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>