Roger Pau Monné [Thu, 17 Jun 2021 16:00:57 +0000 (18:00 +0200)]
x86/ept: force WB cache attributes for grant and foreign maps
Force WB type for grants and foreign pages. Those are usually mapped
over unpopulated physical ranges in the p2m, and those ranges would
usually be UC in the MTRR state, which is unlikely to be the correct
cache attribute. It's also cumbersome (or even impossible) for the
guest to be setting the MTRR type for all those mappings as WB, as
MTRR ranges are finite.
Note that this is not an issue on AMD because WB cache attribute is
already set on grants and foreign mappings in the p2m and MTRR types
are ignored. Also on AMD Xen cannot force a cache attribute because of
the lack of ignore PAT equivalent, so the behavior here slightly
diverges between AMD and Intel (or EPT vs NPT/shadow).
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Roger Pau Monné [Thu, 17 Jun 2021 15:58:11 +0000 (17:58 +0200)]
x86/mtrr: move epte_get_entry_emt to p2m-ept.c
This is an EPT specific function, so it shouldn't live in the generic
mtrr file. Such movement is also needed for future work that will
require passing a p2m_type_t parameter to epte_get_entry_emt, and
making that type visible to the mtrr users is cumbersome and
unneeded.
Moving epte_get_entry_emt out of mtrr.c requires making the helper to
get the MTRR type of an address from the mtrr state public. While
there rename the function to start with the mtrr prefix, like other
mtrr related functions.
While there fix some of the types of the function parameters.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Thu, 10 Jun 2021 10:01:06 +0000 (11:01 +0100)]
x86/platform: Improve MSR permission handling for XENPF_resource_op
The logic to disallow writes to the TSC is out-of-place, and should be in
check_resource_access() rather than in resource_access().
Split the existing allow_access_msr() into two - msr_{read,write}_allowed() -
and move all permissions checks here.
Furthermore, guard access to MSR_IA32_CMT_{EVTSEL,CTR} to prohibit their use
on hardware which is lacking the QoS Monitoring feature. Introduce
cpu_has_pqe to help with the logic.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Commit cf8c4d3d13b8 made some preparation to one day have a
variable-length-array argument, but didn't declare the array in the
function prototype the same way as in the function definition. And now
GCC 11 complains about it.
Fixes: cf8c4d3d13b8 ("tools/libs/foreignmemory: pull array length argument to map forward") Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 15 Jun 2021 13:15:26 +0000 (15:15 +0200)]
x86: move .altinstr_replacement past _einittext
This section's contents do not represent part of actual hypervisor text,
so shouldn't be included in what is_kernel_inittext() or (while still
booting) is_active_kernel_text() report "true" for. Keep them in
.init.text though, as there's no real reason to have a separate section
for this in the final binary.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 15 Jun 2021 13:14:20 +0000 (15:14 +0200)]
x86/vpt: fully init timers before putting onto list
With pt_vcpu_lock() no longer acquiring the pt_migrate lock, parties
iterating the list and acting on the timers of the list entries will no
longer be kept from entering their loops by create_periodic_time()'s
holding of that lock. Therefore at least init_timer() needs calling
ahead of list insertion, but keep this and set_timer() together.
Fixes: 8113b02f0bf8 ("x86/vpt: do not take pt_migrate rwlock in some cases") Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Anthony PERARD [Tue, 11 May 2021 09:28:07 +0000 (10:28 +0100)]
libxl: Assert qmp_ev's state in qmp_ev_qemu_compare_version
We are supposed to read the version information only when qmp_ev is in
state "Connected" (which corresponds to state==qmp_state_connected);
assert it so that the function isn't used too early.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Anthony PERARD [Tue, 11 May 2021 09:28:06 +0000 (10:28 +0100)]
libxl: Use -device for cd-rom drives
This allows setting an `id` on the device instead of only the drive. We
are going to need the `id` with the "eject" and
"blockdev-change-media" QMP commands, as using the `device` parameter on
those is deprecated. (`device` is the `id` of the `-drive` on the
command line.)
We set the same `id` on both -device and -drive as QEMU doesn't
complain and we can then either do "eject id=$id" or "eject
device=$id".
Using "-drive + -device" instead of only "-drive" has been
available since at least QEMU 0.15, and seems to be the preferred way as it
separates the host part (-drive, which describes the disk image location
and format) from the guest part (-device, which describes the emulated
device). More information in qemu.git/docs/qdev-device-use.txt .
Changing the command line during migration for the cdrom seems fine.
Also the documentation about migration in QEMU explains that the device
state ID is "been formed from a bus name and device address", so the
second IDE bus and first device address on that bus stay the same, and it
doesn't matter whether this is written "-drive if=ide,index=2" or "-device
ide-cd,bus=ide.1,unit=0".
See qemu.git/docs/devel/migration.rst .
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
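As a rough illustration of the two addressing styles mentioned above, the
following Python sketch builds the QMP "eject" message either by `id` or by
the deprecated `device` argument. The helper name eject_command and the id
value used in the usage note are made up for the example; only the "eject"
command and its `id`/`device` arguments come from the commit message.

```python
import json


def eject_command(name, use_deprecated_device=False):
    """Build a QMP "eject" message as a JSON string.

    `name` is the shared id given to both -device and -drive. By default
    the device is addressed with `id`; the deprecated `device` argument
    is only used on request.
    """
    key = "device" if use_deprecated_device else "id"
    return json.dumps({"execute": "eject", "arguments": {key: name}})
```

For instance, eject_command("ide-5632") addresses the drive by `id`, while
eject_command("ide-5632", True) produces the older `device` form.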
Anthony PERARD [Tue, 11 May 2021 09:28:05 +0000 (10:28 +0100)]
libxl: Replace deprecated "cpu-add" QMP command by "device_add"
The command "cpu-add" for CPU hotplug is deprecated and has been
removed from QEMU 6.0 (April 2021). We need to add cpus with the
command "device_add" now.
In order to find out which parameters to pass to "device_add" we first
make a call to "query-hotpluggable-cpus", which lists the CPU drivers
and properties.
The algorithm to figure out which CPU to add, and by extension if any
CPU needs to be hotplugged, is in the function that adds the cpus.
Because of that, the command "query-hotpluggable-cpus" is always
called, even when not needed.
In case we are using a version of QEMU older than 2.7 (Sept 2016),
which doesn't have "query-hotpluggable-cpus", we fall back to using
"cpu-add".
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
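The step from a "query-hotpluggable-cpus" reply to "device_add" calls might
look roughly like the Python sketch below. It is not libxl's algorithm: the
function name plan_cpu_hotplug, the "cpu-N" id scheme and the `wanted`
parameter are invented, and the reply layout ("type", "vcpus-count", "props",
plus "qom-path" only on slots that are already plugged) is an assumption that
should be checked against the QMP reference.

```python
def plan_cpu_hotplug(hotpluggable, wanted):
    """Return the "device_add" messages needed to reach `wanted` vCPUs.

    `hotpluggable` mimics a "query-hotpluggable-cpus" reply (assumed
    layout, see above). Slots carrying "qom-path" count as online.
    """
    online = sum(slot["vcpus-count"] for slot in hotpluggable
                 if "qom-path" in slot)
    commands = []
    for index, slot in enumerate(hotpluggable):
        if online >= wanted:
            break
        if "qom-path" in slot:
            continue  # already plugged, nothing to add
        # The driver and the topology properties come from the query
        # reply; the id is a made-up naming scheme.
        arguments = {"driver": slot["type"], "id": "cpu-%d" % index}
        arguments.update(slot["props"])
        commands.append({"execute": "device_add", "arguments": arguments})
        online += slot["vcpus-count"]
    return commands
```

An empty result means no CPU needs hotplugging, which matches the note above
that the query is issued even when nothing ends up being added.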
Anthony PERARD [Tue, 11 May 2021 09:28:04 +0000 (10:28 +0100)]
libxl: Replace QEMU's command line short-form boolean option
Short-form boolean options are deprecated in QEMU 6.0.
Upstream commit that deprecates those: ccd3b3b8112b ("qemu-option: warn
for short-form boolean options").
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
Anthony PERARD [Tue, 11 May 2021 09:28:03 +0000 (10:28 +0100)]
libxl: Replace deprecated QMP command by "query-cpus-fast"
We use the deprecated QMP command "query-cpus", which is removed in the
QEMU 6.0 release. There's a replacement, "query-cpus-fast",
which has been available since QEMU 2.12 (April 2018).
This patch tries the new command first and, when the command isn't
available, falls back to the deprecated one so libxl still works
with older QEMU versions.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Jason Andryuk <jandryuk@gmail.com>
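The try-then-fall-back order described above can be sketched as follows.
This is a minimal Python illustration, not libxl's code: query_vcpus, the
`send` callable, make_fake_qemu and CommandNotFound are hypothetical names,
and only the two QMP command names come from the commit message.

```python
class CommandNotFound(Exception):
    """Hypothetical error raised when QEMU does not know a QMP command."""


def query_vcpus(send):
    """Return (command_used, reply), preferring "query-cpus-fast".

    `send` is a hypothetical callable that issues one QMP command and
    raises CommandNotFound if the running QEMU lacks it.
    """
    try:
        return "query-cpus-fast", send("query-cpus-fast")
    except CommandNotFound:
        # Older QEMU: fall back to the deprecated command.
        return "query-cpus", send("query-cpus")


def make_fake_qemu(supported):
    """Build a stand-in for `send` that only knows the given commands."""
    def send(command):
        if command not in supported:
            raise CommandNotFound(command)
        return [{"cpu-index": 0}]
    return send
```

A QEMU that knows both commands is only ever asked for "query-cpus-fast"; the
deprecated command is sent solely after the new one was rejected.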
Jan Beulich [Mon, 14 Jun 2021 13:52:36 +0000 (15:52 +0200)]
Arm: avoid .init.data to be marked as executable
This confuses disassemblers, at the very least. Move
.altinstr_replacement to .init.text. The previously redundant ALIGN()
now gets converted to page alignment, such that the hypervisor mapping
won't have this as executable (it'll instead get mapped r/w, which I'm
told is intended to be adjusted at some point).
Note that for the actual patching logic's purposes this part of
.init.text _has_ to live after _einittext (or before _sinittext), or
else branch_insn_requires_update() would produce wrong results.
Also, to have .altinstr_replacement have consistent attributes in the
object files, add "x" to the one instance where it was missing.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Fri, 11 Jun 2021 09:19:15 +0000 (11:19 +0200)]
xen/arm32: avoid .rodata to be marked as executable
The section .proc.info lives in .rodata as it doesn't contain any
executable code. However, the section is still marked as executable;
as a consequence, .rodata will also be marked executable.
Xen doesn't use the ELF permissions to decide the page-table mapping
permission. However, this will confuse disassemblers.
'#execinstr' is now removed from all the pushsection directives dealing
with .proc.info.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
[julieng: Rework the commit message] Acked-by: Julien Grall <jgrall@amazon.com>
Juergen Gross [Mon, 14 Jun 2021 12:39:52 +0000 (14:39 +0200)]
revert "tools/libs/guest: fix max_pfn setting in map_p2m()"
The reasoning for commit 7bd8989ab77b6a ("tools/libs/guest: fix max_pfn
setting in map_p2m()") was wrong.
The max_pfn field in shared_info is misnamed: it has the semantics of
num_pfns, a fact which is hidden at least partially in Linux, as the
kernel is (wrongly) treating it like the highest used pfn in some places.
Julien Grall [Mon, 14 Jun 2021 10:08:30 +0000 (11:08 +0100)]
xen/grant-table: Simplify the update to the per-vCPU maptrack freelist
Since XSA-228 (commit 02cbeeb62075 "gnttab: split maptrack lock
to make it fulfill its purpose again"), v->maptrack_head,
v->maptrack_tail and the content of the freelist are accessed with
the lock v->maptrack_freelist_lock held.
Therefore it is not necessary to update the fields using cmpxchg()
and also read them atomically.
Note that there are two cases where v->maptrack_tail is accessed without
the lock. They both happen in get_maptrack_handle() when initializing
the free list of the current vCPU. Therefore there is no possible race.
The code is now reworked to remove any use of cmpxchg() and read_atomic()
when accessing the fields v->maptrack_{head, tail} as well as the
freelist.
Take the opportunity to add a comment on top of the lock definition
and explain what it protects.
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 11 Jun 2021 13:04:24 +0000 (15:04 +0200)]
Arm32: MSR to SPSR needs qualification
The Arm ARM's description of MSR (ARM DDI 0406C.d section B9.3.12)
doesn't even allow for plain "SPSR" here, and while gas accepts this, it
takes it to mean SPSR_cf. Yet surely all of SPSR wants updating on this
path, not just the lowest and highest 8 bits.
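The byte coverage of the MSR field qualifiers can be worked out with a few
lines of Python. The letters c, x, s and f each select one byte of the PSR
(bits 7:0, 15:8, 23:16 and 31:24 respectively); the helper name msr_mask is
invented for this sketch, and "cxsf" is used here merely as the qualifier
that names all four fields, not as a statement of what the patch emits.

```python
# One byte of the PSR per field letter, lowest byte first.
FIELD_BITS = {
    "c": 0x000000FF,  # control field, bits 7:0
    "x": 0x0000FF00,  # extension field, bits 15:8
    "s": 0x00FF0000,  # status field, bits 23:16
    "f": 0xFF000000,  # flags field, bits 31:24
}


def msr_mask(fields):
    """Return the 32-bit mask of PSR bits written for a qualifier such
    as "cf" or "cxsf"."""
    mask = 0
    for letter in fields:
        mask |= FIELD_BITS[letter]
    return mask
```

With these definitions msr_mask("cf") covers only the lowest and highest
bytes, while msr_mask("cxsf") covers the whole register.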
Juergen Gross [Wed, 12 May 2021 14:48:32 +0000 (16:48 +0200)]
tools/libs/store: cleanup libxenstore interface
There are some internals in the libxenstore interface which should be
removed.
Move those functions into xs_lib.c and the related definitions into
xs_lib.h. Remove the functions from the mapfile. Add xs_lib.o to
xenstore_client as some of the internal functions are needed there.
Bump the libxenstore version to 4.0 as the change is incompatible.
Note that the removed functions should not result in any problem as
they ought to be used by xenstored or xenstore_client only.
Avoid an enum as part of a structure as the size of an enum is
compiler implementation dependent.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 10 Jun 2021 14:56:24 +0000 (16:56 +0200)]
x86: please Clang in arch_set_info_guest()
Clang 10 reports
domain.c:1328:10: error: variable 'cr3_mfn' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if ( !compat )
^~~~~~~
domain.c:1334:34: note: uninitialized use occurs here
cr3_page = get_page_from_mfn(cr3_mfn, d);
^~~~~~~
domain.c:1328:5: note: remove the 'if' if its condition is always true
if ( !compat )
^~~~~~~~~~~~~~
domain.c:1042:18: note: initialize the variable 'cr3_mfn' to silence this warning
mfn_t cr3_mfn;
^
= 0
domain.c:1189:14: error: variable 'fail' is used uninitialized whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]
if ( !compat )
^~~~~~~
domain.c:1211:9: note: uninitialized use occurs here
fail |= v->arch.pv.gdt_ents != c(gdt_ents);
^~~~
domain.c:1189:9: note: remove the 'if' if its condition is always true
if ( !compat )
^~~~~~~~~~~~~~
domain.c:1187:18: note: initialize the variable 'fail' to silence this warning
bool fail;
^
= false
despite this being a build with -O2 in effect, and despite "compat"
being constant "false" when CONFIG_COMPAT (and hence CONFIG_PV32) is not
defined, as it gets set at the top of the function from the result of
is_pv_32bit_domain().
Re-arrange the two "offending" if()s such that when COMPAT=n the
respective variables will be seen as unconditionally initialized. The
original aim was to have the !compat cases first, though.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
SPSR_hyp is not meant to be accessed from Hyp mode (EL2); accesses
trigger UNPREDICTABLE behaviour. Xen should read/write SPSR instead.
See: ARM DDI 0487D.b page G8-5993.
This fixes booting Xen/arm32 on QEMU.
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Andrew Cooper [Wed, 16 Sep 2020 15:15:52 +0000 (16:15 +0100)]
x86/tsx: Cope with TSX deprecation on SKL/KBL/CFL/WHL
The June 2021 microcode is formally de-featuring TSX on the older Skylake
client CPUs. The workaround from the March 2019 microcode is being dropped,
and replaced with additions to MSR_TSX_FORCE_ABORT to hide the HLE/RTM CPUID
bits.
With this microcode in place, TSX is disabled by default on these CPUs.
Backwards compatibility is provided in the same way as for TAA - RTM force
aborts, rather than suffering #UD, and the CPUID bits can be hidden to recover
performance.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Connor Davis [Wed, 9 Jun 2021 10:45:38 +0000 (12:45 +0200)]
xen: add files needed for minimal riscv build
Add arch-specific makefiles and configs needed to build for
riscv. Also add a minimal head.S that is a simple infinite loop.
head.o can be built with
$ make XEN_TARGET_ARCH=riscv64 SUBSYSTEMS=xen -C xen tiny64_defconfig
$ make XEN_TARGET_ARCH=riscv64 SUBSYSTEMS=xen -C xen TARGET=riscv64/head.o
No other TARGET is supported at the moment.
Signed-off-by: Connor Davis <connojdavis@gmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Acked-by: Bobby Eshleman <bobbyeshleman@gmail.com>
Tim Deegan [Wed, 9 Jun 2021 10:43:25 +0000 (12:43 +0200)]
MAINTAINERS: adjust x86/mm/shadow maintainers
Better reflect reality: Andrew and Jan are active maintainers
and I review patches. Keep myself as a reviewer so I can help
with historical context &c.
Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 8 Jun 2021 17:29:42 +0000 (18:29 +0100)]
AMD/IOMMU: drop command completion timeout
First and foremost - such timeouts were not signaled to callers, making
them believe they're fine to e.g. free previously unmapped pages.
Mirror VT-d's behavior: A fixed number of loop iterations is not a
suitable way to detect timeouts in an environment (CPU and bus speeds)
independent manner anyway. Furthermore, leaving an in-progress operation
pending when it appears to take too long is problematic: If a command
completed later, the signaling of its completion may instead be
understood to signal a subsequently started command's completion.
Log excessively long processing times (with a progressive threshold) to
have some indication of problems in this area. Allow callers to specify
a non-default timeout bias for this logging, using the same values as
VT-d does, which in particular means a (by default) much larger value
for device IO TLB invalidation.
This is part of XSA-373 / CVE-2021-28692.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org>
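The combination of "no timeout" and "log with a progressive threshold" can
be sketched as below. This is a simplified Python model rather than the
hypervisor's C code: the function and parameter names are invented, time is
in abstract units, and doubling the threshold after each message is an
assumption about what "progressive" means here.

```python
def wait_for_completion(is_done, now, log, threshold=1):
    """Poll until is_done() returns True; never give up.

    Emits a message each time the elapsed time reaches the current
    threshold, then doubles the threshold (assumed policy) so that a
    stuck command does not flood the log. Returns the total wait time.
    """
    start = now()
    while not is_done():
        elapsed = now() - start
        if elapsed >= threshold:
            log("command still pending after %d time units" % elapsed)
            threshold *= 2
    return now() - start
```

Because the loop only exits once the command has completed, a late
completion can no longer be mistaken for that of a subsequently started
command.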
Jan Beulich [Tue, 8 Jun 2021 17:29:40 +0000 (18:29 +0100)]
AMD/IOMMU: wait for command slot to be available
No caller cared about send_iommu_command() indicating unavailability of
a slot. Hence if a sufficient number of prior commands timed out, we did
blindly assume that the requested command was submitted to the IOMMU
when really it wasn't. This could mean either a hanging system (waiting
for a command to complete that was never seen by the IOMMU) or blindly
propagating success back to callers, making them believe they're fine
to e.g. free previously unmapped pages.
Fold the three involved functions into one, add spin waiting for an
available slot along the lines of VT-d's qinval_next_index(), and as a
consequence drop all error indicator return types/values.
This is part of XSA-373 / CVE-2021-28692.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org>
Andrew Cooper [Thu, 20 May 2021 00:21:39 +0000 (01:21 +0100)]
x86/spec-ctrl: Mitigate TAA after S3 resume
The user chosen setting for MSR_TSX_CTRL needs restoring after S3.
All APs get the correct setting via start_secondary(), but the BSP was missed
out.
This is XSA-377 / CVE-2021-28690.
Fixes: 8c4330818f6 ("x86/spec-ctrl: Mitigate the TSX Asynchronous Abort sidechannel") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 11 Mar 2021 14:39:11 +0000 (14:39 +0000)]
x86/spec-ctrl: Protect against Speculative Code Store Bypass
Modern x86 processors have far-better-than-architecturally-guaranteed self
modifying code detection. Typically, when a write hits an instruction in
flight, a Machine Clear occurs to flush stale content in the frontend and
backend.
For self modifying code, before a write which hits an instruction in flight
retires, the frontend can speculatively decode and execute the old instruction
stream. Speculation of this form can suffer from type confusion in registers,
and potentially leak data.
Furthermore, updates are typically byte-wise, rather than atomic. Depending
on timing, speculation can race ahead multiple times between individual
writes, and execute the transiently-malformed instruction stream.
Xen has stubs which are used in certain cases for emulation purposes. Inhibit
speculation between updating the stub and executing it.
This is XSA-375 / CVE-2021-0089.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 8 Jun 2021 16:38:55 +0000 (17:38 +0100)]
VT-d: eliminate flush related timeouts
Leaving an in-progress operation pending when it appears to take too
long is problematic: If e.g. a QI command completed later, the write to
the "poll slot" may instead be understood to signal a subsequently
started command's completion. Also our accounting of the timeout period
was actually wrong: We included the time it took for the command to
actually make it to the front of the queue, which could be heavily
affected by guests other than the one for which the flush is being
performed.
Do away with all timeout detection on all flush related code paths.
Log excessively long processing times (with a progressive threshold) to
have some indication of problems in this area.
Additionally log (once) if qinval_next_index() didn't immediately find
an available slot. Together with the earlier change sizing the queue(s)
dynamically, we should now have a guarantee that with our fully
synchronous model any demand for slots can actually be satisfied.
This is part of XSA-373 / CVE-2021-28692.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org>
Jan Beulich [Tue, 8 Jun 2021 16:38:55 +0000 (17:38 +0100)]
AMD/IOMMU: size command buffer dynamically
With the present synchronous model, we need two slots for every
operation (the operation itself and a wait command). There can be one
such pair of commands pending per CPU. To ensure that under all normal
circumstances a slot is always available when one is requested, size the
command ring according to the number of present CPUs.
This is part of XSA-373 / CVE-2021-28692.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org>
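The sizing rule above amounts to simple arithmetic, shown here as a Python
sketch. The function name ring_entries is invented, and rounding up to a
power of two is an assumption made for the illustration, not something the
commit message states.

```python
def ring_entries(nr_cpus):
    """Entries needed so every CPU can have one operation plus one wait
    command pending at the same time.

    Two slots per CPU come from the commit message; the power-of-two
    rounding is an assumption of this sketch.
    """
    needed = 2 * nr_cpus
    entries = 1
    while entries < needed:
        entries *= 2
    return entries
```

Under these assumptions a 96-CPU system needs 192 slots and would get a
256-entry ring.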
Jan Beulich [Tue, 8 Jun 2021 16:38:55 +0000 (17:38 +0100)]
VT-d: size qinval queue dynamically
With the present synchronous model, we need two slots for every
operation (the operation itself and a wait descriptor). There can be
one such pair of requests pending per CPU. To ensure that under all
normal circumstances a slot is always available when one is requested,
size the queue ring according to the number of present CPUs.
This is part of XSA-373 / CVE-2021-28692.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul@xen.org>
xen/arm: Boot modules should always be scrubbed if bootscrub={on, idle}
The function to initialize the pages (see init_heap_pages()) will request
a scrub when the admin requests idle bootscrub (default) and state ==
SYS_STATE_active. When bootscrub=on, Xen will scrub any free pages in
heap_init_late().
Currently, the boot modules (e.g. kernels, initramfs) will be discarded/
freed after heap_init_late() is called and system_state switched to
SYS_STATE_active. This means the pages associated with the boot modules
will not get scrubbed before getting re-purposed.
If the memory is assigned to an untrusted domU, it may be able to
retrieve secrets from the modules.
Julien Grall [Mon, 17 May 2021 16:47:13 +0000 (17:47 +0100)]
xen/arm: Create dom0less domUs earlier
In a follow-up patch we will need to unallocate the boot modules
before heap_init_late() is called.
The modules will contain the domUs' kernel and initramfs. Therefore Xen
will need to create extra domUs (used by dom0less) before heap_init_late().
This has two consequences on dom0less:
1) Domains will not be unpaused as soon as they are created but
once all have been created. However, Xen doesn't guarantee an order
to unpause, so this is not something one could rely on.
2) The memory allocated for a domU will not be scrubbed anymore when an
admin selects bootscrub=on. This is not something we advertised, but if
this is a concern we can introduce either force scrub for all domUs or
a per-domain flag in the DT. The behavior for bootscrub=off and
bootscrub=idle (default) has not changed.
Andrew Cooper [Tue, 8 Jun 2021 16:13:59 +0000 (17:13 +0100)]
x86/cpuid: Half revert "x86/cpuid: Drop special_features[]"
xen-cpuid does print out the list of special features, and this is helpful to
keep.
Fixes: ba6950fb070 ("x86/cpuid: Drop special_features[]") Reported-by: Jan Beulich <JBeulich@suse.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 8 Jun 2021 12:47:47 +0000 (14:47 +0200)]
evtchn: type adjustments
First of all avoid "long" when "int" suffices, i.e. in particular when
merely conveying error codes. 32-bit values are slightly cheaper to
deal with on x86, and their processing is at least no more expensive on
Arm. Where possible use evtchn_port_t for port numbers and unsigned int
for other unsigned quantities in adjacent code. In evtchn_set_priority()
eliminate a local variable altogether instead of changing its type.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Tue, 8 Jun 2021 12:47:14 +0000 (14:47 +0200)]
evtchn: add helper for port_is_valid() + evtchn_from_port()
The combination is pretty common, so adding a simple local helper seems
worthwhile. Make it const- and type-correct, in turn requiring the
two called functions to also be const-correct (and on this occasion also
make them type-correct).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Julien Grall <jgrall@amazon.com>
Jan Beulich [Tue, 8 Jun 2021 12:46:06 +0000 (14:46 +0200)]
evtchn: slightly defer lock acquire where possible
port_is_valid() and evtchn_from_port() are fine to use without holding
any locks. Accordingly acquire the per-domain lock slightly later in
evtchn_close() and evtchn_bind_vcpu(). Especially for the use by the
former (but there are pre-existing uses) add a comment about
port_is_valid()'s guarantees.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Julien Grall <julien@xen.org>
Anthony PERARD [Tue, 1 Jun 2021 10:28:03 +0000 (11:28 +0100)]
tools/firmware/ovmf: Use OvmfXen platform file if it exists
A platform introduced in EDK II named OvmfXen is now the one to use for
Xen instead of OvmfX64. It comes with PVH support.
Also, the Xen support in OvmfX64 is deprecated,
"deprecation notice: *dynamic* multi-VMM (QEMU vs. Xen) support in OvmfPkg"
https://edk2.groups.io/g/devel/message/75498
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Jackson <iwj@xenproject.org>
Juergen Gross [Mon, 7 Jun 2021 13:00:05 +0000 (15:00 +0200)]
tools/libs/guest: fix save and restore of pv domains after 32-bit de-support
After 32-bit PV-guests have been security de-supported when not running
under PV-shim, the hypervisor will no longer be configured to support
those domains by default when not being built as PV-shim.
Unfortunately libxenguest will fail saving or restoring a PV domain
due to this restriction, as it is trying to get the compat MFN list
even for 64-bit guests.
Fix that by obtaining the compat MFN list only for 32-bit PV guests.
Fixes: 1a0f2fe2297d122a08fe ("SUPPORT.md: Un-shimmed 32-bit PV guests are no longer supported") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Mon, 7 Jun 2021 12:25:09 +0000 (13:25 +0100)]
x86/cpuid: Fix HLE and RTM handling (again)
For reasons which are my fault, but I don't recall why, the
FDP_EXCP_ONLY/NO_FPU_SEL adjustment uses the whole special_features[] array
element, not the two relevant bits.
HLE and RTM were recently added to the list of special features, causing them
to be always set in guest view, irrespective of the toolstack's choice on the
matter.
Rewrite the logic to refer to the features specifically, rather than relying
on the contents of the special_features[] array.
Fixes: 8fe24090d9 ("x86/cpuid: Rework HLE and RTM handling") Reported-by: Edwin Török <edvin.torok@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Mon, 7 Jun 2021 13:40:55 +0000 (15:40 +0200)]
docs: release-technician-checklist: update to leaf tree version pinning
Our releases look to flip-flop between keeping or discarding the date
and title of the referenced qemu-trad commit. I think with the hash
replaced by a tag, the commit's date and title would better also be
purged.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Jackson <iwj@xenproject.org>
Dario Faggioli [Fri, 19 Mar 2021 12:14:17 +0000 (12:14 +0000)]
xen: credit2: fix per-entity load tracking when continuing running
If we schedule, and the current vCPU continues to run, its statistical
load is not properly updated, resulting in something like this, even if
all the 8 vCPUs are 100% busy:
As we can see, the average load of the runqueue as a whole is, instead,
computed properly.
This issue would, in theory, potentially affect Credit2 load balancing
logic. In practice, however, the problem only manifests (at least with
these characteristics) when there is only 1 runqueue active in the
cpupool, which also means there is no need to do any load-balancing.
Hence its real impact is pretty much limited to wrong per-vCPU load
percentages, when looking at the output of the 'r' debug-key.
With this patch, the load is updated and displayed correctly:
Dario Faggioli [Fri, 28 May 2021 15:12:48 +0000 (17:12 +0200)]
credit2: make sure we pick a runnable unit from the runq if there is one
A !runnable unit (temporarily) present in the runq may cause us to
stop scanning the runq itself too early. Of course, we don't run any
non-runnable vCPUs, but we end the scan and we fall back to picking
the idle unit. In other words, this prevents us from finding and picking
the actual unit that we're meant to start running (which might be
further ahead in the runq).
Depending on the vCPU pinning configuration, this may lead to such a
unit being stuck in the runq for a long time, causing malfunctioning
inside the guest.
Fix this by checking runnable/non-runnable status up-front, in the runq
scanning function.
Reported-by: Michał Leszczyński <michal.leszczynski@cert.pl> Reported-by: Dion Kant <g.w.kant@hunenet.nl> Signed-off-by: Dario Faggioli <dfaggioli@suse.com> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
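The difference between ending the scan and skipping an entry can be modelled
in a few lines of Python. This is a deliberately reduced model of the
description above, not the Credit2 code: units are plain dicts, credit and
affinity checks are left out, and both function names are invented.

```python
def pick_next_old(runq, idle):
    """Model of the old behaviour: a non-runnable unit ends the scan,
    so the idle unit gets picked."""
    for unit in runq:
        if not unit["runnable"]:
            break
        return unit
    return idle


def pick_next_fixed(runq, idle):
    """Model of the fix: runnability is checked up-front and
    non-runnable units are skipped, so the scan reaches later units."""
    for unit in runq:
        if not unit["runnable"]:
            continue
        return unit
    return idle
```

With a non-runnable unit at the head of the runq and a runnable one behind
it, the old model returns the idle unit while the fixed one returns the
runnable unit.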
Juergen Gross [Fri, 4 Jun 2021 06:02:13 +0000 (08:02 +0200)]
tools/libs: move xc_core* from libxenctrl to libxenguest
The functionality in xc_core* should be part of libxenguest instead
of libxenctrl. Users are already either in libxenguest, or in xl.
There is one single exception: xc_core_arch_auto_translated_physmap()
is being used by xc_domain_memory_mapping(), which is used by qemu.
So leave the xc_core_arch_auto_translated_physmap() functionality in
libxenctrl.
This will make it easier to merge common functionality of xc_core*
and xg_sr_save*.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wl@xen.org>
Juergen Gross [Fri, 4 Jun 2021 06:02:11 +0000 (08:02 +0200)]
tools/libs/ctrl: use common p2m mapping code in xc_domain_resume_any()
Instead of open coding the mapping of the p2m list use the already
existing xc_core_arch_map_p2m() call, especially as the current code
does not support guests with the linear p2m map. It should be noted
that this code is needed for colo/remus only.
Switching to xc_core_arch_map_p2m() drops the need to bail out for
bitness of tool stack and guest differing.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Christian Lindig <christian.lindig@citrix.com> Acked-by: Wei Liu <wl@xen.org>
Juergen Gross [Fri, 4 Jun 2021 06:02:10 +0000 (08:02 +0200)]
tools/libs/ctrl: fix xc_core_arch_map_p2m() to support linear p2m table
The core of a pv linux guest produced via "xl dump-core" is not usable,
since as of kernel 4.14 only the linear p2m table is kept if Xen indicates
it is supporting that. Unfortunately xc_core_arch_map_p2m() still
supports the 3-level p2m tree only.
Fix that by copying the functionality of map_p2m() from libxenguest to
libxenctrl.
Additionally the mapped p2m isn't of a fixed length now, so the
interface to the mapping functions needs to be adapted. In order not to
add even more parameters, expand struct domain_info_context and use a
pointer to that as a parameter.
Fixes: dc6d60937121 ("libxc: set flag for support of linear p2m list in domain builder") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wl@xen.org>
Juergen Gross [Fri, 4 Jun 2021 06:02:09 +0000 (08:02 +0200)]
tools/libs/guest: fix max_pfn setting in map_p2m()
When setting the highest pfn used in the guest, don't subtract 1 from
the value read from the shared_info data. The value read already is
the correct pfn.
Fixes: 91e204d37f449 ("libxc: try to find last used pfn when migrating") Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wl@xen.org>
Julien Grall [Mon, 22 Feb 2021 12:01:18 +0000 (12:01 +0000)]
xen/page_alloc: Remove dead code in alloc_domheap_pages()
Since commit 1aac966e24e9 "xen: support RAM at addresses 0 and 4096",
bits_to_zone() will never return 0 and it is expected that we have
at least 2 zones.
Therefore the check in alloc_domheap_pages() is unnecessary and can
be removed. However, for sanity, it is replaced with an ASSERT().
Also take the opportunity to switch from min_t() to min() as
bits_to_zone() cannot return a negative value. The macro is tweaked
to make it clearer.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Connor Davis [Fri, 28 May 2021 07:42:25 +0000 (09:42 +0200)]
common: guard iommu symbols with CONFIG_HAS_PASSTHROUGH
The variables iommu_enabled and iommu_dont_flush_iotlb are defined in
drivers/passthrough/iommu.c and are referenced in common code, which
causes the link to fail when !CONFIG_HAS_PASSTHROUGH.
Guard references to these variables in common code so that xen
builds when !CONFIG_HAS_PASSTHROUGH.
Signed-off-by: Connor Davis <connojdavis@gmail.com>
[jb: further massage xen/iommu.h adjustment] Acked-by: Jan Beulich <jbeulich@suse.com>
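A minimal sketch of the guarding pattern described above, assuming a fallback
#define is used when the option is off. CONFIG_HAS_PASSTHROUGH and
iommu_enabled are named in the commit message; the helper flush_needed() is
hypothetical and exists only to show a caller that compiles either way.

```c
#include <stdbool.h>

/*
 * Stand-in for the Kconfig symbol. Leave it undefined to model a build
 * without passthrough support.
 */
/* #define CONFIG_HAS_PASSTHROUGH 1 */

#ifdef CONFIG_HAS_PASSTHROUGH
/* Defined in the passthrough driver, so it only exists when that is built. */
extern bool iommu_enabled;
#else
/* Fallback: common code still compiles and links, and sees a constant. */
#define iommu_enabled false
#endif

/* Hypothetical caller in common code. */
static int flush_needed(void)
{
    return iommu_enabled ? 1 : 0;
}
```

With the fallback in place the compiler can also drop the dead branch in
callers, since the condition is a compile-time constant.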
Roger Pau Monné [Fri, 28 May 2021 07:14:44 +0000 (09:14 +0200)]
libelf: improve PVH elfnote parsing
Pass an hvm boolean parameter to the elf note checking routines, so that
better checking can be done in case libelf is dealing with an hvm
container.
elf_xen_note_check shouldn't return early unless PHYS32_ENTRY is set
and the container is of type HVM, or else the loader and version
checks would be avoided for kernels intended to be booted as PV but
that also have PHYS32_ENTRY set.
Adjust elf_xen_addr_calc_check so that the virtual addresses are
actually physical ones (by setting virt_base and elf_paddr_offset to
zero) when the container is of type HVM, as that container is always
started with paging disabled.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Fri, 28 May 2021 07:13:32 +0000 (09:13 +0200)]
libelf: don't attempt to parse __xen_guest for PVH
The legacy __xen_guest section doesn't support the PHYS32_ENTRY
elfnote, so it's pointless to attempt to parse the elfnotes from that
section when called from an hvm container.
Pass an hvm boolean parameter to the elf note parsing routine, so that
the respective parsing can be suppressed in case libelf is dealing with
an hvm container.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
The original commit wasn't quite sufficient: Emptying DEPS is helpful
only when nothing will get added to it subsequently. xen/Rules.mk will,
after including the local Makefile, amend DEPS by dependencies for
objects living in sub-directories though. For the purpose of suppressing
dependencies of the makefiles on the .*.d2 files (and thus to avoid
their re-generation) it is, however, not necessary at all to play with
DEPS. Instead we can override DEPS_INCLUDE (which generally is a late-
expansion variable).
Fixes: 761bb575ce97 ("x86: fix build race when generating temporary object files") Signed-off-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 18 May 2021 13:53:56 +0000 (14:53 +0100)]
x86/tsx: Minor cleanup and improvements
* Introduce cpu_has_arch_caps and replace boot_cpu_has(X86_FEATURE_ARCH_CAPS)
* Read CPUID data into the appropriate boot_cpu_data.x86_capability[]
element, as subsequent changes are going to need more cpu_has_* logic.
* Use the hi/lo MSR helpers, which substantially improves code generation.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Andrew Cooper [Thu, 20 May 2021 18:48:39 +0000 (19:48 +0100)]
x86/cpuid: Rework HLE and RTM handling
The TAA mitigation offered the option to hide the HLE and RTM CPUID bits,
which has caused some migration compatibility problems.
These two bits are special. Annotate them with ! to emphasise this point.
Hardware Lock Elision (HLE) may or may not be visible in CPUID, but is
disabled in microcode on all CPUs, and has been removed from the architecture.
Do not advertise it to VMs by default.
Restricted Transactional Memory (RTM) may or may not be visible in CPUID, and
may or may not be configured in force-abort mode. Have tsx_init() note
whether RTM has been configured into force-abort mode, so
guest_common_feature_adjustments() can conditionally hide it from VMs by
default.
The host policy values for HLE/RTM may or may not be set, depending on any
previous running kernel's choice of visibility, and Xen's choice. TSX is
available on any CPU which enumerates a TSX-hiding mechanism, so instead of
doing a two-step to clobber any hiding, scan CPUID, then set the visibility,
just force visibility of the bits in the first place.
With the HLE/RTM bits now unilaterally visible in the host policy,
xc_cpuid_apply_policy() can construct a more appropriate policy out of thin
air for pre-4.13 VMs with no CPUID data in their migration stream, and
specifically one where HLE/RTM doesn't potentially disappear behind the back
of a running VM.
Fixes: 8c4330818f6 ("x86/spec-ctrl: Mitigate the TSX Asynchronous Abort sidechannel") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Thu, 27 May 2021 12:40:29 +0000 (14:40 +0200)]
x86: make hypervisor build with gcc11
Gcc 11 looks to make incorrect assumptions about valid ranges that
pointers may be used for addressing when they are derived from e.g. a
plain constant. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100680.
Utilize RELOC_HIDE() to work around the issue, which for x86 manifests
in at least
- mpparse.c:efi_check_config(),
- tboot.c:tboot_probe(),
- tboot.c:tboot_gen_frametable_integrity(),
- x86_emulate.c:x86_emulate() (at -O2 only).
The last case is particularly odd not just because it only triggers at
higher optimization levels, but also because it only affects one of at
least three similar constructs. Various "note" diagnostics claim the
valid index range to be [0, 2⁶³-1].
Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Jason Andryuk <jandryuk@gmail.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
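For illustration, a pointer-hiding macro in the style of RELOC_HIDE() can be
sketched as below. This is an assumption-laden sketch, not a copy of Xen's
definition: HIDE_PTR, table and read_hidden() are hypothetical names, and the
code relies on GNU C extensions (statement expressions, __typeof__, inline
asm). The empty asm makes the pointer value opaque to the optimizer, so range
assumptions tied to the original object or constant are not carried over.

```c
/* Hypothetical macro modelled on the RELOC_HIDE() idea (GNU C only). */
#define HIDE_PTR(ptr, off)                                  \
    ({ unsigned long hidden_;                               \
       __asm__ ( "" : "=r" (hidden_) : "0" (ptr) );         \
       (__typeof__(ptr))(hidden_ + (off)); })

static int table[4] = { 10, 20, 30, 40 };

/* Reads table[idx] through a pointer whose origin the compiler cannot see. */
static int read_hidden(unsigned int idx)
{
    int *p = HIDE_PTR(&table[0], 0);

    return p[idx];
}
```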
Jan Beulich [Thu, 27 May 2021 12:39:33 +0000 (14:39 +0200)]
firmware/shim: UNSUPPORTED=n
We shouldn't default to include any unsupported code in the shim. Mark
the setting as off, replacing the ARGO specification. This points out
anomalies with the scheduler configuration: Unsupported schedulers
better don't default to Y in release builds (like is already the case
for ARINC653). Without at least the SCHED_NULL adjustments, the shim
would suddenly build with RTDS as its default scheduler.
As a result, the SCHED_NULL setting can also be dropped from defconfig.
Clearly with the shim defaulting to it, SCHED_NULL must be supported at
least there.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Julien Grall [Wed, 26 May 2021 15:35:53 +0000 (16:35 +0100)]
tools/xenstored: Remove unused parameter in check_domains()
The parameter of check_domains() is not used within the function. In fact,
this was a leftover of the original implementation as the version merged
doesn't need to know whether we are restoring.
Julien Grall [Wed, 26 May 2021 15:01:32 +0000 (16:01 +0100)]
xen/char: console: Use const whenever we point to literal strings
Literal strings are not meant to be modified. So we should use const
char * rather than char * when we want to store a pointer to them.
The array should also not be modified at all and is only used by
xenlog_update_val(). So take the opportunity to add an extra const and
move the definition into the function.
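The pattern can be sketched as follows. This is only an illustration under
assumed names: lookup_level() and the level strings are hypothetical and are
not taken from the console code. The array is declared inside its only user,
and both the strings and the array slots are const.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: map a level name to its index, or -1 if unknown. */
static int lookup_level(const char *name)
{
    /*
     * "const char *const": the literals must not be written through the
     * pointers, and the pointers themselves never change either.
     */
    static const char *const names[] = {
        "none", "error", "warning", "info", "all"
    };
    size_t i;

    for ( i = 0; i < sizeof(names) / sizeof(names[0]); i++ )
        if ( strcmp(names[i], name) == 0 )
            return (int)i;

    return -1;
}
```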
Jan Beulich [Wed, 26 May 2021 07:34:37 +0000 (09:34 +0200)]
firmware/shim: drop XEN_CONFIG_EXPERT uses
As of commit d155e4aef35c ("xen: Allow EXPERT mode to be selected from
the menuconfig directly") EXPERT is a regular config option (which the
shim default config also enables).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Wed, 26 May 2021 07:34:07 +0000 (09:34 +0200)]
firmware/shim: update linkfarm exclusions
Some intermediate files weren't considered at all at the time. Also
after its introduction, various changes to the build environment have
rendered the exclusion sets stale. For example, we now have some .*.cmd
files in the build tree. Combine all respective patterns into a single
.* one, seeing that we don't have any actual source files matching this
pattern in the tree. Add other patterns as well as individual files.
Also introduce LINK_EXCLUDE_PATHS to deal with entire directories full
of generated headers as well as a few specific files the names of which
are too generic to list under LINK_EXCLUDES.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Wed, 26 May 2021 07:33:02 +0000 (09:33 +0200)]
x86/guest: fix build when HVM and !PV32
The commit referenced below still wasn't careful enough - with COMPAT we
will have a compat_handle_okay() visible already, which we first need to
get rid of.
Fixes: bd1e7b47bac0 ("x86/shim: fix build when !PV32") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Connor Davis [Mon, 24 May 2021 14:34:28 +0000 (08:34 -0600)]
automation: Add container for riscv64 builds
Add a container for cross-compiling xen to riscv64.
This just includes the cross-compiler and necessary packages for
building xen itself (packages for tools, stubdoms, etc., can be
added later).
Signed-off-by: Connor Davis <connojdavis@gmail.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 25 May 2021 07:08:43 +0000 (09:08 +0200)]
x86/shadow: fix DO_UNSHADOW()
When adding the HASH_CALLBACKS_CHECK() I failed to properly recognize
the (somewhat unusually formatted) if() around the call to
hash_domain_foreach(). Gcc 11 is absolutely right in pointing out the
apparently misleading indentation. Besides adding the missing braces,
also adjust the two oddly formatted if()-s in the macro.
Fixes: 90629587e16e ("x86/shadow: replace stale literal numbers in hash_{vcpu,domain}_foreach()") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Luca Fancellu <luca.fancellu@arm.com> Reviewed-by: Tim Deegan <tim@xen.org>
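The class of problem can be sketched with a small macro. All names here
(DO_GUARDED, check_type, handle_type, run) are hypothetical. Without the
braces, only the first statement would be governed by the if(), while the
indentation would suggest that both are.

```c
static int checks, calls;

static void check_type(unsigned int t)  { (void)t; checks++; }
static void handle_type(unsigned int t) { (void)t; calls++; }

/* Braced form: both calls run only when the type's bit is set in mask. */
#define DO_GUARDED(mask, t) do {            \
    if ( (mask) & (1u << (t)) )             \
    {                                       \
        check_type(t);                      \
        handle_type(t);                     \
    }                                       \
} while ( 0 )

/* Hypothetical driver: tries types 0 and 1 against the given mask. */
static void run(unsigned int mask)
{
    DO_GUARDED(mask, 0);
    DO_GUARDED(mask, 1);
}
```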
Dario Faggioli [Tue, 18 May 2021 16:42:45 +0000 (18:42 +0200)]
automation: fix dependencies on openSUSE Tumbleweed containers
Fix the build inside our openSUSE Tumbleweed container by
adding libzstd headers. While there, remove the explicit dependency
for python and python3 as the respective -devel packages will pull
them in anyway.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Dario Faggioli [Tue, 18 May 2021 16:42:39 +0000 (18:42 +0200)]
automation: use DOCKER_CMD for building containers too
Use DOCKER_CMD from the environment (if defined) in the containers'
makefile too, so that, e.g., when doing `export DOCKER_CMD=podman`
podman is used for building the containers too.
Signed-off-by: Dario Faggioli <dfaggioli@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Julien Grall [Tue, 18 May 2021 17:03:05 +0000 (18:03 +0100)]
tools/libs: guest: Fix Arm build after 8fc4916daf2a
Gitlab CI spotted an issue when building the tools for Arm:
xg_dom_arm.c: In function 'meminit':
xg_dom_arm.c:401:50: error: passing argument 3 of 'set_mode' discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
401 | rc = set_mode(dom->xch, dom->guest_domid, dom->guest_type);
| ~~~^~~~~~~~~~~~
This is because the const was not propagated in the Arm code. Fix it
by constifying the 3rd parameter of set_mode().
Fixes: 8fc4916daf2a ("tools/libs: guest: Use const whenever we point to literal strings") Signed-off-by: Julien Grall <jgrall@amazon.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Tue, 18 May 2021 06:19:07 +0000 (08:19 +0200)]
tools/xenstore: simplify xenstored main loop
The main loop of xenstored is rather complicated due to different
handling of socket and ring-page interfaces. Unify that handling by
introducing interface type specific functions can_read() and
can_write().
Take the opportunity to remove the empty list check before calling
write_messages() because the function is already able to cope with an
empty list.
xen/arm: kernel: Propagate the error if we fail to decompress the kernel
Currently, we are ignoring any error from perform_gunzip() and replacing
the compressed kernel with the "uncompressed" kernel.
If there is a gzip failure, then it means that the output buffer may
contain garbage. So it can result in various sorts of behavior that may
be difficult to root cause.
In case of failure, free the output buffer and propagate the error.
We also need to adjust the return check for kernel_decompress() as
perform_gunzip() may return a positive value.
Take the opportunity to adjust the code style for the check.
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Michal Orzel <michal.orzel@arm.com>
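A minimal sketch of the error handling described above, under assumptions:
decompress_fn, decompress_checked(), fake_ok(), fake_fail() and demo() are
hypothetical stand-ins and do not reflect the actual Arm kernel loading code.
It shows the two points from the message: a non-zero (possibly positive)
return is treated as failure, and the output buffer is freed rather than used.

```c
#include <stdlib.h>

/* Assumed contract: 0 on success, any non-zero value on failure. */
typedef int (*decompress_fn)(unsigned char *out, size_t len);

static int decompress_checked(decompress_fn fn, size_t len,
                              unsigned char **result)
{
    unsigned char *out = malloc(len);
    int rc;

    if ( !out )
        return -1;

    rc = fn(out, len);
    if ( rc )        /* not "rc < 0": positive values are failures too */
    {
        free(out);   /* the buffer may hold garbage, so do not keep it */
        return rc;
    }

    *result = out;
    return 0;
}

/* Stand-in decompressors for the demonstration. */
static int fake_ok(unsigned char *out, size_t len)
{
    size_t i;

    for ( i = 0; i < len; i++ )
        out[i] = 0;
    return 0;
}

static int fake_fail(unsigned char *out, size_t len)
{
    (void)out; (void)len;
    return 1;        /* positive error value */
}

/* Returns 1 if a buffer was handed back with rc == 0, otherwise 0. */
static int demo(decompress_fn fn)
{
    unsigned char *buf = NULL;
    int rc = decompress_checked(fn, 16, &buf);
    int kept = (rc == 0 && buf != NULL);

    free(buf);
    return kept;
}
```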
Connor Davis [Mon, 17 May 2021 13:43:19 +0000 (15:43 +0200)]
xen: fix build when !CONFIG_GRANT_TABLE
Move struct grant_table; in grant_table.h above
ifdef CONFIG_GRANT_TABLE. This fixes the following:
/build/xen/include/xen/grant_table.h:84:50: error: 'struct grant_table'
declared inside parameter list will not be visible outside of this
definition or declaration [-Werror]
84 | static inline int mem_sharing_gref_to_gfn(struct grant_table *gt,
|
Signed-off-by: Connor Davis <connojdavis@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
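The fix can be sketched as below. The struct name and the config symbol come
from the message above; gref_lookup_stub() is a hypothetical function with a
simplified signature, not the real prototype. Because the forward declaration
sits outside the #ifdef, the stub's parameter list refers to a type that is
already visible at file scope.

```c
#include <stddef.h>

/* Forward declaration placed above the #ifdef, so it is always visible. */
struct grant_table;

#ifdef CONFIG_GRANT_TABLE
/* Real implementation would be declared here when the feature is built. */
int gref_lookup_stub(struct grant_table *gt, unsigned int ref);
#else
/* Stub for builds without grant tables: always reports failure. */
static inline int gref_lookup_stub(struct grant_table *gt, unsigned int ref)
{
    (void)gt;
    (void)ref;
    return -1;
}
#endif
```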
Jan Beulich [Mon, 17 May 2021 13:42:00 +0000 (15:42 +0200)]
x86/shim: fix build when !PV32
In this case compat headers don't get generated (and aren't needed).
The changes made by 527922008bce ("x86: slim down hypercall handling
when !PV32") also weren't quite sufficient for this case.
Try to limit #ifdef-ary by introducing two "fallback" #define-s.
Fixes: d23d792478db ("x86: avoid building COMPAT code when !HVM && !PV32") Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 17 May 2021 13:41:28 +0000 (15:41 +0200)]
x86emul: fix test harness build for gas 2.36
All of a sudden, besides .text and .rodata and the like, an always
present .note.gnu.property section has appeared. This section, when
converting to binary format output, gets placed according to its
linked address, causing the resulting blobs to be about 128Mb in size.
The resulting headers with a C representation of the binary blobs then
are, of course, all a multiple of that size (and take accordingly long
to create). I didn't bother waiting to see what size the final
test_x86_emulator binary then would have had.
See also https://sourceware.org/bugzilla/show_bug.cgi?id=27753.
Rather than figuring out whether gas supports -mx86-used-note=, simply
remove the section while creating *.bin.
Jan Beulich [Mon, 17 May 2021 13:40:53 +0000 (15:40 +0200)]
x86/AMD: also determine L3 cache size
For Intel CPUs we record L3 cache size, hence we should also do so for
AMD and the like.
While making these additions, also make sure (throughout the function)
that we don't needlessly overwrite prior values when the new value to be
stored is zero.
Jan Beulich [Mon, 17 May 2021 13:38:39 +0000 (15:38 +0200)]
build: centralize / unify asm-offsets generation
Except for an additional prereq Arm and x86 have the same needs here,
and Arm can also benefit from the recent x86 side improvement. Recurse
into arch/*/ only for a phony include target (doing nothing on Arm),
and handle asm-offsets itself entirely locally to xen/Makefile.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Julien Grall <jgrall@amazon.com>
Andrew Cooper [Thu, 13 May 2021 15:43:27 +0000 (16:43 +0100)]
Revert "x86/PV32: avoid TLB flushing after mod_l3_entry()" and "x86/PV: restrict TLB flushing after mod_l[234]_entry()"
These reintroduce XSA-286 / CVE-2020-27674, as confirmed by the xsa-286 XTF
test run by OSSTest.
The TLB flushing is for Xen's correctness, not the guest's.
The text in c/s bed7e6cad30 is technically correct, from the guest's point of
view, but clearly false as far as XSA-286 is concerned.
That said, it is edcfce55917 which introduced the regression, which
demonstrates that the reasoning is flawed.