Brian Woods [Tue, 26 Jan 2021 22:58:35 +0000 (14:58 -0800)]
arm,smmu: restructure code in preparation to new bindings support
Restructure some of the code and add supporting functions for adding
generic device tree (DT) binding support. This will allow for using
current Linux device trees with just modifying the chosen field to
enable Xen.
Brian Woods [Tue, 6 Apr 2021 22:52:10 +0000 (15:52 -0700)]
arm,smmu: switch to using iommu_fwspec functions
Modify the smmu driver so that it uses the iommu_fwspec helper
functions. This means both ARM IOMMU drivers will both use the
iommu_fwspec helper functions, making enabling generic device tree
bindings in the SMMU driver much cleaner.
This patch fix the stream match conflict issue when two devices have the
same stream-id.
Only difference while applying this patch with regard to Linux patch are
as follows:
1. Spinlock is used in place of mutex when attaching a device to the
SMMU via arm_smmu_master_alloc_smes(..) function call.Replacing the
mutex with spinlock is fine here as we are configuring the hardware
via registers and it is very fast.
2. move iommu_group_alloc(..) function call in arm_smmu_add_device(..)
function from the start of the function to the end.
Original commit message:
iommu/arm-smmu: Intelligent SMR allocation
Stream Match Registers are one of the more awkward parts of the SMMUv2
architecture; there are typically never enough to assign one to each
stream ID in the system, and configuring them such that a single ID
matches multiple entries is catastrophically bad - at best, every
transaction raises a global fault; at worst, they go *somewhere*.
To address the former issue, we can mask ID bits such that a single
register may be used to match multiple IDs belonging to the same device
or group, but doing so also heightens the risk of the latter problem
(which can be nasty to debug).
Tackle both problems at once by replacing the simple bitmap allocator
with something much cleverer. Now that we have convenient in-memory
representations of the stream mapping table, it becomes straightforward
to properly validate new SMR entries against the current state, opening
the door to arbitrary masking and SMR sharing.
Another feature which falls out of this is that with IDs shared by
separate devices being automatically accounted for, simply associating a
group pointer with the S2CR offers appropriate group allocation almost
for free, so hook that up in the process.
This patch is the preparatory work to fix the stream match conflict
when two devices have the same stream-id.
Original commit message:
iommu/arm-smmu: Add a stream map entry iterator
We iterate over the SMEs associated with a master config quite a lot in
various places, and are about to do so even more. Let's wrap the idiom
in a handy iterator macro before the repetition gets out of hand.
Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rahul Singh <rahul.singh@arm.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> Acked-by: Julien Grall <jgrall@amazon.com>
This patch is the preparatory work to fix the stream match conflict
when two devices have the same stream-id.
Original commit message:
iommu/arm-smmu: Keep track of S2CR state
Making S2CRs first-class citizens within the driver with a high-level
representation of their state offers a neat solution to a few problems:
Firstly, the information about which context a device's stream IDs are
associated with is already present by necessity in the S2CR. With that
state easily accessible we can refer directly to it and obviate the need
to track an IOMMU domain in each device's archdata (its earlier purpose
of enforcing correct attachment of multi-device groups now being handled
by the IOMMU core itself).
Secondly, the core API now deprecates explicit domain detach and expects
domain attach to move devices smoothly from one domain to another; for
SMMUv2, this notion maps directly to simply rewriting the S2CRs assigned
to the device. By giving the driver a suitable abstraction of those
S2CRs to work with, we can massively reduce the overhead of the current
heavy-handed "detach, free resources, reallocate resources, attach"
approach.
Thirdly, making the software state hardware-shaped and attached to the
SMMU instance once again makes suspend/resume of this register group
that much simpler to implement in future.
This patch is the preparatory work to fix the stream match conflict
when two devices have the same stream-id.
Original commit message:
iommu/arm-smmu: Consolidate stream map entry state
In order to consider SMR masking, we really want to be able to validate
ID/mask pairs against existing SMR contents to prevent stream match
conflicts, which at best would cause transactions to fault unexpectedly,
and at worst lead to silent unpredictable behaviour. With our SMMU
instance data holding only an allocator bitmap, and the SMR values
themselves scattered across master configs hanging off devices which we
may have no way of finding, there's essentially no way short of digging
everything back out of the hardware. Similarly, the thought of power
management ops to support suspend/resume faces the exact same problem.
By massaging the software state into a closer shape to the underlying
hardware, everything comes together quite nicely; the allocator and the
high-level view of the data become a single centralised state which we
can easily keep track of, and to which any updates can be validated in
full before being synchronised to the hardware itself.
This patch is the preparatory work to fix the stream match conflict
when two devices have the same stream-id.
Original commit message:
iommu/arm-smmu: Handle stream IDs more dynamically
Rather than assuming fixed worst-case values for stream IDs and SMR
masks, keep track of whatever implemented bits the hardware actually
reports. This also obviates the slightly questionable validation of SMR
fields in isolation - rather than aborting the whole SMMU probe for a
hardware configuration which is still architecturally valid, we can
simply refuse masters later if they try to claim an unrepresentable ID
or mask (which almost certainly implies a DT error anyway).
Acked-by: Will Deacon <will.deacon@arm.com> Tested-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Rahul Singh <rahul.singh@arm.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> Acked-by: Julien Grall <jgrall@amazon.com>
George Dunlap [Thu, 1 Apr 2021 13:34:04 +0000 (14:34 +0100)]
CHANGELOG.md: irq-max-guests
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
--- CC: Igor Druzhinin <igor.druzhinin@citrix.com> CC: Jan Beulich <jbeulich@suse.com> CC: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Thu, 1 Apr 2021 13:18:36 +0000 (14:18 +0100)]
CHANGELOG.md: Various new entries, mostly x86
...Grouped mostly by submitter / maintainer
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
--- CC: Ian Jackson <ian.jackson@citrix.com> CC: Andrew Cooper <andrew.cooper3@citrix.com> CC: Jan Beulich <jbeulich@suse.com> CC: Roger Pau Monne <roger.pau@citrix.com>
George Dunlap [Thu, 1 Apr 2021 13:06:36 +0000 (14:06 +0100)]
CHANGELOG.md: Some additional affordances in various xl subcommands
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
--- CC: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Thu, 1 Apr 2021 12:54:10 +0000 (13:54 +0100)]
CHANGELOG.md: xl PCI configuration doc, xenstore MTU entries
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
--- CC: Paul Durrant <paul@xen.org> CC: Ian Jackson <ian.jackson@citrix.com> CC: Wei Liu <wl@xen.org>
George Dunlap [Thu, 1 Apr 2021 12:51:48 +0000 (13:51 +0100)]
CHANGELOG.md: Mention XEN_SCRIPT_DIR
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-acked-by: Ian Jackson <iwj@xenproject.org>
---
CC: Olaf Hering <olaf@aepfle.de> CC: Ian Jackson <iwj@xenproject.org>
George Dunlap [Wed, 24 Mar 2021 17:24:31 +0000 (17:24 +0000)]
CHANGELOG.md: Make PV shim smaller by factoring out HVM-specific shadow code
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Wed, 24 Mar 2021 13:24:45 +0000 (13:24 +0000)]
CHANGELOG.md: Add entries for emulation
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Wed, 24 Mar 2021 16:20:28 +0000 (16:20 +0000)]
CHANGELOG.md: Add entries for CI loop
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Tue, 23 Mar 2021 17:06:20 +0000 (17:06 +0000)]
CHANGELOG.md: NetBSD lib/gnttab support
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Tue, 23 Mar 2021 16:58:42 +0000 (16:58 +0000)]
CHANGELOG.md: Add dom0/domU zstd compression support
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Tue, 23 Mar 2021 16:52:25 +0000 (16:52 +0000)]
CHANGELOG.md: Add named PCI devices
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
George Dunlap [Tue, 23 Mar 2021 13:55:57 +0000 (13:55 +0000)]
Intel Processor Trace Support: Add CHANGELOG.md and SUPPORT.md entries
Signed-off-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Ian Jackson <ian.jackson@citrix.com>
Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
static PIT clock gating") was reported to cause boot failures on certain
AMD Ryzen systems. Until we can figure out what the actual issue there
is, skip this new part of HPET setup by default. Introduce a "hpet"
command line option to allow enabling this on hardware where it's really
needed for Xen to boot successfully (i.e. where the PIT doesn't drive
the timer interrupt).
Since it makes little sense to introduce just "hpet=legacy-replacement",
also allow for a boolean argument as well as "broadcast" to replace the
separate "hpetbroadcast" option.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
(cherry picked from commit e680cc48b7184d3489873d6776f84ba1fc238ced)
Tamas K Lengyel [Fri, 26 Mar 2021 15:23:00 +0000 (16:23 +0100)]
x86/mem_sharing: copy parent VM's hostp2m's max_mapped_pfn during forking
When creating a VM fork copy the parent VM's hostp2m max_mapped_pfn value. Some
toolstack relies on the XENMEM_maximum_gpfn value to establish the maximum
addressable physical memory in the VM and for forks that have not yet been
unpaused that value is not going to reflect the correct max gpfn that's
possible to populate into the p2m. This patch fixes the issue.
Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
master commit: f10c41537113dc2406c26cdf134403864df02a7d
master date: 2021-03-26 16:17:07 +0100
Andrew Cooper [Fri, 26 Mar 2021 15:08:39 +0000 (16:08 +0100)]
Revert "x86/msr: drop compatibility #GP handling in guest_{rd,wr}msr()"
In hindsight, this was a poor move. Some of these MSRs require probing for,
cause unhelpful spew into xl dmesg, or cause spew from unit tests explicitly
checking behaviour.
This restores behaviour close to that of Xen 4.14, meaning in particular
that for all of the MSRs getting re-added explicitly a #GP fault will get
raised irrespective of the new "msr_relaxed" setting.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org> Acked-by: Ian Jackson <iwj@xenproject.org>
(cherry picked from commit 486cbda6dfbd1ba604cf51564dbc194ca5d0ead7)
Julien Grall [Thu, 25 Mar 2021 17:46:30 +0000 (17:46 +0000)]
docs/misc: xenstored: Re-instate and tweak the documentation for XS_RESUME
Commit 13dd372834a4 removed the documentation for XS_RESUME, however
this command is still implemented (at least in C Xenstored) and used by
libxl when resuming a domain.
So re-instate the documentation for the XS_RESUME. Take the opportunity
to update it as there is a user of the command.
Fixes: 13dd372834a4 ("docs/designs: re-work the xenstore migration document...") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
(cherry picked from commit 03bee3fd475cb755572dd21214daecf9a77a6bb4)
Julien Grall [Thu, 25 Mar 2021 10:42:46 +0000 (10:42 +0000)]
docs/design: Update xenstore-migration.md
It is not very clear the shared page adddress is not contained in the
connection record. Additionally, it is misleading to say the grant
will always point to the share paged as a domain is free to revoke the
permission. The restore code would need to make sure it doesn't
fail/crash if this is happening.
The sentence is now replaced with a paragraph explaining why the GFN is
not preserved and that the grant is not guarantee to exist during
restore.
Take the opportunity to replace "code" with "node" when description the
permission.
Reported-by: Raphael Ning <raphning@amazon.com> Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
(cherry picked from commit 7b6d288ef648c76da70fd9e41e73617631025cfe)
Julien Grall [Sat, 13 Mar 2021 13:50:44 +0000 (13:50 +0000)]
SUPPORT.MD: Mark LiveUpdate of C/OCaml xenstored daemon as Tech Preview
Support to liveupdate C/OCaml xenstored daemon was added during the
4.15 development cycle. Add two new sections in SUPPORT.MD to explain
what is the support state.
For now, it is a tech preview.
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Anthony PERARD [Wed, 24 Feb 2021 18:39:20 +0000 (18:39 +0000)]
libxl: Fix domain soft reset state handling
In do_domain_soft_reset(), a `libxl__domain_suspend_state' is used
without been properly initialised and disposed of. This lead do a
abort() in libxl due to the `dsps.qmp' state been used before been
initialised:
libxl__ev_qmp_send: Assertion `ev->state == qmp_state_disconnected || ev->state == qmp_state_connected' failed.
Once initialised, `dsps' also needs to be disposed of as the `qmp'
state might still be in the `Connected' state in the callback for
libxl__domain_suspend_device_model(). So this patch adds
libxl__domain_suspend_dispose() which can be called from the two
places where we need to dispose of `dsps'.
Reported-by: Olaf Hering <olaf@aepfle.de> Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Tested-by: Olaf Hering <olaf@aepfle.de>
Furthermore, pre-4.9 GCC have known bugs (including things like
internal compiler errors on Arm) which would require workaround (I
haven't checked if we have any in Xen).
The minimum version of GCC to build the hypervisor on arm is now
raised to 4.9.
In addition to that, on arm64, GCC version >= 4.9 and < 5.1 have been
shown to emit memory references beyond the stack pointer, resulting in
memory corruption if an interrupt is taken after the stack pointer has
been adjusted but before the reference has been executed.
Therefore, the minimum for arm64 is raised to 5.1.
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Mon, 15 Mar 2021 07:33:53 +0000 (08:33 +0100)]
tools/x86: don't rebuild cpuid-autogen.h every time
The first thing the "xen-dir" rule does is delete the entire xen/
subtree. Obviously this includes deleting xen/lib/x86/*autogen.h. As a
result there's no original version for $(move-if-changed ...) to compare
against, and hence the file and all its consumers would get rebuilt
every time. Instead only find and delete all the symlinks.
Fixes: eddf9559c977 ("libx86: generate cpuid-autogen.h in the libx86 include dir") Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 12 Mar 2021 11:03:06 +0000 (12:03 +0100)]
x86/AMD: expose HWCR.TscFreqSel to guests
Linux has been warning ("firmware bug") about this bit being clear for a
long time. While writable in older hardware it has been readonly on more
than just most recent hardware. For simplicitly report it always set (if
anything we may want to log the issue ourselves if it turns out to be
clear on older hardware) on CPU families 10h and up (in family 0fh the
bit is part of a larger field of different purpose).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 12 Mar 2021 11:02:42 +0000 (12:02 +0100)]
x86/PV: conditionally avoid raising #GP for early guest MSR reads
Prior to 4.15 Linux, when running in PV mode, did not install a #GP
handler early enough to cover for example the rdmsrl_safe() of
MSR_K8_TSEG_ADDR in bsp_init_amd() (not to speak of the unguarded read
of MSR_K7_HWCR later in the same function). The respective change
(42b3a4cb5609 "x86/xen: Support early interrupts in xen pv guests") was
backported to 4.14, but no further - presumably since it wasn't really
easy because of other dependencies.
Therefore, to prevent our change in the handling of guest MSR accesses
to render PV Linux 4.13 and older unusable on at least AMD systems, make
the raising of #GP on this paths conditional upon the guest having
installed a handler, provided of course the MSR can be read in the first
place (we would have raised #GP in that case even before). Producing
zero for reads isn't necessarily correct and may trip code trying to
detect presence of MSRs early, but since such detection logic won't work
without a #GP handler anyway, this ought to be a fair workaround.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org> Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 12 Mar 2021 16:35:54 +0000 (17:35 +0100)]
gnttab: work around "may be used uninitialized" warning
Sadly I was wrong to suggest dropping vaddrs' initializer during review
of v2 of the patch introducing this code. gcc 4.3 can't cope.
Fixes: 52531c734ea1 ("xen/gnttab: Rework resource acquisition") Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Dario Faggioli [Fri, 12 Mar 2021 16:02:47 +0000 (17:02 +0100)]
xen: fix for_each_cpu when NR_CPUS=1
When running an hypervisor build with NR_CPUS=1 for_each_cpu does not
take into account whether the bit of the CPU is set or not in the
provided mask.
This means that whatever we have in the bodies of these loops is always
done once, even if the mask was empty and it should never be done. This
is clearly a bug and was in fact causing an assert to trigger in credit2
code.
Removing the special casing of NR_CPUS == 1 makes things work again.
Reported-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Dario Faggioli <dfaggioli@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Igor Druzhinin [Fri, 12 Mar 2021 16:01:52 +0000 (17:01 +0100)]
vtd: make sure QI/IR are disabled before initialisation
BIOS might pass control to Xen leaving QI and/or IR in enabled and/or
partially configured state. In case of x2APIC code path where EIM is
enabled early in boot - those are correctly disabled by Xen before any
attempt to configure. But for xAPIC that step is missing which was
proven to cause QI initialization failures on some ICX based platforms
where QI is left pre-enabled and partially configured by BIOS. That
problem becomes hard to avoid since those platforms are shipped with
x2APIC opt out being advertised by default at the same time by firmware.
Unify the behaviour between x2APIC and xAPIC code paths keeping that in
line with what Linux does.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Fri, 12 Mar 2021 07:59:56 +0000 (08:59 +0100)]
x86/msr: introduce an option for compatible MSR behavior selection
Introduce an option to allow selecting a behavior similar to the pre
Xen 4.15 one for accesses to MSRs not explicitly handled. Since commit 84e848fd7a162f669 and 322ec7c89f6640e accesses to MSRs not explicitly
handled by Xen result in the injection of a #GP to the guest. This
is a behavior change since previously a #GP was only injected if
accessing the MSR on the real hardware would also trigger a #GP, or if
the attempted to be set bits wouldn't match the hardware values (for
PV). The reasons for not leaking hardware MSR values and injecting a
#GP are fully valid, so the solution proposed here should be
considered a temporary workaround until all the required MSRs are
properly handled.
This seems to be problematic for some guests, so introduce an option
to fallback to this kind of legacy behavior without leaking the
underlying MSR values to the guest.
When the option is set, for both PV and HVM don't inject a #GP to the
guest on MSR read if reading the underlying MSR doesn't result in a
#GP, do the same for writes and simply discard the value to be written
on that case.
Note that for guests restored or migrated from previous Xen versions
the option is enabled by default, in order to keep a compatible
MSR behavior. Such compatibility is done at the libxl layer, to avoid
higher-level toolstacks from having to know the details about this flag.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 4 Mar 2021 22:30:00 +0000 (22:30 +0000)]
tools/libs: Fix headers.chk logic
c/s 4664034cd dropped the $(LIBHEADERSGLOB) dependency for the headers.chk
rule, without replacing it.
As headers.chk uses $^, a typical build looks like:
andrewcoop@andrewcoop:/local/xen.git$ make -C tools/libs/devicemodel/
make: Entering directory '/local/xen.git/tools/libs/devicemodel'
for i in ; do \
gcc -x c -ansi -Wall -Werror -I/local/xen.git/tools/libs/devicemodel/../../../tools/include \
-S -o /dev/null $i || exit 1; \
echo $i; \
done >headers.chk.new
mv headers.chk.new headers.chk
i.e. with an empty for loop.
Reinsert a $(LIBHEADERS) dependency, so more than just the $(AUTOINCS) get
checked.
Fixes: 4664034cd ("tools/libs: move official headers to common directory") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 4 Mar 2021 22:30:00 +0000 (22:30 +0000)]
xen/dmop: Strip __XEN_TOOLS__ header guard from public ABI
__XEN_TOOLS__ is really there to separate the unstable from stable hypercalls.
Exactly as with c/s f40e1c52e4, stable interfaces shouldn't contain this
guard.
That change actually broke the build with:
include/xendevicemodel.h:52:5: error: unknown type name 'ioservid_t'
ioservid_t *id);
^
as libxendevicemodel.h now uses a type it can't see a typedef for. However,
nothing noticed because the header.chk logic is also broken (fixed
subsequently).
Strip the guard from the public header, and remove compensation from
devicemodel's private.h. Fix the dmop design doc to discuss both reasons
behind the the ABI design.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Mon, 1 Mar 2021 17:30:00 +0000 (17:30 +0000)]
tools/libxentoolcore: Fill in LIBHEADERS
c/s 4664034cd replaced a glob over include/*.h with an expectation that
LIBHEADER was suitably set for libraries which didn't have a single,
consistently named, header file.
This wasn't true for xentoolcore, which lost xentoolcore_internal.h as a
consequence, and failed an API/ABI check vs 4.14
Fixes: 4664034cd ("tools/libs: move official headers to common directory") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monne [Wed, 3 Mar 2021 14:33:16 +0000 (15:33 +0100)]
automation: allow doing hypervisor only builds
For things like randconfig there's no need to do a full Xen build, a
hypervisor build only will be much faster and will achieve the same
level of testing, as randconfig only changes the hypervisor build
options.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org> Acked-by: Doug Goldstein <cardoe@cardoe.com>
typeof() is available in Xen's build environment, which uses Xen's
compiler. As these headers are public, they need strict standards
conformance. Only __typeof__() is officially standardized.
A compiler in standards conformance mode should report:
warning: implicit declaration of function 'typeof' is invalid in C99
[-Wimplicit-function-declaration]
(this has been observed with FreeBSD's kernel build environment)
Based-on-patch-by: Julien Grall <julien@xen.org>, Sun Oct 4 20:33:04 2015 +0100 Signed-off-by: Elliott Mitchell <ehem+xen@m5p.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Fri, 5 Mar 2021 12:40:03 +0000 (12:40 +0000)]
tools/xenstore: Check the format printf for xprintf() and barf{,_perror}()
Allow GCC to analyze the format printf for xprintf() and
barf{,_perror}().
Take the opportunity to define __noreturn to make the prototype for
barf{,_perror})() easier to read.
Also document why 'extern' is used for xprintf().
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Fri, 5 Mar 2021 12:40:02 +0000 (12:40 +0000)]
tools/xenstore: Consolidate PRINTF_ATTRIBUTE() in utils.h
At the moment PRINTF_ATTRIBUTE() is defined in two places:
- tdb.h: Defined as a NOP
- talloc.h: Defined as a NOP for GCC older than 3.0 otherwise will
add the attribute to check the printf format
Xen requires to build with minimum GCC 4.1 and we want to check the
printf format for all the printf-like functions.
Only implement PRINTF_ATTRIBUTE() once in utils.h and drop the
conditional check for GCC < 3.0.
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Tue, 9 Mar 2021 17:04:07 +0000 (18:04 +0100)]
libxl/ACPI: add missing build dependency
Just like all other object files - wherever *.o is mentioned, *.opic
also needs mentioning to yield consistent behavior. Otherwise make may
decide to (re)build the object before recursion into $(ACPI_PATH)/ (to
update $(DSDT_FILES-y) and ssdt_*.h) was actually finished.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Fri, 26 Feb 2021 11:02:33 +0000 (11:02 +0000)]
automation: Fix the Alpine clang builds to use clang
Looks like a copy&paste error.
Fixes: f6e1d8515d7 ("automation: add alpine linux x86 build jobs") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Jan Beulich [Mon, 8 Mar 2021 09:41:50 +0000 (10:41 +0100)]
x86/shadow: suppress "fast fault path" optimization when running virtualized
We can't make correctness of our own behavior dependent upon a
hypervisor underneath us correctly telling us the true physical address
with hardware uses. Without knowing this, we can't be certain reserved
bit faults can actually be observed. Therefore, besides evaluating the
number of address bits when deciding whether to use the optimization,
also check whether we're running virtualized ourselves. (Note that since
we may get migrated when running virtualized, the number of address bits
may also change.)
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 5 Mar 2021 12:29:28 +0000 (13:29 +0100)]
x86/shadow: suppress "fast fault path" optimization without reserved bits
When none of the physical address bits in PTEs are reserved, we can't
create any 4k (leaf) PTEs which would trigger reserved bit faults. Hence
the present SHOPT_FAST_FAULT_PATH machinery needs to be suppressed in
this case, which is most easily achieved by never creating any magic
entries.
To compensate a little, eliminate sh_write_p2m_entry_post()'s impact on
such hardware.
While at it, also avoid using an MMIO magic entry when that would
truncate the incoming GFN.
Requested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tim Deegan <tim@xen.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
This is inappropriate for the header file of a standalone library with stable
API and ABI.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 4 Mar 2021 10:36:21 +0000 (10:36 +0000)]
xen/dmop: Fix XEN_DMOP_nr_vcpus to actually return data
The const_op boolean needs clobbering to cause data to be written back to the
caller.
Fixes: c4441ab1f1 ("dmop: Add XEN_DMOP_nr_vcpus") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Roger Pau Monné [Thu, 4 Mar 2021 15:49:00 +0000 (16:49 +0100)]
firmware: provide a stand alone set of headers
The current build of the firmware relies on having 32bit compatible
headers installed in order to build some of the 32bit firmware.
Usually this can be solved by using the -ffreestanding compiler option
which drops the usage of the system headers in favor of a private set
of freestanding headers provided by the compiler itself that are not
tied to libc.
However such option is broken at least in the gcc compiler provided in
Alpine Linux, as the system include path (ie: /usr/include) takes
precedence over the gcc private include path:
And the headers in /usr/include are exclusively 64bit.
Since -ffreestanding is currently broken on at least that distro, and
for resilience against future compilers also having the option broken
provide a set of stand alone 32bit headers required for the firmware
build.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 4 Mar 2021 15:47:51 +0000 (16:47 +0100)]
crypto: adjust rijndaelEncrypt() prototype for gcc11
The upcoming release complains, not entirely unreasonably:
In file included from rijndael.c:33:
.../xen/include/crypto/rijndael.h:55:53: note: previously declared as 'const unsigned char[]'
55 | void rijndaelEncrypt(const unsigned int [], int, const unsigned char [],
| ^~~~~~~~~~~~~~~~~~~~~~
rijndael.c:865:8: error: argument 4 of type 'u8[16]' {aka 'unsigned char[16]'} with mismatched bound [-Werror=array-parameter=]
865 | u8 ct[16])
| ~~~^~~~~~
In file included from rijndael.c:33:
.../xen/include/crypto/rijndael.h:56:13: note: previously declared as 'unsigned char[]'
56 | unsigned char []);
| ^~~~~~~~~~~~~~~~
Simply declare the correct array dimensions right away. This then allows
compilers to apply checking at call sites, which seems desirable anyway.
For the moment I'm leaving untouched the disagreement between u8/u32
used in the function definition and unsigned {char,int} used in the
declaration, as making this consistent would call for touching further
functions.
Reported-by: Charles Arnold <carnold@suse.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Julien Grall <jgrall@amazon.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Wed, 3 Mar 2021 17:05:26 +0000 (17:05 +0000)]
tools/xenstored: liveupdate: Properly check long transaction
As XenStored is single-threaded, conn->ta_start_time will always be
smaller than now. As we substract the latter from the former, it means
a transaction will never be considered long running.
Invert the two operands of the substraction in both lu_reject_reason()
and lu_check_allowed(). In addition to that, the former also needs to
check that conn->ta_start_time is not 0 (i.e the transaction is not
active).
Take the opportunity to document the return condition of
lu_check_allowed().
Fixes: e04e53a5be20 ("tools/xenstore: allow live update only with no transaction active") Reported-by: Bjoern Doebel <doebel@amazon.de> Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Norbert Manthey [Fri, 26 Feb 2021 14:41:44 +0000 (15:41 +0100)]
tools/xenstore: Harden xs_domain_is_introduced()
The function single_with_domid() may return NULL if something
went wrong (e.g. XenStored returns an error or the connection is
in bad state).
They are unlikely but not impossible, so it would be better to
return an error and allow the caller to handle it gracefully rather
than crashing.
In this case we should treat it as the domain has disappeared (i.e.
return false) as the caller will not likely going to be able to
communicate with XenStored again.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Signed-off-by: Norbert Manthey <nmanthey@amazon.de> Reviewed-by: Julien Grall <jgrall@amazon.co.uk> Reviewed-by: Raphael Ning <raphning@amazon.co.uk> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Norbert Manthey [Fri, 26 Feb 2021 14:41:41 +0000 (15:41 +0100)]
xenstore: handle do_mkdir and do_rm failure
In the out of memory case, we might return a NULL pointer when
canonicalizing node names. This NULL pointer is not checked when
creating a directory, or when removing a node. This change handles
the NULL pointer for these two cases.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Signed-off-by: Norbert Manthey <nmanthey@amazon.de> Reviewed-by: Thomas Friebel <friebelt@amazon.de> Reviewed-by: Julien Grall <jgrall@amazon.co.uk> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Tue, 2 Mar 2021 11:30:30 +0000 (12:30 +0100)]
x86/shadow: replace bogus return path in shadow_get_page_from_l1e()
Prior to be640b1800bb ("x86: make get_page_from_l1e() return a proper
error code") a positive return value did indicate an error. Said commit
failed to adjust this return path, but luckily the only caller has
always been inside a shadow_mode_refcounts() conditional.
Subsequent changes caused 1 to end up at the default (error) label in
the caller's switch() again, but the returning of 1 (== _PAGE_PRESENT)
is still rather confusing here, and a latent risk.
Convert to an ASSERT() instead, just in case any new caller would
appear.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Tue, 2 Mar 2021 11:29:16 +0000 (12:29 +0100)]
sched: fix build when NR_CPUS == 1
In this case the compiler is recognizing that no valid array indexes
remain, and hence e.g. reports:
core.c: In function 'cpu_schedule_up':
core.c:2769:19: error: array subscript 1 is above array bounds
of 'struct vcpu *[1]' [-Werror=array-bounds]
2769 | if ( idle_vcpu[cpu] == NULL )
| ~~~~~~~~~^~~~~
Reported-by: Connor Davis <connojdavis@gmail.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Dario Faggioli <dfaggioli@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Fri, 26 Feb 2021 10:56:40 +0000 (10:56 +0000)]
xen/iommu: x86: Clear the root page-table before freeing the page-tables
The new per-domain IOMMU page-table allocator will now free the
page-tables when domain's resources are relinquished. However, the
per-domain IOMMU structure will still contain a dangling pointer to
the root page-table.
Xen may access the IOMMU page-tables afterwards at least in the case of
PV domain:
This will result to a use after-free and possibly an host crash or
memory corruption.
It would not be possible to free the page-tables further down in
domain_relinquish_resources() because cleanup_page_mappings() will only
be called when the last reference on the page dropped. This may happen
much later if another domain still hold a reference.
After all the PCI devices have been de-assigned, nobody should use the
IOMMU page-tables and it is therefore pointless to try to modify them.
So we can simply clear any reference to the root page-table in the
per-domain IOMMU structure. This requires to introduce a new callback of
the method will depend on the IOMMU driver used.
Take the opportunity to add an ASSERT() in arch_iommu_domain_destroy()
to check if we freed all the IOMMU page tables.
Fixes: 3eef6d07d722 ("x86/iommu: convert VT-d code to use new page table allocator") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Fri, 26 Feb 2021 10:56:39 +0000 (10:56 +0000)]
xen/x86: iommu: Ignore IOMMU mapping requests when a domain is dying
The new x86 IOMMU page-tables allocator will release the pages when
relinquishing the domain resources. However, this is not sufficient
when the domain is dying because nothing prevents page-table to be
allocated.
As the domain is dying, it is not necessary to continue to modify the
IOMMU page-tables as they are going to be destroyed soon.
At the moment, page-table allocates will only happen when iommu_map().
So after this change there will be no more page-table allocation
happening because we don't use superpage mappings yet when not sharing
page tables.
In order to observe d->is_dying correctly, we need to rely on per-arch
locking, so the check to ignore IOMMU mapping is added on the per-driver
map_page() callback.
Fixes: 15bc9a1ef51c ("x86/iommu: add common page-table allocator") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Fri, 26 Feb 2021 10:56:38 +0000 (10:56 +0000)]
xen/iommu: x86: Don't try to free page tables is the IOMMU is not enabled
When using CONFIG_BIGMEM=y, the page_list cannot be accessed whilst it
is is unitialized. However, iommu_free_pgtables() will be called even if
the domain is not using an IOMMU.
Consequently, Xen will try to go through the page list and deference a
NULL pointer.
Bail out early if the domain is not using an IOMMU.
Fixes: 15bc9a1ef51c ("x86/iommu: add common page-table allocator") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Fri, 26 Feb 2021 18:26:55 +0000 (18:26 +0000)]
tools/xenstored: Avoid dereferencing a NULL pointer if LiveUpdate is failing
In case of failure in do_lu_start(), XenStored will first free lu_start
and then try to dereference it.
This will result to a NULL dereference as the destruction callback will
set lu_start to NULL.
The crash can be avoided by freeing lu_start *after* the reply has been
set.
Fixes: af216a99fb4a ("tools/xenstore: add the basic framework for doing the live update") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 25 Feb 2021 19:15:08 +0000 (19:15 +0000)]
tools/firmware: Build firmware as -ffreestanding
firmware should always have been -ffreestanding, as it doesn't execute in the
host environment. -ffreestanding implies -fno-builtin, so replace the option.
inttypes.h isn't a freestanding header, but the 32bitbios_support.c only wants
the stdint.h types so switch to the more appropriate include.
This removes the build time dependency on a 32bit libc just to compile the
hvmloader and friends.
Update README and the TravisCI configuration.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 25 Feb 2021 19:13:17 +0000 (19:13 +0000)]
tools/hvmloader: Drop machelf include as well
The logic behind switching to elfstructs applies to sun builds as well.
Fixes: 81b2b328a2 ("hvmloader: use Xen private header for elf structs") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 25 Feb 2021 20:30:49 +0000 (20:30 +0000)]
cirrus-ci: Drop obsolete dependency
markdown as a dependency was dropped in 4.12
Fixes: 5d94433a66 ("cirrus-ci: introduce some basic FreeBSD testing") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 25 Feb 2021 15:46:10 +0000 (15:46 +0000)]
dmop: Add XEN_DMOP_nr_vcpus
Curiously absent from the stable API/ABIs is an ability to query the number of
vcpus which a domain has. Emulators need to know this information in
particular to know how many stuct ioreq's live in the ioreq server mappings.
In practice, this forces all userspace to link against libxenctrl to use
xc_domain_getinfo(), which rather defeats the purpose of the stable libraries.
Introduce a DMOP to retrieve this information and surface it in
libxendevicemodel to help emulators shed their use of unstable interfaces.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
--- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> CC: Paul Durrant <paul@xen.org> CC: Stefano Stabellini <sstabellini@kernel.org> CC: Julien Grall <julien@xen.org> CC: Volodymyr Babchuk <Volodymyr_Babchuk@epam.com> CC: Ian Jackson <iwj@xenproject.org>
For 4.15. This was a surprise discovery in the massive ABI untangling effort
I'm currently doing for XenServer's new build system.
This is one new read-only op to obtain information which isn't otherwise
available under a stable API/ABI. As such, its risk for 4.15 is very low,
with a very real quality-of-life improvement for downstreams.
I realise this is technically a new feature and we're long past feature
freeze, but I'm hoping that "really lets some emulators move off the unstable
libraries" is sufficiently convincing argument.
It's not sufficient to let Qemu move off unstable libraries yet - at a
minimum, the add_to_phymap hypercalls need stabilising to support PCI
Passthrough and BAR remapping.
I'd prefer not to duplicate the op handling between ARM and x86, and if this
weren't a release window, I'd submit a prereq patch to dedup the common dmop
handling. That can wait to 4.16 at this point. Also, this op ought to work
against x86 PV guests, but fixing that up will also need this rearrangement
into common code, so needs to wait.
Andrew Cooper [Thu, 25 Feb 2021 16:54:17 +0000 (16:54 +0000)]
x86/dmop: Properly fail for PV guests
The current code has an early exit for PV guests, but it returns 0 having done
nothing.
Fixes: 524a98c2ac5 ("public / x86: introduce __HYPERCALL_dm_op...") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Ian Jackson <iwj@xenproject.org> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Sat, 20 Feb 2021 19:22:34 +0000 (19:22 +0000)]
xen/sched: Add missing memory barrier in vcpu_block()
The comment in vcpu_block() states that the events should be checked
/after/ blocking to avoids wakeup waiting race. However, from a generic
perspective, set_bit() doesn't prevent re-ordering. So the following
could happen:
CPU0 (blocking vCPU A) | CPU1 ( unblock vCPU A)
|
A <- read local events |
| set local events
| test_and_clear_bit(_VPF_blocked)
| -> Bail out as the bit if not set
|
set_bit(_VFP_blocked) |
|
check A |
The variable A will be 0 and therefore the vCPU will be blocked when it
should continue running.
vcpu_block() is now gaining an smp_mb__after_atomic() to prevent the CPU
to read any information about local events before the flag _VPF_blocked
is set.
Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Ash Wilding <ash.j.wilding@gmail.com> Acked-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Dario Faggioli <dfaggioli@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Thu, 25 Feb 2021 16:33:23 +0000 (16:33 +0000)]
tools/xenstored: control: Store the save filename in lu_dump_state
The function lu_close_dump_state() will use talloc_asprintf() without
checking whether the allocation succeeded. In the unlikely case we are
out of memory, we would dereference a NULL pointer.
As we already computed the filename in lu_get_dump_state(), we can store
the name in the lu_dump_state. This is avoiding to deal with memory file
in the close path and also reduce the risk to use the different
filename.
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Fixes: c0dc6a3e7c41 ("tools/xenstore: read internal state when doing live upgrade") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Julien Grall [Thu, 25 Feb 2021 15:43:04 +0000 (15:43 +0000)]
tools/xenstored: Avoid unnecessary talloc_strdup() in do_lu_start()
At the moment, the return of talloc_strdup() is not checked. This means
we may dereference a NULL pointer if the allocation failed.
However, it is pointless to allocate the memory as send_reply() will
copy the data to a different buffer. So drop the use of talloc_strdup().
This bug was discovered and resolved using Coverity Static Analysis
Security Testing (SAST) by Synopsys, Inc.
Fixes: af216a99fb4a ("tools/xenstore: add the basic framework for doing the live update") Signed-off-by: Julien Grall <jgrall@amazon.com> Reviewed-by: Juergen Gross <jgross@suse.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Fri, 26 Feb 2021 09:18:59 +0000 (10:18 +0100)]
VMX: delay p2m insertion of APIC access page
Inserting the mapping at domain creation time leads to a memory leak
when the creation fails later on and the domain uses separate CPU and
IOMMU page tables - the latter requires intermediate page tables to be
allocated, but there's no freeing of them at present in this case. Since
we don't need the p2m insertion to happen this early, avoid the problem
altogether by deferring it until the last possible point. This comes at
the price of not being able to handle an error other than by crashing
the domain.
Reported-by: Julien Grall <julien@xen.org> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Andrew Cooper [Thu, 25 Feb 2021 14:09:26 +0000 (14:09 +0000)]
automation: Fix containerize to understand the Alpine container
This was missing from the work to add the alpine container.
Fixes: a9afe7768bd ("automation: add alpine linux 3.12 x86 build container") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 25 Feb 2021 14:39:09 +0000 (15:39 +0100)]
x86/PV: use get_unsafe() instead of copy_from_unsafe()
The former expands to a single (memory accessing) insn, which the latter
does not guarantee (the __builtin_constant_p() based switch() statement
there is just an optimization). Yet we'd prefer to read consistent PTEs
rather than risking a split read racing with an update done elsewhere.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 25 Feb 2021 14:38:35 +0000 (15:38 +0100)]
x86: move stac()/clac() from {get,put}_unsafe_asm() ...
... to {get,put}_unsafe_size(). There's no need to have the macros
expanded once per case label in the latter. This also makes the former
well-formed single statements again. No change in generated code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 25 Feb 2021 14:37:35 +0000 (15:37 +0100)]
x86: rename copy_{from,to}_user() to copy_{from,to}_guest_pv()
Bring them (back) in line with __copy_{from,to}_guest_pv(). Since it
falls in the same group, also convert clear_user(). Instead of adjusting
__raw_clear_guest(), drop it - it's unused and would require a non-
checking __clear_guest_pv() which we don't have.
Add previously missing __user at some call sites and in the function
declarations.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>
Jan Beulich [Thu, 25 Feb 2021 14:36:54 +0000 (15:36 +0100)]
x86/gdbsx: convert "user" to "guest" accesses
Using copy_{from,to}_user(), this code was assuming to be called only by
PV guests. Use copy_{from,to}_guest() instead, transforming the incoming
structure field into a guest handle (the field should really have been
one in the first place). Also do not transform the debuggee address into
a pointer.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Release-Acked-by: Ian Jackson <iwj@xenproject.org>