Ian Jackson [Fri, 22 Feb 2019 12:24:35 +0000 (12:24 +0000)]
pygrub: Specify -rpath LIBEXEC_LIB when building fsimage.so
If LIBEXEC_LIB is not on the default linker search path, the python
fsimage.so module fails to find libfsimage.so.
Add the relevant directory to the rpath explicitly.
(This situation occurs in the Debian package, where
--with-libexec-libdir is used to put each Xen version's libraries and
utilities in their own directory, to allow them to be coinstalled.)
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
We install libfsimage in a non-standard path for Reasons.
(See debian/rules.)
This patch was originally part of `tools-pygrub-prefix.diff'
(eg commit 51657319be54) and included changes to the Makefile to
change the installation arrangements (we do that part in the rules now
since that is a lot less prone to conflicts when we update) and to
shared library rpath (which is now done in a separate patch).
(Commit message rewritten by Ian Jackson.)
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
squash! pygrub: Set sys.path and rpath
Ian Jackson [Thu, 21 Feb 2019 16:05:40 +0000 (16:05 +0000)]
hotplug-common: Do not adjust LD_LIBRARY_PATH
This is in the upstream script because on non-Debian systems, the
default install locations in /usr/local/lib might not be on the linker
path, and as a result the hotplug scripts would break.
A reason we might need it in Debian is our multiple version
coinstallation scheme. However, the hotplug scripts all call the
utilities via the wrappers, and the binaries are configured to load
from the right place anyway.
This setting is an annoyance because it requires libdir, which is an
arch-specific path but comes from a file we want to put in
xen-utils-common, an arch:all package.
So drop this setting.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Also see Debian bug #894013. The current attempt at providing
anti-spoofing rules has no effect in practice. Also note that forwarding
bridged traffic to iptables is not
enabled by default, and that for openvswitch users it does not make any
sense.
So, stop cluttering the live iptables ruleset.
This functionality seems to have been introduced before 2004, and it
has not received any attention since.
It would be nice to have a proper discussion upstream about how Xen
could provide some anti mac/ip spoofing in the dom0. It does not seem to
be a trivial thing to do, since it requires having quite some knowledge
about what the domU is allowed to do or not (e.g. a domU can be a
router...).
(XEN) Xen version 4.11.1 (Debian )
(@)
(gcc (Debian 8.2.0-13) 8.2.0) debug=n
Thu Jan 3 19:08:37 UTC 2019
I'd like to see:
(XEN) Xen version 4.11.1 (Debian 4.11.1-1~)
(pkg-xen-devel@lists.alioth.debian.org)
(gcc (Debian 8.2.0-13) 8.2.0) debug=n
Thu Jan 3 22:44:00 CET 2019
The substitution was broken since the great packaging refactoring,
because the directory in which the build is done changed.
Also, use the Maintainer address from debian/control instead of the most
recent changelog entry. If someone wants to use the address to ask a
question, they will end up at the team mailing list, which is better
than an individual person.
Ian Jackson [Mon, 15 Oct 2018 11:11:32 +0000 (12:11 +0100)]
Revert "tools-xenstore-compatibility.diff"
Following recent discussion in pkg-xen-devel and xen-devel,
https://lists.xenproject.org/archives/html/xen-devel/2018-10/msg00838.html
I am dropping this patch.
For now I revert it. When we next debrebase, we can (if we like)
throw away both the original patch, and this revert.
Ian Jackson [Fri, 12 Oct 2018 17:17:10 +0000 (17:17 +0000)]
shim: Provide separate install-shim target
When building on a 32-bit userland, the user wants to build 32-bit
tools and a 64-bit hypervisor. This involves setting XEN_TARGET_ARCH
to different values for the tools build and the hypervisor build.
So the user must invoke the tools build and the hypervisor build
separately.
However, although the shim is done by the tools/firmware Makefile, its
bitness needs to be the same as the hypervisor, not the same as the
tools. When run with XEN_TARGET_ARCH=x86_32, it is skipped, which is
wrong.
So the user must invoke the shim build separately. This can be done
with
make -C tools/firmware/xen-dir XEN_TARGET_ARCH=x86_64
However, tools/firmware/xen-dir has no `install' target. The
installation of all `firmware' is done in tools/firmware/Makefile. It
might be possible to fix this, but it is not trivial. For example,
the definitions of INST_DIR and DEBG_DIR would need to be copied, as
would an appropriate $(INSTALL_DIR) call.
For now, provide an `install-shim' target in tools/firmware/Makefile.
This has to be called from `install' of course. We can't make it
a dependency of `install' because it might be run before `all' has
completed. We could make it depend on a `shim' target but such
a target is nearly impossible to write because everything is done by
the inflexible subdir-$@ machinery.
The overall result of this patch is that existing make invocations
work as before. But additionally, the user can say
make -C tools/firmware install-shim XEN_TARGET_ARCH=x86_64
to install the shim. The user must have built it already.
Unlike the build rune, this install-rune is properly conditional
so it is OK to call on ARM.
What a mess.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
This makes it easier to disable the shim build. (In Debian we need to
build the shim separately because it needs different compiler flags
and a different XEN_COMPILE_ARCH.)
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
This is due to the combination of GCC6, and Debian's decision to
enable some hardening flags by default (to try to make runtime
addresses less predictable):
https://wiki.debian.org/Hardening/PIEByDefaultTransition
This is of no benefit for the x86 instruction emulator test, which is
a rebuild of the emulator code for testing purposes only. So pass
options to disable this.
These options will be no-ops if they are the same as the compiler
default.
On amd64, the -fno-pic breaks the build in a different way. So do
this only on i386.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com> CC: Jan Beulich <jbeulich@suse.com> CC: Andrew Cooper <andrew.cooper3@citrix.com>
Gbp-Pq: Topic misc
Gbp-Pq: Name toolstestsx86_emulator-pass--no-pie--fno.patch
kdd.c:698:13: error: 'memcpy' offset [-204, -717] is out of the bounds [0, 216] of object 'ctrl' with type 'kdd_ctrl' {aka 'union <anonymous>'} [-Werror=array-bounds]
memcpy(buf, ((uint8_t *)&ctrl.c32) + offset, len);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
kdd.c: In function 'kdd_select_callback':
kdd.c:642:14: note: 'ctrl' declared here
kdd_ctrl ctrl;
^~~~
But this is impossible - 'offset' is unsigned and correctly validated
a few lines before.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-Acked-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit 437e00fea04becc91c1b6bc1c0baa636b067a5cc)
Add zero-padding to #defined ACPI table strings that are copied.
Provides sufficient characters to satisfy the length required to
fully populate the destination and prevent array-bounds warnings.
Add BUILD_BUG_ON sizeof checks for compile-time length checking.
Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org> Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit b8f33431f3dd23fb43a879f4bdb4283fdc9465ad)
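A rough illustration of the technique, not the actual libacpi code: the
struct and macro names below are hypothetical, and _Static_assert stands in
for BUILD_BUG_ON(). Padding the literal with explicit NULs means a
fixed-size copy never reads past the end of the string, and the size check
pins the length at compile time.

```c
#include <string.h>

/* Hypothetical header with a 6-byte OEM ID field. */
struct demo_header {
    char oem_id[6];
};

/* "Xen" plus explicit zero padding: 6 characters + implicit NUL = 7 bytes. */
#define DEMO_OEM_ID "Xen\0\0\0"

/* Compile-time length check, standing in for BUILD_BUG_ON(). */
_Static_assert(sizeof(DEMO_OEM_ID) ==
               sizeof(((struct demo_header *)0)->oem_id) + 1,
               "DEMO_OEM_ID must exactly fill oem_id");

static void demo_fill_header(struct demo_header *h)
{
    /* Copies exactly 6 bytes, all of which lie inside the literal. */
    memcpy(h->oem_id, DEMO_OEM_ID, sizeof(h->oem_id));
}
```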
Andrew Cooper [Wed, 4 Jul 2018 13:32:31 +0000 (14:32 +0100)]
tools: Move ARRAY_SIZE() into xen-tools/libs.h
xen-tools/libs.h currently contains a shared BUILD_BUG_ON() implementation and
is used by some tools. Extend this to include ARRAY_SIZE and clean up all the
opencoding.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit e1b7eb92d3ec6ce3ca68cffb36a148eb59f59613)
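For reference, the conventional form of such a macro is sketched below. The
exact definition in xen-tools/libs.h may differ, and demo_names and
demo_count() are made up for the example.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const char *const demo_names[] = { "dom0", "domU", "stubdom" };

/* Replaces open-coded sizeof(demo_names) / sizeof(demo_names[0]). */
static size_t demo_count(void)
{
    return ARRAY_SIZE(demo_names);
}
```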
Wei Liu [Thu, 26 Jul 2018 14:58:54 +0000 (15:58 +0100)]
xenpmd: make 32 bit gcc 8.1 non-debug build work
32 bit gcc 8.1 non-debug build yields:
xenpmd.c:354:23: error: '%02x' directive output may be truncated writing between 2 and 8 bytes into a region of size 3 [-Werror=format-truncation=]
snprintf(val, 3, "%02x",
^~~~
xenpmd.c:354:22: note: directive argument in the range [40, 2147483778]
snprintf(val, 3, "%02x",
^~~~~~
xenpmd.c:354:5: note: 'snprintf' output between 3 and 9 bytes into a destination of size 3
snprintf(val, 3, "%02x",
^~~~~~~~~~~~~~~~~~~~~~~~
(unsigned int)(9*4 +
~~~~~~~~~~~~~~~~~~~~
strlen(info->model_number) +
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
strlen(info->serial_number) +
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
strlen(info->battery_type) +
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
strlen(info->oem_info) + 4));
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All info->* fields used in the calculation are 32 bytes long, and the
parsing code makes sure they are null-terminated, so the result of the
expression won't exceed 255, which fits into 3 bytes in hexadecimal
format (two digits plus the terminating NUL).
Add an assertion to make gcc happy.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit e75c9dc85fdeeeda0b98d8cd8d784e0508c3ffb8)
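A minimal sketch of the pattern, assuming a cut-down stand-in for the
battery info structure (demo_info and demo_write_len() are hypothetical, not
the xenpmd code): the assertion documents the bound that makes a 3-byte
buffer sufficient.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the battery info fields (32 bytes each). */
struct demo_info {
    char model_number[32];
    char serial_number[32];
};

/* Writes the total length as two hex digits (plus NUL) into val[3]. */
static void demo_write_len(char val[3], const struct demo_info *info)
{
    unsigned int len = 4 + strlen(info->model_number) +
                       strlen(info->serial_number);

    /* State the bound explicitly: two hex digits are enough. */
    assert(len <= 0xff);
    snprintf(val, 3, "%02x", len);
}
```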
Ian Jackson [Thu, 4 Oct 2018 11:32:00 +0000 (12:32 +0100)]
pygrub fsimage.so: Honour LDFLAGS when building
This seems to have been simply omitted. Obviously this is needed when
building and not just when installing. Passing only when installing
is ineffective.
Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
Ian Jackson [Fri, 21 Sep 2018 14:40:19 +0000 (15:40 +0100)]
INSTALL: Mention kconfig
Firstly, add a reference to the documentation for the kconfig system.
Secondly, warn the user about the XEN_CONFIG_EXPERT problem.
CC: Doug Goldstein <cardoe@cardoe.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Jan Beulich <jbeulich@suse.com> CC: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@citrix.com>
| xentop.c: In function 'print':
| xentop.c:304:4: error: 'vwprintw' is deprecated [-Werror=deprecated-declarations]
| vwprintw(stdscr, (curses_str_t)fmt, args);
| ^~~~~~~~
vw_printw (note the underscore) is a non-deprecated alternative.
Signed-off-by: Christopher Clark <christopher.clark6@baesystems.com>
Gbp-Pq: Topic misc
Gbp-Pq: Name tools-xentop-replace-use-of-deprecated-vwprintw.patch
Update to 4.11.1+92-g6c33308a8d-2 with MDS documentation
Following up feedback from the release team, add a NEWS file mentioning
the MDS mitigations with some instructions, so that it will be more
visible to people using apt-listchanges.
Mention the ucode option in our default documented set of "usually used
options", so that users doing a new install will get a hint about the
existence of this option, and what it does.
Julien Grall [Thu, 29 Nov 2018 11:37:43 +0000 (11:37 +0000)]
xen/arm: mm: Set-up page permission for Xen mappings earlier on
The Xen mapping is first created using a 2MB page and then shattered into
4KB pages for fine-grained permissions. However, it is not safe to break
down a superpage without going through an intermediate step that
invalidates the entry.
As we are changing Xen mappings, we cannot go through the intermediate
step. The only solution is to create the Xen mapping using 4KB entries
directly. As Xen should always access the mappings according to the
runtime permissions, it is then possible to set up the permissions
while creating the mapping.
We are still playing with fire, as there are still some
break-before-make issues in setup_pagetables (i.e. the switch between 2 sets
of page-tables). But it should be slightly better than the current state.
Igor Druzhinin [Thu, 6 Jun 2019 12:11:24 +0000 (14:11 +0200)]
libacpi: report PCI slots as enabled only for hotpluggable devices
DSDT for qemu-xen lacks _STA method of PCI slot object. If _STA method
doesn't exist then the slot is assumed to be always present and active
which in conjunction with _EJ0 method makes every device ejectable for
an OS even if it's not the case.
qemu-kvm is able to dynamically add _EJ0 method only to those slots
that either have hotpluggable devices or are free for PCI passthrough.
As Xen lacks this capability we cannot use their way.
qemu-xen-traditional DSDT has _STA method which only reports that
the slot is present if there is a PCI device hotplugged there.
This is done through querying of its PCI hotplug controller.
qemu-xen has similar capability that reports if device is "hotpluggable
or absent" which we can use to achieve the same result.
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 6761965243b113230bed900d6105be05b28f5cea
master date: 2019-05-24 10:30:21 +0200
Jan Beulich [Thu, 6 Jun 2019 12:11:09 +0000 (14:11 +0200)]
x86/IO-APIC: fix build with gcc9
There are a number of pointless __packed attributes which cause gcc 9 to
legitimately warn:
utils.c: In function 'vtd_dump_iommu_info':
utils.c:287:33: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member]
287 | remap = (struct IO_APIC_route_remap_entry *) &rte;
| ^~~~~~~~~~~~~~~~~~~~~~~~~
intremap.c: In function 'ioapic_rte_to_remap_entry':
intremap.c:343:25: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member]
343 | remap_rte = (struct IO_APIC_route_remap_entry *) old_rte;
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Simply drop these attributes. Take the liberty and also re-format the
structure definitions at the same time.
Reported-by: Charles Arnold <carnold@suse.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: ca9310b24e6205de5387e5982ccd42c35caf89d4
master date: 2019-05-24 10:19:59 +0200
Jan Beulich [Thu, 6 Jun 2019 12:10:46 +0000 (14:10 +0200)]
x86emul: add support for missing {,V}PMADDWD insns
Their pre-AVX512 incarnations have clearly been overlooked during much
earlier work. Their memory access pattern is entirely standard, so no
specific tests get added to the harness.
Reported-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Alexandru Isaila <aisaila@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 1a48bdd599b268a2d9b7d0c45f1fd40c4892186e
master date: 2019-05-16 13:43:17 +0200
Jan Beulich [Thu, 6 Jun 2019 12:09:56 +0000 (14:09 +0200)]
x86/IRQ: avoid UB (or worse) in trace_irq_mask()
Dynamically allocated CPU mask objects may be smaller than cpumask_t, so
copying has to be restricted to the actual allocation size. This is
particularly important since the function doesn't bail early when tracing
is not active, so even production builds would be affected by potential
misbehavior here.
Take the opportunity and also
- use initializers instead of assignment + memset(),
- constify the cpumask_t input pointer,
- u32 -> uint32_t.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
master commit: 6fafb8befa99620a2d7323b9eca5c387bad1f59f
master date: 2019-05-13 16:41:03 +0200
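The idea can be sketched as follows. This is only an illustration with
made-up names (demo_cpumask_t, demo_copy_mask(), DEMO_NR_CPUS), not the code
in trace_irq_mask(): the copy is sized by the number of CPU IDs the source
was allocated for, and the destination starts out zeroed via an initializer.

```c
#include <string.h>

#define DEMO_NR_CPUS 256                       /* compile-time maximum */
#define DEMO_BITS_PER_LONG (8 * sizeof(unsigned long))
#define DEMO_LONGS(bits) \
    (((bits) + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

typedef struct {
    unsigned long bits[DEMO_LONGS(DEMO_NR_CPUS)];
} demo_cpumask_t;

/*
 * Copy a mask that may have been allocated for only nr_ids CPUs into a
 * full-size object, never reading past the end of the allocation.
 */
static demo_cpumask_t demo_copy_mask(const unsigned long *src,
                                     unsigned int nr_ids)
{
    demo_cpumask_t dst = { { 0 } };   /* initializer instead of memset() */

    if (nr_ids > DEMO_NR_CPUS)
        nr_ids = DEMO_NR_CPUS;
    memcpy(dst.bits, src, DEMO_LONGS(nr_ids) * sizeof(unsigned long));
    return dst;
}
```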
Andrew Cooper [Thu, 6 Jun 2019 12:09:37 +0000 (14:09 +0200)]
x86/boot: Fix latent memory corruption with early_boot_opts_t
c/s ebb26b509f "xen/x86: make VGA support selectable" added an #ifdef
CONFIG_VIDEO into the middle of the backing space for early_boot_opts_t,
but didn't adjust the structure definition in cmdline.c.
This only functions correctly because the affected fields are at the end
of the structure, and cmdline.c doesn't write to them in this case.
To retain the slimming effect of compiling out CONFIG_VIDEO, adjust
cmdline.c with enough #ifdef-ary to make C's idea of the structure match
the declaration in asm. This requires adding __maybe_unused annotations
to two helper functions.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 30596213617fcf4dd7b71d244e16c8fc0acf456b
master date: 2019-05-13 10:35:38 +0100
Andrew Cooper [Thu, 6 Jun 2019 12:09:20 +0000 (14:09 +0200)]
x86/svm: Fix handling of ICEBP intercepts
c/s 9338a37d "x86/svm: implement debug events" added support for introspecting
ICEBP debug exceptions, but didn't account for the fact that
svm_get_insn_len() (previously __get_instruction_length) can fail and may
already have raised #GP with the guest.
If svm_get_insn_len() fails, return back to guest context rather than
continuing and mistaking a trap-style VMExit for a fault-style one.
Spotted by Coverity.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Brian Woods <brian.woods@amd.com>
master commit: 1495b4ff9b4af2b9c0f12cdb6491082cecf34f86
master date: 2019-05-13 10:35:37 +0100
The limit 1900x1200 does not match real-world devices (1900 looks like a
typo, should be 1920). But in practice the limits are arbitrary and do
not serve any real purpose. As discussed in the "Increase framebuffer size
to todays standards" thread, drop them completely.
This fixes the graphical console on devices with a 3840x2160 native resolution.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
drivers/video: drop unused limits
MAX_BPP, MAX_FONT_W, MAX_FONT_H are not used in the code at all.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 19600eb75aa9b1df3e4b0a4e55a5d08b957e1fd9
master date: 2019-05-13 10:13:24 +0200
master commit: 343459e34a6d32ba44a21f8b8fe4c1f69b1714c2
master date: 2019-05-13 10:12:56 +0200
When bitmap_fill(..., 0) is called, do not try to write anything. Before
this patch, it tried to write almost LONG_MAX, surely overwriting
something.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 93df28be2d4f620caf18109222d046355ac56327
master date: 2019-05-13 10:12:00 +0200
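To illustrate the failure mode, here is a sketch of a fill routine with an
explicit zero-length guard. It is not the Xen implementation; the function
and macro names are hypothetical. Without the early return, the
"all words but the last" length computation would wrap around for nbits == 0.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BITS_PER_LONG (CHAR_BIT * sizeof(unsigned long))

/* Set the first nbits bits of dst; nbits == 0 must not touch dst at all. */
static void demo_bitmap_fill(unsigned long *dst, unsigned int nbits)
{
    size_t nlongs = (nbits + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG;
    unsigned int rem = nbits % DEMO_BITS_PER_LONG;

    if (nlongs == 0)
        return;   /* otherwise (nlongs - 1) below would wrap around */

    /* Fill all complete words except the last one. */
    memset(dst, 0xff, (nlongs - 1) * sizeof(unsigned long));
    /* The last word is either partial (rem bits) or completely set. */
    dst[nlongs - 1] = rem ? (1UL << rem) - 1 : ~0UL;
}
```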
Tamas K Lengyel [Thu, 6 Jun 2019 12:07:54 +0000 (14:07 +0200)]
x86/vmx: correctly gather gs_shadow value for current vCPU
Currently the gs_shadow value is only cached when the vCPU is being scheduled
out by Xen. Reporting this (usually) stale value through vm_event is incorrect,
since it doesn't represent the actual state of the vCPU at the time the event
was recorded. This prevents vm_event subscribers from correctly finding kernel
structures in the guest when it is trapped while in ring3.
Refresh shadow_gs value when the context being saved is for the current vCPU.
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
master commit: f69fc1c2f36e8a74ba54c9c8fa5c904ea1ad319e
master date: 2019-05-13 09:55:59 +0200
Igor Druzhinin [Thu, 6 Jun 2019 12:07:06 +0000 (14:07 +0200)]
x86/mtrr: recalculate P2M type for domains with iocaps
This change reflects the logic in epte_get_entry_emt() and allows
changes in guest MTRRs to be reflected in EPT for domains having
direct access to certain hardware memory regions but without IOMMU
context assigned (e.g. XenGT).
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: f3d880bf2be92534c5bacf11de2f561cbad550fb
master date: 2019-05-13 09:54:45 +0200
Jan Beulich [Thu, 6 Jun 2019 12:06:49 +0000 (14:06 +0200)]
AMD/IOMMU: disable previously enabled IOMMUs upon init failure
If any IOMMUs were successfully initialized before encountering failure,
the successfully enabled ones should be disabled again before cleaning
up their resources.
Move disable_iommu() next to enable_iommu() to avoid a forward
declaration, and take the opportunity to remove stray blank lines ahead
of both functions' final closing braces.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Brian Woods <brian.woods@amd.com>
master commit: 87a3347d476443c66c79953d77d6aef1d2bb3bbd
master date: 2019-05-13 09:52:43 +0200
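The unwinding pattern described here looks roughly like the sketch below.
All names (demo_iommu, demo_enable(), demo_disable(), demo_init_all()) are
hypothetical and the state is reduced to two flags; the real AMD IOMMU
driver code is considerably more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state, reduced to what the example needs. */
struct demo_iommu {
    bool enabled;
    bool broken;      /* makes demo_enable() fail, for illustration */
};

static int demo_enable(struct demo_iommu *iommu)
{
    if (iommu->broken)
        return -1;
    iommu->enabled = true;
    return 0;
}

static void demo_disable(struct demo_iommu *iommu)
{
    iommu->enabled = false;
}

/* Enable all units; on failure, disable the ones already enabled. */
static int demo_init_all(struct demo_iommu *units, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (demo_enable(&units[i]) != 0) {
            /* Walk back over units 0..i-1 before resources are freed. */
            while (i-- > 0)
                demo_disable(&units[i]);
            return -1;
        }
    }
    return 0;
}
```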
Jan Beulich [Thu, 6 Jun 2019 12:06:29 +0000 (14:06 +0200)]
trace: fix build with gcc9
While I've not observed this myself, gcc 9 (imo validly) reportedly may
complain
trace.c: In function '__trace_hypercall':
trace.c:826:19: error: taking address of packed member of 'struct <anonymous>' may result in an unaligned pointer value [-Werror=address-of-packed-member]
826 | uint32_t *a = d.args;
and the fix is rather simple - remove the __packed attribute. Introduce
a BUILD_BUG_ON() as replacement, for the unlikely case that Xen might
get ported to an architecture where array alignment is higher than that of
its elements.
Reported-by: Martin Liška <martin.liska@suse.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: George Dunlap <george.dunlap@citrix.com>
master commit: 3fd3b266d4198c06e8e421ca515d9ba09ccd5155
master date: 2019-05-13 09:51:23 +0200
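A sketch of what such a replacement check can look like, with
_Static_assert standing in for BUILD_BUG_ON() and a hypothetical record
layout (struct demo_trace_rec is not the structure used in trace.c):

```c
#include <stddef.h>
#include <stdint.h>

/* Unpacked layout: a 32-bit op followed by six 32-bit arguments. */
struct demo_trace_rec {
    uint32_t op;
    uint32_t args[6];
};

/*
 * Fail the build if padding appears between op and args, i.e. if the
 * array were aligned more strictly than its elements.
 */
_Static_assert(offsetof(struct demo_trace_rec, args) == sizeof(uint32_t),
               "unexpected padding in struct demo_trace_rec");

static uint32_t demo_first_arg(struct demo_trace_rec *d)
{
    uint32_t *a = d->args;   /* fine: args is naturally aligned */
    return a[0];
}
```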
Juergen Gross [Mon, 27 May 2019 13:55:20 +0000 (15:55 +0200)]
xen/sched: fix csched2_deinit_pdata()
Commit 753ba43d6d16e688 ("xen/sched: fix credit2 smt idle handling")
introduced a regression when switching cpus between cpupools.
When assigning a cpu to a cpupool with credit2 being the default
scheduler csched2_deinit_pdata() is called for the credit2 private data
after the new scheduler's private data has been hooked to the per-cpu
scheduler data. Unfortunately csched2_deinit_pdata() will cycle through
all per-cpu scheduler areas it knows of for removing the cpu from the
respective sibling masks including the area of the just moved cpu. This
will (depending on the new scheduler) either clobber the data of the
new scheduler or in case of sched_rt lead to a crash.
Avoid that by removing the cpu from the list of active cpus in credit2
data first.
The opposite problem is occurring when removing a cpu from a cpupool:
init_pdata() of credit2 will access the per-cpu data of the old
scheduler.
Andrew Cooper [Wed, 3 Oct 2018 09:32:54 +0000 (10:32 +0100)]
oxenstored: Don't re-open a xenctrl handle for every domain introduction
Currently, an xc handle is opened in main() which is used for cleanup
activities, and a new xc handle is temporarily opened every time a domain is
introduced. This is inefficient, and amongst other things, requires full root
privileges for the lifetime of oxenstored.
All code using the Xenctrl handle is in domains.ml, so initialise xc as a
global (now happens just before main() is called) and drop it as a parameter
from Domains.create and Domains.cleanup.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Christian Lindig <christian.lindig@citrix.com>
(cherry picked from commit 129025fe30934c6a04bbd9c05ade479d34ce4985)
Andrew Cooper [Thu, 29 Nov 2018 18:10:38 +0000 (18:10 +0000)]
tools/libxc: Fix issues with libxc and Xen having different featureset lengths
In almost all cases, Xen and libxc will agree on the featureset length,
because they are built from the same source.
However, there are circumstances (e.g. security hotfixes) where the featureset
gets longer and dom0 will, after installing updates, be running with an old
Xen but new libxc. Despite writing the code with this scenario in mind, there
were some bugs.
First, xen-cpuid's get_featureset() erroneously allocates a buffer based on
Xen's featureset length, but records libxc's length, which may be longer.
In this situation, the hypercall bounce buffer code reads/writes the recorded
length, which is beyond the end of the allocated object, and a later free()
encounters corrupt heap metadata. Fix this by recording the same length that
we allocate.
Secondly, get_cpuid_domain_info() has a related bug when the passed-in
featureset is a different length to libxc's.
A large amount of the libxc cpuid functionality depends on info->featureset
being as long as expected, and it is allocated appropriately. However, in the
case that a shorter external featureset is passed in, the logic to check for
trailing nonzero bits may read off the end of it. Rework the logic to use the
correct upper bound.
In addition, leave a comment next to the fields in struct cpuid_domain_info
explaining the relationship between the various lengths, and how to cope with
different lengths.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit c393b64dcee6684da25257b033148740cb6d7ff0)
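The second issue can be illustrated with the sketch below. demo_import()
and its parameters are hypothetical and do not mirror the libxc function
signatures; the point is only that the trailing-bits check is bounded by
the length of the passed-in featureset, so a shorter one is never read past
its end.

```c
#include <stdint.h>
#include <string.h>

/*
 * Copy an external featureset (src_len words) into an internal buffer
 * (dst_len words). Returns -1 if set bits would be truncated.
 */
static int demo_import(uint32_t *dst, unsigned int dst_len,
                       const uint32_t *src, unsigned int src_len)
{
    unsigned int n = src_len < dst_len ? src_len : dst_len;

    memset(dst, 0, dst_len * sizeof(*dst));
    memcpy(dst, src, n * sizeof(*src));

    /* Only words n..src_len-1 exist beyond what was copied. */
    for (unsigned int i = n; i < src_len; i++)
        if (src[i] != 0)
            return -1;
    return 0;
}
```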
Igor Druzhinin [Tue, 9 Apr 2019 12:01:58 +0000 (13:01 +0100)]
tools/xl: use libxl_domain_info to get domain type for vcpu-pin
Parsing the config seems to be overkill for this particular task,
and the config might simply be absent. The type returned from libxl_domain_info
should be either LIBXL_DOMAIN_TYPE_HVM or LIBXL_DOMAIN_TYPE_PV, but in
that context the distinction between PVH and HVM should be irrelevant.
Juergen Gross [Fri, 31 Aug 2018 15:22:04 +0000 (17:22 +0200)]
tools/libxl: correct vcpu affinity output with sparse physical cpu map
With not all physical cpus online (e.g. with smt=0) the output of the
vcpu affinities is wrong, as the affinity bitmaps are capped after
nr_cpus bits, instead of using max_cpu_id.
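As an illustration of the difference (a sketch with a hypothetical
demo_physinfo structure and a single-word bitmap, not the libxl code): with
4 CPUs online but IDs 0, 2, 4 and 6 in use, a loop bounded by nr_cpus stops
at ID 3 and misses half of them.

```c
/* Hypothetical subset of the physinfo fields. */
struct demo_physinfo {
    unsigned int nr_cpus;      /* number of online CPUs */
    unsigned int max_cpu_id;   /* highest CPU ID, online or not */
};

/*
 * Count the bits set in an affinity bitmap over all possible CPU IDs.
 * Assumes max_cpu_id is smaller than the number of bits in a long.
 */
static unsigned int demo_count_affinity(const struct demo_physinfo *pi,
                                        unsigned long map)
{
    unsigned int count = 0;

    /* Bound by max_cpu_id, not nr_cpus, so sparse IDs are not cut off. */
    for (unsigned int cpu = 0; cpu <= pi->max_cpu_id; cpu++)
        if (map & (1UL << cpu))
            count++;
    return count;
}
```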
Christian Lindig [Wed, 27 Feb 2019 10:33:42 +0000 (10:33 +0000)]
tools/ocaml: Dup2 /dev/null to stdin in daemonize()
Don't close stdin in daemonize() but dup2 /dev/null instead. Otherwise, fd 0
gets reused later:
[root@idol ~]# ls -lav /proc/`pgrep xenstored`/fd
total 0
dr-x------ 2 root root 0 Feb 28 11:02 .
dr-xr-xr-x 9 root root 0 Feb 27 15:59 ..
lrwx------ 1 root root 64 Feb 28 11:02 0 -> /dev/xen/evtchn
l-wx------ 1 root root 64 Feb 28 11:02 1 -> /dev/null
l-wx------ 1 root root 64 Feb 28 11:02 2 -> /dev/null
lrwx------ 1 root root 64 Feb 28 11:02 3 -> /dev/xen/privcmd
...
Signed-off-by: Christian Lindig <christian.lindig@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Release-acked-by: Juergen Gross <jgross@suse.com>
(cherry picked from commit 677e64dbe315343620c3b266e9eb16623b118038)
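The patch itself is OCaml; the same idea expressed in C, as a sketch with a
hypothetical helper name, looks like this. Because fd 0 stays occupied by
/dev/null, a later open() cannot be handed descriptor 0.

```c
#include <fcntl.h>
#include <unistd.h>

/* Point stdin at /dev/null so that fd 0 stays occupied. 0 on success. */
static int demo_null_stdin(void)
{
    int fd = open("/dev/null", O_RDONLY);

    if (fd < 0)
        return -1;
    if (fd != STDIN_FILENO) {
        if (dup2(fd, STDIN_FILENO) < 0) {
            close(fd);
            return -1;
        }
        close(fd);   /* the duplicate on fd 0 remains open */
    }
    return 0;
}
```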
tools/misc/xenpm: fix getting info when some CPUs are offline
Use physinfo.max_cpu_id instead of physinfo.nr_cpus to get max CPU id.
This fixes for example 'xenpm get-cpufreq-para' with smt=off, which
otherwise would miss half of the cores.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit ffb60a58df48419c1f2607cd3cc919fa2bfc9c2d)
Jan Beulich [Wed, 15 May 2019 07:49:35 +0000 (09:49 +0200)]
x86: fix build race when generating temporary object files
The rules to generate xen-syms and xen.efi may run in parallel, but both
recursively invoke $(MAKE) to build symbol/relocation table temporary
object files. These recursive builds would both re-generate the .*.d2
files (where needed). Both would in turn invoke the same rule, thus
allowing for a race on the .*.d2.tmp intermediate files.
The dependency files of the temporary .xen*.o files live in xen/ rather
than xen/arch/x86/ anyway, so won't be included no matter what. Take the
opportunity and delete them, as the just re-generated .xen*.S files will
trigger a proper re-build of the .xen*.o ones anyway.
Empty the DEPS variable in case the set of goals consists of just those
temporary object files, thus eliminating the race.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 761bb575ce97255029d2d2249b2719e54bc76825
master date: 2019-04-11 10:25:05 +0200
Initially I had just noticed the unnecessary indirection in the call
from pi_update_irte(). The generic wrapper having an iommu_intremap
conditional made me look at the setup code though. So first of all
enforce the necessary dependency.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 6c54663786d9f1ed04153867687c158675e7277d
master date: 2019-04-09 15:12:07 +0200
Petre Pircalabu [Wed, 15 May 2019 07:48:28 +0000 (09:48 +0200)]
vm_event: fix XEN_VM_EVENT_RESUME domctl
Make XEN_VM_EVENT_RESUME return 0 in case of success, instead of
-EINVAL.
Remove vm_event_resume from vm_event.h header and set the function's
visibility to static as it is used only in vm_event.c.
Move the vm_event_check_ring test inside vm_event_resume in order to
simplify the code.
Andrew Cooper [Wed, 15 May 2019 07:47:32 +0000 (09:47 +0200)]
xen/timers: Fix memory leak with cpu unplug/plug
timer_softirq_action() realloc's itself a larger timer heap whenever
necessary, which includes bootstrapping from the empty dummy_heap. Nothing
ever freed this allocation.
CPU plug and unplug has the side effect of zeroing the percpu data area, which
clears ts->heap. This in turn causes new timers to be put on the list rather
than the heap, and for timer_softirq_action() to bootstrap itself again.
This in practice leaks ts->heap every time a CPU is unplugged and replugged.
Implement free_percpu_timers() which includes freeing ts->heap when
appropriate, and update the notifier callback with the recent cpu parking
logic and free-avoidance across suspend.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
xen/cpu: Fix ARM build following c/s 597fbb8
c/s 597fbb8 "xen/timers: Fix memory leak with cpu unplug/plug" broke the ARM
build by being the first patch to add park_offline_cpus to common code.
While it is currently specific to Intel hardware (for reasons of being able to
handle machine check exceptions without an immediate system reset), it isn't
inherently architecture specific, so define it to be false on ARM for now.
Add a comment in both smp.h headers explaining the intended behaviour of the
option.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
timers: move back migrate_timers_from_cpu() invocation
Commit 597fbb8be6 ("xen/timers: Fix memory leak with cpu unplug/plug")
went a little too far: Migrating timers away from a CPU being offlined
needs to happen independent of whether it gets parked or fully offlined.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
xen/timers: Fix memory leak with cpu unplug/plug (take 2)
Previous attempts to fix this leak failed to identify the root cause, and
ultimately failed. The cause is the CPU_UP_PREPARE case (re)initialising
ts->heap back to dummy_heap, which leaks the previous allocation.
Rearrange the logic to only initialise ts once. This also avoids the
redundant (but benign, due to ts->inactive always being empty) initialising of
the other ts fields.
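The "initialise once" logic can be sketched as below. The names are
hypothetical and the condition used in the actual patch may differ; here a
NULL heap pointer is assumed to mean "never initialised".

```c
#include <stdlib.h>

static int demo_dummy_heap;   /* shared placeholder, never freed */

struct demo_timers {
    int *heap;                /* NULL, the dummy, or a malloc()ed heap */
};

/* CPU_UP_PREPARE-style hook: safe to call on every plug of the same CPU. */
static void demo_cpu_up_prepare(struct demo_timers *ts)
{
    /* Only initialise once; resetting a real heap here would leak it. */
    if (ts->heap == NULL)
        ts->heap = &demo_dummy_heap;
}

/* Grow from the dummy heap to a real allocation on first use. */
static int demo_grow_heap(struct demo_timers *ts)
{
    int *h;

    if (ts->heap != &demo_dummy_heap)
        return 0;             /* already has a real heap */
    h = malloc(16 * sizeof(*h));
    if (h == NULL)
        return -1;
    ts->heap = h;
    return 0;
}
```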
Jan Beulich [Wed, 15 May 2019 07:46:41 +0000 (09:46 +0200)]
x86emul: suppress general register update upon AVX gather failures
While destination and mask registers may indeed need updating in this
case, the rIP update in particular needs to be avoided, as well as e.g.
raising a single step trap.
Reported-by: George Dunlap <george.dunlap@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 74f299bbd7d5cc52325b5866c17b44dd0bd1c5a2
master date: 2019-04-03 10:14:32 +0200
Juergen Gross [Wed, 15 May 2019 07:45:58 +0000 (09:45 +0200)]
xen/sched: fix credit2 smt idle handling
Credit2's smt_idle_mask_set() and smt_idle_mask_clear() are used to
identify idle cores where vcpus can be moved to. A core is thought to
be idle when all siblings are known to have the idle vcpu running on
them.
Unfortunately the information of a vcpu running on a cpu is per
runqueue. So in case not all siblings are in the same runqueue a core
will never be regarded to be idle, as the sibling not in the runqueue
is never known to run the idle vcpu.
Use a credit2 specific cpumask of siblings with only those cpus
being marked which are in the same runqueue as the cpu in question.
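A rough model of the problem and the fix, using integers as CPU bitmasks. The function names and the example topology are assumptions made for illustration; they are not the credit2 implementation.

```python
def mask(*cpus):
    """Build a bitmask with one bit set per CPU number."""
    m = 0
    for cpu in cpus:
        m |= 1 << cpu
    return m


def core_idle_all_siblings(cpu_siblings, rq_idle):
    # Old behaviour: every hardware sibling must be known idle in this
    # runqueue, including siblings the runqueue knows nothing about.
    return (cpu_siblings & rq_idle) == cpu_siblings


def core_idle_runqueue_siblings(cpu_siblings, rq_cpus, rq_idle):
    # New behaviour: only consider siblings that belong to this runqueue.
    local_siblings = cpu_siblings & rq_cpus
    return (local_siblings & rq_idle) == local_siblings


# CPUs 0 and 1 are siblings of one core, but only CPU 0 is in this runqueue.
siblings = mask(0, 1)
rq_cpus = mask(0, 2)
rq_idle = mask(0)        # CPU 0 is running the idle vcpu
```

In this example the old check can never report the core as idle, because CPU 1 is outside the runqueue and so never appears in its idle mask. The runqueue-restricted sibling mask only requires CPU 0 to be idle.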
Andrew Cooper [Wed, 12 Dec 2018 19:22:15 +0000 (19:22 +0000)]
x86/spec-ctrl: Introduce options to control VERW flushing
The Microarchitectural Data Sampling vulnerability is split into categories
with subtly different properties:
MLPDS - Microarchitectural Load Port Data Sampling
MSBDS - Microarchitectural Store Buffer Data Sampling
MFBDS - Microarchitectural Fill Buffer Data Sampling
MDSUM - Microarchitectural Data Sampling Uncacheable Memory
MDSUM is a special case of the other three, and isn't distinguished further.
These issues pertain to three microarchitectural buffers: the Load Ports, the
Store Buffers and the Fill Buffers. Each of these structures is flushed by
the new enhanced VERW functionality, but the conditions under which flushing
is necessary vary.
For this concise overview of the issues and default logic, the abbreviations
SB (Store Buffer), FB (Fill Buffer), LP (Load Port) and HT (Hyperthreading) are
used:
* Vulnerable hardware is divided into two categories - parts which suffer
from SB only, and parts with any other combination of vulnerabilities.
* SB only has an HT interaction when the thread goes idle, due to the static
partitioning of resources. LP and FB have HT interactions at all points,
due to the competitive sharing of resources. All issues potentially leak
data across the return-to-guest transition.
* The microcode which implements VERW flushing also extends MSR_FLUSH_CMD, so
we don't need to do both on the HVM return-to-guest path. However, some
parts are not vulnerable to L1TF (therefore have no MSR_FLUSH_CMD), but are
vulnerable to MDS, so do require VERW on the HVM path.
Note that we deliberately support mds=1 even without MD_CLEAR in case the
microcode has been updated but the feature bit not exposed.
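The categorisation above can be summarised in a short sketch. It restates only the rules given in this message; the function names, string labels and parameters are invented for illustration and do not correspond to Xen's actual spec-ctrl code or its command line handling.

```python
STORE_BUFFER = "store_buffer"
LOAD_PORT = "load_port"
FILL_BUFFER = "fill_buffer"


def hardware_category(vulnerable_buffers):
    """Split hardware into the two categories described in the message."""
    buffers = set(vulnerable_buffers)
    if not buffers:
        return "not-vulnerable"
    if buffers == {STORE_BUFFER}:
        return "store-buffer-only"
    return "other"


def ht_interaction(vulnerable_buffers):
    """When sibling threads interact with the vulnerable buffers."""
    category = hardware_category(vulnerable_buffers)
    if category == "not-vulnerable":
        return "none"
    # Statically partitioned store buffers only matter when a thread idles;
    # competitively shared load ports and fill buffers matter at all times.
    return "on-idle" if category == "store-buffer-only" else "at-all-points"


def hvm_exit_needs_verw(mds_vulnerable, l1d_flush_via_msr_flush_cmd):
    # A flush through MSR_FLUSH_CMD already covers the buffers, so VERW is
    # only needed on the HVM return-to-guest path when no such flush is done.
    return mds_vulnerable and not l1d_flush_via_msr_flush_cmd
```

For example, a part that is vulnerable to MDS but not to L1TF performs no MSR_FLUSH_CMD flush, so under these assumptions it needs VERW on the HVM path.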
This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 3c04c258ab40405a74e194d9889a4cbc7abe94b4)
Andrew Cooper [Wed, 12 Dec 2018 19:22:15 +0000 (19:22 +0000)]
x86/spec-ctrl: Infrastructure to use VERW to flush pipeline buffers
Three synthetic features are introduced, as we need individual control of
each, depending on circumstances. A later change will enable them at
appropriate points.
The verw_sel field doesn't strictly need to live in struct cpu_info. It lives
there because there is a convenient hole it can fill, and it reduces the
complexity of the SPEC_CTRL_EXIT_TO_{PV,HVM} assembly by avoiding the need for
any temporary stack maintenance.
This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 548a932ac786d6bf3584e4b54f2ab993e1117710)
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/spec-ctrl: CPUID/MSR definitions for Microarchitectural Data Sampling
The MD_CLEAR feature can be automatically offered to guests. No
infrastructure is needed in Xen to support the guest making use of it.
This is part of XSA-297, CVE-2018-12126, CVE-2018-12127, CVE-2018-12130, CVE-2019-11091.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit d4f6116c080dc013cd1204c4d8ceb95e5f278689)
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/spec-ctrl: Misc non-functional cleanup
* Identify BTI in the spec_ctrl_{enter,exit}_idle() comments, as other
mitigations will shortly appear.
* Use alternative_input() and cover the lack of memory clobber with a further
barrier.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit 9b62eba6c429c327e1507816bef403ccc87357ae)
Andrew Cooper [Fri, 5 Apr 2019 12:26:30 +0000 (13:26 +0100)]
x86/boot: Detect the firmware SMT setting correctly on Intel hardware
While boot_cpu_data.x86_num_siblings is an accurate value to use on AMD
hardware, it isn't on Intel when the user has disabled Hyperthreading in the
firmware. As a result, a user who has chosen to disable HT still gets
nagged on L1TF-vulnerable hardware when they haven't chosen an explicit
smt=<bool> setting.
Make use of the largely-undocumented MSR_INTEL_CORE_THREAD_COUNT which in
practice has existed since Nehalem, when booting on real hardware. Fall back to
using the ACPI table APIC IDs.
While adjusting this logic, fix a latent bug in amd_get_topology(). The
thread count field in CPUID.0x8000001e.ebx is documented as 8 bits wide,
rather than 2 bits wide.
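The effect of the field width can be illustrated as follows. The sketch assumes the thread count sits in bits 15:8 of EBX and encodes the number of threads minus one, as in AMD's CPUID documentation; the function name and the example register value are hypothetical.

```python
def threads_per_core(ebx, field_width_bits):
    """Extract the thread count from CPUID.0x8000001e.ebx (bits 15:8 assumed)."""
    field_mask = (1 << field_width_bits) - 1
    return ((ebx >> 8) & field_mask) + 1


# Made-up register value: thread count field = 4 (five threads), low byte = 2.
example_ebx = (0x04 << 8) | 0x02

narrow = threads_per_core(example_ebx, 2)   # old, 2-bit wide mask
wide = threads_per_core(example_ebx, 8)     # documented, 8-bit wide mask
```

The two masks agree for any field value up to 3, which would explain why the bug stayed latent; they only diverge once the field exceeds what two bits can hold.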
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit b12fec4a125950240573ea32f65c61fb9afa74c3)
Andrew Cooper [Fri, 5 Apr 2019 12:26:30 +0000 (12:26 +0000)]
x86/msr: Definitions for MSR_INTEL_CORE_THREAD_COUNT
This is a model specific register which details the current configuration of
cores and threads in the package. Because of how Hyperthread and Core
configuration works in firmware, the MSR is de-facto constant and
will remain unchanged until the next system reset.
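As a sketch of how such a value could be decoded, assume the commonly described layout: thread count in bits 15:0 and core count in bits 31:16. The mask names, helper functions and sample values below are illustrative assumptions, and comparing the two counts is just one way a firmware SMT setting could be inferred from the MSR.

```python
MSR_CTC_THREAD_MASK = 0x0000FFFF   # assumed: bits 15:0, active threads
MSR_CTC_CORE_MASK = 0xFFFF0000     # assumed: bits 31:16, active cores


def decode_core_thread_count(val):
    """Return (cores, threads) from a raw MSR value."""
    threads = val & MSR_CTC_THREAD_MASK
    cores = (val & MSR_CTC_CORE_MASK) >> 16
    return cores, threads


def firmware_smt_enabled(val):
    # With SMT disabled in firmware each core exposes one thread, so the
    # two counts are equal; any difference implies SMT is active.
    cores, threads = decode_core_thread_count(val)
    return threads != cores


smt_on = (4 << 16) | 8    # 4 cores, 8 threads
smt_off = (4 << 16) | 4   # 4 cores, 4 threads
```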
It is a read only MSR (so unilaterally reject writes), but for now retain its
leaky-on-read properties. Further CPUID/MSR work is required before we can
start virtualising a consistent topology to the guest, and retaining the old
behaviour is the safest course of action.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit d4120936bcd1695faf5b575f1259c58e31d2b18b)
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/spec-ctrl: Reposition the XPTI command line parsing logic
It has ended up in the middle of the mitigation calculation logic. Move it to
be beside the other command line parsing.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
(cherry picked from commit c2c2bb0d60c642e64a5243a79c8b1548ffb7bc5b)
Andrew Cooper [Fri, 3 May 2019 08:55:55 +0000 (10:55 +0200)]
x86/spec-ctrl: Extend retpoline safety calculations for eIBRS and Atom parts
All currently-released Atom processors are in practice retpoline-safe, because
they don't fall back to a BTB prediction on RSB underflow.
However, an additional meaning of Enhanced IBRS is that the processor may not
be retpoline-safe. The Gemini Lake platform, based on the Goldmont Plus
microarchitecture, is the first Atom processor to support eIBRS.
Until Xen gets full eIBRS support, Gemini Lake will still be safe using
regular IBRS.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
master commit: 17f74242ccf0ce6e51c03a5860947865c0ef0dc2
master date: 2019-03-18 16:26:40 +0000
Jan Beulich [Fri, 3 May 2019 08:53:40 +0000 (10:53 +0200)]
x86/e820: fix build with gcc9
e820.c: In function ‘clip_to_limit’:
.../xen/include/asm/string.h:10:26: error: ‘__builtin_memmove’ offset [-16, -36] is out of the bounds [0, 20484] of object ‘e820’ with type ‘struct e820map’ [-Werror=array-bounds]
10 | #define memmove(d, s, n) __builtin_memmove(d, s, n)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
e820.c:404:13: note: in expansion of macro ‘memmove’
404 | memmove(&e820.map[i], &e820.map[i+1],
| ^~~~~~~
e820.c:36:16: note: ‘e820’ declared here
36 | struct e820map e820;
| ^~~~
While I can't see where the negative offsets would come from, converting
the loop index to unsigned type helps. Take the opportunity and also
convert several other local variables and copy_e820_map()'s second
parameter to unsigned int (and bool in one case).
Reported-by: Charles Arnold <carnold@suse.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 22e2f8dddf5fbed885b5e4db3ffc9e1101be9ec0
master date: 2019-03-18 11:38:36 +0100
Andrew Cooper [Fri, 3 May 2019 08:51:31 +0000 (10:51 +0200)]
xen: Fix backport of "x86/tsx: Implement controls for RTM force-abort mode"
The posted version of this patch depends on c/s 3c555295 "x86/vpmu: Improve
documentation and parsing for vpmu=" (Xen 4.12 and later) to prevent
`vpmu=rtm-abort` implying `vpmu=1`, which is outside of security support.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Wed, 28 Nov 2018 17:43:33 +0000 (17:43 +0000)]
tools/firmware: update OVMF Makefile, when necessary
[ This is two commits from master aka staging-4.12: ]
OVMF has become dependent on OpenSSL, which is included as a
submodule. Initialise submodules before building.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit b16281870e06f5f526029a4e69634a16dc38e8e4)
tools: only call git when necessary in OVMF Makefile
Users may choose to export a snapshot of OVMF and build it
with xen.git supplied ovmf-makefile. In that case we don't
need to call `git submodule`.
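The idea can be sketched as a simple guard: only produce the git command when the source directory is actually a git checkout. The real change lives in a Makefile; this Python model, the `submodule_command` helper and the `.git` existence test are assumptions made for illustration.

```python
import os


def submodule_command(src_dir):
    """Return the git command to run for src_dir, or None for a snapshot."""
    # A checkout has a `.git` entry; an exported snapshot does not, so there
    # are no submodules to initialise and git need not be called at all.
    if os.path.exists(os.path.join(src_dir, ".git")):
        return ["git", "submodule", "update", "--init", "--recursive"]
    return None
```

A caller would run the returned command when it is not `None` and skip the step otherwise, so a build from an exported tarball never invokes git.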