x86/fpu: Create a typedef for the x87/SSE area inside "struct xsave_struct"
Making the union non-anonymous would cause a lot of headaches, because a lot of
code relies on it being so, but it's possible to make a typedef of the anonymous
union so all callsites currently relying on typeof() can stop doing so directly.
This commit creates a `fpusse_t` typedef to the anonymous union at the head of
the XSAVE area and uses it instead of typeof().
No functional change.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Acked-by: Jan Beulich <jbeulich@suse.com>
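For illustration, a minimal sketch of the typedef idea in the commit above. The struct below is a heavily trimmed, hypothetical stand-in for struct xsave_struct, and the member name fpu_sse, the field names and the _sketch identifiers are assumptions made for the example, not the definitions in the Xen tree.

```c
#include <stdint.h>

/* Hypothetical, trimmed stand-in for the head of the XSAVE area. */
struct xsave_sketch {
    union {                      /* the union type itself has no name */
        char x[512];
        struct {
            uint16_t fcw;
            uint16_t fsw;
        };
    } fpu_sse;                   /* assumed member name */
    uint64_t xstate_bv;
};

/* Name the member's type once, so users no longer spell out typeof(). */
typedef __typeof__(((struct xsave_sketch *)0)->fpu_sse) fpusse_sketch_t;

/* Example user that takes the x87/SSE area by its typedef'd type. */
static inline uint16_t read_fcw_sketch(const fpusse_sketch_t *fpu)
{
    return fpu->fcw;
}
```

The union stays anonymous, so existing code that relies on that keeps working, while new code can declare pointers and variables of the typedef'd type directly.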
Sergiy Kibrik [Thu, 1 Aug 2024 07:38:00 +0000 (09:38 +0200)]
x86/intel: optional build of TSX support
Transactional Synchronization Extensions are supported on certain Intel
CPUs only, and hence can be put under the CONFIG_INTEL build option.
The whole TSX support, even if supported by the CPU, may need to be disabled via
options, by microcode or through spec-ctrl, depending on a set of specific
conditions. To make sure nothing gets accidentally broken at runtime, all
modifications of global TSX configuration variables are guarded by #ifdefs,
while the variables themselves are redefined to 0, so that they can't
mistakenly be written to.
Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com> Acked-by: Jan Beulich <jbeulich@suse.com>
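The pattern described in the commit above could look roughly like the following sketch. CONFIG_INTEL_SKETCH and the opt_tsx_sketch name are hypothetical stand-ins chosen for the example; the real symbols and variables may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* CONFIG_INTEL_SKETCH is a hypothetical stand-in for the build option. */
#ifdef CONFIG_INTEL_SKETCH
static int8_t opt_tsx_sketch = -1;       /* real, writable variable */
#else
#define opt_tsx_sketch 0                 /* constant: an assignment would not compile */
#endif

static inline void tsx_disable_sketch(void)
{
#ifdef CONFIG_INTEL_SKETCH
    opt_tsx_sketch = 0;                  /* the write only exists with Intel support */
#endif
}

static inline bool tsx_enabled_sketch(void)
{
    return opt_tsx_sketch != 0;          /* readers need no #ifdef */
}
```

Because the name expands to the literal 0 when the option is off, any write that is not wrapped in the #ifdef becomes a build error rather than a silent runtime problem.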
Andrew Cooper [Wed, 31 Jul 2024 19:05:21 +0000 (20:05 +0100)]
x86/domain: Fix domlist_insert() updating the domain hash
A last minute review request was to dedup the expression calculating the
domain hash bucket.
While the code reads correctly, it is buggy because rcu_assign_pointer() is a
deeply misleading API assigning by name not value, and - contrary to its name
- does not hide an indirection.
Therefore, rcu_assign_pointer(bucket, d); updates the local bucket variable on
the stack, not domain_hash[], causing all subsequent domid lookups to fail.
Rework the logic to use pd in the same way that domlist_remove() does.
Fixes: 19995bc70cc6 ("xen/domain: Factor domlist_{insert,remove}() out of domain_{create,destroy}()") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
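The bug class described in the commit above can be reproduced with a small sketch. The macro below is a hypothetical stand-in that, like the API described, assigns to the lvalue it is given; the structure, table and function names are invented for the example.

```c
#include <stddef.h>

/* Hypothetical stand-in: assigns to the lvalue named by p. */
#define assign_pointer_sketch(p, v) ((p) = (v))

#define NR_BUCKETS_SKETCH 4

struct dom_sketch {
    unsigned int id;
    struct dom_sketch *next_in_hash;
};

static struct dom_sketch *hash_sketch[NR_BUCKETS_SKETCH];

/* Buggy shape: 'bucket' is a copy of the head pointer, so only the copy changes. */
static void insert_buggy_sketch(struct dom_sketch *d)
{
    struct dom_sketch *bucket = hash_sketch[d->id % NR_BUCKETS_SKETCH];

    d->next_in_hash = bucket;
    assign_pointer_sketch(bucket, d);    /* updates the local variable only */
}

/* Fixed shape: 'pd' points at the table slot, so the slot itself is updated. */
static void insert_fixed_sketch(struct dom_sketch *d)
{
    struct dom_sketch **pd = &hash_sketch[d->id % NR_BUCKETS_SKETCH];

    d->next_in_hash = *pd;
    assign_pointer_sketch(*pd, d);       /* updates hash_sketch[] */
}
```

After insert_buggy_sketch() the table slot is still empty, which matches the failed lookups described; insert_fixed_sketch() writes through the pointer to the slot.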
xen/riscv: fix build issue for bullseye-riscv64 container
Address compilation error on bullseye-riscv64 container:
undefined reference to `guest_physmap_remove_page`
Since there is no current implementation of `guest_physmap_remove_page()`,
a stub function has been added.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
x86/e820 address violations of MISRA C:2012 Rule 5.3
This addresses violations of MISRA C:2012 Rule 5.3, which states:
"An identifier declared in an inner scope shall not hide an
identifier declared in an outer scope". Here the conflict is with
the global named "e820".
No functional change.
Signed-off-by: Alessandro Zucchelli <alessandro.zucchelli@bugseng.com> Acked-by: Jan Beulich <jbeulich@suse.com>
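As a rough illustration of the Rule 5.3 fix above, the sketch below keeps a file-scope object and gives the function parameter a distinct name. All identifiers are hypothetical and only mimic the situation with the global "e820".

```c
struct e820map_sketch {
    unsigned int nr_map;
};

/* File-scope object, standing in for the global named "e820". */
static struct e820map_sketch e820_sketch = { .nr_map = 3 };

/*
 * A parameter also called e820_sketch would hide the global inside the
 * function body, which Rule 5.3 forbids. A distinct name avoids that.
 */
static unsigned int count_entries_sketch(const struct e820map_sketch *map)
{
    return map->nr_map;
}

static unsigned int count_global_entries_sketch(void)
{
    return count_entries_sketch(&e820_sketch);
}
```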
xen/sched: fix error handling in cpu_schedule_up()
If cpu_schedule_up() fails, it needs to undo all externally
visible changes it has made up to that point.
The reason is that cpu_schedule_callback() won't be called with the
CPU_UP_CANCELED notifier if cpu_schedule_up() failed.
Fixes: 207589dbacd4 ("xen/sched: move per cpu scheduler private data into struct sched_resource") Reported-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Xen's bitops.h is composed from several Linux headers:
* linux/arch/include/asm/bitops.h:
* The following functions were removed as they aren't used in Xen:
* test_and_set_bit_lock
* clear_bit_unlock
* __clear_bit_unlock
* The following functions were renamed to match how they are
used by common code:
* __test_and_set_bit
* __test_and_clear_bit
* The declaration and implementation of the following functions
were updated to make the Xen build happy:
* clear_bit
* set_bit
* __test_and_clear_bit
* __test_and_set_bit
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Acked-by: Jan Beulich <jbeulich@suse.com>
The following generic functions were introduced:
* test_bit
* generic__test_and_set_bit
* generic__test_and_clear_bit
* generic__test_and_change_bit
These functions and macros can be useful for architectures
that don't have corresponding arch-specific instructions.
Also, the patch introduces the following generics which are
used by the functions mentioned above:
* BITOP_BITS_PER_WORD
* BITOP_MASK
* BITOP_WORD
* BITOP_TYPE
The following approach was chosen for generic*() and arch*() bit
operation functions:
If the bit operation function that is going to be generic starts
with the prefix "__", then the corresponding generic/arch function
will also contain the "__" prefix. For example:
* test_bit() will be defined using arch_test_bit() and
generic_test_bit().
* __test_and_set_bit() will be defined using
arch__test_and_set_bit() and generic__test_and_set_bit().
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Michal Orzel <michal.orzel@amd.com>
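The naming scheme described in the commit above can be sketched as follows. The macro and function names come from the commit message, but their bodies here are assumptions written for the example (a 32-bit word type is picked arbitrarily), not the code that was merged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed definitions: a 32-bit word is chosen here for illustration. */
#define BITOP_TYPE           uint32_t
#define BITOP_BITS_PER_WORD  32
#define BITOP_MASK(nr)       ((BITOP_TYPE)1 << ((nr) % BITOP_BITS_PER_WORD))
#define BITOP_WORD(nr)       ((nr) / BITOP_BITS_PER_WORD)

static inline bool generic_test_bit(unsigned int nr, const volatile void *addr)
{
    const volatile BITOP_TYPE *p =
        (const volatile BITOP_TYPE *)addr + BITOP_WORD(nr);

    return (*p & BITOP_MASK(nr)) != 0;
}

/* Non-atomic: plain read, modify, write of the containing word. */
static inline bool generic__test_and_set_bit(unsigned int nr, volatile void *addr)
{
    volatile BITOP_TYPE *p = (volatile BITOP_TYPE *)addr + BITOP_WORD(nr);
    BITOP_TYPE mask = BITOP_MASK(nr);
    BITOP_TYPE old = *p;

    *p = old | mask;

    return (old & mask) != 0;
}

/* An architecture without its own version falls back to the generic one. */
#ifndef arch_test_bit
#define arch_test_bit generic_test_bit
#endif
#ifndef arch__test_and_set_bit
#define arch__test_and_set_bit generic__test_and_set_bit
#endif

/* The "__" prefix of the public name carries over to arch__/generic__. */
#define test_bit(nr, addr)            arch_test_bit(nr, addr)
#define __test_and_set_bit(nr, addr)  arch__test_and_set_bit(nr, addr)
```

An architecture with a suitable instruction would define arch_test_bit or arch__test_and_set_bit before this point, and the public names would then resolve to those instead.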
The current code in ALT_CALL_ARG() won't successfully workaround the clang
code-generation issue if the arg parameter has a size that's not a power of 2.
While there are no such sized parameters at the moment, improve the workaround
to also be effective when such sizes are used.
Instead of using a union with a long use an unsigned long that's first
initialized to 0 and afterwards set to the argument value.
Reported-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Suggested-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
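A minimal sketch of the approach described in the commit above: start from an unsigned long that is zero, then set it from the argument. The macro name, the use of memcpy() and the 3-byte test type are assumptions for the example; the sketch relies on GNU statement expressions, as supported by gcc and clang.

```c
#include <string.h>

/*
 * Hypothetical helper: widen an argument of any size up to
 * sizeof(unsigned long) into a fully initialised unsigned long.
 */
#define WIDEN_ARG_SKETCH(arg) ({                                   \
    unsigned long tmp_ = 0;                                        \
    _Static_assert(sizeof(arg) <= sizeof(unsigned long),           \
                   "argument does not fit in a register");         \
    memcpy(&tmp_, &(arg), sizeof(arg));                            \
    tmp_;                                                          \
})

/* A 3-byte argument type, i.e. a size that is not a power of 2. */
struct three_bytes_sketch {
    unsigned char b[3];
};
```

Since every byte of the temporary is defined before the copy, the bytes beyond the argument's size are zero regardless of whether that size is a power of 2.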
x86/dom0: fix restoring %cr3 and the mapcache override on PV build error
One of the error paths in the PV dom0 builder section that runs on the guest
page-tables wasn't restoring the Xen value of %cr3, nor removing the
mapcache override.
Fixes: 079ff2d32c3d ('libelf-loader: introduce elf_load_image') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Wed, 31 Jul 2024 10:39:35 +0000 (12:39 +0200)]
public/x86: don't include common xen.h from arch-specific one
No other arch-*.h does so, and arch-x86/xen.h really just takes the role
of arch-x86_32.h and arch-x86_64.h (by those two forwarding there). With
xen.h itself including the per-arch headers, doing so is also kind of
backwards anyway, and just calling for problems. There's exactly one
place where arch-x86/xen.h is included when really xen.h is meant (for
wanting XEN_GUEST_HANDLE_64() to be made available, the default
definition of which lives in the common xen.h).
This then addresses a violation of Misra C:2012 Directive 4.10
("Precautions shall be taken in order to prevent the contents of a
header file being included more than once").
Jan Beulich [Wed, 31 Jul 2024 10:36:14 +0000 (12:36 +0200)]
x86+Arm: drop (rename) __virt_to_maddr() / __maddr_to_virt()
There's no use of them anymore except in the definitions of the non-
underscore-prefixed aliases.
On Arm convert the (renamed) inline function to a macro.
On x86 rename the inline functions, adjust the virt_to_maddr() #define,
and purge the maddr_to_virt() one, thus eliminating a bogus cast which
would have allowed the passing of a pointer type variable into
maddr_to_virt() to go silently.
Andrew Cooper [Thu, 18 Jul 2024 20:22:41 +0000 (21:22 +0100)]
arch/domain: Clean up the idle domain remnants in arch_domain_create()
With arch_domain_create() no longer being called with the idle domain, drop
the last remaining logic.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Thu, 18 Jul 2024 20:20:52 +0000 (21:20 +0100)]
xen/domain: Simplify domain_create() now the idle domain is complete earlier
With x86 implementing arch_init_idle_domain(), there is no longer any need to
call arch_domain_create() with the idle domain.
Have the idle domain exit early with all other system domains. Move the
static-analysis ASSERT() earlier. Then, remove the !is_idle_domain()
protections around the majority of domain_create() and remove one level of
indentation.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Thu, 18 Jul 2024 20:12:31 +0000 (21:12 +0100)]
x86/domain: Implement arch_init_idle_domain()
The idle domain needs d->arch.ctxt_switch initialised on x86. Implement the
new arch_init_idle_domain() in order to do this.
Intentionally remove cpu_policy's initialisation to ZERO_BLOCK_PTR. It has
never tripped since its introduction, and is weird to have in isolation
without a similar approach on other pointers.
No practical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 18 Jul 2024 19:54:05 +0000 (20:54 +0100)]
xen/domain: Introduce arch_init_idle_domain()
The idle domain causes a large amount of complexity in domain_create() because
of x86's need to initialise d->arch.ctxt_switch in arch_domain_create().
In order to address this, introduce an optional hook to perform extra
initialisation of the idle domain.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
As discussed during the last MISRA C meeting, add Rule 12.2 to the list
of MISRA C rules we accept, together with an explanation that we use gcc
-fsanitize=undefined to check for violations.
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> Acked-by: Jan Beulich <jbeulich@suse.com>
In the file include/xen/event.h macro set_bit is called with argument
current->pause_flags.
Once expanded this set_bit's argument is used in sizeof operations
and thus 'current', being a macro that expands to a function
call with potential side effects, generates a violation.
To address this violation the value of current is therefore stored in a
variable called 'v' before passing it to macro set_bit.
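The shape of this fix can be sketched as below. Everything here is a hypothetical stand-in: current_sketch mimics a macro that expands to a function call, and set_bit_sketch mimics a macro that uses its argument inside sizeof.

```c
struct vcpu_sketch {
    unsigned long pause_flags;
};

static struct vcpu_sketch vcpu0_sketch;
static unsigned int lookups_sketch;

/* Hypothetical stand-in for 'current': a macro expanding to a function call. */
static struct vcpu_sketch *get_current_sketch(void)
{
    lookups_sketch++;                    /* observable side effect */
    return &vcpu0_sketch;
}
#define current_sketch get_current_sketch()

/* Hypothetical set_bit-like macro that also uses its argument inside sizeof. */
#define set_bit_sketch(nr, addr) do {                              \
    _Static_assert(sizeof(*(addr)) >= 4, "operand too small");     \
    *(addr) |= 1UL << (nr);                                        \
} while ( 0 )

static void pause_sketch(void)
{
    struct vcpu_sketch *v = current_sketch;   /* evaluate the call exactly once */

    set_bit_sketch(0, &v->pause_flags);       /* sizeof now sees a plain variable */
}
```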
Jason Andryuk [Mon, 29 Jul 2024 15:04:12 +0000 (11:04 -0400)]
libxl: Enable stubdom cdrom changing
To change the cd-rom medium, libxl will:
- QMP eject the medium from QEMU
- block-detach the old PV disk
- block-attach the new PV disk
- QMP change the medium to the new PV disk by fdset-id
The QMP code is reused, and remove and attach are implemented here.
The stubdom must internally handle adding /dev/xvdc to the appropriate
fdset. libxl in dom0 doesn't see the result of adding to the fdset as
that is internal to the stubdom, but the fdset's opaque fields will be
set to stub-devid:$devid, so libxl can identify it. $devid is common
between the stubdom and libxl, so it can be identified on both sides.
The stubdom will name the device xvdY regardless of the guest name hdY,
sdY, or xvdY, but the stubdom will be assigned the same devid
facilitating lookup. Because the stubdom add-fd call is asynchronous,
libxl needs to poll query-fdsets to identify when add-fd has completed.
For cd-eject, we still need to attach the empty vbd. This is necessary
since xenstore is used to determine that hdc exists. Otherwise after
eject, hdc would be gone and the cd-insert would fail to find the drive
to insert new media.
Signed-off-by: Jason Andryuk <jandryuk@gmail.com> Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Upgrade Yocto to a newer version. Use ext4 as image format for testing
with QEMU on ARM and ARM64 as the default is WIC and it is not available
for our xen-image-minimal target.
Also update the tar.bz2 filename for the rootfs.
Signed-off-by: Stefano Stabellini <stefano.stabellini@amd.com> Reviewed-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Andrew Cooper [Fri, 5 Jul 2024 11:52:05 +0000 (12:52 +0100)]
XSM/domctl: Fix permission checks on XEN_DOMCTL_createdomain
The XSM checks for XEN_DOMCTL_createdomain are problematic. There's a split
between xsm_domctl() called early, and flask_domain_create() called quite late
during domain construction.
All XSM implementations except Flask have a simple IS_PRIV check in
xsm_domctl(), and operate as expected when an unprivileged domain tries to
make a hypercall.
Flask however foregoes any action in xsm_domctl() and defers everything,
including the simple "is the caller permitted to create a domain" check, to
flask_domain_create().
As a consequence, when XSM Flask is active, and irrespective of the policy
loaded, all domains irrespective of privilege can:
* Mutate the global 'rover' variable, used to track the next free domid.
Therefore, all domains can cause a domid wraparound, and combined with a
voluntary reboot, choose their own domid.
* Cause a reasonable amount of a domain to be constructed before ultimately
failing for permission reasons, including the use of settings outside of
supported limits.
In order to remediate this, pass the ssidref into xsm_domctl() and at least
check that the calling domain is privileged enough to create domains.
Take the opportunity to also fix the sign of the cmd parameter to be unsigned.
This issue has not been assigned an XSA, because Flask is experimental and not
security supported.
Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Daniel P. Smith <dpsmith@apertussolutions.com>
Andrew Cooper [Mon, 15 Jul 2024 13:17:43 +0000 (14:17 +0100)]
tools/examples: Remove more obsolete content
xeninfo.pl was introduced in commit 1b0a8bb57e3e ("Added xeninfo.pl, a script
for collecting statistics from Xen hosts using the Xen-API") and has been
touched exactly twice since to remove hardcoded IP addresses and paths.
The configuration files in vnc/* date from when we had a vendored version of
Qemu living in the tree.
These have never (AFAICT) been wired into the `make install` rule.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
There's no l{1,2,3,4}e_read() implementation, so drop the _atomic suffix from
the read helpers. This allows unifying the naming with the write helpers,
which are also atomic but don't have the suffix already: l{1,2,3,4}e_write().
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
The l{1,2,3,4}e_write_atomic() and non _atomic suffixed helpers share the same
implementation, so it seems pointless and possibly confusing to have both.
x86 32bit mode used to have a non-atomic PTE write that would split the write
in two halves, but with Xen only supporting x86 64bit that's no longer
present.
Remove the l{1,2,3,4}e_write_atomic() helpers and switch their user to
l{1,2,3,4}e_write(), as that's also atomic. While there also remove
pte_write{,_atomic}() and just use write_atomic() in the wrappers.
No functional change intended.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ross Lagerwall [Tue, 30 Jul 2024 09:55:56 +0000 (11:55 +0200)]
bunzip2: fix rare decompression failure
The decompression code parses a huffman tree and counts the number of
symbols for a given bit length. In rare cases, there may be >= 256
symbols with a given bit length, causing the unsigned char to overflow.
This causes a decompression failure later when the code tries and fails to
find the bit length for a given symbol.
Since the maximum number of symbols is 258, use unsigned short instead.
Fixes: ab77e81f6521 ("x86/dom0: support bzip2 and lzma compressed bzImage payloads") Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
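The overflow described in the commit above is easy to reproduce in isolation. The two functions below are a sketch written for this purpose and are not taken from the decompressor; they count how many entries of a length table equal a given bit length.

```c
#include <stddef.h>

/* Narrow counter: an unsigned char wraps back to 0 after 255. */
static unsigned int count_narrow_sketch(const unsigned char *lengths, size_t n,
                                        unsigned char len)
{
    unsigned char count = 0;

    for ( size_t i = 0; i < n; i++ )
        if ( lengths[i] == len )
            count++;

    return count;
}

/* Wide counter: an unsigned short can hold the maximum of 258 symbols. */
static unsigned int count_wide_sketch(const unsigned char *lengths, size_t n,
                                      unsigned char len)
{
    unsigned short count = 0;

    for ( size_t i = 0; i < n; i++ )
        if ( lengths[i] == len )
            count++;

    return count;
}
```

With 258 symbols that all share one bit length, the narrow counter ends at 258 modulo 256, i.e. 2, while the wide counter holds the true value.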
When running Xen with a PVH dom0 and an HVM domU, a pirq will be mapped
for a passthrough device by using its gsi; see the qemu code
xen_pt_realize->xc_physdev_map_pirq and the libxl code
pci_add_dm_done->xc_physdev_map_pirq. Then xc_physdev_map_pirq
will call into Xen, but in hvm_physdev_op, PHYSDEVOP_map_pirq
is not allowed because currd is the PVH dom0 and PVH has no
X86_EMU_USE_PIRQ flag, so it will fail at the has_pirq check.
So, allow PHYSDEVOP_map_pirq when dom0 is PVH and also allow
PHYSDEVOP_unmap_pirq for the device removal path to unmap the pirq.
Also add a new check to prevent (un)map when the subject domain
doesn't have a notion of PIRQ.
This way, the interrupt of a passthrough device can be
successfully mapped to a pirq for a domU with a notion of PIRQ
when dom0 is PVH.
Add deviation comments to address violations of
MISRA C:2012 Directive 4.10 ("Precautions shall be taken in order
to prevent the contents of a header file being included more than
once").
Inclusion guards must appear at the beginning of the headers
(comments are permitted anywhere).
This patch adds deviation comments using the format specified
in docs/misra/safe.json for headers with just the direct
inclusion guard before the inclusion guard since they are
safe and not supposed to comply with the directive.
Note that with SAF-10-safe in place, failures to have proper guards later
in the header files will not be reported
misra: modify deviations for empty and generated headers
This patch modifies deviations for Directive 4.10:
"Precautions shall be taken in order to prevent the contents of
a header file being included more than once"
This patch avoids the file-based deviation for empty headers, and
replaces it with a comment-based one using the format specified in
docs/misra/safe.json.
Generated headers are not generally safe against multi-inclusions,
whether a header is safe depends on the nature of the generated code
in the header. For that reason, this patch drops the deviation for
generated headers.
misra: add deviation for headers that explicitly avoid guards
Some headers, under specific circumstances (documented in a comment at
the beginning of the file), explicitly do not have strict inclusion
guards: the caller is responsible for including them correctly.
These files are not supposed to comply with Directive 4.10:
"Precautions shall be taken in order to prevent the contents of a header
file being included more than once"
This patch adds deviation comments for headers that avoid guards.
x86/traps: address violations of MISRA C Rule 16.3
Add break or pseudo keyword fallthrough to address violations of
MISRA C Rule 16.3: "An unconditional `break' statement shall terminate
every switch-clause".
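A small sketch of what a Rule 16.3 compliant switch could look like. The fallthrough_sketch macro is a hypothetical definition of the pseudo keyword, based on the statement attribute that gcc and clang provide; the function and its values are invented for the example.

```c
/* Hypothetical definition of the pseudo keyword, for gcc and clang. */
#define fallthrough_sketch __attribute__((__fallthrough__))

static int classify_sketch(int vector)
{
    int weight = 0;

    switch ( vector )
    {
    case 0:
        weight += 10;
        fallthrough_sketch;      /* deliberate: case 0 also does case 1's work */
    case 1:
        weight += 1;
        break;

    default:
        weight = -1;
        break;
    }

    return weight;
}
```

Every clause now ends either in an unconditional break or in an explicit marker that states the fall through is intended.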
automation/eclair: fix deviation of MISRA C Rule 16.3
Add missing escape for the final dot of the fallthrough comment,
extend the search of a fallthrough comment up to 2 lines after the last
statement and improve the text of the justification.
When building with gcc with -finstrument-functions, optimization level
-O1, CONFIG_HYPFS=y and # CONFIG_HAS_SCHED_GRANULARITY is not set, the
following build warning (error) is encountered:
common/sched/cpupool.c: In function ‘cpupool_gran_write’:
common/sched/cpupool.c:1220:26: error: ‘gran’ may be used uninitialized [-Werror=maybe-uninitialized]
1220 | 0 : cpupool_check_granularity(gran);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
common/sched/cpupool.c:1207:21: note: ‘gran’ declared here
1207 | enum sched_gran gran;
| ^~~~
This is a false positive. Silence the warning (error) by initializing
the variable.
Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> Reviewed-by: Juergen Gross <jgross@suse.com>
x86/viridian: Clarify some viridian logging strings
It's sadistically misleading to show an error without letters and expect
the dmesg reader to understand it's in hex. The patch adds a 0x prefix
to all hex numbers that don't already have it.
On the one instance in which a boolean is printed as an integer, print
it as a decimal integer instead so it's 0/1 in the common case and not
misleading if it's ever not just that due to a bug.
While at it, rename VIRIDIAN CRASH to VIRIDIAN GUEST_CRASH. Every member
of a support team that looks at the message systematically believes
"viridian" crashed, which is absolutely not what goes on. It's the guest
asking the hypervisor for a sudden shutdown because it crashed, and
stating why.
Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com> Reviewed-by: Paul Durrant <paul@xen.org>
Andrew Cooper [Thu, 9 May 2024 17:52:59 +0000 (18:52 +0100)]
hvmloader: Use fastcall everywhere
HVMLoader is a single freestanding 32bit program with no external
dependencies. Use the fastcall calling convention (up to 3 parameters in
registers) globally, which is more efficient than passing all parameters on
the stack.
Some bloat-o-meter highlights are:
add/remove: 0/0 grow/shrink: 3/118 up/down: 8/-3004 (-2996)
Function old new delta
...
hvmloader_acpi_build_tables 1125 961 -164
acpi_build_tables 1277 1081 -196
pci_setup 4756 4516 -240
construct_secondary_tables 1689 1447 -242
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 23 Jul 2024 16:32:26 +0000 (17:32 +0100)]
x86/IO-APIC: Improve APIC_TMR accesses
XenServer's instance of Coverity complains of OVERFLOW_BEFORE_WIDEN in
mask_and_ack_level_ioapic_irq(), which is ultimately because of v being
unsigned long, and (1U << ...) being 32 bits.
The reasoning isn't correct. (1U << (x & 0x1f)) can't overflow, but the
complaint is really about having to expand the RHS. While this can be fixed
by changing v to be unsigned int, take the opportunity to do better still.
Introduce an apic_tmr_read() helper like we already have for ISR and IRR, and
use it to remove the opencoded logic. Introduce an is_level boolean to
improve the legibility of the surrounding logic.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
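A rough model of the helper idea from the commit above, assuming the per-vector bits are spread over eight 32-bit registers. The register array and the function name are hypothetical; real code would read the local APIC rather than an array.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: 256 per-vector bits held in eight 32-bit registers. */
static uint32_t tmr_regs_sketch[8];

static inline bool tmr_read_sketch(uint8_t vector)
{
    /* Select the 32-bit register, then shift the wanted bit down to bit 0. */
    return (tmr_regs_sketch[vector >> 5] >> (vector & 0x1f)) & 1;
}
```

A caller can then write something like `bool is_level = tmr_read_sketch(vector);` and never mix a 32-bit mask with a wider variable, which is the pattern the static analyser flagged.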
Michal Orzel [Wed, 24 Jul 2024 09:38:13 +0000 (11:38 +0200)]
MAINTAINERS: Add me and Bertrand as device tree maintainers
With Arm port being the major recipient of dt related patches and the
future need of incorporating dt support into other ports, we'd like to
keep an eye on these changes.
Initialize and bring down altp2m only when it is supported by the platform,
e.g. VMX. Also guard p2m_altp2m_propagate_change().
The purpose is to make it possible to disable altp2m support and exclude its
code from the build completely when it's not supported by the target platform.
Here hvm_altp2m_supported() is used to check for ALTP2M availability. It is
only defined if HVM is enabled, so a stub for that routine is added for the
!HVM configuration as well.
Signed-off-by: Sergiy Kibrik <Sergiy_Kibrik@epam.com> Acked-by: Jan Beulich <jbeulich@suse.com>
RISC-V does a conditional toolchain check for the Zbb extension
(xen/arch/riscv/rules.mk), but unconditionally uses the
ANDN instruction in emulate_xchg_1_2().
Fixes: 51dabd6312c ("xen/riscv: introduce cmpxchg.h") Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Suggested-By: Jan Beulich <jbeulich@suse.com> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
"$dev" needs to be set correctly for backendtype=phy as well as
backendtype=tap. Move the setting into the conditional, so it can be
handled properly for each.
(dev could be captured during tap-ctl allocate for blktap module, but it
would not be set properly for the find_device case. The backendtype=tap
case would need to be handled regardless.)
Fixes: f16ac12bd418 ("hotplug: Restore block-tap phy compatibility") Fixes: 76a484193dbb ("hotplug: Update block-tap") Signed-off-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
Andrew Cooper [Fri, 28 Jun 2024 15:33:56 +0000 (16:33 +0100)]
tools/libxs: Stop playing with SIGPIPE
It's very rude for a library to play with signals behind the back of the
application, no matter one's views on the default behaviour of SIGPIPE under
POSIX. Even if the application doesn't care about the xenstored socket, it may
care about others.
This logic has existed since xenstore/xenstored was originally added in commit 29c9e570b1ed ("Add xenstore daemon and library") in 2005.
It's also unnecessary. Pass MSG_NOSIGNAL when talking to xenstored over a
pipe (to avoid succumbing to SIGPIPE if xenstored has crashed), and forgo any
playing with the signal disposition.
This has a side benefit of saving 2 syscalls per xenstore request.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
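A minimal sketch of the MSG_NOSIGNAL approach from the commit above, assuming fd refers to a socket (the flag is an argument to send(), so it does not apply to plain write()). The function name and return convention are invented for the example.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Returns 0 on success, or the errno value from send() on failure. */
static int send_nosignal_sketch(int fd, const void *buf, size_t len)
{
    /* MSG_NOSIGNAL: a vanished peer yields EPIPE rather than raising SIGPIPE. */
    ssize_t rc = send(fd, buf, len, MSG_NOSIGNAL);

    return rc < 0 ? errno : 0;
}
```

The application's signal disposition is never touched, so the library has nothing to save and restore around each request.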
Andrew Cooper [Thu, 18 Jul 2024 11:55:48 +0000 (12:55 +0100)]
tools/libxs: Use writev()/sendmsg() instead of write()
With the input data now conveniently arranged, use writev()/sendmsg() instead
of decomposing it into write() calls.
This causes all requests to be submitted with a single system call, rather
than at least two. While in principle short writes can occur, the chances of
it happening are slim given that most xenbus comms are only a handful of
bytes.
Nevertheless, provide {writev,sendmsg}_exact() wrappers which take care of
resubmitting on EINTR or short write.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
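One possible shape for such a wrapper is sketched below. It is an illustration written for this log, not the libxs implementation: the name and return type are assumptions, and like the commit series describes, it edits the caller's iovec in place to resume after a short write.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Write the whole iovec, retrying on EINTR and resuming after a short write.
 * The iov array is edited in place to skip data that was already written.
 */
static bool writev_exact_sketch(int fd, struct iovec *iov, unsigned int nr)
{
    while ( nr )
    {
        ssize_t res = writev(fd, iov, nr);

        if ( res < 0 )
        {
            if ( errno == EINTR )
                continue;
            return false;
        }

        /* Skip the elements that were written completely. */
        while ( nr && (size_t)res >= iov->iov_len )
        {
            res -= iov->iov_len;
            iov++;
            nr--;
        }

        /* Advance into a partially written element, if any. */
        if ( nr )
        {
            iov->iov_base = (char *)iov->iov_base + res;
            iov->iov_len -= res;
        }
    }

    return true;
}
```

In the common case the loop body runs once, so the request still costs a single system call.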
Andrew Cooper [Fri, 28 Jun 2024 18:40:27 +0000 (19:40 +0100)]
tools/libxs: Track whether we're using a socket or file
It will determine whether to use writev() or sendmsg().
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Andrew Cooper [Thu, 18 Jul 2024 11:03:03 +0000 (12:03 +0100)]
tools/libxs: Rationalise the definition of struct xs_handle
Right now there are two completely different struct xs_handle definitions,
depending on #ifdef USE_PTHREAD. One is quite well hidden, and often escapes
updates.
Rework struct xs_handle using some interior ifdefary. It's slightly longer,
but much easier to follow. Importantly, this makes it much harder to forget
the !PTHREAD case when adding a "common" variable.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Andrew Cooper [Thu, 18 Jul 2024 09:13:04 +0000 (10:13 +0100)]
tools/libxs: Rework xs_talkv() to take xsd_sockmsg within the iovec
We would like to writev() the whole outgoing message, but this is hard given
the current need to prepend the locally-constructed xsd_sockmsg.
Instead, have the caller provide xsd_sockmsg in iovec[0]. This in turn drops
the 't' and 'type' parameters from xs_talkv().
Note that xs_talkv() may alter the iovec structure. This may happen when
writev() is really used under the covers, and it's preferable to having the
lower levels need to duplicate the iovec to edit it upon encountering a short
write. xs_directory_part() is the only function impacted by this, and it's
easy to rearrange to be compatible.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com>
Andrew Cooper [Thu, 18 Jul 2024 09:23:00 +0000 (10:23 +0100)]
tools/libxs: Fix length check in xs_talkv()
If the sum of iov element lengths overflows, the XENSTORE_PAYLOAD_MAX check can
pass, after which we'll write 4G of data with a good-looking length field, and
the remainder of the payload will be interpreted as subsequent commands.
Check each iov element length for XENSTORE_PAYLOAD_MAX before accumulating it.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jason Andryuk <jason.andryuk@amd.com> Reviewed-by: Juergen Gross <jgross@suse.com>
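The per-element check can be sketched as follows. The limit value, the names and the 32-bit length field are assumptions for the example; PAYLOAD_MAX_SKETCH merely stands in for XENSTORE_PAYLOAD_MAX.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

#define PAYLOAD_MAX_SKETCH 4096u   /* assumed limit, for illustration only */

/* Sum the element lengths into a 32-bit field, rejecting oversized messages. */
static bool payload_len_sketch(const struct iovec *iov, unsigned int nr,
                               uint32_t *len)
{
    uint32_t total = 0;

    for ( unsigned int i = 0; i < nr; i++ )
    {
        /* Check the element first, so adding it below cannot wrap the sum. */
        if ( iov[i].iov_len > PAYLOAD_MAX_SKETCH )
            return false;

        total += iov[i].iov_len;

        if ( total > PAYLOAD_MAX_SKETCH )
            return false;
    }

    *len = total;
    return true;
}
```

Checking only the final sum would accept two elements whose lengths wrap around to a small 32-bit total; checking each element before adding it keeps the running sum bounded at every step.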
x86/mctelem: address violations of MISRA C: 2012 Rule 5.3
This addresses violations of MISRA C:2012 Rule 5.3, which states:
"An identifier declared in an inner scope shall not hide an
identifier declared in an outer scope".
In this case the shadowed variable is the file-scope struct mctctl
in this file, so the local variables are renamed to avoid this.
common/softirq: address violation of MISRA C Rule 13.6
In the file common/softirq macro set_bit is called with argument
smp_processor_id.
Once expanded this set_bit's argument is used in sizeof operations
and thus 'smp_processor_id', being a macro that may expand to a
function call with potential side effects, generates a violation.
To address this violation the value of smp_processor_id is therefore
stored in a variable called 'cpu' before passing it to macro set_bit.
x86/altcall: fix clang code-gen when using altcall in loop constructs
Yet another clang code generation issue when using altcalls.
The issue this time is with using loop constructs around alternative_{,v}call
instances using parameter types smaller than the register size.
Given the following example code:
static void bar(bool b)
{
    unsigned int i;
    for ( i = 0; i < 10; i++ )
    {
        int ret_;
        register union {
            bool e;
            unsigned long r;
        } di asm("rdi") = { .e = b };
        register unsigned long si asm("rsi");
        register unsigned long dx asm("rdx");
        register unsigned long cx asm("rcx");
        register unsigned long r8 asm("r8");
        register unsigned long r9 asm("r9");
        register unsigned long r10 asm("r10");
        register unsigned long r11 asm("r11");
Clang will generate machine code that only resets the low 8 bits of %rdi
between loop calls, leaving the rest of the register possibly containing
garbage from the use of %rdi inside the called function. Note also that clang
doesn't truncate the input parameters at the callee, thus breaking the psABI.
Fix this by turning the `e` element in the anonymous union into an array that
consumes the same space as an unsigned long, as this forces clang to reset the
whole %rdi register instead of just the low 8 bits.
Fixes: 2ce562b2a413 ('x86/altcall: use a union as register type for function parameters on clang') Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
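The union change described in the commit above can be illustrated in plain C, without register variables or inline assembly. The type and function names below are hypothetical; the sketch only shows that initialising an array member which spans the whole unsigned long defines every byte of the storage.

```c
#include <stdbool.h>

/* Before: initialising .e is only guaranteed to define a single byte. */
union arg_narrow_sketch {
    bool e;
    unsigned long r;
};

/* After: the array spans the whole unsigned long, so the initialiser covers it all. */
union arg_wide_sketch {
    bool e[sizeof(unsigned long) / sizeof(bool)];
    unsigned long r;
};

static unsigned long widen_bool_sketch(bool b)
{
    union arg_wide_sketch u = { .e[0] = b };   /* remaining elements become 0 */

    return u.r;
}
```

Whether a given compiler then resets the full register in the real altcall macros is a code-generation question that this sketch does not attempt to demonstrate.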
Tamas K Lengyel [Tue, 23 Jul 2024 11:58:54 +0000 (13:58 +0200)]
Add tools/fuzz/oss-fuzz/build.sh
The build integration script for oss-fuzz targets. Future fuzzing targets can
be added to this script and those targets will be automatically picked up by
oss-fuzz without having to open separate PRs on the oss-fuzz repo.
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Tamas K Lengyel [Tue, 23 Jul 2024 11:58:07 +0000 (13:58 +0200)]
Add libfuzzer target to fuzz/x86_instruction_emulator
This target enables integration into oss-fuzz. Change the return value for
invalid input to -1, as values other than 0/-1 are reserved by libfuzzer. Also
add the missing __wrap_vsnprintf wrapper, which is required for a successful
oss-fuzz build.
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Jan Beulich <jbeulich@suse.com>
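For reference, a libFuzzer entry point that follows the 0/-1 return convention mentioned above might look like this sketch. The minimum size and the placeholder body are assumptions; it is not the emulator harness itself.

```c
#include <stddef.h>
#include <stdint.h>

#define MIN_INPUT_SKETCH 4   /* assumed minimum size, for illustration only */

/* Entry point that libFuzzer calls once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    if ( size < MIN_INPUT_SKETCH )
        return -1;           /* reject: the input should not be kept in the corpus */

    /* A real target would feed 'data' to the code under test here. */
    (void)data;

    return 0;                /* input was processed normally */
}
```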
Andrew Cooper [Sat, 13 Jul 2024 16:14:16 +0000 (17:14 +0100)]
docs: Fix install-man$(1)-pages if no manpages are generated
All tools to build manpages are optional, and if none of them happen to be
present, the intermediate working directory may not even be created.
Treat this as non-fatal, bringing the behaviour in line with install-html.
Like the html side, it needs to be not-or to avoid Make thinking the rule has
failed.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Anthony PERARD <anthony.perard@vates.tech>
While fixing this, also fix a rendering error in the non-figlet case; while a
leading space looks better for figlet, it looks very wrong for the simple
one-line case.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 5 Jul 2024 17:56:48 +0000 (18:56 +0100)]
ppc/shutdown: Implement machine_{halt,restart}()
OPAL has easy APIs for shutdown/reboot, so wire them up.
Then, use machine_halt() rather than an infinite loop at the end of
start_xen(). This avoids the Qemu smoke test needing to wait for the full
timeout in order to succeed.
(XEN) 8e011600000000c0 is the result of PTE map
Enabled radix in LPCR
Flushed TLB
Hello, ppc64le!
[ 6.341897656,5] OPAL: Shutdown request type 0x0...
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Shawn Anastasio <sanastasio@raptorengineering.com>
Matthew Barnes [Fri, 5 Jul 2024 15:05:07 +0000 (16:05 +0100)]
tools/misc: xen-hvmcrash: Inject #DF instead of overwriting RIP
xen-hvmcrash would previously save records, overwrite the instruction
pointer with a bogus value, and then restore them to crash a domain
just enough to cause the guest OS to memdump.
This approach is found to be unreliable when tested on a guest running
Windows 10 x64, with some executions doing nothing at all.
Another approach would be to trigger NMIs. This approach is found to be
unreliable when tested on Linux (Ubuntu 22.04), as Linux will ignore
NMIs if it is not configured to handle such.
Injecting a double fault abort to all vCPUs is found to be more
reliable at crashing and invoking memdumps from Windows and Linux
domains.
This patch modifies the xen-hvmcrash tool to inject #DF to all vCPUs
belonging to the specified domain, instead of overwriting RIP.
Signed-off-by: Matthew Barnes <matthew.barnes@cloud.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 21 Jun 2024 18:23:11 +0000 (19:23 +0100)]
xen/ppc: Avoid using the legacy __read_mostly/__ro_after_init definitions
RISC-V wants to introduce a full build of Xen without using the legacy
definitions. PPC64 has the most minimal full build of Xen right now, so make
it compile without the legacy definitions.
Mostly this is just including xen/sections.h in a variety of common files. In
a couple of cases, we can drop an inclusion of {xen,asm}/cache.h, but almost
all files get the definitions transitively.
No functional change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
Andrew Cooper [Fri, 28 Jun 2024 14:56:39 +0000 (15:56 +0100)]
tools/libxs: Drop XSTEST
This appears to have been missed from the previous attempt in 2007.
Fixes: fed194611785 ("xenstore: Remove broken and unmaintained test code") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Juergen Gross <jgross@suse.com>
Jan Beulich [Mon, 22 Jul 2024 07:41:03 +0000 (09:41 +0200)]
x86: don't open-code [gm]fn_to_[gm]addr()
At least in pure address calculation use the intended basic construct
instead of open-coded left-shifting by PAGE_SHIFT. Leave alone page
table entry calculations for now, as those aren't really calculating
addresses.
No functional change.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
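The difference between the two forms can be sketched as follows. This is a simplified, self-contained stand-in rather than the Xen definitions: the gfn_t wrapper, _gfn(), gfn_x(), the 4 KiB PAGE_SHIFT and both example functions are assumptions made for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* Assumed 4 KiB pages. */

typedef uint64_t paddr_t;

/* Simplified stand-in for a typed guest frame number. */
typedef struct { unsigned long gfn; } gfn_t;

static inline gfn_t _gfn(unsigned long n) { return (gfn_t){ n }; }
static inline unsigned long gfn_x(gfn_t g) { return g.gfn; }

/* The basic construct: frame number to guest address. */
static inline paddr_t gfn_to_gaddr(gfn_t gfn)
{
    return (paddr_t)gfn_x(gfn) << PAGE_SHIFT;
}

/* Open-coded form that the helper replaces. */
static inline paddr_t addr_open_coded(gfn_t gfn)
{
    return (paddr_t)gfn_x(gfn) << PAGE_SHIFT;
}

/* Same result, with the intent stated by the helper's name. */
static inline paddr_t addr_via_helper(gfn_t gfn)
{
    return gfn_to_gaddr(gfn);
}
```

Both functions compute the same value; the helper only keeps the shift and the widening cast in one place.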
Jan Beulich [Mon, 22 Jul 2024 07:39:40 +0000 (09:39 +0200)]
x86: drop REX64_PREFIX
While we didn't copy the full Linux commentary, Linux commit 7180d4fb8308 ("x86_64: Fix 64bit FXSAVE encoding") is quite explicit
about gas 2.16 supporting FXSAVEQ / FXRSTORQ. As that's presently our
minimal required version, we can drop the workaround that was needed for
yet older gas.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tamas K Lengyel [Mon, 22 Jul 2024 07:38:28 +0000 (09:38 +0200)]
Add libfuzzer target to fuzz/x86_instruction_emulator
This target enables integration into oss-fuzz. Changing invalid input return
to -1 as values other than 0/-1 are reserved by libfuzzer. Also adding the
missing __wrap_vsnprintf wrapper which is required for successful oss-fuzz
build.
Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Victor Lira [Mon, 22 Jul 2024 07:37:45 +0000 (09:37 +0200)]
common/sched: address a violation of MISRA C Rule 8.7
Rule 8.7: "Functions and objects should not be defined with external
linkage if they are referenced in only one translation unit".
This patch fixes this by adding the static specifier.
No functional changes.
Reported-by: Stewart Hildebrand <stewart.hildebrand@amd.com> Signed-off-by: Victor Lira <victorm.lira@amd.com> Acked-by: George Dunlap <george.dunlap@cloud.com>
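A minimal sketch of the kind of change Rule 8.7 asks for is shown below. The function names and the weight limit are hypothetical and do not come from the scheduler code touched by this commit.

```c
/*
 * Hypothetical helper referenced only within this translation unit.
 * Without "static" it would have external linkage and violate
 * MISRA C Rule 8.7; with it, the symbol has internal linkage.
 */
static unsigned int clamp_weight(unsigned int weight)
{
    return weight > 256 ? 256 : weight;
}

/* Used from other translation units, so external linkage is kept. */
unsigned int effective_weight(unsigned int requested)
{
    return clamp_weight(requested);
}
```

Adding the specifier does not change behaviour for callers inside the file, which is consistent with a "no functional change" commit.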
The CPU POOLS sections in MAINTAINERS can be dropped, as the SCHEDULING
section has the same maintainers and it is covering the CPU POOLS files
as well.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>