xenbits.xensource.com Git - people/royger/xen.git/log
4 years ago x86/vpt: introduce a per-vPT lock (vpt.4, gitlab/vpt.4)
Roger Pau Monne [Mon, 21 Sep 2020 11:16:10 +0000 (13:16 +0200)]
x86/vpt: introduce a per-vPT lock

Introduce a per virtual timer lock that replaces the existing per-vCPU
and per-domain vPT locks. Since virtual timers are no longer assigned
or migrated between vCPUs, the locking can be simplified to an
in-structure spinlock that protects all the fields.

This requires introducing a helper to initialize the spinlock, and
that could be used to initialize other virtual timer fields in the
future.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v1:
 - New in this version.
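
The in-structure lock plus init helper described above can be sketched
as follows. This is an illustration only, not the patch: struct vtimer,
vtimer_init() and the other names are hypothetical, and a C11
atomic_flag stands in for Xen's spinlock type.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a virtual timer; not Xen's real structure. */
struct vtimer {
    atomic_flag lock;        /* in-structure spinlock protecting the fields below */
    uint64_t period_ns;
    uint32_t pending_ticks;
    bool enabled;
};

/* Init helper: the one place that sets up the lock and the other fields. */
static void vtimer_init(struct vtimer *t)
{
    atomic_flag_clear(&t->lock);
    t->period_ns = 0;
    t->pending_ticks = 0;
    t->enabled = false;
}

static void vtimer_lock(struct vtimer *t)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ;
}

static void vtimer_unlock(struct vtimer *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Every field update happens under the timer's own lock. */
static void vtimer_start(struct vtimer *t, uint64_t period_ns)
{
    vtimer_lock(t);
    t->period_ns = period_ns;
    t->pending_ticks = 0;
    t->enabled = true;
    vtimer_unlock(t);
}
```

Because the lock lives in the timer itself, no per-vCPU or per-domain
lock has to be looked up before touching a timer.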

4 years ago x86/vpt: remove vPT timers per-vCPU lists
Roger Pau Monne [Fri, 28 Aug 2020 14:36:30 +0000 (16:36 +0200)]
x86/vpt: remove vPT timers per-vCPU lists

No longer add vPT timers to lists on specific vCPUs, since there's no
need anymore to check if timer interrupts have been injected on return
to HVM guest.

This change allows getting rid of virtual timer vCPU migration, and
also cleaning up some of the virtual timer fields that are no longer
required.

The model is also slightly different now in that timers are not
stopped when a vCPU is de-scheduled. Such timers will continue
running, and when triggered the function will try to inject the
corresponding interrupt to the guest (which might be different than
the currently running one). Note that the timer triggering when the
guest is no longer running can only happen once, as the timer callback
will not reset the interrupt to fire again. Such resetting if required
will be done by the EOI callback.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v3:
 - Remove stale commit log paragraph.

Changes since v2:
 - Remove pt_{save/restore}_timer and instead use
   pt_{freeze/thaw}_time.
 - Remove the introduction of the 'masked' field, it's not needed.
 - Rework pt_active to use timer_is_active.

Changes since v1:
 - New in this version.

4 years ago x86/irq: drop return value from hvm_ioapic_assert
Roger Pau Monne [Mon, 19 Apr 2021 10:22:45 +0000 (12:22 +0200)]
x86/irq: drop return value from hvm_ioapic_assert

There's no caller anymore that cares about the injected vector, so
drop the returned vector from the function.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - New in this version.

4 years ago x86/irq: remove unused parameter from hvm_isa_irq_assert
Roger Pau Monne [Wed, 7 Apr 2021 11:16:10 +0000 (13:16 +0200)]
x86/irq: remove unused parameter from hvm_isa_irq_assert

There are no callers anymore passing a get_vector function pointer to
hvm_isa_irq_assert, so drop the parameter.

No functional change expected.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - New in this version.

4 years ago x86/vpt: switch interrupt injection model
Roger Pau Monne [Thu, 27 Aug 2020 13:33:43 +0000 (15:33 +0200)]
x86/vpt: switch interrupt injection model

Currently vPT relies on timers being assigned to a vCPU and performing
checks on every return to HVM guest in order to check if an interrupt
from a vPT timer assigned to the vCPU is currently being injected.

This model doesn't work properly since the interrupt destination vCPU
of a vPT timer can be different from the vCPU where the timer is
currently assigned, in which case the timer would get stuck because it
never sees the interrupt as being injected.

Knowing when a vPT interrupt is injected is relevant for the guest
timer modes where missed vPT interrupts are not discarded and instead
are accumulated and injected when possible.

This change aims to modify the logic described above, so that vPT
doesn't need to check on every return to HVM guest if a vPT interrupt
is being injected. In order to achieve this the vPT code is modified
to make use of the new EOI callbacks, so that virtual timers can
detect when an interrupt has been serviced by the guest by waiting for
the EOI callback to execute.

This model also simplifies some of the logic, as when executing the
timer EOI callback Xen can try to inject another interrupt if the
timer has interrupts pending for delivery.

Note that timers are still bound to a vCPU for the time being, this
relation however doesn't limit the interrupt destination anymore, and
will be removed by further patches.

This model has been tested with Windows 7 guests without showing any
timer delay, even when the guest was limited to have very little CPU
capacity and pending virtual timer interrupts accumulate.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Rename pt_irq_fired to irq_eoi and adjust the logic.
 - Initialize v and cb_priv in eoi_callback.

Changes since v2:
 - Avoid an explicit != NULL check.
 - Use a switch in inject_interrupt to evaluate the timer mode.
 - Print the pt->source field on error in create_periodic_time.

Changes since v1:
 - New in this version.
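
The EOI-driven model described above can be sketched roughly as below,
for a timer mode where missed ticks are accumulated rather than
discarded. All names are hypothetical and the "injection" is only a
counter; the real code goes through the vlapic/GSI EOI callbacks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one virtual timer. */
struct vtimer_model {
    uint32_t pending;      /* ticks that fired but were not yet delivered */
    bool irq_in_flight;    /* injected, guest has not EOI'ed it yet */
    uint32_t injected;     /* total injections, kept for illustration */
};

static void inject_interrupt(struct vtimer_model *t)
{
    t->irq_in_flight = true;
    t->injected++;
}

/* Timer expiry: inject right away, or account the tick as pending. */
static void timer_fired(struct vtimer_model *t)
{
    if (t->irq_in_flight)
        t->pending++;
    else
        inject_interrupt(t);
}

/* EOI callback: the guest serviced the interrupt, deliver the next one. */
static void eoi_callback(struct vtimer_model *t)
{
    t->irq_in_flight = false;
    if (t->pending > 0) {
        t->pending--;
        inject_interrupt(t);
    }
}
```

Nothing here runs on return to guest: progress is driven only by the
timer firing and by the guest's EOI.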

4 years ago x86/dpci: switch to use a GSI EOI callback
Roger Pau Monne [Thu, 20 Aug 2020 16:43:02 +0000 (18:43 +0200)]
x86/dpci: switch to use a GSI EOI callback

Switch the dpci GSI EOI callback hooks to use the newly introduced
generic callback functionality, and remove the custom dpci calls found
on the vPIC and vIO-APIC implementations.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Print a warning message if the EOI callback cannot be unregistered.

Changes since v2:
 - Avoid leaking the allocated callback on error paths of
   pt_irq_create_bind.

Changes since v1:
 - New in this version.

4 years ago x86/dpci: move code
Roger Pau Monne [Thu, 20 Aug 2020 16:47:23 +0000 (18:47 +0200)]
x86/dpci: move code

This is code movement in order to simplify further changes.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v2:
 - Drop one of the leading underscores from __hvm_dpci_eoi.

Changes since v1:
 - New in this version.

4 years ago x86/hvm: allowing registering EOI callbacks for GSIs
Roger Pau Monne [Tue, 18 Aug 2020 13:36:27 +0000 (15:36 +0200)]
x86/hvm: allowing registering EOI callbacks for GSIs

Such callbacks will be executed once an EOI is performed by the guest,
regardless of whether the interrupts are injected from the vIO-APIC or
the vPIC, as ISA IRQs are translated to GSIs and then the
corresponding callback is executed at EOI.

The vIO-APIC infrastructure for handling EOIs is built on top of the
existing vlapic EOI callback functionality, while the vPIC one is
handled when writing to the vPIC EOI register.

Note that such callbacks need to be registered and de-registered, and
that a single GSI can have multiple callbacks associated. That's
because GSIs can be level triggered and shared, as is the case
with legacy PCI interrupts shared between several devices.

Strictly speaking this is a non-functional change, since there are no
users of this new interface introduced by this change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Make callback take a domain parameter.
 - Return whether the unregistered callback was found.
 - Add a comment regarding the result of hvm_gsi_has_callbacks being
   stable.

Changes since v2:
 - Latch hvm_domain_irq in some functions.
 - Make domain parameter of hvm_gsi_has_callbacks const.
 - Add comment about dropping the lock around the
   hvm_gsi_execute_callbacks call.
 - Drop change to ioapic_load.

Changes since v1:
 - New in this version.
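
A minimal sketch of per-GSI callback registration as described above,
with several callbacks allowed on one (shared) GSI. The names, the
fixed NR_GSIS value and the singly linked list are assumptions made for
illustration; locking and the vPIC/vIO-APIC plumbing are left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define NR_GSIS 24  /* assumed table size, for illustration only */

typedef void gsi_eoi_cb_t(unsigned int gsi, void *data);

struct gsi_callback {
    gsi_eoi_cb_t *fn;
    void *data;
    struct gsi_callback *next;
};

/* One list head per GSI, so a shared line can carry several callbacks. */
static struct gsi_callback *gsi_callbacks[NR_GSIS];

static bool gsi_register_callback(unsigned int gsi, struct gsi_callback *cb)
{
    if (gsi >= NR_GSIS)
        return false;
    cb->next = gsi_callbacks[gsi];
    gsi_callbacks[gsi] = cb;
    return true;
}

/* Returns whether the callback was found (and removed). */
static bool gsi_unregister_callback(unsigned int gsi, struct gsi_callback *cb)
{
    struct gsi_callback **pp;

    if (gsi >= NR_GSIS)
        return false;
    for (pp = &gsi_callbacks[gsi]; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == cb) {
            *pp = cb->next;
            cb->next = NULL;
            return true;
        }
    }
    return false;
}

/* Called when the guest EOIs the GSI, from either interrupt controller. */
static void gsi_execute_callbacks(unsigned int gsi)
{
    struct gsi_callback *cb;

    if (gsi >= NR_GSIS)
        return;
    for (cb = gsi_callbacks[gsi]; cb != NULL; cb = cb->next)
        cb->fn(gsi, cb->data);
}

/* Example callback: counts EOIs in the counter passed as data. */
static void count_eoi(unsigned int gsi, void *data)
{
    (void)gsi;
    ++*(unsigned int *)data;
}
```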

4 years ago x86/vioapic: switch to use the EOI callback mechanism
Roger Pau Monne [Wed, 12 Aug 2020 09:25:12 +0000 (11:25 +0200)]
x86/vioapic: switch to use the EOI callback mechanism

Switch the emulated IO-APIC code to use the local APIC EOI callback
mechanism. This allows removing the last hardcoded callback from
vlapic_handle_EOI. Removing the hardcoded vIO-APIC callback also
allows getting rid of setting the EOI exit bitmap based on the
triggering mode, as now all users that require an EOI action use the
newly introduced callback mechanism.

Move and rename the vioapic_update_EOI now that it can be made static.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Remove assert in eoi_callback.
 - Cast callback to bool.
 - Simplify check in ioapic_load: GSIs < 16 and edge interrupts can
   also have callbacks.
 - Reword comment about casting to boolean.

Changes since v2:
 - Explicitly convert the last alternative_vcall parameter to a
   boolean in vlapic_set_irq_callback.

Changes since v1:
 - Remove the triggering check in the update_eoi_exit_bitmap call.
 - Register the vlapic callbacks when loading the vIO-APIC state.
 - Reduce scope of ent.

4 years ago x86/vmsi: use the newly introduced EOI callbacks
Roger Pau Monne [Tue, 11 Aug 2020 15:45:23 +0000 (17:45 +0200)]
x86/vmsi: use the newly introduced EOI callbacks

Remove the unconditional call to hvm_dpci_msi_eoi in vlapic_handle_EOI
and instead use the newly introduced EOI callback mechanism in order
to register a callback for MSI vectors injected from passed through
devices.

This avoids having multiple callback functions open-coded in
vlapic_handle_EOI, as there is now a generic framework for registering
such callbacks. It also avoids doing an unconditional call to
hvm_dpci_msi_eoi for each EOI processed by the local APIC.

Note that now the callback is only registered (and thus executed) when
there's an MSI interrupt originating from a PCI passthrough device
being injected into the guest, so the check in hvm_dpci_msi_eoi can be
removed as it's already done by hvm_dirq_assist which is the only
caller of vmsi_deliver_pirq.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v3:
 - Fix the callback to take a vcpu parameter.

Changes since v2:
 - Expand commit message.
 - Pass the domain as the callback data.
 - Remove the check in hvm_dpci_msi_eoi

4 years ago x86/vlapic: introduce an EOI callback mechanism
Roger Pau Monne [Tue, 11 Aug 2020 14:18:30 +0000 (16:18 +0200)]
x86/vlapic: introduce an EOI callback mechanism

Add a new vlapic_set_irq_callback helper in order to inject a vector
and set a callback to be executed when the guest performs the end of
interrupt acknowledgment.

Such functionality will be used to migrate the current ad hoc handling
done in vlapic_handle_EOI for the vectors that require some logic to
be executed when the end of interrupt is performed.

The setter of the callback will be in charge of setting the callback
again on guest restore, as callbacks are not saved as part of the
vlapic state. That is the reason why vlapic_set_callback is not a
static function.

No current users are migrated to use this new functionality yet, so no
functional change expected as a result.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v3:
 - Use xzalloc.
 - Drop printk on ENOMEM.
 - Add vcpu parameter to vlapic EOI callback.
 - Check that the vector is pending in ISR or IRR when printing a
   warning message because of an overriding callback.
 - Fix commit message regarding resume mention.

Changes since v2:
 - Fix commit message typo.
 - Expand commit message.
 - Also print a warning if the callback data is overridden.
 - Properly free memory in case of error in vlapic_init.

Changes since v1:
 - Make vlapic_set_irq an inline function on the header.
 - Clear the callback hook in vlapic_handle_EOI.
 - Introduce a helper to set the callback without injecting a vector.
 - Remove unneeded parentheses.
 - Reduce callback table by 16.
 - Use %pv to print domain/vcpu ID.
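
The per-vector callback idea above can be sketched as follows. This is
a hypothetical model, not the vlapic code: struct lapic_model and its
helpers are made-up names, and "injection" just marks the vector as in
service. It assumes the table skips vectors 0-15, which cannot be used
for APIC-delivered interrupts, and that the hook is cleared on EOI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NR_VECTORS         256
#define FIRST_VALID_VECTOR 16   /* vectors 0-15 are not usable, so not stored */

typedef void eoi_cb_t(uint8_t vector, void *data);

struct lapic_model {
    struct {
        eoi_cb_t *fn;
        void *data;
    } callbacks[NR_VECTORS - FIRST_VALID_VECTOR];
    bool in_service[NR_VECTORS];
};

/* Inject a vector and remember what to run when the guest EOIs it. */
static bool set_irq_callback(struct lapic_model *l, uint8_t vector,
                             eoi_cb_t *fn, void *data)
{
    if (vector < FIRST_VALID_VECTOR)
        return false;
    l->callbacks[vector - FIRST_VALID_VECTOR].fn = fn;
    l->callbacks[vector - FIRST_VALID_VECTOR].data = data;
    l->in_service[vector] = true;
    return true;
}

/* EOI handling: clear the hook first, then run it once. */
static void handle_eoi(struct lapic_model *l, uint8_t vector)
{
    eoi_cb_t *fn;
    void *data;

    if (vector < FIRST_VALID_VECTOR)
        return;
    fn = l->callbacks[vector - FIRST_VALID_VECTOR].fn;
    data = l->callbacks[vector - FIRST_VALID_VECTOR].data;
    l->callbacks[vector - FIRST_VALID_VECTOR].fn = NULL;
    l->callbacks[vector - FIRST_VALID_VECTOR].data = NULL;
    l->in_service[vector] = false;
    if (fn != NULL)
        fn(vector, data);
}

/* Example callback: counts EOIs in the counter passed as data. */
static void count_eoi(uint8_t vector, void *data)
{
    (void)vector;
    ++*(unsigned int *)data;
}
```

Since the hook is one-shot, whoever injects the vector has to set the
callback again on every injection (and after a guest restore).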

4 years ago x86/rtc: drop code related to strict mode
Roger Pau Monne [Thu, 15 Apr 2021 15:29:03 +0000 (17:29 +0200)]
x86/rtc: drop code related to strict mode

Xen has been for a long time setting the WAET ACPI table "RTC good"
flag, which implies there's no need to perform a read of the RTC REG_C
register in order to get further interrupts after having received one.
This is hardcoded in the static ACPI tables, and in the RTC emulation
in Xen.

Drop the support for the alternative (strict) mode, it's been unused
for a long time (since Xen 4.3) without any complaints.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Further changes in the series will require that no registering or
unregistering of callback is done inside of the handlers themselves,
like it was done in rtc_pf_callback when in strict_mode.

4 years ago x86/dpci: remove the dpci EOI timer
Roger Pau Monne [Tue, 5 Jan 2021 11:52:34 +0000 (12:52 +0100)]
x86/dpci: remove the dpci EOI timer

Current interrupt passthrough code will set up a timer for each
interrupt injected to the guest that requires an EOI from the guest.
Such timer would perform two actions if the guest doesn't EOI the
interrupt before a given period of time. The first one is deasserting
the virtual line, the second is performing an EOI of the physical
interrupt source if it requires such.

The deasserting of the guest virtual line is wrong, since it messes
with the interrupt status of the guest. This seems to have been done
in order to compensate for missing deasserts when certain interrupt
controller actions are performed. The original motivation of the
introduction of the timer was to fix issues when a GSI was shared
between different guests. We believe that other changes in the
interrupt handling code (ie: proper propagation of EOI related actions
to dpci) will have fixed such errors now.

Performing an EOI of the physical interrupt source is redundant, since
there's already a timer that takes care of this for all interrupts,
not just the HVM dpci ones, see irq_guest_action_t struct eoi_timer
field.

Since both of the actions performed by the dpci timer are not
required, remove it altogether.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes since v1:
 - Add parentheses.

4 years ago x86/vpic: issue dpci EOI for cleared pins at ICW1
Roger Pau Monne [Fri, 15 Jan 2021 11:20:58 +0000 (12:20 +0100)]
x86/vpic: issue dpci EOI for cleared pins at ICW1

When pins are cleared from either ISR or IRR as part of the
initialization sequence forward the clearing of those pins to the dpci
EOI handler, as it is equivalent to an EOI. Not doing so can bring the
interrupt controller state out of sync with the dpci handling logic,
which expects a notification when a pin has been EOI'ed.

Fixes: 7b3cb5e5416 ('IRQ injection changes for HVM PCI passthru.')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - Remove the unmask label.

4 years ago x86/vpic: don't trigger unmask event until end of init
Roger Pau Monne [Tue, 26 Jan 2021 12:24:50 +0000 (13:24 +0100)]
x86/vpic: don't trigger unmask event until end of init

Wait until the end of the init sequence to trigger the unmask event.
Note that it will be unconditionally triggered, but that's harmless if
no unmask actually happened.

While there change the variable type to bool.

Requested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - New in this version.

4 years ago x86/vpic: force int output to low when in init mode
Roger Pau Monne [Tue, 26 Jan 2021 12:02:18 +0000 (13:02 +0100)]
x86/vpic: force int output to low when in init mode

When the PIC is in the init sequence, prevent interrupt delivery. The
state of the registers is in the process of being set during the init
phase, so it makes sense to prevent any int line changes during that
process.

Requested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes since v2:
 - New in this version.

4 years ago automation: add a build job with NR_CPUS == 1
Roger Pau Monne [Tue, 2 Mar 2021 08:41:00 +0000 (09:41 +0100)]
automation: add a build job with NR_CPUS == 1

This requires adding some logic in the build script in order to be
able to pass specific Xen Kconfig options.

Setting any CONFIG_* environment variable when executing the build
script will set such variable in the empty .config file before
running the olddefconfig target. The .config file is also checked
afterwards to assert the option has not been lost as part of the
configuration process.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich <jbeulich@suse.com>
---
Not sure whether there's some easier way to force a config option to
a set value from the command line.

4 years ago build: detect outdated configure outputs
Roger Pau Monne [Thu, 11 Mar 2021 11:34:01 +0000 (12:34 +0100)]
build: detect outdated configure outputs

The Xen build system relies on configure to parse some .in files in
order to do substitutions based on the data gathered from configure.

The main issue with those substitutions done at the configure level is
that make is not able to detect when they go out of date because the
.in file has been modified, and hence it's possible to end up in a
situation where .in files have been modified but the build is using
outdated ones. This is made even worse because the 'clean' targets
don't remove the output of the .in parsing, so doing a typical `make
clean && make` will still use the old files without complaining.
Note that 'clean' not removing the output of the .in transformations
is the right behavior, otherwise Xen would require re-executing the
configure script after each clean.

Attempt to improve the situation by adding a global rule that spots the
outdated files as long as they are properly listed as makefile target
prerequisites.

Ultimately those substitutions should be part of the build phase, not
the configure one.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
RFC because I'm not sure if there's some better way to handle this.
Also I think we would want to make sure all the .in outputs are
properly listed as target prerequisites, or else this won't work.

Also not sure whether this will break some other usage of .in files
I'm not aware of.

4 years ago xen/arm: guest_walk: Only generate necessary offsets/masks
Julien Grall [Sun, 18 Apr 2021 18:11:15 +0000 (19:11 +0100)]
xen/arm: guest_walk: Only generate necessary offsets/masks

At the moment, we are computing offsets/masks for each level and
granularity. This is a bit of waste given that we only need to
know the offsets/masks for the granularity used by the guest.

All the LPAE information can easily be inferred with just the
page shift for a given granularity and the level.

So rather than providing a set of helpers per granularity, we can
provide a single set that takes the granularity and the level in
parameters.

With the new helpers in place, we can rework guest_walk_ld() to
only compute necessary information.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Tested-by: Bertrand Marquis <bertrand.marquis@arm.com>
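
How the per-level information follows from just the page shift and the
level can be sketched like this. The helper names are hypothetical and
not the ones added by the patch; the sketch assumes 8-byte descriptors
(so each table level resolves page_shift - 3 bits) and levels numbered
0 to 3 with level 3 being the last one.

```c
#include <stdint.h>

/* Bits of address resolved per level: table entries are 8 bytes each. */
static unsigned int lpae_bits_per_level(unsigned int page_shift)
{
    return page_shift - 3;
}

/* Lowest address bit used to index the table at the given level. */
static unsigned int lpae_level_shift(unsigned int page_shift,
                                     unsigned int level)
{
    return (3 - level) * lpae_bits_per_level(page_shift) + page_shift;
}

/* Mask of the address bits that select the entry at the given level. */
static uint64_t lpae_level_mask(unsigned int page_shift, unsigned int level)
{
    uint64_t entries = UINT64_C(1) << lpae_bits_per_level(page_shift);

    return (entries - 1) << lpae_level_shift(page_shift, level);
}

/* Index into the table at the given level for an input address. */
static unsigned int lpae_table_index(uint64_t addr, unsigned int page_shift,
                                     unsigned int level)
{
    return (unsigned int)((addr & lpae_level_mask(page_shift, level)) >>
                          lpae_level_shift(page_shift, level));
}
```

With a 4K granule (page shift 12) this gives shifts of 39, 30, 21 and
12 for levels 0 to 3; with a 64K granule (page shift 16) levels 1 to 3
get 42, 29 and 16.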
4 years ago xen/arm: Include asm/asm-offsets.h and asm/macros.h on every assembly files
Julien Grall [Sat, 23 Jan 2021 17:48:45 +0000 (17:48 +0000)]
xen/arm: Include asm/asm-offsets.h and asm/macros.h on every assembly files

In a follow-up patch we may want to automatically replace some
mnemonics (such as ret) with a different sequence.

To ensure all the assembly files will include asm/macros.h it is best to
automatically include it in every assembly file. This can be done via
config.h.

It was necessary to include a few more headers as dependency:
  - <asm/asm_defns.h> to define sizeof_*
  - <xen/page-size.h> which is already a latent issue given STACK_ORDER
  relies on PAGE_SIZE.

Unfortunately the build system will use -D__ASSEMBLY__ when generating
the linker script. A new option -D__LINKER__ is introduced and used for
the linker script to avoid including headers (such as asm/macros.h) that
may not be compatible with the syntax.

Lastly, take the opportunity to remove both asm/asm-offsets.h and
asm/macros.h from the various assembly files as they are now
automagically included.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years ago x86/pv: Rename hypercall_table_t to pv_hypercall_table_t
Andrew Cooper [Thu, 15 Apr 2021 12:27:45 +0000 (13:27 +0100)]
x86/pv: Rename hypercall_table_t to pv_hypercall_table_t

The type is no longer appropriate for anything other than PV, and therefore
should not retain its generic name.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years ago x86/pv: Improve dom0_update_physmap() with CONFIG_SPECULATIVE_HARDEN_BRANCH
Andrew Cooper [Thu, 29 Oct 2020 19:53:28 +0000 (19:53 +0000)]
x86/pv: Improve dom0_update_physmap() with CONFIG_SPECULATIVE_HARDEN_BRANCH

dom0_update_physmap() is mostly called in two tight loops, where the lfences
hidden in is_pv_32bit_domain() have a substantial impact.

None of the boot time construction needs protection against malicious
speculation, so use a local variable and calculate is_pv_32bit_domain() just
once.

Reformat some of the code for legibility, now that the volume has reduced,
and remove some gratuitous negations.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
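
The "calculate once, pass a local" pattern described above can be
sketched as below. Everything here is a hypothetical stand-in: the
predicate merely counts its evaluations where the real
is_pv_32bit_domain() would carry an lfence in a hardened build, and the
physmap is a plain array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct domain_model {
    bool pv32;
};

/* Counts how often the (notionally expensive) predicate is evaluated. */
static unsigned int predicate_evaluations;

static bool is_32bit_model(const struct domain_model *d)
{
    predicate_evaluations++;
    return d->pv32;
}

/* Takes the already computed answer instead of re-evaluating it. */
static void update_physmap(bool compat, void *physmap, size_t pfn,
                           uint64_t mfn)
{
    if (compat)
        ((uint32_t *)physmap)[pfn] = (uint32_t)mfn;
    else
        ((uint64_t *)physmap)[pfn] = mfn;
}

static void fill_physmap(const struct domain_model *d, void *physmap,
                         size_t nr, uint64_t base_mfn)
{
    /* Evaluate the predicate once, outside of the tight loop. */
    const bool compat = is_32bit_model(d);
    size_t pfn;

    for (pfn = 0; pfn < nr; pfn++)
        update_physmap(compat, physmap, pfn, base_mfn + pfn);
}
```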
4 years ago string: drop redundant declarations
Jan Beulich [Fri, 16 Apr 2021 12:44:01 +0000 (14:44 +0200)]
string: drop redundant declarations

These standard functions shouldn't need custom declarations. The only
case where redundancy might be needed is if there were inline functions
there. But we don't have any here (anymore). Prune the per-arch headers
of duplicate declarations while moving the asm/string.h inclusion past
the declarations.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago lib: move 64-bit div/mod compiler helpers
Jan Beulich [Fri, 16 Apr 2021 12:43:10 +0000 (14:43 +0200)]
lib: move 64-bit div/mod compiler helpers

These were built for 32-bit architectures only (the same code could,
with some tweaking, sensibly be used to provide TI-mode helpers on
64-bit arch-es) - retain this property, while still avoiding having
a CU without any contents at all. For this, Arm's CONFIG_64BIT gets
generalized.

Note that we imply "32-bit arch" to be the same as BITS_PER_LONG == 32,
i.e. we aren't (not just here) prepared to have a 64-bit arch with
BITS_PER_LONG == 32. Yet even if we supported such, likely the compiler
would get away there without invoking these helpers, so the code would
remain unused in practice.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago lib: move muldiv64()
Jan Beulich [Fri, 16 Apr 2021 12:41:48 +0000 (14:41 +0200)]
lib: move muldiv64()

Make this a separate archive member under lib/. While doing so, don't
move latently broken x86 assembly though: Fix the constraints, such
that properly extending inputs to 64-bit won't just be a side effect of
needing to copy registers, and such that we won't fail to clobber %rdx.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years ago unxz: replace INIT{,DATA} and STATIC
Jan Beulich [Fri, 16 Apr 2021 12:40:15 +0000 (14:40 +0200)]
unxz: replace INIT{,DATA} and STATIC

With xen/common/decompress.h now agreeing in both build modes about
what STATIC expands to, there's no need for these abstractions anymore.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago unlz4: replace INIT
Jan Beulich [Fri, 16 Apr 2021 12:39:25 +0000 (14:39 +0200)]
unlz4: replace INIT

There's no need for this abstraction.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
4 years ago unlzma: replace INIT
Jan Beulich [Fri, 16 Apr 2021 12:38:50 +0000 (14:38 +0200)]
unlzma: replace INIT

There's no need for this abstraction.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago unlzo: replace INIT
Jan Beulich [Fri, 16 Apr 2021 12:38:26 +0000 (14:38 +0200)]
unlzo: replace INIT

There's no need for this abstraction.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago bunzip: replace INIT
Jan Beulich [Fri, 16 Apr 2021 12:37:36 +0000 (14:37 +0200)]
bunzip: replace INIT

While tools/libs/guest/xg_private.h has its own (non-conflicting for our
purposes) __init, which hence needs to be #undef-ed, there's no other
need for this abstraction.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago x86/hpet: Don't enable legacy replacement mode unconditionally
Jan Beulich [Wed, 24 Mar 2021 10:34:32 +0000 (11:34 +0100)]
x86/hpet: Don't enable legacy replacement mode unconditionally

Commit e1de4c196a2e ("x86/timer: Fix boot on Intel systems using ITSSPRC
static PIT clock gating") was reported to cause boot failures on certain
AMD Ryzen systems.

Refine the fix to do nothing in the default case, and only attempt to
configure legacy replacement mode if IRQ0 is found to not be working.  If
legacy replacement mode doesn't help, undo it before falling back to other IRQ
routing configurations.

In addition, introduce a "hpet" command line option so this heuristic
can be overridden.  Since it makes little sense to introduce just
"hpet=legacy-replacement", also allow for a boolean argument as well as
"broadcast" to replace the separate "hpetbroadcast" option.

Reported-by: Frédéric Pierret <frederic.pierret@qubes-os.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Frédéric Pierret <frederic.pierret@qubes-os.org>
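
The fallback order described above (do nothing by default, try legacy
replacement mode only if IRQ0 does not work, undo it if it does not
help) can be sketched as below. The platform is modelled by a made-up
structure and every name is hypothetical; none of this is the actual
HPET or timer-check code.

```c
#include <stdbool.h>

/* Hypothetical model of a platform, for illustration only. */
struct fake_platform {
    bool irq0_works_by_default;
    bool irq0_works_with_legacy_replacement;
    bool legacy_replacement_enabled;
};

enum timer_setup_result {
    TIMER_DEFAULT_OK,             /* nothing had to be changed */
    TIMER_LEGACY_REPLACEMENT_OK,  /* legacy replacement mode fixed IRQ0 */
    TIMER_TRY_OTHER_ROUTING,      /* mode undone, caller tries other setups */
};

static bool irq0_is_working(const struct fake_platform *p)
{
    return p->legacy_replacement_enabled
           ? p->irq0_works_with_legacy_replacement
           : p->irq0_works_by_default;
}

static enum timer_setup_result check_timer_irq(struct fake_platform *p)
{
    /* Default case: leave the HPET configuration alone. */
    if (irq0_is_working(p))
        return TIMER_DEFAULT_OK;

    /* IRQ0 is not working: try legacy replacement mode. */
    p->legacy_replacement_enabled = true;
    if (irq0_is_working(p))
        return TIMER_LEGACY_REPLACEMENT_OK;

    /* It did not help: undo it before falling back to other routings. */
    p->legacy_replacement_enabled = false;
    return TIMER_TRY_OTHER_ROUTING;
}
```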
4 years ago x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup()
Andrew Cooper [Wed, 24 Mar 2021 14:33:04 +0000 (14:33 +0000)]
x86/hpet: Factor hpet_enable_legacy_replacement_mode() out of hpet_setup()

... in preparation to introduce a second caller.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Frédéric Pierret <frederic.pierret@qubes-os.org>
4 years ago Revert "x86/HPET: don't enable legacy replacement mode unconditionally"
Andrew Cooper [Thu, 15 Apr 2021 15:19:01 +0000 (16:19 +0100)]
Revert "x86/HPET: don't enable legacy replacement mode unconditionally"

This reverts commit e680cc48b7184d3489873d6776f84ba1fc238ced.

It was committed despite multiple objections.  The agreed upon fix is a
different variation of the same original patch, and the delta between the two
is far from clear.

By reverting this commit first, the fixes are clear and coherent as individual
patches, and in the appropriate form for backport to the older trees.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago xen/arm: Prevent Dom0 to be loaded when using dom0less
Luca Fancellu [Wed, 14 Apr 2021 09:14:04 +0000 (10:14 +0100)]
xen/arm: Prevent Dom0 to be loaded when using dom0less

This patch prevents dom0 from being loaded, skipping its
build and going on to build the domUs, when the dom0
kernel is not found and at least one domU is present.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years ago xen/arm: Clarify how the domid is decided in create_domUs()
Luca Fancellu [Wed, 14 Apr 2021 09:14:03 +0000 (10:14 +0100)]
xen/arm: Clarify how the domid is decided in create_domUs()

This patch adds a comment in create_domUs() right before
domain_create() to explain the importance of the pre-increment
operator on the variable max_init_domid, to ensure that the
domid 0 is allocated only during start_xen() function by the
create_dom0() and not on any other possible code path to the
domain_create() function.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
Acked-by: Julien Grall <jgrall@amazon.com>
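
Why the pre-increment matters can be shown with a tiny sketch. Only the
variable name max_init_domid comes from the commit message above;
next_domu_domid() is a hypothetical helper, and the counter is assumed
to start at 0.

```c
/* Starts at 0, the domid reserved for dom0. */
static unsigned int max_init_domid;

/*
 * Pre-increment: the counter is bumped before its value is used, so the
 * first ID handed out on this path is 1 and domid 0 is never returned.
 */
static unsigned int next_domu_domid(void)
{
    return ++max_init_domid;
}
```

A post-increment here would return 0 on the first call and clash with
dom0.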
4 years ago xen/arm: xen/arm: Reinforce use of is_hardware_domain
Luca Fancellu [Wed, 14 Apr 2021 09:14:02 +0000 (10:14 +0100)]
xen/arm: xen/arm: Reinforce use of is_hardware_domain

There are a few places on Arm where we use pretty much an open-coded
version of is_hardware_domain(). The main difference is that the helper
will also block speculation (not yet implemented on Arm).

The existing users are not in hot path, so blocking speculation
would not hurt when it is implemented. So remove the open-coded
version within the arm codebase.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
[julieng: Rework the commit message]
Acked-by: Julien Grall <jgrall@amazon.com>
4 years ago xen/arm: Move dom0 creation in domain_build.c
Luca Fancellu [Wed, 14 Apr 2021 09:14:01 +0000 (10:14 +0100)]
xen/arm: Move dom0 creation in domain_build.c

Move dom0 create and start from setup.c to a dedicated
function in domain_build.c.

With this change, the function construct_dom0() is not
used outside of domain_build.c anymore.
So it is now a static function.

No functional changes intended.

Signed-off-by: Luca Fancellu <luca.fancellu@arm.com>
Reviewed-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years ago x86/amd: split LFENCE dispatch serializing setup logic into helper
Roger Pau Monné [Thu, 15 Apr 2021 11:45:09 +0000 (13:45 +0200)]
x86/amd: split LFENCE dispatch serializing setup logic into helper

Split the logic to attempt to set up LFENCE to be dispatch serializing
on AMD into a helper, so it can be shared with Hygon.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years ago x86: avoid building COMPAT code when !HVM && !PV32
Jan Beulich [Thu, 15 Apr 2021 11:43:51 +0000 (13:43 +0200)]
x86: avoid building COMPAT code when !HVM && !PV32

It was probably a mistake to, over time, drop various CONFIG_COMPAT
conditionals from x86-specific code, as we now have a build
configuration again where we'd prefer this to be unset. Arrange for
CONFIG_COMPAT to actually be off in this case, dealing with fallout.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
4 years ago x86: slim down hypercall handling when !PV32
Jan Beulich [Thu, 15 Apr 2021 11:35:32 +0000 (13:35 +0200)]
x86: slim down hypercall handling when !PV32

In such a build various of the compat handlers aren't needed. Don't
reference them from the hypercall table, and compile out those which
aren't needed for HVM. Also compile out switch_compat(), which has no
purpose in such a build.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
4 years ago x86: don't build unused entry code when !PV32
Jan Beulich [Thu, 15 Apr 2021 11:34:29 +0000 (13:34 +0200)]
x86: don't build unused entry code when !PV32

Except for the initial part of cstar_enter compat/entry.S is all dead
code in this case. Further, along the lines of the PV conditionals we
already have in entry.S, make code PV32-conditional there too (to a
fair part because this code actually references compat/entry.S).

This has the side effect of moving the tail part (now at compat_syscall)
of the code out of .text.entry (in line with e.g. compat_sysenter).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wl@xen.org>
4 years agoautomation: remove allow_failure from Alpine Linux jobs
Stefano Stabellini [Fri, 12 Mar 2021 21:05:26 +0000 (13:05 -0800)]
automation: remove allow_failure from Alpine Linux jobs

Now that the Alpine Linux build jobs complete successfully on staging we
can remove the "allow_failure: true" tag.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoxen/iommu: smmu: Silence clang in arm_smmu_device_dt_probe()
Julien Grall [Fri, 2 Apr 2021 15:51:06 +0000 (16:51 +0100)]
xen/iommu: smmu: Silence clang in arm_smmu_device_dt_probe()

Clang 11 will throw the following error:

smmu.c:2284:18: error: cast to smaller integer type 'enum arm_smmu_arch_version' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
        smmu->version = (enum arm_smmu_arch_version)of_id->data;
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The error can be prevented by initially casting to (uintptr_t) and then
enum.
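
The two-step cast can be sketched as follows. This is a minimal
illustration, not the driver code: enum smmu_version, struct match_entry
and entry_version() are hypothetical stand-ins for the real types.

```c
#include <stdint.h>

/* Hypothetical stand-in for the driver's version enum. */
enum smmu_version { SMMU_V1 = 1, SMMU_V2 = 2 };

/* Hypothetical match entry carrying the version in a void pointer. */
struct match_entry {
    const void *data;
};

static enum smmu_version entry_version(const struct match_entry *e)
{
    /*
     * Go through uintptr_t first: pointer -> same-width integer -> enum.
     * A direct (enum smmu_version)e->data cast is what clang warns about.
     */
    return (enum smmu_version)(uintptr_t)e->data;
}
```

The intermediate uintptr_t makes the narrowing explicit, so the
pointer-to-enum diagnostic no longer applies.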

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
4 years agoRevert "xen/arm: mm: flush_page_to_ram() only need to clean to PoC"
Julien Grall [Tue, 13 Apr 2021 16:15:39 +0000 (17:15 +0100)]
Revert "xen/arm: mm: flush_page_to_ram() only need to clean to PoC"

Some callers of flush_page_to_ram() expect the memory to be
invalidated. Reverts commit 9617d5f9c19d1d157629e1e436791509526e0ce5
to unblock OssTest.

Signed-off-by: Julien Grall <jgrall@amazon.com>
4 years agolibxl: User defined max_maptrack_frames in a stub domain
Dmitry Fedorov [Tue, 13 Apr 2021 14:17:29 +0000 (15:17 +0100)]
libxl: User defined max_maptrack_frames in a stub domain

Implementing qrexec+usbip+qemu in a Linux-based stub domain led me to
an issue where a device model stub domain doesn't have maptrack entries.

Signed-off-by: Dmitry Fedorov <d.fedorov@tabit.pro>
Acked-by: Wei Liu <wl@xen.org>
4 years agox86/cpuid: Advertise no-lmsl unilaterally to hvm guests
Andrew Cooper [Fri, 2 Apr 2021 13:10:25 +0000 (14:10 +0100)]
x86/cpuid: Advertise no-lmsl unilaterally to hvm guests

While part of the original AMD64 spec, Long Mode Segment Limit was a feature
not picked up by Intel, and therefore didn't see much adoption in software.
AMD have finally dropped the feature from hardware, and allocated a CPUID bit
to indicate its absence.

Xen has never supported the feature for guests, even when running on capable
hardware, so advertise the feature's absence unilaterally.

There is nothing specifically wrong with exposing this bit to PV guests, but
the PV ABI doesn't include a working concept of MSR_EFER in the first place,
so exposing it to PV guests would be out-of-place.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/HVM: move is_s3_suspended field
Jan Beulich [Tue, 13 Apr 2021 08:18:34 +0000 (10:18 +0200)]
x86/HVM: move is_s3_suspended field

Put it next to another boolean, so they will "share" the subsequent
padding hole.
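
The effect of grouping booleans can be sketched with two hypothetical
structures (these are not the real hvm structures). On a typical x86-64
ABI the first is 32 bytes and the second 24, but the exact sizes depend
on the ABI.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: each bool is followed by its own padding hole. */
struct spread {
    bool     a;
    uint64_t x;
    bool     b;
    uint64_t y;
};

/* Same fields, with the two bools adjacent so they share one hole. */
struct grouped {
    bool     a;
    bool     b;
    uint64_t x;
    uint64_t y;
};
```

Comparing sizeof(struct spread) with sizeof(struct grouped) shows the
saving on a given target.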

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/EPT: minor local variable adjustment in ept_set_entry()
Jan Beulich [Tue, 13 Apr 2021 08:18:08 +0000 (10:18 +0200)]
x86/EPT: minor local variable adjustment in ept_set_entry()

Not having direct_mmio (used only once anyway) as a local variable gets
the epte_get_entry_emt() invocation here in better sync with the other
ones. While at it also reduce ipat's scope.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoiommu: remove read_msi_from_ire hook
Roger Pau Monné [Tue, 13 Apr 2021 08:17:15 +0000 (10:17 +0200)]
iommu: remove read_msi_from_ire hook

It's now unused after commit 28fb8cf323dd93f59a9c851c93ba9b79de8b1c4e.

Fixes: 28fb8cf323d ('x86/iommu: remove code to fetch MSI message from remap table')
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoVT-d: drop unused #define-s
Jan Beulich [Tue, 13 Apr 2021 08:16:50 +0000 (10:16 +0200)]
VT-d: drop unused #define-s

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: avoid pointless use of 64-bit constants
Jan Beulich [Tue, 13 Apr 2021 08:16:28 +0000 (10:16 +0200)]
VT-d: avoid pointless use of 64-bit constants

When the respective registers are just 32 bits wide there's no point in
making corresponding constants 64-bit ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: qinval indexes are only up to 19 bits wide
Jan Beulich [Tue, 13 Apr 2021 08:16:06 +0000 (10:16 +0200)]
VT-d: qinval indexes are only up to 19 bits wide

There's no need for 64-bit accesses to these registers (outside of
initial setup and dumping).

Also remove some stray blanks.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: bring print_qi_regs() in line with print_iommu_regs()
Jan Beulich [Tue, 13 Apr 2021 08:15:41 +0000 (10:15 +0200)]
VT-d: bring print_qi_regs() in line with print_iommu_regs()

Shorten the names printed. There's also no need to go through a local
variable.

While at it also constify the function's parameter.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: don't open-code dmar_readl()
Jan Beulich [Tue, 13 Apr 2021 08:15:08 +0000 (10:15 +0200)]
VT-d: don't open-code dmar_readl()

While at it also drop the unnecessary use of a local variable there.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agoVT-d: improve save/restore of registers across S3
Jan Beulich [Tue, 13 Apr 2021 08:14:23 +0000 (10:14 +0200)]
VT-d: improve save/restore of registers across S3

The static allocation of the save space is not only very inefficient
(most of the array slots won't ever get used), but is also the sole
reason for a build-time upper bound on the number of IOMMUs. Introduce
a structure containing just the one needed field we can't (easily)
restore from other in-memory state, and allocate the respective
array dynamically.

Take the opportunity and make the FEUADDR write dependent upon
x2apic_enabled, like is already the case in dma_msi_set_affinity().

Also alter properties of nr_iommus: static, unsigned, and __initdata.
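
A rough sketch of sizing the save area at run time, assuming a single
saved 32-bit value per unit. The structure, field and function names are
hypothetical, and calloc() stands in for the hypervisor's allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-IOMMU save slot holding just one register value. */
struct iommu_saved_state {
    uint32_t reg;
};

static struct iommu_saved_state *saved_state;
static unsigned int nr_saved;

/* Size the save area from the number of units actually found. */
static int save_area_init(unsigned int nr_units)
{
    saved_state = calloc(nr_units, sizeof(*saved_state));
    if (saved_state == NULL && nr_units != 0)
        return -1;
    nr_saved = nr_units;
    return 0;
}

static void save_area_free(void)
{
    free(saved_state);
    saved_state = NULL;
    nr_saved = 0;
}
```

With this shape no build-time upper bound on the unit count is needed.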

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
4 years agox86/shadow: adjust is_pv_*() checks
Jan Beulich [Mon, 12 Apr 2021 10:37:19 +0000 (12:37 +0200)]
x86/shadow: adjust is_pv_*() checks

To cover for "x86: correct is_pv_domain() when !CONFIG_PV" (or any other
change along those lines) we should prefer is_hvm_*(), as it may become
a build time constant while is_pv_*() generally won't.

Also when a domain pointer is in scope, prefer is_*_domain() over
is_*_vcpu().
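
Why the HVM predicate may fold to a constant can be sketched like this.
All names are hypothetical and the macro only models a build switch; the
real predicates are more involved.

```c
#include <stdbool.h>

/* Hypothetical build switch; 0 models a build without HVM support. */
#define SKETCH_CONFIG_HVM 0

struct domain_sketch {
    bool hvm;   /* run-time flag recorded at domain creation */
};

/*
 * With SKETCH_CONFIG_HVM == 0 the left operand is a constant 0, so the
 * whole expression folds to false and guarded code can be discarded.
 */
static inline bool sketch_is_hvm_domain(const struct domain_sketch *d)
{
    return SKETCH_CONFIG_HVM && d->hvm;
}

/* The PV test still has to read the flag; it is not a constant here. */
static inline bool sketch_is_pv_domain(const struct domain_sketch *d)
{
    return !d->hvm;
}
```

Code guarded by the first predicate can be dropped by the compiler in
such a build, while code guarded by the second cannot.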

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: only 4-level guest code needs building when !HVM
Jan Beulich [Mon, 12 Apr 2021 10:34:04 +0000 (12:34 +0200)]
x86/shadow: only 4-level guest code needs building when !HVM

In order to limit #ifdef-ary, provide "stub" #define-s for
SH_type_{l1,fl1,l2}_{32,pae}_shadow and SHF_{L1,FL1,L2}_{32,PAE}.

The change in shadow_vcpu_init() is necessary to cover for "x86: correct
is_pv_domain() when !CONFIG_PV" (or any other change along those lines)
- we should only rely on is_hvm_*() to become a build time constant.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: drop SH_type_l2h_pae_shadow
Jan Beulich [Mon, 12 Apr 2021 10:33:17 +0000 (12:33 +0200)]
x86/shadow: drop SH_type_l2h_pae_shadow

This is a remnant from 32-bit days, having no place anymore where a
shadow of this type would be created.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: SH_type_l2h_shadow is PV-only
Jan Beulich [Mon, 12 Apr 2021 10:32:50 +0000 (12:32 +0200)]
x86/shadow: SH_type_l2h_shadow is PV-only

..., i.e. being used only with 4 guest paging levels. Drop its L2/PAE
alias and adjust / drop conditionals. Use >= 4 where touching them
anyway, in preparation for 5-level paging.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: don't open-code SHF_* shorthands
Jan Beulich [Mon, 12 Apr 2021 10:32:18 +0000 (12:32 +0200)]
x86/shadow: don't open-code SHF_* shorthands

Use SHF_L1_ANY, SHF_32, SHF_PAE, as well as SHF_64, and introduce
SHF_FL1_ANY.

Note that in shadow_audit_tables() this has the effect of no longer
(I assume mistakenly, or else I don't see why the respective callback
table entry isn't NULL) excluding SHF_L2H_64.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: move shadow_set_l<N>e() to their own source file
Jan Beulich [Mon, 12 Apr 2021 10:31:19 +0000 (12:31 +0200)]
x86/shadow: move shadow_set_l<N>e() to their own source file

The few GUEST_PAGING_LEVELS dependencies (of shadow_set_l2e() only) can
be easily expressed by function parameters; I suppose the extra indirect
call is acceptable for the increasingly little used 32-bit non-PAE case.
This way shadow_set_l[12]e(), each of which compiles to almost 1k of
code, need building just once.

The implication is the need for some "relaxation" in types.h: The
underlying PTE types don't vary anymore (and aren't expected to down the
road), so they as well as some basic helpers can be exposed even in the
new, artificial GUEST_PAGING_LEVELS == 0 case.

Almost pure code movement - exceptions are the conversion of
"#if GUEST_PAGING_LEVELS == 2" to runtime conditionals and style
corrections (including to avoid open-coding mfn_to_maddr() and
PAGE_OFFSET()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: polish shadow_write_entries()
Jan Beulich [Mon, 12 Apr 2021 10:30:13 +0000 (12:30 +0200)]
x86/shadow: polish shadow_write_entries()

First of all, avoid the initial dummy write: Try to write the actual
new value instead, and start the loop from 1 if this was successful.
Further, drop safe_write_entry() and use write_atomic() instead. This
eliminates the need for the BUILD_BUG_ON() there at the same time.

Then
- use const and unsigned,
- drop a redundant NULL check,
- don't open-code PAGE_OFFSET() and IS_ALIGNED(),
- adjust comment style.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agox86/shadow: use get_unsafe() instead of copy_from_unsafe()
Jan Beulich [Mon, 12 Apr 2021 10:28:52 +0000 (12:28 +0200)]
x86/shadow: use get_unsafe() instead of copy_from_unsafe()

This is the slightly more direct way of getting at what we want, and
better in line with shadow_write_entries()'s use of put_unsafe().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
4 years agogunzip: drop INIT{,DATA} and STATIC
Jan Beulich [Mon, 12 Apr 2021 10:26:54 +0000 (12:26 +0200)]
gunzip: drop INIT{,DATA} and STATIC

There's no need for the extra abstraction.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agolibxenguest: simplify kernel decompression
Jan Beulich [Mon, 12 Apr 2021 10:26:18 +0000 (12:26 +0200)]
libxenguest: simplify kernel decompression

In all cases the kernel build makes available the uncompressed size in
the final 4 bytes of the bzImage payload. Utilize this to avoid
repeated realloc()ing of the output buffer.

As a side effect this also addresses the previous mistaken return of 0
(success) from xc_try_{bzip2,lzma,xz}_decode() in case
xc_dom_register_external() would have failed.

As another side effect this also addresses the first error path of
_xc_try_lzma_decode() previously bypassing lzma_end().
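
The size lookup described at the top of this message can be sketched as
below, assuming the trailing 4 bytes hold a little-endian 32-bit value.
trailing_size_le32() and alloc_output() are hypothetical helpers, not
libxenguest functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Read the 32-bit little-endian size stored in the last 4 bytes of a
 * compressed payload. Returns 0 if the payload is too short.
 */
static uint32_t trailing_size_le32(const uint8_t *buf, size_t len)
{
    const uint8_t *p;

    if (buf == NULL || len < 4)
        return 0;

    p = buf + len - 4;
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Allocate the output buffer once, instead of growing it with realloc(). */
static void *alloc_output(const uint8_t *buf, size_t len, size_t *out_len)
{
    uint32_t size = trailing_size_le32(buf, len);

    if (size == 0)
        return NULL;

    *out_len = size;
    return malloc(size);
}
```

A real caller would also bound the size before trusting it, since the
value comes from the image being loaded.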

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agolibxenguest: drop redundant decompression declarations
Jan Beulich [Mon, 12 Apr 2021 10:25:55 +0000 (12:25 +0200)]
libxenguest: drop redundant decompression declarations

The ones in xg_dom_decompress_unsafe.h suffice.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wl@xen.org>
4 years agoxen/xsm: Improve alloc/free of evtchn buckets
Andrew Cooper [Sat, 16 Jan 2021 16:09:10 +0000 (16:09 +0000)]
xen/xsm: Improve alloc/free of evtchn buckets

Currently, flask_alloc_security_evtchn() is called in loops of
64 (EVTCHNS_PER_BUCKET), which for non-dummy implementations is a function
pointer call even in the no-op case.  The non no-op case only sets a single
constant, and doesn't actually fail.

Spectre v2 protections have made function pointer calls far more expensive, and
64 back-to-back calls are a waste.  Rework the APIs to pass the size of the
bucket instead, and call them once.

No practical change, but {alloc,free}_evtchn_bucket() should be rather more
efficient now.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Daniel P. Smith <dpsmith@apertussolutions.com>
4 years agotools/libs: Simplify internal *.pc files
Andrew Cooper [Wed, 25 Nov 2020 14:37:00 +0000 (14:37 +0000)]
tools/libs: Simplify internal *.pc files

The internal package config file for libxenlight reads (reformatted to avoid
exceeding the SMTP 998-character line length):

  Libs: -L${libdir}
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/evtchn
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/call
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/evtchn
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/gnttab
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/foreignmemory
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/call
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/devicemodel
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/ctrl
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/store
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/call
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/hypfs
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/evtchn
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/call
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/evtchn
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/gnttab
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/foreignmemory
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/call
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/devicemodel
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/ctrl
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/guest
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/light
  -lxenlight

Drop duplicate -rpath-link='s to turn it into the slightly-more-manageable:

  Libs: -L${libdir}
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/call
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/ctrl
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/devicemodel
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/evtchn
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/foreignmemory
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/gnttab
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/guest
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/hypfs
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/light
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/store
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toolcore
  -Wl,-rpath-link=/local/security/xen.git/tools/libs/light/../../../tools/libs/toollog
  -lxenlight
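
The sort-and-deduplicate idea behind the shorter list can be sketched in
C as below. This only illustrates the technique; the actual change lives
in the build system, and sort_unique() is a hypothetical helper.

```c
#include <stdlib.h>
#include <string.h>

static int cmp_str(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Sort paths in place and drop duplicates; returns the new count. */
static size_t sort_unique(const char **paths, size_t n)
{
    size_t i, out = 0;

    if (n == 0)
        return 0;

    qsort(paths, n, sizeof(*paths), cmp_str);
    for (i = 1; i < n; i++)
        if (strcmp(paths[out], paths[i]) != 0)
            paths[++out] = paths[i];
    return out + 1;
}
```

Sorting first places duplicates next to each other, so a single pass is
enough to drop them.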

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Bertrand Marquis <bertrand.marquis@arm.com>
4 years agotools: Drop gettext as a build dependency
Andrew Cooper [Fri, 26 Mar 2021 11:25:07 +0000 (11:25 +0000)]
tools: Drop gettext as a build dependency

It has not been a dependency since at least 4.13.  Remove its mandatory check
from ./configure.

Annotate the dependency in the CI dockerfiles, and drop it from CirrusCI and
TravisCI.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoxen/gunzip: Fix build with clang after 33bc2a8495f7
Julien Grall [Wed, 7 Apr 2021 18:22:10 +0000 (19:22 +0100)]
xen/gunzip: Fix build with clang after 33bc2a8495f7

The compilation will fail when building Xen with clang and
CONFIG_DEBUG=y:

make[4]: Leaving directory '/oss/xen/xen/common/libelf'
  INIT_O  gunzip.init.o
Error: size of gunzip.o:.text is 0x00000019

This is because the function init_allocator() will not be inlined
and is not part of the init section.

Fix it by marking init_allocator() with INIT.

Fixes: 33bc2a8495f7 ("xen/gunzip: Allow perform_gunzip() to be called multiple times")
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
4 years agoRevert "x86: guard against straight-line speculation past RET"
Jan Beulich [Fri, 9 Apr 2021 07:50:40 +0000 (09:50 +0200)]
Revert "x86: guard against straight-line speculation past RET"

This reverts commit 71b0b475d801ebeb83a6ba402425135c314fa2df,
which has no real effect - the most recent version of the patch
had lost the INT3 insn.

4 years agohypfs: avoid effectively open-coding xzalloc_array()
Jan Beulich [Fri, 9 Apr 2021 07:25:42 +0000 (09:25 +0200)]
hypfs: avoid effectively open-coding xzalloc_array()

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

To avoid the need to add a cast, do away with the only forward-declared
struct hypfs_dyndata.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
4 years agox86/vPMU: avoid effectively open-coding xzalloc_flex_struct()
Jan Beulich [Fri, 9 Apr 2021 07:25:17 +0000 (09:25 +0200)]
x86/vPMU: avoid effectively open-coding xzalloc_flex_struct()

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/HVM: avoid effectively open-coding xzalloc_flex_struct()
Jan Beulich [Fri, 9 Apr 2021 07:24:23 +0000 (09:24 +0200)]
x86/HVM: avoid effectively open-coding xzalloc_flex_struct()

Drop hvm_irq_size(), which exists for just this purpose.

There is a difference in generated code: xzalloc_bytes() forces
SMP_CACHE_BYTES alignment. I think we not only don't need this here, but
actually don't want it.
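
Allocating a structure with a flexible array member can be sketched as
below. This is not Xen's xzalloc_flex_struct(); struct irq_table and
irq_table_alloc() are hypothetical, and calloc() stands in for the
hypervisor's zeroing allocator.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical structure ending in a flexible array member. */
struct irq_table {
    unsigned int nr;
    uint8_t      slots[];
};

/*
 * Zeroed allocation of the header plus nr trailing elements, with an
 * overflow check on the size computation.
 */
static struct irq_table *irq_table_alloc(unsigned int nr)
{
    struct irq_table *t;
    size_t elem = sizeof(t->slots[0]);
    size_t head = offsetof(struct irq_table, slots);

    if (nr > (SIZE_MAX - head) / elem)
        return NULL;

    t = calloc(1, head + (size_t)nr * elem);
    if (t != NULL)
        t->nr = nr;
    return t;
}
```

Deriving the size from the type keeps the computation in one place, which
is what makes a separate size helper unnecessary.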

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agoMAINTAINERS: add myself as hypfs maintainer
Juergen Gross [Fri, 9 Apr 2021 07:23:28 +0000 (09:23 +0200)]
MAINTAINERS: add myself as hypfs maintainer

As I have contributed all the code for hypfs, it would be natural to
be the maintainer.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Ian Jackson <iwj@xenproject.org>
4 years agopci: move ATS code to common directory
Rahul Singh [Fri, 9 Apr 2021 07:22:26 +0000 (09:22 +0200)]
pci: move ATS code to common directory

PCI ATS code is common to all architectures; move it to the common
directory so that other architectures can use it.

No functional change intended.

Signed-off-by: Rahul Singh <rahul.singh@arm.com>
4 years agox86/vpt: simplify locking argument to write_{,un}lock
Boris Ostrovsky [Fri, 9 Apr 2021 07:22:04 +0000 (09:22 +0200)]
x86/vpt: simplify locking argument to write_{,un}lock

Make pt_adjust_vcpu() call write_{,un}lock with less indirection, like
create_periodic_time() already does.

Requested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/vpt: do not take pt_migrate rwlock in some cases
Boris Ostrovsky [Fri, 9 Apr 2021 07:21:27 +0000 (09:21 +0200)]
x86/vpt: do not take pt_migrate rwlock in some cases

Commit 8e76aef72820 ("x86/vpt: fix race when migrating timers between
vCPUs") addressed XSA-336 by introducing a per-domain rwlock that was
intended to protect periodic timers during VCPU migration. Since such
migration is an infrequent event no performance impact was expected.

Unfortunately this turned out not to be the case: on a fairly large
guest (92 VCPUs) we've observed as much as a 40% TPCC performance
regression with some guest kernels. Further investigation pointed to
the pt_migrate read lock taken in pt_update_irq() as the largest
contributor to this regression. With a large number of VCPUs and a large
number of VMEXITs (from where pt_update_irq() is always called) the
update of an atomic in read_lock() is thought to be the main cause.

Stephen Brennan analyzed the locking pattern and classified lock users as
follows:

1. Functions which read (maybe write) all periodic_time instances attached
to a particular vCPU. These are functions which use pt_vcpu_lock() such
as pt_restore_timer(), pt_save_timer(), etc.
2. Functions which want to modify a particular periodic_time object.
These functions lock whichever vCPU the periodic_time is attached to, but
since the vCPU could be modified without holding any lock, they are
vulnerable to XSA-336. Functions in this group use pt_lock(), such as
pt_timer_fn() or destroy_periodic_time().
3. Functions which not only want to modify the periodic_time, but also
would like to modify the vcpu fields. These are create_periodic_time()
or pt_adjust_vcpu(). They create XSA-336 conditions for group 2, but we
can't simply hold 2 vcpu locks due to the deadlock risk.

Roger then pointed out that group 1 functions don't really need to hold
the pt_migrate rwlock and that instead groups 2 and 3 should hold the
per-vcpu lock whenever they modify per-vcpu timer lists.

Suggested-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Stephen Brennan <stephen.s.brennan@oracle.com>
4 years agox86/irq: simplify loop in unmap_domain_pirq
Roger Pau Monné [Fri, 9 Apr 2021 07:20:57 +0000 (09:20 +0200)]
x86/irq: simplify loop in unmap_domain_pirq

The for loop in unmap_domain_pirq is unnecessarily complicated, with
several places where the index is incremented, and also different
exit conditions spread across the loop body.

Simplify it by looping over each possible PIRQ using the for loop
syntax, and remove all possible in-loop exit points.

No functional change intended.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agox86/shadow: encode full GFN in magic MMIO entries
Jan Beulich [Fri, 9 Apr 2021 07:20:15 +0000 (09:20 +0200)]
x86/shadow: encode full GFN in magic MMIO entries

Since we don't need to encode all of the PTE flags, we have enough bits
in the shadow entry to store the full GFN. Limit use of literal numbers
a little and instead derive some of the involved values. Sanity-check
the result via BUILD_BUG_ON()s.

This then allows dropping from sh_l1e_mmio() again the guarding against
too large GFNs. It needs replacing by an L1TF safety check though, which
in turn requires exposing cpu_has_bug_l1tf.
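
Deriving field widths and checking them at build time can be sketched as
below. Every constant here is hypothetical and chosen only for
illustration (it is not the real shadow entry layout), and C11
_Static_assert stands in for BUILD_BUG_ON().

```c
#include <stdint.h>

/* All constants below are hypothetical, chosen only for illustration. */
#define ENTRY_BITS       64
#define MAGIC_FLAG_BITS  12   /* low bits kept for flags */
#define MAGIC_TAG_BITS   4    /* high bits marking a "magic" entry */
#define GFN_SHIFT        MAGIC_FLAG_BITS
#define GFN_BITS         (ENTRY_BITS - MAGIC_FLAG_BITS - MAGIC_TAG_BITS)
#define GFN_MASK         ((UINT64_C(1) << GFN_BITS) - 1)
#define FLAG_MASK        ((UINT64_C(1) << MAGIC_FLAG_BITS) - 1)
#define MAGIC_TAG        (UINT64_C(0xF) << (GFN_SHIFT + GFN_BITS))
#define MAX_GFN_BITS     40   /* assumed: 52-bit addresses, 4k pages */

/* Build-time sanity checks on the derived values. */
_Static_assert(GFN_BITS >= MAX_GFN_BITS, "full GFN must fit");
_Static_assert(GFN_SHIFT + GFN_BITS + MAGIC_TAG_BITS == ENTRY_BITS,
               "fields must cover the whole entry");

static uint64_t mmio_entry_encode(uint64_t gfn, uint64_t flags)
{
    return MAGIC_TAG | ((gfn & GFN_MASK) << GFN_SHIFT) | (flags & FLAG_MASK);
}

static uint64_t mmio_entry_gfn(uint64_t entry)
{
    return (entry >> GFN_SHIFT) & GFN_MASK;
}
```

Because the GFN width is derived from the other fields, changing one of
them either keeps the layout valid or fails the build.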

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agox86/PV32: avoid TLB flushing after mod_l3_entry()
Jan Beulich [Fri, 9 Apr 2021 07:19:18 +0000 (09:19 +0200)]
x86/PV32: avoid TLB flushing after mod_l3_entry()

32-bit guests may not depend upon the side effect of using ordinary
4-level paging when running on a 64-bit hypervisor. For L3 entry updates
to take effect, they have to use a CR3 reload. Therefore there's no need
to issue a paging structure invalidating TLB flush in this case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/PV: restrict TLB flushing after mod_l[234]_entry()
Jan Beulich [Fri, 9 Apr 2021 07:18:51 +0000 (09:18 +0200)]
x86/PV: restrict TLB flushing after mod_l[234]_entry()

Just like we avoid invoking remote root pt flushes when all uses of an
L4 table can be accounted for locally, the same can be done for all of
L[234] for the linear pt flush when the table is a "free floating" one,
i.e. it is pinned but not hooked up anywhere. While this situation
doesn't occur very often, it can be observed.

Since this breaks one of the implications of the XSA-286 fix, drop the
flush_root_pt_local variable again and set ->root_pgt_changed directly,
just like it was before that change.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/PV: _PAGE_RW changes may take fast path of mod_l[234]_entry()
Jan Beulich [Fri, 9 Apr 2021 07:18:17 +0000 (09:18 +0200)]
x86/PV: _PAGE_RW changes may take fast path of mod_l[234]_entry()

The only time _PAGE_RW matters when validating an L2 or higher entry is
when an attempt is made to install a linear page table (see the comment
ahead of define_get_linear_pagetable()). Therefore when we disallow such
at build time, we can allow _PAGE_RW changes to take the fast paths there.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86: limit amount of INT3 in IND_THUNK_*
Jan Beulich [Fri, 9 Apr 2021 07:17:04 +0000 (09:17 +0200)]
x86: limit amount of INT3 in IND_THUNK_*

There's no point having every replacement variant also specify the
INT3 - just have it once in the base macro. When patching, NOPs will get
inserted, which are fine to speculate through (until reaching the INT3).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86: guard against straight-line speculation past RET
Jan Beulich [Fri, 9 Apr 2021 07:16:22 +0000 (09:16 +0200)]
x86: guard against straight-line speculation past RET

Under certain conditions CPUs can speculate into the instruction stream
past a RET instruction. Guard against this just like 3b7dab93f240
("x86/spec-ctrl: Protect against CALL/JMP straight-line speculation")
did - by inserting an "INT $3" insn. It's merely the mechanics of how to
achieve this that differ: A set of macros gets introduced to post-
process RET insns issued by the compiler (or living in assembly files).

Unfortunately for clang this requires further features their built-in
assembler doesn't support: We need to be able to override insn mnemonics
produced by the compiler (which may be impossible, if internally
assembly mnemonics never get generated).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86/PV: make post-migration page state consistent
Jan Beulich [Fri, 9 Apr 2021 07:15:38 +0000 (09:15 +0200)]
x86/PV: make post-migration page state consistent

When a page table page gets de-validated, its type reference count drops
to zero (and PGT_validated gets cleared), but its type remains intact.
XEN_DOMCTL_getpageframeinfo3, therefore, so far reported prior usage for
such pages. An intermediate write to such a page via e.g.
MMU_NORMAL_PT_UPDATE, however, would transition the page's type to
PGT_writable_page, thus altering what XEN_DOMCTL_getpageframeinfo3 would
return. In libxc the decision which pages to normalize / localize
depends solely on the type returned from the domctl. As a result without
further precautions the guest won't be able to tell whether such a page
has had its (apparent) PTE entries transitioned to the new MFNs.

Add a check of PGT_validated, thus consistently avoiding normalization /
localization in the tool stack.

Also use XEN_DOMCTL_PFINFO_NOTAB in the variable's initializer instead
of open coding it.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agolibxg: don't use max policy in xc_cpuid_xend_policy()
Jan Beulich [Fri, 9 Apr 2021 07:14:58 +0000 (09:14 +0200)]
libxg: don't use max policy in xc_cpuid_xend_policy()

Using max undermines the separation between default and max. For
example, turning off AVX512F on an MPX-capable system silently turns on
MPX, despite this not being part of the default policy anymore. Since
the information is used only for determining what to convert 'x' to (but
not to e.g. validate '1' settings), the effect of this change is
identical for guests with (suitable) "cpuid=" settings to that of the
changes separating default from max and then converting (e.g.) MPX from
being part of default to only being part of max for guests without
(affected) "cpuid=" settings.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
4 years agox86/CPUID: move some static masks into .init
Jan Beulich [Fri, 9 Apr 2021 07:14:25 +0000 (09:14 +0200)]
x86/CPUID: move some static masks into .init

Except for hvm_shadow_max_featuremask and deep_features they're
referenced by __init functions only.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agox86: refine guest_mode()
Jan Beulich [Fri, 9 Apr 2021 07:12:51 +0000 (09:12 +0200)]
x86: refine guest_mode()

The 2nd of the assertions as well as the macro's return value have been
assuming we're on the primary stack. While for most IST exceptions we
switch back to the main one when user mode was interrupted, for #DF we
intentionally never do, and hence a #DF actually triggering on a user
mode insn (which then is still a Xen bug) would in turn trigger this
assertion, rather than cleanly logging state.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agoxen/page_alloc: Don't hold the heap_lock when clearing PGC_need_scrub
Julien Grall [Thu, 21 Jan 2021 11:12:00 +0000 (11:12 +0000)]
xen/page_alloc: Don't hold the heap_lock when clearing PGC_need_scrub

Currently, the heap_lock is held when clearing PGC_need_scrub in
alloc_heap_pages(). However, this is unnecessary because the only caller
(mark_page_offline()) that can concurrently modify the count_info is
using cmpxchg() in a loop.

Therefore, rework the code to avoid holding the heap_lock and use
test_and_clear_bit() instead.
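
As a rough illustration of why the lock is not needed, the sketch below uses C11 atomics rather than Xen's own bitops. All names (demo_page, DEMO_NEED_SCRUB, DEMO_OFFLINED, demo_*) are hypothetical stand-ins and not the code from this commit. It shows an atomic test-and-clear of one flag coexisting with a concurrent updater written as a compare-and-exchange loop: both are atomic read-modify-write operations on the same word, so neither can lose the other's update.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bits: stand-ins for PGC_need_scrub and an "offlined" bit. */
#define DEMO_NEED_SCRUB (1UL << 0)
#define DEMO_OFFLINED   (1UL << 1)

struct demo_page {
    atomic_ulong count_info;
};

/* Atomically clear 'flag' and report whether it was set beforehand. */
static bool demo_test_and_clear(struct demo_page *pg, unsigned long flag)
{
    unsigned long old = atomic_fetch_and(&pg->count_info, ~flag);

    return (old & flag) != 0;
}

/* Concurrent updater in the style of a cmpxchg() loop. */
static void demo_mark_offlined(struct demo_page *pg)
{
    unsigned long old = atomic_load(&pg->count_info);

    while (!atomic_compare_exchange_weak(&pg->count_info, &old,
                                         old | DEMO_OFFLINED))
        ; /* on failure 'old' holds the fresh value; retry with it */
}

/* Usage: the two updates compose regardless of ordering. */
static void demo_selfcheck(void)
{
    struct demo_page pg;

    atomic_init(&pg.count_info, DEMO_NEED_SCRUB);
    demo_mark_offlined(&pg);
    assert(demo_test_and_clear(&pg, DEMO_NEED_SCRUB));
    assert(atomic_load(&pg.count_info) == DEMO_OFFLINED);
}
```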

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agofix for_each_cpu() again for NR_CPUS=1
Jan Beulich [Wed, 7 Apr 2021 10:24:45 +0000 (12:24 +0200)]
fix for_each_cpu() again for NR_CPUS=1

Unfortunately aa50f45332f1 ("xen: fix for_each_cpu when NR_CPUS=1") has
caused quite a bit of fallout with gcc10, e.g. (there are at least two
more similar ones, and I didn't bother trying to find them all):

In file included from .../xen/include/xen/config.h:13,
                 from <command-line>:
core_parking.c: In function ‘core_parking_power’:
.../xen/include/asm/percpu.h:12:51: error: array subscript 1 is above array bounds of ‘long unsigned int[1]’ [-Werror=array-bounds]
   12 |     (*RELOC_HIDE(&per_cpu__##var, __per_cpu_offset[cpu]))
.../xen/include/xen/compiler.h:141:29: note: in definition of macro ‘RELOC_HIDE’
  141 |     (typeof(ptr)) (__ptr + (off)); })
      |                             ^~~
core_parking.c:133:39: note: in expansion of macro ‘per_cpu’
  133 |             core_tmp = cpumask_weight(per_cpu(cpu_core_mask, cpu));
      |                                       ^~~~~~~
In file included from .../xen/include/xen/percpu.h:4,
                 from .../xen/include/asm/msr.h:7,
                 from .../xen/include/asm/time.h:5,
                 from .../xen/include/xen/time.h:76,
                 from .../xen/include/xen/spinlock.h:4,
                 from .../xen/include/xen/cpu.h:5,
                 from core_parking.c:19:
.../xen/include/asm/percpu.h:6:22: note: while referencing ‘__per_cpu_offset’
    6 | extern unsigned long __per_cpu_offset[NR_CPUS];
      |                      ^~~~~~~~~~~~~~~~

One of the further errors even went as far as claiming that an array
index (range) of [0, 0] was outside the bounds of a [1] array, so
something fishy is pretty clearly going on there.

The compiler apparently wants to be able to see that the loop isn't
really a loop in order to avoid triggering such warnings, yet what
exactly makes it consider the loop exit condition constant and within
the [0, 1] range isn't obvious - using ((mask)->bits[0] & 1) instead of
cpumask_test_cpu() for example did _not_ help.

Re-instate a special form of for_each_cpu(), experimentally "proven" to
avoid the diagnostics.
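
For illustration only, one possible shape of such a special form is sketched below. This is an assumption and not necessarily the macro introduced by this commit; demo_cpumask, demo_test_cpu0() and demo_for_each_cpu() are hypothetical names. The idea is that with a single possible CPU the loop bound is either 0 or 1, so the index can never exceed 0.

```c
#include <assert.h>

/* Hypothetical single-word CPU mask for a build with exactly one CPU. */
struct demo_cpumask {
    unsigned long bits[1];
};

/* Returns 1 if CPU 0 is set in the mask, 0 otherwise. */
static unsigned int demo_test_cpu0(const struct demo_cpumask *mask)
{
    return (unsigned int)(mask->bits[0] & 1UL);
}

/*
 * Single-CPU iteration: the body runs at most once, with cpu == 0,
 * and only if CPU 0 is present in the mask.
 */
#define demo_for_each_cpu(cpu, mask) \
    for ((cpu) = 0; (cpu) < demo_test_cpu0(mask); ++(cpu))

/* Usage: count how many CPUs the loop visits. */
static unsigned int demo_count_cpus(const struct demo_cpumask *mask)
{
    unsigned int cpu, n = 0;

    demo_for_each_cpu(cpu, mask) {
        assert(cpu == 0);
        n++;
    }

    return n;
}
```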

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
4 years agotools/firmware: hvmloader: Use const in __bug() and __assert_failed()
Julien Grall [Tue, 6 Apr 2021 19:01:18 +0000 (20:01 +0100)]
tools/firmware: hvmloader: Use const in __bug() and __assert_failed()

__bug() and __assert_failed() are not meant to modify the string
parameters. So mark them as const.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
4 years agotools/xentrace: Use const whenever we point to literal strings
Julien Grall [Tue, 6 Apr 2021 19:00:25 +0000 (20:00 +0100)]
tools/xentrace: Use const whenever we point to literal strings

Literal strings are not meant to be modified. So we should use const
char * rather than char * when we want to store a pointer to them.
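
A minimal sketch of the pattern, using hypothetical names (demo_event_names, demo_event_name(), demo_event_index()) that do not come from xentrace itself: storing and returning literals through const char * lets the compiler reject accidental writes through the pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table: every entry points at a string literal. */
static const char *const demo_event_names[] = { "sched", "mem", "irq" };

#define DEMO_NR_EVENTS (sizeof(demo_event_names) / sizeof(demo_event_names[0]))

/* Returning const char * makes writes through the result a compile error. */
static const char *demo_event_name(size_t idx)
{
    return idx < DEMO_NR_EVENTS ? demo_event_names[idx] : "unknown";
}

/* Reverse lookup; returns DEMO_NR_EVENTS when the name is not found. */
static size_t demo_event_index(const char *name)
{
    size_t i;

    for (i = 0; i < DEMO_NR_EVENTS; i++)
        if (strcmp(demo_event_names[i], name) == 0)
            break;

    return i;
}

/* Usage example. */
static void demo_selfcheck(void)
{
    const char *name = demo_event_name(1);

    assert(strcmp(name, "mem") == 0);
    assert(demo_event_index(name) == 1);
}
```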

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
4 years agotools/kdd: Use const whenever we point to literal strings
Julien Grall [Tue, 6 Apr 2021 18:59:25 +0000 (19:59 +0100)]
tools/kdd: Use const whenever we point to literal strings

Literal strings are not meant to be modified. So we should use const
char * rather than char * when we want to store a pointer to them.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agoxen/x86: shadow: The return type of sh_audit_flags() should be const
Julien Grall [Tue, 6 Apr 2021 18:58:05 +0000 (19:58 +0100)]
xen/x86: shadow: The return type of sh_audit_flags() should be const

The function sh_audit_flags() returns pointers to literal strings.
They should not be modified, so the return is now const and this is
propagated to the callers.

Take the opportunity to fix the coding style in the declaration of
sh_audit_flags.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
4 years agoxen/sched: Constify name and opt_name in struct scheduler
Julien Grall [Tue, 6 Apr 2021 18:34:08 +0000 (19:34 +0100)]
xen/sched: Constify name and opt_name in struct scheduler

Both name and opt_name point to literal strings. So mark both of
the fields as const.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
4 years agoxen: Constify the second parameter of rangeset_new()
Julien Grall [Tue, 6 Apr 2021 18:03:49 +0000 (19:03 +0100)]
xen: Constify the second parameter of rangeset_new()

The string 'name' will never get modified by the function, so mark it
as const.

Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoxen/gunzip: Allow perform_gunzip() to be called multiple times
Julien Grall [Wed, 3 Mar 2021 19:27:56 +0000 (19:27 +0000)]
xen/gunzip: Allow perform_gunzip() to be called multiple times

Currently perform_gunzip() can only be called once because the
internal state (e.g. allocate) is not fully re-initialized.

This works fine if you are only booting dom0. But this will break when
booting multiple domains using dom0less with compressed kernel images.

This can be resolved by re-initializing bytes_out, malloc_ptr,
malloc_count every time perform_gunzip() is called.

Note the latter is only re-initialized for hardening purposes, as there is
no guarantee that every malloc() is followed by a free() (it should be, in
theory!).

Take the opportunity to check the return of alloc_heap_pages() to return
an error rather than dereferencing a NULL pointer later on failure.
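
The failure mode and the fix can be sketched as follows. This is a simplified model under stated assumptions, not the gunzip code: demo_perform() and the demo_* variables are hypothetical stand-ins for perform_gunzip() and its file-scope state, the "decompression" is a plain copy, and the arena is passed in by the caller instead of being allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical file-scope state, mirroring the fields named above. */
static size_t demo_bytes_out;          /* bytes produced so far */
static size_t demo_malloc_ptr;         /* bump-allocator offset into the arena */
static unsigned int demo_malloc_count; /* outstanding allocations */

/* Stand-in for decompression: copies 'in' into the arena.
 * Returns the number of bytes written, or -1 on error. */
static long demo_perform(unsigned char *arena, size_t arena_len,
                         const unsigned char *in, size_t in_len)
{
    /* Check the allocation result instead of dereferencing NULL later. */
    if (arena == NULL)
        return -1;

    /* Re-initialize all per-run state so a repeated call starts clean. */
    demo_bytes_out = 0;
    demo_malloc_ptr = 0;
    demo_malloc_count = 0;

    if (in_len > arena_len - demo_malloc_ptr)
        return -1;

    memcpy(arena + demo_malloc_ptr, in, in_len);
    demo_malloc_ptr += in_len;
    demo_malloc_count++;  /* deliberately never "freed" */
    demo_bytes_out += in_len;

    return (long)demo_bytes_out;
}

/* Usage: the second call only succeeds because the state was reset. */
static void demo_selfcheck(void)
{
    unsigned char arena[8];
    const unsigned char img[6] = { 1, 2, 3, 4, 5, 6 };

    assert(demo_perform(arena, sizeof(arena), img, sizeof(img)) == 6);
    assert(demo_perform(arena, sizeof(arena), img, sizeof(img)) == 6);
}
```

Without the three resets at the top, the second call in this model would find the arena already consumed by the first and fail.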

Reported-by: Charles Chiou <cchiou@ambarella.com>
Signed-off-by: Julien Grall <jgrall@amazon.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
4 years agoCHANGELOG.md: irq-max-guests
George Dunlap [Thu, 1 Apr 2021 13:34:04 +0000 (14:34 +0100)]
CHANGELOG.md: irq-max-guests

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
CC: Igor Druzhinin <igor.druzhinin@citrix.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Ian Jackson <ian.jackson@citrix.com>
4 years agoCHANGELOG.md: Various entries, mostly xenstore
George Dunlap [Thu, 1 Apr 2021 13:30:55 +0000 (14:30 +0100)]
CHANGELOG.md: Various entries, mostly xenstore

...grouped by submitters / maintainers

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
---
CC: Juergen Gross <jgross@suse.com>
CC: Jan Beulich <jbeulich@suse.com>
CC: Ian Jackson <ian.jackson@citrix.com>