xenbits.xensource.com Git - xen.git/log
5 years agox86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight
Jan Beulich [Thu, 6 Jun 2019 14:04:53 +0000 (16:04 +0200)]
x86/IRQ: bail early from irq_guest_eoi_timer_fn() when nothing is in flight

There's no point entering the loop in the function in this case. Instead,
something still being in flight _after_ the loop would be an actual
problem: no timer would be running anymore to eventually issue the EOI,
and hence this IRQ (and possibly lower priority ones) would be blocked,
perhaps indefinitely.

Issue a warning instead and prefer breaking some (presumably
misbehaving) guest over stalling perhaps the entire system.
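
As an illustration only, the resulting shape of the handler is roughly as
below (field names such as action->in_flight follow the commit text, but
this is a sketch rather than the exact Xen source):

  static void irq_guest_eoi_timer_fn(void *data)
  {
      struct irq_desc *desc = data;
      irq_guest_action_t *action = (irq_guest_action_t *)desc->action;

      spin_lock_irq(&desc->lock);

      /* Nothing in flight: nothing to EOI, so don't enter the loop at all. */
      if ( !action->in_flight )
          goto out;

      /* ... walk the domains which still owe an EOI ... */

      /* Something left in flight here means no timer will issue the EOI. */
      if ( action->in_flight )
          printk(XENLOG_G_WARNING
                 "IRQ: %d handlers still in flight at forced EOI\n",
                 action->in_flight);

   out:
      spin_unlock_irq(&desc->lock);
  }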

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/IRQ: don't keep EOI timer running without need
Jan Beulich [Thu, 6 Jun 2019 14:04:09 +0000 (16:04 +0200)]
x86/IRQ: don't keep EOI timer running without need

The timer needs to remain active only until all pending IRQ instances
have seen EOIs from their respective domains. Stop it when the in-flight
count has reached zero in desc_guest_eoi(). Note that this is race free
(with __do_IRQ_guest()), as the IRQ descriptor lock is being held at
that point.

Also pull up stopping of the timer in __do_IRQ_guest() itself: Instead
of stopping it immediately before re-setting, stop it as soon as we've
made it past any early returns from the function (and hence we're sure
it'll get set again).

Finally bail from the actual timer handler in case we find the timer
already active again by the time we've managed to acquire the IRQ
descriptor lock. Without this we may forcibly EOI an IRQ immediately
after it got sent to a guest. For this, timer_is_active() gets split out
of active_timer(), deliberately moving just one of the two ASSERT()s (to
allow the function to be used also on a never initialized timer).
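
A rough sketch of the desc_guest_eoi() part of this change (simplified;
the names follow the existing irq_guest_action_t layout):

  /* IRQ descriptor lock held, so this doesn't race with __do_IRQ_guest()
   * re-arming the timer. */
  if ( --action->in_flight == 0 )
      stop_timer(&action->eoi_timer);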

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agomemory: don't depend on guest_handle_subrange_okay() implementation details
Jan Beulich [Thu, 6 Jun 2019 14:03:10 +0000 (16:03 +0200)]
memory: don't depend on guest_handle_subrange_okay() implementation details

guest_handle_subrange_okay() takes inclusive first and last parameters,
i.e. checks that [first, last] is valid. Many callers, however, actually
need to see whether [first, limit) is valid (i.e., limit is non-
inclusive), and to do this they subtract 1 from the size. This is
normally correct, except in cases where first == limit, in which case
guest_handle_subrange_okay() will be passed a second parameter less than
its first.

As it happens, due to the way the math is implemented in x86's
guest_handle_subrange_okay(), the return value turns out to be correct;
but we shouldn't rely on this behavior.

Make sure all callers handle first == limit explicitly before calling
guest_handle_subrange_okay().
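
For illustration, a caller-side pattern along these lines (variable names
hypothetical) would be:

  /* Skip the check altogether for an empty range, since
   * guest_handle_subrange_okay() takes *inclusive* bounds. */
  if ( start != limit &&
       !guest_handle_subrange_okay(extent_list, start, limit - 1) )
      return -EFAULT;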

Note that the other uses (increase-reservation, populate-physmap, and
decrease-reservation) are already fine due to a suitable check in
do_memory_op().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agoadjust system domain creation (and call it earlier on x86)
Jan Beulich [Thu, 6 Jun 2019 09:16:57 +0000 (11:16 +0200)]
adjust system domain creation (and call it earlier on x86)

Split out this mostly arch-independent code into a common-code helper
function. (This does away with Arm's arch_init_memory() altogether.)

On x86 this needs to happen before acpi_boot_init(): Commit 9fa94e1058
("x86/ACPI: also parse AMD IOMMU tables early") only appeared to work
fine - it's really broken, and doesn't crash (on non-EFI AMD systems)
only because of there being a mapping of linear address 0 during early
boot. On EFI there is:

 Early fatal page fault at e008:ffff82d08024d58e (cr2=0000000000000220, ec=0000)
 ----[ Xen-4.13-unstable  x86_64  debug=y   Not tainted ]----
 CPU:    0
 RIP:    e008:[<ffff82d08024d58e>] pci.c#_pci_hide_device+0x17/0x3a
 RFLAGS: 0000000000010046   CONTEXT: hypervisor
 rax: 0000000000000000   rbx: 0000000000006000   rcx: 0000000000000000
 rdx: ffff83104f2ee9b0   rsi: ffff82e0209e5d48   rdi: ffff83104f2ee9a0
 rbp: ffff82d08081fce0   rsp: ffff82d08081fcb8   r8:  0000000000000000
 r9:  8000000000000000   r10: 0180000000000000   r11: 7fffffffffffffff
 r12: ffff83104f2ee9a0   r13: 0000000000000002   r14: ffff83104f2ee4b0
 r15: 0000000000000064   cr0: 0000000080050033   cr4: 00000000000000a0
 cr3: 000000009f614000   cr2: 0000000000000220
 fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
 ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
 Xen code around <ffff82d08024d58e> (pci.c#_pci_hide_device+0x17/0x3a):
  48 89 47 38 48 8d 57 10 <48> 8b 88 20 02 00 00 48 89 51 08 48 89 4f 10 48
 Xen stack trace from rsp=ffff82d08081fcb8:
[...]
 Xen call trace:
    [<ffff82d08024d58e>] pci.c#_pci_hide_device+0x17/0x3a
[   [<                >] pci_ro_device+...]
    [<ffff82d080617fe1>] amd_iommu_detect_one_acpi+0x161/0x249
    [<ffff82d0806186ac>] iommu_acpi.c#detect_iommu_acpi+0xb5/0xe7
    [<ffff82d08061cde0>] acpi_table_parse+0x61/0x90
    [<ffff82d080619e7d>] amd_iommu_detect_acpi+0x17/0x19
    [<ffff82d08061790b>] acpi_ivrs_init+0x20/0x5b
    [<ffff82d08062e838>] acpi_boot_init+0x301/0x30f
    [<ffff82d080628b10>] __start_xen+0x1daf/0x28a2

 Pagetable walk from 0000000000000220:
  L4[0x000] = 000000009f44f063 ffffffffffffffff
  L3[0x000] = 000000009f44b063 ffffffffffffffff
  L2[0x000] = 0000000000000000 ffffffffffffffff

 ****************************************
 Panic on CPU 0:
 FATAL TRAP: vector = 14 (page fault)
 [error_code=0000] , IN INTERRUPT CONTEXT
 ****************************************

Of course the bug would nevertheless have led to post-boot crashes as
soon as the list actually got traversed.

Take the opportunity and
- convert BUG_ON()s being moved to panic(),
- add __read_mostly annotations to the dom_* definitions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agoPCI: move pdev_list field to common structure
Jan Beulich [Thu, 6 Jun 2019 09:14:58 +0000 (11:14 +0200)]
PCI: move pdev_list field to common structure

Its management shouldn't be arch-specific, and in particular there
should be no need for special precautions when creating the special
domains.

Take the opportunity and
- correct parenthesization of for_each_pdev(),
- stop open-coding for_each_pdev() in vPCI code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/IRQ: relax locking in irq_guest_eoi_timer_fn()
Jan Beulich [Thu, 6 Jun 2019 09:14:00 +0000 (11:14 +0200)]
x86/IRQ: relax locking in irq_guest_eoi_timer_fn()

This is a timer handler, so it gets entered with IRQs enabled. Therefore
there's no need to save/restore the IRQ masking flag.

Additionally the final switch()'s ACKTYPE_EOI case re-acquires the lock
just for it to be dropped again right away. Do away with this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoarm: rename tiny64.conf to tiny64_defconfig
Volodymyr Babchuk [Thu, 16 May 2019 13:39:00 +0000 (15:39 +0200)]
arm: rename tiny64.conf to tiny64_defconfig

As the build system now supports *_defconfig rules, it is good to be able
to configure a minimal Xen image with the

 make tiny64_defconfig

command.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agomakefile: add support for *_defconfig targets
Volodymyr Babchuk [Thu, 6 Jun 2019 09:11:14 +0000 (11:11 +0200)]
makefile: add support for *_defconfig targets

Ease Xen configuration for non-standard builds, such as the
armv8 tiny config.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
5 years agoxen/bitops: Further reduce the #ifdef-ary in generic_hweight64()
Andrew Cooper [Tue, 4 Jun 2019 12:40:08 +0000 (13:40 +0100)]
xen/bitops: Further reduce the #ifdef-ary in generic_hweight64()

This #ifdef-ary isn't necessary, and the logic can live in a plain if()

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/vm-event: Misc fixups
Andrew Cooper [Fri, 31 May 2019 19:54:28 +0000 (12:54 -0700)]
xen/vm-event: Misc fixups

 * Drop redundant brackets, and inline qualifiers.
 * Insert newlines and spaces where appropriate.
 * Drop redundant NDEBUG - gdprintk() is already conditional.  Fix the
   logging level, as gdprintk() already prefixes the guest marker.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
5 years agoxen/vm-event: Fix interactions with the vcpu list
Andrew Cooper [Fri, 31 May 2019 19:29:27 +0000 (12:29 -0700)]
xen/vm-event: Fix interactions with the vcpu list

vm_event_resume() should use domain_vcpu(), rather than opencoding it
without its Spectre v1 safety.
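
For reference, domain_vcpu() bounds the index before the array access,
roughly along the lines below (a sketch, not necessarily the exact helper):

  static inline struct vcpu *domain_vcpu(const struct domain *d,
                                         unsigned int vcpu_id)
  {
      unsigned int idx = array_index_nospec(vcpu_id, d->max_vcpus);

      /* The bounds check still rejects bad ids; the clamped index is what
       * prevents a speculative out-of-bounds load. */
      return vcpu_id >= d->max_vcpus ? NULL : d->vcpu[idx];
  }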

vm_event_wake_blocked() can't ever be invoked in a case where d->vcpu is
NULL, so drop the outer if() and reindent, fixing up style issues.

The comment, which is left alone, is false.  This algorithm still has
starvation issues when there is an asymmetric rate of generated events.

However, the existing logic is sufficiently complicated and fragile that
I don't think I've followed it fully, and because we're trying to
obsolete this interface, the safest course of action is to leave it
alone, rather than to end up making things subtly different.

Therefore, no practical change that callers would notice.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
5 years agoxen/vm-event: Remove unnecessary vm_event_domain indirection
Andrew Cooper [Fri, 31 May 2019 20:11:15 +0000 (13:11 -0700)]
xen/vm-event: Remove unnecessary vm_event_domain indirection

The use of (*ved)-> leads to poor code generation, as the compiler can't
assume the pointer hasn't changed, and results in hard-to-follow code.

For both vm_event_{en,dis}able(), rename the ved parameter to p_ved, and
work primarily with a local ved pointer.

This has a key advantage in vm_event_enable(), in that the partially
constructed vm_event_domain only becomes globally visible once it is
fully constructed.  As a consequence, the spinlock doesn't need holding.
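
A condensed sketch of that shape (signature simplified, error handling and
the actual ring setup omitted):

  static int vm_event_enable(struct domain *d, struct vm_event_domain **p_ved)
  {
      struct vm_event_domain *ved = xzalloc(struct vm_event_domain);

      if ( !ved )
          return -ENOMEM;

      /* Fully construct *ved while nobody else can see it ... */

      /* ... and only then publish it; no lock needed up to this point. */
      *p_ved = ved;

      return 0;
  }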

Furthermore, rearrange the order of operations to be more sensible.
Check for repeated enables and a bad HVM_PARAM before allocating
memory, and gather the trivial setup into one place, dropping the
redundant zeroing.

No practical change that callers will notice.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
5 years agoxen/vm-event: Expand vm_event_* spinlock macros and rename the lock
Andrew Cooper [Fri, 31 May 2019 20:57:03 +0000 (13:57 -0700)]
xen/vm-event: Expand vm_event_* spinlock macros and rename the lock

These serve no purpose but to add to the cognitive load of following
the code.  Remove the level of indirection.

Furthermore, the lock protects all data in vm_event_domain, making
ring_lock a poor choice of name.

For vm_event_get_response() and vm_event_grab_slot(), fold the exit
paths to have a single unlock, as the compiler can't make this
optimisation itself.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
5 years agoxen/vm-event: Drop unused u_domctl parameter from vm_event_domctl()
Andrew Cooper [Fri, 31 May 2019 19:35:55 +0000 (12:35 -0700)]
xen/vm-event: Drop unused u_domctl parameter from vm_event_domctl()

This parameter isn't used at all.  Furthermore, elide the copyback in
failing cases, as it is only successful paths which generate data which
needs sending back to the caller.

Finally, drop a redundant d == NULL check, as that logic is all common
at the beginning of do_domctl().

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agosched_null: superficial clean-ups
Baodong Chen [Mon, 3 Jun 2019 15:56:20 +0000 (17:56 +0200)]
sched_null: superficial clean-ups

* Remove unused dependency 'keyhandler.h'
* Make sched_null_def static

Signed-off-by: Baodong Chen <chenbaodong@mxnavi.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agox86: remove alternative_callN usage of ALTERNATIVE asm macro
Roger Pau Monné [Mon, 3 Jun 2019 15:55:37 +0000 (17:55 +0200)]
x86: remove alternative_callN usage of ALTERNATIVE asm macro

There is a bug in llvm that needs to be fixed before switching to use
the alternative assembly macros in inline assembly call sites.
Therefore alternative_callN using inline assembly to generate the
alternative patch sites should be using the ALTERNATIVE C preprocessor
macro rather than the ALTERNATIVE assembly macro. Using the assembly
macro in an inline assembly instance triggers the following bug on
llvm based toolchains:

<instantiation>:1:1: error: invalid symbol redefinition
.L0_orig_s: call *genapic+64(%rip); .L0_orig_e: .L0_diff = (.L0_repl_e1 - .L0_repl_s1) - (...
^
<instantiation>:1:37: error: invalid symbol redefinition
.L0_orig_s: call *genapic+64(%rip); .L0_orig_e: .L0_diff = (.L0_repl_e1 - .L0_repl_s1) - (...
                                    ^
<instantiation>:1:60: error: invalid reassignment of non-absolute variable '.L0_diff'
.L0_orig_s: call *genapic+64(%rip); .L0_orig_e: .L0_diff = (.L0_repl_e1 - .L0_repl_s1) - (...
                                                           ^
<inline asm>:1:2: note: while in macro instantiation
        ALTERNATIVE "call *genapic+64(%rip)", "call .", X86_FEATURE_LM
        ^
<instantiation>:1:156: error: invalid symbol redefinition
  ...- (.L0_orig_e - .L0_orig_s); mknops ((-(.L0_diff > 0)) * .L0_diff); .L0_orig_p:
                                                                         ^
<instantiation>:18:5: error: invalid symbol redefinition
    .L0_repl_s1: call .; .L0_repl_e1:
    ^
<instantiation>:18:26: error: invalid symbol redefinition
    .L0_repl_s1: call .; .L0_repl_e1:
                         ^

This has been reported to upstream llvm:

https://bugs.llvm.org/show_bug.cgi?id=42034

Fixes: 67d01cdb5 ("x86: infrastructure to allow converting certain indirect calls to direct ones")
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86: further speed-up to hweight{32,64}()
Jan Beulich [Mon, 3 Jun 2019 15:21:05 +0000 (17:21 +0200)]
x86: further speed-up to hweight{32,64}()

According to Linux commit 0136611c62 ("optimize hweight64 for x86_64")
this is a further improvement over the variant using only bitwise
operations. It's also a slight further code size reduction.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agobitops: speed up hweight<N>()
Jan Beulich [Mon, 3 Jun 2019 15:20:13 +0000 (17:20 +0200)]
bitops: speed up hweight<N>()

Algorithmically this gets us in line with current Linux, where the same
change did happen about 13 years ago. See in particular Linux commits
f9b4192923 ("bitops: hweight() speedup") and 0136611c62 ("optimize
hweight64 for x86_64").

Kconfig changes for actually setting HAVE_FAST_MULTIPLY will follow.

Take the opportunity and change generic_hweight64()'s return type to
unsigned int.
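
For reference, the algorithm in question is the classic parallel bit count,
roughly as below (a sketch following the Linux commits named above; Xen's
actual helper may differ in detail):

  static inline unsigned int generic_hweight32(unsigned int w)
  {
      w -= (w >> 1) & 0x55555555;                      /* count bit pairs  */
      w  = (w & 0x33333333) + ((w >> 2) & 0x33333333); /* per-nibble sums  */
      w  = (w + (w >> 4)) & 0x0f0f0f0f;                /* per-byte sums    */
  #ifdef CONFIG_HAVE_FAST_MULTIPLY
      return (w * 0x01010101) >> 24;                   /* add up the bytes */
  #else
      w += w >> 8;
      return (w + (w >> 16)) & 0xff;
  #endif
  }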

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agocpu: change 'cpu_hotplug_[begin|done]' to inline function
Baodong Chen [Mon, 3 Jun 2019 15:18:58 +0000 (17:18 +0200)]
cpu: change 'cpu_hotplug_[begin|done]' to inline function

Signed-off-by: Baodong Chen <chenbaodong@mxnavi.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoremove on-stack cpumask from stop_machine_run()
Juergen Gross [Mon, 3 Jun 2019 15:17:51 +0000 (17:17 +0200)]
remove on-stack cpumask from stop_machine_run()

The "allbutself" cpumask in stop_machine_run() is not needed. Instead
of allocating it on the stack it can easily be avoided.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agonotifier: refine 'notifier_head', use 'list_head' directly
Baodong Chen [Mon, 3 Jun 2019 15:16:52 +0000 (17:16 +0200)]
notifier: refine 'notifier_head', use 'list_head' directly

'notifier_block' can be replaced with 'list_head' when used for
'notifier_head'; this makes things a little clearer.

Signed-off-by: Baodong Chen <chenbaodong@mxnavi.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoschedule: initialize 'now' when really needed
Baodong Chen [Mon, 3 Jun 2019 15:15:44 +0000 (17:15 +0200)]
schedule: initialize 'now' when really needed

When 'periodic_period' is zero, there is no need to initialize 'now'.

Signed-off-by: Baodong Chen <chenbaodong@mxnavi.com>
Acked-by: Dario Faggioli <dfaggioli@suse.com>
5 years agox86emul/fuzz: add a state sanity checking function
Jan Beulich [Mon, 3 Jun 2019 15:15:06 +0000 (17:15 +0200)]
x86emul/fuzz: add a state sanity checking function

This is to accompany sanitize_input(). Just like for initial state we
want to have state between two emulated insns sane, at least as far as
assumptions in the main emulator go. Do minimal checking after segment
register, CR, and MSR writes, and roll back to the old value in case of
failure (raising #GP(0) at the same time).

In the particular case observed, a CR0 write clearing CR0.PE was
followed by a VEX-encoded insn, which the decoder accepts based on
guest address size, restricting things just outside of the 64-bit case
(real and virtual modes don't allow VEX-encoded insns). Subsequently
_get_fpu() would then assert that CR0.PE must be set (and EFLAGS.VM
clear) when trying to invoke YMM, ZMM, or OPMASK state.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
5 years agox86emul/fuzz: extend canonicalization to 57-bit linear address width case
Jan Beulich [Mon, 3 Jun 2019 15:14:41 +0000 (17:14 +0200)]
x86emul/fuzz: extend canonicalization to 57-bit linear address width case

Don't enforce any other dependencies for now, just like we don't enforce
e.g. PAE enabled as a prereq for long mode.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agox86/hvm: Make the altp2m locking in hvm_hap_nested_page_fault() easier to follow
Andrew Cooper [Tue, 23 Oct 2018 10:18:07 +0000 (11:18 +0100)]
x86/hvm: Make the altp2m locking in hvm_hap_nested_page_fault() easier to follow

Drop the ap2m_active boolean, and consistently use the unlocking form:

  if ( p2m != hostp2m )
       __put_gfn(p2m, gfn);
  __put_gfn(hostp2m, gfn);

which makes it clear that we always unlock the altp2m's gfn if it is in use,
and always unlock the hostp2m's gfn.  This also drops the ternary expression
in the logdirty case.

Extend the logdirty comment to identify where the locking violation is liable
to occur.

No (intended) overall change in behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agovm_event: Make ‘local’ functions ‘static’
Petre Pircalabu [Thu, 30 May 2019 14:18:17 +0000 (17:18 +0300)]
vm_event: Make ‘local’ functions ‘static’

vm_event_get_response, vm_event_resume, and vm_event_mark_and_pause are
used only in xen/common/vm_event.c.

Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
5 years agox86/mpparse: Don't print "limit reached" for every subsequent processor
Andrew Cooper [Fri, 17 May 2019 18:35:08 +0000 (19:35 +0100)]
x86/mpparse: Don't print "limit reached" for every subsequent processor

When you boot Xen with the default 256 NR_CPUS, on a box with rather more
processors, the resulting spew is unnecessarily verbose.  Instead, print the
message once, e.g.:

 (XEN) ACPI: X2APIC (apic_id[0x115] uid[0x115] enabled)
 (XEN) WARNING: NR_CPUS limit of 256 reached - ignoring further processors
 (XEN) ACPI: X2APIC (apic_id[0x119] uid[0x119] enabled)
 (XEN) ACPI: X2APIC (apic_id[0x11d] uid[0x11d] enabled)
 (XEN) ACPI: X2APIC (apic_id[0x121] uid[0x121] enabled)

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/lib: Introduce printk_once() and replace some opencoded examples
Andrew Cooper [Fri, 17 May 2019 18:30:47 +0000 (19:30 +0100)]
xen/lib: Introduce printk_once() and replace some opencoded examples

Reflow the ZynqMP message for grepability, and fix the omission of a newline.

There is a race condition where multiple CPUs could race to set the once_ boolean.
However, the use of this construct is mainly useful for boot time code, and
the only consequence of the race is a repeated print message.
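
A minimal sketch of such a construct (the actual definition may differ in
detail):

  #define printk_once(fmt, args...)          \
      ({                                     \
          static bool once_;                 \
          if ( !once_ )                      \
          {                                  \
              once_ = true;                  \
              printk(fmt, ## args);          \
          }                                  \
      })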

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
5 years agox86/spec-ctrl: Knights Landing/Mill are retpoline-safe
Andrew Cooper [Fri, 17 May 2019 18:23:55 +0000 (19:23 +0100)]
x86/spec-ctrl: Knights Landing/Mill are retpoline-safe

They are both Airmont-based and should have been included in c/s 17f74242ccf
"x86/spec-ctrl: Extend repoline safey calcuations for eIBRS and Atom parts".

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/vhpet: avoid 'small' time diff test on resume
Paul Durrant [Fri, 31 May 2019 09:40:52 +0000 (11:40 +0200)]
x86/vhpet: avoid 'small' time diff test on resume

It appears that even 64-bit versions of Windows 10, when not using
synthetic timers, will use 32-bit HPET non-periodic timers. There is a test
in hpet_set_timer(), specific to 32-bit timers, that tries to disambiguate
between a comparator value that is in the past and one that is sufficiently
far in the future that it wraps. This is done by assuming that the delta
between the main counter and comparator will be 'small' [1], if the
comparator value is in the past. Unfortunately, more often than not, this
is not the case if the timer is being re-started after a migrate and so
the timer is set to fire far in the future (in excess of a minute in
several observed cases) rather than set to fire immediately. This has a
rather odd symptom where the guest console is alive enough to be able to
deal with mouse pointer re-rendering, but any keyboard activity or mouse
clicks yield no response.

This patch simply adds an extra check of 'creation_finished' into
hpet_set_timer() so that the 'small' time test is omitted when the function
is called to restart timers after migration, and thus any negative delta
causes a timer to fire immediately.

[1] The number of ticks that equate to 0.9765625 milliseconds

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agosupport: remove tmem from SUPPORT.md
Juergen Gross [Fri, 31 May 2019 09:40:38 +0000 (11:40 +0200)]
support: remove tmem from SUPPORT.md

Tmem has been removed. Reflect that in SUPPORT.md

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agoVT-d: change bogus return value of intel_iommu_lookup_page()
Jan Beulich [Fri, 31 May 2019 09:39:49 +0000 (11:39 +0200)]
VT-d: change bogus return value of intel_iommu_lookup_page()

The function passes 0 as "alloc" argument to addr_to_dma_page_maddr(),
so -ENOMEM simply makes no sense (and its use was probably simply a
copy-and-paste effect originating at intel_iommu_map_page()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
5 years agoxen/arm64: head: Correctly report the HW CPU ID
Julien Grall [Thu, 11 Apr 2019 20:03:17 +0000 (21:03 +0100)]
xen/arm64: head: Correctly report the HW CPU ID

There is no reason to assume the HW CPU ID will be 0 when the
processor is part of a uniprocessor system. At best, this will result in
conflicting output, as the rest of Xen uses the value directly read from
MPIDR_EL1.

So remove the zeroing and logic to check if the CPU is part of a
uniprocessor system.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm64: head: Move earlyprintk messages in .rodata.str
Julien Grall [Sat, 13 Apr 2019 16:25:16 +0000 (17:25 +0100)]
xen/arm64: head: Move earlyprintk messages in .rodata.str

At the moment, the earlyprintk messages are interleaved with the
instructions. This makes the objdump output more difficult to read.

Introduce a new macro to add a string in .rodata.str and use it for all
the earlyprintk messages.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm64: head: Remove unnecessary comment
Julien Grall [Sat, 13 Apr 2019 17:30:33 +0000 (18:30 +0100)]
xen/arm64: head: Remove unnecessary comment

So far, we don't have specific core initialization at boot. So remove
the comment.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agodocs: Introduce some hypercall page documentation
Andrew Cooper [Thu, 28 Mar 2019 14:23:13 +0000 (14:23 +0000)]
docs: Introduce some hypercall page documentation

This also introduces the top-level Guest Documentation section.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86: init_hypercall_page() cleanup
Andrew Cooper [Thu, 28 Mar 2019 14:23:13 +0000 (14:23 +0000)]
x86: init_hypercall_page() cleanup

The various pieces of the hypercall page infrastructure have grown
organically over time and ended up in a bit of a mess.

 * Rename all functions to be of the form *_init_hypercall_page().  This
   makes them somewhat shorter, and means they can actually be grepped
   for in one go.
 * Move init_hypercall_page() to domain.c.  The 64-bit traps.c isn't a
   terribly appropriate place for it to live.
 * Drop an obsolete comment from hvm_init_hypercall_page() and drop the
   domain parameter from hvm_funcs.init_hypercall_page() as it isn't
   necessary.
 * Rearrange the logic in each function to avoid needing extra local
   variables, and to write the page in one single pass.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
5 years agox86/altp2m: Fix style errors introduced with c/s 9abcac7ff
Andrew Cooper [Wed, 29 May 2019 04:19:11 +0000 (05:19 +0100)]
x86/altp2m: Fix style errors introduced with c/s 9abcac7ff

Drop introduced trailing whitespace, excessively long lines, mis-indentation,
superfluous use of PRI macros for int-or-smaller types, and incorrect PRI
macros for gfns and mfns.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agox86/altp2m: cleanup p2m_altp2m_lazy_copy
Tamas K Lengyel [Tue, 28 May 2019 13:10:36 +0000 (14:10 +0100)]
x86/altp2m: cleanup p2m_altp2m_lazy_copy

The p2m_altp2m_lazy_copy is responsible for lazily populating an
altp2m view when the guest traps out due to no EPT entry being present
in the active view.  Currently, in addition to taking a number of
unused arguments, the whole calling convention has a number of
redundant p2m lookups: the function reads the hostp2m, even though the
caller has just read the same hostp2m entry; and then the caller
re-reads the altp2m entry that the function has just read (and possibly set).

Rework this function to make it a bit more rational.  Specifically:

- Pass the current hostp2m entry values we have just read for it to
  use to populate the altp2m entry if it finds the entry empty.

- If the altp2m entry is not empty, pass out the values we've read so
  the caller doesn't need to re-walk the tables

- Either way, return with the gfn 'locked', to make clean-up handling
  more consistent.

Rename the function to better reflect this functionality.

While we're here, change bool_t to bool, and return true/false rather
than 1/0.

It's a bit grating to do both the p2m_lock() and the get_gfn(),
knowing that they boil down to the same thing at the moment; but we
have to maintain the fiction until such time as we decide to get rid
of it entirely.

Signed-off-by: Tamas K Lengyel <tamas@tklengyel.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Tested-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agovm_event: fix rc check for uninitialized ring
Juergen Gross [Mon, 27 May 2019 10:26:20 +0000 (12:26 +0200)]
vm_event: fix rc check for uninitialized ring

vm_event_claim_slot() returns -EOPNOTSUPP for an uninitialized ring
since commit 15e4dd5e866b43bbc ("common/vm_event: Initialize vm_event
lists on domain creation"), but the callers test for -ENOSYS.

Correct the callers.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
5 years agovsprintf: constify "end" parameters
Jan Beulich [Mon, 27 May 2019 10:25:44 +0000 (12:25 +0200)]
vsprintf: constify "end" parameters

Except in the top level function we don't mean to ever write through
"end". The variable is used solely for pointer comparison purposes
there. Add const everywhere.

Also make function heading wrapping style uniform again for all of the
involved functions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/CPUID: adjust SSEn dependencies
Jan Beulich [Mon, 27 May 2019 10:24:37 +0000 (12:24 +0200)]
x86/CPUID: adjust SSEn dependencies

Along the lines of b9f6395590 ("x86/cpuid: adjust dependencies of
post-SSE ISA extensions") further convert SSEn dependencies to be more
chain like, with each successor addition depending on its immediate
predecessor. This is more in line with how hardware has evolved, and
how other projects like gcc and binutils connect things together.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoMAINTAINERS: update my email address
Wei Liu [Fri, 24 May 2019 15:24:02 +0000 (16:24 +0100)]
MAINTAINERS: update my email address

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
5 years agotests/cpu-policy: Skip building on older versions of GCC
Andrew Cooper [Fri, 24 May 2019 13:14:17 +0000 (14:14 +0100)]
tests/cpu-policy: Skip building on older versions of GCC

GCC 4.4 (as included in CentOS 6) is too old to handle designated initialisers
in anonymous unions.  As this is just a developer tool, skip the test in this
case, rather than sacrificing the legibility/expressibility of the test cases.

This fixes the Gitlab CI tests.

While adding this logic to cpu-policy, adjust the equivalent logic from
x86_emulator on which this was based.  Printing:

  Test harness not built, use newer compiler than "gcc"

isn't helpful for anyone unexpectedly encountering the error.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agogitignore: ignore xen.lds and asm-offsets.s for all archs
Alistair Francis [Fri, 24 May 2019 08:30:39 +0000 (10:30 +0200)]
gitignore: ignore xen.lds and asm-offsets.s for all archs

Instead of ignoring xen.lds and asm-offsets.s for every specific arch,
let's instead just use gitignore's wildcard feature to ignore them for
all archs.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agolibacpi: report PCI slots as enabled only for hotpluggable devices
Igor Druzhinin [Fri, 24 May 2019 08:30:21 +0000 (10:30 +0200)]
libacpi: report PCI slots as enabled only for hotpluggable devices

The DSDT for qemu-xen lacks an _STA method for the PCI slot object. If the
_STA method doesn't exist then the slot is assumed to be always present and
active, which in conjunction with the _EJ0 method makes every device appear
ejectable to the OS even if that's not the case.

qemu-kvm is able to dynamically add _EJ0 method only to those slots
that either have hotpluggable devices or are free for PCI passthrough.
As Xen lacks this capability we cannot use their approach.

The qemu-xen-traditional DSDT has an _STA method which only reports that
the slot is present if there is a PCI device hotplugged there.
This is done through querying of its PCI hotplug controller.
qemu-xen has similar capability that reports if device is "hotpluggable
or absent" which we can use to achieve the same result.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agodrivers/char: protect the asm/vpl011.h include
Alistair Francis [Fri, 24 May 2019 08:29:04 +0000 (10:29 +0200)]
drivers/char: protect the asm/vpl011.h include

The only use of asm/vpl011.h is protected by the CONFIG_SBSA_VUART_CONSOLE
define, so let's protect the include as well.

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agocommon/grant_table: harden helpers
Norbert Manthey [Fri, 24 May 2019 08:28:26 +0000 (10:28 +0200)]
common/grant_table: harden helpers

Guests can issue grant table operations and provide guest controlled
data to them. This data is used for memory loads in helper functions
and macros. To avoid speculative out-of-bound accesses, we use the
array_index_nospec macro where applicable, or the block_speculation
macro.
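
As an illustrative pattern only (hypothetical helper, not one of the real
call sites), a guest-supplied reference would be clamped before the load:

  static inline grant_entry_v1_t *shared_entry(grant_entry_v1_t *tbl,
                                               unsigned int nr_ents,
                                               grant_ref_t ref)
  {
      if ( unlikely(ref >= nr_ents) )
          return NULL;

      /* Clamp the index so a mispredicted bounds check cannot be turned
       * into a speculative out-of-bounds load. */
      return &tbl[array_index_nospec(ref, nr_ents)];
  }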

This is part of the speculative hardening effort.

Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86emul: support AVX512{F,ER} reciprocal insns
Jan Beulich [Fri, 24 May 2019 08:27:24 +0000 (10:27 +0200)]
x86emul: support AVX512{F,ER} reciprocal insns

Also include the only other AVX512ER insn pair, VEXP2P{D,S}.

Note that despite the replacement of the SHA insns' table slots there's
no need to special case their decoding: Their insn-specific code already
sets op_bytes (as was required due to simd_other), and TwoOp is of no
relevance for legacy encoded SIMD insns.

The raising of #UD when EVEX.L'L is 3 for AVX512ER scalar insns is done
to be on the safe side. The SDM does not clarify behavior there, and
it's even more ambiguous here (without AVX512VL in the picture).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support remaining AVX512BW legacy-equivalent insns
Jan Beulich [Fri, 24 May 2019 08:26:09 +0000 (10:26 +0200)]
x86emul: support remaining AVX512BW legacy-equivalent insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support remaining AVX512F legacy-equivalent insns
Jan Beulich [Fri, 24 May 2019 08:25:26 +0000 (10:25 +0200)]
x86emul: support remaining AVX512F legacy-equivalent insns

Plus their AVX512BW counterparts.

Take the opportunity and also eliminate a pair of open coded instances
of scalar_1op().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512{F,DQ} FP-to-uint conversion insns
Jan Beulich [Fri, 24 May 2019 08:24:48 +0000 (10:24 +0200)]
x86emul: support AVX512{F,DQ} FP-to-uint conversion insns

Along the lines of prior patches, VCVT{,T}PS2UQQ as well as
VCVT{,T}S{S,D}2USI need "manual" overrides of disp8scale.

The twobyte_table[] entries get altered, with their prior values
now put in place in x86_decode_twobyte().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512{F,DQ} uint-to-FP conversion insns
Jan Beulich [Fri, 24 May 2019 08:24:11 +0000 (10:24 +0200)]
x86emul: support AVX512{F,DQ} uint-to-FP conversion insns

Some "manual" overrides of disp8scale are needed here again. In
particular code ends up simpler when using d8s_dq64 in the
twobyte_table[] entry.

Test harness additions will be done once the reverse conversions are
also available.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512DQ packed quad-int/FP conversion insns
Jan Beulich [Fri, 24 May 2019 08:23:31 +0000 (10:23 +0200)]
x86emul: support AVX512DQ packed quad-int/FP conversion insns

VCVT{,T}PS2QQ, sharing their main opcodes with others, once again need
"manual" overrides of disp8scale.

While not directly related here, also add a scalar variant of to_wint()
to the test harness.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512F legacy-equivalent scalar int/FP conversion insns
Jan Beulich [Fri, 24 May 2019 08:22:55 +0000 (10:22 +0200)]
x86emul: support AVX512F legacy-equivalent scalar int/FP conversion insns

VCVT{,T}S{S,D}2SI use EVEX.W for their destination (register) rather
than their (possibly memory) source operand size and hence need a
"manual" override of disp8scale.

While the SDM claims that EVEX.L'L needs to be zero for the 32-bit forms
of VCVT{,U}SI2SD (exception type E10NF), observations on my test system
do not confirm this (and I've got informal confirmation that this is a
doc mistake). Nevertheless, to be on the safe side, force evex.lr to be
zero in this case though when constructing the stub.

Slightly adjust the scalar to_int() in the test harness, to increase the
chances of the operand ending up in memory.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512F legacy-equivalent packed int/FP conversion insns
Jan Beulich [Fri, 24 May 2019 08:22:18 +0000 (10:22 +0200)]
x86emul: support AVX512F legacy-equivalent packed int/FP conversion insns

... including the two AVX512DQ forms which shared encodings, just with
EVEX.W set there.

VCVTDQ2PD, sharing its main opcode with others, needs a "manual"
override of disp8scale.

The simd_size changes for the twobyte_table[] entries are benign to
pre-existing code, but allow decode_disp8scale() to work as is here.

The at this point wrong placement of the 0xe6 case block is once again
in anticipation of further additions of case labels.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512F floating-point conversion insns
Jan Beulich [Fri, 24 May 2019 08:21:30 +0000 (10:21 +0200)]
x86emul: support AVX512F floating-point conversion insns

VCVTPS2PD, sharing its main opcode with others, needs a "manual"
override of disp8scale.

The simd_size change for twobyte_table[0x5a] is benign to pre-existing
code, but allows decode_disp8scale() to work as is here.

Also correct the comment on an AVX counterpart.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/IO-APIC: fix build with gcc9
Jan Beulich [Fri, 24 May 2019 08:19:59 +0000 (10:19 +0200)]
x86/IO-APIC: fix build with gcc9

There are a number of pointless __packed attributes which cause gcc 9 to
legitimately warn:

utils.c: In function 'vtd_dump_iommu_info':
utils.c:287:33: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member]
  287 |                 remap = (struct IO_APIC_route_remap_entry *) &rte;
      |                                 ^~~~~~~~~~~~~~~~~~~~~~~~~

intremap.c: In function 'ioapic_rte_to_remap_entry':
intremap.c:343:25: error: converting a packed 'struct IO_APIC_route_entry' pointer (alignment 1) to a 'struct IO_APIC_route_remap_entry' pointer (alignment 8) may result in an unaligned pointer value [-Werror=address-of-packed-member]
  343 |     remap_rte = (struct IO_APIC_route_remap_entry *) old_rte;
      |                         ^~~~~~~~~~~~~~~~~~~~~~~~~

Simply drop these attributes. Take the liberty and also re-format the
structure definitions at the same time.

Reported-by: Charles Arnold <carnold@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/boot: Link opt_dom0_verbose to CONFIG_VERBOSE_DEBUG
Andrew Cooper [Mon, 20 May 2019 10:14:05 +0000 (10:14 +0000)]
x86/boot: Link opt_dom0_verbose to CONFIG_VERBOSE_DEBUG

We currently have an asymmetric setup where CONFIG_VERBOSE_DEBUG controls
extra diagnostics for a PV dom0, and opt_dom0_verbose controls extra
diagnostics for a PVH dom0.

Default opt_dom0_verbose to CONFIG_VERBOSE_DEBUG and use opt_dom0_verbose
consistently.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/boot: Wire up dom0=shadow for PV dom0
Andrew Cooper [Fri, 14 Sep 2018 17:50:01 +0000 (18:50 +0100)]
x86/boot: Wire up dom0=shadow for PV dom0

This would have been very handy when debugging some pv-l1tf issues.  As there
is no cost to supporting it, wire it up.

Due to the way dom0 is constructed, switching into shadow mode must be done
after the pagetables are written, and because of partially being in dom0
context, shadow_enable() doesn't like the state it finds.

Reuse the pv_l1tf tasklet for convenience, which will switch dom0 into shadow
mode just before it starts executing.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/boot: Rename dom0_{pvh,verbose} variables to have an opt_ prefix
Andrew Cooper [Mon, 20 May 2019 10:14:03 +0000 (10:14 +0000)]
x86/boot: Rename dom0_{pvh,verbose} variables to have an opt_ prefix

For consistency with other command line options.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/pv: Fix error handling in dom0_construct_pv()
Andrew Cooper [Mon, 20 May 2019 10:14:01 +0000 (10:14 +0000)]
x86/pv: Fix error handling in dom0_construct_pv()

One path in dom0_construct_pv() returns -1 unlike all other error paths.
Switch it to returning -EINVAL.

This was last modified by c/s c84481fb XSA-55, but the bug predates that
series.  However, this patch did (for no obvious reason) introduce a
bifurcated tail to the function with two subtly different elf_check_broken()
clauses.

As the elf_check_broken() is just a warning and doesn't influence the further
boot, fold the exit paths together and use a single clause.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agolibx86: Elide more empty CPUID leaves when serialising a policy
Andrew Cooper [Tue, 21 May 2019 17:19:33 +0000 (18:19 +0100)]
libx86: Elide more empty CPUID leaves when serialising a policy

x86_cpuid_copy_to_buffer() currently serialises the full content of the
various subleaf unions.  While leaves 4, 0xb and 0xd don't have a concrete
max_subleaf field, they do have well defined upper bounds.

Diffing the results of `xen-cpuid -p` shows the resulting saving:

  @@ -1,5 +1,5 @@
   Xen reports there are maximum 114 leaves and 1 MSRs
  -Raw policy: 93 leaves, 1 MSRs
  +Raw policy: 38 leaves, 1 MSRs
    CPUID:
     leaf     subleaf  -> eax      ebx      ecx      edx
     00000000:ffffffff -> 00000016:756e6547:6c65746e:49656e69
  @@ -32,7 +32,7 @@ Raw policy: 93 leaves, 1 MSRs
    MSRs:
     index    -> value
     000000ce -> 0000000080000000
  -Host policy: 93 leaves, 1 MSRs
  +Host policy: 33 leaves, 1 MSRs
    CPUID:
     leaf     subleaf  -> eax      ebx      ecx      edx
     00000000:ffffffff -> 0000000d:756e6547:6c65746e:49656e69

which is mostly due to no longer writing out 64 leaves for xstate when (on
this CoffeeLake system) 8 will do.

Extend the unit tests to cover empty and partially filled subleaf unions.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agolibx86: Introduce wrappers for extracting XCR0/XSS from a cpuid policy
Andrew Cooper [Wed, 22 May 2019 17:39:23 +0000 (18:39 +0100)]
libx86: Introduce wrappers for extracting XCR0/XSS from a cpuid policy

This avoids opencoding the slightly-awkward logic.  More uses of these
wrappers will be introduced shortly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoRevert "libxl: add helper function to set device_model_version"
Wei Liu [Wed, 22 May 2019 08:09:08 +0000 (09:09 +0100)]
Revert "libxl: add helper function to set device_model_version"

This reverts commit 3802ecbaa9eb36cbadce39ab03a4f6d36f29ae5c.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
5 years agoRevert "libxl: fix migration of PV and PVH domUs with and without qemu"
Wei Liu [Wed, 22 May 2019 08:08:56 +0000 (09:08 +0100)]
Revert "libxl: fix migration of PV and PVH domUs with and without qemu"

This reverts commit 899433f149d0cc48a5254c797d9e5a8c9dc3b0fb.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
5 years agoRevert "libxl: fix libxl_domain_need_memory after 899433f149d"
Wei Liu [Wed, 22 May 2019 08:08:41 +0000 (09:08 +0100)]
Revert "libxl: fix libxl_domain_need_memory after 899433f149d"

This reverts commit 278c64519c661c851d37e2a929f006fb8a1dcd01.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
5 years agoxen/arm64: livepatch: Fix build after 03957f58db
Julien Grall [Tue, 21 May 2019 14:24:55 +0000 (15:24 +0100)]
xen/arm64: livepatch: Fix build after 03957f58db

Commit 03957f58db "xen/const: Extend the existing macro BIT to take a
suffix in parameter" didn't convert all the callers of the macro BIT.

This results in a build breakage when enabling Livepatch on arm64.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512BW pack insns
Jan Beulich [Tue, 21 May 2019 13:47:22 +0000 (15:47 +0200)]
x86emul: support AVX512BW pack insns

No further test harness additions - what is there is good enough for
these rather "regular" insns.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512{F,BW,_VBMI} permute insns
Jan Beulich [Tue, 21 May 2019 13:46:41 +0000 (15:46 +0200)]
x86emul: support AVX512{F,BW,_VBMI} permute insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512F move duplicate insns
Jan Beulich [Tue, 21 May 2019 13:45:59 +0000 (15:45 +0200)]
x86emul: support AVX512F move duplicate insns

Judging from insn prefixes, these are scalar insns, but their (memory)
operands are vector ones (with the exception of 128-bit VMOVDDUP). For
this some adjustments to disp8scale calculation code are needed.

No explicit test harness additions other than the overrides, as the
compiler already makes use of the insns.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512F move high/low insns
Jan Beulich [Tue, 21 May 2019 13:45:06 +0000 (15:45 +0200)]
x86emul: support AVX512F move high/low insns

No explicit test harness additions other than the overrides, as the
compiler already makes use of the insns.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/CPUID: support leaf 7 subleaf 1 / AVX512_BF16
Jan Beulich [Tue, 21 May 2019 13:43:00 +0000 (15:43 +0200)]
x86/CPUID: support leaf 7 subleaf 1 / AVX512_BF16

The AVX512_BF16 feature flag resides in this so far blank sub-leaf.
Expand infrastructure accordingly.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoxen/x86: Constify the parameter "d" in mfn_to_gfn
Julien Grall [Tue, 7 May 2019 15:14:46 +0000 (16:14 +0100)]
xen/x86: Constify the parameter "d" in mfn_to_gfn

The parameter "d" holds the domain and is not modified by the function.
So constify it.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agoxen/arm: Use mfn_to_pdx instead of pfn_to_pdx when possible
Julien Grall [Tue, 7 May 2019 15:14:45 +0000 (16:14 +0100)]
xen/arm: Use mfn_to_pdx instead of pfn_to_pdx when possible

mfn_to_pdx adds more safety than pfn_to_pdx. Replace all but one place in
the Arm code to use the former.

No functional changes.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm: processor: Use BIT(.., UL) instead of _AC(1, U) in SCTLR_ defines
Julien Grall [Tue, 14 May 2019 12:24:40 +0000 (13:24 +0100)]
xen/arm: processor: Use BIT(.., UL) instead of _AC(1, U) in SCTLR_ defines

Use the pattern BIT(..., UL) to make the code more readable. Note that
unsigned long is used instead of unsigned because SCTLR is technically
32-bit on Arm32 and 64-bit on Arm64.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm: Rename SCTLR_* defines and remove unused one
Julien Grall [Tue, 14 May 2019 12:24:39 +0000 (13:24 +0100)]
xen/arm: Rename SCTLR_* defines and remove unused one

The SCTLR_* are currently used for SCTLR/HSCTLR (arm32) and
SCTLR_EL1/SCTLR_EL2 (arm64).

The naming scheme is actually quite confusing because they may only be
defined for an architecture (or even an exception level). So it is not easy
for the developer to know which one to use.

The naming scheme is reworked by adding Axx_ELx in each define:
    * xx is replaced by 32 or 64 if specific to an architecture
    * x is replaced by 2 (hypervisor) or 1 (kernel) if specific to an
    exception level

While doing the renaming, remove the unused defines (or at least the ones
that are unlikely to ever be used).

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm: tlbflush: Clarify the TLB helpers name
Julien Grall [Tue, 14 May 2019 12:11:28 +0000 (13:11 +0100)]
xen/arm: tlbflush: Clarify the TLB helpers name

The TLB helpers in the tlbflush.h headers are currently quite confusing to
use: the names may lead one to think they deal with the hypervisor's TLBs,
while they actually deal with guest TLBs.

Rename them to make it clearer that we are dealing with guest TLBs.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm: Remove flush_xen_text_tlb_local()
Julien Grall [Tue, 14 May 2019 12:11:27 +0000 (13:11 +0100)]
xen/arm: Remove flush_xen_text_tlb_local()

The function flush_xen_text_tlb_local() has been misused and results in
invalidating the instruction cache more than necessary.

For instance, there is no need to invalidate the instruction cache if
we are setting SCTLR_EL2.WXN.

There is effectively only one caller (i.e. free_init_memory()) that would
need to invalidate the instruction cache.

So rather than keeping the function flush_xen_text_tlb_local() around,
replace it with a call to flush_xen_tlb_local() and explicitly flush
the cache when necessary.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/arm: mm: Consolidate setting SCTLR_EL2.WXN in a single place
Julien Grall [Tue, 14 May 2019 12:11:26 +0000 (13:11 +0100)]
xen/arm: mm: Consolidate setting SCTLR_EL2.WXN in a single place

The logic to set SCTLR_EL2.WXN is the same for the boot CPU and
non-boot CPU. So introduce a function to set the bit and clear TLBs.

This new function will help us to document and update the logic in a
single place.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrii Anisov <andrii_anisov@epam.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoxen/const: Extend the existing macro BIT to take a suffix in parameter
Julien Grall [Tue, 14 May 2019 12:24:38 +0000 (13:24 +0100)]
xen/const: Extend the existing macro BIT to take a suffix in parameter

Arm currently provides two macros, BIT and BIT_ULL, that are only usable
in C and return unsigned long and unsigned long long respectively.

Extending the macros to deal with assembly would be a nice benefit, as it
could replace the common pattern used to define fields, (_AC(1, sfx) << X),
with something easier to read.

Rather than extending the two macros, it was decided to drop BIT_ULL()
and extend the macro BIT() to take a suffix (e.g. U, UL, ULL) as a
parameter. This allows using different suffixes without having to
define new macros.

The newly extended macro is now moved into include/xen/const.h so it can
be used by anyone in Xen, and also avoids having to include bitops.h in
assembly code.
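
The extended macro then boils down to something like the below (a sketch;
see include/xen/const.h for the actual definition, and note the example
field is just for illustration):

  /* Usable from both C and assembly; _AC() only applies the suffix in C. */
  #define BIT(pos, sfx)   (_AC(1, sfx) << (pos))

  /* Example use: */
  #define SCTLR_Axx_ELx_WXN   BIT(19, UL)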

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agoautotools: Update config.guess and config.sub
Alistair Francis [Fri, 17 May 2019 22:31:51 +0000 (15:31 -0700)]
autotools: Update config.guess and config.sub

The autoconf manual [1] specifies that, as we define AC_CANONICAL_HOST, we
must supply config.guess and config.sub. That being the case, let's update them
from [2] commit: b98424c24 "config.guess: Remove space after "#endif", as
Gnulib and some"

This allows us to support more architectures (e.g. RISC-V) and brings other
general improvements.

1: https://www.gnu.org/software/autoconf/manual/autoconf.html#Canonicalizing
2: https://git.savannah.gnu.org/cgit/config.git/

Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
5 years agox86/svm: Drop support for AMD's Lightweight Profiling
Andrew Cooper [Tue, 17 Jul 2018 13:36:30 +0000 (13:36 +0000)]
x86/svm: Drop support for AMD's Lightweight Profiling

Lightweight Profiling was introduced in Bulldozer (Fam15h), but was dropped
from Zen (Fam17h) processors.  Furthermore, LWP was dropped from Fam15/16 CPUs
when IBPB for Spectre v2 was introduced in microcode, owing to LWP not being
used in practice.

As a result, CPUs which are operating within specification (i.e. with up to
date microcode) no longer have this feature, and therefore are not using it.

Drop support from Xen.  The main motivation here is to remove unnecessary
complexity from CPUID handling, but it also tidies up the SVM code nicely.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
5 years agoxen/boot: Print the build-id along with the changeset information
Andrew Cooper [Fri, 5 Apr 2019 14:26:31 +0000 (14:26 +0000)]
xen/boot: Print the build-id along with the changeset information

Printing it during initcalls is ok, but that is a rather random place to find the build-id:

  (XEN) Parked 2 CPUs
  (XEN) build-id: 7ff05f78ebc8141000b9feee4370a408bd935dec
  (XEN) Running stub recovery selftests...

Logically, it is version information, so print it alongside the changeset
information in console_init_preirq():

  (XEN) Xen version 4.13-unstable (andrewcoop@andrecoop) (gcc (Debian 4.9.2-10+deb8u2) 4.9.2) debug=y  Fri Apr 12 18:24:52 BST 2019
  (XEN) Latest ChangeSet: Fri Apr 5 14:39:42 2019 git:fc6c7ae-dirty
  (XEN) build-id: 7ff05f78ebc8141000b9feee4370a408bd935dec
  (XEN) PVH start info: (pa 0000ffc0)

Nothing has ever cared about xen_build_init()'s return value, so convert it to
void rather than including errno.h in the !BUILD_ID case of version.h.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/cpuidle: clean up Cx dumping
Jan Beulich [Tue, 21 May 2019 06:31:47 +0000 (08:31 +0200)]
x86/cpuidle: clean up Cx dumping

Don't log the same global information once per CPU. Don't log the same
information (here: the currently active state) twice. Don't prefix
decimal numbers with zeros (giving the impression they're octal). Use
format specifiers matching the type of the corresponding expressions.
Don't split printk()-s without intervening new-lines.
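
By way of illustration only (plain C printf() rather than Xen's printk(),
with made-up field names):

  #include <inttypes.h>
  #include <stdio.h>

  /* Use a specifier matching the type, and no zero prefix that could be
   * mistaken for octal. */
  void dump_cx(unsigned int idx, uint64_t usage)
  {
      printf("C%u: usage %" PRIu64 "\n", idx, usage); /* not "C%02d" / "%lu" */
  }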

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/cpuidle: push parked CPUs into deeper sleep states when possible
Jan Beulich [Tue, 21 May 2019 06:31:09 +0000 (08:31 +0200)]
x86/cpuidle: push parked CPUs into deeper sleep states when possible

When the mwait-idle driver isn't used, C-state information becomes
available only in the course of Dom0 starting up. Use the provided data
to allow parked CPUs to sleep in a more energy efficient way, by waking
them briefly (via NMI) once the data has been recorded.

This involves re-arranging how/when the governor's ->enable() hook gets
invoked. The changes there include addition of so far missing error
handling in the respective CPU notifier handlers.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/idle: re-arrange dead-idle handling
Jan Beulich [Tue, 21 May 2019 06:30:23 +0000 (08:30 +0200)]
x86/idle: re-arrange dead-idle handling

In order to be able to wake parked CPUs from default_dead_idle() (for
them to then enter a different dead-idle routine), the function should
not itself loop. Move the loop into play_dead(), and use play_dead() as
well on the AP boot error path.

Furthermore, not least considering the comment in play_dead(), make sure an
NMI raised against a parked or fully offline CPU (for now this would be a
bug elsewhere, but that's about to change) won't invoke the actual,
full-blown NMI handler.

Note however that this doesn't make #MC any safer for fully offline
CPUs.
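
A hedged sketch of the resulting control flow (all names here are assumed,
not the actual Xen code): the dead-idle routine performs a single wait, and
the loop lives in play_dead(), so the routine can be swapped for a deeper
sleeping variant between iterations.

  static void (*dead_idle_fn)(void);   /* chosen elsewhere; may be updated
                                          once C-state data becomes known */

  static void play_dead_sketch(void)
  {
      for ( ; ; )
          dead_idle_fn();              /* one wait per iteration */
  }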

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: basic AVX512DQ testing
Jan Beulich [Tue, 21 May 2019 06:29:51 +0000 (08:29 +0200)]
x86emul: basic AVX512DQ testing

Test various of the insns which have been implemented already.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: basic AVX512BW testing
Jan Beulich [Tue, 21 May 2019 06:29:38 +0000 (08:29 +0200)]
x86emul: basic AVX512BW testing

Test various of the insns which have been implemented already.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512{BW,DQ} mask move insns
Jan Beulich [Tue, 21 May 2019 06:28:48 +0000 (08:28 +0200)]
x86emul: support AVX512{BW,DQ} mask move insns

Entries to the tables in evex-disp8.c are added despite these insns not
allowing for memory operands, with the goal of the tables giving a
complete picture of the supported EVEX-encoded insns in the end.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512{F,BW} integer shuffle insns
Jan Beulich [Tue, 21 May 2019 06:27:58 +0000 (08:27 +0200)]
x86emul: support AVX512{F,BW} integer shuffle insns

Also include vshuff{32x4,64x2} as being very similar to vshufi{32x4,64x2}.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512{F,BW,_VBMI} full permute insns
Jan Beulich [Tue, 21 May 2019 06:27:16 +0000 (08:27 +0200)]
x86emul: support AVX512{F,BW,_VBMI} full permute insns

Take the liberty and also correct the (public interface) name of the
AVX512_VBMI feature flag, on the assumption that no external consumer
has actually been using that flag so far. Furthermore make it have
AVX512BW instead of AVX512F as a prerequisite, for requiring full
64-bit mask registers (the upper 48 bits of which can't be accessed
other than through XSAVE/XRSTOR without AVX512BW support).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86emul: support AVX512{F,BW} integer unpack insns
Jan Beulich [Tue, 21 May 2019 06:23:57 +0000 (08:23 +0200)]
x86emul: support AVX512{F,BW} integer unpack insns

There's once again one extra twobyte_table[] entry which gets its Disp8
shift value set right away without getting support implemented just yet,
again to avoid needlessly splitting groups of entries.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agox86/cpuid: adjust dependencies of post-SSE ISA extensions
Jan Beulich [Tue, 21 May 2019 06:21:45 +0000 (08:21 +0200)]
x86/cpuid: adjust dependencies of post-SSE ISA extensions

Move AESNI, PCLMULQDQ, and SHA to SSE2, as all of them act on vectors of
integers, whereas plain SSE supports vectors of single precision floats
only. This is in line with how e.g. binutils and gcc treat them.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
5 years agoxen/arm: traps: Avoid using BUG_ON() to check guest state in advance_pc()
Julien Grall [Wed, 15 May 2019 20:17:30 +0000 (21:17 +0100)]
xen/arm: traps: Avoid using BUG_ON() to check guest state in advance_pc()

The condition of the BUG_ON() in advance_pc() is pretty wrong because
the bits [26:25] and [15:10] have a different meaning between AArch32
and AArch64 state.

On AArch32, they are used to store PSTATE.IT. On AArch64, they are RES0
or used for new features (e.g. ARMv8.0-SSBS, ARMv8.5-BTI).

This means a 64-bit guest will hit the BUG_ON() if it is trying to use
any of these features.

More generally, RES0 means that the bits are reserved for future use. So
crashing the host is definitely not the right solution.

In this particular case, we only need to know whether the guest was using
32-bit mode and Thumb instructions. So replace the BUG_ON() with a proper
check.
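
A sketch of the kind of check meant here (the mask values are shown for
illustration; Xen's real definitions live in its PSR headers):

  /* Bit 4 of the saved PSR is set for AArch32 modes; bit 5 is the Thumb
   * (T) bit.  Only this combination needs the IT-state handling. */
  #define PSR_MODE_BIT  0x10UL
  #define PSR_THUMB     0x20UL

  static inline int guest_in_thumb_state(unsigned long cpsr)
  {
      return (cpsr & PSR_MODE_BIT) && (cpsr & PSR_THUMB);
  }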

Reported-by: Lukas Jünger <lukas.juenger@ice.rwth-aachen.de>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
5 years agolibxl: fix libxl_domain_need_memory after 899433f149d
Wei Liu [Fri, 17 May 2019 17:05:55 +0000 (18:05 +0100)]
libxl: fix libxl_domain_need_memory after 899433f149d

After 899433f149d libxl needs to know the content of d_config to
determine which QEMU is used. The code is changed such that
libxl__domain_set_device_model needs to be called before
libxl__domain_build_info_setdefault.

This is fine for libxl code, but it is problematic for
libxl_domain_need_memory, which is the only public API that takes a
build_info. To avoid breaking its users, provide a compatibility
setting inside that function.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
5 years agocoverage: filter out libfdt.o and libelf.o
Viktor Mitin [Thu, 16 May 2019 13:20:16 +0000 (16:20 +0300)]
coverage: filter out libfdt.o and libelf.o

While the build system explicitly compiles any .init object without the gcov
option, this does not cover the libraries libfdt and libelf. This is because
the two libraries are built normally and then some of their sections have
.init appended.

As coverage will be enabled for libfdt, some of the GCOV counters may be
stored in a section that will be stripped after init. On Arm64, this will
reliably result in a crash when 'xencov' asks to reset the counters.

Interestingly, on x86, all the counters for libelf seem to be in sections
that are not renamed so far, which is why this was not discovered before.
But this is a latent bug.

As the two libraries can only be used at boot, it is fine to disable
coverage for them entirely.

Reported-by: Viktor Mitin <viktor.mitin.19@gmail.com>
Suggested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Viktor Mitin <viktor.mitin.19@gmail.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
[julien: Reword commit message]
Signed-off-by: Julien Grall <julien.grall@arm.com>
5 years agox86/emul: dedup hvmemul_cpuid() and pv_emul_cpuid()
Andrew Cooper [Thu, 19 Jul 2018 16:40:06 +0000 (16:40 +0000)]
x86/emul: dedup hvmemul_cpuid() and pv_emul_cpuid()

They are identical, so provide a single x86emul_cpuid() instead.

As x86_emulate() now only uses the ->cpuid() hook for real CPUID instructions,
the hook can be omitted from all special-purpose emulation ops.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/emul: Don't use the ->cpuid() hook for feature checks
Andrew Cooper [Thu, 19 Jul 2018 15:57:41 +0000 (15:57 +0000)]
x86/emul: Don't use the ->cpuid() hook for feature checks

For a release build of xen, this removes nearly 5k of code volume, and removes
a function pointer call from every instantiation.

  add/remove: 0/1 grow/shrink: 0/3 up/down: 0/-4822 (-4822)
  Function                                     old     new   delta
  adjust_bnd                                   260     244     -16
  x86_decode                                  8915    8890     -25
  vcpu_has.isra                                129       -    -129
  x86_emulate                               130040  125388   -4652
  Total: Before=3326565, After=3321743, chg -0.14%

Note that one corner case changes.  At the moment, it is possible for an
entity making direct DOMCTL_set_cpuid hypercalls to construct a policy with
max_leaf < 7, but feature bits set in leaf 7.  By default, libxc and libxl
don't do this, and the result is properly bounded by what the hardware is
capable of (so we won't start trying to use instructions which don't exist in
the CPU).

Previously, the cpuid() hook would end up hiding these features, but they may
still be set in cpuid_policy, and therefore might start being accepted by
x86_emulate().

This corner case will be fixed by the in-progress DOMCTL_set_cpu_policy work,
and a guest would only encounter the corner case if it was constructed in a
non-standard manner, and if it tried using instructions which it couldn't see
CPUID feature bits for.  As such, it isn't a corner case which we need to
worry about.
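
As a rough illustration of what a feature check becomes (struct and field
names below are assumptions for the sketch, not the exact layout):

  #include <stdbool.h>

  /* Feature checks read the policy object handed to x86_emulate()
   * instead of issuing a ->cpuid() hook call per check (sketch). */
  struct cpuid_policy_sketch {
      struct { bool avx512bw, avx512dq; } feat;
  };

  static inline bool vcpu_has_avx512bw(const struct cpuid_policy_sketch *cp)
  {
      return cp->feat.avx512bw;   /* direct lookup, no function pointer */
  }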

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
5 years agox86/emul: Pass a full cpuid_policy into x86_emulate()
Andrew Cooper [Thu, 19 Jul 2018 15:52:06 +0000 (15:52 +0000)]
x86/emul: Pass a full cpuid_policy into x86_emulate()

This will be used to simplify feature checking.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>