xenbits.xensource.com Git - xen.git/log
6 years ago public/io/blkif.h: try to fix the semantics of sector based quantities
Paul Durrant [Thu, 4 Apr 2019 11:40:02 +0000 (12:40 +0100)]
public/io/blkif.h: try to fix the semantics of sector based quantities

The semantics of sector based quantities, such as first_sect and last_sect
in blkif_request_segment, and the value of "sectors" in the backend info
in xenstore have become confused. Some comments in the header suggest they
should be supplied/interpreted strictly in terms of 512-byte units, others
suggest they should be scaled by the value of "sector-size" i.e. the
logical block size of the underlying backend storage.
This confusion has caused mixed semantics to become ingrained in frontend
implementations. For instance Linux xen-blkfront.c contains code such as:

    fsect = offset >> 9;
    lsect = fsect + (len >> 9) - 1;

whereas the Windows XENVBD frontend contains the following equivalent code:

    Segment->FirstSector = (UCHAR)((Offset + SectorSize - 1) / SectorSize);
    *SectorsNow = __min(SectorsLeft, SectorsPerPage - Segment->FirstSector);
    Segment->LastSector = (UCHAR)(Segment->FirstSector + *SectorsNow - 1);

(where SectorSize is the "sector-size" value advertized in xenstore).

Thus it has become unsafe for a backend to set "sector-size" to anything
other than 512 as it does not know which way the frontend is coded.

This patch is intended to clarify the situation and also introduce a
mechanism to allow logical block sizes of more than 512 to be supported...

A new frontend feature node is specified: 'feature-large-sector-size'.
If this node is present and set to "1" then it means that the frontend is
coded to supply and interpret all sector based quantities in terms of the
advertized "sector-size" value rather than a hardcoded size of 512.
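
To illustrate the ambiguity, here is a minimal sketch. It is not code from
either frontend: struct seg_range and seg_range_in_units() are hypothetical
names, and the helper assumes that offset and len are multiples of the unit.
It shows how the same in-page byte range yields different first_sect and
last_sect values depending on the unit assumed:

```c
#include <stdint.h>

/* Hypothetical stand-in for the first_sect/last_sect pair of a segment. */
struct seg_range {
    uint8_t first_sect;
    uint8_t last_sect;
};

/*
 * Express a byte range within a page as an inclusive sector range.
 * 'unit' is 512 under the legacy interpretation, or the advertized
 * "sector-size" when the quantities are scaled by it.
 * Assumption: offset and len are multiples of 'unit' and len > 0.
 */
static struct seg_range seg_range_in_units(uint32_t offset, uint32_t len,
                                           uint32_t unit)
{
    struct seg_range r;

    r.first_sect = (uint8_t)(offset / unit);
    r.last_sect = (uint8_t)(r.first_sect + len / unit - 1);

    return r;
}
```

With a "sector-size" of 4096, a full page (offset 0, len 4096) is sectors
0..7 in 512-byte units but 0..0 in "sector-size" units, so a backend cannot
interpret the values without knowing which convention the frontend uses.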

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
6 years ago xen/sched: don't disable scheduler on cpus during suspend
Juergen Gross [Tue, 2 Apr 2019 05:34:57 +0000 (07:34 +0200)]
xen/sched: don't disable scheduler on cpus during suspend

Today there is special handling in cpu_disable_scheduler() for suspend
by forcing all vcpus to the boot cpu. In fact there is no need for that
as during resume the vcpus are put on the correct cpus again.

So we can just omit the call of cpu_disable_scheduler() when offlining
a cpu due to suspend and on resuming we can omit taking the schedule
lock for selecting the new processor.

In restore_vcpu_affinity() we should be careful when applying affinity
as the cpu might not have come back to life. This in turn enables us
to even support affinity_broken across suspend/resume.

Avoid the rest of the scheduler dealloc/alloc dance when doing suspend
and resume, too. It is enough to react to cpus failing to come up again
on resume.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
6 years ago xen/cpupool: simplify suspend/resume handling
Juergen Gross [Tue, 2 Apr 2019 05:34:56 +0000 (07:34 +0200)]
xen/cpupool: simplify suspend/resume handling

Instead of temporarily removing cpus from cpupools during
suspend/resume, permanently remove only those cpus which didn't come
up when resuming.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
6 years ago xen: don't free percpu areas during suspend
Juergen Gross [Tue, 2 Apr 2019 05:34:55 +0000 (07:34 +0200)]
xen: don't free percpu areas during suspend

Instead of freeing percpu areas during suspend and allocating them
again when resuming, keep them. Only free an area in case a cpu didn't
come up again when resuming.

It should be noted that there is a potential change in behaviour as
the percpu areas are no longer zeroed out during suspend/resume. While
I have checked that the called cpu notifier hooks cope with that, there
might be some well hidden dependency on the previous behaviour. OTOH
a component not registering itself for cpu down/up and expecting to
see a zeroed percpu variable after suspend/resume is kind of broken
already. And the opposite case, where a component is not registered
to be called for cpu down/up and is not expecting a percpu variable
suddenly to be zero due to suspend/resume, is much more probable,
especially as the suspend/resume functionality seems not to be tested
that often.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago xen: add new cpu notifier action CPU_RESUME_FAILED
Juergen Gross [Tue, 2 Apr 2019 05:34:54 +0000 (07:34 +0200)]
xen: add new cpu notifier action CPU_RESUME_FAILED

Add a new cpu notifier action CPU_RESUME_FAILED which is called for all
cpus which failed to come up at resume. The calls will be done after
all other cpus are already up in order to know which resources are
available then.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years ago xen: add helper for calling notifier_call_chain() to common/cpu.c
Juergen Gross [Tue, 2 Apr 2019 05:34:53 +0000 (07:34 +0200)]
xen: add helper for calling notifier_call_chain() to common/cpu.c

Add a helper cpu_notifier_call_chain() to call notifier_call_chain()
for a cpu with a specified action, returning an errno value.

This avoids coding the same pattern multiple times.

While at it avoid side effects from using BUG_ON() by not using
cpu_online(cpu) as a parameter.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago xen/sched: call cpu_disable_scheduler() via cpu notifier
Juergen Gross [Tue, 2 Apr 2019 16:19:05 +0000 (18:19 +0200)]
xen/sched: call cpu_disable_scheduler() via cpu notifier

cpu_disable_scheduler() is being called from __cpu_disable() today.
There is no need to execute it on the cpu just being disabled, so use
the CPU_DEAD case of the cpu notifier chain. Moving the call out of
stop_machine() context is fine, as we just need to hold the domain RCU
lock and need the scheduler percpu data to be still allocated.

Add another hook for CPU_DOWN_PREPARE to bail out early in case
cpu_disable_scheduler() would fail. This will avoid crashes in rare
cases for cpu hotplug or suspend.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago automation: Add Arch Linux container and build jobs
Anthony PERARD [Wed, 3 Apr 2019 17:33:58 +0000 (18:33 +0100)]
automation: Add Arch Linux container and build jobs

One particularity of Arch Linux: /usr/bin/python is python3.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years ago x86/altp2m: treat view 0 as the hostp2m in p2m_get_mem_access()
Razvan Cojocaru [Wed, 3 Apr 2019 08:56:37 +0000 (11:56 +0300)]
x86/altp2m: treat view 0 as the hostp2m in p2m_get_mem_access()

p2m_set_mem_access() (and other places) treat view 0 as the
hostp2m, but p2m_get_mem_access() does not. Correct that
inconsistency.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
6 years ago amd-iommu: use a bitfield for DTE
Paul Durrant [Wed, 3 Apr 2019 13:16:08 +0000 (15:16 +0200)]
amd-iommu: use a bitfield for DTE

The current use of get/set_field_from/in_reg_u32() is both inefficient and
requires some ugly casting.

This patch defines a new bitfield structure (amd_iommu_dte) and uses this
structure in all DTE manipulation, resulting in much more readable and
compact code.
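
As a rough illustration of the difference in style, the sketch below
contrasts mask-and-shift accessors with a bitfield. The field position and
all names (DEMO_MODE_*, demo_get_mode(), demo_set_mode(), struct demo_entry)
are invented for the example and do not reflect the real DTE layout or the
actual amd_iommu_dte definition:

```c
#include <stdint.h>

/* Hypothetical field: bits 12..14 of a 32-bit word (not the real DTE layout). */
#define DEMO_MODE_SHIFT 12
#define DEMO_MODE_MASK  (0x7u << DEMO_MODE_SHIFT)

/* Mask-and-shift style: every access spells out the mask and the shift. */
static uint32_t demo_get_mode(uint32_t reg)
{
    return (reg & DEMO_MODE_MASK) >> DEMO_MODE_SHIFT;
}

static uint32_t demo_set_mode(uint32_t reg, uint32_t mode)
{
    return (reg & ~DEMO_MODE_MASK) |
           ((mode << DEMO_MODE_SHIFT) & DEMO_MODE_MASK);
}

/*
 * Bitfield style: the compiler generates the masking and shifting.
 * Note that bitfield layout is implementation-defined, so matching a
 * hardware format this way relies on the ABI of the target compiler.
 */
struct demo_entry {
    unsigned int valid:1;
    unsigned int reserved:11;
    unsigned int mode:3;
    unsigned int ignored:17;
};
```

With the bitfield, an update reads as a plain assignment (e.mode = 2) and
needs no casts at the call site.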

NOTE: This patch also includes some clean-up of get_dma_requestor_id() to
      change the types of the arguments from u16 to uint16_t.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>
6 years ago amd-iommu: use a bitfield for PTE/PDE
Paul Durrant [Wed, 3 Apr 2019 13:15:29 +0000 (15:15 +0200)]
amd-iommu: use a bitfield for PTE/PDE

The current use of get/set_field_from/in_reg_u32() is both inefficient and
requires some ugly casting.

This patch defines a new bitfield structure (amd_iommu_pte) and uses this
structure in all PTE/PDE manipulation, resulting in much more readable
and compact code.

NOTE: This commit also fixes one malformed comment in
      set_iommu_pte_present().

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>
6 years ago xen/tools/symbols.c: fix potential segfault
Xiaochen Wang [Wed, 3 Apr 2019 08:18:20 +0000 (10:18 +0200)]
xen/tools/symbols.c: fix potential segfault

Description:
This bug rarely shows up during a real kernel build, because the vmlinux
symbol table is huge.

But it can still be triggered under strict conditions, as follows:
   $ echo "c101b97b T do_fork" | ./scripts/kallsyms --all-symbols
   #include <asm/types.h>
   ......
   ......
   .globl kallsyms_token_table
           ALGN
   kallsyms_token_table:
   Segmentation fault (core dumped)
   $

If the symbol table is small, all entries in token_profit[0x10000] may
decrease to 0 after several calls of compress_symbols() in optimize_result().
In that case, find_best_token() always returns 0, best_table[i] is set to
"\0\0" and best_table_len[i] is set to 2.

As a result, expand_symbol(best_table[0]="\0\0", best_table_len[0]=2, buf)
in write_src() recurses infinitely until the stack overflows, causing a
segfault.

This patch checks the find_best_token() return value. If the return value
indicates that all entries in token_profit[0x10000] have become 0, it breaks
out of the loop in optimize_result().
expand_symbol() works well when best_table_len[i] is 0.
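
The idea of the fix can be sketched as follows. This is a simplified
stand-alone model, not the patched source: NR_TOKENS is shrunk to 16,
fill_slots() is a hypothetical stand-in for the loop in optimize_result(),
and clearing the chosen entry stands in for compress_symbols():

```c
#define NR_TOKENS 16   /* the real table has 0x10000 entries */

/*
 * Return the index of the most profitable token, or -1 if no token
 * has any profit left.
 */
static int find_best_token(const int token_profit[NR_TOKENS])
{
    int best = -1, bestprofit = 0;

    for (int i = 0; i < NR_TOKENS; i++) {
        if (token_profit[i] > bestprofit) {
            best = i;
            bestprofit = token_profit[i];
        }
    }

    return best;
}

/* Count how many table slots can be filled before profits run out. */
static int fill_slots(int token_profit[NR_TOKENS], int nr_slots)
{
    int filled = 0;

    for (int slot = 0; slot < nr_slots; slot++) {
        int best = find_best_token(token_profit);

        if (best < 0)
            break;               /* nothing left worth substituting */

        token_profit[best] = 0;  /* stand-in for compress_symbols() */
        filled++;
    }

    return filled;
}
```

Slots that are never filled keep a length of 0 in this model, which is the
case the message says expand_symbol() handles.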

Signed-off-by: Xiaochen Wang <wangxiaochen0@gmail.com>
[Linux: e0a04b11e4059cab033469617 scripts/kallsyms.c: fix potential segfault]
Signed-off-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Reviewed-by: Bjoern Doebel <doebel@amazon.de>
Reviewed-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago VT-d: return full destination ID for IO-APIC reads
Jan Beulich [Wed, 3 Apr 2019 08:15:54 +0000 (10:15 +0200)]
VT-d: return full destination ID for IO-APIC reads

In x2APIC mode it is 32 bits wide. Not having returned the full value
was mostly benign: We never modify the ID based on its original value;
full new values get written at all times. It was "just" debug logging
which ended up wrong this way (and which will need adjustment itself as
well, to also consume the full value).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years ago x86/IO-APIC: consolidate / complete #define-s
Jan Beulich [Wed, 3 Apr 2019 08:15:20 +0000 (10:15 +0200)]
x86/IO-APIC: consolidate / complete #define-s

Drop redundant ones from apic.h. Add delivery mode mask. Use them in
place of open coded hex numbers.

Take the opportunity and modify a helper function's parameters to be
just unsigned int. Also drop the bogus double underscore from its name,
as it and all its callers get touched anyway.

No functional change.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86emul: suppress general register update upon AVX gather failures
Jan Beulich [Wed, 3 Apr 2019 08:14:32 +0000 (10:14 +0200)]
x86emul: suppress general register update upon AVX gather failures

While destination and mask registers may indeed need updating in this
case, the rIP update in particular needs to be avoided, as well as e.g.
raising a single step trap.

Reported-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/vvmx: Fix debug prints to not have 17 unnecessary spaces
Andrew Cooper [Wed, 27 Mar 2019 19:52:17 +0000 (19:52 +0000)]
x86/vvmx: Fix debug prints to not have 17 unnecessary spaces

This has been problematic since its introduction in Xen 4.3.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years ago tools/ocaml: make python scripts 2 and 3 compatible
Wei Liu [Mon, 1 Apr 2019 10:32:38 +0000 (11:32 +0100)]
tools/ocaml: make python scripts 2 and 3 compatible

1. Explicitly import reduce because that's required in 3.
2. Change print to function.
3. Eliminate invocations of has_key.

Signed-off-by: M A Young <m.a.young@durham.ac.uk>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
6 years ago pygrub: encode / decode string in Python 3
Wei Liu [Mon, 1 Apr 2019 10:32:37 +0000 (11:32 +0100)]
pygrub: encode / decode string in Python 3

Strings are unicode in Python 3 but bytes in Python 2. We need to call
the encode / decode functions when using Python 3.

Reported-by: M A Young <m.a.young@durham.ac.uk>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago pygrub/grub: always use integer for default entry
Wei Liu [Mon, 1 Apr 2019 10:32:36 +0000 (11:32 +0100)]
pygrub/grub: always use integer for default entry

The original code set the default to either a string or an integer
(0) and relied on a Python 2 specific behaviour to work (an integer may
be compared to a string in Python 2 but not in 3).

Always use integer. The caller (pygrub) already has code to handle
that.

Reported-by: M A Young <m.a.young@durham.ac.uk>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago pygrub: fix message in grub parser
Wei Liu [Mon, 1 Apr 2019 10:32:35 +0000 (11:32 +0100)]
pygrub: fix message in grub parser

The code suggests 0 is allowed. Zero is not a positive number.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago xen/sched: Remove d->is_pinned
Andrew Cooper [Mon, 1 Apr 2019 10:08:43 +0000 (10:08 +0000)]
xen/sched: Remove d->is_pinned

The is_pinned field is rather odd.  It can only be activated with the
"dom0_vcpus_pin" command line option, and causes dom0 (or the late hwdom) to
have its vcpus identity pinned to pcpus.

Having dom0_vcpus_pin active disallows the use of vcpu_set_hard_affinity().
However, when a pcpu is offlined, or moved between cpupools, the affinity is
broken and reverts to cpumask_all.  This results in vcpus which are no longer
pinned, and cannot be adjusted.

A related bit of functionality is the is_pinned_vcpu() predicate.  This is
only used by x86 code, and permits the use of VCPUOP_get_physid and writeable
access to some extra MSRs.

The implementation however returns true for is_pinned (which will include
unpinned vcpus from the above scenario), *or* if the hard affinity mask only
has a single bit set (which is redundant with the intended effect of
is_pinned, but also includes other domains).

Rework the behaviour of "dom0_vcpus_pin" to only being an initial pinning
configuration, and permit full adjustment.  This allows the user to
reconfigure dom0 after the fact or fix up from the fallout of cpu hot unplug
and cpupool manipulation.

An unprivileged domain has no business using VCPUOP_get_physid, and shouldn't
be able to just because it happens to be pinned by admin choice.  All uses of
is_pinned_vcpu() should be restricted to the hardware domain, so rename it to
is_hwdom_pinned_vcpu() to avoid future misuse.
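
One plausible reading of the renamed predicate is sketched below. The types
are hypothetical, heavily reduced stand-ins (a uint64_t replaces the cpumask)
and the exact definition in the patch may differ; the sketch only shows the
combination of "belongs to the hardware domain" and "hard affinity has a
single bit set":

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for Xen's domain and vcpu. */
struct demo_domain {
    bool is_hardware_domain;
};

struct demo_vcpu {
    const struct demo_domain *domain;
    uint64_t hard_affinity;        /* one bit per pcpu */
};

/* True if exactly one bit is set in the mask. */
static bool single_cpu(uint64_t mask)
{
    return mask != 0 && (mask & (mask - 1)) == 0;
}

/* Pinned, and owned by the hardware domain: both conditions must hold. */
static bool demo_is_hwdom_pinned_vcpu(const struct demo_vcpu *v)
{
    return v->domain->is_hardware_domain && single_cpu(v->hard_affinity);
}
```

Under this reading, a guest vcpu that happens to be pinned by admin choice
no longer satisfies the predicate.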

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
6 years ago tools/xenmon: make xenmon.py compatible with python 2 and 3
Wei Liu [Mon, 1 Apr 2019 10:39:00 +0000 (11:39 +0100)]
tools/xenmon: make xenmon.py compatible with python 2 and 3

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/APIC: suppress redundant "Switched to ..." messages
Jan Beulich [Mon, 1 Apr 2019 09:12:54 +0000 (11:12 +0200)]
x86/APIC: suppress redundant "Switched to ..." messages

There's no need to log anything when what we "switch to" is what is in
use already.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86emul/fuzz: adjust canonicalization in sanitize_input()
Jan Beulich [Mon, 1 Apr 2019 09:12:16 +0000 (11:12 +0200)]
x86emul/fuzz: adjust canonicalization in sanitize_input()

Drop it entirely for %rbp - this register is not special purpose enough
to warrant such special treatment. Add a comment to clarify the purpose
of the canonicalization of %rip and %rsp.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/paging: paging_set_allocation() is init-only
Jan Beulich [Mon, 1 Apr 2019 09:09:43 +0000 (11:09 +0200)]
x86/paging: paging_set_allocation() is init-only

This is needed for Dom0 creation only, therefore it gets additionally
framed by an #ifdef.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years ago xen/timers: Document and improve the representation of the timer heap metadata
Andrew Cooper [Fri, 29 Mar 2019 13:32:09 +0000 (13:32 +0000)]
xen/timers: Document and improve the representation of the timer heap metadata

The {GET,SET}_HEAP_{SIZE,LIMIT}() macros implement some completely
undocumented pointer misuse to store the size and limit information.  In
practice, heap[0] is never a timer pointer, and is used to stash the metadata
instead.

Extend the HEAP OPERATIONS comment to include this detail.  Introduce a
structure representing the heap metadata, and a static inline function to
perform the type punning.

Replace all of the above macros with an equivalent expression involving the
heap_metadata() helper.  Note that I deliberately haven't rearranged the
surrounding code - this allows the correctness of the transformation to be
checked by confirming that the compiled binary is identical.

This also removes two cases of a macro argument with side effects, which only
worked correctly because the arguments were only evaluated once.

Finally, fix up the type of dummy_heap.  The old code functioned correctly,
but only by virtue of confusing a discrete object and a single-entry array.
Change its type to match the intended semantics, and drop the redundant
initialisation in timer_init().

No functional change.
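
The shape of such a helper can be sketched as below. The field widths and
the heap_alloc() function are assumptions made for the example (the message
names only heap_metadata() and a metadata structure); the point is that
slot 0 of the pointer array is viewed as metadata and never as a timer:

```c
#include <stdint.h>
#include <stdlib.h>

struct timer;                      /* opaque for this sketch */

/* Hypothetical layout; it must fit within one pointer-sized slot. */
struct heap_metadata {
    uint16_t size;                 /* number of timers currently in the heap */
    uint16_t limit;                /* capacity of the heap */
};

_Static_assert(sizeof(struct heap_metadata) <= sizeof(struct timer *),
               "metadata must fit in heap[0]");

/* View slot 0 of the heap as metadata rather than as a timer pointer. */
static inline struct heap_metadata *heap_metadata(struct timer **heap)
{
    return (struct heap_metadata *)&heap[0];
}

/* Allocate a zeroed heap with room for 'limit' timers plus the metadata slot. */
static struct timer **heap_alloc(uint16_t limit)
{
    struct timer **heap = calloc((size_t)limit + 1, sizeof(*heap));

    if (heap)
        heap_metadata(heap)->limit = limit;

    return heap;
}
```

Callers then write heap_metadata(heap)->size where a GET/SET macro pair was
used before, and real timers live in heap[1] onwards.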

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago xen/sched: fix credit2 smt idle handling
Juergen Gross [Thu, 28 Mar 2019 15:46:22 +0000 (16:46 +0100)]
xen/sched: fix credit2 smt idle handling

Credit2's smt_idle_mask_set() and smt_idle_mask_clear() are used to
identify idle cores where vcpus can be moved to. A core is thought to
be idle when all siblings are known to have the idle vcpu running on
them.

Unfortunately the information about a vcpu running on a cpu is kept per
runqueue. So if not all siblings are in the same runqueue, a core
will never be regarded as idle, as the sibling not in the runqueue
is never known to run the idle vcpu.

Use a credit2 specific cpumask of siblings with only those cpus
being marked which are in the same runqueue as the cpu in question.
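
A small model of the problem and the fix, using a plain uint64_t in place of
cpumask_t and hypothetical helper names (core_is_idle(),
siblings_in_runqueue()), might look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* A core counts as idle when every cpu in 'siblings' is also in 'idlers'. */
static bool core_is_idle(uint64_t siblings, uint64_t idlers)
{
    return (siblings & ~idlers) == 0;
}

/* Restrict a sibling mask to the cpus that belong to one runqueue. */
static uint64_t siblings_in_runqueue(uint64_t siblings, uint64_t rq_cpus)
{
    return siblings & rq_cpus;
}
```

For example, with siblings {0,1}, a runqueue holding cpus {0,2} and cpu 0
idle, the full sibling mask is never a subset of the runqueue's idlers, while
the restricted mask {0} is.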

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
6 years ago libx86: Recalculate synthesised cpuid_policy fields when appropriate
Andrew Cooper [Tue, 10 Jul 2018 12:53:21 +0000 (13:53 +0100)]
libx86: Recalculate synthesised cpuid_policy fields when appropriate

When filling a policy, either from CPUID or an incoming leaf stream,
recalculate the synthesised vendor value.  All callers are expected to want
this behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago tools/libxc: Use x86_cpuid_lookup_vendor() rather than opencoding the logic
Andrew Cooper [Wed, 20 Mar 2019 14:56:15 +0000 (14:56 +0000)]
tools/libxc: Use x86_cpuid_lookup_vendor() rather than opencoding the logic

This doesn't address any of the assumptions that "anything which isn't AMD is
Intel".  This logic is expected to be replaced wholesale with libx86 in the
longterm.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/cpuid: Drop get_cpu_vendor() completely
Andrew Cooper [Tue, 10 Jul 2018 12:53:21 +0000 (13:53 +0100)]
x86/cpuid: Drop get_cpu_vendor() completely

get_cpu_vendor() tries to do a number of things, and ends up doing none of
them well.

For calculating the vendor itself, use x86_cpuid_lookup_vendor() which is
implemented in a far more efficient manner than looping over cpu_devs[].

For setting up this_cpu, set it up once on the BSP only, rather than
latest-takes-precedence across the APs.  Such a system is probably not going to
boot, but this feels like a less dangerous course of action.  Adjust the
printed errors to be more clear in the mismatch case.

This removes the only user of cpu_dev->c_ident[], so drop that field as well.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago libx86: Introduce x86_cpuid_lookup_vendor()
Andrew Cooper [Wed, 20 Mar 2019 14:05:11 +0000 (14:05 +0000)]
libx86: Introduce x86_cpuid_lookup_vendor()

Also introduce constants for the vendor strings in CPUID leaf 0.
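
A lookup of this kind can be sketched as follows. The register values are
the well-known leaf 0 encodings of "GenuineIntel" and "AuthenticAMD"; the
macro, enum and function names are hypothetical and need not match the ones
libx86 introduces:

```c
#include <stdint.h>

/* CPUID leaf 0 vendor strings, as little-endian register values. */
#define DEMO_INTEL_EBX 0x756e6547u /* "Genu" */
#define DEMO_INTEL_EDX 0x49656e69u /* "ineI" */
#define DEMO_INTEL_ECX 0x6c65746eu /* "ntel" */

#define DEMO_AMD_EBX   0x68747541u /* "Auth" */
#define DEMO_AMD_EDX   0x69746e65u /* "enti" */
#define DEMO_AMD_ECX   0x444d4163u /* "cAMD" */

enum demo_vendor {
    DEMO_VENDOR_UNKNOWN,
    DEMO_VENDOR_INTEL,
    DEMO_VENDOR_AMD,
};

/* Switch on ebx first, then confirm with ecx and edx: no table walk needed. */
static enum demo_vendor demo_lookup_vendor(uint32_t ebx, uint32_t ecx,
                                           uint32_t edx)
{
    switch (ebx) {
    case DEMO_INTEL_EBX:
        if (ecx == DEMO_INTEL_ECX && edx == DEMO_INTEL_EDX)
            return DEMO_VENDOR_INTEL;
        break;

    case DEMO_AMD_EBX:
        if (ecx == DEMO_AMD_ECX && edx == DEMO_AMD_EDX)
            return DEMO_VENDOR_AMD;
        break;
    }

    return DEMO_VENDOR_UNKNOWN;
}
```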

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago CI: Add a CentOS 6 container and build jobs
Andrew Cooper [Tue, 26 Mar 2019 14:23:03 +0000 (14:23 +0000)]
CI: Add a CentOS 6 container and build jobs

CentOS 6 is probably the most frequently broken build, so adding it to CI
would be a very good move.

One problem is that CentOS 6 comes with Python 2.6, and Qemu requires 2.7.
There appear to be no sensible ways to get Python 2.7 into a CentOS 6
environment, so modify the build script to skip the Qemu upstream build
instead.  Additionally, SeaBIOS requires GCC 4.6 or later, so skip it as well.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago CI: Fix indentation in containerize script
Andrew Cooper [Fri, 22 Mar 2019 11:12:28 +0000 (11:12 +0000)]
CI: Fix indentation in containerize script

The script is mostly indented with spaces, but there are three tabs.  Fix them
up to be consistent.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago docs/admin-guide: Boot time microcode loading
Andrew Cooper [Mon, 18 Mar 2019 16:22:29 +0000 (16:22 +0000)]
docs/admin-guide: Boot time microcode loading

Recent discussion on xen-devel has demonstrated that Xen's existing microcode
loading support isn't adequately documented.  Take the opportunity to address
this, and start some end-user focused documentation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago docs/rst: Use pandoc to render ReStructuredText
Andrew Cooper [Wed, 21 Nov 2018 17:03:50 +0000 (17:03 +0000)]
docs/rst: Use pandoc to render ReStructuredText

Sphinx uses ReStructuredText as its markup format.  Although missing the
project wide integration, individual *.rst files can be rendered by pandoc to
supplement our existing ad-hoc documentation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago docs/sphinx: Skeleton setup
Andrew Cooper [Wed, 21 Nov 2018 17:03:50 +0000 (17:03 +0000)]
docs/sphinx: Skeleton setup

Sphinx is a documentation system, which is popular for technical writing.  It
uses ReStructuredText as its markup syntax, and is designed for whole-project
documentation, rather than the misc assortment of individual files that we
currently have.

This is a skeleton setup with just enough infrastructure to render an empty
set of pages.  It will become better integrated into Xen's docs system when it
becomes less WIP.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago passthrough/vtd: Drop the "workaround_bios_bug" logic entirely
Andrew Cooper [Thu, 21 Mar 2019 19:36:48 +0000 (19:36 +0000)]
passthrough/vtd: Drop the "workaround_bios_bug" logic entirely

It turns out that this code was previously dead.

c/s dcf41790 "x86/mmcfg/drhd: Move acpi_mmcfg_init() call before calling
acpi_parse_dmar()" resulted in PCI segment 0 now having been initialised
enough for acpi_parse_one_drhd() to not take the

  /* Skip checking if segment is not accessible yet. */

path unconditionally.  However, some systems have DMAR tables which list
devices which are disabled by user choice (in particular, Dell PowerEdge R740
with I/O AT DMA disabled), and turning off all IOMMU functionality in this
case is entirely unhelpful behaviour.

Leave the warning which identifies the problematic devices, but drop the
remaining logic.  This leaves the system in better overall state, and working
in the same way that it did in previous releases.

Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Release-acked-by: Juergen Gross <jgross@suse.com>
6 years ago xen/drivers: char: Match #if CONFIG_DEBUG_TRACE and #endif comment
Julien Grall [Tue, 4 Dec 2018 18:02:40 +0000 (18:02 +0000)]
xen/drivers: char: Match #if CONFIG_DEBUG_TRACE and #endif comment

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago xen/memory: Fix typo in the comment on top of check_get_page_from_gfn
Julien Grall [Sat, 9 Mar 2019 21:20:23 +0000 (21:20 +0000)]
xen/memory: Fix typo in the comment on top of check_get_page_from_gfn

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/mm: Fix typo in comment on top of page_lock
Julien Grall [Sun, 10 Mar 2019 12:41:01 +0000 (12:41 +0000)]
x86/mm: Fix typo in comment on top of page_lock

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago libxc: fix HVM core dump
Wei Liu [Wed, 20 Mar 2019 15:43:38 +0000 (15:43 +0000)]
libxc: fix HVM core dump

f969bc9fc96 forbade the get_address_size call on HVM guests, because that
didn't make sense. This broke core dump functionality on HVM because
libxc unconditionally asked for the guest width.

Force guest_width to a sensible value.

Reported-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago x86: decouple xen alignment setting from EFI/ELF build
Wei Liu [Tue, 19 Mar 2019 13:57:06 +0000 (13:57 +0000)]
x86: decouple xen alignment setting from EFI/ELF build

Introduce a new Kconfig option to pick the alignment for the xen binary.
To retain the original behaviour, the default for the EFI build is 2M and
for the ELF build 4K.

Make the PVHSHIM build use 2M alignment for potentially better
performance.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: drop "depends on X86" for TBOOT Kconfig option
Wei Liu [Tue, 19 Mar 2019 13:59:25 +0000 (13:59 +0000)]
x86: drop "depends on X86" for TBOOT Kconfig option

Given that this file already resides under arch/x86, there is no need
to have the dependency.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/spec-ctrl: Extend repoline safey calcuations for eIBRS and Atom parts
Andrew Cooper [Fri, 15 Mar 2019 22:08:41 +0000 (22:08 +0000)]
x86/spec-ctrl: Extend repoline safey calcuations for eIBRS and Atom parts

All currently-released Atom processors are in practice retpoline-safe, because
they don't fall back to a BTB prediction on RSB underflow.

However, an additional meaning of Enhanced IBRS is that the processor may not
be retpoline-safe.  The Gemini Lake platform, based on the Goldmont Plus
microarchitecture, is the first Atom processor to support eIBRS.

Until Xen gets full eIBRS support, Gemini Lake will still be safe using
regular IBRS.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/msr: Shorten ARCH_CAPABILITIES_* constants
Andrew Cooper [Mon, 18 Mar 2019 11:45:29 +0000 (11:45 +0000)]
x86/msr: Shorten ARCH_CAPABILITIES_* constants

They are unnecessarily verbose, and ARCH_CAPS_* is already the more common
version.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: finish IOREQs correctly on completion path
Igor Druzhinin [Mon, 18 Mar 2019 15:29:21 +0000 (16:29 +0100)]
x86/hvm: finish IOREQs correctly on completion path

Since the introduction of linear_{read,write}() helpers in 3bdec530a5
(x86/HVM: split page straddling emulated accesses in more cases) the
completion path for IOREQs has been broken: if there is an IOREQ in
progress but hvm_copy_{to,from}_guest_linear() returns HVMTRANS_okay
(e.g. when P2M type of source/destination has been changed by IOREQ
handler) the execution will never re-enter hvmemul_do_io() where
IOREQs are completed. This usually results in a domain crash upon
the execution of the next IOREQ entering hvmemul_do_io() and finding
the remnants of the previous IOREQ in the state machine.

This particular issue has been discovered in relation to the p2m_ioreq_server
type, where an emulator changed the memory type between p2m_ioreq_server
and p2m_ram_rw in the process of responding to an IOREQ, which made
hvm_copy_..() behave differently on the way back.

Fix it for now by checking if IOREQ completion is required (which
can be identified by querying the MMIO cache) before trying to finish
a memory access immediately through hvm_copy_..(), and re-enter
hvmemul_do_io() otherwise. This change alone only addresses the IOREQ
completion issue for the P2M type changing from MMIO to RAM in the
middle of emulation, but leaves a case where new IOREQs might be
introduced by P2M changes from RAM to MMIO (which is less likely
to be found in practice) that requires more substantial changes in
the MMIO emulation code.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
6 years ago x86/hvm: split all linear reads and writes at page boundary
Igor Druzhinin [Mon, 18 Mar 2019 15:28:45 +0000 (16:28 +0100)]
x86/hvm: split all linear reads and writes at page boundary

Ruling out page straddling at the linear level makes it easier to
distinguish chunks that require proper handling as MMIO accesses
and not complete them as page straddling memory transactions
prematurely. This doesn't change the general behavior.
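
The splitting itself is simple arithmetic. A minimal sketch, assuming 4K
pages and using hypothetical helper names (chunk_within_page(),
count_chunks()) rather than the actual linear_{read,write}() code:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u

/* Size of the next chunk of [addr, addr + bytes) that stays within one page. */
static size_t chunk_within_page(uint64_t addr, size_t bytes)
{
    size_t to_page_end = DEMO_PAGE_SIZE -
                         (size_t)(addr & (DEMO_PAGE_SIZE - 1));

    return bytes < to_page_end ? bytes : to_page_end;
}

/* Count the chunks an access is split into when no chunk may cross a page. */
static unsigned int count_chunks(uint64_t addr, size_t bytes)
{
    unsigned int n = 0;

    while (bytes) {
        size_t chunk = chunk_within_page(addr, bytes);

        addr += chunk;
        bytes -= chunk;
        n++;
    }

    return n;
}
```

An 8-byte access at 0xffc, for instance, becomes a 4-byte chunk ending at
the page boundary followed by a 4-byte chunk starting at 0x1000, and each
chunk can then be classified as RAM or MMIO on its own.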

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/MCE: mcequirk stuff is AMD-specific
Jan Beulich [Mon, 18 Mar 2019 10:41:47 +0000 (11:41 +0100)]
x86/MCE: mcequirk stuff is AMD-specific

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86emul: no need to set fault_suppression to false for VMOVNT*
Jan Beulich [Mon, 18 Mar 2019 10:41:10 +0000 (11:41 +0100)]
x86emul: no need to set fault_suppression to false for VMOVNT*

When evex.opmsk is required to be zero there's no need for this, as it
won't have been set to true in the first place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoxen/debug: make debugtrace more clever regarding repeating entries
Juergen Gross [Mon, 18 Mar 2019 10:40:32 +0000 (11:40 +0100)]
xen/debug: make debugtrace more clever regarding repeating entries

In case debugtrace is writing to memory and the last entry is repeated,
don't fill up the trace buffer, but modify the count prefix to "x-y "
style instead.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen/debug: make debugtrace configurable via Kconfig
Juergen Gross [Mon, 18 Mar 2019 10:39:43 +0000 (11:39 +0100)]
xen/debug: make debugtrace configurable via Kconfig

Instead of having to edit include/xen/lib.h to make debugtrace
available, make it configurable via Kconfig.

Default is off, it is available only in expert mode or in debug builds.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/e820: fix build with gcc9
Jan Beulich [Mon, 18 Mar 2019 10:38:36 +0000 (11:38 +0100)]
x86/e820: fix build with gcc9

e820.c: In function ‘clip_to_limit’:
.../xen/include/asm/string.h:10:26: error: ‘__builtin_memmove’ offset [-16, -36] is out of the bounds [0, 20484] of object ‘e820’ with type ‘struct e820map’ [-Werror=array-bounds]
   10 | #define memmove(d, s, n) __builtin_memmove(d, s, n)
      |                          ^~~~~~~~~~~~~~~~~~~~~~~~~~
e820.c:404:13: note: in expansion of macro ‘memmove’
  404 |             memmove(&e820.map[i], &e820.map[i+1],
      |             ^~~~~~~
e820.c:36:16: note: ‘e820’ declared here
   36 | struct e820map e820;
      |                ^~~~

While I can't see where the negative offsets would come from, converting
the loop index to unsigned type helps. Take the opportunity and also
convert several other local variables and copy_e820_map()'s second
parameter to unsigned int (and bool in one case).
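
Illustration (not part of the patch): a minimal sketch of removing an
array entry with memmove() while indexing with an unsigned type, so
that the source and destination offsets can never be negative. The
struct layout, MAX_ENTRIES and remove_entry() are simplified,
hypothetical stand-ins for the real e820 code.

```c
#include <string.h>

#define MAX_ENTRIES 8  /* hypothetical; the real table is much larger */

struct entry { unsigned long long addr, size; unsigned int type; };
struct map { unsigned int nr_map; struct entry map[MAX_ENTRIES]; };

/* Remove entry i by shifting the tail of the array down by one slot. */
static void remove_entry(struct map *m, unsigned int i)
{
    if ( i >= m->nr_map )
        return;

    memmove(&m->map[i], &m->map[i + 1],
            (m->nr_map - i - 1) * sizeof(m->map[0]));
    m->nr_map--;
}
```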

Reported-by: Charles Arnold <carnold@suse.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/svm: Improve code generation from cpu_has_svm_feature()
Andrew Cooper [Tue, 12 Feb 2019 18:33:30 +0000 (18:33 +0000)]
x86/svm: Improve code generation from cpu_has_svm_feature()

Taking svm_feature_flags by pointer and using test_bit() results in generated
code which loads svm_feature_flags into a 32bit register, then does a bitwise
operation.

The logic can be expressed in terms of a straight bitwise operation, resulting
in the following minor improvement.

  add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-136 (-136)
  Function                                     old     new   delta
  svm_nested_features_on_efer_update           281     273      -8
  svm_create_vmcb                             1404    1388     -16
  svm_vmexit_handler                          6271    6239     -32
  start_svm                                    818     738     -80
  Total: Before=3347569, After=3347433, chg -0.00%
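
Illustration (not part of the patch): a minimal sketch contrasting the
two styles. test_bit_sketch() is a simplified stand-in for the
hypervisor's test_bit(), and the has_feature_*() names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the hypervisor's feature word. */
static uint32_t svm_feature_flags;

/* Generic bitmap test taking the bitmap by pointer (simplified). */
static bool test_bit_sketch(unsigned int nr, const volatile uint32_t *addr)
{
    return (addr[nr / 32] >> (nr % 32)) & 1;
}

/* Pointer-based style: the flags are accessed through their address. */
static bool has_feature_via_pointer(unsigned int bit)
{
    return test_bit_sketch(bit, &svm_feature_flags);
}

/* Direct style: a plain bitwise AND on the value itself. */
static bool has_feature_direct(unsigned int bit)
{
    return svm_feature_flags & (1u << bit);
}
```

Both forms return the same answer; the direct one simply gives the
compiler an easier expression to optimise.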

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years agox86/pv: Fix construction of 32bit dom0's
Andrew Cooper [Thu, 14 Feb 2019 11:10:09 +0000 (11:10 +0000)]
x86/pv: Fix construction of 32bit dom0's

dom0_construct_pv() has logic to transition dom0 into a compat domain when
booting an ELF32 image.

One aspect which is missing is the CPUID policy recalculation, meaning that a
32bit dom0 sees a 64bit policy, which differ by the Long Mode feature flag in
particular.  Another missing item is the x87_fip_width initialisation.

Update dom0_construct_pv() to use switch_compat(), rather than retaining the
opencoding.  Position the call to switch_compat() such that the compat32 local
variable can disappear entirely.

The 32bit monitor table is now created by setup_compat_l4(), avoiding the need
for manual creation later.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen/x86: Fix cflush()'s parameter tracking
Andrew Cooper [Fri, 15 Mar 2019 12:28:46 +0000 (12:28 +0000)]
xen/x86: Fix cflush()'s parameter tracking

Forcing a register operand hides (from the compiler) the fact that clflush
behaves as a read from the memory operand (wrt memory order, faults, etc.).
It also reduces the compiler's flexibility with register scheduling.

Re-implement clflush() (and wbinvd() for consistency) as a static inline rather
than a macro, and have it take a const void pointer.
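
Illustration (not part of the patch): a minimal sketch of a static
inline taking a const void pointer and passing the pointed-to memory
as an "m" operand. The name clflush_sketch() and the non-x86 fallback
are assumptions made so that the example builds anywhere.

```c
/*
 * Flush the cache line containing *p.  The "m" constraint tells the
 * compiler that the instruction accesses the memory operand itself.
 */
static inline void clflush_sketch(const void *p)
{
#if defined(__x86_64__) || defined(__i386__)
    __asm__ __volatile__ ( "clflush %0" :: "m" (*(const char *)p) );
#else
    (void)p;  /* not x86: nothing to illustrate */
#endif
}
```

With a memory operand the compiler is free to pick the addressing
mode, rather than first materialising the address in a register.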

In practice, the only generated code which gets modified by this is in
mwait_idle_with_hints(), where a disp8 encoding now gets used.

While here, I noticed that &mwait_wakeup(cpu) was being calculated twice.
This is caused by the memory clobber in mb(), so take the opportunity to help
the optimiser by calculating it once, ahead of time.  bloat-o-meter reports a
delta of -26 as a result of this change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen: drop the nop() macro
Andrew Cooper [Fri, 15 Mar 2019 13:18:04 +0000 (13:18 +0000)]
xen: drop the nop() macro

There isn't a plausible reason to insert nops into code in this manner.

The sole use is in do_debug_key(), and exists to prevent the compiler
optimising the tail of the function with 'jmp debugger_trap_fatal'

In practice, a compiler barrier suffices just as well to prevent the tailcall,
and doesn't involve inserting unnecessary instructions.
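
Illustration (not part of the patch): a minimal sketch of a compiler
barrier placed after a call. debugger_trap_stub() and
do_debug_key_sketch() are hypothetical stand-ins, and barrier() is
defined here in the usual GNU C way.

```c
#define barrier() __asm__ __volatile__ ( "" ::: "memory" )

static unsigned int trap_calls;

/* Hypothetical stand-in for debugger_trap_fatal(). */
static void debugger_trap_stub(void)
{
    trap_calls++;
}

static void do_debug_key_sketch(void)
{
    debugger_trap_stub();

    /*
     * Emits no instructions, but is ordered after the call, so the call
     * is no longer the last thing the function does.
     */
    barrier();
}
```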

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agoautomation: enable building rombios with clang
Wei Liu [Fri, 24 Aug 2018 15:22:47 +0000 (16:22 +0100)]
automation: enable building rombios with clang

Previously it was disabled because the embedded ipxe couldn't be built
with clang. Now that ipxe is split out, we can use --with-system-ipxe
to work around the issue.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
6 years agointroduce a cpumask with all bits set
Juergen Gross [Thu, 14 Mar 2019 15:42:05 +0000 (16:42 +0100)]
introduce a cpumask with all bits set

There are several places in Xen allocating a cpumask on the stack and
setting all bits in it just to use it as an initial mask for allowing
all cpus.

Save the stack space and omit the need for runtime initialization by
defining a globally accessible cpumask_all variable.
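
Illustration (not part of the patch): a minimal sketch of a statically
initialised, read-only all-ones mask. NR_CPUS is set to a hypothetical
256, the cpumask_t layout is simplified, and cpumask_test_cpu_sketch()
is a made-up helper. The range designator is a GNU C extension.

```c
#include <limits.h>
#include <stdbool.h>

#define NR_CPUS 256  /* hypothetical; a multiple of the word size */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

typedef struct { unsigned long bits[BITS_TO_LONGS(NR_CPUS)]; } cpumask_t;

/*
 * One read-only mask shared by all users: no stack space and no
 * runtime initialisation needed.
 */
const cpumask_t cpumask_all = {
    .bits = { [0 ... BITS_TO_LONGS(NR_CPUS) - 1] = ~0UL }
};

/* Test whether a given CPU is present in a mask. */
static bool cpumask_test_cpu_sketch(unsigned int cpu, const cpumask_t *mask)
{
    return (mask->bits[cpu / BITS_PER_LONG] >> (cpu % BITS_PER_LONG)) & 1;
}
```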

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
6 years agoArm/atomic: cosmetics
Jan Beulich [Thu, 14 Mar 2019 15:40:12 +0000 (16:40 +0100)]
Arm/atomic: cosmetics

Drop redundant casts. Un-define no longer needed macros after use.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agoArm/atomic: unify generation of u64 read/write functions
Jan Beulich [Thu, 14 Mar 2019 15:39:14 +0000 (16:39 +0100)]
Arm/atomic: unify generation of u64 read/write functions

By adding another suitable abstracting macro the need for explicit
inline function definitions in the 32-bit case goes away.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agotools: re-sync CPUID leaf 7 tables
Jan Beulich [Thu, 14 Mar 2019 15:38:39 +0000 (16:38 +0100)]
tools: re-sync CPUID leaf 7 tables

Bring libxl's in line with the public header, and update xen-cpuid's to
the latest information available in Intel's documentation (SDM ver 068
and ISA extensions ver 035), with (as before) the exception on MAWAU.

Some pre-existing strings get changed to match SDM naming. This should
be benign in xen-cpuid, and I hope it's also acceptable in libxl, where
people actually using the slightly wrong names would have to update
their guest config files.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoConfig.mk: update seabios to 1.12.1
Wei Liu [Thu, 14 Mar 2019 15:00:52 +0000 (15:00 +0000)]
Config.mk: update seabios to 1.12.1

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
6 years agotools: bump library version numbers
Wei Liu [Thu, 14 Mar 2019 12:30:37 +0000 (12:30 +0000)]
tools: bump library version numbers

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen: make grant table support configurable
Wei Liu [Fri, 18 Jan 2019 12:43:57 +0000 (12:43 +0000)]
xen: make grant table support configurable

Introduce CONFIG_GRANT_TABLE. Provide stubs and make sure x86 and arm
hypervisors build with grant table disabled.
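
Illustration (not part of the patch): a minimal sketch of the usual
stub pattern for a Kconfig-controlled feature. The function name
grant_table_init_sketch() is hypothetical and the return value of the
stub is an assumption.

```c
struct domain;  /* opaque for the purpose of this sketch */

#ifdef CONFIG_GRANT_TABLE
/* Real implementation lives in the grant table code. */
int grant_table_init_sketch(struct domain *d);
#else
/* Feature compiled out: callers still build and see a harmless no-op. */
static inline int grant_table_init_sketch(struct domain *d)
{
    (void)d;
    return 0;
}
#endif
```

Call sites then need no #ifdef of their own.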

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agobuild/m4: fix python library detection on Ubuntu systems
Wei Liu [Wed, 13 Mar 2019 13:54:48 +0000 (13:54 +0000)]
build/m4: fix python library detection on Ubuntu systems

16cc3362aed doesn't work on Ubuntu with gcc (but it does work with
clang). Work around it by manipulating LIBS.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agotravis: use python-dev instead of python2.7-dev
Wei Liu [Wed, 13 Mar 2019 13:54:47 +0000 (13:54 +0000)]
travis: use python-dev instead of python2.7-dev

Xen build should be using default python now.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoautomation: use python-dev python2.7-dev in Debian and Ubuntu
Wei Liu [Wed, 13 Mar 2019 13:54:46 +0000 (13:54 +0000)]
automation: use python-dev python2.7-dev in Debian and Ubuntu

... instead of python2.7-dev.

We installed python2.7-dev because xen only worked with 2.7.

Installing python2.7-dev only gives python2.7-config, which causes
configure to fail because it wants python-config by default. Now that
xen should work with 2.6 and above, we can install python-dev and let
distros pick the default python.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoRun autogen.sh for 4.13
Wei Liu [Wed, 13 Mar 2019 13:15:50 +0000 (13:15 +0000)]
Run autogen.sh for 4.13

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
6 years agoMAINTAINERS: add myself as maintainer for public I/O interfaces
Juergen Gross [Wed, 13 Mar 2019 10:26:38 +0000 (11:26 +0100)]
MAINTAINERS: add myself as maintainer for public I/O interfaces

The "PUBLIC I/O INTERFACES AND PV DRIVERS DESIGNS" section of the
MAINTAINERS file lists Konrad as the only maintainer. Add myself for
helping him to review patches.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
6 years agostring: fix type use in strstr()
Jan Beulich [Wed, 13 Mar 2019 10:25:49 +0000 (11:25 +0100)]
string: fix type use in strstr()

Using plain int for string lengths, while okay for all practical
purposes, is undesirable in a generic library function.

Take the opportunity and also move the function from being in the middle
of mem*() ones to the set of str*() ones, convert its loop from while()
to for(), and correct style.
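
Illustration (not part of the patch): a minimal sketch of a strstr()
that keeps string lengths in size_t and uses a for() loop. The name
strstr_sketch() is hypothetical and the body is not claimed to match
the committed code.

```c
#include <stddef.h>
#include <string.h>

static char *strstr_sketch(const char *s1, const char *s2)
{
    size_t l1, l2 = strlen(s2);

    /* An empty needle matches at the start of the haystack. */
    if ( !l2 )
        return (char *)s1;

    /* l1 >= l2 >= 1 holds before each decrement, so l1 cannot wrap. */
    for ( l1 = strlen(s1); l1 >= l2; --l1, ++s1 )
        if ( !memcmp(s1, s2, l2) )
            return (char *)s1;

    return NULL;
}
```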

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agostring: remove memscan()
Jan Beulich [Wed, 13 Mar 2019 10:25:04 +0000 (11:25 +0100)]
string: remove memscan()

It has no users, so rather than fixing its use of types (first and
foremost c would need to be cast to unsigned char in the comparison
expression) drop it altogether. memchr() ought to be fine for all
purposes.

Take the opportunity and also do some stylistic adjustments to its
surviving sibling function memchr().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agostring: avoid undefined behavior in strrchr()
Jan Beulich [Wed, 13 Mar 2019 10:23:28 +0000 (11:23 +0100)]
string: avoid undefined behavior in strrchr()

The pre-decrement would not only cause misbehavior when wrapping (benign
because there shouldn't be any NULL pointers passed in), but may also
create a pointer pointing outside the object that the passed in pointer
points to (it won't be de-referenced though).

Take the opportunity and also
- convert bogus space (partly 7 of them) indentation to Linux style tab
  one,
- add two blank lines.
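
Illustration (not part of the patch): a minimal sketch of a strrchr()
that compares against the start of the string before decrementing, so
no pointer below the object is ever formed. The name strrchr_sketch()
is hypothetical.

```c
#include <stddef.h>
#include <string.h>

static char *strrchr_sketch(const char *s, int c)
{
	/* Start at the terminator so that searching for '\0' works too. */
	const char *p = s + strlen(s);

	/* Check for the start of the string before stepping backwards. */
	for (; *p != (char)c; --p)
		if (p == s)
			return NULL;

	return (char *)p;
}
```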

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoxen/gnttab: Minor improvements to arch header files
Andrew Cooper [Wed, 24 Oct 2018 09:47:45 +0000 (10:47 +0100)]
xen/gnttab: Minor improvements to arch header files

 * Use XFREE() when appropriate
 * Drop stale comments and unnecessary brackets
 * Fold asm constraints

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agomicrocode/intel: use union to get fields without shifting and masking
Chao Gao [Mon, 11 Mar 2019 07:57:26 +0000 (15:57 +0800)]
microcode/intel: use union to get fields without shifting and masking

No functional change.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agoSVM: fix build after "make nested page-fault tracing and logging consistent"
Jan Beulich [Tue, 12 Mar 2019 18:11:35 +0000 (18:11 +0000)]
SVM: fix build after "make nested page-fault tracing and logging consistent"

Some compiler versions don't recognize that "mfn" can't really be used
uninitialized in svm_do_nested_pgfault(). To be on the safe side, add an
initializer for p2mt as well.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years agox86/tsx: Implement controls for RTM force-abort mode
Andrew Cooper [Wed, 12 Sep 2018 13:36:00 +0000 (14:36 +0100)]
x86/tsx: Implement controls for RTM force-abort mode

The CPUID bit and MSR are deliberately not exposed to guests, because they
won't exist on newer processors.  As vPMU isn't security supported, the
misbehaviour of PCR3 isn't expected to impact production deployments.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agogic-vgic: skip irqs locking in gic_restore_pending_irqs()
Andrii Anisov [Fri, 21 Dec 2018 18:54:16 +0000 (20:54 +0200)]
gic-vgic: skip irqs locking in gic_restore_pending_irqs()

This function is already called with IRQs disabled, so drop the
additional flags save and restore.

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agoREADME: remove requirement on Python 2
Wei Liu [Mon, 11 Mar 2019 17:19:19 +0000 (17:19 +0000)]
README: remove requirement on Python 2

Now that all python scripts are compatible with Python 2.6 and above,
remove the restriction.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agopygrub/fsimage: make it work with python 3
Wei Liu [Tue, 5 Mar 2019 14:13:17 +0000 (14:13 +0000)]
pygrub/fsimage: make it work with python 3

With the help of two porting guides and cpython source code:

1. Use PyBytes to replace PyString counterparts.
2. Use PyVarObject_HEAD_INIT.
3. Remove usage of Py_FindMethod.
4. Use new module initialisation routine.

For #3, Py_FindMethod was removed, yet an alternative wasn't
documented.  The code is the result of reverse-engineering cpython
commit 6116d4a1d1

https://docs.python.org/3/howto/cporting.html
http://python3porting.com/cextensions.html

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agopygrub: make python scripts work with 2.6 and up
Wei Liu [Thu, 7 Mar 2019 12:45:47 +0000 (12:45 +0000)]
pygrub: make python scripts work with 2.6 and up

Run 2to3 and pick the sensible suggestions.

Import print_function and absolute_import so 2.6 can work.

There has never been a curses.wrapper module according to the 2.x and
3.x docs, only a function, so "import curses.wrapper" is not correct. It
happened to work because 2.x implemented an (undocumented) module.

We only need to import curses to make curses.wrapper available to
pygrub.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agolibxl: make python scripts work with python 2.6 and up
Wei Liu [Thu, 7 Mar 2019 12:33:38 +0000 (12:33 +0000)]
libxl: make python scripts work with python 2.6 and up

Go through transformations suggested by 2to3 and pick the necessary
ones.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agobuild/m4: make python_devel.m4 work with both python 2 and 3
Wei Liu [Tue, 5 Mar 2019 12:32:06 +0000 (12:32 +0000)]
build/m4: make python_devel.m4 work with both python 2 and 3

Do the following:

1. Change the form of "print".
2. Use AC_CHECK_FUNC to avoid the need to generate library name.
3. Remove unused stuff.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
6 years agopygrub: change tabs into spaces
Wei Liu [Mon, 11 Mar 2019 12:55:29 +0000 (12:55 +0000)]
pygrub: change tabs into spaces

Not sure why Python 2 never complained, but Python 3 does.

Change tabs to spaces.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agopygrub/fsimage: drop unused struct
Wei Liu [Mon, 11 Mar 2019 12:58:05 +0000 (12:58 +0000)]
pygrub/fsimage: drop unused struct

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoUpdate Python requirement to 2.6
Wei Liu [Mon, 11 Mar 2019 17:16:45 +0000 (17:16 +0000)]
Update Python requirement to 2.6

CentOS 5, which was the reason for the 2.4 restriction, is EOL. CentOS
6 ships 2.6.

Bump the version to 2.6 in README. Update configure.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agotools/cpu-policy: Add unit tests
Andrew Cooper [Thu, 3 Jan 2019 18:03:25 +0000 (18:03 +0000)]
tools/cpu-policy: Add unit tests

These will be extended with further libx86 work.

Fix the sorting of the CPUID_GUEST_NR_* constants, noticed while writing the
unit tests.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agolibx86: Introduce a helper to deserialise cpuid_policy objects
Andrew Cooper [Thu, 21 Jun 2018 14:35:49 +0000 (16:35 +0200)]
libx86: Introduce a helper to deserialise cpuid_policy objects

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agolibx86: introduce a helper to deserialise msr_policy objects
Roger Pau Monné [Thu, 21 Jun 2018 14:35:50 +0000 (16:35 +0200)]
libx86: introduce a helper to deserialise msr_policy objects

As with the serialise side, Xen's copy_from_guest API is used, with a
compatibility wrapper for the userspace build.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/vtd: Don't include control register state in the table pointers
Andrew Cooper [Tue, 27 Nov 2018 17:21:17 +0000 (17:21 +0000)]
x86/vtd: Don't include control register state in the table pointers

iremap_maddr and qinval_maddr point to the base of a block of contiguous RAM,
allocated by the driver, holding the Interrupt Remapping table, and the Queued
Invalidation ring.

Despite their name, they are actually the values of the hardware register,
including control metadata in the lower 12 bits.  While uses of these fields
do appear to correctly shift out the metadata, this is very subtle behaviour
and confusing to follow.

Nothing uses the metadata, so make the fields actually point at the base of
the relevant tables.
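
Illustration (not part of the patch): a minimal sketch of separating a
page-aligned table base from control metadata held in the lower 12
bits of a register value. The macro and function names are
hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12
#define PAGE_MASK_4K  (~((UINT64_C(1) << PAGE_SHIFT_4K) - 1))

/* Strip the low 12 control bits, leaving the page-aligned table base. */
static uint64_t table_base_from_reg(uint64_t reg_val)
{
    return reg_val & PAGE_MASK_4K;
}
```

Storing only the masked value means later users cannot forget to
strip the metadata.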

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agox86/xstate: Don't special case feature collection
Andrew Cooper [Fri, 22 Feb 2019 13:28:16 +0000 (13:28 +0000)]
x86/xstate: Don't special case feature collection

The logic in xstate_init() is a remnant of the pre-featuremask days.
Collect the xstate features in generic_identify(), like all other feature
leaves, after which identify_cpu() will apply the known_feature[] mask derived
from the automatically generated CPUID information.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/shadow: sh_{write,cmpxchg}_guest_entry() are PV-only
Jan Beulich [Tue, 12 Mar 2019 13:45:36 +0000 (14:45 +0100)]
x86/shadow: sh_{write,cmpxchg}_guest_entry() are PV-only

Move them to a new pv.c. Make the respective struct shadow_paging_mode
fields as well as the paging.h wrappers PV-only as well.

Take the liberty and switch both functions' "failed" local variables to
more appropriate types.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years agox86/shadow: sh_validate_guest_pt_write() is HVM-only
Jan Beulich [Tue, 12 Mar 2019 13:44:38 +0000 (14:44 +0100)]
x86/shadow: sh_validate_guest_pt_write() is HVM-only

Move the function to hvm.c, make it static, and drop its sh_ prefix.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years agoArm/atomic: parameterize register modifier macro arguments
Jan Beulich [Tue, 12 Mar 2019 13:43:50 +0000 (14:43 +0100)]
Arm/atomic: parameterize register modifier macro arguments

Make the abstracting macros take the asm() operand specifier as
argument, in preparation of doing away with the split u64 read/write
definitions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agoArm/atomic: drop uniformly used reg macro parameters
Jan Beulich [Tue, 12 Mar 2019 13:43:15 +0000 (14:43 +0100)]
Arm/atomic: drop uniformly used reg macro parameters

There's no point in parameterizing these when all use sites pass the
same arguments.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agoArm/atomic: correct asm() constraints in build_add_sized()
Jan Beulich [Tue, 12 Mar 2019 13:42:17 +0000 (14:42 +0100)]
Arm/atomic: correct asm() constraints in build_add_sized()

The memory operand is an in/out one, and the auxiliary register gets
written to early.

Take the opportunity and also drop the redundant cast (the inline
functions' parameters are already of the casted-to type).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agocommon: avoid atomic read-modify-write accesses in map_vcpu_info()
Jan Beulich [Tue, 12 Mar 2019 13:40:56 +0000 (14:40 +0100)]
common: avoid atomic read-modify-write accesses in map_vcpu_info()

There's no need to set the evtchn_pending_sel bits one by one. Simply
write full words with all ones.

For Arm this requires extending write_atomic() to also handle 64-bit
values; for symmetry read_atomic() gets adjusted as well.
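
Illustration (not part of the patch): a minimal sketch contrasting
bit-by-bit atomic updates with a single whole-word store. The
GCC/Clang __atomic builtins stand in for Xen's set_bit() and
write_atomic(), and the function names are hypothetical.

```c
#include <limits.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Bit-by-bit variant: one atomic read-modify-write per bit. */
static void set_all_bitwise(unsigned long *sel)
{
    unsigned int i;

    for ( i = 0; i < BITS_PER_LONG; i++ )
        __atomic_fetch_or(sel, 1UL << i, __ATOMIC_SEQ_CST);
}

/* Whole-word variant: a single atomic store of all ones. */
static void set_all_word(unsigned long *sel)
{
    __atomic_store_n(sel, ~0UL, __ATOMIC_SEQ_CST);
}
```

When every bit ends up set anyway, both variants leave the same value
behind, but the second needs one plain store per word.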

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agoevents: drop arch_evtchn_inject()
Jan Beulich [Tue, 12 Mar 2019 13:40:24 +0000 (14:40 +0100)]
events: drop arch_evtchn_inject()

Have the only user call vcpu_mark_events_pending() instead, at the same
time arranging for correct ordering of the writes (evtchn_pending_sel
should be written before evtchn_upcall_pending).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agox86/HVM: don't crash guest in hvmemul_find_mmio_cache()
Jan Beulich [Tue, 12 Mar 2019 13:39:46 +0000 (14:39 +0100)]
x86/HVM: don't crash guest in hvmemul_find_mmio_cache()

Commit 35a61c05ea ("x86emul: adjust handling of AVX2 gathers") builds
upon the fact that the domain will actually survive running out of MMIO
result buffer space. Drop the domain_crash() invocation. Also delay
incrementing of the usage counter, such that the function can't possibly
use/return an out-of-bounds slot/pointer in case execution subsequently
makes it into the function again without a prior reset of state.
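
Illustration (not part of the patch): a minimal sketch of claiming a
slot from a fixed-size array while only bumping the usage counter once
the slot is known to be valid. NR_SLOTS, the struct layout and
claim_slot() are hypothetical.

```c
#include <stddef.h>

#define NR_SLOTS 3  /* hypothetical capacity */

struct slot { unsigned long gpa; };
struct cache { unsigned int count; struct slot slots[NR_SLOTS]; };

static struct slot *claim_slot(struct cache *c)
{
    unsigned int i = c->count;

    /* Out of space: report it and leave the counter untouched. */
    if ( i >= NR_SLOTS )
        return NULL;

    /* Only now account for the slot that is about to be handed out. */
    c->count = i + 1;

    return &c->slots[i];
}
```

Repeated calls on a full cache keep returning NULL instead of walking
past the end of the array.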

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
6 years agokexec: suppress bogus warning
Jan Beulich [Tue, 12 Mar 2019 13:39:13 +0000 (14:39 +0100)]
kexec: suppress bogus warning

Till now "crashkernel=1G-16G:1M" causes

(XEN) crashkernel: '' ignored
(XEN) parameter "crashkernel" has invalid value "1G-16G:1M", rc=-22!

Don't emit the "ignored" warning when there's no placement specification
and the tail of the specified option is actually empty.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoiommu: leave IOMMU enabled by default during kexec crash transition
Igor Druzhinin [Tue, 12 Mar 2019 13:38:12 +0000 (14:38 +0100)]
iommu: leave IOMMU enabled by default during kexec crash transition

It's unsafe to disable the IOMMU on a live system, which is the case
when we're crashing: remapping hardware doesn't usually know what to do
with ongoing bus transactions and frequently raises NMI/MCE/SMI etc.
(depending on the firmware configuration) to signal these abnormalities.
This, in turn, doesn't play well with the kexec transition process, as
there is currently no handling for this kind of event, resulting in
failures to enter the kernel.

Modern Linux kernels have been taught to copy all the necessary DMAR/IR
tables from the previous kernel (Xen in our case) following kexec, so
it's now normal to keep the IOMMU enabled. It might require minor changes
to the kdump command line to enable IOMMU drivers (e.g. intel_iommu=on /
intremap=on), but recent kernels don't require any additional changes for
the transition to be transparent.

A fallback option is still left for compatibility with ancient crash
kernels which didn't like to have IOMMU active under their feet on boot.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>