xenbits.xensource.com Git - people/liuw/xen.git/log
6 years ago docs/parse-support-md: Allow definition lists for features
Ian Jackson [Mon, 3 Dec 2018 12:05:41 +0000 (12:05 +0000)]
docs/parse-support-md: Allow definition lists for features

Now, as well as a `code block', with
  |    Something: some status
we tolerate a definition list which in pandoc terms looks like this
  |Term
  |: Definition

This ought not usually to be used for features but it will be useful
for linking to the release notes, because markup is not allowed in
code blocks but is in definitions.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
6 years ago docs/parse-support-md: Correct handling of Status
Ian Jackson [Mon, 3 Dec 2018 12:01:55 +0000 (12:01 +0000)]
docs/parse-support-md: Correct handling of Status

In fact this was not markdown content, but just a string.  We are
however going to make it be markdown content.  So adjust the comments,
and the consumer.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
6 years ago docs/parse-support-md: pandoc2html_inline: print failing json
Ian Jackson [Mon, 3 Dec 2018 12:03:48 +0000 (12:03 +0000)]
docs/parse-support-md: pandoc2html_inline: print failing json

If our run of pandoc, which converts the pieces of markup we hold into
HTML, fails, print the JSON that was rejected.

No change in non-error cases.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
6 years ago docs/parse-support-md: Break out descr2key
Ian Jackson [Mon, 3 Dec 2018 12:03:19 +0000 (12:03 +0000)]
docs/parse-support-md: Break out descr2key

We are going to want to reuse this.  No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
6 years ago docs/parse-support-md: Adjust some (commented-out) debugging
Ian Jackson [Mon, 3 Dec 2018 12:01:27 +0000 (12:01 +0000)]
docs/parse-support-md: Adjust some (commented-out) debugging

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
6 years ago docs/parse-support-md: More complete example runes
Ian Jackson [Mon, 3 Dec 2018 12:09:28 +0000 (12:09 +0000)]
docs/parse-support-md: More complete example runes

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
6 years ago x86/hvm/viridian: stop open coding updates to APIC registers
Paul Durrant [Fri, 7 Dec 2018 17:50:08 +0000 (17:50 +0000)]
x86/hvm/viridian: stop open coding updates to APIC registers

The code in viridian_synic_wrmsr() duplicates logic in vlapic_reg_write()
to update the ICR, ICR2 and TASKPRI registers. Instead of doing this,
make vlapic_reg_write() non-static and call it.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Rename "offset" to "reg" for consistency with the rest of the vlapic API.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/hvm: remove duplicate vlapic_find_highest_isr() calls
Paul Durrant [Fri, 7 Dec 2018 13:13:02 +0000 (13:13 +0000)]
x86/hvm: remove duplicate vlapic_find_highest_isr() calls

When viridian APIC assist is active, the code in vlapic_has_pending_irq()
may end up re-calling vlapic_find_highest_isr() after emulating an EOI
whereas simply moving the call after the EOI emulation removes the need
for this duplication.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago xen/arm32: Remove __init prefixes from funcs that are used within CPU up flow
Oleksandr Tyshchenko [Fri, 7 Dec 2018 09:45:31 +0000 (11:45 +0200)]
xen/arm32: Remove __init prefixes from funcs that are used within CPU up flow

This is a follow-up patch to
commit 01a7e8ccef6e7d5718a251ad587567afbe723330
xen/arm: Remove __initdata and __init to enable CPU hotplug

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago xen/arm: link: Link proc_info_list in .rodata instead of .init.data
Oleksandr Tyshchenko [Fri, 7 Dec 2018 13:41:16 +0000 (15:41 +0200)]
xen/arm: link: Link proc_info_list in .rodata instead of .init.data

To be able to use it for the hot-plugged CPUs as well.

The reason why we link proc_info_list in the ".rodata" section is that
its content should never be modified.

This patch also renames the ".init.proc.info" section to ".proc.info",
as the "init" prefix no longer applies.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
6 years ago tools/libxl: fix boot of HVM domain with Xenstore-stubdom
Juergen Gross [Tue, 4 Dec 2018 14:28:57 +0000 (15:28 +0100)]
tools/libxl: fix boot of HVM domain with Xenstore-stubdom

The Xenstore domid isn't set for HVM domains. This will result in
failure when booting a HVM domain on a system with Xenstore not running
in dom0.

Same applies for console domid, so set both.

This has been broken since commit a2d9a6fa1fcd ("tools/libxenctrl: use new
xenforeignmemory API to seed grant table").

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/xenstore: Document failure for xs_{read,directory,read_watch}
Anthony PERARD [Wed, 5 Dec 2018 16:26:02 +0000 (16:26 +0000)]
tools/xenstore: Document failure for xs_{read,directory,read_watch}

Those functions can return NULL on failure, document it in the public
header.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago ns16550: enable use of PCI MSI
Jan Beulich [Thu, 6 Dec 2018 11:21:34 +0000 (12:21 +0100)]
ns16550: enable use of PCI MSI

Which, on x86, requires fiddling with the INTx bit in PCI config space,
since for internally used MSI we can't delegate this to Dom0.

ns16550_init_postirq() also needs (benign) re-ordering of its
operations.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago console: adjust IRQ initialization
Jan Beulich [Thu, 6 Dec 2018 11:20:55 +0000 (12:20 +0100)]
console: adjust IRQ initialization

In order for a Xen internal PCI device driver to enable MSI on the
device, we need another hook which the driver can use to create the IRQ
(doing this in the init_preirq hook is too early, since IRQ code hasn't
got initialized at that time yet, and doing it in init_postirq is too
late because at least on x86 smp_intr_init() needs to know the IRQ
number).

On x86 this additionally requires a slight ordering change to IRQ
initialization, to facilitate calling the new hook between basic
initialization and the call path leading to smp_intr_init().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago make domain_adjust_tot_pages() __must_check
Jan Beulich [Thu, 6 Dec 2018 11:19:04 +0000 (12:19 +0100)]
make domain_adjust_tot_pages() __must_check

Even if unlikely, donate_page() should not ignore the possible need to
obtain a domain reference. To make people look more closely when they
add new uses of domain_adjust_tot_pages(), force its return value to be
checked. This in turn requires a benign change to assign_pages().
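
As a minimal sketch of the idea, assuming a GCC or Clang toolchain: the
attribute behind a __must_check style macro makes the compiler warn when a
return value is dropped. The names below are hypothetical stand-ins rather
than the Xen ones, and the rule that the first page added requires a domain
reference is an assumption made for illustration.

```c
#include <assert.h>

/* Assumption: GCC/Clang attribute that warns on an ignored return value. */
#define must_check __attribute__((warn_unused_result))

/* Hypothetical stand-in for a domain's page accounting. */
struct dom_sketch {
    long tot_pages;
};

/* Adjust the page count and return the new total, which must be inspected. */
static must_check long adjust_tot_pages_sketch(struct dom_sketch *d, long pages)
{
    d->tot_pages += pages;
    return d->tot_pages;
}

/*
 * Returns 1 when the adjustment took the count from zero to non-zero, i.e.
 * the case in which (by assumption) a domain reference would be needed.
 */
static int donate_needs_ref_sketch(struct dom_sketch *d, long pages)
{
    assert(pages > 0);
    return adjust_tot_pages_sketch(d, pages) == pages;
}
```

Calling adjust_tot_pages_sketch() as a bare statement would draw a
-Wunused-result warning, which is the "look more closely" effect described
above.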

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86: reduce code duplication in guest_remove_page()
Jan Beulich [Thu, 6 Dec 2018 11:18:03 +0000 (12:18 +0100)]
x86: reduce code duplication in guest_remove_page()

Quite a bit of duplicate code has accumulated on the "paging" types
special case path. Re-use what can be re-used from the common path.

Since it needs touching anyway, slightly re-format and extend the
gdprintk() on the common path as well.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago automation: break .gitlab-yaml into smaller files
Wei Liu [Thu, 22 Nov 2018 15:49:03 +0000 (15:49 +0000)]
automation: break .gitlab-yaml into smaller files

Break out files for build jobs and test jobs. Keep the top level
.gitlab-ci.yaml small.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years ago automation: add a qemu smoke test for clang build
Wei Liu [Thu, 22 Nov 2018 15:49:02 +0000 (15:49 +0000)]
automation: add a qemu smoke test for clang build

Also rename the old test to have -gcc suffix.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years ago x86/hvm: Handle x2apic MSRs via the new guest_{rd,wr}msr() infrastructure
Andrew Cooper [Mon, 26 Feb 2018 12:45:58 +0000 (12:45 +0000)]
x86/hvm: Handle x2apic MSRs via the new guest_{rd,wr}msr() infrastructure

Dispatch from the guest_{rd,wr}msr() functions.  The read side should be safe
outside of current context, but the write side is definitely not.  As the
toolstack has no legitimate reason to access the APIC registers via this
interface (not least because whether they are accessible at all depends on
guest settings), unilaterally reject access attempts outside of current
context.

Rename to guest_{rd,wr}msr_x2apic() for consistency, and alter the functions
to use X86EMUL_EXCEPTION rather than X86EMUL_UNHANDLEABLE.  The previous
callers turned UNHANDLEABLE into EXCEPTION, but using UNHANDLEABLE will now
interfere with the fallback to legacy MSR handling.

While altering guest_rdmsr_x2apic() make a couple of minor improvements.
Reformat the initialiser for readable[] so it indents in a more natural way,
and alter high to be a 64bit integer to avoid shifting 0 by 32 in the common
path.
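
One possible reading of the "high" change, as a sketch only: keep the upper
half in a 64-bit variable that is already in position, so the common case
(no upper half) needs no shift at all. The function and parameter names are
hypothetical and do not come from the patch.

```c
#include <stdint.h>

/*
 * Build a 64-bit register value from a 32-bit low half and an optional
 * upper half. 'high' is 64 bits wide and stays 0 in the common path, so a
 * shift happens only for registers that really have an upper half.
 */
static uint64_t read_reg_sketch(uint32_t low, int has_upper_half,
                                uint32_t upper)
{
    uint64_t high = 0;

    if (has_upper_half)
        high = (uint64_t)upper << 32;

    return high | low;
}
```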

Observant people might notice that we now don't let PV guests read the x2apic
MSRs.  They should never have been able to in the first place.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago x86: Fix APIC MSR constant names
Andrew Cooper [Wed, 7 Mar 2018 16:48:01 +0000 (16:48 +0000)]
x86: Fix APIC MSR constant names

We currently have MSR_IA32_APICBASE and MSR_IA32_APICBASE_MSR which are
synonymous from a naming point of view, but refer to very different things.

Rename the x2APIC MSRs to MSR_X2APIC_*, which are shorter constants and
visually separate the register function from the generic APIC name.  For the
case ranges, introduce MSR_X2APIC_LAST, rather than relying on the knowledge
that there are 0x3ff MSRs architecturally reserved for x2APIC functionality.
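
A sketch of what a named upper bound buys, assuming the x2APIC MSR block
starts at 0x800 (the value of MSR_X2APIC_FIRST here is an assumption for
illustration; only MSR_X2APIC_LAST is named above). The helper function is
hypothetical.

```c
#include <stdint.h>

/* Assumed base of the x2APIC MSR block; treat the value as illustrative. */
#define MSR_X2APIC_FIRST 0x00000800u
/* Named end of the range, so callers need not know the 0x3ff figure. */
#define MSR_X2APIC_LAST  (MSR_X2APIC_FIRST + 0x3ffu)

/* Hypothetical helper: does this MSR index fall in the x2APIC range? */
static int is_x2apic_msr_sketch(uint32_t msr)
{
    return msr >= MSR_X2APIC_FIRST && msr <= MSR_X2APIC_LAST;
}
```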

For functionality relating to the APIC_BASE MSR, use MSR_APIC_BASE for the MSR
itself, but drop the MSR prefix from the other constants to shorten the names.
In all cases, the fact that we are dealing with the APIC_BASE MSR is obvious
from the context.

No functional change (the combined binary is identical).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago x86/cpuid: Drop the synthetic X86_FEATURE_XEN_IBPB
Andrew Cooper [Thu, 29 Nov 2018 18:16:01 +0000 (18:16 +0000)]
x86/cpuid: Drop the synthetic X86_FEATURE_XEN_IBPB

This appears to be a vestigial remnant of an old version of the
XSA-254/Spectre series, and has never been used.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/spec-ctrl: Drop the bti= command line option
Andrew Cooper [Thu, 29 Nov 2018 18:17:45 +0000 (18:17 +0000)]
x86/spec-ctrl: Drop the bti= command line option

bti= was introduced with the original Spectre fixes (Jan 2018), but by the
time Speculative Store Bypass came along (May 2018), it was superseded by the
more generic spec-ctrl=.

Since then, we've had LazyFPU (June 2018) and L1TF (August 2018), which means
no one will be using the option.  Remove it entirely - anyone who happens to
accidentally be using it might now spot Xen complaining about an option it
doesn't understand.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago pci: apply workaround for Intel errata HSE43 and BDF2/BDX2
Roger Pau Monné [Tue, 4 Dec 2018 13:04:54 +0000 (14:04 +0100)]
pci: apply workaround for Intel errata HSE43 and BDF2/BDX2

These errata affect the values read from the BAR registers, and could
render vPCI (and by extension PVH Dom0) unusable.

HSE43 is a Haswell erratum where a non-BAR register is implemented at
the position where the first BAR of the device should be found in a
Power Control Unit device. Note that there are no BARs on this device,
apart from the bogus CSR register positioned on top of the first BAR.

BDF2/BDX2 is a Broadwell erratum where BARs in the Home Agent device
will return bogus non-zero values.

In both cases the solution is to treat such devices as having no BARs
in the vPCI code.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago vmx: remove stale prototypes
Juergen Gross [Tue, 4 Dec 2018 13:04:20 +0000 (14:04 +0100)]
vmx: remove stale prototypes

Some prototypes in include/asm-x86/hvm/vmx/vmx.h have no related
implementation. Remove them.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years ago x86emul: raise #GP(0) in VME mode for POPF with TF set in new value
Jan Beulich [Tue, 4 Dec 2018 13:03:43 +0000 (14:03 +0100)]
x86emul: raise #GP(0) in VME mode for POPF with TF set in new value

This is a check explicitly listed by the instruction page in the SDM.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86emul: skip VIF processing in VME mode for 16-bit POPF at IOPL 3
Jan Beulich [Tue, 4 Dec 2018 13:02:46 +0000 (14:02 +0100)]
x86emul: skip VIF processing in VME mode for 16-bit POPF at IOPL 3

At IOPL 3 CR4.VME is irrelevant.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago tools/libxc: Fix error handling in get_cpuid_domain_info()
Andrew Cooper [Thu, 29 Nov 2018 18:17:01 +0000 (18:17 +0000)]
tools/libxc: Fix error handling in get_cpuid_domain_info()

get_cpuid_domain_info() has two conflicting return styles - either -error for
local failures, or -1/errno for hypercall failures.  Switch to consistently
use -error.
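
The two return styles, and the switch to one of them, can be sketched as
follows. Everything here is a hypothetical stand-in (there is no real
hypercall involved); it only illustrates translating "-1 with errno set"
into "-error" at the boundary.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical hypercall wrapper: returns 0, or -1 with errno set. */
static int fake_hypercall_sketch(int fail_with)
{
    assert(fail_with >= 0);
    if (fail_with) {
        errno = fail_with;
        return -1;
    }
    return 0;
}

/* Consistent style: always return 0 or a negative errno value. */
static int get_info_sketch(const void *arg, int hypercall_failure)
{
    if (!arg)
        return -EINVAL;                 /* local failure: -error */

    if (fake_hypercall_sketch(hypercall_failure))
        return -errno;                  /* -1/errno translated to -error */

    return 0;
}
```

With a single convention, a caller can test "rc < 0" and use -rc as the
error code without consulting errno.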

While fixing the xc_get_cpu_featureset(), take the opportunity to remove the
redundancy and move it to be adjacent to the other featureset handling.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/libxc: Fix issues with libxc and Xen having different featureset lengths
Andrew Cooper [Thu, 29 Nov 2018 18:10:38 +0000 (18:10 +0000)]
tools/libxc: Fix issues with libxc and Xen having different featureset lengths

In almost all cases, Xen and libxc will agree on the featureset length,
because they are built from the same source.

However, there are circumstances (e.g. security hotfixes) where the featureset
gets longer and dom0 will, after installing updates, be running with an old
Xen but new libxc.  Despite writing the code with this scenario in mind, there
were some bugs.

First, xen-cpuid's get_featureset() erroneously allocates a buffer based on
Xen's featureset length, but records libxc's length, which may be longer.

In this situation, the hypercall bounce buffer code reads/writes the recorded
length, which is beyond the end of the allocated object, and a later free()
encounters corrupt heap metadata.  Fix this by recording the same length that
we allocate.

Secondly, get_cpuid_domain_info() has a related bug when the passed-in
featureset is a different length to libxc's.

A large amount of the libxc cpuid functionality depends on info->featureset
being as long as expected, and it is allocated appropriately.  However, in the
case that a shorter external featureset is passed in, the logic to check for
trailing nonzero bits may read off the end of it.  Rework the logic to use the
correct upper bound.

In addition, leave a comment next to the fields in struct cpuid_domain_info
explaining the relationship between the various lengths, and how to cope with
different lengths.
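
Both fixes come down to keeping a length and the buffer it describes in
step. A minimal sketch under that assumption, with hypothetical names (this
is not the libxc code): record exactly the length that was allocated, and
bound the trailing-bits scan by the length the caller actually passed.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical pairing of a featureset buffer with its length in words. */
struct featureset_sketch {
    uint32_t *fs;
    unsigned int nr;
};

/* Allocate nr words and record that same nr, never a different length. */
static int alloc_featureset_sketch(struct featureset_sketch *f,
                                   unsigned int nr)
{
    f->fs = calloc(nr, sizeof(*f->fs));
    if (!f->fs)
        return -1;
    f->nr = nr;
    return 0;
}

/*
 * Returns 1 if the external featureset has bits set in words beyond the
 * lib_nr words the library understands. The scan stops at ext_nr, so a
 * shorter external featureset is never read past its end.
 */
static int has_unknown_trailing_bits_sketch(const uint32_t *ext,
                                            unsigned int ext_nr,
                                            unsigned int lib_nr)
{
    for (unsigned int i = lib_nr; i < ext_nr; i++)
        if (ext[i])
            return 1;
    return 0;
}
```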

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xl: free bitmaps on exit
Olaf Hering [Wed, 28 Nov 2018 12:24:34 +0000 (13:24 +0100)]
xl: free bitmaps on exit

Every invocation of xl via valgrind will show three leaks.
Since libxl_bitmap_alloc uses NOGC, the caller has to free the memory
after use. And since xl_ctx_free might be called before
parse_global_config, also move the libxl_bitmap_init calls into
xl_ctx_alloc.

Also move the call to atexit() after xl_ctx_alloc, because the latter is
also called again in postfork.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86/shadow: don't enable shadow mode with too small a shadow allocation
Jan Beulich [Fri, 30 Nov 2018 11:10:39 +0000 (12:10 +0100)]
x86/shadow: don't enable shadow mode with too small a shadow allocation

We've had more than one report of host crashes after failed migration,
and in at least one case we've had a hint that the shadow allocation
pool had been shrunk too far. Instead of just checking the pool for being
empty, check whether the pool is smaller than what
shadow_set_allocation() would minimally bump it to if it was invoked in
the first place.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years ago amd/iommu: skip host bridge devices when updating IOMMU page tables
Roger Pau Monné [Fri, 30 Nov 2018 11:10:00 +0000 (12:10 +0100)]
amd/iommu: skip host bridge devices when updating IOMMU page tables

Host bridges are not behind an IOMMU, and are already special cased and
skipped in amd_iommu_add_device. Apply the same special casing when
updating page tables.

This is required or else update_paging_mode will fail and return an
error to the caller (amd_iommu_{un}map_page) which will destroy the
domain.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Brian Woods <brian.woods@amd.com>
6 years ago amd/iommu: assign iommu devices to Xen
Roger Pau Monné [Fri, 30 Nov 2018 11:09:09 +0000 (12:09 +0100)]
amd/iommu: assign iommu devices to Xen

AMD IOMMU devices are exposed on the PCI bus, and thus are assigned by
default to the hardware domain. This can cause issues because the
IOMMU devices themselves are not behind an IOMMU, so update_paging_mode will
return an error if Xen tries to expand the page tables of a domain
that has assigned devices not behind an IOMMU. update_paging_mode
failing will cause the domain to be destroyed.

Fix this by hiding PCI IOMMU devices, so they are not assigned to the
hardware domain.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
6 years ago amd-iommu: replace occurrences of u<N> with uint<N>_t...
Paul Durrant [Fri, 30 Nov 2018 11:08:28 +0000 (12:08 +0100)]
amd-iommu: replace occurrences of u<N> with uint<N>_t...

...for N in {8, 16, 32, 64}.

Bring the coding style up to date.

Also, while in the neighbourhood, fix some tabs and remove use of uint64_t
values where it leads to the need for explicit casting.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
6 years ago ns16550/PCI: fix skipping of devices
Jan Beulich [Fri, 30 Nov 2018 11:07:33 +0000 (12:07 +0100)]
ns16550/PCI: fix skipping of devices

Selecting between single/multiple BAR mode should happen after checking
whether to skip the present device, or else multi-BAR devices won't be
skipped correctly, due to port_idx getting set to zero in that case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl: Remove redundant pidpath setting
George Dunlap [Fri, 23 Nov 2018 17:14:54 +0000 (17:14 +0000)]
libxl: Remove redundant pidpath setting

This exact same line is duplicated further on without being used or
modified in between.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago tools: set Dom0 UUID if requested
Wei Liu [Mon, 26 Nov 2018 10:40:44 +0000 (10:40 +0000)]
tools: set Dom0 UUID if requested

Introduce XEN_DOM0_UUID in Xen's global configuration file.  Make
xen-init-dom0 accept an extra argument for UUID.

Also switch xs_open error message in xen-init-dom0 to use perror.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago x86: fix paging_max_paddr_bits()
Juergen Gross [Wed, 28 Nov 2018 14:51:20 +0000 (15:51 +0100)]
x86: fix paging_max_paddr_bits()

paging_max_paddr_bits() has an invalid use of IS_ENABLED(): instead of
IS_ENABLED(CONFIG_BIGMEM) it is using IS_ENABLED(BIGMEM). Fix that.
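
Why the wrong name goes unnoticed can be sketched with a self-contained
IS_ENABLED() built on the usual preprocessor placeholder trick. This is an
assumption about how such a macro is commonly written, not a copy of the
Xen definition, and the helper macro and function names are hypothetical.
An undefined name is not an error; it simply evaluates to 0.

```c
/* Expands to "0," only when pasted with a macro that was defined as 1. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_1(x) IS_DEFINED_2(x)
#define IS_ENABLED(option) IS_DEFINED_1(option)

/* Stand-in for the build configuration enabling the option. */
#define CONFIG_BIGMEM 1

/* Correct spelling: sees the configured option. */
static int bigmem_checked_correctly(void)
{
    return IS_ENABLED(CONFIG_BIGMEM);
}

/* Missing CONFIG_ prefix: compiles cleanly but is always 0. */
static int bigmem_checked_wrongly(void)
{
    return IS_ENABLED(BIGMEM);
}
```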

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86emul: correct 32-bit address handling for AVX2 gathers
Jan Beulich [Wed, 28 Nov 2018 14:50:26 +0000 (15:50 +0100)]
x86emul: correct 32-bit address handling for AVX2 gathers

As done for other cases by commit 7869e2bafe ("x86emul/fuzz: add
rudimentary limit checking"), address calculations should also use
truncate_ea() for the AVX2 gather insns.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago amd-iommu: replace occurrences of bool_t with bool
Paul Durrant [Wed, 28 Nov 2018 14:49:01 +0000 (15:49 +0100)]
amd-iommu: replace occurrences of bool_t with bool

Bring the coding style up to date. No functional change (except for
removal of some pointless initializers).

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Brian Woods <brian.woods@amd.com>
6 years ago xen: remove trailing spaces from public headers
Juergen Gross [Wed, 28 Nov 2018 12:32:36 +0000 (13:32 +0100)]
xen: remove trailing spaces from public headers

Several public header files have trailing spaces in them. This is
rather annoying when importing them into other projects as they might
be rejected for not complying with the coding style.

Remove the trailing spaces in all headers below xen/include/public/.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago tools/xenstore: Document that xs_close(0) is OK.
Ian Jackson [Fri, 2 Nov 2018 17:01:07 +0000 (17:01 +0000)]
tools/xenstore: Document that xs_close(0) is OK.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/libvchan: Initialise xs_transaction_t to XBT_NULL, not NULL
Ian Jackson [Fri, 2 Nov 2018 17:01:06 +0000 (17:01 +0000)]
tools/libvchan: Initialise xs_transaction_t to XBT_NULL, not NULL

This is an integer type, not a pointer.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago arm/irq: Fix block parathenses and whitespaces
Andrii Anisov [Fri, 16 Nov 2018 16:24:18 +0000 (18:24 +0200)]
arm/irq: Fix block parathenses and whitespaces

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
6 years ago arm/irq: replace an odd tab with spaces
Andrii Anisov [Fri, 16 Nov 2018 16:24:17 +0000 (18:24 +0200)]
arm/irq: replace an odd tab with spaces

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago mm: make opt_bootscrub non-init
Roger Pau Monne [Mon, 26 Nov 2018 17:55:48 +0000 (18:55 +0100)]
mm: make opt_bootscrub non-init

LLVM code generation can attempt to load from a variable in the next
condition of an expression under certain circumstances, thus turning
the following condition:

if ( system_state < SYS_STATE_active && opt_bootscrub == BOOTSCRUB_IDLE )

Into:

0xffff82d080223967 <+103>: cmpl   $0x3,0x37b032(%rip) # 0xffff82d08059e9a0 <system_state>
0xffff82d08022396e <+110>: setb   -0x29(%rbp)
0xffff82d080223972 <+114>: cmpl   $0x2,0x228a8b(%rip) # 0xffff82d08044c404 <opt_bootscrub>

Such code will trigger a page fault if system_state >=
SYS_STATE_active because opt_bootscrub will be unmapped.

Fix this by making opt_bootscrub non-init, thus preventing the page
fault. The LLVM bug with the discussion about this issue can be found
at:

https://bugs.llvm.org/show_bug.cgi?id=39707

I haven't been able to find any other instances of such conditional
expression that uses system_state together with an init variable or
function.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/libs: xenforeignmemory_unmap_resource() should be idempotent...
Paul Durrant [Tue, 27 Nov 2018 16:39:17 +0000 (16:39 +0000)]
tools/libs: xenforeignmemory_unmap_resource() should be idempotent...

...and is not because linux osdep_xenforeignmemory_unmap_resource() is not.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/tools: Fix gen-cpuid.py's ability to report errors
Andrew Cooper [Mon, 26 Nov 2018 12:03:07 +0000 (12:03 +0000)]
xen/tools: Fix gen-cpuid.py's ability to report errors

c/s 18596903 "xen/tools: support Python 2 and Python 3" unfortunately
introduced a TypeError when changing how Fail exceptions were printed:

  /local/xen.git/xen/../xen/tools/gen-cpuid.py:Traceback (most recent call last):
    File "/local/xen.git/xen/../xen/tools/gen-cpuid.py", line 483, in <module>
        sys.stderr.write(e)
  TypeError: expected a character buffer object

Coerce e to a string before printing.  While changing this, fold the three
write() calls making up the line into a single one, and take the opportunity
to neaten the output.

A sample error is:

  /local/xen.git/xen/tools/gen-cpuid.py: Fail: Aliased value between FOO and BAR

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago viridian: fix assertion failure
Paul Durrant [Mon, 26 Nov 2018 16:54:24 +0000 (17:54 +0100)]
viridian: fix assertion failure

Whilst attempting to crash an apparently wedged Windows domain using
'xen-hvmcrash' I managed to trigger the following ASSERT:

(XEN) Assertion '!vp->ptr' failed at viridian.c:607

with stack:

(XEN)    [<ffff82d08032c55d>] viridian_map_guest_page+0x1b4/0x1b6
(XEN)    [<ffff82d08032b1db>] viridian_synic_load_vcpu_ctxt+0x39/0x3b
(XEN)    [<ffff82d08032b90d>] viridian.c#viridian_load_vcpu_ctxt+0x93/0xcc
(XEN)    [<ffff82d0803096d6>] hvm_load+0x10e/0x19e
(XEN)    [<ffff82d080274c6d>] arch_do_domctl+0xb74/0x25b4
(XEN)    [<ffff82d0802068ab>] do_domctl+0x16f7/0x19d8

This happened because viridian_map_guest_page() was not written to cope
with being called multiple times, but this is unfortunately exactly what
happens when xen-hvmcrash re-loads the domain context (having clobbered
the values of RIP).

This patch simply makes viridian_map_guest_page() return immediately if it
finds the page already mapped (i.e. vp->ptr != NULL).
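
The early return can be sketched like this. The struct, the use of malloc()
in place of a real guest-page mapping, and the map_count counter are all
hypothetical; only the "return immediately if ptr is already set" shape is
taken from the description above.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a mapped guest page. */
struct page_sketch {
    void *ptr;
};

/* Counts how many real mappings were created (illustration only). */
static int map_count;

/* Safe to call repeatedly: a second call finds ptr set and does nothing. */
static void map_guest_page_sketch(struct page_sketch *vp)
{
    if (vp->ptr)
        return;                 /* already mapped */

    vp->ptr = malloc(4096);     /* stands in for the real mapping */
    if (vp->ptr)
        map_count++;
}

/* The matching unmap is written to tolerate repeated calls as well. */
static void unmap_guest_page_sketch(struct page_sketch *vp)
{
    if (!vp->ptr)
        return;

    free(vp->ptr);
    vp->ptr = NULL;
}
```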

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago x86emul: suppress default test harness build with incapable assembler
Jan Beulich [Mon, 26 Nov 2018 16:53:51 +0000 (17:53 +0100)]
x86emul: suppress default test harness build with incapable assembler

A top level "make build", as used e.g. by osstest, wants to build all
"all" targets in enabled tools subdirectories, which by default also
includes the emulator test harness. The use of, in particular, {evex}
insn pseudo-prefixes in, again in particular, test_x86_emulator.c causes
this build to fail though when the assembler is not new enough. Take
another big hammer and suppress the default harness build altogether
also when this and other pseudo-prefixes are not supported by the
specified (or defaulted to) assembler.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86emul: fix test harness 32-bit "clean" target handling
Jan Beulich [Mon, 26 Nov 2018 14:44:48 +0000 (15:44 +0100)]
x86emul: fix test harness 32-bit "clean" target handling

When preparing what is now 52c37f7ab9 ("x86emul: also allow running the
32-bit harness on a 64-bit distro") I first wrongly used XEN_TARGET_ARCH
instead of XEN_COMPILE_ARCH. When realizing the mistake I forgot to also
switch around the use in the expression controlling the rule
dependencies, causing "make distclean" to fail on 64-bit distros.

Reported-by: Paul Durrant <Paul.Durrant@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Paul Durrant <Paul.Durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago common: make sure symbols-dummy.o gets rebuilt when needed
Jan Beulich [Mon, 26 Nov 2018 14:44:05 +0000 (15:44 +0100)]
common: make sure symbols-dummy.o gets rebuilt when needed

The per-arch top level make files don't record any dependencies for the
file, so its mere existence is enough for make to consider it up-to-
date. As of ab3e5f5ff9 ("xsplice, symbols: Implement fast symbol names
-> virtual addresses lookup") the file, however, depends on the
FAST_SYMBOL_LOOKUP config option, which may change between incremental
re-builds.

Use the $(extra-y) machinery to get the file built without an extra
recursion step into common/, but instead right when the other things in
that directory get built. Some makefile adjustments are necessary to
actually make this machinery work beyond the restricted set of places it
was used in before. Note however that an important restriction remains:
$(extra-y) may not overlap $(obj-y) or $(obj-bin-y).

Take the opportunity and also make the gendep invocation cover both
$(obj-bin-y) and $(extra-y), even if this is not directly related here.
I should have included them right away in 8b6ef9c152 ("compat: enforce
distinguishable file names in symbol table").

Reported-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago EFI: don't repeatedly replace symlinks
Jan Beulich [Mon, 26 Nov 2018 14:43:22 +0000 (15:43 +0100)]
EFI: don't repeatedly replace symlinks

Once created there's no point re-creating them on every incremental
make. This in particular prevents them from becoming root-owned during
e.g. "sudo make install-xen", but it also allows replacing them there
(instead of in common/efi/) during development with actual files that
have perhaps slightly changed contents.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago pci: add a segment parameter to pci_hide_device
Roger Pau Monné [Mon, 26 Nov 2018 14:42:19 +0000 (15:42 +0100)]
pci: add a segment parameter to pci_hide_device

No functional change expected.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agovpci/msix: carve p2m hole for MSIX MMIO regions
Roger Pau Monné [Mon, 26 Nov 2018 14:41:42 +0000 (15:41 +0100)]
vpci/msix: carve p2m hole for MSIX MMIO regions

Make sure the MSIX MMIO regions don't have p2m entries setup, so that
accesses to them trap into the hypervisor and can be handled by vpci.

Commit 042678762 ("x86/iommu: add map-reserved dom0-iommu option to
map reserved memory ranges") added mappings for all the reserved
regions into the PVH Dom0 p2m, and some of those reserved regions
might contain MSIX MMIO regions, hence the need to make sure there are
no mappings established.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agovpci: fix deferral of long operations
Roger Pau Monné [Mon, 26 Nov 2018 14:41:12 +0000 (15:41 +0100)]
vpci: fix deferral of long operations

Current logic to handle long running operations is flawed because it
doesn't prevent the guest vcpu from running. Fix this by raising a
scheduler softirq when preemption is required, so that the do_softirq
call in the guest entry path performs a rescheduling. Also move the
call to vpci_process_pending into handle_hvm_io_completion, together
with the IOREQ code that handles pending IO instructions.

Note that a scheduler softirq is also raised when the long running
operation is queued in order to prevent the guest vcpu from resuming
execution.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agovpci: fix updating the command register
Roger Pau Monné [Mon, 26 Nov 2018 14:40:06 +0000 (15:40 +0100)]
vpci: fix updating the command register

When switching the memory decoding bit in the command register the
rest of the changes were dropped, leading to only the memory decoding
bit being updated.

Fix this by writing the command register once the guest physmap
manipulations are done if there are changes to the memory decoding
bit.

Note that when only mapping/unmapping the ROM BAR a fabricated command
register value is passed to modify_bars which is only used to signal
whether the action is a mapping or unmapping, but the value is never
written to the device command register. Turn the modify_decoding
ASSERT into an ASSERT_UNREACHABLE and make sure that non-debug builds
won't end up writing to the command register if only modifying the ROM
BAR.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/vvmx: Don't call vmsucceed() at the end of virtual_vmexit()
Andrew Cooper [Thu, 1 Nov 2018 17:37:48 +0000 (17:37 +0000)]
x86/vvmx: Don't call vmsucceed() at the end of virtual_vmexit()

The correct value for RFLAGS is established earlier in the function, and a
successful vmexit logically discards the previous executing context.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agox86/vvmx: Fixes to VMWRITE emulation
Andrew Cooper [Thu, 1 Nov 2018 17:37:48 +0000 (17:37 +0000)]
x86/vvmx: Fixes to VMWRITE emulation

 * Don't assume that decode_vmx_inst() always returns X86EMUL_EXCEPTION.
 * The okay boolean is never written, making the else case dead.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agox86/vvmx: Correct the INVALID_PADDR checks for VMPTRLD/VMCLEAR
Andrew Cooper [Thu, 1 Nov 2018 17:37:48 +0000 (17:37 +0000)]
x86/vvmx: Correct the INVALID_PADDR checks for VMPTRLD/VMCLEAR

The referenced addresses also need checking against MAXPHYSADDR.
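
As an illustration only (not the code from this patch), the kind of check
described above can be sketched as follows. The helper name, the page-size
macro and the way maxphysaddr is passed in are assumptions made for the
example; the alignment test reflects the architectural requirement that a
VMCS pointer is 4K aligned.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_4K 4096ULL

/*
 * Hypothetical helper: a VMCS pointer is acceptable only if it is page
 * aligned and has no bits set at or above the guest's MAXPHYSADDR width.
 */
bool vmcs_addr_ok(uint64_t gpa, unsigned int maxphysaddr)
{
    if ( gpa & (PAGE_SIZE_4K - 1) )
        return false;               /* not page aligned */

    /* Guard the shift: shifting a 64-bit value by 64 is undefined in C. */
    if ( maxphysaddr < 64 && (gpa >> maxphysaddr) )
        return false;               /* beyond the physical address width */

    return true;
}
```

With maxphysaddr == 36, an address of 1ULL << 36 is rejected even though
it is page aligned and not equal to an all-ones INVALID_PADDR value.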

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agox86/vvmx: Drop unused CASE_{GET,SET}_REG() macros
Andrew Cooper [Thu, 1 Nov 2018 17:37:48 +0000 (17:37 +0000)]
x86/vvmx: Drop unused CASE_{GET,SET}_REG() macros

These have been obsolete since c/s 053ae230 "x86/vvmx: Remove enum
vmx_regs_enc".

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agomm: disallow MEMF_no_refcount to be passed for domain-owned allocations
Jan Beulich [Fri, 23 Nov 2018 11:08:09 +0000 (12:08 +0100)]
mm: disallow MEMF_no_refcount to be passed for domain-owned allocations

When such pages get assigned to domains (and hence their ->tot_pages
not incremented accordingly) we would otherwise also need to suppress
decrementing the count when freeing those pages.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/p2m: switch global_logdirty from bool_t to bool
Razvan Cojocaru [Fri, 23 Nov 2018 11:07:24 +0000 (12:07 +0100)]
x86/p2m: switch global_logdirty from bool_t to bool

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years agox86/mm: introduce p2m_{init,free}_logdirty()
Razvan Cojocaru [Fri, 23 Nov 2018 11:06:52 +0000 (12:06 +0100)]
x86/mm: introduce p2m_{init,free}_logdirty()

Add logdirty_ranges allocator / deallocator helpers.
p2m_init_logdirty() will not re-allocate if
p2m->logdirty ranges has already been allocated.

Move the rangeset deallocation call from p2m_teardown_hostp2m()
to p2m_free_one() - we will want this to apply to altp2ms
as well.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agox86/mwait-idle: Graceful probe failure when MWAIT is disabled
Len Brown [Fri, 23 Nov 2018 11:06:07 +0000 (12:06 +0100)]
x86/mwait-idle: Graceful probe failure when MWAIT is disabled

When MWAIT is disabled, intel_idle refuses to probe.
But it may mis-lead the user by blaming this on the model number:

intel_idle: does not run on family 6 model 79

So defer the check for MWAIT until after the model# white-list check succeeds,
and if the MWAIT check fails, tell the user how to fix it:

intel_idle: Please enable MWAIT in BIOS SETUP
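
The reordering can be sketched as below. This is a simplified model, not
the driver code: the enum, the function name and the two boolean inputs
are made up for the example.

```c
#include <stdbool.h>

enum probe_result {
    PROBE_OK,
    PROBE_UNSUPPORTED_MODEL,   /* "does not run on family ... model ..." */
    PROBE_MWAIT_DISABLED,      /* "Please enable MWAIT in BIOS SETUP" */
};

/*
 * Hypothetical probe: the model white-list is consulted first, so the
 * MWAIT complaint is only issued for models that would otherwise work.
 */
enum probe_result mwait_idle_probe_sketch(bool model_listed, bool mwait_enabled)
{
    if ( !model_listed )
        return PROBE_UNSUPPORTED_MODEL;

    if ( !mwait_enabled )
        return PROBE_MWAIT_DISABLED;

    return PROBE_OK;
}
```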

Signed-off-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit: a4c447533a18ee86e07232d6344ba12b1f9c5077]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/p2m: constify p2m_mem_access_sanity_check()
Razvan Cojocaru [Fri, 23 Nov 2018 11:05:10 +0000 (12:05 +0100)]
x86/p2m: constify p2m_mem_access_sanity_check()

Minor improvement; simply improving code quality by using consts
wherever reasonable.

Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
6 years agox86/vendor: Renumber the X86_VENDOR_ constants
Andrew Cooper [Tue, 10 Jul 2018 12:40:36 +0000 (13:40 +0100)]
x86/vendor: Renumber the X86_VENDOR_ constants

Make X86_VENDOR_UNKNOWN have the value 0 so a piece of zeroed memory can't get
confused with X86_VENDOR_INTEL.

No functional change.
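
A minimal sketch of the property being relied on, using made-up names
and values rather than the real X86_VENDOR_ constants:

```c
#include <string.h>

/* Hypothetical values for illustration only. */
enum vendor {
    VENDOR_UNKNOWN = 0,   /* zero-initialised memory decodes as "unknown" */
    VENDOR_INTEL,
    VENDOR_AMD,
};

struct cpu_info {
    enum vendor vendor;
    unsigned int family;
};

/* Return the vendor stored in a freshly zeroed structure. */
enum vendor vendor_of_zeroed_info(void)
{
    struct cpu_info info;

    memset(&info, 0, sizeof(info));
    return info.vendor;
}
```

With the "unknown" value at 0, a structure that was only ever zeroed
cannot be mistaken for one describing a specific vendor.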

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen/common: Drop unnecessary #ifdef CONFIG_KEXEC
Andrew Cooper [Wed, 21 Nov 2018 17:18:02 +0000 (17:18 +0000)]
xen/common: Drop unnecessary #ifdef CONFIG_KEXEC

kexec.h itself has suitable stubs for the !CONFIG_KEXEC case, so calls to
kexec_crash() don't need guarding.
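
The stub pattern referred to above looks roughly like this. It is a
self-contained sketch, not the contents of kexec.h; the flag variable
and panic_path() exist only to make the example observable.

```c
#include <stdbool.h>

/* Stand-in for the build-time option; define it to select the real version. */
/* #define CONFIG_KEXEC */

static bool crash_kernel_entered;

#ifdef CONFIG_KEXEC
static inline void kexec_crash(void) { crash_kernel_entered = true; }
#else
/* Stub: does nothing, so callers need no #ifdef of their own. */
static inline void kexec_crash(void) {}
#endif

/* The caller is identical in both configurations. */
bool panic_path(void)
{
    kexec_crash();
    return crash_kernel_entered;
}
```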

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen/gnttab: Simplify gnttab_map_frame()
Andrew Cooper [Tue, 23 Oct 2018 18:49:34 +0000 (19:49 +0100)]
xen/gnttab: Simplify gnttab_map_frame()

 * Reflow some lines to remove unnecessary line breaks.
 * Factor out the gnttab_get_frame_gfn() calculation.  Neither x86 nor ARM
   builds seem to be able to fold the two calls, and the resulting code is far
   easier to follow.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen/gnttab: Drop gnttab_create_{shared,status}_page()
Andrew Cooper [Wed, 24 Oct 2018 12:12:35 +0000 (13:12 +0100)]
xen/gnttab: Drop gnttab_create_{shared,status}_page()

share_xen_page_with_guest() is a common API.  Use it directly rather than
wrapping it with unnecessary boilerplate.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/p2m: Switch the two_gfns infrastructure to using gfn_t
Andrew Cooper [Mon, 22 Oct 2018 14:50:14 +0000 (15:50 +0100)]
x86/p2m: Switch the two_gfns infrastructure to using gfn_t

Additionally, drop surrounding trailing whitespace.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
CC: George Dunlap <george.dunlap@eu.citrix.com>
CC: Tamas K Lengyel <tamas@tklengyel.com>
6 years agoxen/mm: Drop ARM put_gfn() stub
Andrew Cooper [Mon, 22 Oct 2018 14:25:14 +0000 (15:25 +0100)]
xen/mm: Drop ARM put_gfn() stub

On x86, get_gfn_*() and put_gfn() are reference counting pairs.  All the
get_gfn_*() functions are called from within CONFIG_X86 sections, but
put_gfn() is stubbed out on ARM.

As a result, the common code reads as if ARM is dropping references it never
acquired.

Put all put_gfn() calls in common code inside CONFIG_X86 to make the code
properly balanced, and drop the ARM stub.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/mem-sharing: Don't leave the altp2m lock held when nominating a page
Andrew Cooper [Wed, 7 Nov 2018 12:25:26 +0000 (12:25 +0000)]
x86/mem-sharing: Don't leave the altp2m lock held when nominating a page

get_gfn_type_access() internally takes the p2m lock, and nothing ever unlocks
it.  Switch to using the unlocked accessor instead.

This wasn't included in XSA-277 because neither mem-sharing nor altp2m are
supported.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/soft-reset: Drop gfn reference after calling get_gfn_query()
Andrew Cooper [Wed, 7 Nov 2018 12:25:19 +0000 (12:25 +0000)]
x86/soft-reset: Drop gfn reference after calling get_gfn_query()

get_gfn_query() internally takes the p2m lock, and this error path leaves it
locked.

This wasn't included in XSA-277 because the error path can only be triggered
by a carefully timed physmap operation concurrent with the domain being paused
and the toolstack issuing DOMCTL_soft_reset.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen/arm: p2m: Introduce a helper to generate P2M table entry from a page
Julien Grall [Mon, 8 Oct 2018 18:33:42 +0000 (19:33 +0100)]
xen/arm: p2m: Introduce a helper to generate P2M table entry from a page

Generating a P2M table entry requires setting some default values which
are worth explaining in a comment. At the moment, there are 2 places where
such entries are created but only one has a proper comment.

So move the code generating a P2M table entry into a separate helper.
This will be helpful in a follow-up patch modifying the defaults.

At the same time, switch the default access from p2m->default_access to
p2m_access_rwx. This should not matter as permissions are ignored for
tables by the hardware.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: guest_walk_tables: Switch the return to bool
Julien Grall [Mon, 8 Oct 2018 18:33:40 +0000 (19:33 +0100)]
xen/arm: guest_walk_tables: Switch the return to bool

At the moment, guest_walk_tables can return 0, -EFAULT or -EINVAL.
The use of the last 2 is not clearly defined and they are used
inconsistently in the code. The only current caller does not care about
the return value, and its usefulness seems very limited (no way to
differentiate between the 15ish error paths).

So switch to bool to simplify the return and make the developer's life a
bit easier.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry
Julien Grall [Mon, 8 Oct 2018 18:33:39 +0000 (19:33 +0100)]
xen/arm: Allow lpae_is_{table, mapping} helpers to work on invalid entry

Currently, lpae_is_{table, mapping} helpers will always return false on
entries with the valid bit unset. However, it would be useful to have them
operate on any entry, for instance to store information in advance but
still request a fault.

With that change, the p2m is now providing an overlay for *_is_{table,
mapping} that will check the valid bit of the entry.
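
The split can be pictured as follows. The entry layout is deliberately
simplified (one valid bit, one table bit) and the p2m-side name is an
assumption for the example, not necessarily the name used in the tree.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified, hypothetical entry layout: bit 0 = valid, bit 1 = table. */
typedef struct { uint64_t bits; } pte_t;

#define PTE_VALID (1ULL << 0)
#define PTE_TABLE (1ULL << 1)

/* Generic helper: looks only at the type bit, valid or not. */
bool lpae_is_table(pte_t pte, unsigned int level)
{
    return (level < 3) && (pte.bits & PTE_TABLE);
}

/* P2M overlay: additionally requires the valid bit. */
bool p2m_is_table(pte_t pte, unsigned int level)
{
    return (pte.bits & PTE_VALID) && lpae_is_table(pte, level);
}
```

An entry with the table bit set but the valid bit clear is a table for
the generic helper and not a table for the p2m overlay.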

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agoxen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry
Julien Grall [Mon, 8 Oct 2018 18:33:38 +0000 (19:33 +0100)]
xen/arm: Introduce helpers to get/set an MFN from/to an LPAE entry

The new helpers make it easier to read the code by abstracting the way to
set/get an MFN from/to an LPAE entry. The helpers are using "walk" as the
bits are common across different LPAE stages.

At the same time, use the new helpers to replace the various open-coded
instances.
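
For illustration, accessors of this kind could look like the sketch
below. It uses a plain 64-bit value with the output address assumed to
sit in bits 12 to 47, whereas the real code works on the lpae_t
bitfields; the names and masks are hypothetical.

```c
#include <stdint.h>

typedef struct { uint64_t bits; } entry_t;

#define ENTRY_PAGE_SHIFT 12
/* Assumed position of the output address: bits [47:12]. */
#define ENTRY_ADDR_MASK  (((1ULL << 48) - 1) & ~((1ULL << ENTRY_PAGE_SHIFT) - 1))

/* Extract the frame number from an entry. */
uint64_t entry_get_mfn(entry_t e)
{
    return (e.bits & ENTRY_ADDR_MASK) >> ENTRY_PAGE_SHIFT;
}

/* Store a frame number, leaving all non-address bits untouched. */
void entry_set_mfn(entry_t *e, uint64_t mfn)
{
    e->bits = (e->bits & ~ENTRY_ADDR_MASK) |
              ((mfn << ENTRY_PAGE_SHIFT) & ENTRY_ADDR_MASK);
}
```

Callers then never need to know where the address bits live, which is
the point of replacing the open-coded accesses.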

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
6 years agox86emul: suppress default test harness build with incapable compiler
Jan Beulich [Thu, 22 Nov 2018 13:31:06 +0000 (14:31 +0100)]
x86emul: suppress default test harness build with incapable compiler

A top level "make build", as used e.g. by osstest, wants to build all
"all" targets in enabled tools subdirectories, which by default also
includes the emulator test harness. The use of, in particular, AVX512
insns in, again in particular, test_x86_emulator.c causes this build to
fail though when the compiler is not new enough. Take a big hammer and
suppress the default harness build altogether when any of the extensions
used is not supported by the specified (or defaulted to) compiler.

Leave the "run" target alone though: While some of the test code blobs
may fail to build with older compilers, as long as the main executable
can be built some limited testing can still be done.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agox86/dom0: use MEMF_no_scrub during Dom0 construction
Sergey Dyasli [Thu, 22 Nov 2018 13:30:14 +0000 (14:30 +0100)]
x86/dom0: use MEMF_no_scrub during Dom0 construction

Now that idle scrub is the default option, all memory is marked as dirty
and alloc_domheap_pages() will do eager scrubbing by default. This can
lead to longer Dom0 construction and potentially to a watchdog timeout,
especially on older H/W (e.g. Harpertown).

Pass MEMF_no_scrub to optimise this process since there is little point
in scrubbing memory for Dom0.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years agocredit2: during scheduling, update the idle mask before using it
Dario Faggioli [Thu, 22 Nov 2018 11:54:56 +0000 (11:54 +0000)]
credit2: during scheduling, update the idle mask before using it

Load balancing, when happening, at the end of a "scheduler epoch", can
trigger vcpu migration, which in its turn may call runq_tickle(). If the
cpu where this happens was idle, but we're now going to schedule a vcpu
on it, let's update the runq's idle cpus mask accordingly _before_ doing
load balancing.

Not doing that, in fact, may cause runq_tickle() to think that the cpu
is still idle, and tickle it to go pick up a vcpu from the runqueue,
which might be wrong/unideal.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agoautomation: make clean between builds
Wei Liu [Wed, 21 Nov 2018 16:28:10 +0000 (16:28 +0000)]
automation: make clean between builds

Currently randconfig tests are more likely to fail than to succeed
because of a bug in xen's build system: symbols-dummy.o's dependency
is wrong, which causes it to not get rebuilt between runs, which
eventually causes linking to fail. There may also be other corner
cases we haven't discovered.

The fix is not straightforward. For now, make sure the tree is cleaned
properly between builds so we don't see random failures in Gitlab CI.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years agoxen: sched: Credit2: avoid looping too much (over runqueues) during load balancing
Dario Faggioli [Wed, 21 Nov 2018 15:44:53 +0000 (15:44 +0000)]
xen: sched: Credit2: avoid looping too much (over runqueues) during load balancing

For doing load balancing between runqueues, we check the load of each
runqueue, select the one more "distant" than our own load, and then take
the proper runq lock and attempt vcpu migrations.

If we fail to take such lock, we try again, and the idea was to give up
and bail if, during the checking phase, we can't take the lock of any
runqueue (check the comment near to the 'goto retry;', in the middle of
balance_load())

However, the variable that controls the "give up and bail" part, is not
reset upon retries. Therefore, provided we did manage to check the load of
at least one runqueue during the first pass, if we can't get any runq lock,
we don't bail, but we try again taking the lock of that same runqueue
(and that may even happen more than once).
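
A toy model of the retry logic with the controlling variable reset on
every pass, which is the behaviour the description implies is wanted.
Nothing here is taken from balance_load(): the trylock is simulated by a
per-runqueue budget of successful attempts, "most distant" is simplified
to "highest load", and max_passes is an extra bound added for the sketch.

```c
#include <stdbool.h>

/* Simulated trylock: succeeds budget[i] times, then fails. */
static bool try_lock(int budget[], int i)
{
    if ( budget[i] > 0 )
    {
        budget[i]--;
        return true;
    }
    return false;
}

/* Returns the index of the runqueue picked, or -1 if we gave up. */
int balance_pick(const int load[], int budget[], int nr, int max_passes)
{
    for ( int pass = 0; pass < max_passes; pass++ )
    {
        int best = -1;   /* reset on every pass */

        for ( int i = 0; i < nr; i++ )
        {
            if ( !try_lock(budget, i) )
                continue;          /* could not inspect this runqueue */
            if ( best < 0 || load[i] > load[best] )
                best = i;
        }

        if ( best < 0 )
            return -1;             /* no runqueue could be checked: bail */

        if ( try_lock(budget, best) )
            return best;           /* got the lock of the chosen runqueue */
        /* otherwise go round again */
    }

    return -1;
}
```

If "best" were declared outside the loop, a pass in which no lock could
be taken would still see the stale value from the first pass and keep
retrying the same runqueue instead of bailing.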

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agox86/mem_access: move p2m_mem_access_sanity_check() from header
Razvan Cojocaru [Wed, 21 Nov 2018 09:55:21 +0000 (10:55 +0100)]
x86/mem_access: move p2m_mem_access_sanity_check() from header

Move p2m_mem_access_sanity_check() from the asm-x86/mem_access.h
header, where it currently is declared inline, to
arch/x86/mm/mem_access.c. This allows source code that includes it
directly, or indirectly (such as xen/mem_access.h), to not worry
about also including sched.h for is_hvm_domain(). Including
xen/mem_access.h is useful for code wanting to use p2m_access_t.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
6 years agox86: correct instances of PGC_allocated clearing
Jan Beulich [Wed, 21 Nov 2018 09:54:05 +0000 (10:54 +0100)]
x86: correct instances of PGC_allocated clearing

For domain heap pages assigned to a domain, dropping the page reference
tied to PGC_allocated may not drop the last reference, as otherwise the
test_and_clear_bit() might already act on an unowned page.

Work around this where possible, but the need to acquire extra page
references is a fair hint that references should have been acquired in
other places instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
6 years agox86/shadow: un-hide "full" auditing code
Jan Beulich [Wed, 21 Nov 2018 09:53:14 +0000 (10:53 +0100)]
x86/shadow: un-hide "full" auditing code

In particular sh_oos_audit() has become stale due to changes elsewhere,
and the need for adjustment was not noticed because both "full audit"
flags are off in both release and debug builds. Switch away from pre-
processor conditionals, thus exposing the code to the compiler at all
times. This obviously requires correcting the accumulated issues with
the so far hidden code.

Note that shadow_audit_tables() now also gains an effect with "full
entry audit" mode disabled; the prior code structure suggests that this
was originally intended anyway.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years agoretpoline: disable jump tables
Norbert Manthey [Wed, 21 Nov 2018 09:52:05 +0000 (10:52 +0100)]
retpoline: disable jump tables

To mitigate Spectre v2, Xen has been fixed with a software fix, namely
using retpoline sequences generated by the compiler. This way, indirect
branches are protected against the attack.

However, the retpoline sequence comes with a slow down. To make up for
this, we propose to avoid jump tables in the first place. Without the
retpoline sequences, this code would be less efficient. However, when
retpoline is enabled, this actually results in a slight performance
improvement.

This change might become irrelevant once the compiler starts avoiding
jump tables in case retpolines are used:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
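
For illustration, a dense switch such as the one below is the kind of
construct a compiler may lower to an indirect jump through a table. The
function is invented for the example. GCC documents -fno-jump-tables as
an option that suppresses this lowering; that this is the option used
here is an assumption, as the description above does not name it.

```c
/*
 * Hypothetical dispatcher. With jump tables enabled the compiler may
 * emit a table of code addresses indexed by 'op' plus one indirect
 * branch; with them disabled it emits a sequence of compares and direct
 * branches instead, so no retpoline is needed on this path.
 */
int handle_op(int op, int x)
{
    switch ( op )
    {
    case 0:  return x + 1;
    case 1:  return x * 2;
    case 2:  return x - 3;
    case 3:  return -x;
    case 4:  return x ^ 0x55;
    default: return 0;
    }
}
```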

Reported-by: Julian Stecklina <jsteckli@amazon.de>
Reported-by: Pawel Wieczorkiewicz <wipawel@amazon.de>
Signed-off-by: Norbert Manthey <nmanthey@amazon.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agoiommu / p2m: add a page_order parameter to iommu_map/unmap_page()...
Paul Durrant [Wed, 21 Nov 2018 09:50:29 +0000 (10:50 +0100)]
iommu / p2m: add a page_order parameter to iommu_map/unmap_page()...

...and re-name them to iommu_map/unmap() since they no longer necessarily
operate on a single page.

The P2M code currently contains many loops to deal with the fact that,
while it may be required to handle page orders greater than 0, the
IOMMU map and unmap functions do not.
This patch adds a page_order parameter to those functions and implements
the necessary loops within. This allows the P2M code to be substantially
simplified.
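
The shape of the change can be sketched as below: the caller passes an
order and the per-page loop lives inside the map function. The names,
the call counter and the single-page primitive are stand-ins for the
example and do not reflect the real signatures.

```c
/* Counts single-page operations so the loop is observable. */
static unsigned long pages_mapped;

/* Hypothetical single-page primitive, standing in for the vendor hook. */
static int map_one_page(unsigned long dfn, unsigned long mfn)
{
    (void)dfn;
    (void)mfn;
    pages_mapped++;
    return 0;
}

/* Maps 2^page_order contiguous pages, stopping at the first error. */
int iommu_map_sketch(unsigned long dfn, unsigned long mfn,
                     unsigned int page_order)
{
    for ( unsigned long i = 0; i < (1UL << page_order); i++ )
    {
        int rc = map_one_page(dfn + i, mfn + i);

        if ( rc )
            return rc;
    }

    return 0;
}
```

A caller handling a 2M mapping then issues one call with page_order 9
instead of open-coding a 512-iteration loop.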

This patch also adds emacs boilerplate to xen/iommu.h to avoid tabbing
problems.

NOTE: This patch does not modify the underlying vendor IOMMU
      implementations to deal with more than a single page at once.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years agoautomation: add qemu smoke test
Wei Liu [Mon, 19 Nov 2018 16:32:15 +0000 (16:32 +0000)]
automation: add qemu smoke test

This patch introduces a new test stage into the pipeline and provides
a simple QEMU based smoke test.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years agoautomation: also specify xen binary as artifact on x86_64
Wei Liu [Mon, 19 Nov 2018 15:03:58 +0000 (15:03 +0000)]
automation: also specify xen binary as artifact on x86_64

... so that it can be passed on to test stage.

Note that xen is only extracted for x86_64 build since others may not
have that. Use a directory to account for possibly different file
names on Arm.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years agoautomation: stash default config file for artifact extraction
Wei Liu [Mon, 19 Nov 2018 15:03:04 +0000 (15:03 +0000)]
automation: stash default config file for artifact extraction

This aids troubleshooting when we notice a failure in the default
configuration.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years agoautomation: introduce CONTAINER_NO_PULL for containerize
Wei Liu [Mon, 19 Nov 2018 12:11:48 +0000 (12:11 +0000)]
automation: introduce CONTAINER_NO_PULL for containerize

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years agoautomation: fix debian-{stretch,unstable}-32-gcc-debug
Wei Liu [Tue, 20 Nov 2018 14:10:02 +0000 (14:10 +0000)]
automation: fix debian-{stretch,unstable}-32-gcc-debug

They should have used .gcc-x86-32-build-debug in the first place.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agotools/helpers: make gen_stub_json_config accept an UUID argument
Wei Liu [Wed, 14 Nov 2018 18:17:31 +0000 (18:17 +0000)]
tools/helpers: make gen_stub_json_config accept an UUID argument

If that's set, the stub is going to contain that UUID.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years agotools: update examples/README
Wei Liu [Wed, 14 Nov 2018 18:17:30 +0000 (18:17 +0000)]
tools: update examples/README

This file gets installed to the host system.

This patch cleans it up: 1. remove things that don't exist anymore; 2.
change xm to xl; 3. fix xen-devel list address; 4. add things that are
missing; 5. delete trailing whitespaces.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years agobump XEN_DOMCTL_INTERFACE_VERSION
Olaf Hering [Tue, 20 Nov 2018 14:15:32 +0000 (15:15 +0100)]
bump XEN_DOMCTL_INTERFACE_VERSION

Without this change valgrind can not decide what variant of
xen_domctl_createdomain is provided as input.

Fixes commit 4a83497635 ("xen/domctl: Merge set_max_evtchn into createdomain")
Fixes commit a903bf5233 ("tools: Pass grant table limits to XEN_DOMCTL_set_gnttab_limits")
Fixes commit ae8b8bc599 ("xen/domctl: Remove XEN_DOMCTL_set_gnttab_limits")
Fixes commit 4737fa52ce ("tools: Pass max_vcpus to XEN_DOMCTL_createdomain")

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86emul: use simd_128 also for legacy vector shift insns
Jan Beulich [Tue, 20 Nov 2018 14:14:55 +0000 (15:14 +0100)]
x86emul: use simd_128 also for legacy vector shift insns

This eliminates a separate case block here, and allows getting away with
fewer new ones when adding AVX512 vector shifts.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86emul: support AVX512{F,BW} packed integer arithmetic insns
Jan Beulich [Tue, 20 Nov 2018 14:13:54 +0000 (15:13 +0100)]
x86emul: support AVX512{F,BW} packed integer arithmetic insns

Note: vpadd* / vpsub* et al are put at seemingly the wrong slot of the
big switch(). This is in anticipation of adding e.g. vpunpck* to those
groups (see the legacy/VEX encoded case labels nearby to support this).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86emul: support AVX512{F,BW} packed integer compare insns
Jan Beulich [Tue, 20 Nov 2018 14:13:17 +0000 (15:13 +0100)]
x86emul: support AVX512{F,BW} packed integer compare insns

Include VPTEST{,N}M{B,D,Q,W} as once again possibly used by the compiler
for comparison against all-zero vectors.

Also table entries for a few more insns get their .d8s field set right
away, again in order to not split and later re-combine the groups.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86emul: support AVX512F v{,u}comis{d,s} insns
Jan Beulich [Tue, 20 Nov 2018 14:12:38 +0000 (15:12 +0100)]
x86emul: support AVX512F v{,u}comis{d,s} insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86emul: support AVX512{F,DQ} FP broadcast insns
Jan Beulich [Tue, 20 Nov 2018 14:11:50 +0000 (15:11 +0100)]
x86emul: support AVX512{F,DQ} FP broadcast insns

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>