xenbits.xensource.com Git - people/dariof/xen.git/log
people/dariof/xen.git
Jan Beulich [Thu, 22 Mar 2018 09:38:02 +0000 (10:38 +0100)]
x86emul: tell cmpxchg hook whether LOCK is in effect

This is necessary for the hook to correctly perform the operation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Jan Beulich [Thu, 22 Mar 2018 09:37:26 +0000 (10:37 +0100)]
x86/HVM: eliminate custom #MF/#XM handling

Use the generic stub exception handling instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 22 Mar 2018 09:36:55 +0000 (10:36 +0100)]
x86emul: adjust_bnd() should check XCR0

Experimentally MPX instructions have been confirmed to behave as NOPs
unless both related XCR0 bits are set to 1. By implication branches
then also don't clear BNDn.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 22 Mar 2018 09:35:50 +0000 (10:35 +0100)]
x86emul: abstract out XCRn accesses

Use hooks, just like done for other special purpose registers.

This includes moving XCR0 checks from hvmemul_get_fpu() to the emulator
itself as well as adding support for XGETBV emulation.

For now fuzzer reads will obtain the real values (minus the fuzzing of
the hook pointer itself).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com> [tracing parts]
Doug Goldstein [Thu, 15 Mar 2018 15:54:04 +0000 (10:54 -0500)]
ci: add new bits to MAINTAINERS combine with Travis

Created a new section just called 'CI' since this is adding GitLab CI
and still leaving the old Travis CI files around. This consolidates the
two sections and adds the new files as well as adding another Travis
file that was missing.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Sun, 11 Mar 2018 06:08:50 +0000 (00:08 -0600)]
ci: use GitLab CI to build

Added a GitLab CI config which has a lot more flexibility to allow us to
test a lot more distro configurations than Travis can and even build
test on FreeBSD. This includes a modified copy of scripts/travis-build
that is expected to diverge further over time as we build more than what
Travis is currently building.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Wed, 14 Mar 2018 16:23:31 +0000 (11:23 -0500)]
ci: add Dockerfile for Debian stretch

Added a Dockerfile which captures all the necessary dependencies to
build Xen on a Debian stretch system.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Tue, 13 Mar 2018 02:32:27 +0000 (21:32 -0500)]
ci: add Dockerfile for Debian jessie

Added a Dockerfile which captures all the necessary dependencies to
build Xen on a Debian jessie system.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Mon, 12 Mar 2018 17:45:00 +0000 (12:45 -0500)]
ci: add Dockerfile for Ubuntu 16.04

Added a Dockerfile which captures all the necessary dependencies to
build Xen on an Ubuntu 16.04 system.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Mon, 12 Mar 2018 17:41:33 +0000 (12:41 -0500)]
ci: add Dockerfile for Ubuntu 14.04

Added a Dockerfile which captures all the necessary dependencies to
build Xen on an Ubuntu 14.04 system.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Mon, 12 Mar 2018 17:40:45 +0000 (12:40 -0500)]
ci: add Dockerfile for CentOS 7.2

Added a Dockerfile which captures all the necessary dependencies to
build Xen on a CentOS 7.2 system.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Tue, 13 Mar 2018 03:15:07 +0000 (22:15 -0500)]
ci: add README and makefile for containers

Add a basic README explaining the containers and how people can use them
to test locally if they see an error in CI and want to reproduce it.
Also add a makefile to help with building and pushing the containers to
the container registry.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Joe Jin [Wed, 14 Mar 2018 17:14:03 +0000 (10:14 -0700)]
xenbaked.c: Avoid divide by zero issue

In dump_stats() in xenbaked.c, run_time = time(&end_time) - time(&start_time),
and time() returns its value in seconds. If xenmon.py is cancelled
immediately after being started, run_time can be zero, and xenbaked then
hits a divide-by-zero fault.
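
The failure mode described above can be avoided with a zero check before
dividing. The following is only a minimal sketch of that idea, not the actual
xenbaked.c change: per_second() and its parameters are hypothetical names
introduced for illustration.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper (not taken from xenbaked.c): average a counter
 * over the elapsed run time. time() has one-second resolution, so a
 * run that is cancelled immediately can have an elapsed time of zero.
 */
static long per_second(long total, time_t start, time_t end)
{
    time_t run_time = end - start;

    assert(total >= 0);

    /* Guard the integer division: report 0 instead of faulting. */
    if (run_time <= 0)
        return 0;

    return total / (long)run_time;
}
```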

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Paul Durrant [Tue, 20 Mar 2018 18:05:25 +0000 (18:05 +0000)]
x86/hvm: add stricter permissions checks to ioreq server control plane

There has always been an intention in the ioreq server API that only the
domain that creates an ioreq server should be able to manipulate it.
However, so far, nothing has enforced this. This means that two domains
with DM_PRIV over a target domain can currently manipulate each other's
ioreq servers.

A previous patch added code to take a reference and store a pointer to the
domain that creates an ioreq server. This patch now adds checks to the
functions that manipulate the ioreq server to make sure they are being
called by the same domain.
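
As an illustration of the kind of ownership check described above, here is a
small sketch. It is not the Xen implementation: the structures, the
ioreq_server_set_state() function and the choice of -EPERM as the error value
are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for the real structures (assumed for this sketch). */
struct domain {
    int domain_id;
};

struct ioreq_server {
    struct domain *emulator;   /* domain that created the server */
    int enabled;
};

/*
 * Hypothetical control-plane operation: only the domain that created
 * the server may change its state; any other caller is refused.
 */
static int ioreq_server_set_state(struct ioreq_server *s,
                                  const struct domain *caller,
                                  int enabled)
{
    assert(s != NULL && caller != NULL);

    if (s->emulator != caller)
        return -EPERM;

    s->enabled = enabled;
    return 0;
}
```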

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Paul Durrant [Tue, 20 Mar 2018 18:05:24 +0000 (18:05 +0000)]
x86/hvm: re-structure some of the ioreq server look-up loops

This patch is a cosmetic re-structuring of some of the loops which look up
an ioreq server based on target domain and server id.

The restructuring is done separately here to ease review of a subsequent
patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Paul Durrant [Tue, 20 Mar 2018 18:05:23 +0000 (18:05 +0000)]
x86/hvm: take a reference on ioreq server emulating domain

When an ioreq server is created the code currently stores the id
of the emulating domain, but does not take a reference on that domain.

This patch modifies the code to hold a reference for the lifetime of the
ioreq server.

NOTE: ioreq servers are either destroyed explicitly or destroyed implicitly
      in context of XEN_DOMCTL_destroydomain.
      If the emulating domain is shut down prior to the target then
      any domain reference held by an ioreq server will prevent it from
      being destroyed. However, if an emulating domain is shut down prior
      to its target then it is likely that the target's vcpus will block
      fairly quickly waiting for emulation that will never occur, and when
      the target domain is destroyed the reference on the zombie emulating
      domain will be dropped allowing both to be cleaned up.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Paul Durrant [Tue, 20 Mar 2018 18:05:22 +0000 (18:05 +0000)]
x86/hvm: stop passing explicit domid to hvm_create_ioreq_server()

Only in the legacy 'default server' case do we pass anything other than
current->domain->domain_id, and in that case we pass the value of
HVM_PARAM_DM_DOMAIN.

The only known user of HVM_PARAM_DM_DOMAIN is qemu-trad (and only when
compiled as a stubdom), which always sets it to DOMID_SELF (ignoring the
return value of xc_set_hvm_param) [1] and never reads it.

This patch:

- Disallows setting HVM_PARAM_DM_DOMAIN to anything other than DOMID_SELF
  and removes the call to hvm_set_dm_domain().
- Stops passing a domid to hvm_create_ioreq_server()
- Changes hvm_create_ioreq_server() to always set
  current->domain->domain_id as the domid of the emulating domain
- Removes the hvm_set_dm_domain() implementation since it is no longer
  needed.

[1] http://xenbits.xen.org/gitweb/?p=qemu-xen-traditional.git;a=blob;f=hw/xen_machine_fv.c;#l299

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Dario Faggioli [Wed, 21 Mar 2018 17:17:47 +0000 (17:17 +0000)]
xen: sched: simplify (and speedup) checking soft-affinity

Whether or not a vCPU has a soft-affinity which is
effective, i.e., one with the power of actually
affecting the scheduling of the vCPU itself, rarely
changes. Very, very rarely, as compared to how often
we need to check for the same thing (basically, at
every scheduling decision!).

That can be improved by storing in a (per-vCPU) flag
(it's actually a boolean field in struct vcpu) whether
or not, given what hard-affinity and soft-affinity
look like, soft-affinity should be taken into
account during scheduling decisions.

This saves some cpumask manipulations, which is nice,
considering how frequently they were being done. Note
that we can't get rid of 100% of the cpumask operations
involved in the check, because whether soft-affinity is
effective depends not only on the relationship
between the hard and soft-affinity masks of a vCPU, but
also on the online pCPUs and/or on what pCPUs are part
of the cpupool where the vCPU lives, and that's rather
impractical to store in a per-vCPU flag. Still the
overhead is reduced to "just" one cpumask_subset() (and
only if the newly introduced flag is 'true')!
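
The caching scheme described in this message can be sketched roughly as
follows. This is an illustration under stated assumptions, not the code from
the patch: cpumasks are modelled as 64-bit integers, and the type, field and
helper names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t cpumask;          /* toy mask: bit n stands for pCPU n */

struct vcpu {
    cpumask hard;                  /* hard-affinity */
    cpumask soft;                  /* soft-affinity */
    bool soft_aff_effective;       /* cached part of the check */
};

static bool mask_subset(cpumask a, cpumask b)     { return (a & ~b) == 0; }
static bool mask_intersects(cpumask a, cpumask b) { return (a & b) != 0; }

/* Slow path: recompute the flag only when an affinity mask changes. */
static void update_soft_affinity(struct vcpu *v)
{
    assert(v != NULL);
    v->soft_aff_effective = !mask_subset(v->hard, v->soft) &&
                            mask_intersects(v->soft, v->hard);
}

/*
 * Hot path: one subset test against the pool's online pCPUs, and only
 * when the cached flag says soft-affinity could matter at all.
 */
static bool has_soft_affinity(const struct vcpu *v, cpumask pool_online)
{
    assert(v != NULL);
    return v->soft_aff_effective && !mask_subset(pool_online, v->soft);
}
```

The part that depends on the online pCPUs of the cpupool stays in the hot
path, since that cannot sensibly be cached per vCPU.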

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Wed, 21 Mar 2018 17:17:46 +0000 (17:17 +0000)]
xen: sched: improve checking soft-affinity

The function has_soft_affinity() determines whether the soft-affinity
of a vcpu will have any effect -- that is, whether the affinity will
have any difference, scheduling-wise, from an empty soft-affinity
mask.

Such function takes a custom cpumask as its third parameter for better
flexibility; but that mask is different from the vCPU's hard-affinity
only in one case. Getting rid of that parameter not only simplifies
the function, but enables optimizing the soft affinity check.

It's mostly mechanical, with the exception of
sched_credit.c:_csched_cpu_pick(), which was the one case where we
passed in something other than the existing hard-affinity.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Wed, 21 Mar 2018 17:17:45 +0000 (17:17 +0000)]
xen: sched: optimize exclusive pinning case (Credit1 & 2)

Exclusive pinning of vCPUs is used, sometimes, for
achieving the highest level of determinism, and the
least possible overhead, for the vCPUs in question.

Although static 1:1 pinning is not recommended, for
general use cases, optimizing the tickling code (of
Credit1 and Credit2) is easy and cheap enough, so go
for it.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Wed, 21 Mar 2018 17:17:44 +0000 (17:17 +0000)]
xen: sched: introduce 'adjust_affinity' hook.

For now, just as a way to give a scheduler a "heads up"
about the fact that the affinity changed.

This enables some optimizations, such as pre-computing
and storing (e.g., in flags) facts like a vcpu being
exclusively pinned to a pcpu, or having a soft
affinity or not. I.e., conditions that, despite the fact
that they rarely change, are right now checked very
frequently, even in hot paths.

Note that, as we expect many scheduler specific
implementations of the adjust_affinity hook to do
something with the per-scheduler vCPU private data,
this commit moves the calls to sched_set_affinity()
after that is allocated (in sched_init_vcpu()).

Note also that this, in future, may turn out to be a useful
means for, e.g., having the schedulers vet, ack or nack
the changes themselves.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Julien Grall [Wed, 21 Mar 2018 03:34:35 +0000 (03:34 +0000)]
xen/arm: gic: Read unconditionally the source from the LRs

Commit 5cb00d1 "ARM: GIC: extend LR read/write functions to cover EOI
and source" extended gic_lr to cover the source. The new field was only
set for SGI interrupts in the read function. However, the write function
writes the field unconditionally for virtual interrupts.

This means that if the caller combines the 2 functions (e.g. to
update the LR), the source needs to be set to 0 by the caller.
Unfortunately, gic_update_one_lr is not zeroing the structure before
reading the LRs. This can trigger the assert randomly.

Instead of zeroing the structure in gic_update_one_lr, make sure that
the source is written unconditionally on read. This also simplifies
the code by avoiding an if statement in the read path.

Lastly, properly update the comment in write_lr, which mistakenly
referred to the read LR path.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Dario Faggioli [Thu, 15 Mar 2018 17:51:46 +0000 (18:51 +0100)]
xen/libxc: suppress direct access to Credit1's migration delay

Removes special purpose access to Credit1 vCPU
migration delay parameter.

This fixes a build breakage, occurring when Xen
is configured with SCHED_CREDIT=n.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Thu, 15 Mar 2018 17:51:38 +0000 (18:51 +0100)]
tools: xenpm: continue to support {set, get}-vcpu-migration-delay

Now that it is possible to get and set the migration
delay via the SCHEDOP sysctl, use that in xenpm, instead
of the special purpose libxc interface (which will be
removed in a following commit).

The sysctl, however, requires a cpupool-id argument,
to know which scheduler it is operating on. In
this case, since we don't want to alter xenpm's command
line interface, we always use '0', which means xenpm
will always act on the default cpupool ('Pool-0').

From this commit on, `xenpm {set,get}-vcpu-migration-delay'
commands work again. But that is only for the sake of
backward compatibility, and their use is deprecated, in
favour of 'xl sched-credit -s [-c <poolid>] -m <delay>'.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Dario Faggioli [Thu, 15 Mar 2018 17:51:30 +0000 (18:51 +0100)]
tools: libxl/xl: allow to get/set Credit1's vcpu_migration_delay

Make it possible to get and set a (Credit1) scheduler's
vCPU migration delay via the SCHEDOP sysctl, from both
libxl and xl (no change needed in libxc).

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Dario Faggioli [Thu, 15 Mar 2018 17:51:23 +0000 (18:51 +0100)]
xen: sched/credit1: make vcpu_migration_delay per-cpupool

Right now, vCPU migration delay is controlled by
the vcpu_migration_delay boot parameter. This means
the same value will always be used for every instance
of Credit1, in any cpupool that will be created.

Also, in order to get and set such value, a special
purpose libxc interface is defined, and used by the
xenpm tool. And this is problematic if Xen is built
without Credit1 support.

This commit adds a vcpu_migr_delay field inside
struct csched_private, so that we can get/set the
migration delay independently for each Credit1 instance,
in different cpupools.

Getting and setting now happens via XEN_SYSCTL_SCHEDOP_*,
which is much better suited for this parameter.

The value of the boot time parameter is used for
initializing the vcpu_migr_delay field of the private
structure of all the scheduler instances, when they're
created.

While there, save reading NOW() and doing any s_time_t
operation, when the migration delay of a scheduler is
zero (as it is, by default), in
__csched_vcpu_is_cache_hot().
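
The shortcut mentioned above (not reading the clock when the delay is zero)
could look something like the sketch below. It is not the actual
__csched_vcpu_is_cache_hot() code: the structure, the fake NOW() clock with a
call counter, and the function name are assumptions used only to make the
example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t s_time_t;

/* Fake clock for the sketch; counts how often it is read. */
static unsigned int now_calls;
static s_time_t fake_now = 1000;

static s_time_t NOW(void)
{
    ++now_calls;
    return fake_now;
}

/* Per-scheduler-instance (i.e. per-cpupool) private data. */
struct sched_priv {
    s_time_t vcpu_migr_delay;      /* 0 means "no migration delay" */
};

static bool vcpu_is_cache_hot(const struct sched_priv *prv, s_time_t last_run)
{
    assert(prv != NULL);

    /*
     * Short-circuit evaluation: with a zero delay, NOW() is never
     * called and no time arithmetic is done.
     */
    return prv->vcpu_migr_delay != 0 &&
           (NOW() - last_run) < prv->vcpu_migr_delay;
}
```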

Finally, note that, from this commit on, using `xenpm
{set,get}-vcpu-migration-delay' will have no effect
any longer. A subsequent commit will re-enable it, for
the sake of backwards-compatibility.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Doug Goldstein [Wed, 28 Feb 2018 19:18:44 +0000 (13:18 -0600)]
xen/tools: support Python 2 and Python 3

These changes should make it possible to support modern Pythons as well
as the oldest Python 2 still supported.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Wed, 28 Feb 2018 19:18:43 +0000 (13:18 -0600)]
README: require Python 2.4 or newer

Increase the minimum required Python to 2.4 or newer.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Doug Goldstein [Tue, 20 Mar 2018 10:23:29 +0000 (11:23 +0100)]
fix null sched build with clang and debug=n

The null_dom() static inline is just used when debug=y so with clang it
results in an error with the default CFLAGS and debug=n. This function
is used in only one place and is a one-line helper, so remove it until we
actually need it.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Dario Faggioli <dfaggioli@suse.com>
David E. Box [Tue, 20 Mar 2018 10:21:58 +0000 (11:21 +0100)]
x86/mwait-idle: add Gemini Lake support

Gemini Lake uses the same C-states as Broxton and also uses the
IRTL MSRs to determine maximum C-state latency.

Signed-off-by: David E. Box <david.e.box@linux.intel.com>
Acked-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit 1b2e87687d3f951a66900cab6f1583d94099d2f7]
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Julien Grall [Thu, 15 Mar 2018 20:30:13 +0000 (20:30 +0000)]
ARM: GIC: extend LR read/write functions to cover EOI and source

So far our LR read/write functions do not handle the EOI bit and the
source CPUID bits in an LR, because the current VGIC implementation does
not use them.
Extend the gic_lr data structure to hold these bits of information by
using a union to differentiate the fields used depending on whether the vIRQ
has a corresponding pIRQ.

This allows the new VGIC to use this information.
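
A possible shape for such a structure is sketched below. This is an
illustration of the union idea only, not the definition from the patch: the
structure name, the named union member `u` and the two constructor helpers are
assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct gic_lr_sketch {
    uint32_t virq;
    uint8_t priority;
    bool active;
    bool pending;
    bool hw_status;                /* true: the vIRQ is backed by a pIRQ */
    union {
        struct {                   /* meaningful only if hw_status */
            uint32_t pirq;
        } hw;
        struct {                   /* meaningful only if !hw_status */
            bool eoi;
            uint8_t source;        /* source CPUID, used for SGIs */
        } virt;
    } u;
};

/* LR for a virtual interrupt that has a corresponding physical one. */
static struct gic_lr_sketch make_hw_lr(uint32_t virq, uint32_t pirq)
{
    struct gic_lr_sketch lr = { .virq = virq, .pending = true,
                                .hw_status = true };

    lr.u.hw.pirq = pirq;
    return lr;
}

/* LR for a purely virtual SGI carrying the sending CPU's ID. */
static struct gic_lr_sketch make_sgi_lr(uint32_t virq, uint8_t source_cpu)
{
    struct gic_lr_sketch lr = { .virq = virq, .pending = true,
                                .hw_status = false };

    assert(virq < 16);             /* SGIs are interrupt IDs 0-15 */
    lr.u.virt.eoi = false;
    lr.u.virt.source = source_cpu;
    return lr;
}
```

Because the two groups of fields share storage, a reader has to check
hw_status before deciding which side of the union to look at.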

This is based on the original patch sent by Andre Przywara [1].

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-03/msg00435.html

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Thu, 15 Mar 2018 20:30:12 +0000 (20:30 +0000)]
xen/arm: GIC: Only set pirq in the LR when hw_status is set

The field pirq should only be valid when the virtual interrupt
is associated with a physical interrupt.

This change will help to extend gic_lr to support fields specific to
virtual interrupts (e.g. eoi, source) that clash with the PIRQ field.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Thu, 15 Mar 2018 20:30:11 +0000 (20:30 +0000)]
xen/arm: gic: Split the field state in gic_lr in 2 fields active and pending

Mostly making the code nicer to read.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Thu, 15 Mar 2018 20:30:10 +0000 (20:30 +0000)]
xen/arm: gic: Use bool instead of uint8_t for the hw_status in gic_lr

hw_status can only be 1 or 0. So convert to a bool.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Thu, 15 Mar 2018 20:30:09 +0000 (20:30 +0000)]
xen/arm: vgic: Override the group in lr everytime

At the moment, write_lr assumes the caller will set the group
correctly. However the group should always be 0 when the guest is using
vGICv2 and 1 for vGICv3. As the caller should not care about the group,
override it directly.

With that change, write_lr is now behaving like update_lr for the group.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Thu, 15 Mar 2018 20:30:08 +0000 (20:30 +0000)]
xen/arm: gic: Fix indentation in gic_update_one_lr

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Andre Przywara [Thu, 15 Mar 2018 20:30:07 +0000 (20:30 +0000)]
ARM: Implement vcpu_kick()

If we change something in a vCPU that affects its runnability or
otherwise needs the vCPU's attention, we might need to tell the scheduler
about it.
We are using this in one place (vIRQ injection) at the moment, but will
need this at more places soon.
So let's factor out this functionality, using the already existing
vcpu_kick() prototype (used in x86 only so far), to make this available
to the rest of the Xen code.
Also adjust the perfcounter name to reflect the new usage.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andre Przywara [Thu, 15 Mar 2018 20:30:06 +0000 (20:30 +0000)]
ARM: VGIC: rename gic_event_needs_delivery()

gic_event_needs_delivery() is not named very intuitively, especially
the gic_ prefix is somewhat misleading.
Rename it to vgic_vcpu_pending_irq(), which makes it clear that this
relates to the virtual GIC and is about interrupts.
Also add a VCPU parameter, which makes the code more flexible in the
future. The current VGIC expects this to be the current VCPU, so add
an assert to spot any regressions.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Andrew Cooper [Mon, 19 Mar 2018 19:13:44 +0000 (19:13 +0000)]
arm/boot: Mark construct_dom0() as __init

Its sole caller, start_xen(), is __init.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Amit Singh Tomar [Sun, 18 Mar 2018 09:20:26 +0000 (14:50 +0530)]
xen/arm: Fix platform name to xilinx_zynqmp from xgene_storm

Signed-off-by: Amit Singh Tomar <amittomer25@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Julien Grall [Mon, 12 Mar 2018 15:34:52 +0000 (15:34 +0000)]
xen/arm: p2m: Prevent deadlock when using memaccess

Commit 7d623b358a4 "arm/mem_access: Add long-descriptor based gpt"
assumed the read-write lock can be taken recursively. However, this
assumption is wrong and will lead to deadlock when the lock is
contended.

The read lock is taken recursively in the following case:
    1) get_page_from_gva
        => Take the read lock (first read lock)
        => Call p2m_mem_access_check_and_get_page on failure when
        memaccess is enabled
    2) p2m_mem_access_check_and_get_page
        => If hardware translation failed, fall back to software lookup
        => Call guest_walk_tables
    3) guest_walk_tables
        => Will use access_guest_memory_by_ipa to access stage-1 page-table
    4) access_guest_memory_by_ipa
        => Because Arm does not have a hardware instruction that does only
        the stage-2 translation, this is done in software.
        => Take the read lock (second read lock)

To avoid the nested lock, rework the locking in get_page_from_gva and
p2m_mem_access_check_and_get_page. The latter will now be called without
the p2m lock. The new locking in p2m_mem_access_check_and_get_page will
not cover the translation of the VA to an IPA.

This is fine because we can't promise that the stage-1 page-tables have
not changed behind our back (they are under guest control). Modification of
the stage-2 page-tables can now happen, but I can't see any potential
issue here except with the break-before-make sequence used when updating
page-tables. gva_to_ipa may fail if the sequence is executed at the same
time on another CPU. In that case we would fall back to the software lookup
path.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Sergej Proskurin <proskurin@sec.in.tum.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Mon, 12 Mar 2018 13:19:35 +0000 (13:19 +0000)]
xen/arm: Relax ARM_SMCCC_ARCH_WORKAROUND_1 discovery

A recent update to the ARM SMCCC_ARCH_WORKAROUND_1 specification (see [1])
allows firmware to return a non-zero, positive value to indicate that
although the mitigation is implemented at the higher exception level,
the CPU on which the call is made is not affected.

Relax the check on the return value from ARM_WORKAROUND_1 so that we
only error out if the returned value is negative.
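
The relaxed check can be pictured with the small sketch below. It only encodes
the three cases named in this message (negative, zero, positive); the enum and
function names are hypothetical, and the interpretation of zero as "mitigation
available and needed" is an assumption based on the text above rather than a
quotation of the specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical classification of the firmware's return value. */
enum wa1_state {
    WA1_UNSUPPORTED,    /* negative: error out */
    WA1_NEEDED,         /* zero: mitigation implemented and required */
    WA1_NOT_AFFECTED    /* positive: implemented, but this CPU is unaffected */
};

static enum wa1_state classify_workaround_1(int32_t ret)
{
    if (ret < 0)
        return WA1_UNSUPPORTED;

    return ret == 0 ? WA1_NEEDED : WA1_NOT_AFFECTED;
}

/* Old check: only ret == 0 was accepted. Relaxed check: any ret >= 0. */
static bool workaround_1_discovered(int32_t ret)
{
    return classify_workaround_1(ret) != WA1_UNSUPPORTED;
}
```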

[1] https://developer.arm.com/support/security-update/downloads
"Firmware interfaces for mitigating CVE-2017-5715 System Software on Arm
Systems"

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Julien Grall [Thu, 8 Mar 2018 15:24:04 +0000 (15:24 +0000)]
xen/arm: Restrict when a physical IRQ can be routed/removed from/to a domain

Xen currently allows routing an interrupt to, or removing it from, a
domain while it is running.

However, we never sync the virtual interrupt state to the physical
interrupt. This could lead to undesirable effects on the vGIC emulation
and potentially the hardware.

One solution would be to sync the interrupt state when routing, but I am
not sure it is worth the effort as you never really know when it is safe to
route/remove the interrupt when a domain is running.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Jan Beulich [Fri, 16 Mar 2018 16:27:36 +0000 (17:27 +0100)]
x86: correct EFLAGS.IF in SYSENTER frame

Commit 9d1d31ad94 ("x86: slightly reduce Meltdown band-aid overhead")
moved the STI past the PUSHF. While this isn't an active problem (as we
force EFLAGS.IF to 1 before exiting to guest context), let's not risk
internal confusion by finding a PV guest frame with interrupts
apparently off.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Thu, 8 Mar 2018 19:24:58 +0000 (19:24 +0000)]
xen/mm: Clean up share_xen_page_with_guest() API

The share_xen_page_with_guest() functions are used by common code, and are
implemented the same by each arch.  Move the declarations into the common mm.h
rather than duplicating them in each arch/mm.h

Turn an int readonly into a boolean enum, to retain ro/rw context at the
callsites, but use shorter labels which avoids a large number of split lines.

Implement share_xen_page_with_privileged_guests() as a static inline wrapper
around share_xen_page_with_guest() to avoid having a call into a separate
translation unit whose only purpose is to shuffle function arguments.
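
The two API changes described above (a two-valued enum instead of an int, and
a static inline wrapper) might look roughly like this. The sketch uses toy
types and made-up names throughout; in particular the enum labels and the
`dom_priv` stand-in domain are assumptions, not the identifiers from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Short labels keep the ro/rw intent visible at the call sites. */
enum share_flags { SHARE_rw, SHARE_ro };

/* Toy stand-ins for the real types (assumed for this sketch). */
struct domain {
    int id;
};

struct page_info {
    struct domain *owner;
    enum share_flags flags;
};

/* Stand-in for the domain that represents privileged guests. */
static struct domain dom_priv = { .id = -1 };

static void share_page_with_guest(struct page_info *page, struct domain *d,
                                  enum share_flags flags)
{
    assert(page != NULL && d != NULL);
    page->owner = d;
    page->flags = flags;
}

/*
 * Static inline wrapper: it only supplies the extra argument, so there
 * is no need for a call into a separate translation unit.
 */
static inline void share_page_with_privileged_guests(struct page_info *page,
                                                     enum share_flags flags)
{
    share_page_with_guest(page, &dom_priv, flags);
}
```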

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Thu, 8 Mar 2018 12:39:36 +0000 (12:39 +0000)]
xen/domain: Pass the full domctl_createdomain struct to create_domain()

In future patches, the structure will be extended with further information,
and this is far cleaner than adding extra parameters.

One minor tweak is that the setting of guest_type needs to be deferred until
config is known-good to dereference, but this doesn't result in any changed
behaviour as system domains never used to pass XEN_DOMCTL_CDF_hvm_guest.

Also for completeness, move the setting of d->handle into the tail of
domain_create() where it more logically should live.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agox86/domain: Optimise the order of actions in arch_domain_create()
Andrew Cooper [Thu, 8 Mar 2018 17:25:29 +0000 (17:25 +0000)]
x86/domain: Optimise the order of actions in arch_domain_create()

The only relevant initialisation for the idle domain is the context switch and
poisoned pointers.  Collect these bits together early in the function and exit
when complete (although as a consequence, the e820 and vtsc lock
initialisation are moved forwards).  This allows us to remove subsequent
is_idle_domain() checks and unindent most of the logic.

Furthermore, we no longer call these functions for the idle domain:
 * mapcache_domain_init() and tsc_set_info() were previously guarded against
   the idle domain, and have had their guards turned into ASSERT()s.
 * pit_init() is implicitly guarded by has_vpit().
 * psr_domain_init() no longer allocates a socket array.

Finally, two changes are introduced for the benefit of the following patch:
 * For PV hardware domains, or XEN_X86_EMU_PIT into emflags rather than into
   config->emulation_flags, to facilitate config becoming const.
 * References to domcr_flags are moved until after the idle early exit, to
   facilitate them being unavailable for system domains.

No practical change in behaviour.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agox86/domain: Remove unused parameters from {hvm,pv}_domain_initialise()
Andrew Cooper [Thu, 8 Mar 2018 13:58:41 +0000 (13:58 +0000)]
x86/domain: Remove unused parameters from {hvm,pv}_domain_initialise()

Neither domcr_flags nor config are used on either side.  Drop them, making
{hvm,pv}_domain_initialise() symmetric with all the other domain/vcpu
initialise/destroy calls.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agoxen/domain: Drop all DOMCRF_* constants
Andrew Cooper [Thu, 8 Mar 2018 11:31:47 +0000 (11:31 +0000)]
xen/domain: Drop all DOMCRF_* constants

With DOMCRF_dummy removed, all remaining DOMCRF_* identically match their
DOMCTL counterparts.  Avoid having a conversion between two different bit
layouts, and use the DOMCTL_CDF_* constants everywhere.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen/domain: Drop DOMCRF_dummy
Andrew Cooper [Thu, 8 Mar 2018 11:03:17 +0000 (11:03 +0000)]
xen/domain: Drop DOMCRF_dummy

At the moment, there is a tight coupling between the domid and the use of
DOMCRF_dummy.  Instead of using DOMCRF_dummy, base the one relevant decision
on domid alone.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agoSUPPORT.md: Multiple IOREQ servers are experimental
George Dunlap [Wed, 14 Mar 2018 11:05:47 +0000 (11:05 +0000)]
SUPPORT.md: Multiple IOREQ servers are experimental

The code has been there in the hypervisor for several releases, but
there is no toolstack support.

While we're here delete some trailing whitespace.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen/x86: Implement enable_nmis() in C
Andrew Cooper [Thu, 15 Mar 2018 16:15:45 +0000 (16:15 +0000)]
xen/x86: Implement enable_nmis() in C

I don't recall why I chose to implement this in assembly to begin with, but
it can happily live in a static inline instead, and only has two callers.

Doing so reduces the quantity of code in .text.entry.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agotools/libacpi: Drop useless print messages
Andrew Cooper [Thu, 15 Mar 2018 11:56:40 +0000 (11:56 +0000)]
tools/libacpi: Drop useless print messages

Libraries have no business using stdout directly, and these have no real
value.  Dropping them removes the following output when building a PVH guest:

  [root@fusebot ~]# xl create shim.cfg
  Parsing config from shim.cfg
  S3 disabled
  S4 disabled
  CONV disabled
  [root@fusebot ~]#

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86emul: place test blobs in executable section
Jan Beulich [Thu, 15 Mar 2018 16:01:33 +0000 (17:01 +0100)]
x86emul: place test blobs in executable section

This allows the section contents to be disassembled without going
through any extra hoops, simplifying the analysis of problems in test
and/or emulation code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86emul: support 3DNow! insns
Jan Beulich [Thu, 15 Mar 2018 16:00:56 +0000 (17:00 +0100)]
x86emul: support 3DNow! insns

Yes, recent AMD CPUs don't support them anymore, but I think we should
nevertheless cope.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/vlapic: clear TMR bit upon acceptance of edge-triggered interrupt to IRR
Liran Alon [Thu, 15 Mar 2018 15:59:52 +0000 (16:59 +0100)]
x86/vlapic: clear TMR bit upon acceptance of edge-triggered interrupt to IRR

According to Intel SDM section "Interrupt Acceptance for Fixed Interrupts":
"The trigger mode register (TMR) indicates the trigger mode of the
interrupt (see Figure 10-20). Upon acceptance of an interrupt
into the IRR, the corresponding TMR bit is cleared for
edge-triggered interrupts and set for level-triggered interrupts.
If a TMR bit is set when an EOI cycle for its corresponding
interrupt vector is generated, an EOI message is sent to
all I/O APICs."

Before this patch TMR-bit was cleared on LAPIC EOI which is not what
real hardware does. This was also confirmed in KVM upstream commit
a0c9a822bf37 ("KVM: dont clear TMR on EOI").

Behavior after this patch is aligned with both Intel SDM and KVM
implementation.
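
The quoted SDM rule can be sketched with a toy model. The structure and function names below are hypothetical and are not Xen's vlapic code; in-service (ISR) handling is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy local APIC: one bit per vector in each 256-bit register. */
struct toy_lapic {
    uint32_t irr[8];
    uint32_t tmr[8];
};

static void toy_set_bit(uint32_t *map, uint8_t vec)
{
    map[vec / 32] |= 1U << (vec % 32);
}

static void toy_clear_bit(uint32_t *map, uint8_t vec)
{
    map[vec / 32] &= ~(1U << (vec % 32));
}

static bool toy_test_bit(const uint32_t *map, uint8_t vec)
{
    return (map[vec / 32] >> (vec % 32)) & 1U;
}

/* Acceptance into IRR is where TMR is updated: set for level, clear for edge. */
static void toy_accept_irq(struct toy_lapic *l, uint8_t vec, bool level)
{
    if (level)
        toy_set_bit(l->tmr, vec);
    else
        toy_clear_bit(l->tmr, vec);
    toy_set_bit(l->irr, vec);
}

/* EOI only consults TMR to decide whether to notify the I/O APICs;
 * it does not modify the bit. */
static bool toy_eoi(const struct toy_lapic *l, uint8_t vec)
{
    return toy_test_bit(l->tmr, vec);
}
```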

Signed-off-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/VMX: don't risk corrupting host CR4
Jan Beulich [Thu, 15 Mar 2018 11:45:30 +0000 (12:45 +0100)]
x86/VMX: don't risk corrupting host CR4

Instead of "syncing" the live value to what mmu_cr4_features has, make
sure vCPU-s run with the value most recently loaded into %cr4, such that
after the next VM exit we continue to run with the intended value rather
than a possibly stale one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
7 years agox86: ignore guest microcode loading attempts
Jan Beulich [Thu, 15 Mar 2018 11:44:24 +0000 (12:44 +0100)]
x86: ignore guest microcode loading attempts

The respective MSRs are write-only, and hence attempts by guests to
write to these are - as of 1f1d183d49 ("x86/HVM: don't give the wrong
impression of WRMSR succeeding") no longer ignored. Restore original
behavior for the two affected MSRs.
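
A minimal sketch of the idea follows. It assumes the two MSRs are the Intel microcode update trigger (0x79) and the AMD patch loader (0xc0010020); the message itself does not name them, and the handler and result names are hypothetical.

```c
#include <stdint.h>

/* Assumed MSR indices; the commit message does not spell them out. */
#define TOY_MSR_IA32_UCODE_WRITE 0x00000079u /* Intel microcode update trigger */
#define TOY_MSR_AMD_PATCHLOADER  0xc0010020u /* AMD microcode patch loader */

enum toy_msr_result { TOY_MSR_OKAY, TOY_MSR_FAULT };

/* Toy guest WRMSR handler: microcode loads are silently dropped, while
 * any other unhandled write is reported as a fault to the guest. */
static enum toy_msr_result toy_guest_wrmsr(uint32_t msr, uint64_t val)
{
    (void)val; /* the written value is discarded either way */

    switch (msr) {
    case TOY_MSR_IA32_UCODE_WRITE:
    case TOY_MSR_AMD_PATCHLOADER:
        return TOY_MSR_OKAY;
    default:
        return TOY_MSR_FAULT;
    }
}
```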

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agoRevert "tools: detect appropriate debug optimization level"
Wei Liu [Wed, 14 Mar 2018 17:15:15 +0000 (17:15 +0000)]
Revert "tools: detect appropriate debug optimization level"

This reverts commit b43501451733193b265de30fd79a764363a2a473.

Due to the implementation of cc-option, the check is always true,
which means the build is broken for gcc versions without -Og support.

This patch can be reapplied once we have fixed cc-option.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agodocs: Fix entry for the "usbdev" option
Anthony PERARD [Wed, 14 Mar 2018 15:00:14 +0000 (15:00 +0000)]
docs: Fix entry for the "usbdev" option

The man page for xl.cfg documents the "devtype=hostdev" option, but xl only
understands "type=hostdev". Fix the manual to reflect the actual
implementation.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/entry: Trivial nonfunctional fixes
Andrew Cooper [Wed, 14 Mar 2018 10:36:09 +0000 (10:36 +0000)]
x86/entry: Trivial nonfunctional fixes

 * Drop unnecessary size suffixes
 * The C pseudocode refers to a trap_info object, not trap_bounce.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/pv: Fix guest crashes following f75b1a5247b "x86/pv: Drop int80_bounce from struc...
Andrew Cooper [Wed, 14 Mar 2018 10:48:36 +0000 (10:48 +0000)]
x86/pv: Fix guest crashes following f75b1a5247b "x86/pv: Drop int80_bounce from struct pv_vcpu"

The original init_int80_direct_trap() was in fact buggy; `int $0x80` is not an
exception.  This went unnoticed for years because int80_bounce and trap_bounce
were separate structures, but were combined by this change.

Exception handling is different to interrupt handling for PV guests.  By
reusing trap_bounce, the following corner case can occur:

 * Handle a guest `int $0x80` instruction.  Latches TBF_EXCEPTION into
   trap_bounce.
 * Handle an exception, which emulates to success (such as ptwr support),
   which leaves trap_bounce unmodified.
 * The exception exit path sees TBF_EXCEPTION set and re-injects the `int
   $0x80` a second time.

Drop the TBF_EXCEPTION from the int80 invocation, which matches the equivalent
logic from the syscall/sysenter paths.
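
The corner case can be replayed with a toy model. This is a simplified sketch with hypothetical names; the real entry paths are in assembly and considerably more involved.

```c
#include <stdbool.h>

#define TOY_TBF_EXCEPTION 1u

/* Toy version of the per-vCPU bounce state shared by both paths. */
struct toy_trap_bounce {
    unsigned int flags;
    unsigned int vector;
};

/* int $0x80 handling. mark_as_exception = true models the buggy
 * behaviour, false models the behaviour after the fix. */
static void toy_int80(struct toy_trap_bounce *tb, bool mark_as_exception)
{
    tb->vector = 0x80;
    tb->flags = mark_as_exception ? TOY_TBF_EXCEPTION : 0;
}

/* Exit path of an exception that was emulated successfully and left tb
 * untouched. Returns true if it (re-)injects whatever tb describes. */
static bool toy_exception_exit_injects(struct toy_trap_bounce *tb)
{
    if (tb->flags & TOY_TBF_EXCEPTION) {
        tb->flags = 0;
        return true; /* stale int $0x80 delivered a second time */
    }
    return false;
}
```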

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agolibxl_qmp: Tell QEMU about live migration or snapshot
Anthony PERARD [Tue, 13 Mar 2018 11:13:18 +0000 (11:13 +0000)]
libxl_qmp: Tell QEMU about live migration or snapshot

Since version 2.10, QEMU locks disk images so that a second QEMU
instance will not try to open them. This would prevent live migration from
working correctly. A new parameter has been added to the QMP command
"xen-save-devices-state" in QEMU version 2.11 which allows unlocking the
disk image for a live migration, while keeping it locked for a snapshot.

QEMU commit: 5d6c599fe1d69a1bf8c5c4d3c58be2b31cd625ad
"migration, xen: Fix block image lock issue on live migration"

The extra "live" parameter can only be used if QEMU knows about it, so
only add it if QEMU is recent enough.

The struct libxl__domain_suspend_state now records whether the suspend
is part of a live migration.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: Add a version check of QEMU for QMP commands
Anthony PERARD [Tue, 13 Mar 2018 11:13:17 +0000 (11:13 +0000)]
libxl: Add a version check of QEMU for QMP commands

On connection to QEMU via QMP, the version of QEMU is provided, store it
for later use.

Add a function qmp_qemu_check_version that can be used to check if QEMU
is new enough for certain functionality. This will be used in a moment.

As it's a static function, it is commented out until first use, which is
in the next patch.
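
A version check of this kind might look roughly like the sketch below. The struct and helper names are hypothetical; the signature of qmp_qemu_check_version is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical holder for the version reported in the QMP greeting. */
struct toy_qemu_version {
    int major;
    int minor;
    int micro;
};

/* True if the running QEMU is at least major.minor.micro. The fields
 * are compared from most to least significant. */
static bool toy_qemu_version_at_least(const struct toy_qemu_version *v,
                                      int major, int minor, int micro)
{
    if (v->major != major)
        return v->major > major;
    if (v->minor != minor)
        return v->minor > minor;
    return v->micro >= micro;
}
```

A caller would then add an optional argument only when, for example, toy_qemu_version_at_least(&v, 2, 11, 0) holds.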

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agogitignore: ignore wrappers.c link for fuzzer
Wei Liu [Wed, 14 Mar 2018 11:02:31 +0000 (11:02 +0000)]
gitignore: ignore wrappers.c link for fuzzer

At the same time reorder the entries alphabetically.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoxl: remove apic option for PVH guests
Roger Pau Monne [Wed, 14 Mar 2018 11:09:24 +0000 (11:09 +0000)]
xl: remove apic option for PVH guests

XSA-256 forces the local APIC to always be enabled for PVH guests, so
ignore any apic option for PVH guests. Update the documentation
accordingly.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools: xenalyze.c fix format-truncation
John Thomson [Wed, 14 Mar 2018 08:21:24 +0000 (18:21 +1000)]
tools: xenalyze.c fix format-truncation

With gcc optimization enabled by:
tools: detect appropriate debug optimization level
b43501451733193b265de30fd79a764363a2a473

-Wformat-truncation throws warnings

gcc version 7.3.0

xenalyze.c: In function 'find_symbol':
xenalyze.c:382:36: error: 'snprintf' output may be truncated before the last format character [-Werror=format-truncation=]
     snprintf(name, 128, "(%s +%llx)",
                                    ^
xenalyze.c:382:5: note: 'snprintf' output between 6 and 144 bytes into a destination of size 128
     snprintf(name, 128, "(%s +%llx)",
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
              lastname, offset);
              ~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
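
The log does not show how the patch resolves the warning. One common approach, sketched here with a hypothetical helper name, is to check the snprintf() return value so that truncation is detected explicitly.

```c
#include <stdbool.h>
#include <stdio.h>

/* Format "(name +offset)" into buf. Returns false if the output was
 * truncated or an encoding error occurred. */
static bool toy_format_symbol(char *buf, size_t len,
                              const char *lastname,
                              unsigned long long offset)
{
    int n = snprintf(buf, len, "(%s +%llx)", lastname, offset);

    /* snprintf() returns the length it wanted to write, excluding the
     * terminating NUL, so n >= len means the output did not fit. */
    return n >= 0 && (size_t)n < len;
}
```

The alternative is to size the destination for the worst case, which the compiler note above puts at 144 bytes.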

Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agotools/xl: fix uninitialized variable in xl_vdispl
Doug Goldstein [Tue, 13 Mar 2018 16:25:29 +0000 (11:25 -0500)]
tools/xl: fix uninitialized variable in xl_vdispl

The code added in 7a48622a78a0b452e8afa55b8442c958abd226a7 could use rc
uninitialized in main_vdisplattach().

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agomake xen ocaml safe-strings compliant
Michael Young [Mon, 12 Mar 2018 18:49:29 +0000 (18:49 +0000)]
make xen ocaml safe-strings compliant

Xen built with ocaml 4.06 gives errors such as
Error: This expression has type bytes but an expression was
        expected of type string
as Bytes and safe-strings, which were introduced in 4.02, are the
default in 4.06.
This patch, which is partly by Richard W.M. Jones of Red Hat
from https://bugzilla.redhat.com/show_bug.cgi?id=1526703,
fixes these issues.

Signed-off-by: Michael Young <m.a.young@durham.ac.uk>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>
7 years agotools: detect appropriate debug optimization level
Doug Goldstein [Tue, 13 Mar 2018 04:06:51 +0000 (23:06 -0500)]
tools: detect appropriate debug optimization level

When building debug, use -Og as the optimization level if it's available,
otherwise retain the use of -O0. -Og has been added by GCC to enable all
optimizations that do not affect debugging while retaining full
debuggability.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agodocs: Remove redundant qemu-xen-security document
George Dunlap [Fri, 9 Mar 2018 11:04:18 +0000 (11:04 +0000)]
docs: Remove redundant qemu-xen-security document

All this information is now covered in SUPPORT.md.

Most of the emulated hardware is obvious, but a couple of the items are
worth pointing out specifically.

"xen_disk" is listed under "Blkback"

"...the PCI host bridge and the PIIX3 chipset...": This statement is
redundant -- the PCI host bridge is a part of the piix3 chipset, which
is listed as supported.

xenfb: The "graphics" side of "xenfb" is listed under "PV Framebuffer
(backend)", and the "input" side of "xenfb" (including both keyboard
and mouse) is listed under "PV Keyboard (backend)".

Backing storage image format is listed in the "Blkback" section.

Fix 'stdvga' spelling while we're here.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoSUPPORT.md: Specify support for various image formats
George Dunlap [Fri, 9 Mar 2018 17:27:58 +0000 (17:27 +0000)]
SUPPORT.md: Specify support for various image formats

QEMU supports various image formats, but we only provide security
support for raw, qcow, qcow2, and vhd formats.

Rather than duplicate this information under the "x86/Emulated
storage" section, just refer to the "Blkback" section.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoSUPPORT.md: Clarify that the PV keyboard protocol includes mouse support
George Dunlap [Fri, 9 Mar 2018 11:26:03 +0000 (11:26 +0000)]
SUPPORT.md: Clarify that the PV keyboard protocol includes mouse support

s/fo/fo; while we're here.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoARM: GICv2: fix GICH_V2_LR definitions
Andre Przywara [Fri, 9 Mar 2018 15:11:33 +0000 (15:11 +0000)]
ARM: GICv2: fix GICH_V2_LR definitions

The bit definition for the CPUID mask in the GICv2 LR register was
wrong; fortunately the current implementation does not use that bit.
Fix it up (it's starting at bit 10, not bit 9) and clean up some
nearby definitions on the way.
This will be used by the new VGIC shortly.
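
For illustration, a field extraction with the corrected shift could look like this. The macro names are hypothetical, and the field widths (10-bit virtual ID, 3-bit CPUID) are assumptions drawn from the GICv2 architecture rather than from the patch.

```c
#include <stdint.h>

/* Assumed layout of the low bits of a GICv2 list register. */
#define TOY_LR_VIRTUAL_MASK 0x3ffu /* bits [9:0]: virtual interrupt ID */
#define TOY_LR_CPUID_SHIFT  10     /* CPUID starts at bit 10, not bit 9 */
#define TOY_LR_CPUID_MASK   0x7u   /* three bits: [12:10] */

static inline uint32_t toy_lr_virq(uint32_t lr)
{
    return lr & TOY_LR_VIRTUAL_MASK;
}

static inline uint32_t toy_lr_cpuid(uint32_t lr)
{
    return (lr >> TOY_LR_CPUID_SHIFT) & TOY_LR_CPUID_MASK;
}
```

With a shift of 9, the top bit of the virtual ID would leak into the extracted CPUID.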

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: GICv3: poke_irq: make RWP optional
Andre Przywara [Fri, 9 Mar 2018 15:11:32 +0000 (15:11 +0000)]
ARM: GICv3: poke_irq: make RWP optional

A GICv3 hardware implementation can be implemented in several parts that
communicate with each other (think multi-socket systems).
To make sure that critical settings have arrived at all endpoints, some
bits are tracked using the RWP bit in the GICD_CTLR register, which
signals whether a register write is still in progress.
However this only applies to *some* registers, namely the bits in the
GICD_ICENABLER (disabling interrupts) and some bits in the GICD_CTLR
register (cf. Arm IHI 0069D, 8.9.4: RWP, bit[31]).
But our gicv3_poke_irq() was always polling this bit before returning,
resulting in pointless MMIO reads for many registers.
Add an option to gicv3_poke_irq() to state whether we want to wait for
this bit and use it accordingly to match the spec.
Replace a "1 << " with a "1U << " on the way to fix a potentially
undefined behaviour when the argument evaluates to 31.
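
The optional wait can be sketched against a simulated distributor. Everything below is a toy model with hypothetical names, not the gicv3_poke_irq() code; the read counter stands in for the cost of MMIO reads.

```c
#include <stdbool.h>
#include <stdint.h>

/* 1U keeps the shift well defined when the bit number is 31. */
#define TOY_GICD_CTLR_RWP (1U << 31)

struct toy_gicd {
    uint32_t ctlr;
    uint32_t regs[32];       /* one bitmap register per 32 interrupts */
    unsigned int ctlr_reads; /* counts the reads spent polling RWP */
};

static uint32_t toy_read_ctlr(struct toy_gicd *g)
{
    g->ctlr_reads++;
    return g->ctlr;
}

/* Write the interrupt's bit; poll RWP only when the caller asks for it. */
static void toy_poke_irq(struct toy_gicd *g, unsigned int irq,
                         bool wait_for_rwp)
{
    g->regs[irq / 32] = 1U << (irq % 32);

    if (wait_for_rwp)
        while (toy_read_ctlr(g) & TOY_GICD_CTLR_RWP)
            ; /* register write still pending */
}
```

Callers that disable an interrupt would pass true; callers touching registers not tracked by RWP would pass false and skip the read entirely.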

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: GICv2: introduce gicv2_poke_irq()
Andre Przywara [Fri, 9 Mar 2018 15:11:31 +0000 (15:11 +0000)]
ARM: GICv2: introduce gicv2_poke_irq()

The GICv2 uses bitmaps spanning several MMIO registers for holding some
interrupt state. Similar to GICv3, add a poke helper function to set a bit
for a given irq_desc in one of those bitmaps.
At the moment there is only one use in gic-v2.c, but there will be more
coming soon.
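
Locating an interrupt's bit in such a bitmap is simple arithmetic. The helper below is a hypothetical sketch of that computation, assuming 32-bit registers laid out consecutively from a base offset.

```c
#include <stdint.h>

struct toy_bitmap_pos {
    uint32_t reg_offset; /* byte offset of the 32-bit register */
    uint32_t bit_mask;   /* mask selecting the interrupt's bit */
};

/* Each register covers 32 interrupts and is 4 bytes wide. */
static struct toy_bitmap_pos toy_irq_bitmap_pos(uint32_t base_offset,
                                                unsigned int irq)
{
    struct toy_bitmap_pos pos = {
        .reg_offset = base_offset + (irq / 32) * 4,
        .bit_mask = 1U << (irq % 32),
    };

    return pos;
}
```

For an example base offset of 0x100, interrupt 33 lands in the register at 0x104 with mask 0x2.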

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: GICv3: rename HYP interface definitions to use ICH_ prefix
Andre Przywara [Fri, 9 Mar 2018 15:11:29 +0000 (15:11 +0000)]
ARM: GICv3: rename HYP interface definitions to use ICH_ prefix

On a GICv3 in non-compat mode the hypervisor interface is always
accessed via system registers. Those register names have a "ICH_" prefix
in the manual, to differentiate them from the MMIO registers. Also those
registers are mostly 64-bit (compared to the 32-bit GICv2 registers) and
use different bit assignments.
To make this obvious and to avoid clashes with double definitions using
the same names for actually different bits, let's change all GICv3
hypervisor interface registers to use the "ICH_" prefix from the manual.
This renames the definitions in gic_v3_defs.h and their usage in gic-v3.c
and is needed to allow co-existence of the GICv2 and GICv3 definitions
in the same file.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: Introduce gic_get_nr_lrs()
Andre Przywara [Fri, 9 Mar 2018 15:11:28 +0000 (15:11 +0000)]
ARM: VGIC: Introduce gic_get_nr_lrs()

So far the number of list registers (LRs) a GIC implements is only
needed in the hardware facing side of the VGIC code (gic-vgic.c).
The new VGIC will need this information in more and multiple places, so
export a function that returns the number.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: reorder prototypes in vgic.h
Andre Przywara [Fri, 9 Mar 2018 15:11:27 +0000 (15:11 +0000)]
ARM: VGIC: reorder prototypes in vgic.h

Currently vgic.h both contains prototypes used by Xen arch code outside
of the actual VGIC (for instance vgic_vcpu_inject_irq()), and prototypes
for functions used by the VGIC internally.
Group them to later allow an easy split with one #ifdef.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: carve out struct vgic_cpu and struct vgic_dist
Andre Przywara [Fri, 9 Mar 2018 15:11:26 +0000 (15:11 +0000)]
ARM: VGIC: carve out struct vgic_cpu and struct vgic_dist

Currently we describe the VGIC specific fields in a structure
*embedded* in struct arch_domain and struct arch_vcpu. These members
are, however, tied to the current VGIC implementation, and will
be substantially different in the future.
To allow coexistence of two implementations, move the definition of these
embedded structures into vgic.h, and just use the opaque type in the arch
specific structures.
This allows easy switching between different implementations later.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: change to level-IRQ compatible IRQ injection interface
Andre Przywara [Fri, 9 Mar 2018 15:11:25 +0000 (15:11 +0000)]
ARM: VGIC: change to level-IRQ compatible IRQ injection interface

At the moment vgic_vcpu_inject_irq() is the interface for Xen internal
code and virtual devices to inject IRQs into a guest. This interface has
two shortcomings:
1) It requires a VCPU pointer, which we may not know (and don't need!)
for shared interrupts. A second function (vgic_vcpu_inject_spi()), was
there to work around this issue.
2) This interface only really supports edge triggered IRQs, which is
all the Xen VGIC emulates anyway. However this needs to and will
change, so we need to add the desired level (high or low) to the
interface.
This replaces the existing injection call (taking a VCPU and an IRQ
parameter) with a new one, taking domain, VCPU, IRQ and level parameters.
The VCPU can be NULL in case we don't know and don't care.
We change all call sites to use this new interface. This still doesn't
give us the missing level IRQ handling, but at least prepares the callers
to do the right thing later automatically.
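
The shape of the new call can be sketched as follows. Types, names and the routing choice for a NULL VCPU are hypothetical; only the parameter list (domain, VCPU, IRQ, level) comes from the description above.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_vcpu { int id; };

struct toy_domain {
    struct toy_vcpu vcpus[2];
    int last_target;       /* records the most recent injection */
    unsigned int last_irq;
    bool last_level;
};

/* v may be NULL when the caller neither knows nor cares which VCPU
 * receives a shared interrupt. This toy simply picks VCPU 0 then. */
static void toy_inject_irq(struct toy_domain *d, struct toy_vcpu *v,
                           unsigned int virq, bool level)
{
    if (v == NULL)
        v = &d->vcpus[0]; /* placeholder routing decision */

    d->last_target = v->id;
    d->last_irq = virq;
    d->last_level = level; /* carried along, not yet acted upon */
}
```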

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: Adjust domain_max_vcpus() to be VGIC specific
Andre Przywara [Fri, 9 Mar 2018 15:11:23 +0000 (15:11 +0000)]
ARM: VGIC: Adjust domain_max_vcpus() to be VGIC specific

domain_max_vcpus(), which is used by generic Xen code, returns the
maximum number of VCPUs for a domain, which on ARM is mostly limited by
the VGIC model emulated (a (v)GICv2 can only handle 8 CPUs).
Our current implementation lives in arch/arm/domain.c, but reaches into
VGIC internal data structures.
Move the actual functionality into vgic.c, and provide a shim in
domain.h, to keep this VGIC internal.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: Move gic_remove_from_lr_pending() prototype
Andre Przywara [Fri, 9 Mar 2018 15:11:22 +0000 (15:11 +0000)]
ARM: VGIC: Move gic_remove_from_lr_pending() prototype

The prototype for gic_remove_from_lr_pending() is the last function in
gic.h which references a VGIC data structure.
Move it over to vgic.h, so that we can remove the inclusion of vgic.h
from gic.h. We add it to asm/domain.h instead, where it is actually
needed.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: rename gic_inject() and gic_clear_lrs()
Andre Przywara [Fri, 9 Mar 2018 15:11:21 +0000 (15:11 +0000)]
ARM: VGIC: rename gic_inject() and gic_clear_lrs()

The two central functions to synchronise our emulated VGIC state with
the GIC hardware (the LRs, really), are named somewhat confusingly.
Rename them from gic_inject() to vgic_sync_to_lrs() and from
gic_clear_lrs() to vgic_sync_from_lrs(), to make the code more readable.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: vGICv3: remove rdist_stride from VGIC structure
Andre Przywara [Fri, 9 Mar 2018 15:11:20 +0000 (15:11 +0000)]
ARM: vGICv3: remove rdist_stride from VGIC structure

The last patch removed the usage of the hardware's redistributor-stride
value from our (Dom0) GICv3 emulation. This means we no longer need to
store this value in the VGIC data structure.
Remove that variable and every code snippet that handled it, and
simply always use the architected value instead.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Julien Grall <julien.grall@linaro.org>
7 years agoARM: vGICv3: always use architected redist stride
Andre Przywara [Fri, 9 Mar 2018 15:11:19 +0000 (15:11 +0000)]
ARM: vGICv3: always use architected redist stride

The redistributor-stride property in a GICv3 DT node is only there to
cover broken platforms where this value deviates from the architected one.
Since we emulate the GICv3 distributor even for Dom0, we don't need to
copy the broken behaviour. All the special handling for Dom0s using
GICv3 is just for using the hardware's memory map, which is unaffected
by the redistributor stride - it can never be smaller than the
architected two pages.
Remove the redistributor-stride property from Dom0's DT node and also
remove the code that tried to reuse the hardware value for Dom0's GICv3
emulation.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: GICv3: use hardware GICv3 redistributor values for Dom0
Andre Przywara [Fri, 9 Mar 2018 15:11:18 +0000 (15:11 +0000)]
ARM: GICv3: use hardware GICv3 redistributor values for Dom0

The code to generate the DT node or MADT table for Dom0 reaches into the
domain's vGIC structure to learn the number of redistributor regions and
their base addresses.
Since those values are copied from the hardware, we can as well use
those hardware values directly when setting up the hardware domain.

This avoids having the hardware GIC code reference vGIC data structures.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: vGICv3: clarify on GUEST_GICV3_RDIST_REGIONS symbol
Andre Przywara [Fri, 9 Mar 2018 15:11:17 +0000 (15:11 +0000)]
ARM: vGICv3: clarify on GUEST_GICV3_RDIST_REGIONS symbol

Normally there is only one GICv3 redistributor region, and we use
that for DomU guests using a GICv3.
Explain the background in a comment and why we need to keep the number
of hardware regions for Dom0.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agocpufreq/ondemand: fix race while offlining CPU
Jan Beulich [Fri, 9 Mar 2018 16:30:49 +0000 (17:30 +0100)]
cpufreq/ondemand: fix race while offlining CPU

Offlining a CPU involves stopping the cpufreq governor. The on-demand
governor will kill the timer before letting generic code proceed, but
since that generally isn't happening on the subject CPU,
cpufreq_dbs_timer_resume() may run in parallel. If that managed to
invoke the timer handler, that handler needs to run to completion before
dbs_timer_exit() may safely exit.

Make the "stoppable" field a tristate, changing it from +1 to -1 around
the timer function invocation, and make dbs_timer_exit() wait for it to
become non-negative (still writing zero if it's +1).
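
The tristate handshake can be sketched with C11 atomics. This is a minimal illustration with hypothetical names, not the cpufreq code itself; a counter increment stands in for the timer handler.

```c
#include <stdatomic.h>

/* stoppable: 0 = stopped, +1 = armed and stoppable, -1 = handler running */
struct toy_dbs_timer {
    atomic_int stoppable;
    unsigned int runs; /* how many times the handler has run */
};

/* Resume side: claim the timer (+1 -> -1), run the handler, release it. */
static void toy_timer_resume(struct toy_dbs_timer *t)
{
    int expected = 1;

    if (atomic_compare_exchange_strong(&t->stoppable, &expected, -1)) {
        t->runs++; /* stands in for the timer handler */
        atomic_store(&t->stoppable, 1);
    }
}

/* Exit side: write 0 if the value is +1, and keep retrying while it is
 * negative, i.e. while a handler invocation is still in flight. */
static void toy_timer_exit(struct toy_dbs_timer *t)
{
    for (;;) {
        int expected = 1;

        if (atomic_compare_exchange_strong(&t->stoppable, &expected, 0))
            return; /* was +1, now stopped */
        if (expected >= 0)
            return; /* already stopped */
    }
}
```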

Also adjust coding style in cpufreq_dbs_timer_resume().

Reported-by: Martin Cerveny <martin@c-home.cz>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Martin Cerveny <martin@c-home.cz>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86: improve MSR_SHADOW_GS accesses
Jan Beulich [Fri, 9 Mar 2018 16:29:45 +0000 (17:29 +0100)]
x86: improve MSR_SHADOW_GS accesses

Instead of using RDMSR/WRMSR, on fsgsbase-capable systems use a double
SWAPGS combined with RDGSBASE/WRGSBASE. This halves execution time for
a shadow GS update alone on my Haswell (and we have indications of
good performance improvements by this on Skylake too), while the win is
even higher when e.g. updating more than one base (as may and commonly
will happen in load_segments()).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
7 years agox86/traps: Put idt_table[] back into .bss
Andrew Cooper [Fri, 9 Mar 2018 15:01:21 +0000 (15:01 +0000)]
x86/traps: Put idt_table[] back into .bss

c/s d1d6fc97d "x86/xpti: really hide almost all of Xen image" accidentally
moved idt_table[] from .bss to .data by virtue of using the page_aligned
section.  We also have .bss.page_aligned, so use that.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86emul/test: wrap libc functions with FPU save/restore code
Jan Beulich [Fri, 9 Mar 2018 13:47:21 +0000 (06:47 -0700)]
x86emul/test: wrap libc functions with FPU save/restore code

Currently with the native tool chain on Debian Jessie ./test_x86_emulator
yields:

  Testing AVX2 256bit single native execution...okay
  Testing AVX2 256bit single 64-bit code sequence...[line 933] failed!

The bug is that libc's memcpy() in read() uses %xmm8 (specifically, in
__memcpy_sse2_unaligned()), which corrupts %ymm8 behind the back of the AVX2
test code.

Introduce wrappers (and machinery to forward calls to those wrappers)
saving/restoring FPU state around certain library calls.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-and-tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agotests/x86emul: Helpers to save and restore FPU state
Andrew Cooper [Tue, 6 Mar 2018 13:42:36 +0000 (13:42 +0000)]
tests/x86emul: Helpers to save and restore FPU state

Introduce common helpers for saving and restoring FPU state.  During
emul_test_init(), calculate whether to use xsave or fxsave, and tweak the
existing mxcsr_mask logic to avoid using another large static buffer.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/alt: Drop explicit padding of origin sites
Andrew Cooper [Fri, 9 Feb 2018 14:33:59 +0000 (14:33 +0000)]
x86/alt: Drop explicit padding of origin sites

Now that the alternatives infrastructure can calculate the required padding
automatically, there is no need to hard code it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/alt: Support for automatic padding calculations
Andrew Cooper [Fri, 9 Feb 2018 12:47:58 +0000 (12:47 +0000)]
x86/alt: Support for automatic padding calculations

The correct amount of padding in an origin patch site can be calculated
automatically, based on the relative lengths of the replacements.

This requires a bit of trickery to calculate correctly, especially in the
ALTERNATIVE_2 case where a branchless max() calculation is needed.  The
calculation is further complicated because GAS's idea of true is -1 rather
than 1, which is why the extra negations are required.
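
The arithmetic described above can be modelled in C as follows. This is a sketch under stated assumptions, not the macros from the patch: gas_lt() and gas_gt() are hypothetical helpers that mimic GAS by returning -1 for true, and the function names are illustrative.

```c
/* GAS evaluates a true comparison to -1 (all bits set), not 1. */
static int gas_lt(int a, int b) { return -(a < b); }
static int gas_gt(int a, int b) { return -(a > b); }

/*
 * Branchless max(): the mask is all ones when a < b, so the XOR
 * cancels a and leaves b; otherwise the mask is 0 and a survives.
 */
int branchless_max(int a, int b)
{
    return a ^ ((a ^ b) & gas_lt(a, b));
}

/*
 * Padding for one replacement: max(0, repl_len - orig_len).
 * Negating the GAS truth value turns -1 into a multiplier of 1.
 */
int pad_len(int orig_len, int repl_len)
{
    int diff = repl_len - orig_len;

    return -gas_gt(diff, 0) * diff;
}

/* Two replacements: pad up to the longer of the two. */
int pad_len2(int orig_len, int repl1_len, int repl2_len)
{
    return pad_len(orig_len, branchless_max(repl1_len, repl2_len));
}
```

For instance, a 3-byte origin with a 5-byte replacement needs 2 bytes of padding, while a 5-byte origin with a 3-byte replacement needs none.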

Additionally, have apply_alternatives() attempt to optimise the padding nops.
This is complicated by the fact that we must not attempt to optimise nops over
an origin site which has already been modified.

To keep track of this, add a priv field to struct alt_instr, which gets
modified by apply_alternatives().  This method is used in preference to a
local variable in case we make multiple passes.  One extra requirement is that
alt_instr's referring to the same origin site must now be consecutive, but we
already have this property.
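
The bookkeeping can be illustrated with a simplified model. Everything here is hypothetical (struct alt_entry, its fields, and apply_pass() are not the real struct alt_instr or apply_alternatives()); it only shows why consecutive entries and a persistent priv flag make repeated passes safe.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct alt_instr. */
struct alt_entry {
    int site;             /* identifies the origin patch site */
    bool feature_present; /* would this replacement be applied? */
    bool priv;            /* set once this entry/site has been handled */
};

/* One pass over the table; returns how many sites had their nops optimised. */
unsigned int apply_pass(struct alt_entry *alts, size_t n)
{
    struct alt_entry *base = NULL;
    unsigned int optimised = 0;

    for (size_t i = 0; i < n; i++) {
        struct alt_entry *a = &alts[i];

        /* Entries for one site are consecutive: track the first of each run. */
        if (!base || base->site != a->site)
            base = a;

        if (a->priv)
            continue; /* handled on an earlier pass */

        if (!a->feature_present) {
            if (base->priv)
                continue; /* site already modified: leave its nops alone */
            a->priv = true;
            optimised++; /* placeholder for the actual nop optimisation */
            continue;
        }

        /* A replacement is applied: mark the site as modified. */
        base->priv = true;
        a->priv = true;
    }

    return optimised;
}
```

Because the flag lives in the table rather than in a local variable, a second call to apply_pass() over the same table does no further work.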

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/asm: Remove opencoded uses of altinstruction_entry
Andrew Cooper [Fri, 9 Feb 2018 15:58:39 +0000 (15:58 +0000)]
x86/asm: Remove opencoded uses of altinstruction_entry

With future changes, altinstruction_entry is going to become more complicated
to use.  Furthermore, there are already ALTERNATIVE* macros which can be used
to avoid opencoding the creation of replacement information.

For ASM_STAC, ASM_CLAC and CR4_PV32_RESTORE, this means the removal of all
hardcoded label numbers.  For the cr4_pv32 alternatives, this means hardcoding
the extra space required in the original patch site, but the hardcoding will
be removed by a later patch.

No change to any functionality, but the handling of nops inside the original
patch sites is a bit different.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/alt: Clean up the assembly used to generate alternatives
Andrew Cooper [Fri, 9 Feb 2018 13:31:28 +0000 (13:31 +0000)]
x86/alt: Clean up the assembly used to generate alternatives

 * On the C side, switch to using local labels rather than hardcoded numbers.
 * Rename parameters and labels to be consistent with alt_instr names, and
   consistent between the C and asm versions.
 * On the asm side, factor some expressions out into macros to aid clarity.
 * Consistently declare section attributes.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/alt: Clean up struct alt_instr and its users
Andrew Cooper [Fri, 9 Feb 2018 13:31:28 +0000 (13:31 +0000)]
x86/alt: Clean up struct alt_instr and its users

 * Rename some fields for consistency and clarity, and use standard types.
 * Don't opencode the use of ALT_{ORIG,REPL}_PTR().

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/alt: Drop unused alternative infrastructure
Andrew Cooper [Fri, 9 Feb 2018 12:54:58 +0000 (12:54 +0000)]
x86/alt: Drop unused alternative infrastructure

ALTERNATIVE_3 is more complicated than ALTERNATIVE_2 when it comes to
calculating extra padding length, and we have no need for the complexity.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen/domain: Added debug safety in the domain_create() failure path
Andrew Cooper [Wed, 28 Feb 2018 14:02:41 +0000 (14:02 +0000)]
xen/domain: Added debug safety in the domain_create() failure path

Hitting the fail path with err = 0 causes callers to dereference a NULL
pointer, as 0 fails an IS_ERR() check.
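
The hazard can be reproduced with a small model of the error-pointer convention. The helpers below are hypothetical reimplementations written for illustration (they are not Xen's own definitions), and the assertion in fail_path() is only an assumption about what such a debug check might look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long err)
{
    return (void *)(intptr_t)err;
}

/* True only for pointers in the top MAX_ERRNO bytes of the address space. */
static inline bool is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical failure path: catch a stray err == 0 in debug builds. */
static inline void *fail_path(long err)
{
    assert(err < 0); /* err == 0 would hand the caller a plain NULL */
    return err_ptr(err);
}
```

With err = 0, err_ptr() yields NULL, is_err() reports false, and a caller that only checks is_err() goes on to dereference the NULL pointer.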

All of the paths appear to be fine, but leave some logic to help catch stray
misuses.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>