xenbits.xensource.com Git - xen.git/log
8 years ago libxl: split libxl vtpm code into one source
Juergen Gross [Tue, 12 Jul 2016 15:30:42 +0000 (17:30 +0200)]
libxl: split libxl vtpm code into one source

Put all vtpm related stuff of libxl into a dedicated source file.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago libxl: move library pvusb specific code into libxl_pvusb.c
Juergen Gross [Tue, 12 Jul 2016 15:30:41 +0000 (17:30 +0200)]
libxl: move library pvusb specific code into libxl_pvusb.c

Outside libxl_pvusb.c only libxl_util.c still contains some pvusb code.

Move it to libxl_pvusb.c.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago libxl: add "pv device mode needed" support to device type framework
Juergen Gross [Tue, 12 Jul 2016 15:30:40 +0000 (17:30 +0200)]
libxl: add "pv device mode needed" support to device type framework

Add another callback to the device type framework in order to aid
decision whether a pv domain needs a device model.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago libxl: add "merge" function to generic device type support
Juergen Gross [Tue, 12 Jul 2016 15:30:39 +0000 (17:30 +0200)]
libxl: add "merge" function to generic device type support

Instead of using a macro generating the code to merge xenstore and
json configuration data, use the generic device type support for
this purpose.

This requires adding some accessor functions to the framework and
a structure for disks (as disks are added separately they didn't need
such a structure up to now).

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago altp2m: Allow shared entries to be copied to altp2m views during lazycopy
Tamas K Lengyel [Wed, 27 Jul 2016 09:31:59 +0000 (10:31 +0100)]
altp2m: Allow shared entries to be copied to altp2m views during lazycopy

Move sharing locks above altp2m to avoid locking order violation and crashing
the hypervisor during unsharing operations when altp2m is active.

Applying mem_access settings or remapping gfns in altp2m views will
automatically unshare the page if it was shared previously. Also,
disallow nominating pages for which there are pre-existing altp2m
mem_access settings or remappings present. However, allow altp2m to
populate altp2m views with shared entries during lazycopy as unsharing
will automatically propagate the change to these entries in altp2m
views as well.

While we're here, switch to using the appropriate wrappers rather than
calling p2m->get_entry() directly.

Signed-off-by: Tamas K Lengyel <tamas.lengyel@zentific.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years ago xen/arm: p2m: Simplify p2m type check by using bitmask
Julien Grall [Wed, 20 Jul 2016 16:10:50 +0000 (17:10 +0100)]
xen/arm: p2m: Simplify p2m type check by using bitmask

The resulting assembly code for the macros is much simpler and will
never contain more than one branch instruction.

The idea is taken from x86 (see include/asm-x86/p2m.h). Also move the
two helpers earlier to keep all the p2m type definitions together.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
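The bitmask idea described in this commit can be illustrated with a minimal sketch. The enumeration and macro names below are hypothetical and are not the identifiers used in Xen; only the technique (one bit per type, a class expressed as an OR of bits) follows the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type enumeration; the real p2m type list is longer. */
typedef enum {
    T_INVALID,
    T_RAM_RW,
    T_RAM_RO,
    T_MMIO,
    T_FOREIGN,
    T_COUNT
} demo_type_t;

/* One bit per type. */
#define DEMO_TO_MASK(t)  (1UL << (t))

/* A class of types is the OR of its members' bits. */
#define DEMO_RAM_TYPES   (DEMO_TO_MASK(T_RAM_RW) | DEMO_TO_MASK(T_RAM_RO))

/* Membership test: a shift, an AND and a single comparison. */
static inline bool demo_is_ram(demo_type_t t)
{
    assert(t < T_COUNT);
    return (DEMO_TO_MASK(t) & DEMO_RAM_TYPES) != 0;
}
```

Compared with a chain of `t == A || t == B` comparisons, the mask form needs only one test regardless of how many types belong to the class.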
8 years ago xen/arm: p2m: Use p2m_is_foreign in get_page_from_gfn to avoid open coding
Julien Grall [Wed, 20 Jul 2016 16:10:49 +0000 (17:10 +0100)]
xen/arm: p2m: Use p2m_is_foreign in get_page_from_gfn to avoid open coding

No functional change.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago xen/arm: p2m: Clean-up mfn_to_p2m_entry
Julien Grall [Wed, 20 Jul 2016 16:10:47 +0000 (17:10 +0100)]
xen/arm: p2m: Clean-up mfn_to_p2m_entry

The physical address is computed from the machine frame number, so
checking if the physical address is page aligned is pointless.

Furthermore, directly assign the MFN to the corresponding field in the
entry rather than converting to a physical address and ORing the value.
This avoids relying on the field position and makes the code clearer.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago arm/vgic: Change fixed number of mmio handlers to variable number
Shanker Donthineni [Wed, 20 Jul 2016 14:00:56 +0000 (09:00 -0500)]
arm/vgic: Change fixed number of mmio handlers to variable number

Compute the number of mmio handlers that are required for vGICv3 and
vGICv2 emulation drivers in vgic_v3_init()/vgic_v2_init(). Add
this variable number of mmio handlers to the fixed number MAX_IO_HANDLER
and pass the sum to domain_io_init() to allocate enough memory.

New code path:
 domain_vgic_register(&count)
   domain_io_init(count + MAX_IO_HANDLER)
     domain_vgic_init()

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Acked-by: Julien Grall <julien.grall@arm.com>
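The sizing step in the code path above (variable count plus a fixed number) can be sketched as follows. This is an illustration under assumptions, not the Xen implementation: all `demo_` names are hypothetical, `calloc` stands in for the hypervisor allocator, and the value 16 for the fixed part is the one quoted elsewhere in this log.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAX_IO_HANDLER 16  /* fixed part; value as quoted in this log */

struct demo_mmio_handler {
    unsigned long addr, size;
};

struct demo_io {
    unsigned int max_count;              /* capacity of handlers[] */
    unsigned int num_entries;            /* slots currently in use */
    struct demo_mmio_handler *handlers;  /* allocated at domain build */
};

/* Allocate room for max_count handlers; 0 on success, -1 on failure. */
static int demo_io_init(struct demo_io *io, unsigned int max_count)
{
    assert(io != NULL && max_count > 0);
    io->handlers = calloc(max_count, sizeof(*io->handlers));
    if (io->handlers == NULL)
        return -1;
    io->max_count = max_count;
    io->num_entries = 0;
    return 0;
}

static void demo_io_free(struct demo_io *io)
{
    free(io->handlers);
    io->handlers = NULL;
    io->max_count = 0;
    io->num_entries = 0;
}
```

A caller would first obtain the variable count from the vGIC registration step and then call `demo_io_init(&io, count + DEMO_MAX_IO_HANDLER)`.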
8 years ago xen/arm: io: Use binary search for mmio handler lookup
Shanker Donthineni [Wed, 20 Jul 2016 14:00:55 +0000 (09:00 -0500)]
xen/arm: io: Use binary search for mmio handler lookup

As the number of I/O handlers increases, the overhead associated with
linear lookup also increases. The system might have a maximum of 144
(assuming CONFIG_NR_CPUS=128) mmio handlers. In the worst case,
a lookup would require 144 iterations to find a matching handler. Switch
from a linear search (complexity O(n)) to a binary search (complexity
O(log n)) to reduce the mmio handler lookup overhead.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
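A range lookup by binary search can be sketched with the standard C `bsearch()`. This is not the Xen code: the structure and function names are hypothetical, and the sketch assumes the table is sorted by start address with non-overlapping regions. With 144 entries a binary search needs at most 8 probes, against 144 for the linear scan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical handler record: one MMIO region [addr, addr + size). */
struct demo_handler {
    uint64_t addr;
    uint64_t size;
};

/* Comparator for bsearch(): the key is a guest physical address. */
static int demo_cmp(const void *key, const void *elem)
{
    uint64_t gpa = *(const uint64_t *)key;
    const struct demo_handler *h = elem;

    if (gpa < h->addr)
        return -1;
    if (gpa >= h->addr + h->size)
        return 1;
    return 0; /* gpa falls inside this region */
}

/* The table must be sorted by addr and regions must not overlap. */
static const struct demo_handler *
demo_find(const struct demo_handler *tbl, size_t n, uint64_t gpa)
{
    if (n == 0)
        return NULL;
    assert(tbl != NULL);
    return bsearch(&gpa, tbl, n, sizeof(*tbl), demo_cmp);
}
```

The comparator returns 0 for any address inside a region, which is what turns an exact-match search into a range lookup.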
8 years ago xen: Add generic implementation of binary search
Shanker Donthineni [Wed, 20 Jul 2016 14:00:54 +0000 (09:00 -0500)]
xen: Add generic implementation of binary search

This patch adds the generic implementation of binary search algorithm
which is copied from Linux kernel v4.7-rc7. No functional changes.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years ago arm/io: Use separate memory allocation for mmio handlers
Shanker Donthineni [Wed, 20 Jul 2016 14:00:53 +0000 (09:00 -0500)]
arm/io: Use separate memory allocation for mmio handlers

The number of mmio handlers is limited to a compile time macro
MAX_IO_HANDLER which is 16. This number is not at all sufficient
to support per CPU distributor regions. Either it needs to be
increased to a bigger number, at least CONFIG_NR_CPUS+16, or
separate memory for mmio handlers needs to be allocated dynamically
during domain build.

This patch uses the dynamic allocation strategy to reduce memory
footprint for 'struct domain' instead of static allocation.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago x86/entry: Avoid SMAP violation in compat_create_bounce_frame()
Andrew Cooper [Wed, 15 Jun 2016 17:32:14 +0000 (18:32 +0100)]
x86/entry: Avoid SMAP violation in compat_create_bounce_frame()

A 32bit guest kernel might be running on user mappings.
compat_create_bounce_frame() must whitelist its guest accesses to avoid
risking a SMAP violation.

For both variants of create_bounce_frame(), re-blacklist user accesses if
execution exits via an exception table redirection.

This is XSA-183 / CVE-2016-6259

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years ago x86/pv: Remove unsafe bits from the mod_l?_entry() fastpath
Andrew Cooper [Mon, 11 Jul 2016 13:32:03 +0000 (14:32 +0100)]
x86/pv: Remove unsafe bits from the mod_l?_entry() fastpath

All changes in writeability and cacheability must go through full
re-validation.

Rework the logic as a whitelist, to make it clearer to follow.

This is XSA-182

Reported-by: Jérémie Boutoille <jboutoille@ext.quarkslab.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
8 years ago xen: Remove buggy initial placement algorithm
George Dunlap [Fri, 15 Jul 2016 17:25:52 +0000 (18:25 +0100)]
xen: Remove buggy initial placement algorithm

The initial placement algorithm sometimes picks cpus outside of the
mask it's given, does a lot of unnecessary bitmasking, does its own
separate load calculation, and completely ignores vcpu hard and soft
affinities.  Just get rid of it and rely on the schedulers to do
initial placement.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years ago xen: Have schedulers revise initial placement
George Dunlap [Fri, 15 Jul 2016 16:20:36 +0000 (17:20 +0100)]
xen: Have schedulers revise initial placement

The generic domain creation logic in
xen/common/domctl.c:default_vcpu0_location() attempts to do
initial placement load-balancing by placing vcpu 0 on the least-busy
non-primary hyperthread available.  Unfortunately, the logic can end
up picking a pcpu that's not in the online mask.  When this is passed
to a scheduler which assumes that the initial assignment is
valid, it causes a null pointer dereference looking up the runqueue.

Furthermore, this initial placement doesn't take into account hard or
soft affinity, or any scheduler-specific knowledge (such as historic
runqueue load, as in credit2).

To solve this, when inserting a vcpu, always call the per-scheduler
"pick" function to revise the initial placement.  This will
automatically take all knowledge the scheduler has into account.

csched2_cpu_pick ASSERTs that the vcpu's pcpu scheduler lock has been
taken.  Grab and release the lock to minimize time spent with irqs
disabled.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
8 years ago xen: Some code motion to avoid having to do forward-declaration
George Dunlap [Mon, 25 Jul 2016 11:09:52 +0000 (12:09 +0100)]
xen: Some code motion to avoid having to do forward-declaration

For sched_credit2, move the vcpu insert / remove / free functions near the domain
insert / remove / alloc / free functions (and after cpu_pick).

For sched_rt, move rt_cpu_pick() further up.

This is pure code motion; no functional change.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Meng Xu <mengxu@cis.upenn.edu>
Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
8 years ago systemd: use standard dependencies for xendriverdomain.service
Marek Marczykowski-Górecki [Sun, 24 Jul 2016 19:26:57 +0000 (21:26 +0200)]
systemd: use standard dependencies for xendriverdomain.service

Having DefaultDependencies=no means it can be started before / is
remounted read-write, which will result in various failures (to start
with opening the log).
Since "libxl: trigger attach events for devices attached before xl devd
startup" it is no longer important to start it as early as possible,
because it will process devices created before its startup.

Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago tools/libxc: Properly increment ApicIdCoreSize field on AMD
Boris Ostrovsky [Fri, 22 Jul 2016 17:14:01 +0000 (13:14 -0400)]
tools/libxc: Properly increment ApicIdCoreSize field on AMD

Current code incorrectly adds 1 to full register instead of
incrementing the field in bits 15:12.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
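The difference between adding 1 to the whole register and incrementing a field in bits 15:12 can be shown with a small sketch. The function name is hypothetical and this is not the libxc code; the refusal to overflow the 4-bit field is an assumption of the sketch, not something the commit message states.

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_SHIFT 12
#define FIELD_MASK  (0xfu << FIELD_SHIFT)   /* bits 15:12 */

/* Increment the 4-bit field in bits 15:12, leaving all other bits alone. */
static uint32_t demo_inc_field(uint32_t reg)
{
    uint32_t field = (reg & FIELD_MASK) >> FIELD_SHIFT;

    assert(field < 0xf); /* sketch only: refuse to overflow the field */
    field++;
    return (reg & ~FIELD_MASK) | (field << FIELD_SHIFT);
}
```

For a register value of 0x3007, `reg + 1` yields 0x3008 and disturbs the low bits, while `demo_inc_field(reg)` yields 0x4007.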
8 years ago x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices
Andrew Cooper [Mon, 18 Jul 2016 21:04:43 +0000 (22:04 +0100)]
x86/vMSI-X: Fix host crash when shutting down guests with MSI capable devices

c/s 74c6dc2d "x86/vMSI-X: defer intercept handler registration" caused MSI-X
table infrastructure not to always be initialised, but it missed one path
which needed an is-initialised check.

If a device is passed through to a domain which is MSI capable but not MSI-X
capable, the call to msixtbl_init() is omitted, but a XEN_DOMCTL_unbind_pt_irq
hypercall still calls into msixtbl_pt_unregister().  This follows the linked
list pointer which is still NULL.

Introduce an is-initialised check to msixtbl_pt_unregister().

Furthermore, the purpose of the open-coded msixtbl_list.next check is rather
subtle.  Introduce an msixtbl_initialised() predicate instead, which makes its
purpose far more obvious.

Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
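The pattern described here, a named predicate replacing an open-coded check on a list head's `next` pointer, can be sketched as below. All names are hypothetical stand-ins, the list type is a minimal local one rather than Xen's, and the sketch assumes the structure is zero-filled until its init function runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal doubly linked list head, zero-filled until initialised. */
struct demo_list_head {
    struct demo_list_head *next, *prev;
};

/* Hypothetical stand-in for the per-domain state. */
struct demo_domain {
    struct demo_list_head table_list;
};

static void demo_table_init(struct demo_domain *d)
{
    /* An empty circular list points at itself. */
    d->table_list.next = &d->table_list;
    d->table_list.prev = &d->table_list;
}

/* Named predicate instead of an open-coded "next != NULL" test. */
static bool demo_table_initialised(const struct demo_domain *d)
{
    return d->table_list.next != NULL;
}

/* Returns the number of entries visited, or -1 if init never ran. */
static int demo_table_unregister(const struct demo_domain *d)
{
    const struct demo_list_head *pos;
    int visited = 0;

    assert(d != NULL);
    if (!demo_table_initialised(d))
        return -1; /* following next here would dereference NULL */

    for (pos = d->table_list.next; pos != &d->table_list; pos = pos->next)
        visited++;
    return visited;
}
```

Without the early return, the loop would start from a NULL `next` pointer on a domain whose init function was never called, which is the kind of crash the commit describes.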
8 years ago xen: credit2: don't let b_avgload go negative.
Dario Faggioli [Fri, 22 Jul 2016 12:04:53 +0000 (14:04 +0200)]
xen: credit2: don't let b_avgload go negative.

The ASSERT() made effective by b5b5876619bd8ec2e
("xen: credit2: fix two s_time_t handling issues
in load balancing") triggers for b_avgload (spotted
by OSSTest).

b_avgload is where we store the prediction of what
the load of a runqueue will look like in the medium
to long term, because of a vcpu being added to or
removed from there.

On vcpu removal, saturate down b_avgload to zero,
as it makes very little sense to predict that the
load of a runqueue will at some point become negative!

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
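Saturating the prediction at zero on removal can be sketched as a clamped subtraction. The function and parameter names are hypothetical, and the sketch assumes the removed vcpu's contribution is non-negative.

```c
#include <assert.h>
#include <stdint.h>

/* Subtract a vcpu's contribution, clamping the prediction at zero. */
static int64_t demo_avgload_after_removal(int64_t predicted, int64_t vcpu_load)
{
    assert(vcpu_load >= 0);
    /* Comparing first avoids ever producing a negative result. */
    return predicted > vcpu_load ? predicted - vcpu_load : 0;
}
```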
8 years ago xen/arm: p2m: Fix multi-lines coding style comments
Julien Grall [Wed, 20 Jul 2016 16:10:46 +0000 (17:10 +0100)]
xen/arm: p2m: Fix multi-lines coding style comments

The start and end markers should be on separate lines.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago xen/arm: p2m: Restrict usage of get_page_from_gva to the current vCPU
Julien Grall [Wed, 20 Jul 2016 16:10:45 +0000 (17:10 +0100)]
xen/arm: p2m: Restrict usage of get_page_from_gva to the current vCPU

The function get_page_from_gva translates a guest virtual address to a
machine address. The translation involves the register VTTBR_EL2,
TTBR0_EL1, TTBR1_EL1 and SCTLR_EL1.

Currently, only the first register is context switched if the current
domain is not the same. This will result in the wrong TTBR*_EL1 and
SCTLR_EL1 being used for the translation.

To fix the code properly, we would have to context switch all the
registers mentioned above when the vCPU in parameter is not the current
one. Similar things would need to be done in the callee
p2m_mem_check_and_get_page.

Given that the only caller of this function with the vCPU that may not
be current is a guest debugging function (show_guest_stack), restrict
the usage to the current vCPU for the time being.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago xen/arm: p2m: Pass the vCPU in parameter to get_page_from_gva
Julien Grall [Wed, 20 Jul 2016 16:10:44 +0000 (17:10 +0100)]
xen/arm: p2m: Pass the vCPU in parameter to get_page_from_gva

The function get_page_from_gva translates a guest virtual address to a
machine address. The translation involves the register VTTBR_EL2,
TTBR0_EL1, TTBR1_EL1 and SCTLR_EL1. Whilst the first register is per
domain (the p2m is common to all vCPUs), the last 3 are per-vCPU.

Therefore, the function should take the vCPU as a parameter and not the
domain. Fixing the actual code path will be done in a separate patch.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago xen/arm: system: Use the correct parameter name in local_irq_restore
Julien Grall [Wed, 20 Jul 2016 16:10:43 +0000 (17:10 +0100)]
xen/arm: system: Use the correct parameter name in local_irq_restore

The parameter to store the flags is called 'x' and not 'flags'.
Thankfully all users of the macro pass 'flags'.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years ago arm/traps: fix bug in dump_guest_s1_walk handling of level 2 page tables
Jonathan Daugherty [Wed, 20 Jul 2016 16:10:17 +0000 (09:10 -0700)]
arm/traps: fix bug in dump_guest_s1_walk handling of level 2 page tables

dump_guest_s1_walk intends to walk to level 2 page table entries but
was failing to do so because of a check that caused level 2 page table
descriptors to be ignored. This change fixes the check so that level 2
page table walks occur as intended by ignoring descriptors unless their
low two bits match the expected sequence [0,1].

For more information, see the ARMv7-A ARM DDI 0406C.b, section B3.5.1.

Signed-off-by: Jonathan Daugherty <jtd@galois.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
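A descriptor type check of the kind described can be sketched as follows. The function name is hypothetical and this is not the code from traps.c; the sketch reads "[0,1]" as bit 1 clear and bit 0 set, which is the encoding the short-descriptor format uses for a first-level entry that points to a second-level table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DESC_TYPE_MASK  0x3u  /* low two bits of the descriptor */
#define DESC_TYPE_TABLE 0x1u  /* 0b01: points to a second-level table */

/* True only for first-level descriptors that reference a level 2 table. */
static bool demo_is_table_desc(uint32_t desc)
{
    return (desc & DESC_TYPE_MASK) == DESC_TYPE_TABLE;
}
```

Testing only whether bit 0 is set, or only whether the descriptor is non-zero, would accept or reject the wrong entries; the comparison against both low bits is what distinguishes a table pointer from a fault or section entry.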
8 years ago arm/traps: fix bug in dump_guest_s1_walk L1 page table offset computation
Jonathan Daugherty [Wed, 20 Jul 2016 16:10:16 +0000 (09:10 -0700)]
arm/traps: fix bug in dump_guest_s1_walk L1 page table offset computation

The dump_guest_s1_walk function was incorrectly using the top 10 bits of
the virtual address to select the L1 page table index.  The correct
amount is 12 bits, resulting in a shift of 20 bits rather than 22.

For more details, see the ARMv7-A ARM DDI 0406C.b, section B3.5,
"Short-descriptor translation table format."

Signed-off-by: Jonathan Daugherty <jtd@galois.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
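The arithmetic behind the fix can be checked with a short sketch. The names are hypothetical, and the sketch assumes a full 32-bit input address with all 12 top bits used for the index, giving a 4096-entry first-level table.

```c
#include <assert.h>
#include <stdint.h>

#define L1_SHIFT   20u                       /* VA[31:20] selects the entry */
#define L1_ENTRIES (1u << (32u - L1_SHIFT))  /* 4096 entries */

/* First-level table index for a 32-bit virtual address. */
static uint32_t demo_l1_index(uint32_t va)
{
    uint32_t idx = va >> L1_SHIFT;

    assert(idx < L1_ENTRIES);
    return idx;
}
```

For the address 0xC0100000 a shift of 20 gives index 0xC01, whereas the old shift of 22 gives 0x300, which selects a different entry.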
8 years ago xenstore: add assertion in database dumping code
Wei Liu [Wed, 20 Jul 2016 14:13:42 +0000 (15:13 +0100)]
xenstore: add assertion in database dumping code

If memfile is NULL, the signal handler won't be installed, hence fopen
won't dereference NULL. Coverity is not smart enough to figure that out
unfortunately.

Add an assertion to prevent coverity from complaining.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xenstore: send error earlier in do_mkdir
Wei Liu [Wed, 20 Jul 2016 14:13:41 +0000 (15:13 +0100)]
xenstore: send error earlier in do_mkdir

XenServer's coverity instance complains that a few lines below
create_node dereferences NULL if name == NULL. It however fails to
figure out that if node is NULL, errno won't be ENOENT, so do_mkdir
should have bailed before create_node.

That said, it would be good if we don't need to go through the hops.  We
can bail earlier if name is NULL.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago oxenstored: honour XEN_RUN_DIR
Wei Liu [Mon, 11 Jul 2016 17:28:09 +0000 (18:28 +0100)]
oxenstored: honour XEN_RUN_DIR

Move the default pid file under XEN_RUN_DIR. Note that it changes the
location from /var/run to /var/run/xen.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: David Scott <dave@recoil.org>
8 years ago libxenstat: honour XEN_RUN_DIR
Wei Liu [Mon, 11 Jul 2016 17:28:08 +0000 (18:28 +0100)]
libxenstat: honour XEN_RUN_DIR

This is because libxl uses XEN_RUN_DIR to generate the socket path for
libxenstat while libxenstat itself uses hard-coded path, which is not
necessarily the same path as XEN_RUN_DIR.  The default configuration
happened to work because XEN_RUN_DIR defaulted to /var/run/xen, which
matched the hard-coded path.

We should make libxenstat use XEN_RUN_DIR so that it works with
non-default configuration.

Generate a _paths.h because it is required to make this change work.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago hotplug/Linux: honour XEN_RUN_DIR
Wei Liu [Mon, 11 Jul 2016 17:28:07 +0000 (18:28 +0100)]
hotplug/Linux: honour XEN_RUN_DIR

Store various PID files under XEN_RUN_DIR. Note that this changes the
default location from /var/run to /var/run/xen.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago hotplug/NetBSD: honour XEN_RUN_DIR
Wei Liu [Mon, 11 Jul 2016 17:28:06 +0000 (18:28 +0100)]
hotplug/NetBSD: honour XEN_RUN_DIR

Store xldevd.pid under XEN_RUN_DIR. Note that this will change the
default location from /var/run to /var/run/xen.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago hotplug/FreeBSD: honour XEN_RUN_DIR
Wei Liu [Mon, 11 Jul 2016 17:28:05 +0000 (18:28 +0100)]
hotplug/FreeBSD: honour XEN_RUN_DIR

Store xldevd.pid under XEN_RUN_DIR. Note that the default location would
change from /var/run to /var/run/xen.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago tools/helper: honour XEN_RUN_DIR in init-xenstore-domain.c
Wei Liu [Mon, 11 Jul 2016 17:28:04 +0000 (18:28 +0100)]
tools/helper: honour XEN_RUN_DIR in init-xenstore-domain.c

Place the PID file under XEN_RUN_DIR. Note that this changes the default
location from /var/run to /var/run/xen.

Generate a _paths.h as that is required to make this change work.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xenconsoled: honour XEN_RUN_DIR
Wei Liu [Mon, 11 Jul 2016 17:28:03 +0000 (18:28 +0100)]
xenconsoled: honour XEN_RUN_DIR

Place the PID file under XEN_RUN_DIR by default. Note this changes the
default location from /var/run to /var/run/xen.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xl: rename variable pause to pause_after_migration
Wei Liu [Wed, 20 Jul 2016 08:30:17 +0000 (09:30 +0100)]
xl: rename variable pause to pause_after_migration

Gcc 4.4.4 complained that the "pause" variable introduced in 22b430e0
("xl: add option to leave domain paused after migration") shadowed
pause(2) declaration in unistd.h.

Rename "pause" to "pause_after_migration" to fix this issue.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xen: credit2: fix two s_time_t handling issues in load balancing
Dario Faggioli [Wed, 20 Jul 2016 09:50:12 +0000 (10:50 +0100)]
xen: credit2: fix two s_time_t handling issues in load balancing

both introduced in d205f8a7f48e2ec ("xen: credit2: rework
load tracking logic").

First, in __update_runq_load(), the ASSERT() was actually
useless. Let's instead check that the computed value of
the load has not overflowed (and hence gone negative).

While there, do that in __update_svc_load() as well.

Second, in balance_load(), cpus_max needs to be extended
in order to be correctly shifted, and the result compared
with an s_time_t value, without risking losing info.

Spotted by Coverity.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
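The second issue, widening a value before shifting it so that no bits are lost, can be sketched in isolation. The type alias and function name are hypothetical and the code is not taken from balance_load(); it only shows why the cast has to come before the shift.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t demo_s_time_t;  /* stand-in for a signed 64-bit time type */

/* Returns 1 if load exceeds cpus_max scaled by 2^shift, else 0. */
static int demo_exceeds(unsigned int cpus_max, unsigned int shift,
                        demo_s_time_t load)
{
    demo_s_time_t limit;

    assert(shift < 31);
    /* Widen first: shifting the 32-bit value could drop high bits. */
    limit = (demo_s_time_t)cpus_max << shift;
    return load > limit;
}
```

With `cpus_max = 256` and `shift = 30`, the 32-bit expression `256u << 30` wraps to 0, while the widened form yields 2^38 as intended.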
8 years ago xen: credit2: implement true SMT support
Dario Faggioli [Wed, 20 Jul 2016 09:55:55 +0000 (10:55 +0100)]
xen: credit2: implement true SMT support

In fact, right now, we recommend keeping runqueues
arranged per-core, so that it is the inter-runqueue load
balancing code that automatically spreads the work in an
SMT friendly way. This means that any other runq
arrangement one may want to use falls short of SMT
scheduling optimizations.

This commit implements SMT awareness --similar to the
one we have in Credit1-- for any possible runq
arrangement. This turned out to be pretty easy to do,
as the logic can live entirely in runq_tickle()
(although, in order to avoid for_each_cpu loops in
that function, we use a new cpumask which indeed needs
to be updated in other places).

In addition to disentangling SMT awareness from load
balancing, this also allows us to support the
sched_smt_power_savings parameter in Credit2 as well.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Anshul Makkar <anshul.makkar@citrix.com>
8 years ago xl: add option to leave domain paused after migration
Roger Pau Monne [Tue, 19 Jul 2016 08:58:15 +0000 (10:58 +0200)]
xl: add option to leave domain paused after migration

This is useful for debugging domains that crash on resume from migration.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago libxl: trigger attach events for devices attached before xl devd startup
Marek Marczykowski-Górecki [Fri, 15 Jul 2016 23:47:56 +0000 (01:47 +0200)]
libxl: trigger attach events for devices attached before xl devd startup

When this daemon is started after creating backend device, that device
will not be configured.

Racy situation:
1. driver domain is started
2. frontend domain is started (just after kicking driver domain off)
3. device in frontend domain is connected to the backend (as specified
   in frontend domain configuration)
4. xl devd is started in driver domain

End result is that backend device in driver domain is not configured
(like network interface is not enabled), so the device doesn't work.

Fix this by artificially triggering events for devices already present in
xenstore before xl devd is started. Do this only after xenstore watch is
already registered, and only for devices not already initialized (in
XenbusStateInitWait state).

Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years ago xenstore: add memory allocation debugging capability
Juergen Gross [Tue, 19 Jul 2016 12:08:18 +0000 (14:08 +0200)]
xenstore: add memory allocation debugging capability

Add support for debugging memory allocation statistics to xenstored.
Specifying "-M <file>" on the command line will enable the feature.
Whenever xenstored receives SIGUSR1 it will dump out a full talloc
report to <file>. This helps finding e.g. memory leaks in xenstored.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xenstore: use temporary memory context for firing watches
Juergen Gross [Tue, 19 Jul 2016 11:30:46 +0000 (13:30 +0200)]
xenstore: use temporary memory context for firing watches

Use a temporary memory context for memory allocations when firing
watches. This will avoid leaking memory in case of long living
connections and/or xenstore entries.

This requires adding a new parameter to fire_watches() and add_event()
to specify the memory context to use for allocations.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xenstore: add explicit memory context parameter to get_node()
Juergen Gross [Tue, 19 Jul 2016 11:30:45 +0000 (13:30 +0200)]
xenstore: add explicit memory context parameter to get_node()

Add a parameter to xenstored get_node() function to explicitly
specify the memory context to be used for allocations. This will make
it easier to avoid memory leaks by using a context which is freed
soon.

This requires adding the temporary context to errno_from_parents() and
ask_parents(), too.

When calling get_node() select a sensible memory context for the new
parameter by preferring a temporary one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xenstore: add explicit memory context parameter to read_node()
Juergen Gross [Tue, 19 Jul 2016 11:30:44 +0000 (13:30 +0200)]
xenstore: add explicit memory context parameter to read_node()

Add a parameter to xenstored read_node() function to explicitly
specify the memory context to be used for allocations. This will make
it easier to avoid memory leaks by using a context which is freed
soon.

When calling read_node() select a sensible memory context for the new
parameter by preferring a temporary one.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xenstore: add explicit memory context parameter to get_parent()
Juergen Gross [Tue, 19 Jul 2016 11:30:43 +0000 (13:30 +0200)]
xenstore: add explicit memory context parameter to get_parent()

Add a parameter to xenstored get_parent() function to explicitly
specify the memory context to be used for allocations. This will make
it easier to avoid memory leaks by using a context which is freed
soon.

When available use a temporary context when calling get_parent(),
otherwise mimic the old behavior by calling get_parent() with the same
argument for both parameters.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xenstore: call each xenstored command function with temporary context
Juergen Gross [Tue, 19 Jul 2016 11:30:42 +0000 (13:30 +0200)]
xenstore: call each xenstored command function with temporary context

In order to be able to avoid leaving temporary memory allocated after
processing of a command in xenstored call all command functions with
the temporary "in" context. Each function can then make use of that
temporary context for allocating temporary memory instead of either
leaving that memory allocated until the connection is dropped (or
even until end of xenstored) or freeing the memory itself.

This requires modifying the interfaces of the functions taking only
one argument from the connection by moving the call of onearg() into
the single functions. Other than that no functional change.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years ago xen/x86: Identify legitimate fallthrough cases
Andrew Cooper [Fri, 15 Jul 2016 15:36:07 +0000 (15:36 +0000)]
xen/x86: Identify legitimate fallthrough cases

The case in arch_set_info_guest() is a legitimate fallthrough.  Mark it as such.

The cases in vlapic_accept_irq() are a terminal error path, but Coverity fails
to spot this.  Reorder the comment to the end.

No functional change, but fixes two MISSING_BREAK Coverity defects.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years ago xen: credit2: the private scheduler lock can be an rwlock.
Dario Faggioli [Fri, 15 Jul 2016 14:50:18 +0000 (16:50 +0200)]
xen: credit2: the private scheduler lock can be an rwlock.

In fact, the data it protects only change either at init-time,
during cpupools manipulation, or when changing domains' weights.
In all other cases (namely, load balancing, reading weights
and status dumping), information is only read.

Therefore, let the lock be a read/write one. This means there
is no full serialization point for the whole scheduler and
for all the pCPUs of the host any longer.

This is particularly good for scalability (especially when doing
load balancing).

Also, update the high level description of the locking discipline,
and take the chance for rewording it a little bit (as well as
for adding a couple of locking related ASSERT()-s).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years ago tools: tracing: deal with new Credit2 events
Dario Faggioli [Fri, 15 Jul 2016 14:50:11 +0000 (16:50 +0200)]
tools: tracing: deal with new Credit2 events

more specifically, with: TICKLE_NEW, RUNQ_MAX_WEIGHT,
MIGRATE, LOAD_CHECK, LOAD_BALANCE and PICKED_CPU, and
in both xenalyze and formats (for xentrace_format).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxen: credit2: only marshall trace point arguments if tracing enabled
Dario Faggioli [Fri, 15 Jul 2016 14:50:04 +0000 (16:50 +0200)]
xen: credit2: only marshall trace point arguments if tracing enabled

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: add yet some more tracing
Dario Faggioli [Fri, 15 Jul 2016 14:49:56 +0000 (16:49 +0200)]
xen: credit2: add yet some more tracing

(and fix the style of two labels as well.)

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: make the code less experimental
Dario Faggioli [Fri, 15 Jul 2016 14:49:49 +0000 (16:49 +0200)]
xen: credit2: make the code less experimental

Mainly, almost all of the BUG_ON-s can be converted into
ASSERTS, and almost all the debug printk can either be
removed or turned into tracing.

The 'TODO' list, in a comment at the beginning of the file,
was also stale, so remove items that were still there but
are actually done.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: use non-atomic cpumask and bit operations
Dario Faggioli [Fri, 15 Jul 2016 14:49:40 +0000 (16:49 +0200)]
xen: credit2: use non-atomic cpumask and bit operations

as all the accesses to both the masks and the flags are
serialized by the runqueues locks already.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen/tools: improve tracing of Credit2 load tracking events
Dario Faggioli [Fri, 15 Jul 2016 14:49:33 +0000 (16:49 +0200)]
xen/tools: improve tracing of Credit2 load tracking events

Add the shift used for the precision of the integer
arithmetic to the trace records, and update both xenalyze
and xentrace_format to make use of/print it.

In particular, in xenalyze, we can now show the
load as an (easier to interpret) percentage.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoxen: credit2: rework load tracking logic
Dario Faggioli [Fri, 15 Jul 2016 14:49:26 +0000 (16:49 +0200)]
xen: credit2: rework load tracking logic

The existing load tracking code was hard to understand and
maintain, and not entirely consistent. This is due to a
number of reasons:
 - code and comments were not in perfect sync, making it
   difficult to figure out what the intent of a particular
   choice was (e.g., the choice of 18 for load_window_shift);
 - the math, although effective, was not entirely consistent.
   In fact, we were doing (if W is the length of the window):

    avgload = (delta*load*W + (W - delta)*avgload)/W
    avgload = avgload + delta*load - delta*avgload/W

   which does not match any known variant of 'smoothing
   moving average'. In fact, it should have been:

    avgload = avgload + delta*load/W - delta*avgload/W

   (for details on why, see the doc comments inside this
   patch). Furthermore, with delta ~= W, the old formula
   degenerates to:

    avgload ~= avgload + W*load - avgload
    avgload ~= W*load

The reason why the formula above sort of worked was because
the number of bits used for the fractional parts of the
values used in fixed point math and the number of bits used
for the length of the window were the same (load_window_shift
was being used for both).

This may look handy, but it introduced a (not especially well
documented) dependency between the length of the window and
the precision of the calculations, which really should be
two independent things. Especially if treating them as such
(like it is done in this patch) does not lead to more
complex maths (same number of multiplications and shifts, and
there is still room for some optimization).

Therefore, in this patch, we:
 - split length of the window and precision (and, since there
   is already a command line parameter for length of window,
   introduce one for precision too),
 - align the math with one proper incarnation of exponential
   smoothing (at no added cost),
 - add comments, about the details of the algorithm and the
   math used.

While there, fix a couple of style issues as well (pointless
initialization, long lines, comments).
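
As an illustration of the corrected formula above, here is a minimal
sketch in C of one smoothing step in integer arithmetic, with the window
length and the precision kept as two independent shifts. The function
and parameter names are hypothetical and are not taken from the patch;
the actual Credit2 code may well differ.

```c
#include <stdint.h>

/*
 * One smoothing step (hypothetical helper, not the Credit2 code):
 *
 *   avgload = avgload + delta*load/W - delta*avgload/W
 *
 * W = 2^window_shift is the length of the window; avgload carries
 * precision_shift fractional bits. The two shifts are independent.
 */
static uint64_t smooth_load(uint64_t avgload, unsigned int load,
                            uint64_t delta, unsigned int window_shift,
                            unsigned int precision_shift)
{
    uint64_t window = UINT64_C(1) << window_shift;
    uint64_t scaled = (uint64_t)load << precision_shift;

    /* A full window (or more) has elapsed: only the new load matters. */
    if ( delta >= window )
        return scaled;

    /* Same as avgload + (delta*scaled - delta*avgload)/W, in one division. */
    return (scaled * delta + avgload * (window - delta)) >> window_shift;
}
```

With a window of 2^10 time units and 8 bits of precision, half a window
spent at load 1 moves an average of 0 up to 128, i.e. 0.5 in fixed point.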

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: credit2: prevent load balancing to go mad if time goes backwards
Dario Faggioli [Fri, 15 Jul 2016 14:49:18 +0000 (16:49 +0200)]
xen: credit2: prevent load balancing to go mad if time goes backwards

This really should not happen, but:
 1. it does happen! Some more info here:
    http://lists.xen.org/archives/html/xen-devel/2016-06/msg00922.html
 2. independently from 1, it makes sense and is easy enough
    to have a 'safety catch'.

The reason why this is particularly bad for Credit2 is that
negative values of delta mean out of scale high load (because
of the conversion to unsigned). This, for instance in the
case of runqueue load, results in a runqueue having its load
updated to values of the order of 10000% or so, which in turn
means that the load balancer will migrate everything off from
the pCPUs in the runqueue, and leave them idle until the load
gets back to something sane... which may indeed take a while!

This is not a fix for the problem of time going backwards. In
fact, if that happens a lot, load tracking accuracy is still
compromised, but at least the effect is a lot less bad than
before.
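
The effect of the unsigned conversion described above, and one possible
shape for such a 'safety catch', can be sketched in C as follows. Both
helper names are hypothetical; this only illustrates the idea and is not
the code from the patch.

```c
#include <stdint.h>

/* Naive version: a negative difference wraps to a huge unsigned value. */
static uint64_t naive_elapsed(int64_t now, int64_t last_update)
{
    return (uint64_t)(now - last_update);
}

/* Guarded version: treat "time went backwards" as "no time elapsed". */
static uint64_t elapsed_or_zero(int64_t now, int64_t last_update)
{
    int64_t delta = now - last_update;

    if ( delta < 0 )
        return 0;

    return (uint64_t)delta;
}
```

Clamping to zero does not make the load figures exact when the clock
misbehaves, but it keeps a single bad sample from dominating the average.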

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years agoxen: sched: leave CPUs doing tasklet work alone.
Dario Faggioli [Fri, 15 Jul 2016 14:49:11 +0000 (16:49 +0200)]
xen: sched: leave CPUs doing tasklet work alone.

In both Credit1 and Credit2, stop considering a pCPU idle,
if the reason why the idle vCPU is being selected, is to
do tasklet work.

Not doing so means that the tickling and load balancing
logic, seeing the pCPU as idle, considers it a candidate
for picking up vCPUs. But the pCPU won't actually pick
up or schedule any vCPU, which would then remain in the
runqueue, which is bad, especially if there were other,
truly idle pCPUs, that could execute it.

The only drawback is that we can't assume that a pCPU is
always marked as idle when being removed from an
instance of the Credit2 scheduler (csched2_deinit_pdata).
In fact, if we are in stop-machine (i.e., during suspend
or shutdown), the pCPUs are running the stopmachine_tasklet
and hence are actually marked as busy. On the other hand,
when removing a pCPU from a Credit2 pool, it will indeed
be idle. The only thing we can do, therefore, is to
remove the BUG_ON() check.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
8 years agotravis: Add checkpolicy to the list of packages
Andrew Cooper [Fri, 15 Jul 2016 12:51:30 +0000 (13:51 +0100)]
travis: Add checkpolicy to the list of packages

Since c/s 41b61be1c "xsm: add a default policy to .init.data", checkpolicy is
required for the hypervisor build if randconfig decides to enable XSM.

Identified by a Travis randconfig run:
  https://travis-ci.org/andyhhp/xen/jobs/144989065

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
8 years agoasm/atomic.h: implement missing and add common prototypes
Corneliu ZUZU [Fri, 15 Jul 2016 10:46:46 +0000 (13:46 +0300)]
asm/atomic.h: implement missing and add common prototypes

ARM (<asm-arm/atomic.h>):
* add atomic_add_unless() wrapper over __atomic_add_unless()
  (for common-code interface, i.e. <xen/atomic.h>)

X86 (<asm-x86/atomic.h>):
* implement missing functions atomic_{sub,inc,dec}_return(), atomic_add_unless()
* implement missing macro atomic_xchg()

COMMON (<xen/atomic.h>):
* add prototypes for the aforementioned newly implemented X86 functions in
  common <xen/atomic.h>
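
For readers unfamiliar with the operation, here is a minimal sketch of
"add unless" semantics written with C11 <stdatomic.h>. It is not the Xen
implementation (which uses architecture-specific primitives), the name
add_unless_sketch() is hypothetical, and returning the old value is a
choice made for this sketch only.

```c
#include <stdatomic.h>

/*
 * Atomically add 'a' to '*v' unless '*v' equals 'u'.
 * Returns the value observed before the (possible) addition.
 */
static int add_unless_sketch(atomic_int *v, int a, int u)
{
    int old = atomic_load(v);

    while ( old != u &&
            !atomic_compare_exchange_weak(v, &old, old + a) )
        ; /* on failure, 'old' has been refreshed with the current value */

    return old;
}
```

A typical use is taking a reference only while the count is non-zero:
add_unless_sketch(&refcnt, 1, 0) leaves a zero counter untouched.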

Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agoasm-arm/atomic.h: atomic_{inc, dec}_return: macros to inline functions
Corneliu ZUZU [Fri, 15 Jul 2016 10:46:00 +0000 (13:46 +0300)]
asm-arm/atomic.h: atomic_{inc, dec}_return: macros to inline functions

Turn atomic_inc_return and atomic_dec_return atomic.h macros to inline
functions.

Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/atomic.h: fix: make atomic_read() param const
Corneliu ZUZU [Fri, 15 Jul 2016 10:44:10 +0000 (13:44 +0300)]
xen/atomic.h: fix: make atomic_read() param const

This wouldn't let me make a param of a function that used atomic_read() const.

Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years agoasm/atomic.h: common prototyping (add xen/atomic.h)
Corneliu ZUZU [Fri, 15 Jul 2016 10:44:58 +0000 (13:44 +0300)]
asm/atomic.h: common prototyping (add xen/atomic.h)

Create a common-side <xen/atomic.h> to establish, among others, prototypes of
atomic functions called from common-code. Done to avoid introducing
inconsistencies between arch-side <asm/atomic.h> headers when we make subtle
changes to one of them. Some arm-side macros had to be turned into inline
functions in the process.

Removed outdated comment ("NB. I've [...]").

Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agoasm-arm/atomic.h: reorder macros to match x86-side
Corneliu ZUZU [Fri, 15 Jul 2016 10:42:48 +0000 (13:42 +0300)]
asm-arm/atomic.h: reorder macros to match x86-side

Reorder macro definitions to match x86-side.

Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agoasm-x86/atomic.h: minor: proper atomic_inc_and_test() placement
Corneliu ZUZU [Fri, 15 Jul 2016 10:42:07 +0000 (13:42 +0300)]
asm-x86/atomic.h: minor: proper atomic_inc_and_test() placement

Place atomic_inc_and_test() implementation after atomic_inc().
Also empty line fix.

Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoasm-arm/atomic.h: fix arm32|arm64 macros duplication
Corneliu ZUZU [Fri, 15 Jul 2016 10:41:39 +0000 (13:41 +0300)]
asm-arm/atomic.h: fix arm32|arm64 macros duplication

Move duplicate macros between asm-arm/arm32/atomic.h and asm-arm/arm64/atomic.h
to asm-arm/atomic.h.

Signed-off-by: Corneliu ZUZU <czuzu@bitdefender.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agox86/shadow: Fix build with CONFIG_SHADOW_PAGING=n following c/s 2fc002b
Andrew Cooper [Fri, 15 Jul 2016 12:07:09 +0000 (13:07 +0100)]
x86/shadow: Fix build with CONFIG_SHADOW_PAGING=n following c/s 2fc002b

c/s 2fc002b "xen: Use a typesafe to define INVALID_GFN" changed INVALID_GFN to
be a boxed type.

Identified by a Travis randconfig run:
  https://travis-ci.org/xen-project/xen/jobs/144980445

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agoxen/build: Use C99 booleans
Andrew Cooper [Wed, 13 Jul 2016 13:55:48 +0000 (14:55 +0100)]
xen/build: Use C99 booleans

and switch bool_t to being of type _Bool rather than char.

Using bool_t as char causes several subtle problems; first that a bool_t
actually has more than two values, and that (bool_t)0x100 actually has the
value 0 rather than the expected 1, due to truncation.

Making this change reveals two bugs now caught by the compiler.
errata_c6_eoi_workaround() actually makes use of bool_t having more than two
states, while generic_apic_probe() has an integer in the middle of a compound
bool_t assignment (which triggers a [-Werror=parentheses] warning on Debian
Jessie).

Finally, it turns out that ARM is mixing and matching bool_t and bool, despite
their different semantics.  This change brings the semantics of bool_t to
match bool, but does not alter the current mix.
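
The truncation problem described above can be reproduced with a short
sketch. The typedef below is a stand-in for the old char-based bool_t;
unsigned char is used here (an assumption of this example) so that the
narrowing conversion is fully defined by the C standard.

```c
#include <stdbool.h>

/* Stand-in for the old char-based bool_t (hypothetical name). */
typedef unsigned char char_bool_t;

/* Narrowing to a char type keeps only the low 8 bits. */
static int to_char_bool(int v)
{
    return (char_bool_t)v;
}

/* Conversion to _Bool yields 1 for any non-zero value. */
static int to_c99_bool(int v)
{
    return (bool)v;
}
```

Here to_char_bool(0x100) is 0 while to_c99_bool(0x100) is 1, and
to_char_bool(2) is 2, i.e. a "boolean" that is neither 0 nor 1.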

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agoxen/flask: Rename cond_expr.bool to bool_val
Andrew Cooper [Thu, 14 Jul 2016 15:34:52 +0000 (16:34 +0100)]
xen/flask: Rename cond_expr.bool to bool_val

A subsequent change will introduce C99 bools, at which point 'bool'
becomes a type, and ineligible as a variable name.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
8 years agoVT-d: fix Device-TLB flush timeout issue
Quan Xu [Fri, 8 Jul 2016 06:46:15 +0000 (00:46 -0600)]
VT-d: fix Device-TLB flush timeout issue

If Device-TLB flush timed out, we hide the target ATS device
immediately. By hiding the device, we make sure it can't be
assigned to any domain any longer (see device_assigned).

Signed-off-by: Quan Xu <quan.xu@intel.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Quan Xu <quan.xu@intel.com>
8 years agoIOMMU: add domain crash logic
Quan Xu [Fri, 8 Jul 2016 06:45:13 +0000 (00:45 -0600)]
IOMMU: add domain crash logic

Add domain crash logic to the generic IOMMU layer to benefit
all platforms.

No spamming of the log can occur. For DomU, we avoid logging any
message for already dying domains. For Dom0, that'll still be more
verbose than we'd really like, but it at least wouldn't outright
flood the console.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Quan Xu <quan.xu@intel.com>
8 years agoIOMMU/ATS: use a struct pci_dev * instead of SBDF
Quan Xu [Fri, 8 Jul 2016 06:44:23 +0000 (00:44 -0600)]
IOMMU/ATS: use a struct pci_dev * instead of SBDF

Do away with struct pci_ats_dev; integrate the few bits of information
in struct pci_dev (and as a result drop get_ats_device() altogether).
Hook ATS devices onto a linked list off of each IOMMU instead of on a
global one.

Signed-off-by: Quan Xu <quan.xu@intel.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Tested-by: Quan Xu <quan.xu@intel.com>
8 years agoXSM-Policy: allow source domain access to setpodtarget and getpodtarget for ballooning.
Anshul Makkar [Thu, 14 Jul 2016 14:46:12 +0000 (15:46 +0100)]
XSM-Policy: allow source domain access to setpodtarget and getpodtarget for ballooning.

Access to setpodtarget and getpodtarget is required by dom0 to set the balloon
targets for domU. The patch gives source domain (dom0) access to set
this target for domU and resolve the following permission denied error
message during ballooning :
avc:  denied  { setpodtarget } for domid=0 target=9
scontext=system_u:system_r:dom0_t
tcontext=system_u:system_r:domU_t tclass=domain

Signed-off-by: Anshul Makkar <anshul.makkar@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
8 years agoxsm: add a default policy to .init.data
Daniel De Graaf [Thu, 14 Jul 2016 14:18:47 +0000 (10:18 -0400)]
xsm: add a default policy to .init.data

This adds a Kconfig option and support for including the XSM policy from
tools/flask/policy in the hypervisor so that the bootloader does not
need to provide a policy to get sane behavior from an XSM-enabled
hypervisor.  The policy provided by the bootloader, if present, will
override the built-in policy.

The XSM policy is not moved out of tools because that remains the
primary location for installing and configuring the policy.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoxsm: rework policy_buffer globals
Daniel De Graaf [Thu, 14 Jul 2016 14:18:46 +0000 (10:18 -0400)]
xsm: rework policy_buffer globals

This makes the buffers function parameters instead of globals, in
preparation for adding alternate locations for the policy.

Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoarm: vgic: Split vgic_domain_init() functionality into two functions
Shanker Donthineni [Mon, 27 Jun 2016 20:33:39 +0000 (15:33 -0500)]
arm: vgic: Split vgic_domain_init() functionality into two functions

Separate the code logic that does the registration of vgic_v3/v2 ops
to a new function domain_vgic_register(). The intention of this
separation is to record the required mmio count in vgic_v3/v2_init()
and pass it to function domain_io_init() in a follow-up patch.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoarm/gic-v3: Remove an unused macro MAX_RDIST_COUNT
Shanker Donthineni [Mon, 27 Jun 2016 20:33:38 +0000 (15:33 -0500)]
arm/gic-v3: Remove an unused macro MAX_RDIST_COUNT

The macro MAX_RDIST_COUNT is no longer used after converting the code
to handle the number of redistributors dynamically. So remove it from
header file and the two other panic() messages that are not valid
anymore.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: vgic: Use dynamic memory allocation for vgic_rdist_region
Shanker Donthineni [Mon, 27 Jun 2016 20:33:37 +0000 (15:33 -0500)]
xen/arm: vgic: Use dynamic memory allocation for vgic_rdist_region

The number of Redistributor regions allowed for dom0 is hardcoded
to a define MAX_RDIST_COUNT which is 4. Some systems, especially
latest server chips, may have more than 4 redistributors. Either we
have to increase MAX_RDIST_COUNT to a bigger number or allocate
memory based on the number of redistributors that are found in MADT
table. In the worst case scenario, the macro MAX_RDIST_COUNT should
be equal to CONFIG_NR_CPUS in order to support per CPU Redistributors.

Increasing MAX_RDIST_COUNT has a side effect: it inflates the size of
'struct domain' and hits the BUILD_BUG_ON() in the domain build code path.

struct domain *alloc_domain_struct(void)
{
    struct domain *d;
    BUILD_BUG_ON(sizeof(*d) > PAGE_SIZE);
    d = alloc_xenheap_pages(0, 0);
    if ( d == NULL )
        return NULL;
...

This patch uses the second approach to fix the BUILD_BUG_ON() failure.
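
The trade-off can be sketched outside of Xen as follows: once the
per-region data sits behind a pointer, the size of the containing
structure no longer depends on the number of regions, so a build-time
size check keeps passing. All names here are hypothetical, and calloc()
stands in for whatever allocator the hypervisor actually uses.

```c
#include <stdlib.h>

#define DEMO_PAGE_SIZE 4096

/* Build-time assertion in the spirit of BUILD_BUG_ON() (C11). */
#define DEMO_BUILD_BUG_ON(cond) \
    _Static_assert(!(cond), "build-time check failed: " #cond)

struct demo_region {
    unsigned long base, size;
};

struct demo_domain {
    unsigned int nr_regions;
    struct demo_region *regions;   /* sized at run time, not by a macro */
};

/* Holds regardless of how many regions a system turns out to have. */
DEMO_BUILD_BUG_ON(sizeof(struct demo_domain) > DEMO_PAGE_SIZE);

static int demo_domain_init(struct demo_domain *d, unsigned int nr_regions)
{
    d->regions = calloc(nr_regions, sizeof(*d->regions));
    if ( d->regions == NULL )
        return -1;

    d->nr_regions = nr_regions;
    return 0;
}
```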

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoarm/gic-v3: Parse per-cpu redistributor entry in GICC subtable
Shanker Donthineni [Mon, 27 Jun 2016 20:33:36 +0000 (15:33 -0500)]
arm/gic-v3: Parse per-cpu redistributor entry in GICC subtable

The redistributor address can be specified either as part of GICC or
GICR subtable depending on the power domain. The current driver
doesn't support parsing a redistributor entry that is defined in the GICC
subtable. The GIC CPU subtable entry holds the associated Redistributor
base address if it is not in an always-on power domain.

The per CPU Redistributor size is not defined in ACPI specification.
Set the GICR region size to SZ_256K if the GIC hardware is capable of
Direct Virtual LPI Injection feature, SZ_128K otherwise.

This patch adds necessary code to handle both types of Redistributors
base addresses.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoarm/gic-v3: Move GICR subtable parsing into a new function
Shanker Donthineni [Thu, 14 Jul 2016 14:13:13 +0000 (15:13 +0100)]
arm/gic-v3: Move GICR subtable parsing into a new function

Add a new function to parse GICR subtable and move the code that
is specific to GICR table to a new function without changing the
function gicv3_acpi_init() behavior.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoarm/gic-v3: Do early GICD ioremap and clean up
Shanker Donthineni [Mon, 27 Jun 2016 20:33:34 +0000 (15:33 -0500)]
arm/gic-v3: Do early GICD ioremap and clean up

For ACPI based XEN boot, the GICD region needs to be accessed inside
the function gicv3_acpi_init() in a later patch. There is a duplicate
panic() message, one in the DTS probe and second one in the ACPI probe
path. For these two reasons, move the code that validates the GICD base
address and does the region ioremap to a separate function. The
following patch accesses the GICD region inside gicv3_acpi_init() for
finding per CPU Redistributor size.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoarm/gic-v3: Use acpi_table_parse_madt() to parse MADT subtables
Shanker Donthineni [Mon, 27 Jun 2016 20:33:33 +0000 (15:33 -0500)]
arm/gic-v3: Use acpi_table_parse_madt() to parse MADT subtables

The function acpi_table_parse_madt() does the same functionality as
function acpi_parse_entries(), except that it takes fewer arguments.

Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: traps: Data Abort are always unconditional
Julien Grall [Wed, 22 Jun 2016 13:21:03 +0000 (14:21 +0100)]
xen/arm: traps: Data Abort are always unconditional

The HSR encoding for an exception from a data abort does not contain a
condition code (see G6-4264 in ARM DDI 0487A.i) because they are
always unconditional.

So drop the pointless condition check.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: traps: Second attempt to correctly use the content of HPFAR_EL2
Julien Grall [Wed, 22 Jun 2016 13:21:02 +0000 (14:21 +0100)]
xen/arm: traps: Second attempt to correctly use the content of HPFAR_EL2

Commit c051618 "xen/arm: traps: Correctly interpret the content of the
register HPFAR_EL2" attempted to fix the interpretation of HPFAR_EL2.

However, the register contains a 4KB-aligned address. This means that
the reported address is not directly usable to know the faulting IPA.
The offset in the 4KB page can be found by looking at the associated virtual
address (FAR_EL2/HDFAR).
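
In other words, the faulting IPA is the page address taken from the
fault address register combined with the in-page offset of the virtual
address. A minimal sketch, assuming the page-aligned IPA has already
been extracted from HPFAR_EL2 (the register decoding itself is left
out), with hypothetical names:

```c
#include <stdint.h>

#define DEMO_PAGE_OFFSET_MASK UINT64_C(0xfff)   /* offset within a 4KB page */

/*
 * ipa_page: 4KB-aligned IPA derived from HPFAR_EL2
 * fault_va: faulting virtual address (FAR_EL2/HDFAR)
 */
static uint64_t faulting_ipa(uint64_t ipa_page, uint64_t fault_va)
{
    return (ipa_page & ~DEMO_PAGE_OFFSET_MASK) |
           (fault_va & DEMO_PAGE_OFFSET_MASK);
}
```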

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: Simply the definition of PAGE_SIZE by using the macro _AC
Julien Grall [Wed, 22 Jun 2016 13:21:01 +0000 (14:21 +0100)]
xen/arm: Simply the definition of PAGE_SIZE by using the macro _AC

The macro _AC is used to define constants for both assembly and C.
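
The idea behind such a macro can be sketched as below, assuming a
definition along the lines of the one found in Linux: in C the type
suffix is pasted onto the number, while the assembler sees the bare
number. The DEMO_ names are hypothetical and only used to keep the
example free of reserved identifiers.

```c
#ifdef __ASSEMBLY__
/* The assembler does not understand suffixes such as 'L': drop them. */
#define DEMO_AC(X, Y)   X
#else
/* In C, paste the suffix so the constant gets the intended type. */
#define DEMO_AC_(X, Y)  (X##Y)
#define DEMO_AC(X, Y)   DEMO_AC_(X, Y)
#endif

#define DEMO_PAGE_SHIFT     12
#define DEMO_PAGE_SIZE_AC   (DEMO_AC(1, L) << DEMO_PAGE_SHIFT)
```

A single definition of the page size can then be shared by C files and
assembly files without separate #ifdef blocks at each use.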

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: p2m: Rework the interface of apply_p2m_changes and use typesafe
Julien Grall [Tue, 12 Jul 2016 13:59:35 +0000 (14:59 +0100)]
xen/arm: p2m: Rework the interface of apply_p2m_changes and use typesafe

Most of the callers of apply_p2m_changes have a GFN, a MFN and the
number of frames to change in hand.

Rather than asking each caller to convert the frame to an address,
rework the interfaces to pass the GFN, MFN and the number of frames.

Note that it would be possible to do more clean-up in apply_p2m_changes,
but this will be done in a follow-up series.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: p2m: Use typesafe gfn for {max,lowest}_mapped_gfn
Julien Grall [Tue, 12 Jul 2016 13:59:34 +0000 (14:59 +0100)]
xen/arm: p2m: Use typesafe gfn for {max,lowest}_mapped_gfn

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: p2m: Introduce helpers to insert and remove mapping
Julien Grall [Tue, 12 Jul 2016 13:59:33 +0000 (14:59 +0100)]
xen/arm: p2m: Introduce helpers to insert and remove mapping

More than half of the arguments of INSERT and REMOVE are the same for
every caller. Simplify the callers of apply_p2m_changes by adding new
helpers which will fill common arguments with default values.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: Use the typesafes mfn and gfn in map_regions_rw_cache ...
Julien Grall [Tue, 12 Jul 2016 13:59:32 +0000 (14:59 +0100)]
xen/arm: Use the typesafes mfn and gfn in map_regions_rw_cache ...

to avoid mixing machine frame with guest frame. Also rename the
parameters of the function and drop pointless PAGE_MASK in the caller.
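
What "typesafe" buys can be shown with a small sketch: wrapping each
kind of frame number in its own single-member struct makes the compiler
reject a call that swaps them. The demo_ names are hypothetical and only
mimic the general pattern; they are not Xen's definitions.

```c
/* Distinct boxed types for guest and machine frame numbers. */
typedef struct { unsigned long g; } demo_gfn_t;
typedef struct { unsigned long m; } demo_mfn_t;

static inline demo_gfn_t demo_gfn(unsigned long n) { return (demo_gfn_t){ n }; }
static inline demo_mfn_t demo_mfn(unsigned long n) { return (demo_mfn_t){ n }; }
static inline unsigned long demo_gfn_x(demo_gfn_t g) { return g.g; }
static inline unsigned long demo_mfn_x(demo_mfn_t m) { return m.m; }

/*
 * Passing (mfn, gfn) instead of (gfn, mfn) here is a compile error,
 * whereas with two plain unsigned longs it would build silently.
 * Returns the first gfn past the mapped range.
 */
static unsigned long demo_map_span(demo_gfn_t start_gfn, demo_mfn_t start_mfn,
                                   unsigned long nr)
{
    (void)start_mfn;   /* a real implementation would install the mapping */
    return demo_gfn_x(start_gfn) + nr;
}
```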

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: Use the typesafes mfn and gfn in map_dev_mmio_region...
Julien Grall [Tue, 12 Jul 2016 13:59:31 +0000 (14:59 +0100)]
xen/arm: Use the typesafes mfn and gfn in map_dev_mmio_region...

to avoid mixing machine frame with guest frame. Also drop the prefix start_.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: p2m: Remove unused operation ALLOCATE
Julien Grall [Tue, 12 Jul 2016 13:59:30 +0000 (14:59 +0100)]
xen/arm: p2m: Remove unused operation ALLOCATE

The operation ALLOCATE is unused. If we ever need it, it could be
reimplemented with INSERT.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: dom0_build: Remove dead code in allocate_memory
Julien Grall [Tue, 12 Jul 2016 13:59:29 +0000 (14:59 +0100)]
xen/arm: dom0_build: Remove dead code in allocate_memory

The code to allocate memory when dom0 does not use direct mapping is
relying on the presence of memory nodes in the DT.

However, they are not present when booting using UEFI or when using
ACPI.

Rather than fixing the code, remove it because dom0 is always direct
memory mapped and therefore the code is never tested. Also add a
check to avoid disabling direct memory mapped and not implementing
the associated RAM bank allocation.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: map_regions_rw_cache: Map the region with p2m->default_access
Julien Grall [Tue, 12 Jul 2016 13:59:28 +0000 (14:59 +0100)]
xen/arm: map_regions_rw_cache: Map the region with p2m->default_access

The parameter 'access' is used by memaccess to restrict temporarily the
permission. This parameter should not be used for other purpose (such
as restricting permanently the permission).

Instead, we should use the default access requested by memaccess. When it
is not enabled, the access will be p2m_access_rwx (i.e no restriction
applied).

The type p2m_mmio_direct will map the region read-write and
non-executable before any further restriction by memaccess. Note that
this is already the resulting permission with the current combination
of the type and the access. So there is no functional change.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: arm64: Add helpers to decode and encode branch instructions
Julien Grall [Wed, 22 Jun 2016 11:15:23 +0000 (12:15 +0100)]
xen/arm: arm64: Add helpers to decode and encode branch instructions

We may need to update branch instruction when patching Xen.

The code has been imported from the files arch/arm64/kernel/insn.c
and arch/arm64/include/asm/insn.h in Linux v4.6.
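
To give an idea of what such helpers do, here is a minimal sketch that
encodes and decodes the offset of an AArch64 unconditional branch (B),
whose 26-bit immediate holds the byte offset divided by 4. The function
names are hypothetical and this is not the imported Linux code.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_B_OPCODE    UINT32_C(0x14000000)   /* B <label> */
#define DEMO_IMM26_MASK  UINT32_C(0x03ffffff)

/* Encode "B <pc + offset>". Fails if the offset cannot be represented. */
static bool demo_encode_b(int32_t offset, uint32_t *insn)
{
    /* Must be word aligned and within [-128MB, +128MB). */
    if ( (offset & 3) || offset < -(1 << 27) || offset >= (1 << 27) )
        return false;

    *insn = DEMO_B_OPCODE | (((uint32_t)offset >> 2) & DEMO_IMM26_MASK);
    return true;
}

/* Recover the signed byte offset from the imm26 field of a B/BL. */
static int32_t demo_decode_b(uint32_t insn)
{
    int32_t imm26 = (int32_t)(insn & DEMO_IMM26_MASK);

    if ( imm26 & (1 << 25) )      /* sign bit of the 26-bit field */
        imm26 -= (1 << 26);

    return imm26 * 4;
}
```

When patching, the offset generally has to be recomputed relative to the
new location of the instruction, which is why both directions are needed.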

Note that only the necessary helpers have been imported.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: arm64: Reserve a brk immediate to fault on purpose
Julien Grall [Wed, 22 Jun 2016 11:15:22 +0000 (12:15 +0100)]
xen/arm: arm64: Reserve a brk immediate to fault on purpose

It may not be possible to return a proper error when encoding an
instruction. Instead, a handcrafted instruction will be returned.

Also, provide the encoding for the faulting instruction.
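
For reference, an AArch64 BRK instruction carries its 16-bit immediate
in bits [20:5], so an encoding for a reserved immediate can be built as
in the sketch below. The names are hypothetical, and the immediate value
that the patch actually reserves is not stated in this log.

```c
#include <stdint.h>

#define DEMO_BRK_OPCODE UINT32_C(0xd4200000)   /* BRK #0 */

/* Build the encoding of "BRK #imm16". */
static uint32_t demo_encode_brk(uint16_t imm16)
{
    return DEMO_BRK_OPCODE | ((uint32_t)imm16 << 5);
}
```

An encoder that cannot produce a valid instruction can return such a
value, so that executing the result traps instead of running garbage.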

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: arm64: Move the define BRK_BUG_FRAME into a separate header
Julien Grall [Wed, 22 Jun 2016 11:15:21 +0000 (12:15 +0100)]
xen/arm: arm64: Move the define BRK_BUG_FRAME into a separate header

New immediates will be defined in the future. To keep track of the
immediates allocated, gather all of them in a separate header.

Also rename BRK_BUG_FRAME to BRK_BUG_FRAME_IMM.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm64: Add an helper to invalidate all instruction caches
Julien Grall [Wed, 22 Jun 2016 11:15:20 +0000 (12:15 +0100)]
xen/arm64: Add an helper to invalidate all instruction caches

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: Add cpu_hwcap bitmap
Julien Grall [Wed, 22 Jun 2016 11:15:19 +0000 (12:15 +0100)]
xen/arm: Add cpu_hwcap bitmap

This will be used to know whether a feature that Xen cares about is
available across all the CPUs.

This code is a light version of arch/arm64/kernel/cpufeature.c from
Linux v4.6-rc3.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: Add macros to handle the MIDR
Julien Grall [Wed, 22 Jun 2016 11:15:18 +0000 (12:15 +0100)]
xen/arm: Add macros to handle the MIDR

Add new macros to easily get different parts of the register and to
check if a given MIDR matches a CPU model range. The latter will be really
useful to handle errata later.
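
As a rough illustration of what such macros look like, the sketch below
extracts the architecturally defined MIDR fields and checks a CPU model
against a variant/revision range. The DEMO_ names and the range helper
are hypothetical; they are not the macros imported by the patch.

```c
#include <stdint.h>

/* MIDR layout: implementer[31:24] variant[23:20] arch[19:16]
 * partnum[15:4] revision[3:0] */
#define DEMO_MIDR_IMPLEMENTER(midr)   (((midr) >> 24) & 0xffu)
#define DEMO_MIDR_VARIANT(midr)       (((midr) >> 20) & 0xfu)
#define DEMO_MIDR_ARCHITECTURE(midr)  (((midr) >> 16) & 0xfu)
#define DEMO_MIDR_PARTNUM(midr)       (((midr) >> 4) & 0xfffu)
#define DEMO_MIDR_REVISION(midr)      ((midr) & 0xfu)

/*
 * True if 'midr' names the given implementer/part and its combined
 * (variant << 4 | revision) lies within [rv_min, rv_max].
 */
static int demo_midr_in_range(uint32_t midr, uint32_t implementer,
                              uint32_t partnum, uint32_t rv_min,
                              uint32_t rv_max)
{
    uint32_t rv = (DEMO_MIDR_VARIANT(midr) << 4) | DEMO_MIDR_REVISION(midr);

    return DEMO_MIDR_IMPLEMENTER(midr) == implementer &&
           DEMO_MIDR_PARTNUM(midr) == partnum &&
           rv >= rv_min && rv <= rv_max;
}
```

An erratum table can then describe affected CPUs as a part number plus a
range of revisions rather than a list of exact MIDR values.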

The macros have been imported from the header
arch/arm64/include/asm/cputype.h in Linux v4.6-rc3.

Also remove MIDR_MASK which is unused.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
8 years agoxen/arm: Include the header asm-arm/system.h in asm-arm/page.h
Julien Grall [Wed, 22 Jun 2016 11:15:17 +0000 (12:15 +0100)]
xen/arm: Include the header asm-arm/system.h in asm-arm/page.h

The header asm-arm/page.h makes use of the macro dsb defined in the header
asm-arm/system.h. Currently, the includer has to specify both of them.

This can be avoided by including asm-arm/system.h in asm-arm/page.h.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>