xenbits.xensource.com Git - xen.git/log
7 years agoxen/arm: Move co-processor emulation outside of traps.c
Julien Grall [Thu, 14 Sep 2017 17:08:57 +0000 (18:08 +0100)]
xen/arm: Move co-processor emulation outside of traps.c

The co-processor emulation is quite big and pretty much standalone. Move
it into a separate file to shrink traps.c.

At the same time remove unused cpregs.h.

No functional change.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: Move sysreg emulation outside of traps.c
Julien Grall [Thu, 14 Sep 2017 17:08:56 +0000 (18:08 +0100)]
xen/arm: Move sysreg emulation outside of traps.c

The sysreg emulation is 64-bit specific and surrounded by #ifdef. Move
it into a separate file, arm/arm64/vsysreg.c, to shrink traps.c a bit.

No functional change.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: traps: Export a bunch of helpers to handle emulation
Julien Grall [Thu, 14 Sep 2017 17:08:55 +0000 (18:08 +0100)]
xen/arm: traps: Export a bunch of helpers to handle emulation

A follow-up patch will move some parts of traps.c into separate files.
This will require helpers that are currently defined static.
Export the following helpers:
    - inject_undef64_exception
    - inject_undef_exception
    - check_conditional_instr
    - advance_pc
    - handle_raz_wi
    - handle_wo_wi
    - handle_ro_raz

Note that asm-arm/arm32/traps.h is empty but it is to keep parity with
the arm64 counterpart.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen: credit2: implement utilization cap
Dario Faggioli [Thu, 14 Sep 2017 16:30:36 +0000 (17:30 +0100)]
xen: credit2: implement utilization cap

This commit implements the Xen part of the cap mechanism for
Credit2.

A cap is how much, in terms of % of physical CPU time, a domain
can execute at most.

For instance, a domain that must not use more than 1/4 of
one physical CPU, must have a cap of 25%; one that must not
use more than 1+1/2 of physical CPU time, must be given a cap
of 150%.

Caps are per domain, so it is all a domain's vCPUs, cumulatively,
that will be forced to execute no more than the decided amount.

This is implemented by giving each domain a 'budget', and
using a (per-domain again) periodic timer. Values of budget
and 'period' are chosen so that budget/period is equal to the
cap itself.

Budget is burned by the domain's vCPUs, in a similar way to
how credits are.

When a domain runs out of budget, its vCPUs can't run any
longer. They can run again once the budget is replenished by
the timer, which happens once every period.
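
The budget arithmetic described above can be sketched roughly as follows. This is an illustration only, not the Credit2 code: the struct, its fields and the function names are hypothetical, and the refill is simplified to a plain reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-domain accounting state; not the real csched2_dom layout. */
struct dom_budget {
    int64_t budget_ns;     /* time the domain may still run in this period */
    int64_t tot_budget_ns; /* full budget granted at each replenishment */
};

/*
 * cap_pct is a percentage of one physical CPU: 25 means 1/4 of a CPU,
 * 150 means one and a half CPUs. budget/period then equals the cap.
 */
static int64_t budget_for_cap(unsigned int cap_pct, int64_t period_ns)
{
    return period_ns * cap_pct / 100;
}

static void budget_init(struct dom_budget *d, unsigned int cap_pct,
                        int64_t period_ns)
{
    assert(period_ns > 0);
    d->tot_budget_ns = budget_for_cap(cap_pct, period_ns);
    d->budget_ns = d->tot_budget_ns;
}

/* Charge runtime to the domain; returns true while its vCPUs may keep running. */
static bool budget_burn(struct dom_budget *d, int64_t ran_ns)
{
    d->budget_ns -= ran_ns;
    return d->budget_ns > 0;
}

/* What the periodic timer would do in this simplified model: refill. */
static void budget_replenish(struct dom_budget *d)
{
    d->budget_ns = d->tot_budget_ns;
}
```

With a 10ms period, a 25% cap yields 2.5ms of budget per period, shared by all of the domain's vCPUs.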

Blocking the vCPUs because of lack of budget happens by
means of a new (_VPF_parked) pause flag, so that, e.g.,
vcpu_runnable() still works. This is similar to what is
done in sched_rtds.c, as opposed to what happens in
sched_credit.c, where vcpu_pause() and vcpu_unpause() are used
(which means, among other things, more overhead).

Note that, while adding new fields to csched2_vcpu and
csched2_dom, currently existing members are being moved
around, to achieve best placement inside cache lines.

Note also that xenalyze and tools/xentrace/format are being
updated too.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
7 years agox86/mm: initialize ol1e in create_grant_pv_mapping() for older compilers
Boris Ostrovsky [Thu, 14 Sep 2017 16:01:38 +0000 (18:01 +0200)]
x86/mm: initialize ol1e in create_grant_pv_mapping() for older compilers

On gcc 4.4.4:

mm.c: In function 'create_grant_pv_mapping':
mm.c:3839: error: 'ol1e.l1' may be used uninitialized in this function

While ol1e would not be used uninitialized (because rc needs to be properly
set) we have to accommodate these older compilers.
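
The pattern behind this kind of warning can be sketched as below. The types and helpers are stand-ins invented for illustration (they are not the functions in mm.c): a variable is filled in only when a callee succeeds, and the caller reads it only on success, but an older compiler cannot prove that, so an explicit initialiser keeps it quiet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a page table entry type; hypothetical. */
typedef struct { uint64_t l1; } l1_entry_t;

/* Hypothetical lookup: writes *out only on success (return value 0). */
static int read_entry(const uint64_t *table, unsigned int idx,
                      unsigned int nr, l1_entry_t *out)
{
    assert(table != NULL && out != NULL);
    if ( idx >= nr )
        return -1;
    out->l1 = table[idx];
    return 0;
}

static uint64_t entry_or_zero(const uint64_t *table, unsigned int idx,
                              unsigned int nr)
{
    /*
     * The initialiser is never relied upon: ol1e is only read when rc == 0.
     * It exists so that compilers which cannot follow that reasoning do not
     * report a possibly-uninitialised use.
     */
    l1_entry_t ol1e = { 0 };
    int rc = read_entry(table, idx, nr, &ol1e);

    if ( rc )
        return 0;

    return ol1e.l1;
}
```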

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agolibxl: fix disk listing function
Wei Liu [Thu, 14 Sep 2017 15:38:11 +0000 (16:38 +0100)]
libxl: fix disk listing function

The path should be "vbd" not "disk".

Fixes fbbaf2cc9 ("libxl: change disk to use generic getting list
functions").

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
7 years agolibxl: add back libxl_device_v{k,f}b_add
Wei Liu [Wed, 13 Sep 2017 13:44:09 +0000 (14:44 +0100)]
libxl: add back libxl_device_v{k,f}b_add

The two functions, unlike a lot of others, were hand-coded. They were
deleted by accident while the device framework was reworked. Add them
back.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
7 years agox86/oprofile: Add a missing space to initialisation failure message
Andrew Cooper [Wed, 13 Sep 2017 13:41:07 +0000 (14:41 +0100)]
x86/oprofile: Add a missing space to initialisation failure message

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mce: remove extra blanks in mctelem.c
Haozhong Zhang [Mon, 11 Sep 2017 07:57:58 +0000 (15:57 +0800)]
x86/mce: remove extra blanks in mctelem.c

The entire file of mctelem.c is in Linux coding style, so do not
change the coding style and only remove trailing spaces and extra
blank lines.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/mce: add emacs block to mctelem.c
Haozhong Zhang [Mon, 11 Sep 2017 07:57:57 +0000 (15:57 +0800)]
x86/mce: add emacs block to mctelem.c

mctelem.c uses tab indentation. Add an emacs block to avoid mixed
indentation styles in certain editors.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/mce: adapt mce_intel.c to Xen hypervisor coding style
Haozhong Zhang [Mon, 11 Sep 2017 07:57:56 +0000 (15:57 +0800)]
x86/mce: adapt mce_intel.c to Xen hypervisor coding style

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/mce: adapt mcaction.c to Xen hypervisor coding style
Haozhong Zhang [Mon, 11 Sep 2017 07:57:55 +0000 (15:57 +0800)]
x86/mce: adapt mcaction.c to Xen hypervisor coding style

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/vmce: adapt vmce.c to Xen hypervisor coding style
Haozhong Zhang [Mon, 11 Sep 2017 07:57:54 +0000 (15:57 +0800)]
x86/vmce: adapt vmce.c to Xen hypervisor coding style

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/mce: adapt mce.{c, h} to Xen hypervisor coding style
Haozhong Zhang [Mon, 11 Sep 2017 07:57:53 +0000 (15:57 +0800)]
x86/mce: adapt mce.{c, h} to Xen hypervisor coding style

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/mm: Prevent 32bit PV guests using out-of-range linear addresses
Andrew Cooper [Fri, 11 Aug 2017 13:02:31 +0000 (13:02 +0000)]
x86/mm: Prevent 32bit PV guests using out-of-range linear addresses

The grant ABI uses 64 bit values, and allows a PV guest to specify linear
addresses.  There is nothing interesting a 32bit PV guest can reference which
will pass an __addr_ok() check (and therefore succeed), but we should still
explicitly check and reject such an attempt.
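
A minimal sketch of such an explicit check, assuming a hypothetical helper name and a boolean that says whether the caller is a 32bit guest (neither is taken from the patch itself): the 64 bit value is rejected when it does not survive truncation to 32 bits.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 0 if the linear address is acceptable for
 * the calling guest, -EINVAL if a 32bit guest passed a value above 4GiB.
 */
static int check_grant_linear_addr(bool guest_is_32bit, uint64_t addr)
{
    if ( guest_is_32bit && addr != (uint32_t)addr )
        return -EINVAL;

    return 0;
}
```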

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: Combine {destroy,replace}_grant_{pte,va}_mapping()
Andrew Cooper [Tue, 1 Aug 2017 15:39:59 +0000 (16:39 +0100)]
x86/mm: Combine {destroy,replace}_grant_{pte,va}_mapping()

As with the create side of things, these are largely identical.  Most cases
are actually destroying the mapping rather than replacing it with a stolen
entry.

Reimplement their logic in replace_grant_pv_mapping() in a mostly common
way.

No (intended) change in behaviour from a guest's point of view.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: Carve steal_linear_address() out of replace_grant_host_mapping()
Andrew Cooper [Tue, 1 Aug 2017 15:39:59 +0000 (16:39 +0100)]
x86/mm: Carve steal_linear_address() out of replace_grant_host_mapping()

Document its curious semantics.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: Combine create_grant_{pte,va}_mapping()
Andrew Cooper [Tue, 1 Aug 2017 15:39:59 +0000 (15:39 +0000)]
x86/mm: Combine create_grant_{pte,va}_mapping()

create_grant_{pte,va}_mapping() are nearly identical; all that is really
different between them is how they convert their addr parameter to the pte to
install the grant into.

Reimplement their logic in create_grant_pv_mapping() in a mostly common way.

No (intended) change in behaviour from a guest's point of view.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: Misc cleanup to {create,replace}_grant_host_mapping()
Andrew Cooper [Tue, 1 Aug 2017 15:39:59 +0000 (16:39 +0100)]
x86/mm: Misc cleanup to {create,replace}_grant_host_mapping()

The purpose of this patch is solely to simplify the resulting diff of later
changes.

 * Factor out curr and currd at the start of the functions.
 * Rename pte to nl1e.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: Factor out the grant flags to pte flags conversion logic
Andrew Cooper [Fri, 11 Aug 2017 11:20:40 +0000 (11:20 +0000)]
x86/mm: Factor out the grant flags to pte flags conversion logic

This fixes a bug where the requested AVAIL* flags were not honoured in an
unmap_and_replace operation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: Improvements to PV l1e mapping helpers
Andrew Cooper [Wed, 2 Aug 2017 11:40:02 +0000 (12:40 +0100)]
x86/mm: Improvements to PV l1e mapping helpers

Drop guest_unmap_l1e() and use unmap_domain_page() directly.  This will
simplify future cleanup.  Rename guest_map_l1e() to map_guest_l1e() to more
closely match the mapping nomenclature.

Switch map_guest_l1e() to using mfn_t.  Correct the comment to indicate that
it takes a linear address (not a virtual address), and correct the parameter
name.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agogitignore: add local vimrc files
Petre Pircalabu [Tue, 12 Sep 2017 14:32:03 +0000 (17:32 +0300)]
gitignore: add local vimrc files

Signed-off-by: Petre Pircalabu <ppircalabu@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: remove unneeded DEVICE_ADD macro
Oleksandr Grytsov [Tue, 11 Jul 2017 16:52:28 +0000 (19:52 +0300)]
libxl: remove unneeded DEVICE_ADD macro

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: change vtpm to use generic add function
Oleksandr Grytsov [Tue, 11 Jul 2017 16:26:07 +0000 (19:26 +0300)]
libxl: change vtpm to use generic add function

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: fix memory leak in libxl__colo_save_setup
Oleksandr Grytsov [Tue, 12 Sep 2017 13:31:58 +0000 (16:31 +0300)]
libxl: fix memory leak in libxl__colo_save_setup

When the userspace proxy is used, the nic list is fetched but
never freed. The fix is to keep the nic list in cds->nics,
which will be freed in devices_teardown_cb.

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: change nic to use generic add function
Oleksandr Grytsov [Tue, 11 Jul 2017 14:26:09 +0000 (17:26 +0300)]
libxl: change nic to use generic add function

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[ wei: add missing semicolon ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: change disk to use generic getting list functions
Oleksandr Grytsov [Tue, 11 Jul 2017 13:55:47 +0000 (16:55 +0300)]
libxl: change disk to use generic getting list functions

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: change vfb to use generic add function
Oleksandr Grytsov [Mon, 10 Jul 2017 17:34:07 +0000 (20:34 +0300)]
libxl: change vfb to use generic add function

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: change vkb to use generic add function
Oleksandr Grytsov [Mon, 10 Jul 2017 17:17:49 +0000 (20:17 +0300)]
libxl: change vkb to use generic add function

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: change p9 to use generic add function
Oleksandr Grytsov [Mon, 10 Jul 2017 14:03:59 +0000 (17:03 +0300)]
libxl: change p9 to use generic add function

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agodocs: add PV display driver information
Oleksandr Grytsov [Thu, 25 May 2017 11:55:27 +0000 (14:55 +0300)]
docs: add PV display driver information

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxl: add PV display device commands
Oleksandr Grytsov [Thu, 23 Mar 2017 15:26:41 +0000 (17:26 +0200)]
xl: add PV display device commands

Add commands: vdispl-attach, vdispl-list, vdispl-detach
and domain config vdispl parser

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: add vdispl device
Oleksandr Grytsov [Mon, 26 Jun 2017 11:36:41 +0000 (14:36 +0300)]
libxl: add vdispl device

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: add generic functions to get and free device list
Oleksandr Grytsov [Mon, 10 Jul 2017 13:50:12 +0000 (16:50 +0300)]
libxl: add generic functions to get and free device list

Add libxl__device_list and libxl__device_list_free
functions to handle device list using the device
framework.

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: add generic function to add device
Oleksandr Grytsov [Mon, 26 Jun 2017 13:08:56 +0000 (16:08 +0300)]
libxl: add generic function to add device

Add libxl__device_add to simply write the XenStore device config
and libxl__device_add_async to update the domain configuration
and write the XenStore device config asynchronously.
Almost all devices have a similar libxl__device_xxxx_add function.
These generic functions implement the same functionality but
use the device handling framework. The device specific
part, such as setting the XenStore configuration, is moved
to the set_xenstore_config callback of the device framework.

Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen/arm: Move arch/arm/vtimer.h to include/asm-arm/vtimer.h
Julien Grall [Tue, 12 Sep 2017 10:36:17 +0000 (11:36 +0100)]
xen/arm: Move arch/arm/vtimer.h to include/asm-arm/vtimer.h

It will be necessary to include vtimer.h from a subdirectory, which would
make the inclusion a bit awkward with the header in its current location.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: traps: Re-order the includes alphabetically
Julien Grall [Tue, 12 Sep 2017 10:36:16 +0000 (11:36 +0100)]
xen/arm: traps: Re-order the includes alphabetically

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Bhupinder Thakur <bhupinder.thakur@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agognttab: also validate PTE permissions upon destroy/replace
Jan Beulich [Tue, 12 Sep 2017 12:45:13 +0000 (14:45 +0200)]
gnttab: also validate PTE permissions upon destroy/replace

In order for PTE handling to match up with the reference counting done
by common code, presence and writability of grant mapping PTEs must
also be taken into account; validating just the frame number is not
enough. This is in particular relevant if a guest fiddles with grant
PTEs via non-grant hypercalls.
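
As a simplified illustration of the stricter check (hypothetical names and flag values, and a PTE layout reduced to a frame number plus low flag bits), the existing entry is accepted only if both the frame and the present/writable bits match what the grant mapping was created with:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT   12
#define PTE_PRESENT  0x001u  /* illustrative flag values */
#define PTE_RW       0x002u

/*
 * Hypothetical helper: 'pte' is the entry currently installed, 'frame' and
 * 'expected_flags' describe the grant mapping being destroyed or replaced.
 * Comparing the frame alone is not enough; presence and writability must
 * agree too, otherwise the reference counting would go out of sync.
 */
static bool grant_pte_matches(uint64_t pte, uint64_t frame,
                              unsigned int expected_flags)
{
    unsigned int flags = (unsigned int)(pte & 0xfff);

    if ( (pte >> PAGE_SHIFT) != frame )
        return false;

    return !((flags ^ expected_flags) & (PTE_PRESENT | PTE_RW));
}
```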

Note that the flags being passed to replace_grant_host_mapping()
already happen to be those of the existing mapping, so no new function
parameter is needed.

This is CVE-2017-14319 / XSA-234.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agotools/xenstore: don't unlink connection object twice
Juergen Gross [Tue, 12 Sep 2017 12:44:56 +0000 (14:44 +0200)]
tools/xenstore: don't unlink connection object twice

A connection object of a domain with associated stubdom has two
parents: the domain and the stubdom. When cleaning up the list of
active domains in domain_cleanup() make sure not to unlink the
connection twice from the same domain. This could happen when the
domain and its stubdom are being destroyed at the same time leading
to the domain loop being entered twice.

Additionally don't use talloc_free() in this case as it will remove
a random parent link, leading eventually to a memory leak. Use
talloc_unlink() instead specifying the context from which the
connection object should be removed.

This is CVE-2017-14317 / XSA-233.

Reported-by: Eric Chanudet <chanudete@ainfosec.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
7 years agogrant_table: fix GNTTABOP_cache_flush handling
Andrew Cooper [Tue, 12 Sep 2017 12:44:11 +0000 (14:44 +0200)]
grant_table: fix GNTTABOP_cache_flush handling

Don't fall over a NULL grant_table pointer when the owner of the domain
is a system domain (DOMID_{XEN,IO} etc).

This is CVE-2017-14318 / XSA-232.

Reported-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen/mm: make sure node is less than MAX_NUMNODES
George Dunlap [Tue, 12 Sep 2017 12:43:16 +0000 (14:43 +0200)]
xen/mm: make sure node is less than MAX_NUMNODES

The output of MEMF_get_node(memflags) can be as large as nodeid_t can
hold (currently 255).  This is then used as an index into arrays of size
MAX_NUMNODES, which is 64 on x86 and 1 on ARM.  The value can be passed
in by an untrusted guest (via memory_exchange and increase_reservation)
and is not currently bounds-checked.

Check the value in page_alloc.c before using it, and also check the
value in the hypercall call sites and return -EINVAL if appropriate.
Don't permit domains other than the hardware or control domain to
allocate node-constrained memory.
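
The shape of the missing bounds check can be sketched like this. The array and function are hypothetical; only MAX_NUMNODES and the x86 value of 64 come from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_NUMNODES 64  /* x86 value quoted in the commit message */

/* Hypothetical per-node array, indexed by a node number. */
static unsigned long node_avail[MAX_NUMNODES];

/*
 * 'node' may come from a guest-supplied value and can be as large as 255,
 * so it has to be checked before it is used as an index.
 */
static int node_free_pages(unsigned int node, unsigned long *out)
{
    assert(out != NULL);

    if ( node >= MAX_NUMNODES )
        return -EINVAL;

    *out = node_avail[node];
    return 0;
}
```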

This is CVE-2017-14316 / XSA-231.

Reported-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/shadow: Use ERR_PTR infrastructure for sh_emulate_map_dest()
Andrew Cooper [Fri, 8 Sep 2017 16:05:33 +0000 (19:05 +0300)]
x86/shadow: Use ERR_PTR infrastructure for sh_emulate_map_dest()

sh_emulate_map_dest() predates the introduction of the generic ERR_PTR()
infrastructure.  Take the opportunity to avoid open-coding it.

The chosen error constants need to be negative to work with IS_ERR(),
but no other changes are required.
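
For readers unfamiliar with the idiom, here is a self-contained sketch of an ERR_PTR()-style scheme. It mirrors the well-known convention of encoding small negative error codes in the top of the pointer range; the definitions below are written for this example and map_dest() is a made-up function, not the shadow code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative error code as a pointer value. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

/* Recover the error code from such a pointer. */
static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True if ptr lies in the last MAX_ERRNO values of the address range. */
static inline bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static char backing[16];

/* Hypothetical mapper: returns a usable pointer or an encoded error. */
static void *map_dest(size_t off)
{
    if ( off >= sizeof(backing) )
        return ERR_PTR(-1);  /* must be negative for IS_ERR() to see it */

    return &backing[off];
}
```

A positive constant would decode as an ordinary low address and IS_ERR() would not flag it, which is why the error values have to be negative.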

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen/mm: Use __virt_to_mfn in map_domain_page instead of virt_to_mfn
Julien Grall [Tue, 12 Sep 2017 10:03:09 +0000 (11:03 +0100)]
xen/mm: Use __virt_to_mfn in map_domain_page instead of virt_to_mfn

virt_to_mfn may be overridden by source files to improve type safety
locally.

Therefore map_domain_page has to use __virt_to_mfn to prevent any
compilation issues in source files that override the helper.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen/x86: mm: Introduce {G, M}FN <-> {G, M}ADDR helpers
Julien Grall [Tue, 12 Sep 2017 10:03:07 +0000 (11:03 +0100)]
xen/x86: mm: Introduce {G, M}FN <-> {G, M}ADDR helpers

The new wrappers will add more safety when converting an address to a
frame number (either machine or guest). They already exist for
Arm and could be useful in common code.
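
The idea behind such wrappers, sketched with definitions written for this example (the struct-wrapped frame number type and the gfn_make() constructor are assumptions, and a 4KiB page size is assumed): because a frame number is a distinct type, it cannot be passed where an address is expected without going through a conversion helper.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumes 4KiB pages */

typedef uint64_t paddr_t;

/* Wrapping the number in a struct makes frame/address mix-ups a type error. */
typedef struct { uint64_t gfn; } gfn_t;

static inline gfn_t gfn_make(uint64_t n)  /* hypothetical constructor */
{
    gfn_t g = { n };
    return g;
}

static inline uint64_t gfn_x(gfn_t g)
{
    return g.gfn;
}

/* Guest frame number to guest address, and back. */
static inline paddr_t gfn_to_gaddr(gfn_t g)
{
    return (paddr_t)gfn_x(g) << PAGE_SHIFT;
}

static inline gfn_t gaddr_to_gfn(paddr_t ga)
{
    return gfn_make(ga >> PAGE_SHIFT);
}
```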

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen/x86: Replace mandatory barriers with compiler barriers
Andrew Cooper [Wed, 16 Aug 2017 17:07:27 +0000 (18:07 +0100)]
xen/x86: Replace mandatory barriers with compiler barriers

In this case, rmb() is being used for its compiler barrier property.  Replace
it with an explicit barrier() and comment, to avoid it becoming an unnecessary
lfence instruction (when rmb() gets fixed) or looking like an SMP issue.
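
For reference, a compiler-only barrier in GCC-style C is commonly written as an empty asm statement with a memory clobber. The snippet below is a generic sketch with made-up variables, not the code touched by this patch; it emits no fence instruction and only stops the compiler from moving or caching memory accesses across that point.

```c
#include <assert.h>

/* GCC/Clang extension: empty asm with a "memory" clobber. */
#define barrier() __asm__ __volatile__ ( "" ::: "memory" )

/* Hypothetical shared state for the example. */
static unsigned int shared_flag, shared_data;

static unsigned int read_data_if_ready(void)
{
    if ( !shared_flag )
        return 0;

    /*
     * Compiler barrier only: the read of shared_data below must not be
     * hoisted above the flag check by the compiler. No CPU-level ordering
     * is requested here.
     */
    barrier();

    return shared_data;
}
```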

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agomem_access: switch to plain bool
Wei Liu [Mon, 11 Sep 2017 11:16:28 +0000 (12:16 +0100)]
mem_access: switch to plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
7 years agox86/mm: Allow map_domain_page_global() to be used during boot
Andrew Cooper [Thu, 7 Sep 2017 16:38:52 +0000 (17:38 +0100)]
x86/mm: Allow map_domain_page_global() to be used during boot

map_domain_page_global() uses vmap under the hood, which is set up immediately
after switching to SYS_STATE_boot.  Relax the local_irq_is_enabled() part of
the assertion before Xen has finished booting, so map_domain_page_global() can
be used during SMP preparation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agopci: constify domain parameter of pci_get_pdev_by_domain
Roger Pau Monné [Fri, 8 Sep 2017 14:25:24 +0000 (16:25 +0200)]
pci: constify domain parameter of pci_get_pdev_by_domain

While there fix the indentation.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agohvmloader: clone REP INSW test from REP INSB one
Jan Beulich [Fri, 8 Sep 2017 14:24:57 +0000 (16:24 +0200)]
hvmloader: clone REP INSW test from REP INSB one

This also covers an individual string insn access crossing a page
boundary.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agohvmloader: dynamically determine scratch memory range for tests
Jan Beulich [Fri, 8 Sep 2017 14:24:41 +0000 (16:24 +0200)]
hvmloader: dynamically determine scratch memory range for tests

This re-enables tests on configurations where commit 0d6968635c
("hvmloader: avoid tests when they would clobber used memory") forced
them to be skipped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/HVM: correct repeat count update in linear->phys translation
Jan Beulich [Fri, 8 Sep 2017 14:23:46 +0000 (16:23 +0200)]
x86/HVM: correct repeat count update in linear->phys translation

For the insn emulator's fallback logic in REP INS/OUTS handling
to work correctly, *reps must not be set to zero when returning
X86EMUL_UNHANDLEABLE.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Paul Durrant <paul.durrant@citrix.com>
7 years agomonitor: switch to plain bool
Wei Liu [Fri, 8 Sep 2017 13:44:33 +0000 (14:44 +0100)]
monitor: switch to plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
7 years agotools: eliminate LIBXL_BLKTAP2
Wei Liu [Mon, 4 Sep 2017 13:44:47 +0000 (14:44 +0100)]
tools: eliminate LIBXL_BLKTAP2

Use CONFIG_BLKTAP2 directly. There is no reason why one would want to
set LIBXL_BLKTAP2 separately as things stand.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools: disable blktap2 by default
Wei Liu [Mon, 4 Sep 2017 13:44:46 +0000 (14:44 +0100)]
tools: disable blktap2 by default

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agobuild: run autogen.sh on Stretch
Wei Liu [Mon, 4 Sep 2017 13:44:45 +0000 (14:44 +0100)]
build: run autogen.sh on Stretch

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agoMAINTAINERS: orphan blktap2
Wei Liu [Fri, 8 Sep 2017 10:34:22 +0000 (11:34 +0100)]
MAINTAINERS: orphan blktap2

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agox86/page: Implement {get,set}_pte_flags() as static inlines
Andrew Cooper [Wed, 6 Sep 2017 13:34:04 +0000 (14:34 +0100)]
x86/page: Implement {get,set}_pte_flags() as static inlines

This resolves 11 Coverity issues along the lines of the following:

1600        for ( i = 0; i < NR_RESERVED_GDT_PAGES; i++ )

    CID: Operands don't affect result
    (CONSTANT_EXPRESSION_RESULT)result_independent_of_operands: ((33U /* 1U |
    0x20U */) | (({...}) ? 8388608U /* 1U << 23 */ : 0) | 0x40U | 2U) & 4095
    is always 0x63 regardless of the values of its operands. This occurs as
    the bitwise second operand of "|".

1601            l1e_write(pl1e + FIRST_RESERVED_GDT_PAGE + i,
1602                      l1e_from_pfn(mfn + i, __PAGE_HYPERVISOR_RW));

This is presumably because once preprocessed, the association of joint logic
inside {get,set}_pte_flags() is lost.
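
A sketch of what static inline versions of such helpers can look like. The bit layout here is an assumption made for the example: the flags form a 24-bit value whose low 12 bits sit in PTE bits 0-11 and whose high 12 bits sit in PTE bits 52-63 (consistent with the 1U << 23 flag in the report above). The function names are likewise illustrative.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t intpte_t;

/* Gather PTE bits 52-63 and 0-11 into one 24-bit flags value. */
static inline unsigned int get_pte_flags(intpte_t x)
{
    return (unsigned int)(((x >> 40) & ~(intpte_t)0xfff) | (x & 0xfff));
}

/* Spread a 24-bit flags value back out into PTE bit positions. */
static inline intpte_t put_pte_flags(unsigned int x)
{
    return (((intpte_t)x & ~(intpte_t)0xfff) << 40) | (x & 0xfff);
}
```

As functions, the shift-and-mask steps stay grouped per call, so a static analyser sees one conversion rather than a long constant expression produced by macro expansion.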

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agoDEPS handling: Remove absolute paths from references to cwd
Ian Jackson [Mon, 4 Sep 2017 16:46:16 +0000 (17:46 +0100)]
DEPS handling: Remove absolute paths from references to cwd

In some directories we use gcc on source files elsewhere, to generate
a .o here in the current directory.  Eg in tools/libxl/,
   gcc -I -o build.o /path/to/libacpi/build.c
We pass -MMD and -MF options to generate a .d file right here.

In the general case this .c file might need to include things from the
directory here, eg libacpi/build.c eventually #includes various
*libxl*.h.  We pass gcc -I. for this, which means things from the cwd
where we invoked gcc, not the directory of the #including file.

When we do this, gcc's -MMD output mentions /path/to/libxl/*libxl*.h,
even though it could refer to simply *libxl*.h.  This is presumably
because gcc has noticed that `.' in this context must mean relative to
the invocation cwd, not relative to build.c, and gcc doesn't realise
that references in the .d file are also wrt the invocation cwd.

make distinguishes targets purely textually.  It will canonicalise a
target name by removing ./ before comparison (so _libxl_types.h and
./_libxl_types.h are considered the same target) but it won't examine
the filesystem.  So _libxl_types.h and
/path/to/tools/libxl/_libxl_types.h are different targets.

And, _libxl_types.h is generated from a pattern rule.  This pattern
rule is therefore instantiated twice, and the two instances may be run
concurrently - but use the same tempfiles and can therefore fail.

The thing that is wrong here is gcc's choice to output an absolute
path.

We could work around it by adding a rule to teach make about a
relationship between these `two different files'.  But this has to be
done for every autogenerated file and is therefore fragile (leaving a
race bug when we get it wrong).

Ideally we would fix the problem by fixing the .d file as it is
generated.  But the .d files are generated by many many rules
mentioning $(CC) and $(CFLAGS).  (We might in theory pass a bash
process substitution to -MF, but 1. that's not portable to people who
don't have bash and 2. it hangs, anyway.)

So instead we do this conversion at include time.  That is, we tell
make to include not the raw .d files, but the sedded ones.

The sedding removes occurrences of ` $PWD/'.  We use the shell
variable PWD because the make variable sometimes refers to the xen
toplevel.  If gcc's output format should change, then this sed rune
may not work any more, but that doesn't seem very likely.
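
The build itself does this with sed; purely to illustrate the text transformation, here is the same rewrite expressed as a small C function (the function name and the sample paths in the usage note are invented). Every occurrence of a space followed by the current directory and a slash is replaced by a single space, so an absolute dependency on a file in the cwd becomes a relative one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src to dst, replacing each " <cwd>/" with " ".
 * dst must have room for strlen(src) + 1 bytes.
 */
static void strip_cwd(const char *src, const char *cwd, char *dst)
{
    size_t clen;

    assert(src != NULL && cwd != NULL && dst != NULL);
    clen = strlen(cwd);

    while ( *src )
    {
        if ( src[0] == ' ' &&
             strncmp(src + 1, cwd, clen) == 0 &&
             src[1 + clen] == '/' )
        {
            *dst++ = ' ';
            src += clen + 2;  /* skip the space, the cwd and the slash */
        }
        else
            *dst++ = *src++;
    }

    *dst = '\0';
}
```

For example, with cwd set to /w/tools/libxl, the line "build.o: /src/libacpi/build.c /w/tools/libxl/_libxl_types.h" becomes "build.o: /src/libacpi/build.c _libxl_types.h"; paths outside the cwd are left alone.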

The rune is only effective for dependencies on files which are exactly
in the current directory, or a subdirectory of it named simply by its
subdirectory name.  If there are autogenerated include files which
exist in a sibling (or worse, somewhere else entirely), this
approach will not work, because we'd have to figure out what name this
Makefile usually uses to refer to them.  Hopefully such things don't
exist.

The indirect variables DEPS_RM and DEPS_INCLUDE are necessary to
preserve the assumptions made in the various Makefiles.  Specifically,
xen/ Makefiles assume that it is ok to say DEPS+=something (where
something is in a subdirectory); tools/ Makefiles all used to include
DEPS themselves (but now they include DEPS_INCLUDE); and many
Makefiles tended to explicitly rm DEPS (but now rm DEPS_RM).

In the new scheme of things: DEPS is the files that come out of gcc
(or perhaps an assembler or something) and may be assigned to by
Makefiles.  DEPS_INCLUDE is the processed form.  And DEPS_RM is both
combined, so that they both get cleaned.

We need to explicitly use $(wildcard ) to do the wildcard expansion on
DEPS a bit earlier.  If we didn't, then DEPS_INCLUDE would contain
`.*.d2' which would not exist.

Evaluation order: DEPS_RM and DEPS_INCLUDE are recursively expanded
variables, so that although they are defined early (in Config.mk),
their actual values are computed at the time of use, using the value
of DEPS that is prevailing at that time.

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoDEPS handling: Use DEPS_INCLUDE everywhere
Ian Jackson [Mon, 4 Sep 2017 16:46:15 +0000 (17:46 +0100)]
DEPS handling: Use DEPS_INCLUDE everywhere

DEPS_INCLUDE is currently the same as DEPS, so no functional change.

This patch is the result of this perl rune:

  git-grep -l 'include.*DEPS' | xargs perl -i -pe 'next unless m/^-?include/; s/\bDEPS\b/DEPS_INCLUDE/'

I have verified that I haven't missed anything, with this rune:

  git-grep '\bDEPS\b'

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoDEPS handling: Use DEPS_RM everywhere
Ian Jackson [Mon, 4 Sep 2017 16:46:14 +0000 (17:46 +0100)]
DEPS handling: Use DEPS_RM everywhere

DEPS_RM is currently the same as DEPS, so no functional change.

This patch is the result of two perl runes:

  git-grep -l 'rm.*DEPS' | xargs perl -i~ -pe 'next unless m/^\t+rm\b/; s/\bDEPS\b/DEPS_RM/;'

  git-grep -l 'RM.*DEPS' | xargs perl -i~ -pe 'next unless m/^\t+\$\(RM\)/; s/\bDEPS\b/DEPS_RM/;'

And editing  tools/xenstat/libxenstat/Makefile  by hand.

I verified that I didn't miss anything with this rune:

  git-grep '\bDEPS\b' | grep -v include |less

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoDEPS handling: Provide DEPS_RM and DEPS_INCLUDE
Ian Jackson [Mon, 4 Sep 2017 16:46:13 +0000 (17:46 +0100)]
DEPS handling: Provide DEPS_RM and DEPS_INCLUDE

These are not used anywhere yet, so no functional change.

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agomm: Don't scrub pages while holding heap lock in alloc_heap_pages()
Boris Ostrovsky [Wed, 6 Sep 2017 15:33:52 +0000 (11:33 -0400)]
mm: Don't scrub pages while holding heap lock in alloc_heap_pages()

Instead, preserve PGC_need_scrub bit when setting PGC_state_inuse
state while still under the lock and clear those pages later.

Note that we still need to grab the lock when clearing PGC_need_scrub
bit since count_info might be updated during MCE handling in
mark_page_offline().
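
The ordering described above can be sketched roughly as follows. This is a
single-threaded illustration only: the sk_* names, the flag values and the
boolean standing in for the heap lock are all assumptions made for the sketch,
not the code in alloc_heap_pages().

```c
#include <stdbool.h>
#include <string.h>

#define SK_PAGE_SIZE   64u          /* tiny page, for illustration only */
#define SK_INUSE       (1u << 0)    /* hypothetical stand-in for PGC_state_inuse */
#define SK_NEED_SCRUB  (1u << 1)    /* hypothetical stand-in for PGC_need_scrub */

struct sk_page {
    unsigned int count_info;
    unsigned char data[SK_PAGE_SIZE];
};

/* Stand-in for the heap lock: a flag, since the sketch is single-threaded. */
static bool sk_heap_locked;
static void sk_lock(void)   { sk_heap_locked = true; }
static void sk_unlock(void) { sk_heap_locked = false; }

static void sk_alloc(struct sk_page *pg, unsigned int nr)
{
    unsigned int i;

    sk_lock();
    /* Mark the pages in use, but keep the need-scrub bit for later. */
    for ( i = 0; i < nr; i++ )
        pg[i].count_info = SK_INUSE | (pg[i].count_info & SK_NEED_SCRUB);
    sk_unlock();

    /* The expensive part runs without the lock held. */
    for ( i = 0; i < nr; i++ )
        if ( pg[i].count_info & SK_NEED_SCRUB )
            memset(pg[i].data, 0, SK_PAGE_SIZE);

    /* count_info may be updated elsewhere, so clear the bit under the lock. */
    sk_lock();
    for ( i = 0; i < nr; i++ )
        pg[i].count_info &= ~SK_NEED_SCRUB;
    sk_unlock();
}
```

The point of the split is that other allocations are only blocked for the
two short flag updates, not for the duration of the scrub.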

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agotools: change the type of '*nr' in 'libxl_psr_cat_get_info'
Yi Sun [Mon, 4 Sep 2017 11:01:44 +0000 (19:01 +0800)]
tools: change the type of '*nr' in 'libxl_psr_cat_get_info'

For historical reasons, the type of parameter '*nr' in 'libxl_psr_cat_get_info'
is 'int'. This is not right; it should be 'unsigned int'. This patch fixes
this and makes the related changes.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools: use '__i386__' and '__x86_64__' to replace PSR macros
Yi Sun [Mon, 4 Sep 2017 11:01:43 +0000 (19:01 +0800)]
tools: use '__i386__' and '__x86_64__' to replace PSR macros

The libxl interfaces and related functions do not need to be guarded by
'LIBXL_HAVE_PSR_CMT' and 'LIBXL_HAVE_PSR_CAT'. So replace them with the common
x86 macros. Furthermore, only compile 'xl_psr.c' on x86.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Suggested-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86: introduce and use setup_force_cpu_cap()
Jan Beulich [Wed, 6 Sep 2017 10:32:00 +0000 (12:32 +0200)]
x86: introduce and use setup_force_cpu_cap()

For XEN_SMEP and XEN_SMAP to not be cleared while bringing up APs we'd
need to clone the respective hack used for CPUID_FAULTING. Introduce an
inverse of setup_clear_cpu_cap() instead, but let clearing of features
overrule forced setting of them.

XEN_SMAP being wrong post-boot is a problem specifically for live
patching, as a live patch may need alternative instruction patching
keyed off of that feature flag.
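
A minimal sketch of the "clearing overrules forcing" rule, assuming a single
64-bit word of feature flags. The sketch_* functions and the two bitmaps are
hypothetical and only model the precedence described above; they are not the
actual setup_force_cpu_cap() implementation.

```c
#include <stdint.h>

static uint64_t cleared_caps; /* features that have been cleared */
static uint64_t forced_caps;  /* features that have been forced on */

/* Clearing always wins: it also drops any earlier forced setting. */
static void sketch_clear_cpu_cap(unsigned int cap)
{
    cleared_caps |= UINT64_C(1) << cap;
    forced_caps &= ~(UINT64_C(1) << cap);
}

/* Forcing is ignored for a feature that has already been cleared. */
static void sketch_force_cpu_cap(unsigned int cap)
{
    if ( cleared_caps & (UINT64_C(1) << cap) )
        return;
    forced_caps |= UINT64_C(1) << cap;
}

/* Applied to each CPU's detected feature word, e.g. when bringing up an AP. */
static uint64_t sketch_apply_caps(uint64_t detected)
{
    return (detected | forced_caps) & ~cleared_caps;
}
```

With this shape, a synthetic feature that was forced on the boot CPU stays
set on every later CPU unless something has explicitly cleared it.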

Reported-by: Sarah Newman <security@prgmr.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/traps: Fix show_page_walk() to avoid printing trailing whitespace
Andrew Cooper [Tue, 5 Sep 2017 16:54:45 +0000 (17:54 +0100)]
x86/traps: Fix show_page_walk() to avoid printing trailing whitespace

This moves the L2 line to be consistent with the L3 line.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen: Drop asmlinkage everywhere
Andrew Cooper [Fri, 1 Sep 2017 17:05:21 +0000 (17:05 +0000)]
xen: Drop asmlinkage everywhere

asmlinkage is defined as nothing on all architectures, and not used
consistently anywhere, even in common code.  Remove it all.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agolibxc/bitops: correct comment for bitmap_size
Olaf Hering [Tue, 5 Sep 2017 09:03:38 +0000 (11:03 +0200)]
libxc/bitops: correct comment for bitmap_size

The returned value now represents units of bytes instead of longs.

Fixes commit 11d0044a16 ("tools/libxc: Modify bitmap operations to
take void pointers").
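
For illustration, a byte-granular size helper could look like the sketch
below. The name and signature are made up for this example and are not copied
from libxc.

```c
#include <stddef.h>

/* Bytes (not longs) needed to hold nr_bits bits, rounded up. */
static inline size_t sketch_bitmap_size(size_t nr_bits)
{
    return (nr_bits + 7) / 8;
}
```

A caller would pass the result straight to an allocator, e.g.
malloc(sketch_bitmap_size(nr_bits)), without multiplying by sizeof(long).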

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agocommon/vm_event: Initialize vm_event lists on domain creation
Alexandru Isaila [Wed, 30 Aug 2017 09:04:00 +0000 (12:04 +0300)]
common/vm_event: Initialize vm_event lists on domain creation

The patch splits the vm_event into three structures: vm_event_share,
vm_event_paging, vm_event_monitor. The allocation for the
structure is moved to vm_event_enable so that it can be
allocated/initialized when needed and freed in vm_event_disable.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
7 years agox86emul: correct EVEX decoding
Jan Beulich [Tue, 5 Sep 2017 15:32:43 +0000 (17:32 +0200)]
x86emul: correct EVEX decoding

While these are latent issues only for now, correct them right away:
- unnamed (in the SDM) EVEX bits need to be set/clear respectively
- EVEX.V' (called RX in our code) needs to uniformly be 1 in non-64-bit
  modes,
- EVEX.R' (called R in our code) is uniformly being ignored in
  non-64-bit modes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86emul: correct VEX.L handling for VCVT{,T}S{S,D}2SI
Jan Beulich [Tue, 5 Sep 2017 15:32:05 +0000 (17:32 +0200)]
x86emul: correct VEX.L handling for VCVT{,T}S{S,D}2SI

Recent changes to the SDM (and XED) have made clear that older hardware
raising #UD when the bit is set was really an erratum. Generalize the
so far AMD-only override.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86emul: correct VEX.W handling for non-64-bit VPINSRD
Jan Beulich [Tue, 5 Sep 2017 15:31:01 +0000 (17:31 +0200)]
x86emul: correct VEX.W handling for non-64-bit VPINSRD

Going through the XED commits from the last couple of months made me
notice that VPINSRD, unlike VPEXTRD, does not clear VEX.W for
non-64-bit modes, leading to an insertion of stray 32-bits of zero in
case the original instruction had the bit set.

Also remove a pointless fall-through in VPEXTRW handling, bringing
things in line with VPINSRW.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/emul: Fix the handling of unimplemented Grp7 instructions
Andrew Cooper [Tue, 5 Sep 2017 08:40:58 +0000 (09:40 +0100)]
x86/emul: Fix the handling of unimplemented Grp7 instructions

Grp7 is abnormally complicated to decode, even by x86's standards, with
{s,l}msw being the problematic cases.

Previously, any value which fell through the first switch statement (looking
for instructions with entirely implicit operands) would be interpreted by the
second switch statement (handling instructions with memory operands).

Unimplemented instructions would then hit the #UD case for having a non-memory
operand, rather than taking the cannot_emulate path.

Consolidate the two switch statements into a single one, using ranges to cover
the instructions with memory operands.
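
The single-switch shape can be sketched as below. The macro and enum names
are hypothetical, only a few encodings are listed, and the `...` case ranges
are a GCC/Clang extension; this shows the idea and is not the emulator's
actual decode table.

```c
#include <stdint.h>

enum grp7_action { ACT_EMULATE_MEM, ACT_EMULATE_IMPLICIT, ACT_CANNOT_EMULATE };

/* All ModRM bytes with the given mod and reg fields (any r/m): a range of 8. */
#define GRP7_RANGE(mod, reg) \
    (((mod) << 6) | ((reg) << 3)) ... (((mod) << 6) | ((reg) << 3) | 7)
/* Memory-operand encodings of one reg value: mod = 0, 1 or 2. */
#define GRP7_MEM(reg) \
    GRP7_RANGE(0, reg): case GRP7_RANGE(1, reg): case GRP7_RANGE(2, reg)

static enum grp7_action sketch_grp7_decode(uint8_t modrm)
{
    switch ( modrm )
    {
    case 0xca: /* clac: entirely implicit operands */
    case 0xcb: /* stac */
        return ACT_EMULATE_IMPLICIT;

    case GRP7_MEM(2): /* lgdt: memory operand only */
    case GRP7_MEM(3): /* lidt */
        return ACT_EMULATE_MEM;

    default:
        /* Anything not listed is unimplemented, rather than #UD. */
        return ACT_CANNOT_EMULATE;
    }
}
```

Because every handled encoding is named in one place, a value that matches
nothing falls to the default case instead of being misread by a second switch.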

Reported-by: Petre Pircalabu <ppircalabu@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
7 years agox86/p2m-pt: pass level instead of page type to p2m_next_level()
Jan Beulich [Mon, 4 Sep 2017 14:32:14 +0000 (16:32 +0200)]
x86/p2m-pt: pass level instead of page type to p2m_next_level()

This in turn calls for p2m_alloc_ptp() also being passed the numeric
level.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agox86/p2m: make p2m_alloc_ptp() return an MFN
Jan Beulich [Mon, 4 Sep 2017 14:30:47 +0000 (16:30 +0200)]
x86/p2m: make p2m_alloc_ptp() return an MFN

None of the callers really needs the struct page_info pointer.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
7 years agox86/p2m-pt: simplify p2m_next_level()
Jan Beulich [Mon, 4 Sep 2017 14:25:59 +0000 (16:25 +0200)]
x86/p2m-pt: simplify p2m_next_level()

Calculate entry PFN and flags just once. Convert the two successive
main if()-s to an if/else-if chain. Restrict variable scope where
reasonable. Take the opportunity and also make the induction variable
unsigned.

This at once fixes excessive permissions granted in the 2M PTEs
resulting from splitting a 1G one - original permissions should be
inherited instead. This is not a security issue only because all of
this takes no effect anyway, as iommu_hap_pt_share is always false on
AMD systems for all supported branches.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agox86/mm: use put_page_type_preemptible in put_page_from_l{3,4}e
Wei Liu [Mon, 4 Sep 2017 11:42:06 +0000 (12:42 +0100)]
x86/mm: use put_page_type_preemptible in put_page_from_l{3,4}e

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/mm: Use static inlines for {,un}adjust_guest_l?e()
Andrew Cooper [Fri, 1 Sep 2017 10:29:56 +0000 (11:29 +0100)]
x86/mm: Use static inlines for {,un}adjust_guest_l?e()

There is no need for these to be macros, and the result is easier to read.

No functional change, but bloat-o-meter reports the following improvement:

  add/remove: 1/0 grow/shrink: 2/3 up/down: 235/-427 (-192)
  function                                     old     new   delta
  __get_page_type                             5231    5351    +120
  adjust_guest_l1e.isra                          -      96     +96
  free_page_type                              1540    1559     +19
  ptwr_emulated_update                        1008     957     -51
  create_grant_pv_mapping                     1342    1186    -156
  mod_l1_entry                                1892    1672    -220

adjust_guest_l1e(), now being a compiler-visible single unit, is chosen for
out-of-line'ing from its several callsites.  The other helpers remain inline.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agoMAINTAINERS: add arch specific public headers to arch file groups
Wei Liu [Mon, 4 Sep 2017 08:29:48 +0000 (09:29 +0100)]
MAINTAINERS: add arch specific public headers to arch file groups

I've recently got sufficiently annoyed by people not applying enough
common sense to get_maintainer.pl output, Cc-ing all REST maintainers
on ARM-only public interface changes.

Sort ARM's xen/ groups of path specifications at the same time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/mm: Use mfn_t for make_cr3()
Andrew Cooper [Wed, 30 Aug 2017 11:41:40 +0000 (12:41 +0100)]
x86/mm: Use mfn_t for make_cr3()

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agox86/public: Further corrections to vcpu context comments
Andrew Cooper [Fri, 1 Sep 2017 13:14:17 +0000 (14:14 +0100)]
x86/public: Further corrections to vcpu context comments

VCPUOP_initialise and DOMCTL_setvcpucontext are not symmetric.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agox86/mm: merge ptwr and mmio_ro page fault handlers
Wei Liu [Fri, 1 Sep 2017 14:35:39 +0000 (15:35 +0100)]
x86/mm: merge ptwr and mmio_ro page fault handlers

Provide a unified entry to avoid going through pte look-up, decode and
emulation cycle more than necessary. The path taken is determined by
the faulting address.

Note that the order of checks is changed in the new function, but the
order in which the checks are performed shouldn't matter.

The sole caller is changed to use the new function.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: don't wrap x86_emulate_ctxt in ptwr_emulate_ctxt
Wei Liu [Fri, 1 Sep 2017 14:35:38 +0000 (15:35 +0100)]
x86/mm: don't wrap x86_emulate_ctxt in ptwr_emulate_ctxt

Rewrite the code so that it has the same structure as
mmio_ro_emulate_ctxt. x86_emulate_ctxt now points to ptwr_emulate_ctxt
via its data pointer.

This patch will help unify mmio_ro and ptwr code paths later.
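
The data-pointer arrangement can be sketched like this. Both structures are
cut-down, hypothetical stand-ins with made-up fields; only the direction of
the link (generic context pointing at caller state) mirrors the description.

```c
#include <stddef.h>

/* Generic emulator context; 'data' carries caller-specific state. */
struct sketch_emulate_ctxt {
    unsigned long addr;
    void *data;
};

/* Caller-specific state, kept as a separate structure, not a wrapper. */
struct sketch_ptwr_ctxt {
    unsigned long cr2;
    unsigned long pte;
};

/* A callback gets only the generic context and recovers its own state. */
static unsigned long sketch_ptwr_get_cr2(const struct sketch_emulate_ctxt *ctxt)
{
    const struct sketch_ptwr_ctxt *ptwr = ctxt->data;

    return ptwr->cr2;
}
```

Two handlers that both hang their state off the same pointer can share one
entry path and pick their callbacks later.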

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agodomctl/x86: move vMSI related #define-s to public interface
Jan Beulich [Fri, 1 Sep 2017 16:24:10 +0000 (10:24 -0600)]
domctl/x86: move vMSI related #define-s to public interface

Xen and qemu having identical #define-s (with different names) is a
strong hint that these should be part of the public interface, at the
same time making obvious that any change to the values is an interface
modification (and hence needs suitable care).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxl/libacpi: extend lapic_id() to uint32_t
Chao Gao [Thu, 31 Aug 2017 05:01:49 +0000 (01:01 -0400)]
xl/libacpi: extend lapic_id() to uint32_t

This patch is to extend lapic_id() to support more vcpus.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agolibxc: increase maximum migration stream record length
Juergen Gross [Thu, 10 Aug 2017 11:24:28 +0000 (13:24 +0200)]
libxc: increase maximum migration stream record length

Today the maximum record length in a migration stream is 8MB. This
limits the size of a PV domain to a little bit less than 1TB in the
migration case, as the P2M frame list will exceed 8MB in this case.

Raising the record size limit by a factor of 16 allows for domain
sizes of nearly 16TB to be migrated. This ought to be enough.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
7 years agolibxl, xl: change p9 to p9s
Wei Liu [Tue, 29 Aug 2017 11:19:01 +0000 (12:19 +0100)]
libxl, xl: change p9 to p9s

To match our naming convention. Since we released p9 one release ago,
we need to define a macro inside libxl.h to indicate the change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86: mark the entire directmap NX
Jan Beulich [Fri, 1 Sep 2017 09:07:31 +0000 (11:07 +0200)]
x86: mark the entire directmap NX

There's no reason for the first MB to be excluded here. Enforce the
restriction right in the top level page table entries.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/pvh: remove stale PVHv1 comment from public headers
Roger Pau Monné [Fri, 1 Sep 2017 09:06:44 +0000 (11:06 +0200)]
x86/pvh: remove stale PVHv1 comment from public headers

From the vcpu_guest_context structure. PVHv2 uses it in the same exact
way as HVM guests, and from the hypervisor point of view PVHv2 is not
even a different guest type, so only mention HVM in the public
headers.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agomm: don't request scrubbing until dom0 is running
Boris Ostrovsky [Fri, 1 Sep 2017 09:06:21 +0000 (11:06 +0200)]
mm: don't request scrubbing until dom0 is running

There is no need to scrub pages freed during dom0 construction since
once dom0 is ready the heap will be scrubbed by scrub_heap_pages() anyway,
setting scrub_debug at the end.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agomm: don't poison a page if scrub_debug is off
Boris Ostrovsky [Fri, 1 Sep 2017 09:06:03 +0000 (11:06 +0200)]
mm: don't poison a page if scrub_debug is off

If scrub_debug is off we don't check pages in check_one_page().
Thus there is no reason to ever poison them.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agomm: change boot_scrub_done definition
Boris Ostrovsky [Fri, 1 Sep 2017 09:05:45 +0000 (11:05 +0200)]
mm: change boot_scrub_done definition

Rename it to the more appropriate scrub_debug and define as a macro
for !CONFIG_SCRUB_DEBUG. This will allow us to get rid of some
ifdefs (here and in the subsequent patch).
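
The idiom can be sketched as follows. sketch_should_poison() is a
hypothetical caller added for illustration, and CONFIG_SCRUB_DEBUG is assumed
to come from the build configuration.

```c
#include <stdbool.h>

#ifdef CONFIG_SCRUB_DEBUG
static bool scrub_debug;        /* runtime switch */
#else
#define scrub_debug false       /* compile-time constant: checks fold away */
#endif

/* Callers need no #ifdef: the test is dead code when the macro is false. */
static bool sketch_should_poison(void)
{
    return scrub_debug;
}
```

In the !CONFIG_SCRUB_DEBUG case the compiler sees a constant false, so code
guarded by the check can be dropped without any preprocessor conditionals at
the call sites.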

Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agomm: initialize lowmem virq when boot-time scrubbing is disabled
Boris Ostrovsky [Fri, 1 Sep 2017 09:04:47 +0000 (11:04 +0200)]
mm: initialize lowmem virq when boot-time scrubbing is disabled

scrub_heap_pages() returns early if boot-time scrubbing is
disabled, neglecting to initialize lowmem VIRQ.

Because setup_low_mem_virq() doesn't logically belong in
scrub_heap_pages() we put them both into the newly added
heap_init_late().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agohvmloader, libxl: use the correct ACPI settings depending on device model
Igor Druzhinin [Fri, 1 Sep 2017 09:03:20 +0000 (11:03 +0200)]
hvmloader, libxl: use the correct ACPI settings depending on device model

We need to choose ACPI tables properly depending on the device
model version we are running. Previously, this decision was
made by BIOS type specific code in hvmloader, e.g. always load
QEMU traditional specific tables if it's ROMBIOS and always
load QEMU Xen specific tables if it's SeaBIOS.

This change preserves this behavior (for compatibility) but adds
an additional way (xenstore key) to specify the correct
device model if we happen to run a non-default one. The toolstack
part makes use of it.

The enforcement of BIOS type depending on QEMU version will
be lifted later when the rest of ROMBIOS compatibility fixes
are in place.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoVT-d: use correct BDF for VF to search VT-d unit
Chao Gao [Fri, 1 Sep 2017 09:02:23 +0000 (11:02 +0200)]
VT-d: use correct BDF for VF to search VT-d unit

When SR-IOV is enabled, 'Virtual Functions' of a 'Physical Function'
are under the scope of the same VT-d unit as the 'Physical Function'.
A 'Physical Function' can be a 'Traditional Function' or an ARI
'Extended Function'. And furthermore, 'Extended Functions' on an
endpoint are under the scope of the same VT-d unit as the 'Traditional
Functions' on the endpoint. To search VT-d unit for a VF, if its PF
isn't an extended function, the BDF of PF should be used. Otherwise
the BDF of a traditional function in the same device with the PF
should be used.

Current code uses PCI_SLOT() to recognize an ARI 'Extended Function'.
But it is conceptually wrong without checking whether the PF is an
extended function, and would lead to VFs of an RC integrated PF being
matched to the wrong VT-d unit.

This patch overrides VF 'is_extfn' field and uses this field to
indicate whether the PF of this VF is an extended function. The field
helps to use correct BDF to search VT-d unit.
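
The selection rule can be sketched as below. The structure is a simplified,
hypothetical view of a PCI device, and using devfn 0 as the "traditional
function" of the PF's device is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a PCI device for this sketch. */
struct sketch_pdev {
    uint8_t bus, devfn;        /* the device's own address */
    bool is_virtfn;            /* device is an SR-IOV VF */
    bool is_extfn;             /* for a VF: its PF is an ARI extended function */
    uint8_t pf_bus, pf_devfn;  /* address of the owning PF (VFs only) */
};

/* Pick the bus/devfn used to look up the VT-d unit. */
static void sketch_lookup_bdf(const struct sketch_pdev *pdev,
                              uint8_t *bus, uint8_t *devfn)
{
    if ( pdev->is_virtfn )
    {
        *bus = pdev->pf_bus;
        /* Extended-function PF: fall back to a traditional function. */
        *devfn = pdev->is_extfn ? 0 : pdev->pf_devfn;
    }
    else if ( pdev->is_extfn )
    {
        *bus = pdev->bus;
        *devfn = 0;
    }
    else
    {
        *bus = pdev->bus;
        *devfn = pdev->devfn;
    }
}
```

The VF's own devfn is never used for the lookup; only the PF's address, or a
traditional function standing in for it, decides which unit is searched.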

Reported-by: Crawford, Eric R <Eric.R.Crawford@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Crawford, Eric R <Eric.R.Crawford@intel.com>
7 years agox86: remove redundant checks in sysctl.c
Yi Sun [Thu, 31 Aug 2017 08:07:26 +0000 (16:07 +0800)]
x86: remove redundant checks in sysctl.c

In sysctl.c, the return value of 'psr_get_info' has been checked immediately.
So, it is redundant to check the return value again when copying the field to
the guest.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agox86/pv: drop gate_op prefix in emul-gate-op.c
Wei Liu [Thu, 31 Aug 2017 11:42:52 +0000 (12:42 +0100)]
x86/pv: drop gate_op prefix in emul-gate-op.c

There is only one function gate_op_read that needs to be modified.
Rename it to read_mem.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/pv: drop priv_op prefix in emul-priv-op.c
Wei Liu [Thu, 31 Aug 2017 11:36:06 +0000 (12:36 +0100)]
x86/pv: drop priv_op prefix in emul-priv-op.c

Drop the prefix because they live in their own file now.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoRevert "xen: in do_softirq() sample smp_processor_id() only once."
Wei Liu [Thu, 31 Aug 2017 15:28:49 +0000 (16:28 +0100)]
Revert "xen: in do_softirq() sample smp_processor_id() only once."

This reverts commit 57450cfe48b56db90166c52d45a411a9279a12e1.

This breaks arm tests.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen-access: Correct default value of write-to-CR4 switch
Sergej Proskurin [Wed, 30 Aug 2017 11:19:14 +0000 (13:19 +0200)]
xen-access: Correct default value of write-to-CR4 switch

The current implementation configures the test environment to always
trap on writes to the CR4 control register, even on ARM. This leads to
issues as calling xc_monitor_write_ctrlreg on ARM with VM_EVENT_X86_CR4
will always fail.

Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>