xenbits.xensource.com Git - xen.git/log
Jan Beulich [Fri, 8 Sep 2017 14:24:57 +0000 (16:24 +0200)]
hvmloader: clone REP INSW test from REP INSB one

This also covers an individual string insn access crossing a page
boundary.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 8 Sep 2017 14:24:41 +0000 (16:24 +0200)]
hvmloader: dynamically determine scratch memory range for tests

This re-enables tests on configurations where commit 0d6968635c
("hvmloader: avoid tests when they would clobber used memory") forced
them to be skipped.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 8 Sep 2017 14:23:46 +0000 (16:23 +0200)]
x86/HVM: correct repeat count update in linear->phys translation

For the insn emulator's fallback logic in REP INS/OUTS handling
to work correctly, *reps must not be set to zero when returning
X86EMUL_UNHANDLEABLE.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Paul Durrant <paul.durrant@citrix.com>
Wei Liu [Fri, 8 Sep 2017 13:44:33 +0000 (14:44 +0100)]
monitor: switch to plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Wei Liu [Mon, 4 Sep 2017 13:44:47 +0000 (14:44 +0100)]
tools: eliminate LIBXL_BLKTAP2

Use CONFIG_BLKTAP2 directly. There is no reason why one would want to
set LIBXL_BLKTAP2 separately as things stand.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Mon, 4 Sep 2017 13:44:46 +0000 (14:44 +0100)]
tools: disable blktap2 by default

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Mon, 4 Sep 2017 13:44:45 +0000 (14:44 +0100)]
build: run autogen.sh on Stretch

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Fri, 8 Sep 2017 10:34:22 +0000 (11:34 +0100)]
MAINTAINERS: orphan blktap2

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Wed, 6 Sep 2017 13:34:04 +0000 (14:34 +0100)]
x86/page: Implement {get,set}_pte_flags() as static inlines

This resolves 11 Coverity issues along the lines of the following:

1600        for ( i = 0; i < NR_RESERVED_GDT_PAGES; i++ )

    CID: Operands don't affect result
    (CONSTANT_EXPRESSION_RESULT)result_independent_of_operands: ((33U /* 1U |
    0x20U */) | (({...}) ? 8388608U /* 1U << 23 */ : 0) | 0x40U | 2U) & 4095
    is always 0x63 regardless of the values of its operands. This occurs as
    the bitwise second operand of "|".

1601            l1e_write(pl1e + FIRST_RESERVED_GDT_PAGE + i,
1602                      l1e_from_pfn(mfn + i, __PAGE_HYPERVISOR_RW));

This is presumably because once preprocessed, the association of joint logic
inside {get,set}_pte_flags() is lost.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
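
The macro-to-inline conversion described above can be sketched as follows.
This is a minimal illustration only: the demo_* names and the flag layout
(flag bits 0-11 in PTE bits 0-11, flag bits 12-23 in PTE bits 52-63) are
assumptions made for the example, not a copy of Xen's definitions.

```c
#include <stdint.h>

typedef uint64_t intpte_t;

/*
 * Hypothetical layout: the low 12 flag bits map straight onto the low 12
 * PTE bits, the next 12 flag bits are stored in PTE bits 52-63.
 */
static inline unsigned int demo_get_pte_flags(intpte_t x)
{
    /* Fold the high PTE bits down to flag bits 12-23, keep the low 12. */
    return ((unsigned int)(x >> 40) & 0xfff000u) | ((unsigned int)x & 0xfffu);
}

static inline intpte_t demo_put_pte_flags(unsigned int flags)
{
    /* Inverse of the above: spread flag bits 12-23 up to PTE bits 52-63. */
    return ((intpte_t)(flags & 0xfff000u) << 40) | (flags & 0xfffu);
}
```

As a function, the mask-and-shift logic stays one unit with typed parameters,
so a static analyser sees a call rather than a pile of constant
sub-expressions at every use site.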
Ian Jackson [Mon, 4 Sep 2017 16:46:16 +0000 (17:46 +0100)]
DEPS handling: Remove absolute paths from references to cwd

In some directories we use gcc on source files elsewhere, to generate
a .o here in the current directory.  Eg in tools/libxl/,
   gcc -I. -o build.o /path/to/libacpi/build.c
We pass -MMD and -MF options to generate a .d file right here.

In the general case this .c file might need to include things from the
directory here, eg libacpi/build.c eventually #includes various
*libxl*.h.  We pass gcc -I. for this, which means things from the cwd
where we invoked gcc, not the directory of the #including file.

When we do this, gcc's -MMD output mentions /path/to/libxl/*libxl*.h,
even though it could refer to simply *libxl*.h.  This is presumably
because gcc has noticed that `.' in this context must mean relative to
the invocation cwd, not relative to build.c, and gcc doesn't realise
that references in the .d file are also wrt the invocation cwd.

make distinguishes targets purely textually.  It will canonicalise a
target name by removing ./ before comparison (so _libxl_types.h and
./_libxl_types.h are considered the same target) but it won't examine
the filesystem.  So _libxl_types.h and
/path/to/tools/libxl/_libxl_types.h are different targets.

And, _libxl_types.h is generated from a pattern rule.  This pattern
rule is therefore instantiated twice, and the two instances may be run
concurrently - but use the same tempfiles and can therefore fail.

The thing that is wrong here is gcc's choice to output an absolute
path.

We could work around it by adding a rule to teach make about a
relationship between these `two different files'.  But this has to be
done for every autogenerated file and is therefore fragile (leaving a
race bug when we get it wrong).

Ideally we would fix the problem by fixing the .d file as it is
generated.  But the .d files are generated by many many rules
mentioning $(CC) and $(CFLAGS).  (We might in theory pass a bash
process substitution to -MF, but 1. that's not portable to people who
don't have bash and 2. it hangs, anyway.)

So instead we do this conversion at include time.  That is, we tell
make to include not the raw .d files, but the sedded ones.

The sedding removes occurrences of ` $PWD/'.  We use the shell
variable PWD because the make variable sometimes refers to the xen
toplevel.  If gcc's output format should change, then this sed rune
may not work any more, but that doesn't seem very likely.

The rune is only effective for dependencies on files which are exactly
in the current directory, or a subdirectory of it named simply by its
subdirectory name.  If there are autogenerated include files which
exist in a sibling (or worse, somewhere else entirely), this
approach will not work, because we'd have to figure out what name this
Makefile usually uses to refer to them.  Hopefully such things don't
exist.

The indirect variables DEPS_RM and DEPS_INCLUDE are necessary to
preserve the assumptions made in the various Makefiles.  Specifically,
xen/ Makefiles assume that it is ok to say DEPS+=something (where
something is in a subdirectory); tools/ Makefiles all used to include
DEPS themselves (but now they include DEPS_INCLUDE); and many
Makefiles tended to explicitly rm DEPS (but now rm DEPS_RM).

In the new scheme of things: DEPS is the files that come out of gcc
(or perhaps an assembler or something) and may be assigned to by
Makefiles.  DEPS_INCLUDE is the processed form.  And DEPS_RM is both
combined, so that they both get cleaned.

We need to explicitly use $(wildcard ) to do the wildcard expansion on
DEPS a bit earlier.  If we didn't, then DEPS_INCLUDE would contain
`.*.d2' which would not exist.

Evaluation order: DEPS_RM and DEPS_INCLUDE are recursively expanded
variables, so that although they are defined early (in Config.mk),
their actual values are computed at the time of use, using the value
of DEPS that is prevailing at that time.

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
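
The effect of the sed rune on one dependency line can be modelled in C. This
is only a sketch of the text transformation: strip_cwd_prefix() is a
hypothetical helper, and it assumes that each occurrence of ` $PWD/' is
replaced by a single space so that adjacent file names stay separated.

```c
#include <stdio.h>
#include <string.h>

/*
 * Rewrite a dependency line in place, turning every " <cwd>/" into " ".
 * Only paths directly below cwd are affected; anything else is left alone.
 */
static void strip_cwd_prefix(char *line, const char *cwd)
{
    char pat[4096];
    size_t plen;
    char *p = line;
    int n = snprintf(pat, sizeof(pat), " %s/", cwd);

    if (n < 0 || (size_t)n >= sizeof(pat))
        return;                 /* cwd too long for this sketch */
    plen = (size_t)n;

    while ((p = strstr(p, pat)) != NULL) {
        /* Keep the leading space, drop "<cwd>/" and shift the tail down. */
        memmove(p + 1, p + plen, strlen(p + plen) + 1);
        p++;                    /* continue after the space we kept */
    }
}
```

After the rewrite, make sees _libxl_types.h under a single textual name, so
the pattern rule generating it is instantiated only once.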
Ian Jackson [Mon, 4 Sep 2017 16:46:15 +0000 (17:46 +0100)]
DEPS handling: Use DEPS_INCLUDE everywhere

DEPS_INCLUDE is currently the same as DEPS, so no functional change.

This patch is the result of this perl rune:

  git-grep -l 'include.*DEPS' | xargs perl -i -pe 'next unless m/^-?include/; s/\bDEPS\b/DEPS_INCLUDE/'

I have verified that I haven't missed anything, with this rune:

  git-grep '\bDEPS\b'

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Ian Jackson [Mon, 4 Sep 2017 16:46:14 +0000 (17:46 +0100)]
DEPS handling: Use DEPS_RM everywhere

DEPS_RM is currently the same as DEPS, so no functional change.

This patch is the result of two perl runes:

  git-grep -l 'rm.*DEPS' | xargs perl -i~ -pe 'next unless m/^\t+rm\b/; s/\bDEPS\b/DEPS_RM/;'

  git-grep -l 'RM.*DEPS' | xargs perl -i~ -pe 'next unless m/^\t+\$\(RM\)/; s/\bDEPS\b/DEPS_RM/;'

And editing  tools/xenstat/libxenstat/Makefile  by hand.

I verified that I didn't miss anything with this rune:

  git-grep '\bDEPS\b' | grep -v include |less

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Ian Jackson [Mon, 4 Sep 2017 16:46:13 +0000 (17:46 +0100)]
DEPS handling: Provide DEPS_RM and DEPS_INCLUDE

These are not used anywhere yet, so no functional change.

Reported-by: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Boris Ostrovsky [Wed, 6 Sep 2017 15:33:52 +0000 (11:33 -0400)]
mm: Don't scrub pages while holding heap lock in alloc_heap_pages()

Instead, preserve PGC_need_scrub bit when setting PGC_state_inuse
state while still under the lock and clear those pages later.

Note that we still need to grab the lock when clearing PGC_need_scrub
bit since count_info might be updated during MCE handling in
mark_page_offline().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
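
The locking pattern can be sketched with a toy model. Everything below is
hypothetical (struct demo_page, a counter standing in for the heap spinlock);
it only shows the ordering: change state under the lock, scrub with the lock
dropped, then retake the lock briefly to clear the flag.

```c
#include <stdbool.h>
#include <string.h>

#define DEMO_PAGE_SIZE 64

struct demo_page {
    bool inuse;
    bool need_scrub;
    unsigned char data[DEMO_PAGE_SIZE];
};

static int demo_lock_depth;         /* stand-in for the heap lock */
static int demo_scrubs_under_lock;  /* scrubs performed with the lock held */

static void demo_lock(void)   { demo_lock_depth++; }
static void demo_unlock(void) { demo_lock_depth--; }

static void demo_scrub(struct demo_page *pg)
{
    if (demo_lock_depth)
        demo_scrubs_under_lock++;
    memset(pg->data, 0, sizeof(pg->data));
}

static void demo_alloc(struct demo_page *pg)
{
    bool scrub;

    demo_lock();
    pg->inuse = true;           /* need_scrub is preserved here */
    scrub = pg->need_scrub;
    demo_unlock();

    if (scrub) {
        demo_scrub(pg);         /* the expensive part, lock not held */
        demo_lock();            /* the flag update still needs the lock */
        pg->need_scrub = false;
        demo_unlock();
    }
}
```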
Yi Sun [Mon, 4 Sep 2017 11:01:44 +0000 (19:01 +0800)]
tools: change the type of '*nr' in 'libxl_psr_cat_get_info'

For historical reasons, the type of parameter '*nr' in 'libxl_psr_cat_get_info'
is 'int', which is not right: it should be 'unsigned int'. This patch fixes
that and makes the related changes.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Yi Sun [Mon, 4 Sep 2017 11:01:43 +0000 (19:01 +0800)]
tools: use '__i386__' and '__x86_64__' to replace PSR macros

The libxl interfaces and related functions need not be guarded by
'LIBXL_HAVE_PSR_CMT' and 'LIBXL_HAVE_PSR_CAT', so guard them with the common x86
macros instead. Furthermore, only compile 'xl_psr.c' on x86.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Suggested-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Wed, 6 Sep 2017 10:32:00 +0000 (12:32 +0200)]
x86: introduce and use setup_force_cpu_cap()

For XEN_SMEP and XEN_SMAP to not be cleared while bringing up APs we'd
need to clone the respective hack used for CPUID_FAULTING. Introduce an
inverse of setup_clear_cpu_cap() instead, but let clearing of features
overrule forced setting of them.

XEN_SMAP being wrong post-boot is a problem specifically for live
patching, as a live patch may need alternative instruction patching
keyed off of that feature flag.

Reported-by: Sarah Newman <security@prgmr.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
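
The intended precedence (clearing overrules forcing) can be sketched with two
bitmasks. The demo_* helpers are hypothetical and limited to a single 32-bit
feature word with cap < 32; the real code operates on Xen's full capability
arrays.

```c
#include <stdint.h>

static uint32_t demo_cleared_caps;  /* features to hide on every CPU */
static uint32_t demo_forced_caps;   /* features to report on every CPU */

static void demo_setup_clear_cpu_cap(unsigned int cap)
{
    demo_cleared_caps |= 1u << cap;
    demo_forced_caps &= ~(1u << cap);   /* clearing overrules forcing */
}

static void demo_setup_force_cpu_cap(unsigned int cap)
{
    if (demo_cleared_caps & (1u << cap))
        return;                         /* already cleared: ignore request */
    demo_forced_caps |= 1u << cap;
}

/* Applied to each CPU's detected feature word as that CPU comes up. */
static uint32_t demo_apply_caps(uint32_t detected)
{
    return (detected | demo_forced_caps) & ~demo_cleared_caps;
}
```

Because the forced mask is applied on every CPU bring-up, a synthetic feature
set on the boot CPU is not lost when an AP re-evaluates its capabilities.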
Andrew Cooper [Tue, 5 Sep 2017 16:54:45 +0000 (17:54 +0100)]
x86/traps: Fix show_page_walk() to avoid printing trailing whitespace

This moves the L2 line to be consistent with the L3 line.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 1 Sep 2017 17:05:21 +0000 (17:05 +0000)]
xen: Drop asmlinkage everywhere

asmlinkage is defined as nothing on all architectures, and not used
consistently anywhere, even in common code.  Remove it all.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Olaf Hering [Tue, 5 Sep 2017 09:03:38 +0000 (11:03 +0200)]
libxc/bitops: correct comment for bitmap_size

The returned value is now in units of bytes instead of longs.

Fixes commit 11d0044a16 ("tools/libxc: Modify bitmap operations to
take void pointers").

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
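
For illustration, a byte-granular size helper looks like the sketch below.
demo_bitmap_size() is a hypothetical stand-in, assumed here to round up to
whole bytes; it is not quoted from the libxc source.

```c
#include <stddef.h>

/* Number of bytes needed to hold nr_bits bits, rounded up to whole bytes. */
static inline size_t demo_bitmap_size(size_t nr_bits)
{
    return (nr_bits + 7) / 8;
}
```

A caller allocating the bitmap passes this value straight to the allocator,
which is why the comment has to name the unit correctly.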
Alexandru Isaila [Wed, 30 Aug 2017 09:04:00 +0000 (12:04 +0300)]
common/vm_event: Initialize vm_event lists on domain creation

The patch splits the vm_event into three structures: vm_event_share,
vm_event_paging, vm_event_monitor. The allocation for the
structure is moved to vm_event_enable so that it can be
allocated and initialized when needed and freed in vm_event_disable.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Jan Beulich [Tue, 5 Sep 2017 15:32:43 +0000 (17:32 +0200)]
x86emul: correct EVEX decoding

While these are latent issues only for now, correct them right away:
- unnamed (in the SDM) EVEX bits need to be set/clear respectively
- EVEX.V' (called RX in our code) needs to uniformly be 1 in non-64-bit
  modes,
- EVEX.R' (called R in our code) is uniformly being ignored in
  non-64-bit modes.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 5 Sep 2017 15:32:05 +0000 (17:32 +0200)]
x86emul: correct VEX.L handling for VCVT{,T}S{S,D}2SI

Recent changes to the SDM (and XED) have made clear that older hardware
raising #UD when the bit is set was really an erratum. Generalize the
so far AMD-only override.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 5 Sep 2017 15:31:01 +0000 (17:31 +0200)]
x86emul: correct VEX.W handling for non-64-bit VPINSRD

Going through the XED commits from the last couple of months made me
notice that VPINSRD, unlike VPEXTRD, does not clear VEX.W for
non-64-bit modes, leading to the insertion of a stray 32 bits of zero in
case the original instruction had the bit set.

Also remove a pointless fall-through in VPEXTRW handling, bringing
things in line with VPINSRW.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Tue, 5 Sep 2017 08:40:58 +0000 (09:40 +0100)]
x86/emul: Fix the handling of unimplemented Grp7 instructions

Grp7 is abnormally complicated to decode, even by x86's standards, with
{s,l}msw being the problematic cases.

Previously, any value which fell through the first switch statement (looking
for instructions with entirely implicit operands) would be interpreted by the
second switch statement (handling instructions with memory operands).

Unimplemented instructions would then hit the #UD case for having a non-memory
operand, rather than taking the cannot_emulate path.

Consolidate the two switch statements into a single one, using ranges to cover
the instructions with memory operands.

Reported-by: Petre Pircalabu <ppircalabu@bitdefender.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
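
The shape of a single consolidated switch can be sketched as below. This is
an illustration only: the DEMO_* macros, the result enum and the subset of
handled encodings are assumptions, and the `...' case ranges are a GNU C
extension (accepted by gcc and clang), not standard C.

```c
enum demo_result { DEMO_OKAY, DEMO_CANNOT_EMULATE };

/* All memory-operand ModRM bytes (mod != 3) for a given /reg value. */
#define DEMO_GRP7_MEM(reg)                                  \
    (0x00 | ((reg) << 3)) ... (0x07 | ((reg) << 3)):        \
    case (0x40 | ((reg) << 3)) ... (0x47 | ((reg) << 3)):   \
    case (0x80 | ((reg) << 3)) ... (0x87 | ((reg) << 3))

/* Memory and register ModRM bytes for a given /reg value. */
#define DEMO_GRP7_ALL(reg)                                  \
    DEMO_GRP7_MEM(reg):                                     \
    case (0xc0 | ((reg) << 3)) ... (0xc7 | ((reg) << 3))

static enum demo_result demo_grp7(unsigned char modrm)
{
    switch (modrm) {
    case 0xd0:                  /* xgetbv: entirely implicit operands */
    case 0xf9:                  /* rdtscp: entirely implicit operands */
        return DEMO_OKAY;

    case DEMO_GRP7_MEM(0):      /* sgdt: memory operand only */
    case DEMO_GRP7_MEM(7):      /* invlpg: memory operand only */
        return DEMO_OKAY;

    case DEMO_GRP7_ALL(4):      /* smsw: memory or register operand */
    case DEMO_GRP7_ALL(6):      /* lmsw: memory or register operand */
        return DEMO_OKAY;

    default:                    /* anything not implemented in this sketch */
        return DEMO_CANNOT_EMULATE;
    }
}
```

With one switch, an encoding that matches no case reaches the default label
directly instead of being reinterpreted as a memory-operand instruction.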
Jan Beulich [Mon, 4 Sep 2017 14:32:14 +0000 (16:32 +0200)]
x86/p2m-pt: pass level instead of page type to p2m_next_level()

This in turn calls for p2m_alloc_ptp() also being passed the numeric
level.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Jan Beulich [Mon, 4 Sep 2017 14:30:47 +0000 (16:30 +0200)]
x86/p2m: make p2m_alloc_ptp() return an MFN

None of the callers really needs the struct page_info pointer.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jan Beulich [Mon, 4 Sep 2017 14:25:59 +0000 (16:25 +0200)]
x86/p2m-pt: simplify p2m_next_level()

Calculate entry PFN and flags just once. Convert the two successive
main if()-s to an if/else-if chain. Restrict variable scope where
reasonable. Take the opportunity and also make the induction variable
unsigned.

This at once fixes excessive permissions granted in the 2M PTEs
resulting from splitting a 1G one - original permissions should be
inherited instead. This is not a security issue only because all of
this takes no effect anyway, as iommu_hap_pt_share is always false on
AMD systems for all supported branches.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
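
The permission inheritance on a superpage split can be sketched as follows.
struct demo_pte and demo_split_1g() are hypothetical simplifications (a PFN
plus a flags word per entry); the point is only that the PFN and flags are
derived once from the 1G entry and the flags are copied, not replaced by a
more permissive default.

```c
#include <stdint.h>

#define DEMO_ENTRIES      512u  /* entries per page table */
#define DEMO_PFNS_PER_2M  512u  /* 4k frames covered by one 2M entry */

struct demo_pte {
    uint64_t pfn;
    unsigned int flags;
};

/* Fill a new L2 table so that it maps the same range as the 1G entry. */
static void demo_split_1g(const struct demo_pte *l3e,
                          struct demo_pte l2[DEMO_ENTRIES])
{
    uint64_t pfn = l3e->pfn;            /* calculated just once */
    unsigned int flags = l3e->flags;    /* inherited, not widened */
    unsigned int i;

    for (i = 0; i < DEMO_ENTRIES; i++) {
        l2[i].pfn = pfn + (uint64_t)i * DEMO_PFNS_PER_2M;
        l2[i].flags = flags;
    }
}
```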
Wei Liu [Mon, 4 Sep 2017 11:42:06 +0000 (12:42 +0100)]
x86/mm: use put_page_type_preemptible in put_page_from_l{3,4}e

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Fri, 1 Sep 2017 10:29:56 +0000 (11:29 +0100)]
x86/mm: Use static inlines for {,un}adjust_guest_l?e()

There is no need for these to be macros, and the result is easier to read.

No functional change, but bloat-o-meter reports the following improvement:

  add/remove: 1/0 grow/shrink: 2/3 up/down: 235/-427 (-192)
  function                                     old     new   delta
  __get_page_type                             5231    5351    +120
  adjust_guest_l1e.isra                          -      96     +96
  free_page_type                              1540    1559     +19
  ptwr_emulated_update                        1008     957     -51
  create_grant_pv_mapping                     1342    1186    -156
  mod_l1_entry                                1892    1672    -220

adjust_guest_l1e(), now being a compiler-visible single unit, is chosen for
out-of-line'ing from its several callsites.  The other helpers remain inline.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Mon, 4 Sep 2017 08:29:48 +0000 (09:29 +0100)]
MAINTAINERS: add arch specific public headers to arch file groups

I've recently got sufficiently annoyed by people not applying enough
common sense to get_maintainer.pl output, Cc-ing all REST maintainers
on ARM-only public interface changes.

Sort ARM's xen/ groups of path specifications at the same time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 30 Aug 2017 11:41:40 +0000 (12:41 +0100)]
x86/mm: Use mfn_t for make_cr3()

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Fri, 1 Sep 2017 13:14:17 +0000 (14:14 +0100)]
x86/public: Further corrections to vcpu context comments

VCPUOP_initialise and DOMCTL_setvcpucontext are not symmetric.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Wei Liu [Fri, 1 Sep 2017 14:35:39 +0000 (15:35 +0100)]
x86/mm: merge ptwr and mmio_ro page fault handlers

Provide a unified entry point to avoid going through the PTE look-up, decode
and emulation cycle more often than necessary. The path taken is determined by
the faulting address.

Note that the order of checks is changed in the new function, but the
order in which the checks are performed shouldn't matter.

The sole caller is changed to use the new function.

No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Fri, 1 Sep 2017 14:35:38 +0000 (15:35 +0100)]
x86/mm: don't wrap x86_emulate_ctxt in ptwr_emulate_ctxt

Rewrite the code so that it has the same structure as
mmio_ro_emulate_ctxt. x86_emulate_ctxt now points to ptwr_emulate_ctxt
via its data pointer.

This patch will help unify mmio_ro and ptwr code paths later.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 1 Sep 2017 16:24:10 +0000 (10:24 -0600)]
domctl/x86: move vMSI related #define-s to public interface

Xen and qemu having identical #define-s (with different names) is a
strong hint that these should be part of the public interface, at the
same time making obvious that any change to the values is an interface
modification (and hence needs suitable care).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Chao Gao [Thu, 31 Aug 2017 05:01:49 +0000 (01:01 -0400)]
xl/libacpi: extend lapic_id() to uint32_t

This patch is to extend lapic_id() to support more vcpus.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Juergen Gross [Thu, 10 Aug 2017 11:24:28 +0000 (13:24 +0200)]
libxc: increase maximum migration stream record length

Today the maximum record length in a migration stream is 8MB. This
limits the size of a PV domain to a little bit less than 1TB in the
migration case, as the P2M frame list will exceed 8MB in this case.

Raising the record size limit by a factor of 16 allows for domain
sizes of nearly 16TB to be migrated. This ought to be enough.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Tue, 29 Aug 2017 11:19:01 +0000 (12:19 +0100)]
libxl, xl: change p9 to p9s

To match our naming convention. Since we released p9 one release ago,
we need to define a macro inside libxl.h to indicate the change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Fri, 1 Sep 2017 09:07:31 +0000 (11:07 +0200)]
x86: mark the entire directmap NX

There's no reason for the first MB to be excluded here. Enforce the
restriction right in the top level page table entries.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Fri, 1 Sep 2017 09:06:44 +0000 (11:06 +0200)]
x86/pvh: remove stale PVHv1 comment from public headers

From the vcpu_guest_context structure. PVHv2 uses it in the same exact
way as HVM guests, and from the hypervisor point of view PVHv2 is not
even a different guest type, so only mention HVM in the public
headers.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Boris Ostrovsky [Fri, 1 Sep 2017 09:06:21 +0000 (11:06 +0200)]
mm: don't request scrubbing until dom0 is running

There is no need to scrub pages freed during dom0 construction since
once dom0 is ready the heap will be scrubbed by scrub_heap_pages() anyway,
setting scrub_debug at the end.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Boris Ostrovsky [Fri, 1 Sep 2017 09:06:03 +0000 (11:06 +0200)]
mm: don't poison a page if scrub_debug is off

If scrub_debug is off we don't check pages in check_one_page().
Thus there is no reason to ever poison them.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Boris Ostrovsky [Fri, 1 Sep 2017 09:05:45 +0000 (11:05 +0200)]
mm: change boot_scrub_done definition

Rename it to the more appropriate scrub_debug and define as a macro
for !CONFIG_SCRUB_DEBUG. This will allow us to get rid of some
ifdefs (here and in the subsequent patch).

Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
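
The ifdef-saving pattern can be sketched like this. CONFIG_SCRUB_DEBUG and
scrub_debug are the names used above; demo_poison_one_page() and the 0xc2
fill byte are hypothetical, chosen only to give the flag a caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#ifdef CONFIG_SCRUB_DEBUG
static bool scrub_debug;        /* runtime switch in debug builds */
#else
#define scrub_debug false       /* constant, so dependent code folds away */
#endif

/* The caller needs no #ifdef of its own. */
static void demo_poison_one_page(unsigned char *page, size_t len)
{
    if (!scrub_debug)
        return;
    memset(page, 0xc2, len);    /* arbitrary poison pattern for the sketch */
}
```

In a build without CONFIG_SCRUB_DEBUG the compiler sees `if (!false)' and can
drop the poisoning code entirely, while the source stays free of conditional
compilation at each use.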
Boris Ostrovsky [Fri, 1 Sep 2017 09:04:47 +0000 (11:04 +0200)]
mm: initialize lowmem virq when boot-time scrubbing is disabled

scrub_heap_pages() does early return if boot-time scrubbing is
disabled, neglecting to initialize lowmem VIRQ.

Because setup_low_mem_virq() doesn't logically belong in
scrub_heap_pages() we put them both into the newly added
heap_init_late().

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Igor Druzhinin [Fri, 1 Sep 2017 09:03:20 +0000 (11:03 +0200)]
hvmloader, libxl: use the correct ACPI settings depending on device model

We need to choose ACPI tables properly depending on the device
model version we are running. Previously, this decision was
made by BIOS type specific code in hvmloader, e.g. always load
QEMU traditional specific tables if it's ROMBIOS and always
load QEMU Xen specific tables if it's SeaBIOS.

This change preserves this behavior (for compatibility) but adds
an additional way (a xenstore key) to specify the correct
device model if we happen to run a non-default one. The toolstack
side makes use of it.

The enforcement of BIOS type depending on QEMU version will
be lifted later when the rest of ROMBIOS compatibility fixes
are in place.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Chao Gao [Fri, 1 Sep 2017 09:02:23 +0000 (11:02 +0200)]
VT-d: use correct BDF for VF to search VT-d unit

When SR-IOV is enabled, 'Virtual Functions' of a 'Physical Function'
are under the scope of the same VT-d unit as the 'Physical Function'.
A 'Physical Function' can be a 'Traditional Function' or an ARI
'Extended Function'. And furthermore, 'Extended Functions' on an
endpoint are under the scope of the same VT-d unit as the 'Traditional
Functions' on the endpoint. To find the VT-d unit for a VF, the BDF of
its PF should be used if the PF isn't an extended function. Otherwise
the BDF of a traditional function in the same device as the PF
should be used.

Current code uses PCI_SLOT() to recognize an ARI 'Extended Function'.
But this is conceptually wrong without checking whether the PF is an
extended function, and would lead to VFs of an RC-integrated PF being
matched to the wrong VT-d unit.

This patch overrides VF 'is_extfn' field and uses this field to
indicate whether the PF of this VF is an extended function. The field
helps to use correct BDF to search VT-d unit.

Reported-by: Crawford, Eric R <Eric.R.Crawford@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Crawford, Eric R <Eric.R.Crawford@intel.com>
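
The selection rule can be sketched as below. struct demo_pdev and
demo_vtd_lookup_bdf() are hypothetical, and using devfn 0 as "a traditional
function in the same device" is an assumption of this sketch rather than
something stated in the description above.

```c
#include <stdbool.h>
#include <stdint.h>

struct demo_pdev {
    uint8_t bus, devfn;         /* the function's own address */
    bool is_virtfn;             /* this is an SR-IOV VF */
    bool is_extfn;              /* for a VF: whether its PF is an extfn */
    uint8_t pf_bus, pf_devfn;   /* address of the PF (VFs only) */
};

/* Pick the bus/devfn used to search for the covering VT-d unit. */
static void demo_vtd_lookup_bdf(const struct demo_pdev *pdev,
                                uint8_t *bus, uint8_t *devfn)
{
    if (pdev->is_virtfn) {
        *bus = pdev->pf_bus;
        /* PF is traditional: use it. PF is extended: use function 0. */
        *devfn = pdev->is_extfn ? 0 : pdev->pf_devfn;
    } else if (pdev->is_extfn) {
        *bus = pdev->bus;
        *devfn = 0;
    } else {
        *bus = pdev->bus;
        *devfn = pdev->devfn;
    }
}
```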
Yi Sun [Thu, 31 Aug 2017 08:07:26 +0000 (16:07 +0800)]
x86: remove redundant checks in sysctl.c

In sysctl.c, the return value of 'psr_get_info' is already checked immediately
after the call. So it is redundant to check it again when copying the field to
the guest.

Suggested-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Yi Sun <yi.y.sun@linux.intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Wei Liu [Thu, 31 Aug 2017 11:42:52 +0000 (12:42 +0100)]
x86/pv: drop gate_op prefix in emul-gate-op.c

There is only one function gate_op_read that needs to be modified.
Rename it to read_mem.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 31 Aug 2017 11:36:06 +0000 (12:36 +0100)]
x86/pv: drop priv_op prefix in emul-priv-op.c

Drop the prefix because they live in their own file now.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 31 Aug 2017 15:28:49 +0000 (16:28 +0100)]
Revert "xen: in do_softirq() sample smp_processor_id() only once."

This reverts commit 57450cfe48b56db90166c52d45a411a9279a12e1.

This breaks arm tests.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Sergej Proskurin [Wed, 30 Aug 2017 11:19:14 +0000 (13:19 +0200)]
xen-access: Correct default value of write-to-CR4 switch

The current implementation configures the test environment to always
trap on writes to the CR4 control register, even on ARM. This leads to
issues as calling xc_monitor_write_ctrlreg on ARM with VM_EVENT_X86_CR4
will always fail.

Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Wed, 30 Aug 2017 17:11:10 +0000 (18:11 +0100)]
x86/mm: introduce trace point for mmio_ro emulation

Using the ptwr_emulation trace point is wrong.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Wed, 30 Aug 2017 13:18:01 +0000 (14:18 +0100)]
x86/mm: Rearrange guest_get_eff_{,kern_}l1e() to not be void

Coverity complains that gl1e.l1 may be used while uninitialised in
map_ldt_shadow_page().  This isn't actually accurate as guest_get_eff_l1e()
will always write to its parameter.

However, having a void function which returns a 64bit value via pointer is
rather silly.  Rearrange the functions to return l1_pgentry_t.

No functional change, but hopefully should help Coverity not to come to the
wrong conclusion.

Bloat-o-meter also reports a modest improvement:
  add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-71 (-71)
  function                                     old     new   delta
  guest_get_eff_l1e                             82      75      -7
  mmio_ro_do_page_fault                        530     514     -16
  map_ldt_shadow_page                          501     485     -16
  ptwr_do_page_fault                           615     583     -32
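
The rearrangement described above can be illustrated with a reduced sketch.
The type and the lookup helper below are stand-ins invented for this example,
not Xen's real definitions; only the change of function shape is the point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for Xen's l1_pgentry_t; the real type lives in Xen's headers. */
typedef struct { uint64_t l1; } l1_pgentry_t;

/* Hypothetical lookup, present only to make the sketch self-contained. */
static uint64_t fake_lookup(unsigned long linear)
{
    return (uint64_t)linear | 0x1u; /* pretend the entry is always present */
}

/* Old shape: the result is written through a pointer, which a static
 * analyser may fail to recognise as always initialised. */
static void get_eff_l1e_old(unsigned long linear, l1_pgentry_t *out)
{
    assert(out != NULL);
    out->l1 = fake_lookup(linear);
}

/* New shape: the value is returned directly. */
static l1_pgentry_t get_eff_l1e_new(unsigned long linear)
{
    l1_pgentry_t e = { .l1 = fake_lookup(linear) };

    return e;
}
```

Both shapes produce the same entry; the second makes it obvious to tools
(and readers) that the result is always initialised.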

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mm: Use mfn_t for new_guest_cr3()
Andrew Cooper [Wed, 30 Aug 2017 11:15:49 +0000 (12:15 +0100)]
x86/mm: Use mfn_t for new_guest_cr3()

No functional change (as confirmed by diffing the assembly).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen: RCU: avoid busy waiting until the end of grace period.
Dario Faggioli [Wed, 30 Aug 2017 11:06:22 +0000 (12:06 +0100)]
xen: RCU: avoid busy waiting until the end of grace period.

On the CPU where a callback is queued, cpu_is_haltable()
returns false (due to rcu_needs_cpu() being itself true).
That means the CPU would spin inside idle_loop(), continuously
calling do_softirq(), and, in there, continuously checking
rcu_pending(), in a tight loop.

Let's instead allow the CPU to really go idle, but make sure,
by arming a timer, that we periodically check whether the
grace period has come to an end. As the period of the
timer, we pick a value that makes things look like what
happens in Linux, with the periodic tick (as this code
comes from there).

Note that the timer will *only* be armed on CPUs that are
going idle while having queued RCU callbacks. On CPUs that
don't, there won't be any timer, and their sleep won't be
interrupted (and even for CPUs with callbacks, we only
expect a handful of wakeups at most, but that depends on
the system load, as much as on other things).
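
A minimal sketch of the arming rule, assuming a simplified per-CPU state.
The struct and function names are hypothetical and are not the ones used in
Xen's RCU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU state; Xen keeps this in its own RCU structures. */
struct sketch_rcu_cpu {
    bool callbacks_queued;  /* RCU callbacks waiting for a grace period */
    bool timer_armed;       /* periodic wakeup to re-check the grace period */
};

/* Called on the way into idle: arm the timer only if it is needed. */
static void sketch_rcu_idle_enter(struct sketch_rcu_cpu *c)
{
    assert(c != NULL);
    if (c->callbacks_queued)
        c->timer_armed = true;
}

/* Called when leaving idle: the timer is no longer needed. */
static void sketch_rcu_idle_exit(struct sketch_rcu_cpu *c)
{
    assert(c != NULL);
    c->timer_armed = false;
}
```

A CPU without queued callbacks never gets a timer, so its sleep is left
undisturbed.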

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: RCU: don't let a CPU with a callback go idle.
Dario Faggioli [Wed, 30 Aug 2017 11:06:21 +0000 (12:06 +0100)]
xen: RCU: don't let a CPU with a callback go idle.

If a CPU has a callback queued, it must be ready to invoke
it, as soon as all the other CPUs involved in the grace period
have gone through a quiescent state.

But if we let such CPU go idle, we can't really tell when (if!)
it will realize that it is actually time to invoke the callback.
To solve this problem, a CPU that has a callback queued (and has
already gone through a quiescent state itself) will stay online,
until the grace period ends, and the callback can be invoked.

This is similar to what Linux does, and is the second and last
step for fixing the overly long (or infinite!) grace periods.
The problem, though, is that, within Linux, we have the tick,
so, all that is necessary is to not stop the tick for the CPU
(even if it has gone idle). In Xen, there's no tick, so we must
prevent the CPU from going idle entirely, and let it spin on
rcu_pending(), consuming power and causing overhead.

In this commit, we implement the above, using rcu_needs_cpu(),
in a way similar to how it is used in Linux. This is correct,
useful and not wasteful for CPUs that participate in the grace
period, but do not have a callback queued. For the ones that
have callbacks, an optimization that avoids having to spin is
introduced in a subsequent change.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: RCU/x86/ARM: discount CPUs that were idle when grace period started.
Dario Faggioli [Wed, 30 Aug 2017 11:06:21 +0000 (12:06 +0100)]
xen: RCU/x86/ARM: discount CPUs that were idle when grace period started.

Xen is a tickless (micro-)kernel, i.e., when a CPU becomes
idle there is no timer tick that will periodically wake the
CPU up.
OTOH, when we imported RCU from Linux, Linux was (on x86) a
ticking kernel, i.e., there was a periodic timer tick always
running, even on idle CPUs. This was bad for power consumption,
but, for instance, made it easy to monitor the quiescent states
of all the CPUs, and hence tell when RCU grace periods ended.

In Xen, that is impossible, and that's particularly problematic
when the system is very lightly loaded, as some CPUs may never
have the chance to tell the RCU core logic about their quiescence,
and grace periods could extend indefinitely!

This has led, on x86, to long (and unpredictable) delays between
RCU callbacks queueing and their actual invocation. On ARM, we've
even seen infinite grace periods (e.g., complete_domain_destroy()
never being actually invoked!). See here:

 https://lists.xenproject.org/archives/html/xen-devel/2017-01/msg02454.html

The first step for fixing this situation is for RCU to record,
at the beginning of a grace period, which CPUs are already idle.
In fact, being idle, they can't be in the middle of any read-side
critical section, and we don't have to wait for their quiescence.

This is tracked in a cpumask, in a similar way to how it was also
done in Linux (on s390, which was tickless already). It is also
basically the same approach used for making Linux x86 tickless,
in 2.6.21 on (see commit 79bf2bb3 "tick-management: dyntick /
highres functionality").

For correctness, we also add barriers. One is also present in
Linux (see commit c3f59023, "Fix RCU race in access of nohz_cpu_mask",
although we change the code comment to something that makes better
sense for us). The other (which is its pair) is put in the newly
introduced function rcu_idle_enter(), right after updating the
cpumask. They prevent races between CPUs going idle during the
beginning of a grace period.
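
The idea of discounting idle CPUs can be sketched as follows. This is an
illustration only: a 64-bit word stands in for Xen's cpumask_t, all function
names are made up, and the memory barriers discussed above are only marked
by comments.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative mask type; Xen uses cpumask_t and its own helpers. */
typedef uint64_t cpumask64_t;

static cpumask64_t idle_mask;   /* CPUs currently idle */

static void sketch_mask_idle_enter(unsigned int cpu)
{
    assert(cpu < 64);
    idle_mask |= (cpumask64_t)1 << cpu;
    /* The real code places a memory barrier right after this update. */
}

static void sketch_mask_idle_exit(unsigned int cpu)
{
    assert(cpu < 64);
    idle_mask &= ~((cpumask64_t)1 << cpu);
}

/* CPUs whose quiescent state a newly started grace period must wait for:
 * every online CPU except those already idle. */
static cpumask64_t sketch_start_grace_period(cpumask64_t online)
{
    return online & ~idle_mask;
}
```

With CPUs 1 and 3 idle out of four online CPUs, only CPUs 0 and 2 need to
report quiescence.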

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: ARM: suspend the tick (if in use) when going idle.
Dario Faggioli [Wed, 30 Aug 2017 11:06:20 +0000 (12:06 +0100)]
xen: ARM: suspend the tick (if in use) when going idle.

Since commit 964fae8ac ("cpuidle: suspend/resume scheduler
tick timer during cpu idle state entry/exit"), if a scheduler
has a periodic tick timer, we stop it when going idle.

This, however, is only true for x86. Make it true for ARM as
well.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Tim Deegan <tim@xen.org>
7 years agoxen: in do_softirq() sample smp_processor_id() only once.
Dario Faggioli [Wed, 30 Aug 2017 11:06:20 +0000 (12:06 +0100)]
xen: in do_softirq() sample smp_processor_id() only once.

In fact, right now, we read it at every iteration of the loop.
The reason it's done like this is how context switch was handled
on IA64 (see commit ae9bfcdc, "[XEN] Various softirq cleanups" [1]).

However:
1) we don't have IA64 any longer, and all the architectures that
   we do support are ok with sampling once and for all;
2) sampling at every iteration (slightly) affects performance;
3) sampling at every iteration is misleading, as it makes people
   believe that it is currently possible that SCHEDULE_SOFTIRQ
   moves the execution flow on another CPU (and the comment,
   by reinforcing this belief, makes things even worse!).

Therefore, let's:
- do the sampling only once, and remove the comment;
- leave an ASSERT() around, so that, if context switching
  logic changes (in current or new arches), we will notice.
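
The two points above can be sketched like this. The globals and helper are
hypothetical stand-ins (sketch_processor_id() plays the role of
smp_processor_id(), and assert() that of ASSERT()); the loop body is reduced
to clearing pending bits.

```c
#include <assert.h>

/* Hypothetical stand-ins for per-CPU state. */
static unsigned int current_cpu;
static unsigned long pending;   /* bitmap of raised softirqs */
static unsigned int handled;    /* how many softirqs were processed */

static unsigned int sketch_processor_id(void)
{
    return current_cpu;
}

static void sketch_do_softirq(void)
{
    /* Sample the CPU id once, before the loop. */
    const unsigned int cpu = sketch_processor_id();

    while (pending != 0) {
        /* Isolate the lowest pending softirq and clear it. */
        unsigned long bit = pending & (~pending + 1);

        pending &= ~bit;
        handled++;

        /* Catch any future change that would migrate us mid-loop. */
        assert(cpu == sketch_processor_id());
    }
}
```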

[1] Some more (historical) information here:
    http://old-list-archives.xenproject.org/archives/html/xen-devel/2006-06/msg01262.html

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
7 years agoMerge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging
Jan Beulich [Wed, 30 Aug 2017 10:24:41 +0000 (12:24 +0200)]
Merge branch 'staging' of xenbits.xen.org:/home/xen/git/xen into staging

7 years agoRevert "mm: don't hold heap lock in alloc_heap_pages() longer than necessary"
Jan Beulich [Wed, 30 Aug 2017 10:23:23 +0000 (12:23 +0200)]
Revert "mm: don't hold heap lock in alloc_heap_pages() longer than necessary"

This reverts commit dab6a84aadab11f31332030a1e9f0b9282d76156,
as it introduces a race with free_heap_pages().

7 years agox86/percpu: Misc cleanup
Andrew Cooper [Fri, 18 Aug 2017 12:24:41 +0000 (13:24 +0100)]
x86/percpu: Misc cleanup

 * Drop unnecessary brackets.
 * Add spaces around binary operators.
 * Insert appropriate blank lines.
 * Insert a local variable block at the end.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/pv: Simplify access to the LDT/GDT ptes
Andrew Cooper [Sat, 26 Aug 2017 11:11:07 +0000 (11:11 +0000)]
x86/pv: Simplify access to the LDT/GDT ptes

Rename gdt_ldt_ptes() to pv_gdt_ptes() and drop the domain parameter, as it is
incorrect to use the helper with d != v->domain.

Introduce pv_ldt_ptes() to abstract away the fact that the LDT mapping is 16
slots after the GDT, and adjust the callers accordingly.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/pv: Consistently use typesafe helpers in all files
Andrew Cooper [Sat, 26 Aug 2017 10:40:56 +0000 (10:40 +0000)]
x86/pv: Consistently use typesafe helpers in all files

Rather than having a mix of code behaviour.  This requires updating
pagetable_{get,from}_page() to use the non-overridden helpers.

This requires some adjustments in priv_op_{read,write}_cr(), which is most
easily done by switching CR3 handling to using mfn_t.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/pv: map_ldt_shadow_page() cleanup
Andrew Cooper [Wed, 23 Aug 2017 17:49:31 +0000 (17:49 +0000)]
x86/pv: map_ldt_shadow_page() cleanup

Switch the return value from int to bool, to match its semantics.  Switch its
parameter from a frame offset to a byte offset (simplifying the sole caller),
which allows for an extra sanity check that the fault is within the LDT limit.

Drop the unnecessary gmfn and okay local variables, and correct the gva
parameter to be named linear.  Rename l1e to gl1e, and simplify the
construction of the new pte by simply taking (the now validated) gl1e and
ensuring that _PAGE_RW is set.

Calculate the pte to be updated outside of the spinlock, which halves the size
of the critical region.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/pv: Switch {fill,zap}_ro_mpt() to using mfn_t
Andrew Cooper [Wed, 23 Aug 2017 17:51:59 +0000 (17:51 +0000)]
x86/pv: Switch {fill,zap}_ro_mpt() to using mfn_t

And update all affected callers.  Fix the fill_ro_mpt() prototype to be bool
like its implementation.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoRevert "mm: don't hold heap lock in alloc_heap_pages() longer than necessary"
Andrew Cooper [Wed, 30 Aug 2017 10:00:14 +0000 (11:00 +0100)]
Revert "mm: don't hold heap lock in alloc_heap_pages() longer than necessary"

This reverts commit dab6a84aadab11f31332030a1e9f0b9282d76156.  The change is
not safe, and results in a crash such as:

(XEN) ----[ Xen-4.10-unstable  x86_64  debug=y   Tainted:    H ]----
(XEN) CPU:    5
(XEN) RIP:    e008:[<ffff82d0802252fc>] page_alloc.c#free_heap_pages+0x786/0x7a1
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor (d0v2)
(XEN) rax: 0000000000001c80   rbx: ffff82e01066bfa0   rcx: ffff82ffffffffe0
(XEN) rdx: ffff82ffffffffe0   rsi: ffff82d08056f600   rdi: 00000000ffffffff
(XEN) rbp: ffff83083751fda8   rsp: ffff83083751fd48   r8:  00000000000001c8
(XEN) r9:  0000000000000018   r10: 0000000000000018   r11: 0000000000000216
(XEN) r12: 0000000000000000   r13: 00000000000001b0   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000001526e0
(XEN) cr3: 000000072f465000   cr2: ffff82ffffffffe4
(XEN) ds: 002b   es: 002b   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d0802252fc> (page_alloc.c#free_heap_pages+0x786/0x7a1):
(XEN)  24 89 01 e9 91 fd ff ff <89> 7a 04 8b 03 89 01 e9 4d ff ff ff 48 83 c4 38
(XEN) Xen stack trace from rsp=ffff83083751fd48:
(XEN)    0000000000000001 0000001800000001 0000000000000000 0000000000000018
(XEN)    ffff82e01066bf80 0000000000000000 ffff82e01066b920 0000000000000000
(XEN)    ffff82e01066bf80 0000000000000000 ffff83082b781000 ffff880087e4eca8
(XEN)    ffff83083751fdf8 ffff82d080226785 ffff82d08023afa5 0000000000000203
(XEN)    ffff83082b781000 ffff83082b781340 ffff83082b7814d8 ffff83082b781aa8
(XEN)    ffff83082b781000 ffff880087e4eca8 ffff83083751fe18 ffff82d0802f1e44
(XEN)    ffff83082b781000 ffff83082b781000 ffff83083751fe48 ffff82d0802e0fd5
(XEN)    ffff83082b781000 00000000ffffffff ffff83082b781aa8 ffff83082b781000
(XEN)    ffff83083751fe68 ffff82d080271bc8 ffff83082b781aa8 ffff8300abe45000
(XEN)    ffff83083751fe98 ffff82d080207d74 ffff830837516040 0000000000000000
(XEN)    0000000000000000 ffff83083751ffff ffff83083751fec8 ffff82d080229a3c
(XEN)    ffff82d080572d80 ffff82d080573000 ffff82d080572d80 ffffffffffffffff
(XEN)    ffff83083751fef8 ffff82d08023a68a ffff8300abfa6000 0000000af2019e42
(XEN)    0000000000000000 ffff880087e4ec68 ffff83083751ff08 ffff82d08023a6df
(XEN)    00007cf7c8ae00c7 ffff82d08035f391 ffff880087e4eca8 ffff880087e4ec68
(XEN)    0000000000000000 0000000af2019e42 ffff880087e43d70 0000000000000002
(XEN)    0000000000000216 0000000000000004 0000000000000000 00000000000000b6
(XEN)    0000000000000000 ffffffff8100130a deadbeefdeadf00d deadbeefdeadf00d
(XEN)    deadbeefdeadf00d 0000010000000000 ffffffff8100130a 000000000000e033
(XEN)    0000000000000216 ffff880087e43d38 000000000000e02b c2c2c2c2c2c2beef
(XEN) Xen call trace:
(XEN)    [<ffff82d0802252fc>] page_alloc.c#free_heap_pages+0x786/0x7a1
(XEN)    [<ffff82d080226785>] free_domheap_pages+0x312/0x37c
(XEN)    [<ffff82d0802f1e44>] stdvga_deinit+0x30/0x46
(XEN)    [<ffff82d0802e0fd5>] hvm_domain_destroy+0x60/0x116
(XEN)    [<ffff82d080271bc8>] arch_domain_destroy+0x1a/0x8f
(XEN)    [<ffff82d080207d74>] domain.c#complete_domain_destroy+0x6f/0x182
(XEN)    [<ffff82d080229a3c>] rcupdate.c#rcu_process_callbacks+0x141/0x1a2
(XEN)    [<ffff82d08023a68a>] softirq.c#__do_softirq+0x7f/0x8a
(XEN)    [<ffff82d08023a6df>] do_softirq+0x13/0x15
(XEN)    [<ffff82d08035f391>] x86_64/entry.S#process_softirqs+0x21/0x30
(XEN)
(XEN) Pagetable walk from ffff82ffffffffe4:
(XEN)  L4[0x105] = 00000000abe5b063 ffffffffffffffff
(XEN)  L3[0x1ff] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 5:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0002]
(XEN) Faulting linear address: ffff82ffffffffe4
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agomm: don't hold heap lock in alloc_heap_pages() longer than necessary
Boris Ostrovsky [Wed, 30 Aug 2017 09:05:02 +0000 (11:05 +0200)]
mm: don't hold heap lock in alloc_heap_pages() longer than necessary

Once pages are removed from the heap we don't need to hold the heap
lock. It is especially useful to drop it for an unscrubbed buddy since
we will be scrubbing it.

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/hvm: allow guest_request vm_events coming from userspace
Alexandru Isaila [Wed, 30 Aug 2017 09:04:13 +0000 (11:04 +0200)]
x86/hvm: allow guest_request vm_events coming from userspace

In some introspection usecases, an in-guest agent needs to communicate
with the external introspection agent.  An existing mechanism is
HVMOP_guest_request_vm_event, but this is restricted to kernel usecases
like all other hypercalls.

Introduce a mechanism whereby the introspection agent can whitelist the
use of HVMOP_guest_request_vm_event directly from userspace.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
7 years agox86/pt: add a MSI unmask flag to XEN_DOMCTL_bind_pt_irq
Roger Pau Monné [Wed, 30 Aug 2017 09:02:24 +0000 (11:02 +0200)]
x86/pt: add a MSI unmask flag to XEN_DOMCTL_bind_pt_irq

The flag is part of the gflags, and should be used to request the
unmask of a MSI interrupt once it's bound.

This is required for the device model in order to be capable of
binding MSIX interrupts that have the entry mask bit already unset at
bind time. Without this fix the interrupts would be left masked.

Note that this commit introduces a change to the domctl, which
requires a bump of the interface version. This is not done here
because the interface version has already been bumped in this release
cycle.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Andreas Kinzler <hfp@posteo.de>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/pv: Fill all Xen slots in init_guest_l4_table()
Andrew Cooper [Mon, 28 Aug 2017 15:46:05 +0000 (16:46 +0100)]
x86/pv: Fill all Xen slots in init_guest_l4_table()

There is a bug when using highmem-start= where some L4 directmap slots are not
audited in alloc_l4_table(), and not overwritten by init_guest_l4_table().

As highmem_start is only available in debug builds of the hypervisor, this
does not constitute a security issue.

Ensure that init_guest_l4_table() writes to all of the Xen slots.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen/build: Nuke include/{config,generated} during clean
Andrew Cooper [Mon, 28 Aug 2017 16:16:59 +0000 (16:16 +0000)]
xen/build: Nuke include/{config,generated} during clean

Otherwise a stale generated Kconfig may still be used after a tree-wide clean.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoRevert "hvmloader, libxl: use the correct ACPI settings depending on device model"
Wei Liu [Tue, 29 Aug 2017 14:02:54 +0000 (15:02 +0100)]
Revert "hvmloader, libxl: use the correct ACPI settings depending on device model"

This reverts commit 149c6bbbf775b5e6dd6beae329fcdaab33a0f8cd.

7 years agoRevert "acpi: set correct address of the control/event blocks in the FADT"
Wei Liu [Tue, 29 Aug 2017 14:02:16 +0000 (15:02 +0100)]
Revert "acpi: set correct address of the control/event blocks in the FADT"

This reverts commit a8c87a8788e5ce21d6e55e0acdc64a8f26cf5687.

7 years agoxen: rtds: only tickle non-already tickled CPUs
Meng Xu [Thu, 3 Aug 2017 02:13:52 +0000 (22:13 -0400)]
xen: rtds: only tickle non-already tickled CPUs

When more than one idle VCPU with the same PCPU as its previous
running core invokes runq_tickle(), they will all tickle the same
PCPU. The tickled PCPU will only pick at most one VCPU, i.e., the
highest-priority one, to execute. The other VCPUs will not be
scheduled for a period, even when there is an idle core, making these
VCPUs unnecessarily starve for one period.

Therefore, always make sure that we only tickle PCPUs that have not
been tickled already.
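
A rough sketch of the rule, assuming a 64-bit word in place of Xen's
cpumask_t. The function name and the "first free CPU" choice are
simplifications made for this example, not the RTDS scheduler's logic.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: CPUs already asked to reschedule. */
static uint64_t tickled;

/* Return the first idle CPU not yet tickled and mark it, or -1 if none. */
static int sketch_pick_and_tickle(uint64_t idle)
{
    uint64_t candidates = idle & ~tickled;

    for (unsigned int cpu = 0; cpu < 64; cpu++) {
        uint64_t bit = (uint64_t)1 << cpu;

        if (candidates & bit) {
            assert(!(tickled & bit)); /* holds by construction */
            tickled |= bit;
            return (int)cpu;
        }
    }
    return -1;
}
```

Two wakeups in a row now land on two different idle CPUs instead of both
tickling the same one.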

Signed-off-by: Haoran Li <naroahlee@gmail.com>
Signed-off-by: Meng Xu <mengxu@cis.upenn.edu>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
7 years agolibxl/arm: Fix build on arm64 + acpi
Daniel Sabogal [Fri, 25 Aug 2017 21:35:47 +0000 (17:35 -0400)]
libxl/arm: Fix build on arm64 + acpi

With musl, the build fails with the following errors:

  actypes.h:202:2: error: #error unknown ACPI_MACHINE_WIDTH
   #error unknown ACPI_MACHINE_WIDTH
    ^~~~~
  actypes.h:207:9: error: unknown type name ‘acpi_native_uint’
   typedef acpi_native_uint acpi_size;
           ^~~~~~~~~~~~~~~~
  actypes.h:617:3: error: unknown type name ‘acpi_io_address’
     acpi_io_address pblk_address;
     ^~~~~~~~~~~~~~~

This likely went undetected with glibc builds since glibc
indirectly pulls __BITS_PER_LONG from the linux headers
through a standard header. For musl, this is not the case.

Instead, use BITS_PER_LONG to fix the build.

Signed-off-by: Daniel Sabogal <dsabogalcc@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoacpi: set correct address of the control/event blocks in the FADT
Roger Pau Monne [Tue, 29 Aug 2017 08:50:24 +0000 (09:50 +0100)]
acpi: set correct address of the control/event blocks in the FADT

Commit 149c6b unmasked an issue long present in Xen: the control/event
block addresses provided in the ACPI FADT table were hardcoded to the
V1 version. This was papered over because hvmloader would also always
set HVM_PARAM_ACPI_IOPORTS_LOCATION to 1 regardless of the BIOS
version.

The most notable issue caused by the above bug was that the QEMU
traditional GPE0 block was out of sync: the address provided in the
FADT didn't match the address QEMU was using.

Note that PM1a and TMR worked fine because the V1 address was
hardcoded in the FADT and HVM_PARAM_ACPI_IOPORTS_LOCATION was
unconditionally set to 1 by hvmloader.

Fix this by passing the address of the control/event blocks to
acpi_build_tables, so the values can be properly set in the FADT table
provided to the guest.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen: credit2: try to avoid tickling cpus subject to ratelimiting
Dario Faggioli [Tue, 29 Aug 2017 09:18:52 +0000 (10:18 +0100)]
xen: credit2: try to avoid tickling cpus subject to ratelimiting

With context switching ratelimiting enabled, the following
pattern is quite common in a scheduling trace:

     0.000845622 |||||||||||.x||| d32768v12 csched2:runq_insert d0v13, position 0
     0.000845831 |||||||||||.x||| d32768v12 csched2:runq_tickle_new d0v13, processor = 12, credit = 10135529
     0.000846546 |||||||||||.x||| d32768v12 csched2:burn_credits d2v7, credit = 2619231, delta = 255937
 [1] 0.000846739 |||||||||||.x||| d32768v12 csched2:runq_tickle cpu 12
     [...]
 [2] 0.000850597 ||||||||||||x||| d32768v12 csched2:schedule cpu 12, rq# 1, busy, SMT busy, tickled
     0.000850760 ||||||||||||x||| d32768v12 csched2:burn_credits d2v7, credit = 2614028, delta = 5203
 [3] 0.000851022 ||||||||||||x||| d32768v12 csched2:ratelimit triggered
 [4] 0.000851614 ||||||||||||x||| d32768v12 runstate_continue d2v7 running->running

Basically, what happens is that runq_tickle() realizes
d0v13 should preempt d2v7, running on cpu 12, as it
has higher credits (10135529 vs. 2619231). It therefore
tickles cpu 12 [1], which, in turn, schedules [2].

But --surprise surprise-- d2v7 has run for less than the
ratelimit interval [3], and hence it is _not_ preempted,
and continues to run. This indeed looks fine. Actually,
this is what ratelimiting is there for. Note, however,
that:
 1) we interrupted cpu 12 for nothing;
 2) what if, say on cpu 8, there is a vcpu that has:
    + less credit than d0v13 (so d0v13 can well
      preempt it),
    + more credit than d2v7 (that's why it was not
      selected to be preempted),
    + run for more than the ratelimiting interval
      (so it can really be scheduled out)?

With this patch, if we are in case 2), we'd realize
that tickling 12 would be pointless, and we'll continue
looking, eventually finding and tickling 8.
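
The selection described above might look roughly like this. The struct,
field names and units are assumptions for illustration and do not match
credit2's data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU view; field names are not Xen's. */
struct sketch_cpu {
    long credit;      /* credit of the vcpu currently running there */
    long ran_for_us;  /* how long that vcpu has been running */
};

/* Pick the CPU running the lowest-credit vcpu that the waking vcpu beats
 * and that has already run past the ratelimit; -1 if there is none. */
static int sketch_pick_preempt(const struct sketch_cpu *cpus, size_t n,
                               long new_credit, long ratelimit_us)
{
    int best = -1;

    assert(cpus != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cpus[i].credit >= new_credit)
            continue;   /* the waking vcpu would not preempt it */
        if (cpus[i].ran_for_us < ratelimit_us)
            continue;   /* ratelimited: tickling would be pointless */
        if (best < 0 || cpus[i].credit < cpus[best].credit)
            best = (int)i;
    }
    return best;
}
```

With the ratelimit check in place, the CPU running the lowest-credit vcpu is
skipped when that vcpu has not yet run long enough, and the next suitable
CPU is chosen instead.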

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
7 years agoxen: credit2: optimize runq_candidate() a little bit
Dario Faggioli [Tue, 29 Aug 2017 09:18:52 +0000 (10:18 +0100)]
xen: credit2: optimize runq_candidate() a little bit

By factoring into one (at the top) all the checks
to see whether current is the idle vcpu, and marking
it as unlikely().

In fact, if current is idle, all the logic for
dealing with yielding, context switching rate
limiting and soft-affinity, is just pure overhead,
and we had better go straight to checking the runq
and pick some vcpu up.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: credit2: kick away vcpus not running within their soft-affinity
Dario Faggioli [Tue, 29 Aug 2017 09:18:51 +0000 (10:18 +0100)]
xen: credit2: kick away vcpus not running within their soft-affinity

If, during scheduling, we realize that the current vcpu
is running outside of its own soft-affinity, it would be
preferable to send it somewhere else.

Of course, that may not be possible, and if we're too
strict, we risk having vcpus sit in runqueues, even if
there are idle pcpus (violating work-conservingness).
In fact, what if there are no pcpus, in the soft
affinity mask of the vcpu in question, where it can
run?

To make sure we don't fall in the above described trap,
only actually de-schedule the vcpu if there are idle and
not already tickled cpus from its soft affinity where it
can run immediately.

If there is at least one such cpu, we let current
be preempted, so that csched2_context_saved() will put
it back in the runq, and runq_tickle() will wake (one
of) those cpus.

If there is not even one, we let current run where it is,
as running outside its soft-affinity is still better than
not running at all.
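
The decision can be sketched as a single mask test. This assumes 64-bit
words in place of Xen's cpumask_t, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Should the vcpu running on cur_cpu be de-scheduled so that it can move
 * into its soft-affinity?  Illustrative only. */
static bool sketch_should_kick(unsigned int cur_cpu, uint64_t soft,
                               uint64_t idle, uint64_t tickled)
{
    assert(cur_cpu < 64);

    /* Already inside the soft-affinity: nothing to fix. */
    if (soft & ((uint64_t)1 << cur_cpu))
        return false;

    /* Only give up the CPU if a soft-affinity CPU is idle and not yet
     * tickled, so the vcpu can start running there immediately. */
    return (soft & idle & ~tickled) != 0;
}
```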

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: credit2: soft-affinity awareness in csched2_cpu_pick()
Dario Faggioli [Tue, 29 Aug 2017 09:18:51 +0000 (10:18 +0100)]
xen: credit2: soft-affinity awareness in csched2_cpu_pick()

We want to find the runqueue with the least average load,
and to do that, we scan through all the runqueues.

It is, therefore, enough that, during such scan:
- we identify the runqueue with the least load, among
  the ones that have pcpus that are part of the soft
  affinity of the vcpu we're calling pick on;
- we identify the same, but for hard affinity.

At this point, we can decide whether to go for the
runqueue with the least load among the ones with some
soft-affinity, or overall.

Therefore, at the price of some code reshuffling, we
can avoid the loop.

(Also, kill a spurious ';' in the definition of MAX_LOAD.)
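
A single scan that tracks both minima could be sketched as below. The
structures are invented for the example (a 64-bit word per runqueue instead
of cpumask_t), and the final choice between the two results is left to the
caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical runqueue view: which CPUs it spans and its average load. */
struct sketch_rq {
    uint64_t cpus;
    long load;
};

/* Indices of the least loaded runqueues found; -1 means none. */
struct sketch_pick {
    int soft;   /* least loaded among those overlapping soft affinity */
    int hard;   /* least loaded among those overlapping hard affinity */
};

static struct sketch_pick sketch_scan(const struct sketch_rq *rqs, size_t n,
                                      uint64_t soft, uint64_t hard)
{
    struct sketch_pick p = { -1, -1 };

    assert(rqs != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!(rqs[i].cpus & hard))
            continue;   /* the vcpu cannot run here at all */
        if (p.hard < 0 || rqs[i].load < rqs[p.hard].load)
            p.hard = (int)i;
        if ((rqs[i].cpus & hard & soft) &&
            (p.soft < 0 || rqs[i].load < rqs[p.soft].load))
            p.soft = (int)i;
    }
    return p;
}
```

One pass over the runqueues yields both candidates, so no separate
soft-affinity and hard-affinity iterations are needed.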

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: credit2: soft-affinity awareness in get_fallback_cpu()
Dario Faggioli [Tue, 29 Aug 2017 09:18:50 +0000 (10:18 +0100)]
xen: credit2: soft-affinity awareness in get_fallback_cpu()

By, basically, moving all the logic of the function
inside the usual two steps (soft-affinity step and
hard-affinity step) loop.

While there, add two performance counters (in cpu_pick
and in get_fallback_cpu() itself), in order to be able
to tell how frequently it happens that we need to look
for a fallback cpu.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen/credit2: soft-affinity awareness in runq_tickle()
George Dunlap [Tue, 29 Aug 2017 09:18:49 +0000 (10:18 +0100)]
xen/credit2: soft-affinity awareness in runq_tickle()

Soft-affinity support is usually implemented by means
of a two step "balancing loop", where:
- during the first step, we consider soft-affinity
  (if the vcpu has one);
- during the second (if we get to it), we consider
  hard-affinity.

In runq_tickle(), we need to do that for checking
whether we can execute the waking vCPU on a pCPU
that is idle. In fact, we want to be sure that, if
there is an idle pCPU in the vCPU's soft affinity,
we'll use it.

If there are no such idle pCPUs, though, and we
have to check non-idle ones, we can avoid the loop
and do both hard and soft-affinity in one pass.

In fact, we can scan the runqueue and compute a
"score" for each vCPU which is running on each pCPU.
The idea is, since we may have to preempt someone:
- try to make sure that the waking vCPU will run
  inside its soft-affinity,
- try to preempt someone that is running outside
  of its own soft-affinity.

The value of the score is added to a trace record,
so xenalyze's code and tools/xentrace/formats are
updated accordingly.

Suggested-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agox86/mm: Drop is_guest_l3_slot() and simplify callers
Andrew Cooper [Mon, 28 Aug 2017 14:45:07 +0000 (14:45 +0000)]
x86/mm: Drop is_guest_l3_slot() and simplify callers

With a 64bit hypervisor there are no conditional l3 slots, and this is
unlikely to change moving forwards.

No functional change (as confirmed by diffing the disassembly.  GCC obviously
already optimised this code away.)

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen: fix boolean parameter handling
Juergen Gross [Mon, 28 Aug 2017 14:49:30 +0000 (16:49 +0200)]
xen: fix boolean parameter handling

Commit 63e8a1e5ffa7a7fdbde887805f673fea7e8d2e94 ("xen: check parameter
validity when parsing command line") introduced a bug for the case
when a boolean parameter was specified by its keyword only (no value).
It would set just the wrong boolean value for that parameter.
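
For illustration, the intended behaviour for a keyword-only boolean can be
sketched as follows. This is not Xen's parser: the function name, the
accepted value spellings and the "no-" handling are assumptions made for the
example, following the usual convention that a bare keyword enables the
option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Recognises "name", "no-name" and "name=<1|yes|on|true|0|no|off|false>".
 * Returns 0 and stores the result in *val on success, -1 otherwise. */
static int sketch_parse_bool(const char *name, const char *opt, bool *val)
{
    bool negate = false;
    size_t len;

    assert(name != NULL && opt != NULL && val != NULL);
    len = strlen(name);

    if (strncmp(opt, "no-", 3) == 0) {
        negate = true;
        opt += 3;
    }
    if (strncmp(opt, name, len) != 0)
        return -1;
    opt += len;

    if (*opt == '\0') {     /* keyword only: the option is enabled */
        *val = !negate;
        return 0;
    }
    if (*opt != '=')
        return -1;
    opt++;

    if (!strcmp(opt, "1") || !strcmp(opt, "yes") ||
        !strcmp(opt, "on") || !strcmp(opt, "true"))
        *val = !negate;
    else if (!strcmp(opt, "0") || !strcmp(opt, "no") ||
             !strcmp(opt, "off") || !strcmp(opt, "false"))
        *val = negate;
    else
        return -1;
    return 0;
}
```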

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agoxen: make some console related parameters settable at runtime
Juergen Gross [Mon, 28 Aug 2017 07:35:00 +0000 (09:35 +0200)]
xen: make some console related parameters settable at runtime

Support modifying conswitch, console_timestamps, loglvl and
guest_loglvl at runtime.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agoxl: add new xl command set-parameters
Juergen Gross [Mon, 28 Aug 2017 07:37:00 +0000 (09:37 +0200)]
xl: add new xl command set-parameters

Add a new xl command "set-parameters" to set hypervisor parameters at
runtime similar to boot time parameters via command line.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: add libxl_set_parameters() function
Juergen Gross [Mon, 28 Aug 2017 07:35:00 +0000 (09:35 +0200)]
libxl: add libxl_set_parameters() function

Add a new libxl function to set hypervisor parameters at runtime
similar to boot time parameters via command line.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxc: add function to set hypervisor parameters
Juergen Gross [Mon, 28 Aug 2017 07:36:00 +0000 (09:36 +0200)]
libxc: add function to set hypervisor parameters

Add a new libxc function to set hypervisor parameters at runtime
similar to boot time parameters via command line.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years ago xen: add hypercall for setting parameters at runtime
Juergen Gross [Mon, 28 Aug 2017 07:35:00 +0000 (09:35 +0200)]
xen: add hypercall for setting parameters at runtime

Add a sysctl hypercall to support setting parameters similar to
command line parameters, but at runtime. The parameters to set are
specified as a string, just like the boot parameters.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years ago xen: add basic support for runtime parameter changing
Juergen Gross [Mon, 28 Aug 2017 07:35:00 +0000 (09:35 +0200)]
xen: add basic support for runtime parameter changing

Add the needed infrastructure for runtime parameter changing similar
to that used at boot time via cmdline. We are using the same parsing
functions as for cmdline parsing, but with a different array of
parameter definitions.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
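
The idea in the commit above, one parser shared by two different arrays of parameter definitions, might look roughly like the following. This is a minimal sketch under stated assumptions, not the Xen implementation: struct param, boot_params, runtime_params, parse_one() and the two example variables are all hypothetical names, and only integer parameters of the form "name=value" are handled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical parameter definition: a name and the integer it controls. */
struct param {
    const char *name;
    int *var;
};

int boot_only_value;   /* example variable settable only at boot */
int shared_value;      /* example variable settable at boot and at runtime */

/* Table consulted when parsing the boot command line. */
const struct param boot_params[] = {
    { "boot_only", &boot_only_value },
    { "shared",    &shared_value },
};

/* Smaller table consulted for runtime changes. */
const struct param runtime_params[] = {
    { "shared", &shared_value },
};

/*
 * Shared parser: look up "name=value" in the given table and store the
 * value. Returns 0 on success, -EINVAL for unknown names or bad values.
 */
int parse_one(const char *token, const struct param *tbl, size_t n)
{
    const char *eq;
    size_t len, i;

    assert(token != NULL && tbl != NULL);

    eq = strchr(token, '=');
    if (eq == NULL)
        return -EINVAL;
    len = (size_t)(eq - token);

    for (i = 0; i < n; i++) {
        if (strlen(tbl[i].name) == len && !strncmp(tbl[i].name, token, len)) {
            char *end;
            long v = strtol(eq + 1, &end, 0);

            /* Reject empty values and trailing garbage. */
            if (end == eq + 1 || *end != '\0')
                return -EINVAL;
            *tbl[i].var = (int)v;
            return 0;
        }
    }
    return -EINVAL;
}
```

Because the table is an argument, a name that appears only in boot_params is rejected when the same function is called with runtime_params, which is how a parameter stays boot-only in this sketch.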
7 years ago xen: carve out a generic parsing function from _cmdline_parse()
Juergen Gross [Mon, 28 Aug 2017 07:35:00 +0000 (09:35 +0200)]
xen: carve out a generic parsing function from _cmdline_parse()

In order to support generic parameter parsing, carve out the parser from
_cmdline_parse(). As this generic function might be called after boot,
remove the __init annotations from all called sub-functions.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years ago xen/common/sched_credit2.c: remove custom_param() error messages
Juergen Gross [Mon, 28 Aug 2017 07:36:00 +0000 (09:36 +0200)]
xen/common/sched_credit2.c: remove custom_param() error messages

With _cmdline_parse() now issuing error messages in case of illegal
parameters signalled by parsing functions specified in custom_param()
the message issued by parse_credit2_runqueue() can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Dario Faggioli <dario.faggioli@citrix.com>
7 years ago xen/arch/x86/io_apic.c: remove custom_param() error messages
Juergen Gross [Mon, 28 Aug 2017 07:34:00 +0000 (09:34 +0200)]
xen/arch/x86/io_apic.c: remove custom_param() error messages

With _cmdline_parse() now issuing error messages in case of illegal
parameters signalled by parsing functions specified in custom_param()
the message issued by setup_ioapic_ack() can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years ago xen/arch/x86/hvm/viridian.c: remove custom_param() error messages
Juergen Gross [Mon, 28 Aug 2017 07:34:00 +0000 (09:34 +0200)]
xen/arch/x86/hvm/viridian.c: remove custom_param() error messages

With _cmdline_parse() now issuing error messages in case of illegal
parameters signalled by parsing functions specified in custom_param()
the message issued by parse_viridian_version() can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
7 years ago xen/arch/x86/cpu/mcheck/mce.c: remove custom_param() error messages
Juergen Gross [Mon, 28 Aug 2017 07:34:00 +0000 (09:34 +0200)]
xen/arch/x86/cpu/mcheck/mce.c: remove custom_param() error messages

With _cmdline_parse() now issuing error messages in case of illegal
parameters signalled by parsing functions specified in custom_param()
the message issued by mce_set_verbosity() can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years ago xen/arch/x86/apic.c: remove custom_param() error messages
Juergen Gross [Mon, 28 Aug 2017 07:34:00 +0000 (09:34 +0200)]
xen/arch/x86/apic.c: remove custom_param() error messages

With _cmdline_parse() now issuing error messages in case of illegal
parameters signalled by parsing functions specified in custom_param()
the message issued by apic_set_verbosity() can be removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years ago xen: check parameter validity when parsing command line
Juergen Gross [Mon, 28 Aug 2017 07:34:00 +0000 (09:34 +0200)]
xen: check parameter validity when parsing command line

Where possible, check the validity of parameters in _cmdline_parse() and
issue a warning message if an error is detected.

In order to make sure a custom parameter parsing function really
returns a value (error or success), don't use a void pointer for
storing the function address, but a proper typed function pointer.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
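
The switch from a void pointer to a typed function pointer described in the commit above can be sketched as follows. The names param_parse_fn, struct custom_param_def, parse_example_verbosity() and apply_custom_param() are hypothetical and the warning text is invented for illustration; none of this is taken from the Xen sources. The point is that the typed pointer lets the compiler insist that every handler returns an int, which the caller can then check and report.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Typed handler: every custom parser must return 0 or a negative errno. */
typedef int (*param_parse_fn)(const char *value);

/* Hypothetical definition record pairing a name with its typed handler. */
struct custom_param_def {
    const char *name;
    param_parse_fn parse;   /* a void * here would hide the return type */
};

int example_verbosity;

/* Example handler: accepts "verbose" or "debug", rejects anything else. */
int parse_example_verbosity(const char *value)
{
    if (!strcmp(value, "verbose")) {
        example_verbosity = 1;
        return 0;
    }
    if (!strcmp(value, "debug")) {
        example_verbosity = 2;
        return 0;
    }
    return -EINVAL;
}

/*
 * Central caller: invoke the handler and emit one warning on failure, so
 * individual handlers no longer need to print their own error messages.
 */
int apply_custom_param(const struct custom_param_def *def, const char *value)
{
    int rc;

    assert(def != NULL && def->parse != NULL && value != NULL);

    rc = def->parse(value);
    if (rc)
        fprintf(stderr, "parameter \"%s\" has invalid value \"%s\", rc=%d\n",
                def->name, value, rc);
    return rc;
}
```

A handler declared as returning void would fail to compile when assigned to the parse member without a cast, which is the kind of check a plain void pointer cannot provide.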
7 years ago xen/arch/x86/psr.c: let custom parameter parsing routines return errno
Juergen Gross [Mon, 28 Aug 2017 07:34:00 +0000 (09:34 +0200)]
xen/arch/x86/psr.c: let custom parameter parsing routines return errno

Modify the custom parameter parsing routines in:

xen/arch/x86/psr.c

to indicate whether the parameter value was parsed successfully.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Jan Beulich <jbeulich@suse.com>