xenbits.xensource.com Git - people/dariof/xen.git/log
7 years ago xen/libxc: suppress direct access to Credit1's migration delay rel/sched/credit/vcpu_migr_delay_percpool-v2 github/rel/sched/credit/vcpu_migr_delay_percpool-v2 gitlab/rel/sched/credit/vcpu_migr_delay_percpool-v2
Dario Faggioli [Thu, 22 Feb 2018 14:30:21 +0000 (15:30 +0100)]
xen/libxc: suppress direct access to Credit1's migration delay

Removes special purpose access to Credit1 vCPU
migration delay parameter.

This fixes a build breakage, occurring when Xen
is configured with SCHED_CREDIT=n.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
---
Changes from v1:
* bumped the interface version, as requested.

7 years ago tools: xenpm: continue to support {set,get}-vcpu-migration-delay
Dario Faggioli [Mon, 19 Feb 2018 18:07:43 +0000 (19:07 +0100)]
tools: xenpm: continue to support {set,get}-vcpu-migration-delay

Now that it is possible to get and set the migration
delay via the SCHEDOP sysctl, use that in xenpm, instead
of the special purpose libxc interface (which will be
removed in a following commit).

The sysctl, however, requires a cpupool-id argument,
to know which scheduler it is operating on. In
this case, since we don't want to alter xenpm's command
line interface, we always use '0', which means xenpm
will always act on the default cpupool ('Pool-0').

From this commit on, `xenpm {set,get}-vcpu-migration-delay'
commands work again. But that is only for the sake of
backward compatibility, and their use is deprecated, in
favour of 'xl sched-credit -s [-c <poolid>] -m <delay>'.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago tools: libxl/xl: allow to get/set Credit1's vcpu_migration_delay
Dario Faggioli [Mon, 19 Feb 2018 18:03:31 +0000 (19:03 +0100)]
tools: libxl/xl: allow to get/set Credit1's vcpu_migration_delay

Make it possible to get and set a (Credit1) scheduler's
vCPU migration delay via the SCHEDOP sysctl, from both
libxl and xl (no change needed in libxc).

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Wei Liu <wei.liu2@citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
---
Changes from v1:
* add missing 'break', fix using wrong variable in xl_sched.c.

7 years ago xen: sched/credit1: make vcpu_migration_delay per-cpupool
Dario Faggioli [Thu, 15 Feb 2018 14:53:50 +0000 (15:53 +0100)]
xen: sched/credit1: make vcpu_migration_delay per-cpupool

Right now, vCPU migration delay is controlled by
the vcpu_migration_delay boot parameter. This means
the same value will always be used for every instance
of Credit1, in any cpupool that will be created.

Also, in order to get and set such value, a special
purpose libxc interface is defined, and used by the
xenpm tool. And this is problematic if Xen is built
without Credit1 support.

This commit adds a vcpu_migr_delay field inside
struct csched_private, so that we can get/set the
migration delay independently for each Credit1 instance,
in different cpupools.

Getting and setting now happens via XEN_SYSCTL_SCHEDOP_*,
which is much better suited for this parameter.

The value of the boot time parameter is used for
initializing the vcpu_migr_delay field of the private
structure of all the scheduler instances, when they're
created.

While there, save reading NOW() and doing any s_time_t
operation, when the migration delay of a scheduler is
zero (as it is, by default), in
__csched_vcpu_is_cache_hot().

Finally, note that, from this commit on, using `xenpm
{set,get}-vcpu-migration-delay' is not effective any
longer.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
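
Illustration for the commit above (not part of the patch): the message says
that reading NOW() and the s_time_t arithmetic are skipped when the migration
delay is zero. A minimal sketch of that short-circuit follows. The names
read_clock(), struct sched_priv and vcpu_is_cache_hot() are hypothetical
stand-ins chosen for this example, not the actual Xen definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t s_time_t;            /* time in nanoseconds */

/* Hypothetical stand-in for NOW(); counts how often the clock is read. */
static s_time_t fake_now = 1000;
static unsigned int now_reads;

static s_time_t read_clock(void)
{
    now_reads++;
    return fake_now;
}

/* Hypothetical per-scheduler private data; only the field of interest. */
struct sched_priv {
    s_time_t vcpu_migr_delay;        /* 0 means "no migration delay" */
};

/* A vCPU counts as cache hot if it last ran less than vcpu_migr_delay ago. */
static bool vcpu_is_cache_hot(const struct sched_priv *prv, s_time_t last_run)
{
    assert(prv != NULL);

    /* With a zero delay, && short-circuits and the clock is never read. */
    return prv->vcpu_migr_delay != 0 &&
           (read_clock() - last_run) < prv->vcpu_migr_delay;
}
```

Because the delay is zero by default, the common case costs a single
comparison in this sketch.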
7 years ago xen: sched/credit: convert scheduling parameter to s_time_t when set
Dario Faggioli [Fri, 16 Feb 2018 10:25:28 +0000 (11:25 +0100)]
xen: sched/credit: convert scheduling parameter to s_time_t when set

Basically, instead of converting integers to s_time_t
at usage time (hot paths), do the conversion when the
values are set (cold paths).

This applies to the timeslice and the ratelimit
parameters of Credit1.

Note that, when changing the type of the fields of
struct csched_private (from unsigned to s_time_t),
ncpus is moved up a bit, for better packing.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
---
Cc: George Dunlap <george.dunlap@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
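
Illustration for the commit above (not part of the patch): converting once in
the setter keeps unit conversion out of the hot path. The sketch below assumes
the timeslice is given in milliseconds and the ratelimit in microseconds; the
struct and function names are hypothetical, and the two macros are defined
locally for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int64_t s_time_t;                 /* time in nanoseconds */

#define MILLISECS(ms) ((s_time_t)(ms) * 1000000LL)
#define MICROSECS(us) ((s_time_t)(us) * 1000LL)

/* Hypothetical scheduler-private structure holding pre-converted values. */
struct sched_params {
    s_time_t tslice;      /* timeslice, stored in ns */
    s_time_t ratelimit;   /* ratelimit, stored in ns */
};

/* Cold path: convert once, when the values are set. */
static void params_set(struct sched_params *p,
                       unsigned int tslice_ms, unsigned int ratelimit_us)
{
    assert(p != NULL);
    p->tslice = MILLISECS(tslice_ms);
    p->ratelimit = MICROSECS(ratelimit_us);
}

/* Hot path: a plain comparison, with no unit conversion. */
static int ran_long_enough(const struct sched_params *p, s_time_t runtime)
{
    return runtime >= p->ratelimit;
}
```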
7 years ago x86: add .size/.type directives to indirect thunk generation macro
Jan Beulich [Fri, 23 Feb 2018 13:25:54 +0000 (14:25 +0100)]
x86: add .size/.type directives to indirect thunk generation macro

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago get_maintainers.pl: Avoid THE_REST when files are added or removed
Alan Robinson [Fri, 23 Feb 2018 13:24:56 +0000 (14:24 +0100)]
get_maintainers.pl: Avoid THE_REST when files are added or removed

When files are added or removed, /dev/null is used as a placeholder
name in the patch for the absent file.  Don't try to find a
MAINTAINER for this placeholder: it only ever flags, and then
spams, THE REST.  Behaviour for a real filename is unchanged.

Signed-off-by: Alan Robinson <Alan.Robinson@ts.fujitsu.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
7 years ago build: Rename as-insn-check to as-option-add
Andrew Cooper [Wed, 21 Feb 2018 18:20:15 +0000 (18:20 +0000)]
build: Rename as-insn-check to as-option-add

as-insn-check mutates the passed-in flags.  Rename it to as-option-add, in
line with cc-option-add, and update all callers.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years ago build: Help attempts to syntax highlight Config.mk
Andrew Cooper [Wed, 21 Feb 2018 17:58:04 +0000 (17:58 +0000)]
build: Help attempts to syntax highlight Config.mk

Some attempts to syntax highlight Config.mk end up thinking that most of
Config.mk is a string, due to the unbalanced squote.  Provide a balancing
squote in a comment to compensate.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years ago xen: append EXTRA_CFLAGS_XEN_CORE to CFLAGS
Doug Goldstein [Fri, 23 Feb 2018 10:05:35 +0000 (11:05 +0100)]
xen: append EXTRA_CFLAGS_XEN_CORE to CFLAGS

Allow a user to supply extra CFLAGS via the EXTRA_CFLAGS_XEN_CORE
environment variable for hypervisor builds. This is not a
configuration that is supported but is only aimed to help support
testing and troubleshooting when you need to make changes.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years ago build: remove shim related targets
Roger Pau Monné [Fri, 23 Feb 2018 10:05:19 +0000 (11:05 +0100)]
build: remove shim related targets

There's no need to have shim specific targets, so just use the regular
xen makefile targets in order to build the shim binary.

When the shim is built as part of the firmware directory, install the
stripped Xen binary to the firmware directory and place a binary with
symbols in the debug directory.

The objcopy step of the shim build is also removed in this patch:
since the shim is booted in PVH mode there's no need for the resulting
binary to be in elf32 format. Xen can load PVH kernels with either a
32 or 64bit elf header.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago x86/svm: enable pause filtering threshold
Brian Woods [Fri, 23 Feb 2018 10:04:48 +0000 (11:04 +0100)]
x86/svm: enable pause filtering threshold

If available, enable the pause filtering threshold feature.  See the
previous commit for more information.

Signed-off-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years ago x86/svm: add support for pause filtering threshold
Brian Woods [Fri, 23 Feb 2018 10:03:36 +0000 (11:03 +0100)]
x86/svm: add support for pause filtering threshold

Add support for enabling the pause filtering threshold feature.  This
causes the pause filtering count to reset if at least the pause filtering
threshold number of cycles elapses between pauses.  See AMD APM Vol 2
Section 15.14.4 for more details.

The values of the pause filtering count and threshold were found by
iterating over different values of the count and threshold while running
kernbench and a pi spigot algorithm with yields placed in it.  A
balanced setting for both variables provides:

(Using averaged elapsed time with kernbench)
old = 852.0
new = 848.8
improvement = .4%

For systems without pause filtering threshold, the change, from 3000 to
4000 for the count, should not negatively affect system performance.

Signed-off-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years ago x86: fix indirect thunk usage of CONFIG_INDIRECT_THUNK
Roger Pau Monné [Fri, 23 Feb 2018 10:00:31 +0000 (11:00 +0100)]
x86: fix indirect thunk usage of CONFIG_INDIRECT_THUNK

When indirect_thunk_asm.h is instantiated directly into assembly files
CONFIG_INDIRECT_THUNK might not be defined, and thus using .if against
it is wrong.

Add a check to define CONFIG_INDIRECT_THUNK to 0 if not defined, so
that using .if CONFIG_INDIRECT_THUNK is always correct.

This suppresses the following clang error:

<instantiation>:8:9: error: expected absolute expression
    .if CONFIG_INDIRECT_THUNK == 1
        ^
<instantiation>:1:1: note: while in macro instantiation
INDIRECT_BRANCH call %rdx
^
entry.S:589:9: note: while in macro instantiation
        INDIRECT_CALL %rdx
        ^

Note that this is a preparatory patch in order to enable clang's
integrated assembler; the integrated assembler is not yet enabled for
assembly files.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years ago VT-d: use two 32-bit writes to update DMAR fault address registers
Haozhong Zhang [Fri, 23 Feb 2018 09:59:31 +0000 (10:59 +0100)]
VT-d: use two 32-bit writes to update DMAR fault address registers

The 64-bit DMAR fault address is composed of two 32-bit registers,
DMAR_FEADDR_REG and DMAR_FEUADDR_REG. According to the VT-d spec:
"Software is expected to access 32-bit registers as aligned doublewords",
a hypervisor should use two 32-bit writes to DMAR_FEADDR_REG and
DMAR_FEUADDR_REG separately in order to update a 64-bit fault address,
rather than a 64-bit write to DMAR_FEADDR_REG. Note that when x2APIC
is not enabled DMAR_FEUADDR_REG is reserved and it's not necessary to
update it.

Though I haven't seen any errors caused by such a single 64-bit write on
real machines, it's still better to follow the specification.

Fixes: ae05fd3912b ("VT-d: use qword MMIO access for MSI address writes")
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
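
Illustration for the commit above (not part of the patch): a minimal sketch of
splitting one 64-bit update into two aligned 32-bit register writes. The
register indices, write32() and set_fault_addr() are hypothetical names for
this example; a plain array stands in for the MMIO window.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file standing in for the MMIO window. */
enum { REG_FEADDR = 0, REG_FEUADDR = 1, REG_COUNT = 2 };

static void write32(volatile uint32_t *regs, unsigned int idx, uint32_t val)
{
    assert(idx < REG_COUNT);
    regs[idx] = val;
}

/* Update a 64-bit fault address with two aligned 32-bit writes. */
static void set_fault_addr(volatile uint32_t *regs, uint64_t addr, int x2apic)
{
    write32(regs, REG_FEADDR, (uint32_t)addr);

    /* The upper register is reserved unless x2APIC is enabled. */
    if (x2apic)
        write32(regs, REG_FEUADDR, (uint32_t)(addr >> 32));
}
```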
7 years ago x86/svm: add EFER SVME support for VGIF/VLOAD
Brian Woods [Tue, 20 Feb 2018 22:27:02 +0000 (16:27 -0600)]
x86/svm: add EFER SVME support for VGIF/VLOAD

Only enable virtual VMLOAD/SAVE and VGIF if the guest EFER.SVME is set.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years ago sysctl: correct comment in xen_sysctl_pcitopoinfo
Olaf Hering [Wed, 21 Feb 2018 13:44:58 +0000 (14:44 +0100)]
sysctl: correct comment in xen_sysctl_pcitopoinfo

Refer to correct member of struct xen_sysctl_pcitopoinfo in comment.

Fixes: commit 61319fbfd9 ("sysctl: add sysctl interface for querying PCI topology")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago xen/tmem: Convert the file common/tmem_xen.c to use typesafe MFN
Julien Grall [Wed, 21 Feb 2018 14:02:44 +0000 (14:02 +0000)]
xen/tmem: Convert the file common/tmem_xen.c to use typesafe MFN

The file common/tmem_xen.c is now converted to use typesafe MFNs. This
requires overriding the macro page_to_mfn to make it work with mfn_t.

Note that all variables converted to mfn_t have their initial value,
when set, switched from 0 to INVALID_MFN. This is fine because the initial
values were always overridden before use.

Also add a couple of missing newlines suggested by Andrew in the code.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
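
Illustration for the commit above (not part of the patch): the idea behind a
typesafe frame number is to wrap the raw integer in a struct, so that mixing
it with other integers becomes a compile error. The sketch below is a
simplified, standalone version written for this example; the names follow the
ones mentioned in the message (mfn_t, INVALID_MFN), but the definitions here
are assumptions and not copied from Xen.

```c
#include <assert.h>
#include <stdbool.h>

/* A struct wrapper cannot be silently mixed with plain integers. */
typedef struct { unsigned long mfn; } mfn_t;

/* Box and unbox helpers. */
static inline mfn_t _mfn(unsigned long n) { mfn_t m = { n }; return m; }
static inline unsigned long mfn_x(mfn_t m) { return m.mfn; }
static inline bool mfn_eq(mfn_t a, mfn_t b) { return mfn_x(a) == mfn_x(b); }

/* Sentinel used as the initial value instead of 0. */
#define INVALID_MFN _mfn(~0UL)

/* Illustrative helper: advance a valid frame number by i frames. */
static inline mfn_t mfn_add(mfn_t m, unsigned long i)
{
    assert(!mfn_eq(m, INVALID_MFN));
    return _mfn(mfn_x(m) + i);
}
```

Initialising to INVALID_MFN rather than 0 means a forgotten assignment is
caught by the assertion in this sketch instead of silently naming frame 0.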
7 years ago build: filter out command line assembler arguments
Roger Pau Monne [Tue, 20 Feb 2018 14:10:12 +0000 (14:10 +0000)]
build: filter out command line assembler arguments

Filter them out if the assembler is not used, which happens when using
cc -E or cc -S, for example. GCC will just ignore the -Wa,... when the
assembler is not called, but clang will complain loudly and fail.

Also enable passing -Wa,-I$(BASEDIR)/include to clang now that it's
safe to do so.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago build: do not hardcode AFLAGS for as-insn tests
Roger Pau Monne [Tue, 20 Feb 2018 14:10:11 +0000 (14:10 +0000)]
build: do not hardcode AFLAGS for as-insn tests

Hardcoding as-insn to use AFLAGS is not correct. For one, the test is
performed using a C file with inline assembly, and secondly the flags
used can be passed by the caller together with the CC.

Fix as-insn-check to pass the flags given as parameter to the test.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
[Fix usage comments as they are changing]
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago xen/arm: vgic: Make sure the number of SPIs is a multiple of 32
Julien Grall [Fri, 16 Feb 2018 14:59:56 +0000 (14:59 +0000)]
xen/arm: vgic: Make sure the number of SPIs is a multiple of 32

The vGIC relies on having a pending_irq available for every IRQ
described in the ranks. As each rank describes 32 interrupts, we need to
make sure the number of SPIs is a multiple of 32.

Reported-by: Jeff Kubascik <Jeff.Kubascik@dornerworks.com>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Cc: Jarvis Roach <Jarvis.Roach@dornerworks.com>
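
Illustration for the commit above (not part of the patch): rounding a count
up to the next multiple of 32 can be done with a mask, because 32 is a power
of two. IRQS_PER_RANK and round_up_spis() are hypothetical names for this
sketch.

```c
#include <assert.h>
#include <limits.h>

#define IRQS_PER_RANK 32u   /* each rank describes 32 interrupts */

/* Round nr_spis up to the next multiple of 32 (one full rank). */
static unsigned int round_up_spis(unsigned int nr_spis)
{
    /* Guard against wrap-around in the addition below. */
    assert(nr_spis <= UINT_MAX - (IRQS_PER_RANK - 1));
    return (nr_spis + IRQS_PER_RANK - 1) & ~(IRQS_PER_RANK - 1);
}
```

With this rounding, a request for 33 SPIs allocates two full ranks (64
interrupts), so every IRQ described by a rank has backing storage.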
7 years ago asm-x86/monitor: Add MONITOR_EVENT_INTERRUPT to common capabilities
Alexandru Isaila [Mon, 19 Feb 2018 13:07:06 +0000 (15:07 +0200)]
asm-x86/monitor: Add MONITOR_EVENT_INTERRUPT to common capabilities

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
7 years ago x86/msr: add Raw and Host domain policies
Sergey Dyasli [Mon, 19 Feb 2018 11:29:26 +0000 (11:29 +0000)]
x86/msr: add Raw and Host domain policies

Raw policy contains the actual values from H/W MSRs. Add PLATFORM_INFO
msr to the policy during probe_cpuid_faulting().

Host policy may have certain features disabled if Xen decides not
to use them. For now, make Host policy equal to Raw policy with
cpuid_faulting availability dependent on X86_FEATURE_CPUID_FAULTING.

Finally, derive HVM/PV max domain policies from the Host policy.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap github/staging
Igor Druzhinin [Tue, 20 Feb 2018 09:16:56 +0000 (10:16 +0100)]
x86/nmi: start NMI watchdog on CPU0 after SMP bootstrap

We're noticing a reproducible system boot hang on certain
Skylake platforms where the BIOS is configured in legacy
boot mode with x2APIC disabled. The system stalls immediately
after writing the first SMP initialization sequence into APIC ICR.

The cause of the problem is watchdog NMI handler execution -
somewhere near the end of NMI handling (after it's already
rescheduled the next NMI) it tries to access IO port 0x61
to get the actual NMI reason on CPU0. Unfortunately, this
port is emulated by BIOS using SMIs and this emulation for
some reason takes more time than we expect during INIT-SIPI-SIPI
sequence. As a result, the system is constantly moving between
NMI and SMI handlers and not making any progress.

To avoid this, initialize the watchdog after SMP bootstrap on
CPU0 and, additionally, protect the NMI handler by moving
IO port access before NMI re-scheduling. The latter should also
help in case of post boot CPU onlining. Although we're running
watchdog at much lower frequency at this point, it's nevertheless
possible we may trigger the issue anyway.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years ago shim: allow building of just the shim with build-ID-incapable linker
Jan Beulich [Tue, 20 Feb 2018 09:10:59 +0000 (10:10 +0100)]
shim: allow building of just the shim with build-ID-incapable linker

The ELF note the shim build inserts causes mkelf32 to choke on the
second program header. However, the output of mkelf32 isn't really
needed when building inside tools/firmware/ - an attempt to build it is
made solely because of a wrong dependency.

Further changes to the make logic will be needed to also allow building
a shim-enabled "normal" xen with such a linker (it looks as though the
--notes option will need passing not just when the linker supports build
ID generation).

Also drop a stray variable setting from the x86 Makefile.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years ago tools: libxenstat: fix format string overflow
Dario Faggioli [Fri, 16 Feb 2018 18:38:48 +0000 (19:38 +0100)]
tools: libxenstat: fix format string overflow

With gcc 7.3.0, the build fails like this:

src/xenstat_linux.c: In function ‘getBridge’
src/xenstat_linux.c:78:34: warning: ‘%s’ directive writing up to 255 bytes into a region of size 241 [-Wformat-overflow=]
     sprintf(tmp, "/sys/class/net/%s/bridge", de->d_name);
                                  ^~
src/xenstat_linux.c:78:5: note: ‘sprintf’ output between 23 and 278 bytes into a destination of size 256
     sprintf(tmp, "/sys/class/net/%s/bridge", de->d_name);
     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix by making the buffer bigger.

Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
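
Illustration for the commit above (not part of the patch): the commit fixes
the warning by enlarging the buffer. The sketch below sizes the buffer from
the fixed path text plus the 255-byte bound gcc reports for the name, which
gives the 278 bytes mentioned in the diagnostic. It also uses snprintf() with
a truncation check, which is an extra safeguard added for this example and
not what the commit does. D_NAME_MAX, BRIDGE_PATH_MAX and bridge_path() are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define D_NAME_MAX 255   /* upper bound gcc reports for de->d_name */

/* Fixed path text (with its NUL terminator) plus the longest name. */
#define BRIDGE_PATH_MAX (sizeof("/sys/class/net//bridge") + D_NAME_MAX)

/* Returns 0 on success, -1 if the path would not fit in buf. */
static int bridge_path(char *buf, size_t len, const char *ifname)
{
    int n;

    assert(buf != NULL && ifname != NULL);
    n = snprintf(buf, len, "/sys/class/net/%s/bridge", ifname);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```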
7 years ago shut down domain when last vCPU goes down
Jan Beulich [Mon, 19 Feb 2018 13:00:31 +0000 (14:00 +0100)]
shut down domain when last vCPU goes down

I've just had to deal with an early boot crash of Linux which occurred
so early that even "earlyprintk=xen" did not produce any useful output.
Hence the domain appeared to hang, while in fact it had brought down its
only vCPU. By translating this to a shutdown, the situation will be
better recognizable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years ago x86/PV: avoid indirect call/thunk in I/O emulation
Jan Beulich [Mon, 19 Feb 2018 12:59:37 +0000 (13:59 +0100)]
x86/PV: avoid indirect call/thunk in I/O emulation

The stub is within reach from the .text section, so there's no point
using an indirect call here. This has the added benefit of there no
longer being two sufficiently different approaches, breaking one of
which people may not even notice.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago hvm/monitor: fix usage of the control register mask master github/master gitlab/master
Roger Pau Monne [Fri, 16 Feb 2018 18:16:23 +0000 (18:16 +0000)]
hvm/monitor: fix usage of the control register mask

Previous usage is not correct and would prevent certain updates from
being notified to the monitor client.

For example if (value ^ old) == (PGE | PSE) and mask == PGE this
update would not be notified.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
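
Illustration for the commit above (not part of the patch): the example in the
message can be checked with a small sketch. It assumes the mask lists bits
whose changes the monitor client wants to ignore, so an event should still be
sent when any bit outside the mask changes. notify_buggy() and notify_fixed()
are hypothetical reconstructions of the two behaviours described, not the
actual Xen code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR4_PSE (UINT64_C(1) << 4)   /* page size extensions */
#define CR4_PGE (UINT64_C(1) << 7)   /* page global enable */

/* Problematic check: any change touching a masked bit drops the event. */
static bool notify_buggy(uint64_t value, uint64_t old, uint64_t mask)
{
    return !((value ^ old) & mask);
}

/* Corrected check: notify if any bit outside the mask changed. */
static bool notify_fixed(uint64_t value, uint64_t old, uint64_t mask)
{
    return ((value ^ old) & ~mask) != 0;
}

/* The case from the commit message: PGE and PSE change, mask is PGE. */
void mask_example(void)
{
    uint64_t changed = CR4_PGE | CR4_PSE;

    assert(!notify_buggy(changed, 0, CR4_PGE));  /* update is lost */
    assert(notify_fixed(changed, 0, CR4_PGE));   /* PSE change is reported */
}
```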
7 years ago x86/microcode: Propagate microcode update errors
Uwe Dannowski [Fri, 16 Feb 2018 13:19:54 +0000 (13:19 +0000)]
x86/microcode: Propagate microcode update errors

Errors on updating the microcode in the processor were silently
dropped when invoked via the microcode_update hypercall. Also, the log
message was misleading.

Signed-off-by: Uwe Dannowski <uwed@amazon.de>
Reviewed-by: Stefan Nuernberger <snu@amazon.de>
Reviewed-by: Martin Pohlack <mpohlack@amazon.de>
Reviewed-by: Amit Shah <aams@amazon.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago x86/srat: fix end calculation in nodes_cover_memory()
Jan Beulich [Thu, 15 Feb 2018 17:17:32 +0000 (18:17 +0100)]
x86/srat: fix end calculation in nodes_cover_memory()

Along the lines of commit 7226486767 ("x86/srat: fix the end pfn check
in valid_numa_range()") nodes_cover_memory() also doesn't consistently
use "end": It's set to an inclusive value initially, but then compared
to the exclusive "end" field of struct node and also possibly set to
nodes[j].start, making it exclusive too. Change the initialization to
make the variable consistently exclusive.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
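
Illustration for the commit above (not part of the patch): a coverage check
is easier to reason about when every "end" is exclusive. The sketch below is
loosely modelled on the loop the message describes; struct node_range and
region_covered() are hypothetical, and the real nodes_cover_memory() may
differ in detail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node descriptor covering [start, end): end is exclusive. */
struct node_range {
    uint64_t start, end;
};

/* Returns true if [addr, addr + size) is fully covered by the nodes. */
static bool region_covered(const struct node_range *nodes, size_t nr,
                           uint64_t addr, uint64_t size)
{
    uint64_t start = addr;
    uint64_t end = addr + size;     /* exclusive, like nodes[j].end */
    bool found;

    assert(nodes != NULL || nr == 0);
    do {
        found = false;
        for (size_t j = 0; j < nr; j++) {
            /* Only nodes overlapping the remaining region can help. */
            if (start < nodes[j].end && end > nodes[j].start) {
                if (start >= nodes[j].start) {
                    start = nodes[j].end;    /* trim covered head */
                    found = true;
                }
                if (end <= nodes[j].end) {
                    end = nodes[j].start;    /* trim covered tail */
                    found = true;
                }
            }
        }
    } while (found && start < end);

    return start >= end;            /* nothing left uncovered */
}
```

Since both `end` and `nodes[j].start` are used as exclusive bounds here,
assigning one to the other never shifts the boundary by one.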
7 years ago x86/hvm/dmop: only copy what is needed to/from the guest
Ross Lagerwall [Thu, 15 Feb 2018 17:16:17 +0000 (18:16 +0100)]
x86/hvm/dmop: only copy what is needed to/from the guest

dm_op() fails with -EFAULT if the struct xen_dm_op given by the guest is
smaller than Xen's struct xen_dm_op. This is a problem because DMOP is
meant to be a stable ABI but it breaks whenever the size of struct
xen_dm_op changes.

To fix this, change how the copying to and from the guest is done. When
copying from the guest, first copy the header and inspect the op. Then,
only copy the correct amount needed for that op. When copying to the
guest, don't copy the header. Rather, copy only the correct amount
needed for that particular op.

So now the dm_op() will fail if the guest does not supply enough bytes
for the specific op. It will not fail if the guest supplies too many
bytes for the specific op, but Xen will not copy the extra bytes.

Remove some now unused macros and helper functions.

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
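
Illustration for the commit above (not part of the patch): a minimal sketch of
"copy the header, inspect the op, then copy only what that op needs". The
structures, op numbers and copy_op_from_guest() are hypothetical and much
simpler than the real struct xen_dm_op; memcpy() stands in for the guest copy
routines, and -1 stands in for the error code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ops and payloads; not Xen's real struct xen_dm_op. */
enum { OP_SMALL = 1, OP_LARGE = 2 };

struct op_small { uint32_t a; };
struct op_large { uint64_t a, b, c; };

struct dm_op {
    uint32_t op;
    uint32_t pad;
    union {
        struct op_small small;
        struct op_large large;
    } u;
};

/* Bytes needed for a given op, or 0 if the op is unknown. */
static size_t op_size(uint32_t op)
{
    switch (op) {
    case OP_SMALL: return sizeof(struct op_small);
    case OP_LARGE: return sizeof(struct op_large);
    default:       return 0;
    }
}

/* Copy in the header first, then only the payload that the op requires. */
static int copy_op_from_guest(struct dm_op *dst, const void *buf, size_t len)
{
    const size_t hdr = offsetof(struct dm_op, u);
    size_t need;

    assert(dst != NULL && buf != NULL);
    memset(dst, 0, sizeof(*dst));
    if (len < hdr)
        return -1;
    memcpy(dst, buf, hdr);

    need = op_size(dst->op);
    if (need == 0 || len - hdr < need)
        return -1;                 /* unknown op or too few bytes */
    memcpy(&dst->u, (const char *)buf + hdr, need);
    return 0;                      /* extra guest bytes are ignored */
}
```

In this sketch a guest buffer sized for the small op is accepted even though
it is shorter than the full structure, which is the property the commit needs
for a stable ABI.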
7 years ago hvm/svm: Enable CR events
Alexandru Isaila [Thu, 15 Feb 2018 10:22:26 +0000 (12:22 +0200)]
hvm/svm: Enable CR events

The CR_INTERCEPT_CR3_WRITE intercept is missing from vmcb->_cr_intercepts,
so the AMD arch can't intercept CR events.

This patch implements the CR intercept by adding the flag on a
write_ctrlreg event. The monitor write ctrlreg event is moved from the
Intel side to the common capabilities side.

We just need to enable the SVM intercept and then hvm_mov_to_cr() will
forward the event on to the monitor when appropriate.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years ago hvm/svm: Enable MSR events
Alexandru Isaila [Thu, 15 Feb 2018 10:22:25 +0000 (12:22 +0200)]
hvm/svm: Enable MSR events

At this moment there is no function to enable MSR interception on SVM.

This patch implements this function and moves the mov-to-MSR monitor
event from the Intel arch side to the common capabilities.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years ago hvm/svm: Enable Breakpoint events
Alexandru Isaila [Thu, 15 Feb 2018 10:22:24 +0000 (12:22 +0200)]
hvm/svm: Enable Breakpoint events

This commit implements the breakpoint events for svm.
At the moment, the Breakpoint vmexit is not forwarded to the monitor
layer.
This patch adds the hvm_monitor_debug call to the VMEXIT_EXCEPTION_BP.
Also, the Software Breakpoint cap is moved from the Intel arch to the
common part of the code.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years ago x86/xpti: Hide almost all of .text and all .data/.rodata/.bss mappings
Andrew Cooper [Mon, 12 Feb 2018 16:06:00 +0000 (16:06 +0000)]
x86/xpti: Hide almost all of .text and all .data/.rodata/.bss mappings

The current XPTI implementation isolates the directmap (and therefore a lot of
guest data), but a large quantity of CPU0's state (including its stack)
remains visible.

Furthermore, an attacker able to read .text is in a vastly superior position
to normal when it comes to fingerprinting Xen for known vulnerabilities, or
scanning for ROP/Spectre gadgets.

Collect together the entrypoints in .text.entry (currently 3x4k frames, but
can almost certainly be slimmed down), and create a common mapping which is
inserted into each per-cpu shadow.  The stubs are also inserted into this
mapping by pointing at the in-use L2.  This allows stubs allocated later (SMP
boot, or CPU hotplug) to work without further changes to the common mappings.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years ago x86/emul: Fix the decoding of segment overrides in 64bit mode
Andrew Cooper [Thu, 5 Oct 2017 14:30:49 +0000 (14:30 +0000)]
x86/emul: Fix the decoding of segment overrides in 64bit mode

Explicit segment overrides other than %fs and %gs are documented as ignored by
both Intel and AMD.

In practice, this means that:

 * Explicit uses of %ss don't actually yield #SS[0] for non-canonical
   memory references.
 * Explicit uses of %{e,c,d}s don't override %rbp/%rsp-based memory references
   to yield #GP[0] for non-canonical memory references.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years ago x86/entry: Use 32bit xors rather than 64bit xors for clearing GPRs
Andrew Cooper [Wed, 14 Feb 2018 13:07:05 +0000 (13:07 +0000)]
x86/entry: Use 32bit xors rather than 64bit xors for clearing GPRs

Intel's Silvermont/Knights Landing architecture treats 64bit xors as full ALU
operations, rather than zeroing idioms.

No functional change, and no change in code volume (only changing the bit
selection in the REX prefix).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years ago xen/arm: cpuerrata: Actually check errata on non-boot CPUs
Julien Grall [Wed, 14 Feb 2018 12:22:23 +0000 (12:22 +0000)]
xen/arm: cpuerrata: Actually check errata on non-boot CPUs

The cpu errata framework was introduced in commit 8b01f6364f "xen/arm:
Detect silicon revision and set cap bits accordingly" and was meant to
detect errata present on any CPUs (via check_local_cpu_errata). However,
the function to check the MIDR (is_affected_midr_range) mistakenly
always uses the boot CPU MIDR.

Fix is_affected_midr_range to use the current CPU MIDR.

Reported-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years ago xen/arm: Blacklist SMMU on Thunder-X
Julien Grall [Wed, 14 Feb 2018 15:30:45 +0000 (15:30 +0000)]
xen/arm: Blacklist SMMU on Thunder-X

Xen does not yet support the Cavium SMMU because it requires some
workarounds. For the time being, blacklist it.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 years ago xen/arm: Extend the number of memory banks supported
Julien Grall [Wed, 14 Feb 2018 15:30:44 +0000 (15:30 +0000)]
xen/arm: Extend the number of memory banks supported

When booting using Grub on Thunder-X, the number of memory banks available
is greater than 64. Bump the number to 128, so we can take advantage of all
the memory.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years ago asm-x86/monitor: Fix monitor capability reporting on SVM systems
Alexandru Isaila [Mon, 12 Feb 2018 15:08:15 +0000 (17:08 +0200)]
asm-x86/monitor: Fix monitor capability reporting on SVM systems

No monitor features are available on AMD and all
capabilities are passed only to the Intel processor architecture.
This means that the arch_monitor_get_capabilities returns
capabilities = 0.

This patch separates out features which are implemented on both
systems from those implemented only on Intel, so that we advertise the
working capabilities on AMD.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
7 years ago x86/spec_ctrl: Fix several bugs in SPEC_CTRL_ENTRY_FROM_INTR_IST
Andrew Cooper [Wed, 14 Feb 2018 10:38:34 +0000 (10:38 +0000)]
x86/spec_ctrl: Fix several bugs in SPEC_CTRL_ENTRY_FROM_INTR_IST

DO_OVERWRITE_RSB clobbers %rax, meaning in practice that the bti_ist_info
field gets zeroed.  Older versions of this code had the DO_OVERWRITE_RSB
register selectable, so reintroduce this ability and use it to cause the
INTR_IST path to use %rdx instead.

The use of %dl for the %cs.rpl check means that when an IST interrupt hits
Xen, we try to load 1 into the high 32 bits of MSR_SPEC_CTRL, suffering a #GP
fault instead.

Also, drop an unused label which was a copy/paste mistake.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reported-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years ago firmware/shim: avoid mkdir error during Xen tree setup
Jan Beulich [Wed, 14 Feb 2018 07:16:00 +0000 (08:16 +0100)]
firmware/shim: avoid mkdir error during Xen tree setup

"mkdir -p" reports a missing operand, as config/ has no subdirs. Oddly
enough this doesn't cause the whole command (and hence the build) to
fail, despite the "set -e" now covering the entire set of commands -
perhaps a quirk of the relatively old bash I've seen this with (a few
simple experiments suggest that commands inside () producing a non-
success status would exit the inner shell, but not the outer one).

Add a dummy . argument to the invocation.

Suggested-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years ago firmware/shim: correctly handle errors during Xen tree setup
Jan Beulich [Tue, 13 Feb 2018 17:19:33 +0000 (18:19 +0100)]
firmware/shim: correctly handle errors during Xen tree setup

"set -e" on a separate Makefile line is meaningless. Glue together all
the lines that this is supposed to cover.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agobitops: rename LOG_2 to ilog2
Sameer Goel [Tue, 13 Feb 2018 16:56:42 +0000 (17:56 +0100)]
bitops: rename LOG_2 to ilog2

Change the name of the macro from LOG_2 to ilog2. This makes the function name
similar to its Linux counterpart. Since this is not used in multiple places,
the code churn is minimal.

This change helps in porting unchanged code from Linux.
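
As a rough illustration of what an integer log2 helper of this kind
computes, here is a standalone sketch. It is not the macro from the Xen
headers; the name ilog2_sketch and the choice to return -1 for 0 are
assumptions made for the example.

```c
/*
 * Hypothetical stand-in, not the Xen macro: returns floor(log2(x))
 * for x > 0, and -1 for x == 0 (an arbitrary choice for this sketch).
 */
static inline int ilog2_sketch(unsigned long x)
{
    int log = -1;

    /* Count how many times x can be halved before it reaches zero. */
    while ( x )
    {
        x >>= 1;
        log++;
    }

    return log;
}
```

Code ported from Linux that calls ilog2() would then only need the
matching name to exist, which is the point of the rename.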

Signed-off-by: Sameer Goel <sameer.goel@linaro.org>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agocoverage: add documentation for LLVM coverage
Roger Pau Monné [Tue, 13 Feb 2018 16:56:20 +0000 (17:56 +0100)]
coverage: add documentation for LLVM coverage

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxsm: add bodge when compiling with llvm coverage support
Roger Pau Monné [Tue, 13 Feb 2018 16:55:43 +0000 (17:55 +0100)]
xsm: add bodge when compiling with llvm coverage support

llvm coverage support seems to disable some of the optimizations
needed in order to compile xsm, and the end result is that references
to __xsm_action_mismatch_detected are left in the object files.

Since coverage support cannot be used in production, introduce
__xsm_action_mismatch_detected for llvm coverage builds.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
7 years agocoverage: introduce support for llvm profiling
Roger Pau Monné [Tue, 13 Feb 2018 16:54:09 +0000 (17:54 +0100)]
coverage: introduce support for llvm profiling

Introduce the functionality in order to fill the hooks of the
cov_sysctl_ops struct. Note that the functionality is still not wired
into the build system.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agox86: use paging_mark_pfn_dirty()
Jan Beulich [Tue, 13 Feb 2018 16:29:50 +0000 (17:29 +0100)]
x86: use paging_mark_pfn_dirty()

... in preference over paging_mark_dirty(), when the PFN is known
anyway.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agox86/mm: clean up SHARED_M2P{,_ENTRY} uses
Jan Beulich [Tue, 13 Feb 2018 16:28:36 +0000 (17:28 +0100)]
x86/mm: clean up SHARED_M2P{,_ENTRY} uses

Stop open-coding SHARED_M2P() and drop a pointless use of it from
paging_mfn_is_dirty() (!VALID_M2P() is a superset of SHARED_M2P()) and
another one from free_page_type() (prior assertions render this
redundant).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agotools/libxl: mark special pages as reserved in e820 map for PVH
Juergen Gross [Tue, 21 Nov 2017 11:06:06 +0000 (12:06 +0100)]
tools/libxl: mark special pages as reserved in e820 map for PVH

The "special pages" for PVH guests include the frames for console and
Xenstore ring buffers. Those have to be marked as "Reserved" in the
guest's E820 map, as otherwise conflicts might arise later e.g. when
hotplugging memory into the guest.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxc: xc_dom_parse_elf_kernel: Return error for invalid kernel images
Simon Gaiser [Thu, 8 Feb 2018 21:49:10 +0000 (22:49 +0100)]
libxc: xc_dom_parse_elf_kernel: Return error for invalid kernel images

Commit 96edb111dd ("libxc: panic when trying to create a PVH guest
without kernel support") already improved the handling of non PVH
capable kernels. But xc_dom_parse_elf_kernel() still returned success on
invalid elf images and the domain build only failed later. Now the build
process will fail immediately on detecting the error.

Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agolibxl: Improve logging in libxl__build_dom()
Simon Gaiser [Thu, 8 Feb 2018 21:49:09 +0000 (22:49 +0100)]
libxl: Improve logging in libxl__build_dom()

xc_dom_parse_image() does not set errno (at least in many code paths).
So LOGE() is not useful.

Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agolibxc: Cleanup xc_dom_parse_elf_kernel()'s return value
Simon Gaiser [Thu, 8 Feb 2018 21:49:08 +0000 (22:49 +0100)]
libxc: Cleanup xc_dom_parse_elf_kernel()'s return value

xc_dom_loader.parser() should return elf_negerrnoval.

Signed-off-by: Simon Gaiser <simon@invisiblethingslab.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agolibxc: check for null size file mapping
Paul Semel [Mon, 12 Feb 2018 12:09:15 +0000 (13:09 +0100)]
libxc: check for null size file mapping

Change the error message reported when trying to map a zero-size file.
When running the `xl create` command, we get an Invalid Kernel error
when the file size is greater than zero. For zero length files, we
hit the mmap error path and get an `Invalid parameter` error,
which is not explicit. With this change, we get a `zero length file`
error.
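
The idea can be sketched as a size check done before mmap() is reached.
This is only an illustration under assumptions: the helper name
check_mappable_size and its error-string out-parameter are hypothetical
and do not reflect the actual libxc code.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: decide whether a file of the given size may be
 * mapped.  mmap() rejects a zero length with EINVAL, so the empty case
 * is caught up front where a more explicit message can be produced.
 * Returns 0 if mapping may proceed, -EINVAL otherwise.
 */
static int check_mappable_size(off_t size, const char **errmsg)
{
    if ( size < 0 )
    {
        *errmsg = "invalid file size";
        return -EINVAL;
    }

    if ( size == 0 )
    {
        *errmsg = "zero length file";
        return -EINVAL;
    }

    *errmsg = NULL;
    return 0;
}
```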

Signed-off-by: Paul Semel <semelpaul@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agopvh/dom0: whitelist PVH Dom0 ACPI tables
Roger Pau Monne [Thu, 8 Feb 2018 12:25:39 +0000 (12:25 +0000)]
pvh/dom0: whitelist PVH Dom0 ACPI tables

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agopvh/dom0: pass address/length to pvh_acpi_table_allowed
Roger Pau Monne [Thu, 8 Feb 2018 12:25:38 +0000 (12:25 +0000)]
pvh/dom0: pass address/length to pvh_acpi_table_allowed

The current usage of acpi_gbl_root_table_list inside the function is
wrong.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agopvh/dom0: init variables at declaration time
Roger Pau Monne [Thu, 8 Feb 2018 12:25:37 +0000 (12:25 +0000)]
pvh/dom0: init variables at declaration time

Also remove a couple of newlines at the start of function
declarations.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/srat: fix the end pfn check in valid_numa_range()
Haozhong Zhang [Mon, 12 Feb 2018 01:44:23 +0000 (09:44 +0800)]
x86/srat: fix the end pfn check in valid_numa_range()

... and fix the coding style on the fly.

valid_numa_range(..., epfn << PAGE_SHIFT, ...) and its only caller
memory_add(..., epfn, pxm) interpret epfn inconsistently. The former
interprets epfn as the last pfn, while the latter interprets it as the
last pfn plus one. Fix this inconsistency in valid_numa_range(), since
most other places use the latter interpretation.
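
The "last pfn plus one" convention amounts to a half-open range check.
A minimal sketch follows, assuming 4 KiB pages; pfn_range_in_node and
SKETCH_PAGE_SHIFT are hypothetical names, not the ones used in srat.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical check: does the pfn range [spfn, epfn) lie inside the
 * byte range [node_start, node_end)?  Because epfn is exclusive,
 * epfn << SKETCH_PAGE_SHIFT is the first byte past the range and is
 * allowed to equal node_end.
 */
static bool pfn_range_in_node(uint64_t spfn, uint64_t epfn,
                              uint64_t node_start, uint64_t node_end)
{
    uint64_t start = spfn << SKETCH_PAGE_SHIFT;
    uint64_t end = epfn << SKETCH_PAGE_SHIFT;

    return spfn < epfn && start >= node_start && end <= node_end;
}
```

Treating epfn as inclusive instead would turn the final comparison into
an off-by-one page error in one direction or the other.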

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/vmx: Drop enum handler_return
Andrew Cooper [Wed, 6 Dec 2017 17:58:00 +0000 (17:58 +0000)]
x86/vmx: Drop enum handler_return

They are straight aliases of the more common X86EMUL_* constants.  While
adjusting these, fix the case indentation where appropriate.

No functional change, confirmed by diff'ing the compiled binary.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
7 years agoRevert "make xen ocaml safe-strings compliant"
Wei Liu [Thu, 8 Feb 2018 18:05:44 +0000 (18:05 +0000)]
Revert "make xen ocaml safe-strings compliant"

This reverts commit df1e4c6e7f8892e950433ff33c215df0cd7b30f7.

Oxenstored is broken by this change.

7 years agoRevert "ocaml/libs/xb: update xb.mli in accordance with df1e4c6e7f8"
Wei Liu [Thu, 8 Feb 2018 18:04:30 +0000 (18:04 +0000)]
Revert "ocaml/libs/xb: update xb.mli in accordance with df1e4c6e7f8"

This reverts commit a53b9b987a0a9b2c67569f90f3d7ab1327ade2e7.

7 years agoxen/arm: vpsci: Move PSCI function dispatching from vsmc.c to vpsci.c
Julien Grall [Tue, 6 Feb 2018 15:53:25 +0000 (15:53 +0000)]
xen/arm: vpsci: Move PSCI function dispatching from vsmc.c to vpsci.c

At the moment PSCI function dispatching is done in vsmc.c and the
function implementation in vpsci.c. Some bits of the implementation are
even done in vsmc.c (see PSCI_SYSTEM_RESET).

This means that it is difficult to follow the implementation, and it
also requires exporting a function for each PSCI function.

Therefore move PSCI dispatching into two new functions, do_vpsci_0_1_call
and do_vpsci_0_2_call. The former will handle PSCI 0.1 calls while the
latter handles 0.2 or later calls.

At the same time, a new header vpsci.h was created to contain all
definitions for virtual PSCI and avoid confusion with the host PSCI.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: vsmc: Don't implement function IDs that don't exist
Julien Grall [Tue, 6 Feb 2018 15:53:24 +0000 (15:53 +0000)]
xen/arm: vsmc: Don't implement function IDs that don't exist

The current implementation of SMCCC assumes that the function number
(bits [15:0]) alone is enough to identify what to implement.

However, PSCI calls are only available in the ranges 0x84000000-0x8400001F
and 0xC4000000-0xC400001F. Furthermore, not all SMC32 functions have an
equivalent in SMC64. This is the case for:
    * PSCI_VERSION
    * CPU_OFF
    * MIGRATE_INFO_TYPE
    * SYSTEM_OFF
    * SYSTEM_RESET

Similarly, call count, call UID and revision can only be queried using
smc32/hvc32 fast calls (see 6.2 in ARM DEN 0028B).

Xen should only implement identifiers that exist in the specification, in
order to avoid potential clashes with later revisions. Therefore rework the
vsmc code to use the whole function identifier rather than only the
function number.
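
To make the difference concrete, here is a small sketch contrasting the
function number with the whole identifier. The ranges are the ones quoted
above; the macro and function names are hypothetical and are not the ones
used in vsmc.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* PSCI identifier ranges as quoted in this commit message. */
#define SK_PSCI_SMC32_FIRST 0x84000000U
#define SK_PSCI_SMC32_LAST  0x8400001FU
#define SK_PSCI_SMC64_FIRST 0xC4000000U
#define SK_PSCI_SMC64_LAST  0xC400001FU

/* Function number only: bits [15:0] of the identifier. */
static uint32_t sk_func_num(uint32_t fid)
{
    return fid & 0xFFFFU;
}

/* Hypothetical check on the whole 32-bit function identifier. */
static bool sk_fid_in_psci_range(uint32_t fid)
{
    return (fid >= SK_PSCI_SMC32_FIRST && fid <= SK_PSCI_SMC32_LAST) ||
           (fid >= SK_PSCI_SMC64_FIRST && fid <= SK_PSCI_SMC64_LAST);
}
```

Two identifiers can share a function number while only one of them falls
in a PSCI range, which is why matching on bits [15:0] alone is not enough.
A range check like this is still coarser than the specification: functions
without an SMC64 variant would need to be rejected individually.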

At the same time, the new macros for call count, call uid, revision are
renamed to better suit the spec.

Lastly, update SSSC_SMCCC_FUNCTION_COUNT to match the correct number of
functions. Note that the version is not updated because the number has always
been wrong, and nobody could properly use it.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: vpsci: Removing dummy MIGRATE and MIGRATE_INFO_UP_CPU
Julien Grall [Tue, 6 Feb 2018 15:53:23 +0000 (15:53 +0000)]
xen/arm: vpsci: Removing dummy MIGRATE and MIGRATE_INFO_UP_CPU

The PSCI calls MIGRATE and MIGRATE_INFO_UP_CPU are optional and
implemented as just returning PSCI_NOT_SUPPORTED (aka UNKNOWN_FUNCTION
for SMCCC).

The new SMCCC framework is able to deal with unimplemented functions and
return the proper error code. So remove the implementations of both
functions.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Reviewed-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoARM: make nr_irqs a constant
Andre Przywara [Tue, 6 Feb 2018 17:09:03 +0000 (17:09 +0000)]
ARM: make nr_irqs a constant

On ARM the maximum number of IRQs is a constant, but we expose it as
a variable to match x86. Since we are not supposed to alter it, let's
mark it as "const" to avoid accidental change.

Suggested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: rework gicv[23]_update_lr to not use pending_irq
Andre Przywara [Tue, 6 Feb 2018 17:09:02 +0000 (17:09 +0000)]
ARM: VGIC: rework gicv[23]_update_lr to not use pending_irq

The functions to actually populate a list register were accessing
the VGIC internal pending_irq struct, although they should be abstracting
from that.
Break the needed information down to remove the reference to pending_irq
from gic-v[23].c.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoARM: VGIC: factor out vgic_get_hw_irq_desc()
Andre Przywara [Tue, 6 Feb 2018 17:09:01 +0000 (17:09 +0000)]
ARM: VGIC: factor out vgic_get_hw_irq_desc()

At the moment we happily access the VGIC internal struct pending_irq
(which describes a virtual IRQ) in irq.c.
Factor out the actually needed functionality to learn the associated
hardware IRQ and move that into gic-vgic.c to improve abstraction.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: factor out vgic_connect_hw_irq()
Andre Przywara [Tue, 6 Feb 2018 17:09:00 +0000 (17:09 +0000)]
ARM: VGIC: factor out vgic_connect_hw_irq()

At the moment we happily access VGIC internal data structures like
the rank and struct pending_irq in gic.c, which should be VGIC agnostic.

Factor out a new function vgic_connect_hw_irq(), which allows a virtual
IRQ to be connected to a hardware IRQ (using the hw bit in the LR).

This removes said accesses to VGIC data structures and improves abstraction.

One thing to note is that this changes the locking scheme slightly:
we hold the rank lock for a shorter period of time, not covering some
of the later lines, which deal with the "irq_desc" structure only. This
should not have any adverse effect, but is a change in locking anyway.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: rework events_need_delivery()
Andre Przywara [Tue, 6 Feb 2018 17:08:59 +0000 (17:08 +0000)]
ARM: VGIC: rework events_need_delivery()

In event.h we very deeply dive into the VGIC to learn if an event for
a guest is pending.
Rework that function to abstract the VGIC specific part out. Also
reorder the queries there, as we only actually need to check for the
event channel if there are no other pending IRQs.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoARM: VGIC: split up gic_dump_info() to cover virtual part separately
Andre Przywara [Tue, 6 Feb 2018 17:08:58 +0000 (17:08 +0000)]
ARM: VGIC: split up gic_dump_info() to cover virtual part separately

Currently gic_dump_info() not only dumps the hardware state of the GIC,
but also the VGIC internal virtual IRQ lists.
Split the latter off and move it into gic-vgic.c to observe the abstraction.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoARM: VGIC: split gic.c to observe hardware/virtual GIC separation
Andre Przywara [Tue, 6 Feb 2018 17:08:57 +0000 (17:08 +0000)]
ARM: VGIC: split gic.c to observe hardware/virtual GIC separation

Currently gic.c holds code to handle hardware IRQs as well as code to
bridge VGIC requests to the GIC virtualization hardware.
Despite being named gic.c, this file reaches into the VGIC and uses data
structures describing virtual IRQs.
To improve abstraction, move the VGIC functions into a separate file,
so that gic.c does what it says on the tin.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agoARM: VGIC: drop unneeded gic_restore_pending_irqs()
Andre Przywara [Tue, 6 Feb 2018 17:08:56 +0000 (17:08 +0000)]
ARM: VGIC: drop unneeded gic_restore_pending_irqs()

In gic_restore_pending_irqs() we push our pending virtual IRQs into the
list registers. This function is called once from gic_inject(), just
before we return to the guest, but also in gic_restore_state(), when
we context-switch a VCPU. Having a closer look, it turns out that the
latter call is not needed, since we will always call gic_inject() anyway.
So remove that call (and the forward declaration) to streamline this
interface and make separating the GIC from the VGIC world easier later.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen: Disable ARINC653 scheduler by default for non-DEBUG builds
George Dunlap [Thu, 8 Feb 2018 16:23:50 +0000 (16:23 +0000)]
xen: Disable ARINC653 scheduler by default for non-DEBUG builds

The ARINC653 scheduler is targeted at a very specific niche; typical
users cannot benefit from using it.  Disable it by default for
non-DEBUG builds.  (Enable it for DEBUG builds so that we catch any
build breakages sooner rather than later.)

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
7 years agoxen: Fix credit1 Kconfig entry
George Dunlap [Thu, 8 Feb 2018 16:23:50 +0000 (16:23 +0000)]
xen: Fix credit1 Kconfig entry

...so that it shows up in the menu and can be disabled.

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
7 years agoocaml/libs/xb: don't generate *.mli automatically
Wei Liu [Wed, 7 Feb 2018 17:09:34 +0000 (17:09 +0000)]
ocaml/libs/xb: don't generate *.mli automatically

To stay in line with other parts of the ocaml code base.

This requires committing a bunch of mli files in tree.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
7 years agoocaml/libs/xb: update xb.mli in accordance with df1e4c6e7f8
Wei Liu [Wed, 7 Feb 2018 17:09:33 +0000 (17:09 +0000)]
ocaml/libs/xb: update xb.mli in accordance with df1e4c6e7f8

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
7 years agostubdom: install firmware files as data
Olaf Hering [Wed, 7 Feb 2018 15:11:17 +0000 (16:11 +0100)]
stubdom: install firmware files as data

Remove the executable bits of vtpm files by using _DATA instead of _PROG.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agokconfig/gcov: rename to coverage
Roger Pau Monné [Wed, 7 Feb 2018 15:32:18 +0000 (16:32 +0100)]
kconfig/gcov: rename to coverage

So it can be used by both gcc and clang. Just add the Kconfig option
and modify the makefiles so the llvm coverage specific code can be
added in a follow up patch.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
[jb: also change the shim config]

7 years agox86: reduce Meltdown band-aid IPI overhead
Jan Beulich [Wed, 7 Feb 2018 15:31:41 +0000 (16:31 +0100)]
x86: reduce Meltdown band-aid IPI overhead

In case we can detect single-threaded guest processes (by checking
whether we can account for all root page table uses locally on the vCPU
that's running), there's no point in issuing a sync IPI upon an L4 entry
update, as no other vCPU of the guest will have that page table loaded.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agoPCI/passthrough: don't discard Dom0 provided information
Jan Beulich [Wed, 7 Feb 2018 15:30:24 +0000 (16:30 +0100)]
PCI/passthrough: don't discard Dom0 provided information

Instead of giving, to subsequent code, the appearance of there not
having been any "info" data provided, adjust the conditional guarding
SR-IOV handling.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agoupdate the minimal ocaml version to 4.02
Michael Young [Wed, 7 Feb 2018 13:59:00 +0000 (13:59 +0000)]
update the minimal ocaml version to 4.02

The ocaml safe-strings patch uses code introduced in ocaml 4.02
so update the minimal version.

Signed-off-by: Michael Young <m.a.young@durham.ac.uk>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
7 years agox86/boot: Make alternative patching NMI-safe
Andrew Cooper [Wed, 31 Jan 2018 16:09:39 +0000 (16:09 +0000)]
x86/boot: Make alternative patching NMI-safe

During patching, there is a very slim risk that an NMI or MCE arrives in the
middle of altering the code in the NMI/MCE paths, in which case bad things
will happen.

The NMI risk can be eliminated by running the patching loop in NMI context, at
which point the CPU will defer further NMIs until patching is complete.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/mm: Add debug code to detect illegal page_lock and put_page_type ordering
George Dunlap [Wed, 24 Jan 2018 11:56:31 +0000 (11:56 +0000)]
x86/mm: Add debug code to detect illegal page_lock and put_page_type ordering

The fix for XSA-242 depends on the same cpu never calling
_put_page_type() while holding a page_lock() for that page; doing so
may cause a deadlock under the right conditions.

Furthermore, even before that, there was never any discipline for the
order in which page locks are grabbed; if there are any paths that
grab the locks for two different pages at once, we risk creating the
conditions for a deadlock to occur.

These are believed to be safe, because it is believed that:
1. No hypervisor paths ever lock two pages at once, and
2. We never call _put_page_type() on a page while holding its page lock.

Add a check to debug builds to catch any violations of these
assumptions.
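
One way such a debug check could be structured is sketched below. It is
an illustration only: the struct and function names are hypothetical, a
single tracking slot stands in for what would be per-CPU state, and the
helpers return a bool where a debug build would more likely assert.

```c
#include <stdbool.h>
#include <stddef.h>

struct sk_page { int id; }; /* hypothetical stand-in for a page struct */

/* Page whose lock is currently held (per-CPU in a real hypervisor). */
static const struct sk_page *sk_locked_page;

/* Returns false if taking this lock would violate assumption 1. */
static bool sk_debug_page_lock(const struct sk_page *pg)
{
    if ( sk_locked_page != NULL )
        return false; /* another page lock is already held */

    sk_locked_page = pg;
    return true;
}

/* Forget the tracked page once its lock is dropped. */
static void sk_debug_page_unlock(const struct sk_page *pg)
{
    if ( sk_locked_page == pg )
        sk_locked_page = NULL;
}

/* Returns false if dropping a type ref here would violate assumption 2. */
static bool sk_debug_put_page_type_ok(const struct sk_page *pg)
{
    return sk_locked_page != pg;
}
```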

Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agomake xen ocaml safe-strings compliant
Michael Young [Tue, 6 Feb 2018 21:27:23 +0000 (21:27 +0000)]
make xen ocaml safe-strings compliant

Xen built with ocaml 4.06 gives errors such as
Error: This expression has type bytes but an expression was
        expected of type string
as Bytes and safe-strings, which were introduced in 4.02, are the
default in 4.06.
This patch, which is mostly by Richard W.M. Jones of Red Hat
from https://bugzilla.redhat.com/show_bug.cgi?id=1526703,
fixes these issues.

v2: drop tools/ocaml/libs/xc/xenctrl.ml from the patch as the
affected code was removed by commit d933f1a53c06002351c1e36d40615e40bd4bf6af
tools/ocaml: Drop coredump infrastructure

Signed-off-by: Michael Young <m.a.young@durham.ac.uk>
Reviewed-by: Christian Lindig <christian.lindig@citrix.com>
[ wei: remove trailing whitespaces ]
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agodocs: clarify symlink usage in xen-pv-channel
Olaf Hering [Wed, 7 Feb 2018 08:45:53 +0000 (09:45 +0100)]
docs: clarify symlink usage in xen-pv-channel

The previous version simply states that a symlink has to be created
without telling where the symlink should point to.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agodocs: fix kernel config option in xen-pv-channel
Olaf Hering [Wed, 7 Feb 2018 08:30:57 +0000 (09:30 +0100)]
docs: fix kernel config option in xen-pv-channel

HVC is shown underlined, the underscores are missing.
Fix it by using underscores.
Remove stale I.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/spec_ctrl: Fix determination of when to use IBRS
Andrew Cooper [Tue, 6 Feb 2018 13:45:17 +0000 (13:45 +0000)]
x86/spec_ctrl: Fix determination of when to use IBRS

The original version of this logic was:

    /*
     * On Intel hardware, we'd like to use retpoline in preference to
     * IBRS, but only if it is safe on this hardware.
     */
    else if ( boot_cpu_has(X86_FEATURE_IBRSB) )
    {
        if ( retpoline_safe() )
            thunk = THUNK_RETPOLINE;
        else
            ibrs = true;
    }

but it was changed by a request during review.  Sadly, the result is buggy as
it breaks the later fallback logic by allowing IBRS to appear as available
when in fact it isn't.

This in practice means that on retpoline-unsafe hardware without IBRS, we
select THUNK_JUMP despite intending to select THUNK_RETPOLINE.
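
The intended outcome described here can be sketched as a small decision
function. This is a simplified model under assumptions, not the code in
spec_ctrl.c: the enum, struct and function names are hypothetical, and
vendor checks and command line overrides are left out.

```c
#include <stdbool.h>

enum sk_thunk { SK_THUNK_DEFAULT, SK_THUNK_RETPOLINE, SK_THUNK_JMP };

struct sk_choice {
    enum sk_thunk thunk;
    bool ibrs;
};

/* Hypothetical model of the selection, including the later fallback. */
static struct sk_choice sk_choose(bool has_ibrsb, bool retpoline_safe)
{
    struct sk_choice c = { SK_THUNK_DEFAULT, false };

    if ( retpoline_safe )
        c.thunk = SK_THUNK_RETPOLINE;
    else if ( has_ibrsb )
        c.ibrs = true; /* only claim IBRS when the hardware has it */

    /*
     * Fallback: with IBRS in use the cheapest thunk suffices; with no
     * IBRS at all, retpoline is still the best option available.
     */
    if ( c.thunk == SK_THUNK_DEFAULT )
        c.thunk = c.ibrs ? SK_THUNK_JMP : SK_THUNK_RETPOLINE;

    return c;
}
```

The buggy case corresponds to the ibrs flag being set when has_ibrsb is
false, which makes the fallback pick the plain jump thunk.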

Reported-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agolibxc: add xc_domain_remove_from_physmap to wrap XENMEM_remove_from_physmap
Zhongze Liu [Tue, 30 Jan 2018 17:50:18 +0000 (01:50 +0800)]
libxc: add xc_domain_remove_from_physmap to wrap XENMEM_remove_from_physmap

This is for the proposal "Allow setting up shared memory areas between VMs
from xl config file". See:

  https://lists.xen.org/archives/html/xen-devel/2017-08/msg03242.html

The plan is to use XENMEM_add_to_physmap_batch to map the shared pages from
one domU to another and use XENMEM_remove_from_physmap to cancel the sharing.
A wrapper to XENMEM_add_to_physmap_batch was added in the following commit:

  commit 20e725e9364cff4a29945f66986ecd88cca8743d

Now add the wrapper to XENMEM_remove_from_physmap.

Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agotests/xen-access: disable CR4 write events on application exit
Razvan Cojocaru [Mon, 29 Jan 2018 21:48:24 +0000 (23:48 +0200)]
tests/xen-access: disable CR4 write events on application exit

On exit, xen-access did not unsubscribe from CR4 write vm_events,
potentially leaving the guest stuck.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
7 years agox86/NMI: invert condition in nmi_show_execution_state()
Jan Beulich [Tue, 6 Feb 2018 16:29:59 +0000 (17:29 +0100)]
x86/NMI: invert condition in nmi_show_execution_state()

We want to decode the symbol when _not_ in guest mode.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agolibxc: don't fail domain creation when unpacking initrd fails
Jan Beulich [Tue, 6 Feb 2018 16:29:33 +0000 (17:29 +0100)]
libxc: don't fail domain creation when unpacking initrd fails

At least Linux kernels have been able to work with gzip-ed initrd for
quite some time; unpacking of initrd compressed with other methods isn't
even attempted. Furthermore the unzip-ing routine used here isn't
capable of dealing with various forms of concatenated files, each of
which was gzip-ed separately (it is this particular case which has been
the source of observed VM creation failures).

Hence, if unpacking fails, simply hand the compressed blob to the guest
as is.
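
The fallback can be sketched as follows. The struct sk_blob type and the
function name are hypothetical and only illustrate the control flow; the
real libxc code works on its own image structures.

```c
#include <stddef.h>

struct sk_blob {
    const void *data;
    size_t len;
};

/*
 * Hypothetical selection step: unpack_rc is the result of an earlier
 * unpacking attempt (0 on success).  On failure the compressed blob is
 * handed to the guest unchanged instead of failing domain creation.
 */
static const struct sk_blob *sk_initrd_to_load(const struct sk_blob *compressed,
                                               const struct sk_blob *unpacked,
                                               int unpack_rc)
{
    return unpack_rc == 0 ? unpacked : compressed;
}
```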

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen/livepatch: Drop stray tabs and fix indentation
Andrew Cooper [Mon, 5 Feb 2018 11:03:47 +0000 (11:03 +0000)]
xen/livepatch: Drop stray tabs and fix indentation

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86/emul: Fix the emulation of invlpga
Andrew Cooper [Fri, 2 Feb 2018 16:10:17 +0000 (16:10 +0000)]
x86/emul: Fix the emulation of invlpga

The instruction requires EFER.SVME set to be usable in the first place.

Furthermore, the emulation doesn't handle ASIDs, so avoid giving the
impression that they work.  Permit ASID 0 which is reserved for non-root
mode (in which case the instruction is identical to invlpg), but raise #UD for
any other ASID.
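
The two checks can be summarised in a short sketch. The names are
hypothetical and this is not the emulator code; it only models the
decision described above, with EFER.SVME taken to be bit 12 of EFER.

```c
#include <stdint.h>

#define SK_EFER_SVME (1ULL << 12) /* EFER.SVME */

enum sk_result { SK_OKAY, SK_RAISE_UD };

/* Hypothetical model of the checks made before emulating invlpga. */
static enum sk_result sk_check_invlpga(uint64_t efer, uint32_t asid)
{
    /* The instruction is unusable unless SVM is enabled. */
    if ( !(efer & SK_EFER_SVME) )
        return SK_RAISE_UD;

    /* ASIDs are not handled; only ASID 0 is permitted. */
    if ( asid != 0 )
        return SK_RAISE_UD;

    return SK_OKAY; /* proceed as for invlpg */
}
```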

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/emul: Misc non-functional improvements
Andrew Cooper [Fri, 2 Feb 2018 11:42:05 +0000 (11:42 +0000)]
x86/emul: Misc non-functional improvements

 * Drop trailing whitespace
 * Use ARRAY_SIZE() rather than opencoding it
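
For reference, the usual shape of such a macro is sketched below. The
SK_ prefixed names are hypothetical; the Xen definition may carry extra
type checking that is omitted here.

```c
#include <stddef.h>

/* Element count of a true array (must not be used on a pointer). */
#define SK_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

static const int sk_table[] = { 3, 1, 4, 1, 5 };

static size_t sk_table_entries(void)
{
    /* Open-coded form: sizeof(sk_table) / sizeof(sk_table[0]) */
    return SK_ARRAY_SIZE(sk_table);
}
```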

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/svm: correct EFER.SVME intercept checks
Brian Woods [Mon, 5 Feb 2018 09:15:25 +0000 (10:15 +0100)]
x86/svm: correct EFER.SVME intercept checks

Corrects some EFER.SVME checks in intercepts.  See AMD APM vol2 section
15.4 for more details.  VMMCALL isn't checked due to guests needing it
to boot.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years agox86/svm: update VGIF support
Brian Woods [Mon, 5 Feb 2018 09:14:48 +0000 (10:14 +0100)]
x86/svm: update VGIF support

There are places where the GIF value is checked.  A guest with VGIF
enabled can change the GIF value without the host being involved,
therefore the host needs to check the GIF value in the VMCB rather than
the one in the nestedsvm struct.

Signed-off-by: Brian Woods <brian.woods@amd.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
7 years agox86emul: add missing suffixes in test harness
Jan Beulich [Mon, 5 Feb 2018 09:14:15 +0000 (10:14 +0100)]
x86emul: add missing suffixes in test harness

I'm in the process of putting together a gas change issuing at least
warnings when the intended size of a memory operation can't be deduced
from another (register) operand. Add missing suffixes to silence such
future diagnostics.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86emul: add tables for XOP 08 and 09 extension spaces
Jan Beulich [Mon, 5 Feb 2018 09:12:50 +0000 (10:12 +0100)]
x86emul: add tables for XOP 08 and 09 extension spaces

Convert the few existing opcodes so far supported.

Also adjust two vex_* case labels to better be ext_* (the values are
identical).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>