xenbits.xensource.com Git - xen.git/log
8 years agoxen/arm: Add support for 16 bit VMIDs
Bhupinder Thakur [Fri, 16 Dec 2016 07:16:28 +0000 (12:46 +0530)]
xen/arm: Add support for 16 bit VMIDs

The VMID space is increased from 8 bits to 16 bits in the ARMv8.1 revision.
This allows more than 256 VMs to be supported by Xen.

This change adds support for 16-bit VMIDs in Xen based on whether the
architecture supports it.

Signed-off-by: Bhupinder Thakur <bhupinder.thakur@linaro.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
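
For illustration only, a minimal sketch of a bitmap allocator whose size depends on the VMID width. The struct, the function names, and the choice to reserve VMID 0 are assumptions made for this example, not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator state; names are illustrative, not Xen's. */
struct vmid_allocator {
    unsigned int nr_vmids;  /* 256 for 8-bit VMIDs, 65536 for 16-bit */
    uint8_t *bitmap;        /* one bit per VMID, set when in use */
};

/* Size the bitmap from the VMID width the CPU is assumed to report. */
bool vmid_allocator_init(struct vmid_allocator *a, bool has_16bit_vmid)
{
    assert(a != NULL);
    a->nr_vmids = 1u << (has_16bit_vmid ? 16 : 8);
    a->bitmap = calloc(a->nr_vmids / 8, 1);
    if (a->bitmap == NULL)
        return false;
    a->bitmap[0] |= 1u;     /* this sketch reserves VMID 0 (assumption) */
    return true;
}

/* Return the lowest free VMID, or -1 if the space is exhausted. */
int vmid_alloc(struct vmid_allocator *a)
{
    assert(a != NULL && a->bitmap != NULL);
    for (unsigned int i = 0; i < a->nr_vmids; i++) {
        uint8_t bit = (uint8_t)(1u << (i % 8));
        if (!(a->bitmap[i / 8] & bit)) {
            a->bitmap[i / 8] |= bit;
            return (int)i;
        }
    }
    return -1;
}
```

With 8-bit VMIDs the allocator runs out after 255 guests in this sketch; with 16-bit VMIDs the same code covers 65535.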
8 years agoxen/arm: Move p2m_vmid_allocator_init() inside setup_virt_paging()
Bhupinder Thakur [Fri, 16 Dec 2016 07:16:27 +0000 (12:46 +0530)]
xen/arm: Move p2m_vmid_allocator_init() inside setup_virt_paging()

Since VMIDs are related to 2nd stage address translation, it makes more sense
to move the call to p2m_vmid_allocator_init(), which initializes the vmid
allocation bitmap, inside setup_virt_paging(), where 2nd stage address translation
is set up.

Signed-off-by: Bhupinder Thakur <bhupinder.thakur@linaro.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years agolibxl: set rc to 0 in init_acpi_config in success path
Wei Liu [Fri, 16 Dec 2016 15:51:33 +0000 (15:51 +0000)]
libxl: set rc to 0 in init_acpi_config in success path

xc_domain_getinfo returns >= 0 on the success path, and if there is no vnode
configured, that rc is returned to the caller, which indicates an error.

Fix that by setting rc to 0 in success path.

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
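
A rough sketch of the pattern being fixed, assuming a helper that returns a non-negative count on success. query_info() and init_config() are hypothetical stand-ins, not libxl functions.

```c
#include <assert.h>

/* Hypothetical stand-in: returns a count (>= 0) on success, negative on failure. */
static int query_info(int simulate_failure)
{
    return simulate_failure ? -1 : 1;
}

/* Returns 0 on success and a negative value on failure. */
int init_config(int simulate_failure, int nr_vnodes)
{
    int rc = query_info(simulate_failure);

    assert(nr_vnodes >= 0);
    if (rc < 0)
        goto out;

    /* Success: drop the non-negative count so callers do not see it as an error. */
    rc = 0;

    if (nr_vnodes == 0)
        goto out;           /* nothing more to configure */

    /* ... per-vnode setup would go here ... */

out:
    return rc;
}
```

Without the `rc = 0` assignment, the early exit for the no-vnode case would hand the helper's positive return value back to the caller.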
8 years agox86/emul: Simplify L{ES,DS,SS,FS,GS} handling
Andrew Cooper [Wed, 14 Dec 2016 11:05:18 +0000 (11:05 +0000)]
x86/emul: Simplify L{ES,DS,SS,FS,GS} handling

%ss, %fs and %gs can be calculated by directly masking the opcode.  %es and
%ds can't, but the calculation isn't hard.

Use seg rather than dst.val for storing the calculated segment, which is
appropriately typed.  Drop the sel local variable entirely and use dst.val
instead.  The mode_64bit() check can be repositioned and simplified to drop
the ext check.  Replace opencoding of X86EMUL_OKAY.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
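
A sketch of the calculation described above, assuming segment numbers follow the hardware segment register encoding (es = 0, cs = 1, ss = 2, ds = 3, fs = 4, gs = 5). The enum and function names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the hardware segment register encoding. */
enum seg { SEG_ES = 0, SEG_CS = 1, SEG_SS = 2, SEG_DS = 3, SEG_FS = 4, SEG_GS = 5 };

/*
 * b is the final opcode byte: 0xc4 (LES) or 0xc5 (LDS) for the one-byte
 * forms, or 0xb2 (LSS), 0xb4 (LFS), 0xb5 (LGS) after the 0x0f escape.
 */
enum seg lseg_from_opcode(uint8_t b, int two_byte)
{
    if (two_byte) {
        assert(b == 0xb2 || b == 0xb4 || b == 0xb5);
        return (enum seg)(b & 7);       /* ss = 2, fs = 4, gs = 5 */
    }
    assert(b == 0xc4 || b == 0xc5);
    return (enum seg)((b & 1) * 3);     /* es = 0, ds = 3 */
}
```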
8 years agox86/HVM: handle_{mmio*,pio}() return value adjustments
Jan Beulich [Fri, 16 Dec 2016 13:38:29 +0000 (14:38 +0100)]
x86/HVM: handle_{mmio*,pio}() return value adjustments

Don't ignore their return values. Don't indicate success to callers of
handle_pio() when in fact the domain has been crashed.

Make all three functions return bool. Adjust formatting of switch()
statements being touched anyway.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
8 years agox86/boot: fix build with certain older gcc versions
Jan Beulich [Fri, 16 Dec 2016 13:37:35 +0000 (14:37 +0100)]
x86/boot: fix build with certain older gcc versions

Despite all attempts so far (ending in commit fecf584294 ["Config.mk:
fix comment for debug option"] adjusting the respective comment),
Config.mk's debug= setting still affects the hypervisor build: CFLAGS
gets -g added there.

xen/arch/x86/boot/build32.mk includes that file, and hence inherits the
setting too. Some gcc versions take -g to create an .eh_frame section
despite -fno-asynchronous-unwind-tables (which instead one would expect
to produce .debug_frame).

In turn, commit 93c0c0287a ("x86/boot: create *.lnk files with linker
script") was - in my understanding - supposed to make sure .text is
first, but apparently it did also not really achieve that effect: Both
reloc.lnk and reloc.bin in the case here ended up with .eh_frame first,
which obviously rendered the whole final binary unusable.

Explicitly suppress generation of any kind of debug info when building
reloc.o.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: CMPXCHG16B requires an aligned operand
Jan Beulich [Fri, 16 Dec 2016 13:37:11 +0000 (14:37 +0100)]
x86emul: CMPXCHG16B requires an aligned operand

This distinguishes it from CMPXCHG8B.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: reduce CMPXCHG{8,16}B footprint and casting
Jan Beulich [Fri, 16 Dec 2016 13:36:36 +0000 (14:36 +0100)]
x86emul: reduce CMPXCHG{8,16}B footprint and casting

Re-use an existing stack variable (reducing stack footprint, which also
results in smaller code due to some stack accesses no longer needing a
32-bit displacement), at once using a union instead of casts. Also
switch to rex_prefix based conditionals instead of op_bytes ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: support {RD,WR}{F,G}SBASE
Jan Beulich [Fri, 16 Dec 2016 13:35:58 +0000 (14:35 +0100)]
x86emul: support {RD,WR}{F,G}SBASE

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: introduce and use scratch CPU mask
Jan Beulich [Fri, 16 Dec 2016 13:34:34 +0000 (14:34 +0100)]
x86: introduce and use scratch CPU mask

__get_page_type(), so far using an on-stack CPU mask variable, is
involved in recursion when e.g. pinning page tables. This means there
may be up to five instances of the function active at a time, implying
five instances of the (up to 512 bytes large) CPU mask variable. An IRQ
happening at the deepest point of the stack has been observed to cause
a stack overflow with a 4095-pCPU build, when the IRQ handling results
in send_guest_pirq() being called (leading to vcpu_kick() -> ... ->
csched_vcpu_wake() -> __runq_tickle() -> cpumask_raise_softirq(), the
last two of which also have CPU mask variables on their stacks).

Introduce a per-CPU variable instead, which can then be used by any
code never running in IRQ context.

The mask can then also be used by other MMU code as well as by
msi_compose_msg() (and quite likely we'll find further uses down the
road).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
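
The stack cost quoted above can be reproduced with a little arithmetic. This is a sketch assuming the mask is stored as an array of unsigned long with one bit per CPU; cpumask_bytes() is a hypothetical helper.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bytes needed for a bitmap of nr_cpus bits stored in unsigned longs. */
size_t cpumask_bytes(unsigned int nr_cpus)
{
    size_t bits_per_long = sizeof(unsigned long) * CHAR_BIT;

    assert(nr_cpus > 0);
    return ((nr_cpus + bits_per_long - 1) / bits_per_long)
           * sizeof(unsigned long);
}
```

For a 4095-pCPU build this gives 512 bytes per mask, so five recursive instances put 2560 bytes of masks on the stack before any IRQ handler adds its own.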
8 years agoVT-d: correct dma_msi_set_affinity()
Jan Beulich [Fri, 16 Dec 2016 13:33:43 +0000 (14:33 +0100)]
VT-d: correct dma_msi_set_affinity()

Commit 83cd2038fe ("VT-d: use msi_compose_msg()") together with
15aa6c6748 ("amd iommu: use base platform MSI implementation"),
introducing the use of a per-CPU scratch CPU mask, went too far:
dma_msi_set_affinity() may, at least in theory, be called in
interrupt context, and hence the use of that scratch variable is not
correct.

Since the function overwrites the destination information anyway,
allow msi_compose_msg() to be called with a NULL CPU mask, avoiding
the use of that scratch variable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: streamline copying to/from user memory
Jan Beulich [Fri, 16 Dec 2016 13:32:51 +0000 (14:32 +0100)]
x86: streamline copying to/from user memory

Their size parameters being "unsigned", there's neither a point for
them returning "unsigned long", nor for any of their (assembly)
arithmetic to involve 64-bit operations on other than addresses.

Take the opportunity and fold __do_clear_user() into its single user
(using qword stores instead of dword ones), name all asm() operands,
and reduce the amount of (redundant) operands.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoxsm: allow relevant permission during migrate and gpu-passthrough.
Anshul Makkar [Mon, 12 Dec 2016 14:00:05 +0000 (14:00 +0000)]
xsm: allow relevant permission during migrate and gpu-passthrough.

During guest migration, allow the relevant permissions to prevent
spurious page faults.
Prevents these errors:
d73: Non-privileged (73) attempt to map I/O space 00000000

avc: denied  { set_misc_info } for domid=0 target=11
scontext=system_u:system_r:dom0_t
tcontext=system_u:system_r:domU_t tclass=domain

GPU passthrough for hvm guest:
avc:  denied  { send_irq } for domid=0 target=10
scontext=system_u:system_r:dom0_t
tcontext=system_u:system_r:domU_t tclass=hvm

Signed-off-by: Anshul Makkar <anshul.makkar@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
8 years agolibxl: init_acpi_config should return rc in exit path
Wei Liu [Wed, 14 Dec 2016 11:44:36 +0000 (11:44 +0000)]
libxl: init_acpi_config should return rc in exit path

... otherwise it returns 0 even if the function fails.

Coverity-ID: 1397121

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools/xenstat: Don't disable xentop when cross-compiling
Edgar E. Iglesias [Thu, 15 Dec 2016 12:36:09 +0000 (13:36 +0100)]
tools/xenstat: Don't disable xentop when cross-compiling

This partially reverts 16504669c5cbb8b195d20412aadc838da5c428f7
since xentop cross-compiles fine.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/xenstat: Remove redundant check for curses.h
Edgar E. Iglesias [Thu, 15 Dec 2016 12:36:08 +0000 (13:36 +0100)]
tools/xenstat: Remove redundant check for curses.h

This check for curses.h does not consider cross-compilation.
It only checks host paths.

Luckily, commit 65da4913214120ddc95bd846cb3649a29f87146a
introduced proper configure checks for ncurses so we can
remove the redundant check in the Makefile.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/paging: Rename paging_mark_pfn_dirty() and use pfn_t
Andrew Cooper [Wed, 14 Dec 2016 14:20:12 +0000 (14:20 +0000)]
x86/paging: Rename paging_mark_pfn_dirty() and use pfn_t

paging_mark_gfn_dirty() actually takes a pfn, even by parameter name.  Rename
the function and alter the type to pfn_t to match.

Push pfn_t into the LOGDIRTY_IDX() macros, and clean up a couple of local
variable types in paging_mark_pfn_dirty().

Leave an explicit comment in vmx_vcpu_flush_pml_buffer() when we intentionally
perform a straight conversion from gfn to pfn.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86/paging: Update paging_mark_dirty() to use mfn_t
Andrew Cooper [Wed, 14 Dec 2016 14:13:10 +0000 (14:13 +0000)]
x86/paging: Update paging_mark_dirty() to use mfn_t

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86emul: ignore most segment bases for 64-bit mode in is_aligned()
Jan Beulich [Thu, 15 Dec 2016 10:13:32 +0000 (11:13 +0100)]
x86emul: ignore most segment bases for 64-bit mode in is_aligned()

ops->read_segment() will report whatever is actually there in the
register, so we need to actively distinguish ES/CS/SS/DS from FS/GS.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agonestedhvm: replace VMCX_EADDR by INVALID_PADDR
Haozhong Zhang [Thu, 15 Dec 2016 10:12:34 +0000 (11:12 +0100)]
nestedhvm: replace VMCX_EADDR by INVALID_PADDR

... because INVALID_PADDR is a more general one.

Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
8 years agovvmx: check the operand of L1 vmxon
Haozhong Zhang [Thu, 15 Dec 2016 10:12:06 +0000 (11:12 +0100)]
vvmx: check the operand of L1 vmxon

Check whether the operand of L1 vmxon is a valid VMXON region address
and whether the VMXON region at that address contains a valid revision
ID.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agovvmx: return VMfail to L1 if L1 vmxon is executed in VMX operation
Haozhong Zhang [Thu, 15 Dec 2016 10:11:45 +0000 (11:11 +0100)]
vvmx: return VMfail to L1 if L1 vmxon is executed in VMX operation

According to Intel SDM, section "VMXON - Enter VMX Operation", a
VMfail should be returned to L1 hypervisor if L1 vmxon is executed in
VMX operation, rather than just print a warning message.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agovvmx: set vmxon_region_pa of vcpu out of VMX operation to an invalid address
Haozhong Zhang [Thu, 15 Dec 2016 10:11:20 +0000 (11:11 +0100)]
vvmx: set vmxon_region_pa of vcpu out of VMX operation to an invalid address

nvmx_handle_vmxon() previously checked whether a vcpu is in VMX
operation by comparing its vmxon_region_pa with GPA 0. However, 0 is
also a valid VMXON region address. If the L1 hypervisor had set the VMXON
region address to 0, the check in nvmx_handle_vmxon() would be skipped.
Fix this problem by using an invalid VMXON region address for vcpu
out of VMX operation.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
8 years agox86/vm_event: add support for VM_EVENT_REASON_INTERRUPT
Razvan Cojocaru [Thu, 15 Dec 2016 10:09:03 +0000 (11:09 +0100)]
x86/vm_event: add support for VM_EVENT_REASON_INTERRUPT

Added support for a new event type, VM_EVENT_REASON_INTERRUPT,
which is now fired in a one-shot manner when enabled via the new
VM_EVENT_FLAG_GET_NEXT_INTERRUPT vm_event response flag.
The patch also fixes the behaviour of the xc_hvm_inject_trap()
hypercall, which would lead to non-architectural interrupts
overwriting pending (specifically reinjected) architectural ones.

Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agox86/HVM: introduce hvm_get_cpl() and respective hook
Jan Beulich [Thu, 15 Dec 2016 10:07:55 +0000 (11:07 +0100)]
x86/HVM: introduce hvm_get_cpl() and respective hook

... instead of repeating the same code in various places (and getting
it wrong in some of them).

In vmx_inst_check_privilege() also stop open coding
vmx_guest_x86_mode().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agotools/livepatch: Exit with 2 if a timeout occurs
Ross Lagerwall [Wed, 14 Dec 2016 07:52:00 +0000 (07:52 +0000)]
tools/livepatch: Exit with 2 if a timeout occurs

Exit with 0 for success.
Exit with 1 for an error.
Exit with 2 if the operation should be retried for any reason (e.g. a
timeout or because another operation was in progress).

This allows a program or script driving xen-livepatch to determine if
the operation should be retried without parsing the output.

Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
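
A sketch of how a tool might map an error number onto the three exit codes. Which errno values count as retryable (EAGAIN, EBUSY, ETIMEDOUT here) is an assumption for this example, as are the names.

```c
#include <assert.h>
#include <errno.h>

/* Exit codes as described in the commit message; the names are hypothetical. */
enum { LP_EXIT_OK = 0, LP_EXIT_ERROR = 1, LP_EXIT_RETRY = 2 };

/* Map an errno-style value (0 for success) onto an exit code. */
int exit_code_for(int err)
{
    assert(err >= 0);
    switch (err) {
    case 0:
        return LP_EXIT_OK;
    case EAGAIN:     /* assumed: timed out, try again */
    case EBUSY:      /* assumed: another operation in progress */
    case ETIMEDOUT:
        return LP_EXIT_RETRY;
    default:
        return LP_EXIT_ERROR;
    }
}
```

A driving script can then branch on the exit status alone, retrying on 2 and giving up on 1.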
8 years agotools/livepatch: Save errno where needed
Ross Lagerwall [Wed, 14 Dec 2016 07:51:59 +0000 (07:51 +0000)]
tools/livepatch: Save errno where needed

Fix a number of incorrect uses of errno after an operation that could
set it (e.g. fprintf, close).

Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
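
The general pattern, sketched with a hypothetical report_failure() helper: copy errno into a local before calling anything that may overwrite it.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Print a failure message without losing the errno that caused it. */
int report_failure(FILE *out, const char *what)
{
    int saved_errno = errno;    /* capture before any call that may change it */

    assert(out != NULL && what != NULL);
    fprintf(out, "%s: failed\n", what);     /* may itself modify errno */
    fprintf(out, "Error %d: %s\n", saved_errno, strerror(saved_errno));

    errno = saved_errno;        /* restore for the caller */
    return saved_errno;
}
```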
8 years agotools/livepatch: Remove unused struct member
Ross Lagerwall [Wed, 14 Dec 2016 07:51:58 +0000 (07:51 +0000)]
tools/livepatch: Remove unused struct member

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
8 years agotools/livepatch: Remove pointless retry loop
Ross Lagerwall [Wed, 14 Dec 2016 07:51:57 +0000 (07:51 +0000)]
tools/livepatch: Remove pointless retry loop

The default timeout in the hypervisor for a livepatch operation is 30 ms,
but xen-livepatch currently waits for up to 30 seconds for the operation
to complete. Instead, remove the retry loop and simply wait for 2 * 30 ms
for the operation to complete. The extra period is to account for the
time to actually start the operation.

Furthermore, have xen-livepatch set the hypervisor timeout rather than
relying on the hypervisor default since the tool doesn't know how long
it will be. Use nanosleep rather than usleep since usleep has been
removed from POSIX.1-2008.

Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
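
A minimal sketch of a millisecond wait built on nanosleep, as a replacement for usleep. The helper names are hypothetical, and the sketch does not restart the sleep if a signal interrupts it.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Convert milliseconds to the timespec that nanosleep expects. */
struct timespec ms_to_timespec(unsigned int ms)
{
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (long)(ms % 1000) * 1000000L;
    assert(ts.tv_nsec < 1000000000L);
    return ts;
}

/* Sleep for ms milliseconds; returns 0, or -1 with errno set (e.g. EINTR). */
int sleep_ms(unsigned int ms)
{
    struct timespec ts = ms_to_timespec(ms);

    return nanosleep(&ts, NULL);
}
```

Waiting for twice a 30 ms hypervisor timeout would then be `sleep_ms(2 * 30)`.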
8 years agolivepatch: Fix documentation of timeout
Ross Lagerwall [Wed, 14 Dec 2016 07:51:56 +0000 (07:51 +0000)]
livepatch: Fix documentation of timeout

The hypervisor expects the timeout from the hypercall to be in
nanoseconds, so document this correctly. Also correctly document
what happens when timeout is set to zero.

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agotools/livepatch: Improve output
Ross Lagerwall [Wed, 14 Dec 2016 07:51:55 +0000 (07:51 +0000)]
tools/livepatch: Improve output

Improve the output of xen-livepatch, which is currently hard to read,
especially when an error occurs.

Some examples of the changes:
Before:
    $ xen-livepatch apply test
    Performing apply:. completed
After:
    $ xen-livepatch apply test
    Applying test:. completed

Before:
    $ xen-livepatch apply test2
    test2 failed with 22(Invalid argument)
    Performing apply: (no newline)
After:
    $ xen-livepatch apply test2
    Applying test2: failed
    Error 22: Invalid argument

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agotools/livepatch: Set stdout and stderr unbuffered
Ross Lagerwall [Wed, 14 Dec 2016 07:51:54 +0000 (07:51 +0000)]
tools/livepatch: Set stdout and stderr unbuffered

Using both stdout and stderr interleaved without newlines can result in
strange output when using line buffered mode (e.g. a terminal) or when
fully buffered (e.g. redirected to a file). Set stdout to unbuffered mode
to fix this (stderr is always unbuffered by default).

Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
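
In standard C this is done with setvbuf. The wrapper below is a sketch with a hypothetical name; it should run before any other output is written to stdout.

```c
#include <assert.h>
#include <stdio.h>

/* Switch stdout to unbuffered mode; returns 0 on success. */
int make_stdout_unbuffered(void)
{
    assert(stdout != NULL);
    /* _IONBF: each write goes out immediately, like stderr by default. */
    return setvbuf(stdout, NULL, _IONBF, 0);
}
```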
8 years agotools/livepatch: Show the correct expected state before action
Ross Lagerwall [Wed, 14 Dec 2016 07:51:53 +0000 (07:51 +0000)]
tools/livepatch: Show the correct expected state before action

Somewhat confusingly, before the action has been executed the patch is
expected to be in the "allow" state, not the "expected" state.  The
check for this was correct but the subsequent error message was not.
Fix the error message to show this state correctly.

Before:
    $ xen-livepatch unload test
    test: in wrong state (APPLIED), expected (unknown)
After:
    $ xen-livepatch unload test
    test: in wrong state (APPLIED), expected (CHECKED)

Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
8 years agox86/traps: Correct pagefault handling issues introduced in c/s d5c251c
Andrew Cooper [Wed, 14 Dec 2016 11:33:17 +0000 (11:33 +0000)]
x86/traps: Correct pagefault handling issues introduced in c/s d5c251c

There are two bugs.

Firstly, the ASSERT(paging_mode_only_log_dirty(d)) can trip when servicing a
hypervisor #PF in the context of an HVM guest, e.g. a copy_to_user() failure
in the shadow pagetable code.

Secondly, the entry conditions for paging_fault() were previously guarded on
!paging_mode_external(d) which limited entry to PV contexts, but for both
guest and hypervisor faults.  Switching this to paging_mode_log_dirty() opened
it up to HVM contexts as well.

Reinstate the old !paging_mode_external(d) check, as it is actually the
relevant fact, and extend the comment to explicitly state that hypervisor
faults should follow this path.

Inside, we are now guaranteed to be in the context of a PV guest, so can
safely use the assertion about log dirty.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
8 years agox86: Use ACPI reboot method for Dell OptiPlex 9020
Ross Lagerwall [Wed, 14 Dec 2016 11:12:01 +0000 (11:12 +0000)]
x86: Use ACPI reboot method for Dell OptiPlex 9020

When EFI booting the Dell OptiPlex 9020, it sometimes GP faults in the
EFI runtime instead of rebooting. Quirk this hardware to use the ACPI
reboot method instead.

dmidecode info:

BIOS Information
    Vendor: Dell Inc.
    Version: A15
    Release Date: 11/08/2015
System Information
    Manufacturer: Dell Inc.
    Product Name: OptiPlex 9020
    Version: 00

Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/emul: Further simplify DstBitBase handling
Andrew Cooper [Wed, 14 Dec 2016 10:58:08 +0000 (10:58 +0000)]
x86/emul: Further simplify DstBitBase handling

The masking of src.val is common to both paths.  Move it later and simplify
the entry condition for adjusting the memory operand.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agoConfig.mk: update mini-os changeset
Wei Liu [Wed, 14 Dec 2016 14:44:33 +0000 (14:44 +0000)]
Config.mk: update mini-os changeset

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
8 years agostubdom: modify ioemu linkfarm only if necessary
Juergen Gross [Tue, 13 Dec 2016 15:38:06 +0000 (16:38 +0100)]
stubdom: modify ioemu linkfarm only if necessary

Several stubdom libraries are being rebuilt each time a top level make
is called as they depend on stubdom/ioemu/linkfarm.stamp which is
depending on tools/qemu-xen-traditional-dir. Unfortunately this
directory is modified by each "make tools" call.

This can be avoided by writing stubdom/ioemu/linkfarm.stamp only if
a source file beneath tools/qemu-xen-traditional-dir has been added
or removed.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
8 years agox86emul: MOVNTI does not allow REP prefixes
Jan Beulich [Wed, 14 Dec 2016 09:11:08 +0000 (10:11 +0100)]
x86emul: MOVNTI does not allow REP prefixes

Just like 66, prefixes F3 and F2 cause #UD.

Also adjust a related comment, which in its previous wording was
misleading (as in 16-bit mode nothing would be undone when
adjusting operand size from 2 to 4).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: check for LAHF_LM availability
Jan Beulich [Wed, 14 Dec 2016 09:10:39 +0000 (10:10 +0100)]
x86emul: check for LAHF_LM availability

We can't exclude someone wanting to hide LAHF/SAHF from 64-bit guests.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: check for CLFLUSH{,OPT} availability
Jan Beulich [Wed, 14 Dec 2016 09:10:11 +0000 (10:10 +0100)]
x86emul: check for CLFLUSH{,OPT} availability

We can't exclude someone wanting to hide either of the instructions
from guests.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: check for SYSENTER/SYSEXIT availability
Jan Beulich [Wed, 14 Dec 2016 09:09:40 +0000 (10:09 +0100)]
x86emul: check for SYSENTER/SYSEXIT availability

We can't exclude someone wanting to hide the instructions from guests.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: CMPXCHG{8,16}B ignore prefixes
Jan Beulich [Wed, 14 Dec 2016 09:08:22 +0000 (10:08 +0100)]
x86emul: CMPXCHG{8,16}B ignore prefixes

This removes 0F C7 from the list of two-byte opcodes treating prefixes
66, F3, and F2 as opcode extensions. We better manually handle this in
the opcode specific code:
- CMPXCHG8B ignores all these prefixes (its handling is being adjusted
  accordingly, with a respective test case added as well, to avoid
  re-introducing the subject of XSA-200),
- RDRAND/RDSEED (support to be added subsequently) honor 66, but treat
  F3 and F2 as opcode extensions (resolving to RDPID in the RDSEED
  case, which in turn ignores 66).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/PV: prefer pv_inject_hw_exception()
Jan Beulich [Wed, 14 Dec 2016 08:54:35 +0000 (09:54 +0100)]
x86/PV: prefer pv_inject_hw_exception()

... over editing the error code and calling do_guest_trap().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/PV: use generic emulator for privileged instruction handling
Jan Beulich [Wed, 14 Dec 2016 08:54:03 +0000 (09:54 +0100)]
x86/PV: use generic emulator for privileged instruction handling

There's a new emulator return code being added to allow bypassing
certain operations (see the code comment).

Another small tweak to the emulator is to single iteration handling
of INS and OUTS: Since we don't want to handle any other memory access
instructions, we want these to be handled by the rep_ins() / rep_outs()
hooks here too.

And then long-mode related bits now get hidden from the guest. This
should have been that way from the beginning, but becomes a requirement
now as the emulator's in_longmode() needs this to reflect guest view.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: make write and cmpxchg hooks optional
Jan Beulich [Wed, 14 Dec 2016 08:53:09 +0000 (09:53 +0100)]
x86emul: make write and cmpxchg hooks optional

While the read and fetch hooks are basically unavoidable, write and
cmpxchg aren't really needed by that many insns.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: generalize exception handling for rep_* hooks
Jan Beulich [Wed, 14 Dec 2016 08:52:35 +0000 (09:52 +0100)]
x86emul: generalize exception handling for rep_* hooks

If any of those hooks returns X86EMUL_EXCEPTION, some register state
should still get updated if some iterations have been performed (but
the rIP update will get suppressed if not all of them did get handled).
This updating is done by register_address_increment() and
__put_rep_prefix() (which hence must no longer be bypassed). As a
result put_rep_prefix() can then skip most of the writeback, but needs
to ensure proper completion of the executed instruction.

While on the HVM side the VA -> LA -> PA translation process ensures
that an exception would be raised on the first iteration only, doing so
would unduly complicate the PV side code about to be added.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
8 years agox86/32on64: use generic instruction decoding for call gate emulation
Jan Beulich [Wed, 14 Dec 2016 08:51:40 +0000 (09:51 +0100)]
x86/32on64: use generic instruction decoding for call gate emulation

... instead of custom handling. Note that we can't use generic
emulation, as the emulator's far branch support is rather rudimentary
at this point in time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agolibxl: QED disks support
Cédric Bosdonnat [Tue, 13 Dec 2016 16:31:52 +0000 (17:31 +0100)]
libxl: QED disks support

Qdisk supports qcow and qcow2, extend it to also support qed disk
format.

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
[ wei: regenerate libxlu_disk_l.{c,h} ]
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agofix LDRB Thumb2 decoding
Stefano Stabellini [Tue, 13 Dec 2016 19:08:39 +0000 (11:08 -0800)]
fix LDRB Thumb2 decoding

Rt is four bits at offset 12, not three. See encoding T2 for LDRB,
A8.8.70 in ARM DDI 0406C.c

Suggested-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years agofirmware/rombios: fix after update to libacpi
Roger Pau Monne [Tue, 13 Dec 2016 17:15:40 +0000 (17:15 +0000)]
firmware/rombios: fix after update to libacpi

Fix a build breakage after the libacpi changes. The breakage is due to rombios
using the libacpi headers in order to parse the ACPI tables.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agox86/traps: Adjust paged-guest handling in the PV pagefault path
Andrew Cooper [Mon, 5 Dec 2016 11:29:12 +0000 (11:29 +0000)]
x86/traps: Adjust paged-guest handling in the PV pagefault path

PV guests necessarily can't be external, as Xen must steal address space from
them.  Pagefaults for HVM guests are handled by {vmx,svm}_vmexit_handler() and
don't enter the PV fixup_page_fault() path.  Therefore, the first call to
paging_fault() is dead, and dropped.

Logdirty mode is now the only paging mode we should ever find a PV guest with,
so add a new predicate and assertion to this fact.

Drop the final reference to paging_mode_external().  It is more accurately now
only for logdirty guests.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/shadow: Drop all emulation for PV vcpus
Andrew Cooper [Mon, 5 Dec 2016 11:35:32 +0000 (11:35 +0000)]
x86/shadow: Drop all emulation for PV vcpus

Emulation is only performed for paging_mode_refcount() domains, which in
practice means HVM domains only.

Drop the PV emulation code.  As it always set addr_size and sp_size to
BITS_PER_LONG, it can't have worked correctly for PV guests running in a
different mode to Xen.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86/paging: Enforce PG_external == PG_translate == PG_refcounts
Andrew Cooper [Mon, 5 Dec 2016 11:28:08 +0000 (11:28 +0000)]
x86/paging: Enforce PG_external == PG_translate == PG_refcounts

Setting PG_refcounts but not PG_translate is not useful.

While adjusting this, make a few other improvements.

 * Have paging_enable() unilaterally reject any unknown modes.
 * Drop the or'ing of PG_{HAP,SH}_enable.  The underlying functions already do
   this.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/VPMU: clear the overflow status of which counter happened to overflow
Luwei Kang [Tue, 13 Dec 2016 13:21:26 +0000 (14:21 +0100)]
x86/VPMU: clear the overflow status of which counter happened to overflow

Set only the bits corresponding to the counters which actually overflowed,
rather than setting all the available bits of IA32_PERF_GLOBAL_OVF_CTRL
whenever a PMU interrupt occurs.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agolibacpi: announce that PVHv2 has no CMOS RTC in FADT
Roger Pau Monné [Tue, 13 Dec 2016 13:20:55 +0000 (14:20 +0100)]
libacpi: announce that PVHv2 has no CMOS RTC in FADT

At the moment this flag is unconditionally set for PVHv2 domains. Note that
using this boot flag requires that the FADT table revision is at least 5.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agolibacpi: update FADT layout to support version 5
Roger Pau Monné [Tue, 13 Dec 2016 13:20:34 +0000 (14:20 +0100)]
libacpi: update FADT layout to support version 5

Update the structure of the FADT table to version 5, and use that version for
PVHv2 guests. Note that HVM guests will continue to use FADT 4. In order to do
this, add a new field to acpi_config that contains the ACPI revision to be used
by libacpi. Note that currently this only applies to the FADT.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agolibs/xenstore: set correct FreeBSD device
Roger Pau Monne [Mon, 12 Dec 2016 16:07:40 +0000 (16:07 +0000)]
libs/xenstore: set correct FreeBSD device

The path to the xenstore FreeBSD device is /dev/xen/xenstore.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agotools/fuzz: add README
Wei Liu [Thu, 8 Dec 2016 12:29:13 +0000 (12:29 +0000)]
tools/fuzz: add README

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools: hook up fuzz directory
Wei Liu [Thu, 8 Dec 2016 13:44:54 +0000 (13:44 +0000)]
tools: hook up fuzz directory

This will make all fuzzing targets get built every time the tools directory
is built. This serves as a basic regression test.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agotools/fuzz: introduce x86 instruction emulator target
Wei Liu [Thu, 8 Dec 2016 12:09:54 +0000 (12:09 +0000)]
tools/fuzz: introduce x86 instruction emulator target

The instruction emulator fuzzing code is derived from code previously written
by Andrew and George. Adapt it to the llvm fuzzer and hook it up to the build
system.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: George Dunlap <george.dunlap@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86emul/test: factor out emul_test_get_fpu
Wei Liu [Fri, 9 Dec 2016 16:09:41 +0000 (16:09 +0000)]
x86emul/test: factor out emul_test_get_fpu

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86emul/test: factor out emul_test_{read_cr,cpuid}
Wei Liu [Fri, 9 Dec 2016 11:45:08 +0000 (11:45 +0000)]
x86emul/test: factor out emul_test_{read_cr,cpuid}

While at it, move xgetbv, all cpu_has_* and cache_line_size macros to
x86_emulate.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86emul/test: factor out emul_test_make_stack_executable
Wei Liu [Fri, 9 Dec 2016 11:09:01 +0000 (11:09 +0000)]
x86emul/test: factor out emul_test_make_stack_executable

It will be used by emulator fuzzing target.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agotools/fuzz: introduce libelf target
Wei Liu [Wed, 7 Dec 2016 11:28:56 +0000 (11:28 +0000)]
tools/fuzz: introduce libelf target

Source code and Makefile to fuzz libelf in Google's oss-fuzz
infrastructure.

Introduce FUZZ_NO_LIBXC in libelf-private.h. That macro will be set when
compiling libelf fuzzer target because libxc is not required in libelf
fuzzing.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/shadow: Misc minor cleanup
Andrew Cooper [Mon, 5 Dec 2016 11:23:00 +0000 (11:23 +0000)]
x86/shadow: Misc minor cleanup

 * Move the #ifdefary inside sh_audit_gw() to avoid needing the else clause.
 * The walk_t parameter is only ever read, so make it const.
 * Use mfn_eq() rather than opencoding it.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86/shadow: Tweak some initialisation in sh_page_fault()
Andrew Cooper [Mon, 5 Dec 2016 11:32:59 +0000 (11:32 +0000)]
x86/shadow: Tweak some initialisation in sh_page_fault()

sh_page_fault() is a complicated function.  It aids clarity for the reader if
constant data is declared as such.

Declare struct npfec access and fetch_type_t ft as const, which requires
initialising them during declaration.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
8 years agox86/emul: Implement the STAC and CLAC instructions
Andrew Cooper [Tue, 18 Oct 2016 15:55:26 +0000 (16:55 +0100)]
x86/emul: Implement the STAC and CLAC instructions

Note that unlike most privilege restricted instructions, STAC and CLAC are
documented to raise #UD rather than #GP[0], and indeed do so.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agoxen: Fix determining when domain creation is complete
Andrew Cooper [Mon, 12 Dec 2016 18:28:40 +0000 (18:28 +0000)]
xen: Fix determining when domain creation is complete

d->creation_finished is used in several places to alter behaviour depending on
whether the domain is being created, or is already running.

However, there is a latent bug if a toolstack component makes a pair of
pause/unpause calls, where creation will be considered finished prematurely.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Paul Durrant <paul.durrant@citrix.com>
8 years agox86/hvm: Fix HVMOP_get_param when skipping creating the default ioreq server
Andrew Cooper [Mon, 12 Dec 2016 18:12:54 +0000 (18:12 +0000)]
x86/hvm: Fix HVMOP_get_param when skipping creating the default ioreq server

c/s e7dabe5 "x86/hvm: don't unconditionally create a default ioreq server"
added a break statement, but the logic previously depended on falling through
into the default case to fill in the value the caller asked for.

This causes the sending migration code to put a junk PARAM into the stream,
and the receiving side to fail to zero the IOREQ pages, causing QEMU to object
when it finds stale requests while starting up.

Reorder the code so it more clearly falls through into the default case.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
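
The fall-through dependency described in the HVMOP_get_param fix above can be
sketched roughly as follows. This is a simplified illustration, not the actual
Xen code: the parameter indices, struct domain_sketch and get_param() are all
hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical parameter indices and domain state, for illustration only. */
enum { PARAM_IOREQ_PFN = 1, PARAM_OTHER = 2, NR_PARAMS = 3 };

struct domain_sketch {
    bool creation_finished;
    bool default_server_created;
    unsigned long params[NR_PARAMS];
};

static int get_param(struct domain_sketch *d, unsigned int index,
                     unsigned long *value)
{
    if (index >= NR_PARAMS)
        return -1;

    switch (index) {
    case PARAM_IOREQ_PFN:
        /* The side effect is skipped once the domain is fully created. */
        if (!d->creation_finished)
            d->default_server_created = true;
        /* Fall through - the requested value must still be filled in. */
    default:
        *value = d->params[index];
        break;
    }

    return 0;
}
```

Placing a break after the conditional side effect would leave *value
untouched, which corresponds to the junk value described in the message above.
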
8 years agofix potential pa_range_info out of bound access
Stefano Stabellini [Mon, 12 Dec 2016 19:22:39 +0000 (11:22 -0800)]
fix potential pa_range_info out of bound access

pa_range_info has only 8 elements and is accessed using pa_range as
index. pa_range is initialized to 16, potentially causing out-of-bounds
access errors. Fix the issue by checking that pa_range is less than
the size of the array. Remove the now superfluous pa_range&0x8
check.

Coverity-ID: 1381865

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
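
A minimal sketch of the kind of bounds check described in the pa_range_info
fix above. The table contents, struct pa_range_sketch and lookup_pabits() are
assumptions made for illustration; only the array name and its size of 8 come
from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative stand-in for the real table: 8 entries, values assumed. */
struct pa_range_sketch { unsigned int pabits; };

static const struct pa_range_sketch pa_range_info[8] = {
    { 32 }, { 36 }, { 40 }, { 42 }, { 44 }, { 48 }, { 0 }, { 0 },
};

/* Returns true and stores pabits only when the index is inside the array. */
static bool lookup_pabits(unsigned int pa_range, unsigned int *pabits)
{
    /* Valid indices are 0 .. ARRAY_SIZE - 1, so reject anything >= size. */
    if (pa_range >= ARRAY_SIZE(pa_range_info))
        return false;

    *pabits = pa_range_info[pa_range].pabits;
    return true;
}
```

With a check of this shape, an initial value such as 16 is rejected before the
array is ever indexed.
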
8 years agofix potential int overflow in efi/boot
Stefano Stabellini [Fri, 9 Dec 2016 19:52:09 +0000 (11:52 -0800)]
fix potential int overflow in efi/boot

HorizontalResolution and VerticalResolution are 32bit, while size is
64bit. As it stands multiplications are evaluated with 32bit arithmetic,
which could overflow. Cast HorizontalResolution to 64bit to avoid that.

Coverity-ID: 1381858

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Jan Beulich <jbeulich@suse.com>
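
The overflow described in the efi/boot fix above can be illustrated with a
small sketch. The function names are hypothetical; the point is only that the
cast must be applied to an operand before the multiplication, not to the
result.

```c
#include <stdint.h>

/* Product computed in 32-bit arithmetic: wraps before being widened. */
static uint64_t pixels_narrow(uint32_t horizontal, uint32_t vertical)
{
    return horizontal * vertical;
}

/* One operand widened first, so the whole multiply is done in 64-bit. */
static uint64_t pixels_wide(uint32_t horizontal, uint32_t vertical)
{
    return (uint64_t)horizontal * vertical;
}
```

For example, with both resolutions equal to 65536 the narrow variant wraps to
0, while the widened variant yields 4294967296.
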
8 years agox86/emul: Annotate more intentional fallthrough cases
Andrew Cooper [Mon, 12 Dec 2016 10:05:04 +0000 (10:05 +0000)]
x86/emul: Annotate more intentional fallthrough cases

Some recent change in x86_emulate.c has simplified the callgraph sufficiently
for Coverity to notice these, rather than hitting its upper path limit.

All are legitimate fallthroughs.  Annotate them as such.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agoconsole: allow log level threshold adjustments
Jan Beulich [Mon, 12 Dec 2016 16:48:49 +0000 (17:48 +0100)]
console: allow log level threshold adjustments

... from serial console so that one doesn't always need to reboot to
see more / less messages.

Note that upper thresholds are sticky, i.e. while they get adjusted
upwards when the lower threshold would otherwise end up above the upper
one, they don't get adjusted when reducing the lower one. Full
flexibility is available only via a future sysctl interface.

Note further that (meaningless) large threshold values aren't being
rejected, for the sake of not adding more checks to the code than are
really necessary for safe operation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
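
The "sticky" upper threshold behaviour described above might be modelled along
these lines. This is a sketch under assumptions: struct log_thresholds and
set_lower_threshold() are hypothetical names, and the real console code may
differ in detail.

```c
/* Hypothetical pair of thresholds; lower <= upper is the invariant. */
struct log_thresholds {
    int lower;
    int upper;
};

static void set_lower_threshold(struct log_thresholds *t, int new_lower)
{
    t->lower = new_lower;

    /*
     * The upper threshold is only ever dragged upwards to keep
     * lower <= upper; it is never pulled back down when the lower
     * threshold is reduced.
     */
    if (t->upper < t->lower)
        t->upper = t->lower;
}
```

Raising the lower threshold from 1 to 4 with an upper threshold of 2 moves
both to 4; lowering it back to 1 leaves the upper threshold at 4.
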
8 years agox86/time: don't omit newline in dump_softtsc()
Jan Beulich [Mon, 12 Dec 2016 16:48:19 +0000 (17:48 +0100)]
x86/time: don't omit newline in dump_softtsc()

Reported-by: Anton Samsonov <devel@zxlab.ru>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: consolidate string insn register adjustments
Jan Beulich [Mon, 12 Dec 2016 16:47:29 +0000 (17:47 +0100)]
x86emul: consolidate string insn register adjustments

Move the check of EFLAGS.DF into the macro (which is being renamed to no
longer suggest a particular direction), rendering all call sites more
readable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agoAMD IOMMU: Support IOAPIC IDs larger than 128
Suravee Suthikulpanit [Mon, 12 Dec 2016 16:43:34 +0000 (17:43 +0100)]
AMD IOMMU: Support IOAPIC IDs larger than 128

Currently, the driver uses the APIC ID to index into the ioapic_sbdf array.
The current MAX_IO_APICS is 128, which causes the driver initialization
to fail on systems with an IOAPIC ID >= 128.

Instead, this patch adds APIC ID in the struct ioapic_sbdf,
which is used to match the entry when searching through the array.

Also, this patch removes the use of ioapic_cmdline bit-map, which is
used to track the ivrs_ioapic options via command line.
Instead, it introduces the cmdline flag in the struct ioapic_sbdf,
to identify if the entry is created during ivrs_ioapic command-line parsing.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
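
The change from indexing by APIC ID to searching for a matching entry can be
sketched as below. The struct layout and the helpers find_ioapic() and
add_ioapic() are hypothetical and only illustrate the idea; they are not the
driver's actual definitions.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_IO_APICS 128

/* Hypothetical entry: the ID is stored in the entry, not used as index. */
struct ioapic_entry {
    unsigned int id;   /* IOAPIC ID, may be >= MAX_IO_APICS */
    unsigned int bdf;
    bool in_use;
    bool cmdline;      /* entry came from command-line parsing */
};

static struct ioapic_entry ioapic_table[MAX_IO_APICS];

/* Returns the slot holding apic_id, or -1 if there is none. */
static int find_ioapic(unsigned int apic_id)
{
    for (size_t i = 0; i < MAX_IO_APICS; i++)
        if (ioapic_table[i].in_use && ioapic_table[i].id == apic_id)
            return (int)i;

    return -1;
}

/* Reuses the matching slot if present, otherwise takes the first free one. */
static int add_ioapic(unsigned int apic_id, unsigned int bdf, bool cmdline)
{
    int idx = find_ioapic(apic_id);

    for (size_t i = 0; idx < 0 && i < MAX_IO_APICS; i++)
        if (!ioapic_table[i].in_use)
            idx = (int)i;

    if (idx < 0)
        return -1; /* table full */

    ioapic_table[idx].id = apic_id;
    ioapic_table[idx].bdf = bdf;
    ioapic_table[idx].in_use = true;
    ioapic_table[idx].cmdline = cmdline;
    return idx;
}
```

The array size now limits only the number of IOAPICs, not the range of IDs
they may carry.
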
8 years agox86: allow the emulated APICs to be enabled for the hardware domain
Roger Pau Monné [Mon, 12 Dec 2016 16:42:40 +0000 (17:42 +0100)]
x86: allow the emulated APICs to be enabled for the hardware domain

Allow the use of both the emulated local APIC and IO APIC for the hardware
domain.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agotools: bump some library version numbers to 4.9
Wei Liu [Tue, 6 Dec 2016 12:05:46 +0000 (12:05 +0000)]
tools: bump some library version numbers to 4.9

Bump the version number for libxc, libxlu, libxl and libvchan to 4.9.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
8 years agolibxl: Add COLO replication top-id support
Zhang Chen [Wed, 30 Nov 2016 09:47:52 +0000 (17:47 +0800)]
libxl: Add COLO replication top-id support

qemu colo has added the top-id parameter, so update libxl accordingly.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl: Add Xen colo support for qemu-upstream colo code
Zhang Chen [Wed, 30 Nov 2016 09:47:51 +0000 (17:47 +0800)]
libxl: Add Xen colo support for qemu-upstream colo code

The qemu code has been updated, so update the Xen colo block code to match.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl: fix gentypes call in Makefile
Cédric Bosdonnat [Thu, 10 Nov 2016 16:46:00 +0000 (17:46 +0100)]
libxl: fix gentypes call in Makefile

From the make documentation:

"$* [...] If the target is `dir/a.foo.b' and the target pattern is
`a.%.b' then the stem is `dir/foo'. In a static pattern rule, the
stem is part of the file name that matched the `%' in the target
pattern."

The rule generating the c types files from the idl ones is not
a static pattern rule, but rather an implicit rule. Thus the value
of $* is preceded by the file path, instead of only what matches %.

In order to get this fixed, drop the path using a $(notdir $*).

Signed-off-by: Cédric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agolibxl: fix erroneous negation for isstubdom
Sander Eikelenboom [Sat, 10 Dec 2016 17:59:08 +0000 (18:59 +0100)]
libxl: fix erroneous negation for isstubdom

Commit 20b75251d9721d9c050a973c02baac396c794ade introduced an erroneous
negation which gave the isstubdom bool the opposite semantics, causing
the subsequent code to take the wrong code path, which breaks HVM
pci-passthrough.

Signed-off-by: Sander Eikelenboom <linux@eikelenboom.it>
Acked-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
8 years agoINSTALL: remove stale coverage build instruction
Wei Liu [Wed, 7 Dec 2016 15:10:01 +0000 (15:10 +0000)]
INSTALL: remove stale coverage build instruction

Now it is controlled by Kconfig.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86/hvm: don't unconditionally create a default ioreq server
Paul Durrant [Mon, 12 Dec 2016 08:49:10 +0000 (09:49 +0100)]
x86/hvm: don't unconditionally create a default ioreq server

Avoid doing so if the domain is not under construction.

If upstream QEMU is in use then it will explicitly create an ioreq server
rather than implicitly creating the default ioreq server, which is a
side-effect of reading HVM_PARAM_IOREQ_PFN, HVM_PARAM_BUFIOREQ_PFN,
or HVM_PARAM_BUFIOREQ_EVTCHN (as is done by legacy QEMUs).

However, if the domain is subsequently saved/migrated then those parameters
are read and hence the default server will be unnecessarily instantiated.

This patch adds an extra check of the 'creation_finished' flag when those
HVM params are read and will only instantiate the server if the domain is
under construction, which will always be the case when QEMU is invoked.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86emul: use SrcEax/DstEax where suitable for string insns
Jan Beulich [Mon, 12 Dec 2016 08:41:57 +0000 (09:41 +0100)]
x86emul: use SrcEax/DstEax where suitable for string insns

LODS, SCAS, and STOS all use the accumulator as one of their operands.
This avoids some open coding of things, but requires switching around
operands of SCAS.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86: add CPUID dependents of APIC and TSC
Jan Beulich [Mon, 12 Dec 2016 08:41:21 +0000 (09:41 +0100)]
x86: add CPUID dependents of APIC and TSC

TSC_DEADLINE in particular depends on both; take the opportunity to add
a few more.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: avoid numeric literals for EFLAGS values
Jan Beulich [Mon, 12 Dec 2016 08:40:40 +0000 (09:40 +0100)]
x86emul: avoid numeric literals for EFLAGS values

Make the code use EFLG_* constants instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: move some of the early operand adjustments
Jan Beulich [Mon, 12 Dec 2016 08:40:06 +0000 (09:40 +0100)]
x86emul: move some of the early operand adjustments

As said in the code comment being added, only adjustments affecting
further processing prior to the x86_decode_*() calls really belong into
x86_decode() itself.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: abstract gcc asm() flag output handling
Jan Beulich [Mon, 12 Dec 2016 08:39:26 +0000 (09:39 +0100)]
x86emul: abstract gcc asm() flag output handling

Let's try to limit #ifdef-ery, or else more of these would need to
appear later.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agox86emul: derive vcpu_must_have() from vcpu_has()
Jan Beulich [Mon, 12 Dec 2016 08:38:50 +0000 (09:38 +0100)]
x86emul: derive vcpu_must_have() from vcpu_has()

... to avoid introducing further redundancy when adding further feature
flag checks, and to bring its use better in line with its host_and_*()
sibling.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
8 years agomake tlbflush_filter()'s first parameter a pointer
Jan Beulich [Mon, 12 Dec 2016 08:34:09 +0000 (09:34 +0100)]
make tlbflush_filter()'s first parameter a pointer

This brings it in line with most other functions dealing with CPU
masks. Convert both implementations to inline functions at once.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
8 years agoarm/irq: Reorder check in route_irq_to_guest() to avoid 4 layers of "if"
Oleksandr Tyshchenko [Tue, 6 Dec 2016 17:53:20 +0000 (19:53 +0200)]
arm/irq: Reorder check in route_irq_to_guest() to avoid 4 layers of "if"

Remove one layer of "if" by reordering the check
in route_irq_to_guest() to make the code clearer.

Signed-off-by: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
8 years agofix out of bound access to mode_strings
Stefano Stabellini [Fri, 9 Dec 2016 01:17:04 +0000 (17:17 -0800)]
fix out of bound access to mode_strings

mode == ARRAY_SIZE(mode_strings) causes an out of bound access to
the mode_strings array.

Coverity-ID: 1381859

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
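
The off-by-one described in the mode_strings fix above comes down to using
'>=' rather than '>' in the range check. A minimal sketch, with illustrative
table contents and a hypothetical mode_name() helper:

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Illustrative contents only; the real table differs. */
static const char *const mode_strings[] = { "mode0", "mode1", "mode2" };

static const char *mode_name(unsigned int mode)
{
    /*
     * A check of the form mode > ARRAY_SIZE(...) would still let
     * mode == ARRAY_SIZE(...) through and read one past the end.
     */
    if (mode >= ARRAY_SIZE(mode_strings))
        return "Unknown";

    return mode_strings[mode];
}
```
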
8 years agomissing vgic_unlock_rank in gic_remove_irq_from_guest
Stefano Stabellini [Fri, 9 Dec 2016 00:59:28 +0000 (16:59 -0800)]
missing vgic_unlock_rank in gic_remove_irq_from_guest

Add missing vgic_unlock_rank on the error path in
gic_remove_irq_from_guest.

Coverity-ID: 1381843

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Julien Grall <julien.grall@arm.com>
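
A common way to avoid a missing unlock on an error path, as in the fix above,
is to funnel every exit through a single label. The lock model and function
names below are hypothetical stand-ins for illustration, not the vgic code.

```c
#include <errno.h>
#include <stdbool.h>

/* Toy lock: depth must be back at 0 whenever the function returns. */
struct rank_sketch {
    int lock_depth;
};

static void rank_lock(struct rank_sketch *r)   { r->lock_depth++; }
static void rank_unlock(struct rank_sketch *r) { r->lock_depth--; }

static int remove_irq_sketch(struct rank_sketch *r, bool irq_busy)
{
    int rc = 0;

    rank_lock(r);

    if (irq_busy) {
        rc = -EBUSY;
        goto out;   /* the error path leaves via the unlock as well */
    }

    /* ... the normal removal work would go here ... */

out:
    rank_unlock(r);
    return rc;
}
```
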
8 years agox86/hvm: Move hvm_hypervisor_cpuid_leaf() handling into cpuid_hypervisor_leaves()
Andrew Cooper [Sun, 2 Oct 2016 16:28:11 +0000 (17:28 +0100)]
x86/hvm: Move hvm_hypervisor_cpuid_leaf() handling into cpuid_hypervisor_leaves()

This reduces the net complexity of CPUID handling by having all adjustments in
the same place.  Remove the now-unused hvm_funcs.hypervisor_cpuid_leaf()
infrastructure.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/hvm: Move hvm_funcs.cpuid_intercept() handling into hvm_cpuid()
Andrew Cooper [Sun, 2 Oct 2016 16:28:11 +0000 (17:28 +0100)]
x86/hvm: Move hvm_funcs.cpuid_intercept() handling into hvm_cpuid()

This reduces the net complexity of CPUID handling by having all adjustments in
the same place.  Remove the now-unused hvm_funcs.cpuid_intercept
infrastructure.

The SYSCALL feature hiding is tweaked when moved.  In principle, an
administrator can choose to explicitly hide the SYSCALL feature from the
guest, as it has a separate feature bit.  If this is the case, the feature
shouldn't be set against the administrator's wishes.  (Not that many
64bit OSes would function in this scenario.)  In reality, SYSCALL will always
be set in edx at this point.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/vpmu: Remove core2_no_vpmu_ops
Andrew Cooper [Tue, 4 Oct 2016 19:35:45 +0000 (20:35 +0100)]
x86/vpmu: Remove core2_no_vpmu_ops

core2_no_vpmu_ops exists solely to work around the default-leaking of CPUID/MSR
values in Xen.

With CPUID handling removed from arch_vpmu_ops, the RDMSR handling is the last
remaining hook.  Since core2_no_vpmu_ops's introduction in c/s 25250ed7 "vpmu
intel: Add cpuid handling when vpmu disabled", a lot of work has been done and
the nop path in vpmu_do_msr() now suffices.

vpmu_do_msr() also falls into the nop path for un-configured or unprivileged
domains, which enables the removal of the duplicate logic in priv_op_read_msr().

Finally, make all arch_vpmu_ops structures const as they are never modified,
and make them static as they are not referred to externally.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
8 years agox86/vpmu: Move vpmu_do_cpuid() handling into {pv,hvm}_cpuid()
Andrew Cooper [Tue, 4 Oct 2016 19:35:45 +0000 (20:35 +0100)]
x86/vpmu: Move vpmu_do_cpuid() handling into {pv,hvm}_cpuid()

This reduces the net complexity of CPUID handling by having all adjustments in
the same place.  Remove the now-unused vpmu_do_cpuid() infrastructure.

This involves introducing a vpmu_enabled() predicate, and making the Intel
specific VPMU_CPU_HAS_* constants public.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
8 years agox86emul: correct 64-bit mode repeated string insn handling with zero count
Jan Beulich [Fri, 9 Dec 2016 14:51:57 +0000 (15:51 +0100)]
x86emul: correct 64-bit mode repeated string insn handling with zero count

When a 32-bit address override is in effect these zero-extend all
registers which would also get updated in case of non-zero repeat
count.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>