xenbits.xensource.com Git - xen.git/log
xen.git
7 years agox86/hvm: Fix boundary check in hvmemul_insn_fetch()
Andrew Cooper [Tue, 25 Jul 2017 18:48:43 +0000 (19:48 +0100)]
x86/hvm: Fix boundary check in hvmemul_insn_fetch()

c/s 0943a03037 added some extra protection for overflowing the emulation
instruction cache, but Coverity points out that the boundary condition is off by
one when memcpy()'ing out of the buffer.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
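
As a rough illustration of the kind of boundary check the commit above is
about, the sketch below copies bytes out of a small fixed-size cache. The
struct, the buffer size and the function name are assumptions made for this
example and are not Xen's actual code. The point is only that a read of
'bytes' bytes starting at 'off' is in bounds exactly when off + bytes <= len,
so a comparison that is off by one either rejects a read that ends at the
last valid byte or admits a one-byte overrun.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SIZE 16            /* illustrative size, not Xen's */

/* Hypothetical instruction cache: 'len' bytes of 'buf' are valid. */
struct insn_cache {
    uint8_t buf[CACHE_SIZE];
    size_t  len;
};

/*
 * Copy 'bytes' bytes starting at offset 'off' out of the cache.
 * Returns false rather than reading past the valid data.
 */
bool cache_read(const struct insn_cache *c, size_t off, size_t bytes,
                uint8_t *out)
{
    /* Equivalent to off + bytes <= len, written so it cannot overflow. */
    if ( off > c->len || bytes > c->len - off )
        return false;

    memcpy(out, c->buf + off, bytes);
    return true;
}
```

A read that ends exactly at c->len succeeds, and one byte more fails, which
is the boundary an off-by-one check gets wrong.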
7 years agolibxc: bail immediately when PV superpage is discovered
Wei Liu [Wed, 26 Jul 2017 07:44:56 +0000 (08:44 +0100)]
libxc: bail immediately when PV superpage is discovered

The original code was added with the hope that PV superpage migration
might work. But it was never proven that the code actually worked.

Now that PV superpage is gone, simplify the code by returning error
immediately.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agotools: nuke superpage parameters in code
Wei Liu [Wed, 26 Jul 2017 07:44:55 +0000 (08:44 +0100)]
tools: nuke superpage parameters in code

Also fix manpage because there is no superpages option in xl.cfg.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agox86: nuke PV superpage option and code
Wei Liu [Wed, 26 Jul 2017 07:44:54 +0000 (08:44 +0100)]
x86: nuke PV superpage option and code

Delete the user visible option and code for PV superpage support. The
mm code is modified as if the option is set to false (the default
value).

Return the address space occupied by spage_info back to the reserved
address space.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agoxen:arm earlyprintk configuration for Hikey 960 boards
Konrad Rzeszutek Wilk [Wed, 26 Jul 2017 17:18:45 +0000 (10:18 -0700)]
xen:arm earlyprintk configuration for Hikey 960 boards

Introduce an earlyprintk configuration of Hikey 960 boards.

Tested with:
 https://github.com/96boards-hikey/edk2.git #testing/hikey960_v2.5
 https://github.com/96boards-hikey/OpenPlatformPkg.git #testing/hikey960_v1.3.4
 https://git.savannah.gnu.org/git/grub.git #master
 https://github.com/96boards-hikey/linux.git #hikey960-upstream-rebase

For GRUB, the following stanza was used:

GRUB_MODULES="boot chain configfile echo efinet eval ext2 fat font gettext gfxterm gzio help linux loadenv lsefi normal part_gpt part_msdos read regexp search search_fs_file search_fs_uuid search_label terminal terminfo test tftp time xen_boot"

grub-install/usr/bin/grub-mkimage \
                --config grub.config \
                --dtb linux/arch/arm64/boot/dts/hisilicon/hi3660-hikey960.dtb \
                --directory=grub/usr/lib64/grub/arm64-efi \
                --output=grubaa64.efi \
                --format=arm64-efi \
                --prefix="/boot/grub" \
                $GRUB_MODULES

And grub.config:
search.fs_label rootfs root

set prefix=($root)/boot/grub
configfile $prefix/grub.cfg

Signed-off-by: Konrad Rzeszutek Wilk <konrad@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agotools: tracing: handle null scheduler's events
Dario Faggioli [Wed, 26 Jul 2017 14:55:29 +0000 (15:55 +0100)]
tools: tracing: handle null scheduler's events

In both xentrace and xenalyze.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: sched_null: add some tracing
Dario Faggioli [Wed, 26 Jul 2017 14:55:29 +0000 (15:55 +0100)]
xen: sched_null: add some tracing

In line with what is there in all the other schedulers.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
7 years agoxen: sched-null: support soft-affinity
Dario Faggioli [Wed, 26 Jul 2017 14:55:28 +0000 (15:55 +0100)]
xen: sched-null: support soft-affinity

The null scheduler does not really use hard-affinity for
scheduling, it uses it for 'placement', i.e., for deciding
to what pCPU to statically assign a vCPU.

Let's use soft-affinity in the same way, of course with the
difference that, if there's no free pCPU within the vCPU's
soft-affinity, we go checking the hard-affinity, instead of
putting the vCPU in the waitqueue.

This has no impact on the scheduling overhead, because
soft-affinity is only considered in the cold path (like when a
vCPU joins the scheduler for the first time, or is manually
moved between pCPUs by the user).

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
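
The placement rule described in the commit above can be sketched roughly as
follows. This is a minimal illustration using plain 64-bit masks; the
function names, the NO_CPU sentinel and the "lowest free pCPU wins" choice
are assumptions for the example, not the null scheduler's real code.

```c
#include <stdint.h>

#define NO_CPU (-1)   /* hypothetical sentinel: no suitable pCPU found */

/* Index of the lowest set bit in 'mask', or NO_CPU if the mask is empty. */
int first_cpu(uint64_t mask)
{
    for ( int cpu = 0; cpu < 64; cpu++ )
        if ( mask & (UINT64_C(1) << cpu) )
            return cpu;

    return NO_CPU;
}

/*
 * Pick a pCPU for a vCPU:
 *  1. prefer a free pCPU inside both the soft- and the hard-affinity;
 *  2. otherwise fall back to any free pCPU inside the hard-affinity;
 *  3. otherwise return NO_CPU, meaning the vCPU would go to the waitqueue.
 */
int pick_pcpu(uint64_t free_cpus, uint64_t hard, uint64_t soft)
{
    int cpu = first_cpu(free_cpus & hard & soft);

    if ( cpu != NO_CPU )
        return cpu;

    return first_cpu(free_cpus & hard);
}
```

Because this only runs when a vCPU is placed, not on every scheduling
decision, it stays off the hot path, which is the point the commit message
makes.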
7 years agoxen: sched_null: check for pending tasklet work a bit earlier
Dario Faggioli [Wed, 26 Jul 2017 14:55:27 +0000 (15:55 +0100)]
xen: sched_null: check for pending tasklet work a bit earlier

Whether or not there's pending tasklet work to do, it's
something we know from the tasklet_work_scheduled parameter.

Deal with that as soon as possible, like all other schedulers
do.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: sched: factor affinity helpers out of sched_credit.c
Dario Faggioli [Wed, 26 Jul 2017 14:55:27 +0000 (15:55 +0100)]
xen: sched: factor affinity helpers out of sched_credit.c

In fact, we want to be able to use them from any scheduler.

While there, make the moved code use 'v' for struct vcpu *
variables, like it should be done everywhere.

No functional change intended.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Signed-off-by: Justin T. Weaver <jtweaver@hawaii.edu>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
7 years agox86/emul: Drop segment_attributes_t
Andrew Cooper [Mon, 5 Jun 2017 16:19:27 +0000 (17:19 +0100)]
x86/emul: Drop segment_attributes_t

The amount of namespace resolution is unnecessarily large, as all code deals
in terms of struct segment_register.  This removes the attr.fields part of all
references, and alters attr.bytes to just attr.

Three areas of code using initialisers for segment_register are tweaked to
compile with older versions of GCC.  arch_set_info_hvm_guest() has its SEG()
macros altered to use plain comma-based initialisation, while
{rm,vm86}_{cs,ds}_attr are simplified to plain numbers which matches their
description in the manuals.

No functional change.  (For some reason, the old {rm,vm86}_{cs,ds}_attr causes
GCC to create variables in .rodata, whereas the new code uses immediate
operands.  As a result, vmx_{get,set}_segment_register() are slightly
shorter.)

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/hvm: Rearange check_segment() to use a switch statement
Andrew Cooper [Mon, 5 Jun 2017 16:19:27 +0000 (17:19 +0100)]
x86/hvm: Rearange check_segment() to use a switch statement

This simplifies the logic by separating the x86_segment check from the type
check.  No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/svm: Drop svm_segment_register_t
Andrew Cooper [Fri, 30 Jun 2017 12:12:00 +0000 (12:12 +0000)]
x86/svm: Drop svm_segment_register_t

Most SVM code already uses struct segment_register.  Drop the typedef and
adjust the definitions in struct vmcb_struct, and svm_dump_sel().  Introduce
some build-time assertions that struct segment_register from the common
emulation code is usable in struct vmcb_struct.

While making these adjustments, fix some comments to not mix decimal and
hexadecimal offsets, and drop all trailing whitespace in vmcb.h.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
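
Build-time assertions of the kind mentioned in the commit above might look
roughly like the sketch below. It uses C11 static_assert rather than whatever
assertion macro the Xen tree itself uses, and both structures are simplified
stand-ins invented for the example (the real vmcb layout is much larger).

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the common emulation code's segment register layout. */
struct segment_register {
    uint16_t sel;
    uint16_t attr;
    uint32_t limit;
    uint64_t base;
};

/* Hypothetical, heavily truncated VMCB-like structure. */
struct vmcb_like {
    struct segment_register es;
    struct segment_register cs;
};

/* Fail the build if the layout stops matching what hardware expects. */
static_assert(sizeof(struct segment_register) == 16,
              "segment_register must be 16 bytes");
static_assert(offsetof(struct segment_register, base) == 8,
              "base must sit at offset 8");
static_assert(offsetof(struct vmcb_like, cs) == 16,
              "cs must directly follow es");
```

The checks cost nothing at run time; a layout change that breaks them stops
the build instead of silently corrupting guest state.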
7 years agox86/pagewalk: Remove opt_allow_superpage check from guest_can_use_l2_superpages()
Andrew Cooper [Mon, 24 Jul 2017 16:28:25 +0000 (17:28 +0100)]
x86/pagewalk: Remove opt_allow_superpage check from guest_can_use_l2_superpages()

The purpose of guest_walk_tables() is to match the behaviour of real hardware.

A PV guest can have 2M superpages in its pagetables, via the M2P (and for dom0
via the initial P2M), even if the guest isn't permitted to create arbitrary 2M
superpage mappings.

guest_can_use_l2_superpages() checking opt_allow_superpage is a piece of PV
guest policy enforcement, rather than its intended purpose of meaning "would
hardware tolerate finding an L2 superpage with these control settings?"

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/mm: Rename get_page_and_type_from_pagenr() to get_page_and_type_from_mfn()
Andrew Cooper [Wed, 18 Jan 2017 18:02:19 +0000 (18:02 +0000)]
x86/mm: Rename get_page_and_type_from_pagenr() to get_page_and_type_from_mfn()

'pagenr' is actually an mfn.  Rename the function to use consistent
terminology, switching it to use a typesafe mfn_t.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/mm: Rename get_page_from_pagenr() to get_page_from_mfn()
Andrew Cooper [Wed, 18 Jan 2017 17:58:42 +0000 (17:58 +0000)]
x86/mm: Rename get_page_from_pagenr() to get_page_from_mfn()

'pagenr' is actually an mfn.  Rename the function to use consistent
terminology, switching it to use a typesafe mfn_t and boolean return type.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
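
The idea behind a "typesafe" frame number, as used in the two renames above,
can be sketched as wrapping the integer in a one-member struct so that the
compiler rejects accidental mixing with other integer kinds. The helper names
and the MAX_FRAMES limit below are hypothetical and chosen for the example;
they are not Xen's own helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* A distinct type: a plain integer cannot be passed where mfn_t is wanted. */
typedef struct { uint64_t m; } mfn_t;

/* Hypothetical helpers to box and unbox the raw value. */
static inline mfn_t make_mfn(uint64_t m)    { return (mfn_t){ m }; }
static inline uint64_t mfn_value(mfn_t mfn) { return mfn.m; }

#define MAX_FRAMES 1024   /* illustrative limit for the example only */

/* A boolean-returning lookup in the spirit of the renamed function. */
bool frame_is_valid(mfn_t mfn)
{
    return mfn_value(mfn) < MAX_FRAMES;
}
```

Calling frame_is_valid(10) does not compile; the caller has to write
frame_is_valid(make_mfn(10)), which makes the unit of the number explicit.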
7 years agoRevert "VT-d: fix VF of RC integrated PF matched to wrong VT-d unit"
Chao Gao [Tue, 25 Jul 2017 10:48:26 +0000 (18:48 +0800)]
Revert "VT-d: fix VF of RC integrated PF matched to wrong VT-d unit"

This reverts commit 89df98b77d28136c4d7aade13a1c8bc154d2919f, which
causes a Xen crash when loading the VF driver. The reason seems to be that
pci_get_pdev() can't be called when interrupts are disabled. I don't have a
quick solution to fix this; therefore revert this patch to let common cases
work well. As to the corner case I intended to fix, I will propose another
solution later.

Below is the call trace of Xen crash:
(XEN) Xen BUG at spinlock.c:47
(XEN) ----[ Xen-4.10-unstable  x86_64  debug=y   Tainted:  C   ]----
(XEN) CPU:    2
(XEN) RIP:    e008:[<ffff82d08023513c>] spinlock.c#check_lock+0x3c/0x40
(XEN) RFLAGS: 0000000000010046   CONTEXT: hypervisor (d0v2)
(XEN) rax: 0000000000000000   rbx: ffff82d08043b9c8   rcx: 0000000000000001
(XEN) rdx: 0000000000000000   rsi: 0000000000000000   rdi: ffff82d08043b9ce
(XEN) rbp: ffff83043c47fa50   rsp: ffff83043c47fa50   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000ffff0000ffff
(XEN) r12: 0000000000000001   r13: 0000000000000000   r14: 0000000000000072
(XEN) r15: ffff83043c006c00   cr0: 0000000080050033   cr4: 00000000003526e0
(XEN) cr3: 000000081b39a000   cr2: ffff88016c058548
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: e010   cs: e008
(XEN) Xen code around <ffff82d08023513c> (spinlock.c#check_lock+0x3c/0x40):
(XEN)  98 83 f2 01 39 d0 75 02 <0f> 0b 5d c3 55 48 89 e5 f0 ff 05 a1 f6 1e 00 5d
(XEN) Xen stack trace from rsp=ffff83043c47fa50:
(XEN)    ffff83043c47fa68 ffff82d080235234 0000000000000005 ffff83043c47fa78
(XEN)    ffff82d080251df3 ffff83043c47fab8 ffff82d080251e80 ffff83043c47fac8
(XEN)    ffff83043c422580 ffff83042e973cd0 0000000000000005 ffff83042e9609e0
(XEN)    0000000000000072 ffff83043c47fae8 ffff82d08025795a ffff83043c47fb18
(XEN)    ffff83043c47fc18 ffff83043c47fc18 ffff83042e9609e0 ffff83043c47fba8
(XEN)    ffff82d080259be1 ffff83043c47fb10 ffff82d08023516b 0000000000000246
(XEN)    ffff83043c47fb28 0000000000000206 0000000000000002 ffff83043c47fb58
(XEN)    ffff82d080290e38 ffff83042e973cd0 ffff83043c532000 ffff83043c532000
(XEN)    ffff83042e973db0 ffff83043c47fb68 ffff82d080354dd0 ffff83043c47fc18
(XEN)    ffff82d080274e07 0000000000000040 ffff83042e9609e0 ffff83043c47fc18
(XEN)    ffff83043c47fc18 0000000000000072 ffff83043c006c00 ffff83043c47fbb8
(XEN)    ffff82d0802526f7 ffff83043c47fc08 ffff82d080273c17 ffff83043ff99d90
(XEN)    ffff83043c006c00 ffff83043c47fc08 ffff83043c006c00 ffff83042e9609e0
(XEN)    ffff83043c47fc18 0000000000000072 ffff83043c006c00 ffff83043c47fc48
(XEN)    ffff82d0802754d1 00000000feeff00c 00000fff000041ca 0000000000000002
(XEN)    ffff83042e9609e0 ffff83042e973cd0 0000000000000002 ffff83043c47fc88
(XEN)    ffff82d0802755a8 ffff83043c47fc70 0000000000000246 ffff83043c532000
(XEN)    000000000000006c ffff83043c006c00 0000000000000000 ffff83043c47fd28
(XEN)    ffff82d080279b4f ffff83043c532000 ffff83043c47fe00 ffff83043c47fcd8
(XEN)    ffff83042e973d20 ffff83043c47fcf0 ffff830400000325 0000000000000246
(XEN) Xen call trace:
(XEN)    [<ffff82d08023513c>] spinlock.c#check_lock+0x3c/0x40
(XEN)    [<ffff82d080235234>] _spin_is_locked+0x11/0x4d
(XEN)    [<ffff82d080251df3>] pcidevs_locked+0x10/0x17
(XEN)    [<ffff82d080251e80>] pci_get_pdev+0x2f/0xfd
(XEN)    [<ffff82d08025795a>] acpi_find_matched_drhd_unit+0x4d/0x11a
(XEN)    [<ffff82d080259be1>] msi_msg_write_remap_rte+0x2f/0x749
(XEN)    [<ffff82d0802526f7>] iommu_update_ire_from_msi+0x36/0x38
(XEN)    [<ffff82d080273c17>] msi.c#write_msi_msg+0x3f/0x188
(XEN)    [<ffff82d0802754d1>] __setup_msi_irq+0x3a/0x5c
(XEN)    [<ffff82d0802755a8>] setup_msi_irq+0xb5/0xf7
(XEN)    [<ffff82d080279b4f>] map_domain_pirq+0x445/0x653
(XEN)    [<ffff82d08027aa99>] allocate_and_map_msi_pirq+0x10d/0x184
(XEN)    [<ffff82d080291258>] physdev_map_pirq+0x1f8/0x26b
(XEN)    [<ffff82d0802919a6>] do_physdev_op+0x595/0x110f
(XEN)    [<ffff82d080352db0>] pv_hypercall+0x1ef/0x42c
(XEN)    [<ffff82d080356606>] entry.o#test_all_events+0/0x30
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Xen BUG at spinlock.c:47
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agoxen: Drop repeated semicolons
Andrew Cooper [Tue, 25 Jul 2017 10:40:40 +0000 (11:40 +0100)]
xen: Drop repeated semicolons

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen/link: Move .data.rel.ro sections into .rodata for final link
David Woodhouse [Tue, 25 Jul 2017 09:21:37 +0000 (10:21 +0100)]
xen/link: Move .data.rel.ro sections into .rodata for final link

This includes stuff like the hypercall tables which we really kind of want
to be read-only. And they were going into .data.read-mostly.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agoxen:Kconfig: Make SCIF built by default for ARM
Andrii Anisov [Tue, 18 Jul 2017 16:45:30 +0000 (19:45 +0300)]
xen:Kconfig: Make SCIF built by default for ARM

Both Renesas R-Car Gen2(ARM32) and Gen3(ARM64) are utilizing SCIF IP,
so make its serial driver built by default for ARM.

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Julien Grall <julien.grall@arm.com>
7 years agodocs: correct paragraph indention in xen-tscmode
Olaf Hering [Wed, 24 May 2017 09:12:40 +0000 (11:12 +0200)]
docs: correct paragraph indention in xen-tscmode

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agodocs: replace xm with xl in xen-tscmode
Olaf Hering [Wed, 24 May 2017 09:12:24 +0000 (11:12 +0200)]
docs: replace xm with xl in xen-tscmode

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxen: RTDS: rearrange members of control structures
Dario Faggioli [Fri, 23 Jun 2017 10:55:19 +0000 (12:55 +0200)]
xen: RTDS: rearrange members of control structures

Nothing changed in `pahole` output, in terms of holes
and padding, but some fields have been moved, to put
related members in the same cache line.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: credit2: rearrange members of control structures
Dario Faggioli [Fri, 23 Jun 2017 10:55:12 +0000 (12:55 +0200)]
xen: credit2: rearrange members of control structures

With the aim of improving memory size and layout, and
at the same time trying to make related fields reside
in the same cacheline.

Here's a summary of the output of `pahole`, with and
without this patch, for the affected data structures.

csched2_runqueue_data:
 * Before:
    size: 216, cachelines: 4, members: 14
    sum members: 208, holes: 2, sum holes: 8
    last cacheline: 24 bytes
 * After:
    size: 208, cachelines: 4, members: 14
    last cacheline: 16 bytes

csched2_private:
 * Before:
    size: 120, cachelines: 2, members: 8
    sum members: 112, holes: 1, sum holes: 4
    padding: 4
    last cacheline: 56 bytes
 * After:
    size: 112, cachelines: 2, members: 8
    last cacheline: 48 bytes

csched2_vcpu:
 * Before:
    size: 112, cachelines: 2, members: 14
    sum members: 108, holes: 1, sum holes: 4
    last cacheline: 48 bytes
 * After:
    size: 112, cachelines: 2, members: 14
    padding: 4
    last cacheline: 48 bytes

While there, improve the wording, style and alignment
of comments too.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
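
The kind of saving that `pahole` reports above comes from member ordering. A
minimal sketch, using two made-up structures (not the Credit2 ones): placing
the widest members first removes the alignment holes the compiler would
otherwise insert.

```c
#include <stddef.h>
#include <stdint.h>

/* Narrow members interleaved with wide ones leave holes. */
struct rq_before {
    uint8_t  flag;      /* followed by padding up to counter's alignment */
    uint64_t counter;
    uint8_t  state;     /* followed by padding up to id's alignment */
    uint32_t id;
};

/* Same members, widest first: only tail padding remains. */
struct rq_after {
    uint64_t counter;
    uint32_t id;
    uint8_t  flag;
    uint8_t  state;
};

/* Bytes saved per instance by the reordering on the current ABI. */
size_t bytes_saved(void)
{
    return sizeof(struct rq_before) - sizeof(struct rq_after);
}
```

On a typical LP64 ABI the first layout usually occupies 24 bytes and the
second 16, but the exact figures depend on the target's alignment rules, so
checking with `pahole` on the real build remains the reliable method.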
7 years agoxen: credit: rearrange members of control structures
Dario Faggioli [Fri, 23 Jun 2017 10:55:05 +0000 (12:55 +0200)]
xen: credit: rearrange members of control structures

With the aim of improving memory size and layout, and
at the same time trying to make related fields reside
in the same cacheline.

Here's a summary of the output of `pahole`, with and
without this patch, for the affected data structures.

csched_pcpu:
 * Before:
    size: 88, cachelines: 2, members: 6
    sum members: 80, holes: 1, sum holes: 4
    padding: 4
    paddings: 1, sum paddings: 5
    last cacheline: 24 bytes
 * After:
    size: 80, cachelines: 2, members: 6
    paddings: 1, sum paddings: 5
    last cacheline: 16 bytes

csched_vcpu:
 * Before:
    size: 72, cachelines: 2, members: 9
    padding: 2
    last cacheline: 8 bytes
 * After:
    same numbers, but move some fields to put
    related fields in same cache line.

csched_private:
 * Before:
    size: 152, cachelines: 3, members: 17
    sum members: 140, holes: 2, sum holes: 8
    padding: 4
    paddings: 1, sum paddings: 5
    last cacheline: 24 bytes
 * After:
    same numbers, but move some fields to put
    related fields in same cache line.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: credit2: make the cpu to runqueue map per-cpu
Dario Faggioli [Fri, 23 Jun 2017 10:54:59 +0000 (12:54 +0200)]
xen: credit2: make the cpu to runqueue map per-cpu

Instead of keeping an NR_CPUS big array of int-s,
directly inside csched2_private, use a per-cpu
variable.

That's especially beneficial (in terms of saved
memory) when there are multiple instances of Credit2 (in
different cpupools), and also helps fitting
csched2_private itself into CPU caches.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
7 years agoxen: credit2: allocate runqueue data structure dynamically
Dario Faggioli [Fri, 23 Jun 2017 10:54:52 +0000 (12:54 +0200)]
xen: credit2: allocate runqueue data structure dynamically

Instead of keeping an NR_CPUS big array of csched2_runqueue_data
elements, directly inside the csched2_private structure, allocate
it dynamically.

This has two positive effects:
- reduces the size of csched2_private significantly, which is
  especially good in case there are multiple instances of Credit2
  (in different cpupools), and is also good from the point
  of view of fitting the struct into CPU caches;
- we can use nr_cpu_ids as array size, which may be significantly
  smaller than NR_CPUS

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@citrix.com>
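
A minimal sketch of the change described in the commit above, assuming a
user-space setting with calloc() standing in for the hypervisor's allocator.
The structure and function names are invented for the example; only the idea
(a pointer sized by the number of CPUs actually present, instead of an
embedded NR_CPUS array) comes from the commit message.

```c
#include <stdlib.h>

/* Hypothetical per-runqueue data. */
struct runqueue_data {
    int id;
};

struct sched_private {
    unsigned int nr_rqd;
    /* Previously this would have been: struct runqueue_data rqd[NR_CPUS]; */
    struct runqueue_data *rqd;
};

/* Allocate one entry per possible CPU id; returns 0 on success, -1 on error. */
int sched_private_init(struct sched_private *prv, unsigned int nr_cpu_ids)
{
    prv->nr_rqd = 0;
    prv->rqd = NULL;

    if ( nr_cpu_ids == 0 )
        return -1;

    prv->rqd = calloc(nr_cpu_ids, sizeof(*prv->rqd));
    if ( !prv->rqd )
        return -1;

    prv->nr_rqd = nr_cpu_ids;
    for ( unsigned int i = 0; i < nr_cpu_ids; i++ )
        prv->rqd[i].id = -1;          /* mark every runqueue as inactive */

    return 0;
}

void sched_private_free(struct sched_private *prv)
{
    free(prv->rqd);
    prv->rqd = NULL;
    prv->nr_rqd = 0;
}
```

The private structure itself shrinks to a count and a pointer, and the array
costs memory only in proportion to the CPUs the system really has.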
7 years agotools: Drop xc_cpuid_check() and bindings
Andrew Cooper [Mon, 17 Jul 2017 12:38:03 +0000 (13:38 +0100)]
tools: Drop xc_cpuid_check() and bindings

There are no current users which I can locate.  One piece of xend which didn't
move forwards into xl/libxl is this:

  #   Configure host CPUID consistency checks, which must be satisfied for this
  #   VM to be allowed to run on this host's processor type:
  #cpuid_check=[ '1:ecx=xxxxxxxxxxxxxxxxxxxxxxxxxx1xxxxx' ]
  # - Host must have VMX feature flag set

The implementation of xc_cpuid_check() is conceptually broken.  Dom0's view of
CPUID is not the appropriate view to check, and will be wrong in the presence
of CPUID masking/faulting, and for HVM-based toolstack domains.

If it turns out that the functionality is required, it should be implemented
in terms of XEN_SYSCTL_get_cpuid_policy to use the proper CPUID view.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoxenconsole: Add option to xenconsole to always forward console input
Felix Schmoll [Thu, 20 Jul 2017 07:47:48 +0000 (09:47 +0200)]
xenconsole: Add option to xenconsole to always forward console input

Currently the default behaviour of the xenconsole client is to
ignore any input to stdin, unless stdin and stdout are both
ttys. The new option allows this to be manually overridden, causing
the client to forward input regardless.

Signed-off-by: Felix Schmoll <eggi.innovations@gmail.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
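
The forwarding decision described in the commit above can be sketched as
below. The function names and the 'always_forward' flag are assumptions for
the example (the commit message does not name the option), and this is not
the xenconsole source.

```c
#include <stdbool.h>
#include <unistd.h>

/*
 * Default: forward input only when both stdin and stdout are ttys.
 * With the (hypothetically named) always_forward flag set, forward anyway.
 */
bool should_forward_input(bool always_forward, bool stdin_is_tty,
                          bool stdout_is_tty)
{
    return always_forward || (stdin_is_tty && stdout_is_tty);
}

/* Same decision, evaluated against the process's real descriptors. */
bool forward_input_now(bool always_forward)
{
    return should_forward_input(always_forward,
                                isatty(STDIN_FILENO) != 0,
                                isatty(STDOUT_FILENO) != 0);
}
```

Keeping the pure decision separate from the isatty() calls makes the rule
easy to test without a terminal attached.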
7 years agoxen: credit2: enable per cpu runqueue creation
Praveen Kumar [Tue, 11 Apr 2017 16:15:17 +0000 (21:45 +0530)]
xen: credit2: enable per cpu runqueue creation

The patch introduces a new command line option 'cpu' that, when used, will
create one runqueue per logical pCPU. This may be useful for small systems,
and also for development, performance evaluation and comparison.

Signed-off-by: Praveen Kumar <kpraveen.lkml@gmail.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
7 years agokbdif: Define "feature-raw-pointer" and "request-raw-pointer"
Owen Smith [Mon, 3 Jul 2017 12:57:53 +0000 (12:57 +0000)]
kbdif: Define "feature-raw-pointer" and "request-raw-pointer"

Backends set "feature-raw-pointer" if they are capable of reporting
absolute positions without scaling the coordinates to screen
size. This should be set during the backend init.
Frontends set "request-raw-pointer" to request that backends
do not rescale absolute coordinates to screen size, and the
coordinates remain in the range [0, 0x7fff]. This request is
only applicable if "request-abs-pointer" is also set. Frontends
should set this value before setting Connected.

Signed-off-by: Owen Smith <owen.smith@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
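
To make the difference between raw and rescaled absolute coordinates
concrete, here is a small sketch of what a backend might do. The [0, 0x7fff]
range comes from the commit message above; the function name, the clamping
and the exact scaling formula are assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define RAW_MAX 0x7fff   /* upper bound of the raw coordinate range */

/*
 * Return the coordinate a backend would report for one axis.
 * raw_pointer == true  : pass the raw value through unscaled.
 * raw_pointer == false : rescale to [0, screen_extent - 1].
 */
uint32_t report_abs_coord(uint32_t raw, uint32_t screen_extent,
                          bool raw_pointer)
{
    if ( raw > RAW_MAX )
        raw = RAW_MAX;                     /* clamp out-of-range input */

    if ( raw_pointer || screen_extent == 0 )
        return raw;

    /* 64-bit intermediate avoids overflow for large screen sizes. */
    return (uint32_t)(((uint64_t)raw * (screen_extent - 1)) / RAW_MAX);
}
```

With a 1920-pixel-wide screen, the raw maximum 0x7fff maps to pixel 1919 when
scaling is on and stays 0x7fff when the frontend has requested raw values.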
7 years agox86/hvm: Drop more remains of the PVHv1 implementation
Andrew Cooper [Thu, 22 Jun 2017 10:30:00 +0000 (11:30 +0100)]
x86/hvm: Drop more remains of the PVHv1 implementation

These functions don't need is_hvm_{vcpu,domain}() predicates.

hvmop_set_evtchn_upcall_vector() does need the predicate to prevent a PV
caller accessing the hvm union, but swap the copy_from_guest() and
is_hvm_domain() predicate to avoid reading the hypercall parameter if we are
not going to use it.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
7 years agox86/hvm: Fixes to hvmemul_insn_fetch()
Andrew Cooper [Tue, 9 May 2017 14:31:54 +0000 (15:31 +0100)]
x86/hvm: Fixes to hvmemul_insn_fetch()

Force insn_off to a single byte, as offset can wrap around or truncate with
respect to sh_ctxt->insn_buf_eip under a number of normal circumstances.

Furthermore, don't use an ASSERT() for bounds checking the write into
hvmemul_ctxt->insn_buf[].

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/evtchn: Restrict the ops usable in do_event_channel_op_compat()
Andrew Cooper [Tue, 18 Jul 2017 14:21:46 +0000 (15:21 +0100)]
x86/evtchn: Restrict the ops usable in do_event_channel_op_compat()

This hypercall is unused by guests these days, but there was no restriction
on which subops could be used.  The following ops have been restricted, as
there is no suitable structure in the evtchn_op union.

  EVTCHNOP_reset
  EVTCHNOP_init_control
  EVTCHNOP_expand_array
  EVTCHNOP_set_priority

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
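
Restricting a legacy entry point to a subset of operations, as the commit
above does, is commonly written as a switch that rejects the unsupported
cases before dispatching. The enum values, the function name and the choice
of -ENOSYS below are illustrative assumptions, not the hypervisor's actual
definitions.

```c
#include <errno.h>

/* Hypothetical operation codes for the example. */
enum compat_op {
    OP_SEND,
    OP_STATUS,
    OP_RESET,
    OP_INIT_CONTROL,
    OP_EXPAND_ARRAY,
    OP_SET_PRIORITY,
};

/* Return 0 if 'op' may be used through the compat path, else -ENOSYS. */
int compat_op_check(enum compat_op op)
{
    switch ( op )
    {
    case OP_RESET:
    case OP_INIT_CONTROL:
    case OP_EXPAND_ARRAY:
    case OP_SET_PRIORITY:
        /* No matching member in the legacy argument union: refuse. */
        return -ENOSYS;

    default:
        return 0;
    }
}
```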
7 years agodocs: Fix the markdown for the com{1,2} keyword command line documentation
Andrew Cooper [Mon, 17 Jul 2017 13:56:51 +0000 (14:56 +0100)]
docs: Fix the markdown for the com{1,2} keyword command line documentation

No change in content.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoarm/p2m: Cleanup access to the host's p2m
Sergej Proskurin [Tue, 18 Jul 2017 10:33:52 +0000 (12:33 +0200)]
arm/p2m: Cleanup access to the host's p2m

This commit replaces direct accesses to the host's p2m
(&d->arch.p2m) with the macro "p2m_get_hostp2m". This macro improves
readability and makes it easier to tell the host's p2m apart from
alternative p2m's, i.e., those of the altp2m subsystem that will be
submitted in the future.

Signed-off-by: Sergej Proskurin <proskurin@sec.in.tum.de>
Acked-by: Julien Grall <julien.grall@arm.com>
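
The accessor described in the commit above can be sketched as follows. The
macro name and the &d->arch.p2m expression are taken from the commit message;
the structure definitions and the p2m_set_root() helper are stripped-down
stand-ins invented for the example.

```c
/* Simplified stand-ins for the real structures. */
struct p2m_domain  { unsigned long root; };
struct arch_domain { struct p2m_domain p2m; };
struct domain      { struct arch_domain arch; };

/* One place that knows where the host p2m lives. */
#define p2m_get_hostp2m(d) (&(d)->arch.p2m)

/* Hypothetical caller: no longer spells out d->arch.p2m itself. */
void p2m_set_root(struct domain *d, unsigned long root)
{
    struct p2m_domain *p2m = p2m_get_hostp2m(d);

    p2m->root = root;
}
```

If alternative p2m's are added later, callers that mean "the host one" are
already marked as such, and only the macro needs to know the layout.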
7 years agotools/xen-mceinj: add support of injecting LMCE
Haozhong Zhang [Wed, 12 Jul 2017 02:04:40 +0000 (10:04 +0800)]
tools/xen-mceinj: add support of injecting LMCE

If option '-l' or '--lmce' is specified and the host supports LMCE,
xen-mceinj will inject LMCE to CPU specified by '-c' (or CPU0 if '-c'
is not present).

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools/libxc: add support of injecting MC# to specified CPUs
Haozhong Zhang [Wed, 12 Jul 2017 02:04:39 +0000 (10:04 +0800)]
tools/libxc: add support of injecting MC# to specified CPUs

Though XEN_MC_inject_v2 allows injecting MC# to specified CPUs, the
current xc_mca_op() does not use this feature and does not provide an
interface to callers. This commit adds a new xc_mca_op_inject_v2() that
receives a cpumap providing the set of target CPUs.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/mce: add support of vLMCE injection to XEN_MC_inject_v2
Haozhong Zhang [Fri, 14 Jul 2017 10:44:58 +0000 (12:44 +0200)]
x86/mce: add support of vLMCE injection to XEN_MC_inject_v2

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP
Haozhong Zhang [Fri, 14 Jul 2017 10:44:23 +0000 (12:44 +0200)]
x86/vmce, tools/libxl: expose LMCE capability in guest MSR_IA32_MCG_CAP

If LMCE is supported by host and ' mca_caps = [ "lmce" ] ' is present
in xl config, the LMCE capability will be exposed in guest MSR_IA32_MCG_CAP.
By default, LMCE is not exposed to guest so as to keep the backwards migration
compatibility.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com> for hypervisor side
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/vmce: enable injecting LMCE to guest on Intel host
Haozhong Zhang [Fri, 14 Jul 2017 10:44:01 +0000 (12:44 +0200)]
x86/vmce: enable injecting LMCE to guest on Intel host

Inject LMCE to guest if the host MCE is LMCE and the affected vcpu is
known. Otherwise, broadcast MCE to all vcpus on Intel host.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/vmce: emulate MSR_IA32_MCG_EXT_CTL
Haozhong Zhang [Fri, 14 Jul 2017 10:43:27 +0000 (12:43 +0200)]
x86/vmce: emulate MSR_IA32_MCG_EXT_CTL

If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, then allow guest
to read/write MSR_IA32_MCG_EXT_CTL.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
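
The gating described in the commit above can be sketched as a pair of MSR
handlers that only succeed when the guest's MCG_CAP advertises LMCE. The bit
positions follow the Intel SDM (MCG_LMCE_P is bit 27 of IA32_MCG_CAP, LMCE_EN
is bit 0 of IA32_MCG_EXT_CTL); the structure, the function names and the
rejection of reserved bits are assumptions for the example, not Xen's code.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCG_LMCE_P          (UINT64_C(1) << 27)  /* in IA32_MCG_CAP */
#define MCG_EXT_CTL_LMCE_EN (UINT64_C(1) << 0)   /* in IA32_MCG_EXT_CTL */

/* Hypothetical per-vCPU virtual MCE state. */
struct vmce_state {
    uint64_t mcg_cap;
    uint64_t mcg_ext_ctl;
};

/* Read handler: false means "not available", i.e. the caller would fault. */
bool vmce_rdmsr_ext_ctl(const struct vmce_state *v, uint64_t *val)
{
    if ( !(v->mcg_cap & MCG_LMCE_P) )
        return false;

    *val = v->mcg_ext_ctl;
    return true;
}

/* Write handler: also refuses any bit other than LMCE_EN (an assumption). */
bool vmce_wrmsr_ext_ctl(struct vmce_state *v, uint64_t val)
{
    if ( !(v->mcg_cap & MCG_LMCE_P) || (val & ~MCG_EXT_CTL_LMCE_EN) )
        return false;

    v->mcg_ext_ctl = val;
    return true;
}
```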
7 years agox86/domctl: generalize the restore of vMCE parameters
Haozhong Zhang [Fri, 14 Jul 2017 10:42:35 +0000 (12:42 +0200)]
x86/domctl: generalize the restore of vMCE parameters

vMCE parameters in struct xen_domctl_ext_vcpucontext were extended in
the past, and are likely to be extended in the future. When migrating a
PV domain from old Xen, XEN_DOMCTL_set_ext_vcpucontext should handle
the differences.

Instead of adding ad-hoc handling code at each extension, we introduce
an array to record sizes of the current and all past versions of vMCE
parameters, and search for the largest one that does not exceed the
size of passed-in parameters to determine vMCE parameters that will be
restored. If vMCE parameters are extended in the future, we only need
to adapt the array to reflect the extension.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
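
The lookup described in the commit above can be sketched as a search over a
table of known parameter-block sizes. The sizes in the table and the function
name are made up for the example; only the rule (pick the largest recorded
size that does not exceed what the caller passed in) comes from the commit
message.

```c
#include <stddef.h>

/* Hypothetical sizes of the past and current parameter layouts. */
static const size_t vmce_param_sizes[] = { 8, 24, 32 };

#define NR_SIZES (sizeof(vmce_param_sizes) / sizeof(vmce_param_sizes[0]))

/*
 * Return how many bytes of the passed-in parameters can be restored:
 * the largest known size that is <= passed_in, or 0 if none fits.
 */
size_t restorable_size(size_t passed_in)
{
    size_t best = 0;

    for ( size_t i = 0; i < NR_SIZES; i++ )
        if ( vmce_param_sizes[i] <= passed_in && vmce_param_sizes[i] > best )
            best = vmce_param_sizes[i];

    return best;
}
```

Supporting a future extension then means appending one entry to the table
rather than adding another special case to the restore path.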
7 years agotools/libxl: Fix a segment fault when mmio_hole is set in hvm.cfg
Xiong Zhang [Thu, 13 Jul 2017 02:03:39 +0000 (10:03 +0800)]
tools/libxl: Fix a segment fault when mmio_hole is set in hvm.cfg

When a valid mmio_hole is set in hvm.cfg, a segmentation fault happens when
accessing the localents pointer, because the array it points to is not large
enough to store the appended mmio_hole_size parameter.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoVT-d: fix VF of RC integrated PF matched to wrong VT-d unit
Chao Gao [Fri, 7 Jul 2017 14:46:23 +0000 (16:46 +0200)]
VT-d: fix VF of RC integrated PF matched to wrong VT-d unit

The problem is that for a VF of an RC integrated PF (e.g. the PF's BDF is
00:02.0), we would wrongly use 00:00.0 to search for the VT-d unit.

If a PF is an extended function, the BDF of a traditional function within the
same device should be used to search for the VT-d unit. Otherwise, the real
BDF of the PF should be used. According to the PCI-e spec, an extended
function is a function within an ARI device whose Function Number is greater
than 7. The original code tried to tell them apart by checking PCI_SLOT(), but
lacked a counterpart of pci_ari_enabled() (a function that exists in the Linux
kernel). Without a check for whether ARI is enabled, an RC integrated PF with
PCI_SLOT() > 0 is wrongly classified as an extended function. Note that an RC
integrated function isn't within an ARI device and thus cannot be an extended
function; in this case the real BDF should be used.

Considering that the 'is_extfn' field of struct pci_dev has been passed down
from Domain0 to indicate whether the function is an extended function, this
patch just looks up the 'is_extfn' field of the PF's struct pci_dev and sets
'devfn' to 0 when 'is_extfn' is true.

Reported-by: Crawford, Eric R <Eric.R.Crawford@intel.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
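
The selection rule in the last paragraph of the commit above can be sketched
as below. The structure and function name are invented for the example; the
rule itself (use devfn 0 when the PF is an extended function, otherwise the
PF's real devfn) is what the commit message describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, minimal view of a physical function. */
struct pf_info {
    uint8_t bus;
    uint8_t devfn;     /* (slot << 3) | function */
    bool    is_extfn;  /* extended function flag passed down from Domain0 */
};

/* devfn to use when searching for the VT-d unit of one of this PF's VFs. */
uint8_t drhd_lookup_devfn(const struct pf_info *pf)
{
    return pf->is_extfn ? 0 : pf->devfn;
}
```

For an RC integrated PF at 00:02.0 (devfn 0x10, not an extended function) the
lookup keeps 0x10 instead of collapsing to 00:00.0.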
7 years agox86emul: shifts/rotates don't allow LOCK prefix
Jan Beulich [Fri, 7 Jul 2017 14:43:35 +0000 (16:43 +0200)]
x86emul: shifts/rotates don't allow LOCK prefix

... just like e.g. SHLD/SHRD don't (see commit dee231b5a8 [x86emul:
improve LOCK handling]).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agogitignore: add tools/misc/xen-diag to .gitignore
Dongli Zhang [Tue, 4 Jul 2017 14:35:28 +0000 (22:35 +0800)]
gitignore: add tools/misc/xen-diag to .gitignore

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agoCODING_STYLE: removing trailing whitespaces
Julien Grall [Tue, 4 Jul 2017 12:12:13 +0000 (13:12 +0100)]
CODING_STYLE: removing trailing whitespaces

Signed-off-by: Julien Grall <julien.grall@arm.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agox86/psr.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:58:07 +0000 (17:58 +0100)]
x86/psr.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/msi.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:55:34 +0000 (17:55 +0100)]
x86/msi.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/numa.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:50:14 +0000 (17:50 +0100)]
x86/numa.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/mpparse.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:37:19 +0000 (17:37 +0100)]
x86/mpparse.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/io_apic.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:33:55 +0000 (17:33 +0100)]
x86/io_apic.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/smpboot.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:29:15 +0000 (17:29 +0100)]
x86/smpboot.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/srat.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:26:34 +0000 (17:26 +0100)]
x86/srat.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/xstate.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:23:53 +0000 (17:23 +0100)]
x86/xstate.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/monitor.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:20:47 +0000 (17:20 +0100)]
x86/monitor.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
7 years agox86/i8259.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:17:44 +0000 (17:17 +0100)]
x86/i8259.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/i387.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:14:56 +0000 (17:14 +0100)]
x86/i387.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/e820.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:03:34 +0000 (17:03 +0100)]
x86/e820.c: use plain bool

Note that e820_mtrr_clip remains s8 although the command line
parameter is bool, because it is a tristate variable.
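
As a rough illustration of why a tristate cannot be stored in a bool, the
sketch below keeps a third "not specified" state next to the two boolean
values. All names and the meaning of the default are assumptions made for
the example; this is not the code in e820.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tristate: -1 = not given on the command line, 0 = off, 1 = on. */
static int8_t sketch_clip = -1;

/* What a boolean command line parser would store once the option is given. */
static void sketch_parse_bool(bool value)
{
    sketch_clip = value ? 1 : 0;
}

/* Fall back to a platform-specific default only when nothing was specified. */
static bool sketch_clip_enabled(bool platform_default)
{
    return sketch_clip < 0 ? platform_default : sketch_clip != 0;
}
```

A plain bool would collapse "not specified" into "off", so the default
could no longer be distinguished from an explicit request.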

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/hpet.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 16:00:06 +0000 (17:00 +0100)]
x86/hpet.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/domctl: use plain bool
Wei Liu [Fri, 30 Jun 2017 15:56:45 +0000 (16:56 +0100)]
x86/domctl: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/dmi.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 15:54:18 +0000 (16:54 +0100)]
x86/dmi.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/debug.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 15:52:13 +0000 (16:52 +0100)]
x86/debug.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/apic.c: use plain bool
Wei Liu [Fri, 30 Jun 2017 15:49:33 +0000 (16:49 +0100)]
x86/apic.c: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agox86/acpi: use plain bool
Wei Liu [Fri, 30 Jun 2017 15:44:12 +0000 (16:44 +0100)]
x86/acpi: use plain bool

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
7 years agoConfig.mk: update mini-os changeset
Wei Liu [Tue, 4 Jul 2017 13:31:01 +0000 (14:31 +0100)]
Config.mk: update mini-os changeset

The changes contain implementations for some termios functions.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools/libxl/libxl_pci.c: Judge igd through class code instead of device ID
Xiong Zhang [Sun, 2 Jul 2017 19:25:53 +0000 (03:25 +0800)]
tools/libxl/libxl_pci.c: Judge igd through class code instead of device ID

IGD passthrough doesn't work on Skylake and Kabylake because their
Device IDs aren't in fixup_ids[]. Currently every Intel graphics ID has to
be added to fixup_ids[], which is hard to maintain.

This patch identifies Intel graphics by vendor ID (0x8086) and class
code (0x030000) instead. This supports both old and new Intel graphics
and reduces future maintenance work.
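
A minimal sketch of such a check is shown below. The two constants are the
values quoted in the message; the function name and the assumption that the
class code arrives as a 24-bit value (base class, subclass, programming
interface) are illustrative and not taken from libxl_pci.c.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VENDOR_INTEL  0x8086u
#define SKETCH_CLASS_VGA     0x030000u  /* base class 03, subclass 00, prog-if 00 */

/* True when the device looks like Intel integrated graphics. */
static bool sketch_is_intel_igd(uint16_t vendor_id, uint32_t class_code)
{
    /* Only the low 24 bits carry the class code. */
    return vendor_id == SKETCH_VENDOR_INTEL &&
           (class_code & 0xffffffu) == SKETCH_CLASS_VGA;
}
```

Unlike a device ID table, this check needs no update when a new graphics
generation appears.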

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools/libxl/libxl_pci.c: Extract sysfs_dev_get_class from libxl__grant_vga_iomem_perm...
Xiong Zhang [Sun, 2 Jul 2017 19:25:52 +0000 (03:25 +0800)]
tools/libxl/libxl_pci.c: Extract sysfs_dev_get_class from libxl__grant_vga_iomem_permission

No functional change. Just extract this function for the next patch and to
avoid code repetition.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools: utility to dump guest grant table info
Dongli Zhang [Sun, 2 Jul 2017 23:34:13 +0000 (07:34 +0800)]
tools: utility to dump guest grant table info

As both xen-netfront and xen-blkfront support multi-queue, they can
consume a lot of grant table references when many paravirtual devices
and vcpus are assigned to a guest. A guest domU might panic or hang due to
grant allocation failure when nr_grant_frames in the guest has reached its
max value.

This utility helps administrators diagnose Xen issues. So far there is
only one command, gnttab_query_size, which monitors the guest grant table
frame usage from the dom0 side, so that crash/hang analysis no longer
requires debugging on the guest kernel side.

It can be extended with new commands for more diagnostic functions. The
framework of xen-diag.c is taken from xen-livepatch.c.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agotools/libxc: add interface for GNTTABOP_query_size
Dongli Zhang [Sun, 2 Jul 2017 23:34:12 +0000 (07:34 +0800)]
tools/libxc: add interface for GNTTABOP_query_size

This patch adds a new interface for GNTTABOP_query_size in libxc to help
query the current and maximum number of grant table frames for a
specific domain.

Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agooxenstored: trim history in the frequent_ops function
Thomas Sanders [Tue, 28 Mar 2017 17:57:52 +0000 (18:57 +0100)]
oxenstored: trim history in the frequent_ops function

We were trimming the history of commits only at the end of each
transaction (regardless of how it ended).

Therefore if non-transactional writes were being made but no
transactions were being ended, the history would grow
indefinitely. Now we trim the history at regular intervals.

Signed-off-by: Thomas Sanders <thomas.sanders@citrix.com>
7 years agox86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL
Haozhong Zhang [Tue, 4 Jul 2017 08:44:03 +0000 (10:44 +0200)]
x86/vmx: expose LMCE feature via guest MSR_IA32_FEATURE_CONTROL

If MCG_LMCE_P is present in guest MSR_IA32_MCG_CAP, then set the LMCE and
LOCK bits in guest MSR_IA32_FEATURE_CONTROL. The Intel SDM requires those
bits to be set before software can enable LMCE.
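
The rule can be sketched as below. The bit positions (MCG_LMCE_P as bit 27
of IA32_MCG_CAP, LOCK as bit 0 and LMCE as bit 20 of IA32_FEATURE_CONTROL)
reflect a reading of the Intel SDM rather than anything stated in this
message, and all identifiers are hypothetical.

```c
#include <stdint.h>

#define SKETCH_MCG_LMCE_P            (1ULL << 27) /* IA32_MCG_CAP: LMCE supported */
#define SKETCH_FEATURE_CONTROL_LOCK  (1ULL << 0)  /* IA32_FEATURE_CONTROL: lock */
#define SKETCH_FEATURE_CONTROL_LMCE  (1ULL << 20) /* IA32_FEATURE_CONTROL: LMCE on */

/* Compute the guest-visible IA32_FEATURE_CONTROL value. */
static uint64_t sketch_guest_feature_control(uint64_t guest_mcg_cap,
                                             uint64_t feature_control)
{
    if (guest_mcg_cap & SKETCH_MCG_LMCE_P)
        feature_control |= SKETCH_FEATURE_CONTROL_LMCE |
                           SKETCH_FEATURE_CONTROL_LOCK;
    return feature_control;
}
```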

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mce_intel: detect and enable LMCE on Intel host
Haozhong Zhang [Tue, 4 Jul 2017 08:43:32 +0000 (10:43 +0200)]
x86/mce_intel: detect and enable LMCE on Intel host

Enable LMCE if it's supported by the host CPU. If the Xen boot parameter
"mce_fb = 1" is present, LMCE will be forcibly disabled.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mce: handle host LMCE
Haozhong Zhang [Tue, 4 Jul 2017 08:42:45 +0000 (10:42 +0200)]
x86/mce: handle host LMCE

A round of mce_softirq() may handle multiple deferred MCE's.
 1/ If all of them are LMCE's, then mce_softirq() is called on one CPU
    and should not wait for others.
 2/ If at least one of them is a non-local MCE, then mce_softirq()
    should sync with other CPUs.
mce_softirq() should check for those two cases and handle them accordingly.

Because mce_softirq() can be interrupted by MC# again, we should also
ensure that the deferred MCE handling in mce_softirq() is not affected by
a change in the result of that check.

A per-cpu list 'lmce_pending' is introduced to 'struct mc_telem_cpu_ctl'
along with the existing per-cpu list 'pending' for LMCE handling.

MC# handler mcheck_cmn_handler() ensures that
 1/ if all deferred MCE's on a CPU are LMCE's, then all of their
    telemetries will be only in 'lmce_pending' on that CPU;
 2/ if at least one of the deferred MCE's on a CPU is not an LMCE, then all
    telemetries of deferred MCE's on that CPU will be only in
    'pending' on that CPU.

Therefore, whether 'lmce_pending' is non-empty can be used in the MCE
softirq handler mce_softirq() to determine whether the first of the two
cases listed at the beginning applies.

mce_softirq() atomically moves deferred MCE's from either list
'lmce_pending' on the current CPU or lists 'pending' on the current or
other CPUs to list 'processing' on the current CPU, and then handles the
deferred MCE's in list 'processing'. A new MC# arriving before or after
the atomic move may change the result of the check, but it does not change
whether the MCE's in 'processing' are LMCE's or not, so mce_softirq() can
still handle 'processing' according to the result of the previous check.
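
The invariant maintained by the MC# handler can be modelled with two
counters standing in for the two per-CPU lists. This is a toy model under
that simplifying assumption, with hypothetical names; the real code works
on telemetry lists and moves entries atomically.

```c
#include <stdbool.h>

/* Counters stand in for the per-CPU lists 'pending' and 'lmce_pending'. */
struct sketch_cpu_telem {
    unsigned int pending;
    unsigned int lmce_pending;
};

/* Defer one MCE while keeping all entries of a CPU on a single list. */
static void sketch_defer_mce(struct sketch_cpu_telem *t, bool is_lmce)
{
    if (is_lmce && t->pending == 0) {
        /* Everything deferred so far is local: stay on 'lmce_pending'. */
        t->lmce_pending++;
        return;
    }
    /* At least one deferred MCE is non-local: merge everything into 'pending'. */
    t->pending += t->lmce_pending + 1;
    t->lmce_pending = 0;
}

/* The check made by the softirq handler: local-only work needs no sync. */
static bool sketch_is_lmce_only(const struct sketch_cpu_telem *t)
{
    return t->lmce_pending != 0;
}
```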

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mce: allow mce_barrier_{enter,exit} to return without waiting
Haozhong Zhang [Mon, 3 Jul 2017 15:43:45 +0000 (17:43 +0200)]
x86/mce: allow mce_barrier_{enter,exit} to return without waiting

Add a 'wait' argument to mce_barrier_{enter,exit}() to specify whether
the barrier functions should wait for mce_barrier_{enter,exit}() on other
CPUs or return immediately. This is useful when handling LMCE, where
mce_barrier_{enter,exit} are called on only one CPU.
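
A minimal sketch of the idea, using C11 atomics and a single arrival
counter, is given below. The structure layout and names are assumptions
for illustration; Xen's actual barrier implementation differs.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical barrier: counts the CPUs that have arrived. */
struct sketch_barrier {
    atomic_int val;  /* number of CPUs that have entered */
    int ncpus;       /* number of CPUs expected to enter */
};

/* Enter the barrier; with wait == false return at once without touching it. */
static void sketch_barrier_enter(struct sketch_barrier *bar, bool wait)
{
    if (!wait)
        return;

    atomic_fetch_add(&bar->val, 1);
    while (atomic_load(&bar->val) < bar->ncpus)
        ; /* spin until every expected CPU has arrived */
}
```

A caller handling an LMCE would pass wait == false, since no other CPU will
ever arrive at the barrier.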

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
7 years agox86/mce: fix comment of struct mc_telem_cpu_ctl
Haozhong Zhang [Mon, 3 Jul 2017 15:43:36 +0000 (17:43 +0200)]
x86/mce: fix comment of struct mc_telem_cpu_ctl

Since c/s cbc585158f ("x86/mce: eliminate unnecessary NR_CPUS-sized
arrays"), struct mc_telem_cpu_ctl was introduced and has been used as
the type of per-cpu variables rather than global variables. However,
some comments within it have not been updated accordingly.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
7 years agoxen/arm: lpae: Switch from bool_t to bool
Julien Grall [Fri, 30 Jun 2017 15:54:31 +0000 (16:54 +0100)]
xen/arm: lpae: Switch from bool_t to bool

We are phasing out the use of bool_t in the hypervisor code.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: mm: Use __func__ rather than plain name in format string
Julien Grall [Fri, 30 Jun 2017 15:54:30 +0000 (16:54 +0100)]
xen/arm: mm: Use __func__ rather than plain name in format string

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: mm: Introduce temporary variable in create_xen_entries
Julien Grall [Fri, 30 Jun 2017 15:54:29 +0000 (16:54 +0100)]
xen/arm: mm: Introduce temporary variable in create_xen_entries

This improves code readability and avoids dereferencing the table every
time we need to access the entry.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: mm: Use lpae_valid and lpae_table in create_xen_entries
Julien Grall [Fri, 30 Jun 2017 15:54:28 +0000 (16:54 +0100)]
xen/arm: mm: Use lpae_valid and lpae_table in create_xen_entries

The newly introduced lpae_valid and lpae_table helpers reduce the code and
make it more readable.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: p2m: Move lpae_* helpers in lpae.h
Julien Grall [Fri, 30 Jun 2017 15:54:27 +0000 (16:54 +0100)]
xen/arm: p2m: Move lpae_* helpers in lpae.h

lpae_* helpers can work on any LPAE translation tables. Move them to
lpae.h to allow other parts of Xen to use them.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: p2m: Rename p2m_valid, p2m_table, p2m_mapping and p2m_is_superpage
Julien Grall [Fri, 30 Jun 2017 15:54:26 +0000 (16:54 +0100)]
xen/arm: p2m: Rename p2m_valid, p2m_table, p2m_mapping and p2m_is_superpage

The helpers p2m_valid, p2m_table, p2m_mapping and p2m_is_superpage are
not specific to the stage-2 translation tables. They can also work on
any LPAE translation tables. So rename them to lpae_* and use pte.walk
to look up the value of the field.
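
A simplified sketch of such helpers is shown below. The union only models
the valid and table bits through a 'walk' view, the level convention
(level 3 holds pages, not tables) follows the LPAE format, and every name
prefixed with sketch_ is hypothetical rather than copied from lpae.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified LPAE descriptor: only the fields shared by all table formats. */
typedef union {
    uint64_t bits;
    struct {
        uint64_t valid:1;  /* descriptor is valid */
        uint64_t table:1;  /* points to a next-level table (levels 0-2) */
        uint64_t rest:62;  /* remaining bits, not modelled here */
    } walk;
} sketch_lpae_t;

static bool sketch_lpae_valid(sketch_lpae_t pte)
{
    return pte.walk.valid;
}

/* A table entry can only exist above the last level. */
static bool sketch_lpae_table(sketch_lpae_t pte, unsigned int level)
{
    return level < 3 && sketch_lpae_valid(pte) && pte.walk.table;
}

/* A block mapping is a valid non-table entry above the last level. */
static bool sketch_lpae_mapping(sketch_lpae_t pte, unsigned int level)
{
    return level < 3 && sketch_lpae_valid(pte) && !pte.walk.table;
}
```

Because the helpers only look at the generic 'walk' view, they apply to
stage-1 and stage-2 tables alike.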

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: lpae: Fix comments coding style
Julien Grall [Fri, 30 Jun 2017 15:54:25 +0000 (16:54 +0100)]
xen/arm: lpae: Fix comments coding style

Also adding one missing full stop + fix description

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: Move LPAE definition in a separate header
Julien Grall [Fri, 30 Jun 2017 15:54:24 +0000 (16:54 +0100)]
xen/arm: Move LPAE definition in a separate header

page.h is getting bigger. Move all LPAE definitions out into a separate
header. There are no functional changes.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: create_xen_entries: Use typesafe MFN
Julien Grall [Fri, 30 Jun 2017 15:54:23 +0000 (16:54 +0100)]
xen/arm: create_xen_entries: Use typesafe MFN

Add a bit more safety when using create_xen_entries.

Also, when destroying or modifying a mapping, the MFN is currently not used.
Rather than passing _mfn(0), use INVALID_MFN to stay consistent with the
other usages.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: livepatch: Redefine virt_to_mfn to support typesafe
Julien Grall [Fri, 30 Jun 2017 15:54:22 +0000 (16:54 +0100)]
xen/arm: livepatch: Redefine virt_to_mfn to support typesafe

The file xen/arch/arm/livepatch.c uses typesafe MFNs in most places.
The only caller of virt_to_mfn wraps it in _mfn(...).

To avoid the extra _mfn(...), redefine virt_to_mfn within
xen/arch/arm/livepatch.c to handle typesafe MFNs.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
7 years agoxen/arm: alternative: Redefine virt_to_mfn to support typesafe
Julien Grall [Fri, 30 Jun 2017 15:54:21 +0000 (16:54 +0100)]
xen/arm: alternative: Redefine virt_to_mfn to support typesafe

The file xen/arch/arm/alternative.c uses typesafe MFNs in most places.
The only caller of virt_to_mfn wraps it in _mfn(...).

To avoid the extra _mfn(...), redefine virt_to_mfn within
xen/arch/arm/alternative.c to handle typesafe MFNs.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: domain_build: Redefine virt_to_mfn to support typesafe
Julien Grall [Fri, 30 Jun 2017 15:54:20 +0000 (16:54 +0100)]
xen/arm: domain_build: Redefine virt_to_mfn to support typesafe

The file xen/arch/arm/domain_build.c uses typesafe MFNs in most places.
The only caller of virt_to_mfn wraps it in _mfn(...).

To avoid the extra _mfn(...), redefine virt_to_mfn within
arch/arm/domain_build.c to handle typesafe MFNs.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: mm: Redefine virt_to_mfn to support typesafe
Julien Grall [Fri, 30 Jun 2017 15:54:19 +0000 (16:54 +0100)]
xen/arm: mm: Redefine virt_to_mfn to support typesafe

The file xen/arch/arm/mm.c uses typesafe MFNs in most places. This
requires every caller of virt_to_mfn to wrap it in _mfn(...).

To avoid the extra _mfn(...), redefine virt_to_mfn within arch/arm/mm.c
to handle typesafe MFNs.

This patch also introduces __virt_to_mfn, so that virt_to_mfn can be
redefined easily.
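
The pattern can be sketched as follows. The wrapper type and accessors
mimic _mfn()/mfn_x() under sketch_ names, the raw helper plays the role of
__virt_to_mfn, and the identity virtual-to-physical mapping with 4K pages
is an assumption made purely so the example is self-contained.

```c
#include <stdint.h>

/* Wrapping the number in a struct makes mixing it with plain integers an error. */
typedef struct { unsigned long mfn; } sketch_mfn_t;

static inline sketch_mfn_t sketch_mfn(unsigned long m)
{
    return (sketch_mfn_t){ m };
}

static inline unsigned long sketch_mfn_x(sketch_mfn_t m)
{
    return m.mfn;
}

#define SKETCH_PAGE_SHIFT 12

/* Raw helper returning a plain integer (assumes virt == phys for the sketch). */
static inline unsigned long sketch_raw_virt_to_mfn(uintptr_t va)
{
    return (unsigned long)(va >> SKETCH_PAGE_SHIFT);
}

/* File-local redefinition: callers get the typesafe value directly. */
#define sketch_virt_to_mfn(va) sketch_mfn(sketch_raw_virt_to_mfn(va))
```

Keeping the raw helper separate is what lets each file choose whether its
local virt_to_mfn yields the plain or the typesafe form.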

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: p2m: Redefine mfn_to_page and page_to_mfn to use typesafe
Julien Grall [Fri, 30 Jun 2017 15:54:18 +0000 (16:54 +0100)]
xen/arm: p2m: Redefine mfn_to_page and page_to_mfn to use typesafe

The file xen/arch/arm/p2m.c uses typesafe MFNs in most places.
This requires callers of mfn_to_page and page_to_mfn to use _mfn/mfn_x.

To avoid the extra _mfn/mfn_x, redefine mfn_to_page and page_to_mfn within
xen/arch/arm/p2m.c to handle typesafe MFNs.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: mm: Use typesafe mfn for xenheap_mfn_*
Julien Grall [Fri, 30 Jun 2017 15:54:17 +0000 (16:54 +0100)]
xen/arm: mm: Use typesafe mfn for xenheap_mfn_*

Add more safety when using xenheap_mfn_*.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agoxen/arm: setup: Remove bogus xenheap_mfn_end in setup_mm for arm64
Julien Grall [Fri, 30 Jun 2017 15:54:16 +0000 (16:54 +0100)]
xen/arm: setup: Remove bogus xenheap_mfn_end in setup_mm for arm64

xenheap_mfn_end is storing an MFN and not a physical address. The value will
be reset after the loop. So drop this bogus xenheap_mfn_end.

Signed-off-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
7 years agox86/shadow: Switch to using bool
Andrew Cooper [Fri, 23 Jun 2017 11:26:31 +0000 (11:26 +0000)]
x86/shadow: Switch to using bool

 * sh_pin() has boolean properties, so switch its return type.
 * sh_remove_shadows() uses ints everywhere other than its stub.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
7 years agotools/libxenforeignmemory: add xenforeignmemory_map2 function
Igor Druzhinin [Wed, 28 Jun 2017 19:27:08 +0000 (20:27 +0100)]
tools/libxenforeignmemory: add xenforeignmemory_map2 function

The new function repeats the behavior of the first version
except it has an extended list of arguments which are subsequently
passed to mmap() call.

This is needed for QEMU deprivileging.

Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: reformat cpuid_flags
Marek Marczykowski-Górecki [Fri, 30 Jun 2017 13:16:59 +0000 (15:16 +0200)]
libxl: reformat cpuid_flags

Reverse sorting order, add blank lines at register change.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: make cpuid_flags array static const
Marek Marczykowski-Górecki [Fri, 30 Jun 2017 13:16:58 +0000 (15:16 +0200)]
libxl: make cpuid_flags array static const

This places it in .rodata instead of reconstructing it on the stack each time.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: fix osvm cpuid flag
Marek Marczykowski-Górecki [Fri, 30 Jun 2017 13:16:57 +0000 (15:16 +0200)]
libxl: fix osvm cpuid flag

It's bit 9 not 10 (which is ibs).
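
For illustration, a name-to-bit lookup over a static const table might look
like the sketch below. The bit numbers 9 and 10 come from the message; the
table layout, the spelling "osvw" for the flag name, and the assumption
that both bits live in the same CPUID register are not taken from libxl.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: flag name and its bit position in one register. */
struct sketch_cpuid_flag {
    const char *name;
    unsigned int bit;
};

/* static const keeps the table in read-only data instead of on the stack. */
static const struct sketch_cpuid_flag sketch_flags[] = {
    { "osvw", 9 },
    { "ibs", 10 },
};

/* Return the bit position for a flag name, or -1 if the name is unknown. */
static int sketch_flag_bit(const char *name)
{
    for (size_t i = 0; i < sizeof(sketch_flags) / sizeof(sketch_flags[0]); i++)
        if (strcmp(sketch_flags[i].name, name) == 0)
            return (int)sketch_flags[i].bit;
    return -1;
}
```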

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agolibxl: add more cpuid flags handling
Marek Marczykowski-Górecki [Fri, 30 Jun 2017 13:16:56 +0000 (15:16 +0200)]
libxl: add more cpuid flags handling

This is the result of parsing cpu_map.xml from libvirt.
The most important part is handling leaf 0x00000007, but while at it add
other bits too.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
7 years agovvmx: fix ept_sync() for nested p2m
Sergey Dyasli [Wed, 28 Jun 2017 09:35:45 +0000 (10:35 +0100)]
vvmx: fix ept_sync() for nested p2m

If ept_sync_domain() is called for np2m, the following happens:

    1. *np2m*::ept_data::invalidate cpumask is updated
    2. IPIs are sent for CPUs in domain_dirty_cpumask forcing vmexits
    3. vmx_vmenter_helper() checks *hostp2m*::ept_data::invalidate
       and does nothing

This is clearly a bug. Make ept_sync_domain() update hostp2m's
invalidate mask in the nested p2m case and make vmx_vmenter_helper()
invalidate EPT translations for all EPTPs if nested virt is enabled.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>