xenbits.xensource.com Git - xen.git/log
xen.git
6 years ago x86/domctl: don't pause the whole domain if only getting vcpu state
Alexandru Isaila [Mon, 10 Sep 2018 14:27:00 +0000 (16:27 +0200)]
x86/domctl: don't pause the whole domain if only getting vcpu state

This patch changes hvm_save_one() to save one typecode from one vcpu.
Now that the save functions get data from a single vcpu, we can pause
the specific vcpu instead of the whole domain.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: remove redundant save functions
Alexandru Isaila [Mon, 10 Sep 2018 14:27:00 +0000 (16:27 +0200)]
x86/hvm: remove redundant save functions

This patch removes the redundant save functions and renames the
save_one* to save. It then changes the domain param to vcpu in the
save funcs and adapts print messages in order to match the format of the
other save related messages.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/domctl: use hvm_save_vcpu_handler
Alexandru Isaila [Mon, 10 Sep 2018 14:27:00 +0000 (16:27 +0200)]
x86/domctl: use hvm_save_vcpu_handler

This patch aims to use the new save_one functions in the hvm_save

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: add handler for save_one funcs
Alexandru Isaila [Mon, 10 Sep 2018 14:27:00 +0000 (16:27 +0200)]
x86/hvm: add handler for save_one funcs

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: introduce lapic_save_regs_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce lapic_save_regs_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: introduce lapic_save_hidden_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce lapic_save_hidden_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: introduce viridian_save_vcpu_ctxt_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce viridian_save_vcpu_ctxt_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
6 years ago x86/hvm: introduce hvm_save_mtrr_msr_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce hvm_save_mtrr_msr_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: introduce hvm_save_cpu_msrs_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce hvm_save_cpu_msrs_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: introduce hvm_save_cpu_xsave_states_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce hvm_save_cpu_xsave_states_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: introduce hvm_save_cpu_ctxt_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce hvm_save_cpu_ctxt_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/hvm: introduce hvm_save_tsc_adjust_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/hvm: introduce hvm_save_tsc_adjust_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/cpu: introduce vmce_save_vcpu_ctxt_one()
Alexandru Isaila [Mon, 10 Sep 2018 14:26:00 +0000 (16:26 +0200)]
x86/cpu: introduce vmce_save_vcpu_ctxt_one()

This is used to save data from a single instance.

Signed-off-by: Alexandru Isaila <aisaila@bitdefender.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/mm: change default value for suppress #VE in set_mem_access()
Vlad Ioan Topan [Wed, 12 Sep 2018 07:50:00 +0000 (09:50 +0200)]
x86/mm: change default value for suppress #VE in set_mem_access()

The default value for the "suppress #VE" bit set by set_mem_access()
currently depends on whether the call is made from the same domain (the
bit is set when called from another domain and cleared if called from
the same domain). This patch changes that behavior to inherit the old
suppress #VE bit value if it is already set and to set it to 1
otherwise, which is safer and more reliable.

Signed-off-by: Vlad Ioan Topan <itopan@bitdefender.com>
Signed-off-by: Adrian Pop <apop@bitdefender.com>
Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
6 years ago x86/iommu: add map-reserved dom0-iommu option to map reserved memory ranges
Roger Pau Monné [Fri, 7 Sep 2018 09:08:00 +0000 (11:08 +0200)]
x86/iommu: add map-reserved dom0-iommu option to map reserved memory ranges

Several people have reported hardware issues (malfunctioning USB
controllers) due to iommu page faults on Intel hardware. Those faults
are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
be worked around on VTd hardware by manually adding RMRR entries on
the command line; this is, however, limited to Intel hardware and quite
cumbersome to do.

In order to solve those issues add a new dom0-iommu=map-reserved
option that identity maps all regions marked as reserved in the memory
map. Note that regions used by devices emulated by Xen (LAPIC, IO-APIC
or PCIe MCFG regions) are specifically avoided. Note that this option
is available to all Dom0 modes (as opposed to the inclusive option
which only works for PV Dom0).

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
6 years ago x86/iommu: switch the hwdom mapping function to use page_get_type
Roger Pau Monné [Fri, 7 Sep 2018 09:08:00 +0000 (11:08 +0200)]
x86/iommu: switch the hwdom mapping function to use page_get_type

This avoids repeated calls to page_is_ram_type which improves
performance and makes the code easier to read.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago mm: introduce a helper to get the memory type of a page
Roger Pau Monné [Fri, 7 Sep 2018 09:08:00 +0000 (11:08 +0200)]
mm: introduce a helper to get the memory type of a page

Returns all the memory types applicable to a page.

This function is unimplemented for ARM.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago iommu: make iommu_inclusive_mapping a suboption of dom0-iommu
Roger Pau Monné [Fri, 7 Sep 2018 09:08:00 +0000 (11:08 +0200)]
iommu: make iommu_inclusive_mapping a suboption of dom0-iommu

Introduce a new dom0-iommu=map-inclusive generic option that
supersedes iommu_inclusive_mapping. The previous behavior is preserved
and the option should only be enabled by default on Intel hardware.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
6 years ago iommu: introduce dom0-iommu option
Roger Pau Monné [Fri, 7 Sep 2018 09:08:00 +0000 (11:08 +0200)]
iommu: introduce dom0-iommu option

To select the iommu configuration used by Dom0. This option supersedes
iommu=dom0-strict|dom0-passthrough.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago iommu: rename iommu_dom0_strict and iommu_passthrough
Roger Pau Monné [Fri, 7 Sep 2018 09:07:00 +0000 (11:07 +0200)]
iommu: rename iommu_dom0_strict and iommu_passthrough

To iommu_hwdom_strict and iommu_hwdom_passthrough which is more
descriptive of their usage. Also change their type from bool_t to
bool.

No functional change.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
6 years ago xen/sched: Re-position the domain_update_node_affinity() call during vcpu construction
Andrew Cooper [Thu, 6 Sep 2018 13:40:56 +0000 (14:40 +0100)]
xen/sched: Re-position the domain_update_node_affinity() call during vcpu construction

alloc_vcpu()'s call to domain_update_node_affinity() has existed for a decade,
but its effort is mostly wasted.

alloc_vcpu() is called in a loop for each vcpu, bringing them into existence.
The values of the affinity masks are still default, which is allcpus in
general, or a processor singleton for pinned domains.

Furthermore, domain_update_node_affinity() itself loops over all vcpus
accumulating the masks, making it quadratic with the number of vcpus.

Move it to be called once after all vcpus are constructed, which has the same
net effect, but with fewer intermediate memory allocations and less cpumask
arithmetic.
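
The cost argument above can be illustrated with a small counting model. This is only a sketch: update_affinity_cost() is a hypothetical stand-in that counts vcpu visits, not Xen's domain_update_node_affinity(), and the function names are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for domain_update_node_affinity(): it walks every
 * vcpu that exists so far and returns how many it visited. */
static unsigned long update_affinity_cost(unsigned int nr_existing_vcpus)
{
    unsigned long visited = 0;

    for (unsigned int i = 0; i < nr_existing_vcpus; i++)
        visited++;

    return visited;
}

/* Old placement: one update after each vcpu is constructed. */
unsigned long cost_update_per_vcpu(unsigned int nr_vcpus)
{
    unsigned long total = 0;

    for (unsigned int v = 1; v <= nr_vcpus; v++)
        total += update_affinity_cost(v);

    return total;
}

/* New placement: a single update once all vcpus exist. */
unsigned long cost_update_once(unsigned int nr_vcpus)
{
    return update_affinity_cost(nr_vcpus);
}
```

In this model, n vcpus cost n(n+1)/2 visits with the old placement and n visits with the new one.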

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Dario Faggioli <dfaggioli@suse.com>
6 years ago xen/domain: Remove trailing whitespace
Andrii Anisov [Tue, 11 Sep 2018 15:36:32 +0000 (18:36 +0300)]
xen/domain: Remove trailing whitespace

Signed-off-by: Andrii Anisov <andrii_anisov@epam.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/HVM: don't #GP/#SS on wrapping virt->linear translations
Jan Beulich [Tue, 11 Sep 2018 13:06:23 +0000 (15:06 +0200)]
x86/HVM: don't #GP/#SS on wrapping virt->linear translations

Real hardware wraps silently in most cases, so we should behave the
same. Also split real and VM86 mode handling, as the latter really
ought to have limit checks applied.
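
As a rough illustration of what "wraps silently" means for a 32-bit translation, the sketch below adds a segment base and an offset modulo 2^32 instead of reporting a fault. The function name is hypothetical and this is not the code touched by the patch; it only models the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of a 32-bit segment:offset translation that wraps.
 * The result is truncated to 32 bits, so a sum past 4 GiB wraps to a
 * low address rather than raising an exception. */
uint32_t virt_to_linear32(uint32_t seg_base, uint32_t offset)
{
    return (uint32_t)(seg_base + offset);
}
```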

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/shadow: a little bit of style cleanup
Jan Beulich [Tue, 11 Sep 2018 13:05:09 +0000 (15:05 +0200)]
x86/shadow: a little bit of style cleanup

Correct indentation of a piece of code, adjusting comment style at the
same time. Constify gl3e pointers and drop a bogus (and useless once
corrected) cast.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years ago xen: Fix inconsistent callers of panic()
Andrew Cooper [Wed, 29 Aug 2018 16:39:10 +0000 (16:39 +0000)]
xen: Fix inconsistent callers of panic()

Callers are inconsistent with whether they pass a newline to panic(),
including adjacent calls in the same function using different styles.

panic() not expecting a newline is inconsistent with most other printing
functions, which is most likely why we've gained so many inconsistencies.

Switch panic() to expect a newline, and update all callers which currently
lack a newline to include one.

This actually reduces the size of .rodata (0x07e3e8 down to 0x07e3a8) because
a number of strings are passed to both panic() and printk().  As they
previously differed by \n alone, they couldn't be merged.
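
A minimal sketch of the convention after this change, assuming a panic()-style formatter that emits the caller's text verbatim. format_panic(), shared_message() and the message text are hypothetical names for illustration, not Xen's implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a panic()-style formatter: it writes the
 * caller's text as given and does not append '\n' itself. */
int format_panic(char *buf, size_t len, const char *fmt, ...)
{
    va_list args;
    int n;

    va_start(args, fmt);
    n = vsnprintf(buf, len, fmt, args);
    va_end(args);

    return n;
}

/* One literal, newline included, can serve both a log call and a panic
 * call, so only one copy of the string needs to be stored. */
static const char no_mem_msg[] = "Out of memory on node %d\n";

const char *shared_message(void)
{
    return no_mem_msg;
}
```

Because both call styles now take the same literal, a toolchain that merges identical strings can keep a single copy, which is the effect on .rodata described above.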

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago SVM: limit GIF=0 region
Jan Beulich [Tue, 11 Sep 2018 09:06:41 +0000 (11:06 +0200)]
SVM: limit GIF=0 region

Use EFLAGS.IF for most ordinary purposes; there's in particular no need
to unduly defer NMI/#MC. Clear GIF only immediately before VMRUN itself.
This has the additional advantage that svm_stgi_label now indeed marks
the only place where GIF gets set.

Note regarding the main STI placement: Quite counterintuitively the
host's EFLAGS.IF continues to have a meaning while the guest runs; see
PM Vol 2 section "Physical (INTR) Interrupt Masking in EFLAGS". Hence we
need to set the flag for the duration of time being in guest context.
However, SPEC_CTRL_ENTRY_FROM_HVM wants to be carried out with EFLAGS.IF
clear.

Note regarding the main STGI placement: It could be moved further up,
but at present SPEC_CTRL_EXIT_TO_HVM is not NMI/#MC-safe.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago x86/HVM: split page straddling emulated accesses in more cases
Jan Beulich [Tue, 11 Sep 2018 09:03:46 +0000 (11:03 +0200)]
x86/HVM: split page straddling emulated accesses in more cases

Assuming consecutive linear addresses map to all RAM or all MMIO is not
correct. Nor is assuming that a page straddling MMIO access will access
the same emulating component for both parts of the access. If a guest
RAM read fails with HVMTRANS_bad_gfn_to_mfn and if the access straddles
a page boundary, issue accesses separately for both parts.
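
The splitting step can be sketched as plain address arithmetic. This assumes 4 KiB pages and an access no larger than one page; the helper names are hypothetical and the sketch does not reproduce the emulator's actual code paths.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed 4 KiB pages */

/* Number of bytes of [addr, addr + bytes) that fall in the first page.
 * The remainder (bytes - result) belongs to the following page. */
unsigned int first_part_len(uint64_t addr, unsigned int bytes)
{
    unsigned int offset = (unsigned int)(addr & (SKETCH_PAGE_SIZE - 1));
    unsigned int room = SKETCH_PAGE_SIZE - offset;

    return bytes <= room ? bytes : room;
}

/* Non-zero when the access crosses a page boundary and must be split. */
int straddles_page(uint64_t addr, unsigned int bytes)
{
    return first_part_len(addr, bytes) < bytes;
}
```

An 8-byte access at 0x1ffc would be issued as 4 bytes at 0x1ffc followed by 4 bytes at 0x2000, and each part can then be routed to RAM or MMIO handling independently.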

The extra call to known_gla() from hvmemul_write() is just to preserve
original behavior; for consistency the check also gets added to
hvmemul_rmw() (albeit I continue to be unsure whether we wouldn't better
drop both).

Note that the correctness of this depends on the MMIO caching used
elsewhere in the emulation code.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
6 years ago x86/HVM: add known_gla() emulation helper
Jan Beulich [Tue, 11 Sep 2018 09:03:14 +0000 (11:03 +0200)]
x86/HVM: add known_gla() emulation helper

... as a central place to do respective checking for whether the
translation for the linear address is available as well as usable.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
6 years ago x86/HVM: drop hvm_fetch_from_guest_linear()
Jan Beulich [Tue, 11 Sep 2018 09:02:37 +0000 (11:02 +0200)]
x86/HVM: drop hvm_fetch_from_guest_linear()

It can easily be expressed through hvm_copy_from_guest_linear(), and in
two cases this even simplifies callers.

Suggested-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
6 years ago xsm: fix clang build
Roger Pau Monné [Tue, 11 Sep 2018 09:01:13 +0000 (11:01 +0200)]
xsm: fix clang build

ebitmap.c:244:32: error: invalid conversion specifier 'Z' [-Werror,-Wformat-invalid-specifier]
               "match my size %Zd (high bit was %d)\n", mapunit,
                              ~^
ebitmap.c:245:16: error: format specifies type 'int' but the argument has type 'unsigned long'
      [-Werror,-Wformat]
               sizeof(u64) * 8, e->highbit);
               ^~~~~~~~~~~~~~~
ebitmap.c:245:33: error: data argument not used by format string [-Werror,-Wformat-extra-args]
               sizeof(u64) * 8, e->highbit);

Use %zd instead of %Zd, which is compliant with C99.
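
For reference, a small standalone example of the C99 'z' length modifier with a sizeof expression. The patch itself uses %zd; the sketch uses %zu because size_t is unsigned. The function name and message text are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* sizeof yields a size_t; the C99 'z' length modifier matches that type,
 * whereas 'Z' is a non-standard spelling that clang rejects. */
int format_unit_size(char *buf, size_t len)
{
    return snprintf(buf, len, "unit size %zu bits", sizeof(uint64_t) * 8);
}
```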

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
6 years ago x86/HVM: meet xentrace's expectations on emulation event data
Jan Beulich [Tue, 11 Sep 2018 09:00:01 +0000 (11:00 +0200)]
x86/HVM: meet xentrace's expectations on emulation event data

According to the logic in hvm_mmio_assist_process(), 64 bits of data are
expected with 64-bit addresses, and 32 bits of data with 32-bit ones. I
don't think this is very reasonable, but I'm also not going to touch the
consumer side, the more that it is anyway not very helpful for the code
here to only ever supply 32 bits of data (despite the field being 64
bits wide, and having been even in the 32-bit days of Xen).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
6 years ago docs: document ~/control/sysrq
Wei Liu [Wed, 5 Sep 2018 14:05:01 +0000 (15:05 +0100)]
docs: document ~/control/sysrq

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years ago mkdeb: use compression level 0
Wei Liu [Fri, 7 Sep 2018 10:41:31 +0000 (11:41 +0100)]
mkdeb: use compression level 0

This requires calling dpkg-deb directly and passing it -z0.

It reduces the time to run the mkdeb script from 14 seconds to 3
seconds on my workstation with SSD, from 87s to 15s on a machine
with HDD. The deb file grows from 49M to 58M.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago tools/mkrpm: switch payload to gzip to reduce turnaround time
Olaf Hering [Thu, 30 Aug 2018 10:05:11 +0000 (12:05 +0200)]
tools/mkrpm: switch payload to gzip to reduce turnaround time

rpmbuild -bb spends a lot of time compressing the binaries. Reduce the
turnaround time of 'make rpmball' by using gzip as compression tool.
This reduces the buildtime from 'w9.xzdio'/138 seconds to 'w1.gzdio'/88
seconds in my environment.
The downside is an increased filesize of xen.rpm, 19MB vs. 37MB.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl: don't set PoD target for PV guests
Wei Liu [Tue, 4 Sep 2018 16:15:23 +0000 (17:15 +0100)]
libxl: don't set PoD target for PV guests

Previously PoD target was unconditionally set for both PV and HVM
guests, but in fact PoD has always been an HVM (now PVH as well) only
feature.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
6 years ago x86/hvm: rearrange content of hvm.h
Wei Liu [Tue, 4 Sep 2018 16:15:25 +0000 (17:15 +0100)]
x86/hvm: rearrange content of hvm.h

Move enum and function declarations to first half of the file.

Static inline functions and macros, which reference HVM specific
fields directly are grouped together in second half of the file.

The movement is needed because in a later patch the second half is
going to be enclosed in CONFIG_HVM.

Pure code movement. No functional change.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago automation: specify -j$(nproc) in build script
Wei Liu [Thu, 6 Sep 2018 14:55:59 +0000 (15:55 +0100)]
automation: specify -j$(nproc) in build script

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Doug Goldstein <cardoe@cardoe.com>
6 years ago pvshim: introduce a PV shim defconfig
Roger Pau Monné [Fri, 7 Sep 2018 07:29:20 +0000 (09:29 +0200)]
pvshim: introduce a PV shim defconfig

In order to build a tailored pvshim-only binary from Xen. Switch the
PV shim build from the tools firmware into using the new defconfig.

A diff of the .config generated for the pvshim firmware build before
and after this change shows no differences.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/dmar: zap DMAR signature for dom0 once in TBOOT case
Zhenzhong Duan [Fri, 7 Sep 2018 07:27:19 +0000 (09:27 +0200)]
x86/dmar: zap DMAR signature for dom0 once in TBOOT case

Commit 6c298ecc1f ("vtd: Reinstate ACPI DMAR on system shutdown or
S3/S4/S5") did everything for acpi_dmar_zap() call to be unnecessary,
except for invoking the function from acpi_parse_dmar(), which
123c779379 ("VTd/dmar: Tweak how the DMAR table is clobbered")
added several years later.

Some stale comments are also removed. No functional change.

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
6 years ago xen/ARM+sched: Don't opencode %pv in printk()'s
Andrew Cooper [Wed, 29 Aug 2018 16:27:44 +0000 (16:27 +0000)]
xen/ARM+sched: Don't opencode %pv in printk()'s

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
6 years ago x86: PIT emulation is common to both PV and HVM
Wei Liu [Tue, 4 Sep 2018 16:15:22 +0000 (17:15 +0100)]
x86: PIT emulation is common to both PV and HVM

Move the file to x86 common code and change its name to emul-i8254.c.

Put HVM only code under CONFIG_HVM or is_hvm_domain.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: XENMEM_resource_ioreq_server is HVM only
Wei Liu [Thu, 6 Sep 2018 15:18:31 +0000 (16:18 +0100)]
x86: XENMEM_resource_ioreq_server is HVM only

Put the entire case branch under CONFIG_HVM.

Lift the check from hvm_get_ioreq_server_frame into its caller.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86: introduce and use a set of internal emulation flags
Wei Liu [Tue, 4 Sep 2018 16:15:19 +0000 (17:15 +0100)]
x86: introduce and use a set of internal emulation flags

Use these flags in has_* tests and emulation_flags_ok.

Not using raw flags directly enables DCE to kick in for has_* tests,
while at the same time makes sure emulation_flags_ok won't go out of
sync.
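
The idea can be sketched as follows. All names here are hypothetical (SKETCH_CONFIG_HVM stands in for a Kconfig switch, and which device is treated as HVM-only is an arbitrary choice for the example); this is not the set of macros the patch introduces.

```c
#include <assert.h>
#include <stdint.h>

/* Raw flag values as they might appear in a public interface (assumed). */
#define RAW_EMU_LAPIC  (1u << 0)
#define RAW_EMU_PIT    (1u << 1)

/* Hypothetical build-time switch standing in for CONFIG_HVM. */
#ifndef SKETCH_CONFIG_HVM
#define SKETCH_CONFIG_HVM 0
#endif

/* Internal flags: a feature that is compiled out becomes the constant 0. */
#if SKETCH_CONFIG_HVM
#define INT_EMU_LAPIC  RAW_EMU_LAPIC
#else
#define INT_EMU_LAPIC  0u
#endif
#define INT_EMU_PIT    RAW_EMU_PIT

#define INT_EMU_ALL    (INT_EMU_LAPIC | INT_EMU_PIT)

/* With INT_EMU_LAPIC == 0 this test is a compile-time constant false, so
 * code guarded by it can be dropped by dead code elimination. */
static inline int has_lapic(uint32_t flags)
{
    return !!(flags & INT_EMU_LAPIC);
}

static inline int has_pit(uint32_t flags)
{
    return !!(flags & INT_EMU_PIT);
}

/* The validity check is built from the same internal flags, so it cannot
 * drift out of sync with the has_* tests. */
static inline int flags_ok(uint32_t flags)
{
    return !(flags & ~INT_EMU_ALL);
}
```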

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago x86/viridian: set shutdown_code in response to CrashNotify
Paul Durrant [Fri, 10 Aug 2018 15:43:42 +0000 (16:43 +0100)]
x86/viridian: set shutdown_code in response to CrashNotify

When Windows writes the CrashNotify bit in the CRASH_CTL MSR then we know
it is crashing, so set the domain shutdown code appropriately.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago xen/domctl: Drop vcpu_alloc_lock
Andrew Cooper [Tue, 27 Feb 2018 17:22:40 +0000 (17:22 +0000)]
xen/domctl: Drop vcpu_alloc_lock

Since its introduction in c/s 8cbb5278e "x86/AMD: Add support for AMD's OSVW
feature in guests", the OSVW data has been corrected to be per-domain rather
than per-vcpu, and is initialised during XEN_DOMCTL_createdomain.

Furthermore, because XENPF_microcode_update uses hypercall continuations to
move between CPUs, it drops the vcpu_alloc_lock mid update, meaning that it
didn't provide the interlock guarantee that the OSVW patch was looking for in
the first place.

This interlock serves no purpose, so take the opportunity to drop it and
remove a global spinlock from the hypervisor.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86emul: fix test harness dependencies
Jan Beulich [Thu, 6 Sep 2018 14:05:52 +0000 (16:05 +0200)]
x86emul: fix test harness dependencies

The generated header files are what needs to spell out dependencies on
other (real) headers in the main Makefile here, not the intermediate
(helper) .o files produced through testcase.mk.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years ago x86/hvm: remove default ioreq server (again)
Paul Durrant [Thu, 6 Sep 2018 14:04:51 +0000 (16:04 +0200)]
x86/hvm: remove default ioreq server (again)

My recent patch [1] to qemu-xen-traditional removes the last use of the
'default' ioreq server in Xen. (This is a catch-all ioreq server that is
used if no explicitly registered I/O range is targeted).

This patch can be applied once that patch is committed, to remove the
(>100 lines of) redundant code in Xen.

The previous version of this patch caused a QEMU build failure. This has
been fixed by extending the #ifdef around deprecated HVM_PARAM declarations
to __XEN_TOOLS__ as well as __XEN__.

NOTE: The removal of the special case for HVM_PARAM_DM_DOMAIN in
      hvm_allow_set_param() is not directly related to removal of
      default ioreq servers. It could have been cleaned up at any time
      after commit 9a422c03 "x86/hvm: stop passing explicit domid to
      hvm_create_ioreq_server()". It is now added to the new
      deprecated sets introduced by this patch.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-08/msg00270.html

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years ago xen: add DEBUG_INFO Kconfig symbol
Olaf Hering [Thu, 6 Sep 2018 14:02:58 +0000 (16:02 +0200)]
xen: add DEBUG_INFO Kconfig symbol

Creating debug info during build is not strictly required at runtime.
Make it optional by introducing a new Kconfig knob "DEBUG_INFO".
This slightly reduces build time and disk usage, if disabled.

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years ago tools/xl: refuse to set number of vcpus to 0 via xl vcpu-set
Juergen Gross [Mon, 3 Sep 2018 12:59:42 +0000 (14:59 +0200)]
tools/xl: refuse to set number of vcpus to 0 via xl vcpu-set

Trying to set the number of vcpus of a domain to 0 isn't refused.
We should not allow that.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen: fill topology info for all present cpus
Juergen Gross [Fri, 31 Aug 2018 15:22:05 +0000 (17:22 +0200)]
xen: fill topology info for all present cpus

The topology information obtainable via XEN_SYSCTL_cputopoinfo is
filled rather weirdly: the size of the array is derived from the highest
online cpu number, so in case there are trailing offline cpus they
will not be included.

On a dual core system with 4 threads booted with smt=0 without this
patch xl info -n will print:

cpu_topology           :
cpu:    core    socket     node
  0:       0        0        0
  1:       0        0        0
  2:       1        0        0

while with this patch the output is:

cpu_topology           :
cpu:    core    socket     node
  0:       0        0        0
  1:       0        0        0
  2:       1        0        0
  3:       1        0        0
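
The difference between the two outputs above comes down to how the array is sized. A minimal sketch, with hypothetical helper names and a plain bool array standing in for the online cpu map:

```c
#include <assert.h>
#include <stdbool.h>

/* Old sizing, as described above: highest online cpu number + 1, so
 * trailing offline cpus are cut off. */
unsigned int entries_by_online(const bool *online, unsigned int nr_present)
{
    unsigned int n = 0;

    for (unsigned int cpu = 0; cpu < nr_present; cpu++)
        if (online[cpu])
            n = cpu + 1;

    return n;
}

/* New sizing: every present cpu gets an entry, online or not. */
unsigned int entries_by_present(unsigned int nr_present)
{
    return nr_present;
}
```

With cpus 0 and 2 online out of 4 present (the smt=0 case), the first function yields 3 entries and the second yields 4, matching the listings above.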

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years ago tools/libxl: correct vcpu affinity output with sparse physical cpu map
Juergen Gross [Fri, 31 Aug 2018 15:22:04 +0000 (17:22 +0200)]
tools/libxl: correct vcpu affinity output with sparse physical cpu map

With not all physical cpus online (e.g. with smt=0) the output of the
vcpu affinities is wrong, as the affinity bitmaps are capped after
nr_cpus bits, instead of using max_cpu_id.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago libxl: create control/sysrq xenstore node
Vitaly Kuznetsov [Tue, 4 Sep 2018 11:39:29 +0000 (13:39 +0200)]
libxl: create control/sysrq xenstore node

'xl sysrq' command doesn't work with modern Linux guests with the following
message in guest's log:

 xen:manage: sysrq_handler: Error -13 writing sysrq in control/sysrq

xenstore trace confirms:

 IN 0x24bd9a0 20180904 04:36:32 WRITE (control/sysrq )
 OUT 0x24bd9a0 20180904 04:36:32 ERROR (EACCES )

The problem seems to be in the fact that we don't pre-create control/sysrq
xenstore node and libxl_send_sysrq() doing libxl__xs_printf() creates it as
read-only. As we want to allow guests to clean 'control/sysrq' after the
requested action is performed, we need to make this node writable.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago tools/xl: fix output of xl vcpu-pin dry run with smt=0
Juergen Gross [Mon, 3 Sep 2018 11:26:30 +0000 (13:26 +0200)]
tools/xl: fix output of xl vcpu-pin dry run with smt=0

Fix another smt=0 fallout: xl -N vcpu-pin prints only parts of the
affinities as it is using the number of online cpus instead of the
maximum cpu number.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years ago x86: monitor.o is currently HVM only
Wei Liu [Tue, 4 Sep 2018 16:15:21 +0000 (17:15 +0100)]
x86: monitor.o is currently HVM only

There has been a plan to make PV work, but it is not yet there.  Provide
stubs to make it build with !CONFIG_HVM.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
6 years ago x86: change name of parameter for various invlpg functions
Wei Liu [Tue, 4 Sep 2018 16:15:18 +0000 (17:15 +0100)]
x86: change name of parameter for various invlpg functions

They all incorrectly named a parameter virtual address while it should
have been linear address.

Requested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years ago xen/domain: Make rangeset_domain_destroy() idempotent
Andrew Cooper [Mon, 3 Sep 2018 12:56:55 +0000 (13:56 +0100)]
xen/domain: Make rangeset_domain_destroy() idempotent

... and move it into the common __domain_destroy() path.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Fold xsm_free_security_domain() paths together
Andrew Cooper [Mon, 3 Sep 2018 11:48:13 +0000 (12:48 +0100)]
xen/domain: Fold xsm_free_security_domain() paths together

xsm_free_security_domain() is idempotent (both the dummy handler, and the
flask handler).  Move it into the shared __domain_destroy() path, and drop the
INIT_xsm flag from domain_create().

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Call lock_profile_deregister_struct() from common code
Andrew Cooper [Mon, 3 Sep 2018 11:10:48 +0000 (12:10 +0100)]
xen/domain: Call lock_profile_deregister_struct() from common code

lock_profile_register_struct() is called from common code, but the matching
deregister was previously only called from x86 code.

The practical upshot of this is that, when using CONFIG_LOCK_PROFILE, destroyed
domains on ARM (and in particular, the freed page behind struct domain) remain on
the lockprofile linked list, which will become corrupt when the page is reused.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Break _domain_destroy() out of domain_create() and complete_domain_destroy()
Andrew Cooper [Mon, 3 Sep 2018 10:52:17 +0000 (11:52 +0100)]
xen/domain: Break _domain_destroy() out of domain_create() and complete_domain_destroy()

This is the first step in making the destroy path idempotent, and using it in
place of the ad-hoc cleanup paths in the create path.

To begin with, the trivial free operations are broken out.  The rest of the
cleanup code will be moved as it is demonstrated (or made) to be idempotent.
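
The pattern described above can be sketched in a few lines. The structure and function names are hypothetical and far simpler than Xen's struct domain; the point is only that an idempotent destroy function can double as the error path of create.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_domain {      /* hypothetical, not Xen's struct domain */
    int *evtchn_table;
    char *name;
};

/* Safe on a partially constructed object and safe to call twice:
 * free(NULL) is a no-op and each pointer is cleared after being freed. */
void sketch_domain_destroy(struct sketch_domain *d)
{
    free(d->evtchn_table);
    d->evtchn_table = NULL;
    free(d->name);
    d->name = NULL;
}

/* The create path reuses destroy for cleanup instead of ad-hoc unwinding.
 * fail_second_alloc simulates an allocation failure part-way through. */
int sketch_domain_create(struct sketch_domain *d, int fail_second_alloc)
{
    d->evtchn_table = NULL;
    d->name = NULL;

    d->evtchn_table = calloc(16, sizeof(*d->evtchn_table));
    if (!d->evtchn_table)
        goto fail;

    d->name = fail_second_alloc ? NULL : calloc(32, 1);
    if (!d->name)
        goto fail;

    return 0;

 fail:
    sketch_domain_destroy(d);
    return -1;
}
```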

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years ago xen/domain: Prepare data for is_{pv,hvm}_domain() as early as possible
Andrew Cooper [Mon, 3 Sep 2018 13:22:16 +0000 (14:22 +0100)]
xen/domain: Prepare data for is_{pv,hvm}_domain() as early as possible

Given two subtle failures from getting this wrong before, and more cleanup on
the way, move the setting of d->guest_type as early as possible.

Note that despite moving the assignment of d->guest_type outside of the
is_idle_domain(d) check, it still behaves the same.  Previously, system
domains had no direct assignment of d->guest_type and behaved as PV guests
because guest_type_pv has the value 0.
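
The point about the value 0 can be seen in a small sketch.  The demo_* names
below are hypothetical; only the idea that the PV enumerator is 0 comes from
the text above.

```c
#include <stdbool.h>

/* Hypothetical mirror of the guest type enumeration; PV is value 0. */
enum demo_guest_type {
    demo_guest_type_pv = 0,
    demo_guest_type_hvm = 1,
};

struct demo_guest {
    enum demo_guest_type guest_type;
};

/*
 * A zero-filled structure which never has guest_type assigned compares
 * equal to the PV enumerator, so it is treated as PV by this predicate.
 */
bool demo_is_pv(const struct demo_guest *d)
{
    return d->guest_type == demo_guest_type_pv;
}
```

An explicit assignment of the PV value therefore yields the same predicate
results as leaving a zero-initialised field untouched.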

While tidying up the predicate, leave a comment referring to
is_system_domain(), and move the associated ASSERT() to be beside the
assignment.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years agox86emul: clean up AVX2 insn use in test harness
Jan Beulich [Tue, 4 Sep 2018 09:30:29 +0000 (11:30 +0200)]
x86emul: clean up AVX2 insn use in test harness

Drop the pretty pointless conditionals from code testing AVX insns and
properly use AVX2 mnemonics in code testing AVX2 insns (the test harness
is already requiring sufficiently new a compiler/assembler).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86emul: extend MASKMOV{Q,DQU} tests
Jan Beulich [Tue, 4 Sep 2018 09:29:22 +0000 (11:29 +0200)]
x86emul: extend MASKMOV{Q,DQU} tests

While deriving the first AVX512 pieces from existing code I've got the
(in the end wrong) impression that the emulation of these insns would be
broken. Besides testing that the instructions act as no-ops when the
controlling mask bits are all zero, add ones to also check that the data
merging actually works.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86emul: fix FMA scalar operand sizes
Jan Beulich [Tue, 4 Sep 2018 09:28:30 +0000 (11:28 +0200)]
x86emul: fix FMA scalar operand sizes

FMA insns, unlike the earlier AVX additions, don't use the low opcode
bit to distinguish between single and double vector elements. While the
difference is benign for packed flavors, the scalar ones need to use
VEX.W here. Oddly enough the table entries didn't even use
simd_scalar_fp, but uniformly used simd_packed_fp (implying the
distinction was by [VEX-encoded] opcode prefix).

Split simd_scalar_fp into simd_scalar_opc and simd_scalar_vexw, and
correct FMA scalar table entries to use the latter.
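
A minimal sketch of the distinction, assuming a helper which returns the
scalar element size in bytes.  The demo_* names and the helper itself are
hypothetical and do not reflect the emulator's actual table layout; the
opcode values in the comments are illustrative examples only.

```c
#include <stdbool.h>

/* Hypothetical counterparts of the two scalar size classes. */
enum demo_simd_size {
    demo_simd_scalar_opc,    /* element size selected by low opcode bit */
    demo_simd_scalar_vexw,   /* element size selected by VEX.W */
};

/* Returns 4 for single precision, 8 for double precision. */
unsigned int demo_scalar_bytes(enum demo_simd_size kind,
                               unsigned char opcode, bool vex_w)
{
    switch ( kind )
    {
    case demo_simd_scalar_opc:
        /* e.g. 0x18 (single) vs 0x19 (double) in the 0f38 space */
        return (opcode & 1) ? 8 : 4;

    case demo_simd_scalar_vexw:
        /* e.g. 0x99 covers both scalar FMA forms; VEX.W picks the size */
        return vex_w ? 8 : 4;
    }

    return 0;
}
```

With the opcode-based rule, opcode 0x99 would always be treated as double
precision, which is the kind of mis-sizing the split is meant to avoid.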

Also correct the scalar insn comments (they only ever use XMM registers
as operands).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agohvmloader: set entry point in linker script
Roger Pau Monné [Tue, 4 Sep 2018 09:27:41 +0000 (11:27 +0200)]
hvmloader: set entry point in linker script

Or else it defaults to using 0x100000 as the entry point, which might
or might not point to _start. This is a fix for 09b3907f93.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/hvm: Fix mapping corner case during task switching
Andrew Cooper [Wed, 1 Aug 2018 13:48:33 +0000 (13:48 +0000)]
x86/hvm: Fix mapping corner case during task switching

hvm_map_entry() can fail for a number of reasons, including for a misaligned
LDT/GDT access which crosses a 4K boundary.  Architecturally speaking, this
should be fixed, but Long Mode doesn't support task switches, and no 32bit OS
is going to misalign its LDT/GDT base, which is why this task isn't very high
on the TODO list.

However, the hvm_map_fail error label returns failure without raising an
exception, which interferes with hvm_task_switch()'s exception tracking, and
can cause it to finish and return to guest context as if the task switch had
completed successfully.

Resolve this corner case by folding all the failure paths together, which
causes an hvm_map_entry() failure to result in #TS[SEL].  hvm_unmap_entry()
copes fine with a NULL pointer so can be called unconditionally.
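
The shape of the folded failure path can be sketched as follows.  All demo_*
names are hypothetical; the sketch only relies on the unmap helper tolerating
NULL, as stated above.

```c
#include <stddef.h>

enum demo_result { DEMO_OKAY, DEMO_EXCEPTION };

/* Counts raised exceptions; stands in for injecting #TS into the guest. */
int demo_exceptions_raised;

static void demo_raise_ts(void)
{
    demo_exceptions_raised++;
}

/* Hypothetical mapping helper: returns NULL on failure. */
static void *demo_map(void *backing)
{
    return backing;
}

/* NULL-tolerant unmap, so callers need no conditionals. */
static void demo_unmap(void *p)
{
    if ( !p )
        return;
    /* a real implementation would drop the mapping here */
}

enum demo_result demo_load_seg(void *gdt_backing, void *ldt_backing)
{
    void *gdt = NULL, *ldt = NULL;
    enum demo_result rc = DEMO_EXCEPTION;

    gdt = demo_map(gdt_backing);
    if ( !gdt )
        goto fail;

    ldt = demo_map(ldt_backing);
    if ( !ldt )
        goto fail;

    rc = DEMO_OKAY;
    goto out;

 fail:
    demo_raise_ts();     /* every failure raises, mapping failures included */
 out:
    demo_unmap(ldt);     /* safe even when the mapping never happened */
    demo_unmap(gdt);
    return rc;
}
```

With a single failure label there is no path which returns failure without
also raising the exception the caller's tracking expects.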

In practice, this is just a latent corner case as all hvm_map_entry() failures
crash the domain, but it should be fixed nevertheless.

Finally, rename hvm_load_segment_selector() to task_switch_load_seg() to avoid
giving the impression that it is usable for general segment loading.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/mm: Drop {HAP,SHADOW}_ERROR() wrappers
Andrew Cooper [Wed, 24 Jan 2018 16:43:55 +0000 (16:43 +0000)]
x86/mm: Drop {HAP,SHADOW}_ERROR() wrappers

Unlike the PRINTK/DEBUG wrappers, these go straight out to the console, rather
than ending up in the debugtrace buffer.

A number of these users are followed by domain_crash(), and future changes
will want to combine the printk() into the domain_crash() call.  Expand these
wrappers in place, using XENLOG_ERR before a BUG(), and XENLOG_G_ERR before a
domain_crash().

Perform some %pv/PRI_mfn/etc cleanup while modifying the invocations, and
explicitly drop some calls which are unnecessary (bad shadow op, and the empty
stubs for incorrect sh_map_and_validate_gl?e() calls).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
6 years agoxen/x86: Ignore the automatically generated include/asm-x86/asm-macros.h
Andrew Cooper [Mon, 3 Sep 2018 16:45:52 +0000 (17:45 +0100)]
xen/x86: Ignore the automatically generated include/asm-x86/asm-macros.h

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
6 years agoThe hvmloader binary generated when using LLVM LD doesn't work
Roger Pau Monné [Mon, 3 Sep 2018 15:54:12 +0000 (17:54 +0200)]
The hvmloader binary generated when using LLVM LD doesn't work
properly and seems to get stuck while trying to generate and load the
ACPI tables. This is caused by the layout of the binary when linked
with LLVM LD.

LLVM LD has a different default linker script than GNU LD, and the
resulting hvmloader binary is slightly different:

LLVM LD:
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x000ff034 0x000ff034 0x00060 0x00060 R   0x4
  LOAD           0x000000 0x000ff000 0x000ff000 0x38000 0x38000 RWE 0x1000
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0

GNU LD:
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000080 0x00100000 0x00100000 0x36308 0x3fd74 RWE 0x10
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

Note that in the LLVM LD case (as with GNU LD) the .text section does
indeed have the address set to 0x100000 as requested on the command
line:

[ 1] .text             PROGBITS        00100000 001000 00dd10 00  AX  0   0 16

There's however the PHDR which is not present when using GNU LD.

Fix this by using a very simple linker script that generates the same
binary regardless of whether LLVM or GNU LD is used. By using a linker
script the usage of -Ttext can also be avoided by placing the desired
.text load address directly in the linker script.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/boot: silence MADT table entry logging
Jan Beulich [Mon, 3 Sep 2018 15:51:40 +0000 (17:51 +0200)]
x86/boot: silence MADT table entry logging

Logging disabled LAPIC / x2APIC entries with invalid local APIC IDs
(ones having "broadcast" meaning when used) isn't very useful, and can
be quite noisy on larger systems. Suppress their logging unless
opt_cpu_info is true.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86: assorted array_index_nospec() insertions
Jan Beulich [Mon, 3 Sep 2018 15:50:10 +0000 (17:50 +0200)]
x86: assorted array_index_nospec() insertions

Don't chance having Spectre v1 (including BCBS) gadgets. In some of the
cases the insertions are more of a precautionary nature rather than there
provably being a gadget, but I think we should err on the safe (secure)
side here.
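
For readers unfamiliar with the technique, here is a minimal sketch of the
generic branch-free clamping pattern.  The demo_* names are hypothetical, and
the real array_index_nospec() may be implemented differently (for example
with inline assembly).  The sketch assumes size <= LONG_MAX and an arithmetic
right shift of negative values, which is what GCC and Clang provide.

```c
#include <limits.h>

/*
 * All-ones when index < size, zero otherwise, computed without a
 * conditional branch which the CPU could mispredict.
 */
static inline unsigned long demo_index_mask(unsigned long index,
                                            unsigned long size)
{
    return ~(long)(index | (size - 1UL - index)) >>
           (sizeof(long) * CHAR_BIT - 1);
}

/* Clamp an index to 0 when it is out of bounds, even speculatively. */
unsigned long demo_index_nospec(unsigned long index, unsigned long size)
{
    return index & demo_index_mask(index, size);
}

int demo_lookup(const int *table, unsigned long nr, unsigned long idx)
{
    if ( idx >= nr )
        return -1;

    /*
     * The bounds check above can be speculated past; the clamp makes the
     * speculative access land on element 0 rather than out of bounds.
     */
    return table[demo_index_nospec(idx, nr)];
}
```

Architecturally the clamp never changes an in-bounds index, so inserting it
as a precaution does not alter behaviour.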

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoxen/arm: Fix dom0 boot following c/s 580c45869
Andrew Cooper [Fri, 31 Aug 2018 18:01:25 +0000 (19:01 +0100)]
xen/arm: Fix dom0 boot following c/s 580c45869

c/s 580c45869 "Call arch_domain_create() as early as possible in
domain_create()" overlooked the fact that ARM uses is_hardware_domain() in at
least two places during arch_domain_create().

The bug manifests as:

  (XEN) Freed 292kB init memory.
  (XEN) traps.c:2017:d0v0 HSR=0x938c0007 pc=0xc0639d08 gva=0xe0800004 gpa=0x00000010481004

when dom0 tries to use the vuart.  Judging by other uses of
is_hardware_domain(), I expect the x86 PVH dom0 boot is similarly broken.

Reposition the code which sets up hardware_domain so that the
is_hardware_domain() predicate works correctly all the way through domain
creation.

While moving it, leave a related comment explaining the positioning of the
is_priv assignment, which in hindsight should have been part of c/s ef765ec98
when exactly the same problem was discovered for the is_control_domain()
predicate.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Tested-by: Julien Grall <julien.grall@arm.com>
6 years agox86/hvm: Drop hvm_{vmx,svm} shorthands
Andrew Cooper [Tue, 28 Aug 2018 16:00:36 +0000 (16:00 +0000)]
x86/hvm: Drop hvm_{vmx,svm} shorthands

By making {vmx,svm} in hvm_vcpu into an anonymous union (consistent with
domain side of things), the hvm_{vmx,svm} defines can be dropped, and all code
refer to the correctly-named fields.  This means that the data hierarchy is no
longer obscured from grep/cscope/tags/etc.

Reformat one comment and switch one bool_t to bool while making changes.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years agox86/svm: Rename arch_svm_struct to svm_vcpu
Andrew Cooper [Tue, 28 Aug 2018 15:59:28 +0000 (15:59 +0000)]
x86/svm: Rename arch_svm_struct to svm_vcpu

The suffix and prefix are redundant, and the name is curiously odd.  Rename it
to svm_vcpu to be consistent with all the other similar structures.  In
addition, rename local arch_svm local variables to svm for further
consistency.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years agox86/vmx: Rename arch_vmx_struct to vmx_vcpu
Andrew Cooper [Tue, 28 Aug 2018 15:53:06 +0000 (15:53 +0000)]
x86/vmx: Rename arch_vmx_struct to vmx_vcpu

The suffix and prefix are redundant, and the name is curiously odd.  Rename it
to vmx_vcpu to be consistent with all the other similar structures.  In
addition, rename local arch_vmx local variables to vmx for further
consistency.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
CC: Roger Pau Monné <roger.pau@citrix.com>
Some of the local pointers are named arch_vmx.  I'm open to renaming them to
just vmx (like all the other local pointers) if people are happy with the
additional patch delta.

6 years agox86/hvm: Rename v->arch.hvm_vcpu to v->arch.hvm
Andrew Cooper [Tue, 28 Aug 2018 15:52:34 +0000 (15:52 +0000)]
x86/hvm: Rename v->arch.hvm_vcpu to v->arch.hvm

The trailing _vcpu suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate.  No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years agoxen/hvm: Rename d->arch.hvm_domain to d->arch.hvm
Andrew Cooper [Tue, 28 Aug 2018 15:50:41 +0000 (15:50 +0000)]
xen/hvm: Rename d->arch.hvm_domain to d->arch.hvm

The trailing _domain suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate, and switch to using the new XFREE/etc wrappers
where applicable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
6 years agoxen/domain: Allocate d->vcpu[] in domain_create()
Andrew Cooper [Mon, 19 Mar 2018 17:07:50 +0000 (17:07 +0000)]
xen/domain: Allocate d->vcpu[] in domain_create()

For ARM, the call to arch_domain_create() needs to have completed before
domain_max_vcpus() will return the correct upper bound.

For each arch's dom0, drop the temporary max_vcpus parameter, and allocation
of dom0->vcpu.

With d->max_vcpus now correctly configured before evtchn_init(), the poll mask
can be constructed suitably for the domain, rather than for the worst-case
setting.

Due to the evtchn_init() fixes, it no longer calls domain_max_vcpus(), and
ARM's two implementations of vgic_max_vcpus() no longer need to work around the
out-of-order call.

From this point on, d->max_vcpus and d->vcpu[] are valid for any domain which
can be looked up by domid.

The XEN_DOMCTL_max_vcpus hypercall is modified to reject any call attempt with
max != d->max_vcpus, which does match the older semantics (not that it is
obvious from the code).  The logic to allocate d->vcpu[] is dropped, but at
this point the hypercall still needs making to allocate each vcpu.
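
The modified semantics amount to a simple equality check, sketched below.
The demo_* names are hypothetical, and the use of -EINVAL as the error value
is an assumption rather than something stated above.

```c
#include <errno.h>

/* Hypothetical stand-in for the relevant part of struct domain. */
struct demo_dom {
    unsigned int max_vcpus;   /* fixed at domain creation time */
};

/*
 * The value was chosen at creation, so the later call may only repeat it;
 * any other value is rejected.
 */
int demo_domctl_max_vcpus(const struct demo_dom *d, unsigned int max)
{
    if ( max != d->max_vcpus )
        return -EINVAL;       /* assumed error code */

    /* a real implementation would go on to allocate each vcpu here */
    return 0;
}
```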

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agoxen/dom0: Arrange for dom0_cfg to contain the real max_vcpus value
Andrew Cooper [Mon, 19 Mar 2018 17:28:50 +0000 (17:28 +0000)]
xen/dom0: Arrange for dom0_cfg to contain the real max_vcpus value

Make dom0_max_vcpus() a common interface, and implement it on ARM by splitting
the existing alloc_dom0_vcpu0() function in half.

As domain_create() doesn't yet set up the vcpu array, the max value is also
passed into alloc_dom0_vcpu0().  This is temporary for bisectability and
removed in the following patch.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agotools: Pass max_vcpus to XEN_DOMCTL_createdomain
Andrew Cooper [Tue, 27 Feb 2018 17:39:37 +0000 (17:39 +0000)]
tools: Pass max_vcpus to XEN_DOMCTL_createdomain

XEN_DOMCTL_max_vcpus is a mandatory hypercall, but nothing actually prevents a
toolstack from unpausing a domain with no vcpus.

Originally, d->vcpu[] was an embedded array in struct domain, but c/s
fb442e217 "x86_64: allow more vCPU-s per guest" in Xen 4.0 altered it to being
dynamically allocated.  A side effect of this is that d->vcpu[] is NULL until
XEN_DOMCTL_max_vcpus has completed, but a lot of hypercalls blindly
dereference it.

Even today, the behaviour of XEN_DOMCTL_max_vcpus is a mandatory singleton
call which can't change the number of vcpus once a value has been chosen.

In preparation to remove the hypercall, extend xen_domctl_createdomain with
a max_vcpus field and arrange for all callers to pass the appropriate
value.  There is no change in construction behaviour yet, but later patches
will rearrange the hypervisor internals.

For the python stubs, extend the domain_create keyword list to take a
max_vcpus parameter, in lieu of deleting the pyxc_domain_max_vcpus function.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agoxen/domain: Call arch_domain_create() as early as possible in domain_create()
Andrew Cooper [Mon, 19 Mar 2018 16:50:46 +0000 (16:50 +0000)]
xen/domain: Call arch_domain_create() as early as possible in domain_create()

This is in preparation to set up d->max_vcpus and d->vcpu[] in domain_create(),
and allow later parts of domain construction to have access to the values.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen/gnttab: Fold grant_table_{create,set_limits}() into grant_table_init()
Andrew Cooper [Mon, 19 Mar 2018 16:06:24 +0000 (16:06 +0000)]
xen/gnttab: Fold grant_table_{create,set_limits}() into grant_table_init()

Now that the max_{grant,maptrack}_frames are specified from the very beginning
of grant table construction, the various initialisation functions can be
folded together and simplified as a result.

Leave grant_table_init() as the public interface, which is more consistent
with other subsystems.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agoxen/domctl: Remove XEN_DOMCTL_set_gnttab_limits
Andrew Cooper [Tue, 27 Feb 2018 17:39:37 +0000 (17:39 +0000)]
xen/domctl: Remove XEN_DOMCTL_set_gnttab_limits

Now that XEN_DOMCTL_createdomain handles the grant table limits, remove
XEN_DOMCTL_set_gnttab_limits (including XSM hooks and libxc wrappers).

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
6 years agoxen/gnttab: Pass max_{grant,maptrack}_frames into grant_table_create()
Andrew Cooper [Mon, 19 Mar 2018 11:19:52 +0000 (11:19 +0000)]
xen/gnttab: Pass max_{grant,maptrack}_frames into grant_table_create()

... rather than setting the limits up after domain_create() has completed.

This removes the common gnttab infrastructure for calculating the number of
dom0 grant frames (as the common grant table code is not an appropriate place
for it to live), opting instead to require the dom0 construction code to pass
a sane value in via the configuration.

In practice, this now means that there is never a partially constructed grant
table for a reference-able domain.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Julien Grall <julien.grall@arm.com>
6 years agotools: Pass grant table limits to XEN_DOMCTL_set_gnttab_limits
Andrew Cooper [Tue, 27 Feb 2018 17:39:37 +0000 (17:39 +0000)]
tools: Pass grant table limits to XEN_DOMCTL_set_gnttab_limits

XEN_DOMCTL_set_gnttab_limits is a fairly new hypercall, and is strictly
mandatory.  As it pertains to domain limits, it should be provided at
createdomain time.

In preparation to remove the hypercall, extend xen_domctl_createdomain with
the fields and arrange for all callers to pass appropriate details.  There is
no change in construction behaviour yet, but later patches will rearrange the
hypervisor internals, then delete the hypercall.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Acked-by: Christian Lindig <christian.lindig@citrix.com>
Acked-by: Wei Liu <wei.liu2@citrix.com>
6 years agox86/pv: Deprecate support for paging out the LDT
Andrew Cooper [Tue, 3 Oct 2017 10:18:37 +0000 (11:18 +0100)]
x86/pv: Deprecate support for paging out the LDT

This code is believed to be a vestigial remnant of the PV Windows XP port.  It
is not used by Linux, NetBSD, Solaris or MiniOS.  Furthermore the
implementation is incomplete; it only functions for a present => not-present
transition, rather than a present => read/write transition.

The for_each_vcpu() is one scalability limitation for PV guests, which can't
reasonably be altered to be continuable.  Most importantly, however, this is
the only codepath which plays with descriptor frames of a remote vcpu.

A side effect of dropping support for paging the LDT out is that the LDT no
longer automatically cleans itself up on domain destruction.  Cover this by
explicitly releasing the LDT frames at the same time as the GDT frames.

Finally, leave some asserts around to confirm the expected behaviour of all
the functions playing with PGT_seg_desc_page references.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/pv: Rename v->arch.pv_vcpu to v->arch.pv
Andrew Cooper [Tue, 28 Aug 2018 15:50:27 +0000 (15:50 +0000)]
x86/pv: Rename v->arch.pv_vcpu to v->arch.pv

The trailing _vcpu suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate, and switch to using the new XFREE/etc wrappers
where applicable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/pv: Rename d->arch.pv_domain to d->arch.pv
Andrew Cooper [Tue, 28 Aug 2018 15:49:09 +0000 (15:49 +0000)]
x86/pv: Rename d->arch.pv_domain to d->arch.pv

The trailing _domain suffix is redundant, but adds to code volume.  Drop it.

Reflow lines as appropriate, and switch to using the new XFREE/etc wrappers
where applicable.

No functional change.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/genapic: drop .target_cpus() hook
Jan Beulich [Thu, 30 Aug 2018 09:08:19 +0000 (11:08 +0200)]
x86/genapic: drop .target_cpus() hook

All flavors specify target_cpus_all() anyway - replace use of the hook
by &cpu_online_map.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/grant: mute gcc 4.1.x warning in steal_linear_address()
Zhenzhong Duan [Thu, 30 Aug 2018 09:05:01 +0000 (11:05 +0200)]
x86/grant: mute gcc 4.1.x warning in steal_linear_address()

Move reference of ol1e ahead or else we see below warning.

cc1: warnings being treated as errors
grant_table.c: In function 'replace_grant_pv_mapping':
grant_table.c:142: warning: 'ol1e.l1' may be used uninitialized in this function

Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
6 years agox86/alternatives: allow using assembler macros in favor of C ones
Jan Beulich [Thu, 30 Aug 2018 09:03:47 +0000 (11:03 +0200)]
x86/alternatives: allow using assembler macros in favor of C ones

As was validly pointed out as motivation for similar Linux side changes
(https://lkml.org/lkml/2018/6/22/677), using long sequences of
directives and auxiliary instructions, like is commonly the case when
setting up an alternative patch site, gcc can be misled into believing
an asm() to be more heavy weight than it really is. By presenting it
with an assembler macro invocation instead, this can be avoided.

Initially I wanted to outright change the C macros ALTERNATIVE() and
ALTERNATIVE_2() to invoke the respective assembler ones, but doing so
would require quite a bit of cleanup of some use sites, because of the
extra necessary quoting combined with the need that each assembler macro
argument must consist of just a single string literal. We can consider
working towards that subsequently.

For now, set the stage of using the assembler macros here by providing a
new generated header, being the slightly massaged pre-processor output
of (for now just) alternative-asm.h. The massaging is primarily to be
able to properly track the build dependency: For this, we need the C
compiler to see the inclusion, which means we shouldn't directly use an
asm(". include ...") directive.

The dependency added to asm-offsets.s is not a true one; it's just the
easiest approach I could think of to make sure the new header gets
generated early on, without having to fiddle with xen/Makefile (and
introducing some x86-specific construct there).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agoVMX: reduce number of posted-interrupt hooks
Jan Beulich [Thu, 30 Aug 2018 09:02:09 +0000 (11:02 +0200)]
VMX: reduce number of posted-interrupt hooks

Three of the four hooks are not exposed outside of vmx.c, and all of
them have only a single possible non-NULL value. So there's no reason to
use hooks here - a simple set of flag indicators is sufficient (and we
don't even need a flag for the VM entry one, as it's always
(de-)activated together with the vCPU blocking hook, which needs to
remain an actual function pointer). This matters all the more now that
the Spectre v2 workarounds have made indirect calls more expensive.
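
The difference between a hook and a flag can be sketched as follows.  The
demo_* names are hypothetical and do not correspond to the actual VMX
structures.

```c
#include <stdbool.h>

struct demo_vcpu {
    int pending;
};

/* The single possible non-NULL value of the old hook. */
static void demo_pi_switch_to(struct demo_vcpu *v)
{
    v->pending = 0;
}

/* Before: a function pointer which is either NULL or the function above. */
struct demo_hooks_old {
    void (*pi_switch_to)(struct demo_vcpu *v);
};

void demo_ctxt_switch_old(const struct demo_hooks_old *h,
                          struct demo_vcpu *v)
{
    if ( h->pi_switch_to )
        h->pi_switch_to(v);        /* indirect call */
}

/* After: a flag records whether the one known function should run. */
struct demo_flags_new {
    bool pi_process_switch_to;
};

void demo_ctxt_switch_new(const struct demo_flags_new *f,
                          struct demo_vcpu *v)
{
    if ( f->pi_process_switch_to )
        demo_pi_switch_to(v);      /* direct call */
}
```

Both variants behave identically; the flag version simply trades the indirect
call for a conditional direct one.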

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Kevin Tian <kevin.tian@intel.com>
6 years agox86/mm: re-arrange get_page_from_l<N>e() vs pv_l1tf_check_l<N>e()
Jan Beulich [Thu, 30 Aug 2018 09:01:02 +0000 (11:01 +0200)]
x86/mm: re-arrange get_page_from_l<N>e() vs pv_l1tf_check_l<N>e()

Restore symmetry between get_page_from_l<N>e(): pv_l1tf_check_l<N>e() is
now uniformly invoked from outside of them. They're no longer getting
called for non-present PTEs. This way the slightly odd three-way return
value meaning of the higher level ones can also be got rid of.

Leave an assertion in get_page_from_l1e() as the only non-static one of
the four siblings, to ensure that no new unguarded calls go unnoticed.

Introduce local variables holding the page table entries processed, and
use them throughout the loop bodies instead of re-reading them from the
page table several times.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/pt: split out HVM functions from vtd.c
Wei Liu [Sun, 26 Aug 2018 12:19:43 +0000 (13:19 +0100)]
x86/pt: split out HVM functions from vtd.c

Functions are moved to hvm.c. Reorder makefile items while at it.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
6 years agox86/pt: make it build with !CONFIG_HVM
Wei Liu [Sun, 26 Aug 2018 12:19:42 +0000 (13:19 +0100)]
x86/pt: make it build with !CONFIG_HVM

This requires providing stubs for a few functions which are part of
HVM code.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
6 years agoxen/arm: fix SMMU driver build
Stefano Stabellini [Tue, 28 Aug 2018 23:47:40 +0000 (16:47 -0700)]
xen/arm: fix SMMU driver build

Add missing "CONFIG_". This build regression was introduced by commit
277aa3523d "arm: make it possible to disable the SMMU driver".

Signed-off-by: Stefano Stabellini <stefanos@xilinx.com>
[julieng: Add the commit where the regression was introduced]
Acked-by: Julien Grall <julien.grall@arm.com>
6 years agox86: reduce "visibility" of spec_ctrl_asm.h
Jan Beulich [Wed, 29 Aug 2018 14:32:17 +0000 (16:32 +0200)]
x86: reduce "visibility" of spec_ctrl_asm.h

Unlike indirect_thunk_asm.h, spec_ctrl_asm.h is a header generally
needed by assembly source files only. Avoid having all C sources have a
dependency on that header (the set of assembly sources now gaining a
dependency on the C header is much smaller and hence more acceptable).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86: move quoting of __ASM_{STAC,CLAC}
Jan Beulich [Wed, 29 Aug 2018 14:31:32 +0000 (16:31 +0200)]
x86: move quoting of __ASM_{STAC,CLAC}

Both consumers want them quoted, so quote them right away instead of
using __stringify() upon use. In the spirit of other recent additions
also make the assembly forms assembler macros, allowing the helper
#define-s to be #undef-ed subsequently.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/alternatives: fully leverage automatic NOP filling
Jan Beulich [Wed, 29 Aug 2018 14:30:54 +0000 (16:30 +0200)]
x86/alternatives: fully leverage automatic NOP filling

As of commit 4008c71d7a ("x86/alt: Support for automatic padding
calculations") there's no point having explicit ASM_NOPn instances in
alternatives anymore - drop them. As a result also drop the asm/nops.h
inclusion from alternative.h, adding explicit inclusions in the two
remaining C files needing them.

While touching it also move the CR4_PV32_RESTORE definition out of the
SMAP-specific conditional into a more general one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86: drop NO_XPTI synthetic feature
Jan Beulich [Wed, 29 Aug 2018 14:29:42 +0000 (16:29 +0200)]
x86: drop NO_XPTI synthetic feature

With there not being any patching done based on it, we don't need this.
Non-patching conditionals can use opt_xpti instead.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
6 years agox86/spec-ctrl: split reporting for PV and HVM guests
Jan Beulich [Wed, 29 Aug 2018 14:28:52 +0000 (16:28 +0200)]
x86/spec-ctrl: split reporting for PV and HVM guests

Putting them on separate lines was suggested before, and is going to
become necessary eventually anyway as things get added here. Split them
now, and put the respective pieces in CONFIG_* conditionals at the same
time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>