Juergen Gross [Mon, 20 Mar 2017 08:00:20 +0000 (09:00 +0100)]
xenstore: set correct error code when violating quota
When the number of permitted xenstore entries for a domain is being
exceeded the operation trying to create a new entry is denied.
Unfortunately errno isn't being set in this case so the error code
returned to the client is undefined.
Set errno to ENOSPC in this case.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
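The fix described above can be sketched roughly as follows. This is only an illustration: struct domain_quota and quota_add_entry() are hypothetical names, not the real xenstored code. The point is that the failure path sets errno explicitly before returning.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-domain accounting; not the real xenstored structures. */
struct domain_quota {
    unsigned int nr_entries;   /* entries currently owned by the domain */
    unsigned int max_entries;  /* permitted number of entries */
};

/*
 * Account for one new entry. Returns 0 on success. On a quota violation
 * it returns -1 with errno set to ENOSPC, so the caller has a defined
 * error code to hand back to the client.
 */
int quota_add_entry(struct domain_quota *q)
{
    assert(q != NULL);

    if (q->nr_entries >= q->max_entries) {
        errno = ENOSPC;
        return -1;
    }

    q->nr_entries++;
    return 0;
}
```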
Paul Durrant [Wed, 22 Mar 2017 11:04:20 +0000 (12:04 +0100)]
x86/viridian: add warnings for unimplemented hypercalls and MSRs
These warnings can be useful when Microsoft updates Windows.
In the past there have been several cases when Windows erroneously uses
hypercalls and MSRs that should be gated on CPUID flags that Xen does
not set. The usual symptom is a guest crash with little or no information
in the hypervisor log. Adding these warnings at least gives a clue as to
what might be happening in such cases.
Some versions of Windows do currently issue hypercalls that they should
not, so this patch whitelists those to avoid the warnings as the lack
of implementation is clearly proved not to be a problem to the guest.
The warnings are rate limited so a malicious guest cannot use them
as a DoS vector.
NOTE: Because the MSR warnings need to be gated on range checking the
MSR address this patch imports the up-to-date definitions of all
the viridian MSRs from the specification.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Wed, 22 Mar 2017 11:03:03 +0000 (12:03 +0100)]
x86/viridian: fix xen-hvmcrash when vp_assist page is present
Currently use of xen-hvmcrash will cause an immediate domain_crash() in
initialize_vp_assist() because it is called from viridian_load_vcpu_ctxt()
without having first cleared any previous mapping.
This patch adds a check to viridian_load_vcpu_ctxt() to avoid
re-initialization and turns the domain_crash() in initialize_vp_assist()
into an ASSERT(), since neither codepath into that function should allow
it to be hit.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Zhongze Liu [Tue, 21 Mar 2017 14:14:21 +0000 (15:14 +0100)]
common: allow a default compiled-in command line using Kconfig
This allows downstreams to set their defaults without modifying the source code
all over the place. Also probably useful for the embedded space.
(See Also: https://xenproject.atlassian.net/browse/XEN-41)
If CMDLINE is set, it will be parsed prior to the bootloader command line.
This order of parsing implies that if any non-cumulative options are set in
both CMDLINE and the bootloader command line, only the ones in the latter will
take effect. Furthermore, if CMDLINE_OVERRIDE is set to y, the whole
bootloader command line will be ignored, which will be useful to work around
broken bootloaders. A wrapper to the original common/kernel.c:cmdline_parse()
was introduced to complete this task.
Signed-off-by: Zhongze Liu <blackskygg@gmail.com>
[jb: fix non-EXPERT build] Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
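A minimal sketch of the parsing order described above, assuming a single non-cumulative option named "loglvl". The function names and the toy option parser are hypothetical and are not the code in common/kernel.c. The built-in line is parsed first, so the bootloader's value wins unless the override flag is set.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int loglvl; /* a non-cumulative option: the last assignment wins */

/* Parse space-separated "name=value" tokens; only "loglvl" is understood. */
static void parse_one_cmdline(const char *cmdline)
{
    char buf[256];
    char *tok;

    assert(cmdline != NULL);
    strncpy(buf, cmdline, sizeof(buf) - 1);
    buf[sizeof(buf) - 1] = '\0';

    for (tok = strtok(buf, " "); tok != NULL; tok = strtok(NULL, " "))
        if (strncmp(tok, "loglvl=", 7) == 0)
            loglvl = atoi(tok + 7);
}

/*
 * Hypothetical wrapper: parse the compiled-in line first, then the
 * bootloader line, unless 'override' (standing in for CMDLINE_OVERRIDE)
 * asks for the bootloader line to be ignored entirely.
 */
int parse_cmdlines(const char *builtin, const char *bootloader, int override)
{
    loglvl = 0;
    parse_one_cmdline(builtin);
    if (!override)
        parse_one_cmdline(bootloader);
    return loglvl;
}
```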
Jan Beulich [Tue, 21 Mar 2017 14:13:42 +0000 (15:13 +0100)]
x86emul: correct FPU code/data pointers and opcode handling
Prevent leaking the hypervisor ones (stored by hardware during stub
execution), at once making sure the guest sees correct values there.
This piggybacks on the backout logic used to deal with write faults of
FPU insns.
Deliberately ignore the NO_FPU_SEL feature here: Honoring it would
merely mean extra code with no benefit (once we XRSTOR state, the
selector values will simply be lost anyway).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> [hvm/emulate.c] Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 21 Mar 2017 14:12:59 +0000 (15:12 +0100)]
x86emul: correct handling of FPU insns faulting on memory write
When an FPU instruction with a memory destination fails during the
memory write, it should not affect FPU register state. Due to the way
we emulate FPU (and SIMD) instructions, we can only guarantee this by
- backing out changes to the FPU register state in such a case or
- doing a descriptor read and/or page walk up front, perhaps with the
stubs accessing the actual memory location then.
The latter would require a significant change in how the emulator does
its guest memory accessing, so for now the former variant is being
chosen.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> [hvm/emulate.c] Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
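The chosen variant (backing out register changes) might look roughly like this in miniature. struct fpu_state and emulate_store_and_pop() are hypothetical simplifications, and the write_ok flag merely models whether the guest memory write succeeds. The emulator's real state handling is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified FPU state; not Xen's real layout. */
struct fpu_state {
    double st[8];
    unsigned int top;
};

/*
 * Emulate a store-and-pop: the register stack is modified first, then
 * memory is written. If the write fails, the snapshot taken up front is
 * restored so the guest sees unchanged FPU register state.
 */
bool emulate_store_and_pop(struct fpu_state *fpu, double *dest, bool write_ok)
{
    struct fpu_state saved;
    double val;

    assert(fpu != NULL && dest != NULL);

    saved = *fpu;                      /* snapshot before execution */
    val = fpu->st[fpu->top];
    fpu->top = (fpu->top + 1) & 7;     /* register state is modified */

    if (!write_ok) {
        *fpu = saved;                  /* back out the register changes */
        return false;
    }

    *dest = val;
    return true;
}
```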
Jan Beulich [Tue, 21 Mar 2017 14:10:25 +0000 (15:10 +0100)]
x86emul: centralize put_fpu() invocations
..., splitting parts of it into check_*() macros. This is in
preparation of making ->put_fpu() do further adjustments to register
state. (Some of the check_xmm() invocations could be avoided, as in
some of the cases no insns handled there can actually raise #XM, but I
think we're better off keeping them to avoid later additions of further
insn patterns rendering the lack of the check a bug.)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Andrew Cooper [Wed, 1 Mar 2017 19:02:35 +0000 (19:02 +0000)]
tools/insn-fuzz: Support AFL's afl-clang-fast mode
AFL has an alternative LLVM-based instrumentation mode, which has much lower
overhead than the traditional afl-gcc.
One extra ability is to choose exactly where the master process gets
initialised to, before being forked for testing. This point is chosen after
the call to LLVMFuzzerInitialize(), so the stack isn't being remapped
executable for every test.
Another extra ability is to feed multiple inputs into a single test process,
to reduce the number of fork() calls required overall. Two caveats are that if
stdin is used for data, it must be unbuffered, and if input is passed via a
command line parameter, the underlying file must be opened and closed on each
iteration.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Andrew Cooper [Wed, 1 Mar 2017 18:46:52 +0000 (18:46 +0000)]
tools/insn-fuzz: Make use of LLVMFuzzerInitialize()
libFuzzer can perform one-time initialisation by calling LLVMFuzzerInitialize().
Move emul_test_init() into this, to avoid repeating it on every
LLVMFuzzerTestOneInput() call.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
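For illustration, a harness split along these lines could look like the sketch below. The two LLVMFuzzer* entry points are libFuzzer's standard hooks. expensive_init() and the counters are hypothetical stand-ins (for emul_test_init() and for whatever the harness really does) and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

unsigned int init_calls;  /* how often the one-time set-up ran */
unsigned int test_calls;  /* how many inputs were processed */

/* Hypothetical stand-in for expensive set-up such as emul_test_init(). */
static void expensive_init(void)
{
    init_calls++;
}

/* Called once by the fuzzing driver before any input is tested. */
int LLVMFuzzerInitialize(int *argc, char ***argv)
{
    (void)argc;
    (void)argv;
    expensive_init();
    return 0;
}

/* Called once per input; no longer repeats the set-up. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    assert(data != NULL || size == 0);
    test_calls++;
    return 0;
}
```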
Andrew Cooper [Thu, 2 Mar 2017 17:24:30 +0000 (17:24 +0000)]
tools/insn-fuzz: Accept fuzzing input on stdin
This is rather faster for afl-fuzz to arrange than using an explicit file
parameter. Also update the README to recommend using a tmpfs for findings_dir
which reduces disk load and is more performant.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Mon, 20 Mar 2017 16:00:34 +0000 (17:00 +0100)]
AMD-Vi: allocate root table on demand
This was my originally intended fix for the AMD side of XSA-207:
There's no need to unconditionally allocate the root table, and with
that there's then also no way to leak it when a guest has no devices
assigned.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Paul Durrant [Mon, 20 Mar 2017 15:59:54 +0000 (16:59 +0100)]
x86/viridian: update to version 5.0a of the specification
The Hypervisor Top Level Functional Specification v5.0a has many differences
from previous versions and introduces whole new sections.
This patch:
- Updates the URL at the top of the source.
- Fixes up section references accordingly.
- Modifies the MSR naming convention in the code to match the specification.
- Renames the apic_assist page to the vp_assist page to reflect the change
in the specification.
(The APIC assist feature itself is inconsistently named in the
specification so stick with the current feature name).
- Updates the handling of CPUID leaf 3.
There is one functional change in this patch: The vp_assist page is
mapped (and completely zeroed) regardless of whether the APIC assist
feature is enabled. This reflects its new wider remit and simplifies the
code slightly.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 16 Mar 2017 16:53:20 +0000 (16:53 +0000)]
x86: split PV dom0 builder to pv/dom0_builder.c
Long term we want to be able to disentangle PV and HVM code. Move the PV
domain builder to a dedicated file.
This in turn requires exposing a few functions and variables via a new
header dom0_build.h. These functions and variables are now prefixed with
"dom0_" if they weren't already so.
No functional change.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Wei Liu [Mon, 20 Mar 2017 13:05:08 +0000 (13:05 +0000)]
x86: modify setup_dom0_vcpu to use dom0_cpus internally
We will later move dom0 builders to different directories. To avoid the
need of making dom0_cpus visible outside dom0_builder.c, modify
setup_dom0_vcpus to cycle through dom0_cpus internally instead of
relying on the callers to do that.
No functional change.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Boris Ostrovsky [Mon, 20 Mar 2017 08:27:35 +0000 (09:27 +0100)]
x86/time: don't use virtual TSC if host and guest frequencies are equal
Commit 82713ec8d2 ("x86: use native RDTSC(P) execution when guest and
host frequencies are the same") left out optimization for PV guests
when host and guest run at the same frequency.
For such a case we should be able not to use virtual TSC regardless
of whether we are running before or after a migration (i.e. regardless
of incarnation value).
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
[jb: retain parts of the original comment] Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Mon, 20 Mar 2017 08:27:12 +0000 (09:27 +0100)]
x86/EFI: avoid Xen image when looking for module/kexec position
When booting straight from EFI, we don't further try to relocate Xen.
As a result, so far we also didn't avoid the area Xen uses when looking
for a location to put modules or the kexec area. Introduce a fake
module slot to deal with that without having to fiddle with a lot of
code.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Mon, 20 Mar 2017 08:25:36 +0000 (09:25 +0100)]
x86/EFI: avoid IOMMU faults on [_end,__2M_rwdata_end)
Commit c9a4a1c419 ("x86/layout: Correct Xen's idea of its own memory
layout") didn't go far enough with the conversion, causing IOMMU faults
when memory from that range was handed to a domain. We must not make
this memory available for allocation (the change is benign to xen.gz at
this point in time).
Note that the change to tboot_shutdown() is fixing another issue at
once: As it looks, the function so far skipped all memory below the Xen
image.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 17 Mar 2017 08:34:38 +0000 (09:34 +0100)]
x86emul: parallelize SIMD test code building
In anticipation of further flavors (AVX, AVX-512) going to be added
(which would make the current situation even worse), facilitate
reduction of build time (and hence latency to availability of test
results) via use of make's -j option.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 17 Mar 2017 08:33:45 +0000 (09:33 +0100)]
x86emul: correct DECLARE_ALIGNED()
Stop creating an excessively large array on the stack, by properly
taking into account the array element size when establishing its
element count (and of course also when calculating the pointer to
be actually used to access the memory).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
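The sizing issue can be illustrated with a hypothetical macro (this is not the actual DECLARE_ALIGNED() from the test harness, and the 64-byte alignment is an assumption). The backing array is counted in elements, so the slack needed for alignment is divided by the element size instead of being used directly as an element count.

```c
#include <assert.h>
#include <stdint.h>

#define ALIGN_BYTES 64  /* hypothetical alignment; must be a power of two */

/*
 * Hypothetical sketch of an on-stack aligned object. The backing array
 * holds one object plus just enough extra elements to cover ALIGN_BYTES
 * of slack, rather than ALIGN_BYTES extra elements.
 */
#define DECLARE_ALIGNED_SKETCH(type, name)                                 \
    type name##_raw[1 + (ALIGN_BYTES + sizeof(type) - 1) / sizeof(type)];  \
    type *name = (type *)(((uintptr_t)name##_raw + ALIGN_BYTES - 1) &      \
                          ~(uintptr_t)(ALIGN_BYTES - 1))

/* Returns 1 if the pointer is aligned and stays inside the backing array. */
int aligned_demo(void)
{
    DECLARE_ALIGNED_SKETCH(uint64_t, p);
    uintptr_t start = (uintptr_t)p_raw;
    uintptr_t end = start + sizeof(p_raw);

    /* The array stays small: less than two alignment units in total. */
    assert(sizeof(p_raw) < 2 * ALIGN_BYTES);

    *p = 42;
    return ((uintptr_t)p % ALIGN_BYTES) == 0 &&
           (uintptr_t)p >= start &&
           (uintptr_t)(p + 1) <= end &&
           *p == 42;
}
```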
Roger Pau Monne [Fri, 3 Mar 2017 12:19:22 +0000 (12:19 +0000)]
x86: remove has_hvm_container_{domain/vcpu}
It is now useless since PVHv1 is removed and PVHv2 is an HVM domain from Xen's
point of view.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Roger Pau Monne [Fri, 3 Mar 2017 12:19:22 +0000 (12:19 +0000)]
x86: remove PVHv1 code
This removal applies to both the hypervisor and the toolstack side of PVHv1.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Tim Deegan [Fri, 10 Mar 2017 10:10:57 +0000 (10:10 +0000)]
tools/kdd: don't use a pointer to an unaligned field.
The 'val' field in the packet is byte-aligned (because it is part of a
packed struct), but the pointer argument to kdd_rdmsr() has the normal
alignment constraints for a uint64_t *. Use a local variable to make sure
the passed pointer has the correct alignment.
Reported-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com>
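The pattern can be sketched as below, assuming a GCC/clang-style packed attribute. struct pkt_sketch and rdmsr_sketch() are hypothetical stand-ins for the kdd packet and kdd_rdmsr(). The callee receives the address of an aligned local, and the result is then assigned to the packed field by value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical packet layout; 'val' sits at an odd offset due to packing. */
struct pkt_sketch {
    uint8_t  type;
    uint64_t val;
} __attribute__((packed));

/* Hypothetical stand-in for kdd_rdmsr(); expects a naturally aligned pointer. */
static int rdmsr_sketch(uint32_t msr, uint64_t *out)
{
    assert(((uintptr_t)out % _Alignof(uint64_t)) == 0);
    *out = 0x1000u + msr;
    return 0;
}

int handle_rdmsr(struct pkt_sketch *pkt, uint32_t msr)
{
    uint64_t val;                      /* local with normal alignment */
    int rc = rdmsr_sketch(msr, &val);  /* pass &val, not &pkt->val */

    pkt->val = val;  /* assignment by value is safe for a packed member */
    return rc;
}
```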
Olaf Hering [Wed, 15 Mar 2017 07:01:34 +0000 (07:01 +0000)]
tools: include sys/sysmacros.h on Linux
Due to a bug in the glibc headers the macros makedev(), major() and
minor() were available by including sys/types.h. This bug was
addressed in glibc-2.25 by introducing a warning when these macros are
used. Since Xen is built with -Werror this new warning causes a compile
error.
Use sys/sysmacros.h to define these three macros.
blktap2 is already Linux specific. The kernel header which was used to
get makedev() does not provide it anymore, and it was wrong to use a
kernel header anyway.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Wei Liu <wei.liu2@citrix.com>
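As a small illustration of the include change on Linux with glibc, the helper below round-trips a major/minor pair through dev_t. check_dev_roundtrip() is a hypothetical name and is not part of the patch.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/sysmacros.h>  /* makedev(), major(), minor() on Linux/glibc */

/* Build a dev_t from a major/minor pair and check it splits back apart. */
void check_dev_roundtrip(unsigned int maj, unsigned int min)
{
    dev_t dev = makedev(maj, min);

    assert(major(dev) == maj);
    assert(minor(dev) == min);
}
```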
Razvan Cojocaru [Wed, 15 Mar 2017 09:20:30 +0000 (11:20 +0200)]
tools/libxc: Fix ARM build broken by XEN_DOMCTL_getvcpuextstate commit
The previous "tools/libxc: Exposed XEN_DOMCTL_getvcpuextstate" broke
the ARM build (the hypercall does not have a corresponding DOMCTL
ARM struct). This patch fixes the build by returning -ENODEV for
ARM from xc_vcpu_get_extstate().
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Wed, 8 Mar 2017 18:06:02 +0000 (18:06 +0000)]
xen/arm: p2m: Perform local TLB invalidation on vCPU migration
The ARM architecture allows an OS to have per-CPU page tables, as it
guarantees that TLBs never migrate from one CPU to another.
This works fine until this is done in a guest. Consider the following
scenario:
- vcpu-0 maps P to V
- vcpu-1 maps P' to V
If run on the same physical CPU, vcpu-1 can hit in TLBs generated by
vcpu-0 accesses, and access the wrong physical page.
The solution to this is to keep a per-p2m map of which vCPU ran last
on each given pCPU and invalidate local TLBs if two vCPUs from the same
VM run on the same CPU.
Unfortunately it is not possible to allocate a per-CPU variable on the
fly. So for now the size of the array is NR_CPUS, this is fine because
we still have space in the structure domain. We may want to add a
helper to allocate per-CPU variables in the future.
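The bookkeeping described above can be sketched as follows. All names here (struct p2m_sketch, NR_CPUS_SKETCH, INVALID_VCPU_ID, the function names) are hypothetical, and the function only reports whether a local TLB flush would be needed instead of performing one.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH   4     /* hypothetical; stands in for NR_CPUS */
#define INVALID_VCPU_ID  0xff  /* hypothetical "nothing ran here yet" marker */

/* Per-p2m record of which vCPU ran last on each physical CPU. */
struct p2m_sketch {
    unsigned char last_vcpu_ran[NR_CPUS_SKETCH];
};

void p2m_sketch_init(struct p2m_sketch *p2m)
{
    for (unsigned int i = 0; i < NR_CPUS_SKETCH; i++)
        p2m->last_vcpu_ran[i] = INVALID_VCPU_ID;
}

/*
 * Called when vcpu_id is about to run on pcpu. Returns true when a
 * different vCPU of the same VM ran there last, i.e. when stale TLB
 * entries from that vCPU could be hit and a local flush is needed.
 */
bool p2m_sketch_needs_flush(struct p2m_sketch *p2m, unsigned int pcpu,
                            unsigned char vcpu_id)
{
    unsigned char last;
    bool flush;

    assert(pcpu < NR_CPUS_SKETCH);

    last = p2m->last_vcpu_ran[pcpu];
    flush = last != INVALID_VCPU_ID && last != vcpu_id;
    p2m->last_vcpu_ran[pcpu] = vcpu_id;

    return flush;
}
```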
Jan Beulich [Tue, 14 Mar 2017 17:21:09 +0000 (18:21 +0100)]
EFI: retrieve and expose Apple device properties
Apple's EFI drivers supply device properties which are needed to
support Macs optimally. They contain vital information which cannot be
obtained any other way (e.g. Thunderbolt Device ROM). They're also used
to convey the current device state so that OS drivers can pick up where
EFI drivers left (e.g. GPU mode setting).
Reference: Linux commit 58c5475aba67706b31d9237808d5d3d54074e5ea (see
there for the full original commit message, only the initial part of
which is being reproduced above)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 14 Mar 2017 17:20:27 +0000 (18:20 +0100)]
x86emul: correct {,v}{ld,st}mxcsr handling
Calls to get_fpu() were missing. Calls to put_fpu() are deliberately
not being added: Neither instruction can raise #XM, so the catch-all
_put_fpu() is just fine here.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Tue, 14 Mar 2017 17:19:29 +0000 (18:19 +0100)]
build/clang: fix XSM dummy policy when using clang 4.0
There seems to be some weird bug in clang 4.0 that prevents xsm_pmu_op from
working as expected, and vpmu.o ends up with a reference to
__xsm_action_mismatch_detected which makes the build fail:
[...]
ld -melf_x86_64_fbsd -T xen.lds -N prelink.o \
xen/common/symbols-dummy.o -o xen/.xen-syms.0
prelink.o: In function `xsm_default_action':
xen/include/xsm/dummy.h:80: undefined reference to `__xsm_action_mismatch_detected'
xen/xen/include/xsm/dummy.h:80: relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__xsm_action_mismatch_detected'
ld: xen/xen/.xen-syms.0: hidden symbol `__xsm_action_mismatch_detected' isn't defined
The current patch is the only way I've found to fix this so far, by simply
moving the XSM_PRIV check into the default case in xsm_pmu_op. This also fixes
the behavior of do_xenpmu_op, which will now return -EINVAL for unknown
XENPMU_* operations, instead of -EPERM when called by a privileged domain.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Juergen Gross [Tue, 14 Mar 2017 15:04:42 +0000 (16:04 +0100)]
tools/libxl: correct distclean target
Commit 3e5f1a63b53920763 ("tools: adapt xenlight.pc and xlutil.pc to
new pkg-config scheme") introduced an error for "make distclean": it
deletes the *.pc.in files, which are now tracked in git.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 15:04:41 +0000 (16:04 +0100)]
tools: correct build in directory below tools
Recent changes to create *.pc files introduced a bug when trying to
build a library from a directory below tools as PKG_CONFIG_DIR wouldn't
be set. Correct this by adding a default value to Rules.mk.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Razvan Cojocaru [Tue, 14 Mar 2017 13:30:18 +0000 (15:30 +0200)]
tools/libxc: Exposed XEN_DOMCTL_getvcpuextstate
It's useful for an introspection tool to be able to inspect
XSAVE states. Xen already has a DOMCTL that can be used for this
purpose, but it had no public libxc wrapper. This patch adds
xc_vcpu_get_extstate().
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 13:31:24 +0000 (14:31 +0100)]
tools: adapt xenlight.pc and xlutil.pc to new pkg-config scheme
Instead of generating the *.pc.in files at configure time use the new
pkg-config scheme for those files. Add the dependencies to other Xen
libraries as needed.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 13:31:12 +0000 (14:31 +0100)]
tools: add support for additional items in .pc files for local builds
Some libraries require different compiler-flags when being used in a
local build compared to a build using installed libraries.
Reflect that by supporting local cflags variables in generated
pkg-config files. The local variants will be empty in the installed
pkg-config files.
The flags for the linker in the local variants will have to specify
the search path for the library with "-Wl,-rpath-link=", while the
flags for the installed library will be "-L".
Add needed directory patterns.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 13:31:08 +0000 (14:31 +0100)]
tools: fix typo in tools/Rules.mk
Commit 78fb69ad9 ("tools/Rules.mk: Properly handle libraries with
recursive dependencies.") introduced a copy and paste error in
tools/Rules.mk:
LDLIBS_libxenstore and SHLIB_libxenstore don't use SHDEPS_libxenstore
but SHDEPS_libxenguest. This will add a superfluous dependency of
libxenstore on libxenevtchn.
Correct this bug.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Zhang Chen [Mon, 6 Mar 2017 02:59:25 +0000 (10:59 +0800)]
COLO-Proxy: Use socket to get checkpoint event.
We use the kernel COLO proxy's approach to get the checkpoint event
from QEMU colo-compare.
QEMU colo-compare needs a new API to support this (I will add this in QEMU).
Qemu side patch:
https://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg07265.html
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Sergey Dyasli [Tue, 14 Mar 2017 11:25:14 +0000 (12:25 +0100)]
x86/vvmx: correct nested shadow VMCS handling
Currently Xen always sets the shadow VMCS-indicator bit on nested
vmptrld and always clears it on nested vmclear. This behavior is
wrong when the guest loads a shadow VMCS: shadow bit will be lost
on nested vmclear.
Fix this by checking if the guest has provided a shadow VMCS.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Sergey Dyasli [Tue, 14 Mar 2017 11:24:38 +0000 (12:24 +0100)]
x86/vvmx: add mov-ss blocking check to vmentry
Intel SDM states that if there is a current VMCS and there is MOV-SS
blocking, VMFailValid occurs and control passes to the next instruction.
Implement such behaviour for nested vmlaunch and vmresume.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Fri, 17 Feb 2017 18:31:45 +0000 (18:31 +0000)]
x86/cpuid: Handle leaf 0xb in guest_cpuid()
Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
logic and hypervisor PV logic. The previous dynamic logic filled in the
x2APIC ID for all HVM guests.
In practice, leaf 0xb is tightly linked with x2APIC, and x2APIC is offered to
guests on AMD hardware, as Xen's APIC emulation is x2APIC capable even if
hardware isn't.
Sensibly exposing the rest of the leaf requires further topology
infrastructure.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 18:24:45 +0000 (18:24 +0000)]
x86/cpuid: Handle leaf 0xa in guest_cpuid()
Leaf 0xa is reserved by AMD, and only exposed to Intel guests when vPMU is
enabled. Leave the logic as-was, ready to be cleaned up when further
toolstack infrastructure is in place.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 18:03:58 +0000 (18:03 +0000)]
x86/cpuid: Handle leaf 0x6 in guest_cpuid()
The thermal/performance leaf was previously hidden from HVM guests, but fully
visible to PV guests. Most of the leaf refers to MSR availability, and there
is nothing an unprivileged PV guest can do with the information, so hide the
leaf entirely.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 17:32:29 +0000 (17:32 +0000)]
x86/cpuid: Handle leaf 0x5 in guest_cpuid()
The MONITOR flag isn't exposed to guests. The existing toolstack logic, and
pv_cpuid() in the hypervisor, zero the MONITOR leaf for queries.
However, the MONITOR leaf is still visible in the hardware domain's native
CPUID view, and Linux depends on this to set up C-state information. Leak the
host's MONITOR leaf under the same circumstances that the MONITOR feature is
leaked.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 17:21:35 +0000 (17:21 +0000)]
x86/cpuid: Handle leaf 0x4 in guest_cpuid()
Leaf 0x4 is reserved by AMD. For Intel, it is a multi-invocation leaf with
ecx enumerating different cache details.
Add a new union for it in struct cpuid_policy, collect it from hardware in
calculate_raw_policy(), audit it in recalculate_cpuid_policy() and update
guest_cpuid() and update_domain_cpuid_info() to properly insert/extract data.
A lot of the data here will need further auditing/refinement when better
topology support is introduced, but for now, this matches the existing
toolstack behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 24 May 2016 14:46:01 +0000 (15:46 +0100)]
x86/pagewalk: Consistently use guest_walk_*() helpers for translation
hap_p2m_ga_to_gfn() and sh_page_fault() currently use guest_l1e_get_gfn() to
obtain the translation of a pagewalk. This is conceptually wrong (the
semantics of gw.l1e is an internal detail), and will actually be wrong when
PSE36 superpage support is fixed. Switch them to using guest_walk_to_gfn().
guest_walk_tables() also uses guest_l1e_get_gfn(), and is updated for
consistency.
Take the opportunity to const-correct the walk_t parameter of the
guest_walk_to_*() helpers, and implement guest_walk_to_gpa() in terms of
guest_walk_to_gfn() to avoid duplicating the actual translation calculation.
While editing guest_walk_to_gpa(), fix a latent bug by causing it to return
INVALID_PADDR rather than 0 for a failed translation, as 0 is also a valid
successful result. The sole caller, sh_page_fault(), has already confirmed
that the translation is valid, so this doesn't cause a behavioural change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: George Dunlap <george.dunlap@citrix.com>
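The relationship between the two helpers can be sketched as below. The types, constants and function names are hypothetical simplifications of walk_t and the guest_walk_to_*() helpers. The gpa helper is built on the gfn helper and reports failure with an all-ones value, because 0 is a legitimate physical address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH    12
#define INVALID_GFN_SKETCH   (~(uint64_t)0)  /* hypothetical failure marker */
#define INVALID_PADDR_SKETCH (~(uint64_t)0)  /* hypothetical failure marker */

/* Hypothetical, minimal result of a pagewalk. */
struct walk_sketch {
    uint64_t va;     /* virtual address that was walked */
    uint64_t gfn;    /* frame the walk resolved to */
    int      valid;  /* non-zero if the translation succeeded */
};

static uint64_t walk_to_gfn(const struct walk_sketch *gw)
{
    assert(gw != NULL);
    return gw->valid ? gw->gfn : INVALID_GFN_SKETCH;
}

/* Implemented in terms of walk_to_gfn() to avoid duplicating the logic. */
uint64_t walk_to_gpa(const struct walk_sketch *gw)
{
    uint64_t gfn = walk_to_gfn(gw);

    if (gfn == INVALID_GFN_SKETCH)
        return INVALID_PADDR_SKETCH;  /* not 0: 0 is a valid address */

    return (gfn << PAGE_SHIFT_SKETCH) |
           (gw->va & ((1u << PAGE_SHIFT_SKETCH) - 1));
}
```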