Roger Pau Monne [Fri, 3 Mar 2017 12:19:22 +0000 (12:19 +0000)]
x86: remove has_hvm_container_{domain/vcpu}
It is now useless since PVHv1 is removed and PVHv2 is a HVM domain from Xen's
point of view.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Acked-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Roger Pau Monne [Fri, 3 Mar 2017 12:19:22 +0000 (12:19 +0000)]
x86: remove PVHv1 code
This removal applies to both the hypervisor and the toolstack side of PVHv1.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com> Acked-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
Tim Deegan [Fri, 10 Mar 2017 10:10:57 +0000 (10:10 +0000)]
tools/kdd: don't use a pointer to an unaligned field.
The 'val' field in the packet is byte-aligned (because it is part of a
packed struct), but the pointer argument to kdd_rdmsr() has the normal
alignment constraints for a uint64_t *. Use a local variable to make sure
the passed pointer has the correct alignment.
Reported-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Tim Deegan <tim@xen.org> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com>
Olaf Hering [Wed, 15 Mar 2017 07:01:34 +0000 (07:01 +0000)]
tools: include sys/sysmacros.h on Linux
Due to a bug in the glibc headers the macros makedev(), major() and
minor() where avaialble by including sys/types.h. This bug was
addressed in glibc-2.25 by introducing a warning when these macros are
used. Since Xen is build with -Werror this new warning cause a compile
error.
Use sys/sysmacros.h to define these three macros.
blktap2 is already Linux specific. The kernel header which was used to
get makedev() does not provided it anymore, and it was wrong to use a
kernel header anyway.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Wei Liu <wei.liu2@citrix.com>
Razvan Cojocaru [Wed, 15 Mar 2017 09:20:30 +0000 (11:20 +0200)]
tools/libxc: Fix ARM build broken by XEN_DOMCTL_getvcpuextstate commit
The previous "tools/libxc: Exposed XEN_DOMCTL_getvcpuextstate" broke
the ARM build (the hypercall does not have a corresponding DOMCTL
ARM struct). This patch fixes the build by returning -ENODEV for
ARM from xc_vcpu_get_extstate().
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Wed, 8 Mar 2017 18:06:02 +0000 (18:06 +0000)]
xen/arm: p2m: Perform local TLB invalidation on vCPU migration
The ARM architecture allows an OS to have per-CPU page tables, as it
guarantees that TLBs never migrate from one CPU to another.
This works fine until this is done in a guest. Consider the following
scenario:
- vcpu-0 maps P to V
- vpcu-1 maps P' to V
If run on the same physical CPU, vcpu-1 can hit in TLBs generated by
vcpu-0 accesses, and access the wrong physical page.
The solution to this is to keep a per-p2m map of which vCPU ran the last
on each given pCPU and invalidate local TLBs if two vPCUs from the same
VM run on the same CPU.
Unfortunately it is not possible to allocate per-cpu variable on the
fly. So for now the size of the array is NR_CPUS, this is fine because
we still have space in the structure domain. We may want to add an
helper to allocate per-cpu variable in the future.
Jan Beulich [Tue, 14 Mar 2017 17:21:09 +0000 (18:21 +0100)]
EFI: retrieve and expose Apple device properties
Apple's EFI drivers supply device properties which are needed to
support Macs optimally. They contain vital information which cannot be
obtained any other way (e.g. Thunderbolt Device ROM). They're also used
to convey the current device state so that OS drivers can pick up where
EFI drivers left (e.g. GPU mode setting).
Reference: Linux commit 58c5475aba67706b31d9237808d5d3d54074e5ea (see
there for the full original commit message, only the initial part of
which is being reproduced above)
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 14 Mar 2017 17:20:27 +0000 (18:20 +0100)]
x86emul: correct {,v}{ld,st}mxcsr handling
Calls to get_fpu() were missing. Calls to put_fpu() are deliberately
not being added: Neither instruction can raise #XM, so the catch-all
_put_fpu() is just fine here.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Roger Pau Monné [Tue, 14 Mar 2017 17:19:29 +0000 (18:19 +0100)]
build/clang: fix XSM dummy policy when using clang 4.0
There seems to be some weird bug in clang 4.0 that prevents xsm_pmu_op from
working as expected, and vpmu.o ends up with a reference to
__xsm_action_mismatch_detected which makes the build fail:
[...]
ld -melf_x86_64_fbsd -T xen.lds -N prelink.o \
xen/common/symbols-dummy.o -o xen/.xen-syms.0
prelink.o: In function `xsm_default_action':
xen/include/xsm/dummy.h:80: undefined reference to `__xsm_action_mismatch_detected'
xen/xen/include/xsm/dummy.h:80: relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__xsm_action_mismatch_detected'
ld: xen/xen/.xen-syms.0: hidden symbol `__xsm_action_mismatch_detected' isn't defined
The current patch is the only way I've found to fix this so far, by simply
moving the XSM_PRIV check into the default case in xsm_pmu_op. This also fixes
the behavior of do_xenpmu_op, which will now return -EINVAL for unknown
XENPMU_* operations, instead of -EPERM when called by a privileged domain.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Juergen Gross [Tue, 14 Mar 2017 15:04:42 +0000 (16:04 +0100)]
tools/libxl: correct distclean target
Commit 3e5f1a63b53920763 ("tools: adapt xenlight.pc and xlutil.pc to
new pkg-config scheme") introduced an error for "make distclean" as
*.pc.in are deleted which are now files in git.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 15:04:41 +0000 (16:04 +0100)]
tools: correct build in directory below tools
Recent changes to create *.pc files introduced a bug when trying to
build a library from a directory below tools as PKG_CONFIG_DIR wouldn't
be set. Correct this by adding a default value to Rules.mk.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Razvan Cojocaru [Tue, 14 Mar 2017 13:30:18 +0000 (15:30 +0200)]
tools/libxc: Exposed XEN_DOMCTL_getvcpuextstate
It's useful for an introspection tool to be able to inspect
XSAVE states. Xen already has a DOMCTL that can be used for this
purpose, but it had no public libxc wrapper. This patch adds
xc_vcpu_get_extstate().
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 13:31:24 +0000 (14:31 +0100)]
tools: adapt xenlight.pc and xlutil.pc to new pkg-config scheme
Instead of generating the *.pc.in files at configure time use the new
pkg-config scheme for those files. Add the dependencies to other Xen
libraries as needed.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 13:31:12 +0000 (14:31 +0100)]
tools: add support for additional items in .pc files for local builds
Some libraries require different compiler-flags when being used in a
local build compared to a build using installed libraries.
Reflect that by supporting local cflags variables in generated
pkg-config files. The local variants will be empty in the installed
pkg-config files.
The flags for the linker in the local variants will have to specify
the search patch for the library with "-Wl,-rpath-link=", while the
flags for the installed library will be "-L".
Add needed directory patterns.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Juergen Gross [Tue, 14 Mar 2017 13:31:08 +0000 (14:31 +0100)]
tools: fix typo in tools/Rules.mk
Commit 78fb69ad9 ("tools/Rules.mk: Properly handle libraries with
recursive dependencies.") introduced a copy and paste error in
tools/Rules.mk:
LDLIBS_libxenstore and SHLIB_libxenstore don't use SHDEPS_libxenstore
but SHDEPS_libxenguest. This will add a superfluous dependency of
libxenstore on libxenevtchn.
Correct this bug.
Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Zhang Chen [Mon, 6 Mar 2017 02:59:25 +0000 (10:59 +0800)]
COLO-Proxy: Use socket to get checkpoint event.
We use kernel colo proxy's way to get the checkpoint event
from qemu colo-compare.
Qemu colo-compare need add a API to support this(I will add this in qemu).
Qemu side patch:
https://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg07265.html
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Sergey Dyasli [Tue, 14 Mar 2017 11:25:14 +0000 (12:25 +0100)]
x86/vvmx: correct nested shadow VMCS handling
Currently xen always sets the shadow VMCS-indicator bit on nested
vmptrld and always clears it on nested vmclear. This behavior is
wrong when the guest loads a shadow VMCS: shadow bit will be lost
on nested vmclear.
Fix this by checking if the guest has provided a shadow VMCS.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Sergey Dyasli [Tue, 14 Mar 2017 11:24:38 +0000 (12:24 +0100)]
x86/vvmx: add mov-ss blocking check to vmentry
Intel SDM states that if there is a current VMCS and there is MOV-SS
blocking, VMFailValid occurs and control passes to the next instruction.
Implement such behaviour for nested vmlaunch and vmresume.
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
Andrew Cooper [Fri, 17 Feb 2017 18:31:45 +0000 (18:31 +0000)]
x86/cpuid: Handle leaf 0xb in guest_cpuid()
Leaf 0xb is reserved by AMD, and uniformly hidden from guests by the toolstack
logic and hypervisor PV logic. The previous dynamic logic filled in the
x2APIC ID for all HVM guests.
In practice, leaf 0xb is tightly linked with x2APIC, and x2APIC is offered to
guests on AMD hardware, as Xen's APIC emulation is x2APIC capable even if
hardware isn't.
Sensibly exposing the rest of the leaf requires further topology
infrastructure.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 18:24:45 +0000 (18:24 +0000)]
x86/cpuid: Handle leaf 0xa in guest_cpuid()
Leaf 0xa is reserved by AMD, and only exposed to Intel guests when vPMU is
enabled. Leave the logic as-was, ready to be cleaned up when further
toolstack infrastructure is in place.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 18:03:58 +0000 (18:03 +0000)]
x86/cpuid: Handle leaf 0x6 in guest_cpuid()
The thermal/performance leaf was previously hidden from HVM guests, but fully
visible to PV guests. Most of the leaf refers to MSR availability, and there
is nothing an unprivileged PV guest can do with the information, so hide the
leaf entirely.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 17:32:29 +0000 (17:32 +0000)]
x86/cpuid: Handle leaf 0x5 in guest_cpuid()
The MONITOR flag isn't exposed to guests. The existing toolstack logic, and
pv_cpuid() in the hypervisor, zero the MONITOR leaf for queries.
However, the MONITOR leaf is still visible in the hardware domains native
CPUID view, and Linux depends on this to set up C-state information. Leak the
hosts MONITOR leaf under the same circumstances that the MONITOR feature is
leaked.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Fri, 17 Feb 2017 17:21:35 +0000 (17:21 +0000)]
x86/cpuid: Handle leaf 0x4 in guest_cpuid()
Leaf 0x4 is reserved by AMD. For Intel, it is a multi-invocation leaf with
ecx enumerating different cache details.
Add a new union for it in struct cpuid_policy, collect it from hardware in
calculate_raw_policy(), audit it in recalculate_cpuid_policy() and update
guest_cpuid() and update_domain_cpuid_info() to properly insert/extract data.
A lot of the data here will need further auditing/refinement when better
topology support is introduced, but for now, this matches the existing
toolstack behaviour.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Andrew Cooper [Tue, 24 May 2016 14:46:01 +0000 (15:46 +0100)]
x86/pagewalk: Consistently use guest_walk_*() helpers for translation
hap_p2m_ga_to_gfn() and sh_page_fault() currently use guest_l1e_get_gfn() to
obtain the translation of a pagewalk. This is conceptually wrong (the
semantics of gw.l1e is an internal detail), and will actually be wrong when
PSE36 superpage support is fixed. Switch them to using guest_walk_to_gfn().
guest_walk_tables() also uses guest_l1e_get_gfn(), and is updated for
consistency.
Take the opportunity to const-correct the walk_t parameter of the
guest_walk_to_*() helpers, and implement guest_walk_to_gpa() in terms of
guest_walk_to_gfn() to avoid duplicating the actual translation calculation.
While editing guest_walk_to_gpa(), fix a latent bug by causing it to return
INVALID_PADDR rather than 0 for a failed translation, as 0 is also a valid
successful result. The sole caller, sh_page_fault(), has already confirmed
that the translation is valid, so this doesn't cause a behavioural change.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: George Dunlap <george.dunlap@citrix.com>
Andrew Cooper [Sun, 3 Jul 2016 12:04:34 +0000 (13:04 +0100)]
x86/shadow: Try to correctly identify implicit supervisor accesses
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: George Dunlap <george.dunlap@citrix.com>
All actions which refer to the active ldt/gdt/idt or task register
(e.g. loading a new segment selector) are known as implicit supervisor
accesses, even when the access originates from user code.
Right away, this fixes a bug during userspace emulation where a pagewalk for a
system table was (incorrectly) performed as a user access, causing an access
violation in the common case, as system tables reside on supervisor mappings.
The implicit/explicit distinction is necessary in the pagewalk when SMAP is
enabled. Refer to Intel SDM Vol 3 "Access Rights" for the exact details.
Introduce a new pagewalk input, and make use of the new system segment
references in hvmemul_{read,write}(). While modifying those areas, move the
calculation of the appropriate pagewalk input before its first use.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: George Dunlap <george.dunlap@citrix.com> Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Jan Beulich [Thu, 9 Mar 2017 16:42:55 +0000 (17:42 +0100)]
x86emul: suppress reads for unhandled 0f38/0f3a extension space insns
The way these extension spaces get handled we so far always end up
going through the generic SrcMem operand fetch path for unused table
entries. Suppress actual memory accesses happening by forcing op_bytes
to zero in those cases.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Thu, 9 Mar 2017 16:41:58 +0000 (17:41 +0100)]
x86emul: correct vzero{all,upper} for non-64-bit-mode
The registers only accessible in 64-bit mode need to be left alone in
this case.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Julien Grall [Wed, 8 Mar 2017 18:06:00 +0000 (18:06 +0000)]
xen/arm: hvm_domain does not need to be cacheline aligned
hvm_domain only contains the HVM_PARAM that on ARM are not used often.
So it is not necessary to have hvm_domain fitting in a cacheline. Drop
it to save 128 bytes in the structure arch_domain.
Haozhong Zhang [Wed, 8 Mar 2017 14:11:06 +0000 (15:11 +0100)]
x86/mce: remove ASSERT's about mce_[u|d]handler_num in mce_action()
Those assertions as well as mce_[u|d]handlers[], mce_[u|d]handler_num
and mce_action() were intel only and lifted to the common code by c/s 3a91769d6e1. However, MCE handling on AMD does not use mce_[u|d]handlers[]
before and after that commit, so assertions in mce_action() about their
size do not make sense for AMD. To be worse, they can crash the debug
build on AMD. Remove them to make the debug build work on AMD.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Haozhong Zhang [Wed, 8 Mar 2017 14:10:45 +0000 (15:10 +0100)]
x86/mce: clear MSR_IA32_MCG_STATUS by writing 0
On Intel CPU, an attemp to write to MSR_IA32_MCG_STATUS with any
non-zero value would result in #GP.
This commit writes 0 on AMD CPU as well instead of just clearing MCIP
bit, because all non-reserved bits of MSR_IA32_MCG_STATUS have been
handled at this point.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Haozhong Zhang [Wed, 8 Mar 2017 14:10:29 +0000 (15:10 +0100)]
x86/vmce: fill MSR_IA32_MCG_STATUS on all vcpus in broadcast case
The current implementation only fills MC MSRs on vcpu0 and leaves MC
MSRs on other vcpus empty in the broadcast case. When guest reads 0
from MSR_IA32_MCG_STATUS on vcpuN (N > 0), it may think it's not
possible to recover the execution on that vcpu and then get panic,
although MSR_IA32_MCG_STATUS filled on vcpu0 may imply the injected
vMCE is actually recoverable. To avoid such unnecessary guest panic,
set MSR_IA32_MCG_STATUS on vcpuN (N > 0) to MCG_STATUS_MCIP|MCG_STATUS_RIPV.
In addition, fill_vmsr_data(mc_bank, ...) is changed to return -EINVAL
rather than 0, if an invalid domain ID is contained in mc_bank.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Haozhong Zhang [Wed, 8 Mar 2017 14:10:06 +0000 (15:10 +0100)]
x86/mce: set mcinfo_comm.type and .size in x86_mcinfo_reserve()
All existing calls to x86_mcinfo_reserve() are followed by statements
that set the size and the type of the reserved space, so move them into
x86_mcinfo_reserve() to simplify the code.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Haozhong Zhang [Wed, 8 Mar 2017 14:09:46 +0000 (15:09 +0100)]
x86/mce: remove unused x86_mcinfo_add()
c/s 9d13fd9fd320a7740c6446c048ff6a2990095966 turned to update the
mcinfo buffer in-place instead of using x86_mcinfo_add(). The last
uses of x86_mcinfo_add() were removed by that commit as well.
Therefore, x86_mcinfo_add() was deprecated in fact.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Haozhong Zhang [Wed, 8 Mar 2017 14:09:16 +0000 (15:09 +0100)]
x86/mce: adjust comment of callback register functions
c/s e966818264908e842e2847f579ca4d94e586eaac added
mce_need_clearbank_register below the comment of
x86_mce_callback_register(). This commit (1) adjusts the first
paragraph of comment to be a general statement of all callback
register functions, and (2) moves the second paragraph to the
front of x86_mce_callback_register().
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Wed, 8 Mar 2017 14:07:41 +0000 (15:07 +0100)]
x86/MCE: sanitize domain/vcpu ID handling
Storing -1 into both fields was misleading consumers: We really should
have a manifest constant for "invalid vCPU" here, and the already
existing DOMID_INVALID should be used.
Also correct a bogus (dead code) check in mca_init_global(), at once
introducing a manifest constant for the early boot "invalid vCPU"
pointer (avoiding proliferation of the open coding). Make that pointer
a non-canonical address at once.
Finally, don't leave mc_domid uninitialized in mca_init_bank().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Wed, 8 Mar 2017 14:07:14 +0000 (15:07 +0100)]
MAINTAINERS: drop Christoph Egger
Other Amazon folks indicate he's not available as a maintainer anymore
at this point in time. Maintenance of the MCE sub-component will fall
back to the x86 maintainers.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Christoph Egger <chegger@amazon.de>
Andrew Cooper [Tue, 7 Mar 2017 23:32:24 +0000 (23:32 +0000)]
x86/emul: Avoid #UD in SIMD stubs
v{,u}comis{s,d}, and vcvt{,t}s{s,d}2si are two-operand instructions, while
vzero{all,upper} take no operands. Each require vex.reg set to ~0 to avoid
suffering #UD.
Spotted while fuzzing with AFL Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Paul Durrant [Tue, 7 Mar 2017 14:58:04 +0000 (14:58 +0000)]
vlapic/viridian: abort existing APIC assist if any vector is pending in ISR
The vlapic code already aborts an APIC assist if an interrupt is deferred
because a higher priority interrupt has already been delivered (and hence
its vector is pending in the ISR).
However, it is also necessary to abort an APIC assist in the case where a
higher priority is about to be delivered because, in either case, at least
two vectors will be pending in the ISR and hence an EOI is necessary.
Also, following on from the above reasoning, the decision to start a new
APIC assist should clearly be based upon whether any other vector is
pending in the ISR, regardless of whether it is lower or higher in
priority. (In fact the code in question cannot be reached if the
vector is lower in priority). Thus the single use of
vlapic_find_lowest_vector() can be replaced with a call to
vlapic_find_highest_isr() and the former function removed.
Without this patch, because the logic is flawed, a domain_crash() results
when an attempt is made to erroneously start a new APIC assist.
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
Roger Pau Monné [Tue, 7 Mar 2017 16:11:06 +0000 (17:11 +0100)]
x86: drop unneeded __packed attributes
There where a couple of unneeded packed attributes in several x86-specific
structures, that are obviously aligned. The only non-trivial one is
vmcb_struct, which has been checked to have the same layout with and without
the packed attribute using pahole. In that case add a build-time size check to
be on the safe side.
No functional change is expected as a result of this commit.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Jan Beulich [Tue, 7 Mar 2017 16:09:09 +0000 (17:09 +0100)]
x86emul: test coverage for SSE3/SSSE3/SSE4* insns
... and their AVX equivalents. Note that a few instructions aren't
covered (yet), but those all fall into common pattern groups, so I
would hope that for now we can do with what is there.
Just like for SSE/SSE2, MMX insns aren't being covered at all, as
they're not easy to deal with: The compiler refuses to emit such for
other than uses of built-in functions.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 7 Mar 2017 16:06:38 +0000 (17:06 +0100)]
x86emul: test coverage for SSE/SSE2 insns
... and their AVX equivalents. Note that a few instructions aren't
covered (yet), but those all fall into common pattern groups, so I
would hope that for now we can do with what is there.
MMX insns aren't being covered at all, as they're not easy to deal
with: The compiler refuses to emit such for other than uses of built-in
functions.
The current way of testing AVX insns is meant to be temporary only:
Once we fully support that feature, the present tests should rather be
replaced than full ones simply added.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 7 Mar 2017 16:04:08 +0000 (17:04 +0100)]
x86emul: support MMX/SSE/SSE2 converts
Note that other than most scalar instructions, vcvt{,t}s{s,d}2si do #UD
when VEX.l is set on at least some Intel models. To be on the safe
side, implement the most restrictive mode here for now when emulating
an Intel CPU, and simply clear the bit when emulating an AMD one.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 7 Mar 2017 16:03:45 +0000 (17:03 +0100)]
x86emul: support MMX/SSE{,2,3} moves
Previously supported insns are being converted to the new model, and
several new ones are being added.
To keep the stub handling reasonably simple, integrate SET_SSE_PREFIX()
into copy_REX_VEX(), at once switching the stubs to use an empty REX
prefix instead of a double DS: one (no byte registers are being
accessed, so an empty REX prefix has no effect), except (of course) for
the 32-bit test harness build.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Tue, 7 Mar 2017 16:02:53 +0000 (17:02 +0100)]
x86emul: support most memory accessing MMX/SSE{,2,3} insns
This aims at covering most MMX/SSEn/AVX instructions in the 0x0f-escape
space with memory operands. Not covered here are irregular moves,
converts, and {,U}COMIS{S,D} (modifying EFLAGS).
Note that the distinction between simd_*_fp isn't strictly needed, but
I've kept them as separate entries since in an earlier version I needed
them to be separate, and we may well find it useful down the road to
have that distinction.
Also take the opportunity and adjust the vmovdqu test case the new
LDDQU one here has been cloned from: To zero a ymm register we don't
need to go through hoops, as 128-bit AVX insns zero the upper portion
of the destination register, and in the disabled AVX2 code there was a
wrong YMM register used.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
xen/arm: fix affected memory range by dcache clean functions
clean_dcache_va_range and clean_and_invalidate_dcache_va_range don't
calculate the range correctly when "end" is not cacheline aligned. As a
result, the last cacheline is not skipped. Fix the issue by aligning the
start address to the cacheline size.
In addition, make the code simpler and faster in
invalidate_dcache_va_range, by removing the module operation and using
bitmasks instead. Also remove the size adjustments in
invalidate_dcache_va_range, because the size variable is not used later
on.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Julien Grall <julien.grall@arm.com> Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Razvan Cojocaru [Mon, 6 Mar 2017 16:51:15 +0000 (17:51 +0100)]
x86/mem_access: fix vm_event emulation check with altp2m enabled
Currently, p2m_mem_access_emulate_check() uses p2m_get_mem_access()
to check if the page restrictions have been lifted between the time
of sending the vm_event out and the reception of the reply - in
which case emulation is no longer required. Unfortunately,
p2m_get_mem_access() uses p2m_get_hostp2m(d) which only checks the
default EPT (view 0 in altp2m parlance). This patch fixes this by
checking the active altp2m view instead, whenever applicable.
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com>
Andrew Cooper [Thu, 2 Mar 2017 19:58:20 +0000 (19:58 +0000)]
x86/cpuid: Fix booting on AMD Phenom 6-core platform
c/s 5cecf60f4 "x86/cpuid: Handle leaf 0x1 in guest_cpuid()" causes Linux 4.10
to crash during boot.
It turns out to be because of the reported apic_id, which was altered to be
more consistent across guests. Revert back to the previous behaviour, by
limiting the apic_id adjustment to HVM guests only. Whomever gets to fixes
topology representation is going to have a lot of fun with non-power-of-2 AMD
boxes.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it>