xenbits.xensource.com Git - people/royger/xen.git/commit

x86/vmx: Don't leak EFER.NXE into guest context

Intel hardware only uses 4 bits in MSR_EFER.  Changes to LME and LMA are
handled automatically via the VMENTRY_CTLS.IA32E_MODE bit.
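
For reference, the four bits in question can be written out as a minimal
sketch.  The bit positions are the architectural ones; the macro and helper
names are illustrative assumptions, not necessarily the identifiers used in
the Xen tree.

```c
#include <stdint.h>

/* Architectural bit positions in MSR_EFER (names are illustrative). */
#define EFER_SCE (UINT64_C(1) << 0)   /* SYSCALL/SYSRET enable */
#define EFER_LME (UINT64_C(1) << 8)   /* Long mode enable */
#define EFER_LMA (UINT64_C(1) << 10)  /* Long mode active */
#define EFER_NXE (UINT64_C(1) << 11)  /* No-execute enable */

/* The four bits which Intel hardware uses. */
#define EFER_INTEL_BITS (EFER_SCE | EFER_LME | EFER_LMA | EFER_NXE)

/* Hypothetical helper: discard any bit outside the four listed above. */
static inline uint64_t efer_intel_mask(uint64_t efer)
{
    return efer & EFER_INTEL_BITS;
}
```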

SCE is handled by ad-hoc logic in context_switch(), vmx_restore_guest_msrs()
and vmx_update_guest_efer(), and works by altering the host SCE value to match
the setting the guest wants.  This works because, in HVM vcpu context, Xen
never needs to execute a SYSCALL or SYSRET instruction.

However, NXE has never been context switched.  Unlike SCE, NXE cannot be
context switched at vcpu boundaries because disabling NXE makes PTE.NX bits
reserved and causes a pagefault when encountered.  This means that the guest
always has Xen's setting in effect, irrespective of the bit it can see and
modify in its virtualised view of MSR_EFER.
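
The reserved-bit rule can be illustrated with a small predicate.  This is a
sketch only: the names are hypothetical, and it models nothing beyond the one
architectural fact that bit 63 of a paging entry is reserved while NXE is
clear.

```c
#include <stdbool.h>
#include <stdint.h>

/* Execute-disable bit of a 64-bit paging entry (bit 63). */
#define PTE_NX (UINT64_C(1) << 63)

/*
 * Hypothetical helper: true if a walk using this entry would hit a reserved
 * bit because of NX.  With NXE set, bit 63 is an ordinary permission bit;
 * with NXE clear it is reserved and must be zero.
 */
static bool pte_nx_is_reserved(uint64_t pte, bool nxe)
{
    return !nxe && (pte & PTE_NX);
}
```

A guest which clears NXE in its own view of MSR_EFER but leaves NX set in a
PTE should therefore see a reserved-bit pagefault, which is not what happens
while Xen's NXE setting stays in effect.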

This isn't a major problem for production operating systems because they, like
Xen, always turn NXE on when it is available.  However, it does have an
observable effect on which guest PTE bits are valid, and whether
PFEC_insn_fetch is visible in a #PF error code.

Second generation VT-x hardware has host and guest EFER fields in the VMCS,
and support for loading and saving them automatically.  First generation VT-x
hardware needs to use MSR load/save lists to cause an atomic switch of
MSR_EFER on vmentry/exit.

Therefore we update vmx_init_vmcs_config() to find and use guest/host EFER
support when available (and MSR load/save lists on older hardware) and drop
all ad-hoc alteration of SCE.
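
The choice between the two mechanisms might be sketched as follows.  The
control bit positions are those documented in the Intel SDM; the enum, the
function name and the exact condition are assumptions made for illustration,
not the logic in vmx_init_vmcs_config().

```c
#include <stdint.h>

/* VM-entry/VM-exit control bits for automatic EFER handling (Intel SDM). */
#define VM_ENTRY_LOAD_GUEST_EFER (UINT32_C(1) << 15)
#define VM_EXIT_SAVE_GUEST_EFER  (UINT32_C(1) << 20)
#define VM_EXIT_LOAD_HOST_EFER   (UINT32_C(1) << 21)

/* Hypothetical: how MSR_EFER is switched on vmentry/exit. */
enum efer_switch_method {
    EFER_VIA_VMCS_FIELDS,  /* guest/host EFER fields in the VMCS */
    EFER_VIA_MSR_LISTS,    /* MSR load/save lists on older hardware */
};

/*
 * Arguments are the sets of control bits the hardware allows to be set.
 * Assumption: loading the guest value on entry and the host value on exit
 * is enough to use the VMCS fields.
 */
static enum efer_switch_method
choose_efer_method(uint32_t entry_allowed, uint32_t exit_allowed)
{
    if ( (entry_allowed & VM_ENTRY_LOAD_GUEST_EFER) &&
         (exit_allowed & VM_EXIT_LOAD_HOST_EFER) )
        return EFER_VIA_VMCS_FIELDS;

    return EFER_VIA_MSR_LISTS;
}
```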

There are two minor complications when selecting the EFER setting:
* For shadow guests, NXE is a paging setting and must remain under host
   control, but this is fine as Xen also handles the pagefaults.
* When the Unrestricted Guest control is clear, hardware doesn't tolerate LME
   and LMA being different.  This doesn't matter in practice as we intercept
   all writes to CR0 and reads from MSR_EFER, so can provide architecturally
   consistent behaviour from the guest's point of view.
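
A minimal sketch of these two adjustments, assuming a hypothetical helper
which takes the guest's virtualised EFER and Xen's own EFER and returns the
value to load for the guest.  Clearing LME while LMA is clear is one way to
keep the two bits consistent; the function and parameter names are
illustrative and not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define EFER_LME (UINT64_C(1) << 8)   /* Long mode enable */
#define EFER_LMA (UINT64_C(1) << 10)  /* Long mode active */
#define EFER_NXE (UINT64_C(1) << 11)  /* No-execute enable */

/* Hypothetical: compute the EFER value hardware should run the guest with. */
static uint64_t select_hw_efer(uint64_t guest_efer, uint64_t xen_efer,
                               bool shadow_paging, bool unrestricted_guest)
{
    uint64_t efer = guest_efer;

    if ( shadow_paging )
    {
        /* NXE is a paging setting owned by the host: use Xen's value. */
        efer &= ~EFER_NXE;
        efer |= xen_efer & EFER_NXE;
    }

    /* Without Unrestricted Guest, LME and LMA must not differ. */
    if ( !unrestricted_guest && !(efer & EFER_LMA) )
        efer &= ~EFER_LME;

    return efer;
}
```

The guest still reads back its own virtualised value; only the value given to
hardware is adjusted.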

Changing how EFER is loaded means vmcs_dump_vcpu() needs adjusting.  Read EFER
from the appropriate information source and, when dumping the guest EFER
value, identify which source was used.

As a result of fixing EFER context switching, we can remove the Intel-special
case from hvm_nx_enabled() and let guest_walk_tables() work with the real
guest paging settings.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>

author	Andrew Cooper <andrew.cooper3@citrix.com>
	Tue, 23 May 2017 16:32:30 +0000 (17:32 +0100)
committer	Andrew Cooper <andrew.cooper3@citrix.com>
	Wed, 4 Jul 2018 11:12:15 +0000 (12:12 +0100)
commit	fd32dcfe4c9a539f8e5d26ff4c5ca50ee54556b2
tree	e270b05f99ca707a29e6e0e17063ae6edbdcc1d6
parent	540d5422a9b41639d7367b1d2b24f6bbd8d5ea67

xen/arch/x86/domain.c
xen/arch/x86/hvm/vmx/vmcs.c
xen/arch/x86/hvm/vmx/vmx.c
xen/include/asm-x86/hvm/hvm.h
xen/include/asm-x86/hvm/vmx/vmcs.h