xenbits.xensource.com Git - xen.git/log
11 years ago defer the domain mapping in scrub_one_page()
Andrew Cooper [Fri, 10 Jan 2014 10:39:21 +0000 (11:39 +0100)]
defer the domain mapping in scrub_one_page()

This avoids a resource leak and needless playing with the pagetables in the
case that the page is broken.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: Keir Fraser <keir@xen.org>
master commit: 7dd4f9da063cb2cd43426c785535534c9d958ce5
master date: 2013-12-09 14:13:23 +0100

11 years ago QEMU_TAG update
Ian Jackson [Thu, 9 Jan 2014 12:56:55 +0000 (12:56 +0000)]
QEMU_TAG update

11 years ago xenstore: sanity check incoming message body lengths
Matthew Daley [Sat, 30 Nov 2013 00:20:04 +0000 (13:20 +1300)]
xenstore: sanity check incoming message body lengths

This is for the client-side receiving messages from xenstored, so there
is no security impact, unlike XSA-72.

Coverity-ID: 1055449
Coverity-ID: 1056028
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 8da1ed9031341381c218b7e6eaab5b4f239a327b)
(cherry picked from commit 014f9219f1dca3ee92948f0cfcda8d1befa6cbcd)

11 years ago libxl: don't leak pcidevs in libxl_pcidev_assignable
Matthew Daley [Sun, 1 Dec 2013 10:15:03 +0000 (23:15 +1300)]
libxl: don't leak pcidevs in libxl_pcidev_assignable

Coverity-ID: 1055896
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 26b35b9ace97f433fcf4c5dfbdfb573d1075255f)
(cherry picked from commit cfa252b05855a712eda0da80cd638c7093ddf89f)

11 years ago libxl: don't leak output vcpu info on error in libxl_list_vcpu
Matthew Daley [Sun, 1 Dec 2013 10:15:01 +0000 (23:15 +1300)]
libxl: don't leak output vcpu info on error in libxl_list_vcpu

Coverity-ID: 1055887
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 3c113a57f55dc4e36e3552342721db01efa832c6)
(cherry picked from commit d41c205e0173ee923e791c2fd320c7eb25f2e9cb)

11 years ago libxl: actually abort if initializing a ctx's lock fails
Matthew Daley [Sun, 1 Dec 2013 10:15:00 +0000 (23:15 +1300)]
libxl: actually abort if initializing a ctx's lock fails

If initializing the ctx's lock fails, don't keep going, but instead
error out.

Coverity-ID: 1055289
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit b1cb2bdde1f2393d75a925e6c15862b93d3e7abd)
(cherry picked from commit 62f88c08b31259032c81163f4133d6f25f033c1e)
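
The "error out instead of continuing" behaviour described above can be sketched roughly as follows. This is only an illustration: `struct demo_ctx`, `demo_ctx_alloc()` and `demo_ctx_free()` are hypothetical names, not the libxl code touched by the patch.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical context type; the real libxl ctx has many more fields. */
struct demo_ctx {
    pthread_mutex_t lock;
};

/* Allocate a context. Returns 0 and stores it in *out, or -1 on failure. */
int demo_ctx_alloc(struct demo_ctx **out)
{
    struct demo_ctx *ctx = calloc(1, sizeof(*ctx));

    *out = NULL;
    if (!ctx)
        return -1;

    if (pthread_mutex_init(&ctx->lock, NULL) != 0) {
        /* Error out here rather than carrying on with an unusable lock. */
        free(ctx);
        return -1;
    }

    *out = ctx;
    return 0;
}

/* Release a context obtained from demo_ctx_alloc(). */
void demo_ctx_free(struct demo_ctx *ctx)
{
    if (!ctx)
        return;
    pthread_mutex_destroy(&ctx->lock);
    free(ctx);
}
```

A caller would check the return value of `demo_ctx_alloc()` before touching the context, and pair every successful allocation with `demo_ctx_free()`.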

11 years ago xl: fixes for do_daemonize
Roger Pau Monne [Fri, 22 Nov 2013 11:54:09 +0000 (12:54 +0100)]
xl: fixes for do_daemonize

Fix usage of CHK_ERRNO in do_daemonize and also remove the usage of a
bogus for(;;).

Coverity-ID: 1130516 and 1130520
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
(cherry picked from commit ed8c9047f6fc6d28fc27d37576ec8c8c1be68efe)

Conflicts:
tools/libxl/xl_cmdimpl.c
(cherry picked from commit c393ff09ade45d1a2a8f1c12eac5eab4d38947a3)

11 years ago libxl: fix fd check in libxl__spawn_local_dm
Roger Pau Monne [Fri, 22 Nov 2013 11:54:08 +0000 (12:54 +0100)]
libxl: fix fd check in libxl__spawn_local_dm

Checking the logfile_w fd for -1 on failure is no longer correct, because
libxl__create_qemu_logfile will now return ERROR_FAIL on failure, which
is -3.

While there, also add an error check for opening /dev/null.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <Ian.Jackson@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
(cherry picked from commit 3b88d95e9c0a5ff91d5b60e94d81f1982af57e7f)

Conflicts:
tools/libxl/libxl_dm.c
(cherry picked from commit 8f1bd27fcd7f8be1353e7309f450283f3e5f7cd0)

Conflicts:
tools/libxl/libxl_dm.c
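
To illustrate why comparing against -1 stops working once a helper returns a negative error code, here is a minimal sketch. `open_logfile()` and `fd_is_valid()` are hypothetical stand-ins, not the libxl functions; only the value -3 for ERROR_FAIL comes from the commit message.

```c
#include <fcntl.h>

#define ERROR_FAIL (-3)  /* value as stated in the commit message */

/*
 * Hypothetical helper that returns either a usable file descriptor
 * or a negative error code (never plain -1).
 */
int open_logfile(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);

    return fd < 0 ? ERROR_FAIL : fd;
}

/*
 * Returns 1 if fd is usable, 0 otherwise.
 * A test of the form "fd == -1" would treat ERROR_FAIL as success.
 */
int fd_is_valid(int fd)
{
    return fd >= 0;
}
```

Testing for any negative value keeps the caller correct regardless of which error code the helper chooses to return.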

11 years ago tools/libxl: Avoid deliberate NULL pointer dereference
Andrew Cooper [Mon, 25 Nov 2013 11:12:50 +0000 (11:12 +0000)]
tools/libxl: Avoid deliberate NULL pointer dereference

Coverity ID: 1055290

Calling LIBXL__LOG_ERRNO(ctx,) with a ctx pointer we have just failed to
allocate is going to end badly.  Opencode a suitable use of xtl_log() instead.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 1677af03c14f2d8d88d2ed9ed8ce6d4906d19fb4)
(cherry picked from commit 4cbbbdfb775d387dc1e0931b44e14d3205c92265)

11 years ago tools/libxc: Improve xc_dom_malloc_filemap() error handling
Andrew Cooper [Mon, 25 Nov 2013 11:05:49 +0000 (11:05 +0000)]
tools/libxc: Improve xc_dom_malloc_filemap() error handling

Coverity ID 1055563

In the original function, mmap() could be called with a length of -1 if the
second lseek failed and the caller had not provided max_size.

While fixing up this error, improve the logging of other error paths.  I know
from personal experience that debugging failures of this function is rather
difficult given only "xc_dom_malloc_filemap: failed (on file <somefile>)" in the logs.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
(cherry picked from commit c635c1ef7833e7505423f6567bf99bd355101587)
(cherry picked from commit a5febe4aeff4ab80ce0411f63f336c25951098cf)
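
The failure mode described above (a failed lseek() result being used as an mmap() length) can be avoided along these lines. This is a sketch under assumptions: `map_whole_file()` is a hypothetical helper and does not reproduce the structure of xc_dom_malloc_filemap().

```c
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Map a whole file read-only. Returns the mapping and stores its size
 * in *size_out, or returns NULL on any failure (leaving *size_out alone).
 */
void *map_whole_file(int fd, size_t *size_out)
{
    off_t end = lseek(fd, 0, SEEK_END);

    /* A failed lseek() yields (off_t)-1; never pass that on as a length. */
    if (end == (off_t)-1 || end == 0)
        return NULL;

    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return NULL;

    void *p = mmap(NULL, (size_t)end, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED)
        return NULL;

    *size_out = (size_t)end;
    return p;
}
```

Each early return is also a natural place to log which step failed, which is the second improvement the commit message mentions.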

11 years ago tools/xc_restore: Initialise console and store mfns
Andrew Cooper [Mon, 25 Nov 2013 11:05:47 +0000 (11:05 +0000)]
tools/xc_restore: Initialise console and store mfns

If the console or store mfn chunks are not present in the migration stream,
stack junk gets reported for the mfns.

XenServer had a very hard to track down VM corruption issue caused by exactly
this issue.  Xenconsoled would connect to a junk mfn and increment the ring
pointer if the junk happened to look like a valid gfn.

Coverity ID: 1056093 1056094

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 592b614f3469bb83d1158c3dc8c15b67aacfbf4f)
(cherry picked from commit 6d7b67c67039ceac36a780b59c2b890739094b95)

Conflicts:
tools/xcutils/xc_restore.c

11 years ago tools/xenconsoled: Fix file handle leaks
Andrew Cooper [Mon, 25 Nov 2013 11:06:39 +0000 (11:06 +0000)]
tools/xenconsoled: Fix file handle leaks

Coverity ID: 715218 1055876 1055877

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 9ab1792e1ce9e77afe2cd230d69e56a0737a735f)
(cherry picked from commit 6f6d936af8acb7d9e36b70e5e70953f695ca3b36)

11 years ago tools/xenconsole: Use xc_domain_getinfo() correctly
Andrew Cooper [Mon, 25 Nov 2013 11:06:38 +0000 (11:06 +0000)]
tools/xenconsole: Use xc_domain_getinfo() correctly

Coverity ID: 1055018

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit aa344500a3bfceb3ef01931609ac1cfaf6dcf52d)
(cherry picked from commit 74cd17f84649012bec7ce484bf7b9c3f3a9e79ae)

11 years ago tools/libxl: Fix integer overflows in sched_sedf_domain_set()
Andrew Cooper [Mon, 25 Nov 2013 11:12:51 +0000 (11:12 +0000)]
tools/libxl: Fix integer overflows in sched_sedf_domain_set()

Coverity ID: 1055662 1055663 1055664

Widen from int to uint64_t before multiplication, rather than afterwards.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Ian Jackson <Ian.Jackson@eu.citrix.com>
(cherry picked from commit 9c01516fee7d548af58fd310d3c93dd71ea9ea28)
(cherry picked from commit 2de748569f827b037ec10104f7c12f44d01d0ffa)
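
The difference between widening before and after the multiplication can be sketched as below. The function name and the millisecond-to-nanosecond factor are assumptions chosen for illustration; they are not taken from the patch itself.

```c
#include <stdint.h>

/*
 * Convert a non-negative duration in milliseconds to nanoseconds.
 *
 * Overflow-prone form (shown only as a comment):
 *     uint64_t ns = ms * 1000000;
 * Here the product is computed in int and can overflow before the
 * result is widened for the assignment.
 *
 * Safe form: widen one operand first, so the multiplication itself
 * is carried out in 64 bits.
 */
uint64_t ms_to_ns(int ms)
{
    return (uint64_t)ms * 1000000;
}
```

For example, `ms_to_ns(5000)` needs a result of 5,000,000,000, which does not fit in a 32-bit int but is computed correctly once the operand is widened first.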

11 years ago tools/libxl: Fix memory leak in sched_domain_output()
Andrew Cooper [Mon, 25 Nov 2013 11:16:48 +0000 (11:16 +0000)]
tools/libxl: Fix memory leak in sched_domain_output()

Coverity ID: 1055904

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Keir Fraser <keir@xen.org>
CC: Jan Beulich <JBeulich@suse.com>
(cherry picked from commit 0792426b798fd3b39909d618cf8fe8bac30594f4)

Conflicts:
tools/libxl/xl_cmdimpl.c
(cherry picked from commit 338a8b13757d6ef36ff4e321cb4ef4190ba6ec02)

11 years ago IOMMU: clear "don't flush" override on error paths
Jan Beulich [Tue, 10 Dec 2013 15:21:57 +0000 (16:21 +0100)]
IOMMU: clear "don't flush" override on error paths

xenmem_add_to_physmap() and iommu_populate_page_table() each have
an error path that fails to clear that flag, thus suppressing further
flushes on the respective pCPU.

In iommu_populate_page_table() also slightly re-arrange code to avoid
the false impression of the flag in question being guarded by a
domain's page_alloc_lock.

This is CVE-2013-6400 / XSA-80.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 552b7fcb9a70f1d4dd0e0cd5fb4d3d9da410104a
master date: 2013-12-10 16:10:37 +0100

11 years ago x86/boot: fix BIOS memory corruption on certain IBM systems
Andrew Cooper [Mon, 9 Dec 2013 13:47:43 +0000 (14:47 +0100)]
x86/boot: fix BIOS memory corruption on certain IBM systems

IBM System x3530 M4 BIOSes (including the latest available at the time of this
patch) will corrupt a byte at physical address 0x105ff1 to the value of 0x86
if %esp has the value 0x00080000 when issuing an `int $0x15 (ax=0xec00)` to
inform the system about our intended operating mode.

Xen gets unhappy when the bootloader has placed its .text section over
this specific region of RAM.

After dropping into 16bit mode, clear all 32 bits of %esp, and for the BIOS
call already documented to be affected by BIOS bugs clear all GPRs.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 1ed76797439e384de18fcd6810bd4743d4f38b1e
master date: 2013-12-06 11:28:00 +0100

11 years ago x86: fix early boot command line parsing
Daniel Kiper [Mon, 9 Dec 2013 13:47:07 +0000 (14:47 +0100)]
x86: fix early boot command line parsing

There is no reliable way to encode the NUL character as a character, so encode
it as a number. See http://sourceware.org/binutils/docs/as/Characters.html.
Octal and hex encoding do not work on at least one system (GNU assembler
version 2.22 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.22).
Without this fix, for example, a no-real-mode option at the end of the xen.gz
command line is not detected.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: dc37e0bfffc673f4bdce1d69ad86098bfb0ab531
master date: 2013-12-04 13:26:37 +0100

11 years ago fix locking in offline_page()
Jan Beulich [Mon, 9 Dec 2013 13:46:26 +0000 (14:46 +0100)]
fix locking in offline_page()

Coverity ID 1055655

Apart from the Coverity-detected lock order reversal (a domain's
page_alloc_lock taken with the heap lock already held), calling
put_page() with heap_lock is a bad idea too (as a possible descendant
from put_page() is free_heap_pages(), which wants to take this very
lock).

From all I can tell the region over which heap_lock was held was far
too large: All we need to protect are the call to mark_page_offline()
and reserve_heap_page() (and I'd even put under question the need for
the former). Hence by slightly re-arranging the if/else-if chain we
can drop the lock much earlier, at once no longer covering the two
put_page() invocations.

While at it, do a little bit of other cleanup: Put the "pod_replace"
code path inline rather than at its own label, and drop the effectively
unused variable "ret".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Keir Fraser <keir@xen.org>
master commit: d4837a56da4a59259dd0cf9f3bdc073159d81d7a
master date: 2013-12-03 12:40:57 +0100

11 years ago Fix ptr calculation when converting from a VA
Jean-Yves Migeon [Mon, 9 Dec 2013 13:45:59 +0000 (14:45 +0100)]
Fix ptr calculation when converting from a VA

The ptr calculation shall take the offset into the page into account
when ptr is valid.

Reported regression on NetBSD's port-xen with last known working libxen
being rev 2.9. This corrupts the kernel symbol table when the table is
not loaded on a page boundary.

Issue was tracked down by FastIce and Jeff Rizzo. See also
http://mail-index.netbsd.org/port-xen/2013/10/16/msg008088.html

Signed-off-by: Jean-Yves Migeon <jym@NetBSD.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: cb08944a482a5e80a3ff1113f0735761cc4c6cb8
master date: 2013-11-29 11:07:01 +0000
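
The point about taking the offset into the page into account can be shown with a small sketch. The 4 KiB page size and the helper name `ptr_in_mapped_page()` are assumptions for illustration, not the code changed by the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT 12                              /* assumed 4 KiB pages */
#define PAGE_SIZE  ((uintptr_t)1 << PAGE_SHIFT)

/*
 * Given the local mapping of the page that contains the virtual
 * address va, return a pointer to the byte that va refers to.
 * Returning page_base alone would only be right for page-aligned
 * addresses.
 */
void *ptr_in_mapped_page(void *page_base, uintptr_t va)
{
    uintptr_t offset = va & (PAGE_SIZE - 1);

    return (char *)page_base + offset;
}
```

With a table that starts in the middle of a page, dropping the offset silently points at the start of the page instead, which matches the symbol table corruption described in the report.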

11 years ago x86: properly handle MSI-X unmask operation from guests
Feng Wu [Mon, 9 Dec 2013 13:45:00 +0000 (14:45 +0100)]
x86: properly handle MSI-X unmask operation from guests

For a pass-through device with MSI-X capability, when the guest tries
to unmask the MSI-X interrupt for the passed-through device, Xen
doesn't clear the mask bit for MSI-X in hardware in the following
scenario, which causes network disconnection:

1. Guest masks the MSI-X interrupt
2. Guest updates the address and data for it
3. Guest unmasks the MSI-X interrupt (This is the problematic step)

Xen doesn't handle step 3 well. When the guest tries to unmask the
MSI-X interrupt, it traps to Xen. Xen just returns to Qemu if it
notices that the address or data has been modified by the guest before,
and Qemu then updates Xen with the latest address/data values by
hypercall. However, the MSI-X interrupt unmask operation is missing
from this whole process: Xen doesn't clear the mask bit in hardware
for the MSI-X interrupt, so it remains disabled, which is why the
guest loses its network connection.

This patch fixes this issue.

Signed-off-by: Feng Wu <feng.wu@intel.com>

Only latch the address if the guest really is unmasking the entry.

Clean up the entire change.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 74fd0036deb585a139b63b26db025805ecedc37a
master date: 2013-11-27 15:15:43 +0100

11 years ago VMX: fix cr0.cd handling
Liu Jinsong [Mon, 9 Dec 2013 13:43:34 +0000 (14:43 +0100)]
VMX: fix cr0.cd handling

This patch closes the XSA-60 security hole:
1. For a guest without VT-d, and for a guest with VT-d but snooped, Xen needs
to do nothing, since the hardware snoop mechanism ensures cache coherency.

2. For a guest with VT-d but non-snooped, cache coherency cannot be
guaranteed by hardware snoop, therefore Xen needs to emulate the UC type
for the guest:
2.1). if it works with Intel EPT, set the guest IA32_PAT fields to UC so that
all guest memory types are UC.
2.2). if it works with shadow, drop all shadows so that any new ones are
created on demand with UC.

This patch also fixes a bug in the shadow cr0.cd setting. The current shadow
code has a small window between cache flush and TLB invalidation, resulting
in possible cache pollution. This patch pauses vcpus so that no vcpu context
is involved in the window.

This is CVE-2013-2212 / XSA-60.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 62652c00efa55fb45374bcc92f7d96fc411aebb2
master date: 2013-11-06 10:12:36 +0100

11 years ago VMX: remove the problematic set_uc_mode logic
Liu Jinsong [Mon, 9 Dec 2013 13:41:44 +0000 (14:41 +0100)]
VMX: remove the problematic set_uc_mode logic

The XSA-60 security hole comes from the problematic vmx_set_uc_mode.
This patch removes the vmx_set_uc_mode logic, which will be replaced by
the PAT approach in a later patch.

This is CVE-2013-2212 / XSA-60.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
master commit: 1c84d046735102e02d2df454ab07f14ac51f235d
master date: 2013-11-06 10:12:00 +0100

11 years ago VMX: disable EPT when !cpu_has_vmx_pat
Liu Jinsong [Mon, 9 Dec 2013 13:40:51 +0000 (14:40 +0100)]
VMX: disable EPT when !cpu_has_vmx_pat

Recently Oracle developers found a Xen security issue with DoS impact,
named XSA-60. Please refer to http://xenbits.xen.org/xsa/advisory-60.html
Basically it involves how to handle the guest cr0.cd setting, which in
some environments consumes so much time that it results in DoS-like behavior.

This is a preparatory patch for fixing XSA-60. A later patch will fix XSA-60
via PAT in the Intel EPT case, which depends on cpu_has_vmx_pat.

This is CVE-2013-2212 / XSA-60.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Jun Nakajima <jun.nakajima@intel.com>
master commit: c13b0d65ddedd74508edef5cd66defffe30468fc
master date: 2013-11-06 10:11:18 +0100

11 years ago x86/hvm: fix segment validation
Tim Deegan [Mon, 9 Dec 2013 13:36:58 +0000 (14:36 +0100)]
x86/hvm: fix segment validation

Also Coverity CID 1055180.

Reported-by: David Binderman <dcb314@hotmail.com>
Signed-off-by: Tim Deegan <tim@xen.org>

Use _SEGMENT_* instead of plain numbers and adjust a comment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 6ed4bfbabd487b41021caa7ed03cee1f00ecbabf
master date: 2013-11-26 09:54:21 +0100

11 years ago x86/AMD: work around erratum 793
Jan Beulich [Tue, 3 Dec 2013 13:15:34 +0000 (14:15 +0100)]
x86/AMD: work around erratum 793

The recommendation is to set a bit in an MSR - do this if the firmware
didn't, considering that otherwise we expose ourselves to a guest
induced DoS.

This is CVE-2013-6885 / XSA-82.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 98162f256ee33994a9881a720419dda9ad4c03a8
master date: 2013-12-03 09:49:54 +0100

11 years ago x86/xsave: fix nonlazy state handling
Liu Jinsong [Mon, 2 Dec 2013 14:56:09 +0000 (15:56 +0100)]
x86/xsave: fix nonlazy state handling

Nonlazy xstates should be xsaved each time vcpu_save_fpu is called.
Operations on nonlazy xstates do not trigger an #NM exception, so
whenever a vcpu is scheduled in they should be restored, and whenever
it is scheduled out they should be saved.

Currently this bug affects the AMD LWP feature, and later the Intel MPX
feature. With the bugfix both LWP and MPX will work fine.

Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>

Furthermore, during restore we also need to set nonlazy_xstate_used
according to the incoming accumulated XCR0.

Also adjust the changes to i387.c such that there won't be a pointless
clts()/stts() pair.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: 7d8b5dd98463524686bdee8b973b53c00c232122
master date: 2013-11-25 11:19:04 +0100

11 years ago x86/crash: disable the watchdog NMIs on the crashing cpu
David Vrabel [Mon, 2 Dec 2013 14:55:16 +0000 (15:55 +0100)]
x86/crash: disable the watchdog NMIs on the crashing cpu

nmi_shootdown_cpus() is called during a crash to park all the other
CPUs.  This changes the NMI trap handlers which means there's no point
in having the watchdog still running.

This also disables the watchdog before executing any crash kexec image
and prevents the image from receiving unexpected NMIs.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>

PVOps Linux as a kexec image shoots itself in the foot otherwise.

On a Core2 system, Linux declares a firmware bug and tries to invert some bits
in the performance counter register.  It ends up setting the number of retired
instructions to generate another NMI to fewer instructions than the NMI
interrupt path itself, and ceases to make any useful progress.

The call to disable_lapic_nmi_watchdog() must be this late into the kexec path
to be sure that this cpu is the one which will execute the kexec image.
Otherwise there are race conditions where the NMIs might be disabled on the
wrong cpu, resulting in the kexec image still receiving NMIs.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 2a16fcd5ba0244fef764886211452acc69c0ed00
master date: 2013-11-22 14:48:12 +0100

11 years ago x86/hvm: reset TSC to 0 after domain resume from S3
Tomasz Wroblewski [Mon, 2 Dec 2013 14:54:42 +0000 (15:54 +0100)]
x86/hvm: reset TSC to 0 after domain resume from S3

Host S3 implicitly resets the host TSC to 0, but the tsc offset for hvm
domains is not recalculated when they resume, causing it to go into
negative values. In a Linux guest using the tsc clocksource, this results in
a hang after wrapping back to positive values, since the tsc clocksource
implementation expects it to be reset.

Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
master commit: e95dc6ba69daef6468b3ae5912710727244d6e2f
master date: 2013-11-22 14:47:24 +0100

11 years ago x86: consider modules when cutting off memory
Jan Beulich [Mon, 2 Dec 2013 14:53:50 +0000 (15:53 +0100)]
x86: consider modules when cutting off memory

The code in question runs after module ranges have already been removed from
the E820 table, so when determining the new maximum page/PDX we need to
explicitly take them into account.

Furthermore we need to round up the ending addresses here, in order to
fully cover any partial trailing pages.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: a5db2c7aab7a638d84f22ac8fe5089d81175438b
master date: 2013-11-18 13:57:20 +0100
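
The rounding-up of ending addresses mentioned above amounts to the usual page arithmetic. In this sketch the 4 KiB page size and both function names are assumptions for illustration only.

```c
#include <stdint.h>

#define PAGE_SHIFT 12  /* assumed 4 KiB pages */

/*
 * Page frame number just past an exclusive end address, rounded up
 * so that a partially used trailing page is still covered.
 */
uint64_t end_pfn_round_up(uint64_t end_addr)
{
    return (end_addr + (((uint64_t)1 << PAGE_SHIFT) - 1)) >> PAGE_SHIFT;
}

/* Truncating variant, shown for contrast: drops a partial trailing page. */
uint64_t end_pfn_round_down(uint64_t end_addr)
{
    return end_addr >> PAGE_SHIFT;
}
```

For a range ending at 0x2001, the truncating form reports two pages while the rounded-up form reports three, so only the latter covers the byte at 0x2000.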

11 years ago fix leaking of v->cpu_affinity_saved on domain destruction
Dario Faggioli [Mon, 2 Dec 2013 14:52:20 +0000 (15:52 +0100)]
fix leaking of v->cpu_affinity_saved on domain destruction

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: 6757efe1bf50ac7ff68fa4dd7d9333529f70ae9a
master date: 2013-11-15 17:43:28 +0100

11 years ago credit: Update other parameters when setting tslice_ms
Nate Studer [Mon, 2 Dec 2013 14:51:00 +0000 (15:51 +0100)]
credit: Update other parameters when setting tslice_ms

Add a utility function to update the rest of the timeslice
accounting fields when updating the timeslice of the
credit scheduler, so that capped CPUs behave correctly.

Before this patch changing the timeslice to a value higher
than the default would result in a domain not utilizing
its full capacity and changing the timeslice to a value
lower than the default would result in a domain exceeding
its capacity.

Signed-off-by: Nate Studer <nate.studer@dornerworks.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
master commit: 1320b8100c2ed390fc640557a050f5c700d8338d
master date: 2013-11-15 17:38:10 +0100

11 years ago x86/HVM: only allow ring 0 guest code to make hypercalls
Jan Beulich [Wed, 27 Nov 2013 08:49:28 +0000 (09:49 +0100)]
x86/HVM: only allow ring 0 guest code to make hypercalls

Anything else would allow for privilege escalation.

This is CVE-2013-4554 / XSA-76.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 5c447caaf49192c7b2c057ffbb565ce72aac666d
master date: 2013-11-27 09:01:49 +0100

11 years ago x86: restrict XEN_DOMCTL_getmemlist
Jan Beulich [Wed, 27 Nov 2013 08:48:27 +0000 (09:48 +0100)]
x86: restrict XEN_DOMCTL_getmemlist

Coverity ID 1055652

(See the code comment.)

This is CVE-2013-4553 / XSA-74.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
master commit: 19f027cc5daff4a37fd0a28bca2514c721852dd0
master date: 2013-11-27 09:00:41 +0100

11 years ago QEMU_TAG update
Ian Jackson [Mon, 25 Nov 2013 13:53:27 +0000 (13:53 +0000)]
QEMU_TAG update

11 years ago libxl: Do not generate short block in libxl__datacopier_prefixdata
Ian Jackson [Tue, 3 Sep 2013 12:41:46 +0000 (13:41 +0100)]
libxl: Do not generate short block in libxl__datacopier_prefixdata

libxl__datacopier_prefixdata would prepend a deliberately short block
(not just a half-full one, but one with a short buffer) to the
dc->bufs queue.  However, this is wrong because datacopier_readable
will find it and try to continue to fill it up.

Instead, allocate a full-sized buffer.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Tested-by: Chunyan Liu <cyliu@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit f6d2a87f051456bddb7a47922c8cf60f37073063)
(cherry picked from commit 841a5a5aca13255e7006e19eb880eaf6df143ac2)

11 years ago libxl: save/restore errno in SIGCHLD handler
Ian Jackson [Mon, 11 Nov 2013 17:17:55 +0000 (17:17 +0000)]
libxl: save/restore errno in SIGCHLD handler

Without this, code interrupted by SIGCHLD may experience strange
values of errno.  (As far as I know this is not the cause of any
reported bugs.)

This fix should be backported in due course.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 704d10289bcaee46a3ef6cdc7966186e3b8033fa)
(cherry picked from commit be68feb0e12a4538a33f5526ec1a09165aa18a45)
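
The save/restore pattern described above looks roughly like this in a generic handler. The self-pipe descriptor `wake_fd` and the handler body are assumptions for illustration; this is not the libxl handler.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical write end of a self-pipe used to wake the main loop. */
int wake_fd = -1;

/*
 * SIGCHLD handler that leaves errno as it found it, so that the
 * interrupted code never observes a value set inside the handler.
 */
void sigchld_handler(int signo)
{
    int saved_errno = errno;
    char byte = 0;
    ssize_t r;

    (void)signo;

    /* write() may fail and overwrite errno (e.g. EBADF or EAGAIN). */
    r = write(wake_fd, &byte, 1);
    (void)r;

    errno = saved_errno;
}
```

The handler would typically be installed with sigaction(); only async-signal-safe calls such as write() are used inside it.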

11 years ago VT-d: fix TLB flushing in dma_pte_clear_one()
Jan Beulich [Mon, 18 Nov 2013 13:00:34 +0000 (14:00 +0100)]
VT-d: fix TLB flushing in dma_pte_clear_one()

The third parameter of __intel_iommu_iotlb_flush() is to indicate
whether the to be flushed entry was a present one. A few lines before,
we bailed if !dma_pte_present(*pte), so there's no need to check the
flag here again - we can simply always pass TRUE here.

This is XSA-78.

Suggested-by: Cheng Yueqiang <yqcheng.2008@phdis.smu.edu.sg>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 85c72f9fe764ed96f5c149efcdd69ab7c18bfe3d
master date: 2013-11-18 13:55:55 +0100

11 years ago x86: eliminate has_arch_mmios()
Jan Beulich [Fri, 15 Nov 2013 10:47:41 +0000 (11:47 +0100)]
x86: eliminate has_arch_mmios()

... as being generally insufficient: Either has_arch_pdevs() or
cache_flush_permitted() should be used (in particular, it is
insufficient to consider MMIO ranges alone - I/O port ranges have the
same requirements if available to a guest).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 79233938ab2a8f273fd5dcdbf8e8381b9eb3a461
master date: 2013-11-12 16:28:47 +0100

11 years ago VMX: don't crash processing 'd' debug key
Jan Beulich [Fri, 15 Nov 2013 10:46:26 +0000 (11:46 +0100)]
VMX: don't crash processing 'd' debug key

There's a window during scheduling where "current" and the active VMCS
may disagree: The former gets set much earlier than the latter. Since
both vmx_vmcs_enter() and vmx_vmcs_exit() immediately return when the
subject vCPU is "current", accessing VMCS fields would, depending on
whether there is any currently active VMCS, either read wrong data, or
cause a crash.

Going forward we might want to consider reducing the window during
which vmx_vmcs_enter() might fail (e.g. doing a plain __vmptrld() when
v->arch.hvm_vmx.vmcs != this_cpu(current_vmcs) but arch_vmx->active_cpu
== -1), but that would add complexities (acquiring and - more
importantly - properly dropping v->arch.hvm_vmx.vmcs_lock) that don't
look worthwhile adding right now.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 58929248461ecadce13e92eb5a5d9ef718a7c88e
master date: 2013-11-12 11:52:19 +0100

11 years ago x86/EFI: make trampoline allocation more flexible
Jan Beulich [Fri, 15 Nov 2013 10:44:17 +0000 (11:44 +0100)]
x86/EFI: make trampoline allocation more flexible

Certain UEFI implementations reserve all memory below 1MB at boot time,
making it impossible to properly allocate the chunk necessary for the
trampoline. Fall back to simply grabbing a chunk from EfiBootServices*
regions immediately prior to calling ExitBootServices().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: c1f2dfe8f6a559bc28935f24e31bb33d17d9713d
master date: 2013-11-08 11:08:32 +0100

11 years ago x86/HVM: 32-bit IN result must be zero-extended to 64 bits
Jan Beulich [Fri, 15 Nov 2013 10:42:36 +0000 (11:42 +0100)]
x86/HVM: 32-bit IN result must be zero-extended to 64 bits

Just like for all other operations with 32-bit operand size.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>

x86/HVM: 32-bit IN result must be zero-extended to 64 bits (part 2)

Just spotted a counterpart of what commit 9d89100b (same title) dealt
with.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 9d89100ba8b7b02adb7c2e89ef7c81e734942e7c
master date: 2013-11-05 14:51:53 +0100
master commit: 1e521eddeb51a9f1bf0e4dd1d17efc873eafae41
master date: 2013-11-15 11:01:49 +0100
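
The x86-64 register-write rule referred to above (8-bit and 16-bit results leave the upper bits alone, 32-bit results clear the upper half) can be modelled as follows. `store_in_result()` is a hypothetical helper, not the emulation code changed by these commits.

```c
#include <stdint.h>

/*
 * Model of storing the result of an IN instruction into a 64-bit
 * register value. Returns the new register contents.
 */
uint64_t store_in_result(uint64_t reg, uint32_t data, unsigned int bytes)
{
    switch (bytes) {
    case 1:
        /* 8-bit operand: only the low byte is replaced. */
        return (reg & ~(uint64_t)0xff) | (data & 0xff);
    case 2:
        /* 16-bit operand: only the low word is replaced. */
        return (reg & ~(uint64_t)0xffff) | (data & 0xffff);
    case 4:
        /* 32-bit operand: result is zero-extended to 64 bits. */
        return (uint64_t)data;
    default:
        /* Unsupported size: leave the register untouched. */
        return reg;
    }
}
```

Treating the 32-bit case like the narrower ones (merging into the old value) would leave stale data in the upper half of the register.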

11 years ago x86/ACPI/x2APIC: guard against out of range ACPI or APIC IDs
Jan Beulich [Fri, 15 Nov 2013 10:37:39 +0000 (11:37 +0100)]
x86/ACPI/x2APIC: guard against out of range ACPI or APIC IDs

Other than for the legacy APIC, the x2APIC MADT entries have valid
ranges possibly extending beyond what our internal arrays can handle,
and hence we need to guard ourselves against corrupting memory here.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Keir Fraser <keir@xen.org>
master commit: 2c24cdcce3269f3286790c63821951a1de93c66a
master date: 2013-11-04 10:10:04 +0100
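
Guarding an internal array against identifiers that are valid in firmware tables but too large for the array can be sketched like this. The array, its size and `record_id()` are hypothetical and do not mirror Xen's data structures.

```c
#include <stdint.h>

#define MAX_IDS 256  /* hypothetical capacity of the internal table */

static uint32_t id_to_cpu[MAX_IDS];

/*
 * Record the CPU number for a firmware-provided ID.
 * Returns 0 on success, or -1 if the ID does not fit the table,
 * in which case nothing is written.
 */
int record_id(uint32_t id, uint32_t cpu)
{
    if (id >= MAX_IDS)
        return -1;

    id_to_cpu[id] = cpu;
    return 0;
}
```

A caller would skip (and probably log) any entry for which `record_id()` fails rather than indexing the table unchecked.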

11 years ago x86: refine address validity checks before accessing page tables
Jan Beulich [Fri, 15 Nov 2013 10:36:39 +0000 (11:36 +0100)]
x86: refine address validity checks before accessing page tables

In commit 40d66baa ("x86: correct LDT checks") and d06a0d71 ("x86: add
address validity check to guest_map_l1e()") I didn't really pay
attention to the fact that these checks would better be done before the
paging_mode_translate() ones, as there's also no equivalent check down
the shadow code paths involved here (at least not up to the first use
of the address), and such generic checks shouldn't really be done by
particular backend functions anyway.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
master commit: 343cad8c70585c4dba8afc75e1ec1b7610605ab2
master date: 2013-10-28 12:00:36 +0100

11 years ago x86/xsave: also save/restore XCR0 across suspend (ACPI S3)
Jan Beulich [Fri, 15 Nov 2013 10:36:12 +0000 (11:36 +0100)]
x86/xsave: also save/restore XCR0 across suspend (ACPI S3)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: e47a90e6dca491c0ceea6ffa18055e7e32565e8e
master date: 2013-10-21 17:26:16 +0200

11 years ago credit: unpause parked vcpu before destroying it
Juergen Gross [Fri, 15 Nov 2013 10:35:28 +0000 (11:35 +0100)]
credit: unpause parked vcpu before destroying it

A capped out vcpu must be unpaused in case of moving it to another cpupool,
otherwise it will be paused forever.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
master commit: d38a668b6ef8c84d1d3fda9947ffb0056d01fe3a
master date: 2013-10-16 12:26:48 +0200

11 years agosched: fix race between sched_move_domain() and vcpu_wake()
David Vrabel [Fri, 15 Nov 2013 10:34:43 +0000 (11:34 +0100)]
sched: fix race between sched_move_domain() and vcpu_wake()

From: David Vrabel <david.vrabel@citrix.com>

sched_move_domain() changes v->processor for all the domain's VCPUs.
If another domain, softirq etc. triggers a simultaneous call to
vcpu_wake() (e.g., by setting an event channel as pending), then
vcpu_wake() may lock one schedule lock and try to unlock another.

vcpu_schedule_lock() attempts to handle this but only does so for the
window between reading the schedule_lock from the per-CPU data and the
spin_lock() call.  This does not help with sched_move_domain()
changing v->processor between the calls to vcpu_schedule_lock() and
vcpu_schedule_unlock().

Fix the race by taking the schedule_lock for v->processor in
sched_move_domain().

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Acked-by: Juergen Gross <juergen.gross@ts.fujitsu.com>

Use vcpu_schedule_lock_irq() (which now returns the lock) to properly
retry the locking should the to be used lock have changed in the course
of acquiring it (issue pointed out by George Dunlap).

Add a comment explaining the state after the v->processor adjustment.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: ef55257bc81204e34691f1c2aa9e01f2d0768bdd
master date: 2013-10-14 08:58:31 +0200

11 years agofix locking in cpu_disable_scheduler()
Jan Beulich [Fri, 15 Nov 2013 10:34:01 +0000 (11:34 +0100)]
fix locking in cpu_disable_scheduler()

So commit eedd6039 ("scheduler: adjust internal locking interface")
uncovered - by now using proper spin lock constructs - a bug after all:
When bringing down a CPU, cpu_disable_scheduler() gets called with
interrupts disabled, and hence the use of vcpu_schedule_lock_irq() was
never really correct (i.e. the caller ended up with interrupts enabled
despite having disabled them explicitly).

Fixing this however surfaced another problem: The call path
vcpu_migrate() -> evtchn_move_pirqs() wants to acquire the event lock,
which however is a non-IRQ-safe one, and hence check_lock() doesn't
like this lock to be acquired when interrupts are already off. As we're
in stop-machine context here, getting things wrong wrt interrupt state
management during lock acquire/release is out of the question though, so
the simple solution to this appears to be to just suppress spin lock
debugging for the period of time while the stop machine callback gets
run.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 41a0cc9e26160a89245c9ba3233e3f70bf9cd4b4
master date: 2013-10-29 09:57:14 +0100

11 years agoscheduler: adjust internal locking interface
Jan Beulich [Fri, 15 Nov 2013 10:32:51 +0000 (11:32 +0100)]
scheduler: adjust internal locking interface

Make the locking functions return the lock pointers, so they can be
passed to the unlocking functions (which in turn can check that the
lock is still actually providing the intended protection, i.e. the
parameters determining which lock is the right one didn't change).

Further use proper spin lock primitives rather than open coded
local_irq_...() constructs, so that interrupts can be re-enabled as
appropriate while spinning.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: eedd60391610629b4e8a2e8278b857ff884f750d
master date: 2013-10-14 08:57:56 +0200
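
The interface change described above can be sketched as follows. This is a single-threaded toy model, not Xen's code: toy_lock_t, toy_vcpu, toy_vcpu_schedule_lock() and toy_vcpu_schedule_unlock() are names invented for illustration, and the "lock" only records ownership instead of spinning.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy lock: records ownership only (a real lock would spin). */
typedef struct { bool held; } toy_lock_t;

/* Hypothetical layout: one schedule lock per CPU. */
static toy_lock_t schedule_lock[NR_CPUS];

struct toy_vcpu { unsigned int processor; };

static void toy_lock(toy_lock_t *l)   { assert(!l->held); l->held = true; }
static void toy_unlock(toy_lock_t *l) { assert(l->held);  l->held = false; }

/*
 * Acquire the schedule lock covering v and return it.  If v->processor
 * changed between picking the lock and acquiring it, drop it and retry.
 */
static toy_lock_t *toy_vcpu_schedule_lock(struct toy_vcpu *v)
{
    for ( ; ; )
    {
        toy_lock_t *lock = &schedule_lock[v->processor];

        toy_lock(lock);
        if ( lock == &schedule_lock[v->processor] )
            return lock;
        toy_unlock(lock);
    }
}

/*
 * Release the lock.  The caller passes back the pointer it was given, so
 * we can check that it still is the lock protecting v.
 */
static void toy_vcpu_schedule_unlock(toy_lock_t *lock, struct toy_vcpu *v)
{
    assert(lock == &schedule_lock[v->processor]);
    toy_unlock(lock);
}
```

A caller keeps the returned pointer for the duration of the critical section and hands it to the unlock function, which is what lets the unlock side detect a v->processor change made behind its back.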

11 years agolibxl: Fix bug in libxl_cdrom_insert, make more robust against bad xenstore data
Ian Jackson [Wed, 1 May 2013 15:56:54 +0000 (16:56 +0100)]
libxl: Fix bug in libxl_cdrom_insert, make more robust against bad xenstore data

libxl_cdrom_insert was failing to initialize the backend type,
resulting in the wrong default backend.  The result was not only that
the CD was not inserted properly, but also that some improper xenstore
entries were created, causing further block commands to fail.

This patch fixes the bug by setting the disk backend type based on the
type of the existing device.

It also makes the system more robust by checking to see that it has
got a valid path before proceeding to write a partial xenstore entry.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit c3556e2a1aee3c9b7dda5d57e85e8867fff1b9da)

Conflicts:
tools/libxl/libxl.c
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

11 years agonested VMX: VMLAUNCH/VMRESUME emulation must check permission first thing
Jan Beulich [Mon, 11 Nov 2013 08:18:59 +0000 (09:18 +0100)]
nested VMX: VMLAUNCH/VMRESUME emulation must check permission first thing

Otherwise uninitialized data may be used, leading to crashes.

This is CVE-2013-4551 / XSA-75.

Reported-and-tested-by: Jeff Zimmerman <Jeff_Zimmerman@McAfee.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-and-tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 4e87bc5b03e05123ba5c888f77969140c8ebd1bf
master date: 2013-11-11 09:15:04 +0100

11 years agognttab: correct locking order reversal
Andrew Cooper [Mon, 4 Nov 2013 13:53:28 +0000 (14:53 +0100)]
gnttab: correct locking order reversal

Coverity ID 1087189

Correct a lock order reversal between a domain's page allocation and grant
table locks.

This is CVE-2013-4494 / XSA-73.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Consolidate error handling.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Keir Fraser <keir@xen.org>
Tested-by: Matthew Daley <mattjd@gmail.com>
master commit: a321abc6d8122f8cb3928066cc74363c4fdddcfd
master date: 2013-11-04 10:06:36 +0100

11 years agotools: xenstored: if the reply is too big then send E2BIG error
Ian Jackson [Tue, 29 Oct 2013 15:45:53 +0000 (15:45 +0000)]
tools: xenstored: if the reply is too big then send E2BIG error

This fixes the issue for both C and ocaml xenstored, however only the ocaml
xenstored is vulnerable in its default configuration.

Adding a new error appears to be safe, since both libxenstore and the Linux
driver at least treat an unknown error code as EINVAL.

This is XSA-72 / CVE-2013-4416.

Original ocaml patch by Jerome Maloberti <jerome.maloberti@citrix.com>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Thomas Sanders <thomas.sanders@citrix.com>
(cherry picked from commit 8b2c441a1b53a43a38b3c517e28f239da3349872)
(cherry picked from commit d88ac91ef2f0a93ea9359a8133405dbd78abc89b)

11 years agoRevert "x86/percpu: Force INVALID_PERCPU_AREA into the non-canonical address region"
Jan Beulich [Mon, 28 Oct 2013 10:03:54 +0000 (11:03 +0100)]
Revert "x86/percpu: Force INVALID_PERCPU_AREA into the non-canonical address region"

This reverts commit 707aec94c54127ebfda7d0f8455ecbb332ee49f0.
It needs the 32-bit case to be taken into account.

11 years agox86-64: check for canonical address before doing page walks
Jan Beulich [Tue, 22 Oct 2013 10:07:40 +0000 (12:07 +0200)]
x86-64: check for canonical address before doing page walks

... as there doesn't really exist any valid mapping for them.

Particularly in the case of do_page_walk() this also avoids returning
non-NULL for such invalid input.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 6fd9b0361e2eb5a7f12bdd5cbf7e42c0d1937d26
master date: 2013-10-11 09:31:16 +0200
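
As an illustration of the check this refers to, here is a minimal sketch of a canonical-address test. It assumes 48 implemented virtual address bits (4-level paging); is_canonical() is a name chosen for this example, not necessarily the helper Xen uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48 implemented virtual address bits. */
#define VADDR_BITS 48

/*
 * An address is canonical if bits 63..47 are all equal, i.e. the upper
 * bits are a sign extension of bit 47.
 */
static bool is_canonical(uint64_t addr)
{
    uint64_t upper = addr >> (VADDR_BITS - 1);    /* bits 63..47, 17 bits */

    return upper == 0 || upper == UINT64_C(0x1ffff);
}
```

A page walker following this pattern would test the address first and bail out (for instance by returning NULL) before touching any page table.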

11 years agox86: add address validity check to guest_map_l1e()
Jan Beulich [Tue, 22 Oct 2013 10:06:43 +0000 (12:06 +0200)]
x86: add address validity check to guest_map_l1e()

Just like for guest_get_eff_l1e() this prevents accessing as page
tables (and with the wrong memory attribute) internal data inside Xen
happening to be mapped with 1Gb pages.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: d06a0d715ec1423b6c42141ab1b0ff69a3effb56
master date: 2013-10-11 09:29:43 +0200

11 years agox86: correct LDT checks
Jan Beulich [Tue, 22 Oct 2013 10:05:45 +0000 (12:05 +0200)]
x86: correct LDT checks

- MMUEXT_SET_LDT should behave as similarly to the LLDT instruction as
  possible: fail only if the base address is non-canonical
- instead LDT descriptor accesses should fault if the descriptor
  address ends up being non-canonical (by ensuring this we at once
  avoid reading an entry from the mach-to-phys table and consider it a
  page table entry)
- fault propagation on using LDT selectors must distinguish #PF and #GP
  (the latter must be raised for a non-canonical descriptor address,
  which also applies to several other uses of propagate_page_fault(),
  and hence the problem is being fixed there)
- map_ldt_shadow_page() should properly wrap addresses for 32-bit VMs

At once remove the odd invocation of map_ldt_shadow_page() from the
MMUEXT_SET_LDT handler: There's nothing really telling us that the
first LDT page is going to be preferred over others.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 40d66baa46ca8a9ffa6df3e063a967d08ec92bcf
master date: 2013-10-11 09:28:26 +0200

11 years agoforbid PV guest console reads
Daniel De Graaf [Tue, 22 Oct 2013 10:04:43 +0000 (12:04 +0200)]
forbid PV guest console reads

The CONSOLEIO_read operation was incorrectly allowed to PV guests if the
hypervisor was compiled in debug mode (with VERBOSE defined).

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
master commit: 65ba631bcb62c79eb33ebfde8a0471fd012c37a8
master date: 2013-10-04 12:51:44 +0200

11 years agox86/percpu: Force INVALID_PERCPU_AREA into the non-canonical address region
Andrew Cooper [Tue, 22 Oct 2013 10:04:01 +0000 (12:04 +0200)]
x86/percpu: Force INVALID_PERCPU_AREA into the non-canonical address region

This causes accidental uses of per_cpu() on a pcpu with an INVALID_PERCPU_AREA
to result in a #GP for attempting to access the middle of the non-canonical
virtual address region.

This is preferable to the current behaviour, where incorrect use of per_cpu()
will result in an effective NULL structure dereference which has security
implication in the context of PV guests.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 7cfb0053629c4dd1a6f01dc43cca7c0c25b8b7bf
master date: 2013-10-04 12:24:34 +0200

11 years agox86/idle: Fix get_cpu_idle_time()'s interaction with offline pcpus
Andrew Cooper [Tue, 22 Oct 2013 10:03:03 +0000 (12:03 +0200)]
x86/idle: Fix get_cpu_idle_time()'s interaction with offline pcpus

Checking for "idle_vcpu[cpu] != NULL" is insufficient protection against
offline pcpus.  From a hypercall, vcpu_runstate_get() will determine "v !=
current", and try to take the vcpu_schedule_lock().  This will try to look up
per_cpu(schedule_data, v->processor) and promptly suffer a NULL structure
dereference as v->processor's __per_cpu_offset is INVALID_PERCPU_AREA.

One example might look like this:

...
Xen call trace:
   [<ffff82c4c0126ddb>] vcpu_runstate_get+0x50/0x113
   [<ffff82c4c0126ec6>] get_cpu_idle_time+0x28/0x2e
   [<ffff82c4c012b5cb>] do_sysctl+0x3db/0xeb8
   [<ffff82c4c023280d>] compat_hypercall+0xbd/0x116

Pagetable walk from 0000000000000040:
 L4[0x000] = 0000000186df8027 0000000000028207
 L3[0x000] = 0000000188e36027 00000000000261c9
 L2[0x000] = 0000000000000000 ffffffffffffffff

****************************************
Panic on CPU 11:
...

get_cpu_idle_time() has been updated to correctly deal with offline pcpus
itself by returning 0, in the same way as it would if it was missing the
idle_vcpu[] pointer.

In doing so, XENPF_getidletime needed updating to correctly retain its
described behaviour of clearing bits in the cpumap for offline pcpus.

As this crash can only be triggered with toolstack hypercalls, it is not a
security issue and just a simple bug.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 0aa27ce3351f7eb09d13e863a1d5f303086aa32a
master date: 2013-10-04 12:23:23 +0200

11 years agolibxl: fix out-of-memory error handling in libxl_list_cpupool
Matthew Daley [Tue, 10 Sep 2013 10:18:46 +0000 (22:18 +1200)]
libxl: fix out-of-memory error handling in libxl_list_cpupool

...otherwise it will return freed memory. All the current users of this
function check already for a NULL return, so use that.

Coverity-ID: 1056194

This is CVE-2013-4371 / XSA-70

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 4c37ed562224295c0f8b00211287d57cae629782)
(cherry picked from commit 2350e70ee06c903a927340f7a0bf9ca25acce3f3)

11 years agotools/ocaml: fix erroneous free of cpumap in stub_xc_vcpu_getaffinity
Matthew Daley [Tue, 10 Sep 2013 11:12:45 +0000 (23:12 +1200)]
tools/ocaml: fix erroneous free of cpumap in stub_xc_vcpu_getaffinity

Not sure how it got there...

Coverity-ID: 1056196

This is CVE-2013-4370 / XSA-69

Signed-off-by: Matthew Daley <mattjd@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 3cd10fd21220f2b814324e6e732004f8f0487d0a)
(cherry picked from commit debfacf7d68de8e39a06ebc7f7b22386b28ce6fb)

11 years agolibxl: fix vif rate parsing
Ian Jackson [Thu, 10 Oct 2013 14:48:55 +0000 (15:48 +0100)]
libxl: fix vif rate parsing

strtok can return NULL here. We don't need to use strtok anyway, so just
use a simple strchr method.

Coverity-ID: 1055642

This is CVE-2013-4369 / XSA-68

Signed-off-by: Matthew Daley <mattjd@gmail.com>

Fix type. Add test case

Signed-off-by: Ian Campbell <Ian.campbell@citrix.com>
(cherry picked from commit c53702cee1d6f9f1b72f0cae0b412e21bcda8724)
(cherry picked from commit 60aefd150bc0ad0c7d325da5ffea0bf4e0544130)
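
A rough sketch of the strchr approach, assuming the input has the form RATE or RATE@INTERVAL. split_rate() is a hypothetical helper written for this example and is not the libxl function. Unlike strtok(), which returns NULL for an empty string, it always yields a valid rate pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Split "RATE[@INTERVAL]" in place.  *rate always points at the rate
 * part; *interval points at the text after '@', or is NULL if the
 * string contains no '@'.
 */
static void split_rate(char *spec, char **rate, char **interval)
{
    char *at = strchr(spec, '@');

    *rate = spec;
    if ( at != NULL )
    {
        *at = '\0';            /* terminate the rate part */
        *interval = at + 1;
    }
    else
        *interval = NULL;
}
```

The caller still has to validate both parts, but it no longer dereferences a pointer that may be NULL.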

11 years agox86: check segment descriptor read result in 64-bit OUTS emulation
Matthew Daley [Thu, 10 Oct 2013 13:24:15 +0000 (15:24 +0200)]
x86: check segment descriptor read result in 64-bit OUTS emulation

When emulating such an operation from a 64-bit context (CS has long
mode set), and the data segment is overridden to FS/GS, the result of
reading the overridden segment's descriptor (read_descriptor) is not
checked. If it fails, data_base is left uninitialized.

This can lead to 8 bytes of Xen's stack being leaked to the guest
(implicitly, i.e. via the address given in a #PF).

Coverity-ID: 1055116

This is CVE-2013-4368 / XSA-67.

Signed-off-by: Matthew Daley <mattjd@gmail.com>

Fix formatting.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: 0771faba163769089c9f05f7f76b63e397677613
master date: 2013-10-10 15:19:53 +0200

11 years agox86: properly set up fbld emulation operand address
Jan Beulich [Mon, 30 Sep 2013 12:29:38 +0000 (14:29 +0200)]
x86: properly set up fbld emulation operand address

This is CVE-2013-4361 / XSA-66.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
master commit: 28b706efb6abb637fabfd74cde70a50935a5640b
master date: 2013-09-30 14:18:58 +0200

11 years agox86: properly handle hvm_copy_from_guest_{phys,virt}() errors
Jan Beulich [Mon, 30 Sep 2013 12:26:18 +0000 (14:26 +0200)]
x86: properly handle hvm_copy_from_guest_{phys,virt}() errors

Ignoring them generally implies using uninitialized data and, in all
but two of the cases dealt with here, potentially leaking hypervisor
stack contents to guests.

This is CVE-2013-4355 / XSA-63.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 6bb838e7375f5b031e9ac346b353775c90de45dc
master date: 2013-09-30 14:17:46 +0200
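
The failure mode can be illustrated with a small model. Everything here is invented for the sketch: toy_copy_from_guest() stands in for a guest copy routine and guest_mem for guest memory. The point is only that the local buffer must not be consumed unless the copy reported success.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum toy_copy_result { TOY_COPY_OKAY, TOY_COPY_BAD_ADDRESS };

/* Fake 16-byte "guest memory" for the sketch. */
static uint8_t guest_mem[16] = { 0xde, 0xad, 0xbe, 0xef };

/* Hypothetical copy routine: fails on out-of-range requests. */
static enum toy_copy_result toy_copy_from_guest(void *dst, size_t off,
                                                size_t len)
{
    if ( off > sizeof(guest_mem) || len > sizeof(guest_mem) - off )
        return TOY_COPY_BAD_ADDRESS;
    memcpy(dst, guest_mem + off, len);
    return TOY_COPY_OKAY;
}

/* Read a 32-bit value; returns 0 on success, -1 on failure. */
static int toy_read_guest_u32(size_t off, uint32_t *out)
{
    uint32_t val;    /* uninitialised until the copy succeeds */

    if ( toy_copy_from_guest(&val, off, sizeof(val)) != TOY_COPY_OKAY )
        return -1;   /* never forward 'val': it holds stack contents */

    *out = val;
    return 0;
}
```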

11 years agox86/xsave: initialize extended register state when guests enable it
Jan Beulich [Wed, 25 Sep 2013 08:55:42 +0000 (10:55 +0200)]
x86/xsave: initialize extended register state when guests enable it

Till now, when setting previously unset bits in XCR0 we wouldn't touch
the active register state, thus leaving in the newly enabled registers
whatever a prior user of it left there, i.e. potentially leaking
information between guests.

This is CVE-2013-1442 / XSA-62.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 63a75ba0de817d6f384f96d25427a05c313e2179
master date: 2013-09-25 10:41:25 +0200

11 years agotools: xen-mceinj: Add missing return value checks
Bastian Blank [Sun, 11 Aug 2013 20:10:20 +0000 (22:10 +0200)]
tools: xen-mceinj: Add missing return value checks

The return value of vasprintf must be checked. This check is enforced
with the compiler options used in Debian by request and in Ubuntu by
default.

Check the return value and abort on error.

Signed-off-by: Bastian Blank <waldi@debian.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 1efe90faa31be104a24fe75323429d227eae1d9f)
(cherry picked from commit e36c0917dd54c932816e11a525f294101c77557d)
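
A minimal sketch of the "check and abort" pattern, assuming a glibc-style vasprintf() that returns a negative value on failure. xasprintf() is a wrapper name chosen for this example and is not taken from xen-mceinj.

```c
#define _GNU_SOURCE            /* vasprintf() is a GNU/BSD extension */
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Format into a freshly allocated string.  Aborts if vasprintf() reports
 * failure, in which case the contents of 'buf' must not be used.
 */
static char *xasprintf(const char *fmt, ...)
{
    va_list ap;
    char *buf = NULL;
    int rc;

    va_start(ap, fmt);
    rc = vasprintf(&buf, fmt, ap);
    va_end(ap);

    if ( rc < 0 )
    {
        fputs("vasprintf failed\n", stderr);
        abort();
    }

    return buf;                /* caller frees with free() */
}
```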

11 years agotools/gdbsx: fix build failure with glibc-2.17
Olaf Hering [Thu, 6 Dec 2012 16:50:48 +0000 (16:50 +0000)]
tools/gdbsx: fix build failure with glibc-2.17

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Keir Fraser <keir@xen.org>
(cherry picked from commit 5d98adc3e5e859ba23f62ca63450f6a60a9c5e2f)

11 years agox86/xsave: fix migration from xsave-capable to xsave-incapable host
Jan Beulich [Thu, 12 Sep 2013 09:33:44 +0000 (11:33 +0200)]
x86/xsave: fix migration from xsave-capable to xsave-incapable host

With CPUID features suitably masked this is supposed to work, but was
completely broken (i.e. the case wasn't even considered when the
original xsave save/restore code was written).

First of all, xsave_enabled() wrongly returned the value of
cpu_has_xsave, i.e. not even taking into consideration attributes of
the vCPU in question. Instead this function ought to check whether the
guest ever enabled xsave support (by writing a [non-zero] value to
XCR0). As a result of this, a vCPU's xcr0 and xcr0_accum must no longer
be initialized to XSTATE_FP_SSE (since that's a valid value a guest
could write to XCR0), and the xsave/xrstor as well as the context
switch code need to suitably account for this (by always enforcing at
least this part of the state to be saved/loaded).

This involves undoing large parts of c/s 22945:13a7d1f7f62c ("x86: add
strictly sanity check for XSAVE/XRSTOR") - we need to cleanly
distinguish between hardware capabilities and vCPU used features.

Next both HVM and PV save code needed tweaking to not always save the
full state supported by the underlying hardware, but just the parts
that the guest actually used. Similarly the restore code should bail
not just on state being restored that the hardware cannot handle, but
also on inconsistent save state (inconsistent XCR0 settings or size of
saved state not in line with XCR0).

And finally the PV extended context get/set code needs to use slightly
different logic than the HVM one, as here we can't just key off of
xsave_enabled() (i.e. avoid doing anything if a guest doesn't use
xsave) because the tools use this function to determine host
capabilities as well as read/write vCPU state. The set operation in
particular needs to be capable of cleanly dealing with input that
consists of only the xcr0 and xcr0_accum values (if they're both zero
then no further data is required).

While for things to work correctly both sides (saving _and_ restoring
host) need to run with the fixed code, afaict no breakage should occur
if either side isn't up to date (other than the breakage that this
patch attempts to fix).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 4cc1344447a0458df5d222960f2adf1b65084fa8
master date: 2013-09-09 14:36:54 +0200

11 years agox86/xsave: initialization improvements
Jan Beulich [Thu, 12 Sep 2013 09:32:21 +0000 (11:32 +0200)]
x86/xsave: initialization improvements

- properly validate available feature set on APs
- also validate xsaveopt availability on APs
- properly indicate whether the initialization is on the BSP (we
  shouldn't be using "cpu == 0" checks for this)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: c6066e78f4a66005b0d5d86c6ade32e2ab78923a
master date: 2013-08-30 10:56:07 +0200

11 years agox86: allow guest to set/clear MSI-X mask bit (try 2)
Joby Poriyath [Thu, 12 Sep 2013 09:31:31 +0000 (11:31 +0200)]
x86: allow guest to set/clear MSI-X mask bit (try 2)

Guest needs the ability to enable and disable MSI-X interrupts
by setting the MSI-X control bit, for a passed-through device.
Guest is allowed to write MSI-X mask bit only if Xen *thinks*
that mask is clear (interrupts enabled). If the mask is set by
Xen (interrupts disabled), writes to the mask bit by the guest are
ignored.

Currently, a write to MSI-X mask bit by the guest is silently
ignored.

A likely scenario is where we have a 82599 SR-IOV nic passed
through to a guest. From the guest if you do

  ifconfig <ETH_DEV> down
  ifconfig <ETH_DEV> up

the interrupts remain masked. On VF reset, the mask bit is set
by the controller. At this point, Xen is not aware that mask is set.
However, interrupts are enabled by VF driver by clearing the mask
bit by writing directly to BAR3 region containing the MSI-X table.

From dom0, we can verify that
interrupts are being masked using 'xl debug-keys M'.

Initially, guest was allowed to modify MSI-X bit.
Later this behaviour was changed.
See changeset 74c213c506afcd74a8556dd092995fd4dc38b225.

Signed-off-by: Joby Poriyath <joby.poriyath@citrix.com>
master commit: a35137373aa9042424565e5ee76dc0a3bb7642ae
master date: 2013-09-09 10:43:11 +0200

11 years agox86/EFI: properly handle run time memory regions outside the 1:1 map
Jan Beulich [Thu, 12 Sep 2013 09:30:36 +0000 (11:30 +0200)]
x86/EFI: properly handle run time memory regions outside the 1:1 map

Namely with PFN compression, MMIO ranges that the firmware may need
runtime access to can live in the holes that get shrunk/eliminated by
PFN compression, and hence no mappings would result from simply
copying Xen's direct mapping table's L3 page table entries. Build
mappings for this "manually" in the EFI runtime call 1:1 page tables.

Use the opportunity to also properly identify (via a forcibly undefined
manifest constant) all the disabled code regions associated with it not
being acceptable for us to call SetVirtualAddressMap().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: a350f3f43bcfac9c1591e28d8e43c505fcb172a5
master date: 2013-09-09 10:40:11 +0200

11 years agoxend: fix file descriptor leak in pci utilities
Xi Xiong [Thu, 12 Sep 2013 09:28:03 +0000 (11:28 +0200)]
xend: fix file descriptor leak in pci utilities

A file descriptor leak was detected after creating multiple domUs with
pass-through PCI devices. This patch fixes the issue.

Signed-off-by: Xi Xiong <xixiong@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message]
Signed-off-by: Matt Wilson <msw@amazon.com>
master commit: 749019afca4fd002d36856bad002cc11f7d0ddda
master date: 2013-09-03 16:36:52 +0100

11 years agoxend: handle extended PCI configuration space when saving state
Steven Noonan [Thu, 12 Sep 2013 09:27:27 +0000 (11:27 +0200)]
xend: handle extended PCI configuration space when saving state

Newer PCI standards (e.g., PCI-X 2.0 and PCIe) introduce extended
configuration space which is larger than 256 bytes. This patch uses
stat() to determine the amount of space used to correctly save all of
the PCI configuration space. Resets handled by the xen-pciback driver
don't have this problem, as that code correctly handles saving
extended configuration space.

Signed-off-by: Steven Noonan <snoonan@amazon.com>
Reviewed-by: Matt Wilson <msw@amazon.com>
[msw: adjusted commit message]
Signed-off-by: Matt Wilson <msw@amazon.com>
master commit: 1893cf77992cc0ce9d827a8d345437fa2494b540
master date: 2013-09-03 16:36:47 +0100
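
xend itself is Python; the sketch below shows the same stat()-based size query in C, for consistency with the other examples in this log. file_size() is a helper invented here, and which file it is pointed at (for instance a device's configuration space file in sysfs) is left to the caller.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return the size in bytes of the file at 'path', or -1 if stat() fails. */
static long long file_size(const char *path)
{
    struct stat st;

    if ( stat(path, &st) != 0 )
        return -1;

    return (long long)st.st_size;
}
```

The save code can then read exactly that many bytes instead of assuming 256, which covers devices exposing the 4096-byte extended configuration space.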

11 years agox86: AVX instruction emulation fixes
Jan Beulich [Thu, 12 Sep 2013 09:26:40 +0000 (11:26 +0200)]
x86: AVX instruction emulation fixes

- we used the C4/C5 (first prefix) byte instead of the apparent ModR/M
  one as the second prefix byte
- early decoding normalized vex.reg, thus corrupting it for the main
  consumer (copy_REX_VEX()), resulting in #UD on the two-operand
  instructions we emulate

Also add respective test cases to the testing utility plus
- fix get_fpu() (the fall-through order was inverted)
- add cpu_has_avx2, even if it's currently unused (as in the new test
  cases I decided to refrain from using AVX2 instructions in order to
  be able to actually run all the tests on the hardware I have)
- slightly tweak cpu_has_avx to more consistently express the outputs
  we don't care about (sinking them all into the same variable)

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 062919448e2f4b127c9c3c085b1a8e1d56a33051
master date: 2013-08-28 17:03:50 +0200

11 years agox86: don't allow Dom0 access to the MSI address range
Jan Beulich [Thu, 12 Sep 2013 09:25:34 +0000 (11:25 +0200)]
x86: don't allow Dom0 access to the MSI address range

In particular, MMIO assignments should not be done using this area.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
master commit: 850188e1278cecd1dfb9b936024bee2d8dfdcc18
master date: 2013-08-27 11:11:38 +0200

11 years agox86: Special case __HYPERVISOR_iret rather more when writing hypercall pages
Andrew Cooper [Thu, 12 Sep 2013 08:58:40 +0000 (10:58 +0200)]
x86: Special case __HYPERVISOR_iret rather more when writing hypercall pages

In all cases when a hypercall page is written, __HYPERVISOR_iret is first
written as a regular hypercall, then subsequently rewritten in its special
case.

For VMX and SVM, this means that following the ud2a instruction is 3 bytes of
an imm32 parameter.  For a ring3 kernel, this means that following the syscall
instruction is the second half of 'pop %r11'.

For a ring1 kernel, the iret case ends up as the same number of bytes as the
rest of the hypercalls, but it is pointless writing it twice, and is changed
for consistency.

Therefore, skip the loop iteration which would write the incorrect
__HYPERVISOR_iret hypercall.  This removes junk machine code from the tail and
makes disassemblers rather more happy when looking at the hypercall page.

Also, a miscellaneous whitespace fix in the comment for ring3 kernel.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: fca11da0ec956b17d7450d7776c3ffa22a8f538a
master date: 2013-07-16 11:10:45 +0200

11 years agoACPI: fix acpi_os_map_memory()
Jan Beulich [Wed, 11 Sep 2013 06:23:29 +0000 (08:23 +0200)]
ACPI: fix acpi_os_map_memory()

Its use of map_domain_page() was entirely wrong. Use __acpi_map_table()
instead for the time being, with locking added as the mappings it
produces get replaced with subsequent invocations. Using locking in
this way is acceptable here since the only two runtime callers are
acpi_os_{read,write}_memory(), which don't leave mappings pending upon
returning to their callers.

Also fix __acpi_map_table()'s first parameter's type - while benign for
unstable, backports to pre-4.3 trees will need this.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: 2ee9cbf9d8eaeff6e21222905d22dbd58dc5fe29
master date: 2013-08-21 08:38:40 +0200

11 years agoupdate Xen version to 4.2.4-pre
Jan Beulich [Wed, 11 Sep 2013 06:22:18 +0000 (08:22 +0200)]
update Xen version to 4.2.4-pre

11 years agoupdate Xen version to 4.2.3 RELEASE-4.2.3
Jan Beulich [Mon, 9 Sep 2013 12:27:41 +0000 (14:27 +0200)]
update Xen version to 4.2.3

11 years agoAMD IOMMU: add missing check
Jan Beulich [Fri, 6 Sep 2013 12:49:38 +0000 (14:49 +0200)]
AMD IOMMU: add missing check

We shouldn't accept IVHD tables specifying IO-APIC IDs beyond the limit
we support (MAX_IO_APICS, currently 128).

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Suravee Suthikulpanit <suravee.suthikulapanit@amd.com>
master commit: 3785d30efe8264b899499e0883b10cc434bd0959
master date: 2013-08-29 09:31:37 +0200

11 years agoFix inactive timer list corruption on second S3 resume
Tomasz Wroblewski [Fri, 6 Sep 2013 12:48:49 +0000 (14:48 +0200)]
Fix inactive timer list corruption on second S3 resume

init_timer cannot be safely called multiple times on the same timer, since
it does memset(0) on the structure, erasing the auxiliary member used by
the linked list code. This breaks the inactive timer list in common/timer.c.

Moved resume_timer initialisation to ns16550_init_postirq, so it's only done once.

Signed-off-by: Tomasz Wroblewski <tomasz.wroblewski@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 9e2c5938246546a5b3f698b7421640d85602b994
master date: 2013-08-28 10:18:39 +0200
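
A toy model of the problem and of the "initialise once" fix. The structure, the list and all toy_* names are invented for this sketch, and the explicit done flag is an assumption used to make the once-only property visible; it is not a claim about how ns16550_init_postirq() is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_timer {
    struct toy_timer *inactive_next;    /* linkage owned by the list code */
    void (*fn)(void *);
    void *data;
};

static struct toy_timer *inactive_list;

/*
 * Wipes the structure, then links it into the inactive list.  Calling
 * this twice on the same timer makes the timer point at itself.
 */
static void toy_init_timer(struct toy_timer *t, void (*fn)(void *),
                           void *data)
{
    memset(t, 0, sizeof(*t));
    t->fn = fn;
    t->data = data;
    t->inactive_next = inactive_list;
    inactive_list = t;
}

static struct toy_timer resume_timer;
static bool postirq_done;

static void toy_resume_cb(void *unused) { (void)unused; }

/* One-time setup path: the timer is initialised here and nowhere else. */
static void toy_init_postirq(void)
{
    if ( postirq_done )
        return;
    toy_init_timer(&resume_timer, toy_resume_cb, NULL);
    postirq_done = true;
}
```

With the initialisation confined to the one-time path, resume code only arms the existing timer and the list linkage stays intact across repeated suspend/resume cycles.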

11 years agox86/Intel: add support for Haswell CPU models
Jan Beulich [Fri, 6 Sep 2013 12:48:00 +0000 (14:48 +0200)]
x86/Intel: add support for Haswell CPU models

... according to their most recent public documentation.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 3e787021fb2420851c7bdc3911ea53c728ba5ac0
master date: 2013-08-27 11:15:15 +0200

11 years agox86/Intel: add further support for Ivy Bridge CPU models
Jan Beulich [Fri, 6 Sep 2013 12:47:37 +0000 (14:47 +0200)]
x86/Intel: add further support for Ivy Bridge CPU models

And some initial Haswell ones at once.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: "Nakajima, Jun" <jun.nakajima@intel.com>

11 years agoVT-d: warn about Compatibility Format Interrupts being enabled by firmware
Jan Beulich [Fri, 6 Sep 2013 12:43:51 +0000 (14:43 +0200)]
VT-d: warn about Compatibility Format Interrupts being enabled by firmware

... as being insecure.

Also drop the second (redundant) read DMAR_GSTS_REG from enable_intremap().

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
master commit: c9c6abab583d27fdca1d979a7f1d18ae30f54e9b
master date: 2013-08-21 16:44:58 +0200

11 years agopygrub: add Debian extlinux.conf path
Ian Campbell [Fri, 16 Aug 2013 14:21:05 +0000 (15:21 +0100)]
pygrub: add Debian extlinux.conf path

This is Debian bug #697407.

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=697407

Signed-off-by: Ian Campbell <ijc@hellion.org.uk>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 258d27a1d9fb33a490bef1381f52d522225c3dca)

11 years agooxenstored: Protect oxenstored from malicious domains.
Ian Jackson [Tue, 3 Sep 2013 10:55:48 +0000 (11:55 +0100)]
oxenstored: Protect oxenstored from malicious domains.

Add check logic when reading from the IO ring; if an error occurs,
mark the reading connection as "bad". Until the VM reboots,
oxenstored will not handle any further messages from this connection.

xs_ring_stubs.c: add a more strict check on ring reading
connection.ml, domain.ml: add getter and setter for bad flag
process.ml: if exception raised when reading from domain's ring,
            mark this domain as "bad"
xenstored.ml: if a domain is marked as "bad", do not handle it.

Signed-off-by: John Liu <john.liuqiming@huawei.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
(cherry picked from commit 704302ce9404c73cfb687d31adcf67094ab5bb53)
(cherry picked from commit a978634bee4db6c5e0ceeb66adcc5114f3f9bc48)

Conflicts:
tools/ocaml/xenstored/domain.ml

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
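
The "mark as bad and stop servicing" behaviour described in the commit above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code from xs_ring_stubs.c or the OCaml modules: RING_SIZE, struct conn, ring_indexes_ok() and conn_poll() are hypothetical names, and the index test shown is just one plausible form of a stricter ring check.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 1024u            /* assumed ring size in bytes */

struct conn {                      /* hypothetical per-domain connection */
    bool bad;                      /* set once the ring looked corrupt */
};

/* The producer may never be more than RING_SIZE ahead of the consumer. */
static bool ring_indexes_ok(uint32_t cons, uint32_t prod)
{
    return (uint32_t)(prod - cons) <= RING_SIZE;
}

/* Returns true if the connection may be serviced on this pass. */
static bool conn_poll(struct conn *c, uint32_t cons, uint32_t prod)
{
    if (c->bad)
        return false;              /* ignored until the domain restarts */
    if (!ring_indexes_ok(cons, prod)) {
        c->bad = true;             /* remember the misbehaviour */
        return false;
    }
    return true;
}
```

Once the flag is set, later calls return false even when the indexes look valid again, which mirrors the "unless the VM reboots" wording of the commit message.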

11 years agoupdate Xen version to 4.2.3-rc2
Jan Beulich [Tue, 27 Aug 2013 12:36:00 +0000 (14:36 +0200)]
update Xen version to 4.2.3-rc2

11 years agox86: correct public header's documentation of PAT MSR settings
Jan Beulich [Mon, 26 Aug 2013 10:48:01 +0000 (12:48 +0200)]
x86: correct public header's documentation of PAT MSR settings

The first (PAT6) column was wrong across the board, and the column for
PAT7 was missing altogether.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: 3829655bd3ad2b1150bd94955fc6988dec6b98f2
master date: 2013-08-23 09:23:24 +0200

11 years agoCorrect X2-APIC HVM emulation
Juergen Gross [Thu, 22 Aug 2013 09:30:01 +0000 (11:30 +0200)]
Correct X2-APIC HVM emulation

commit 6859874b61d5ddaf5289e72ed2b2157739b72ca5 ("x86/HVM: fix x2APIC
APIC_ID read emulation") introduced an error for the hvm emulation of
x2apic. Any attempt to write to the APIC_ICR MSR will result in a GP fault.

Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
master commit: 69962e19ed432570f6cdcfdb5f6f22d6e3c54e6c
master date: 2013-08-22 11:24:00 +0200

11 years agoxen: Add stdbool.h workaround for BSD.
Tim Deegan [Tue, 20 Aug 2013 13:40:38 +0000 (15:40 +0200)]
xen: Add stdbool.h workaround for BSD.

On *BSD, stdbool.h lives in /usr/include, but we don't want to have
that on the search path in case we pick up any headers from the build
host's C libraries.

Copy the equivalent hack already in place for stdarg.h: on all
supported compilers the contents of stdbool.h are trivial, so just
supply the things we need in a xen/stdbool.h header.

Signed-off-by: Tim Deegan <tim@xen.org>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Tested-by: Patrick Welche <prlw1@cam.ac.uk>
master commit: 7b9685ca4ed2fd723600ce66eb20a6d0c115b6cb
master date: 2013-08-15 22:00:45 +0100
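
To illustrate why the workaround above is cheap: on C99 compilers the standard header boils down to a handful of macros. The following is a minimal sketch of such a replacement, not the contents of the actual xen/stdbool.h; the guard name XEN_STDBOOL_SKETCH_H and the helper is_power_of_two() are hypothetical.

```c
/* Sketch of a freestanding stdbool.h replacement (assumed layout). */
#ifndef XEN_STDBOOL_SKETCH_H
#define XEN_STDBOOL_SKETCH_H

#define bool  _Bool                /* C99 built-in boolean type */
#define true  1
#define false 0
#define __bool_true_false_are_defined 1

#endif /* XEN_STDBOOL_SKETCH_H */

/* Hypothetical user of the header, to show that the macros suffice. */
static bool is_power_of_two(unsigned long x)
{
    return x != 0 && (x & (x - 1)) == 0;
}
```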

11 years agox86/time: fix check for negative time in __update_vcpu_system_time()
Tim Deegan [Tue, 20 Aug 2013 13:39:39 +0000 (15:39 +0200)]
x86/time: fix check for negative time in __update_vcpu_system_time()

Clang points out that the u64 stime variable is always >= 0.

Signed-off-by: Tim Deegan <tim@xen.org>
master commit: ab7f9a793c78dfea81c037b34b0dd2db7070d8f8
master date: 2013-08-15 13:17:10 +0200
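
The class of bug Clang reports in the commit above can be shown with a small C sketch. How the actual fix is written is not stated in the commit message; the signed reinterpretation below is one possible repair, and stime_is_negative() is a hypothetical helper, not a Xen function.

```c
#include <stdint.h>

typedef uint64_t u64;
typedef int64_t  s64;

/*
 * Broken form: with stime of type u64, the test "stime < 0" is always
 * false, so any branch it guards is dead code.
 *
 * One possible repair: reinterpret the value as signed before testing.
 * (The out-of-range conversion is implementation-defined before C23,
 * but behaves as two's complement on the usual compilers.)
 */
static int stime_is_negative(u64 stime)
{
    return (s64)stime < 0;
}
```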

11 years agox86/MTRR: fix range check in mtrr_add_page()
Jan Beulich [Tue, 20 Aug 2013 13:39:07 +0000 (15:39 +0200)]
x86/MTRR: fix range check in mtrr_add_page()

Extracted from Yinghai Lu's Linux commit d5c78673 ("x86: Fix /proc/mtrr
with base/size more than 44bits").

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
master commit: f67af6d5803b6a015e30cb490a94f9547cb0437c
master date: 2013-08-14 11:20:26 +0200

11 years agoVT-d: protect against bogus information coming from BIOS
Jan Beulich [Tue, 20 Aug 2013 13:38:24 +0000 (15:38 +0200)]
VT-d: protect against bogus information coming from BIOS

Add checks similar to those done by Linux: The DRHD address must not
be all zeros or all ones (Linux only checks for zero), and capabilities
as well as extended capabilities must not be all ones.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Ben Guthro <benjamin.guthro@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Ben Guthro <benjamin.guthro@citrix.com>
Acked-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
master commit: e8e8b030ecf916fea19639f0b6a446c1c9dbe174
master date: 2013-08-14 11:18:24 +0200
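
The checks listed in the commit above can be summarised in a short C sketch. It assumes the DRHD address and the two capability registers have already been read into 64-bit values; drhd_looks_sane() and its parameters are hypothetical names, not the functions used in the VT-d driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the firmware-provided DRHD data looks plausible. */
static bool drhd_looks_sane(uint64_t address, uint64_t cap, uint64_t ecap)
{
    /* The register base must be neither all zeros nor all ones. */
    if (address == 0 || address == UINT64_MAX)
        return false;

    /* All ones in either register suggests nothing answered the read. */
    if (cap == UINT64_MAX || ecap == UINT64_MAX)
        return false;

    return true;
}
```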

11 years agox86/AMD: Inject #GP instead of #UD when unable to map vmcb
Suravee Suthikulpanit [Tue, 20 Aug 2013 13:36:20 +0000 (15:36 +0200)]
x86/AMD: Inject #GP instead of #UD when unable to map vmcb

According to AMD Programmer's Manual vol2, vmrun, vmsave and vmload
should inject #GP instead of #UD when unable to access memory
location for vmcb.  Also, the code should make sure that L1 guest
EFER.SVME is not zero.  Otherwise, #UD should be injected.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Tim Deegan <tim@xen.org>
master commit: 910daaf5aaa837624099c0fc5c373bea7202ff43
master date: 2013-08-13 14:24:16 +0200

11 years agox86/AMD: Fix nested svm crash due to assertion in __virt_to_maddr
Suravee Suthikulpanit [Tue, 20 Aug 2013 13:35:09 +0000 (15:35 +0200)]
x86/AMD: Fix nested svm crash due to assertion in __virt_to_maddr

Fix assertion in __virt_to_maddr when starting a nested SVM guest
in debug mode. Investigation has shown that svm_vmsave/svm_vmload
make use of __pa() with an invalid address.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reviewed-by: Tim Deegan <tim@xen.org>
master commit: 85fc517ec3055e8e8d9c9e36e15a81e630237252
master date: 2013-08-13 14:22:14 +0200

11 years agolibelf: Fix typo in header guard macro
Patrick Welche [Tue, 20 Aug 2013 13:33:14 +0000 (15:33 +0200)]
libelf: Fix typo in header guard macro

s/__LIBELF_PRIVATE_H_/__LIBELF_PRIVATE_H__/

Signed-off-by: Patrick Welche <prlw1@cam.ac.uk>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 0aec8823501f8ee058c1ba673d2ac3e0f3f2e8db
master date: 2013-08-08 12:47:38 +0100

11 years agox86: explicit suffix in inline assembler (for clang).
Tim Deegan [Fri, 16 Aug 2013 10:07:40 +0000 (12:07 +0200)]
x86: explicit suffix in inline assembler (for clang).

This fixes the clang build, and has no effect on gcc's output.

Signed-off-by: Tim Deegan <tim@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
master commit: 59a28b5f045331641cbf0c1fc8d5d67afe328939
master date: 2013-02-14 14:20:06 +0100

Note that this isn't just a build fix - if the "delta" input in the
64-bit variant ends up in memory, gas would default to 32-bit operand
size (and should really warn about the ambiguity).

32-bit portion contributed by NetBSD folks.

11 years agoVTD: Remove the check for reserved device scope type
Yang Zhang [Thu, 15 Aug 2013 07:14:11 +0000 (09:14 +0200)]
VTD: Remove the check for reserved device scope type

Though we only have four valid types now, new types may be added in the future.
It's better to remove the check and only deal with the type that we can
recognize.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Signed-off-by: Xiantao Zhang <xiantao.zhang@Intel.com>
Acked-by: Keir Fraser <keir@xen.org>

Add log message for this case.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: 749bc93f7a1ad47640cc7876d27641e98a08bf61
master date: 2013-04-16 10:36:05 +0200
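
The approach described in the commit above (handle the types we recognise, log and skip the rest) might look roughly like the following C sketch. The enum names and count_known_scopes() are hypothetical; the numeric values follow the four device scope types defined by the VT-d specification at the time.

```c
#include <stddef.h>
#include <stdio.h>

enum scope_type {                  /* illustrative names */
    SCOPE_ENDPOINT = 1,
    SCOPE_BRIDGE   = 2,
    SCOPE_IOAPIC   = 3,
    SCOPE_HPET     = 4,
};

/* Counts recognised entries; unknown types are logged and skipped. */
static size_t count_known_scopes(const unsigned char *types, size_t n)
{
    size_t known = 0;

    for (size_t i = 0; i < n; i++) {
        switch (types[i]) {
        case SCOPE_ENDPOINT:
        case SCOPE_BRIDGE:
        case SCOPE_IOAPIC:
        case SCOPE_HPET:
            known++;
            break;
        default:
            /* Do not fail the whole parse: warn and move on. */
            fprintf(stderr, "unknown device scope type %#x, skipping\n",
                    (unsigned int)types[i]);
            break;
        }
    }
    return known;
}
```

Treating an unrecognised type as a warning rather than an error keeps the parser usable on firmware that already reports types added after this code was written.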