xenbits.xensource.com Git - xen.git/log
11 years ago xl: Fix CHK_ERRNO()
Andrew Cooper [Tue, 10 Dec 2013 15:45:17 +0000 (15:45 +0000)]
xl: Fix CHK_ERRNO()

The macro CHK_ERRNO() was being used to check two different error schemes, and
succeeded at neither.

Split the macro into two; CHK_SYSCALL() for calls which return -1 and set
errno on error, and CHK_ERRNOVAL() for calls which return an errno.

In both cases, ensure that strerror() now gets called with the error integer.
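
The two error conventions can be sketched roughly as follows. This is only an illustration, not the xl macros themselves: the helper names check_syscall() and check_errnoval() are hypothetical, and they return the error code instead of exiting.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: for calls that return -1 and set errno on error. */
static int check_syscall(const char *what, int ret)
{
    if (ret == -1) {
        int err = errno;  /* save it before fprintf() can clobber errno */
        fprintf(stderr, "%s failed: %s\n", what, strerror(err));
        return err;
    }
    return 0;
}

/* Hypothetical helper: for calls that return an errno value directly. */
static int check_errnoval(const char *what, int ret)
{
    if (ret != 0) {
        fprintf(stderr, "%s failed: %s\n", what, strerror(ret));
        return ret;
    }
    return 0;
}
```

In both helpers strerror() receives the error integer, never the raw -1 return value.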

Coverity ID: 1055570 1090374 1130516

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
11 years ago x86/memshr: fix preemption in relinquish_shared_pages()
Jan Beulich [Tue, 17 Dec 2013 15:39:39 +0000 (16:39 +0100)]
x86/memshr: fix preemption in relinquish_shared_pages()

For one, should hypercall_preempt_check() return false the first time
it gets called, it would never have got called again (because count,
being checked for equality, didn't get reset to zero).

And then, if there were a huge range of unshared pages, with count not
getting incremented at all in that case, there would also not be any
preemption.

Fix this by using a biased increment (ratio 1:16 for unshared vs shared
pages), and flushing the count to zero in case of a "false" return from
hypercall_preempt_check().
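
A minimal sketch of the biased counter, under stated assumptions: the threshold value, the function names and the callback standing in for hypercall_preempt_check() are all hypothetical, and only the 1:16 weighting and the reset on a "false" return come from the description above.

```c
#include <stddef.h>

#define PREEMPT_THRESHOLD 0x2000u  /* assumed batch size */
#define SHARED_COST       16u      /* shared pages weigh 16x ... */
#define UNSHARED_COST     1u       /* ... unshared pages still count */

/* Stand-in for hypercall_preempt_check(). */
typedef int (*preempt_check_fn)(void);

static int preempt_never(void)  { return 0; }
static int preempt_always(void) { return 1; }

/*
 * Walk n pages; is_shared[i] is non-zero for shared pages. Returns the
 * number of pages handled before preemption (n if never preempted).
 */
static size_t relinquish_sketch(const unsigned char *is_shared, size_t n,
                                preempt_check_fn check)
{
    unsigned int count = 0;

    for (size_t i = 0; i < n; i++) {
        count += is_shared[i] ? SHARED_COST : UNSHARED_COST;
        if (count >= PREEMPT_THRESHOLD) {
            if (check())
                return i + 1;   /* caller would set up a continuation */
            count = 0;          /* "false" return: start a new batch */
        }
    }
    return n;
}
```

Because the comparison is ">=" and the count is flushed after every check, neither a long run of unshared pages nor a first "false" answer can suppress later checks.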

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
11 years ago x86/mm: Prevent leaking domain mappings in paging_log_dirty_op()
Andrew Cooper [Tue, 17 Dec 2013 15:38:07 +0000 (16:38 +0100)]
x86/mm: Prevent leaking domain mappings in paging_log_dirty_op()

Coverity ID: 1135374 1135375 1135376 1135377

If {copy_to,clear}_guest_offset() fails, we would leak the domain mappings for
l4 through l1.

Fixing this requires having conditional unmaps on the faulting path, which in
turn requires explicitly initialising the pointers to NULL because of the
early ENOMEM exit.
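
The pattern can be sketched as below. This is an illustration only: map_page() and unmap_page() are hypothetical stand-ins for the real mapping primitives, only two levels are shown, and the fail_at parameter exists purely to exercise the exit paths.

```c
#include <stddef.h>

static int live_mappings;            /* outstanding mappings, for illustration */
static char page_l4[16], page_l3[16];

/* Hypothetical stand-ins for the map/unmap primitives. */
static void *map_page(void *backing) { live_mappings++; return backing; }
static void unmap_page(void *p)      { (void)p; live_mappings--; }

/*
 * fail_at: 0 = early exit before anything is mapped, 1 or 2 = fault after
 * that many levels are mapped, anything else = success.
 */
static int walk_sketch(int fail_at)
{
    void *l4 = NULL, *l3 = NULL;  /* NULL so the exit path can test them */
    int rc = 0;

    if (fail_at == 0) { rc = -1; goto out; }  /* e.g. the early ENOMEM exit */

    l4 = map_page(page_l4);
    if (fail_at == 1) { rc = -2; goto out; }  /* e.g. a faulting guest copy */

    l3 = map_page(page_l3);
    if (fail_at == 2) { rc = -2; goto out; }

out:
    if (l3) unmap_page(l3);       /* conditional unmaps on every exit */
    if (l4) unmap_page(l4);
    return rc;
}
```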

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Acked-by: Tim Deegan <tim@xen.org>
11 years ago Xend: handle died domain in getVCPUInfo()
Joe Jin [Tue, 10 Dec 2013 09:04:47 +0000 (17:04 +0800)]
Xend: handle died domain in getVCPUInfo()

When creating a new guest on a NUMA server, xend tries to pick the best node
by collecting the info of all vCPUs. If a domain has already terminated,
getVCPUInfo() throws the exception below and the guest fails to start:

[2013-09-04 20:01:26 6254] ERROR (XendDomainInfo:496) VM start failed
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 482, in start
    XendTask.log_progress(31, 60, self._initDomain)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendTask.py", line 209, in log_progress
    retval = func(*args, **kwds)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2918, in _initDomain
    node = self._setCPUAffinity()
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2835, in _setCPUAffinity
    best_node = find_relaxed_node(candidate_node_list)[0]
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 2803, in find_relaxed_node
    cpuinfo = dom.getVCPUInfo()
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1600, in getVCPUInfo
    raise XendError(str(exn))
XendError: (3, 'No such process')

This patch checks the return value of xc.vcpu_getinfo() and makes sure the
error was not caused by the domain having died before throwing the exception.

Signed-off-by: Joe Jin <joe.jin@oracle.com>
Acked-by: Matt Wilson <msw@amazon.com>
Cc: Keir Fraser <keir@xen.org>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Roger Pau Monne <roger.pau@citrix.com>
11 years ago xen/arm: disable a physical IRQ when the guest disables the corresponding IRQ
Stefano Stabellini [Thu, 12 Dec 2013 18:59:07 +0000 (18:59 +0000)]
xen/arm: disable a physical IRQ when the guest disables the corresponding IRQ

In vgic_disable_irqs remove irqs from the lr_pending queue so that they
won't get automatically injected in the guest on maintenance interrupts.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years ago xen/arm: Only enable physical IRQs when the guest asks
Julien Grall [Thu, 12 Dec 2013 18:59:06 +0000 (18:59 +0000)]
xen/arm: Only enable physical IRQs when the guest asks

Set/Unset IRQ_DISABLED from gic_irq_enable and gic_irq_disable.
Enable IRQs when the guest requests it, not unconditionally at boot time.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago xen/arm: implement gic_irq_enable and gic_irq_disable
Stefano Stabellini [Thu, 12 Dec 2013 18:59:05 +0000 (18:59 +0000)]
xen/arm: implement gic_irq_enable and gic_irq_disable

Rename gic_irq_startup to gic_irq_enable.
Rename gic_irq_shutdown to gic_irq_disable.

Implement gic_irq_startup and gic_irq_shutdown calling gic_irq_enable
and gic_irq_disable.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago xen/arm: do not add a second irq to the LRs if one is already present
Stefano Stabellini [Thu, 12 Dec 2013 18:59:04 +0000 (18:59 +0000)]
xen/arm: do not add a second irq to the LRs if one is already present

When the guest re-enables IRQs, do not add guest IRQs to LRs twice.

Suggested-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago xen/arm: track the state of guest IRQs
Stefano Stabellini [Thu, 12 Dec 2013 18:59:03 +0000 (18:59 +0000)]
xen/arm: track the state of guest IRQs

Introduce a status field in struct pending_irq. Valid states are
GUEST_PENDING, GUEST_VISIBLE and GUEST_ENABLED and they are not mutually
exclusive.  See the in-code comment for an explanation of the states and
how they are used.
Use atomic operations to set and clear the status bits. Note that
setting GIC_IRQ_GUEST_VISIBLE and clearing GIC_IRQ_GUEST_PENDING can be
done in two separate operations as the underlying pending status is
actually only cleared on the LR after the guest ACKs the interrupts.
Until that happens it's not possible to receive another interrupt.

The main effect of this patch is that an IRQ can be set to GUEST_PENDING
while it is being serviced by the guest. In maintenance_interrupt we
check whether GUEST_PENDING is set and if it is we add the irq back into
the lr_pending queue so that it's going to be reinjected one more time,
if the interrupt is still enabled at the vgicd level.
If it is not, it is going to be injected as soon as the guest re-enables
the interrupt.
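
The non-exclusive status bits can be illustrated with the sketch below. It uses C11 atomics rather than Xen's own bit operations, and the bit positions, struct name and helper names are assumptions made for the example; the real semantics are described in the in-code comment mentioned above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Bit positions are assumptions for this sketch. */
enum { GUEST_PENDING = 0, GUEST_VISIBLE = 1, GUEST_ENABLED = 2 };

struct pending_irq_sketch {
    atomic_ulong status;
};

static void set_status(struct pending_irq_sketch *p, int bit)
{
    atomic_fetch_or(&p->status, 1UL << bit);
}

static void clear_status(struct pending_irq_sketch *p, int bit)
{
    atomic_fetch_and(&p->status, ~(1UL << bit));
}

static bool test_status(struct pending_irq_sketch *p, int bit)
{
    return (atomic_load(&p->status) & (1UL << bit)) != 0;
}

/* IRQ is placed in an LR: two separate atomic steps, as noted above. */
static void mark_injected(struct pending_irq_sketch *p)
{
    set_status(p, GUEST_VISIBLE);
    clear_status(p, GUEST_PENDING);
}

/* Guest finished with the IRQ: returns true if it should be re-queued. */
static bool on_guest_eoi(struct pending_irq_sketch *p)
{
    clear_status(p, GUEST_VISIBLE);
    return test_status(p, GUEST_PENDING) && test_status(p, GUEST_ENABLED);
}
```

The point of the sketch is that PENDING can be set again while VISIBLE is still set, which is what allows a second injection after the guest has handled the first.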

One exception is evtchn_irq: in that case we don't want to
set the GIC_IRQ_GUEST_PENDING bit if it is already GUEST_VISIBLE,
because as part of the event handling loop, the guest would realize that
new events are present even without a new notification.
Also we already have a way to figure out exactly when we do need to
inject a second notification if vgic_vcpu_inject_irq is called after the
end of the guest event handling loop and before the guest EOIs the
interrupt (see db453468d92369e7182663fb13e14d83ec4ce456 "arm: vgic: fix
race between evtchn upcall and evtchnop_send").

Don't call gic_inject_irq_stop from maintenance_interrupt because
gic_inject (called by leave_hypervisor_tail) is going to call
gic_inject_irq_start/stop appropriately later anyway.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago xen/arm: Physical IRQ is not always equal to virtual IRQ
Julien Grall [Thu, 12 Dec 2013 18:59:02 +0000 (18:59 +0000)]
xen/arm: Physical IRQ is not always equal to virtual IRQ

When Xen needs to EOI a physical IRQ, we should use the IRQ number
in irq_desc instead of the virtual IRQ.

Remove the eoi flag in maintenance_interrupt and replace the check with
a check on p->desc != NULL.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago tools: libxc: flush data cache after loading images into guest memory
Ian Campbell [Fri, 13 Dec 2013 08:21:51 +0000 (08:21 +0000)]
tools: libxc: flush data cache after loading images into guest memory

On ARM, guest OSes are started with MMU and caches disabled (as they are on
native); however, caching is enabled in the domain running the builder and
therefore we must flush the cache as we load the blobs, otherwise when the
guest starts running it may not see them. The dom0 build in the hypervisor has
the same requirements and already does the right thing.

The mechanism for performing a cache flush from userspace is OS specific, so
implement this as a new osdep hook:

 - On 32-bit ARM, Linux provides a system call to flush the cache.
 - On 64-bit ARM Linux, the processor is configured to allow cache flushes
   directly from userspace.
 - Non-Linux platforms will need to provide their own implementation. If
   similar mechanisms are not available then a new privcmd ioctl should be a
   suitable alternative.

No cache maintenance is required on x86, so provide a stub for all non-Linux
platforms which returns success on x86 only and logs an error otherwise.

This fixes guest building on Xgene which has a very large L3 cache and so is
particularly susceptible to this problem. It has also been observed
sporadically on midway.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Andre Przywara <andre.przywara@calxeda.com>
Cc: Pranavkumar Sawargaonkar <psawargaonkar@apm.com>
Cc: Anup Patel <apatel@apm.com>
11 years ago xl: check for libxl_list_vm failure in print_uptime
Matthew Daley [Sat, 14 Dec 2013 01:15:21 +0000 (14:15 +1300)]
xl: check for libxl_list_vm failure in print_uptime

Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
11 years ago xenconsole: adjust pty opening error checking and handling
Matthew Daley [Sat, 14 Dec 2013 01:04:47 +0000 (14:04 +1300)]
xenconsole: adjust pty opening error checking and handling

Currently we check the pty path received from xenstore with access(); if
it indicates that the pty is not accessible, we loop around and wait for
a new path to appear in xenstore.

This has several issues:
* If a path has been written to xenstore, it can be assumed that that
  pty should already be accessible to xenconsole, and hence any error
  that occurs while trying to open it should be fatal and not ignored
* If access() indicates no access to the pty, the memory allocated for
  the path is leaked when going around the loop again
* The accessibility of the pty could change between the access() and
  open() calls, leading to a TOCTOU race (this is what Coverity is
  complaining about).

By removing the explicit access() check and just erroring out whenever
open() fails, we fix all these issues.
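
A minimal sketch of the resulting shape, assuming a hypothetical helper name and error message (the real xenconsole code differs): the path is opened directly and any failure is reported to the caller, so there is no window between a check and the open.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: open the pty path directly, with no access()
 * pre-check. Returns the fd, or -1 with errno preserved for the caller,
 * who is expected to treat the failure as fatal.
 */
static int open_pty_sketch(const char *path)
{
    int fd = open(path, O_RDWR | O_NOCTTY);

    if (fd == -1) {
        int err = errno;
        fprintf(stderr, "Could not open tty `%s': %s\n", path, strerror(err));
        errno = err;  /* fprintf() may have changed it */
    }
    return fd;
}
```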

Coverity-ID: 1056047
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years ago x86/pvh: disable MTRR feature on cpuid for Dom0
Roger Pau Monné [Mon, 16 Dec 2013 09:52:43 +0000 (10:52 +0100)]
x86/pvh: disable MTRR feature on cpuid for Dom0

MTRR is not available for PVH Dom0, so prevent cpuid from
reporting it as an available feature.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
11 years ago evtchn/fifo: map correct pages when guest is HVM
David Vrabel [Mon, 16 Dec 2013 09:51:24 +0000 (10:51 +0100)]
evtchn/fifo: map correct pages when guest is HVM

If an HVM guest attempts to use the FIFO-based ABI it will not receive
any events and destroying the guest may crash Xen or trigger an assert
when attempting to unmap a control block page.  This occurs because
Xen maps the wrong page for both the control blocks and the event
arrays.

In map_guest_page(), use the MFN of the guest's page and not the GFN
when calling map_domain_page_global().

Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
11 years ago tools/xenstored: Avoid buffer overflows while setting up sockets
Andrew Cooper [Mon, 25 Nov 2013 14:38:41 +0000 (14:38 +0000)]
tools/xenstored: Avoid buffer overflows while setting up sockets

Coverity ID: 1055996 1056002

Cache the xs_daemon_socket{,_ro}() strings to save pointlessly
re-snprintf()'ing the same path, and add explicit size checks against
addr.sun_path before strcpy()'ing into it.
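
The explicit size check might look like the sketch below. The function name is hypothetical and the caching of the socket path strings is not shown; it only illustrates refusing a path that does not fit before copying it.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Hypothetical helper: fill in a UNIX socket address.
 * Returns 0 on success, -1 if path (plus its NUL) does not fit in sun_path.
 */
static int fill_sockaddr_sketch(struct sockaddr_un *addr, const char *path)
{
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;

    if (strlen(path) >= sizeof(addr->sun_path))
        return -1;                    /* would overflow: refuse */

    strcpy(addr->sun_path, path);     /* safe: length checked above */
    return 0;
}
```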

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <Ian.Campbell@citrix.com>
CC: Matthew Daley <mattd@bugfuzz.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years ago libxl: fix unsigned less-than-0 comparison in e820_sanitize
Matthew Daley [Sun, 1 Dec 2013 10:14:55 +0000 (23:14 +1300)]
libxl: fix unsigned less-than-0 comparison in e820_sanitize

Both src[i].size and delta are unsigned, so checking their difference
for being less than 0 doesn't work.
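
To illustrate the class of bug (this is not necessarily how e820_sanitize was changed, and shrink_region() is a hypothetical helper): with unsigned operands the subtraction wraps around instead of going negative, so the operands have to be compared before subtracting.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Broken pattern: "size - delta < 0" can never be true when both operands
 * are unsigned, because the result wraps to a huge positive value.
 *
 * Working pattern: compare first, subtract afterwards.
 * Returns false (leaving *size untouched) if delta exceeds *size.
 */
static bool shrink_region(uint64_t *size, uint64_t delta)
{
    if (delta > *size)
        return false;
    *size -= delta;
    return true;
}
```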

Coverity-ID: 1055615
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years ago libxl: check for xc_domain_setmaxmem failure in libxl__build_pre
Matthew Daley [Mon, 2 Dec 2013 12:11:43 +0000 (01:11 +1300)]
libxl: check for xc_domain_setmaxmem failure in libxl__build_pre

Coverity-ID: 1087115
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
11 years ago libxl: don't leak ptr in libxl_list_vm error case
Matthew Daley [Tue, 3 Dec 2013 01:29:04 +0000 (14:29 +1300)]
libxl: don't leak ptr in libxl_list_vm error case

While at it, tidy up the function; there's no point in allocating more
than the amount of domains actually returned by xc_domain_getinfolist
(barring the caveat described in the newly-added comment)

Coverity-ID: 1055888
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
11 years ago xenstore: check F_SETFL fcntl invocation in setnonblock
Matthew Daley [Mon, 2 Dec 2013 12:45:16 +0000 (01:45 +1300)]
xenstore: check F_SETFL fcntl invocation in setnonblock

...and check the newly-added result of setnonblock itself where used.
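
A sketch of a setnonblock-style helper that reports failure, assuming a hypothetical name and a bool return type (the actual signature in xenstore may differ):

```c
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical helper: returns true on success, false if either fcntl() fails. */
static bool setnonblock_sketch(int fd, bool nonblock)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return false;

    if (nonblock)
        flags |= O_NONBLOCK;
    else
        flags &= ~O_NONBLOCK;

    return fcntl(fd, F_SETFL, flags) != -1;
}
```

Callers would then check the result instead of assuming the descriptor is in the requested mode.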

Coverity-ID: 1055103
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years ago x86/p2m: restrict auditing to debug builds
Jan Beulich [Fri, 13 Dec 2013 14:06:11 +0000 (15:06 +0100)]
x86/p2m: restrict auditing to debug builds

... since iterating through all of a guest's pages may take unduly
long.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
11 years ago ocaml: do not install test binaries
Rob Hoes [Thu, 12 Dec 2013 16:36:49 +0000 (16:36 +0000)]
ocaml: do not install test binaries

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- added back an Empty install rule ]

11 years ago xen/elf: header: fix typoes in elfnote.h
Julien Grall [Wed, 11 Dec 2013 18:50:11 +0000 (18:50 +0000)]
xen/elf: header: fix typoes in elfnote.h

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Ian Campbell [Wed, 11 Dec 2013 13:36:27 +0000 (13:36 +0000)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging

11 years ago libxl: ocaml: add some missing CAML macros
Rob Hoes [Tue, 10 Dec 2013 16:48:33 +0000 (16:48 +0000)]
libxl: ocaml: add some missing CAML macros

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago libxl: ocaml: drop the ocaml heap lock before calling into libxl
Rob Hoes [Tue, 10 Dec 2013 16:48:32 +0000 (16:48 +0000)]
libxl: ocaml: drop the ocaml heap lock before calling into libxl

Ocaml has a heap lock which must be held whenever ocaml code is running. Ocaml
usually drops this lock when it enters a potentially blocking low-level
function, such as writing to a file. Libxl has its own lock, which it may
acquire when being called.

Things get interesting when libxl calls back into ocaml code. There is a risk
of ending up in a deadlock when a thread holds both locks at the same time,
then temporarily drops the ocaml lock, while another thread calls another libxl
function.

To avoid deadlocks, we drop the ocaml heap lock before entering libxl, and
reacquire it in callbacks to ocaml. This way, the ocaml heap lock is never held
together with the libxl lock, except in osevent registration callbacks, and
xentoollog callbacks. If we guarantee to not call any libxl functions inside
those callbacks, we can avoid deadlocks.

This patch handles the dropping and reacquiring of the ocaml heap lock by the
caml_enter_blocking_section and caml_leave_blocking_section functions, and
related macros. We are also careful to not call any functions that access the
ocaml heap while the ocaml heap lock is dropped. This often involves copying
ocaml values to C before dropping the ocaml lock.

The ao_how in aohow_val is now malloc'ed, just to make this function a little
easier to use.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago libxl: ocaml: add console reader functions
Rob Hoes [Tue, 10 Dec 2013 16:48:31 +0000 (16:48 +0000)]
libxl: ocaml: add console reader functions

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago libxl: ocaml: add VM lifecycle operations
Rob Hoes [Tue, 10 Dec 2013 16:48:30 +0000 (16:48 +0000)]
libxl: ocaml: add VM lifecycle operations

Also:
* Reorganise toplevel OCaml functions into modules of Xenlight.
* Factor out the management of ao_how into the function aohow_val. The ao_how
  is now malloc'ed, just to make this function a little easier to use.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
11 years ago libxl: ocaml: add disk and cdrom helper functions
Rob Hoes [Tue, 10 Dec 2013 16:48:29 +0000 (16:48 +0000)]
libxl: ocaml: add disk and cdrom helper functions

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago libxl: ocaml: allow device operations to be called asynchronously
Rob Hoes [Tue, 10 Dec 2013 16:48:28 +0000 (16:48 +0000)]
libxl: ocaml: allow device operations to be called asynchronously

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
CC: David Scott <dave.scott@eu.citrix.com>
11 years ago libxl: ocaml: event management
Rob Hoes [Tue, 10 Dec 2013 16:48:27 +0000 (16:48 +0000)]
libxl: ocaml: event management

Bindings to the low-level function libxl_osevent_register_hooks and its
relatives allow us to run an event loop in OCaml: either one we write ourselves
or one that is available elsewhere.

The Lwt cooperative threads library (http://ocsigen.org/lwt/), which is quite
popular these days, has an event loop that can be easily extended to poll any
additional fds that we get from libxl. Lwt provides a "lightweight" threading
model, which does not let you run any other (POSIX) threads in your
application, and therefore excludes an event loop implemented in the C
bindings.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago libxl: ocaml: implement some simple tests
Rob Hoes [Tue, 10 Dec 2013 16:48:26 +0000 (16:48 +0000)]
libxl: ocaml: implement some simple tests

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
11 years ago libxl: ocaml: add simple test case for xentoollog
Rob Hoes [Tue, 10 Dec 2013 16:48:25 +0000 (16:48 +0000)]
libxl: ocaml: add simple test case for xentoollog

Add a simple noddy test case (tools/ocaml/test) for the Xentoollog OCaml
module.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
11 years ago xen: arm: inject unhandled instruction and data aborts to the guest.
Ian Campbell [Mon, 9 Dec 2013 14:58:24 +0000 (14:58 +0000)]
xen: arm: inject unhandled instruction and data aborts to the guest.

Currently an unhandled data abort in guest context leads to us killing the
guest and an unhandled instruction abort in guest context leads to us killing
the host!

Andre pointed out that an unhandled data abort can be caused by e.g. dmidecode
looking for things which are not there in the guest's physical address space.
Propagating the fault to the guest allows it to properly SIGSEGV the
processes.

A guest kernel can trivially jump to an unmapped physical address which would
cause an instruction abort. Killing the host for that is obviously bad.
Instead inject the exception so the guest kernel can SIGSEGV or panic() etc as
it deems appropriate.

Tested on arm64 (Mustang) and arm32 (Midway) with a dom0 kernel late_initcall
which either dereferences or jumps to address 0, provoking both behaviours and
resulting correctly in a guest kernel panic. Also tested on fast models with a
32-bit dom0 on a 64-bit hypervisor, which behaved correctly.

In addition tested on both platforms with a userspace program which either
calls to or dereferences address 0. The process is correctly killed with SEGV.

Lastly tested on Mustang with a 32-bit version of the userspace test on a
64-bit dom0 kernel.

I think that covers all the cases.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Cc: Andre Przywara <andre.przywara@calxeda.com>
[ ijc -- fixed up whitespace in if statements in cpsr_mode_switch ]

11 years ago kexec/x86: do not map crash kernel area
Daniel Kiper [Wed, 11 Dec 2013 09:37:25 +0000 (10:37 +0100)]
kexec/x86: do not map crash kernel area

This mapping was apparently never used.

Suggested-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: David Vrabel <david.vrabel@citrix.com>
11 years ago x86/PV: don't commit debug register values early in arch_set_info_guest()
Jan Beulich [Wed, 11 Dec 2013 09:33:19 +0000 (10:33 +0100)]
x86/PV: don't commit debug register values early in arch_set_info_guest()

They're being taken care of later (via set_debugreg()), and temporarily
copying them into struct vcpu means that bad values may end up getting
loaded during context switch if the vCPU is already running and the
function errors out between the premature and real commit step, leading
to the same issue that XSA-12 dealt with.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years ago x86/cpuidle: publish new states only after fully initializing them
Jan Beulich [Wed, 11 Dec 2013 09:30:02 +0000 (10:30 +0100)]
x86/cpuidle: publish new states only after fully initializing them

Since state information coming from Dom0 can arrive at any time, on
any CPU, we ought to make sure that a new state is fully initialized
before the target CPU might be using it.
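
The "initialise first, publish last" ordering can be sketched with C11 release/acquire atomics. This is a substitute technique chosen for a self-contained example; the actual patch may use different primitives, and the struct layout, names and single-writer assumption are all hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_STATES_SKETCH 8

struct cstate_sketch {
    unsigned int type;
    unsigned int latency;
};

struct power_sketch {
    struct cstate_sketch states[MAX_STATES_SKETCH];
    atomic_uint count;   /* number of fully initialised states */
};

/* Writer (assumed single): fill in the entry, then publish it via count. */
static int add_state(struct power_sketch *p, unsigned int type,
                     unsigned int latency)
{
    unsigned int n = atomic_load_explicit(&p->count, memory_order_relaxed);

    if (n >= MAX_STATES_SKETCH)
        return -1;

    p->states[n].type = type;
    p->states[n].latency = latency;

    /* Release: the stores above become visible before the new count. */
    atomic_store_explicit(&p->count, n + 1, memory_order_release);
    return 0;
}

/* Reader on another CPU: only looks at entries below the published count. */
static const struct cstate_sketch *deepest_state(struct power_sketch *p)
{
    unsigned int n = atomic_load_explicit(&p->count, memory_order_acquire);

    return n ? &p->states[n - 1] : NULL;
}
```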

Once touching that code, also do minor cleanup: A missing (but benign)
"break" and some white space adjustments.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Liu Jinsong <jinsong.liu@intel.com>
11 years ago MAINTAINERS: Add Andres Lagar-Cavilla for mem-sharing/paging
Andres Lagar-Cavilla [Tue, 10 Dec 2013 15:53:40 +0000 (16:53 +0100)]
MAINTAINERS: Add Andres Lagar-Cavilla for mem-sharing/paging

Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years ago amd/passthrough: Do not leak domain mappings from do_invalidate_dte()
Andrew Cooper [Tue, 10 Dec 2013 15:16:49 +0000 (16:16 +0100)]
amd/passthrough: Do not leak domain mappings from do_invalidate_dte()

Coverity ID: 1135379

As the code stands, the domain mapping will be leaked on each error path.

The mapping can be held for a much shorter period of time, and all the relevant
information can be pulled out at once.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
11 years ago IOMMU: clear "don't flush" override on error paths
Jan Beulich [Tue, 10 Dec 2013 15:10:37 +0000 (16:10 +0100)]
IOMMU: clear "don't flush" override on error paths

Both xenmem_add_to_physmap() and iommu_populate_page_table() have
an error path that fails to clear that flag, thus suppressing further
flushes on the respective pCPU.

In iommu_populate_page_table() also slightly re-arrange code to avoid
the false impression of the flag in question being guarded by a
domain's page_alloc_lock.
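
The shape of the fix can be sketched as follows. Everything here is a stand-in: a plain global replaces the per-pCPU flag, map_one() is a hypothetical operation that can fail, and the flush is reduced to a counter.

```c
#include <stdbool.h>

static bool dont_flush_sketch;   /* stand-in for the per-pCPU override */
static int flushes_done;

/* Hypothetical per-item operation; fails for item number fail_at. */
static int map_one(int i, int fail_at)
{
    return i == fail_at ? -1 : 0;
}

static int populate_sketch(int n, int fail_at)
{
    int rc = 0;

    dont_flush_sketch = true;        /* batch up: suppress per-item flushes */
    for (int i = 0; i < n; i++) {
        rc = map_one(i, fail_at);
        if (rc)
            break;                   /* not "return rc": fall through */
    }
    dont_flush_sketch = false;       /* cleared on success AND on error */

    if (!rc)
        flushes_done++;              /* one flush for the whole batch */
    return rc;
}
```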

This is CVE-2013-6400 / XSA-80.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago xen: list interfaces subject to the security process exception in XSA-77
Ian Campbell [Tue, 10 Dec 2013 15:09:24 +0000 (16:09 +0100)]
xen: list interfaces subject to the security process exception in XSA-77

List all the sub ops of:
  __HYPERVISOR_domctl
  __HYPERVISOR_sysctl
  __HYPERVISOR_memory_op
  __HYPERVISOR_tmem_op
which are subject to the policy given in
http://xenbits.xen.org/xsa/advisory-77.html

It is expected that these lists will be whittled away as each interface is
audited for safety.

New interfaces should be expected to be safe when introduced (IOW the list
should never be expanded).

This is XSA-77.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 years ago libxl: ocaml: remove dead code in xentoollog bindings
Rob Hoes [Mon, 9 Dec 2013 15:17:30 +0000 (15:17 +0000)]
libxl: ocaml: remove dead code in xentoollog bindings

Found by Coverity. CIDs: 1128567 1128568 1128576 1128577.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago libxl: ocaml: fix memory corruption when converting string and key/values lists
Rob Hoes [Mon, 9 Dec 2013 15:17:29 +0000 (15:17 +0000)]
libxl: ocaml: fix memory corruption when converting string and key/values lists

Found by Coverity. CIDs: 1128562 1128563 1128564 1128565.

Signed-off-by: Rob Hoes <rob.hoes@citrix.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: David Scott <dave.scott@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years ago xen/arm: Fix regression after commit d963923
Julien Grall [Mon, 9 Dec 2013 18:34:10 +0000 (18:34 +0000)]
xen/arm: Fix regression after commit d963923

The commit d963923 "xen: arm: correct return value of
raw_copy_{to/from}_guest_*, raw_clear_guest" prevents guests from booting
on Xen ARM.

Remove the stray semicolon from the end of the if statement.

Also we want to get the right rc in the error arrays, so we need to do the
copy_to_guest_offset before checking the rc returned by
xenmem_add_to_physmap_one.

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- expanded commit log ]

11 years ago xen: arm: correct return value of raw_copy_{to/from}_guest_*, raw_clear_guest
Ian Campbell [Mon, 9 Dec 2013 12:13:48 +0000 (12:13 +0000)]
xen: arm: correct return value of raw_copy_{to/from}_guest_*, raw_clear_guest

This is a generic interface which is supposed to return the number of bytes
which were not copied. Make it so.
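
The convention can be illustrated with the sketch below. The copy routine is a hypothetical stand-in that simulates a fault after a given number of accessible bytes; only the "return the number of bytes not copied" contract comes from the description above.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical copy routine: at most 'accessible' bytes can be copied.
 * Returns the number of bytes NOT copied, so 0 means complete success.
 */
static size_t copy_sketch(void *to, const void *from, size_t len,
                          size_t accessible)
{
    size_t done = len < accessible ? len : accessible;

    memcpy(to, from, done);
    return len - done;
}

/* Caller: any non-zero remainder is a fault and becomes an error code. */
static int read_u32_sketch(unsigned int *out, const void *guest,
                           size_t accessible)
{
    if (copy_sketch(out, guest, sizeof(*out), accessible) != 0)
        return -EFAULT;
    return 0;
}
```

A caller that treats the return value as an errno-style code, or tests it for being negative, gets the wrong answer under this contract.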

Update the incorrect callers prepare_dtb, decode_thumb{2} and
xenmem_add_to_physmap_range.

In the xenmem_add_to_physmap_range case, observe that we are not propagating
errors from xenmem_add_to_physmap_one and do so.

In the decode_thumb case, add an emacs magic block to decode.c.

Make the flush_dcache parameter to the helper an int while at it.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years ago xen: arm: correct definition of DCISW (data cache invalidate by set/way)
Ian Campbell [Fri, 6 Dec 2013 14:29:32 +0000 (14:29 +0000)]
xen: arm: correct definition of DCISW (data cache invalidate by set/way)

We don't actually use this but I was using it locally for debugging and it
tripped me up.

Also add DCCIMVAC "data cache clean and invalidate by MVA" which is the only
cache op missing from cpregs.h.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years ago xen: arm: handle initrd addresses above the 4G boundary
Ian Campbell [Mon, 9 Dec 2013 11:43:35 +0000 (11:43 +0000)]
xen: arm: handle initrd addresses above the 4G boundary

The Xgene platform has no RAM below 4G.

The /chosen/linux,initrd-* properties do not have "reg" semantics and
therefore #*-size are not used when interpreting them. Instead they are simply
numbers which are interpreted according to the property's length.

Fix this both when parsing the entry in the host DTB and when creating the
dom0 DTB. For dom0 we simply hardcode a 64-bit size, this is acceptable
even for a 32-bit guest.
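
Interpreting such a property by its length can be sketched as below. The helper name is hypothetical and this is not the Xen device tree code; it only relies on device tree property values being stored big-endian.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode a big-endian number whose width is given by
 * the property length (4 or 8 bytes). Returns 0 on success, -1 otherwise.
 */
static int read_prop_number(const unsigned char *prop, size_t len,
                            uint64_t *out)
{
    uint64_t v = 0;

    if (len != 4 && len != 8)
        return -1;

    for (size_t i = 0; i < len; i++)
        v = (v << 8) | prop[i];

    *out = v;
    return 0;
}
```

With an 8-byte property, an initrd address above 4G is representable regardless of the node's cell-size settings.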

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years ago xen: arm: vtimer fixes for arm64
Ian Campbell [Mon, 9 Dec 2013 11:13:36 +0000 (11:13 +0000)]
xen: arm: vtimer fixes for arm64

The code was writing back the register even for writes, and didn't implement
CNTPCT at all.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
11 years ago xen: arm: do not BUG on guest paddrs which are very high
Ian Campbell [Mon, 9 Dec 2013 11:09:10 +0000 (11:09 +0000)]
xen: arm: do not BUG on guest paddrs which are very high

The BUG_ON in p2m_map_first was over aggressive since the paddr_t can have
come from the guest, via add_to_physmap. Instead return failure to the caller.

Also the check was simultaneously too loose. The valid offsets are
0..P2M_FIRST_ENTRIES-1 inclusive.
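
A sketch of the corrected check, with an assumed table size and hypothetical names: an out-of-range offset derived from a guest-supplied address yields a failure return rather than a crash, and the bound excludes the entry count itself.

```c
#include <stddef.h>
#include <stdint.h>

#define FIRST_ENTRIES_SKETCH 1024u   /* assumed number of first-level entries */

static uint64_t first_table_sketch[FIRST_ENTRIES_SKETCH];

/* Returns NULL, instead of asserting, when the offset is out of range. */
static uint64_t *map_first_sketch(uint64_t offset)
{
    if (offset >= FIRST_ENTRIES_SKETCH)   /* valid: 0..ENTRIES-1 inclusive */
        return NULL;
    return &first_table_sketch[offset];
}
```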

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years ago Config.mk: update OVMF changeset
Wei Liu [Sun, 8 Dec 2013 20:50:20 +0000 (20:50 +0000)]
Config.mk: update OVMF changeset

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: George Dunlap <george.dunlap@eu.citrix.com>
11 years ago xl: test script for the cpumap parser (for vCPU pinning)
Dario Faggioli [Sat, 7 Dec 2013 00:05:34 +0000 (01:05 +0100)]
xl: test script for the cpumap parser (for vCPU pinning)

This commit introduces "check-xl-vcpupin-parse" to help verify
and debug the (v)CPU bitmap parsing code in xl.

The script runs "xl -N vcpu-pin 0 all <some strings>"
repeatedly, with various input strings, and checks that the
output is as expected.

This is what the script can do:

# ./check-xl-vcpupin-parse -h
 usage: ./check-xl-vcpupin-parse [options]

 Tests various vcpu-pinning strings. If run without arguments acts
 as follows:
  - generates some test data and saves them in
    check-xl-vcpupin-parse.data;
  - tests all the generated configurations (reading them back from
    check-xl-vcpupin-parse.data).

 An example of a test vector file is provided in
 check-xl-vcpupin-parse.data-example.

 Options:
  -h         prints this message
  -r seed    uses seed for initializing the random number generator
             (default: the script PID)
  -s string  tries using string as a vcpu pinning configuration and
             reports whether that succeeds or not
  -o ofile   save the test data in ofile
             (default: check-xl-vcpupin-parse.data)
  -i ifile   read test data from ifile

An example test data file (generated on a 2 NUMA nodes, 16 CPUs
host) is being provided in check-xl-vcpupin-parse.data-example.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxl: implement and enable dryrun mode for `xl vcpu-pin'
Dario Faggioli [Sat, 7 Dec 2013 00:05:26 +0000 (01:05 +0100)]
xl: implement and enable dryrun mode for `xl vcpu-pin'

As it can be useful to see if the outcome of some complex vCPU
pinning bitmap specification looks as expected.

This also allows for the introduction of some automatic testing
and verification for the bitmap parsing code, as it happens
already in check-xl-disk-parse and check-xl-vif-parse.

In particular, to make the above possible, this commit also
changes the implementation of the vcpu-pin command so that,
instead of always returning 0, it returns an error if the
parsing fails.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agoxl: allow for node-wise specification of vcpu pinning
Dario Faggioli [Sat, 7 Dec 2013 00:05:18 +0000 (01:05 +0100)]
xl: allow for node-wise specification of vcpu pinning

Making it possible to use something like the following:
 * "nodes:0-3": all pCPUs of nodes 0,1,2,3;
 * "nodes:0-3,^node:2": all pCPUs of nodes 0,1,3;
 * "1,nodes:1-2,^6": pCPU 1 plus all pCPUs of nodes 1,2
   but not pCPU 6;
 * ...

In both domain config file and `xl vcpu-pin'.
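
To make the syntax concrete, here is a simplified parser sketch. It is not
the xl implementation: it assumes a uniform topology of 4 pCPUs per node,
at most 64 pCPUs, tokens applied left to right, and it leaves out "all".
The names parse_cpumap and bit_range are invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CPUS_PER_NODE 4    /* assumption: uniform topology */
#define MAX_CPUS      64   /* assumption: map fits in one 64-bit word */

/* Mask with bits lo..hi set; requires lo <= hi < 64. */
static uint64_t bit_range(unsigned lo, unsigned hi)
{
    uint64_t upto_hi = (hi == 63) ? ~0ULL : ((1ULL << (hi + 1)) - 1);

    return upto_hi & ~((1ULL << lo) - 1);
}

/*
 * Parse a comma-separated pinning string into *mask. A leading '^'
 * removes CPUs; a "node:" or "nodes:" prefix makes the number or range
 * refer to NUMA nodes. Returns 0 on success, -1 on a malformed token.
 */
static int parse_cpumap(const char *spec, uint64_t *mask)
{
    const char *p = spec;
    uint64_t result = 0;

    assert(spec != NULL && mask != NULL);

    for (;;) {
        int exclude = 0, by_node = 0;
        unsigned long lo, hi;
        char *end;

        if (*p == '^') { exclude = 1; p++; }
        if (strncmp(p, "nodes:", 6) == 0) { by_node = 1; p += 6; }
        else if (strncmp(p, "node:", 5) == 0) { by_node = 1; p += 5; }

        /* A token is either N or N-M. */
        if (!isdigit((unsigned char)*p))
            return -1;
        lo = hi = strtoul(p, &end, 10);
        if (*end == '-') {
            if (!isdigit((unsigned char)end[1]))
                return -1;
            hi = strtoul(end + 1, &end, 10);
        }
        if (lo > hi)
            return -1;

        /* Expand node numbers to the pCPU range they cover. */
        if (by_node) {
            if (hi >= MAX_CPUS / CPUS_PER_NODE)
                return -1;
            lo *= CPUS_PER_NODE;
            hi = hi * CPUS_PER_NODE + (CPUS_PER_NODE - 1);
        } else if (hi >= MAX_CPUS) {
            return -1;
        }

        if (exclude)
            result &= ~bit_range((unsigned)lo, (unsigned)hi);
        else
            result |= bit_range((unsigned)lo, (unsigned)hi);

        if (*end == '\0')
            break;
        if (*end != ',')
            return -1;
        p = end + 1;
    }

    *mask = result;
    return 0;
}
```

Under these assumptions "1,nodes:1-2,^6" selects pCPU 1 plus pCPUs 4-11,
minus pCPU 6.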

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Reviewed-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxc/libxl: allow to retrieve the number of online pCPUs
Dario Faggioli [Sat, 7 Dec 2013 00:05:11 +0000 (01:05 +0100)]
libxc/libxl: allow to retrieve the number of online pCPUs

by introducing xc_get_online_cpus() and
libxl_get_online_cpus().

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxc/libxl: sanitize error handling in *_get_max_{cpus, nodes}
Dario Faggioli [Sat, 7 Dec 2013 00:05:03 +0000 (01:05 +0100)]
libxc/libxl: sanitize error handling in *_get_max_{cpus, nodes}

In libxc, make xc_get_max_{cpus,nodes}() always return either a
positive number or -1, and change all the callers to deal with
that.

In libxl, make libxl_get_max_{cpus,nodes}() always return either a
positive number or a libxl error code. Thanks to that, it is also
possible to fix logging for libxl_{cpu,node}_bitmap_alloc(), which
now happens inside the functions themselves, more accurately
reporting what happened.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxl: move libxl_{cpu, node}_bitmap_alloc()
Dario Faggioli [Sat, 7 Dec 2013 00:04:55 +0000 (01:04 +0100)]
libxl: move libxl_{cpu, node}_bitmap_alloc()

in libxl_utils.c (from .h), as they will be reworked in
the next commit ("libxc/libxl: sanitize error handling in
*_get_max_{cpus,nodes}") and we want to keep code motion
separate from functional changes.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agolibxl: fix memory leak in libxl_list_vcpu
Dario Faggioli [Sat, 7 Dec 2013 00:04:48 +0000 (01:04 +0100)]
libxl: fix memory leak in libxl_list_vcpu

more specifically, of the cpumap inside libxl_vcpuinfo, in case
of failure after it has been allocated.

While at it, use the correct libxl memory allocation wrapper for
calloc() in there and turn the function into using the new LOGE()
logging style.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agolibxl: better name for parameters in libxl_list_vcpu
Dario Faggioli [Sat, 7 Dec 2013 00:04:40 +0000 (01:04 +0100)]
libxl: better name for parameters in libxl_list_vcpu

so that the parameter that returns the number of pCPUs is
called nr_cpus_out, while the one that returns the number of
vCPUs is called nr_vcpus_out.

The patch is all about renaming, so no functional change.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agoxl: match output of vcpu-list with pinning syntax
Dario Faggioli [Sat, 7 Dec 2013 00:04:32 +0000 (01:04 +0100)]
xl: match output of vcpu-list with pinning syntax

in fact, pinning to all the pcpus happens by specifying "all"
(either on the command line or in the config file), while `xl
vcpu-list' reports it as "any cpu".

Change this into something more consistent, by using "all"
everywhere.

Signed-off-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agodefer the domain mapping in scrub_one_page()
Andrew Cooper [Mon, 9 Dec 2013 13:13:23 +0000 (14:13 +0100)]
defer the domain mapping in scrub_one_page()

This avoids a resource leak and needless playing with the pagetables in the
case that the page is broken.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <JBeulich@suse.com>
Reviewed-by: Keir Fraser <keir@xen.org>
11 years agoxen/arm: arch_domain_create: don't return 0 when alloc_xenheap_pages has failed
Julien Grall [Sun, 8 Dec 2013 02:32:32 +0000 (02:32 +0000)]
xen/arm: arch_domain_create: don't return 0 when alloc_xenheap_pages has failed

The previous call before alloc_xenheap_pages resets rc to 0 if it succeeds.
If the latter fails, arch_domain_create will return 0 and Xen will consider
the domain as valid. Move rc initialization later.
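
The pattern can be sketched in isolation like this. setup_step and
alloc_pages are hypothetical stand-ins for the real calls; only the
handling of rc is the point.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real setup steps. */
static int setup_step(void) { return 0; }   /* always succeeds */
static void *alloc_pages(int fail) { return fail ? NULL : malloc(16); }

/*
 * Buggy shape: rc starts as -ENOMEM but is overwritten with 0 by the
 * successful earlier call, so a later allocation failure returns 0.
 */
static int create_buggy(int fail_alloc)
{
    int rc = -ENOMEM;
    void *p;

    if ((rc = setup_step()) != 0)
        goto fail;
    if ((p = alloc_pages(fail_alloc)) == NULL)
        goto fail;               /* rc is still 0 here */
    free(p);
    return 0;

 fail:
    return rc;
}

/* Fixed shape: set rc immediately before the call that can fail. */
static int create_fixed(int fail_alloc)
{
    int rc;
    void *p;

    if ((rc = setup_step()) != 0)
        goto fail;
    rc = -ENOMEM;
    if ((p = alloc_pages(fail_alloc)) == NULL)
        goto fail;
    free(p);
    return 0;

 fail:
    return rc;
}
```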

Signed-off-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agotmem: Fix uses of unmatched __map_domain_page()
Andrew Cooper [Fri, 6 Dec 2013 15:09:38 +0000 (16:09 +0100)]
tmem: Fix uses of unmatched __map_domain_page()

__map_domain_page() *must* be matched with an unmap_domain_page().  These five
static inline functions each map a page (or two), then throw away the context
needed to unmap it.

Each of the changes is limited to its respective function.  In two cases,
this involved replacing a large amount of pointer arithmetic with memcmp()
(all callers were relying on memcmp() semantics of positive/negative returns
rather than specifically -1/+1). A third case had its pointer arithmetic
entirely replaced with memcpy().
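
For reference, the standard memcmp() only promises a negative, zero or
positive result, which is why callers that test the sign are unaffected.
A small sketch, with a hypothetical page_compare wrapper and an assumed
4096-byte page:

```c
#include <assert.h>
#include <string.h>

#define PAGE_BYTES 4096   /* assumed page size for this sketch */

/*
 * Compare two page-sized buffers. As with memcmp(), the result is only
 * guaranteed to be negative, zero or positive, not exactly -1, 0 or +1.
 */
static int page_compare(const void *a, const void *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, PAGE_BYTES);
}

/* Callers should therefore test the sign rather than a specific value. */
static int page_is_less(const void *a, const void *b)
{
    return page_compare(a, b) < 0;
}
```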

In addition, remove redundant casts of void pointers and assertions.

This fixes Coverity IDs 1135373 1135374 1135375 1135376 1135377 1135378
11353739 which were retroactively identified following modelling improvements.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Tested-by: Bob Liu <bob.liu@oracle.com>
11 years agotmem: fix public header file
Bob Liu [Fri, 6 Dec 2013 14:58:00 +0000 (15:58 +0100)]
tmem: fix public header file

Commit 006a687ba4de74d7933c09b43872abc19f126c63 dropped typedef tmem_cli_mfn_t
from public tmem.h, which may cause problems.
This patch adds tmem_cli_mfn_t back, wrapped in an #ifdef on
__XEN_INTERFACE_VERSION__.

Signed-off-by: Bob Liu <bob.liu@oracle.com>
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
11 years agolibxl: spice usbredirection support for upstream qemu
Fabio Fantoni [Tue, 19 Nov 2013 15:20:20 +0000 (16:20 +0100)]
libxl: spice usbredirection support for upstream qemu

Usage: spiceusbredirection=NUMBER (default=0)

Enables spice usbredirection. Creates NUMBER usbredirection channels
for redirection of up to 4 usb devices from spice client to domU's qemu.
It requires a usb controller and, if none is defined, automatically adds
a usb2 controller.

Changes from v3:
- fixed the condition that enables usbversion if it isn't defined while
  usbredirection is enabled

Changes from v2:
- updated for usbversion patch v7
- now usbredirection cannot be used with usb and usbdevice parameters
- if usbversion is undefined it will create a usb2 controller

Changes from v1:
- The number of redirection channels can now be set.
- Various code improvements.

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
11 years agolibxl: usb2 and usb3 controller support for upstream qemu
Fabio Fantoni [Thu, 5 Dec 2013 14:40:47 +0000 (14:40 +0000)]
libxl: usb2 and usb3 controller support for upstream qemu

Usage: usbversion=1|2|3 (default=0, no usb controller defined)
Specifies the type of an emulated USB bus in the guest. 1 for usb1,
2 for usb2 and 3 for usb3. It is available only with upstream qemu.
The old usb and usbdevice parameters cannot be used with this.

Signed-off-by: Fabio Fantoni <fabio.fantoni@m2r.biz>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agotools/ovmf-makefile: only build debug target when specified
Wei Liu [Thu, 5 Dec 2013 17:29:33 +0000 (17:29 +0000)]
tools/ovmf-makefile: only build debug target when specified

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agohvmloader/ovmf: setup E820 map
Wei Liu [Thu, 5 Dec 2013 17:29:31 +0000 (17:29 +0000)]
hvmloader/ovmf: setup E820 map

E820 map will be used by OVMF to create memory map.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agohvmloader/ovmf: setup ovmf_info
Wei Liu [Thu, 5 Dec 2013 17:29:30 +0000 (17:29 +0000)]
hvmloader/ovmf: setup ovmf_info

OVMF info contains E820 map allocated by hvmloader. This info is passed
to OVMF to help it do proper initialization.

Currently only E820 is necessary, but we reserve spaces for other tables
in ovmf_info for later usage.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen: arm: improve handling of system with non-contiguous RAM regions
Ian Campbell [Mon, 2 Dec 2013 14:39:05 +0000 (14:39 +0000)]
xen: arm: improve handling of system with non-contiguous RAM regions

arm32 currently only makes use of memory which is contiguous with the first
bank. On the Midway platform this means that we only use 4GB of the 8GB
available.

Change things to make use of non-contiguous memory regions with the
restriction that we require that at least half of the total span of the RAM
addresses contain RAM. The frametable is currently not sparse and so this
restriction avoids problems with allocating enormous amounts of memory for the
frametable to cover holes in the address space and exhausting the actual RAM.

50% is arguably too restrictive. 4GB of RAM requires 32MB of frametable on
arm32 and 56MB on arm64, so we could probably cope with a lower ratio of actual
RAM. However half is nice and conservative.
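
The arithmetic behind those figures, and the 50% rule, can be sketched as
below. The per-page entry sizes (32 and 56 bytes) are inferred from the
numbers quoted above assuming 4KB pages; the function names are made up
for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * "At least half of the span contains RAM": span_bytes is the distance
 * from the lowest to the highest RAM address considered, ram_bytes the
 * RAM actually present inside that span.
 */
static int span_is_acceptable(uint64_t span_bytes, uint64_t ram_bytes)
{
    assert(ram_bytes <= span_bytes);
    return ram_bytes >= span_bytes / 2;
}

/*
 * Size of a non-sparse frametable covering span_bytes, with one entry of
 * entry_bytes per 4KB page (holes in the span cost entries too).
 */
static uint64_t frametable_bytes(uint64_t span_bytes, unsigned entry_bytes)
{
    return (span_bytes >> 12) * entry_bytes;
}
```

For example, 4GB of RAM spread over a 16GB span would need a frametable
four times larger than the RAM alone justifies, and is rejected by the
check.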

arm64 currently uses all banks without regard for the size of the frametable,
which I have observed causing problems on models. Implement the same
restriction as arm32 there.

Long term we should look at moving to a pfn compression based scheme similar
to x86, which removes the holes from the frametable.

There were some bogus/outdated comments scattered around this code which I
have removed.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Julien Grall <julien.grall@linaro.org>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
11 years agoxen: arm: remove hardcoded gnttab location from dom0
Ian Campbell [Wed, 4 Dec 2013 17:03:02 +0000 (17:03 +0000)]
xen: arm: remove hardcoded gnttab location from dom0

The DT provided to guests (including dom0) includes a Xen node which, among
other things, describes an MMIO region which can be safely used for grant
table mappings (i.e. it is a hole in the physical address space). For domU we
provide hardcoded values based on our hardcoded guest virtual machine
layout. However for dom0 we need to fit in with the underlying platform.
Leaving this hardcoded was an oversight which on some platforms could result
in the grant table overlaying RAM or MMIO regions which are in use by domain
0.

For the 4.4 release do as we did with the dom0 evtchn PPI and provide a hook
for the platform code to supply a suitable hardcoded address for the platform
(derived from reading the data sheet). Platforms which do not provide the hook
get the existing address as a default.
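
An optional hook with a default fallback might look roughly like this.
The struct layout and all names are hypothetical; the two addresses are
the midway and pre-existing (vexpress) values from the table in this
message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The pre-existing hardcoded address, used when no hook is supplied. */
#define DEFAULT_GNTTAB_START 0xb0000000ULL

/* Hypothetical platform descriptor; only the relevant hook is shown. */
struct platform {
    const char *name;
    uint64_t (*gnttab_start)(void);   /* optional: NULL means "use default" */
};

static uint64_t midway_gnttab_start(void)
{
    return 0xff800000ULL;
}

static const struct platform midway_platform  = { "midway", midway_gnttab_start };
static const struct platform generic_platform = { "generic", NULL };

/* Ask the platform for its grant table region, falling back to the default. */
static uint64_t platform_gnttab_start(const struct platform *p)
{
    assert(p != NULL);
    return p->gnttab_start ? p->gnttab_start() : DEFAULT_GNTTAB_START;
}
```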

After 4.4 we should switch to selecting a region of host RAM which is not RAM
in the guest address map. This should be more flexible and safer but the patch
was looking too complex for 4.4.

Platform        Gnttab Address
========        ==============
exynos5.c       0xb0000000, confirmed and tested by Julien.
sunxi.c         0x01d00000, confirmed in data sheet.
midway.c        0xff800000, confirmed by Andre, boot tested by Ian.
vexpress.c      0xb0000000, existing hardcoded value was selected for vexpress.
omap5.c         0x4b000000, confirmed by Baozi
xgene-storm.c   0x1f800000, confirmed by Pranavkumar

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Tested-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Julien.Grall@linaro.org
Cc: Stefano.Stabellini@eu.citrix.com
Cc: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Chen Baozi <baozich@gmail.com>
Acked-by: Pranavkumar Sawargaonkar <psawargaonkar@apm.com>
Cc: Anup Patel <apatel@apm.com>
11 years agox86/boot: fix BIOS memory corruption on certain IBM systems
Andrew Cooper [Fri, 6 Dec 2013 10:28:00 +0000 (11:28 +0100)]
x86/boot: fix BIOS memory corruption on certain IBM systems

IBM System x3530 M4 BIOSes (including the latest available at the time of this
patch) will corrupt a byte at physical address 0x105ff1 to the value of 0x86
if %esp has the value 0x00080000 when issuing an `int $0x15 (ax=0xec00)` to
inform the system about our intended operating mode.

Xen gets unhappy when the bootloader has placed its .text section over
this specific region of RAM.

After dropping into 16bit mode, clear all 32 bits of %esp, and for the BIOS
call already documented to be affected by BIOS bugs clear all GPRs.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
11 years agoRevert "VMX: flush cache when vmentry back to UC guest"
Jan Beulich [Fri, 6 Dec 2013 10:10:54 +0000 (11:10 +0100)]
Revert "VMX: flush cache when vmentry back to UC guest"

This reverts commit 86d60e85 as well as one related change from
62652c00 ("VMX: fix cr0.cd handling"), on the basis that all of this
flushing is still insufficient and, while not known to fix anything, is
known to negatively affect performance.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Liu Jinsong <jinsong.liu@intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
11 years agoNested VMX: CR emulation fix up
Yang Zhang [Fri, 6 Dec 2013 10:08:20 +0000 (11:08 +0100)]
Nested VMX: CR emulation fix up

This patch fixes two issues:
1. The CR_READ_SHADOW should only cover the value that L2 writes to
CR when L2 is running. But currently, L0 writes the wrong value to
it during virtual vmentry and L2's CR access emulation.

2. L2 changed cr[0/4] in a way that did not change any of L1's shadowed
bits, but did change L0 shadowed bits. In this case, the effective cr[0/4]
value that L1 would like to write into the hardware consists of
the L2-owned bits from the new value combined with the L1-owned bits
from L1's guest cr[0/4].
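
The combination described in point 2 is a plain mask merge. A sketch,
where l1_owned is assumed to be the mask of bits L1 keeps for itself and
effective_cr is a name invented for the example:

```c
#include <assert.h>
#include <stdint.h>

/*
 * new_val:     the value L2 just tried to write to cr0/cr4
 * l1_guest_cr: the guest cr0/cr4 value L1 set up for L2
 * l1_owned:    mask of the bits owned (shadowed) by L1
 *
 * L2-owned bits come from new_val, L1-owned bits from l1_guest_cr.
 */
static uint64_t effective_cr(uint64_t new_val, uint64_t l1_guest_cr,
                             uint64_t l1_owned)
{
    return (new_val & ~l1_owned) | (l1_guest_cr & l1_owned);
}
```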

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
11 years agoarm64: enable PSCI secondary CPU bringup
Andre Przywara [Thu, 5 Dec 2013 10:08:12 +0000 (11:08 +0100)]
arm64: enable PSCI secondary CPU bringup

If the device tree contains a PSCI node and the DTB CPU node tells us
to use PSCI for enabling secondary cores, we set the function pointer
to the PSCI wrapper function to enable PSCI SMP bringup.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoarm32: enable PSCI secondary CPU bringup
Andre Przywara [Thu, 5 Dec 2013 10:08:11 +0000 (11:08 +0100)]
arm32: enable PSCI secondary CPU bringup

If the device tree contains a PSCI node, we bring up secondary CPUs
by invoking the appropriate PSCI handler.
This will take priority over platform specific functions (which could
call the PSCI wrapper themselves if needed), so any PSCI enablement
of a platform will automatically be used (as on Linux).

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoarm: add a function to invoke the PSCI handler
Andre Przywara [Thu, 5 Dec 2013 10:08:10 +0000 (11:08 +0100)]
arm: add a function to invoke the PSCI handler

The PSCI handler is invoked via a secure monitor call with the
arguments defined in registers. Copy the function from the
Linux code and adjust it to work on both ARM32 and ARM64.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoarm: parse PSCI node from the host device-tree
Andre Przywara [Thu, 5 Dec 2013 10:08:09 +0000 (11:08 +0100)]
arm: parse PSCI node from the host device-tree

The availability of a PSCI handler is advertised in the DTB.
Find and parse the node (described in the Linux device-tree binding)
and save the function number for bringing up a CPU for later usage.
We do some sanity checks, especially we deny using HVC as a calling
method, as it does not make much sense currently under Xen.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoarm: move GIC SGI kicking into separate function
Andre Przywara [Thu, 5 Dec 2013 10:08:08 +0000 (11:08 +0100)]
arm: move GIC SGI kicking into separate function

Currently we unconditionally send SGIs to all cores on SMP bringup.

Those SGIs (software generated interrupts) are to push a secondary core
through a gate in the Xen bring up code to filter the right CPU. This gate is
necessary on platforms which do not allow us to wake up a specific secondary
processor and will trap all but the CPU we are trying to wake up.

With PSCI we can explicitly specify the core to startup, so we don't need the
kick here because the CPU will fall straight through Xen's gate.

So we move the GIC kick into a function and call it explicitly from the
platforms that need it. This lets us get rid of the empty cpu_up() platform
functions in ARM32 and the comment in there.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- explain more about the Xen gate in the commit message ]

11 years agoarm: rename xen/arch/arm/psci.c into vpsci.c
Andre Przywara [Thu, 5 Dec 2013 10:08:07 +0000 (11:08 +0100)]
arm: rename xen/arch/arm/psci.c into vpsci.c

Follow the current convention of prefixing guest related names
with "v" by renaming the guest PSCI functionality into vpsci.c to make
room for the host PSCI functions.

Signed-off-by: Andre Przywara <andre.przywara@linaro.org>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agotools: libxl: testidl: initialise the KeyedUnion keyvar before the union
Ian Campbell [Wed, 4 Dec 2013 17:48:56 +0000 (17:48 +0000)]
tools: libxl: testidl: initialise the KeyedUnion keyvar before the union

This is Coverity CID 1135378 and 1135379.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
11 years agoxen: arm: Enable 1:1 workaround by default
Ian Campbell [Wed, 4 Dec 2013 14:54:21 +0000 (14:54 +0000)]
xen: arm: Enable 1:1 workaround by default

I was just about to send out patches adding the 1:1 workaround to vexpress
(the foundation model is a vexpress platform with DMA) and sunxi.

That would have meant that all platforms now implement the quirk. Instead let's
just make it the default and remove the quirk.

In the future this will likely be set based on the presence or absence of an
IOMMU, perhaps with additional overrides by the platform.

This results in some dead code in domain_build for dealing with the non-1:1
case. This is deliberate and is left in anticipation of IOMMU support in 4.5.

PLATFORM_QUIRK_GIC_64K_STRIDE is renumbered as a side effect of this change.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
11 years agoMerge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Ian Campbell [Wed, 4 Dec 2013 14:51:57 +0000 (14:51 +0000)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging

11 years agoQEMU_TAG update
Ian Jackson [Wed, 4 Dec 2013 14:46:19 +0000 (14:46 +0000)]
QEMU_TAG update

11 years agoxen: arm64: clear boot_first instead of boot_pgtable twice
Dennis Lan (dlan) [Wed, 4 Dec 2013 14:37:25 +0000 (14:37 +0000)]
xen: arm64: clear boot_first instead of boot_pgtable twice

Signed-off-by: Lan Yixun (dlan) <dennis.yxun@gmail.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoUpdate QEMU_UPSTREAM_REVISION
Anthony PERARD [Thu, 28 Nov 2013 19:44:50 +0000 (19:44 +0000)]
Update QEMU_UPSTREAM_REVISION

Changing to master, otherwise we don't get the latest updates.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agolibxenctrl: Fix xc_interface_close() crash if it gets NULL as an argument
Daniel Kiper [Mon, 2 Dec 2013 19:13:03 +0000 (20:13 +0100)]
libxenctrl: Fix xc_interface_close() crash if it gets NULL as an argument

xc_interface_close() crashes if it gets NULL as an argument. However,
it just calls xc_interface_close_common(), which is called by many
other functions, so they are vulnerable too. Fix the issue by adding
a NULL check in xc_interface_close_common(). This also fixes the same
problem in the other functions which call xc_interface_close_common().
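
The shape of the fix, reduced to a sketch with hypothetical types and
names. Returning 0 for a NULL handle is an assumption of the example, not
a statement about the libxenctrl return value.

```c
#include <assert.h>
#include <stdlib.h>

struct handle { int fd; };   /* hypothetical stand-in for the real handle */

/* Shared teardown used by several close functions; tolerates NULL. */
static int handle_close_common(struct handle *h)
{
    if (h == NULL)
        return 0;            /* nothing to close */

    free(h);
    return 0;
}

/* Every public wrapper gets the NULL check for free. */
static int interface_close(struct handle *h) { return handle_close_common(h); }
static int evtchn_close(struct handle *h)    { return handle_close_common(h); }
```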

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoxen: arm: TCR_EL1 is 64-bit on arm64
Ian Campbell [Tue, 3 Dec 2013 15:13:36 +0000 (15:13 +0000)]
xen: arm: TCR_EL1 is 64-bit on arm64

Storing it in a 32-bit variable in struct arch_vcpu caused breakage over
context switch.
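
The failure mode is ordinary truncation on save/restore. A sketch with
made-up struct names and an arbitrary bit pattern (not a meaningful
TCR_EL1 setting):

```c
#include <assert.h>
#include <stdint.h>

struct vcpu_state32 { uint32_t tcr; };   /* too narrow for a 64-bit register */
struct vcpu_state64 { uint64_t tcr; };

/* Save a register value into the narrow field and read it back. */
static uint64_t roundtrip32(uint64_t hw)
{
    struct vcpu_state32 s;

    s.tcr = (uint32_t)hw;    /* upper 32 bits are silently dropped */
    return s.tcr;
}

/* Same round trip through a field of the correct width. */
static uint64_t roundtrip64(uint64_t hw)
{
    struct vcpu_state64 s;

    s.tcr = hw;
    return s.tcr;
}
```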

There were also several other places which stored this as the 32-bit value.
Update them all.

The "struct vcpu_guest_context" case needs special consideration. This struct
is in theory exposed to guests, via the VCPUOP_initialise hypercall.
However as discussed in
http://lists.xen.org/archives/html/xen-devel/2013-10/msg00912.html this isn't
really a guest visible interface since ARM uses PSCI for VCPU bringup
(VCPUOP_initialise simply isn't available). The other users of this interface
are the domctls, which are not a stable API. Therefore while fixing the ttbcr
size also surround the struct in ifdefs to restrict the struct to the
hypervisor and the tools only (omitting the extra complexity of renaming as I
suggested in the referenced thread).

NB TCR_EL1 on arm64 is known as TTBCR on arm32, hence the apparent naming
inconsistencies.

Spotted-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Julien Grall <julien.grall@linaro.org>
Acked-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Cc: Anup Patel <anup.patel@linaro.org>
Cc: patches@linaro.org
Cc: patches@apm.com
11 years agoMerge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging
Ian Campbell [Wed, 4 Dec 2013 14:29:39 +0000 (14:29 +0000)]
Merge branch 'staging' of ssh://xenbits.xen.org/home/xen/git/xen into staging

11 years agoarinc: Add poolid parameter to scheduler get/set functions
Nathan Studer [Tue, 3 Dec 2013 22:24:27 +0000 (17:24 -0500)]
arinc: Add poolid parameter to scheduler get/set functions

Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
11 years agoarinc: add cpu-pool support to scheduler
Nathan Studer [Wed, 4 Dec 2013 12:29:00 +0000 (13:29 +0100)]
arinc: add cpu-pool support to scheduler

1.  Remove the restriction that dom0 must be in the schedule, since dom-0 may
not belong to the scheduler's pool.
2.  Add a schedule entry for each of dom-0's vcpus as they are created.
3.  Add code to deal with empty schedules in the do_schedule function.
4.  Call the correct idle task for the pcpu on which the scheduling decision
is being made in do_schedule.
5.  Add code to prevent migration of a vcpu.
6.  Implement a proper cpu_pick function, which prefers the current processor.
7.  Add a scheduler lock to protect access to global variables from multiple
    PCPUs.

These changes do not implement arinc653 multicore.  Since the schedule only
supports 1 vcpu entry per slot, even if the vcpus of a domain are run on
multiple pcpus, the scheduler will essentially serialize their execution.

Signed-off-by: Nathan Studer <nate.studer@dornerworks.com>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
11 years agox86: fix early boot command line parsing
Daniel Kiper [Wed, 4 Dec 2013 12:26:37 +0000 (13:26 +0100)]
x86: fix early boot command line parsing

There is no reliable way to encode NUL character as a character so encode
it as a number. Read: http://sourceware.org/binutils/docs/as/Characters.html.
Octal and hex encoding do not work on at least one system (GNU assembler
version 2.22 (x86_64-linux-gnu) using BFD version (GNU Binutils for Debian) 2.22).
Without this fix e.g. no-real-mode option at the end of xen.gz command line
is not detected.

Signed-off-by: Daniel Kiper <daniel.kiper@oracle.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agonested VMX: fix I/O port exit emulation
Jan Beulich [Wed, 4 Dec 2013 12:23:27 +0000 (13:23 +0100)]
nested VMX: fix I/O port exit emulation

For multi-byte operations all affected ports' bits in the bitmap need
to be checked, not just the first port's one.

Reported-by: Matthew Daley <mattd@bugfuzz.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Eddie Dong <eddie.dong@intel.com>
11 years agolibxl: don't try to fclose file twice on error in libxl_userdata_store
Matthew Daley [Tue, 3 Dec 2013 00:00:37 +0000 (13:00 +1300)]
libxl: don't try to fclose file twice on error in libxl_userdata_store

Do this by changing the function to not use stdio file operations, but
just use the fd directly with libxl_write_exactly.

While at it, tidy up the function's style issues.

Coverity-ID: 1056195
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
11 years agolibxl: don't leak buf in libxl_xen_console_read_start error handling
Matthew Daley [Tue, 3 Dec 2013 01:01:05 +0000 (14:01 +1300)]
libxl: don't leak buf in libxl_xen_console_read_start error handling

Use libxl__zallocs instead of plain mallocs + memset.

Coverity-ID: 1055889
Signed-off-by: Matthew Daley <mattd@bugfuzz.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
11 years agouse return value of domain_adjust_tot_pages() where feasible
Jan Beulich [Tue, 3 Dec 2013 11:41:54 +0000 (12:41 +0100)]
use return value of domain_adjust_tot_pages() where feasible

This is generally cheaper than re-reading ->tot_pages.

While doing so I also noticed an improper use (lacking error handling)
of get_domain() as well as a lack of ->is_dying checks in the memory
sharing code, which the patch fixes at once. In the course of doing
this I further noticed other error paths there pointlessly calling
put_page() et al with ->page_alloc_lock still held, which is also being
reversed.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Keir Fraser <keir@xen.org>
11 years agofix locking in offline_page()
Jan Beulich [Tue, 3 Dec 2013 11:40:57 +0000 (12:40 +0100)]
fix locking in offline_page()

Coverity ID 1055655

Apart from the Coverity-detected lock order reversal (a domain's
page_alloc_lock taken with the heap lock already held), calling
put_page() with heap_lock is a bad idea too (as a possible descendant
from put_page() is free_heap_pages(), which wants to take this very
lock).

From all I can tell the region over which heap_lock was held was far
too large: All we need to protect are the call to mark_page_offline()
and reserve_heap_page() (and I'd even put under question the need for
the former). Hence by slightly re-arranging the if/else-if chain we
can drop the lock much earlier, at once no longer covering the two
put_page() invocations.

Once at it, do a little bit of other cleanup: Put the "pod_replace"
code path inline rather than at its own label, and drop the effectively
unused variable "ret".

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Tim Deegan <tim@xen.org>
Acked-by: Keir Fraser <keir@xen.org>
11 years agofix string inconsistencies in callers of panic()
Andrew Cooper [Tue, 3 Dec 2013 11:39:22 +0000 (12:39 +0100)]
fix string inconsistencies in callers of panic()

panic() (as well as early_panic() in arm) is inconsistently called with or
without a trailing newline.  This results in cases where the lower line of
*****s is not on its own line.

Change panic() to always print a newline itself, and update callers not to.

In addition, panic() was occasionally called with a leading newline, and
occasionally with trailing punctuation which seems rather redundant given the
surrounding context.  Fix up these situations as well.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Release-acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agoblkif: add indirect descriptors interface to public headers
Roger Pau Monné [Tue, 3 Dec 2013 11:33:58 +0000 (12:33 +0100)]
blkif: add indirect descriptors interface to public headers

Indirect descriptors introduce a new block operation
(BLKIF_OP_INDIRECT) that passes grant references instead of segments
in the request. These grant references are filled with arrays of
blkif_request_segment_aligned, this way we can send more segments in a
request.

This interface is already implemented in Linux >= 3.11.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agocommon/vsprintf: fix return value when formatting symbolic addresses
Jan Beulich [Tue, 3 Dec 2013 08:57:41 +0000 (09:57 +0100)]
common/vsprintf: fix return value when formatting symbolic addresses

When the buffer to be formatted to is too small, the function return
value is expected to be the number of characters that would be printed
(particularly important if that value is then used for allocating a
buffer). Hence incrementing the active pointer must always be
independent of actually storing a character.
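
The rule in the last sentence can be shown with a small string emitter.
This is an illustration of the snprintf-style contract, not the Xen
vsprintf code, and put_string is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy src into buf (capacity `size`, including the terminating NUL) and
 * return the number of characters that would have been written given
 * enough room. The position advances for every character; only the
 * store itself is conditional on there being space.
 */
static size_t put_string(char *buf, size_t size, const char *src)
{
    size_t pos = 0;

    assert(src != NULL && (buf != NULL || size == 0));

    for (; *src != '\0'; src++) {
        if (pos + 1 < size)
            buf[pos] = *src;   /* store only while room remains */
        pos++;                 /* always count the character */
    }

    if (size > 0)
        buf[pos < size ? pos : size - 1] = '\0';

    return pos;
}
```

A caller can then pass a zero-sized buffer first, allocate the returned
length plus one, and format again.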

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>
11 years agox86: be more power-efficient when waiting forever
Andrew Cooper [Tue, 3 Dec 2013 08:54:12 +0000 (09:54 +0100)]
x86: be more power-efficient when waiting forever

The effect is unchanged, but the processor will be spending most of its time
in the C1 or C1E power state rather than C0.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Keir Fraser <keir@xen.org>