All the GICv2 registers are word-accessible. Some of them are also
byte-accessible (see GICD_IPRIORITYR*).
Those registers are incorrectly implemented when they should be RAZ. Only
word-sized accesses are currently allowed for them.
To avoid further issues, introduce different labels according to the access
size of the registers:
- read_as_zero_32 and write_ignore_32: Used for registers accessible
via a word.
- read_as_zero: Used when we don't have to check the access size.
The latter is used when the access size has already been checked in the
register emulation and/or when the register offset is reserved/implementation
defined.
Note that only the labels that are actually used have been introduced.
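The difference between the two kinds of label can be modelled as follows. This is a Python sketch of the idea, not the hypervisor's C code; the function names mirror the labels above, while the (ok, value) return convention is an assumption made for illustration.

```python
WORD = 4  # access size in bytes


def read_as_zero_32(size):
    """RAZ register that is only accessible via a word."""
    if size != WORD:
        return (False, None)  # wrong width: the access is rejected
    return (True, 0)


def read_as_zero(size):
    """RAZ path used when the size was already checked by the caller,
    or when the offset is reserved/implementation defined."""
    return (True, 0)
```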
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 1fefa550274758204a6bf58ea9b9509296197080)
Julien Grall [Mon, 16 Feb 2015 14:50:42 +0000 (14:50 +0000)]
xen/arm: vgic-v3: Correctly set GICD_TYPER.CPUNumber
On GICv3, the value (CPUNumber + 1) indicates the number of processors that
may be used as interrupt targets when the ARE bit is zero. The maximum is 8
processors.
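As a worked example of the encoding, a Python sketch assuming CPUNumber sits in GICD_TYPER bits [7:5] and that the domain has at least one vCPU. The helper name is hypothetical.

```python
CPUNUMBER_SHIFT = 5   # GICD_TYPER bits [7:5]
CPUNUMBER_MASK = 0x7
MAX_LEGACY_TARGETS = 8


def typer_cpunumber(nr_vcpus):
    """Return the CPUNumber field, already shifted into place.

    The field holds (number of usable processors - 1), capped at 8
    processors. Assumes nr_vcpus >= 1.
    """
    usable = min(nr_vcpus, MAX_LEGACY_TARGETS)
    return ((usable - 1) & CPUNUMBER_MASK) << CPUNUMBER_SHIFT
```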
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 834551bace5cfda7ca5ebbdc2ec9fd18f002e4ce)
Julien Grall [Mon, 16 Feb 2015 14:50:41 +0000 (14:50 +0000)]
xen/arm: vgic-v3: Correctly set GICD_TYPER.IDbits
From Linux 3.19, the GICv3 driver uses GICD_TYPER.IDbits to check
the validity of the hardware interrupt number.
The field IDbits in the register GICD_TYPER is used to determine the number of
interrupt identifiers (SPIs, PPIs, SGIs, LPIs) supported by the GIC Stream
Protocol Interface.
This field contains the number of interrupt identifier bits minus one.
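The "minus one" encoding can be illustrated like this. A Python sketch assuming IDbits occupies GICD_TYPER bits [23:19]; both helper names are hypothetical.

```python
IDBITS_SHIFT = 19   # GICD_TYPER bits [23:19]
IDBITS_MASK = 0x1F


def typer_idbits(nr_id_bits):
    """Encode the number of interrupt identifier bits (stored minus one)."""
    return ((nr_id_bits - 1) & IDBITS_MASK) << IDBITS_SHIFT


def intid_space(typer):
    """Decode how many interrupt identifiers a GICD_TYPER value advertises."""
    idbits = (typer >> IDBITS_SHIFT) & IDBITS_MASK
    return 1 << (idbits + 1)
```

With 10 identifier bits the field holds 9, and a driver decoding it sees an identifier space of 1024.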
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 8206d052eb11061d7b6cada566c0804c14001fec)
Julien Grall [Thu, 15 Jan 2015 20:23:40 +0000 (20:23 +0000)]
xen/arm: vgic: Rename nr_lines into nr_spis
The field nr_lines in the arch_domain vgic structure contains the number of
SPIs for the emulated GIC. The name nr_lines is easily confused with the GIC
code, where it means the number of IRQs. This can lead to coding errors.
Also introduce vgic_num_irqs to get the number of IRQs handled by the emulated
GIC.
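The relationship between the two counts, as a Python sketch. It assumes the 32 private interrupts per CPU (16 SGIs plus 16 PPIs) defined by the GIC architecture; the constant name is illustrative.

```python
NR_LOCAL_IRQS = 32  # 16 SGIs + 16 PPIs, private to each vCPU


def vgic_num_irqs(nr_spis):
    """Total IRQs handled by the emulated GIC: private IRQs plus SPIs."""
    return nr_spis + NR_LOCAL_IRQS
```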
Finally drop the initialization of nr_spis in both gicv2v_init and gicv3_init
as it's already initialized in vgic common code.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 96fcbd0599b74b4b2629447a6ee90580f43e3aa4)
Limit XEN_DOMCTL_memory_mapping hypercall to only process up to 64 GFNs (or less)
For large BARs, said hypercall can take quite a while. As such
we can require that the caller MUST break up the request
into smaller chunks.
Another approach is to add preemption to it - whether we do the
preemption using hypercall_create_continuation or returning
EAGAIN to userspace (and have it re-invoke the call) - either
way the issue we cannot easily solve is that in 'map_mmio_regions'
if we encounter an error we MUST call 'unmap_mmio_regions' for the
whole BAR region.
Since the preemption would re-use input fields such as nr_mfns,
first_gfn, first_mfn - we would lose the original values -
and only undo what was done in the current round (i.e. ignoring
anything that was done prior to earlier preemptions).
Unless we re-used the return value as 'EAGAIN|nr_mfns_done<<10' but
that puts a limit (since the return value is a long) on the amount
of nr_mfns that can be provided.
This patch sidesteps this problem by:
- Setting a hard limit of nr_mfns having to be 64 or less.
- Toolstack adjusts correspondingly to the nr_mfn limit.
- If there is an error when adding, the toolstack will call the
remove operation to remove the whole region.
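The toolstack side of the three points above can be sketched as follows. This is a Python model, not the libxc code: the hypercall argument is a hypothetical stand-in taking (gfn, mfn, nr, add) and returning 0 on success.

```python
MAX_MFNS_PER_CALL = 64  # hard limit introduced by the patch


def chunks(nr_mfns):
    """Yield (offset, count) pairs covering nr_mfns in pieces of <= 64."""
    done = 0
    while done < nr_mfns:
        nr = min(MAX_MFNS_PER_CALL, nr_mfns - done)
        yield done, nr
        done += nr


def map_region(first_gfn, first_mfn, nr_mfns, hypercall):
    """Map a region chunk by chunk; on error, remove the whole region."""
    for off, nr in chunks(nr_mfns):
        rc = hypercall(first_gfn + off, first_mfn + off, nr, True)
        if rc:
            # Undo the whole region, not just the failing chunk.
            for uoff, unr in chunks(nr_mfns):
                hypercall(first_gfn + uoff, first_mfn + uoff, unr, False)
            return rc
    return 0
```

Because the caller keeps the original first_gfn, first_mfn and nr_mfns, the undo step can always cover everything mapped in earlier rounds.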
The need to break this hypercall down arises because large BARs can take
more than the guest's (usually the initial domain's) time-slice. This has
the negative result that the guest is locked out for a long
duration and is unable to act on any pending events.
We also augment the code to return zero if nr_mfns is zero instead
of trying the hypercall.
This is XSA-125 / CVE-2015-2752.
Suggested-by: Jan Beulich <jbeulich@suse.com> Acked-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ross Lagerwall [Thu, 26 Mar 2015 07:27:13 +0000 (08:27 +0100)]
x86: don't apply reboot quirks if reboot set by user
If reboot= is specified on the command-line, don't apply reboot quirks
to allow the command-line option to take precedence.
This is a port of Linux commit 5955633e91bf ("x86/reboot: Skip DMI
checks if reboot set by user").
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Leverage (and make apply on top of) c643fb110a ("x86/EFI: allow
reboot= overrides when running under EFI").
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 9832f5e8e3575f8affceb2751f7422704bf7b446
master date: 2015-03-13 12:41:51 +0100
At the point this patch calls domain_update_node_affinity(), the vcpu
hard affinities have not yet been updated; so calling it at this point
can in some circumstances trigger an ASSERT().
domain_update_node_affinity() is already called in
cpu_disable_scheduler(), so adding it to cpupool_unassign_cpu() is
redundant. Simply reverting the patch is sufficient.
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
x86/EFI: allow reboot= overrides when running under EFI
By default we will always use the EFI reboot mechanism when
running on EFI platforms. However some EFI platforms
are buggy and need to use the ACPI mechanism to
reboot (such as Lenovo ThinkCentre M57). As such
respect the 'reboot=' override and DMI overrides
for EFI platforms.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
- BOOT_INVALID is just zero
- also consider acpi_disabled in BOOT_INVALID resolution
- duplicate BOOT_INVALID resolution in machine_restart()
- don't fall back from BOOT_ACPI to BOOT_EFI (if it was overridden, it
surely was for a reason)
- adjust doc change formatting
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
x86/EFI: fix reboot after c643fb110a
acpi_disabled needs to be moved out of .init.data.
Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com>
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
master commit: c643fb110a51693e82a36ca9178d54f0b9744024
master date: 2015-03-13 11:25:52 +0100
master commit: 8ff330ec11e471919621bce97c069b83b0319d15
master date: 2015-03-23 18:01:51 +0100
Ross Lagerwall [Thu, 26 Mar 2015 07:21:40 +0000 (08:21 +0100)]
EFI: fix getting EFI variable list on some systems
Copy the entire output buffer to the guest because some firmwares update
size on successful calls (contrary to the spec) and the buffer may
contain data beyond the output size that the firmware requires on a
subsequent GetNextVariableName() call (e.g. a NULL character).
Note that this shouldn't change the amount of data copied because on success, a
compliant firmware does not change size and so the entire buffer is copied
anyway. If size is changed, Xen does not copy the buffer.
Without this change, the following (simplified) sequence would occur:
GetNextVariableName: in \0, size 1024 || out AdminPw\0, size 7
GetNextVariableName: in AdminPw\0, size 1024 || out UserPw\0, size 6
GetNextVariableName: in UserPww\0, size 1024 || NOT FOUND
This was seen on an Intel S1200RP_SE with firmware
S1200RP.86B.02.02.0005.102320140911, version 4.6, date 2014-10-23.
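The simplified sequence above can be reproduced with a toy model. This is a Python sketch under stated assumptions: names are 8-bit strings rather than UCS-2, the reported size excludes the terminator (as in the trace), and firmware_get_next is a hypothetical stand-in for the non-compliant firmware.

```python
def firmware_get_next(buf, names):
    """Toy firmware: read the NUL-terminated name in buf, write the next
    name plus terminator, and return the (shrunken) size, or None when
    the name is unknown or the list is exhausted."""
    current = bytes(buf).split(b"\0", 1)[0]
    if current == b"":
        nxt = names[0]
    elif current in names and names.index(current) + 1 < len(names):
        nxt = names[names.index(current) + 1]
    else:
        return None
    buf[:len(nxt) + 1] = nxt + b"\0"
    return len(nxt)


def enumerate_vars(names, copy_whole_buffer, bufsize=1024):
    """Walk the variable list the way a guest would, via the hypervisor."""
    guest = bytearray(bufsize)      # guest buffer, initially "\0"
    seen = []
    while True:
        hyp = bytearray(guest)      # hypervisor copy handed to firmware
        size = firmware_get_next(hyp, names)
        if size is None:
            return seen
        n = bufsize if copy_whole_buffer else size
        guest[:n] = hyp[:n]         # copy the result back to the guest
        seen.append(bytes(guest).split(b"\0", 1)[0])
```

Copying only the reported size leaves the stale "w" from the previous name in the guest buffer, so the walk stops early. Copying the entire buffer carries the terminator across as well.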
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 1f4eb9d27d0ebd62a0b6cdff8213726f5ae8f25c
master date: 2015-03-10 13:52:01 +0100
Jan Beulich [Thu, 26 Mar 2015 07:21:03 +0000 (08:21 +0100)]
VT-d: print_vtd_entries() should cope with superpages
Even if VT-d code alone (i.e. when not sharing tables with EPT) still
doesn't support superpages, this function - invoked upon DMA remapping
faults - needs to cope with such.
While at it also replace a few more plain numbers with suitable named
constants.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Roger Pau Monné <roger.pau@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com>
master commit: 92cf6c2456dc428694ed95b6b1dec5bb84319790
master date: 2015-03-09 14:00:19 +0100
Jan Beulich [Thu, 26 Mar 2015 07:20:11 +0000 (08:20 +0100)]
complete conversion set_bit() -> __cpumask_set_cpu() by 4aaca0e9cd
While converting to __cpumask_set_cpu() was correct, the first argument
passed should have been corrected to be "cpu" instead of "nr" at once.
The wrong construct results in problems on systems with relatively few
CPUs.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citirx.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 5dbdf33c57e3c95125b92f86d847ed8432e28f1c
master date: 2015-02-27 16:09:27 +0100
Jan Beulich [Thu, 26 Mar 2015 07:18:29 +0000 (08:18 +0100)]
honor MEMF_no_refcount in alloc_heap_pages()
Non-anonymous allocations with this flag set should - for the purpose
of the availability check - be treated just like anonymous ones, as
they wouldn't lead to a reduction of ->outstanding_pages.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org>
master commit: 17294e69c4cd299da7ba3ca8077e24be76bd61b1
master date: 2015-02-26 13:58:54 +0100
Ian Campbell [Fri, 13 Mar 2015 10:39:50 +0000 (10:39 +0000)]
xen: arm: correct arm64 version of gva_to_ma_par
The implementation was backwards and checked that the guest could
read when asked about write and vice versa.
This is an update to the fix for XSA-98.
Reported-by: Tamas K Lengyel <tklengyel@sec.in.tum.de> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit c1245e9d5bf311b5a3267ea4b077a16561fcf439)
Ian Campbell [Fri, 20 Feb 2015 14:41:09 +0000 (14:41 +0000)]
tools: libxl: Explicitly disable graphics backends on qemu cmdline
By default qemu will try to create some sort of backend for the
emulated VGA device, either SDL or VNC.
However when the user specifies sdl=0 and vnc=0 in their configuration
libxl was not explicitly disabling either backend, which could lead to
one unexpectedly running.
If either sdl=1 or vnc=1 is configured then both before and after this
change only the backends which are explicitly enabled are configured,
i.e. this issue only occurs when all backends are supposed to have
been disabled.
This affects qemu-xen and qemu-xen-traditional differently.
If qemu-xen was compiled with SDL support then this would result in an
SDL window being opened if $DISPLAY is valid, or a failure to start
the guest if not. Passing "-display none" to qemu before any further
-sdl options disables this default behaviour and ensures that SDL is
only started if the libxl configuration demands it.
If qemu-xen was compiled without SDL support then qemu would instead
start a VNC server listening on ::1 (IPv6 localhost) or 127.0.0.1
(IPv4 localhost) with IPv6 preferred if available. Explicitly pass
"-vnc none" when vnc is not enabled in the libxl configuration to
remove this possibility.
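The resulting qemu-xen argument handling can be sketched as follows. This is a simplified Python model, not the libxl code: the function name and the VNC listen address are assumptions, and the real command line carries many more options.

```python
def qemu_xen_display_args(sdl, vnc, vnc_listen="127.0.0.1:0"):
    """Build only the display-related part of a qemu-xen command line."""
    # Turn off qemu's default backend before any -sdl option.
    args = ["-display", "none"]
    if sdl:
        args.append("-sdl")
    if vnc:
        args += ["-vnc", vnc_listen]   # hypothetical listen address
    else:
        args += ["-vnc", "none"]       # never fall back to a VNC server
    return args
```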
qemu-xen-traditional would never start a vnc backend unless asked.
However, by default it will start an SDL backend; the way to disable
this is to pass a -vnc option. In other words passing "-vnc none" will
disable both vnc and sdl by default. sdl can then be reenabled if
configured by subsequent use of the -sdl option.
Tested with both qemu-xen and qemu-xen-traditional built with SDL
support and:
xl cr # defaults
xl cr sdl=0 vnc=0
xl cr sdl=1 vnc=0
xl cr sdl=0 vnc=1
xl cr sdl=0 vnc=0 vga=\"none\"
xl cr sdl=0 vnc=0 nographic=1
with both valid and invalid $DISPLAY.
This is XSA-119 / CVE-2015-2152.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 91b0ae9db33f72468b1d411a07f53085c893c097)
Jan Beulich [Thu, 12 Mar 2015 13:18:28 +0000 (14:18 +0100)]
x86/tboot: invalidate FIX_TBOOT_MAP_ADDRESS mapping after use
In order for commit cbeeaa7d ("x86/nmi: fix shootdown of pcpus
running in VMX non-root mode")'s re-use of that fixmap entry to not
cause undesirable (in crash context) cross-CPU TLB flushes, invalidate
the fixmap entry right after use.
Jan Beulich [Tue, 10 Mar 2015 12:55:17 +0000 (13:55 +0100)]
x86emul: fully ignore segment override for register-only operations
For ModRM encoded instructions with register operands we must not
overwrite ea.mem.seg (if a - bogus in that case - segment override was
present) as it aliases with ea.reg.
This is CVE-2015-2151 / XSA-123.
Reported-by: Felix Wilhelm <fwilhelm@ernw.de> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Reviewed-by: Keir Fraser <keir@xen.org>
master commit: bcf92a5382b75fd964c1f8678b2d9a3abe6dec39
master date: 2015-03-10 13:45:51 +0100
Andrew Cooper [Wed, 28 Jan 2015 15:52:35 +0000 (15:52 +0000)]
tools/libxc: Don't leave scratch_pfn uninitialised if the domain has no memory
c/s 5b5c40c0d1 "libxc: introduce a per architecture scratch pfn for temporary
grant mapping" accidentally introduced an issue whereby there were two paths out of
xc_core_arch_get_scratch_gpfn() which returned 0, but only one of which
assigned a value to the gpfn parameter.
xc_domain_maximum_gpfn() can validly return 0, at which point gpfn 1 is a
valid scratch page to use.
In addition, widen rc before adding 1 and possibly overflowing.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Julien Grall <julien.grall@linaro.org> CC: Jan Beulich <JBeulich@suse.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 5b0447f647b1031595d24a8a50b362726c887d12)
Andrew Cooper [Wed, 18 Feb 2015 15:42:11 +0000 (16:42 +0100)]
x86/nmi: fix shootdown of pcpus running in VMX non-root mode
c/s 7dd3b06ff "vmx: fix handling of NMI VMEXIT" fixed one issue but
inadvertently introduced a regression when it came to the NMI shootdown. The
shootdown code worked by patching vector 2 in each IDT, but the introduced
direct call to do_nmi() bypassed this.
Instead of patching each IDT, take a different approach by updating the
existing dispatch table. This allows for the removal of the remote IDT
patching and the removal of the nmi_crash() entry point.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
master commit: cbeeaa7da01bfa37c1fcdfe79e8f4f1400262ccb
master date: 2015-02-11 17:18:27 +0100
Paul Durrant [Wed, 18 Feb 2015 15:40:30 +0000 (16:40 +0100)]
x86/hvm: explicitly mark ioreq server pages dirty
...when they are added back into the guest physmap, when an ioreq
server is disabled. If this is not done then the pages are missed
during migration, causing ioreq server creation to fail on the remote end.
This problem only manifests if the ioreq server is non-default because in
the default case the pages are never removed from the guest physmap.
Paul Durrant [Wed, 18 Feb 2015 15:39:46 +0000 (16:39 +0100)]
x86/hvm: wait for at least one ioreq server to be enabled
In the case where a stub domain is providing emulation for an HVM
guest, there is no interlock in the toolstack to make sure that
the stub domain is up and running before the guest is unpaused.
Prior to the introduction of ioreq servers this was not a problem,
since there was only ever one emulator so ioreqs were simply
created anyway and the vcpu remained blocked until the stub domain
started and picked up the ioreq.
Since ioreq servers allow for multiple emulators for a single guest
it's not possible to know a priori which emulator will handle a
particular ioreq, so emulators must attach to a guest before the
guest runs.
This patch works around the lack of interlock in the toolstack for
stub domains by keeping the domain paused until at least one ioreq
server is created and enabled, which in practice means the stub
domain is indeed up and running.
Boris Ostrovsky [Wed, 18 Feb 2015 15:38:19 +0000 (16:38 +0100)]
x86/VPMU: disable when NMI watchdog is on
NMI watchdog sets APIC_LVTPC register to generate an NMI when PMU counter
overflow occurs. This may be overwritten by VPMU code later, effectively
turning off the watchdog.
We should disable VPMU when NMI watchdog is running.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: e5e09b5c46444b0ab8450c6d6ee8316d4016ac18
master date: 2015-02-03 11:30:09 +0100
Ian Jackson [Fri, 13 Feb 2015 16:04:34 +0000 (16:04 +0000)]
tools/configure: detect $host_vendor of rumprun, not just rumpxen
This has been renamed by the rumpkernels upstream.
(This patch needs to be backported.)
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Antti Kantee <pooka@iki.fi> CC: Martin Lucina <martin@lucina.net> CC: Ian Campbell <Ian.Campbell@eu.citrix.com>
(cherry picked from commit f4e99a4f5098f6fa3c856f79b8365bb29a3d3a15)
Wei Liu [Tue, 3 Feb 2015 13:47:08 +0000 (13:47 +0000)]
rump kernels: use new platform macro
Starting from rump kernel changeset 91d5623 ("Renaming platform macros,
app-tools and autoconf target string"), __RUMPUSER_XEN__ and __RUMPAPP__
are deleted. We are supposed to use __RUMPRUN__ instead.
We still keep __RUMPUSER_XEN__ for now in order to make xen-unstable
pass osstest push gate. I will remove __RUMPUSER_XEN__ later.
Related discussion:
http://thread.gmane.org/gmane.comp.rumpkernel.user/739
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
(cherry picked from commit 441256a532dd737905ce335506d2ffcf0ff0db7c)
Julien Grall [Wed, 21 Jan 2015 13:25:44 +0000 (13:25 +0000)]
libxc: introduce a per architecture scratch pfn for temporary grant mapping
The code to initialize the grant table in libxc uses
xc_domain_maximum_gpfn() + 1 to get a guest pfn for mapping the grant
frame and to initialize it.
This solution has two major issues:
- The check of the return of xc_domain_maximum_gpfn is buggy because
xen_pfn_t is unsigned and in case of an error -ERRNO is returned,
which is never caught by ( pfn <= 0 ).
- The guest memory layout may be filled up to the end, i.e.
xc_domain_maximum_gpfn() + 1 gives either 0 or an invalid PFN due to
hardware limitation.
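Both issues come down to unsigned arithmetic, which can be modelled in Python. The sketch assumes a 64-bit xen_pfn_t; the helper names are hypothetical.

```python
import errno

PFN_BITS = 64                    # assumption: xen_pfn_t is 64 bits wide
PFN_MASK = (1 << PFN_BITS) - 1


def to_xen_pfn(value):
    """Model the conversion of a signed return value to an unsigned pfn."""
    return value & PFN_MASK


def old_check_detects_error(ret):
    """The buggy test: compare the unsigned pfn against zero."""
    return to_xen_pfn(ret) <= 0


def signed_check_detects_error(ret):
    """Test the signed return value before converting it."""
    return ret < 0
```

A return of -ENOSYS becomes a very large unsigned pfn, so the old test never fires. Likewise, adding 1 to the highest possible pfn wraps around to 0.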
Furthermore, on ARM, xc_domain_maximum_gpfn() is not implemented and
returns -ENOSYS. This makes libxc always use the same PFN, which
may collide with an already mapped region (see xen/include/public/arch-arm.h
for the layout).
This patch only addresses the problem for ARM; the x86 version keeps the same
behavior (i.e. xc_domain_maximum_gpfn() + 1), as I'm not familiar with Xen x86.
A new function xc_core_arch_get_scratch_gpfn is introduced to be able to
choose the gpfn per architecture.
For the ARM version, we use the GUEST_GNTTAB_GUEST which is the base of
the region used by the guest to map the grant table. At build time,
nothing is mapped there.
At the same time correctly check the return of xc_domain_maximum_gpfn
for x86.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Cc: Jan Beulich <jbeulich@suse.com> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
(cherry picked from commit 5b5c40c0d10f5ffbd35b7eef6df34a086000442f)
Dan Carpenter [Tue, 3 Feb 2015 11:22:01 +0000 (12:22 +0100)]
bunzip2: off by one in get_next_block()
"origPtr" is used as an offset into the bd->dbuf[] array. That array is
allocated in start_bunzip() and has "bd->dbufSize" number of elements so
the test here should be >= instead of >.
Later we check "origPtr" again before using it as an offset so I don't
know if this bug can be triggered in real life.
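The off-by-one is easy to see in isolation. A Python sketch with hypothetical helper names, where dbuf_size plays the role of bd->dbufSize:

```python
def rejected_old(orig_ptr, dbuf_size):
    """Old test: lets orig_ptr == dbuf_size through."""
    return orig_ptr > dbuf_size


def rejected_new(orig_ptr, dbuf_size):
    """Fixed test: valid indexes are 0 .. dbuf_size - 1."""
    return orig_ptr >= dbuf_size
```

For an array of 4 elements, index 4 is one past the end; only the second test rejects it.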
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Trivial adjustments to make the respective Linux commit b5c8afe5be51078a979d86ae5ae78c4ac948063d apply to Xen.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
master commit: 39798e95a954eec660a3f5f21489c30ef78daf6d
master date: 2015-01-28 16:50:08 +0100
x86: vcpu_destroy_pagetables() must not return -EINTR
.. otherwise it has the side effect that domain_relinquish_resources
will stop and return to user-space with -EINTR, an error code user-space
is not equipped to deal with; or vcpu_reset, which will
ignore it and convert the error to -ENOMEM.
The preemption mechanism we have for domain destruction is to return
-EAGAIN (and then user-space calls the hypercall again) and as such we need
to catch the case of:
we need to return -ERESTART otherwise we end up returning -ENOMEM.
There are also other callers of vcpu_destroy_pagetables: arch_vcpu_reset
(vcpu_reset) are:
- hvm_s3_suspend (asserts on any return code),
- vlapic_init_sipi_one (asserts on any return code),
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jan Beulich <jbeulich@suse.com>
master commit: de4f284b3d7b47d3b9807f354552ecf3e0fff56b
master date: 2015-01-26 12:51:09 +0100
Jan Beulich [Tue, 3 Feb 2015 11:17:26 +0000 (12:17 +0100)]
x86: don't expose XSAVES capability to PV guests
As done by the recent Linux commit b65d6e17fe ("kvm: x86: mask out
XSAVES") for KVM, we should also mask out XSAVES from what PV guests
get to see as long as we don't emulate accesses to MSR_IA32_XSS.
Actually, go beyond that: Just like for leaf 7, switch from
blacklisting to whitelisting, i.e. only allow XSAVEOPT and XSAVEC for
the time being. And do these overrides consistently for both Dom0 and
DomU-s.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 8d050ed1097ce5f4bf6a1d6806fb1e3471976adb
master date: 2015-01-22 12:47:56 +0100
Andrew Cooper [Tue, 3 Feb 2015 11:16:30 +0000 (12:16 +0100)]
xsm/evtchn: never pretend to have successfully created a Xen event channel
Xen event channels are not internal resources. They still have one end in a
domain, and are created at the request of privileged domains. This logic
which "successfully" creates a Xen event channel opens up undesirable failure
cases with ill-specified XSM policies.
If a domain is permitted to create ioreq servers or memevent listeners, but
not to create event channels, the ioreq/memevent creation will succeed but
attempting to bind the returned event channel will fail without any indication
of a permission error.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
master commit: 09aa4759faa29c1fe735266de4c79f17329bd67b
master date: 2015-01-20 10:42:26 +0100
Jan Beulich [Tue, 3 Feb 2015 11:15:58 +0000 (12:15 +0100)]
common/memory: fix an XSM error path
XENMEM_{in,de}crease_reservation as well as XENMEM_populate_physmap
return the extent at which failure was detected, not error indicators.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Tim Deegan <tim@xen.org>
master commit: 76d4ff26d9647088353acaf4a56388a354a5d6e9
master date: 2015-01-19 11:59:05 +0100
Jan Beulich [Tue, 3 Feb 2015 11:15:03 +0000 (12:15 +0100)]
x86emul: tighten CLFLUSH emulation
While for us it's not as bad as it was for Linux, their commit 13e457e0ee ("KVM: x86: Emulator does not decode clflush well", by
Nadav Amit <namit@cs.technion.ac.il>) nevertheless points out two
shortcomings in our code: opcode 0F AE /7 is clflush only when it uses
a memory mode (otherwise it's SFENCE) and when there's no REP prefix
(an operand size prefix is fine, as that's CLFLUSHOPT).
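The decode rules just described can be written down as a small classifier. This is a Python sketch, not the emulator's code: the function name and the returned strings are illustrative, and the REP case is only labelled "not clflush" since the text does not say how it is handled.

```python
def classify_0f_ae_7(modrm, rep_prefix=False, opsize_prefix=False):
    """Classify opcode 0F AE /7 from its ModRM byte and prefixes."""
    mod = (modrm >> 6) & 0x3      # ModRM bits 7:6
    reg = (modrm >> 3) & 0x7      # ModRM bits 5:3, the /digit
    if reg != 7:
        raise ValueError("not a /7 encoding")
    if mod == 3:
        return "sfence"           # register form is not a cache flush
    if rep_prefix:
        return "not clflush"      # REP prefix rules out CLFLUSH
    return "clflushopt" if opsize_prefix else "clflush"
```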
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
master commit: 9d03db6b81d1880bf3aa4fc83a60346bf02be251
master date: 2015-01-12 15:41:12 +0100
Vijaya Kumar K [Tue, 9 Dec 2014 04:39:55 +0000 (10:09 +0530)]
xen/arm: Manage pl011 uart TX interrupt correctly
In pl011.c, when TX interrupt is received
serial_tx_interrupt() is called to push next
characters. If TX buffer is empty, serial_tx_interrupt()
does not disable TX interrupt and hence pl011 UART
irq handler pl011_interrupt() always sees TX interrupt
status set in MIS register and cpu does not come out of
UART irq handler.
With this patch, mask TX interrupt by writing 0 to
IMSC register when TX buffer is empty and unmask by
writing 1 to IMSC register before sending characters.
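The mask/unmask behaviour can be modelled as follows. This is a Python sketch, not the driver: it assumes the TX interrupt is bit 5 of IMSC/MIS, that the raw TX status is always asserted (the FIFO has room), and the class and method names are hypothetical.

```python
TXI = 1 << 5  # TX interrupt bit in IMSC / MIS


class Pl011Model:
    def __init__(self):
        self.imsc = TXI
        self.tx_queue = []

    def queue(self, data):
        """Queue characters and unmask the TX interrupt before sending."""
        self.tx_queue.extend(data)
        self.imsc |= TXI

    def tx_interrupt(self):
        """Push the next character, or mask TX when the buffer is empty."""
        if self.tx_queue:
            return self.tx_queue.pop(0)
        self.imsc &= ~TXI
        return None

    def tx_irq_pending(self):
        """Masked status: raw status (assumed set) gated by IMSC."""
        return bool(self.imsc & TXI)
```

Without the masking step, tx_irq_pending() would stay true with an empty buffer, which corresponds to the handler never exiting.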
Signed-off-by: Vijaya Kumar K <Vijaya.Kumar@caviumnetworks.com> Reviewed-by: Tim Deegan <tim@xen.org>
(cherry picked from commit 20e297f84035da854ceb6a160981f78ed56a408b)
Ian Campbell [Thu, 8 Jan 2015 11:53:55 +0000 (11:53 +0000)]
dt-uart: use ':' as separator between path and options
',' is a valid character in a device-tree path (see ePAPR v1.1 Table
2-1), in fact ',' is actually pretty common in node names.
Using ',' as a separator breaks for example on fast models. If you use
the full path (/smb/motherboard/iofpga@3,00000000/uart@090000) rather
than the alias then earlyprintk gives:
(XEN) Looking for UART console /smb/motherboard/iofpga@3
(XEN) Unable to find device "/smb/motherboard/iofpga@3"
(XEN) Bad console= option 'dtuart'
I actually noticed this on Jetson where the uart is
"/serial@0,70006300" and there happened to be no alias defined.
Instead use ':' as the separator, it is defined to terminate the path
in the context of /chosen/stdout-path (Table 3-4) which is pretty
closely analogous to the dtuart= option and so makes a pretty good
choice (especially since the next patch adds support for stdout-path).
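The effect of the separator choice on parsing, as a Python sketch. The function names are hypothetical, and the "115200" option string is purely illustrative since no DT aware driver takes options yet.

```python
def split_dtuart_old(value):
    """Old behaviour: ',' ends the path, truncating names that contain it."""
    return value.partition(",")[0]


def split_dtuart(value):
    """New behaviour: ':' separates the path from optional options."""
    path, sep, options = value.partition(":")
    return path, (options if sep else None)
```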
Since no DT aware driver currently supports any options there is no
point in retaining support for ',' for backwards compatibility.
Additionally, expand the buffer for the dtuart option, a path can be
far longer than 30 characters (in fact the maximum size of a single
node name is 31, so it's not even necessarily enough for an alias).
128 is completely arbitrary and allows for paths at least 8 deep even
with worst case node names.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Reviewed-by: Julien Grall <julien.grall@linaro.org>
(cherry picked from commit f01af57300cb60ab0fd8487fb5bbbe97bee234f0)
Julien Grall [Fri, 9 Jan 2015 15:56:45 +0000 (15:56 +0000)]
libxl: Don't ignore error when we fail to give access to ioport/irq/iomem
If we fail to give the access, the domain is unlikely to work correctly.
So we should bail out at the first error.
Signed-off-by: Julien Grall <julien.grall@linaro.org> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit 7070eec417934360bf3aed434191246dfe4f8091)
Jan Beulich [Wed, 7 Jan 2015 15:25:08 +0000 (16:25 +0100)]
VT-d: don't crash when PTE bits 52 and up are non-zero
This can (and will) be legitimately the case when sharing page tables
with EPT (more of a problem before p2m_access_rwx became zero, but
still possible even now when other than that is the default for a
guest), leading to an unconditional crash (in print_vtd_entries())
when a DMA remapping fault occurs.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
master commit: 46e0baf59105200d43612cf0c59de216958b008d
master date: 2015-01-07 11:13:58 +0100
Boris Ostrovsky [Wed, 7 Jan 2015 10:19:17 +0000 (11:19 +0100)]
x86/VPMU: Clear last_vcpu when destroying VPMU
We need to make sure that last_vcpu is not pointing to VCPU whose
VPMU is being destroyed. Otherwise we may try to dereference it in
the future, when VCPU is gone.
We have to do this via IPI since otherwise there is a (somewhat
theoretical) chance that between test and subsequent clearing
of last_vcpu the remote processor (i.e. vpmu->last_pcpu) might do
both vpmu_load() and then vpmu_save() for another VCPU. The former
will clear last_vcpu and the latter will set it to something else.
Performing this operation via IPI will guarantee that nothing can
happen on the remote processor between testing and clearing of
last_vcpu.
We should also check for VPMU_CONTEXT_ALLOCATED in vpmu_destroy() to
avoid unnecessary percpu tests and arch-specific destroy ops. Thus
checks in AMD and Intel routines are no longer needed.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
master commit: ed8017155607db1bbe1f6ca41eac696b7ef8082b
master date: 2015-01-07 11:12:27 +0100
Ian Jackson [Tue, 6 Jan 2015 18:40:19 +0000 (18:40 +0000)]
README: Rewrap to 70 columns.
The first paragraph seems to have been wrapped to 70, so do the other
new paragraphs to 70 too for visual consistency.
No non-whitespace change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Tue, 6 Jan 2015 18:40:18 +0000 (18:40 +0000)]
README: Minor punctuation and grammar changes
* Add two missing "and"s and a missing semicolon (a la Oxford comma).
* Use a double-space after full stop (like the first paragraph does).
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:32 +0000 (12:25 +0100)]
tools/hotplug: remove EnvironmentFile from xen-qemu-dom0-disk-backend.service
The referenced Environment file does not exist, and the service file
does not make use of variables anyway.
N.B. If we start honouring env settings for any reason this will
have to be changed.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:31 +0000 (12:25 +0100)]
tools/hotplug: use XENCONSOLED_TRACE in xenconsoled.service
Instead of inventing a new XENCONSOLED_LOG= variable reuse the
existing XENCONSOLED_TRACE= variable in xenconsoled.service.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:30 +0000 (12:25 +0100)]
tools/hotplug: use xencommons as EnvironmentFile in xenconsoled.service
The referenced sysconfig/xenconsoled does not exist. If anything
needs to be specified it has to go into the existing
sysconfig/xencommons file.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:29 +0000 (12:25 +0100)]
tools/hotplug: xendomains.service depends on network
Starting domains during boot will most likely require network for
the local bridge and it may need access to remote filesystems. Add
ordering tags to systemd service file.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Wei Liu <wei.liu2@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:28 +0000 (12:25 +0100)]
tools/hotplug: remove XENSTORED_ROOTDIR from xenstored.service
There is no need to export XENSTORED_ROOTDIR. This variable can be
enabled in sysconfig/xencommons. If the variable is unset xenstored
will automatically use @XEN_LIB_STORED@.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Olaf Hering [Fri, 19 Dec 2014 11:25:27 +0000 (12:25 +0100)]
tools/hotplug: remove SELinux options from var-lib-xenstored.mount
Using SELinux mount options by default breaks several systems.
Either the context= mount option is not known to the kernel at all,
as reported for ArchLinux, or the default value "none" is unknown to
SELinux, as reported for Fedora. In both cases the unit will fail.
The proper place to specify mount options is /etc/fstab. Apparently
systemd is kind enough to use values from there even if Options= or
What= is specified in a .mount file.
Remove XENSTORED_MOUNT_CTX, the reference to a non-existent
EnvironmentFile and trim default Options= for the mount point.
The removed code was first mentioned in the patch referenced below,
with the following description:
...
* Some systems define the selinux context in the systemd Option for
the /var/lib/xenstored tmpfs:
Options=mode=755,context="system_u:object_r:xenstored_var_lib_t:s0"
For the upstream version we remove that and let systems specify
the context on their system /etc/default/xenstored or
/etc/sysconfig/xenstored $XENSTORED_MOUNT_CTX variable
...
It is nowhere stated (on xen-devel) what "Some systems" means, which
is unfortunately common practice in nearly all opensource projects.
http://lists.xenproject.org/archives/html/xen-devel/2014-03/msg02462.html
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Cc: Anthony PERARD <anthony.perard@citrix.com> Cc: M A Young <m.a.young@durham.ac.uk> Cc: Luis R. Rodriguez <mcgrof@do-not-panic.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ed Swierk [Tue, 6 Jan 2015 15:21:07 +0000 (15:21 +0000)]
libxl: Fix building libxlu_cfg_y.y with bison 3.0
- Use %lex-param instead of obsolete YYLEX_PARAM to override lex scanner
parameter
- Change deprecated %name-prefix= to %name-prefix
Tested against bison 2.4.1 and 3.0.2.
This is expected to sometimes (depending on timestamps and whether the
bison input files are edited) break building on systems with ancient
versions of bison. Bison 2.4.1 is known to work and was released in
December 2008.
Also, consequently, regenerate the bison output files with bison
1:2.5.dfsg-2.1 from Debian wheezy.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Wei Liu <wei.liu2@citrix.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Tue, 6 Jan 2015 15:15:15 +0000 (15:15 +0000)]
libxl: Regenerate flex output files
Regenerate libxl_*_l.* with flex 2.5.35-10.1 as in current Debian
wheezy. The differences are trivial: addition of declarations of
xlu__cfg_yyget_column and xlu__cfg_yyset_column, but no code body
changes.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Fri, 19 Dec 2014 11:17:02 +0000 (11:17 +0000)]
EFI: suppress bogus loader warning
This was accidentally lost in commit fbc3d9a220 ("EFI: add
efi_arch_handle_cmdline() for processing commandline"), leading to the
"Unknown command line option" warning being printed whenever options
get passed to the core hypervisor or the Dom0 kernel.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Mihai Donțu [Tue, 6 Jan 2015 12:49:52 +0000 (12:49 +0000)]
x86/HVM: prevent use-after-free when destroying a domain
hvm_domain_relinquish_resources() can free certain domain resources
which can still be accessed, e.g. by HVMOP_set_param, while the domain
is being cleaned up.
This is CVE-2015-0361 / XSA-116.
Signed-off-by: Mihai Donțu <mdontu@bitdefender.com> Tested-by: Răzvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
reset PCI devices on force removal even when QEMU returns error
On do_pci_remove when QEMU returns error, we just bail out early without
resetting the device. On domain shutdown we are racing with QEMU exiting
and most often QEMU closes the QMP connection before executing the
requested command.
In these cases if force=1, it makes sense to go ahead with rest of the
PCI device removal, that includes resetting the device and calling
xc_deassign_device. Otherwise we risk not resetting the device properly
on domain shutdown.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Sun, 21 Dec 2014 11:18:53 +0000 (11:18 +0000)]
xen: arm: correct off-by-one error in consider_modules
By iterating up to <= mi->nr_mods we are running off the end of the boot
modules, but more importantly it causes us to then skip the first FDT reserved
region, meaning we might clobber it.
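The effect of the inclusive bound can be sketched in isolation. The code below is a minimal illustration, not the Xen source: the struct and function names are hypothetical, and it assumes (as the message implies) that a single index carries over from the module walk into the reserved-region walk.

```c
#include <stddef.h>

/* Hypothetical stand-in for the boot module table; not the Xen type. */
struct mod_table {
    size_t nr_mods;
};

struct walk_result {
    size_t modules_visited;  /* passes made over the module array */
    size_t first_reserved;   /* index of the first reserved region visited */
};

/*
 * Walk the modules, then continue into the reserved regions with the
 * same index. 'inclusive' selects the buggy "<=" loop bound.
 */
static struct walk_result walk(const struct mod_table *mi, int inclusive)
{
    struct walk_result r = { 0, 0 };
    size_t i;

    for (i = 0; inclusive ? i <= mi->nr_mods : i < mi->nr_mods; i++)
        r.modules_visited++;

    /* Reserved regions are indexed relative to the end of the modules. */
    r.first_reserved = i - mi->nr_mods;
    return r;
}
```

With three modules, the inclusive bound makes four passes (one past the end of the array) and starts the reserved-region walk at index 1 instead of 0.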
Signed-off-by: Ian Campbell <ijc@hellion.org.uk> Reviewed-by: Julien Grall <julien.grall@linaro.org>
Andrew Cooper [Mon, 5 Jan 2015 14:19:58 +0000 (14:19 +0000)]
tools/libxl: Use of init()/dispose() to avoid leaking libxl_dominfo.ssid_label
libxl_dominfo contains a ssid_label pointer which will have memory allocated
for it in libxl_domain_info() if the hypervisor has CONFIG_XSM compiled.
However, the lack of appropriate use of libxl_dominfo_{init,dispose}() will
cause the label string to be leaked, even in success cases.
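The init()/dispose() discipline can be sketched as follows. This is an illustration only: struct dominfo and the dominfo_* functions are hypothetical stand-ins, not the libxl types or API, and the label text is invented.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an info struct that owns a heap string. */
struct dominfo {
    char *ssid_label;
};

static void dominfo_init(struct dominfo *info)
{
    memset(info, 0, sizeof(*info));   /* ssid_label starts out NULL */
}

static void dominfo_dispose(struct dominfo *info)
{
    free(info->ssid_label);           /* free(NULL) is a no-op */
    info->ssid_label = NULL;
}

/* Hypothetical query that allocates the label, as an XSM build might. */
static int dominfo_query(struct dominfo *info)
{
    static const char label[] = "example_label";

    info->ssid_label = malloc(sizeof(label));
    if (info->ssid_label == NULL)
        return -1;
    memcpy(info->ssid_label, label, sizeof(label));
    return 0;
}

/* Every exit path, success included, goes through dispose. */
static int domain_has_label(void)
{
    struct dominfo info;
    int has;

    dominfo_init(&info);
    has = (dominfo_query(&info) == 0 && info.ssid_label != NULL);
    dominfo_dispose(&info);
    return has;
}
```

Pairing init with dispose on every path means the caller never needs to know whether the query happened to allocate.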
This was discovered by XenServer's Coverity scanning; the issues were not
identified by upstream Coverity Scan.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Fri, 14 Nov 2014 14:41:38 +0000 (14:41 +0000)]
libxl: Fix if{} nesting in do_pci_remove
do_pci_remove contained this:
    if (type == LIBXL_DOMAIN_TYPE_HVM) {
        [stuff]
    } else if (type != LIBXL_DOMAIN_TYPE_PV)
        abort();
    {
This is bizarre, and not correct. The effect is that HVM guests end
up running both the proper code and that intended for PV guests. This
causes (amongst other things) trouble when PCI devices are
hot-unplugged from HVM guests.
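The control flow can be reproduced in a small sketch. The enum and function names are hypothetical, an early return stands in for abort(), and the corrected shape shown is one plausible fix, not necessarily the one applied.

```c
enum dom_type { DOM_HVM, DOM_PV, DOM_OTHER };  /* hypothetical stand-ins */

#define RAN_HVM 1u
#define RAN_PV  2u

/* Buggy shape: the trailing block is not attached to any else. */
static unsigned buggy_paths(enum dom_type type)
{
    unsigned ran = 0;

    if (type == DOM_HVM) {
        ran |= RAN_HVM;
    } else if (type != DOM_PV)
        return 0;               /* stands in for abort() */
    {
        ran |= RAN_PV;          /* also runs after the HVM branch */
    }
    return ran;
}

/* One possible corrected shape: the PV code is a real else branch. */
static unsigned fixed_paths(enum dom_type type)
{
    unsigned ran = 0;

    if (type == DOM_HVM) {
        ran |= RAN_HVM;
    } else if (type == DOM_PV) {
        ran |= RAN_PV;
    } else {
        return 0;               /* stands in for abort() */
    }
    return ran;
}
```

For an HVM guest the buggy shape reports both paths as run, while the corrected shape reports only the HVM one.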
This bug was introduced in abfb006f "tools/libxl: explicitly grant
access to needed I/O-memory ranges".
This is a clear candidate for Xen 4.5, being a bugfix to an important
feature.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Robert Hu <robert.hu@intel.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: Sander Eikelenboom <linux@eikelenboom.it> CC: George Dunlap <George.Dunlap@eu.citrix.com>
Ian Jackson [Mon, 5 Jan 2015 14:31:00 +0000 (14:31 +0000)]
libxl: Initialise CTX->xce in domain suspend, as needed
When executing xl migrate/Remus, the following error can occur:
[root@master xen]# xl migrate 5 slaver
migration target: Ready to receive domain.
Saving to migration stream new xl format (info 0x1/0x0/1225)
Loading new save file <incoming migration stream> (new xl fmt info 0x1/0x0/12\
)
Savefile contains xl domain config in JSON format
Parsing config from <saved>
Segmentation fault (core dumped)
This is because CTX->xce is used without having been initialised.
The bug was introduced by commit 2ffeb5d7f5d8
libxl: events: Deregister evtchn fd when not needed
which removed the initialization of xce from libxl__ctx_alloc.
In this patch we initialise the CTX->xce before using it. Also, we
adjust the doc comment for libxl__ev_evtchn_* to mention the need to
do so.
Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Wei Liu <wei.liu2@citrix.com>
Avoid emitting an error message referring to an incorrect or corrupt
container file just because no entry was found for the running CPU.
Additionally switch the order of data validation and consumption in
cpu_request_microcode()'s first loop, and also check the types of
skipped blocks in container_fast_forward().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Fri, 12 Dec 2014 10:24:13 +0000 (10:24 +0000)]
domctl: fix IRQ permission granting/revocation
Commit 545607eb3c ("x86: fix various issues with handling guest IRQs")
wasn't really consistent in one respect: The granting of access to an
IRQ shouldn't assume the pIRQ->IRQ translation to be the same in both
domains. In fact it is wrong to assume that a translation is already/
still in place at the time access is being granted/revoked.
What is wanted is to translate the incoming pIRQ to an IRQ for
the invoking domain (as the pIRQ is the only notion the invoking
domain has of the IRQ), and grant the subject domain access to
the resulting IRQ.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Thu, 11 Dec 2014 10:47:21 +0000 (10:47 +0000)]
x86: don't deliver NMI to PVH Dom0
... for the time being: The mechanism used depends on the domain's use
of the IRET hypercall - which PVH is not using. HVM code (which PVH
uses) will deliver an NMI if it sees v->nmi_pending; however, that
temporary affinity adjustment gets undone in the HYPERVISOR_iret
handler, yet PVH can't call that hypercall.
Also drop two bogus code lines spotted while going through the involved
code paths: Addresses of per-CPU variables can't possibly be NULL, and
the setting of st->vcpu in send_guest_trap()'s MCE case is redundant
with an earlier cmpxchgptr().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
M A Young [Thu, 18 Dec 2014 10:02:16 +0000 (10:02 +0000)]
tools/xl: fix segfault in xl migrate --debug
If differences are found during the verification phase of xl migrate
--debug then it is likely to crash with a segfault because the bogus
pagebuf->pfn_types[pfn] is used in a print statement instead of
pfn_type[pfn] .
Signed-off-by: Michael Young <m.a.young@durham.ac.uk> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
George Dunlap [Tue, 9 Dec 2014 14:04:19 +0000 (14:04 +0000)]
libxl: Tell qemu to use raw format when using a tapdisk
At the moment libxl unconditionally passes the underlying file format
to qemu in the device string. However, when tapdisk is in use,
tapdisk handles the underlying format and presents qemu with
effectively a raw disk. When qemu looks at the tapdisk block device
and doesn't find the image format it was looking for, it will fail.
This effectively means that tapdisk cannot be used with HVM domains at
the moment except for raw files.
Instead, if we're using a tapdisk backend, tell qemu to use a raw file
format.
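The rule described above amounts to a single override. A minimal sketch, using a hypothetical enum and function name in place of the libxl ones:

```c
/* Hypothetical backend kinds; not the libxl enumeration. */
enum disk_backend { BACKEND_QDISK, BACKEND_TAP };

/*
 * Return the format name to hand to the emulator. With a tapdisk
 * backend the emulator sees an already-decoded block device, so the
 * image's own format must not be passed through.
 */
static const char *emulator_format(enum disk_backend backend,
                                   const char *image_format)
{
    if (backend == BACKEND_TAP)
        return "raw";
    return image_format;
}
```

For example, a vhd image served through the tap backend is reported as "raw", while a qcow2 image on another backend keeps its own format name.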
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
[ ijc -- nuked extra blank line ]
Wei Liu [Mon, 15 Dec 2014 10:56:24 +0000 (10:56 +0000)]
xl: print message to stdout when (!debug && dryrun)
In commit d36a3734a ("xl: fix migration failure with xl migrate
--debug"), the message is printed to stderr in both debug mode
and dryrun mode. That caused rdname() in xendomains to fail to parse
the domain name, since it expects input from xl's stdout.
So this patch separates those two cases. If xl is running in debug mode,
then message is printed to stderr; if xl is running in dryrun mode and
debug is not enabled, message is printed to stdout. This will fix
xendomains and other scripts that use "xl create --dryrun", as well as
not re-introducing the old bug fixed in d36a3734a.
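The stream selection described here can be sketched as a small helper. The function name is hypothetical, and returning NULL for "print nothing" is an assumption made for the sketch.

```c
#include <stdio.h>

/*
 * Pick the stream for the message: debug always wins and goes to
 * stderr; dryrun without debug goes to stdout so that scripts can
 * parse it; otherwise nothing is printed.
 */
static FILE *config_dump_stream(int debug, int dryrun)
{
    if (debug)
        return stderr;
    if (dryrun)
        return stdout;
    return NULL;
}
```

Checking debug first keeps the earlier fix intact: a debug run never writes the message to stdout.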
Reported-by: Mark Pryor <tlviewer@yahoo.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Cc: M A Young <m.a.young@durham.ac.uk> Cc: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Fri, 12 Dec 2014 18:26:02 +0000 (18:26 +0000)]
docs/commandline: Minor formatting fixes and clarifications
`font` had a trailing single quote which was out of place.
`gnttab_max_frames` was missing escapes for the underscores which caused the
underscores to take their markdown meaning, causing 'max' in the middle to be
italicised. Escape the underscores, and make all command line parameters
bold, to be consistent with the existing style.
Clarify how the default for `nmi` changes between debug and non debug builds
of Xen.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> CC: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Andrew Cooper [Tue, 9 Dec 2014 16:43:22 +0000 (16:43 +0000)]
python/xc: Fix multiple issues in pyflask_context_to_sid()
The error handling from a failed memory allocation should return
PyErr_SetFromErrno(xc_error_obj); rather than simply calling it and continuing
to the memcpy() below, with the dest pointer being NULL.
Coverity also complains about passing a non-NUL terminated string to
xc_flask_context_to_sid(). xc_flask_context_to_sid() doesn't actually take a
NUL terminated string, but it does take a char* which, in context, used to be
a string, which is why Coverity complains.
One solution would be to use strdup(ctx) which is simpler than a
strlen()/malloc()/memcpy() combo, which would result in a NUL-terminated
string being used with xc_flask_context_to_sid().
However, ctx is strictly an input to the hypercall and is not mutated along
the way. Both these issues can be fixed, and the error logic simplified, by
not duplicating ctx in the first place.
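The last option can be sketched without the Python bindings. Both functions below are hypothetical stand-ins (the real call takes different arguments), and the placeholder result exists only to make the sketch checkable.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical consumer: reads a buffer of 'len' bytes, never writes it. */
static int context_to_sid(const char *buf, size_t len, unsigned *sid)
{
    if (buf == NULL || len == 0)
        return -EINVAL;
    *sid = (unsigned)len;       /* placeholder result for the sketch */
    return 0;
}

/*
 * Pass the caller's string straight through. With no copy there is no
 * allocation to fail, nothing to free, and the buffer handed over is
 * still NUL-terminated.
 */
static int lookup_sid(const char *ctx, unsigned *sid)
{
    if (ctx == NULL)
        return -EINVAL;
    return context_to_sid(ctx, strlen(ctx), sid);
}
```

Because the input is read-only from start to finish, the only error path left is the one reported by the callee.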
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Coverity-IDs: 1055305 1055721 Acked-by: Ian Campbell <Ian.Campbell@citrix.com> CC: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Wei Liu <wei.liu2@citrix.com> CC: Xen Coverity Team <coverity@xen.org>
The UART is not able to receive bytes when idle mode is not configured
properly, therefore setup the UART with autoidle and wakeup enabled.
Older Linux kernels (for example 3.8) configure hwmods for all devices
even if the device tree nodes for those devices are absent from the
device tree, thus UART idle mode is configured too. With such kernels we can
work around the issue by adding a fake node in the UART containing this
MMIO range, which is therefore mapped by Xen to dom0, which
reconfigures the UART, causing things to work normally.
Newer Linux Kernels (3.12 and beyond) do not configure idle mode for
UART and so this hack no longer works.
Signed-off-by: Oleksandr Dmytryshyn <oleksandr.dmytryshyn@globallogic.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- updated commit message as discussed ]
Jan Beulich [Mon, 15 Dec 2014 08:30:05 +0000 (09:30 +0100)]
console: allocate ring buffer earlier
... when "conring_size=" was specified on the command line. We can't
really do this as early as we would want to when the option was not
specified, as the default depends on knowing the system CPU count. Yet
the parsing of the ACPI tables is one of the things that generates a
lot of output especially on large systems.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> (ARM and generic bits) Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Jan Beulich [Thu, 11 Dec 2014 16:14:07 +0000 (17:14 +0100)]
have architectures specify the number of PIRQs a hardware domain gets
The current value of nr_static_irqs + 256 is often too small for larger
systems. Make it dependent on CPU count and number of IO-APIC pins on
x86, and (until it obtains PCI support) simply NR_IRQS on ARM.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Jan Beulich [Thu, 11 Dec 2014 16:13:04 +0000 (17:13 +0100)]
lock down hypercall continuation encoding masks
Andrew validly points out that even if these masks aren't a formal part
of the hypercall interface, we aren't free to change them: A guest
suspended for migration in the middle of a continuation would fail to
work if resumed on a hypervisor using a different value. Hence add
respective comments to their definitions.
Additionally, to help future extensibility as well as in the spirit of
reducing undefined behavior as much as possible, refuse hypercalls made
with the respective bits non-zero when the respective sub-ops don't
make use of those bits.
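The stricter check can be sketched as below. The bit layout, the sub-op numbers and the error codes are all assumptions chosen for illustration; they are not the hypervisor's definitions.

```c
#include <errno.h>

/* Hypothetical encoding: low bits hold the sub-op, the rest hold
 * continuation state. Once guests can be suspended mid-continuation,
 * a split like this must never change. */
#define SUBOP_BITS  6
#define SUBOP_MASK  ((1u << SUBOP_BITS) - 1)

#define OP_PLAIN    1u   /* hypothetical sub-op that is never continued */
#define OP_RESTART  2u   /* hypothetical sub-op that uses the upper bits */

static int check_cmd(unsigned cmd)
{
    unsigned op = cmd & SUBOP_MASK;
    unsigned extra = cmd >> SUBOP_BITS;

    switch (op) {
    case OP_RESTART:
        return 0;                   /* upper bits carry progress */
    case OP_PLAIN:
        return extra ? -EINVAL : 0; /* upper bits must be zero */
    default:
        return -ENOSYS;
    }
}
```

Rejecting stray upper bits now leaves room to give them a meaning later without breaking callers that happened to pass garbage.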
Reported-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Release-Acked-by: Konrad Rzeszutek Wilk <Konrad.wilk@oracle.com>
Ian Jackson [Wed, 26 Nov 2014 17:28:18 +0000 (17:28 +0000)]
libxl: events: Document and enforce actual callbacks restriction
libxl_event_register_callbacks cannot reasonably be called while libxl
is busy (has outstanding operations and/or enabled events).
This is because the previous spec implied (although not entirely
clearly) that event hooks would not be called for existing fd and
timeout interests. There is thus no way to reliably ensure that libxl
would get told about fds and timeouts which it became interested in
beforehand.
So there have to be no such fds or timeouts, which means that the
callbacks must only be registered or changed when the ctx is idle.
Document this restriction, and enforce it with a pair of asserts.
(It would be nicer, perhaps, to say that the application may not call
libxl_osevent_register_hooks other than right after creating the ctx.
But there are existing callers, including libvirt, who do it later -
even after doing major operations such as domain creation.)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Wed, 26 Nov 2014 17:27:27 +0000 (17:27 +0000)]
libxl: events: Deregister evtchn fd when not needed
We want to have no fd events registered when we are idle.
In this patch, deal with the evtchn fd:
* Defer setup of the evtchn handle to the first use.
* Defer registration of the evtchn fd; register as needed on use.
* When cancelling an evtchn wait, or when wait setup fails, check
whether there are now no evtchn waits and if so deregister the fd.
* On libxl teardown, the evtchn fd should therefore be unregistered.
Assert that this is the case.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
v2: Do not bother putting evtchn_fd in the ctx; instead, get it
from xc_evtchn_fd when we need it. (Cosmetic.)
Do not register the evtchn fd multiple times: check it's not
registered before we call libxl__ev_fd_register. (Bugfix.)
Ian Jackson [Thu, 27 Nov 2014 18:04:29 +0000 (18:04 +0000)]
libxl: events: Tear down SIGCHLD machinery on ctx destruction
We want to have no fd events registered when we are idle.
Also, we should put back the default SIGCHLD handler. So:
* In libxl_ctx_free, use libxl_childproc_setmode to set the mode to
the default, which is libxl_sigchld_owner_libxl (ie `libxl owns
SIGCHLD only when it has active children').
But of course there are no active children at libxl teardown so
this results in libxl__sigchld_notneeded: the ctx loses its
interest in SIGCHLD (unsetting the SIGCHLD handler if we were the
last ctx) and deregisters the per-ctx selfpipe fd.
* Assert that this is the case, i.e. that we are no longer interested
in the selfpipe.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Thu, 27 Nov 2014 18:03:03 +0000 (18:03 +0000)]
libxl: events: Deregister, don't just modify, sigchld pipe fd
We want to have no fd events registered when we are idle. This
implies that we must be able to deregister our interest in the sigchld
self-pipe fd, not just modify to request no events.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Wed, 26 Nov 2014 16:44:52 +0000 (16:44 +0000)]
libxl: events: Deregister xenstore watch fd when not needed
We want to have no fd events registered when we are idle.
In this patch, deal with the xenstore watch fd:
* Track the total number of active watches.
* When deregistering a watch, or when watch registration fails, check
whether there are now no watches and if so deregister the fd.
* On libxl teardown, the watch fd should therefore be unregistered.
Assert that this is the case.
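The counting scheme in the list above can be sketched as follows. struct ctx and the functions are hypothetical stand-ins for the libxl internals, and a boolean flag stands in for actually registering the fd with the event loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-context state; not the libxl structure. */
struct ctx {
    unsigned nwatches;      /* total number of active watches */
    bool fd_registered;     /* is the watch fd known to the event loop? */
};

static void watch_add(struct ctx *c)
{
    if (!c->fd_registered)
        c->fd_registered = true;    /* register on first use */
    c->nwatches++;
}

static void watch_remove(struct ctx *c)
{
    assert(c->nwatches > 0);
    if (--c->nwatches == 0)
        c->fd_registered = false;   /* last watch gone: deregister */
}

/* At teardown nothing may remain registered. */
static void ctx_teardown(const struct ctx *c)
{
    assert(c->nwatches == 0);
    assert(!c->fd_registered);
}
```

A failed registration would take the same path as watch_remove(), so the fd is dropped whenever the count returns to zero.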
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Wed, 26 Nov 2014 16:17:49 +0000 (16:17 +0000)]
libxl: events: Assert that libxl_ctx_free is not called from a hook
No-one in their right mind would do this, and if they did everything
would definitely collapse. Arrange that if this happens, we crash
ASAP.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Tested-by: Ian Campbell <ian.campbell@citrix.com> Release-Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>