Introduce xc_domain_soft_reset() function supporting XEN_DOMCTL_soft_reset.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
x86-specific hook cleans up the pirq-emuirq mappings, destroys all ioreq
servers and replaces the shared_info frame with an empty page to support
subsequent XENMAPSPACE_shared_info call.
ARM-specific hook is -ENOSYS for now.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com>
New domctl resets state for a domain allowing it to 'start over': register
vcpu_info, switch to FIFO ABI for event channels. Grants that are still
active are logged to help debug misbehaving backends.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Log first 10 active grants for a domain. This function is going to be used
for soft reset, active grants on this path usually mean misbehaving backends
refusing to release their mappings on shutdown. We need that in addition to
the already existent 'g' keyhandler as such misbehaving backends can cause a
domain to crash right after the soft reset operation and 'g' option won't be
available in this case.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <jbeulich@suse.com>
Wei Liu [Thu, 10 Sep 2015 11:18:03 +0000 (12:18 +0100)]
configure: don't silently disable systemd support
Originally when user runs ./configure --enable-systemd and systemd
development library is not available the build system silently disables
systemd support. This is not in line with normal expectation.
Instead, configure should error out when user has asked for systemd
support but development libraries can't be found.
Reported-by: George Dunlap <george.dunlap@eu.citrix.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
domctl: lower loglevel of XEN_DOMCTL_memory_mapping
We should lower loglevel to XENLOG_G_DEBUG while mapping or
unmapping memory via XEN_DOMCTL_memory_mapping since it's
fair enough to check this info just while debugging.
Add the appropriate #if checks around the kexec code in the x86 codebase
so that the feature can actually be turned off by the flag instead of
always required to be enabled on x86.
Signed-off-by: Jonathan Creekmore <jonathan.creekmore@gmail.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: David Vrabel <david.vrabel@citrix.com>
x86: clean up vm_event-related code in asm-x86/domain.h
As suggested by Jan Beulich, moved struct monitor_write_data from
struct arch_domain to struct arch_vcpu, as well as moving all
vm_event-related data from asm-x86/domain.h to struct vm_event,
and allocating it dynamically only when needed.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
ACPI / table: Replace '1' with specific error return values
After commit 7f8f97c3cc (ACPI: acpi_table_parse() now returns
success/fail, not count), acpi_table_parse() returns '1' when it is
unable to find the table, but it should return a negative error code
in that case. Make it return -ENODEV instead.
Fix the same problem in acpi_table_init() analogously.
Signed-off-by: Hanjun Guo <hanjun.guo@linaro.org>
[rjw: Subject and changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[Linux commit 95df812dbdc350bfcf31e247e9100c378a472480] Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
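Illustration only, not part of the commit: a minimal C sketch of the return convention described above, where a missing table yields -ENODEV instead of an ambiguous positive value. The names demo_table, demo_table_parse and demo_accept are hypothetical and do not correspond to the real ACPI code.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a firmware table header; not the real type. */
struct demo_table {
    char signature[5];   /* 4-character signature plus NUL */
};

typedef int (*demo_table_handler)(const struct demo_table *table);

/*
 * Look up a table by signature and run the handler on it.
 * Returns the handler's result when a table matches, -EINVAL for bad
 * arguments, and -ENODEV when no table matches.
 */
int demo_table_parse(const struct demo_table *tables, size_t nr,
                     const char *id, demo_table_handler handler)
{
    size_t i;

    if (id == NULL || handler == NULL)
        return -EINVAL;

    for (i = 0; i < nr; i++)
        if (strncmp(tables[i].signature, id, 4) == 0)
            return handler(&tables[i]);

    return -ENODEV;   /* negative error code, never '1' */
}

/* Trivial handler used for demonstration: accepts every table. */
int demo_accept(const struct demo_table *table)
{
    (void)table;
    return 0;
}
```

With this convention a caller can treat any negative return as failure without special-casing positive values.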
Tomasz Nowicki [Wed, 9 Sep 2015 14:25:42 +0000 (16:25 +0200)]
ACPI/table: Always count matched and successfully parsed entries
acpi_parse_entries() allows to traverse all available table entries (aka
subtables) by passing max_entries parameter equal to 0, but since its count
variable is only incremented if max_entries is not 0, the function always
returns 0 for max_entries equal to 0. It would be more useful if it returned
the number of entries matched instead, so make it increment count in that
case too.
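Illustration only, not part of the commit: one way the behaviour described above can arise is a post-increment hidden behind a short-circuiting '||'. The sketch below assumes that shape for the "before" case; the function names and the flat array of entry types are hypothetical simplifications of the real subtable walk.

```c
#include <stddef.h>

/*
 * Assumed pre-fix shape: count++ sits on the right-hand side of '||',
 * so it is never evaluated when max_entries == 0.
 */
unsigned int demo_count_before(const int *types, size_t nr, int wanted,
                               unsigned int max_entries)
{
    unsigned int count = 0, handled = 0;
    size_t i;

    for (i = 0; i < nr; i++)
        if (types[i] == wanted &&
            (!max_entries || count++ < max_entries))
            handled++;          /* stands in for calling the handler */

    (void)handled;
    return count;
}

/* Post-fix shape: every entry that is matched and handled is counted. */
unsigned int demo_count_after(const int *types, size_t nr, int wanted,
                              unsigned int max_entries)
{
    unsigned int count = 0;
    size_t i;

    for (i = 0; i < nr; i++)
        if (types[i] == wanted &&
            (!max_entries || count < max_entries))
            count++;            /* the handler would run just before this */

    return count;
}
```

For max_entries equal to 0 the first variant always reports 0, while the second reports the number of matching entries.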
Objects loaded by FileHandle->Read need to be flushed from dcache,
otherwise copy_from_paddr will read stale data when copying the kernel,
causing a failure to boot.
Introduce efi_arch_flush_dcache_area and call it from read_file.
This commit introduces no functional changes on x86.
Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Sun, 6 Sep 2015 20:05:38 +0000 (21:05 +0100)]
libxc: don't populate same pfn more than once in populate_pfns
The original implementation of populate_pfns didn't consider the same
pfn can be present multiple times in the array. The mechanism to prevent
populating the same pfn multiple times only worked if the recurring pfn
appeared in different batches.
This bug was discovered by Linux 4.1 32 bit kernel save / restore test,
which has several ptes pointing to same pfn, which results in an array
containing recurring pfn. When libxc called x86_pv_localise_page, the
original implementation would populate the same pfn more than once.
The fix is to set bit in populated bitmap as we generate list of pfns to
be populated.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
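Illustration only, not part of the commit: a minimal C sketch of the fix described above, marking each pfn in the populated bitmap while the list of pfns to populate is being built, so a pfn that recurs inside one batch is emitted only once. The context structure, the fixed DEMO_MAX_PFN bound and all demo_* names are assumptions made for the example and are not libxc's real data structures.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_MAX_PFN       1024UL
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_BITMAP_LONGS  \
    ((DEMO_MAX_PFN + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Hypothetical restore context holding only the populated bitmap. */
struct demo_ctx {
    unsigned long populated[DEMO_BITMAP_LONGS];
};

static int demo_test_bit(const struct demo_ctx *ctx, unsigned long pfn)
{
    return (int)((ctx->populated[pfn / DEMO_BITS_PER_LONG] >>
                  (pfn % DEMO_BITS_PER_LONG)) & 1UL);
}

static void demo_set_bit(struct demo_ctx *ctx, unsigned long pfn)
{
    ctx->populated[pfn / DEMO_BITS_PER_LONG] |=
        1UL << (pfn % DEMO_BITS_PER_LONG);
}

/*
 * Copy into 'out' every pfn from 'pfns' that has not been populated yet.
 * The bit is set as soon as the pfn is added to the list, so duplicates
 * within the same batch are skipped as well. Returns the number of pfns
 * written to 'out', which must have room for 'count' entries.
 */
size_t demo_collect_unpopulated(struct demo_ctx *ctx,
                                const unsigned long *pfns, size_t count,
                                unsigned long *out)
{
    size_t i, nr = 0;

    for (i = 0; i < count; i++) {
        unsigned long pfn = pfns[i];

        if (pfn >= DEMO_MAX_PFN || demo_test_bit(ctx, pfn))
            continue;           /* out of range or already queued */

        demo_set_bit(ctx, pfn);
        out[nr++] = pfn;
    }

    return nr;
}
```

Setting the bit only after the whole batch had been populated would leave the in-batch duplicate undetected, which is the situation the commit message describes.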
Paul Durrant [Wed, 2 Sep 2015 11:17:05 +0000 (12:17 +0100)]
public/io/netif.h: move and amend multicast control documentation
netif.h contains a specification of the XEN_NETIF_EXTRA_TYPE_MCAST_{ADD,DEL}
extra info messages required to manipulate a multicast filter list maintained
by a backend and specifies the xenstore negotiation protocol in a comment
just above the structure definition, which is easy to miss.
This patch moves the documentation of the xenstore negotiation to be
co-located with the documentation for other features and also amends the
wording to be clearer.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Keir Fraser <keir@xen.org> Cc: Tim Deegan <tim@xen.org> Acked-by: Wei Liu <wei.liu2@citrix.com>
[ ijc -- added a blank line to the comment ]
Update the top-level make help to include all the possible targets and
not reference targets that are deprecated while hopefully being more
clear as to what each target does.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Thu, 3 Sep 2015 18:27:47 +0000 (19:27 +0100)]
tools/xen-access: use PRI_xen_pfn
Otherwise when building with 32bit compiler, we get:
xen-access.c: In function 'xenaccess_init':
xen-access.c:263:5: error: format '%llx' expects argument of type 'long long unsigned int', but argument 3 has type 'xen_pfn_t' [-Werror=format]
cc1: all warnings being treated as errors
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Tamas K Lengyel <tamas@tklengyel.com> Acked-by: Razvan Cojocaru <rcojocaru@bitdefender.com>
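Illustration only, not part of the commit: the general technique behind the fix above is to pair an integer typedef with its own printf format macro, so the format string follows the type on every target. The sketch uses a hypothetical demo_pfn_t and PRI_demo_pfn rather than the real xen_pfn_t and PRI_xen_pfn definitions, whose underlying types depend on the architecture.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical frame-number type and its matching format macro. */
typedef uint64_t demo_pfn_t;
#define PRI_demo_pfn PRIx64

/*
 * Format a frame number into 'buf'. Because the conversion specifier
 * comes from PRI_demo_pfn, changing the typedef only requires changing
 * the macro next to it, not every format string.
 */
int demo_format_pfn(char *buf, size_t len, demo_pfn_t pfn)
{
    return snprintf(buf, len, "pfn %" PRI_demo_pfn, pfn);
}
```

A hard-coded '%llx' would only be correct on targets where the typedef happens to be 'unsigned long long', which is the mismatch the compiler error above reports.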
The xenheap bits variable is used to know the last RAM MFN always mapped
in Xen virtual memory. If the value is 0, it means that all the memory is
always mapped in Xen virtual memory.
On X-gene the RAM bank resides above 128GB and last xenheap MFN is
0x4400000. With the new way to calculate the number of bits, xenheap_bits
will be equal to 38 bits. This hides all the RAM and makes it
impossible to allocate xenheap memory.
Given that aarch64 always has all the memory mapped in Xen virtual
memory, it's not necessary to call xenheap_max_mfn, which sets the number
of bits.
Suggested-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
tmem: Remove extra spaces at end and some hard tabbing.
My editor marks these in glowing red so removing them to
make it easier to focus on code.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Patch "tmem: Make the uint64_t oid[3] a proper structure:
xen_tmem_oid" converted the sysctl API to use a
proper structure. But it did not do it for the tmem hypercall.
This expands that and converts the tmem hypercall. For this
to work we define the struct in tmem.h and include it in
sysctl.h.
This change also included work to make the compat layer
happy. That was to declare the struct xen_tmem_oid to be
checked in xlat.lst - which will construct a typedef
in the compat file with the same type, hence allowing
copying of 'oid' member without type issues. The kicker
is that the compat layer adds the prefix 'xen' and since
our structure already has it - we must not include it.
The layout (and size) of this structure in memory for the
'struct tmem_op' (so guest facing) is the same! Verified
via pahole and with 32/64 bit guests.
tmem: Make the uint64_t oid[3] a proper structure: xen_tmem_oid
And use it almost everywhere. It is easy to use it for the
sysctl since the hypervisor and toolstack are intertwined.
But for the tmem hypercall we need to be diligent (as it
is guest facing) so delaying that to another patch:
"tmem: Use 'struct xen_tmem_oid' for every user" to help
with bisection issues.
We also move some of the parameters on functions to be within
the right location.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Jan Beulich <jbeulich@suse.com>
tmem: Remove the old tmem control XSM checks as it is part of sysctl hypercall.
The sysctl is where the tmem control operations are done and the
XSM checks are done via there. The old mechanism (to check
for control tmem op XSM from do_tmem_op) is not needed anymore.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
tmem: Move TMEM_CONTROL subop of tmem hypercall to sysctl.
The operations are to be used by a control domain to set parameters,
list pools, clients, and to be used during migration.
There is no need to have them in the tmem hypercall path.
This patch moves code without adding fixes - and in fact in
some cases makes the parameters so long that they hurt eyes - but
that is for another patch.
Note that in regards to existing users:
- Only the control domain could call it - which meant that if
a guest called it, it would get -EPERM, so we are OK there.
In practice no guests called this TMEM_CONTROL command.
- The spec: https://oss.oracle.com/projects/tmem/dist/documentation/api/tmemspec-v001.pdf
mentions: "TBD [Not sure if this is really needed.]"
which is a carte blanche as any to do this!
Note: The XSM check is the same - we just move it from do_tmem_op
to do_sysctl.
We also add a 32-bit pad to make the sysctl structure have the same
exact size under 32 and 64-bit toolstacks and not worry about alignment
issues.
And the XLAT does not need to deal with the buf as it has been
moved to another structure which is 32/64 fixed.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Acked-by: Jan Beulich <jbeulich@suse.com>
It mentions it but it is never used. The hypercall interface
knows nothing of this sort of thing either. Let's just remove it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com> [release + toolstack] Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
tmem: Remove in xc_tmem_control_oid duplicate set_xen_guest_handle call
We are doing another call to set_xen_guest_handle right
after the xc_hypercall_bounce_pre (the correct place to do it).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
tmem: Add ASSERT in obj_rb_insert for pool->rwlock lock.
Manipulating the obj-> structures requires us to hold the
pool->rwlock lock. Let's make that obvious in this function to
catch any errant users (none found, but we may in future).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
When we are using shared pools we have a global array
(on which we put the pool), and an array of pools per domain.
We also have a shared list of clients (guests) _except_
for the very first domain that created the shared pool.
To deal with multiple guests using a shared pool we have a
ref count and a linked list. Whenever a new user of
the shared pool joins we increase the ref count and add to
the linked list. Whenever a user quits the shared pool
we decrement and remove from the linked list.
Unfortunately this ref counting and linked list never
worked properly. There are multiple issues:
1) If we have one shared pool (and only one guest creating it)
- we do not add it to the shared list of clients. Which
means the logic in 'shared_pool_quit' never removed
the pool from global_shared_pools. That meant when the pool
was de-allocated - we still had a pointer to the pool
which would be accessed by tmemc_list_client (xl tmem-list -a)
and hit a NULL page!
2) If we have two shared pools in a domain - it (shared_pool_quit)
would remove the domain from the share_list linked list, decrement
the refcount to zero - and remove the pool from the global shared pool.
When done it would also clear the client->pools[] to NULL for itself.
Which is good. However since there are two shared pools in the domain
the next entry in the client->pools[] would have a stale pointer to
the just de-allocated pool. Accessing it and trying to de-allocate it
would lead to crashes or hypervisor hang - depending on the build.
Fun times!
To trigger this use
http://xenbits.xen.org/gitweb/?p=xentesttools/bootstrap.git;a=blob;f=root_image/drivers/tmem_test/tmem_test.c
This patch fixes it by making the very first domain that created
a shared pool to follow the same logic as every domain that is
joining a shared pool. That is increment the refcount and also
add itself to the shared list of domains using it.
We also remove an ASSERT that incorrectly assumed
that only one shared pool would exist for a domain.
And to mirror the reporting logic in shared_pool_join
we also add a printk to advertise inter-domain shared pool
joining.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
George Dunlap [Wed, 2 Sep 2015 09:34:55 +0000 (10:34 +0100)]
tools: Honor Config.mk debug value, rather than setting our own
Changeset 1166ecf ('tools/Rules.mk: Don't optimize debug builds; add
macro debugging information') exposed a bug whereby the autoconf stuff
in tools was setting its own debug value (defaulting to ENABLED, even
for releases) instead of honoring the value set in Config.mk.
After that changeset, if the global build has -D_FORTIFY_SOURCE
enabled (as is the default in CentOS 7 rpmbuild), then the tools build
will fail (because debug builds default to on).
There should be only one place to specify whether to build debug or
not, and Config.mk is already included by the relevant makefiles. So
simply remove the tools/configure debug option and everything falls
into place naturally.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Thu, 13 Aug 2015 11:03:43 +0000 (12:03 +0100)]
xen/arm: mm: Do not dump the p2m when mapping a foreign gfn
The physmap operation XENMAPSPACE_gmfn_foreign is dumping the p2m when
an error occurred by calling dump_p2m_lookup. But this function is not
using ratelimited printk.
Any domain able to map foreign gmfn would be able to flood the Xen
console.
The information wasn't useful so drop it.
This is XSA-141.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Since 9c89dc95201ffed5fead17b35754bf9440fdbdc0 libxenstore prefers using
/dev/xen/xenbus over /proc/xen/xenbus. This makes the OCaml xenstore
library contain the same preference.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com> Acked-by: David Scott <dave.scott@citrix.com>
Jan Beulich [Tue, 1 Sep 2015 14:51:44 +0000 (16:51 +0200)]
x86/mm: make {set,clear}_identity_p2m_mapping() work for PV guests
Namely Dom0 suffers from commit 5ae03990c1 ("xen/vtd: create RMRR
mapping") having removed the creation of such mappings for non-
translated guests.
Reported-by: Malcolm Crossley <malcolm.crossley@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: George Dunlap <george.dunlap@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Tue, 1 Sep 2015 12:02:57 +0000 (14:02 +0200)]
x86/NUMA: make init_node_heap() respect Xen heap limit
On NUMA systems, where we try to use node local memory for the basic
control structures of the buddy allocator, this special case needs to
take into consideration a possible address width limit placed on the
Xen heap. In turn this (but also other, more abstract considerations)
requires that xenheap_max_mfn() not be called more than once (at most
we might permit it to be called a second time with a larger value than
was passed the first time), and be called only before calling
end_boot_allocator().
While inspecting all the involved code, a couple of off-by-one issues
were found (and are being corrected here at once):
- arch_init_memory() cleared one too many page table slots
- the highmem_start based invocation of xenheap_max_mfn() passed too
big a value
- xenheap_max_mfn() calculated the wrong bit count in edge cases
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
With the addition of FMODE_ATOMIC_POS in the Linux 3.14 kernel,
concurrent blocking file accesses to a single open file descriptor can
cause a deadlock trying to grab the file position lock. If a watch has
been set up, causing a read_thread to perform a blocking read on the file
descriptor, then future writes that would cause the background read to
complete will block waiting on the file position lock before they can
execute. This race condition only occurs when libxenstore is accessing
the xenstore daemon through the /proc/xen/xenbus file and not through
the unix domain socket, which is the case when the xenstore daemon is
running as a stub domain or when oxenstored is passed
--disable-socket. Accessing the daemon from the true character device
also does not exhibit this problem.
On Linux, prefer using the character device file over the proc file if
the character device exists.
Signed-off-by: Jonathan Creekmore <jonathan.creekmore@gmail.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
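Illustration only, not part of the commit: a minimal C sketch of the selection rule described above, preferring the character device when it exists and falling back to the proc file otherwise. The existence check is passed in as a callback so the sketch stays independent of the host filesystem; demo_pick_xenbus_path, demo_exists_fn and the two helper predicates are hypothetical names.

```c
#include <stddef.h>

/* Predicate telling whether a path exists; returns non-zero if it does. */
typedef int (*demo_exists_fn)(const char *path);

/*
 * Choose the xenbus device node: the character device is preferred,
 * the proc file is only used when the character device is absent.
 * A real caller could implement the predicate with access(path, F_OK).
 */
const char *demo_pick_xenbus_path(demo_exists_fn exists)
{
    static const char dev_path[] = "/dev/xen/xenbus";
    static const char proc_path[] = "/proc/xen/xenbus";

    if (exists != NULL && exists(dev_path))
        return dev_path;

    return proc_path;
}

/* Demonstration predicates: every path exists / no path exists. */
int demo_always(const char *path)
{
    (void)path;
    return 1;
}

int demo_never(const char *path)
{
    (void)path;
    return 0;
}
```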
Jan Beulich [Mon, 31 Aug 2015 11:52:24 +0000 (13:52 +0200)]
x86/NUMA: don't account hotplug regions
... except in cases where they really matter: node_memblk_range[] now
is the only place all regions get stored. nodes[] and NODE_DATA() track
present memory only. This improves the reporting when nodes have
disjoint "normal" and hotplug regions, with the hotplug region sitting
above the highest populated page. In such cases a node's spanned-pages
value (visible in both XEN_SYSCTL_numainfo and 'u' debug key output)
covered all the way up to top of populated memory, giving quite
different a picture from what an otherwise identically configured
system without any hotplug regions would report. Note, however, that
the actual hotplug case (as well as cases of nodes with multiple
disjoint present regions) is still not being handled such that the
reported values would represent how much memory a node really has (but
that can be considered intentional).
Reported-by: Jim Fehlig <jfehlig@suse.com>
This at once makes nodes_cover_memory() no longer consider E820_RAM
regions covered by SRAT hotplug regions.
Also reject self-overlaps with mismatching hotplug flags.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Tested-by: Jim Fehlig <jfehlig@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Mon, 31 Aug 2015 11:51:52 +0000 (13:51 +0200)]
x86/NUMA: fix setup_node()
The function referenced an __initdata object (nodes_found). Since this
being a node mask was more complicated than needed, the variable gets
replaced by a simple counter. Check at once that the count of nodes
doesn't go beyond MAX_NUMNODES.
Also consolidate four printk()s related to the function's use into just
one.
Finally (quite the opposite of the above issue) __init-annotate
nodes_cover_memory().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Mon, 31 Aug 2015 11:50:56 +0000 (13:50 +0200)]
x86: adjustments to memory_add()
The function should clean up after a failed map_pages_to_xen().
Sharing the M2P table with Dom0 needs to happen before adding the new
pages to the heap (so pages handed out by the allocator will be
represented in what a tool stack may need to map).
Avoid the IOMMU mapping loop whenever possible.
Drop a redundant setting of 'ret'.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Tamas K Lengyel [Fri, 28 Aug 2015 10:17:05 +0000 (12:17 +0200)]
x86/vmx: fix vmx_is_singlestep_supported return value
The function is supposed to return a boolean but instead it returned
the value 0x8000000 which is the Intel internal flag for MTF. This has
caused various checks using this function to falsely report no MTF
capability.
Signed-off-by: Tamas K Lengyel <tlengyel@novetta.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
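Illustration only, not part of the commit: a minimal C sketch of why returning a raw masked flag instead of a boolean can misreport a capability. It assumes, purely for the example, that the result ends up in an 8-bit boolean type, where a value with only bit 27 set truncates to 0; the demo_* names and the demo_bool_t typedef are hypothetical.

```c
#include <stdint.h>

#define DEMO_MONITOR_TRAP_FLAG 0x08000000u   /* value quoted in the commit */

typedef unsigned char demo_bool_t;           /* hypothetical 8-bit boolean */

/* Returns the raw masked bit: either 0 or 0x08000000. */
uint32_t demo_singlestep_raw(uint32_t exec_control)
{
    return exec_control & DEMO_MONITOR_TRAP_FLAG;
}

/* Returns a normalised boolean: either 0 or 1. */
int demo_singlestep_supported(uint32_t exec_control)
{
    return !!(exec_control & DEMO_MONITOR_TRAP_FLAG);
}

/* Storing the raw value in a narrow type drops the only set bit. */
demo_bool_t demo_store_raw(uint32_t exec_control)
{
    return (demo_bool_t)demo_singlestep_raw(exec_control);
}

/* Storing the normalised value keeps the answer intact. */
demo_bool_t demo_store_fixed(uint32_t exec_control)
{
    return (demo_bool_t)demo_singlestep_supported(exec_control);
}
```

The double negation is the usual idiom for collapsing any non-zero value to 1 before it is stored or compared.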
Andrew Cooper [Thu, 27 Aug 2015 19:13:16 +0000 (20:13 +0100)]
docs: Fix installation of man8 pages
c/s a430436 "docs: Support for generating man(8) pages" accidentally
failed to update the install and clean rules for man8 pages, meaning
that c/s 7b21214 "docs: Move xentrace.8 to docs/man/xentrace.pod.8"
caused a packaging regression when it came to xentop.8.
To avoid similar bugs in the future, move the generation of the build,
install and clean rules into the manpage metarule.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Wed, 26 Aug 2015 09:15:20 +0000 (09:15 +0000)]
docs: Support for generating man(8) pages
The manpage rules are very repetitive, because of the section number being
present in the filenames.
Instead of adding another set of 3 rules, switch to using a metarule to
automate the repetitive action. New rules for different manpage sections can
be added simply by extending the $(foreach).
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Doug Goldstein [Tue, 25 Aug 2015 13:49:49 +0000 (13:49 +0000)]
build: use correct qemu emulator binary
Per http://wiki.qemu.org/ChangeLog/1.0 and the fact that no currently
supported distro ships the x86 system emulator binary as 'qemu', this
changes the default when a user specifies --with-system-qemu without a
PATH to 'qemu-system-i386', otherwise the default results in a
non-functional setup.
[ Reran autogen.sh -iwj ]
Signed-off-by: Doug Goldstein <cardoe@cardoe.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Wei Liu [Thu, 27 Aug 2015 15:54:01 +0000 (16:54 +0100)]
build: fix tarball stubdom build
When we create a source code tarball, mini-os is extracted to
extras/mini-os directory. When building a source code tarball, we
shouldn't clone mini-os again.
Only clone mini-os when that directory doesn't exist. This fixes tarball
build and doesn't affect non-tarball build.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Thu, 27 Aug 2015 15:40:38 +0000 (17:40 +0200)]
IOMMU: skip domains without page tables when dumping
Reported-by: Roger Pau Monné <roger.pau@citrix.com> Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Roger Pau Monné <roger.pau@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Specifically we are pulling in the upstream patch (commit 1b56452121672e6408c38ac8926bdd6998a39004):
[ath9k] Remove confusing logic inversion in an ANI variable
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Tue, 25 Aug 2015 14:18:31 +0000 (16:18 +0200)]
x86/IO-APIC: don't create pIRQ mapping from masked RTE
While moving our XenoLinux patches to 4.2-rc I noticed bogus "already
mapped" messages resulting from Linux (legitimately) writing RTEs with
only the mask bit set. Clearly we shouldn't even attempt to create a
pIRQ <-> IRQ mapping from such RTEs.
In the course of this I also found that the respective message isn't
really useful without also printing the pre-existing mapping. And I
noticed that map_domain_pirq() allowed IRQ0 to get through, despite us
never allowing a domain to control that interrupt.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
x86, amd_ucode: skip microcode updates for final levels
Some older [Fam10h] systems require that a certain number of
applied microcode patch levels not be overwritten by
the microcode loader. Otherwise, system hangs are known to occur.
The 'final_levels' of patch ids have been obtained empirically.
Refer to bug https://bugzilla.suse.com/show_bug.cgi?id=913996
for details of the issue.
The short version is that people have predominantly noticed
system hang issues when trying to update microcode levels
beyond the patch IDs below.
[0x01000098, 0x0100009f, 0x010000af]
From internal discussions, we gathered that OS/hypervisor
cannot reliably perform microcode updates beyond these levels
due to hardware issues. Therefore, we need to abort microcode
update process if we hit any of these levels.
In this patch, we check for those microcode versions and abort
if the current core has one of those final patch levels applied
by the BIOS.
A Linux version of the patch has already made it into tip-
http://marc.info/?l=linux-kernel&m=143703405627170
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@amd.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
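Illustration only, not part of the commit: a minimal C sketch of the check described above, refusing a microcode update when the patch level currently applied is one of the listed final levels. The three level values are taken from the commit message; the function names and the "new revision must be higher" rule in demo_may_update are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Final patch levels as listed in the commit message. */
static const uint32_t demo_final_levels[] = {
    0x01000098, 0x0100009f, 0x010000af
};

/* Returns 1 if the currently applied patch level is a final level. */
int demo_is_final_level(uint32_t current_rev)
{
    size_t i;

    for (i = 0; i < sizeof(demo_final_levels) / sizeof(demo_final_levels[0]);
         i++)
        if (demo_final_levels[i] == current_rev)
            return 1;

    return 0;
}

/*
 * Returns 1 if an update may be attempted, 0 if it must be skipped.
 * The comparison of revisions is an assumed policy for illustration.
 */
int demo_may_update(uint32_t current_rev, uint32_t new_rev)
{
    if (demo_is_final_level(current_rev))
        return 0;               /* abort: final level already applied */

    return new_rev > current_rev;
}
```

Note that the check looks at the level already present on the core, not at the level contained in the update being offered.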
Wei Liu [Mon, 17 Aug 2015 18:57:01 +0000 (19:57 +0100)]
libxc: fix vNUMA memory allocation
Only 4KB allocation was using new_memflags. We should use new_memflags
for 2MB and 1GB allocation as well because that variable contains
node information.
Without this patch, when creating a HVM guest with vNUMA, because the
node information was not present in the flags passed to libxc, actual
memory allocation didn't comply with what user specified. With this
patch the behaviour is correct.
Reported-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Mon, 17 Aug 2015 18:57:00 +0000 (19:57 +0100)]
xl: error out if vNUMA specifies more vcpus than pcpus
... but allow user to override that check by specifying maxvcpus= in xl
configuration file.
Note that the code is constructed such that the fallout is dealt with
after parsing. We can live with that because, though it wastes a few
cpu cycles, it is still functionally correct and I would like to have
a clear split between parsing and dealing with fallouts.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Before this change, vdistance from node 0 to all nodes (including
itself) was 10 and vdistance from node 1 to all nodes was 20.
After this change, vdistance from node 0 to itself is 10, to node 1 is
20 and vdistance from node 1 to node 0 is 20, to itself is 10. That's
the correct vdistance settings we expect.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
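Illustration only, not part of the commit: a minimal C sketch of the expected result described above, a distance matrix with 10 on the diagonal and 20 everywhere else. The flat row-major layout and the demo_* names are assumptions made for the example and are not xl's actual data structures.

```c
#include <stddef.h>

#define DEMO_LOCAL_DISTANCE  10u
#define DEMO_REMOTE_DISTANCE 20u

/*
 * Fill an nr_vnodes x nr_vnodes matrix stored row by row in 'dist':
 * the distance from a node to itself is local, to any other node remote.
 */
void demo_fill_vdistances(unsigned int *dist, size_t nr_vnodes)
{
    size_t i, j;

    for (i = 0; i < nr_vnodes; i++)
        for (j = 0; j < nr_vnodes; j++)
            dist[i * nr_vnodes + j] =
                (i == j) ? DEMO_LOCAL_DISTANCE : DEMO_REMOTE_DISTANCE;
}
```

For two nodes this yields the rows (10, 20) and (20, 10), matching the "after this change" values quoted above.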
Boris Ostrovsky [Fri, 14 Aug 2015 16:18:52 +0000 (12:18 -0400)]
libxc: allow empty memory nodes in vNUMA
The test for 'nr_vmemranges < nr_vnodes' in xc_domain_setvnuma() was
originally written with the idea that number of memory ranges would
at least be equal to number of nodes.
We may want to specify nodes with no memory, however, and thus this
check should be removed.
Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Roger Pau Monne [Fri, 7 Aug 2015 10:17:38 +0000 (12:17 +0200)]
libxl: fix libxl__build_hvm error handling
With the current code in libxl__build_hvm it is possible for the function to
fail and still return 0.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Fri, 14 Aug 2015 10:36:26 +0000 (12:36 +0200)]
x86/HVM: honor p2m_ram_ro in hvm_map_guest_frame_rw()
... and its callers.
While all non-nested users are made to fully honor the semantics of that
type, doing so in the nested case seemed insane (if doable at all,
considering VMCS shadowing), and hence there the respective operations
are simply made to fail.
One case not (yet) taken care of is that of a page getting transitioned
to this type after a mapping got established.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Kevin Tian <kevin.tian@intel.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jean Delvare [Thu, 13 Aug 2015 12:48:40 +0000 (14:48 +0200)]
x86/dmi_scan: only honor end-of-table for 64-bit tables
A 32-bit entry point to a DMI table says how many structures the table
contains. The SMBIOS specification explicitly says that end-of-table
markers should be ignored if they are not actually at the end of the
DMI table. So only honor the end-of-table marker for tables accessed
through 64-bit entry points, as they do not specify a structure count.
Signed-off-by: Jean Delvare <jdelvare@suse.de>
[Linux commit 17cd5bd5391e6e7b363d66335e1bc6760ae969b9] Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Thu, 13 Aug 2015 12:47:06 +0000 (14:47 +0200)]
add page_get_owner_and_reference() related ASSERT()s
The function shouldn't return NULL after having obtained a reference,
or else the caller won't know to drop it.
Also its result shouldn't be ignored - if calling code is certain that
a page already has a non-zero refcount, it better ASSERT()s so.
Finally this as well as get_page() and put_page() are required to be
available on all architectures - move the declarations to xen/mm.h.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Tim Deegan <tim@xen.org> Acked-by: Ian Campbell <ian.campbell@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
hvm_event_crX() already returns a bool_t to tell us whether an
event will be sent out or not, so the extra check that value != old
is not only useless, but also prevents non-onchangeonly events from
being sent.
Signed-off-by: Razvan Cojocaru <rcojocaru@bitdefender.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Jan Beulich [Thu, 13 Aug 2015 12:44:21 +0000 (14:44 +0200)]
x86/p2m: clear_identity_p2m_entry() must cope with 'relaxed' RDM mode
Tearing down a 1:1 mapping that was never established isn't really nice
(and in fact hits an ASSERT() in p2m_remove_page()). Convert from a
wrapper macro to a proper function which then can take care of the
situation.
Also take the opportunity to remove the 'page_order' parameter of
clear_identity_p2m_entry(), to make it match set_identity_p2m_entry().
Signed-off-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Julien Grall [Thu, 13 Aug 2015 12:41:09 +0000 (14:41 +0200)]
mm: populate_physmap: validate correctly the gfn for direct mapped domain
A direct mapped domain already has its memory allocated 1:1, so we
directly use the gfn as the mfn to map the RAM in the guest.
While we validate that the page associated with the first mfn belongs to
the domain, the subsequent MFNs are not validated when the extent_order
is > 0.
This may result in mapping a memory region (MMIO, RAM) which doesn't belong
to the domain.
However, only DOM0 on ARM uses a direct memory mapping, so this
doesn't affect any guest (at least in the upstream version), nor x86.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Reviewed-by: Jan Beulich <jbeulich@suse.com> Release-acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
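The check described above, validating every frame of an extent rather than only the first, can be sketched on a toy ownership table. This is an illustration under stated assumptions, not the populate_physmap code: extent_owned_by(), the owner_of array, and the integer domain ids are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model: owner_of[mfn] holds the id of the domain owning
 * that frame. An extent of order N covers 1 << N consecutive frames,
 * and all of them must belong to domid for the extent to be accepted.
 */
static bool extent_owned_by(const int *owner_of, size_t nr_frames,
                            size_t first_mfn, unsigned int extent_order,
                            int domid)
{
    size_t count, i;

    assert(extent_order < sizeof(size_t) * 8);
    count = (size_t)1 << extent_order;

    /* Reject extents that run past the end of the table. */
    if (first_mfn >= nr_frames || count > nr_frames - first_mfn)
        return false;

    /* Checking only owner_of[first_mfn] would be the bug described. */
    for (i = 0; i < count; i++)
        if (owner_of[first_mfn + i] != domid)
            return false;

    return true;
}
```

An order-0 extent reduces to the single first-frame check, which is why the problem only shows up when the extent_order is greater than 0.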
Ian Campbell [Wed, 12 Aug 2015 13:56:01 +0000 (14:56 +0100)]
tools: libxl: Remove unnecessary trailing \n from log messages.
Both xl's LOG and the various libxl logging mechanisms automatically
include a trailing \n.
Remove all unnecessary \n's from the log messages with the following
semantic patch.
spatch also reindents (I couldn't see how to make it stop). In general
it has improved matters but in 1 case it has introduced a long line,
this will be fixed in the next patch.
Semantic patch, run as
spatch --in-place --no-includes --include-headers \
--sp-file libxl-log-nl.spatch \
tools/libxl/libxl*.[ch] tools/libxl/xl*.[ch]
=========
// Heavily inspired by https://lkml.org/lkml/2014/9/12/134
Ian Campbell [Wed, 12 Aug 2015 09:07:37 +0000 (10:07 +0100)]
gitignore: Don't ignore *.rej
These indicate that a patch application went wrong; I want to see them in
"git status". This appears to have been imported from .hgignore where
it has been since 2005.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Jan Beulich <jbeulich@suse.com> Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Andrew Cooper [Fri, 7 Aug 2015 13:51:59 +0000 (14:51 +0100)]
tools/xenstore: Correct use of va_end() after va_copy()
C requires that every use of va_copy() is matched with a va_end() call.
This is especially important for x86_64 as va_{start,copy}() may need to
allocate memory to generate a va_list containing parameters which were
previously in registers.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
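The pairing rule can be illustrated with a small formatting helper. This is a sketch and not the xenstore code being fixed: format_alloc() is a hypothetical name, chosen only to show a va_list being walked twice with each va_copy() and va_start() matched by its own va_end().

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: format into a freshly allocated buffer.
 * The caller owns the result and must free() it.
 */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap_copy;
    char *buf;
    int len;

    assert(fmt != NULL);
    va_start(ap, fmt);

    /* First pass: measure the output using a copy of the list. */
    va_copy(ap_copy, ap);
    len = vsnprintf(NULL, 0, fmt, ap_copy);
    va_end(ap_copy);            /* matches va_copy() */

    if (len < 0) {
        va_end(ap);             /* matches va_start() on the error path */
        return NULL;
    }

    /* Second pass: produce the output from the original list. */
    buf = malloc((size_t)len + 1);
    if (buf != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap);

    va_end(ap);                 /* matches va_start() */
    return buf;
}
```

Every return path releases whatever was started or copied before it, including the early error return.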
Andrew Cooper [Fri, 7 Aug 2015 14:06:24 +0000 (15:06 +0100)]
tools/libxl: Alter the use of rand() in testidl
Coverity warns for every occurrence of rand(), which is made worse
because each time the IDL changes, some of the calls get re-flagged.
Collect all calls to rand() in a single function, test_rand(), which
takes a modulo parameter for convenience. This currently turns 40 defects
into 1, which won't get re-flagged when the IDL changes.
In addition, fix the erroneous random choice for libxl_defbool_set().
"!!rand() % 1" is unconditionally 0, and even without the "% 1" would
still be very heavily skewed in one direction.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
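A short sketch of both points follows. The exact signature of test_rand() is an assumption (the message only says it takes a modulo parameter), and broken_choice() and fixed_choice() are hypothetical names used to contrast the two expressions.

```c
#include <assert.h>
#include <stdlib.h>

/* Single wrapper, so only one rand() call site remains to be flagged. */
static int test_rand(unsigned int modulo)
{
    assert(modulo > 0);
    return (int)((unsigned int)rand() % modulo);
}

/*
 * The expression from the message. '!' binds tighter than '%', so this
 * is (!!rand()) % 1, and any integer modulo 1 is 0.
 */
static int broken_choice(void)
{
    return !!rand() % 1;
}

/* An evenly split boolean choice built on the wrapper. */
static int fixed_choice(void)
{
    return test_rand(2);
}
```

Dropping the "% 1" alone would leave !!rand(), which is 0 only when rand() itself returns 0, hence the heavy skew the message mentions.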
Andrew Cooper [Fri, 7 Aug 2015 14:06:23 +0000 (15:06 +0100)]
tools/libxl: Assert success of memory allocation in testidl
The chances of an allocation failing are slim but nonzero. Assert
success of each allocation to quieten Coverity, which re-notices defects
each time the IDL changes.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Julien Grall [Fri, 7 Aug 2015 18:53:55 +0000 (19:53 +0100)]
tools/libxc: linux: Don't use getpagesize() when unmapping the grants
The grants are based on the Xen granularity (i.e. 4KB). While the function
to map grants for Linux (linux_gnttab_grant_map) is using the correct
size (XC_PAGE_SIZE), the unmap one (linux_gnttab_munmap) is using
getpagesize().
On a domain using a page granularity different from Xen's (as is the case
for an AARCH64 guest using 64KB pages), the unmap will be called with the
wrong size.
Signed-off-by: Julien Grall <julien.grall@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
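The size mismatch is plain arithmetic, sketched below. The SKETCH_ macros and both function names are hypothetical stand-ins; the real code uses XC_PAGE_SIZE and getpagesize(), and the host page size is passed in as a parameter here so the 64KB case can be shown on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the Xen granularity (4KB), i.e. what XC_PAGE_SIZE is. */
#define SKETCH_XEN_PAGE_SHIFT 12
#define SKETCH_XEN_PAGE_SIZE  ((size_t)1 << SKETCH_XEN_PAGE_SHIFT)

/* Length covered when mapping 'count' grants: always Xen-sized pages. */
static size_t grant_map_len(uint32_t count)
{
    assert(count > 0);
    return (size_t)count * SKETCH_XEN_PAGE_SIZE;
}

/* Length the old unmap path would compute from the host page size. */
static size_t host_unmap_len(uint32_t count, size_t host_page_size)
{
    assert(count > 0);
    return (size_t)count * host_page_size;
}
```

For 4 grants the mapping spans 16384 bytes, but with a 64KB host page size the old computation yields 262144 bytes, sixteen times too large; the two agree only when the host page size is also 4KB.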