George Dunlap [Tue, 10 Apr 2012 09:41:30 +0000 (10:41 +0100)]
xen: Fix schedule()'s grabbing of the schedule lock
Because the location of the lock can change between the time you read
it and the time you grab it, the per-cpu schedule locks need to check
after lock acquisition that the lock location hasn't changed, and
release and re-try if so. This change was effected throughout the
source code, but one very important place was apparently missed: in
schedule() itself.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Committed-by: Keir Fraser <keir@xen.org>
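The re-check described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the Xen code: `demo_lock_t`, `struct sched_slot` and `sched_slot_lock()` are hypothetical names, and a C11 `atomic_flag` stands in for the real spinlock.

```c
#include <stdatomic.h>

/* Hypothetical test-and-set spinlock standing in for the real lock type. */
typedef struct { atomic_flag held; } demo_lock_t;

static void demo_lock(demo_lock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ; /* spin until the holder releases the lock */
}

static void demo_unlock(demo_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Per-CPU slot whose lock pointer may be redirected by another CPU. */
struct sched_slot {
    demo_lock_t *_Atomic lock;
};

/* Acquire whichever lock currently protects the slot and return it. */
static demo_lock_t *sched_slot_lock(struct sched_slot *s)
{
    for (;;) {
        demo_lock_t *l = atomic_load(&s->lock);

        demo_lock(l);
        if (l == atomic_load(&s->lock))
            return l;        /* pointer unchanged: we hold the right lock */
        demo_unlock(l);      /* lock was moved in between: drop and retry */
    }
}
```

The caller releases the returned lock rather than re-reading the pointer, so the unlock always matches the lock that was actually taken.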
x86/mm: Take care of domain reference for shared pages
Making a page sharable removes it from the previous owner's list. Making it
private adds it. These actions are similar to freeing or allocating a page.
However, they did not account for the domain reference that is taken when the
first page is allocated and dropped when the last page is freed.
Without fixing this, a domain might remain zombie when destroyed if all its
pages are shared.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
Olaf Hering [Tue, 3 Apr 2012 15:22:59 +0000 (17:22 +0200)]
domctl.h: document non-standard error codes for enabling paging/access
The domctl to enable paging and access returns some non-standard error
codes after failure. This can be used in the tools to print specific
error messages. xenpaging recognizes these errno values and shows them
if the init function fails.
Document the return codes in the public header file.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
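A tool could turn such return codes into specific messages along the following lines. This is only a sketch: `paging_enable_errmsg()` is a hypothetical helper, and both the message wording and the condition attached to each errno value are assumptions made for illustration, not taken from the header.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Return a specific message for the non-standard errno values, or NULL so
 * that the caller can fall back to strerror(). The wording of each message
 * and the condition attached to each value are assumptions.
 */
static const char *paging_enable_errmsg(int err)
{
    switch (err) {
    case ENODEV:
        return "paging requires hardware-assisted paging (HAP)";
    case EXDEV:
        return "paging is not supported for this guest configuration";
    default:
        return NULL;   /* not one of the special codes */
    }
}
```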
Olaf Hering [Fri, 30 Mar 2012 15:09:07 +0000 (17:09 +0200)]
xenpaging: add error code to indicate iommem passthrough
Similar to the existing ENODEV and EXDEV error codes, add EMLINK to
indicate that iommu passthrough is not compatible with paging.
All error codes are just made-up return codes to give proper error
messages in the pager.
Also update the HAP-related error message now that paging is enabled
on AMD hosts as well.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <Ian.Jackson@citrix.com> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
Roger Pau Monne [Wed, 22 Feb 2012 16:37:50 +0000 (17:37 +0100)]
autoconf: check for as86, ld86, bcc and iasl
Check for these tools, and set the proper paths in config/Tools.mk.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 4 Apr 2012 15:10:18 +0000 (16:10 +0100)]
libxl: fixup error handling in libxl_send_trigger
xc_domain_send_trigger returns -1 and sets errno on failure so use
LIBXL__LOG_ERRNO not LIBXL__LOG_ERRNOVAL(rc).
Change the default case of the switch to set rc=-1,errno=EINVAL too.
Also we weren't actually returning the error code we'd decided on.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
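The error-handling shape described in the commit above might look like the sketch below. All names (`demo_send_trigger()`, `demo_backend_trigger()`, `DEMO_ERROR_FAIL`) are hypothetical stand-ins; the backend only mimics the "returns -1 and sets errno" convention and is not the libxc call.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

#define DEMO_ERROR_FAIL (-3)   /* hypothetical library-level error code */

enum demo_trigger { DEMO_TRIGGER_POWER, DEMO_TRIGGER_RESET };

/* Hypothetical backend call: returns -1 and sets errno on failure. */
static int demo_backend_trigger(int domid, enum demo_trigger t)
{
    (void)t;
    if (domid < 0) {
        errno = ESRCH;
        return -1;
    }
    return 0;
}

static int demo_send_trigger(int domid, int trigger)
{
    int rc;

    switch (trigger) {
    case DEMO_TRIGGER_POWER:
    case DEMO_TRIGGER_RESET:
        rc = demo_backend_trigger(domid, (enum demo_trigger)trigger);
        break;
    default:
        /* Mirror the backend convention so one error path serves both. */
        rc = -1;
        errno = EINVAL;
        break;
    }

    if (rc != 0) {
        int saved = errno;   /* logging must not clobber errno */

        fprintf(stderr, "send trigger failed: %s\n", strerror(saved));
        errno = saved;
        return DEMO_ERROR_FAIL;   /* actually return the chosen error */
    }
    return 0;
}
```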
George Dunlap [Wed, 4 Apr 2012 15:06:42 +0000 (16:06 +0100)]
xl, libxl: Add per-device and global permissive config options for pci passthrough
By default pciback only allows PV guests to write "known safe" values into
PCI config space. But many devices require writes to other areas of config
space in order to operate properly. One way to do that is with the "quirks"
interface, which specifies areas known safe to a particular device; the
other way is to mark a device as "permissive", which tells pciback to allow
all config space writes for that domain and device.
This adds a "permissive" flag to the libxl_pci struct and teaches libxl how
to write the appropriate value into sysfs to enable the permissive feature for
devices being passed through. It also adds the permissive config options either
on a per-device basis, or as a global option in the xl command-line.
Because of the potential stability and security implications of enabling
permissive, the flag is left off by default.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
George Dunlap [Wed, 4 Apr 2012 15:06:42 +0000 (16:06 +0100)]
libxl: Move bdf parsing into libxlu
Config parsing functions do not properly belong in libxl. Move them into
libxlu so that others can use them or not as they see fit.
No functional changes. One side-effect was making public a private libxl
utility function which just set the elements of a structure from the function
arguments passed in.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
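For readers unfamiliar with BDF strings, a parser of this kind can be sketched as below. It is a simplified illustration that assumes the full "DDDD:BB:DD.F" hexadecimal form only; `struct demo_bdf`, `demo_bdf_set()` and `demo_bdf_parse()` are hypothetical names and do not reflect the libxlu interface.

```c
#include <stdio.h>

struct demo_bdf {
    unsigned int domain, bus, dev, func;
};

/* Set the structure elements from the arguments passed in. */
static void demo_bdf_set(struct demo_bdf *b, unsigned int domain,
                         unsigned int bus, unsigned int dev,
                         unsigned int func)
{
    b->domain = domain;
    b->bus = bus;
    b->dev = dev;
    b->func = func;
}

/* Parse "DDDD:BB:DD.F" (hexadecimal). Returns 0 on success, -1 on error. */
static int demo_bdf_parse(const char *s, struct demo_bdf *out)
{
    unsigned int domain, bus, dev, func;
    int consumed = 0;

    if (sscanf(s, "%x:%x:%x.%x%n", &domain, &bus, &dev, &func,
               &consumed) != 4)
        return -1;                      /* not all four fields present */
    if (s[consumed] != '\0')
        return -1;                      /* trailing characters */
    if (domain > 0xffff || bus > 0xff || dev > 0x1f || func > 0x7)
        return -1;                      /* field out of range */

    demo_bdf_set(out, domain, bus, dev, func);
    return 0;
}
```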
Matt Wilson [Wed, 4 Apr 2012 10:09:15 +0000 (11:09 +0100)]
PV-GRUB: add support for btrfs
This patch adds btrfs support to the GRUB tree used to build PV-GRUB.
The original patch is from Gentoo:
https://bugs.gentoo.org/show_bug.cgi?id=283637
Signed-off-by: Matt Wilson <msw@amazon.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Matt Wilson [Wed, 4 Apr 2012 10:09:14 +0000 (11:09 +0100)]
PV-GRUB: add support for ext4
This patch adds support for ext4 to the GRUB tree used to build PV-GRUB.
The original patch is taken from the Fedora GRUB package in this commit:
http://pkgs.fedoraproject.org/gitweb/?p=grub.git;a=commitdiff;h=32bf414af04d377055957167aac7dedec691ef57
Signed-off-by: Matt Wilson <msw@amazon.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Matt Wilson [Wed, 4 Apr 2012 10:09:14 +0000 (11:09 +0100)]
PV-GRUB: Check for errors when applying patches to GRUB
We want to ensure that patches apply cleanly without rejects. Bail if
patch returns a non-zero exit code.
Signed-off-by: Matt Wilson <msw@amazon.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Wed, 14 Mar 2012 16:53:56 +0000 (17:53 +0100)]
tools/libfsimage: include Rules.mk first
Move the inclusion of Rules.mk up so that things like CFLAGS get initialized
properly. Currently only zfs appends CFLAGS. If CFLAGS get reset by Rules.mk
the private settings are lost and compilation of zfs support fails.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Wed, 14 Mar 2012 16:02:23 +0000 (17:02 +0100)]
tools/blktap: reorder MEMSHR_DIR to fix CFLAGS
In blktap2 MEMSHR_DIR is used before it is set. This removes the
required -D_GNU_SOURCE from CFLAGS because it is consumed as the argument
to -I. Fix this by moving the memshr-related flags to the place where they
are actually used.
The failure is a missing O_DIRECT define.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
autoconf: change AX_ARG_{DISABLE/ENABLE}_AND_EXPORT to make more sense
Rename the disable/enable feature macros so that their names describe
what they actually do, to avoid confusion.
New macros have the following names:
AX_ARG_DEFAULT_ENABLE: feature is enabled by default, provides the
--disable-{feature} option to disable it.
AX_ARG_DEFAULT_DISABLE: feature is disabled by default, provides the
--enable-{feature} option to enable it.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Tue, 3 Apr 2012 14:25:01 +0000 (15:25 +0100)]
tools: specify datadir for qemu-xen build to fix firmware loading
qemu-xen currently does not find the firmware files, such as
vgabios-cirrus.bin. The reason is that qemu-xen uses the default prefix
/usr/local. Use SHAREDIR/qemu-xen as directory so that it can coexist
with qemu-traditional which is installed in SHAREDIR/xen/qemu.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Tue, 3 Apr 2012 14:12:21 +0000 (15:12 +0100)]
tools/vtpm: use LDLIBS to pass -lgmp
Linking tpmd will fail with recent toolchains because -lgmp is passed
via LDFLAGS instead of LDLIBS. With this change -lgmp is placed at the
end of the gcc cmdline and linking tpmd succeeds again.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
autoconf: fix python-dev detection on old python versions
Replaced the use of python-config (that is only present in Python >= 2.5.x)
with the distutils python module.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu> Cc: Zhang, Yang Z <yang.z.zhang@intel.com> Tested-by: KUWAMURA Shin'ya <kuwa@jp.fujitsu.com> Cc: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
The upstream commit "xen/acpi-processor: C and P-state driver that
uploads said data to hypervisor." takes care of uploading the power
information that a cpu frequency scaling driver would normally be using
in the initial domain. We want the hypervisor to take that data and
make good use of it.
Fortunately for us we do not have to worry about the native cpu frequency
scaling drivers being loaded first, as the upstream commit:
"xen/cpufreq: Disable the cpu frequency scaling drivers from loading."
takes care of that. Meaning we can load the xen-acpi-processor at any time.
By default that driver is built as a module - and since we are
the only user of it - we should load it.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Mon, 26 Mar 2012 13:22:18 +0000 (15:22 +0200)]
tools/libxc: send page-in requests in batches in linux_privcmd_map_foreign_bulk
One of the bottlenecks with foreign page-in requests is the poor retry
handling in linux_privcmd_map_foreign_bulk(). It sends one request per
paged gfn at a time and it waits until the gfn is accessible. This
causes long delays in mmap requests from qemu-dm and xc_save.
Instead of sending one request at a time, walk the entire gfn list and
send batches of mmap requests. They will eventually end up in the pager's
request ring (if it has room again), and will fill up this ring so that
in turn the pager can also process page-in in batches.
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Ian Jackson <ian.jackson.citrix.com>
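The batching idea from the commit above can be illustrated with the sketch below. It is not the libxc implementation: `demo_map_bulk()`, the callback type and the fake backend are hypothetical, and the assumption that a paged-out frame is reported as `-ENOENT` is made for this example only. A real caller would also wait between passes.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical batch interface: writes 0 or a negative errno into errs[i];
 * -ENOENT is assumed to mean "paged out, page-in has been requested". */
typedef void (*demo_map_batch_fn)(const unsigned long *gfns, int *errs,
                                  size_t n);

/* Demo backend: a frame maps only on its second request, as if the pager
 * needed one round trip. Only gfns below 16 are supported. */
static unsigned char demo_requested[16];
static unsigned int demo_calls;

static void demo_fake_map(const unsigned long *gfns, int *errs, size_t n)
{
    demo_calls++;
    for (size_t i = 0; i < n; i++) {
        if (gfns[i] >= 16) {
            errs[i] = -EINVAL;
        } else if (demo_requested[gfns[i]]) {
            errs[i] = 0;
        } else {
            demo_requested[gfns[i]] = 1;
            errs[i] = -ENOENT;
        }
    }
}

/* Map all frames, retrying the paged-out ones as one batch per pass.
 * Returns the number of frames still paged out after max_passes retries,
 * or -1 if scratch memory cannot be allocated. */
static int demo_map_bulk(const unsigned long *gfns, int *errs, size_t n,
                         demo_map_batch_fn map, unsigned int max_passes)
{
    size_t cap = n ? n : 1, pending = 0;
    unsigned long *batch = malloc(cap * sizeof(*batch));
    size_t *where = malloc(cap * sizeof(*where));
    int *batch_errs = malloc(cap * sizeof(*batch_errs));

    if (batch && where && batch_errs) {
        map(gfns, errs, n);                    /* first attempt: everything */
        for (unsigned int pass = 0; ; pass++) {
            pending = 0;
            for (size_t i = 0; i < n; i++) {   /* walk the entire list */
                if (errs[i] == -ENOENT) {
                    batch[pending] = gfns[i];
                    where[pending++] = i;
                }
            }
            if (pending == 0 || pass == max_passes)
                break;
            map(batch, batch_errs, pending);   /* one request per batch */
            for (size_t j = 0; j < pending; j++)
                errs[where[j]] = batch_errs[j];
        }
    }
    free(batch);
    free(where);
    free(batch_errs);
    return (batch && where && batch_errs) ? (int)pending : -1;
}
```

Note that the result expression only inspects the freed pointers for NULL-ness before they were freed in spirit; to keep the sketch strictly well-defined, callers concerned about this can capture the allocation check in a flag before calling `free()`.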
Lin Ming [Mon, 2 Apr 2012 16:32:39 +0000 (17:32 +0100)]
libxl: support for "rtc_timeoffset" and "localtime"
Implement the "rtc_timeoffset" and "localtime" options, compatible with xm.
rtc_timeoffset is the offset between host time and guest time.
localtime specifies whether the emulated RTC appears as UTC or is
offset by the host.
Acked-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Lin Ming <mlin@ss.pku.edu.cn> Committed-by: Ian Jackson <ian.jackson.citrix.com>
---
docs/man/xl.cfg.pod.5 | 8 ++++++++
tools/libxl/libxl_create.c | 11 +++++++++++
tools/libxl/libxl_dom.c | 3 +++
tools/libxl/libxl_types.idl | 2 ++
tools/libxl/xl_cmdimpl.c | 5 +++++
5 files changed, 29 insertions(+), 0 deletions(-)
George Dunlap [Mon, 2 Apr 2012 16:22:31 +0000 (17:22 +0100)]
libxl: Handle non-ballooned, zero slackmem properly for pci passthru
The e820_sanitize() function in libxl_pci.c expects one of its arguments to
be non-zero; but since a recent changeset, that argument can typically be
expected *to be* zero. Since the zero case is handled properly, just remove
the check.
Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson.citrix.com>
Ian Campbell [Tue, 27 Mar 2012 12:52:51 +0000 (13:52 +0100)]
xl: do not include xenctrl.h
Toolstacks which use libxl should not need to use libxc.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 26 Mar 2012 16:11:13 +0000 (17:11 +0100)]
docs: spelling and typoes in misc/xen-command-line.markdown
Run a spell checker over the doc and fix the typos and spelling it uncovers
(including a few I just added myself).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson.citrix.com> Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Campbell [Mon, 26 Mar 2012 16:09:59 +0000 (17:09 +0100)]
docs: add some missing options to misc/xen-command-line.markdown
These were mostly ones from xen/arch/x86/boot/cmdline.S which are handled early
and therefore do not use the usual infrastructure and so got missed in the
initial trawl.
The document now contains (AFAICT) every still valid option which was
previously documented at:
http://wiki.xen.org/wiki?title=Xen_Hypervisor_Boot_Options&oldid=1379
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson.citrix.com>
David Vrabel [Mon, 2 Apr 2012 15:50:44 +0000 (16:50 +0100)]
device tree: print a warning if a node is nested too deep
Since device_tree_for_each_node() is called before printk() works, a
variable is used to switch between using early_printk() and printk().
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
David Vrabel [Mon, 2 Apr 2012 15:50:43 +0000 (16:50 +0100)]
arm: add dom0_mem command line argument
Add a simple dom0_mem command line argument. It's not as flexible as
the x86 equivalent (the 'max' and 'min' prefixes are not supported).
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
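A "simple" size argument of this kind could be parsed as in the sketch below. The accepted suffixes (K, M, G), the byte default and the name `demo_parse_size()` are assumptions of this example and are not taken from the patch; the point is only that a value with a `max:` or `min:` prefix is rejected rather than interpreted.

```c
#include <stdlib.h>

/* Parse "<number>[K|M|G]" into bytes. Returns 0 on success, -1 if the
 * string is malformed or the shifted value would overflow. */
static int demo_parse_size(const char *s, unsigned long long *bytes)
{
    char *end;
    unsigned long long v;
    unsigned int shift = 0;

    if (*s < '0' || *s > '9')
        return -1;               /* rejects prefixes such as "max:" */

    v = strtoull(s, &end, 0);
    switch (*end) {
    case 'K': case 'k': shift = 10; end++; break;
    case 'M': case 'm': shift = 20; end++; break;
    case 'G': case 'g': shift = 30; end++; break;
    default: break;
    }
    if (*end != '\0')
        return -1;               /* unexpected trailing characters */
    if (shift && v > (~0ULL >> shift))
        return -1;               /* shifting would overflow */

    *bytes = v << shift;
    return 0;
}
```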
David Vrabel [Mon, 2 Apr 2012 15:50:42 +0000 (16:50 +0100)]
arm: use bootargs for the command line
Use the /chosen node's bootargs parameter for the Xen command line.
Parse it early on, before the serial console is set up.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Tim Deegan [Mon, 2 Apr 2012 09:54:05 +0000 (10:54 +0100)]
arm: Use HTPIDR to point to per-CPU state
Rather than having the per-VCPU stack contain a pointer to the
per-PCPU state, use the CPU's hypervisor thread ID register for that.
Signed-off-by: Tim Deegan <tim@xen.org> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
[ s/cpuid/id in set_processor_id -- ijc ] Committed-by: Ian Campbell <ian.campbell@citrix.com>
Wei Huang [Fri, 30 Mar 2012 20:05:54 +0000 (21:05 +0100)]
AMD_LWP: add interrupt support for AMD LWP
This patch adds interrupt support for AMD lightweight profiling. It
registers an interrupt handler using alloc_direct_apic_vector(). When
notified, SVM reinjects virtual interrupts into the guest VM using the
guest's virtual local APIC.
x86/mm: Make iommu passthrough and mem paging/sharing mutually exclusive
Regardless of table sharing or processor vendor, these features cannot coexist
since iommu's don't expect gfn->mfn mappings to change, and sharing and paging
depend on trapping all accesses.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
The p2m-pt.c code, used by both shadow and AMD NPT modes, was not aware of
paging types, and the implications those types have on p2m entries. Add support
to the page table-based p2m to understand the paging types. This is a necessary
step towards enabling memory paging on AMD NPT mode, but not yet the full
solution.
Tested to break neither shadow mode nor "normal" (i.e. no paging) AMD NPT
mode.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org> Acked-by: Tim Deegan <tim@xen.org> Committed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Tue, 27 Mar 2012 13:23:43 +0000 (15:23 +0200)]
x86/hpet: clear unwanted bits
Leaving certain bits set when being started from an environment where
the HPET was already in use can affect functionality. Clear those bits
to be on the safe side.
We should also consider ignoring the HPET altogether if any reserved
bits are found to be set.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Tue, 27 Mar 2012 13:22:54 +0000 (15:22 +0200)]
x86/hpet: replace disabling of legacy broadcast
... by the call to hpet_disable() added in the immediately preceding
patch.
In order to retain the behavior intended by c/s 23776:0ddb4481f883,
implement one of the alternative options pointed out there: remove CPUs
from the online map in __stop_this_cpu() (and hence doing so in
stop_this_cpu() is no longer needed).
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Tue, 27 Mar 2012 13:20:23 +0000 (15:20 +0200)]
x86/hpet: disable before reboot or kexec
Linux up to now is not smart enough to properly clear the HPET when it
boots, which is particularly a problem when a kdump attempt from
running under Xen is being made. Linux itself added code to work around
this to its shutdown paths quite some time ago, so let's do something
similar in Xen: Save the configuration register settings during boot,
and restore them during shutdown. This should cover the majority of
cases where the secondary kernel might not come up because timer
interrupts don't work.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
Ian Campbell [Tue, 27 Mar 2012 10:13:58 +0000 (11:13 +0100)]
hcall: markup the grant table hypercalls to improve generated docs
As part of this I looked through the relevant chapter from interfaces.tex (from
4.1, deleted in unstable) to ensure no critical information was missing.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Keir Fraser <keir@xen.org> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Liu, Jinsong [Fri, 23 Mar 2012 15:08:17 +0000 (15:08 +0000)]
Xen core parking 2: core parking implementation
This patch implements Xen core parking.
Different core parking sequences yield different power/performance
results, depending on the cpu socket/core/thread topology.
This patch provides power-first and performance-first policies; users
can choose the core parking policy that suits their needs.
Liu, Jinsong [Fri, 23 Mar 2012 15:07:53 +0000 (15:07 +0000)]
Xen core parking 1: hypercall
This patch implements the hypercall through which dom0 sends core parking
requests and gets core parking results.
Due to the characteristics of continue_hypercall_on_cpu, dom0 sends the
core parking request and gets the result in separate calls.
Jan Beulich [Fri, 23 Mar 2012 07:39:39 +0000 (08:39 +0100)]
x86/gnttab: fix asm() operand in gnttab_clear_flag()
The operand needs to use the 'w' modifier in case the compiler happens
to pick a register (which apparently it does for no-one but the
reporter of this problem).
Reported-by: Lin Ming <mlin@ss.pku.edu.cn> Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org>
David Vrabel [Thu, 22 Mar 2012 14:26:46 +0000 (14:26 +0000)]
arm: remove the hack for loading vmlinux images
Don't adjust the RAM location/size when loading an ELF for dom0. It
was vmlinux specific and no longer needed because Linux can be loaded
from a zImage. Support for loading ELF images is not removed as it
may be useful for loading things other than the Linux kernel.
This also makes preparing the device tree for dom0 easier.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
David Vrabel [Thu, 22 Mar 2012 14:26:46 +0000 (14:26 +0000)]
device tree: add device_tree_dump() to print a flat device tree
Add a device_tree_dump() function which prints the main structure and
property names of a flat device tree (but not the property values
yet).
This will be useful for debugging problems with the device tree
generated for dom0.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
David Vrabel [Thu, 22 Mar 2012 14:26:44 +0000 (14:26 +0000)]
device tree: correctly ignore unit-address when matching nodes by name
When matching nodes by their name, correctly ignore the unit address
(@...) part of the name. Previously, a "memory-controller" node would
be incorrectly matched as a "memory" node.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
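The matching rule described in the commit above can be sketched as follows. `demo_node_matches()` is a hypothetical helper written for illustration, not the function changed by the patch.

```c
#include <stdbool.h>
#include <string.h>

/* True if node_name, with any "@unit-address" suffix ignored, equals match. */
static bool demo_node_matches(const char *node_name, const char *match)
{
    size_t match_len = strlen(match);

    if (strncmp(node_name, match, match_len) != 0)
        return false;

    /* The character after the prefix must end the name or start the
     * unit address; anything else means a longer, different name. */
    return node_name[match_len] == '\0' || node_name[match_len] == '@';
}
```

A plain prefix comparison would accept "memory-controller" for "memory"; checking the character that follows the prefix is what rules it out.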
David Vrabel [Thu, 22 Mar 2012 14:26:42 +0000 (14:26 +0000)]
libfdt: move headers to xen/include/xen/libfdt/
Move the public libfdt headers to xen/include/xen/libfdt/ so CFLAGS
does not need to be set to find them. This requires minor tweaks to one
of the headers imported from upstream.
Signed-off-by: David Vrabel <david.vrabel@citrix.com> Committed-by: Ian Campbell <ian.campbell@citrix.com>
Jan Beulich [Fri, 16 Mar 2012 10:35:06 +0000 (11:35 +0100)]
unmodified drivers: use upstream sync_bitops if available
The forward ported xenlinux sources in openSuSE 12.2 were switched from
the old synch_bitops to the sync_bitops since kernel version 3.3. Add
compat macros to use either old or new helpers depending on used kernel
source version.
Signed-off-by: Jan Beulich <jbeulich@suse.com> Signed-off-by: Olaf Hering <olaf@aepfle.de>
Olaf Hering [Fri, 16 Mar 2012 10:34:41 +0000 (11:34 +0100)]
unmodified drivers: add pfn_is_ram helper for kdump
Register a pfn_is_ram helper to speed up reading /proc/vmcore in the kdump
kernel. It is compiled only if the kernel source is recent enough to
have the pfn_is_ram helper (v3.0-rc1, commit 997c136f518c5debd63847e78e2a8694f56dcf90).
Signed-off-by: Olaf Hering <olaf@aepfle.de> Committed-by: Jan Beulich <jbeulich@suse.com>
Olaf Hering [Fri, 16 Mar 2012 10:34:14 +0000 (11:34 +0100)]
unmodified drivers: hide xen_cpuid_base() in version 2.6.38+
Allow compilation of PVonHVM drivers with forward-ported xenlinux
sources in openSuSE 12.1. xen_cpuid_base() is now in mainline, so the copy
in the xen tree leads to a compile error:
/usr/src/packages/BUILD/xen-4.2.24547/non-dbg/obj/default/platform-pci/platform-pci.c:121: error: redefinition of 'xen_cpuid_base'
/usr/src/linux-3.0.13-0.11/arch/x86/include/asm/xen/hypervisor.h:43: error: previous definition of 'xen_cpuid_base' was here
The reason is that the kernel sources are searched before the xen
sources for asm/hypervisor.h:
Andrew Cooper [Fri, 16 Mar 2012 10:30:12 +0000 (10:30 +0000)]
KEXEC: Allocate crash structures in low memory
On 64bit Xen with 32bit dom0 and crashkernel, xmalloc'ing items such
as the CPU crash notes will go into the xenheap, which tends to be in
upper memory. This causes problems on machines with more than 64GB
(or 4GB if no PAE support) of ram as the crashkernel physically can't
access the crash notes.
The solution is to force Xen to allocate certain structures in lower
memory. This is achieved by introducing two new command line
parameters; low_crashinfo and crashinfo_maxaddr. Because of the
potential impact on 32bit PV guests, and that this problem does not
exist for 64bit dom0 on 64bit Xen, this new functionality defaults to
the codebase's previous behavior, requiring the user to explicitly
add extra command line parameters to change the behavior.
This patch consists of 3 logically distinct but closely related
changes.
1) Add the two new command line parameters.
2) Change crash note allocation to use lower memory when instructed.
3) Change the conring buffer to use lower memory when instructed.
The result is that the crash notes and console ring will be placed
in lower memory so useful information can be recovered in the case of
a crash.
Changes since v1:
- Patch xen-command-line.markdown to document new options
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Committed-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Fri, 16 Mar 2012 10:29:20 +0000 (10:29 +0000)]
KEXEC: Allocate crash notes on boot
Currently, the buffers for crash notes are allocated per CPU when a
KEXEC_CMD_kexec_get_range hypercall is made, referencing the CPU in
question. This has certain problems including not being able to
allocate the crash buffers if the host is out of memory when crashing.
In addition, my forthcoming code to support 32bit kdump kernels on
64bit Xen on large (>64GB) boxes will require some guarantees as to
where the crash note buffers are actually allocated in physical
memory. This is far easier to sort out at boot time, rather than
after dom0 has been booted and potentially using the physical memory
required.
Therefore, allocate the crash note buffers at boot time.
Changes since v6:
* Tweak kexec_init() to use xzmalloc_array(), and to defer
registering the crashdump keyhandler until the crash notes have
been successfully allocated.
Changes since v5:
* Introduce sizeof_cpu_notes to move calculation of note size into a
separate location.
* Tweak sizeof_note() to return size_t rather than int, as it is a
function based upon sizeof().
Changes since v4:
* Replace the current cpu crash note scheme of using void pointers
and hand calculating the size each time it is needed, by a range
structure containing a pointer and a size. This removes duplicated
size calculations.
* Tweak kexec_get_cpu(). Don't fail if a cpu is offline because it
may already have crash notes, and may be up by the time a crash
happens. Split the error conditions up to return ERANGE for an
out-of-range cpu request rather than EINVAL. Finally, returning a
range of zeros is acceptable, so do this in preference to failing.
Changes since v3:
* Alter the spinlocks to avoid calling xmalloc/xfree while holding
the lock.
* Tidy up the coding style used.
Changes since v2:
* Allocate crash_notes dynamically using nr_cpu_ids at boot time,
rather than statically using NR_CPUS.
* Fix the incorrect use of signed integers for cpu id.
* Fix collateral damage to do_crashdump_trigger() and
crashdump_trigger_handler caused by reordering sizeof_note() and
setup_note()
* Tweak the issue about returning -ENOMEM from kexec_init_cpu_note().
No functional change.
* Change kexec_get_cpu() to attempt to allocate crash note buffers
in case we have more free memory now than when the pcpu came up.
* Now that there are two codepaths possibly allocating crash notes,
protect the allocation itself with a spinlock.
Changes since v1:
* Use cpu hotplug notifiers to handle allocating of the notes
buffers rather than assuming the boot state of cpus will be the
same as the crash state.
* Move crash_notes from being per_cpu. This is because the kdump
kernel elf binary put in the crash area is hard coded to physical
addresses which the dom0 kernel typically obtains at boot time.
If a cpu is offlined, its buffer should not be deallocated because
the kdump kernel would read junk when trying to get the crash
notes. Similarly, the same problem would occur if the cpu was
re-onlined later and its crash notes buffer was allocated
elsewhere.
* Only attempt to allocate buffers if a crash area has been
specified. Else, allocating crash note buffers is a waste of
space. Along with this, change the test in kexec_get_cpu to
return -EINVAL if no buffers have been allocated.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Committed-by: Keir Fraser <keir@xen.org>
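The lookup behaviour outlined in the v4 and v1 notes above (a pointer-plus-size range per CPU, ERANGE for an out-of-range CPU, EINVAL when nothing was allocated, and an all-zero range being acceptable) can be sketched as below. `struct demo_note_range` and `demo_get_cpu_notes()` are hypothetical names for illustration and do not reproduce kexec_get_cpu().

```c
#include <errno.h>
#include <stddef.h>

/* A range: where a CPU's crash notes live and how large they are. */
struct demo_note_range {
    void *start;
    size_t size;
};

/*
 * Look up the crash note range for one CPU.
 * notes == NULL models "no crash area was specified, so no buffers exist".
 */
static int demo_get_cpu_notes(const struct demo_note_range *notes,
                              unsigned int nr_cpus, unsigned int cpu,
                              struct demo_note_range *out)
{
    if (notes == NULL)
        return -EINVAL;      /* nothing was ever allocated */
    if (cpu >= nr_cpus)
        return -ERANGE;      /* out-of-range cpu request */

    /* May be {NULL, 0} for a CPU without notes; that is not an error. */
    *out = notes[cpu];
    return 0;
}
```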