Keir Fraser [Fri, 7 May 2010 08:50:17 +0000 (09:50 +0100)]
xenpm: remove wrong and pointless "current" indicator
Using the CPU number to compare with an index into an array containing
only a subset of CPUs isn't valid. Also, the indicator isn't necessary here
at all since the CPU number being dealt with gets printed right before
this line.
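A minimal sketch of why that comparison is invalid, assuming a list that holds only some CPU numbers. The function names and sample values are hypothetical and are not taken from xenpm; the commit itself removes the indicator rather than fixing the comparison.

```python
def mark_by_index(affected_cpus, cpuid):
    """Invalid pattern: compares the array index with the CPU number."""
    return [f"*{cpu}" if i == cpuid else str(cpu)
            for i, cpu in enumerate(affected_cpus)]


def mark_by_value(affected_cpus, cpuid):
    """What a correct indicator would compare: the stored CPU number."""
    return [f"*{cpu}" if cpu == cpuid else str(cpu)
            for cpu in affected_cpus]


# The array holds only a subset of CPUs (here CPUs 1 and 3).
subset = [1, 3]
by_index = mark_by_index(subset, 1)   # marks CPU 3, because it sits at index 1
by_value = mark_by_value(subset, 1)   # marks CPU 1
```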
Keir Fraser [Fri, 7 May 2010 08:46:50 +0000 (09:46 +0100)]
x86/cpufreq: fix turbo mode detection
{acpi,powernow}_cpufreq_cpu_init() generally don't run on the CPU that
the policy they handle belongs to, hence using cpuid() directly
works only as long as all CPUs in the system are identical (which
admittedly is commonly the case).
Further add a per-policy flag indicating the availability of
APERF/MPERF MSRs, so that globally setting the .getavg accessor won't
be a problem on heterogeneous configurations.
Keir Fraser [Thu, 6 May 2010 16:00:08 +0000 (17:00 +0100)]
Reduce 'd' debug key's global impact
On large systems, dumping state may cause time management to get
stalled for so long a period that it wouldn't recover. Therefore alter
the state dumping logic to alternately block each CPU as it prints
rather than one CPU for a very long time (using the alternative key
handling toggle introduced with an earlier patch).
Further, instead of using on_selected_cpus(), which is unsafe when
the dumping happens from a hardware interrupt, introduce and use a
dedicated IPI sending function (which each architecture can implement
to its liking).
Finally, don't print useless data (e.g. the hypervisor context of the
interrupt that is used for triggering the printing, but isn't part of
the context that's actually interesting).
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Thu, 6 May 2010 14:54:52 +0000 (15:54 +0100)]
Remove CPUID4 emulation for AMD CPUs
The CPUID4 emulation code for AMD CPUs in intel_cacheinfo.c is never
executed. It came from the upstream Linux kernel, where CPUID4 is used
to report cache information in sysfs. In Xen, this code path is not
reached on AMD CPUs; init_amd() uses display_cacheinfo() to find out
the CPU cache size instead.
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Acked-by: Mark Langsdorf <mark.langsdorf@amd.com>
Keir Fraser [Thu, 6 May 2010 10:59:55 +0000 (11:59 +0100)]
Reduce '0' debug key's global impact
On large systems, dumping state may cause time management to get
stalled for so long a period that it wouldn't recover. Therefore add
a tasklet-based alternative mechanism to handle Dom0 state dumps.
Keir Fraser [Thu, 6 May 2010 10:43:54 +0000 (11:43 +0100)]
svm: support EFER.LMSLE for guests
Now that the feature is officially documented (see
http://support.amd.com/us/Processor_TechDocs/24593.pdf), I think it
makes sense to also allow HVM guests to make use of it.
Keir Fraser [Tue, 4 May 2010 11:52:48 +0000 (12:52 +0100)]
CPUIDLE: shorten hpet spin_lock holding time
Try to reduce spin_lock overhead for deep C state entry/exit. This
will benefit systems with many CPUs which need the hpet broadcast
to wake up from deep C state.
Keir Fraser [Tue, 4 May 2010 11:51:33 +0000 (12:51 +0100)]
x86: Relocate boot trampoline to avoid BIOS conflicts.
Fix booting through iSCSI protocol with Broadcom network cards.
These boards use the option ROM feature to implement the TCP/IP stack
protocol, and the iSCSI software initiator. The memory address
normally used by the PMM is 0x87000 which conflicts with the memory
allocation for Xen's trampoline routine, currently 0x88000.
Keir Fraser [Tue, 4 May 2010 11:48:28 +0000 (12:48 +0100)]
CPUIDLE: re-implement mwait wakeup process
The new implementation MWAITs on a completely new flag field, avoiding the IPI-avoidance
semantics of softirq_pending. It also does wakeup-waiting checks on
timer_deadline_start, that being the field that initiates wakeup via
the MONITORed memory region.
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Signed-off-by: Wei Gang <gang.wei@intel.com>
Keir Fraser [Tue, 4 May 2010 11:42:56 +0000 (12:42 +0100)]
linux pvdrv: generalize location of autoconf.h
The location of the file in the build tree changed in recent Linux;
since there can be only one such file, using a wildcard instead of
an explicit directory name seems the easiest solution.
Keir Fraser [Tue, 4 May 2010 11:42:21 +0000 (12:42 +0100)]
x86: fix Dom0 booting time regression
Unfortunately the changes in c/s 21035 caused boot time to go up
significantly on certain large systems. To rectify this without going
back to the old behavior, introduce a new memory allocation flag so
that Dom0 allocations can exhaust non-DMA memory before starting to
consume DMA memory. For the latter, the behavior introduced in
aforementioned c/s gets retained, while for the former we can now even
try larger chunks first.
This builds on the fact that alloc_chunk() gets called with non-increasing
'max_pages' arguments, and hence it can store locally the
allocation order last used (as larger order allocations can't succeed
during subsequent invocations if they failed once).
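The order-caching idea can be sketched as a toy model. This is an illustration under stated assumptions, not the Xen implementation: `ChunkAllocator`, `try_alloc` and the `attempts` counter are hypothetical names, and the DMA/non-DMA memory flag described above is not modelled.

```python
class ChunkAllocator:
    """Toy model: remember the allocation order that last succeeded."""

    def __init__(self, try_alloc, max_order):
        self.try_alloc = try_alloc    # hypothetical callback: order -> bool
        self.last_order = max_order   # no failure observed yet
        self.attempts = 0             # counts calls into the allocator

    def alloc_chunk(self, max_pages):
        # Largest order whose size (2**order pages) fits in max_pages.
        order = max_pages.bit_length() - 1
        # Orders above last_order already failed once, so skip them.
        order = min(order, self.last_order)
        while order >= 0:
            self.attempts += 1
            if self.try_alloc(order):
                self.last_order = order
                return order
            order -= 1
        return None


# Pretend memory is fragmented so that nothing above order 3 succeeds.
allocator = ChunkAllocator(lambda order: order <= 3, max_order=10)
first = allocator.alloc_chunk(1024)   # walks down from order 10 to order 3
second = allocator.alloc_chunk(512)   # starts at the remembered order 3
```

Because 'max_pages' never grows between calls, skipping the orders that already failed cannot miss an allocation that would otherwise have succeeded.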
Keir Fraser [Tue, 4 May 2010 11:41:11 +0000 (12:41 +0100)]
x86: add support for domain-initiated global cache flush
The AGP code in newer Linux kernels wants to flush caches on all CPUs
under certain circumstances. Since doing this on all vCPUs of the domain in
question doesn't yield the intended effect, this needs to be done in
the hypervisor. Add a new MMUEXT operation for this.
Keir Fraser [Tue, 4 May 2010 11:38:19 +0000 (12:38 +0100)]
blktap: Fix old QCow tapdisk image handling
When I tried to use a QCow image, I found that only every second boot
was successful. As I discovered, this is caused by incorrect handling of
old qcow tapdisk images. The extended header flag is not stored correctly,
so blktap tries to change the endianness of the L1 table on each startup.
From: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Keir Fraser <keir.fraser@citrix.com>
Keir Fraser [Tue, 4 May 2010 11:16:37 +0000 (12:16 +0100)]
Make sure git clone gets the right kernel branch
When cloning kernel repo:
1. make remote called "xen" rather than the default "origin"
2. directly check out the desired branch, rather than the default
   and then the desired one
Git 1.5 doesn't support -b on git clone, and seems to do something odd
with the checkout branch argument, so avoid using the newer
commandline options.
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Keir Fraser [Tue, 4 May 2010 08:30:53 +0000 (09:30 +0100)]
Remus: python netlink fixes
Fix deprecation warning in Qdisc class under python 2.6.
Fix rtattr length and padding (rta_len is unaligned).
Null-terminate qdisc name in rtnl messages.
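The rtattr layout rule behind the last two fixes can be sketched as follows. This is a minimal illustration, not the Remus code: the helper names are hypothetical, and "pfifo" is only an example qdisc name. It relies on the Linux definitions of struct rtattr (two unsigned shorts), RTA_ALIGNTO (4) and TCA_KIND (1).

```python
import struct

RTA_ALIGNTO = 4                 # attributes start on 4-byte boundaries
RTA_HDR = struct.Struct("HH")   # rta_len, rta_type in native byte order
TCA_KIND = 1                    # attribute type carrying the qdisc name


def rta_align(length):
    """Round length up to the next multiple of RTA_ALIGNTO."""
    return (length + RTA_ALIGNTO - 1) & ~(RTA_ALIGNTO - 1)


def pack_rtattr(rta_type, payload):
    """Pack one attribute: rta_len stays unaligned, padding follows it."""
    rta_len = RTA_HDR.size + len(payload)          # header + payload only
    padding = b"\0" * (rta_align(rta_len) - rta_len)
    return RTA_HDR.pack(rta_len, rta_type) + payload + padding


def pack_qdisc_kind(name):
    """Pack the qdisc name as a null-terminated string attribute."""
    return pack_rtattr(TCA_KIND, name.encode("ascii") + b"\0")


attr = pack_qdisc_kind("pfifo")   # example name, 6 payload bytes with NUL
```

Here the header records a length of 10 (4 header bytes plus 6 payload bytes), while the attribute occupies 12 bytes in the message once padded.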
x86, shadow: propagate pat caching on the shadow l1
PAT caching was only propagated if has_arch_pdevs(),
causing hvm_get_mem_pinned_cacheattr() to be ignored
in the non-passthrough case.
l1_disallow_mask() needs to be relaxed.
Signed-off-by: Jean Guyader <jean.guyader@citrix.com>
The motivation comes from distributors that configure their
crashkernel command line automatically with some configuration tool
(YaST, you know ;)). Of course that tool knows the value of System
RAM, but if the user removes RAM, then the system becomes unbootable
or at least unusable and error handling is very difficult."
For x86, unlike Linux, we pass the actual amount of RAM rather than
the highest page's address (to cope with sparse physical address
maps).
console: Make initial static console buffer __initdata.
The previous scheme, freeing an area of BSS, did not interact
nicely with device passthrough, as the IOMMU will not have any Xen BSS area
in guest device pagetables. Hence if the freed BSS space gets
allocated to a guest, DMAs to the guest's own memory can fail.
The simple solution here is to always free the static buffer at the end of
boot (initmem is specially handled for IOMMUs) and always require a
dynamically allocated buffer to be created.