Keir Fraser [Wed, 24 Oct 2007 09:20:03 +0000 (10:20 +0100)]
x86, cpufreq: Allow dom0 kernel to govern cpufreq via the Intel
Enhanced SpeedStep MSR.
From: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Alex Williamson [Tue, 23 Oct 2007 16:21:31 +0000 (10:21 -0600)]
[IA64] Prevent softlock when destroying VTi domain
Prevent a soft lockup during VTi domain destruction by making
relinquish_memory() continuable. It was assumed that
mm_teardown() frees most of page_list, so that the list
passed to relinquish_memory() is short. However, that
assumption does not hold for VTi domains because qemu-dm
maps all the domain pages. Making relinquish_memory()
continuable avoids the soft lockup message.
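A minimal sketch of what "continuable" means here, assuming a simple singly linked list and a fixed per-call work budget. The types, the name relinquish_some() and the BATCH constant are hypothetical and are not Xen's actual code, which would check for pending preemption instead of counting iterations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a domain's page list; not Xen's real types. */
struct page {
    struct page *next;
};

struct page_list {
    struct page *head;
    unsigned long freed;    /* pages released so far */
};

/* Assumed work budget per call (illustrative value). */
#define BATCH 1024

/*
 * Release up to BATCH pages, then return -EAGAIN so that the caller can
 * re-issue the operation later. Returns 0 once the list is empty.
 * All progress is kept in *list, so a repeated call simply resumes.
 */
static int relinquish_some(struct page_list *list)
{
    unsigned int done = 0;

    assert(list != NULL);

    while (list->head != NULL) {
        if (done == BATCH)
            return -EAGAIN;             /* budget used up, list not empty */
        list->head = list->head->next;  /* stands in for freeing the page */
        list->freed++;
        done++;
    }
    return 0;
}
```

The caller loops (or restarts the operation) while the function returns -EAGAIN, so no single invocation walks an arbitrarily long list.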
Keir Fraser [Tue, 23 Oct 2007 13:38:47 +0000 (14:38 +0100)]
hvm, vt-d: Add memory cache-attribute pinning domctl for HVM
guests. Use this to pin virtual framebuffer VRAM as attribute WB, even
if guest tries to map with other attributes.
Signed-off-by: Disheng Su <disheng.su@intel.com>
Keir Fraser [Tue, 23 Oct 2007 08:26:43 +0000 (09:26 +0100)]
xenmon: Fix security vulnerability CVE-2007-3919.
The xenbaked daemon and xenmon utility communicate via a mmap'ed
shared file. Since this file is located in /tmp, unprivileged users
can cause arbitrary files to be truncated by creating a symlink from
the well-known /tmp filename to e.g., /etc/passwd.
The fix is to place the shared file in a directory to which only root
should have access (in this case /var/run/).
This bug was reported, and the fix suggested, by Steve Kemp
<skx@debian.org>. Thanks!
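The fix described above is to move the file out of /tmp. As a complementary illustration only, and not what the commit itself does, the sketch below opens a shared file with O_NOFOLLOW and without O_TRUNC, so that a symlink planted at the expected path is refused instead of followed. The function names and the demo directory template are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open (creating if needed) a shared file without following a symlink in
 * the final path component. Returns a file descriptor, or -1 with errno
 * set. O_TRUNC is deliberately absent: the caller would size the file
 * with ftruncate() only after the open has succeeded.
 */
static int open_shared_file(const char *path)
{
    assert(path != NULL);
    return open(path, O_RDWR | O_CREAT | O_NOFOLLOW, 0600);
}

/*
 * Usage check: plant a symlink at the "shared" path and confirm that the
 * open is refused and the link target keeps its contents.
 * Returns 1 on the expected outcome, 0 otherwise, -1 on setup failure.
 */
static int symlink_is_refused(void)
{
    char dir[] = "/tmp/shared-demo-XXXXXX";  /* private scratch directory */
    char victim[64], link_path[64];
    struct stat st;
    int fd, refused;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(victim, sizeof(victim), "%s/victim", dir);
    snprintf(link_path, sizeof(link_path), "%s/shared", dir);

    fd = open(victim, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    if (write(fd, "data", 4) != 4) {
        close(fd);
        return -1;
    }
    close(fd);
    if (symlink(victim, link_path) != 0)
        return -1;

    fd = open_shared_file(link_path);        /* should fail on the symlink */
    refused = (fd < 0);
    if (fd >= 0)
        close(fd);

    if (stat(victim, &st) != 0)
        return -1;
    unlink(link_path);
    unlink(victim);
    rmdir(dir);
    return (refused && st.st_size == 4) ? 1 : 0;
}
```

Keeping the file in a root-only directory, as the commit does, removes the attack surface entirely; the flag-based check only narrows it.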
Keir Fraser [Mon, 22 Oct 2007 20:06:11 +0000 (21:06 +0100)]
x86: small boot-time changes:
* use memory 0x8c000-0x90000 to avoid trampling the area above
0x90000 -- some bootloaders may leave droppings in that region
* reserve 2kB for vga mode table -- limit of 128 VESA modes could
overflow the original 1kB allocation
* remove unnecessary alignment of trampoline GDT
Alex Williamson [Mon, 22 Oct 2007 18:26:53 +0000 (12:26 -0600)]
[IA64] Don't share privregs with hvm domain
Don't share privregs with HVM domains, and adjust the IA64 Xen dump-core
format slightly. Xen shares privregs pages with IA64 HVM domains so that
xm dump-core can dump the pages. However, sharing the page allows the HVM
guest to read or corrupt the page contents, which might crash Xen. The Xen
dump-core file doesn't need the privregs page anyway, because for an IA64
HVM domain the CPU context should be obtained from the vcpu context.
Although this patch modifies the Xen dump-core format, the current crash
utility (at least crash 4.0-4.7) doesn't look into the
.xen_ia64_mmapped_regs section, and I don't know of any other tools that
parse the Xen dump-core file.
So this format modification doesn't cause an incompatibility.
Alex Williamson [Mon, 22 Oct 2007 18:19:42 +0000 (12:19 -0600)]
[IA64] Kdump: 64-bit aligned access to elf-note data
xen_core_regs, as passed by kexec_crash_save_info(), is 32-bit aligned as
it is the data section of an ELF-note. In order to ensure 64-bit aligned
access when xen_core_regs is filled in, shift it a bit and then memmove()
the data back into the 32-bit aligned location after the values have been
written.
Without this change, kdump panics on an unaligned access.
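The idea can be sketched as follows, assuming a hypothetical two-field register record. Unlike the commit, which shifts the pointer within the same buffer and then uses memmove(), this sketch fills a separate, naturally aligned temporary and copies it out. The struct layout and the name save_regs() are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register record; the real xen_core_regs layout differs. */
struct core_regs {
    uint64_t ip;
    uint64_t sp;
};

/*
 * Fill the record in naturally aligned storage, then copy the bytes to a
 * destination that is only guaranteed to be 4-byte aligned (such as the
 * data area of an ELF note).
 */
static void save_regs(void *note_data, uint64_t ip, uint64_t sp)
{
    struct core_regs tmp;       /* 8-byte aligned by the compiler */

    assert(note_data != NULL);
    tmp.ip = ip;                /* aligned 64-bit stores */
    tmp.sp = sp;
    memmove(note_data, &tmp, sizeof(tmp));  /* byte-wise copy */
}
```

Reading the record back should use the same approach in reverse: copy the bytes into an aligned struct before touching the 64-bit fields.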
Keir Fraser [Mon, 22 Oct 2007 13:22:39 +0000 (14:22 +0100)]
A few small fixes for xenstored:
- Proper sizeof parameter to snprintf
- Return proper xs_domain_dev for netbsd.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
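The log does not say what the incorrect sizeof was. One common form of the mistake, assumed here purely for illustration, is applying sizeof to a pointer instead of passing the real buffer size. The function name and the path format below are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a path into a caller-supplied buffer. The size has to be passed
 * in explicitly: inside this function `out` is a pointer, so sizeof(out)
 * would yield the pointer size, not the size of the caller's array.
 */
static int format_dev_path(char *out, size_t out_len, int domid)
{
    assert(out != NULL && out_len > 0);
    /* "/hypothetical/dom%d" is an illustrative format, not a real path. */
    return snprintf(out, out_len, "/hypothetical/dom%d", domid);
}
```

A caller that owns an array passes sizeof(array); the return value can then be compared against that size to detect truncation.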
Keir Fraser [Mon, 22 Oct 2007 12:04:32 +0000 (13:04 +0100)]
x86: Allow NMI callback CS to be specified via set_trap_table()
hypercall.
Based on a patch by Jan Beulich. Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Mon, 22 Oct 2007 06:44:25 +0000 (07:44 +0100)]
x86: Allow BOOT_TRAMPOLINE to be changed without needing manual
modification of the trampoline GDT. Adjust trampoline base to
0x94000.
Signed-off-by: Keir Fraser <keir@xensource.com>
Alex Williamson [Sun, 21 Oct 2007 21:52:25 +0000 (15:52 -0600)]
[IA64] New features for xenitp
Add auto-repeat feature
(Just press enter to re-execute the last go/sstep/cb/disass command).
Do not flush stdout in the signal handler.
Single step over a breakpoint.
Can quit with domain paused (quit paused)
'disp db' now displays watchpoint.
Keir Fraser [Fri, 19 Oct 2007 17:00:10 +0000 (18:00 +0100)]
Replace sysctl.physinfo.sockets_per_node with more directly useful
sysctl.physinfo.nr_cpus. This also avoids miscalculation of
sockets_per_node by Xen where the number of CPUs in the system is
clipped.
From: Elizabeth Kon <eak@us.ibm.com> Signed-off-by: Keir Fraser <keir@xensource.com>
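A small worked example of how clipping can distort a derived per-node socket count. The formula is an assumption about how such a value might be computed from the CPU count and is not quoted from Xen. The struct and field names are hypothetical.

```c
#include <assert.h>

/* Hypothetical topology description; field names are illustrative. */
struct topo {
    unsigned int nr_cpus;           /* CPUs actually brought up */
    unsigned int nr_nodes;
    unsigned int cores_per_socket;
    unsigned int threads_per_core;
};

/* Derived value of the kind the old field carried (assumed formula). */
static unsigned int sockets_per_node(const struct topo *t)
{
    unsigned int per_socket = t->cores_per_socket * t->threads_per_core;

    assert(t->nr_nodes != 0 && per_socket != 0);
    return t->nr_cpus / (t->nr_nodes * per_socket);
}
```

With 16 CPUs, 2 nodes, 2 cores per socket and 2 threads per core, the result is 2. If the CPU count is clipped to 4 on the same hardware, the same formula yields 0, which is why reporting nr_cpus directly is more useful to the consumer.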
Keir Fraser [Fri, 19 Oct 2007 16:47:12 +0000 (17:47 +0100)]
Avoid passing uninitialised ACPI tables to dom0 when checksums fail.
If during boot, ACPI checksum failures disable ACPI support in Xen,
pass 'acpi=off' to the domain 0 kernel to avoid a fatal page fault
as domain 0 attempts to access the uninitialized ACPI tables.
Signed-off-by: David Lively <dlively@virtualiron.com> Signed-off-by: Steve Ofsthun <sofsthun@virtualiron.com>
Keir Fraser [Fri, 19 Oct 2007 13:49:08 +0000 (14:49 +0100)]
Fix x86/64 build for *BSD.
- Config.mk: uname -m prints "amd64". Deal with this.
- do not assume python is always in /usr/bin
- get-fields.sh: make it portable and non-bash specific
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Fri, 19 Oct 2007 10:32:18 +0000 (11:32 +0100)]
x86: Remove io_apic fake-vector style of IRQ acknowledgement. Not
needed now that pass-through IRQs can use the 'new' ack method.
Signed-off-by: Keir Fraser <keir@xensource.com>
Alex Williamson [Wed, 17 Oct 2007 16:36:31 +0000 (10:36 -0600)]
[IA64] Backup/restore ACPI tables
We modify some of the ACPI tables for dom0 (limiting available CPUs,
modifying id/eid, and hiding SLIT/SRAT tables). This causes problems
when we try to kexec with different dom0 CPU counts or from Xen to
Linux. This introduces a mechanism to save the ACPI tables before
modification and restore them before kexec.
Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Simon Horman <horms@verge.net.au>
Keir Fraser [Wed, 17 Oct 2007 14:37:36 +0000 (15:37 +0100)]
x86: add option to display last exception records during register dumps
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Wed, 17 Oct 2007 13:38:19 +0000 (14:38 +0100)]
x86: Tighten handling of page-type attributes and make
map_pages_to_xen() smarter and safer.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Wed, 17 Oct 2007 10:12:32 +0000 (11:12 +0100)]
x86: Remove invlpg_works_ok and invlpg only single-page regions.
The flush_area_local() interface was unclear about whether a
multi-page region (2M/4M/1G) had to be mapped by a superpage, and
indeed some callers (map_pages_to_xen()) already would specify
FLUSH_LEVEL(2) for a region actually mapped by 4kB PTEs.
The safest fix is to relax the interface and do a full TLB flush in
these cases. My suspicion is that these cases are rare enough that the
cost of INVLPG versus full flush will be unimportant.
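The relaxed policy can be sketched as a simple decision function, assuming the region size is expressed as an order (log2 of the number of 4kB pages). The enum and the name choose_flush() are hypothetical and are not the flush_area_local() interface itself.

```c
#include <assert.h>

enum flush_kind { FLUSH_SINGLE_INVLPG, FLUSH_FULL_TLB };

/*
 * Pick a flush method for a region of 2^order 4kB pages. INVLPG is used
 * only for a single 4kB page; any larger region gets a full TLB flush,
 * regardless of whether it is mapped by a superpage or by 4kB PTEs.
 */
static enum flush_kind choose_flush(unsigned int order)
{
    assert(order < 32);     /* sanity bound for this sketch */
    return (order == 0) ? FLUSH_SINGLE_INVLPG : FLUSH_FULL_TLB;
}
```

Under this encoding a 2M region is order 9 and a 1G region is order 18, and both take the full-flush path.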
Keir Fraser [Wed, 17 Oct 2007 09:02:49 +0000 (10:02 +0100)]
Fix xenstore unwatch with node name starting with "@"
A watch node name starting with "@" should not be canonicalized.
Signed-off-by: Xiaowei Yang <xiaowei.yang@intel.com>
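A minimal sketch of the rule, assuming canonicalization means prefixing relative names with a per-connection base path. The buffer-based signature is hypothetical and differs from xenstored's own helper. "@releaseDomain" is used as an example of a special watch name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Resolve a node name against a base path. Absolute paths ("/...") and
 * special names ("@...") are copied through unchanged; anything else is
 * treated as relative to `base`.
 * Returns the length that snprintf() reports for the result.
 */
static int canonicalize(char *out, size_t out_len,
                        const char *base, const char *node)
{
    assert(out != NULL && base != NULL && node != NULL);

    if (node[0] == '/' || node[0] == '@')
        return snprintf(out, out_len, "%s", node);
    return snprintf(out, out_len, "%s/%s", base, node);
}
```

Watch and unwatch must apply the same rule, otherwise an unwatch on an "@" name would look for a prefixed path that was never registered.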
Keir Fraser [Wed, 17 Oct 2007 09:00:27 +0000 (10:00 +0100)]
hvm: TCGBIOS fixes
Fix IPL measurement of El Torito CD boot and some eventlog formats.
The TCG BIOS extensions are described here:
https://www.trustedcomputinggroup.org/specs/PCClient/TCG_PCClientImplementationforBIOS_1-20_1-00.pdf
- fix cdrom (El Torito) boot (8.2.5.6 El Torito, p63)
tcpa_ipl() is modified to support various boot devices.
move some measurement code into cdrom_boot() function.
- fix EV_IPL (0Dh) event (10.4.1 Event Types, p76)
eventfield size should be zero
- fix EV_SEPARATOR event (3.2.2 Integrity Collection and Reporting,
p32)
change eventfield to -1 (0xFFFFFFFF) from "---------------"
- add "Returned INT 19h" event (8.2.3 Logging of Boot Events, p59)
actually, tcgbios does not call int19h, but we extend this
tentatively
Keir Fraser [Tue, 16 Oct 2007 16:41:33 +0000 (17:41 +0100)]
xend: xenapi: Suspended domain causes fault if vif.get_all_records() is called
A single suspended domain on the system causes a fault when
vif.get_all_records() is called, since this returns an ErrorDescription
and no 'Value' in the 'v' dictionary. This patch returns 'None' as the
Value instead, which might not be optimal but is better than faulting.
Keir Fraser [Tue, 16 Oct 2007 16:31:37 +0000 (17:31 +0100)]
x86: consolidate/enhance TLB flushing interface
Fold the flush routines into a single local handler and a single SMP
multiplexor, and add the capability to also flush caches through the same
interfaces (a subsequent patch will make use of this).
While changing cpuinfo_x86 anyway, this patch also removes several unused
fields apparently inherited from Linux.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Alex Williamson [Mon, 15 Oct 2007 17:39:30 +0000 (11:39 -0600)]
[IA64] Support console_timestamps on IA64
This patch adds support for console_timestamps on IA64.
At this moment, EFIRTC is used for the start time.
To keep HV time synchronized with Dom0 system time over long periods,
the Dom0 sys_settimeofday and sys_adjtimex functions would need to
implement Dom0-HV time sync functionality.
Keir Fraser [Mon, 15 Oct 2007 11:13:41 +0000 (12:13 +0100)]
x86/64: Do not clobber %r11 (user rflags) on syscall from guest
userspace to guest kernel. The flags are saved on the guest kernel
stack anyway, but some guests rely on %r11 instead.
Signed-off-by: Keir Fraser <keir@xensource.com>
Tim Deegan [Mon, 15 Oct 2007 08:28:14 +0000 (09:28 +0100)]
PV guests don't require order-non-zero pages for shadowing, hence lift
the requirement on such being available for allocation when enabling
shadow mode, removing the potential for live migration to fail due to
fragmented memory.
Alex Williamson [Fri, 12 Oct 2007 21:02:06 +0000 (15:02 -0600)]
[IA64] Fix MCA error handler problems
Fixing MCA issues related to changes from kexec patch series...
[From "Kexec: Fix ia64_do_tlb_purge so that it works with XEN"]
> 2. Use the per_cpu variable to derive CURRENT_STACK_OFFSET rather
> than reading it from a kernel register. See 1) for explanation
> of why.
I added the same code to the "Reload DTR for stack" part and also added
code to avoid overlapping with the kernel TR.
> 3. In the VHPT pruning code, don't use r25 as ia64_jump_to_sal,
> which branches to ia64_do_tlb_purge expects r25 to be preserved.
> There seems no reason not to use r2 as per the other purges
> done in ia64_do_tlb_purge. Furthermore use r16 and r18 instead
> of r20 and r24 for consistency reasons.
The r25 kept the value of __va_ul(vcpu_vhpt_maddr(v)), and it was
referred to by the following lines.
    // r25 = __va_ul(vcpu_vhpt_maddr(v));
    dep r20=0,r25,0,IA64_GRANULE_SHIFT
    movl r26=PAGE_KERNEL
    ;;
    mov r21=IA64_TR_VHPT
    dep r22=0,r20,60,4 // physical address of
I defined a GET_VA_VCPU_VHPT_MADDR() macro to recalculate the value of
__va_ul(vcpu_vhpt_maddr(v)) in each part.
I also renamed the registers for the same consistency reasons.
Alex Williamson [Fri, 12 Oct 2007 20:49:37 +0000 (14:49 -0600)]
[IA64] Fix TLB insertion for subpaging
Without this patch, Longhorn is sure to hang. .NET applications
might hit this bug. The itc.i instruction is repeated forever because
a TLB entry with a smaller page size is volatile.
add unwind directive to fast_hypercall path.
Although the fast_hypercall path calls functions (hypercall, do_softirq())
and might be blocked, it doesn't have unwind information,
so stack unwinding fails. Add the necessary unwind directives.
fix stack unwinder.
- fix find_save_locs() and unw_unwind().
the instruction pointer check should be suited to Xen.
- fix unw_unwind_to_user()
The VTi domain fault handler doesn't always update vcpu->on_stack, so
the pUStk check fails. Add more checks to stop unwinding.