Keir Fraser [Thu, 25 Oct 2007 14:01:59 +0000 (15:01 +0100)]
hvm: In xenstore_process_logdirty_event(), if a stale shared memory
key is encountered, reset 'seg' to NULL so that the shared memory
initialization can be retried later.
Signed-off-by: Ben Guthro <bguthro@virtualron.com> Signed-off-by: Robert Phillips <rphillips@virtualiron.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Thu, 25 Oct 2007 13:55:37 +0000 (14:55 +0100)]
hvm,x86: Add more vmxassist opcodes for Ubuntu 7.04 support Signed-off-by: Ben Guthro <bguthro@virtualron.com> Signed-off-by: Gary Grebus <ggrebus@virtualiron.com>
Keir Fraser [Thu, 25 Oct 2007 13:45:47 +0000 (14:45 +0100)]
pv-qemu 10/10: Make xenconsoled ignore doms with qemu-dm
This patch writes a field /local/vm/DOMID/console/type taking the
value 'ioemu' or 'xenconsoled'. If xenconsoled sees a type that is
not its own, then it skips handling of that guest. The qemu-dm
process doesn't need to read this field since it will only attach
to the console if given the -serial pty arg which XenD already
ensures matches this xenstore field.
The overall behaviour is that if a paravirt guest has a qemu-dm
process running then that handles the console, otherwise the
xenconsoled handles it. The former is more functional, with the
exception of not currently supporting persistent logging to a
file at the same time as exposing a PTY.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
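The ownership check described above can be sketched as follows. This is a
minimal illustration, not the xenconsoled code: the helper names are
hypothetical, a plain dict stands in for xenstore, and treating a missing
field as "handled by xenconsoled" is an assumption.

```python
# Only the xenstore path layout and the two values ('ioemu', 'xenconsoled')
# come from the commit message; everything else is hypothetical.
OWN_TYPE = "xenconsoled"


def console_type_path(domid):
    """Build the xenstore path of the console type field for a domain."""
    return "/local/vm/%d/console/type" % domid


def xenconsoled_should_handle(store, domid):
    """Return True if xenconsoled should service this guest's console.

    `store` is a dict standing in for xenstore.
    """
    ctype = store.get(console_type_path(domid))
    if ctype is None:
        # Assumption: no type written means no qemu-dm claimed the console.
        return True
    # Skip any guest whose console type is not our own.
    return ctype == OWN_TYPE
```

With this shape, a guest whose field reads 'ioemu' is skipped by xenconsoled
and left to its qemu-dm process.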
Keir Fraser [Thu, 25 Oct 2007 13:45:07 +0000 (14:45 +0100)]
pv-qemu 9/10: XenD device model re-factoring
This patch adapts XenD so that it is capable of starting a qemu-dm
device model for both paravirt and fullyvirt guests. A paravirt guest
will only be given a device model if it has a VFB configured, or the
user explicitly includes the device_model option in the
config. This avoids unnecessary overhead for those wanting a minimal
paravirt guest.
The bulk of this patch involves moving code from the HVMImageHandler
into the base ImageHandler class. The HVMImageHandler and
LinuxImageHandler subclasses now merely contain a couple of
overrides to set some specific command line flags. The most important
is -M xenpv, vs -M xenfv.
The XenConfig class has a minor refactoring to add a has_rfb() method
to avoid duplicating code in a couple of places. Instead of hardcoding
DEFAULT_DM it now uses the xen.util.auxbin APIs to locate it - this
works on platforms where qemu-dm is in /usr/lib64 instead of
/usr/lib. As before paravirt only gets a default qemu-dm if using a
VFB.
The vfbif.py class is trimmed out since it no longer needs to spawn a
daemon. A few other misc fixes deal with qemu-dm interactions when
saving/restoring, and in particular recovering from save failures (or
checkpointing).
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keir Fraser [Thu, 25 Oct 2007 13:42:40 +0000 (14:42 +0100)]
pv-qemu 8/10: Add pv console to QEMU paravirt machine
This patch adds a paravirt console driver to qemu-dm. This is used
when the QEMU machine type is 'xenpv', connecting to the ring buffer
provided by the guest kernel. The '-serial' command line flag controls
how the guest console is exposed.
For parity with xenconsoled the '-serial pty' arg can be used. For
guests which are running a qemu-dm device model, the xenconsoled
daemon is no longer needed for guest consoles. The code for the
xen_console.c is based on the original code in
tools/console/daemon/io.c, but simplified; since it's only dealing with
a single guest there's no state tracking to worry about.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keir Fraser [Thu, 25 Oct 2007 13:41:35 +0000 (14:41 +0100)]
pv-qemu 7/10: Async negotiation with xenfb frontend
This patch re-factors the paravirt console xenfb_attach_dom
method. The original method blocks the caller until the front &
backends have both switched to the connected state. This isn't an
immediate problem, but patches which follow will extend qemu to also
handle the text console so blocking on graphics console startup will
block the text console processing.
The new code is basically a state machine. It starts off with a watch
waiting for the KBD backend to switch to 'initialized' mode, then does
the same for the FB backend. Now it waits for KBD & FB frontend
devices to initialize, reading & mapping the framebuffer & its config
at the appropriate step. When the KBD frontend finally reaches the
connected state it registers a graphical console with QEMU and sets up
the various framebuffer, mouse & keyboard event handlers. If a client
connects to the VNC server before this is completed, then they will
merely see a text console (or perhaps the monitor if configured that
way).
The main difference from previous versions of this patch, is that at
the suggestion of Markus Armbruster, I've re-ordered the individual
static functions so they are in order-of-call, rather than
reversed. Although I now have to pre-declare them, it is much easier
to read the code. I have also fixed the keycode -> keysym translations
to match previous behaviour.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
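The non-blocking negotiation described above can be sketched as a small
state machine driven by watch callbacks. This is only an illustration of
the ordering given in the commit message: the class, state names and the
(device, side, state) event tuple are hypothetical and do not mirror the
actual xenfb.c functions.

```python
from enum import Enum, auto


class State(Enum):
    WAIT_KBD_BACKEND = auto()    # KBD backend not yet 'initialized'
    WAIT_FB_BACKEND = auto()     # FB backend not yet 'initialized'
    WAIT_FRONTENDS = auto()      # KBD and FB frontends still initializing
    WAIT_KBD_CONNECTED = auto()  # waiting for KBD frontend 'connected'
    CONNECTED = auto()           # graphical console registered


class Negotiation:
    def __init__(self):
        self.state = State.WAIT_KBD_BACKEND
        self.frontends_ready = set()
        self.console_registered = False

    def on_watch(self, device, side, xenbus_state):
        """Handle one watch event; returns immediately, never blocks."""
        event = (device, side, xenbus_state)
        if self.state is State.WAIT_KBD_BACKEND:
            if event == ("kbd", "backend", "initialized"):
                self.state = State.WAIT_FB_BACKEND
        elif self.state is State.WAIT_FB_BACKEND:
            if event == ("fb", "backend", "initialized"):
                self.state = State.WAIT_FRONTENDS
        elif self.state is State.WAIT_FRONTENDS:
            if side == "frontend" and xenbus_state == "initialized":
                # The real code would read and map the framebuffer here.
                self.frontends_ready.add(device)
                if self.frontends_ready >= {"kbd", "fb"}:
                    self.state = State.WAIT_KBD_CONNECTED
        elif self.state is State.WAIT_KBD_CONNECTED:
            if event == ("kbd", "frontend", "connected"):
                # Stand-in for registering the graphical console with QEMU.
                self.console_registered = True
                self.state = State.CONNECTED
        return self.state
```

Events that arrive in a state that does not expect them are simply ignored,
so the caller's event loop stays free to service the text console.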
Keir Fraser [Thu, 25 Oct 2007 13:40:19 +0000 (14:40 +0100)]
pv-qemu 6/10: Merge private & public xenfb structs
This patch merges the public & private structs from the paravirt FB
into a single struct. Since QEMU is the only consumer of this code
there is no need for the artificial pub/priv split. Merging the two
will make it possible to more tightly integrate with QEMU's event
loop and do asynchronous non-blocking negotiation with the frontend
devices (see next patch).
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keir Fraser [Thu, 25 Oct 2007 13:39:33 +0000 (14:39 +0100)]
pv-qemu 5/10: Refactor QEMU console integration
This patch moves a bunch of code out of the xen_machine_pv.c file and
into the xenfb.c file. This is simply a re-factoring to facilitate the
two patches which follow.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keir Fraser [Thu, 25 Oct 2007 13:38:47 +0000 (14:38 +0100)]
pv-qemu 4/10: Refactor xenfb event handlers
This patch is a simple code re-factoring to move the event loop
integration directly into the xenfb.c file. It is to facilitate
the patches which follow.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keir Fraser [Thu, 25 Oct 2007 13:37:23 +0000 (14:37 +0100)]
pv-qemu 3/10: Remove standalone xenfb code
This patch removes all trace of the standalone paravirt framebuffer
daemon. With this there is no longer any requirement for
LibVNCServer. Everything is handled by the QEMU device model. The
xenfb.c and xenfb.h files are now moved (without code change) into
tools/ioemu/hw/ & the temporary Makefile hack from the previous patch
is removed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keir Fraser [Thu, 25 Oct 2007 13:35:04 +0000 (14:35 +0100)]
pv-qemu 2/10: Add a QEMU machine type for paravirt guests
This patch adds a paravirt machine type to QEMU. This can be requested
by passing the arg '-M xenpv' to qemu-dm. Aside from -d, and
-domain-name, the only other args that are processed are the VNC / SDL
graphics related args. Any others will be ignored. A tweak to
helper2.c was made to stop it setting up a file handler watch when
there are no CPUs registered.
The paravirt machine is in hw/xen_machine_pv.c and registers an
instance of the xenfb class, integrating it with the QEMU event loop
and key/mouse handlers. A couple of methods were added to xenfb.h to
allow direct access to the file handles for xenstore & the event
channel.
The vfbif.py device controller is modified to launch qemu-dm instead
of the old xen-vncfb / sdlfb daemons.
When receiving framebuffer updates from the guest, the update has to
be copied into QEMU's copy of the framebuffer. This is because QEMU
stores the framebuffer in the format that is native to the SDL
display, or VNC client. This is not necessarily the same as the guest
framebuffer which is always 32bpp. If there is an exact depth match we
use memcpy for speed, but in the non-matching case we have to fall back
to slow code to convert pixel formats. It fully supports all features
of the paravirt framebuffer including the choice between absolute &
relative pointers. The overall VIRT memory image size is about the same as
old xen-vncfb, but the resident memory size is a little increased due
to copy of the framebuffer & some QEMU static state overhead. Most of
this is shared across QEMU processes.
To avoid both moving the xenfb.c and making changes to it in the same
patch, this just uses a Makefile hack to link against the xenfb.o from
the tools/xenfb/ directory. This will be removed in the following
patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
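The fast-path/slow-path split for framebuffer updates described above can
be sketched as follows. This is an illustration only: the function name is
hypothetical, and the pixel layouts (guest bytes in B, G, R, X order; a
16bpp RGB565 little-endian host format) are assumptions chosen for the
example, not taken from the patch.

```python
def copy_update(guest_row, host_depth):
    """Copy one row of 32bpp guest pixels into the host's pixel format."""
    if host_depth == 32:
        # Exact depth match: a straight byte copy, analogous to memcpy.
        return bytes(guest_row)
    if host_depth == 16:
        # Slow path: convert every pixel (assumed RGB565 target).
        out = bytearray()
        for i in range(0, len(guest_row), 4):
            b, g, r = guest_row[i], guest_row[i + 1], guest_row[i + 2]
            pixel = ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)
            out += pixel.to_bytes(2, "little")
        return bytes(out)
    raise ValueError("unsupported host depth: %d" % host_depth)
```

The per-pixel loop is what makes the non-matching case slow, which is why
the exact-match case is worth special-casing.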
Keir Fraser [Thu, 25 Oct 2007 13:33:01 +0000 (14:33 +0100)]
pv-qemu 1/10: Add a QEMU machine type for fullvirt guests
This patch does a (no functional change) re-arrangement of the code
for starting up a fully virtualized guest. In particular it creates a
new QEMU machine type for Xen fullyvirt guests which can be specified
with '-M xenfv'. For compatibility this is in fact made to be the
default. The code for setting up memory maps is moved out of vl.c, and
into hw/xen_machine_fv.c. This is basically to ensure that it can be
easily skipped when we add a paravirt machine type in the next patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keir Fraser [Thu, 25 Oct 2007 08:43:42 +0000 (09:43 +0100)]
x86: GDTR must be reset after using real-mode BIOS services. Some
BIOSes clobber GDTR. While we're here reset IDTR too, although it's
not really necessary. Signed-off-by: John Byrne <john.l.byrne@hp.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Thu, 25 Oct 2007 08:24:28 +0000 (09:24 +0100)]
xend, acm: Put the __UNLABELED__ label into the mapfile if policy specifies it
Put the __UNLABELED__ label into the mapfile if policy specifies this
label rather than keeping the NULL_LABEL there. Also lock the map file
when it's rewritten and propagate the return code from compiling the
policy to callers.
Keir Fraser [Thu, 25 Oct 2007 08:22:28 +0000 (09:22 +0100)]
xm-test: various fixes
- recently I added an other_config field to the VTPM record which now
needs to be accounted for otherwise the test determines a bad key
- the dry-run command was throwing a different type of exception
(ACMError) than what was caught (XSMError)
- the tests based on the raw Xen-API need to build the PV_args
parameters from the old 'root' and 'extra' parameters.
Keir Fraser [Wed, 24 Oct 2007 09:20:03 +0000 (10:20 +0100)]
x86, cpufreq: Allow dom0 kernel to govern cpufreq via the Intel
Enhanced SpeedStep MSR.
From: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Alex Williamson [Tue, 23 Oct 2007 16:21:31 +0000 (10:21 -0600)]
[IA64] Prevent soft lockup when destroying VTi domain
Prevent a soft lockup during VTi domain destruction by making
relinquish_memory() continuable. It was assumed that
mm_teardown() frees most of the page_list, so that the list which
is passed to relinquish_memory() is short. However, that
assumption doesn't hold for VTi domains because qemu-dm
maps all the domain pages. Making relinquish_memory()
continuable avoids the soft lockup message.
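The idea of a continuable teardown can be sketched as follows. This is a
simplified model, not the IA64 code: the `budget` parameter and the boolean
return value are hypothetical stand-ins for the hypervisor's preemption
check and its "call me again" result.

```python
def relinquish_memory(page_list, budget):
    """Free at most `budget` pages from page_list.

    Returns True when the list is empty, False when the caller must
    invoke the function again to continue the work.
    """
    freed = 0
    while page_list:
        if freed >= budget:
            # Work remains: give up the CPU and ask to be continued.
            return False
        page_list.pop()  # stand-in for releasing one page
        freed += 1
    return True
```

A caller simply loops until the function reports completion, so a long
page list is released in bounded slices instead of one long uninterrupted
pass.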
Keir Fraser [Tue, 23 Oct 2007 13:38:47 +0000 (14:38 +0100)]
hvm, vt-d: Add memory cache-attribute pinning domctl for HVM
guests. Use this to pin virtual framebuffer VRAM as attribute WB, even
if guest tries to map with other attributes. Signed-off-by: Disheng Su <disheng.su@intel.com>
Keir Fraser [Tue, 23 Oct 2007 08:26:43 +0000 (09:26 +0100)]
xenmon: Fix security vulnerability CVE-2007-3919.
The xenbaked daemon and xenmon utility communicate via a mmap'ed
shared file. Since this file is located in /tmp, unprivileged users
can cause arbitrary files to be truncated by creating a symlink from
the well-known /tmp filename to e.g., /etc/passwd.
The fix is to place the shared file in a directory to which only root
should have access (in this case /var/run/).
This bug was reported, and the fix suggested, by Steve Kemp
<skx@debian.org>. Thanks!
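The hazard can be reproduced harmlessly inside a private temporary
directory. This sketch is only a demonstration of the symlink problem, not
the xenbaked code; the file names are made up for the example.

```python
import os
import tempfile


def demo_symlink_truncation():
    """Show that opening a symlinked well-known name truncates its target.

    Returns the size of the victim file after the naive open.
    """
    with tempfile.TemporaryDirectory() as d:
        victim = os.path.join(d, "victim")
        with open(victim, "w") as f:
            f.write("important data")
        # An unprivileged user plants a symlink at the well-known name.
        well_known = os.path.join(d, "shared-file")
        os.symlink(victim, well_known)
        # The daemon naively opens the well-known name for writing,
        # which follows the symlink and truncates the victim.
        with open(well_known, "w"):
            pass
        return os.path.getsize(victim)
```

Placing the shared file in a directory that only root can write to means
an unprivileged user cannot plant the symlink in the first place.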
Keir Fraser [Mon, 22 Oct 2007 20:06:11 +0000 (21:06 +0100)]
x86: small boot-time changes:
* use memory 0x8c000-0x90000 to avoid trampling the area above
0x90000 -- some bootloaders may leave droppings in that region
* reserve 2kB for vga mode table -- limit of 128 VESA modes could
overflow the original 1kB allocation
* remove unnecessary alignment of trampoline GDT
Alex Williamson [Mon, 22 Oct 2007 18:26:53 +0000 (12:26 -0600)]
[IA64] Don't share privregs with hvm domain
Don't share privregs with HVM domains, and tweak the IA64 Xen dump-core format
slightly. Xen shares privregs pages with IA64 HVM domains so that xm dump-core
can dump the pages. However, sharing the pages allows an HVM guest domain to
peek at or destroy the page contents, which might crash Xen. The Xen
dump-core file doesn't need the privregs pages anyway, because CPU context should be
obtained from the vcpu context in the case of an IA64 HVM domain.
Although this patch modifies the Xen dump-core format, the current crash utility
(at least crash 4.0-4.7) doesn't look into the .xen_ia64_mmapped_regs section,
and I don't know of any other tools that understand the Xen dump-core file.
So this format modification doesn't cause an incompatibility issue.
Alex Williamson [Mon, 22 Oct 2007 18:19:42 +0000 (12:19 -0600)]
[IA64] Kdump: 64-bit aligned access to elf-note data
xen_core_regs, as passed by kexec_crash_save_info(), is 32-bit aligned as
it is the data section of an ELF-note. In order to ensure 64-bit aligned
access when xen_core_regs is filled in, shift it a bit and then memmove()
the data back into the 32-bit aligned location after the values have been
written.
Without this change kdump panics on an unaligned-access.
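The shift-then-memmove approach can be sketched on a byte buffer. Python
has no alignment faults, so this only illustrates the offset arithmetic;
the function names and the buffer layout (with slack after the note data)
are hypothetical.

```python
import struct


def align_up(offset, align):
    """Round offset up to the next multiple of align (a power of two)."""
    return (offset + align - 1) & ~(align - 1)


def fill_regs(buf, note_off, values):
    """Write 64-bit values at an 8-byte aligned offset, then move them
    back to the 4-byte aligned note_off, as a memmove() would."""
    size = 8 * len(values)
    aligned = align_up(note_off, 8)
    # Fill in the registers at the shifted, 64-bit aligned location.
    struct.pack_into("<%dQ" % len(values), buf, aligned, *values)
    # Move the finished data back; the right-hand slice is copied first,
    # so overlapping source and destination are handled safely.
    buf[note_off:note_off + size] = buf[aligned:aligned + size]
```

For example, with a note data offset of 4, the values are first written at
offset 8 and then moved down by four bytes.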
Keir Fraser [Mon, 22 Oct 2007 13:22:39 +0000 (14:22 +0100)]
A few small fixes for xenstored:
- Proper sizeof parameter to snprintf
- Return proper xs_domain_dev for netbsd. Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Mon, 22 Oct 2007 12:04:32 +0000 (13:04 +0100)]
x86: Allow NMI callback CS to be specified via set_trap_table()
hypercall.
Based on a patch by Jan Beulich. Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Mon, 22 Oct 2007 06:44:25 +0000 (07:44 +0100)]
x86: Allow BOOT_TRAMPOLINE to be changed without needing manual
modification of the trampoline GDT. Adjust trampoline base to
0x94000. Signed-off-by: Keir Fraser <keir@xensource.com>
Alex Williamson [Sun, 21 Oct 2007 21:52:25 +0000 (15:52 -0600)]
[IA64] New features for xenitp
Add auto-repeat feature
(Just press enter to re-execute the last go/sstep/cb/disass command).
Do not flush stdout in the signal handler.
Single step over a breakpoint.
Can quit with domain paused (quit paused)
'disp db' now displays watchpoint.
Keir Fraser [Fri, 19 Oct 2007 17:00:10 +0000 (18:00 +0100)]
Replace sysctl.physinfo.sockets_per_node with more directly useful
sysctl.physinfo.nr_cpus. This also avoids miscalculation of
sockets_per_node by Xen where the number of CPUs in the system is
clipped.
From: Elizabeth Kon <eak@us.ibm.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Fri, 19 Oct 2007 16:47:12 +0000 (17:47 +0100)]
Avoid passing uninitialised ACPI tables to dom0 when checksums fail.
If during boot, ACPI checksum failures disable ACPI support in Xen,
pass 'acpi=off' to the domain 0 kernel to avoid a fatal page fault
as domain 0 attempts to access the uninitialized ACPI tables.
Signed-off-by: David Lively <dlively@virtualiron.com> Signed-off-by: Steve Ofsthun <sofsthun@virtualiron.com>
Keir Fraser [Fri, 19 Oct 2007 13:49:08 +0000 (14:49 +0100)]
Fix x86/64 build for *BSD.
- Config.mk: uname -m prints "amd64". Deal with this.
- do not assume python is always in /usr/bin
- get-fields.sh: make it portable and non-bash specific Signed-off-by: Christoph Egger <Christoph.Egger@amd.com>
Keir Fraser [Fri, 19 Oct 2007 10:32:18 +0000 (11:32 +0100)]
x86: Remove io_apic fake-vector style of IRQ acknowledgement. Not
needed now that pass-through IRQs can use the 'new' ack method. Signed-off-by: Keir Fraser <keir@xensource.com>
Alex Williamson [Wed, 17 Oct 2007 16:36:31 +0000 (10:36 -0600)]
[IA64] Backup/restore ACPI tables
We modify some of the ACPI tables for dom0 (limiting available CPUs,
modifying id/eid, and hiding SLIT/SRAT tables). This causes problems
when we try to kexec with different dom0 CPU counts or from Xen to
Linux. This introduces a mechanism to save ACPI tables before
modification and restore them before kexec.
Signed-off-by: Alex Williamson <alex.williamson@hp.com> Acked-by: Simon Horman <horms@verge.net.au>
Keir Fraser [Wed, 17 Oct 2007 14:37:36 +0000 (15:37 +0100)]
x86: add option to display last exception records during register dumps Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Wed, 17 Oct 2007 13:38:19 +0000 (14:38 +0100)]
x86: Tighten handling of page-type attributes and make
map_pages_to_xen() smarter and safer. Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Keir Fraser <keir@xensource.com>
Keir Fraser [Wed, 17 Oct 2007 10:12:32 +0000 (11:12 +0100)]
x86: Remove invlpg_works_ok and invlpg only single-page regions.
The flush_area_local() interface was unclear about whether a
multi-page region (2M/4M/1G) had to be mapped by a superpage, and
indeed some callers (map_pages_to_xen()) already would specify
FLUSH_LEVEL(2) for a region actually mapped by 4kB PTEs.
The safest fix is to relax the interface and do a full TLB flush in
these cases. My suspicion is that these cases are rare enough that the
cost of INVLPG versus full flush will be unimportant.
Keir Fraser [Wed, 17 Oct 2007 09:02:49 +0000 (10:02 +0100)]
Fix xenstore unwatch with node name starting with "@"
Watch node starting with "@" should not be canonicalized. Signed-off-by: Xiaowei Yang <xiaowei.yang@intel.com>
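The rule can be sketched as follows. This is an illustration rather than
the xenstored source: the function signature and the `base` argument
(standing in for the connection's implicit path) are assumptions.

```python
def canonicalize(node, base):
    """Turn a relative node name into an absolute xenstore path.

    Absolute paths and special "@" nodes are returned unchanged.
    """
    if node.startswith("/") or node.startswith("@"):
        return node
    return base + "/" + node
```

Because watch and unwatch must refer to the same name, applying this rule
consistently to both keeps an "@" watch removable.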
Keir Fraser [Wed, 17 Oct 2007 09:00:27 +0000 (10:00 +0100)]
hvm: TCGBIOS fixes
Fix IPL measurement of El Torito CD boot and some eventlog formats.
The TCG BIOS extensions are described here:
https://www.trustedcomputinggroup.org/specs/PCClient/TCG_PCClientImplementationforBIOS_1-20_1-00.pdf
- fix cdrom (El Torito) boot (8.2.5.6 El Torito, p63)
tcpa_ipl() is modified to support various boot devices.
move some measurement code into cdrom_boot() function.
- fix EV_IPL (0Dh) event (10.4.1 Event Types, p76)
eventfield size should be zero
- fix EV_SEPARATOR event (3.2.2 Integrity Collection and Reporting,
p32)
change eventfield to -1 (0xFFFFFFFF) from "---------------"
- add "Returned INT 19h" event (8.2.3 Logging of Boot Events, p59)
actually, tcgbios does not call int19h, but we extend this
tentatively
Keir Fraser [Tue, 16 Oct 2007 16:41:33 +0000 (17:41 +0100)]
xend: xenapi: Suspended domain causes fault if vif.get_all_records() is called
A single suspended domain on the system causes a fault when
vif.get_all_records() is called since this returns an ErrorDescription
and no 'Value' in the 'v' dictionary. This patch now returns 'None'
as the Value, which might not be optimal but is better than faulting.
Keir Fraser [Tue, 16 Oct 2007 16:31:37 +0000 (17:31 +0100)]
x86: consolidate/enhance TLB flushing interface
Fold the TLB flushing code into a single local handler and a single SMP
multiplexor, and add the capability to also flush caches through the same
interfaces (a subsequent patch will make use of this).
While changing cpuinfo_x86, this patch also removes several unused
fields apparently inherited from Linux.
Signed-off-by: Jan Beulich <jbeulich@novell.com> Signed-off-by: Keir Fraser <keir@xensource.com>