ia64/xen-unstable

changeset 8242:c8fdb0caa77b RELEASE-3.0.0

Updated docs for Xen 3.0.

Signed-off-by: Steven Hand <steven@xensource.com>
author smh22@firebug.cl.cam.ac.uk
date Mon Dec 05 13:39:26 2005 +0100 (2005-12-05)
parents 0255f48b757f
children 7729efa06812
files docs/src/interface.tex docs/src/user.tex
     1.1 --- a/docs/src/interface.tex	Sun Dec 04 20:12:00 2005 +0100
     1.2 +++ b/docs/src/interface.tex	Mon Dec 05 13:39:26 2005 +0100
     1.3 @@ -3,6 +3,10 @@
     1.4  \usepackage{comment,parskip}
     1.5  \setstretch{1.15}
     1.6  
     1.7 +% LIBRARY FUNCTIONS
     1.8 +
     1.9 +\newcommand{\hypercall}[1]{\vspace{2mm}{\sf #1}}
    1.10 +
    1.11  \begin{document}
    1.12  
    1.13  % TITLE PAGE
    1.14 @@ -15,19 +19,18 @@
    1.15  \vfill
    1.16  \begin{tabular}{l}
    1.17  {\Huge \bf Interface manual} \\[4mm]
    1.18 -{\huge Xen v2.0 for x86} \\[80mm]
    1.19 +{\huge Xen v3.0 for x86} \\[80mm]
    1.20  
    1.21 -{\Large Xen is Copyright (c) 2002-2004, The Xen Team} \\[3mm]
    1.22 +{\Large Xen is Copyright (c) 2002-2005, The Xen Team} \\[3mm]
    1.23  {\Large University of Cambridge, UK} \\[20mm]
    1.24  \end{tabular}
    1.25  \end{center}
    1.26  
    1.27 -{\bf
    1.28 -DISCLAIMER: This documentation is currently under active development
    1.29 +{\bf DISCLAIMER: This documentation is always under active development
    1.30  and as such there may be mistakes and omissions --- watch out for
    1.31  these and please report any you find to the developer's mailing list.
    1.32 -Contributions of material, suggestions and corrections are welcome.
    1.33 -}
    1.34 +The latest version is always available on-line.  Contributions of
    1.35 +material, suggestions and corrections are welcome.  }
    1.36  
    1.37  \vfill
    1.38  \cleardoublepage
    1.39 @@ -67,7 +70,7 @@ operating system.
    1.40  
    1.41  Xen essentially takes a `whole machine' virtualization approach as
    1.42  pioneered by IBM VM/370.  However, unlike VM/370 or more recent
    1.43 -efforts such as VMWare and Virtual PC, Xen does not attempt to
    1.44 +efforts such as VMware and Virtual PC, Xen does not attempt to
    1.45  completely virtualize the underlying hardware.  Instead parts of the
    1.46  hosted guest operating systems are modified to work with the VMM; the
    1.47  operating system is effectively ported to a new target architecture,
    1.48 @@ -87,48 +90,2016 @@ itself, allows the Xen framework to sepa
    1.49  mechanism and policy within the system.
    1.50  
    1.51  
    1.52 -%% chapter Virtual Architecture moved to architecture.tex
    1.53 -\include{src/interface/architecture}
    1.54 -
    1.55 -%% chapter Memory moved to memory.tex
    1.56 -\include{src/interface/memory}
    1.57 +\chapter{Virtual Architecture}
    1.58  
    1.59 -%% chapter Devices moved to devices.tex
    1.60 -\include{src/interface/devices}
    1.61 +In a Xen/x86 system, only the hypervisor runs with full processor
    1.62 +privileges ({\it ring 0} in the x86 four-ring model). It has full
    1.63 +access to the physical memory available in the system and is
    1.64 +responsible for allocating portions of it to running domains.  
    1.65  
    1.66 -%% chapter Further Information moved to further_info.tex
    1.67 -\include{src/interface/further_info}
    1.68 +On a 32-bit x86 system, guest operating systems may use {\it rings 1},
    1.69 +{\it 2} and {\it 3} as they see fit.  Segmentation is used to prevent
    1.70 +the guest OS from accessing the portion of the address space that is
    1.71 +reserved for Xen.  We expect most guest operating systems will use
    1.72 +ring 1 for their own operation and place applications in ring 3.
    1.73  
    1.74 +On 64-bit systems it is not possible to protect the hypervisor from
    1.75 +untrusted guest code running in rings 1 and 2. Guests are therefore
    1.76 +restricted to run in ring 3 only. The guest kernel is protected from its
    1.77 +applications by context switching between the kernel and the currently
    1.78 +running application.
    1.79 +
    1.80 +In this chapter we consider the basic virtual architecture provided by
    1.81 +Xen: CPU state, exception and interrupt handling, and time.
    1.82 +Other aspects such as memory and device access are discussed in later
    1.83 +chapters.
    1.84 +
    1.85 +
    1.86 +\section{CPU state}
    1.87 +
    1.88 +All privileged state must be handled by Xen.  The guest OS has no
    1.89 +direct access to CR3 and is not permitted to update privileged bits in
    1.90 +EFLAGS. Guest OSes use \emph{hypercalls} to invoke operations in Xen;
    1.91 +these are analogous to system calls but occur from ring 1 to ring 0.
    1.92 +
    1.93 +A list of all hypercalls is given in Appendix~\ref{a:hypercalls}.
    1.94 +
    1.95 +
    1.96 +\section{Exceptions}
    1.97 +
    1.98 +A virtual IDT is provided --- a domain can submit a table of trap
    1.99 +handlers to Xen via the {\tt set\_trap\_table()} hypercall.  The
   1.100 +exception stack frame presented to a virtual trap handler is identical
   1.101 +to its native equivalent.
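
As a brief illustration, the following sketch registers a small trap
table.  The {\tt trap\_info\_t} layout ({\tt vector}, {\tt flags}, {\tt
cs}, {\tt address}) and the {\tt HYPERVISOR\_set\_trap\_table()} wrapper
name follow the public headers and the Linux port respectively; the
handler symbols are placeholders.

\scriptsize
\begin{verbatim}
/* Sketch: hand Xen a zero-terminated table of virtual trap handlers.
 * page_fault_handler/gpf_handler are hypothetical guest entry points. */
static trap_info_t trap_table[] = {
    { 14, 0, FLAT_RING1_CS, (unsigned long)page_fault_handler }, /* #PF */
    { 13, 0, FLAT_RING1_CS, (unsigned long)gpf_handler },        /* #GP */
    {  0, 0, 0, 0 }                  /* an all-zero entry ends the table */
};

HYPERVISOR_set_trap_table(trap_table);   /* replace the virtual IDT */
\end{verbatim}
\normalsize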
   1.102 +
   1.103 +
   1.104 +\section{Interrupts and events}
   1.105 +
   1.106 +Interrupts are virtualized by mapping them to \emph{event channels},
   1.107 +which are delivered asynchronously to the target domain using a callback
   1.108 +supplied via the {\tt set\_callbacks()} hypercall.  A guest OS can map
   1.109 +these events onto its standard interrupt dispatch mechanisms.  Xen is
   1.110 +responsible for determining the target domain that will handle each
   1.111 +physical interrupt source. For more details on the binding of event
   1.112 +sources to event channels, see Chapter~\ref{c:devices}.
   1.113 +
   1.114 +
   1.115 +\section{Time}
   1.116 +
   1.117 +Guest operating systems need to be aware of the passage of both real
   1.118 +(or wallclock) time and their own `virtual time' (the time for which
   1.119 +they have been executing). Furthermore, Xen has a notion of time which
   1.120 +is used for scheduling. The following notions of time are provided:
   1.121 +
   1.122 +\begin{description}
   1.123 +\item[Cycle counter time.]
   1.124 +
   1.125 +  This provides a fine-grained time reference.  The cycle counter time
   1.126 +  is used to accurately extrapolate the other time references.  On SMP
   1.127 +  machines it is currently assumed that the cycle counter time is
   1.128 +  synchronized between CPUs.  The current x86-based implementation
   1.129 +  achieves this within inter-CPU communication latencies.
   1.130 +
   1.131 +\item[System time.]
   1.132 +
   1.133 +  This is a 64-bit counter which holds the number of nanoseconds that
   1.134 +  have elapsed since system boot.
   1.135 +
   1.136 +\item[Wall clock time.]
   1.137 +
   1.138 +  This is the time of day in a Unix-style {\tt struct timeval}
   1.139 +  (seconds and microseconds since 1 January 1970, adjusted by leap
   1.140 +  seconds).  An NTP client hosted by {\it domain 0} can keep this
   1.141 +  value accurate.
   1.142 +
   1.143 +\item[Domain virtual time.]
   1.144 +
   1.145 +  This progresses at the same pace as system time, but only while a
   1.146 +  domain is executing --- it stops while a domain is de-scheduled.
   1.147 +  Therefore the share of the CPU that a domain receives is indicated
   1.148 +  by the rate at which its virtual time increases.
   1.149 +
   1.150 +\end{description}
   1.151 +
   1.152 +
   1.153 +Xen exports timestamps for system time and wall-clock time to guest
   1.154 +operating systems through a shared page of memory.  Xen also provides
   1.155 +the cycle counter time at the instant the timestamps were calculated,
   1.156 +and the CPU frequency in Hertz.  This allows the guest to extrapolate
   1.157 +system and wall-clock times accurately based on the current cycle
   1.158 +counter time.
   1.159 +
    1.160 +Since all timestamps need to be updated and read \emph{atomically},
    1.161 +a version number is also stored in the shared info page, which is
    1.162 +incremented before and after updating the timestamps. Thus a guest can
    1.163 +be sure that it read a consistent state by checking that the two version
    1.164 +numbers are equal and even.
   1.165 +
   1.166 +Xen includes a periodic ticker which sends a timer event to the
   1.167 +currently executing domain every 10ms.  The Xen scheduler also sends a
   1.168 +timer event whenever a domain is scheduled; this allows the guest OS
   1.169 +to adjust for the time that has passed while it has been inactive.  In
    1.170 +addition, Xen allows each domain to request that it receive a timer
    1.171 +event at a specified system time by using the {\tt
   1.172 +  set\_timer\_op()} hypercall.  Guest OSes may use this timer to
   1.173 +implement timeout values when they block.
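
For example, a guest that wishes to block for at most 10ms might do so
roughly as follows (a sketch: the {\tt HYPERVISOR\_*} wrapper names
follow the Linux port, and {\tt get\_system\_time\_ns()} is a
hypothetical helper that reads the current system time from the shared
info page as described above).

\scriptsize
\begin{verbatim}
/* Sketch: request a timer event 10ms from now, then block until the
 * next event (timer or otherwise) arrives. */
uint64_t now = get_system_time_ns();

HYPERVISOR_set_timer_op(now + 10000000ULL); /* absolute system time, ns */
HYPERVISOR_sched_op(SCHEDOP_block, 0);      /* yield the CPU until then */
\end{verbatim}
\normalsize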
   1.174 +
   1.175 +
   1.176 +
   1.177 +%% % akw: demoting this to a section -- not sure if there is any point
   1.178 +%% % though, maybe just remove it.
   1.179 +
   1.180 +% KAF: Remove these random sections!
   1.181 +\begin{comment}
   1.182 +\section{Xen CPU Scheduling}
   1.183 +
   1.184 +Xen offers a uniform API for CPU schedulers.  It is possible to choose
   1.185 +from a number of schedulers at boot and it should be easy to add more.
   1.186 +The BVT, Atropos and Round Robin schedulers are part of the normal Xen
   1.187 +distribution.  BVT provides proportional fair shares of the CPU to the
   1.188 +running domains.  Atropos can be used to reserve absolute shares of
   1.189 +the CPU for each domain.  Round-robin is provided as an example of
   1.190 +Xen's internal scheduler API.
   1.191 +
   1.192 +\paragraph*{Note: SMP host support}
   1.193 +Xen has always supported SMP host systems.  Domains are statically
   1.194 +assigned to CPUs, either at creation time or when manually pinning to
   1.195 +a particular CPU.  The current schedulers then run locally on each CPU
   1.196 +to decide which of the assigned domains should be run there. The
   1.197 +user-level control software can be used to perform coarse-grain
   1.198 +load-balancing between CPUs.
   1.199 +\end{comment}
   1.200 +
   1.201 +
   1.202 +%% More information on the characteristics and use of these schedulers
   1.203 +%% is available in {\tt Sched-HOWTO.txt}.
   1.204 +
   1.205 +
   1.206 +\section{Privileged operations}
   1.207 +
   1.208 +Xen exports an extended interface to privileged domains (viz.\ {\it
   1.209 +  Domain 0}). This allows such domains to build and boot other domains
   1.210 +on the server, and provides control interfaces for managing
   1.211 +scheduling, memory, networking, and block devices.
   1.212 +
   1.213 +\chapter{Memory}
   1.214 +\label{c:memory} 
   1.215 +
   1.216 +Xen is responsible for managing the allocation of physical memory to
   1.217 +domains, and for ensuring safe use of the paging and segmentation
   1.218 +hardware.
   1.219 +
   1.220 +
   1.221 +\section{Memory Allocation}
   1.222 +
   1.223 +As well as allocating a portion of physical memory for its own private
    1.225 +use, Xen also reserves a small fixed portion of every virtual address
   1.225 +space. This is located in the top 64MB on 32-bit systems, the top
   1.226 +168MB on PAE systems, and a larger portion in the middle of the
   1.227 +address space on 64-bit systems. Unreserved physical memory is
   1.228 +available for allocation to domains at a page granularity.  Xen tracks
   1.229 +the ownership and use of each page, which allows it to enforce secure
   1.230 +partitioning between domains.
   1.231 +
   1.232 +Each domain has a maximum and current physical memory allocation.  A
   1.233 +guest OS may run a `balloon driver' to dynamically adjust its current
   1.234 +memory allocation up to its limit.
   1.235 +
   1.236 +
   1.237 +\section{Pseudo-Physical Memory}
   1.238 +
   1.239 +Since physical memory is allocated and freed on a page granularity,
   1.240 +there is no guarantee that a domain will receive a contiguous stretch
    1.241 +of physical memory. However, most operating systems do not have good
   1.242 +support for operating in a fragmented physical address space. To aid
   1.243 +porting such operating systems to run on top of Xen, we make a
   1.244 +distinction between \emph{machine memory} and \emph{pseudo-physical
   1.245 +  memory}.
   1.246 +
   1.247 +Put simply, machine memory refers to the entire amount of memory
   1.248 +installed in the machine, including that reserved by Xen, in use by
   1.249 +various domains, or currently unallocated. We consider machine memory
   1.250 +to comprise a set of 4kB \emph{machine page frames} numbered
   1.251 +consecutively starting from 0. Machine frame numbers mean the same
   1.252 +within Xen or any domain.
   1.253 +
   1.254 +Pseudo-physical memory, on the other hand, is a per-domain
   1.255 +abstraction. It allows a guest operating system to consider its memory
   1.256 +allocation to consist of a contiguous range of physical page frames
   1.257 +starting at physical frame 0, despite the fact that the underlying
   1.258 +machine page frames may be sparsely allocated and in any order.
   1.259 +
   1.260 +To achieve this, Xen maintains a globally readable {\it
   1.261 +  machine-to-physical} table which records the mapping from machine
   1.262 +page frames to pseudo-physical ones. In addition, each domain is
   1.263 +supplied with a {\it physical-to-machine} table which performs the
   1.264 +inverse mapping. Clearly the machine-to-physical table has size
   1.265 +proportional to the amount of RAM installed in the machine, while each
   1.266 +physical-to-machine table has size proportional to the memory
   1.267 +allocation of the given domain.
   1.268 +
   1.269 +Architecture dependent code in guest operating systems can then use
   1.270 +the two tables to provide the abstraction of pseudo-physical memory.
   1.271 +In general, only certain specialized parts of the operating system
   1.272 +(such as page table management) needs to understand the difference
   1.273 +between machine and pseudo-physical addresses.
   1.274 +
   1.275 +
   1.276 +\section{Page Table Updates}
   1.277 +
   1.278 +In the default mode of operation, Xen enforces read-only access to
   1.279 +page tables and requires guest operating systems to explicitly request
   1.280 +any modifications.  Xen validates all such requests and only applies
   1.281 +updates that it deems safe.  This is necessary to prevent domains from
   1.282 +adding arbitrary mappings to their page tables.
   1.283 +
   1.284 +To aid validation, Xen associates a type and reference count with each
   1.285 +memory page. A page has one of the following mutually-exclusive types
   1.286 +at any point in time: page directory ({\sf PD}), page table ({\sf
   1.287 +  PT}), local descriptor table ({\sf LDT}), global descriptor table
   1.288 +({\sf GDT}), or writable ({\sf RW}). Note that a guest OS may always
   1.289 +create readable mappings of its own memory regardless of its current
   1.290 +type.
   1.291 +
   1.292 +%%% XXX: possibly explain more about ref count 'lifecyle' here?
   1.293 +This mechanism is used to maintain the invariants required for safety;
   1.294 +for example, a domain cannot have a writable mapping to any part of a
   1.295 +page table as this would require the page concerned to simultaneously
   1.296 +be of types {\sf PT} and {\sf RW}.
   1.297 +
   1.298 +\hypercall{mmu\_update(mmu\_update\_t *req, int count, int *success\_count, domid\_t domid)}
   1.299 +
   1.300 +This hypercall is used to make updates to either the domain's
    1.301 +pagetables or to the machine-to-physical mapping table.  It supports
   1.302 +submitting a queue of updates, allowing batching for maximal
   1.303 +performance.  Explicitly queuing updates using this interface will
   1.304 +cause any outstanding writable pagetable state to be flushed from the
   1.305 +system.
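
For instance, a single page-table-entry update might be submitted as
follows (a sketch: the {\tt HYPERVISOR\_mmu\_update()} wrapper name
follows the Linux port, and {\tt pte\_maddr}/{\tt new\_pte} are
placeholders for the machine address of the PTE and the value to write).

\scriptsize
\begin{verbatim}
/* Sketch: write one PTE via a single-entry batch.  The low bits of
 * 'ptr' select the update type (here a normal page-table update). */
mmu_update_t req;
int done, rc;

req.ptr = pte_maddr | MMU_NORMAL_PT_UPDATE;  /* machine address of the PTE  */
req.val = new_pte;                           /* new PTE value (machine addr) */

rc = HYPERVISOR_mmu_update(&req, 1, &done, DOMID_SELF);
/* rc != 0 or done != 1 means Xen's validation refused the update */
\end{verbatim}
\normalsize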
   1.306 +
   1.307 +\section{Writable Page Tables}
   1.308 +
   1.309 +Xen also provides an alternative mode of operation in which guests
   1.310 +have the illusion that their page tables are directly writable.  Of
   1.311 +course this is not really the case, since Xen must still validate
   1.312 +modifications to ensure secure partitioning. To this end, Xen traps
   1.313 +any write attempt to a memory page of type {\sf PT} (i.e., that is
   1.314 +currently part of a page table).  If such an access occurs, Xen
   1.315 +temporarily allows write access to that page while at the same time
   1.316 +\emph{disconnecting} it from the page table that is currently in use.
   1.317 +This allows the guest to safely make updates to the page because the
   1.318 +newly-updated entries cannot be used by the MMU until Xen revalidates
   1.319 +and reconnects the page.  Reconnection occurs automatically in a
   1.320 +number of situations: for example, when the guest modifies a different
   1.321 +page-table page, when the domain is preempted, or whenever the guest
   1.322 +uses Xen's explicit page-table update interfaces.
   1.323 +
   1.324 +Writable pagetable functionality is enabled when the guest requests
   1.325 +it, using a {\tt vm\_assist} hypercall.  Writable pagetables do {\em
   1.326 +not} provide full virtualisation of the MMU, so the memory management
   1.327 +code of the guest still needs to be aware that it is running on Xen.
   1.328 +Since the guest's page tables are used directly, it must translate
   1.329 +pseudo-physical addresses to real machine addresses when building page
   1.330 +table entries.  The guest may not attempt to map its own pagetables
   1.331 +writably, since this would violate the memory type invariants; page
   1.332 +tables will automatically be made writable by the hypervisor, as
   1.333 +necessary.
   1.334 +
   1.335 +\section{Shadow Page Tables}
   1.336 +
   1.337 +Finally, Xen also supports a form of \emph{shadow page tables} in
    1.338 +which the guest OS uses an independent copy of page tables which are
   1.339 +unknown to the hardware (i.e.\ which are never pointed to by {\tt
   1.340 +  cr3}). Instead Xen propagates changes made to the guest's tables to
   1.341 +the real ones, and vice versa. This is useful for logging page writes
   1.342 +(e.g.\ for live migration or checkpoint). A full version of the shadow
   1.343 +page tables also allows guest OS porting with less effort.
   1.344 +
   1.345 +
   1.346 +\section{Segment Descriptor Tables}
   1.347 +
   1.348 +At start of day a guest is supplied with a default GDT, which does not reside
    1.349 +within its own memory allocation.  If the guest wishes to use anything
    1.350 +other than the default `flat' ring-1 and ring-3 segments that this GDT
   1.351 +provides, it must register a custom GDT and/or LDT with Xen, allocated
   1.352 +from its own memory.
   1.353 +
   1.354 +The following hypercall is used to specify a new GDT:
   1.355 +
   1.356 +\begin{quote}
   1.357 +  int {\bf set\_gdt}(unsigned long *{\em frame\_list}, int {\em
   1.358 +    entries})
   1.359 +
   1.360 +  \emph{frame\_list}: An array of up to 14 machine page frames within
   1.361 +  which the GDT resides.  Any frame registered as a GDT frame may only
   1.362 +  be mapped read-only within the guest's address space (e.g., no
   1.363 +  writable mappings, no use as a page-table page, and so on). Only 14
   1.364 +  pages may be specified because pages 15 and 16 are reserved for
   1.365 +  the hypervisor's GDT entries.
   1.366 +
   1.367 +  \emph{entries}: The number of descriptor-entry slots in the GDT.
   1.368 +\end{quote}
   1.369 +
   1.370 +The LDT is updated via the generic MMU update mechanism (i.e., via the
    1.371 +{\tt mmu\_update()} hypercall).
   1.372 +
   1.373 +\section{Start of Day}
   1.374 +
   1.375 +The start-of-day environment for guest operating systems is rather
   1.376 +different to that provided by the underlying hardware. In particular,
   1.377 +the processor is already executing in protected mode with paging
   1.378 +enabled.
   1.379 +
   1.380 +{\it Domain 0} is created and booted by Xen itself. For all subsequent
   1.381 +domains, the analogue of the boot-loader is the {\it domain builder},
   1.382 +user-space software running in {\it domain 0}. The domain builder is
   1.383 +responsible for building the initial page tables for a domain and
   1.384 +loading its kernel image at the appropriate virtual address.
   1.385 +
   1.386 +\section{VM assists}
   1.387 +
   1.388 +Xen provides a number of ``assists'' for guest memory management.
   1.389 +These are available on an ``opt-in'' basis to provide commonly-used
   1.390 +extra functionality to a guest.
   1.391 +
   1.392 +\hypercall{vm\_assist(unsigned int cmd, unsigned int type)}
   1.393 +
   1.394 +The {\tt cmd} parameter describes the action to be taken, whilst the
   1.395 +{\tt type} parameter describes the kind of assist that is being
   1.396 +referred to.  Available commands are as follows:
   1.397 +
   1.398 +\begin{description}
   1.399 +\item[VMASST\_CMD\_enable] Enable a particular assist type
   1.400 +\item[VMASST\_CMD\_disable] Disable a particular assist type
   1.401 +\end{description}
   1.402 +
   1.403 +And the available types are:
   1.404 +
   1.405 +\begin{description}
   1.406 +\item[VMASST\_TYPE\_4gb\_segments] Provide emulated support for
   1.407 +  instructions that rely on 4GB segments (such as the techniques used
   1.408 +  by some TLS solutions).
   1.409 +\item[VMASST\_TYPE\_4gb\_segments\_notify] Provide a callback to the
   1.410 +  guest if the above segment fixups are used: allows the guest to
   1.411 +  display a warning message during boot.
   1.412 +\item[VMASST\_TYPE\_writable\_pagetables] Enable writable pagetable
    1.413 +  mode, as described above.
   1.414 +\end{description}
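
As an illustration, a guest that wants the writable pagetable behaviour
described earlier can opt in with a single call (a sketch; the
{\tt HYPERVISOR\_vm\_assist()} wrapper name follows the Linux port).

\scriptsize
\begin{verbatim}
/* Sketch: opt in to the writable-pagetable assist at boot time. */
HYPERVISOR_vm_assist(VMASST_CMD_enable, VMASST_TYPE_writable_pagetables);
\end{verbatim}
\normalsize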
   1.415 +
   1.416 +
   1.417 +\chapter{Xen Info Pages}
   1.418 +
   1.419 +The {\bf Shared info page} is used to share various CPU-related state
   1.420 +between the guest OS and the hypervisor.  This information includes VCPU
   1.421 +status, time information and event channel (virtual interrupt) state.
   1.422 +The {\bf Start info page} is used to pass build-time information to
   1.423 +the guest when it boots and when it is resumed from a suspended state.
   1.424 +This chapter documents the fields included in the {\tt
   1.425 +shared\_info\_t} and {\tt start\_info\_t} structures for use by the
   1.426 +guest OS.
   1.427 +
   1.428 +\section{Shared info page}
   1.429 +
   1.430 +The {\tt shared\_info\_t} is accessed at run time by both Xen and the
   1.431 +guest OS.  It is used to pass information relating to the
   1.432 +virtual CPU and virtual machine state between the OS and the
   1.433 +hypervisor.
   1.434 +
   1.435 +The structure is declared in {\tt xen/include/public/xen.h}:
   1.436 +
   1.437 +\scriptsize
   1.438 +\begin{verbatim}
   1.439 +typedef struct shared_info {
   1.440 +    vcpu_info_t vcpu_info[MAX_VIRT_CPUS];
   1.441 +
   1.442 +    /*
   1.443 +     * A domain can create "event channels" on which it can send and receive
   1.444 +     * asynchronous event notifications. There are three classes of event that
   1.445 +     * are delivered by this mechanism:
   1.446 +     *  1. Bi-directional inter- and intra-domain connections. Domains must
   1.447 +     *     arrange out-of-band to set up a connection (usually by allocating
    1.448 +     *     an unbound 'listener' port and advertising that via a storage service
   1.449 +     *     such as xenstore).
   1.450 +     *  2. Physical interrupts. A domain with suitable hardware-access
   1.451 +     *     privileges can bind an event-channel port to a physical interrupt
   1.452 +     *     source.
   1.453 +     *  3. Virtual interrupts ('events'). A domain can bind an event-channel
   1.454 +     *     port to a virtual interrupt source, such as the virtual-timer
   1.455 +     *     device or the emergency console.
   1.456 +     * 
   1.457 +     * Event channels are addressed by a "port index". Each channel is
   1.458 +     * associated with two bits of information:
   1.459 +     *  1. PENDING -- notifies the domain that there is a pending notification
   1.460 +     *     to be processed. This bit is cleared by the guest.
   1.461 +     *  2. MASK -- if this bit is clear then a 0->1 transition of PENDING
   1.462 +     *     will cause an asynchronous upcall to be scheduled. This bit is only
   1.463 +     *     updated by the guest. It is read-only within Xen. If a channel
   1.464 +     *     becomes pending while the channel is masked then the 'edge' is lost
   1.465 +     *     (i.e., when the channel is unmasked, the guest must manually handle
   1.466 +     *     pending notifications as no upcall will be scheduled by Xen).
   1.467 +     * 
   1.468 +     * To expedite scanning of pending notifications, any 0->1 pending
   1.469 +     * transition on an unmasked channel causes a corresponding bit in a
   1.470 +     * per-vcpu selector word to be set. Each bit in the selector covers a
   1.471 +     * 'C long' in the PENDING bitfield array.
   1.472 +     */
   1.473 +    unsigned long evtchn_pending[sizeof(unsigned long) * 8];
   1.474 +    unsigned long evtchn_mask[sizeof(unsigned long) * 8];
   1.475 +
   1.476 +    /*
   1.477 +     * Wallclock time: updated only by control software. Guests should base
   1.478 +     * their gettimeofday() syscall on this wallclock-base value.
   1.479 +     */
   1.480 +    uint32_t wc_version;      /* Version counter: see vcpu_time_info_t. */
   1.481 +    uint32_t wc_sec;          /* Secs  00:00:00 UTC, Jan 1, 1970.  */
   1.482 +    uint32_t wc_nsec;         /* Nsecs 00:00:00 UTC, Jan 1, 1970.  */
   1.483 +
   1.484 +    arch_shared_info_t arch;
   1.485 +
   1.486 +} shared_info_t;
   1.487 +\end{verbatim}
   1.488 +\normalsize
   1.489 +
   1.490 +\begin{description}
   1.491 +\item[vcpu\_info] An array of {\tt vcpu\_info\_t} structures, each of
    1.492 +  which either holds runtime information about a virtual CPU or is
   1.493 +  ``empty'' if the corresponding VCPU does not exist.
   1.494 +\item[evtchn\_pending] Guest-global array, with one bit per event
   1.495 +  channel.  Bits are set if an event is currently pending on that
   1.496 +  channel.
   1.497 +\item[evtchn\_mask] Guest-global array for masking notifications on
   1.498 +  event channels.
   1.499 +\item[wc\_version] Version counter for current wallclock time.
   1.500 +\item[wc\_sec] Whole seconds component of current wallclock time.
   1.501 +\item[wc\_nsec] Nanoseconds component of current wallclock time.
   1.502 +\item[arch] Host architecture-dependent portion of the shared info
   1.503 +  structure.
   1.504 +\end{description}
   1.505 +
   1.506 +\subsection{vcpu\_info\_t}
   1.507 +
   1.508 +\scriptsize
   1.509 +\begin{verbatim}
   1.510 +typedef struct vcpu_info {
   1.511 +    /*
   1.512 +     * 'evtchn_upcall_pending' is written non-zero by Xen to indicate
   1.513 +     * a pending notification for a particular VCPU. It is then cleared 
   1.514 +     * by the guest OS /before/ checking for pending work, thus avoiding
   1.515 +     * a set-and-check race. Note that the mask is only accessed by Xen
   1.516 +     * on the CPU that is currently hosting the VCPU. This means that the
   1.517 +     * pending and mask flags can be updated by the guest without special
   1.518 +     * synchronisation (i.e., no need for the x86 LOCK prefix).
   1.519 +     * This may seem suboptimal because if the pending flag is set by
   1.520 +     * a different CPU then an IPI may be scheduled even when the mask
   1.521 +     * is set. However, note:
   1.522 +     *  1. The task of 'interrupt holdoff' is covered by the per-event-
   1.523 +     *     channel mask bits. A 'noisy' event that is continually being
   1.524 +     *     triggered can be masked at source at this very precise
   1.525 +     *     granularity.
   1.526 +     *  2. The main purpose of the per-VCPU mask is therefore to restrict
   1.527 +     *     reentrant execution: whether for concurrency control, or to
   1.528 +     *     prevent unbounded stack usage. Whatever the purpose, we expect
   1.529 +     *     that the mask will be asserted only for short periods at a time,
   1.530 +     *     and so the likelihood of a 'spurious' IPI is suitably small.
   1.531 +     * The mask is read before making an event upcall to the guest: a
   1.532 +     * non-zero mask therefore guarantees that the VCPU will not receive
   1.533 +     * an upcall activation. The mask is cleared when the VCPU requests
   1.534 +     * to block: this avoids wakeup-waiting races.
   1.535 +     */
   1.536 +    uint8_t evtchn_upcall_pending;
   1.537 +    uint8_t evtchn_upcall_mask;
   1.538 +    unsigned long evtchn_pending_sel;
   1.539 +    arch_vcpu_info_t arch;
   1.540 +    vcpu_time_info_t time;
   1.541 +} vcpu_info_t; /* 64 bytes (x86) */
   1.542 +\end{verbatim}
   1.543 +\normalsize
   1.544 +
   1.545 +\begin{description}
   1.546 +\item[evtchn\_upcall\_pending] This is set non-zero by Xen to indicate
   1.547 +  that there are pending events to be received.
   1.548 +\item[evtchn\_upcall\_mask] This is set non-zero to disable all
   1.549 +  interrupts for this CPU for short periods of time.  If individual
   1.550 +  event channels need to be masked, the {\tt evtchn\_mask} in the {\tt
   1.551 +  shared\_info\_t} is used instead.
   1.552 +\item[evtchn\_pending\_sel] When an event is delivered to this VCPU, a
   1.553 +  bit is set in this selector to indicate which word of the {\tt
   1.554 +  evtchn\_pending} array in the {\tt shared\_info\_t} contains the
   1.555 +  event in question.
   1.556 +\item[arch] Architecture-specific VCPU info. On x86 this contains the
   1.557 +  virtualized CR2 register (page fault linear address) for this VCPU.
   1.558 +\item[time] Time values for this VCPU.
   1.559 +\end{description}
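
To illustrate how these fields fit together, the following sketch shows
the core of an event upcall handler that scans the selector and the
per-channel pending bits.  The bit helpers ({\tt xchg()}, {\tt
\_\_ffs()}, {\tt clear\_bit()}) and {\tt do\_event()} are assumed to be
provided by the guest kernel.

\scriptsize
\begin{verbatim}
/* Sketch of an event upcall: clear the per-VCPU pending flag, then walk
 * the selector word and the unmasked pending bits it covers. */
void evtchn_do_upcall(shared_info_t *s, int cpu)
{
    vcpu_info_t *v = &s->vcpu_info[cpu];
    unsigned long sel, pending;
    unsigned int  word, bit;

    v->evtchn_upcall_pending = 0;            /* clear /before/ scanning */
    sel = xchg(&v->evtchn_pending_sel, 0);   /* grab and reset selector */

    while (sel != 0) {
        word = __ffs(sel);  sel &= ~(1UL << word);
        pending = s->evtchn_pending[word] & ~s->evtchn_mask[word];
        while (pending != 0) {
            bit = __ffs(pending);  pending &= ~(1UL << bit);
            clear_bit(bit, &s->evtchn_pending[word]);
            do_event(word * sizeof(unsigned long) * 8 + bit); /* port no. */
        }
    }
}
\end{verbatim}
\normalsize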
   1.560 +
   1.561 +\subsection{vcpu\_time\_info}
   1.562 +
   1.563 +\scriptsize
   1.564 +\begin{verbatim}
   1.565 +typedef struct vcpu_time_info {
   1.566 +    /*
   1.567 +     * Updates to the following values are preceded and followed by an
   1.568 +     * increment of 'version'. The guest can therefore detect updates by
   1.569 +     * looking for changes to 'version'. If the least-significant bit of
   1.570 +     * the version number is set then an update is in progress and the guest
   1.571 +     * must wait to read a consistent set of values.
   1.572 +     * The correct way to interact with the version number is similar to
   1.573 +     * Linux's seqlock: see the implementations of read_seqbegin/read_seqretry.
   1.574 +     */
   1.575 +    uint32_t version;
   1.576 +    uint32_t pad0;
   1.577 +    uint64_t tsc_timestamp;   /* TSC at last update of time vals.  */
   1.578 +    uint64_t system_time;     /* Time, in nanosecs, since boot.    */
   1.579 +    /*
   1.580 +     * Current system time:
   1.581 +     *   system_time + ((tsc - tsc_timestamp) << tsc_shift) * tsc_to_system_mul
   1.582 +     * CPU frequency (Hz):
   1.583 +     *   ((10^9 << 32) / tsc_to_system_mul) >> tsc_shift
   1.584 +     */
   1.585 +    uint32_t tsc_to_system_mul;
   1.586 +    int8_t   tsc_shift;
   1.587 +    int8_t   pad1[3];
   1.588 +} vcpu_time_info_t; /* 32 bytes */
   1.589 +\end{verbatim}
   1.590 +\normalsize
   1.591 +
   1.592 +\begin{description}
   1.593 +\item[version] Used to ensure the guest gets consistent time updates.
    1.594 +\item[tsc\_timestamp] Cycle counter timestamp of the last time update;
    1.595 +  could be used to extrapolate in between updates, for instance.
   1.596 +\item[system\_time] Time since boot (nanoseconds).
   1.597 +\item[tsc\_to\_system\_mul] Cycle counter to nanoseconds multiplier
   1.598 +(used in extrapolating current time).
   1.599 +\item[tsc\_shift] Cycle counter to nanoseconds shift (used in
   1.600 +extrapolating current time).
   1.601 +\end{description}
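
Putting the version protocol and the extrapolation formula together, a
guest's time-reading path might look roughly as follows ({\tt rdtsc()}
and {\tt rmb()} are assumed helpers, and real code must also guard the
64-bit multiply against overflow).

\scriptsize
\begin{verbatim}
/* Sketch: consistent snapshot of this VCPU's vcpu_time_info, then
 * extrapolation of the current system time. */
uint64_t current_system_time_ns(shared_info_t *shared, int cpu)
{
    vcpu_time_info_t *t = &shared->vcpu_info[cpu].time;
    uint32_t version, mul;
    uint64_t tsc_stamp, sys_time, delta;
    int8_t   shift;

    do {
        version   = t->version;
        rmb();                      /* read the fields inside the window */
        tsc_stamp = t->tsc_timestamp;
        sys_time  = t->system_time;
        mul       = t->tsc_to_system_mul;
        shift     = t->tsc_shift;
        rmb();
    } while ((version & 1) || (version != t->version));

    delta = rdtsc() - tsc_stamp;    /* cycles since the snapshot */
    delta = (shift >= 0) ? (delta << shift) : (delta >> -shift);
    return sys_time + ((delta * mul) >> 32);  /* mul is 32.32 fixed point */
}
\end{verbatim}
\normalsize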
   1.602 +
   1.603 +\subsection{arch\_shared\_info\_t}
   1.604 +
   1.605 +On x86, the {\tt arch\_shared\_info\_t} is defined as follows (from
    1.606 +{\tt xen/include/public/arch-x86\_32.h}):
   1.607 +
   1.608 +\scriptsize
   1.609 +\begin{verbatim}
   1.610 +typedef struct arch_shared_info {
   1.611 +    unsigned long max_pfn;                  /* max pfn that appears in table */
   1.612 +    /* Frame containing list of mfns containing list of mfns containing p2m. */
   1.613 +    unsigned long pfn_to_mfn_frame_list_list; 
   1.614 +} arch_shared_info_t;
   1.615 +\end{verbatim}
   1.616 +\normalsize
   1.617 +
   1.618 +\begin{description}
   1.619 +\item[max\_pfn] The maximum PFN listed in the physical-to-machine
   1.620 +  mapping table (P2M table).
   1.621 +\item[pfn\_to\_mfn\_frame\_list\_list] Machine address of the frame
   1.622 +  that contains the machine addresses of the P2M table frames.
   1.623 +\end{description}
   1.624 +
   1.625 +\section{Start info page}
   1.626 +
    1.627 +The start info structure is declared as follows (in {\tt
   1.628 +xen/include/public/xen.h}):
   1.629 +
   1.630 +\scriptsize
   1.631 +\begin{verbatim}
   1.632 +#define MAX_GUEST_CMDLINE 1024
   1.633 +typedef struct start_info {
   1.634 +    /* THE FOLLOWING ARE FILLED IN BOTH ON INITIAL BOOT AND ON RESUME.    */
   1.635 +    char magic[32];             /* "Xen-<version>.<subversion>". */
   1.636 +    unsigned long nr_pages;     /* Total pages allocated to this domain.  */
   1.637 +    unsigned long shared_info;  /* MACHINE address of shared info struct. */
   1.638 +    uint32_t flags;             /* SIF_xxx flags.                         */
   1.639 +    unsigned long store_mfn;    /* MACHINE page number of shared page.    */
   1.640 +    uint32_t store_evtchn;      /* Event channel for store communication. */
   1.641 +    unsigned long console_mfn;  /* MACHINE address of console page.       */
   1.642 +    uint32_t console_evtchn;    /* Event channel for console messages.    */
   1.643 +    /* THE FOLLOWING ARE ONLY FILLED IN ON INITIAL BOOT (NOT RESUME).     */
   1.644 +    unsigned long pt_base;      /* VIRTUAL address of page directory.     */
   1.645 +    unsigned long nr_pt_frames; /* Number of bootstrap p.t. frames.       */
   1.646 +    unsigned long mfn_list;     /* VIRTUAL address of page-frame list.    */
   1.647 +    unsigned long mod_start;    /* VIRTUAL address of pre-loaded module.  */
   1.648 +    unsigned long mod_len;      /* Size (bytes) of pre-loaded module.     */
   1.649 +    int8_t cmd_line[MAX_GUEST_CMDLINE];
   1.650 +} start_info_t;
   1.651 +\end{verbatim}
   1.652 +\normalsize
   1.653 +
    1.654 +The fields are in two groups: the first group is always filled in
    1.655 +when a domain is booted or resumed; the second is only used at
    1.656 +boot time.
   1.657 +
   1.658 +The always-available group is as follows:
   1.659 +
   1.660 +\begin{description}
   1.661 +\item[magic] A text string identifying the Xen version to the guest.
   1.662 +\item[nr\_pages] The number of real machine pages available to the
   1.663 +  guest.
   1.664 +\item[shared\_info] Machine address of the shared info structure,
   1.665 +  allowing the guest to map it during initialisation.
   1.666 +\item[flags] Flags for describing optional extra settings to the
   1.667 +  guest.
   1.668 +\item[store\_mfn] Machine address of the Xenstore communications page.
   1.669 +\item[store\_evtchn] Event channel to communicate with the store.
   1.670 +\item[console\_mfn] Machine address of the console data page.
   1.671 +\item[console\_evtchn] Event channel to notify the console backend.
   1.672 +\end{description}
   1.673 +
   1.674 +The boot-only group may only be safely referred to during system boot:
   1.675 +
   1.676 +\begin{description}
   1.677 +\item[pt\_base] Virtual address of the page directory created for us
   1.678 +  by the domain builder.
    1.679 +\item[nr\_pt\_frames] Number of frames used by the builder's bootstrap
   1.680 +  pagetables.
   1.681 +\item[mfn\_list] Virtual address of the list of machine frames this
   1.682 +  domain owns.
   1.683 +\item[mod\_start] Virtual address of any pre-loaded modules
    1.684 +  (e.g.\ a ramdisk).
   1.685 +\item[mod\_len] Size of pre-loaded module (if any).
   1.686 +\item[cmd\_line] Kernel command line passed by the domain builder.
   1.687 +\end{description}
   1.688 +
   1.689 +
   1.690 +% by Mark Williamson <mark.williamson@cl.cam.ac.uk>
   1.691 +
   1.692 +\chapter{Event Channels}
   1.693 +\label{c:eventchannels}
   1.694 +
   1.695 +Event channels are the basic primitive provided by Xen for event
   1.696 +notifications.  An event is the Xen equivalent of a hardware
    1.697 +interrupt.  They essentially store one bit of information; the event
   1.698 +of interest is signalled by transitioning this bit from 0 to 1.
   1.699 +
   1.700 +Notifications are received by a guest via an upcall from Xen,
   1.701 +indicating when an event arrives (setting the bit).  Further
   1.702 +notifications are masked until the bit is cleared again (therefore,
   1.703 +guests must check the value of the bit after re-enabling event
   1.704 +delivery to ensure no missed notifications).
   1.705 +
   1.706 +Event notifications can be masked by setting a flag; this is
   1.707 +equivalent to disabling interrupts and can be used to ensure atomicity
   1.708 +of certain operations in the guest kernel.
   1.709 +
   1.710 +\section{Hypercall interface}
   1.711 +
   1.712 +\hypercall{event\_channel\_op(evtchn\_op\_t *op)}
   1.713 +
   1.714 +The event channel operation hypercall is used for all operations on
   1.715 +event channels / ports.  Operations are distinguished by the value of
   1.716 +the {\tt cmd} field of the {\tt op} structure.  The possible commands
   1.717 +are described below:
   1.718 +
   1.719 +\begin{description}
   1.720 +
   1.721 +\item[EVTCHNOP\_alloc\_unbound]
   1.722 +  Allocate a new event channel port, ready to be connected to by a
   1.723 +  remote domain.
   1.724 +  \begin{itemize}
   1.725 +  \item Specified domain must exist.
   1.726 +  \item A free port must exist in that domain.
   1.727 +  \end{itemize}
   1.728 +  Unprivileged domains may only allocate their own ports, privileged
   1.729 +  domains may also allocate ports in other domains.
   1.730 +\item[EVTCHNOP\_bind\_interdomain]
   1.731 +  Bind an event channel for interdomain communications.
   1.732 +  \begin{itemize}
   1.733 +  \item Caller domain must have a free port to bind.
   1.734 +  \item Remote domain must exist.
   1.735 +  \item Remote port must be allocated and currently unbound.
   1.736 +  \item Remote port must be expecting the caller domain as the ``remote''.
   1.737 +  \end{itemize}
   1.738 +\item[EVTCHNOP\_bind\_virq]
   1.739 +  Allocate a port and bind a VIRQ to it.
   1.740 +  \begin{itemize}
   1.741 +  \item Caller domain must have a free port to bind.
   1.742 +  \item VIRQ must be valid.
   1.743 +  \item VCPU must exist.
   1.744 +  \item VIRQ must not currently be bound to an event channel.
   1.745 +  \end{itemize}
   1.746 +\item[EVTCHNOP\_bind\_ipi]
   1.747 +  Allocate and bind a port for notifying other virtual CPUs.
   1.748 +  \begin{itemize}
   1.749 +  \item Caller domain must have a free port to bind.
   1.750 +  \item VCPU must exist.
   1.751 +  \end{itemize}
   1.752 +\item[EVTCHNOP\_bind\_pirq]
   1.753 +  Allocate and bind a port to a real IRQ.
   1.754 +  \begin{itemize}
   1.755 +  \item Caller domain must have a free port to bind.
   1.756 +  \item PIRQ must be within the valid range.
   1.757 +  \item Another binding for this PIRQ must not exist for this domain.
   1.758 +  \item Caller must have an available port.
   1.759 +  \end{itemize}
   1.760 +\item[EVTCHNOP\_close]
   1.761 +  Close an event channel (no more events will be received).
   1.762 +  \begin{itemize}
   1.763 +  \item Port must be valid (currently allocated).
   1.764 +  \end{itemize}
   1.765 +\item[EVTCHNOP\_send] Send a notification on an event channel attached
   1.766 +  to a port.
   1.767 +  \begin{itemize}
   1.768 +  \item Port must be valid.
   1.769 +  \item Only valid for Interdomain, IPI or Allocated Unbound ports.
   1.770 +  \end{itemize}
   1.771 +\item[EVTCHNOP\_status] Query the status of a port; what kind of port,
   1.772 +  whether it is bound, what remote domain is expected, what PIRQ or
   1.773 +  VIRQ it is bound to, what VCPU will be notified, etc.
   1.774 +  Unprivileged domains may only query the state of their own ports.
   1.775 +  Privileged domains may query any port.
    1.776 +\item[EVTCHNOP\_bind\_vcpu] Bind an event channel to a particular VCPU:
   1.777 +  receive notification upcalls only on that VCPU.
   1.778 +  \begin{itemize}
   1.779 +  \item VCPU must exist.
   1.780 +  \item Port must be valid.
    1.781 +  \item Event channel must be either allocated but unbound, bound to
    1.782 +  an interdomain event channel, or bound to a PIRQ.
   1.783 +  \end{itemize}
   1.784 +
   1.785 +\end{description}
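
As an example of the above, a backend typically allocates an unbound
port for a given frontend domain and advertises it via xenstore.  A
sketch follows: the structure and field names are taken from {\tt
xen/include/public/event\_channel.h}, the wrapper name from the Linux
port, and {\tt advertise\_port()} is a placeholder; check all of these
against the headers in use.

\scriptsize
\begin{verbatim}
/* Sketch: allocate an unbound port that <remote_domid> may later bind
 * to with EVTCHNOP_bind_interdomain. */
evtchn_op_t op;

op.cmd                        = EVTCHNOP_alloc_unbound;
op.u.alloc_unbound.dom        = DOMID_SELF;     /* allocate in this domain */
op.u.alloc_unbound.remote_dom = remote_domid;   /* expected peer           */

if (HYPERVISOR_event_channel_op(&op) == 0)
    advertise_port(op.u.alloc_unbound.port);    /* e.g. write to xenstore  */
\end{verbatim}
\normalsize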
   1.786 +
   1.787 +%%
   1.788 +%% grant_tables.tex
   1.789 +%% 
   1.790 +%% Made by Mark Williamson
   1.791 +%% Login   <mark@maw48>
   1.792 +%%
   1.793 +
   1.794 +\chapter{Grant tables}
   1.795 +\label{c:granttables}
   1.796 +
    1.797 +Xen's grant tables provide a generic mechanism for memory sharing
   1.798 +between domains.  This shared memory interface underpins the split
   1.799 +device drivers for block and network IO.
   1.800 +
   1.801 +Each domain has its own {\bf grant table}.  This is a data structure
   1.802 +that is shared with Xen; it allows the domain to tell Xen what kind of
   1.803 +permissions other domains have on its pages.  Entries in the grant
   1.804 +table are identified by {\bf grant references}.  A grant reference is
   1.805 +an integer, which indexes into the grant table.  It acts as a
   1.806 +capability which the grantee can use to perform operations on the
   1.807 +granter's memory.
   1.808 +
   1.809 +This capability-based system allows shared-memory communications
   1.810 +between unprivileged domains.  A grant reference also encapsulates the
   1.811 +details of a shared page, removing the need for a domain to know the
   1.812 +real machine address of a page it is sharing.  This makes it possible
   1.813 +to share memory correctly with domains running in fully virtualised
   1.814 +memory.
   1.815 +
   1.816 +\section{Interface}
   1.817 +
   1.818 +\subsection{Grant table manipulation}
   1.819 +
   1.820 +Creating and destroying grant references is done by direct access to
   1.821 +the grant table.  This removes the need to involve Xen when creating
   1.822 +grant references, modifying access permissions, etc.  The grantee
   1.823 +domain will invoke hypercalls to use the grant references.  Four main
   1.824 +operations can be accomplished by directly manipulating the table:
   1.825 +
   1.826 +\begin{description}
   1.827 +\item[Grant foreign access] allocate a new entry in the grant table
   1.828 +  and fill out the access permissions accordingly.  The access
   1.829 +  permissions will be looked up by Xen when the grantee attempts to
   1.830 +  use the reference to map the granted frame.
   1.831 +\item[End foreign access] check that the grant reference is not
   1.832 +  currently in use, then remove the mapping permissions for the frame.
   1.833 +  This prevents further mappings from taking place but does not allow
   1.834 +  forced revocations of existing mappings.
   1.835 +\item[Grant foreign transfer] allocate a new entry in the table
   1.836 +  specifying transfer permissions for the grantee.  Xen will look up
   1.837 +  this entry when the grantee attempts to transfer a frame to the
   1.838 +  granter.
   1.839 +\item[End foreign transfer] remove permissions to prevent a transfer
   1.840 +  occurring in future.  If the transfer is already committed,
   1.841 +  modifying the grant table cannot prevent it from completing.
   1.842 +\end{description}
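
For instance, granting another domain access to one of its frames
requires the granter only to fill in a free grant-table entry, roughly
as follows (a sketch assuming the {\tt grant\_entry\_t} layout and {\tt
GTF\_*} flags from {\tt xen/include/public/grant\_table.h}; {\tt
grant\_table} is the domain's mapped grant table and {\tt wmb()} an
assumed write-barrier helper).

\scriptsize
\begin{verbatim}
/* Sketch: grant <remote_domid> access to machine frame <mfn> using the
 * free grant-table entry indexed by <ref>. */
grant_entry_t *e = &grant_table[ref];

e->domid = remote_domid;        /* which domain may use this reference  */
e->frame = mfn;                 /* which machine frame is being shared  */
wmb();                          /* publish domid/frame before the flag  */
e->flags = GTF_permit_access;   /* entry becomes live once flags is set */
\end{verbatim}
\normalsize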
   1.843 +
   1.844 +\subsection{Hypercalls}
   1.845 +
   1.846 +Use of grant references is accomplished via a hypercall.  The grant
   1.847 +table op hypercall takes three arguments:
   1.848 +
   1.849 +\hypercall{grant\_table\_op(unsigned int cmd, void *uop, unsigned int count)}
   1.850 +
   1.851 +{\tt cmd} indicates the grant table operation of interest.  {\tt uop}
   1.852 +is a pointer to a structure (or an array of structures) describing the
   1.853 +operation to be performed.  The {\tt count} field describes how many
   1.854 +grant table operations are being batched together.
   1.855 +
   1.856 +The core logic is situated in {\tt xen/common/grant\_table.c}.  The
   1.857 +grant table operation hypercall can be used to perform the following
   1.858 +actions:
   1.859 +
   1.860 +\begin{description}
   1.861 +\item[GNTTABOP\_map\_grant\_ref] Given a grant reference from another
   1.862 +  domain, map the referred page into the caller's address space.
   1.863 +\item[GNTTABOP\_unmap\_grant\_ref] Remove a mapping to a granted frame
   1.864 +  from the caller's address space.  This is used to voluntarily
   1.865 +  relinquish a mapping to a granted page.
   1.866 +\item[GNTTABOP\_setup\_table] Setup grant table for caller domain.
   1.867 +\item[GNTTABOP\_dump\_table] Debugging operation.
   1.868 +\item[GNTTABOP\_transfer] Given a transfer reference from another
   1.869 +  domain, transfer ownership of a page frame to that domain.
   1.870 +\end{description}
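
A grantee mapping a granted frame into its address space might
therefore issue something like the following (a sketch: the {\tt
gnttab\_map\_grant\_ref\_t} field names and the {\tt GNTMAP\_host\_map}
flag are taken from the public grant-table header and should be checked
against the headers in use; {\tt save\_handle()} is a placeholder).

\scriptsize
\begin{verbatim}
/* Sketch: map the frame granted by <granter_domid> under <ref> at the
 * (pre-allocated) virtual address <vaddr> in the caller's space. */
gnttab_map_grant_ref_t map;

map.host_addr = vaddr;                 /* where to map it            */
map.flags     = GNTMAP_host_map;       /* normal host mapping        */
map.ref       = ref;                   /* grant reference to use     */
map.dom       = granter_domid;         /* who granted it             */

if (HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, &map, 1) == 0 &&
    map.status == GNTST_okay)
    save_handle(map.handle);           /* needed later for unmapping */
\end{verbatim}
\normalsize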
   1.871 +
   1.872 +%%
   1.873 +%% xenstore.tex
   1.874 +%% 
   1.875 +%% Made by Mark Williamson
   1.876 +%% Login   <mark@maw48>
   1.877 +%% 
   1.878 +
   1.879 +\chapter{Xenstore}
   1.880 +
   1.881 +Xenstore is the mechanism by which control-plane activities occur.
   1.882 +These activities include:
   1.883 +
   1.884 +\begin{itemize}
   1.885 +\item Setting up shared memory regions and event channels for use with
   1.886 +  the split device drivers.
   1.887 +\item Notifying the guest of control events (e.g. balloon driver
    1.888 +  requests).
   1.889 +\item Reporting back status information from the guest
   1.890 +  (e.g. performance-related statistics, etc).
   1.891 +\end{itemize}
   1.892 +
    1.893 +The store is arranged as a hierarchical collection of key-value pairs.
   1.894 +Each domain has a directory hierarchy containing data related to its
   1.895 +configuration.  Domains are permitted to register for notifications
   1.896 +about changes in subtrees of the store, and to apply changes to the
   1.897 +store transactionally.
   1.898 +
   1.899 +\section{Guidelines}
   1.900 +
   1.901 +A few principles govern the operation of the store:
   1.902 +
   1.903 +\begin{itemize}
   1.904 +\item Domains should only modify the contents of their own
   1.905 +  directories.
   1.906 +\item The setup protocol for a device channel should simply consist of
   1.907 +  entering the configuration data into the store.
   1.908 +\item The store should allow device discovery without requiring the
   1.909 +  relevant device drivers to be loaded: a Xen ``bus'' should be
   1.910 +  visible to probing code in the guest.
   1.911 +\item The store should be usable for inter-tool communications,
   1.912 +  allowing the tools themselves to be decomposed into a number of
   1.913 +  smaller utilities, rather than a single monolithic entity.  This
   1.914 +  also facilitates the development of alternate user interfaces to the
   1.915 +  same functionality.
   1.916 +\end{itemize}
   1.917 +
   1.918 +\section{Store layout}
   1.919 +
   1.920 +There are three main paths in XenStore:
   1.921 +
   1.922 +\begin{description}
    1.923 +\item[/vm] stores configuration information about a domain
   1.924 +\item[/local/domain] stores information about the domain on the local node (domid, etc.)
   1.925 +\item[/tool] stores information for the various tools
   1.926 +\end{description}
   1.927 +
   1.928 +The {\tt /vm} path stores configuration information for a domain.
   1.929 +This information doesn't change and is indexed by the domain's UUID.
   1.930 +A {\tt /vm} entry contains the following information:
   1.931 +
   1.932 +\begin{description}
   1.933 +\item[ssidref] ssid reference for domain
   1.934 +\item[uuid] uuid of the domain (somewhat redundant)
   1.935 +\item[on\_reboot] the action to take on a domain reboot request (destroy or restart)
   1.936 +\item[on\_poweroff] the action to take on a domain halt request (destroy or restart)
   1.937 +\item[on\_crash] the action to take on a domain crash (destroy or restart)
   1.938 +\item[vcpus] the number of allocated vcpus for the domain
    1.939 +\item[memory] the amount of memory (in megabytes) for the domain (note: appears to sometimes be empty for domain-0)
   1.940 +\item[vcpu\_avail] the number of active vcpus for the domain (vcpus - number of disabled vcpus)
   1.941 +\item[name] the name of the domain
   1.942 +\end{description}
   1.943 +
   1.944 +
   1.945 +{\tt /vm/$<$uuid$>$/image/}
   1.946 +
   1.947 +The image path is only available for Domain-Us and contains:
   1.948 +\begin{description}
   1.949 +\item[ostype] identifies the builder type (linux or vmx)
   1.950 +\item[kernel] path to kernel on domain-0
   1.951 +\item[cmdline] command line to pass to domain-U kernel
   1.952 +\item[ramdisk] path to ramdisk on domain-0
   1.953 +\end{description}
   1.954 +
   1.955 +{\tt /local}
   1.956 +
   1.957 +The {\tt /local} path currently only contains one directory, {\tt
    1.958 +/local/domain}, which is indexed by domain id.  It contains the running
   1.959 +domain information.  The reason to have two storage areas is that
   1.960 +during migration, the uuid doesn't change but the domain id does.  The
   1.961 +{\tt /local/domain} directory can be created and populated before
    1.962 +finalizing the migration, enabling localhost-to-localhost migration.
   1.963 +
   1.964 +{\tt /local/domain/$<$domid$>$}
   1.965 +
   1.966 +This path contains:
   1.967 +
   1.968 +\begin{description}
   1.969 +\item[cpu\_time] xend start time (this is only around for domain-0)
   1.970 +\item[handle] private handle for xend
   1.971 +\item[name] see /vm
   1.972 +\item[on\_reboot] see /vm
   1.973 +\item[on\_poweroff] see /vm
   1.974 +\item[on\_crash] see /vm
   1.975 +\item[vm] the path to the VM directory for the domain
   1.976 +\item[domid] the domain id (somewhat redundant)
   1.977 +\item[running] indicates that the domain is currently running
   1.978 +\item[memory] the current memory in megabytes for the domain (empty for domain-0?)
   1.979 +\item[maxmem\_KiB] the maximum memory for the domain (in kilobytes)
   1.980 +\item[memory\_KiB] the memory allocated to the domain (in kilobytes)
   1.981 +\item[cpu] the current CPU the domain is pinned to (empty for domain-0?)
   1.982 +\item[cpu\_weight] the weight assigned to the domain
   1.983 +\item[vcpu\_avail] a bitmap telling the domain whether it may use a given VCPU
   1.984 +\item[online\_vcpus] how many vcpus are currently online
   1.985 +\item[vcpus] the total number of vcpus allocated to the domain
   1.986 +\item[console/] a directory for console information
   1.987 +  \begin{description}
   1.988 +  \item[ring-ref] the grant table reference of the console ring queue
   1.989 +  \item[port] the event channel being used for the console ring queue (local port)
    1.990 +  \item[tty] the tty on which the console data is currently exposed
   1.991 +  \item[limit] the limit (in bytes) of console data to buffer
   1.992 +  \end{description}
   1.993 +\item[backend/] a directory containing all backends the domain hosts
   1.994 +  \begin{description}
   1.995 +  \item[vbd/] a directory containing vbd backends
   1.996 +    \begin{description}
   1.997 +    \item[$<$domid$>$/] a directory containing vbd's for domid
   1.998 +      \begin{description}
   1.999 +      \item[$<$virtual-device$>$/] a directory for a particular
  1.1000 +	virtual-device on domid
  1.1001 +	\begin{description}
  1.1002 +	\item[frontend-id] domain id of frontend
  1.1003 +	\item[frontend] the path to the frontend domain
  1.1004 +	\item[physical-device] backend device number
  1.1005 +	\item[sector-size] backend sector size
  1.1006 +	\item[info] 0 read/write, 1 read-only (is this right?)
  1.1007 +	\item[domain] name of frontend domain
  1.1008 +	\item[params] parameters for device
  1.1009 +	\item[type] the type of the device
  1.1010 +	\item[dev] the virtual device (as given by the user)
  1.1011 +	\item[node] output from block creation script
  1.1012 +	\end{description}
  1.1013 +      \end{description}
  1.1014 +    \end{description}
  1.1015 +  
  1.1016 +  \item[vif/] a directory containing vif backends
  1.1017 +    \begin{description}
  1.1018 +    \item[$<$domid$>$/] a directory containing vif's for domid
  1.1019 +      \begin{description}
  1.1020 +      \item[$<$vif number$>$/] a directory for each vif
  1.1021 +      \item[frontend-id] the domain id of the frontend
  1.1022 +      \item[frontend] the path to the frontend
  1.1023 +      \item[mac] the mac address of the vif
  1.1024 +      \item[bridge] the bridge the vif is connected to
  1.1025 +      \item[handle] the handle of the vif
  1.1026 +      \item[script] the script used to create/stop the vif
  1.1027 +      \item[domain] the name of the frontend
  1.1028 +      \end{description}
  1.1029 +    \end{description}
  1.1030 +  \end{description}
  1.1031 +
  1.1032 +  \item[device/] a directory containing the frontend devices for the
  1.1033 +    domain
  1.1034 +    \begin{description}
  1.1035 +    \item[vbd/] a directory containing vbd frontend devices for the
  1.1036 +      domain
  1.1037 +      \begin{description}
  1.1038 +      \item[$<$virtual-device$>$/] a directory containing the vbd frontend for
  1.1039 +	virtual-device
  1.1040 +	\begin{description}
  1.1041 +	\item[virtual-device] the device number of the frontend device
  1.1042 +	\item[backend-id] the domain id of the backend
  1.1043 +	\item[backend] the path of the backend in the store (/local/domain
  1.1044 +	  path)
  1.1045 +	\item[ring-ref] the grant table reference for the block request
  1.1046 +	  ring queue
  1.1047 +	\item[event-channel] the event channel used for the block request
  1.1048 +	  ring queue
  1.1049 +	\end{description}
  1.1050 +	
  1.1051 +      \item[vif/] a directory containing vif frontend devices for the
  1.1052 +	domain
  1.1053 +	\begin{description}
  1.1054 +	\item[$<$id$>$/] a directory for vif id frontend device for the domain
  1.1055 +	  \begin{description}
  1.1056 +	  \item[backend-id] the backend domain id
  1.1057 +	  \item[mac] the mac address of the vif
  1.1058 +	  \item[handle] the internal vif handle
  1.1059 +	  \item[backend] a path to the backend's store entry
  1.1060 +	  \item[tx-ring-ref] the grant table reference for the transmission ring queue 
  1.1061 +	  \item[rx-ring-ref] the grant table reference for the receiving ring queue 
  1.1062 +	  \item[event-channel] the event channel used for the two ring queues 
  1.1063 +	  \end{description}
  1.1064 +	\end{description}
  1.1065 +	
   1.1066 +      \item[device-misc/] miscellaneous information for devices
  1.1067 +	\begin{description}
   1.1068 +	\item[vif/] miscellaneous information for vif devices
  1.1069 +	  \begin{description}
  1.1070 +	  \item[nextDeviceID] the next device id to use 
  1.1071 +	  \end{description}
  1.1072 +	\end{description}
  1.1073 +      \end{description}
  1.1074 +    \end{description}
  1.1075 +
  1.1076 +  \item[store/] per-domain information for the store
  1.1077 +    \begin{description}
  1.1078 +    \item[port] the event channel used for the store ring queue 
   1.1079 +    \item[ring-ref] the grant table reference used for the store's
  1.1080 +      communication channel 
  1.1081 +    \end{description}
  1.1082 +    
   1.1083 +  \item[image] private xend information
  1.1084 +\end{description}
  1.1085 +
  1.1086 +
  1.1087 +\chapter{Devices}
  1.1088 +\label{c:devices}
  1.1089 +
  1.1090 +Virtual devices under Xen are provided by a {\bf split device driver}
  1.1091 +architecture.  The illusion of the virtual device is provided by two
   1.1092 +co-operating drivers: the {\bf frontend}, which runs in the
   1.1093 +unprivileged domain, and the {\bf backend}, which runs in a domain with
  1.1094 +access to the real device hardware (often called a {\bf driver
  1.1095 +domain}; in practice domain 0 usually fulfills this function).
  1.1096 +
  1.1097 +The frontend driver appears to the unprivileged guest as if it were a
  1.1098 +real device, for instance a block or network device.  It receives IO
   1.1099 +requests from its kernel as usual; however, since it does not have
   1.1100 +access to the physical hardware of the system, it must then issue
  1.1101 +requests to the backend.  The backend driver is responsible for
  1.1102 +receiving these IO requests, verifying that they are safe and then
  1.1103 +issuing them to the real device hardware.  The backend driver appears
  1.1104 +to its kernel as a normal user of in-kernel IO functionality.  When
  1.1105 +the IO completes the backend notifies the frontend that the data is
  1.1106 +ready for use; the frontend is then able to report IO completion to
  1.1107 +its own kernel.
  1.1108 +
  1.1109 +Frontend drivers are designed to be simple; most of the complexity is
  1.1110 +in the backend, which has responsibility for translating device
  1.1111 +addresses, verifying that requests are well-formed and do not violate
  1.1112 +isolation guarantees, etc.
  1.1113 +
  1.1114 +Split drivers exchange requests and responses in shared memory, with
  1.1115 +an event channel for asynchronous notifications of activity.  When the
  1.1116 +frontend driver comes up, it uses Xenstore to set up a shared memory
  1.1117 +frame and an interdomain event channel for communications with the
  1.1118 +backend.  Once this connection is established, the two can communicate
  1.1119 +directly by placing requests / responses into shared memory and then
  1.1120 +sending notifications on the event channel.  This separation of
  1.1121 +notification from data transfer allows message batching, and results
  1.1122 +in very efficient device access.
  1.1123 +
  1.1124 +This chapter focuses on some individual split device interfaces
  1.1125 +available to Xen guests.
  1.1126 +
  1.1127 +        
  1.1128 +\section{Network I/O}
  1.1129 +
  1.1130 +Virtual network device services are provided by shared memory
  1.1131 +communication with a backend domain.  From the point of view of other
  1.1132 +domains, the backend may be viewed as a virtual ethernet switch
  1.1133 +element with each domain having one or more virtual network interfaces
  1.1134 +connected to it.
  1.1135 +
  1.1136 +From the point of view of the backend domain itself, the network
  1.1137 +backend driver consists of a number of ethernet devices.  Each of
  1.1138 +these has a logical direct connection to a virtual network device in
  1.1139 +another domain.  This allows the backend domain to route, bridge,
   1.1140 +firewall, etc.\ the traffic to / from the other domains using normal
  1.1141 +operating system mechanisms.
  1.1142 +
  1.1143 +\subsection{Backend Packet Handling}
  1.1144 +
  1.1145 +The backend driver is responsible for a variety of actions relating to
  1.1146 +the transmission and reception of packets from the physical device.
  1.1147 +With regard to transmission, the backend performs these key actions:
  1.1148 +
  1.1149 +\begin{itemize}
  1.1150 +\item {\bf Validation:} To ensure that domains do not attempt to
  1.1151 +  generate invalid (e.g. spoofed) traffic, the backend driver may
   1.1152 +  validate headers, ensuring that source MAC and IP addresses match
   1.1153 +  the interface from which they were sent.
  1.1154 +
  1.1155 +  Validation functions can be configured using standard firewall rules
  1.1156 +  ({\small{\tt iptables}} in the case of Linux).
  1.1157 +  
  1.1158 +\item {\bf Scheduling:} Since a number of domains can share a single
  1.1159 +  physical network interface, the backend must mediate access when
  1.1160 +  several domains each have packets queued for transmission.  This
  1.1161 +  general scheduling function subsumes basic shaping or rate-limiting
  1.1162 +  schemes.
  1.1163 +  
  1.1164 +\item {\bf Logging and Accounting:} The backend domain can be
  1.1165 +  configured with classifier rules that control how packets are
  1.1166 +  accounted or logged.  For example, log messages might be generated
  1.1167 +  whenever a domain attempts to send a TCP packet containing a SYN.
  1.1168 +\end{itemize}
  1.1169 +
  1.1170 +On receipt of incoming packets, the backend acts as a simple
  1.1171 +demultiplexer: Packets are passed to the appropriate virtual interface
  1.1172 +after any necessary logging and accounting have been carried out.
  1.1173 +
  1.1174 +\subsection{Data Transfer}
  1.1175 +
  1.1176 +Each virtual interface uses two ``descriptor rings'', one for
  1.1177 +transmit, the other for receive.  Each descriptor identifies a block
  1.1178 +of contiguous machine memory allocated to the domain.
  1.1179 +
  1.1180 +The transmit ring carries packets to transmit from the guest to the
  1.1181 +backend domain.  The return path of the transmit ring carries messages
  1.1182 +indicating that the contents have been physically transmitted and the
  1.1183 +backend no longer requires the associated pages of memory.
  1.1184 +
  1.1185 +To receive packets, the guest places descriptors of unused pages on
  1.1186 +the receive ring.  The backend will return received packets by
  1.1187 +exchanging these pages in the domain's memory with new pages
  1.1188 +containing the received data, and passing back descriptors regarding
  1.1189 +the new packets on the ring.  This zero-copy approach allows the
  1.1190 +backend to maintain a pool of free pages to receive packets into, and
  1.1191 +then deliver them to appropriate domains after examining their
  1.1192 +headers.
  1.1193 +
  1.1194 +% Real physical addresses are used throughout, with the domain
  1.1195 +% performing translation from pseudo-physical addresses if that is
  1.1196 +% necessary.
  1.1197 +
  1.1198 +If a domain does not keep its receive ring stocked with empty buffers
  1.1199 +then packets destined to it may be dropped.  This provides some
  1.1200 +defence against receive livelock problems because an overloaded domain
  1.1201 +will cease to receive further data.  Similarly, on the transmit path,
  1.1202 +it provides the application with feedback on the rate at which packets
  1.1203 +are able to leave the system.
  1.1204 +
  1.1205 +Flow control on rings is achieved by including a pair of producer
  1.1206 +indexes on the shared ring page.  Each side will maintain a private
  1.1207 +consumer index indicating the next outstanding message.  In this
  1.1208 +manner, the domains cooperate to divide the ring into two message
  1.1209 +lists, one in each direction.  Notification is decoupled from the
  1.1210 +immediate placement of new messages on the ring; the event channel
  1.1211 +will be used to generate notification when {\em either} a certain
  1.1212 +number of outstanding messages are queued, {\em or} a specified number
  1.1213 +of nanoseconds have elapsed since the oldest message was placed on the
  1.1214 +ring.
  1.1215 +
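          +As an illustration only, the following sketch shows how a frontend
          +might consume responses using the shared producer index and its own
          +private consumer index.  The structure and helper names here are
          +invented for the example and do not correspond exactly to the real
          +ring definitions in {\tt xen/include/public/io/}.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Hypothetical shared ring layout: the two producer indexes live in
          + * the shared page, one per direction; consumer indexes are private. */
          +typedef struct shared_ring {
          +    unsigned int req_prod;      /* advanced by the request producer  */
          +    unsigned int rsp_prod;      /* advanced by the response producer */
          +    /* ... ring entries follow ... */
          +} shared_ring_t;
          +
          +static unsigned int rsp_cons;   /* frontend-private consumer index */
          +
          +/* Consume any responses the backend has produced since the last run. */
          +static void process_responses(shared_ring_t *ring)
          +{
          +    while (rsp_cons != ring->rsp_prod) {
          +        /* handle_response(ring, rsp_cons); */
          +        rsp_cons++;   /* free-running index; mask by ring size on use */
          +    }
          +    /* Notification via the event channel follows the batching policy
          +     * described above, rather than accompanying every message.      */
          +}
          +\end{verbatim}
          +\normalsize
          +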
  1.1216 +%% Not sure if my version is any better -- here is what was here
  1.1217 +%% before: Synchronization between the backend domain and the guest is
  1.1218 +%% achieved using counters held in shared memory that is accessible to
  1.1219 +%% both.  Each ring has associated producer and consumer indices
  1.1220 +%% indicating the area in the ring that holds descriptors that contain
  1.1221 +%% data.  After receiving {\it n} packets or {\t nanoseconds} after
  1.1222 +%% receiving the first packet, the hypervisor sends an event to the
  1.1223 +%% domain.
  1.1224 +
  1.1225 +
  1.1226 +\subsection{Network ring interface}
  1.1227 +
  1.1228 +The network device uses two shared memory rings for communication: one
   1.1229 +for transmit, one for receive.
  1.1230 +
  1.1231 +Transmit requests are described by the following structure:
  1.1232 +
  1.1233 +\scriptsize
  1.1234 +\begin{verbatim}
  1.1235 +typedef struct netif_tx_request {
  1.1236 +    grant_ref_t gref;      /* Reference to buffer page */
  1.1237 +    uint16_t offset;       /* Offset within buffer page */
  1.1238 +    uint16_t flags;        /* NETTXF_* */
  1.1239 +    uint16_t id;           /* Echoed in response message. */
  1.1240 +    uint16_t size;         /* Packet size in bytes.       */
  1.1241 +} netif_tx_request_t;
  1.1242 +\end{verbatim}
  1.1243 +\normalsize
  1.1244 +
  1.1245 +\begin{description}
  1.1246 +\item[gref] Grant reference for the network buffer
  1.1247 +\item[offset] Offset to data
  1.1248 +\item[flags] Transmit flags (currently only NETTXF\_csum\_blank is
  1.1249 +  supported, to indicate that the protocol checksum field is
  1.1250 +  incomplete).
  1.1251 +\item[id] Echoed to guest by the backend in the ring-level response so
  1.1252 +  that the guest can match it to this request
  1.1253 +\item[size] Buffer size
  1.1254 +\end{description}
  1.1255 +
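          +For illustration, a frontend might queue a single-page packet for
          +transmission roughly as follows.  This is a simplified sketch: the
          +ring and grant helpers ({\tt ring\_next\_tx\_slot()},
          +{\tt grant\_access\_to\_backend()}) are hypothetical, and a real
          +driver must also manage the producer index, flow control and
          +event-channel notification.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Hypothetical helpers; real drivers use their own ring/grant code. */
          +extern netif_tx_request_t *ring_next_tx_slot(void);
          +extern grant_ref_t grant_access_to_backend(void *page);
          +
          +static uint16_t next_id;
          +
          +/* Queue one packet, contained within a single page, for transmission. */
          +static void queue_tx_packet(void *page, uint16_t offset, uint16_t len)
          +{
          +    netif_tx_request_t *req = ring_next_tx_slot();
          +
          +    req->gref   = grant_access_to_backend(page); /* buffer grant ref   */
          +    req->offset = offset;                    /* data offset in page    */
          +    req->size   = len;                       /* packet size in bytes   */
          +    req->flags  = 0;                         /* or NETTXF_csum_blank   */
          +    req->id     = next_id++;                 /* echoed in the response */
          +
          +    /* Advance the shared producer index and notify the backend over
          +     * the event channel if required (omitted here).                 */
          +}
          +\end{verbatim}
          +\normalsize
          +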
  1.1256 +Each transmit request is followed by a transmit response at some later
  1.1257 +date.  This is part of the shared-memory communication protocol and
  1.1258 +allows the guest to (potentially) retire internal structures related
  1.1259 +to the request.  It does not imply a network-level response.  This
  1.1260 +structure is as follows:
  1.1261 +
  1.1262 +\scriptsize
  1.1263 +\begin{verbatim}
  1.1264 +typedef struct netif_tx_response {
  1.1265 +    uint16_t id;
  1.1266 +    int16_t  status;
  1.1267 +} netif_tx_response_t;
  1.1268 +\end{verbatim}
  1.1269 +\normalsize
  1.1270 +
  1.1271 +\begin{description}
  1.1272 +\item[id] Echo of the ID field in the corresponding transmit request.
  1.1273 +\item[status] Success / failure status of the transmit request.
  1.1274 +\end{description}
  1.1275 +
  1.1276 +Receive requests must be queued by the frontend, accompanied by a
  1.1277 +donation of page-frames to the backend.  The backend transfers page
   1.1278 +frames full of data back to the guest.
  1.1279 +
  1.1280 +\scriptsize
  1.1281 +\begin{verbatim}
  1.1282 +typedef struct {
  1.1283 +    uint16_t    id;        /* Echoed in response message.        */
  1.1284 +    grant_ref_t gref;      /* Reference to incoming granted frame */
  1.1285 +} netif_rx_request_t;
  1.1286 +\end{verbatim}
  1.1287 +\normalsize
  1.1288 +
  1.1289 +\begin{description}
   1.1290 +\item[id] Echoed by the backend in its response so that the frontend
   1.1291 +  can identify this request.
  1.1292 +\item[gref] Transfer reference - the backend will use this reference
  1.1293 +  to transfer a frame of network data to us.
  1.1294 +\end{description}
  1.1295 +
  1.1296 +Receive response descriptors are queued for each received frame.  Note
  1.1297 +that these may only be queued in reply to an existing receive request,
  1.1298 +providing an in-built form of traffic throttling.
  1.1299 +
  1.1300 +\scriptsize
  1.1301 +\begin{verbatim}
  1.1302 +typedef struct {
  1.1303 +    uint16_t id;
  1.1304 +    uint16_t offset;       /* Offset in page of start of received packet  */
  1.1305 +    uint16_t flags;        /* NETRXF_* */
  1.1306 +    int16_t  status;       /* -ve: BLKIF_RSP_* ; +ve: Rx'ed pkt size. */
  1.1307 +} netif_rx_response_t;
  1.1308 +\end{verbatim}
  1.1309 +\normalsize
  1.1310 +
  1.1311 +\begin{description}
  1.1312 +\item[id] ID echoed from the original request, used by the guest to
  1.1313 +  match this response to the original request.
  1.1314 +\item[offset] Offset to data within the transferred frame.
   1.1315 +\item[flags] Receive flags (currently only NETRXF\_csum\_valid is
  1.1316 +  supported, to indicate that the protocol checksum field has already
  1.1317 +  been validated).
  1.1318 +\item[status] Success / error status for this operation.
  1.1319 +\end{description}
  1.1320 +
  1.1321 +Note that the receive protocol includes a mechanism for guests to
  1.1322 +receive incoming memory frames but there is no explicit transfer of
  1.1323 +frames in the other direction.  Guests are expected to return memory
  1.1324 +to the hypervisor in order to use the network interface.  They {\em
  1.1325 +must} do this or they will exceed their maximum memory reservation and
  1.1326 +will not be able to receive incoming frame transfers.  When necessary,
  1.1327 +the backend is able to replenish its pool of free network buffers by
  1.1328 +claiming some of this free memory from the hypervisor.
  1.1329 +
  1.1330 +\section{Block I/O}
  1.1331 +
  1.1332 +All guest OS disk access goes through the virtual block device VBD
  1.1333 +interface.  This interface allows domains access to portions of block
   1.1334 +storage devices visible to the block backend device.  The VBD
  1.1335 +interface is a split driver, similar to the network interface
  1.1336 +described above.  A single shared memory ring is used between the
  1.1337 +frontend and backend drivers for each virtual device, across which
  1.1338 +IO requests and responses are sent.
  1.1339 +
  1.1340 +Any block device accessible to the backend domain, including
  1.1341 +network-based block (iSCSI, *NBD, etc), loopback and LVM/MD devices,
  1.1342 +can be exported as a VBD.  Each VBD is mapped to a device node in the
  1.1343 +guest, specified in the guest's startup configuration.
  1.1344 +
  1.1345 +\subsection{Data Transfer}
  1.1346 +
  1.1347 +The per-(virtual)-device ring between the guest and the block backend
  1.1348 +supports two messages:
  1.1349 +
  1.1350 +\begin{description}
  1.1351 +\item [{\small {\tt READ}}:] Read data from the specified block
   1.1352 +  device.  The frontend identifies the device and location to read
  1.1353 +  from and attaches pages for the data to be copied to (typically via
  1.1354 +  DMA from the device).  The backend acknowledges completed read
  1.1355 +  requests as they finish.
  1.1356 +
  1.1357 +\item [{\small {\tt WRITE}}:] Write data to the specified block
  1.1358 +  device.  This functions essentially as {\small {\tt READ}}, except
  1.1359 +  that the data moves to the device instead of from it.
  1.1360 +\end{description}
  1.1361 +
  1.1362 +%% Rather than copying data, the backend simply maps the domain's
  1.1363 +%% buffers in order to enable direct DMA to them.  The act of mapping
  1.1364 +%% the buffers also increases the reference counts of the underlying
  1.1365 +%% pages, so that the unprivileged domain cannot try to return them to
  1.1366 +%% the hypervisor, install them as page tables, or any other unsafe
  1.1367 +%% behaviour.
  1.1368 +%%
  1.1369 +%% % block API here
  1.1370 +
  1.1371 +\subsection{Block ring interface}
  1.1372 +
  1.1373 +The block interface is defined by the structures passed over the
  1.1374 +shared memory interface.  These structures are either requests (from
  1.1375 +the frontend to the backend) or responses (from the backend to the
  1.1376 +frontend).
  1.1377 +
  1.1378 +The request structure is defined as follows:
  1.1379 +
  1.1380 +\scriptsize
  1.1381 +\begin{verbatim}
  1.1382 +typedef struct blkif_request {
  1.1383 +    uint8_t        operation;    /* BLKIF_OP_???                         */
  1.1384 +    uint8_t        nr_segments;  /* number of segments                   */
  1.1385 +    blkif_vdev_t   handle;       /* only for read/write requests         */
  1.1386 +    uint64_t       id;           /* private guest value, echoed in resp  */
  1.1387 +    blkif_sector_t sector_number;/* start sector idx on disk (r/w only)  */
  1.1388 +    struct blkif_request_segment {
  1.1389 +        grant_ref_t gref;        /* reference to I/O buffer frame        */
  1.1390 +        /* @first_sect: first sector in frame to transfer (inclusive).   */
  1.1391 +        /* @last_sect: last sector in frame to transfer (inclusive).     */
  1.1392 +        uint8_t     first_sect, last_sect;
  1.1393 +    } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST];
  1.1394 +} blkif_request_t;
  1.1395 +\end{verbatim}
  1.1396 +\normalsize
  1.1397 +
  1.1398 +The fields are as follows:
  1.1399 +
  1.1400 +\begin{description}
  1.1401 +\item[operation] operation ID: one of the operations described above
  1.1402 +\item[nr\_segments] number of segments for scatter / gather IO
  1.1403 +  described by this request
  1.1404 +\item[handle] identifier for a particular virtual device on this
  1.1405 +  interface
  1.1406 +\item[id] this value is echoed in the response message for this IO;
  1.1407 +  the guest may use it to identify the original request
   1.1408 +\item[sector\_number] start sector on the virtual device for this
  1.1409 +  request
   1.1410 +\item[seg] This array contains structures encoding
  1.1411 +  scatter-gather IO to be performed:
  1.1412 +  \begin{description}
  1.1413 +  \item[gref] The grant reference for the foreign I/O buffer page.
  1.1414 +  \item[first\_sect] First sector to access within the buffer page (0 to 7).
  1.1415 +  \item[last\_sect] Last sector to access within the buffer page (0 to 7).
  1.1416 +  \end{description}
  1.1417 +  Data will be transferred into frames at an offset determined by the
  1.1418 +  value of {\tt first\_sect}.
  1.1419 +\end{description}
  1.1420 +
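          +As a sketch only, a frontend might construct a one-segment read
          +request as follows.  The ring and grant helpers are hypothetical;
          +{\tt BLKIF\_OP\_READ} is assumed to be the read operation code from
          +the public block interface headers.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Hypothetical helpers; real drivers use their own ring/grant code. */
          +extern blkif_request_t *ring_next_request_slot(void);
          +extern grant_ref_t grant_access_to_backend(void *page);
          +
          +/* Read eight sectors (one 4kB page) starting at 'sector' into 'page'. */
          +static void queue_read(blkif_vdev_t handle, blkif_sector_t sector,
          +                       void *page, uint64_t id)
          +{
          +    blkif_request_t *req = ring_next_request_slot();
          +
          +    req->operation     = BLKIF_OP_READ;   /* assumed read op code     */
          +    req->handle        = handle;          /* which virtual device     */
          +    req->id            = id;              /* echoed in the response   */
          +    req->sector_number = sector;          /* start sector on the VBD  */
          +    req->nr_segments   = 1;               /* one scatter/gather entry */
          +
          +    req->seg[0].gref       = grant_access_to_backend(page);
          +    req->seg[0].first_sect = 0;           /* whole page: sectors 0..7 */
          +    req->seg[0].last_sect  = 7;
          +
          +    /* Advance the ring producer index and notify the backend
          +     * (omitted here).                                          */
          +}
          +\end{verbatim}
          +\normalsize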
  1.1421 +
  1.1422 +\chapter{Further Information}
  1.1423 +
  1.1424 +If you have questions that are not answered by this manual, the
  1.1425 +sources of information listed below may be of interest to you.  Note
  1.1426 +that bug reports, suggestions and contributions related to the
  1.1427 +software (or the documentation) should be sent to the Xen developers'
  1.1428 +mailing list (address below).
  1.1429 +
  1.1430 +
  1.1431 +\section{Other documentation}
  1.1432 +
  1.1433 +If you are mainly interested in using (rather than developing for)
  1.1434 +Xen, the \emph{Xen Users' Manual} is distributed in the {\tt docs/}
  1.1435 +directory of the Xen source distribution.
  1.1436 +
  1.1437 +% Various HOWTOs are also available in {\tt docs/HOWTOS}.
  1.1438 +
  1.1439 +
  1.1440 +\section{Online references}
  1.1441 +
  1.1442 +The official Xen web site can be found at:
  1.1443 +\begin{quote} {\tt http://www.xensource.com}
  1.1444 +\end{quote}
  1.1445 +
  1.1446 +
  1.1447 +This contains links to the latest versions of all online
  1.1448 +documentation, including the latest version of the FAQ.
  1.1449 +
  1.1450 +Information regarding Xen is also available at the Xen Wiki at
  1.1451 +\begin{quote} {\tt http://wiki.xensource.com/xenwiki/}\end{quote}
  1.1452 +The Xen project uses Bugzilla as its bug tracking system. You'll find
   1.1453 +the Xen Bugzilla at {\tt http://bugzilla.xensource.com/bugzilla/}.
  1.1454 +
  1.1455 +
  1.1456 +\section{Mailing lists}
  1.1457 +
  1.1458 +There are several mailing lists that are used to discuss Xen related
  1.1459 +topics. The most widely relevant are listed below. An official page of
  1.1460 +mailing lists and subscription information can be found at \begin{quote}
  1.1461 +  {\tt http://lists.xensource.com/} \end{quote}
  1.1462 +
  1.1463 +\begin{description}
  1.1464 +\item[xen-devel@lists.xensource.com] Used for development
  1.1465 +  discussions and bug reports.  Subscribe at: \\
  1.1466 +  {\small {\tt http://lists.xensource.com/xen-devel}}
  1.1467 +\item[xen-users@lists.xensource.com] Used for installation and usage
  1.1468 +  discussions and requests for help.  Subscribe at: \\
  1.1469 +  {\small {\tt http://lists.xensource.com/xen-users}}
  1.1470 +\item[xen-announce@lists.xensource.com] Used for announcements only.
  1.1471 +  Subscribe at: \\
  1.1472 +  {\small {\tt http://lists.xensource.com/xen-announce}}
  1.1473 +\item[xen-changelog@lists.xensource.com] Changelog feed
  1.1474 +  from the unstable and 2.0 trees - developer oriented.  Subscribe at: \\
  1.1475 +  {\small {\tt http://lists.xensource.com/xen-changelog}}
  1.1476 +\end{description}
  1.1477  
  1.1478  \appendix
  1.1479  
  1.1480 -%% chapter hypercalls moved to hypercalls.tex
  1.1481 -\include{src/interface/hypercalls}
  1.1482 +
  1.1483 +\chapter{Xen Hypercalls}
  1.1484 +\label{a:hypercalls}
  1.1485 +
  1.1486 +Hypercalls represent the procedural interface to Xen; this appendix 
  1.1487 +categorizes and describes the current set of hypercalls. 
  1.1488 +
  1.1489 +\section{Invoking Hypercalls} 
  1.1490 +
  1.1491 +Hypercalls are invoked in a manner analogous to system calls in a
  1.1492 +conventional operating system; a software interrupt is issued which
  1.1493 +vectors to an entry point within Xen. On x86/32 machines the
   1.1494 +instruction required is {\tt int \$0x82}; the (real) IDT is set up so
  1.1495 +that this may only be issued from within ring 1. The particular 
  1.1496 +hypercall to be invoked is contained in {\tt EAX} --- a list 
  1.1497 +mapping these values to symbolic hypercall names can be found 
  1.1498 +in {\tt xen/include/public/xen.h}. 
  1.1499 +
  1.1500 +On some occasions a set of hypercalls will be required to carry
  1.1501 +out a higher-level function; a good example is when a guest 
   1.1502 +operating system wishes to context switch to a new process which 
  1.1503 +requires updating various privileged CPU state. As an optimization
  1.1504 +for these cases, there is a generic mechanism to issue a set of 
  1.1505 +hypercalls as a batch: 
  1.1506 +
  1.1507 +\begin{quote}
  1.1508 +\hypercall{multicall(void *call\_list, int nr\_calls)}
  1.1509 +
  1.1510 +Execute a series of hypervisor calls; {\tt nr\_calls} is the length of
   1.1511 +the array of {\tt multicall\_entry\_t} structures pointed to by {\tt
  1.1512 +call\_list}. Each entry contains the hypercall operation code followed
  1.1513 +by up to 7 word-sized arguments.
  1.1514 +\end{quote}
  1.1515 +
  1.1516 +Note that multicalls are provided purely as an optimization; there is
  1.1517 +no requirement to use them when first porting a guest operating
  1.1518 +system.
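          +
          +As an example, a guest might batch two small hypercalls into a single
          +multicall.  This sketch assumes the XenLinux-style
          +{\tt HYPERVISOR\_multicall()} wrapper and the {\tt multicall\_entry\_t}
          +layout described above (an operation code followed by an argument
          +array); consult {\tt xen/include/public/xen.h} for the authoritative
          +definition.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Sketch: issue two hypercalls with a single trap into Xen. */
          +static void example_multicall(unsigned long new_ss, unsigned long new_esp)
          +{
          +    multicall_entry_t calls[2];
          +
          +    calls[0].op      = __HYPERVISOR_fpu_taskswitch;
          +    calls[0].args[0] = 1;                 /* set the TS bit            */
          +
          +    calls[1].op      = __HYPERVISOR_stack_switch;
          +    calls[1].args[0] = new_ss;            /* new kernel stack segment  */
          +    calls[1].args[1] = new_esp;           /* new kernel stack pointer  */
          +
          +    HYPERVISOR_multicall(calls, 2);       /* one trap instead of two   */
          +}
          +\end{verbatim}
          +\normalsize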
  1.1519  
  1.1520  
  1.1521 -%% 
  1.1522 -%% XXX SMH: not really sure how useful below is -- if it's still 
  1.1523 -%% actually true, might be useful for someone wanting to write a 
  1.1524 -%% new scheduler... not clear how many of them there are...
  1.1525 -%%
  1.1526 +\section{Virtual CPU Setup} 
  1.1527  
  1.1528 -%% \include{src/interface/scheduling}
  1.1529 -%% scheduling information moved to scheduling.tex
  1.1530 -%% still commented out
   1.1531 +At start of day, a guest operating system needs to set up the virtual
  1.1532 +CPU it is executing on. This includes installing vectors for the
  1.1533 +virtual IDT so that the guest OS can handle interrupts, page faults,
   1.1534 +etc.  However, the very first thing a guest OS must set up is a pair 
  1.1535 +of hypervisor callbacks: these are the entry points which Xen will
  1.1536 +use when it wishes to notify the guest OS of an occurrence. 
  1.1537 +
  1.1538 +\begin{quote}
  1.1539 +\hypercall{set\_callbacks(unsigned long event\_selector, unsigned long
  1.1540 +  event\_address, unsigned long failsafe\_selector, unsigned long
  1.1541 +  failsafe\_address) }
  1.1542 +
  1.1543 +Register the normal (``event'') and failsafe callbacks for 
  1.1544 +event processing. In each case the code segment selector and 
  1.1545 +address within that segment are provided. The selectors must
  1.1546 +have RPL 1; in XenLinux we simply use the kernel's CS for both 
  1.1547 +{\tt event\_selector} and {\tt failsafe\_selector}.
  1.1548 +
   1.1549 +The value {\tt event\_address} specifies the address of the guest OS's
  1.1550 +event handling and dispatch routine; the {\tt failsafe\_address}
  1.1551 +specifies a separate entry point which is used only if a fault occurs
  1.1552 +when Xen attempts to use the normal callback. 
  1.1553 +
  1.1554 +\end{quote} 
  1.1555 +
  1.1556 +On x86/64 systems the hypercall takes slightly different
   1.1557 +arguments. This is because a callback CS does not need to be specified
   1.1558 +(since the callbacks are entered via SYSRET), and also because an
  1.1559 +entry address needs to be specified for SYSCALLs from guest user
  1.1560 +space:
  1.1561 +
  1.1562 +\begin{quote}
  1.1563 +\hypercall{set\_callbacks(unsigned long event\_address, unsigned long
  1.1564 +  failsafe\_address, unsigned long syscall\_address)}
  1.1565 +\end{quote} 
  1.1566  
  1.1567  
  1.1568 +After installing the hypervisor callbacks, the guest OS can 
  1.1569 +install a `virtual IDT' by using the following hypercall: 
  1.1570  
  1.1571 -%%
  1.1572 -%% XXX SMH: we probably should have something in here on debugging 
  1.1573 -%% etc; this is a kinda developers manual and many devs seem to 
  1.1574 -%% like debugging support :^) 
  1.1575 -%% Possibly sanitize below, else wait until new xendbg stuff is in 
  1.1576 -%% (and/or kip's stuff?) and write about that instead? 
  1.1577 -%%
  1.1578 +\begin{quote} 
  1.1579 +\hypercall{set\_trap\_table(trap\_info\_t *table)} 
  1.1580  
  1.1581 -%% \include{src/interface/debugging}
  1.1582 -%% debugging information moved to debugging.tex
  1.1583 -%% still commented out
  1.1584 +Install one or more entries into the per-domain 
  1.1585 +trap handler table (essentially a software version of the IDT). 
  1.1586 +Each entry in the array pointed to by {\tt table} includes the 
  1.1587 +exception vector number with the corresponding segment selector 
  1.1588 +and entry point. Most guest OSes can use the same handlers on 
  1.1589 +Xen as when running on the real hardware.
  1.1590 +
  1.1591 +
  1.1592 +\end{quote} 
  1.1593 +
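          +For example, a guest might install handlers for the divide-error and
          +page-fault vectors as shown below.  This is a sketch: it assumes the
          +{\tt trap\_info\_t} layout from {\tt xen/include/public/xen.h}
          +(vector, flags, code segment selector, handler address), a
          +XenLinux-style {\tt HYPERVISOR\_set\_trap\_table()} wrapper, and
          +handler symbols and a kernel code segment provided by the guest.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Sketch: a (partial) virtual IDT, terminated by an all-zero entry.
          + * GUEST_KERNEL_CS and the handler functions are assumed to be
          + * defined elsewhere in the guest kernel.                           */
          +static trap_info_t trap_table[] = {
          +    {  0, 0, GUEST_KERNEL_CS, (unsigned long)divide_error },
          +    { 14, 0, GUEST_KERNEL_CS, (unsigned long)page_fault   },
          +    {  0, 0, 0, 0 }                        /* terminator */
          +};
          +
          +static void install_virtual_idt(void)
          +{
          +    HYPERVISOR_set_trap_table(trap_table);
          +}
          +\end{verbatim}
          +\normalsize
          +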
  1.1594 +A further hypercall is provided for the management of virtual CPUs:
  1.1595 +
  1.1596 +\begin{quote}
  1.1597 +\hypercall{vcpu\_op(int cmd, int vcpuid, void *extra\_args)}
  1.1598 +
  1.1599 +This hypercall can be used to bootstrap VCPUs, to bring them up and
  1.1600 +down and to test their current status.
  1.1601 +
  1.1602 +\end{quote}
  1.1603 +
  1.1604 +\section{Scheduling and Timer}
  1.1605 +
  1.1606 +Domains are preemptively scheduled by Xen according to the 
  1.1607 +parameters installed by domain 0 (see Section~\ref{s:dom0ops}). 
  1.1608 +In addition, however, a domain may choose to explicitly 
  1.1609 +control certain behavior with the following hypercall: 
  1.1610 +
  1.1611 +\begin{quote} 
  1.1612 +\hypercall{sched\_op(unsigned long op)} 
  1.1613 +
   1.1614 +Request a scheduling operation from the hypervisor. The options are: {\it
  1.1615 +SCHEDOP\_yield}, {\it SCHEDOP\_block}, and {\it SCHEDOP\_shutdown}.
  1.1616 +{\it yield} keeps the calling domain runnable but may cause a
  1.1617 +reschedule if other domains are runnable.  {\it block} removes the
   1.1618 +calling domain from the run queue and causes it to sleep until an
  1.1619 +event is delivered to it.  {\it shutdown} is used to end the domain's
  1.1620 +execution; the caller can additionally specify whether the domain
  1.1621 +should reboot, halt or suspend.
  1.1622 +\end{quote} 
  1.1623 +
  1.1624 +To aid the implementation of a process scheduler within a guest OS,
  1.1625 +Xen provides a virtual programmable timer:
  1.1626 +
  1.1627 +\begin{quote}
  1.1628 +\hypercall{set\_timer\_op(uint64\_t timeout)} 
  1.1629 +
  1.1630 +Request a timer event to be sent at the specified system time (time 
  1.1631 +in nanoseconds since system boot). The hypercall actually passes the 
  1.1632 +64-bit timeout value as a pair of 32-bit values. 
  1.1633 +
  1.1634 +\end{quote} 
  1.1635 +
  1.1636 +Note that calling {\tt set\_timer\_op()} prior to {\tt sched\_op} 
  1.1637 +allows block-with-timeout semantics. 
  1.1638 +
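          +For instance, a guest idle loop might implement ``sleep until the next
          +timer deadline or the next event'' roughly as follows; the
          +{\tt HYPERVISOR\_*} wrappers are assumed to be provided by the guest
          +kernel (as in XenLinux).
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Sketch: block the calling VCPU until an event arrives or the given
          + * deadline (in nanoseconds of system time) passes.                  */
          +static void block_until(uint64_t deadline_ns)
          +{
          +    HYPERVISOR_set_timer_op(deadline_ns);  /* arm the one-shot timer   */
          +    HYPERVISOR_sched_op(SCHEDOP_block);    /* sleep until next event   */
          +    /* On return, an event (possibly the timer event) was delivered.  */
          +}
          +\end{verbatim}
          +\normalsize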
  1.1639 +
  1.1640 +\section{Page Table Management} 
  1.1641 +
  1.1642 +Since guest operating systems have read-only access to their page 
  1.1643 +tables, Xen must be involved when making any changes. The following
  1.1644 +multi-purpose hypercall can be used to modify page-table entries, 
  1.1645 +update the machine-to-physical mapping table, flush the TLB, install 
  1.1646 +a new page-table base pointer, and more.
  1.1647 +
  1.1648 +\begin{quote} 
  1.1649 +\hypercall{mmu\_update(mmu\_update\_t *req, int count, int *success\_count)} 
  1.1650 +
   1.1651 +Update the page table for the domain; a set of {\tt count} updates is
  1.1652 +submitted for processing in a batch, with {\tt success\_count} being 
  1.1653 +updated to report the number of successful updates.  
  1.1654 +
  1.1655 +Each element of {\tt req[]} contains a pointer (address) and value; 
   1.1656 +the least significant two bits of the pointer are used to distinguish 
  1.1657 +the type of update requested as follows:
  1.1658 +\begin{description} 
  1.1659 +
  1.1660 +\item[\it MMU\_NORMAL\_PT\_UPDATE:] update a page directory entry or
  1.1661 +page table entry to the associated value; Xen will check that the
  1.1662 +update is safe, as described in Chapter~\ref{c:memory}.
  1.1663 +
  1.1664 +\item[\it MMU\_MACHPHYS\_UPDATE:] update an entry in the
  1.1665 +  machine-to-physical table. The calling domain must own the machine
  1.1666 +  page in question (or be privileged).
  1.1667 +\end{description}
  1.1668 +
  1.1669 +\end{quote}
  1.1670 +
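          +As an illustration, a guest might batch several PTE writes into one
          +{\tt mmu\_update()} call.  This sketch follows the signature given
          +above, assumes the two-field {\tt mmu\_update\_t} (pointer and value)
          +and a XenLinux-style {\tt HYPERVISOR\_mmu\_update()} wrapper, and
          +takes the machine addresses of the PTEs to modify as input.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Sketch: update up to 16 page-table entries in a single batch.  Each
          + * request carries the machine address of a PTE, tagged in its low
          + * bits with MMU_NORMAL_PT_UPDATE, plus the new PTE value.            */
          +static int update_ptes(uint64_t *pte_maddrs, uint64_t *new_vals, int n)
          +{
          +    mmu_update_t req[16];
          +    int i, done = 0;
          +
          +    for (i = 0; i < n && i < 16; i++) {
          +        req[i].ptr = pte_maddrs[i] | MMU_NORMAL_PT_UPDATE;
          +        req[i].val = new_vals[i];
          +    }
          +
          +    HYPERVISOR_mmu_update(req, i, &done);
          +    return done;               /* number of updates Xen accepted */
          +}
          +\end{verbatim}
          +\normalsize
          +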
  1.1671 +Explicitly updating batches of page table entries is extremely
  1.1672 +efficient, but can require a number of alterations to the guest
  1.1673 +OS. Using the writable page table mode (Chapter~\ref{c:memory}) is
  1.1674 +recommended for new OS ports.
  1.1675 +
  1.1676 +Regardless of which page table update mode is being used, however,
  1.1677 +there are some occasions (notably handling a demand page fault) where
  1.1678 +a guest OS will wish to modify exactly one PTE rather than a
  1.1679 +batch, and where that PTE is mapped into the current address space.
  1.1680 +This is catered for by the following:
  1.1681 +
  1.1682 +\begin{quote} 
  1.1683 +\hypercall{update\_va\_mapping(unsigned long va, uint64\_t val,
  1.1684 +                         unsigned long flags)}
  1.1685 +
  1.1686 +Update the currently installed PTE that maps virtual address {\tt va}
  1.1687 +to new value {\tt val}. As with {\tt mmu\_update()}, Xen checks the
  1.1688 +modification  is safe before applying it. The {\tt flags} determine
  1.1689 +which kind of TLB flush, if any, should follow the update. 
  1.1690 +
  1.1691 +\end{quote} 
  1.1692 +
  1.1693 +Finally, sufficiently privileged domains may occasionally wish to manipulate 
  1.1694 +the pages of others: 
  1.1695 +
  1.1696 +\begin{quote}
   1.1697 +\hypercall{update\_va\_mapping\_otherdomain(unsigned long va, uint64\_t val,
  1.1698 +                         unsigned long flags, domid\_t domid)}
  1.1699 +
  1.1700 +Identical to {\tt update\_va\_mapping()} save that the pages being
  1.1701 +mapped must belong to the domain {\tt domid}. 
  1.1702 +
  1.1703 +\end{quote}
  1.1704 +
  1.1705 +An additional MMU hypercall provides an ``extended command''
  1.1706 +interface.  This provides additional functionality beyond the basic
  1.1707 +table updating commands:
  1.1708 +
  1.1709 +\begin{quote}
  1.1710 +
  1.1711 +\hypercall{mmuext\_op(struct mmuext\_op *op, int count, int *success\_count, domid\_t domid)}
  1.1712 +
  1.1713 +This hypercall is used to perform additional MMU operations.  These
  1.1714 +include updating {\tt cr3} (or just re-installing it for a TLB flush),
  1.1715 +requesting various kinds of TLB flush, flushing the cache, installing
  1.1716 +a new LDT, or pinning \& unpinning page-table pages (to ensure their
  1.1717 +reference count doesn't drop to zero which would require a
  1.1718 +revalidation of all entries).  Some of the operations available are
  1.1719 +restricted to domains with sufficient system privileges.
  1.1720 +
  1.1721 +It is also possible for privileged domains to reassign page ownership
  1.1722 +via an extended MMU operation, although grant tables are used instead
  1.1723 +of this where possible; see Section~\ref{s:idc}.
  1.1724 +
  1.1725 +\end{quote}
  1.1726 +
  1.1727 +Finally, a hypercall interface is exposed to activate and deactivate
  1.1728 +various optional facilities provided by Xen for memory management.
  1.1729 +
  1.1730 +\begin{quote} 
  1.1731 +\hypercall{vm\_assist(unsigned int cmd, unsigned int type)}
  1.1732 +
  1.1733 +Toggle various memory management modes (in particular writable page
  1.1734 +tables).
  1.1735 +
  1.1736 +\end{quote} 
  1.1737 +
  1.1738 +\section{Segmentation Support}
  1.1739 +
  1.1740 +Xen allows guest OSes to install a custom GDT if they require it; 
  1.1741 +this is context switched transparently whenever a domain is 
  1.1742 +[de]scheduled.  The following hypercall is effectively a 
  1.1743 +`safe' version of {\tt lgdt}: 
  1.1744 +
  1.1745 +\begin{quote}
  1.1746 +\hypercall{set\_gdt(unsigned long *frame\_list, int entries)} 
  1.1747 +
  1.1748 +Install a global descriptor table for a domain; {\tt frame\_list} is
  1.1749 +an array of up to 16 machine page frames within which the GDT resides,
  1.1750 +with {\tt entries} being the actual number of descriptor-entry
  1.1751 +slots. All page frames must be mapped read-only within the guest's
  1.1752 +address space, and the table must be large enough to contain Xen's
  1.1753 +reserved entries (see {\tt xen/include/public/arch-x86\_32.h}).
  1.1754 +
  1.1755 +\end{quote}
  1.1756 +
  1.1757 +Many guest OSes will also wish to install LDTs; this is achieved by
   1.1758 +using {\tt mmuext\_op()} with the appropriate command, passing the
  1.1759 +linear address of the LDT base along with the number of entries. No
  1.1760 +special safety checks are required; Xen needs to perform this task
  1.1761 +simply since {\tt lldt} requires CPL 0.
  1.1762 +
  1.1763 +
  1.1764 +Xen also allows guest operating systems to update just an 
  1.1765 +individual segment descriptor in the GDT or LDT:  
  1.1766 +
  1.1767 +\begin{quote}
  1.1768 +\hypercall{update\_descriptor(uint64\_t ma, uint64\_t desc)}
  1.1769 +
  1.1770 +Update the GDT/LDT entry at machine address {\tt ma}; the new
  1.1771 +8-byte descriptor is stored in {\tt desc}.
  1.1772 +Xen performs a number of checks to ensure the descriptor is 
  1.1773 +valid. 
  1.1774 +
  1.1775 +\end{quote}
  1.1776 +
  1.1777 +Guest OSes can use the above in place of context switching entire 
  1.1778 +LDTs (or the GDT) when the number of changing descriptors is small. 
  1.1779 +
  1.1780 +\section{Context Switching} 
  1.1781 +
  1.1782 +When a guest OS wishes to context switch between two processes, 
  1.1783 +it can use the page table and segmentation hypercalls described
   1.1784 +above to perform the bulk of the privileged work. In addition, 
  1.1785 +however, it will need to invoke Xen to switch the kernel (ring 1) 
  1.1786 +stack pointer: 
  1.1787 +
  1.1788 +\begin{quote} 
  1.1789 +\hypercall{stack\_switch(unsigned long ss, unsigned long esp)} 
  1.1790 +
   1.1791 +Request a kernel stack switch from the hypervisor; {\tt ss} is the new 
   1.1792 +stack segment, while {\tt esp} is the new stack pointer. 
  1.1793 +
  1.1794 +\end{quote} 
  1.1795 +
  1.1796 +A useful hypercall for context switching allows ``lazy'' save and
  1.1797 +restore of floating point state:
  1.1798 +
  1.1799 +\begin{quote}
  1.1800 +\hypercall{fpu\_taskswitch(int set)} 
  1.1801 +
  1.1802 +This call instructs Xen to set the {\tt TS} bit in the {\tt cr0}
  1.1803 +control register; this means that the next attempt to use floating
   1.1804 +point will cause a trap which the guest OS can catch. Typically it will
  1.1805 +then save/restore the FP state, and clear the {\tt TS} bit, using the
  1.1806 +same call.
  1.1807 +\end{quote} 
  1.1808 +
  1.1809 +This is provided as an optimization only; guest OSes can also choose
  1.1810 +to save and restore FP state on all context switches for simplicity. 
  1.1811 +
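          +A sketch of the lazy scheme: the context-switch path asks Xen to set
          +{\tt TS}, and the device-not-available trap handler restores the FP
          +state on first use.  The {\tt HYPERVISOR\_fpu\_taskswitch()} wrapper
          +and the save/restore helper are assumed to be provided by the guest.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Called on every context switch: make the next FP instruction trap. */
          +static void defer_fpu_restore(void)
          +{
          +    HYPERVISOR_fpu_taskswitch(1);          /* ask Xen to set CR0.TS  */
          +}
          +
          +/* Device-not-available (#NM) handler: first FP use by the new task. */
          +static void do_device_not_available(void)
          +{
          +    HYPERVISOR_fpu_taskswitch(0);          /* clear CR0.TS via Xen   */
          +    restore_fpu_state_of_current_task();   /* hypothetical helper    */
          +}
          +\end{verbatim}
          +\normalsize
          +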
  1.1812 +Finally, a hypercall is provided for entering vm86 mode:
  1.1813 +
  1.1814 +\begin{quote}
  1.1815 +\hypercall{switch\_vm86}
  1.1816 +
  1.1817 +This allows the guest to run code in vm86 mode, which is needed for
  1.1818 +some legacy software.
  1.1819 +\end{quote}
  1.1820 +
  1.1821 +\section{Physical Memory Management}
  1.1822 +
  1.1823 +As mentioned previously, each domain has a maximum and current 
  1.1824 +memory allocation. The maximum allocation, set at domain creation 
  1.1825 +time, cannot be modified. However a domain can choose to reduce 
  1.1826 +and subsequently grow its current allocation by using the
  1.1827 +following call: 
  1.1828 +
  1.1829 +\begin{quote} 
  1.1830 +\hypercall{memory\_op(unsigned int op, void *arg)}
  1.1831 +
  1.1832 +Increase or decrease current memory allocation (as determined by 
  1.1833 +the value of {\tt op}).  The available operations are:
  1.1834 +
  1.1835 +\begin{description}
  1.1836 +\item[XENMEM\_increase\_reservation] Request an increase in machine
  1.1837 +  memory allocation; {\tt arg} must point to a {\tt
  1.1838 +  xen\_memory\_reservation} structure.
  1.1839 +\item[XENMEM\_decrease\_reservation] Request a decrease in machine
  1.1840 +  memory allocation; {\tt arg} must point to a {\tt
  1.1841 +  xen\_memory\_reservation} structure.
  1.1842 +\item[XENMEM\_maximum\_ram\_page] Request the frame number of the
  1.1843 +  highest-addressed frame of machine memory in the system.  {\tt arg}
  1.1844 +  must point to an {\tt unsigned long} where this value will be
  1.1845 +  stored.
   1.1846 +\item[XENMEM\_current\_reservation] Returns the current memory reservation
   1.1847 +  of the specified domain.
   1.1848 +\item[XENMEM\_maximum\_reservation] Returns the maximum memory reservation
   1.1849 +  of the specified domain.
  1.1850 +\end{description}
  1.1851 +
  1.1852 +\end{quote} 
  1.1853 +
  1.1854 +In addition to simply reducing or increasing the current memory
  1.1855 +allocation via a `balloon driver', this call is also useful for 
  1.1856 +obtaining contiguous regions of machine memory when required (e.g. 
  1.1857 +for certain PCI devices, or if using superpages).  
  1.1858 +
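          +As a sketch, a simple balloon driver might release a batch of pages
          +back to Xen as follows.  The layout of {\tt xen\_memory\_reservation}
          +should be taken from {\tt xen/include/public/memory.h}; the
          +{\tt HYPERVISOR\_memory\_op()} wrapper is assumed (as in XenLinux).
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Sketch: return 'n' frames (machine frame numbers in mfn_list[]) to
          + * Xen, reducing the domain's current memory allocation.             */
          +static long balloon_out(unsigned long *mfn_list, unsigned long n)
          +{
          +    struct xen_memory_reservation reservation = {
          +        .extent_start = mfn_list,    /* frames to give back         */
          +        .nr_extents   = n,
          +        .extent_order = 0,           /* individual 4kB pages        */
          +        .domid        = DOMID_SELF
          +    };
          +
          +    /* Returns the number of extents actually released to Xen. */
          +    return HYPERVISOR_memory_op(XENMEM_decrease_reservation, &reservation);
          +}
          +\end{verbatim}
          +\normalsize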
  1.1859 +
  1.1860 +\section{Inter-Domain Communication}
  1.1861 +\label{s:idc} 
  1.1862 +
  1.1863 +Xen provides a simple asynchronous notification mechanism via
  1.1864 +\emph{event channels}. Each domain has a set of end-points (or
  1.1865 +\emph{ports}) which may be bound to an event source (e.g. a physical
   1.1866 +IRQ, a virtual IRQ, or a port in another domain). When a pair of
  1.1867 +end-points in two different domains are bound together, then a `send'
  1.1868 +operation on one will cause an event to be received by the destination
  1.1869 +domain.
  1.1870 +
  1.1871 +The control and use of event channels involves the following hypercall: 
  1.1872 +
  1.1873 +\begin{quote}
  1.1874 +\hypercall{event\_channel\_op(evtchn\_op\_t *op)} 
  1.1875 +
  1.1876 +Inter-domain event-channel management; {\tt op} is a discriminated 
  1.1877 +union which allows the following 7 operations: 
  1.1878 +
  1.1879 +\begin{description} 
  1.1880 +
  1.1881 +\item[\it alloc\_unbound:] allocate a free (unbound) local
  1.1882 +  port and prepare for connection from a specified domain. 
  1.1883 +\item[\it bind\_virq:] bind a local port to a virtual 
  1.1884 +IRQ; any particular VIRQ can be bound to at most one port per domain. 
  1.1885 +\item[\it bind\_pirq:] bind a local port to a physical IRQ;
  1.1886 +once more, a given pIRQ can be bound to at most one port per
  1.1887 +domain. Furthermore the calling domain must be sufficiently
  1.1888 +privileged.
  1.1889 +\item[\it bind\_interdomain:] construct an interdomain event 
  1.1890 +channel; in general, the target domain must have previously allocated 
  1.1891 +an unbound port for this channel, although this can be bypassed by 
  1.1892 +privileged domains during domain setup. 
  1.1893 +\item[\it close:] close an interdomain event channel. 
   1.1894 +\item[\it send:] send an event to the remote end of an 
  1.1895 +interdomain event channel. 
  1.1896 +\item[\it status:] determine the current status of a local port. 
  1.1897 +\end{description} 
  1.1898 +
  1.1899 +For more details see
  1.1900 +{\tt xen/include/public/event\_channel.h}. 
  1.1901 +
  1.1902 +\end{quote} 
  1.1903 +
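          +As an illustration, a domain might allocate an unbound port for a peer
          +and later send it a notification as follows.  The union layout of
          +{\tt evtchn\_op\_t} and the exact field names should be checked against
          +{\tt xen/include/public/event\_channel.h}; the
          +{\tt HYPERVISOR\_event\_channel\_op()} wrapper is assumed.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Sketch: allocate an unbound local port that 'remote_dom' may later
          + * bind to via bind_interdomain.                                      */
          +static int alloc_port_for(domid_t remote_dom)
          +{
          +    evtchn_op_t op;
          +
          +    op.cmd = EVTCHNOP_alloc_unbound;
          +    op.u.alloc_unbound.dom        = DOMID_SELF;  /* port in this domain  */
          +    op.u.alloc_unbound.remote_dom = remote_dom;  /* peer allowed to bind */
          +    HYPERVISOR_event_channel_op(&op);
          +    return op.u.alloc_unbound.port;              /* new local port */
          +}
          +
          +/* Sketch: send an event to the remote end of an established channel. */
          +static void notify_remote(int local_port)
          +{
          +    evtchn_op_t op;
          +
          +    op.cmd         = EVTCHNOP_send;
          +    op.u.send.port = local_port;
          +    HYPERVISOR_event_channel_op(&op);
          +}
          +\end{verbatim}
          +\normalsize
          +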
  1.1904 +Event channels are the fundamental communication primitive between 
  1.1905 +Xen domains and seamlessly support SMP. However they provide little
  1.1906 +bandwidth for communication {\sl per se}, and hence are typically 
  1.1907 +married with a piece of shared memory to produce effective and 
  1.1908 +high-performance inter-domain communication. 
  1.1909 +
  1.1910 +Safe sharing of memory pages between guest OSes is carried out by
  1.1911 +granting access on a per page basis to individual domains. This is
  1.1912 +achieved by using the {\tt grant\_table\_op()} hypercall.
  1.1913 +
  1.1914 +\begin{quote}
  1.1915 +\hypercall{grant\_table\_op(unsigned int cmd, void *uop, unsigned int count)}
  1.1916 +
   1.1917 +Used to invoke operations on a grant reference, to set up the grant
  1.1918 +table and to dump the tables' contents for debugging.
  1.1919 +
  1.1920 +\end{quote} 
  1.1921 +
  1.1922 +\section{IO Configuration} 
  1.1923 +
  1.1924 +Domains with physical device access (i.e.\ driver domains) receive
  1.1925 +limited access to certain PCI devices (bus address space and
   1.1926 +interrupts). However, many guest operating systems attempt to 
   1.1927 +determine the PCI configuration by directly accessing the PCI BIOS, 
   1.1928 +which cannot be allowed for safety. 
  1.1929 +
  1.1930 +Instead, Xen provides the following hypercall: 
  1.1931 +
  1.1932 +\begin{quote}
  1.1933 +\hypercall{physdev\_op(void *physdev\_op)}
  1.1934 +
  1.1935 +Set and query IRQ configuration details, set the system IOPL, set the
  1.1936 +TSS IO bitmap.
  1.1937 +
  1.1938 +\end{quote} 
  1.1939 +
  1.1940 +
  1.1941 +For examples of using {\tt physdev\_op()}, see the 
   1.1942 +Xen-specific PCI code in the Linux sparse tree. 
  1.1943 +
  1.1944 +\section{Administrative Operations}
  1.1945 +\label{s:dom0ops}
  1.1946 +
  1.1947 +A large number of control operations are available to a sufficiently
  1.1948 +privileged domain (typically domain 0). These allow the creation and
  1.1949 +management of new domains, for example. A complete list is given 
  1.1950 +below: for more details on any or all of these, please see 
   1.1951 +{\tt xen/include/public/dom0\_ops.h}. 
  1.1952 +
  1.1953 +
  1.1954 +\begin{quote}
  1.1955 +\hypercall{dom0\_op(dom0\_op\_t *op)} 
  1.1956 +
  1.1957 +Administrative domain operations for domain management. The options are:
  1.1958 +
  1.1959 +\begin{description} 
  1.1960 +\item [\it DOM0\_GETMEMLIST:] get list of pages used by the domain
  1.1961 +
  1.1962 +\item [\it DOM0\_SCHEDCTL:]
  1.1963 +
  1.1964 +\item [\it DOM0\_ADJUSTDOM:] adjust scheduling priorities for domain
  1.1965 +
  1.1966 +\item [\it DOM0\_CREATEDOMAIN:] create a new domain
  1.1967 +
  1.1968 +\item [\it DOM0\_DESTROYDOMAIN:] deallocate all resources associated
  1.1969 +with a domain
  1.1970 +
  1.1971 +\item [\it DOM0\_PAUSEDOMAIN:] remove a domain from the scheduler run 
  1.1972 +queue. 
  1.1973 +
  1.1974 +\item [\it DOM0\_UNPAUSEDOMAIN:] mark a paused domain as schedulable
  1.1975 +  once again. 
  1.1976 +
  1.1977 +\item [\it DOM0\_GETDOMAININFO:] get statistics about the domain
  1.1978 +
  1.1979 +\item [\it DOM0\_SETDOMAININFO:] set VCPU-related attributes
  1.1980 +
  1.1981 +\item [\it DOM0\_MSR:] read or write model specific registers
  1.1982 +
  1.1983 +\item [\it DOM0\_DEBUG:] interactively invoke the debugger
  1.1984 +
  1.1985 +\item [\it DOM0\_SETTIME:] set system time
  1.1986 +
  1.1987 +\item [\it DOM0\_GETPAGEFRAMEINFO:] 
  1.1988 +
  1.1989 +\item [\it DOM0\_READCONSOLE:] read console content from hypervisor buffer ring
  1.1990 +
  1.1991 +\item [\it DOM0\_PINCPUDOMAIN:] pin domain to a particular CPU
  1.1992 +
  1.1993 +\item [\it DOM0\_TBUFCONTROL:] get and set trace buffer attributes
  1.1994 +
  1.1995 +\item [\it DOM0\_PHYSINFO:] get information about the host machine
  1.1996 +
  1.1997 +\item [\it DOM0\_SCHED\_ID:] get the ID of the current Xen scheduler
  1.1998 +
  1.1999 +\item [\it DOM0\_SHADOW\_CONTROL:] switch between shadow page-table modes
  1.2000 +
  1.2001 +\item [\it DOM0\_SETDOMAINMAXMEM:] set maximum memory allocation of a domain
  1.2002 +
  1.2003 +\item [\it DOM0\_GETPAGEFRAMEINFO2:] batched interface for getting
  1.2004 +page frame info
  1.2005 +
  1.2006 +\item [\it DOM0\_ADD\_MEMTYPE:] set MTRRs
  1.2007 +
  1.2008 +\item [\it DOM0\_DEL\_MEMTYPE:] remove a memory type range
  1.2009 +
  1.2010 +\item [\it DOM0\_READ\_MEMTYPE:] read MTRR
  1.2011 +
  1.2012 +\item [\it DOM0\_PERFCCONTROL:] control Xen's software performance
  1.2013 +counters
  1.2014 +
  1.2015 +\item [\it DOM0\_MICROCODE:] update CPU microcode
  1.2016 +
  1.2017 +\item [\it DOM0\_IOPORT\_PERMISSION:] modify domain permissions for an
  1.2018 +IO port range (enable / disable a range for a particular domain)
  1.2019 +
  1.2020 +\item [\it DOM0\_GETVCPUCONTEXT:] get context from a VCPU
  1.2021 +
  1.2022 +\item [\it DOM0\_GETVCPUINFO:] get current state for a VCPU
  1.2023 +\item [\it DOM0\_GETDOMAININFOLIST:] batched interface to get domain
  1.2024 +info
  1.2025 +
  1.2026 +\item [\it DOM0\_PLATFORM\_QUIRK:] inform Xen of a platform quirk it
  1.2027 +needs to handle (e.g. noirqbalance)
  1.2028 +
  1.2029 +\item [\it DOM0\_PHYSICAL\_MEMORY\_MAP:] get info about dom0's memory
  1.2030 +map
  1.2031 +
  1.2032 +\item [\it DOM0\_MAX\_VCPUS:] change max number of VCPUs for a domain
  1.2033 +
  1.2034 +\item [\it DOM0\_SETDOMAINHANDLE:] set the handle for a domain
  1.2035 +
  1.2036 +\end{description} 
  1.2037 +\end{quote} 
  1.2038 +
  1.2039 +Most of the above are best understood by looking at the code 
   1.2040 +implementing them (in {\tt xen/common/dom0\_ops.c}) and at 
  1.2041 +the user-space tools that use them (mostly in {\tt tools/libxc}). 
  1.2042 +
  1.2043 +Hypercalls relating to the management of the Access Control Module are
  1.2044 +also restricted to domain 0 access for now:
  1.2045 +
  1.2046 +\begin{quote}
  1.2047 +
  1.2048 +\hypercall{acm\_op(struct acm\_op * u\_acm\_op)}
  1.2049 +
  1.2050 +This hypercall can be used to configure the state of the ACM, query
  1.2051 +that state, request access control decisions and dump additional
  1.2052 +information.
  1.2053 +
  1.2054 +\end{quote}
  1.2055 +
  1.2056 +
  1.2057 +\section{Debugging Hypercalls} 
  1.2058 +
  1.2059 +A few additional hypercalls are mainly useful for debugging: 
  1.2060 +
  1.2061 +\begin{quote} 
  1.2062 +\hypercall{console\_io(int cmd, int count, char *str)}
  1.2063 +
  1.2064 +Use Xen to interact with the console; operations are:
  1.2065 +
  1.2066 +{\it CONSOLEIO\_write}: Output count characters from buffer str.
  1.2067 +
  1.2068 +{\it CONSOLEIO\_read}: Input at most count characters into buffer str.
  1.2069 +\end{quote} 
  1.2070 +
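          +For example, early guest boot code often prints through Xen before its
          +own console driver is initialized; this sketch assumes a XenLinux-style
          +{\tt HYPERVISOR\_console\_io()} wrapper and a {\tt strlen()} from the
          +guest's own string routines.
          +
          +\scriptsize
          +\begin{verbatim}
          +/* Sketch: write a message to the Xen console. */
          +static void xen_puts(const char *msg)
          +{
          +    HYPERVISOR_console_io(CONSOLEIO_write, strlen(msg), (char *)msg);
          +}
          +\end{verbatim}
          +\normalsize
          +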
  1.2071 +A pair of hypercalls allows access to the underlying debug registers: 
  1.2072 +\begin{quote}
  1.2073 +\hypercall{set\_debugreg(int reg, unsigned long value)}
  1.2074 +
  1.2075 +Set debug register {\tt reg} to {\tt value} 
  1.2076 +
  1.2077 +\hypercall{get\_debugreg(int reg)}
  1.2078 +
  1.2079 +Return the contents of the debug register {\tt reg}
  1.2080 +\end{quote}
  1.2081 +
  1.2082 +And finally: 
  1.2083 +\begin{quote}
  1.2084 +\hypercall{xen\_version(int cmd)}
  1.2085 +
  1.2086 +Request Xen version number.
  1.2087 +\end{quote} 
  1.2088 +
  1.2089 +This is useful to ensure that user-space tools are in sync 
  1.2090 +with the underlying hypervisor. 
  1.2091  
  1.2092  
  1.2093  \end{document}
     2.1 --- a/docs/src/interface/architecture.tex	Sun Dec 04 20:12:00 2005 +0100
     2.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     2.3 @@ -1,140 +0,0 @@
     2.4 -\chapter{Virtual Architecture}
     2.5 -
     2.6 -On a Xen-based system, the hypervisor itself runs in {\it ring 0}.  It
     2.7 -has full access to the physical memory available in the system and is
     2.8 -responsible for allocating portions of it to the domains.  Guest
     2.9 -operating systems run in and use {\it rings 1}, {\it 2} and {\it 3} as
    2.10 -they see fit. Segmentation is used to prevent the guest OS from
    2.11 -accessing the portion of the address space that is reserved for Xen.
    2.12 -We expect most guest operating systems will use ring 1 for their own
    2.13 -operation and place applications in ring 3.
    2.14 -
    2.15 -In this chapter we consider the basic virtual architecture provided by
    2.16 -Xen: the basic CPU state, exception and interrupt handling, and time.
    2.17 -Other aspects such as memory and device access are discussed in later
    2.18 -chapters.
    2.19 -
    2.20 -
    2.21 -\section{CPU state}
    2.22 -
    2.23 -All privileged state must be handled by Xen.  The guest OS has no
    2.24 -direct access to CR3 and is not permitted to update privileged bits in
    2.25 -EFLAGS. Guest OSes use \emph{hypercalls} to invoke operations in Xen;
    2.26 -these are analogous to system calls but occur from ring 1 to ring 0.
    2.27 -
    2.28 -A list of all hypercalls is given in Appendix~\ref{a:hypercalls}.
    2.29 -
    2.30 -
    2.31 -\section{Exceptions}
    2.32 -
    2.33 -A virtual IDT is provided --- a domain can submit a table of trap
    2.34 -handlers to Xen via the {\tt set\_trap\_table()} hypercall.  Most trap
    2.35 -handlers are identical to native x86 handlers, although the page-fault
    2.36 -handler is somewhat different.
    2.37 -
    2.38 -
    2.39 -\section{Interrupts and events}
    2.40 -
    2.41 -Interrupts are virtualized by mapping them to \emph{events}, which are
    2.42 -delivered asynchronously to the target domain using a callback
    2.43 -supplied via the {\tt set\_callbacks()} hypercall.  A guest OS can map
    2.44 -these events onto its standard interrupt dispatch mechanisms.  Xen is
    2.45 -responsible for determining the target domain that will handle each
    2.46 -physical interrupt source. For more details on the binding of event
    2.47 -sources to events, see Chapter~\ref{c:devices}.
    2.48 -
    2.49 -
    2.50 -\section{Time}
    2.51 -
    2.52 -Guest operating systems need to be aware of the passage of both real
    2.53 -(or wallclock) time and their own `virtual time' (the time for which
    2.54 -they have been executing). Furthermore, Xen has a notion of time which
    2.55 -is used for scheduling. The following notions of time are provided:
    2.56 -
    2.57 -\begin{description}
    2.58 -\item[Cycle counter time.]
    2.59 -
    2.60 -  This provides a fine-grained time reference.  The cycle counter time
    2.61 -  is used to accurately extrapolate the other time references.  On SMP
    2.62 -  machines it is currently assumed that the cycle counter time is
    2.63 -  synchronized between CPUs.  The current x86-based implementation
    2.64 -  achieves this within inter-CPU communication latencies.
    2.65 -
    2.66 -\item[System time.]
    2.67 -
    2.68 -  This is a 64-bit counter which holds the number of nanoseconds that
    2.69 -  have elapsed since system boot.
    2.70 -
    2.71 -\item[Wall clock time.]
    2.72 -
    2.73 -  This is the time of day in a Unix-style {\tt struct timeval}
    2.74 -  (seconds and microseconds since 1 January 1970, adjusted by leap
    2.75 -  seconds).  An NTP client hosted by {\it domain 0} can keep this
    2.76 -  value accurate.
    2.77 -
    2.78 -\item[Domain virtual time.]
    2.79 -
    2.80 -  This progresses at the same pace as system time, but only while a
    2.81 -  domain is executing --- it stops while a domain is de-scheduled.
    2.82 -  Therefore the share of the CPU that a domain receives is indicated
    2.83 -  by the rate at which its virtual time increases.
    2.84 -
    2.85 -\end{description}
    2.86 -
    2.87 -
    2.88 -Xen exports timestamps for system time and wall-clock time to guest
    2.89 -operating systems through a shared page of memory.  Xen also provides
    2.90 -the cycle counter time at the instant the timestamps were calculated,
    2.91 -and the CPU frequency in Hertz.  This allows the guest to extrapolate
    2.92 -system and wall-clock times accurately based on the current cycle
    2.93 -counter time.
    2.94 -
    2.95 -Since all time stamps need to be updated and read \emph{atomically}
    2.96 -two version numbers are also stored in the shared info page. The first
    2.97 -is incremented prior to an update, while the second is only
    2.98 -incremented afterwards. Thus a guest can be sure that it read a
    2.99 -consistent state by checking the two version numbers are equal.
   2.100 -
   2.101 -Xen includes a periodic ticker which sends a timer event to the
   2.102 -currently executing domain every 10ms.  The Xen scheduler also sends a
   2.103 -timer event whenever a domain is scheduled; this allows the guest OS
   2.104 -to adjust for the time that has passed while it has been inactive.  In
   2.105 -addition, Xen allows each domain to request that they receive a timer
   2.106 -event sent at a specified system time by using the {\tt
   2.107 -  set\_timer\_op()} hypercall.  Guest OSes may use this timer to
   2.108 -implement timeout values when they block.
   2.109 -
   2.110 -
   2.111 -
   2.112 -%% % akw: demoting this to a section -- not sure if there is any point
   2.113 -%% % though, maybe just remove it.
   2.114 -
   2.115 -\section{Xen CPU Scheduling}
   2.116 -
   2.117 -Xen offers a uniform API for CPU schedulers.  It is possible to choose
   2.118 -from a number of schedulers at boot and it should be easy to add more.
   2.119 -The BVT, Atropos and Round Robin schedulers are part of the normal Xen
   2.120 -distribution.  BVT provides proportional fair shares of the CPU to the
   2.121 -running domains.  Atropos can be used to reserve absolute shares of
   2.122 -the CPU for each domain.  Round-robin is provided as an example of
   2.123 -Xen's internal scheduler API.
   2.124 -
   2.125 -\paragraph*{Note: SMP host support}
   2.126 -Xen has always supported SMP host systems.  Domains are statically
   2.127 -assigned to CPUs, either at creation time or when manually pinning to
   2.128 -a particular CPU.  The current schedulers then run locally on each CPU
   2.129 -to decide which of the assigned domains should be run there. The
   2.130 -user-level control software can be used to perform coarse-grain
   2.131 -load-balancing between CPUs.
   2.132 -
   2.133 -
   2.134 -%% More information on the characteristics and use of these schedulers
   2.135 -%% is available in {\tt Sched-HOWTO.txt}.
   2.136 -
   2.137 -
   2.138 -\section{Privileged operations}
   2.139 -
   2.140 -Xen exports an extended interface to privileged domains (viz.\ {\it
   2.141 -  Domain 0}). This allows such domains to build and boot other domains
   2.142 -on the server, and provides control interfaces for managing
   2.143 -scheduling, memory, networking, and block devices.
     3.1 --- a/docs/src/interface/debugging.tex	Sun Dec 04 20:12:00 2005 +0100
     3.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     3.3 @@ -1,62 +0,0 @@
     3.4 -\chapter{Debugging}
     3.5 -
     3.6 -Xen provides tools for debugging both Xen and guest OSes.  Currently, the
     3.7 -Pervasive Debugger provides a GDB stub, which provides facilities for symbolic
     3.8 -debugging of Xen itself and of OS kernels running on top of Xen.  The Trace
     3.9 -Buffer provides a lightweight means to log data about Xen's internal state and
    3.10 -behaviour at runtime, for later analysis.
    3.11 -
    3.12 -\section{Pervasive Debugger}
    3.13 -
    3.14 -Information on using the pervasive debugger is available in pdb.txt.
    3.15 -
    3.16 -
    3.17 -\section{Trace Buffer}
    3.18 -
    3.19 -The trace buffer provides a means to observe Xen's operation from domain 0.
    3.20 -Trace events, inserted at key points in Xen's code, record data that can be
    3.21 -read by the {\tt xentrace} tool.  Recording these events has a low overhead
    3.22 -and hence the trace buffer may be useful for debugging timing-sensitive
    3.23 -behaviours.
    3.24 -
    3.25 -\subsection{Internal API}
    3.26 -
    3.27 -To use the trace buffer functionality from within Xen, you must {\tt \#include
    3.28 -<xen/trace.h>}, which contains definitions related to the trace buffer.  Trace
    3.29 -events are inserted into the buffer using the {\tt TRACE\_xD} ({\tt x} = 0, 1,
    3.30 -2, 3, 4 or 5) macros.  These all take an event number, plus {\tt x} additional
    3.31 -(32-bit) data as their arguments.  For trace buffer-enabled builds of Xen these
    3.32 -will insert the event ID and data into the trace buffer, along with the current
    3.33 -value of the CPU cycle-counter.  For builds without the trace buffer enabled,
    3.34 -the macros expand to no-ops and thus can be left in place without incurring
    3.35 -overheads.
    3.36 -
    3.37 -\subsection{Trace-enabled builds}
    3.38 -
    3.39 -By default, the trace buffer is enabled only in debug builds (i.e. {\tt NDEBUG}
    3.40 -is not defined).  It can be enabled separately by defining {\tt TRACE\_BUFFER},
    3.41 -either in {\tt <xen/config.h>} or on the gcc command line.
    3.42 -
    3.43 -The size (in pages) of the per-CPU trace buffers can be specified using the
    3.44 -{\tt tbuf\_size=n } boot parameter to Xen.  If the size is set to 0, the trace
    3.45 -buffers will be disabled.
    3.46 -
    3.47 -\subsection{Dumping trace data}
    3.48 -
    3.49 -When running a trace buffer build of Xen, trace data are written continuously
    3.50 -into the buffer data areas, with newer data overwriting older data.  This data
    3.51 -can be captured using the {\tt xentrace} program in domain 0.
    3.52 -
    3.53 -The {\tt xentrace} tool uses {\tt /dev/mem} in domain 0 to map the trace
    3.54 -buffers into its address space.  It then periodically polls all the buffers for
    3.55 -new data, dumping out any new records from each buffer in turn.  As a result,
    3.56 -for machines with multiple (logical) CPUs, the trace buffer output will not be
    3.57 -in overall chronological order.
    3.58 -
    3.59 -The output from {\tt xentrace} can be post-processed using {\tt
    3.60 -xentrace\_cpusplit} (used to split trace data out into per-cpu log files) and
    3.61 -{\tt xentrace\_format} (used to pretty-print trace data).  For the predefined
    3.62 -trace points, there is an example format file in {\tt tools/xentrace/formats }.
    3.63 -
    3.64 -For more information, see the manual pages for {\tt xentrace}, {\tt
    3.65 -xentrace\_format} and {\tt xentrace\_cpusplit}.
     4.1 --- a/docs/src/interface/devices.tex	Sun Dec 04 20:12:00 2005 +0100
     4.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     4.3 @@ -1,178 +0,0 @@
     4.4 -\chapter{Devices}
     4.5 -\label{c:devices}
     4.6 -
     4.7 -Devices such as network and disk are exported to guests using a split
     4.8 -device driver.  The device driver domain, which accesses the physical
      4.9 -device directly, also runs a \emph{backend} driver, serving requests to
    4.10 -that device from guests.  Each guest will use a simple \emph{frontend}
    4.11 -driver, to access the backend.  Communication between these domains is
    4.12 -composed of two parts: First, data is placed onto a shared memory page
    4.13 -between the domains.  Second, an event channel between the two domains
    4.14 -is used to pass notification that data is outstanding.  This
    4.15 -separation of notification from data transfer allows message batching,
    4.16 -and results in very efficient device access.
    4.17 -
    4.18 -Event channels are used extensively in device virtualization; each
    4.19 -domain has a number of end-points or \emph{ports} each of which may be
    4.20 -bound to one of the following \emph{event sources}:
    4.21 -\begin{itemize}
    4.22 -  \item a physical interrupt from a real device, 
    4.23 -  \item a virtual interrupt (callback) from Xen, or 
    4.24 -  \item a signal from another domain 
    4.25 -\end{itemize}
    4.26 -
    4.27 -Events are lightweight and do not carry much information beyond the
    4.28 -source of the notification. Hence when performing bulk data transfer,
    4.29 -events are typically used as synchronization primitives over a shared
    4.30 -memory transport. Event channels are managed via the {\tt
    4.31 -  event\_channel\_op()} hypercall; for more details see
    4.32 -Section~\ref{s:idc}.
    4.33 -
    4.34 -This chapter focuses on some individual device interfaces available to
    4.35 -Xen guests.
    4.36 -
    4.37 -
    4.38 -\section{Network I/O}
    4.39 -
    4.40 -Virtual network device services are provided by shared memory
    4.41 -communication with a backend domain.  From the point of view of other
    4.42 -domains, the backend may be viewed as a virtual ethernet switch
    4.43 -element with each domain having one or more virtual network interfaces
    4.44 -connected to it.
    4.45 -
    4.46 -\subsection{Backend Packet Handling}
    4.47 -
    4.48 -The backend driver is responsible for a variety of actions relating to
    4.49 -the transmission and reception of packets from the physical device.
    4.50 -With regard to transmission, the backend performs these key actions:
    4.51 -
    4.52 -\begin{itemize}
    4.53 -\item {\bf Validation:} To ensure that domains do not attempt to
    4.54 -  generate invalid (e.g. spoofed) traffic, the backend driver may
    4.55 -  validate headers ensuring that source MAC and IP addresses match the
    4.56 -  interface that they have been sent from.
    4.57 -
    4.58 -  Validation functions can be configured using standard firewall rules
    4.59 -  ({\small{\tt iptables}} in the case of Linux).
    4.60 -  
    4.61 -\item {\bf Scheduling:} Since a number of domains can share a single
    4.62 -  physical network interface, the backend must mediate access when
    4.63 -  several domains each have packets queued for transmission.  This
    4.64 -  general scheduling function subsumes basic shaping or rate-limiting
    4.65 -  schemes.
    4.66 -  
    4.67 -\item {\bf Logging and Accounting:} The backend domain can be
    4.68 -  configured with classifier rules that control how packets are
    4.69 -  accounted or logged.  For example, log messages might be generated
    4.70 -  whenever a domain attempts to send a TCP packet containing a SYN.
    4.71 -\end{itemize}
    4.72 -
    4.73 -On receipt of incoming packets, the backend acts as a simple
    4.74 -demultiplexer: Packets are passed to the appropriate virtual interface
    4.75 -after any necessary logging and accounting have been carried out.
    4.76 -
    4.77 -\subsection{Data Transfer}
    4.78 -
    4.79 -Each virtual interface uses two ``descriptor rings'', one for
    4.80 -transmit, the other for receive.  Each descriptor identifies a block
    4.81 -of contiguous physical memory allocated to the domain.
    4.82 -
    4.83 -The transmit ring carries packets to transmit from the guest to the
    4.84 -backend domain.  The return path of the transmit ring carries messages
    4.85 -indicating that the contents have been physically transmitted and the
    4.86 -backend no longer requires the associated pages of memory.
    4.87 -
    4.88 -To receive packets, the guest places descriptors of unused pages on
    4.89 -the receive ring.  The backend will return received packets by
    4.90 -exchanging these pages in the domain's memory with new pages
    4.91 -containing the received data, and passing back descriptors regarding
    4.92 -the new packets on the ring.  This zero-copy approach allows the
    4.93 -backend to maintain a pool of free pages to receive packets into, and
    4.94 -then deliver them to appropriate domains after examining their
    4.95 -headers.
    4.96 -
    4.97 -% Real physical addresses are used throughout, with the domain
    4.98 -% performing translation from pseudo-physical addresses if that is
    4.99 -% necessary.
   4.100 -
   4.101 -If a domain does not keep its receive ring stocked with empty buffers
   4.102 -then packets destined to it may be dropped.  This provides some
    4.103 -defence against receive livelock problems because an overloaded domain
   4.104 -will cease to receive further data.  Similarly, on the transmit path,
   4.105 -it provides the application with feedback on the rate at which packets
   4.106 -are able to leave the system.
   4.107 -
   4.108 -Flow control on rings is achieved by including a pair of producer
   4.109 -indexes on the shared ring page.  Each side will maintain a private
   4.110 -consumer index indicating the next outstanding message.  In this
   4.111 -manner, the domains cooperate to divide the ring into two message
   4.112 -lists, one in each direction.  Notification is decoupled from the
   4.113 -immediate placement of new messages on the ring; the event channel
   4.114 -will be used to generate notification when {\em either} a certain
   4.115 -number of outstanding messages are queued, {\em or} a specified number
   4.116 -of nanoseconds have elapsed since the oldest message was placed on the
   4.117 -ring.
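
The following fragment sketches this arrangement; the structure and
field names are invented for illustration and do not correspond to the
actual Xen ring definitions in the public headers.

\begin{verbatim}
#define RING_SIZE 256

/* Simplified shared ring: producer indices live in shared memory,
   while each side keeps its own consumer index privately. */
typedef struct example_shared_ring {
    unsigned int req_prod;                     /* written by frontend */
    unsigned int rsp_prod;                     /* written by backend  */
    struct { unsigned long frame, id; } ring[RING_SIZE];
} example_shared_ring_t;

static unsigned int rsp_cons;                  /* frontend-private    */

static int frontend_has_responses(example_shared_ring_t *sring)
{
    return sring->rsp_prod != rsp_cons;        /* unconsumed responses */
}
\end{verbatim}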
   4.118 -
   4.119 -%% Not sure if my version is any better -- here is what was here
   4.120 -%% before: Synchronization between the backend domain and the guest is
   4.121 -%% achieved using counters held in shared memory that is accessible to
   4.122 -%% both.  Each ring has associated producer and consumer indices
   4.123 -%% indicating the area in the ring that holds descriptors that contain
   4.124 -%% data.  After receiving {\it n} packets or {\t nanoseconds} after
   4.125 -%% receiving the first packet, the hypervisor sends an event to the
   4.126 -%% domain.
   4.127 -
   4.128 -
   4.129 -\section{Block I/O}
   4.130 -
    4.131 -All guest OS disk access goes through the virtual block device (VBD)
   4.132 -interface.  This interface allows domains access to portions of block
    4.133 -storage devices visible to the block backend device.  The VBD
   4.134 -interface is a split driver, similar to the network interface
   4.135 -described above.  A single shared memory ring is used between the
   4.136 -frontend and backend drivers, across which read and write messages are
   4.137 -sent.
   4.138 -
   4.139 -Any block device accessible to the backend domain, including
   4.140 -network-based block (iSCSI, *NBD, etc), loopback and LVM/MD devices,
   4.141 -can be exported as a VBD.  Each VBD is mapped to a device node in the
   4.142 -guest, specified in the guest's startup configuration.
   4.143 -
   4.144 -Old (Xen 1.2) virtual disks are not supported under Xen 2.0, since
   4.145 -similar functionality can be achieved using the more complete LVM
   4.146 -system, which is already in widespread use.
   4.147 -
   4.148 -\subsection{Data Transfer}
   4.149 -
   4.150 -The single ring between the guest and the block backend supports three
   4.151 -messages:
   4.152 -
   4.153 -\begin{description}
   4.154 -\item [{\small {\tt PROBE}}:] Return a list of the VBDs available to
   4.155 -  this guest from the backend.  The request includes a descriptor of a
   4.156 -  free page into which the reply will be written by the backend.
   4.157 -
   4.158 -\item [{\small {\tt READ}}:] Read data from the specified block
   4.159 -  device.  The front end identifies the device and location to read
   4.160 -  from and attaches pages for the data to be copied to (typically via
   4.161 -  DMA from the device).  The backend acknowledges completed read
   4.162 -  requests as they finish.
   4.163 -
   4.164 -\item [{\small {\tt WRITE}}:] Write data to the specified block
   4.165 -  device.  This functions essentially as {\small {\tt READ}}, except
   4.166 -  that the data moves to the device instead of from it.
   4.167 -\end{description}
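
As a purely illustrative sketch, a read request might carry the fields
below; the structure and field names are invented here and the real
message format should be taken from the block-interface definitions in
the Xen public headers.

\begin{verbatim}
/* Illustrative only: not the actual block-interface message layout. */
typedef struct example_blk_request {
    unsigned char  operation;         /* PROBE, READ or WRITE          */
    unsigned short device;            /* virtual block device id       */
    unsigned long  id;                /* echoed back in the response   */
    unsigned long  sector_number;     /* first sector to access        */
    unsigned long  buffer_frame[11];  /* pages to transfer data into   */
} example_blk_request_t;
\end{verbatim}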
   4.168 -
   4.169 -%% um... some old text: In overview, the same style of descriptor-ring
   4.170 -%% that is used for network packets is used here.  Each domain has one
   4.171 -%% ring that carries operation requests to the hypervisor and carries
   4.172 -%% the results back again.
   4.173 -
   4.174 -%% Rather than copying data, the backend simply maps the domain's
   4.175 -%% buffers in order to enable direct DMA to them.  The act of mapping
   4.176 -%% the buffers also increases the reference counts of the underlying
   4.177 -%% pages, so that the unprivileged domain cannot try to return them to
   4.178 -%% the hypervisor, install them as page tables, or any other unsafe
   4.179 -%% behaviour.
   4.180 -%%
   4.181 -%% % block API here
     5.1 --- a/docs/src/interface/further_info.tex	Sun Dec 04 20:12:00 2005 +0100
     5.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     5.3 @@ -1,49 +0,0 @@
     5.4 -\chapter{Further Information}
     5.5 -
     5.6 -If you have questions that are not answered by this manual, the
     5.7 -sources of information listed below may be of interest to you.  Note
     5.8 -that bug reports, suggestions and contributions related to the
     5.9 -software (or the documentation) should be sent to the Xen developers'
    5.10 -mailing list (address below).
    5.11 -
    5.12 -
    5.13 -\section{Other documentation}
    5.14 -
    5.15 -If you are mainly interested in using (rather than developing for)
    5.16 -Xen, the \emph{Xen Users' Manual} is distributed in the {\tt docs/}
    5.17 -directory of the Xen source distribution.
    5.18 -
    5.19 -% Various HOWTOs are also available in {\tt docs/HOWTOS}.
    5.20 -
    5.21 -
    5.22 -\section{Online references}
    5.23 -
    5.24 -The official Xen web site is found at:
    5.25 -\begin{quote}
    5.26 -{\tt http://www.cl.cam.ac.uk/Research/SRG/netos/xen/}
    5.27 -\end{quote}
    5.28 -
    5.29 -This contains links to the latest versions of all on-line
    5.30 -documentation.
    5.31 -
    5.32 -
    5.33 -\section{Mailing lists}
    5.34 -
    5.35 -There are currently four official Xen mailing lists:
    5.36 -
    5.37 -\begin{description}
    5.38 -\item[xen-devel@lists.xensource.com] Used for development
    5.39 -  discussions and bug reports.  Subscribe at: \\
    5.40 -  {\small {\tt http://lists.xensource.com/xen-devel}}
    5.41 -\item[xen-users@lists.xensource.com] Used for installation and usage
    5.42 -  discussions and requests for help.  Subscribe at: \\
    5.43 -  {\small {\tt http://lists.xensource.com/xen-users}}
    5.44 -\item[xen-announce@lists.xensource.com] Used for announcements only.
    5.45 -  Subscribe at: \\
    5.46 -  {\small {\tt http://lists.xensource.com/xen-announce}}
    5.47 -\item[xen-changelog@lists.xensource.com] Changelog feed
    5.48 -  from the unstable and 2.0 trees - developer oriented.  Subscribe at: \\
    5.49 -  {\small {\tt http://lists.xensource.com/xen-changelog}}
    5.50 -\end{description}
    5.51 -
    5.52 -Of these, xen-devel is the most active.
     6.1 --- a/docs/src/interface/hypercalls.tex	Sun Dec 04 20:12:00 2005 +0100
     6.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     6.3 @@ -1,524 +0,0 @@
     6.4 -
     6.5 -\newcommand{\hypercall}[1]{\vspace{2mm}{\sf #1}}
     6.6 -
     6.7 -\chapter{Xen Hypercalls}
     6.8 -\label{a:hypercalls}
     6.9 -
    6.10 -Hypercalls represent the procedural interface to Xen; this appendix 
    6.11 -categorizes and describes the current set of hypercalls. 
    6.12 -
    6.13 -\section{Invoking Hypercalls} 
    6.14 -
    6.15 -Hypercalls are invoked in a manner analogous to system calls in a
    6.16 -conventional operating system; a software interrupt is issued which
    6.17 -vectors to an entry point within Xen. On x86\_32 machines the
     6.18 -instruction required is {\tt int \$82}; the (real) IDT is set up so
    6.19 -that this may only be issued from within ring 1. The particular 
    6.20 -hypercall to be invoked is contained in {\tt EAX} --- a list 
    6.21 -mapping these values to symbolic hypercall names can be found 
    6.22 -in {\tt xen/include/public/xen.h}. 
    6.23 -
    6.24 -On some occasions a set of hypercalls will be required to carry
    6.25 -out a higher-level function; a good example is when a guest 
    6.26 -operating wishes to context switch to a new process which 
    6.27 -requires updating various privileged CPU state. As an optimization
    6.28 -for these cases, there is a generic mechanism to issue a set of 
    6.29 -hypercalls as a batch: 
    6.30 -
    6.31 -\begin{quote}
    6.32 -\hypercall{multicall(void *call\_list, int nr\_calls)}
    6.33 -
    6.34 -Execute a series of hypervisor calls; {\tt nr\_calls} is the length of
     6.35 -the array of {\tt multicall\_entry\_t} structures pointed to by {\tt
    6.36 -call\_list}. Each entry contains the hypercall operation code followed
    6.37 -by up to 7 word-sized arguments.
    6.38 -\end{quote}
    6.39 -
    6.40 -Note that multicalls are provided purely as an optimization; there is
    6.41 -no requirement to use them when first porting a guest operating
    6.42 -system.
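
As an illustration, a guest context-switch path might batch two of the
hypercalls described later in this chapter.  The sketch below assumes a
{\tt multicall\_entry\_t} laid out as described above (an operation
code followed by an argument array) and a {\tt HYPERVISOR\_multicall()}
wrapper supplied by the guest OS port; check the public headers for the
exact structure.

\begin{verbatim}
void example_batched_switch(unsigned long new_ss, unsigned long new_esp)
{
    multicall_entry_t calls[2];

    calls[0].op      = __HYPERVISOR_stack_switch;  /* see later section */
    calls[0].args[0] = new_ss;
    calls[0].args[1] = new_esp;

    calls[1].op      = __HYPERVISOR_fpu_taskswitch;

    if (HYPERVISOR_multicall(calls, 2) != 0)
        /* handle partial failure */ ;
}
\end{verbatim}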
    6.43 -
    6.44 -
    6.45 -\section{Virtual CPU Setup} 
    6.46 -
    6.47 -At start of day, a guest operating system needs to setup the virtual
    6.48 -CPU it is executing on. This includes installing vectors for the
    6.49 -virtual IDT so that the guest OS can handle interrupts, page faults,
    6.50 -etc. However the very first thing a guest OS must setup is a pair 
    6.51 -of hypervisor callbacks: these are the entry points which Xen will
    6.52 -use when it wishes to notify the guest OS of an occurrence. 
    6.53 -
    6.54 -\begin{quote}
    6.55 -\hypercall{set\_callbacks(unsigned long event\_selector, unsigned long
    6.56 -  event\_address, unsigned long failsafe\_selector, unsigned long
    6.57 -  failsafe\_address) }
    6.58 -
    6.59 -Register the normal (``event'') and failsafe callbacks for 
    6.60 -event processing. In each case the code segment selector and 
    6.61 -address within that segment are provided. The selectors must
    6.62 -have RPL 1; in XenLinux we simply use the kernel's CS for both 
    6.63 -{\tt event\_selector} and {\tt failsafe\_selector}.
    6.64 -
     6.65 -The value {\tt event\_address} specifies the address of the guest OS's
    6.66 -event handling and dispatch routine; the {\tt failsafe\_address}
    6.67 -specifies a separate entry point which is used only if a fault occurs
    6.68 -when Xen attempts to use the normal callback. 
    6.69 -\end{quote} 
    6.70 -
    6.71 -
    6.72 -After installing the hypervisor callbacks, the guest OS can 
    6.73 -install a `virtual IDT' by using the following hypercall: 
    6.74 -
    6.75 -\begin{quote} 
    6.76 -\hypercall{set\_trap\_table(trap\_info\_t *table)} 
    6.77 -
    6.78 -Install one or more entries into the per-domain 
    6.79 -trap handler table (essentially a software version of the IDT). 
    6.80 -Each entry in the array pointed to by {\tt table} includes the 
    6.81 -exception vector number with the corresponding segment selector 
    6.82 -and entry point. Most guest OSes can use the same handlers on 
    6.83 -Xen as when running on the real hardware; an exception is the 
    6.84 -page fault handler (exception vector 14) where a modified 
    6.85 -stack-frame layout is used. 
    6.86 -
    6.87 -
    6.88 -\end{quote} 
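
A minimal sketch of installing a virtual IDT is shown below.  It
assumes a {\tt trap\_info\_t} containing the vector, a privilege flag,
the code segment selector and the handler address, and uses
hypothetical handler symbols and a {\tt KERNEL\_CS} selector; consult
the public headers and your guest port for the exact layout and names.

\begin{verbatim}
static trap_info_t example_trap_table[] = {
    { 14, 0, KERNEL_CS, (unsigned long)page_fault_handler },  /* #PF */
    {  3, 3, KERNEL_CS, (unsigned long)int3_handler       },  /* #BP */
    {  0, 0, 0, 0 }                        /* zero entry terminates   */
};

void example_install_virtual_idt(void)
{
    HYPERVISOR_set_trap_table(example_trap_table);
}
\end{verbatim}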
    6.89 -
    6.90 -
    6.91 -
    6.92 -\section{Scheduling and Timer}
    6.93 -
    6.94 -Domains are preemptively scheduled by Xen according to the 
    6.95 -parameters installed by domain 0 (see Section~\ref{s:dom0ops}). 
    6.96 -In addition, however, a domain may choose to explicitly 
    6.97 -control certain behavior with the following hypercall: 
    6.98 -
    6.99 -\begin{quote} 
   6.100 -\hypercall{sched\_op(unsigned long op)} 
   6.101 -
   6.102 -Request scheduling operation from hypervisor. The options are: {\it
   6.103 -yield}, {\it block}, and {\it shutdown}.  {\it yield} keeps the
   6.104 -calling domain runnable but may cause a reschedule if other domains
   6.105 -are runnable.  {\it block} removes the calling domain from the run
    6.106 -queue and causes it to sleep until an event is delivered to it.  {\it
   6.107 -shutdown} is used to end the domain's execution; the caller can
   6.108 -additionally specify whether the domain should reboot, halt or
   6.109 -suspend.
   6.110 -\end{quote} 
   6.111 -
   6.112 -To aid the implementation of a process scheduler within a guest OS,
   6.113 -Xen provides a virtual programmable timer:
   6.114 -
   6.115 -\begin{quote}
   6.116 -\hypercall{set\_timer\_op(uint64\_t timeout)} 
   6.117 -
   6.118 -Request a timer event to be sent at the specified system time (time 
   6.119 -in nanoseconds since system boot). The hypercall actually passes the 
   6.120 -64-bit timeout value as a pair of 32-bit values. 
   6.121 -
   6.122 -\end{quote} 
   6.123 -
   6.124 -Note that calling {\tt set\_timer\_op()} prior to {\tt sched\_op} 
   6.125 -allows block-with-timeout semantics. 
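
For example, a guest idle loop might implement block-with-timeout
roughly as follows; the {\tt HYPERVISOR\_*} wrapper names follow the
usual guest-side convention and are illustrative rather than
definitive.

\begin{verbatim}
/* Sleep until an event arrives or delta_ns nanoseconds of system time
   have elapsed, whichever comes first. */
void example_block_with_timeout(uint64_t now, uint64_t delta_ns)
{
    HYPERVISOR_set_timer_op(now + delta_ns);   /* arm one-shot timer */
    HYPERVISOR_sched_op(SCHEDOP_block);        /* sleep until event  */
}
\end{verbatim}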
   6.126 -
   6.127 -
   6.128 -\section{Page Table Management} 
   6.129 -
   6.130 -Since guest operating systems have read-only access to their page 
   6.131 -tables, Xen must be involved when making any changes. The following
   6.132 -multi-purpose hypercall can be used to modify page-table entries, 
   6.133 -update the machine-to-physical mapping table, flush the TLB, install 
   6.134 -a new page-table base pointer, and more.
   6.135 -
   6.136 -\begin{quote} 
   6.137 -\hypercall{mmu\_update(mmu\_update\_t *req, int count, int *success\_count)} 
   6.138 -
   6.139 -Update the page table for the domain; a set of {\tt count} updates are
   6.140 -submitted for processing in a batch, with {\tt success\_count} being 
   6.141 -updated to report the number of successful updates.  
   6.142 -
   6.143 -Each element of {\tt req[]} contains a pointer (address) and value; 
   6.144 -the least significant 2-bits of the pointer are used to distinguish 
   6.145 -the type of update requested as follows:
   6.146 -\begin{description} 
   6.147 -
   6.148 -\item[\it MMU\_NORMAL\_PT\_UPDATE:] update a page directory entry or
   6.149 -page table entry to the associated value; Xen will check that the
   6.150 -update is safe, as described in Chapter~\ref{c:memory}.
   6.151 -
   6.152 -\item[\it MMU\_MACHPHYS\_UPDATE:] update an entry in the
   6.153 -  machine-to-physical table. The calling domain must own the machine
   6.154 -  page in question (or be privileged).
   6.155 -
   6.156 -\item[\it MMU\_EXTENDED\_COMMAND:] perform additional MMU operations.
   6.157 -The set of additional MMU operations is considerable, and includes
   6.158 -updating {\tt cr3} (or just re-installing it for a TLB flush),
   6.159 -flushing the cache, installing a new LDT, or pinning \& unpinning
   6.160 -page-table pages (to ensure their reference count doesn't drop to zero
   6.161 -which would require a revalidation of all entries).
   6.162 -
   6.163 -Further extended commands are used to deal with granting and 
   6.164 -acquiring page ownership; see Section~\ref{s:idc}. 
   6.165 -
   6.166 -
   6.167 -\end{description}
   6.168 -
   6.169 -More details on the precise format of all commands can be 
   6.170 -found in {\tt xen/include/public/xen.h}. 
   6.171 -
   6.172 -
   6.173 -\end{quote}
   6.174 -
   6.175 -Explicitly updating batches of page table entries is extremely
   6.176 -efficient, but can require a number of alterations to the guest
   6.177 -OS. Using the writable page table mode (Chapter~\ref{c:memory}) is
   6.178 -recommended for new OS ports.
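
The sketch below batches two page-table entry writes into a single
hypercall, following the signature given above; the {\tt ptr}/{\tt val}
field names and the {\tt HYPERVISOR\_mmu\_update()} wrapper are the
conventional ones but should be checked against the public headers.

\begin{verbatim}
void example_update_two_ptes(unsigned long pte_ma0, unsigned long val0,
                             unsigned long pte_ma1, unsigned long val1)
{
    mmu_update_t req[2];
    int done;

    req[0].ptr = pte_ma0 | MMU_NORMAL_PT_UPDATE;  /* type in low 2 bits */
    req[0].val = val0;
    req[1].ptr = pte_ma1 | MMU_NORMAL_PT_UPDATE;
    req[1].val = val1;

    HYPERVISOR_mmu_update(req, 2, &done);
}
\end{verbatim}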
   6.179 -
   6.180 -Regardless of which page table update mode is being used, however,
   6.181 -there are some occasions (notably handling a demand page fault) where
   6.182 -a guest OS will wish to modify exactly one PTE rather than a
   6.183 -batch. This is catered for by the following:
   6.184 -
   6.185 -\begin{quote} 
   6.186 -\hypercall{update\_va\_mapping(unsigned long page\_nr, unsigned long
   6.187 -val, \\ unsigned long flags)}
   6.188 -
   6.189 -Update the currently installed PTE for the page {\tt page\_nr} to 
   6.190 -{\tt val}. As with {\tt mmu\_update()}, Xen checks the modification 
   6.191 -is safe before applying it. The {\tt flags} determine which kind
   6.192 -of TLB flush, if any, should follow the update. 
   6.193 -
   6.194 -\end{quote} 
   6.195 -
   6.196 -Finally, sufficiently privileged domains may occasionally wish to manipulate 
   6.197 -the pages of others: 
   6.198 -\begin{quote}
   6.199 -
   6.200 -\hypercall{update\_va\_mapping\_otherdomain(unsigned long page\_nr,
   6.201 -unsigned long val, unsigned long flags, uint16\_t domid)}
   6.202 -
   6.203 -Identical to {\tt update\_va\_mapping()} save that the pages being
   6.204 -mapped must belong to the domain {\tt domid}. 
   6.205 -
   6.206 -\end{quote}
   6.207 -
   6.208 -This privileged operation is currently used by backend virtual device
   6.209 -drivers to safely map pages containing I/O data. 
   6.210 -
   6.211 -
   6.212 -
   6.213 -\section{Segmentation Support}
   6.214 -
   6.215 -Xen allows guest OSes to install a custom GDT if they require it; 
   6.216 -this is context switched transparently whenever a domain is 
   6.217 -[de]scheduled.  The following hypercall is effectively a 
   6.218 -`safe' version of {\tt lgdt}: 
   6.219 -
   6.220 -\begin{quote}
   6.221 -\hypercall{set\_gdt(unsigned long *frame\_list, int entries)} 
   6.222 -
   6.223 -Install a global descriptor table for a domain; {\tt frame\_list} is
   6.224 -an array of up to 16 machine page frames within which the GDT resides,
   6.225 -with {\tt entries} being the actual number of descriptor-entry
   6.226 -slots. All page frames must be mapped read-only within the guest's
   6.227 -address space, and the table must be large enough to contain Xen's
   6.228 -reserved entries (see {\tt xen/include/public/arch-x86\_32.h}).
   6.229 -
   6.230 -\end{quote}
   6.231 -
   6.232 -Many guest OSes will also wish to install LDTs; this is achieved by
   6.233 -using {\tt mmu\_update()} with an extended command, passing the
   6.234 -linear address of the LDT base along with the number of entries. No
   6.235 -special safety checks are required; Xen needs to perform this task
   6.236 -simply since {\tt lldt} requires CPL 0.
   6.237 -
   6.238 -
   6.239 -Xen also allows guest operating systems to update just an 
   6.240 -individual segment descriptor in the GDT or LDT:  
   6.241 -
   6.242 -\begin{quote}
   6.243 -\hypercall{update\_descriptor(unsigned long ma, unsigned long word1,
   6.244 -unsigned long word2)}
   6.245 -
   6.246 -Update the GDT/LDT entry at machine address {\tt ma}; the new
   6.247 -8-byte descriptor is stored in {\tt word1} and {\tt word2}.
   6.248 -Xen performs a number of checks to ensure the descriptor is 
   6.249 -valid. 
   6.250 -
   6.251 -\end{quote}
   6.252 -
   6.253 -Guest OSes can use the above in place of context switching entire 
   6.254 -LDTs (or the GDT) when the number of changing descriptors is small. 
   6.255 -
   6.256 -\section{Context Switching} 
   6.257 -
   6.258 -When a guest OS wishes to context switch between two processes, 
   6.259 -it can use the page table and segmentation hypercalls described
    6.260 -above to perform the bulk of the privileged work. In addition, 
   6.261 -however, it will need to invoke Xen to switch the kernel (ring 1) 
   6.262 -stack pointer: 
   6.263 -
   6.264 -\begin{quote} 
   6.265 -\hypercall{stack\_switch(unsigned long ss, unsigned long esp)} 
   6.266 -
   6.267 -Request kernel stack switch from hypervisor; {\tt ss} is the new 
    6.268 -stack segment, while {\tt esp} is the new stack pointer. 
   6.269 -
   6.270 -\end{quote} 
   6.271 -
   6.272 -A final useful hypercall for context switching allows ``lazy'' 
   6.273 -save and restore of floating point state: 
   6.274 -
   6.275 -\begin{quote}
   6.276 -\hypercall{fpu\_taskswitch(void)} 
   6.277 -
   6.278 -This call instructs Xen to set the {\tt TS} bit in the {\tt cr0}
   6.279 -control register; this means that the next attempt to use floating
    6.280 -point will cause a trap which the guest OS can catch. Typically it will
   6.281 -then save/restore the FP state, and clear the {\tt TS} bit. 
   6.282 -\end{quote} 
   6.283 -
   6.284 -This is provided as an optimization only; guest OSes can also choose
   6.285 -to save and restore FP state on all context switches for simplicity. 
   6.286 -
   6.287 -
   6.288 -\section{Physical Memory Management}
   6.289 -
   6.290 -As mentioned previously, each domain has a maximum and current 
   6.291 -memory allocation. The maximum allocation, set at domain creation 
   6.292 -time, cannot be modified. However a domain can choose to reduce 
   6.293 -and subsequently grow its current allocation by using the
   6.294 -following call: 
   6.295 -
   6.296 -\begin{quote} 
   6.297 -\hypercall{dom\_mem\_op(unsigned int op, unsigned long *extent\_list,
   6.298 -  unsigned long nr\_extents, unsigned int extent\_order)}
   6.299 -
   6.300 -Increase or decrease current memory allocation (as determined by 
   6.301 -the value of {\tt op}). Each invocation provides a list of 
   6.302 -extents each of which is $2^s$ pages in size, 
   6.303 -where $s$ is the value of {\tt extent\_order}. 
   6.304 -
   6.305 -\end{quote} 
   6.306 -
   6.307 -In addition to simply reducing or increasing the current memory
   6.308 -allocation via a `balloon driver', this call is also useful for 
   6.309 -obtaining contiguous regions of machine memory when required (e.g. 
   6.310 -for certain PCI devices, or if using superpages).  
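
A balloon driver might return a single page to Xen along the following
lines; the {\tt MEMOP\_*} constant and wrapper name are the
conventional guest-side ones and should be verified against the public
headers.

\begin{verbatim}
int example_balloon_out_one_page(unsigned long mfn)
{
    unsigned long extent = mfn;        /* one extent of order 0 (4kB) */

    return HYPERVISOR_dom_mem_op(MEMOP_decrease_reservation,
                                 &extent, 1, 0);
}
\end{verbatim}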
   6.311 -
   6.312 -
   6.313 -\section{Inter-Domain Communication}
   6.314 -\label{s:idc} 
   6.315 -
   6.316 -Xen provides a simple asynchronous notification mechanism via
   6.317 -\emph{event channels}. Each domain has a set of end-points (or
   6.318 -\emph{ports}) which may be bound to an event source (e.g. a physical
    6.319 -IRQ, a virtual IRQ, or a port in another domain). When a pair of
   6.320 -end-points in two different domains are bound together, then a `send'
   6.321 -operation on one will cause an event to be received by the destination
   6.322 -domain.
   6.323 -
   6.324 -The control and use of event channels involves the following hypercall: 
   6.325 -
   6.326 -\begin{quote}
   6.327 -\hypercall{event\_channel\_op(evtchn\_op\_t *op)} 
   6.328 -
   6.329 -Inter-domain event-channel management; {\tt op} is a discriminated 
   6.330 -union which allows the following 7 operations: 
   6.331 -
   6.332 -\begin{description} 
   6.333 -
   6.334 -\item[\it alloc\_unbound:] allocate a free (unbound) local
   6.335 -  port and prepare for connection from a specified domain. 
   6.336 -\item[\it bind\_virq:] bind a local port to a virtual 
   6.337 -IRQ; any particular VIRQ can be bound to at most one port per domain. 
   6.338 -\item[\it bind\_pirq:] bind a local port to a physical IRQ;
   6.339 -once more, a given pIRQ can be bound to at most one port per
   6.340 -domain. Furthermore the calling domain must be sufficiently
   6.341 -privileged.
   6.342 -\item[\it bind\_interdomain:] construct an interdomain event 
   6.343 -channel; in general, the target domain must have previously allocated 
   6.344 -an unbound port for this channel, although this can be bypassed by 
   6.345 -privileged domains during domain setup. 
   6.346 -\item[\it close:] close an interdomain event channel. 
    6.347 -\item[\it send:] send an event to the remote end of an 
   6.348 -interdomain event channel. 
   6.349 -\item[\it status:] determine the current status of a local port. 
   6.350 -\end{description} 
   6.351 -
   6.352 -For more details see
   6.353 -{\tt xen/include/public/event\_channel.h}. 
   6.354 -
   6.355 -\end{quote} 
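
As an illustration, notifying the remote end of an established
interdomain channel might look as follows; the union member and field
names are indicative only and should be taken from {\tt
event\_channel.h} rather than from this sketch.

\begin{verbatim}
void example_notify_remote(int local_port)
{
    evtchn_op_t op;

    op.cmd               = EVTCHNOP_send;
    op.u.send.local_port = local_port;

    HYPERVISOR_event_channel_op(&op);
}
\end{verbatim}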
   6.356 -
   6.357 -Event channels are the fundamental communication primitive between 
   6.358 -Xen domains and seamlessly support SMP. However they provide little
   6.359 -bandwidth for communication {\sl per se}, and hence are typically 
   6.360 -married with a piece of shared memory to produce effective and 
   6.361 -high-performance inter-domain communication. 
   6.362 -
   6.363 -Safe sharing of memory pages between guest OSes is carried out by
   6.364 -granting access on a per page basis to individual domains. This is
   6.365 -achieved by using the {\tt grant\_table\_op()} hypercall.
   6.366 -
   6.367 -\begin{quote}
   6.368 -\hypercall{grant\_table\_op(unsigned int cmd, void *uop, unsigned int count)}
   6.369 -
   6.370 -Grant or remove access to a particular page to a particular domain. 
   6.371 -
   6.372 -\end{quote} 
   6.373 -
   6.374 -This is not currently widely in use by guest operating systems, but 
   6.375 -we intend to integrate support more fully in the near future. 
   6.376 -
   6.377 -\section{PCI Configuration} 
   6.378 -
   6.379 -Domains with physical device access (i.e.\ driver domains) receive
   6.380 -limited access to certain PCI devices (bus address space and
   6.381 -interrupts). However many guest operating systems attempt to 
    6.382 -determine the PCI configuration by directly accessing the PCI BIOS, 
   6.383 -which cannot be allowed for safety. 
   6.384 -
   6.385 -Instead, Xen provides the following hypercall: 
   6.386 -
   6.387 -\begin{quote}
   6.388 -\hypercall{physdev\_op(void *physdev\_op)}
   6.389 -
    6.390 -Perform a PCI configuration operation; depending on the value 
   6.391 -of {\tt physdev\_op} this can be a PCI config read, a PCI config 
   6.392 -write, or a small number of other queries. 
   6.393 -
   6.394 -\end{quote} 
   6.395 -
   6.396 -
   6.397 -For examples of using {\tt physdev\_op()}, see the 
   6.398 -Xen-specific PCI code in the linux sparse tree. 
   6.399 -
   6.400 -\section{Administrative Operations}
   6.401 -\label{s:dom0ops}
   6.402 -
   6.403 -A large number of control operations are available to a sufficiently
   6.404 -privileged domain (typically domain 0). These allow the creation and
   6.405 -management of new domains, for example. A complete list is given 
   6.406 -below: for more details on any or all of these, please see 
   6.407 -{\tt xen/include/public/dom0\_ops.h} 
   6.408 -
   6.409 -
   6.410 -\begin{quote}
   6.411 -\hypercall{dom0\_op(dom0\_op\_t *op)} 
   6.412 -
   6.413 -Administrative domain operations for domain management. The options are:
   6.414 -
   6.415 -\begin{description} 
   6.416 -\item [\it DOM0\_CREATEDOMAIN:] create a new domain
   6.417 -
   6.418 -\item [\it DOM0\_PAUSEDOMAIN:] remove a domain from the scheduler run 
   6.419 -queue. 
   6.420 -
   6.421 -\item [\it DOM0\_UNPAUSEDOMAIN:] mark a paused domain as schedulable
   6.422 -  once again. 
   6.423 -
   6.424 -\item [\it DOM0\_DESTROYDOMAIN:] deallocate all resources associated
   6.425 -with a domain
   6.426 -
   6.427 -\item [\it DOM0\_GETMEMLIST:] get list of pages used by the domain
   6.428 -
   6.429 -\item [\it DOM0\_SCHEDCTL:]
   6.430 -
   6.431 -\item [\it DOM0\_ADJUSTDOM:] adjust scheduling priorities for domain
   6.432 -
   6.433 -\item [\it DOM0\_BUILDDOMAIN:] do final guest OS setup for domain
   6.434 -
   6.435 -\item [\it DOM0\_GETDOMAINFO:] get statistics about the domain
   6.436 -
   6.437 -\item [\it DOM0\_GETPAGEFRAMEINFO:] 
   6.438 -
   6.439 -\item [\it DOM0\_GETPAGEFRAMEINFO2:]
   6.440 -
   6.441 -\item [\it DOM0\_IOPL:] set I/O privilege level
   6.442 -
   6.443 -\item [\it DOM0\_MSR:] read or write model specific registers
   6.444 -
   6.445 -\item [\it DOM0\_DEBUG:] interactively invoke the debugger
   6.446 -
   6.447 -\item [\it DOM0\_SETTIME:] set system time
   6.448 -
   6.449 -\item [\it DOM0\_READCONSOLE:] read console content from hypervisor buffer ring
   6.450 -
   6.451 -\item [\it DOM0\_PINCPUDOMAIN:] pin domain to a particular CPU
   6.452 -
   6.453 -\item [\it DOM0\_GETTBUFS:] get information about the size and location of
   6.454 -                      the trace buffers (only on trace-buffer enabled builds)
   6.455 -
   6.456 -\item [\it DOM0\_PHYSINFO:] get information about the host machine
   6.457 -
   6.458 -\item [\it DOM0\_PCIDEV\_ACCESS:] modify PCI device access permissions
   6.459 -
   6.460 -\item [\it DOM0\_SCHED\_ID:] get the ID of the current Xen scheduler
   6.461 -
   6.462 -\item [\it DOM0\_SHADOW\_CONTROL:] switch between shadow page-table modes
   6.463 -
   6.464 -\item [\it DOM0\_SETDOMAININITIALMEM:] set initial memory allocation of a domain
   6.465 -
   6.466 -\item [\it DOM0\_SETDOMAINMAXMEM:] set maximum memory allocation of a domain
   6.467 -
   6.468 -\item [\it DOM0\_SETDOMAINVMASSIST:] set domain VM assist options
   6.469 -\end{description} 
   6.470 -\end{quote} 
   6.471 -
   6.472 -Most of the above are best understood by looking at the code 
   6.473 -implementing them (in {\tt xen/common/dom0\_ops.c}) and in 
   6.474 -the user-space tools that use them (mostly in {\tt tools/libxc}). 
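
As a rough illustration, a privileged domain might query another domain
as follows; the field names here are indicative and the authoritative
layout is the one in {\tt dom0\_ops.h}.

\begin{verbatim}
int example_get_domain_info(uint16_t domid, dom0_getdomaininfo_t *info)
{
    dom0_op_t op;
    int rc;

    op.cmd                    = DOM0_GETDOMAININFO;
    op.interface_version      = DOM0_INTERFACE_VERSION;
    op.u.getdomaininfo.domain = domid;

    rc = HYPERVISOR_dom0_op(&op);
    if (rc == 0)
        *info = op.u.getdomaininfo;
    return rc;
}
\end{verbatim}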
   6.475 -
   6.476 -\section{Debugging Hypercalls} 
   6.477 -
   6.478 -A few additional hypercalls are mainly useful for debugging: 
   6.479 -
   6.480 -\begin{quote} 
   6.481 -\hypercall{console\_io(int cmd, int count, char *str)}
   6.482 -
   6.483 -Use Xen to interact with the console; operations are:
   6.484 -
   6.485 -{\it CONSOLEIO\_write}: Output count characters from buffer str.
   6.486 -
   6.487 -{\it CONSOLEIO\_read}: Input at most count characters into buffer str.
   6.488 -\end{quote} 
   6.489 -
   6.490 -A pair of hypercalls allows access to the underlying debug registers: 
   6.491 -\begin{quote}
   6.492 -\hypercall{set\_debugreg(int reg, unsigned long value)}
   6.493 -
   6.494 -Set debug register {\tt reg} to {\tt value} 
   6.495 -
   6.496 -\hypercall{get\_debugreg(int reg)}
   6.497 -
   6.498 -Return the contents of the debug register {\tt reg}
   6.499 -\end{quote}
   6.500 -
   6.501 -And finally: 
   6.502 -\begin{quote}
   6.503 -\hypercall{xen\_version(int cmd)}
   6.504 -
   6.505 -Request Xen version number.
   6.506 -\end{quote} 
   6.507 -
   6.508 -This is useful to ensure that user-space tools are in sync 
   6.509 -with the underlying hypervisor. 
   6.510 -
   6.511 -\section{Deprecated Hypercalls}
   6.512 -
   6.513 -Xen is under constant development and refinement; as such there 
   6.514 -are plans to improve the way in which various pieces of functionality 
   6.515 -are exposed to guest OSes. 
   6.516 -
   6.517 -\begin{quote} 
   6.518 -\hypercall{vm\_assist(unsigned int cmd, unsigned int type)}
   6.519 -
    6.520 -Toggle various memory management modes (in particular writable page
   6.521 -tables and superpage support). 
   6.522 -
   6.523 -\end{quote} 
   6.524 -
   6.525 -This is likely to be replaced with mode values in the shared 
   6.526 -information page since this is more resilient for resumption 
   6.527 -after migration or checkpoint. 
     7.1 --- a/docs/src/interface/memory.tex	Sun Dec 04 20:12:00 2005 +0100
     7.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     7.3 @@ -1,162 +0,0 @@
     7.4 -\chapter{Memory}
     7.5 -\label{c:memory} 
     7.6 -
     7.7 -Xen is responsible for managing the allocation of physical memory to
     7.8 -domains, and for ensuring safe use of the paging and segmentation
     7.9 -hardware.
    7.10 -
    7.11 -
    7.12 -\section{Memory Allocation}
    7.13 -
    7.14 -Xen resides within a small fixed portion of physical memory; it also
    7.15 -reserves the top 64MB of every virtual address space. The remaining
    7.16 -physical memory is available for allocation to domains at a page
    7.17 -granularity.  Xen tracks the ownership and use of each page, which
    7.18 -allows it to enforce secure partitioning between domains.
    7.19 -
    7.20 -Each domain has a maximum and current physical memory allocation.  A
    7.21 -guest OS may run a `balloon driver' to dynamically adjust its current
    7.22 -memory allocation up to its limit.
    7.23 -
    7.24 -
    7.25 -%% XXX SMH: I use machine and physical in the next section (which is
    7.26 -%% kinda required for consistency with code); wonder if this section
    7.27 -%% should use same terms?
    7.28 -%%
    7.29 -%% Probably. 
    7.30 -%%
    7.31 -%% Merging this and below section at some point prob makes sense.
    7.32 -
    7.33 -\section{Pseudo-Physical Memory}
    7.34 -
    7.35 -Since physical memory is allocated and freed on a page granularity,
    7.36 -there is no guarantee that a domain will receive a contiguous stretch
    7.37 -of physical memory. However most operating systems do not have good
    7.38 -support for operating in a fragmented physical address space. To aid
    7.39 -porting such operating systems to run on top of Xen, we make a
    7.40 -distinction between \emph{machine memory} and \emph{pseudo-physical
    7.41 -  memory}.
    7.42 -
    7.43 -Put simply, machine memory refers to the entire amount of memory
    7.44 -installed in the machine, including that reserved by Xen, in use by
    7.45 -various domains, or currently unallocated. We consider machine memory
    7.46 -to comprise a set of 4K \emph{machine page frames} numbered
    7.47 -consecutively starting from 0. Machine frame numbers mean the same
    7.48 -within Xen or any domain.
    7.49 -
    7.50 -Pseudo-physical memory, on the other hand, is a per-domain
    7.51 -abstraction. It allows a guest operating system to consider its memory
    7.52 -allocation to consist of a contiguous range of physical page frames
    7.53 -starting at physical frame 0, despite the fact that the underlying
    7.54 -machine page frames may be sparsely allocated and in any order.
    7.55 -
    7.56 -To achieve this, Xen maintains a globally readable {\it
    7.57 -  machine-to-physical} table which records the mapping from machine
    7.58 -page frames to pseudo-physical ones. In addition, each domain is
    7.59 -supplied with a {\it physical-to-machine} table which performs the
    7.60 -inverse mapping. Clearly the machine-to-physical table has size
    7.61 -proportional to the amount of RAM installed in the machine, while each
    7.62 -physical-to-machine table has size proportional to the memory
    7.63 -allocation of the given domain.
    7.64 -
    7.65 -Architecture dependent code in guest operating systems can then use
    7.66 -the two tables to provide the abstraction of pseudo-physical memory.
    7.67 -In general, only certain specialized parts of the operating system
     7.68 -(such as page table management) need to understand the difference
    7.69 -between machine and pseudo-physical addresses.
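
For example, the XenLinux port provides conversion helpers along these
lines; the table and helper names below follow that port's conventions
and are shown purely as an illustration of how the two tables are used.

\begin{verbatim}
extern unsigned long *machine_to_phys_mapping;  /* global, read-only   */
extern unsigned long *phys_to_machine_mapping;  /* per-domain table    */

static inline unsigned long pfn_to_mfn(unsigned long pfn)
{
    return phys_to_machine_mapping[pfn];        /* pseudo-phys -> mach */
}

static inline unsigned long mfn_to_pfn(unsigned long mfn)
{
    return machine_to_phys_mapping[mfn];        /* mach -> pseudo-phys */
}
\end{verbatim}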
    7.70 -
    7.71 -
    7.72 -\section{Page Table Updates}
    7.73 -
    7.74 -In the default mode of operation, Xen enforces read-only access to
    7.75 -page tables and requires guest operating systems to explicitly request
    7.76 -any modifications.  Xen validates all such requests and only applies
    7.77 -updates that it deems safe.  This is necessary to prevent domains from
    7.78 -adding arbitrary mappings to their page tables.
    7.79 -
    7.80 -To aid validation, Xen associates a type and reference count with each
    7.81 -memory page. A page has one of the following mutually-exclusive types
    7.82 -at any point in time: page directory ({\sf PD}), page table ({\sf
    7.83 -  PT}), local descriptor table ({\sf LDT}), global descriptor table
    7.84 -({\sf GDT}), or writable ({\sf RW}). Note that a guest OS may always
    7.85 -create readable mappings of its own memory regardless of its current
    7.86 -type.
    7.87 -
    7.88 -%%% XXX: possibly explain more about ref count 'lifecyle' here?
    7.89 -This mechanism is used to maintain the invariants required for safety;
    7.90 -for example, a domain cannot have a writable mapping to any part of a
    7.91 -page table as this would require the page concerned to simultaneously
    7.92 -be of types {\sf PT} and {\sf RW}.
    7.93 -
    7.94 -
    7.95 -% \section{Writable Page Tables}
    7.96 -
     7.97 -Xen also provides an alternative mode of operation in which guests
    7.98 -have the illusion that their page tables are directly writable.  Of
    7.99 -course this is not really the case, since Xen must still validate
   7.100 -modifications to ensure secure partitioning. To this end, Xen traps
   7.101 -any write attempt to a memory page of type {\sf PT} (i.e., that is
   7.102 -currently part of a page table).  If such an access occurs, Xen
   7.103 -temporarily allows write access to that page while at the same time
   7.104 -\emph{disconnecting} it from the page table that is currently in use.
   7.105 -This allows the guest to safely make updates to the page because the
   7.106 -newly-updated entries cannot be used by the MMU until Xen revalidates
   7.107 -and reconnects the page.  Reconnection occurs automatically in a
   7.108 -number of situations: for example, when the guest modifies a different
   7.109 -page-table page, when the domain is preempted, or whenever the guest
   7.110 -uses Xen's explicit page-table update interfaces.
   7.111 -
   7.112 -Finally, Xen also supports a form of \emph{shadow page tables} in
    7.113 -which the guest OS uses an independent copy of page tables which are
   7.114 -unknown to the hardware (i.e.\ which are never pointed to by {\tt
   7.115 -  cr3}). Instead Xen propagates changes made to the guest's tables to
   7.116 -the real ones, and vice versa. This is useful for logging page writes
   7.117 -(e.g.\ for live migration or checkpoint). A full version of the shadow
   7.118 -page tables also allows guest OS porting with less effort.
   7.119 -
   7.120 -
   7.121 -\section{Segment Descriptor Tables}
   7.122 -
   7.123 -On boot a guest is supplied with a default GDT, which does not reside
   7.124 -within its own memory allocation.  If the guest wishes to use other
   7.125 -than the default `flat' ring-1 and ring-3 segments that this GDT
   7.126 -provides, it must register a custom GDT and/or LDT with Xen, allocated
   7.127 -from its own memory. Note that a number of GDT entries are reserved by
   7.128 -Xen -- any custom GDT must also include sufficient space for these
   7.129 -entries.
   7.130 -
   7.131 -For example, the following hypercall is used to specify a new GDT:
   7.132 -
   7.133 -\begin{quote}
   7.134 -  int {\bf set\_gdt}(unsigned long *{\em frame\_list}, int {\em
   7.135 -    entries})
   7.136 -
   7.137 -  \emph{frame\_list}: An array of up to 16 machine page frames within
   7.138 -  which the GDT resides.  Any frame registered as a GDT frame may only
   7.139 -  be mapped read-only within the guest's address space (e.g., no
   7.140 -  writable mappings, no use as a page-table page, and so on).
   7.141 -
   7.142 -  \emph{entries}: The number of descriptor-entry slots in the GDT.
   7.143 -  Note that the table must be large enough to contain Xen's reserved
   7.144 -  entries; thus we must have `{\em entries $>$
   7.145 -    LAST\_RESERVED\_GDT\_ENTRY}\ '.  Note also that, after registering
   7.146 -  the GDT, slots \emph{FIRST\_} through
   7.147 -  \emph{LAST\_RESERVED\_GDT\_ENTRY} are no longer usable by the guest
   7.148 -  and may be overwritten by Xen.
   7.149 -\end{quote}
   7.150 -
   7.151 -The LDT is updated via the generic MMU update mechanism (i.e., via the
    7.152 -{\tt mmu\_update()} hypercall).
   7.153 -
   7.154 -\section{Start of Day}
   7.155 -
   7.156 -The start-of-day environment for guest operating systems is rather
   7.157 -different to that provided by the underlying hardware. In particular,
   7.158 -the processor is already executing in protected mode with paging
   7.159 -enabled.
   7.160 -
   7.161 -{\it Domain 0} is created and booted by Xen itself. For all subsequent
   7.162 -domains, the analogue of the boot-loader is the {\it domain builder},
   7.163 -user-space software running in {\it domain 0}. The domain builder is
   7.164 -responsible for building the initial page tables for a domain and
   7.165 -loading its kernel image at the appropriate virtual address.
     8.1 --- a/docs/src/interface/scheduling.tex	Sun Dec 04 20:12:00 2005 +0100
     8.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
     8.3 @@ -1,268 +0,0 @@
     8.4 -\chapter{Scheduling API}  
     8.5 -
     8.6 -The scheduling API is used by both the schedulers described above and should
     8.7 -also be used by any new schedulers.  It provides a generic interface and also
     8.8 -implements much of the ``boilerplate'' code.
     8.9 -
    8.10 -Schedulers conforming to this API are described by the following
    8.11 -structure:
    8.12 -
    8.13 -\begin{verbatim}
    8.14 -struct scheduler
    8.15 -{
    8.16 -    char *name;             /* full name for this scheduler      */
    8.17 -    char *opt_name;         /* option name for this scheduler    */
    8.18 -    unsigned int sched_id;  /* ID for this scheduler             */
    8.19 -
    8.20 -    int          (*init_scheduler) ();
    8.21 -    int          (*alloc_task)     (struct task_struct *);
    8.22 -    void         (*add_task)       (struct task_struct *);
    8.23 -    void         (*free_task)      (struct task_struct *);
    8.24 -    void         (*rem_task)       (struct task_struct *);
    8.25 -    void         (*wake_up)        (struct task_struct *);
    8.26 -    void         (*do_block)       (struct task_struct *);
    8.27 -    task_slice_t (*do_schedule)    (s_time_t);
    8.28 -    int          (*control)        (struct sched_ctl_cmd *);
    8.29 -    int          (*adjdom)         (struct task_struct *,
    8.30 -                                    struct sched_adjdom_cmd *);
    8.31 -    s32          (*reschedule)     (struct task_struct *);
    8.32 -    void         (*dump_settings)  (void);
    8.33 -    void         (*dump_cpu_state) (int);
    8.34 -    void         (*dump_runq_el)   (struct task_struct *);
    8.35 -};
    8.36 -\end{verbatim}
    8.37 -
    8.38 -The only method that {\em must} be implemented is
    8.39 -{\tt do\_schedule()}.  However, if there is not some implementation for the
    8.40 -{\tt wake\_up()} method then waking tasks will not get put on the runqueue!
    8.41 -
    8.42 -The fields of the above structure are described in more detail below.
    8.43 -
    8.44 -\subsubsection{name}
    8.45 -
    8.46 -The name field should point to a descriptive ASCII string.
    8.47 -
    8.48 -\subsubsection{opt\_name}
    8.49 -
    8.50 -This field is the value of the {\tt sched=} boot-time option that will select
    8.51 -this scheduler.
    8.52 -
    8.53 -\subsubsection{sched\_id}
    8.54 -
    8.55 -This is an integer that uniquely identifies this scheduler.  There should be a
     8.56 -macro corresponding to this scheduler ID in {\tt <xen/sched-if.h>}.
    8.57 -
    8.58 -\subsubsection{init\_scheduler}
    8.59 -
    8.60 -\paragraph*{Purpose}
    8.61 -
    8.62 -This is a function for performing any scheduler-specific initialisation.  For
    8.63 -instance, it might allocate memory for per-CPU scheduler data and initialise it
    8.64 -appropriately.
    8.65 -
    8.66 -\paragraph*{Call environment}
    8.67 -
    8.68 -This function is called after the initialisation performed by the generic
    8.69 -layer.  The function is called exactly once, for the scheduler that has been
    8.70 -selected.
    8.71 -
    8.72 -\paragraph*{Return values}
    8.73 -
    8.74 -This should return negative on failure --- this will cause an
    8.75 -immediate panic and the system will fail to boot.
    8.76 -
    8.77 -\subsubsection{alloc\_task}
    8.78 -
    8.79 -\paragraph*{Purpose}
    8.80 -Called when a {\tt task\_struct} is allocated by the generic scheduler
    8.81 -layer.  A particular scheduler implementation may use this method to
    8.82 -allocate per-task data for this task.  It may use the {\tt
    8.83 -sched\_priv} pointer in the {\tt task\_struct} to point to this data.
    8.84 -
    8.85 -\paragraph*{Call environment}
    8.86 -The generic layer guarantees that the {\tt sched\_priv} field will
    8.87 -remain intact from the time this method is called until the task is
    8.88 -deallocated (so long as the scheduler implementation does not change
    8.89 -it explicitly!).
    8.90 -
    8.91 -\paragraph*{Return values}
    8.92 -Negative on failure.
    8.93 -
    8.94 -\subsubsection{add\_task}
    8.95 -
    8.96 -\paragraph*{Purpose}
    8.97 -
    8.98 -Called when a task is initially added by the generic layer.
    8.99 -
   8.100 -\paragraph*{Call environment}
   8.101 -
   8.102 -The fields in the {\tt task\_struct} are now filled out and available for use.
   8.103 -Schedulers should implement appropriate initialisation of any per-task private
   8.104 -information in this method.
   8.105 -
   8.106 -\subsubsection{free\_task}
   8.107 -
   8.108 -\paragraph*{Purpose}
   8.109 -
   8.110 -Schedulers should free the space used by any associated private data
   8.111 -structures.
   8.112 -
   8.113 -\paragraph*{Call environment}
   8.114 -
   8.115 -This is called when a {\tt task\_struct} is about to be deallocated.
   8.116 -The generic layer will have done generic task removal operations and
   8.117 -(if implemented) called the scheduler's {\tt rem\_task} method before
   8.118 -this method is called.
   8.119 -
   8.120 -\subsubsection{rem\_task}
   8.121 -
   8.122 -\paragraph*{Purpose}
   8.123 -
   8.124 -This is called when a task is being removed from scheduling (but is
   8.125 -not yet being freed).
   8.126 -
   8.127 -\subsubsection{wake\_up}
   8.128 -
   8.129 -\paragraph*{Purpose}
   8.130 -
   8.131 -Called when a task is woken up, this method should put the task on the runqueue
   8.132 -(or do the scheduler-specific equivalent action).
   8.133 -
   8.134 -\paragraph*{Call environment}
   8.135 -
   8.136 -The task is already set to state RUNNING.
   8.137 -
   8.138 -\subsubsection{do\_block}
   8.139 -
   8.140 -\paragraph*{Purpose}
   8.141 -
   8.142 -This function is called when a task is blocked.  This function should
   8.143 -not remove the task from the runqueue.
   8.144 -
   8.145 -\paragraph*{Call environment}
   8.146 -
   8.147 -The EVENTS\_MASTER\_ENABLE\_BIT is already set and the task state changed to
   8.148 -TASK\_INTERRUPTIBLE on entry to this method.  A call to the {\tt
   8.149 -  do\_schedule} method will be made after this method returns, in
   8.150 -order to select the next task to run.
   8.151 -
   8.152 -\subsubsection{do\_schedule}
   8.153 -
   8.154 -This method must be implemented.
   8.155 -
   8.156 -\paragraph*{Purpose}
   8.157 -
   8.158 -The method is called each time a new task must be chosen for scheduling on the
    8.159 -current CPU.  The current time is passed as the single argument (the current
   8.160 -task can be found using the {\tt current} macro).
   8.161 -
    8.162 -This method should select the next task to run on this CPU and set its minimum
   8.163 -time to run as well as returning the data described below.
   8.164 -
   8.165 -This method should also take the appropriate action if the previous
   8.166 -task has blocked, e.g. removing it from the runqueue.
   8.167 -
   8.168 -\paragraph*{Call environment}
   8.169 -
   8.170 -The other fields in the {\tt task\_struct} are updated by the generic layer,
   8.171 -which also performs all Xen-specific tasks and performs the actual task switch
   8.172 -(unless the previous task has been chosen again).
   8.173 -
   8.174 -This method is called with the {\tt schedule\_lock} held for the current CPU
   8.175 -and local interrupts disabled.
   8.176 -
   8.177 -\paragraph*{Return values}
   8.178 -
   8.179 -Must return a {\tt struct task\_slice} describing what task to run and how long
   8.180 -for (at maximum).
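
A skeletal implementation might look like the following; it assumes a
{\tt task\_slice\_t} holding the chosen task and the length of its
slice, uses the {\tt MILLISECS()} time helper, and
{\tt example\_pick\_next\_task()} stands in for scheduler-private
runqueue code.

\begin{verbatim}
extern struct task_struct *example_pick_next_task(void);

static task_slice_t example_do_schedule(s_time_t now)
{
    task_slice_t ret;

    ret.task = example_pick_next_task();  /* next runnable task         */
    ret.time = MILLISECS(10);             /* fixed 10ms slice (example) */
    return ret;
}
\end{verbatim}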
   8.181 -
   8.182 -\subsubsection{control}
   8.183 -
   8.184 -\paragraph*{Purpose}
   8.185 -
   8.186 -This method is called for global scheduler control operations.  It takes a
   8.187 -pointer to a {\tt struct sched\_ctl\_cmd}, which it should either
   8.188 -source data from or populate with data, depending on the value of the
   8.189 -{\tt direction} field.
   8.190 -
   8.191 -\paragraph*{Call environment}
   8.192 -
   8.193 -The generic layer guarantees that when this method is called, the
   8.194 -caller selected the correct scheduler ID, hence the scheduler's
   8.195 -implementation does not need to sanity-check these parts of the call.
   8.196 -
   8.197 -\paragraph*{Return values}
   8.198 -
   8.199 -This function should return the value to be passed back to user space, hence it
   8.200 -should either be 0 or an appropriate errno value.
   8.201 -
   8.202 -\subsubsection{sched\_adjdom}
   8.203 -
   8.204 -\paragraph*{Purpose}
   8.205 -
   8.206 -This method is called to adjust the scheduling parameters of a particular
   8.207 -domain, or to query their current values.  The function should check
   8.208 -the {\tt direction} field of the {\tt sched\_adjdom\_cmd} it receives in
   8.209 -order to determine which of these operations is being performed.
   8.210 -
   8.211 -\paragraph*{Call environment}
   8.212 -
   8.213 -The generic layer guarantees that the caller has specified the correct
   8.214 -control interface version and scheduler ID and that the supplied {\tt
   8.215 -task\_struct} will not be deallocated during the call (hence it is not
   8.216 -necessary to {\tt get\_task\_struct}).
   8.217 -
   8.218 -\paragraph*{Return values}
   8.219 -
   8.220 -This function should return the value to be passed back to user space, hence it
   8.221 -should either be 0 or an appropriate errno value.
   8.222 -
   8.223 -\subsubsection{reschedule}
   8.224 -
   8.225 -\paragraph*{Purpose}
   8.226 -
   8.227 -This method is called to determine if a reschedule is required as a result of a
   8.228 -particular task.
   8.229 -
   8.230 -\paragraph*{Call environment}
   8.231 -The generic layer will cause a reschedule if the current domain is the idle
   8.232 -task or it has exceeded its minimum time slice before a reschedule.  The
   8.233 -generic layer guarantees that the task passed is not currently running but is
   8.234 -on the runqueue.
   8.235 -
   8.236 -\paragraph*{Return values}
   8.237 -
   8.238 -Should return a mask of CPUs to cause a reschedule on.
   8.239 -
   8.240 -\subsubsection{dump\_settings}
   8.241 -
   8.242 -\paragraph*{Purpose}
   8.243 -
   8.244 -If implemented, this should dump any private global settings for this
   8.245 -scheduler to the console.
   8.246 -
   8.247 -\paragraph*{Call environment}
   8.248 -
   8.249 -This function is called with interrupts enabled.
   8.250 -
   8.251 -\subsubsection{dump\_cpu\_state}
   8.252 -
   8.253 -\paragraph*{Purpose}
   8.254 -
   8.255 -This method should dump any private settings for the specified CPU.
   8.256 -
   8.257 -\paragraph*{Call environment}
   8.258 -
   8.259 -This function is called with interrupts disabled and the {\tt schedule\_lock}
   8.260 -for the specified CPU held.
   8.261 -
   8.262 -\subsubsection{dump\_runq\_el}
   8.263 -
   8.264 -\paragraph*{Purpose}
   8.265 -
   8.266 -This method should dump any private settings for the specified task.
   8.267 -
   8.268 -\paragraph*{Call environment}
   8.269 -
   8.270 -This function is called with interrupts disabled and the {\tt schedule\_lock}
   8.271 -for the task's CPU held.
     9.1 --- a/docs/src/user.tex	Sun Dec 04 20:12:00 2005 +0100
     9.2 +++ b/docs/src/user.tex	Mon Dec 05 13:39:26 2005 +0100
     9.3 @@ -1,13 +1,13 @@
     9.4 -\batchmode
     9.5  \documentclass[11pt,twoside,final,openright]{report}
     9.6 -\usepackage{a4,graphicx,html,parskip,setspace,times,xspace}
     9.7 +\usepackage{a4,graphicx,html,parskip,setspace,times,xspace,url}
     9.8  \setstretch{1.15}
     9.9  
    9.10 +\renewcommand{\ttdefault}{pcr}
    9.11  
    9.12  \def\Xend{{Xend}\xspace}
    9.13  \def\xend{{xend}\xspace}
    9.14  
    9.15 -\latexhtml{\newcommand{\path}[1]{{\small {\tt #1}}}}{\newcommand{\path}[1]{{\tt #1}}}
    9.16 +\latexhtml{\renewcommand{\path}[1]{{\small {\tt #1}}}}{\renewcommand{\path}[1]{{\tt #1}}}
    9.17  
    9.18  
    9.19  \begin{document}
    9.20 @@ -23,17 +23,17 @@
    9.21  \begin{tabular}{l}
    9.22  {\Huge \bf Users' Manual} \\[4mm]
    9.23  {\huge Xen v3.0} \\[80mm]
    9.24 -
    9.25  {\Large Xen is Copyright (c) 2002-2005, The Xen Team} \\[3mm]
    9.26  {\Large University of Cambridge, UK} \\[20mm]
    9.27  \end{tabular}
    9.28  \end{center}
    9.29  
    9.30 -{\bf DISCLAIMER: This documentation is currently under active
    9.31 -  development and as such there may be mistakes and omissions --- watch
    9.32 -  out for these and please report any you find to the developers'
    9.33 -  mailing list. Contributions of material, suggestions and corrections
    9.34 -  are welcome.}
    9.35 +{\bf DISCLAIMER: This documentation is always under active development
    9.36 +and as such there may be mistakes and omissions --- watch out for
    9.37 +these and please report any you find to the developers' mailing list,
    9.38 +xen-devel@lists.xensource.com. The latest version is always available
    9.39 +on-line. Contributions of material, suggestions and corrections are
    9.40 +welcome.}
    9.41  
    9.42  \vfill
    9.43  \cleardoublepage
    9.44 @@ -62,112 +62,1745 @@
    9.45  
    9.46  
    9.47  %% Chapter Introduction moved to introduction.tex
    9.48 -\include{src/user/introduction}
    9.49 +\chapter{Introduction}
    9.50 +
    9.51 +
    9.52 +Xen is an open-source \emph{para-virtualizing} virtual machine monitor
    9.53 +(VMM), or ``hypervisor'', for the x86 processor architecture. Xen can
    9.54 +securely execute multiple virtual machines on a single physical system
    9.55 +with close-to-native performance.  Xen facilitates enterprise-grade
    9.56 +functionality, including:
    9.57 +
    9.58 +\begin{itemize}
    9.59 +\item Virtual machines with performance close to native hardware.
    9.60 +\item Live migration of running virtual machines between physical hosts.
    9.61 +\item Up to 32 virtual CPUs per guest virtual machine, with VCPU hotplug.
    9.62 +\item x86/32, x86/32 with PAE, and x86/64 platform support.
    9.63 +\item Intel Virtualization Technology (VT-x) for unmodified guest operating systems (including Microsoft Windows).
    9.64 +\item Excellent hardware support (supports almost all Linux device
    9.65 +  drivers). 
    9.66 +\end{itemize}
    9.67 +
    9.68 +Xen is licensed under the GNU General Public License (GPL2).
    9.69 +
    9.70 +
    9.71 +\section{Usage Scenarios}
    9.72 +
    9.73 +Usage scenarios for Xen include:
    9.74 +
    9.75 +\begin{description}
    9.76 +\item [Server Consolidation.] Move multiple servers onto a single
    9.77 +  physical host with performance and fault isolation provided at the
    9.78 +  virtual machine boundaries.
    9.79 +\item [Hardware Independence.] Allow legacy applications and operating 
    9.80 +  systems to exploit new hardware.
    9.81 +\item [Multiple OS configurations.] Run multiple operating systems
    9.82 +  simultaneously, for development or testing purposes.
    9.83 +\item [Kernel Development.] Test and debug kernel modifications in a
    9.84 +  sand-boxed virtual machine --- no need for a separate test machine.
    9.85 +\item [Cluster Computing.] Management at VM granularity provides more
    9.86 +  flexibility than separately managing each physical host, but better
    9.87 +  control and isolation than single-system image solutions,
    9.88 +  particularly by using live migration for load balancing.
    9.89 +\item [Hardware support for custom OSes.] Allow development of new
    9.90 +  OSes while benefiting from the wide-ranging hardware support of
    9.91 +  existing OSes such as Linux.
    9.92 +\end{description}
    9.93 +
    9.94 +
    9.95 +\section{Operating System Support}
    9.96 +
    9.97 +Para-virtualization permits very high performance virtualization, even
    9.98 +on architectures like x86 that are traditionally very hard to
    9.99 +virtualize.
   9.100 +
   9.101 +This approach requires operating systems to be \emph{ported} to run on
   9.102 +Xen. Porting an OS to run on Xen is similar to supporting a new
    9.103 +hardware platform; however, the process is simplified because the
   9.104 +para-virtual machine architecture is very similar to the underlying
   9.105 +native hardware. Even though operating system kernels must explicitly
   9.106 +support Xen, a key feature is that user space applications and
   9.107 +libraries \emph{do not} require modification.
   9.108 +
    9.109 +Hardware CPU virtualization, as provided by Intel VT and AMD
    9.110 +Pacifica technology, makes it possible to run an unmodified guest OS
    9.111 +kernel.  No porting of the OS is required, although some additional
    9.112 +driver support is necessary within Xen itself.  Unlike traditional
    9.113 +full virtualization hypervisors, which suffer a substantial
    9.114 +performance overhead, Xen combined with VT or Pacifica retains high
    9.115 +performance for para-virtualized guest operating systems while adding
    9.116 +full support for unmodified guests running natively on the processor.
    9.117 +Full support for VT and Pacifica chipsets will appear in early 2006.
   9.119 +
   9.120 +Paravirtualized Xen support is available for increasingly many
   9.121 +operating systems: currently, mature Linux support is available and
   9.122 +included in the standard distribution.  Other OS ports---including
   9.123 +NetBSD, FreeBSD and Solaris x86 v10---are nearing completion. 
   9.124 +
   9.125 +
   9.126 +\section{Hardware Support}
   9.127 +
   9.128 +Xen currently runs on the x86 architecture, requiring a ``P6'' or
   9.129 +newer processor (e.g.\ Pentium Pro, Celeron, Pentium~II, Pentium~III,
   9.130 +Pentium~IV, Xeon, AMD~Athlon, AMD~Duron). Multiprocessor machines are
   9.131 +supported, and there is support for HyperThreading (SMT).  In 
   9.132 +addition, ports to IA64 and Power architectures are in progress.
   9.133 +
    9.134 +The default 32-bit Xen supports up to 4GB of memory. However, Xen 3.0
    9.135 +adds support for Intel's Physical Address Extension (PAE), which
   9.136 +enable x86/32 machines to address up to 64 GB of physical memory.  Xen
   9.137 +3.0 also supports x86/64 platforms such as Intel EM64T and AMD Opteron
   9.138 +which can currently address up to 1TB of physical memory.
   9.139 +
   9.140 +Xen offloads most of the hardware support issues to the guest OS
   9.141 +running in the \emph{Domain~0} management virtual machine. Xen itself
   9.142 +contains only the code required to detect and start secondary
   9.143 +processors, set up interrupt routing, and perform PCI bus
   9.144 +enumeration. Device drivers run within a privileged guest OS rather
   9.145 +than within Xen itself. This approach provides compatibility with the
   9.146 +majority of device hardware supported by Linux. The default XenLinux
   9.147 +build contains support for most server-class network and disk
   9.148 +hardware, but you can add support for other hardware by configuring
   9.149 +your XenLinux kernel in the normal way.
   9.150 +
   9.151 +
   9.152 +\section{Structure of a Xen-Based System}
   9.153 +
   9.154 +A Xen system has multiple layers, the lowest and most privileged of
   9.155 +which is Xen itself.
   9.156 +
   9.157 +Xen may host multiple \emph{guest} operating systems, each of which is
    9.158 +executed within a secure virtual machine (in Xen terminology, a
    9.159 +\emph{domain}). Domains are scheduled by Xen to make effective use of the
   9.160 +available physical CPUs. Each guest OS manages its own applications.
   9.161 +This management includes the responsibility of scheduling each
   9.162 +application within the time allotted to the VM by Xen.
   9.163 +
   9.164 +The first domain, \emph{domain~0}, is created automatically when the
   9.165 +system boots and has special management privileges. Domain~0 builds
   9.166 +other domains and manages their virtual devices. It also performs
   9.167 +administrative tasks such as suspending, resuming and migrating other
   9.168 +virtual machines.
   9.169 +
   9.170 +Within domain~0, a process called \emph{xend} runs to manage the system.
   9.171 +\Xend\ is responsible for managing virtual machines and providing access
   9.172 +to their consoles. Commands are issued to \xend\ over an HTTP interface,
   9.173 +via a command-line tool.
   9.174 +
   9.175 +
   9.176 +\section{History}
   9.177 +
   9.178 +Xen was originally developed by the Systems Research Group at the
   9.179 +University of Cambridge Computer Laboratory as part of the XenoServers
   9.180 +project, funded by the UK-EPSRC\@.
   9.181 +
   9.182 +XenoServers aim to provide a ``public infrastructure for global
    9.183 +distributed computing''. Xen plays a key part in that, allowing a
    9.184 +single machine to be efficiently partitioned so that multiple
    9.185 +independent clients can run their operating systems and applications
    9.186 +in an environment that provides protection, resource isolation and
    9.187 +accounting. The project web page contains further information along
   9.188 +with pointers to papers and technical reports:
   9.189 +\path{http://www.cl.cam.ac.uk/xeno}
   9.190 +
   9.191 +Xen has grown into a fully-fledged project in its own right, enabling us
   9.192 +to investigate interesting research issues regarding the best techniques
   9.193 +for virtualizing resources such as the CPU, memory, disk and network.
   9.194 +Project contributors now include XenSource, Intel, IBM, HP, AMD, Novell,
    9.195 +and Red Hat.
   9.196 +
   9.197 +Xen was first described in a paper presented at SOSP in
   9.198 +2003\footnote{\tt
   9.199 +  http://www.cl.cam.ac.uk/netos/papers/2003-xensosp.pdf}, and the first
   9.200 +public release (1.0) was made that October. Since then, Xen has
   9.201 +significantly matured and is now used in production scenarios on many
   9.202 +sites.
   9.203 +
   9.204 +\section{What's New}
   9.205 +
   9.206 +Xen 3.0.0 offers:
   9.207 +
   9.208 +\begin{itemize}
   9.209 +\item Support for up to 32-way SMP guest operating systems
    9.210 +\item Intel Physical Address Extension (PAE) support for 32-bit
    9.211 +  servers with more than 4GB of physical memory
   9.212 +\item x86/64 support (Intel EM64T, AMD Opteron)
   9.213 +\item Intel VT-x support to enable the running of unmodified guest
   9.214 +operating systems (Windows XP/2003, Legacy Linux)
   9.215 +\item Enhanced control tools
   9.216 +\item Improved ACPI support
   9.217 +\item AGP/DRM graphics
   9.218 +\end{itemize}
   9.219 +
   9.220 +
   9.221 +Xen 3.0 features greatly enhanced hardware support, configuration
   9.222 +flexibility, usability and a larger complement of supported operating
   9.223 +systems.  This latest release takes Xen a step closer to being the 
   9.224 +definitive open source solution for virtualization.
   9.225 +
   9.226  
   9.227  
   9.228  \part{Installation}
   9.229  
   9.230  %% Chapter Basic Installation
   9.231 -\include{src/user/installation}
   9.232 -
   9.233 -%% Chapter Installing Xen on Debian
   9.234 -\include{src/user/debian}
   9.235 -
   9.236 -%% Chapter Installing Xen on Fedora Core
   9.237 -\include{src/user/fedora}
   9.238 -
   9.239 -%% Chapter Installing Xen on Gentoo Linux
   9.240 -\include{src/user/gentoo}
   9.241 +\chapter{Basic Installation}
   9.242  
   9.243 -%% Chapter Installing Xen on SuSE or SuSE SLES
   9.244 -\include{src/user/suse}
   9.245 -
   9.246 -%% Chapter Installing Xen on Red Hat Enterprise Linux (RHEL)
   9.247 -\include{src/user/rhel}
   9.248 +The Xen distribution includes three main components: Xen itself, ports
   9.249 +of Linux and NetBSD to run on Xen, and the userspace tools required to
   9.250 +manage a Xen-based system. This chapter describes how to install the
   9.251 +Xen~3.0 distribution from source. Alternatively, there may be pre-built
   9.252 +packages available as part of your operating system distribution.
   9.253  
   9.254 -% Chapter dom0 Installation
   9.255 -\include{src/user/dom0_installation}
   9.256  
   9.257 -% Chapter domU Installation
   9.258 -\include{src/user/domU_installation}
   9.259 +\section{Prerequisites}
   9.260 +\label{sec:prerequisites}
   9.261  
   9.262 -% Building Xen
   9.263 -\include{src/user/building_xen}
   9.264 +The following is a full list of prerequisites. Items marked `$\dag$' are
   9.265 +required by the \xend\ control tools, and hence required if you want to
   9.266 +run more than one virtual machine; items marked `$*$' are only required
   9.267 +if you wish to build from source.
   9.268 +\begin{itemize}
   9.269 +\item A working Linux distribution using the GRUB bootloader and running
   9.270 +  on a P6-class or newer CPU\@.
   9.271 +\item [$\dag$] The \path{iproute2} package.
   9.272 +\item [$\dag$] The Linux bridge-utils\footnote{Available from {\tt
   9.273 +      http://bridge.sourceforge.net}} (e.g., \path{/sbin/brctl})
   9.274 +\item [$\dag$] The Linux hotplug system\footnote{Available from {\tt
   9.275 +      http://linux-hotplug.sourceforge.net/}} (e.g.,
   9.276 +  \path{/sbin/hotplug} and related scripts)
   9.277 +\item [$*$] Build tools (gcc v3.2.x or v3.3.x, binutils, GNU make).
   9.278 +\item [$*$] Development installation of zlib (e.g.,\ zlib-dev).
   9.279 +\item [$*$] Development installation of Python v2.2 or later (e.g.,\
   9.280 +  python-dev).
   9.281 +\item [$*$] \LaTeX\ and transfig are required to build the
   9.282 +  documentation.
   9.283 +\end{itemize}
   9.284 +
   9.285 +Once you have satisfied these prerequisites, you can now install either
   9.286 +a binary or source distribution of Xen.
   9.287 +
   9.288 +\section{Installing from Binary Tarball}
   9.289 +
   9.290 +Pre-built tarballs are available for download from the XenSource downloads
   9.291 +page:
   9.292 +\begin{quote} {\tt http://www.xensource.com/downloads/}
   9.293 +\end{quote}
   9.294 +
   9.295 +Once you've downloaded the tarball, simply unpack and install:
   9.296 +\begin{verbatim}
   9.297 +# tar zxvf xen-3.0-install.tgz
   9.298 +# cd xen-3.0-install
   9.299 +# sh ./install.sh
   9.300 +\end{verbatim}
   9.301 +
   9.302 +Once you've installed the binaries you need to configure your system as
   9.303 +described in Section~\ref{s:configure}.
   9.304 +
   9.305 +\section{Installing from RPMs}
   9.306 +Pre-built RPMs are available for download from the XenSource downloads
   9.307 +page:
   9.308 +\begin{quote} {\tt http://www.xensource.com/downloads/}
   9.309 +\end{quote}
   9.310 +
   9.311 +Once you've downloaded the RPMs, you typically install them via the 
   9.312 +RPM commands: 
   9.313 +
   9.314 +\verb|# rpm -iv rpmname| 
   9.315 +
   9.316 +See the instructions and the Release Notes for each RPM set referenced at:
   9.317 +  \begin{quote}
   9.318 +    {\tt http://www.xensource.com/downloads/}.
   9.319 +  \end{quote}
   9.320 + 
   9.321 +\section{Installing from Source}
   9.322 +
   9.323 +This section describes how to obtain, build and install Xen from source.
   9.324 +
   9.325 +\subsection{Obtaining the Source}
   9.326 +
   9.327 +The Xen source tree is available as either a compressed source tarball
   9.328 +or as a clone of our master Mercurial repository.
   9.329 +
   9.330 +\begin{description}
   9.331 +\item[Obtaining the Source Tarball]\mbox{} \\
   9.332 +  Stable versions and daily snapshots of the Xen source tree are
   9.333 +  available from the Xen download page:
    9.334 +  \begin{quote} {\tt http://www.xensource.com/downloads/}
   9.335 +  \end{quote}
   9.336 +\item[Obtaining the source via Mercurial]\mbox{} \\
   9.337 +  The source tree may also be obtained via the public Mercurial
   9.338 +  repository at:
   9.339 +  \begin{quote}{\tt http://xenbits.xensource.com}
   9.340 +  \end{quote} See the instructions and the Getting Started Guide
   9.341 +  referenced at:
   9.342 +  \begin{quote}
   9.343 +    {\tt http://www.xensource.com/downloads/}
   9.344 +  \end{quote}
   9.345 +\end{description}
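          +
          +For example, Mercurial users can clone a local working copy with the
          +{\tt hg clone} command.  The repository name below is an assumption;
          +check the index page at {\tt http://xenbits.xensource.com} for the
          +exact name of the tree you want:
          +\begin{quote}
          +{\small \begin{verbatim}
          +# hg clone http://xenbits.xensource.com/xen-unstable.hg
          +\end{verbatim}}
          +\end{quote}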
   9.346 +
   9.347 +% \section{The distribution}
   9.348 +%
   9.349 +% The Xen source code repository is structured as follows:
   9.350 +%
   9.351 +% \begin{description}
   9.352 +% \item[\path{tools/}] Xen node controller daemon (Xend), command line
   9.353 +%   tools, control libraries
   9.354 +% \item[\path{xen/}] The Xen VMM.
   9.355 +% \item[\path{buildconfigs/}] Build configuration files
   9.356 +% \item[\path{linux-*-xen-sparse/}] Xen support for Linux.
   9.357 +% \item[\path{patches/}] Experimental patches for Linux.
   9.358 +% \item[\path{docs/}] Various documentation files for users and
   9.359 +%   developers.
   9.360 +% \item[\path{extras/}] Bonus extras.
   9.361 +% \end{description}
   9.362 +
   9.363 +\subsection{Building from Source}
   9.364 +
   9.365 +The top-level Xen Makefile includes a target ``world'' that will do the
   9.366 +following:
   9.367 +
   9.368 +\begin{itemize}
   9.369 +\item Build Xen.
   9.370 +\item Build the control tools, including \xend.
   9.371 +\item Download (if necessary) and unpack the Linux 2.6 source code, and
   9.372 +  patch it for use with Xen.
   9.373 +\item Build a Linux kernel to use in domain~0 and a smaller unprivileged
   9.374 +  kernel, which can be used for unprivileged virtual machines.
   9.375 +\end{itemize}
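          +
          +To kick off a complete build, simply invoke this target from the top
          +of the source tree (on an SMP build host you may wish to add
          +{\tt -j4} to get a parallel build):
          +\begin{quote}
          +\begin{verbatim}
          +# make world
          +\end{verbatim}
          +\end{quote}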
   9.376 +
   9.377 +After the build has completed you should have a top-level directory
   9.378 +called \path{dist/} in which all resulting targets will be placed. Of
   9.379 +particular interest are the two XenLinux kernel images, one with a
   9.380 +``-xen0'' extension which contains hardware device drivers and drivers
   9.381 +for Xen's virtual devices, and one with a ``-xenU'' extension that
   9.382 +just contains the virtual ones. These are found in
   9.383 +\path{dist/install/boot/} along with the image for Xen itself and the
   9.384 +configuration files used during the build.
   9.385 +
   9.386 +%The NetBSD port can be built using:
   9.387 +%\begin{quote}
   9.388 +%\begin{verbatim}
   9.389 +%# make netbsd20
   9.390 +%\end{verbatim}\end{quote}
   9.391 +%NetBSD port is built using a snapshot of the netbsd-2-0 cvs branch.
   9.392 +%The snapshot is downloaded as part of the build process if it is not
   9.393 +%yet present in the \path{NETBSD\_SRC\_PATH} search path.  The build
   9.394 +%process also downloads a toolchain which includes all of the tools
   9.395 +%necessary to build the NetBSD kernel under Linux.
   9.396 +
   9.397 +To customize the set of kernels built you need to edit the top-level
   9.398 +Makefile. Look for the line:
   9.399 +\begin{quote}
   9.400 +\begin{verbatim}
   9.401 +KERNELS ?= linux-2.6-xen0 linux-2.6-xenU
   9.402 +\end{verbatim}
   9.403 +\end{quote}
   9.404 +
   9.405 +You can edit this line to include any set of operating system kernels
   9.406 +which have configurations in the top-level \path{buildconfigs/}
   9.407 +directory.
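          +
          +For instance, to build only the privileged domain~0 kernel you could
          +shorten the list to:
          +\begin{quote}
          +\begin{verbatim}
          +KERNELS ?= linux-2.6-xen0
          +\end{verbatim}
          +\end{quote}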
   9.408 +
   9.409 +%% Inspect the Makefile if you want to see what goes on during a
   9.410 +%% build.  Building Xen and the tools is straightforward, but XenLinux
   9.411 +%% is more complicated.  The makefile needs a `pristine' Linux kernel
   9.412 +%% tree to which it will then add the Xen architecture files.  You can
   9.413 +%% tell the makefile the location of the appropriate Linux compressed
   9.414 +%% tar file by
   9.415 +%% setting the LINUX\_SRC environment variable, e.g. \\
   9.416 +%% \verb!# LINUX_SRC=/tmp/linux-2.6.11.tar.bz2 make world! \\ or by
   9.417 +%% placing the tar file somewhere in the search path of {\tt
   9.418 +%%   LINUX\_SRC\_PATH} which defaults to `{\tt .:..}'.  If the
   9.419 +%% makefile can't find a suitable kernel tar file it attempts to
   9.420 +%% download it from kernel.org (this won't work if you're behind a
   9.421 +%% firewall).
   9.422 +
   9.423 +%% After untaring the pristine kernel tree, the makefile uses the {\tt
   9.424 +%%   mkbuildtree} script to add the Xen patches to the kernel.
   9.425 +
   9.426 +%% \framebox{\parbox{5in}{
   9.427 +%%     {\bf Distro specific:} \\
   9.428 +%%     {\it Gentoo} --- if not using udev (most installations,
   9.429 +%%     currently), you'll need to enable devfs and devfs mount at boot
   9.430 +%%     time in the xen0 config.  }}
   9.431 +
   9.432 +\subsection{Custom Kernels}
   9.433 +
   9.434 +% If you have an SMP machine you may wish to give the {\tt '-j4'}
   9.435 +% argument to make to get a parallel build.
   9.436 +
   9.437 +If you wish to build a customized XenLinux kernel (e.g.\ to support
   9.438 +additional devices or enable distribution-required features), you can
   9.439 +use the standard Linux configuration mechanisms, specifying that the
    9.440 +architecture being built for is \path{xen}, e.g.:
   9.441 +\begin{quote}
   9.442 +\begin{verbatim}
   9.443 +# cd linux-2.6.12-xen0
   9.444 +# make ARCH=xen xconfig
   9.445 +# cd ..
   9.446 +# make
   9.447 +\end{verbatim}
   9.448 +\end{quote}
   9.449 +
   9.450 +You can also copy an existing Linux configuration (\path{.config}) into
   9.451 +e.g.\ \path{linux-2.6.12-xen0} and execute:
   9.452 +\begin{quote}
   9.453 +\begin{verbatim}
   9.454 +# make ARCH=xen oldconfig
   9.455 +\end{verbatim}
   9.456 +\end{quote}
   9.457 +
   9.458 +You may be prompted with some Xen-specific options. We advise accepting
   9.459 +the defaults for these options.
   9.460 +
   9.461 +Note that the only difference between the two types of Linux kernels
   9.462 +that are built is the configuration file used for each. The ``U''
   9.463 +suffixed (unprivileged) versions don't contain any of the physical
   9.464 +hardware device drivers, leading to a 30\% reduction in size; hence you
   9.465 +may prefer these for your non-privileged domains. The ``0'' suffixed
   9.466 +privileged versions can be used to boot the system, as well as in driver
   9.467 +domains and unprivileged domains.
   9.468 +
   9.469 +\subsection{Installing Generated Binaries}
   9.470 +
   9.471 +The files produced by the build process are stored under the
   9.472 +\path{dist/install/} directory. To install them in their default
   9.473 +locations, do:
   9.474 +\begin{quote}
   9.475 +\begin{verbatim}
   9.476 +# make install
   9.477 +\end{verbatim}
   9.478 +\end{quote}
   9.479 +
   9.480 +Alternatively, users with special installation requirements may wish to
   9.481 +install them manually by copying the files to their appropriate
   9.482 +destinations.
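          +
          +As a rough sketch (paths are those of the default build; adjust to
          +your requirements), the boot-time components could be copied into
          +place by hand with:
          +\begin{quote}
          +\begin{verbatim}
          +# cp -a dist/install/boot/* /boot/
          +\end{verbatim}
          +\end{quote}
          +The control tools and libraries under the remainder of
          +\path{dist/install/} would need to be copied in a similar manner.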
   9.483 +
   9.484 +%% Files in \path{install/boot/} include:
   9.485 +%% \begin{itemize}
   9.486 +%% \item \path{install/boot/xen-3.0.gz} Link to the Xen 'kernel'
   9.487 +%% \item \path{install/boot/vmlinuz-2.6-xen0} Link to domain 0
   9.488 +%%   XenLinux kernel
   9.489 +%% \item \path{install/boot/vmlinuz-2.6-xenU} Link to unprivileged
   9.490 +%%   XenLinux kernel
   9.491 +%% \end{itemize}
   9.492 +
   9.493 +The \path{dist/install/boot} directory will also contain the config
   9.494 +files used for building the XenLinux kernels, and also versions of Xen
    9.495 +and XenLinux kernels that contain debug symbols (such as
    9.496 +\path{xen-syms-3.0.0} and \path{vmlinux-syms-2.6.12.6-xen0}), which are
   9.497 +essential for interpreting crash dumps. Retain these files as the
   9.498 +developers may wish to see them if you post on the mailing list.
   9.499 +
   9.500 +
   9.501 +\section{Configuration}
   9.502 +\label{s:configure}
   9.503 +
   9.504 +Once you have built and installed the Xen distribution, it is simple to
   9.505 +prepare the machine for booting and running Xen.
   9.506 +
   9.507 +\subsection{GRUB Configuration}
   9.508 +
   9.509 +An entry should be added to \path{grub.conf} (often found under
   9.510 +\path{/boot/} or \path{/boot/grub/}) to allow Xen / XenLinux to boot.
   9.511 +This file is sometimes called \path{menu.lst}, depending on your
   9.512 +distribution. The entry should look something like the following:
   9.513 +
   9.514 +%% KMSelf Thu Dec  1 19:06:13 PST 2005 262144 is useful for RHEL/RH and
   9.515 +%% related Dom0s.
   9.516 +{\small
   9.517 +\begin{verbatim}
   9.518 +title Xen 3.0 / XenLinux 2.6
   9.519 +  kernel /boot/xen-3.0.gz dom0_mem=262144
   9.520 +  module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=tty0
   9.521 +\end{verbatim}
   9.522 +}
   9.523 +
   9.524 +The kernel line tells GRUB where to find Xen itself and what boot
    9.525 +parameters should be passed to it (in this case, setting the domain~0
    9.526 +memory allocation in kilobytes).
   9.527 +For more details on the various Xen boot parameters see
   9.528 +Section~\ref{s:xboot}.
   9.529 +
   9.530 +The module line of the configuration describes the location of the
   9.531 +XenLinux kernel that Xen should start and the parameters that should be
   9.532 +passed to it. These are standard Linux parameters, identifying the root
    9.533 +device, specifying that it be initially mounted read-only, and instructing
   9.534 +that console output be sent to the screen. Some distributions such as
   9.535 +SuSE do not require the \path{ro} parameter.
   9.536 +
   9.537 +%% \framebox{\parbox{5in}{
   9.538 +%%     {\bf Distro specific:} \\
   9.539 +%%     {\it SuSE} --- Omit the {\tt ro} option from the XenLinux
   9.540 +%%     kernel command line, since the partition won't be remounted rw
   9.541 +%%     during boot.  }}
   9.542 +
   9.543 +To use an initrd, add another \path{module} line to the configuration,
   9.544 +like: {\small
   9.545 +\begin{verbatim}
   9.546 +  module /boot/my_initrd.gz
   9.547 +\end{verbatim}
   9.548 +}
   9.549 +
   9.550 +%% KMSelf Thu Dec  1 19:05:30 PST 2005 Other configs as an appendix?
   9.551 +
   9.552 +When installing a new kernel, it is recommended that you do not delete
   9.553 +existing menu options from \path{menu.lst}, as you may wish to boot your
   9.554 +old Linux kernel in future, particularly if you have problems.
   9.555 +
   9.556 +\subsection{Serial Console (optional)}
   9.557 +
   9.558 +Serial console access allows you to manage, monitor, and interact with
   9.559 +your system over a serial console.  This can allow access from another
   9.560 +nearby system via a null-modem (``LapLink'') cable or remotely via a serial
   9.561 +concentrator.
   9.562 +
    9.563 +Your system's BIOS, bootloader (GRUB), Xen, Linux, and login access must
   9.564 +each be individually configured for serial console access.  It is
   9.565 +\emph{not} strictly necessary to have each component fully functional,
   9.566 +but it can be quite useful.
   9.567 +
   9.568 +For general information on serial console configuration under Linux,
   9.569 +refer to the ``Remote Serial Console HOWTO'' at The Linux Documentation
   9.570 +Project: \url{http://www.tldp.org} 
   9.571 +
   9.572 +\subsubsection{Serial Console BIOS configuration}
   9.573 +
   9.574 +Enabling system serial console output neither enables nor disables
   9.575 +serial capabilities in GRUB, Xen, or Linux, but may make remote
   9.576 +management of your system more convenient by displaying POST and other
   9.577 +boot messages over serial port and allowing remote BIOS configuration.
   9.578 +
   9.579 +Refer to your hardware vendor's documentation for capabilities and
   9.580 +procedures to enable BIOS serial redirection.
   9.581 +
   9.582 +
   9.583 +\subsubsection{Serial Console GRUB configuration}
   9.584 +
   9.585 +Enabling GRUB serial console output neither enables nor disables Xen or
    9.586 +Linux serial capabilities, but may make remote management of your system
   9.587 +more convenient by displaying GRUB prompts, menus, and actions over
   9.588 +serial port and allowing remote GRUB management.
   9.589 +
   9.590 +Adding the following two lines to your GRUB configuration file,
   9.591 +typically either \path{/boot/grub/menu.lst} or \path{/boot/grub/grub.conf}
   9.592 +depending on your distro, will enable GRUB serial output.
   9.593 +
   9.594 +\begin{quote} 
   9.595 +{\small \begin{verbatim}
   9.596 +  serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1
   9.597 +  terminal --timeout=10 serial console
   9.598 +\end{verbatim}}
   9.599 +\end{quote}
   9.600 +
   9.601 +Note that when both the serial port and the local monitor and keyboard
   9.602 +are enabled, the text ``\emph{Press any key to continue}'' will appear
   9.603 +at both.  Pressing a key on one device will cause GRUB to display to
   9.604 +that device.  The other device will see no output.  If no key is
   9.605 +pressed before the timeout period expires, the system will boot to the
   9.606 +default GRUB boot entry.
   9.607 +
   9.608 +Please refer to the GRUB documentation for further information.
   9.609 +
   9.610 +
   9.611 +\subsubsection{Serial Console Xen configuration}
   9.612 +
   9.613 +Enabling Xen serial console output neither enables nor disables Linux
   9.614 +kernel output or logging in to Linux over serial port.  It does however
   9.615 +allow you to monitor and log the Xen boot process via serial console and
   9.616 +can be very useful in debugging.
   9.617 +
   9.618 +%% kernel /boot/xen-2.0.gz dom0_mem=131072 com1=115200,8n1
   9.619 +%% module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro
   9.620 +
   9.621 +In order to configure Xen serial console output, it is necessary to
   9.622 +add a boot option to your GRUB config; e.g.\ replace the previous
   9.623 +example kernel line with:
   9.624 +\begin{quote} {\small \begin{verbatim}
   9.625 +   kernel /boot/xen.gz dom0_mem=131072 com1=115200,8n1
   9.626 +\end{verbatim}}
   9.627 +\end{quote}
   9.628 +
   9.629 +This configures Xen to output on COM1 at 115,200 baud, 8 data bits, 1
   9.630 +stop bit and no parity. Modify these parameters for your environment.
   9.631 +
   9.632 +One can also configure XenLinux to share the serial console; to achieve
   9.633 +this append ``\path{console=ttyS0}'' to your module line.
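          +
          +For example, combining this with the earlier GRUB entry (paths,
          +memory size and baud rate are illustrative) gives:
          +\begin{quote} {\footnotesize \begin{verbatim}
          +title Xen 3.0 / XenLinux 2.6 (serial)
          +  kernel /boot/xen.gz dom0_mem=131072 com1=115200,8n1
          +  module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=tty0 console=ttyS0
          +\end{verbatim}}
          +\end{quote}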
   9.634 +
   9.635 +
   9.636 +\subsubsection{Serial Console Linux configuration}
   9.637 +
   9.638 +Enabling Linux serial console output at boot neither enables nor
   9.639 +disables logging in to Linux over serial port.  It does however allow
   9.640 +you to monitor and log the Linux boot process via serial console and can be
   9.641 +very useful in debugging.
   9.642 +
   9.643 +To enable Linux output at boot time, add the parameter
   9.644 +\path{console=ttyS0} (or ttyS1, ttyS2, etc.) to your kernel GRUB line.
   9.645 +Under Xen, this might be:
   9.646 +\begin{quote} 
   9.647 +{\footnotesize \begin{verbatim}
   9.648 +  module /vmlinuz-2.6-xen0 ro root=/dev/VolGroup00/LogVol00 \
    9.649 +  console=ttyS0,115200
   9.650 +\end{verbatim}}
   9.651 +\end{quote}
   9.652 +to enable output over ttyS0 at 115200 baud.
   9.653 +
   9.654 +
   9.655 +
   9.656 +\subsubsection{Serial Console Login configuration}
   9.657 +
   9.658 +Logging in to Linux via serial console, under Xen or otherwise, requires
   9.659 +specifying a login prompt be started on the serial port.  To permit root
   9.660 +logins over serial console, the serial port must be added to
   9.661 +\path{/etc/securetty}.
   9.662 +
   9.663 +\newpage
   9.664 +To automatically start a login prompt over the serial port, 
   9.665 +add the line: \begin{quote} {\small {\tt c:2345:respawn:/sbin/mingetty
   9.666 +ttyS0}} \end{quote} to \path{/etc/inittab}.   Run \path{init q} to force
    9.667 +a reload of your inittab and start getty.
   9.668 +
   9.669 +To enable root logins, add \path{ttyS0} to \path{/etc/securetty} if not
   9.670 +already present.
   9.671 +
   9.672 +Your distribution may use an alternate getty; options include getty,
   9.673 +mgetty and agetty.  Consult your distribution's documentation
   9.674 +for further information.
   9.675 +
   9.676 +
   9.677 +\subsection{TLS Libraries}
   9.678 +
   9.679 +Users of the XenLinux 2.6 kernel should disable Thread Local Storage
   9.680 +(TLS) (e.g.\ by doing a \path{mv /lib/tls /lib/tls.disabled}) before
   9.681 +attempting to boot a XenLinux kernel\footnote{If you boot without first
   9.682 +  disabling TLS, you will get a warning message during the boot process.
   9.683 +  In this case, simply perform the rename after the machine is up and
   9.684 +  then run \path{/sbin/ldconfig} to make it take effect.}. You can
   9.685 +always reenable TLS by restoring the directory to its original location
   9.686 +(i.e.\ \path{mv /lib/tls.disabled /lib/tls}).
   9.687 +
   9.688 +The reason for this is that the current TLS implementation uses
   9.689 +segmentation in a way that is not permissible under Xen. If TLS is not
   9.690 +disabled, an emulation mode is used within Xen which reduces performance
   9.691 +substantially. To ensure full performance you should install a 
   9.692 +`Xen-friendly' (nosegneg) version of the library. 
   9.693 +
   9.694 +
   9.695 +\section{Booting Xen}
   9.696 +
   9.697 +It should now be possible to restart the system and use Xen. Reboot and
   9.698 +choose the new Xen option when the Grub screen appears.
   9.699 +
   9.700 +What follows should look much like a conventional Linux boot. The first
   9.701 +portion of the output comes from Xen itself, supplying low level
   9.702 +information about itself and the underlying hardware. The last portion
   9.703 +of the output comes from XenLinux.
   9.704 +
   9.705 +You may see some error messages during the XenLinux boot. These are not
   9.706 +necessarily anything to worry about---they may result from kernel
   9.707 +configuration differences between your XenLinux kernel and the one you
   9.708 +usually use.
   9.709 +
   9.710 +When the boot completes, you should be able to log into your system as
   9.711 +usual. If you are unable to log in, you should still be able to reboot
   9.712 +with your normal Linux kernel by selecting it at the GRUB prompt.
   9.713 +
   9.714  
   9.715  % Booting Xen
   9.716 -\include{src/user/booting_xen}
   9.717 +\chapter{Booting a Xen System}
   9.718 +
   9.719 +Booting the system into Xen will bring you up into the privileged
   9.720 +management domain, Domain0. At that point you are ready to create
   9.721 +guest domains and ``boot'' them using the \texttt{xm create} command.
   9.722 +
   9.723 +\section{Booting Domain0}
   9.724 +
    9.725 +After installation and configuration are complete, reboot the system
    9.726 +and choose the new Xen option when the Grub screen appears.
   9.727 +
   9.728 +What follows should look much like a conventional Linux boot.  The
   9.729 +first portion of the output comes from Xen itself, supplying low level
   9.730 +information about itself and the underlying hardware.  The last
   9.731 +portion of the output comes from XenLinux.
   9.732 +
   9.733 +%% KMSelf Wed Nov 30 18:09:37 PST 2005:  We should specify what these are.
   9.734 +
   9.735 +When the boot completes, you should be able to log into your system as
   9.736 +usual.  If you are unable to log in, you should still be able to
   9.737 +reboot with your normal Linux kernel by selecting it at the GRUB prompt.
   9.738 +
   9.739 +The first step in creating a new domain is to prepare a root
   9.740 +filesystem for it to boot.  Typically, this might be stored in a normal
   9.741 +partition, an LVM or other volume manager partition, a disk file or on
    9.742 +an NFS server.  A simple way to do this is to boot from your
   9.743 +standard OS install CD and install the distribution into another
   9.744 +partition on your hard drive.
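          +
          +If you would rather use a disk file, a minimal sketch (file name, size
          +and filesystem type are only examples) is to create a sparse file,
          +put a filesystem on it and populate it via a loopback mount:
          +\begin{quote}
          +{\small \begin{verbatim}
          +# dd if=/dev/zero of=/var/xen/vm1disk bs=1M seek=2047 count=1
          +# mkfs.ext3 -F /var/xen/vm1disk
          +# mount -o loop /var/xen/vm1disk /mnt
          +\end{verbatim}}
          +\end{quote}
          +You can then copy a root filesystem into \path{/mnt} and unmount it
          +when you are done.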
   9.745 +
   9.746 +To start the \xend\ control daemon, type
   9.747 +\begin{quote}
   9.748 +  \verb!# xend start!
   9.749 +\end{quote}
   9.750 +
   9.751 +If you wish the daemon to start automatically, see the instructions in
   9.752 +Section~\ref{s:xend}. Once the daemon is running, you can use the
   9.753 +\path{xm} tool to monitor and maintain the domains running on your
   9.754 +system. This chapter provides only a brief tutorial. We provide full
   9.755 +details of the \path{xm} tool in the next chapter.
   9.756 +
   9.757 +% \section{From the web interface}
   9.758 +%
   9.759 +% Boot the Xen machine and start Xensv (see Chapter~\ref{cha:xensv}
   9.760 +% for more details) using the command: \\
   9.761 +% \verb_# xensv start_ \\
   9.762 +% This will also start Xend (see Chapter~\ref{cha:xend} for more
   9.763 +% information).
   9.764 +%
   9.765 +% The domain management interface will then be available at {\tt
   9.766 +%   http://your\_machine:8080/}.  This provides a user friendly wizard
   9.767 +% for starting domains and functions for managing running domains.
   9.768 +%
   9.769 +% \section{From the command line}
   9.770 +\section{Booting Guest Domains}
   9.771 +
   9.772 +\subsection{Creating a Domain Configuration File}
   9.773 +
   9.774 +Before you can start an additional domain, you must create a
   9.775 +configuration file. We provide two example files which you can use as
   9.776 +a starting point:
   9.777 +\begin{itemize}
   9.778 +\item \path{/etc/xen/xmexample1} is a simple template configuration
   9.779 +  file for describing a single VM\@.
    9.780 +\item \path{/etc/xen/xmexample2} is a template description that
   9.781 +  is intended to be reused for multiple virtual machines.  Setting the
   9.782 +  value of the \path{vmid} variable on the \path{xm} command line
   9.783 +  fills in parts of this template.
   9.784 +\end{itemize}
   9.785 +
   9.786 +There are also a number of other examples which you may find useful.
   9.787 +Copy one of these files and edit it as appropriate.  Typical values
   9.788 +you may wish to edit include:
   9.789 +
   9.790 +\begin{quote}
   9.791 +\begin{description}
   9.792 +\item[kernel] Set this to the path of the kernel you compiled for use
   9.793 +  with Xen (e.g.\ \path{kernel = ``/boot/vmlinuz-2.6-xenU''})
   9.794 +\item[memory] Set this to the size of the domain's memory in megabytes
   9.795 +  (e.g.\ \path{memory = 64})
   9.796 +\item[disk] Set the first entry in this list to calculate the offset
   9.797 +  of the domain's root partition, based on the domain ID\@.  Set the
   9.798 +  second to the location of \path{/usr} if you are sharing it between
   9.799 +  domains (e.g.\ \path{disk = ['phy:your\_hard\_drive\%d,sda1,w' \%
   9.800 +    (base\_partition\_number + vmid),
    9.801 +    'phy:your\_usr\_partition,sda6,r' ]})
   9.802 +\item[dhcp] Uncomment the dhcp variable, so that the domain will
   9.803 +  receive its IP address from a DHCP server (e.g.\ \path{dhcp=``dhcp''})
   9.804 +\end{description}
   9.805 +\end{quote}
   9.806 +
   9.807 +You may also want to edit the {\bf vif} variable in order to choose
   9.808 +the MAC address of the virtual ethernet interface yourself.  For
   9.809 +example:
   9.810 +
   9.811 +\begin{quote}
   9.812 +\verb_vif = ['mac=00:16:3E:F6:BB:B3']_
   9.813 +\end{quote}
   9.814 +If you do not set this variable, \xend\ will automatically generate a
   9.815 +random MAC address from the range 00:16:3E:xx:xx:xx, assigned by IEEE to
   9.816 +XenSource as an OUI (organizationally unique identifier).  XenSource
   9.817 +Inc. gives permission for anyone to use addresses randomly allocated
   9.818 +from this range for use by their Xen domains.
   9.819 +
   9.820 +For a list of IEEE OUI assignments, see 
   9.821 +\url{http://standards.ieee.org/regauth/oui/oui.txt} 
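          +
          +Putting these pieces together, a minimal configuration file might look
          +like the following sketch (the kernel path, partition names, memory
          +size and MAC address are illustrative and must be adapted to your
          +system):
          +\begin{quote}
          +{\small \begin{verbatim}
          +kernel = "/boot/vmlinuz-2.6-xenU"
          +memory = 64
          +name   = "ExampleDomain"
          +vif    = [ 'mac=00:16:3E:F6:BB:B3' ]
          +disk   = [ 'phy:hda3,sda1,w' ]
          +root   = "/dev/sda1 ro"
          +dhcp   = "dhcp"
          +\end{verbatim}}
          +\end{quote}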
   9.822 +
   9.823 +
   9.824 +\subsection{Booting the Guest Domain}
   9.825 +
   9.826 +The \path{xm} tool provides a variety of commands for managing
   9.827 +domains.  Use the \path{create} command to start new domains. Assuming
   9.828 +you've created a configuration file \path{myvmconf} based around
   9.829 +\path{/etc/xen/xmexample2}, to start a domain with virtual machine
   9.830 +ID~1 you should type:
   9.831 +
   9.832 +\begin{quote}
   9.833 +\begin{verbatim}
   9.834 +# xm create -c myvmconf vmid=1
   9.835 +\end{verbatim}
   9.836 +\end{quote}
   9.837 +
   9.838 +The \path{-c} switch causes \path{xm} to turn into the domain's
   9.839 +console after creation.  The \path{vmid=1} sets the \path{vmid}
   9.840 +variable used in the \path{myvmconf} file.
   9.841 +
   9.842 +You should see the console boot messages from the new domain appearing
   9.843 +in the terminal in which you typed the command, culminating in a login
   9.844 +prompt.
   9.845 +
   9.846 +
   9.847 +\section{Starting / Stopping Domains Automatically}
   9.848 +
   9.849 +It is possible to have certain domains start automatically at boot
   9.850 +time and to have dom0 wait for all running domains to shutdown before
   9.851 +it shuts down the system.
   9.852 +
    9.853 +To specify that a domain should start at boot time, place its
    9.854 +configuration file (or a link to it) under \path{/etc/xen/auto/}.
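          +
          +For example, to have the \path{myvmconf} domain from the previous
          +chapter start at boot (assuming its configuration file lives in
          +\path{/etc/xen/}), you could link it in place:
          +\begin{quote}
          +\begin{verbatim}
          +# ln -s /etc/xen/myvmconf /etc/xen/auto/
          +\end{verbatim}
          +\end{quote}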
   9.855 +
   9.856 +A Sys-V style init script for Red Hat and LSB-compliant systems is
   9.857 +provided and will be automatically copied to \path{/etc/init.d/}
   9.858 +during install.  You can then enable it in the appropriate way for
   9.859 +your distribution.
   9.860 +
   9.861 +For instance, on Red Hat:
   9.862 +
   9.863 +\begin{quote}
   9.864 +  \verb_# chkconfig --add xendomains_
   9.865 +\end{quote}
   9.866 +
   9.867 +By default, this will start the boot-time domains in runlevels 3, 4
   9.868 +and 5.
   9.869 +
   9.870 +You can also use the \path{service} command to run this script
    9.871 +manually, e.g.:
   9.872 +
   9.873 +\begin{quote}
   9.874 +  \verb_# service xendomains start_
   9.875 +
   9.876 +  Starts all the domains with config files under /etc/xen/auto/.
   9.877 +\end{quote}
   9.878 +
   9.879 +\begin{quote}
   9.880 +  \verb_# service xendomains stop_
   9.881 +
   9.882 +  Shuts down all running Xen domains.
   9.883 +\end{quote}
   9.884 +
   9.885  
   9.886  
   9.887  \part{Configuration and Management}
   9.888  
   9.889  %% Chapter Domain Management Tools and Daemons
   9.890 -\include{src/user/domain_mgmt}
   9.891 -
   9.892 -%% Chapter Starting Additional Domains
   9.893 -\include{src/user/start_addl_dom}
   9.894 -
   9.895 -%% Chapter Domain Configuration
   9.896 -\include{src/user/domain_configuration}
   9.897 -
   9.898 -% Chapter Console Management
   9.899 -\include{src/user/console_management}
   9.900 -
   9.901 -% Chapter Network Management
   9.902 -\include{src/user/network_management}
   9.903 +\chapter{Domain Management Tools}
   9.904  
   9.905 -% Chapter Storage and FileSytem Management
   9.906 -\include{src/user/domain_filesystem}
   9.907 -
   9.908 -% Chapter Memory Management
   9.909 -\include{src/user/memory_management}
   9.910 -
   9.911 -% Chapter CPU Management
   9.912 -\include{src/user/cpu_management}
   9.913 -
   9.914 -% Chapter Scheduler Management
   9.915 -\include{src/user/scheduler_management}
   9.916 -
   9.917 -% Chapter Migrating Domains
   9.918 -\include{src/user/migrating_domains}
   9.919 -
   9.920 -%% Chapter Securing Xen
   9.921 -\include{src/user/securing_xen}
   9.922 +This chapter summarizes the management software and tools available.
   9.923  
   9.924  
   9.925 -\part{Monitoring and Troubleshooting}
   9.926 -
   9.927 -%% Chapter Monitoring Xen
   9.928 -\include{src/user/monitoring_xen}
   9.929 -
   9.930 -% Chapter xenstat
   9.931 -\include{src/user/xenstat}
   9.932 -
   9.933 -% Chapter Log Files
   9.934 -\include{src/user/logfiles}
   9.935 +\section{\Xend\ }
   9.936 +\label{s:xend}
   9.937  
   9.938 -%% Chapter Debugging
   9.939 -\include{src/user/debugging}
   9.940 +The Xen Daemon (\Xend) performs system management functions related to
   9.941 +virtual machines. It forms a central point of control for a machine
   9.942 +and can be controlled using an HTTP-based protocol. \Xend\ must be
   9.943 +running in order to start and manage virtual machines.
   9.944  
   9.945 -% Chapter xentrace
   9.946 -\include{src/user/xentrace}
   9.947 +\Xend\ must be run as root because it needs access to privileged system
   9.948 +management functions. A small set of commands may be issued on the
   9.949 +\xend\ command line:
   9.950  
   9.951 -%% Chapter Known Problems
   9.952 -\include{src/user/known_problems}
   9.953 +\begin{tabular}{ll}
   9.954 +  \verb!# xend start! & start \xend, if not already running \\
   9.955 +  \verb!# xend stop!  & stop \xend\ if already running       \\
   9.956 +  \verb!# xend restart! & restart \xend\ if running, otherwise start it \\
   9.957 +  % \verb!# xend trace_start! & start \xend, with very detailed debug logging \\
   9.958 +  \verb!# xend status! & indicates \xend\ status by its return code
   9.959 +\end{tabular}
   9.960  
   9.961 -%% Chapter Testing Xen
   9.962 -\include{src/user/testing}
   9.963 +A SysV init script called {\tt xend} is provided to start \xend\ at
   9.964 +boot time. {\tt make install} installs this script in
   9.965 +\path{/etc/init.d}. To enable it, you have to make symbolic links in
   9.966 +the appropriate runlevel directories or use the {\tt chkconfig} tool,
   9.967 +where available.  Once \xend\ is running, administration can be done
   9.968 +using the \texttt{xm} tool.
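          +
          +On distributions that provide {\tt chkconfig}, enabling the script at
          +boot might look like:
          +\begin{quote}
          +\begin{verbatim}
          +# chkconfig --add xend
          +# chkconfig xend on
          +\end{verbatim}
          +\end{quote}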
   9.969 +
   9.970 +As \xend\ runs, events will be logged to \path{/var/log/xend.log}
   9.971 +and \path{/var/log/xend-debug.log}. These, along with the standard 
   9.972 +syslog files, are useful when troubleshooting problems.
   9.973 +
   9.974 +\section{Xm}
   9.975 +\label{s:xm}
   9.976 +
   9.977 +Command line management tasks are performed using the \path{xm}
   9.978 +tool. For online help for the commands available, type:
   9.979 +
   9.980 +\begin{quote}
   9.981 +\begin{verbatim}
   9.982 +# xm help
   9.983 +\end{verbatim}
   9.984 +\end{quote}
   9.985 +
   9.986 +You can also type \path{xm help $<$command$>$} for more information on a
   9.987 +given command.
   9.988 +
   9.989 +The xm tool is the primary tool for managing Xen from the console. The
   9.990 +general format of an xm command line is:
   9.991 +
   9.992 +\begin{verbatim}
   9.993 +# xm command [switches] [arguments] [variables]
   9.994 +\end{verbatim}
   9.995 +
   9.996 +The available \emph{switches} and \emph{arguments} are dependent on the
   9.997 +\emph{command} chosen. The \emph{variables} may be set using
    9.998 +declarations of the form {\tt variable=value}; command line
   9.999 +declarations override any of the values in the configuration file being
  9.1000 +used, including the standard variables described above and any custom
   9.1001 +variables (for instance, the \path{xmexample2} file uses a {\tt vmid}
  9.1002 +variable).
  9.1003 +
  9.1004 +\subsection{Basic Management Commands}
  9.1005 +
  9.1006 +A complete list of \path{xm} commands is obtained by typing \texttt{xm
   9.1007 +  help}. One useful command is \verb_# xm list_, which lists all
   9.1008 +  running domains in rows of the following format:
  9.1009 +\begin{center} {\tt name domid memory vcpus state cputime}
  9.1010 +\end{center}
  9.1011 +
  9.1012 +The meaning of each field is as follows: 
  9.1013 +\begin{quote}
  9.1014 +  \begin{description}
  9.1015 +  \item[name] The descriptive name of the virtual machine.
  9.1016 +  \item[domid] The number of the domain ID this virtual machine is
  9.1017 +    running in.
  9.1018 +  \item[memory] Memory size in megabytes.
  9.1019 +  \item[vcpus] The number of virtual CPUs this domain has.
  9.1020 +  \item[state] Domain state consists of 5 fields:
  9.1021 +    \begin{description}
  9.1022 +    \item[r] running
  9.1023 +    \item[b] blocked
  9.1024 +    \item[p] paused
  9.1025 +    \item[s] shutdown
  9.1026 +    \item[c] crashed
  9.1027 +    \end{description}
  9.1028 +  \item[cputime] How much CPU time (in seconds) the domain has used so
  9.1029 +    far.
  9.1030 +  \end{description}
  9.1031 +\end{quote}
  9.1032 +
  9.1033 +The \path{xm list} command also supports a long output format when the
   9.1034 +\path{-l} switch is used.  This outputs the full details of the
  9.1035 +running domains in \xend's SXP configuration format.
  9.1036  
  9.1037  
  9.1038 -\part{Reference Documentation}
  9.1039 +You can get access to the console of a particular domain using 
  9.1040 +the \verb_# xm console_ command  (e.g.\ \verb_# xm console myVM_). 
  9.1041  
  9.1042 -%% Chapter Control Software
  9.1043 -\include{src/user/control_software}
  9.1044 +
  9.1045 +
  9.1046 +%% Chapter Domain Configuration
  9.1047 +\chapter{Domain Configuration}
  9.1048 +\label{cha:config}
  9.1049 +
   9.1050 +This chapter describes the syntax of the domain configuration files
   9.1051 +and how to further specify networking, driver domain and general
   9.1052 +scheduling behavior.
  9.1053 +
  9.1054 +
  9.1055 +\section{Configuration Files}
  9.1056 +\label{s:cfiles}
  9.1057 +
  9.1058 +Xen configuration files contain the following standard variables.
  9.1059 +Unless otherwise stated, configuration items should be enclosed in
  9.1060 +quotes: see the configuration scripts in \path{/etc/xen/} 
  9.1061 +for concrete examples. 
  9.1062 +
  9.1063 +\begin{description}
  9.1064 +\item[kernel] Path to the kernel image.
  9.1065 +\item[ramdisk] Path to a ramdisk image (optional).
  9.1066 +  % \item[builder] The name of the domain build function (e.g.
  9.1067 +  %   {\tt'linux'} or {\tt'netbsd'}.
  9.1068 +\item[memory] Memory size in megabytes.
  9.1069 +\item[vcpus] The number of virtual CPUs. 
  9.1070 +\item[console] Port to export the domain console on (default 9600 +
  9.1071 +  domain ID).
  9.1072 +\item[nics] Number of virtual network interfaces.
  9.1073 +\item[vif] List of MAC addresses (random addresses are assigned if not
  9.1074 +  given) and bridges to use for the domain's network interfaces, e.g.\ 
  9.1075 +\begin{verbatim}
  9.1076 +vif = [ 'mac=aa:00:00:00:00:11, bridge=xen-br0',
  9.1077 +        'bridge=xen-br1' ]
  9.1078 +\end{verbatim}
  9.1079 +  to assign a MAC address and bridge to the first interface and assign
  9.1080 +  a different bridge to the second interface, leaving \xend\ to choose
  9.1081 +  the MAC address.
   9.1082 +\item[disk] List of block devices to export to the domain, e.g.
  9.1083 +  \verb_disk = [ 'phy:hda1,sda1,r' ]_ 
  9.1084 +  exports physical device \path{/dev/hda1} to the domain as
  9.1085 +  \path{/dev/sda1} with read-only access. Exporting a disk read-write
  9.1086 +  which is currently mounted is dangerous -- if you are \emph{certain}
  9.1087 +  you wish to do this, you can specify \path{w!} as the mode.
  9.1088 +\item[dhcp] Set to {\tt `dhcp'} if you want to use DHCP to configure
  9.1089 +  networking.
  9.1090 +\item[netmask] Manually configured IP netmask.
  9.1091 +\item[gateway] Manually configured IP gateway.
  9.1092 +\item[hostname] Set the hostname for the virtual machine.
  9.1093 +\item[root] Specify the root device parameter on the kernel command
  9.1094 +  line.
  9.1095 +\item[nfs\_server] IP address for the NFS server (if any).
  9.1096 +\item[nfs\_root] Path of the root filesystem on the NFS server (if
  9.1097 +  any).
  9.1098 +\item[extra] Extra string to append to the kernel command line (if
   9.1099 +  any).
  9.1100 +\end{description}
  9.1101 +
  9.1102 +Additional fields are documented in the example configuration files 
  9.1103 +(e.g. to configure virtual TPM functionality). 
  9.1104 +
  9.1105 +For additional flexibility, it is also possible to include Python
  9.1106 +scripting commands in configuration files.  An example of this is the
  9.1107 +\path{xmexample2} file, which uses Python code to handle the
  9.1108 +\path{vmid} variable.
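          +
          +A hypothetical fragment in the same spirit (assuming, as with
          +\path{xmexample2}, that a {\tt vmid} variable is supplied on the
          +\path{xm} command line) might derive per-domain settings as follows:
          +\begin{quote}
          +{\small \begin{verbatim}
          +vmid = int(vmid)
          +name = "VM%d" % vmid
          +disk = [ 'phy:sda%d,sda1,w' % (7 + vmid) ]
          +\end{verbatim}}
          +\end{quote}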
  9.1109 +
  9.1110 +
  9.1111 +%\part{Advanced Topics}
  9.1112 +
  9.1113 +
  9.1114 +\section{Network Configuration}
  9.1115 +
  9.1116 +For many users, the default installation should work ``out of the
  9.1117 +box''.  More complicated network setups, for instance with multiple
   9.1118 +Ethernet interfaces and/or existing bridging setups, will require some
  9.1119 +special configuration.
  9.1120 +
  9.1121 +The purpose of this section is to describe the mechanisms provided by
  9.1122 +\xend\ to allow a flexible configuration for Xen's virtual networking.
  9.1123 +
  9.1124 +\subsection{Xen virtual network topology}
  9.1125 +
  9.1126 +Each domain network interface is connected to a virtual network
  9.1127 +interface in dom0 by a point to point link (effectively a ``virtual
  9.1128 +crossover cable'').  These devices are named {\tt
  9.1129 +  vif$<$domid$>$.$<$vifid$>$} (e.g.\ {\tt vif1.0} for the first
  9.1130 +interface in domain~1, {\tt vif3.1} for the second interface in
  9.1131 +domain~3).
  9.1132 +
  9.1133 +Traffic on these virtual interfaces is handled in domain~0 using
  9.1134 +standard Linux mechanisms for bridging, routing, rate limiting, etc.
  9.1135 +Xend calls on two shell scripts to perform initial configuration of
  9.1136 +the network and configuration of new virtual interfaces.  By default,
  9.1137 +these scripts configure a single bridge for all the virtual
  9.1138 +interfaces.  Arbitrary routing / bridging configurations can be
  9.1139 +configured by customizing the scripts, as described in the following
  9.1140 +section.
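          +
          +You can inspect the resulting configuration from domain~0 using the
          +standard Linux bridge tools, e.g.:
          +\begin{quote}
          +\begin{verbatim}
          +# brctl show
          +\end{verbatim}
          +\end{quote}
          +which lists the Xen bridge and the virtual interfaces currently
          +attached to it.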
  9.1141 +
  9.1142 +\subsection{Xen networking scripts}
  9.1143 +
  9.1144 +Xen's virtual networking is configured by two shell scripts (by
  9.1145 +default \path{network} and \path{vif-bridge}).  These are called
  9.1146 +automatically by \xend\ when certain events occur, with arguments to
  9.1147 +the scripts providing further contextual information.  These scripts
  9.1148 +are found by default in \path{/etc/xen/scripts}.  The names and
  9.1149 +locations of the scripts can be configured in
  9.1150 +\path{/etc/xen/xend-config.sxp}.
  9.1151 +
  9.1152 +\begin{description}
  9.1153 +\item[network:] This script is called whenever \xend\ is started or
  9.1154 +  stopped to respectively initialize or tear down the Xen virtual
  9.1155 +  network. In the default configuration initialization creates the
  9.1156 +  bridge `xen-br0' and moves eth0 onto that bridge, modifying the
  9.1157 +  routing accordingly. When \xend\ exits, it deletes the Xen bridge
  9.1158 +  and removes eth0, restoring the normal IP and routing configuration.
  9.1159 +
  9.1160 +  %% In configurations where the bridge already exists, this script
  9.1161 +  %% could be replaced with a link to \path{/bin/true} (for instance).
  9.1162 +
  9.1163 +\item[vif-bridge:] This script is called for every domain virtual
  9.1164 +  interface and can configure firewalling rules and add the vif to the
  9.1165 +  appropriate bridge. By default, this adds and removes VIFs on the
  9.1166 +  default Xen bridge.
  9.1167 +\end{description}
  9.1168 +
   9.1169 +For more complex network setups (e.g.\ where routing is required or
   9.1170 +integration with existing bridges is needed) these scripts may be
   9.1171 +replaced with customized variants for your site's preferred configuration.
  9.1172 +
  9.1173 +%% There are two possible types of privileges: IO privileges and
  9.1174 +%% administration privileges.
  9.1175 +
  9.1176 +
  9.1177 +
  9.1178 +
  9.1179 +% Chapter Storage and FileSytem Management
  9.1180 +\chapter{Storage and File System Management}
  9.1181 +
  9.1182 +Storage can be made available to virtual machines in a number of
  9.1183 +different ways.  This chapter covers some possible configurations.
  9.1184 +
  9.1185 +The most straightforward method is to export a physical block device (a
  9.1186 +hard drive or partition) from dom0 directly to the guest domain as a
  9.1187 +virtual block device (VBD).
  9.1188 +
  9.1189 +Storage may also be exported from a filesystem image or a partitioned
  9.1190 +filesystem image as a \emph{file-backed VBD}.
  9.1191 +
  9.1192 +Finally, standard network storage protocols such as NBD, iSCSI, NFS,
  9.1193 +etc., can be used to provide storage to virtual machines.
  9.1194 +
  9.1195 +
  9.1196 +\section{Exporting Physical Devices as VBDs}
  9.1197 +\label{s:exporting-physical-devices-as-vbds}
  9.1198 +
  9.1199 +One of the simplest configurations is to directly export individual
  9.1200 +partitions from domain~0 to other domains. To achieve this use the
  9.1201 +\path{phy:} specifier in your domain configuration file. For example a
  9.1202 +line like
  9.1203 +\begin{quote}
  9.1204 +  \verb_disk = ['phy:hda3,sda1,w']_
  9.1205 +\end{quote}
  9.1206 +specifies that the partition \path{/dev/hda3} in domain~0 should be
  9.1207 +exported read-write to the new domain as \path{/dev/sda1}; one could
  9.1208 +equally well export it as \path{/dev/hda} or \path{/dev/sdb5} should
  9.1209 +one wish.
  9.1210 +
  9.1211 +In addition to local disks and partitions, it is possible to export
  9.1212 +any device that Linux considers to be ``a disk'' in the same manner.
  9.1213 +For example, if you have iSCSI disks or GNBD volumes imported into
  9.1214 +domain~0 you can export these to other domains using the \path{phy:}
  9.1215 +disk syntax. E.g.:
  9.1216 +\begin{quote}
  9.1217 +  \verb_disk = ['phy:vg/lvm1,sda2,w']_
  9.1218 +\end{quote}
  9.1219 +
  9.1220 +\begin{center}
  9.1221 +  \framebox{\bf Warning: Block device sharing}
  9.1222 +\end{center}
  9.1223 +\begin{quote}
  9.1224 +  Block devices should typically only be shared between domains in a
   9.1225 +  read-only fashion; otherwise the Linux kernel's file systems will get
   9.1226 +  very confused as the file system structure may change underneath
   9.1227 +  them (having the same ext3 partition mounted \path{rw} twice is a
   9.1228 +  sure-fire way to cause irreparable damage)!  \Xend\ will attempt to
  9.1229 +  prevent you from doing this by checking that the device is not
  9.1230 +  mounted read-write in domain~0, and hasn't already been exported
  9.1231 +  read-write to another domain.  If you want read-write sharing,
  9.1232 +  export the directory to other domains via NFS from domain~0 (or use
  9.1233 +  a cluster file system such as GFS or ocfs2).
  9.1234 +\end{quote}
  9.1235 +
  9.1236 +
  9.1237 +\section{Using File-backed VBDs}
  9.1238 +
  9.1239 +It is also possible to use a file in Domain~0 as the primary storage
  9.1240 +for a virtual machine.  As well as being convenient, this also has the
  9.1241 +advantage that the virtual block device will be \emph{sparse} ---
  9.1242 +space will only really be allocated as parts of the file are used.  So
  9.1243 +if a virtual machine uses only half of its disk space then the file
  9.1244 +really takes up half of the size allocated.
  9.1245 +
  9.1246 +For example, to create a 2GB sparse file-backed virtual block device
  9.1247 +(actually only consumes 1KB of disk):
  9.1248 +\begin{quote}
  9.1249 +  \verb_# dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1_
  9.1250 +\end{quote}
  9.1251 +
  9.1252 +Make a file system in the disk file:
  9.1253 +\begin{quote}
  9.1254 +  \verb_# mkfs -t ext3 vm1disk_
  9.1255 +\end{quote}
  9.1256 +
  9.1257 +(when the tool asks for confirmation, answer `y')
  9.1258 +
  9.1259 +Populate the file system e.g.\ by copying from the current root:
  9.1260 +\begin{quote}
  9.1261 +\begin{verbatim}
  9.1262 +# mount -o loop vm1disk /mnt
  9.1263 +# cp -ax /{root,dev,var,etc,usr,bin,sbin,lib} /mnt
  9.1264 +# mkdir /mnt/{proc,sys,home,tmp}
  9.1265 +\end{verbatim}
  9.1266 +\end{quote}
  9.1267 +
  9.1268 +Tailor the file system by editing \path{/etc/fstab},
  9.1269 +\path{/etc/hostname}, etc.\ Don't forget to edit the files in the
  9.1270 +mounted file system, instead of your domain~0 filesystem, e.g.\ you
   9.1271 +would edit \path{/mnt/etc/fstab} instead of \path{/etc/fstab}.  For
   9.1272 +this example, use \path{/dev/sda1} as the root device in fstab.
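          +
          +As a minimal sketch, the corresponding root entry in
          +\path{/mnt/etc/fstab} might then read as follows (assuming the guest
          +sees its ext3 root as \path{/dev/sda1}, as in this example):
          +\begin{quote}
          +\begin{verbatim}
          +/dev/sda1   /   ext3   defaults   1   1
          +\end{verbatim}
          +\end{quote}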
  9.1273 +
  9.1274 +Now unmount (this is important!):
  9.1275 +\begin{quote}
  9.1276 +  \verb_# umount /mnt_
  9.1277 +\end{quote}
  9.1278 +
  9.1279 +In the configuration file set:
  9.1280 +\begin{quote}
  9.1281 +  \verb_disk = ['file:/full/path/to/vm1disk,sda1,w']_
  9.1282 +\end{quote}
  9.1283 +
  9.1284 +As the virtual machine writes to its `disk', the sparse file will be
  9.1285 +filled in and consume more space up to the original 2GB.
  9.1286 +
  9.1287 +{\bf Note that file-backed VBDs may not be appropriate for backing
  9.1288 +  I/O-intensive domains.}  File-backed VBDs are known to experience
  9.1289 +substantial slowdowns under heavy I/O workloads, due to the I/O
  9.1290 +handling by the loopback block device used to support file-backed VBDs
  9.1291 +in dom0.  Better I/O performance can be achieved by using either
  9.1292 +LVM-backed VBDs (Section~\ref{s:using-lvm-backed-vbds}) or physical
  9.1293 +devices as VBDs (Section~\ref{s:exporting-physical-devices-as-vbds}).
  9.1294 +
  9.1295 +Linux supports a maximum of eight file-backed VBDs across all domains
  9.1296 +by default.  This limit can be statically increased by using the
  9.1297 +\emph{max\_loop} module parameter if CONFIG\_BLK\_DEV\_LOOP is
  9.1298 +compiled as a module in the dom0 kernel, or by using the
  9.1299 +\emph{max\_loop=n} boot option if CONFIG\_BLK\_DEV\_LOOP is compiled
  9.1300 +directly into the dom0 kernel.
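          +
          +For example, assuming the loop driver is built as a module, it could be
          +loaded with a larger limit as sketched below (the value chosen here is
          +arbitrary):
          +\begin{quote}
          +\begin{verbatim}
          +# modprobe loop max_loop=64
          +\end{verbatim}
          +\end{quote}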
  9.1301 +
  9.1302 +
  9.1303 +\section{Using LVM-backed VBDs}
  9.1304 +\label{s:using-lvm-backed-vbds}
  9.1305 +
  9.1306 +A particularly appealing solution is to use LVM volumes as backing for
  9.1307 +domain file-systems since this allows dynamic growing/shrinking of
   9.1308 +volumes as well as snapshots and other features.
  9.1309 +
  9.1310 +To initialize a partition to support LVM volumes:
  9.1311 +\begin{quote}
  9.1312 +\begin{verbatim}
  9.1313 +# pvcreate /dev/sda10           
  9.1314 +\end{verbatim} 
  9.1315 +\end{quote}
  9.1316 +
  9.1317 +Create a volume group named `vg' on the physical partition:
  9.1318 +\begin{quote}
  9.1319 +\begin{verbatim}
  9.1320 +# vgcreate vg /dev/sda10
  9.1321 +\end{verbatim} 
  9.1322 +\end{quote}
  9.1323 +
  9.1324 +Create a logical volume of size 4GB named `myvmdisk1':
  9.1325 +\begin{quote}
  9.1326 +\begin{verbatim}
  9.1327 +# lvcreate -L4096M -n myvmdisk1 vg
  9.1328 +\end{verbatim}
  9.1329 +\end{quote}
  9.1330 +
   9.1331 +You should now see that you have a \path{/dev/vg/myvmdisk1} device.
   9.1332 +Make a filesystem, mount it and populate it, e.g.:
  9.1333 +\begin{quote}
  9.1334 +\begin{verbatim}
  9.1335 +# mkfs -t ext3 /dev/vg/myvmdisk1
  9.1336 +# mount /dev/vg/myvmdisk1 /mnt
  9.1337 +# cp -ax / /mnt
  9.1338 +# umount /mnt
  9.1339 +\end{verbatim}
  9.1340 +\end{quote}
  9.1341 +
  9.1342 +Now configure your VM with the following disk configuration:
  9.1343 +\begin{quote}
  9.1344 +\begin{verbatim}
  9.1345 + disk = [ 'phy:vg/myvmdisk1,sda1,w' ]
  9.1346 +\end{verbatim}
  9.1347 +\end{quote}
  9.1348 +
  9.1349 +LVM enables you to grow the size of logical volumes, but you'll need
  9.1350 +to resize the corresponding file system to make use of the new space.
  9.1351 +Some file systems (e.g.\ ext3) now support online resize.  See the LVM
  9.1352 +manuals for more details.
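          +
          +For example, one possible (offline) sequence for adding 1GB to the
          +volume and then growing the ext3 file system to match is sketched
          +below; the exact resize tool depends on your file system and
          +distribution:
          +\begin{quote}
          +\begin{verbatim}
          +# lvextend -L+1G /dev/vg/myvmdisk1
          +# e2fsck -f /dev/vg/myvmdisk1    # volume must not be in use
          +# resize2fs /dev/vg/myvmdisk1
          +\end{verbatim}
          +\end{quote}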
  9.1353 +
  9.1354 +You can also use LVM for creating copy-on-write (CoW) clones of LVM
  9.1355 +volumes (known as writable persistent snapshots in LVM terminology).
  9.1356 +This facility is new in Linux 2.6.8, so isn't as stable as one might
  9.1357 +hope.  In particular, using lots of CoW LVM disks consumes a lot of
  9.1358 +dom0 memory, and error conditions such as running out of disk space
  9.1359 +are not handled well. Hopefully this will improve in future.
  9.1360 +
   9.1361 +To create two copy-on-write clones of the above file system you would
  9.1362 +use the following commands:
  9.1363 +
  9.1364 +\begin{quote}
  9.1365 +\begin{verbatim}
  9.1366 +# lvcreate -s -L1024M -n myclonedisk1 /dev/vg/myvmdisk1
  9.1367 +# lvcreate -s -L1024M -n myclonedisk2 /dev/vg/myvmdisk1
  9.1368 +\end{verbatim}
  9.1369 +\end{quote}
  9.1370 +
  9.1371 +Each of these can grow to have 1GB of differences from the master
  9.1372 +volume. You can grow the amount of space for storing the differences
  9.1373 +using the lvextend command, e.g.:
  9.1374 +\begin{quote}
  9.1375 +\begin{verbatim}
   9.1376 +# lvextend -L+100M /dev/vg/myclonedisk1
  9.1377 +\end{verbatim}
  9.1378 +\end{quote}
  9.1379 +
   9.1380 +Don't ever let the `differences volume' fill up; otherwise LVM gets
   9.1381 +rather confused. It may be possible to automate the growing process by
   9.1382 +using \path{dmsetup wait} to spot the volume getting full and then
   9.1383 +issuing an \path{lvextend}.
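          +
          +One simple way to keep an eye on snapshot usage is the \path{lvs}
          +command, which reports how full each snapshot volume is (the exact
          +column names vary between LVM versions):
          +\begin{quote}
          +\begin{verbatim}
          +# lvs vg
          +\end{verbatim}
          +\end{quote}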
  9.1384 +
  9.1385 +In principle, it is possible to continue writing to the volume that
  9.1386 +has been cloned (the changes will not be visible to the clones), but
  9.1387 +we wouldn't recommend this: have the cloned volume as a `pristine'
  9.1388 +file system install that isn't mounted directly by any of the virtual
  9.1389 +machines.
  9.1390 +
  9.1391 +
  9.1392 +\section{Using NFS Root}
  9.1393 +
  9.1394 +First, populate a root filesystem in a directory on the server
  9.1395 +machine. This can be on a distinct physical machine, or simply run
  9.1396 +within a virtual machine on the same node.
  9.1397 +
  9.1398 +Now configure the NFS server to export this filesystem over the
  9.1399 +network by adding a line to \path{/etc/exports}, for instance:
  9.1400 +
  9.1401 +\begin{quote}
  9.1402 +  \begin{small}
  9.1403 +\begin{verbatim}
   9.1404 +/export/vm1root      1.2.3.4/24(rw,sync,no_root_squash)
  9.1405 +\end{verbatim}
  9.1406 +  \end{small}
  9.1407 +\end{quote}
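          +
          +After editing \path{/etc/exports}, re-export the file systems so that
          +the new entry takes effect (assuming a standard Linux NFS server):
          +\begin{quote}
          +\begin{verbatim}
          +# exportfs -ra
          +\end{verbatim}
          +\end{quote}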
  9.1408 +
  9.1409 +Finally, configure the domain to use NFS root.  In addition to the
  9.1410 +normal variables, you should make sure to set the following values in
  9.1411 +the domain's configuration file:
  9.1412 +
  9.1413 +\begin{quote}
  9.1414 +  \begin{small}
  9.1415 +\begin{verbatim}
  9.1416 +root       = '/dev/nfs'
  9.1417 +nfs_server = '2.3.4.5'       # substitute IP address of server
  9.1418 +nfs_root   = '/path/to/root' # path to root FS on the server
  9.1419 +\end{verbatim}
  9.1420 +  \end{small}
  9.1421 +\end{quote}
  9.1422 +
  9.1423 +The domain will need network access at boot time, so either statically
  9.1424 +configure an IP address using the config variables \path{ip},
  9.1425 +\path{netmask}, \path{gateway}, \path{hostname}; or enable DHCP
  9.1426 +(\path{dhcp='dhcp'}).
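          +
          +For instance, a static network configuration in the domain's
          +configuration file might look like the following sketch (all addresses
          +shown are placeholders for your own network settings):
          +\begin{quote}
          +\begin{verbatim}
          +ip       = '2.3.4.10'
          +netmask  = '255.255.255.0'
          +gateway  = '2.3.4.1'
          +hostname = 'vm1'
          +\end{verbatim}
          +\end{quote}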
  9.1427 +
  9.1428 +Note that the Linux NFS root implementation is known to have stability
  9.1429 +problems under high load (this is not a Xen-specific problem), so this
  9.1430 +configuration may not be appropriate for critical servers.
  9.1431 +
  9.1432 +
  9.1433 +\chapter{CPU Management}
  9.1434 +
  9.1435 +%% KMS Something sage about CPU / processor management.
  9.1436 +
  9.1437 +Xen allows a domain's virtual CPU(s) to be associated with one or more
  9.1438 +host CPUs.  This can be used to allocate real resources among one or
  9.1439 +more guests, or to make optimal use of processor resources when
  9.1440 +utilizing dual-core, hyperthreading, or other advanced CPU technologies.
  9.1441 +
  9.1442 +Xen enumerates physical CPUs in a `depth first' fashion.  For a system
  9.1443 +with both hyperthreading and multiple cores, this would be all the
  9.1444 +hyperthreads on a given core, then all the cores on a given socket,
  9.1445 +and then all sockets.  I.e.  if you had a two socket, dual core,
  9.1446 +hyperthreaded Xeon the CPU order would be:
  9.1447 +
  9.1448 +
  9.1449 +\begin{center}
  9.1450 +\begin{tabular}{l|l|l|l|l|l|l|r}
  9.1451 +\multicolumn{4}{c|}{socket0}     &  \multicolumn{4}{c}{socket1} \\ \hline
  9.1452 +\multicolumn{2}{c|}{core0}  &  \multicolumn{2}{c|}{core1}  &
  9.1453 +\multicolumn{2}{c|}{core0}  &  \multicolumn{2}{c}{core1} \\ \hline
  9.1454 +ht0 & ht1 & ht0 & ht1 & ht0 & ht1 & ht0 & ht1 \\
  9.1455 +\#0 & \#1 & \#2 & \#3 & \#4 & \#5 & \#6 & \#7 \\
  9.1456 +\end{tabular}
  9.1457 +\end{center}
  9.1458 +
  9.1459 +
  9.1460 +Having multiple vcpus belonging to the same domain mapped to the same
  9.1461 +physical CPU is very likely to lead to poor performance. It's better to
  9.1462 +use `vcpus-set' to hot-unplug one of the vcpus and ensure the others are
  9.1463 +pinned on different CPUs.
  9.1464 +
   9.1465 +If you are running I/O-intensive tasks, it's typically better to dedicate
  9.1466 +either a hyperthread or whole core to running domain 0, and hence pin
  9.1467 +other domains so that they can't use CPU 0. If your workload is mostly
  9.1468 +compute intensive, you may want to pin vcpus such that all physical CPU
  9.1469 +threads are available for guest domains.
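          +
          +As a sketch, pinning can be adjusted at runtime with
          +\path{xm vcpu-pin}; the example below assumes the syntax
          +{\tt xm vcpu-pin $<$domain$>$ $<$vcpu$>$ $<$cpus$>$} and an
          +illustrative guest name:
          +\begin{quote}
          +\begin{verbatim}
          +# xm vcpu-pin Domain-0 0 0    # keep dom0's vcpu 0 on physical CPU 0
          +# xm vcpu-pin mydomain 0 1    # pin the guest's vcpu 0 to physical CPU 1
          +\end{verbatim}
          +\end{quote}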
  9.1470 +
  9.1471 +\chapter{Migrating Domains}
  9.1472 +
  9.1473 +\section{Domain Save and Restore}
  9.1474 +
  9.1475 +The administrator of a Xen system may suspend a virtual machine's
  9.1476 +current state into a disk file in domain~0, allowing it to be resumed at
  9.1477 +a later time.
  9.1478 +
  9.1479 +For example you can suspend a domain called ``VM1'' to disk using the
  9.1480 +command:
  9.1481 +\begin{verbatim}
  9.1482 +# xm save VM1 VM1.chk
  9.1483 +\end{verbatim}
  9.1484 +
  9.1485 +This will stop the domain named ``VM1'' and save its current state
  9.1486 +into a file called \path{VM1.chk}.
  9.1487 +
  9.1488 +To resume execution of this domain, use the \path{xm restore} command:
  9.1489 +\begin{verbatim}
  9.1490 +# xm restore VM1.chk
  9.1491 +\end{verbatim}
  9.1492 +
  9.1493 +This will restore the state of the domain and resume its execution.
  9.1494 +The domain will carry on as before and the console may be reconnected
  9.1495 +using the \path{xm console} command, as described earlier.
  9.1496 +
  9.1497 +\section{Migration and Live Migration}
  9.1498 +
  9.1499 +Migration is used to transfer a domain between physical hosts. There
  9.1500 +are two varieties: regular and live migration. The former moves a
  9.1501 +virtual machine from one host to another by pausing it, copying its
  9.1502 +memory contents, and then resuming it on the destination. The latter
  9.1503 +performs the same logical functionality but without needing to pause
  9.1504 +the domain for the duration. In general when performing live migration
  9.1505 +the domain continues its usual activities and---from the user's
  9.1506 +perspective---the migration should be imperceptible.
  9.1507 +
  9.1508 +To perform a live migration, both hosts must be running Xen / \xend\ and
  9.1509 +the destination host must have sufficient resources (e.g.\ memory
  9.1510 +capacity) to accommodate the domain after the move. Furthermore we
  9.1511 +currently require both source and destination machines to be on the same
  9.1512 +L2 subnet.
  9.1513 +
  9.1514 +Currently, there is no support for providing automatic remote access
  9.1515 +to filesystems stored on local disk when a domain is migrated.
  9.1516 +Administrators should choose an appropriate storage solution (i.e.\
  9.1517 +SAN, NAS, etc.) to ensure that domain filesystems are also available
  9.1518 +on their destination node. GNBD is a good method for exporting a
  9.1519 +volume from one machine to another. iSCSI can do a similar job, but is
  9.1520 +more complex to set up.
  9.1521 +
   9.1522 +When a domain migrates, its MAC and IP address move with it; thus it is
  9.1523 +only possible to migrate VMs within the same layer-2 network and IP
  9.1524 +subnet. If the destination node is on a different subnet, the
  9.1525 +administrator would need to manually configure a suitable etherip or IP
  9.1526 +tunnel in the domain~0 of the remote node.
  9.1527 +
  9.1528 +A domain may be migrated using the \path{xm migrate} command. To live
  9.1529 +migrate a domain to another machine, we would use the command:
  9.1530 +
  9.1531 +\begin{verbatim}
  9.1532 +# xm migrate --live mydomain destination.ournetwork.com
  9.1533 +\end{verbatim}
  9.1534 +
  9.1535 +Without the \path{--live} flag, \xend\ simply stops the domain and
  9.1536 +copies the memory image over to the new node and restarts it. Since
  9.1537 +domains can have large allocations this can be quite time consuming,
  9.1538 +even on a Gigabit network. With the \path{--live} flag \xend\ attempts
  9.1539 +to keep the domain running while the migration is in progress, resulting
  9.1540 +in typical down times of just 60--300ms.
  9.1541 +
  9.1542 +For now it will be necessary to reconnect to the domain's console on the
  9.1543 +new machine using the \path{xm console} command. If a migrated domain
  9.1544 +has any open network connections then they will be preserved, so SSH
  9.1545 +connections do not have this limitation.
  9.1546 +
  9.1547 +
  9.1548 +%% Chapter Securing Xen
  9.1549 +\chapter{Securing Xen}
  9.1550 +
  9.1551 +This chapter describes how to secure a Xen system. It describes a number
  9.1552 +of scenarios and provides a corresponding set of best practices. It
  9.1553 +begins with a section devoted to understanding the security implications
  9.1554 +of a Xen system.
  9.1555 +
  9.1556 +
  9.1557 +\section{Xen Security Considerations}
  9.1558 +
  9.1559 +When deploying a Xen system, one must be sure to secure the management
  9.1560 +domain (Domain-0) as much as possible. If the management domain is
  9.1561 +compromised, all other domains are also vulnerable. The following are a
  9.1562 +set of best practices for Domain-0:
  9.1563 +
  9.1564 +\begin{enumerate}
   9.1565 +\item \textbf{Run the smallest number of necessary services.} The fewer
   9.1566 +  things that are present in the management partition, the better.
  9.1567 +  Remember, a service running as root in the management domain has full
  9.1568 +  access to all other domains on the system.
  9.1569 +\item \textbf{Use a firewall to restrict the traffic to the management
  9.1570 +    domain.} A firewall with default-reject rules will help prevent
  9.1571 +  attacks on the management domain.
  9.1572 +\item \textbf{Do not allow users to access Domain-0.} The Linux kernel
  9.1573 +  has been known to have local-user root exploits. If you allow normal
  9.1574 +  users to access Domain-0 (even as unprivileged users) you run the risk
  9.1575 +  of a kernel exploit making all of your domains vulnerable.
  9.1576 +\end{enumerate}
  9.1577 +
  9.1578 +\section{Security Scenarios}
  9.1579 +
  9.1580 +
  9.1581 +\subsection{The Isolated Management Network}
  9.1582 +
   9.1583 +In this scenario, each node in the cluster has two network cards. One
   9.1584 +network card is connected to the outside world and the other is
   9.1585 +connected to a physically isolated management network specifically for
   9.1586 +Xen instances to use.
  9.1587 +
  9.1588 +As long as all of the management partitions are trusted equally, this is
  9.1589 +the most secure scenario. No additional configuration is needed other
  9.1590 +than forcing Xend to bind to the management interface for relocation.
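          +
          +For example, the relocation-related entries in
          +\path{/etc/xen/xend-config.sxp} might be set roughly as below; the
          +option names are an assumption based on a typical installation, so
          +consult the comments in your own configuration file:
          +\begin{quote}
          +\begin{verbatim}
          +(xend-relocation-server yes)
          +(xend-relocation-port 8002)
          +(xend-relocation-address '10.0.0.1')  # management interface address
          +\end{verbatim}
          +\end{quote}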
  9.1591 +
  9.1592 +
  9.1593 +\subsection{A Subnet Behind a Firewall}
  9.1594 +
  9.1595 +In this scenario, each node has only one network card but the entire
  9.1596 +cluster sits behind a firewall. This firewall should do at least the
  9.1597 +following:
  9.1598 +
  9.1599 +\begin{enumerate}
  9.1600 +\item Prevent IP spoofing from outside of the subnet.
  9.1601 +\item Prevent access to the relocation port of any of the nodes in the
  9.1602 +  cluster except from within the cluster.
  9.1603 +\end{enumerate}
  9.1604 +
  9.1605 +The following iptables rules can be used on each node to prevent
   9.1606 +migrations to that node from outside the subnet, assuming the main
  9.1607 +firewall does not do this for you:
  9.1608 +
  9.1609 +\begin{verbatim}
  9.1610 +# this command disables all access to the Xen relocation
  9.1611 +# port:
  9.1612 +iptables -A INPUT -p tcp --destination-port 8002 -j REJECT
  9.1613 +
  9.1614 +# this command enables Xen relocations only from the specific
  9.1615 +# subnet:
   9.1616 +iptables -I INPUT -p tcp --source 192.168.1.1/8 \
  9.1617 +    --destination-port 8002 -j ACCEPT
  9.1618 +\end{verbatim}
  9.1619 +
  9.1620 +\subsection{Nodes on an Untrusted Subnet}
  9.1621 +
  9.1622 +Migration on an untrusted subnet is not safe in current versions of Xen.
   9.1623 +It may be possible to perform migrations through a secure tunnel via a
  9.1624 +VPN or SSH. The only safe option in the absence of a secure tunnel is to
  9.1625 +disable migration completely. The easiest way to do this is with
  9.1626 +iptables:
  9.1627 +
  9.1628 +\begin{verbatim}
  9.1629 +# this command disables all access to the Xen relocation port
   9.1630 +iptables -A INPUT -p tcp --destination-port 8002 -j REJECT
  9.1631 +\end{verbatim}
  9.1632 +
  9.1633 +\part{Reference}
  9.1634  
  9.1635  %% Chapter Build and Boot Options
  9.1636 -\include{src/user/options}
  9.1637 +\chapter{Build and Boot Options} 
  9.1638 +
  9.1639 +This chapter describes the build- and boot-time options which may be
  9.1640 +used to tailor your Xen system.
  9.1641 +
  9.1642 +\section{Top-level Configuration Options} 
  9.1643 +
  9.1644 +Top-level configuration is achieved by editing one of two 
  9.1645 +files: \path{Config.mk} and \path{Makefile}. 
  9.1646 +
  9.1647 +The former allows the overall build target architecture to be 
  9.1648 +specified. You will typically not need to modify this unless 
  9.1649 +you are cross-compiling or if you wish to build a PAE-enabled 
  9.1650 +Xen system. Additional configuration options are documented 
  9.1651 +in the \path{Config.mk} file. 
  9.1652 +
  9.1653 +The top-level \path{Makefile} is chiefly used to customize the set of
  9.1654 +kernels built. Look for the line: 
  9.1655 +\begin{quote}
  9.1656 +\begin{verbatim}
  9.1657 +KERNELS ?= linux-2.6-xen0 linux-2.6-xenU
  9.1658 +\end{verbatim}
  9.1659 +\end{quote}
  9.1660 +
  9.1661 +Allowable options here are any kernels which have a corresponding 
  9.1662 +build configuration file in the \path{buildconfigs/} directory. 
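          +
          +For example, to build only the domain~0 kernel you might override this
          +variable on the make command line, along the lines of the sketch below
          +(\path{dist} is the usual top-level build target):
          +\begin{quote}
          +\begin{verbatim}
          +# make KERNELS="linux-2.6-xen0" dist
          +\end{verbatim}
          +\end{quote}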
  9.1663 +
  9.1664 +
  9.1665 +
  9.1666 +\section{Xen Build Options}
  9.1667 +
  9.1668 +Xen provides a number of build-time options which should be set as
  9.1669 +environment variables or passed on make's command-line.
  9.1670 +
  9.1671 +\begin{description}
  9.1672 +\item[verbose=y] Enable debugging messages when Xen detects an
  9.1673 +  unexpected condition.  Also enables console output from all domains.
  9.1674 +\item[debug=y] Enable debug assertions.  Implies {\bf verbose=y}.
  9.1675 +  (Primarily useful for tracing bugs in Xen).
  9.1676 +\item[debugger=y] Enable the in-Xen debugger. This can be used to
  9.1677 +  debug Xen, guest OSes, and applications.
  9.1678 +\item[perfc=y] Enable performance counters for significant events
  9.1679 +  within Xen. The counts can be reset or displayed on Xen's console
  9.1680 +  via console control keys.
  9.1681 +\end{description}
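          +
          +For example, a debugging build incorporating the options above might be
          +produced with something like the following (a sketch; combine options
          +as required):
          +\begin{quote}
          +\begin{verbatim}
          +# make debug=y dist
          +\end{verbatim}
          +\end{quote}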
  9.1682 +
  9.1683 +
  9.1684 +\section{Xen Boot Options}
  9.1685 +\label{s:xboot}
  9.1686 +
  9.1687 +These options are used to configure Xen's behaviour at runtime.  They
  9.1688 +should be appended to Xen's command line, either manually or by
  9.1689 +editing \path{grub.conf}.
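          +
          +For example, a \path{grub.conf} entry passing some of the options
          +described below to Xen might look roughly like this (paths, device
          +names and values are illustrative only):
          +\begin{quote}
          +  \begin{small}
          +\begin{verbatim}
          +title Xen 3.0 / XenLinux 2.6
          +  kernel /boot/xen-3.0.gz dom0_mem=262144 console=vga noreboot
          +  module /boot/vmlinuz-2.6-xen0 root=/dev/sda3 ro console=tty0
          +\end{verbatim}
          +  \end{small}
          +\end{quote}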
  9.1690 +
  9.1691 +\begin{description}
  9.1692 +\item [ noreboot ] Don't reboot the machine automatically on errors.
  9.1693 +  This is useful to catch debug output if you aren't catching console
  9.1694 +  messages via the serial line.
  9.1695 +\item [ nosmp ] Disable SMP support.  This option is implied by
  9.1696 +  `ignorebiostables'.
  9.1697 +\item [ watchdog ] Enable NMI watchdog which can report certain
  9.1698 +  failures.
  9.1699 +\item [ noirqbalance ] Disable software IRQ balancing and affinity.
  9.1700 +  This can be used on systems such as Dell 1850/2850 that have
  9.1701 +  workarounds in hardware for IRQ-routing issues.
  9.1702 +\item [ badpage=$<$page number$>$,$<$page number$>$, \ldots ] Specify
  9.1703 +  a list of pages not to be allocated for use because they contain bad
  9.1704 +  bytes. For example, if your memory tester says that byte 0x12345678
  9.1705 +  is bad, you would place `badpage=0x12345' on Xen's command line.
  9.1706 +\item [ com1=$<$baud$>$,DPS,$<$io\_base$>$,$<$irq$>$
  9.1707 +  com2=$<$baud$>$,DPS,$<$io\_base$>$,$<$irq$>$ ] \mbox{}\\
  9.1708 +  Xen supports up to two 16550-compatible serial ports.  For example:
  9.1709 +  `com1=9600, 8n1, 0x408, 5' maps COM1 to a 9600-baud port, 8 data
  9.1710 +  bits, no parity, 1 stop bit, I/O port base 0x408, IRQ 5.  If some
  9.1711 +  configuration options are standard (e.g., I/O base and IRQ), then
  9.1712 +  only a prefix of the full configuration string need be specified. If
  9.1713 +  the baud rate is pre-configured (e.g., by the bootloader) then you
  9.1714 +  can specify `auto' in place of a numeric baud rate.
  9.1715 +\item [ console=$<$specifier list$>$ ] Specify the destination for Xen
  9.1716 +  console I/O.  This is a comma-separated list of, for example:
  9.1717 +  \begin{description}
  9.1718 +  \item[ vga ] Use VGA console and allow keyboard input.
  9.1719 +  \item[ com1 ] Use serial port com1.
  9.1720 +  \item[ com2H ] Use serial port com2. Transmitted chars will have the
  9.1721 +    MSB set. Received chars must have MSB set.
  9.1722 +  \item[ com2L] Use serial port com2. Transmitted chars will have the
  9.1723 +    MSB cleared. Received chars must have MSB cleared.
  9.1724 +  \end{description}
  9.1725 +  The latter two examples allow a single port to be shared by two
  9.1726 +  subsystems (e.g.\ console and debugger). Sharing is controlled by
  9.1727 +  MSB of each transmitted/received character.  [NB. Default for this
  9.1728 +  option is `com1,vga']
  9.1729 +\item [ sync\_console ] Force synchronous console output. This is
   9.1730 +  useful if your system fails unexpectedly before it has sent all
  9.1731 +  available output to the console. In most cases Xen will
  9.1732 +  automatically enter synchronous mode when an exceptional event
  9.1733 +  occurs, but this option provides a manual fallback.
  9.1734 +\item [ conswitch=$<$switch-char$><$auto-switch-char$>$ ] Specify how
  9.1735 +  to switch serial-console input between Xen and DOM0. The required
  9.1736 +  sequence is CTRL-$<$switch-char$>$ pressed three times. Specifying
  9.1737 +  the backtick character disables switching.  The
  9.1738 +  $<$auto-switch-char$>$ specifies whether Xen should auto-switch
  9.1739 +  input to DOM0 when it boots --- if it is `x' then auto-switching is
  9.1740 +  disabled.  Any other value, or omitting the character, enables
  9.1741 +  auto-switching.  [NB. Default switch-char is `a'.]
  9.1742 +\item [ nmi=xxx ]
  9.1743 +  Specify what to do with an NMI parity or I/O error. \\
  9.1744 +  `nmi=fatal':  Xen prints a diagnostic and then hangs. \\
  9.1745 +  `nmi=dom0':   Inform DOM0 of the NMI. \\
  9.1746 +  `nmi=ignore': Ignore the NMI.
  9.1747 +\item [ mem=xxx ] Set the physical RAM address limit. Any RAM
  9.1748 +  appearing beyond this physical address in the memory map will be
  9.1749 +  ignored. This parameter may be specified with a B, K, M or G suffix,
  9.1750 +  representing bytes, kilobytes, megabytes and gigabytes respectively.
  9.1751 +  The default unit, if no suffix is specified, is kilobytes.
  9.1752 +\item [ dom0\_mem=xxx ] Set the amount of memory to be allocated to
  9.1753 +  domain0. In Xen 3.x the parameter may be specified with a B, K, M or
  9.1754 +  G suffix, representing bytes, kilobytes, megabytes and gigabytes
  9.1755 +  respectively; if no suffix is specified, the parameter defaults to
  9.1756 +  kilobytes. In previous versions of Xen, suffixes were not supported
   9.1757 +  and the value was always interpreted as kilobytes.
  9.1758 +\item [ tbuf\_size=xxx ] Set the size of the per-cpu trace buffers, in
  9.1759 +  pages (default 1).  Note that the trace buffers are only enabled in
  9.1760 +  debug builds.  Most users can ignore this feature completely.
  9.1761 +\item [ sched=xxx ] Select the CPU scheduler Xen should use.  The
  9.1762 +  current possibilities are `sedf' (default) and `bvt'.
  9.1763 +\item [ apic\_verbosity=debug,verbose ] Print more detailed
  9.1764 +  information about local APIC and IOAPIC configuration.
  9.1765 +\item [ lapic ] Force use of local APIC even when left disabled by
  9.1766 +  uniprocessor BIOS.
  9.1767 +\item [ nolapic ] Ignore local APIC in a uniprocessor system, even if
  9.1768 +  enabled by the BIOS.
  9.1769 +\item [ apic=bigsmp,default,es7000,summit ] Specify NUMA platform.
  9.1770 +  This can usually be probed automatically.
  9.1771 +\end{description}
  9.1772 +
  9.1773 +In addition, the following options may be specified on the Xen command
  9.1774 +line. Since domain 0 shares responsibility for booting the platform,
  9.1775 +Xen will automatically propagate these options to its command line.
  9.1776 +These options are taken from Linux's command-line syntax with
  9.1777 +unchanged semantics.
  9.1778 +
  9.1779 +\begin{description}
  9.1780 +\item [ acpi=off,force,strict,ht,noirq,\ldots ] Modify how Xen (and
  9.1781 +  domain 0) parses the BIOS ACPI tables.
  9.1782 +\item [ acpi\_skip\_timer\_override ] Instruct Xen (and domain~0) to
  9.1783 +  ignore timer-interrupt override instructions specified by the BIOS
  9.1784 +  ACPI tables.
  9.1785 +\item [ noapic ] Instruct Xen (and domain~0) to ignore any IOAPICs
  9.1786 +  that are present in the system, and instead continue to use the
  9.1787 +  legacy PIC.
  9.1788 +\end{description} 
  9.1789 +
  9.1790 +
  9.1791 +\section{XenLinux Boot Options}
  9.1792 +
  9.1793 +In addition to the standard Linux kernel boot options, we support:
  9.1794 +\begin{description}
  9.1795 +\item[ xencons=xxx ] Specify the device node to which the Xen virtual
  9.1796 +  console driver is attached. The following options are supported:
  9.1797 +  \begin{center}
  9.1798 +    \begin{tabular}{l}
  9.1799 +      `xencons=off': disable virtual console \\
  9.1800 +      `xencons=tty': attach console to /dev/tty1 (tty0 at boot-time) \\
  9.1801 +      `xencons=ttyS': attach console to /dev/ttyS0
  9.1802 +    \end{tabular}
  9.1803 +\end{center}
  9.1804 +The default is ttyS for dom0 and tty for all other domains.
  9.1805 +\end{description}
  9.1806 +
  9.1807  
  9.1808  %% Chapter Further Support
  9.1809 -\include{src/user/further_support}
  9.1810 +\chapter{Further Support}
  9.1811 +
  9.1812 +If you have questions that are not answered by this manual, the
  9.1813 +sources of information listed below may be of interest to you.  Note
  9.1814 +that bug reports, suggestions and contributions related to the
  9.1815 +software (or the documentation) should be sent to the Xen developers'
  9.1816 +mailing list (address below).
  9.1817 +
  9.1818 +
  9.1819 +\section{Other Documentation}
  9.1820 +
  9.1821 +For developers interested in porting operating systems to Xen, the
  9.1822 +\emph{Xen Interface Manual} is distributed in the \path{docs/}
  9.1823 +directory of the Xen source distribution.
  9.1824 +
  9.1825 +
  9.1826 +\section{Online References}
  9.1827 +
  9.1828 +The official Xen web site can be found at:
  9.1829 +\begin{quote} {\tt http://www.xensource.com}
  9.1830 +\end{quote}
  9.1831 +
  9.1832 +This contains links to the latest versions of all online
  9.1833 +documentation, including the latest version of the FAQ.
  9.1834 +
  9.1835 +Information regarding Xen is also available at the Xen Wiki at
  9.1836 +\begin{quote} {\tt http://wiki.xensource.com/xenwiki/}\end{quote}
  9.1837 +The Xen project uses Bugzilla as its bug tracking system. You'll find
   9.1838 +the Xen Bugzilla at {\tt http://bugzilla.xensource.com/bugzilla/}.
  9.1839 +
  9.1840 +
  9.1841 +\section{Mailing Lists}
  9.1842 +
  9.1843 +There are several mailing lists that are used to discuss Xen related
  9.1844 +topics. The most widely relevant are listed below. An official page of
  9.1845 +mailing lists and subscription information can be found at \begin{quote}
  9.1846 +  {\tt http://lists.xensource.com/} \end{quote}
  9.1847 +
  9.1848 +\begin{description}
  9.1849 +\item[xen-devel@lists.xensource.com] Used for development
  9.1850 +  discussions and bug reports.  Subscribe at: \\
  9.1851 +  {\small {\tt http://lists.xensource.com/xen-devel}}
  9.1852 +\item[xen-users@lists.xensource.com] Used for installation and usage
  9.1853 +  discussions and requests for help.  Subscribe at: \\
  9.1854 +  {\small {\tt http://lists.xensource.com/xen-users}}
  9.1855 +\item[xen-announce@lists.xensource.com] Used for announcements only.
  9.1856 +  Subscribe at: \\
  9.1857 +  {\small {\tt http://lists.xensource.com/xen-announce}}
  9.1858 +\item[xen-changelog@lists.xensource.com] Changelog feed
  9.1859 +  from the unstable and 2.0 trees - developer oriented.  Subscribe at: \\
  9.1860 +  {\small {\tt http://lists.xensource.com/xen-changelog}}
  9.1861 +\end{description}
  9.1862 +
  9.1863  
  9.1864  
  9.1865  %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
  9.1866 @@ -175,7 +1808,72 @@
  9.1867  \appendix
  9.1868  
  9.1869  %% Chapter Glossary of Terms moved to glossary.tex
  9.1870 -\include{src/user/glossary}
  9.1871 +\chapter{Glossary of Terms}
  9.1872 +
  9.1873 +\begin{description}
  9.1874 +
  9.1875 +\item[BVT] The BVT scheduler is used to give proportional fair shares
  9.1876 +  of the CPU to domains.
  9.1877 +
  9.1878 +\item[Domain] A domain is the execution context that contains a
  9.1879 +  running {\bf virtual machine}.  The relationship between virtual
  9.1880 +  machines and domains on Xen is similar to that between programs and
  9.1881 +  processes in an operating system: a virtual machine is a persistent
  9.1882 +  entity that resides on disk (somewhat like a program).  When it is
  9.1883 +  loaded for execution, it runs in a domain.  Each domain has a {\bf
  9.1884 +    domain ID}.
  9.1885 +
  9.1886 +\item[Domain 0] The first domain to be started on a Xen machine.
  9.1887 +  Domain 0 is responsible for managing the system.
  9.1888 +
  9.1889 +\item[Domain ID] A unique identifier for a {\bf domain}, analogous to
  9.1890 +  a process ID in an operating system.
  9.1891 +
  9.1892 +\item[Full virtualization] An approach to virtualization which
  9.1893 +  requires no modifications to the hosted operating system, providing
  9.1894 +  the illusion of a complete system of real hardware devices.
  9.1895 +
  9.1896 +\item[Hypervisor] An alternative term for {\bf VMM}, used because it
  9.1897 +  means `beyond supervisor', since it is responsible for managing
  9.1898 +  multiple `supervisor' kernels.
  9.1899 +
  9.1900 +\item[Live migration] A technique for moving a running virtual machine
  9.1901 +  to another physical host, without stopping it or the services
  9.1902 +  running on it.
  9.1903 +
  9.1904 +\item[Paravirtualization] An approach to virtualization which requires
  9.1905 +  modifications to the operating system in order to run in a virtual
  9.1906 +  machine.  Xen uses paravirtualization but preserves binary
  9.1907 +  compatibility for user space applications.
  9.1908 +
  9.1909 +\item[Shadow pagetables] A technique for hiding the layout of machine
  9.1910 +  memory from a virtual machine's operating system.  Used in some {\bf
   9.1911 +  VMMs} to provide the illusion of contiguous physical memory; in
  9.1912 +  Xen this is used during {\bf live migration}.
  9.1913 +
   9.1914 +\item[Virtual Block Device] Persistent storage available to a virtual
  9.1915 +  machine, providing the abstraction of an actual block storage device.
  9.1916 +  {\bf VBD}s may be actual block devices, filesystem images, or
  9.1917 +  remote/network storage.
  9.1918 +
  9.1919 +\item[Virtual Machine] The environment in which a hosted operating
  9.1920 +  system runs, providing the abstraction of a dedicated machine.  A
  9.1921 +  virtual machine may be identical to the underlying hardware (as in
   9.1922 +  {\bf full virtualization}), or it may differ, as in {\bf
   9.1923 +  paravirtualization}.
  9.1924 +
  9.1925 +\item[VMM] Virtual Machine Monitor - the software that allows multiple
  9.1926 +  virtual machines to be multiplexed on a single physical machine.
  9.1927 +
  9.1928 +\item[Xen] Xen is a paravirtualizing virtual machine monitor,
  9.1929 +  developed primarily by the Systems Research Group at the University
  9.1930 +  of Cambridge Computer Laboratory.
  9.1931 +
  9.1932 +\item[XenLinux] A name for the port of the Linux kernel that
  9.1933 +  runs on Xen.
  9.1934 +
  9.1935 +\end{description}
  9.1936 +
  9.1937  
  9.1938  \end{document}
  9.1939  
    10.1 --- a/docs/src/user/booting_xen.tex	Sun Dec 04 20:12:00 2005 +0100
    10.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    10.3 @@ -1,170 +0,0 @@
    10.4 -\chapter{Booting Xen}
    10.5 -
    10.6 -Once Xen is installed and configured as described in the preceding chapter, it
    10.7 -should now be possible to restart the system and use Xen.
    10.8 -
    10.9 -Booting the system into Xen will bring you up into the privileged management d
   10.10 -omain, Domain0. At that point you are ready to create guest domains and "boot" t
   10.11 -hem using the xm create command.
   10.12 -
   10.13 -\section{Booting Domain0}
   10.14 -
   10.15 -After installation and configuration is complete, reboot the system and and ch
   10.16 -oose the new Xen option when the Grub screen appears.
   10.17 -
   10.18 -What follows should look much like a conventional Linux boot.  The
   10.19 -first portion of the output comes from Xen itself, supplying low level
   10.20 -information about itself and the underlying hardware.  The last
   10.21 -portion of the output comes from XenLinux.
   10.22 -
   10.23 -You may see some errors during the XenLinux boot.  These are not
   10.24 -necessarily anything to worry about --- they may result from kernel
   10.25 -configuration differences between your XenLinux kernel and the one you
   10.26 -usually use.
   10.27 -
   10.28 -%% KMSelf Wed Nov 30 18:09:37 PST 2005:  We should specify what these are.
   10.29 -
   10.30 -When the boot completes, you should be able to log into your system as
   10.31 -usual.  If you are unable to log in, you should still be able to
   10.32 -reboot with your normal Linux kernel by selecting it at the GRUB prompt.
   10.33 -
   10.34 -The first step in creating a new domain is to prepare a root
   10.35 -filesystem for it to boot.  Typically, this might be stored in a normal
   10.36 -partition, an LVM or other volume manager partition, a disk file or on
   10.37 -an NFS server.  A simple way to do this is simply to boot from your
   10.38 -standard OS install CD and install the distribution into another
   10.39 -partition on your hard drive.
   10.40 -
   10.41 -To start the \xend\ control daemon, type
   10.42 -\begin{quote}
   10.43 -  \verb!# xend start!
   10.44 -\end{quote}
   10.45 -
   10.46 -If you wish the daemon to start automatically, see the instructions in
   10.47 -Section~\ref{s:xend}. Once the daemon is running, you can use the
   10.48 -\path{xm} tool to monitor and maintain the domains running on your
   10.49 -system. This chapter provides only a brief tutorial. We provide full
   10.50 -details of the \path{xm} tool in the next chapter.
   10.51 -
   10.52 -% \section{From the web interface}
   10.53 -%
   10.54 -% Boot the Xen machine and start Xensv (see Chapter~\ref{cha:xensv}
   10.55 -% for more details) using the command: \\
   10.56 -% \verb_# xensv start_ \\
   10.57 -% This will also start Xend (see Chapter~\ref{cha:xend} for more
   10.58 -% information).
   10.59 -%
   10.60 -% The domain management interface will then be available at {\tt
   10.61 -%   http://your\_machine:8080/}.  This provides a user friendly wizard
   10.62 -% for starting domains and functions for managing running domains.
   10.63 -%
   10.64 -% \section{From the command line}
   10.65 -\section{Booting Guest Domains}
   10.66 -
   10.67 -\subsection{Creating a Domain Configuration File}
   10.68 -
   10.69 -Before you can start an additional domain, you must create a
   10.70 -configuration file. We provide two example files which you can use as
   10.71 -a starting point:
   10.72 -\begin{itemize}
   10.73 -\item \path{/etc/xen/xmexample1} is a simple template configuration
   10.74 -  file for describing a single VM\@.
   10.75 -\item \path{/etc/xen/xmexample2} file is a template description that
   10.76 -  is intended to be reused for multiple virtual machines.  Setting the
   10.77 -  value of the \path{vmid} variable on the \path{xm} command line
   10.78 -  fills in parts of this template.
   10.79 -\end{itemize}
   10.80 -
   10.81 -Copy one of these files and edit it as appropriate.  Typical values
   10.82 -you may wish to edit include:
   10.83 -
   10.84 -\begin{quote}
   10.85 -\begin{description}
   10.86 -\item[kernel] Set this to the path of the kernel you compiled for use
   10.87 -  with Xen (e.g.\ \path{kernel = ``/boot/vmlinuz-2.6-xenU''})
   10.88 -\item[memory] Set this to the size of the domain's memory in megabytes
   10.89 -  (e.g.\ \path{memory = 64})
   10.90 -\item[disk] Set the first entry in this list to calculate the offset
   10.91 -  of the domain's root partition, based on the domain ID\@.  Set the
   10.92 -  second to the location of \path{/usr} if you are sharing it between
   10.93 -  domains (e.g.\ \path{disk = ['phy:your\_hard\_drive\%d,sda1,w' \%
   10.94 -    (base\_partition\_number + vmid),
   10.95 -    'phy:your\_usr\_partition,sda6,r' ]}
   10.96 -\item[dhcp] Uncomment the dhcp variable, so that the domain will
   10.97 -  receive its IP address from a DHCP server (e.g.\ \path{dhcp=``dhcp''})
   10.98 -\end{description}
   10.99 -\end{quote}
  10.100 -
  10.101 -You may also want to edit the {\bf vif} variable in order to choose
  10.102 -the MAC address of the virtual ethernet interface yourself.  For
  10.103 -example:
  10.104 -
  10.105 -\begin{quote}
  10.106 -\verb_vif = ['mac=00:16:3E:F6:BB:B3']_
  10.107 -\end{quote}
  10.108 -If you do not set this variable, \xend\ will automatically generate a
  10.109 -random MAC address from the range 00:16:3E:xx:xx:xx, assigned by IEEE to
  10.110 -XenSource as an OUI (organizationally unique identifier).  XenSource
  10.111 -Inc. gives permission for anyone to use addresses randomly allocated
  10.112 -from this range for use by their Xen domains.
  10.113 -
  10.114 -For a list of IEEE OUI assignments, see \newline
  10.115 -{\tt http://standards.ieee.org/regauth/oui/oui.txt}.
  10.116 -
  10.117 -
  10.118 -\subsection{Booting the Guest Domain}
  10.119 -
  10.120 -The \path{xm} tool provides a variety of commands for managing
  10.121 -domains.  Use the \path{create} command to start new domains. Assuming
  10.122 -you've created a configuration file \path{myvmconf} based around
  10.123 -\path{/etc/xen/xmexample2}, to start a domain with virtual machine
  10.124 -ID~1 you should type:
  10.125 -
  10.126 -\begin{quote}
  10.127 -\begin{verbatim}
  10.128 -# xm create -c myvmconf vmid=1
  10.129 -\end{verbatim}
  10.130 -\end{quote}
  10.131 -
  10.132 -The \path{-c} switch causes \path{xm} to turn into the domain's
  10.133 -console after creation.  The \path{vmid=1} sets the \path{vmid}
  10.134 -variable used in the \path{myvmconf} file.
  10.135 -
  10.136 -You should see the console boot messages from the new domain appearing
  10.137 -in the terminal in which you typed the command, culminating in a login
  10.138 -prompt.
  10.139 -
  10.140 -\subsection{Example: ttylinux}
  10.141 -
  10.142 -Ttylinux is a very small Linux distribution, designed to require very
  10.143 -few resources.  We will use it as a concrete example of how to start a
  10.144 -Xen domain.  Most users will probably want to install a full-featured
  10.145 -distribution once they have mastered the basics\footnote{ttylinux is
  10.146 -  the distribution's home page: {\tt
  10.147 -    http://www.minimalinux.org/ttylinux/}}.
  10.148 -
  10.149 -\begin{enumerate}
  10.150 -\item Download and extract the ttylinux disk image from the Files
  10.151 -  section of the project's SourceForge site (see
  10.152 -  \path{http://sf.net/projects/xen/}).
  10.153 -\item Create a configuration file like the following:
  10.154 -  \begin{quote}
  10.155 -\begin{verbatim}
  10.156 -kernel = "/boot/vmlinuz-2.6-xenU"
  10.157 -memory = 64
  10.158 -name = "ttylinux"
  10.159 -nics = 1
  10.160 -ip = "1.2.3.4"
  10.161 -disk = ['file:/path/to/ttylinux/rootfs,sda1,w']
  10.162 -root = "/dev/sda1 ro"
  10.163 -\end{verbatim}    
  10.164 -  \end{quote}
  10.165 -\item Now start the domain and connect to its console:
  10.166 -  \begin{quote}
  10.167 -\begin{verbatim}
  10.168 -xm create configfile -c
  10.169 -\end{verbatim}
  10.170 -  \end{quote}
  10.171 -\item Login as root, password root.
  10.172 -\end{enumerate}
  10.173 -
    11.1 --- a/docs/src/user/building_xen.tex	Sun Dec 04 20:12:00 2005 +0100
    11.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    11.3 @@ -1,3 +0,0 @@
    11.4 -\chapter{Building Xen}
    11.5 -
    11.6 -Placeholder.
    12.1 --- a/docs/src/user/console_management.tex	Sun Dec 04 20:12:00 2005 +0100
    12.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    12.3 @@ -1,3 +0,0 @@
    12.4 -\chapter{Console Management}
    12.5 -
    12.6 -Placeholder.
    13.1 --- a/docs/src/user/control_software.tex	Sun Dec 04 20:12:00 2005 +0100
    13.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    13.3 @@ -1,24 +0,0 @@
    13.4 -\chapter{Control Software} 
    13.5 -
    13.6 -\section{Xensv (web control interface)}
    13.7 -\label{s:xensv}
    13.8 -
    13.9 -Xensv is the experimental web control interface for managing a Xen
   13.10 -machine. It can be used to perform some (but not yet all) of the
   13.11 -management tasks that can be done using the xm tool.
   13.12 -
   13.13 -It can be started using:
   13.14 -\begin{quote}
   13.15 -  \verb_# xensv start_
   13.16 -\end{quote}
   13.17 -and stopped using:
   13.18 -\begin{quote}
   13.19 -  \verb_# xensv stop_
   13.20 -\end{quote}
   13.21 -
   13.22 -By default, Xensv will serve out the web interface on port 8080. This
   13.23 -can be changed by editing
   13.24 -\path{/usr/lib/python2.3/site-packages/xen/sv/params.py}.
   13.25 -
   13.26 -Once Xensv is running, the web interface can be used to create and
   13.27 -manage running domains.
    14.1 --- a/docs/src/user/cpu_management.tex	Sun Dec 04 20:12:00 2005 +0100
    14.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    14.3 @@ -1,44 +0,0 @@
    14.4 -\chapter{CPU Management}
    14.5 -
    14.6 -Placeholder.
    14.7 -%% KMS Something sage about CPU / processor management.
    14.8 -
    14.9 -Xen allows a domain's virtual CPU(s) to be associated with one or more
   14.10 -host CPUs.  This can be used to allocate real resources among one or
   14.11 -more guests, or to make optimal use of processor resources when
   14.12 -utilizing dual-core, hyperthreading, or other advanced CPU technologies.
   14.13 -
   14.14 -Xen enumerates physical CPUs in a `depth first' fashion.  For a system
   14.15 -with both hyperthreading and multiple cores, this would be all the
   14.16 -hyperthreads on a given core, then all the cores on a given socket,
   14.17 -and then all sockets.  I.e.  if you had a two socket, dual core,
   14.18 -hyperthreaded Xeon the CPU order would be:
   14.19 -
   14.20 -
   14.21 -\begin{center}
   14.22 -\begin{tabular}{|l|l|l|l|l|l|l|r|}
   14.23 -\multicolumn{4}{c|}{socket0}     &  \multicolumn{4}{c|}{socket1} \\ \hline
   14.24 -\multicolumn{2}{c|}{core0}  &  \multicolumn{2}{c|}{core1}  &
   14.25 -\multicolumn{2}{c|}{core0}  &  \multicolumn{2}{c|}{core1} \\ \hline
   14.26 -ht0 & ht1 & ht0 & ht1 & ht0 & ht1 & ht0 & ht1 \\
   14.27 -\#0 & \#1 & \#2 & \#3 & \#4 & \#5 & \#6 & \#7 \\
   14.28 -\end{tabular}
   14.29 -\end{center}
   14.30 -
   14.31 -
   14.32 -Having multiple vcpus belonging to the same domain mapped to the same
   14.33 -physical CPU is very likely to lead to poor performance. It's better to
   14.34 -use `vcpus-set' to hot-unplug one of the vcpus and ensure the others are
   14.35 -pinned on different CPUs.
   14.36 -
   14.37 -If you are running IO intensive tasks, its typically better to dedicate
   14.38 -either a hyperthread or whole core to running domain 0, and hence pin
   14.39 -other domains so that they can't use CPU 0. If your workload is mostly
   14.40 -compute intensive, you may want to pin vcpus such that all physical CPU
   14.41 -threads are available for guest domains.
   14.42 -
   14.43 -
   14.44 -\section{Setting CPU Pinning}
   14.45 -
   14.46 -FIXME:  To specify a domain's CPU pinning use the XXX command/syntax in
   14.47 -XXX.
    15.1 --- a/docs/src/user/debian.tex	Sun Dec 04 20:12:00 2005 +0100
    15.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    15.3 @@ -1,173 +0,0 @@
    15.4 -\chapter{Installing Xen/XenLinux on Debian}
    15.5 -
    15.6 -This appendix describes installing Xen 3.0 on Debian Linux.
    15.7 -
    15.8 -Xen can be installed on Debian GNU/Linux using the following methods:
    15.9 -
   15.10 -\begin{itemize}
   15.11 -\item From a binary tarball
   15.12 -\item From source 
   15.13 -\item From debs
   15.14 -\end{itemize}
   15.15 -
   15.16 -\section{Installing from a binary tarball}
   15.17 -This section describes the process of installing Xen on Debian Sarge using the stable binary release tarball.
   15.18 -
   15.19 -\subsection{Required Packages}
   15.20 -Install these Debian packages:
   15.21 -
   15.22 -\begin{itemize}
   15.23 -\item bridge-utils
   15.24 -\item libcurl3-dev
   15.25 -\item iproute
   15.26 -\item zlib1g-dev
   15.27 -\item python-dev
   15.28 -\end{itemize}
   15.29 -
   15.30 -\begin{verbatim}
   15.31 -apt-get install bridge-utils   libcurl3-dev iproute  zlib1g-dev python-dev
   15.32 -\end{verbatim}
   15.33 -
   15.34 -
   15.35 -\subsection{Download the binary tarball}
   15.36 -Download the Xen 3.0 binary tarball from the XenSource downloads
   15.37 -page:
   15.38 -
   15.39 -\begin{quote} {\tt http://www.xensource.com/downloads/}
   15.40 -\end{quote}
   15.41 - 
   15.42 -\subsection{Extract and Install}
   15.43 -\begin{verbatim}
   15.44 -#  tar zxvf 
   15.45 -xen-2.0.7-install-x86_32.tgz
   15.46 -# cd xen-2.0.7-install-x86_32.tgz
   15.47 -# ./install.sh
   15.48 -\end{verbatim}
   15.49 -
   15.50 -If everything goes well, you should something like
   15.51 -
   15.52 -\begin{verbatim}
   15.53 -Installing Xen from 
   15.54 -'./install' to '/'...
   15.55 -    All done.
   15.56 -    Checking to see whether prerequisite tools are installed...
   15.57 -    All done.
   15.58 -\end{verbatim}
   15.59 -
   15.60 -
   15.61 -\subsection{Configure grub}
   15.62 -Make an entry in your grub configuration like below.
   15.63 -
   15.64 -{\small
   15.65 -\begin{verbatim}
   15.66 -title          Xen on Debian
   15.67 -kernel         (hd0,5)/boot/xen.gz dom0_mem=131000
   15.68 -module         (hd0,5)/boot/vmlinuz-2.6-xen0 root=/dex/hda6 ro console=tty0
   15.69 -\end{verbatim}
   15.70 -}
   15.71 -
   15.72 -You can now boot into Xen by by choosing the right option from grub menu.
   15.73 -
   15.74 -\section{Installing from source}
   15.75 -\subsection{Required Packages}
   15.76 -Besides packages mentioned under binary tarball install, you will need:
   15.77 -
   15.78 -\begin{itemize}
   15.79 -\item gcc v3.2.x or v3.3.x
   15.80 -\item binutils
   15.81 -\item GNU make
   15.82 -\end{itemize}
   15.83 -
   15.84 -
   15.85 -\subsection{Download the source tree}
   15.86 -The Xen source tree is available as either a compressed source tarball
   15.87 -or as a clone of our master Mercurial repository.
   15.88 -
   15.89 -\begin{description}
   15.90 -\item[Obtaining the Source Tarball]\mbox{} \\
   15.91 -  Stable versions and daily snapshots of the Xen source tree are
   15.92 -  available from the Xen download page:
   15.93 -  \begin{quote} {\tt http://www.xensource.com/downloads/}
   15.94 -  \end{quote}
   15.95 -\item[Obtaining the source via Mercurial]\mbox{} \\
   15.96 -  The source tree may also be obtained via the public Mercurial
   15.97 -  repository hosted at:
   15.98 -  \begin{quote}{\tt http://xenbits.xensource.com}.
   15.99 -  \end{quote} See the instructions and the Getting Started Guide
  15.100 -  referenced at:
  15.101 -  \begin{quote}
  15.102 -    {\tt http://www.xensource.com/downloads/}.
  15.103 -  \end{quote}
  15.104 -\end{description}
  15.105 -
  15.106 -\subsection{Extract, build and install}
  15.107 -
  15.108 -\begin{verbatim}
  15.109 -# tar zxvf xen-3.0.0-src.tgz
  15.110 -# cd xen-3.0
  15.111 -# make dist
  15.112 -#./install.sh
  15.113 -\end{verbatim}
  15.114 -
  15.115 -\section{Installing from debs}
  15.116 -This section describes the process of installing Xen on Debian Sarge using debs created by Edward Despard.
  15.117 -
  15.118 -\subsection{Edward's announcement to xen-user list}
  15.119 -"For part of my Google Summer of Code work I've put together debs for xen of 2.0.7 and of unstable. The unstable debs are built off of yesterday's hg tree, but I try to update them fairly regularly when new developments occur." 
  15.120 -
  15.121 -\subsection{Adding apt source}
  15.122 -Add the following lines to \path{/etc/apt/sources.list}:
  15.123 -
  15.124 -\begin{quote}
  15.125 -{\small
  15.126 -\begin{verbatim}
  15.127 -deb http://tinyurl.com/8tpup
  15.128 -\end{verbatim}
  15.129 -}
  15.130 -\end{quote}
  15.131 -   
  15.132 -Note: On Ubuntu, simple replace debian with ubuntu in the above. Replace xen-unstable with with xen-stable for a stable version.
  15.133 -
  15.134 -Now run \path{aptitude update} or \path{apt-get update}. Doing \path{apt-cache search xen}, you should see following packages in the output.
  15.135 -
  15.136 -\begin{itemize}
  15.137 -\item kernel-image-2.6.12-xen0 - Xen 2.6 kernel image
  15.138 -\item kernel-image-2.6.12-xenu - Xen 2.6 kernel image
  15.139 -\item kernel-patch-xen-2.6.12 - patch to kernel to support xen
  15.140 -\item libxen3.0 - control libraries for Xen
  15.141 -\item libxen-dev - development libraries for Xen
  15.142 -\item xen-doc - documentation for Xen
  15.143 -\item xen-hypervisor - Xen hypervisor kernel
  15.144 -\item xen-kernels - Xen kernels
  15.145 -\item xen - Package to install all of Xen
  15.146 -\item xen-tools - Tools for managing xen domains
  15.147 -\end{itemize}
  15.148 -
  15.149 -\subsection{Installing Xen}
  15.150 -You can now install xen using \path{apt-get}, \path{aptitude}, \path{synaptic}, etc. 
  15.151 - 
  15.152 -After doing \path{apt-get install xen}, you will have a working dom0 and should be able boot into it without any problem. By doing \path{apt-cache depends xen}, you will find that the following packages were also installed as a result of dependency.
  15.153 -
  15.154 -\begin{verbatim}
  15.155 -#  apt-cache  
  15.156 -depends  xen
  15.157 -       xen
  15.158 -           Depends: xen-doc
  15.159 -           Depends: xen-kernels
  15.160 -           Depends: xen-hypervisor
  15.161 -           Depends: xen-tools
  15.162 -\end{verbatim}
  15.163 -
  15.164 -
  15.165 -\subsection{xenkernels.conf}
   15.166 -To automate the creation of a GRUB entry for Xen, \path{/etc/xenkernels.conf} is used; this file is installed along with the package. Below is a sample entry:
  15.167 -
  15.168 -\begin{verbatim}
  15.169 -label=Xen(3.0-unstable082205)/Linux(2.6.12)--
  15.170 -          xen=/boot/xen-3.0-unstable082205.gz
  15.171 -          kernel=/boot/xen/dom0/vmlinuz-2.6.12-xen0
  15.172 -          mem=256000
  15.173 -          root=/dev/hda4
  15.174 -\end{verbatim}
  15.175 -
   15.176 -You have to run \path{update-grub-xen} every time \path{xenkernels.conf} is modified. Read \path{man update-grub-xen} for more information.
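For example, after editing \path{xenkernels.conf}:

\begin{verbatim}
# update-grub-xen
\end{verbatim}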
    16.1 --- a/docs/src/user/debugging.tex	Sun Dec 04 20:12:00 2005 +0100
    16.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    16.3 @@ -1,18 +0,0 @@
    16.4 -\chapter{Debugging}
    16.5 -
    16.6 -Xen has a set of debugging features that can be useful to try and figure
    16.7 -out what's going on. Hit ``h'' on the serial line (if you specified a baud
    16.8 -rate on the Xen command line) or ScrollLock-h on the keyboard to get a
    16.9 -list of supported commands.
   16.10 -
   16.11 -If you have a crash you'll likely get a crash dump containing an EIP
   16.12 -(PC) which, along with an \path{objdump -d image}, can be useful in
   16.13 -figuring out what's happened. Debug a Xenlinux image just as you would
   16.14 -any other Linux kernel.
   16.15 -
   16.16 -%% We supply a handy debug terminal program which you can find in
   16.17 -%% \path{/usr/local/src/xen-2.0.bk/tools/misc/miniterm/} This should
   16.18 -%% be built and executed on another machine that is connected via a
   16.19 -%% null modem cable. Documentation is included.  Alternatively, if the
   16.20 -%% Xen machine is connected to a serial-port server then we supply a
   16.21 -%% dumb TCP terminal client, {\tt xencons}.
    17.1 --- a/docs/src/user/dom0_installation.tex	Sun Dec 04 20:12:00 2005 +0100
    17.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    17.3 @@ -1,3 +0,0 @@
    17.4 -\chapter{dom0 Installation}
    17.5 -
    17.6 -Placeholder.
    18.1 --- a/docs/src/user/domU_installation.tex	Sun Dec 04 20:12:00 2005 +0100
    18.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    18.3 @@ -1,3 +0,0 @@
    18.4 -\chapter{domU Installation}
    18.5 -
    18.6 -Placeholder.
    19.1 --- a/docs/src/user/domain_configuration.tex	Sun Dec 04 20:12:00 2005 +0100
    19.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    19.3 @@ -1,281 +0,0 @@
    19.4 -\chapter{Domain Configuration}
    19.5 -\label{cha:config}
    19.6 -
    19.7 -The following contains the syntax of the domain configuration files
    19.8 -and description of how to further specify networking, driver domain
    19.9 -and general scheduling behavior.
   19.10 -
   19.11 -
   19.12 -\section{Configuration Files}
   19.13 -\label{s:cfiles}
   19.14 -
   19.15 -Xen configuration files contain the following standard variables.
   19.16 -Unless otherwise stated, configuration items should be enclosed in
   19.17 -quotes: see \path{/etc/xen/xmexample1} and \path{/etc/xen/xmexample2}
   19.18 -for concrete examples of the syntax.
   19.19 -
   19.20 -\begin{description}
   19.21 -\item[kernel] Path to the kernel image.
   19.22 -\item[ramdisk] Path to a ramdisk image (optional).
   19.23 -  % \item[builder] The name of the domain build function (e.g.
   19.24 -  %   {\tt'linux'} or {\tt'netbsd'}.
   19.25 -\item[memory] Memory size in megabytes.
   19.26 -\item[cpu] CPU to run this domain on, or {\tt -1} for auto-allocation.
   19.27 -\item[console] Port to export the domain console on (default 9600 +
   19.28 -  domain ID).
   19.29 -\item[nics] Number of virtual network interfaces.
   19.30 -\item[vif] List of MAC addresses (random addresses are assigned if not
   19.31 -  given) and bridges to use for the domain's network interfaces, e.g.\ 
   19.32 -\begin{verbatim}
   19.33 -vif = [ 'mac=aa:00:00:00:00:11, bridge=xen-br0',
   19.34 -        'bridge=xen-br1' ]
   19.35 -\end{verbatim}
   19.36 -  to assign a MAC address and bridge to the first interface and assign
   19.37 -  a different bridge to the second interface, leaving \xend\ to choose
   19.38 -  the MAC address.
   19.39 -\item[disk] List of block devices to export to the domain, e.g.\ \\
   19.40 -  \verb_disk = [ 'phy:hda1,sda1,r' ]_ \\
   19.41 -  exports physical device \path{/dev/hda1} to the domain as
   19.42 -  \path{/dev/sda1} with read-only access. Exporting a disk read-write
   19.43 -  which is currently mounted is dangerous -- if you are \emph{certain}
   19.44 -  you wish to do this, you can specify \path{w!} as the mode.
   19.45 -\item[dhcp] Set to {\tt `dhcp'} if you want to use DHCP to configure
   19.46 -  networking.
   19.47 -\item[netmask] Manually configured IP netmask.
   19.48 -\item[gateway] Manually configured IP gateway.
   19.49 -\item[hostname] Set the hostname for the virtual machine.
   19.50 -\item[root] Specify the root device parameter on the kernel command
   19.51 -  line.
   19.52 -\item[nfs\_server] IP address for the NFS server (if any).
   19.53 -\item[nfs\_root] Path of the root filesystem on the NFS server (if
   19.54 -  any).
   19.55 -\item[extra] Extra string to append to the kernel command line (if
   19.56 -  any)
   19.57 -\item[restart] Three possible options:
   19.58 -  \begin{description}
   19.59 -  \item[always] Always restart the domain, no matter what its exit
   19.60 -    code is.
   19.61 -  \item[never] Never restart the domain.
   19.62 -  \item[onreboot] Restart the domain iff it requests reboot.
   19.63 -  \end{description}
   19.64 -\end{description}
   19.65 -
   19.66 -For additional flexibility, it is also possible to include Python
   19.67 -scripting commands in configuration files.  An example of this is the
   19.68 -\path{xmexample2} file, which uses Python code to handle the
   19.69 -\path{vmid} variable.
   19.70 -
   19.71 -
   19.72 -%\part{Advanced Topics}
   19.73 -
   19.74 -
   19.75 -\section{Network Configuration}
   19.76 -
   19.77 -For many users, the default installation should work ``out of the
   19.78 -box''.  More complicated network setups, for instance with multiple
   19.79 -Ethernet interfaces and/or existing bridging setups will require some
   19.80 -special configuration.
   19.81 -
   19.82 -The purpose of this section is to describe the mechanisms provided by
   19.83 -\xend\ to allow a flexible configuration for Xen's virtual networking.
   19.84 -
   19.85 -\subsection{Xen virtual network topology}
   19.86 -
   19.87 -Each domain network interface is connected to a virtual network
   19.88 -interface in dom0 by a point to point link (effectively a ``virtual
   19.89 -crossover cable'').  These devices are named {\tt
   19.90 -  vif$<$domid$>$.$<$vifid$>$} (e.g.\ {\tt vif1.0} for the first
   19.91 -interface in domain~1, {\tt vif3.1} for the second interface in
   19.92 -domain~3).
   19.93 -
   19.94 -Traffic on these virtual interfaces is handled in domain~0 using
   19.95 -standard Linux mechanisms for bridging, routing, rate limiting, etc.
   19.96 -Xend calls on two shell scripts to perform initial configuration of
   19.97 -the network and configuration of new virtual interfaces.  By default,
   19.98 -these scripts configure a single bridge for all the virtual
   19.99 -interfaces.  Arbitrary routing / bridging configurations can be
  19.100 -configured by customizing the scripts, as described in the following
  19.101 -section.
  19.102 -
  19.103 -\subsection{Xen networking scripts}
  19.104 -
  19.105 -Xen's virtual networking is configured by two shell scripts (by
  19.106 -default \path{network} and \path{vif-bridge}).  These are called
  19.107 -automatically by \xend\ when certain events occur, with arguments to
  19.108 -the scripts providing further contextual information.  These scripts
  19.109 -are found by default in \path{/etc/xen/scripts}.  The names and
  19.110 -locations of the scripts can be configured in
  19.111 -\path{/etc/xen/xend-config.sxp}.
  19.112 -
  19.113 -\begin{description}
  19.114 -\item[network:] This script is called whenever \xend\ is started or
  19.115 -  stopped to respectively initialize or tear down the Xen virtual
  19.116 -  network. In the default configuration initialization creates the
  19.117 -  bridge `xen-br0' and moves eth0 onto that bridge, modifying the
  19.118 -  routing accordingly. When \xend\ exits, it deletes the Xen bridge
  19.119 -  and removes eth0, restoring the normal IP and routing configuration.
  19.120 -
  19.121 -  %% In configurations where the bridge already exists, this script
  19.122 -  %% could be replaced with a link to \path{/bin/true} (for instance).
  19.123 -
  19.124 -\item[vif-bridge:] This script is called for every domain virtual
  19.125 -  interface and can configure firewalling rules and add the vif to the
  19.126 -  appropriate bridge. By default, this adds and removes VIFs on the
  19.127 -  default Xen bridge.
  19.128 -\end{description}
  19.129 -
   19.130 -For more complex network setups (e.g.\ where routing is required, or
   19.131 -integration with existing bridges is needed) these scripts may be
   19.132 -replaced with customized variants for your site's preferred configuration.
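For example, a site using its own routed setup could point \xend\ at
replacement scripts via \path{/etc/xen/xend-config.sxp}.  The snippet below
is only a sketch: it assumes the {\tt network-script} and {\tt vif-script}
directives, and the script names are hypothetical.

\begin{verbatim}
(network-script my-network-routed)
(vif-script    my-vif-routed)
\end{verbatim}

Both scripts would live in \path{/etc/xen/scripts} alongside the defaults.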
  19.133 -
  19.134 -%% There are two possible types of privileges: IO privileges and
  19.135 -%% administration privileges.
  19.136 -
  19.137 -
  19.138 -\section{Driver Domain Configuration}
  19.139 -
  19.140 -I/O privileges can be assigned to allow a domain to directly access
  19.141 -PCI devices itself.  This is used to support driver domains.
  19.142 -
  19.143 -Setting back-end privileges is currently only supported in SXP format
  19.144 -config files.  To allow a domain to function as a back-end for others,
  19.145 -somewhere within the {\tt vm} element of its configuration file must
  19.146 -be a {\tt back-end} element of the form {\tt (back-end ({\em type}))}
  19.147 -where {\tt \em type} may be either {\tt netif} or {\tt blkif},
  19.148 -according to the type of virtual device this domain will service.
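For example (a minimal sketch instantiating the form above), a domain that
will serve virtual network interfaces to other domains could include within
its {\tt vm} element:

\begin{verbatim}
(vm
  ...
  (back-end (netif))
  ...
)
\end{verbatim}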
  19.149 -%% After this domain has been built, \xend will connect all new and
  19.150 -%% existing {\em virtual} devices (of the appropriate type) to that
  19.151 -%% back-end.
  19.152 -
  19.153 -Note that a block back-end cannot currently import virtual block
  19.154 -devices from other domains, and a network back-end cannot import
  19.155 -virtual network devices from other domains.  Thus (particularly in the
  19.156 -case of block back-ends, which cannot import a virtual block device as
  19.157 -their root filesystem), you may need to boot a back-end domain from a
  19.158 -ramdisk or a network device.
  19.159 -
  19.160 -Access to PCI devices may be configured on a per-device basis.  Xen
  19.161 -will assign the minimal set of hardware privileges to a domain that
  19.162 -are required to control its devices.  This can be configured in either
  19.163 -format of configuration file:
  19.164 -
  19.165 -\begin{itemize}
  19.166 -\item SXP Format: Include device elements of the form: \\
  19.167 -  \centerline{  {\tt (device (pci (bus {\em x}) (dev {\em y}) (func {\em z})))}} \\
  19.168 -  inside the top-level {\tt vm} element.  Each one specifies the
  19.169 -  address of a device this domain is allowed to access --- the numbers
  19.170 -  \emph{x},\emph{y} and \emph{z} may be in either decimal or
  19.171 -  hexadecimal format.
  19.172 -\item Flat Format: Include a list of PCI device addresses of the
  19.173 -  format: \\
  19.174 -  \centerline{{\tt pci = ['x,y,z', \ldots]}} \\
  19.175 -  where each element in the list is a string specifying the components
  19.176 -  of the PCI device address, separated by commas.  The components
  19.177 -  ({\tt \em x}, {\tt \em y} and {\tt \em z}) of the list may be
  19.178 -  formatted as either decimal or hexadecimal.
  19.179 -\end{itemize}
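For instance, to allow a domain to drive the (illustrative) device at bus~0,
device~4, function~0, the SXP form would be:

\begin{verbatim}
(device (pci (bus 0) (dev 4) (func 0)))
\end{verbatim}

and the equivalent flat-format entry would be:

\begin{verbatim}
pci = [ '0,4,0' ]
\end{verbatim}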
  19.180 -
  19.181 -%% \section{Administration Domains}
  19.182 -
  19.183 -%% Administration privileges allow a domain to use the `dom0
  19.184 -%% operations' (so called because they are usually available only to
  19.185 -%% domain 0).  A privileged domain can build other domains, set
  19.186 -%% scheduling parameters, etc.
  19.187 -
  19.188 -% Support for other administrative domains is not yet available...
  19.189 -% perhaps we should plumb it in some time
  19.190 -
  19.191 -
  19.192 -\section{Scheduler Configuration}
  19.193 -\label{s:sched}
  19.194 -
  19.195 -Xen offers a boot time choice between multiple schedulers.  To select
  19.196 -a scheduler, pass the boot parameter \emph{sched=sched\_name} to Xen,
  19.197 -substituting the appropriate scheduler name.  Details of the
  19.198 -schedulers and their parameters are included below; future versions of
   19.199 -the tools will provide a higher-level interface for configuring them.
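For example, to boot with the Atropos scheduler you might use a GRUB entry
whose Xen line reads (memory size illustrative):

\begin{verbatim}
kernel /boot/xen.gz dom0_mem=131072 sched=atropos
\end{verbatim}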
  19.200 -
  19.201 -It is expected that system administrators configure their system to
  19.202 -use the scheduler most appropriate to their needs.  Currently, the BVT
  19.203 -scheduler is the recommended choice.
  19.204 -
  19.205 -\subsection{Borrowed Virtual Time}
  19.206 -
  19.207 -{\tt sched=bvt} (the default) \\
  19.208 -
  19.209 -BVT provides proportional fair shares of the CPU time.  It has been
  19.210 -observed to penalize domains that block frequently (e.g.\ I/O
  19.211 -intensive domains), but this can be compensated for by using warping.
  19.212 -
  19.213 -\subsubsection{Global Parameters}
  19.214 -
  19.215 -\begin{description}
  19.216 -\item[ctx\_allow] The context switch allowance is similar to the
  19.217 -  ``quantum'' in traditional schedulers.  It is the minimum time that
  19.218 -  a scheduled domain will be allowed to run before being preempted.
  19.219 -\end{description}
  19.220 -
  19.221 -\subsubsection{Per-domain parameters}
  19.222 -
  19.223 -\begin{description}
  19.224 -\item[mcuadv] The MCU (Minimum Charging Unit) advance determines the
  19.225 -  proportional share of the CPU that a domain receives.  It is set
   19.226 -  in inverse proportion to a domain's sharing weight.
  19.227 -\item[warp] The amount of ``virtual time'' the domain is allowed to
  19.228 -  warp backwards.
  19.229 -\item[warpl] The warp limit is the maximum time a domain can run
  19.230 -  warped for.
  19.231 -\item[warpu] The unwarp requirement is the minimum time a domain must
  19.232 -  run unwarped for before it can warp again.
  19.233 -\end{description}
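These per-domain parameters can be adjusted at runtime with the
\path{xm bvt} command.  The argument list shown below is an assumption ---
consult \path{xm help bvt} for the authoritative syntax:

\begin{verbatim}
# xm help bvt
# xm bvt DOMID MCUADV WARP WARPL WARPU
\end{verbatim}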
  19.234 -
  19.235 -\subsection{Atropos}
  19.236 -
  19.237 -{\tt sched=atropos} \\
  19.238 -
  19.239 -Atropos is a soft real time scheduler.  It provides guarantees about
  19.240 -absolute shares of the CPU, with a facility for sharing slack CPU time
  19.241 -on a best-effort basis. It can provide timeliness guarantees for
  19.242 -latency-sensitive domains.
  19.243 -
  19.244 -Every domain has an associated period and slice.  The domain should
  19.245 -receive `slice' nanoseconds every `period' nanoseconds.  This allows
  19.246 -the administrator to configure both the absolute share of the CPU a
  19.247 -domain receives and the frequency with which it is scheduled.
  19.248 -
  19.249 -%% When domains unblock, their period is reduced to the value of the
  19.250 -%% latency hint (the slice is scaled accordingly so that they still
  19.251 -%% get the same proportion of the CPU).  For each subsequent period,
  19.252 -%% the slice and period times are doubled until they reach their
  19.253 -%% original values.
  19.254 -
  19.255 -Note: don't over-commit the CPU when using Atropos (i.e.\ don't reserve
  19.256 -more CPU than is available --- the utilization should be kept to
  19.257 -slightly less than 100\% in order to ensure predictable behavior).
  19.258 -
  19.259 -\subsubsection{Per-domain parameters}
  19.260 -
  19.261 -\begin{description}
  19.262 -\item[period] The regular time interval during which a domain is
  19.263 -  guaranteed to receive its allocation of CPU time.
  19.264 -\item[slice] The length of time per period that a domain is guaranteed
  19.265 -  to run for (in the absence of voluntary yielding of the CPU).
  19.266 -\item[latency] The latency hint is used to control how soon after
  19.267 -  waking up a domain it should be scheduled.
  19.268 -\item[xtratime] This is a boolean flag that specifies whether a domain
  19.269 -  should be allowed a share of the system slack time.
  19.270 -\end{description}
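Similarly, Atropos parameters are set at runtime with \path{xm atropos}; the
parameter order shown below is an assumption, so check
\path{xm help atropos}:

\begin{verbatim}
# xm help atropos
# xm atropos DOMID PERIOD SLICE LATENCY XTRATIME
\end{verbatim}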
  19.271 -
  19.272 -\subsection{Round Robin}
  19.273 -
  19.274 -{\tt sched=rrobin} \\
  19.275 -
  19.276 -The round robin scheduler is included as a simple demonstration of
  19.277 -Xen's internal scheduler API.  It is not intended for production use.
  19.278 -
  19.279 -\subsubsection{Global Parameters}
  19.280 -
  19.281 -\begin{description}
  19.282 -\item[rr\_slice] The maximum time each domain runs before the next
  19.283 -  scheduling decision is made.
  19.284 -\end{description}
    20.1 --- a/docs/src/user/domain_filesystem.tex	Sun Dec 04 20:12:00 2005 +0100
    20.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    20.3 @@ -1,251 +0,0 @@
    20.4 -\chapter{Storage and File System Management}
    20.5 -
    20.6 -Storage can be made available to virtual machines in a number of
    20.7 -different ways.  This chapter covers some possible configurations.
    20.8 -
    20.9 -The most straightforward method is to export a physical block device (a
   20.10 -hard drive or partition) from dom0 directly to the guest domain as a
   20.11 -virtual block device (VBD).
   20.12 -
   20.13 -Storage may also be exported from a filesystem image or a partitioned
   20.14 -filesystem image as a \emph{file-backed VBD}.
   20.15 -
   20.16 -Finally, standard network storage protocols such as NBD, iSCSI, NFS,
   20.17 -etc., can be used to provide storage to virtual machines.
   20.18 -
   20.19 -
   20.20 -\section{Exporting Physical Devices as VBDs}
   20.21 -\label{s:exporting-physical-devices-as-vbds}
   20.22 -
   20.23 -One of the simplest configurations is to directly export individual
   20.24 -partitions from domain~0 to other domains. To achieve this use the
   20.25 -\path{phy:} specifier in your domain configuration file. For example a
   20.26 -line like
   20.27 -\begin{quote}
   20.28 -  \verb_disk = ['phy:hda3,sda1,w']_
   20.29 -\end{quote}
   20.30 -specifies that the partition \path{/dev/hda3} in domain~0 should be
   20.31 -exported read-write to the new domain as \path{/dev/sda1}; one could
   20.32 -equally well export it as \path{/dev/hda} or \path{/dev/sdb5} should
   20.33 -one wish.
   20.34 -
   20.35 -In addition to local disks and partitions, it is possible to export
   20.36 -any device that Linux considers to be ``a disk'' in the same manner.
   20.37 -For example, if you have iSCSI disks or GNBD volumes imported into
   20.38 -domain~0 you can export these to other domains using the \path{phy:}
   20.39 -disk syntax. E.g.:
   20.40 -\begin{quote}
   20.41 -  \verb_disk = ['phy:vg/lvm1,sda2,w']_
   20.42 -\end{quote}
   20.43 -
   20.44 -\begin{center}
   20.45 -  \framebox{\bf Warning: Block device sharing}
   20.46 -\end{center}
   20.47 -\begin{quote}
   20.48 -  Block devices should typically only be shared between domains in a
   20.49 -  read-only fashion otherwise the Linux kernel's file systems will get
   20.50 -  very confused as the file system structure may change underneath
   20.51 -  them (having the same ext3 partition mounted \path{rw} twice is a
   20.52 -  sure fire way to cause irreparable damage)!  \Xend\ will attempt to
   20.53 -  prevent you from doing this by checking that the device is not
   20.54 -  mounted read-write in domain~0, and hasn't already been exported
   20.55 -  read-write to another domain.  If you want read-write sharing,
   20.56 -  export the directory to other domains via NFS from domain~0 (or use
   20.57 -  a cluster file system such as GFS or ocfs2).
   20.58 -\end{quote}
   20.59 -
   20.60 -
   20.61 -\section{Using File-backed VBDs}
   20.62 -
   20.63 -It is also possible to use a file in Domain~0 as the primary storage
   20.64 -for a virtual machine.  As well as being convenient, this also has the
   20.65 -advantage that the virtual block device will be \emph{sparse} ---
   20.66 -space will only really be allocated as parts of the file are used.  So
   20.67 -if a virtual machine uses only half of its disk space then the file
   20.68 -really only takes up half of the space allocated.
   20.69 -
   20.70 -For example, to create a 2GB sparse file-backed virtual block device
   20.71 -(actually only consumes 1KB of disk):
   20.72 -\begin{quote}
   20.73 -  \verb_# dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1_
   20.74 -\end{quote}
   20.75 -
   20.76 -Make a file system in the disk file:
   20.77 -\begin{quote}
   20.78 -  \verb_# mkfs -t ext3 vm1disk_
   20.79 -\end{quote}
   20.80 -
   20.81 -(when the tool asks for confirmation, answer `y')
   20.82 -
   20.83 -Populate the file system e.g.\ by copying from the current root:
   20.84 -\begin{quote}
   20.85 -\begin{verbatim}
   20.86 -# mount -o loop vm1disk /mnt
   20.87 -# cp -ax /{root,dev,var,etc,usr,bin,sbin,lib} /mnt
   20.88 -# mkdir /mnt/{proc,sys,home,tmp}
   20.89 -\end{verbatim}
   20.90 -\end{quote}
   20.91 -
   20.92 -Tailor the file system by editing \path{/etc/fstab},
   20.93 -\path{/etc/hostname}, etc.\ Don't forget to edit the files in the
   20.94 -mounted file system, instead of your domain~0 filesystem, e.g.\ you
   20.95 -would edit \path{/mnt/etc/fstab} instead of \path{/etc/fstab}.  For
   20.96 -this example, set \path{/dev/sda1} as the root device in \path{fstab}.
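For example, the root line in \path{/mnt/etc/fstab} might look like this
(matching the ext3 filesystem created above):

\begin{quote}
\begin{verbatim}
/dev/sda1  /  ext3  defaults  1  1
\end{verbatim}
\end{quote}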
   20.97 -
   20.98 -Now unmount (this is important!):
   20.99 -\begin{quote}
  20.100 -  \verb_# umount /mnt_
  20.101 -\end{quote}
  20.102 -
  20.103 -In the configuration file set:
  20.104 -\begin{quote}
  20.105 -  \verb_disk = ['file:/full/path/to/vm1disk,sda1,w']_
  20.106 -\end{quote}
  20.107 -
  20.108 -As the virtual machine writes to its `disk', the sparse file will be
  20.109 -filled in and consume more space up to the original 2GB.
  20.110 -
  20.111 -{\bf Note that file-backed VBDs may not be appropriate for backing
  20.112 -  I/O-intensive domains.}  File-backed VBDs are known to experience
  20.113 -substantial slowdowns under heavy I/O workloads, due to the I/O
  20.114 -handling by the loopback block device used to support file-backed VBDs
  20.115 -in dom0.  Better I/O performance can be achieved by using either
  20.116 -LVM-backed VBDs (Section~\ref{s:using-lvm-backed-vbds}) or physical
  20.117 -devices as VBDs (Section~\ref{s:exporting-physical-devices-as-vbds}).
  20.118 -
  20.119 -Linux supports a maximum of eight file-backed VBDs across all domains
  20.120 -by default.  This limit can be statically increased by using the
  20.121 -\emph{max\_loop} module parameter if CONFIG\_BLK\_DEV\_LOOP is
  20.122 -compiled as a module in the dom0 kernel, or by using the
  20.123 -\emph{max\_loop=n} boot option if CONFIG\_BLK\_DEV\_LOOP is compiled
  20.124 -directly into the dom0 kernel.
  20.125 -
  20.126 -
  20.127 -\section{Using LVM-backed VBDs}
  20.128 -\label{s:using-lvm-backed-vbds}
  20.129 -
  20.130 -A particularly appealing solution is to use LVM volumes as backing for
  20.131 -domain file-systems since this allows dynamic growing/shrinking of
   20.132 -volumes as well as snapshots and other features.
  20.133 -
  20.134 -To initialize a partition to support LVM volumes:
  20.135 -\begin{quote}
  20.136 -\begin{verbatim}
  20.137 -# pvcreate /dev/sda10           
  20.138 -\end{verbatim} 
  20.139 -\end{quote}
  20.140 -
  20.141 -Create a volume group named `vg' on the physical partition:
  20.142 -\begin{quote}
  20.143 -\begin{verbatim}
  20.144 -# vgcreate vg /dev/sda10
  20.145 -\end{verbatim} 
  20.146 -\end{quote}
  20.147 -
  20.148 -Create a logical volume of size 4GB named `myvmdisk1':
  20.149 -\begin{quote}
  20.150 -\begin{verbatim}
  20.151 -# lvcreate -L4096M -n myvmdisk1 vg
  20.152 -\end{verbatim}
  20.153 -\end{quote}
  20.154 -
  20.155 -You should now see that you have a \path{/dev/vg/myvmdisk1} Make a
  20.156 -filesystem, mount it and populate it, e.g.:
  20.157 -\begin{quote}
  20.158 -\begin{verbatim}
  20.159 -# mkfs -t ext3 /dev/vg/myvmdisk1
  20.160 -# mount /dev/vg/myvmdisk1 /mnt
  20.161 -# cp -ax / /mnt
  20.162 -# umount /mnt
  20.163 -\end{verbatim}
  20.164 -\end{quote}
  20.165 -
  20.166 -Now configure your VM with the following disk configuration:
  20.167 -\begin{quote}
  20.168 -\begin{verbatim}
  20.169 - disk = [ 'phy:vg/myvmdisk1,sda1,w' ]
  20.170 -\end{verbatim}
  20.171 -\end{quote}
  20.172 -
  20.173 -LVM enables you to grow the size of logical volumes, but you'll need
  20.174 -to resize the corresponding file system to make use of the new space.
  20.175 -Some file systems (e.g.\ ext3) now support online resize.  See the LVM
  20.176 -manuals for more details.
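For example, to add 1GB to the volume and then grow an ext3 filesystem on it
(\path{resize2fs} is shown here; some older setups use \path{ext2online} for
online resizing):

\begin{quote}
\begin{verbatim}
# lvextend -L+1024M /dev/vg/myvmdisk1
# resize2fs /dev/vg/myvmdisk1
\end{verbatim}
\end{quote}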
  20.177 -
  20.178 -You can also use LVM for creating copy-on-write (CoW) clones of LVM
  20.179 -volumes (known as writable persistent snapshots in LVM terminology).
  20.180 -This facility is new in Linux 2.6.8, so isn't as stable as one might
  20.181 -hope.  In particular, using lots of CoW LVM disks consumes a lot of
  20.182 -dom0 memory, and error conditions such as running out of disk space
  20.183 -are not handled well. Hopefully this will improve in future.
  20.184 -
   20.185 -To create two copy-on-write clones of the above file system you would
  20.186 -use the following commands:
  20.187 -
  20.188 -\begin{quote}
  20.189 -\begin{verbatim}
  20.190 -# lvcreate -s -L1024M -n myclonedisk1 /dev/vg/myvmdisk1
  20.191 -# lvcreate -s -L1024M -n myclonedisk2 /dev/vg/myvmdisk1
  20.192 -\end{verbatim}
  20.193 -\end{quote}
  20.194 -
  20.195 -Each of these can grow to have 1GB of differences from the master
  20.196 -volume. You can grow the amount of space for storing the differences
  20.197 -using the lvextend command, e.g.:
  20.198 -\begin{quote}
  20.199 -\begin{verbatim}
   20.200 -# lvextend -L+100M /dev/vg/myclonedisk1
  20.201 -\end{verbatim}
  20.202 -\end{quote}
  20.203 -
  20.204 -Don't let the `differences volume' ever fill up otherwise LVM gets
  20.205 -rather confused. It may be possible to automate the growing process by
  20.206 -using \path{dmsetup wait} to spot the volume getting full and then
   20.207 -issuing an \path{lvextend}.
  20.208 -
  20.209 -In principle, it is possible to continue writing to the volume that
  20.210 -has been cloned (the changes will not be visible to the clones), but
  20.211 -we wouldn't recommend this: have the cloned volume as a `pristine'
  20.212 -file system install that isn't mounted directly by any of the virtual
  20.213 -machines.
  20.214 -
  20.215 -
  20.216 -\section{Using NFS Root}
  20.217 -
  20.218 -First, populate a root filesystem in a directory on the server
  20.219 -machine. This can be on a distinct physical machine, or simply run
  20.220 -within a virtual machine on the same node.
  20.221 -
  20.222 -Now configure the NFS server to export this filesystem over the
  20.223 -network by adding a line to \path{/etc/exports}, for instance:
  20.224 -
  20.225 -\begin{quote}
  20.226 -  \begin{small}
  20.227 -\begin{verbatim}
   20.228 -/export/vm1root      1.2.3.4/24(rw,sync,no_root_squash)
  20.229 -\end{verbatim}
  20.230 -  \end{small}
  20.231 -\end{quote}
  20.232 -
  20.233 -Finally, configure the domain to use NFS root.  In addition to the
  20.234 -normal variables, you should make sure to set the following values in
  20.235 -the domain's configuration file:
  20.236 -
  20.237 -\begin{quote}
  20.238 -  \begin{small}
  20.239 -\begin{verbatim}
  20.240 -root       = '/dev/nfs'
  20.241 -nfs_server = '2.3.4.5'       # substitute IP address of server
  20.242 -nfs_root   = '/path/to/root' # path to root FS on the server
  20.243 -\end{verbatim}
  20.244 -  \end{small}
  20.245 -\end{quote}
  20.246 -
  20.247 -The domain will need network access at boot time, so either statically
  20.248 -configure an IP address using the config variables \path{ip},
  20.249 -\path{netmask}, \path{gateway}, \path{hostname}; or enable DHCP
  20.250 -(\path{dhcp='dhcp'}).
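For example, a static configuration might add the following to the domain's
configuration file (addresses illustrative):

\begin{quote}
  \begin{small}
\begin{verbatim}
ip       = '1.2.3.10'
netmask  = '255.255.255.0'
gateway  = '1.2.3.1'
hostname = 'vm1'
\end{verbatim}
  \end{small}
\end{quote}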
  20.251 -
  20.252 -Note that the Linux NFS root implementation is known to have stability
  20.253 -problems under high load (this is not a Xen-specific problem), so this
  20.254 -configuration may not be appropriate for critical servers.
    21.1 --- a/docs/src/user/domain_mgmt.tex	Sun Dec 04 20:12:00 2005 +0100
    21.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    21.3 @@ -1,165 +0,0 @@
    21.4 -\chapter{Domain Management Tools}
    21.5 -
    21.6 -This chapter summarises the tools available to manage running domains.
    21.7 -
    21.8 -
    21.9 -\section{\Xend\ }
   21.10 -\label{s:xend}
   21.11 -
   21.12 -The Xen Daemon (\Xend) (node control daemon) performs system management
   21.13 -functions related to virtual machines. It forms a central point of
   21.14 -control for a machine and can be controlled using an HTTP-based
   21.15 -protocol. \Xend\ must be running in order to start and manage virtual
   21.16 -machines.
   21.17 -
   21.18 -\Xend\ must be run as root because it needs access to privileged system
   21.19 -management functions. A small set of commands may be issued on the
   21.20 -\xend\ command line:
   21.21 -
   21.22 -\begin{tabular}{ll}
   21.23 -  \verb!# xend start! & start \xend, if not already running \\
   21.24 -  \verb!# xend stop!  & stop \xend\ if already running       \\
   21.25 -  \verb!# xend restart! & restart \xend\ if running, otherwise start it \\
   21.26 -  % \verb!# xend trace_start! & start \xend, with very detailed debug logging \\
   21.27 -  \verb!# xend status! & indicates \xend\ status by its return code
   21.28 -\end{tabular}
   21.29 -
   21.30 -A SysV init script called {\tt xend} is provided to start \xend\ at boot
   21.31 -time. {\tt make install} installs this script in \path{/etc/init.d}. To
   21.32 -enable it, you have to make symbolic links in the appropriate runlevel
   21.33 -directories or use the {\tt chkconfig} tool, where available.
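For example, on distributions providing \path{chkconfig} this might be:

\begin{quote}
\begin{verbatim}
# chkconfig --add xend
# chkconfig xend on
\end{verbatim}
\end{quote}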
   21.34 -
   21.35 -Once \xend\ is running, more sophisticated administration can be done
   21.36 -using the xm tool (see Section~\ref{s:xm}) and the experimental Xensv
   21.37 -web interface (see Section~\ref{s:xensv}).
   21.38 -
   21.39 -As \xend\ runs, events will be logged to \path{/var/log/xend.log} and,
   21.40 -if the migration assistant daemon (\path{xfrd}) has been started,
   21.41 -\path{/var/log/xfrd.log}. These may be of use for troubleshooting
   21.42 -problems.
   21.43 -
   21.44 -\section{Xm}
   21.45 -\label{s:xm}
   21.46 -
   21.47 -Command line management tasks are also performed using the \path{xm}
   21.48 -tool. For online help for the commands available, type:
   21.49 -
   21.50 -\begin{quote}
   21.51 -\begin{verbatim}
   21.52 -# xm help
   21.53 -\end{verbatim}
   21.54 -\end{quote}
   21.55 -
   21.56 -You can also type \path{xm help $<$command$>$} for more information on a
   21.57 -given command.
   21.58 -
   21.59 -The xm tool is the primary tool for managing Xen from the console. The
   21.60 -general format of an xm command line is:
   21.61 -
   21.62 -\begin{verbatim}
   21.63 -# xm command [switches] [arguments] [variables]
   21.64 -\end{verbatim}
   21.65 -
   21.66 -The available \emph{switches} and \emph{arguments} are dependent on the
   21.67 -\emph{command} chosen. The \emph{variables} may be set using
   21.68 -declarations of the form {\tt variable=value} and command line
   21.69 -declarations override any of the values in the configuration file being
   21.70 -used, including the standard variables described above and any custom
   21.71 -variables (for instance, the \path{xmdefconfig} file uses a {\tt vmid}
   21.72 -variable).
   21.73 -
   21.74 -The available commands are as follows:
   21.75 -
   21.76 -\begin{description}
   21.77 -\item[mem-set] Request a domain to adjust its memory footprint.
   21.78 -\item[create] Create a new domain.
   21.79 -\item[destroy] Kill a domain immediately.
   21.80 -\item[list] List running domains.
   21.81 -\item[shutdown] Ask a domain to shut down.
   21.82 -\item[dmesg] Fetch the Xen (not Linux!) boot output.
   21.83 -\item[consoles] Lists the available consoles.
   21.84 -\item[console] Connect to the console for a domain.
   21.85 -\item[help] Get help on xm commands.
   21.86 -\item[save] Suspend a domain to disk.
   21.87 -\item[restore] Restore a domain from disk.
   21.88 -\item[pause] Pause a domain's execution.
   21.89 -\item[unpause] Un-pause a domain.
   21.90 -\item[pincpu] Pin a domain to a CPU.
   21.91 -\item[bvt] Set BVT scheduler parameters for a domain.
   21.92 -\item[bvt\_ctxallow] Set the BVT context switching allowance for the
   21.93 -  system.
   21.94 -\item[atropos] Set the atropos parameters for a domain.
   21.95 -\item[rrobin] Set the round robin time slice for the system.
   21.96 -\item[info] Get information about the Xen host.
   21.97 -\item[call] Call a \xend\ HTTP API function directly.
   21.98 -\end{description}
   21.99 -
  21.100 -\subsection{Basic Management Commands}
  21.101 -
  21.102 -The most important \path{xm} commands are:
  21.103 -\begin{quote}
  21.104 -  \verb_# xm list_: Lists all domains running.\\
  21.105 -  \verb_# xm consoles_: Gives information about the domain consoles.\\
  21.106 -  \verb_# xm console_: Opens a console to a domain (e.g.\
  21.107 -  \verb_# xm console myVM_)
  21.108 -\end{quote}
  21.109 -
  21.110 -\subsection{\tt xm list}
  21.111 -
  21.112 -The output of \path{xm list} is in rows of the following format:
  21.113 -\begin{center} {\tt name domid memory cpu state cputime console}
  21.114 -\end{center}
  21.115 -
  21.116 -\begin{quote}
  21.117 -  \begin{description}
  21.118 -  \item[name] The descriptive name of the virtual machine.
  21.119 -  \item[domid] The number of the domain ID this virtual machine is
  21.120 -    running in.
  21.121 -  \item[memory] Memory size in megabytes.
  21.122 -  \item[cpu] The CPU this domain is running on.
  21.123 -  \item[state] Domain state consists of 5 fields:
  21.124 -    \begin{description}
  21.125 -    \item[r] running
  21.126 -    \item[b] blocked
  21.127 -    \item[p] paused
  21.128 -    \item[s] shutdown
  21.129 -    \item[c] crashed
  21.130 -    \end{description}
  21.131 -  \item[cputime] How much CPU time (in seconds) the domain has used so
  21.132 -    far.
  21.133 -  \item[console] TCP port accepting connections to the domain's
  21.134 -    console.
  21.135 -  \end{description}
  21.136 -\end{quote}
  21.137 -
  21.138 -The \path{xm list} command also supports a long output format when the
  21.139 -\path{-l} switch is used.  This outputs the full details of the
  21.140 -running domains in \xend's SXP configuration format.
  21.141 -
  21.142 -For example, suppose the system is running the ttylinux domain as
  21.143 -described earlier.  The list command should produce output somewhat
  21.144 -like the following:
  21.145 -\begin{verbatim}
  21.146 -# xm list
  21.147 -Name              Id  Mem(MB)  CPU  State  Time(s)  Console
  21.148 -Domain-0           0      251    0  r----    172.2        
  21.149 -ttylinux           5       63    0  -b---      3.0    9605
  21.150 -\end{verbatim}
  21.151 -
  21.152 -Here we can see the details for the ttylinux domain, as well as for
  21.153 -domain~0 (which, of course, is always running).  Note that the console
  21.154 -port for the ttylinux domain is 9605.  This can be connected to by TCP
  21.155 -using a terminal program (e.g. \path{telnet} or, better,
  21.156 -\path{xencons}).  The simplest way to connect is to use the
  21.157 -\path{xm~console} command, specifying the domain name or ID.  To
  21.158 -connect to the console of the ttylinux domain, we could use any of the
  21.159 -following:
  21.160 -\begin{verbatim}
  21.161 -# xm console ttylinux
  21.162 -# xm console 5
  21.163 -# xencons localhost 9605
  21.164 -\end{verbatim}
  21.165 -
  21.166 -\section{xenstored}
  21.167 -
  21.168 -Placeholder.
    22.1 --- a/docs/src/user/fedora.tex	Sun Dec 04 20:12:00 2005 +0100
    22.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    22.3 @@ -1,102 +0,0 @@
    22.4 -\chapter{Installing Xen on Fedora~Core 4}
    22.5 -
    22.6 -This chapter describes various methods for installing Xen~3.0 on Fedora Core 4.
    22.7 -
    22.8 -\section{Installing Xen from Source or Binary Packages}
    22.9 -
   22.10 -\subsection{Required Packages}
   22.11 -bridge-utils
   22.12 -
   22.13 -
   22.14 -\subsection{Installing}
   22.15 -
   22.16 -Download the source or binary tarball from: \begin{quote} {\tt http://www.xensource.com/downloads/} \end{quote}
   22.17 -
   22.18 -Extract the archive using the following command:
   22.19 -
   22.20 -\begin{verbatim}
   22.21 -tar -zxvf xen-*****-***.tgz
   22.22 -\end{verbatim}
   22.23 -
   22.24 -Then \path{cd} into the extracted Xen source directory.
   22.25 -
   22.26 -To compile and install from source, run:
   22.27 -
   22.28 -\begin{verbatim}
   22.29 -     make dist
   22.30 -     make install
   22.31 -\end{verbatim}
   22.32 -
   22.33 -
   22.34 -To install the binary tarball, all you need to do is run the \path{install.sh} script.
   22.35 -
   22.36 -\begin{verbatim}
   22.37 -     #./install.sh
   22.38 -\end{verbatim}
   22.39 -
   22.40 -\subsection{Installing Xen using yum}
   22.41 -
   22.42 -To install xen, type the command
   22.43 -
   22.44 -\begin{verbatim}
   22.45 -#yum install xen
   22.46 -\end{verbatim}
   22.47 -
   22.48 -This will download the following rpms and install them:
   22.49 -
   22.50 -\begin{itemize}
   22.51 -\item xen
   22.52 -\item bridge-utils
   22.53 -\item sysfsutils
   22.54 -\end{itemize}
   22.55 -
   22.56 -Next we need to install kernel-xen0 and kernel-xenU. Type the command:
   22.57 -
   22.58 -\begin{verbatim}
   22.59 - yum install kernel-xen0 kernel-xenU 
   22.60 -\end{verbatim}
   22.61 -
   22.62 -Note: This installs xen0 and xenU kernels and adds an entry in the grub configuration.
   22.63 -\subsection{Getting Xen up and running}
   22.64 -
   22.65 -Once this finishes, you have the xen0 and xenU kernels installed under \path{/boot}. To boot into Dom0, edit the GRUB configuration file, \path{menu.lst}.
   22.66 -
   22.67 -Note: Installation using yum does not require manual GRUB configuration as described below.
   22.68 -
   22.69 -An example GRUB entry would look like:
   22.70 -
   22.71 -{\small
   22.72 -\begin{verbatim}
   22.73 -title Xen Unstable(From Fedora Core 4)
   22.74 -          root (hd0,0)
   22.75 -          kernel /fedora/xen.gz dom0_mem=230000 console=vga
   22.76 -          module /fedora/vmlinuz-2.6-xen0 root=/dev/Vol1/LV3 ro console=tty0
   22.77 -          module /fedora/initrd-2.6.11-1.1369_FC4smp.img
   22.78 -\end{verbatim}
   22.79 -}
   22.80 -
   22.81 -Also make sure that the \path{/var/run/xenstored} and \path{/var/lib/xenstored} directories exist. If they do not, create them manually.
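For example:

\begin{verbatim}
# mkdir -p /var/run/xenstored /var/lib/xenstored
\end{verbatim}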
   22.82 -
   22.83 -Now reboot and select the xen0 option from the GRUB menu.
   22.84 -
   22.85 -To check whether you are running the xen0 kernel, type \path{uname -r}.
   22.86 -
   22.87 -Now start the xend process:
   22.88 -
   22.89 -\begin{verbatim}
   22.90 -xend start
   22.91 -\end{verbatim}
   22.92 -
   22.93 -To check whether the xend process is running, type the following command, which lists the running domains:
   22.94 -
   22.95 -\begin{verbatim}
   22.96 -#xm list
   22.97 -      Name              Id  Mem(MB)  CPU VCPU(s)  State  Time(s)
   22.98 -      Domain-0           0      219    0      1   r-----     28.9
   22.99 -\end{verbatim}
  22.100 -
  22.101 -Since you haven't created any guest domains yet, you will see only Domain-0.
  22.102 -
  22.103 -\subsection{Further Help and Documentation}
  22.104 -
  22.105 -Besides the usual resources, see the Fedora Quickstart Guide at: \begin{quote} {\tt http://www.fedoraproject.org/wiki/FedoraXenQuickstart} \end{quote}
    23.1 --- a/docs/src/user/further_support.tex	Sun Dec 04 20:12:00 2005 +0100
    23.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    23.3 @@ -1,52 +0,0 @@
    23.4 -\chapter{Further Support}
    23.5 -
    23.6 -If you have questions that are not answered by this manual, the
    23.7 -sources of information listed below may be of interest to you.  Note
    23.8 -that bug reports, suggestions and contributions related to the
    23.9 -software (or the documentation) should be sent to the Xen developers'
   23.10 -mailing list (address below).
   23.11 -
   23.12 -
   23.13 -\section{Other Documentation}
   23.14 -
   23.15 -For developers interested in porting operating systems to Xen, the
   23.16 -\emph{Xen Interface Manual} is distributed in the \path{docs/}
   23.17 -directory of the Xen source distribution.
   23.18 -
   23.19 -
   23.20 -\section{Online References}
   23.21 -
   23.22 -The official Xen web site is found at:
   23.23 -\begin{quote} {\tt http://www.cl.cam.ac.uk/netos/xen/}
   23.24 -\end{quote}
   23.25 -
   23.26 -This contains links to the latest versions of all online
   23.27 -documentation, including the latest version of the FAQ.
   23.28 -
   23.29 -Information regarding Xen is also available at the Xen Wiki at
   23.30 -\begin{quote} {\tt http://wiki.xensource.com/xenwiki/}\end{quote}
   23.31 -The Xen project uses Bugzilla as its bug tracking system. You'll find
   23.32 -the Xen Bugzilla at {\tt http://bugzilla.xensource.com/bugzilla/}.
   23.33 -
   23.34 -
   23.35 -\section{Mailing Lists}
   23.36 -
   23.37 -There are several mailing lists that are used to discuss Xen related
   23.38 -topics. The most widely relevant are listed below. An official page of
   23.39 -mailing lists and subscription information can be found at \begin{quote}
   23.40 -  {\tt http://lists.xensource.com/} \end{quote}
   23.41 -
   23.42 -\begin{description}
   23.43 -\item[xen-devel@lists.xensource.com] Used for development
   23.44 -  discussions and bug reports.  Subscribe at: \\
   23.45 -  {\small {\tt http://lists.xensource.com/xen-devel}}
   23.46 -\item[xen-users@lists.xensource.com] Used for installation and usage
   23.47 -  discussions and requests for help.  Subscribe at: \\
   23.48 -  {\small {\tt http://lists.xensource.com/xen-users}}
   23.49 -\item[xen-announce@lists.xensource.com] Used for announcements only.
   23.50 -  Subscribe at: \\
   23.51 -  {\small {\tt http://lists.xensource.com/xen-announce}}
   23.52 -\item[xen-changelog@lists.xensource.com] Changelog feed
   23.53 -  from the unstable and 2.0 trees - developer oriented.  Subscribe at: \\
   23.54 -  {\small {\tt http://lists.xensource.com/xen-changelog}}
   23.55 -\end{description}
    24.1 --- a/docs/src/user/gentoo.tex	Sun Dec 04 20:12:00 2005 +0100
    24.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    24.3 @@ -1,3 +0,0 @@
    24.4 -\chapter{Installing Xen on Gentoo Linux}
    24.5 -
    24.6 -Placeholder.
    25.1 --- a/docs/src/user/glossary.tex	Sun Dec 04 20:12:00 2005 +0100
    25.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    25.3 @@ -1,84 +0,0 @@
    25.4 -\chapter{Glossary of Terms}
    25.5 -
    25.6 -\begin{description}
    25.7 -
    25.8 -\item[Atropos] One of the CPU schedulers provided by Xen.  Atropos
    25.9 -  provides domains with absolute shares of the CPU, with timeliness
   25.10 -  guarantees and a mechanism for sharing out `slack time'.
   25.11 -
   25.12 -\item[BVT] The BVT scheduler is used to give proportional fair shares
   25.13 -  of the CPU to domains.
   25.14 -
   25.15 -\item[Exokernel] A minimal piece of privileged code, similar to a {\bf
   25.16 -    microkernel} but providing a more `hardware-like' interface to the
   25.17 -  tasks it manages.  This is similar to a paravirtualising VMM like
   25.18 -  {\bf Xen} but was designed as a new operating system structure,
   25.19 -  rather than specifically to run multiple conventional OSs.
   25.20 -
   25.21 -\item[Domain] A domain is the execution context that contains a
   25.22 -  running {\bf virtual machine}.  The relationship between virtual
   25.23 -  machines and domains on Xen is similar to that between programs and
   25.24 -  processes in an operating system: a virtual machine is a persistent
   25.25 -  entity that resides on disk (somewhat like a program).  When it is
   25.26 -  loaded for execution, it runs in a domain.  Each domain has a {\bf
   25.27 -    domain ID}.
   25.28 -
   25.29 -\item[Domain 0] The first domain to be started on a Xen machine.
   25.30 -  Domain 0 is responsible for managing the system.
   25.31 -
   25.32 -\item[Domain ID] A unique identifier for a {\bf domain}, analogous to
   25.33 -  a process ID in an operating system.
   25.34 -
   25.35 -\item[Full virtualisation] An approach to virtualisation which
   25.36 -  requires no modifications to the hosted operating system, providing
   25.37 -  the illusion of a complete system of real hardware devices.
   25.38 -
   25.39 -\item[Hypervisor] An alternative term for {\bf VMM}, used because it
   25.40 -  means `beyond supervisor', since it is responsible for managing
   25.41 -  multiple `supervisor' kernels.
   25.42 -
   25.43 -\item[Live migration] A technique for moving a running virtual machine
   25.44 -  to another physical host, without stopping it or the services
   25.45 -  running on it.
   25.46 -
   25.47 -\item[Microkernel] A small base of code running at the highest
   25.48 -  hardware privilege level.  A microkernel is responsible for sharing
   25.49 -  CPU and memory (and sometimes other devices) between less privileged
   25.50 -  tasks running on the system.  This is similar to a VMM, particularly
   25.51 -  a {\bf paravirtualising} VMM but typically addressing a different
   25.52 -  problem space and providing a different kind of interface.
   25.53 -
   25.54 -\item[NetBSD/Xen] A port of NetBSD to the Xen architecture.
   25.55 -
   25.56 -\item[Paravirtualisation] An approach to virtualisation which requires
   25.57 -  modifications to the operating system in order to run in a virtual
   25.58 -  machine.  Xen uses paravirtualisation but preserves binary
   25.59 -  compatibility for user space applications.
   25.60 -
   25.61 -\item[Shadow pagetables] A technique for hiding the layout of machine
   25.62 -  memory from a virtual machine's operating system.  Used in some {\bf
   25.63 -  VMMs} to provide the illusion of contiguous physical memory; in
   25.64 -  Xen this is used during {\bf live migration}.
   25.65 -
   25.66 -\item[Virtual Block Device] Persistent storage available to a virtual
   25.67 -  machine, providing the abstraction of an actual block storage device.
   25.68 -  {\bf VBD}s may be actual block devices, filesystem images, or
   25.69 -  remote/network storage.
   25.70 -
   25.71 -\item[Virtual Machine] The environment in which a hosted operating
   25.72 -  system runs, providing the abstraction of a dedicated machine.  A
   25.73 -  virtual machine may be identical to the underlying hardware (as in
   25.74 -  {\bf full virtualisation}), or it may differ (as in {\bf
   25.75 -  paravirtualisation}).
   25.76 -
   25.77 -\item[VMM] Virtual Machine Monitor - the software that allows multiple
   25.78 -  virtual machines to be multiplexed on a single physical machine.
   25.79 -
   25.80 -\item[Xen] Xen is a paravirtualising virtual machine monitor,
   25.81 -  developed primarily by the Systems Research Group at the University
   25.82 -  of Cambridge Computer Laboratory.
   25.83 -
   25.84 -\item[XenLinux] Official name for the port of the Linux kernel that
   25.85 -  runs on Xen.
   25.86 -
   25.87 -\end{description}
    26.1 --- a/docs/src/user/installation.tex	Sun Dec 04 20:12:00 2005 +0100
    26.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    26.3 @@ -1,465 +0,0 @@
    26.4 -\chapter{Basic Installation}
    26.5 -
    26.6 -The Xen distribution includes three main components: Xen itself, ports
    26.7 -of Linux and NetBSD to run on Xen, and the userspace tools required to
    26.8 -manage a Xen-based system. This chapter describes how to install the
    26.9 -Xen~3.0 distribution from source. Alternatively, there may be pre-built
   26.10 -packages available as part of your operating system distribution.
   26.11 -
   26.12 -
   26.13 -\section{Prerequisites}
   26.14 -\label{sec:prerequisites}
   26.15 -
   26.16 -The following is a full list of prerequisites. Items marked `$\dag$' are
   26.17 -required by the \xend\ control tools, and hence required if you want to
   26.18 -run more than one virtual machine; items marked `$*$' are only required
   26.19 -if you wish to build from source.
   26.20 -\begin{itemize}
   26.21 -\item A working Linux distribution using the GRUB bootloader and running
   26.22 -  on a P6-class or newer CPU\@.
   26.23 -\item [$\dag$] The \path{iproute2} package.
   26.24 -\item [$\dag$] The Linux bridge-utils\footnote{Available from {\tt
   26.25 -      http://bridge.sourceforge.net}} (e.g., \path{/sbin/brctl})
   26.26 -\item [$\dag$] The Linux hotplug system\footnote{Available from {\tt
   26.27 -      http://linux-hotplug.sourceforge.net/}} (e.g.,
   26.28 -  \path{/sbin/hotplug} and related scripts)
   26.29 -\item [$*$] Build tools (gcc v3.2.x or v3.3.x, binutils, GNU make).
   26.30 -\item [$*$] Development installation of libcurl (e.g.,\ libcurl-devel).
   26.31 -\item [$*$] Development installation of zlib (e.g.,\ zlib-dev).
   26.32 -\item [$*$] Development installation of Python v2.2 or later (e.g.,\
   26.33 -  python-dev).
   26.34 -\item [$*$] \LaTeX\ and transfig are required to build the
   26.35 -  documentation.
   26.36 -\end{itemize}
   26.37 -
   26.38 -Once you have satisfied these prerequisites, you can now install either
   26.39 -a binary or source distribution of Xen.
   26.40 -
   26.41 -\section{Installing from Binary Tarball}
   26.42 -
   26.43 -Pre-built tarballs are available for download from the XenSource downloads
   26.44 -page:
   26.45 -\begin{quote} {\tt http://www.xensource.com/downloads/}
   26.46 -\end{quote}
   26.47 -
   26.48 -Once you've downloaded the tarball, simply unpack and install:
   26.49 -\begin{verbatim}
   26.50 -# tar zxvf xen-3.0-install.tgz
   26.51 -# cd xen-3.0-install
   26.52 -# sh ./install.sh
   26.53 -\end{verbatim}
   26.54 -
   26.55 -Once you've installed the binaries you need to configure your system as
   26.56 -described in Section~\ref{s:configure}.
   26.57 -
   26.58 -\section{Installing from RPMs}
   26.59 -Pre-built RPMs are available for download from the XenSource downloads
   26.60 -page:
   26.61 -\begin{quote} {\tt http://www.xensource.com/downloads/}
   26.62 -\end{quote}
   26.63 -
   26.64 -Once you've downloaded the RPMs, you typically install them with the \path{rpm} command:
   26.65 -\begin{verbatim}
   26.66 -# rpm -ivh <rpmname>
   26.67 -\end{verbatim}
   26.68 -
   26.69 -See the instructions and the Release Notes for each RPM set referenced at:
   26.70 -  \begin{quote}
   26.71 -    {\tt http://www.xensource.com/downloads/}.
   26.72 -  \end{quote}
   26.73 - 
   26.74 -\section{Installing from Source}
   26.75 -
   26.76 -This section describes how to obtain, build and install Xen from source.
   26.77 -
   26.78 -\subsection{Obtaining the Source}
   26.79 -
   26.80 -The Xen source tree is available as either a compressed source tarball
   26.81 -or as a clone of our master Mercurial repository.
   26.82 -
   26.83 -\begin{description}
   26.84 -\item[Obtaining the Source Tarball]\mbox{} \\
   26.85 -  Stable versions and daily snapshots of the Xen source tree are
   26.86 -  available from the Xen download page:
   26.87 -  \begin{quote} {\tt http://www.xensource.com/downloads/}
   26.88 -  \end{quote}
   26.89 -\item[Obtaining the source via Mercurial]\mbox{} \\
   26.90 -  The source tree may also be obtained via the public Mercurial
   26.91 -  repository hosted at:
   26.92 -  \begin{quote}{\tt http://xenbits.xensource.com}.
   26.93 -  \end{quote} See the instructions and the Getting Started Guide
   26.94 -  referenced at:
   26.95 -  \begin{quote}
   26.96 -    {\tt http://www.xensource.com/downloads/}.
   26.97 -  \end{quote}
   26.98 -\end{description}
   26.99 -
  26.100 -% \section{The distribution}
  26.101 -%
  26.102 -% The Xen source code repository is structured as follows:
  26.103 -%
  26.104 -% \begin{description}
  26.105 -% \item[\path{tools/}] Xen node controller daemon (Xend), command line
  26.106 -%   tools, control libraries
  26.107 -% \item[\path{xen/}] The Xen VMM.
  26.108 -% \item[\path{buildconfigs/}] Build configuration files
  26.109 -% \item[\path{linux-*-xen-sparse/}] Xen support for Linux.
  26.110 -% \item[\path{patches/}] Experimental patches for Linux.
  26.111 -% \item[\path{docs/}] Various documentation files for users and
  26.112 -%   developers.
  26.113 -% \item[\path{extras/}] Bonus extras.
  26.114 -% \end{description}
  26.115 -
  26.116 -\subsection{Building from Source}
  26.117 -
  26.118 -The top-level Xen Makefile includes a target ``world'' that will do the
  26.119 -following:
  26.120 -
  26.121 -\begin{itemize}
  26.122 -\item Build Xen.
  26.123 -\item Build the control tools, including \xend.
  26.124 -\item Download (if necessary) and unpack the Linux 2.6 source code, and
  26.125 -  patch it for use with Xen.
  26.126 -\item Build a Linux kernel to use in domain~0 and a smaller unprivileged
  26.127 -  kernel, which can optionally be used for unprivileged virtual
  26.128 -  machines.
  26.129 -\end{itemize}
  26.130 -
  26.131 -After the build has completed you should have a top-level directory
  26.132 -called \path{dist/} in which all resulting targets will be placed. Of
  26.133 -particular interest are the two XenLinux kernel images, one with a
  26.134 -``-xen0'' extension which contains hardware device drivers and drivers
  26.135 -for Xen's virtual devices, and one with a ``-xenU'' extension that
  26.136 -just contains the virtual ones. These are found in
  26.137 -\path{dist/install/boot/} along with the image for Xen itself and the
  26.138 -configuration files used during the build.
  26.139 -
  26.140 -%The NetBSD port can be built using:
  26.141 -%\begin{quote}
  26.142 -%\begin{verbatim}
  26.143 -%# make netbsd20
  26.144 -%\end{verbatim}\end{quote}
  26.145 -%NetBSD port is built using a snapshot of the netbsd-2-0 cvs branch.
  26.146 -%The snapshot is downloaded as part of the build process if it is not
  26.147 -%yet present in the \path{NETBSD\_SRC\_PATH} search path.  The build
  26.148 -%process also downloads a toolchain which includes all of the tools
  26.149 -%necessary to build the NetBSD kernel under Linux.
  26.150 -
  26.151 -To customize the set of kernels built you need to edit the top-level
  26.152 -Makefile. Look for the line:
  26.153 -\begin{quote}
  26.154 -\begin{verbatim}
  26.155 -KERNELS ?= mk.linux-2.6-xen0 mk.linux-2.6-xenU
  26.156 -\end{verbatim}
  26.157 -\end{quote}
  26.158 -
  26.159 -You can edit this line to include any set of operating system kernels
  26.160 -which have configurations in the top-level \path{buildconfigs/}
  26.161 -directory, for example \path{mk.linux-2.6-xenU} to build a Linux 2.6
  26.162 -kernel containing only virtual device drivers.
  26.163 -
  26.164 -%% Inspect the Makefile if you want to see what goes on during a
  26.165 -%% build.  Building Xen and the tools is straightforward, but XenLinux
  26.166 -%% is more complicated.  The makefile needs a `pristine' Linux kernel
  26.167 -%% tree to which it will then add the Xen architecture files.  You can
  26.168 -%% tell the makefile the location of the appropriate Linux compressed
  26.169 -%% tar file by
  26.170 -%% setting the LINUX\_SRC environment variable, e.g. \\
  26.171 -%% \verb!# LINUX_SRC=/tmp/linux-2.6.11.tar.bz2 make world! \\ or by
  26.172 -%% placing the tar file somewhere in the search path of {\tt
  26.173 -%%   LINUX\_SRC\_PATH} which defaults to `{\tt .:..}'.  If the
  26.174 -%% makefile can't find a suitable kernel tar file it attempts to
  26.175 -%% download it from kernel.org (this won't work if you're behind a
  26.176 -%% firewall).
  26.177 -
  26.178 -%% After untaring the pristine kernel tree, the makefile uses the {\tt
  26.179 -%%   mkbuildtree} script to add the Xen patches to the kernel.
  26.180 -
  26.181 -%% \framebox{\parbox{5in}{
  26.182 -%%     {\bf Distro specific:} \\
  26.183 -%%     {\it Gentoo} --- if not using udev (most installations,
  26.184 -%%     currently), you'll need to enable devfs and devfs mount at boot
  26.185 -%%     time in the xen0 config.  }}
  26.186 -
  26.187 -\subsection{Custom Kernels}
  26.188 -
  26.189 -% If you have an SMP machine you may wish to give the {\tt '-j4'}
  26.190 -% argument to make to get a parallel build.
  26.191 -
  26.192 -If you wish to build a customized XenLinux kernel (e.g.\ to support
  26.193 -additional devices or enable distribution-required features), you can
  26.194 -use the standard Linux configuration mechanisms, specifying that the
   26.195 -architecture being built for is \path{xen}, e.g.:
  26.196 -\begin{quote}
  26.197 -\begin{verbatim}
  26.198 -# cd linux-2.6.11-xen0
  26.199 -# make ARCH=xen xconfig
  26.200 -# cd ..
  26.201 -# make
  26.202 -\end{verbatim}
  26.203 -\end{quote}
  26.204 -
  26.205 -You can also copy an existing Linux configuration (\path{.config}) into
  26.206 -e.g.\ \path{linux-2.6.11-xen0} and execute:
  26.207 -\begin{quote}
  26.208 -\begin{verbatim}
  26.209 -# make ARCH=xen oldconfig
  26.210 -\end{verbatim}
  26.211 -\end{quote}
  26.212 -
  26.213 -You may be prompted with some Xen-specific options. We advise accepting
  26.214 -the defaults for these options.
  26.215 -
  26.216 -Note that the only difference between the two types of Linux kernels
  26.217 -that are built is the configuration file used for each. The ``U''
  26.218 -suffixed (unprivileged) versions don't contain any of the physical
  26.219 -hardware device drivers, leading to a 30\% reduction in size; hence you
  26.220 -may prefer these for your non-privileged domains. The ``0'' suffixed
  26.221 -privileged versions can be used to boot the system, as well as in driver
  26.222 -domains and unprivileged domains.
  26.223 -
  26.224 -\subsection{Installing Generated Binaries}
  26.225 -
  26.226 -The files produced by the build process are stored under the
  26.227 -\path{dist/install/} directory. To install them in their default
  26.228 -locations, do:
  26.229 -\begin{quote}
  26.230 -\begin{verbatim}
  26.231 -# make install
  26.232 -\end{verbatim}
  26.233 -\end{quote}
  26.234 -
  26.235 -Alternatively, users with special installation requirements may wish to
  26.236 -install them manually by copying the files to their appropriate
  26.237 -destinations.
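As a rough sketch of such a manual install (assuming the default build
layout, in which \path{dist/install/} mirrors the root filesystem), the
subtrees can be copied by hand; verify what is actually present under
\path{dist/install/} before doing so:
\begin{quote}
\begin{verbatim}
# cd dist/install
# cp -a boot/* /boot/
# cp -a lib/* /lib/
# cp -a usr/* /usr/
# cp -a etc/* /etc/
\end{verbatim}
\end{quote}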
  26.238 -
  26.239 -%% Files in \path{install/boot/} include:
  26.240 -%% \begin{itemize}
  26.241 -%% \item \path{install/boot/xen-3.0.gz} Link to the Xen 'kernel'
  26.242 -%% \item \path{install/boot/vmlinuz-2.6-xen0} Link to domain 0
  26.243 -%%   XenLinux kernel
  26.244 -%% \item \path{install/boot/vmlinuz-2.6-xenU} Link to unprivileged
  26.245 -%%   XenLinux kernel
  26.246 -%% \end{itemize}
  26.247 -
   26.248 -The \path{dist/install/boot} directory will also contain the config
   26.249 -files used for building the XenLinux kernels, as well as versions of
   26.250 -Xen and XenLinux that contain debug symbols (such as
   26.251 -\path{xen-syms-2.0.6} and \path{vmlinux-syms-2.6.11.11-xen0}), which are
   26.252 -essential for interpreting crash dumps. Retain these files, as the
   26.253 -developers may wish to see them if you post on the mailing list.
  26.254 -
  26.255 -
  26.256 -\section{Configuration}
  26.257 -\label{s:configure}
  26.258 -
  26.259 -Once you have built and installed the Xen distribution, it is simple to
  26.260 -prepare the machine for booting and running Xen.
  26.261 -
  26.262 -\subsection{GRUB Configuration}
  26.263 -
  26.264 -An entry should be added to \path{grub.conf} (often found under
  26.265 -\path{/boot/} or \path{/boot/grub/}) to allow Xen / XenLinux to boot.
  26.266 -This file is sometimes called \path{menu.lst}, depending on your
  26.267 -distribution. The entry should look something like the following:
  26.268 -
  26.269 -%% KMSelf Thu Dec  1 19:06:13 PST 2005 262144 is useful for RHEL/RH and
  26.270 -%% related Dom0s.
  26.271 -{\small
  26.272 -\begin{verbatim}
  26.273 -title Xen 3.0 / XenLinux 2.6
  26.274 -  kernel /boot/xen-3.0.gz dom0_mem=262144
  26.275 -  module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=tty0
  26.276 -\end{verbatim}
  26.277 -}
  26.278 -
   26.279 -The kernel line tells GRUB where to find Xen itself and what boot
   26.280 -parameters should be passed to it (in this case, setting the domain~0
   26.281 -memory allocation in kilobytes).
  26.282 -For more details on the various Xen boot parameters see
  26.283 -Section~\ref{s:xboot}.
  26.284 -
  26.285 -The module line of the configuration describes the location of the
  26.286 -XenLinux kernel that Xen should start and the parameters that should be
  26.287 -passed to it. These are standard Linux parameters, identifying the root
  26.288 -device and specifying it be initially mounted read only and instructing
  26.289 -that console output be sent to the screen. Some distributions such as
  26.290 -SuSE do not require the \path{ro} parameter.
  26.291 -
  26.292 -%% \framebox{\parbox{5in}{
  26.293 -%%     {\bf Distro specific:} \\
  26.294 -%%     {\it SuSE} --- Omit the {\tt ro} option from the XenLinux
  26.295 -%%     kernel command line, since the partition won't be remounted rw
  26.296 -%%     during boot.  }}
  26.297 -
  26.298 -To use an initrd, add another \path{module} line to the configuration,
  26.299 -like: {\small
  26.300 -\begin{verbatim}
  26.301 -  module /boot/my_initrd.gz
  26.302 -\end{verbatim}
  26.303 -}
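Putting the pieces together, a complete GRUB entry that boots Xen, a
XenLinux kernel and an initrd might look something like the following
(a sketch only; adjust the kernel filenames, root device and initrd
name to match your system):
{\small
\begin{verbatim}
title Xen 3.0 / XenLinux 2.6 (with initrd)
  kernel /boot/xen-3.0.gz dom0_mem=262144
  module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=tty0
  module /boot/my_initrd.gz
\end{verbatim}
}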
  26.304 -
  26.305 -%% KMSelf Thu Dec  1 19:05:30 PST 2005 Other configs as an appendix?
  26.306 -
  26.307 -When installing a new kernel, it is recommended that you do not delete
  26.308 -existing menu options from \path{menu.lst}, as you may wish to boot your
  26.309 -old Linux kernel in future, particularly if you have problems.
  26.310 -
  26.311 -\subsection{Serial Console (optional)}
  26.312 -
  26.313 -Serial console access allows you to manage, monitor, and interact with
  26.314 -your system over a serial console.  This can allow access from another
   26.315 -nearby system via a null-modem (``LapLink'') cable, remotely via a serial
  26.316 -concentrator, or for debugging an emulator such as Qemu.
  26.317 -
   26.318 -Your system's BIOS, bootloader (GRUB), Xen, Linux, and login access must
  26.319 -each be individually configured for serial console access.  It is
  26.320 -\emph{not} strictly necessary to have each component fully functional,
  26.321 -but it can be quite useful.
  26.322 -
  26.323 -For general information on serial console configuration under Linux,
  26.324 -refer to the ``Remote Serial Console HOWTO'' at The Linux Documentation
  26.325 -Project:  {\tt http://www.tldp.org}.
  26.326 -
  26.327 -\subsubsection{Serial Console BIOS configuration}
  26.328 -
  26.329 -Enabling system serial console output neither enables nor disables
  26.330 -serial capabilities in GRUB, Xen, or Linux, but may make remote
  26.331 -management of your system more convenient by displaying POST and other
  26.332 -boot messages over serial port and allowing remote BIOS configuration.
  26.333 -
  26.334 -Refer to your hardware vendor's documentation for capabilities and
  26.335 -procedures to enable BIOS serial redirection.
  26.336 -
  26.337 -
  26.338 -\subsubsection{Serial Console GRUB configuration}
  26.339 -
   26.342 -Enabling GRUB serial console output neither enables nor disables Xen or
   26.343 -Linux serial capabilities, but may make remote management of your system
  26.344 -more convenient by displaying GRUB prompts, menus, and actions over
  26.345 -serial port and allowing remote GRUB management.
  26.346 -
  26.347 -Adding the following two lines to your GRUB configuration file,
  26.348 -typically \path{/boot/grub/menu.lst} or \path{/boot/grub/grub.conf}
  26.349 -depending on your distro, will enable GRUB serial output.
  26.350 -
  26.351 -\begin{quote} {\small \begin{verbatim}
  26.352 -  serial --unit=0 --speed=115200 --word=8 --parity=no --stop=1
  26.353 -  terminal --timeout=10 serial console
  26.354 -\end{verbatim}}
  26.355 -\end{quote}
  26.356 -
  26.357 -Note that when both the serial port and the local monitor and keyboard
   26.358 -are enabled, the text ``Press any key to continue.'' will appear at both.
  26.359 -Pressing a key on one device will cause GRUB to display to that device.
  26.360 -The other device will see no output.  If no key is pressed before the
  26.361 -timeout period expires, the system will boot to the default GRUB boot
  26.362 -entry.
  26.363 -
  26.364 -Please refer to the GRUB info documentation for further information.
  26.365 -
  26.366 -
  26.367 -\subsubsection{Serial Console Xen configuration}
  26.368 -
  26.369 -Enabling Xen serial console output neither enables nor disables Linux
   26.370 -kernel output or logging in to Linux over serial port.  It does, however,
  26.371 -allow you to monitor and log the Xen boot process via serial console and
  26.372 -can be very useful in debugging.
  26.373 -
  26.374 -%% kernel /boot/xen-2.0.gz dom0_mem=131072 com1=115200,8n1
  26.375 -%% module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro
  26.376 -
  26.377 -In order to configure Xen serial console output, it is necessary to add
   26.378 -a boot option to your GRUB config; e.g.\ replace the above kernel line
  26.379 -with:
  26.380 -\begin{quote} {\small \begin{verbatim}
  26.381 -   kernel /boot/xen.gz dom0_mem=131072 com1=115200,8n1
  26.382 -\end{verbatim}}
  26.383 -\end{quote}
  26.384 -
  26.385 -This configures Xen to output on COM1 at 115,200 baud, 8 data bits, 1
  26.386 -stop bit and no parity. Modify these parameters for your environment.
  26.387 -
  26.388 -One can also configure XenLinux to share the serial console; to achieve
  26.389 -this append ``\path{console=ttyS0}'' to your module line.
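For example, appending \path{console=ttyS0} to the module line from the
earlier GRUB entry gives (adjust the root device for your system):
\begin{quote} {\small \begin{verbatim}
   module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro console=ttyS0
\end{verbatim}}
\end{quote}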
  26.390 -
  26.391 -
  26.392 -\subsubsection{Serial Console Linux configuration}
  26.393 -
  26.394 -Enabling Linux serial console output at boot neither enables nor
   26.395 -disables logging in to Linux over serial port.  It does, however, allow
  26.396 -you to monitor and log the Linux boot process via serial console and can be
  26.397 -very useful in debugging.
  26.398 -
  26.399 -To enable Linux output at boot time, add the parameter
  26.400 -\path{console=ttyS0} (or ttyS1, ttyS2, etc.) to your kernel GRUB line.
  26.401 -Under Xen, this might be:
  26.402 -\begin{quote} {\small \begin{verbatim}
   26.403 -  module /vmlinuz-2.6-xen0 ro root=/dev/VolGroup00/LogVol00 console=ttyS0,115200
  26.404 -\end{verbatim}}
  26.405 -\end{quote}
  26.406 -to enable output over ttyS0 at 115200 baud.
  26.407 -
  26.408 -
  26.409 -
  26.410 -\subsubsection{Serial Console Login configuration}
  26.411 -
  26.412 -Logging in to Linux via serial console, under Xen or otherwise, requires
  26.413 -specifying a login prompt be started on the serial port.  To permit root
  26.414 -logins over serial console, the serial port must be added to
  26.415 -\path{/etc/securetty}.
  26.416 -
   26.417 -To automatically start a login prompt over the serial port,
   26.418 -add the line: \begin{quote} {\small {\tt c:2345:respawn:/sbin/mingetty
   26.419 -ttyS0}} \end{quote} to \path{/etc/inittab}.  Run \path{init q} to force
   26.420 -a reload of your inittab and start the getty.
  26.421 -
  26.422 -To enable root logins, add \path{ttyS0} to \path{/etc/securetty} if not
  26.423 -already present.
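The steps above, collected as shell commands (a sketch assuming
\path{mingetty} and \path{ttyS0}; substitute your distribution's getty
and your serial device as appropriate):
\begin{quote} {\small \begin{verbatim}
# echo 'c:2345:respawn:/sbin/mingetty ttyS0' >> /etc/inittab
# echo 'ttyS0' >> /etc/securetty
# init q
\end{verbatim}}
\end{quote}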
  26.424 -
   26.425 -Your distribution may use an alternate getty; options include getty,
  26.426 -mgetty, agetty, and others.  Consult your distribution's documentation
  26.427 -for further information.
  26.428 -
  26.429 -
  26.430 -\subsection{TLS Libraries}
  26.431 -
  26.432 -Users of the XenLinux 2.6 kernel should disable Thread Local Storage
  26.433 -(TLS) (e.g.\ by doing a \path{mv /lib/tls /lib/tls.disabled}) before
  26.434 -attempting to boot a XenLinux kernel\footnote{If you boot without first
  26.435 -  disabling TLS, you will get a warning message during the boot process.
  26.436 -  In this case, simply perform the rename after the machine is up and
  26.437 -  then run \path{/sbin/ldconfig} to make it take effect.}. You can
  26.438 -always reenable TLS by restoring the directory to its original location
  26.439 -(i.e.\ \path{mv /lib/tls.disabled /lib/tls}).
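For convenience, the disable and re-enable operations as shell commands
(run the second only when you want TLS back):
\begin{quote} {\small \begin{verbatim}
# mv /lib/tls /lib/tls.disabled
# mv /lib/tls.disabled /lib/tls
\end{verbatim}}
\end{quote}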
  26.440 -
  26.441 -The reason for this is that the current TLS implementation uses
  26.442 -segmentation in a way that is not permissible under Xen. If TLS is not
  26.443 -disabled, an emulation mode is used within Xen which reduces performance
  26.444 -substantially.
  26.445 -
  26.446 -We hope that this issue can be resolved by working with Linux
  26.447 -distributions to implement a minor backward-compatible change to the TLS
  26.448 -library.
  26.449 -
  26.450 -
  26.451 -\section{Booting Xen}
  26.452 -
  26.453 -It should now be possible to restart the system and use Xen. Reboot and
   26.454 -choose the new Xen option when the GRUB screen appears.
  26.455 -
  26.456 -What follows should look much like a conventional Linux boot. The first
  26.457 -portion of the output comes from Xen itself, supplying low level
  26.458 -information about itself and the underlying hardware. The last portion
  26.459 -of the output comes from XenLinux.
  26.460 -
  26.461 -You may see some errors during the XenLinux boot. These are not
  26.462 -necessarily anything to worry about --- they may result from kernel
  26.463 -configuration differences between your XenLinux kernel and the one you
  26.464 -usually use.
  26.465 -
  26.466 -When the boot completes, you should be able to log into your system as
  26.467 -usual. If you are unable to log in, you should still be able to reboot
  26.468 -with your normal Linux kernel by selecting it at the GRUB prompt.
    27.1 --- a/docs/src/user/introduction.tex	Sun Dec 04 20:12:00 2005 +0100
    27.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    27.3 @@ -1,152 +0,0 @@
    27.4 -\chapter{Introduction}
    27.5 -
    27.6 -
    27.7 -Xen is a \emph{para-virtualizing} virtual machine monitor (VMM), or
    27.8 -``hypervisor'', for the x86 processor architecture. Xen can securely
    27.9 -execute multiple virtual machines on a single physical system with
   27.10 -close-to-native performance. The virtual machine technology facilitates
   27.11 -enterprise-grade functionality, including:
   27.12 -
   27.13 -\begin{itemize}
   27.14 -\item Virtual machines with performance typically 94-98\% of native hardware.
   27.15 -\item Live migration of running virtual machines between physical hosts.
   27.16 -\item Excellent hardware support. Supports most Linux device drivers.
   27.17 -\item Sand-boxed, re-startable device drivers.
   27.18 -\end{itemize}
   27.19 -
   27.20 -Para-virtualization permits very high performance virtualization, even
   27.21 -on architectures like x86 that are traditionally very hard to
   27.22 -virtualize.
   27.23 -
   27.24 -The drawback of this approach is that it requires operating systems to
   27.25 -be \emph{ported} to run on Xen. Porting an OS to run on Xen is similar
   27.26 -to supporting a new hardware platform, however the process is simplified
   27.27 -because the para-virtual machine architecture is very similar to the
   27.28 -underlying native hardware. Even though operating system kernels must
   27.29 -explicitly support Xen, a key feature is that user space applications
   27.30 -and libraries \emph{do not} require modification.
   27.31 -
   27.32 -With hardware CPU virtualization as provided by Intel VT and AMD
   27.33 -Pacifica technology, the ability to run an unmodified guest OS kernel is
   27.34 -available.  No porting of the OS is required; however some additional
   27.35 -driver support is necessary within Xen itself.  Unlike traditional full
   27.36 -virtualization hypervisors, which suffer a tremendous performance
    27.37 -overhead, Xen combined with VT or Pacifica technology offers superb
    27.38 -performance for para-virtualized guest operating systems together with
    27.39 -full support for unmodified guests, which run natively on the processor
    27.40 -under VT without need for emulation.
   27.41 -Full support for VT and Pacifica chipsets will appear in early 2006.
   27.42 -
   27.43 -Xen support is available for increasingly many operating systems:
   27.44 -currently, Linux and NetBSD are available for Xen 3.0. A FreeBSD port is
   27.45 -undergoing testing and will be incorporated into the release soon. Other
    27.46 -OS ports, including Plan 9, are in progress. We hope that the arch-xen
   27.47 -patches will be incorporated into the mainstream releases of these
   27.48 -operating systems in due course (as has already happened for NetBSD).
   27.49 -
   27.50 -%% KMSelf Thu Dec  1 14:59:02 PST 2005 PPC port status?
   27.51 -
   27.52 -Possible usage scenarios for Xen include:
   27.53 -
   27.54 -\begin{description}
   27.55 -\item [Kernel development.] Test and debug kernel modifications in a
   27.56 -  sand-boxed virtual machine --- no need for a separate test machine.
   27.57 -\item [Multiple OS configurations.] Run multiple operating systems
   27.58 -  simultaneously, for instance for compatibility or QA purposes.
   27.59 -\item [Server consolidation.] Move multiple servers onto a single
   27.60 -  physical host with performance and fault isolation provided at the
   27.61 -  virtual machine boundaries.
   27.62 -\item [Cluster computing.] Management at VM granularity provides more
   27.63 -  flexibility than separately managing each physical host, but better
   27.64 -  control and isolation than single-system image solutions,
   27.65 -  particularly by using live migration for load balancing.
   27.66 -\item [Hardware support for custom OSes.] Allow development of new
   27.67 -  OSes while benefiting from the wide-ranging hardware support of
   27.68 -  existing OSes such as Linux.
   27.69 -\end{description}
   27.70 -
   27.71 -
   27.72 -\section{Structure of a Xen-Based System}
   27.73 -
   27.74 -A Xen system has multiple layers, the lowest and most privileged of
   27.75 -which is Xen itself.
   27.76 -
    27.77 -Xen may host multiple \emph{guest} operating systems, each of which is
    27.78 -executed within a secure virtual machine (in Xen terminology, a
    27.79 -\emph{domain}). Domains are scheduled by Xen to make effective use of the
   27.80 -available physical CPUs. Each guest OS manages its own applications.
   27.81 -This management includes the responsibility of scheduling each
   27.82 -application within the time allotted to the VM by Xen.
   27.83 -
   27.84 -The first domain, \emph{domain~0}, is created automatically when the
   27.85 -system boots and has special management privileges. Domain~0 builds
   27.86 -other domains and manages their virtual devices. It also performs
   27.87 -administrative tasks such as suspending, resuming and migrating other
   27.88 -virtual machines.
   27.89 -
   27.90 -Within domain~0, a process called \emph{xend} runs to manage the system.
   27.91 -\Xend\ is responsible for managing virtual machines and providing access
   27.92 -to their consoles. Commands are issued to \xend\ over an HTTP interface,
   27.93 -either from a command-line tool or from a web browser.
   27.94 -
   27.95 -
   27.96 -\section{Hardware Support}
   27.97 -
   27.98 -Xen currently runs only on the x86 architecture, requiring a ``P6'' or
   27.99 -newer processor (e.g.\ Pentium Pro, Celeron, Pentium~II, Pentium~III,
  27.100 -Pentium~IV, Xeon, AMD~Athlon, AMD~Duron). Multiprocessor machines are
  27.101 -supported, and there is basic support for HyperThreading (SMT), although
  27.102 -this remains a topic for ongoing research. A port specifically for
  27.103 -x86/64 is in progress. Xen already runs on such systems in 32-bit legacy
  27.104 -mode. In addition, a port to the IA64 architecture is approaching
  27.105 -completion. We hope to add other architectures such as PPC and ARM in
  27.106 -due course.
  27.107 -
  27.108 -Xen can currently use up to 4GB of memory. It is possible for x86
  27.109 -machines to address up to 64GB of physical memory but there are no plans
  27.110 -to support these systems: The x86/64 port is the planned route to
  27.111 -supporting larger memory sizes.
  27.112 -
  27.113 -Xen offloads most of the hardware support issues to the guest OS running
  27.114 -in Domain~0. Xen itself contains only the code required to detect and
  27.115 -start secondary processors, set up interrupt routing, and perform PCI
  27.116 -bus enumeration. Device drivers run within a privileged guest OS rather
  27.117 -than within Xen itself. This approach provides compatibility with the
  27.118 -majority of device hardware supported by Linux. The default XenLinux
  27.119 -build contains support for relatively modern server-class network and
  27.120 -disk hardware, but you can add support for other hardware by configuring
  27.121 -your XenLinux kernel in the normal way.
  27.122 -
  27.123 -
  27.124 -\section{History}
  27.125 -
  27.126 -Xen was originally developed by the Systems Research Group at the
  27.127 -University of Cambridge Computer Laboratory as part of the XenoServers
  27.128 -project, funded by the UK-EPSRC\@.
  27.129 -
  27.130 -XenoServers aim to provide a ``public infrastructure for global
  27.131 -distributed computing''. Xen plays a key part in that, allowing one to
  27.132 -efficiently partition a single machine to enable multiple independent
   27.133 -clients to run their operating systems and applications in an
   27.134 -environment that provides protection, resource isolation and
   27.135 -accounting. The project web page contains further information along
  27.136 -with pointers to papers and technical reports:
  27.137 -\path{http://www.cl.cam.ac.uk/xeno}
  27.138 -
  27.139 -Xen has grown into a fully-fledged project in its own right, enabling us
  27.140 -to investigate interesting research issues regarding the best techniques
  27.141 -for virtualizing resources such as the CPU, memory, disk and network.
  27.142 -The project has been bolstered by support from Intel Research Cambridge
  27.143 -and HP Labs, who are now working closely with us.
  27.144 -
  27.145 -Xen was first described in a paper presented at SOSP in
  27.146 -2003\footnote{\tt
  27.147 -  http://www.cl.cam.ac.uk/netos/papers/2003-xensosp.pdf}, and the first
  27.148 -public release (1.0) was made that October. Since then, Xen has
  27.149 -significantly matured and is now used in production scenarios on many
  27.150 -sites.
  27.151 -
  27.152 -Xen 3.0 features greatly enhanced hardware support, configuration
  27.153 -flexibility, usability and a larger complement of supported operating
  27.154 -systems. This latest release takes Xen a step closer to becoming the
  27.155 -definitive open source solution for virtualization.
    28.1 --- a/docs/src/user/known_problems.tex	Sun Dec 04 20:12:00 2005 +0100
    28.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    28.3 @@ -1,3 +0,0 @@
    28.4 -\chapter{Known Problems}
    28.5 -
    28.6 -Problem One: No Known Problems Chapter.
    29.1 --- a/docs/src/user/logfiles.tex	Sun Dec 04 20:12:00 2005 +0100
    29.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    29.3 @@ -1,3 +0,0 @@
    29.4 -\chapter{Log Files}
    29.5 -
    29.6 -Placeholder.
    30.1 --- a/docs/src/user/memory_management.tex	Sun Dec 04 20:12:00 2005 +0100
    30.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    30.3 @@ -1,51 +0,0 @@
    30.4 -\chapter{Memory Management}
    30.5 -
    30.6 -\section{Managing Domain Memory}
    30.7 -
    30.8 -XenLinux domains have the ability to relinquish/reclaim machine
    30.9 -memory at the request of the administrator or the user of the domain.
   30.10 -
   30.11 -\subsection{Setting memory footprints from dom0}
   30.12 -
   30.13 -The machine administrator can request that a domain alter its memory
   30.14 -footprint using the \path{xm mem-set} command.  For instance, we can
   30.15 -request that our example ttylinux domain reduce its memory footprint
   30.16 -to 32 megabytes.
   30.17 -
   30.18 -\begin{verbatim}
   30.19 -# xm mem-set ttylinux 32
   30.20 -\end{verbatim}
   30.21 -
   30.22 -We can now see the result of this in the output of \path{xm list}:
   30.23 -
   30.24 -\begin{verbatim}
   30.25 -# xm list
   30.26 -Name              Id  Mem(MB)  CPU  State  Time(s)  Console
   30.27 -Domain-0           0      251    0  r----    172.2        
   30.28 -ttylinux           5       31    0  -b---      4.3    9605
   30.29 -\end{verbatim}
   30.30 -
   30.31 -The domain has responded to the request by returning memory to Xen. We
   30.32 -can restore the domain to its original size using the command line:
   30.33 -
   30.34 -\begin{verbatim}
   30.35 -# xm mem-set ttylinux 64
   30.36 -\end{verbatim}
   30.37 -
   30.38 -\subsection{Setting memory footprints from within a domain}
   30.39 -
   30.40 -The virtual file \path{/proc/xen/balloon} allows the owner of a domain
   30.41 -to adjust their own memory footprint.  Reading the file (e.g.\
   30.42 -\path{cat /proc/xen/balloon}) prints out the current memory footprint
   30.43 -of the domain.  Writing the file (e.g.\ \path{echo new\_target >
   30.44 -  /proc/xen/balloon}) requests that the kernel adjust the domain's
   30.45 -memory footprint to a new value.
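For example, from within the domain (a sketch; treating the written
value as kilobytes is an assumption here, so check the format that
reading the file reports before relying on it):
\begin{verbatim}
# cat /proc/xen/balloon
# echo 65536 > /proc/xen/balloon
\end{verbatim}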
   30.46 -
   30.47 -\subsection{Setting memory limits}
   30.48 -
   30.49 -Xen associates a memory size limit with each domain.  By default, this
   30.50 -is the amount of memory the domain is originally started with,
   30.51 -preventing the domain from ever growing beyond this size.  To permit a
   30.52 -domain to grow beyond its original allocation or to prevent a domain
   30.53 -you've shrunk from reclaiming the memory it relinquished, use the
   30.54 -\path{xm maxmem} command.
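For instance, to allow the ttylinux domain from the earlier example to
grow to up to 128 megabytes (a sketch; see \path{xm help maxmem} for
the exact syntax of your tools version):
\begin{verbatim}
# xm maxmem ttylinux 128
\end{verbatim}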
    31.1 --- a/docs/src/user/migrating_domains.tex	Sun Dec 04 20:12:00 2005 +0100
    31.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    31.3 @@ -1,70 +0,0 @@
    31.4 -\chapter{Migrating Domains}
    31.5 -
    31.6 -\section{Domain Save and Restore}
    31.7 -
    31.8 -The administrator of a Xen system may suspend a virtual machine's
    31.9 -current state into a disk file in domain~0, allowing it to be resumed at
   31.10 -a later time.
   31.11 -
   31.12 -The ttylinux domain described earlier can be suspended to disk using the
   31.13 -command:
   31.14 -\begin{verbatim}
   31.15 -# xm save ttylinux ttylinux.xen
   31.16 -\end{verbatim}
   31.17 -
   31.18 -This will stop the domain named ``ttylinux'' and save its current state
   31.19 -into a file called \path{ttylinux.xen}.
   31.20 -
   31.21 -To resume execution of this domain, use the \path{xm restore} command:
   31.22 -\begin{verbatim}
   31.23 -# xm restore ttylinux.xen
   31.24 -\end{verbatim}
   31.25 -
   31.26 -This will restore the state of the domain and restart it. The domain
   31.27 -will carry on as before and the console may be reconnected using the
   31.28 -\path{xm console} command, as above.
   31.29 -
   31.30 -\section{Live Migration}
   31.31 -
   31.32 -Live migration is used to transfer a domain between physical hosts
   31.33 -whilst that domain continues to perform its usual activities --- from
   31.34 -the user's perspective, the migration should be imperceptible.
   31.35 -
   31.36 -To perform a live migration, both hosts must be running Xen / \xend\ and
   31.37 -the destination host must have sufficient resources (e.g.\ memory
   31.38 -capacity) to accommodate the domain after the move. Furthermore we
   31.39 -currently require both source and destination machines to be on the same
   31.40 -L2 subnet.
   31.41 -
   31.42 -Currently, there is no support for providing automatic remote access to
   31.43 -filesystems stored on local disk when a domain is migrated.
   31.44 -Administrators should choose an appropriate storage solution (i.e.\ SAN,
   31.45 -NAS, etc.) to ensure that domain filesystems are also available on their
   31.46 -destination node. GNBD is a good method for exporting a volume from one
   31.47 -machine to another. iSCSI can do a similar job, but is more complex to
   31.48 -set up.
   31.49 -
    31.50 -When a domain migrates, its MAC and IP address move with it, so it is
   31.51 -only possible to migrate VMs within the same layer-2 network and IP
   31.52 -subnet. If the destination node is on a different subnet, the
   31.53 -administrator would need to manually configure a suitable etherip or IP
   31.54 -tunnel in the domain~0 of the remote node.
   31.55 -
   31.56 -A domain may be migrated using the \path{xm migrate} command. To live
   31.57 -migrate a domain to another machine, we would use the command:
   31.58 -
   31.59 -\begin{verbatim}
   31.60 -# xm migrate --live mydomain destination.ournetwork.com
   31.61 -\end{verbatim}
   31.62 -
   31.63 -Without the \path{--live} flag, \xend\ simply stops the domain and
   31.64 -copies the memory image over to the new node and restarts it. Since
   31.65 -domains can have large allocations this can be quite time consuming,
   31.66 -even on a Gigabit network. With the \path{--live} flag \xend\ attempts
   31.67 -to keep the domain running while the migration is in progress, resulting
   31.68 -in typical `downtimes' of just 60--300ms.
   31.69 -
   31.70 -For now it will be necessary to reconnect to the domain's console on the
   31.71 -new machine using the \path{xm console} command. If a migrated domain
   31.72 -has any open network connections then they will be preserved, so SSH
   31.73 -connections do not have this limitation.
    32.1 --- a/docs/src/user/monitoring_xen.tex	Sun Dec 04 20:12:00 2005 +0100
    32.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    32.3 @@ -1,3 +0,0 @@
    32.4 -\chapter{Monitoring Xen}
    32.5 -
    32.6 -Placeholder.
    33.1 --- a/docs/src/user/network_management.tex	Sun Dec 04 20:12:00 2005 +0100
    33.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    33.3 @@ -1,3 +0,0 @@
    33.4 -\chapter{Network Management}
    33.5 -
    33.6 -Placeholder.
    34.1 --- a/docs/src/user/options.tex	Sun Dec 04 20:12:00 2005 +0100
    34.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    34.3 @@ -1,149 +0,0 @@
    34.4 -\chapter{Build and Boot Options} 
    34.5 -
    34.6 -This chapter describes the build- and boot-time options which may be
    34.7 -used to tailor your Xen system.
    34.8 -
    34.9 -
   34.10 -\section{Xen Build Options}
   34.11 -
   34.12 -Xen provides a number of build-time options which should be set as
   34.13 -environment variables or passed on make's command-line.
   34.14 -
   34.15 -\begin{description}
   34.16 -\item[verbose=y] Enable debugging messages when Xen detects an
   34.17 -  unexpected condition.  Also enables console output from all domains.
   34.18 -\item[debug=y] Enable debug assertions.  Implies {\bf verbose=y}.
   34.19 -  (Primarily useful for tracing bugs in Xen).
   34.20 -\item[debugger=y] Enable the in-Xen debugger. This can be used to
   34.21 -  debug Xen, guest OSes, and applications.
   34.22 -\item[perfc=y] Enable performance counters for significant events
   34.23 -  within Xen. The counts can be reset or displayed on Xen's console
   34.24 -  via console control keys.
   34.25 -\item[trace=y] Enable per-cpu trace buffers which log a range of
   34.26 -  events within Xen for collection by control software.
   34.27 -\end{description}
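For example, a debug build of Xen and the tools could be produced with
(illustrative; combine whichever options you need):
\begin{quote}
\begin{verbatim}
# make debug=y world
\end{verbatim}
\end{quote}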
   34.28 -
   34.29 -
   34.30 -\section{Xen Boot Options}
   34.31 -\label{s:xboot}
   34.32 -
   34.33 -These options are used to configure Xen's behaviour at runtime.  They
   34.34 -should be appended to Xen's command line, either manually or by
   34.35 -editing \path{grub.conf}.
   34.36 -
   34.37 -\begin{description}
   34.38 -\item [ noreboot ] Don't reboot the machine automatically on errors.
   34.39 -  This is useful to catch debug output if you aren't catching console
   34.40 -  messages via the serial line.
   34.41 -\item [ nosmp ] Disable SMP support.  This option is implied by
   34.42 -  `ignorebiostables'.
   34.43 -\item [ watchdog ] Enable NMI watchdog which can report certain
   34.44 -  failures.
   34.45 -\item [ noirqbalance ] Disable software IRQ balancing and affinity.
   34.46 -  This can be used on systems such as Dell 1850/2850 that have
   34.47 -  workarounds in hardware for IRQ-routing issues.
   34.48 -\item [ badpage=$<$page number$>$,$<$page number$>$, \ldots ] Specify
   34.49 -  a list of pages not to be allocated for use because they contain bad
   34.50 -  bytes. For example, if your memory tester says that byte 0x12345678
   34.51 -  is bad, you would place `badpage=0x12345' on Xen's command line.
   34.52 -\item [ com1=$<$baud$>$,DPS,$<$io\_base$>$,$<$irq$>$
   34.53 -  com2=$<$baud$>$,DPS,$<$io\_base$>$,$<$irq$>$ ] \mbox{}\\
   34.54 -  Xen supports up to two 16550-compatible serial ports.  For example:
   34.55 -  `com1=9600, 8n1, 0x408, 5' maps COM1 to a 9600-baud port, 8 data
   34.56 -  bits, no parity, 1 stop bit, I/O port base 0x408, IRQ 5.  If some
   34.57 -  configuration options are standard (e.g., I/O base and IRQ), then
   34.58 -  only a prefix of the full configuration string need be specified. If
   34.59 -  the baud rate is pre-configured (e.g., by the bootloader) then you
   34.60 -  can specify `auto' in place of a numeric baud rate.
   34.61 -\item [ console=$<$specifier list$>$ ] Specify the destination for Xen
   34.62 -  console I/O.  This is a comma-separated list of, for example:
   34.63 -  \begin{description}
   34.64 -  \item[ vga ] Use VGA console and allow keyboard input.
   34.65 -  \item[ com1 ] Use serial port com1.
   34.66 -  \item[ com2H ] Use serial port com2. Transmitted chars will have the
   34.67 -    MSB set. Received chars must have MSB set.
   34.68 -  \item[ com2L] Use serial port com2. Transmitted chars will have the
   34.69 -    MSB cleared. Received chars must have MSB cleared.
   34.70 -  \end{description}
   34.71 -  The latter two examples allow a single port to be shared by two
   34.72 -  subsystems (e.g.\ console and debugger). Sharing is controlled by
   34.73 -  MSB of each transmitted/received character.  [NB. Default for this
   34.74 -  option is `com1,vga']
   34.75 -\item [ sync\_console ] Force synchronous console output. This is
    34.76 -  useful if your system fails unexpectedly before it has sent all
   34.77 -  available output to the console. In most cases Xen will
   34.78 -  automatically enter synchronous mode when an exceptional event
   34.79 -  occurs, but this option provides a manual fallback.
   34.80 -\item [ conswitch=$<$switch-char$><$auto-switch-char$>$ ] Specify how
   34.81 -  to switch serial-console input between Xen and DOM0. The required
   34.82 -  sequence is CTRL-$<$switch-char$>$ pressed three times. Specifying
   34.83 -  the backtick character disables switching.  The
   34.84 -  $<$auto-switch-char$>$ specifies whether Xen should auto-switch
   34.85 -  input to DOM0 when it boots --- if it is `x' then auto-switching is
   34.86 -  disabled.  Any other value, or omitting the character, enables
   34.87 -  auto-switching.  [NB. Default switch-char is `a'.]
   34.88 -\item [ nmi=xxx ]
   34.89 -  Specify what to do with an NMI parity or I/O error. \\
   34.90 -  `nmi=fatal':  Xen prints a diagnostic and then hangs. \\
   34.91 -  `nmi=dom0':   Inform DOM0 of the NMI. \\
   34.92 -  `nmi=ignore': Ignore the NMI.
   34.93 -\item [ mem=xxx ] Set the physical RAM address limit. Any RAM
   34.94 -  appearing beyond this physical address in the memory map will be
   34.95 -  ignored. This parameter may be specified with a B, K, M or G suffix,
   34.96 -  representing bytes, kilobytes, megabytes and gigabytes respectively.
   34.97 -  The default unit, if no suffix is specified, is kilobytes.
   34.98 -\item [ dom0\_mem=xxx ] Set the amount of memory to be allocated to
   34.99 -  domain0. In Xen 3.x the parameter may be specified with a B, K, M or
  34.100 -  G suffix, representing bytes, kilobytes, megabytes and gigabytes
  34.101 -  respectively; if no suffix is specified, the parameter defaults to
  34.102 -  kilobytes. In previous versions of Xen, suffixes were not supported
  34.103 -  and the value is always interpreted as kilobytes.
  34.104 -\item [ tbuf\_size=xxx ] Set the size of the per-cpu trace buffers, in
  34.105 -  pages (default 1).  Note that the trace buffers are only enabled in
  34.106 -  debug builds.  Most users can ignore this feature completely.
  34.107 -\item [ sched=xxx ] Select the CPU scheduler Xen should use.  The
  34.108 -  current possibilities are `bvt' (default), `atropos' and `rrobin'.
  34.109 -  For more information see Section~\ref{s:sched}.
  34.110 -\item [ apic\_verbosity=debug,verbose ] Print more detailed
  34.111 -  information about local APIC and IOAPIC configuration.
  34.112 -\item [ lapic ] Force use of local APIC even when left disabled by
  34.113 -  uniprocessor BIOS.
  34.114 -\item [ nolapic ] Ignore local APIC in a uniprocessor system, even if
  34.115 -  enabled by the BIOS.
  34.116 -\item [ apic=bigsmp,default,es7000,summit ] Specify NUMA platform.
  34.117 -  This can usually be probed automatically.
  34.118 -\end{description}
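As an illustration, a Xen kernel line in GRUB combining several of
these options might read (a sketch only; pick the options relevant to
your system):
\begin{quote} {\small \begin{verbatim}
kernel /boot/xen-3.0.gz dom0_mem=256M com1=115200,8n1 console=com1,vga noreboot
\end{verbatim}}
\end{quote}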
  34.119 -
  34.120 -In addition, the following options may be specified on the Xen command
  34.121 -line. Since domain 0 shares responsibility for booting the platform,
  34.122 -Xen will automatically propagate these options to its command line.
  34.123 -These options are taken from Linux's command-line syntax with
  34.124 -unchanged semantics.
  34.125 -
  34.126 -\begin{description}
  34.127 -\item [ acpi=off,force,strict,ht,noirq,\ldots ] Modify how Xen (and
  34.128 -  domain 0) parses the BIOS ACPI tables.
  34.129 -\item [ acpi\_skip\_timer\_override ] Instruct Xen (and domain~0) to
  34.130 -  ignore timer-interrupt override instructions specified by the BIOS
  34.131 -  ACPI tables.
  34.132 -\item [ noapic ] Instruct Xen (and domain~0) to ignore any IOAPICs
  34.133 -  that are present in the system, and instead continue to use the
  34.134 -  legacy PIC.
  34.135 -\end{description} 
  34.136 -
  34.137 -
  34.138 -\section{XenLinux Boot Options}
  34.139 -
  34.140 -In addition to the standard Linux kernel boot options, we support:
  34.141 -\begin{description}
  34.142 -\item[ xencons=xxx ] Specify the device node to which the Xen virtual
  34.143 -  console driver is attached. The following options are supported:
  34.144 -  \begin{center}
  34.145 -    \begin{tabular}{l}
  34.146 -      `xencons=off': disable virtual console \\
  34.147 -      `xencons=tty': attach console to /dev/tty1 (tty0 at boot-time) \\
  34.148 -      `xencons=ttyS': attach console to /dev/ttyS0
  34.149 -    \end{tabular}
  34.150 -\end{center}
  34.151 -The default is ttyS for dom0 and tty for all other domains.
  34.152 -\end{description}
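For example, appending \path{xencons=tty} to a XenLinux module line
attaches the virtual console to \path{/dev/tty1} (illustrative; adjust
the kernel filename and root device as usual):
\begin{quote} {\small \begin{verbatim}
  module /boot/vmlinuz-2.6-xen0 root=/dev/sda4 ro xencons=tty
\end{verbatim}}
\end{quote}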
    35.1 --- a/docs/src/user/rhel.tex	Sun Dec 04 20:12:00 2005 +0100
    35.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    35.3 @@ -1,127 +0,0 @@
    35.4 -\chapter{Installing Xen on Red Hat Enterprise Linux (RHEL) 4.1}
    35.5 -
     35.6 -Red Hat Enterprise Linux is the enterprise-grade, certified version of the Red Hat distribution. This chapter covers resolving dependencies using yum, installing Xen (from either the stable binary release or from source), and creating an initrd for Xen.
     35.7 -
   35.10 -\section{Stable binary release install}
   35.11 -
   35.12 -\subsection{Setup yum repository}
   35.13 -
    35.14 -Set up your yum repository to use Dag's Yum Repository or a similar one; Dag's is recommended.
   35.15 -
   35.16 -\subsection{Required Packages}
   35.17 -
   35.18 -These packages are required:
   35.19 -
   35.20 -\begin{itemize}
   35.21 -\item bridge-utils
   35.22 -\item curl
   35.23 -\item libidn
   35.24 -\item sysfsutils
   35.25 -\end{itemize}
   35.26 -
   35.27 -Use yum to install these packages.
   35.28 -
   35.29 -\begin{verbatim}
   35.30 -yum install bridge-utils curl libidn sysfsutils
   35.31 -\end{verbatim}
   35.32 -
    35.33 -\subsection{Download the Binary Tarball}
    35.34 -
   35.36 -Download the Xen 3.0 binary tarball from the XenSource downloads page:
   35.37 -
   35.38 -\begin{quote} {\tt http://www.xensource.com/downloads/}
   35.39 -\end{quote}
   35.40 -
   35.41 -\subsection{Extract and Install}
   35.42 -
   35.43 -\begin{verbatim}
    35.44 -tar zxvf xen-unstable-install-x86_32.tgz
   35.45 -
   35.46 -cd xen-unstable-install
   35.47 -
   35.48 -./install.sh 
   35.49 -\end{verbatim}
   35.50 -
   35.51 -
   35.52 -\subsection{Disable TLS}
   35.53 -
   35.54 -\begin{verbatim}
   35.55 -mv /lib/tls /lib/tls.disabled
   35.56 -\end{verbatim}
   35.57 -
   35.58 -\subsection{Creating initrd}
   35.59 -
    35.60 -You can use the distro's initrd. The following steps show you how to create one yourself for dom0 and domU. The example uses a Domain0 image; to adapt it for a DomainU, simply use the appropriate image.
   35.61 -
   35.62 -\begin{verbatim}
    35.63 -depmod 2.x.y-xen0    # re-create the module dependencies
    35.64 -
    35.65 -mkinitrd  /boot/initrd-2.x.y-xen0.img  2.x.y-xen0
   35.66 -\end{verbatim}
   35.67 -
   35.68 -If you get an error
   35.69 -
   35.70 -\begin{verbatim}
   35.71 -   "No module xxx found for kernel 2.x.y-xen0, aborting."
   35.72 -\end{verbatim}
   35.73 -
    35.74 -comment out or remove xxx in \path{/etc/modprobe.conf} if you don't want support for xxx. If you know that it is built into the kernel (check with \path{grep -i xxx config-2.6.12-xen0}), you can do
   35.75 -
   35.76 -\begin{verbatim}
   35.77 -mkinitrd  --builtin=aic7xxx  ./2.6.12-xen0.img  2.6.12-xen0
   35.78 -\end{verbatim}
   35.79 -
    35.80 -If another module yyy is reported as ``not found'', add it as well:
   35.81 -
   35.82 -\begin{verbatim}
   35.83 -mkinitrd  --builtin=xxx --builtin=yyy ./2.6.12-xen0.img  2.6.12-xen0
   35.84 -\end{verbatim}
   35.85 -
   35.86 -\subsection{Grub Configuration}
   35.87 -
    35.88 -As usual, you need to make an entry in the GRUB configuration file for Xen. Here's a sample GRUB entry.
   35.89 -
   35.90 -{\small
   35.91 -\begin{verbatim}
   35.92 -title  Xen/RHEL 4.1
    35.93 -       kernel (hd0,5)/boot/xen.gz dom0_mem=256000
   35.94 -       module (hd0,5)/boot/vmlinuz-2.6.11.12-xen0 root=/dev/hda6
   35.95 -       module (hd0,5)/boot/initrd-2.6.11.12-xen0.img
   35.96 -\end{verbatim}
   35.97 -}
   35.98 -
   35.99 -\section{Source install}
  35.100 -
  35.101 -
   35.102 -\subsection{Download the Source Tarball}
   35.103 -
   35.104 -Download the Xen 3.0 source tarball from the XenSource downloads page:
  35.106 -
  35.107 -\begin{quote} {\tt http://www.xensource.com/downloads/}
  35.108 -\end{quote}
  35.109 -
  35.110 -\subsection{Pre-requisites to build from source}
  35.111 -
   35.112 -Make sure you have all the required packages. If you chose to install the Development Tools during the distribution installation, you should not need to install any extra packages. If not, install the following:
  35.113 -
  35.114 -\begin{itemize}
  35.115 -\item gcc-3.4.3-22.1
  35.116 -\item python-devel-2.3.4-14.1
  35.117 -\item zlib-devel-1.2.1.2-1
  35.118 -\item curl-devel-7.12.1-5.rhel4
  35.119 -\end{itemize}
  35.120 -
  35.121 -\subsection{Install Xen}
  35.122 -
  35.123 -\begin{verbatim}
  35.124 -tar zxvf xen-unstable-src.tgz
  35.125 -cd xen-unstable/
  35.126 -make world
  35.127 -make install
  35.128 -\end{verbatim}
  35.129 -
  35.130 -The rest of the steps follow as with the binary tarball installation.
    36.1 --- a/docs/src/user/scheduler_management.tex	Sun Dec 04 20:12:00 2005 +0100
    36.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    36.3 @@ -1,3 +0,0 @@
    36.4 -\chapter{Scheduler Management}
    36.5 -
    36.6 -Placeholder.
    37.1 --- a/docs/src/user/securing_xen.tex	Sun Dec 04 20:12:00 2005 +0100
    37.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    37.3 @@ -1,85 +0,0 @@
    37.4 -\chapter{Securing Xen}
    37.5 -
    37.6 -This chapter describes how to secure a Xen system. It describes a number
    37.7 -of scenarios and provides a corresponding set of best practices. It
    37.8 -begins with a section devoted to understanding the security implications
    37.9 -of a Xen system.
   37.10 -
   37.11 -
   37.12 -\section{Xen Security Considerations}
   37.13 -
   37.14 -When deploying a Xen system, one must be sure to secure the management
   37.15 -domain (Domain-0) as much as possible. If the management domain is
   37.16 -compromised, all other domains are also vulnerable. The following are a
   37.17 -set of best practices for Domain-0:
   37.18 -
   37.19 -\begin{enumerate}
    37.20 -\item \textbf{Run the smallest number of necessary services.} The fewer
    37.21 -  things present in the management partition, the better.
   37.22 -  Remember, a service running as root in the management domain has full
   37.23 -  access to all other domains on the system.
   37.24 -\item \textbf{Use a firewall to restrict the traffic to the management
   37.25 -    domain.} A firewall with default-reject rules will help prevent
   37.26 -  attacks on the management domain.
   37.27 -\item \textbf{Do not allow users to access Domain-0.} The Linux kernel
   37.28 -  has been known to have local-user root exploits. If you allow normal
   37.29 -  users to access Domain-0 (even as unprivileged users) you run the risk
   37.30 -  of a kernel exploit making all of your domains vulnerable.
   37.31 -\end{enumerate}
   37.32 -
   37.33 -\section{Security Scenarios}
   37.34 -
   37.35 -
   37.36 -\subsection{The Isolated Management Network}
   37.37 -
   37.38 -In this scenario, each node has two network cards in the cluster. One
   37.39 -network card is connected to the outside world and one network card is a
   37.40 -physically isolated management network specifically for Xen instances to
   37.41 -use.
   37.42 -
   37.43 -As long as all of the management partitions are trusted equally, this is
   37.44 -the most secure scenario. No additional configuration is needed other
   37.45 -than forcing Xend to bind to the management interface for relocation.
   37.46 -
   37.47 -\textbf{FIXME:} What is the option to allow for this?
   37.48 -
   37.49 -
   37.50 -\subsection{A Subnet Behind a Firewall}
   37.51 -
   37.52 -In this scenario, each node has only one network card but the entire
   37.53 -cluster sits behind a firewall. This firewall should do at least the
   37.54 -following:
   37.55 -
   37.56 -\begin{enumerate}
   37.57 -\item Prevent IP spoofing from outside of the subnet.
   37.58 -\item Prevent access to the relocation port of any of the nodes in the
   37.59 -  cluster except from within the cluster.
   37.60 -\end{enumerate}
   37.61 -
   37.62 -The following iptables rules can be used on each node to prevent
   37.63 -migrations to that node from outside the subnet assuming the main
   37.64 -firewall does not do this for you:
   37.65 -
   37.66 -\begin{verbatim}
   37.67 -# this command disables all access to the Xen relocation
   37.68 -# port:
   37.69 -iptables -A INPUT -p tcp --destination-port 8002 -j REJECT
   37.70 -
   37.71 -# this command enables Xen relocations only from the specific
   37.72 -# subnet:
    37.73 -iptables -I INPUT -p tcp --source 192.168.1.1/8 \
   37.74 -    --destination-port 8002 -j ACCEPT
   37.75 -\end{verbatim}
   37.76 -
   37.77 -\subsection{Nodes on an Untrusted Subnet}
   37.78 -
   37.79 -Migration on an untrusted subnet is not safe in current versions of Xen.
    37.80 -It may be possible to perform migrations through a secure tunnel via a
    37.81 -VPN or SSH. The only safe option in the absence of a secure tunnel is to
   37.82 -disable migration completely. The easiest way to do this is with
   37.83 -iptables:
   37.84 -
   37.85 -\begin{verbatim}
   37.86 -# this command disables all access to the Xen relocation port
    37.87 -iptables -A INPUT -p tcp --destination-port 8002 -j REJECT
   37.88 -\end{verbatim}
    38.1 --- a/docs/src/user/start_addl_dom.tex	Sun Dec 04 20:12:00 2005 +0100
    38.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    38.3 @@ -1,189 +0,0 @@
    38.4 -\chapter{Starting Additional Domains}
    38.5 -
    38.6 -The first step in creating a new domain is to prepare a root
    38.7 -filesystem for it to boot.  Typically, this might be stored in a
    38.8 -normal partition, an LVM or other volume manager partition, a disk
     38.9 -file or on an NFS server.  A simple way to do this is to boot
   38.10 -from your standard OS install CD and install the distribution into
   38.11 -another partition on your hard drive.
   38.12 -
   38.13 -To start the \xend\ control daemon, type
   38.14 -\begin{quote}
   38.15 -  \verb!# xend start!
   38.16 -\end{quote}
   38.17 -
   38.18 -%% KMS: If we're going to use '# cmd' syntax we should be consistent
   38.19 -%% about it and have a conventions section noting that '#' == root
   38.20 -%% prompt.
   38.21 -
   38.22 -If you wish the daemon to start automatically, see the instructions in
   38.23 -Section~\ref{s:xend}. Once the daemon is running, you can use the
   38.24 -\path{xm} tool to monitor and maintain the domains running on your
   38.25 -system. This chapter provides only a brief tutorial. We provide full
   38.26 -details of the \path{xm} tool in the next chapter.
   38.27 -
   38.28 -% \section{From the web interface}
   38.29 -%
   38.30 -% Boot the Xen machine and start Xensv (see Chapter~\ref{cha:xensv}
   38.31 -% for more details) using the command: \\
   38.32 -% \verb_# xensv start_ \\
   38.33 -% This will also start Xend (see Chapter~\ref{cha:xend} for more
   38.34 -% information).
   38.35 -%
   38.36 -% The domain management interface will then be available at {\tt
   38.37 -%   http://your\_machine:8080/}.  This provides a user friendly wizard
   38.38 -% for starting domains and functions for managing running domains.
   38.39 -%
   38.40 -% \section{From the command line}
   38.41 -
   38.42 -
   38.43 -\section{Creating a Domain Configuration File}
   38.44 -
   38.45 -Before you can start an additional domain, you must create a
   38.46 -configuration file. We provide two example files which you can use as
   38.47 -a starting point:
   38.48 -\begin{itemize}
   38.49 -\item \path{/etc/xen/xmexample1} is a simple template configuration
   38.50 -  file for describing a single VM\@.
   38.51 -\item \path{/etc/xen/xmexample2} file is a template description that
   38.52 -  is intended to be reused for multiple virtual machines.  Setting the
   38.53 -  value of the \path{vmid} variable on the \path{xm} command line
   38.54 -  fills in parts of this template.
   38.55 -\end{itemize}
   38.56 -
   38.57 -Copy one of these files and edit it as appropriate.  Typical values
   38.58 -you may wish to edit include:
   38.59 -
   38.60 -\begin{quote}
   38.61 -\begin{description}
   38.62 -\item[kernel] Set this to the path of the kernel you compiled for use
   38.63 -  with Xen (e.g.\ \path{kernel = ``/boot/vmlinuz-2.6-xenU''})
   38.64 -\item[memory] Set this to the size of the domain's memory in megabytes
   38.65 -  (e.g.\ \path{memory = 64})
   38.66 -\item[disk] Set the first entry in this list to calculate the offset
   38.67 -  of the domain's root partition, based on the domain ID\@.  Set the
   38.68 -  second to the location of \path{/usr} if you are sharing it between
   38.69 -  domains (e.g.\ \path{disk = ['phy:your\_hard\_drive\%d,sda1,w' \%
   38.70 -    (base\_partition\_number + vmid),
    38.71 -    'phy:your\_usr\_partition,sda6,r' ]})
   38.72 -\item[dhcp] Uncomment the dhcp variable, so that the domain will
   38.73 -  receive its IP address from a DHCP server (e.g.\ \path{dhcp=``dhcp''})
   38.74 -\end{description}
   38.75 -\end{quote}
   38.76 -
   38.77 -You may also want to edit the {\bf vif} variable in order to choose
   38.78 -the MAC address of the virtual ethernet interface yourself.  For
   38.79 -example:
   38.80 -%% KMS:  We should indicate "safe" ranges to use.
   38.81 -\begin{quote}
   38.82 -\verb_vif = ['mac=00:06:AA:F6:BB:B3']_
   38.83 -\end{quote}
   38.84 -If you do not set this variable, \xend\ will automatically generate a
   38.85 -random MAC address from the range 00:16:3E:xx:xx:xx.  Generated MACs are
    38.86 -not tested for possible collisions; however, the likelihood of a collision
    38.87 -is low, at \begin{math} 1:2^{48}\end{math}.  XenSource Inc.\ gives permission for
    38.88 -anyone to use addresses randomly allocated from this range for use by
    38.89 -their Xen domains.
   38.90 -
   38.91 -
   38.92 -For a list of IEEE
   38.93 -assigned MAC organizationally unique identifiers (OUI), see \newline
   38.94 -{\tt http://standards.ieee.org/regauth/oui/oui.txt}
   38.95 -
   38.96 -
   38.97 -\section{Booting the Domain}
   38.98 -
   38.99 -The \path{xm} tool provides a variety of commands for managing
  38.100 -domains.  Use the \path{create} command to start new domains. Assuming
  38.101 -you've created a configuration file \path{myvmconf} based around
  38.102 -\path{/etc/xen/xmexample2}, to start a domain with virtual machine
  38.103 -ID~1 you should type:
  38.104 -
  38.105 -\begin{quote}
  38.106 -\begin{verbatim}
  38.107 -# xm create -c myvmconf vmid=1
  38.108 -\end{verbatim}
  38.109 -\end{quote}
  38.110 -
  38.111 -The \path{-c} switch causes \path{xm} to turn into the domain's
  38.112 -console after creation.  The \path{vmid=1} sets the \path{vmid}
  38.113 -variable used in the \path{myvmconf} file.
  38.114 -
  38.115 -You should see the console boot messages from the new domain appearing
  38.116 -in the terminal in which you typed the command, culminating in a login
  38.117 -prompt.
  38.118 -
  38.119 -
  38.120 -\section{Example: ttylinux}
  38.121 -
  38.122 -Ttylinux is a very small Linux distribution, designed to require very
  38.123 -few resources.  We will use it as a concrete example of how to start a
  38.124 -Xen domain.  Most users will probably want to install a full-featured
  38.125 -distribution once they have mastered the basics\footnote{ttylinux is
  38.126 -  maintained by Pascal Schmidt. You can download source packages from
  38.127 -  the distribution's home page: {\tt
  38.128 -    http://www.minimalinux.org/ttylinux/}}.
  38.129 -
  38.130 -\begin{enumerate}
  38.131 -\item Download and extract the ttylinux disk image from the Files
  38.132 -  section of the project's SourceForge site (see
  38.133 -  \path{http://sf.net/projects/xen/}).
  38.134 -\item Create a configuration file like the following:
  38.135 -  \begin{quote}
  38.136 -\begin{verbatim}
  38.137 -kernel = "/boot/vmlinuz-2.6-xenU"
  38.138 -memory = 64
  38.139 -name = "ttylinux"
  38.140 -nics = 1
  38.141 -ip = "1.2.3.4"
  38.142 -disk = ['file:/path/to/ttylinux/rootfs,sda1,w']
  38.143 -root = "/dev/sda1 ro"
  38.144 -\end{verbatim}    
  38.145 -  \end{quote}
  38.146 -\item Now start the domain and connect to its console:
  38.147 -  \begin{quote}
  38.148 -\begin{verbatim}
  38.149 -xm create configfile -c
  38.150 -\end{verbatim}
  38.151 -  \end{quote}
  38.152 -\item Login as root, password root.
  38.153 -\end{enumerate}
  38.154 -
  38.155 -
  38.156 -\section{Starting / Stopping Domains Automatically}
  38.157 -
  38.158 -It is possible to have certain domains start automatically at boot
  38.159 -time and to have dom0 wait for all running domains to shutdown before
  38.160 -it shuts down the system.
  38.161 -
   38.162 -To specify that a domain should start at boot time, place its
   38.163 -configuration file (or a link to it) under \path{/etc/xen/auto/}.
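For example, assuming the \path{myvmconf} configuration file from
earlier lives in \path{/etc/xen/}, a symbolic link is sufficient:
\begin{quote}
\begin{verbatim}
# ln -s /etc/xen/myvmconf /etc/xen/auto/myvmconf
\end{verbatim}
\end{quote}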
  38.164 -
  38.165 -A Sys-V style init script for Red Hat and LSB-compliant systems is
  38.166 -provided and will be automatically copied to \path{/etc/init.d/}
  38.167 -during install.  You can then enable it in the appropriate way for
  38.168 -your distribution.
  38.169 -
  38.170 -For instance, on Red Hat:
  38.171 -
  38.172 -\begin{quote}
  38.173 -  \verb_# chkconfig --add xendomains_
  38.174 -\end{quote}
  38.175 -
  38.176 -By default, this will start the boot-time domains in runlevels 3, 4
  38.177 -and 5.
  38.178 -
  38.179 -You can also use the \path{service} command to run this script
  38.180 -manually, e.g:
  38.181 -
  38.182 -\begin{quote}
  38.183 -  \verb_# service xendomains start_
  38.184 -
  38.185 -  Starts all the domains with config files under /etc/xen/auto/.
  38.186 -\end{quote}
  38.187 -
  38.188 -\begin{quote}
  38.189 -  \verb_# service xendomains stop_
  38.190 -
  38.191 -  Shuts down ALL running Xen domains.
  38.192 -\end{quote}
    39.1 --- a/docs/src/user/suse.tex	Sun Dec 04 20:12:00 2005 +0100
    39.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    39.3 @@ -1,3 +0,0 @@
    39.4 -\chapter{Installing Xen on SuSE or SuSE Linux Enterprise Server (SLES)}
    39.5 -
    39.6 -Placeholder.
    40.1 --- a/docs/src/user/testing.tex	Sun Dec 04 20:12:00 2005 +0100
    40.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    40.3 @@ -1,3 +0,0 @@
    40.4 -\chapter{Testing Xen}
    40.5 -
    40.6 -Placeholder.
    41.1 --- a/docs/src/user/xenstat.tex	Sun Dec 04 20:12:00 2005 +0100
    41.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    41.3 @@ -1,3 +0,0 @@
    41.4 -\chapter{xenstat}
    41.5 -
    41.6 -Placeholder.
    42.1 --- a/docs/src/user/xentrace.tex	Sun Dec 04 20:12:00 2005 +0100
    42.2 +++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
    42.3 @@ -1,3 +0,0 @@
    42.4 -\chapter{xentrace}
    42.5 -
    42.6 -Placeholder.