ia64/xen-unstable

changeset 2777:144012985ae9

bitkeeper revision 1.1159.141.1 (418243e8QcGEI8BSUDnzPf5M01yZuA)

doc fixes.
author kaf24@freefall.cl.cam.ac.uk
date Fri Oct 29 13:21:44 2004 +0000 (2004-10-29)
parents bd14e7f131d0
children 5bfc4a20f04d
files docs/src/interface.tex docs/src/user.tex
line diff
     1.1 --- a/docs/src/interface.tex	Fri Oct 29 10:31:32 2004 +0000
     1.2 +++ b/docs/src/interface.tex	Fri Oct 29 13:21:44 2004 +0000
     1.3 @@ -16,12 +16,19 @@
     1.4  {\Huge \bf Interface manual} \\[4mm]
     1.5  {\huge Xen v2.0 for x86} \\[80mm]
     1.6  
     1.7 -{\Large Xen is Copyright (c) 2004, The Xen Team} \\[3mm]
     1.8 +{\Large Xen is Copyright (c) 2002-2004, The Xen Team} \\[3mm]
     1.9  {\Large University of Cambridge, UK} \\[20mm]
    1.10 -{\large Last updated on 11th March, 2004}
    1.11  \end{tabular}
    1.12 +\end{center}
    1.13 +
    1.14 +{\bf
    1.15 +DISCLAIMER: This documentation is currently under active development
    1.16 +and as such there may be mistakes and omissions --- watch out for
    1.17 +these and please report any you find to the developer's mailing list.
    1.18 +Contributions of material, suggestions and corrections are welcome.
    1.19 +}
    1.20 +
    1.21  \vfill
    1.22 -\end{center}
    1.23  \cleardoublepage
    1.24  
    1.25  % TABLE OF CONTENTS
    1.26 @@ -45,42 +52,39 @@
    1.27  \setstretch{1.15}
    1.28  
    1.29  \chapter{Introduction}
    1.30 -Xen allows the hardware resouces of a machine to be virtualized and
    1.31 -dynamically partitioned such as to allow multiple different 'guest'
    1.32 -operating system images to be run simultaneously.
    1.33  
    1.34 -Virtualizing the machine in this manner provides flexibility allowing
    1.35 -different users to choose their preferred operating system (Windows,
    1.36 -Linux, NetBSD, or a custom operating system).  Furthermore, Xen provides
    1.37 -secure partitioning between these 'domains', and enables better resource
    1.38 +Xen allows the hardware resources of a machine to be virtualized and
    1.39 +dynamically partitioned, allowing multiple different {\em guest}
    1.40 +operating system images to be run simultaneously.  Virtualizing the
    1.41 +machine in this manner provides considerable flexibility, for example
    1.42 +allowing different users to choose their preferred operating system
    1.43 +(e.g., Linux, NetBSD, or a custom operating system).  Furthermore, Xen
    1.44 +provides secure partitioning between virtual machines (known as
    1.45 +{\em domains} in Xen terminology), and enables better resource
    1.46  accounting and QoS isolation than can be achieved with a conventional
    1.47 -operating system.
    1.48 -
    1.49 -The hypervisor runs directly on server hardware and dynamically partitions
    1.50 -it between a number of {\it domains}, each of which hosts an instance
    1.51 -of a {\it guest operating system}.  The hypervisor provides just enough
    1.52 -abstraction of the machine to allow effective isolation and resource 
    1.53 -management between these domains.
    1.54 +operating system. 
    1.55  
    1.56 -Xen essentially takes a virtual machine approach as pioneered by IBM
    1.57 -VM/370.  However, unlike VM/370 or more recent efforts such as VMWare
    1.58 -and Virtual PC, Xen doesn not attempt to completely virtualize the
    1.59 -underlying hardware.  Instead parts of the hosted guest operating
    1.60 -systems are modified to work with the hypervisor; the operating system
    1.61 -is effectively ported to a new target architecture, typically
    1.62 -requiring changes in just the machine-dependent code.  The user-level
    1.63 -API is unchanged, thus existing binaries and operating system
    1.64 -distributions can work unmodified.
    1.65 +Xen essentially takes a `whole-machine' virtualization approach as
    1.66 +pioneered by IBM VM/370.  However, unlike VM/370 or more recent
    1.67 +efforts such as VMWare and Virtual PC, Xen does not attempt to
    1.68 +completely virtualize the underlying hardware.  Instead parts of the
    1.69 +hosted guest operating systems are modified to work with the
    1.70 +VMM; the operating system is effectively ported to a new target
    1.71 +architecture, typically requiring changes in just the
    1.72 +machine-dependent code.  The user-level API is unchanged, thus
    1.73 +existing binaries and operating system distributions work without
    1.74 +modification. 
    1.75  
    1.76 -In addition to exporting virtualized instances of CPU, memory, network and
    1.77 -block devicees, Xen exposes a control interface to set how these resources
    1.78 -are shared between the running domains.  The control interface is privileged
    1.79 -and may only be accessed by one particular virtual machine: {\it domain0}.
    1.80 -This domain is a required part of any Xen-base server and runs the application
    1.81 -software that manages the control-plane aspects of the platform.  Running the
    1.82 -control software in {\it domain0}, distinct from the hypervisor itself, allows
    1.83 -the Xen framework to separate the notions of {\it mechanism} and {\it policy}
    1.84 -within the system.
    1.85 +In addition to exporting virtualized instances of CPU, memory, network
    1.86 +and block devices, Xen exposes a control interface to manage how these
    1.87 +resources are shared between the running domains. Access to the
    1.88 +control interface is restricted: it may only be used by one
    1.89 +specially-privileged VM, known as {\em Domain-0}.  This domain is a
    1.90 +required part of any Xen-based server and runs the application software
    1.91 +that manages the control-plane aspects of the platform.  Running the
    1.92 +control software in {\em Domain-0}, distinct from the hypervisor
    1.93 +itself, allows the Xen framework to separate the notions of {\it
    1.94 +mechanism} and {\it policy} within the system.
    1.95  
    1.96  
    1.97  \chapter{CPU state}
    1.98 @@ -89,12 +93,16 @@ All privileged state must be handled by 
    1.99  direct access to CR3 and is not permitted to update privileged bits in
   1.100  EFLAGS.
   1.101  
   1.102 +
   1.103  \chapter{Exceptions}
   1.104 -The IDT is virtualised by submitting a virtual 'trap
   1.105 -table' to Xen.  Most trap handlers are identical to native x86
   1.106 -handlers.  The page-fault handler is a noteable exception.
   1.107 +
   1.108 +The IDT is virtualised by submitting to Xen a table of trap handlers.
   1.109 +Most trap handlers are identical to native x86 handlers, although the
   1.110 +page-fault handler is a notable exception.
   1.111 +
   1.112  
   1.113  \chapter{Interrupts and events}
   1.114 +
   1.115  Interrupts are virtualized by mapping them to events, which are delivered 
   1.116  asynchronously to the target domain.  A guest OS can map these events onto
   1.117  its standard interrupt dispatch mechanisms, such as a simple vectoring 
   1.118 @@ -109,16 +117,19 @@ to it, {\it t} nanoseconds after the fir
   1.119  first).  This allows latency and throughput requirements to be addressed on a
   1.120  domain-specific basis.
   1.121  
   1.122 +
   1.123  \chapter{Time}
   1.124 -Guest operating systems need to be aware of the passage of real time and their
   1.125 -own ``virtual time'', i.e. the time they have been executing.  Furthermore, a
   1.126 -notion of time is required in the hypervisor itself for scheduling and the
   1.127 -activities that relate to it.  To this end the hypervisor provides for notions
   1.128 -of time:  cycle counter time, system time, wall clock time, domain virtual 
   1.129 +
   1.130 +Guest operating systems need to be aware of the passage of both real
   1.131 +(or wallclock) time and their own `virtual time' (i.e., the time for
   1.132 +which they have been executing).  Furthermore, a notion of time is
   1.133 +required in the hypervisor itself for scheduling and the activities
   1.134 +that relate to it.  To this end the hypervisor provides four notions of
   1.135 +time: cycle counter time, system time, wall clock time, and domain virtual
   1.136  time.
   1.137  
   1.138 +\section{Cycle counter time}
   1.139  
   1.140 -\section{Cycle counter time}
   1.141  This provides the finest-grained, free-running time reference, with the
   1.142  approximate frequency being publicly accessible.  The cycle counter time is
   1.143  used to accurately extrapolate the other time references.  On SMP machines
   1.144 @@ -127,6 +138,7 @@ CPUs.  The current x86-based implementat
   1.145  communication latencies.
   1.146  
   1.147  \section{System time}
   1.148 +
   1.149  This is a 64-bit value containing the nanoseconds elapsed since boot
   1.150  time.  Unlike cycle counter time, system time accurately reflects the
   1.151  passage of real time, i.e.  it is adjusted several times a second for timer
   1.152 @@ -135,6 +147,7 @@ the machine, feeding updates to the hype
   1.153  extrapolated using the cycle counter.
   1.154  
   1.155  \section{Wall clock time}
   1.156 +
   1.157  This is the actual ``time of day'' Unix style struct timeval (i.e. seconds and
   1.158  microseconds since 1 January 1970, adjusted by leap seconds etc.).  Again, an 
   1.159  NTP client hosted by {\it domain0} can help maintain this value.  To guest 
   1.160 @@ -142,8 +155,8 @@ operating systems this value will be rep
   1.161  clock value and they can use the system time and cycle counter times to start
   1.162  and remain perfectly in time.
   1.163  
   1.164 +\section{Domain virtual time}
   1.165  
   1.166 -\section{Domain virtual time}
   1.167  This progresses at the same pace as cycle counter time, but only while a
   1.168  domain is executing.  It stops while a domain is de-scheduled.  Therefore the
   1.169  share of the CPU that a domain receives is indicated by the rate at which
   1.170 @@ -151,6 +164,7 @@ its domain virtual time increases, relat
   1.171  counter time does so.
   1.172  
   1.173  \section{Time interface}
   1.174 +
   1.175  Xen exports some timestamps to guest operating systems through their shared
   1.176  info page.  Timestamps are provided for system time and wall-clock time.  Xen
   1.177  also provides the cycle counter values at the time of the last update
   1.178 @@ -178,6 +192,7 @@ them to request a timer event sent to th
   1.179  time.  Guest OSes may use this timer to implement timeout values when they
   1.180  block.
   1.181  
   1.182 +
   1.183  \chapter{Memory}
   1.184  
   1.185  The hypervisor is responsible for providing memory to each of the
   1.186 @@ -203,6 +218,7 @@ own operation and place applications (if
   1.187  in ring 3.
   1.188  
   1.189  \section{Physical Memory Allocation}
   1.190 +
   1.191  The hypervisor reserves a small fixed portion of physical memory at
   1.192  system boot time.  This special memory region is located at the
   1.193  beginning of physical memory and is mapped at the very top of every
   1.194 @@ -220,21 +236,31 @@ requested for them on initialization.  H
   1.195  pages to the hypervisor if it discovers that its memory requirements
   1.196  have diminished.
   1.197  
   1.198 -% put reasons for why pages might be returned here.
   1.199  \section{Page Table Updates}
   1.200 +
   1.201  In addition to managing physical memory allocation, the hypervisor is also in
   1.202  charge of performing page table updates on behalf of the domains.  This is 
   1.203 -neccessary to prevent domains from adding arbitrary mappings to their page
   1.204 +necessary to prevent domains from adding arbitrary mappings to their page
   1.205  tables or introducing mappings to other's page tables.
   1.206  
   1.207 -\section{Writabel Page Tables}
   1.208 -A domain can also request write access to its page tables.  In this
   1.209 -mode, Xen notes write attempts to page table pages and makes the page
   1.210 -temporarily writable.  In-use page table pages are also disconnect
   1.211 -from the page directory.  The domain can now update entries in these
   1.212 -page table pages without the assistance of Xen.  As soon as the
   1.213 -writabel page table pages get used as page table pages, Xen makes the
   1.214 -pages read-only again and revalidates the entries in the pages.
   1.215 +\section{Writable Page Tables}
   1.216 +
   1.217 +Rather than using the explicit page-update interface that Xen
   1.218 +provides, guests may also be provided with the illusion that their
   1.219 +page tables are directly writable. Of course this is not really the
   1.220 +case, since Xen must validate modifications to ensure secure
   1.221 +partitioning of domains. Instead, Xen detects any write attempt to a
   1.222 +memory page that is currently part of a page table. If such an access
   1.223 +occurs, Xen temporarily allows write access to that page while at the
   1.224 +same time {\em disconnecting} it from the page table that is currently
   1.225 +in use. This allows the guest to safely make updates to the page
   1.226 +because the newly-updated entries cannot be used by the MMU until Xen
   1.227 +revalidates and {\em reconnects} the page.
   1.228 +
   1.229 +Reconnection occurs automatically in a number of situations: for
   1.230 +example, when the guest modifies a different page-table page, when the
   1.231 +domain is preempted, or whenever the guest uses Xen's explicit
   1.232 +page-table update interfaces.
   1.233  
   1.234  \section{Segment Descriptor Tables}
   1.235  
   1.236 @@ -752,13 +778,19 @@ For more information, see the manual pag
   1.237  xentrace\_format} and {\tt xentrace\_cpusplit}.
   1.238  
   1.239  
   1.240 -\chapter{Hypervisor calls}
   1.241 +\appendix
   1.242 +
   1.243 +\newcommand{\hypercall}[1]{\vspace{5mm}{\large\sf #1}}
   1.244  
   1.245 -\section{ set\_trap\_table(trap\_info\_t *table)} 
   1.246 +\chapter{Xen Hypercalls}
   1.247 +
   1.248 +\hypercall{ set\_trap\_table(trap\_info\_t *table)} 
   1.249  
   1.250  Install trap handler table.
   1.251  
   1.252 -\section{ mmu\_update(mmu\_update\_t *req, int count, int *success\_count)} 
   1.253 +
   1.254 +\hypercall{ mmu\_update(mmu\_update\_t *req, int count, int *success\_count)} 
   1.255 +
   1.256  Update the page table for the domain. Updates can be batched.
   1.257  success\_count will be updated to report the number of successfull
   1.258  updates.  The update types are:
   1.259 @@ -769,24 +801,35 @@ updates.  The update types are:
   1.260  
   1.261  {\it MMU\_EXTENDED\_COMMAND}:
   1.262  
   1.263 -\section{ set\_gdt(unsigned long *frame\_list, int entries)} 
   1.264 +
   1.265 +\hypercall{ set\_gdt(unsigned long *frame\_list, int entries)} 
   1.266 +
   1.267  Set the global descriptor table - virtualization for lgdt.
   1.268  
   1.269 -\section{ stack\_switch(unsigned long ss, unsigned long esp)} 
   1.270 +
   1.271 +\hypercall{ stack\_switch(unsigned long ss, unsigned long esp)} 
   1.272 +
   1.273  Request context switch from hypervisor.
   1.274  
   1.275 -\section{ set\_callbacks(unsigned long event\_selector, unsigned long event\_address,
   1.276 +
   1.277 +\hypercall{ set\_callbacks(unsigned long event\_selector, unsigned long event\_address,
   1.278                          unsigned long failsafe\_selector, unsigned
   1.279 - long failsafe\_address) } Register OS event processing routine.  In
   1.280 - Linux both the event\_selector and failsafe\_selector are the
   1.281 - kernel's CS.  The value event\_address specifies the address for an
   1.282 - interrupt handler dispatch routine and failsafe\_address specifies a
   1.283 - handler for application faults.
   1.284 + long failsafe\_address) }
   1.285  
   1.286 -\section{ fpu\_taskswitch(void)} 
   1.287 +Register OS event processing routine.  In
   1.288 +Linux both the event\_selector and failsafe\_selector are the
   1.289 +kernel's CS.  The value event\_address specifies the address for an
   1.290 +interrupt handler dispatch routine and failsafe\_address specifies a
   1.291 +handler for application faults.
   1.292 +
   1.293 +
   1.294 +\hypercall{ fpu\_taskswitch(void)} 
   1.295 +
   1.296  Notify hypervisor that fpu registers needed to be save on context switch.
   1.297  
   1.298 -\section{ sched\_op(unsigned long op)} 
   1.299 +
   1.300 +\hypercall{ sched\_op(unsigned long op)} 
   1.301 +
   1.302  Request scheduling operation from hypervisor. The options are: {\it
   1.303  yield}, {\it block}, and {\it shutdown}.  {\it yield} keeps the
   1.304  calling domain run-able but may cause a reschedule if other domains
   1.305 @@ -795,7 +838,9 @@ queue and the domains sleeps until an ev
   1.306  shutdown} is used to end the domain's execution and allows to specify
   1.307  whether the domain should reboot, halt or suspend..
   1.308  
   1.309 -\section{ dom0\_op(dom0\_op\_t *op)} 
   1.310 +
   1.311 +\hypercall{ dom0\_op(dom0\_op\_t *op)} 
   1.312 +
   1.313  Administrative domain operations for domain management. The options are:
   1.314  
   1.315  {\it DOM0\_CREATEDOMAIN}: create new domain, specifying the name and memory usage
   1.316 @@ -855,47 +900,71 @@ in kilobytes.
   1.317  {\it DOM0\_SETDOMAINVMASSIST}: set domain VM assist options
   1.318  
   1.319  
   1.320 -\section{ set\_debugreg(int reg, unsigned long value)}
   1.321 +\hypercall{ set\_debugreg(int reg, unsigned long value)}
   1.322 +
   1.323  set debug register reg to value
   1.324  
   1.325 -\section{ get\_debugreg(int reg)}
   1.326 +
   1.327 +\hypercall{ get\_debugreg(int reg)}
   1.328 +
   1.329   get the debug register reg
   1.330  
   1.331 -\section{ update\_descriptor(unsigned long ma, unsigned long word1, unsigned long word2)} 
   1.332 +
   1.333 +\hypercall{ update\_descriptor(unsigned long ma, unsigned long word1, unsigned long word2)} 
   1.334  
   1.335 -\section{ set\_fast\_trap(int idx)}
   1.336 +
   1.337 +\hypercall{ set\_fast\_trap(int idx)}
   1.338 +
   1.339   install traps to allow guest OS to bypass hypervisor
   1.340  
   1.341 -\section{ dom\_mem\_op(unsigned int op, unsigned long *extent\_list, unsigned long nr\_extents, unsigned int extent\_order)}
   1.342 +
   1.343 +\hypercall{ dom\_mem\_op(unsigned int op, unsigned long *extent\_list, unsigned long nr\_extents, unsigned int extent\_order)}
   1.344 +
   1.345  Increase or decrease memory reservations for guest OS
   1.346  
   1.347 -\section{ multicall(void *call\_list, int nr\_calls)}
   1.348 +
   1.349 +\hypercall{ multicall(void *call\_list, int nr\_calls)}
   1.350 +
   1.351  Execute a series of hypervisor calls
   1.352  
   1.353 -\section{ update\_va\_mapping(unsigned long page\_nr, unsigned long val, unsigned long flags)}
   1.354 +
   1.355 +\hypercall{ update\_va\_mapping(unsigned long page\_nr, unsigned long val, unsigned long flags)}
   1.356  
   1.357 -\section{ set\_timer\_op(uint64\_t timeout)} 
   1.358 +
   1.359 +\hypercall{ set\_timer\_op(uint64\_t timeout)} 
   1.360 +
   1.361  Request a timer event to be sent at the specified system time.
   1.362  
   1.363 -\section{ event\_channel\_op(void *op)} 
   1.364 -Iinter-domain event-channel management.
   1.365 +
   1.366 +\hypercall{ event\_channel\_op(void *op)} 
   1.367  
   1.368 -\section{ xen\_version(int cmd)}
   1.369 +Inter-domain event-channel management.
   1.370 +
   1.371 +
   1.372 +\hypercall{ xen\_version(int cmd)}
   1.373 +
   1.374  Request Xen version number.
   1.375  
   1.376 -\section{ console\_io(int cmd, int count, char *str)}
   1.377 +
   1.378 +\hypercall{ console\_io(int cmd, int count, char *str)}
   1.379 +
   1.380  Interact with the console, operations are:
   1.381  
   1.382  {\it CONSOLEIO\_write}: Output count characters from buffer str.
   1.383  
   1.384  {\it CONSOLEIO\_read}: Input at most count characters into buffer str.
   1.385  
   1.386 -\section{ physdev\_op(void *physdev\_op)}
   1.387 +
   1.388 +\hypercall{ physdev\_op(void *physdev\_op)}
   1.389  
   1.390 -\section{ grant\_table\_op(unsigned int cmd, void *uop, unsigned int count)}
   1.391 +
   1.392 +\hypercall{ grant\_table\_op(unsigned int cmd, void *uop, unsigned int count)}
   1.393  
   1.394 -\section{ vm\_assist(unsigned int cmd, unsigned int type)}
   1.395 +
   1.396 +\hypercall{ vm\_assist(unsigned int cmd, unsigned int type)}
   1.397  
   1.398 -\section{ update\_va\_mapping\_otherdomain(unsigned long page\_nr, unsigned long val, unsigned long flags, uint16\_t domid)}
   1.399 +
   1.400 +\hypercall{ update\_va\_mapping\_otherdomain(unsigned long page\_nr, unsigned long val, unsigned long flags, uint16\_t domid)}
   1.401 +
   1.402  
   1.403  \end{document}
     2.1 --- a/docs/src/user.tex	Fri Oct 29 10:31:32 2004 +0000
     2.2 +++ b/docs/src/user.tex	Fri Oct 29 13:21:44 2004 +0000
     2.3 @@ -16,12 +16,19 @@
     2.4  {\Huge \bf Users' manual} \\[4mm]
     2.5  {\huge Xen v2.0 for x86} \\[80mm]
     2.6  
     2.7 -{\Large Xen is Copyright (c) 2004, The Xen Team} \\[3mm]
     2.8 +{\Large Xen is Copyright (c) 2002-2004, The Xen Team} \\[3mm]
     2.9  {\Large University of Cambridge, UK} \\[20mm]
    2.10 -{\large Last updated on 26th October, 2004}
    2.11  \end{tabular}
    2.12 +\end{center}
    2.13 +
    2.14 +{\bf
    2.15 +DISCLAIMER: This documentation is currently under active development
    2.16 +and as such there may be mistakes and omissions --- watch out for
    2.17 +these and please report any you find to the developer's mailing list.
    2.18 +Contributions of material, suggestions and corrections are welcome.
    2.19 +}
    2.20 +
    2.21  \vfill
    2.22 -\end{center}
    2.23  \cleardoublepage
    2.24  
    2.25  % TABLE OF CONTENTS
    2.26 @@ -49,15 +56,6 @@
    2.27  \part{Introduction and Tutorial}
    2.28  \chapter{Introduction}
    2.29  
    2.30 -{\bf
    2.31 -DISCLAIMER: This documentation is currently under active development
    2.32 -and as such there may be mistakes and omissions --- watch out for
    2.33 -these and please report any you find to the developer's mailing list.
    2.34 -Contributions of material, suggestions and corrections are welcome.
    2.35 -}
    2.36 -
    2.37 -\vspace{5mm}
    2.38 -
    2.39  Xen is a { \em paravirtualising } virtual machine monitor (VMM), or
    2.40  `hypervisor', for the x86 processor architecture.  Xen can securely
    2.41  execute multiple virtual machines on a single physical system with