ia64/xen-unstable

changeset 2832:fed2ee41e162

bitkeeper revision 1.1159.1.341 (4187f5daw4cUbERbgd6PIWn6FkBBvw)

Merge tempest.cl.cam.ac.uk:/auto/groups/xeno/BK/xeno.bk
into tempest.cl.cam.ac.uk:/local/scratch/smh22/xeno.bk
author smh22@tempest.cl.cam.ac.uk
date Tue Nov 02 21:02:18 2004 +0000 (2004-11-02)
parents 565da10c07de 6ab98626161f
children c9133ff3a758
files docs/src/interface.tex
line diff
     1.1 --- a/docs/src/interface.tex	Tue Nov 02 21:00:50 2004 +0000
     1.2 +++ b/docs/src/interface.tex	Tue Nov 02 21:02:18 2004 +0000
     1.3 @@ -778,140 +778,231 @@ valid.
     1.4  
     1.5  \end{quote}
     1.6  
     1.7 +Guest OSes can use the above in place of context switching entire 
     1.8 +LDTs (or the GDT) when the number of changing descriptors is small. 
     1.9  
    1.10  \section{Context Switching} 
    1.11  
    1.12 +When a guest OS wishes to context switch between two processes, 
    1.13 +it can use the page table and segmentation hypercalls described
    1.14 +above to perform the bulk of the privileged work. In addition, 
    1.15 +however, it will need to invoke Xen to switch the kernel (ring 1) 
    1.16 +stack pointer: 
    1.17  
    1.18 -
    1.19 +\begin{quote} 
    1.20  \hypercall{stack\_switch(unsigned long ss, unsigned long esp)} 
    1.21  
    1.22 -Request context switch from hypervisor.
    1.23 +Request kernel stack switch from hypervisor; {\tt ss} is the new 
    1.24 +stack segment, while {\tt esp} is the new stack pointer. 
    1.25 +
    1.26 +\end{quote} 
    1.27 +
    1.28 +A final useful hypercall for context switching allows ``lazy'' 
    1.29 +save and restore of floating point state: 
    1.30 +
    1.31 +\begin{quote}
    1.32 +\hypercall{fpu\_taskswitch(void)} 
    1.33 +
    1.34 +This call instructs Xen to set the {\tt TS} bit in the {\tt cr0}
    1.35 +control register; this means that the next attempt to use floating
    1.36 +point will cause a trap which the guest OS can handle. Typically it will
    1.37 +then save/restore the FP state, and clear the {\tt TS} bit. 
    1.38 +\end{quote} 
    1.39 +
    1.40 +This is provided as an optimization only; guest OSes can also choose
    1.41 +to save and restore FP state on all context switches for simplicity. 
    1.42  
    1.43  
    1.44 -\hypercall{fpu\_taskswitch(void)} 
    1.45 +\section{Physical Memory Management}
    1.46 +
    1.47 +As mentioned previously, each domain has a maximum and current 
    1.48 +memory allocation. The maximum allocation, set at domain creation 
    1.49 +time, cannot be modified. However, a domain can choose to reduce 
    1.50 +and subsequently grow its current allocation by using the
    1.51 +following call: 
    1.52 +
    1.53 +\begin{quote} 
    1.54 +\hypercall{dom\_mem\_op(unsigned int op, unsigned long *extent\_list,
    1.55 +  unsigned long nr\_extents, unsigned int extent\_order)}
    1.56  
    1.57 -Notify hypervisor that fpu registers needed to be save on context switch.
    1.58 +Increase or decrease current memory allocation (as determined by 
    1.59 +the value of {\tt op}). Each invocation provides a list of 
    1.60 +extents each of which is $2^s$ pages in size, 
    1.61 +where $s$ is the value of {\tt extent\_order}. 
    1.62  
    1.63 +\end{quote} 
    1.64  
    1.65 +In addition to simply reducing or increasing the current memory
    1.66 +allocation via a `balloon driver', this call is also useful for 
    1.67 +obtaining contiguous regions of machine memory when required (e.g. 
    1.68 +for certain PCI devices, or if using superpages).  
    1.69  
    1.70  
    1.71  \section{Inter-Domain Communication}
    1.72  \label{s:idc} 
    1.73  
    1.74 -
    1.75 -\hypercall{event\_channel\_op(void *op)} 
    1.76 -
    1.77 -Inter-domain event-channel management.
    1.78 +Xen provides a simple asynchronous notification mechanism via
    1.79 +\emph{event channels}. Each domain has a set of end-points (or
    1.80 +\emph{ports}) which may be bound to an event source (e.g. a physical
    1.81 +IRQ, a virtual IRQ, or a port in another domain). When a pair of
    1.82 +end-points in two different domains are bound together, then a `send'
    1.83 +operation on one will cause an event to be received by the destination
    1.84 +domain.
    1.85  
    1.86 +The control and use of event channels involves the following hypercall: 
    1.87  
    1.88 -\hypercall{grant\_table\_op(unsigned int cmd, void *uop, unsigned int count)}
    1.89 +\begin{quote}
    1.90 +\hypercall{event\_channel\_op(evtchn\_op\_t *op)} 
    1.91  
    1.92 +Inter-domain event-channel management; {\tt op} is a discriminated 
    1.93 +union which allows the following 7 operations: 
    1.94  
    1.95 +\begin{description} 
    1.96  
    1.97 -\section{Physical Memory Management}
    1.98 -
    1.99 -\hypercall{dom\_mem\_op(unsigned int op, unsigned long *extent\_list,
   1.100 -unsigned long nr\_extents, unsigned int extent\_order)}
   1.101 +\item[\it alloc\_unbound:] allocate a free (unbound) local
   1.102 +  port and prepare for connection from a specified domain. 
   1.103 +\item[\it bind\_virq:] bind a local port to a virtual 
   1.104 +IRQ; any particular VIRQ can be bound to at most one port per domain. 
   1.105 +\item[\it bind\_pirq:] bind a local port to a physical IRQ;
   1.106 +once more, a given pIRQ can be bound to at most one port per
   1.107 +domain. Furthermore the calling domain must be sufficiently
   1.108 +privileged.
   1.109 +\item[\it bind\_interdomain:] construct an interdomain event 
   1.110 +channel; in general, the target domain must have previously allocated 
   1.111 +an unbound port for this channel, although this can be bypassed by 
   1.112 +privileged domains during domain setup. 
   1.113 +\item[\it close:] close an interdomain event channel. 
   1.114 +\item[\it send:] send an event to the remote end of an 
   1.115 +interdomain event channel. 
   1.116 +\item[\it status:] determine the current status of a local port. 
   1.117 +\end{description} 
   1.118  
   1.119 -Increase or decrease memory reservations for guest OS
   1.120 +For more details see
   1.121 +{\tt xen/include/public/event\_channel.h}. 
   1.122  
   1.123 +\end{quote} 
   1.124  
   1.125 +Event channels are the fundamental communication primitive between 
   1.126 +Xen domains and seamlessly support SMP. However they provide little
   1.127 +bandwidth for communication {\sl per se}, and hence are typically 
   1.128 +married with a piece of shared memory to produce effective and 
   1.129 +high-performance inter-domain communication. 
   1.130  
   1.131 +Safe sharing of memory pages between guest OSes is carried out by granting
   1.132 +access on a per-page basis to individual domains. This is achieved 
   1.133 +by using the {\tt grant\_table\_op()} hypercall. 
   1.134  
   1.135 +%\hypercall{grant\_table\_op(unsigned int cmd, void *uop, unsigned int count)}
   1.136  
   1.137  
   1.138  \section{Administrative Operations}
   1.139  \label{s:dom0ops}
   1.140  
   1.141 +A large number of control operations are available to a sufficiently
   1.142 +privileged domain (typically domain-0). These allow the creation and
   1.143 +management of new domains, for example. A complete list is given 
   1.144 +below: for more details on any or all of these, please see 
   1.145 +{\tt xen/include/public/dom0\_ops.h}. 
   1.146 +
   1.147 +
   1.148 +\begin{quote}
   1.149  \hypercall{dom0\_op(dom0\_op\_t *op)} 
   1.150  
   1.151  Administrative domain operations for domain management. The options are:
   1.152  
   1.153 -{\it DOM0\_CREATEDOMAIN}: create new domain, specifying the name and memory usage
   1.154 -in kilobytes.
   1.155 +\begin{description} 
   1.156 +\item [\it DOM0\_CREATEDOMAIN:] create a new domain
   1.157  
   1.158 -{\it DOM0\_CREATEDOMAIN}: create domain
   1.159 +\item [\it DOM0\_PAUSEDOMAIN:] remove a domain from the scheduler run 
   1.160 +queue. 
   1.161  
   1.162 -{\it DOM0\_PAUSEDOMAIN}: mark domain as unschedulable
   1.163 +\item [\it DOM0\_UNPAUSEDOMAIN:] mark a paused domain as schedulable
   1.164 +  once again. 
   1.165  
   1.166 -{\it DOM0\_UNPAUSEDOMAIN}: mark domain as schedulable
   1.167 +\item [\it DOM0\_DESTROYDOMAIN:] deallocate all resources associated
   1.168 +with a domain
   1.169  
   1.170 -{\it DOM0\_DESTROYDOMAIN}: deallocate resources associated with the domain
   1.171 +\item [\it DOM0\_GETMEMLIST:] get list of pages used by the domain
   1.172  
   1.173 -{\it DOM0\_GETMEMLIST}: get list of pages used by the domain
   1.174 +\item [\it DOM0\_SCHEDCTL:]
   1.175  
   1.176 -{\it DOM0\_SCHEDCTL}:
   1.177 +\item [\it DOM0\_ADJUSTDOM:] adjust scheduling priorities for domain
   1.178  
   1.179 -{\it DOM0\_ADJUSTDOM}: adjust scheduling priorities for domain
   1.180 +\item [\it DOM0\_BUILDDOMAIN:] do final guest OS setup for domain
   1.181  
   1.182 -{\it DOM0\_BUILDDOMAIN}: do final guest OS setup for domain
   1.183 +\item [\it DOM0\_GETDOMAINFO:] get statistics about the domain
   1.184  
   1.185 -{\it DOM0\_GETDOMAINFO}: get statistics about the domain
   1.186 +\item [\it DOM0\_GETPAGEFRAMEINFO:] 
   1.187  
   1.188 -{\it DOM0\_GETPAGEFRAMEINFO}:
   1.189 +\item [\it DOM0\_GETPAGEFRAMEINFO2:]
   1.190  
   1.191 -{\it DOM0\_IOPL}: set IO privilege level
   1.192 +\item [\it DOM0\_IOPL:] set I/O privilege level
   1.193  
   1.194 -{\it DOM0\_MSR}:
   1.195 +\item [\it DOM0\_MSR:] read or write model specific registers
   1.196  
   1.197 -{\it DOM0\_DEBUG}: interactively call pervasive debugger
   1.198 +\item [\it DOM0\_DEBUG:] interactively invoke the debugger
   1.199  
   1.200 -{\it DOM0\_SETTIME}: set system time
   1.201 +\item [\it DOM0\_SETTIME:] set system time
   1.202  
   1.203 -{\it DOM0\_READCONSOLE}: read console content from hypervisor buffer ring
   1.204 +\item [\it DOM0\_READCONSOLE:] read console content from hypervisor buffer ring
   1.205  
   1.206 -{\it DOM0\_PINCPUDOMAIN}: pin domain to a particular CPU
   1.207 +\item [\it DOM0\_PINCPUDOMAIN:] pin domain to a particular CPU
   1.208  
   1.209 -{\it DOM0\_GETTBUFS}: get information about the size and location of
   1.210 +\item [\it DOM0\_GETTBUFS:] get information about the size and location of
   1.211                        the trace buffers (only on trace-buffer enabled builds)
   1.212  
   1.213 -{\it DOM0\_PHYSINFO}: get information about the host machine
   1.214 +\item [\it DOM0\_PHYSINFO:] get information about the host machine
   1.215  
   1.216 -{\it DOM0\_PCIDEV\_ACCESS}: modify PCI device access permissions
   1.217 +\item [\it DOM0\_PCIDEV\_ACCESS:] modify PCI device access permissions
   1.218  
   1.219 -{\it DOM0\_SCHED\_ID}: get the ID of the current Xen scheduler
   1.220 +\item [\it DOM0\_SCHED\_ID:] get the ID of the current Xen scheduler
   1.221  
   1.222 -{\it DOM0\_SHADOW\_CONTROL}:
   1.223 +\item [\it DOM0\_SHADOW\_CONTROL:] switch between shadow page-table modes
   1.224  
   1.225 -{\it DOM0\_SETDOMAINNAME}: set the name of a domain
   1.226 +\item [\it DOM0\_SETDOMAININITIALMEM:] set initial memory allocation of a domain
   1.227  
   1.228 -{\it DOM0\_SETDOMAININITIALMEM}: set initial memory allocation of a domain
   1.229 +\item [\it DOM0\_SETDOMAINMAXMEM:] set maximum memory allocation of a domain
   1.230  
   1.231 -{\it DOM0\_SETDOMAINMAXMEM}: set maximum memory allocation of a domain
   1.232 -
   1.233 -{\it DOM0\_GETPAGEFRAMEINFO2}:
   1.234 -
   1.235 -{\it DOM0\_SETDOMAINVMASSIST}: set domain VM assist options
   1.236 +\item [\it DOM0\_SETDOMAINVMASSIST:] set domain VM assist options
   1.237 +\end{description} 
   1.238 +\end{quote} 
   1.239  
   1.240  
   1.241  
   1.242  
   1.243 -\section{Miscellaneous Hypercalls} 
   1.244 +
   1.245 +\section{Debugging Hypercalls} 
   1.246  
   1.247 +A few additional hypercalls are mainly useful for debugging: 
   1.248  
   1.249 +\begin{quote} 
   1.250  \hypercall{console\_io(int cmd, int count, char *str)}
   1.251  
   1.252 -Interact with the console, operations are:
   1.253 +Use Xen to interact with the console; operations are:
   1.254  
   1.255  {\it CONSOLEIO\_write}: Output count characters from buffer str.
   1.256  
   1.257  {\it CONSOLEIO\_read}: Input at most count characters into buffer str.
   1.258 -
   1.259 +\end{quote} 
   1.260  
   1.261 -
   1.262 +A pair of hypercalls allows access to the underlying debug registers: 
   1.263 +\begin{quote}
   1.264  \hypercall{set\_debugreg(int reg, unsigned long value)}
   1.265  
   1.266 -set debug register reg to value
   1.267 -
   1.268 +Set debug register {\tt reg} to {\tt value} 
   1.269  
   1.270  \hypercall{get\_debugreg(int reg)}
   1.271  
   1.272 - get the debug register reg
   1.273 +Return the contents of the debug register {\tt reg}
   1.274 +\end{quote}
   1.275  
   1.276 -
   1.277 +And finally, a sometimes useful call is: 
   1.278 +\begin{quote}
   1.279  \hypercall{xen\_version(int cmd)}
   1.280  
   1.281  Request Xen version number.
   1.282 +\end{quote} 
   1.283 +
   1.284  
   1.285  
   1.286
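
The dom_mem_op() text in this diff says each extent is 2^s pages, where s is the value of extent_order. A minimal C sketch of that arithmetic follows. The helper names (extent_pages, extent_bytes, total_pages) and the PAGE_SIZE_BYTES constant are hypothetical and not part of the Xen interface, and the 4 KB page size is an assumption for 32-bit x86. The sketch does not invoke the hypercall itself.

```c
#include <assert.h>
#include <limits.h>

/* Assumed page size (4 KB); the diff above does not state it. */
#define PAGE_SIZE_BYTES 4096UL

/* Hypothetical helper: number of pages in one extent, i.e. 2^extent_order. */
unsigned long extent_pages(unsigned int extent_order)
{
    /* Guard against shifting by the full width of the type or more. */
    assert(extent_order < sizeof(unsigned long) * CHAR_BIT);
    return 1UL << extent_order;
}

/* Hypothetical helper: size of one extent in bytes under the assumed page size. */
unsigned long extent_bytes(unsigned int extent_order)
{
    return extent_pages(extent_order) * PAGE_SIZE_BYTES;
}

/* Hypothetical helper: total pages covered by nr_extents extents of one order. */
unsigned long total_pages(unsigned long nr_extents, unsigned int extent_order)
{
    return nr_extents * extent_pages(extent_order);
}
```

Under the assumed 4 KB page size, an extent_order of 0 describes single pages, as a balloon driver might use, while an order of 10 describes a 4 MB contiguous extent, the kind of region the text mentions for superpages.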