% docs/src/user/domain_filesystem.tex @ 7727:d7bcc7bbf981
% Changeset summary: Fix region0 virtual accesses
% Signed-off by: Dan Magenheimer <dan.magenheimer@hp.com>
% Author: djm@kirby.fc.hp.com
% Date: Fri Nov 11 12:51:08 2005 -0600 (2005-11-11)
\chapter{Domain Filesystem Storage}

It is possible to directly export any Linux block device in dom0 to
another domain, or to export file systems and devices to virtual machines
using standard network protocols (e.g.\ NBD, iSCSI, NFS). This
chapter covers some of the possibilities.

\section{Exporting Physical Devices as VBDs}
\label{s:exporting-physical-devices-as-vbds}

One of the simplest configurations is to directly export individual
partitions from domain~0 to other domains. To achieve this, use the
\path{phy:} specifier in your domain configuration file. For example, a
line like
\begin{quote}
\verb_disk = ['phy:hda3,sda1,w']_
\end{quote}
specifies that the partition \path{/dev/hda3} in domain~0 should be
exported read-write to the new domain as \path{/dev/sda1}; one could
equally well export it as \path{/dev/hda} or \path{/dev/sdb5} should
one wish.

In addition to local disks and partitions, it is possible to export
any device that Linux considers to be ``a disk'' in the same manner.
For example, if you have iSCSI disks or GNBD volumes imported into
domain~0, you can export these to other domains using the \path{phy:}
disk syntax. E.g.:
\begin{quote}
\verb_disk = ['phy:vg/lvm1,sda2,w']_
\end{quote}

\begin{center}
\framebox{\bf Warning: Block device sharing}
\end{center}
\begin{quote}
Block devices should typically only be shared between domains in a
read-only fashion; otherwise the Linux kernel's file systems will get
very confused as the file system structure may change underneath
them (having the same ext3 partition mounted \path{rw} twice is a
sure-fire way to cause irreparable damage)! \Xend\ will attempt to
prevent you from doing this by checking that the device is not
mounted read-write in domain~0, and hasn't already been exported
read-write to another domain. If you want read-write sharing,
export the directory to other domains via NFS from domain~0 (or use
a cluster file system such as GFS or ocfs2).
\end{quote}

\section{Using File-backed VBDs}

It is also possible to use a file in domain~0 as the primary storage
for a virtual machine. As well as being convenient, this also has the
advantage that the virtual block device will be \emph{sparse}: space
is only allocated as parts of the file are used. So if a virtual
machine uses only half of its disk space, the file takes up only
about half of its nominal size.

For example, to create a 2GB sparse file-backed virtual block device
(which initially consumes only about 1KB of disk):
\begin{quote}
\verb_# dd if=/dev/zero of=vm1disk bs=1k seek=2048k count=1_
\end{quote}
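
To confirm that the file really is sparse, compare its apparent size
with the space actually allocated on disk. The following is a minimal
sketch written as a script rather than as commands at a root prompt.
It assumes GNU \path{dd}, \path{stat} and \path{du}, and the scratch
directory is only an illustration. The apparent size should be 2GB
plus the single 1KB block written; the allocated figure depends on the
file system holding the file but is typically only a few KB.
\begin{quote}
\begin{verbatim}
#!/bin/sh
# Sketch: create the sparse file in a scratch directory (hypothetical
# location) and compare its apparent size with its allocated size.
img="$(mktemp -d)/vm1disk"
dd if=/dev/zero of="$img" bs=1k seek=2048k count=1 2>/dev/null
apparent=$(stat -c %s "$img")        # apparent size in bytes
allocated=$(du -k "$img" | cut -f1)  # allocated space in KB
echo "apparent: $apparent bytes, allocated: $allocated KB"
\end{verbatim}
\end{quote}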

Make a file system in the disk file:
\begin{quote}
\verb_# mkfs -t ext3 vm1disk_
\end{quote}

(If the tool asks for confirmation because the target is not a block
device, answer `y'.)

Populate the file system, e.g.\ by copying from the current root:
\begin{quote}
\begin{verbatim}
# mount -o loop vm1disk /mnt
# cp -ax /{root,dev,var,etc,usr,bin,sbin,lib} /mnt
# mkdir /mnt/{proc,sys,home,tmp}
\end{verbatim}
\end{quote}

Tailor the file system by editing \path{/etc/fstab},
\path{/etc/hostname}, and so on. Remember to edit the files in the
mounted file system rather than those in your domain~0 file system,
e.g.\ you would edit \path{/mnt/etc/fstab} instead of
\path{/etc/fstab}. For this example, set \path{/dev/sda1} as the root
device in \path{fstab}.

Now unmount (this is important!):
\begin{quote}
\verb_# umount /mnt_
\end{quote}

In the configuration file set:
\begin{quote}
\verb_disk = ['file:/full/path/to/vm1disk,sda1,w']_
\end{quote}

As the virtual machine writes to its `disk', the sparse file will be
filled in and consume more space up to the original 2GB.

{\bf Note that file-backed VBDs may not be appropriate for backing
I/O-intensive domains.} File-backed VBDs are known to experience
substantial slowdowns under heavy I/O workloads, due to the I/O
handling by the loopback block device used to support file-backed VBDs
in dom0. Better I/O performance can be achieved by using either
LVM-backed VBDs (Section~\ref{s:using-lvm-backed-vbds}) or physical
devices as VBDs (Section~\ref{s:exporting-physical-devices-as-vbds}).

Linux supports a maximum of eight file-backed VBDs across all domains
by default. This limit can be statically increased by using the
\emph{max\_loop} module parameter if CONFIG\_BLK\_DEV\_LOOP is
compiled as a module in the dom0 kernel, or by using the
\emph{max\_loop=n} boot option if CONFIG\_BLK\_DEV\_LOOP is compiled
directly into the dom0 kernel.
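
For instance, to raise the limit to 64 loop devices (the value 64 is
purely illustrative), something along the following lines should work
when the loop driver is built as a module and no loop devices are
currently in use. This is a sketch rather than a tested recipe for any
particular distribution:
\begin{quote}
\begin{verbatim}
# rmmod loop
# modprobe loop max_loop=64
\end{verbatim}
\end{quote}
If the driver is compiled into the dom0 kernel, the equivalent is to
append \verb|max_loop=64| to the dom0 kernel command line in the boot
loader configuration.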

\section{Using LVM-backed VBDs}
\label{s:using-lvm-backed-vbds}

A particularly appealing solution is to use LVM volumes as backing for
domain file systems, since this allows dynamic growing and shrinking of
volumes as well as snapshots and other features.

To initialize a partition to support LVM volumes:
\begin{quote}
\begin{verbatim}
# pvcreate /dev/sda10
\end{verbatim}
\end{quote}

Create a volume group named `vg' on the physical partition:
\begin{quote}
\begin{verbatim}
# vgcreate vg /dev/sda10
\end{verbatim}
\end{quote}

Create a logical volume of size 4GB named `myvmdisk1':
\begin{quote}
\begin{verbatim}
# lvcreate -L4096M -n myvmdisk1 vg
\end{verbatim}
\end{quote}

You should now see that you have a \path{/dev/vg/myvmdisk1} device.
Make a file system, mount it and populate it, e.g.:
\begin{quote}
\begin{verbatim}
# mkfs -t ext3 /dev/vg/myvmdisk1
# mount /dev/vg/myvmdisk1 /mnt
# cp -ax / /mnt
# umount /mnt
\end{verbatim}
\end{quote}

Now configure your VM with the following disk configuration:
\begin{quote}
\begin{verbatim}
disk = [ 'phy:vg/myvmdisk1,sda1,w' ]
\end{verbatim}
\end{quote}

LVM enables you to grow the size of logical volumes, but you'll need
to resize the corresponding file system to make use of the new space.
Some file systems (e.g.\ ext3) now support online resize. See the LVM
manuals for more details.
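
As an illustration, growing the volume created above by 1GB and then
enlarging its ext3 file system offline could look like the following
sketch. It assumes that the domain using the volume is shut down, that
the volume is not mounted in domain~0, and that the volume group has
at least 1GB free; the size is arbitrary:
\begin{quote}
\begin{verbatim}
# lvextend -L+1G /dev/vg/myvmdisk1
# e2fsck -f /dev/vg/myvmdisk1
# resize2fs /dev/vg/myvmdisk1
\end{verbatim}
\end{quote}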

You can also use LVM for creating copy-on-write (CoW) clones of LVM
volumes (known as writable persistent snapshots in LVM terminology).
This facility is new in Linux 2.6.8, so isn't as stable as one might
hope. In particular, using lots of CoW LVM disks consumes a lot of
dom0 memory, and error conditions such as running out of disk space
are not handled well. Hopefully this will improve in the future.

To create two copy-on-write clones of the above file system you would
use the following commands:

\begin{quote}
\begin{verbatim}
# lvcreate -s -L1024M -n myclonedisk1 /dev/vg/myvmdisk1
# lvcreate -s -L1024M -n myclonedisk2 /dev/vg/myvmdisk1
\end{verbatim}
\end{quote}

Each of these can grow to have 1GB of differences from the master
volume. You can grow the amount of space for storing the differences
using the \path{lvextend} command, e.g.:
\begin{quote}
\begin{verbatim}
# lvextend -L+100M /dev/vg/myclonedisk1
\end{verbatim}
\end{quote}

Don't let the `differences volume' ever fill up, otherwise LVM gets
rather confused. It may be possible to automate the growing process by
using \path{dmsetup wait} to spot the volume getting full and then
issuing an \path{lvextend}.
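
One simple alternative to \path{dmsetup wait} is to poll the snapshot
usage periodically, for example from cron, and extend the volume once
it passes a threshold. The script below is only a sketch of that
polling approach: the names \verb|needs_extend| and
\verb|check_snapshot|, the 80\% threshold and the 100MB step are all
hypothetical, and it assumes that the \path{lvs} field
\verb|snap_percent| reports how full the snapshot is. After sourcing
it, a call such as \verb|check_snapshot vg/myclonedisk1| checks one
clone.
\begin{quote}
\begin{verbatim}
#!/bin/sh
# Sketch: decide whether a snapshot needs more room, by polling rather
# than by using dmsetup wait. THRESHOLD and STEP are illustrative.
THRESHOLD=80      # extend once the snapshot is this full (percent)
STEP=100M         # amount added by each lvextend

# Succeeds if the usage figure (e.g. "85.31") has reached the threshold.
needs_extend() {
    usage=${1%%.*}            # drop the fractional part
    [ "${usage:-0}" -ge "$2" ]
}

# Check one snapshot and grow it if required. Assumes that the lvs
# field snap_percent reports how full the snapshot is.
check_snapshot() {
    pct=$(lvs --noheadings -o snap_percent "$1" | tr -d ' ')
    if needs_extend "$pct" "$THRESHOLD"; then
        lvextend -L+"$STEP" "$1"
    fi
}
\end{verbatim}
\end{quote}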

In principle, it is possible to continue writing to the volume that
has been cloned (the changes will not be visible to the clones), but
we wouldn't recommend this: have the cloned volume as a `pristine'
file system install that isn't mounted directly by any of the virtual
machines.

\section{Using NFS Root}

First, populate a root file system in a directory on the server
machine. The server can be a distinct physical machine, or simply run
within a virtual machine on the same node.

Now configure the NFS server to export this file system over the
network by adding a line to \path{/etc/exports}, for instance:

\begin{quote}
\begin{small}
\begin{verbatim}
/export/vm1root (rw,sync,no_root_squash)
\end{verbatim}
\end{small}
\end{quote}
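
After editing \path{/etc/exports}, the NFS server needs to re-read it.
On most Linux NFS servers this can be done, and the result checked,
roughly as follows. This is a sketch to be run on the server, assuming
the standard \path{exportfs} tool is installed:
\begin{quote}
\begin{verbatim}
# exportfs -ra
# exportfs -v
\end{verbatim}
\end{quote}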

Finally, configure the domain to use NFS root. In addition to the
normal variables, you should make sure to set the following values in
the domain's configuration file:

\begin{quote}
\begin{small}
\begin{verbatim}
root       = '/dev/nfs'
nfs_server = ''               # substitute IP address of server
nfs_root   = '/path/to/root'  # path to root FS on the server
\end{verbatim}
\end{small}
\end{quote}

The domain will need network access at boot time, so either statically
configure an IP address using the config variables \path{ip},
\path{netmask}, \path{gateway}, \path{hostname}; or enable DHCP
(\path{dhcp='dhcp'}).

Note that the Linux NFS root implementation is known to have stability
problems under high load (this is not a Xen-specific problem), so this
configuration may not be appropriate for critical servers.