
bitkeeper revision 1.1108.43.1 (410a5973b_ww-XNociMt5BotV87vBQ)

author mwilli2@equilibrium.research.intel-research.net
date Fri Jul 30 14:21:39 2004 +0000 (2004-07-30)
parents 1d1e0a1795b8
Virtual Block Devices / Virtual Disks in Xen - HOWTO
====================================================

HOWTO for Xen 1.2

Mark A. Williamson (mark.a.williamson@intel.com)
(C) Intel Research Cambridge 2004
Introduction
------------

This document describes the new Virtual Block Device (VBD) and Virtual Disk
features available in Xen release 1.2. First, a brief introduction to some
basic disk concepts on a Xen system:
Virtual Block Devices (VBDs):
    VBDs are the disk abstraction provided by Xen. All XenoLinux disk
    accesses go through the VBD driver. Using the VBD functionality, it is
    possible to selectively grant domains access to portions of the
    physical disks in the system.

    A virtual block device can also consist of multiple extents from the
    physical disks in the system, allowing them to be accessed as a single
    uniform device from the domain with access to that VBD. The
    functionality is somewhat similar to that underpinning LVM, since,
    from the point of view of a guest virtual machine, multiple regions
    from physical devices can be combined into a single logical device.

    Everyone who boots Xen / XenoLinux from a hard drive uses VBDs, but
    for some uses they can almost be ignored.
Virtual Disks (VDs):
    VDs are an abstraction built on top of the functionality provided by
    VBDs. The VD management code maintains a "free pool" of disk space on
    the system that has been reserved for use with VDs. The tools can
    automatically allocate collections of extents from this free pool to
    create "virtual disks" on demand.

    VDs can then be used just like normal disks by domains. VDs appear
    just like any other disk to guest domains, since they use the same VBD
    abstraction provided by Xen.

    Using VDs is optional, since it is always possible to dedicate
    partitions, or entire disks, to your virtual machines. VDs are handy
    when you have a dynamically changing set of virtual machines and you
    don't want to keep repartitioning in order to provide them with disk
    space.

    Virtual Disks are rather like "logical volumes" in LVM.

If that didn't all make sense, it doesn't matter too much ;-) Using the
functionality is fairly straightforward and some examples will clarify things.
The text below expands a bit on the concepts involved, finishing up with a
walk-through of some simple virtual disk management tasks.
Virtual Block Devices
---------------------

Before covering VD management, it is worth discussing some aspects of the VBD
functionality that are useful to know.

A VBD is made up of a number of extents from physical disk devices. The
extents for a VBD don't have to be contiguous, or even on the same device;
Xen performs address translation so that they appear as a single contiguous
device to a domain.

When the VBD layer is used to give access to entire drives or entire
partitions, the VBD simply consists of a single extent that corresponds to
the drive or partition used. Lists of extents are usually only needed when
virtual disks (VDs) are being used.
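As a mental model of the address translation described above, here is a small
illustrative sketch (hypothetical Python, not Xen's actual implementation): a
VBD is an ordered list of extents, and a virtual sector number is mapped to a
(device, physical sector) pair.

```python
# Hypothetical sketch of VBD extent translation -- not Xen's actual code.
# A VBD is an ordered list of extents; each extent is (device, start, length)
# in sectors. A virtual sector number maps to (device, physical sector).

def translate(extents, vsector):
    """Map a virtual sector to (device, physical_sector), or raise."""
    base = 0
    for device, start, length in extents:
        if vsector < base + length:
            return (device, start + (vsector - base))
        base += length
    raise ValueError("sector %d beyond end of VBD" % vsector)

# A VBD built from two non-contiguous extents on different disks:
vbd = [("hda", 1000, 500), ("hdb", 0, 500)]
print(translate(vbd, 0))    # -> ('hda', 1000); first sector of first extent
print(translate(vbd, 600))  # -> ('hdb', 100); falls into the second extent
```

The guest only ever sees sectors 0..999 of one uniform device; where they
really live is invisible to it.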
Xen 1.2 and its associated XenoLinux release support automatic registration
and removal of VBDs. It has always been possible to add a VBD to a running
XenoLinux domain, but it was previously necessary to run the
"xen_vbd_refresh" tool for the new device to be detected. Now, when a VBD is
added, the domain it is added to registers the disk automatically, with no
special action by the user required.
Note that it is possible to use the VBD functionality to allow multiple
domains write access to the same areas of disk. This is almost always a bad
thing! The provided example scripts for creating domains do their best to
check that disk areas are not shared unsafely and will catch many cases of
this. Setting the vbd_expert variable in config files for xc_dom_create.py
controls how unsafe VBD mappings are allowed to be - 0 (only read-only
sharing allowed) should be right for most people ;-). Level 1 attempts to
allow at most one writer to any area of disk. Level 2 allows multiple
writers (i.e. anything!).
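To make the three levels concrete, here is an illustrative sketch
(hypothetical Python; the real checks live in the example scripts and differ
in detail) of how such a sharing check might decide whether a new mapping is
allowed:

```python
# Hypothetical sketch of the vbd_expert sharing levels -- NOT the actual
# xc_dom_create.py checking code. 'existing' is a list of
# (extent, writable) mappings already granted to some domain.

def mapping_allowed(existing, extent, writable, vbd_expert=0):
    """Decide whether a new mapping of 'extent' may be granted."""
    overlapping = [w for (e, w) in existing if e == extent]
    if vbd_expert >= 2:
        return True                       # level 2: anything goes
    if vbd_expert == 1:
        # level 1: at most one writer to any area of disk
        return not (writable and any(overlapping))
    # level 0: only read-only sharing is allowed
    return not overlapping or (not writable and not any(overlapping))
```

For instance, at level 0 a second read-only mapping of the same extent is
accepted, but any combination involving a writer is refused.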
Virtual Disk Management
-----------------------

The VD management code runs entirely in user space. The code is written in
Python and can therefore be accessed from custom scripts, as well as from
the convenience scripts provided. The underlying VD database is a SQLite
database in /var/db/xen_vdisks.sqlite.

Most virtual disk management can be performed using the xc_vd_tool.py script
provided in the tools/examples/ directory of the source tree. It supports
the following operations:
initialise  - "Formats" a partition or disk device for use in storing
              virtual disks. This does not actually write data to the
              specified device; rather, it adds the device to the VD
              free-space pool, for later allocation.

              You should only add devices that correspond directly to
              physical disks / partitions - trying to use a VBD that you
              have created yourself as part of the free space pool has
              undefined (possibly nasty) results.

create      - Creates a virtual disk of the specified size by allocating
              space from the free space pool. The virtual disk is
              identified in future by the unique ID returned by this
              script.

              The disk can be given an expiry time, if desired. For most
              users, the best idea is to specify a time of 0 (which has
              the special meaning "never expire") and then explicitly
              delete the VD when finished with it - otherwise, VDs will
              disappear if allowed to expire.

delete      - Explicitly deletes a VD. Makes it disappear immediately!

setexpiry   - Allows the expiry time of a (not yet expired) virtual disk
              to be modified. Be aware that the VD will disappear when
              the time has expired.

enlarge     - Increases the space allocated to a virtual disk. Currently
              this will not be immediately visible to running domain(s)
              using it. You can make it visible by destroying the
              corresponding VBDs and then using xc_dom_control.py to add
              them to the domain again. Note: doing this to filesystems
              that are in use may well cause errors in the guest Linux,
              or even a crash, although it will probably be OK if you
              stop the domain before updating the VBD and restart it
              afterwards.

import      - Allocates a virtual disk and populates it with the contents
              of a disk file. This can be used to import root file system
              images or to restore backups of virtual disks, for
              instance.

export      - Writes the contents of a virtual disk out to a disk file.
              Useful for creating disk images for use elsewhere, such as
              standard root file systems and backups.

list        - Lists the non-expired virtual disks currently available in
              the system.

undelete    - Attempts to recover an expired (or deleted) virtual disk.

freespace   - Gets the free space (in megabytes) available for allocating
              new virtual disk extents.
The functionality provided by these scripts is also available directly from
Python functions in the xenctl.utils module - you can use this functionality
in your own scripts.
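The create / delete / freespace cycle can be pictured with a toy model
(illustrative only - the real implementation keeps its state in the SQLite
database, and the extent size used here is an assumption, not the real
granularity):

```python
# Toy model of the VD free pool -- illustrative only; the real code keeps
# its state in /var/db/xen_vdisks.sqlite.
EXTENT_MB = 64  # assumed extent granularity; the real size may differ

class VDPool:
    def __init__(self):
        self.free = []      # extents available for allocation
        self.disks = {}     # vd_id -> list of extents
        self.next_id = 0

    def initialise(self, device, size_mb):
        # "Formatting" only records the device's extents as free space;
        # nothing is written to the device itself.
        self.free += [(device, i) for i in range(size_mb // EXTENT_MB)]

    def create(self, size_mb):
        n = -(-size_mb // EXTENT_MB)   # round up to whole extents
        if n > len(self.free):
            raise RuntimeError("insufficient free space")
        vd_id, self.next_id = str(self.next_id), self.next_id + 1
        self.disks[vd_id] = [self.free.pop() for _ in range(n)]
        return vd_id

    def delete(self, vd_id):
        # Extents go straight back to the free pool (which is why
        # "undelete" only works until they are reallocated).
        self.free += self.disks.pop(vd_id)

    def freespace(self):
        return len(self.free) * EXTENT_MB
```

Creating a 100 MB disk from a 256 MB pool consumes two 64 MB extents;
deleting it returns them to the pool.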
Populating VDs:

Once you've created a VD, you might want to populate it from DOM0 (for
instance, to put a root file system onto it for a guest domain). This can be
done by creating a VBD through which DOM0 can access the VD - this is
discussed below.

More detail on how virtual disks work:

When you "format" a device for virtual disks, the device is logically split
up into extents. These extents are recorded in the Virtual Disk Management
database in /var/db/xen_vdisks.sqlite.

When you use xc_vd_tool.py to create a virtual disk, some of the extents in
the free space pool are reallocated to that virtual disk and a record for
that VD is added to the database. When VDs are mapped into domains as VBDs,
the system looks up the allocated extents for the virtual disk in order to
set up the underlying VBD.
Free space is identified by the fact that it belongs to an "expired" disk.
When xc_vd_tool.py "initialises" a real device into the free pool, it
actually divides the device into extents and adds them to an already-expired
virtual disk. The device is not written to during this operation - its
availability is simply recorded in the virtual disks database.

If you set an expiry time on a VD, its extents become liable to be
reallocated to new VDs as soon as that expiry time runs out. Therefore, be
careful when setting expiry times! Many users will find it simplest to set
all VDs to never expire automatically, then explicitly delete them later on.

Deleted / expired virtual disks may sometimes be undeleted - currently this
only works when none of the virtual disk's extents have been reallocated to
other virtual disks, since that is the only situation in which the disk is
likely to be fully intact. You should try undeletion as soon as you realise
you've mistakenly deleted (or allowed to expire) a virtual disk. At some
point in the future, an "unsafe" undelete which can recover what remains of
partially reallocated virtual disks may also be implemented.
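The rule for when undeletion can succeed amounts to a simple set check,
sketched here (hypothetical Python, not the actual tool code):

```python
# Illustrative sketch: a deleted / expired VD can only be undeleted while
# none of its extents have been reallocated to a live virtual disk.

def undeletable(deleted_extents, live_disks):
    """live_disks maps VD id -> list of extents currently allocated."""
    in_use = set()
    for extents in live_disks.values():
        in_use.update(extents)
    return not (set(deleted_extents) & in_use)
```

As soon as even one extent has been handed to another VD, the deleted disk is
no longer fully intact and the safe undelete refuses to recover it.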
Security note:

The disk space for VDs is not zeroed when it is initially added to the free
space pool, when a VD expires, or when a VD is created. Therefore, unless
this is done manually, it is possible for a domain to read a VD to determine
what was written by previous owners of its constituent extents. If this is a
problem, users should manually clean VDs in some way, either on allocation
or just before deallocation (automated support for this may be added at a
later date).
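For example, one way to scrub a VD before deleting it is to map it into DOM0
as a VBD (as described below) and overwrite it with zeroes. A minimal sketch,
assuming the VD is mapped to some device node, that you know its size, and
that you run as root (roughly equivalent to dd if=/dev/zero of=/dev/xvda):

```python
# Sketch: overwrite a device (or file) with zeroes before deallocation.
# Assumes the VD is mapped into DOM0 as e.g. /dev/xvda and that its size
# is known -- an illustration, not a hardened tool.

def zero_device(path, size_bytes, chunk=1 << 20):
    with open(path, "r+b") as dev:
        remaining = size_bytes
        while remaining > 0:
            n = min(chunk, remaining)
            dev.write(b"\0" * n)
            remaining -= n
```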
Side note: The xvd* devices
---------------------------

The examples in this document make frequent use of the xvd* device nodes for
representing virtual block devices. It is not a requirement to use these
with Xen, since VBDs can be mapped to any IDE or SCSI device node in the
system. Changing the references to xvd* nodes in the examples below to refer
to some unused hd* or sd* node would be equally valid.

The xvd* nodes can be useful when accessing VBDs from DOM0, since binding
VBDs to xvd* devices avoids clashes with real IDE or SCSI drives.

A shell script is provided in tools/misc/xen-mkdevnodes to create these
nodes. Specify on the command line the directory that the nodes should be
placed under (e.g. /dev):

> cd {root of Xen source tree}/tools/misc/
> ./xen-mkdevnodes /dev
Dynamically Registering VBDs
----------------------------

The domain control tool (xc_dom_control.py) includes the ability to add VBDs
to, and remove them from, running domains. As usual, the command format is:

xc_dom_control.py [operation] [arguments]

The operations (and their arguments) are as follows:

vbd_add dom uname dev mode - Creates a VBD corresponding to either a physical
                             device or a virtual disk and adds it as the
                             specified device under the target domain, with
                             either read or write access.

vbd_remove dom dev         - Removes the VBD associated with the specified
                             device node from the target domain.

These operations are most useful when populating VDs. VDs can't be populated
directly, since they don't correspond to real devices. Using:

xc_dom_control.py vbd_add 0 vd:your_vd_id /dev/whatever w

you can make a virtual disk available to DOM0. Sensible devices to map VDs
to in DOM0 are the /dev/xvd* nodes, since that makes it obvious that they
are Xen virtual devices that don't correspond to real physical devices.

You can then format, mount and populate the VD through the nominated device
node. When you've finished, use:

xc_dom_control.py vbd_remove 0 /dev/whatever

to revoke DOM0's access to it. It's then ready for use in a guest domain.
You can also use this functionality to grant a guest domain access to a
physical device - you might use this to temporarily share a partition, or to
add access to a partition that wasn't granted at boot time.

When playing with VBDs, remember that, in general, it is only safe for two
domains to have access to a file system if both have read-only access. You
shouldn't be trying to share anything which is writable, even if only by one
domain, unless you're really sure you know what you're doing!
Granting access to real disks and partitions
--------------------------------------------

During the boot process, Xen automatically creates a VBD for each physical
disk and gives DOM0 read / write access to it. This makes it look like DOM0
has normal access to the disks, just as if Xen wasn't being used - in
reality, even DOM0 talks to disks through Xen VBDs.

To give another domain access to a partition or whole disk, you need to
create a corresponding VBD for that partition, for use by that domain. As
for virtual disks, you can grant access to a running domain, or specify that
the domain should have access when it is first booted.

To grant access to a physical partition or disk whilst a domain is running,
use the xc_dom_control.py script - the usage is very similar to adding
access to virtual disks for a running domain (described above). Specify the
device as "phy:device", where device is the name of the device as seen from
domain 0, or from normal Linux without Xen. For instance:

> xc_dom_control.py vbd_add 2 phy:hdc /dev/whatever r

will grant domain 2 read-only access to the device /dev/hdc (as seen from
DOM0 / normal Linux running on the same machine - i.e. the master drive on
the secondary IDE chain), as /dev/whatever in the target domain.

Note that you can use this within domain 0 to map disks / partitions to
other device nodes within domain 0. For instance, you could map /dev/hda to
also be accessible through /dev/xvda. This is not generally recommended,
since if you (for instance) mount both device nodes read / write you could
corrupt the underlying filesystem. It's also quite confusing ;-)
To grant a domain access to a partition or disk when it boots, the
appropriate VBD needs to be created before the domain is started. This can
be done very easily using the tools provided. To specify this to the
xc_dom_create.py tool (either in a startup script or on the command line),
use triples of the format:

phy:dev,target_dev,perms

where dev is the device name as seen from DOM0, target_dev is the device you
want it to appear as in the target domain, and perms is 'w' if you want to
give write privileges, or 'r' otherwise.

These may be specified either on the command line or in an initialisation
script. For instance, to grant the same access rights as described by the
command example above, you would use the triple:

phy:hdc,/dev/whatever,r

If you are using a config file, then you should add this triple to the
vbd_list variable, for instance using the line:

vbd_list = [ ('phy:hdc', '/dev/whatever', 'r') ]

(Note that you need to use quotes here, since config files are really small
Python scripts.)

To specify the mapping on the command line, you'd use the -d switch and
supply the triple as the argument, e.g.:

> xc_dom_create.py [other arguments] -d phy:hdc,/dev/whatever,r

(You don't need to explicitly quote things in this case.)
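The triple format covers both physical devices and virtual disks. A
hypothetical helper that splits such a triple (illustrative only - not
xc_dom_create.py's actual parser) makes the structure clear:

```python
# Hypothetical parser for the "uname,target_dev,perms" triples passed to
# xc_dom_create.py (-d switch or vbd_list) -- illustrative only.

def parse_vbd_triple(spec):
    uname, target, perms = spec.split(",")
    if not (uname.startswith("phy:") or uname.startswith("vd:")):
        raise ValueError("device must be phy:... or vd:...")
    if perms not in ("r", "w"):
        raise ValueError("perms must be 'r' or 'w'")
    return (uname, target, perms)

print(parse_vbd_triple("phy:hdc,/dev/whatever,r"))
# -> ('phy:hdc', '/dev/whatever', 'r')
```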
Walk-through: Booting a domain from a VD
----------------------------------------

As an example, here is a sequence of commands you might use to create a
virtual disk, populate it with a root file system and boot a domain from it.
These steps assume that you've installed the example scripts somewhere on
your PATH - if you haven't done that, you'll need to specify a fully
qualified pathname in the examples below. It is also assumed that you know
how to use the xc_dom_create.py tool (apart from configuring virtual
disks!).

[ This example is intended only for users of virtual disks (VDs). You don't
need to follow it if you'll be booting a domain from a dedicated partition,
since you can create that partition and populate it directly from DOM0, as
normal. ]
First, if you haven't done so already, initialise the free space pool by
adding a real partition to it. The details are stored in the database, so
you'll only need to do this once. You can also use this command to add
further partitions to the existing free space pool.

> xc_vd_tool.py format /dev/<real partition>

Now allocate the space for your virtual disk, using the following command
and specifying the size in megabytes:

> xc_vd_tool.py create <size in megabytes>

At this point, the program will tell you the virtual disk's ID. Note it
down, as it is how you will identify the virtual disk in future.
If you don't want the VD to be bootable (i.e. you're booting a domain from
some other medium and just want it to be able to access this VD), you can
simply add it to the vbd_list used by xc_dom_create.py, either by putting it
in a config file or by specifying it on the command line. Formatting /
populating of the VD could then be done from that domain once it has
started.

If you want to boot off your new VD as well, then you need to populate it
with a standard Linux root filesystem. You'll need to temporarily add the VD
to DOM0 in order to do this. To give DOM0 read / write access to the VD, use
the following command line, substituting the ID you got earlier:

> xc_dom_control.py vbd_add 0 vd:<id> /dev/xvda w

This attaches the VD to the device /dev/xvda in domain zero, with read /
write privileges - you can use other device nodes if you choose to.

Now make a filesystem on this device, mount it and populate it with a root
filesystem. These steps are exactly the same as under normal Linux. When
you've finished, unmount the filesystem again.

You should now remove the VD from DOM0. This will prevent you from
accidentally changing it in DOM0 whilst the guest domain is using it (which
could cause filesystem corruption, and confuse Linux):

> xc_dom_control.py vbd_remove 0 /dev/xvda
It should now be possible to boot a guest domain from the VD. To do this,
you should specify the VD's details in some way so that xc_dom_create.py
will be able to set up the corresponding VBD for the domain to access. If
you're using a config file, you should include:

('vd:<id>', '/dev/whatever', 'w')

in the vbd_list, substituting the appropriate virtual disk ID, device node
and read / write setting.

To specify access on the command line, as you start the domain, you would
use the -d switch (note that you don't need to use quote marks here):

> xc_dom_create.py [other arguments] -d vd:<id>,/dev/whatever,w

To tell Linux which device to boot from, you should either include:

root=/dev/whatever

in your cmdline_root in the config file, or specify it on the command line
using the -R option:

> xc_dom_create.py [other arguments] -R root=/dev/whatever

That should be it: sit back and watch your domain boot off its virtual disk!
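Collecting the config-file route above into one place, the relevant fragment
of an xc_dom_create.py config file might look like this (the VD ID and
device names are placeholders, not literal values):

```python
# Fragment of an xc_dom_create.py config file -- placeholder values.
vbd_list = [ ('vd:<id>', '/dev/whatever', 'w') ]
cmdline_root = "root=/dev/whatever"
```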
Getting help
------------

The main source of help with Xen is the developers' e-mail list:
<xen-devel@lists.sourceforge.net>. The developers will help with problems,
listen to feature requests and fix bugs. It is, however, helpful if you look
through the mailing list archives and the HOWTOs provided to make sure your
question is not already answered there. If you post to the list, please
provide as much information as possible about your setup and your problem.

There is also a general Xen FAQ, kindly started by Jan van Rensburg, which
(at the time of writing) is located at: <http://xen.epiuse.com/xen-faq.txt>.
Contributing
------------

Patches and extra documentation are also welcomed ;-) and should be posted
to the xen-devel e-mail list.