ia64/linux-2.6.18-xen.hg

annotate Documentation/iostats.txt @ 524:7f8b544237bf

netfront: Allow netfront in domain 0.

This is useful if your physical network device is in a utility domain.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Tue Apr 15 15:18:58 2008 +0100 (2008-04-15)
parents 831230e53067
rev   line source
ian@0 1 I/O statistics fields
ian@0 2 ---------------------
ian@0 3
ian@0 4 Last modified Sep 30, 2003
ian@0 5
ian@0 6 Since 2.4.20 (and some versions before, with patches), and 2.5.45,
ian@0 7 more extensive disk statistics have been introduced to help measure disk
ian@0 8 activity. Tools such as sar and iostat typically interpret these and do
ian@0 9 the work for you, but in case you are interested in creating your own
ian@0 10 tools, the fields are explained here.
ian@0 11
ian@0 12 In 2.4 now, the information is found as additional fields in
ian@0 13 /proc/partitions. In 2.6, the same information is found in two
ian@0 14 places: one is in the file /proc/diskstats, and the other is within
ian@0 15 the sysfs file system, which must be mounted in order to obtain
ian@0 16 the information. Throughout this document we'll assume that sysfs
ian@0 17 is mounted on /sys, although of course it may be mounted anywhere.
ian@0 18 Both /proc/diskstats and sysfs use the same source for the information
ian@0 19 and so should not differ.
ian@0 20
ian@0 21 Here are examples of these different formats:
ian@0 22
ian@0 23 2.4:
ian@0 24 3 0 39082680 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
ian@0 25 3 1 9221278 hda1 35486 0 35496 38030 0 0 0 0 0 38030 38030
ian@0 26
ian@0 27
ian@0 28 2.6 sysfs:
ian@0 29 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
ian@0 30 35486 38030 38030 38030
ian@0 31
ian@0 32 2.6 diskstats:
ian@0 33 3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
ian@0 34 3 1 hda1 35486 38030 38030 38030
ian@0 35
ian@0 36 On 2.4 you might execute "grep 'hda ' /proc/partitions". On 2.6, you have
ian@0 37 a choice of "cat /sys/block/hda/stat" or "grep 'hda ' /proc/diskstats".
ian@0 38 The advantage of one over the other is that the sysfs choice works well
ian@0 39 if you are watching a known, small set of disks. /proc/diskstats may
ian@0 40 be a better choice if you are watching a large number of disks because
ian@0 41 you'll avoid the overhead of 50, 100, or 500 or more opens/closes with
ian@0 42 each snapshot of your disk statistics.
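
If you are writing your own tool, splitting the /proc/diskstats text might look something like the sketch below. The function name parse_diskstats and the dictionary layout are assumptions made for illustration; the sample text is the 2.6 diskstats example from above.

```python
# Sketch: parse 2.6-style /proc/diskstats text into per-device field lists.
# parse_diskstats and its return layout are illustrative, not a kernel interface.

SAMPLE = """\
3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
3 1 hda1 35486 38030 38030 38030
"""


def parse_diskstats(text):
    """Return {device_name: [statistics fields as ints]} for each line."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        # parts[0] and parts[1] are the major and minor device numbers,
        # parts[2] is the device name, and the rest are the statistics.
        stats[parts[2]] = [int(p) for p in parts[3:]]
    return stats


if __name__ == "__main__":
    for name, fields in parse_diskstats(SAMPLE).items():
        print(name, len(fields), fields[0])
```

On a live 2.6 system you would pass in the contents of /proc/diskstats instead of SAMPLE, which costs one open/close per snapshot no matter how many disks there are.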
ian@0 43
ian@0 44 In 2.4, the statistics fields are those after the device name. In
ian@0 45 the above example, the first field of statistics would be 446216.
ian@0 46 By contrast, in 2.6 if you look at /sys/block/hda/stat, you'll
ian@0 47 find just the eleven fields, beginning with 446216. If you look at
ian@0 48 /proc/diskstats, the eleven fields will be preceded by the major and
ian@0 49 minor device numbers, and device name. Each of these formats provides
ian@0 50 eleven fields of statistics, and each field means exactly the same thing.
ian@0 51 All fields except field 9 are cumulative since boot. Field 9 should
ian@0 52 go to zero as I/Os complete; all others only increase. Yes, these are
ian@0 53 32-bit unsigned numbers, and on a very busy or long-lived system they
ian@0 54 may wrap. Applications should be prepared to deal with that; unless
ian@0 55 the interval between your observations is measured in large numbers of
ian@0 56 minutes or hours, they should not wrap twice before you notice them.
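
One way to deal with a wrap is to take the difference between two samples modulo 2^32, as in this minimal sketch. It assumes the counter wrapped at most once between the two samples; the helper name counter_delta is made up for this example.

```python
# Sketch: wrap-safe difference between two samples of a cumulative counter.
# Assumes 32-bit unsigned counters (as stated above) and at most one wrap
# between the two samples.

COUNTER_BITS = 32


def counter_delta(previous, current, bits=COUNTER_BITS):
    """Return how far the counter advanced from previous to current."""
    return (current - previous) % (1 << bits)


if __name__ == "__main__":
    print(counter_delta(100, 150))        # no wrap
    print(counter_delta(4294967290, 5))   # counter wrapped past 2**32
```

This only makes sense for the cumulative fields. Field 9 is a current value rather than a running total, so it should be read directly and not differenced.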
ian@0 57
ian@0 58 Each set of stats only applies to the indicated device; if you want
ian@0 59 system-wide stats you'll have to find all the devices and sum them all up.
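
A rough sketch of such a sum over 2.6 /proc/diskstats text follows. It assumes that whole disks are the lines carrying eleven statistics fields (partitions carry four on 2.6) and skips everything else so that partition I/O is not counted twice. The hdb line in the sample is invented purely for illustration, and system_totals is a hypothetical name.

```python
# Sketch: system-wide totals by summing every whole-disk line.
# Assumption: on 2.6, a line with eleven statistics fields is a whole disk.
# The hdb line is made-up sample data, not taken from a real system.

SAMPLE = """\
3 0 hda 446216 784926 9550688 4382310 424847 312726 5922052 19310380 0 3376340 23705160
3 1 hda1 35486 38030 38030 38030
3 64 hdb 1000 200 16000 5000 500 100 8000 9000 1 7000 15000
"""


def system_totals(text):
    """Return a list of eleven sums, one per statistics field."""
    totals = [0] * 11
    for line in text.splitlines():
        fields = line.split()[3:]  # drop major, minor and device name
        if len(fields) != 11:
            continue  # partition line (four fields) or malformed line
        for i, value in enumerate(fields):
            totals[i] += int(value)
    return totals


if __name__ == "__main__":
    print(system_totals(SAMPLE))
```

Python integers do not overflow, so the sums here will not wrap even though the individual counters can.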
ian@0 60
ian@0 61 Field 1 -- # of reads completed
ian@0 62 This is the total number of reads completed successfully.
ian@0 63 Field 2 -- # of reads merged, field 6 -- # of writes merged
ian@0 64 Reads and writes which are adjacent to each other may be merged for
ian@0 65 efficiency. Thus two 4K reads may become one 8K read before it is
ian@0 66 ultimately handed to the disk, and so it will be counted (and queued)
ian@0 67 as only one I/O. This field lets you know how often this was done.
ian@0 68 Field 3 -- # of sectors read
ian@0 69 This is the total number of sectors read successfully.
ian@0 70 Field 4 -- # of milliseconds spent reading
ian@0 71 This is the total number of milliseconds spent by all reads (as
ian@0 72 measured from __make_request() to end_that_request_last()).
ian@0 73 Field 5 -- # of writes completed
ian@0 74 This is the total number of writes completed successfully.
ian@0 75 Field 7 -- # of sectors written
ian@0 76 This is the total number of sectors written successfully.
ian@0 77 Field 8 -- # of milliseconds spent writing
ian@0 78 This is the total number of milliseconds spent by all writes (as
ian@0 79 measured from __make_request() to end_that_request_last()).
ian@0 80 Field 9 -- # of I/Os currently in progress
ian@0 81 The only field that should go to zero. Incremented as requests are
ian@0 82 given to the appropriate request_queue_t and decremented as they finish.
ian@0 83 Field 10 -- # of milliseconds spent doing I/Os
ian@0 84 This field increases so long as field 9 is nonzero.
ian@0 85 Field 11 -- weighted # of milliseconds spent doing I/Os
ian@0 86 This field is incremented at each I/O start, I/O completion, I/O
ian@0 87 merge, or read of these stats by the number of I/Os in progress
ian@0 88 (field 9) times the number of milliseconds spent doing I/O since the
ian@0 89 last update of this field. This can provide an easy measure of both
ian@0 90 I/O completion time and the backlog that may be accumulating.
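
As an illustration of how these fields can be combined, the sketch below derives a few rates from two snapshots of the eleven fields taken interval_ms milliseconds apart. The first snapshot is the hda example from above; the second snapshot and the metric names are invented for this example. The formulas are one common interpretation of the fields, not the only possible one.

```python
# Sketch: derive rates from two snapshots of the eleven whole-disk fields.
# CUR is made-up data; the metric names are illustrative assumptions.

COUNTER_MODULUS = 1 << 32  # counters are 32-bit unsigned and may wrap

PREV = [446216, 784926, 9550688, 4382310, 424847, 312726,
        5922052, 19310380, 0, 3376340, 23705160]
CUR = [446316, 784946, 9552288, 4382810, 424897, 312736,
       5922852, 19311380, 2, 3376740, 23706660]


def derive(prev, cur, interval_ms):
    """Return a dict of rates computed from wrap-safe field differences."""
    d = [(c - p) % COUNTER_MODULUS for p, c in zip(prev, cur)]
    # d[8] (field 9) is not a cumulative counter, so it is not used here.
    reads, ms_reading = d[0], d[3]      # fields 1 and 4
    writes, ms_writing = d[4], d[7]     # fields 5 and 8
    return {
        # field 10: fraction of the interval with at least one I/O in progress
        "utilization": d[9] / interval_ms,
        # field 11: time-weighted average number of I/Os in progress
        "avg_queue_depth": d[10] / interval_ms,
        "avg_read_ms": ms_reading / reads if reads else 0.0,
        "avg_write_ms": ms_writing / writes if writes else 0.0,
    }


if __name__ == "__main__":
    for name, value in derive(PREV, CUR, 1000).items():
        print(name, value)
```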
ian@0 91
ian@0 92
ian@0 93 To avoid introducing performance bottlenecks, no locks are held while
ian@0 94 modifying these counters. This implies that minor inaccuracies may be
ian@0 95 introduced when changes collide, so (for instance) adding up all the
ian@0 96 read I/Os issued per partition should equal those made to the disks ...
ian@0 97 but due to the lack of locking it may only be very close.
ian@0 98
ian@0 99 In 2.6, there are counters for each CPU, which makes the lack of locking
ian@0 100 almost a non-issue. When the statistics are read, the per-CPU counters
ian@0 101 are summed (possibly overflowing the unsigned 32-bit variable they are
ian@0 102 summed to) and the result given to the user. There is no convenient
ian@0 103 user interface for accessing the per-CPU counters themselves.
ian@0 104
ian@0 105 Disks vs Partitions
ian@0 106 -------------------
ian@0 107
ian@0 108 There were significant changes between 2.4 and 2.6 in the I/O subsystem.
ian@0 109 As a result, some statistical information disappeared. The translation from
ian@0 110 a disk address relative to a partition to the disk address relative to
ian@0 111 the host disk happens much earlier. All merges and timings now happen
ian@0 112 at the disk level rather than at both the disk and partition level as
ian@0 113 in 2.4. Consequently, you'll see a different statistics output on 2.6 for
ian@0 114 partitions from that for disks. There are only *four* fields available
ian@0 115 for partitions on 2.6 machines. This is reflected in the examples above.
ian@0 116
ian@0 117 Field 1 -- # of reads issued
ian@0 118 This is the total number of reads issued to this partition.
ian@0 119 Field 2 -- # of sectors read
ian@0 120 This is the total number of sectors requested to be read from this
ian@0 121 partition.
ian@0 122 Field 3 -- # of writes issued
ian@0 123 This is the total number of writes issued to this partition.
ian@0 124 Field 4 -- # of sectors written
ian@0 125 This is the total number of sectors requested to be written to
ian@0 126 this partition.
ian@0 127
ian@0 128 Note that since the address is translated to a disk-relative one, and no
ian@0 129 record of the partition-relative address is kept, the subsequent success
ian@0 130 or failure of the read cannot be attributed to the partition. In other
ian@0 131 words, reads are counted slightly before the time of queuing for
ian@0 132 partitions, and at completion for whole disks. This is
ian@0 133 a subtle distinction that is probably uninteresting for most cases.
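
A tool that reads sysfs stat files on 2.6 therefore has to cope with both layouts. One possible approach, sketched here, is to choose the field names by counting the values on the line. The field names and the function label_stat_line are invented for this example and are not kernel identifiers.

```python
# Sketch: label the values of a 2.6 stat line as disk or partition fields.
# All names below are illustrative; they are not defined by the kernel.

DISK_FIELDS = (
    "reads_completed", "reads_merged", "sectors_read", "ms_reading",
    "writes_completed", "writes_merged", "sectors_written", "ms_writing",
    "ios_in_progress", "ms_doing_io", "weighted_ms_doing_io",
)
PARTITION_FIELDS = (
    "reads_issued", "sectors_read", "writes_issued", "sectors_written",
)


def label_stat_line(line):
    """Map field names to values for an eleven-field or four-field line."""
    values = [int(v) for v in line.split()]
    if len(values) == len(DISK_FIELDS):
        names = DISK_FIELDS
    elif len(values) == len(PARTITION_FIELDS):
        names = PARTITION_FIELDS
    else:
        raise ValueError("unexpected number of fields: %d" % len(values))
    return dict(zip(names, values))


if __name__ == "__main__":
    # The partition example from the 2.6 sysfs output above.
    print(label_stat_line("35486 38030 38030 38030"))
```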
ian@0 134
ian@0 135 Additional notes
ian@0 136 ----------------
ian@0 137
ian@0 138 In 2.6, sysfs is not mounted by default. If your distribution of
ian@0 139 Linux hasn't added it already, here's the line you'll want to add to
ian@0 140 your /etc/fstab:
ian@0 141
ian@0 142 none /sys sysfs defaults 0 0
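
Before relying on /sys/block, a tool may want to check where sysfs is mounted, if at all. The sketch below scans text in the /proc/mounts layout (device, mount point, filesystem type, options, and two numeric columns) for entries of type sysfs. The function name sysfs_mount_points is an assumption made for illustration.

```python
# Sketch: find sysfs mount points in /proc/mounts-style text.
# sysfs_mount_points is a hypothetical helper, not part of any library.


def sysfs_mount_points(mounts_text):
    """Return the mount points of every entry whose type is sysfs."""
    points = []
    for line in mounts_text.splitlines():
        parts = line.split()
        # Columns: device, mount point, filesystem type, options, ...
        if len(parts) >= 3 and parts[2] == "sysfs":
            points.append(parts[1])
    return points


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as mounts:
            print(sysfs_mount_points(mounts.read()))
    except OSError:
        print("could not read /proc/mounts on this system")
```

An empty list means sysfs is not mounted, in which case /proc/diskstats is the remaining source on 2.6.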
ian@0 143
ian@0 144
ian@0 145 In 2.6, all disk statistics were removed from /proc/stat. In 2.4, they
ian@0 146 appear in both /proc/partitions and /proc/stat, although the ones in
ian@0 147 /proc/stat take a very different format from those in /proc/partitions
ian@0 148 (see proc(5), if your system has it).
ian@0 149
ian@0 150 -- ricklind@us.ibm.com