Documentation/networking/ixgb.txt

Linux* Base Driver for the Intel(R) PRO/10GbE Family of Adapters
================================================================

November 17, 2004

Contents
========

- In This Release
- Identifying Your Adapter
- Command Line Parameters
- Improving Performance
- Support

In This Release
===============

This file describes the Linux* Base Driver for the Intel(R) PRO/10GbE Family
of Adapters, version 1.0.x.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel PRO/10GbE adapter. All hardware requirements listed
apply to use with Linux.

Identifying Your Adapter
========================

To verify your Intel adapter is supported, find the board ID number on the
adapter. Look for a label that has a barcode and a number in the format
A12345-001.

Use the above information and the Adapter & Driver ID Guide at:

    http://support.intel.com/support/network/adapter/pro100/21397.htm

For the latest Intel network drivers for Linux, go to:

    http://downloadfinder.intel.com/scripts-df/support_intel.asp

Command Line Parameters
=======================

If the driver is built as a module, the following optional parameters are
used by entering them on the command line with the modprobe or insmod command
using this syntax:

    modprobe ixgb [<option>=<VAL1>,<VAL2>,...]

    insmod ixgb [<option>=<VAL1>,<VAL2>,...]

For example, with two PRO/10GbE PCI adapters, entering:

    insmod ixgb TxDescriptors=80,128

loads the ixgb driver with 80 TX resources for the first adapter and 128 TX
resources for the second adapter.

The default value for each parameter is generally the recommended setting,
unless otherwise noted. Also, if the driver is statically built into the
kernel, the driver is loaded with the default values for all the parameters.
Ethtool can be used to change some of the parameters at runtime.

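As a hedged illustration of runtime tuning with ethtool (eth1 and the chosen
values are placeholders, and which options the ixgb driver actually honours
depends on the driver and ethtool versions), a small sketch with a DRY_RUN
guard so the commands can be previewed without touching the adapter:

```shell
#!/bin/sh
# Sketch only: eth1 and the values below are placeholder assumptions.
DRY_RUN=${DRY_RUN:-1}   # default to printing the commands instead of running them
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"       # preview mode: show the command that would run
    else
        "$@"            # live mode: requires root and a real ixgb interface
    fi
}
run ethtool -G eth1 rx 512 tx 256   # resize the RX/TX descriptor rings
run ethtool -A eth1 rx on tx on     # enable PAUSE-frame flow control both ways
```

Run it once with DRY_RUN=1 to review the commands, then with DRY_RUN=0 as
root to apply them.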
FlowControl
Valid Range: 0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
Default: Read from the EEPROM
         If EEPROM is not detected, default is 3
This parameter controls the automatic generation (Tx) of, and response (Rx)
to, Ethernet PAUSE frames.

RxDescriptors
Valid Range: 64-512
Default Value: 512
This value is the number of receive descriptors allocated by the driver.
Increasing this value allows the driver to buffer more incoming packets.
Each descriptor is 16 bytes. A receive buffer is also allocated for
each descriptor and can be either 2048, 4056, 8192, or 16384 bytes,
depending on the MTU setting. When the MTU size is 1500 or less, the
receive buffer size is 2048 bytes. When the MTU is greater than 1500 the
receive buffer size will be either 4056, 8192, or 16384 bytes. The
maximum MTU size is 16114.

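To make those sizes concrete, a quick back-of-the-envelope estimate in plain
shell arithmetic, using the 16-byte descriptor and the 2048-byte buffer that
applies at MTU 1500 or less:

```shell
#!/bin/sh
# Receive-side memory estimate for the default RxDescriptors value.
rx_descriptors=512
desc_ring_bytes=$(( rx_descriptors * 16 ))     # 16 bytes per descriptor
rx_buffer_bytes=$(( rx_descriptors * 2048 ))   # 2048-byte buffers at MTU <= 1500
echo "descriptor ring: ${desc_ring_bytes} bytes"   # 8192 bytes
echo "receive buffers: ${rx_buffer_bytes} bytes"   # 1048576 bytes (1 MiB)
```

So even at the maximum setting the descriptor ring itself is small; the
receive buffers dominate the memory cost, especially with jumbo MTUs.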
RxIntDelay
Valid Range: 0-65535 (0=off)
Default Value: 6
This value delays the generation of receive interrupts in units of
0.8192 microseconds. Receive interrupt reduction can improve CPU
efficiency if properly tuned for specific network traffic. Increasing
this value adds extra latency to frame reception and can end up
decreasing the throughput of TCP traffic. If the system is reporting
dropped receives, this value may be set too high, causing the driver to
run out of available receive descriptors.

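Since RxIntDelay is specified in 0.8192-microsecond units, converting a
setting to real time is a one-line calculation (awk is used here purely for
the floating-point multiply):

```shell
#!/bin/sh
# Convert an RxIntDelay value to microseconds (1 unit = 0.8192 us).
rxintdelay=6   # the default value
delay_us=$(awk -v d="$rxintdelay" 'BEGIN { printf "%.4f", d * 0.8192 }')
echo "RxIntDelay=${rxintdelay} delays receive interrupts by ${delay_us} us"
```

The default of 6 therefore coalesces interrupts for roughly 5 microseconds.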
TxDescriptors
Valid Range: 64-4096
Default Value: 256
This value is the number of transmit descriptors allocated by the driver.
Increasing this value allows the driver to queue more transmits. Each
descriptor is 16 bytes.

XsumRX
Valid Range: 0-1
Default Value: 1
A value of '1' indicates that the driver should enable IP checksum
offload for received packets (both UDP and TCP) to the adapter hardware.

XsumTX
Valid Range: 0-1
Default Value: 1
A value of '1' indicates that the driver should enable IP checksum
offload for transmitted packets (both UDP and TCP) to the adapter
hardware.

Improving Performance
=====================

With the Intel PRO/10GbE adapter, the default Linux configuration will very
likely limit the total available throughput artificially. There is a set of
configuration changes that, applied together, increase the ability of Linux
to transmit and receive data. The following enhancements were originally
acquired from settings published at http://www.spec.org/web99 for various
submitted results using Linux.

NOTE: These changes are only suggestions, and serve as a starting point for
tuning your network performance.

The changes are made in three major ways, listed in order of greatest effect:

- Use ifconfig to modify the mtu (maximum transmission unit) and the
  txqueuelen parameter.
- Use sysctl to modify /proc parameters (essentially kernel tuning).
- Use setpci to modify the MMRBC field in PCI-X configuration space to
  increase transmit burst lengths on the bus.

NOTE: setpci modifies the adapter's configuration registers to allow it to
read up to 4k bytes at a time (for transmits). However, for some systems the
behavior after modifying this register may be undefined (possibly errors of
some kind). A power-cycle, hard reset or explicitly setting the e6 register
back to 22 (setpci -d 8086:1048 e6.b=22) may be required to get back to a
stable configuration.

- COPY these lines and paste them into ixgb_perf.sh:

#!/bin/bash
echo "configuring network performance, edit this file to change the interface"
# set mmrbc to 4k reads, modify only Intel 10GbE device IDs
setpci -d 8086:1048 e6.b=2e
# set the MTU (max transmission unit) - it requires your switch and clients to change too!
# set the txqueuelen
# your ixgb adapter should be loaded as eth1 for this to work, change if needed
ifconfig eth1 mtu 9000 txqueuelen 1000 up
# call the sysctl utility to modify /proc/sys entries
sysctl -p ./sysctl_ixgb.conf

- END ixgb_perf.sh

- COPY these lines and paste them into sysctl_ixgb.conf:

# some of the defaults may be different for your kernel
# call this file with sysctl -p <this file>
# these are just suggested values that worked well to increase throughput in
# several network benchmark tests, your mileage may vary

### IPV4 specific settings
net.ipv4.tcp_timestamps = 0 # turns TCP timestamp support off, default 1, reduces CPU use
net.ipv4.tcp_sack = 0 # turn SACK support off, default on
# on systems with a VERY fast bus -> memory interface this is the big gainer
net.ipv4.tcp_rmem = 10000000 10000000 10000000 # sets min/default/max TCP read buffer, default 4096 87380 174760
net.ipv4.tcp_wmem = 10000000 10000000 10000000 # sets min/pressure/max TCP write buffer, default 4096 16384 131072
net.ipv4.tcp_mem = 10000000 10000000 10000000 # sets min/pressure/max TCP buffer space, default 31744 32256 32768

### CORE settings (mostly for socket and UDP effect)
net.core.rmem_max = 524287 # maximum receive socket buffer size, default 131071
net.core.wmem_max = 524287 # maximum send socket buffer size, default 131071
net.core.rmem_default = 524287 # default receive socket buffer size, default 65535
net.core.wmem_default = 524287 # default send socket buffer size, default 65535
net.core.optmem_max = 524287 # maximum amount of option memory buffers, default 10240
net.core.netdev_max_backlog = 300000 # number of unprocessed input packets before kernel starts dropping them, default 300

- END sysctl_ixgb.conf

Edit the ixgb_perf.sh script if necessary to change eth1 to whatever
interface your ixgb driver is using.

NOTE: Unless these scripts are added to the boot process, these changes will
last only until the next system reboot.

Resolving Slow UDP Traffic
--------------------------

If your server does not seem to be able to receive UDP traffic as fast as it
can receive TCP traffic, it could be because Linux, by default, does not set
the network stack buffers as large as they need to be to support high UDP
transfer rates. One way to alleviate this problem is to allow more memory to
be used by the IP stack to store incoming data.

For instance, use the commands:

    sysctl -w net.core.rmem_max=262143

and

    sysctl -w net.core.rmem_default=262143

to increase the read buffer memory max and default to 262143 (256k - 1) from
defaults of max=131071 (128k - 1) and default=65535 (64k - 1). These variables
will increase the amount of memory used by the network stack for receives, and
can be increased significantly more if necessary for your application.

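After applying the commands above (or the sysctl_ixgb.conf file from the
previous section), the effective limits can be read back; each sysctl key
maps directly onto a file under /proc/sys, so plain cat works too:

```shell
#!/bin/sh
# Read back the current socket buffer limits.
# net.core.rmem_max     <-> /proc/sys/net/core/rmem_max
# net.core.rmem_default <-> /proc/sys/net/core/rmem_default
rmem_max=$(cat /proc/sys/net/core/rmem_max)
rmem_default=$(cat /proc/sys/net/core/rmem_default)
echo "rmem_max=${rmem_max} rmem_default=${rmem_default}"
```

If the printed values still show the kernel defaults, the sysctl changes
were not applied (or did not persist across a reboot).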
Support
=======

For general information and support, go to the Intel support website at:

    http://support.intel.com

If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related to
the issue to linux.nics@intel.com.