Linux* Base Driver for the Intel(R) PRO/1000 Family of Adapters
===============================================================

November 15, 2005


Contents
========

- In This Release
- Identifying Your Adapter
- Command Line Parameters
- Speed and Duplex Configuration
- Additional Configurations
- Known Issues
- Support


In This Release
===============

This file describes the Linux* Base Driver for the Intel(R) PRO/1000 Family
of Adapters.  This driver includes support for Itanium(R)2-based systems.

For questions related to hardware requirements, refer to the documentation
supplied with your Intel PRO/1000 adapter.  All hardware requirements listed
apply to use with Linux.

The following features are now available in supported kernels:
 - Native VLANs
 - Channel Bonding (teaming)
 - SNMP

Channel Bonding documentation can be found in the Linux kernel source:
/Documentation/networking/bonding.txt

The driver information previously displayed in the /proc filesystem is not
supported in this release.  Alternatively, you can use ethtool (version 1.6
or later), lspci, and ifconfig to obtain the same information.

Instructions on updating ethtool can be found in the section "Additional
Configurations" later in this document.


Identifying Your Adapter
========================

For more information on how to identify your adapter, go to the Adapter &
Driver ID Guide at:

    http://support.intel.com/support/network/adapter/pro100/21397.htm

For the latest Intel network drivers for Linux, refer to the following
website.  In the search field, enter your adapter name or type, or use the
networking link on the left to search for your adapter:

    http://downloadfinder.intel.com/scripts-df/support_intel.asp


Command Line Parameters
=======================

If the driver is built as a module, the following optional parameters
are used by entering them on the command line with the modprobe or insmod
command using this syntax:

     modprobe e1000 [<option>=<VAL1>,<VAL2>,...]

     insmod e1000 [<option>=<VAL1>,<VAL2>,...]

For example, with two PRO/1000 PCI adapters, entering:

     insmod e1000 TxDescriptors=80,128

loads the e1000 driver with 80 TX descriptors for the first adapter and 128
TX descriptors for the second adapter.
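
The comma-separated, one-value-per-adapter syntax above can be generated
mechanically.  The following is a minimal sketch, not part of the driver;
the helper name module_options is hypothetical, and it assumes the values
are listed in the order in which the adapters are probed.

```python
def module_options(**params):
    """Format per-adapter values as e1000 module options.

    Each keyword is a driver parameter name; each value is a list with
    one entry per adapter (first adapter first).
    """
    return " ".join(
        f"{name}={','.join(str(v) for v in values)}"
        for name, values in params.items()
    )


# Reproduces the option string from the two-adapter example above.
print("insmod e1000 " + module_options(TxDescriptors=[80, 128]))
```

Several parameters can be passed at once, for example
module_options(Speed=[10, 100], Duplex=[2, 1]).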

The default value for each parameter is generally the recommended setting,
unless otherwise noted.

NOTES:  For more information about the AutoNeg, Duplex, and Speed
        parameters, see the "Speed and Duplex Configuration" section in
        this document.

        For more information about the InterruptThrottleRate,
        RxIntDelay, TxIntDelay, RxAbsIntDelay, and TxAbsIntDelay
        parameters, see the application note at:
        http://www.intel.com/design/network/applnots/ap450.htm

        A descriptor describes a data buffer and attributes related to
        the data buffer.  This information is accessed by the hardware.


AutoNeg
-------
(Supported only on adapters with copper connections)
Valid Range:   0x01-0x0F, 0x20-0x2F
Default Value: 0x2F

This parameter is a bit mask that specifies which speed and duplex
settings the board advertises.  When this parameter is used, the Speed
and Duplex parameters must not be specified.

NOTE:  Refer to the Speed and Duplex section of this readme for more
       information on the AutoNeg parameter.


Duplex
------
(Supported only on adapters with copper connections)
Valid Range:   0-2 (0=auto-negotiate, 1=half, 2=full)
Default Value: 0

Defines the direction in which data is allowed to flow.  Can be either
one- or two-directional.  If both Duplex and the link partner are set to
auto-negotiate, the board auto-detects the correct duplex.  If the link
partner is forced (either full or half), Duplex defaults to half-duplex.


FlowControl
-----------
Valid Range:   0-3 (0=none, 1=Rx only, 2=Tx only, 3=Rx&Tx)
Default Value: Reads flow control settings from the EEPROM

This parameter controls the automatic generation (Tx) and response (Rx)
to Ethernet PAUSE frames.
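
The four FlowControl values can be read as a two-bit mask (1 = Rx,
2 = Tx).  That reading is an interpretation of the value list above, not a
statement about the driver's internals, and the helper name below is
hypothetical.

```python
FLOW_RX = 1  # respond to received PAUSE frames
FLOW_TX = 2  # generate PAUSE frames


def describe_flow_control(value):
    """Split a FlowControl value (0-3) into its Rx and Tx halves."""
    if not 0 <= value <= 3:
        raise ValueError("FlowControl must be in the range 0-3")
    return {"rx": bool(value & FLOW_RX), "tx": bool(value & FLOW_TX)}


print(describe_flow_control(3))
```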


InterruptThrottleRate
---------------------
(not supported on Intel 82542, 82543 or 82544-based adapters)
Valid Range:   100-100000 (0=off, 1=dynamic)
Default Value: 8000

This value represents the maximum number of interrupts per second the
controller generates.  InterruptThrottleRate is another setting used in
interrupt moderation.  Dynamic mode uses a heuristic algorithm to adjust
InterruptThrottleRate based on the current traffic load.

NOTE:  InterruptThrottleRate takes precedence over the TxAbsIntDelay and
       RxAbsIntDelay parameters.  In other words, minimizing the receive
       and/or transmit absolute delays does not force the controller to
       generate more interrupts than what the Interrupt Throttle Rate
       allows.

CAUTION:  If you are using the Intel PRO/1000 CT Network Connection
          (controller 82547), setting InterruptThrottleRate to a value
          greater than 75,000 may hang (stop transmitting) adapters
          under certain network conditions.  If this occurs, a NETDEV
          WATCHDOG message is logged in the system event log.  In
          addition, the controller is automatically reset, restoring
          the network connection.  To eliminate the potential for the
          hang, ensure that InterruptThrottleRate is set no greater
          than 75,000 and is not set to 0.

NOTE:  When e1000 is loaded with default settings and multiple adapters
       are in use simultaneously, the CPU utilization may increase
       non-linearly.  In order to limit the CPU utilization without
       impacting the overall throughput, we recommend that you load the
       driver as follows:

           insmod e1000.o InterruptThrottleRate=3000,3000,3000

       This sets the InterruptThrottleRate to 3000 interrupts/sec for
       the first, second, and third instances of the driver.  The range
       of 2000 to 3000 interrupts per second works on a majority of
       systems and is a good starting point, but the optimal value will
       be platform-specific.  If CPU utilization is not a concern, use
       RX_POLLING (NAPI) and default driver settings.
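
A cap on interrupts per second is equivalent to a minimum average spacing
between interrupts.  The arithmetic can be sketched as follows; the helper
name is hypothetical, and the function covers only fixed rates, not the
special values 0 (off) and 1 (dynamic).

```python
def min_interrupt_spacing_us(rate):
    """Average minimum spacing between interrupts, in microseconds,
    implied by a fixed InterruptThrottleRate."""
    if not 100 <= rate <= 100000:
        raise ValueError("fixed rates must be in the range 100-100000")
    return 1_000_000 / rate


# Default rate (8000) and the rate recommended above for multiple adapters.
for rate in (8000, 3000):
    print(rate, round(min_interrupt_spacing_us(rate), 1))
```

Under this reading, the default of 8000 corresponds to one interrupt every
125 microseconds at most.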


RxDescriptors
-------------
Valid Range:   80-256 for 82542 and 82543-based adapters
               80-4096 for all other supported adapters
Default Value: 256

This value specifies the number of receive descriptors allocated by the
driver.  Increasing this value allows the driver to buffer more incoming
packets.  Each descriptor is 16 bytes.  A receive buffer is also
allocated for each descriptor and is 2048 bytes.
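
The memory cost of a larger receive ring follows directly from the figures
above.  This is a rough sketch: it takes the 2048 buffer size to be in
bytes (an assumption) and ignores any allocator or alignment overhead; the
names are hypothetical.

```python
DESCRIPTOR_BYTES = 16   # size of one descriptor, from the text above
RX_BUFFER_BYTES = 2048  # receive buffer per descriptor (assumed bytes)


def rx_ring_bytes(descriptors):
    """Approximate memory used by one adapter's receive ring."""
    return descriptors * (DESCRIPTOR_BYTES + RX_BUFFER_BYTES)


# Default ring size versus the largest ring on newer adapters.
print(rx_ring_bytes(256), rx_ring_bytes(4096))
```

With these assumptions the default of 256 descriptors costs about 516 KiB
per adapter, and the maximum of 4096 costs about 8 MiB.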


RxIntDelay
----------
Valid Range:   0-65535 (0=off)
Default Value: 0

This value delays the generation of receive interrupts in units of 1.024
microseconds.  Receive interrupt reduction can improve CPU efficiency if
properly tuned for specific network traffic.  Increasing this value adds
extra latency to frame reception and can end up decreasing the throughput
of TCP traffic.  If the system is reporting dropped receives, this value
may be set too high, causing the driver to run out of available receive
descriptors.

CAUTION:  When setting RxIntDelay to a value other than 0, adapters may
          hang (stop transmitting) under certain network conditions.  If
          this occurs, a NETDEV WATCHDOG message is logged in the system
          event log.  In addition, the controller is automatically reset,
          restoring the network connection.  To eliminate the potential
          for the hang, ensure that RxIntDelay is set to 0.


RxAbsIntDelay
-------------
(This parameter is supported only on 82540, 82545 and later adapters.)
Valid Range:   0-65535 (0=off)
Default Value: 128

This value, in units of 1.024 microseconds, limits the delay in which a
receive interrupt is generated.  Useful only if RxIntDelay is non-zero,
this value ensures that an interrupt is generated after the initial
packet is received within the set amount of time.  Proper tuning,
along with RxIntDelay, may improve traffic throughput in specific network
conditions.
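
Because the delay parameters count units of 1.024 microseconds rather than
whole microseconds, it can help to convert a setting before reasoning
about latency.  A minimal sketch, with a hypothetical helper name:

```python
UNIT_NS = 1024  # one delay unit is 1.024 microseconds


def delay_us(units):
    """Convert an interrupt-delay setting (0-65535) to microseconds."""
    if not 0 <= units <= 65535:
        raise ValueError("delay must be in the range 0-65535")
    return units * UNIT_NS / 1000


# The RxAbsIntDelay default (128) and the largest permitted setting.
print(delay_us(128), delay_us(65535))
```

The RxAbsIntDelay default of 128 is therefore roughly 131 microseconds,
and the maximum setting is about 67 milliseconds.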


Speed
-----
(This parameter is supported only on adapters with copper connections.)
Valid Settings: 0, 10, 100, 1000
Default Value:  0 (auto-negotiate at all supported speeds)

Speed forces the line speed to the specified value in megabits per second
(Mbps).  If this parameter is not specified or is set to 0 and the link
partner is set to auto-negotiate, the board will auto-detect the correct
speed.  Duplex should also be set when Speed is set to either 10 or 100.


TxDescriptors
-------------
Valid Range:   80-256 for 82542 and 82543-based adapters
               80-4096 for all other supported adapters
Default Value: 256

This value is the number of transmit descriptors allocated by the driver.
Increasing this value allows the driver to queue more transmits.  Each
descriptor is 16 bytes.

NOTE:  Depending on the available system resources, the request for a
       higher number of transmit descriptors may be denied.  In this case,
       use a lower number.
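
The valid descriptor range depends on the controller family, which is easy
to get wrong in a script that builds module options for mixed hardware.
The check below is a sketch only; the controller strings are hypothetical
labels chosen for illustration, not values the driver reports.

```python
LEGACY_CONTROLLERS = {"82542", "82543"}  # limited to 80-256 descriptors


def descriptor_range(controller):
    """Return the (low, high) descriptor range for a controller label."""
    return (80, 256) if controller in LEGACY_CONTROLLERS else (80, 4096)


def check_descriptors(controller, count):
    """True if count is a valid TxDescriptors/RxDescriptors value."""
    low, high = descriptor_range(controller)
    return low <= count <= high


print(check_descriptors("82543", 4096), check_descriptors("82540", 4096))
```

Even a value inside the range may be refused at load time if system
resources are short, as the note above explains.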


TxIntDelay
----------
Valid Range:   0-65535 (0=off)
Default Value: 64

This value delays the generation of transmit interrupts in units of
1.024 microseconds.  Transmit interrupt reduction can improve CPU
efficiency if properly tuned for specific network traffic.  If the
system is reporting dropped transmits, this value may be set too high,
causing the driver to run out of available transmit descriptors.


TxAbsIntDelay
-------------
(This parameter is supported only on 82540, 82545 and later adapters.)
Valid Range:   0-65535 (0=off)
Default Value: 64

This value, in units of 1.024 microseconds, limits the delay in which a
transmit interrupt is generated.  Useful only if TxIntDelay is non-zero,
this value ensures that an interrupt is generated after the initial
packet is sent on the wire within the set amount of time.  Proper tuning,
along with TxIntDelay, may improve traffic throughput in specific
network conditions.

XsumRX
------
(This parameter is NOT supported on the 82542-based adapter.)
Valid Range:   0-1
Default Value: 1

A value of '1' indicates that the driver should enable IP checksum
offload for received packets (both UDP and TCP) to the adapter hardware.


Speed and Duplex Configuration
==============================

Three keywords are used to control the speed and duplex configuration.
These keywords are Speed, Duplex, and AutoNeg.

If the board uses a fiber interface, these keywords are ignored, and the
fiber interface board only links at 1000 Mbps full-duplex.

For copper-based boards, the keywords interact as follows:

  The default operation is auto-negotiate.  The board advertises all
  supported speed and duplex combinations, and it links at the highest
  common speed and duplex mode IF the link partner is set to auto-negotiate.

  If Speed = 1000, limited auto-negotiation is enabled and only 1000 Mbps
  is advertised.  (The 1000BaseT spec requires auto-negotiation.)

  If Speed = 10 or 100, then both Speed and Duplex should be set.
  Auto-negotiation is disabled, and the AutoNeg parameter is ignored.  The
  link partner SHOULD also be forced.
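
The interaction rules for copper adapters can be expressed as a small
pre-flight check on a proposed parameter set.  This is a sketch of the
rules as stated in this document, not of the driver's own validation; the
function name and message strings are hypothetical.

```python
def check_link_options(speed=0, duplex=0, autoneg=None):
    """Return a list of problems with a Speed/Duplex/AutoNeg combination.

    speed and duplex use the driver's values (0 means auto-negotiate);
    autoneg is None when the AutoNeg parameter is not given.
    """
    problems = []
    if speed not in (0, 10, 100, 1000):
        problems.append("Speed must be 0, 10, 100, or 1000")
    if autoneg is not None and (speed or duplex):
        problems.append("AutoNeg must not be combined with Speed or Duplex")
    if speed in (10, 100) and duplex == 0:
        problems.append("Duplex should be set when Speed is 10 or 100")
    return problems


print(check_link_options(speed=100))
```

An empty list means the combination is consistent with the rules above,
for example check_link_options(speed=100, duplex=1).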

The AutoNeg parameter is used when more control is required over the
auto-negotiation process.  It should be used when you wish to control which
speed and duplex combinations are advertised during the auto-negotiation
process.

The parameter may be specified as either a decimal or hexadecimal value as
determined by the bitmap below.

Bit position   7      6      5       4       3      2      1      0
Decimal Value  128    64     32      16      8      4      2      1
Hex value      80     40     20      10      8      4      2      1
Speed (Mbps)   N/A    N/A    1000    N/A     100    100    10     10
Duplex                       Full            Full   Half   Full   Half

Some examples of using AutoNeg:

  modprobe e1000 AutoNeg=0x01 (Restricts autonegotiation to 10 Half)
  modprobe e1000 AutoNeg=1 (Same as above)
  modprobe e1000 AutoNeg=0x02 (Restricts autonegotiation to 10 Full)
  modprobe e1000 AutoNeg=0x03 (Restricts autonegotiation to 10 Half or 10 Full)
  modprobe e1000 AutoNeg=0x04 (Restricts autonegotiation to 100 Half)
  modprobe e1000 AutoNeg=0x05 (Restricts autonegotiation to 10 Half or 100
  Half)
  modprobe e1000 AutoNeg=0x20 (Restricts autonegotiation to 1000 Full)
  modprobe e1000 AutoNeg=32 (Same as above)

Note that when this parameter is used, Speed and Duplex must not be specified.

If the link partner is forced to a specific speed and duplex, then this
parameter should not be used.  Instead, use the Speed and Duplex parameters
previously mentioned to force the adapter to the same speed and duplex.
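
An AutoNeg value is the bitwise OR of the entries for each mode to be
advertised.  The sketch below builds and decodes such a mask from the
bitmap in this section; the table and function names are hypothetical
and exist only for illustration.

```python
# (speed in Mbps, duplex) -> bit value, taken from the AutoNeg bitmap.
ADVERTISE = {
    (10, "half"): 0x01,
    (10, "full"): 0x02,
    (100, "half"): 0x04,
    (100, "full"): 0x08,
    (1000, "full"): 0x20,
}


def autoneg_mask(*modes):
    """OR together the bits for the modes that should be advertised."""
    mask = 0
    for mode in modes:
        mask |= ADVERTISE[mode]
    return mask


def decode_autoneg(mask):
    """List the modes advertised by an AutoNeg value."""
    return [mode for mode, bit in ADVERTISE.items() if mask & bit]


# 10 Half plus 100 Half, as in the AutoNeg=0x05 example.
print(hex(autoneg_mask((10, "half"), (100, "half"))))
```

Advertising all five modes gives 0x2F, which matches the documented
default for AutoNeg.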


Additional Configurations
=========================

Configuring the Driver on Different Distributions
-------------------------------------------------

Configuring a network driver to load properly when the system is started
is distribution dependent.  Typically, the configuration process involves
adding an alias line to /etc/modules.conf or /etc/modprobe.conf as well
as editing other system startup scripts and/or configuration files.  Many
popular Linux distributions ship with tools to make these changes for you.
To learn the proper way to configure a network device for your system,
refer to your distribution documentation.  If during this process you are
asked for the driver or module name, the name for the Linux Base Driver
for the Intel PRO/1000 Family of Adapters is e1000.

As an example, if you install the e1000 driver for two PRO/1000 adapters
(eth0 and eth1) and set the speed and duplex to 10full and 100half, add
the following to modules.conf or modprobe.conf:

     alias eth0 e1000
     alias eth1 e1000
     options e1000 Speed=10,100 Duplex=2,1

Viewing Link Messages
---------------------

Link messages will not be displayed to the console if the distribution is
restricting system messages.  In order to see network driver link messages
on your console, set dmesg to eight by entering the following:

     dmesg -n 8

NOTE: This setting is not saved across reboots.

Jumbo Frames
------------

The driver supports Jumbo Frames for all adapters except 82542 and
82573-based adapters.  Jumbo Frames support is enabled by changing the
MTU to a value larger than the default of 1500.  Use the ifconfig command
to increase the MTU size.  For example:

     ifconfig eth<x> mtu 9000 up

This setting is not saved across reboots.  It can be made permanent if
you add:

     MTU=9000

to the file /etc/sysconfig/network-scripts/ifcfg-eth<x>.  This example
applies to the Red Hat distributions; other distributions may store this
setting in a different location.

Notes:

- To enable Jumbo Frames, increase the MTU size on the interface beyond
  1500.
- The maximum MTU setting for Jumbo Frames is 16110.  This value coincides
  with the maximum Jumbo Frames size of 16128.
- Using Jumbo Frames at 10 or 100 Mbps may result in poor performance or
  loss of link.
- Some Intel gigabit adapters that support Jumbo Frames have a frame size
  limit of 9238 bytes, with a corresponding MTU size limit of 9216 bytes.
  The adapters with this limitation are based on the Intel 82571EB and
  82572EI controllers, which correspond to these product names:
     Intel® PRO/1000 PT Dual Port Server Adapter
     Intel® PRO/1000 PF Dual Port Server Adapter
     Intel® PRO/1000 PT Server Adapter
     Intel® PRO/1000 PT Desktop Adapter
     Intel® PRO/1000 PF Server Adapter

- The Intel PRO/1000 PM Network Connection does not support jumbo frames.
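
The MTU limits in the notes above can be collected into a lookup that a
provisioning script might consult before setting an MTU.  This is a sketch
under stated assumptions: the controller strings are hypothetical labels,
and the PRO/1000 PM Network Connection is left out because this document
does not name its controller.

```python
NO_JUMBO = {"82542", "82573"}           # Jumbo Frames not supported
LIMITED_JUMBO = {"82571EB", "82572EI"}  # 9238-byte frame size limit


def max_mtu(controller):
    """Largest MTU this document allows for a controller label."""
    if controller in NO_JUMBO:
        return 1500
    if controller in LIMITED_JUMBO:
        return 9216
    return 16110


def mtu_allowed(controller, mtu):
    """True if the requested MTU does not exceed the documented limit."""
    return mtu <= max_mtu(controller)


print(mtu_allowed("82571EB", 9000), mtu_allowed("82573", 9000))
```

The documented frame sizes exceed the matching MTU limits by 18 bytes
(16128 versus 16110) and 22 bytes (9238 versus 9216), so the two limits
are stored here as MTU values rather than derived from one overhead.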


Ethtool
-------

The driver utilizes the ethtool interface for driver configuration and
diagnostics, as well as displaying statistical information.  Ethtool
version 1.6 or later is required for this functionality.

The latest release of ethtool can be found at
http://sourceforge.net/projects/gkernel.

NOTE: Ethtool 1.6 only supports a limited set of ethtool options.  Support
      for a more complete ethtool feature set can be enabled by upgrading
      ethtool to ethtool-1.8.1.

Enabling Wake on LAN* (WoL)
---------------------------

WoL is configured through the Ethtool* utility.  Ethtool is included with
all versions of Red Hat after Red Hat 7.2.  For other Linux distributions,
download and install Ethtool from the following website:
http://sourceforge.net/projects/gkernel.

For instructions on enabling WoL with Ethtool, refer to the website listed
above.

WoL will be enabled on the system during the next shut down or reboot.
For this driver version, in order to enable WoL, the e1000 driver must be
loaded when shutting down or rebooting the system.

NAPI
----

NAPI (Rx polling mode) is supported in the e1000 driver.  NAPI is enabled
or disabled based on the configuration of the kernel.  To override
the default, use the following compile-time flags.

To enable NAPI, compile the driver module, passing in a configuration option:

     make CFLAGS_EXTRA=-DE1000_NAPI install

To disable NAPI, compile the driver module, passing in a configuration option:

     make CFLAGS_EXTRA=-DE1000_NO_NAPI install

See www.cyberus.ca/~hadi/usenix-paper.tgz for more information on NAPI.


Known Issues
============

Jumbo Frames System Requirement
-------------------------------

Memory allocation failures have been observed on Linux systems with 64 MB
of RAM or less that are running Jumbo Frames.  If you are using Jumbo
Frames, your system may require more than the advertised minimum
requirement of 64 MB of system memory.

Performance Degradation with Jumbo Frames
-----------------------------------------

Degradation in throughput performance may be observed in some Jumbo frames
environments.  If this is observed, increasing the application's socket
buffer size and/or increasing the /proc/sys/net/ipv4/tcp_*mem entry values
may help.  See the specific application manual and
/usr/src/linux*/Documentation/networking/ip-sysctl.txt for more details.

Jumbo frames on Foundry BigIron 8000 switch
-------------------------------------------
There is a known issue using Jumbo frames when connected to a Foundry
BigIron 8000 switch.  This is a 3rd party limitation.  If you experience
loss of packets, lower the MTU size.

Multiple Interfaces on Same Ethernet Broadcast Network
------------------------------------------------------

Due to the default ARP behavior on Linux, it is not possible to have
one system on two IP networks in the same Ethernet broadcast domain
(non-partitioned switch) behave as expected.  All Ethernet interfaces
will respond to IP traffic for any IP address assigned to the system.
This results in unbalanced receive traffic.

If you have multiple interfaces in a server, either turn on ARP
filtering by entering:

    echo 1 > /proc/sys/net/ipv4/conf/all/arp_filter
(this only works if your kernel's version is higher than 2.4.5),

NOTE: This setting is not saved across reboots.  The configuration
change can be made permanent by adding the line:
    net.ipv4.conf.all.arp_filter = 1
to the file /etc/sysctl.conf

     or,

install the interfaces in separate broadcast domains (either in
different switches or in a switch partitioned to VLANs).

82541/82547 can't link or are slow to link with some link partners
------------------------------------------------------------------

There is a known compatibility issue with 82541/82547 and some
low-end switches where the link will not be established, or will
be slow to establish.  In particular, these switches are known to
be incompatible with 82541/82547:

    Planex FXG-08TE
    I-O Data ETG-SH8

To work around this issue, the driver can be compiled with an override
of the PHY's master/slave setting.  Forcing master or forcing slave
mode will improve time-to-link.

    # make EXTRA_CFLAGS=-DE1000_MASTER_SLAVE=<n>

Where <n> is:

    0 = Hardware default
    1 = Master mode
    2 = Slave mode
    3 = Auto master/slave

Disable rx flow control with ethtool
------------------------------------

In order to disable receive flow control using ethtool, you must turn
off auto-negotiation on the same command line.

For example:

   ethtool -A eth? autoneg off rx off


Support
=======

For general information, go to the Intel support website at:

    http://support.intel.com

or the Intel Wired Networking project hosted by Sourceforge at:

    http://sourceforge.net/projects/e1000

If an issue is identified with the released source code on the supported
kernel with a supported adapter, email the specific information related
to the issue to e1000-devel@lists.sourceforge.net


License
=======

This software program is released under the terms of a license agreement
between you ('Licensee') and Intel.  Do not use or load this software or any
associated materials (collectively, the 'Software') until you have carefully
read the full terms and conditions of the file COPYING located in this software
package.  By loading or using the Software, you agree to the terms of this
Agreement.  If you do not agree with the terms of this Agreement, do not
install or use the Software.

* Other names and brands may be claimed as the property of others.