ia64/linux-2.6.18-xen.hg

view Documentation/pm.txt @ 897:329ea0ccb344

balloon: try harder to balloon up under memory pressure.

Currently if the balloon driver is unable to increase the guest's
reservation it assumes the failure was due to reaching its full
allocation, gives up on the ballooning operation and records the limit
it reached as the "hard limit". The driver will not try again until
the target is set again (even to the same value).

However it is possible that ballooning has in fact failed due to
memory pressure in the host and therefore it is desirable to keep
attempting to reach the target in case memory becomes available. The
most likely scenario is that some guests are ballooning down while
others are ballooning up and therefore there is temporary memory
pressure while things stabilise. You would not expect a well behaved
toolstack to ask a domain to balloon to more than its allocation nor
would you expect it to deliberately over-commit memory by setting
balloon targets which exceed the total host memory.

This patch drops the concept of a hard limit and causes the balloon
driver to retry increasing the reservation on a timer in the same
manner as when decreasing the reservation.

Also if we partially succeed in increasing the reservation
(i.e. receive less pages than we asked for) then we may as well keep
those pages rather than returning them to Xen.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Fri Jun 05 14:01:20 2009 +0100 (2009-06-05)
parents 831230e53067
children
Linux Power Management Support

This document briefly describes how to use power management with your
Linux system and how to add power management support to Linux drivers.

APM or ACPI?
------------
If you have a relatively recent x86 mobile, desktop, or server system,
odds are it supports either Advanced Power Management (APM) or
Advanced Configuration and Power Interface (ACPI). ACPI is the newer
of the two technologies and puts power management in the hands of the
operating system, allowing for more intelligent power management than
is possible with BIOS-controlled APM.

The best way to determine which, if either, your system supports is to
build a kernel with both ACPI and APM enabled (as of 2.3.x ACPI is
enabled by default). If a working ACPI implementation is found, the
ACPI driver will override and disable APM; otherwise the APM driver
will be used.

No, sorry, you cannot have both ACPI and APM enabled and running at
once. Some people with broken ACPI or broken APM implementations
would like to use both to get a full set of working features, but you
simply cannot mix and match the two. Only one power management
interface can be in control of the machine at once. Think about it.

User-space Daemons
------------------
Both APM and ACPI rely on user-space daemons, apmd and acpid
respectively, to be completely functional. Obtain both of these
daemons from your Linux distribution or from the Internet (see below)
and be sure that they are started at some point during the system boot
process. Go ahead and start both. If ACPI or APM is not available on
your system, the associated daemon will exit gracefully.

  apmd:   http://worldvisions.ca/~apenwarr/apmd/
  acpid:  http://acpid.sf.net/

Driver Interface -- OBSOLETE, DO NOT USE!
----------------*************************

Note: pm_register(), pm_access(), pm_dev_idle() and friends are
obsolete. Please do not use them. Instead you should properly hook
your driver into the driver model, and use its suspend()/resume()
callbacks to do this kind of stuff.

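The recommendation in the note can be pictured with a small userspace sketch. It is not kernel code: struct demo_device, struct demo_driver, and the demo_core_* helpers are hypothetical stand-ins invented for illustration, and the kernel's real driver-model structures and callback signatures differ and are not reproduced here. The sketch only shows the shape of the idea: the driver supplies suspend and resume callbacks in a table, and core code invokes them.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the kernel's real driver-model types differ. */
struct demo_device {
    int reg;        /* pretend hardware register */
    int saved_reg;  /* copy taken at suspend time */
    int powered;    /* 1 while the device has power */
};

struct demo_driver {
    const char *name;
    int (*suspend)(struct demo_device *dev);
    int (*resume)(struct demo_device *dev);
};

static int demo_suspend(struct demo_device *dev)
{
    dev->saved_reg = dev->reg;  /* save context before power is cut */
    dev->reg = 0;               /* simulate the register losing its value */
    dev->powered = 0;
    return 0;
}

static int demo_resume(struct demo_device *dev)
{
    dev->powered = 1;
    dev->reg = dev->saved_reg;  /* restore the saved context */
    return 0;
}

struct demo_driver demo_drv = { "demo", demo_suspend, demo_resume };

/* Stand-ins for the core code that calls into drivers. */
int demo_core_suspend(struct demo_driver *drv, struct demo_device *dev)
{
    return drv->suspend ? drv->suspend(dev) : 0;
}

int demo_core_resume(struct demo_driver *drv, struct demo_device *dev)
{
    return drv->resume ? drv->resume(dev) : 0;
}
```

With a device whose register holds 42, calling demo_core_suspend() and then demo_core_resume() on demo_drv leaves the register at 42 again, which is the whole contract a driver has to meet.
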
If you are writing a new driver or maintaining an old driver, it
should include power management support. Without power management
support, a single driver may prevent a system with power management
capabilities from ever being able to suspend (safely).

Overview:
1) Register each instance of a device with "pm_register"
2) Call "pm_access" before accessing the hardware.
   (this will ensure that the hardware is awake and ready)
3) Your "pm_callback" is called before going into a
   suspend state (ACPI D1-D3) or after resuming (ACPI D0)
   from a suspend.
4) Call "pm_dev_idle" when the device is not being used
   (optional but will improve device idle detection)
5) When unloaded, unregister the device with "pm_unregister"

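The five steps above can be walked through in ordinary userspace C. This is a sketch under loud assumptions: the pm_* functions below are minimal mocks, the struct pm_dev layout and the PM_SYS_DEV value are made up, and the pm_access()/pm_dev_idle() signatures are guesses, since this document does not show them. The demo_* driver functions are hypothetical. The interface itself is obsolete, so this is only a reading aid.

```c
#include <stddef.h>

/* Userspace mocks of the obsolete interface. Field layout, constant
 * values, and the pm_access/pm_dev_idle signatures are assumptions. */
typedef int pm_dev_t;
typedef int pm_request_t;
enum { PM_SYS_DEV = 1 };

struct pm_dev;
typedef int (*pm_callback)(struct pm_dev *dev, pm_request_t rqst, void *data);

struct pm_dev {
    pm_dev_t type;
    unsigned long id;
    pm_callback callback;
    int awake;       /* mock state: set once pm_access() has run */
    int idle_hints;  /* mock state: number of pm_dev_idle() calls */
};

static struct pm_dev mock_dev;   /* this mock supports a single device */
static int mock_registered;

struct pm_dev *pm_register(pm_dev_t type, unsigned long id, pm_callback cback)
{
    if (mock_registered)
        return NULL;             /* no free slot in the mock */
    mock_dev.type = type;
    mock_dev.id = id;
    mock_dev.callback = cback;
    mock_dev.awake = 0;
    mock_dev.idle_hints = 0;
    mock_registered = 1;
    return &mock_dev;
}

void pm_unregister(struct pm_dev *dev)
{
    if (dev == &mock_dev)
        mock_registered = 0;
}

void pm_access(struct pm_dev *dev)   { dev->awake = 1; }
void pm_dev_idle(struct pm_dev *dev) { dev->idle_hints++; }

/* Hypothetical driver built on the mocks above. */
static struct pm_dev *demo_pm;
static int demo_reg;             /* pretend hardware register */

static int demo_callback(struct pm_dev *dev, pm_request_t rqst, void *data)
{
    (void)dev; (void)rqst; (void)data;
    return 0;                    /* step 3: accept every request */
}

int demo_load(void)              /* step 1: register the device */
{
    demo_pm = pm_register(PM_SYS_DEV, 0, demo_callback);
    return demo_pm ? 0 : -1;
}

int demo_write(int value)        /* steps 2 and 4 around a hardware access */
{
    pm_access(demo_pm);          /* make sure the hardware is awake */
    demo_reg = value;
    pm_dev_idle(demo_pm);        /* hint that the device is idle again */
    return demo_reg;
}

void demo_unload(void)           /* step 5: unregister on unload */
{
    pm_unregister(demo_pm);
    demo_pm = NULL;
}
```
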
/*
 * Description: Register a device with the power-management subsystem
 *
 * Parameters:
 *   type - device type (PCI device, system device, ...)
 *   id - instance number or unique identifier
 *   cback - request handler callback (suspend, resume, ...)
 *
 * Returns: Registered PM device or NULL on error
 *
 * Examples:
 *   dev = pm_register(PM_SYS_DEV, PM_SYS_VGA, vga_callback);
 *
 *   struct pci_dev *pci_dev = pci_find_dev(...);
 *   dev = pm_register(PM_PCI_DEV, PM_PCI_ID(pci_dev), callback);
 */
struct pm_dev *pm_register(pm_dev_t type, unsigned long id, pm_callback cback);

/*
 * Description: Unregister a device from the power management subsystem
 *
 * Parameters:
 *   dev - PM device previously returned from pm_register
 */
void pm_unregister(struct pm_dev *dev);

/*
 * Description: Unregister all devices with a matching callback function
 *
 * Parameters:
 *   cback - previously registered request callback
 *
 * Notes: Provided for easier porting from old APM interface
 */
void pm_unregister_all(pm_callback cback);

/*
 * Power management request callback
 *
 * Parameters:
 *   dev - PM device previously returned from pm_register
 *   rqst - request type
 *   data - data, if any, associated with the request
 *
 * Returns: 0 if the request is successful
 *          EINVAL if the request is not supported
 *          EBUSY if the device is now busy and can not handle the request
 *          ENOMEM if the device was unable to handle the request due to
 *                 a lack of memory
 *
 * Details: The device request callback will be called before the
 *          device/system enters a suspend state (ACPI D1-D3) or
 *          after the device/system resumes from suspend (ACPI D0).
 *          For PM_SUSPEND, the ACPI D-state being entered is passed
 *          as the "data" argument to the callback. The device
 *          driver should save (PM_SUSPEND) or restore (PM_RESUME)
 *          device context when the request callback is called.
 *
 *          Once a driver returns 0 (success) from a suspend
 *          request, it should not process any further requests or
 *          access the device hardware until a call to "pm_access" is made.
 */
typedef int (*pm_callback)(struct pm_dev *dev, pm_request_t rqst, void *data);

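A callback following the contract described above might look like the sketch below. It is a userspace illustration, not kernel code: the numeric values of PM_SUSPEND and PM_RESUME, the one-field struct pm_dev, and everything named demo_* are assumptions made so the example compiles on its own. Error codes are returned as negative errno values, matching the -EBUSY used in the Q&A later in this document.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed stand-ins so the sketch is self-contained. */
typedef int pm_request_t;
enum { PM_SUSPEND = 1, PM_RESUME = 2 };  /* values are assumptions */
struct pm_dev { void *data; };           /* assumed per-device pointer */

#define DEMO_NREGS 4

struct demo_hw {                 /* hypothetical device state */
    int regs[DEMO_NREGS];        /* live "hardware" registers */
    int saved[DEMO_NREGS];       /* context saved at suspend */
    int busy;                    /* non-zero while a transfer is running */
    int suspended;
};

int demo_pm_callback(struct pm_dev *dev, pm_request_t rqst, void *data)
{
    struct demo_hw *hw = dev->data;
    int i;

    (void)data;  /* for PM_SUSPEND this carries the D-state; unused here */

    switch (rqst) {
    case PM_SUSPEND:
        if (hw->busy)
            return -EBUSY;       /* refuse; the suspend is abandoned */
        for (i = 0; i < DEMO_NREGS; i++)
            hw->saved[i] = hw->regs[i];   /* save everything */
        hw->suspended = 1;
        return 0;
    case PM_RESUME:
        for (i = 0; i < DEMO_NREGS; i++)
            hw->regs[i] = hw->saved[i];   /* restore everything */
        hw->suspended = 0;
        return 0;
    default:
        return -EINVAL;          /* request type not supported */
    }
}
```

Saving the whole register file regardless of D-state is the "play it safe" choice; a driver that inspects the D-state in "data" could save less for D1 and D2.
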
Driver Details
--------------
This is just a quick Q&A as a stopgap until a real driver writers'
power management guide is available.

Q: When is a device suspended?

Devices can be suspended based on direct user request (e.g. laptop lid
closes), system power policy (e.g. sleep after 30 minutes of console
inactivity), or device power policy (e.g. power down device after 5
minutes of inactivity).

Q: Must a driver honor a suspend request?

No, a driver can return -EBUSY from a suspend request and this
will stop the system from suspending. When a suspend request
fails, all suspended devices are resumed and the system continues
to run. Suspend can be retried at a later time.

Q: Can the driver block suspend/resume requests?

Yes, a driver can delay its return from a suspend or resume
request until the device is ready to handle requests. It
is advantageous to return as quickly as possible from a
request because suspend and resume requests are processed serially.

Q: What context is a suspend/resume initiated from?

A suspend or resume is initiated from a kernel thread context.
It is safe to block, allocate memory, initiate requests
or anything else you can do within the kernel.

Q: Will requests continue to arrive after a suspend?

Possibly. It is the driver's responsibility to queue(*),
fail, or drop any requests that arrive after returning
success to a suspend request. It is important that the
driver not access its device until after it receives
a resume request as the device's bus may no longer
be active.

(*) If a driver queues requests for processing after
    resume be aware that the device, network, etc.
    might be in a different state than at suspend time.
    It's probably better to drop requests unless
    the driver is a storage device.

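The three options named in the answer above (queue, fail, or drop) can be sketched as a small request gate. Everything here is hypothetical and lives in userspace: struct demo_dev, the policy enum, the return-value convention, and the choice of -EAGAIN and -EBUSY are illustration only, not taken from any kernel driver. The point is that nothing touches the hardware while the suspended flag is set.

```c
#include <errno.h>

enum demo_policy { DEMO_QUEUE, DEMO_FAIL, DEMO_DROP };

#define DEMO_QUEUE_LEN 8

struct demo_dev {                /* hypothetical driver state */
    int suspended;
    enum demo_policy policy;
    int queue[DEMO_QUEUE_LEN];   /* requests held back during suspend */
    int queued;
    int hw_writes;               /* counts accesses to the "hardware" */
};

/* Returns 0 if the request was handled or silently dropped, 1 if it was
 * queued, or a negative errno value if it was refused. */
int demo_submit(struct demo_dev *d, int req)
{
    if (!d->suspended) {
        d->hw_writes++;          /* normal path: touch the hardware */
        return 0;
    }
    switch (d->policy) {
    case DEMO_QUEUE:
        if (d->queued == DEMO_QUEUE_LEN)
            return -EBUSY;       /* queue full */
        d->queue[d->queued++] = req;
        return 1;
    case DEMO_FAIL:
        return -EAGAIN;          /* caller may retry after resume */
    case DEMO_DROP:
    default:
        return 0;                /* discard without touching hardware */
    }
}

/* On resume, replay whatever was queued. Returns the number replayed. */
int demo_resume(struct demo_dev *d)
{
    int i, replayed = d->queued;

    d->suspended = 0;
    for (i = 0; i < replayed; i++)
        demo_submit(d, d->queue[i]);
    d->queued = 0;
    return replayed;
}
```

As the footnote warns, replaying stale requests is only sensible when they are still meaningful after resume, which is why a storage driver would pick DEMO_QUEUE and most others DEMO_DROP or DEMO_FAIL.
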
Q: Do I have to manage bus-specific power management registers?

No. It is the responsibility of the bus driver to manage
PCI, USB, etc. power management registers. The bus driver
or the power management subsystem will also enable any
wake-on functionality that the device has.

Q: So, really, what do I need to do to support suspend/resume?

You need to save any device context that would
be lost if the device was powered off and then restore
it at resume time. When ACPI is active, there are
three levels of device suspend states: D1, D2, and D3.
(The suspend state is passed as the "data" argument
to the device callback.) With D3, the device is powered
off and loses all context; D1 and D2 are shallower power
states and require less device context to be saved. To
play it safe, just save everything at suspend and restore
everything at resume.

Q: Where do I store device context for suspend?

Anywhere in memory: kmalloc a buffer or store it
in the device descriptor. You are guaranteed that the
contents of memory will be restored and accessible
before resume, even when the system suspends to disk.

Q: What do I need to do for ACPI vs. APM vs. etc?

Drivers need not be aware of the specific power management
technology that is active. They just need to be aware
of when the overlying power management system requests
that they suspend or resume.

Q: What about device dependencies?

When a driver registers a device, the power management
subsystem uses the information provided to build a
tree of device dependencies (e.g. USB device X is on
USB controller Y which is on PCI bus Z). When power
management wants to suspend a device, it first sends
a suspend request to its driver, then the bus driver,
and so on up to the system bus. Device resumes
proceed in the opposite direction.

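The ordering described in the answer above amounts to walking parent links. The sketch below assumes a hypothetical struct demo_node with a single parent pointer; how the real subsystem represents its dependency tree is not described in this document, so this only illustrates the order of requests along one path.

```c
#include <stddef.h>

struct demo_node {               /* hypothetical dependency-tree node */
    const char *name;
    struct demo_node *parent;    /* bus or controller the device sits on */
};

/* Fill order[] with the suspend sequence for one device: the device
 * itself first, then each ancestor up to the root. At most max entries
 * are written. Returns the number of entries. */
size_t demo_suspend_order(struct demo_node *leaf,
                          struct demo_node **order, size_t max)
{
    size_t n = 0;

    for (; leaf != NULL && n < max; leaf = leaf->parent)
        order[n++] = leaf;
    return n;
}

/* Resume uses the same path reversed: root first, device last. */
size_t demo_resume_order(struct demo_node *leaf,
                         struct demo_node **order, size_t max)
{
    size_t n = demo_suspend_order(leaf, order, max);
    size_t i;

    for (i = 0; i < n / 2; i++) {
        struct demo_node *tmp = order[i];

        order[i] = order[n - 1 - i];
        order[n - 1 - i] = tmp;
    }
    return n;
}
```

For the example in the text, the suspend order is USB device X, USB controller Y, PCI bus Z, and the resume order is PCI bus Z, USB controller Y, USB device X.
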
Q: Who do I contact for additional information about
   enabling power management for my specific driver/device?

ACPI Development mailing list: linux-acpi@vger.kernel.org

System Interface -- OBSOLETE, DO NOT USE!
----------------*************************
If you are providing new power management support to Linux (i.e.
adding support for something like APM or ACPI), you should
communicate with drivers through the existing generic power
management interface.

/*
 * Send a request to all devices
 *
 * Parameters:
 *   rqst - request type
 *   data - data, if any, associated with the request
 *
 * Returns: 0 if the request is successful
 *          See "pm_callback" return for errors
 *
 * Details: Walk list of registered devices and call pm_send
 *          for each until complete or an error is encountered.
 *          If an error is encountered for a suspend request,
 *          return all devices to the state they were in before
 *          the suspend request.
 */
int pm_send_all(pm_request_t rqst, void *data);

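The walk-and-roll-back behaviour described in the comment above can be sketched in userspace as follows. This is not the kernel's pm_send_all(): demo_send_all(), the singly linked list, the struct pm_dev fields, the constant values, and the two sample callbacks are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>

typedef int pm_request_t;
enum { PM_SUSPEND = 1, PM_RESUME = 2 };  /* values are assumptions */

struct pm_dev;
typedef int (*pm_callback)(struct pm_dev *dev, pm_request_t rqst, void *data);

struct pm_dev {                  /* assumed layout */
    pm_callback callback;
    struct pm_dev *next;         /* assumed: singly linked device list */
    int suspended;               /* mock state for the example */
};

static struct pm_dev *demo_devs; /* head of the mock registration list */

void demo_add(struct pm_dev *dev, pm_callback cb)  /* hypothetical helper */
{
    dev->callback = cb;
    dev->suspended = 0;
    dev->next = demo_devs;       /* push at the front of the list */
    demo_devs = dev;
}

int demo_send_all(pm_request_t rqst, void *data)
{
    struct pm_dev *dev, *undo;
    int err;

    for (dev = demo_devs; dev != NULL; dev = dev->next) {
        err = dev->callback(dev, rqst, data);
        if (err == 0)
            continue;
        if (rqst == PM_SUSPEND) {
            /* Roll back: resume every device that already suspended. */
            for (undo = demo_devs; undo != dev; undo = undo->next)
                undo->callback(undo, PM_RESUME, NULL);
        }
        return err;              /* stop at the first error */
    }
    return 0;
}

/* Two sample callbacks for trying the sketch out. */
int demo_ok(struct pm_dev *dev, pm_request_t rqst, void *data)
{
    (void)data;
    dev->suspended = (rqst == PM_SUSPEND);
    return 0;
}

int demo_busy(struct pm_dev *dev, pm_request_t rqst, void *data)
{
    (void)dev; (void)data;
    return rqst == PM_SUSPEND ? -EBUSY : 0;
}
```

If the second of three devices answers with demo_busy, demo_send_all(PM_SUSPEND, NULL) returns -EBUSY, the first device is resumed again, and the third is never asked.
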
/*
 * Find a matching device
 *
 * Parameters:
 *   type - device type (PCI device, system device, or 0 to match all devices)
 *   from - previous match or NULL to start from the beginning
 *
 * Returns: Matching device or NULL if none found
 */
struct pm_dev *pm_find(pm_dev_t type, struct pm_dev *from);
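
The "from" parameter makes pm_find() an iterator: pass NULL to get the first match, then pass each result back in to get the next. The userspace sketch below mimics that calling convention with a hypothetical demo_find() over an assumed linked list; the struct pm_dev layout, the constant values, and the demo_* names are not from the kernel.

```c
#include <stddef.h>

typedef int pm_dev_t;
enum { PM_SYS_DEV = 1, PM_PCI_DEV = 2 };  /* values are assumptions */

struct pm_dev {                  /* assumed layout */
    pm_dev_t type;
    struct pm_dev *next;
};

static struct pm_dev *demo_list; /* head of the mock device list */

void demo_list_add(struct pm_dev *dev, pm_dev_t type)  /* hypothetical */
{
    dev->type = type;
    dev->next = demo_list;       /* push at the front of the list */
    demo_list = dev;
}

/* Mock with the same calling convention as pm_find(): type 0 matches
 * every device, and from == NULL starts at the beginning. */
struct pm_dev *demo_find(pm_dev_t type, struct pm_dev *from)
{
    struct pm_dev *dev = from ? from->next : demo_list;

    while (dev != NULL && type != 0 && dev->type != type)
        dev = dev->next;
    return dev;
}

/* Typical caller: visit every device of one type. */
int demo_count(pm_dev_t type)
{
    struct pm_dev *dev = NULL;
    int n = 0;

    while ((dev = demo_find(type, dev)) != NULL)
        n++;
    return n;
}
```
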