balloon: try harder to balloon up under memory pressure.

Currently, if the balloon driver is unable to increase the guest's
reservation, it assumes the failure was due to reaching its full
allocation, gives up on the ballooning operation and records the limit
it reached as the "hard limit". The driver will not try again until
the target is set again (even to the same value).

However, it is possible that ballooning has in fact failed due to
memory pressure in the host, and it is therefore desirable to keep
attempting to reach the target in case memory becomes available. The
most likely scenario is that some guests are ballooning down while
others are ballooning up, so there is temporary memory
pressure while things stabilise. You would not expect a well-behaved
toolstack to ask a domain to balloon to more than its allocation, nor
would you expect it to deliberately over-commit memory by setting
balloon targets which exceed the total host memory.

This patch drops the concept of a hard limit and causes the balloon
driver to retry increasing the reservation on a timer in the same
manner as when decreasing the reservation.

Also, if we partially succeed in increasing the reservation
(i.e. receive fewer pages than we asked for), then we may as well keep
those pages rather than returning them to Xen.
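
The behaviour described above can be sketched roughly as follows. This is
a simplified user-space model, not the driver code: struct fake_host,
fake_populate(), struct balloon and balloon_step() are hypothetical names
invented for illustration, and only the "balloon up" direction is modelled.

```c
#include <assert.h>

/* Hypothetical stand-in for the host: it can hand out at most
 * host_free pages, however many a guest asks for. */
struct fake_host {
    unsigned long host_free;
};

/* Hypothetical stand-in for the reservation-increase call. Returns the
 * number of pages actually granted, which may be fewer than requested. */
static unsigned long fake_populate(struct fake_host *host,
                                   unsigned long nr_pages)
{
    unsigned long granted =
        nr_pages < host->host_free ? nr_pages : host->host_free;

    host->host_free -= granted;
    return granted;
}

struct balloon {
    unsigned long current_pages;
    unsigned long target_pages;
};

/* One pass of the balloon worker. Returns 1 if the caller should re-arm
 * the retry timer, 0 once the target has been reached. */
static int balloon_step(struct balloon *b, struct fake_host *host)
{
    assert(b != 0 && host != 0);

    if (b->current_pages < b->target_pages) {
        unsigned long want = b->target_pages - b->current_pages;

        /* Keep whatever was granted, even if it is fewer than `want`. */
        b->current_pages += fake_populate(host, want);
    }

    /* No hard limit is recorded: keep retrying while we are short. */
    return b->current_pages != b->target_pages;
}
```

In this model a caller would invoke balloon_step() from a timer callback and
re-arm the timer for as long as it returns 1, so a guest that was short of
its target picks up pages as soon as the host has some to spare.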

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Fri Jun 05 14:01:20 2009 +0100 (2009-06-05)
parents 3e8752eb6d9c
config DEFCONFIG_LIST
	string
	depends on !UML
	option defconfig_list
	default "/lib/modules/$UNAME_RELEASE/.config"
	default "/etc/kernel-config"
	default "/boot/config-$UNAME_RELEASE"
	default "arch/$ARCH/defconfig"

menu "Code maturity level options"

config EXPERIMENTAL
	bool "Prompt for development and/or incomplete code/drivers"
	---help---
	  Some of the various things that Linux supports (such as network
	  drivers, file systems, network protocols, etc.) can be in a state
	  of development where the functionality, stability, or the level of
	  testing is not yet high enough for general use. This is usually
	  known as the "alpha-test" phase among developers. If a feature is
	  currently in alpha-test, then the developers usually discourage
	  uninformed widespread use of this feature by the general public to
	  avoid "Why doesn't this work?" type mail messages. However, active
	  testing and use of these systems is welcomed. Just be aware that it
	  may not meet the normal level of reliability or it may fail to work
	  in some special cases. Detailed bug reports from people familiar
	  with the kernel internals are usually welcomed by the developers
	  (before submitting bug reports, please read the documents
	  <file:README>, <file:MAINTAINERS>, <file:REPORTING-BUGS>,
	  <file:Documentation/BUG-HUNTING>, and
	  <file:Documentation/oops-tracing.txt> in the kernel source).

	  This option will also make obsoleted drivers available. These are
	  drivers that have been replaced by something else, and/or are
	  scheduled to be removed in a future kernel release.

	  Unless you intend to help test and develop a feature or driver that
	  falls into this category, or you have a situation that requires
	  using these features, you should probably say N here, which will
	  cause the configurator to present you with fewer choices. If
	  you say Y here, you will be offered the choice of using features or
	  drivers that are currently considered to be in the alpha-test phase.

config BROKEN
	bool

config BROKEN_ON_SMP
	bool
	depends on BROKEN || !SMP
	default y

config LOCK_KERNEL
	bool
	depends on SMP || PREEMPT
	default y

config INIT_ENV_ARG_LIMIT
	int
	default 32 if !UML
	default 128 if UML
	help
	  Maximum of each of the number of arguments and environment
	  variables passed to init from the kernel command line.

endmenu

menu "General setup"

config LOCALVERSION
	string "Local version - append to kernel release"
	help
	  Append an extra string to the end of your kernel version.
	  This will show up when you type uname, for example.
	  The string you set here will be appended after the contents of
	  any files with a filename matching localversion* in your
	  object and source tree, in that order. Your total string can
	  be a maximum of 64 characters.

config LOCALVERSION_AUTO
	bool "Automatically append version information to the version string"
	default y
	help
	  This will try to automatically determine if the current tree is a
	  release tree by looking for git tags that
	  belong to the current top of tree revision.

	  A string of the format -gxxxxxxxx will be added to the localversion
	  if a git based tree is found. The string generated by this will be
	  appended after any matching localversion* files, and after the value
	  set in CONFIG_LOCALVERSION.

	  Note: This requires Perl, and a git repository, but not necessarily
	  the git or cogito tools to be installed.

config SWAP
	bool "Support for paging of anonymous memory (swap)"
	depends on MMU
	default y
	help
	  This option allows you to choose whether you want to have support
	  for so-called swap devices or swap files in your kernel that are
	  used to provide more virtual memory than the actual RAM present
	  in your computer. If unsure say Y.

config SYSVIPC
	bool "System V IPC"
	---help---
	  Inter Process Communication is a suite of library functions and
	  system calls which let processes (running programs) synchronize and
	  exchange information. It is generally considered to be a good thing,
	  and some programs won't run unless you say Y here. In particular, if
	  you want to run the DOS emulator dosemu under Linux (read the
	  DOSEMU-HOWTO, available from <http://www.tldp.org/docs.html#howto>),
	  you'll need to say Y here.

	  You can find documentation about IPC with "info ipc" and also in
	  section 6.4 of the Linux Programmer's Guide, available from
	  <http://www.tldp.org/guides.html>.

config POSIX_MQUEUE
	bool "POSIX Message Queues"
	depends on NET && EXPERIMENTAL
	---help---
	  The POSIX variant of message queues is a part of IPC. In POSIX
	  message queues every message has a priority which determines the
	  order in which it is received by a process. If you want to compile
	  and run programs written e.g. for Solaris that use its POSIX message
	  queues (functions mq_*) say Y here. To use this feature you will
	  also need the mqueue library, available from
	  <http://www.mat.uni.torun.pl/~wrona/posix_ipc/>

	  POSIX message queues are visible as a filesystem called 'mqueue'
	  and can be mounted somewhere if you want to do filesystem
	  operations on message queues.

	  If unsure, say Y.

config BSD_PROCESS_ACCT
	bool "BSD Process Accounting"
	help
	  If you say Y here, a user level program will be able to instruct the
	  kernel (via a special system call) to write process accounting
	  information to a file: whenever a process exits, information about
	  that process will be appended to the file by the kernel. The
	  information includes things such as creation time, owning user,
	  command name, memory usage, controlling terminal etc. (the complete
	  list is in the struct acct in <file:include/linux/acct.h>). It is
	  up to the user level program to do useful things with this
	  information. This is generally a good idea, so say Y.

config BSD_PROCESS_ACCT_V3
	bool "BSD Process Accounting version 3 file format"
	depends on BSD_PROCESS_ACCT
	default n
	help
	  If you say Y here, the process accounting information is written
	  in a new file format that also logs the process IDs of each
	  process and its parent. Note that this file format is incompatible
	  with previous v0/v1/v2 file formats, so you will need updated tools
	  for processing it. A preliminary version of these tools is available
	  at <http://www.physik3.uni-rostock.de/tim/kernel/utils/acct/>.

config TASKSTATS
	bool "Export task/process statistics through netlink (EXPERIMENTAL)"
	depends on NET
	default n
	help
	  Export selected statistics for tasks/processes through the
	  generic netlink interface. Unlike BSD process accounting, the
	  statistics are available during the lifetime of tasks/processes as
	  responses to commands. Like BSD accounting, they are sent to user
	  space on task exit.

	  Say N if unsure.

config TASK_DELAY_ACCT
	bool "Enable per-task delay accounting (EXPERIMENTAL)"
	depends on TASKSTATS
	help
	  Collect information on time spent by a task waiting for system
	  resources like cpu, synchronous block I/O completion and swapping
	  in pages. Such statistics can help in setting a task's priorities
	  relative to other tasks for cpu, io, rss limits etc.

	  Say N if unsure.

config AUDIT
	bool "Auditing support"
	depends on NET
	help
	  Enable auditing infrastructure that can be used with another
	  kernel subsystem, such as SELinux (which requires this for
	  logging of avc messages output). Does not do system-call
	  auditing without CONFIG_AUDITSYSCALL.

config AUDITSYSCALL
	bool "Enable system-call auditing support"
	depends on AUDIT && (X86 || PPC || PPC64 || S390 || IA64 || UML || SPARC64)
	default y if SECURITY_SELINUX
	help
	  Enable low-overhead system-call auditing infrastructure that
	  can be used independently or with another kernel subsystem,
	  such as SELinux. To use audit's filesystem watch feature, please
	  ensure that INOTIFY is configured.

config IKCONFIG
	bool "Kernel .config support"
	---help---
	  This option enables the complete Linux kernel ".config" file
	  contents to be saved in the kernel. It provides documentation
	  of which kernel options are used in a running kernel or in an
	  on-disk kernel. This information can be extracted from the kernel
	  image file with the script scripts/extract-ikconfig and used as
	  input to rebuild the current kernel or to build another kernel.
	  It can also be extracted from a running kernel by reading
	  /proc/config.gz if enabled (below).

config IKCONFIG_PROC
	bool "Enable access to .config through /proc/config.gz"
	depends on IKCONFIG && PROC_FS
	---help---
	  This option enables access to the kernel configuration file
	  through /proc/config.gz.

config CPUSETS
	bool "Cpuset support"
	depends on SMP
	help
	  This option will let you create and manage CPUSETs which
	  allow dynamically partitioning a system into sets of CPUs and
	  Memory Nodes and assigning tasks to run only within those sets.
	  This is primarily useful on large SMP or NUMA systems.

	  Say N if unsure.

config RELAY
	bool "Kernel->user space relay support (formerly relayfs)"
	help
	  This option enables relay interface support in
	  certain file systems (such as debugfs).
	  It is designed to provide an efficient mechanism for tools and
	  facilities to relay large amounts of data from kernel space to
	  user space.

	  If unsure, say N.

source "usr/Kconfig"

config CC_OPTIMIZE_FOR_SIZE
	bool "Optimize for size (Look out for broken compilers!)"
	default y
	depends on ARM || H8300 || EXPERIMENTAL
	help
	  Enabling this option will pass "-Os" instead of "-O2" to gcc
	  resulting in a smaller kernel.

	  WARNING: some versions of gcc may generate incorrect code with this
	  option. If problems are observed, a gcc upgrade may be needed.

	  If unsure, say N.

menuconfig EMBEDDED
	bool "Configure standard kernel features (for small systems)"
	help
	  This option allows certain base kernel options and settings
	  to be disabled or tweaked. This is for specialized
	  environments which can tolerate a "non-standard" kernel.
	  Only use this if you really know what you are doing.

config UID16
	bool "Enable 16-bit UID system calls" if EMBEDDED
	depends on ARM || CRIS || FRV || H8300 || X86_32 || M68K || (S390 && !64BIT) || SUPERH || SPARC32 || (SPARC64 && SPARC32_COMPAT) || UML || (X86_64 && IA32_EMULATION)
	default y
	help
	  This enables the legacy 16-bit UID syscall wrappers.

config SYSCTL
	bool "Sysctl support" if EMBEDDED
	default y
	---help---
	  The sysctl interface provides a means of dynamically changing
	  certain kernel parameters and variables on the fly without requiring
	  a recompile of the kernel or reboot of the system. The primary
	  interface consists of a system call, but if you say Y to "/proc
	  file system support", a tree of modifiable sysctl entries will be
	  generated beneath the /proc/sys directory. They are explained in the
	  files in <file:Documentation/sysctl/>. Note that enabling this
	  option will enlarge the kernel by at least 8 KB.

	  As it is generally a good thing, you should say Y here unless
	  building a kernel for install/rescue disks or your system is very
	  limited in memory.

config KALLSYMS
	bool "Load all symbols for debugging/kksymoops" if EMBEDDED
	default y
	help
	  Say Y here to let the kernel print out symbolic crash information and
	  symbolic stack backtraces. This increases the size of the kernel
	  somewhat, as all symbols have to be loaded into the kernel image.

config KALLSYMS_ALL
	bool "Include all symbols in kallsyms"
	depends on DEBUG_KERNEL && KALLSYMS
	help
	  Normally kallsyms only contains the symbols of functions, for nicer
	  OOPS messages. Some debuggers can use kallsyms for other
	  symbols too: say Y here to include all symbols, if you need them
	  and you don't care about adding 300k to the size of your kernel.

	  Say N.

config KALLSYMS_EXTRA_PASS
	bool "Do an extra kallsyms pass"
	depends on KALLSYMS
	help
	  If kallsyms is not working correctly, the build will fail with
	  inconsistent kallsyms data. If that occurs, log a bug report and
	  turn on KALLSYMS_EXTRA_PASS which should result in a stable build.
	  Always say N here unless you find a bug in kallsyms, which must be
	  reported. KALLSYMS_EXTRA_PASS is only a temporary workaround while
	  you wait for kallsyms to be fixed.

config HOTPLUG
	bool "Support for hot-pluggable devices" if EMBEDDED
	default y
	help
	  This option is provided for the case where no hotplug or uevent
	  capabilities are wanted in the kernel. You should only consider
	  disabling this option for embedded systems that do not use modules, a
	  dynamic /dev tree, or dynamic device discovery. Just say Y.

config PRINTK
	default y
	bool "Enable support for printk" if EMBEDDED
	help
	  This option enables normal printk support. Removing it
	  eliminates most of the message strings from the kernel image
	  and makes the kernel more or less silent. As this makes it
	  very difficult to diagnose system problems, saying N here is
	  strongly discouraged.

config BUG
	bool "BUG() support" if EMBEDDED
	default y
	help
	  Disabling this option eliminates support for BUG and WARN, reducing
	  the size of your kernel image and potentially quietly ignoring
	  numerous fatal conditions. You should only consider disabling this
	  option for embedded systems with no facilities for reporting errors.
	  Just say Y.

config ELF_CORE
	default y
	bool "Enable ELF core dumps" if EMBEDDED
	help
	  Enable support for generating core dumps. Disabling saves about 4k.

config BASE_FULL
	default y
	bool "Enable full-sized data structures for core" if EMBEDDED
	help
	  Disabling this option reduces the size of miscellaneous core
	  kernel data structures. This saves memory on small machines,
	  but may reduce performance.

config FUTEX
	bool "Enable futex support" if EMBEDDED
	default y
	select RT_MUTEXES
	help
	  Disabling this option will cause the kernel to be built without
	  support for "fast userspace mutexes". The resulting kernel may not
	  run glibc-based applications correctly.

config EPOLL
	bool "Enable eventpoll support" if EMBEDDED
	default y
	help
	  Disabling this option will cause the kernel to be built without
	  support for the epoll family of system calls.

config SHMEM
	bool "Use full shmem filesystem" if EMBEDDED
	default y
	depends on MMU
	help
	  shmem is an internal filesystem used to manage shared memory.
	  It is backed by swap and manages resource limits. It is also exported
	  to userspace as tmpfs if TMPFS is enabled. Disabling this
	  option replaces shmem and tmpfs with the much simpler ramfs code,
	  which may be appropriate on small systems without swap.

config SLAB
	default y
	bool "Use full SLAB allocator" if EMBEDDED
	help
	  Disabling this replaces the advanced SLAB allocator and
	  kmalloc support with the drastically simpler SLOB allocator.
	  SLOB is more space efficient but does not scale well and is
	  more susceptible to fragmentation.

config VM_EVENT_COUNTERS
	default y
	bool "Enable VM event counters for /proc/vmstat" if EMBEDDED
	help
	  VM event counters are only needed for event counts to be
	  shown. They have no function for the kernel itself. This
	  option allows the disabling of the VM event counters.
	  /proc/vmstat will only show page counts.

endmenu		# General setup

config RT_MUTEXES
	boolean
	select PLIST

config TINY_SHMEM
	default !SHMEM
	bool

config BASE_SMALL
	int
	default 0 if BASE_FULL
	default 1 if !BASE_FULL

config SLOB
	default !SLAB
	bool

menu "Loadable module support"

config MODULES
	bool "Enable loadable module support"
	help
	  Kernel modules are small pieces of compiled code which can
	  be inserted in the running kernel, rather than being
	  permanently built into the kernel. You use the "modprobe"
	  tool to add (and sometimes remove) them. If you say Y here,
	  many parts of the kernel can be built as modules (by
	  answering M instead of Y where indicated): this is most
	  useful for infrequently used options which are not required
	  for booting. For more information, see the man pages for
	  modprobe, lsmod, modinfo, insmod and rmmod.

	  If you say Y here, you will need to run "make
	  modules_install" to put the modules under /lib/modules/
	  where modprobe can find them (you may need to be root to do
	  this).

	  If unsure, say Y.

config MODULE_UNLOAD
	bool "Module unloading"
	depends on MODULES
	help
	  Without this option you will not be able to unload any
	  modules (note that some modules may not be unloadable
	  anyway), which makes your kernel slightly smaller and
	  simpler. If unsure, say Y.

config MODULE_FORCE_UNLOAD
	bool "Forced module unloading"
	depends on MODULE_UNLOAD && EXPERIMENTAL
	help
	  This option allows you to force a module to unload, even if the
	  kernel believes it is unsafe: the kernel will remove the module
	  without waiting for anyone to stop using it (using the -f option to
	  rmmod). This is mainly for kernel developers and desperate users.
	  If unsure, say N.

config MODVERSIONS
	bool "Module versioning support"
	depends on MODULES
	help
	  Usually, you have to use modules compiled with your kernel.
	  Saying Y here makes it sometimes possible to use modules
	  compiled for different kernels, by adding enough information
	  to the modules to (hopefully) spot any changes which would
	  make them incompatible with the kernel you are running. If
	  unsure, say N.

config MODULE_SRCVERSION_ALL
	bool "Source checksum for all modules"
	depends on MODULES
	help
	  Modules which contain a MODULE_VERSION get an extra "srcversion"
	  field inserted into their modinfo section, which contains a
	  sum of the source files which made it. This helps maintainers
	  see exactly which source was used to build a module (since
	  others sometimes change the module source without updating
	  the version). With this option, such a "srcversion" field
	  will be created for all modules. If unsure, say N.

config KMOD
	bool "Automatic kernel module loading"
	depends on MODULES
	help
	  Normally when you have selected some parts of the kernel to
	  be created as kernel modules, you must load them (using the
	  "modprobe" command) before you can use them. If you say Y
	  here, some parts of the kernel will be able to load modules
	  automatically: when a part of the kernel needs a module, it
	  runs modprobe with the appropriate arguments, thereby
	  loading the module if it is available. If unsure, say Y.

config STOP_MACHINE
	bool
	default y
	depends on (SMP && MODULE_UNLOAD) || HOTPLUG_CPU
	help
	  Need stop_machine() primitive.
endmenu

menu "Block layer"
source "block/Kconfig"
endmenu
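
The BASE_SMALL entry in the listing is an int symbol derived from the
BASE_FULL bool, so C code can use it directly in expressions instead of
testing it with #ifdef. A minimal sketch of that pattern follows. The macro
EXAMPLE_TABLE_ENTRIES, the function example_table_entries() and the sizes
64 and 4096 are hypothetical and chosen only for illustration. The fallback
definition of CONFIG_BASE_SMALL exists only so that the fragment compiles
outside a kernel tree, where the build system would normally supply it.

```c
#include <assert.h>

/* Normally supplied by the generated kernel configuration; defined here
 * as an assumption so the sketch builds standalone. */
#ifndef CONFIG_BASE_SMALL
#define CONFIG_BASE_SMALL 0
#endif

/* Hypothetical sizing macro: a small table when BASE_SMALL is 1,
 * a full-sized one when it is 0. The numbers are illustrative. */
#define EXAMPLE_TABLE_ENTRIES(base_small) ((base_small) ? 64 : 4096)

static unsigned int example_table_entries(void)
{
    /* The Kconfig entry only ever yields 0 or 1. */
    assert(CONFIG_BASE_SMALL == 0 || CONFIG_BASE_SMALL == 1);

    return EXAMPLE_TABLE_ENTRIES(CONFIG_BASE_SMALL);
}
```

Because the value is an ordinary integer constant, the compiler can fold the
conditional away, and the same source builds for both settings without
preprocessor branches at each use site.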