view Documentation/vm/overcommit-accounting @ 897:329ea0ccb344

balloon: try harder to balloon up under memory pressure.

Currently if the balloon driver is unable to increase the guest's
reservation it assumes the failure was due to reaching its full
allocation, gives up on the ballooning operation and records the limit
it reached as the "hard limit". The driver will not try again until
the target is set again (even to the same value).

However it is possible that ballooning has in fact failed due to
memory pressure in the host and therefore it is desirable to keep
attempting to reach the target in case memory becomes available. The
most likely scenario is that some guests are ballooning down while
others are ballooning up and therefore there is temporary memory
pressure while things stabilise. You would not expect a well behaved
toolstack to ask a domain to balloon to more than its allocation nor
would you expect it to deliberately over-commit memory by setting
balloon targets which exceed the total host memory.

This patch drops the concept of a hard limit and causes the balloon
driver to retry increasing the reservation on a timer in the same
manner as when decreasing the reservation.

Also if we partially succeed in increasing the reservation
(i.e. receive less pages than we asked for) then we may as well keep
those pages rather than returning them to Xen.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Fri Jun 05 14:01:20 2009 +0100 (2009-06-05)
parents 831230e53067
The Linux kernel supports the following overcommit handling modes:

0   -   Heuristic overcommit handling. Obvious overcommits of
        address space are refused. Used for a typical system. It
        ensures a seriously wild allocation fails while allowing
        overcommit to reduce swap usage. root is allowed to
        allocate slightly more memory in this mode. This is the
        default.

1   -   Always overcommit. Appropriate for some scientific
        applications.

2   -   Don't overcommit. The total address space commit
        for the system is not permitted to exceed swap + a
        configurable percentage (default is 50) of physical RAM.
        Depending on the percentage you use, in most situations
        this means a process will not be killed while accessing
        pages but will receive errors on memory allocation as
        appropriate.
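
The mode 2 limit described above can be illustrated with a little
arithmetic. The following is only a sketch of the rule as stated in this
text (swap plus a percentage of physical RAM); the function names are
hypothetical, and the kernel's real calculation may apply adjustments that
are not described here.

```python
def commit_limit_kb(ram_kb: int, swap_kb: int, overcommit_ratio: int = 50) -> int:
    """Mode 2 limit as stated in the text: swap + ratio% of physical RAM.

    All sizes are in kB. The default ratio of 50 matches the documented
    default; everything else about this helper is illustrative.
    """
    return swap_kb + ram_kb * overcommit_ratio // 100


def request_fits(committed_kb: int, request_kb: int, limit_kb: int) -> bool:
    """Return True if a new commitment would stay within the limit."""
    return committed_kb + request_kb <= limit_kb


# Example: 8 GiB of RAM, 2 GiB of swap, default ratio of 50.
limit = commit_limit_kb(8 * 1024 * 1024, 2 * 1024 * 1024)
```

With these example figures the limit is 6291456 kB (6 GiB), so a request
that would push the committed total past that point would receive an
allocation error in mode 2.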

The overcommit policy is set via the sysctl `vm.overcommit_memory'.

The overcommit percentage is set via `vm.overcommit_ratio'.

The current overcommit limit and amount committed are viewable in
/proc/meminfo as CommitLimit and Committed_AS respectively.
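
As a rough illustration of reading these two fields, the sketch below
parses text in the /proc/meminfo layout and reports how much headroom is
left before the limit. The helper names and the sample values are made up
for the example; only the CommitLimit and Committed_AS field names come
from the text above.

```python
def parse_meminfo(text: str) -> dict:
    """Parse 'Name:   value kB' lines into a {name: value} dictionary."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a 'Name: value' line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def commit_headroom_kb(text: str) -> int:
    """CommitLimit minus Committed_AS, in the units used by the input."""
    info = parse_meminfo(text)
    return info["CommitLimit"] - info["Committed_AS"]


# Invented sample data in the /proc/meminfo layout.
SAMPLE = """\
MemTotal:        8388608 kB
SwapTotal:       2097152 kB
CommitLimit:     6291456 kB
Committed_AS:    1048576 kB
"""

headroom = commit_headroom_kb(SAMPLE)
```

On a Linux system the same function could be given the contents of
/proc/meminfo, for example commit_headroom_kb(open("/proc/meminfo").read()).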

Gotchas
-------

The C language stack growth does an implicit mremap. If you want absolute
guarantees and run close to the edge you MUST mmap your stack for the
largest size you think you will need. For typical stack usage this does
not matter much, but it is a corner case if you really care.

In mode 2 the MAP_NORESERVE flag is ignored.


How It Works
------------

The overcommit is based on the following rules

For a file backed map
        SHARED or READ-only     -       0 cost (the file is the map not swap)
        PRIVATE WRITABLE        -       size of mapping per instance

For an anonymous or /dev/zero map
        SHARED                  -       size of mapping
        PRIVATE READ-only       -       0 cost (but of little use)
        PRIVATE WRITABLE        -       size of mapping per instance

Additional accounting
        Pages made writable copies by mmap
        shmfs memory drawn from the same pool
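
The per-mapping rules above can be written as a small decision function.
This is a sketch of the table only, not the kernel's code: the function
name and its parameters are hypothetical, and the "additional accounting"
items are not modelled.

```python
def commit_cost(size: int, *, file_backed: bool, shared: bool, writable: bool) -> int:
    """Commit charge for one mapping instance, following the rules table.

    size is the length of the mapping; the result uses the same unit.
    """
    if file_backed:
        # SHARED or READ-only file maps cost nothing: the file backs
        # the pages, not swap.
        if shared or not writable:
            return 0
        # PRIVATE WRITABLE file maps are charged in full, per instance.
        return size
    # Anonymous or /dev/zero maps.
    if shared:
        return size
    # PRIVATE READ-only costs nothing; PRIVATE WRITABLE is charged in full.
    return size if writable else 0


# Two private writable anonymous mappings of 4096 bytes are each charged,
# so the total commitment is twice the size.
total = sum(
    commit_cost(4096, file_backed=False, shared=False, writable=True)
    for _ in range(2)
)
```

Each instance of a private writable mapping is charged separately, which is
why the example total is 8192.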

Status
------

o       We account mmap memory mappings
o       We account mprotect changes in commit
o       We account mremap changes in size
o       We account brk
o       We account munmap
o       We report the commit status in /proc
o       Account and check on fork
o       Review stack handling/building on exec
o       SHMfs accounting
o       Implement actual limit enforcement

To Do
-----
o       Account ptrace pages (this is hard)