view Documentation/kref.txt @ 897:329ea0ccb344

krefs allow you to add reference counters to your objects. If you
have objects that are used in multiple places and passed around, and
you don't have refcounts, your code is almost certainly broken. If
you want refcounts, krefs are the way to go.

To use a kref, add one to your data structures like:

struct my_data
{
	.
	.
	struct kref refcount;
	.
	.
};

The kref can occur anywhere within the data structure.

You must initialize the kref after you allocate it. To do this, call
kref_init like so:

	struct my_data *data;

	data = kmalloc(sizeof(*data), GFP_KERNEL);
	if (!data)
		return -ENOMEM;
	kref_init(&data->refcount);

This sets the refcount in the kref to 1.
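
The effect of the initialization can be tried outside the kernel with
a small userspace sketch. Everything in it is an assumption made for
illustration: struct kref_sketch, kref_sketch_init(), my_data_alloc()
and my_data_refs() are hypothetical names built on C11 atomics, and
malloc() stands in for kmalloc(). It is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref; not the kernel API. */
struct kref_sketch {
	atomic_int refcount;
};

/* Mirrors what the text says about kref_init(): the count starts at 1. */
void kref_sketch_init(struct kref_sketch *k)
{
	atomic_init(&k->refcount, 1);
}

struct my_data {
	int payload;
	struct kref_sketch refcount;	/* may sit anywhere in the struct */
};

/* Allocate and initialize; the caller owns the single reference. */
struct my_data *my_data_alloc(int payload)
{
	struct my_data *data = malloc(sizeof(*data));

	if (!data)
		return NULL;
	data->payload = payload;
	kref_sketch_init(&data->refcount);
	return data;
}

/* Read the current count (for illustration only). */
int my_data_refs(struct my_data *data)
{
	return atomic_load(&data->refcount.refcount);
}
```

Calling my_data_alloc() and then my_data_refs() on the result should
report a count of 1, which is the reference owned by the allocating
code.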

Once you have an initialized kref, you must follow these rules:

1) If you make a non-temporary copy of a pointer, especially if
   it can be passed to another thread of execution, you must
   increment the refcount with kref_get() before passing it off:
	kref_get(&data->refcount);
   If you already have a valid pointer to a kref-ed structure (the
   refcount cannot go to zero) you may do this without a lock.

2) When you are done with a pointer, you must call kref_put():
	kref_put(&data->refcount, data_release);
   If this is the last reference to the pointer, the release
   routine will be called. If the code never tries to get
   a valid pointer to a kref-ed structure without already
   holding a valid pointer, it is safe to do this without
   a lock.

3) If the code attempts to gain a reference to a kref-ed structure
   without already holding a valid pointer, it must serialize access
   so that a kref_put() cannot occur during the kref_get(), and the
   structure must remain valid during the kref_get().
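
Rules 1 and 2 can be illustrated with a userspace sketch. It assumes
a hypothetical stand-in, struct kref_sketch with kref_sketch_init(),
kref_sketch_get() and kref_sketch_put(), built on C11 atomics, plus a
released_count counter added only so the effect can be observed. It
is not the kernel implementation, and free() stands in for kfree().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref; not the kernel API. */
struct kref_sketch {
	atomic_int refcount;
};

void kref_sketch_init(struct kref_sketch *k)
{
	atomic_init(&k->refcount, 1);
}

/* Rule 1: take a reference before handing the pointer to someone else. */
void kref_sketch_get(struct kref_sketch *k)
{
	atomic_fetch_add(&k->refcount, 1);
}

/* Rule 2: drop a reference; call release() and return 1 if it was the last. */
int kref_sketch_put(struct kref_sketch *k,
		    void (*release)(struct kref_sketch *))
{
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

struct my_data {
	int payload;
	struct kref_sketch refcount;
};

int released_count;	/* counts calls to data_release() */

void data_release(struct kref_sketch *ref)
{
	/* Open-coded container_of(): recover the enclosing struct my_data. */
	struct my_data *data = (struct my_data *)
		((char *)ref - offsetof(struct my_data, refcount));

	free(data);
	released_count++;
}
```

With one get after the init, the first put leaves the object alive
and only the second put runs data_release().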

For example, if you allocate some data and then pass it to another
thread to process:

void data_release(struct kref *ref)
{
	struct my_data *data = container_of(ref, struct my_data, refcount);
	kfree(data);
}

int more_data_handling(void *cb_data)
{
	struct my_data *data = cb_data;
	.
	. do stuff with data here
	.
	/* Drop the reference that was taken on this thread's behalf. */
	kref_put(&data->refcount, data_release);
	return 0;
}

int my_data_handler(void)
{
	int rv = 0;
	struct my_data *data;
	struct task_struct *task;

	data = kmalloc(sizeof(*data), GFP_KERNEL);
	if (!data)
		return -ENOMEM;
	kref_init(&data->refcount);

	/* Take the thread's reference before handing the pointer off. */
	kref_get(&data->refcount);
	task = kthread_run(more_data_handling, data, "more_data_handling");
	if (task == ERR_PTR(-ENOMEM)) {
		/* The thread never started, so drop its reference here. */
		rv = -ENOMEM;
		kref_put(&data->refcount, data_release);
		goto out;
	}

	.
	. do stuff with data here
	.
 out:
	kref_put(&data->refcount, data_release);
	return rv;
}

This way, it doesn't matter in what order the two threads handle the
data; kref_put() knows when the data is no longer referenced and
releases it. The kref_get() does not require a lock, since we already
have a valid pointer that we own a refcount for. The put needs no
lock because nothing tries to get the data without already holding a
pointer.

Note that the "before" in rule 1 is very important. You should never
do something like:

	task = kthread_run(more_data_handling, data, "more_data_handling");
	if (task == ERR_PTR(-ENOMEM)) {
		rv = -ENOMEM;
		goto out;
	} else
		/* BAD BAD BAD - get is after the handoff */
		kref_get(&data->refcount);

The new thread may already have run and called kref_put() before this
kref_get() executes, freeing the data while it is still in use.

Don't assume you know what you are doing and use the above construct.
First of all, you may not know what you are doing. Second, you may
know what you are doing (there are some situations where locking is
involved where the above may be legal) but someone else who doesn't
know what they are doing may change the code or copy the code. It's
bad style. Don't do it.

There are some situations where you can optimize the gets and puts.
For instance, if you are done with an object and are enqueuing it for
something else or passing it off to something else, there is no reason
to do a get then a put:

	/* Silly extra get and put */
	kref_get(&obj->ref);
	enqueue(obj);
	kref_put(&obj->ref, obj_cleanup);

Just do the enqueue. A comment about this is always welcome:

	enqueue(obj);
	/* We are done with obj, so we pass our refcount off
	   to the queue.  DON'T TOUCH obj AFTER HERE! */
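
The handoff can be sketched in userspace as follows. This is only an
illustration under stated assumptions: struct kref_sketch and its
helpers are a hypothetical stand-in for struct kref, the queue is a
single-threaded toy with no locking, and producer(), consumer() and
cleanup_count are names invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref; not the kernel API. */
struct kref_sketch {
	atomic_int refcount;
};

void kref_sketch_init(struct kref_sketch *k)
{
	atomic_init(&k->refcount, 1);
}

/* Drop one reference; call release() and return 1 if it was the last. */
int kref_sketch_put(struct kref_sketch *k,
		    void (*release)(struct kref_sketch *))
{
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

struct obj {
	struct kref_sketch ref;
	struct obj *next;
};

struct obj *queue_head;	/* single-threaded toy queue, no locking */
int cleanup_count;	/* counts calls to obj_cleanup() */

void enqueue(struct obj *o)
{
	o->next = queue_head;
	queue_head = o;
}

struct obj *dequeue(void)
{
	struct obj *o = queue_head;

	if (o)
		queue_head = o->next;
	return o;
}

void obj_cleanup(struct kref_sketch *ref)
{
	/* Open-coded container_of(): recover the enclosing struct obj. */
	struct obj *o = (struct obj *)((char *)ref - offsetof(struct obj, ref));

	free(o);
	cleanup_count++;
}

/* Creates an object and hands its only reference to the queue. */
int producer(void)
{
	struct obj *o = malloc(sizeof(*o));

	if (!o)
		return -1;
	kref_sketch_init(&o->ref);
	enqueue(o);
	/* Our refcount now belongs to the queue; o must not be touched. */
	return 0;
}

/* Takes over the queue's reference and drops it when done. */
void consumer(void)
{
	struct obj *o = dequeue();

	if (o)
		kref_sketch_put(&o->ref, obj_cleanup);
}
```

The count stays at 1 across the handoff: the producer never takes a
second reference, and the consumer's put is the one that runs
obj_cleanup().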

The last rule (rule 3) is the nastiest one to handle. Say, for
instance, you have a list of items that are each kref-ed, and you wish
to get the first one. You can't just pull the first item off the list
and kref_get() it. That violates rule 3 because you are not already
holding a valid pointer. You must add locks or semaphores. For
instance:

static DECLARE_MUTEX(sem);
static LIST_HEAD(q);
struct my_data
{
	struct kref      refcount;
	struct list_head link;
};

static struct my_data *get_entry(void)
{
	struct my_data *entry = NULL;

	down(&sem);
	if (!list_empty(&q)) {
		entry = container_of(q.next, struct my_data, link);
		kref_get(&entry->refcount);
	}
	up(&sem);
	return entry;
}

static void release_entry(struct kref *ref)
{
	struct my_data *entry = container_of(ref, struct my_data, refcount);

	list_del(&entry->link);
	kfree(entry);
}

static void put_entry(struct my_data *entry)
{
	down(&sem);
	kref_put(&entry->refcount, release_entry);
	up(&sem);
}

The kref_put() return value is useful if you do not want to hold the
lock during the whole release operation. kref_put() returns 1 if it
called the release function and 0 otherwise. Say you didn't want to
call kfree() with the lock held in the example above (since it is kind
of pointless to do so). You could use kref_put() as follows:

static void release_entry(struct kref *ref)
{
	/* All work is done after the return from kref_put(). */
}

static void put_entry(struct my_data *entry)
{
	down(&sem);
	if (kref_put(&entry->refcount, release_entry)) {
		list_del(&entry->link);
		up(&sem);
		kfree(entry);
	} else
		up(&sem);
}

This is really more useful if you have to call other routines as part
of the free operations that could take a long time or might claim the
same lock. Note that doing everything in the release routine is still
preferred as it is a little neater.
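
The same pattern can be sketched in userspace. The sketch assumes a
hypothetical struct kref_sketch stand-in for struct kref, a pthread
mutex in place of the semaphore, and a hand-rolled singly linked list
in place of list_head. add_entry(), unlink_entry(), entries and
freed_count are names invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct kref; not the kernel API. */
struct kref_sketch {
	atomic_int refcount;
};

/* Drop one reference; call release() and return 1 if it was the last. */
int kref_sketch_put(struct kref_sketch *k,
		    void (*release)(struct kref_sketch *))
{
	if (atomic_fetch_sub(&k->refcount, 1) == 1) {
		release(k);
		return 1;
	}
	return 0;
}

struct my_data {
	struct kref_sketch refcount;
	struct my_data *next;
};

pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
struct my_data *entries;	/* list head, protected by list_lock */
int freed_count;		/* counts entries freed by put_entry() */

void release_entry(struct kref_sketch *ref)
{
	(void)ref;	/* all work is done after kref_sketch_put() returns */
}

/* Remove entry from the list; the caller must hold list_lock. */
void unlink_entry(struct my_data *entry)
{
	struct my_data **pp = &entries;

	while (*pp && *pp != entry)
		pp = &(*pp)->next;
	if (*pp)
		*pp = entry->next;
}

/* Create an entry with refcount 1 (owned by the caller) and list it. */
struct my_data *add_entry(void)
{
	struct my_data *entry = malloc(sizeof(*entry));

	if (!entry)
		return NULL;
	atomic_init(&entry->refcount.refcount, 1);
	pthread_mutex_lock(&list_lock);
	entry->next = entries;
	entries = entry;
	pthread_mutex_unlock(&list_lock);
	return entry;
}

/* Rule 3: look up and take the reference under the lock. */
struct my_data *get_entry(void)
{
	struct my_data *entry;

	pthread_mutex_lock(&list_lock);
	entry = entries;
	if (entry)
		atomic_fetch_add(&entry->refcount.refcount, 1);
	pthread_mutex_unlock(&list_lock);
	return entry;
}

/* Unlink under the lock, but free only after the lock is dropped. */
void put_entry(struct my_data *entry)
{
	pthread_mutex_lock(&list_lock);
	if (kref_sketch_put(&entry->refcount, release_entry)) {
		unlink_entry(entry);
		pthread_mutex_unlock(&list_lock);
		free(entry);
		freed_count++;
	} else {
		pthread_mutex_unlock(&list_lock);
	}
}
```

As in the example above, the list itself holds no reference: an entry
stays on the list until its last reference is put, and the final
put_entry() is what unlinks and frees it.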


Corey Minyard <minyard@acm.org>

A lot of this was lifted from Greg Kroah-Hartman's 2004 OLS paper and
presentation on krefs, which can be found at:
  http://www.kroah.com/linux/talks/ols_2004_kref_paper/Reprint-Kroah-Hartman-OLS2004.pdf
and:
  http://www.kroah.com/linux/talks/ols_2004_kref_talk/