view Documentation/sched-coding.txt @ 897:329ea0ccb344

balloon: try harder to balloon up under memory pressure.

Currently if the balloon driver is unable to increase the guest's
reservation it assumes the failure was due to reaching its full
allocation, gives up on the ballooning operation and records the limit
it reached as the "hard limit". The driver will not try again until
the target is set again (even to the same value).

However it is possible that ballooning has in fact failed due to
memory pressure in the host and therefore it is desirable to keep
attempting to reach the target in case memory becomes available. The
most likely scenario is that some guests are ballooning down while
others are ballooning up and therefore there is temporary memory
pressure while things stabilise. You would not expect a well behaved
toolstack to ask a domain to balloon to more than its allocation nor
would you expect it to deliberately over-commit memory by setting
balloon targets which exceed the total host memory.

This patch drops the concept of a hard limit and causes the balloon
driver to retry increasing the reservation on a timer in the same
manner as when decreasing the reservation.

Also if we partially succeed in increasing the reservation
(i.e. receive fewer pages than we asked for) then we may as well keep
those pages rather than returning them to Xen.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Fri Jun 05 14:01:20 2009 +0100 (2009-06-05)
parents 831230e53067
Reference for various scheduler-related methods in the O(1) scheduler
Robert Love <rml@tech9.net>, MontaVista Software

Note most of these methods are local to kernel/sched.c - this is by design.
The scheduler is meant to be self-contained and abstracted away.  This document
is primarily for understanding the scheduler, not interfacing to it.  Some of
the discussed interfaces, however, are general process/scheduling methods.
They are typically defined in include/linux/sched.h.

Main Scheduling Methods
-----------------------

void load_balance(runqueue_t *this_rq, int idle)
	Attempts to pull tasks from one cpu to another to balance cpu usage,
	if needed.  This method is called explicitly if the runqueues are
	imbalanced or periodically by the timer tick.  Prior to calling,
	the current runqueue must be locked and interrupts disabled.

void schedule()
	The main scheduling function.  Upon return, the highest priority
	process will be active.

Locking
-------

Each runqueue has its own lock, rq->lock.  When multiple runqueues need
to be locked, the locks must be acquired in order of ascending runqueue
address (&runqueue).

A specific runqueue is locked via

	task_rq_lock(task_t pid, unsigned long *flags)

which disables preemption, disables interrupts, and locks the runqueue that
pid is running on.  Likewise,

	task_rq_unlock(task_t pid, unsigned long *flags)

unlocks the runqueue that pid is running on, restores interrupts to their
previous state, and reenables preemption.

The routines

	double_rq_lock(runqueue_t *rq1, runqueue_t *rq2)

and

	double_rq_unlock(runqueue_t *rq1, runqueue_t *rq2)

safely lock and unlock, respectively, the two specified runqueues.  They do
not, however, disable and restore interrupts.  Users are required to do so
manually before and after calls.

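The ordering rule can be illustrated with a small userspace model.  This is
only a sketch: struct toy_rq, toy_lock() and the toy_double_rq_*() helpers are
hypothetical names invented here, the "lock" is a plain flag rather than a
spinlock, and nothing below is taken from kernel/sched.c.  It shows only the
idea of always taking the lower-addressed lock first so that two callers
passing the same pair in opposite order cannot deadlock.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a runqueue; only the lock state is modelled. */
struct toy_rq {
	int locked;	/* 1 while "held", 0 otherwise */
	int lock_seq;	/* position in the global acquisition order */
};

static int toy_lock_counter;	/* counts acquisitions, for illustration */

static inline void toy_lock(struct toy_rq *rq)
{
	assert(!rq->locked);	/* a real spinlock would spin here instead */
	rq->locked = 1;
	rq->lock_seq = ++toy_lock_counter;
}

static inline void toy_unlock(struct toy_rq *rq)
{
	assert(rq->locked);
	rq->locked = 0;
}

/* Lock two runqueues, always taking the lower address first. */
static inline void toy_double_rq_lock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	if (rq1 == rq2) {
		toy_lock(rq1);		/* same runqueue: lock it once */
	} else if ((uintptr_t)rq1 < (uintptr_t)rq2) {
		toy_lock(rq1);
		toy_lock(rq2);
	} else {
		toy_lock(rq2);
		toy_lock(rq1);
	}
}

/* Release both locks; release order does not matter for deadlock safety. */
static inline void toy_double_rq_unlock(struct toy_rq *rq1, struct toy_rq *rq2)
{
	toy_unlock(rq1);
	if (rq1 != rq2)
		toy_unlock(rq2);
}
```

In this model, calling toy_double_rq_lock(&b, &a) and toy_double_rq_lock(&a, &b)
acquire the two locks in the same order.  As with the real routines, the sketch
does nothing about interrupts; that remains the caller's job.
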
Values
------

MAX_PRIO
	The maximum priority of the system, stored in the task as task->prio.
	Lower priorities are higher.  Normal (non-RT) priorities range from
	MAX_RT_PRIO to (MAX_PRIO - 1).
MAX_RT_PRIO
	The maximum real-time priority of the system.  Valid RT priorities
	range from 0 to (MAX_RT_PRIO - 1).
MAX_USER_RT_PRIO
	The maximum real-time priority that is exported to user-space.  Should
	always be equal to or less than MAX_RT_PRIO.  Setting it less allows
	kernel threads to have higher priorities than any user-space task.
MIN_TIMESLICE
MAX_TIMESLICE
	Respectively, the minimum and maximum timeslices (quanta) of a process.

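The priority ranges above can be written down as a few range checks.  This is
a sketch under stated assumptions: the numeric values (100 and 140) are assumed
here for illustration and should be checked against the tree in question, and
prio_is_rt(), prio_is_normal() and prio_higher() are hypothetical helpers, not
kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed values for illustration only; verify against your own tree. */
#define MAX_USER_RT_PRIO	100
#define MAX_RT_PRIO		MAX_USER_RT_PRIO
#define MAX_PRIO		(MAX_RT_PRIO + 40)

/* The user-visible RT range must not exceed the kernel's RT range. */
static_assert(MAX_USER_RT_PRIO <= MAX_RT_PRIO,
	      "MAX_USER_RT_PRIO must not exceed MAX_RT_PRIO");

/* Hypothetical helper: true if prio is a valid real-time priority. */
static inline bool prio_is_rt(int prio)
{
	return prio >= 0 && prio < MAX_RT_PRIO;
}

/* Hypothetical helper: true if prio is a valid normal (non-RT) priority. */
static inline bool prio_is_normal(int prio)
{
	return prio >= MAX_RT_PRIO && prio < MAX_PRIO;
}

/* Lower numeric values are higher priorities. */
static inline bool prio_higher(int a, int b)
{
	return a < b;
}
```

With these assumed values, 0 to 99 are real-time priorities and 100 to 139 are
normal ones, so any real-time task outranks any normal task.
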
Data
----

struct runqueue
	The main per-CPU runqueue data structure.
struct task_struct
	The main per-process data structure.

General Methods
---------------

cpu_rq(cpu)
	Returns the runqueue of the specified cpu.
this_rq()
	Returns the runqueue of the current cpu.
task_rq(pid)
	Returns the runqueue which holds the specified pid.
cpu_curr(cpu)
	Returns the task currently running on the given cpu.
rt_task(pid)
	Returns true if pid is real-time, false if not.

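One way to picture how these accessors relate to each other is a userspace
model in which the runqueues live in a plain array indexed by CPU number.
Everything here is an assumption made for illustration: the toy_ names, the
array, the toy_current_cpu variable and the constant 100 are invented, and the
kernel's own definitions may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS	4	/* assumed CPU count for the sketch */
#define TOY_MAX_RT_PRIO	100	/* assumed boundary between RT and normal */

struct toy_task;

/* Hypothetical per-CPU runqueue; only the running task is modelled. */
struct toy_runqueue {
	struct toy_task *curr;	/* task currently running on this CPU */
};

/* Hypothetical task; only the fields the accessors need. */
struct toy_task {
	int cpu;	/* CPU whose runqueue holds the task */
	int prio;	/* lower value means higher priority */
};

static struct toy_runqueue toy_runqueues[TOY_NR_CPUS];
static int toy_current_cpu;	/* stands in for "the CPU we are running on" */

#define toy_cpu_rq(cpu)		(&toy_runqueues[(cpu)])
#define toy_this_rq()		toy_cpu_rq(toy_current_cpu)
#define toy_task_rq(p)		toy_cpu_rq((p)->cpu)
#define toy_cpu_curr(cpu)	(toy_cpu_rq(cpu)->curr)
#define toy_rt_task(p)		((p)->prio < TOY_MAX_RT_PRIO)
```

In this model the task-based and cpu-based accessors are thin wrappers around
the same array lookup, which is why they are cheap enough to use freely.
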
Process Control Methods
-----------------------

void set_user_nice(task_t *p, long nice)
	Sets the "nice" value of task p to the given value.
int setscheduler(pid_t pid, int policy, struct sched_param *param)
	Sets the scheduling policy and parameters for the given pid.
int set_cpus_allowed(task_t *p, unsigned long new_mask)
	Sets a given task's CPU affinity and migrates it to a proper cpu.
	Callers must have a valid reference to the task and ensure that the
	task does not exit prematurely.  No locks can be held during the call.
set_task_state(tsk, state_value)
	Sets the given task's state to the given value.
set_current_state(state_value)
	Sets the current task's state to the given value.
void set_tsk_need_resched(struct task_struct *tsk)
	Sets need_resched in the given task.
void clear_tsk_need_resched(struct task_struct *tsk)
	Clears need_resched in the given task.
void set_need_resched()
	Sets need_resched in the current task.
void clear_need_resched()
	Clears need_resched in the current task.
int need_resched()
	Returns true if need_resched is set in the current task, false
	otherwise.
yield()
	Places the current process at the end of the runqueue and calls
	schedule().
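
The need_resched helpers listed above amount to setting, clearing and testing
a per-task flag.  The following userspace sketch models that pattern.  It
makes several assumptions: struct resched_task, the TOY_NEED_RESCHED bit and
the toy_ functions are hypothetical, and toy_do_work() is an invented loop in
which the flag is raised at a chosen step to stand in for the timer tick.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NEED_RESCHED	(1UL << 0)	/* assumed flag bit */

/* Hypothetical task; only the flag word is modelled. */
struct resched_task {
	unsigned long flags;
};

static inline void toy_set_tsk_need_resched(struct resched_task *tsk)
{
	tsk->flags |= TOY_NEED_RESCHED;
}

static inline void toy_clear_tsk_need_resched(struct resched_task *tsk)
{
	tsk->flags &= ~TOY_NEED_RESCHED;
}

static inline bool toy_tsk_need_resched(const struct resched_task *tsk)
{
	return (tsk->flags & TOY_NEED_RESCHED) != 0;
}

/*
 * Do up to max_steps units of work, stopping early once a reschedule
 * has been requested.  The flag is raised after step request_at to
 * imitate a timer tick; pass 0 to never raise it.  Returns the number
 * of steps completed.
 */
static inline int toy_do_work(struct resched_task *tsk, int max_steps,
			      int request_at)
{
	int done = 0;

	while (done < max_steps) {
		if (toy_tsk_need_resched(tsk))
			break;		/* real code would call schedule() */
		done++;
		if (done == request_at)
			toy_set_tsk_need_resched(tsk);
	}
	return done;
}
```

The point of the pattern is that the flag is only a request: the party that
sets it does not switch tasks itself, and the running code notices the flag
at a convenient point and then gives up the CPU.
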