xen/TODO @ 348:e5e04893c022

bitkeeper revision 3e8c846fQSuOz1Dd8MgUzwG5rj3bDQ

Many files:
Free DOM0 kernel memory to the Xen allocation pool after DOM0 is
created. Fixed page-type handling -- we now correctly flush the TLB if
a page is unpinned after a disk read and its refcount falls to zero.

author   kaf24@scramble.cl.cam.ac.uk
date     Thu Apr 03 18:58:55 2003 +0000 (2003-04-03)
parents  dc2e4de1850f
children 942eb9bcae13
This is stuff we probably want to implement in the near future. I
think I have them in a sensible priority order -- the first few would
be nice to fix before a code release. The later ones can be
longer-term goals.

 -- Keir (16/3/03)
----------------------------------
More intelligent assignment of domains to processors. In
particular, we don't play well with hyperthreading: we will assign
domains to virtual processors on the same package, rather than
spreading them across processor packages.

What we need to do is port code from Linux which stores information on
relationships between processors in the system (eg. which ones are
siblings in the same package). We then use this to balance domains
across packages, and across virtual processors within a package.
--------------------------------
Currently we do not free resources when destroying a domain. This is
because they may be tied up in subsystems, and there is no way of
pulling them back in a safe manner.

The fix is probably to reference-count resources and automatically
free them when the count reaches zero. We may get away with one count
per domain (for all its resources). When this reaches zero we know it
is safe to free everything: block-device rings, network rings, and all
the rest.
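The single per-domain count could look something like the sketch below.
All of the names (struct domain, get_domain, put_domain,
free_all_resources) are illustrative, not the actual Xen interfaces;
this is just the shape of the idea:

```c
#include <assert.h>

/* Hypothetical sketch: one reference count per domain, covering all of
 * its resources.  Each subsystem that holds a resource takes a ref;
 * when the last ref drops, everything is reclaimed at once. */

struct domain {
    int refcnt;          /* one count for rings, buffers, everything */
    int resources_freed; /* stand-in for the actual teardown work    */
};

static void free_all_resources(struct domain *d)
{
    /* Safe point: no subsystem still holds a reference, so the
     * block-device rings, network rings, etc. can all be freed. */
    d->resources_freed = 1;
}

static void get_domain(struct domain *d)
{
    d->refcnt++;
}

static void put_domain(struct domain *d)
{
    if (--d->refcnt == 0)
        free_all_resources(d);
}
```

A subsystem that has, say, an outstanding disk read into a domain's
memory would hold a reference for the duration, so destruction is
deferred until the last put_domain().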
--------------------------------
Handling of the transmit rings is currently very broken (for example,
sending an inter-domain packet will wedge the hypervisor). This is
because we may handle packets out of order (eg. inter-domain packets
are handled eagerly, while packets for real interfaces are queued),
but our current ring design really assumes in-order handling.

A neat fix will be to allow responses to be queued in a different
order to requests, just as we already do with block-device
rings. We'll need to add an opaque identifier to ring entries,
allowing matching of requests and responses, but that's about it.
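In outline, the ring entries would carry an identifier that the
hypervisor echoes back untouched, so the guest matches completions by
id rather than by position. The struct layout and names below are
hypothetical, not the real ring format:

```c
#include <assert.h>

/* Hypothetical sketch: tagged transmit-ring entries.  The id is opaque
 * to the hypervisor and is echoed back unchanged in the response, so
 * responses may be queued in a different order from requests, as the
 * block-device rings already allow. */

typedef unsigned long tx_id_t;

struct tx_request {
    tx_id_t id;           /* opaque: copied into the matching response */
    unsigned long addr;   /* guest buffer address */
    unsigned short len;
};

struct tx_response {
    tx_id_t id;           /* matches the originating request */
    signed char status;   /* 0 == success */
};

/* Hypervisor side: build a response for a completed request.  The only
 * invariant is that the id round-trips intact. */
static struct tx_response make_response(const struct tx_request *req,
                                        int ok)
{
    struct tx_response rsp;
    rsp.id = req->id;
    rsp.status = ok ? 0 : -1;
    return rsp;
}
```

With this, an eagerly-handled inter-domain packet can complete ahead of
an earlier queued packet for a real interface without confusing the
guest.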
---------------------------
All the NICs that we support can checksum packets on behalf of guest
OSes. We need to add appropriate flags to and from each domain to
indicate, on transmit, which packets need the checksum added and, on
receive, which packets have been checked out as okay. We can steal
Linux's interface, which is entirely sane given NIC limitations.
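Modelled loosely on Linux's per-skb checksum convention, the per-packet
flag might look like this; the enum and field names here are made up
for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of per-packet checksum-offload flags, loosely in
 * the style of Linux's skb checksum states.  Names are illustrative. */

enum csum_flag {
    CSUM_NONE,       /* no offload involved; software path as usual  */
    CSUM_TX_NEEDED,  /* transmit: NIC must insert the checksum       */
    CSUM_RX_GOOD     /* receive: NIC already verified the checksum   */
};

struct pkt_desc {
    void *data;
    unsigned short len;
    enum csum_flag csum;
};

/* The guest only spends cycles verifying a checksum when the NIC has
 * not already vouched for the packet. */
static int guest_must_verify(const struct pkt_desc *p)
{
    return p->csum != CSUM_RX_GOOD;
}
```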
-----------------------------
We do not allow modification of the GDT, or any use of the LDT. Such
support is necessary for unmodified applications (eg. Linux uses the
LDT in threaded applications, while Windows needs to update GDT
entries).

I have some text on how to do this:
/usr/groups/xeno/discussion-docs/memory_management/segment_tables.txt
It's already half implemented, but the rest is still to do.
-----------------------------
A better control daemon is required for domain 0, which keeps proper
track of machine resources and can make sensible policy choices. This
may require support in Xen; for example, notifications (eg. DOMn is
killed) and requests (eg. can DOMn allocate x frames of memory?).
--------------------------------------
Currently our long-term timebase free-runs on CPU0, with no external
calibration. We should run ntpd on domain 0 and allow this to warp
Xen's timebase. Once this is done, we can have a timebase per CPU and
not worry about relative drift (since they'll all get sync'ed
periodically by ntp).
-------------------------
Network and blkdev drivers are bloating Xen. At some point we want to
build drivers as modules, stick them in a cheesy ramfs, then relocate
them one by one at boot time. If a driver successfully probes hardware
we keep it; otherwise we blow it away. An alternative is to have a
central PCI-ID-to-driver-name repository. We then use that to decide
which drivers to load.

Most of the hard stuff (relocating and the like) is done for us by
Linux's module system.
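The repository alternative amounts to a lookup table consulted at boot.
A minimal sketch, with made-up entries and names (the two sample IDs
are real devices, but their presence here is purely illustrative):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of a central PCI-ID-to-driver-name repository:
 * scan the table for each device found at boot, and only load the
 * named module if a match exists. */

struct pci_driver_map {
    unsigned short vendor, device;
    const char *driver;
};

static const struct pci_driver_map repo[] = {
    { 0x8086, 0x1229, "eepro100" },  /* example entry only */
    { 0x10b7, 0x9200, "3c59x"    },  /* example entry only */
};

/* Returns the driver module name for a device, or NULL if no driver
 * should be loaded for it. */
static const char *driver_for(unsigned short vendor,
                              unsigned short device)
{
    size_t i;
    for (i = 0; i < sizeof(repo) / sizeof(repo[0]); i++)
        if (repo[i].vendor == vendor && repo[i].device == device)
            return repo[i].driver;
    return NULL;
}
```

The probe-and-discard scheme avoids maintaining such a table but pays
the cost of relocating every module first; the table avoids that work
at the price of keeping the repository up to date.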
----------------------
This includes the last-chance page cache, and the unified buffer cache.
Graveyard
*********

Following is some description of how some of the above might be
implemented. Some of it is superseded and/or out of date, so follow
with caution.
Segment descriptor tables
-------------------------
We want to allow guest OSes to specify GDT and LDT tables using their
own pages of memory (just like with page tables). So allow the following:
 * new_table_entry(ptr, val)
   [Allows insertion of a code, data, or LDT descriptor into a given
   location. Can simply be checked then poked, with no need to look at
   the page type.]
 * new_GDT() -- relevant virtual pages are resolved to frames. Either
   (i) the page is not present; or (ii) the page is only mapped
   read-only and checks out okay (then it is marked as a special page).
   The old table is resolved first, and its pages are unmarked (no
   longer of special type).
 * new_LDT() -- same as for new_GDT(), with the same special page type.
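The "checked then poked" path for new_table_entry() could look like the
sketch below. The two checks shown (reject system segments/gates,
reject ring-0 descriptors) are illustrative stand-ins for whatever the
real validation rules turn out to be; the bit positions match the x86
descriptor format:

```c
#include <assert.h>

/* Hypothetical sketch of new_table_entry() validation.  An x86 segment
 * descriptor is 8 bytes; viewed as a 64-bit value, bit 44 is the S bit
 * (1 = code/data, 0 = system/gate) and bits 45-46 are the DPL. */

typedef unsigned long long desc_t;

#define DESC_S        (1ULL << 44)   /* S bit */
#define DESC_DPL_MASK (3ULL << 45)   /* privilege level field */

static int descriptor_is_safe(desc_t d)
{
    if (!(d & DESC_S))
        return 0;     /* reject system segments and gates outright */
    if ((d & DESC_DPL_MASK) == 0)
        return 0;     /* reject ring-0 descriptors */
    return 1;
}

static int new_table_entry(desc_t *ptr, desc_t val)
{
    if (!descriptor_is_safe(val))
        return -1;    /* check... */
    *ptr = val;       /* ...then poke; no page-type lookup needed */
    return 0;
}
```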
Page-table updates must be hooked, so we look for updates to virtual
page addresses in the GDT/LDT range. If a mapping becomes not-present,
the old physical page has its type_count decremented. If a mapping
becomes present, ensure it is read-only, check the page, and set the
special type.
Merge set_{LDT,GDT} into update_baseptr, by passing four args:
  update_baseptrs(mask, ptab, gdttab, ldttab);
An update of ptab requires an update of gdttab (or set to an internal default).
An update of gdttab requires an update of ldttab (or set to an internal default).
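Those dependency rules can be expressed as a validity check on the mask
argument; the bit names below are illustrative:

```c
#include <assert.h>

/* Hypothetical sketch of the mask argument to
 * update_baseptrs(mask, ptab, gdttab, ldttab): each bit says which
 * base pointer is being updated in this call. */

#define UPD_PTAB (1 << 0)
#define UPD_GDT  (1 << 1)
#define UPD_LDT  (1 << 2)

/* Enforce the dependency rules: updating ptab forces a gdttab update
 * (possibly to the internal default), and updating gdttab forces an
 * ldttab update likewise. */
static int mask_is_valid(int mask)
{
    if ((mask & UPD_PTAB) && !(mask & UPD_GDT))
        return 0;
    if ((mask & UPD_GDT) && !(mask & UPD_LDT))
        return 0;
    return 1;
}
```

Passing a NULL gdttab/ldttab with the corresponding bit set would then
mean "reset to the internal default" rather than "leave unchanged".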
The hypervisor page cache
-------------------------
This will allow guest OSes to make use of spare pages in the system, but
allow them to be immediately used for any new domains or memory requests.
The idea is that, when a page is laundered and falls off Linux's clean_LRU
list, rather than freeing it, it becomes a candidate for passing down into
the hypervisor. In return, xeno-linux may ask for one of its previously-
cached pages back:
  (page, new_id) = cache_query(page, old_id);
If the requested page couldn't be kept, a blank page is returned.

When would Linux make the query? Whenever it wants a page back without
the delay of going to disc. Also, whenever a page would otherwise be
flushed to disc.
To try and add to the cache: (blank_page, new_id) = cache_query(page, NULL);
  [NULL means "give me a blank page"].
To try and retrieve from the cache: (page, new_id) = cache_query(x_page, id);
  [we may request that x_page just be discarded, and therefore not impinge
  on this domain's cache quota].
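The exchange can be mocked up as below, with a trivial one-slot cache
standing in for the hypervisor side. Everything here is illustrative
(0 stands in for NULL as the "give me a blank page" id, and the real
interface would exchange machine frames, not copied structs):

```c
#include <assert.h>

/* Hypothetical sketch of the cache_query() exchange.  A one-slot cache
 * plays the hypervisor; 0 is the "no id" value. */

struct page { unsigned char data[16]; };

static struct page cache_slot;
static unsigned long cache_id;      /* 0 == slot empty */
static unsigned long next_id = 1;

/* Donate (old_id == 0): 'in' is cached, a blank page lands in 'out',
 * and the new id is returned.  Retrieve (old_id != 0): on a hit the
 * cached page lands in 'out'; if it couldn't be kept, 'out' is blank. */
static unsigned long cache_query(struct page *out, const struct page *in,
                                 unsigned long old_id)
{
    struct page blank = {{0}};

    if (old_id == 0) {
        cache_slot = *in;
        cache_id = next_id++;
        *out = blank;
        return cache_id;
    }
    if (old_id == cache_id) {
        *out = cache_slot;
        cache_id = 0;       /* page handed back; slot free again */
        return 0;
    }
    *out = blank;           /* requested page couldn't be kept */
    return 0;
}
```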