Jim Fehlig [Tue, 18 Jan 2011 17:23:24 +0000 (17:23 +0000)]
xend: improve pseudo-bootloader support for external block scripts
Userspace tools support external block scripts (e.g. block-drbd,
provided by the drbd project). The pseudo-bootloader setup code in
xend has a few limitations wrt external block scripts, which this
patch addresses.
blkif.py: parse_uname() utility function should be able to parse a
disk specifier understood by the rest of the tools.
XendDomainInfo.py: Block devices using external block scripts must
be attached to dom0 before running the pseudo-bootloader.
Signed-off-by: Jim Fehlig <jfehlig@novell.com> Tested-by: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
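To illustrate the blkif.py point, a disk specifier can be split into its type and target roughly as follows. This is a minimal sketch, not xend's actual parse_uname(): the helper name split_disk_spec and the assumed "type:target" shape of the specifier are illustrative assumptions.

```python
def split_disk_spec(uname):
    """Split a 'type:target' disk specifier into (type, target).

    Hypothetical helper; only the first ':' separates the type, so any
    further colons stay in the target for the block script to interpret.
    """
    if not uname:
        raise ValueError("empty disk specifier")
    kind, sep, target = uname.partition(":")
    if not sep or not kind or not target:
        raise ValueError("expected 'type:target', got %r" % uname)
    return kind, target
```

With this sketch, a specifier for an external script such as "drbd:r0" yields the type "drbd", which a caller could use to decide that the device must be attached to dom0 first.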
Remove some more "drbd" cruft from xend. This is not necessary for
drbd to work with Xen.
Requested-by: Jim Fehlig <jfehlig@novell.com> Tested-by: Shriram Rajagopalan <rshriram@gmail.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Tue, 18 Jan 2011 12:28:10 +0000 (12:28 +0000)]
VT-d/ATS: misc fixes
First of all there were three places potentially de-referencing NULL
(two after an allocation failure, and one after a failed lookup).
Second, if ATS_ENABLE was already set, the device would not have got
added to the ats_devices list, potentially resulting in
dev_invalidate_iotlb() doing an incomplete job.
Keir Fraser [Tue, 18 Jan 2011 10:28:22 +0000 (10:28 +0000)]
xen-unstable/blkif: Add trim operation interface
Trim operation is a request for the underlying block device to mark
extents to be erased. Add the operation code and ring data structure
to the public header file.
Trim operations are passed with sector_number as the sector index to
begin trim operations at and nr_sectors as the number of sectors to
be trimmed. The specified sectors should be trimmed if the underlying
block device supports trim operations, or a BLKIF_RSP_EOPNOTSUPP
should be returned. More information about trim operations is at:
http://t13.org/Documents/UploadedDocuments/docs2008/e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc
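The request semantics described above can be modelled with a small sketch. It is a toy backend, not the blkif ring code: FakeDisk and handle_trim are hypothetical names, and the numeric response codes are assumptions to be checked against the public header.

```python
# Response codes as assumed here; consult the public blkif header for the real values.
BLKIF_RSP_OKAY = 0
BLKIF_RSP_EOPNOTSUPP = -2


class FakeDisk:
    """Hypothetical in-memory stand-in for a backend block device."""

    def __init__(self, supports_trim):
        self.supports_trim = supports_trim
        self.trimmed = set()  # sector indexes that have been trimmed


def handle_trim(disk, sector_number, nr_sectors):
    """Trim nr_sectors starting at sector_number, or report lack of support."""
    if not disk.supports_trim:
        return BLKIF_RSP_EOPNOTSUPP
    disk.trimmed.update(range(sector_number, sector_number + nr_sectors))
    return BLKIF_RSP_OKAY
```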
Jan 10 00:02:26 paris /netbsd: xvif108.0: could not attach sysctl nodes
Jan 10 00:02:57 paris /netbsd: sysctl_createv: sysctl_create(xvif108.0)
returned 22
The kernel driver has recently been fixed, and the attached patch updates
the hotplug scripts accordingly.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Daniel De Graaf [Mon, 17 Jan 2011 17:28:30 +0000 (17:28 +0000)]
libxc: Remove set_max_grants in linux
The maximum number of grants is now constrained domain-wide in linux,
so set_max_grants should be a noop there. Previously, this constraint
was per-file-description.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Christoph Egger [Mon, 17 Jan 2011 17:18:38 +0000 (17:18 +0000)]
libxl: fix guest networking on NetBSD
As previously reported, when I start guests with xl the
guest network does not work because the qemu-ifup script
no longer runs.
NetBSD doesn't have something like udev. Changing xm/xend,
libxl and xenbackendd to make everything behave the same way
is a lot more intrusive than enabling it for NetBSD again.
Signed-off-by: Christoph Egger <Christoph.Egger@amd.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 17 Jan 2011 17:14:20 +0000 (17:14 +0000)]
tools/blktap, blktap2: include <sys/mount.h> instead of <linux/fs.h>
The former is a userspace sanitised header which contains the
definitions we need. In some distros linux/fs.h defines WRITE which
conflicts with blktap's own use of that name.
Also there is no reason to use <linux/errno.h> over the more normal
<errno.h>.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Fri, 14 Jan 2011 16:38:51 +0000 (16:38 +0000)]
rcupdate: Make rcu_barrier() more paranoia-proof
I'm not sure my original barrier function is correct. It may allow a
CPU to exit the barrier loop, with no local work to do, while RCU work
is pending on other CPUs and needing one or more quiescent periods to
flush the work through.
Although rcu_pending() may handle this, it is easiest to follow
Linux's example and simply call_rcu() a callback function on every
CPU. When the callback has executed on every CPU, we know that all
previously-queued RCU work is completed, and we can exit the barrier.
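The idea can be illustrated with a single-threaded toy model. It assumes each CPU runs its queued callbacks in FIFO order, so a barrier callback only runs after everything queued before it on that CPU; the Cpu class and run_one method are hypothetical and stand in for real per-CPU RCU processing.

```python
from collections import deque


class Cpu:
    """Toy CPU with a FIFO queue of RCU callbacks (illustrative only)."""

    def __init__(self):
        self.queue = deque()

    def call_rcu(self, fn):
        self.queue.append(fn)

    def run_one(self):
        # Pretend a quiescent period passed and run the oldest callback.
        if self.queue:
            self.queue.popleft()()


def rcu_barrier(cpus):
    """Queue a callback on every CPU and wait until all of them have run."""
    pending = [len(cpus)]

    def barrier_cb():
        pending[0] -= 1

    for cpu in cpus:
        cpu.call_rcu(barrier_cb)
    while pending[0] > 0:
        for cpu in cpus:
            cpu.run_one()
```

In this model a CPU with no local work still has to run its own barrier callback, and the loop only ends once the busier CPUs have drained their earlier work too.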
Keir Fraser [Fri, 14 Jan 2011 15:21:24 +0000 (15:21 +0000)]
hvmloader: Fixes to printf() implementation.
1. Remove unportable O and D format specifiers
2. Fix X format specifier to print upper-case hex characters
3. Fix d format specifier to print -ve numbers
4. Fix handling of int vs. long (although not actually an issue
for the i386 compile target)
5. Don't use the antiquated C 'register' type attribute.
Keir Fraser [Fri, 14 Jan 2011 15:18:02 +0000 (15:18 +0000)]
x86 hvm: Do not check-and-fail on in_atomic() in hvm_copy().
Stub this out for 4.0, as PV-on-HVM drivers hit this case when
performing grant-table hypercalls. Grant-table code currently accesses
guest memory under the big per-domain lock. The test in hvm_copy() is not
necessary until the xenpaging implementation is more complete, which
will not now be until after 4.1.0.
Ian Campbell [Fri, 14 Jan 2011 14:25:31 +0000 (14:25 +0000)]
libxc: build fix with debugging disabled.
Currently hypercalls have only 5 arguments, hypercall->arg[0..4]. Do
not try and print arg[5] else:
cc1: warnings being treated as errors
xenctrl_osdep_ENOSYS.c: In function
'ENOSYS_privcmd_hypercall':
xenctrl_osdep_ENOSYS.c:30: error: array subscript is above
array bounds
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Keir Fraser [Fri, 14 Jan 2011 14:19:55 +0000 (14:19 +0000)]
x86: On CPU online/offline from dom0, try flushing RCU work on EBUSY.
Although the caller should react appropriately to EBUSY, if the error
is due to pending RCU work then we can help things along by executing
rcu_barrier() and then retrying. To this end, this changeset is an
optimisation only.
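The retry pattern can be sketched as below. The function name cpu_op_with_retry, the callable parameters, and the max_tries limit are assumptions made for illustration; the real change lives in the hypervisor's C code.

```python
import errno


def cpu_op_with_retry(cpu_op, rcu_barrier, max_tries=2):
    """Run cpu_op(); on -EBUSY, flush RCU work and try again.

    cpu_op and rcu_barrier are hypothetical callables; cpu_op returns 0
    on success or a negative errno value on failure.
    """
    ret = cpu_op()
    tries = 1
    while ret == -errno.EBUSY and tries < max_tries:
        rcu_barrier()  # help pending RCU work along before retrying
        ret = cpu_op()
        tries += 1
    return ret
```

The caller still receives -EBUSY if the retries run out, so it must keep its own handling for that error.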
Keir Fraser [Fri, 14 Jan 2011 14:18:31 +0000 (14:18 +0000)]
x86 acpi: Fix crash in enable_nonboot_cpus() on wakeup from S3/S4
Bringing a CPU back online can require RCU work to be flushed, because
the per-cpu data from last time the CPU was online may not yet be
deallocated. Use the new rcu_barrier() interface function to achieve
this.
Keir Fraser [Fri, 14 Jan 2011 09:52:02 +0000 (09:52 +0000)]
cpu hotplug: Core functions are quiet on failure.
This was already inconsistent, so make them consistently quiet and
leave it to callers to log an error. Add suitable error logging to the
arch-specific CPU bringup loops.
In particular, this avoids printing an error on EBUSY, in which case the
caller may want a silent retry loop.
Keir Fraser [Fri, 14 Jan 2011 08:34:53 +0000 (08:34 +0000)]
x86: Avoid calling xsave_alloc_save_area before xsave_init
Currently, xsave_alloc_save_area() is called via
init_idle_domain->scheduler_init->alloc_vcpu->vcpu_initialise
with xsave_cntxt_size=0, which is earlier than the xsave_init() call in
identity_cpu(). This may cause a buffer overflow on xmem_pool.
The idle domain isn't using FPU, SSE, AVX or any such extended state and
doesn't need it saved. xsave_{alloc,free}_save_area() should
test-and-exit on is_idle_vcpu(), and our context switch code should
not be doing XSAVE when switching out an idle vcpu.
Signed-off-by: Wei Gang <gang.wei@intel.com> Signed-off-by: Keir Fraser <keir@xen.org>
Allen Kay [Fri, 14 Jan 2011 08:11:46 +0000 (08:11 +0000)]
vt-d: quirks for Sandybridge errata workaround, WLAN, VT-d fault escalation
Add an errata workaround for newly released Sandybridge processor
graphics, additional WLAN device IDs for the WLAN quirk, and a quirk for
masking VT-d fault escalation to IOH HW that can cause system hangs on
some OEM hardware where the BIOS erroneously escalates VT-d faults to
the platform.
Keir Fraser [Fri, 14 Jan 2011 08:02:26 +0000 (08:02 +0000)]
pv-drivers: use PCI interfaces to request IO and MEM resources on platform device
This is the correct interface to use and something has broken the use
of the previous incorrect interface (which fails because the request
conflicts with the resources assigned for the PCI device itself
instead of nesting like the PCI interfaces do).
pci_request_region() has been available since at least Linux 2.6.5.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jan Beulich <jbeulich@novell.com>
Tim Deegan [Thu, 13 Jan 2011 15:46:13 +0000 (15:46 +0000)]
x86/mm: fix EPT PoD locking to match the normal p2m case.
This recursive-locking bug was fixed in the main p2m code in
20269:fd3d5d66c446 (in October 2009) but has lurked unseen in
the EPT side since then. Copy the fix across.
Ian Jackson [Thu, 13 Jan 2011 00:18:35 +0000 (00:18 +0000)]
xl: save domain config (userdata) under correct domid/uuid
Recent changes caused the domain config file to be saved under dom0's
filename in /var/lib/xen. This was due to the config file being saved
before the domain was created and thus before the domid and uuid were
known.
Fix this by moving the saving code to after creation.
Also, change the "default" initialisation of domid in
xl_cmdimpl.c:create_domain to be domid=-1. That provides a more
obviously wrong value than 0 (which refers to dom0) so that other bugs
of this kind would be more likely to show up.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Ian Jackson <ian.jackson@eu.citrix.com>
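The value of an obviously invalid default can be shown with a short sketch. The names INVALID_DOMID and config_filename, and the file name layout, are hypothetical; only the choice of -1 over 0 comes from the description above.

```python
INVALID_DOMID = -1  # obviously wrong, unlike 0, which refers to dom0


def config_filename(domid, uuid):
    """Build a per-domain file name (hypothetical layout).

    Refuses to run before the domain exists, so saving too early fails
    loudly instead of silently writing under dom0's name.
    """
    if domid < 0:
        raise ValueError("domain not created yet (domid=%d)" % domid)
    return "config.%d.%s" % (domid, uuid)
```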
The scanner from c/s 22735:cb94dbe20f97 is buggy and crashes with a
segmentation fault. Rebuilding the scanner appears to fix the problem
so it appears that I somehow accidentally checked in a scanner which
doesn't correspond to the committed scanner source code.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 11 Jan 2011 19:31:41 +0000 (19:31 +0000)]
libxl: config parser: print warning for apparent arbitrary python
The characters - + . ( ) : are not legal in xl config files but are
valid Python and use of at least one of them is almost essential for
writing arbitrary Python in the config file.
So if we see one of these during lexing, note it, and then after the
parse is complete if it failed we print a special extra warning.
Currently this warning refers to the nonexistent wiki page
http://wiki.xen.org/xenwiki/PythonInXlConfig
which will have to be written (and/or given a better name) before the
actual 4.1 release.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
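The approach of noting suspicious characters and only warning after a failed parse can be sketched as follows. This is not the flex-based lexer: looks_like_python and check_config are hypothetical names, the parse callable is a stand-in, and ignoring quoted strings and # comments is an assumption about where these characters are harmless.

```python
import re

SUSPECT = set("-+.():")  # characters listed in the commit message


def looks_like_python(text):
    """Report whether any suspect character appears outside strings/comments."""
    stripped = re.sub(r'"[^"\n]*"|\'[^\'\n]*\'|#[^\n]*', "", text)
    return any(ch in SUSPECT for ch in stripped)


def check_config(text, parse):
    """Return messages for text; parse is a callable raising ValueError on failure."""
    messages = []
    try:
        parse(text)
    except ValueError as err:
        messages.append("parse error: %s" % err)
        # The extra warning is only printed when the parse actually failed.
        if looks_like_python(text):
            messages.append("warning: config file looks like it contains Python code")
    return messages
```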
We are going to need to rerun flex to generate new lexer code. So
rerun it now to separate out the irrelevant from the relevant changes
to the generated files.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Gianni Tedesco [Tue, 11 Jan 2011 18:58:02 +0000 (18:58 +0000)]
tools/python/pyxl: Updates to builtin-type marshalling functions
Allow setting a string field to None as a way to zero it out.
Implement setting/getting libxl_file_references as strings.
Produce relevant Exceptions for marshallers which remain unimplemented.
Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Michal Novotny [Tue, 11 Jan 2011 18:51:28 +0000 (18:51 +0000)]
tools/xend: check for device model if path is not specified
This patch checks for the device model (in XendConfig.py) when
device_model has no path specified, i.e. when XenD tries to read the
file on the auxbin path. Without this patch applied, the meaningless
python error "coercing to Unicode: need string or buffer, NoneType
found" occurs:
[2010-11-30 13:56:47 5255] ERROR (xmlrpclib2:181) Internal error
handling xend.domain.create
Traceback (most recent call last):
File "/usr/lib64/python2.4/site-packages/xen/util/xmlrpclib2.py",
line 134, in _marshaled_dispatch
response = self._dispatch(method, params)
File "/usr/lib64/python2.4/SimpleXMLRPCServer.py", line 406, in _dispatch
return func(*params)
File
"/usr/lib64/python2.4/site-packages/xen/xend/server/XMLRPCServer.py",
line 80, in domain_create
info = XendDomain.instance().domain_create(config)
File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomain.py",
line 1001, in domain_create
dominfo = XendDomainInfo.create(config)
File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py",
line 97, in create
domconfig = XendConfig.XendConfig(sxp_obj = config)
File "/usr/lib64/python2.4/site-packages/xen/xend/XendConfig.py",
line 367, in __init__
self.validate()
File "/usr/lib64/python2.4/site-packages/xen/xend/XendConfig.py",
line 558, in validate
self._platform_sanity_check()
File "/usr/lib64/python2.4/site-packages/xen/xend/XendConfig.py",
line 502, in _platform_sanity_check
if not os.path.exists(self['platform']['device_model']):
File "/usr/lib64/python2.4/posixpath.py", line 171, in exists
st = os.stat(path)
TypeError: coercing to Unicode: need string or buffer, NoneType found
This patch raises VmError with message that no valid device model was
specified if None type was found in the device_model specification.
It has been tested with a non-existent device model, in which case the
message is printed. If an invalid (but existing) device_model is set in
the configuration file, the domain is destroyed because it crashes. If
a path is specified (i.e. it's not using the auxbin path), it bails with
an error that the device model was not found (which was already
implemented there).
Signed-off-by: Michal Novotny <minovotn@redhat.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
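The check described above amounts to testing for None before touching the filesystem. The sketch below is an assumption-laden illustration, not the XendConfig.py code: the VmError class is a local stand-in, and check_device_model and the message wording are hypothetical.

```python
import os


class VmError(Exception):
    """Local stand-in for xend's VmError."""


def check_device_model(platform):
    """Validate platform['device_model'] before os.path.exists() sees it."""
    device_model = platform.get("device_model")
    if device_model is None:
        # Without this test, os.path.exists(None) produces the
        # unhelpful TypeError shown in the traceback.
        raise VmError("No valid device model specified")
    if not os.path.exists(device_model):
        raise VmError("device model '%s' not found" % device_model)
    return device_model
```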
Gianni Tedesco [Tue, 11 Jan 2011 18:32:32 +0000 (18:32 +0000)]
libxl: move domain struct init functions from xl to libxl
This allows libxl users to get some sane default values for this complex
set of structures. This is purely code movement and there are no
functional changes except for a trivial error handling change in nic
device init.
Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Gianni Tedesco [Tue, 11 Jan 2011 18:29:43 +0000 (18:29 +0000)]
libxl: Introduce libxl_domain_create_new() and libxl_domain_create_restore()
These functions are introduced as the new way to create domains with libxl.
They prevent callers from needing to know about low-level implementation
details such as:
- libxl_domain_make()
- libxl_domain_build()
- libxl_domain_restore()
- when to attach the console
- how to start the device model
The above-mentioned functions and all APIs for the device model, which are now
redundant, have been made internal to libxl and no longer accessible.
The ocaml binding for libxl has not been properly updated to reflect the
changes: wrappers for the old functions have been removed, but the code to wrap
the new functions has not been added.
Signed-off-by: Gianni Tedesco <gianni.tedesco@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Tim Deegan [Tue, 11 Jan 2011 18:13:44 +0000 (18:13 +0000)]
tools: remove fs-front/fs-back
Its access controls are really not OK. In particular, it's not good for
libxl, which stores per-VM config blobs in a directory that is exported
to all VMs.
This will break stub-qemu save/restore, which is the only user of
fs-front that I'm aware of, but:
- It's currently broken anyway (fs-back isn't run by default and crashes
if it is run manually); and
- Stefano has a plan to plumb qemu save records through a dedicated
console channel instead.
Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 11 Jan 2011 16:48:09 +0000 (16:48 +0000)]
tools/xenpaging: fix return value from xc_mem_paging_flush_ioemu_cache
While using xenpaging, "Error flushing ioemu cache" message will be
shown even if the "flush-cache" command is sent to xenstore correctly.
That is because xenpaging assumes xc_mem_paging_flush_ioemu_cache()
returns non-zero value when the operation fails. But
xc_mem_paging_flush_ioemu_cache() returns the return value from
xs_write() which is zero on error.
So, we should invert the return value from xs_write() and return -1 on
error, or 0 on success, like other xc_ functions.
Signed-off-by: Han-Lin Li <Han-Lin.Li@itri.org.tw>
Author: Olaf Hering <olaf@aepfle.de> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
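The inversion can be modelled in a few lines. This is an illustrative Python model of the C logic, assuming xs_write reports success with a true value; the function name flush_ioemu_cache and the xenstore path are assumptions, not quoted from the patch.

```python
def flush_ioemu_cache(xs_write, domain_id):
    """Send 'flush-cache' and return 0 on success, -1 on error.

    xs_write is a hypothetical callable mirroring the xenstore write
    convention described above: true on success, false (zero) on error.
    """
    path = "/local/domain/0/device-model/%u/command" % domain_id  # assumed path
    ok = xs_write(path, "flush-cache")
    return 0 if ok else -1
```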
John Weekes [Tue, 11 Jan 2011 16:42:41 +0000 (16:42 +0000)]
stubdom: Fix stubdom-dm using "grep" improperly
stubdom-dm uses "grep" on "xm list" output to determine whether it is
already running. The existing behavior is to use "grep $domname-dm" but
this will result in a false-positive in the case of another domU running
whose name ends with the full new name; for instance, if "abctest-dm" is
running, a new "test-dm" will spin forever, waiting for it to end.
An easy fix is to have it use "grep -w" instead of "grep", searching
for the whole word only.
It also might be worth considering a switch to "xl list" from "xm list",
here and in other places.
Signed-off-by: John Weekes <lists.xen@nuclearfallout.net> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
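The difference between the two searches can be reproduced with a small sketch that emulates whole-word matching, treating letters, digits and underscore as word characters. The helper name matches_whole_word and the sample "xm list" lines are made up for illustration.

```python
import re


def matches_whole_word(line, word):
    """Emulate a whole-word search: word must not touch other word characters."""
    pattern = r"(?<![A-Za-z0-9_])" + re.escape(word) + r"(?![A-Za-z0-9_])"
    return re.search(pattern, line) is not None
```

Under this definition "abctest-dm" no longer matches a search for "test-dm", while a name such as "abc-test-dm" still would, because "-" is not a word character.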
Gianni Tedesco [Tue, 11 Jan 2011 16:31:47 +0000 (16:31 +0000)]
stubdom/minios: don't retrieve the address of void variable
Objects must not be declared to have type void. Declare shared_info
to have the appropriate type instead.
Author: Gianni Tedesco <gianni.tedesco@citrix.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Samuel Thibault [Tue, 11 Jan 2011 16:30:15 +0000 (16:30 +0000)]
stubdom/minios: use correct sized types for software floating point
Replace long/int/short sizes with proper exact-size types for 64bit
architectures. As well as making the code correct, this eliminates a
compiler warning about an uninitialised variable.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org> Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Keir Fraser [Tue, 11 Jan 2011 11:41:39 +0000 (11:41 +0000)]
xenctx: misc adjustments
- fix off-by-one errors during symbol insertion and lookup
- don't store the symbol type, as it wasn't needed at all so far and
is only needed now at parsing time
- don't insert certain kinds of symbols
Keir Fraser [Tue, 11 Jan 2011 11:40:50 +0000 (11:40 +0000)]
x86: restore x2apic pre-enabled check logic
c/s 22475 removed the early checking without replacement, neglecting
the fact that x2apic_enabled must be set early for APIC register
accesses done during second stage ACPI table parsing (rooted at
acpi_boot_init()) to work correctly. Without this, particularly
determination of the boot CPU won't work, resulting in an attempt to
bring up that CPU again as a secondary one (which fails).
Restore the functionality, now calling it from generic_apic_probe().
Keir Fraser [Tue, 11 Jan 2011 11:27:37 +0000 (11:27 +0000)]
xenpaging: update machine_to_phys_mapping[] during page deallocation
The machine_to_phys_mapping[] array needs updating during page
deallocation. If that page is allocated again, a call to
get_gpfn_from_mfn() will still return an old gfn from another guest.
This will cause trouble because this gfn number has no or different
meaning in the context of the current guest.
This happens when the entire guest ram is paged-out before
xen_vga_populate_vram() runs. Then XENMEM_populate_physmap is called
with gfn 0xff000. A new page is allocated with alloc_domheap_pages.
This new page does not have a gfn yet. However, in
guest_physmap_add_entry() the passed mfn still maps to an old gfn
(perhaps from another old guest). This old gfn is in paged-out state
in this guest's context and has no mfn anymore. As a result, the
ASSERT() triggers because p2m_is_ram() is true for p2m_ram_paging*
types. If the machine_to_phys_mapping[] array is updated properly,
both loops in guest_physmap_add_entry() turn into no-ops for the new
page and the mfn/gfn mapping will be done at the end of the function.
If XENMEM_add_to_physmap is used with XENMAPSPACE_gmfn,
get_gpfn_from_mfn() will return an apparently valid gfn. As a
result, guest_physmap_remove_page() is called. The ASSERT in
p2m_remove_page triggers because the passed mfn does not match the old
mfn for the passed gfn.
Keir Fraser [Tue, 11 Jan 2011 10:38:28 +0000 (10:38 +0000)]
xenpaging: drop paged pages in guest_remove_page
Simply drop paged-pages in guest_remove_page(), and notify xenpaging
to drop its reference to the gfn. If the ring is full, the page will
remain in paged-out state in xenpaging. This is not an issue, it just
means this gfn will not be nominated again.
Keir Fraser [Tue, 11 Jan 2011 10:32:59 +0000 (10:32 +0000)]
xenpaging: print page-in/page-out progress
Now that DPRINTF is triggered only when the environment variable
XENPAGING_DEBUG is found, make such a debug session actually useful by
printing the entire page-out/page-in process. The 'Got event from Xen'
message alone is not helpful.
Keir Fraser [Tue, 11 Jan 2011 10:32:05 +0000 (10:32 +0000)]
xenpaging: specify policy mru_size at runtime
The environment variable XENPAGING_POLICY_MRU_SIZE will change the
mru_size in the policy at runtime. Specifying the mru_size at runtime
allows the admin to keep more pages in memory so guests can make more
progress. It's also good for development to reduce the value to put
more pressure on the paging related code paths.
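Reading such a setting from the environment might look like the sketch below. Only the variable name comes from the description above; the default value, the helper name mru_size_from_env, and the fallback behaviour on bad input are assumptions.

```python
import os

DEFAULT_MRU_SIZE = 1024  # hypothetical default, not taken from xenpaging


def mru_size_from_env(environ=os.environ, default=DEFAULT_MRU_SIZE):
    """Return the MRU size from XENPAGING_POLICY_MRU_SIZE, else the default."""
    raw = environ.get("XENPAGING_POLICY_MRU_SIZE")
    if raw is None:
        return default
    try:
        value = int(raw, 0)  # base 0 also accepts forms such as 0x100
    except ValueError:
        return default
    return value if value > 0 else default
```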
Keir Fraser [Tue, 11 Jan 2011 10:31:33 +0000 (10:31 +0000)]
xenpaging: remove domain_id and mfn from struct xenpaging_victim
Remove unused member 'mfn' from struct xenpaging_victim.
xenpaging operates on a single guest, so it needs only a single
domain_id. Remove domain_id from struct xenpaging_victim and use the
one from paging->mem_event where needed. It's not used in the policy.
Keir Fraser [Mon, 10 Jan 2011 08:45:19 +0000 (08:45 +0000)]
x86_64: don't use weak symbols on x86-64
Various gcc versions inline functions that are both weak and hidden,
without even giving a warning.
Certainly the risk exists that we'll see the problem again when
another weak function gets introduced, but I don't see a way to
protect us from that.
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Just remove the weak attribute altogether. It's the only one in
non-ia64-specific code. We can get the same effect with ifdefs which,
although a bit unsightly, is better than using compiler/linker features
we cannot trust.
Keir Fraser [Mon, 10 Jan 2011 08:40:32 +0000 (08:40 +0000)]
EPT/VT-d: bug fix for EPT/VT-d table sharing
This patch makes the following changes:
1. Move EPT/VT-d sharing initialization back to when it is actually
needed, to make sure vmx_ept_vpid_cap has been initialized.
2. Add a page order parameter to iommu_pte_flush() to tell VT-d what
size of page to flush.
3. Add a hap_2mb flag to ease performance studies between the base 4KB
EPT size and when 2MB and 1GB page size support are enabled.
Keir Fraser [Sat, 8 Jan 2011 10:52:45 +0000 (10:52 +0000)]
Update AMD SVM feature flags
This patch updates AMD SVM feature flags (0x8000000A:EDX). It adds
several new feature bits, along with feature description. The feature
names are changed to be consistent with Linux kernel.
Keir Fraser [Sat, 8 Jan 2011 10:48:46 +0000 (10:48 +0000)]
Update AMD CPU feature flags 0x80000001:ECX for Xen Hypervisor
This patch syncs-up AMD CPU feature flags 0x80000001:ECX with the
latest Linux kernel. Several new features are added. Some existing
features' names are changed as well.
Keir Fraser [Sat, 8 Jan 2011 10:09:44 +0000 (10:09 +0000)]
timer: Ensure that CPU field of a timer is read safely when lock-free.
Firstly, all updates must use atomic_write16(), and lock-free reads
must use atomic_read16(). Secondly, we ensure ->cpu is the only field
accessed without a lock. This requires us to place a special sentinel
value in that field when a timer is killed, to avoid needing to read
->status outside a locked critical section.
Keir Fraser [Sat, 8 Jan 2011 10:05:55 +0000 (10:05 +0000)]
x86: Fix atomic_write*() macros to correctly inform GCC that memory
it knows about is being written to.
The bug is a copy-and-paste error from inline asm that writes to I/O
memory. In that case, as with asm for accessing guest memory,
specifying memory as a read-only parameter is acceptable because the
memory cannot alias with anything that GCC reads directly.
Keir Fraser [Sat, 8 Jan 2011 09:29:11 +0000 (09:29 +0000)]
timer: Fix up timer-state teardown on CPU offline / online-failure.
The lock-free access to timer->cpu in timer_lock() is problematic, as
the per-cpu data that is then dereferenced could disappear under our
feet. Now that per-cpu data is freed via RCU, we simply add an RCU
read-side critical section to timer_lock(). It is then also
unnecessary to migrate timers on CPU_DYING (we can defer it to the
nicer CPU_DEAD state) and we can also safely migrate timers on
CPU_UP_CANCELED.
Keir Fraser [Sat, 8 Jan 2011 09:14:23 +0000 (09:14 +0000)]
x86: Free per-cpu area for offline cpu via RCU.
This allows other CPUs to reference per-cpu areas with less strict
locking. In particular, timer.c accesses a per-cpu lock with reference
to a per-timer cpu field which it accesses with no synchronisation.
One subtlety is that this prevents us bringing a cpu back online until
the RCU work is completed. In this case we return EBUSY and the
tool stack can report the (unlikely) error, or retry, as it sees fit.