Eric Blake [Thu, 3 Mar 2011 15:50:19 +0000 (08:50 -0700)]
util: use SCM_RIGHTS in virFileOperation when needed
Currently, the hook function in virFileOperation is extremely limited:
it must be async-signal-safe, and cannot modify any memory in the
parent process. It is much handier to return a valid fd and operate
on it in the parent than to deal with hook restrictions.
* src/util/util.h (VIR_FILE_OP_RETURN_FD): New flag.
* src/util/util.c (virFileOperationNoFork, virFileOperation):
Honor new flag.
Eric Blake [Wed, 2 Mar 2011 04:59:25 +0000 (21:59 -0700)]
qemu: allow simple domain save to use fd: protocol
This allows direct saves (no compression, no root-squash NFS) to use
the more efficient fd: migration, which in turn avoids a race where
qemu exec: migration can sometimes fail because qemu does a generic
waitpid() that conflicts with the pclose() used by exec:. Further
patches will solve compression and root-squash NFS.
* src/qemu/qemu_driver.c (qemudDomainSaveFlag): Use new function
when there is no compression.
Eric Blake [Sat, 26 Mar 2011 11:27:57 +0000 (05:27 -0600)]
qemu: fix restoring a compressed save image
Latent bug introduced in commit 2d6a581960 (Aug 2009), but not exposed
until commit 1859939a (Jan 2011). Basically, when virExec creates a
pipe, it always marks libvirt's side as cloexec. If libvirt then
wants to hand that pipe to another child process, things work great if
the fd is dup2()'d onto stdin or stdout (as with stdin: or exec:
migration), but if the pipe is instead used as-is (such as with fd:
migration) then qemu sees EBADF because the fd was closed at exec().
This is a minimal fix for the problem at hand; it is slightly racy,
but no more racy than the rest of libvirt fd handling, including the
case of uncompressed save images. A more invasive fix, but ultimately
safer at avoiding leaking unintended fds, would be to _always and
atomically_ open all fds as cloexec in libvirt (thanks to primitives
like open(O_CLOEXEC), pipe2(), accept4(), ...), then teach virExec to
clear that bit for all fds explicitly marked to be handed to the child
only after forking.
Eric Blake [Wed, 23 Mar 2011 19:23:59 +0000 (13:23 -0600)]
logging: always NUL-terminate circular buffer
* src/util/logging.c (virLogStartup, virLogSetBufferSize):
Over-allocate, so that a debugger can just print the circular
buffer. Suggested by Daniel Veillard.
Eric Blake [Thu, 10 Mar 2011 22:28:28 +0000 (15:28 -0700)]
maint: use space, not tab, in remote_protocol-structs
* src/Makefile.am (remote_protocol-structs): Flatten tabs.
* src/remote_protocol-structs: Likewise. Also add a hint to emacs
to make it easier to keep spaces in the file.
Eric Blake [Wed, 23 Mar 2011 20:16:02 +0000 (14:16 -0600)]
tests: don't alter state in $HOME
Diego reported a bug where virsh tries to initialize a readline
history directory during 'make check' run as root, but fails
because /root was read-only.
It turns out that I could reproduce this as non-root, by using:
The Open Nebula driver has been unmaintained since it was first
introduced. The only commits have been for tree-wide cleanups.
It also has a major design flaw, in that it only knows about guests
that it has created itself, which makes it of very limited use.
Discussions wrt evolution of the VMWare ESX driver, concluded that
it should limit itself to single-node ESX operation and not try to
manage the multi-node architecture of VirtualCenter. Open Nebula
is a cluster like Virtual Center, not a single node system, so
the same reasoning applies.
The DeltaCloud project includes an Open Nebula driver and is a much
better fit architecturally, since it is explicitly targetting the
distributed multihost cluster scenario.
Thus this patch deletes the libvirt Open Nebula driver with the
recommendation that people use DeltaCloud for managing it instead.
* configure.ac: Remove probe for xmlrpc & --with-one arg
* daemon/Makefile.am, daemon/libvirtd.c, src/Makefile.am: Remove
ONE driver build
* src/opennebula/one_client.c, src/opennebula/one_client.h,
src/opennebula/one_conf.c, src/opennebula/one_conf.h,
src/opennebula/one_driver.c, src/opennebula/one_driver.c: Delete
files
* autobuild.sh, libvirt.spec.in, mingw32-libvirt.spec.in: Remove
build rules for Open Nebula
* docs/drivers.html.in, docs/sitemap.html.in: Remove reference
to OpenNebula
* docs/drvone.html.in: Delete file
Daniel Veillard [Mon, 28 Mar 2011 02:40:24 +0000 (10:40 +0800)]
Update on the goal page
Some things to note in this patch:
- we do extend libvirt scope beyond purely managing domains, there is
already a number of blocks which are here as helpr functions to
manage the resources on the host.
- we are expanding in the direction of libvirt being sufficient to do
most of the management on the Host (but within the limits of the need
for virtualization, e.g. managing users on the host is out of scope)
- we don't require anymore APIs to be supported by multiple
hypervisors to get in, it's already the case in practice, but we
should still make sure the semantic of those APIs are clear. We
added quite a bit for QEmu, but for example I saw on IRC that VBox
could emulate a network unplug/replug on a domain interface, and
that would be a good addition even if a priori no other hypervisor
supports it.
- Make clear that all libvirt APIs are available remotely, which is
key to use libvirt for building management tools.
- link the goal page from the project main page
As for libvirt project directions, I think it just reflects the natural
evolution in the last couple of years. We are less hypervisor agnostic
and extending in the Host management. Clearly there is interest in
making sure libvirt is complete in term of features for the hypervisors
supported, especially the ones like KVM or LXC which don't really have
integrated management library.
* docs/goals.html.in: update the goals page
* docs/index.html.in: link it from the top page
Jiri Denemark [Fri, 25 Mar 2011 11:03:00 +0000 (12:03 +0100)]
daemon: Avoid resetting errors before they are reported
Commit f44bfb7 was supposed to make sure no additional libvirt API (esp.
*Free) is called before remoteDispatchConnError() is called on error.
However, the patch missed two instances.
Eric Blake [Tue, 22 Mar 2011 22:22:37 +0000 (16:22 -0600)]
command: add virCommandAbort for cleanup paths
Sometimes, an asynchronous helper is started (such as a compressor
or iohelper program), but a later error means that we want to
abort that child. Make this easier.
Note that since daemons and virCommandRunAsync can't mix, the only
time virCommandFree can reap a process is if someone did
virCommandRunAsync for a non-daemon and didn't stash the pid.
* src/util/command.h (virCommandAbort): New prototype.
* src/util/command.c (_virCommand): Add new field.
(virCommandRunAsync, virCommandWait): Track whether pid was used.
(virCommandFree): Reap child if caller did not request pid.
(virCommandAbort): New function.
* src/libvirt_private.syms (command.h): Export it.
* tests/commandtest.c (test19): New test.
Eric Blake [Wed, 23 Mar 2011 23:42:27 +0000 (17:42 -0600)]
command: don't mix RunAsync and daemons
It doesn't make sense to run a daemon without synchronously
waiting for the child process to reply whether the daemon has
been kicked off and pidfile written yet.
* src/util/command.c (VIR_EXEC_RUN_SYNC): New constant.
(virCommandRun): Set temporary flag.
(virCommandRunAsync): Use it to prevent async runs of intermediate
child when spawning asynchronous daemon grandchild.
Eric Blake [Tue, 22 Mar 2011 17:55:45 +0000 (11:55 -0600)]
command: properly diagnose process exit via signal
Child processes don't always reach _exit(); if they die from a
signal, then any messages should still be accurate. Most users
either expect a 0 status (thankfully, if status==0, then
WIFEXITED(status) is true and WEXITSTATUS(status)==0 for all
known platforms) or were filtering on WIFEXITED before printing
a status, but a few were missing this check. Additionally,
nwfilter_ebiptables_driver was making an assumption that works
on Linux (where WEXITSTATUS shifts and WTERMSIG just masks)
but fails on other platforms (where WEXITSTATUS just masks and
WTERMSIG shifts).
* src/util/command.h (virCommandTranslateStatus): New helper.
* src/libvirt_private.syms (command.h): Export it.
* src/util/command.c (virCommandTranslateStatus): New function.
(virCommandWait): Use it to also diagnose status from signals.
* src/security/security_apparmor.c (load_profile): Likewise.
* src/storage/storage_backend.c
(virStorageBackendQEMUImgBackingFormat): Likewise.
* src/util/util.c (virExecDaemonize, virRunWithHook)
(virFileOperation, virDirCreate): Likewise.
* daemon/remote.c (remoteDispatchAuthPolkit): Likewise.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Likewise.
Wen Congyang [Thu, 24 Mar 2011 08:56:40 +0000 (16:56 +0800)]
fix the check of the output of monitor command 'device_add'
Hotpluging host usb device by text mode will fail, because the monitor
command 'device_add' outputs 'husb: using...' if it succeeds, but we
think the command should not output anything.
Eric Blake [Fri, 18 Mar 2011 17:32:35 +0000 (11:32 -0600)]
build: enforce reference count checking
Add the compiler attribute to ensure we don't introduce any more
ref bugs like were just patched in commit 9741f34, then explicitly
mark the remaining places in code that are safe.
* src/qemu/qemu_monitor.h (qemuMonitorUnref): Mark
ATTRIBUTE_RETURN_CHECK.
* src/conf/domain_conf.h (virDomainObjUnref): Likewise.
* src/conf/domain_conf.c (virDomainObjParseXML)
(virDomainLoadStatus): Fix offenders.
* src/openvz/openvz_conf.c (openvzLoadDomains): Likewise.
* src/vmware/vmware_conf.c (vmwareLoadDomains): Likewise.
* src/qemu/qemu_domain.c (qemuDomainObjBeginJob)
(qemuDomainObjBeginJobWithDriver)
(qemuDomainObjExitRemoteWithDriver): Likewise.
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): Likewise.
Suggested by Daniel P. Berrange.
Eric Blake [Fri, 18 Mar 2011 20:41:13 +0000 (14:41 -0600)]
maint: prohibit access(,X_OK)
This simplifies several callers that were repeating checks already
guaranteed by util.c, and makes other callers more robust to now
reject directories. remote_driver.c was over-strict - access(,R_OK)
is only needed to execute a script file; a binary only needs
access(,X_OK) (besides, it's unusual to see a file with x but not
r permissions, whether script or binary).
Jiri Denemark [Wed, 23 Mar 2011 15:00:58 +0000 (16:00 +0100)]
Make error reporting in libvirtd thread safe
Bug https://bugzilla.redhat.com/show_bug.cgi?id=689374 reported libvirtd
crash during error dispatch.
The reason is that libvirtd uses remoteDispatchConnError() with non-NULL
conn parameter which means that virConnGetLastError() is used instead of
its thread safe replacement virGetLastError().
So when several libvirtd threads are reporting errors at the same time,
the errors can get mixed or corrupted or in case of bad luck libvirtd
itself crashes.
Since Daniel B. is going to rewrite this code from scratch on top of his
RPC infrastructure, I tried to come up with a minimal fix. Thus,
remoteDispatchConnError() now just ignores its conn argument and always
calls virGetLastError(). However, several callers had to be touched as
well, since no libvirt API is allowed to be called before dispatching
the error. Doing so would reset the error and we would have nothing to
dispatch. As a result of that, the code is not very nice but that
doesn't really make daemon/remote.c worse than it is now :-) And it will
all die soon, which is good.
The bug report also contains a reproducer in C which detects both mixed
up error messages and libvirtd crash. Before this patch, I was able to
crash libvirtd in about 20 seconds up to 3 minutes depending on number
of CPU cores (more is better) and luck.
Eric Blake [Wed, 16 Mar 2011 21:25:56 +0000 (15:25 -0600)]
build: nuke all .x-sc* files, and fix VPATH syntax-check
Not every day you see a patch that nukes 27 files!
* .gnulib: Update to latest, for maint.mk improvements
* bootstrap: Resync to gnulib.
* bootstrap.conf (ACLOCAL): Swap the secondary aclocal include
directory, now that bootstrap picks up gnulib/m4 instead of m4.
* Makefile.am (syntax_check_exceptions, EXTRA_DIST): No longer
worry about nuked files.
* cfg.mk (sc_x_sc_dist_check): Delete dead rule.
(VC_LIST_ALWAYS_EXCLUDE_REGEX): Add HACKING.
(exclude_file_name_regexp--sc_*): Inline and simplify contents...
* .x-sc_*: ...from here, then delete the files.
Eric Blake [Wed, 23 Mar 2011 16:30:49 +0000 (10:30 -0600)]
rpm: add missing dependencies
Among others, the missing radvd dependency showed up as:
error: Failed to start network ipv6net
error: Cannot find radvd - Possibly the package isn't installed: No such file
or directory
even when radvd was installed, because the RADVD preprocessor
symbol was missing at configure time.
* libvirt.spec.in (with_network): Add BuildRequires for radvd,
iptables, and ip6tables.
(BuildRequires): Add libxslt and augeas for docs and test.
(with_libvirtd): Add module-init-tools for modprobe.
(with_nwfilter): Add BuildRequires for ebtables.
(with_esx): Fix esx build on RHEL 5, thanks to curl-devel rename.
Osier Yang [Wed, 23 Mar 2011 14:57:44 +0000 (22:57 +0800)]
util: Fix return value for virJSONValueFromString if it fails
Problem:
"parser.head" is not NULL even if it's free'ed by "virJSONValueFree",
returning "parser.head" when "virJSONValueFromString" fails will cause
unexpected errors (libvirtd will crash sometimes), e.g.
In function "qemuMonitorJSONArbitraryCommand":
if (!(cmd = virJSONValueFromString(cmd_str)))
goto cleanup;
if (qemuMonitorJSONCommand(mon, cmd, &reply) < 0)
goto cleanup;
......
cleanup:
virJSONValueFree(cmd);
It will continues to send command to monitor even if "virJSONValueFromString"
is failed, and more worse, it trys to free "cmd" again.
Crash example:
{"error":{"class":"QMPBadInputObject","desc":"Expected 'execute' in QMP input","data":{"expected":"execute"}}}
{"error":{"class":"QMPBadInputObject","desc":"Expected 'execute' in QMP input","data":{"expected":"execute"}}}
error: server closed connection:
error: unable to connect to '/var/run/libvirt/libvirt-sock', libvirtd may need to be started: Connection refused
error: failed to connect to the hypervisor
This fix is to:
1) return NULL for failure of "virJSONValueFromString",
2) and it seems "virJSONValueFree" uses incorrect loop index for type
of "VIR_JSON_TYPE_OBJECT", fix it together.
Eric Blake [Fri, 18 Mar 2011 17:27:14 +0000 (11:27 -0600)]
qemu: simplify monitor callbacks
A future patch will change reference counting idioms; consolidating
this pattern now makes the next patch smaller (touch only the new
macro rather than every caller).
* src/qemu/qemu_monitor.c (QEMU_MONITOR_CALLBACK): New helper.
(qemuMonitorGetDiskSecret, qemuMonitorEmitShutdown)
(qemuMonitorEmitReset, qemuMonitorEmitPowerdown)
(qemuMonitorEmitStop, qemuMonitorEmitRTCChange)
(qemuMonitorEmitWatchdog, qemuMonitorEmitIOError)
(qemuMonitorEmitGraphics): Use it to reduce duplication.
Roopa Prabhu [Tue, 22 Mar 2011 19:27:01 +0000 (15:27 -0400)]
8021Qbh: use preassociate-rr during the migration prepare stage
This patch introduces PREASSOCIATE-RR during incoming VM migration on the
destination host. This is similar to the usage of PREASSOCIATE during
migration in 8021qbg libvirt code today. PREASSOCIATE-RR is a VDP operation.
With the latest at IEEE, 8021qbh will need to support VDP operations.
A corresponding enic driver patch to support PREASSOCIATE-RR for 8021qbh
will be posted for net-next-2.6 inclusion soon.
Fix uninitialized variable & error reporting in LXC veth setup
THe veth setup in LXC had a couple of flaws, first brInit did
not report any error when it failed. Second vethCreate() did
not correctly initialize the variable containing the return
code, so could report failure even when it succeeded.
Enhance the QEMU migration monitoring loop, so that it can get
a signal to change migration speed on the fly
* src/qemu/qemu_domain.h: Add signal for changing speed on the fly
* src/qemu/qemu_driver.c: Wire up virDomainMigrateSetSpeed driver
* src/qemu/qemu_migration.c: Support signal for changing speed
Add public API for setting migration speed on the fly
It is possible to set a migration speed limit when starting
migration. This new API allows the speed limit to be changed
on the fly to adjust to changing conditions
Thibault Vincent [Tue, 22 Mar 2011 13:12:36 +0000 (21:12 +0800)]
qemu: add two hook script events "prepare" and "release"
Fix for bug https://bugzilla.redhat.com/show_bug.cgi?id=618970
The "prepare" hook is called very early in the VM statup process
before device labeling, so that it can allocate ressources not
managed by libvirt, such as DRBD, or for instance create missing
bridges and vlan interfaces.
* src/util/hooks.c src/util/hooks.h: add definitions for new hooks
VIR_HOOK_QEMU_OP_PREPARE and VIR_HOOK_QEMU_OP_RELEASE
* src/qemu/qemu_process.c: use them in qemuProcessStart and
qemuProcessStop()
Eric Blake [Wed, 16 Mar 2011 02:21:45 +0000 (20:21 -0600)]
qemu: simplify interface fd handling in monitor
With only a single caller to these two monitor commands, I
didn't need to wrap a new WithFds version, but just change
the command itself.
* src/qemu/qemu_monitor.h (qemuMonitorAddNetdev)
(qemuMonitorAddHostNetwork): Add parameters.
* src/qemu/qemu_monitor.c (qemuMonitorAddNetdev)
(qemuMonitorAddHostNetwork): Add support for fd passing.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Use it to
simplify code.
Eric Blake [Tue, 15 Mar 2011 23:10:16 +0000 (17:10 -0600)]
qemu: simplify PCI configfd handling in monitor
This is also a bug fix - on the error path, qemu_hotplug would
leave the configfd file leaked into qemu. At least the next
attempt to hotplug a PCI device would reuse the same fdname,
and when the qemu getfd monitor command gets a new fd by the
same name as an earlier one, it closes the earlier one, so there
is no risk of qemu running out of fds.
* src/qemu/qemu_monitor.h (qemuMonitorAddDeviceWithFd): New
prototype.
* src/qemu/qemu_monitor.c (qemuMonitorAddDevice): Move guts...
(qemuMonitorAddDeviceWithFd): ...to new function, and add support
for fd passing.
* src/qemu/qemu_hotplug.c (qemuDomainAttachHostPciDevice): Use it
to simplify code.
Suggested by Daniel P. Berrange.
Eric Blake [Wed, 16 Mar 2011 01:38:06 +0000 (19:38 -0600)]
qemu: simplify monitor fd error handling
qemu_monitor was already returning -1 and setting errno to EINVAL
on any attempt to send an fd without a unix socket, but this was
a silent failure in the case of qemuDomainAttachHostPciDevice.
Meanwhile, qemuDomainAttachNetDevice was doing some sanity checking
for a better error message; it's better to consolidate that to a
central point in the API.
* src/qemu/qemu_hotplug.c (qemuDomainAttachNetDevice): Move sanity
checking...
* src/qemu/qemu_monitor.c (qemuMonitorSendFileHandle): ...into
central location.
Suggested by Chris Wright.
Eric Blake [Wed, 16 Mar 2011 21:47:32 +0000 (15:47 -0600)]
udev: fix regression with qemu:///session
https://bugzilla.redhat.com/show_bug.cgi?id=684655 points out
a regression introduced in commit 2215050edd - non-root users
can't connect to qemu:///session because libvirtd dies when
it can't use pciaccess initialization.
* src/node_device/node_device_udev.c (udevDeviceMonitorStartup):
Don't abort udev driver (and libvirtd overall) if non-root user
can't use pciaccess.
Eric Blake [Sat, 19 Mar 2011 02:19:31 +0000 (20:19 -0600)]
logging: fix off-by-one bug
Valgrind caught that our log wrap-around was going 1 past the end.
Regression introduced in commit b16f47a; previously the
buffer was static and size+1 bytes, but now it is dynamic and
exactly size bytes.
* src/util/logging.c (virLogStr): Don't write past end of log.
Wen Congyang [Mon, 21 Mar 2011 06:29:35 +0000 (14:29 +0800)]
do not report OOM error when prepareCall() failed
We have reported error in the function prepareCall(), and
the error is not only OOM error. So we should not report
OOM error in the function call() when prepareCall() failed.
Osier Yang [Mon, 21 Mar 2011 07:58:51 +0000 (15:58 +0800)]
doc: Add schema definition for imagelabel
<imagelable> is not generated by running domain, actually we parse
it in src/conf/domain_conf.c, this patch is to fix it, otherwise any
validation (virt-xml-validate) on the domain xml dumped from shutoff
domain containing <imagelable> will fail.
Tiziano Mueller [Sat, 5 Mar 2011 17:36:23 +0000 (18:36 +0100)]
update virGetVersion description
The current description suggests that you always have to provide
a valid typeVer pointer. But if you want only the libvirt version
it's also possible to set type and typeVer to NULL to skip the
hypervisor part.
Hu Tao [Mon, 7 Mar 2011 03:49:12 +0000 (11:49 +0800)]
Don't return an error on failure to create blkio controller
This patch enables cgroup controllers as much as possible by skipping
the creation of blkio controller when running with old kernels that
doesn't support multi-level directory for blkio controller.
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Jim Fehlig [Fri, 18 Mar 2011 20:54:13 +0000 (14:54 -0600)]
Don't build libxenlight driver for Xen 4.0
The libxenlight driver does not build against the tech preview
version of libxenlight in Xen 4.0. Only enable building the
driver against more complete libxenlight found in Xen 4.1.
Eric Blake [Tue, 15 Mar 2011 02:20:53 +0000 (20:20 -0600)]
qemu: respect locking rules
THREADS.txt states that the contents of vm should not be read or
modified while the vm lock is not held, but that the lock must not
be held while performing a monitor command. This fixes all the
offenders that I could find.
* src/qemu/qemu_process.c (qemuProcessStartCPUs)
(qemuProcessInitPasswords, qemuProcessStart): Don't modify or
refer to vm state outside lock.
* src/qemu/qemu_driver.c (qemudDomainHotplugVcpus): Likewise.
* src/qemu/qemu_hotplug.c (qemuDomainChangeGraphicsPasswords):
Likewise.
Since radvd is executed by daemonizing it, the attempt to exec the
radvd binary doesn't happen until after libvirtd has already received
an exit code from the intermediate forked process, so no error is
detected or logged by __virExec().
We can't require radvd as a prerequisite for the libvirt package (many
installations don't use IPv6, so they don't need it), so instead we
add in a check to verify there is an executable radvd binary prior to
trying to exec it.
When SASL is active, it was possible that we read and decoded
more data off the wire than we initially wanted. The loop
processing this data terminated after only one message to
avoid delaying the calling thread, but this could delay
event delivery. As long as there is decoded SASL data in
memory, we must process it, before returning to the poll()
event loop.
This is a counterpart to the same kind of issue solved in
virExec would only resolved the binary to $PATH if no env
variables were being set. Since there is no execvep() API
in POSIX, we use virFindFileInPath to manually resolve
the binary and then use execv() instead of execvp().
Jim Fehlig [Thu, 10 Feb 2011 22:42:34 +0000 (15:42 -0700)]
Add libxenlight driver
Add a new xen driver based on libxenlight [1], which is the primary
toolstack starting with Xen 4.1.0. The driver is stateful and runs
privileged only.
Like the existing xen-unified driver, the libxenlight driver is
accessed with xen:// URI. Driver selection is based on the status
of xend. If xend is running, the libxenlight driver will not load
and xen:// connections are handled by xen-unified. If xend is not
running *and* the libxenlight driver is available, xen://
connections are deferred to the libxenlight driver.
V6:
- Address several code style issues noted by Daniel Veillard
- Make drive work with xen:/// URI
- Hold domain object reference while domain is injected in
libvirt event loop. Race found and fixed by Markus Groß.
V5:
- Ensure events are unregistered when domain private data
is destroyed. Discovered and fixed by Markus Groß.
V4:
- Handle restart of libvirtd, reconnecting to previously
started domains
- Rebased to current master
- Tested against Xen 4.1 RC7-pre (c/s 22961:c5d121fd35c0)
V3:
- Reserve vnc port within driver when autoport=yes
V2:
- Update to Xen 4.1 RC6-pre (c/s 22940:5a4710640f81)
- Rebased to current master
- Plug memory leaks found by Stefano Stabellini and valgrind
- Handle SHUTDOWN_crash domain death event
Jiri Denemark [Mon, 7 Mar 2011 14:07:01 +0000 (15:07 +0100)]
util: Forbid calling hash APIs from iterator callback
Calling most hash APIs is not safe from inside of an iterator callback.
Exceptions are APIs that do not modify the hash table and removing
current hash entry from virHashFroEach callback.
This patch make all APIs which are not safe fail instead of just relying
on the callback being nice not calling any unsafe APIs.
Wen Congyang [Wed, 16 Mar 2011 09:01:23 +0000 (17:01 +0800)]
do not unref obj in qemuDomainObjExitMonitor*
Steps to reproduce this bug:
# cat test.sh
#! /bin/bash -x
virsh start domain
sleep 5
virsh qemu-monitor-command domain 'cpu_set 2 online' --hmp
# while true; do ./test.sh ; done
Then libvirtd will crash.
The reason is that:
we add a reference of obj when we open the monitor. We will reduce this
reference when we free the monitor.
If the reference of monitor is 0, we will free monitor automatically and
the reference of obj is reduced.
But in the function qemuDomainObjExitMonitorWithDriver(), we reduce this
reference again when the reference of monitor is 0.
It will cause the obj be freed in the function qemuDomainObjEndJob().
Then we start the domain again, and libvirtd will crash in the function
virDomainObjListSearchName(), because we pass a null pointer(obj->def->name)
to strcmp().
Wen Congyang [Mon, 7 Mar 2011 06:35:49 +0000 (14:35 +0800)]
qemu: check driver name while attaching disk
This bug was reported by Shi Jin(jinzishuai@gmail.com):
=============
# virsh attach-disk RHEL6RC /var/lib/libvirt/images/test3.img vdb \
--driver file --subdriver qcow2
Disk attached successfully
# virsh save RHEL6RC /var/lib/libvirt/images/memory.save
Domain RHEL6RC saved to /var/lib/libvirt/images/memory.save
# virsh restore /var/lib/libvirt/images/memory.save
error: Failed to restore domain from /var/lib/libvirt/images/memory.save
error: internal error unsupported driver name 'file'
for disk '/var/lib/libvirt/images/test3.img'
=============
We check the driver name when we start or restore VM, but we do
not check it while attaching a disk. This adds the same check on disk
driverName used in qemuBuildCommandLine to qemudDomainAttachDevice.
Daniel Veillard [Thu, 17 Mar 2011 07:30:49 +0000 (15:30 +0800)]
Improve logging documentation including the debug buffer
* docs/logging.html.in: document the fact that starting from
0.9.0 the server logs goes to libvirtd.log instead of syslog
by default, describe the debug buffer, restructure the page
and add a couple more examples
Daniel Veillard [Tue, 15 Mar 2011 08:10:03 +0000 (16:10 +0800)]
Avoid taking lock in libvirt debug dump
As pointed out, locking the buffer from the signal handler
cannot been guaranteed to be safe, so to avoid any hazard
we prefer the trade off of dumping logs possibly messed up
by concurrent logging activity rather than risk a daemon
crash.
* src/util/logging.c: change virLogEmergencyDumpAll() to not
take any lock on the log buffer but reset buffer content variables
to an empty set before starting the actual dump.
Wen Congyang [Wed, 16 Mar 2011 06:54:28 +0000 (14:54 +0800)]
unlock the monitor when unwatching the monitor
Steps to reproduce this bug:
# virsh qemu-monitor-command domain 'cpu_set 2 online' --hmp
The domain has 2 cpus, and we try to set the third cpu online.
The qemu crashes, and this command will hang.
The reason is that the refs is not 1 when we unwatch the monitor.
We lock the monitor, but we do not unlock it. So virCondWait()
will be blocked.
virsh: fix memtune's help message for swap_hard_limit
* Correct the documentation for cgroup: the swap_hard_limit indicates
mem+swap_hard_limit.
* Change cgroup private apis to: virCgroupGet/SetMemSwapHardLimit
Signed-off-by: Nikunj A. Dadhania <nikunj@linux.vnet.ibm.com>
Alex Williamson [Thu, 17 Mar 2011 20:26:36 +0000 (14:26 -0600)]
Add PCI sysfs reset access
I'm proposing we make use of $PCIDIR/reset in qemu-kvm to reset
devices on VM reset. We need to add it to libvirt's list of
files that get ownership for device assignment.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Jim Fehlig [Thu, 17 Mar 2011 20:16:47 +0000 (14:16 -0600)]
Support Xen sysctl v8, domctl v7
xen-unstable c/s 21118:28e5409e3fb3 bumped sysctl version to 8.
xen-unstable c/s 21212:de94884a669c introduced CPU pools feature,
adding another member to xen_domctl_getdomaininfo struct. Add a
corresponding domctl v7 struct in xen hypervisor sub-driver and
detect sysctl v8 during initialization.
Matthias Bolte [Thu, 17 Mar 2011 16:51:33 +0000 (17:51 +0100)]
remote: Add missing virCondDestroy calls
The virCond of the remote_thread_call struct was leaked in some
places. This results in leaking the underlying mutex. Which in turn
leaks a handle on Windows.
Reported by Aliaksandr Chabatar and Ihar Smertsin.
Eric Blake [Wed, 16 Mar 2011 17:50:44 +0000 (11:50 -0600)]
build: improve rpm generation for distro backports
When building for an older distro, it's convenient to just
tell rpmbuild to define dist (for example, to .el6_0), rather
than also remembering to define rhel to 6.
* libvirt.spec.in: Guess %{rhel} based on %{dist}.
Based on an idea by Jiri Denemark.
Laine Stump [Tue, 15 Mar 2011 20:22:25 +0000 (16:22 -0400)]
macvtap: log an error if on failure to connect to netlink socket
A bug in libnl (see https://bugzilla.redhat.com/show_bug.cgi?id=677724
and https://bugzilla.redhat.com/show_bug.cgi?id=677725) makes it very
easy to create a failure to connect to the netlink socket when trying
to open a macvtap network device ("type='direct'" in domain interface
XML). When that error occurred (during a call to libnl's nl_connect()
from libvirt's nlComm(), there was no log message, leading virsh (for
example) to report "unknown error".
There were two other cases in nlComm where an error in a libnl
function might return with failure but no error reported. In all three
cases, this patch logs a message which will hopefully be more useful.
Note that more detailed information about the failure might be
available from libnl's nl_geterror() function, but it calls
strerror(), which is not threadsafe, so we can't use it.
Osier Yang [Wed, 16 Mar 2011 08:28:07 +0000 (16:28 +0800)]
storage: Fix a problem which will cause libvirtd crashed
If pool xml has no definition for "port", then "Segmentation fault"
happens when jumping to "cleanup:" to do "VIR_FREE(port)", as "port"
was not initialized in this situation.
Eric Blake [Wed, 22 Dec 2010 20:48:05 +0000 (13:48 -0700)]
qemu: improve efficiency of dd during snapshots
POSIX states about dd:
If the bs=expr operand is specified and no conversions other than
sync, noerror, or notrunc are requested, the data returned from each
input block shall be written as a separate output block; if the read
returns less than a full block and the sync conversion is not
specified, the resulting output block shall be the same size as the
input block. If the bs=expr operand is not specified, or a conversion
other than sync, noerror, or notrunc is requested, the input shall be
processed and collected into full-sized output blocks until the end of
the input is reached.
Since we aren't using conv=sync, there is no zero-padding, but our
use of bs= means that a short read results in a short write. If
instead we use ibs= and obs=, then short reads are collected and dd
only has to do a single write, which can make dd more efficient.
* src/qemu/qemu_monitor.c (qemuMonitorMigrateToFile):
Avoid 'dd bs=', since it can cause short writes.
Eric Blake [Mon, 14 Mar 2011 16:44:37 +0000 (10:44 -0600)]
virsh: allow empty string arguments
"virsh connect ''" should try to connect to the default connection,
but the previous patch made it issue a warning about an invalid URI.
* tools/virsh.c (VSH_OFLAG_EMPTY_OK): New option flag.
(vshCommandOptString): Per the declaration, value is required to
be non-NULL. Honor new flag.
(opts_connect): Allow empty string connection.
The virCommandNewArgs() method would free the virCommandPtr
if it failed to add the args. This meant errors reported in
virCommandAddArgSet() were lost. Simply removing the check
for errors from the constructor means they can be reported
correctly later
The virCommandAddEnvPassCommon() method failed to check for
errors before reallocating the cmd->env array, causing a
potential SEGV if cmd was NULL
The virCommandAddArgSet() method needs to validate that at
least 1 element in 'val's parameter is non-NULL, otherwise
code like
Add virSetBlocking() to allow O_NONBLOCK to be toggle on or off
The virSetNonBlock() API only allows enabling non-blocking
operations. It doesn't allow turning blocking back on. Add
a new API to allow arbitrary toggling.
Taku Izumi [Tue, 15 Mar 2011 07:49:31 +0000 (16:49 +0900)]
libvirt: fix a simple bug in virDomainSetMemoryFlags()
This patch fix a simple bug in virDomainSetMemoryFlags function.
The patch sent before lacks the consideration of the case
where the driver doesn't support virDomainSetMemoryFlags API.
Make LXC container startup/shutdown/I/O more robust
The current LXC I/O controller looks for HUP to detect
when a guest has quit. This isn't reliable as during
initial bootup it is possible that 'init' will close
the console and let mingetty re-open it. The shutdown
of containers was also flakey because it only killed
the libvirt I/O controller and expected container
processes to gracefully follow.
Change the I/O controller such that when it see HUP
or an I/O error, it uses kill($PID, 0) to see if the
process has really quit.
Change the container shutdown sequence to use the
virCgroupKillPainfully function to ensure every
really goes away
This change makes the use of the 'cpu', 'devices'
and 'memory' cgroups controllers compulsory with
LXC
* docs/drvlxc.html.in: Document that certain cgroups
controllers are now mandatory
* src/lxc/lxc_controller.c: Check if PID is still
alive before quitting on I/O error/HUP
* src/lxc/lxc_driver.c: Use virCgroupKillPainfully
Daniel Veillard [Tue, 8 Mar 2011 10:31:20 +0000 (18:31 +0800)]
Allow to dynamically set the size of the debug buffer
This is the part allowing to dynamically resize the debug log
buffer from it's default 64kB size. The buffer is now dynamically
allocated.
It adds a new API virLogSetBufferSize() which resizes the buffer
If passed a zero size, the buffer is deallocated and we do the small
optimization of not formatting messages which are not output anymore.
On the daemon side, it just adds a new option log_buffer_size to
libvirtd.conf and call virLogSetBufferSize() if needed
* src/util/logging.h src/util/logging.c src/libvirt_private.syms:
make buffer dynamic and add virLogSetBufferSize() internal API
* daemon/libvirtd.conf: document the new log_buffer_size option
* daemon/libvirtd.c: read and use the new log_buffer_size option