Avoid creating top level cgroups if just querying for existance
When getting the driver/domain cgroup it is possible to specify
whether it should be auto created. If auto-creation was turned
off, libvirt still mistakenly created its own top level cgroup
* src/util/cgroup.c: Honour autocreate flag for top level cgroup
Laine Stump [Thu, 4 Mar 2010 22:35:27 +0000 (17:35 -0500)]
Change default for storage uid/gid from getuid()/getgid() to -1/-1
This allows the config to have a setting that means "leave it alone",
eg when building a pool where the directory already exists the user
may want the current uid/gid of the directory left intact. This
actually gets us back to older behavior - before recent changes to the
pool building code, we weren't as insistent about honoring the uid/gid
settings in the XML, and virt-manager was taking advantage of this
behavior.
As a side benefit, removing calls to getuid/getgid from the XML
parsing functions also seems like a good idea. And having a default
that is different from a common/useful value (0 == root) is a good
thing in general, as it removes ambiguity from decisions (at least one
place in the code was checking for (perms.uid == 0) to see if a
special uid was requested).
Note that this will only affect newly created pools and volumes. Due
to the way that the XML is parsed, then formatted for newly created
volumes, all existing pools/volumes already have an explicit uid and
gid set.
src/conf/storage_conf.c: Remove calls to setuid/setgid for default values
of uid/gid, and set them to -1 instead
src/storage/storage_backend.c:
src/storage/storage_backend_fs.c:
Make account for the new default values of perms.uid
and perms.gid.
build: vbox: avoid build failure when linking with --no-add-needed
With the recent changes to the linking defaults in Fedora 13 (namely
enabling --no-add-needed behaviour by default), we have to pass the
dlopen()-providing libraries directly at the link of the module; use the
same AC_SEARCH_LIBS function as used before to look for it and add it to
the Makefile.
QEMU has a monitor command 'set_cpu' which allows a specific
CPU to be toggled between online& offline state. libvirt CPU
hotplug does not work in terms of individual indexes CPUs.
Thus to support this, we iteratively toggle the online state
when the total number of vCPUs is adjusted via libvirt
NB, currently untested since QEMU segvs when running this!
* src/qemu/qemu_driver.c: Toggle online state for CPUs when
doing hotplug
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Add
monitor API for toggling a CPU's online status via 'set_cpu
The storage backend implementations all presume that the XML parser
is validating correctness of the source specification. The check for
a source device was lost at some point. This allowed for a potential
crash in the disk backend. Re-introduce the sanity check
* src/conf/storage_conf.c: Re-add check for source device
Don't raise error message from cgroups if QEMU fails to start
The code to remove the cgroup after QEMU failed to startup could
be obscuring a real error from earlier on. It is not neccessary
to raise an error in this case, so tell cgroups to keep quiet
Add missing device type check in QEMU PCI hotunplug
The QEMU hotunplug code for PCI devices was looking at host
devices in the guest config without first filtering non
PCI devices. This means it was reading garbage
* src/qemu/qemu_driver.c: Filter out non-PCI devices
Commit 3c12a67b766cce51b47861ccde2be41de369f832 added
a dependency on the NFS_SUPER_MAGIC macro, which is
defined in linux/magic.h. Unfortunately linux/magic.h
is not available in RHEL-5, and causes a compile error.
Just define it locally, since this is something that
can't change.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Laine Stump [Wed, 3 Mar 2010 16:07:18 +0000 (17:07 +0100)]
Make domain save work on root-squash NFS
Move *all* file operations related to creation and writing of libvirt
header to the domain save file into a hook function that is called by
virFileOperation. First try to call virFileOperation as root. If that
fails with EACCESS, and (in the case of Linux) statfs says that we're
trying to save the file on an NFS share, rerun virFileOperation,
telling it to fork a child process and setuid to the qemu user. This
is the only way we can successfully create a file on a root-squashed
NFS server.
This patch (along with setting dynamic_ownership=0 in qemu.conf)
makes qemudDomainSave work on root-squashed NFS.
* src/qemu/qemu_driver.c: provide new qemudDomainSaveFileOpHook()
utility, use it in qemudDomainSave() if normal creation of the
file as root failed, and after checking the filesystem type for
the storage is NFS. In that case we also bypass the security
driver, as this would fail on NFS.
Laine Stump [Wed, 3 Mar 2010 15:38:42 +0000 (16:38 +0100)]
Fix domain restore for files on root-squash NFS
If qemudDomainRestore fails to open the domain save file, create a
pipe, then fork a process that does setuid(qemu_user) and opens the
file, then reads this file and stuffs it into the pipe. the parent
libvirtd process will use the other end of the pipe as its fd, then
reap the child process after it's done reading.
This makes domain restore work on a root-squash NFS share that is only
visible to the qemu user.
* src/qemu/qemu_driver.c: add new qemudOpenAsUID() helper function,
and use it in qemudDomainRestore() if reading as root directly failed.
Fix detection of errors in QEMU device_add command
The code assumed that 'device_add' returned an empty string upon
success. This is not true, it sometimes prints random debug info.
THus we need to check for an explicit fail string
* src/qemu/qemu_monitor_text.c: Fix error checking of the device_add
monitor command
When a VM save attempt failed, the VM would be left in a paused
state. It is neccessary to resume CPU execution upon failure
if it was running originally
* src/qemu/qemu_driver.c: Resume CPUs upon save failure
This supports cancellation of jobs for the QEMU driver against
the virDomainMigrate, virDomainSave and virDomainCoreDump APIs.
It is not yet supported for the virDomainRestore API, although
it is desirable.
* src/qemu/qemu_driver.c: Issue 'migrate_cancel' command if
virDomainAbortJob is issued during a migration operation
* tools/virsh.c: Add a domjobabort command
Wire up internal entry points for virDomainAbortJob API
This provides the internal glue for the driver API
* src/driver.h: Internal API contract
* src/libvirt.c, src/libvirt_public.syms: Connect public API
to driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub out entry points
Introduce public API for cancelling async domain jobs
The new virDomainAbortJob() method provides a way for a second
thread to abort an ongoing job run by another thread. This
extends to any API with which the virDomainGetJobInfo() API
is intended to work. Cancellation is not guarenteed, rather best
effort on part of the hypervisor and not required to be implmented.
Add QEMU driver support for job info on migration ops
Introduce support for virDomainGetJobInfo in the QEMU driver. This
allows for monitoring of any API that uses the 'info migrate' monitor
command. ie virDomainMigrate, virDomainSave and virDomainCoreDump
Unfortunately QEMU does not provide a way to monitor incoming migration
so we can't wire up virDomainRestore yet.
The virsh tool gets a new command 'domjobinfo' to query status
* src/qemu/qemu_driver.c: Record virDomainJobInfo and start time
in qemuDomainObjPrivatePtr objects. Add generic shared handler
for calling 'info migrate' with all migration based APIs.
* src/qemu/qemu_monitor_text.c: Fix parsing of 'info migration' reply
* tools/virsh.c: add new 'domjobinfo' command to query progress
* src/remote/remote_protocol.x: Define wire protocol format
for virDomainGetJobInfo API
* src/remote/remote_driver.c, daemon/remote.c: Implement client
and server marshalling code for virDomainGetJobInfo()
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h
daemon/remote_dispatch_ret.h, daemon/remote_dispatch_table.h,
src/remote/remote_protocol.c, src/remote/remote_protocol.h: Rebuild
files from src/remote/remote_protocol.x
Stub out internal driver entry points for job processing
The internal glue layer for the new pubic API
* src/driver.h: Define internal driver API contract
* src/libvirt.c, src/libvirt_public.syms: Wire up public
API to internal driver API
* src/esx/esx_driver.c, src/lxc/lxc_driver.c, src/opennebula/one_driver.c,
src/openvz/openvz_driver.c, src/phyp/phyp_driver.c,
src/qemu/qemu_driver.c, src/remote/remote_driver.c,
src/test/test_driver.c, src/uml/uml_driver.c, src/vbox/vbox_tmpl.c,
src/xen/xen_driver.c: Stub new entry point
Introduce public API for domain async job handling
Introduce a new public API that provides a way to get progress
info on currently running jobs on a virDomainpPtr. APIs that
are initially within scope of this idea are
These all take a potentially long time and benefit from monitoring.
The virDomainJobInfo struct allows for various pieces of information
to be reported
- Percentage completion
- Time
- Overall data
- Guest memory data
- Guest disk/file data
* include/libvirt/libvirt.h.in: Add virDomainGetJobInfo
* python/generator.py, python/libvirt-override-api.xml,
python/libvirt-override.c: Override for virDomainGetJobInfo API
* python/typewrappers.c, python/typewrappers.h: Introduce wrapper
for unsigned long long type
when the underlying qemu supports the drive/device model and the
controller has been added this way.
* src/qemu/qemu_driver.c: use qemuMonitorDelDevice() when detaching
PCI controller and if supported
* src/qemu/qemu_monitor.[ch]: add new qemuMonitorDelDevice() function
* src/qemu/qemu_monitor_json.[ch]: JSON backend for DelDevice command
* src/qemu/qemu_monitor_text.[ch]: Text backend for DelDevice command
Fix PCI address handling when controllers are deleted
* src/qemu/qemu_driver.c: in qemudDomainDetachPciControllerDevice()
when a controller is not present in the system anymore, the PCI
address must be deleted from libvirt's hashtable because it can
be re-used for other purposes.
Fix data structure handling when controllers are attached
* src/qemu/qemu_driver.c: in qemudDomainAttachDevice(), one must not
delete the data part when the operation succeeds because it is
required later on. The correct pattern to handlethe parsed
representation of the device information on success
is dev->data.controller = NULL; virDomainDeviceDefFree(dev);,
which leaves the structure pointed at by data in memory.
Jim Meyering [Sun, 28 Feb 2010 12:34:06 +0000 (13:34 +0100)]
x86Decode: avoid NULL-dereference upon questionable input
* src/cpu/cpu_x86.c (x86Decode): Don't dereference NULL when passed
a NULL "models" pointer, or when passed a nonzero "nmodels" value
and a corresponding NULL models[i].
Jim Meyering [Mon, 1 Mar 2010 20:26:59 +0000 (21:26 +0100)]
phypUUIDTable_Push: do not corrupt output stream upon partial write
* src/phyp/phyp_driver.c (phypUUIDTable_Push): Move incr/decr
of ptr/nread into the loop where those variables are used.
Also, remove "exit" label and just-preceding "goto".
Allow an arbitrary timezone with QEMU by setting the $TZ environment
variable when launching QEMU
* src/qemu/qemu_conf.c: Set TZ environment variable if a timezone
is requested
* tests/qemuxml2argvtest.c: Add test case for timezones
* tests/qemuxml2argvdata/qemuxml2argv-clock-france.xml,
tests/qemuxml2argvdata/qemuxml2argv-clock-france.args: Data
for timezone tests
This is useful if the admin has not configured any timezone on the
host OS, but still wants to synchronize a guest to a specific one.
* src/conf/domain_conf.h, src/conf/domain_conf.c: Support extra
'timezone' attribute on clock configuration
* docs/schemas/domain.rng: Add 'timezone' attribute
* src/xen/xend_internal.c, src/xen/xm_internal.c: Reject configs
with a configurable timezone
This allows QEMU guests to be started with an arbitrary clock
offset
The test case can't actually be enabled, since QEMU argv expects
an absolute timestring, and this will obviously change every
time the test runs :-( Hopefully QEMU will allow a relative
time offset in the future.
* src/qemu/qemu_conf.c, src/qemu/qemu_conf.h: Use the -rtc arg
if available to support variable clock offset mode
* tests/qemuhelptest.c: Add QEMUD_CMD_FLAG_RTC for qemu 0.12.1
* qemuxml2argvdata/qemuxml2argv-clock-variable.args,
qemuxml2argvdata/qemuxml2argv-clock-variable.xml,
qemuxml2argvtest.c: Test case, except we can't actually enable
it yet.
This introduces a third option for clock offset synchronization,
that allows an arbitrary / variable adjustment to be set. In
essence the XML contains the time delta in seconds, relative to
UTC.
<clock offset='variable' adjustment='123465'/>
The difference from 'utc' mode, is that management apps should
track adjustments and preserve them at next reboot.
* docs/schemas/domain.rng: Schema for new clock mode
* src/conf/domain_conf.c, src/conf/domain_conf.h: Parse
new clock time delta
* src/libvirt_private.syms, src/util/xml.c, src/util/xml.h: Add
virXPathLongLong() method
Change the internal domain conf representation of localtime/utc
The XML will soon be extended to allow more than just a simple
localtime/utc boolean flag. This change replaces the plain
'int localtime' with a separate struct to prepare for future
extension
* src/conf/domain_conf.c, src/conf/domain_conf.h: Add a new
virDomainClockDef structure
* src/libvirt_private.syms: Export virDomainClockOffsetTypeToString
and virDomainClockOffsetTypeFromString
* src/qemu/qemu_conf.c, src/vbox/vbox_tmpl.c, src/xen/xend_internal.c,
src/xen/xm_internal.c: Updated to use new structure for localtime
Jim Meyering [Mon, 1 Mar 2010 14:41:15 +0000 (15:41 +0100)]
cmdPoolDiscoverSources: initialize earlier to avoid FP from clang
* tools/virsh.c (cmdPoolDiscoverSources): Always initialize srcSpec.
Otherwise, clang would report that srcSpec could be used uninitialized
in the call to virConnectFindStoragePoolSources.
Jim Meyering [Fri, 26 Feb 2010 16:14:01 +0000 (17:14 +0100)]
build: update gnulib submodule to latest
* .gnulib: Update to latest.
Commit 89bdf84bcd9c6032e37 inadvertently rewound the .gnulib
submodule by 51 commits. This corrects it.
Spotted by Eric Blake.
Chris Lalancette [Wed, 24 Feb 2010 20:45:16 +0000 (15:45 -0500)]
Only build virDomainObjFormat if not building proxy.
While building under RHEL-5, I got a compile warning because
virDomainObjFormat was defined but not used. That came about
because in RHEL-5 we build with "#define PROXY", and
virDomainObjFormat is only used with !PROXY. Move the
define.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/openvz/openvz_conf.c (openvzGetVEID): Always call fclose.
Diagnose parse failure also when vzlist output is empty.
If somehow we read a -1, diagnose that (albeit as a parse failure).
Cole Robinson [Wed, 24 Feb 2010 19:19:28 +0000 (14:19 -0500)]
Use standard spacing for user/pass prompt
Kind of minor, but it annoys me that the default auth callback
doesn't put a space between the prompt and the input, like a typical
terminal, ssh, etc. This patch changes the current prompt:
Cole Robinson [Tue, 23 Feb 2010 23:17:56 +0000 (18:17 -0500)]
libvirtd: Better initscript error reporting
From time to time I bork my install, and hate it when the initscript
returns no info. This patch removes the sanity check, which lets
the shell give us 'command not found' or 'permission denied' errors.
Cole Robinson [Wed, 24 Feb 2010 16:29:35 +0000 (11:29 -0500)]
remote: Improve daemon startup error reporting
If I toggle enable_tcp in libvirtd.conf and add --listen in
/etc/init.d/libvirtd, I get the unhelpful error:
Starting libvirtd daemon: error: Unable to initialize network sockets.
Running without --daemon provides much more useful info:
sudo libvirtd --listen
11:29:26.117: error : remoteCheckCertFile:270 : Cannot access CA certificate '/etc/pki/CA/cacert.pem': No such file or directory
The daemon architecture makes it difficult to report this useful
info if daemonized, so point users to /var/log/messages and
dropping the --daemon flag if they want more info.
Cole Robinson [Wed, 24 Feb 2010 16:13:00 +0000 (11:13 -0500)]
virsh: Show errors reported by nonAPI functions
Only API calls trigger the error callback, which is required for
proper virsh error reporting. Since we use non API functions from
util/, make sure we properly report these errors.
Fixes lack of error message from 'virsh create idontexit.xml'
Jim Meyering [Thu, 25 Feb 2010 09:35:20 +0000 (10:35 +0100)]
build: avoid "make rpm" failure in docs/
Add missing rule to build html/libvirt-libvirt.html.
Use a GNU Make pattern rule to avoid running apibuild.py once
for each out-of-date target, in a parallel build.
* docs/Makefile.am
Jim Meyering [Wed, 24 Feb 2010 21:51:47 +0000 (22:51 +0100)]
build: teach apibuild.py to work in a non-srcdir build
* docs/Makefile.am (libvirt-api.xml libvirt-refs.xml): Generalize
apibuild.py to work in a non-srcdir build. Pass "srcdir" to it.
* docs/apibuild.py (rebuild): Honor the $srcdir envvar.
* docs/Makefile.am (MAINTAINERCLEANFILES): Use this variable
for generated-and-distributed files, not "CLEANFILES".
Besides, "make clean" and "make distclean" should not delete
distributed files.
Jim Meyering [Wed, 24 Feb 2010 09:53:44 +0000 (10:53 +0100)]
build: ensure that MKINSTALLDIRS is AC_SUBST-defined
since we're using gettext-0.14.1, which uses that now-obsolete
automake symbol. Otherwise, make distcheck would fails like this:
make[2]: Entering directory `/t/libvirt-0.7.6/_build/po'
/bin/sh @MKINSTALLDIRS@ /t/libvirt-0.7.6/_inst/share
/bin/sh: @MKINSTALLDIRS@: No such file or directory
make[2]: *** [install-data-yes] Error 127
* configure.ac (MKINSTALLDIRS): Define.
For reference, we're currently hamstrung by our desire
to support RHEL5, which still uses gettext-0.14:
http://bugzilla.redhat.com/523713
Eric Blake [Wed, 24 Feb 2010 18:38:44 +0000 (11:38 -0700)]
maint: relax git minimum version
Requiring git 1.6.4, just for the optional GNULIB_SRCDIR support,
was too harsh. Resynchronize from gnulib.
* .gnulib: Import from latest gnulib.
* bootstrap: Re-synchronize from .gnulib/build-aux.
* bootstrap.conf: Drop git to 1.5.5.
* README-hacking: Document use of GNULIB_SRCDIR.
Richard Jones [Wed, 24 Feb 2010 12:55:17 +0000 (12:55 +0000)]
Ignore SIGWINCH in remote client call to poll(2) (RHBZ#567931).
In bug 567931 we found that virt-top would exit occasionally
when the terminal window was resized. Tracking this down it
turned out that SIGWINCH was being delivered to the process at
exactly the point where the libvirt remote driver was calling
poll(2) waiting for a reply from libvirtd.
This caused the poll(2) call to be interrupted (returning errno
EINTR). However handling EINTR the same way as EAGAIN was not
the solution to this problem since we found previously that this
would break Ctrl-C handling (commit 47fec8eac2bb3).
The correct solution is to mask out SIGWINCH for the duration
of the poll(2) system call. The per-thread mask is changed and
restored immediately after the call. Since we are using
pthread_sigmask, this should not affect other threads, and
since we restore the signal mask immediately afterwards it should
not affect the current thread visibly either. Other possibly
problematic signals are SIGCHLD and SIGPIPE and these are
masked too.
Note use of ignore_value: It's not fatal if we cannot mask out
SIGWINCH, and in any case pthread_sigmask never fails on Linux
as long as you supply the correct arguments.
I tested this patch and it cures the original problem with
virt-top.
Dave Allan [Wed, 24 Feb 2010 08:51:34 +0000 (09:51 +0100)]
Format FS pools on creation
Create the filesystem on the partition used by the pool
* configure.ac: check for mkfs availability
* libvirt.spec.in: add extra require on util-linux for mkfs
* src/storage/storage_backend_fs.c: run mkfs with the expected
fs type when creating a filesystem pool
Eric Blake [Tue, 23 Feb 2010 00:01:33 +0000 (17:01 -0700)]
maint: import modern bootstrap
Copy the latest gnulib bootstrap, which runs autoreconf and
generates po/Makevars for us. Other improvements include some
improved prerequisite tool checking.
This also fixes a bug in the .pot files, regarding the copyright holder.
* bootstrap: Update to version in .gnulib/build-aux.
* bootstrap.conf (MSGID_BUGS_ADDRESS, COPYRIGHT_HOLDER, SKIP_PO)
(gnulib_mk, ACLOCAL, bootstrap_epilogue): Provide overrides.
* autogen.sh (autoreconf): Avoid redundant autoreconf if bootstrap
was run.
* po/Makevars: Delete, now that bootstrap creates it.
* po/.gitignore: Update.
Eric Blake [Tue, 23 Feb 2010 00:01:32 +0000 (17:01 -0700)]
maint: start factoring bootstrap
Borrow ideas from gnulib/build-aux/bootstrap, in order to factor the
specifics of libvirt into bootstrap.conf, while allowing future
upgrades of bootstrap to happen with less effort.
* bootstrap (gnulib_tool): Update invocation to be closer to
gnulib's version. Move libvirt specifics...
* bootstrap.conf: ...into new file.
Cole Robinson [Mon, 22 Feb 2010 20:49:27 +0000 (15:49 -0500)]
storage: conf: Correctly calculate exabyte unit
We were using 'Y' to mean exabyte, when the correct abbreviation would be
'E' ('Y' is yettabyte, which is exabyte * 1024 * 1024). While it isn't
strictly backwards compatible, I highly doubt anyone was actually using
this broken behavior, so I don't see any harm in in dropping 'Y' handling.
Jiri Denemark [Mon, 22 Feb 2010 11:24:02 +0000 (12:24 +0100)]
Create raw storage files with O_DSYNC (again)
Recently we introduced O_DSYNC flag when creating raw storage files to
avoid filling all disk cache with dirty pages. However, the patch got
lost when virStorageBackendCreateRaw was reworked using
virFileOperation. Let's use O_DSYNC again.