conf: Utilize more of VIR_(APPEND|INSERT|DELETE)_ELEMENT
This fixes a possible double free. In virNetworkAssignDef() if
virBitmapNew() fails, then virNetworkObjFree(network) is called.
However, with network->def pointing to actual @def. So if caller
frees @def again, ...
Moreover, this fixes one possible memory leak too. In
virInterfaceAssignDef() if appending to the list of interfaces
fails, we ought to call virInterfaceObjFree() instead of bare
VIR_FREE().
Although, in order to do that some array size variables needs
to be turned into size_t rather than int.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The nwfilter conf update mutex previously serialized
updates to the internal data structures for firewall
rules, and updates to the firewall itself. The latter
was recently turned into a read/write lock, and filter
instantiation allowed to proceed in parallel. It was
believed that this was ok, since each filter is created
on a separate iptables/ebtables chain.
It turns out that there is a subtle lock ordering problem
on virNWFilterObjPtr instances. __virNWFilterInstantiateFilter
will hold a lock on the virNWFilterObjPtr it is instantiating.
This in turn invokes virNWFilterInstantiate which then invokes
virNWFilterDetermineMissingVarsRec which then invokes
virNWFilterObjFindByName. This iterates over every single
virNWFilterObjPtr in the list, locking them and checking their
name. So if 2 or more threads try to instantiate a filter in
parallel, they'll all hold 1 lock at the top level in the
__virNWFilterInstantiateFilter method which will cause the
other thread to deadlock in virNWFilterObjFindByName.
The fix is to add an exclusive mutex to serialize the
execution of __virNWFilterInstantiateFilter.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
vshRunConsole() uses virCondWait() which is a wrapper around
pthread_cond_wait(). On FreeBSD, pthread_cond_wait needs mutex to be
locked, otherwise it immediately fails with EPERM. On Linux, the
behaviour in this case is undefined.
John Ferlan [Fri, 7 Mar 2014 14:46:21 +0000 (09:46 -0500)]
virscsi: Introduce virSCSIDeviceUsedByInfoFree
This resolves a Coverity RESOURCE_LEAK issue introduced by commit
id 'de6fa535' where the virSCSIDeviceSetUsedBy() didn't VIR_FREE
the 'copy' or possibly VIR_STRDUP()'d values. It also ensures that
the VIR_APPEND_ELEMENT is successful...
Michael Chapman [Thu, 6 Mar 2014 06:02:47 +0000 (17:02 +1100)]
tests: SELinux tests do not need to be skipped
With the previous commit's securityselinuxhelper enhancements, the
SELinux security manager can be tested even without SELinux enabled on
the test system.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
The selabel_* functions back onto the real implementations if SELinux is
enabled on the test system, otherwise we just implement a fake selabel
handle which errors out on all labelling lookups.
With these changes in place, securityselinuxtest and
securityselinuxlabeltest don't need to skip all tests if SELinux isn't
available; they can exercise much of the security manager code.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Jiri Denemark [Thu, 6 Mar 2014 10:52:25 +0000 (11:52 +0100)]
spec: Let translations be properly updated
Libvirt tarball contains po/stamp-po file which prevents any po/*.gmo
file to be regenerated even if a corresponding po/*.po file is newer. By
removing the stamp-po file, all *.gmo files are properly updated if
required. This allows downstreams to provide patches that update
translations.
When domain is started with setting that cannot be done, i.e. those
that require cgroups, there is no error reported and it succeeds
without any message whatsoever.
When setting with API, virsh, an error is reported, but only due to
the fact that no cgroups are mounted (priv->cgroup == NULL).
Given the above it seems reasonable to reject such unsupported
settings.
This patch effectively changes the error message from:
$ virsh -c qemu:///session schedinfo dummy
Scheduler : Unknown
error: Requested operation is not valid: cgroup CPU controller is not mounted
to:
$ virsh -c qemu:///session schedinfo dummy
Scheduler : Unknown
error: Operation not supported: CPU tuning is not available in session mode
Eric Blake [Wed, 5 Mar 2014 18:55:27 +0000 (11:55 -0700)]
virt-login-shell: silence coverity warning
Coverity spotted that 'nfdlist' (ssize_t) could be -1, but that we
were using 'i' (size_t) to iterate over the list at cleanup, with
crashing results because it promotes to a really big unsigned number.
* tools/virt-login-shell.c (main): Avoid treating -1 as unsigned.
configure check for character devices lock path calls
AC_DEFINE_UNQUOTED for VIR_CHRDEV_LOCK_FILE_PATH even if
$with_chrdev_lock_files = "no".
So the locking code in conf/virchrdev.c:
#ifdef VIR_CHRDEV_LOCK_FILE_PATH
is compiled in even if it shouldn't, because VIR_CHRDEV_LOCK_FILE_PATH
is defined as "no", so it tries to create lock files with strange
lock path like 'no/LCK..'.
Fix that by calling AC_DEFINE_UNQUOTED only if $with_chrdev_lock_files
is not 'no'.
Peter Krempa [Mon, 3 Mar 2014 15:39:40 +0000 (16:39 +0100)]
doc: storage: Explicitly state that it's possible to have non-unique key
With most of our storage backends it's possible to have two separate
volume keys to point to a single volume. (By creating sym/hard-links to
local files or by mounting remote filesystems to two different locations
and creating pools on top of them) Document this possibility.
Peter Krempa [Mon, 24 Feb 2014 15:07:40 +0000 (16:07 +0100)]
storage: Don't lie about path used to look up in error message
In storageVolLookupByPath the provided path is "sanitized" at first.
This removes some extra slashes and stuff. When the lookup of the volume
fails the original path is used which makes it hard to trace errors in
some cases.
Improve the error message to print the sanitized path along with the
user provided path if they are not equal.
Peter Krempa [Tue, 25 Feb 2014 14:51:15 +0000 (15:51 +0100)]
storage: Avoid mangling paths of non-local filesystems when looking up
When looking up a volume by path on a non-local filesystem don't use the
"cleaned" path that might be mangled in such a way that it will differ
from a path provided by a storage backend.
Skip the cleanup step for gluster, sheepdog and RBD.
Peter Krempa [Mon, 3 Mar 2014 15:11:28 +0000 (16:11 +0100)]
storage: Error out when attempting to vol-upload into a remote pool
Pools that are not backed by files in the filesystem cause problems with
some APIs. Error out when attempting to upload a volume in such a pool
as currently we expect a local file representation for it.
Peter Krempa [Mon, 3 Mar 2014 14:21:37 +0000 (15:21 +0100)]
virsh: volume: Fix lookup of volumes to provide better error messages
If a user specifies the pool explicitly, we should make sure to point
out that it's inactive instead of falling back to lookup by key/path and
failing at the end. Also if the pool isn't found there's no use in
continuing the lookup.
This changes the error in case the user-selected pool is inactive from:
$ virsh vol-upload --pool inactivepool --vol somevolname volcontents
error: failed to get vol 'somevolname'
error: Storage volume not found: no storage vol with matching path
somevolname
To a more descriptive:
$ virsh vol-upload --pool inactivepool --vol somevolname volcontents
error: pool 'inactivepool' is not active
And in case a user specifies an invalid pool from:
$ virsh vol-upload --pool invalidpool --vol somevolname volcontents
error: failed to get pool 'invalidpool'
error: failed to get vol 'somevolname', specifying --pool might help
error: Storage volume not found: no storage vol with matching path somevolname
To something less confusing:
$ virsh vol-upload --pool invalidpool --vol somevolname volcontents
error: failed to get pool 'invalidpool'
error: Storage pool not found: no storage pool with matching name 'invalidpool'
use_apparmor() was first designed to be called from withing libvirtd,
but libvirt_lxc also uses it. in libvirt_lxc, there is no need to check
whether to use apparmor or not: just use it if possible.
Reverting of external snapshots is not supported currently. The check
that is present doesn't properly check for all aspects that make a
snapshot external. Use virDomainSnapshotIsExternal() to do the check.
Ján Tomko [Thu, 27 Feb 2014 20:21:57 +0000 (21:21 +0100)]
Check if systemd is running before creating machines
If systemd is installed, but is not the init system,
systemd-machined fails with an unhelpful error message:
Launch helper exited with unknown return code 1
Currently we only check if the "machine1" service is
available (in ListActivatableNames).
Also check if "systemd1" service is registered with DBus
(ListNames).
This fixes https://bugs.gentoo.org/show_bug.cgi?id=493246#c22
Ján Tomko [Tue, 4 Mar 2014 07:39:56 +0000 (08:39 +0100)]
build: Include sys/wait.h in commandtest.c
Commit 631923e used a few macros from sys/wait.h without including
it. On Linux, they were also defined in stdlib.h, but on FreeBSD
the build failed:
../../tests/commandtest.c: In function 'test1':
warning: implicit declaration of function 'WIFEXITED'
warning: nested extern declaration of 'WIFEXITED' [-Wnested-externs]
Eric Blake [Mon, 23 Dec 2013 16:32:45 +0000 (09:32 -0700)]
virsh: report exit status of failed lxc-enter-namespace
'virsh lxc-enter-namespace' does not have a way to reflect exit
status to the caller in single-command mode, but we might as well
at least report the exit status. Prior to this patch,
* tools/virsh-domain.c (cmdLxcEnterNamespace): Avoid magic numbers.
Dispatch any error.
* tools/virsh.pod: Document that non-zero exit status is collapsed.
Eric Blake [Tue, 24 Dec 2013 04:07:01 +0000 (21:07 -0700)]
virt-login-shell: saner exit value
virt-login-shell was exiting with status 0, regardless of what the
wrapped shell returned. This is unkind to users; we should behave
more like env(1), nice(1), su(1), and other wrapper programs, by
preserving the invoked application's status (which includes the
distinction between death due to signal vs. normal death).
Eric Blake [Mon, 23 Dec 2013 16:15:16 +0000 (09:15 -0700)]
virt-login-shell: use single instead of double fork
Note that 'virsh lxc-enter-namespace' must double-fork, for two
reasons: some namespaces can only be done from a single thread,
while virsh is multithreaded; and because virsh can be run in
batch mode where we must not corrupt the namespace of that
execution upon return from the subsidiary command.
When virt-login-shell was first written, it blindly copied from
'virsh lxc-enter-namespace', including the double-fork. But
neither of the reasons for double forking apply to
virt-login-shell (we are single-threaded, and we have nothing to
do after the child completes that would require us to preserve a
namespace), so we can simplify life by using a single fork.
In turn, this will make it easier for a future patch to pass the
child's exit status on to the invoking shell.
In flattening to a single fork, note that closing the fds must
be done after fork, because the parent process still needs to
use fds to control the virConnectPtr; meanwhile, chdir can be
done prior to forking (in fact, it's easier to report errors
on anything attempted before forking).
* tools/virt-login-shell.c (main): Single rather than double fork.
(virLoginShellFini): Delete, by inlining actions instead.
Eric Blake [Sun, 22 Dec 2013 00:54:33 +0000 (17:54 -0700)]
virFork: simplify semantics
The old semantics of virFork() violates the priciple of good
usability: it requires the caller to check the pid argument
after use, *even when virFork returned -1*, in order to properly
abort a child process that failed setup done immediately after
fork() - that is, the caller must call _exit() in the child.
While uses in virfile.c did this correctly, uses in 'virsh
lxc-enter-namespace' and 'virt-login-shell' would happily return
from the calling function in both the child and the parent,
leading to very confusing results. [Thankfully, I found the
problem by inspection, and can't actually trigger the double
return on error without an LD_PRELOAD library.]
It is much better if the semantics of virFork are impossible
to abuse. Looking at virFork(), the parent could only ever
return -1 with a non-negative pid if it misused pthread_sigmask,
but this never happens. Up until this patch series, the child
could return -1 with non-negative pid if it fails to set up
signals correctly, but we recently fixed that to make the child
call _exit() at that point instead of forcing the caller to do
it. Thus, the return value and contents of the pid argument are
now redundant (a -1 return now happens only for failure to fork,
a child 0 return only happens for a successful 0 pid, and a
parent 0 return only happens for a successful non-zero pid),
so we might as well return the pid directly rather than an
integer of whether it succeeded or failed; this is also good
from the interface design perspective as users are already
familiar with fork() semantics.
One last change in this patch: before returning the pid directly,
I found cases where using virProcessWait unconditionally on a
cleanup path of a virFork's -1 pid return would be nicer if there
were a way to avoid it overwriting an earlier message. While
such paths are a bit harder to come by with my change to a direct
pid return, I decided to keep the virProcessWait change in this
patch.
* src/util/vircommand.h (virFork): Change signature.
* src/util/vircommand.c (virFork): Guarantee that child will only
return on success, to simplify callers. Return pid rather than
status, now that the situations are always the same.
(virExec): Adjust caller, also avoid open-coding process death.
* src/util/virprocess.c (virProcessWait): Tweak semantics when pid
is -1.
(virProcessRunInMountNamespace): Adjust caller.
* src/util/virfile.c (virFileAccessibleAs, virFileOpenForked)
(virDirCreate): Likewise.
* tools/virt-login-shell.c (main): Likewise.
* tools/virsh-domain.c (cmdLxcEnterNamespace): Likewise.
* tests/commandtest.c (test23): Likewise.
Eric Blake [Thu, 20 Feb 2014 00:32:19 +0000 (17:32 -0700)]
util: make it easier to grab only regular command exit
Auditing all callers of virCommandRun and virCommandWait that
passed a non-NULL pointer for exit status turned up some
interesting observations. Many callers were merely passing
a pointer to avoid the overall command dying, but without
caring what the exit status was - but these callers would
be better off treating a child death by signal as an abnormal
exit. Other callers were actually acting on the status, but
not all of them remembered to filter by WIFEXITED and convert
with WEXITSTATUS; depending on the platform, this can result
in a status being reported as 256 times too big. And among
those that correctly parse the output, it gets rather verbose.
Finally, there were the callers that explicitly checked that
the status was 0, and gave their own message, but with fewer
details than what virCommand gives for free.
So the best idea is to move the complexity out of callers and
into virCommand - by default, we return the actual exit status
already cleaned through WEXITSTATUS and treat signals as a
failed command; but the few callers that care can ask for raw
status and act on it themselves.
Eric Blake [Thu, 20 Feb 2014 03:23:44 +0000 (20:23 -0700)]
util: make it easier to grab only regular process exit
Right now, a caller waiting for a child process either requires
the child to have status 0, or must use WIFEXITED() and friends
itself. But in many cases, we want the middle ground of treating
fatal signals as an error, and directly accessing the normal exit
value without having to use WEXITSTATUS(), in order to easily
detect an expected non-zero exit status. This adds the middle
ground to the low-level virProcessWait; the next patch will add
it to virCommand.
Eric Blake [Wed, 19 Feb 2014 19:39:44 +0000 (12:39 -0700)]
util: preserve exit status from mount namespace callback
The documentation of namespace callbacks was inconsistent on whether
it preserved positive return values. Now that we have a dedicated
EXIT_CANCELED to flag all errors before getting to the callback,
it is possible to use positive return values (not that any of the
current callers do, but it is better to match the docs).
Also, while vircommand.c is careful to close fds that a child should
not have, it's still better to be in the practice of setting
FD_CLOEXEC up front.
* src/util/virprocess.c (virProcessRunInMountNamespace): Tweak
return value to pass back non-zero status. Avoid leaking pipe fds
to other threads.
* src/util/virprocess.h: Fix comment.
Eric Blake [Thu, 20 Feb 2014 03:04:40 +0000 (20:04 -0700)]
util: make it easier to reflect child exit status
Thanks to namespaces, we have a couple of places in the code
base that want to reflect a child exit status, including the
ability to detect death by a signal, back to a grandparent.
Best to make it a reusable function.
* src/util/virprocess.h (virProcessExitWithStatus): New prototype.
* src/libvirt_private.syms (util/virprocess.h): Export it.
* src/util/virprocess.c (virProcessExitWithStatus): New function.
* tests/commandtest.c (test23): Test it.
Eric Blake [Wed, 19 Feb 2014 01:06:50 +0000 (18:06 -0700)]
virFork: give specific status on failure prior to exec
When a child fails without exec'ing, we want a well-known status;
best is to match what env(1), nice(1), su(1), and other wrapper
programs do. This patch adds enum values that later patches will
use, and sets up virFork as the first client of EXIT_CANCELED
for errors detected prior to even attempting exec, as well as
virExec to distinguish between a missing executable vs. a binary
that cannot be executed.
This is a slight semantic change in the unlikely case of a child
process failing to restore its signal mask - we now kill the
child with a known status instead of relying on the caller to
notice and do an appropriate _exit(). A subsequent patch will
make further cleanups based on an audit of all callers.
* src/internal.h (EXIT_CANCELED, EXIT_CANNOT_INVOKE)
(EXIT_ENOENT): New enum.
* src/util/vircommand.c (virFork): Document specific exit value if
child aborts early.
(virExec): Distinguish between various exec failures.
* tests/commandtest.c (test1): Enhance test.
(test22): New test.
Eric Blake [Wed, 19 Feb 2014 23:45:54 +0000 (16:45 -0700)]
nwfilter: make ignoring non-zero status easier to follow
While auditing all callers of virCommandRun, I noticed that nwfilter
code never paid attention to commands with a non-zero status; they
were merely passing a pointer to avoid spamming the logs with a
message about commands that might indeed fail. But proving this
required chasing through a lot of code; refactoring things to
localize the decision of whether to ignore non-zero status makes
it easier to prove that later changes to virFork don't negatively
affect this code.
While at it, I also noticed that ebiptablesRemoveRules would
actually report success if the child process failed for a
reason other than non-zero status, such as OOM.
* src/nwfilter/nwfilter_ebiptables_driver.c (ebiptablesExecCLI):
Change parameter from pointer to bool.
(ebtablesApplyBasicRules, ebtablesApplyDHCPOnlyRules)
(ebtablesApplyDropAllRules, ebtablesCleanAll)
(ebiptablesApplyNewRules, ebiptablesTearNewRules)
(ebiptablesTearOldRules, ebiptablesAllTeardown)
(ebiptablesDriverInitWithFirewallD)
(ebiptablesDriverTestCLITools, ebiptablesDriverProbeStateMatch):
Adjust all clients.
(ebiptablesRemoveRules): Likewise, and fix return value on failure.
Oleg Strikov [Mon, 3 Mar 2014 13:41:03 +0000 (17:41 +0400)]
qemu: Implement a stub cpuArchDriver.baseline() handler for arm
Openstack Nova calls virConnectBaselineCPU() during initialization
of the instance to get a full list of CPU features.
This patch adds a stub to arm-specific code to handle
this request (no actual work is done).
Generate a unique journald log for QEMU capabilities failure
When probing QEMU capabilities fails for a binary generate a
log message with MESSAGE_ID==8ae2f3fb-2dbe-498e-8fbd-012d40afa361.
This can be directly queried from journald based on the UUID
instead of needing string grep. This lets tools like libguestfs'
bug reporting tool trivially do automated sanity tests on the
host they're running on.
$ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361
Feb 21 17:11:01 localhost.localdomain lt-libvirtd[9196]:
Failed to probe capabilities for /bin/qemu-system-alpha:
internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=
/home/berrange/src/virt/libvirt/src/.libs PATH=/usr/lib64/
ccache:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:
/usr/bin:/root/bin HOME=/root USER=root LOGNAME=root
/bin/qemu-system-alpha -help) unexpected exit status 127:
/bin/qemu-system-alpha: error while loading shared libraries:
libglapi.so.0: cannot open shared object file: No such file
or directory
$ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361 --output=json
{ ...snip...
"LIBVIRT_SOURCE" : "file",
"PRIORITY" : "3",
"CODE_FILE" : "qemu/qemu_capabilities.c",
"CODE_LINE" : "2770",
"CODE_FUNC" : "virQEMUCapsLogProbeFailure",
"MESSAGE_ID" : "8ae2f3fb-2dbe-498e-8fbd-012d40afa361",
"LIBVIRT_QEMU_BINARY" : "/bin/qemu-system-xtensa",
"MESSAGE" : "Failed to probe capabilities for /bin/qemu-system-xtensa:
internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=/home/berrange
/src/virt/libvirt/src/.libs PATH=/usr/lib64/ccache:/usr/local/sbin:
/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin HOME=/root
USER=root LOGNAME=root /bin/qemu-system-xtensa -help) unexpected
exit status 127: /bin/qemu-system-xtensa: error while loading shared
libraries: libglapi.so.0: cannot open shared object file: No such
file or directory\n" }
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Eric Blake [Mon, 24 Feb 2014 23:57:06 +0000 (16:57 -0700)]
virsh: add --all flag to 'event' command
Similar to our event-test demo program, it's nice to be able to
have a mode where we can sniff all events at once, rather than
having to spawn multiple virsh in parallel with one for each
event type.
(Can I just say our RegisterAny design is lousy? The fact that
the majority of our callback pointers have a function signature
with the opaque data in a different position, and that we have
to cast the function signature before registering it, makes it
hard to write a generic callback function; we have to write one
for every type of event id. Life would have been easier if we
had designed the callback as a fixed signature with a void*
and size parameter, and then allowed the caller to downcast
the void* to a particular struct for data specific to their
callback id, where we could have then had a single function
with a switch statement for each event id, and register that
one function for all types of events. It would also be nicer
if the callback functions knew which callbackID was being used
when invoking that callback, so that I could use a common data
structure among all registrations instead of having to create
an array of one data per callback. But I really don't want to
go add yet another event API design.)
* tools/virsh-domain.c (cmdEvent): Add --all parameter; convert
all callbacks to support shared counter.
* tools/virsh.pod (event): Document it.
Eric Blake [Mon, 24 Feb 2014 21:46:29 +0000 (14:46 -0700)]
virsh: support remaining domain events
Earlier, I added 'virsh event' for lifecycle events, to get the
concept approved; this patch finishes the support for all other
events, although the user still has to register for one event
type at a time. A future patch may add an --all parameter to
make it possible to register for all events through a single
call.
* tools/virsh-domain.c (vshDomainEventWatchdogToString)
(vshDomainEventIOErrorToString, vshGraphicsPhaseToString)
(vshGraphicsAddressToString, vshDomainBlockJobStatusToString)
(vshDomainEventDiskChangeToString)
(vshDomainEventTrayChangeToString, vshEventGenericPrint)
(vshEventRTCChangePrint, vshEventWatchdogPrint)
(vshEventIOErrorPrint, vshEventGraphicsPrint)
(vshEventIOErrorReasonPrint, vshEventBlockJobPrint)
(vshEventDiskChangePrint, vshEventTrayChangePrint)
(vshEventPMChangePrint, vshEventBalloonChangePrint)
(vshEventDeviceRemovedPrint): New helper routines.
(cmdEvent): Support full array of event callbacks.
The systemd journal expects log record PRIORITY values to
be encoded using the syslog compatible numbering scheme,
not libvirt's own native numbering scheme. We must therefore
apply a conversion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The systemd journal accepts arbitrary user specified log
fields. These can be passed into virLogMessage via the
virLogMetadata structure. Allow up to 5 custom fields to
be reported by libvirt callers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Oleg Strikov [Wed, 26 Feb 2014 16:40:12 +0000 (20:40 +0400)]
qemu: Enable 'host-passthrough' cpu mode for arm
This patch allows libvirt user to specify 'host-passthrough'
cpu mode while using qemu/kvm backend on arm (arm32).
It uses 'host' as a CPU model name instead of some other stub
(correct CPU detection is not implemented yet) to allow libvirt
user to specify 'host-model' cpu mode as well.
Michal Privoznik [Fri, 28 Feb 2014 08:50:01 +0000 (09:50 +0100)]
virDomainBlockStats(Flags): Produce saner error message on empty disk path
As of 0bd2ccdec an empty disk path for virDomainBlockStats (or the one
with Flags) is allowed meaning "get me overall summarized statistics".
However, running 'virsh domblkstat $dom' throws a misleading error:
# ./tools/virsh domblkstat dom
error: Failed to get block stats dom
error: invalid argument: invalid path:
while after this commit
# virsh domblkstat dom
error: Operation not supported: summary statistics are not supported yet
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Fri, 28 Feb 2014 00:48:23 +0000 (17:48 -0700)]
tests: avoid littering /tmp
Running 'make -C tests check TESTS=qemuagenttest' left a directory
/tmp/libvirt_XXXXXX/ behind. The culprit was failure to cleanup
when short-circuiting an expensive test.
* tests/qemuagenttest.c (testQemuAgentTimeout): Free resources
when skipping expensive test.
Jiri Denemark [Thu, 27 Feb 2014 08:32:41 +0000 (09:32 +0100)]
sanlock: Truncate domain names longer than SANLK_NAME_LEN
Libvirt uses a domain name to fill in owner_name in sanlock_options in
virLockManagerSanlockAcquire. Unfortunately, owner_name is limited to
SANLK_NAME_LEN characters (including trailing '\0'), which means domains
with longer names fail to start when sanlock is enabled. However, we can
truncate the name when setting owner_name as explained by sanlock's
author:
Setting sanlk_options or the owner_name is unnecessary, and has very
little to no benefit. If you do provide something in owner_name, it can
be anything, sanlock doesn't care or use it.
If you run the command "sanlock status", the output will display a list
of clients connected to the sanlock daemon. This client list is
displayed as "pid owner_name" if the client has provided an owner_name
via sanlk_options. This debugging output is the only usage of
owner_name, so its only benefit is to potentially provide a more human
friendly output for debugging purposes.
Eric Blake [Wed, 26 Feb 2014 20:30:46 +0000 (13:30 -0700)]
build: skip virportallocatortest on cygwin
Cygwin supports <dlfcn.h> and even has limited LD_PRELOAD
capabilities; but because it does not use ELF binaries it
cannot support RTLD_NEXT lookups.
CC libvirportallocatormock_la-virportallocatortest.lo
virportallocatortest.c: In function 'init_syms':
virportallocatortest.c:47:24: error: 'RTLD_NEXT' undeclared (first use in this function)
realsocket = dlsym(RTLD_NEXT, "socket");
* tests/virportallocatortest.c: Also require RTLD_NEXT.
Eric Blake [Wed, 26 Feb 2014 19:56:06 +0000 (12:56 -0700)]
build: ignore cygwin toolchain droppings
The cygwin compiler automatically creates a '*.exe.manifest'
companion file for any .exe file that contains a substring
that would otherwise cause newer Windows to pester users about
needing admin rights (such as "update", "instal", "setup"...).
This means that compilation on cygwin left behind
tests/networkxml2xmlupdatetest.exe.manifest.
Peter Krempa [Wed, 26 Feb 2014 13:17:51 +0000 (14:17 +0100)]
spec: Fix braces around macros
In commit 72f7658ba24491672e6b81118f892400916e9404 I've added a few
macros with bad bracing. Although they work as expected fix them so that
we use uniform syntax.
Oops - the build is non-deterministic, where the final binary
depends on my build environment. The fix is to require
systemd-devel in all situations where the code base uses it.
Now ~/.rpmmacros can contain "%define _without_systemd_daemon 1"
to explicitly disable use of the library, but the library is now
a strict build requirement for normal builds; if systemd-devel
is not installed, the user now gets an up-front warning:
$ rpmbuild -ta libvirt-1.2.2.tar.gz
error: Failed build dependencies:
systemd-devel is needed by libvirt-1.2.2-1.fc20.x86_64
* libvirt.spec.in (with_systemd_daemon): New variable.
(BuildRequires): Require systemd-devel for more than just udev.
(%configure): Make choice of systemd_daemon explicit.
Eric Blake [Tue, 25 Feb 2014 19:28:53 +0000 (12:28 -0700)]
spec: require device-mapper-devel for storage-disk
On Fedora 20, with the following in my ~/.rpmmacros:
%_without_udev 1
%_without_storage_mpath 1
and with device-mapper-devel uninstalled, 'make rpm' fails with:
checking for libdevmapper.h... no
configure: error: You must install device-mapper-devel/libdevmapper >= 1.0.0 to compile libvirt
error: Bad exit status from /var/tmp/rpm-tmp.Wo9pOG (%build)
This is a rather late point to be issuing an error; better is
to flag missing packages up front. The fix is to match the logic
in configure.ac on when devmapper is required (for both mpath and
storage). While at it, rbd storage is not dependent on mpath.
With this patch applied, I now get:
$ rpmbuild -ta libvirt-1.2.2.tar.gz
error: Failed build dependencies:
device-mapper-devel is needed by libvirt-1.2.2-1.fc20.x86_64
until either installing the package or further modifying
~/.rpmmacros to add "%_without_storage_disk 1".
* libvirt.spec.in (BuildRequires): Fix build when mpath is
disabled.
Eric Blake [Tue, 25 Feb 2014 16:51:40 +0000 (09:51 -0700)]
spec: explicitly avoid bhyve on Linux
Generally, we try to make the spec file tweakable via user
variables, so that they can select a different subset of sub-rpms
to build. We also try to explicitly list all driver config
options, rather than leaving the chance that the rpm build may be
non-deterministic based on what the user had installed locally.
But in the case of the recent bhyve hypervisor driver, there is
no port of bhyve to Linux, so it is easier to just blindly
disable it for now. If someone ever does try to port bhyve to
Fedora, we can make the spec file conditional at that point.
* libvirt.spec.in (%configure): Don't try to build bhyve.
Eric Blake [Tue, 25 Feb 2014 16:42:29 +0000 (09:42 -0700)]
build: use --with-systemd-daemon as configure option
Commit 68954fb added a configure option --with-systemd_daemon,
which violates the conventions of configure files preferring
dash in all option names. This fixes it, before we hit a
release where the tarball is baked with an awkward name.
* m4/virt-lib.m4 (LIBVIRT_CHECK_LIB, LIBVIRT_CHECK_LIB_ALT)
(LIBVIRT_CHECK_PKG): Favor - over _ in configure option names.
Laine Stump [Tue, 25 Feb 2014 14:47:22 +0000 (16:47 +0200)]
network: unplug bandwidth and call networkRunHook only when appropriate
According to commit b4e0299d if networkAllocateActualDevice() was
successful, it will *always* allocate an iface->data.network.actual,
so we can use this during networkReleaseActualDevice() to know if
there is really anything to undo. We were properly using this
information to only decrement the network connections counter if it
had previously been incremented, but we were unconditionally
unplugging bandwidth and calling the "unplugged" network hook for
*all* interfaces (during qemuProcessStop()) whether they had been
previously plugged or not. This caused problems if a domain failed to
start at some time prior to all interfaces being allocated. (I
encountered this when an interface had a bandwidth floor set but no
inbound QoS).
This patch changes both the call to networkUnplugBandwidth() and the
call to networkRunHook() to only be called if there was a previous
call to "plug" for the same interface.
Laine Stump [Tue, 25 Feb 2014 14:35:59 +0000 (16:35 +0200)]
network: don't even call networkRunHook if there is no network
networkAllocateActualDevice() is called for *all* interfaces, not just
those with type='network'. In that case, it will jump down to its
validate: label immediately, without allocating anything. After
validation is done, two counters are potentially updated (one for the
network, and one for any particular physical device that is chosen),
and then networkRunHook() is called.
This patch refactors that code a slight bit so that networkRunHook()
doesn't get called if netdef is NULL (i.e. type != network) and to
place the conditional increment of dev->connections inside the "if
(netdef)" as well - dev can never be non-null if netdef is null
(because "dev" is the pointer to a device in a network's pool of
devices), so this doesn't have any functional effect, it just makes
the code clearer.
While running virscsitest, it was found that valgrind pointed out the following
memory leak:
==320== 5 bytes in 1 blocks are definitely lost in loss record 4 of 37
==320== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==320== by 0x3E6CE81171: strdup (strdup.c:43)
==320== by 0x4CB28DF: virStrdup (virstring.c:554)
==320== by 0x4CAC987: virSCSIDeviceSetUsedBy (virscsi.c:289)
==320== by 0x402321: test2 (virscsitest.c:100)
==320== by 0x403231: virtTestRun (testutils.c:199)
==320== by 0x402121: mymain (virscsitest.c:180)
==320== by 0x4039AD: virtTestMain (testutils.c:782)
==320== by 0x3E6CE1ED1C: (below main) (libc-start.c:226)
==320==
Introduced by commit fd243fc. Signed-off-by: Ján Tomko <jtomko@redhat.com>
When starting these domain in parallel, all workers may meet in
virNetDevVethCreate() where a race starts. Race over allocating veth
pairs because allocation requires two steps:
1) find first nonexistent '/sys/class/net/vnet%d/'
2) run 'ip link add ...' command
Now consider two threads. Both of them find N as the first unused veth
index but only one of them succeeds allocating it. The other one fails.
For such cases, we are running the allocation in a loop with 10 rounds.
However this is very flaky synchronization. It should be rather used
when libvirt is competing with other process than when libvirt threads
fight each other. Therefore, internally we should use mutex to serialize
callers, and do the allocation in loop (just in case we are competing
with a different process). By the way we have something similar already
since 1cf97c87.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Wed, 26 Feb 2014 00:24:58 +0000 (17:24 -0700)]
build: avoid ld_preload tests on mingw
Running ./autobuild.sh complained during the mingw cross-compile:
CC libvirportallocatormock_la-virportallocatortest.lo
../../tests/virportallocatortest.c:32:20: fatal error: dlfcn.h: No such file or directory
# include <dlfcn.h>
^
compilation terminated. With that fixed, the next failure was:
CCLD qemuxml2argvmock.la
libtool: link: libtool library `qemuxml2argvmock.la' must begin with `lib'
libtool: link: Try `libtool --help --mode=link' for more information.
While we don't need to limit all LD_PRELOAD tests to just Linux, we
do need to limit them to platforms that actually support loading;
we also need to avoid building qemu tests when qemu is not enabled.
* tests/virportallocatortest.c: Make conditional on <dlfcn.h>.
* tests/Makefile.am (test_libraries): Only build qemu mock library
when building qemu tests.
Jim Fehlig [Wed, 19 Feb 2014 22:39:19 +0000 (15:39 -0700)]
libxl: queue domain event earlier in shutdown handler
The shutdown handler may restart a domain when handling a reboot
event or when <on_*> is set to 'restart'. Restarting consists of
calling libxlVmCleanup followed by libxlVmStart. libxlVmStart will
emit a VIR_DOMAIN_EVENT_STARTED event, but the SHUTDOWN event is
not emitted until exiting the shutdown handler, after the STARTED
event.
This patch changes the logic a bit to queue the event at the start
of the shutdown action, ensuring it is queued before any subsequent
events that may be generated while executing the shutdown action.
Laine Stump [Fri, 21 Feb 2014 12:12:00 +0000 (14:12 +0200)]
network: include plugged interface XML in "plugged" network hook
The network hook script gets called whenever an interface is plugged
into or unplugged from a network, but even though the full XML of both
the network and the domain is included, there is no reasonable way to
determine what exact resources the plugged interface is using:
1) Prior to a recent patch which modified the status XML of interfaces
to include the information about actual hardware resources used, it
would be possible to scan through the domain XML output sent to the
hook, and from there find the correct interface, but that interface
definition would not include any runtime info (e.g. bandwidth or vlan
taken from a portgroup, or which physdev was used in case of a macvtap
network).
2) After the patch modifying the status XML of interfaces, the network
name would no longer be included in the domain XML, so it would be
completely impossible to determine which interface was the one being
plugged.
To solve that problem, this patch includes a single <interface>
element at the beginning of the XML sent to the network hook for
"plugged" and "unplugged" (just inside <hookData>) that is the status
XML of the interface being plugged. This XML will include all info
gathered from the chosen network and portgroup.
NB: due to hardcoded spaces in all of the device *Format() functions,
the <interface> element inside the <hookData> will be indented by 6
spaces rather than 2. I had intended to fix this, but it turns out
that to make virDomainNetDefFormat() indentation relative, I would
have to do the same to virDomainDeviceInfoFormat(), and that function
is called from 19 places - making that a prerequisite of this patch
would cause too many merge difficulties if we needed to backport
network hooks, so I chose to ignore the problem here and fix the
problem for *all* devices in a followup later.
Laine Stump [Thu, 20 Feb 2014 12:03:49 +0000 (14:03 +0200)]
conf: output actual netdev status in <interface> XML
Until now, the "live" XML status of an <interface type='network'>
device would always show the network information, rather than the
exact hardware device that was used. It would also show the name of
any portgroup the interface belonged to, rather than providing the
configuration that was derived from that portgroup. As an example,
given the following network definition:
In order to learn the exact bandwidth information of the interface, a
management application would need to retrieve the XML for testnet,
then search for the portgroup named "admin". Even worse, there was no
simple and standard way to learn which host physdev the macvtap0
device is attached to.
Internally, libvirt has always kept this information in the
virDomainDef that is held in memory, as well as storing it in the
(libvirt-internal-only) domain status XML (in
/var/run/libvirt/qemu/$domain.xml). In order to not confuse the runtime
"actual state" with the config of the device, it's internally stored
like this:
This was never exposed outside of libvirt though, because I thought it
would be too awkward for a management application to need to look in
two places for the same information, but I also wasn't sure that it
would be okay to overwrite the config info (in this case "<source
network='testnet' portgroup='admin'/>") with the actual runtime info
(everything inside <actual> above).
Now we have a need for this information to be made available to
management applications (in particular, so that a network "plugged"
hook will have full information about the device that is being plugged
in), so it's time to take the leap and decide that it is acceptable
for the config info to be replaced with actual runtime state (but
*only* when reporting domain live status, *not* when saving state in
/var/run/libvirt/qemu/$domain.xml - that remains the same so that
there is no loss of information). That is what this patch does - once
applied, the output of "virsh dumpxml $domain" when the domain is
running will contain something like this:
In effect, everything that is internally stored within <actual> is
moved up a level to where a management application will expect
it. This means that the management application will only look in a
single place to learn - the type of interface in use, the name of the
physdev (if relevant), the <bandwidth>, <vlan>, and <virtualport>
settings in use.
The potential downside is that a management app looking at this output
will not see that the physdev 'p4p1_0' was actually allocated from the
network named 'testnet', or that the bandwidth numbers were taken from
the portgroup 'admin'. However, if they are interested in that info,
they can always get the "inactive" XML for the domain.
An example of where this could cause problems is in virt-manager's
network device display, which shows the status of the device, but
allows you to edit that status info and save it as the new
config. Previously virt-manager would always display the information
in example [C] above, and allow editing that. With this patch, it will
instead display what is in [E] and allow editing it directly, which
could lead to some confusion. I would suggest that virt-manager have
an "edit" button which would change the display from the "live" xml to
the "inactive" xml, so that editing would be done on that; such a
change would both handle the new situation, and also be compatible
with older releases.
Laine Stump [Wed, 19 Feb 2014 12:45:52 +0000 (14:45 +0200)]
conf: new function virDomainActualNetDefContentsFormat
This function is currently only called from one place, but in a
subsequent patch will be called from a 2nd place.
The new function exactly replicates the original behavior of the part
of virDomainActualNetDefFormat() that it replaces, but takes a
virDomainNetDefPtr instead of virDomainActualNetDefPtr, and uses the
virDomainNetGetActual*() functions whenever possible, rather than
reaching into def->data.network.actual - this is to be sure that we
are reporting exactly what is being used internally, just in case
there are any discrepancies (there shouldn't be).
Laine Stump [Wed, 19 Feb 2014 14:22:48 +0000 (16:22 +0200)]
conf: re-situate <bandwidth> element in <interface>
This moves the call to virNetDevBandwidthFormat() in
virDomainNetDefFormat() to be called right after the call to
virNetDevVPortProfileFormat(), so that a single chunk of that function
can be placed inside an if that conditionally calls
virDomainActualNetDefContentsFormat() instead (next patch). The
re-ordering necessitates modifying a couple of test data files.