I noticed this while shortening switch statements via VIR_ENUM.
Basically, the only ways virAsprintf can fail are if we pass a
bogus format string (but we're not THAT bad) or if we run out
of memory (but it already warns on our behalf in that case).
Throw away the cruft that tries too hard to diagnose a printf
failure.
Eric Blake [Fri, 21 Feb 2014 19:55:06 +0000 (12:55 -0700)]
virsh: use more compact VIR_ENUM_IMPL
Dan Berrange suggested that using VIR_ENUM_IMPL is more compact
than open-coding switch statements, and still just as forceful
at making us remember to update lists if we add enum values
in the future. Make this change throughout virsh.
Sure enough, doing this change caught that we missed at least
VIR_STORAGE_VOL_NETDIR.
Ensure systemd cgroup ownership is delegated to container with userns
This function is needed for user namespaces, where we need to chmod()
the cgroup to the initial uid/gid such that systemd is allowed to
use the cgroup.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michal Privoznik [Fri, 21 Feb 2014 12:06:42 +0000 (13:06 +0100)]
virNetServerRun: Notify systemd that we're accepting clients
Systemd does not forget about the cases, where client service needs to
wait for daemon service to initialize and start accepting new clients.
Setting a dependency in client is not enough as systemd doesn't know
when the daemon has initialized itself and started accepting new
clients. However, it offers a mechanism to solve this. The daemon needs
to call a special systemd function by which the daemon tells "I'm ready
to accept new clients". This is exactly what we need with
libvirtd-guests (client) and libvirtd (daemon). So now, with this
change, libvirt-guests.service is invoked not any sooner than
libvirtd.service calls the systemd notify function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Fri, 21 Feb 2014 11:46:08 +0000 (12:46 +0100)]
libvirt-guests: Wait for libvirtd to initialize
I've noticed that in some cases systemd was quick enough and even
if libvirt-guests.service is marked to be started after the
libvirtd.service my guests were not resumed as
libvirt-guests.sh failed to connect. This is because of a
simple fact: systemd correctly starts libvirt-guests after it
execs libvirtd. However, the daemon is not able to accept
connections right from the start. It's doing some
initialization which may take ages. This problem is not limited
to systemd only, indeed. Any init system that is able to startup
services in parallel (e.g. OpenRC) may run into this situation.
The fix is to try connecting not only once, but continuously a few
times with a small sleep in between tries.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When creating a new domain, we let systemd know about it by calling
CreateMachine() function via dbus. Systemd then creates a scope and
places domain into it. However, later when the host is shutting
down, systemd computes the shutdown order to see what processes can
be shut down in parallel. And since we were not setting
dependencies at all, the slices (and thus domains) were most likely
killed before libvirt-guests.service. So user domains that had to
be saved, shut off, whatever were in fact killed. This problem can
be solved by letting systemd know that scopes we're creating must
not be killed before libvirt-guests.service.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ján Tomko [Fri, 21 Feb 2014 09:22:22 +0000 (10:22 +0100)]
Ignore additional fields in iscsiadm output
There has been a new field introduced in iscsiadm --mode session
output [1], but our regex only expects four fields. This breaks
startup of iscsi pools:
error: Failed to start pool iscsi
error: internal error: cannot find session
Fix this by ignoring anything after the fourth field.
Ján Tomko [Fri, 21 Feb 2014 08:10:48 +0000 (09:10 +0100)]
Add a stub for virCgroupGetDomainTotalCpuStats
Commit 6515889 broke the build on FreeBSD:
In function `qemuDomainGetCPUStats':
/../../src/qemu/qemu_driver.c:16102:
undefined reference to `virCgroupGetDomainTotalCpuStats'
Eric Blake [Fri, 14 Feb 2014 23:59:35 +0000 (16:59 -0700)]
virsh: add event command, for lifecycle events
Add 'virsh event --list' and 'virsh event [dom] --event=name
[--loop] [--timeout]'. Borrows somewhat from event-test.c,
but defaults to a one-shot notification, and takes advantage
of the event loop integration to allow Ctrl-C to interrupt the
wait for an event. For now, this just does lifecycle events.
* tools/virsh.pod (event): Document new command.
* tools/virsh-domain.c (vshDomainEventToString)
(vshDomainEventDetailToString, vshDomEventData)
(vshEventLifecyclePrint, cmdEvent): New struct and functions.
Eric Blake [Fri, 14 Feb 2014 22:30:23 +0000 (15:30 -0700)]
virsh: common code for waiting for an event
I plan to add 'virsh event' to virsh-domain.c and 'virsh
net-event' to virsh-network.c; but as they will share quite
a bit of common boilerplate, it's better to set that up now
in virsh.c.
* tools/virsh.h (_vshControl): Add fields.
(vshEventStart, vshEventWait, vshEventDone, vshEventCleanup): New
prototypes.
* tools/virsh.c (vshEventFd, vshEventOldAction, vshEventInt)
(vshEventTimeout): New helper variables and functions.
(vshEventStart, vshEventWait, vshEventDone, vshEventCleanup):
Implement new functions.
(vshInit, vshDeinit, main): Manage event timeout.
Eric Blake [Fri, 14 Feb 2014 00:08:42 +0000 (17:08 -0700)]
virsh: common code for parsing --seconds
Several virsh commands ask for a --timeout parameter in
seconds, then use it to control interfaces that operate on
millisecond limits; I also plan on adding a 'virsh event'
command that also does this. Factor this into a common
function.
* tools/virsh.h (vshCommandOptTimeoutToMs): New prototype.
* tools/virsh.c (vshCommandOptTimeoutToMs): New function.
* tools/virsh-domain.c (cmdBlockCommit, cmdBlockCopy)
(cmdBlockPull, cmdMigrate): Use it.
(vshWatchJob): Adjust timeout scale.
John Ferlan [Wed, 12 Feb 2014 20:33:02 +0000 (15:33 -0500)]
bandwidth: Adjust documentation
Recent autotest/virt-test testing on f20 discovered an anomaly in how
the bandwidth options are documented and used. This was discovered due
to a bug fix in the /sbin/tc utility found in iproute-3.11.0.1 (on f20)
in which overflow was actually caught and returned as an error. The fix
was first introduced in iproute-3.10 (search on iproute2 commit 'a303853e').
The autotest/virt-test test for virsh domiftune was attempting to send
the largest unsigned integer value (4294967295) for maximum value
testing. The libvirt xml implementation was designed to manage values
in kilobytes thus when this value was passed to /sbin/tc, it (now)
properly rejected the 4294967295kbps value.
Investigation of the problem discovered that formatdomain.html.in and
formatnetwork.html.in described the elements and property types slightly
differently, although they use the same code - virNetDevBandwidthParseRate()
(shared by portgroups, domains, and networks xml parsers). Rather than
have the descriptions in two places, this patch will combine and reword
the description under formatnetwork.html.in and have formatdomain.html.in
link to that description.
This documentation faux pas was continued into the virsh man page where
the bandwidth description for both 'attach-interface' and 'domiftune'
did not indicate the format of each value, thus leading to the test using
largest unsigned integer value assuming "bps" rather than "kbps", which
ultimately was wrong.
If a domain has a different maximum for persistent and live maxmem
or max vcpus, then it is possible to hit cases where libvirt
refuses to adjust the current values or gets halfway through
the adjustment before failing. Better is to determine up front
if the change is possible for all requested flags.
Based on an idea by Geoff Franks.
* src/qemu/qemu_driver.c (qemuDomainSetMemoryFlags): Compute
correct maximum if both live and config are being set.
(qemuDomainSetVcpusFlags): Likewise.
The previous OOM testing support would re-run the entire "main"
method each iteration, failing a different malloc each time.
When a test suite has 'n' allocations, the number of repeats
requires is (n * (n + 1) ) / 2. This gets very large, very
quickly.
This new OOM testing support instead integrates at the
virtTestRun level, so each individual test case gets repeated,
instead of the entire test suite. This means the values of
'n' are orders of magnitude smaller.
The simple usage is
$ VIR_TEST_OOM=1 ./qemuxml2argvtest
...
29) QEMU XML-2-ARGV clock-utc ... OK
Test OOM for nalloc=36 .................................... OK
30) QEMU XML-2-ARGV clock-localtime ... OK
Test OOM for nalloc=36 .................................... OK
31) QEMU XML-2-ARGV clock-france ... OK
Test OOM for nalloc=38 ...................................... OK
...
the second lines reports how many mallocs have to be failed, and thus
how many repeats of the test will be run.
If it crashes, then running under valgrind will often show the problem
In the worse case, you might want to know the stack trace of the
alloc which was failed then VIR_TEST_OOM_TRACE can be set. If it
is set to 1 then it will only print if it thinks a mistake happened.
This is often not reliable, so setting it to 2 will make it print
the stack trace for every alloc that is failed.
Thorsten Behrens [Fri, 14 Feb 2014 17:49:04 +0000 (18:49 +0100)]
Widening API change - accept empty path for virDomainBlockStats
And provide domain summary stat in that case, for lxc backend.
Use case is a container inheriting all devices from the host,
e.g. when doing application containerization.
Jincheng Miao [Thu, 20 Feb 2014 09:29:15 +0000 (17:29 +0800)]
virsh: fix memleak when starting a guest with invalid fd
When start a guest with --pass-fd, if the argument of --pass-fd is invalid,
virsh will exit, but doesn't free the variable 'dom'.
The valgrind said:
...
==24569== 63 (56 direct, 7 indirect) bytes in 1 blocks are definitely lost in loss record 130 of 234
==24569== at 0x4C2A1D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24569== by 0x4E879A4: virAllocVar (viralloc.c:544)
==24569== by 0x4EBD625: virObjectNew (virobject.c:190)
==24569== by 0x4F3A18A: virGetDomain (datatypes.c:226)
==24569== by 0x4F9311F: remoteDomainLookupByName (remote_driver.c:6636)
==24569== by 0x4F44F20: virDomainLookupByName (libvirt.c:2277)
==24569== by 0x12F616: vshCommandOptDomainBy (virsh-domain.c:105)
==24569== by 0x131C79: cmdStart (virsh-domain.c:3330)
==24569== by 0x12C4AB: vshCommandRun (virsh.c:1752)
==24569== by 0x127001: main (virsh.c:3218)
Destroying a suspended domain needs special action.
We cannot simply terminate all process because they are frozen.
Do deal with that we send them SIGKILL and thaw them.
Upon wakeup the process sees the pending signal and dies immediately.
Signed-off-by: Richard Weinberger <richard@nod.at>
Ján Tomko [Thu, 20 Feb 2014 09:04:30 +0000 (10:04 +0100)]
Fix build of portallocator on mingw
IN6ADDR_ANY_INIT does not seem to be working as expected on MinGW:
error: missing braces around initializer [-Werror=missing-braces]
.sin6_addr = IN6ADDR_ANY_INIT,
Michal Privoznik [Wed, 19 Feb 2014 13:55:23 +0000 (14:55 +0100)]
networkRunHook: Run hook only if possible
Currently, networkRunHook() is called in networkAllocateActualDevice and
friends. These functions, however, doesn't necessarily work on networks,
For example, if domain's interface is defined in this fashion:
The networkAllocateActualDevice jumps directly onto 'validate' label as
the interface is not type of 'network'. Hence, @network is left
initialized to NULL and networkRunHook(network, ...) is called. One of
the things that the hook function does is dereference @network. Soupir.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Jim Fehlig [Thu, 6 Feb 2014 23:34:58 +0000 (16:34 -0700)]
libxl: use job functions in domain save operations
Saving domain memory and cpu state can take considerable time.
Use the recently added job functions and unlock the virDomainObj
while saving the domain.
Jim Fehlig [Wed, 12 Feb 2014 23:06:41 +0000 (16:06 -0700)]
libxl: use job functions when cleaning up a domain
When explicitly destroying a domain (libxlDomainDestroyFlags), or
handling an out-of-band domain shutdown event, cleanup the domain
in the context of a job. Introduce libxlVmCleanupJob to wrap
libxlVmCleanup in a job block.
Jim Fehlig [Thu, 6 Feb 2014 22:21:36 +0000 (15:21 -0700)]
libxl: use job functions in libxlVmStart
Creating a large domain could potentially be time consuming. Use the
recently added job functions and unlock the virDomainObj while
the create operation is in progress.
Jim Fehlig [Wed, 12 Feb 2014 22:22:18 +0000 (15:22 -0700)]
libxl: remove libxlVmReap function
This function, which only has five call sites, simply calls
libxl_domain_destroy and libxlVmCleanup. Call those functions
directly at the call sites, allowing more control over how a
domain is destroyed and cleaned up. This patch maintains the
existing semantic, leaving changes to a subsequent patch.
Oleg Strikov [Fri, 14 Feb 2014 14:09:00 +0000 (18:09 +0400)]
qemu: Use virtio network device for aarch64/virt
This patch changes network device type used by default from rtl8139
to virtio when architecture type is aarch64 and machine type is virt.
Qemu doesn't support any other machine types for aarch64 right now and
we can't make any other aarch64-specific tuning in this function yet.
Li Zhang [Mon, 17 Feb 2014 10:17:54 +0000 (18:17 +0800)]
conf: Remove the implicit PS2 devices for non-X86 platforms
PS2 devices only work on X86 platform, other platforms may need
USB devices instead. Athough it doesn't influence the QEMU command line,
it's not right to add PS2 mouse/keyboard for non-X86 platform.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Michal Privoznik [Tue, 18 Feb 2014 17:40:28 +0000 (18:40 +0100)]
bridge_driver.h: Fix build --without-network
The networkNotifyActualDevice function is accepting two arguments, not
one:
qemu/qemu_process.c: In function 'qemuProcessNotifyNets':
qemu/qemu_process.c:2776:47: error: macro "networkNotifyActualDevice" passed 2 arguments, but takes just 1
if (networkNotifyActualDevice(def, net) < 0)
^
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ján Tomko [Tue, 18 Feb 2014 14:01:32 +0000 (15:01 +0100)]
Fix conflicting types of virInitctlSetRunLevel
aebbcdd didn't change the non-linux definition of the function,
breaking the build on FreeBSD:
../../src/util/virinitctl.c:164: error: conflicting types for
'virInitctlSetRunLevel'
../../src/util/virinitctl.h:40: error: previous declaration of
'virInitctlSetRunLevel' was here
network: Taint networks that are using hook script
Basically, the idea is copied from domain code, where tainting
exists for a while. Currently, only one taint reason exists -
VIR_NETWORK_TAINT_HOOK to mark those networks which caused invoking
of hook script.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Fri, 31 Jan 2014 15:48:06 +0000 (16:48 +0100)]
network: Introduce network hooks
There might be some use cases, where user wants to prepare the host or
its environment prior to starting a network and do some cleanup after
the network has been shut down. Consider all the functionality that
libvirt doesn't currently have as an example what a hook script can
possibly do.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Wed, 12 Feb 2014 16:36:35 +0000 (17:36 +0100)]
network_conf: Expose virNetworkDefFormatInternal
In the next patch I'm going to need the network format function that
takes virBuffer as argument. However, slightly change of name is more
appropriate then: virNetworkDefFormatBuf to match the rest of functions
that format an object to buffer.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC hotunplug code
Rewrite multiple hotunplug functions to to use the
virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with an absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC chardev hostdev hotplug
Rewrite lxcDomainAttachDeviceHostdevMiscLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC block hostdev hotplug
Rewrite lxcDomainAttachDeviceHostdevStorageLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC USB hotplug
Rewrite lxcDomainAttachDeviceHostdevSubsysUSBLive function
to use the virProcessRunInMountNamespace helper. This avoids
risk of a malicious guest replacing /dev with a absolute
symlink, tricking the driver into changing the host OS
filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC disk hotplug
Rewrite lxcDomainAttachDeviceDiskLive function to use the
virProcessRunInMountNamespace helper. This avoids risk of
a malicious guest replacing /dev with a absolute symlink,
tricking the driver into changing the host OS filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Eric Blake [Tue, 24 Dec 2013 05:55:51 +0000 (22:55 -0700)]
CVE-2013-6456: Avoid unsafe use of /proc/$PID/root in LXC shutdown/reboot code
Use helper virProcessRunInMountNamespace in lxcDomainShutdownFlags and
lxcDomainReboot. Otherwise, a malicious guest could use symlinks
to force the host to manipulate the wrong file in the host's namespace.
Idea by Dan Berrange, based on an initial report by Reco
<recoverym4n@gmail.com> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732394
Add helper for running code in separate namespaces
Implement virProcessRunInMountNamespace, which runs callback of type
virProcessNamespaceCallback in a container namespace. This uses a
child process to run the callback, since you can't change the mount
namespace of a thread. This implies that callbacks have to be careful
about what code they run due to async safety rules.
Idea by Dan Berrange, based on an initial report by Reco
<recoverym4n@gmail.com> at
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=732394
Signed-off-by: Daniel Berrange <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Add a helper function which takes a file path and ensures
that all directory components leading up to the file exist.
IOW, it strips the filename part of the path and passes
the result to virFileMakePath.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move check for cgroup devices ACL upfront in LXC hotplug
The check for whether the cgroup devices ACL is available is
done quite late during LXC hotplug - in fact after the device
node is already created in the container in some cases. Better
to do it upfront so we fail immediately.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fix reset of cgroup when detaching USB device from LXC guests
When detaching a USB device from an LXC guest we must remove
the device from the cgroup ACL. Unfortunately we were telling
the cgroup code to use the guest /dev path, not the host /dev
path, and the guest device node had already been unlinked.
This was, however, fortunate since the code passed &priv->cgroup
instead of priv->cgroup, so would have crash if the device node
were accessible.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The LXC code missed the 'usb' component out of the path
/dev/bus/usb/$BUSNUM/$DEVNUM, so it failed to actually
setup cgroups for the device. This was in fact lucky
because the call to virLXCSetupHostUsbDeviceCgroup
was also mistakenly passing '&priv->cgroup' instead of
just 'priv->cgroup'. So once the path is fixed, libvirtd
would then crash trying to access the bogus virCgroupPtr
pointer. This would have been a security issue, were it
not for the bogus path preventing the pointer reference
being reached.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
virDomainDefCompatibleDevice blocks use of USB if no USB
controller is present. This is not correct for containers
since devices can be assigned directly regardless of any
controllers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, there's just one place where we care if hook script is
changing the domain XML: migration hook for incoming migration. In
all other places where a hook script is executed, we don't read the
XML back from the script.
Anyway, the hook script can alter domain XML and hence we should taint
it if the script did.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Fri, 14 Feb 2014 15:03:22 +0000 (16:03 +0100)]
Revert "storage: Introduce internal pool support"
The internal pools were an idea in one of the first iterations of the
gluster series, which we decided not to use. Somehow the patch still
got pushed. Remove it as the internal flag isn't needed.
Ján Tomko [Fri, 18 Oct 2013 11:52:03 +0000 (13:52 +0200)]
Support IPv6 in port allocator
Also try to bind on IPv6 to check if the port is occupied.
Change the mocked bind in the test to return EADDRINUSE
for some ports only for the IPv4/IPv6 socket if we're testing
on a host with IPv6 compiled in.
Also mock socket() to make it fail with EAFNOTSUPPORTED
if LIBVIRT_TEST_IPV4ONLY is set in the environment, to
simulate a host without IPv6 support in the kernel. The
tests are repeated again with this variable set.
Peter Krempa [Fri, 14 Feb 2014 10:46:37 +0000 (11:46 +0100)]
storage: Fix build with older compilers afeter gluster snapshot series
In commit e32268184b4fd1611ed5ffd3c758b8f6a34152e6 I accidentally added
twice a typedef for virStorageFileBackend when I moved it between files
across patch iterations. The double declaration breaks build on older
compilers in RHEL5 and FreeBSD.