Oleg Strikov [Mon, 3 Mar 2014 13:41:03 +0000 (17:41 +0400)]
qemu: Implement a stub cpuArchDriver.baseline() handler for arm
Openstack Nova calls virConnectBaselineCPU() during initialization
of the instance to get a full list of CPU features.
This patch adds a stub to arm-specific code to handle
this request (no actual work is done).
Generate a unique journald log for QEMU capabilities failure
When probing QEMU capabilities fails for a binary generate a
log message with MESSAGE_ID==8ae2f3fb-2dbe-498e-8fbd-012d40afa361.
This can be directly queried from journald based on the UUID
instead of needing string grep. This lets tools like libguestfs'
bug reporting tool trivially do automated sanity tests on the
host they're running on.
$ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361
Feb 21 17:11:01 localhost.localdomain lt-libvirtd[9196]:
Failed to probe capabilities for /bin/qemu-system-alpha:
internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=
/home/berrange/src/virt/libvirt/src/.libs PATH=/usr/lib64/
ccache:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:
/usr/bin:/root/bin HOME=/root USER=root LOGNAME=root
/bin/qemu-system-alpha -help) unexpected exit status 127:
/bin/qemu-system-alpha: error while loading shared libraries:
libglapi.so.0: cannot open shared object file: No such file
or directory
$ journalctl MESSAGE_ID=8ae2f3fb-2dbe-498e-8fbd-012d40afa361 --output=json
{ ...snip...
"LIBVIRT_SOURCE" : "file",
"PRIORITY" : "3",
"CODE_FILE" : "qemu/qemu_capabilities.c",
"CODE_LINE" : "2770",
"CODE_FUNC" : "virQEMUCapsLogProbeFailure",
"MESSAGE_ID" : "8ae2f3fb-2dbe-498e-8fbd-012d40afa361",
"LIBVIRT_QEMU_BINARY" : "/bin/qemu-system-xtensa",
"MESSAGE" : "Failed to probe capabilities for /bin/qemu-system-xtensa:
internal error: Child process (LC_ALL=C LD_LIBRARY_PATH=/home/berrange
/src/virt/libvirt/src/.libs PATH=/usr/lib64/ccache:/usr/local/sbin:
/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin HOME=/root
USER=root LOGNAME=root /bin/qemu-system-xtensa -help) unexpected
exit status 127: /bin/qemu-system-xtensa: error while loading shared
libraries: libglapi.so.0: cannot open shared object file: No such
file or directory\n" }
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Eric Blake [Mon, 24 Feb 2014 23:57:06 +0000 (16:57 -0700)]
virsh: add --all flag to 'event' command
Similar to our event-test demo program, it's nice to be able to
have a mode where we can sniff all events at once, rather than
having to spawn multiple virsh in parallel with one for each
event type.
(Can I just say our RegisterAny design is lousy? The fact that
the majority of our callback pointers have a function signature
with the opaque data in a different position, and that we have
to cast the function signature before registering it, makes it
hard to write a generic callback function; we have to write one
for every type of event id. Life would have been easier if we
had designed the callback as a fixed signature with a void*
and size parameter, and then allowed the caller to downcast
the void* to a particular struct for data specific to their
callback id, where we could have then had a single function
with a switch statement for each event id, and register that
one function for all types of events. It would also be nicer
if the callback functions knew which callbackID was being used
when invoking that callback, so that I could use a common data
structure among all registrations instead of having to create
an array of one data per callback. But I really don't want to
go add yet another event API design.)
* tools/virsh-domain.c (cmdEvent): Add --all parameter; convert
all callbacks to support shared counter.
* tools/virsh.pod (event): Document it.
Eric Blake [Mon, 24 Feb 2014 21:46:29 +0000 (14:46 -0700)]
virsh: support remaining domain events
Earlier, I added 'virsh event' for lifecycle events, to get the
concept approved; this patch finishes the support for all other
events, although the user still has to register for one event
type at a time. A future patch may add an --all parameter to
make it possible to register for all events through a single
call.
* tools/virsh-domain.c (vshDomainEventWatchdogToString)
(vshDomainEventIOErrorToString, vshGraphicsPhaseToString)
(vshGraphicsAddressToString, vshDomainBlockJobStatusToString)
(vshDomainEventDiskChangeToString)
(vshDomainEventTrayChangeToString, vshEventGenericPrint)
(vshEventRTCChangePrint, vshEventWatchdogPrint)
(vshEventIOErrorPrint, vshEventGraphicsPrint)
(vshEventIOErrorReasonPrint, vshEventBlockJobPrint)
(vshEventDiskChangePrint, vshEventTrayChangePrint)
(vshEventPMChangePrint, vshEventBalloonChangePrint)
(vshEventDeviceRemovedPrint): New helper routines.
(cmdEvent): Support full array of event callbacks.
The systemd journal expects log record PRIORITY values to
be encoded using the syslog compatible numbering scheme,
not libvirt's own native numbering scheme. We must therefore
apply a conversion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The systemd journal accepts arbitrary user specified log
fields. These can be passed into virLogMessage via the
virLogMetadata structure. Allow up to 5 custom fields to
be reported by libvirt callers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Oleg Strikov [Wed, 26 Feb 2014 16:40:12 +0000 (20:40 +0400)]
qemu: Enable 'host-passthrough' cpu mode for arm
This patch allows libvirt user to specify 'host-passthrough'
cpu mode while using qemu/kvm backend on arm (arm32).
It uses 'host' as a CPU model name instead of some other stub
(correct CPU detection is not implemented yet) to allow libvirt
user to specify 'host-model' cpu mode as well.
Michal Privoznik [Fri, 28 Feb 2014 08:50:01 +0000 (09:50 +0100)]
virDomainBlockStats(Flags): Produce saner error message on empty disk path
As of 0bd2ccdec an empty disk path for virDomainBlockStats (or the one
with Flags) is allowed meaning "get me overall summarized statistics".
However, running 'virsh domblkstat $dom' throws a misleading error:
# ./tools/virsh domblkstat dom
error: Failed to get block stats dom
error: invalid argument: invalid path:
while after this commit
# virsh domblkstat dom
error: Operation not supported: summary statistics are not supported yet
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Fri, 28 Feb 2014 00:48:23 +0000 (17:48 -0700)]
tests: avoid littering /tmp
Running 'make -C tests check TESTS=qemuagenttest' left a directory
/tmp/libvirt_XXXXXX/ behind. The culprit was failure to cleanup
when short-circuiting an expensive test.
* tests/qemuagenttest.c (testQemuAgentTimeout): Free resources
when skipping expensive test.
Jiri Denemark [Thu, 27 Feb 2014 08:32:41 +0000 (09:32 +0100)]
sanlock: Truncate domain names longer than SANLK_NAME_LEN
Libvirt uses a domain name to fill in owner_name in sanlock_options in
virLockManagerSanlockAcquire. Unfortunately, owner_name is limited to
SANLK_NAME_LEN characters (including trailing '\0'), which means domains
with longer names fail to start when sanlock is enabled. However, we can
truncate the name when setting owner_name as explained by sanlock's
author:
Setting sanlk_options or the owner_name is unnecessary, and has very
little to no benefit. If you do provide something in owner_name, it can
be anything, sanlock doesn't care or use it.
If you run the command "sanlock status", the output will display a list
of clients connected to the sanlock daemon. This client list is
displayed as "pid owner_name" if the client has provided an owner_name
via sanlk_options. This debugging output is the only usage of
owner_name, so its only benefit is to potentially provide a more human
friendly output for debugging purposes.
Eric Blake [Wed, 26 Feb 2014 20:30:46 +0000 (13:30 -0700)]
build: skip virportallocatortest on cygwin
Cygwin supports <dlfcn.h> and even has limited LD_PRELOAD
capabilities; but because it does not use ELF binaries it
cannot support RTLD_NEXT lookups.
CC libvirportallocatormock_la-virportallocatortest.lo
virportallocatortest.c: In function 'init_syms':
virportallocatortest.c:47:24: error: 'RTLD_NEXT' undeclared (first use in this function)
realsocket = dlsym(RTLD_NEXT, "socket");
* tests/virportallocatortest.c: Also require RTLD_NEXT.
Eric Blake [Wed, 26 Feb 2014 19:56:06 +0000 (12:56 -0700)]
build: ignore cygwin toolchain droppings
The cygwin compiler automatically creates a '*.exe.manifest'
companion file for any .exe file that contains a substring
that would otherwise cause newer Windows to pester users about
needing admin rights (such as "update", "instal", "setup"...).
This means that compilation on cygwin left behind
tests/networkxml2xmlupdatetest.exe.manifest.
Peter Krempa [Wed, 26 Feb 2014 13:17:51 +0000 (14:17 +0100)]
spec: Fix braces around macros
In commit 72f7658ba24491672e6b81118f892400916e9404 I've added a few
macros with bad bracing. Although they work as expected fix them so that
we use uniform syntax.
Oops - the build is non-deterministic, where the final binary
depends on my build environment. The fix is to require
systemd-devel in all situations where the code base uses it.
Now ~/.rpmmacros can contain "%define _without_systemd_daemon 1"
to explicitly disable use of the library, but the library is now
a strict build requirement for normal builds; if systemd-devel
is not installed, the user now gets an up-front warning:
$ rpmbuild -ta libvirt-1.2.2.tar.gz
error: Failed build dependencies:
systemd-devel is needed by libvirt-1.2.2-1.fc20.x86_64
* libvirt.spec.in (with_systemd_daemon): New variable.
(BuildRequires): Require systemd-devel for more than just udev.
(%configure): Make choice of systemd_daemon explicit.
Eric Blake [Tue, 25 Feb 2014 19:28:53 +0000 (12:28 -0700)]
spec: require device-mapper-devel for storage-disk
On Fedora 20, with the following in my ~/.rpmmacros:
%_without_udev 1
%_without_storage_mpath 1
and with device-mapper-devel uninstalled, 'make rpm' fails with:
checking for libdevmapper.h... no
configure: error: You must install device-mapper-devel/libdevmapper >= 1.0.0 to compile libvirt
error: Bad exit status from /var/tmp/rpm-tmp.Wo9pOG (%build)
This is a rather late point to be issuing an error; better is
to flag missing packages up front. The fix is to match the logic
in configure.ac on when devmapper is required (for both mpath and
storage). While at it, rbd storage is not dependent on mpath.
With this patch applied, I now get:
$ rpmbuild -ta libvirt-1.2.2.tar.gz
error: Failed build dependencies:
device-mapper-devel is needed by libvirt-1.2.2-1.fc20.x86_64
until either installing the package or further modifying
~/.rpmmacros to add "%_without_storage_disk 1".
* libvirt.spec.in (BuildRequires): Fix build when mpath is
disabled.
Eric Blake [Tue, 25 Feb 2014 16:51:40 +0000 (09:51 -0700)]
spec: explicitly avoid bhyve on Linux
Generally, we try to make the spec file tweakable via user
variables, so that they can select a different subset of sub-rpms
to build. We also try to explicitly list all driver config
options, rather than leaving the chance that the rpm build may be
non-deterministic based on what the user had installed locally.
But in the case of the recent bhyve hypervisor driver, there is
no port of bhyve to Linux, so it is easier to just blindly
disable it for now. If someone ever does try to port bhyve to
Fedora, we can make the spec file conditional at that point.
* libvirt.spec.in (%configure): Don't try to build bhyve.
Eric Blake [Tue, 25 Feb 2014 16:42:29 +0000 (09:42 -0700)]
build: use --with-systemd-daemon as configure option
Commit 68954fb added a configure option --with-systemd_daemon,
which violates the conventions of configure files preferring
dash in all option names. This fixes it, before we hit a
release where the tarball is baked with an awkward name.
* m4/virt-lib.m4 (LIBVIRT_CHECK_LIB, LIBVIRT_CHECK_LIB_ALT)
(LIBVIRT_CHECK_PKG): Favor - over _ in configure option names.
Laine Stump [Tue, 25 Feb 2014 14:47:22 +0000 (16:47 +0200)]
network: unplug bandwidth and call networkRunHook only when appropriate
According to commit b4e0299d if networkAllocateActualDevice() was
successful, it will *always* allocate an iface->data.network.actual,
so we can use this during networkReleaseActualDevice() to know if
there is really anything to undo. We were properly using this
information to only decrement the network connections counter if it
had previously been incremented, but we were unconditionally
unplugging bandwidth and calling the "unplugged" network hook for
*all* interfaces (during qemuProcessStop()) whether they had been
previously plugged or not. This caused problems if a domain failed to
start at some time prior to all interfaces being allocated. (I
encountered this when an interface had a bandwidth floor set but no
inbound QoS).
This patch changes both the call to networkUnplugBandwidth() and the
call to networkRunHook() to only be called if there was a previous
call to "plug" for the same interface.
Laine Stump [Tue, 25 Feb 2014 14:35:59 +0000 (16:35 +0200)]
network: don't even call networkRunHook if there is no network
networkAllocateActualDevice() is called for *all* interfaces, not just
those with type='network'. In that case, it will jump down to its
validate: label immediately, without allocating anything. After
validation is done, two counters are potentially updated (one for the
network, and one for any particular physical device that is chosen),
and then networkRunHook() is called.
This patch refactors that code a slight bit so that networkRunHook()
doesn't get called if netdef is NULL (i.e. type != network) and to
place the conditional increment of dev->connections inside the "if
(netdef)" as well - dev can never be non-null if netdef is null
(because "dev" is the pointer to a device in a network's pool of
devices), so this doesn't have any functional effect, it just makes
the code clearer.
While running virscsitest, it was found that valgrind pointed out the following
memory leak:
==320== 5 bytes in 1 blocks are definitely lost in loss record 4 of 37
==320== at 0x4A069EE: malloc (vg_replace_malloc.c:270)
==320== by 0x3E6CE81171: strdup (strdup.c:43)
==320== by 0x4CB28DF: virStrdup (virstring.c:554)
==320== by 0x4CAC987: virSCSIDeviceSetUsedBy (virscsi.c:289)
==320== by 0x402321: test2 (virscsitest.c:100)
==320== by 0x403231: virtTestRun (testutils.c:199)
==320== by 0x402121: mymain (virscsitest.c:180)
==320== by 0x4039AD: virtTestMain (testutils.c:782)
==320== by 0x3E6CE1ED1C: (below main) (libc-start.c:226)
==320==
Introduced by commit fd243fc. Signed-off-by: Ján Tomko <jtomko@redhat.com>
When starting these domain in parallel, all workers may meet in
virNetDevVethCreate() where a race starts. Race over allocating veth
pairs because allocation requires two steps:
1) find first nonexistent '/sys/class/net/vnet%d/'
2) run 'ip link add ...' command
Now consider two threads. Both of them find N as the first unused veth
index but only one of them succeeds allocating it. The other one fails.
For such cases, we are running the allocation in a loop with 10 rounds.
However this is very flaky synchronization. It should be rather used
when libvirt is competing with other process than when libvirt threads
fight each other. Therefore, internally we should use mutex to serialize
callers, and do the allocation in loop (just in case we are competing
with a different process). By the way we have something similar already
since 1cf97c87.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Wed, 26 Feb 2014 00:24:58 +0000 (17:24 -0700)]
build: avoid ld_preload tests on mingw
Running ./autobuild.sh complained during the mingw cross-compile:
CC libvirportallocatormock_la-virportallocatortest.lo
../../tests/virportallocatortest.c:32:20: fatal error: dlfcn.h: No such file or directory
# include <dlfcn.h>
^
compilation terminated. With that fixed, the next failure was:
CCLD qemuxml2argvmock.la
libtool: link: libtool library `qemuxml2argvmock.la' must begin with `lib'
libtool: link: Try `libtool --help --mode=link' for more information.
While we don't need to limit all LD_PRELOAD tests to just Linux, we
do need to limit them to platforms that actually support loading;
we also need to avoid building qemu tests when qemu is not enabled.
* tests/virportallocatortest.c: Make conditional on <dlfcn.h>.
* tests/Makefile.am (test_libraries): Only build qemu mock library
when building qemu tests.
Jim Fehlig [Wed, 19 Feb 2014 22:39:19 +0000 (15:39 -0700)]
libxl: queue domain event earlier in shutdown handler
The shutdown handler may restart a domain when handling a reboot
event or when <on_*> is set to 'restart'. Restarting consists of
calling libxlVmCleanup followed by libxlVmStart. libxlVmStart will
emit a VIR_DOMAIN_EVENT_STARTED event, but the SHUTDOWN event is
not emitted until exiting the shutdown handler, after the STARTED
event.
This patch changes the logic a bit to queue the event at the start
of the shutdown action, ensuring it is queued before any subsequent
events that may be generated while executing the shutdown action.
Laine Stump [Fri, 21 Feb 2014 12:12:00 +0000 (14:12 +0200)]
network: include plugged interface XML in "plugged" network hook
The network hook script gets called whenever an interface is plugged
into or unplugged from a network, but even though the full XML of both
the network and the domain is included, there is no reasonable way to
determine what exact resources the plugged interface is using:
1) Prior to a recent patch which modified the status XML of interfaces
to include the information about actual hardware resources used, it
would be possible to scan through the domain XML output sent to the
hook, and from there find the correct interface, but that interface
definition would not include any runtime info (e.g. bandwidth or vlan
taken from a portgroup, or which physdev was used in case of a macvtap
network).
2) After the patch modifying the status XML of interfaces, the network
name would no longer be included in the domain XML, so it would be
completely impossible to determine which interface was the one being
plugged.
To solve that problem, this patch includes a single <interface>
element at the beginning of the XML sent to the network hook for
"plugged" and "unplugged" (just inside <hookData>) that is the status
XML of the interface being plugged. This XML will include all info
gathered from the chosen network and portgroup.
NB: due to hardcoded spaces in all of the device *Format() functions,
the <interface> element inside the <hookData> will be indented by 6
spaces rather than 2. I had intended to fix this, but it turns out
that to make virDomainNetDefFormat() indentation relative, I would
have to do the same to virDomainDeviceInfoFormat(), and that function
is called from 19 places - making that a prerequisite of this patch
would cause too many merge difficulties if we needed to backport
network hooks, so I chose to ignore the problem here and fix the
problem for *all* devices in a followup later.
Laine Stump [Thu, 20 Feb 2014 12:03:49 +0000 (14:03 +0200)]
conf: output actual netdev status in <interface> XML
Until now, the "live" XML status of an <interface type='network'>
device would always show the network information, rather than the
exact hardware device that was used. It would also show the name of
any portgroup the interface belonged to, rather than providing the
configuration that was derived from that portgroup. As an example,
given the following network definition:
In order to learn the exact bandwidth information of the interface, a
management application would need to retrieve the XML for testnet,
then search for the portgroup named "admin". Even worse, there was no
simple and standard way to learn which host physdev the macvtap0
device is attached to.
Internally, libvirt has always kept this information in the
virDomainDef that is held in memory, as well as storing it in the
(libvirt-internal-only) domain status XML (in
/var/run/libvirt/qemu/$domain.xml). In order to not confuse the runtime
"actual state" with the config of the device, it's internally stored
like this:
This was never exposed outside of libvirt though, because I thought it
would be too awkward for a management application to need to look in
two places for the same information, but I also wasn't sure that it
would be okay to overwrite the config info (in this case "<source
network='testnet' portgroup='admin'/>") with the actual runtime info
(everything inside <actual> above).
Now we have a need for this information to be made available to
management applications (in particular, so that a network "plugged"
hook will have full information about the device that is being plugged
in), so it's time to take the leap and decide that it is acceptable
for the config info to be replaced with actual runtime state (but
*only* when reporting domain live status, *not* when saving state in
/var/run/libvirt/qemu/$domain.xml - that remains the same so that
there is no loss of information). That is what this patch does - once
applied, the output of "virsh dumpxml $domain" when the domain is
running will contain something like this:
In effect, everything that is internally stored within <actual> is
moved up a level to where a management application will expect
it. This means that the management application will only look in a
single place to learn - the type of interface in use, the name of the
physdev (if relevant), the <bandwidth>, <vlan>, and <virtualport>
settings in use.
The potential downside is that a management app looking at this output
will not see that the physdev 'p4p1_0' was actually allocated from the
network named 'testnet', or that the bandwidth numbers were taken from
the portgroup 'admin'. However, if they are interested in that info,
they can always get the "inactive" XML for the domain.
An example of where this could cause problems is in virt-manager's
network device display, which shows the status of the device, but
allows you to edit that status info and save it as the new
config. Previously virt-manager would always display the information
in example [C] above, and allow editing that. With this patch, it will
instead display what is in [E] and allow editing it directly, which
could lead to some confusion. I would suggest that virt-manager have
an "edit" button which would change the display from the "live" xml to
the "inactive" xml, so that editing would be done on that; such a
change would both handle the new situation, and also be compatible
with older releases.
Laine Stump [Wed, 19 Feb 2014 12:45:52 +0000 (14:45 +0200)]
conf: new function virDomainActualNetDefContentsFormat
This function is currently only called from one place, but in a
subsequent patch will be called from a 2nd place.
The new function exactly replicates the original behavior of the part
of virDomainActualNetDefFormat() that it replaces, but takes a
virDomainNetDefPtr instead of virDomainActualNetDefPtr, and uses the
virDomainNetGetActual*() functions whenever possible, rather than
reaching into def->data.network.actual - this is to be sure that we
are reporting exactly what is being used internally, just in case
there are any discrepancies (there shouldn't be).
Laine Stump [Wed, 19 Feb 2014 14:22:48 +0000 (16:22 +0200)]
conf: re-situate <bandwidth> element in <interface>
This moves the call to virNetDevBandwidthFormat() in
virDomainNetDefFormat() to be called right after the call to
virNetDevVPortProfileFormat(), so that a single chunk of that function
can be placed inside an if that conditionally calls
virDomainActualNetDefContentsFormat() instead (next patch). The
re-ordering necessitates modifying a couple of test data files.
Laine Stump [Thu, 20 Feb 2014 11:59:55 +0000 (13:59 +0200)]
conf: handle null pointer in virNetDevVlanFormat
Other *Format() functions (e.g. virNetDevBandwidthFormat()) return
with no action when called with a NULL *Def pointer. This makes
virNetDevVlanFormat() consistent with that behavior.
Laine Stump [Mon, 17 Feb 2014 12:36:18 +0000 (14:36 +0200)]
conf: clarify what is returned for actual bandwidth and vlan
In practice, if a virDomainNetDef has a virDomainActualNetDef
allocated, the ActualNetDef will *always* contain the bandwidth and
vlan data from the NetDef (unless there was also a portgroup involved
- see networkAllocateActualDevice()).
However, virDomainNetGetActual(Bandwidth|Vlan)() were coded to make it
appear as if it might be possible to have a valid bandwidth/vlan in
the NetDef, but a NULL in the ActualNetDef. Believing this un-truth
could lead to writing unnecessarily defensive code when dealing with
the virDomainGetActual*() functions, so this patch makes it more
obvious:
If there is an ActualNetDef, it will always have a copy of the
various appropriate bits from its parent NetDef, and the
virDomainGetActual* function will *always* return the data from the
ActualNetDef, not from the NetDef.
The reason for this effective-NOP patch is that a subsequent patch to
change virDomainNetDefFormat will rely on the above rule.
I noticed this while shortening switch statements via VIR_ENUM.
Basically, the only ways virAsprintf can fail are if we pass a
bogus format string (but we're not THAT bad) or if we run out
of memory (but it already warns on our behalf in that case).
Throw away the cruft that tries too hard to diagnose a printf
failure.
Eric Blake [Fri, 21 Feb 2014 19:55:06 +0000 (12:55 -0700)]
virsh: use more compact VIR_ENUM_IMPL
Dan Berrange suggested that using VIR_ENUM_IMPL is more compact
than open-coding switch statements, and still just as forceful
at making us remember to update lists if we add enum values
in the future. Make this change throughout virsh.
Sure enough, doing this change caught that we missed at least
VIR_STORAGE_VOL_NETDIR.
Ensure systemd cgroup ownership is delegated to container with userns
This function is needed for user namespaces, where we need to chmod()
the cgroup to the initial uid/gid such that systemd is allowed to
use the cgroup.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michal Privoznik [Fri, 21 Feb 2014 12:06:42 +0000 (13:06 +0100)]
virNetServerRun: Notify systemd that we're accepting clients
Systemd does not forget about the cases, where client service needs to
wait for daemon service to initialize and start accepting new clients.
Setting a dependency in client is not enough as systemd doesn't know
when the daemon has initialized itself and started accepting new
clients. However, it offers a mechanism to solve this. The daemon needs
to call a special systemd function by which the daemon tells "I'm ready
to accept new clients". This is exactly what we need with
libvirtd-guests (client) and libvirtd (daemon). So now, with this
change, libvirt-guests.service is invoked not any sooner than
libvirtd.service calls the systemd notify function.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Fri, 21 Feb 2014 11:46:08 +0000 (12:46 +0100)]
libvirt-guests: Wait for libvirtd to initialize
I've noticed that in some cases systemd was quick enough and even
if libvirt-guests.service is marked to be started after the
libvirtd.service my guests were not resumed as
libvirt-guests.sh failed to connect. This is because of a
simple fact: systemd correctly starts libvirt-guests after it
execs libvirtd. However, the daemon is not able to accept
connections right from the start. It's doing some
initialization which may take ages. This problem is not limited
to systemd only, indeed. Any init system that is able to startup
services in parallel (e.g. OpenRC) may run into this situation.
The fix is to try connecting not only once, but continuously a few
times with a small sleep in between tries.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When creating a new domain, we let systemd know about it by calling
CreateMachine() function via dbus. Systemd then creates a scope and
places domain into it. However, later when the host is shutting
down, systemd computes the shutdown order to see what processes can
be shut down in parallel. And since we were not setting
dependencies at all, the slices (and thus domains) were most likely
killed before libvirt-guests.service. So user domains that had to
be saved, shut off, whatever were in fact killed. This problem can
be solved by letting systemd know that scopes we're creating must
not be killed before libvirt-guests.service.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ján Tomko [Fri, 21 Feb 2014 09:22:22 +0000 (10:22 +0100)]
Ignore additional fields in iscsiadm output
There has been a new field introduced in iscsiadm --mode session
output [1], but our regex only expects four fields. This breaks
startup of iscsi pools:
error: Failed to start pool iscsi
error: internal error: cannot find session
Fix this by ignoring anything after the fourth field.
Ján Tomko [Fri, 21 Feb 2014 08:10:48 +0000 (09:10 +0100)]
Add a stub for virCgroupGetDomainTotalCpuStats
Commit 6515889 broke the build on FreeBSD:
In function `qemuDomainGetCPUStats':
/../../src/qemu/qemu_driver.c:16102:
undefined reference to `virCgroupGetDomainTotalCpuStats'
Eric Blake [Fri, 14 Feb 2014 23:59:35 +0000 (16:59 -0700)]
virsh: add event command, for lifecycle events
Add 'virsh event --list' and 'virsh event [dom] --event=name
[--loop] [--timeout]'. Borrows somewhat from event-test.c,
but defaults to a one-shot notification, and takes advantage
of the event loop integration to allow Ctrl-C to interrupt the
wait for an event. For now, this just does lifecycle events.
* tools/virsh.pod (event): Document new command.
* tools/virsh-domain.c (vshDomainEventToString)
(vshDomainEventDetailToString, vshDomEventData)
(vshEventLifecyclePrint, cmdEvent): New struct and functions.
Eric Blake [Fri, 14 Feb 2014 22:30:23 +0000 (15:30 -0700)]
virsh: common code for waiting for an event
I plan to add 'virsh event' to virsh-domain.c and 'virsh
net-event' to virsh-network.c; but as they will share quite
a bit of common boilerplate, it's better to set that up now
in virsh.c.
* tools/virsh.h (_vshControl): Add fields.
(vshEventStart, vshEventWait, vshEventDone, vshEventCleanup): New
prototypes.
* tools/virsh.c (vshEventFd, vshEventOldAction, vshEventInt)
(vshEventTimeout): New helper variables and functions.
(vshEventStart, vshEventWait, vshEventDone, vshEventCleanup):
Implement new functions.
(vshInit, vshDeinit, main): Manage event timeout.
Eric Blake [Fri, 14 Feb 2014 00:08:42 +0000 (17:08 -0700)]
virsh: common code for parsing --seconds
Several virsh commands ask for a --timeout parameter in
seconds, then use it to control interfaces that operate on
millisecond limits; I also plan on adding a 'virsh event'
command that also does this. Factor this into a common
function.
* tools/virsh.h (vshCommandOptTimeoutToMs): New prototype.
* tools/virsh.c (vshCommandOptTimeoutToMs): New function.
* tools/virsh-domain.c (cmdBlockCommit, cmdBlockCopy)
(cmdBlockPull, cmdMigrate): Use it.
(vshWatchJob): Adjust timeout scale.
John Ferlan [Wed, 12 Feb 2014 20:33:02 +0000 (15:33 -0500)]
bandwidth: Adjust documentation
Recent autotest/virt-test testing on f20 discovered an anomaly in how
the bandwidth options are documented and used. This was discovered due
to a bug fix in the /sbin/tc utility found in iproute-3.11.0.1 (on f20)
in which overflow was actually caught and returned as an error. The fix
was first introduced in iproute-3.10 (search on iproute2 commit 'a303853e').
The autotest/virt-test test for virsh domiftune was attempting to send
the largest unsigned integer value (4294967295) for maximum value
testing. The libvirt xml implementation was designed to manage values
in kilobytes thus when this value was passed to /sbin/tc, it (now)
properly rejected the 4294967295kbps value.
Investigation of the problem discovered that formatdomain.html.in and
formatnetwork.html.in described the elements and property types slightly
differently, although they use the same code - virNetDevBandwidthParseRate()
(shared by portgroups, domains, and networks xml parsers). Rather than
have the descriptions in two places, this patch will combine and reword
the description under formatnetwork.html.in and have formatdomain.html.in
link to that description.
This documentation faux pas was continued into the virsh man page where
the bandwidth description for both 'attach-interface' and 'domiftune'
did not indicate the format of each value, thus leading to the test using
largest unsigned integer value assuming "bps" rather than "kbps", which
ultimately was wrong.
If a domain has a different maximum for persistent and live maxmem
or max vcpus, then it is possible to hit cases where libvirt
refuses to adjust the current values or gets halfway through
the adjustment before failing. Better is to determine up front
if the change is possible for all requested flags.
Based on an idea by Geoff Franks.
* src/qemu/qemu_driver.c (qemuDomainSetMemoryFlags): Compute
correct maximum if both live and config are being set.
(qemuDomainSetVcpusFlags): Likewise.
The previous OOM testing support would re-run the entire "main"
method each iteration, failing a different malloc each time.
When a test suite has 'n' allocations, the number of repeats
requires is (n * (n + 1) ) / 2. This gets very large, very
quickly.
This new OOM testing support instead integrates at the
virtTestRun level, so each individual test case gets repeated,
instead of the entire test suite. This means the values of
'n' are orders of magnitude smaller.
The simple usage is
$ VIR_TEST_OOM=1 ./qemuxml2argvtest
...
29) QEMU XML-2-ARGV clock-utc ... OK
Test OOM for nalloc=36 .................................... OK
30) QEMU XML-2-ARGV clock-localtime ... OK
Test OOM for nalloc=36 .................................... OK
31) QEMU XML-2-ARGV clock-france ... OK
Test OOM for nalloc=38 ...................................... OK
...
the second lines reports how many mallocs have to be failed, and thus
how many repeats of the test will be run.
If it crashes, then running under valgrind will often show the problem
In the worse case, you might want to know the stack trace of the
alloc which was failed then VIR_TEST_OOM_TRACE can be set. If it
is set to 1 then it will only print if it thinks a mistake happened.
This is often not reliable, so setting it to 2 will make it print
the stack trace for every alloc that is failed.
Thorsten Behrens [Fri, 14 Feb 2014 17:49:04 +0000 (18:49 +0100)]
Widening API change - accept empty path for virDomainBlockStats
And provide domain summary stat in that case, for lxc backend.
Use case is a container inheriting all devices from the host,
e.g. when doing application containerization.
Jincheng Miao [Thu, 20 Feb 2014 09:29:15 +0000 (17:29 +0800)]
virsh: fix memleak when starting a guest with invalid fd
When start a guest with --pass-fd, if the argument of --pass-fd is invalid,
virsh will exit, but doesn't free the variable 'dom'.
The valgrind said:
...
==24569== 63 (56 direct, 7 indirect) bytes in 1 blocks are definitely lost in loss record 130 of 234
==24569== at 0x4C2A1D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24569== by 0x4E879A4: virAllocVar (viralloc.c:544)
==24569== by 0x4EBD625: virObjectNew (virobject.c:190)
==24569== by 0x4F3A18A: virGetDomain (datatypes.c:226)
==24569== by 0x4F9311F: remoteDomainLookupByName (remote_driver.c:6636)
==24569== by 0x4F44F20: virDomainLookupByName (libvirt.c:2277)
==24569== by 0x12F616: vshCommandOptDomainBy (virsh-domain.c:105)
==24569== by 0x131C79: cmdStart (virsh-domain.c:3330)
==24569== by 0x12C4AB: vshCommandRun (virsh.c:1752)
==24569== by 0x127001: main (virsh.c:3218)
Destroying a suspended domain needs special action.
We cannot simply terminate all process because they are frozen.
Do deal with that we send them SIGKILL and thaw them.
Upon wakeup the process sees the pending signal and dies immediately.
Signed-off-by: Richard Weinberger <richard@nod.at>
Ján Tomko [Thu, 20 Feb 2014 09:04:30 +0000 (10:04 +0100)]
Fix build of portallocator on mingw
IN6ADDR_ANY_INIT does not seem to be working as expected on MinGW:
error: missing braces around initializer [-Werror=missing-braces]
.sin6_addr = IN6ADDR_ANY_INIT,
Michal Privoznik [Wed, 19 Feb 2014 13:55:23 +0000 (14:55 +0100)]
networkRunHook: Run hook only if possible
Currently, networkRunHook() is called in networkAllocateActualDevice and
friends. These functions, however, doesn't necessarily work on networks,
For example, if domain's interface is defined in this fashion:
The networkAllocateActualDevice jumps directly onto 'validate' label as
the interface is not type of 'network'. Hence, @network is left
initialized to NULL and networkRunHook(network, ...) is called. One of
the things that the hook function does is dereference @network. Soupir.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>