Doug Goldstein [Sat, 6 Oct 2012 19:20:25 +0000 (14:20 -0500)]
interface: add udev based backend for virInterface
Add a read-only udev based backend for virInterface. Useful for distros
that do not have netcf support yet. Multiple libvirt based utilities use
a HAL based fallback when virInterface is not available which is less
than ideal. This implements:
* virConnectNumOfInterfaces()
* virConnectListInterfaces()
* virConnectNumOfDefinedInterfaces()
* virConnectListDefinedInterfaces()
* virConnectListAllInterfaces()
* virConnectInterfaceLookupByName()
* virConnectInterfaceLookupByMACString()
Eric Blake [Tue, 9 Oct 2012 03:44:36 +0000 (21:44 -0600)]
hooks: let virCommand do the error reporting
The code was reporting raw exit status without decoding it into
normal vs. signal exit. virCommandRun already does this, but
with a different error type, so all we have to do is recast
the error to the correct type.
Reported by li guang.
Marcelo Cerri [Tue, 9 Oct 2012 12:57:58 +0000 (09:57 -0300)]
doc: update description about user/group in qemu.conf
As a side effect of changes in the functions virGetUserID and
virGetGroupID, the user and group configurations for DAC in qemu.conf
are now able to accept both names and IDs, supporting a leading plus
sign to ensure that a numeric value will not be interpreted as a name.
This patch updates the comments in qemu.conf, including a description of
this new behavior.
Jiri Denemark [Tue, 9 Oct 2012 06:25:02 +0000 (08:25 +0200)]
qemu: Fix QMP detection of QXL graphics
With the recent introduction of QMP capabilities probing, libvirt failed
to detect support for QXL graphics in QEMU 1.2 and newer. In addition to
fixing that, this patch also causes libvirt to detect QXL support for
qemu-kvm-0.13.0, which doesn't advertise it in -help output but mentions
it in device list. Since qemu-kvm-0.13.0 supported -spice, it looks like
not having qxl in -help was a bug.
Marcelo Cerri [Mon, 8 Oct 2012 20:37:02 +0000 (17:37 -0300)]
security: update user and group parsing in security_dac.c
The functions virGetUserID and virGetGroupID are now able to parse
user/group names and IDs in a similar way to coreutils' chown. So, user
and group parsing in security_dac can be simplified.
Marcelo Cerri [Mon, 8 Oct 2012 20:37:01 +0000 (17:37 -0300)]
util: extend virGetUserID and virGetGroupID to support names and IDs
This patch updates virGetUserID and virGetGroupID to be able to parse a
user or group name in a similar way to coreutils' chown. This means that
a numeric value with a leading plus sign is always parsed as an ID,
otherwise the functions try to parse the input first as a user or group
name and if this fails they try to parse it as an ID.
This patch includes Peter Krempa's changes to correctly handle errors
returned by getpwnam_r and getgrnam_r.
Matthias Bolte [Sat, 6 Oct 2012 18:09:20 +0000 (20:09 +0200)]
Call curl_global_init from virInitialize to avoid thread-safety issues
curl_global_init is not thread-safe. curl_easy_init might call
curl_global_init when it was no called before. But curl_easy_init
can be called from different threads by the ESX driver. Therefore,
call curl_global_init from virInitialize to stop curl_easy_init from
calling it.
When both kvmclock and kvm_pv_eoi are configured (either disabled or
enabled) libvirt will generate invalid CPU specification due to the
fact that even though kvmclock causes the CPU to be specified, it
doesn't set have_cpu flag to true (and the new kvm_pv_eoi as well).
This patch fixes the issue and adds a test exactly for that to show
that it is fixed correctly (and also to keep it that way in the future
of course).
python: keep consistent handling of Python integer conversion
libvirt_ulonglongUnwrap requires the integer type of python obj.
But libvirt_longlongUnwrap still could handle python obj of
Pyfloat_type which causes the float value to be rounded up
to an integer.
For example
>>> dom.setSchedulerParameters({'vcpu_quota': 0.88})
0
libvirt_longlongUnwrap treats 0.88 as a valid value 0
However
>>> dom.setSchedulerParameters({'cpu_shares': 1000.22})
libvirt_ulonglongUnwrap will throw out an error
"TypeError: an integer is required"
libvirt_virDomainGetVcpus: add error handling, return -1 instead of None
libvirt_virDomainPinVcpu and libvirt_virDomainPinVcpuFlags:
check the type of argument
make use of libvirt_boolUnwrap
Set bitmap according to these values which are contained in given
argument of vcpu tuple and turn off these bit corresponding to
missing vcpus in argument tuple
The original way ignored the error info from PyTuple_GetItem
if index is out of range.
"IndexError: tuple index out of range"
The error message will only be raised on next command in interactive mode.
esx: Disable libcurl's use of signals to fix a segfault
libcurl uses a SIGALRM in combination with sigsetjmp/siglongjmp to be
able to abort a DNS lookup when it takes too long. The problem with this
in a multi-threaded application is that the signal handler for SIGALRM
and the call to siglongjmp can be executed on a thread that is different
from the one that initially did the SIGALRM setup and the call to
sigsetjmp. In the reported case this triggered a segfault.
Disable libcurl's use of signals to avoid this situation. This has the
disadvantage of losing the ability to abort synchronous DNS lookups which
might result in libcurl getting stuck in a DNS lookup in the worst case.
When libcurl was build with an asynchronous DNS backend such as c-ares
then there is no problem because the timeout mechanism works without
signals here anyway.
The output buffer for virFileReadAll was too small for systems with
more than 30 CPUs which leads to a log entry and incorrect behavior.
The new size will be sufficient for the current
architectural limits.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Eric Blake [Fri, 5 Oct 2012 14:53:17 +0000 (08:53 -0600)]
build: fix VPATH builds
This reverts part of commit 5468594f465; the perl changes in that
patch were sufficient. Since libvirt.syms is already a generated
file created in part from libvirt_private.syms, we don't need a
second pass over libvirt_private.syms in isolation.
* src/Makefile.am: Undo addition of check-private-symfile.
Currently, we are checking if libvirt.so contains public symbols.
However, sometimes we rename an internal symbol and forget to
change libvirt_private.syms accordingly. Hence, it's safer to check
for internal symbols as well.
Eric Blake [Thu, 4 Oct 2012 16:02:02 +0000 (10:02 -0600)]
spec: prefer canonical name of util-linux
I noticed that in two places, we require util-linux, and in a third,
we require util-linux-ng. On Fedora (I tested F15 through rawhide),
util-linux-ng is obsoleted by util-linux; on RHEL 6, util-linux
is obsoleted by util-linux-ng. That is, on either platform, either
name will get you the correct package installed (where the preferred
name on fedora is util-linux, and on RHEL 6 is util-linux-ng). But
on RHEL 5, there is no util-linux-ng
* libvirt.spec.in (Requires): Use util-linux, not util-linux-ng.
Kyle Mestery [Wed, 3 Oct 2012 19:46:55 +0000 (15:46 -0400)]
Correct checking of virStrcpyStatic() return value
Correct the check for the return value of virStrcpyStatic()
when copying port-profile names. Fixes Open vSwitch ports
which utilize port-profiles from network definitions.
Marcelo Cerri [Tue, 2 Oct 2012 17:57:36 +0000 (14:57 -0300)]
security: also parse user/group names instead of just IDs for DAC labels
The DAC driver is missing parsing of group and user names for DAC labels
and currently just parses uid and gid. This patch extends it to support
names, so the following security label definition is now valid:
When it tries to parse an owner or a group, it first tries to resolve it as
a name, if it fails or it's an invalid user/group name then it tries to
parse it as an UID or GID. A leading '+' can also be used for both owner and
group to force it to be parsed as IDs, so the following example is also
valid:
Eric Blake [Tue, 2 Oct 2012 14:37:00 +0000 (08:37 -0600)]
build: avoid -Wno-format on new-enough gcc
Commit c579d6b added a sledgehammer to silence spurious warnings from
gcc 4.2, but in the process, it also silenced useful warnings from
gcc 4.3 through 4.5. As a result, a bug slipped in to commit 0caccb58.
Tested with FreeBSD (gcc 4.2.1), RHEL 6.3 (gcc 4.4), and F17 (gcc 4.7.2),
where the former didn't trip on spurious warnings, and where the latter
two detected a revert of 2b804cf.
* m4/virt-compile-warnings.m4 (-Wno-format): Probe for the actual
spurious message, to once again allow gcc 4.4 to use -Wformat.
CC libvirt_driver_qemu_impl_la-qemu_capabilities.lo
../../src/qemu/qemu_capabilities.c: In function 'qemuCapsInitQMP':
../../src/qemu/qemu_capabilities.c:2327:13: error: format '%d' expects argument of type 'int', but argument 8 has type 'const char *' [-Werror=format]
* src/qemu/qemu_capabilities.c (qemuCapsInitQMP): Use correct format.
Jiri Denemark [Tue, 2 Oct 2012 08:59:56 +0000 (10:59 +0200)]
qemu: Kill processes used for QMP caps probing
Since libvirt switched to QMP capabilities probing recently, it starts
QEMU process used for this probing with -daemonize, which means
virCommandAbort can no longer reach these processes. As a result of
that, restarting libvirtd will leave several new QEMU processes behind.
Let's use QEMU's -pidfile and use it to kill the process when QMP caps
probing is done.
Peter Krempa [Mon, 1 Oct 2012 15:58:29 +0000 (17:58 +0200)]
qemu: Use proper agent entering function when freezing filesystems
When doing snapshots, the filesystem freeze function used the agent
entering function that expects the qemud_driver unlocked. This might
cause a deadlock of the qemu driver if the agent does not respond.
The only call path of this function has the qemud_driver locked, so this
patch changes the entering functions to those expecting the driver
locked.
There was an inverted return value in lxcCgroupControllerActive().
The function assumes cgroups are active and do couple of checks
to prove that. If any of them fails, false is returned. Therefore,
at the end, after all checks are done we must return true, not false.
Eric Blake [Mon, 1 Oct 2012 22:38:56 +0000 (16:38 -0600)]
build: avoid journald on rhel 5
Commit f6430390 broke builds on RHEL 5, where glibc (2.5) is too
old to support mkostemp (2.7) or htole64 (2.9). While gnulib
has mkostemp, it still lacks htole64; and it's not worth dragging
in replacements on systems where journald is unlikely to exist
in the first place, so we just use an extra configure-time check
as our witness of whether to attempt compiling the code.
* src/util/logging.c (virLogParseOutputs): Don't attempt to
compile journald on older glibc.
* configure.ac (AC_CHECK_DECLS): Check for htole64.
Eric Blake [Mon, 1 Oct 2012 15:10:20 +0000 (09:10 -0600)]
build: avoid infinite autogen loop
Several people have reported that if the .gnulib submodule is dirty,
then 'make' will go into an infinite loop attempting to rerun bootstrap,
because that never cleans up the dirty submodule. By default, we
should halt and make the user investigate, but if the user doesn't
know why or care that the submodule is dirty, I also added the ability
to 'make CLEAN_SUBMODULE=1' to get things going again.
Also, while testing this, I noticed that when a submodule update was
needed, 'make' would first run autoreconf, then bootstrap (which
reruns autoreconf); adding a strategic dependency allows for less work.
* .gnulib: Update to latest, for maint.mk improvements.
* cfg.mk (_autogen): Also hook maint.mk, to run before autoreconf.
* autogen.sh (bootstrap): Refuse to run if gnulib is dirty, unless
user requests discarding gnulib changes.
build: default selinuxfs mount point to /sys/fs/selinux
Currently if you build on a machine that does not support SELinux we end up
with the default mount point being /selinux, since this is moved to
/sys/fs/selinux, we should start defaulting there.
I believe this is causing a bug in libvirt-lxc when /selinux does not exists,
even though /sys/fs/selinux exists.
and talk QMP over stdio to discover what capabilities the
binary supports. This works for QEMU 1.2.0 or later and
for older QEMU automatically fallback to the old approach
of parsing -help and related command line args.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some architectures provide the query-cpu-definitions command,
but are set to always return a "GenericError" from it :-(
Catch this & treat it as if there was an empty list of CPUs
returned
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Avoid bogus I/O event errors when closing the QEMU monitor
After calling qemuMonitorClose(), it is still possible for
the QEMU monitor I/O event callback to get invoked. This
will trigger an error message because mon->fd has been set
to -1 at this point. Silently ignore the case where mon->fd
is -1, likewise for mon->watch being zero.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove need to pass in a virDomainObjPtr instance to qemuMonitorOpen
The qemuMonitorOpen method only needs a virDomainObjPtr in order
to access the QEMU pid. This is not critical when detecting the
QEMU capabilties, so can easily be skipped
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The help output for QEMU 1.2.0 changed 'pci-assign' to 'kvm-pci-assign'.
Since the new capabilities code does exact device name matching
instead of substring matching, this caused the capabilities to go
missing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Probe to see if the systemd journal is accessible, and if
so enable logging to the journal by default, rather than
stderr (current default under systemd).
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add support for logging to the systemd journal, using its
simple client library. The benefit over syslog is that it
accepts structured log data, so the journald can store
individual items like code file/line/func separately from
the string message. Tools which require structured log
data can then query the journal to extract exactly what
they desire without resorting to string parsing
While systemd provides a simple client library for logging,
it is more convenient for libvirt to directly write its
own client code. This lets us build up the iovec's on
the stack, avoiding the need to alloc memory when writing
log messages.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Refactor qemuCapsParseDeviceStr to work from data tables
Currently the qemuCapsParseDeviceStr method has a bunch of open
coded string searches/comparisons to detect devices and their
properties. Soon this data will be obtained from QMP queries
instead of -device help output. Maintaining the list of device
and properties in two places is undesirable. Thus the existing
qemuCapsParseDeviceStr() method needs to be refactored to
separate the device types and properties from the actual
search code.
Thus the -device help output is now parsed to construct a
list of device names, and device properties. These are then
checked against a set of datatables to set the capability
flags
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'const char *category' parameter only has a few possible
values now that the filename has been separated. Turn this
parameter into an enum instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the logging APIs have a 'const char *category' parameter
which indicates where the log message comes from. This is typically
a combination of the __FILE__ string and other prefix. Split the
__FILE__ off into a dedicated parameter so it can passed to the
log outputs
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
- Move '{' to a new line after funtion declaration
- Put each parameter on a new line to avoid long lines
- Put return type on new line
- Leave 2 blank lines between functions
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Benjamin Cama [Wed, 26 Sep 2012 19:02:20 +0000 (21:02 +0200)]
network: fix dnsmasq/radvd binding to IPv6 on recent kernels
I hit this problem recently when trying to create a bridge with an IPv6
address on a 3.2 kernel: dnsmasq (and, further, radvd) would not bind to
the given address, waiting 20s and then giving up with -EADDRNOTAVAIL
(resp. exiting immediately with "error parsing or activating the config
file", without libvirt noticing it, BTW). This can be reproduced with (I
think) any kernel >= 2.6.39 and the following XML (to be used with
"virsh net-create"):
The problem is that since commit [1] (which, ironically, was made to
“help IPv6 autoconfiguration”) the linux bridge code makes bridges
behave like “real” devices regarding carrier detection. This makes the
bridges created by libvirt, which are started without any up devices,
stay with the NO-CARRIER flag set, and thus prevents DAD (Duplicate
address detection) from happening, thus letting the IPv6 address flagged
as “tentative”. Such addresses cannot be bound to (see RFC 2462), so
dnsmasq fails binding to it (for radvd, it detects that "interface XXX
is not RUNNING", thus that "interface XXX does not exist, ignoring the
interface" (sic)). It seems that this behavior was enhanced somehow with
commit [2] by avoiding setting NO-CARRIER on empty bridges, but I
couldn't reproduce this behavior on my kernel. Anyway, with the “dummy
tap to set MAC address” trick, this wouldn't work.
To fix this, the idea is to get the bridge's attached device to be up so
that DAD can happen (deactivating DAD altogether is not a good idea, I
think). Currently, libvirt creates a dummy TAP device to set the MAC
address of the bridge, keeping it down. But even if we set this device
up, it is not RUNNING as soon as the tap file descriptor attached to it
is closed, thus still preventing DAD. So, we must modify the API a bit,
so that we can get the fd, keep the tap device persistent, run the
daemons, and close it after DAD has taken place. After that, the bridge
will be flagged NO-CARRIER again, but the daemons will be running, even
if not happy about the device's state (but we don't really care about
the bridge's daemons doing anything when no up interface is connected to
it).
Other solutions that I envisioned were:
* Keeping the *-nic interface up: this would waste an fd for each
bridge during all its life. May be acceptable, I don't really
know.
* Stop using the dummy tap trick, and set the MAC address directly
on the bridge: it is possible since quite some time it seems,
even if then there is the problem of the bridge not being
RUNNING when empty, contrary to what [2] says, so this will need
fixing (and this fix only happened in 3.1, so it wouldn't work
for 2.6.39)
* Using the --interface option of dnsmasq, but I saw somewhere
that it's not used by libvirt for backward compatibility. I am
not sure this would solve this problem, though, as I don't know
how dnsmasq binds itself to it with this option.
This is why this patch does what's described earlier.
This patch also makes radvd start even if the interface is
“missing” (i.e. it is not RUNNING), as it daemonizes before binding to
it, and thus sometimes does it after the interface has been brought down
by us (by closing the tap fd), and then originally stops. This also
makes it stop yelling about it in the logs when the interface is down at
a later time.
Move command/event capabilities detection out of QEMU monitor code
The qemuMonitorSetCapabilities() API is used to initialize the QMP
protocol capabilities. It has since been abused to initialize some
libvirt internal capabilities based on command/event existance too.
Move the latter code out into qemuCapsProbeQMP() in the QEMU
capabilities source file instead
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetTargetArch() method for QMP query-target command
Add a new qemuMonitorGetTargetArch() method to support invocation
of the 'query-target' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetObjectProps() method for QMP device-list-properties command
Add a new qemuMonitorGetObjectProps() method to support invocation
of the 'device-list-properties' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetObjectTypes() method for QMP qom-list-types command
Add a new qemuMonitorGetObjectTypes() method to support invocation
of the 'qom-list-types' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetEvents() method for QMP query-events command
Add a new qemuMonitorGetEvents() method to support invocation
of the 'query-events' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckEvents() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetCommands() method for QMP query-commands command
Add a new qemuMonitorGetCPUCommands() method to support invocation
of the 'query-commands' JSON monitor command. No HMP equivalent
is required, since this will only be used when JSON is available
The existing qemuMonitorJSONCheckCommands() method is refactored
to use this new method
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetCPUDefinitions method for QMP query-cpu-definitions command
Add a new qemuMonitorGetCPUDefinitions() method to support invocation
of the 'query-cpu-definitions' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetMachines() method for QMP query-machines command
Add a new qemuMonitorGetMachines() method to support invocation
of the 'query-machines' JSON monitor command. No HMP equivalent
is required, since this will only be present for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add a qemuMonitorGetVersion() method for QMP query-version command
Add a new qemuMonitorGetVersion() method to support invocation
of the 'query-version' JSON monitor command. No HMP equivalent
is provided, since this will only be used for QEMU >= 1.2
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Make qemuCapsProbeMachineTypes & qemuCapsProbeCPUModels static
The qemuCapsProbeMachineTypes & qemuCapsProbeCPUModels methods
do not need to be invoked directly anymore. Make them static
and refactor them to directly populate the qemuCapsPtr object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove probing of CPU models when launching QEMU guests
When launching a QEMU guest the binary is probed to discover
the list of supported CPU names. Remove this probing with a
simple lookup of CPU models in the qemuCapsPtr object. This
avoids another invocation of the QEMU binary during the
startup path.
As a nice benefit we can now remove all the nasty hacks from
the test suite which were done to avoid having to exec QEMU
on the test system. The building of the -cpu command line
can just rely on data we pre-populate in qemuCapsPtr.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove probing of machine types when canonicalizing XML
When XML for a new guest is received, the machine type is
immediately canonicalized into the version specific name.
This involves probing QEMU for supported machine types.
Replace this probing with a lookup of the machine types
in the (hopefully cached) qemuCapsPtr object
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove probing of flags when launching QEMU guests
Remove all use of the existing APIs for querying QEMU
capability flags. Instead obtain a qemuCapsPtr object
from the global cache. This avoids the execution of
'qemu -help' (and related commands) when launching new
guests.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Switch over to use cache for building QEMU capabilities
When building up a virCapsPtr instance, the QEMU driver
was copying the list of machine types across from the
previous virCapsPtr instance, if the QEMU binary had not
changed. Replace this ad-hoc caching of data with use
of the new qemuCapsCache global cache.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a qemuCapsCachePtr object to provide a global cache
of capabilities for QEMU binaries. The cache auto-populates
on first request for capabilities about a binary, and will
auto-refresh if the binary has changed since a previous cache
was populated
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the qemuAgentClose method is called from a place which holds
the domain lock, it is theoretically possible to get a deadlock
in the agent destroy callback. This has not been observed, but
the equivalent code in the QEMU monitor destroy callback has seen
a deadlock.
Remove the redundant locking while unrefing the object and the
bogus assignment
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Use size_t instead of int for virDomainDefPtr struct
Many parts of virDomainDefPtr were using 'int' variables as
array length counts. Replace all these with size_t and update
various format strings & API signatures to adapt
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some users report (very rarely) seeing a deadlock in the QEMU
monitor callbacks
Thread 10 (Thread 0x7fcd11e20700 (LWP 26753)):
#0 0x00000030d0e0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000030d0e09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000030d0e09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007fcd162f416d in virMutexLock (m=<optimized out>)
at util/threads-pthread.c:85
#4 0x00007fcd1632c651 in virDomainObjLock (obj=<optimized out>)
at conf/domain_conf.c:14256
#5 0x00007fcd0daf05cc in qemuProcessHandleMonitorDestroy (mon=0x7fcccc0029e0,
vm=0x7fcccc00a850) at qemu/qemu_process.c:1026
#6 0x00007fcd0db01710 in qemuMonitorDispose (obj=0x7fcccc0029e0)
at qemu/qemu_monitor.c:249
#7 0x00007fcd162fd4e3 in virObjectUnref (anyobj=<optimized out>)
at util/virobject.c:139
#8 0x00007fcd0db027a9 in qemuMonitorClose (mon=<optimized out>)
at qemu/qemu_monitor.c:860
#9 0x00007fcd0daf61ad in qemuProcessStop (driver=driver@entry=0x7fcd04079d50,
vm=vm@entry=0x7fcccc00a850,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED, flags=flags@entry=0)
at qemu/qemu_process.c:4057
#10 0x00007fcd0db323cf in qemuDomainDestroyFlags (dom=<optimized out>,
flags=<optimized out>) at qemu/qemu_driver.c:1977
#11 0x00007fcd1637ff51 in virDomainDestroyFlags (
domain=domain@entry=0x7fccf00c1830, flags=1) at libvirt.c:2256
At frame #10 we are holding the domain lock, we call into
qemuProcessStop() to cleanup QEMU, which triggers the monitor
to close, which invokes qemuProcessHandleMonitorDestroy() which
tries to obtain the domain lock again. This is a non-recursive
lock, hence hang.
Since qemuMonitorPtr is a virObject, the unref call in
qemuProcessHandleMonitorDestroy no longer needs mutex
protection. The assignment of priv->mon = NULL, can be
instead done by the caller of qemuMonitorClose(), thus
removing all need for locking.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Move most of qemuProcessKill into virProcessKillPainfully
In the cgroups APIs we have a virCgroupKillPainfully function
which does the loop sending SIGTERM, then SIGKILL and waiting
for the process to exit. There is similar functionality for
simple processes in qemuProcessKill, but it is tangled with
the QEMU code. Untangle it to provide a virProcessKillPainfuly
function
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When calling qemuProcessKill from the virDomainDestroy impl
in QEMU, do not ignore the return value. This ensures that
if QEMU fails to respond to SIGKILL, the caller will know
about the failure.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Depending on the scenario in which LXC containers exit, it is
possible for the EOF callback of the LXC monitor to deadlock
the driver.
#0 0x00000038a0a0de4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00000038a0a09ca6 in _L_lock_840 () from /lib64/libpthread.so.0
#2 0x00000038a0a09ba8 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x00007f4bd9579d55 in virMutexLock (m=<optimized out>) at util/threads-pthread.c:85
#4 0x00007f4bcacc7597 in lxcDriverLock (driver=0x7f4bc40c8290) at lxc/lxc_conf.h:81
#5 virLXCProcessMonitorEOFNotify (mon=<optimized out>, vm=0x7f4bb4000b00) at lxc/lxc_process.c:581
#6 0x00007f4bd9645c91 in virNetClientCloseLocked (client=client@entry=0x7f4bb4009e60)
at rpc/virnetclient.c:554
#7 0x00007f4bd96460f8 in virNetClientIOEventLoopPassTheBuck (thiscall=0x0, client=0x7f4bb4009e60)
at rpc/virnetclient.c:1306
#8 virNetClientIOEventLoopPassTheBuck (client=0x7f4bb4009e60, thiscall=0x0)
at rpc/virnetclient.c:1287
#9 0x00007f4bd96467a2 in virNetClientCloseInternal (reason=3, client=0x7f4bb4009e60)
at rpc/virnetclient.c:589
#10 virNetClientCloseInternal (client=0x7f4bb4009e60, reason=3) at rpc/virnetclient.c:561
#11 0x00007f4bcacc4a82 in virLXCMonitorClose (mon=0x7f4bb4000a00) at lxc/lxc_monitor.c:201
#12 0x00007f4bcacc55ac in virLXCProcessCleanup (reason=<optimized out>, vm=0x7f4bb4000b00,
driver=0x7f4bc40c8290) at lxc/lxc_process.c:240
#13 virLXCProcessStop (driver=0x7f4bc40c8290, vm=vm@entry=0x7f4bb4000b00,
reason=reason@entry=VIR_DOMAIN_SHUTOFF_DESTROYED) at lxc/lxc_process.c:735
#14 0x00007f4bcacc5bd2 in virLXCProcessAutoDestroyDom (payload=<optimized out>,
name=0x7f4bb4003c80, opaque=0x7fff41af2df0) at lxc/lxc_process.c:94
#15 0x00007f4bd9586649 in virHashForEach (table=0x7f4bc409b270,
iter=iter@entry=0x7f4bcacc5ab0 <virLXCProcessAutoDestroyDom>, data=data@entry=0x7fff41af2df0)
at util/virhash.c:514
#16 0x00007f4bcacc52d7 in virLXCProcessAutoDestroyRun (driver=driver@entry=0x7f4bc40c8290,
conn=conn@entry=0x7f4bb8000ab0) at lxc/lxc_process.c:120
#17 0x00007f4bcacca628 in lxcClose (conn=0x7f4bb8000ab0) at lxc/lxc_driver.c:128
#18 0x00007f4bd95e67ab in virReleaseConnect (conn=conn@entry=0x7f4bb8000ab0) at datatypes.c:114
When the driver calls virLXCMonitorClose, there is really no
need for the EOF callback to be invoked in this case, since
the caller can easily handle events itself. In changing this,
the monitor needs to take a deep copy of the callback list,
not merely a reference.
Also adds debug statements in various places to aid
troubleshooting
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Laine Stump [Fri, 21 Sep 2012 16:50:53 +0000 (12:50 -0400)]
network: backend for virNetworkUpdate of interface list
<interface> elements are location inside the <forward> element of a
network. There is only one <forward> element in any network, but it
might have many <interface> elements. This element only contains a
single attribute, "dev", which is the name of a network device
(e.g. "eth0").
Since there is only a single attribute, the modify operation isn't
supported for this "section", only add-first, add-last, and
delete. Also, note that it's not permitted to delete an interface from
the list while any guest is using it. We may later decide this is safe
(because removing it from the list really only excludes it from
consideration in future guest allocations of interfaces, but doesn't
affect any guests currently connected), but for now this limitation
seems prudent (of course when changing the persistent config, this
limitation doesn't apply, because the persistent config doesn't
support the concept of "in used").
Another limitation - it is also possible for the interfraces in this
list to be described by PCI address rather than netdev name. However,
I noticed while writing this function that we currently don't support
defining interfaces that way in config - the only method of getting
interfaces specified as <adress type='pci' ..../> instead of
<interface dev='xx'/> is to provide a <pf dev='yy'/> element under
forward, and let the entries in the interface list be automatically
populated with the virtual functions (VF) of the physical function
device given in <pg>.
As with the other virNetworkUpdate section backends, support for this
section is completely contained within a single static function, no
other changes were required, and only functions already called from
elsewhere within the same file are used in the new content for this
existing function (i.e., adding this code should not cause a new build
problem on any platform).
Eric Blake [Wed, 26 Sep 2012 17:16:36 +0000 (11:16 -0600)]
build: avoid older gcc warning
Jim Fehlig reported a compilation error with older gcc 4.3.4:
libvirt.c: In function 'virDomainGetEmulatorPinInfo':
libvirt.c:9111: error: logical '&&' with non-zero constant will always evaluate as true [-Wlogical-op]
It looks like someone programmed via too much copy-and-paste.
* src/libvirt.c (virDomainGetEmulatorPinInfo): Multiplying by 1 is
a no-op, and thus will never overflow.
Michal Privoznik [Thu, 20 Sep 2012 09:15:31 +0000 (11:15 +0200)]
qemu: wait for SPICE to migrate
Recently, there have been some improvements made to qemu so it
supports seamless migration or something very close to it.
However, it requires libvirt interaction. Once qemu is migrated,
the SPICE server needs to send its internal state to the destination.
Once it's done, it fires SPICE_MIGRATE_COMPLETED event and this
fact is advertised in 'query-spice' output as well.
We must not kill qemu until SPICE server finishes the transfer.
SELinux wants all log files opened with O_APPEND. When
running non-root though, libvirtd likes to use O_TRUNC
to avoid log files growing in size indefinitely. Instead
of using O_TRUNC though, we can use O_APPEND and then
call ftruncate() which keeps SELinux happier.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>