When lock failure is detected by sanlock, our sanlock_helper kill script
will try to restart (shutdown followed by start) the affected domain
when RESTART action is configured for it. While shutting down kills QEMU
and removes all its leases (which is what sanlock wants to happen),
trying to start it again just hangs because libvirt tries reacquire the
locks in the failed lock space. Hence, this action cannot be supported
by sanlock driver.
Sanlock expects that the configured kill script either kills the PID on
lock failure or removes all locks the PID owns. If none of the two
options happen, sanlock will reboot the host. Although IGNORE action is
supposed to ignore the request to kill the PID or remove all leases,
it's certainly not designed to cause the host to be rebooted. That said,
IGNORE action is incompatible with sanlock and should be forbidden by
libvirt.
Peter Krempa [Tue, 25 Mar 2014 07:17:27 +0000 (08:17 +0100)]
util: Sanitize ATTRIBUTE_NONNULL use in viriscsi.h
Some of the function attributes marked as nonnull actually explicitly
handle the arguments for NULL. All changed functions handle missing
"initiatoriqn" argument well and virISCSIScanTargets also handles well
if the return pointers are missing. Remove some of the liberaly used
ATTRIBUTE_NONNULLs as coverity and possibly other compilers that honor
the attribute fail to compile the code.
Qiao Nuohan [Sun, 23 Mar 2014 03:51:12 +0000 (11:51 +0800)]
add new virDomainCoreDumpWithFormat API
--memory-only option is introduced without compression supported. Now qemu
has support for dumping domain's memory in kdump-compressed format. This
patch adds a new virDomainCoreDumpWithFormat API, so that the format in
which qemu dumps domain's memory can be specified.
Signed-off-by: Qiao Nuohan <qiaonuohan@cn.fujitsu.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Eric Blake [Wed, 19 Mar 2014 17:11:16 +0000 (11:11 -0600)]
conf: prepare to track multiple host source files per <disk>
It's finally time to start tracking disk backing chains in
<domain> XML. The first step is to start refactoring code
so that we have an object more convenient for representing
each host source resource in the context of a single guest
<disk>. Ultimately, I plan to move the new type into src/util
where it can be reused by virStorageFile, but to make the
transition easier to review, this patch just creates the
new type then fixes everything until it compiles again.
Eric Blake [Mon, 17 Mar 2014 18:53:15 +0000 (12:53 -0600)]
conf: use disk source accessors in conf/
Part of a series of cleanups to use new accessor methods.
Several places in domain_conf.c still open-code raw field access,
but that code will be touched later with the diskDef struct split
so I'm avoiding churn here.
Eric Blake [Mon, 17 Mar 2014 17:39:57 +0000 (11:39 -0600)]
conf: accessors for common source information
A future patch will split virDomainDiskDef, in order to track
multiple host resources per guest <disk>. To reduce the size
of that patch, I've factored out the four most common accesses
into functions, so that I can incrementally upgrade the code
base to use the accessors, and so that code that doesn't care
about the distinction of per-file details won't have to be
changed when the struct changes.
Eric Blake [Tue, 18 Mar 2014 21:41:28 +0000 (15:41 -0600)]
vmware: fix parse of disk source
While writing disk source refactoring, I discovered that conversion
from XML to vmware modified the disk source in place; if the same
code is reached twice, the second call behaves differently because
the first call didn't clean up its mess.
I get the SIGSEGV when starting the domain. The thing is, when
starting a domain, we check for its disk presence. For some reason,
when determining the disk chain, we parse the <seclabel/> (don't ask
me why). However, there's no label attribute in the XML, so we end up
calling virParseOwnershipIds() over NULL string:
[Switching to Thread 0x7ffff10c4700 (LWP 30956)]
__strchr_sse42 () at ../sysdeps/x86_64/multiarch/strchr.S:136
136 ../sysdeps/x86_64/multiarch/strchr.S: No such file or directory.
(gdb) bt
#0 __strchr_sse42 () at ../sysdeps/x86_64/multiarch/strchr.S:136
#1 0x00007ffff749f800 in virParseOwnershipIds (label=0x0, uidPtr=uidPtr@entry=0x7ffff10c2df0, gidPtr=gidPtr@entry=0x7ffff10c2df4) at util/virutil.c:2115
#2 0x00007fffe929f006 in qemuDomainGetImageIds (gid=0x7ffff10c2df4, uid=0x7ffff10c2df0, disk=0x7fffe40cb000, vm=0x7fffe40a6410, cfg=0x7fffe409ae00) at qemu/qemu_domain.c:2385
#3 qemuDomainDetermineDiskChain (driver=driver@entry=0x7fffe40120e0, vm=vm@entry=0x7fffe40a6410, disk=disk@entry=0x7fffe40cb000, force=force@entry=false) at qemu/qemu_domain.c:2414
#4 0x00007fffe929f128 in qemuDomainCheckDiskPresence (driver=driver@entry=0x7fffe40120e0, vm=vm@entry=0x7fffe40a6410, cold_boot=cold_boot@entry=true) at qemu/qemu_domain.c:2250
#5 0x00007fffe92b6fc8 in qemuProcessStart (conn=conn@entry=0x7fffd4000b60, driver=driver@entry=0x7fffe40120e0, vm=vm@entry=0x7fffe40a6410, migrateFrom=migrateFrom@entry=0x0, stdin_fd=stdin_fd@entry=-1, stdin_path=stdin_path@entry=0x0, snapshot=snapshot@entry=0x0,
vmop=vmop@entry=VIR_NETDEV_VPORT_PROFILE_OP_CREATE, flags=flags@entry=1) at qemu/qemu_process.c:3813
#6 0x00007fffe93087e8 in qemuDomainObjStart (conn=0x7fffd4000b60, driver=driver@entry=0x7fffe40120e0, vm=vm@entry=0x7fffe40a6410, flags=flags@entry=0) at qemu/qemu_driver.c:6051
#7 0x00007fffe9308e32 in qemuDomainCreateWithFlags (dom=0x7fffcc000d50, flags=0) at qemu/qemu_driver.c:6105
#8 0x00007ffff753c5cc in virDomainCreate (domain=domain@entry=0x7fffcc000d50) at libvirt.c:8861
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Chegu Vinod [Thu, 6 Feb 2014 23:44:36 +0000 (15:44 -0800)]
libvirt support to force convergence of live guest migration
Busy enterprise workloads hosted on large sized VM's tend to dirty
memory faster than the transfer rate achieved via live guest migration.
Despite some good recent improvements (& using dedicated 10Gig NICs
between hosts) the live migration may NOT converge.
Recently support was added in qemu (version 1.6) to allow a user to
choose if they wish to force convergence of their migration via a
new migration capability : "auto-converge". This feature allows for qemu
to auto-detect lack of convergence and trigger a throttle-down of the
VCPUs.
This patch includes the libvirt support needed to trigger this
feature. (Testing is in progress)
Wang Yufei [Thu, 20 Mar 2014 07:14:01 +0000 (07:14 +0000)]
cgroup: Fix start VMs coincidently failed
When I start multi VMs coincidently and any of the cgroup directories
named machine doesn't exist. There's a chance that VM start failed because
of creating directory failed:
Unable to initialize /machine cgroup: File exists
When the errno returned by mkdir in virCgroupMakeGroup is EEXIST,
we should pass it through and continue to start the VM. Signed-off-by: Wang Yufei <james.wangyufei@huawei.com>
The caller may not want all DBus error conditions to be turned
into libvirt errors, so provide a way for the caller to get
back the full DBusError object. They can then check the errors
and only report those that they consider to be fatal.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce alternate way to encode/decode arrays in DBus messages
Currently the DBus helper APIs require the values for an array
to be passed inline in the variadic argument list. This change
introduces support for passing arrays using a pointer to a plain
C array of the basic type. This is of particular benefit for
decoding messages when you don't know how many array elements
are being received.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The dbus_connection_send_with_reply_and_block method will
automatically call dbus_set_error_from_message for us. We
mistakenly thought we had todo it because of a flaw in the
systemd unit test mock impl. The latter should have directly
set the error object, instead of creating an error message
object.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The virDBusMessageRead method should not have side-effects on
the message parameter passed in, so unref'ing it is wrong.
The caller should unref only when they decided they are done
with it.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add DBus helper methods for creating reply messages
The test suites often have to create DBus method reply messages
with payloads. Create two helpers for simplifying the process
of creating replies with payloads.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Zhou Yimin [Fri, 21 Mar 2014 06:52:40 +0000 (14:52 +0800)]
virlog: Modify virLogParseDefaultPriority's comment of return value
virLogParseDefaultPriority's successful return value is the same as
virLogSetDefaultPriority's successful return value. So it should be 0
rather than the parsed log level.
Jiri Denemark [Thu, 20 Mar 2014 21:34:00 +0000 (22:34 +0100)]
conf: Introduce virDomainDeviceGetInfo API
The offset of virDomainDeviceInfo structure within a device definition
varies with device type and some types do not contain the info structure
at all. This new API makes it easier to access the info structure from a
generic virDomainDeviceDef structure.
Jiri Denemark [Thu, 20 Mar 2014 12:39:20 +0000 (13:39 +0100)]
Pass action to virDomainDefCompatibleDevice
When checking compatibility of a device with a domain definition, we
should know what we're going to do with the device. Because we may need
to check for different things when we're attaching a new device versus
detaching an existing device.
Jiri Denemark [Thu, 20 Mar 2014 12:04:06 +0000 (13:04 +0100)]
Fix usage of virDomainDefCompatibleDevice
A device needs to be checked for compatibility with the domain
definition it corresponds to. Specifically, for VIR_DOMAIN_AFFECT_CONFIG
case we should check against persistent def rather than active def.
When qemu dies early after connecting to its monitor but before we
actually try to read something from the monitor, we would just fail
domain start with useless message:
"An error occurred, but the cause is unknown"
This is because the real error gets reported in a monitor EOF handler
executing within libvirt's event loop.
The fix is to take any error set in qemuMonitor structure and propagate
it into the thread-local error when qemuMonitorClose is called and no
thread-local error is set.
Eric Blake [Thu, 6 Feb 2014 21:46:52 +0000 (14:46 -0700)]
qemu: allow filtering events by regex
When listening for a subset of monitor events, it can be tedious
to register for each event name in series; nicer is to register
for multiple events in one go. Implement a flag to use regex
interpretation of the event filter.
While at it, prove how much I hate the shift key, by adding a
way to filter for 'shutdown' instead of 'SHUTDOWN'. :)
Eric Blake [Fri, 31 Jan 2014 14:02:25 +0000 (07:02 -0700)]
qemu: enable monitor event filtering by name
Filtering monitor events by name requires tracking the name for
the duration of the filtering. In order to free the name, I
found it easiest to just piggyback on the user's freecb function,
which gets called when the event is deregistered.
For events without a name filter, we have the design of multiple
client registrations sharing a common server registration, because
the server side uses the same callback function and we reject
duplicate use of the same function. But with events in the mix,
we want to be able to allow the same function pointer to be used
with more than one event name. The solution is to tweak the
duplicate detection code to only act when there is no additional
filtering; if name filtering is in use, there is exactly one
client registration per server registration. Yes, this means
that there is no longer a bound on the number of server
registrations possible, so a malicious client could repeatedly
register for the same name event to exhaust server memory. On
the other hand, we already restricted monitor events to require
write access (compared to normal events only needing read access),
and separated it into the intentionally unsupported
libvirt-qemu.so, with documentation that using this function is
for debug purposes only; so it is not a security risk worth
worrying about a client trying to abuse multiple registrations.
* src/conf/domain_event.c (virDomainQemuMonitorEventData): New
struct.
(virDomainQemuMonitorEventFilter)
(virDomainQemuMonitorEventCleanup): New functions.
(virDomainQemuMonitorEventDispatchFunc)
(virDomainQemuMonitorEventStateRegisterID): Use new struct.
* src/conf/object_event.c (virObjectEventCallbackListCount)
(virObjectEventCallbackListAddID)
(virObjectEventCallbackListRemoveID)
(virObjectEventCallbackListMarkDeleteID): Drop duplicate detection
when filtering is in effect.
Eric Blake [Thu, 30 Jan 2014 00:14:44 +0000 (17:14 -0700)]
qemu: enable monitor event reporting
Wire up all the pieces to send arbitrary qemu events to a
client using libvirt-qemu.so. If the extra bookkeeping of
generating event objects even when no one is listening turns
out to be noticeable, we can try to further optimize things
by adding a counter for how many connections are using events,
and only dump events when the counter is non-zero; but for
now, I didn't think it was worth the code complexity.
* src/qemu/qemu_driver.c
(qemuConnectDomainQemuMonitorEventRegister)
(qemuConnectDomainQemuMonitorEventDeregister): New functions.
* src/qemu/qemu_monitor.h (qemuMonitorEmitEvent): New prototype.
(qemuMonitorDomainEventCallback): New typedef.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONIOProcessEvent):
Report events.
* src/qemu/qemu_monitor.c (qemuMonitorEmitEvent): New function, to
pass events through.
* src/qemu/qemu_process.c (qemuProcessHandleEvent): Likewise.
Eric Blake [Wed, 29 Jan 2014 22:30:44 +0000 (15:30 -0700)]
qemu: wire up RPC for qemu monitor events
These are the first async events in the qemu protocol, so this
patch looks rather big compared to most RPC additions. However,
a large majority of this patch is just mechanical copy-and-paste
from recently-added network events. It didn't help that this
is also the first virConnect rather than virDomain prefix
associated with a qemu-specific API.
* src/remote/qemu_protocol.x (qemu_*_domain_monitor_event_*): New
structs and RPC messages.
* src/rpc/gendispatch.pl: Adjust naming conventions.
* daemon/libvirtd.h (daemonClientPrivate): Track qemu events.
* daemon/remote.c (remoteClientFreeFunc): Likewise.
(remoteRelayDomainQemuMonitorEvent)
(qemuDispatchConnectDomainMonitorEventRegister)
(qemuDispatchConnectDomainMonitorEventDeregister): New functions.
* src/remote/remote_driver.c (qemuEvents): Handle qemu events.
(doRemoteOpen): Register for events.
(remoteNetworkBuildEventLifecycle)
(remoteConnectDomainQemuMonitorEventRegister)
(remoteConnectDomainQemuMonitorEventDeregister): New functions.
* src/qemu_protocol-structs: Regenerate.
Eric Blake [Tue, 31 Dec 2013 13:33:42 +0000 (06:33 -0700)]
qemu: create object for qemu monitor events
Create qemu monitor events as a distinct class to normal domain
events, because they will be filtered differently. For ease of
review, the logic for filtering by event name is saved for a later
patch.
* src/conf/domain_event.c (virDomainQemuMonitorEventClass): New
class.
(virDomainEventsOnceInit): Register it.
(virDomainQemuMonitorEventDispose, virDomainQemuMonitorEventNew)
(virDomainQemuMonitorEventDispatchFunc)
(virDomainQemuMonitorEventStateRegisterID): New functions.
* src/conf/domain_event.h (virDomainQemuMonitorEventNew)
(virDomainQemuMonitorEventStateRegisterID): New prototypes.
* src/libvirt_private.syms (conf/domain_conf.h): Export them.
Eric Blake [Thu, 19 Dec 2013 14:06:35 +0000 (07:06 -0700)]
qemu: new API for tracking arbitrary monitor events
Several times in the past, qemu has implemented a new event,
but libvirt has not yet caught up to reporting that event to
the user applications. While it is possible to track libvirt
logs to see that an unknown event was received and ignored,
it would be nicer to copy what 'virsh qemu-monitor-command'
does, and expose this information to the end developer as
one of our unsupported qemu-specific commands.
If you find yourself needing to use this API for more than
just development purposes, please ask on the libvirt list
for a supported counterpart event to be added in libvirt.so.
While the supported virConnectDomainEventRegisterAny() API
takes an id which determines the signature of the callback,
this version takes a string filter and always uses the same
signature. Furthermore, I chose to expose this as a new API
instead of trying to add a new eventID at the top level, in
part because the generic option lacks event name filtering,
and in part because the normal domain event namespace should
not be polluted by a qemu-only event. I also added a flags
argument; unused for now, but we might decide to use it to
allow a user to request event names by glob or regex instead
of literal match.
This API intentionally requires full write access (while
normal event registration is allowed on read-only clients);
this is in part due to the fact that it should only be used
by debugging situations, and in part because the design of
per-event filtering in later patches ended up allowing for
duplicate registrations that could potentially be abused to
exhaust server memory - requiring write privileges means
that such abuse will not serve as a denial of service attack
against users with higher privileges.
* include/libvirt/libvirt-qemu.h
(virConnectDomainQemuMonitorEventCallback)
(virConnectDomainQemuMonitorEventRegister)
(virConnectDomainQemuMonitorEventDeregister): New prototypes.
* src/libvirt-qemu.c (virConnectDomainQemuMonitorEventRegister)
(virConnectDomainQemuMonitorEventDeregister): New functions.
* src/libvirt_qemu.syms (LIBVIRT_QEMU_1.2.1): Export them.
* src/driver.h (virDrvConnectDomainQemuMonitorEventRegister)
(virDrvConnectDomainQemuMonitorEventDeregister): New callbacks.
Ján Tomko [Thu, 20 Mar 2014 15:35:00 +0000 (16:35 +0100)]
Ignore missing files on pool refresh
If we cannot stat/open a file on pool refresh, returning -1 aborts
the refresh and the pool is undefined.
Only treat missing files as fatal unless VolOpenCheckMode is called
with the VIR_STORAGE_VOL_OPEN_ERROR flag. If this flag is missing
(when it's called from virStorageBackendProbeTarget in
virStorageBackendFileSystemRefresh), only emit a warning and return
-2 to let the caller skip over the file.
Ján Tomko [Thu, 20 Mar 2014 15:42:52 +0000 (16:42 +0100)]
Ignore char devices in storage pools by default
Without this, using /dev/mapper as a directory pool
fails in virStorageBackendUpdateVolTargetInfoFD:
cannot seek to end of file '/dev/mapper/control': Illegal seek
Require K&R styled curly braces around function bodies
Although not explicitly requested, we are using K&R (or Kernel)
indentation for curly braces around functions in HACKING file and most
of the code. Using grep -P, this patch add the syntax-check rule for
it (while skipping all the false positives with foreach constructs).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>