With current code, if user calls virDomainPMSuspendForDuration()
followed by virDomainDestroy(), the former API checks for qemu agent
presence, which will evaluate as true (if agent is configured). While
talking to qemu agent, the qemu driver is unlocked, so the latter API
starts executing. However, if machine dies meanwhile, libvirtd gets
EOF on the agent socket and qemuProcessHandleAgentEOF() is called. The
handler clears reference to qemu agent while the destroy API already
holding a reference to it. This leads to NULL dereferencing later in
the code. Therefore, the agent pointer should be set to NULL only if
we are the exclusive owner of it.
Yufang Zhang [Wed, 9 Jan 2013 12:18:35 +0000 (20:18 +0800)]
build: move file deleting action from %files list to %install
When building libvirt rpms on rhel5, I got the following error:
File must begin with "/": rm
File must begin with "/": -f
File must begin with "/": $RPM_BUILD_ROOT/etc/sysctl.d/libvirtd
Installed (but unpackaged) file(s) found:
/etc/sysctl.d/libvirtd
It is triggerd by the %files list of libvirt daemon:
Eric Blake [Wed, 9 Jan 2013 21:36:25 +0000 (14:36 -0700)]
maint: distribute libvirtd.service.in
I did a build --without-libvirtd, then ran 'make dist'. The
resulting tarball was broken, with a complaint that make did not
know how to create libvirtd.service.in. I traced it to a use
of EXTRA_DIST inside a conditional.
* daemon/Makefile.am (EXTRA_DIST): Hoist libvirtd.service.in
outside of WITH_LIBVIRTD conditional.
When qemu/KVM VMs are paused manually through a monitor not-owned by libvirt,
libvirt will think of them as "paused" event after they are resumed and
effectively running. With this patch the discrepancy goes away.
This is meant to address bug 892791.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
John Ferlan [Tue, 8 Jan 2013 21:21:51 +0000 (16:21 -0500)]
tests: Remove remnants of removing the fake emulator output
Coverity determined that 'emulator' could no longer be set and determined the
code was dead. Looking through the history, I discovered commit-id ed769e18
removed code originally added by commit-id 9237e955 and further modified by
commit-id 6a7e7c4f.
In a non-systemd environment the post and preun scripts of libvirt-client
fail, since the required files are in libvirt-daemon. Moved them to client.
Doing that I noticed %{_unitdir}/libvirt-guests.service was contained in
both libvirt-client and libvirt-daemon, which I don't think was intended.
Removed the extra copy from daemon.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Eric Blake [Mon, 7 Jan 2013 22:50:41 +0000 (15:50 -0700)]
maint: avoid potential promotion issues with [ug]id_t
POSIX does not guarantee whether uid_t and gid_t are signed or
unsigned, nor does it guarantee whether they are smaller, same
size, or larger than int (or even the same size as one another).
Therefore, it is possible to have platforms where '(uid_t)-1==-1'
is false or where 'uid = gid = -1' sets uid to the wrong value,
thanks to integer promotion rules. The only portable way to use
the placeholder value of these two types is to always use a cast.
Thankfully, the issue is mostly theoretical - sanlock only
compiles on Linux for now, and on Linux, these types do not
suffer from strange promotion problems.
* src/locking/lock_driver_sanlock.c
(virLockManagerSanlockSetupLockspace, virLockManagerSanlockInit)
(virLockManagerSanlockCreateLease): Cast -1 to proper type before
comparing with uid_t or gid_t.
Currently, if there's no hard memory limit defined for a domain,
libvirt tries to calculate one, based on domain definition and magic
equation and set it upon the domain startup. The rationale behind was,
if there's a memory leak or exploit in qemu, we should prevent the
host system trashing. However, the equation was too tightening, as it
didn't reflect what the kernel counts into the memory used by a
process. Since many hosts do have a swap, nobody hasn't noticed
anything, because if hard memory limit is reached, process can
continue allocating memory on a swap. However, if there is no swap on
the host, the process gets killed by OOM killer. In our case, the qemu
process it is.
To prevent this, we need to relax the hard RSS limit. Moreover, we
should reflect more precisely the kernel way of accounting the memory
for process. That is, even the kernel caches are counted within the
memory used by a process (within cgroups at least). Hence the magic
equation has to be changed:
limit = 1.5 * (domain memory + total video memory) + (32MB for cache
per each disk) + 200MB
J.B. Joret [Mon, 7 Jan 2013 17:17:14 +0000 (18:17 +0100)]
S390: Add SCLP console front end support
The SCLP console is the native console type for s390 and is preferred
over the virtio console as it doesn't require special drivers and
is more efficient. Recent versions of QEMU come with SCLP support
which is hereby enabled.
The new target types 'sclp' and 'sclplm' can be used to specify a
SCLP console. Adding documentation, domain schema and XML processing
support.
Signed-off-by: J.B. Joret <jb@linux.vnet.ibm.com> Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
To avoid confusion between the LXC driver <-> controller
monitor RPC protocol and the libvirt-lxc.so <-> libvirtd public
RPC protocol, rename the former to lxc_monitor_protocol.x
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the libvirt client can pass FDs to the server, but the
dispatch mechanism provides no way to return FDs back from the
server to the client. Tweak the dispatch code, such that if a
dispatcher returns '1', this indicates that it populated the
virNetMessagePtr with FDs to return
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
John Ferlan [Mon, 7 Jan 2013 17:09:29 +0000 (12:09 -0500)]
xen: Avoid possible NULL dereference
Change calling sequence to only call xenUnifiedDomainSetVcpusFlags() when
'dom' is not NULL. Use the GET_PRIVATE() macro to reference privateData.
Just return -1 if dom is NULL.
Ensure we always setup a private mount namespace for LXC controller
The code for setting up a private /dev/pts for the containers
is also responsible for making the LXC controller have a
private mount namespace. Unfortunately the /dev/pts code is
not run if launching a container without a custom root. This
causes the LXC FUSE mount to leak into the host FS.
Since we daemonized QEMU for capabilities probing there is a long
time if QEMU fails to launch. This is because we're not passing in
any virDomainObjPtr instance and thus the monitor code can not
check to see if the PID is still alive.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Only initialize capabilities after setting dir permissions
The current code is initializing capabilities before setting
directory permissions. Thus the QEMU binaries being run may
not have the ability to create the UNIX monitor socket on
the first run of libvirtd.
Jim Fehlig [Mon, 7 Jan 2013 17:15:56 +0000 (10:15 -0700)]
build: Add libxenctrl to LIBXL_LIBS
Commit dfa1e1dd removed libxenctrl from LIBXL_LIBS, but the libxl
driver uses a symbol from this library. Explicitly link with
libxenctrl instead of relying on the build system to support
implicit DSO linking.
Eric Blake [Fri, 4 Jan 2013 21:21:59 +0000 (14:21 -0700)]
build: install libvirt sysctl file correctly
https://bugzilla.redhat.com/show_bug.cgi?id=887017 reports that
even though libvirt attempts to set fs.aio-max-nr via sysctl,
the file was installed with the wrong name and gets ignored by
sysctl. Furthermore, 'man systcl.d' recommends that packages
install into hard-coded /usr/lib/sysctl.d (even when libdir is
/usr/lib64), so that sysadmins can use /etc/sysctl.d for overrides.
Eric Blake [Fri, 4 Jan 2013 20:35:04 +0000 (13:35 -0700)]
build: use common .in replacement mechanism
We had several different styles of .in conversion in our Makefiles:
ALLCAPS, @ALLCAPS@, @lower@, ::lower::
Canonicalize on one form, to make it easier to copy and paste
between .in files.
Also, we were using some non-portable sed constructs: \@ is an
undefined escape sequence (it happens to be @ itself in GNU sed,
but POSIX allows it to mean something else), as well as risky
behavior (failure to consistently quote things means a space
in $(sysconfdir) could throw things off; also, Autoconf recommends
using | rather than , or ! in the s||| operator, because | has to
be quoted in shell and is therefore less likely to appear in file
names than , or !).
Osier Yang [Wed, 2 Jan 2013 14:37:11 +0000 (22:37 +0800)]
qemu: Check if the shared disk's cdbfilter conflicts with others
This prevents domain starting and disk attaching if the shared disk's
setting conflicts with other active domain(s), E.g. A domain with
"sgio" set as "filtered", however, another active domain is using
it set as "unfiltered".
Osier Yang [Wed, 2 Jan 2013 14:37:09 +0000 (22:37 +0800)]
conf: Parse and format the new XML
Like "rawio", "sgio" is only allowed for block disk of device
type "lun".
It doesn't default disk->sgio to "filtered" when parsing, as
it won't be able to distinguish explicitly requested "filtered"
and a default "filtered" in driver then. We have to error out for
explicit request when the kernel doesn't support the new sysfs
knob "unpriv_sgio", however, for defaulted "filtered", we can
just ignore it if the kernel doesn't support "unpriv_sgio".
Osier Yang [Wed, 2 Jan 2013 14:37:08 +0000 (22:37 +0800)]
docs: Add docs and rng schema for new XML tag sgio
This introduces new XML tag "sgio" for disk, its valid values
are "filtered" and "unfiltered", setting it as "filtered" will
set the disk's unpriv_sgio to 0, and "unfiltered" to set it
as 1, which allows the unprivileged SG_IO commands.
Osier Yang [Wed, 2 Jan 2013 14:37:07 +0000 (22:37 +0800)]
qemu: Add a hash table for the shared disks
This introduces a hash table for qemu driver, to store the shared
disk's info as (@major:minor, @ref_count). @ref_count is the number
of domains which shares the disk.
Since we only care about if the disk support unprivileged SG_IO
commands, and the SG_IO commands only make sense for block disk,
this patch only manages (add/remove hash entry) the shared disk for
block disk.
* src/qemu/qemu_conf.h: (Add member 'sharedDisks' of type
virHashTablePtr; Declare helpers
qemuGetSharedDiskKey, qemuAddSharedDisk
and qemuRemoveSharedDisk)
* src/qemu/qemu_conf.c (Implement the 3 helpers)
* src/qemu/qemu_process.c (Update 'sharedDisks' when domain
starting and shutdown)
* src/qemu/qemu_driver.c (Update 'sharedDisks' when attaching
or detaching disk).
Peter Krempa [Thu, 3 Jan 2013 13:20:09 +0000 (14:20 +0100)]
snapshot: qemu: Fix segfault and vanishing snapshots when redefining
When the disk alignment check done while redefining an existing snapshot
failed, the qemu driver attempted to free the existing snapshot. As in
the cleanup path the definition of the snapshot wasn't assigned, the
cleanup code dereferenced a NULL pointer.
This patch changes the behavior on error paths while redefining snapshot
in two ways:
1) On failure, modifications done on the snapshot definition object are
rolled back.
2) The previous definition of the data isn't freed until it's certain it
won't be needed any more.
This change avoids the segfault and additionally the snapshot doesn't
vanish if redefinition fails for some reason.
John Eckersberg [Wed, 2 Jan 2013 15:38:53 +0000 (10:38 -0500)]
conf: Add unix socket support to virChrdevOpen
This also changes the function signature to take a
virDomainChrSourceDefPtr instead of just a path, since it needs to
differentiate behavior based on source->type.
John Eckersberg [Wed, 2 Jan 2013 15:38:52 +0000 (10:38 -0500)]
conf: Rename console-specific identifiers to be more generic
The functionality provided in virchrdev.c (previously virconsole.c) is
applicable to other types of character devices besides consoles, such
as channels. This patch is just code motion, renaming things such as
"console" or "pty", instead using more general terms such as
"character device" or "device path".
John Eckersberg [Thu, 13 Dec 2012 16:24:16 +0000 (11:24 -0500)]
api: Add API to tunnel a guest channel via stream
This patch adds a new API, virDomainOpenChannel, that uses streams to
connect to a virtio channel on a guest. This creates a secure
communication channel between a guest and a libvirt client.
This behaves the same as virDomainOpenConsole, except on channels
instead of console/serial/parallel devices.
Eric Blake [Fri, 4 Jan 2013 23:36:57 +0000 (16:36 -0700)]
build: fix mingw rpm build
Commit d13155c changed which files get installed for the
libvirt-guests service, but did not touch up the mingw spec
file. As a result, rpmbuild complained:
Eric Blake [Fri, 4 Jan 2013 22:02:42 +0000 (15:02 -0700)]
network: fix check for ambiguous lookup
gcc -O2 complained:
../../src/conf/network_conf.c: In function 'virNetworkDefUpdateDNSSrv':
../../src/conf/network_conf.c:3232: error: 'foundIdx' may be used uninitialized in this function [-Wuninitialized]
It turned out to be a spurious warning (we didn't use foundIdx
unless foundCt was non-zero). But in investigating that, I noticed
a worse problem: we were using 'if (foundCt > 1)', but since foundCt
was bool, it could never be > 1.
* src/conf/network_conf.c (virNetworkDefUpdateDNSHost): Use
correct type.
(virNetworkDefUpdateDNSSrv): Likewise, and silence compiler
warning.
Since 4c993d8a we failed to set this important capability, which
allows starting a domain with QXL video card. We set DEVICE_QXL
capability bit instead, which is not necessary wrong. Anyway, if
qemu supports the new '-device qxl' it supports older '-vga qxl'
as well. The latter is used for the primary (the first) qxl video
card, the former for other video cards.
Ján Tomko [Thu, 3 Jan 2013 18:07:55 +0000 (19:07 +0100)]
qemu: fix a segfault in qemuProcessWaitForMonitor
Commit b3f2b4ca5cfe98b08ffdb96f0455e3e333e5ace6 left buf unallocated in
the case of QMP capability probing being used, leading to a segfault in
strlen in the cleanup path.
This patch opens the log and allocates the buffer if QMP probing was
used, so we can display the helpful error message.
qemu: Don't parse log output when starting up a domain
Despite our great effort we still parsed qemu log output.
We wouldn't notice unless upcoming qemu 1.4 changed the
format of the logs slightly. Anyway, now we should gather
all interesting knobs like pty paths from monitor. Moreover,
since for historical reasons the first console can be just
an alias to the first serial port, we need to check this and
copy the pty path if that's the case to the first console.
Eric Blake [Wed, 2 Jan 2013 18:10:42 +0000 (11:10 -0700)]
build: use autobuild module to make build logs nicer
A recent build failure made me realize that we could usefully add
a bit more information to configure output, for aid in analysis of
failed builds. Pulling in the autobuild module merely adds these
four lines to configure output:
This reverts commit 28224c4d2a2d623b9a0a714bc0454d47de5d7a35
which shouldn't be needed at all because with current qemu
we obtain all paths from 'query-chardev' output. We ought
not parse log output at all anymore.
Michal Privoznik [Fri, 21 Dec 2012 11:35:21 +0000 (12:35 +0100)]
sanlock: Chown lease files as well
Since sanlock doesn't run under root:root, we have chown()'ed the
__LIBVIRT__DISKS__ lease file to the user:group defined in the
sanlock config. However, when writing the patch I've forgot about
lease files for each disk (this is the
/var/lib/libvirt/sanlock/<md5>) file.
Michal Privoznik [Fri, 28 Dec 2012 15:22:09 +0000 (16:22 +0100)]
python: Adapt to virevent rename
With our recent renames under src/util/* we forgot to adapt
python wrapper code generator. This results in some methods being
not exposed:
$ python examples/domain-events/events-python/event-test.py
Using uri:qemu:///system
Traceback (most recent call last):
File "examples/domain-events/events-python/event-test.py", line 585, in <module>
main()
File "examples/domain-events/events-python/event-test.py", line 543, in main
virEventLoopPureStart()
File "examples/domain-events/events-python/event-test.py", line 416, in virEventLoopPureStart
virEventLoopPureRegister()
File "examples/domain-events/events-python/event-test.py", line 397, in virEventLoopPureRegister
libvirt.virEventRegisterImpl(virEventAddHandleImpl,
AttributeError: 'module' object has no attribute 'virEventRegisterImpl'
Michal Privoznik [Thu, 13 Dec 2012 11:19:58 +0000 (12:19 +0100)]
qemu: Convert some APIs to use qemuDomObjFromDomain
Many internal qemu APIs must find domain object from passed
virDomainPtr. And with function Peter's introduced, we can use it
instead of copying multiple lines among code.
S390: Re-enable capability probing for virtio devices.
Since we switched to QMP probing, the object types are spelled out
explicitly, i.e. virtio-net-pci. This has effectively disabled
the capability detection of s390 virtio devices. The trivial fix
is to add the s390 virtio types explicitly to qemuCapsObjectProps.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
commit ac2797cf2af2fd0e64c58a48409a8175d24d6f86 attempted to fix a
problem with netlink in newer kernels requiring an extra attribute
with a filter flag set in order to receive an IFLA_VFINFO_LIST from
netlink. Unfortunately, the #ifdef that protected against compiling it
in on systems without the new flag went a bit too far, assuring that
the new code would *never* be compiled, and even if it had, the code
was incorrect.
The first problem was that, while some IFLA_* enum values are also
their existence at compile time, IFLA_EXT_MASK *isn't* #defined, so
checking to see if it's #defined is not a valid method of determining
whether or not to add the attribute. Fortunately, the flag that is
being set (RTEXT_FILTER_VF) *is* #defined, and it is never present if
IFLA_EXT_MASK isn't, so it's sufficient to just check for that flag.
And to top it off, due to the code not actually compiling when I
thought it did, I didn't realize that I'd been given the wrong arglist
to nla_put() - you can't just send a const value to nla_put, you have
to send it a pointer to memory containing what you want to add to the
message, along with the length of that memory.
This time I've actually sent the patch over to the other machine
that's experiencing the problem, applied it to the branch being used
(0.10.2) and verified that it works properly, i.e. it does fix the
problem it's supposed to fix. :-/
The code for doing a block-copy was supposed to track the destination
file in drive->mirror, but was set up to do all mallocs prior to
starting the copy so that OOM wouldn't leave things partially started.
However, the wrong variable was being written; later in the code we
silently did 'disk->mirror = mirror' which was still NULL, and thus
leaking memory and leaving libvirt to think that the mirror job was
never started, which prevented a pivot operation after a copy.
Problem introduced in commit 35c7701c6.