qemu: fix live pinning to memory node on NUMA system
Ever since the subcpusets(vcpu,emulator) were introduced, the parent
cpuset cannot be modified to remove the nodes that are in use by the
subcpusets.
The fix is to break the memory node modification into three steps:
1. assign new nodes into the parent,
2. change the nodes in the child nodes,
3. remove the old nodes on the parent node.
The storageRegister() didn't check the return from the
virRegisterStorageDriver() like other callers did, so Coverity
flagged it. Just check the return and handle.
The x509dname is only set inside a WITH_GNUTLS conditional, so
when used/check later on for NULL, Coverity detects this is not
possible. Added WITH_GNUTLS around uses to remove message
John Ferlan [Mon, 2 Dec 2013 19:36:51 +0000 (14:36 -0500)]
nwfilter: Remove Coverity DEADCODE warning
The nwfilterStateInitialize() would only assign sysbus inside
a WITH_DBUS conditional, thus leaving a subsequent check for sysbus
and nwfilterDriverInstallDBusMatches() as a no-op
Rather than try to add WITH_DBUS conditions which ended up conflicting
with the usage of HAVE_FIREWALLD conditionals, just remove the WITH_DBUS
since virdbus.c has entry points for with and without conditions.
GEN access/viraccessapichecklxc.h
GEN hyperv/hyperv_wmi.generated.h
GEN access/viraccessapichecklxc.c
GEN hyperv/hyperv_wmi.generated.c
GEN hyperv/hyperv_wmi_classes.generated.typedef
GEN hyperv/hyperv_wmi_classes.generated.h
GEN hyperv/hyperv_wmi_classes.generated.c
GEN libvirt_access_qemu.xml
GEN libvirt_access.syms
GEN libvirt_access_lxc.xml
GEN libvirt_access_qemu.syms
GEN libvirt_access_lxc.syms
GEN libvirt_qemu.def
GEN esx/esx_vi_types.generated.typedef
GEN esx/esx_vi_types.generated.typeenum
GEN esx/esx_vi_types.generated.typetostring
GEN esx/esx_vi_types.generated.typefromstring
GEN esx/esx_vi_types.generated.h
GEN esx/esx_vi_types.generated.c
GEN esx/esx_vi_methods.generated.h
GEN esx/esx_vi_methods.generated.c
GEN esx/esx_vi_methods.generated.macro
GEN esx/esx_vi.generated.h
GEN esx/esx_vi.generated.c
GEN libvirt_lxc.def
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The BZ asked for the capability to change the number of queues used by
a virtio-net device while the device is in use. Because the number of
queues can only be set at the time the device is created, that isn't
possible. However, libvirt also shouldn't be silently reporting
success when someone tries to change the number of queues. So this
patch flags that as an error (just as attempts to change any of the
other virtio-specific parameters already do).
Currently, initialization of drivers is done in a separate thread. This
is done for several reasons: a driver that is initialized may require
running event loop, it may take ages to initialize driver (e.g. due to
autostarting domains). While the thread is spawn and run, the main()
continues its execution. However, if something goes bad, or the event
loop is just exited (e.g. due to a --timeout or SIGINT) we try to
cleanup all the drivers. So we have two threads running Initialize() and
Cleanup() concurrently. This may result in accessing stale pointers -
e.g. netcf driver will free() itself in stateCleanup callback, while the
init thread may come, open a dummy connection in order to autostart some
domains and voilà: do_open() iterates over interface drivers and
accesses stale netcf driver.
The fix consists in not running stateCleanup if the init thread is still
running.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
(which was already closed as CANTFIX because the qemu "-boot strict"
commandline option wasn't available at the time).
Problem: you couldn't have a domain that used PXE to boot, but also
had an un-bootable disk device *even if that disk wasn't listed in the
boot order*, because if PXE timed out (e.g. due to the bridge
forwarding delay), the BIOS would move on to the next target, which
would be the unbootable disk device (again - even though it wasn't
given a boot order), and get stuck at a "BOOT DISK FAILURE, PRESS ANY
KEY" message until a user intervened.
The solution available since sometime around QEMU 1.5, is to add
"-boot strict=on" to *every* qemu command. When this is done, if any
devices have a boot order specified, then QEMU will *only* attempt to
boot from those devices that have an explicit boot order, ignoring the
rest.
Commit f094aaac48a6 changed the PCI device assignment in qemu domains
to default to using VFIO rather than legacy KVM device assignment
(when VFIO is available). It didn't change which driver was used by
default for virNodeDeviceDetachFlags(), though, so that API (and the
virsh nodedev-detach command) was still binding to the pci-stub
driver, used by legacy KVM assignment, by default.
This patch publicizes (only within the qemu module, though, so no
additions to the symbol exports are needed) the functions that check
for presence of KVM and VFIO device assignment, then uses those
functions to decide what to do when no driver is specified for
virNodeDeviceDetachFlags(); if the vfio driver is loaded, the device
will be bound to vfio-pci, or if legacy KVM assignment is supported on
this system, the device will be bound to pci-stub; if neither method
is available, the detach will fail.
Peter Krempa [Mon, 25 Nov 2013 16:51:04 +0000 (17:51 +0100)]
qemu: snapshots: Declare supported and unsupported snapshot configs
Currently the snapshot code did not check if it actually supports
snapshots on various disk backends for domains. To avoid future problems
add checkers that whitelist the supported configurations.
Peter Krempa [Thu, 21 Nov 2013 17:43:59 +0000 (18:43 +0100)]
qemu: Refactor disk source string formatting
This patch adds function qemuGetDriveSourceString to produce
qemu-compatible disk source strings that will enable to reuse the code
and refactors building of the qemu commandline of disks to use this new
helper.
Peter Krempa [Mon, 18 Nov 2013 14:25:03 +0000 (15:25 +0100)]
qemu: Simplify call pattern of qemuBuildDriveURIString
Automatically assign secret type from the disk source definition and
pull in adding of the comma. Then update callers to keep generated
output the same.
Peter Krempa [Wed, 20 Nov 2013 09:37:31 +0000 (10:37 +0100)]
qemu: Refactor qemuTranslateDiskSourcePool
Before this patch, the translation function still needs a second ugly
helper function to actually format the command line for qemu. But if we
do the right stuff in the translation function, we don't have to bother
with the second function any more.
This patch removes the messy qemuBuildVolumeString function and changes
qemuTranslateDiskSourcePool to set stuff up correctly so that the
regular code paths meant for volumes can be used to format the command
line correctly.
For this purpose a new helper "qemuDiskGetActualType()" is introduced to
return the type of the volume in a pool.
As a part of the refactor the qemuTranslateDiskSourcePool function is
fixed to do decisions based on the pool type instead of the volume type.
This allows to separate pool-type-specific stuff more clearly and will
ease addition of other pool types that will require certain other
operations to get the correct pool source.
The previously fixed tests should make sure that we don't break stuff
that was working before.
The virDomainGetBlockJobInfo method did not zero out the
virDomainBlockJobInfo pointer arg, so when block jobs were
not active it would return garbage for the bandwidth/cur/end
fields.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Chen Hanxiao [Thu, 28 Nov 2013 12:01:55 +0000 (20:01 +0800)]
docs: fix typos in libvirt.h.in
s/causes/cause/
Each event callback has a single detail parameter, and can
thus only report a single cause. Also, make all the sub-event
documentation use similar wording.
Signed-off-by: Chen Hanxiao <chenhanxiao@cn.fujitsu.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Peter Krempa [Mon, 25 Nov 2013 13:51:25 +0000 (14:51 +0100)]
qemu: snapshot: Detect internal snapshots also for sheepdog and RBD
When doing an internal snapshot on a VM with sheepdog or RBD disks we
would not set a flag to mark the domain is using internal snapshots and
might end up creating a mixed snapshot. Move the setting of the variable
to avoid this problem.
Peter Krempa [Tue, 12 Nov 2013 10:37:04 +0000 (11:37 +0100)]
snapshot: conf: Use common parsing and formatting functions for source
Disk source elements for snapshots were using separate code from our
config parser. As snapshots can be stored on more than just regular
files, we will need the universal parser to allow us to expose a variety
of snapshot disk targets. This patch reuses the config parsers and
formatters to do the job.
This initial support only changes the code without any visible XML
change.
Peter Krempa [Thu, 7 Nov 2013 15:03:03 +0000 (16:03 +0100)]
conf: Clean up virDomainDiskSourceDefFormatInternal
Avoid if statements when used with virBufferEscapeString which
automaticaly omits the whole string. Also add some line breaks to
visualy separate the code.
Peter Krempa [Thu, 7 Nov 2013 14:41:30 +0000 (15:41 +0100)]
conf: Support disk source formatting without needing a virDomainDiskDefPtr
The <source> element formatting function was expecting a
virDomainDiskDefPtr to store the data. As snapshots are not using this
data structure to hold the data, we need to add an internal function
which splits out individual fields separately.
Peter Krempa [Wed, 20 Nov 2013 15:04:10 +0000 (16:04 +0100)]
test: Implement fake storage pool driver in qemuxml2argv test
To support testing of "volume" disk backing, we need to implement a few
disk driver backend functions.
The fake storage driver uses files in storagepoolxml2xmlout/POOLNAME.xml
as XML files for pool definitions and volume names are in format
"VOL_TYPE+VOL_PATH". By default type "block" is assumed (for iSCSI test
compatibility).
The choice of this approach along with implemented functions was made so
that <disk type='volume'> can be tested in the xml2argv test.
Ryota Ozaki [Sun, 1 Dec 2013 14:46:06 +0000 (23:46 +0900)]
vbox: handle errors of virDomainHostdevDefAlloc correctly
The original code ignored errors of virDomainHostdevDefAlloc,
however, we should properly do error return from the function
if it occurs.
The fix pulls out virDomainHostdevDefAlloc from the loop and
executes it all together before the loop. So we can easily
return on errors without the notion of other memory allocations
in the loop.
The deallocation code is separated from the allocation code
because it will be used by a further patch for fixing other error
handlings.
Reported-by: Laine Stump <laine@laine.org> Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com>
Ryota Ozaki [Sun, 1 Dec 2013 14:46:05 +0000 (23:46 +0900)]
vbox: fix incorrect loop condition in vboxHostDeviceGetXMLDesc
The fixed loop used logical OR to combine two conditions, however,
it is apparently incorrect and logical AND is correct.
We can fix it by replacing OR with AND, but this patch instead
fixes the problem by getting rid of the first conditional
statement: USBFilterCount < def->nhostdevs. It isn't needed
because USBFilterCount will never be greater than or equal to
def->nhostdevs.
def->nhostdevs is calculated in the following code
above the loop in question like this:
for (i = 0; i < deviceFilters.count; i++) {
PRBool active = PR_FALSE;
IUSBDeviceFilter *deviceFilter = deviceFilters.items[i];
deviceFilter->vtbl->GetActive(deviceFilter, &active);
if (active) {
def->nhostdevs++;
}
}
And the loop is constructed as like this:
for (i = 0; (USBFilterCount < def->nhostdevs) || (i < deviceFilters.count); i++) {
PRBool active = PR_FALSE;
(snip)
deviceFilter->vtbl->GetActive(deviceFilter, &active);
if (!active)
continue;
(snip)
USBFilterCount++;
}
So def->nhostdevs is the number of active device filters and
USBFilterCount is counted up only when a device filter is active.
Thus, we can remove USBFilterCount < def->nhostdevs safely.
Reported-by: Laine Stump <laine@laine.org> Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com>
While running nwfilterxml2xmltest, it was found that valgrind pointed out the
following error...
==7466== 16 bytes in 1 blocks are definitely lost in loss record 26 of 90
==7466== at 0x4A06B6F: calloc (vg_replace_malloc.c:593)
==7466== by 0x4C651AD: virAlloc (viralloc.c:142)
==7466== by 0x4D0450D: virNWFilterDefParseNode (nwfilter_conf.c:2575)
==7466== by 0x4D05D84: virNWFilterDefParse (nwfilter_conf.c:2647)
==7466== by 0x401FDE: testCompareXMLToXMLHelper (nwfilterxml2xmltest.c:39)
==7466== by 0x402DE1: virtTestRun (testutils.c:138)
==7466== by 0x4018E9: mymain (nwfilterxml2xmltest.c:111)
==7466== by 0x403482: virtTestMain (testutils.c:593)
==7466== by 0x341F421A04: (below main) (libc-start.c:225)
...21 times, which are related to 21 tests in nwfilterxml2xmltest.c which sent
EXPECT_WARN = false. There were two scenarios in virNWFilterDefParseXML(),
when the variable 'entry' was malloc'ed, but not freed.
This patch fixes the memory leaks found while running qemuxml2argvtest
==8260== 3 bytes in 1 blocks are definitely lost in loss record 1 of
129
==8260== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==8260== by 0x341F485E21: strdup (strdup.c:42)
==8260== by 0x4CADCFF: virStrdup (virstring.c:554)
==8260== by 0x4CBB839: virXPathString (virxml.c:90)
==8260== by 0x4CE753A: virDomainDefParseXML (domain_conf.c:11478)
==8260== by 0x4CEB4FE: virDomainDefParseNode (domain_conf.c:12742)
==8260== by 0x4CEB675: virDomainDefParse (domain_conf.c:12684)
==8260== by 0x425958: testCompareXMLToArgvHelper (qemuxml2argvtest.c:107)
==8260== by 0x427111: virtTestRun (testutils.c:138)
==8260== by 0x41D3FE: mymain (qemuxml2argvtest.c:452)
==8260== by 0x4277B2: virtTestMain (testutils.c:593)
==8260== by 0x341F421A04: (below main) (libc-start.c:225)
==8260==
==8260== 4 bytes in 1 blocks are definitely lost in loss record 5 of
129
==8260== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==8260== by 0x341F485E21: strdup (strdup.c:42)
==8260== by 0x4CADCFF: virStrdup (virstring.c:554)
==8260== by 0x4CBB839: virXPathString (virxml.c:90)
==8260== by 0x4CE753A: virDomainDefParseXML (domain_conf.c:11478)
==8260== by 0x4CEB4FE: virDomainDefParseNode (domain_conf.c:12742)
==8260== by 0x4CEB675: virDomainDefParse (domain_conf.c:12684)
==8260== by 0x425958: testCompareXMLToArgvHelper (qemuxml2argvtest.c:107)
==8260== by 0x427111: virtTestRun (testutils.c:138)
==8260== by 0x41D39A: mymain (qemuxml2argvtest.c:451)
==8260== by 0x4277B2: virtTestMain (testutils.c:593)
==8260== by 0x341F421A04: (below main) (libc-start.c:225)
==8260==
The basic problem is that during a network update, the required
iptables rules sometimes change, and this was being handled by simply
removing and re-adding the rules. However, the removal of the old
rules was done based on the *new* state of the network, which would
mean that some of the rules would not match those currently in the
system, so the old rules wouldn't be removed.
This patch removes the old rules prior to updating the network
definition then adds the new rules as soon as the definition is
updated. Note that this could lead to a stray packet or two during the
interim, but that was already a problem before (the period of limbo is
now just slightly longer).
While moving the location for the rules, I added a few more sections
that should result in the iptables rules being redone:
DHCP_RANGE and DHCP_HOST - these are needed because adding/removing a dhcp
host entry could lead to the dhcp service being started/stopped, which
would require that the mangle rule that fixes up dhcp response
checksums sould need to be added/removed, and this wasn't being done.
Eric Blake [Wed, 27 Nov 2013 21:59:52 +0000 (14:59 -0700)]
tests: fix virpcitest with read-only srcdir
'make distcheck' has been broken since commit 21685c9; basically,
it emulates the case of a read-only $(srcdir) (such as building
from a tarball exploded onto a CD-ROM), but we were creating our
fake pci device as a symlink into $(srcdir) and failing when that
requires opening the config file for writing:
3) testVirPCIDeviceReset ... libvirt: error : Failed to open config space file '/sys/bus/pci/devices/0000:00:01.0/config': Permission denied
Fix it by copying rather than symlinking.
* tests/virpcimock.c (make_file): Add parameter to allow binary
creation; adjust all callers.
(pci_device_new_from_stub): Copy rather than symlink.
Eric Blake [Wed, 27 Nov 2013 21:31:53 +0000 (14:31 -0700)]
tests: guarantee abs_srcdir in all C tests
While trying to debug a failure of virpcitest during 'make distcheck',
I noticed that with a VPATH build, 'cd tests; ./virpcitest' fails for
an entirely different reason. To reproduce the distcheck failure, I
had to run 'cd tests; abs_srcdir=/path/to/src ./virpcitest'. But we
document in HACKING that all of our tests are supposed to be runnable
without requiring extra environment variables.
The solution: hardcode the location of srcdir into the just-built
binaries, rather than requiring make to prepopulate environment
variables. With this, './virpcitest' passes even in a VPATH build
(provided that $(srcdir) is writable; a followup patch will fix the
conditions required by 'make distcheck'). [Note: the makefile must
still pass on directory variables to the test environment of shell
scripts, since those aren't compiled. So while this solves the case
of a compiled test, it still requires environment variables to pass
a VPATH build of any shell script test case that relies on srcdir.]
* tests/Makefile.am (AM_CFLAGS): Define abs_srcdir in all compiled
tests.
* tests/testutils.h (abs_srcdir): Quit declaring.
* tests/testutils.c (virtTestMain): Rely on define rather than
environment variable.
* tests/virpcimock.c (pci_device_new_from_stub): Rely on define.
* tests/cputest.c (mymain): Adjust abs_top_srcdir default.
* tests/qemuxml2argvtest.c (mymain): Likewise.
* tests/qemuxmlnstest.c (mymain): Likewise.
Eric Blake [Wed, 27 Nov 2013 03:57:05 +0000 (20:57 -0700)]
storage: skip selinux cleanup when fd not available
When attempting to backport gluster pools to an older versoin
where there is no VIR_STRDUP, I got a crash from calling
strdup(,NULL). Rather than relying on the current else branch
safely doing nothing when there is no fd, it is easier to just
skip it. While at it, there's no need to explicitly set
perms.label to NULL after a VIR_FREE().
* src/storage/storage_backend.c
(virStorageBackendUpdateVolTargetInfoFD): Minor optimization.
Bing Bu Cao [Wed, 27 Nov 2013 10:52:12 +0000 (18:52 +0800)]
qemu: preserve netdev MAC address during 'domxml-to-native'
The virsh command 'domxml-to-native' (virConnectDomainXMLToNative())
converts all network devices to "type='ethernet'" in order to make it
more likely that the generated command could be run directly from a
shell (other libvirt network device types end up referencing file
descriptors for tap devices assumed to have been created by libvirt,
which can't be done in this case).
During this conversion, all of the netdev parameters are cleared out,
then specific items are filled in after changing the type. The MAC
address was not one of these preserved items, and the result was that
mac addresses in the generated commandlines were always
00:00:00:00:00:00.
This patch saves the mac address before the conversion, then
repopulates it afterwards, so the proper mac addresses show up in the
commandline.
Signed-off-by: Bing Bu Cao <mars@linux.vnet.ibm.com> Signed-off-by: Laine Stump <laine@laine.org>
Eric Blake [Mon, 25 Nov 2013 21:38:22 +0000 (14:38 -0700)]
storage: don't read storage volumes in nonblock mode
Commit 348b4e2 introduced a potential problem (thankfully not
in any release): we are attempting to use virFileReadHeaderFD()
on a file that was opened with O_NONBLOCK. While this
shouldn't be a problem in practice (because O_NONBLOCK
typically doesn't affect regular or block files, and fifos and
sockets cannot be storage volumes), it's better to play it safe
to avoid races from opening an unexpected file type while also
avoiding problems with having to handle EAGAIN while read()ing.
Based on a report by Dan Berrange.
* src/storage/storage_backend.c
(virStorageBackendVolOpenCheckMode): Fix up fd after avoiding race.
unprivileged user has no rights to move the mounts that
inherited from parent mountns. we use this feature to move
the /stateDir/domain-name.{dev, devpts} to the /dev/ and
/dev/pts directroy of container. this commit breaks libvirt lxc.
this patch changes the behavior to bind these mounts when
user namespace is enabled and move these mounts when user
namespace is disabled.
sasl: Fix authentication when using PLAIN mechanism
With some authentication mechanism (PLAIN for example), sasl_client_start()
can return SASL_OK, which translates to virNetSASLSessionClientStart()
returning VIR_NET_SASL_COMPLETE.
cyrus-sasl documentation is a bit vague as to what to do in such situation,
but upstream clarified this a bit in
http://asg.andrew.cmu.edu/archive/message.php?mailbox=archive.cyrus-sasl&msg=10104
When we got VIR_NET_SASL_COMPLETE after virNetSASLSessionClientStart() and
if the remote also tells us that authentication is complete, then we should
end the authentication procedure rather than forcing a call to
virNetSASLSessionClientStep(). Without this patch, when trying to use SASL
PLAIN, I get:
error :authentication failed : Failed to step SASL negotiation: -1
(SASL(-1): generic failure: Unable to find a callback: 32775)
This patch is based on a spice-gtk patch by Dietmar Maurer.
Fix invalid read in virNetSASLSessionClientStep debug log
virNetSASLSessionClientStep logs the data that is going to be passed to
sasl_client_step as input data. However, it tries to log it as a string,
while there is no guarantee that this data is going to be nul-terminated.
This leads to this valgrind log:
==20938== Invalid read of size 1
==20938== at 0x8BDB08F: vfprintf (vfprintf.c:1635)
==20938== by 0x8C06DF2: vasprintf (vasprintf.c:62)
==20938== by 0x4CCEDF9: virVasprintfInternal (virstring.c:337)
==20938== by 0x4CA9516: virLogVMessage (virlog.c:842)
==20938== by 0x4CA939A: virLogMessage (virlog.c:778)
==20938== by 0x4E21E0D: virNetSASLSessionClientStep (virnetsaslcontext.c:458)
==20938== by 0x4DE47B8: remoteAuthSASL (remote_driver.c:4136)
==20938== by 0x4DE33AE: remoteAuthenticate (remote_driver.c:3635)
==20938== by 0x4DDBFAA: doRemoteOpen (remote_driver.c:832)
==20938== by 0x4DDC8BA: remoteConnectOpen (remote_driver.c:1027)
==20938== by 0x4D8595F: do_open (libvirt.c:1239)
==20938== by 0x4D863F3: virConnectOpenAuth (libvirt.c:1481)
==20938== by 0x12762B: vshReconnect (virsh.c:337)
==20938== by 0x12C9B0: vshInit (virsh.c:2470)
==20938== by 0x12E9A5: main (virsh.c:3338)
==20938== Address 0xe329ccd is 0 bytes after a block of size 141 alloc'd
==20938== at 0x4A081D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20938== by 0x8CB91B4: xdr_array (xdr_array.c:94)
==20938== by 0x4E039C2: xdr_remote_auth_sasl_start_ret (remote_protocol.c:3134)
==20938== by 0x4E1F8AA: virNetMessageDecodePayload (virnetmessage.c:405)
==20938== by 0x4E119F5: virNetClientProgramCall (virnetclientprogram.c:377)
==20938== by 0x4DF8141: callFull (remote_driver.c:5794)
==20938== by 0x4DF821A: call (remote_driver.c:5816)
==20938== by 0x4DE46CF: remoteAuthSASL (remote_driver.c:4112)
==20938== by 0x4DE33AE: remoteAuthenticate (remote_driver.c:3635)
==20938== by 0x4DDBFAA: doRemoteOpen (remote_driver.c:832)
==20938== by 0x4DDC8BA: remoteConnectOpen (remote_driver.c:1027)
==20938== by 0x4D8595F: do_open (libvirt.c:1239)
==20938== by 0x4D863F3: virConnectOpenAuth (libvirt.c:1481)
==20938== by 0x12762B: vshReconnect (virsh.c:337)
==20938== by 0x12C9B0: vshInit (virsh.c:2470)
==20938== by 0x12E9A5: main (virsh.c:3338)
Tie SASL callbacks lifecycle to virNetSessionSASLContext
The array of sasl_callback_t callbacks which is passed to sasl_client_new()
must be kept alive as long as the created sasl_conn_t object is alive as
cyrus-sasl uses this structure internally for things like logging, so
the memory used for callbacks must only be freed after sasl_dispose() has
been called.
During testing of successful SASL logins with
virsh -c qemu+tls:///system list --all
I've been getting invalid read reports from valgrind
==9237== Invalid read of size 8
==9237== at 0x6E93B6F: _sasl_getcallback (common.c:1745)
==9237== by 0x6E95430: _sasl_log (common.c:1850)
==9237== by 0x16593D87: digestmd5_client_mech_dispose (digestmd5.c:4580)
==9237== by 0x6E91653: client_dispose (client.c:332)
==9237== by 0x6E9476A: sasl_dispose (common.c:851)
==9237== by 0x4E225A1: virNetSASLSessionDispose (virnetsaslcontext.c:678)
==9237== by 0x4CBC551: virObjectUnref (virobject.c:262)
==9237== by 0x4E254D1: virNetSocketDispose (virnetsocket.c:1042)
==9237== by 0x4CBC551: virObjectUnref (virobject.c:262)
==9237== by 0x4E2701C: virNetSocketEventFree (virnetsocket.c:1794)
==9237== by 0x4C965D3: virEventPollCleanupHandles (vireventpoll.c:583)
==9237== by 0x4C96987: virEventPollRunOnce (vireventpoll.c:652)
==9237== by 0x4C94730: virEventRunDefaultImpl (virevent.c:274)
==9237== by 0x12C7BA: vshEventLoop (virsh.c:2407)
==9237== by 0x4CD3D04: virThreadHelper (virthreadpthread.c:161)
==9237== by 0x7DAEF32: start_thread (pthread_create.c:309)
==9237== by 0x8C86EAC: clone (clone.S:111)
==9237== Address 0xe2d61b0 is 0 bytes inside a block of size 168 free'd
==9237== at 0x4A07577: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==9237== by 0x4C73827: virFree (viralloc.c:580)
==9237== by 0x4DE4BC7: remoteAuthSASL (remote_driver.c:4219)
==9237== by 0x4DE33D0: remoteAuthenticate (remote_driver.c:3639)
==9237== by 0x4DDBFAA: doRemoteOpen (remote_driver.c:832)
==9237== by 0x4DDC8DC: remoteConnectOpen (remote_driver.c:1031)
==9237== by 0x4D8595F: do_open (libvirt.c:1239)
==9237== by 0x4D863F3: virConnectOpenAuth (libvirt.c:1481)
==9237== by 0x12762B: vshReconnect (virsh.c:337)
==9237== by 0x12C9B0: vshInit (virsh.c:2470)
==9237== by 0x12E9A5: main (virsh.c:3338)
This commit changes virNetSASLSessionNewClient() to take ownership of the SASL
callbacks. Then we can free them in virNetSASLSessionDispose() after the corresponding
sasl_conn_t has been freed.
remote: Don't leak priv->tls object on connection failure
When testing SASL authentication over TLS with
virsh -c qemu+tls:///system list --all
I got this valgrind trace after entering wrong credentials:
==30540== 26,903 (88 direct, 26,815 indirect) bytes in 1 blocks are definitely lost in loss record 289 of 293
==30540== at 0x4A081D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==30540== by 0x4C7379A: virAllocVar (viralloc.c:558)
==30540== by 0x4CBC178: virObjectNew (virobject.c:190)
==30540== by 0x4CBC329: virObjectLockableNew (virobject.c:216)
==30540== by 0x4E2D003: virNetTLSContextNew (virnettlscontext.c:719)
==30540== by 0x4E2DC3F: virNetTLSContextNewPath (virnettlscontext.c:930)
==30540== by 0x4E2DD5B: virNetTLSContextNewClientPath (virnettlscontext.c:957)
==30540== by 0x4DDB618: doRemoteOpen (remote_driver.c:627)
==30540== by 0x4DDC8BA: remoteConnectOpen (remote_driver.c:1031)
==30540== by 0x4D8595F: do_open (libvirt.c:1239)
==30540== by 0x4D863F3: virConnectOpenAuth (libvirt.c:1481)
==30540== by 0x12762B: vshReconnect (virsh.c:337)
==30540== by 0x12C9B0: vshInit (virsh.c:2470)
==30540== by 0x12E9A5: main (virsh.c:3338)
Eric Blake [Tue, 12 Nov 2013 04:08:27 +0000 (21:08 -0700)]
storage: probe qcow2 volumes in gluster pool
Putting together pieces from previous patches, it is now possible
for 'virsh vol-dumpxml --pool gluster volname' to report metadata
about a qcow2 file stored on gluster. The backing file is still
treated as raw; to fix that, more patches are needed to make the
storage backing chain analysis recursive rather than halting at
a network protocol name, but that work will not need any further
calls into libgfapi so much as just reusing this code, and that
should be the only code outside of the storage driver that needs
any help from libgfapi. Any additional use of libgfapi within
libvirt should only be needed for implementing storage pool APIs
such as volume creation or resizing, where backing chain analysis
should be unaffected.
* src/storage/storage_backend_gluster.c
(virStorageBackendGlusterReadHeader): New helper function.
(virStorageBackendGlusterRefreshVol): Probe non-raw files.
Eric Blake [Mon, 18 Nov 2013 22:24:05 +0000 (15:24 -0700)]
storage: improve handling of symlinks in gluster
With this patch, dangling and looping symlinks are silently
ignored, while links to files and directories are treated the
same as the underlying file or directory. This is the same
behavior as both 'directory' and 'netfs' pools.
* src/storage/storage_backend_gluster.c
(virStorageBackendGlusterRefreshVol): Treat symlinks similar to
directory and netfs pools.
Eric Blake [Mon, 18 Nov 2013 19:45:31 +0000 (12:45 -0700)]
storage: improve allocation stats reported on gluster files
We already had code for handling allocation different than
capacity for sparse files; we just had to wire it up to be
used when inspecting gluster images.
Eric Blake [Wed, 20 Nov 2013 20:17:55 +0000 (13:17 -0700)]
storage: improve directory support in gluster pool
Take advantage of the previous patch's addition of 'netdir' as
a distinct volume type, to expose rather than silently skip
directories embedded in a gluster pool. Also serves as an XML
validation for the previous patch.
Eric Blake [Mon, 18 Nov 2013 23:43:06 +0000 (16:43 -0700)]
storage: add network-dir as new storage volume type
In the 'directory' and 'netfs' storage pools, a user can see
both 'file' and 'dir' storage volume types, to know when they
can descend into a subdirectory. But in a network-based storage
pool, such as the upcoming 'gluster' pool, we use 'network'
instead of 'file', and did not have any counterpart for a
directory until this patch. Adding a new volume type
'network-dir' is better than reusing 'dir', because it makes
it clear that the only way to access 'network' volumes within
that container is through the network mounting (leaving 'dir'
for something accessible in the local file system).
* include/libvirt/libvirt.h.in (virStorageVolType): Expand enum.
* docs/formatstorage.html.in: Document it.
* docs/schemasa/storagevol.rng (vol): Allow new value.
* src/conf/storage_conf.c (virStorageVol): Use new value.
* src/qemu/qemu_command.c (qemuBuildVolumeString): Fix client.
* src/qemu/qemu_conf.c (qemuTranslateDiskSourcePool): Likewise.
* tools/virsh-volume.c (vshVolumeTypeToString): Likewise.
* src/storage/storage_backend_fs.c
(virStorageBackendFileSystemVolDelete): Likewise.
Eric Blake [Wed, 30 Oct 2013 03:28:16 +0000 (21:28 -0600)]
storage: implement rudimentary glusterfs pool refresh
Actually put gfapi to use, by allowing the creation of a gluster
pool. Right now, all volumes are treated as raw and directories
are skipped; further patches will allow peering into files to
allow for qcow2 files and backing chains, and reporting proper
volume allocation. This implementation was tested against Fedora
19's glusterfs 3.4.1; it might be made simpler by requiring a
higher minimum, and/or require more hacks to work with a lower
minimum.
Eric Blake [Tue, 15 Oct 2013 23:06:18 +0000 (17:06 -0600)]
storage: document gluster pool
Add support for a new <pool type='gluster'>, similar to
RBD and Sheepdog. Terminology wise, a gluster volume
forms a libvirt storage pool, within the gluster volume,
individual files are treated as libvirt storage volumes.
* docs/schemas/storagepool.rng (poolgluster): New pool type.
* docs/formatstorage.html.in: Document gluster.
* docs/storage.html.in: Likewise, and contrast it with netfs.
* tests/storagepoolxml2xmlin/pool-gluster.xml: New test.
* tests/storagepoolxml2xmlout/pool-gluster.xml: Likewise.
* tests/storagepoolxml2xmltest.c (mymain): Likewise.
Eric Blake [Tue, 19 Nov 2013 23:26:05 +0000 (16:26 -0700)]
storage: initial support for linking with libgfapi
We support gluster volumes in domain XML, so we also ought to
support them as a storage pool. Besides, a future patch will
want to take advantage of libgfapi to handle the case of a
gluster device holding qcow2 rather than raw storage, and for
that to work, we need a storage backend that can read gluster
storage volume contents. This sets up the framework.
Note that the new pool is named 'gluster' to match a
<disk type='network'><source protocol='gluster'> image source
already supported in a <domain>; it does NOT match the
<pool type='netfs'><source><target type='glusterfs'>,
since that uses a FUSE mount to a local file name rather than
a network name.
This and subsequent patches have been tested against glusterfs
3.4.1 (available on Fedora 19); there are likely bugs in older
versions that may prevent decent use of gfapi, so this patch
enforces the minimum version tested. A future patch may lower
the minimum. On the other hand, I hit at least two bugs in
3.4.1 that will be fixed in 3.5/3.4.2, where it might be worth
raising the minimum: glfs_readdir is nicer to use than
glfs_readdir_r [1], and glfs_fini should only return failure on
an actual failure [2].
* configure.ac (WITH_STORAGE_GLUSTER): New conditional.
* m4/virt-gluster.m4: new file.
* libvirt.spec.in (BuildRequires): Support gluster in spec file.
* src/conf/storage_conf.h (VIR_STORAGE_POOL_GLUSTER): New pool
type.
* src/conf/storage_conf.c (poolTypeInfo): Treat similar to
sheepdog and rbd.
(virStoragePoolDefFormat): Don't output target for gluster.
* src/storage/storage_backend_gluster.h: New file.
* src/storage/storage_backend_gluster.c: Likewise.
* po/POTFILES.in: Add new file.
* src/storage/storage_backend.c (backends): Register new type.
* src/Makefile.am (STORAGE_DRIVER_GLUSTER_SOURCES): Build new files.
* src/storage/storage_backend.h (_virStorageBackend): Documet
assumption.
Eric Blake [Tue, 19 Nov 2013 20:14:54 +0000 (13:14 -0700)]
storage: expose volume meta-type in XML
I got annoyed at having to use both 'virsh vol-list $pool --details'
AND 'virsh vol-dumpxml $vol $pool' to learn if I had populated
the volume correctly. Since two-thirds of the data present in
virStorageVolGetInfo() already appears in virStorageVolGetXMLDesc(),
this just adds the remaining piece of information, as:
<volume type='...'>
...
</volume>
* docs/formatstorage.html.in: Document new <volume type=...>.
* docs/schemas/storagevol.rng (vol): Add it to RelaxNG.
* src/conf/storage_conf.h (virStorageVolTypeToString): Declare.
* src/conf/storage_conf.c (virStorageVolTargetDefFormat): Output
the metatype.
(virStorageVolDefParseXML): Parse it, for unit tests.
* tests/storagevolxml2xmlout/vol-*.xml: Update tests to match.
Jiri Denemark [Mon, 25 Nov 2013 15:37:32 +0000 (16:37 +0100)]
spec: Don't save/restore running VMs on libvirt-client update
The previous attempt (commit d65e0e1) removed just one of two
libvirt-guests restarts that happened on libvirt-client update. Let's
remove the last one too :-)
virsh domxml-from-native to treat SCSI as the bus type for pseries by default
The bus type IDE being enum Zero, the bus type on pseries system appears as IDE for all the -hda/-cdrom and for disk drives with if="none" type. Pseries platform needs this to appear as SCSI instead of IDE. The ide being not supported, the explicit requests for ide devices will return an error.
Signed-off-by: Shivaprasad G Bhat <sbhat@linux.vnet.ibm.com>
Eric Blake [Fri, 22 Nov 2013 19:40:15 +0000 (12:40 -0700)]
storage: allow interleave in volume XML
The RNG grammar did not allow arbitrary interleaving, which makes
it harder than necessary to create a new volume from handwritten XML.
(Compare also to commit caf516db for pools).
* docs/schemas/storagevol.rng: Support interleaving.
* tests/storagevolxml2xmlin/vol-file-backing.xml: Test it.
Ryota Ozaki [Thu, 21 Nov 2013 14:41:07 +0000 (23:41 +0900)]
vbox: add support for 4.3 APIs
Makefile.am, vbox_V4_3.c and vbox_driver.c do regular
modifitions to support a new version of APIs.
vbox_tmpl.c basically fixes incompatibilities since 4.2.
The affected incompatibilities of 4.3 are:
* IMachine::Delete() has been renamed to IMachine::deleteConfig()
* IMedium::CreateBaseStorage() now accepts multiple variant values
* IDisplay::GetScreenResolution() now returns the display position
in the guest
* IMachine now has multiple IUSBControllers and IUSBDeviceFilters
handles USB device filters instead of (obsolete) IUSBController
This patch is tested on Mac OS X 10.8.5 and Fedora 19.
Ryota Ozaki [Thu, 21 Nov 2013 14:41:06 +0000 (23:41 +0900)]
vbox: import vbox_CAPI_v4_3.h from SDK
vbox_CAPI_v4_3.h is almost same as
sdk/bindings/xpcom/include/VBoxCAPI_v4_3.h of
http://download.virtualbox.org/virtualbox/4.3.2/VirtualBoxSDK-4.3.2-90405.zip,
but modified to fix preprocessor indentations by using cppi.
Ryota Ozaki [Thu, 21 Nov 2013 14:41:05 +0000 (23:41 +0900)]
vbox: pull vboxHostDeviceGetXMLDesc out from vboxDomainGetXMLDesc
The USB-related code in vboxDomainGetXMLDesc is deeply nested and
difficult to add new code. So flatten it. To do so, the code is
pulled out from vboxDomainGetXMLDesc to make the function short
and to leaverage early return and goto for error handling.
The original problem was that libvirt_virConnectGetCPUModelNames
was listed twice in the exports table, once automatically from
the generator and once from the manual override. We merely needed
to list it in the skip_impl list, and not delete the manually
written code entirely.
Ján Tomko [Tue, 19 Nov 2013 17:03:21 +0000 (18:03 +0100)]
Don't start a nested job in qemuMigrationPrepareAny
This nested job is canceled by the first ExitMonitor call (even though
it was not created by the corresponding EnterMonitor call), and
again in qemuMigrationPrepareAny if qemuProcessStart failed.
This can lead to a crash if the vm object was disposed of before calling
qemuDomainRemoveInactive:
0 ..62bc in virClassIsDerivedFrom (klass=0xdeadbeef,
parent=0x7ffce4cdd270) at util/virobject.c:166
1 ..6666 in virObjectIsClass at util/virobject.c:362
2 ..66b4 in virObjectLock at util/virobject.c:314
3 ..477e in virDomainObjListRemove at conf/domain_conf.c:2359
4 ..7a64 in qemuDomainRemoveInactive at qemu/qemu_domain.c:2087
5 ..956c in qemuMigrationPrepareAny at qemu/qemu_migration.c:2469
Which in a default configuration will managedsave every running VM,
and then restore them. Certainly not something we should do every
time the libvirt-client RPM is updated.
Just drop the try-restart attempt, I don't know what purpose it
serves anyways.
As virt-login-shell is an SUID binary, we should restrict its usage to
just the users chosen by an administrator to use virt-login-shell as
their login shell. This can easily be done by making the binary
executable only by users from a new virtlogin group.