instead of only unloading it. This makes sure old profiles don't pile up
in /etc/apparmor.d/libvirt and we get updates to modified templates on
VM restart.
Laine Stump [Thu, 21 Sep 2017 17:57:30 +0000 (13:57 -0400)]
util: Fix stack smashing in virNetDevGetFamilyId
After commit 8708ca01c0d libvirtd consistently aborts with "stack
smashing detected" when nodedev driver is initialized.
This is caused by nlmsg_parse() being told that its array of nlattr*
has CTRL_CMD_MAX (10) entries, when in fact it is declared to have
CTRL_ATTR_MAX (8) entries. Since all the entries are initialized to
NULL, the result is that nlmsg_parse is overwriting 2*(sizof(nlattr*))
bytes outside the array.
Signed-off-by: Laine Stump <laine@laine.org> Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
John Ferlan [Fri, 15 Sep 2017 19:21:35 +0000 (15:21 -0400)]
util: Fix secret generation in virStorageSourceParseRBDColonString
Commit id '5604c056' used the wrong API to generate the
<secret type='%s'..." field. The previous code used the
correct API as was done in commit id '6887af39'. The data
is actually a usage type not an auth type even though the
result is the same.
When migrating to a file (e.g. when doing 'virsh save file'),
couple of things are happening in the thread that is executing
the API:
1) the domain obj is locked
2) iohelper is spawned as a separate process to handle all I/O
3) the thread waits for iohelper to finish
4) the domain obj is unlocked
Now, the problem is that while the thread waits in step 3 for
iohelper to finish this may take ages because iohelper calls
fdatasync(). And unfortunately, we are waiting the whole time
with the domain locked. So if another thread wants to jump in and
say copy the domain name ('virsh list' for instance), they are
stuck.
The solution is to unlock the domain whenever waiting for I/O and
lock it back again when it finished.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Commit id '99a2d6af2' was a bit too aggressive with determining whether
the provided path was a "physical" cd-rom in order to generate a taint
message due to the possibility of some guest and host trying to control
the tray. For cd-rom guest devices backed to some VIR_STORAGE_TYPE_FILE
storage, this wouldn't be a problem and as such it shouldn't be a problem
for guest devices using some sort of block device on the host such as
iSCSI, LVM, or a Disk pool would present.
So before issuing a taint message, let's check if the provided path of
the VIR_STORAGE_TYPE_BLOCK backed device is a "known" physical cdrom name
by comparing the beginning of the path w/ "/dev/cdrom" and "/dev/sr".
Also since it's possible the provided path could resolve to some /dev/srN
device, let's get that path as well and perform the same check.
Michal Privoznik [Thu, 21 Sep 2017 12:52:58 +0000 (14:52 +0200)]
qemuBuildHostNetStr: Don't leak @addr
The virSocketAddrFormat() allocates the string and it's caller
responsibility to free it afterwards.
==28857== 11 bytes in 1 blocks are definitely lost in loss record 37 of 168
==28857== at 0x4C2BEDF: malloc (vg_replace_malloc.c:299)
==28857== by 0x9A81D79: strdup (in /lib64/libc-2.23.so)
==28857== by 0x5DA3BF0: virStrdup (virstring.c:902)
==28857== by 0x5D96182: virSocketAddrFormatFull (virsocketaddr.c:427)
==28857== by 0x5D95E13: virSocketAddrFormat (virsocketaddr.c:352)
==28857== by 0x5706890: qemuBuildHostNetStr (qemu_command.c:3891)
==28857== by 0x57138D3: qemuBuildInterfaceCommandLine (qemu_command.c:8597)
==28857== by 0x5713D6A: qemuBuildNetCommandLine (qemu_command.c:8699)
==28857== by 0x57176F6: qemuBuildCommandLine (qemu_command.c:10027)
==28857== by 0x5769D61: qemuProcessCreatePretendCmd (qemu_process.c:6004)
==28857== by 0x4056EC: testCompareXMLToArgv (qemuxml2argvtest.c:502)
==28857== by 0x41DF40: virTestRun (testutils.c:180)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Jiri Denemark [Fri, 30 Jun 2017 14:55:20 +0000 (16:55 +0200)]
qemu: Don't update CPU when formatting live def
Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
in domain's live definition and there's no need to update it every time
we want to format the definition. The commit itself tried to address
this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive.
Not to mention that masking a previously set flag is only acceptable if
the flag was set by a public API user. Internally, libvirt should have
never set the flag in the first place.
Jiri Denemark [Fri, 30 Jun 2017 15:05:22 +0000 (17:05 +0200)]
qemu: Use correct host model for updating guest cpu
When a user requested a domain XML description with
VIR_DOMAIN_XML_UPDATE_CPU flag, libvirt would use the host CPU
definition from host capabilities rather than the one which will
actually be used once the domain is started.
Jiri Denemark [Fri, 30 Jun 2017 13:47:23 +0000 (15:47 +0200)]
cpu_conf: Drop updateCPU from virCPUDefFormat
In the past we updated host-model CPUs with host CPU data by adding a
model and features, but keeping the host-model mode. And since the CPU
model is not normally formatted for host-model CPU defs, we had to pass
the updateCPU flag to the formatting code to be able to properly output
updated host-model CPUs. Libvirt doesn't do this anymore, host-model
CPUs are turned into custom mode CPUs once updated with host CPU data
and thus there's no reason for keeping the hacks inside CPU XML
formatters.
If a test expects either a parse error or a failure but then there is
neither a parse error nor a failure, then properly mark the test as
failing, instead of failing later on (e.g. trying to open a
non-existing .args file).
For win32 we need EXIT_AM_SKIP which is in testutils.h. We must
define NO_LIBVIRT to prevent replacement of fprintf with
virFilePrintf as we can't link to libvirt_util.la
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
iohelper: avoid calling read() with misaligned buffers for O_DIRECT
The iohelper currently calls saferead() to get data from the
underlying file. This has a problem with O_DIRECT when hitting
end-of-file. saferead() is asked to read 1MB, but the first
read() it does may return only a few KB, so it'll try another
read() to fill the remaining buffer. Unfortunately the buffer
pointer passed into this 2nd read() is likely not aligned
to the extent that O_DIRECT requires, so rather than seeing
'0' for end-of-file, we'll get -1 + EINVAL due to misaligned
buffer.
The way the iohelper is currently written, it already handles
getting short reads, so there is actually no need to use
saferead() at all. We can simply call read() directly. The
benefit of this is that we can now write() the data immediately
so when we go into the subsequent reads() we'll always have a
correctly aligned buffer.
Technically the file position ought to be aligned for O_DIRECT
too, but this does not appear to matter when at end-of-file.
Tested-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The commandhelper binary is a helper for commandtest that
validates what file handles were inherited. For this to
work reliably we must not have any libraries that leak
file descriptors into commandhelper. Unfortunately some
versions of gnutls will intentionally open file handles
at library load time via a constructor function.
gnutls-3.3.0 and newer leaves 2 FDs open in order to be backwards
compatible when it comes to chrooted binaries [1]. Linking
commandhelper with gnutls then leaves these two FDs open and
commandtest fails thanks to that. This patch does not link
commandhelper with libvirt.la, but rather only the utilities making
the test pass.
That fix relied on fact that while libvirt.so linked with
gnutls, libvirt_util.la did not link to it. With the
introduction of the util/vircrypto.c file that assumption
is no longer valid. We must not link to libvirt_util.la
at all - only gnulib and libc can (hopefully) be relied
on not to open random file descriptors in constructors.
Reviewed-by: Martin Kletzander <mkletzan@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Julio Faracco [Sat, 9 Sep 2017 15:09:49 +0000 (12:09 -0300)]
storage: Add new events for *PoolBuild() and *PoolDelete().
This commit adds new events for two methods and operations: *PoolBuild() and
*PoolDelete(). Using the event-test and the commands set below we have the
following outputs:
$ sudo ./event-test
Registering event callbacks
myStoragePoolEventCallback EVENT: Storage pool test Defined 0
myStoragePoolEventCallback EVENT: Storage pool test Created 0
myStoragePoolEventCallback EVENT: Storage pool test Started 0
myStoragePoolEventCallback EVENT: Storage pool test Stopped 0
myStoragePoolEventCallback EVENT: Storage pool test Deleted 0
myStoragePoolEventCallback EVENT: Storage pool test Undefined 0
Another terminal:
$ sudo virsh pool-define test.xml
Pool test defined from test.xml
$ sudo virsh pool-build test
Pool test built
$ sudo virsh pool-start test
Pool test started
$ sudo virsh pool-destroy test
Pool test destroyed
$ sudo virsh pool-delete test
Pool test deleted
$ sudo virsh pool-undefine test
Pool test has been undefined
The util/vircrypto.c file uses gnutls, so we must directly link
libvirt_util.la with gnutls to avoid errors on OS which do not
resolve symbols against indirectly linked libraries.
This fixes a build failure on Ubuntu Trusty
CCLD storagevolxml2argvtest
/usr/bin/ld: ../src/.libs/libvirt_util.a(libvirt_util_la-vircrypto.o): undefined reference to symbol 'gnutls_strerror@@GNUTLS_1_4'
//usr/lib/x86_64-linux-gnu/libgnutls.so.26: error adding symbols: DSO missing from command line
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Extract out the "guts" of building a server entry into it's own
separately callable/usable function in order to allow building
a server entry for a consumer with src->nhosts == 1.
John Ferlan [Tue, 29 Aug 2017 22:44:19 +0000 (18:44 -0400)]
qemu: Detect support for vxhs
Using the query-qmp-schema introspection - look for the 'vxhs'
blockdevOptions type.
NB: This is a "best effort" type situation as there is not a
mechanism to determine whether the running QEMU has been
built with '--enable-vxhs'. All we can do is check if the
option to use vxhs for a blockdev-add exists in the command
infrastructure which does not take that into account when
building its table of commands and options.
Laine Stump [Fri, 15 Sep 2017 15:26:14 +0000 (11:26 -0400)]
util: virPCIGetNetName(): use first netdev name when phys_port_id isn't matched
The mlx4 (Mellanox) netdev driver implements the sysfs phys_port_id
file for both VFs and PFs, so you can find the VF netdev plugged into
the same physical port as any given PF netdev by comparing the
contents of phys_port_id of the respective netdevs. That's what
libvirt does when attempting to find the PF netdev for a given VF
netdev (or vice versa).
Most other netdev's drivers don't implement phys_port_id, so the file
is visible in sysfs directory listing, but attempts to read it result
in ENOTSUPP. In these cases, libvirt is unable to read phys_port_id of
either the PF or the VF, so it just returns the first entry in the
PF/VF's list of netdevs.
But we've found that the i40e driver is in between those two
situations - it implements phys_port_id for PF netdevs, but doesn't
implement it for VF netdevs. So libvirt would successfully read the
phys_port_id of the PF netdev, then try to find a VF netdev with
matching phys_port_id, but would fail because phys_port_id is NULL for
all VFs. This would result in a message like the following:
Could not find network device with phys_port_id '3cfdfe9edc39'
under PCI device at /sys/class/net/ens4f1/device/virtfn0
To solve this problem in a way that won't break functionality for
anyone else, this patch saves the first netdev name we find for the
device, and returns that if we fail to find a netdev with the desired
phys_port_id.
Peter Krempa [Mon, 18 Sep 2017 14:03:58 +0000 (16:03 +0200)]
qemu: blockPeek: Fix filling of the return buffer
Commit 3956af495e broke the blockPeek API since virStorageFileRead
allocates a return buffer and fills it with the data, while the API
fills a user-provided buffer. This did not get caught by the compiler
since the API prototype uses a 'void *'.
Fix it by transferring the data from the allocated buffer to the user
provided buffer.
Andrea Bolognani [Tue, 19 Sep 2017 13:49:19 +0000 (15:49 +0200)]
Revert "travis: Limit git depth to 5 commits"
Turns out a build job can be stuck waiting for a macOS worker to
become available for a pretty long time: if more than 5 commits
have been pushed in the meantime, the clone will be too shallow
for the worker to find the commit it's supposed to verify, and
the build job will fail.
See https://travis-ci.org/libvirt/libvirt/jobs/277244110 for an
example of the failure described.
Andrea Bolognani [Tue, 19 Sep 2017 10:42:09 +0000 (12:42 +0200)]
python: Don't hardcode interpreter path
This is particularly useful on operating systems that don't ship
Python as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.
While at it, make it explicit that our scripts are only going to
work with Python 2, and remove the usage of unbuffered I/O, which
as far as I can tell has no effect on the output files.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Andrea Bolognani [Mon, 18 Sep 2017 12:35:50 +0000 (14:35 +0200)]
perl: Don't hardcode interpreter path
This is particularly useful on operating systems that don't ship
Perl as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.
In one case (src/rpc/genprotocol.pl) the interpreter path was
missing altogether.
Signed-off-by: Andrea Bolognani <abologna@redhat.com>
Michal Privoznik [Mon, 18 Sep 2017 13:39:58 +0000 (15:39 +0200)]
qemu: Mark graphics ports used on reconnect
I don't want to mask the real problem, but one can advocate
that we should be marking graphics ports as already in use on
qemuProcessReconnect anyway, because we already know that they
are taken.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ján Tomko [Mon, 18 Sep 2017 17:21:47 +0000 (13:21 -0400)]
configure: fix check for DEVLINK_CMD_ESWITCH_GET
Instead of checking for all possible constants that every
kernel header with devlink support should have (and defining
HAVE_DECL_DEVLINK as 1 if any of them is present due to the
way AC_CHECK_DECLS works), only check for DEVLINK_CMD_ESWITCH_GET.
This is the name of the constant since kernel 4.11. Between 4.8
and 4.11, the now deprecated spelling DEVLINK_CMD_ESWITCH_MODE_GET
was used.
Assume DEVLINK_ESWITCH_MODE_SWITCHDEV is available, since it was
introduced along with the deprecated spelling.
John Ferlan [Tue, 9 May 2017 12:18:33 +0000 (08:18 -0400)]
storage: Introduce APIs to search/scan storage pool volumes list
Introduce virStoragePoolObjForEachVolume to scan each volume
calling the passed callback function until all volumes have been
processed in the storage pool volume list, unless the callback
function returns an error.
Introduce virStoragePoolObjSearchVolume to search each volume
calling the passed callback function until it returns true
indicating that the desired volume was found.
Create/use virStoragePoolObjAddVol in order to add volumes onto list.
Create/use virStoragePoolObjRemoveVol in order to remove volumes from list.
Create/use virStoragePoolObjGetVolumesCount to get count of volumes on list.
For the storage driver, the logic alters when the volumes.obj list grows
to after we've fetched the volobj. This is an optimization of sorts, but
also doesn't "needlessly" grow the volumes.objs list and then just decr
the count if the virGetStorageVol fails.
Erik Skultety [Fri, 8 Sep 2017 12:52:44 +0000 (14:52 +0200)]
virsh: man: Describe the 'create' command a bit more
So we refer to the terms 'persistent' and 'transient' across the whole
man page, without describing it further, but more importantly, how the
create command affects it, i.e. explicitly stating that domain created
via the 'create' command are going to be transient or persistent,
depending on whether there is an existing persistent domain with a
matching <name> and <uuid>, in which case it will remain persistent, but
will run using a one-time configuration, otherwise it's going to be
transient and will vanish once destroyed.
Signed-off-by: Erik Skultety <eskultet@redhat.com>
Edan David [Mon, 21 Aug 2017 09:19:53 +0000 (05:19 -0400)]
nodedev: add switchdev to NIC capabilities
Adding functionality to libvirt that will allow querying the interface
for the availability of switchdev Offloading NIC capabilities.
The switchdev mode was introduced in kernel 4.8, the iproute2-devlink
command to retrieve the switchdev NIC feature with command example:
devlink dev eswitch show pci/0000:03:00.0
This feature is needed for Openstack so we can do a scheduling decision
if the NIC is in Hardware Offload (switchdev) or regular SR-IOV (legacy) mode.
And select the appropriate hypervisors with the requested capability see [1].
Apart from generic checks, we need to constrain netmask/prefix
length a bit. Thing is, with current implementation QEMU needs to
be able to 'assign' some IP addresses to the virtual network. For
instance, the default gateway is at x.x.x.2, dns is at x.x.x.3,
the default DHCP range is x.x.x.15-x.x.x.30. Since we don't
expose these settings yet, it's safer to require shorter prefix
to have room for the defaults.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: laine@laine.org
Currently, all that users can specify for an interface type of
'user' is the common attributes: PCI address, NIC model (and
that's basically it). However, some need to configure other
address range than the default one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: laine@laine.org
Only feature policy is checked on s390, which was previously done in
virCPUUpdate, but that's not the correct place for the check once we
have virCPUValidateFeatures.
This new API may be used to check whether all features used in a CPU
definition are valid (e.g., libvirt knows their name, their policy is
supported, etc.). Leaving this API unimplemented in an arch subdriver
means libvirt does not restrict CPU features usable on the associated
architectures.
qemu: Filter CPU features returned by qemuConnectBaselineCPU
The host CPU definitions reported in the capabilities XML may contain
CPU features unknown to QEMU, but the result of virConnectBaselineCPU is
supposed to be directly usable as a guest CPU definition and thus it
should only contain features QEMU knows about.
John Ferlan [Thu, 14 Sep 2017 15:01:40 +0000 (11:01 -0400)]
conf: Move <disk> encryption validation
Rather than checking during XML processing, move the check for
valid <encryption> into virDomainDiskDefParseValidate and alter
the text of the message slightly to be a bit more correct.
John Ferlan [Wed, 13 Sep 2017 15:00:28 +0000 (11:00 -0400)]
conf: Move <disk> authdef validation
Rather than checking during XML processing, move the checks for correct
and valid auth into virDomainDiskDefParseValidate. This will introduce
virDomainDiskSourceDefParseAuthValidate to validate that the authdef
stored for the virStorageSource is valid. This can then be expanded
to service backingStore sources as well.
Alter the message text slightly as well to distinguish between an
unknown name and an incorrectly used name. Since type is not a
mandatory field, add the NULLSTR() around the output of the unknown
error. NB, a config using unknown formatting would fail virschematest
since it only accepts 'iscsi' and 'ceph' as "valid" types.
John Ferlan [Wed, 13 Sep 2017 19:24:41 +0000 (15:24 -0400)]
conf: Add invalid secrettype checks
Add a couple of tests to "validate" checks in domain_conf that either
a missing secrettype (CONFIG_UNSUPPORTED) or an mismatched secrettype
of ceph for an iSCSI disk (INTERNAL_ERROR) will cause a parsing error.
The reality is, it's not even used. For a <source pool> the authdef
from the storage source pool will supercede whatever is in the <disk>
definition during virStorageTranslateDiskSourcePool processing. In fact,
if the pool doesn't have/need authentication, then the authdef would
be removed anyway as the storage pool would be handling things.
The "proof" for this is in the adjustment to the test to add an
<auth> for a disk. The resulting .args file won't add what normally
would be added "myname:encodedpassword@" prior to the hostname in
the IQN (e.g. iscsi://myname:encodedpassword@iscsi.example.org:3260/...
Peter Krempa [Mon, 11 Sep 2017 13:28:15 +0000 (15:28 +0200)]
qemu: Restore errors when rolling back disk image state
Some operations done to rollback disk image labelling and locking might
overwrite (or clear) the actual error. Remember the original error when
tearing down disk access so that it's not obscured.
Peter Krempa [Fri, 1 Sep 2017 14:19:56 +0000 (16:19 +0200)]
util: error: Add helpers for saving and restoring of last error
Some cleanup paths overwrite a usefull error message with a less useful
one and we then try to preserve the original message. The handlers added
in this patch will simplify the operations since they are designed right
for the purpose.
Andrea Bolognani [Thu, 14 Sep 2017 11:31:17 +0000 (13:31 +0200)]
travis: Shuffle sections around
Order them more logically and make sure that stuff that doesn't
need to be modified frequently if at all, such as the notification
settings, are out of the way.
Perform other very minor tweaks as well.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Martin Kletzander <mkletzan@redhat.com>
travis: Don't abort build due to -Wvariadic-macros
The openwsman header files are at fault here, but precise is entirely
unmaintained at this point so the issue will never be fixed.
Better to ignore the error and have coverage over the Hyper-V driver
than disabling it: if code that would trigger the warning will be
added to libvirt, the CentOS CI will catch it.
Signed-off-by: Andrea Bolognani <abologna@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com>