The API docs extractor, ESX code generator and keycodemapdb tools
rely on python. Historically every platform had this present, but
with the switch to Python 3 by default, we're increasingly seeing
installs without a /usr/bin/python.
This tightens up the check during configure, so it exits immediately
if python is missing, rather than leaving an empty $(PYTHON) make
variable which leads to more obscure errors later.
Also add it as a build dep for Mingw, since Fedora build roots no
longer get python2 by default. This was not previously a major
problem, since both ESX & API generated files were included in
EXTRA_DIST, but the keycodemapdb generated files are not, so we
require python all the time now.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When adding a nwfilter onto the list in
virNWFilterObjListAssignDef() this array is re-allocated to match
demand for new size. However, it is never freed leading to a
leak:
==26535== 136 bytes in 1 blocks are definitely lost in loss record 1,079 of 1,250
==26535== at 0x4C2E2BE: realloc (vg_replace_malloc.c:785)
==26535== by 0x54BA28E: virReallocN (viralloc.c:245)
==26535== by 0x54BA384: virExpandN (viralloc.c:294)
==26535== by 0x54BA657: virInsertElementsN (viralloc.c:436)
==26535== by 0x55DB011: virNWFilterObjListAssignDef (virnwfilterobj.c:362)
==26535== by 0x55DB530: virNWFilterObjListLoadConfig (virnwfilterobj.c:503)
==26535== by 0x55DB635: virNWFilterObjListLoadAllConfigs (virnwfilterobj.c:539)
==26535== by 0x2AC5A28B: nwfilterStateInitialize (nwfilter_driver.c:250)
==26535== by 0x5621C64: virStateInitialize (libvirt.c:770)
==26535== by 0x124379: daemonRunStateInit (libvirtd.c:881)
==26535== by 0x554AC78: virThreadHelper (virthread.c:206)
==26535== by 0x8F5F493: start_thread (in /lib64/libpthread-2.23.so)
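As an illustration of the leak pattern (a minimal sketch with hypothetical names, not the actual virNWFilterObjList code): an array grown with realloc() on every append needs a matching cleanup that releases the array itself and not only its items.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an object list; not libvirt's real types. */
typedef struct {
    int **objs;     /* array re-allocated on demand, one slot per object */
    size_t count;   /* number of used slots */
} ObjList;

/* Append an item, growing the array by one slot.
 * Returns 0 on success, -1 if the re-allocation fails. */
static int obj_list_append(ObjList *list, int *obj)
{
    int **tmp = realloc(list->objs, (list->count + 1) * sizeof(*tmp));

    if (!tmp)
        return -1;
    list->objs = tmp;
    list->objs[list->count++] = obj;
    return 0;
}

/* Free every item AND the array that holds them. Forgetting the
 * second free() is the kind of leak valgrind reports above. */
static void obj_list_free(ObjList *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->objs[i]);
    free(list->objs);
    list->objs = NULL;
    list->count = 0;
}
```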
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
After bdcf6e481 there is a crasher in libvirt. The commit assumes
that priv->perf is always set. That is not true. For inactive
domains, the priv->perf is not allocated as it is set in
qemuProcessLaunch(). Now, usually we differentiate between
accesses to inactive and active definition and it works just
fine. Except for 'domstats'. There priv->perf is accessed without
prior check for domain inactivity. While we could check for that,
a more robust solution is to make virPerfEventIsEnabled() accept
NULL.
How to reproduce:
1) ensure you have at least one inactive domain
2) virsh domstats
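A minimal sketch of the NULL-tolerant approach, assuming a simplified perf structure (the type, the constant and the function name below are hypothetical, not libvirt's definitions): the predicate treats a missing object as "nothing is enabled" so callers need no separate inactivity check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical number of event types, for illustration only. */
enum { PERF_EVENT_LAST = 4 };

/* Hypothetical stand-in for the per-domain perf state. */
typedef struct {
    bool enabled[PERF_EVENT_LAST];
} Perf;

/* Returns false for a NULL perf object (e.g. an inactive domain)
 * or an out-of-range type instead of dereferencing blindly. */
static bool perf_event_is_enabled(const Perf *perf, int type)
{
    if (!perf || type < 0 || type >= PERF_EVENT_LAST)
        return false;
    return perf->enabled[type];
}
```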
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Erik Skultety [Fri, 28 Apr 2017 07:24:31 +0000 (09:24 +0200)]
mdev: Fix daemon crash on domain shutdown after reconnect
The problem resides in virHostdevUpdateActiveMediatedDevices which gets
called during qemuProcessReconnect. The issue here is that
virMediatedDeviceListAdd takes a pointer to the item to be added to the
list to which VIR_APPEND_ELEMENT is used, which also clears the pointer.
However, in this case only the local copy of the pointer got cleared,
leaving the original pointing to valid memory. To sum it up, during
cleanup phase, the original pointer is freed and the daemon crashes
basically any time it would access it.
Backtrace:
0x00007ffff3ccdeba in __strcmp_sse2_unaligned
0x00007ffff72a444a in virMediatedDeviceListFindIndex
0x00007ffff7241446 in virHostdevReAttachMediatedDevices
0x00007fffc60215d9 in qemuHostdevReAttachMediatedDevices
0x00007fffc60216dc in qemuHostdevReAttachDomainDevices
0x00007fffc6046e6f in qemuProcessStop
0x00007fffc6091596 in processMonitorEOFEvent
0x00007fffc6091793 in qemuProcessEventHandler
0x00007ffff7294bf5 in virThreadPoolWorker
0x00007ffff7294184 in virThreadHelper
0x00007ffff3fdc3c4 in start_thread () from /lib64/libpthread.so.0
0x00007ffff3d269cf in clone () from /lib64/libc.so.6
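The ownership problem can be sketched as follows (hypothetical types and names, not the real virMediatedDeviceListAdd): when the callee receives the pointer by value, clearing it only affects the local copy, while passing the address of the caller's pointer lets the list take ownership and clear the original.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for a device and a device list. */
typedef struct { int id; } Dev;
typedef struct { Dev **devs; size_t count; } DevList;

/* Make room for one more element. Returns 0 on success, -1 on OOM. */
static int dev_list_grow(DevList *list)
{
    Dev **tmp = realloc(list->devs, (list->count + 1) * sizeof(*tmp));

    if (!tmp)
        return -1;
    list->devs = tmp;
    return 0;
}

/* Problematic variant: 'dev' is a copy, so the caller's pointer still
 * points at memory that is now owned by the list. */
static int dev_list_add_by_value(DevList *list, Dev *dev)
{
    if (dev_list_grow(list) < 0)
        return -1;
    list->devs[list->count++] = dev;
    dev = NULL;                 /* clears only the local copy */
    (void)dev;
    return 0;
}

/* Safer variant: takes the address of the caller's pointer and clears
 * it, so the caller cannot free or reuse the element by accident. */
static int dev_list_add(DevList *list, Dev **dev)
{
    if (dev_list_grow(list) < 0)
        return -1;
    list->devs[list->count++] = *dev;
    *dev = NULL;                /* caller's pointer is cleared too */
    return 0;
}
```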
Erik Skultety [Fri, 28 Apr 2017 05:52:52 +0000 (07:52 +0200)]
util: mdev: Use a local variable instead of a direct pointer access
Use a local variable to hold data, rather than accessing the pointer
after calling virMediatedDeviceListAdd (therefore VIR_APPEND_ELEMENT).
Although not causing an issue at the moment, this change is a necessary
prerequisite for tweaking virMediatedDeviceListAdd in a separate patch,
which will take a reference for the source pointer (instead of pointer
value) and will clear it along the way.
Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Laine Stump <laine@laine.org> Reviewed-by: Pavel Hrdina <phrdina@redhat.com>
Michal Privoznik [Fri, 28 Apr 2017 11:17:04 +0000 (13:17 +0200)]
qemuDomainDetachDeviceUnlink: Don't unlink files we haven't created
Even though there are several checks before calling this function
and for some scenarios we don't call it at all (e.g. on disk hot
unplug), it may be possible to sneak in some weird files (e.g. if
domain would have RNG with /dev/shm/some_file as its backend). No
matter how improbable, we shouldn't unlink it as we would be
unlinking a file from the host which we haven't created in the
first place.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Michal Privoznik [Fri, 28 Apr 2017 08:45:45 +0000 (10:45 +0200)]
qemuDomainCreateDeviceRecursive: Don't try to create devices under preserved mount points
While the code allows devices to already be there (by some
miracle), we shouldn't try to create devices that don't belong to
us. For instance, we shouldn't try to create /dev/shm/file
because /dev/shm is a mount point that is preserved. Therefore if
a file is created there from the outside (e.g. by a mgmt application
or some other daemon running on the system like vhostmd), it
exists in the qemu namespace too as the mount point is the same.
It's only /dev and /dev only that is different. The same
reasoning applies to all other preserved mount points.
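The decision can be sketched roughly like this (a hypothetical helper, not the code from the patch): a device is only created if it lives under /dev and not under any of the preserved mount points.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if 'path' is 'mount' itself or lives somewhere below it.
 * Assumes 'mount' has no trailing slash. */
static bool path_is_under(const char *path, const char *mount)
{
    size_t len = strlen(mount);

    if (strncmp(path, mount, len) != 0)
        return false;
    return path[len] == '\0' || path[len] == '/';
}

/* True if the device belongs to the private /dev, i.e. it is under
 * /dev but not under a preserved mount point such as /dev/shm. */
static bool should_create_device(const char *path,
                                 const char *const *mounts,
                                 size_t nmounts)
{
    if (!path_is_under(path, "/dev"))
        return false;
    for (size_t i = 0; i < nmounts; i++) {
        if (path_is_under(path, mounts[i]))
            return false;
    }
    return true;
}
```

The boundary check on path[len] keeps a name like /dev/shmfoo from being mistaken for something under /dev/shm.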
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Michal Privoznik [Fri, 28 Apr 2017 07:30:23 +0000 (09:30 +0200)]
qemuDomainCreateDeviceRecursive: pass a structure instead of bare path
Currently, all we need to do in qemuDomainCreateDeviceRecursive() is to
take given @device, get all kinds of info on it (major & minor numbers,
owner, seclabels) and create its copy at a temporary location @path
(usually /var/run/libvirt/qemu/$domName.dev), if @device lives under
/dev. This is, however, a very loose condition, as it also means
/dev/shm/* is created too. Therefore, we will need to pass more arguments
into the function for better decision making (e.g. list of mount points
under /dev). Instead of adding more arguments to all the functions (not
easily achievable because some functions are callbacks with a strictly
defined type), let's just turn this one 'const char *' into a 'struct *'.
New "arguments" can be then added at no cost.
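A minimal sketch of the idea, with hypothetical names (the real structure and callbacks in qemu_domain.c may differ): the callback type stays fixed at (device, opaque), and everything the callee needs travels in a struct behind the opaque pointer, so adding a field does not touch any signature.

```c
#include <stddef.h>

/* Callback type with a strictly defined signature (hypothetical). */
typedef int (*DeviceCallback)(const char *device, void *opaque);

/* What used to be a bare 'const char *path' becomes a struct. New
 * "arguments" are added here as fields; all names are illustrative. */
typedef struct {
    const char *path;                  /* temporary /dev location */
    const char *const *devMountsPath;  /* preserved mount points */
    size_t ndevMountsPath;
    size_t created;                    /* counter used by this sketch */
} CreateDeviceData;

/* The callback unpacks the struct from the opaque pointer. */
static int create_device_cb(const char *device, void *opaque)
{
    CreateDeviceData *data = opaque;

    if (!device || !data->path)
        return -1;
    data->created++;    /* a real implementation would create the node */
    return 0;
}

/* Generic iterator that knows nothing about CreateDeviceData. */
static int for_each_device(const char *const *devices, size_t ndevices,
                           DeviceCallback cb, void *opaque)
{
    for (size_t i = 0; i < ndevices; i++) {
        if (cb(devices[i], opaque) < 0)
            return -1;
    }
    return 0;
}
```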
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Michal Privoznik [Thu, 27 Apr 2017 14:29:21 +0000 (16:29 +0200)]
qemuDomainBuildNamespace: Move /dev/* mountpoints later
When setting up mount namespace for a qemu domain the following
steps are executed:
1) get list of mountpoints under /dev/
2) move them to /var/run/libvirt/qemu/$domName.ext
3) start constructing new device tree under /var/run/libvirt/qemu/$domName.dev
4) move the mountpoint of the new device tree to /dev
5) restore original mountpoints from step 2)
Note the problem with this approach is that if some device in step
3) requires access to a mountpoint from step 2) it will fail as
the mountpoint is not there anymore. For instance consider the
following domain disk configuration:
In this case operation fails as we are unable to create vhostmd0
in the new device tree because after step 2) there is no /dev/shm
anymore. Leave aside the fact that we shouldn't try to create devices
living in other mountpoints. That's a separate bug that will be
addressed later.
Currently, the order described above is rearranged to:
1) get list of mountpoints under /dev/
2) start constructing new device tree under /var/run/libvirt/qemu/$domName.dev
3) move them to /var/run/libvirt/qemu/$domName.ext
4) move the mountpoint of the new device tree to /dev
5) restore original mountpoints from step 3)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Cedric Bosdonnat <cbosdonnat@suse.com>
Jiri Denemark [Tue, 2 May 2017 14:39:57 +0000 (16:39 +0200)]
client: Report proper close reason
When we get a POLLHUP or VIR_EVENT_HANDLE_HANGUP event for a client, we
still want to read from the socket to process any accumulated data. But
doing so inevitably results in an error and a call to
virNetClientMarkClose before we get to processing the hangup event (and
another call to virNetClientMarkClose). However the close reason passed
to the second virNetClientMarkClose call is ignored because another one
was already set. We need to pass the correct close reason when marking
the socket to be closed for the first time.
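The "first reason wins" behaviour can be sketched like this (hypothetical types and names; this is not the virNetClient code and not necessarily how the patch is implemented): because later calls cannot override the stored reason, the read-error path has to pass the hangup reason itself when a hangup is pending.

```c
/* Hypothetical close reasons and client state. */
typedef enum {
    CLOSE_REASON_NONE = 0,
    CLOSE_REASON_ERROR,
    CLOSE_REASON_EOF
} CloseReason;

typedef struct {
    int wantClose;
    CloseReason reason;
} Client;

/* Only the first call records a reason; later calls are ignored. */
static void client_mark_close(Client *c, CloseReason reason)
{
    if (!c->wantClose)
        c->reason = reason;
    c->wantClose = 1;
}

/* On hangup we still read from the socket, which fails. The reason
 * handed to the first mark-close call must already reflect the hangup. */
static void client_handle_events(Client *c, int hangup, int read_failed)
{
    if (read_failed)
        client_mark_close(c, hangup ? CLOSE_REASON_EOF : CLOSE_REASON_ERROR);
    if (hangup)
        client_mark_close(c, CLOSE_REASON_EOF);
}
```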
Jiri Denemark [Tue, 2 May 2017 16:01:04 +0000 (18:01 +0200)]
qemu: Fix persistent migration of transient domains
While fixing a bug with incorrectly freed memory in commit v3.1.0-399-g5498aa29a, I accidentally broke persistent migration of
transient domains. Before adding qemuDomainDefCopy in the path, the code
just took NULL from vm->newDef and used it as the persistent def, which
resulted in no persistent XML being sent in the migration cookie. This
scenario is perfectly valid and the destination correctly handles it by
using the incoming live definition and storing it as the persistent one.
After the mentioned commit libvirtd would just segfault in the described
scenario.
If we are encoding a block of data that is 16 bytes in length,
we cannot leave it as 16 bytes, we must pad it out to the next
block boundary, 32 bytes. Without this padding, the decoder will
incorrectly treat the last byte of plain text as the padding
length, as it can't distinguish padded from non-padded data.
The problem exhibited itself when using a 16 byte passphrase
for a LUKS volume
$ virsh start demo
error: Failed to start domain demo
error: internal error: process exited while connecting to monitor: >>>>>>>>>>Len 16
2017-05-02T10:35:40.016390Z qemu-system-x86_64: -object \
secret,id=virtio-disk1-luks-secret0,data=SEtNi5vDUeyseMKHwc1c1Q==,\
keyid=masterKey0,iv=zm7apUB1A6dPcH53VW960Q==,format=base64: \
Incorrect number of padding bytes (56) found on decrypted data
Notice how the padding '56' corresponds to the ordinal value of
the character '8'.
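The arithmetic can be sketched as follows, assuming a 16 byte block size and a scheme where each padding byte stores the padding length (the helper names are hypothetical): input that already sits on a block boundary still gets one full block of padding.

```c
#include <stddef.h>

#define BLOCK_SIZE 16   /* assumed cipher block size in bytes */

/* Length after padding: always strictly greater than 'len', so a
 * 16 byte input becomes 32 bytes rather than staying at 16. */
static size_t padded_len(size_t len)
{
    return (len / BLOCK_SIZE + 1) * BLOCK_SIZE;
}

/* Value written into each padding byte: the number of bytes added. */
static unsigned char pad_value(size_t len)
{
    return (unsigned char)(padded_len(len) - len);
}
```

With this rule the decoder can always trust the last byte to be a padding length between 1 and 16, which is what makes padded and non-padded data distinguishable.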
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating v3.2.0-77-g8be3ccd04 commit, I completely forgot that one
migration capability is very special. It's the "events" capability which
tells QEMU to report "MIGRATION" events. Since libvirt always wants the
events, it is enabled in qemuConnectMonitor and the rest of the code
should not touch it.
The virsh command 'domblkinfo' returns the capacity, allocation and physical
size of the devices attached to a domain. Usually, these sizes are very big
and hard to read and compare. This commit introduces human-readable
output to make the size of each field easier to check.
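A rough sketch of such a conversion (a hypothetical helper, not the function virsh uses; the unit names and the three decimal places are assumptions): the byte count is divided by 1024 until it fits the largest suitable unit.

```c
#include <stddef.h>
#include <stdio.h>

/* Format 'bytes' into 'buf' using binary units, e.g. "1.500 KiB". */
static void format_size(unsigned long long bytes, char *buf, size_t buflen)
{
    static const char *const units[] = {
        "B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"
    };
    const size_t last = sizeof(units) / sizeof(units[0]) - 1;
    double val = (double)bytes;
    size_t i = 0;

    /* Step up one unit at a time while the value is still >= 1024. */
    while (val >= 1024.0 && i < last) {
        val /= 1024.0;
        i++;
    }
    snprintf(buf, buflen, "%.3f %s", val, units[i]);
}
```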
Signed-off-by: Julio Faracco <jcfaracco@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Laine Stump [Mon, 27 Mar 2017 01:57:54 +0000 (21:57 -0400)]
conf: don't ignore <target dev='blah'/> for macvtap interfaces
The parser had been clearing out *all* suggested device names for
type='direct' (aka macvtap) interfaces. All of the code implementing
macvtap allows for a user-specified device name, so we should allow
it. In the case that an interface name starts with "macvtap" or
"macvlan" though, we do still clear it out, just as we do with "vnet"
(which is the prefix used for automatically generated tap device
names), since those are the prefixes for the names we autogenerate for
macvtap and macvlan devices.
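The prefix test described above might look roughly like this (a sketch with hypothetical function names; the actual parser code and the #define names are not shown in this log): a user-supplied name is kept unless it collides with one of the prefixes reserved for autogenerated devices.

```c
#include <stdbool.h>
#include <string.h>

/* True if 's' starts with 'prefix'. */
static bool has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* True if the name looks autogenerated and should be cleared so a
 * fresh one is assigned when the device is created. */
static bool ifname_is_autogenerated(const char *name)
{
    return has_prefix(name, "vnet") ||
           has_prefix(name, "macvtap") ||
           has_prefix(name, "macvlan");
}
```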
Laine Stump [Tue, 25 Apr 2017 18:09:45 +0000 (14:09 -0400)]
util: make macvtap/macvlan generated name #defines available to other files
MACVTAP_NAME_PREFIX and MACVLAN_NAME_PREFIX could be useful to other
files if they were defined in virnetdevmacvlan.h instead of
virnetdevmacvlan.c, so do that (while slightly renaming them and also
adding yet another #define that chooses between macvlan/macvtap based
on flags).
This is a prerequisite to fix: https://bugzilla.redhat.com/1335798
Laine Stump [Tue, 25 Apr 2017 16:26:43 +0000 (12:26 -0400)]
network: better log message when network is inactive during reconnect
If the network isn't active during networkNotifyActualDevice(), we
would log an error message stating that the bridge device didn't
exist. This patch adds a check to see if the network is active, making
the logs more useful in the case that it isn't.
Laine Stump [Tue, 25 Apr 2017 16:20:30 +0000 (12:20 -0400)]
qemu: don't kill qemu process on restart if networkNotify fails
Nothing that could happen during networkNotifyActualDevice() could
justify unceremoniously killing the qemu process, but that's what we
were doing.
In particular, new code added in commit 85bcc022 (first appeared in
libvirt-3.2.0) attempts to reattach tap devices to their assigned
bridge devices when libvirtd restarts (to make it easier to recover
from a restart of a libvirt network). But if the network has been
stopped and *not* restarted, the bridge device won't exist and
networkNotifyActualDevice() will fail.
This patch changes networkNotifyActualDevice() and
qemuProcessNotifyNets() to return void, so that qemuProcessReconnect()
will soldier on regardless of what happens (any errors will still be
logged though).
After 1eb6647979f8c nobody calls the iohelper with 6 arguments.
Everybody uses the other mode. Well, the only user of iohelper
after the previous commit is virFileWrapperFd really.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Currently we use iohelper for virFDStream implementation. This is
because UNIX I/O can lie sometimes: even though a FD for a
file/block device is set as non-blocking, actual read()/write() can
block. To avoid this, a pipe is created and one end is kept for
read/write while the other is handed over to iohelper to
write/read the data for us. Thus it's iohelper which gets blocked
and not our event loop.
This approach has two problems:
1) we are spawning a new process.
2) any exchange of information between daemon and iohelper can be
done only through the pipe.
Therefore, iohelper is replaced with an implementation in a thread
which is created just for the stream lifetime. The data are still
transferred through pipe (for now), but both problems described
above are solved.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Pavel Hrdina [Thu, 27 Apr 2017 15:41:56 +0000 (17:41 +0200)]
qemu: change the logic of setting default USB controller
The new logic will set the piix3-uhci if available regardless of
any architecture and it will be updated to a better model based on
architecture and device existence.
Signed-off-by: Pavel Hrdina <phrdina@redhat.com> Acked-by: Andrea Bolognani <abologna@redhat.com>
Peter Krempa [Thu, 20 Apr 2017 09:16:42 +0000 (11:16 +0200)]
qemu: Don't fail if physical size can't be updated in qemuDomainGetBlockInfo
Since commit c5f6151390 qemuDomainBlockInfo tries to update the
"physical" storage size for all network storage and not only block
devices.
Since the storage driver APIs to do this are not implemented for certain
storage types (RBD, iSCSI, ...) the code would fail to retrieve any data
since the failure of qemuDomainStorageUpdatePhysical is fatal.
Since it's desired to return data even if the total size can't be
updated we need to ignore errors from that function and return plausible
data.
Peter Krempa [Wed, 26 Apr 2017 07:57:39 +0000 (09:57 +0200)]
qemu: process: Don't leak priv->usbaddrs after VM restart
Since the private data structure is not freed upon stopping a VM, the
usbaddrs pointer would be leaked:
==15388== 136 (16 direct, 120 indirect) bytes in 1 blocks are definitely lost in loss record 893 of 1,019
==15388== at 0x4C2CF55: calloc (vg_replace_malloc.c:711)
==15388== by 0x54BF64A: virAlloc (viralloc.c:144)
==15388== by 0x5547588: virDomainUSBAddressSetCreate (domain_addr.c:1608)
==15388== by 0x144D38A2: qemuDomainAssignUSBAddresses (qemu_domain_address.c:2458)
==15388== by 0x144D38A2: qemuDomainAssignAddresses (qemu_domain_address.c:2515)
==15388== by 0x144ED1E3: qemuProcessPrepareDomain (qemu_process.c:5398)
==15388== by 0x144F51FF: qemuProcessStart (qemu_process.c:5979)
[...]
Peter Krempa [Tue, 25 Apr 2017 13:17:34 +0000 (15:17 +0200)]
qemu: process: Clean automatic NUMA/cpu pinning information on shutdown
Clean the stale data after shutting down the VM. Otherwise the data
would be leaked on next VM start. This happens due to the fact that the
private data object is not freed on destroy of the VM.
Wim ten Have [Mon, 24 Apr 2017 13:07:00 +0000 (15:07 +0200)]
xenconfig: add conversions for xen-xl
Per xen-xl conversions from and to native under host-passthrough
mode we take care for Xen (nestedhvm = mode) applied and inherited
settings generating or processing correct feature policy:
This patch maps /domain/cpu/cache element into -cpu parameters:
- <cache mode='passthrough'/> is translated to host-cache-info=on
- <cache level='3' mode='emulate'/> is transformed into l3-cache=on
- <cache mode='disable'/> is turned into host-cache-info=off,l3-cache=off
Any other <cache> element is forbidden.
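The mapping in the list above can be sketched as a small lookup (a hypothetical helper, not the QEMU driver code; a level of -1 stands for "no level attribute", which is an assumption of this sketch):

```c
#include <stddef.h>
#include <string.h>

/* Translate a <cache> element into -cpu properties.
 * 'level' is -1 when the attribute is absent (sketch convention).
 * Returns NULL for any combination that is forbidden. */
static const char *cpu_cache_to_props(const char *mode, int level)
{
    if (strcmp(mode, "passthrough") == 0 && level == -1)
        return "host-cache-info=on";
    if (strcmp(mode, "emulate") == 0 && level == 3)
        return "l3-cache=on";
    if (strcmp(mode, "disable") == 0 && level == -1)
        return "host-cache-info=off,l3-cache=off";
    return NULL;
}
```

This covers only the three listed translations; it does not model the QEMU version probing discussed in the rest of the message.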
The tricky part is detecting whether QEMU supports the CPU properties.
The 'host-cache-info' property was introduced in v2.4.0-1389-ge265e3e480,
earlier QEMU releases enabled host-cache-info by default and had no way
to disable it. If the property is present, it defaults to 'off' for any
QEMU until at least 2.9.0.
The 'l3-cache' property was introduced later by v2.7.0-200-g14c985cffa.
Earlier versions worked as if l3-cache=off was passed. For any QEMU
until at least 2.9.0 l3-cache is 'off' by default.
QEMU 2.9.0 was the first release which supports probing both properties
by running device-list-properties with typename=host-x86_64-cpu. Older
QEMU releases did not support device-list-properties command for CPU
devices. Thus we can't really rely on probing them and we can just use
query-cpu-model-expansion QMP command as a witness.
Because the cache property probing is only reliable for QEMU >= 2.9.0
when both are already supported for quite a few releases, we let QEMU
report an error if a specific cache mode is explicitly requested. The
other mode (or both if a user requested CPU cache to be disabled) is
explicitly turned off for QEMU >= 2.9.0 to avoid any surprises in case
the QEMU defaults change. Any older QEMU already turns them off so not
doing so explicitly does not do any harm.
Not all async jobs are visible via virDomainGetJobStats (either they are
too fast or getting the stats is not allowed during the job), but
forcing all of them to advertise the operation is easier than hunting
the jobs for which fetching statistics is allowed. And we won't need to
think about this when we add support for getting stats for more jobs.
The parameter is reported by virDomainGetJobStats API and
VIR_DOMAIN_EVENT_ID_JOB_COMPLETED event and it can be used to identify
the operation (migration, snapshot, ...) to which the reported
statistics belong.
Eric Farman [Wed, 26 Apr 2017 21:10:01 +0000 (17:10 -0400)]
qemu: Remove extra messages for vhost-scsi hotplug
As with virtio-scsi, the "internal error" messages after
preparing a vhost-scsi hostdev overwrite more meaningful
error messages deeper in the callchain. Remove them too.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Eric Farman [Wed, 26 Apr 2017 21:10:00 +0000 (17:10 -0400)]
qemu: Remove extra messages from virtio-scsi hotplug
I tried to attach a SCSI LUN to two different guests, and forgot
to specify "shareable" in the hostdev XML. Attaching the device
to the second guest failed, but the message was not helpful in
telling me what I was doing wrong:
$ virsh attach-device dasd_fedora_0e1e scsi_scratch_disk.xml
error: Failed to attach device from scsi_scratch_disk.xml
error: internal error: Unable to prepare scsi hostdev: scsi_host3:0:15:1074151456
I eventually discovered my error, but thought it was weird that
Libvirt doesn't provide something more helpful in this case.
Looking over the code we had just gone through, I commented out
the "internal error" message, and got something more useful:
$ virsh attach-device dasd_fedora_0e1e scsi_scratch_disk.xml
error: Failed to attach device from scsi_scratch_disk.xml
error: Requested operation is not valid: SCSI device 3:0:15:1074151456 is already in use by other domain(s) as 'non-shareable'
Looking over the error paths here, we seem to issue better
messages deeper in the callchain so these "internal error"
messages overwrite any of them. Remove them, so that the
more detailed errors are seen.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
qemu: migration: fix race on cancelling drive mirror
Commit 0feebab2 added a call to qemuBlockNodeNamesDetect for completed
jobs when updating block jobs. This affects the logic for cancelling
drive mirror, as this function drops the vm lock. Now we have to recheck
all disks before the disk with the completed block job before going
to wait for block job events.
Peter Krempa [Wed, 26 Apr 2017 07:01:30 +0000 (09:01 +0200)]
qemu: numa: Don't return automatic nodeset for inactive domain
qemuDomainGetNumaParameters would return the automatic nodeset even for
the persistent config if the domain was running. This is incorrect since
the automatic nodeset will be re-queried upon starting the vm.
While peer-to-peer migration enters the Confirm phase even if the
Perform phase fails, the client which initiated a non-p2p migration will
never call virDomainMigrateConfirm* API if the Perform phase failed.
Thus we need to explicitly reset migration before reporting a failure
from the Perform phase API.
Migration with old QEMU which does not support query-migrate-parameters
would fail because the QMP command is called unconditionally since the
introduction of TLS migration. Previously it was only called if the user
explicitly requested a feature which uses QEMU migration parameters. And
even then the situation was not ideal: instead of reporting an
unsupported feature we'd just complain about a missing QMP command.
Trivially no migration parameters are supported when
query-migrate-parameters QMP command is missing. There's no need to
report an error if it is missing, the callers will report a better error
if needed.
John Ferlan [Mon, 3 Apr 2017 13:03:47 +0000 (09:03 -0400)]
secret: Split apart NumOfSecrets and GetUUIDs callback function
Rather than overloading one function - split apart the logic to have
separate interfaces and local/private structures to manage the data
that the helper is collecting.
John Ferlan [Wed, 19 Apr 2017 12:34:04 +0000 (08:34 -0400)]
secret: Change variable names for list traversals
Rather than 'nuuids' it should be 'maxuuids' and rather than 'got'
it should be 'nuuids'. Alter the logic of the list traversal to
utilize those names.
John Ferlan [Sat, 1 Apr 2017 15:46:36 +0000 (11:46 -0400)]
secret: Use consistent naming for variables
When processing a virSecretPtr use 'secret' as a variable name.
When processing a virSecretObjPtr use 'obj' as a variable name.
When processing a virSecretDefPtr use 'def' as a variable name,
unless a distinction needs to be made with a 'newdef' such as
virSecretObjListAddLocked (which also used the VIR_STEAL_PTR macro
for the configFile and base64File).
John Ferlan [Thu, 20 Apr 2017 15:51:22 +0000 (11:51 -0400)]
nwfilter: Move save of config until after successful assign
Only save the config when using a generated UUID if we were able to
create an object for the def. There could have been "other reasons"
for the assignment to fail, so saving the config could be incorrect.