Andrea Bolognani [Mon, 20 Jul 2015 16:37:24 +0000 (18:37 +0200)]
nodeinfo: Add old kernel compatibility to nodeGetPresentCPUBitmap()
If the cpu/present file is not available, we assume that the kernel
is too old to support non-consecutive CPU ids and return a bitmap
with all the bits set to represent this fact. This assumption is
already exploited in nodeGetCPUCount().
This means users of this API can expect the information to always
be available unless an error has occurred, and no longer need to
treat the NULL return value as a special case.
Andrea Bolognani [Mon, 20 Jul 2015 16:37:23 +0000 (18:37 +0200)]
nodeinfo: Rename linuxParseCPUmax() to linuxParseCPUCount()
The original name was confusing because the function returns the number
of CPUs, not the maximum CPU id. The comment above the function has
been updated to reflect this.
During the recent refactoring/cleanups, a bug has been introduced
that caused all CPUs to be reported as online unless the sysfs
cpu/present file was available.
This commit fixes the fallback code path by building the directory
path passed to virNodeGetCpuValue() correctly.
Andrea Bolognani [Fri, 17 Jul 2015 16:12:48 +0000 (18:12 +0200)]
tests: Restore links in deconfigured-cpus nodeinfo test
When cleaning up the data (taken from a running system) for inclusion
I went a little too far and deleted a bunch of links that should have
been left alone. The test worked despite this because it was going
through a fallback code path.
A few other files are affected as well: again, the data is taken from
a running system, so even thought we would probably be okay if we
just added the links, aligning everything is definitely safer.
Peter Krempa [Thu, 16 Jul 2015 13:35:05 +0000 (15:35 +0200)]
cgroup: Drop resource partition from virSystemdMakeScopeName
The scope name, even according to our docs is
"machine-$DRIVER\x2d$VMNAME.scope" virSystemdMakeScopeName would use the
resource partition name instead of "machine-" if it was specified thus
creating invalid scope paths.
This makes libvirt drop cgroups for a VM that uses custom resource
partition upon reconnecting since the detected scope name would not
match the expected name generated by virSystemdMakeScopeName.
The error is exposed by the following log entry:
debug : virCgroupValidateMachineGroup:302 : Name 'machine-qemu\x2dtestvm.scope' for controller 'cpu' does not match 'testvm', 'testvm.libvirt-qemu' or 'machine-test-qemu\x2dtestvm.scope'
As of fedora polkit-0.113-2, polkit-devel only pulls in polkit-libs, not
full polkit, but we need the latter for pkcheck otherwise our configure
test fails.
Peter Krempa [Mon, 13 Jul 2015 15:04:49 +0000 (17:04 +0200)]
virsh: Refactor block job waiting in cmdBlockCommit
Reuse the vshBlockJobWait infrastructure to refactor cmdBlockCommit to
use the common code. This additionally fixes a bug when working with
new qemus, where when doing an active commit with --pivot the pivoting
would fail, since qemu reaches 100% completion but the job doesn't
switch to synchronized phase right away.
Peter Krempa [Mon, 13 Jul 2015 15:04:49 +0000 (17:04 +0200)]
virsh: Refactor block job waiting in cmdBlockPull
Introduce helper function that will provide logic for waiting for block
job completion so the 3 open coded places can be unified and improved.
This patch introduces the whole logic and uses it to fix
cmdBlockJobPull. The vshBlockJobWait function provides common logic for
block job waiting that should be robust enough to work across all
previous versions of libvirt. Since virsh allows passing user-provided
strings as paths of block devices we can't reliably use block job events
for detection of block job states so the function contains a great deal
of fallback logic.
Peter Krempa [Wed, 15 Jul 2015 13:11:02 +0000 (15:11 +0200)]
qemu: Update state of block job to READY only if it actually is ready
Few parts of the code looked at the current progress of and assumed that
a two phase blockjob is in the _READY state as soon as the progress
reached 100% (info.cur == info.end). In current versions of qemu this
assumption is invalid and qemu exposes a new flag 'ready' in the
query-block-jobs output that is set to true if the job is actually
finished.
This patch adds internal data handling for reading the 'ready' flag and
acting appropriately as long as the flag is present.
While this still doesn't fix the virsh client problem with two phase
block jobs and the --pivot option, it at least improves the error
message:
$ virsh blockcommit --wait --verbose vm vda --base vda[1] --active --pivot
Block commit: [100 %]error: failed to pivot job for disk vda
error: internal error: unable to execute QEMU command 'block-job-complete': The active block job for device 'drive-virtio-disk0' cannot be completed
to
$ virsh blockcommit --wait --verbose VM vda --base vda[1] --active --pivot
Block commit: [100 %]error: failed to pivot job for disk vda
error: block copy still active: disk 'vda' not ready for pivot yet
Peter Krempa [Mon, 13 Jul 2015 14:23:59 +0000 (16:23 +0200)]
virsh: Refactor argument checking in cmdBlockCommit
Use the VSH_EXCLUSIVE_OPTIONS to exclude combinations of --pivot and
--keep-overlay and refactor the enforcing of the --wait option and other
flags that imply --wait.
Peter Krempa [Wed, 1 Apr 2015 13:05:33 +0000 (15:05 +0200)]
virsh: cmdBlockJob: Switch to declarative flag interlocking
Use the VSH_EXCLUSIVE_OPTIONS_VAR to interlock incompatible options.
Since a variable named 'abort' would conflict with older compilers use
VSH_EXCLUSIVE_OPTIONS for the --abort option.
Moshe Levi [Sun, 19 Jul 2015 10:11:07 +0000 (13:11 +0300)]
nodedev: add RDMA and tx-udp_tnl-segmentation NIC capabilities
Adding functionality to libvirt that will allow
it query the interface for the availability of RDMA and
tx-udp_tnl-segmentation Offloading NIC capabilities
qemu: Reject updating unsupported disk information
If one calls update-device with information that is not updatable,
libvirt reports success even though no data were updated. The example
used in the bug linked below uses updating device with <boot order='2'/>
which, in my opinion, is a valid thing to request from user's
perspective. Mainly since we properly error out if user wants to update
such data on a network device for example.
And since there are many things that might happen (update-device on disk
basically knows just how to change removable media), check for what's
changing and moreover, since the function might be usable in other
drivers (updating only disk path is a valid possibility) let's abstract
it for any two disks.
We can't possibly check for everything since for many fields our code
does not properly differentiate between default and unspecified values.
Even though this could be changed, I don't feel like it's worth the
complexity so it's not the aim of this patch.
After upgrade to perl-5.22.0, it started complaining about one of our
scripts. The thing is that even though it works, it wants all curly
brackets escaped properly. The change is not functional, it merely gets
rid of the following error:
Unescaped left brace in regex is deprecated, passed through in regex;
marked by <-- HERE in m/^enum { <-- HERE / at -e line 3.
There is one more error like this that I'm getting, but it is because of
GNU automake bug #21001:
storage: Fix pool building when directory already exists
Currently, when trying to virsh pool-define/virsh pool-build a new
'dir' pool, if the target directory already exists, virsh
pool-build/virStoragePoolBuild will error out. This is a change of
behaviour compared to eg libvirt 1.2.13
This is caused by the wrong type being used for the dir_create_flags
variable in virStorageBackendFileSystemBuild , it's defined as a bool
but is used as a flag bit field so should be unsigned int (this matches
the type virDirCreate expects for this variable).
This should fix https://bugzilla.gnome.org/show_bug.cgi?id=752417 (GNOME
Boxes) and https://bugzilla.redhat.com/show_bug.cgi?id=1244080
(downstream virt-manager).
rpc: ensure daemon is spawn even if dead socket exists
The auto-spawn code would originally attempt to spawn the
daemon for both ENOENT and ECONNREFUSED errors from connect().
The various refactorings eventually lost this so we only
spawn the daemon on ENOENT. The result is if the daemon exits
uncleanly, so that the socket is left in the filesystem, we
will never be able to auto-spawn the daemon again.
docs: Document how libvirt handles companion controllers
The information on companion controllers we give in our documentation is
rather sparse. For example, it looks like any controller can be used as
a companion one. Also, when using ich9-uhci2, for example, we are able
to set some sensible defaults, but it might get confusing for the user
as we don't do that for all controller models.
John Ferlan [Thu, 16 Jul 2015 16:28:58 +0000 (12:28 -0400)]
rbd: Return error from rbd_create for message processing
Resolving an error reporting bug introduced by commit id '761491e' which
just took the return of virStorageBackendRBDCreateImage and used it as
the basis for the message generated. This would generate EPERM regardless
of error seen.
Commit ed8155eafbff5c5ca0bdfe84a8388f58b718c2f9 documented that
mhz field in virNodeInfo might be 0 if the frequency is unknown. Modify
virsh to know about that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
This patch adds a test for the qemu command line generation.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> Reviewed-by: Jason J. Herne <jjherne@us.ibm.com> Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
qemu: Make virtio-9p-ccw the default for s390-ccw-virtio machines
For s390-ccw-virtio machines the default bus type is set to ccw.
Specifing an address element allows to override the default.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> Reviewed-by: Jason J. Herne <jjherne@us.ibm.com> Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Adding the recently in qemu added 9pfs support for virtio-ccw.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> Reviewed-by: Jason J. Herne <jjherne@us.ibm.com> Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com>
Some drivers don't expose available huge page sizes in the
capabilities XML. For instance, LXC driver is one of those.
This has a downside that when virsh is trying to get
aggregated info on free pages per all NUMA nodes, it fails.
The problem is that the virNodeGetFreePages() API expects
caller to pass an array of page sizes he is interested in.
In virsh, this array is filled from the capabilities from
'/capabilities/host/cpu/pages' XPath. As said, in LXC
there's no such XPath and therefore virsh fails currently.
But hey, we can fallback: the page sizes are exposed under
'/capabilities/host/topology/cells/cell/pages'. The page
size can be collected from there, and voilà the command
works again. But now we must make sure that there are no
duplicates in the array passed to the public API. Otherwise
we won't get as beautiful output as we are getting now.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
which can never be true since VIR_DOMAIN_AFFECT_CURRENT has hardcoded
value of zero. Therefore virDomainIsActive() is a dead code. However,
the condition could make sense if it is rewritten as the following:
Michal Privoznik [Tue, 14 Jul 2015 13:42:50 +0000 (15:42 +0200)]
qemuMigrationRun: Don't leak @fd
If we are migrating to an UNIX socket, we accept() a connection
from qemu and use that FD to set up a tunnel. However, the FD is
not closed as often as it should be.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Tue, 14 Jul 2015 09:40:13 +0000 (11:40 +0200)]
nodeinfo: Make sysfs_prefix usage more consistent
Make sure sysfs_prefix, when present, is always the first argument
to a function; don't use a different name to refer to it; check
whether it is NULL, and hence SYSFS_SYSTEM_PATH should be used, only
when using it directly and not just passing it down to another
function; always pass down the same value we've been passed when
calling another function.
Peter Krempa [Tue, 30 Jun 2015 14:31:24 +0000 (16:31 +0200)]
qemu: process: Improve update of maximum balloon state at startup
In commit 641a145d73fdc3dd9350fd57b3d3247abf101c05 I've added code that
resets the balloon memory value to full size prior to resuming the vCPUs
since the size certainly was not reduced at that point.
Since qemuProcessStart is used also in code paths with already booted
up guests (migration, save/restore) the assumption is not entirely true
since the guest might already been running before.
This patch adds a function that queries the monitor rather than using
the full size since a balloon event would not be reissued in case we are
recovering a saved migration state.
Additionally the new function is used also when reconnecting to a VM
after libvirtd restart since we might have missed a few balloon events
while libvirtd was not running.
In one of my previous ptaches (bcd9a564) I've tried to fix the problem
that we blindly assumed strict NUMA mode for guests. This led to
several problems like us pinning a domain onto a nodeset via libnuma
among with CGroups. Once the nodeset was changed by user, well, it did
not result in desired effect. See the original commit for more info.
But, the commit I wrote had a bug: when NUMA parameters are changed on
a running domain we require domain to be strictly pinned onto a
nodeset. Due to a typo a condition was mis-evaluated.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Andrea Bolognani [Mon, 13 Jul 2015 16:37:37 +0000 (18:37 +0200)]
tests: Add nodeinfo test for non-present CPUs
Some of the possible CPUs in a system might not be present, eg. they
might be defective or might have been deconfigured from the ASM console
in a Power system. Due to this fact, Linux keeps track of what CPUs are
possible and what are present separately.
This test uses the data from a system where not all the possible CPUs
are present to make sure libvirt handles this situation correctly.
nodeinfo: fix to parse present cpus rather than possible cpus
This patch resolves a situation where a core is defective and is not
in the present mask during boot. Optionally a host can have empty sockets
could be brought online if the socket is added. In this case the present
mask contains the cpu's that are actually there in the sockets even though
they might be offline for some reason. This patch excludes the cpu's that
are offline because the socket is defective/empty by checking the present
mask before reading the cpu directory. Otherwise, the nodeinfo on such
hosts always displays wrong output which includes the defective/empty
sockets as set of offline cpu's.
John Ferlan [Tue, 7 Jul 2015 21:22:28 +0000 (17:22 -0400)]
nodeinfo: Add sysfs_prefix to nodeCapsInitNUMA
Add the sysfs_prefix argument to the call to allow for setting the
path for tests to something other than SYSFS_CPU_PATH which is a
derivative of SYSFS_SYSTEM_PATH
Use cpupath for nodeCapsInitNUMAFake and remove SYSFS_CPU_PATH
Michal Privoznik [Fri, 10 Jul 2015 14:32:00 +0000 (17:32 +0300)]
virt-driver-vz: Require parallels-7.0.22 at least
With the latest patch to the vz driver (7d73ca06cefe) I was
getting some compilation errors. It turned out, my installation
of the parallels SDK was not as fresh as it could be. Parallels
installed in my system were missing the
PRL_USE_VNET_NAME_FOR_BRIDGE_NAME symbol which simply was not
introduced at the time I was installing the SDK. The symbol was
introduced in 86e62a5d which was then part of the 7.0.22 release.
Require that version at least therefore.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Mon, 13 Jul 2015 12:15:03 +0000 (14:15 +0200)]
qemuProcessHandleMigrationStatus: Update migration status more frequently
After Jirka's migration patches libvirt is listening on migration
events from qemu instead of actively polling on the monitor. There is,
however, a little regression (introduced in 6d2edb6a42d0d41). The
problem is, the current status of migration job is updated in
qemuProcessHandleMigrationStatus if and only if migration job was
started. But eventually every asynchronous job may result in
migration. Therefore, since this job is not strictly a
migration job, internal state was not updated and later checks failed:
virsh # save fedora22 /tmp/fedora22_ble.save
error: Failed to save domain fedora22 to /tmp/fedora22_ble.save
error: operation failed: domain save job: is not active
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
We create a virtual network of special type, which
has the same name as bridge name to create bridged
network adapter in vz. So when we delete such an
adapter we have to remove corresponding virtual
network.
So let's rename prlsdkDelNet to prlsdkCleanupBridgedNet
and don't check for return value.
qemu: Use error from Finish instead of "unexpectedly failed"
When QEMU exits on destination during migration, the source reports
either success (if the failure happened at the very end) or unhelpful
"unexpectedly failed" error message. However, the Finish API called on
the destination may report a real error so let's use it instead of the
generic one.
virDomainMigrateFinish* APIs were unfortunately designed to return the
pointer to the domain on destination and NULL on error. This looks OK in
normal cases but the same API is also called when we know migration
failed and thus we expect Finish to return NULL even if it actually did
all it was supposed to do without any error. The call is defined to
return nonnull domain pointer over RPC, which means returning NULL will
always result in an error being send. If this was not in fact an error,
the API itself wouldn't set anything to the thread local virError, which
makes the RPC layer come up with it's own "Library function returned
error but did not set virError" error.
This is quite confusing and also hard to detect by the caller. This
patch adds a special error code which can be used to check that Finish
successfully aborted migration.
If QEMU fails during incoming migration, the domain disappears including
a possibly useful error message read from QEMU log file. Let's remember
the error in virQEMUDriver so that Finish can report more than just "no
such domain".
This is a self-locking wrapper around virHashTable. Only a limited set
of APIs are implemented now (the ones which are used in the following
patch) as more can be added on demand.
Cédric Bosdonnat [Thu, 25 Jun 2015 08:36:52 +0000 (10:36 +0200)]
Get more libvirt errors from virt-aa-helper
Initializing libvirt log in virt-aa-helper and getting it to output
libvirt log to stderr. This will help debugging problems happening in
libvirt functions called from within virt-aa-helper
Daemon used false logic for determining whether there were any clients.
When the timer was inactive, it was activated if at least one of the
servers did not have clients. So the bool was being flipped there and
back all the time in case there was one client, for example.
Peter Krempa [Wed, 8 Jul 2015 14:10:05 +0000 (16:10 +0200)]
qemu: Check duplicate WWNs also for hotplugged disks
In commit 714b38cb232bcbbd7487af4c058fa6d0999b3326 I tried to avoid
having two disks with the same WWN in a VM. I forgot to check the
hotplug paths though which make it possible bypass that check. Reinforce
the fix by checking the wwn when attaching the disk.
Prerna Saxena [Fri, 26 Jun 2015 11:43:26 +0000 (17:13 +0530)]
Fix cloning of raw, sparse volumes
When virsh vol-clone is attempted on a raw file where capacity > allocation,
the resulting cloned volume has a size that matches the virtual-size of
the parent; in place of matching its actual, disk size.
This patch fixes the cloned disk to have same _allocated_size_ as
the parent file from which it was cloned.
Ján Tomko [Fri, 3 Jul 2015 10:47:02 +0000 (12:47 +0200)]
Rewrite allocation tracking when cloning volumes
Instead of storing the remaining bytes, store the position of the first
unallocated byte. This will allow changing the amount of bytes copied
by virStorageBackendCopyToFD without changing the safezero call.
Libvirt's error messages do not end with a LF. However, when reading the
error from QEMU log, we would read the LF from the log and keep it in
the message.
Jiri Denemark [Fri, 29 May 2015 06:38:44 +0000 (08:38 +0200)]
qemu: Wait for migration events on domain condition
Since we already support the MIGRATION event, we just need to make sure
the domain condition is signalled whenever a p2p connection drops or the
domain is paused due to IO error and we can avoid waking up every 50 ms
to check whether something happened.
Jiri Denemark [Tue, 26 May 2015 11:42:06 +0000 (13:42 +0200)]
qemuDomainGetJobStatsInternal: Support migration events
When QEMU supports migration events the qemuDomainJobInfo structure will
no longer be updated with migration statistics. We have to enter a job
and explicitly ask QEMU every time virDomainGetJob{Info,Stats} is
called.
Even if QEMU supports migration events it doesn't send them by default.
We have to enable them by calling migrate-set-capabilities. Let's enable
migration events everytime we can and clear QEMU_CAPS_MIGRATION_EVENT in
case migrate-set-capabilities does not support events.
qemu: don't use initialized ret in qemuRemoveSharedDevice
This fixes
CC qemu/libvirt_driver_qemu_impl_la-qemu_conf.lo
qemu/qemu_conf.c: In function 'qemuRemoveSharedDevice':
qemu/qemu_conf.c:1384:9: error: 'ret' may be used uninitialized in this function [-Werror=maybe-uninitialized]
Pavel Hrdina [Mon, 29 Jun 2015 14:19:44 +0000 (16:19 +0200)]
qemu_hotplug: try harder to eject media
Some guests lock the tray and QEMU eject command will simply fail to
eject the media. But the guest OS can handle this attempt to eject the
media and can unlock the tray and open it. In this case, we should try
again to actually eject the media.
If the first attempt fails to detect a tray_open we will fail with
error, from monitor. If we receive that event, we know, that the guest
properly reacted to the eject request, unlocked the tray and opened it.
In this case, we need to run the command again to actually eject the
media from the device. The reason to call it again is, that QEMU
doesn't wait for the guest to react and report an error, that the tray
is locked.
Pavel Hrdina [Mon, 29 Jun 2015 14:18:53 +0000 (16:18 +0200)]
monitor: detect that eject fails because the tray is locked
Modify the eject monitor functions to parse the return code and detect,
whether the error contains "is locked" to report this type of failure to
upper layers.
Commit id 'e0e290552' added a check to determine if the same bus
had the same target value. It seems that's not quite good enough
as the check should check the target name value regardless of bus type.
Also added a DO_TEST_DIFFERENT to exhibit the issue
Erik Skultety [Thu, 9 Jul 2015 09:17:12 +0000 (11:17 +0200)]
storage: Revert volume obj list updating after volume creation (4749d82a)
This patch reverts commit 4749d82a which tried to tweak the logic in
volume creation. We did realloc and update our object list before we executed
volume building within a specific storage backend. If that failed, we
had to update (again) our object list to the original state as it was before the
build and delete the volume from the pool (even though it didn't exist - this
truly depends on the backend).
I misunderstood the base idea to be able to poll the status of the volume
creation using vol-info. After commit 4749d82a this wasn't possible
anymore, although no BZ has been reported yet.
Commit 4749d82a also claimed to fix
https://bugzilla.redhat.com/show_bug.cgi?id=1223177, but commit c8be606b of the
same series as 4749d82ad (which was more of a refactor than a fix)
fixes the same issue so the revert should be pretty straightforward.
Further more, BZ https://bugzilla.redhat.com/show_bug.cgi?id=1241454 can be
fixed with this revert.
John Ferlan [Tue, 16 Jun 2015 15:34:06 +0000 (11:34 -0400)]
qemu: Inline qemuGetHostdevPath
Since a future patch will need the device path generated when adding a
shared host device, remove the qemuAddSharedHostdev and inline the two
calls into qemuAddSharedHostdev and qemuRemoveSharedHostdev
John Ferlan [Mon, 6 Jul 2015 17:08:32 +0000 (13:08 -0400)]
qemu: Refactor qemuCheckSharedDisk to create qemuCheckUnprivSGIO
Split out the current function in order to share the code with hostdev
in a future patch. Failure to match the expected sgio value against what
is stored will cause an error which the caller would need to handle since
only the caller has the disk (or eventually hostdev) specific data in
order to uniquely identify the disk in an error message.
Jim Fehlig [Tue, 7 Jul 2015 18:29:24 +0000 (12:29 -0600)]
libxl: rework setting the state of virDomainObj
Set the state of virDomainObj in the functions that
actually change the domain state, instead of the generic
libxlDomainCleanup function. This approach gives functions
calling libxlDomainCleanup more flexibility wrt when and
how they change virDomainObj state via virDomainObjSetState.
The prior approach of calling virDomainObjSetState in
libxlDomainCleanup resulted in the following incorrect
coding pattern in the various functions that change
domain state
libxlDomain<DoStateTransition>
call libxl function to do state transition
emit lifecycle event
libxlDomainCleanup
virDomainObjSetState
Once simple manifestation of this bug is seeing a domain
running in virt-manager after selecting the shutdown button,
even after the domain has long shutdown.