Peter Krempa [Mon, 10 Jun 2013 14:05:45 +0000 (16:05 +0200)]
qemu: Cancel migration if guest encoutners I/O error while migrating
During a live migration the guest may receive a disk access I/O error.
In this state the guest is unable to continue running on a remote host
after migration as some state may be present in the kernel and not
migrated.
With this patch, the migration is canceled in such case so it can either
continue on the source if the I/O issues are recovered or has to be
destroyed anyways.
As of d7f9d827531bc843b7c5aa9d3e8c08738a1de248 we copy the listen
address from the qemu.conf config file in case none has been provided
via XML. But later, when migrating, we should not include such listen
address in the migratable XML as it is something autogenerated, not
requested by user. Moreover, the binding to the listen address will
likely fail, unless the address is '0.0.0.0' or its IPv6 equivalent.
This patch introduces a new boolean attribute to virDomainGraphicsListenDef
to distinguish autofilled listen addresses. However, we must keep the
attribute over libvirtd restarts, so it must be kept within status XML.
This patch fixes changes done in commit 29c1e913e459058c12d02b3f4b767b3
that was pushed without implementing review feedback.
The flag introduced by the patch is changed to VIR_DOMAIN_VCPU_GUEST and
documentation makes the difference between regular hotplug and this new
functionality more explicit.
The virsh options that enable the use of the new flag are changed to
"--guest" and the documentation is fixed too.
Fix ordering of file open in virProcessGetNamespaces
virProcessGetNamespaces() opens files in /proc/XXX/ns/ which will
later be passed to setns(). We have to make sure that the file
descriptors in the array are in the correct order. In particular
the 'user' namespace must be first otherwise setns() may fail
for other namespaces.
The order has been taken from util-linux's sys-utils/nsenter.c
Also we must ignore EINVAL in setns() which occurs if the
namespace associated with the fd, matches the calling process'
current namespace.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, there's a path to use the ncpuinfo variable uninitialized,
which leads to a compiler warning:
qemu/qemu_driver.c: In function 'qemuDomainGetVcpusFlags':
qemu/qemu_driver.c:4573:9: error: 'ncpuinfo' may be used
uninitialized in this function [-Werror=maybe-uninitialized]
for (i = 0; i < ncpuinfo; i++) {
^
Peter Krempa [Mon, 27 May 2013 14:08:30 +0000 (16:08 +0200)]
qemu: Implement new QMP command for cpu hotplug
This patch implements support for the "cpu-add" QMP command that plugs
CPUs into a live guest. The "cpu-add" command was introduced in QEMU
1.5. For the hotplug to work machine type "pc-i440fx-1.5" is required.
Peter Krempa [Fri, 12 Apr 2013 12:30:41 +0000 (14:30 +0200)]
API: Introduce VIR_DOMAIN_VCPU_AGENT, for agent based CPU hot(un)plug
This flag will allow to use qemu guest agent commands to disable
(offline) and enable (online) processors in a live guest that has the
guest agent running.
Peter Krempa [Fri, 12 Apr 2013 10:14:02 +0000 (12:14 +0200)]
qemu_agent: Introduce helpers for agent based CPU hot(un)plug
The qemu guest agent allows to online and offline CPUs from the
perspective of the guest. This patch adds helpers that call
'guest-get-vcpus' and 'guest-set-vcpus' guest agent functions and
convert the data for internal libvirt usage.
ryan woodsmall [Tue, 2 Oct 2012 09:00:54 +0000 (04:00 -0500)]
Add support for VirtualBox 4.2 APIs
A few things have changed in the VirtualBox API - some small
(capitalizations of things in function names like Ip to IP
and Dhcp to DHCP) and some much larger (FindMedium is superceded
by OpenMedium). The biggest change for the sake of this patch
is the signature of CreateMachine is quite a bit different. Using
the Oracle source as a guide, to spin up a VM with a given UUID,
it looks like a text flag has to be passed in a new argument to
CreateMachine. This flag is built in the VirtualBox 4.2 specific
ifdefs and is kind of ugly but works. Additionally, there is now
(unused) VM groups support in CreateMachine and the previous
'osTypeId' arg is currently set to nsnull as in the Oracle code.
The FindMedium to OpenMedium changes were more straightforward
and are pretty clear. The rest of the vbox template changes are
basically spelling/capitalization changes from the looks of things.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Though both libvirt and QEMU's document say RTC_CHANGE returns
the offset from the host UTC, qemu actually returns the offset
from the specified date instead when specific date is provided
(-rtc base=$date).
It's not safe for qemu to fix it in code, it worked like that
for 3 years, changing it now may break other QEMU use cases.
What qemu tries to do is to fix the document:
And in libvirt side, instead of replying on the value from qemu,
this converts the offset returned from qemu to the offset from
host UTC, by:
/*
* a: the offset from qemu RTC_CHANGE event
* b: The specified date (-rtc base=$date)
* c: the host date when libvirt gets the RTC_CHANGE event
* offset: What libvirt will report
*/
offset = a + (b - c);
The specified date (-rtc base=$date) is recorded in clock's def as
an internal only member (may be useful to exposed outside?).
Internal only XML tag "basetime" is introduced to not lose the
guest's basetime after libvirt restarting/reloading:
Peter Krempa [Wed, 5 Jun 2013 07:42:59 +0000 (09:42 +0200)]
storage: Avoid unnecessary ternary operators and refactor the code
Setting of local variables in virStorageBackendCreateQemuImgCmd was
unnecessarily cluttered with ternary operators and repeated testing of
of conditions.
This patch refactors the function to use if statements and improves
error reporting in case inputvol is specified but does not contain
target path. Previously we would complain about "unknown storage vol
type 0" instead of the actual problem.
Alvaro Polo [Thu, 6 Jun 2013 07:53:54 +0000 (09:53 +0200)]
openvz: Fix code coverage issue in OpenVZ driver
After fixing an invalid usage of virDomainNetDef in OpenVZ driver,
a coverage issue appeared. This was caused by a still invalid usage
of net->data.ethernet.dev for non ethernet networking.
Currently, a listen address for a SPICE server can be specified. Later,
when the domain is migrated, we need to relocate the graphics which
involves telling new destination to the SPICE server. However, we can't
just assume the listen address is the new location, because the listen
address can be ANYCAST (0.0.0.0 for IPv4, :: for IPv6). In which case,
we want to pass the remote hostname. But there are some troubles with
ANYCAST. In both IPv4 and IPv6 it has many ways for specifying such
address. For instance, in IPv4: 0, 0.0, 0.0.0, 0.0.0.0. The number of
variations gets bigger in IPv6 world. Hence, in order to check for
ANYCAST address sanely, we should take the provided listen address,
parse it and format back in it's full form. Which is exactly what this
patch does.
Ensure non-root can read /proc/meminfo file in LXC containers
By default files in a FUSE mount can only be accessed by the
user which created them, even if the file permissions would
otherwise allow it. To allow other users to access the FUSE
mount the 'allow_other' mount option must be used. This bug
prevented non-root users in an LXC container from reading
the /proc/meminfo file.
Remove legacy code for single-instance devpts filesystem
Earlier commit f7e8653f dropped support for using LXC with
kernels having single-instance devpts filesystem from the
LXC controller. It forgot to remove the same code from the
LXC container setup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Peter Krempa [Wed, 5 Jun 2013 08:49:15 +0000 (10:49 +0200)]
storagevolxml2argvtest: Report better error messages on test failure
If the creation of the commandline failed, libvirt always reported "out
of memory" from the virCommandToString function rather than the proper
error that happened in virStorageBackendCreateQemuImgCmd. Error out
earlier.
Osier Yang [Fri, 31 May 2013 05:16:14 +0000 (13:16 +0800)]
storage: Support preallocate the new capacity for vol-resize
The document for "vol-resize" says the new capacity will be sparse
unless "--allocate" is specified, however, the "--allocate" flag
is never implemented. This implements the "--allocate" flag for
fs backend's raw type volume, based on posix_fallocate and the
syscall SYS_fallocate.
Thread 1 must be calling nwfilterDriverRemoveDBusMatches(). It does so with
nwfilterDriverLock held. In the patch below I am now moving the
nwfilterDriverLock(driverState) further up so that the initialization, which
seems to either take a long time or is entirely stuck, occurs with the lock
held and the shutdown cannot occur at the same time.
Remove the lock in virNWFilterDriverIsWatchingFirewallD to avoid
double-locking.
Alvaro Polo [Tue, 4 Jun 2013 07:44:32 +0000 (09:44 +0200)]
Fix a invalid usage of virDomainNetDef in OpenVZ driver
OpenVZ was accessing ethernet data to obtain the guest iface name
regardless the domain is configured to use ethernet or bridged
networking. This prevented the guest network interface to be rightly
named for bridged networking.
Jiri Denemark [Tue, 4 Jun 2013 08:16:02 +0000 (10:16 +0200)]
virsh: Obey pool-or-uuid spec when creating volumes
Our documentation says a pool may be referenced by its name or UUID
anywhere if it makes sense (pool-name and pool-uuid are the only
exceptions). However, vol-create and vol-create-as commands did not obey
this.
Peter Krempa [Tue, 28 May 2013 12:29:12 +0000 (14:29 +0200)]
RPC: Support up to 16384 cpus on the host and 4096 in the guest
The RPC limits for cpu maps didn't allow to use libvirt on ultra big
boxes. This patch increases size of the limits to support a maximum of
16384 cpus on the host with a maximum of 4096 cpus per guest.
The full cpu map of such a system takes 8 megabytes and the map for
vcpu pinning is 2 kilobytes long.
Osier Yang [Fri, 31 May 2013 10:09:24 +0000 (18:09 +0800)]
conf: Generate address for scsi host device automatically
With unknown good reasons, the attribute "bus" of scsi device
address is always set to 0, same for attribute "target". (See
virDomainDiskDefAssignAddress).
Though we might need to change the algorithm to honor "bus"
and "target" too, that's a different issue. The address generator
for scsi host device in this patch just follows the unknown
good reasons, only considering the "controller" and "unit".
It walks through all scsi controllers and their units, to see
if the address $controller:0:0:$unit can be used (if not used
by any disk or scsi host device yet), if found one, it sits on
it, otherwise, it creates a new controller (actually the controller
is implicitly created by someone else), and sits on
$new_controller:0:0:0 instead.
The problem was that qemuUpdateActivePciHostdevs was returning 0
(success) when no hostdevs were present, but would otherwise return -1
(failure) even when it completed successfully. It is only called from
qemuProcessReconnect(), and when qemuProcessReconnect got back an
error, it would not only stop reconnecting, but would terminate the
guest qemu process "to remove danger of it ending up running twice if
user tries to start it again later".
(This bug was introduced in commit 011cf7ad, which was pushed between
v1.0.2 and v1.0.3, so all maintenance branches from v1.0.3 up to 1.0.5
will need this one line patch applied.)
Eric Blake [Fri, 31 May 2013 16:47:22 +0000 (10:47 -0600)]
build: skip qemu in tests when !WITH_QEMU
A mingw build (where the qemu driver is not built, so WITH_QEMU
is undefined) failed with:
In file included from ../../src/qemu/qemu_command.h:30:0,
from ../../tests/testutilsqemu.h:4,
from ../../tests/networkxml2xmltest.c:14:
../../src/qemu/qemu_conf.h:53:4: error: #error "Port me"
But since testutilsqemu.c is already conditional, the header
should be likewise.
* tests/testutilsqemu.h: Make content conditional.
We can't use GNULIB's fprintf-posix due to licensing
incompatibilities. We do already have a portable
formatting via virAsprintf() which we got from GNULIB
though. We can use to create a virFilePrintf() function.
But really gnulib could just provide a 'fprintf'
module, that depended on just its 'asprintf' module.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ján Tomko [Fri, 31 May 2013 11:24:06 +0000 (13:24 +0200)]
qemu: escape literal IPv6 address in NBD migration
A literal IPv6 must be escaped, otherwise migration fails with:
unable to execute QEMU command 'drive-mirror': address resolution failed
for f0::0d:5901: Servname not supported for ai_socktype
since QEMU treats everything after the first ':' as the port.
During domain destruction it's possible that the learnIPAddressThread has
already removed the interface prior to the teardown filter path being run.
The teardown code would only be telling the thread to terminate.
John Ferlan [Fri, 31 May 2013 13:07:28 +0000 (09:07 -0400)]
Resolve memory leak found by valgrind
Commit '6afdfc8e' adjusted the exit and error paths to go through the error
and cleanup labels, but neglected to remove the return ret prior to cleanup.
Also noted the 'type' xml string fetch was never checked for NULL which
could lead to some interesting results.
Peter Krempa [Fri, 31 May 2013 13:38:46 +0000 (15:38 +0200)]
qemu: snapshot: Don't kill access to disk if snapshot creation fails
If snapshot creation failed for example due to invalid use of the
"REUSE_EXTERNAL" flag, libvirt killed access to the original image file
instead of the new image file. On machines with selinux this kills the
whole VM as the selinux context is enforced immediately.
* qemu_driver.c:qemuDomainSnapshotUndoSingleDiskActive():
- Kill access to the new image file instead of the old one.
Eric Blake [Thu, 30 May 2013 20:51:58 +0000 (14:51 -0600)]
build: work around cygwin header bug
A bug in Cygwin [1] and poor error messages from gcc [2] lead
to this confusing compilation error:
qemu/qemu_monitor.c:418:9: error: passing argument 2 of 'sendmsg' from incmpatible pointer type
/usr/include/sys/socket.h:42:11: note: expected 'const struct msghdr *' but argument is of type 'struct msghdr *'
Eric Blake [Thu, 30 May 2013 13:59:14 +0000 (07:59 -0600)]
build: cast [ug]id_t when printing
This is a recurring problem for cygwin :)
For example, see commit 23a4df88.
qemu/qemu_driver.c: In function 'qemuStateInitialize':
qemu/qemu_driver.c:691:13: error: format '%d' expects type 'int', but argument 8 has type 'uid_t' [-Wformat]
Eric Blake [Thu, 30 May 2013 03:00:51 +0000 (21:00 -0600)]
build: port qemu to cygwin
A cygwin build of the qemu driver fails with:
qemu/qemu_process.c: In function 'qemuPrepareCpumap':
qemu/qemu_process.c:1803:31: error: 'CPU_SETSIZE' undeclared (first use in this function)
CPU_SETSIZE is a Linux extension in <sched.h>; a bit more portable
is using sysconf if _SC_NPROCESSORS_CONF is defined (several platforms
have it, including Cygwin). Ultimately, I would have preferred to
use gnulib's 'nproc' module, but it is currently under an incompatible
license.
* src/qemu/qemu_conf.h (QEMUD_CPUMASK_LEN): Provide definition on
cygwin.
Eric Blake [Thu, 30 May 2013 01:52:58 +0000 (19:52 -0600)]
build: use correct rpc.h for lockd
On cygwin, the build failed with:
In file included from ./rpc/virnetmessage.h:24:0,
from ./rpc/virnetclient.h:29,
from locking/lock_driver_lockd.c:31:
./rpc/virnetprotocol.h:9:21: fatal error: rpc/rpc.h: No such file or directory
Cole Robinson [Tue, 28 May 2013 19:37:53 +0000 (15:37 -0400)]
virsh: migrate: Don't disallow --p2p and --migrateuri
Because it's a valid combination. p2p still uses a separate channel
for qemu migration, so there's value in letting the user specify a manual
migrate URI for overriding auto-port, or libvirt's FQDN lookup.
What _isn't_ allowed is --migrateuri and TUNNELLED, since there is
no separate migration channel. Disallow that instead
Eric Blake [Wed, 29 May 2013 15:05:33 +0000 (09:05 -0600)]
build: fix build without libvirtd
Building when configured --with-libvirtd=no fails with:
In file included from ../src/qemu/qemu_command.h:30:0,
from testutilsqemu.h:4,
from networkxml2xmltest.c:14:
../src/qemu/qemu_conf.h:175:5: error: expected specifier-qualifier-list before 'virStateInhibitCallback'
* src/libvirt_internal.h (virStateInhibitCallback): Move outside
of conditional.
Eric Blake [Wed, 29 May 2013 03:15:08 +0000 (21:15 -0600)]
build: fix build with newer gnutls
Building with gnutls 3.2.0 (such as shipped with current cygwin) fails
with:
rpc/virnettlscontext.c: In function 'virNetTLSSessionGetKeySize':
rpc/virnettlscontext.c:1358:5: error: implicit declaration of function 'gnutls_cipher_get_key_size' [-Wimplicit-function-declaration]
Yeah, it's stupid that gnutls broke API by moving their declaration
into a new header without including that header from the old one,
but it's easy enough to work around, all without breaking on gnutls
1.4.1 (hello RHEL 5) that lacked the new header.
* configure.ac (gnutls): Check for <gnutls/crypto.h>.
* src/rpc/virnettlscontext.c (includes): Include additional header.
Osier Yang [Wed, 22 May 2013 12:05:22 +0000 (20:05 +0800)]
storage_conf: Use uid_t/gid_t instead of int to cast the value
And error out if the casted value is not same with the original
one, which prevents the bug on platform(s) where uid_t/gid_t
has different size with long.
Osier Yang [Wed, 29 May 2013 10:04:33 +0000 (18:04 +0800)]
storage_conf: Improve the memory deallocation of pool def parsing
Changes:
* Free all the strings at "cleanup", instead of freeing them
in the middle
* Remove xmlFree
* s/tmppath/target_path/, to make it more sensible
* Add new goto label "error"
Michal Privoznik [Fri, 24 May 2013 12:45:06 +0000 (14:45 +0200)]
qemuOpenVhostNet: Decrease vhostfdSize on open failure
Currently, if there's an error opening /dev/vhost-net (e.g. because
it doesn't exist) but it's not required we proceed with vhostfd array
filled with -1 and vhostfdSize unchanged. Later, when constructing
the qemu command line only non-negative items within vhostfd array
are taken into account. This means, vhostfdSize may be greater than
the actual count of non-negative items in vhostfd array. This results
in improper command line arguments being generated, e.g.:
Eric Blake [Tue, 28 May 2013 23:11:48 +0000 (17:11 -0600)]
build: drop unused variable
Compilation for mingw failed:
../../src/util/virutil.c: In function 'virGetWin32DirectoryRoot':
../../src/util/virutil.c:1094:9: error: unused variable 'ret' [-Werror=unused-variable]
Cole Robinson [Tue, 28 May 2013 14:23:00 +0000 (10:23 -0400)]
qemu: Don't report error on successful media eject
If we are just ejecting media, ret == -1 even after the retry loop
determines that the tray is open, as requested. This means media
disconnect always report's error.
Fix it, and fix some other mini issues:
- Don't overwrite the 'eject' error message if the retry loop fails
- Move the retries decrement inside the loop, otherwise the final loop
might succeed, yet retries == 0 and we will raise error
- Setting ret = -1 in the disk->src check is unneeded
- Fix comment typos