Though both libvirt and QEMU's document say RTC_CHANGE returns
the offset from the host UTC, qemu actually returns the offset
from the specified date instead when specific date is provided
(-rtc base=$date).
It's not safe for qemu to fix it in code, it worked like that
for 3 years, changing it now may break other QEMU use cases.
What qemu tries to do is to fix the document:
And in libvirt side, instead of replying on the value from qemu,
this converts the offset returned from qemu to the offset from
host UTC, by:
/*
* a: the offset from qemu RTC_CHANGE event
* b: The specified date (-rtc base=$date)
* c: the host date when libvirt gets the RTC_CHANGE event
* offset: What libvirt will report
*/
offset = a + (b - c);
The specified date (-rtc base=$date) is recorded in clock's def as
an internal only member (may be useful to exposed outside?).
Internal only XML tag "basetime" is introduced to not lose the
guest's basetime after libvirt restarting/reloading:
Peter Krempa [Wed, 5 Jun 2013 07:42:59 +0000 (09:42 +0200)]
storage: Avoid unnecessary ternary operators and refactor the code
Setting of local variables in virStorageBackendCreateQemuImgCmd was
unnecessarily cluttered with ternary operators and repeated testing of
of conditions.
This patch refactors the function to use if statements and improves
error reporting in case inputvol is specified but does not contain
target path. Previously we would complain about "unknown storage vol
type 0" instead of the actual problem.
Alvaro Polo [Thu, 6 Jun 2013 07:53:54 +0000 (09:53 +0200)]
openvz: Fix code coverage issue in OpenVZ driver
After fixing an invalid usage of virDomainNetDef in OpenVZ driver,
a coverage issue appeared. This was caused by a still invalid usage
of net->data.ethernet.dev for non ethernet networking.
Currently, a listen address for a SPICE server can be specified. Later,
when the domain is migrated, we need to relocate the graphics which
involves telling new destination to the SPICE server. However, we can't
just assume the listen address is the new location, because the listen
address can be ANYCAST (0.0.0.0 for IPv4, :: for IPv6). In which case,
we want to pass the remote hostname. But there are some troubles with
ANYCAST. In both IPv4 and IPv6 it has many ways for specifying such
address. For instance, in IPv4: 0, 0.0, 0.0.0, 0.0.0.0. The number of
variations gets bigger in IPv6 world. Hence, in order to check for
ANYCAST address sanely, we should take the provided listen address,
parse it and format back in it's full form. Which is exactly what this
patch does.
Ensure non-root can read /proc/meminfo file in LXC containers
By default files in a FUSE mount can only be accessed by the
user which created them, even if the file permissions would
otherwise allow it. To allow other users to access the FUSE
mount the 'allow_other' mount option must be used. This bug
prevented non-root users in an LXC container from reading
the /proc/meminfo file.
Remove legacy code for single-instance devpts filesystem
Earlier commit f7e8653f dropped support for using LXC with
kernels having single-instance devpts filesystem from the
LXC controller. It forgot to remove the same code from the
LXC container setup.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Peter Krempa [Wed, 5 Jun 2013 08:49:15 +0000 (10:49 +0200)]
storagevolxml2argvtest: Report better error messages on test failure
If the creation of the commandline failed, libvirt always reported "out
of memory" from the virCommandToString function rather than the proper
error that happened in virStorageBackendCreateQemuImgCmd. Error out
earlier.
Osier Yang [Fri, 31 May 2013 05:16:14 +0000 (13:16 +0800)]
storage: Support preallocate the new capacity for vol-resize
The document for "vol-resize" says the new capacity will be sparse
unless "--allocate" is specified, however, the "--allocate" flag
is never implemented. This implements the "--allocate" flag for
fs backend's raw type volume, based on posix_fallocate and the
syscall SYS_fallocate.
Thread 1 must be calling nwfilterDriverRemoveDBusMatches(). It does so with
nwfilterDriverLock held. In the patch below I am now moving the
nwfilterDriverLock(driverState) further up so that the initialization, which
seems to either take a long time or is entirely stuck, occurs with the lock
held and the shutdown cannot occur at the same time.
Remove the lock in virNWFilterDriverIsWatchingFirewallD to avoid
double-locking.
Alvaro Polo [Tue, 4 Jun 2013 07:44:32 +0000 (09:44 +0200)]
Fix a invalid usage of virDomainNetDef in OpenVZ driver
OpenVZ was accessing ethernet data to obtain the guest iface name
regardless the domain is configured to use ethernet or bridged
networking. This prevented the guest network interface to be rightly
named for bridged networking.
Jiri Denemark [Tue, 4 Jun 2013 08:16:02 +0000 (10:16 +0200)]
virsh: Obey pool-or-uuid spec when creating volumes
Our documentation says a pool may be referenced by its name or UUID
anywhere if it makes sense (pool-name and pool-uuid are the only
exceptions). However, vol-create and vol-create-as commands did not obey
this.
Peter Krempa [Tue, 28 May 2013 12:29:12 +0000 (14:29 +0200)]
RPC: Support up to 16384 cpus on the host and 4096 in the guest
The RPC limits for cpu maps didn't allow to use libvirt on ultra big
boxes. This patch increases size of the limits to support a maximum of
16384 cpus on the host with a maximum of 4096 cpus per guest.
The full cpu map of such a system takes 8 megabytes and the map for
vcpu pinning is 2 kilobytes long.
Osier Yang [Fri, 31 May 2013 10:09:24 +0000 (18:09 +0800)]
conf: Generate address for scsi host device automatically
With unknown good reasons, the attribute "bus" of scsi device
address is always set to 0, same for attribute "target". (See
virDomainDiskDefAssignAddress).
Though we might need to change the algorithm to honor "bus"
and "target" too, that's a different issue. The address generator
for scsi host device in this patch just follows the unknown
good reasons, only considering the "controller" and "unit".
It walks through all scsi controllers and their units, to see
if the address $controller:0:0:$unit can be used (if not used
by any disk or scsi host device yet), if found one, it sits on
it, otherwise, it creates a new controller (actually the controller
is implicitly created by someone else), and sits on
$new_controller:0:0:0 instead.
The problem was that qemuUpdateActivePciHostdevs was returning 0
(success) when no hostdevs were present, but would otherwise return -1
(failure) even when it completed successfully. It is only called from
qemuProcessReconnect(), and when qemuProcessReconnect got back an
error, it would not only stop reconnecting, but would terminate the
guest qemu process "to remove danger of it ending up running twice if
user tries to start it again later".
(This bug was introduced in commit 011cf7ad, which was pushed between
v1.0.2 and v1.0.3, so all maintenance branches from v1.0.3 up to 1.0.5
will need this one line patch applied.)
Eric Blake [Fri, 31 May 2013 16:47:22 +0000 (10:47 -0600)]
build: skip qemu in tests when !WITH_QEMU
A mingw build (where the qemu driver is not built, so WITH_QEMU
is undefined) failed with:
In file included from ../../src/qemu/qemu_command.h:30:0,
from ../../tests/testutilsqemu.h:4,
from ../../tests/networkxml2xmltest.c:14:
../../src/qemu/qemu_conf.h:53:4: error: #error "Port me"
But since testutilsqemu.c is already conditional, the header
should be likewise.
* tests/testutilsqemu.h: Make content conditional.
We can't use GNULIB's fprintf-posix due to licensing
incompatibilities. We do already have a portable
formatting via virAsprintf() which we got from GNULIB
though. We can use to create a virFilePrintf() function.
But really gnulib could just provide a 'fprintf'
module, that depended on just its 'asprintf' module.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ján Tomko [Fri, 31 May 2013 11:24:06 +0000 (13:24 +0200)]
qemu: escape literal IPv6 address in NBD migration
A literal IPv6 must be escaped, otherwise migration fails with:
unable to execute QEMU command 'drive-mirror': address resolution failed
for f0::0d:5901: Servname not supported for ai_socktype
since QEMU treats everything after the first ':' as the port.
During domain destruction it's possible that the learnIPAddressThread has
already removed the interface prior to the teardown filter path being run.
The teardown code would only be telling the thread to terminate.
John Ferlan [Fri, 31 May 2013 13:07:28 +0000 (09:07 -0400)]
Resolve memory leak found by valgrind
Commit '6afdfc8e' adjusted the exit and error paths to go through the error
and cleanup labels, but neglected to remove the return ret prior to cleanup.
Also noted the 'type' xml string fetch was never checked for NULL which
could lead to some interesting results.
Peter Krempa [Fri, 31 May 2013 13:38:46 +0000 (15:38 +0200)]
qemu: snapshot: Don't kill access to disk if snapshot creation fails
If snapshot creation failed for example due to invalid use of the
"REUSE_EXTERNAL" flag, libvirt killed access to the original image file
instead of the new image file. On machines with selinux this kills the
whole VM as the selinux context is enforced immediately.
* qemu_driver.c:qemuDomainSnapshotUndoSingleDiskActive():
- Kill access to the new image file instead of the old one.
Eric Blake [Thu, 30 May 2013 20:51:58 +0000 (14:51 -0600)]
build: work around cygwin header bug
A bug in Cygwin [1] and poor error messages from gcc [2] lead
to this confusing compilation error:
qemu/qemu_monitor.c:418:9: error: passing argument 2 of 'sendmsg' from incmpatible pointer type
/usr/include/sys/socket.h:42:11: note: expected 'const struct msghdr *' but argument is of type 'struct msghdr *'
Eric Blake [Thu, 30 May 2013 13:59:14 +0000 (07:59 -0600)]
build: cast [ug]id_t when printing
This is a recurring problem for cygwin :)
For example, see commit 23a4df88.
qemu/qemu_driver.c: In function 'qemuStateInitialize':
qemu/qemu_driver.c:691:13: error: format '%d' expects type 'int', but argument 8 has type 'uid_t' [-Wformat]
Eric Blake [Thu, 30 May 2013 03:00:51 +0000 (21:00 -0600)]
build: port qemu to cygwin
A cygwin build of the qemu driver fails with:
qemu/qemu_process.c: In function 'qemuPrepareCpumap':
qemu/qemu_process.c:1803:31: error: 'CPU_SETSIZE' undeclared (first use in this function)
CPU_SETSIZE is a Linux extension in <sched.h>; a bit more portable
is using sysconf if _SC_NPROCESSORS_CONF is defined (several platforms
have it, including Cygwin). Ultimately, I would have preferred to
use gnulib's 'nproc' module, but it is currently under an incompatible
license.
* src/qemu/qemu_conf.h (QEMUD_CPUMASK_LEN): Provide definition on
cygwin.
Eric Blake [Thu, 30 May 2013 01:52:58 +0000 (19:52 -0600)]
build: use correct rpc.h for lockd
On cygwin, the build failed with:
In file included from ./rpc/virnetmessage.h:24:0,
from ./rpc/virnetclient.h:29,
from locking/lock_driver_lockd.c:31:
./rpc/virnetprotocol.h:9:21: fatal error: rpc/rpc.h: No such file or directory
Cole Robinson [Tue, 28 May 2013 19:37:53 +0000 (15:37 -0400)]
virsh: migrate: Don't disallow --p2p and --migrateuri
Because it's a valid combination. p2p still uses a separate channel
for qemu migration, so there's value in letting the user specify a manual
migrate URI for overriding auto-port, or libvirt's FQDN lookup.
What _isn't_ allowed is --migrateuri and TUNNELLED, since there is
no separate migration channel. Disallow that instead
Eric Blake [Wed, 29 May 2013 15:05:33 +0000 (09:05 -0600)]
build: fix build without libvirtd
Building when configured --with-libvirtd=no fails with:
In file included from ../src/qemu/qemu_command.h:30:0,
from testutilsqemu.h:4,
from networkxml2xmltest.c:14:
../src/qemu/qemu_conf.h:175:5: error: expected specifier-qualifier-list before 'virStateInhibitCallback'
* src/libvirt_internal.h (virStateInhibitCallback): Move outside
of conditional.
Eric Blake [Wed, 29 May 2013 03:15:08 +0000 (21:15 -0600)]
build: fix build with newer gnutls
Building with gnutls 3.2.0 (such as shipped with current cygwin) fails
with:
rpc/virnettlscontext.c: In function 'virNetTLSSessionGetKeySize':
rpc/virnettlscontext.c:1358:5: error: implicit declaration of function 'gnutls_cipher_get_key_size' [-Wimplicit-function-declaration]
Yeah, it's stupid that gnutls broke API by moving their declaration
into a new header without including that header from the old one,
but it's easy enough to work around, all without breaking on gnutls
1.4.1 (hello RHEL 5) that lacked the new header.
* configure.ac (gnutls): Check for <gnutls/crypto.h>.
* src/rpc/virnettlscontext.c (includes): Include additional header.
Osier Yang [Wed, 22 May 2013 12:05:22 +0000 (20:05 +0800)]
storage_conf: Use uid_t/gid_t instead of int to cast the value
And error out if the casted value is not same with the original
one, which prevents the bug on platform(s) where uid_t/gid_t
has different size with long.
Osier Yang [Wed, 29 May 2013 10:04:33 +0000 (18:04 +0800)]
storage_conf: Improve the memory deallocation of pool def parsing
Changes:
* Free all the strings at "cleanup", instead of freeing them
in the middle
* Remove xmlFree
* s/tmppath/target_path/, to make it more sensible
* Add new goto label "error"
Michal Privoznik [Fri, 24 May 2013 12:45:06 +0000 (14:45 +0200)]
qemuOpenVhostNet: Decrease vhostfdSize on open failure
Currently, if there's an error opening /dev/vhost-net (e.g. because
it doesn't exist) but it's not required we proceed with vhostfd array
filled with -1 and vhostfdSize unchanged. Later, when constructing
the qemu command line only non-negative items within vhostfd array
are taken into account. This means, vhostfdSize may be greater than
the actual count of non-negative items in vhostfd array. This results
in improper command line arguments being generated, e.g.:
Eric Blake [Tue, 28 May 2013 23:11:48 +0000 (17:11 -0600)]
build: drop unused variable
Compilation for mingw failed:
../../src/util/virutil.c: In function 'virGetWin32DirectoryRoot':
../../src/util/virutil.c:1094:9: error: unused variable 'ret' [-Werror=unused-variable]
Cole Robinson [Tue, 28 May 2013 14:23:00 +0000 (10:23 -0400)]
qemu: Don't report error on successful media eject
If we are just ejecting media, ret == -1 even after the retry loop
determines that the tray is open, as requested. This means media
disconnect always report's error.
Fix it, and fix some other mini issues:
- Don't overwrite the 'eject' error message if the retry loop fails
- Move the retries decrement inside the loop, otherwise the final loop
might succeed, yet retries == 0 and we will raise error
- Setting ret = -1 in the disk->src check is unneeded
- Fix comment typos
Sergey Fionov [Sat, 18 May 2013 11:47:51 +0000 (15:47 +0400)]
qemu: save domain state to XML after reboot
Currently qemuDomainReboot() does reboot in two phases:
qemuMonitorSystemPowerdown() and qemuProcessFakeReboot().
qemuMonitorSystemPowerdown() shutdowns the domain and saves domain
state/reason as VIR_DOMAIN_SHUTDOWN_UNKNOWN.
qemuProcessFakeReboot() sets domain state/reason to
VIR_DOMAIN_RESUMED_UNPAUSED but does not save domain state changes.
Subsequent restart of libvirtd leads to restoring domain state/reason to
saved that is VIR_DOMAIN_SHUTDOWN_UNKNOWN and to automatic shutdown of
the domain. This commit adds virDomainSaveStatus() into
qemuProcessFakeReboot() to avoid unexpected shutdowns.
Matthias Bolte [Sat, 18 May 2013 09:45:04 +0000 (11:45 +0200)]
esx: Fix dynamic VI object type detection
VI objects support inheritance with subtype polymorphism. For example the
FileInfo object type is extended by FloppyImageFileInfo, FolderFileInfo
etc. Then SearchDatastore_Task returns an array of FileInfo objects and
depending on the represented file the FileInfo is actually a FolderFileInfo
or FloppyImageFileInfo etc. The actual type information is stored as XML
attribute that allows clients such as libvirt to distinguish between the
actual types. esxVI_GetActualObjectType is used to extract the actual type.
I assumed that this mechanism would be used for all VI object types that
have subtypes. But this is not the case. It seems only to be used for types
that are actually used as generic base type such as FileInfo. But it is not
used for types that got extended later such as ElementDescription that was
extended by ExtendedElementDescription (added in vSphere API 4.0) or that
are not meant to be used with subtype polymorphism.
This breaks the deserialization of types that contain ElementDescription
properties such as PerfCounterInfo or ChoiceOption, because the code
expects an ElementDescription object to have an XML attribute named type
that is not present, since ExtendedElementDescription was added to the
esx_vi_generator.input in commit 60f0f55ee4686fecbffc5fb32f90863427d02a14.
This in turn break virtual machine question handling and auto answering.
Fix this by using the base type if no XML type attribute is present.
spec: Build vbox packages only for x86 architectures
Commit 6ab6bc19f03513fd87d29ecfd405bb7f4a7de114 has introduced separate
daemon/driver packages for vbox. These should only be built for x86
architectures which is done hereby.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Michal Privoznik [Fri, 24 May 2013 09:21:36 +0000 (11:21 +0200)]
Adapt to new VIR_STRNDUP behavior
With previous patch, we accept negative value as length of string to
duplicate. So there is no need to pass strlen(src) in case we want to do
duplicate the whole string.
Function qemuDomainSetBlockIoTune() was checking QEMU capabilities
even when !(flags & VIR_DOMAIN_AFFECT_LIVE) and the domain was
shutoff, resulting in the following problem:
Currently, the controllers argument to virCgroupDetect acts both as
a result filter and a required controller specification, which is
a bit overloaded. If both functionalities are needed, it would be
better to have them seperated into a filter and a requirement mask.
The only situation where it is used today is to ensure that only
CPU related controllers are used for the VCPU directories. But here
we clearly do not want to enforce the existence of cpu, cpuacct and
specifically not cpuset at the same time.
This commit changes the semantics of controllers to "filter only".
Should a required mask ever be needed, more work will have to be done.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Osier Yang [Fri, 24 May 2013 03:59:14 +0000 (11:59 +0800)]
virsh: Fix regression of vol-resize
Introduced by commit 1daa4ba33acf. vshCommandOptStringReq returns
0 on *success* or the option is not required && not present, both
are right result. Error out when returning 0 is not correct.
the caller, it doesn't have to check wether it