Peter Krempa [Mon, 19 Jan 2015 12:21:09 +0000 (13:21 +0100)]
qemu: memdev: Add infrastructure to load memory device information
When using 'dimm' memory devices with qemu, some of the information
like the slot number and base address need to be reloaded from qemu
after process start so that it reflects the actual state. The state then
allows to use memory devices across migrations.
Peter Krempa [Mon, 6 Oct 2014 12:18:37 +0000 (14:18 +0200)]
qemu: Implement setup of memory hotplug parameters
To enable memory hotplug the maximum memory size and slot count need to
be specified. As qemu supports now other units than mebibytes when
specifying memory, use the new interface in this case.
Peter Krempa [Mon, 11 Aug 2014 15:40:32 +0000 (17:40 +0200)]
conf: Add support for parsing and formatting max memory and slot count
Add a XML element that will allow to specify maximum supportable memory
and the count of memory slots to use with memory hotplug.
To avoid possible confusion and misuse of the new element this patch
also explicitly forbids the use of the maxMemory setting in individual
drivers's post parse callbacks. This limitation will be lifted when the
support is implemented.
Peter Krempa [Mon, 16 Mar 2015 14:33:45 +0000 (15:33 +0100)]
libxl: Refactor logic in domain post parse callback
With the current control flow the post parse callback returned success
right away for fully virtualized VMs. To allow adding additional checks
into the post parse callback tweak the conditions so that the function
doesn't return early except for error cases.
To clarify the original piece of code borrow the wording from the commit
message for the patch that introduced the code.
Peter Krempa [Mon, 23 Mar 2015 13:17:42 +0000 (14:17 +0100)]
qemu: Don't return memory device config on error in qemuBuildMemoryBackendStr
In the last section if the function determines that the config is
invalid when QEMU doesn't support the memory device the JSON config
object would be returned even if it doesn't make sense.
Boris Fiuczynski [Fri, 20 Mar 2015 15:01:10 +0000 (16:01 +0100)]
qemu: Set default SCSI controller model for S390 arch
When no model is specified in the domain definition for
a scsi controller and the architectur is s390 than virtio-scsi
is set as default model.
Signed-off-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com> Reviewed-by: Daniel Hansel <daniel.hansel@linux.vnet.ibm.com> Reviewed-by: Stefan Zimmermann <stzi@linux.vnet.ibm.com> Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com> Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Michael Chapman [Fri, 13 Mar 2015 00:34:58 +0000 (11:34 +1100)]
qemu: skip precreation of network disks
Commit cf54c60699833b3791a5d0eb3eb5a1948c267f6b introduced the ability
to create missing storage volumes during migration. For network disks,
however, we may not necessarily be able to detect whether they already
exist -- there is no straight-forward way to map the disk to a storage
volume, and even if there were it's possible no configured storage pool
actually contains the disk.
It is better to assume the network disk exists in this case, rather than
aborting the migration completely. If the volume really is missing, QEMU
will generate an appropriate error later in the migration.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Michal Privoznik [Sat, 14 Mar 2015 10:18:21 +0000 (11:18 +0100)]
network_conf: Drop virNetworkObjIsDuplicate
This function does not make any sense now, that network driver is
(almost) dropped. I mean, previously, when threads were
serialized, this function was there to check, if no other network
with the same name or UUID exists. However, nowadays that threads
can run more in parallel, this function is useless, in fact it
gives misleading return values. Consider the following scenario.
Two threads, both trying to define networks with same name but
different UUID (e.g. because it was generated during XML parsing
phase, whatever). Lets assume that both threads are about to call
networkValidate() which immediately calls
virNetworkObjIsDuplicate().
T1: calls virNetworkObjIsDuplicate() and since no network with
given name or UUID exist, success is returned.
T2: calls virNetworkObjIsDuplicate() and since no network with
given name or UUID exist, success is returned.
T1: calls virNetworkAssignDef() and successfully places its
network into the virNetworkObjList.
T2: calls virNetworkAssignDef() and since network with the same
name exists, the network definition is replaced.
Okay, this is mainly because virNetworkAssignDef() does not check
whether name and UUID matches. Well, lets make it so! And drop
useless function too.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Mon, 16 Mar 2015 15:01:45 +0000 (16:01 +0100)]
objecteventtest: Check for virNetwork* return values
Lets not give a bad example and check for return values of
virNetwork* APIs called within the test. Even though it's
unlikely that any API will fail, it can happen. We're connected
to the test driver after all, and our API sequence is correct. So
test driver should fail only in case of bug or OOM.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Fri, 13 Mar 2015 15:47:06 +0000 (16:47 +0100)]
networkStateInitialize: Don't lock network driver
There's no need to lock the network driver, as network driver
initialization is done prior accepting any client. There's nobody
to hop in and do something over partially initialized driver. Nor
qemu driver is doing that.
==30532== Observed (incorrect) order is: acquisition of lock at 0x1439EF50
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x5307E86: virObjectLock (virobject.c:323)
==30532== by 0x5396440: virNetworkObjListForEach (network_conf.c:4511)
==30532== by 0x19B29308: networkStateInitialize (bridge_driver.c:686)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532== by 0x4C30456: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0xA1EC1F2: start_thread (in /lib64/libpthread-2.19.so)
==30532== by 0xA4EDC8C: clone (in /lib64/libc-2.19.so)
==30532==
==30532== followed by a later acquisition of lock at 0x1439CD60
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x19B27B2C: networkDriverLock (bridge_driver.c:102)
==30532== by 0x19B27B60: networkGetDnsmasqCaps (bridge_driver.c:113)
==30532== by 0x19B2856A: networkUpdateState (bridge_driver.c:389)
==30532== by 0x53963E9: virNetworkObjListForEachHelper (network_conf.c:4488)
==30532== by 0x52E2224: virHashForEach (virhash.c:521)
==30532== by 0x539645B: virNetworkObjListForEach (network_conf.c:4512)
==30532== by 0x19B29308: networkStateInitialize (bridge_driver.c:686)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532==
==30532== Required order was established by acquisition of lock at 0x1439CD60
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x19B27B2C: networkDriverLock (bridge_driver.c:102)
==30532== by 0x19B28DF9: networkStateInitialize (bridge_driver.c:609)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532== by 0x4C30456: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0xA1EC1F2: start_thread (in /lib64/libpthread-2.19.so)
==30532== by 0xA4EDC8C: clone (in /lib64/libc-2.19.so)
==30532==
==30532== followed by a later acquisition of lock at 0x1439EF50
==30532== at 0x4C31A26: pthread_mutex_lock (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0x5324895: virMutexLock (virthread.c:88)
==30532== by 0x5307E86: virObjectLock (virobject.c:323)
==30532== by 0x538A09C: virNetworkAssignDef (network_conf.c:527)
==30532== by 0x5391EB2: virNetworkLoadState (network_conf.c:3008)
==30532== by 0x53922D4: virNetworkLoadAllState (network_conf.c:3128)
==30532== by 0x19B2929A: networkStateInitialize (bridge_driver.c:671)
==30532== by 0x53E1CCC: virStateInitialize (libvirt.c:777)
==30532== by 0x11DEB7: daemonRunStateInit (libvirtd.c:906)
==30532== by 0x5324B6A: virThreadHelper (virthread.c:197)
==30532== by 0x4C30456: ??? (in /usr/lib64/valgrind/vgpreload_helgrind-amd64-linux.so)
==30532== by 0xA1EC1F2: start_thread (in /lib64/libpthread-2.19.so)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Wikipedia's list of common misspellings [1] has a machine-readable
version. This patch fixes those misspellings mentioned in the list
which don't have multiple right variants (as e.g. "accension", which can
be both "accession" and "ascension"), such misspellings are left
untouched. The list of changes was manually re-checked for false
positives.
We've never set the cpuset.memory_migrate value to anything, keeping it
on default. However, we allow changing cpuset.mems on live domain.
That setting, however, don't have any consequence on a domain unless
it's going to allocate new memory.
I managed to make 'virsh numatune' move all the memory to any node I
wanted even without disabling libnuma's numa_set_membind(), so this
should be safe to use with it as well.
Jim Fehlig [Tue, 17 Mar 2015 20:10:28 +0000 (14:10 -0600)]
libxl: use xenlight pkgconfig file if present
xen.git commit babeca32 added a pkgconfig file for libxenlight,
allowing libxl apps to determine the location of Xen binaries
such as firmware blobs, device emulator, etc.
This patch adds support for xenlight.pc in the libxl driver, falling
back to the previous configure logic if not found. It introduces
LIBXL_FIRMWARE_DIR and LIBXL_EXECBIN_DIR to define the firmware and
libexec_bin locations. If xenlight.pc does not exist, the defines
are set to the current hardcoded paths. The capabilities'
<emulator> and <loader> elements are updated to use the paths.
Maxim Nestratov [Thu, 19 Mar 2015 14:43:21 +0000 (17:43 +0300)]
parallels: fix libvirt crash if parallelsNetworkOpen fails
If, by any reason, parallelsNetworkOpen fails it dereferences
newly allocated privconn->networks via virObjectUnref, which in
turn deallocates its memory.
Subsequent call of parallelsNetworkClose calls virObjectUnref
that leads to double memory free. To prevent this we should zero
privconn->networks to make all subsequent virObjectUnref be safe.
When qemu exits during startup, libvirt includes the error from
/var/log/libvirt/qemu/vm.log in the error message:
$ virsh start test3
error: Failed to start domain test3
error: internal error: early end of file from monitor: possible problem:
2015-02-27T03:03:16.985494Z qemu-kvm: -numa memdev is not supported by
machine rhel6.5.0
The check for domain liveness added to qemuDomainObjExitMonitor
in commit dc2fd51f sometimes overwrites this error:
$ virsh start test3
error: Failed to start domain test3
error: operation failed: domain is no longer running
Fix the check to only report an error if there is none set.
Signed-off-by: Luyao Huang <lhuang@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Jim Fehlig [Wed, 18 Mar 2015 20:53:45 +0000 (14:53 -0600)]
libxl: Don't overwrite errors from xenconfig
When converting domXML from native, the libxl driver was overwriting
useful errors from the xenconfig parsing code with a useless, generic
error. E.g. "internal error: parsing xm config failed" vs
"internal error: config value usbdevice was malformed". Remove the
redundant (and useless) error reporting in the libxl driver.
John Ferlan [Wed, 18 Mar 2015 11:10:54 +0000 (07:10 -0400)]
qemu: Fix two issues in qemuDomainSetVcpus error handling
Issue #1 - A call to virBitmapNew did not check if the allocation
failed which could lead to a NULL dereference
Issue #2 - When deleting the pin entries from the config file, the
code loops from the number of elements down to the "new" vcpu count;
however, the pin id values are numbered 0..n-1 not 1..n, so the "first"
pin attempt would never work. Luckily the check was for whether the
incoming 'n' (vcpu id) matched the entry in the array from 0..arraysize
rather than a dereference of the 'n' entry
Deepak Shetty [Wed, 18 Mar 2015 13:48:22 +0000 (19:18 +0530)]
doc: Fix doc for backingStore
I spent quite some time figuring that backingStore info
isn't included in the dom xml, unless guest is up and
running. Hopefully putting that in the doc should help.
Also, several people have complained that libvirt reports
a backing file as raw, even though they expected it to be
qcow2; where the culprit is usually the user forgetting to
create the file with qemu-img create -o backing_fmt=qcow2.
This patch adds that info to the doc.
Signed-off-by: Deepak C Shetty <deepakcs@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Eric Blake [Fri, 6 Mar 2015 16:30:14 +0000 (09:30 -0700)]
qemu: track 'cancelling' migration state
In qemu 2.3, the migration status will include 'cancelling' in the
window between when an asynchronous cancel has been requested and
when the migration is actually halted. Previously, qemu hid this
state and reported 'active'. Libvirt manages the sequence okay
even when the string is unrecognized (that is, it will report an
unknown state:
Migration: [ 69 %]^Cerror: internal error: unexpected migration status in cancelling.
but the migration is still cancelled), but recognizing the string
makes for a smoother user experience.
Laine Stump [Wed, 18 Mar 2015 18:27:05 +0000 (14:27 -0400)]
util: more verbose error when failing to create macvtap device
Investigation of a problem with creating passthrough macvtap devices
(https://bugzilla.redhat.com/show_bug.cgi?id=1185501) has shown that
this slightly more verbose failure message is useful. In particular,
the mac address can be used to determine the domain. You could also
figure this out by looking at preceding messages in a debug log, but
this gets it in a single place.
Laine Stump [Tue, 17 Mar 2015 17:46:44 +0000 (13:46 -0400)]
util: clean up #includes of virnetdevopenvswitch.h
virnetdevopenvswitch.h declares a few functions that can be called to
add ports to and remove them from OVS bridges, and retrieve the
migration data for a port. It does not contain any data definitions
that are used by domain_conf.h. But for some reason, domain_conf.h
virnetdevopenvswitch.h should be directly #including it. This adds a
few lines to the project, but saves all the files that don't need it
from the extra computing, and makes the dependencies more clear cut.
zhang bo [Fri, 13 Mar 2015 09:17:46 +0000 (17:17 +0800)]
util: vhost user: support for bootindex
Problem Description:
When we set boot order for a vhost-user network interface, we found the boot index
doesn't work.
Cause of the Problem:
In the function qemuBuildVhostuserCommandLine(), it forcely set the arg bootindex of
function qemuBuildNicDevStr() to 0. Thus, the bootindex parameter got missing.
Solution:
Trans the arg bootindex down.
Signed-off-by: Gao Haifeng <gaohaifeng.gao@huawei.com> Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Maxim Nestratov [Wed, 18 Mar 2015 08:33:56 +0000 (11:33 +0300)]
parallels: switch off offline management feature
which is on by default when a new VM/CT is created.
We should do this because this feature can't be controlled
by libvirt now and it sets up some iptables rules. So it's
better to do this to avoid potential conflict of different
set of rules or to avoid unexpected behavior.
Maxim Nestratov [Wed, 18 Mar 2015 08:33:53 +0000 (11:33 +0300)]
parallels: better bridge network interface support
In order to support 'bridge' network adapters in parallels
driver we need to plug our veth devices into corresponding
linux bridges.
We are going to do this by reusing our abstraction of
Virtual Networks in terms of PCS. On a domain creation, we
create a new Virtual Network naming it with the same name
as a source bridge for each network interface.
Having done this, we plug PCS veth interfaces created with names of
target dev into specified bridges using our standard PCS procedures
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Maxim Nestratov [Wed, 18 Mar 2015 08:33:52 +0000 (11:33 +0300)]
parallels: fix parallelsLoadNetworks
Don't fail initialization of parallels driver if
parallelsLoadNetwork fails for optional networks.
This can happen when some of them are added manually
and configured incompletely. PCS requires only two networks
created automatically (named Host-Only and Bridged), others
are optional and their incompleteness can be ignored.
Signed-off-by: Maxim Nestratov <mnestratov@parallels.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
John Ferlan [Mon, 16 Mar 2015 12:50:11 +0000 (08:50 -0400)]
network: Resolve Coverity FORWARD_NULL
The following is a long winded way to say this patch is avoiding a
false positive.
Coverity complains that calling networkPlugBandwidth() could eventually
end up with a NULL dereference on iface->bandwidth because in the
networkAllocateActualDevice there's a check of 'iface->bandwidth'
before deciding to try to use the 'portgroup' if it exists or to not
perferm the virNetDevBandwidthCopy if 'bandwidth' is not NULL.
Later in networkPlugBandwidth the 'iface->bandwidth' is sourced from
virDomainNetGetActualBandwidth - which would be either iface->bandwidth
or (preferably) iface->data.network.actual->bandwidth which would have
been filled in from either 'iface->bandwidth' or 'portgroup->bandwidth'
back in networkAllocateActualDevice
There *is* a check in networkCheckBandwidth for the result of the
virDomainNetGetActualBandwidth being NULL and a return 1 based on
that which would cause networkPlugBandwidth to exit properly and thus
never hit the condition that Coverity complains about.
However, since Coverity checks all paths - it somehow believes that
a return of 0 by networkCheckBandwidth in this condition would end
up causing the possible NULL dereference. The "fix" to silence Coverity
is to not have networkCheckBandwidth also call virDomainNetGetActualBandwidth
in order to get the ifaceBand, but rather have it accept it as an argument
which causes Coverity to "see" that it's the exit condition of 1 that won't
have the possible NULL dereference. Since we're passing that, I added the
passing of iface->mac rather than passing iface as well. This just hopefully
makes sure someone doesn't undo this in the future...
Jiri Denemark [Mon, 16 Feb 2015 14:17:00 +0000 (15:17 +0100)]
Use PAUSED state for domains that are starting up
When libvirt is starting a domain, it reports the state as SHUTOFF until
it's RUNNING. This is not ideal because domain startup may take a long
time (usually because of some configuration issues, firewalls blocking
access to network disks, etc.) and domain lists provided by libvirt look
awkward. One can see weird shutoff domains with IDs in a list of active
domains or even shutoff transient domains. In any case, it looks more
like a bug in libvirt than a normal state a domain goes through.
Michal Privoznik [Tue, 17 Mar 2015 16:44:50 +0000 (17:44 +0100)]
qemuGetDHCPInterfaces: Don't leak @network
The function needs a pointer to the network to get list of DHCP
leases. The pointer is obtained via virNetworkLookupByName() which
requires callers to free the returned network once no longer needed.
Otherwise it's leaked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Tue, 17 Mar 2015 16:34:22 +0000 (17:34 +0100)]
cmdDomIfAddr: Free @ip_addr_str
The variable holds formatted suffix to each line printed out
(address type, address and prefix). However, the variable is
never freed. At the same time, honour fact, that data held in
the variable is not constant.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Tue, 17 Mar 2015 16:28:53 +0000 (17:28 +0100)]
virsh: Adapt to new HW address scenario
Make sure we don't print (null) (which in fact is printf()'s
cleverness anyway, not ours). If no HW address is present, print
"N/A" string just like we do for other fields.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Wed, 11 Mar 2015 20:37:04 +0000 (14:37 -0600)]
qemu: read backing chain names from qemu
https://bugzilla.redhat.com/show_bug.cgi?id=1199182 documents that
after a series of disk snapshots into existing destination images,
followed by active commits of the top image, it is possible for
qemu 2.2 and earlier to end up tracking a different name for the
image than what it would have had when opening the chain afresh.
That is, when starting with the chain 'a <- b <- c', the name
associated with 'b' is how it was spelled in the metadata of 'c',
but when starting with 'a', taking two snapshots into 'a <- b <- c',
then committing 'c' back into 'b', the name associated with 'b' is
now the name used when taking the first snapshot.
Sadly, older qemu doesn't know how to treat different spellings of
the same filename as identical files (it uses strcmp() instead of
checking for the same inode), which means libvirt's attempt to
commit an image using solely the names learned from qcow2 metadata
fails with a cryptic:
error: internal error: unable to execute QEMU command 'block-commit': Top image file /tmp/images/c/../b/b not found
even though the file exists. Trying to teach libvirt the rules on
which name qemu will expect is not worth the effort (besides, we'd
have to remember it across libvirtd restarts, and track whether a
file was opened via metadata or via snapshot creation for a given
qemu process); it is easier to just always directly ask qemu what
string it expects to see in the first place.
As a safety valve, we validate that any name returned by qemu
still maps to the same local file as we have tracked it, so that
a compromised qemu cannot accidentally cause us to act on an
incorrect file.
* src/qemu/qemu_monitor.h (qemuMonitorDiskNameLookup): New
prototype.
* src/qemu/qemu_monitor_json.h (qemuMonitorJSONDiskNameLookup):
Likewise.
* src/qemu/qemu_monitor.c (qemuMonitorDiskNameLookup): New function.
* src/qemu/qemu_monitor_json.c (qemuMonitorJSONDiskNameLookup)
(qemuMonitorJSONDiskNameLookupOne): Likewise.
* src/qemu/qemu_driver.c (qemuDomainBlockCommit)
(qemuDomainBlockJobImpl): Use it.
network: Add midonet virtual port type support to qemu
Use the utilities introduced in the previous patches so the qemu
driver is able to create tap devices that are bound (and unbound
on domain destroyal) to Midonet virtual ports.
Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>
docs: schema and docs for the midonet virtualport type
Midonet is an opensource virtual networking that over lays the IP
network between hypervisors. Currently, such networks can be made
with the openvswitch virtualport type.
This patch, defines the schema and documentation that will serve
as basis for the follow up patches that will add support to libvirt
for using Midonet virtual ports for its interfaces. The schema
definition requires that the port profile expresses its interfaceid
as part of the port profile. For that reason, this is part of the
patch too.
Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>
util: functions to support binding/unbinding midonet virtualports
Adds the port type definitions and methods that will be used to bind
interfaces to the Midonet virtual ports.
virtnetdevmidonet.c adds the way to bind and unbind the ports by
calling into the Midonet Host Agent control command line (installed
with the midolman package).
Signed-off-by: Antoni Segura Puimedon <toni+libvirt@midokura.com>
Peter Krempa [Thu, 12 Mar 2015 16:33:09 +0000 (17:33 +0100)]
conf: disk: Simplify checking if source definition was parsed
Previously we had to check for 3 fields to see if the source was filled.
Repurpose one of the variables as a boolean flag and use it instead of
combining multiple sources.
For the condition that checks that only CDROM/FLOPPY drives can be empty
we can use the virStorageSourceIsEmpty() helper.
Peter Krempa [Thu, 12 Mar 2015 16:53:01 +0000 (17:53 +0100)]
util: storage: Fix check for empty storage device
If the storage device type is parsed as network our parser still allows
it to omit the <source> element. The empty drive check would not trigger
on such device as it expects that every network storage source is valid.
Use VIR_STORAGE_NET_PROTOCOL_NONE as a marker that the storage source is
empty.
Peter Krempa [Thu, 12 Mar 2015 16:12:12 +0000 (17:12 +0100)]
qemu: driver: Fix cold-update of removable storage devices
Only selected fields from the disk source were copied when cold updating
source in a CDROM drive. When such drive was backed by a network file
this resulted into corruption of the definition:
Peter Krempa [Thu, 12 Mar 2015 15:41:21 +0000 (16:41 +0100)]
virsh: domain: Fix the change-media command
The command did not modify the disk type and thus didn't allow to change
media from a file image to a block backed image or vice versa. In
addition when operating on a network backed removable devices the
command would replace the while <source> subelement with an invalid one.
This patch adds the --block option that allows to specify that the new
image is block backed and assumes that without that option all images
are file backed. Since network backends were always mangled it should
not cause problems.
Peter Krempa [Thu, 12 Mar 2015 10:51:51 +0000 (11:51 +0100)]
virsh: domain: Don't use vshPrepareDiskXML for creating XML to detach disk
Since cmdDetachDisk() calls into vshPrepareDiskXML() with
type == VSH_PREPARE_DISK_XML_NONE && source == NULL this would result
into skipping all the checks and effectively turn the function into a
XML formatter.
This patch changes the code to use the formatter directly so that the
function can be refactored in a easier way.
By querying the qemu guest agent with the QMP command
"guest-network-get-interfaces" and converting the received JSON
output to structured objects.
Although "ifconfig" is deprecated, IP aliases created by "ifconfig"
are supported by this API. The legacy syntax of an IP alias is:
"<ifname>:<alias-name>". Since we want all aliases to be clubbed
under parent interface, simply stripping ":<alias-name>" suffices.
Note that IP aliases formed by "ip" aren't visible to "ifconfig",
and aliases created by "ip" do not have any specific name. But
we are lucky, as qemu guest agent detects aliases created by both.
src/remote/remote_protocol.x
* New RPC procedure: REMOTE_PROC_DOMAIN_INTERFACE_ADDRESSES
* Define structs remote_domain_ip_addr, remote_domain_interface,
remote_domain_interfaces_addresse_args, remote_domain_interface_addresses_ret
* Introduce upper bounds (to handle DoS attacks):
REMOTE_DOMAIN_INTERFACE_MAX = 2048
REMOTE_DOMAIN_IP_ADDR_MAX = 2048
Restrictions on the maximum number of aliases per interface were
removed after kernel v2.0, and theoretically, at present, there
are no upper limits on number of interfaces per virtual machine
and on the number of IP addresses per interface.
Maxim Nestratov [Fri, 13 Mar 2015 15:40:42 +0000 (18:40 +0300)]
parallels: fix home directory for VMs
Failures of parallelsStorageOpen occured because we incorrectly treated
path to VM' configuration file as a directory. Now initialization of
parallels VM domains home directory is fixed.
parallels: set cpu mode when applying xml configuration
Otherwise exporting existing domain config and defining a new one like this:
virsh -c parallels:///system dumpxml instance01 > my.xml
virsh -c parallels:///system define my.xml
leads to an error because PCS default x64 mode turns to x32.
Thus, we need to set correct cpuMode in prlsdkDoApplyConfig() explicitly.
We're parsing memballoon status period as unsigned int, but when we're
trying to set it, both we and qemu use signed int. That means large
values will get wrapped around to negative one resulting in error.
Basically the same problem as commit e3a7b874 was dealing with when
updating live domain.
QEMU changed the accepted value to int64 in commit 1f9296b5, but even
values as INT_MAX don't make sense since the value passed means seconds.
Hence adding capability flag for this change isn't worth it.
qemu: Don't duplicate errors when settings stats period
In order not to leave old error messages set, this patch refactors the
code so the error is reported only when acted upon. The only such place
already rewrites any error, so cleaning up all the error reporting in
qemuMonitorSetMemoryStatsPeriod() is enough.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
All the devices we have format their address as its last sub-element, so
let's change memballoon to follow suit. Also adjust RNG to allow any
order of them so 'virsh edit' doesn't shout at us.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Pavel Hrdina [Mon, 16 Mar 2015 11:52:13 +0000 (12:52 +0100)]
rpm-build: use pkg-config to detect wireshark presence
Wireshark supports pkg-config since 1.11.3. Right now we build
wireshark-dissectior tool as default trough rpm build only on
fedora >= 21 and there is new wireshark that supports pkg-config.
If someone wants to build libvirt with wireshark-dissector against old
wireshark, they should specify the location by hand.
This patch is mainly to fix wrong dependency on wireshark binary as it
doesn't make sense to require that binary file to just get version info
of that package in makefile.
Jim Fehlig [Sat, 14 Mar 2015 00:36:04 +0000 (18:36 -0600)]
libxl: fix regression introduced by commit 4ab8cd77
Commit 4ab8cd77 added a check requiring input devices to have
a bus type of VIR_DOMAIN_INPUT_BUS_USB, failing to start the
domain otherwise. But virDomainDefParseXML adds implicit mouse
and keyboard if a graphics device is configured. See calls to
virDomainDefMaybeAddInput.
The regression is fixed by removing the check requiring USB input
devices, and skipping non-USB input devices when populating USB
'usbdevice' in libxl_domain_build_info struct.
Peter Krempa [Mon, 16 Mar 2015 15:52:44 +0000 (16:52 +0100)]
qemu: block-commit: Mark disk in block jobs only on successful command
Patch 51f9f03a4ca50b070c0fbfb29748d49f583e15e1 introduces a regression
where if a blockCommit operation fails the disk is still marked as being
part of a block job but can't be unmarked later.
Eric Blake [Fri, 13 Mar 2015 23:01:43 +0000 (17:01 -0600)]
daemon: avoid memleak when ListAll returns nothing
Commit 4f25146 (v1.2.8) managed to silence Coverity, but at the
cost of a memory leak detected by valgrind:
==24129== 40 bytes in 5 blocks are definitely lost in loss record 355 of 637
==24129== at 0x4A08B1C: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==24129== by 0x5084B8E: virReallocN (viralloc.c:245)
==24129== by 0x514D5AA: virDomainObjListExport (domain_conf.c:22200)
==24129== by 0x201227DB: qemuConnectListAllDomains (qemu_driver.c:18042)
==24129== by 0x51CC1B6: virConnectListAllDomains (libvirt-domain.c:6797)
==24129== by 0x14173D: remoteDispatchConnectListAllDomains (remote.c:1580)
==24129== by 0x121BE1: remoteDispatchConnectListAllDomainsHelper (remote_dispatch.h:1072)
In short, every time a client calls a ListAll variant and asks
for the resulting list, but there are 0 elements to return, we
end up leaking the 1-entry array that holds the NULL terminator.
What's worse, a read-only client can access these functions in a
tight loop to cause libvirtd to eventually run out of memory; and
this can be considered a denial of service attack against more
privileged clients. Thankfully, the leak is so small (8 bytes per
call) that you would already have some other denial of service with
any guest calling the API that frequently, so an out-of-memory
crash is unlikely enough that this did not warrant a CVE.
there are some comments sprinkled in indicating IOThreads were using
the same structure as the VcpuPin code...
This is the first patch of a few that will change the virDomainVcpuPin*
structures and code to just virDomainPin* - starting with the data
structure naming...
John Ferlan [Mon, 9 Mar 2015 22:41:04 +0000 (18:41 -0400)]
qemu: Fix possible memory leak in qemuDomainPinVcpuFlags
During his review of the iothreads pin setting code, Pavel noted that
there was a potential memory leak with respect to how the newVcpuPin
is handled and the goto endjob's in failure paths which would not free
the memory. For reference, See:
Peter Krempa [Wed, 4 Mar 2015 10:04:27 +0000 (11:04 +0100)]
conf: Make specifying <memory> optional
Now that the size of guest's memory can be inferred from the NUMA
configuration (if present) make it optional to specify <memory>
explicitly.
To make sure that memory is specified add a check that some form of
memory size was specified. One side effect of this change is that it is
no longer possible to specify 0KiB as memory size for the VM, but I
don't think it would be any useful to do so. (I can imagine embedded
systems without memory, just registers, but that's far from what libvirt
is usually doing).
Forbidding 0 memory for guests also fixes a few corner cases where 0 was
not interpreted correctly and caused failures. (Arguments for numad when
using automatic placement, size of the balloon). This fixes problems
described in https://bugzilla.redhat.com/show_bug.cgi?id=1161461
Test case changes are added to verify that the schema change and code
behave correctly.
Peter Krempa [Wed, 18 Feb 2015 13:31:47 +0000 (14:31 +0100)]
qemu: command: Add helper to align memory sizes
The memory sizes in qemu are aligned up to 1 MiB boundaries. There are
two places where this was done once for the total size and then for
individual NUMA cell sizes.
Add a function that will align the sizes in one place so that it's clear
where the sizes are aligned.
Peter Krempa [Tue, 17 Feb 2015 17:01:09 +0000 (18:01 +0100)]
conf: Replace access to def->mem.max_balloon with accessor functions
As there are two possible approaches to define a domain's memory size -
one used with legacy, non-NUMA VMs configured in the <memory> element
and per-node based approach on NUMA machines - the user needs to make
sure that both are specified correctly in the NUMA case.
To avoid this burden on the user I'd like to replace the NUMA case with
automatic totaling of the memory size. To achieve this I need to replace
direct access to the virDomainMemtune's 'max_balloon' field with
two separate getters depending on the desired size.
The two sizes are needed as:
1) Startup memory size doesn't include memory modules in some
hypervisors.
2) After startup these count as the usable memory size.
Note that the comments for the functions are future aware and document
state that will be present after a few later patches.
Peter Krempa [Fri, 13 Mar 2015 16:00:03 +0000 (17:00 +0100)]
qemu: event: Don't fiddle with disk backing trees without a job
Surprisingly we did not grab a VM job when a block job finished and we'd
happily rewrite the backing chain data. This made it possible to crash
libvirt when queueing two backing chains tightly and other badness.
To fix it, add yet another handler to the helper thread that handles
monitor events that require a job.
Erik Skultety [Thu, 19 Feb 2015 15:53:13 +0000 (16:53 +0100)]
qemu: Check for negative port values in network drive configuration
We interpret port values as signed int (convert them from char *),
so if a negative value is provided in network disk's configuration,
we accept it as valid, however there's an 'unknown cause' error raised later.
This error is only accidental because we return the port value in the return code.
This patch adds just a minor tweak to the already existing check so we
reject negative values the same way as we reject non-numerical strings.
Eric Blake [Fri, 13 Mar 2015 20:55:58 +0000 (14:55 -0600)]
network: avoid memory leak of dnsmasq capabilities
Valgrind detected a leak:
==17820== 102 (56 direct, 46 indirect) bytes in 1 blocks are definitely lost in loss record 479 of 646
==17820== at 0x4A08946: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==17820== by 0x508521A: virAllocVar (viralloc.c:560)
==17820== by 0x50D9FCA: virObjectNew (virobject.c:193)
==17820== by 0x50A4FD9: dnsmasqCapsNewEmpty (virdnsmasq.c:784)
==17820== by 0x50A514E: dnsmasqCapsNewFromBinary (virdnsmasq.c:830)
==17820== by 0x1B508287: networkStateInitialize (bridge_driver.c:666)
It looks like commit 172acef introduced the problem, because
networkGetDnsmasqCaps() increments the reference count but an
early exit never does a matching decrement.
Eric Blake [Fri, 13 Mar 2015 20:35:31 +0000 (14:35 -0600)]
netdev: silence valgrind warning about ioctl use
Valgrind complained:
==3770== Syscall param ioctl(SIOCETHTOOL) points to uninitialised byte(s)
==3770== at 0x919D407: ioctl (syscall-template.S:81)
==3770== by 0x530FE7E: rpl_ioctl (ioctl.c:42)
==3770== by 0x50CB433: virNetDevFeatureAvailable (virnetdev.c:2764)
==3770== by 0x50CB6A7: virNetDevGetFeatures (virnetdev.c:2830)
==3770== by 0x1F0E5347: udevProcessNetworkInterface (node_device_udev.c:722)
==3770== by 0x1F0E689F: udevGetDeviceDetails (node_device_udev.c:1300)
==3770== by 0x1F0E6E06: udevAddOneDevice (node_device_udev.c:1422)
==3770== by 0x1F0E6FB8: udevProcessDeviceListEntry (node_device_udev.c:1464)
==3770== by 0x1F0E70CF: udevEnumerateDevices (node_device_udev.c:1494)
==3770== by 0x1F0E7BB4: nodeStateInitialize (node_device_udev.c:1806)
==3770== by 0x51B4303: virStateInitialize (libvirt.c:777)
==3770== by 0x11DEE7: daemonRunStateInit (libvirtd.c:906)
==3770== Address 0x228e38d4 is on thread 12's stack
==3770== in frame #2, created by virNetDevFeatureAvailable (virnetdev.c:2750)
* src/util/virnetdev.c (virNetDevFeatureAvailable): Initialize all
bytes of ifr.
Eric Blake [Fri, 13 Mar 2015 15:56:48 +0000 (09:56 -0600)]
virsh: fix report of non-active commit completion
Commit f182da20 (v1.2.6) caused a slight regression in virsh
reporting of a non-active block job; where it used to state
"Commit complete", it now states "Now in synchronized phase".
But the synchronized phase is only possible for an active commit.
For a reproducer, I created a chain 'a <- b <- c <- d <- e' and
ran virsh blockcommit $dom vda --top c --base a --verbose --wait
* tools/virsh-domain.c (cmdBlockCommit): Synchronized phase is
only possible on active commits.
Problem Description:
After multiple times of migrating a domain, which has an ovs interface with no portData set,
with non-shared disk, nbd ports got overflowed.
The steps to reproduce the problem:
1 define and start a domain with its network configured as:
<interface type='bridge'>
<source bridge='br0'/>
<virtualport type='openvswitch'>
</virtualport>
<model type='virtio'/>
<driver name='vhost' queues='4'/>
</interface>
2 do not set the network's portData.
3 migrate(ToURI2) it with flag 91(1011011), which means:
VIR_MIGRATE_LIVE
VIR_MIGRATE_PEER2PEER
VIR_MIGRATE_PERSIST_DEST
VIR_MIGRATE_UNDEFINE_SOURCE
VIR_MIGRATE_NON_SHARED_DISK
4 migrate success, but we got an error log in libvirtd.log:
error : virCommandWait:2423 : internal error: Child process (ovs-vsctl --timeout=5 get Interface
vnet1 external_ids:PortData) unexpected exit status 1: ovs-vsctl: no key "PortData" in Interface
record "vnet1" column external_ids
5 migrate it back, migrate it , migrate it back, .......
6 nbd port got overflowed.
The reasons for the problem is :
1 virNetDevOpenvswitchGetMigrateData() takes it as wrong if no portData is available for the ovs
interface of a domain. (We think it's not appropriate, as portData is just OPTIONAL)
2 in func qemuMigrationBakeCookie(), it fails in qemuMigrationCookieAddNetwork(), and returns with -1.
qemuMigrationCookieAddNBD() is not called thereafter, and mig->nbd is still NULL.
3 However, qemuMigrationRun() just *WARN* if qemuMigrationBakeCookie() fails, migration still successes.
cookie is NULL, it's not baked on the src side.
4 On the destination side, it would alloc a port first and then free the nbd port in COOKIE.
But the cookie is NULL due to qemuMigrationCookieAddNetwork() failure at src side. thus the nbd port
is not freed.
In this patch, we add "--if-exists" option to make ovs-vsctl not raise error if there's no portData available.
Further more, because portData may be NULL in the cookie at the dest side, check it before setting portData.
Signed-off-by: Zhou Yimin <zhouyimin@huawei.com> Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>