build: Don't fail on '<' or '>' with old xmllint
Older xmllint version don't allow such characters in datatype anyURI.
In order not to change too much, I'm suggesting making a choice of
anyURI or 'absPathName' which should be fine (checked with upstream
and that old xmllint, both work fine).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Oops. That's not valid XML. And when we fix the XML
generation, it fails RelaxNG validation.
I'm also tired of seeing <key>(null)</key> in the example
output for volume xml; while we used NULLSTR() to avoid
a NULL deref rather than relying on glibc's printf
extension behavior, it's even better if we avoid the issue
in the first place. But this requires being careful that
we don't invalidate any storage backends that were relying
on key being unassigned during virStoragVolCreateXML[From].
I would have split this into two patches (one for escaping,
one for avoiding <key>(null)</key>), but since they both
end up touching a lot of the same test files, I ended up
merging it into one.
Note that this patch allows pretty much any volume name
that can appear in a directory (excluding . and .. because
those are special), but does nothing to change the current
(unenforced) RelaxNG claim that pool names will consist
only of letters, numbers, _, -, and +. Tightening the C
code to match RelaxNG patterns and/or relaxing the grammar
to match the C code for pool names is a task for another
day (but remember, we DID recently tighten C code for
domain names to exclude a leading '.').
* src/conf/storage_conf.c (virStoragePoolSourceFormat)
(virStoragePoolDefFormat, virStorageVolTargetDefFormat)
(virStorageVolDefFormat): Escape user-controlled strings.
(virStorageVolDefParseXML): Parse key, for use in unit tests.
* src/storage/storage_driver.c (storageVolCreateXML)
(storageVolCreateXMLFrom): Ensure parsed key doesn't confuse
volume creation.
* docs/schemas/basictypes.rng (volName): Relax definition.
* tests/storagepoolxml2xmltest.c (mymain): Test it.
* tests/storagevolxml2xmltest.c (mymain): Likewise.
* tests/storagepoolxml2xmlin/pool-dir-naming.xml: New file.
* tests/storagepoolxml2xmlout/pool-dir-naming.xml: Likewise.
* tests/storagevolxml2xmlin/vol-file-naming.xml: Likewise.
* tests/storagevolxml2xmlout/vol-file-naming.xml: Likewise.
* tests/storagevolxml2xmlout/vol-*.xml: Fix fallout.
Doug Goldstein [Thu, 21 Nov 2013 14:47:08 +0000 (08:47 -0600)]
python: remove virConnectGetCPUModelNames from globals
Commit de51dc9c9aed0e615c8b301cccb89f4859324eb0 primarily added
virConnectGetCPUModelNames as libvirt.getCPUModelNames(conn, arch)
instead of libvirt.virConnect.getCPUModelNames(arch) so revert the code
that does the former while leaving the code that does the later.
This is the rest of the patch that was ACK'd by Dan but I committed only
the partial patch in 6a8b8ae.
Doug Goldstein [Thu, 21 Nov 2013 14:47:08 +0000 (08:47 -0600)]
python: remove virConnectGetCPUModelNames from globals
Commit de51dc9c9aed0e615c8b301cccb89f4859324eb0 primarily added
virConnectGetCPUModelNames as libvirt.getCPUModelNames(conn, arch)
instead of libvirt.virConnect.getCPUModelNames(arch) so revert the code
that does the former while leaving the code that does the later.
Eric Farman [Thu, 21 Nov 2013 03:36:27 +0000 (22:36 -0500)]
qemu: Auto-generate controller for hotplugged hostdev
If a SCSI hostdev is included in an initial domain XML, without a
corresponding controller statement, one is created silently when the
guest is booted.
When hotplugging a SCSI hostdev, a presumption is that the controller
is already present in the domain either from the original XML, or via
an earlier hotplug.
[root@xxxxxxxx ~]# cat disk.xml
<hostdev mode='subsystem' type='scsi'>
<source>
<adapter name='scsi_host0'/>
<address bus='0' target='3' unit='1088438288'/>
</source>
</hostdev>
[root@xxxxxxxx ~]# virsh attach-device guest01 disk.xml
error: Failed to attach device from disk.xml
error: internal error: unable to execute QEMU command 'device_add': Bus 'scsi0.0' not found
Since the infrastructure is in place, we can also create a controller
silently for use by the hotplugged hostdev device.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Eric Farman [Thu, 21 Nov 2013 03:36:26 +0000 (22:36 -0500)]
qemu: Separate calls based on controller bus type
For systems without a PCI bus, attaching a SCSI controller fails:
[root@xxxxxxxx ~]# cat controller.xml
<controller type='scsi' model='virtio-scsi' index='0' />
[root@xxxxxxxx ~]# virsh attach-device guest01 controller.xml
error: Failed to attach device from controller.xml
error: XML error: No PCI buses available
A similar problem occurs with the detach of a controller:
[root@xxxxxxxx ~]# virsh detach-device guest01 controller.xml
error: Failed to detach device from controller.xml
error: operation failed: controller scsi:0 not found
The qemuDomainXXtachPciControllerDevice routines made assumptions
that any caller had a PCI bus. These routines now selectively calls
PCI functions where necessary, and assigns the device information
type to one appropriate for the bus in use.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Eric Farman [Thu, 21 Nov 2013 03:36:25 +0000 (22:36 -0500)]
qemu: Rename controller hotplug functions to not be PCI-specific
For attach/detach of controller devices, we rename the functions to
remove 'PCI' from their title. The actual separation of PCI-specific
operations will be handled in the next patch.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Clark Laughlin [Tue, 19 Nov 2013 21:49:40 +0000 (21:49 +0000)]
qemu: Add support for virt machine type with virtio-mmio devices on armv7
These changes allow the correct virtio-blk-device and virtio-net-device
devices to be used for the 'virt' machine type for armv7 rather than the
PCI virtio devices.
A test case was added to qemuxml2argvtest for this change.
Signed-off-by: Clark Laughlin <clark.laughlin@linaro.org>
For a while we're have random failures of 'securityselinuxtest'
which were not at all reproducible. Fortunately we finally
caught a failure with VIR_TEST_DEBUG=1 enabled. This revealed
TEST: securityselinuxtest
1) GenLabel "dynamic unconfined, s0, c0.c1023" ... OK
2) GenLabel "dynamic unconfined, s0, c0.c1023" ... OK
3) GenLabel "dynamic unconfined, s0, c0.c1023" ... OK
4) GenLabel "dynamic virtd, s0, c0.c1023" ... OK
5) GenLabel "dynamic virtd, s0, c0.c10" ... OK
6) GenLabel "dynamic virtd, s2-s3, c0.c1023" ... OK
7) GenLabel "dynamic virtd, missing range" ... Category two 1024 is out of range 0-1023
FAILED
FAIL: securityselinuxtest
And sure enough we had an off-by-1 in the MCS range code when
the current process has no range set. The test suite randomly
allocates 2 categories from 0->1024 so the chances of hitting
this in the test suite were slim indeed :-)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michael Chapman [Wed, 20 Nov 2013 12:58:24 +0000 (12:58 +0000)]
spec: fix libvirt-docs subpackage on RHEL-6
RHEL-6's rpmbuild wipes the docdir for a (sub-)package if any %doc
directives are present, prior to copying in the marked documentation.
This means we can't prepopulate this directory with the HTML
documentation during the %install phase.
Instead, move the HTML documentation to a temporary directory during
%install and mark the contents of this temporary directory with %doc.
Eric Blake [Tue, 19 Nov 2013 21:13:31 +0000 (14:13 -0700)]
maint: ship .pl scripts as executables
All our .pl scripts had the executable bit set, except for one.
Make it consistent (even if we invoke the scripts as an argument
to $(PERL) rather than directly).
Michal Privoznik [Tue, 19 Nov 2013 15:30:28 +0000 (16:30 +0100)]
qemumonitorjsontest: Introduce GetNonExistingCPUData test
In the 730af8f2cd commit we are fixing broken qemu startup on systems
with ancient qemu. This commit introduces the regression test for that
specific case to make sure we don't break it again.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Tue, 19 Nov 2013 14:42:28 +0000 (15:42 +0100)]
qemuMonitorJSONGetCPUx86Data: Don't fail on ancient qemus
On the domain startup, this function is called to dump some info about
the CPUs. At the beginning of the function we check if we aren't running
older qemu which is not exposing the CPUs via 'qom-list'. However, we
are not checking for even older qemus, which throw 'CommandNotFound'
error.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ryota Ozaki [Mon, 18 Nov 2013 15:39:55 +0000 (00:39 +0900)]
vbox: fix segfault on virsh dumpxml with the existence of USB filters
A USB filter is stored in a hostdev. The original code doesn't
allocate hostdev->info that is expected to be allocated with hostdev.
So use virDomainHostdevDefAlloc() to allocate both as we expect.
Ryota Ozaki [Mon, 18 Nov 2013 15:39:32 +0000 (00:39 +0900)]
build: work around super-old readline.h
This patch shuts up the following warning of clang
on Mac OS X:
virsh.c:2761:22: error: assigning to 'char *' from 'const char [6]' discards qualifiers
[-Werror,-Wincompatible-pointer-types-discards-qualifiers]
rl_readline_name = "virsh";
^ ~~~~~~~
The warning happens because rl_readline_name on Mac OS X comes
from an old readline header that still uses 'char *', while it
is 'const char *' in readline 4.2 (April 2001) and newer.
Tested on Mac OS X 10.8.5 (clang-500.2.75) and Fedora 19 (gcc 4.8.1).
Signed-off-by: Ryota Ozaki <ozaki.ryota@gmail.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Avoid async signal safety problem in glibc's setxid
The glibc setxid is supposed to be async signal safe, but
libc developers confirm that it is not. This causes a problem
when libvirt_lxc starts the FUSE thread and then runs clone()
to start the container. If the clone() was done before the
FUSE thread has completely started up, then the container
will hang in setxid after clone().
The fix is to avoid creating any threads until after the
container has been clone()'d. By avoiding any threads in
the parent, the child is no longer required to run in an
async signal safe context, and we thus avoid the glibc
bug.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ján Tomko [Thu, 31 Oct 2013 11:49:04 +0000 (12:49 +0100)]
Return -1 in virPortAllocatorAcquire if all ports are used
Report the error in virPortAllocatorAcquire instead
of doing it in every caller.
The error contains the port range name instead of the intended
use for the port, e.g.:
Unable to find an unused port in range 'display' (65534-65535)
instead of:
Unable to find an unused port for SPICE
This also adds error reporting when the QEMU driver could not
find an unused port for VNC, VNC WebSockets or NBD migration.
Ján Tomko [Thu, 14 Nov 2013 15:29:29 +0000 (16:29 +0100)]
Properly unref a connection with a close callback
The connection pointer in the closeCallback data was never
initialized, making the unref in remoteClientCloseFunc a no-op.
This fixes the following leak in virsh when the daemon closes
the connection unexpectedly:
1,179 (288 direct, 891 indirect) bytes in 1 blocks are
definitely lost in loss record 745 of 792
at 0x4C2A6D0: calloc (in vgpreload_memcheck-amd64-linux.so)
by 0x4E9643D: virAllocVar (viralloc.c:558)
by 0x4ED2425: virObjectNew (virobject.c:190)
by 0x4F675AC: virGetConnect (datatypes.c:116)
by 0x4F6EA06: do_open (libvirt.c:1136)
by 0x4F71017: virConnectOpenAuth (libvirt.c:1481)
by 0x129FFA: vshReconnect (virsh.c:337)
by 0x128310: main (virsh.c:2470)
Michael Avdienko [Fri, 15 Nov 2013 11:47:43 +0000 (20:47 +0900)]
Fix migration with QEMU 1.6
QEMU 1.6.0 introduced new migration status: setup
Libvirt does not expect such string in QMP and refuses to migrate with error
"unexpected migration status in setup"
So far qemuSetupHostdevCGroup was called very early during hotplug, even
before we knew the device we were about to hotplug was actually
available. By calling the function later, we make sure QEMU won't be
allowed to access devices used by other domains.
Another important effect of this change is that hopluging USB devices
specified by vendor and product (but not by their USB address) works
again. This was broken since v1.0.5-171-g7d763ac, when the call to
qemuFindHostdevUSBDevice was moved after the call to
qemuSetupHostdevCGroup, which then used an uninitialized USB address.
The aim of virObject refing and urefing is to tell where the object is
to be used and when is no longer needed. Hence any object shouldn't be
used after it has been unrefed, as we might be the last to hold the
reference. The better way is to call virObjectUnref() *after* the last
object usage. In this specific case, the monitor EOF handler was called
after the qemuMonitorIO called virObjectUnref. Not only that @mon was
disposed (which is not used in the handler anyway) but the @mon->vm
which is causing a SIGSEGV:
2013-11-15 10:17:54.425+0000: 20110: error : qemuMonitorIO:688 : internal error: early end of file from monitor: possible problem:
qemu-kvm: -incoming tcp:01.01.01.0:49152: Failed to bind socket: Cannot assign requested address
In the qemuProcessReconnectHelper() a new thread that does all the
interesting work is spawned. The rationale is to not block the daemon
startup process in case of unresponsive qemu. However, the thread
handler is a local variable which gets lost once the control goes out of
scope. Hence the thread gets leaked. We can avoid this if the thread
isn't made joinable.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The @list->callbacks is an array that is inflated whenever a new event
is added, e.g. via virDomainEventCallbackListAddID(). However, when we
are freeing the array, we free the items within it but forgot to
actually free it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Wed, 13 Nov 2013 13:18:09 +0000 (14:18 +0100)]
virPCIDeviceBindToStub: Remove unused @oldDriverPath and @oldDriverName
These two chunks had to be part of df4283a55bf. But for some unclear
reason, the weren't. Anyway, these two variables are not used anywhere
within function. They're initialized to NULL and then VIR_FREE()-d. And
there's no reason do do two NOPs, right?
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Eric Blake [Tue, 12 Nov 2013 23:35:36 +0000 (16:35 -0700)]
storage: fix RNG validation of gluster via netfs
While trying to compare netfs against my new gluster pool, I
discovered two things:
virt-xml-validate chokes on valid xml produced by 'virsh pool-dumpxml'
[yet another reason that ALL patches that add new xml should be adding
corresponding tests]
When using glusterfs FUSE mounts, you cannot access a subdirectory
of a gluster volume. The recommended workaround in the gluster
community is to mount the volume to an intermediate location, then
bind-mount the desired subdirectory to the final location. Maybe
we should teach libvirt to do bind-mounting, but for now I chose to
just document the limitation.
* docs/storage.html.in: Improve documentation.
* docs/schemas/storagepool.rng (sourcefmtnetfs): Allow all
formats, and drop redundant info-vendor.
* tests/storagepoolxml2xmltest.c (mymain): New test.
* tests/storagepoolxml2xmlin/pool-netfs-gluster.xml: New file.
* tests/storagepoolxml2xmlout/pool-netfs-gluster.xml: Likewise.
Peter Krempa [Tue, 12 Nov 2013 16:35:31 +0000 (17:35 +0100)]
virsh-interface: Unify list column alignment
Before:
$ virsh iface-list
Name State MAC Address
--------------------------------------------
br0 active f0:de:f1:dc:b8:b0
virbr2 active 52:54:00:61:78:0c
After:
$ virsh iface-list
Name State MAC Address
---------------------------------------------------
br0 active f0:de:f1:dc:b8:b0
virbr2 active 52:54:00:61:78:0c
Ján Tomko [Tue, 12 Nov 2013 12:18:54 +0000 (13:18 +0100)]
Disable nwfilter driver when running unprivileged
When opening a new connection to the driver, nwfilterOpen
only succeeds if the driverState has been allocated.
Move the privilege check in driver initialization before
the state allocation to disable the driver.
This changes the nwfilter-define error from:
error: cannot create config directory (null): Bad address
To:
this function is not supported by the connection driver:
virNWFilterDefineXML
Jason Andryuk [Tue, 12 Nov 2013 17:41:12 +0000 (12:41 -0500)]
libxl: Fix Xen 4.4 libxlVmStart logic
ifdef LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS hides a multi-line body
for a brace-less else. Add braces to ensure proper logic is applied.
Without this fix, new domains cannot be started. Both
libxl_domain_create_new and libxl_domain_create_restore are called when
starting a new domain leading to this error:
libxl: error: libxl.c:324:libxl__domain_rename: domain with name "guest" already exists.
libxl: error: libxl_create.c:800:initiate_domain_create: cannot make domain: -6
Peter Krempa [Mon, 11 Nov 2013 15:34:53 +0000 (16:34 +0100)]
qemu: Check for presence of device and properities when getting CPUID
The QOM path in qemu that contains the CPUID registers of a running VM
may not be present (introduced in QEMU 1.5).
Since commit d94b7817719 we have a regression with QEMU that don't
support reporting of the CPUID register state via the monitor as the
process startup code expects the path to exist.
This patch adds code that checks with the monitor if the requested path
already exists and uses it only in this case.
Peter Krempa [Tue, 12 Nov 2013 14:11:55 +0000 (15:11 +0100)]
virsh-volume: Unify alignment of vol-list output columns
Add an extra space before the first column as we have when listing
domains.
Previous output:
$ virsh vol-list glusterpool
Name Path
-----------------------------------------
asdf gluster://gluster-node-1/gv0/asdf
c gluster://gluster-node-1/gv0/c
cd gluster://gluster-node-1/gv0/cd
$ virsh vol-list glusterpool --details
Name Path Type Capacity Allocation
----------------------------------------------------------------------
asdf gluster://gluster-node-1/gv0/asdf unknown 0.00 B 0.00 B
c gluster://gluster-node-1/gv0/c unknown 16.00 B 16.00 B
cd gluster://gluster-node-1/gv0/cd unknown 0.00 B 0.00 B
New output:
$ virsh vol-list glusterpool
Name Path
------------------------------------------------------------------------------
asdf gluster://gluster-node-1/gv0/asdf
c gluster://gluster-node-1/gv0/c
cd gluster://gluster-node-1/gv0/cd
$ virsh vol-list glusterpool --details
Name Path Type Capacity Allocation
------------------------------------------------------------------------
asdf gluster://gluster-node-1/gv0/asdf unknown 0.00 B 0.00 B
c gluster://gluster-node-1/gv0/c unknown 16.00 B 16.00 B
cd gluster://gluster-node-1/gv0/cd unknown 0.00 B 0.00 B
As of libvirt 1.1.1 and systemd 205, the cgroups layout used by
libvirt has some changes. Update the 'cgroups.html' file from
the website to describe how it works in a systemd world.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the host side of an LXC container console disconnected
and the guest side continued to write data, until the PTY
buffer filled up, the LXC controller would busy wait. It
would repeatedly see POLLHUP from poll() and not disable
the watch.
This was due to some bogus logic detecting blocking
conditions. Upon seeing a POLLHUP we must disable all
reading & writing from the PTY, and setup the epoll to
wake us up again when the connection comes back.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The 'none' machine type is something only intended for use
by libvirt probing capabilities. It isn't something that
is useful for running real VM instances. As such it should
not be exposed to users in the capabilities.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fix mem leak in virQEMUCapsProbeQMPMachineTypes on OOM
The virQEMUCapsProbeQMPMachineTypes method iterates over machine
types copying them into the qemuCapsPtr object. It only updates
the qemuCaps->nmachinetypes value at the end though. So if OOM
occurs in the middle, the destructor of qemuCapsPtr will not
free the partially initialized machine types.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Peter Krempa [Wed, 6 Nov 2013 16:52:23 +0000 (17:52 +0100)]
conf: Split out code to parse the source of a disk definition
To avoid code duplication between snapshot configuration code that
parses the disk source too we need to split out this code that will be
reused later on.
This patch tries to be code movement, some aspects of this function will
be refactored later.
Michal Privoznik [Mon, 11 Nov 2013 15:37:16 +0000 (16:37 +0100)]
qemuDomainObjStart: Warn on corrupted image
If the managedsave image is corrupted, e.g. the XML part is, we fail to
parse it and throw an error, e.g.:
error: Failed to start domain jms8
error: XML error: missing security model when using multiple labels
This is okay, as we can't really start the machine and avoid undefined
qemu behaviour. On the other hand, the error message doesn't give a
clue to users what should they do. The consensus here would be to thrown
a warning to logs saying "Hey, you've got a corrupted file".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
If there's the following snippet in the domain XML, the domain will be
lost upon the daemon restart (if the domain is started prior restart):
<seclabel type='dynamic' relabel='yes'/>
The problem is, the 'label', 'imagelabel' and 'baselabel' are parsed
whenever the VIR_DOMAIN_XML_INACTIVE is *not* present or the label is
static. The latter is not our case, obviously. So, when libvirtd starts
up, it finds domain state xml and parse it. During parsing, many XML
flags are enabled but VIR_DOMAIN_XML_INACTIVE. Hence, our parser tries
to extract 'label', 'imagelabel' and 'baselabel' from the XML which
fails for model='none'. Err, this model - even though not specified in
XML - can be taken from qemu wide config file: /etc/libvirtd/qemu.conf.
However, in order to know we are dealing with model='none' the code in
question must be moved forward a bit. Then a new check must be
introduced. This is what the first two chunks are doing.
But this alone is not sufficient. The domain state XML won't contain the
model attribute without slight modification. The model should be
inserted into the XML even if equal to 'none' and the state XML is being
generated - what if the origin (the @security_driver variable in
qemu.conf) changes during libvirtd restarts?
At the end, a test to catch this scenario is introduced.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Wed, 30 Oct 2013 12:58:09 +0000 (13:58 +0100)]
virsh-domain: Mark --live and --config mutually exclusive in vcpucount
The 'vcpucount' command is a getter command for the vCPUu count. When
one or more of the filtering flags are specified the command returns the
value only for the selected combination. In this case the --live and
--config combination isn't valid. This however didn't cause errors as
the combination of flags was rejected by the libvirt API but then the
fallback code kicked in and requested the count in a way where the clash
of the flags didn't matter.
Mark the flag combination mutually exclusive so that users aren't
confused.
Vitor de Lima [Wed, 6 Nov 2013 16:13:55 +0000 (14:13 -0200)]
qemu: Fix SCSI hotplug on pseries guests
This patch moves some code in the qemuDomainAttachSCSIDisk
function. The check for the existence of a PCI address assigned
to the SCSI controller was moved in order to be executed only
when needed. The PCI address of a controller is not necessary
if QEMU_CAPS_DEVICE is supported.
This fixes issues with the hotplug of SCSI disks on pseries guests.
When virPCIGetVirtualFunctions created the list of an SRIOV Physical
Function's (PF) Virtual Functions (VF), it had assumed that the order
of "virtfn*" links returned by readdir() from the PF's sysfs directory
was already in the correct order. Experience has shown that this is
not always the case - it can be in alphabetical order (which would
e.g. place virtfn11 before virtfn2) or even some seemingly random
order (see the example in the bugzilla report)
This results in 1) incorrect assumptions made by consumers of the
output of the virt_functions list of virsh nodedev-dumpxml, and 2)
setting MAC address and vlan tag on the wrong VF (since libvirt uses
netlink to set mac address and vlan tag, netlink requires the VF#, and
the function virPCIGetVirtualFunctionIndex() returns the wrong index
due to the improperly ordered VF list).
The solution provided by this patch is for virPCIGetVirtualFunctions
to no longer scan the entire device directory in its natural order,
but instead to check for links individually by name "virtfn%d" where
%d starts at 0 and increases with each success. Since VFs are created
contiguously by the kernel, this will guarantee that all VFs are
found, and placed in the arry in the correct order.
One note of use to the uninitiated is that VIR_APPEND_ELEMENT always
either increments *num_virtual_functions or fails, so no this isn't an
endless loop.
(NB: the SRIOV_* defines at the top of virpci.c were removed
because they are unnecessary and/or not used.)
Vitor de Lima [Wed, 6 Nov 2013 16:13:54 +0000 (14:13 -0200)]
qemu: assign PCI address to primary video card
When adding support for Q35 guests, the code to assign a PCI address
to the primary video card was moved into Q35 and i440fx(PIIX3)
specific functions, but no fallback was kept for other machine types
that might have a video card.
This patch remedies that by assigning a PCI address to the primary
video card if it does not have any kind of address. In particular,
this fixes issues with pseries guests.
Signed-off-by: Vitor de Lima <vitor.lima@eldorado.org.br> Signed-off-by: Laine Stump <laine@laine.org>
Peter Krempa [Mon, 14 Oct 2013 09:35:00 +0000 (11:35 +0200)]
qemu: process: Validate specific CPUID flags of a guest
When starting a VM the qemu process may filter out some requested
features of a domain as it's not supported either by the host or by
qemu. Libvirt didn't check if this happened which might end up in
changing of the guest ABI when migrating.
The proof of concept implementation adds the check for the recently
introduced kvm_pv_unhalt cpuid feature bit. This feature depends on both
qemu and host kernel support and thus increase the possibility of guest
ABI breakage.
Peter Krempa [Mon, 23 Sep 2013 16:32:11 +0000 (18:32 +0200)]
qemu: Add support for paravirtual spinlocks in the guest
The linux kernel recently added support for paravirtual spinlock
handling to avoid performance regressions on overcomitted hosts. This
feature needs to be turned in the hypervisor so that the guest OS is
notified about the possible support.
This patch adds a new feature "paravirt-spinlock" to the XML and
supporting code to enable the "kvm_pv_unhalt" pseudo CPU feature in
qemu.
Peter Krempa [Mon, 23 Sep 2013 13:02:38 +0000 (15:02 +0200)]
conf: Refactor storing and usage of feature flags
Currently we were storing domain feature flags in a bit field as the
they were either enabled or disabled. New features such as paravirtual
spinlocks however can be tri-state as the default option may depend on
hypervisor version.
To allow storing tri-state feature state in the same place instead of
having to declare dedicated variables for each feature this patch
refactors the bit field to an array.
Peter Krempa [Wed, 9 Oct 2013 09:42:24 +0000 (11:42 +0200)]
cpu: x86: Add internal CPUID features support and KVM feature bits
Some of the emulator features are presented in the <features> element in
the domain XML although they are virtual CPUID feature bits when
presented to the guest. To avoid confusing the users with these
features, as they are not configurable via the <cpu> element, this patch
adds an internal array where those can be stored privately instead of
exposing them in the XML.
Additionaly KVM feature bits are added as example usage of this code.
qemu: Add monitor APIs to fetch CPUID data from QEMU
The qemu monitor supports retrieval of actual CPUID bits presented to
the guest using QMP monitor. Add APIs to extract these information and
tests for them.
Peter Krempa [Mon, 7 Oct 2013 13:26:17 +0000 (15:26 +0200)]
cpu_x86: Refactor storage of CPUID data to add support for KVM features
The CPUID functions were stored in multiple arrays according to a
specified prefix of those. This made it very hard to add another prefix
to store KVM CPUID features (0x40000000). Instead of hardcoding a third
array this patch changes the approach used:
The code is refactored to use a single array where the CPUID functions
are stored ordered by the cpuid function so that they don't depend on
the specific prefix and don't waste memory. The code is also less
complex using this approach. A trateoff to this is the change from O(N)
complexity to O(N^2) in x86DataAdd and x86DataSubtract. The rest of the
functions were already using O(N^2) algorithms.
Li Zhang [Thu, 7 Nov 2013 08:35:10 +0000 (16:35 +0800)]
storage: Fix a vol-clone bug on ppc64
vol-clone reports out of memory error with disk type on ppc64.
Currently, wbytes is defined as size_t type (8 bytes), but
args's value in ioctl(fd, args..) in kernel is int (4 bytes).
This makes wbytes 2^32 times larger, causing an out of memory error.
This patch changes size_t to int to synchronize with kernel.
Since 86d90b3a (yes, my patch; again) we are supporting NBD storage
migration. However, on error recovery path we got the steps reversed.
The correct order is: return NBD port to the virPortAllocator and then
either unlock the vm or remove it from the driver. Not vice versa.
==11192== Invalid write of size 4
==11192== at 0x11488559: qemuMigrationPrepareAny (qemu_migration.c:2459)
==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)
==11192== by 0x12DECA: remoteDispatchDomainMigratePrepare3ParamsHelper (remote_dispatch.h:5741)
==11192== by 0x5212127: virNetServerProgramDispatchCall (virnetserverprogram.c:435)
==11192== by 0x5211C86: virNetServerProgramDispatch (virnetserverprogram.c:305)
==11192== by 0x520A8FD: virNetServerProcessMsg (virnetserver.c:165)
==11192== by 0x520A9E1: virNetServerHandleJob (virnetserver.c:186)
==11192== by 0x50DA78F: virThreadPoolWorker (virthreadpool.c:144)
==11192== by 0x50DA11C: virThreadHelper (virthreadpthread.c:161)
==11192== Address 0x1368baa0 is 576 bytes inside a block of size 688 free'd
==11192== at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11192== by 0x5079A2F: virFree (viralloc.c:580)
==11192== by 0x11456C34: qemuDomainObjPrivateFree (qemu_domain.c:267)
==11192== by 0x50F41B4: virDomainObjDispose (domain_conf.c:2034)
==11192== by 0x50C2991: virObjectUnref (virobject.c:262)
==11192== by 0x50F4CFC: virDomainObjListRemove (domain_conf.c:2361)
==11192== by 0x1145C125: qemuDomainRemoveInactive (qemu_domain.c:2087)
==11192== by 0x11488520: qemuMigrationPrepareAny (qemu_migration.c:2456)
==11192== by 0x11488EA6: qemuMigrationPrepareDirect (qemu_migration.c:2652)
==11192== by 0x114D1509: qemuDomainMigratePrepare3Params (qemu_driver.c:10332)
==11192== by 0x519075D: virDomainMigratePrepare3Params (libvirt.c:7290)
==11192== by 0x1502DA: remoteDispatchDomainMigratePrepare3Params (remote.c:4798)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
One of my previous patches (c7ac2519b7f) did try to fix the issue when
domain dies too soon during migration. However, this clumsy approach was
missing removal of qemuProcessHandleMonitorDestroy resulting in double
unrefing of mon->vm and hence producing the daemon crash:
==11843== Invalid read of size 4
==11843== at 0x50C28C5: virObjectUnref (virobject.c:255)
==11843== by 0x1148F7DB: qemuMonitorDispose (qemu_monitor.c:258)
==11843== by 0x50C2991: virObjectUnref (virobject.c:262)
==11843== by 0x50C2D13: virObjectFreeCallback (virobject.c:388)
==11843== by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583)
==11843== by 0x509C711: virEventPollRunOnce (vireventpoll.c:652)
==11843== by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==11843== by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==11843== by 0x11F368: main (libvirtd.c:1513)
==11843== Address 0x13b88864 is 4 bytes inside a block of size 136 free'd
==11843== at 0x4A07F5C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==11843== by 0x5079A2F: virFree (viralloc.c:580)
==11843== by 0x50C29E3: virObjectUnref (virobject.c:270)
==11843== by 0x114770E4: qemuProcessHandleMonitorDestroy (qemu_process.c:1103)
==11843== by 0x1148F7CB: qemuMonitorDispose (qemu_monitor.c:257)
==11843== by 0x50C2991: virObjectUnref (virobject.c:262)
==11843== by 0x50C2D13: virObjectFreeCallback (virobject.c:388)
==11843== by 0x509C37B: virEventPollCleanupHandles (vireventpoll.c:583)
==11843== by 0x509C711: virEventPollRunOnce (vireventpoll.c:652)
==11843== by 0x509A620: virEventRunDefaultImpl (virevent.c:274)
==11843== by 0x520D21C: virNetServerRun (virnetserver.c:1112)
==11843== by 0x11F368: main (libvirtd.c:1513)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemuMigrationBeginPhase: Check for 'drive-mirror' for NBD
So far we are checking if qemu supports 'nbd-server-start'. This,
however, makes no sense on the source as nbd-server-* is used on the
destination. On the source the 'drive-mirror' is used instead.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since 21685c955e5466 we have tests/virpcitestdata dir containing the PCI
config files for some dummy PCI devices that are used int virpcitest.
However, the directory containing the config files is not distributed
making 'make rpm' fail.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Wed, 6 Nov 2013 14:15:18 +0000 (15:15 +0100)]
conf: Refactor virDomainDiskSourcePoolDefParse
For some strange reason virDomainDiskSourcePoolDefParse accessed def of
the disk and allocated the pool object in it. To avoid the need to carry
over the disk definition object, refactor this function to return the
allocated object instead.
Eric Blake [Thu, 7 Nov 2013 00:09:26 +0000 (17:09 -0700)]
nodeinfo: fix build on non-Linux
Commit b0f8546 broke the build on mingw, by exposing code that
had Linux-specific dependencies but which was previously protected
by libnuma ifdef guards:
make[3]: Entering directory `/home/eblake/libvirt-tmp/build/src'
CC libvirt_driver_la-nodeinfo.lo
../../src/nodeinfo.c: In function 'virNodeGetSiblingsList':
../../src/nodeinfo.c:1543:30: error: 'SYSFS_THREAD_SIBLINGS_LIST_LENGTH_MAX' undeclared (first use in this function)
if (virFileReadAll(path, SYSFS_THREAD_SIBLINGS_LIST_LENGTH_MAX, &buf) < 0)
^
../../src/nodeinfo.c:1543:30: note: each undeclared identifier is reported only once for each function it appears in
../../src/nodeinfo.c: In function 'virNodeCapsFillCPUInfo':
../../src/nodeinfo.c:1562:5: error: implicit declaration of function 'virNodeGetCpuValue' [-Werror=implicit-function-declaration]
if ((tmp = virNodeGetCpuValue(SYSFS_CPU_PATH, cpu_id,
^
../../src/nodeinfo.c:1562:5: error: nested extern declaration of 'virNodeGetCpuValue' [-Werror=nested-externs]
../../src/nodeinfo.c:1562:35: error: 'SYSFS_CPU_PATH' undeclared (first use in this function)
if ((tmp = virNodeGetCpuValue(SYSFS_CPU_PATH, cpu_id,
^
cc1: all warnings being treated as errors
* src/nodeinfo.c (virNodeCapsFillCPUInfo): Make conditional.
(virNodeGetSiblingsList): Move into #ifdef linux block.
Eric Blake [Tue, 5 Nov 2013 21:12:02 +0000 (14:12 -0700)]
storage: always probe type with buffer
This gets rid of another stat() per volume, as well as cutting
bytes read in half, when populating the volumes of a directory
pool during a pool refresh. Not to mention that it provides an
interface that can let gluster pools also probe file types.
Eric Blake [Tue, 5 Nov 2013 20:50:29 +0000 (13:50 -0700)]
storage: refactor backing chain division of labor
Future patches will want to learn metadata about a file using
a buffer that was already parsed in order to probe the file's
format. Rather than reopening and re-reading the file, it makes
sense to separate getting file contents from actually parsing
those contents.
* src/util/virstoragefile.c (virStorageFileGetMetadataFromBuf)
(virStorageFileGetMetadataFromFDInternal): New functions.
(virStorageFileGetMetadataInternal): Hoist fstat() and read() into
callers.
(virStorageFileGetMetadataFromFD)
(virStorageFileGetMetadataRecurse): Rework clients.
* src/util/virstoragefile.h (virStorageFileGetMetadataFromBuf):
New prototype.
* src/libvirt_private.syms (virstoragefile.h): Export it.
Eric Blake [Tue, 5 Nov 2013 15:30:01 +0000 (08:30 -0700)]
storage: reduce number of stat calls
We are calling fstat() at least twice per storage volume in
a directory storage pool; this is rather wasteful. Refactoring
this is also a step towards making code reusable for gluster,
where gluster can provide struct stat but cannot use fstat().