Guido Günther [Fri, 22 Mar 2013 09:25:42 +0000 (10:25 +0100)]
qemu: Don't set address type too early during virtio disk hotplug
f946462e14ac036357b7c11ce5c23f94a3ee4e49 changed behavior by settings
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI upfront. If we do so before invoking
qemuDomainPCIAddressEnsureAddr we merely try to set the PCI slot via
qemuDomainPCIAddressReserveSlot instead reserving a new address via
qemuDomainPCIAddressSetNextAddr which fails with
$ ~/run-tck-test domain/200-disk-hotplug.t
./scripts/domain/200-disk-hotplug.t .. # Creating a new transient domain
./scripts/domain/200-disk-hotplug.t .. 1/5 # Attaching the new disk /var/lib/jenkins/jobs/libvirt-tck-build/workspace/scratchdir/200-disk-hotplug/extra.img
# Failed test 'disk has been attached'
# at ./scripts/domain/200-disk-hotplug.t line 67.
# died: Sys::Virt::Error (libvirt error code: 1, message: internal error unable to reserve PCI address 0:0:0.0
# )
Michal Privoznik [Tue, 26 Mar 2013 14:45:16 +0000 (15:45 +0100)]
qemu: Set migration FD blocking
Since we switched from direct host migration scheme to the one,
where we connect to the destination and then just pass a FD to a
qemu, we have uncovered a qemu bug. Qemu expects migration FD to
block. However, we are passing a nonblocking one which results in
cryptic error messages like:
qemu: warning: error while loading state section id 2
load of migration failed
The bug is already known to Qemu folks, but we should workaround
already released Qemus. Patch has been originally proposed by Stefan
Hajnoczi <stefanha@gmail.com>
virConnectOpenAuth didn't require 'name' to be specified (VIR_DEBUG
used NULLSTR() for the output) and by default, if name == NULL, the
default connection uri is used. This was not indicated in the
documentation and wasn't checked for in other API's VIR_DEBUG outputs.
Guannan Ren [Tue, 26 Mar 2013 04:34:49 +0000 (12:34 +0800)]
python: set default value to optional arguments
When prefixing with string (optional) or optional in the description
of arguments to libvirt C APIs, in python, these arguments will be
set as optional arugments, for example:
* virDomainSaveFlags:
* @domain: a domain object
* @to: path for the output file
* @dxml: (optional) XML config for adjusting guest xml used on restore
* @flags: bitwise-OR of virDomainSaveRestoreFlags
the corresponding python APIs is
restoreFlags(self, frm, dxml=None, flags=0)
After further discussions with Alon Levy, I learned the following:
The use of '-vga qxl' vs. '-device qxl-vga' is completely orthogonal
to whether ram_size can be exposed. Downstream distros are interested
in backporting support for multi-head qxl, but this can be done in
one of two ways:
1. Support one head per PCI device. If you do this, then it makes
sense to have full control over the PCI address of each device. For
full control, you need '-device qxl-vga' instead of '-vga qxl'.
2. Support multiple heads through a single PCI device. If you do
this, then you need to allocate more RAM to that PCI device (enough
ram to cover the multiple screens). Here, the device is hard-coded
to 0:0:2.0, both in qemu and libvirt code.
Apparently, backporting ram_size changes to allow multiple heads in
a single device is much easier than backporting multiple device
support. Furthermore, the presence or absence of qxl-vga.surfaces
is no different than the presence or absence of qxl-vga.ram_size;
both properties can be applied regardless of whether you have one
PCI device (-vga qxl) or multiple (-device qxl-vga), so this property
is NOT a good witness of whether '-device qxl-vga' support has been
backported.
Downstream RHEL will NOT be using this patch; and worse, leaving this
patch in risks doing the wrong thing if compiling upstream libvirt
on RHEL, so the best course of action is to revert it. That means
that libvirt will go back to only using '-device qxl-vga' for qemu
>= 1.2, but this is just fine because we know of no distros that plan
on backporting multiple PCI address support to any older version of
qemu. Meanwhile, downstream can still use ram_size to pack multiple
heads through a single PCI device.
$ service libvirt-guests restart
Running guests on default URI: a, b, d, c
Shutting down guests on default URI...
Starting shutdown on guest: a
Shutdown of guest a failed to complete in time.Starting shutdown on guest: b
Shutdown of guest b failed to complete in time.Starting shutdown on guest: d
Shutdown of guest d failed to complete in time.Starting shutdown on guest: c
Shutdown of guest c failed to complete in time.libvirt-guests is configured not to start any guests on boot
* tools/libvirt-guests.sh.in (shutdown_guest): Add missing newline.
Reported by Xuesong Zhang.
to tag the PCI device with "virtual_function" cap.
However, this results in a VF will aslo get "virtual_function" cap.
This patch fixes it by:
* Ignoring the VF which has failure of getting PCI config space
(given that the successfully detected VFs are kept , it makes
sense to not give up on the failure of one VF too) with a warning,
so virPCIGetVirtualFunctions will not return -1 except out of memory.
* Free the allocated *virtual_functions when out of memory
And thus the logic can be changed to:
/* Out of memory */
int ret = virPCIGetVirtualFunctions(syspath,
&data->pci_dev.virtual_functions,
&data->pci_dev.num_virtual_functions);
if (ret < 0 )
goto out;
if (data->pci_dev.num_virtual_functions > 0)
data->pci_dev.flags |= VIR_NODE_DEV_CAP_FLAG_PCI_VIRTUAL_FUNCTION;
Osier Yang [Mon, 7 Jan 2013 17:05:34 +0000 (01:05 +0800)]
nodedev: Abstract nodeDeviceVportCreateDelete as util function
This abstracts nodeDeviceVportCreateDelete as an util function
virManageVport, which can be further used by later storage patches
(to support persistent vHBA, I don't want to create the vHBA
using the public API, which is not good).
* docs/formatnode.html.in: (Document the new XML)
* docs/schemas/nodedev.rng: (Add the schema)
* src/conf/node_device_conf.h: (New member for data.scsi_host)
* src/node_device/node_device_linux_sysfs.c: (Collect the value of
max_vports and vports)
Osier Yang [Mon, 7 Jan 2013 17:05:31 +0000 (01:05 +0800)]
nodedev: Refactor the helpers
This adds two util functions (virIsCapableFCHost and virIsCapableVport),
and rename helper check_fc_host_linux as detect_scsi_host_caps,
check_capable_vport_linux is removed, as it's abstracted to the util
function virIsCapableVport. detect_scsi_host_caps nows detect both
the fc_host and vport_ops capabilities. "stat(2)" is replaced with
"access(2)" for saving.
* src/util/virutil.h:
- Declare virIsCapableFCHost and virIsCapableVport
* src/util/virutil.c:
- Implement virIsCapableFCHost and virIsCapableVport
* src/node_device/node_device_linux_sysfs.c:
- Remove check_capable_vport_linux
- Rename check_fc_host_linux as detect_scsi_host_caps, and refactor
it a bit to detect both fc_host and vport_os capabilities
* src/node_device/node_device_driver.h:
- Change/remove the related declarations
* src/node_device/node_device_udev.c: (Use detect_scsi_host_caps)
* src/node_device/node_device_hal.c: (Likewise)
* src/node_device/node_device_driver.c (Likewise)
Osier Yang [Mon, 7 Jan 2013 17:05:30 +0000 (01:05 +0800)]
nodedev: Use access instead of stat
The use of 'stat' in nodeDeviceVportCreateDelete is only to check
if the file exists or not, it's a bit overkill, and safe to replace
with the wrapper of access(2) (virFileExists).
Osier Yang [Mon, 7 Jan 2013 17:05:29 +0000 (01:05 +0800)]
util: Add one helper virReadFCHost to read the value of fc_host entry
"open_wwn_file" in node_device_linux_sysfs.c is redundant, on one
hand it duplicates work of virFileReadAll, on the other hand, it's
waste to use a function for it, as there is no other users of it.
So I don't see why the file opening work cannot be done in
"read_wwn_linux".
"read_wwn_linux" can be abstracted as an util function. As what all
it does is to read the sysfs entry.
So this patch removes "open_wwn_file", and abstract "read_wwn_linux"
as an util function "virReadFCHost" (a more general name, because
after changes, it can read each of the fc_host entry now).
* src/util/virutil.h: (Declare virReadFCHost)
* src/util/virutil.c: (Implement virReadFCHost)
* src/node_device/node_device_linux_sysfs.c: (Remove open_wwn_file,
and read_wwn_linux)
src/node_device/node_device_driver.h: (Remove the declaration of
read_wwn_linux, and the related macros)
src/libvirt_private.syms: (Export virReadFCHost)
Osier Yang [Mon, 7 Jan 2013 17:05:28 +0000 (01:05 +0800)]
nodedev: Introduce two new flags for listAll API
VIR_CONNECT_LIST_NODE_DEVICES_CAP_FC_HOST to filter the FC HBA,
and VIR_CONNECT_LIST_NODE_DEVICES_CAP_VPORTS to filter the FC HBA
which supports vport.
Osier Yang [Mon, 7 Jan 2013 17:05:27 +0000 (01:05 +0800)]
nodedev: Remove the unused enum
Guess it was created for the fc_host and vports_ops capabilities
purpose, but there is enum virNodeDevScsiHostCapFlags for them,
and enum virNodeDevHBACapType is unused, and actually both
VIR_ENUM_DECL and VIR_ENUM_IMPL use the wrong enum name
"virNodeDevHBACap".
Peter Krempa [Fri, 22 Mar 2013 10:05:36 +0000 (11:05 +0100)]
virsh: Fix docs for "virsh setmaxmem"
The docs assumed the command works always for QEMU and other
hypervisors. As this is done using the balloon mechainism live increase
of the maximum memory limit isn't supported. Fix the docs to mention
this limitation.
Mount temporary devpts on /var/lib/libvirt/lxc/$NAME.devpts
Currently the lxc controller sets up the devpts instance on
$rootfsdef->src, but this only works if $rootfsdef is using
type=mount. To support type=block or type=file for the root
filesystem, we must use /var/lib/libvirt/lxc/$NAME.devpts
for the temporary devpts mount in the controller
Move FUSE mount to /var/lib/libvirt/lxc/$NAME.fuse
Instead of using /var/lib/libvirt/lxc/$NAME for the FUSE
filesystem, use /var/lib/libvirt/lxc/$NAME.fuse. This allows
room for other temporary mounts in the same directory
Some of the LXC callbacks did not lock the virDomainObjPtr
instance. This caused transient errors like
error: Failed to start domain busy-mount
error: cannot rename file '/var/run/libvirt/lxc/busy-mount.xml.new' as '/var/run/libvirt/lxc/busy-mount.xml': No such file or directory
as 2 threads tried to update the status file concurrently
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This configuration does not contain the optional updelay and downdelay
attributes, but parsing will fail due to returning the result of
virXPathULong (a -1 when the attribute doesn't exist) from
virInterfaceDefParseBond after examining the updelay attribute.
While fixing this bug, cleanup the function to use virXPathInt instead
of virXPathULong, and store the result directly instead of using a tmp
variable. Using virXPathInt actually fixes a potential silent
truncation bug noted by Eric Blake.
Also, there is no cleanup in the error label. Remove the label,
returning failure where failure occurs and success if the end of the
function is reached.
Ján Tomko [Fri, 22 Mar 2013 11:32:32 +0000 (12:32 +0100)]
virsh: don't print --(null) in vol-name and vol-pool
Don't print the pool option name if it's null.
Before:
virsh # vol-name vol
error: failed to get vol 'vol', specifying --(null) might help
error: Storage volume not found: no storage vol with matching path vol
After:
virsh # vol-name vol
error: failed to get vol 'vol'
error: Storage volume not found: no storage vol with matching path vol
Paolo Bonzini [Thu, 21 Mar 2013 11:53:51 +0000 (12:53 +0100)]
domain: make port optional for network disks
Only sheepdog actually required it in the code, and we can use 7000 as the
default---the same value that QEMU uses for the simple "sheepdog:VOLUME"
syntax. With this change, the schema can be fixed to allow no port.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Thu, 21 Mar 2013 11:53:49 +0000 (12:53 +0100)]
qemu: add support for libiscsi
libiscsi provides a userspace iSCSI initiator.
The main advantage over the kernel initiator is that it is very
easy to provide different initiator names for VMs on the same host.
Thus libiscsi supports usage of persistent reservations in the VM,
which otherwise would only be possible with NPIV.
libiscsi uses "iscsi" as the scheme, not "iscsi+tcp". We can change
this in the tests (while remaining backwards-compatible manner, because
QEMU uses TCP as the default transport for both Gluster and NBD).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Guannan Ren [Thu, 21 Mar 2013 08:27:09 +0000 (16:27 +0800)]
python: treat flags as default argument with value 0
The following four functions have not changed because default arguments
have to come after positional arguments. Changing them will break the
the binding APIs.
The 'trang' utility, which is able to transform '.rng' files into
'.rnc' files, reported some errors in our schemas that weren't caught
by the tools we use in the build. I haven't added a test for this,
but the validity can be checked by the following command:
trang -I rng -O rnc domain.rng domain.rnc
There were unescaped minuses in regular expressions and we were
constraining int (which is by default in the range of [-2^31;2^31-1]
to maximum of 2^32. But what we wanted was exactly an unsignedInt.
Peter Krempa [Wed, 20 Mar 2013 10:43:41 +0000 (11:43 +0100)]
python: Fix emulatorpin API bindings
The addition of emulator pinning APIs didn't think of doing the right
job with python APIs for them. The default generator produced unusable
code for this.
This patch switches to proper code as in the case of domain Vcpu pining.
This change can be classified as a python API-breaker but in the state
the code was before I doubt anyone was able to use it successfully.
Fix initialization of virIdentityPtr thread locals
Some code mistakenly called virIdentityOnceInit directly
instead of virIdentityInitialize(). This meant that one-time
initializer was run many times with predictably bad results.
The VIR_ERR_NO_SUPPORT error code is reserved for cases where an
API is not implemented in a driver. It definitely should not be
used when an API execution fails due to unsupported operation.
The recent commit moved some of the use of libnuma out of the
driver code, and into src/util/. It did not, however, update
libvirt_util.la to link against libnuma. This caused linkage
failure with virt-aa-helper, since nothing else caused libnuma
to be pulled onto the linker command line.
The fix removes all reference to NUMACTL_LIBS/CFLAGS from the
various modules in src/Makefile.am and just adds them to the
libvirt_util.la module, which everything else depends on.
Technically a build-breaker fix, but wanted to wait for feedback
on this
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Osier Yang [Mon, 18 Mar 2013 13:52:04 +0000 (21:52 +0800)]
qemu: Add the new disk src into shared disk table when updating disk
We should record the new disk src in the shared disk table for
updating disk (CD-ROM or Floppy) API. Fortunately, we only allow
to update the disk source now, otherwise we might also want to
set the unpriv_sgio setting.
Guannan Ren [Thu, 21 Mar 2013 03:24:49 +0000 (11:24 +0800)]
python: fix bindings that don't raise an exception
For example:
>>> dom.memoryStats()
libvir: QEMU Driver error : Requested operation is not valid:\
domain is not running
There are six such python API functions like so.
The root reason is that generator.py script checks the type of return
value of a python stub function defined in libvirt-api.xml or
libvirt-override-api.xml to see whether to add the raise clause or not
in python wrapper code in libvirt.py.
The type of return value is supposed to be C types.
For those stub functions which return python non-integer data type like
string, list, tuple, dictionary, the existing type in functions varies
from each other which leads problem like this.
Currently, in generator.py, it maintains a buggy whitelist for stub functions
returning a list type. I think it is easy to forget adding new function name
in the whitelist.
This patch makes the value of type consistent with C type "char *"
in libvirt-override-api.xml. For python, any of types could be printed
as string, so I choose "char *" in this case. And the comment in xml
could explain it when adding new function definition.
Guido Günther [Wed, 20 Mar 2013 19:01:18 +0000 (20:01 +0100)]
Don't fail if SELinux is diabled
but libvirt is built with --with-selinux. In this case getpeercon
returns ENOPROTOOPT so don't return an error in that case but simply
don't set seccon.
Gene Czarcinski [Sat, 16 Mar 2013 17:36:32 +0000 (13:36 -0400)]
clarify virsh net commands
Clarify that net-create deals with a transient virtual
network whereas net-define defines a persistent virtual
network definition and will create the network (xml)
definition file.
Clarify that net-destroy works with both transient and
persistent virtual networks.
Gao feng [Mon, 18 Mar 2013 09:04:02 +0000 (17:04 +0800)]
LXC: allow uses advisory nodeset from querying numad
Allow lxc using the advisory nodeset from querying numad,
this means if user doesn't specify the numa nodes that
the lxc domain should assign to, libvirt will automatically
bind the lxc domain to the advisory nodeset which queried from
numad.
Olivia Yin [Thu, 14 Mar 2013 04:49:43 +0000 (12:49 +0800)]
qemu: add dtb option support
The "dtb" option sets the filename for the device tree.
If without this option support, "-dtb file" will be converted into
<qemu:commandline> in domain XML file.
For example, '-dtb /media/ram/test.dtb' will be converted into
<qemu:commandline>
<qemu:arg value='-dtb'/>
<qemu:arg value='/media/ram/test.dtb'/>
</qemu:commandline>
This is not very friendly.
This patchset add special <dtb> tag like <kernel> and <initrd>
which is easier for user to write domain XML file.
<os>
<type arch='ppc' machine='ppce500v2'>hvm</type>
<kernel>/media/ram/uImage</kernel>
<initrd>/media/ram/ramdisk</initrd>
<dtb>/media/ram/test.dtb</dtb>
<cmdline>root=/dev/ram rw console=ttyS0,115200</cmdline>
</os>
Doug Goldstein [Sun, 17 Mar 2013 01:57:55 +0000 (20:57 -0500)]
Fix --without-libvirtd builds
When building with --without-libvirtd and udev support is detected we
will fail to build with the following error:
node_device/node_device_udev.c:1608:37: error: unknown type name
'virStateInhibitCallback'
Gene Czarcinski [Tue, 19 Mar 2013 18:36:28 +0000 (14:36 -0400)]
rename tests/conftest.c
To prevent confusion with configure's popular name
for a file, rename conftest.c to test_conf.c which
is consistent with the invoking test_conf.sh Signed-off-by: Gene Czarcinski <gene@czarc.net>
Laine Stump [Mon, 18 Mar 2013 20:04:11 +0000 (16:04 -0400)]
storage: fix unlikely memory leak in rbd backend
virStorageBackendRBDRefreshPool() first allocates an array big enough
to hold 1024 names, then calls rbd_list(), which returns ERANGE if the
array isn't big enough. When that happens, the VIR_ALLOC_N is called
again with a larger size. Unfortunately, the original array isn't
freed before allocating a new one.
Do not prematurely close loop devices in LXC controller
The LXC controller is closing loop devices as soon as the
container has started. This is fine if the loop device
was setup as a mounted filesystem, but if we're just passing
through the loop device as a disk, nothing else is keeping
it open. Thus we must keep the loop device FDs open for as
long the libvirt_lxc process is running.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the LXC controller creates the cgroup, configures the
resources and adds the task all in one go. This is not sufficiently
flexible for the forthcoming NBD integration. We need to make sure
the NBD process gets into the right cgroup immediately, but we can
not have limits (in particular the device ACL) applied at the point
where we start qemu-nbd. So create a virLXCCgroupCreate method
which creates the cgroup and adds the current task to be called
early, and leave virLXCCgroupSetup to only do resource config.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Set the current client identity during API call dispatch
When dispatching an RPC API call, setup the current identity to
hold the identity of the network client associated with the
RPC message being dispatched. The setting is thread-local, so
only affects the API call in this thread
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add ability to get a virIdentity from a virNetServerClientPtr
Add APIs which allow creation of a virIdentity from the info
associated with a virNetServerClientPtr instance. This is done
based on the results of client authentication processes like
TLS, x509, SASL, SO_PEERCRED
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If no user identity is available, some operations may wish to
use the system identity. ie the identity of the current process
itself. Add an API to get such an identity.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a local object virIdentity for managing security
attributes used to form a client application's identity.
Instances of this object are intended to be used as if they
were immutable, once created & populated with attributes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A socket object has various pieces of security data associated
with it, such as the SELinux context, the SASL username and
the x509 distinguished name. Add new APIs to virNetServerClient
and related modules to access this data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: check backing chains even when cgroup is omitted
added backing file checks just before the code that removes optional
disks if they are not present. However, the backing chain code fails in
case the disk file does not exist, which makes qemuProcessStart fail
regardless on configured startupPolicy.
Note that startupPolicy implementation is still wrong after this patch
since it only check the first file in a possible chain. It should rather
check the complete backing chain. But this is an existing limitation
that can be solved later. After all, startupPolicy is most useful for
CDROM images and they won't make use of backing files in most cases.
Paolo Bonzini [Mon, 25 Feb 2013 17:44:25 +0000 (18:44 +0100)]
qemu: support URI syntax for NBD
QEMU 1.3 and newer support an alternative URI-based syntax to specify
the location of an NBD server. Libvirt can keep on using the old
syntax in general, but only the URI syntax supports IPv6 addresses.
The URI syntax also supports relative paths to Unix sockets. These
should never be used but aren't explicitly blocked either by the parser,
so support it just in case.
The URI syntax is intentionally compatible with Gluster's, and the
code can be reused.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Paolo Bonzini [Mon, 25 Feb 2013 17:44:21 +0000 (18:44 +0100)]
qemu: do not support non-network disks without -drive
QEMU added -drive in 2007, and NBD in 2008. Both appeared first in
release 0.10.0. Thus the code to support network disks without -drive
is dead, and in fact it incorrectly escapes commas. Drop it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After we switched to C99 initialization, I noticed there were many
places where the specification of .flags parameter differed. After
going through many options and deciding whether to unify the
initialization to be '.flags = 0' or '.flags = VSH_OFLAG_NONE', I
realized both can be removed and it makes the code easier to go
through.
According to the man page, the memspec parameter should have the
'--memspec' option mandatory and this is as close as we can get to
that. What this change does is explained below.
Li Zhang [Fri, 15 Mar 2013 09:25:09 +0000 (17:25 +0800)]
Remove contiguous CPU indexes assumption
When getting CPUs' information, it assumes that CPU indexes
are not contiguous. But for ppc64 platform, CPU indexes are not
contiguous because SMT is needed to be disabled, so CPU information
is not right on ppc64 and vpuinfo, vcpupin can't work corretly.
This patch is to remove the assumption to be compatible with ppc64.
Test:
4 vcpus are assigned to one VM and execute vcpuinfo command.
Without patch: There is only one vcpu informaion can be listed.
With patch: All vcpus' information can be listed correctly.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com>
The text version
of LGPLv2.1 available at
http://www.gnu.org/licenses/old-licenses/lgpl-2.1.txt is slightly
different from COPYING.LIB:
- several paragraphs were rewrapped
- the FSF address has changed, so the license has been changed to
indicate the newer address
I've checked that there are no changes in the license text apart from
the updated address, which is what I want to fix with this commit.