There are now two places in libvirt which use polkit. Currently
they use pkexec, which is set to be replaced by direct DBus API
calls. Add a common API which they will both be able to use for
this purpose.
No tests are added at this time, since the impl will be gutted
in favour of a DBus API call shortly.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ján Tomko [Thu, 11 Sep 2014 10:56:31 +0000 (12:56 +0200)]
conf: add options for disabling segment offloading
Add options for tuning segment offloading:
<driver>
<host csum='off' gso='off' tso4='off' tso6='off'
ecn='off' ufo='off'/>
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
which control the respective host_ and guest_ properties
of the virtio-net device.
Peter Krempa [Thu, 11 Sep 2014 16:59:32 +0000 (18:59 +0200)]
qemu: Report better errors from broken backing chains
Request erroring out from the backing chain traveller and drop qemu's
internal backing chain integrity tester.
The backing chain traveller reports errors by itself with possibly more
detail than qemuDiskChainCheckBroken ever could.
We also need to make sure that we reconnect to existing qemu instances
even at the cost of losing the backing chain info (this really should be
stored in the XML rather than reloaded from disk, but that needs some
work).
Peter Krempa [Thu, 11 Sep 2014 16:28:47 +0000 (18:28 +0200)]
util: storage: Allow metadata crawler to report useful errors
Add a new parameter to virStorageFileGetMetadata that will break the
backing chain detection process and report useful error message rather
than having to use virStorageFileChainGetBroken.
This patch just introduces the option, usage will be provided
separately.
Jim Fehlig [Mon, 8 Sep 2014 16:22:14 +0000 (10:22 -0600)]
libvirt-guests: run after time-sync.target
When libvirt-guests is configured to start guests on host
boot, it is possible for guests start and read the host
clock before it is synchronized. Services such as
libvirt-guests that require correct time should use the
Special Passive System Unit time-sync.target
Pavel Hrdina [Tue, 9 Sep 2014 14:34:12 +0000 (16:34 +0200)]
cputune_event: queue the event for cputune updates
Now we have universal tunable event so we can use it for reporting
changes to user. The cputune values will be prefixed with "cputune" to
distinguish it from other tunable events.
Pavel Hrdina [Wed, 10 Sep 2014 11:28:24 +0000 (13:28 +0200)]
event: introduce new event for tunable values
This new event will use typedParameters to expose what has been actually
updated and the reason is that we can in the future extend any tunable
values or add new tunable values. With typedParameters we don't have to
worry about creating some other events, we will just use this universal
event to inform user about updates.
Michal Privoznik [Tue, 23 Sep 2014 11:08:39 +0000 (13:08 +0200)]
qemuBuildNumaArgStr: Discard def->cpu check
In the function at one place we check if def->cpu is NULL prior
to accessing def->cpu->ncells. Then, later in the code,
def->cpu->ncells is accessed directly, without the check. This
makes coverity unhappy, because the first check makes it think
def->cpu can be NULL. However, the function is not called if
def->cpu is NULL. Therefore, remove the first check and hopefully
make coverity cheer again.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michael R. Hines [Mon, 13 Jan 2014 06:28:12 +0000 (14:28 +0800)]
qemu: Memory pre-pinning support for RDMA migration
RDMA Live migration requires registering memory with the hardware, and
thus QEMU offers a new 'capability' to pre-register / mlock() the guest
memory in advance for higher RDMA performance before the migration
begins. This capability is disabled by default, which means QEMU will
register the memory with the hardware in an on-demand basis.
This patch exposes this capability with the following example usage:
Since libvirt runs QEMU in a pretty restricted environment, several
files needs to be added to cgroup_device_acl (in qemu.conf) for QEMU to
be able to access the host's infiniband hardware. Full documenation of
the feature can be found on QEMU wiki:
http://wiki.qemu.org/Features/RDMALiveMigration
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com> Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
qemu: Prepare support for arbitrary migration protocol
Currently we only support TCP protocol for native QEMU migration but
this is going to be changed. Let's make the code more general and remove
hardcoded TCP protocol from several places.
Michael R. Hines [Mon, 13 Jan 2014 06:28:10 +0000 (14:28 +0800)]
qemu: Expose additional migration statistics
RDMA migration uses the 'setup' state in QEMU to optionally lock
all memory before the migration starts. The total time spent in
this state is exposed as VIR_DOMAIN_JOB_SETUP_TIME.
Additionally, QEMU also exports migration throughput (mbps) for both
memory and disk, so let's add them too: VIR_DOMAIN_JOB_MEMORY_BPS,
VIR_DOMAIN_JOB_DISK_BPS.
Signed-off-by: Michael R. Hines <mrhines@us.ibm.com> Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
Peter Krempa [Wed, 17 Sep 2014 12:50:04 +0000 (14:50 +0200)]
qemu: save image: Add possibility to return XML stored in the image
Add a new parameter that will allow to return the XML stored in the save
image for further manipulation and adjust the callers. This option will
be used in later patches.
Jim Fehlig [Thu, 18 Sep 2014 21:05:34 +0000 (15:05 -0600)]
libxl: Drop driver lock in libxlDomainDefineXML
There is no need to acquire the driver-wide lock in
libxlDomainDefineXML. When switching to jobs in the libxl
driver, most driver-wide locks were removed. The locking here
was preserved since I mistakenly thought virDomainObjListAdd
needed protection. This is not the case, so remove the
unnecessary locking.
John Ferlan [Fri, 19 Sep 2014 09:53:04 +0000 (05:53 -0400)]
qemu: Add missing goto on rawio
Commit id '9a2f36ec' added a build conditional of CAP_SYS_RAWIO
in order to determine whether or not a disk definition using rawio
should be allowed on platforms without CAP_SYS_RAWIO. If one was
found, virReportError was used but the code didn't goto cleanup.
Pavel Hrdina [Thu, 18 Sep 2014 15:38:32 +0000 (17:38 +0200)]
Move the FIPS detection from capabilities
We are not detecting the presence of FIPS from QEMU, but from procfs and
that means it's not QEMU capability. It was decided that we will pass
this flag to QEMU even if it's not supported by old QEMU binaries.
This patch also reverts changes done by commit a21cfb0f to
qemucapabilitestest and implements a new test case in qemuxml2argvtest.
A long time ago I've implemented support for so called multiqueue
net. The idea was to let guest network traffic be processed by
multiple host CPUs and thus increasing performance. However, this
behavior is enabled by QEMU via special ioctl() iterated over the
all tap FDs passed in by libvirt. Unfortunately, SELinux comes in
and disallows the ioctl() call because the /dev/net/tun has label
system_u:object_r:tun_tap_device_t:s0 and 'attach_queue' ioctl()
is not allowed on tun_tap_device_t type. So after discussion with
a SELinux developer we've decided that the FDs passed to the QEMU
should be labelled with svirt_t type and SELinux policy will
allow the ioctl(). Therefore I've made a patch
(cf976d9dcf4e592261b14f03572) that does exactly this. The patch
was fixed then by a4431931393aeb1ac5893f121151fa3df4fde612 and b635b7a1af0e64754016d758376f382470bc11e7. However, things are not
that easy - even though the API to label FD is called
(fsetfilecon_raw) the underlying file is labelled too! So
effectively we are mangling /dev/net/tun label. Yes, that broke
dozen of other application from openvpn, or boxes, to qemu
running other domains.
The best solution would be if SELinux provides a way to label an
FD only, which could be then labeled when passed to the qemu.
However that's a long path to go and we should fix this
regression AQAP. So I went to talk to the SELinux developer again
and we agreed on temporary solution that:
1) All the three patches are reverted
2) SELinux temporarily allows 'attach_queue' on the
tun_tap_device_t
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
- docs/formatstorage.html.in: document 'zfs' pool type, add it
to a list of pool types that could use source physical devices
- docs/storage.html.in: update a ZFS pool example XML with
source physical devices, mention that starting from 1.2.9 a
pool could be created from this devices by libvirt and in earlier
versions user still has to create a pool manually
- docs/drvbhyve.html.in: add an example with ZFS pools
- Provide an implementation for buildPool and deletePool operations
for the ZFS storage backend.
- Add VIR_STORAGE_POOL_SOURCE_DEVICE flag to ZFS pool poolOptions
as now we can specify devices to build pool from
- storagepool.rng: add an optional 'sourceinfodev' to 'sourcezfs' and
add an optional 'target' to 'poolzfs' entity
- Add a couple of tests to storagepoolxml2xmltest
John Ferlan [Wed, 17 Sep 2014 18:43:12 +0000 (14:43 -0400)]
qemu: Don't fail startup/attach for IOThreads if no JSON
If the qemu being used doesn't support JSON, then querying for IOThread
data would fail. In that case, ensure the *iothreads is NULL and return 0
as the count of iothreads available.
CC qemu/libvirt_driver_qemu_impl_la-qemu_command.lo
qemu/qemu_command.c:6580:58: error: implicit conversion from enumeration type
'virMemAccess' to different enumeration type 'virTristateSwitch'
[-Werror,-Wenum-conversion]
virTristateSwitch memAccess = def->cpu->cells[i].memAccess;
~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~^~~~~~~~~
1 error generated.
Fix that by using virMemAccess instead of virTristateSwitch.
Commit f05b6a91 added virQEMUDriverConfigPtr argument to the
virQEMUCapsFillDomainCaps function and it uses forward declaration
of virQEMUDriverConfig and virQEMUDriverConfigPtr that casues clang
build to fail:
gmake[3]: Entering directory `/usr/home/novel/code/libvirt/src'
CC qemu/libvirt_driver_qemu_impl_la-qemu_capabilities.lo
In file included from qemu/qemu_capabilities.c:43:
In file included from qemu/qemu_hostdev.h:27:
qemu/qemu_conf.h:63:37: error: redefinition of typedef 'virQEMUDriverConfig'
is a C11 feature [-Werror,-Wtypedef-redefinition]
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
qemu/qemu_capabilities.h:328:37: note: previous definition is here
typedef struct _virQEMUDriverConfig virQEMUDriverConfig;
^
Fix that by passing loader and nloader config attributes directly
instead of passing complete config.
Commit b20d39a introduced a new argument for the
virNetDevTapCreateInBridgePort function, however, its mock
in bhyve tests wasn't updated, so the build failed.
Fix build by adding this new argument to the mock version.
Peter Krempa [Thu, 11 Sep 2014 14:35:53 +0000 (16:35 +0200)]
CVE-2014-3633: qemu: blkiotune: Use correct definition when looking up disk
Live definition was used to look up the disk index while persistent one
was indexed leading to a crash in qemuDomainGetBlockIoTune. Use the
correct def and report a nice error.
Unfortunately it's accessible via read-only connection, though it can
only crash libvirtd in the cases where the guest is hot-plugging disks
without reflecting those changes to the persistent definition. So
avoiding hotplug, or doing hotplug where persistent is always modified
alongside live definition, will avoid the out-of-bounds access.
There are two ways how to tell qemu to use huge pages. The first one
is suitable for domains with NUMA nodes: the path to hugetlbfs mount
is appended to NUMA node definition on the command line. The second
one is suitable for UMA domains: here there's this global '-mem-path'
argument that accepts path to the hugetlbfs mount point. However, the
latter case was not used for all the cases that it should be. For
instance:
Michal Privoznik [Mon, 15 Sep 2014 09:59:09 +0000 (11:59 +0200)]
conf: Disallow nonexistent NUMA nodes for hugepages
As of 136ad4974 it is possible to specify different huge pages per
guest NUMA node. However, there's no check if nodeset specified in
./hugepages/page contains only those guest NUMA nodes that exist.
In other words with current code it is possible to define meaningless
combination:
Francesco Romani [Mon, 15 Sep 2014 08:48:04 +0000 (10:48 +0200)]
qemu: bulk stats: extend internal collection API
Future patches which will implement more bulk stats groups for QEMU will
need to access the connection object.
To accommodate that, a few changes are needed:
* enrich internal prototype to pass qemu driver object
* add per-group flag to mark if one collector needs monitor access or not
* If at least one collector of the requested stats needs monitor access
we must start a query job for each domain. The specific collectors
will run nested monitor jobs inside that.
* If the job can't be acquired we pass flags to the collector so
specific collectors that need monitor access can be skipped in order
to gather as much data as is possible.
Signed-off-by: Francesco Romani <fromani@redhat.com> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Michal Privoznik [Wed, 17 Sep 2014 15:17:03 +0000 (17:17 +0200)]
domaincapstest: Run cleanly on systems missing OVMF firmware
As of f05b6a918e28 the test produces the list of paths that can
be passed to <loader/> and libvirt knows about them. However,
during the process of generating the list the paths are checked
for their presence. This may produce different results on
different systems. Therefore, the path - if missing - is
added to pretend it's there.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
qemu: Add support for multiple versions of 'pseries' machine type
qemu for IBM Power processor architecture is adding functionality for
supporting multiple 'pseries' machine type versions, each with different
capabilities. This patch is for supporting the same
Signed-off-by: Pradipta Kr. Banerjee <bpradip@in.ibm.com>
Michal Privoznik [Tue, 16 Sep 2014 12:47:47 +0000 (14:47 +0200)]
domaincaps: Expose UEFI capability
As of 542899168c38 we learned libvirt to use UEFI for domains.
However, management applications may firstly query if libvirt
supports it. And this is where virConnectGetDomainCapabilities()
API comes handy.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Tue, 16 Sep 2014 10:55:32 +0000 (12:55 +0200)]
util: storage: Copy driver type when initializing chain element
virStorageSourceInitChainElement initializes a new storage chain element
for use as a new disk source. If the new element doesn't contain the
driver name, copy it from the old source.
This fixes issue where a disk would forget the driver after a snapshot.
Peter Krempa [Tue, 16 Sep 2014 13:37:08 +0000 (15:37 +0200)]
qemu: time: Report errors if agent command fails
Commit b606bbb4 broke reporting of errors when setting of guest time
fails via the guest agent as the return value is not checked and later
overwritten by the return value qemuMonitorRTCResetReinjection();
Fix this by checking the return value before resetting the RTC
reinjection.
Ján Tomko [Thu, 11 Sep 2014 15:11:28 +0000 (17:11 +0200)]
conf: add backend element to interfaces
For tuning the network, alternative devices
for creating tap and vhost devices can be specified via:
<backend tap='/dev/net/tun' vhost='/dev/net-vhost'/>
Ján Tomko [Wed, 10 Sep 2014 17:12:54 +0000 (19:12 +0200)]
conf: split out virtio net driver formatting
Instead of checking upfront if the <driver> element will be needed
in a big condition, just format all the attributes into a string
and output the <driver> element if the string is not empty.
Erik Skultety [Tue, 16 Sep 2014 08:06:50 +0000 (10:06 +0200)]
network: check negative values in bridge queues
We already are checking for negative value, reporting an error, but
using wrong function and the check only succeeds when a value that
cannot be converted to number successfully is encountered. This patch
provides just a minor change in call of the right version
of function virStrToLong.
Hongbin Lu [Tue, 16 Sep 2014 02:22:48 +0000 (22:22 -0400)]
openvz: fixed two memory leaks on migration code
The first one occurs in openvzDomainMigratePrepare3Params() where in
case no remote uri is given, the distant hostname is used. The name is
obtained via virGetHostname() which require callers to free the
returned value.
The second leak lies in openvzDomainMigratePerform3Params(). There's a
virCommand used later. However, at the beginning of the function
virCheckFlags() is called which returns. So the command created was
leaked.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Mon, 15 Sep 2014 13:31:40 +0000 (15:31 +0200)]
virprocess: Extend list of platforms for setns wrapper
Currently, the setns() wrapper is supported only for x86_64 and i686
which leaves us failing to build on other platforms like arm, aarch64
and so on. This means, that the wrapper needs to be extended to those
platforms and make to fail on runtime not compile time.
The syscall numbers for other platforms was fetched using this
command:
Peter Krempa [Mon, 15 Sep 2014 14:16:25 +0000 (16:16 +0200)]
util: storage: Fix qcow(2) header parser according to docs
The backing store string location offset 0 determines that the file
isn't present. The string size shouldn't be then checked:
from qemu.git/docs/specs/qcow2.txt
== Header ==
The first cluster of a qcow2 image contains the file header:
Byte 0 - 3: magic
QCOW magic string ("QFI\xfb")
4 - 7: version
Version number (valid values are 2 and 3)
8 - 15: backing_file_offset
Offset into the image file at which the backing file name
is stored (NB: The string is not null terminated). 0 if the
image doesn't have a backing file.
16 - 19: backing_file_size
Length of the backing file name in bytes. Must not be
longer than 1023 bytes. Undefined if the image doesn't have
a backing file. ^^^^^^^^^
This patch intentionally leaves the backing file string size check in
place in case a malformatted file would be presented to libvirt. Also
according to the docs the string size is maximum 1023 bytes, thus this
patch adds a check to verify that.
I was also able to verify that the check was done the same way in the
legacy qcow fromat (in qemu's code).