That sets a new flag, but that flag does mean the child will get
LISTEN_FDS and LISTEN_PID environment variables properly set and
passed FDs reordered so that it corresponds with LISTEN_FDS (they must
start right after STDERR_FILENO).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
util: abstract parsing of passed FDs into virGetListenFDs()
Since not only systemd can do this (we'll be doing it as well few
patches later), change 'systemd' to 'caller' and fix LISTEN_FDS to
LISTEN_PID where applicable.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Peter Krempa [Thu, 21 Aug 2014 09:06:37 +0000 (11:06 +0200)]
conf: net: Correctly switch how to format address fields
When formatting the forward mode addresses or interfaces the switch was
done based on the type of the network rather than of the type of the
individual <interface>/<address> element. In case a user would specify
an incorrect network type ("passhtrough") with <address> elements,
libvirtd would crash as it would attempt to format an <interface>.
Use the type of the individual element to format the XML.
Using 'virsh attach-device --config' (or --persistent) to attach a
file backed lun device will succeed; however, subsequent domain restarts
will result in failure because the configuration of a file backed lun
is not supported.
Although allowing 'illegal configurations' is something that can be
allowed, it may not be practical in this case. Generally, when attaching
a device to a domain means the domain must be running. A way around
this is using the --config (or --persistent) option. When an attach
is done to a running domain, a temporary configuration is modified
first followed by the live update. The live update will make a number
of disk validity checks when building the qemu command to attach the
disk. If any fail, then change is rejected.
Rather than allow a potentially illegal combination, adjust the code
in the configuration path to make the same checks as the running path
will make with respect to disk validity checks. This way we avoid
having the potential for some subsequent start/reboot to fail because
an illegal combination was allowed.
NB: The live path still checks the configuration since it is possible
to just do --live guest modification...
Michal Privoznik [Wed, 20 Aug 2014 16:17:07 +0000 (18:17 +0200)]
hvsupport: Adapt to vbox driver rewrite
Since vbox driver rewrite the virDriver structure init moved from
vbox_tmpl.c into vbox_common.c. However, our hvsupport.pl script
doesn't count with that. It still parses vbox_tmp.c and looks for
virDriver structure which is not found there anymore. As a result,
at hvsupport page is seems like vbox driver doesn't support
anything.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Wed, 20 Aug 2014 13:59:25 +0000 (15:59 +0200)]
nodeCapsInitNUMA: Avoid @cpumap leak
In case the host has 2 or more NUMA nodes, we fetch CPU map for each
node. However, we need to free the CPU map in between loops:
==29513== 96 (72 direct, 24 indirect) bytes in 3 blocks are definitely lost in loss record 951 of 1,264
==29513== at 0x4C2A700: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==29513== by 0x52AD24B: virAlloc (viralloc.c:144)
==29513== by 0x52AF0E6: virBitmapNew (virbitmap.c:78)
==29513== by 0x52FB720: virNumaGetNodeCPUs (virnuma.c:294)
==29513== by 0x53C700B: nodeCapsInitNUMA (nodeinfo.c:1886)
==29513== by 0x11759708: vboxCapsInit (vbox_common.c:398)
==29513== by 0x11759CC4: vboxConnectOpen (vbox_common.c:514)
==29513== by 0x53C965F: do_open (libvirt.c:1147)
==29513== by 0x53C9EBC: virConnectOpen (libvirt.c:1317)
==29513== by 0x142905: remoteDispatchConnectOpen (remote.c:1215)
==29513== by 0x126ADF: remoteDispatchConnectOpenHelper (remote_dispatch.h:2346)
==29513== by 0x5453D21: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Mon, 11 Aug 2014 15:28:34 +0000 (17:28 +0200)]
examples: test: Kill unsupported maxMemory element
The "maxMemory" element was never supported by libvirt. Remove it from
the test XMLs. (Found while actually trying to add support for a
identically named element).
An advice appeared there on the qemu-devel list [1]. When a domain is
suspended and then resumed guest kernel is not aware of this. So we've
introduced virDomainSetTime API that resets the time within guest
using qemu-ga. On the other hand, qemu itself is trying to make RTC
beat faster to catch the difference. But if we don't tell qemu that
guest's time was reset via the other method, both mechanisms are
applied resulting in again wrong guest time. In order to avoid summing
both corrections we need to tell qemu that it should not use the RTC
injection if the guest time is set via guest agent.
Peter Krempa [Wed, 20 Aug 2014 08:49:49 +0000 (10:49 +0200)]
qemu: blkiotune: Avoid accessing non-existing disk configuration
When a user would try changing the persistent IO tuning settings for a
disk that was hotplugged to a vm in a transient way, the
qemuDomainSetBlockIoTune API would use the same index for both the
live and config disk array. The disk was missing from the config array
though causing a crash of libvirtd.
To fix the issue, determine the indexes separately.
When starting up the domain the domain's NICs are allocated. As of 1f24f682 (v1.0.6) we are able to use multiqueue feature on virtio
NICs. It breaks network processing into multiple queues which can be
processed in parallel by different host CPUs. The queues are, however,
created by opening /dev/net/tun several times. Unfortunately, only the
first FD in the row is labelled so when turning the multiqueue feature
on in the guest, qemu will get AVC denial. Make sure we label all the
FDs needed.
Moreover, the default label of /dev/net/tun doesn't allow
attaching a queue:
And as suggested by SELinux maintainers, the tun FD should be labeled
as svirt_t. Therefore, we don't need to adjust any range (as done
previously by Guannan in ae368ebf) rather set the seclabel of the
domain directly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Thu, 7 Aug 2014 13:20:40 +0000 (15:20 +0200)]
qemu: conf: Split up qemuRemoveSharedDevice into per-device-type functions
Removing a shared device needs special steps for disks and hostdevs.
Instead of having one function dealing this split the code into two
separate functions that can be used with better granularity.
Peter Krempa [Thu, 7 Aug 2014 13:20:40 +0000 (15:20 +0200)]
qemu: conf: Split up qemuAddSharedDevice into per-device-type functions
Adding a shared device needs special steps for disks and hostdevs.
Instead of having one function dealing this split the code into two
separate functions that can be used with better granularity.
Peter Krempa [Thu, 7 Aug 2014 12:56:30 +0000 (14:56 +0200)]
qemu: conf: rename qemuCheckSharedDevice to qemuCheckSharedDisk
The qemuCheckSharedDevice function is operating only on disk devices.
Rename it and change the arguments to reflect that and refactor some
logic for more readability.
Peter Krempa [Thu, 7 Aug 2014 15:55:12 +0000 (17:55 +0200)]
qemu: shared: Split out shared device list remove code
Split it out into a separate function and simplify the code. There's no
need to copy the entry to update it as the hash returns pointer to the
existing item.
Also remove the now unused qemuSharedDeviceEntryCopy function.
Peter Krempa [Wed, 6 Aug 2014 11:57:42 +0000 (13:57 +0200)]
qemu: shared: Split out insertion code to the shared device list
To allow reuse split the code into a separate function and refactor it.
To update an existing entry there's no need to copy it first, just
update it inplace.
Peter Krempa [Tue, 5 Aug 2014 12:53:01 +0000 (14:53 +0200)]
qemu: hotplug: Add helper to initialize/teardown new disks for VMs
When we are changing media (or doing other hotplug operations) we need
to setup cgroups, locking and seclabels on the new disk. This is a
multi-step process where every piece can fail. To simplify dealing with
this introduce qemuDomainPrepareDisk that similarly to
qemuDomainPrepareDiskChainElement initializes/tears down a whole new
disk to be used with the domain.
Additionally the function supports passing a different source struct for
media changes of cdroms that will be refactored later.
- Make virBhyveProcessBuildBhyveCmd and
virBhyveProcessBuildLoadCmd take virConnectPtr as the
first argument instead of bhyveConnPtr as virConnectPtr is
needed for virStorageTranslateDiskSourcePool,
- Add virStorageTranslateDiskSourcePool call to
virBhyveProcessBuildBhyveCmd and
virBhyveProcessBuildLoadCmd,
- Allow disks of type VIR_STORAGE_TYPE_VOLUME
storage: make disk source pool translation generic
Currently, qemu driver uses qemuTranslateDiskSourcePool()
to translate disk volume information. This function is
general enough and could be used for other drivers as well,
so move it to conf/domain_conf.c along with its helpers.
- qemuTranslateDiskSourcePool: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePool,
- qemuAddISCSIPoolSourceHost: move to storage/storage_driver.c
and rename to virStorageAddISCSIPoolSourceHost,
- qemuTranslateDiskSourcePoolAuth: move to storage/storage_driver.c
and rename to virStorageTranslateDiskSourcePoolAuth,
- Update users of qemuTranslateDiskSourcePool to use a
new name.
qemu: allow device block I/O tuning in session mode
In commit 45ad1adb I added a nicer message for tunings that need
cgroups when unavailable (unprivileged), but I added this check for
I/O tuning of block devices, which doesn't need cgroups, because it is
done by QEMU, so let's fix that.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Chunyan Liu [Fri, 8 Aug 2014 08:44:36 +0000 (16:44 +0800)]
cmdMigrate: move vshConnect before vshWatchJob
A possible fix to issue:
http://www.redhat.com/archives/libvir-list/2014-August/thread.html#00227
While doing migration on KVM host, found problem sometimes:
VM is already running on the target host and disappears from source
host, but 'virsh migrate' command line hangs, cannot exit normally.
If pressing "ENTER" key, it will exit.
The code hangs at tools/virsh-domain.c: cmdMigrate
->vshWatchJob->poll():
poll() is trying to select pipe_fd, which is used to receive message
from doMigrate thread. In debugging, found that doMigrate finishes
and at the end it does call safewrite() to write the retval ('0' or
'1') to pipe_fd, and the write is completed. But cmdMigrate poll()
cannot get the event. If pressing "ENTER" key, poll() can get the
event and select pipe_fd, then command line can exit.
In current code, authentication thread which is called by vshConnect
will use stdin, and at the same time, in cmdMigrate main process,
poll() is listening to stdin, that probably affect poll() to get
pipe_fd event. Better to move authentication before vshWatchJob. With
this change, above problem does not exist.
Jim Fehlig [Sat, 16 Aug 2014 02:52:15 +0000 (20:52 -0600)]
src/xenconfig: move common parsing/formatting to xen_common
XM and XL config are very similar. Disks are specified differently
in XL, but the old XM disk config is still supported by XL. XL also
supports new config like spice that was never supported by XM.
This patch moves all the common parsing and formatting functions to
the new file xen_common.c and adapts the XM parser/formatter accordingly.
This restructuring paves way for introducing an XL parser/formatter in
the future.
While moving the code, fixup whitespace, comments, and style issues.
Peter Krempa [Fri, 15 Aug 2014 14:41:47 +0000 (16:41 +0200)]
qemu: process: Pin on per-vcpu basis instead of per-vcpupin element
Pin existing vcpus rather than existing vcpu pinning infos. This
increases the complexity of the lookup, but avoids pinning cpus that are
not enabled actually.
Peter Krempa [Fri, 15 Aug 2014 14:28:58 +0000 (16:28 +0200)]
qemu: cpu: unplug: Remove vcpu pinning on cold cpu unplug
Remove the pinning info when removing to CPU, otherwise when the VM will
be started our code will try to pin non-existing vcpus as the definition
wasn't updated.
Peter Krempa [Thu, 14 Aug 2014 12:38:06 +0000 (14:38 +0200)]
conf: Refactor virDomainVcpuPinDefParseXML
Tidy up control flow, change boolean argument to use 'bool', improve
error message in case the function is used to parse emulator pinning
info and avoid a few temp variables that made no sense.
Also when the function is called to parse emulator pinning info, there's
no need to check the processor ID in that case.
Peter Krempa [Thu, 14 Aug 2014 12:20:37 +0000 (14:20 +0200)]
conf: cpupin: Remove useless checking of vcpupin element count
The check doesn't make much sense as right below it the entries are
either checked for duplicity or ignored in some cases. Having this check
doesn't actually forbid passing invalid values.
Erik Skultety [Fri, 15 Aug 2014 11:50:59 +0000 (13:50 +0200)]
qemu: Redundant listen address entry in quest xml
When editing guest's XML (on QEMU), it was possible to add multiple
listen elements into graphics parent element. However QEMU does not
support listening on multiple addresses. Configuration is tested for
multiple 'listen address' and if positive, an error is raised.
Michal Privoznik [Fri, 15 Aug 2014 10:26:09 +0000 (12:26 +0200)]
daemon: Fix driver registration ordering
There are some stateless drivers which implement subdrivers
(typically vbox and its own network and storage subdrivers). However,
as of ba5f3c7c8ecc10 the vbox driver lives in the daemon, not the
client library. This means, in order for vbox (or any stateless domain
driver) to use its subdrivers, it must register before the general
drivers. Later, when the virConnectOpen function goes through the
subdrivers, stateless drivers are searched first. If the connection
request is aiming at stateless driver, it will be opened. Otherwise
the generic subdriver is opened.
The other change done in this commit is moving interface module load a
bit earlier to match the ordering in case libvirt is built without
driver modules.
Reported-by: Taowei Luo <uaedante@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Taowei [Mon, 11 Aug 2014 10:07:07 +0000 (18:07 +0800)]
vbox: Rewrite vboxNode functions
Four functions are rewrite in this patch, that is:
vboxNodeGetInfo
vboxNodeGetCellsFreeMemory
vboxNodeGetFreeMemory
vboxNodeGetFreePages
Since these functions has nothing to do with vbox,
it can be directly moved to vbox_common.c. So, I
merged these things into one patch.
Taowei [Mon, 11 Aug 2014 10:06:53 +0000 (18:06 +0800)]
vbox: Rewrite vboxDomainSnapshotCreateXML
The vboxDomainSnapshotCreateXML integrated the snapshot redefine
with this patch:
http://www.redhat.com/archives/libvir-list/2014-May/msg00589.html
This patch introduced vboxSnapshotRedefine in vboxUniformedAPI to
enable the features.
This patch replace all version specified APIs to the uniformed api,
then, moving the whole implementation to vbox_common.c. As there
is only API level changes, the behavior of the function doesn't
change.
Some old version's defects has brought to the new one. The already
known things are:
*goto cleanup in a loop without releasing the pointers in the
loop.
*When function failed after machine unregister, no roll back
to recovery it and the virtual machine would disappear.