Jiri Denemark [Wed, 21 Oct 2015 08:55:43 +0000 (10:55 +0200)]
qemu: Always set async job when starting a domain
We only started an async job for incoming migration from another host.
When we were starting a domain from scratch or restoring from a saved
state (migration from file) we didn't set any async job. Let's introduce
a new QEMU_ASYNC_JOB_START for these cases.
Jiri Denemark [Fri, 6 Nov 2015 17:41:37 +0000 (18:41 +0100)]
qemu: Introduce qemuProcessIncomingDef
Incoming migration may require quite a few parameters (URI, fd, path) to
be considered while starting QEMU and we will soon add another one.
Let's group all of them in a single struct.
Joao Martins [Fri, 13 Nov 2015 13:14:44 +0000 (13:14 +0000)]
util: add virDiskNameParse to handle disk and partition idx
Introduce a new helper function "virDiskNameParse" which extends
virDiskNameToIndex but handling both disk index and partition index.
Also rework virDiskNameToIndex to be based on virDiskNameParse.
A test is also added for this function testing both valid and
invalid disk names.
Joao Martins [Fri, 13 Nov 2015 13:14:42 +0000 (13:14 +0000)]
libxl: implement virDomainMemorystats
Introduce support for domainMemoryStats API call, which
consequently enables the use of `virsh dommemstat` command to
query for memory statistics of a domain. We support
the following statistics: balloon info, available and currently
in use. swap-in, swap-out, major-faults, minor-faults require
cooperation of the guest and thus currently not supported.
We build on the data returned from libxl_domain_info and deliver
it in the virDomainMemoryStat format.
The LXC driver uses virSetUIDGID() to become UID/GID 0.
It passes an empty groups list to virSetUIDGID()
to get rid of all supplementary groups from the host side.
But virSetUIDGID() calls setgroups() only if the supplied list
is larger than 0.
This leads to a container root with unrelated supplementary groups.
In most cases this issue is unoticed as libvirtd runs as UID/GID 0
without any supplementary groups.
Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of creating symlinks, bind mount the devices to
/dev/pts/XY.
Using bind mounts it is no longer needed to add pts devices
to files like /etc/securetty.
Signed-off-by: Richard Weinberger <richard@nod.at>
Userspace does not expect that the initial console
is a controlling TTY. systemd can deal with that, others not.
On sysv init distros getty will fail to spawn a controlling on
/dev/console or /dev/tty1. Which will cause to whole container
to reboot upon ctrl-c.
This patch changes the behavior of libvirt to match the kernel
behavior where the initial TTY is also not controlling.
The only user visible change should be that a container with
bash as PID 1 would complain. But this matches exactly the kernel
be behavior with init=/bin/bash.
To get a controlling TTY for bash just run "setsid /bin/bash".
Signed-off-by: Richard Weinberger <richard@nod.at>
So, if domain loses access to storage, sanlock tries to kill it
after some timeout. So far, the default is 80 seconds. But for
some scenarios this might not be enough. We should allow users to
adjust the timeout according to their needs.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Wed, 7 Oct 2015 11:52:45 +0000 (13:52 +0200)]
conf: Prepare making memory device target node optional
Adjust the config code so that it does not enforce that target memory
node is specified. To avoid breakage, adjust the qemu memory hotplug
config checker to disallow such config for now.
Peter Krempa [Wed, 7 Oct 2015 12:17:43 +0000 (14:17 +0200)]
qemu: command: Make qemuBuildMemoryBackendStr usable without NUMA
Make the function usable so that -1 can be passed to it as cell ID so
that we can later enable memory hotplug on non-NUMA guests for certain
architectures.
Guido Günther [Tue, 17 Nov 2015 07:39:46 +0000 (08:39 +0100)]
libvirt-guests: Disable shutdown timeout
Since we can't know at service start how many VMs will be running we
can't calculate an apropriate shutdown timeout. So instead of killing
off the service just let it use it's own internal timeout mechanism.
Adapt the sysfs TPM command cancel path for the TPM driver that
does not use a miscdevice anymore since Linux 4.0. Support old
and new paths and check their availability.
Add a mockup for the test cases to avoid the testing for
availability of the cancel path.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Joao Martins [Fri, 13 Nov 2015 13:14:41 +0000 (13:14 +0000)]
libxl: implement virDomainGetCPUStats
Introduce support for domainGetCPUStats API call and consequently
allow us to use `virsh cpu-stats`. The latter returns a more brief
output than the one provided by`virsh vcpuinfo`.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Jim Fehlig <jfehlig@suse.com>
The virStoragePoolFCRefreshThread had passed a pointer to the pool obj
in the virStoragePoolFCRefreshInfoPtr; however, we cannot assume that
the pool exists still since we don't keep the pool lock throughout
the duration of the thread.
Therefore, instead of passing the pool obj pointer, pass the UUID of
the pool and perform a lookup. If found, then we can perform the
refresh using the locked pool obj pointer; otherwise, we just exit
the thread since the pool is now gone.
Jiri Denemark [Tue, 20 Oct 2015 14:01:01 +0000 (16:01 +0200)]
tests: Remove qemuxmlnstest
It's just a copy&paste of qemuxml2argv test anyway. We can test most of
them (except for qemuxmlns-qemu-ns-domain.xml which fails to validate
against our schema) by qemuxml2argv test.
Pavel Hrdina [Wed, 11 Nov 2015 14:20:15 +0000 (15:20 +0100)]
domain-conf: reorder usb controllers so the master is first
USB controllers can share the same 'index' which indicates, that there
is some sort of master-companion relationship. Reorder the controllers
in XML in to place the master controller before its companions. This is
required by QEMU to not fail with error message:
error: internal error: process exited while connecting to monitor:
2015-10-26T16:25:17.630265Z qemu-system-x86_64:
-device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6:
USB bus 'usb.0' not found
Pavel Hrdina [Wed, 21 Oct 2015 10:59:41 +0000 (12:59 +0200)]
virsh-domain: update attach-interface to support type=hostdev
Adding this feature will allow users to easily attach a hostdev network
interface using PCI passthrough.
The interface can be attached using --type=hostdev and PCI address or
as --source. This command also allows you to tell, whether the interface
should be managed.
As of QEMU 0.9.1 the -drive argument can be used to configure
all disks, so the QEMU driver can assume it is always available
and drop support for -hda/-cdrom/etc.
Many of the tests need updating because a great many were
running without CAPS_DRIVE set, so using the -hda legacy
syntax.
Fixing the tests uncovered a bug in the argv -> xml
convertor which failed to handle disk with if=floppy.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: handle floppy disk bus when parsing command line argv
The QEMU argv -> virDomainDef conversion code was not handling
-drive arguments using the floppy bus. This caused them to be
added as hard disks instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.11.0 the 'info chardev' monitor command can be
used to report on allocated chardev paths, so we can drop
support for parsing QEMU stderr to locate the PTY paths.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
As of QEMU 0.9.0 the -vnc option accepts a ':' to separate port
from listen address, so the QEMU driver can assume that support
for listen addresses is always available.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Check the QEMU version and refuse to work with QEMU versions
older than 0.12.0. This is approximately the vintage of QEMU
that is available in RHEL-6 era distros.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Peter Krempa [Fri, 6 Nov 2015 15:39:31 +0000 (16:39 +0100)]
qemu: hotplug: Fix mlock limit handling on memory hotplug
If mlock is required either due to use of VFIO hostdevs or due to the
fact that it's enabled it needs to be tweaked prior to adding new memory
or after removing a module. Add a helper to determine when it's
necessary and reuse it both on hotplug and hotunplug.
tests: handle backspace-newline pairs in test input files
all the test argv files were line wrapped so that the args
were less than 80 characters.
The way the line wrapping was done turns out to be quite
undesirable, because it often leaves multiple parameters
on the same line. If we later need to add or remove
individual parameters, then it leaves us having to redo
line wrapping.
This commit changes the line wrapping so that every
single "-param value" is one its own new line. If the
"value" is still too long, then we break on ',' or ':'
or ' ' as needed.
This means that when we come to add / remove parameters
from the test files line, the patch diffs will only
ever show a single line added/removed which will greatly
simplify review work.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
John Ferlan [Wed, 4 Nov 2015 15:48:24 +0000 (10:48 -0500)]
virnetdev: Use virNetDevSetupControl in virNetDevSendEthtoolIoctl
Use virNetDevSetupControl instead of open coding using socket(AF_LOCAL...)
and clearing virIfreq.
By using virNetDevSetupControl, the socket is then opened using
AF_PACKET which requires being privileged (effectively root) in
order to complete successfully. Since that's now a requirement,
then the ioctl(SIOCETHTOOL) should not fail with EPERM, thus it
is removed from the filtered listed of failure codes.
John Ferlan [Wed, 4 Nov 2015 15:26:16 +0000 (10:26 -0500)]
virnetdev: Check for root in virNetDevGetFeatures
Since the SIOCETHTOOL ioctl only works for privileged daemons, if called
when not root, then virNetDevGetFeatures will VIR_DEBUG a message and
return 0 as if the functions were not available for the architecture.
This effectively returns an empty bitmap indicating no features available.
John Ferlan [Fri, 6 Nov 2015 14:44:37 +0000 (09:44 -0500)]
virnetdev: Fix function comments for virNetDevGetFeatures
In commit id 'c9027d8f4' when updating the posted patch to generate
a bitmap instead of an array of named feature bits, adjustment of
the args was missed
John Ferlan [Fri, 6 Nov 2015 15:06:11 +0000 (10:06 -0500)]
virnetdev: Document reasons for ignoring some SIOCETHTOOL errno values
Recently reverted commit id '6f2a0198' showed a need to add extra
comments when dealing with filtering of potential "non-issues".
Scanning through upstream patch postings indicates early on the
reasons for the filtering of specific ioctl failures were provided;
however, when converted from causing an error to VIR_DEBUG's the
reasons were missing. A future read/change of the code incorrectly
assumed they could or should be removed.
This commit removed error reporting from virNetDevSendEthtoolIoctl
pushing responsibility onto the callers. This is wrong, however,
since virNetDevSendEthtoolIoctl calls virNetDevSetupControl
which can still report errors. So as a result virNetDevSendEthtoolIoctl
may or may not report errors depending on which bit of it fails, and as
a result callers now overwrite some errors.
It also introduced a regression causing unprivileged libvirtd to
spew error messages to the console due to inability to query the
NIC features, an error which was previously ignored.
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
virNetDevSetupControlFull:148 : Cannot open network interface control socket: Operation not permitted
virNetDevFeatureAvailable:3062 : Cannot get device wlp3s0 flags: Operation not permitted
Looking back at the original posting I see no explanation of why
thsi refactoring was needed, so reverting the clearly broken
error reporting logic looks like the best option.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Peter Krempa [Thu, 5 Nov 2015 14:23:37 +0000 (15:23 +0100)]
qemu: migration: Actually error out on unsupported migration flag
The code reported that a migration flag is unsupported but didn't jump
to the error label. Probably an oversight in commit f88af9dc that
introduced the flag checking.
John Ferlan [Wed, 4 Nov 2015 22:47:42 +0000 (17:47 -0500)]
network: Remove extraneous ATTRIBUTE_NONNULL for virNetDevWaitDadFinish
Commit id '0f7436ca' added virNetDevWaitDadFinish using ATTRIBUTE_NONNULL
for both arguments, although one is a non-null argument. A Coverity build
balks at that.
Commit id 'fdda3760' only managed a symptom where it was possible to
create a file in a pool without libvirt's knowledge, so it was reverted.
The real fix is to have all the createVol API's which actually create
a volume (disk, logical, zfs) and the buildVol API's which handle the
real creation of some volume file (fs, rbd, sheepdog) manage deleting
any volume which they create when there is some sort of error in
processing the volume.
This way the onus isn't left up to the storage_driver to determine whether
the buildVol failure was due to some failure as a result of adjustments
made to the volume after creation such as getting sizes, changing ownership,
changing volume protections, etc. or simple a failure in creation.
Without needing to consider that the volume has to be removed, the
buildVol failure path only needs to remove the volume from the pool.
This way if a creation failed due to duplicate name, libvirt wouldn't
remove a volume that it didn't create in the pool target.
This commit only manages a symptom of finding a buildRet failure
where a volume was not listed in the pool, but someone created the
volume outside of libvirt in the pool being managed by libvirt.
John Ferlan [Thu, 8 Oct 2015 20:00:41 +0000 (16:00 -0400)]
storage: Cleanup failures in virStorageBackendCreateRaw
After successfully returning from virFileOpenAs, if subsequent calls fail,
then we need to remove the file since our caller expects that failures after
creation will remove the created file.
After a successful qemu-img/qcow-create of the backing file, if we
fail to stat the file, change it owner/group, or mode, then the
cleanup path should remove the file.
John Ferlan [Wed, 14 Oct 2015 13:56:14 +0000 (09:56 -0400)]
storage: Fix setting mode in virStorageBackendCreateExecCommand
Currently the code does not handle the NFS root squash environment
properly since if the file gets created, then the subsequent chmod
will fail in a root squash environment where we're creating a file
in the pool with qemu tools, such as seen via:
assuming $file.xml is creating a file of "<format type='qcow2'"> from
an existing file.img in the pool of "<format type='raw'>".
This patch will utilize the virCommandSetUmask when creating the file
in the NETFS pool. The virCommandSetUmask API was added in commit id
'0e1a1a8c4', which was after the original code was developed in commit
id 'e1f27784' to attempt to handle the root squash environment.
Also, rather than blindly attempting to chmod, check to see if the
st_mode bits from the stat match what we're trying to set and only
make the chmod if they don't.
Also, a slight adjustment to the fallback algorithm to move the
virCommandSetUID/virCommandSetGID inside the if (!filecreated) since
they're only useful if we need to attempt to create the file again.
Jiri Denemark [Tue, 27 Oct 2015 18:14:01 +0000 (19:14 +0100)]
Remove new lines from log messages
VIR_DEBUG and VIR_WARN will automatically add a new line to the message,
having "\n" at the end or at the beginning of the message results in
empty lines.
Jiri Denemark [Tue, 20 Oct 2015 12:26:46 +0000 (14:26 +0200)]
qemu: Rename ret variable in qemuProcessStart
Generally, we use "ret" variable for storing the value we are going to
return at the and of a function, but this is not the case in
qemuProcessStart. Let's rename "ret" as "rv".
Jiri Denemark [Tue, 20 Oct 2015 08:11:26 +0000 (10:11 +0200)]
qemu: Use correct type when calling qemuPrepareNVRAM
qemuProcessStart was passing char * migrateFrom as the third argument to
qemuPrepareNVRAM. We should explicitly convert the pointer to bool which
is what the function expects.
Laine Stump [Thu, 29 Oct 2015 18:09:59 +0000 (14:09 -0400)]
util: set max wait for IPv6 DAD to 20 seconds
This was originally set to 5 seconds, but times of 5.5 to 7 seconds
were experienced. Since it's an arbitrary number intended to prevent
an infinite hang, having it a bit too high won't hurt anything, and 20
seconds looks to be adequate (i.e. I think/hope we don't need to make
it tunable in libvirtd.conf)
Michal Privoznik [Tue, 27 Oct 2015 13:02:19 +0000 (14:02 +0100)]
wireshark: Install to generic plugin directory
There has been a report on the list [1] that we are not
installing the wireshark dissector into the correct plugin
directory. And in fact we are not. The problem is, the plugin
directory path is constructed at compile time. However, it's
dependent on the wireshark version, e.g.
/usr/lib/wireshark/plugins/1.12.6
This is rather unfortunate, because if libvirt RPMs were built
with one version, but installed on a system with newer one, the
plugins are not really loaded. This problem lead fedora packagers
to unify plugin path to:
/usr/lib/wireshark/plugins/
Cool! But this was enabled just in wireshark-1.12.6-4. Therefore,
we must require at least that version.
And while at it, on some distributions, the wireshark.pc file
already has a variable that defines where plugin dir is. Use that
if possible.
Build on non-Linux fails because the virNetDevWaitDadFinish() stub
has unused parameters. Fix by adding appropriate ATTRIBUTE_UNUSED
for these parameters.
network: wait for DAD to finish for bridge IPv6 addresses
commit db488c79 assumed that dnsmasq would complete IPv6 DAD before
daemonizing, but in reality it doesn't wait, which creates problems
when libvirt's bridge driver sets the matching "dummy tap device" to
IFF_DOWN prior to DAD completing.
This patch waits for DAD completion by periodically polling the kernel
using netlink to check whether there are any IPv6 addresses assigned
to bridge which have a 'tentative' state (if there are any in this
state, then DAD hasn't yet finished). After DAD is finished, execution
continues. To avoid an endless hang in case something was wrong with
the kernel's DAD, we wait a maximum of 5 seconds.
Commit id '9deb96f' removed the code to fetch the nodeset from the
CpusetMems cgroup for a running vm in favor of using the return from
virDomainNumatuneFormatNodeset introduced by commit id '43b67f2e7'.
However, that API will return the value of the passed 'auto_nodeset'
when placement is VIR_DOMAIN_NUMATUNE_PLACEMENT_AUTO, which happens
to be NULL.
Since commit id 'c74d58ad' started using priv->autoNodeset in order
to manage the auto placement value during qemuProcessStart, it should
be passed along in order to return the correct value if the domain
requests the auto placement.
Pavel Hrdina [Wed, 21 Oct 2015 10:43:29 +0000 (12:43 +0200)]
virsh-domain: use correct base for virStrToLong_ui
While parsing device addresses we should use correct base and don't
count on auto-detect. For example, PCI address uses hex numbers, but
each number starting with 0 will be auto-detected as octal number and
that's wrong. Another wrong use-case is for PCI address if for example
bus is 10, than it's incorrectly parsed as decimal number.
PCI and CCW addresses have all values as hex numbers, IDE and SCSI
addresses are in decimal numbers.