Jim Fehlig [Fri, 6 Oct 2017 20:20:36 +0000 (14:20 -0600)]
apparmor: add dnsmasq ptrace rule to libvirtd profile
Commit b482925c added ptrace rule for the apparmor profiles,
but one was missed in the libvirtd profile for dnsmasq. It was
overlooked since the test machine did not have an active libvirt
network requiring dnsmasq that was also set to autostart. With
one active and set to autostart, the following denial is observed
in audit.log when restarting libvirtd
With an active network, I suspect a libvirtd restart causes access
to /proc/<dnsmasq-pid>/*, hence the resulting denial. As a nasty
side affect of the denial, libvirtd thinks it needs to spawn a
dnsmasq process even though one is already running for the network.
E.g. after two libvirtd restarts
Wim ten Have [Fri, 8 Sep 2017 14:47:14 +0000 (16:47 +0200)]
numa: rename function virDomainNumaDefCPUFormat
Rename virDomainNumaDefCPUFormat to virDomainNumaDefCPUFormatXML,
matching its peer virDomainNumaDefCPUParseXML and the general
vir*{Format,Parse}XML conventions.
Signed-off-by: Wim ten Have <wim.ten.have@oracle.com> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
Wim ten Have [Fri, 6 Oct 2017 12:16:07 +0000 (14:16 +0200)]
build: isolate core libvirt libs deps from xen runtime
Generating libvirt packages per make rpm, "with-libxl=1" and "with-xen=1",
adds strict runtime dependencies per libxenlight for xen-libs package from
core libvirt-libs package. This is not necessary and unfortunate since
those dependencies set demand to "xen-libs" package even when there's no
need for libvirt xen or libxl driver components.
This patch is to have two separate xenconfig lib tool libraries: one for
core libvirt (without XL), and a another that contains xl for libxl driver
(libvirt_driver_libxl_impl.la) which when loading the driver, loads the
remaining symbols (xen{Format,Parse}XL. For the user/sysadmin, this means
the xen dependencies are moved into libxl driver, instead of core libvirt.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com> Signed-off-by: Wim ten Have <wim.ten.have@oracle.com> Reviewed-by: Jim Fehlig <jfehlig@suse.com>
John Ferlan [Tue, 19 Sep 2017 12:55:43 +0000 (08:55 -0400)]
conf: Fix prototype/definition for virStoragePoolObj get functions
Modify virStoragePoolObjGetAutostartLink and
virStoragePoolObjGetConfigFile to return "const char *"
since that's how both are used and to ensure no one
tries to VIR_FREE the result.
To avoid any issues later on if paths ever change (unlikely but
possible) and to match the style of other generated rules the paths
of the static rules have to be quoted as well.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
libvirt allows spaces in vm names, there were issues in the past but it
seems not removed so the assumption has to be that spaces are continuing
to be allowed.
Therefore virt-aa-helper should not reject spaces in vm names anymore if
it is going to be refused causing issues then the parser or xml schema
should do so.
Apparmor rules are in quotes, so a space in a path based on the name works.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virt-aa-helper: fix libusb access to udev usb data
libusb as used by qemu needs to read data from /run/udev/data/ about usb
devices. That is read once on the first initialization of libusb_init by
qemu.
Therefore generating just the device we need would not be sufficient as
another hotplug later can need another device which would fail as the
data is no more re-read at this point.
But we can restrict the paths very much to just the major number of
potential usb devices which will make it match approximately the detail
that e.g. an lsusb -v would reveal - that is much safer than the
"/run/udev/data/* r" blanket many users are using now as a workaround.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
If users only specified vendor&product (the common case) then parsing
the xml via virDomainHostdevSubsysUSBDefParseXML would only set these.
Bus and Device would much later be added when the devices are prepared
to be added.
Due to that a hot-add of a usb hostdev works as the device is prepared
and virt-aa-helper processes the new internal xml. But on an initial
guest start at the time virt-aa-helper renders the apparmor rules the
bus/device id's are not set yet:
p ctl->def->hostdevs[0]->source.subsys.u.usb
$12 = {autoAddress = false, bus = 0, device = 0, vendor = 1921, product
= 21888}
That causes rules to be wrong:
"/dev/bus/usb/000/000" rw,
The fix calls virHostdevFindUSBDevice after reading the XML from
virt-aa-helper to only add apparmor rules for devices that could be found
and now are fully known to be able to write the rule correctly.
It uncondtionally sets virHostdevFindUSBDevice mandatory attribute as
adding an apparmor rule for a device not found makes no sense no matter
what startup policy it has set.
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Since domain can have at most one watchdog it simplifies things a
bit. However, since we must be able to set the watchdog action as
well, new monitor command needs to be used.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Michal Privoznik [Wed, 27 Sep 2017 11:45:07 +0000 (13:45 +0200)]
qemuDomainDeviceDefValidate: Validate watchdog
Currently we don't do it. Therefore we accept senseless
combinations of models and buses they are attached to.
Moreover, diag288 watchdog is exclusive to s390(x).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
It's possible to define and start a pool with a '.' in the
name; however, when trying to add a volume to a domain using
the storage pool source with a '.' in the storage pool name,
the domain RNG validation fails because RNG uses 'genericName'
which does not allow a '.' in the name.
Domain XML def parsing has a virXMLValidateAgainstSchema which
generates the error. The Storage Pool XML def parsing has no
call to virXMLValidateAgainstSchema. The only Storage Pool name
validation occurs in virStoragePoolDefParseXML to ensure the
name doesn't have a '/' in it and in storagePoolDefineXML to
call virXMLCheckIllegalChars using the same parameter "\n" as
qemuDomainDefineXMLFlags would check after the RNG check
could be succesful.
In order to resolve this, create a poolName definition in
storagecommon.rng that will mimic the domain name regex that
disallows a newline character, but add the "/" in the exclude
list. Then modify the pool and volume source name definitions
to key off that poolName.
Peter Krempa [Mon, 28 Aug 2017 13:36:05 +0000 (15:36 +0200)]
qemu: blockjob: Always save config XML when a blockjob is finished
For VMs with persistent config the config may change upon successful
completion of a job. Save it always if a persistent VM finishes a
blockjob. This will simplify further additions.
Peter Krempa [Mon, 28 Aug 2017 13:21:06 +0000 (15:21 +0200)]
qemu: blockjob: Always save status XML after block event
The status XML would be saved only for the copy job (in case of success)
or on failure even for other jobs. As the status contains the backing
chain data, which change after success we should always save it on
block job completion.
Peter Krempa [Tue, 3 Oct 2017 10:38:19 +0000 (12:38 +0200)]
qemu: process: move disk presence checking to host setup function
Checking of disk presence accesses storage on the host so it should be
done from the host setup function. Move the code to new function called
qemuProcessPrepareHostStorage and remove qemuDomainCheckDiskPresence.
Peter Krempa [Tue, 3 Oct 2017 08:14:21 +0000 (10:14 +0200)]
qemu: process: Pass flags to qemuProcessPrepareHost
Pass flags to the function rather than just whether we have incoming
migration. This also enforces correct startup policy for USB devices
when reverting from a snapshot.
Peter Krempa [Tue, 3 Oct 2017 07:59:03 +0000 (09:59 +0200)]
qemu: migration: Extract flags for starting VM into a variable
qemuMigrationPrepareAny called multiple of the functions starting the
qemu process for incoming migration by adding the flags explicitly.
Extract them to a variable so that they can be easily used for other
calls or changed in the future.
Interestingly enough, we don't document the point of view of the
interface statistics. Therefore it's unknown to users if for
instance rx_packets is the number of packets received by domain or
received by host (from domain). Document this explicitly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Similarly to previous patch, for some types of interface domain
and host are on the same side of RX/TX barrier. In that case, we
need to set up the QoS differently. Well, swapped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
The comment in virNetDevTapInterfaceStats() implementation for
Linux states that packets transmitted by domain are received by
the host and vice versa. Well, this is true but not for all types
of interfaces. For instance, for macvtaps when TAP device is
hooked right onto a physical device any packet that domain sends
looks also like a packet sent to the host. Therefore, we should
allow caller to chose if the stats returned should be straight
copy or swapped.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
qemuDomainInterfaceStats: Check for the actual type of interface
Users might have configured interface so that it's type of
network, but the corresponding network plugs interfaces into an
OVS bridge. Therefore, we have to check for the actual type of
the interface instead of the configured one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Jiri Denemark [Thu, 5 Oct 2017 07:06:03 +0000 (09:06 +0200)]
tests: Fix build with clang
clang doesn't like mode_t type as an argument to va_arg():
error: second argument to 'va_arg' is of promotable type 'mode_t' (aka
'unsigned short'); this va_arg has undefined behavior because arguments
will be promoted to 'int'
virDomainNetFindIdx: Ignore auto generated MAC addresses
When detaching an <interface/> from a domain, the MAC address is
parsed and if not present one is generated. If no corresponding
interface is found in the domain, the following error is
reported:
error: operation failed: no device matching mac address 52:54:00:75:32:5b found
where the MAC address is the auto generated one. This might be
very confusing. Solution to this is to ignore auto generated MAC
address when looking up the device.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
John Ferlan [Fri, 29 Sep 2017 13:21:47 +0000 (09:21 -0400)]
nwfilter: Fix memory leak and error path
Found by Coverity. If virNWFilterHashTablePut, then the 3rd arg @val
must be free'd since it would be leaked.
This also fixes potential problem on the error path where the caller
could assume the virNWFilterHashTablePut was successful when in fact
it failed leading to other issues.
John Ferlan [Fri, 29 Sep 2017 13:18:53 +0000 (09:18 -0400)]
nwfilter: Clean up virNWFilterDetermineMissingVarsRec returns
Rather than using loop break;'s in order to force a return
of rc = -1, let's just return -1 immediately on the various
error paths and then return 0 on the success path.
tests: Do not ignore mode parameter in mocked open()
This is normally not an issue since the tests which use mocked open() do
not create files. But once coverage build is enabled, gcov_open will use
O_CREATE and real_open will read random data rather than the actual mode
argument.
Peter Krempa [Tue, 9 May 2017 12:25:02 +0000 (14:25 +0200)]
conf: Split out parsing of network disk source XML elements
virDomainDiskSourceParse got to the point of being an ugly spaghetti
mess by adding more and more stuff into it. Split out parsing of network
disk information into a separate function so that it stays contained.
docs: Document the real behaviour of suspend-to-{mem,disk}
We get a question every now and then about why hibernation works when
suspend-to-disk is disabled and similar. Let's hope that, by documenting the
obvious more blatantly, people will get more informed.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
John Ferlan [Fri, 29 Sep 2017 19:55:29 +0000 (15:55 -0400)]
nwfilter: Don't have virNWFilterIPAddrMapAddIPAddr consume input
On pure success paths, virNWFilterIPAddrMapAddIPAddr was validly
consuming the input @addr; however, on failure paths it was possible
that virNWFilterVarValueCreateSimple succeed, but virNWFilterHashTablePut
failed resulting in virNWFilterVarValueFree being called to clean
up @val which also cleaned up the input @addr. Thus the caller had
no way to determine on failure whether it too should clean up the
passed parameter.
Instead, let's create a copy of the input @addr, then handle that
properly in the API allowing/forcing the caller to free it's own
copy of the input parameter.
The test suite has hardcoded /etc/pki/qemu as the cert dir, but this
only works if configure has --sysconfdir=/etc passed. We must set the
vxhs cert dir to a stable path in the test suite.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Update the qemuxml2argvtest with a couple of examples. One for a
simple case and the other a bit more complex where multiple VxHS disks
are added where at least one uses a VxHS that doesn't require TLS
credentials and thus sets the domain disk source attribute "tls = 'no'".
Update the hotplug to be able to handle processing the tlsAlias whether
it's to add the TLS object when hotplugging a disk or to remove the TLS
object when hot unplugging a disk. The hot plug/unplug code is largely
generic, but the addition code does make the VXHS specific checks only
because it needs to grab the correct config directory and generate the
object as the command line would do.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com> Signed-off-by: John Ferlan <jferlan@redhat.com>
John Ferlan [Wed, 30 Aug 2017 19:29:59 +0000 (15:29 -0400)]
qemu: Introduce qemuDomainPrepareDiskSource
Introduce a function to setup any TLS needs for a disk source.
If there's a configuration or other error setting up the disk source
for TLS, then cause the domain startup to fail.
For VxHS, follow the chardevTLS model where if the src->haveTLS hasn't
been configured, then take the system/global cfg->haveTLS setting for
the storage source *and* mark that we've done so via the tlsFromConfig
setting in storage source.
Next, if we are using TLS, then generate an alias into a virStorageSource
'tlsAlias' field that will be used to create the TLS object and added to
the disk object in order to link the two together for QEMU.
Additionally add a tlsFromConfig boolean to control whether the TLS
setting was due to domain configuration or qemu.conf global setting
in order to decide whether to Format the haveTLS setting for either
a live or saved domain configuration file.
Update the qemuxml2xmltest in order to add a test to show the proper
parsing.
Also update the docs to describe the tls attribute.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com> Signed-off-by: John Ferlan <jferlan@redhat.com>
Ashish Mittal [Wed, 30 Aug 2017 15:32:33 +0000 (11:32 -0400)]
conf: Introduce TLS options for VxHS block device clients
Add a new TLS X.509 certificate type - "vxhs". This will handle the
creation of a TLS certificate capability for properly configured
VxHS network block device clients.
The following describes the behavior of TLS for VxHS block device:
(1) Two new options have been added in /etc/libvirt/qemu.conf
to control TLS behavior with VxHS block devices
"vxhs_tls" and "vxhs_tls_x509_cert_dir".
(2) Setting "vxhs_tls=1" in /etc/libvirt/qemu.conf will enable
TLS for VxHS block devices.
(3) "vxhs_tls_x509_cert_dir" can be set to the full path where the
TLS CA certificate and the client certificate and keys are saved.
If this value is missing, the "default_tls_x509_cert_dir" will be
used instead. If the environment is not configured properly the
authentication to the VxHS server will fail.
Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com> Signed-off-by: John Ferlan <jferlan@redhat.com>
John Ferlan [Wed, 27 Sep 2017 14:06:50 +0000 (10:06 -0400)]
nwfilter: Fix possible segfault on sometimes consumed variable
The virNWFilterIPAddrMapAddIPAddr code can consume the @addr parameter
on success when the @ifname is found in the ipAddressMap->hashTable
hash table in the call to virNWFilterVarValueAddValue; however, if
not found in the hash table, then @addr is formatted into a @val
which is stored in the table and on return the caller would be
expected to free @addr.
Thus, the caller has no way to determine on success whether @addr was
consumed, so in order to fix this create a @tmp variable which will
be stored/consumed when virNWFilterVarValueAddValue succeeds. That way
the caller can free @addr whether the function returns success or failure.
When the packet is received we parse the "header", which as a side
effect updates msg->bufferOffset to point to the beginning of "payload".
If the message call contains FDs, we need to also parse the count of
FDs, which also updates the msg->bufferOffset.
The issue here is that when we attempt to read the FDs data from the
socket and we receive EAGAIN we finish the reading and call poll()
to wait for the data the we need. When the data arrives we already have
the packet in our buffer so we read the "header" again but this time
we don't read the count of FDs because we already have it stored.
That means that the msg->bufferOffset is not updated to point to the
actual beginning of the payload data, but it points to the count of
FDs. After all FDs are processed we dispatch the message to process
it and decode the payload. Since the msg->bufferOffset points to wrong
data, we decode the wrong payload and the API call fails with
error messages:
Domain not found: no domain with matching uuid '67656e65-7269-6300-0c87-5003ca6941f2' ()
Broken by commit 133c511b527 which fixed a FD and memory leak.
Peter Krempa [Wed, 23 Aug 2017 12:19:36 +0000 (14:19 +0200)]
qemu: domain: Extract common clearing of VM private data
VM private data is cleared when the VM is turned off and also when the
VM object is being freed. Some of the clearing code was duplicated.
Extract it to a separate function.
This also removes the now unnecessary function
qemuDomainClearPrivatePaths.
Ján Tomko [Mon, 25 Sep 2017 14:35:42 +0000 (16:35 +0200)]
virStorageFileResize: fallocate the whole capacity
We have been trying to implement the ALLOCATE flag to mean
"the volume should be fully allocated after the resize".
Since commit b0579ed9 we do not allocate from the existing
capacity, but from the existing allocation value.
However this value is a total of all the allocated bytes,
not an offset.
Treating allocation as an offset would result in an incompletely
allocated file:
$ virsh vol-resize vol1 --pool default 16384 --allocate
Capacity: 16.00 KiB
Allocation: 12.00 KiB
Call fallocate from zero on the whole requested capacity to fully
allocate the file. After that, the volume is fully allocated
after the resize:
$ virsh vol-resize vol1 --pool default 16384 --allocate
$ virsh vol-info vol1 default
Capacity: 16.00 KiB
Allocation: 16.00 KiB
Ján Tomko [Mon, 25 Sep 2017 14:29:34 +0000 (16:29 +0200)]
use virFileAllocate in virStorageFileResize
Introduce a new function virFileAllocate that will call the
non-destructive variants of safezero, essentially reverting
my commit 1390c268
safezero: fall back to writing zeroes even when resizing
back to the state as of commit 18f0316
virstoragefile: Have virStorageFileResize use safezero
This means that _ALLOCATE flag will no longer work on platforms
without the allocate syscalls, but it will not overwrite data
either.
For the virsh pool-{define|create}-as command, let's allow using
--secret-uuid on the command line as an alternative to --secret-usage
(added for commit id '8932580'), but ensure that they are mutually
exclusive.
Peter Krempa [Mon, 25 Sep 2017 14:16:08 +0000 (16:16 +0200)]
qemu: process: Refresh data from qemu monitor after migration
Some values we read from the qemu monitor may be changed with the actual
state by the incoming migration. This means that we should refresh
certain things only after the migration has finished.
This is mostly visible in the cdrom tray state, which is by default
closed but may be opened by the guest OS. This would be refreshed before
qemu transferred the actual state and thus libvirt would think that the
tray is closed.
Note that this patch moves only a few obvious query commands. Others may
be moved later after individual assessment.
Peter Krempa [Mon, 25 Sep 2017 20:34:44 +0000 (22:34 +0200)]
qemu: hotplug: Ignore cgroup errors when hot-unplugging vcpus
When the vcpu is successfully removed libvirt would remove the cgroup.
In cases when removal of the cgroup fails libvirt would report an error.
This does not make much sense, since the vcpu was removed and we can't
really do anything with the cgroup. This patch silences the errors from
cgroup removal.
Ján Tomko [Tue, 26 Sep 2017 11:30:10 +0000 (13:30 +0200)]
conf: fix formatting of udp chardev attributes
It is possible (although possibly not very useful) to leave out
the service attribute when using <source mode='bind'/>
Fix the formatter bug introduced by commit 4a0da34 and format
the host when its present (checked for non-NULL inside
virBufferEscapeString) instead of basing it on the presence
of the service attribute.
Peter Krempa [Mon, 25 Sep 2017 09:44:00 +0000 (11:44 +0200)]
qemu: block: Use correct alias when extracting disk node names
The alias recorded in disk->info.alias is the alias for the frontend
device but we are interested in the backend drive. This messed up the
disk node name extraction code as qemu reports the drive alias in the
block query commands. This was broken in the node name detector
refactoring done in commit 0175dc6ea024d
Print hex values with '0x' prefix and octal with '0' in debug messages
Seeing a log message saying 'flags=93' is ambiguous & confusing unless
you happen to know that libvirt always prints flags as hex. Change our
debug messages so that they always add a '0x' prefix when printing flags,
and '0' prefix when printing mode. A few other misc places gain a '0x'
prefix in error messages too.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ján Tomko [Wed, 20 Sep 2017 13:23:47 +0000 (15:23 +0200)]
news: remove kernel version reference from switchdev entry
The functionality was added in 4.8, but due to a rename of
the DEVLINK_CMD_ESWITCH_GET constant in the kernel headers,
the headers from kernel 4.11 are required by the libvirt code.
Remove the reference from the news entry, since it could be
misleading.
Peter Krempa [Wed, 20 Sep 2017 08:45:23 +0000 (10:45 +0200)]
qemu: capabilities: Remove support for downstream-only QMP monitor backport
Some distros (see diff) chose to backport QMP support rather than rebase
to newer version of qemu. As a hack they added the string 'libvirt' to
the qemu -help output. Remove this as downstream-only hacks should be
carried by downstream and not litter upstream.