John Ferlan [Thu, 22 Jan 2015 16:23:10 +0000 (11:23 -0500)]
storage: When delete extended partition, need to refresh pool
When removing a volume that is the extended partition, all the logical
volume partitions that exist within the extended partition will also be
removed, so we need to refresh the pool to have the updated list
John Ferlan [Thu, 22 Jan 2015 14:12:04 +0000 (09:12 -0500)]
storage: Adjust how to refresh extended partition disk data
During virStorageBackendDiskMakeDataVol processing, if we find an extended
partition, then handle it specially when updating the capacity/allocation
rather than calling virStorageBackendUpdateVolInfo.
As it turns out, once a logical partition exists, any attempt to refresh
the pool or after libvirtd restart/reload will result in a failure to open
the extended partition device resulting in the inability to start the pool.
The downside to this is we will lose the <permissions> and <timestamps> for
the extended partition upon subsequent restart, refresh, reload since the
stat() in virStorageBackendUpdateVolTargetInfoFD will not be called. However,
since it's really only a container and shouldn't directly be used for
storage that seems reasonable.
Therefore, only use the existing code that already had a comment about
getting the allocation wrong for extended partitions for just the setting
of the extended partition data.
John Ferlan [Thu, 22 Jan 2015 13:30:36 +0000 (08:30 -0500)]
storage: Fix check for partition type for disk backing volumes
While checking the existing partitions in virStorageBackendDiskPartFormat,
the code would erroneously compare the volume target format type (eg, the
virStoragePartedFsType) rather than the source partition type (eg, the
virStorageVolTypeDisk) which is set during virStorageBackendDiskReadPartitions.
John Ferlan [Wed, 21 Jan 2015 20:38:05 +0000 (15:38 -0500)]
storage: Attempt error recovery in virStorageBackendDiskCreateVol
During virStorageBackendDiskCreateVol if virStorageBackendDiskReadPartitions
fails, then we were leaving with an error and a partition on the disk for
which there was no corresponding volume and used space on the disk which
could be reclaimable through direct parted activity. On a subsequent restart,
reload, or refresh the volume may magically appear too.
John Ferlan [Wed, 21 Jan 2015 19:41:02 +0000 (14:41 -0500)]
storage: Move virStorageBackendDiskDeleteVol
Move the API to before virStorageBackendDiskCreateVol in order to be
able to call the DeleteVol API when virStorageBackendDiskReadPartitions
fails so that we don't by chance leave a partition on the disk.
Pavel Hrdina [Wed, 28 Jan 2015 17:40:14 +0000 (18:40 +0100)]
libvirt.spec: remove vbox storage and network .so files
Commit 55ea7be7 removed separated modules for vbox_network and
vbox_storage drivers but forget to update libvirt.spec.in file. This
patch will fix rpm build.
When we try to update a xml to a image file, we will clear the
graphics passwd settings, because we do not pass VIR_DOMAIN_XML_SECURE
to qemuDomainDefCopy, qemuDomainDefFormatBuf won't format the passwd.
Add VIR_DOMAIN_XML_SECURE flag when we call qemuDomainDefCopy
in qemuDomainSaveImageUpdateDef.
This way the device is in vmdef only if ret = 0 and the caller
(qemuDomainAttachDeviceFlags) does not free it.
Otherwise it might get double freed by qemuProcessStop
and qemuDomainAttachDeviceFlags if the domain crashed
in monitor after we've added it to vm->def.
Ján Tomko [Tue, 27 Jan 2015 17:30:15 +0000 (18:30 +0100)]
Split qemuDomainChrInsert into two parts
Do the allocation first, then add the actual device.
The second part should never fail. This is good
for live hotplug where we don't want to remove the device
on OOM after the monitor command succeeded.
The only change in behavior is that on failure, the
vmdef->consoles array is freed, not just the first console.
Add more logging to the lxc controller and container files to
facilitate debugging startup problems. Also make it clear when
the container is going to close stdout and thus no longer do
any logging.
lxc: delay setup of cgroup until we have the init pid
Don't create the cgroups ahead of launching the container since
there is no need for the limits to apply during initial bootstrap.
Create the cgroup after the container PID is known and tell
systemd the initpid is the leader, instead of the controller
pid.
Currently when launching the LXC controller we first write out
the plain, inactive XML configuration, then launch the controller,
then replace the file with the live status XML configuration.
By good fortune this hasn't caused any problems other than some
misleading error messages during failure scenarios.
This simplifies the code so it only writes out the XML once and
always writes the live status XML. To do this we need to handshake
with the child process, to make execution pause just before exec()
so we can write the XML status with the child PID present.
lxc: re-arrange startup synchronization sequence with controller
Currently the lxc controller process itself is responsible for
daemonizing itself into the background and writing out its pid
file. The lxc driver would fork the controller and then attempt
to connect to the lxc monitor. This connection would only
succeed after the controller has backgrounded itself, setup
cgroups and written its pid file, so startup was race free.
The problem is that we need to delay create of the cgroups to
much later, such that we can tell systemd the container init
pid when we create the cgroups. If we delay cgroup creation
though the current synchronization won't work.
A second problem is that the controller needs the XML config
of the guest. Currently we write out the plain virDomainDefPtr
XML before starting the controller, and then later replace it
with the full virDomainObjPtr status XML. This is kind of gross
and also means that the controller doesn't get a record of the
live XML config right away. This means it doesn't have a record
of the veth device names either and so can't give that info
to systemd when creating the cgroups.
To address this we change the startup sequencing. The goal
is that we want to get the PID as soon as possible, before
the LXC controller even starts. So we stop letting the LXC
controller daemonize itself, and instead use virCommand's
built-in capabilities. This daemonizes and writes the PID
before LXC controller is exec'd. So the driver can read
the PID as soon as virCommandRun returns. It is no longer
safe to connect to the monitor or detect the cgroups though.
Fortunately the LXC controller already has a second point
of synchronization. Immediately before its event loop
starts running, it performs a handshake with the driver.
So we move the opening of the monitor connection and cgroup
detection after this synchronization point.
Build the pidfile string once when starting a guest and then
use the same string thereafter. This will benefit following
patches which need the pidfile string in more situations.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
systemd: don't report an error if the guest is already terminated
In many cases where we invoke virSystemdTerminateMachine the
process(es) will have already gone away on their own accord.
In these cases we log an error message that the machine does
not exist. We should catch this particular error and simply
ignore it, so we don't pollute the logs.
Ján Tomko [Tue, 27 Jan 2015 12:36:10 +0000 (13:36 +0100)]
Fix shadowed variable warning
libvirtd.c: In function 'daemonSetupAccessManager':
libvirtd.c:730:18: error: declaration of 'driver' shadows
a global declaration [-Werror=shadow]
const char **driver = (const char **)config->access_drivers;
^
In file included from libvirtd.c:95:0:
../src/node_device/node_device_driver.h:43:36: error: shadowed
declaration is here [-Werror=shadow]
extern virNodeDeviceDriverStatePtr driver;
^
For stateless, client side drivers, it is never correct to
probe for secondary drivers. It is only ever appropriate to
use the secondary driver that is associated with the
hypervisor in question. As a result the ESX & HyperV drivers
have both been forced to do hacks where they register no-op
drivers for the ones they don't implement.
For stateful, server side drivers, we always just want to
use the same built-in shared driver. The exception is
virtualbox which is really a stateless driver and so wants
to use its own server side secondary drivers. To deal with
this virtualbox has to be built as 3 separate loadable
modules to allow registration to work in the right order.
This can all be simplified by introducing a new struct
recording the precise set of secondary drivers each
hypervisor driver wants
Instead of registering the hypervisor driver, we now
just register a virConnectDriver instead. This allows
us to remove all probing of secondary drivers. Once we
have chosen the primary driver, we immediately know the
correct secondary drivers to use.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
don't disable state driver when libvirtd is not built
A bunch of code is wrapped in #if WITH_LIBVIRTD in order to
enable the virStateDriver to be disabled when libvirtd is not
built. Disabling this code doesn't have any real functional
benefit beyond removing 1 pointer from the virConnectPtr struct,
while having a cost of many more conditionals.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michal Privoznik [Mon, 26 Jan 2015 16:09:36 +0000 (17:09 +0100)]
tests: Check for virQEMUDriverConfigNew return value
The function may return NULL if something went wrong. In some places
in the tests we are not checking the return value rather than
accessing the pointer directly resulting in SIGSEGV.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ján Tomko [Mon, 26 Jan 2015 09:20:22 +0000 (10:20 +0100)]
Fix a memory leak in virCgroupGetPercpuStats
Coverity reports that my commit af1c98e introduced
two memory leaks:
the cpumap if ncpus == 0 in virCgroupGetPercpuStats
and the params array in the test of the function.
The virDBusMethodCall method has a DBusError as one of its
parameters. If the caller wants to pass a non-NULL value
for this, it immediately makes the calling code require
DBus at build time. This has led to breakage of non-DBus
builds several times. It is desirable that only the virdbus.c
file should need WITH_DBUS conditionals, so we must ideally
remove the DBusError parameter from the method.
We can't simply raise a libvirt error, since the whole point
of this parameter is to give the callers a way to check if
the error is one they want to ignore, without having the logs
polluted with an error message. So, we add a virErrorPtr
parameter which the caller can then either ignore or raise
using the new virReportErrorObject method.
This new method is distinct from virSetError in that it
ensures the logging hooks are run.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Erik Skultety [Fri, 23 Jan 2015 12:17:43 +0000 (13:17 +0100)]
qemu: Add check for PCI bridge placement if there are too many PCI devices
Previous patch of this series fixed the issue with adding a new PCI bridge
when all the slots were reserved by devices with user specified addresses.
In case there are still some PCI devices waiting to get a slot reserved
by qemuAssignDevicePCISlots, this means a new bus needs to be
created along with a corresponding bridge controller. By adding an
additional check, this scenario now results in a reasonable error
instead of generating wrong qemu command line.
Erik Skultety [Fri, 23 Jan 2015 12:17:42 +0000 (13:17 +0100)]
qemu: Fix auto-adding PCI bridge when all slots are reserved
Commit 93c8ca tried to fix the issue with auto-adding of a PCI bridge
controller, but didn't work properly in all scenarios.
This patch provides a better fix of the issue when all slots on a PCI bus
are reserved by devices with user specified addresses and no additional
bridges need to be created.
Erik Skultety [Fri, 23 Jan 2015 12:17:41 +0000 (13:17 +0100)]
qemu: move PCI slot assignment for PIIX3, Q35 into a separate function
In order to be able to test for fully reserved PCI buses, assignment of
PCI slots for integrated devices needs to be moved to a separate function.
This also might be a good preparation if we decide to add support for
other chipsets as well.
Erik Skultety [Fri, 23 Jan 2015 12:17:40 +0000 (13:17 +0100)]
qemu: reorder PCI slot assignment functions
Move qemuDomainAssignPCIAddresses after the definition
of the static function qemuDomainValidateDevicePCISlotsQ35.
This lets us define a new static function using
qemuDomainValidateDevicePCISlots* and use it in
qemuDomainAssignPCIAddresses without a forward declaration.
Mike Latimer [Tue, 20 Jan 2015 01:25:42 +0000 (18:25 -0700)]
Fix apparmor issues for tck
The network and nwfilter tests contained in the libvirt-TCK testkit can fail
unless access to raw network packets is granted. Without this access, the
following apparmor error can be seen while running the tests:
Mike Latimer [Tue, 20 Jan 2015 01:25:40 +0000 (18:25 -0700)]
Fix apparmor issues for Xen
In order for apparmor to work properly in Xen environments, the following
access rights need to be allowed:
- Allow CAP_SYS_PACCT, which is required when resetting some multi-port
Broadcom cards by writting to the PCI config space
- Allow CAP_IPC_LOCK, which is required to lock/unlock memory. Without
this setting, an error 'Resource temporarily unavailable' can be seen
while attempting to mmap memory. At the same time, the following
apparmor message is seen:
Previously the function returned either -1 in case of an error or 0 on
success. However, we should also distinguish between a case we
successfully added a controller and a case there wasn't a need to add any
controller
Erik Skultety [Wed, 21 Jan 2015 16:49:24 +0000 (17:49 +0100)]
qemu: Remove dead code in qemuDomainAssignPCIAddresses revert patch
As it turned out, fix of dead code 419a22 changed the affected condition
from "never true" to "always true", so better fix would be to change the
return code of virDomainMaybeAddController from 0 to 1 if
a new bridge has been added, thus distinguishing case when we didn't need to
add any controller and case we successfully added one.
Ján Tomko [Fri, 23 Jan 2015 09:30:01 +0000 (10:30 +0100)]
Fix build with older gcc
My commit af1c98e4 broke the build on RHEL-6:
vircgrouptest.c: In function 'testCgroupGetPercpuStats':
vircgrouptest.c:566: error: nested extern declaration of
'_gl_verify_function2' [-Wnested-externs]
The only thing that needs checking is that the array size
is at least EXPECTED_NCPUS, to prevent access beyond the array.
We can ensure the minimum size also by specifying the array
size upfront.
Pavel Hrdina [Wed, 21 Jan 2015 10:17:52 +0000 (11:17 +0100)]
esx_vi: fix possible segfault
Clang found possible dereference of NULL pointer which is right.
Function 'esxVI_LookupTaskInfoByTask' should find a task info. The issue
is that we could return 0 and leave 'taksInfo' pointer NULL because if
there is no match we simply end the search loop end set 'result' to 0.
Every caller count on the fact that if the return value is 0 than it's
safe to dereference 'taskInfo'. We should return 0 only in case we found
something and the '*taskInfo' is not NULL.
Pavel Hrdina [Wed, 21 Jan 2015 10:13:31 +0000 (11:13 +0100)]
xenapi_driver: fix copy-paste typo
Clang found that we are passing variable with wrong enum type to
'xenapiCrashExitEnum2virDomainLifecycle' function. This is probably
copy-paste typo as the correct variable exists in the code, but it isn't
used.
Ján Tomko [Thu, 22 Jan 2015 10:54:50 +0000 (11:54 +0100)]
Fix virCgroupGetPercpuStats with non-continuous present CPUs
Per-cpu stats are only shown for present CPUs in the cgroups,
but we were only parsing the largest CPU number from
/sys/devices/system/cpu/present and looking for stats even for
non-present CPUs.
This resulted in:
internal error: cpuacct parse error
Peter Krempa [Tue, 20 Jan 2015 16:01:01 +0000 (17:01 +0100)]
CVE-2015-0236: qemu: Check ACLs when dumping security info from snapshots
The ACL check didn't check the VIR_DOMAIN_XML_SECURE flag and the
appropriate permission for it. Found via code inspection while fixing
permissions for save images.
When using 'virsh attach-device' to hotplug an unsupported console type
into a qemu guest the attachment would succeed as the command line
formatter didn't report error in such case.
The listen address is not mandatory for <interface type='server'>
but when it's not specified, we've been formatting it as:
-netdev socket,listen=(null):5558,id=hostnet0
which failed with:
Device 'socket' could not be initialized
Omit the address completely and only format the port in the listen
attribute.
Jim Fehlig [Tue, 20 Jan 2015 23:45:09 +0000 (16:45 -0700)]
tests: fix xlconfigtest build failure
When libvirt is configured --without-xen, building the xlconfigtest
fails with
CCLD xlconfigtest
/usr/lib/gcc/x86_64-linux-gnu/4.7/../../../x86_64-linux-gnu/crt1.o
In function `_start': (.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status
Introduced in commit 4ed5fb91 by too much copy and paste from
xmconfigtest.
Josh Stone [Thu, 4 Dec 2014 00:01:33 +0000 (16:01 -0800)]
network: Let domains be restricted to local DNS
This adds a new "localOnly" attribute on the domain element of the
network xml. With this set to "yes", DNS requests under that domain
will only be resolved by libvirt's dnsmasq, never forwarded upstream.
This was how it worked before commit f69a6b987d616, and I found that
functionality useful. For example, I have my host's NetworkManager
dnsmasq configured to forward that domain to libvirt's dnsmasq, so I can
easily resolve guest names from outside. But if libvirt's dnsmasq
doesn't know a name and forwards it to the host, I'd get an endless
forwarding loop. Now I can set localOnly="yes" to prevent the loop.
Gary R Hook [Fri, 16 Jan 2015 17:38:32 +0000 (17:38 +0000)]
Make ZFS storage pool XML tests optional
Do not run ZFS tests when ZFS is unsupported in the environment.
The recent patch b4af40226d09adeaf9e33a1d6594c4e8ce7f771d adds tests
to storagepoolxml2xmltest for the optional ZFS feature. When ZFS is
not included in the configuration these tests should not / cannot be
run. Modify the test source file to check for use of the feature and
compile in the tests accordingly.
Signed-off-by: Gary R Hook <gary.hook@nimboxx.com>
Ján Tomko [Tue, 16 Dec 2014 09:40:58 +0000 (10:40 +0100)]
Always check return value of qemuDomainObjExitMonitor
Depending on the context, either error out if the domain
has disappeared in the meantime, or just ignore the value
to allow marking the function as ATTRIBUTE_RETURN_CHECK.
If the domain crashed while we were in monitor,
we cannot rely on the REALLOC done on live definition,
since vm->def now points to the persistent definition.
Skip adding the attached devices to domain definition
if the domain crashed.
In AttachChrDevice, the chardev was already added to the
live definition and freed by qemuProcessStop in the case
of a crash. Skip the device removal in that case.
Also skip audit if the domain crashed in the meantime.
Anthony PERARD [Thu, 15 Jan 2015 16:40:19 +0000 (16:40 +0000)]
libxl: Set path to console on domain startup.
The path to the pty of a Xen PV console is set only in
virDomainOpenConsole. But this is done too late. A call to
virDomainGetXMLDesc done before OpenConsole will not have the path to
the pty, but a call after OpenConsole will.
e.g. of the current issue.
Starting a domain with '<console type="pty"/>'
Then:
virDomainGetXMLDesc():
<devices>
<console type='pty'>
<target type='xen' port='0'/>
</console>
</devices>
virDomainOpenConsole()
virDomainGetXMLDesc():
<devices>
<console type='pty' tty='/dev/pts/30'>
<source path='/dev/pts/30'/>
<target type='xen' port='0'/>
</console>
</devices>
The patch intend to have the TTY path on the first call of GetXMLDesc.
This is done by setting up the path at domain start up instead of in
OpenConsole.
Dmitry Guryanov [Tue, 13 Jan 2015 11:27:40 +0000 (14:27 +0300)]
parallels: create container from existing image
It's possible to create a container with existing
disk image as root filesystem. You need to remove
existing disks from PCS VM config and then add a new
one, pointing to your image. And then call PrlVm_RegEx
with PRNVM_PRESERVE_DISK flag.
With this patch you can create such container with
something like this for new domain XML config:
Dmitry Guryanov [Tue, 13 Jan 2015 11:27:38 +0000 (14:27 +0300)]
parallels: commit with PVCF_DETACH_HDD_BUNDLE flag
PCS removes disk image from filesystem, if you remove it
from config. There is a special flag PVCF_DETACH_HDD_BUNDLE
which allow to remove disk only from VM/CT config.
If you call virDomainDefine and remove some disk from
config it should be preserved, so call PrlVm_CommitEx
always with flag PVCF_DETACH_HDD_BUNDLE.
Dmitry Guryanov [Tue, 13 Jan 2015 11:27:36 +0000 (14:27 +0300)]
add ploop fs driver type
Ploop is a pseudo device which makeit possible to access
to an image in a file as a block device. Like loop devices,
but with additional features, like snapshots, write tracker
and without double-caching.
It used in PCS for containers and in OpenVZ. You can manage
ploop devices and images with ploop utility
(http://git.openvz.org/?p=ploop).
John Ferlan [Fri, 16 Jan 2015 11:40:15 +0000 (06:40 -0500)]
network: Resolve Coverity FORWARD_NULL
Commit id 'ca481a6f' added virNetworkRouteDefFree which may be called
in an error path from lxcAddNetworkRouteDefinition with 'route = NULL'.
So just add the (!def) at the top to resolve.
The 'virsh edit' command gets XML validation enabled by default,
with a --skip-validate option to disable it. The 'virsh define'
and 'virsh create' commands get a --validate option to enable
it, to avoid regressions for existing scripts.
The quality of error reporting from libxml2 varies depending
on the type of XML error made. Sometimes it is quite clear
and useful, other times it is obscure & inaccurate. At least
the user will see an error now, rather than having their
XML modification silently disappear.
Erik Skultety [Thu, 15 Jan 2015 13:14:17 +0000 (14:14 +0100)]
qemu: Tweak auto adding PCI bridge controller when extending default PCI bus
In case we find out, there are more PCI devices to be connected
than there are available slots on the default PCI bus, we automatically add a
new bus and a related PCI bridge controller as well. As there are no free slots
left on the default PCI bus, PCI bridge controller gets a free slot on a
newly created PCI bus which causes qemu to refuse to start the guest.
This fix introduces a new function qemuDomainPCIBusFullyReserved which
is checked right before we possibly try to reserve a slot for PCI bridge
controller.
John Ferlan [Fri, 9 Jan 2015 16:02:05 +0000 (11:02 -0500)]
domain_conf: Check errors from virSocketAddrFormat
Commit id 'aa2cc721' added calls to virSocketAddrFormat but did not
check for a NULL (error) return which could lead to bad output
in the XML file. Need to check for NULL return and cause failure.
Cédric Bosdonnat [Wed, 14 Jan 2015 13:21:10 +0000 (14:21 +0100)]
Move code related to network routes to networkcommon_conf.[ch]
Moving code for parsing and formatting network routes to
networkcommon_conf helps reusing those routes for domains. The route
definition has been hidden to help reducing the number of unnecessary
checks in the format function.
When updating a network and adding new ip-dhcp-host entry, the deamon
may crash. The problem is, we iterate over existing <host/> entries
trying to compare MAC addresses to see if there's already an existing
rule. However, not all entries are required to have MAC address. For
instance, the following is perfectly valid entry:
When the checking loop iterates over this, the entry's MAC address is
accessed directly. Well, the fix is obvious - check if the address is
defined before trying to compare it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Add support for schema validation when passing in XML
The virDomainDefineXMLFlags and virDomainCreateXML APIs both
gain new flags allowing them to be told to validate XML.
This updates all the drivers to turn on validation in the
XML parser when the flags are set