Ján Tomko [Mon, 1 Jul 2013 16:28:50 +0000 (18:28 +0200)]
qemu: don't use deprecated -no-kvm-pit-reinjection
Since qemu-kvm 1.1 [1] (since 1.3. in upstream QEMU [2])
'-no-kvm-pit-reinjection' has been deprecated.
Use -global kvm-pit.lost_tick_policy=discard instead.
John Ferlan [Tue, 5 Nov 2013 12:55:54 +0000 (07:55 -0500)]
Resolve Coverity issue regarding not checking return value
Coverity complains that the call to virPCIDeviceDetach() in
qemuPrepareHostdevPCIDevices() doesn't check status return like
other calls. Seems this just was lurking until a recent change
to this module resulted in Coverity looking harder and finding
the issue. Introduced by 'a4efb2e33' when function was called
'pciReAttachDevice()'
Just added a ignore_value() since it doesn't appear to matter
if the call fails since we're on a failure path already.
Currently the LXC container tries to skip selinux/securityfs
mounts if the directory does not exist in the filesystem,
or if SELinux is disabled.
The former check is flawed because the /sys/fs/selinux
or /sys/kernel/securityfs directories may exist in sysfs
even if the mount type is disabled. Instead of just doing
an access() check, use an virFileIsMounted() to see if
the FS is actually present in the host OS. This also
avoids the need to check is_selinux_enabled().
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add flag to lxcBasicMounts to control use in user namespaces
Some mounts must be skipped if running inside a user namespace,
since the kernel forbids their use. Instead of strcmp'ing the
filesystem type in the body of the loop, set an explicit flag
in the lxcBasicMounts table.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently the lxcBasicMounts array has separate entries for
most mounts, to reflect that we must do a separate mount
operation to make mounts read-only. Remove the duplicate
entries and instead set the MS_RDONLY flag against the main
entry. Then change lxcContainerMountBasicFS to look for the
MS_RDONLY flag, mask it out & do a separate bind mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove pointless 'srcpath' variable in lxcContainerMountBasicFS
The 'srcpath' variable is initialized from 'mnt->src' and never
changed thereafter. Some places continue to use 'mnt->src' and
others use 'srcpath'. Remove the pointless 'srcpath' variable
and use 'mnt->src' everywhere.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove unused 'opts' field from LXC basic mounts struct
The virLXCBasicMountInfo struct contains a 'char *opts'
field passed onto the mount() syscall. Every entry in the
list sets this to NULL though, so it can be removed to
simplify life.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If a PCI deivce is not binded to any driver (e.g. there's yet no PCI
driver in the linux kernel) but still users want to passthru the device
we fail the whole operation as we fail to resolve the 'driver' link
under the PCI device sysfs tree. Obviously, this is not a fatal error
and it shouldn't be error at all.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Wed, 23 Oct 2013 15:55:02 +0000 (16:55 +0100)]
virpcitest: Test virPCIDeviceDetach
This commit introduces yet another test under virpcitest:
virPCIDeviceDetach. However, in order to be able to do this, the
virpcimock needs to be extended to model the kernel behavior on PCI
device binding and unbinding (create 'driver' symlinks under the device
tree, check for device ID in driver's ID table, etc.)
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Tue, 8 Oct 2013 16:20:10 +0000 (18:20 +0200)]
cpu: x86: Parse the CPU feature map only once
Until now the map was loaded from the XML definition file every time a
operation on the flags was requested. With the introduciton of one shot
initializers we can store the definition forever (as it will never
change) instead of parsing it over and over again.
This makes virCPUx86DataAddCPUID, virCPUx86DataFree, and
virCPUx86MakeData available for direct usage outside of cpu driver in
tests and the new qemu monitor that will request the actual CPU
definition from a running qemu instance.
Peter Krempa [Fri, 18 Oct 2013 10:18:56 +0000 (12:18 +0200)]
nodeinfo: Get rid of nodeGetCellMemory
The function was called in a single place only and was reporting errors
that were later ignored. Use the virNumaGetNodeMemory helper to get the
size of the memory in the NUMA node and remove the helper
Ryota Ozaki [Thu, 24 Oct 2013 15:48:25 +0000 (00:48 +0900)]
virnetsocket: fix getsockopt on FreeBSD
aa0f099 introduced a strict error checking for getsockopt and it
revealed that getting a peer credential of a socket on FreeBSD
didn't work. Libvirtd hits the error:
error : virNetSocketGetUNIXIdentity:1198 : Failed to get valid
client socket identity groups
SOL_SOCKET (0xffff) was used as a level of getsockopt for
LOCAL_PEERCRED, however, it was wrong. 0 is correct as well as
Mac OS X.
So for LOCAL_PEERCRED our options are SOL_LOCAL (if defined) or
0 on Mac OS X and FreeBSD. According to the fact, the patch
simplifies the code by removing ifdef __APPLE__.
I tested the patch on FreeBSD 8.4, 9.2 and 10.0-BETA1.
For reference, Linux systems typically define it as:
typedef bool_t (*xdrproc_t)(XDR *, void *, ...);
The rationale explained in the header is that using a vararg is
incorrect and has a potential to change the ABI slightly do to compiler
optimizations taken and the undefined behavior. They decided
to specify the exact number of parameters and for compatibility with old
code decided to make the signature require 3 arguments. The third
argument is ignored for cases that its not used and its recommended to
supply a 0.
libxl: fix dubious cpumask handling in libxlDomainSetVcpuAffinities
Rather than casting the virBitmap pointer to uint8_t* and then using
the structure contents as a byte array, use the virBitmap API to determine
the bitmap size and test each bit.
Signed-off-by: Jeremy Fitzhardinge <jeremy@goop.org>
Jim Fehlig [Fri, 1 Nov 2013 14:22:18 +0000 (08:22 -0600)]
Revert "libxl: Fix possible invalid read"
This reverts commit 394d6e0a95ac91facba06ab43d67ae8600b14b0e.
The real problem is accessing the virtBitmap structure as a byte
array, which was correctly identified and fixed by Jeremy Fitzhardinge
in recently xen commit: 7051d5c8, there is a api changes in
libxl_domain_create_restore.
Author: Andrew Cooper <andrew.cooper3@citrix.com>
Date: Thu Oct 10 12:23:10 2013 +0100
tools/migrate: Fix regression when migrating from older version of Xen
use the macro LIBXL_HAVE_DOMAIN_CREATE_RESTORE_PARAMS in libxl.h
in order to make libvirt could compile with old and new xen.
the params checkpointed_stream is useful if libvirt libxl driver
support migration. for new, set it as zero.
When starting a transient VM the first thing done is to check
for duplicates. The check looks if there are any running VMs
with the matching name/uuid. It explicitly allows there to
be inactive VMs, so that a persistent VM can be temporarily
booted with a different config.
There is a race condition, however, where 2 or more clients
try to create the same transient VM. The first client will
cause a virDomainObjPtr to be added to the domain list, and
it is inactive at this stage. The second client may then
come along and see this inactive VM, and mistake it for a
persistent VM.
If the first VM fails to start its transient guest for any
reason, then it'll remove the virDomainObjPtr from the list.
The second client now has a virDomainObjPtr that it can try
to boot, which libvirt no longer has a record of. The result
can be a running QEMU process that is orphaned.
It was also, however, possible for the virDomainObjPtr to be
completely free'd which will cause libvirtd to crash in some
scenarios.
The fix is to only allow an existing inactive VM if it is
marked as persistent.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ryota Ozaki [Thu, 31 Oct 2013 15:45:12 +0000 (00:45 +0900)]
nodedev_hal: fix segfault when virDBusGetSystemBus fails
Thie patch fixes the segfault:
error : nodeStateInitialize:658 : DBus not available,
disabling HAL driver: internal error: Unable to get DBus
system bus connection: Failed to connect to socket
/var/run/dbus/system_bus_socket: No such file or directory
error : nodeStateInitialize:719 : ?:
Caught Segmentation violation dumping internal log buffer:
This segfault occurs at the below VIR_ERROR:
failure:
if (dbus_error_is_set(&err)) {
VIR_ERROR(_("%s: %s"), err.name, err.message);
When virDBusGetSystemBus fails, the code jumps to the above failure
path. However, the err variable is not correctly initialized
before calling virDBusGetSystemBus. As a result, dbus_error_is_set
may pass over the uninitialized err variable whose name or
message may point to somewhere unknown memory region, which
causes a segfault on VIR_ERROR.
The new code initializes the err variable before calling
virDBusGetSystemBus.
Include reference of the VM object pointer and name in debug
logs for QEMU start/stop functions. Also make sure we log the
PID that we started, since it isn't available elsewhere in the
logs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
In debugging a recent oVirt/libvirt race condition, I was very
frustrated by lack of logging in the job enter/exit code. This
patch adds some key data which would have been useful in by
debugging attempts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Eric Blake [Wed, 30 Oct 2013 21:42:31 +0000 (15:42 -0600)]
storage: use correct type for array count
Using size_t counts will let us use VIR_APPEND_ELEMENT and friends.
* src/conf/storage_conf.h (_virStoragePoolObjList)
(_virStorageVolDefList): Track list sizes with size_t.
* src/storage/storage_backend_rbd.c
(virStorageBackendRBDRefreshPool): Fix type fallout.
Eric Blake [Tue, 29 Oct 2013 19:57:19 +0000 (13:57 -0600)]
maint: avoid further typedef accidents
To make it easier to forbid future attempts at a confusing typedef
name ending in Ptr that isn't actually a pointer, insist that we
follow our preferred style of 'typedef foo *fooPtr'.
Fix race condition reconnecting to vms & loading configs
The following sequence
1. Define a persistent QMEU guest
2. Start the QEMU guest
3. Stop libvirtd
4. Kill the QEMU process
5. Start libvirtd
6. List persistent guests
At the last step, the previously running persistent guest
will be missing. This is because of a race condition in the
QEMU driver startup code. It does
1. Load all VM state files
2. Spawn thread to reconnect to each VM
3. Load all VM config files
Only at the end of step 3, does the 'virDomainObjPtr' get
marked as "persistent". There is therefore a window where
the thread reconnecting to the VM will remove the persistent
VM from the list.
The easy fix is to simply switch the order of steps 2 & 3.
In addition to this though, we must only attempt to reconnect
to a VM which had a non-zero PID loaded from its state file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fix leak of objects when reconnecting to QEMU instances
The 'error' cleanup block in qemuProcessReconnect() had a
'return' statement in the middle of it. This caused a leak
of virConnectPtr & virQEMUDriverConfigPtr instances. This
was identified because netcf recently started checking its
refcount in libvirtd shutdown:
netcfStateCleanup:109 : internal error: Attempt to close netcf state driver with open connections
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Doug Goldstein [Mon, 28 Oct 2013 18:20:35 +0000 (13:20 -0500)]
MacOS: Re-add support for QEMU backend
The QEMU backend was disabled on Mac OS X without a reason in the code
and due to refactors its difficult to understand when/why it was
disabled. With QEMU being supported on Mac OS X there is no reason to
disable QEMU on this platform.
John Ferlan [Tue, 22 Oct 2013 07:50:08 +0000 (08:50 +0100)]
Add '+' to uid/gid printing for label processing
To ensure proper processing by virGetUserID() and virGetGroupID()
of a uid/gid add a "+" prior to the uid/gid to denote it's really
a uid/gid for the label.
Eric Blake [Tue, 29 Oct 2013 15:56:48 +0000 (09:56 -0600)]
storage: fix incorrect typedef
The rbd code had a confusing typedef ending in Ptr that was not
actually a pointer, which made the rest of the code harder to
read. This fixes things to actually pass by pointer rather than
by copy.
Push RPM deps down into libvirt-daemon-driver-XXXX sub-RPMs
For inexplicable reasons, many of the 3rd party package deps
were left against the 'libvirt-daemon' RPM when the drivers
were split out. This makes a minimal install heavier that
it should be. Push them all down into libvirt-daemon-driver-XXX
so they're only pulled in when truly needed
With this change applied, a minimal install of just the
libvirt-daemon-driver-lxc RPM is reduced by 41 MB on a
Fedora 19 host.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
capabilities: add baselabel per sec driver/virt type to secmodel
Expand the "secmodel" XML fragment of "host" with a sequence of
baselabel's which describe the default security context used by
libvirt with a specific security model and virtualization type:
Chen Hanxiao [Mon, 28 Oct 2013 11:18:26 +0000 (11:18 +0000)]
Skip debug message in lxcContainerSetID if no map is set.
The lxcContainerSetID() method prints a misleading log
message about setting the uid/gid when no ID map is
present in the XML config. Skip the debug message in
this case.
John Ferlan [Fri, 18 Oct 2013 10:43:47 +0000 (06:43 -0400)]
Avoid Coverity DEADCODE warning
Commit '922b7fda' resulted in two DEADCODE warnings from Coverity in
remoteDispatchAuthPolkit and virAccessDriverPolkitFormatProcess.
Commit '604ae657' modified the daemon.c code to remove the deadcode
issue, but did not do so for viracessdriverpolkit.c. This just mimics
the same changes
Commit e962a57 added 'attach-disk --shareable', even though we
already had 'attach-disk --mode=shareable'. Worse, if the user
types 'attach-disk --mode=readonly --shareable', we create
non-sensical XML. The best solution is just to undocument the
duplicate spelling, by having it fall back to the preferred
spelling.
* tools/virsh-domain.c (cmdAttachDisk): Let alias handling fix our
mistake in exposing a second spelling for an existing option.
* tools/virsh.pod: Fix documentation.
Eric Blake [Thu, 24 Oct 2013 07:06:29 +0000 (08:06 +0100)]
virsh: allow alias to expand to opt=value pair
We want to treat 'attach-disk --shareable' as an undocumented
alias for 'attach-disk --mode=shareable'. By improving our
alias handling, we can allow all such --bool -> --opt=value
replacements, and guarantee up front that the alias is not
mixed with its replacement.
* tools/virsh.c (vshCmddefOptParse, vshCmddefGetOption): Add
support for expanding bool alias to --opt=value.
(opts_echo): Add another alias to test it.
* tests/virshtest.c (mymain): Test it.
According to the following valgrind output, there seems to be a
invalid limit for the iterator (captured on Fedora 19):
==3945== Invalid read of size 1
==3945== at 0x1E1FA410: libxlVmStart (libxl_driver.c:475)
==3945== by 0x1E1FAD9A: libxlDomainCreateWithFlags (libxl_driver.c:2633)
==3945== by 0x5187D46: virDomainCreate (libvirt.c:9439)
==3945== by 0x13BAA6: remoteDispatchDomainCreateHelper (remote_dispatch.h:2910)
==3945== by 0x51DE5B9: virNetServerProgramDispatch (virnetserverprogram.c:435)
==3945== by 0x51D93E7: virNetServerHandleJob (virnetserver.c:165)
==3945== by 0x50F5BF4: virThreadPoolWorker (virthreadpool.c:144)
==3945== by 0x50F5670: virThreadHelper (virthreadpthread.c:161)
==3945== by 0x8046C52: start_thread (pthread_create.c:308)
==3945== by 0x8758E1C: clone (clone.S:113)
==3945== Address 0x23424d81 is 0 bytes after a block of size 1 alloc'd
==3945== at 0x4A08121: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==3945== by 0x50B1F8C: virAllocN (viralloc.c:189)
==3945== by 0x1E1FA3CA: libxlVmStart (libxl_driver.c:468)
==3945== by 0x1E1FAD9A: libxlDomainCreateWithFlags (libxl_driver.c:2633)
==3945== by 0x5187D46: virDomainCreate (libvirt.c:9439)
==3945== by 0x13BAA6: remoteDispatchDomainCreateHelper (remote_dispatch.h:2910)
==3945== by 0x51DE5B9: virNetServerProgramDispatch (virnetserverprogram.c:435)
==3945== by 0x51D93E7: virNetServerHandleJob (virnetserver.c:165)
==3945== by 0x50F5BF4: virThreadPoolWorker (virthreadpool.c:144)
==3945== by 0x50F5670: virThreadHelper (virthreadpthread.c:161)
==3945== by 0x8046C52: start_thread (pthread_create.c:308)
==3945== by 0x8758E1C: clone (clone.S:113)
==3945==
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1013045 Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
build: Fix prohibit_int_ijk (and iijjkk) on RHEL 5
On RHEL 5, make syntax-check was failing because even strings like
'int isTempChain' matched the 'int i' rule. To be honest, I haven't
found the root cause, but the change added makes it work as expected
and keeps the proper behavior on newer systems as well.
Marian Neagul [Tue, 22 Oct 2013 15:03:39 +0000 (16:03 +0100)]
python: Fix Create*WithFiles filefd passing
Commit d76227be added functions virDomainCreateWithFiles and
virDomainCreateXMLWithFiles, but there was a little piece missing in
python bindings. This patch fixes proper passing of file descriptors
in the overwrites of these functions.
Hongwei Bi [Tue, 22 Oct 2013 13:38:01 +0000 (21:38 +0800)]
networkStartDhcpDaemon: Check for dnsmasqCapsRefresh failure
Currently, we ignore whether dnsmasqCapsRefresh succeeds or fails. We
shouldn't do that as we may generate wrong dnsmasq command line (what
is done just a few lines below).
Signed-off-by: Hongwei Bi <hwbi2008@gmail.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Doug Goldstein [Sat, 12 Oct 2013 17:06:27 +0000 (12:06 -0500)]
rpc: Retrieve peer PID via new getsockopt() for Mac
While LOCAL_PEERCRED on the BSDs does not return the pid information of
the peer, Mac OS X 10.8 added LOCAL_PEERPID to retrieve the pid so we
should use that when its available to get that information.
Jim Fehlig [Tue, 22 Oct 2013 05:12:22 +0000 (23:12 -0600)]
build: fix build of virt-login-shell on systems with older gnutls
On systems where gnutls uses libgcrypt, I'm seeing the following
build failure
libvirt.c:314: error: variable 'virTLSThreadImpl' has initializer but incomplete type
libvirt.c:319: error: 'GCRY_THREAD_OPTION_PTHREAD' undeclared here (not in a function)
...
Fix by undefining WITH_GNUTLS_GCRYPT in config-post.h
Michal Privoznik [Tue, 22 Oct 2013 09:33:06 +0000 (10:33 +0100)]
Get rid of shadowed booleans
There are still two places where we are using 1bit width unsigned
integer to store a boolean. There's no real need for this and these
occurrences can be replaced with 'bool'.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Jim Fehlig [Mon, 21 Oct 2013 21:36:11 +0000 (15:36 -0600)]
build: fix linking virt-login-shell
After commit 3e2f27e1, I've noticed build failures of virt-login-shell
when libapparmor-devel is installed on the build host
CCLD virt-login-shell
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-vircommand.o):
In function `virExec':
/home/jfehlig/virt/upstream/libvirt/src/util/vircommand.c:653: undefined
reference to `aa_change_profile'
collect2: error: ld returned 1 exit status
I was about to commit an easy fix under the build-breaker rule
(build-fix-1.patch), but thought to extend the notion of SECDRIVER_LIBS
to SECDRIVER_CFLAGS, and use both throughout src/Makefile.am where it
makes sense (build-fix-2.patch).
Should I just stick with the simple fix, or is something along the lines
of patch 2 preferred?
Regards,
Jim
>From a0f35945f3127ab70d051101037e821b1759b4bb Mon Sep 17 00:00:00 2001
From: Jim Fehlig <jfehlig@suse.com>
Date: Mon, 21 Oct 2013 15:30:02 -0600
Subject: [PATCH] build: fix virt-login-shell build with apparmor
With libapparmor-devel installed, virt-login-shell fails to link
CCLD virt-login-shell
../src/.libs/libvirt-setuid-rpc-client.a(libvirt_setuid_rpc_client_la-vircommand.o): In function `virExec':
/home/jfehlig/virt/upstream/libvirt/src/util/vircommand.c:653: undefined reference to `aa_change_profile'
collect2: error: ld returned 1 exit status
Fix by linking libvirt_setuid_rpc_client with previously determined
SECDRIVER_LIBS in src/Makefile.am. While at it, introduce SECDRIVER_CFLAGS
and use both throughout src/Makefile.am where it makes sense.
Michal Privoznik [Tue, 22 Oct 2013 12:15:21 +0000 (13:15 +0100)]
vircgroupmock: Mock access() to some more files
Currently, if access(path, mode) is invoked, we check if @path has this
special prefix SYSFS_PREFIX. If it does, we modify the path a bit and
call realaccess. If it doesn't we act just like a wrapper and call
realaccess directly. However, we are mocking fopen() as well. And as one
can clearly see there, fopen("/proc/cgroups") will succeed. Hence, we
have an error in our mocked access(): We need to check whether @path is
not equal to /proc/cgroups as it may not exists on real system we're
running however we definitely know how to fopen() it.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Tue, 22 Oct 2013 11:29:13 +0000 (12:29 +0100)]
tests: Use lv_abs_top_builddir instead of bare abs_top_builddir
As stated in the comment above introduction of the lv_abs_top_builddir
variable, older automake doesn't provide abs_top_builddir variable.
Hence, we are creating our own one with lv_ prefix. However, when
exporting env variables to the tests, the variables are not evaluated
but only substituted. Hence:
Peter Krempa [Tue, 22 Oct 2013 14:01:26 +0000 (15:01 +0100)]
virsh: Fix job watching when STDIN is not a tty
In commit b46c4787dde79b015dad67dedda4ccf6ff1a3082 I changed the code to
watch long running jobs in virsh. Unfortunately I didn't take into
account that poll may get a hangup if the terminal is not a TTY and will
be closed.
This patch avoids polling the STDIN fd when there's no TTY.
Ryota Ozaki [Sun, 20 Oct 2013 15:14:52 +0000 (00:14 +0900)]
nodeinfo: fix physical memory size on Mac OS X
HW_PHYSMEM is available on Mac OS X as well as FreeBSD, however,
its resulting value for Mac OS X is 32 bits. Mac OS X provides
HW_MEMSIZE that is 64 bits version of HW_PHYSMEM. We have to use it.
I tested the patch on Mac OS X 10.6.8, 10.7.4, 10.8.5 and FreeBSD 9.2.
When libvirt was changed to delay the final cleanup of device removal
until the qemu process had signaled it with a DEVICE_DELETED event for
that device, the hostdev removal function
(qemuDomainRemoveHostDevice()) was written to properly handle the
removal of a hostdev that was actually an SRIOV virtual function
(defined with <interface type='hostdev'>). However, the function used
to search for a device matching the alias name provided in the
DEVICE_DELETED message (virDomainDefFindDevice()) would search through
the list of netdevs before hostdevs, so qemuDomainRemoveHostDevice()
was never called; instead the netdev function,
qemuDomainRemoveNetDevice() (which *doesn't* properly cleanup after
removal of <interface type='hostdev'>), was called.
(As a reminder - each <interface type='hostdev'> results in a
virDomainNetDef which contains a virDomainHostdevDef having a parent
type of VIR_DOMAIN_DEVICE_NET, and parent.data.net pointing back to
the virDomainNetDef; both Defs point to the same device info object
(and the info contains the device's "alias", which is used by qemu to
identify the device). The virDomainHostdevDef is added to the domain's
hostdevs list *and* the virDomainNetDef is added to the domain's nets
list, so searching either list for a particular alias will yield a
positive result.)
This function modifies the qemuDomainRemoveNetDevice() to short
circuit itself and call qemu DomainRemoveHostDevice() instead when the
actual device is a VIR_DOMAIN_NET_TYPE_HOSTDEV (similar logic to what
is done in the higher level qemuDomainDetachNetDevice())
Note that even if virDomainDefFindDevice() changes in the future so
that it finds the hostdev entry first, the current code will continue
to work properly.
This function was called in three places, and in each the call was
qualified by a slightly different conditional. In reality, this
function should only be called for a hostdev if all of the following
are true:
1) mode='subsystem'
2) type='pci'
3) there is a parent device definition which is an <interface>
(VIR_DOMAIN_DEVICE_NET)
We can simplify the callers and make them more consistent by checking
these conditions at the top ov qemuDomainHostdevNetConfigRestore and
returning 0 if one of them isn't satisfied.
The location of the call to qemuDomainHostdevNetConfigRestore() has
also been changed in the hot-plug case - it is moved into the caller
of its previous location (i.e. from qemuDomainRemovePCIHostDevice() to
qemuDomainRemoveHostDevice()). This was done to be more consistent
about which functions pay attention to whether or not this is one of
the special <interface> hostdevs or just a normal hostdev -
qemuDomainRemoveHostDevice() already contained a call to
networkReleaseActualDevice() and virDomainNetDefFree(), so it makes
sense for it to also handle the resetting of the device's MAC address
and vlan tag (which is what's done by
qemuDomainHostdevNetConfigRestore()).
Move virt-login-shell into libvirt-login-shell sub-RPM
Many people will not want the setuid virt-login-shell binary
installed by default, so move it into a separate sub-RPM
named libvirt-login-shell. This RPM is only generated if
LXC is enabled
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Most of the usage of getuid()/getgid() is in cases where we are
considering what privileges we have. As such the code should be
using the effective IDs, not real IDs.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We already have stubs for getuid, geteuid, getgid but
not for getegid. Something in gnulib already does a
check for it during configure, so we already have the
HAVE_GETEGID macro defined.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Only allow the UNIX transport in remote driver when setuid
We don't know enough about quality of external libraries used
for non-UNIX transports, nor do we want to spawn external
commands when setuid. Restrict to the bare minimum which is
UNIX transport for local usage. Users shouldn't need to be
running setuid if connecting to remote hypervisors in any
case.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Unconditional use of getenv is not secure in setuid env.
While not all libvirt code runs in a setuid env (since
much of it only exists inside libvirtd) this is not always
clear to developers. So make all the code paranoid, even
if it only ever runs inside libvirtd.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When running setuid, we must be careful about what env vars
we allow commands to inherit from us. Replace the
virCommandAddEnvPass function with two new ones which do
filtering
Don't link virt-login-shell against libvirt.so (CVE-2013-4400)
The libvirt.so library has far too many library deps to allow
linking against it from setuid programs. Those libraries can
do stuff in __attribute__((constructor) functions which is
not setuid safe.
The virt-login-shell needs to link directly against individual
files that it uses, with all library deps turned off except
for libxml2 and libselinux.
Create a libvirt-setuid-rpc-client.la library which is linked
to by virt-login-shell. A config-post.h file allows this library
to disable all external deps except libselinux and libxml2.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Close all non-stdio FDs in virt-login-shell (CVE-2013-4400)
We don't want to inherit any FDs in the new namespace
except for the stdio FDs. Explicitly close them all,
just in case some do not have the close-on-exec flag
set.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Only allow 'stderr' log output when running setuid (CVE-2013-4400)
We must not allow file/syslog/journald log outputs when running
setuid since they can be abused to do bad things. In particular
the 'file' output can be used to overwrite files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add helpers for getting env vars in a setuid environment
Care must be taken accessing env variables when running
setuid. Introduce a virGetEnvAllowSUID for env vars which
are safe to use in a setuid environment, and another
virGetEnvBlockSUID for vars which are not safe. Also add
a virIsSUID helper method for any other non-env var code
to use.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fix perms for virConnectDomainXML{To,From}Native (CVE-2013-4401)
The virConnectDomainXMLToNative API should require 'connect:write'
not 'connect:read', since it will trigger execution of the QEMU
binaries listed in the XML.
Also make virConnectDomainXMLFromNative API require a full
read-write connection and 'connect:write' permission. Although the
current impl doesn't trigger execution of QEMU, we should not
rely on that impl detail from an API permissioning POV.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>