Added check for maximum number of vcpus exceeding topology limit
Earlier, when the number of vcpus was greater than the topology allowed,
libvirt didn't raise an error and continued, resulting in running qemu
with parameters making no sense. Even though qemu did not report any
error itself, the number of vcpus was set to maximum allowed by the
topology.
Eric Blake [Thu, 12 Jan 2012 00:44:49 +0000 (17:44 -0700)]
uuid: fix off-by-one
Detected by Coverity. Although unlikely, if we are ever started
with stdin closed, we could reach a situation where we open a
uuid file but then fail to close it, making that file the new
stdin for the rest of the process.
* src/util/uuid.c (getDMISystemUUID): Allow for stdin.
Re-write LXC controller end-of-file I/O handling yet again
Currently the LXC controller attempts to deal with EOF on a
tty by spawning a thread to do an edge triggered epoll_wait().
This avoids the normal event loop spinning on POLLHUP. There
is a subtle mistake though - even after seeing POLLHUP on a
master PTY, it is still perfectly possible & valid to write
data to the PTY. There is a buffer that can be filled with
data, even when no client is present.
The second mistake is that the epoll_wait() thread was not
looking for the EPOLLOUT condition, so when a new client
connects to the LXC console, it had to explicitly send a
character before any queued output would appear.
Finally, there was in fact no need to spawn a new thread to
deal with epoll_wait(). The epoll file descriptor itself
can be poll()'d on normally.
This patch attempts to deal with all these problems.
- The blocking epoll_wait() thread is replaced by a poll
on the epoll file descriptor which then does a non-blocking
epoll_wait() to handle events
- Even if POLLHUP is seen, we continue trying to write
any pending output until getting EAGAIN from write.
- Once write returns EAGAIN, we modify the epoll event
mask to also look for EPOLLOUT
* src/lxc/lxc_controller.c: Avoid stalled I/O upon
connected to an LXC console
Allow 10 chars for domain IDs & 30 chars for names in virsh list
Domain IDs are at least 16 bits for most hypervisors, theoretically
event 32-bits. 3 characters is clearly too small an alignment.
Increase alignment to 5 characters to allow 16-bit domain IDs to
display cleanly. Commonly seen with LXC where domain IDs are the
process IDs by default. Also increase the 'name' field from 20
to 30 characters to cope with longer guest names which are quite
common
Michal Privoznik [Tue, 10 Jan 2012 15:57:30 +0000 (16:57 +0100)]
stream: Check for stream EOF
If client stream does not have any data to sink and neither received
EOF, a dummy packet is sent to the daemon signalising client is ready to
sink some data. However, after we added event loop to client a race may
occur:
Thread 1 calls virNetClientStreamRecvPacket and since no data are cached
nor stream has EOF, it decides to send dummy packet to server which will
sent some data in turn. However, during this decision and actual message
exchange with server -
Thread 2 receives last stream data from server. Therefore an EOF is set
on stream and if there is a call waiting (which is not yet) it is woken
up. However, Thread 1 haven't sent anything so far, so there is no call
to be woken up. So this thread sent dummy packet to daemon, which
ignores that as no stream is associated with such packet and therefore
no reply will ever come.
Osier Yang [Wed, 11 Jan 2012 12:45:31 +0000 (20:45 +0800)]
virsh: New command print summary of all virtual interfaces
Just like command "domblklist", the command extracts "type",
"source", "target", "model", and "MAC" of all virtual interfaces
from domain XML (live or persistent).
Deepak C Shetty [Tue, 10 Jan 2012 12:53:31 +0000 (18:23 +0530)]
Do not generate security_model when fs driver is anything but 'path'
QEMU does not support security_model for anything but 'path' fs driver type.
Currently in libvirt, when security_model ( accessmode attribute) is not
specified it auto-generates it irrespective of the fs driver type, which
can result in a qemu error for drivers other than path. This patch ensures
that the qemu cmdline is correctly generated by taking into account the
fs driver type.
Signed-off-by: Deepak C Shetty <deepakcs@linux.vnet.ibm.com>
Shradha Shah [Wed, 14 Dec 2011 10:50:30 +0000 (10:50 +0000)]
Functionality to implicitly get interface pool from SR-IOV PF.
If a system has 64 or more VF's, it is quite tedious to mention each VF
in the interface pool.
The following modification will implicitly create an interface pool from
the SR-IOV PF.
Although the netcf interface driver can in theory be used by
the stateless drivers, in practice none of them want to use
it because they have different ways of dealing with interfaces.
Furthermore, if you have mingw32-netcf installed, then the
libvirt mingw32 build will fail with
../../src/interface/netcf_driver.c:644:5: error: unknown field 'close_used_without_including_unistd_h' specified in initializer
* configure.ac: disable netcf if built without libvirtd
Eric Blake [Wed, 11 Jan 2012 13:48:14 +0000 (06:48 -0700)]
build: fix build on mingw with netcf available
The autobuilder pointed out an odd failure on mingw:
../../src/interface/netcf_driver.c:644:5: error: unknown field 'close_used_without_including_unistd_h' specified in initializer
cc1: warnings being treated as errors
This is because the gnulib headers #define close to different strings,
according to which headers are included, in order to work around some
odd mingw problems with close(), and these defines happen to also
affect field members declared with a name of struct foo.close. As long
as all headers are included before both the definition and use of the
struct, the various #define doesn't matter, but the netcf file hit
an instance where things were included in a different order. Fix this
for all clients that use a struct member named 'close'.
* src/driver.h: Include <unistd.h> before using 'close'.
Eric Blake [Wed, 11 Jan 2012 13:32:52 +0000 (06:32 -0700)]
build: avoid spurious compiler warning
For some weird reason, i686-pc-mingw32-gcc version 4.6.1 at -O2 complained:
../../src/conf/nwfilter_params.c: In function 'virNWFilterVarCombIterCreate':
../../src/conf/nwfilter_params.c:346:23: error: 'minValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:28: note: 'minValue' was declared here
../../src/conf/nwfilter_params.c:344:23: error: 'maxValue' may be used uninitialized in this function [-Werror=uninitialized]
../../src/conf/nwfilter_params.c:319:18: note: 'maxValue' was declared here
cc1: all warnings being treated as errors
even though all paths of the preceding switch statement either
assign the variables or return.
Stefan Berger [Wed, 11 Jan 2012 11:42:37 +0000 (06:42 -0500)]
Address side effects of accessing vars via index
Address side effect of accessing a variable via an index: Filters
accessing a variable where an element is accessed that is beyond the
size of the list (for example $TEST[10] and only 2 elements are available)
cannot instantiate that filter. Test for this and report proper error
to user.
processes the two lists 'A' and 'B' in parallel. This means that A and B
must have the same number of 'N' elements and that 'N' rules will be
instantiated (assuming all tuples from A and B are unique).
In this patch we now introduce the assignment of variables to different
iterators. Therefore a rule like
will now create every combination of elements in A with elements in B since
A has been assigned to an iterator with Id '1' and B has been assigned to an
iterator with Id '2', thus processing their value independently.
Stefan Berger [Wed, 11 Jan 2012 11:42:37 +0000 (06:42 -0500)]
Optimize the elements the iterator visits.
In this patch we introduce testing whether the iterator points to a
unique set of entries that have not been seen before at one of the previous
iterations. The point is to eliminate duplicates and with that unnecessary
filtering rules by preventing identical filtering rules from being
instantiated.
Example with two lists:
list1 = [1,2,1]
list2 = [1,3,1]
The 1st iteration would take the 1st items of each list -> 1,1
The 2nd iteration would take the 2nd items of each list -> 2,3
The 3rd iteration would take the 3rd items of each list -> 1,1 but
skip them since this same pair has already been encountered in the 1st
iteration
Implementation-wise this is solved by taking the n-th element of list1 and
comparing it against elements 1..n-1. If no equivalent is found, then there
is no possibility of this being a duplicate. In case an equivalent element
is found at position i, then the n-th element in the 2nd list is compared
against the i-th element in the 2nd list and if that is not the same, then
this is a unique pair, otherwise it is not unique and we may need to do
the same comparison on the 3rd list.
Change security driver APIs to use virDomainDefPtr instead of virDomainObjPtr
When sVirt is integrated with the LXC driver, it will be neccessary
to invoke the security driver APIs using only a virDomainDefPtr
since the lxc_container.c code has no virDomainObjPtr available.
Aside from two functions which want obj->pid, every bit of the
security driver code only touches obj->def. So we don't need to
pass a virDomainObjPtr into the security drivers, a virDomainDefPtr
is sufficient. Two functions also gain a 'pid_t pid' argument.
* src/qemu/qemu_driver.c, src/qemu/qemu_hotplug.c,
src/qemu/qemu_migration.c, src/qemu/qemu_process.c,
src/security/security_apparmor.c,
src/security/security_dac.c,
src/security/security_driver.h,
src/security/security_manager.c,
src/security/security_manager.h,
src/security/security_nop.c,
src/security/security_selinux.c,
src/security/security_stack.c: Change all security APIs to use a
virDomainDefPtr instead of virDomainObjPtr
Eric Blake [Mon, 9 Jan 2012 18:57:46 +0000 (11:57 -0700)]
snapshot: allow reuse of existing files in disk snapshot
When disk snapshots were first implemented, libvirt blindly refused
to allow an external snapshot destination that already exists, since
qemu will blindly overwrite the contents of that file during the
snapshot_blkdev monitor command, and we don't like a default of
data loss by default. But VDSM has a scenario where NFS permissions
are intentionally set so that the destination file can only be
created by the management machine, and not the machine where the
guest is running, so that libvirt will necessarily see the destination
file already existing; adding a flag will allow VDSM to force the file
reuse without libvirt complaining of possible data loss.
Eric Blake [Mon, 9 Jan 2012 20:56:42 +0000 (13:56 -0700)]
docs: standardize description of flags
We had loads of different styles in describing the @flags parameter
for various APIs, as well as several APIs that didn't list which
enums provided the bit values valid for the flags.
The end result is one of two formats:
@flags: bitwise-OR of vir...Flags
@flags: extra flags; not used yet, so callers should always pass 0
(it doesn't eliminate the failure to start, but causes libvirt to give
a better idea about the cause of the failure).
If a guest uses a kvm emulator (e.g. /usr/bin/qemu-kvm) and the guest
is started when kvm isn't available (either because virtualization is
unavailable / has been disabled in the BIOS, or the kvm modules
haven't been loaded for some reason), a semi-cryptic error message is
logged:
libvirtError: internal error Child process (LC_ALL=C
PATH=/sbin:/usr/sbin:/bin:/usr/bin /usr/bin/qemu-kvm -device ? -device
pci-assign,? -device virtio-blk-pci,? -device virtio-net-pci,?) status
unexpected: exit status 1
This patch notices at process start that a guest needs kvm, and checks
for the presence of /dev/kvm (a reasonable indicator that kvm is
available) before trying to execute the qemu binary. If kvm isn't
available, a more useful (too verbose??) error is logged.
Jim Fehlig [Tue, 3 Jan 2012 18:35:06 +0000 (11:35 -0700)]
PolicyKit: Check auth before asking client to obtain it
I previously mentioned [1] a PolicyKit issue where libvirt would
proceed with authentication even though polkit-auth failed:
testusr xen134:~> virsh list --all
Attempting to obtain authorization for org.libvirt.unix.manage.
polkit-grant-helper: given auth type (8 -> yes) is bogus
Failed to obtain authorization for org.libvirt.unix.manage.
Id Name State
----------------------------------
0 Domain-0 running
- sles11sp1-pv shut off
AFAICT, libvirt attempts to obtain a privilege it already has,
causing polkit-auth to fail with above message. Instead of calling
obtain and then checking auth, IMO the workflow should be for the
server to check auth first, and if that fails ask the client to
obtain it and check again. This workflow also allows for checking
only successful exit of polkit-auth in virConnectAuthGainPolkit().
Laine Stump [Thu, 5 Jan 2012 03:48:38 +0000 (22:48 -0500)]
qemu: add new disk device='lun' for bus='virtio' & type='block'
In the past, generic SCSI commands issued from a guest to a virtio
disk were always passed through to the underlying disk by qemu, and
the kernel would also pass them on.
As a result of CVE-2011-4127 (see:
http://seclists.org/oss-sec/2011/q4/536), qemu now honors its
scsi=on|off device option for virtio-blk-pci (which enables/disables
passthrough of generic SCSI commands), and the kernel will only allow
the commands for physical devices (not for partitions or logical
volumes). The default behavior of qemu is still to allow sending
generic SCSI commands to physical disks that are presented to a guest
as virtio-blk-pci devices, but libvirt prefers to disable those
commands in the standard virtio block devices, enabling it only when
specifically requested (hopefully indicating that the requester
understands what they're asking for). For this purpose, a new libvirt
disk device type (device='lun') has been created.
device='lun' is identical to the default device='disk', except that:
1) It is only allowed if bus='virtio', type='block', and the qemu
version is "new enough" to support it ("new enough" == qemu 0.11 or
better), otherwise the domain will fail to start and a
CONFIG_UNSUPPORTED error will be logged).
2) The option "scsi=on" will be added to the -device arg to allow
SG_IO commands (if device !='lun', "scsi=off" will be added to the
-device arg so that SG_IO commands are specifically forbidden).
Guests which continue to use disk device='disk' (the default) will no
longer be able to use SG_IO commands on the disk; those that have
their disk device changed to device='lun' will still be able to use SG_IO
commands.
*docs/formatdomain.html.in - document the new device attribute value.
*docs/schemas/domaincommon.rng - allow it in the RNG
*tests/* - update the args of several existing tests to add scsi=off, and
add one new test that will test scsi=on.
*src/conf/domain_conf.c - update domain XML parser and formatter
*src/qemu/qemu_(command|driver|hotplug).c - treat
VIR_DOMAIN_DISK_DEVICE_LUN *almost* identically to
VIR_DOMAIN_DISK_DEVICE_DISK, except as indicated above.
Note that no support for this new device value was added to any
hypervisor drivers other than qemu, because it's unclear what it might
mean (if anything) to those drivers.
Laine Stump [Tue, 29 Nov 2011 18:37:27 +0000 (13:37 -0500)]
qemu: add capabilities flags related to SG_IO
This patch adds two capabilities flags to deal with various aspects
of supporting SG_IO commands on virtio-blk-pci devices:
QEMU_CAPS_VIRTIO_BLK_SCSI
set if -device virtio-blk-pci accepts the scsi="on|off" option
When present, this is on by default, but can be set to off to disable
SG_IO functions.
QEMU_CAPS_VIRTIO_BLK_SG_IO
set if SG_IO commands are supported in the virtio-blk-pci driver
(present since qemu 0.11 according to a qemu developer, if I
understood correctly)
Laine Stump [Fri, 6 Jan 2012 17:59:47 +0000 (12:59 -0500)]
config: report error when script given for inappropriate interface type
This fixes https://bugzilla.redhat.com/show_bug.cgi?id=638633
Although scripts are not used by interfaces of type other than
"ethernet" in qemu, due to the fact that the parser stores the script
name in a union that is only valid when type is ethernet or bridge,
there is no way for anyone except the parser itself to catch the
problem of specifying an interface script for an inappropriate
interface type (by the time the parsed data gets back to the code that
called the parser, all evidence that a script was specified is
forgotten).
Since the parser itself should be agnostic to which type of interface
allows scripts (an example of why: a script specified for an interface
of type bridge is valid for xen domains, but not for qemu domains),
the solution here is to move the script out of the union(s) in the
DomainNetDef, always populate it when specified (regardless of
interface type), and let the driver decide whether or not it is
appropriate.
Currently the qemu, xen, libxml, and uml drivers recognize the script
parameter and do something with it (the uml driver only to report that
it isn't supported). Those drivers have been updated to log a
CONFIG_UNSUPPORTED error when a script is specified for an interface
type that's inappropriate for that particular hypervisor.
(NB: There was earlier discussion of solving this problem by adding a
VALIDATE flag to all libvirt APIs that accept XML, which would cause
the XML to be validated against the RNG files. One statement during
that discussion was that the RNG shouldn't contain hypervisor-specific
things, though, and a proper solution to this problem would require
that (again, because a script for an interface of type "bridge" is
accepted by xen, but not by qemu).
Eric Blake [Fri, 6 Jan 2012 23:07:34 +0000 (16:07 -0700)]
tests: work around pdwtags 1.9 failure
On rawhide, gcc is new enough to output new DWARF information that
pdwtags has not yet learned, but the resulting 'make check' output
was rather confusing:
$ make -C src check
...
GEN virkeepaliveprotocol-structs
die__process_function: DW_TAG_INVALID (0x4109) @ <0x58c> not handled!
WARNING: your pdwtags program is too old
WARNING: skipping the virkeepaliveprotocol-structs test
WARNING: install dwarves-1.3 or newer
...
$ pdwtags --version
v1.9
I've filed the pdwtags deficiency as
https://bugzilla.redhat.com/show_bug.cgi?id=772358
* src/Makefile.am (PDWTAGS): Don't leave -t file behind on version
mismatch. Soften warning message, since 1.9 is newer than 1.3.
Don't leak stderr from broken version.
Eric Blake [Fri, 6 Jan 2012 21:07:23 +0000 (14:07 -0700)]
tests: avoid test failure on rawhide gnutls
I hit a VERY weird testsuite failure on rawhide, which included
_binary_ output to stderr, followed by a hang waiting for me
to type something! (Here, using ^@ for NUL):
$ ./commandtest
TEST: commandtest
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
WARNING: gnome-keyring:: couldn't send data: Bad file descriptor
.8^@^@^@8^@^@^@^A^@^@^@^Bay^A^@^@^@)PRIVATE-GNOME-KEYRING-PKCS11-PROTOCOL-V-1
I finally traced it to the fact that gnome-keyring, called via
gnutls_global_init which is turn called by virNetTLSInit, opens
an internal fd that it expects to communicate to via a
pthread_atfork handler (never mind that it violates POSIX by
using non-async-signal-safe functions in that handler:
https://bugzilla.redhat.com/show_bug.cgi?id=772320).
Our problem stems from the fact that we pulled the rug out from
under the library's expectations by closing an fd that it had
just opened. While we aren't responsible for fixing the bugs
in that pthread_atfork handler, we can at least avoid the bugs
by not closing the fd in the first place.
* tests/commandtest.c (mymain): Avoid closing fds that were opened
by virInitialize.
Alex Jia [Fri, 6 Jan 2012 06:36:34 +0000 (14:36 +0800)]
qemu: Avoid memory leaks on qemuParseRBDString
Detected by valgrind. Leak introduced in commit 5745dc1.
* src/qemu/qemu_command.c: fix memory leak on failure and successful path.
* How to reproduce?
% valgrind -v --leak-check=full ./qemuargv2xmltest
* Actual result:
==2196== 80 bytes in 1 blocks are definitely lost in loss record 3 of 4
==2196== at 0x4A05FDE: malloc (vg_replace_malloc.c:236)
==2196== by 0x39CF07F6E1: strdup (in /lib64/libc-2.12.so)
==2196== by 0x419823: qemuParseRBDString (qemu_command.c:1657)
==2196== by 0x4221ED: qemuParseCommandLine (qemu_command.c:5934)
==2196== by 0x422AFB: qemuParseCommandLineString (qemu_command.c:7561)
==2196== by 0x416864: testCompareXMLToArgvHelper (qemuargv2xmltest.c:48)
==2196== by 0x417DB1: virtTestRun (testutils.c:141)
==2196== by 0x415CAF: mymain (qemuargv2xmltest.c:175)
==2196== by 0x4174A7: virtTestMain (testutils.c:696)
==2196== by 0x39CF01ECDC: (below main) (in /lib64/libc-2.12.so)
==2196==
==2196== LEAK SUMMARY:
==2196== definitely lost: 80 bytes in 1 blocks
Eric Blake [Thu, 5 Jan 2012 21:21:11 +0000 (14:21 -0700)]
build: drop check for ANSI compiler
Using automake.git (will become 1.12 someday), I got this error:
configure.ac:90: error: automatic de-ANSI-fication support has been removed
/usr/local/share/aclocal-1.11a/protos.m4:13: AM_C_PROTOTYPES is expanded from...
configure.ac:90: the top level
autom4te: /usr/bin/m4 failed with exit status: 1
In short, pre-C89 compilers are no longer a viable portability
target. Besides, our code base already requires C99, so worrying
about pre-C89 seems pointless.
* configure.ac (AM_C_PROTOTYPES): Drop, since newer automake no
longer provides it.
Hu Tao [Wed, 4 Jan 2012 09:41:43 +0000 (17:41 +0800)]
qemu: fix a bug in numatune
When setting numa nodeset for a domain which has no nodeset set
before, libvirtd crashes by dereferencing the pointer to the old
nodemask which is null in that case.
Eric Blake [Wed, 4 Jan 2012 23:01:24 +0000 (16:01 -0700)]
seclabel: fix regression in libvirtd restart
Commit b434329 has a logic bug: seclabel overrides don't set
def->type, but the default value is 0 (aka static). Restarting
libvirtd would thus reject the XML for any domain with an
override of <seclabel relabel='no'/> (which happens quite
easily if a disk image lives on NFS), with a message:
2012-01-04 22:29:40.949+0000: 6769: error : virSecurityLabelDefParseXMLHelper:2593 : XML error: security label is missing
Fix the logic to never read from an override's def->type, and
to allow a missing <label> subelement when relabel is no. There's
a lot of stupid double-negatives in the code (!norelabel) because
of the way that we want the zero-initialized defaults to behave.
* src/conf/domain_conf.c (virSecurityLabelDefParseXMLHelper): Use
type field from correct location.
command: Discard FD_SETSIZE limit for opened files
Currently, virCommand implementation uses FD_ macros from
sys/select.h. However, those cannot handle more opened files
than FD_SETSIZE. Therefore switch to generalized implementation
based on array of integers.
Jim Fehlig [Wed, 4 Jan 2012 15:47:37 +0000 (08:47 -0700)]
Support Xen domctl v8
xen-unstable c/s 23874:651aed73b39c added another member to
xen_domctl_getdomaininfo struct and bumped domctl version to 8.
Add a corresponding domctl v8 struct in xen hypervisor sub-driver
and detect domctl v8 during initialization.
Jim Fehlig [Tue, 3 Jan 2012 22:39:59 +0000 (15:39 -0700)]
Fix xenstore serial console path for HVM guests
The console path in xenstore is /local/domain/<id>/console/tty
for PV guests (PV console) and /local/domain/<id>/serial/0/tty
(serial console) for HVM guests. Similar to Xen's in-tree console
client, read the correct path for PV vs HVM.
It is a good practise to set revents to zero before doing any poll().
Moreover, we should check if event we waited for really occurred or
if any of fds we were polling on didn't encountered hangup.
Eric Blake [Mon, 2 Jan 2012 21:35:12 +0000 (14:35 -0700)]
domiftune: clean up previous patches
Most severe here is a latent (but currently untriggered) memory leak
if any hypervisor ever adds a string interface property; the
remainder are mainly cosmetic.
Peter Krempa [Mon, 2 Jan 2012 16:07:31 +0000 (17:07 +0100)]
virsh: Fix checking for reconnect conditions
virshReportError() function frees the most recent error reported from
libvirt. Condition that checks if connection to the daemon was broken
during last command was then limited to check for SIGPIPE signal not
taking into account possible errors signalized without SIGPIPE.
This patch moves the check before the error is freed, to take into
account code that does not emit SIGPIPE while failing.
* tools/virsh.c: - move check for broken connection before error print.
Michal Novotny [Mon, 2 Jan 2012 14:23:54 +0000 (15:23 +0100)]
Implement DNS SRV record into the bridge driver
Hi,
this is the fifth version of my SRV record for DNSMasq patch rebased
for the current codebase to the bridge driver and libvirt XML file to
include support for the SRV records in the DNS. The syntax is based on
DNSMasq man page and tests for both xml2xml and xml2argv were added as
well. There are some things written a better way in comparison with
version 4, mainly there's no hack in tests/networkxml2argvtest.c and
also the xPath context is changed to use a simpler query using the
virXPathInt() function relative to the current node.
Also, the patch is also fixing the networkxml2argv test to pass both
checks, i.e. both unit tests and also syntax check.
Please review,
Michal
Signed-off-by: Michal Novotny <minovotn@redhat.com>
Daniel Veillard [Fri, 30 Dec 2011 06:15:26 +0000 (14:15 +0800)]
Fix build on s390(x) and other stange arches
The blocks to extract node information on a per-arch
basis wasn't well balanced leading to a compilation
failure if not on one of the handled arches (PCs and PPCs)
Eric Blake [Fri, 23 Dec 2011 15:31:51 +0000 (08:31 -0700)]
seclabel: honor device override in selinux
This wires up the XML changes in the previous patch to let SELinux
labeling honor user overrides, as well as affecting the live XML
configuration in one case where the user didn't specify anything
in the offline XML.
I noticed that the logs contained messages like this:
for all my domain images living on NFS. But if we would just remember
that on domain creation that we were unable to set a SELinux label (due to
NFSv3 lacking labels, or NFSv4 not being configured to expose attributes),
then we could avoid wasting the time trying to clear the label on
domain shutdown. This in turn is one less point of NFS failure,
especially since there have been documented cases of virDomainDestroy
hanging during an attempted operation on a failed NFS connection.
* src/security/security_selinux.c (SELinuxSetFilecon): Move guts...
(SELinuxSetFileconHelper): ...to new function.
(SELinuxSetFileconOptional): New function.
(SELinuxSetSecurityFileLabel): Honor override label, and remember
if labeling failed.
(SELinuxRestoreSecurityImageLabelInt): Skip relabeling based on
override.
Eric Blake [Fri, 23 Dec 2011 00:47:50 +0000 (17:47 -0700)]
seclabel: allow a seclabel override on a disk src
Implement the parsing and formatting of the XML addition of
the previous commit. The new XML doesn't affect qemu command
line, so we can now test round-trip XML->memory->XML handling.
I chose to reuse the existing structure, even though per-device
override doesn't use all of those fields, rather than create a
new structure, in order to reuse more code.
* src/conf/domain_conf.h (_virDomainDiskDef): Add seclabel member.
* src/conf/domain_conf.c (virDomainDiskDefFree): Free it.
(virSecurityLabelDefFree): New function.
(virDomainDiskDefFormat): Print it.
(virSecurityLabelDefFormat): Reduce output if model not present.
(virDomainDiskDefParseXML): Alter signature, and parse seclabel.
(virSecurityLabelDefParseXML): Split...
(virSecurityLabelDefParseXMLHelper): ...into new helper.
(virDomainDeviceDefParse, virDomainDefParseXML): Update callers.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-dynamic-override.args:
New file.
* tests/qemuxml2xmltest.c (mymain): Enhance test.
* tests/qemuxml2argvtest.c (mymain): Likewise.
Eric Blake [Fri, 23 Dec 2011 00:47:49 +0000 (17:47 -0700)]
seclabel: extend XML to allow per-disk label overrides
When doing security relabeling, there are cases where a per-file
override might be appropriate. For example, with a static label
and relabeling, it might be appropriate to skip relabeling on a
particular disk, where the backing file lives on NFS that lacks
the ability to track labeling. Or with dynamic labeling, it might
be appropriate to use a custom (non-dynamic) label for a disk
specifically intended to be shared across domains.
The new XML resembles the top-level <seclabel>, but with fewer
options (basically relabel='no', or <label>text</label>):
<domain ...>
...
<devices>
<disk type='file' device='disk'>
<source file='/path/to/image1'>
<seclabel relabel='no'/> <!-- override for just this disk -->
</source>
...
</disk>
<disk type='file' device='disk'>
<source file='/path/to/image1'>
<seclabel relabel='yes'> <!-- override for just this disk -->
<label>system_u:object_r:shared_content_t:s0</label>
</seclabel>
</source>
...
</disk>
...
</devices>
<seclabel type='dynamic' model='selinux'>
<baselabel>text</baselabel> <!-- used for all devices without override -->
</seclabel>
</domain>
This patch only introduces the XML and documentation; future patches
will actually parse and make use of it. The intent is that we can
further extend things as needed, adding a per-device <seclabel> in
more places (such as the source of a console device), and possibly
allowing a <baselabel> instead of <label> for labeling where we want
to reuse the cNNN,cNNN pair of a dynamically labeled domain but a
different base label.
First suggested by Daniel P. Berrange here:
https://www.redhat.com/archives/libvir-list/2011-December/msg00258.html
* docs/schemas/domaincommon.rng (devSeclabel): New define.
(disk): Use it.
* docs/formatdomain.html.in (elementsDisks, seclabel): Document
the new XML.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-dynamic-override.xml:
New test, to validate RNG.
Eric Blake [Fri, 23 Dec 2011 00:47:46 +0000 (17:47 -0700)]
schema: rewrite seclabel rng to match code
The RNG for <seclabel> was too strict - if it was present, then it
had to have sub-elements, even if those didn't make sense for the
given attributes. Also, we didn't have any tests of <seclabel>
parsing or XML output.
In this patch, I added more parsing tests than output tests (since
the output populates and/or reorders fields not present in certain
inputs). Making the RNG reliable is a precursor to using <seclabel>
variants in more places in the XML in later patches.
See also:
http://berrange.com/posts/2011/09/29/two-small-improvements-to-svirt-guest-configuration-flexibility-with-kvmlibvirt/
* docs/schemas/domaincommon.rng (seclabel): Tighten rules.
* tests/qemuxml2argvtest.c (mymain): New tests.
* tests/qemuxml2xmltest.c (mymain): Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*.*: New files.
Hu Tao [Thu, 29 Dec 2011 07:33:18 +0000 (15:33 +0800)]
domiftune: Add support of new APIs to the remote driver
* daemon/remote.c: implement the server side support
* src/remote/remote_driver.c: implement the client side support
* src/remote/remote_protocol.x: definitions for the new entry points
* src/remote_protocol-structs: structure definitions
Hu Tao [Thu, 29 Dec 2011 07:33:16 +0000 (15:33 +0800)]
domiftune: Add API virDomain{S,G}etInterfaceParameters
The APIs are used to set/get domain's network interface's parameters.
Currently supported parameters are bandwidth settings.
* include/libvirt/libvirt.h.in: new API and parameters definition
* python/generator.py: skip the Python API generation
* src/driver.h: add new entry to the driver structure
* src/libvirt_public.syms: export symbols
Daniel Veillard [Thu, 29 Dec 2011 08:20:00 +0000 (16:20 +0800)]
remove a static limit on max domains in python bindings
* python/libvirt-override.c: remove the predefined array in the
virConnectListDomainsID binding and call virConnectNumOfDomains
to do a proper allocation
Alex Jia [Thu, 29 Dec 2011 05:22:52 +0000 (13:22 +0800)]
python: Fix problems of virDomain{Set, Get}BlockIoTune bindings
The parameter 'params' is useless for virDomainGetBlockIoTune API,
and the return value type should be a virTypedParameterPtr but not
integer. And "PyArg_ParseTuple" in functions
libvirt_virDomain{Set,Get}BlockIoTune misses format unit for "format"
argument.
* libvirt-override-api.xml: Remove useless the parameter 'params'
from virDomainGetBlockIoTune API, and change return value type from
integer to virTypedParameterPtr.
* python/libvirt-override.c: Add the missed format units.
We had two nested loops both trying to use 'i' as the iteration
variable, which can result in an infinite loop when the inner
loop interferes with the outer loop. Introduced in commit 93ab585.
* src/qemu/qemu_driver.c (qemuDomainSetBlkioParameters): Don't
reuse iteration variable across two loops.
Eric Blake [Thu, 22 Dec 2011 08:28:04 +0000 (16:28 +0800)]
daemon: clean up daemonization
Valgrind detected a pipe fd leak before the parent exits on success,
introduced in commit 4296cea; by itself, the leak is not bad, since
we immediately called _exit(), but we might as well be clean to make
valgrind analysis easier. Meanwhile, if the daemon grandchild detects
an error, the parent failed to flush the error message before exiting.
Also, we had the possibility of both parent and child returning to the
caller, such that the user could see duplicated reports of failure
from the two return paths. And we might as well be robust to the
(unlikely) situation of being started with stdin closed.
* daemon/libvirtd.c (daemonForkIntoBackground): Use exit if an
error message was generated, avoid fd leaks for valgrind's sake,
avoid returning to caller in both parent and child, and don't
close a just-dup'd stdin.
Based on a report by Alex Jia.
* How to reproduce?
% service libvirtd stop
% valgrind -v --track-fds=yes /usr/sbin/libvirtd --daemon
* Actual valgrind result:
==16804== FILE DESCRIPTORS: 7 open at exit.
==16804== Open file descriptor 7:
==16804== at 0x321FAD8B87: pipe (in /lib64/libc-2.12.so)
==16804== by 0x41F34D: daemonForkIntoBackground (libvirtd.c:186)
==16804== by 0x4207A0: main (libvirtd.c:1420)
Signed-off-by: Alex Jia <ajia@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Satoru SATOH [Sun, 25 Dec 2011 17:20:03 +0000 (02:20 +0900)]
docs: Move 'echo' command description into the generic commands section
Virsh's echo command looks not having any relations with domains and its
description should go into the generic commands section instead of the
domain commands section (current).
Satoru SATOH [Sun, 25 Dec 2011 17:08:11 +0000 (02:08 +0900)]
docs: Move 'send-key' command description into the domain commands section
Virsh's send-key command manipulates domains and its description should
go into the domain commands section instead of generic commands section
(current).
Eric Blake [Thu, 22 Dec 2011 19:00:45 +0000 (12:00 -0700)]
tests: fix schema checks sorting
Commit 6fdbce12 attempted to sort the list of tests, but failed
(without quotes, echo merges all the tests into a single line,
so there was nothing to sort).
* tests/schematestutils.sh: Fix thinko in previous patch.
Michal Privoznik [Thu, 22 Dec 2011 11:22:31 +0000 (12:22 +0100)]
qemu: Support for overriding NOFILE limit
This patch adds max_files option to qemu.conf which can be used to
override system default limit on number of opened files that are
allowed for qemu user.
Michal Privoznik [Tue, 20 Dec 2011 13:04:28 +0000 (14:04 +0100)]
virsh: Move job watch code to a separate function
called vshWatchJob. This can be later used in other
job oriented commands like dump, save, managedsave
to report progress and allow user to cancel via ^C.
Michal Privoznik [Thu, 22 Dec 2011 10:00:05 +0000 (11:00 +0100)]
qemuhelptest: Add new qemuCap flag
Latest patch a1a83c587443 introduces new qemu capability flag
QEMU_CAPS_FSDEV_READONLY. However, it was missing in qemuhelptest
making test for qemu-1.0 fail.
Stefan Berger [Wed, 21 Dec 2011 15:54:47 +0000 (10:54 -0500)]
nwfilter: Do not require DHCP requests to be broadcasted
Remove the requirement that DHCP messages have to be broadcasted.
DHCP requests are most often sent via broadcast but can be directed
towards a specific DHCP server. For example 'dhclient' takes '-s <server>'
as a command line parameter thus allowing DHCP requests to be sent to a
specific DHCP server.
Michael Ellerman [Mon, 12 Dec 2011 23:39:31 +0000 (10:39 +1100)]
Add address type for SPAPR VIO devices
For QEMU PPC64 we have a machine type ("pseries") which has a virtual
bus called "spapr-vio". We need to be able to create devices on this
bus, and as such need a way to specify the address for those devices.
This patch adds a new address type "spapr-vio", which achieves this.
The addressing is specified with a "reg" property in the address
definition. The reg is optional, if it is not specified QEMU will
auto-assign an address for the device.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>