Stefan Berger [Fri, 1 Jun 2012 23:32:06 +0000 (19:32 -0400)]
nwfilter: Add multiple IP address support to DHCP snooping
With support for multiple IP addresses per interface in place, this patch
now adds support for multiple IP addresses per interface for the DHCP
snooping code.
Testing:
Since the infrastructure I tested this with does not provide multiple IP
addresses per MAC address (anymore), I either had to plug the VM's interface
from the virtual bride connected directly to the infrastructure to virbr0
to get a 2nd IP address from dnsmasq (kill and run dhclient inside the VM)
or changed the lease file (/var/run/libvirt/network/nwfilter.leases) and
restart libvirtd to have a 2nd IP address on an existing interface.
Note that dnsmasq can take a lease timeout parameter as part of the --dhcp-range
command line parameter, so that timeouts can be tested that way
(--dhcp-range 192.168.122.2,192.168.122.254,120). So, terminating and restarting
dnsmasq with that parameter is another choice to watch an IP address disappear
after 120 seconds.
Stefan Berger [Fri, 1 Jun 2012 23:32:06 +0000 (19:32 -0400)]
nwfilter: move code for IP address map into separate file
The goal of this patch is to prepare for support for multiple IP
addresses per interface in the DHCP snooping code.
Move the code for the IP address map that maps interface names to
IP addresses into their own file. Rename the functions on the way
but otherwise leave the code as-is. Initialize this new layer
separately before dependent layers (iplearning, dhcpsnooping)
and shut it down after them.
Stefan Berger [Fri, 1 Jun 2012 23:32:06 +0000 (19:32 -0400)]
nwfilter: add DHCP snooping
This patch adds DHCP snooping support to libvirt. The learning method for
IP addresses is specified by setting the "CTRL_IP_LEARNING" variable to one of
"any" [default] (existing IP learning code), "none" (static only addresses)
or "dhcp" (DHCP snooping).
Active leases are saved in a lease file and reloaded on restart or HUP.
The following interface XML activates and uses the DHCP snooping:
All filters containing the variable 'IP' are automatically adjusted when
the VM receives an IP address via DHCP. However, multiple IP addresses per
interface are silently ignored in this patch, thus only supporting one IP
address per interface. Multiple IP address support is added in a later
patch in this series.
Signed-off-by: David L Stevens <dlstevens@us.ibm.com> Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Marti Raudsepp [Fri, 1 Jun 2012 16:25:33 +0000 (19:25 +0300)]
qemu: move -name arg to be 1st in "ps x" output
Currently, monitoring QEMU virtual machines with standard Unix
sysadmin tools is harder than it has to be. The QEMU command line is
often miles long and mostly redundant, it's hard to tell which process
is which.
This patch reorders the QEMU -name argument to be the first, so it's
immediately visible in "ps x", htop and "atop -c" output.
The problem is that an interface with type='hostdev' will have an
alias of the form "hostdev%d", while the function that looks through
existing netdevs to determine the name to use for a new addition will
fail if there's an existing entry that does not match the form
"net%d".
This is another of the handful of places that need an exception due to
the hybrid nature of <interface type='hostdev'> (which is not exactly
an <interface> or a <hostdev>, but is both at the same time).
Wen Congyang [Wed, 30 May 2012 09:20:44 +0000 (17:20 +0800)]
qemu: avoid closing fd more than once
If we migrate to fd, spec->fwdType is not MIGRATION_FWD_DIRECT,
we will close spec->dest.fd.local in qemuMigrationRun(). So we
should set spec->dest.fd.local to -1 in qemuMigrationRun().
Eric Blake [Wed, 30 May 2012 15:20:37 +0000 (09:20 -0600)]
fdstream: avoid double close bug
Wen Congyang reported that we have a double-close bug if we fail
virFDStreamOpenInternal, since childfd duplicated one of the fds[]
array contents. In truth, since we always transfer both members
of fds to other variables, we should close the fds through those
other names, and just use fds[] for pipe().
Eric Blake [Tue, 29 May 2012 23:47:58 +0000 (17:47 -0600)]
command: avoid double close bugs
KAMEZAWA Hiroyuki reported a nasty double-free bug when virCommand
is used to convert a string into input to a child command. The
problem is that the poll() loop of virCommandProcessIO would close()
the write end of the pipe in order to let the child see EOF, then
the caller virCommandRun() would also close the same fd number, with
the second close possibly nuking an fd opened by some other thread
in the meantime. This in turn can have all sorts of bad effects.
The bug has been present since the introduction of virCommand in
commit f16ad06f.
This is based on his first attempt at a patch, at
https://bugzilla.redhat.com/show_bug.cgi?id=823716
* src/util/command.c (_virCommand): Drop inpipe member.
(virCommandProcessIO): Add argument, to avoid closing caller's fd
without informing caller.
(virCommandRun, virCommandNewArgs): Adjust clients.
Fixes for check and rpm builds without sanlock (and qemu)
Apart from the non-sanlock check build, there is also a little fix for
qemu (EXTRA_DIST had qemu.conf and others inside even if the build was
supposed to be without qemu).
Eric Blake [Tue, 29 May 2012 21:57:31 +0000 (15:57 -0600)]
build: use same perl binary throughout build
Some of our rules used $(PERL), while others used 'perl'. Always
using the variable allows a developer to point to a different (often
better) perl than the default one found on $PATH.
Eric Blake [Tue, 29 May 2012 21:49:13 +0000 (15:49 -0600)]
build: fix testing of augeas files in VPATH builds
Without this fix, a VPATH build (such as used by ./autobuild.sh)
fails with messages like:
make[3]: Entering directory `/home/remote/eblake/libvirt-tmp2/build/daemon'
../../build-aux/augeas-gentest.pl libvirtd.conf ../../daemon/test_libvirtd.aug.in test_libvirtd.aug
cannot read libvirtd.conf: No such file or directory at ../../build-aux/augeas-gentest.pl line 38.
Since the test files are not part of the tarball, we can generate
them into the build dir, but rather than create a subdirectory
just for the test file, it is easier to test them directly in
libvirt.git/src.
* daemon/Makefile.am (AUG_GENTEST): Factor out definition.
(test_libvirtd.aug): Look for correct file.
* src/Makefile.am (AUG_GENTEST): Use $(PERL).
(qemu/test_libvirtd_qemu.aug, lxc/test_libvirtd_lxc.aug)
(locking/test_libvirt_sanlock.aug): Rename to avoid subdirectories.
(check-augeas-qemu, check-augeas-lxc, check-augeas-sanlock): Reflect
location of built tests.
* configure.ac (PERL): Substitute perl.
Eric Blake [Tue, 29 May 2012 14:10:44 +0000 (08:10 -0600)]
build: silence warning from autoconf
Autoconf 2.60 and later insist on using ${datarootdir}, rather than
the derived ${datadir} (although the latter defaults to the former,
it is possible to set configure arguments so that they differ):
config.status: creating libvirt.pc
config.status: WARNING: 'libvirt.pc.in' seems to ignore the --datarootdir setting
This patch follows the autoconf manual's suggestions for how to
support 2.59 (RHEL 5) and newer simultaneously.
* libvirt.pc.in (datarootdir): Define, so ${datadir} will not ignore
datarootdir when using newer autoconf.
Michal Privoznik [Wed, 30 May 2012 12:17:26 +0000 (14:17 +0200)]
virCommand: Extend debug message for handshake
Currently, we are logging only one side of pipes we
create in virCommandRequireHandshake(); This is enough
in cases where pipe2() returns two consecutive FDs. However,
it is not guaranteed and it may return any FDs.
Therefore, it's wise to log the other ends as well.
When getting number of CPUs the host has assigned, there was always
number "1" returned. Even though all lxc domains with no pinning
launched by libvirt run on all pCPUs (by default, no matter what's the
number), we should at least return the same number as the user
specified when creating the domain.
I added libvirt_qemu_probes.h into BUILT_SOURCES. That makes it
generated, but most probably it is not the clearest way how to do
that, but it fixes the build.
Dave Allan [Tue, 29 May 2012 22:49:13 +0000 (16:49 -0600)]
examples: add consolecallback example python script
A while back I wrote the attached code to demonstrate how to use
events and serial console to create a serial console that stays up
even when the VM is down. It might need some work, as I am not
terribly strong with Python.
* examples/python/consolecallback.py: New file.
* examples/python/Makefile.am (EXTRA_DIST): Ship it.
Eric Blake [Tue, 29 May 2012 20:58:56 +0000 (14:58 -0600)]
build: don't lose probes.o files
The previous patch fixed an incremental build, but missed that on
a fresh checkout, we now have nothing left that stops make from
nuking libvirt_qemu_probes.o.
* src/Makefile.am ($(libvirt_driver_qemu_la_SOURCES)): Delete,
since this variable is empty.
(.PRECIOUS): Add %_probes.o, so they don't get nuked as an
intermediate by-product after creating %_probes.lo.
Eric Blake [Tue, 29 May 2012 18:41:06 +0000 (12:41 -0600)]
build: fix missing dependencies for libvirt-qemu.so
The moment you specify a _DEPENDENCIES, older automake (stupidly)
assumes that you will specify _all_ dependencies for that target.
This stupidity has been fixed in automake 1.12, but we cannot rely on
newer automake everywhere. For libvirt_la_DEPENDENCIES, we took
care of providing the full list, but for libvirt_qemu_la_DEPENDENCIES,
we were missing the dependency on libvirt_qemu_impl.la, which resulted
in a failed build:
make[3]: Entering directory `/home/ajia/Workspace/libvirt/src'
CCLD libvirt_driver_qemu.la
libtool: link: `libvirt_qemu_probes.lo' is not a valid libtool object
* src/Makefile.am (libvirt_driver_qemu_la_DEPENDENCIES): Delete;
automake does a better job if it does the entire job.
Eric Blake [Tue, 29 May 2012 14:34:56 +0000 (08:34 -0600)]
virsh: avoid strncpy
strncpy is generally evil - it runs the risk of missing NUL
termination, and more often than not wastes time zeroing way
more bytes than strictly necessary. We've avoided this evil
in our virStrncpy wrapper, except for places where we forgot
to use the wrapper; meanwhile, we have also added an even
higher layer wrapper for setting virTypedParameter values.
* tools/virsh.c (cmdMemtune, cmdBlkdeviotune): Use modern API.
* cfg.mk (exclude_file_name_regexp--sc_prohibit_strncpy): Tighten.
Eric Blake [Mon, 28 May 2012 12:48:26 +0000 (06:48 -0600)]
build: update to latest gnulib
Gnulib finally relaxed the isatty license, needed as first mentioned here:
https://www.redhat.com/archives/libvir-list/2012-February/msg01022.html
Other improvements include better syntax-check rules (we can delete one
of ours now that it is a duplicate) and better compiler warning usage.
* .gnulib: Update to latest, for isatty.
* cfg.mk (sc_prohibit_strncpy): Drop a now-redundant rule.
* bootstrap.conf (gnulib_modules): Add isatty.
* bootstrap: Resync from gnulib.
Stefan Berger [Tue, 29 May 2012 10:25:59 +0000 (06:25 -0400)]
leak_fix.diff
==3240== 23 bytes in 1 blocks are definitely lost in loss record 242 of 744
==3240== at 0x4C2A4CD: malloc (vg_replace_malloc.c:236)
==3240== by 0x8077537: __vasprintf_chk (vasprintf_chk.c:82)
==3240== by 0x509C677: virVasprintf (stdio2.h:199)
==3240== by 0x509C733: virAsprintf (util.c:1912)
==3240== by 0x1906583A: qemudStartup (qemu_driver.c:679)
==3240== by 0x511991D: virStateInitialize (libvirt.c:809)
==3240== by 0x40CD84: daemonRunStateInit (libvirtd.c:751)
==3240== by 0x5098745: virThreadHelper (threads-pthread.c:161)
==3240== by 0x7953D8F: start_thread (pthread_create.c:309)
==3240== by 0x805FF5C: clone (clone.S:115)
Eric Blake [Fri, 25 May 2012 19:18:10 +0000 (13:18 -0600)]
build: silence libtool during tests
Libtool is picky about linking against a module library (aka a .so);
giving lots of warnings like this in the tests directory:
CCLD networkxml2argvtest
*** Warning: Linking the executable networkxml2argvtest against the loadable module
*** libvirt_driver_network.so is not portable!
Fix that by splitting things into a convenience library which can
be used directly by the tests, and making the real .so just wrap
the convenience library.
Based on a suggestion by Daniel P. Berrange.
* configure.ac (--with-driver-modules): Fix help test.
* src/Makefile.am (libvirt_driver_xen.la, libvirt_driver_libxl.la)
(libvirt_driver_qemu.la, libvirt_driver_lxc.la)
(libvirt_driver_uml.la): Factor into new convenience libraries.
* tests/Makefile.am (xen_LDADDS, qemu_LDADDS, lxc_LDADDS)
(networkxml2argvtest_LDADD): Link to convenience libraries, not
shared libraries.
There was no rule forcing libvirt_qemu_probes.o to be built
before libvirt_qemu_probes.lo was used. Also libvirtd was
still referencing the .o file, rather than the .lo file.
Both the .lo and .o file must be listed as DEPENDENCIES,
otherwise libtool will unhelpfully delete the .o file
once the .lo file is created.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The CoTaskMemFree function requires the ole32 DLL to be
linked against. Currently this is only done for the
VirtualBox driver. Also add it to libvirt_util.la
* configure.ac: Unconditionally add ole32 DLL to Win32
* src/Makefile.am: Link old32 to libvirt_util.la
Autogenerate augeas test case from default config files
When adding new config file parameters, the corresponding
additions to the augeas lens' are constantly forgotten.
Also there are augeas test cases, these don't catch the
error, since they too are never updated.
To address this, the augeas test cases need to be auto-generated
from the example config files.
* build-aux/augeas-gentest.pl: Helper to generate an
augeas test file, substituting in elements from the
example config files
* src/Makefile.am, daemon/Makefile.am: Switch to
auto-generated augeas test cases
* daemon/test_libvirtd.aug, daemon/test_libvirtd.aug.in,
src/locking/test_libvirt_sanlock.aug,
src/locking/test_libvirt_sanlock.aug.in,
src/lxc/test_libvirtd_lxc.aug,
src/lxc/test_libvirtd_lxc.aug.in,
src/qemu/test_libvirtd_qemu.aug,
src/qemu/test_libvirtd_qemu.aug.in: Remove example
config file data, replacing with a ::CONFIG:: placeholder
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an impl of +virGetUserRuntimeDirectory, virGetUserCacheDirectory
virGetUserConfigDirectory and virGetUserDirectory for Win32 platform.
Also create stubs for non-Win32 platforms which lack getpwuid_r()
In adding these two helpers were added virFileIsAbsPath and
virFileSkipRoot, along with some macros VIR_FILE_DIR_SEPARATOR,
VIR_FILE_DIR_SEPARATOR_S, VIR_FILE_IS_DIR_SEPARATOR,
VIR_FILE_PATH_SEPARATOR, VIR_FILE_PATH_SEPARATOR_S
All this code was adapted from GLib2 under terms of LGPLv2+ license.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove the uid param from virGetUserConfigDirectory,
virGetUserCacheDirectory, virGetUserRuntimeDirectory,
and virGetUserDirectory
These functions were universally called with the
results of getuid() or geteuid(). To make it practical
to port to Win32, remove the uid parameter and hardcode
geteuid()
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When you try to connect to a socket in the abstract namespace,
the error will be ECONNREFUSED for a non-listening daemon. With
the non-abstract namespace though, you instead get ENOENT. Add
a check for this extra errno when auto-spawning the daemon
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Jim Meyering [Sat, 26 May 2012 09:21:47 +0000 (11:21 +0200)]
maint: avoid new automake warning about AM_PROG_CC_STDC
* configure.ac (AM_PROG_CC_STDC): Stop using this macro.
It provokes warnings from newer automake and is superseded by
autoconf's AC_PROG_CC, which we're already using.
Eric Blake [Fri, 25 May 2012 15:57:56 +0000 (09:57 -0600)]
build: silence libtool warning on probes.o
Libtool supports linking directly against .o files on some platforms
(such as Linux), which happens to be the only place where we are
actually doing that (for the dtrace-generated probes.o files). However,
it raises a big stink about the non-portability, even though we don't
attempt it on platforms where it would actually fail:
CCLD libvirt_driver_qemu.la
*** Warning: Linking the shared library libvirt_driver_qemu.la against
the non-libtool
*** objects libvirt_qemu_probes.o is not portable!
This shuts libtool up by creating a proper .lo file that matches
what libtool normally expects.
* src/Makefile.am (%_probes.lo): New rule.
(libvirt_probes.stp, libvirt_qemu_probes.stp): Simplify into...
(%_probes.stp): ...shorter rule.
(CLEANFILES): Clean new .lo files.
(libvirt_la_BUILT_LIBADD, libvirt_driver_qemu_la_LIBADD)
(libvirt_lxc_LDADD, virt_aa_helper_LDADD): Link against .lo file.
* tests/Makefile.am (PROBES_O, qemu_LDADDS): Likewise.
Add parsing for VIR_ENUM_IMPL & VIR_ENUM_DECL in apibuild.py
The apibuild.py parser needs to be able to parse & ignore
any VIR_ENUM_IMPL/VIR_ENUM_DECL macros in the source. Add
some special case code to deal with this rather than trying
to figure out a generic syntax for parsing macros.
* apibuild.py: Special case VIR_ENUM_IMPL & VIR_ENUM_DECL
Turn on loadable modules for libvirtd. Add new sub-RPMs
libvirt-daemon-driver-XXX, one for each loadable .so.
Modify the libvirt-daemon-YYY RPMs to depend on each of
the individual drivers they required
* libvirt.spec.in: Enable driver modules
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Always enable driver modules for libvirtd, if we have dlopen
available. This allows more modular packaging by distros
and ensures we don't break this config
* configure.ac: Default to enable driver modules
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
* daemon/libvirtd.c: Set custom driver module dir if the current
binary name is 'lt-libvirtd' (indicating execution directly
from GIT checkout)
* src/driver.c, src/driver.h, src/libvirt_driver_modules.syms: Add
virDriverModuleInitialize to allow driver module location to
be changed
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When building as driver modules, it is not possible for the QEMU
driver module to reference the DTrace/SystemTAP probes linked into
the main libvirt.so. Thus we need to move the QEMU probes into a
separate file 'libvirt_qemu_probes.d'. Also rename the existing
file from 'probes.d' to 'libvirt_probes.d' while we're at it
* daemon/Makefile.am, src/internal.h: Include libvirt_probes.h
instead of probes.h
* src/Makefile.am: Add rules for libvirt_qemu_probes.d
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor_json.c,
src/qemu/qemu_monitor_text.c: Include libvirt_qemu_probes.h
* src/libvirt_probes.d: Rename from probes.d
* src/libvirt_qemu_probes.d: QEMU specific probes formerly
in probes.d
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Since we have drivers which depend on each other (ie QEMU/LXC
depend on the network driver APIs), we need to use RTLD_GLOBAL
instead of RTLD_LOCAL. While this pollutes the calling binary
with many more symbols, this is no worse than if we directly
link to the drivers, and this only applies to libvirtd
* src/driver.c: s/RTLD_LOCAL/RTLD_GLOBAL/
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure LXC driver links against libblkid explicitly.
Only libvirt_driver_storage.la links to libblkid currently. If
we are running in a scenario with driver modules, LXC must
directly link to it, since it can't assume the storage driver
is present
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The libvirt_test.la library was introduced to allow test suites
to reference internal-only symbols. These days, nearly every
symbol we care about is in src/libvirt_private.syms, so there
is no need for libvirt_test.la to continue to exist
* src/Makefile.am: Delete libvirt_test.la & add new .syms files
* src/libvirt_private.syms: Export symbols needed by test suite
* tests/Makefile.am: Link to libvirt_test.la. Ensure LXC tests link
to network_driver.la
* src/libvirt_esx.syms, src/libvirt_openvz.syms: Add exports needed
by test suite
The driver modules all use symbols which are defined in libvirt.so.
Thus for loading of modules to work, the binary that libvirt.so
is linked to must export its symbols back to modules. If the
libvirt.so itself is dlopen()d then the RTLD_GLOBAL flag must
be set. Unfortunately few, if any, programming languages use
the RTLD_GLOBAL flag when loading modules :-( This means is it
not practical to use driver modules for any libvirt client side
drivers (OpenVZ, VMWare, Hyper-V, Remote client, test).
This patch changes the build process so only server side drivers
are built as modules (Xen, QEMU, LXC, UML)
* daemon/libvirtd.c: Add missing load of 'interface' driver
* src/Makefile.am: Only build server side drivers as modules
* src/libvirt.c: Don't load any driver modules
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Guido Günther [Wed, 23 May 2012 12:37:10 +0000 (14:37 +0200)]
Introduce virDomainParseScaledValue
and use it for virDomainParseMemory. This allows to parse arbitrary
scaled value, not only memory related values as needed for the
filesystem limits code following later in this series.
Jiri Denemark [Mon, 21 May 2012 14:02:05 +0000 (16:02 +0200)]
Revert "rpc: Discard non-blocking calls only when necessary"
This reverts commit b1e374a7ac56927cfe62435179bf0bba1e08b372, which was
rather bad since I failed to consider all sides of the issue. The main
things I didn't consider properly are:
- a thread which sends a non-blocking call waits for the thread with
the buck to process the call
- the code doesn't expect non-blocking calls to remain in the queue
unless they were already partially sent
Thus, the reverted patch actually breaks more than what it fixes and
clients (which may even be libvirtd during p2p migrations) will likely
end up in a deadlock.
Peter Krempa [Mon, 21 May 2012 14:31:53 +0000 (16:31 +0200)]
qemu_hotplug: Don't free the PCI device structure after hot-unplug
The pciDevice structure corresponding to the device being hot-unplugged
was freed after it was "stolen" from activeList. The pointer was still
used for eg-inactive list. This patch removes the free of the structure
and frees it only if reset fails on the device.
storage backend: Add RBD (RADOS Block Device) support
This patch adds support for a new storage backend with RBD support.
RBD is the RADOS Block Device and is part of the Ceph distributed storage
system.
It comes in two flavours: Qemu-RBD and Kernel RBD, this storage backend only
supports Qemu-RBD, thus limiting the use of this storage driver to Qemu only.
To function this backend relies on librbd and librados being present on the
local system.
The backend also supports Cephx authentication for safe authentication with
the Ceph cluster.
For storing credentials it uses the built-in secret mechanism of libvirt.
Signed-off-by: Wido den Hollander <wido@widodh.nl>
Fix potential events deadlock when unref'ing virConnectPtr
When the last reference to a virConnectPtr is released by
libvirtd, it was possible for a deadlock to occur in the
virDomainEventState functions. The virDomainEventStatePtr
holds a reference on virConnectPtr for each registered
callback. When removing a callback, the virUnrefConnect
function is run. If this causes the last reference on the
virConnectPtr to be released, then virReleaseConnect can
be run, which in turns calls qemudClose. This function has
a call to virDomainEventStateDeregisterConn which is intended
to remove all callbacks associated with the virConnectPtr
instance. This will try to grab a lock on virDomainEventState
but this lock is already held. Deadlock ensues
Thread 1 (Thread 0x7fcbb526a840 (LWP 23185)):
Since each callback associated with a virConnectPtr holds a
reference on virConnectPtr, it is impossible for the qemudClose
method to be invoked while any callbacks are still registered.
Thus the call to virDomainEventStateDeregisterConn must in fact
be a no-op. Thus it is possible to just remove all trace of
virDomainEventStateDeregisterConn and avoid the deadlock.
Jim Fehlig [Mon, 21 May 2012 15:23:41 +0000 (09:23 -0600)]
Fix build when configuring with polkit0
Commit 2223ea98 removed the only use of 'server' param in
remoteDispatchAuthPolkit(). Mark the parameter with ATTRIBUTE_UNUSED
to fix the build when configuring with polkit0.
Stefan Berger [Mon, 21 May 2012 10:26:34 +0000 (06:26 -0400)]
nwfilter: Add support for ipset
This patch adds support for the recent ipset iptables extension
to libvirt's nwfilter subsystem. Ipset allows to maintain 'sets'
of IP addresses, ports and other packet parameters and allows for
faster lookup (in the order of O(1) vs. O(n)) and rule evaluation
to achieve higher throughput than what can be achieved with
individual iptables rules.
On the command line iptables supports ipset using
iptables ... -m set --match-set <ipset name> <flags> -j ...
where 'ipset name' is the name of a previously created ipset and
flags is a comma-separated list of up to 6 flags. Flags use 'src' and 'dst'
for selecting IP addresses, ports etc. from the source or
destination part of a packet. So a concrete example may look like this:
iptables -A INPUT -m set --match-set test src,src -j ACCEPT
Since ipset management is quite complex, the idea was to leave ipset
management outside of libvirt but still allow users to reference an ipset.
The user would have to make sure the ipset is available once the VM is
started so that the iptables rule(s) referencing the ipset can be created.
Using XML to describe an ipset in an nwfilter rule would then look as
follows:
Eric Blake [Fri, 18 May 2012 15:42:25 +0000 (09:42 -0600)]
build: fix virnetlink on glibc 2.11
We were being lazy - virnetlink.c was getting uint32_t as a
side-effect from glibc 2.14's <unistd.h>, but older glibc 2.11
does not provide uint32_t from <unistd.h>. In fact, POSIX states
that <unistd.h> need only provide intptr_t, not all of <stdint.h>,
so the bug really is ours. Reported by Jonathan Alescio.
Hu Tao [Wed, 9 May 2012 08:41:38 +0000 (16:41 +0800)]
Adds support to param 'vcpu_time' in qemu_driver.
This involves setting the cpuacct cgroup to a per-vcpu granularity,
as well as summing the each vcpu accounting into a common array.
Now that we are reading more than one cgroup file, we double-check
that cpus weren't hot-plugged between reads to invalidate our
summing.
Michal Privoznik [Thu, 17 May 2012 11:40:52 +0000 (13:40 +0200)]
qemu: Don't delete USB device on failed qemuPrepareHostdevUSBDevices
If qemuPrepareHostdevUSBDevices fail it will roll back devices added
to the driver list of used devices. However, if it may fail because
the device is being used already. But then again - with roll back.
Therefore don't try to remove a usb device manually if the function
fail. Although, we want to remove the device if any operation
performed afterwards fail.
Michal Privoznik [Wed, 16 May 2012 14:42:02 +0000 (16:42 +0200)]
qemu: Rollback on used USB devices
One of our latest USB device handling patches 05abd1507d66aabb6cad12eeafeb4c4d1911c585 introduced a regression.
That is, we first create a temporary list of all USB devices that
are to be used by domain just starting up. Then we iterate over and
check if a device from the list is in the global list of currently
assigned devices (activeUsbHostdevs). If not, we add it there and
continue with next iteration then. But if a device from temporary
list is either taken already or adding to the activeUsbHostdevs fails,
we remove all devices in temp list from the activeUsbHostdevs list.
Therefore, if a device is already taken we remove it from
activeUsbHostdevs even if we should not. Thus, next time we allow
the device to be assigned to another domain.
Most versions of libselinux do not contain the function
selinux_lxc_contexts_path() that the security driver
recently started using for LXC. We must add a conditional
check for it in configure and then disable the LXC security
driver for builds where libselinux lacks this function.
* configure.ac: Check for selinux_lxc_contexts_path
* src/security/security_selinux.c: Disable LXC security
if selinux_lxc_contexts_path() is missing
Remount cgroups controllers after setting up new /sys in LXC
Normal practice is for cgroups controllers to be mounted at
/sys/fs/cgroup. When setting up a container, /sys is mounted
with a new sysfs instance, thus we must re-mount all the
cgroups controllers. The complexity is that we must mount
them in the same layout as the host OS. ie if 'cpu' and 'cpuacct'
were mounted at the same location in the host we must preserve
this in the container. Also if any controllers are co-located
we must setup symlinks from the individual controller name to
the co-located mount-point
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Trim /proc & /sys subtrees before mounting new instances
Both /proc and /sys may have sub-mounts in them from the host
OS. We must explicitly unmount them all before mounting the
new instance over that location. If we don't then /proc/mounts
will show the sub-mounts as existing, even though nothing will
be able to access them, due to the over-mount.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently to make sysfs readonly, we remount the existing
instance and then bind it readonly. Unfortunately this means
sysfs is still showing device objects wrt the host OS namespace.
We need it to reflect the container namespace, so we must mount
a completely new instance of it. Do the same for selinuxfs since
there is no benefit to bind mounting & this lets us simplify
the code.
* src/lxc/lxc_container.c: Mount fresh sysfs instance
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>