Stefan Berger [Fri, 30 Apr 2010 12:10:12 +0000 (08:10 -0400)]
Syncronize the teardown of rules with the thread
Introduce a function to notify the IP address learning
thread to terminate and thus release the lock on the interface.
Notify the thread before grabbing the lock on the interface
and tearing down the rules. This prevents a 'virsh destroy' to
tear down the rules that the IP address learning thread has
applied.
Stefan Berger [Fri, 30 Apr 2010 12:06:18 +0000 (08:06 -0400)]
Clean all tables before applying 'basic' rules
The functions invoked by the IP address learning thread
that apply some basic filtering rules did not clean up
any previous filtering rules that may still be there
(due to a libvirt restart for example). With the
patch below all the rules are cleaned up first.
Also, I am introducing a function to drop all traffic
in case the IP address learning thread could not apply
the rules.
Stefan Berger [Fri, 30 Apr 2010 11:51:47 +0000 (07:51 -0400)]
nwfilter: Also pick IP address from a DHCP ACK message
The local DHCP server on virtbr0 sends DHCP ACK messages when a VM is
started and requests an IP address while the initial DHCP lease on the
VM's MAC address hasn't expired. So, also pick the IP address of the VM
if that type of message is seen.
Thanks to Gerhard Stenzel for providing a test case for this.
Changes from V1 to V2:
- cleanup: replacing DHCP option numbers through constants
Ubuntu's gntls package generates an Issuer line that looks like this:
Issuer: C=US,ST=NY,L=Rochester,O=example.com,CN=example.com CA,EMAIL=hostmaster@example.com
While Red Hat's looks like this
Issuer: CN=Red Hat Emerging Technologies
Note the leading whitespace, and the additional fields in the former.
This patch updates the regular expression to:
* trim leading characters before "Issuer:"
* trim anything between Issuer: and CN=
* trim anything after the next ,
I've tested this against the certool output of both RH and Ubuntu
generated certs.
Signed-off-by: Dustin Kirkland <kirkland@canonical.com> Signed-off-by: Eric Blake <eblake@redhat.com>
When using -device syntax, the IO event will have a different
prefix, 'drive-' that needs to be skipped over before matching
against the libvirt disk alias
* src/qemu/qemu_driver.c: Skip QEMU_DRIVE_HOST_PREFIX in IO event
Implement python binding for virDomainGetBlockInfo
This binds the virDomainGetBlockInfo API to python's blockInfo
method on the domain object
>>> c = libvirt.openReadOnly('qemu:///session')
>>> d = c.lookupByName('demo')
>>> f = d.blockInfo("/dev/loop0", 0)
>>> print f
[1048576000L, 104857600L, 104857600L]
* python/libvirt-override-api.xml: Define override signature
* python/generator.py: Skip C impl generator for virDomainGetBlockInfo
* python/libvirt-override.c: Manual impl of virDomainGetBlockInfo
* daemon/remote.c: Server side dispatcher
* daemon/remote_dispatch_args.h, daemon/remote_dispatch_prototypes.h,
daemon/remote_dispatch_ret.h, daemon/remote_dispatch_table.h: Update
with new API
* src/remote/remote_driver.c: Client side dispatcher
* src/remote/remote_protocol.c, src/remote/remote_protocol.h: Update
* src/remote/remote_protocol.x: Define new wire protocol
Add virDomainGetBlockInfo API to query disk sizing
Some applications need to be able to query a guest's disk info,
even for paths not managed by the storage pool APIs. This adds
a very simple API to get this information, modelled on the
virStorageVolGetInfo API, but with an extra field 'physical'.
Normally 'physical' and 'allocation' will be identical, but
in the case of a qcow2-like file stored inside a block device
'physical' will give the block device size, while 'allocation'
will give the qcow2 image size
Chris Lalancette [Wed, 28 Apr 2010 19:50:06 +0000 (15:50 -0400)]
Fix a virsh edit memory leak
When running virsh edit, we are unlinking and setting
the tmp variable to NULL before going to the end of the
function, meaning that we never free tmp. Since the
exit to the function will always unlink and free tmp,
just remove this bit of code and let it get done at the
end.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Chris Lalancette [Wed, 28 Apr 2010 19:49:41 +0000 (15:49 -0400)]
Fix a qemuDomainPCIAddressSetFree memory leak
qemuDomainPCIAddressSetFree was freeing up the hash
table for the pci addresses, but not freeing up the addr
structure. Looking over the callers of this function, it
seems like they expect it to also free up the structure,
so do that here.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Chris Lalancette [Mon, 26 Apr 2010 13:41:07 +0000 (09:41 -0400)]
Fix build on Ubuntu.
When building on Ubuntu with make -j3 (or more), it would always
fail when trying to build virt-aa-helper. I'm not an expert in
automake by any means, but I think the entry for virt-aa-helper
is mis-using LDADD; it shouldn't be putting direct paths to
libvirt_conf.la and libvirt_util.la, but instead referencing those
names. With this patch in place, I'm able to successfully build
on Ubuntu 9.04 with make -j3.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
* src/qemu/qemu_driver.c (qemuDomainSnapshotCreateXML): When setting
"vm" to NULL, jump over vm-dereferencing code to "cleanup".
(qemuDomainRevertToSnapshot): Likewise.
Daniel Veillard [Wed, 28 Apr 2010 13:38:47 +0000 (15:38 +0200)]
Move dnsmasq host file to a separate directory
use /var/lib/libvirt/dnsmasq since /var/lib/libvirt/network is
unreadable by the dnsmasq binary
* src/network/bridge_driver.c: update DNSMASQ_STATE_DIR
* src/Makefile.am: create it on make install
* libvirt.spec.in: take the new directory into account
Fix handling of security driver restore failures in QEMU domain save
In cases where the security driver failed to restore a label after a
guest has saved, we mistakenly jumped to the error cleanup paths.
This is not good, because the operation has in fact completed and
cannot be rolled back completely. Label restore is non-critical, so
just log the problem instead. Also add a missing restore call in
the error cleanup path
* src/qemu/qemu_driver.c: Fix handling of security driver
restore failures in QEMU domain save
Fix QEMU domain save to block devices with cgroups enabled
When cgroups is enabled, access to block devices is likely to be
restricted to a whitelist. Prior to saving a guest to a block device,
it is necessary to add the block device to the whitelist. This is
not required upon restore, since QEMU reads from stdin
* src/qemu/qemu_driver.c: Add block device to cgroups whitelist
if neccessary during domain save.
The save process was relying on use of the shell >> append
operator to ensure the save data was placed after the libvirt
header + XML. This doesn't work for block devices though.
Replace this code with use of 'dd' and its 'seek' parameter.
This means that we need to pad the header + XML out to a
multiple of dd block size (in this case we choose 512).
The qemuMonitorMigateToCommand() monitor API is used for both
save/coredump, and migration via UNIX socket. We can't simply
switch this to use 'dd' since this causes problems with the
migration usage. Thus, create a dedicated qemuMonitorMigateToFile
which can accept an filename + offset, and remove the filename
from the current qemuMonitorMigateToCommand() API
* src/qemu/qemu_driver.c: Switch to qemuMonitorMigateToFile
for save and core dump
* src/qemu/qemu_monitor.c, src/qemu/qemu_monitor.h,
src/qemu/qemu_monitor_json.c, src/qemu/qemu_monitor_json.h,
src/qemu/qemu_monitor_text.c, src/qemu/qemu_monitor_text.h: Create
a new qemuMonitorMigateToFile, separate from the existing
qemuMonitorMigateToCommand to allow handling file offsets
Avoid create/unlink with block devs used for QEMU save
It is possible to use block devices with domain save/restore. Upon
failure QEMU unlinks the path being saved to. This isn't good when
it is a block device !
* src/qemu/qemu_driver.c: Don't unlink block devices if save fails
Fix crash when cleaning up from failed save attempt
If a transient QEMU crashes during save attempt, then the virDomainPtr
object may be freed. If a persistent QEMU crashes during save, then
the 'priv->mon' field is no longer valid since it will be inactive.
* src/qemu/qemu_driver.c: Fix two crashes when QEMU exits
during a save attempt
Stefan Berger [Tue, 27 Apr 2010 18:50:35 +0000 (14:50 -0400)]
nwfilter: let qemu's after-migration packet pass
Qemu currently sends an Ethernet packet with protocol id 0x835 once a VM
was successfully migrated. The content of the packet looks like a
gratuitous RARP, just with the wrong protocol ID, which should be
0x8035. I wrote some filters to let either one of the packets pass and
am adapting the clean-traffic sample filter to use it. I am also
doing some changes on the existing ARP filter which was lacking a
test for source MAC address.
Chris Lalancette [Thu, 22 Apr 2010 16:01:56 +0000 (12:01 -0400)]
Fix up the locking in the snapshot code.
In particular I was forgetting to take the qemuMonitorPrivatePtr
lock (via qemuDomainObjBeginJob), which would cause problems
if two users tried to access the same domain at the same time.
This patch also fixes a problem where I was forgetting to remove
a transient domain from the list of domains.
Thanks to Stephen Shaw for pointing out the problem and testing
out the initial patch.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Stefan Berger [Tue, 27 Apr 2010 11:26:12 +0000 (07:26 -0400)]
nwfilter: add support for RAPR protocol
This patch adds support for the RARP protocol. This may be needed due to
qemu sending out a RARP packet (at least that's what it seems to want to
do even though the protocol id is wrong) when migration finishes and
we'd need a rule to let the packets pass.
Unfortunately my installation of ebtables does not understand -p RARP
and also seems to otherwise depend on strings in /etc/ethertype
translated to protocol identifiers. Therefore I need to pass -p 0x8035
for RARP. To generally get rid of the dependency of that file I switch
all so far supported protocols to use their protocol identifier in the
-p parameter rather than the string.
I am also extending the schema and added a test case.
changes from v1 to v2:
- added test case into patch
With JSON qemu monitor, we get a STOP event from qemu whenever qemu
stops guests CPUs. The downside of it is that vm->state is changed to
PAUSED and a new generic paused event is send to applications. However,
when we ask qemu to stop the CPUs we are not really interested in qemu
event and we usually want to issue a more specific event.
By setting vm->status to PAUSED before actually sending the request to
qemu (and resetting it back if the request fails) we can ignore the
event since the event handler does nothing when the guest is already
paused. This solution is quite hacky but unfortunately it's the best
solution which I was able to come up with and it doesn't introduce a
race condition.
David Allan [Tue, 27 Apr 2010 10:01:32 +0000 (12:01 +0200)]
Fix indentation for storage conf XML
* virStorageEncryptionFormat is called from both
virDomainDiskDefFormat and virStorageVolTargetDefFormat. The proper
indentation in the generated XML depends on the caller. My earlier
patch to fix the incorrect indentation for the domain XML broke the
indentation for the storage XML. This patch adopts Laine's
suggestion of requring the caller of virStorageEncryptionFormat to
provide an unsigned int with the number of spaces the output should
be indented. The patch modifies both callers to provide the
additional argument.
* Add a regression test for the domain XML
* src/conf/domain_conf.c src/conf/storage_conf.c
src/conf/storage_encryption_conf.c src/conf/storage_encryption_conf.h:
change the indentation code
* tests/qemuxml2xmltest.c
tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.args
tests/qemuxml2argvdata/qemuxml2argv-encrypted-disk.xml: add a regression test
Stefan Berger [Mon, 26 Apr 2010 17:50:40 +0000 (13:50 -0400)]
nwfilter: enable hex number inputs in filter XML
With this patch I want to enable hex number inputs in the filter XML. A
number that was entered as hex is also printed as hex unless a string
representing the meaning can be found.
I am also extending the schema and adding a test case. A problem with
the DSCP value is fixed on the way as well.
Changes from V1 to V2:
- using asHex boolean in all printf type of functions to select the
output format in hex or decimal format
Starts dnsmasq from libvirtd with --dhcp-hostsfile option
This patch makes libvirtd start the dnsmasq daemon with a
--dhcp-hostsfile option instead of --dhcp-host options for each
'//ip/dhcp/host' entries defined in network xml file.
the dnsmasq host file is stored into /var/lib/libvirt/network
* src/network/bridge_driver.c: define the directory for the hostfiles
and save/delete them to be used by dnsmasq
* po/POTFILES.in: the new module contains translatable strings
* src/Makefile.am: include the files in the utils set
* src/libvirt_private.syms: exports the symbols internally
It implements an idea to save dhcp hosts' macaddr vs. ipaddr mappings to
static file and make dnsmasq loading it with "--dhcp-hostsfile" option,
originally suggested by Dan, and can address the problem that too
many "--dhcp-host" args hitting ARG_MAX limit
* src/util/dnsmasq.h src/util/dnsmasq.c: adds the 2 new files
Chris Lalancette [Fri, 23 Apr 2010 15:59:02 +0000 (11:59 -0400)]
Fix printing of pathnames on error in qemuDomainSnapshotLoad.
While doing some testing of the snapshot code I noticed that
if qemuDomainSnapshotLoad failed, it would print a NULL as
part of the error. That's not desirable, so leave the
full_path variable around until after we are done printing
errors.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Chris Lalancette [Fri, 23 Apr 2010 15:57:16 +0000 (11:57 -0400)]
Fix a memory leak in the snapshot code in libvirtd.
While running libvirtd under valgrind and doing some
snapshot testing I noticed that we would always leak a
connection reference. The problem was actually that we
were leaking a domain reference in the libvirtd remote
snapshot code, which was in turn causing a leaked
connection reference. Fix the situation by explicitly
taking and dropping a domain reference where we need it.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Disable stateful OpenNebula driver if libvirtd is disabled
Also move the equivalent checks for LXC and UML before their header
checks. This way configure doesn't check for the headers when the driver
gets disabled anyway.
The nwfilterDriverActive() could de-reference a NULL pointer
if it hadn't be started at the point it was called. It was
also not thread safe, since it lacked locking around data
accesses.
* src/nwfilter/nwfilter_driver.c: Fix locking & NULL checks
in nwfilterDriverActive()
* The error messages coming from qemu's DAC support contain strings
from the original SELinux security driver code. This just removes
references to "security context" and other SELinux-isms from the DAC
code.
Signed-off-by: Spencer Shimko <sshimko@tresys.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Stefan Berger [Thu, 22 Apr 2010 18:58:57 +0000 (14:58 -0400)]
Changes from V1 to V2:
- using INT_BUFSIZE_BOUND() to determine the length of the buffersize
for printing and integer into
- not explicitly initializing static var threadsTerminate to false
anymore, since that's done automatically
Changes after V2:
- removed while looks in case of OOM error
- removed on ifaceDown() call
- preceding one ifaceDown() call with an ifaceCheck() call
Since the name of an interface can be the same between stops and starts
of different VMs I have to switch the IP address learning thread to use
the index of the interface to determine whether an interface is still
available or not - in the case of macvtap the thread needs to listen for
traffic on the physical interface, thus having to time out periodically
to check whether the VM's macvtap device is still there as an indication
that the VM is still alive. Previously the following sequence of 2 VMs
with macvtap device
would not terminate the thread upon testvm1's destroy since the name of
the interface on the host could be the same (i.e, macvtap0) on testvm1
and testvm2, thus it was easily race-able. The thread would then
determine the IP address parameter for testvm2 but apply the rule set
for testvm1. :-(
I am also introducing a lock for the interface (by name) that the thread
must hold while it listens for the traffic and releases when it
terminates upon VM termination or 0.5 second thereafter. Thus, the new
thread for a newly started VM with the same interface name will not
start while the old one still holds the lock. The only other code that I
see that also needs to grab the lock to serialize operation is the one
that tears down the firewall that were established on behalf of an
interface.
I am moving the code applying the 'basic' firewall rules during the IP
address learning phase inside the thread but won't start the thread
unless it is ensured that the firewall driver has the ability to apply
the 'basic' firewall rules.
The hang fix in d376b7d63ec1ef24ba4c812d58b9a414ddb561f8 was incomplete
since it left quite a few {Enter,Exit}Monitor calls which require driver
to be unlocked. Since the driver is locked throughout the whole
function, {Enter,Exit}MonitorWithDriver need to be used instead to
ensure driver is not locked when issuing monitor commands.
The comment in qemuDomainWaitForMigrationComplete says we are polling
every 50ms but the code sleeps only for 50us. This was already discussed
during review but apparently forgotten when the series was pushed.
The text monitor code was checking for a '\n' prefix on several
places. Previously this would work, but since the monitor code
re-write the '\n' is already stripped off, so mustn't be checked
for.
Adds ability to provide a preferred CPU model for CPUID data decoding.
Such model would be considered as the best possible model (if it's
supported by hypervisor) regardless on number of features which have to
be added or removed for describing required CPU.
Support removing features when converting data to CPU
So far, when CPUID data were converted into CPU model and features, the
features can only be added to the model. As a result, when a guest asked
for something like "qemu64,-svm" it would get a qemu32 plus a bunch of
additional features instead.
This patch adds support for removing feature from the base model.
Selection algorithm remains the same: the best CPU model is the model
which requires lowest number of features to be added/removed from it.
Qemu committed a patch which list some CPU names in [] when asked for
supported CPUs (qemu -cpu ?). Yet, it needs such CPUs to be passed
without those square braces. When probing for supported CPU models, we
can just strip the square braces and pretend we have never seen them.
First, inital VCPU pinning is set correctly but then it is reset by
assigning qemu process to a new cgroup (which contains all CPUs). It's
easily fixed by swapping these two actions.
Chris Lalancette [Tue, 23 Mar 2010 13:01:37 +0000 (09:01 -0400)]
Make avahi startup more robust.
If the hostname of the current virtualization machine
could not be resolved, then libvirtd would fail to
start. However, for disconnected operation (on a laptop,
for instance) the hostname may very legitimately not
be resolvable. This patch makes it so that if we can't
resolve the hostname, avahi doesn't fail, it just uses
a less useful MDNS string.
Signed-off-by: Chris Lalancette <clalance@redhat.com>
Chris Wong [Wed, 21 Apr 2010 09:47:15 +0000 (11:47 +0200)]
esx: Don't treat an empty root snapshot list as error
An empty root snapshot list was considered as error condition. Creating a
new snapshot would fail if the domain didn't have snapshots yet, because
the snapshot-create function tries to lookup the list of existing snapshots
in order to verify that the snapshot name is unique. This fails if the
domain doesn't have snapshots yet.
Removing the NULL check from esxVI_LookupRootSnapshotTreeList fixes this.
Stefan Berger [Tue, 20 Apr 2010 21:07:15 +0000 (17:07 -0400)]
Extend fwall-drv interface and call functions via interface
I am moving some of the eb/iptables related functions into the interface
of the firewall driver and am making them only accessible via the driver's
interface. Otherwise exsiting code is adapted where needed. I am adding one
new function to the interface that checks whether the 'basic' rules can be
applied, which will then be used by a subsequent patch.
Eric Blake [Tue, 20 Apr 2010 19:44:31 +0000 (13:44 -0600)]
build: avoid compiler warning
According to GCC, ATTRIBUTE_UNUSED means that an attribute _might_
be unused, not _must_ be unused. Therefore, it is easier to
blindly mark a variable, than to try and do preprocessor limiting
of when we know it is unused.
* src/remote/remote_driver.c (remoteAuthenticate): Mark attribute
as potentially unused.
Reported by Gustovo Morozowski.