Jiri Denemark [Tue, 18 Jan 2011 11:07:13 +0000 (12:07 +0100)]
qemu: Avoid sending STOPPED event twice
In some circumstances, libvirtd would issue two STOPPED events after it
stopped a domain. This was because an EOF event can arrive after a qemu
process is killed but before qemuMonitorClose() is called.
qemuHandleMonitorEOF() should ignore EOF when the domain is not running.
I wasn't able to reproduce this bug directly, only after adding an
artificial sleep() into qemudShutdownVMDaemon().
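The guard described above can be sketched roughly as follows. This is a minimal illustration only: struct demo_domain and the demo_* functions are hypothetical stand-ins, not libvirt's actual domain object or the real qemuHandleMonitorEOF().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified domain object; not libvirt's virDomainObj. */
struct demo_domain {
    bool running;        /* true while the qemu process is considered active */
    int stopped_events;  /* number of STOPPED events emitted so far */
};

/* Shut the domain down and emit exactly one STOPPED event. */
static void demo_shutdown(struct demo_domain *dom)
{
    assert(dom != NULL);
    dom->running = false;
    dom->stopped_events++;
}

/* Monitor EOF callback: a late EOF for a stopped domain is ignored. */
static void demo_handle_monitor_eof(struct demo_domain *dom)
{
    assert(dom != NULL);
    if (!dom->running)
        return;          /* already stopped; do not emit a second event */
    demo_shutdown(dom);
}
```

Calling demo_shutdown() and then demo_handle_monitor_eof() on the same object leaves stopped_events at 1, which mirrors the race the commit closes.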
A large number of return values used 'return (0)' instead
of simply 'return 0'. Remove all these redundant brackets
so the style is consistent throughout the file.
Increase size of driver table to make UML work again
The driver table only has 10 slots, but there are potentially
11 drivers that need activating. Improve the error message
when driver registration fails.
Turn libvirt.c error reporting functions into macros
The virLibConnError() function (and related ones) do not correctly
report line number info. Turn them all into macros so line numbers
are reported correctly. Drop the connection object in all of them
since it is no longer used.
Also remove the virLibConnWarning() equivalents completely. Now
that the Xen driver is running 100% inside libvirtd, those
codepaths for secondary drivers cannot be reached.
* src/libvirt.c: Replace error functions with macros
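Why a macro helps here can be shown with a small sketch. The names demo_report, DEMO_LIB_ERROR and demo_caller are invented for illustration and are not the macros actually added to src/libvirt.c; the point is only that __FILE__, __func__ and __LINE__ expand where the macro is used rather than inside a shared helper function.

```c
#include <assert.h>
#include <stdio.h>

/* Line number recorded by the most recent report (for demonstration). */
static int demo_last_line;

/* Hypothetical reporting backend that receives the call-site location. */
static void demo_report(const char *file, const char *func, int line,
                        const char *msg)
{
    assert(file != NULL && func != NULL && msg != NULL);
    demo_last_line = line;
    fprintf(stderr, "%s:%d: %s: %s\n", file, line, func, msg);
}

/* As a macro, the location tokens expand at the point of use. */
#define DEMO_LIB_ERROR(msg) \
    demo_report(__FILE__, __func__, __LINE__, (msg))

/* Example caller; returns the line on which the macro was invoked. */
static int demo_caller(void)
{
    DEMO_LIB_ERROR("invalid connection pointer"); return __LINE__;
}
```

Had DEMO_LIB_ERROR been an ordinary function that used __LINE__ internally, every report would carry the line of that function's body instead of the caller's.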
Eric Blake [Fri, 24 Dec 2010 02:26:15 +0000 (19:26 -0700)]
build: use more gnulib modules for simpler code
* .gnulib: Update to latest, for sigpipe and sigaction modules.
* bootstrap.conf (gnulib_modules): Add sigaction, sigpipe, strerror_r.
* tools/virsh.c (vshSetupSignals) [!SIGPIPE]: Delete, now that
gnulib guarantees it.
(SA_SIGINFO): Define for mingw fallback.
* src/util/virterror.c (virStrerror): Simplify, now that gnulib
guarantees the POSIX interface.
* configure.ac (AC_CHECK_FUNCS_ONCE): Drop redundant check.
(AM_PROG_CC_STDC): Move earlier, to keep autoconf happy.
Matthias Bolte [Sat, 15 Jan 2011 15:06:52 +0000 (16:06 +0100)]
Simplify "NWFilterPool" to "NWFilter"
The public object is called NWFilter but the corresponding private
object is called NWFilterPool. I don't see compelling reasons for this
Pool suffix. One might argue that an NWFilter is a "pool" of rules, etc.
Remove the Pool suffix from NWFilterPool. No functional change included.
Eric Blake [Tue, 18 Jan 2011 16:47:30 +0000 (09:47 -0700)]
qemu: don't fail capabilities check on 0.12.x
Fixes regression introduced in commit 2211518, where all qemu 0.12.x
versions fail to start, as does qemu 0.13.x lacking the pci-assign device.
Prior to 2211518, the code was just ignoring a non-zero exit status
from the qemu child, but the virCommand code checked this to avoid
masking any other issues, which means the real bug of provoking
non-zero exit status has been latent for a longer time.
* src/qemu/qemu_capabilities.c (qemuCapsExtractVersionInfo): Check
for -device driver,? support.
(qemuCapsExtractDeviceStr): Avoid failure if all probed devices
are unsupported.
Reported by Ken Congyang.
When using -incoming stdio or -incoming exec:, qemu keeps the
stdin fd open long after the migration is complete. Not to
mention that exec:cat is horribly inefficient, by doubling the
I/O and going through a popen interface in qemu.
The new -incoming fd: of qemu 0.12.0 closes the fd after using
it, and allows us to bypass an intermediary cat process for
less I/O.
* src/qemu/qemu_command.h (qemuBuildCommandLine): Add parameter.
* src/qemu/qemu_command.c (qemuBuildCommandLine): Support
migration via fd: when possible. Consolidate migration handling
into one spot, now that it is more complex.
* src/qemu/qemu_driver.c (qemudStartVMDaemon): Update caller.
* tests/qemuxml2argvtest.c (mymain): Likewise.
* tests/qemuxml2argvdata/qemuxml2argv-restore-v2-fd.args: New file.
* tests/qemuxml2argvdata/qemuxml2argv-restore-v2-fd.xml: Likewise.
Jiri Denemark [Wed, 12 Jan 2011 14:19:34 +0000 (15:19 +0100)]
Introduce per-device boot element
Currently, boot order can be specified per device class but there is no
way to specify the exact disk/NIC device to boot from.
This patch adds a <boot order='N'/> element which can be used inside
<disk/> and <interface/>. This is incompatible with the older os/boot
element. Since not all hypervisors support per-device boot
specification, a new deviceboot flag is included in capabilities XML for
hypervisors which understand the new boot element. Presence of the flag
allows (but doesn't require) users to use the new style boot order
specification.
Eric Blake [Thu, 6 Jan 2011 19:00:30 +0000 (12:00 -0700)]
build: let xgettext see strings in libvirt-guests
* tools/libvirt-guests.init.in: Rename...
* tools/libvirt-guests.init.sh: ...so that xgettext's language
detection via suffix will work.
* po/POTFILES.in: Update all references.
* tools/Makefile.am (EXTRA_DIST, libvirt-guests.init): Likewise.
Laurent Léonard [Tue, 4 Jan 2011 18:13:56 +0000 (19:13 +0100)]
libvirt-guests: remove bashisms
* tools/libvirt-guests.init.sh: Use only POSIX shell features, which
includes using gettext.sh for translation rather than $"".
* tools/Makefile.am (libvirt-guests.init): Supply a few more substitutions.
* po/POTFILES.in: Mark that libvirt-guests.init needs translation.
Matthias Bolte [Fri, 14 Jan 2011 21:37:55 +0000 (22:37 +0100)]
tests: Remove obsolete secaatest
Before the security driver was refactored in d6623003 seclabeltest and
secaatest were basically the same. seclabeltest was meant for SELinux
and secaatest for AppArmor. Both tests exited early when the specific
security driver backend wasn't enabled.
With the new security manager, trying to initialize a disabled security
driver backend is an error that can't be distinguished from other errors
anymore. Therefore, the updated seclabeltest just asks for the first
available backend, as this will always work even with the SELinux and
AppArmor backends being disabled, thanks to the new Nop backend.
Remove the obsolete secaatest and compile and run the seclabeltest
unconditionally.
This fixes make check on systems that support AppArmor.
Eric Blake [Fri, 7 Jan 2011 00:17:32 +0000 (17:17 -0700)]
maint: improve sc_prohibit_strncmp syntax check
* .gnulib: Update, for sc_prohibit_strcmp fix.
* cfg.mk: Adjust copyright; the only FSF portions come from when
this file was copied from coreutils.
(sc_prohibit_strncmp): Copy bug-fixes from sc_prohibit_strcmp.
* .x-sc_prohibit_strcmp: Delete, now that rule is smarter.
* .x-sc_prohibit_strncmp: Likewise.
* Makefile.am (syntax_check_exceptions): Track deletion.
Eric Blake [Wed, 12 May 2010 02:57:56 +0000 (20:57 -0600)]
datatypes: avoid redundant __FUNCTION__
virLibConnError already includes __FUNCTION__ in its output, so we
were redundant. Furthermore, clang warns that __FUNCTION__ is not
a string literal (at least __FUNCTION__ will never contain %, so
it was not a security risk).
* src/datatypes.c: Replace __FUNCTION__ with a descriptive string.
In short, under heavy load, it's possible for qemu's networking to
lock up due to the tap device's default 1MB sndbuf being
inadequate. Adding "sndbuf=0" to the qemu commandline -netdevice
option will alleviate this problem (sndbuf=0 actually sets it to
0xffffffff).
Because we must be able to explicitly specify "0" as a value, the
standard practice of "0 means not specified" won't work here. Instead,
virDomainNetDef also has a sndbuf_specified, which defaults to 0, but
is set to 1 if some value was given.
The sndbuf value is put inside a <tune> element of each <interface> in
the domain. The intent is that further tunable settings will also be
placed inside this element.
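The "value plus specified flag" idea can be sketched as below. struct demo_net_tune and the demo_* helpers are hypothetical and much smaller than the real virDomainNetDef; they only show how an explicit 0 stays distinguishable from "not configured".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, cut-down stand-in for the tuning fields of a net def. */
struct demo_net_tune {
    unsigned long sndbuf;
    bool sndbuf_specified;   /* false: omitted in config; true: value given */
};

/* Record a user-supplied value; 0 is a legitimate setting. */
static void demo_set_sndbuf(struct demo_net_tune *tune, unsigned long value)
{
    assert(tune != NULL);
    tune->sndbuf = value;
    tune->sndbuf_specified = true;
}

/* Write ",sndbuf=N" into buf only when a value was given.
 * Returns the number of characters produced (0 if nothing was given). */
static int demo_format_sndbuf(const struct demo_net_tune *tune,
                              char *buf, size_t buflen)
{
    assert(tune != NULL && buf != NULL && buflen > 0);
    if (!tune->sndbuf_specified) {
        buf[0] = '\0';
        return 0;
    }
    return snprintf(buf, buflen, ",sndbuf=%lu", tune->sndbuf);
}
```

With a zero-initialized struct nothing is emitted, while demo_set_sndbuf(&tune, 0) makes the formatter produce ",sndbuf=0".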
The existing libvirt support for the vhost-net backend to the virtio
network driver happens automatically - if the vhost-net device is
available, it is always enabled, otherwise the standard userland
virtio backend is used.
This patch makes it possible to force whether or not vhost-net is used
with a bit of XML. Adding a <driver name='vhost'/> element to the
interface XML will force use of vhost-net (if it's not available, the
domain will fail to start). If <driver name='qemu'/> is given, vhost-net
will not be used even if it is available.
If there is no <driver name='xxx'/> in the config, libvirt will revert
to the pre-existing automatic behavior - use vhost-net if it's
available, and userland backend if vhost-net isn't available.
Use the new set_password monitor command to set the password.
We try to use that command first when setting a VNC/SPICE password. If
that doesn't work, we fall back to the legacy VNC-only password command.
Allow an expiry time to be set; if that doesn't work, throw an error
if the user tries to use SPICE.
Change since v1:
- moved qemuInitGraphicsPasswords to qemu_hotplug, renamed
to qemuDomainChangeGraphicsPasswords.
- updated what looks like a typo (that appears to work anyway) in
initial patch from Daniel:
- ret = qemuInitGraphicsPasswords(driver, vm,
- VIR_DOMAIN_GRAPHICS_TYPE_SPICE,
- &vm->def->graphics[0]->data.vnc.auth,
- driver->vncPassword);
+ ret = qemuInitGraphicsPasswords(driver, vm,
+ VIR_DOMAIN_GRAPHICS_TYPE_SPICE,
+ &vm->def->graphics[0]->data.spice.auth,
+ driver->spicePassword);
Based on patch by Daniel P. Berrange <berrange@redhat.com>.
I broke 'make check' with commit 04197350 by unconditionally
emitting 'hap=' in xen xm driver. Only emit 'hap=' if
xendConfigVersion >= 3. I've tested sending 'hap=' to a Xen 3.2
machine without support for hap setting and verified that xend
silently drops the unrecognized setting.
Eric Blake [Thu, 13 Jan 2011 16:09:15 +0000 (09:09 -0700)]
qemu: improve device flag parsing
* src/qemu/qemu_capabilities.h (qemuCapsParseDeviceStr): New
prototype.
* src/qemu/qemu_capabilities.c (qemuCapsParsePCIDeviceStrs)
Rename and split...
(qemuCapsExtractDeviceStr, qemuCapsParseDeviceStr): ...to make it
easier to add and test device-specific checks.
(qemuCapsExtractVersionInfo): Update caller.
* tests/qemuhelptest.c (testHelpStrParsing): Also test parsing of
device-related flags.
(mymain): Update expected flags.
* tests/qemuhelpdata/qemu-0.12.1-device: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.1.2-rhel60-device: New file.
* tests/qemuhelpdata/qemu-kvm-0.12.3-device: New file.
* tests/qemuhelpdata/qemu-kvm-0.13.0-device: New file.
It was awkward having only int conversion in the virStrToLong family,
but only long conversion in the virXPath family. Make both families
support both types.
Eric Blake [Wed, 12 Jan 2011 23:26:34 +0000 (16:26 -0700)]
qemu: convert capabilities to use virCommand
* src/qemu/qemu_capabilities.c (qemuCapsProbeMachineTypes)
(qemuCapsProbeCPUModels, qemuCapsParsePCIDeviceStrs)
(qemuCapsExtractVersionInfo): Use virCommand rather than virExec.
Jim Fehlig [Wed, 5 Jan 2011 22:20:01 +0000 (15:20 -0700)]
Add HAP to xen hypervisor capabilities
xen-unstable c/s 16931 introduced a per-domain setting for hvm
guests to enable/disable hardware assisted paging. If disabled,
software techniques such as shadow page tables are used. If enabled,
and the feature exists in underlying hardware, hardware support for
paging is used.
Xen does not provide a mechanism to discover the HAP capability, so
we advertise its availability for hvm guests on Xen >= 3.3.
Jim Fehlig [Wed, 5 Jan 2011 22:16:57 +0000 (15:16 -0700)]
Add support for HAP feature to xen drivers
xen-unstable c/s 16931 introduced a per-domain setting for hvm
guests to enable/disable hardware assisted paging. If disabled,
software techniques such as shadow page tables are used. If enabled,
and the feature exists in underlying hardware, hardware support for
paging is used.
This provides implementation for mapping HAP setting to/from
domxml/native formats in xen drivers.
Jim Fehlig [Wed, 5 Jan 2011 21:56:48 +0000 (14:56 -0700)]
Add HAP to virDomainFeature enum
Extend the virDomainFeature enumeration to include HAP (hardware
assisted paging) feature.
Hardware features such as Extended Page Table and Nested Page
Table augment hypervisor software techniques such as shadow
page table. Adding HAP to the virDomainFeature enumeration
allows users to select between hardware and software memory
management mechanisms for their guests.
Eric Blake [Wed, 12 Jan 2011 20:13:22 +0000 (13:13 -0700)]
tests: virsh is no longer in builddir/src
Commit 870dba0 (Mar 2008) added builddir/src to PATH to pick
up virsh. Later, virsh was moved to tools; commit db68d6b
(Oct 2009) noticed this, but only added the new location rather
than deleting the old location.
* tests/Makefile.am (path_add): Drop now-useless directory.
Suggested by Daniel P. Berrange.
Eric Blake [Wed, 12 Jan 2011 16:12:24 +0000 (09:12 -0700)]
virFindFileInPath: only find executable non-directory
Without this patch, at least tests/daemon-conf (which sticks
$builddir/src in the PATH) tries to execute the directory
$builddir/src/qemu rather than a real qemu binary.
* src/util/util.h (virFileExists): Adjust prototype.
(virFileIsExecutable): New prototype.
* src/util/util.c (virFindFileInPath): Reject non-executables and
directories. Avoid huge stack allocation.
(virFileExists): Use lighter-weight syscall.
(virFileIsExecutable): New function.
* src/libvirt_private.syms (util.h): Export new function.
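A check of the kind described for virFileIsExecutable could look roughly like this. demo_file_is_executable is a hypothetical name and the body is an assumption based on the description above, not the code from src/util/util.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: true only for executable regular files. */
static bool demo_file_is_executable(const char *path)
{
    struct stat sb;

    assert(path != NULL);
    /* Directories normally carry the x bit too, so check the type first. */
    if (stat(path, &sb) < 0)
        return false;
    if (!S_ISREG(sb.st_mode))
        return false;
    /* access() asks whether the calling user may actually execute it. */
    return access(path, X_OK) == 0;
}
```

A PATH search built on this helper would skip a directory named qemu and continue to the next PATH entry.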
Wen Congyang [Wed, 12 Jan 2011 06:12:29 +0000 (14:12 +0800)]
report error when specifying wrong desturi
When we do peer2peer migration, the dest uri is an address of the
target host as seen from the source machine. So we must specify
the ip or hostname of target host in dest uri. If we do not specify
it, report an error to the user.
Osier Yang [Wed, 12 Jan 2011 14:44:00 +0000 (22:44 +0800)]
qemu: Reject SDL graphic if it's not supported by qemu
If the emulator doesn't support SDL graphics, we should reject
XML that requests SDL graphics with an error message, rather than
silently ignoring it and pretending things are fine.
"-sdl" flag was exposed explicitly by qemu since 0.10.0, more detail:
http://www.redhat.com/archives/libvir-list/2011-January/msg00442.html
And we already have capability flag "QEMUD_CMD_FLAG_0_10", which
could be used to prevent the patch affecting the older versions
of QEMU.
qemu: Watchdog IB700 is not a PCI device (RHBZ#667091).
Skip IB700 when assigning PCI slots.
Note: the I6300ESB watchdog _is_ a PCI device.
To test this: I applied this patch to libvirt-0.8.3-2.fc14 (rebasing
it slightly: qemu_command.c didn't exist in that version) and
installed this on my machine, then tested that I could successfully
add an ib700 watchdog device to a guest, start the guest, and the
ib700 was available to the guest. I also added an i6300esb (PCI)
watchdog to another guest, and verified that libvirt assigned a PCI
device to it, that the guest could be started, and that i6300esb was
present in the guest.
Note that if you previously had a domain with a ib700 watchdog, it
would have had an <address type='pci' .../> clause added to it in the
libvirt configuration. This patch does not attempt to remove this.
You cannot start such a domain -- qemu gives an error if you try.
With this patch you are able to remove the bogus address element
without libvirt adding it back.
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Eric Blake [Mon, 10 Jan 2011 22:08:56 +0000 (15:08 -0700)]
network: plug uninitialized read found by valgrind
* src/util/network.c (virSocketAddrMask): Zero out port, so that
iptables can initialize just the netmask then call
virSocketFormatAddr without an uninitialized read in getnameinfo.
Cole Robinson [Thu, 2 Dec 2010 17:15:10 +0000 (12:15 -0500)]
python: Use PyCapsule API if available
On Fedora 14, virt-manager spews a bunch of warnings to the console:
/usr/lib64/python2.7/site-packages/libvirt.py:1781: PendingDeprecationWarning: The CObject type is marked Pending Deprecation in Python 2.7. Please use capsule objects instead.
Have libvirt use the capsule API if available. I've verified this compiles
fine on older python (2.6 in RHEL6 which doesn't have capsules), and
virt-manager seems to function fine.
Cole Robinson [Wed, 5 Jan 2011 21:35:07 +0000 (16:35 -0500)]
remote: Don't lose track of events when callbacks are slow
After the remote driver runs an event callback, it unconditionally disables the
loop timer, thinking it just flushed every queued event. This doesn't work
correctly though if an event is queued while a callback is running.
The events actually aren't being lost, it's just that the event loop didn't
think there was anything that needed to be dispatched. So all those 'lost
events' should actually get re-triggered if you manually kick the loop by
generating a new event (like creating a new guest).
The solution is to disable the dispatch timer _before_ we invoke any event
callbacks. Events queued while a callback is running will properly reenable the
timer.
More info at https://bugzilla.redhat.com/show_bug.cgi?id=624252
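The ordering fix can be illustrated with a toy queue. Everything here (struct demo_queue, demo_flush and the boolean standing in for the timer) is hypothetical and far simpler than the remote driver's event code; it only shows why disabling the timer before the callbacks keeps a late event visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_QUEUE_MAX 8

/* Hypothetical event queue with a flag standing in for the loop timer. */
struct demo_queue {
    int events[DEMO_QUEUE_MAX];
    size_t count;
    bool timer_enabled;
    int delivered;
};

/* Queueing an event always (re)enables the dispatch timer. */
static void demo_queue_event(struct demo_queue *q, int ev)
{
    assert(q != NULL && q->count < DEMO_QUEUE_MAX);
    q->events[q->count++] = ev;
    q->timer_enabled = true;
}

/* Callback that, for event 1, causes another event to be queued. */
static void demo_callback(struct demo_queue *q, int ev)
{
    q->delivered++;
    if (ev == 1)
        demo_queue_event(q, 2);
}

/* Flush: take the pending events, disable the timer, then dispatch. */
static void demo_flush(struct demo_queue *q)
{
    int pending[DEMO_QUEUE_MAX];
    size_t n, i;

    assert(q != NULL);
    n = q->count;
    for (i = 0; i < n; i++)
        pending[i] = q->events[i];
    q->count = 0;
    q->timer_enabled = false;        /* before any callback runs */
    for (i = 0; i < n; i++)
        demo_callback(q, pending[i]);
}
```

If the timer were instead cleared after the loop, the event queued by the callback would sit in the queue with the timer off until something else kicked the loop.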
The current security driver usage requires horrible code like
if (driver->securityDriver &&
driver->securityDriver->domainSetSecurityHostdevLabel &&
driver->securityDriver->domainSetSecurityHostdevLabel(driver->securityDriver,
vm, hostdev) < 0)
This pair of checks for NULL clutters up the code, making the driver
calls 2 lines longer than they really need to be. The goal of the
patchset is to change the calling convention to simply
if (virSecurityManagerSetHostdevLabel(driver->securityDriver,
vm, hostdev) < 0)
The first check for 'driver->securityDriver' being NULL is removed
by introducing a 'no op' security driver that will always be present
if no real driver is enabled. This guarantees driver->securityDriver
!= NULL.
The second check for 'driver->securityDriver->domainSetSecurityHostdevLabel'
being non-NULL is hidden in a new abstraction called virSecurityManager.
This separates the driver callbacks from the main internal API. The
addition of a virSecurityManager object that is separate from the
virSecurityDriver struct also allows security drivers to carry state /
configuration
information directly. Thus the DAC/Stack drivers from src/qemu which
used to pull config from 'struct qemud_driver' can now be moved into
the 'src/security' directory and store their config directly.
* src/qemu/qemu_conf.h, src/qemu/qemu_driver.c: Update to
use new virSecurityManager APIs
* src/qemu/qemu_security_dac.c, src/qemu/qemu_security_dac.h
src/qemu/qemu_security_stacked.c, src/qemu/qemu_security_stacked.h:
Move into src/security directory
* src/security/security_stack.c, src/security/security_stack.h,
src/security/security_dac.c, src/security/security_dac.h: Generic
versions of previous QEMU specific drivers
* src/security/security_apparmor.c, src/security/security_apparmor.h,
src/security/security_driver.c, src/security/security_driver.h,
src/security/security_selinux.c, src/security/security_selinux.h:
Update to take virSecurityManagerPtr object as the first param
in all callbacks
* src/security/security_nop.c, src/security/security_nop.h: Stub
implementation of all security driver APIs.
* src/security/security_manager.h, src/security/security_manager.c:
New internal API for invoking security drivers
* src/libvirt.c: Add missing debug for security APIs
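The calling convention described above can be sketched as follows. All demo_* names are hypothetical; the real virSecurityManager API has many more callbacks and different signatures. The sketch only shows the two ideas: a manager that always holds some driver, and a wrapper that hides the NULL-callback check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical driver table; a callback may be NULL if unimplemented. */
struct demo_security_driver {
    const char *name;
    int (*set_hostdev_label)(const char *hostdev);
};

/* Hypothetical manager object wrapping exactly one driver. */
struct demo_security_manager {
    const struct demo_security_driver *drv;   /* never NULL */
};

/* "No op" driver used when no real driver is enabled. */
static const struct demo_security_driver demo_nop_driver = { "none", NULL };

/* Example of a real driver that refuses every host device. */
static int demo_reject(const char *hostdev)
{
    (void)hostdev;
    return -1;
}
static const struct demo_security_driver demo_strict_driver = {
    "strict", demo_reject
};

/* Create a manager; fall back to the no-op driver if none is given. */
static struct demo_security_manager
demo_manager_new(const struct demo_security_driver *drv)
{
    struct demo_security_manager mgr = { drv ? drv : &demo_nop_driver };
    return mgr;
}

/* Wrapper: callers no longer test the driver or the callback for NULL. */
static int demo_manager_set_hostdev_label(struct demo_security_manager *mgr,
                                          const char *hostdev)
{
    assert(mgr != NULL && mgr->drv != NULL);
    if (mgr->drv->set_hostdev_label == NULL)
        return 0;                     /* nothing to do for this driver */
    return mgr->drv->set_hostdev_label(hostdev);
}
```

A caller then needs only `if (demo_manager_set_hostdev_label(&mgr, dev) < 0)`, which is the shape the commit message aims for.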
Jiri Denemark [Fri, 7 Jan 2011 11:34:12 +0000 (12:34 +0100)]
daemon: Fix core dumps if unix_sock_group is set
Setting unix_sock_group to something other than the default "root" in
/etc/libvirt/libvirtd.conf prevents system libvirtd from dumping core on
crash. This is because we used setgid(unix_sock_group) before binding to
/var/run/libvirt/libvirt-sock* and setgid() back to original group.
However, if a process changes its effective or filesystem group ID, it
will be forbidden from leaving core dumps unless fs.suid_dumpable sysctl
is set to something other than 0 (and it is 0 by default).
Changing socket's group ownership after bind works better. And we can do
so without introducing a race condition since we loosen access rights by
changing the group from root to something else.
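A rough sketch of the "bind first, then change group ownership" approach is shown below. demo_bind_with_group is a hypothetical helper, not the daemon's code; it assumes a POSIX system with UNIX domain sockets and leaves the socket's owner untouched.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Bind a UNIX socket at path, then hand it to group gid.
 * The process never changes its own group ID.
 * Returns the socket fd, or -1 on failure. */
static int demo_bind_with_group(const char *path, gid_t gid)
{
    struct sockaddr_un addr;
    int fd;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);      /* length checked above */
    unlink(path);                     /* drop a stale socket, if any */

    /* (uid_t)-1 tells chown() to leave the owner unchanged. */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        chown(path, (uid_t)-1, gid) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Because access is only widened after the bind, there is no window in which the socket is more permissive than intended.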
Laine Stump [Wed, 5 Jan 2011 21:53:03 +0000 (16:53 -0500)]
Don't chown qemu saved image back to root after save if dynamic_ownership=0
When dynamic_ownership=0, saved images must be owned by the same uid
as is used to run the qemu process, otherwise restore won't work. To
accomplish this, qemuSecurityDACRestoreSavedStateLabel() needs to
simply return when it's called.
Eric Blake [Wed, 5 Jan 2011 17:58:35 +0000 (10:58 -0700)]
maint: document dislike of mismatched if/else bracing
* docs/hacking.html.in (Curly braces): Tighten recommendations to
disallow if (cond) one-line; else { block; }.
* HACKING: Regenerate.
Suggested by Daniel P. Berrange.
Laine Stump [Tue, 4 Jan 2011 17:31:40 +0000 (12:31 -0500)]
Log an error on attempts to add a NAT rule for non-IPv4 addresses
Although the upper-layer code protected against it, it was possible to
call iptablesForwardMasquerade() with an IPv6 address and have it
attempt to add a rule to the MASQUERADE chain of ip6tables (which
doesn't exist).
This patch changes that function to check the protocol of the given
address, generate an error log if it's not IPv4 (AF_INET), and finally
hardcodes all the family parameters sent down to lower-level functions.
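The family check can be sketched like this. demo_forward_masquerade is a hypothetical stand-in for the function named above and does not run iptables; it only demonstrates rejecting non-IPv4 addresses with a logged error.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical stand-in: refuse to build a NAT rule for non-IPv4. */
static int demo_forward_masquerade(const struct sockaddr *addr)
{
    assert(addr != NULL);
    if (addr->sa_family != AF_INET) {
        fprintf(stderr,
                "attempted to NAT a non-IPv4 address (family %d)\n",
                (int)addr->sa_family);
        return -1;
    }
    /* Real code would now add the rule with the family fixed to IPv4. */
    return 0;
}
```

Passing a struct sockaddr_in6 yields -1 and a log line, while an AF_INET address is accepted.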
The crash in that report was coincidentally fixed when we switched
from using inet_pton() to using virSocketParseAddr(), but the absence
of an ip address in a dhcp static host definition was still silently
ignored (and that entry discarded from the saved XML). This patch
turns that into a logged failure; likewise if the entry has neither a
mac address nor a name attribute (the entry is useless without at
least one of those, plus an ip address).
Since the network name is now pulled into this function in order for
those error logs to be more informative, the other error messages in
the function have also been changed to take advantage.
Stefan Berger [Tue, 4 Jan 2011 17:46:10 +0000 (12:46 -0500)]
qemu driver: fix positioning to end of log file
While doing some testing with Qemu and creating huge logfiles I encountered the case where the VM could not start anymore due to the lseek() to the end of the Qemu VM's log file failing. The patch below fixes the problem by replacing the previously used 'int' with 'off_t'.
To reproduce this error, you could do the following:
dd if=/dev/zero of=/var/log/libvirt/qemu/<name of VM>.log bs=1024 count=$((1024*2048))
and you should get an error like this:
error: Failed to start domain <name of VM>
error: Unable to seek to -2147482651 in /var/log/libvirt/qemu/<name of VM>.log: Success
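The fix amounts to keeping the result of lseek() in an off_t rather than an int. A minimal sketch, with the hypothetical helper demo_log_end and assuming a platform where off_t is 64 bits wide (typical on 64-bit Linux, or 32-bit builds with large-file support enabled):

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the offset of the end of the file at path, or (off_t)-1. */
static off_t demo_log_end(const char *path)
{
    off_t pos;   /* off_t, not int: a 2 GiB log does not fit in 32 bits */
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return (off_t)-1;
    pos = lseek(fd, 0, SEEK_END);
    close(fd);
    return pos;
}
```

Storing the same result in a 32-bit int would wrap to a negative number for files of 2 GiB or more, which matches the negative offset in the error message quoted above.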
Eric Blake [Mon, 3 Jan 2011 22:26:33 +0000 (15:26 -0700)]
build: avoid compilation warnings
Detected on cygwin:
util/util.c: In function 'virSetUIDGID':
util/util.c:2824: warning: format '%d' expects type 'int', but argument 7 has type 'gid_t' [-Wformat]
(and three other lines)
* src/util/util.c (virSetUIDGID): Cast, as is done elsewhere in
this file, to avoid printf type mismatch warnings.
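The cast-based fix looks roughly like the following. demo_format_ids is a hypothetical function; the point is only that uid_t and gid_t have platform-dependent widths, so they are cast to match the %d conversion.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Format a uid/gid pair; casts keep the arguments in step with "%d". */
static int demo_format_ids(char *buf, size_t len, uid_t uid, gid_t gid)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "uid=%d gid=%d", (int)uid, (int)gid);
}
```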
Chris Wright [Fri, 24 Dec 2010 18:41:52 +0000 (10:41 -0800)]
node_device: udev driver does not handle SR-IOV devices
The udev driver does not update a PCI device with its SR-IOV capabilities,
when applicable, the way the hal driver does. As a result, dumping the
device's XML will not include the relevant physical or virtual function
information.
Eric Blake [Fri, 24 Dec 2010 15:40:42 +0000 (08:40 -0700)]
virExec: fix logic bug
As pointed out in https://bugzilla.redhat.com/show_bug.cgi?id=659855#c9,
commit c3568ec2 introduced a regression where we no longer close any
fd's beyond FD_SETSIZE.
* src/util/util.c (__virExec): Continue to close fd's beyond
keepfd range.
Reported by Stefan Praszalowicz.
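The corrected decision can be sketched as a small predicate. demo_should_close is hypothetical and simplified (the real __virExec also preserves its pipe fds); it shows that an fd_set may only be consulted for descriptors below FD_SETSIZE, and that higher descriptors must still be closed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Decide whether a child should close fd before exec. */
static bool demo_should_close(int fd, fd_set *keepfd)
{
    assert(fd >= 0);
    if (fd <= STDERR_FILENO)
        return false;                /* keep stdin, stdout, stderr */
    /* FD_ISSET is only defined for descriptors below FD_SETSIZE. */
    if (keepfd != NULL && fd < FD_SETSIZE && FD_ISSET(fd, keepfd))
        return false;                /* caller asked to keep this one */
    return true;                     /* includes every fd >= FD_SETSIZE */
}
```

A caller would loop from 0 up to the process's descriptor limit (for example from sysconf(_SC_OPEN_MAX)) and close every fd for which the predicate returns true.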