Eric Blake [Thu, 2 May 2013 23:35:50 +0000 (17:35 -0600)]
build: always include libvirt_lxc.syms in tarball
On a mingw build, 'make distcheck' fails with:
GEN libvirt_qemu.def
make[3]: *** No rule to make target `../../src/libvirt_lxc.syms', needed by `libvirt_lxc.def'. Stop.
I traced it to a missing entry in EXTRA_DIST. But rather than keep
the entire list in sync, it is easier to list the three syms files
that drive .so files directly, and then reuse existing makefile
variables for the remaining files (that is, I validated that all
remaining files are added to SYM_FILES, possibly via USED_SYM_FILES,
according to makefile conditionals).
The code adaptation is not done right now, but in subsequent patches.
Hence I am not implementing syntax-check rule as it would break
compilation. Developers are strongly advised to use these new macros.
They are similar to VIR_ALLOC() logic: VIR_STRDUP(dst, src) returns zero
on success, -1 otherwise. In case you don't want to report OOM error,
use the _QUIET variant of a macro.
Laine Stump [Fri, 3 May 2013 18:30:55 +0000 (14:30 -0400)]
qemu: fix stupid typos in VFIO cgroup setup/teardown
I must have looked at this a couple dozen times before I noticed it
had "!=" instead of "==". Not doing this setup prevented qemu from
doing anything with the vfio group device.
The previous commit failed to update the XSL to take account
of fact that in XHTML mode the elements need namespace
prefixes. This caused every web page to be blank!
The rule generating the HTML docs passing the --html flag
to xsltproc. This makes it use the legacy HTML parser, which
either ignores or tries to fix all sorts of broken XML tags.
There's no reason why we should be writing broken XML in
the first place, so removing --html and adding the XHTML
doctype to all files forces us to create good XML.
This adds the XHTML doc type and fixes many, many XML tag
problems it exposes.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Guido Günther [Fri, 3 May 2013 06:03:26 +0000 (08:03 +0200)]
Make detect_scsi_host_caps a function on all architectures
In the non linux case some callers like gather_scsi_host_caps needed the
return code of -1 while others like update_caps needed an empty
statement (to avoid a "statement without effect" warning). This is much
simpler solved by using a function instead of a define.
Laine Stump [Thu, 2 May 2013 17:59:52 +0000 (13:59 -0400)]
network: fix network driver startup for qemu:///session
This should resolve https://bugzilla.redhat.com/show_bug.cgi?id=958907
Recent new addition of code to read/write active network state to the
NETWORK_STATE_DIR in the network driver broke startup for
qemu:///session. The network driver had several state file paths
hardcoded to /var, which could never possibly work in session mode.
This patch modifies *all* state files to use a variable string that is
set differently according to whether or not we're running
privileged. (It turns out that logDir was never used, so it's been
completely eliminated.)
There are very definitely other problems preventing dnsmasq and radvd
from running in non-privileged mode, but it's more consistent to have
the directories used by them be determined in the same fashion.
NB: I've noted before that the network driver is storing its state
(including dnsmasq and radvd state) in /var/lib, while qemu stores its
state in /var/run. It would probably have been better if the two
matched, but it's been this way for a long time, and changing it would
break running installations during an upgrade, so it's best to just
leave it as it is.
Fix warning about unsupported cookie flags in QEMU driver
The QEMU migration code unconditionally sets the 'persistent'
cookie flag on the source host. The dest host, however, only
allows it during parsing if VIR_MIGRATE_PERSIST_DEST was
set. Make the source host only set it if this flag is
present.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The lockd plugin for the lock manager was not correctly
handling the release of resource locks. This meant that
during migration, or when pausing a VM, the locks would
not get released. This in turn made it impossible to
resume the domain, or finish migration
The F_DUPFD_CLOEXEC operation with fcntl() expects a single
int argument, specifying the minimum FD number for the newly
dup'd file descriptor. We were not specifying that causing
random stack data to be accessed as the FD number. Sometimes
that worked, sometimes it didn't.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Eric Blake [Thu, 2 May 2013 20:23:02 +0000 (14:23 -0600)]
build: avoid non-portable cast of pthread_t
POSIX says pthread_t is opaque. We can't guarantee if it is scaler
or a pointer, nor what size it is; and BSD differs from Linux.
We've also had reports of gcc complaining on attempts to cast it,
if we use a cast to the wrong type (for example, pointers have to be
cast to void* or intptr_t before being narrowed; while casting a
function return of scalar pthread_t to void* triggers a different
warning).
Give up on casts, and use unions to get at decent bits instead. And
rather than futz around with figuring which 32 bits of a potentially
64-bit pointer are most likely to be unique, convert the rest of
the code base to use 64-bit values when using a debug id.
Based on a report by Guido Günther against kFreeBSD, but with a
fix that doesn't regress commit 4d970fd29 for FreeBSD.
* src/util/virthreadpthread.c (virThreadSelfID, virThreadID): Use
union to get at a decent bit representation of thread_t bits.
* src/util/virthread.h (virThreadSelfID, virThreadID): Alter
signature.
* src/util/virthreadwin32.c (virThreadSelfID, virThreadID):
Likewise.
* src/qemu/qemu_domain.h (qemuDomainJobObj): Alter type of owner.
* src/qemu/qemu_domain.c (qemuDomainObjTransferJob)
(qemuDomainObjSetJobPhase, qemuDomainObjReleaseAsyncJob)
(qemuDomainObjBeginNestedJob, qemuDomainObjBeginJobInternal): Fix
clients.
* src/util/virlog.c (virLogFormatString): Likewise.
* src/util/vireventpoll.c (virEventPollInterruptLocked):
Likewise.
More paranoid initialization of 'nparams' variable in dispatch code
Since the 'nparams' variable passed to virTypedParametersFree is
supposed to represent the size of the 'params' array, it is bad
practice to initialize it to a non-zero value, until the array
has been allocated.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Fix potential use of undefined variable in remote dispatch code
If an early dispatch check caused a jump to the 'cleanup' branch
then virTypeParamsFree() would be called with an uninitialized
'nparams' variable. Fortunately 'params' is initialized to NULL,
so the uninitialized 'nparams' variable would not be used.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The call to virReportError conditionally switched between
two format strings, with different numbers of placeholders.
This meant the format string with no placeholders was not
protected by a "%s".
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Eric Blake [Thu, 2 May 2013 21:53:27 +0000 (15:53 -0600)]
build: fix mingw build of vbox
More fallout from commit 7c9a2d88 dropping too many headers. Fixes:
In file included from ../../src/vbox/vbox_glue.c:26:0:
../../src/vbox/vbox_MSCOMGlue.c: In function 'vboxLookupVersionInRegistry':
../../src/vbox/vbox_MSCOMGlue.c:435:5: error: implicit declaration of function 'virParseVersionString' [-Werror=implicit-function-declaration]
...
../../src/vbox/vbox_driver.c: In function 'vboxConnectOpen':
../../src/vbox/vbox_driver.c:147:5: error: implicit declaration of function 'getuid' [-Werror=implicit-function-declaration]
../../src/vbox/vbox_driver.c:147:5: error: nested extern declaration of 'getuid' [-Werror=nested-externs]
Eric Blake [Thu, 2 May 2013 21:46:19 +0000 (15:46 -0600)]
build: fix mingw build of virprocess.c
Commit 776d49f4 added a static function that is only called
conditionally; leading to this compile error on mingw:
CC libvirt_util_la-virprocess.lo
../../src/util/virprocess.c:624:26: error: 'struct rlimit' declared inside parameter list [-Werror]
../../src/util/virprocess.c:624:26: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
../../src/util/virprocess.c:622:1: error: 'virProcessPrLimit' defined but not used [-Werror=unused-function]
* src/util/virprocess.c (virProcessPrLimit): Only declare
virProcessPrLimit when used.
Eric Blake [Thu, 2 May 2013 21:30:48 +0000 (15:30 -0600)]
build: fix FreeBSD build
Commit 7c9a2d88 cleaned up too many headers; FreeBSD builds
failed due to:
util/virutil.c:556: warning: implicit declaration of function 'canonicalize_file_name'
(Not sure which Linux header leaked this declaration, but gnulib
only guarantees it in stdlib.h)
libvirt.c:956: warning: implicit declaration of function 'virGetUserConfigDirectory'
(Here, a build on Linux was picking up virutil.h indirectly via
one of the conditional driver headers, where that driver was not
being built on my FreeBSD setup)
* src/util/virutil.c (includes): Need <stdlib.h> for
canonicalize_file_name.
* src/libvirt.c (includes): Use "virutil.h" unconditionally,
rather than relying on conditional indirect inclusion.
virutil: Move string related functions to virstring.c
The source code base needs to be adapted as well. Some files
include virutil.h just for the string related functions (here,
the include is substituted to match the new file), some include
virutil.h without any need (here, the include is removed), and
some require both.
It's not desired to force users imagine path for a socket they
are not even supposed to connect to. On the other hand, we
already have a release where the qemu agent socket path is
exposed to XML, so we cannot silently drop it from there.
The new path is generated in form:
Laine Stump [Wed, 1 May 2013 18:44:10 +0000 (14:44 -0400)]
pci: autolearn name of stub driver, remove from arglist
virPCIDeviceReattach and virPCIDeviceUnbindFromStub (called by
virPCIDeviceReattach) had previously required the name of the stub
driver as input. This is unnecessary, because the name of the driver
the device is currently bound to can be found by looking at the link:
/sys/bus/pci/dddd:bb:ss.ff/driver
Instead of requiring that the name of the expected stub driver name
and only unbinding if that one name is matched, we no longer take a
driver name in the arglist for either of these
functions. virPCIDeviceUnbindFromStub just compares the name of the
currently bound driver to a list of "well known" stubs (right now
contains "pci-stub" and "vfio-pci" for qemu, and "pciback" for xen),
and only performs the unbind if it's one of those devices.
This allows virsh nodedevice-reattach to work properly across a
libvirtd restart, and fixes a couple of cases where we were
erroneously still hard-coding "pci-stub" as the drive name.
For some unknown reason, virPCIDeviceReattach had been calling
modprobe on the stub driver prior to unbinding the device. This was
problematic because we no longer know the name of the stub driver in
that function. However, it is pointless to probe for the stub driver
at that time anyway - because the device is bound to the stub driver,
we are guaranteed that it is already loaded, and so that call to
modprobe has been removed.
Eric Blake [Wed, 1 May 2013 20:16:10 +0000 (14:16 -0600)]
spec: collect all BuildRequires into one area
Conditional BuildRequires: should be at the top level, rather
than appearing in conditional sub-package sections. This
appears to be the only offender.
* libvirt.spec.in (BuildRequires): Move libblkid-devel into
correct area.
Commit cc6d19f3 added text containing "<code>snapshot<code>" to
formatsnapshot.html.in. The closing tag is missing '/' which causes
the documentation to misrender.
ESX: Fix DISPATCH_FREE generation code to free all extended objects
Python code generator "generate_source" section that handles
code generation to "free" inherited objects needs to generate
DISPATCH_FREE calls for all extended_by objects.
For s390 we don't want to have a default USB device generated even
if QEMU is silently tolerating -usb on the command line. This may change
in the future.
Another reason to avoid the USB controller is that it implies a PCI
bus which might cause a regression at some later point in time.
The following change will set the USB controller model to 'none'
unless a model or address has been specified, which can be the case
if a legacy definition is loaded or the XML writer knows what
she/he's doing.
Requiring the user to explicitly disable USB on systems not supporting
it seems cumbersome.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Laine Stump [Tue, 30 Apr 2013 18:04:59 +0000 (14:04 -0400)]
qemu: fix failure to start with spice graphics and no tls
Commit eca3fdf inadvertantly caused a failure to start for any domain
with the following in its config:
<graphics type='spice' autoport='yes'/>
The problem is that when tlsPort == 0 and defaultMode == "any" (which
is the default for defaultMode), this would be flagged in the code as
"needTLSPort", and if there was then no spice tls config, the new
error+fail would happen.
This patch checks for the case of defaultMode == "any", and in that
case simply doesn't allocate a TLS port (since that's probably not
what the user wanted, and it would have failed later anyway.). It does
leave the error in place for cases when the user specifically asked to
use tls in one way or another, though.
John Ferlan [Mon, 29 Apr 2013 11:52:49 +0000 (07:52 -0400)]
Resolve valgrind error
As a result of commit id '19c345f2', 'make -C tests valgrind' has the
following for qemuxml2argvtest:
==22482== 197 (80 direct, 117 indirect) bytes in 1 blocks are definitely lost in loss record 101 of 120
==22482== at 0x4A06B6F: calloc (vg_replace_malloc.c:593)
==22482== by 0x4C6F301: virAlloc (viralloc.c:124)
==22482== by 0x4C840FC: virSaveLastError (virerror.c:308)
==22482== by 0x431882: qemuBuildCommandLine (qemu_command.c:8204)
==22482== by 0x41E8F0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:155)
==22482== by 0x41FE9F: virtTestRun (testutils.c:157)
==22482== by 0x419DEB: mymain (qemuxml2argvtest.c:654)
==22482== by 0x4204DA: virtTestMain (testutils.c:719)
==22482== by 0x39D0821A04: (below main) (libc-start.c:225)
==22482==
Jim Fehlig [Mon, 29 Apr 2013 23:04:55 +0000 (17:04 -0600)]
libxl: Fix double-dispose of libxl domain config
libxlBuildDomainConfig() was disposing the libxl_domain_config object
on error, only to have it disposed again by libxlBuildDomainConfig()'s
caller, which resulted in a segfault. Leave disposing of the config
object to it's owner.
Ján Tomko [Mon, 29 Apr 2013 18:01:19 +0000 (20:01 +0200)]
qemu: report an error if memballoon has wrong address type
qemuBuildMemballoonDevStr returns NULL if memballoon doesn't have
the right address type, but it doesn't report an error, leading to:
error: An error occurred, but the cause is unknown
Report a helpful error message instead, e.g.:
error: XML error: memballoon unsupported with address type 'usb'
Ján Tomko [Mon, 29 Apr 2013 17:12:17 +0000 (19:12 +0200)]
virsh: fix incorrect argument errors for long options
For long options, print:
* the option as specified by the user if it's unknown
* the canonical long option if its argument is not
a number (and should be)
And for missing arguments, print both the short and
the long option name.
(Doing only one of those would require either parsing
argv ourselves or let getopt print the errors, since
we can't tell long and short options apart by optopt
or longindex)
Peter Krempa [Mon, 29 Apr 2013 11:41:27 +0000 (13:41 +0200)]
qemu: Error out if spice port autoallocation is requested, but disabled
When a user requests auto-allocation of the spice TLS port but spice TLS
is disabled in qemu.conf, we start the machine and let qemu fail instead
of erroring out sooner.
Peter Krempa [Mon, 22 Apr 2013 09:10:39 +0000 (11:10 +0200)]
network: Don't remove transient network if creating of config file fails
On the off-chance that creation of persistent configuration file would
fail when defining a network that is already started as transient, the
code would remove the transient data structure and thus the network.
This patch changes the code so that in such case, the network is again
marked as transient and left behind.
Laine Stump [Mon, 29 Apr 2013 20:19:19 +0000 (16:19 -0400)]
qemu: put usb cgroup setup in common function
The USB-specific cgroup setup had been inserted inline in
qemuDomainAttachHostUsbDevice and qemuSetupCgroup, but now there is a
common cgroup setup function called for all hostdevs, so it makes sens
to put the usb-specific setup there and just rely on that function
being called.
The one thing I'm uncertain of here (and a reason for not pushing
until after release) is that previously hostdev->missing was checked
only when starting a domain (and cgroup setup for the device skipped
if missing was true), but with this consolidation, it is now checked
in the case of hotplug as well. I don't know if this will have any
practical effect (does it make sense to hotplug a "missing" usb
device?)
Laine Stump [Mon, 29 Apr 2013 17:15:26 +0000 (13:15 -0400)]
qemu: add vfio devices to cgroup ACL when appropriate
PCIO device assignment using VFIO requires read/write access by the
qemu process to /dev/vfio/vfio, and /dev/vfio/nn, where "nn" is the
VFIO group number that the assigned device belongs to (and can be
found with the function virPCIDeviceGetVFIOGroupDev)
/dev/vfio/vfio can be accessible to any guest without danger
(according to vfio developers), so it is added to the static ACL.
The group device must be dynamically added to the cgroup ACL for each
vfio hostdev in two places:
1) for any devices in the persistent config when the domain is started
(done during qemuSetupCgroup())
2) at device attach time for any hotplug devices (done in
qemuDomainAttachHostDevice)
The group device must be removed from the ACL when a device it
"hot-unplugged" (in qemuDomainDetachHostDevice())
Note that USB devices are already doing their own cgroup setup and
teardown in the hostdev-usb specific function. I chose to make the new
functions generic and call them in a common location though. We can
then move the USB-specific code (which is duplicated in two locations)
to this single location. I'll be posting a followup patch to do that.
Ján Tomko [Fri, 26 Apr 2013 16:05:46 +0000 (18:05 +0200)]
qemu: prevent invalid reads in qemuAssignDevicePCISlots
Don't reserve slot 2 for video if the machine has no PCI buses.
Error out when the user specifies a video device without
a PCI address when there are no PCI buses.
(This wouldn't work on a machine with no PCI bus anyway since
we do add PCI addresses for video devices to the command line)
Ján Tomko [Fri, 26 Apr 2013 15:50:36 +0000 (17:50 +0200)]
qemu: don't always reserve PCI addresses for implicit controllers
In the past we automatically added a USB controller and assigned
it a PCI address (0:0:1.2) even on machines without a PCI bus.
This didn't break machines with no PCI bus because the command
line for it is just '-usb', with no mention of the PCI bus.
The implicit IDE controller (reserved address 0:0:1.1) has
no command line at all.
Commit b33eb0dc removed the ability to reserve PCI addresses
on machines without a PCI bus. This made them stop working,
since there would always be the implicit USB controller.
Skip the reservation of addresses for these controllers when
there is no PCI bus, instead of failing.
Laine Stump [Fri, 26 Apr 2013 20:44:05 +0000 (16:44 -0400)]
conf: remove extraneous _TYPE from driver backend enums
This isn't strictly speaking a bugfix, but I realized I'd gotten a bit
too verbose when I chose the names for
VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_*. This shortens them all a bit.
Laine Stump [Fri, 26 Apr 2013 20:23:27 +0000 (16:23 -0400)]
network: support <driver name='vfio'/> in network definitions
I remembered to document this bit, but somehow forgot to implement it.
This adds <driver name='kvm|vfio'/> as a subelement to the <forward>
element of a network (this puts it parallel to the match between
mode='hostdev' attribute in a network and type='hostdev' in an
<interface>).
Since it's already documented, only the parser, formatter, backend
driver recognition (it just translates/moves the flag into the
<interface> at the appropriate time), and a test case were needed.
(I used a separate enum for the values both because the original is
defined in domain_conf.h, which is unavailable from network_conf.h,
and because in the future it's possible that we may want to support
other non-hostdev oriented driver names in the network parser; this
makes sure that one can be expanded without the other).
Paolo Bonzini [Sat, 20 Apr 2013 09:11:25 +0000 (11:11 +0200)]
qemu: launch bridge helper from libvirtd
<source type='bridge'> uses a helper application to do the necessary
TUN/TAP setup to use an existing network bridge, thus letting
unprivileged users use TUN/TAP interfaces.
However, libvirt should be preventing QEMU from running any setuid
programs at all, which would include this helper program. From
a security POV, any setuid helper needs to be run by libvirtd itself,
not QEMU.
This is what this patch does. libvirt now invokes the setuid helper,
gets the TAP fd and then passes it to QEMU in the normal manner.
The path to the helper is specified in qemu.conf.
As a small advantage, this adds a <target dev='tap0'/> element to the
XML of an active domain using <interface type='bridge'>.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Paolo Bonzini [Sat, 20 Apr 2013 09:11:24 +0000 (11:11 +0200)]
virnetdevtap: add virNetDevTapGetName
This will be used on a tap file descriptor returned by the bridge helper
to populate the <target> element, because the helper does not provide
the interface name.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
this patch fix the wrong sequence for fd and timeout register. the sequence
was right in dfa1e1dd for fd register, but it changed in e0622ca2.
in this patch, set priv, xl_priv in info and increase info->priv ref count
before virEventAddHandle. if do this after virEventAddHandle, the fd
callback or fd deregister maybe got the empty priv, xl_priv or wrong ref
count.
after apply this patch, test more than 100 rounds passed compare to fail
within 3 rounds without this patch. each round includes define -> start ->
destroy -> create -> suspend -> resume -> reboot -> shutdown -> save ->
resotre -> dump -> destroy -> create -> setmem -> setvcpus -> destroy.
Laine Stump [Thu, 25 Apr 2013 16:45:55 +0000 (12:45 -0400)]
qemu: set qemu process' RLIMIT_MEMLOCK when VFIO is used
VFIO requires all of the guest's memory and IO space to be lockable in
RAM. The domain's max_balloon is the maximum amount of memory the
domain can have (in KiB). We add a generous 1GiB to that for IO space
(still much better than KVM device assignment, where the KVM module
actually *ignores* the process limits and locks everything anyway),
and convert from KiB to bytes.
In the case of hotplug, we are changing the limit for the already
existing qemu process (prlimit() is used under the hood), and for
regular commandline additions of vfio devices, we schedule a call to
setrlimit() that will happen after the qemu process is forked.
Laine Stump [Thu, 25 Apr 2013 16:16:25 +0000 (12:16 -0400)]
qemu: use new virCommandSetMax(Processes|Files)
These were previously being set in a custom hook function, but now
that virCommand directly supports setting them, we can eliminate that
part of the hook and call the APIs directly.
Laine Stump [Thu, 25 Apr 2013 16:10:10 +0000 (12:10 -0400)]
util: new virCommandSetMax(MemLock|Processes|Files)
This patch adds two sets of functions:
1) lower level virProcessSet*() functions that will immediately set
the RLIMIT_MEMLOCK. RLIMIT_NPROC, or RLIMIT_NOFILE of either the
current process (using setrlimit()) or any other process (using
prlimit()). "current process" is indicated by passing a 0 for pid.
2) functions for virCommand* that will setup a virCommand object to
set those limits at a later time just after it has forked a new
process, but before it execs the new program.
configure.ac has prlimit and setrlimit added to the list of functions
to check for, and the low level functions log an "unsupported" error)
on platforms that don't support those functions.
If a user cgroup name begins with "cgroup.", "_" or with any of
the controllers from /proc/cgroups followed by a dot, then they
need to be prefixed with a single underscore. eg if there is
an object "cpu.service", then this would end up as "_cpu.service"
in the cgroup filesystem tree, however, "waldo.service" would
stay "waldo.service", at least as long as nobody comes up with
a cgroup controller called "waldo".
Since we require a '.XXXX' suffix on all partitions, there is
no scope for clashing with the kernel 'tasks' and 'release_agent'
files.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure all cgroup partitions have a suffix of ".partition"
If the partition named passed in the XML does not already have
a suffix, ensure it gets a '.partition' added to each component.
The exceptions are /machine, /user and /system which do not need
to have a suffix, since they are fixed partitions at the top
level.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Change VM cgroup suffix from '{lxc,qemu}.libvirt' to 'libvirt-{lxc,qemu}'
Recently we changed to create VM cgroups with the naming pattern
$VMNAME.$DRIVER.libvirt. Following discussions with the systemd
community it was decided that only having a single '.' in the
names is preferrable. So this changes the naming scheme to be
$VMNAME.libvirt-$DRIVER. eg for LXC 'mycontainer.libvirt-lxc' or
for KVM 'myvm.libvirt-qemu'.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Eric Blake [Fri, 26 Apr 2013 10:20:29 +0000 (04:20 -0600)]
virsh: suppress aliases in group help
'virsh help | grep nodedev-det' shows only nodedev-detach, but
'virsh help nodedev | grep nodedev-det' also shows the old alias
nodedev-dettach that we intentionally hid in commit af3f9aab.
See also commit 787f4fe and this bug report:
https://bugzilla.redhat.com/show_bug.cgi?id=956966
* tools/virsh.c (vshCmdGrpHelp): Copy suppression of vshCmdHelp.
Laine Stump [Thu, 25 Apr 2013 10:37:21 +0000 (06:37 -0400)]
security: update hostdev labelling functions for VFIO
Legacy kvm style pci device assignment requires changes to the
labelling of several sysfs files for each device, but for vfio device
assignment, the only thing that needs to be relabelled/chowned is the
"group" device for the group that contains the device to be assigned.
Laine Stump [Wed, 24 Apr 2013 18:22:36 +0000 (14:22 -0400)]
virsh: use new virNodeDeviceDetachFlags
The virsh nodedev-detach command has a new --driver option. If it's
given virsh will attempt to use the new virNodeDeviceDetachFlags API
instead of virNodeDeviceDettach. Validation of the driver name string
is left to the hypervisor (qemu accepts "kvm" or "vfio". The only
other hypervisor that implements these functions is xen, and it only
accepts NULL).
Laine Stump [Wed, 24 Apr 2013 18:06:42 +0000 (14:06 -0400)]
xen: implement virNodeDeviceDetachFlags backend
This was the only hypervisor driver other than qemu that implemented
virNodeDeviceDettach. It doesn't currently support multiple pci device
assignment driver backends, but it is simple to plug in this new API,
which will make it easier for Xen people to fill it in later when they
decide to support VFIO (or whatever other) device assignment. Also it
means that management applications will have the same API available to
them for both hypervisors on any given version of libvirt.
The only acceptable value for driverName in this case is NULL, since
there is no alternate, and I'm not willing to pick a name for the
default driver used by Xen.
Laine Stump [Wed, 24 Apr 2013 17:42:04 +0000 (13:42 -0400)]
hypervisor api: implement RPC calls for virNodeDeviceDetachFlags
This requires a custom function for remoteNodeDeviceDetachFlags,
because it is named *NodeDevice, but it goes through the hypervisor
driver rather than nodedevice driver, and so it uses privateData
instead of nodeDevicePrivateData. (It has to go through the hypervisor
driver, because that is the driver that knows about the backend drivers
that will perform the pci device assignment).
Laine Stump [Wed, 24 Apr 2013 16:56:10 +0000 (12:56 -0400)]
hypervisor api: new virNodeDeviceDetachFlags
The existing virNodeDeviceDettach() assumes that there is only a
single PCI device assignment backend driver appropriate for any
hypervisor. This is no longer true, as the qemu driver is getting
support for PCI device assignment via VFIO. The new API
virNodeDeviceDetachFlags adds a driverName arg that should be set to
the exact same string set in a domain <hostdev>'s <driver name='x'/>
element (i.e. "vfio", "kvm", or NULL for default). It also adds a
flags arg for good measure (and because it's possible we may need it
when we start dealing with VFIO's "device groups").
Laine Stump [Tue, 23 Apr 2013 18:53:36 +0000 (14:53 -0400)]
qemu: bind/unbind stub driver according to config <driver name='x'/>
If the config for a device has specified <driver name='vfio'/>,
"backend" in the pci part of the hostdev object will be set to
..._VFIO. In this case, when creating a virPCIDevice set the
stubDriver to "vfio-pci", otherwise set it to "pci-stub". We will rely
on the lower levels to report an error if the vfio driver isn't
loaded.
The detach/attach functions in virpci.c will pay attention to the
stubDriver setting in the device, and bind/unbind the appropriate
driver when preparing hostdevs for the domain.
Note that we don't yet attempt to do anything to mark active any other
devices in the same vfio "group" as a single device that is being
marked active. We do need to do that, but in order to get basic VFIO
functionality testing sooner rather than later, initially we'll just
live with more cryptic errors when someone tries to do that.
Laine Stump [Tue, 23 Apr 2013 18:50:15 +0000 (14:50 -0400)]
pci: keep a stubDriver in each virPCIDevice
This can be set when the virPCIDevice is created and placed on a list,
then used later when traversing the list to determine which stub
driver to bind/unbind for managed devices.
The existing Detach and Attach functions' signatures haven't been
changed (they still accept a stub driver name in the arg list), but if
the arg list has NULL for stub driver and one is available in the
device's object, that will be used. (we may later deprecate and remove
the arg from those functions).
Laine Stump [Thu, 25 Apr 2013 11:58:37 +0000 (07:58 -0400)]
qemu: use vfio-pci on commandline when appropriate
The device option for vfio-pci is nearly identical to that for
pci-assign - only the configfd parameter isn't supported (or needed).
Checking for presence of the bootindex parameter is done separately
from constructing the commandline, similar to how it is done for
pci-assign.
This patch contains tests to check for proper commandline
construction. It also includes tests for parser-formatter-parser
roundtrips (xml2xml), because those tests use the same data files, and
would have failed had they been included before now.
qemu: xml/args tests for VFIO hostdev and <interface type='hostdev'/>
These should be squashed in with the patch that adds commandline
handling of vfio (they would fail at any earlier time).
Laine Stump [Fri, 15 Mar 2013 19:15:14 +0000 (15:15 -0400)]
conf: formatter/parser/RNG/docs for hostdev <driver name='kvm|vfio'/>
A domain's <interface> or <hostdev>, as well as a <network>'s
<forward>, can now have an optional <driver name='kvm|vfio'/>
element. As of this patch, there is no functionality behind this new
knob - this patch adds support to the domain and network
formatter/parser, and to the RNG and documentation.
When the backend is added, legacy KVM PCI device assignment will
continue to be used when no driver name is specified (or if <driver
name='kvm'/> is specified), but if driver name is 'vfio', the new UEFI
Secure Boot compatible VFIO device assignment will be used.
Note that the parser doesn't automatically insert the current default
value of this setting. This is done on purpose because the two
possibilities are functionally equivalent from the guest's point of
view, and we want to be able to automatically start using vfio as the
default (even for existing domains) at some time in the future. This
is similar to what was done with the "vhost" driver option in
<interface>.
Laine Stump [Mon, 18 Mar 2013 19:56:12 +0000 (15:56 -0400)]
conf: put hostdev pci address in a struct
There will soon be other items related to pci hostdevs that need to be
in the same part of the hostdevsubsys union as the pci address (which
is currently a single member called "pci". This patch replaces the
single member named pci with a struct named pci that contains a single
member named "addr".
Laine Stump [Wed, 17 Apr 2013 18:16:28 +0000 (14:16 -0400)]
qemu: detect vfio-pci device and its bootindex parameter
QEMU_CAPS_DEVICE_VFIO_PCI is set if the device named "vfio-pci" is
supported in the qemu binary.
QEMU_CAPS_VFIO_PCI_BOOTINDEX is set if the vfio-pci device supports
the "bootindex" parameter; for some reason, the bootindex parameter
wasn't included in early versions of vfio support (qemu 1.4) so we
have to check for it separately from vfio itself.
Eric Blake [Thu, 25 Apr 2013 20:24:42 +0000 (14:24 -0600)]
build: avoid unsafe functions in libgen.h
POSIX says that both basename() and dirname() may return static
storage (aka they need not be thread-safe); and that they may but
not must modify their input argument. Furthermore, <libgen.h>
is not available on all platforms. For these reasons, you should
never use these functions in a multi-threaded library.
Gnulib instead recommends a way to avoid the portability nightmare:
gnulib's "dirname.h" provides useful thread-safe counterparts. The
obvious dir_name() and base_name() are GPL (because they malloc(),
but call exit() on failure) so we can't use them; but the LGPL
variants mdir_name() (malloc's or returns NULL) and last_component
(always points into the incoming string without modifying it,
differing from basename semantics only on corner cases like the
empty string that we shouldn't be hitting in the first place) are
already in use in libvirt. This finishes the swap over to the safe
functions.
Eric Blake [Thu, 25 Apr 2013 17:22:39 +0000 (11:22 -0600)]
qemu: fix build error with older platforms
Jim Fehlig reported on IRC that older gcc/glibc triggers this warning:
cc1: warnings being treated as errors
qemu/qemu_domain.c: In function 'qemuDomainDefFormatBuf':
qemu/qemu_domain.c:1297: error: declaration of 'remove' shadows a global declaration [-Wshadow]
/usr/include/stdio.h:157: error: shadowed declaration is here [-Wshadow]
make[3]: *** [libvirt_driver_qemu_impl_la-qemu_domain.lo] Error 1
Fix it like we have done in the past (such as commit 2e6322a).
* src/qemu/qemu_domain.c (qemuDomainDefFormatBuf): Avoid shadowing
a function name.