In a full domain config, libvirt allows overriding the normal PCI
vs. PCI Express rules when a device address is explicitly provided
(so, e.g., you can force a legacy PCI device to plug into a PCIe port,
although libvirt would never do that on its own). However, due to a
bug libvirt doesn't give this same leeway when hotplugging devices. On
top of that, current libvirt assumes that *all* devices are legacy
PCI. The result of all this is that it's impossible to hotplug a
device into a PCIe port, even if you manually add the PCI address.
This can all be traced to the function
virDomainPCIAddressEnsureAddr(), and the fact that it calls
virDomainPCIAddressReserveSlot() for manually set addresses, and that
function hardcodes the argument "fromConfig" to false (meaning "this
address was auto-assigned, so it should be subject to stricter
validation").
Since virDomainPCIAddressReserveSlot() is just a one line simple
wrapper around virDomainPCIAddressReserveAddr() (adding in a hardcoded
reserveEntireSlot = true and fromConfig = false), all that's needed to
solve the problem with no unwanted side effects is to replace that
call for virDomainPCIAddressReserveSlot() with a direct call to
virDomainPCIAddressReserveAddr(), but with reserveEntireSlot = true,
fromConfig = true. That's what this patch does.
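A minimal sketch of the change described above. The types and function bodies are simplified stand-ins (the toy* names and the validation logic are assumptions made for illustration, not libvirt's actual code); only the idea of a wrapper that hardcodes fromConfig = false versus a direct call with fromConfig = true comes from the message.

```c
#include <stdbool.h>

/* Hypothetical, simplified model of a slot on a PCI or PCIe port. */
typedef struct {
    bool isPCIe;     /* true if the target port is PCI Express */
    bool reserved;   /* true once a device occupies the slot */
} ToySlot;

/* Stand-in for virDomainPCIAddressReserveAddr(): the strict PCI vs. PCIe
 * check is applied only when the address did NOT come from the config. */
static int
toyReserveAddr(ToySlot *slot, bool devIsPCIe,
               bool reserveEntireSlot, bool fromConfig)
{
    (void)reserveEntireSlot;          /* not modelled in this sketch */
    if (slot->reserved)
        return -1;
    if (!fromConfig && slot->isPCIe != devIsPCIe)
        return -1;                    /* auto-assigned: strict validation */
    slot->reserved = true;
    return 0;
}

/* Stand-in for the one-line wrapper: both flags are hardcoded. */
static int
toyReserveSlot(ToySlot *slot, bool devIsPCIe)
{
    return toyReserveAddr(slot, devIsPCIe, true, false);
}
```

In this model, a legacy PCI device with a manually chosen PCIe slot is rejected by toyReserveSlot() but accepted by toyReserveAddr(slot, false, true, true).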
Laine Stump [Mon, 12 Sep 2016 17:21:10 +0000 (13:21 -0400)]
qemu: fix improper initialization of cgroupControllers bitmap
virQEMUDriverConfigNew() always initializes the bitmap in its
cgroupControllers member to -1 (i.e. all 1's).
Prior to commit a9331394, if qemu.conf had a line with
"cgroup_controllers", cgroupControllers would get reset to 0 before
going through a loop setting a bit for each named cgroup controller.
commit a9331394 left out the "reset to 0" part, so cgroupControllers
would always be -1; if you didn't want a controller included, there
was no longer a way to make that happen.
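The missing step can be illustrated with a small sketch. The controller names, bit positions, and the toyParseControllers() helper are hypothetical; only the "reset to 0, then set one bit per named controller" logic is taken from the description above.

```c
#include <string.h>

/* Hypothetical controller names; the array index is the bit position. */
static const char *const toyControllers[] = { "cpu", "memory", "devices" };
#define TOY_NCONTROLLERS (sizeof(toyControllers) / sizeof(toyControllers[0]))

/* Rebuild *bitmap from the controller names listed in the config file.
 * The caller is assumed to have pre-initialized *bitmap to -1 (all 1's). */
static void
toyParseControllers(int *bitmap, const char *const *names, size_t nnames)
{
    size_t i, j;

    *bitmap = 0;   /* the "reset to 0" step that was left out */
    for (i = 0; i < nnames; i++) {
        for (j = 0; j < TOY_NCONTROLLERS; j++) {
            if (strcmp(names[i], toyControllers[j]) == 0)
                *bitmap |= 1 << j;
        }
    }
}
```

Without the reset, every bit stays set no matter which names are listed, which matches the symptom described.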
This was discovered by users who were using qemu commandline
passthrough to use the "input-linux" method of directing
keyboard/mouse input to a virtual machine:
Thanks to sL1pKn07 SpinFlo <sl1pkn07@gmail.com> for bringing the
problem up in IRC, and then taking the time to do a git bisect and
find the patch that started the problem.
qemu: Add the ability to hotplug the TLS X.509 environment
added a parameter "bool listen" in some methods. This
unfortunately clashes with the listen() method, causing
compile failures on certain platforms (RHEL-6 for example)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
John Ferlan [Tue, 6 Sep 2016 21:00:30 +0000 (17:00 -0400)]
storage: Need to properly read the crypt offset value
Commit id 'a48c7141' altered how to determine if a volume was encrypted
by adding a peek at an offset into the file at a specific buffer location.
Unfortunately, all that was compared was the first "char" of the buffer
against the expected "int" value.
Restore the virReadBufInt32BE to get the complete field in order to
compare against the expected value from the qcow2EncryptionInfo or
qcow1EncryptionInfo "modeValue" field.
This restores the capability to create a volume with encryption, then
refresh the pool, and still find the encryption for the volume.
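The difference between the two comparisons can be sketched as follows. The toy* helpers are illustrative stand-ins (toyReadInt32BE mimics what virReadBufInt32BE is described as doing); the buffer layout is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit big-endian integer from buf at the given byte offset. */
static uint32_t
toyReadInt32BE(const unsigned char *buf, size_t offset)
{
    return ((uint32_t)buf[offset] << 24) |
           ((uint32_t)buf[offset + 1] << 16) |
           ((uint32_t)buf[offset + 2] << 8) |
           (uint32_t)buf[offset + 3];
}

/* Buggy comparison: only the first byte of the field is examined. */
static int
toyMatchFirstByte(const unsigned char *buf, size_t offset, int modeValue)
{
    return buf[offset] == modeValue;
}

/* Fixed comparison: the complete 4-byte field is examined. */
static int
toyMatchField(const unsigned char *buf, size_t offset, int modeValue)
{
    return toyReadInt32BE(buf, offset) == (uint32_t)modeValue;
}
```

For a big-endian field holding the value 1, the bytes are 00 00 00 01, so the first-byte check fails while the full-field check succeeds.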
John Ferlan [Tue, 6 Sep 2016 20:52:36 +0000 (16:52 -0400)]
storage: Need to refresh secret for luks volume after volume refresh
A LUKS volume uses the volume secret type just like the QCOW2 secret, so
adjust the loading of the default secrets to handle any volume that the
virStorageFileGetMetadataFromBuf code has deemed to be an encrypted volume
to search for the volume's secret. This lookup is done by volume usage
where the usage is expected to be the path to volume.
When migration fails, we need to poke QEMU monitor to check for a reason
of the failure. We did this using query-migrate QMP command, which is
not supposed to return any meaningful result on the destination side.
Thus if the monitor was still functional when we detected the migration
failure, parsing the answer from query-migrate always failed with the
following error message:
"info migration reply was missing return status"
This irrelevant message was then used as the reason for the migration
failure replacing any message we might have had.
Let's use harmless query-status for poking the monitor to make sure we
only get an error if the monitor connection is broken.
John Ferlan [Tue, 6 Sep 2016 21:20:30 +0000 (17:20 -0400)]
util: Quiet the logging if perf file doesn't exist
Commit id 'b00d7f29' shifted the opening of the /sys/devices/intel_cqm/type
file from event enable to perf event initialization. If the file did not
exist, then an error would be written to the domain log:
2016-09-06 20:51:21.677+0000: 7310: error : virFileReadAll:1360 : Failed to open file '/sys/devices/intel_cqm/type': No such file or directory
Since the error is now handled in virPerfEventEnable by checking if the
event_attr->attrType == 0 for CMT, MBML, and MBMT events - we can just
use the Quiet API in order to not log the error we're going to throw away.
Additionally, rather than using virReportSystemError, use virReportError
and VIR_ERR_ARGUMENT_UNSUPPORTED in order to signify that support isn't there
for that type of perf event - adjust the error message as well.
Implement support for "virsh cpu-compare" so that we can calculate the
common cpu element between a pool of hosts, which requires
providing a host cpu description.
Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Parse libxl_hwcap accounting for versions since Xen 4.4 - Xen 4.7.
libxl_hwcaps is a set of cpuid leaves output that is described in [0] or
[1] in Xen 4.7. This is a collection of CPUID leaves that we version
in libvirt whenever feature words are reordered or added. Thus we keep the
common ones in one struct and others for each version. Since
libxl_hwcaps doesn't appear to have a stable format across all supported
versions, we need to keep track of changes as a compromise until it's
exported in the xen libxl API. For that reason we don't fail driver
initialization if parsing of hwcaps fails. In addition, change the notation
of the PAE feature so that it is easier to see which bit it corresponds to.
Add support for describing cpu topology in host cpu element. In doing
so, refactor hwcaps part to its own helper namely libxlCapsInitCPU to
handle all host cpu related operations, including topology.
Peter Krempa [Mon, 5 Sep 2016 16:12:00 +0000 (18:12 +0200)]
qemu: hotplug: Don't wait if cdrom tray is opened forcibly
Qemu always opens the tray if forced to. Skip the waiting step in such
case.
This also helps if qemu does not report the tray change event when
opening the cdrom forcibly (the documentation says that the event will
not be sent although qemu in fact does trigger it even if @force is
selected).
This is a workaround for a qemu issue where qemu does not send the tray
change event in some cases (after migration with empty closed locked
drive) and thus renders the cdrom useless from libvirt's point of view.
Peter Krempa [Mon, 5 Sep 2016 13:50:18 +0000 (15:50 +0200)]
qemu: domain: Clear startup policy for dropped removable media
When a source image is dropped when missing due to startup policy the
policy needs to be cleared since it was relevant only for the given
storage source. New sources need to update it if needed.
QEMU added another virtio-net tunable [1]. It basically allows
users to set the size of RX virtio ring. But because virtio-net
uses two separate ring buffers to pass data from/to guest they
named it explicitly rx_queue_size. We should expose it in our XML
too.
John Ferlan [Thu, 14 Jul 2016 19:09:08 +0000 (15:09 -0400)]
conf: Add new secret type "tls"
Add a new secret usage type known as "tls" - it will handle adding the
secret objects for various TLS objects that need to provide some sort
of passphrase in order to access the credentials.
Once defined and a passphrase set, future patches will allow the UUID
to be set in the qemu.conf file and thus used as a secret for various
TLS options such as a chardev serial TCP connection, a NBD client/server
connection, and migration.
John Ferlan [Mon, 13 Jun 2016 16:30:34 +0000 (12:30 -0400)]
qemu: Add the ability to hotplug the TLS X.509 environment
If the incoming XML defined a path to a TLS X.509 certificate environment,
add the necessary 'tls-creds-x509' object to the VIR_DOMAIN_CHR_TYPE_TCP
character device.
Likewise, if the environment exists the hot unplug needs adjustment as
well. Note that all the return ret were changed to goto cleanup since
the cfg needs to be unref'd
John Ferlan [Thu, 9 Jun 2016 22:30:55 +0000 (18:30 -0400)]
qemu: Add support for TLS X.509 path to TCP chardev backend
When building a chardev device string for tcp, add the necessary pieces to
provide the TLS X.509 path to qemu. This includes generating the
'tls-creds-x509' object and then adding the 'tls-creds' parameter to the
VIR_DOMAIN_CHR_TYPE_TCP command line.
Finally add the tests for the qemu command line. This test will make use
of the "new(ish)" /etc/pki/qemu setting for a TLS certificate environment
by *not* "resetting" the chardevTLSx509certdir prior to running the test.
Also use the default "verify" option (which is "no").
John Ferlan [Tue, 14 Jun 2016 19:52:37 +0000 (15:52 -0400)]
conf: Introduce chartcp_tls_x509_cert_dir
Add a new TLS X.509 certificate type - "chardev". This will handle the
creation of a TLS certificate capability (and possibly repository) for
properly configured character device TCP backends.
Unlike the vnc and spice there is no "listen" or "passwd" associated. The
credentials eventually will be handled via a libvirt secret provided to
a specific backend.
John Ferlan [Tue, 14 Jun 2016 18:14:31 +0000 (14:14 -0400)]
conf: Add new default TLS X.509 certificate default directory
Rather than specify perhaps multiple TLS X.509 certificate directories,
let's create a "default" directory which can then be used if the service
(e.g. for now vnc and spice) does not supply a default directory.
Since the default for vnc and spice may have existed before without being
supplied, the default check will first check if the service specific path
exists and if so, set the cfg entry to that; otherwise, the default will
be set to the (now) new defaultTLSx509certdir.
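The fallback rule above can be sketched roughly as below. The toyPickCertDir() helper and the paths passed to it are hypothetical; the real code works on the driver config structure rather than on plain strings.

```c
#include <unistd.h>

/* Pick the certificate directory for a service: prefer the
 * service-specific path if it exists on disk, otherwise fall back to
 * the shared default directory. */
static const char *
toyPickCertDir(const char *serviceDir, const char *defaultDir)
{
    if (serviceDir && access(serviceDir, F_OK) == 0)
        return serviceDir;
    return defaultDir;
}
```

This keeps existing setups working: a host that already has a service-specific directory continues to use it, and only hosts without one pick up the new default.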
Additionally add a "default_tls_x509_verify" entry which can also be used
to force the peer verification option (for vnc it's a x509verify option).
Add/alter the macro for the option being found in the config file to accept
the default value.
qemu: Remove stale transient def when migration fails
If a migration of a domain which is already defined on the destination
host failed early (before we tried to start QEMU), we would forget to
remove the incoming transient definition. Later on when someone starts
the domain on the destination host, we will use the stale incoming
definition and the persistent definition will just be ignored.
The code for replacing domain's transient definition with the persistent
one is repeated in several places and we'll need to add one more. Let's
make a nice helper for it.
When using
virsh net-event non-existing-net
the error message says that 'either --list or event type is required'
This is misleading as 'virsh net-event $valid-event-type' is not going
to work either. What is expected is 'virsh net-event --event
$valid-event-type'
This commit fixes the string in pool-event, nodedev-event, event, and
net-event.
Julio Faracco [Wed, 7 Sep 2016 21:43:53 +0000 (18:43 -0300)]
security: Fixing wrong label in virt-aa-helper.c.
There is an issue with a wrong label inside vah_add_path().
The compilation fails with the error:
make[3]: Entering directory '/tmp/libvirt/src'
CC security/virt_aa_helper-virt-aa-helper.o
security/virt-aa-helper.c: In function 'vah_add_path':
security/virt-aa-helper.c:769:9: error: label 'clean' used but not defined
goto clean;
This patch moves 'clean' label to 'cleanup' label.
Signed-off-by: Julio Faracco <jcfaracco@gmail.com>
Rufo Dogav [Wed, 24 Aug 2016 23:15:29 +0000 (00:15 +0100)]
Avoid segfault in virt-aa-helper when handling read-only filesystems
This patch fixes a segfault in virt-aa-helper caused by attempting to
modify a static string literal. It is triggered when a domain has a
<filesystem> with type='mount' configured read-only and libvirt is
using the AppArmor security driver for sVirt confinement. An "R" is
passed into the function and converted to 'r'.
Yanqiu Zhang [Thu, 25 Aug 2016 02:49:55 +0000 (10:49 +0800)]
storage: Delete extra wrap after vol-resize error
This patch is to delete the extra wrap "\n" after failed vol-resize
error for both "Failed to change size of volume to" and "Failed to change
size of volume by". For error with wrap, there will be an extra wrap
between two errors, such as:
(1)# virsh vol-resize --pool default --vol vol-test 5M
error: Failed to change size of volume 'vol-test' to 5M
error: invalid argument: Can't shrink capacity below current capacity unless shrink flag explicitly specified
(2)# virsh vol-resize /var/lib/libvirt/images/volds --shrink --delta 10M
error: Failed to change size of volume 'volds' by 10M
Peter Krempa [Wed, 7 Sep 2016 11:20:00 +0000 (13:20 +0200)]
qemu: process: Fix start with unpluggable vcpus with NUMA pinning
Similarly to vcpu hotplug the emulator thread cgroup numa mapping needs
to be relaxed while hot-adding vcpus so that the threads can allocate
data in the DMA zone.
Peter Krempa [Wed, 7 Sep 2016 11:11:59 +0000 (13:11 +0200)]
qemu: cgroup: Extract temporary relaxing of cgroup setting for vcpu hotplug
When hot-adding vcpus qemu needs to allocate some structures in the DMA
zone which may be outside of the numa pinning. Extract the code doing
this in a set of helpers so that it can be reused.
Erik Skultety [Mon, 5 Sep 2016 11:51:21 +0000 (13:51 +0200)]
virt-admin: Output srv-clients-set data as unsigned int rather than signed
Unfortunately, commit a8962f70 only fixed the first half of the reported issue of
virt-admin outputting negative values where unsigned int is expected by
BZ below, so this commit represents the other missing half of the fix.
Maxim Nestratov [Mon, 6 Jun 2016 14:42:16 +0000 (17:42 +0300)]
util: fix crash in virClassIsDerivedFrom for CloseCallbacks objects
There is a possibility that qemu driver frees by unreferencing its
closeCallbacks pointer as it has the only reference to the object,
while in fact not all users of CloseCallbacks called their
virCloseCallbacksUnset.
Backtrace is the following:
Thread #1:
0 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
1 in virCondWait (c=<optimized out>, m=<optimized out>)
at util/virthread.c:154
2 in virThreadPoolFree (pool=0x7f0810110b50)
at util/virthreadpool.c:266
3 in qemuStateCleanup () at qemu/qemu_driver.c:1116
4 in virStateCleanup () at libvirt.c:808
5 in main (argc=<optimized out>, argv=<optimized out>)
at libvirtd.c:1660
Thread #2:
0 in virClassIsDerivedFrom (klass=0xdeadbeef, parent=0x7f0837c694d0) at util/virobject.c:169
1 in virObjectIsClass (anyobj=anyobj@entry=0x7f08101d4760, klass=<optimized out>) at util/virobject.c:365
2 in virObjectLock (anyobj=0x7f08101d4760) at util/virobject.c:317
3 in virCloseCallbacksUnset (closeCallbacks=0x7f08101d4760, vm=vm@entry=0x7f08101d47b0, cb=cb@entry=0x7f081d078fc0 <qemuProcessAutoDestroy>) at util/virclosecallbacks.c:163
4 in qemuProcessAutoDestroyRemove (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0) at qemu/qemu_process.c:6368
5 in qemuProcessStop (driver=driver@entry=0x7f081018be50, vm=vm@entry=0x7f08101d47b0, reason=reason@entry=VIR_DOMAIN_SHUTOFF_SHUTDOWN, asyncJob=asyncJob@entry=QEMU_ASYNC_JOB_NONE, flags=flags@entry=0) at qemu/qemu_process.c:5854
6 in processMonitorEOFEvent (vm=0x7f08101d47b0, driver=0x7f081018be50) at qemu/qemu_driver.c:4585
7 qemuProcessEventHandler (data=<optimized out>, opaque=0x7f081018be50) at qemu/qemu_driver.c:4629
8 in virThreadPoolWorker (opaque=opaque@entry=0x7f0837c4f820) at util/virthreadpool.c:145
9 in virThreadHelper (data=<optimized out>) at util/virthread.c:206
10 in start_thread () from /lib64/libpthread.so.0
Let's reference CloseCallbacks object in virCloseCallbacksSet and
unreference in virCloseCallbacksUnset.
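The idea of the fix, sketched with a hypothetical refcounted object (ToyCallbacks and the toy* functions are stand-ins, not the virObject API): each registered callback holds its own reference, so the driver dropping its reference during cleanup cannot free the object while a callback is still set.

```c
#include <stdlib.h>

/* Hypothetical refcounted object standing in for CloseCallbacks. */
typedef struct {
    int refs;         /* reference count */
    int ncallbacks;   /* number of callbacks currently registered */
} ToyCallbacks;

static ToyCallbacks *
toyNew(void)
{
    ToyCallbacks *cb = calloc(1, sizeof(*cb));
    if (cb)
        cb->refs = 1;   /* the creator (the driver) owns one reference */
    return cb;
}

/* Drop one reference; free on the last one. Returns 1 if freed. */
static int
toyUnref(ToyCallbacks *cb)
{
    if (--cb->refs > 0)
        return 0;
    free(cb);
    return 1;
}

/* Registering a callback takes a reference ... */
static void
toySet(ToyCallbacks *cb)
{
    cb->refs++;
    cb->ncallbacks++;
}

/* ... and unregistering it gives that reference back. */
static int
toyUnset(ToyCallbacks *cb)
{
    cb->ncallbacks--;
    return toyUnref(cb);
}
```

With this scheme the order seen in the backtrace (driver cleanup first, callback removal second) is safe: the object is only freed by the final toyUnset().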
Make sure sys/types.h is included after sys/sysmacros.h
In the latest glibc, major() and minor() functions are marked as
deprecated (glibc commit dbab6577):
CC util/libvirt_util_la-vircgroup.lo
util/vircgroup.c: In function 'virCgroupGetBlockDevString':
util/vircgroup.c:768:5: error: '__major_from_sys_types' is deprecated:
In the GNU C Library, `major' is defined by <sys/sysmacros.h>.
For historical compatibility, it is currently defined by
<sys/types.h> as well, but we plan to remove this soon.
To use `major', include <sys/sysmacros.h> directly.
If you did not intend to use a system-defined macro `major',
you should #undef it after including <sys/types.h>.
[-Werror=deprecated-declarations]
if (virAsprintf(&ret, "%d:%d ", major(sb.st_rdev), minor(sb.st_rdev)) < 0)
^~
In file included from /usr/include/features.h:397:0,
from /usr/include/bits/libc-header-start.h:33,
from /usr/include/stdio.h:28,
from ../gnulib/lib/stdio.h:43,
from util/vircgroup.c:26:
/usr/include/sys/sysmacros.h:87:1: note: declared here
__SYSMACROS_DEFINE_MAJOR (__SYSMACROS_FST_IMPL_TEMPL)
^
Moreover, in the glibc commit, there's suggestion to keep
ordering of including of header files as implemented here.
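A small sketch of the include pattern, assuming a glibc system where <sys/sysmacros.h> is available. The toyFormatDev() helper is hypothetical; it only mirrors the "%d:%d " formatting seen in the compiler output above.

```c
#include <stdio.h>
#include <sys/sysmacros.h>   /* major(), minor() and makedev() on glibc */
#include <sys/types.h>       /* dev_t; included after sysmacros.h */

/* Format a device number as "major:minor " into buf.
 * Returns the value returned by snprintf(). */
static int
toyFormatDev(char *buf, size_t buflen, dev_t rdev)
{
    return snprintf(buf, buflen, "%u:%u ",
                    (unsigned int)major(rdev),
                    (unsigned int)minor(rdev));
}
```

Including <sys/sysmacros.h> directly means the code no longer relies on the deprecated definitions pulled in through <sys/types.h>.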
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Nishith Shah [Tue, 6 Sep 2016 12:04:37 +0000 (12:04 +0000)]
tools: Pass opaque data in vshCompleter and introduce autoCompleteOpaque
This patch changes the signature of vshCompleters, allowing us to pass along
some data that a completer might need; for example,
we might want to pass the autocomplete vshControl along with the
completer, in case the completer requires a connection to libvirtd.
Signed-off-by: Nishith Shah <nishithshah.2211@gmail.com>
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Current implementation uses the dev.cpu.0.freq sysctl that is
provided by the cpufreq(4) framework and returns the actual
CPU frequency. However, there are environments where it's not available,
e.g. when running nested in KVM. In this case fall back to hw.clockrate
that reports CPU frequency at the boot time.
Having After=libvirtd.service merely ensures that, if both
services are asked to start, libvirtd.service will start
first.
What we really want is for libvirtd.service to be started
whenever libvirt-guests.service is asked to start. Adding a
Requires= relationship guarantees that will happen.
We use a separate line for each After= relationship in other
unit files: do the same here for consistency's sake, and also
to make future changes nicer to diff
libvirt-guests.service does both suspend *and* resume guests,
depending on whether it's being started or stopped: the
description should reflect this, to avoid confusing messages
during startup.
Replace "active" with "running" (to match virsh list's output)
and don't capitalize libvirt.
virtlogd.socket: Tie lifecycle to libvirtd.service
We already guarantee that virtlogd.socket is enabled/disabled
along with libvirtd.service, but if libvirtd.service has just
been installed and is started before rebooting, then
virtlogd.socket will not be running and guest startup will
fail.
Add Requires=virtlogd.socket to libvirtd.service to make sure
virtlogd.socket is always started along with libvirtd.service,
and add Before=libvirtd.service to both virtlogd.socket and
virtlogd.service so that virtlogd never disappears before
libvirtd has exited.
Also add PartOf=libvirtd.service to both virtlogd.socket and
virtlogd.service, so that virtlogd can be shut down when not
needed.
qemu: allow turning off QEMU guest RAM dump globally
We already have the ability to turn off dumping of guest
RAM via the domain XML. This is not particularly useful
though, as it is under control of the management application.
What is needed is a way for the sysadmin to turn off guest
RAM defaults globally, regardless of whether the mgmt app
provides its own way to set this in the domain XML.
So this adds a 'dump_guest_core' option in /etc/libvirt/qemu.conf
which defaults to false, i.e. guest RAM will never be included in
the QEMU core dumps by default. This default is different from
historical practice, but is considered to be more suitable as
a default because
a) guest RAM can be huge and so inflicts a DOS on the host
I/O subsystem when dumping core for QEMU crashes
b) guest RAM can contain a lot of sensitive data belonging
to the VM owner. This should not generally be copied
around inside QEMU core dumps submitted to vendors for
debugging
c) guest RAM contents are rarely useful in diagnosing
QEMU crashes
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: add a max_core setting to qemu.conf for core dump size
Currently the QEMU processes inherit their core dump rlimit
from libvirtd, which is really suboptimal. This change allows
their limit to be directly controlled from qemu.conf instead.
With current perf framework, this patch adds support and documentation
for more perf events, including cache misses, cache references, cpu cycles,
and instructions.
Qiaowei Ren [Wed, 3 Aug 2016 17:23:31 +0000 (13:23 -0400)]
perf: Adjust the perf initialization
Introduce a static attr table and refactor virPerfEventEnable() for
general purpose usage.
This patch creates a static table/matrix that converts the VIR_PERF_EVENT_*
events into their respective "attr.type" and "attr.config" so that
virPerfEventEnable doesn't need the switch; the calling function passes
the 'type' by value.
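A table-driven lookup of this kind might look roughly like the sketch below. The ToyEvent identifiers stand in for VIR_PERF_EVENT_*, and the numeric attribute values are placeholders rather than real perf constants.

```c
#include <stddef.h>

/* Hypothetical event identifiers standing in for VIR_PERF_EVENT_*. */
typedef enum {
    TOY_EVENT_CMT = 0,
    TOY_EVENT_CACHE_MISSES,
    TOY_EVENT_CPU_CYCLES,
    TOY_EVENT_LAST
} ToyEvent;

/* The pair of values that would be copied into the perf attr. */
typedef struct {
    unsigned int attrType;
    unsigned long long attrConfig;
} ToyEventAttr;

/* Static table indexed by event; the numbers are placeholders. */
static const ToyEventAttr toyAttrs[TOY_EVENT_LAST] = {
    [TOY_EVENT_CMT]          = { 0, 1 },
    [TOY_EVENT_CACHE_MISSES] = { 7, 10 },
    [TOY_EVENT_CPU_CYCLES]   = { 7, 20 },
};

/* Look up the attributes for an event; NULL if the event is unknown. */
static const ToyEventAttr *
toyLookupAttr(int event)
{
    if (event < 0 || event >= TOY_EVENT_LAST)
        return NULL;
    return &toyAttrs[event];
}
```

Adding a new event then only requires a new enum value and a new table row, with no change to the enabling function.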
qemu: Filter cur_balloon ABI check for certain transactions
Since the domain lock is not held during preparation of an external XML
config, it is possible that the value can change resulting in unexpected
failures during ABI consistency checking for some save and migrate
operations.
This patch adds a new flag to skip the checking of the cur_balloon value
and then sets the destination value to the source value to ensure
subsequent checks without the skip flag will succeed.
This way it is protected from forgery and is kept up to date too.
Signed-off-by: Nikolay Shirokovskiy <nshirokovskiy@virtuozzo.com>
Bob Liu [Thu, 18 Aug 2016 02:20:48 +0000 (10:20 +0800)]
libxl: support serial list
Add support for multiple serial devices. After this patch virsh can be used to
connect to different serial devices of running domains, e.g.
virsh # console <xxx> --devname serial<xxx>
Note:
This depends on a xen/libxl bug fix to have libxl_console_get_tty(...) correctly
returning the tty path (as opposed to always returning the first one).
[0] https://lists.xen.org/archives/html/xen-devel/2016-08/msg00438.html
Jim Fehlig [Tue, 2 Aug 2016 03:36:45 +0000 (21:36 -0600)]
virpci: support driver_override sysfs interface
libvirt uses the new_id PCI sysfs interface to bind a PCI stub driver
to a PCI device. The new_id interface is known to be buggy and racy,
hence a more deterministic interface was introduced in the 3.12 kernel:
driver_override. For more details see
This patch adds support for the driver_override interface by
- adding new virPCIDevice{BindTo,UnbindFrom}StubWithOverride functions
that use the driver_override interface
- renaming the existing virPCIDevice{BindTo,UnbindFrom}Stub functions
to virPCIDevice{BindTo,UnbindFrom}StubWithNewid to preserve existing
behavior on the new_id interface
- changing the virPCIDevice{BindTo,UnbindFrom}Stub functions to call one of
the above depending on availability of driver_override
The patch includes a bit of duplicate code, but allows for easily
dropping the new_id code once support for older kernels is no
longer desired.
Xian Han Yu [Mon, 15 Aug 2016 04:22:25 +0000 (06:22 +0200)]
conf: Fix initialization value of 'multi' in PCI address
The 'multi' element in the PCI address struct is used as a
'virTristateSwitch', and its default value is 'VIR_TRISTATE_SWITCH_ABSENT'.
The current PCI code uses 'false' to initialize 'multi', which is
ambiguous for assignment or comparison. This patch uses '{0}' to
initialize the whole PCI address struct, which fixes the 'multi'
initialization and makes the code simpler and more explicit.
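The initialization pattern can be illustrated with simplified stand-in types (ToyTristate and ToyPCIAddr are hypothetical; the only assumption carried over is that the "absent" value of the tristate is 0).

```c
/* Hypothetical tristate where "absent" is the zero value. */
typedef enum {
    TOY_SWITCH_ABSENT = 0,
    TOY_SWITCH_ON,
    TOY_SWITCH_OFF
} ToyTristate;

/* Simplified stand-in for the PCI address struct. */
typedef struct {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
    ToyTristate multi;
} ToyPCIAddr;

/* Zero-initialize every member at once; 'multi' becomes
 * TOY_SWITCH_ABSENT without being named explicitly. */
static ToyPCIAddr
toyAddrInit(void)
{
    ToyPCIAddr addr = { 0 };
    return addr;
}
```

Because the whole struct is zeroed, no member can be forgotten and no boolean literal is assigned to an enum-typed field.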
Signed-off-by: Xian Han Yu <xhyubj@linux.vnet.ibm.com>
Michal Privoznik [Wed, 31 Aug 2016 10:52:11 +0000 (12:52 +0200)]
tools: Don't list virsh-* under EXTRA_DIST
When we wanted to break the huge and unmaintainable virsh into
smaller files, the first thing we did was to just move funcs into
virsh-*.c files and then #include them from virsh. Having it done
this way we also needed to have them listed under EXTRA_DIST.
However, things got changed since then and now all the virsh-*.c
files are proper source files. Therefore they are listed under
virsh_SOURCES too. But for some reason we forgot to remove them
from EXTRA_DIST.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Jim Fehlig [Mon, 29 Aug 2016 16:08:01 +0000 (10:08 -0600)]
libxl: advertise support for migration V3
The libxl driver has long supported migration V3 but has never
indicated so in the connectSupportsFeature API. As a result, apps
such as virt-manager that use the more generic virDomainMigrate API
fail with
libvirtError: this function is not supported by the connection driver:
virDomainMigrate
Add VIR_DRV_FEATURE_MIGRATION_V3 to the list of features marked as
supported in the connectSupportsFeature API.
Test 12 from objecteventtest (createXML add event) segfaults on FreeBSD
with bus error.
At some point it calls testNodeDeviceDestroy() from the test driver. And
it fails when it tries to unlock the device in the "out:" label of this
function.
Unlocking fails because the previous step was a call to
virNodeDeviceObjRemove from conf/node_device_conf.c. This function
removes the given device from the device list and cleans up the object,
including destroying of its mutex. However, it does not nullify the pointer
that was given to it.
As a result, we end up in testNodeDeviceDestroy() here:
out:
if (obj)
virNodeDeviceObjUnlock(obj);
And instead of skipping this, we try to do Unlock and fail because of
malformed mutex.
Change virNodeDeviceObjRemove to use double pointer and set pointer to
NULL.
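A minimal sketch of the double-pointer pattern, using a hypothetical ToyObj in place of the node device object (the real function also removes the object from the device list, which is not modelled here).

```c
#include <stdlib.h>

/* Hypothetical object standing in for the node device object. */
typedef struct {
    int locked;
} ToyObj;

/* Free the object and clear the caller's pointer so that it cannot be
 * used (or unlocked) afterwards. */
static void
toyObjRemove(ToyObj **obj)
{
    if (!obj || !*obj)
        return;
    free(*obj);
    *obj = NULL;
}

/* Mimics the flow of the destroy function: remove the object, then run
 * the cleanup path. Returns 1 if an unlock was attempted, 0 if it was
 * skipped, -1 on allocation failure. */
static int
toyDestroy(void)
{
    ToyObj *obj = calloc(1, sizeof(*obj));
    int unlocked = 0;

    if (!obj)
        return -1;
    obj->locked = 1;

    toyObjRemove(&obj);     /* obj is NULL from here on */

    /* equivalent of the "out:" label */
    if (obj) {
        obj->locked = 0;
        unlocked = 1;
    }
    return unlocked;
}
```

Since the pointer is cleared by the callee, the existing "if (obj)" guard in the cleanup path does the right thing without further changes.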
Peter Krempa [Thu, 25 Aug 2016 19:30:21 +0000 (15:30 -0400)]
qemu: driver: Validate configuration when setting maximum vcpu count
Setting vcpu count when cpu topology is specified may result in an
invalid configuration. Since the topology can't be modified, reject the
setting if it doesn't match the requested topology. This will allow
fixing the topology in case it was broken.
Peter Krempa [Thu, 25 Aug 2016 18:53:06 +0000 (14:53 -0400)]
qemu: driver: Fix qemuDomainHelperGetVcpus for sparse vcpu topologies
ce43cca0e refactored the helper to prepare it for sparse topologies but
forgot to fix the iterator used to fill the structures. This would
result in a weirdly sparsely populated array and a possible out of bounds
access and crash once sparse vcpu topologies were allowed.
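The kind of iterator mix-up described here can be avoided by keeping two separate indexes, as in this sketch. ToyVcpu, ToyVcpuInfo, and toyGetVcpus() are hypothetical simplifications of the real structures.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vcpu definition: only the online state matters here. */
typedef struct {
    bool online;
} ToyVcpu;

/* Hypothetical output record carrying the vcpu number it refers to. */
typedef struct {
    unsigned int number;
} ToyVcpuInfo;

/* Fill 'info' (capacity maxinfo) with one entry per online vcpu.
 * 'i' walks the possibly sparse definition array, while 'n' is the
 * position in the densely packed output. Returns the entries written. */
static size_t
toyGetVcpus(const ToyVcpu *vcpus, size_t nvcpus,
            ToyVcpuInfo *info, size_t maxinfo)
{
    size_t i, n = 0;

    for (i = 0; i < nvcpus && n < maxinfo; i++) {
        if (!vcpus[i].online)
            continue;
        info[n].number = (unsigned int)i;   /* vcpu id, not output slot */
        n++;
    }
    return n;
}
```

Using 'i' to index the output would leave holes in the array and could write past its end, while a consumer that prints the output position instead of the stored number would report the wrong vcpu.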
Peter Krempa [Thu, 25 Aug 2016 18:48:52 +0000 (14:48 -0400)]
virsh: vcpuinfo: Report vcpu number from the structure rather than its position
virVcpuInfo contains the vcpu number that the data refers to. Report
what's returned by the daemon rather than the sequence number as with
sparse vcpu topologies they won't match.
Olga Krishtal [Thu, 18 Aug 2016 11:57:14 +0000 (14:57 +0300)]
vz: fixed race in vzDomainAttach/DettachDevice
While detaching/attaching a device in OpenStack, nova
calls vzDomainDettachDevice twice, because the update of the internal
configuration of the ct comes a bit later than the update event.
As a result, we suffer from the second call to detach the same device.
Signed-off-by: Olga Krishtal <okrishtal@virtuozzo.com>
libvirt-python passes parameter bandwidth = 0
by default. This means that bandwidth is unlimited.
VZ driver doesn't support bandwidth rate limiting,
but we still need to handle it and fail if bandwidth > 0.
Signed-off-by: Pavel Glushchak <pglushchak@virtuozzo.com>
Laine Stump [Thu, 25 Aug 2016 05:46:37 +0000 (01:46 -0400)]
qemu: set tap device online for type='ethernet'
When support for auto-creating tap devices was added to <interface
type='ethernet'> in commit 9c17d6, the code assumed that
virNetDevTapCreate() would honor the VIR_NETDEV_TAP_CREATE_IFUP flag
that is supported by virNetDevTapCreateInBridgePort(). That isn't the
case - the latter function performs several operations, and one of
them is setting the tap device online. But virNetDevTapCreate() *only*
creates the tap device, and relies on the caller to do everything
else, so qemuInterfaceEthernetConnect() needs to call
virNetDevSetOnline() after the device is successfully created.
Laine Stump [Thu, 25 Aug 2016 05:18:25 +0000 (01:18 -0400)]
qemu: remove unnecessary setting of tap device online state
The linkstate setting of an <interface> is only meant to change the
online status reported to the guest system by the emulated network
device driver in qemu, but when support for auto-creating tap devices
for <interface type='ethernet'> was added in commit 9c17d6, a chunk of
code was also added to qemuDomainChangeNetLinkState() that sets the
online status of the tap device (i.e. the *host* side of the
interface) for type='ethernet'. This was never done for tap devices
used in type='bridge' or type='network' interfaces, nor was it done in
the past for tap devices created by external scripts for
type='ethernet', so we shouldn't be doing it now.
This patch removes the bit of code in qemuDomainChangeNetLinkState()
that modifies online status of the tap device.
Vasiliy Tolstov [Wed, 24 Aug 2016 16:09:22 +0000 (19:09 +0300)]
qemu: fix ethernet network type ip/route assign
The call to virNetDevIPInfoAddToDev() that sets up tap device IP
addresses and routes was somehow incorrectly placed in
qemuInterfaceStopDevice() instead of qemuInterfaceStartDevice() in
commit fe8567f6. This fixes that error by moving the call to
virNetDevIPInfoAddToDev() to qemuInterfaceStartDevice().
Peter Krempa [Tue, 16 Aug 2016 13:02:11 +0000 (15:02 +0200)]
qemu: hotplug: Add support for VCPU unplug
This patch removes the old vcpu unplug code completely and replaces it
with the new code using device_del. The old hotplug code basically never
worked with any recent qemu and thus is useless.
As the new code is using device_del all the implications of using it
are present. Contrary to the device deletion code, the vcpu deletion
code fails if the unplug request is not executed in time.
Peter Krempa [Tue, 16 Aug 2016 12:44:26 +0000 (14:44 +0200)]
qemu: Use modern vcpu hotplug approach if possible
To allow unplugging vcpus, hotplugging vcpus on platforms which
require plugging multiple logical vcpus at once, or plugging them in an
arbitrary order, it's necessary to use the new device_add interface for
vcpu hotplug.
This patch adds support for the device_add interface using the old
setvcpus API by implementing an algorithm to select the appropriate
entities to plug in.
Peter Krempa [Thu, 4 Aug 2016 12:36:24 +0000 (14:36 +0200)]
qemu: command: Add support for sparse vcpu topologies
Add support for using the new approach to hotplug vcpus using device_add
during startup of qemu to allow sparse vcpu topologies.
There are a few limitations imposed by qemu on the supported
configuration:
- vcpu0 needs to be always present and not hotpluggable
- non-hotpluggable cpus need to be ordered at the beginning
- order of the vcpus needs to be unique for every single hotpluggable
entity
Qemu also doesn't really allow querying the information necessary to
start a VM with the vcpus directly on the commandline. Fortunately they
can be hotplugged during startup.
The new hotplug code uses the following approach:
- non-hotpluggable vcpus are counted and put to the -smp option
- qemu is started
- qemu is queried for the necessary information
- the configuration is checked
- the hotpluggable vcpus are hotplugged
- vcpus are started
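The first step of the approach above (counting the non-hotpluggable vcpus for -smp) together with two of the listed limitations can be sketched as follows. ToyVcpuDef and toyCountBootVcpus() are hypothetical, and reading "ordered at the beginning" as "every non-hotpluggable vcpu has a lower order value than every hotpluggable one" is an assumption of this sketch.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vcpu configuration. */
typedef struct {
    bool online;
    bool hotpluggable;
    unsigned int order;
} ToyVcpuDef;

/* Count the online non-hotpluggable vcpus (the number that would go
 * into -smp). Returns -1 if vcpu0 is missing or hotpluggable, or if a
 * hotpluggable vcpu is ordered before a non-hotpluggable one. */
static int
toyCountBootVcpus(const ToyVcpuDef *vcpus, size_t nvcpus)
{
    size_t i;
    int count = 0;
    unsigned int maxFixed = 0;            /* highest non-hotpluggable order */
    unsigned int minHotplug = UINT_MAX;   /* lowest hotpluggable order */

    if (nvcpus == 0 || !vcpus[0].online || vcpus[0].hotpluggable)
        return -1;

    for (i = 0; i < nvcpus; i++) {
        if (!vcpus[i].online)
            continue;
        if (vcpus[i].hotpluggable) {
            if (vcpus[i].order < minHotplug)
                minHotplug = vcpus[i].order;
        } else {
            if (vcpus[i].order > maxFixed)
                maxFixed = vcpus[i].order;
            count++;
        }
    }

    if (minHotplug <= maxFixed)
        return -1;
    return count;
}
```

The remaining online vcpus (the hotpluggable ones) would then be added after qemu has started, in their configured order.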
This patch adds a lot of checking code and enables the support to
specify the individual vcpu element with qemu.
Peter Krempa [Thu, 4 Aug 2016 12:23:25 +0000 (14:23 +0200)]
qemu: process: Copy final vcpu order information into the vcpu definition
The vcpu order information is extracted only for hotpluggable entities,
while vcpu definitions belonging to the same hotpluggable entity need
to all share the order information.
We also can't overwrite it right away in the vcpu info detection code as
the order is necessary to add the hotpluggable vcpus enabled on boot in
the correct order.
The helper will store the order information in places where we are
certain that it's necessary.