Peter Krempa [Fri, 29 Mar 2013 17:21:19 +0000 (18:21 +0100)]
rpc: Fix connection close callback race condition and memory corruption/crash
The last Viktor's effort to fix the race and memory corruption unfortunately
wasn't complete in the case the close callback was not registered in an
connection. At that time, the trail of event's that I'll describe later could
still happen and corrupt the memory or cause a crash of the client (including
the daemon in case of a p2p migration).
Consider the following prerequisities and trail of events:
Let's have a remote connection to a hypervisor that doesn't have a close
callback registered and the client is using the event loop. The crash happens in
cooperation of 2 threads. Thread E is the event loop and thread W is the worker
that does some stuff. R denotes the remote client.
1.) W - The client finishes everything and sheds the last reference on the client
2.) W - The virObject stuff invokes virConnectDispose that invokes doRemoteClose
3.) W - the remote close method invokes the REMOTE_PROC_CLOSE RPC method.
4.) W - The thread is preempted at this point.
5.) R - The remote side receives the close and closes the socket.
6.) E - poll() wakes up due to the closed socket and invokes the close callback
7.) E - The event loop is preempted right before remoteClientCloseFunc is called
8.) W - The worker now finishes, and frees the conn object.
9.) E - The remoteClientCloseFunc accesses the now-freed conn object in the
attempt to retrieve pointer for the real close callback.
10.) Kaboom, corrupted memory/segfault.
This patch tries to fix this by introducing a new object that survives the
freeing of the connection object. We can't increase the reference count on the
connection object itself or the connection would never be closed, as the
connection is closed only when the reference count reaches zero.
The new object - virConnectCloseCallbackData - is a lockable object that keeps
the pointers to the real user registered callback and ensures that the
connection callback is either not called if the connection was already freed or
that the connection isn't freed while this is being called.
Peter Krempa [Wed, 27 Mar 2013 13:37:01 +0000 (14:37 +0100)]
virsh: Register and unregister the close callback also in cmdConnect
This patch improves the error message after disconnecting from the
hypervisor and adds the close callback operations required not to leak
the callback reference.
Peter Krempa [Wed, 27 Mar 2013 13:22:47 +0000 (14:22 +0100)]
virsh: Move cmdConnect from virsh-host.c to virsh.c
The function is used to establish connection so it should be in the main
virsh file. This movement also enables further improvements done in next
patches.
Note that the "connect" command has moved from the host section of virsh to the
main section. It is now listed by 'virsh help virsh' instead of 'virsh help
host'.
Peter Krempa [Wed, 13 Mar 2013 20:39:34 +0000 (21:39 +0100)]
virCaps: get rid of defaultConsoleTargetType callback
This patch refactors various places to allow removing of the
defaultConsoleTargetType callback from the virCaps structure.
A new console character device target type is introduced -
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_NONE - to mark that no type was
specified in the XML. This type is at the end converted to the standard
VIR_DOMAIN_CHR_CONSOLE_TARGET_TYPE_SERIAL. Other types that are
different from this default have to be processed separately in the
device post parse callback.
Peter Krempa [Fri, 15 Mar 2013 14:44:12 +0000 (15:44 +0100)]
virCaps: get rid of macPrefix field
Use the virDomainXMLConf structure to hold this data and tweak the code
to avoid semantic change.
Without configuration the KVM mac prefix is used by default. I chose it
as it's in the privately administered segment so it should be usable for
any purposes.
Peter Krempa [Wed, 13 Mar 2013 14:28:11 +0000 (15:28 +0100)]
virCaps: get rid of defaultDiskDriverType
Use the qemu specific callback to fill this data in the qemu driver as
it's the only place where it was used and fix tests as the qemu test
capability object didn't configure the defaults for the tests.
Peter Krempa [Mon, 11 Mar 2013 11:12:08 +0000 (12:12 +0100)]
virCaps: get rid of emulatorRequired
This patch removes the emulatorRequired field and associated
infrastructure from the virCaps object. Instead the driver specific
callbacks are used as this field isn't enforced by all drivers.
This patch implements the appropriate callbacks in the qemu and lxc
driver and moves to check to that location.
Peter Krempa [Mon, 11 Mar 2013 09:24:29 +0000 (10:24 +0100)]
virCaps: get rid of defaultDiskDriverName
This patch removes the defaultDiskDriverName from the virCaps
structure. This particular default value is used only in the qemu driver
so this patch uses the recently added callback to fill the driver name
if it's needed instead of propagating it through virCaps.
Peter Krempa [Wed, 6 Mar 2013 14:48:06 +0000 (15:48 +0100)]
virCaps: get rid of "defaultInitPath" value in the virCaps struct
This gets rid of the parameter in favor of using the new callback
infrastructure to do the same stuff.
This patch implements the domain adjustment callback in the openVZ
driver and moves the check from the parser to a new validation method in
the callback infrastructure.
Peter Krempa [Tue, 19 Feb 2013 16:29:39 +0000 (17:29 +0100)]
conf: Add post XML parse callbacks and prepare for cleaning of virCaps
This patch adds instrumentation that will allow hypervisor drivers to
fill and validate domain and device definitions after parsed by the XML
parser.
With this patch, after the XML is parsed, a callback to the driver is
issued requesting to fill and validate driver specific details of the
configuration. This allows to use sensible defaults and checks on a per
driver basis at the time the XML is parsed.
Two callback pointers are stored in the new virDomainXMLConf object:
* virDomainDeviceDefPostParseCallback (devicesPostParseCallback)
- called for a single device parsed and for every single device in a
domain config. A virDomainDeviceDefPtr is passed along with the
domain definition and virCaps.
* virDomainDefPostParseCallback, (domainPostParseCallback)
- A callback that is meant to process the domain config after it's
parsed. A virDomainDefPtr is passed along with virCaps.
Both types of callbacks support arbitrary opaque data passed for the
callback functions.
Errors may be reported in those callbacks resulting in a XML parsing
failure.
If libnuma is not compiled in, or numa_available() returns an
error, stub out fake NUMA info consisting of one NUMA cell
containing all CPUs and memory.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Cope with missing /sys/devices/system/cpu/cpu0/topology files
Not all kernel builds have any entries under the location
/sys/devices/system/cpu/cpu0/topology. We already cope with
that being missing in some cases, but not all. Update the
code which looks for thread_siblings to cope with the missing
file
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Certain functions in the sysinfotest.c are not used unless
a whitelisted architecture is being built. Disable those
functions unless required to avoid warnings about unused
functions.
sysinfotest.c:93:1: warning: 'sysinfotest_run' defined but not used [-Wunused-function]
sysinfotest_run(const char *test,
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The nodedev driver currently only detects harddisk, cdrom
and floppy devices. This adds support for SD cards, which
are common storage for ARM devices, eg the Google ChromeBook
Auto-add a root <filesystem> element to LXC containers on startup
Currently the LXC container code has two codepaths, depending on
whether there is a <filesystem> element with a target path of '/'.
If we automatically add a <filesystem> device with src=/ and dst=/,
for any container which has not specified a root filesystem, then
we only need one codepath for setting up the filesystem.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Remove support for old kernels lacking private devpts
Early on kernel support for private devpts was not widespread,
so we had compatibiltiy codepaths. Such old kernels are not
seriously used for LXC these days, so the compat code can go
away
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When creating a logical volume with virStorageVolCreateXMLFrom,
"qemu-img convert" is called internally if clonevol is a file volume.
Then, vol->target.format is used as output_fmt parameter but the
target.format of logical volumes is always 0 because logical volumes
haven't the volume format type element.
Fortunately, 0 was treated as RAW file format before commit f772b3d9,
so there was no problem. But now, 0 is treated as the type of none,
qemu-img fails with "Unknown file format 'none'".
This patch fixes this issue by treating output block devices as RAW
file format like for input block devices.
By passing the flags -z relro -z now to the linker, we can force
it to resolve all library symbols at startup, instead of on-demand.
This allows it to then make the global offset table (GOT) read-only,
which makes some security attacks harder.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
PIE (position independent executable) adds security to executables
by composing them entirely of position-independent code (PIC. The
.so libraries already build with -fPIC. This adds -fPIE which is
the equivalent to -fPIC, but for executables. This for allows Exec
Shield to use address space layout randomization to prevent attackers
from knowing where existing executable code is during a security
attack using exploits that rely on knowing the offset of the
executable code in the binary, such as return-to-libc attacks.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Peter Krempa [Wed, 3 Apr 2013 08:36:03 +0000 (10:36 +0200)]
qemu-blockjob: Fix limit of bandwidth for block jobs to supported value
The JSON generator is able to represent only values less than LLONG_MAX, fix the
bandwidth limit checks when converting to value to catch overflows before they
reach the generator.
Every source file is currently built twice by libtool, once for
the shared library and once for the static library. Static libs
are not commonly packaged by distros and slow down compilation
time by more than 50% compared to a shared-only build time.
Time for 'make -j 4':
shared only: 2 mins 9 secs
shared + static: 3 mins 26 secs
Time for non-parallel make
shared only: 3 mins 32 secs
shared + static: 5 mins 41 secs
Those few people who really want them, can pass --enable-static
to configure
Disabling them by default requires use of LT_INIT, but for
compat with RHEL5 we can't rely on that. So we conditionally
use LT_INIT, but fallback to AM_PROG_LIBTOOL if not present.
If a user configures a domain to use a seclabel of a specific type,
but the appropriate driver is not accessible, we should refuse to
start the domain. For instance, if user requires selinux, but it is
either non present in the system, or is just disabled, we should not
start the domain. Moreover, since we are touching only those labels we
have a security driver for, the other labels may confuse libvirt when
reconnecting to a domain on libvirtd restart. In our selinux example,
when starting up a domain, missing security label is okay, as we
auto-generate one. But later, when libvirt is re-connecting to a live
qemu instance, we parse a state XML, where security label is required
and it is an error if missing:
error : virSecurityLabelDefParseXML:3228 : XML error: security label
is missing
This results in a qemu process left behind without any libvirt control.
virsh schedinfo was able to set only one parameter at a time (not
counting the deprecated options), but it is useful to set more at
once, so this patch adds the possibility to do stuff like this:
Invalid scheduler options are reported as well. These were previously
reported only if the command hadn't updated any values (when
cmdSchedInfoUpdate returned 0).
Peter Krempa [Tue, 2 Apr 2013 21:15:00 +0000 (23:15 +0200)]
qemu: Fix crash when updating media with shared device
Mimic the fix done in 02b9097274d1330c2e1dca7f598880e09b5c2aa0 to fix crash by
accessing an already freed structure. Also copy the explaining comment why the
pointer can't be accessed any more.
The virsh(1) man page wasn't saying anything about the 'migrateuri'
parameter other than it can be usually omitted. A patched version of
docs/migrate.html.in is taken in this patch to fix that up in the man
page.
Peter Krempa [Fri, 15 Mar 2013 16:11:28 +0000 (17:11 +0100)]
virsh: Fix semantics of --config for "update-device" command
The man page states that with --config the next boot is affected. This
can be understood as if _only_ the next boot was affected. This isn't
true if the machine is running.
This patch adds the full --live, --config, --current infrastructure and
tweaks stuff to correctly support the obsolete --persistent flag.
Note that this patch changes the the behavior of the --config flag to match the
use of this flag in rest of libvirt. This flag was mistakenly renamed from
--persistent that originaly had different semantics.
Peter Krempa [Tue, 26 Mar 2013 09:15:32 +0000 (10:15 +0100)]
virsh-domain-monitor: Refactor cmdDomIfGetLink
The domif-getlink command did not terminate successfully when the
interface state was found. As the code used old and too complex approach
to do the job, this patch refactors it and fixes the bug.
Peter Krempa [Tue, 26 Mar 2013 10:45:29 +0000 (11:45 +0100)]
util: Change virMacAddrFormat to lowercase hex characters
The domain XML generator creates the mac addres strings with lowercase
strings with a separate piece of code. This patch changes the formating
helper to do the same stuff to allow using it to normalize a string
provided by the user. After this change some of the tests that are
outputing the mac address will need to be changed.
Li Zhang [Fri, 29 Mar 2013 05:22:46 +0000 (13:22 +0800)]
Optimize machine option to set more options with it
Currently, -machine option is used only when dump-guest-core is set.
To use options defined in machine option for newer version of QEMU,
it needs to use -machine xxx, and to be compatible with older version
-M, this patch adds QEMU_CAPS_MACHINE_OPT capability for newer
version which supports -machine option.
Signed-off-by: Li Zhang <zhlcindy@linux.vnet.ibm.com> Signed-off-by: Eric Blake <eblake@redhat.com>
Peter Krempa [Mon, 11 Mar 2013 13:20:58 +0000 (14:20 +0100)]
conf: Enforce ranges on cputune variables
The limits are documented at
http://libvirt.org/formatdomain.html#elementsCPUTuning . Enforce them
when going through XML parsing in addition to being enforced by the API.
Ján Tomko [Fri, 22 Mar 2013 13:52:25 +0000 (14:52 +0100)]
qemu: Allow migration over IPv6
Allow migration over IPv6 by listening on [::] instead of 0.0.0.0
when QEMU supports it (QEMU_CAPS_IPV6_MIGRATION) and there is
at least one v6 address configured on the system.
Use virURIParse in qemuMigrationPrepareDirect to allow parsing
IPv6 addresses, which would cause an 'incorrect :port' error
message before.
Move setting of migrateFrom from qemuMigrationPrepare{Direct,Tunnel}
after domain XML parsing, since we need the QEMU binary path from it
to get its capabilities.
Osier Yang [Thu, 28 Mar 2013 11:36:30 +0000 (19:36 +0800)]
virsh: Add a helper to parse cpulist
The 'virsh vcpupin' and 'virsh emulatorpin' commands use the same
code to parse the cpulist. This patch abstracts the same code as
a helper. Along with various code style fixes, and error improvement
(only error "Physical CPU %d doesn't exist" if the specified CPU
exceed the range, no "cpulist: Invalid format", see the following
for an example of the error prior to this patch).
TEST: qemuxml2argvtest
........................................ 40
........................................ 80
........................................ 120
........................................ 160
........................................ 200
........................................ 240
................................. 273 OK
==30993== 39 bytes in 1 blocks are definitely lost in loss record 33 of 87
==30993== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==30993== by 0x41E501: fakeSecretGetValue (qemuxml2argvtest.c:33)
==30993== by 0x427591: qemuBuildDriveURIString (qemu_command.c:2571)
==30993== by 0x42C502: qemuBuildDriveStr (qemu_command.c:2627)
==30993== by 0x4335FC: qemuBuildCommandLine (qemu_command.c:6443)
==30993== by 0x41E8A0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:154
==30993== by 0x41FE8F: virtTestRun (testutils.c:157)
==30993== by 0x418BE3: mymain (qemuxml2argvtest.c:506)
==30993== by 0x4204CA: virtTestMain (testutils.c:719)
==30993== by 0x38D6821A04: (below main) (in /usr/lib64/libc-2.16.so)
==30993==
==30993== 46 bytes in 1 blocks are definitely lost in loss record 64 of 87
==30993== at 0x4A0887C: malloc (vg_replace_malloc.c:270)
==30993== by 0x38D690A167: __vasprintf_chk (in /usr/lib64/libc-2.16.so)
==30993== by 0x4CB28E7: virVasprintf (stdio2.h:210)
==30993== by 0x4CB29A3: virAsprintf (virutil.c:2017)
==30993== by 0x4275B4: qemuBuildDriveURIString (qemu_command.c:2580)
==30993== by 0x42C502: qemuBuildDriveStr (qemu_command.c:2627)
==30993== by 0x4335FC: qemuBuildCommandLine (qemu_command.c:6443)
==30993== by 0x41E8A0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:154
==30993== by 0x41FE8F: virtTestRun (testutils.c:157)
==30993== by 0x418BE3: mymain (qemuxml2argvtest.c:506)
==30993== by 0x4204CA: virtTestMain (testutils.c:719)
==30993== by 0x38D6821A04: (below main) (in /usr/lib64/libc-2.16.so)
==30993==
==30993== 385 (56 direct, 329 indirect) bytes in 1 blocks are definitely los
==30993== at 0x4A06B6F: calloc (vg_replace_malloc.c:593)
==30993== by 0x4C6B2CF: virAllocN (viralloc.c:152)
==30993== by 0x4C9C7EB: virObjectNew (virobject.c:191)
==30993== by 0x4D21810: virGetSecret (datatypes.c:642)
==30993== by 0x41E5D5: fakeSecretLookupByUsage (qemuxml2argvtest.c:51)
==30993== by 0x4D4BEC5: virSecretLookupByUsage (libvirt.c:15295)
==30993== by 0x4276A9: qemuBuildDriveURIString (qemu_command.c:2565)
==30993== by 0x42C502: qemuBuildDriveStr (qemu_command.c:2627)
==30993== by 0x4335FC: qemuBuildCommandLine (qemu_command.c:6443)
==30993== by 0x41E8A0: testCompareXMLToArgvHelper (qemuxml2argvtest.c:154
==30993== by 0x41FE8F: virtTestRun (testutils.c:157)
==30993== by 0x418BE3: mymain (qemuxml2argvtest.c:506)
==30993==
PASS: qemuxml2argvtest
Interesting side note is that running the test singularly via 'make -C tests
check TESTS=qemuxml2argvtest' didn't trip the valgrind error; however,
running during 'make -C tests valgrind' did cause the error to be seen.
Ján Tomko [Fri, 29 Mar 2013 11:55:38 +0000 (12:55 +0100)]
virsh: don't call virSecretFree on NULL
Since the refactoring in fbe2d49 we call virSecretFree even if
virSecretDefineXML fails, which leads to overwriting the error
message with:
error: Invalid secret: virSecretFree
storage: Avoid double virCommandFree in virStorageBackendLogicalDeletePool
When logical pool has no PVs associated with itself (user-created),
virCommandFree(cmd) is called twice with the same pointer and that
causes a segfault in daemon.
Michal Privoznik [Thu, 28 Mar 2013 15:13:01 +0000 (16:13 +0100)]
security_manager.c: Append seclabel iff generated
With my previous patches, we unconditionally appended a seclabel,
even if it wasn't generated but found in array of defined seclabels.
This resulted in double free later when doing virDomainDefFree
and iterating over the array of defined seclabels.
Moreover, there was another possibility of double free, if the
seclabel was generated in the last iteration of the process of
walking trough security managers array.
One of my previous patches manipulated virSecurityLabel* APIs,
some were added to header files, and some were renamed. However,
these changes were not reflected in libvirt_private.syms.
The <seclabel type='none'/> should be added iff there is no other
seclabel defined within a domain. This bug can be easily reproduced:
1) configure selinux seclabel for a domain
2) disable system's selinux and restart libvirtd
3) observe <seclabel type='none'/> being appended to a domain on its
startup
Michal Privoznik [Thu, 21 Mar 2013 15:12:55 +0000 (16:12 +0100)]
security_manager: Don't manipulate domain XML in virDomainDefGetSecurityLabelDef
The virDomainDefGetSecurityLabelDef was modifying the domain XML.
It tried to find a seclabel corresponding to given sec driver. If the
label wasn't found, the function created one which is wrong. In fact
it's security manager which should modify this part of domain XML.
Guannan Ren [Thu, 28 Mar 2013 04:10:05 +0000 (12:10 +0800)]
conf: fix memory leak of class_id bitmap
When libvirtd loads active network configs from network state directory,
it should release the class_id memory block which was allocated
at the time of loading xml from network config directory.
virBitmapParse will create a new memory block of bitmap class_id which
causes a memory leak.
This happens when at least one virtual network is active before.
==12234== 8,216 (24 direct, 8,192 indirect) bytes in 1 blocks are definitely \
lost in loss record 702 of 709
==12234== at 0x4A06B2F: calloc (vg_replace_malloc.c:593)
==12234== by 0x37AB04D77D: virAlloc (in /usr/lib64/libvirt.so.0.1000.3)
==12234== by 0x37AB04EF89: virBitmapNew (in /usr/lib64/libvirt.so.0.1000.3)
==12234== by 0x37AB0BFB37: virNetworkAssignDef (in /usr/lib64/libvirt.so.0.1000.3)
==12234== by 0x37AB0BFD31: ??? (in /usr/lib64/libvirt.so.0.1000.3)
==12234== by 0x37AB0BFE92: virNetworkLoadAllConfigs (in /usr/lib64/libvirt.so.0.1000.3)
==12234== by 0x10650E5A: ??? (in /usr/lib64/libvirt/connection-driver/libvirt_driver_network.so)
==12234== by 0x37AB0EB72F: virStateInitialize (in /usr/lib64/libvirt.so.0.1000.3)
==12234== by 0x40DE04: ??? (in /usr/sbin/libvirtd)
==12234== by 0x37AB0832E8: ??? (in /usr/lib64/libvirt.so.0.1000.3)
==12234== by 0x3796807D14: start_thread (in /usr/lib64/libpthread-2.16.so)
==12234== by 0x37960F246C: clone (in /usr/lib64/libc-2.16.so)
Stefan Seyfried [Mon, 25 Mar 2013 19:39:40 +0000 (20:39 +0100)]
net: use newer iptables syntax
iptables-1.4.18 removed the long deprecated "state" match.
Use "conntrack" instead in forwarding rules.
Fixes openSUSE bug https://bugzilla.novell.com/811251 #811251.
Jiri Denemark [Tue, 26 Mar 2013 14:50:38 +0000 (15:50 +0100)]
rpc: Fix client crash when server drops connection
Despite the comment stating virNetClientIncomingEvent handler should
never be called with either client->haveTheBuck or client->wantClose
set, there is a sequence of events that may lead to both booleans being
true when virNetClientIncomingEvent is called. However, when that
happens, we must not immediately close the socket as there are other
threads waiting for the buck and they would cause SIGSEGV once they are
woken up after the socket was closed. Another thing is we should clear
all remaining calls in the queue after closing the socket.
The situation that can lead to the crash involves three threads, one of
them running event loop and the other two calling libvirt APIs. The
event loop thread detects an event on client->sock and calls
virNetClientIncomingEvent handler. But before the handler gets a chance
to lock client, the other two threads (T1 and T2) start calling some
APIs. T1 gets the buck and detects EOF on client->sock while processing
its RPC call. Since T2 is waiting for its own call, T1 passes the buck
on to it and unlocks client. But before T2 gets the signal, the event
loop thread wakes up, does its job and closes client->sock. The crash
happens when T2 actually wakes up and tries to do its job using a closed
client->sock.
Jiri Denemark [Mon, 25 Mar 2013 10:35:27 +0000 (11:35 +0100)]
log: Separate thread ID from timestemp in ring buffer
When we write a log message into a log, we separate thread ID from
timestamp using ": ". However, when storing the message into the ring
buffer, we omitted the separator, e.g.:
Guannan Ren [Tue, 26 Mar 2013 14:17:43 +0000 (22:17 +0800)]
conf: fix a failure when detaching a usb device
#virsh detach-device $guest usb.xml
error: Failed to detach device from usb2.xml
error: operation failed: host usb device vendor=0x0951 \
product=0x1625 not found
This regresstion is due to a typo in matching function. The first
argument is always the usb device that we are checking for. If the
usb xml file provided by user contains bus and device info, we try
to search it by them, otherwise, we use vendor and product info.
The bug occurred only when detaching a usb device with no bus and
device info provided in the usb xml file.
Guido Günther [Fri, 22 Mar 2013 09:25:42 +0000 (10:25 +0100)]
qemu: Don't set address type too early during virtio disk hotplug
f946462e14ac036357b7c11ce5c23f94a3ee4e49 changed behavior by settings
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_PCI upfront. If we do so before invoking
qemuDomainPCIAddressEnsureAddr we merely try to set the PCI slot via
qemuDomainPCIAddressReserveSlot instead reserving a new address via
qemuDomainPCIAddressSetNextAddr which fails with
$ ~/run-tck-test domain/200-disk-hotplug.t
./scripts/domain/200-disk-hotplug.t .. # Creating a new transient domain
./scripts/domain/200-disk-hotplug.t .. 1/5 # Attaching the new disk /var/lib/jenkins/jobs/libvirt-tck-build/workspace/scratchdir/200-disk-hotplug/extra.img
# Failed test 'disk has been attached'
# at ./scripts/domain/200-disk-hotplug.t line 67.
# died: Sys::Virt::Error (libvirt error code: 1, message: internal error unable to reserve PCI address 0:0:0.0
# )
Michal Privoznik [Tue, 26 Mar 2013 14:45:16 +0000 (15:45 +0100)]
qemu: Set migration FD blocking
Since we switched from direct host migration scheme to the one,
where we connect to the destination and then just pass a FD to a
qemu, we have uncovered a qemu bug. Qemu expects migration FD to
block. However, we are passing a nonblocking one which results in
cryptic error messages like:
qemu: warning: error while loading state section id 2
load of migration failed
The bug is already known to Qemu folks, but we should workaround
already released Qemus. Patch has been originally proposed by Stefan
Hajnoczi <stefanha@gmail.com>
virConnectOpenAuth didn't require 'name' to be specified (VIR_DEBUG
used NULLSTR() for the output) and by default, if name == NULL, the
default connection uri is used. This was not indicated in the
documentation and wasn't checked for in other API's VIR_DEBUG outputs.
Guannan Ren [Tue, 26 Mar 2013 04:34:49 +0000 (12:34 +0800)]
python: set default value to optional arguments
When prefixing with string (optional) or optional in the description
of arguments to libvirt C APIs, in python, these arguments will be
set as optional arugments, for example:
* virDomainSaveFlags:
* @domain: a domain object
* @to: path for the output file
* @dxml: (optional) XML config for adjusting guest xml used on restore
* @flags: bitwise-OR of virDomainSaveRestoreFlags
the corresponding python APIs is
restoreFlags(self, frm, dxml=None, flags=0)
After further discussions with Alon Levy, I learned the following:
The use of '-vga qxl' vs. '-device qxl-vga' is completely orthogonal
to whether ram_size can be exposed. Downstream distros are interested
in backporting support for multi-head qxl, but this can be done in
one of two ways:
1. Support one head per PCI device. If you do this, then it makes
sense to have full control over the PCI address of each device. For
full control, you need '-device qxl-vga' instead of '-vga qxl'.
2. Support multiple heads through a single PCI device. If you do
this, then you need to allocate more RAM to that PCI device (enough
ram to cover the multiple screens). Here, the device is hard-coded
to 0:0:2.0, both in qemu and libvirt code.
Apparently, backporting ram_size changes to allow multiple heads in
a single device is much easier than backporting multiple device
support. Furthermore, the presence or absence of qxl-vga.surfaces
is no different than the presence or absence of qxl-vga.ram_size;
both properties can be applied regardless of whether you have one
PCI device (-vga qxl) or multiple (-device qxl-vga), so this property
is NOT a good witness of whether '-device qxl-vga' support has been
backported.
Downstream RHEL will NOT be using this patch; and worse, leaving this
patch in risks doing the wrong thing if compiling upstream libvirt
on RHEL, so the best course of action is to revert it. That means
that libvirt will go back to only using '-device qxl-vga' for qemu
>= 1.2, but this is just fine because we know of no distros that plan
on backporting multiple PCI address support to any older version of
qemu. Meanwhile, downstream can still use ram_size to pack multiple
heads through a single PCI device.
$ service libvirt-guests restart
Running guests on default URI: a, b, d, c
Shutting down guests on default URI...
Starting shutdown on guest: a
Shutdown of guest a failed to complete in time.Starting shutdown on guest: b
Shutdown of guest b failed to complete in time.Starting shutdown on guest: d
Shutdown of guest d failed to complete in time.Starting shutdown on guest: c
Shutdown of guest c failed to complete in time.libvirt-guests is configured not to start any guests on boot
* tools/libvirt-guests.sh.in (shutdown_guest): Add missing newline.
Reported by Xuesong Zhang.
to tag the PCI device with "virtual_function" cap.
However, this results in a VF will aslo get "virtual_function" cap.
This patch fixes it by:
* Ignoring the VF which has failure of getting PCI config space
(given that the successfully detected VFs are kept , it makes
sense to not give up on the failure of one VF too) with a warning,
so virPCIGetVirtualFunctions will not return -1 except out of memory.
* Free the allocated *virtual_functions when out of memory
And thus the logic can be changed to:
/* Out of memory */
int ret = virPCIGetVirtualFunctions(syspath,
&data->pci_dev.virtual_functions,
&data->pci_dev.num_virtual_functions);
if (ret < 0 )
goto out;
if (data->pci_dev.num_virtual_functions > 0)
data->pci_dev.flags |= VIR_NODE_DEV_CAP_FLAG_PCI_VIRTUAL_FUNCTION;
Osier Yang [Mon, 7 Jan 2013 17:05:34 +0000 (01:05 +0800)]
nodedev: Abstract nodeDeviceVportCreateDelete as util function
This abstracts nodeDeviceVportCreateDelete as an util function
virManageVport, which can be further used by later storage patches
(to support persistent vHBA, I don't want to create the vHBA
using the public API, which is not good).
* docs/formatnode.html.in: (Document the new XML)
* docs/schemas/nodedev.rng: (Add the schema)
* src/conf/node_device_conf.h: (New member for data.scsi_host)
* src/node_device/node_device_linux_sysfs.c: (Collect the value of
max_vports and vports)
Osier Yang [Mon, 7 Jan 2013 17:05:31 +0000 (01:05 +0800)]
nodedev: Refactor the helpers
This adds two util functions (virIsCapableFCHost and virIsCapableVport),
and rename helper check_fc_host_linux as detect_scsi_host_caps,
check_capable_vport_linux is removed, as it's abstracted to the util
function virIsCapableVport. detect_scsi_host_caps nows detect both
the fc_host and vport_ops capabilities. "stat(2)" is replaced with
"access(2)" for saving.
* src/util/virutil.h:
- Declare virIsCapableFCHost and virIsCapableVport
* src/util/virutil.c:
- Implement virIsCapableFCHost and virIsCapableVport
* src/node_device/node_device_linux_sysfs.c:
- Remove check_capable_vport_linux
- Rename check_fc_host_linux as detect_scsi_host_caps, and refactor
it a bit to detect both fc_host and vport_os capabilities
* src/node_device/node_device_driver.h:
- Change/remove the related declarations
* src/node_device/node_device_udev.c: (Use detect_scsi_host_caps)
* src/node_device/node_device_hal.c: (Likewise)
* src/node_device/node_device_driver.c (Likewise)
Osier Yang [Mon, 7 Jan 2013 17:05:30 +0000 (01:05 +0800)]
nodedev: Use access instead of stat
The use of 'stat' in nodeDeviceVportCreateDelete is only to check
if the file exists or not, it's a bit overkill, and safe to replace
with the wrapper of access(2) (virFileExists).
Osier Yang [Mon, 7 Jan 2013 17:05:29 +0000 (01:05 +0800)]
util: Add one helper virReadFCHost to read the value of fc_host entry
"open_wwn_file" in node_device_linux_sysfs.c is redundant, on one
hand it duplicates work of virFileReadAll, on the other hand, it's
waste to use a function for it, as there is no other users of it.
So I don't see why the file opening work cannot be done in
"read_wwn_linux".
"read_wwn_linux" can be abstracted as an util function. As what all
it does is to read the sysfs entry.
So this patch removes "open_wwn_file", and abstract "read_wwn_linux"
as an util function "virReadFCHost" (a more general name, because
after changes, it can read each of the fc_host entry now).
* src/util/virutil.h: (Declare virReadFCHost)
* src/util/virutil.c: (Implement virReadFCHost)
* src/node_device/node_device_linux_sysfs.c: (Remove open_wwn_file,
and read_wwn_linux)
src/node_device/node_device_driver.h: (Remove the declaration of
read_wwn_linux, and the related macros)
src/libvirt_private.syms: (Export virReadFCHost)
Osier Yang [Mon, 7 Jan 2013 17:05:28 +0000 (01:05 +0800)]
nodedev: Introduce two new flags for listAll API
VIR_CONNECT_LIST_NODE_DEVICES_CAP_FC_HOST to filter the FC HBA,
and VIR_CONNECT_LIST_NODE_DEVICES_CAP_VPORTS to filter the FC HBA
which supports vport.
Osier Yang [Mon, 7 Jan 2013 17:05:27 +0000 (01:05 +0800)]
nodedev: Remove the unused enum
Guess it was created for the fc_host and vports_ops capabilities
purpose, but there is enum virNodeDevScsiHostCapFlags for them,
and enum virNodeDevHBACapType is unused, and actually both
VIR_ENUM_DECL and VIR_ENUM_IMPL use the wrong enum name
"virNodeDevHBACap".
Peter Krempa [Fri, 22 Mar 2013 10:05:36 +0000 (11:05 +0100)]
virsh: Fix docs for "virsh setmaxmem"
The docs assumed the command works always for QEMU and other
hypervisors. As this is done using the balloon mechainism live increase
of the maximum memory limit isn't supported. Fix the docs to mention
this limitation.
Mount temporary devpts on /var/lib/libvirt/lxc/$NAME.devpts
Currently the lxc controller sets up the devpts instance on
$rootfsdef->src, but this only works if $rootfsdef is using
type=mount. To support type=block or type=file for the root
filesystem, we must use /var/lib/libvirt/lxc/$NAME.devpts
for the temporary devpts mount in the controller
Move FUSE mount to /var/lib/libvirt/lxc/$NAME.fuse
Instead of using /var/lib/libvirt/lxc/$NAME for the FUSE
filesystem, use /var/lib/libvirt/lxc/$NAME.fuse. This allows
room for other temporary mounts in the same directory