Add bounds checking on virDomainGetJobStats RPC call
The return values for the virDomainGetJobStats call were not
bounds checked. This is a robustness issue for clients if
something where to cause corruption of the RPC stream data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add bounds checking on virDomainMigrate*Params RPC calls (CVE-2013-4292)
The parameters for the virDomainMigrate*Params RPC calls were
not bounds checks, meaning a malicious client can cause libvirtd
to consume arbitrary memory
This issue was introduced in the 1.1.0 release of libvirt
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michal Privoznik [Thu, 29 Aug 2013 11:19:45 +0000 (13:19 +0200)]
autogen.sh: Correctly detect .git as a file
One of my previous patches 5cfe0d37cd0be tried to handle the case when
libvirt is a submodule of another project. In that case, the .git is
just a link to the parent .git directory (which the autogen.sh script
didn't count on). The fix was missing 'test' though.
Peter Krempa [Tue, 27 Aug 2013 17:06:18 +0000 (19:06 +0200)]
qemu: Remove hostdev entry when freeing the depending network entry
When using a <interface type="network"> that points to a network with
hostdev forwarding mode a hostdev alias is created for the network. This
allias is inserted into the hostdev list, but is backed with a part of
the network object that it is connected to.
When a VM is being stopped qemuProcessStop() calls
networkReleaseActualDevice() which eventually frees the memory for the
hostdev object. Afterwards when the domain definition is being freed by
virDomainDefFree() an invalid pointer is accessed by
virDomainHostdevDefFree() and may cause a crash of the daemon.
This patch removes the entry in the hostdev list before freeing the
depending memory to avoid this issue.
Eric Blake [Fri, 16 Aug 2013 22:07:31 +0000 (16:07 -0600)]
virsh: detect programming errors with option parsing
Noticed while reviewing another patch that had an accidental
mismatch due to refactoring. An audit of the code showed that
very few callers of vshCommandOpt were expecting a return of
-2, indicating programmer error, and of those that DID check,
they just propagated that status to yet another caller that
did not check. Fix this by making the code blatantly warn
the programmer, rather than silently ignoring it and possibly
doing the wrong thing downstream.
I know that we frown on assert()/abort() inside libvirtd
(libraries should NEVER kill the program that linked them),
but as virsh is an app rather than the library, and as this
is not the first use of assert() in virsh, I think this
approach is okay.
* tools/virsh.h (vshCommandOpt): Drop declaration.
* tools/virsh.c (vshCommandOpt): Make static, and add a
parameter. Abort on programmer errors rather than making callers
repeat that logic.
(vshCommandOptInt, vshCommandOptUInt, vshCommandOptUL)
(vshCommandOptString, vshCommandOptStringReq)
(vshCommandOptLongLong, vshCommandOptULongLong)
(vshCommandOptBool): Adjust callers.
Jiri Denemark [Wed, 28 Aug 2013 11:50:10 +0000 (13:50 +0200)]
virt-sanlock-cleanup; Fix augtool usage
Surprisingly, augtool get (or print) returns "path = value" while we are
only interested in the value. We need to remove the "path = " part from
the augtool's output. The following is an example of the augtool command
as used in virt-sanlock-cleanup script:
$ augtool get /files/etc/libvirt/qemu-sanlock.conf/disk_lease_dir
/files/etc/libvirt/qemu-sanlock.conf/disk_lease_dir = /var/lib/libvirt/sanlock
Commit a0b6a36f "fixed" what abfff210 broke (URI precedence), but
there was still one more thing missing to fix. When using virsh
parameters to setup debugging, those weren't honored, because at the
time debugging was initializing, arguments weren't parsed yet. To
make ewerything work as expected, we need to initialize the debugging
twice, once before debugging (so we can debug option parsing properly)
and then again after these options are parsed.
As a side effect, this patch also fixes a leak when virsh is ran with
multiple '-l' parameters.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Michal Privoznik [Wed, 28 Aug 2013 07:25:59 +0000 (09:25 +0200)]
virsh-pool.c: Don't jump over variable declaration
Since 785ff34bf8 we are using the outputStr variable in cleanup label.
However, there is a possibility to jump to the label before the variable
has been declared:
virsh-pool.c: In function 'cmdPoolList':
virsh-pool.c:1121:25: error: jump skips variable initialization [-Werror=jump-misses-init]
goto asprintf_failure;
^
virsh-pool.c:1308:1: note: label 'asprintf_failure' defined here
asprintf_failure:
^
virsh-pool.c:1267:11: note: 'outputStr' declared here
char *outputStr = NULL;
Ján Tomko [Tue, 27 Aug 2013 11:47:57 +0000 (13:47 +0200)]
virsh: free the caps list properly if one of them is invalid
VIR_FREE(caps) is not enough to free an array allocated
by vshStringToArray.
==17== 4 bytes in 1 blocks are definitely lost in loss record 4 of 728
==17== by 0x4EFFC44: virStrdup (virstring.c:554)
==17== by 0x128B10: _vshStrdup (virsh.c:125)
==17== by 0x129164: vshStringToArray (virsh.c:218)
==17== by 0x157BB3: cmdNodeListDevices (virsh-nodedev.c:409)
Ján Tomko [Tue, 27 Aug 2013 11:34:09 +0000 (13:34 +0200)]
virsh: free the formatting string when listing pool details
==23== 41 bytes in 1 blocks are definitely lost in loss record 626 of 727
==23== by 0x4F0099F: virAsprintfInternal (virstring.c:358)
==23== by 0x15D2C9: cmdPoolList (virsh-pool.c:1268)
Ján Tomko [Tue, 27 Aug 2013 11:27:50 +0000 (13:27 +0200)]
virsh: free the list from ListAll APIs even for 0 items
virsh secret-list leak when no secrets are defined:
==27== 8 bytes in 1 blocks are definitely lost in loss record 6 of 726
==27== by 0x4E941DD: virAllocN (viralloc.c:183)
==27== by 0x5037F1A: remoteConnectListAllSecrets (remote_driver.c:3076)
==27== by 0x5004EC6: virConnectListAllSecrets (libvirt.c:16298)
==27== by 0x15F813: vshSecretListCollect (virsh-secret.c:397)
==27== by 0x15F0E1: cmdSecretList (virsh-secret.c:532)
Ján Tomko [Tue, 27 Aug 2013 11:07:27 +0000 (13:07 +0200)]
virsh: free messages after logging them to a file
The messages were only freed on error.
==12== 1,100 bytes in 1 blocks are definitely lost in loss record 698 of 729
==12== by 0x4E98C22: virBufferAsprintf (virbuffer.c:294)
==12== by 0x12C950: vshOutputLogFile (virsh.c:2440)
==12== by 0x12880B: vshError (virsh.c:2254)
==12== by 0x131957: vshCommandOptDomainBy (virsh-domain.c:109)
==12== by 0x14253E: cmdStart (virsh-domain.c:3333)
Ján Tomko [Mon, 12 Aug 2013 11:48:34 +0000 (13:48 +0200)]
Build QEMU command line for pcihole64
QEMU commit 3984890 introduced the "pci-hole64-size" property,
to i440FX-pcihost and q35-pcihost with a default setting of 2 GB.
Translate <pcihole64>x<pcihole64/> to:
-global q35-pcihost.pci-hole64-size=x for q35 machines and
-global i440FX-pcihost.pci-hole64-size=x for i440FX-based machines.
Error out on other machine types or if the size was specified
but the pcihost device lacks 'pci-hole64-size' property.
It can be used to adjust (or disable) the size of the 64-bit
PCI hole. The size attribute is in kilobytes (different unit
can be specified on input), but it gets rounded up to
the nearest GB by QEMU.
Disabling it will be needed for guests that crash with the
64-bit PCI hole (like Windows XP), see:
https://bugzilla.redhat.com/show_bug.cgi?id=990418
Ján Tomko [Tue, 20 Aug 2013 15:37:08 +0000 (17:37 +0200)]
Always specify qcow2 compat level on qemu-img command line
qemu-img is going to switch the default for QCOW2
to QCOW2v3 (compat=1.1)
Extend the probing for qemu-img command line options to check
if -o compat is supported. If the volume definition specifies
the qcow2 format but no compat level and -o compat is supported,
specify -o compat=0.10 to create a QCOW2v2 image.
Michal Privoznik [Mon, 26 Aug 2013 14:59:03 +0000 (16:59 +0200)]
qemuDomainAttachHostPciDevice: Fall back to mem balloon if there's no hard_limit
If there's no hard_limit set and domain uses VFIO we still must lock
the guest memory (prerequisite from qemu). Hence, we should compute
the amount to be locked from max_balloon.
Claudio Bley [Thu, 22 Aug 2013 09:16:03 +0000 (11:16 +0200)]
Test for object identity when checking for None in Python
Consistently use "is" or "is not" to compare variables to None,
because doing so is preferrable, as per PEP 8
(http://www.python.org/dev/peps/pep-0008/#programming-recommendations):
> Comparisons to singletons like None should always be done with is or
> is not, never the equality operators.
Set security label on FD for virDomainOpenGraphics
The virDomainOpenGraphics method accepts a UNIX socket FD from
the client app. It must set the label on this FD otherwise QEMU
will be prevented from receiving it with recvmsg.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michal Privoznik [Thu, 22 Aug 2013 08:19:08 +0000 (10:19 +0200)]
qemuBuildNicDevStr: Add mq=on for multiqueue networking
If user requested multiqueue networking, beside multiple /dev/tap and
/dev/vhost-net openings, we forgot to pass mq=on onto the -device
virtio-net-pci command line. This is advised at:
Peter Krempa [Mon, 19 Aug 2013 14:08:24 +0000 (16:08 +0200)]
virBitmapParse: Fix behavior in case of error and fix up callers
Re-arrange the code so that the returned bitmap is always initialized to
NULL even on early failures and return an error message as some callers
are already expecting it. Fix up the rest not to shadow the error.
Commit abfff210 changed the order of vshParseArgv() and vshInit() in
order to make fix debugging of parameter parsing. However, vshInit()
did a vshReconnect() even though ctl->name wasn't set according to the
'-c' parameter yet. In order to keep both issues fixed, I've split
the vshInit() into vshInitDebug() and vshInit().
One simple memleak of ctl->name is fixed as a part of this patch,
since it is related to the issue it's fixing.
Jim Fehlig [Wed, 21 Aug 2013 17:05:18 +0000 (11:05 -0600)]
libxl: fix libvirtd crash when reconnecting domains
More fallout from commit d72ef888. When reconnecting to running
domains, the libxl_ctx in libxlDomainObjPrivate was used before
initializing it, causing a segfault in libxl and consequently
crashing libvirtd.
Initialize the libxlDomainObjPrivate libxl_ctx in libxlReconnectDomain,
and while at it use this ctx in libxlReconnectDomain instead of the
driver-wide ctx.
When doing a live migration, if the destination fails for any
reason after the point in which files should be labeled, then
the cleanup of the destination would restore the labels to their
defaults, even though the source is still trying to continue
running with the image open. Bug 822052 mentioned one source
of live migration failure - a mismatch in SELinux virt_use_nfs
settings (on for source, off for destination); but I found other
situations that would also trigger it (for example, having a
graphics device tied to port 5999 on the source, and a different
domain on the destination already using that port, so that the
destination cannot reuse the port).
In short, just as cleanup of the source on a successful migration
must not relabel files (because the destination would be crippled
by the relabel), cleanup of the destination on a failed migration
must not relabel files (because the source would be crippled).
* src/qemu/qemu_process.c (qemuProcessStart): Set flag to avoid
label restoration when cleaning up on failed migration.
In commit f905cc998449c89339d0e2894a71d9a9e45293e5 a use of
uninitialized data was fixed based on a coverity report. It
turns out it was possible to trigger this issue by pointing
libvirt at non-existent certificate files, typically causing
a crash.
This adds a test case for that scenario. With the above
commit reverted, this new test case will crash with a SEGV.
With the fix applied, it passes, reporting a normal libvirt
error to the caller.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Osier Yang [Tue, 20 Aug 2013 09:08:49 +0000 (17:08 +0800)]
storage: Fix the use-after-free memory bug
Introduced by commit e0139e30444. virStorageVolDefFree free'ed the
pointers that are still used by the added volume object, this changes
it back to VIR_FREE.
John Ferlan [Wed, 7 Aug 2013 13:05:43 +0000 (09:05 -0400)]
docs: Update the formatdomain disk examples
Add more iSCSI examples including having a secret attached. There are 4 new
examples; one for each way to have an iSCSI - a network disk using virtio,
a passthrough network lun using scsi, a volume disk using "mode='host'",
and a volume disk using "mode='direct'"
John Ferlan [Tue, 6 Aug 2013 11:52:46 +0000 (07:52 -0400)]
Report secret usage error message similarly
Each of the modules handled reporting error messages from the secret fetching
slightly differently with respect to the error. Provide a similar message
for each error case and provide as much data as possible.
John Ferlan [Tue, 6 Aug 2013 11:49:23 +0000 (07:49 -0400)]
virsh: Print cephx and iscsi usage
When using virsh secret-list - if the secret types are cephx or iscsi,
then allow fetch/print of the usage information. Prior to the change
the following would print:
John Ferlan [Tue, 20 Aug 2013 17:20:56 +0000 (13:20 -0400)]
libxl: Resolve possible NULL dereference
If we reached cleanup: prior to allocating cpus, it was possible that
'nr_nodes' had a value, but cpus was NULL leading to a possible NULL
deref. Add a 'cpus' as an end condition to for loop
Commit 904e05a2 (v0.9.9) added a per-<disk> seclabel element with
an attribute relabel='no' in order to try and minimize the
impact of shutdown delays when an NFS server disappears. The idea
was that if a disk is on NFS and can't be labeled in the first
place, there is no need to attempt the (no-op) relabel on domain
shutdown. Unfortunately, the way this was implemented was by
modifying the domain XML so that the optimization would survive
libvirtd restart, but in a way that is indistinguishable from an
explicit user setting. Furthermore, once the setting is turned
on, libvirt avoids attempts at labeling, even for operations like
snapshot or blockcopy where the chain is being extended or pivoted
onto non-NFS, where SELinux labeling is once again possible. As
a result, it was impossible to do a blockcopy to pivot from an
NFS image file onto a local file.
The solution is to separate the semantics of a chain that must
not be labeled (which the user can set even on persistent domains)
vs. the optimization of not attempting a relabel on cleanup (a
live-only annotation), and using only the user's explicit notation
rather than the optimization as the decision on whether to skip
a label attempt in the first place. When upgrading an older
libvirtd to a newer, an NFS volume will still attempt the relabel;
but as the avoidance of a relabel was only an optimization, this
shouldn't cause any problems.
In the ideal future, libvirt will eventually have XML describing
EVERY file in the backing chain, with each file having a separate
<seclabel> element. At that point, libvirt will be able to track
more closely which files need a relabel attempt at shutdown. But
until we reach that point, the single <seclabel> for the entire
<disk> chain is treated as a hint - when a chain has only one
file, then we know it is accurate; but if the chain has more than
one file, we have to attempt relabel in spite of the attribute,
in case part of the chain is local and SELinux mattered for that
portion of the chain.
* src/conf/domain_conf.h (_virSecurityDeviceLabelDef): Add new
member.
* src/conf/domain_conf.c (virSecurityDeviceLabelDefParseXML):
Parse it, for live images only.
(virSecurityDeviceLabelDefFormat): Output it.
(virDomainDiskDefParseXML, virDomainChrSourceDefParseXML)
(virDomainDiskSourceDefFormat, virDomainChrDefFormat)
(virDomainDiskDefFormat): Pass flags on through.
* src/security/security_selinux.c
(virSecuritySELinuxRestoreSecurityImageLabelInt): Honor labelskip
when possible.
(virSecuritySELinuxSetSecurityFileLabel): Set labelskip, not
norelabel, if labeling fails.
(virSecuritySELinuxSetFileconHelper): Fix indentation.
* docs/formatdomain.html.in (seclabel): Document new xml.
* docs/schemas/domaincommon.rng (devSeclabel): Allow it in RNG.
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.xml:
* tests/qemuxml2argvdata/qemuxml2argv-seclabel-*-labelskip.args:
* tests/qemuxml2xmloutdata/qemuxml2xmlout-seclabel-*-labelskip.xml:
New test files.
* tests/qemuxml2argvtest.c (mymain): Run the new tests.
* tests/qemuxml2xmltest.c (mymain): Likewise.
Peter Krempa [Thu, 15 Aug 2013 14:48:20 +0000 (16:48 +0200)]
virsh: Don't leak list of volumes when undefining domain with storage
Use the new semantics of vshStringToArray to avoid leaking the array of
volumes to be deleted. The array would be leaked in case the first
volume was found in the domain definition. Also refactor the code a bit
to sanitize naming of variables hoding arrays and dimensions of the
arrays.
Peter Krempa [Thu, 15 Aug 2013 16:20:05 +0000 (18:20 +0200)]
virsh: modify vshStringToArray to duplicate the elements too
At a slightly larger memory expense allow stealing of items from the
string array returned from vshStringToArray and turn the result into a
string list compatible with virStringSplit. This will allow to use the
common dealloc function.
This patch also fixes a few forgotten checks of return from
vshStringToArray and one memory leak.
Michal Privoznik [Tue, 20 Aug 2013 12:53:22 +0000 (14:53 +0200)]
qemuBuildCommandLine: Fall back to mem balloon if there's no hard_limit
If there's no hard_limit set and domain uses VFIO we still must lock the
guest memory (prerequisite from qemu). Hence, we should compute the
amount to be locked from max_balloon.
Michal Privoznik [Tue, 20 Aug 2013 09:04:18 +0000 (11:04 +0200)]
qemuSetupMemoryCgroup: Handle hard_limit properly
Since 16bcb3 we have a regression. The hard_limit is set
unconditionally. By default the limit is zero. Hence, if user hasn't
configured any, we set the zero in cgroup subsystem making the kernel
kill the corresponding qemu process immediately. The proper fix is to
set hard_limit iff user has configured any.
Jim Fehlig [Fri, 16 Aug 2013 21:37:46 +0000 (15:37 -0600)]
libxl: implement NUMA capabilities reporting
From: Dario Faggioli <dario.faggioli@citrix.com>
Starting from Xen 4.2, libxl has all the bits and pieces in place
for retrieving an adequate amount of information about the host
NUMA topology. It is therefore possible, after a bit of shuffling,
to arrange those information in the way libvirt wants to present
them to the outside world.
Therefore, with this patch, the <topology> section of the host
capabilities is properly populated, when running on Xen, so that
we can figure out whether or not we're running on a NUMA host,
and what its characteristics are.
Peter Krempa [Mon, 19 Aug 2013 09:59:54 +0000 (11:59 +0200)]
nwfilter: Don't fail to start if DBus isn't available
When the daemon is compiled with firewalld support but the DBus message
bus isn't started in the system, the initialization of the nwfilter
driver fails even if there are fallback options.
Peter Krempa [Mon, 19 Aug 2013 09:34:39 +0000 (11:34 +0200)]
virsystemd: Don't fail to start VM if DBus isn't available or compiled in
On hosts that don't have the DBus service running or installed the new
systemd cgroups code failed with hard error instead of falling back to
"manual" cgroup creation.
Use the new helper to check for the system bus and use the fallback code
in case it isn't available.
Peter Krempa [Mon, 19 Aug 2013 09:24:04 +0000 (11:24 +0200)]
virdbus: Add virDBusHasSystemBus()
Some systems may not use DBus in their system. Add a method to check if
the system bus is available that doesn't print error messages so that
code can later check for this condition and use an alternative approach.
Peter Krempa [Mon, 19 Aug 2013 12:02:52 +0000 (14:02 +0200)]
virbitmaptest: Shut coverity up in case of broken test
Coverity reported a memleak in the test added in 7efd5fd1b02. In case
the code will be broken and the code will actually parse a faulty bitmap
the resulting pointer would be leaked. Free it although that shouldn't
ever happen.
David Weber [Mon, 19 Aug 2013 11:38:23 +0000 (12:38 +0100)]
Make max_clients in virtlockd configurable
Each new VM requires a new connection from libvirtd to virtlockd.
The default max clients limit in virtlockd of 20 is thus woefully
insufficient. virtlockd sockets are only accessible to matching
users, so there is no security need for such a tight limit. Make
it configurable and default to 1024.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Michal Privoznik [Mon, 19 Aug 2013 09:54:05 +0000 (11:54 +0200)]
docs: Discourage users to set hard_limit
In one of my previous patches I am removing the hard_limit heuristic to
guess the correct value if none set. However, it turned out, this limit
is hard to guess even for users. We should advise them to not set the
limit as their domains may be OOM killed. Sigh.
This function is to guess the correct limit for maximal memory
usage by qemu for given domain. This can never be guessed
correctly, not to mention all the pains and sleepless nights this
code has caused. Once somebody discovers algorithm to solve the
Halting Problem, we can compute the limit algorithmically. But
till then, this code should never see the light of the release
again.
Osier Yang [Fri, 16 Aug 2013 12:08:07 +0000 (20:08 +0800)]
storage: Update pool metadata after adding/removing/resizing volume
One has to refresh the pool to get the correct pool info after
adding/removing/resizing a volume, this updates the pool metadata
(allocation, available) after those operation are done.