Michal Privoznik [Wed, 18 Feb 2015 10:59:22 +0000 (11:59 +0100)]
libvirt-guests: Allow time sync on guests resume
Well, imagine domains were running, and as the host went down, they
were managesaved. Later, after some time, the host went up again and
domains got restored. But without correct time. And depending on how
long was the host shut off, it may take some time for ntp to sync the
time too. But hey, wait a minute. We have an API just for that! So:
1) Introduce SYNC_TIME variable in libvirt-guests.sysconf to allow
users control over the new functionality
2) Call 'virsh domtime --sync $dom' in the libvirt-guests script.
Unfortunately, this is all-or-nothing approach (just like anything
else with the script). Domains are required to have configured and
running qemu-ga inside.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
libxl: support backend domain setting for disk and net devices
This implement handling of <backenddomain name=''/> parameter introduced
in previous patch.
Works on Xen >= 4.3, because only there libxl supports setting backend
domain by name. Specifying backend domain by ID or UUID is currently not
supported.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
conf: support backend domain name in disk and network devices
At least Xen supports backend drivers in another domain (aka "driver
domain"). This patch introduces an XML config option for specifying the
backend domain name for <disk> and <interface> devices. E.g.
Laine Stump [Sun, 15 Feb 2015 03:43:16 +0000 (22:43 -0500)]
network: allow <pf> together with <interface>/<address> in network status
The function that parses the <forward> subelement of a network used to
fail/log an error if the network definition contained both a <pf>
element as well as at least one <interface> or <address> element. That
check was present because the configuration of a network should have
either one <pf>, one or more <interface>, or one or more <address>,
but never combinations of multiple kinds.
This caused a problem when libvirtd was restarted with a network
already active - when a network with a <pf> element is started, the
referenced PF (Physical Function of an SRIOV-capable network card) is
checked for VFs (Virtual Functions), and the <forward> is filled in
with a list of all VFs for that PF either in the form of their PCI
addresses (a list of <address>) or their netdev names (a list of
<interface>); the <pf> element is not removed though. When libvirtd is
restarted, it parses the network status and finds both the original
<pf> from the config, as well as the list of either <address> or
<interface>, fails the parse, and the network is not added to the
active list. This failure is often obscured because the network is
marked as autostart so libvirt immediately restarts it.
It seems odd to me that <interface> and <address> are stored in the
same array rather than keeping two separate arrays, and having
separate arrays would have made the check much simpler. However,
changing to use two separate arrays would have required changes in
more places, potentially creating more conflicts and (more
importantly) more possible regressions in the event of a backport, so
I chose to keep the existing data structure in order to localize the
change.
It appears that this problem has been in the code ever since support
for <pf> was added (0.9.10), but until commit 34cc3b2f106e296df5e64309620c79d16fd76c85 (first in libvirt 1.2.4)
networks with interface pools were not properly marked as active on
restart anyway, so there is no point in backporting this patch any
further than that.
Peter Krempa [Tue, 10 Feb 2015 16:11:40 +0000 (17:11 +0100)]
conf: Hoist validation of memory size into the post parse callback
Later patches will need to access the full definition to do check the
memory size and thus the checking needs to be done after the whole
definition including devices is known.
Peter Krempa [Mon, 16 Feb 2015 16:28:48 +0000 (17:28 +0100)]
conf: Move all NUMA configuration to virDomainNuma
For historical reasons data regarding NUMA configuration were split
between the CPU definition and numatune. We cannot do anything about the
XML still being split, but we certainly can at least store the relevant
data in one place.
This patch moves the NUMA stuff to the right place.
Peter Krempa [Mon, 16 Feb 2015 16:05:46 +0000 (17:05 +0100)]
numa: conf: Tweak parameters of virDomainNumatuneSet
As virDomainNumatuneSet now doesn't allocate the virDomainNuma object
any longer it's not necessary to pass the pointer to a pointer to store
the object as it will not change any longer.
While touching the parameter definitions I've also changed the name of
the parameter to "numa".
Peter Krempa [Mon, 16 Feb 2015 15:42:13 +0000 (16:42 +0100)]
conf: numa: Always allocate the NUMA config
Since our formatter now handles well if the config is allocated and not
filled we can safely always-allocate the NUMA config and remove the
ad-hoc allocation code.
This will help in later patches as the parser will be refactored to just
fill the data.
Peter Krempa [Mon, 16 Feb 2015 14:58:13 +0000 (15:58 +0100)]
conf: Separate helper for creating domain objects
Move the existing virDomainDefNew to virDomainDefNewFull as it's setting
a few things in the conf and re-introduce virDomainDefNew as a function
without parameters for common use.
Peter Krempa [Mon, 16 Feb 2015 13:17:41 +0000 (14:17 +0100)]
conf: numa: Format <numatune> XML only if necessary
Do a content-aware check if formatting of the <numatune> element is
necessary. Later on the def->numa structure will be always present so we
cannot decide only on the basis whether it's allocated.
Peter Krempa [Thu, 12 Feb 2015 15:04:45 +0000 (16:04 +0100)]
conf: numa: Refactor logic in virDomainNumatuneParseXML
Shuffling around the logic will allow to simplify the code quite a bit.
As an additional bonus the change in the logic now reports an error if
automatic placement is selected and individual placement is configured.
Peter Krempa [Wed, 11 Feb 2015 16:38:29 +0000 (17:38 +0100)]
conf: numa: Improve error message in case a numa node doesn't have cpus
Currently the code would exit without reporting an error as
virBitmapParse reports one only if it fails to parse the bitmap, whereas
the code was jumping to the error label even in case 0 cpus were
correctly parsed in the map.
Peter Krempa [Thu, 12 Feb 2015 08:37:13 +0000 (09:37 +0100)]
conf: numa: Recalculate rather than remember total NUMA cpu count
It's easier to recalculate the number in the one place it's used as
having a separate variable to track it. It will also help with moving
the NUMA code to the separate module.
Peter Krempa [Wed, 11 Feb 2015 14:40:27 +0000 (15:40 +0100)]
conf: Move enum virMemAccess to the NUMA code and rename it
Name it virNumaMemAccess and add it to conf/numa_conf.[ch]
Note that to avoid a circular dependency the type of the NUMA cell
memAccess variable was changed to int. It will be turned back later
after the circular dependency will not exist.
Peter Krempa [Wed, 11 Feb 2015 13:06:20 +0000 (14:06 +0100)]
conf: numa: Don't duplicate NUMA cell cpumask
The mask was stored both as a bitmap and as a string. The string is used
for XML output only. Remove the string, as it can be reconstructed from
the bitmap.
The test change is necessary as the bitmap formatter doesn't "optimize"
using the '^' operator.
Peter Krempa [Wed, 11 Feb 2015 12:31:09 +0000 (13:31 +0100)]
conf: Refactor virDomainNumaDefCPUParseXML
Rewrite the function to save a few local variables and reorder the code
to make more sense.
Additionally the ncells_max member of the virCPUDef structure is used
only for tracking allocation when parsing the numa definition, which can
be avoided by switching to VIR_ALLOC_N as the array is not resized
after initial allocation.
Peter Krempa [Wed, 11 Feb 2015 11:27:53 +0000 (12:27 +0100)]
conf: Move NUMA cell parsing code from cpu conf to numa conf
For weird historical reasons NUMA cells are added as a subelement of
<cpu> while the actual configuration is done in <numatune>.
This patch splits out the cell parser code from cpu config to NUMA
config. Note that the changes to the code are minimal just to make it
work and the function will be refactored in the next patch.
Peter Krempa [Wed, 11 Feb 2015 09:08:35 +0000 (10:08 +0100)]
conf: Move numatune_conf to numa_conf
For a while now there are two places that gather information about NUMA
related guest configuration. While the XML can't be changed we can at
least store the data in one place in the definition.
Rename the numatune_conf.[ch] files to numa_conf as later patches will
move the rest of the definitions from the cpu definition to this one.
Pavel Hrdina [Fri, 20 Feb 2015 07:21:05 +0000 (08:21 +0100)]
virsh: fix vcpupin info
The "virDomainGetInfo" will get for running domain only live info and for
offline domain only config info. There was no way how to get config info
for running domain. We will use "vshCPUCountCollect" instead to get the
correct cpu count that we need to pass to "virDomainGetVcpuPinInfo".
Also cleanup some unnecessary variables and checks that are done by
drivers.
Michal Privoznik [Thu, 12 Feb 2015 13:50:31 +0000 (14:50 +0100)]
virQEMUCapsCacheLookupCopy: Filter qemuCaps based on machineType
Not all machine types support all devices, device properties, backends,
etc. So until we create a matrix of [machineType, qemuCaps], lets just
filter out some capabilities before we return them to the consumer
(which is going to make decisions based on them straight away).
Currently, as qemu is unable to tell which capabilities are (not)
enabled for given machine types, it's us who has to hardcode the matrix.
One day maybe the hardcoding will go away and we can create the matrix
dynamically on the fly based on a few monitor calls.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
When editing a domain with 'virsh edit' and failing validation, the
usual message pops up:
Failed. Try again? [y,n,f,?]:
Turning off validation can be useful, mainly for testing (but other
purposes too), so this patch adds support for relaxing definition in
virsh-edit and makes 'virsh edit <domain>' more usable.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Jiri Denemark [Fri, 13 Feb 2015 13:25:27 +0000 (14:25 +0100)]
Search for schemas and cpu_map.xml in source tree
Not all files we want to find using virFileFindResource{,Full} are
generated when libvirt is built, some of them (such as RNG schemas) are
distributed with sources. The current API was not able to find source
files if libvirt was built in VPATH.
Both RNG schemas and cpu_map.xml are distributed in source tarball.
When migrating with storage, libvirt iterates over domain disks and
instruct qemu to migrate the ones we are interested in (shared, RO and
source-less disks are skipped). The disks are migrated in series. No
new disk is transferred until the previous one hasn't been quiesced.
This is checked on the qemu monitor via 'query-jobs' command. If the
disk has been quiesced, it practically went from copying its content
to mirroring state, where all disk writes are mirrored to the other
side of migration too. Having said that, there's one inherent error in
the design. The monitor command we use reports only active jobs. So if
the job fails for whatever reason, we will not see it anymore in the
command output. And this can happen fairly simply: just try to migrate
a domain with storage. If the storage migration fails (e.g. due to
ENOSPC on the destination) we resume the host on the destination and
let it run on partly copied disk.
The proper fix is what even the comment in the code says: listen for
qemu events instead of polling. If storage migration changes state an
event is emitted and we can act accordingly: either consider disk
copied and continue the process, or consider disk mangled and abort
the migration.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Tue, 10 Feb 2015 15:24:45 +0000 (16:24 +0100)]
qemuProcessHandleBlockJob: Take status into account
Upon BLOCK_JOB_COMPLETED event delivery, we check if the job has
completed (in qemuMonitorJSONHandleBlockJobImpl()). For better image,
the event looks something like this:
"timestamp": {"seconds": 1423582694, "microseconds": 372666}, "event":
"BLOCK_JOB_COMPLETED", "data": {"device": "drive-virtio-disk0", "len": 8412790784, "offset": 409993216, "speed": 8796093022207, "type":
"mirror", "error": "No space left on device"}}
If "len" does not equal "offset" it's considered an error, and we can
clearly see "error" field filled in. However, later in the event
processing this case was handled no differently to case of job being
aborted via separate API. It's time that we start differentiate these
two because of the future work.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Tue, 10 Feb 2015 14:32:59 +0000 (15:32 +0100)]
qemuProcessHandleBlockJob: Set disk->mirrorState more often
Currently, upon BLOCK_JOB_* event, disk->mirrorState is not updated
each time. The callback code handling the events checks if a blockjob
was started via our public APIs prior to setting the mirrorState.
However, some block jobs may be started internally (e.g. during
storage migration), in which case we don't bother with setting
disk->mirror (there's nothing we can set it to anyway), or other
fields. But it will come handy if we update the mirrorState in these
cases too. The event wasn't delivered just for fun - we've started the
job after all.
So, in this commit, the mirrorState is set to whatever job status
we've obtained. Of course, there are some actions on some statuses
that we want to perform. But instead of if {} else if {} else {} ...
enumeration, let's move to switch().
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Pavel Hrdina [Wed, 18 Feb 2015 15:10:58 +0000 (16:10 +0100)]
daemon: Fix segfault by reloading daemon right after start
Libvirt could crash with segfault if user issue "service reload" right
after "service start". One possible way to crash libvirt is to run reload
during initialization of QEMU driver.
It could happen when qemu driver will initialize qemu_driver_lock but
don't have a time to set it's "config" and the SIGHUP arrives. The
reload handler tries to get qemu_drv->config during "virStorageAutostart"
and dereference it which ends with segfault.
Let's ignore all reload requests until all drivers are initialized. In
addition set driversInitialized before we enter virStateCleanup to
ignore reload request while we are shutting down.
Michal Privoznik [Thu, 12 Feb 2015 16:43:27 +0000 (17:43 +0100)]
qemuBuildMemoryBackendStr: Report backend requirement more appropriately
So, when building the '-numa' command line, the
qemuBuildMemoryBackendStr() function does quite a lot of checks to
chose the best backend, or to check if one is in fact needed. However,
it returned that backend is needed even for this little fella:
This can be guaranteed via CGroups entirely, there's no need to use
memory-backend-ram to let qemu know where to get memory from. Well, as
long as there's no <memnode/> element, which explicitly requires the
backend. Long story short, we wouldn't have to care, as qemu works
either way. However, the problem is migration (as always). Previously,
libvirt would have started qemu with:
-numa node,memory=X
in this case and restricted memory placement in CGroups. Today, libvirt
creates more complicated command line:
Again, one wouldn't find anything wrong with these two approaches.
Both work just fine. Unless you try to migrated from the older libvirt
into the newer one. These two approaches are, unfortunately, not
compatible. My suggestion is, in order to allow users to migrate, lets
use the older approach for as long as the newer one is not needed.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Thu, 12 Feb 2015 16:39:34 +0000 (17:39 +0100)]
qemuxml2argvtest: Fake response from numad
Well, we can pretend that we've asked numad for its suggestion and let
qemu command line be built with that respect. Again, this alone has no
big value, but see later commits which build on the top of this.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
John Ferlan [Fri, 13 Feb 2015 20:09:09 +0000 (15:09 -0500)]
libxl: Resolve Coverity CHECKED_RETURN
Periodically my Coverity scan will return a checked_return failure
for libxlDomainShutdownThread call to libxlDomainStart. Followed the
libxlAutostartDomain example in order to check the status, emit a
message, and continue on.
Luyao Huang [Wed, 4 Feb 2015 13:42:56 +0000 (21:42 +0800)]
lxc: Fix container cleanup for LXCProcessStart
Jumping to the cleanup label prior to starting the container failed to
properly clean everything up that is handled by the virLXCProcessCleanup
which is called if virLXCProcessStop is called on failure after the
container properly starts. Most importantly is prior to this patch none
of the stop/release hooks, host device reattachment, and network cleanup
(that is reverse of virLXCProcessSetupInterfaces).
John Ferlan [Thu, 12 Feb 2015 19:32:39 +0000 (14:32 -0500)]
lxc: Modify/add some debug messages
Modify the VIR_DEBUG message in virLXCProcessCleanup to make it clearer
about the path. Also add some more VIR_DEBUG messages in virLXCProcessStart
in order to help debug error flow.
Move the two console checks - one for zero nconsoles present and the
other for an invalid console type to earlier in the processing rather than
getting after performing some setup that has to be undone for what amounts
to an invalid configuration.
This resolves the above bug since it's not not possible to have changed
the security labels when we cause the configuration check failure.
Erik Skultety [Thu, 12 Feb 2015 17:32:41 +0000 (18:32 +0100)]
security: Refactor virSecurityManagerGenLabel
if (mgr == NULL || mgr->drv == NULL)
return ret;
This check isn't really necessary, security manager cannot be a NULL
pointer as it is either selinux (by default) or 'none', if no other driver is
set in the config. Even with no config file driver name yields 'none'.
The other hunk checks for domain's security model validity, but we should
also check devices' security model as well, therefore this hunk is moved into
a separate function which is called by virSecurityManagerCheckAllLabel that
checks both the domain's security model and devices' security model.
https://bugzilla.redhat.com/show_bug.cgi?id=1165485 Signed-off-by: Ján Tomko <jtomko@redhat.com>
Erik Skultety [Thu, 12 Feb 2015 17:32:40 +0000 (18:32 +0100)]
security: introduce virSecurityManagerCheckAllLabel function
We do have a check for valid per-domain security model, however we still
do permit an invalid security model for a domain's device (those which
are specified with <source> element).
This patch introduces a new function virSecurityManagerCheckAllLabel
which compares user specified security model against currently
registered security drivers. That being said, it also permits 'none'
being specified as a device security model.
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1165485 Signed-off-by: Ján Tomko <jtomko@redhat.com>
Ján Tomko [Fri, 6 Feb 2015 14:35:57 +0000 (15:35 +0100)]
Add mrg_rxbuf option to virtio interfaces
Add an XML attribute to allow disabling merge of rx buffers
on the host:
<interface ...>
...
<model type='virtio'/>
<driver ...>
<host mrg_rxbuf='off'/>
</driver>
</interface>
Our hotplug code supports macvtap insertion to guests. However, we
somehow forgot about 'attach-interface' (which tries to build XML from
passed arguments and use virDomainAttachDeviceFlags()).
New type is accessible under 'direct' type, to keep the same type as
used in domain XML.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
virsh attach-interface: Use enum instead of arbitrary integers
The type of interface to attach is held in the variable 'typ'.
Depending on interface type selected by user, the variable is set
either to 1 (network), or 2 (bridge). Lets use already existing
enum from domain_conf.h instead: virDomainNetType.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
The enum converters are defined in the domain_conf.h (so
accessible widely across the code), but on the symbol layer, only
virDomainNetTypeToString was exposed. However, FromString variant
is going to be needed shortly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Pavel Hrdina [Thu, 12 Feb 2015 12:46:08 +0000 (13:46 +0100)]
virprocess: fix MinGW build and RHEL-5 build
Commit b6a2828e introduced new functions to set process scheduler. There
is a small typo in ELSE path for systems where scheduler is not
available.
Also some of the definitions were introduced later in kernel. For
example RHEL-5 is running on kernel 2.6.18, but SCHED_IDLE was introduces
in 2.6.23 [1] and SCHED_BATCH in 2.6.16 [1]. We should not count only on
existence of function sched_setscheduler(), we must also check for
existence of used macros as they might not be defined.
While the main storage driver code allows the flag
VIR_STORAGE_VOL_RESIZE_SHRINK to be set, none of the backend
drivers are supporting it. At the very least this can work
for plain file based volumes since we just ftruncate() them
to the new size. It does not work with qcow2 volumes, but we
can arguably delegate to qemu-img for error reporting for that
instead of second guessing this for ourselves:
$ virsh vol-resize --shrink /home/berrange/VirtualMachines/demo.qcow2 2G
error: Failed to change size of volume 'demo.qcow2' to 2G
error: internal error: Child process (/usr/bin/qemu-img resize /home/berrange/VirtualMachines/demo.qcow2 2147483648) unexpected exit status 1: qemu-img: qcow2 doesn't support shrinking images yet
qemu-img: This image does not support resize
See also https://bugzilla.redhat.com/show_bug.cgi?id=1021802
qemu: do upfront check for vcpupids being null when querying pinning
The qemuDomainHelperGetVcpus attempted to report an error when the
vcpupids info was NULL. Unfortunately earlier code would clamp the
value of 'maxinfo' to 0 when nvcpupids was 0, so the error reporting
would end up being skipped.
This lead to 'virsh vcpuinfo <dom>' just returning an empty list
instead of giving the user a clear error.
The code assumes that def->vcpus == nvcpupids, so when we setup
fake CPU pids for old QEMU with nvcpupids == 1, we cause the
later code to read off the end of the array. This has fun results
like sche_setaffinity(0, ...) which changes libvirtd's own CPU
affinity, or even better sched_setaffinity($RANDOM, ...) which
changes the affinity of a random OS process.
The intent was that this would merely disable the ability to set
per-vCPU affinity. It should still have been possible to set VM
level host CPU affinity.
Unfortunately, when you set <vcpu cpuset='0-1'>4</vcpu>, the XML
parser will internally take this & initialize an entry in the
def->cputune.vcpupin array for every VCPU. IOW this is implicitly
being treated as
Even more fun, the faked cputune elements are hidden from view when
querying the live XML, because their cpuset mask is the same as the
VM default cpumask.
The upshot was that it was impossible to set VM level CPU affinity.
To fix this we must update qemuProcessSetVcpuAffinities so that it
only reports a fatal error if the per-VCPU cpu mask is different
from the VM level cpu mask.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
libxl: disable VNC and SDL until explicitly enabled
When initializing a libxl_domain_build_info struct with
libxl_domain_build_info_init(), VNC is enabled by default. As a
result, VMs configured with no graphics still have VNC enabled.
This behavior is a regression wrt to the legacy Xen driver.
Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
util: Add virProcessSetScheduler() function for scheduler settings
This function uses sched_setscheduler() function so it works with
processes and threads as well (even threads not created by us, which is
what we'll need in the future).
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Laine Stump [Tue, 10 Feb 2015 21:17:11 +0000 (16:17 -0500)]
domain: include portgroup in interface status xml
Prior to commit 7d5bf484747 (first appearing in libvirt 1.2.2), the
status XML of a domain's interface was missing a lot of important
information; mainly it just output the config of the interface, plus
the name of the tap device and qemu device alias. Commit 7d5bf484747
changed the status XML to include many important bits of information
that were required to make network "hook" scripts useful - bandwidth
information, vlan tag, the name of the bridge (or physical device in
the case of macvtap) that the tap/macvtap device was attached to - the
commit log for 7d5bf484747 has a very detailed explanation of the
change. For quick reference - in the example given there, prior to the
change, status XML looked like figure [C]:
You'll notice that bandwidth info, physdev, and macvtap mode have been
added, but the network and portgroup names are now missing - I didn't
think that this information was of any use once the needed
bandwidth/vlan/etc config had been pulled from the network/portgroup.
I was wrong.
A few months after that change a user on IRC asked what happened to
portgroup in the status XML and described how he used it (more or less
as a tag to decide what external information to use in a hook script
that was run at startup/migration time - see
http://wiki.libvirt.org/page/OVS_and_PVLANS ). At that time I planned
to make a patch to re-add portgroup, but life intervened as that was
just prior to a transatlantic move involving several weeks of
"vacation". During this time I somehow forgot to make the patch, and
also mistakenly remembered that I *had* made it.
Subsequent to this, as a part of mprivozn's work to add support for
network-specific hooks, I did re-add the output of the network name in
status XML, but once again completely forgot about portgroup. This was
in commit a3609121 (first appearing in libvirt 1.2.11). This made the
status XML from the above example look like this:
The result is that the status XML now contains all information about
how the interface is setup (bandwidth, physical device, tap device,
etc), in addition to pointers to its origin (the network and
portgroup).
Laine Stump [Tue, 10 Feb 2015 19:04:40 +0000 (14:04 -0500)]
domain: avoid potential memory leak in virDomainGraphicsListenSet*()
virDomainGraphicsListenSetAddress() and
virDomainGraphicsListenSetNetwork() both set their respective char* to
NULL directly when asked to set it to NULL, which is okay as long as
it's already set to NULL. If these functions are ever called to clear
a listen object that has a valid string in address or network, it will
end up leaking the old value. Currently that doesn't happen, so this
is just a preemptive strike.
Laine Stump [Tue, 10 Feb 2015 18:49:16 +0000 (13:49 -0500)]
domain: backfill listen address to parent <graphics> listen attribute
Prior to 0.9.4, libvirt only supported a single listen, and it had to
be an IP address:
<graphics listen='1.2.3.4' ..../>
Starting with 0.9.4, a graphics element could have a <listen>
subelement (actually the grammar supports multiples, but all of the
drivers only support a single <listen> per <graphics>), and that
listen element can be of type='address' or type='network'. For
type='address', <listen> also has an attribute called 'address' which
contains the IP address for listening:
At domain start (or migrate) time, libvirt will attempt to
find an IP address associated with that network (e.g. the IP address
of the bridge device used by the network, or the physical device
listed in <forward dev='physdev'/>) and fill in that address in the
status XML:
In the case that a <graphics> element has a <listen> subelement of
type='address', that listen subelement's "address" attribute is
backfilled into the parent graphics element's "listen" *attribute* for
backward compatibility (so that a management application unaware of
the separate <listen> element can still learn the listen
address). This backfill should be done with the IP learned from
type='network' as well, and that's what this patch does:
virsh's domdisplay command looks in /domain/devices/graphics/@listen
of the domain's XML for the listen address, however for listen
type='network' (added in libvirt 0.9.4), the <graphics> element
doesn't have a listen attribute, but has a <listen> subelement,
*still* with no address (this is the inactive XML):
However, at domain start time the <listen> subelement gets its address
attribute filled in once libvirt figures out the IP address associated
with the named network (this is the status XML):
So in these cases, we need to look at
/domain/devices/graphics/listen/@address instead.
Even though another patch is being pushed that will backfill
listen/@address into @listen, this patch is still useful, as it fixes
domdisplay for cases of a new virsh (with this patch) connecting to a
libvirtd that is newer than 0.9.4 but doesn't have the followup patch.
Signed-off-by: Luyao Huang <lhuang@redhat.com> Signed-off-by: Laine Stump <laine@laine.org>
John Ferlan [Mon, 26 Jan 2015 19:28:25 +0000 (14:28 -0500)]
qemu: qemuOpenFileAs - set flag VIR_FILE_OPEN_FORCE_MODE
In the event we're falling into the code that tries to create the file
in a forked environment (VIR_FILE_OPEN_FORK) we pass different mode bits,
but those are never set because the virFileOpenForceOwnerMode has a check
if the OPEN_FORCE_MODE bit is set before attempting to change the mode.
Since this is a special case it seems reasonable to set u+rw,g+rw,o
John Ferlan [Wed, 28 Jan 2015 17:34:54 +0000 (12:34 -0500)]
virfile: Adjust error path for virFileOpenForked
Rather than have a dummy waitpid loop and return of the failure status
from recvfd, adjust the logic to save the recvfd error & fd and then
in priority order:
- if waitpid failed, use that errno value
- waitpid succeeded, but if the child exited abnormally, report failure
(use EACCES to report as return failure, since either EACCES or EPERM is
what caused us to fall into the fork+setuid path)
- waitpid succeeded, but if the child reported non-zero status, report
failure (use the errno value that the child encoded into exit status)
- waitpid succeeded, but if recvfd failed, report recvfd_errno
- waitpid and recvfd succeeded, use the fd
NOTE: Original logic to retry the open and force owner mode was
"documented" as only being attempted if we had already tried opening
with the fork+setuid, but checked flags vs. VIR_FILE_OPEN_NOFORK which
is counter to how we would get to that point. So that code was removed.
target libvirtd will crash because uri->scheme is NULL in
qemuMigrationPrepareDirect on this line:
if (STRNEQ(uri->scheme, "tcp") &&
Add a value check before this line. Also fix a bug like this in
doNativeMigrate, that could only happen when destination libvirtd
returned an incorrect URI.
Signed-off-by: Luyao Huang <lhuang@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Zhang Bo [Wed, 11 Feb 2015 08:48:24 +0000 (16:48 +0800)]
conf: Fix libvirtd crash and memory leak caused by virDomainVcpuPinDel()
The function virDomainVcpuPinDel() used vcpupin_list to stand for
def->cputune.vcpupin, which made the codes more readable.
However, in this function, it will realloc vcpupin_list later.
As the definition of realloc(), it may free vcpupin_list and then
points it to a new-realloced address, but def->cputune.vcpupin doesn't
point to the new address(it's freed however).
Thus,
1) When we refer to the def->cputune.vcpupin afterwards, which was freed
by realloc(), an INVALID READ occurs, and libvirtd may crash.
2) As no one will use vcpupin_list any more, and no one frees it(it's just
alloced by realloc()), memory leak occurs.
Part of the valgrind logs are shown as below:
==1837== Thread 15:
==1837== Invalid read of size 8
==1837== at 0x5367337: virDomainDefFormatInternal (domain_conf.c:18392)
which is : virBufferAsprintf(buf, "<vcpupin vcpu='%u' ",
def->cputune.vcpupin[i]->vcpuid);
==1837== by 0x536966C: virDomainObjFormat (domain_conf.c:18970)
==1837== by 0x5369743: virDomainSaveStatus (domain_conf.c:19166)
==1837== by 0x117B26DC: qemuDomainPinVcpuFlags (qemu_driver.c:4586)
==1837== by 0x53EA313: virDomainPinVcpuFlags (libvirt.c:9803)
==1837== by 0x14CB7D: remoteDispatchDomainPinVcpuFlags (remote_dispatch.h:6762)
==1837== by 0x14CC81: remoteDispatchDomainPinVcpuFlagsHelper (remote_dispatch.h:6740)
==1837== by 0x5464C30: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==1837== by 0x546507A: virNetServerProgramDispatch (virnetserverprogram.c:307)
==1837== by 0x171B83: virNetServerProcessMsg (virnetserver.c:172)
==1837== by 0x171E6E: virNetServerHandleJob (virnetserver.c:193)
==1837== by 0x5318E78: virThreadPoolWorker (virthreadpool.c:145)
==1837== Address 0x12ea2870 is 0 bytes inside a block of size 16 free'd
==1837== at 0x4C291AC: realloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==1837== by 0x52A3D14: virReallocN (viralloc.c:245)
==1837== by 0x52A3DFB: virShrinkN (viralloc.c:372)
==1837== by 0x52A3F57: virDeleteElementsN (viralloc.c:503)
==1837== by 0x533939E: virDomainVcpuPinDel (domain_conf.c:15405) //doReset为true时才会进到。
==1837== by 0x117B2642: qemuDomainPinVcpuFlags (qemu_driver.c:4573)
==1837== by 0x53EA313: virDomainPinVcpuFlags (libvirt.c:9803)
==1837== by 0x14CB7D: remoteDispatchDomainPinVcpuFlags (remote_dispatch.h:6762)
==1837== by 0x14CC81: remoteDispatchDomainPinVcpuFlagsHelper (remote_dispatch.h:6740)
==1837== by 0x5464C30: virNetServerProgramDispatchCall (virnetserverprogram.c:437)
==1837== by 0x546507A: virNetServerProgramDispatch (virnetserverprogram.c:307)
==1837== by 0x171B83: virNetServerProcessMsg (virnetserver.c:172)
Steps to reproduce the problem:
1) use virDomainPinVcpuFlags() to pin a guest's vcpu to all the pcpus
of the host.
This patch uses def->cputune.vcpupin instead of vcpupin_list to do the
realloc() job, to avoid invalid read or memory leaking.
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Yue Wenyuan <yuewenyuan@huawei.com@huawei.com>
Erik Skultety [Tue, 10 Feb 2015 16:17:36 +0000 (17:17 +0100)]
schema: allow multiple seclabel for devices in domaincommon.rng
In our RNG schema we do allow multiple (different) seclabels per-domain,
but don't allow this for devices, yet we neither have a check in our XML parser,
nor in a post-parse callback. In that case we should allow multiple
(different) seclabels for devices as well.
Peter Krempa [Tue, 3 Feb 2015 09:31:33 +0000 (10:31 +0100)]
qemu: command: Refactor creation of RNG device commandline
As the RNG device is using an -object as backend refactor the code to
use the JSON to commandline generator so that we can reuse the code
later in hotplug.
Peter Krempa [Tue, 3 Feb 2015 09:14:42 +0000 (10:14 +0100)]
qemu: command: Shuffle around formatting of alias for RNG device backend
Move the alias name right after the object type for rng-egd backend so
that we can later use the JSON to commandline generator to create the
command line.