Wang Yufei [Sat, 26 Sep 2015 12:18:03 +0000 (20:18 +0800)]
qemu_agent: fix deadlock in qemuProcessHandleAgentEOF
If VM A is shutdown a by qemu agent at appoximately the same time
an agent EOF of VM A happened, there's a chance that deadlock may occur:
qemuProcessHandleAgentEOF in main thread
A) priv->agent = NULL; //A happened before B
//deadlock when we get agent lock which's held by worker thread
qemuAgentClose(agent);
qemuDomainObjExitAgent called by qemuDomainShutdownFlags in worker thread
B) hasRefs = virObjectUnref(priv->agent); // priv->agent is NULL,
// return false
if (hasRefs)
virObjectUnlock(priv->agent); //agent lock will not be released here
In order to resolve, during EOF close the agent first, then set priv->agent
to NULL to fix the deadlock.
This essentially reverts commit id '1020a504'. It's also of note that commit
id '362d0477' notes a possible/rare deadlock similar to what was seen in
the monitor in commit id '25f582e3'. However, it seems interceding changes
including commit id 'd960d06f' should remove the deadlock issue.
With this change, if EOF is called:
Get VM lock
Check if !priv->agent || priv->beingDestroyed, then unlock VM
Call qemuAgentClose
Unlock VM
When qemuAgentClose is called
Get Agent lock
If Agent->fd open, close it
Unlock Agent
Unref Agent
qemuDomainObjEnterAgent
Enter with VM lock
Get Agent lock
Increase Agent refcnt
Unlock VM
After running agent command, calling qemuDomainObjExitAgent
Enter with Agent lock
Unref Agent
If not last reference, unlock Agent
Get VM lock
If we were in the middle of an EnterAgent, call Agent command, and
ExitAgent sequence and the EOF code is triggered, then the EOF code
can get the VM lock, make it's checks against !priv->agent ||
priv->beingDestroyed, and call qemuAgentClose. The CloseAgent
would wait to get agent lock. The other thread then will eventually
call ExitAgent, release the Agent lock and unref the Agent. Once
ExitAgent releases the Agent lock, AgentClose will get the Agent
Agent lock, close the fd, unlock the agent, and unref the agent.
The final unref would cause deletion of the agent.
Signed-off-by: Wang Yufei <james.wangyufei@huawei.com> Reviewed-by: Ren Guannan <renguannan@huawei.com>
build: Create needed folders without dependency tracking
The parameter --disable-dependency-tracking is supposed to speed up
one-time build due to the fact that it disables some dependency
extractors that, apparently, take longer time to execute. That is a
problem for code that is generated into builddir (especially some
specific subdirectory) because the directory it should be installed to
does not exists in VPATH and without the dependency tracking is not
created. Generating such file hence fails with -ENOENT. In order to
keep generating files into builddir instead of srcdir, we must create
the directory ourselves. This should finally fix the problem that is
being fixed multiple times since its introduction in commit a9fe62037214
and let us continue with cleaning those parts of Makefiles that depend
on generating files into the srcdir rather than builddir as it should
be.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
Wei Jiangang [Mon, 30 Nov 2015 10:08:40 +0000 (18:08 +0800)]
tools: fix output of list with state-shutoff
Due to the default of flags is VIR_CONNECT_LIST_DOMAINS_ACTIVE,
It doesn't show the domains that have been shutdown when we use
'virsh list' with only --state-shutoff.
Michal Privoznik [Fri, 17 Jul 2015 09:11:23 +0000 (11:11 +0200)]
conf: Split virDomainObjList into a separate file
Our domain_conf.* files are big enough. Not only they contain XML
parsing code, but they served as a storage of all functions whose
name is virDomain prefixed. This is just wrong as it gathers not
related functions (and modules) into one big file which is then
harder to maintain. Split virDomainObjList module into a separate
file called virdomainobjlist.[ch].
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Ján Tomko [Fri, 13 Nov 2015 10:26:37 +0000 (11:26 +0100)]
qemu: add capabilities for virtio input devices
Add capabilities for virtio-keyboard, virtio-mouse
and virtio-tablet devices:
name "virtio-keyboard-device", bus virtio-bus
name "virtio-keyboard-pci", bus PCI
name "virtio-mouse-device", bus virtio-bus
name "virtio-mouse-pci", bus PCI
name "virtio-tablet-device", bus virtio-bus
name "virtio-tablet-pci", bus PCI
Map both -device and -pci versions of the device to one capability.
Pavel Hrdina [Sat, 28 Nov 2015 08:03:40 +0000 (09:03 +0100)]
virlogd: fix crash if log file exists and it's larger the maxlen
If for some reason there is an existing log file, that is larger then
max length of log file, we need to rollover that file immediately.
Trying to figure out how much data we could write will resolve in
overflow of unsigned variable 'towrite' and this leads to segfault.
Erik Skultety [Mon, 5 Oct 2015 15:17:51 +0000 (17:17 +0200)]
admin: Introduce virAdmConnectGetLibVersion
Introduce a new API to get libvirt version. It is worth noting, that
libvirt-admin and libvirt share the same version number. Unfortunately,
our existing API isn't generic enough to be used with virAdmConnectPtr
as well. Also this patch wires up this API to the virt-admin client
as a generic cmdVersion command.
Erik Skultety [Mon, 12 Oct 2015 15:10:57 +0000 (17:10 +0200)]
admin: Add support for connection close callbacks
As we need a client disconnect handler, we also need a mechanism to register
such handlers for a client. This patch introduced both the close callbacks and
also the client vshAdmCatchDisconnect handler to be registered with it. By
registering the handler we still need to make sure the client can react to
daemon's events like disconnect or keepalive, so asynchronous I/O event polling
is necessary to be enabled too.
Erik Skultety [Fri, 30 Oct 2015 11:11:52 +0000 (12:11 +0100)]
admin: Add support for URI aliases
Now that we introduced URI support in libvirt-admin, we should also support URI
aliases during connection establishment phase. After applying this patch,
virAdmConnectOpen will also support VIR_CONNECT_NO_ALIASES flag.
Erik Skultety [Fri, 6 Nov 2015 09:44:53 +0000 (10:44 +0100)]
livirt: Move URI alias matching to util
As we need to provide support for URI aliases in libvirt-admin as well, URI
alias matching needs to be internally visible. Since
virConnectOpenResolveURIAlias does have a compatible signature, it could be
easily reused by libvirt-admin. This patch moves URI alias matching to util,
renaming it accordingly.
Erik Skultety [Mon, 12 Oct 2015 15:08:29 +0000 (17:08 +0200)]
admin: Do not generate remoteAdminConnect{Open,Close}
As we plan to add more and more logic to remote connecting methods,
these cannot be generated from admin_protocol.x anymore. Instead,
this patch implements these to methods explicitly.
Erik Skultety [Thu, 15 Oct 2015 12:48:58 +0000 (14:48 +0200)]
admin: Move remote admin API version to a separate module
By moving the remote version into a separate module, we gain a slightly
better maintainability in the long run than just by leaving it in one
place with the existing libvirt-admin library which can start getting
pretty messy later on.
Erik Skultety [Wed, 14 Oct 2015 13:10:02 +0000 (15:10 +0200)]
admin: Introduce virAdmConnectIsAlive
Since most of our APIs rely on an acive functional connection to a daemon and
we have such a mechanism in libvirt already, there's need to have such a way in
libvirt-admin as well. By introducing a new public API, this patch provides
support to check for an active connection.
Erik Skultety [Mon, 12 Oct 2015 15:07:21 +0000 (17:07 +0200)]
virt-admin: Introduce first working skeleton
This patch introduces virt-admin client which is based on virsh client,
but had to reimplement several methods to meet virt-admin specific needs
or remove unnecessary virsh specific logic.
Erik Skultety [Thu, 26 Nov 2015 14:08:36 +0000 (15:08 +0100)]
admin: introduce virAdmGetVersion
Unfortunately, client side version retrieval API virGetVersion uses
one-time initialization (due to the fact we might not have initialized the
library by calling connect prior to this) which is not completely compatible
with admin initialization. This API is rather simplistic and reimplementing
it for admin might be the preferred method of reusing it. Note that even though
the method will be reimplemented, the version number is still the same for both
the libvirt and libvirt-admin library.
Erik Skultety [Mon, 12 Oct 2015 14:09:53 +0000 (16:09 +0200)]
libvirt: Move config getters to util
virConnectGetConfig and virConnectGetConfigPath were static libvirt
methods, merely because there hasn't been any need for having them
internally exported yet. Since libvirt-admin also needs to reference
its config file, 'xGetConfig' should be exported.
Besides moving, this patch also renames the methods accordingly,
as they are libvirt config specific.
Erik Skultety [Thu, 26 Nov 2015 14:02:49 +0000 (15:02 +0100)]
libvirt: introduce libvirt/libvirt-common.h.in
As it turned out, we need to share some enums and declarations between
libvirt.h and libvirt-admin.h, but since our policy forbids direct includes of
libvirt*.h, there has to be some header exempt from this rule. This patch moves
the relevant part of code from libvirt.h.in to libvirt-common.h.in. Moreover,
since there is no need to have libvirt.h generated anymore, introduce a new
header libvirt.h which was previosly ignored from git and make the common
header ignored and generated instead.
qemu 2.5 provides virtio video device. It can be used with -device
virtio-vga for primary devices, or -device virtio-gpu for non-vga
devices. However, only the primary device (VGA) is supported with this
patch.
systemd: Escape only needed characters for machined
Machine name escaping follows the same rules as serice name escape,
except that '.' and '-' must not be escaped in machine names, due
to a bug in systemd-machined.
The rule for virrotatingfiletest was defined in DBUS-only block even
though the test does not use DBus at all. Also DBUS_CFLAGS and
DBUS_LIBS are removed from the rules. The original error was:
/usr/lib/gcc/x86_64-pc-linux-gnu/5.2.0/../../../../lib64/Scrt1.o: In
function `_start':
(.text+0x20): undefined reference to `main'
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
logging: remove reference to non-existent augeas files
The libvirt_logd.aug and test_libvirt_logd.aug.in files
have never existed so shouldn't be in EXTRA_DIST. It was
a copy+paste mistake when closing virtlogd from virtlockd
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
logging: avoid variables called 'daemon' due to function clash
With some versions of GLibC / GCC, a variable called 'daemon'
will result in a warning about clashing with the function also
named 'daemon'. Rename it to 'dmn' to avoid the clash.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Guido Günther [Thu, 26 Nov 2015 17:00:09 +0000 (18:00 +0100)]
virtlogd: use %llu to print 64bit types
Otherwise we fail on 32bit with:
CC logging/virtlogd-log_daemon_dispatch.o
logging/log_daemon_dispatch.c: In function 'virLogManagerProtocolDispatchDomainReadLogFile':
logging/log_daemon_dispatch.c:120:9: error: format '%zu' expects argument of type 'size_t', but argument 7 has type 'uint64_t' [-Werror=format]
logging: inhibit virtlogd shutdown while log files are open
The virtlogd daemon is launched with a 30 second timeout for
unprivileged users. Unfortunately the timeout is only inhibited
while RPC clients are connected, and they only connect for a
short while to open the log file descriptor. We need to hold
an inhibition for as long as the log file descriptor itself
is open.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: add support for sending QEMU stdout/stderr to virtlogd
Currently the QEMU stdout/stderr streams are written directly to
a regular file (eg /var/log/libvirt/qemu/$GUEST.log). While those
can be rotated by logrotate (using copytruncate option) this is
not very efficient. It also leaves open a window of opportunity
for a compromised/broken QEMU to DOS the host filesystem by
writing lots of text to stdout/stderr.
This makes it possible to connect the stdout/stderr file handles
to a pipe that is provided by virtlogd. The virtlogd daemon will
read from this pipe and write data to the log file, performing
file rotation whenever a pre-determined size limit is reached.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: convert monitor to use qemuDomainLogContextPtr indirectly
Currently the QEMU monitor is given an FD to the logfile. This
won't work in the future with virtlogd, so it needs to use the
qemuDomainLogContextPtr instead, but it shouldn't directly
access that object either. So define a callback that the
monitor can use for reporting errors from the log file.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: change qemuDomainTaint APIs to accept qemuDomainLogContextPtr
The qemuDomainTaint APIs currently expect to be passed a log file
descriptor. Change them to instead use a qemuDomainLogContextPtr
to hide the implementation details.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce a qemuDomainLogContext object to encapsulate
handling of I/O to/from the domain log file. This will
hide details of the log file implementation from the
rest of the driver, making it easier to introduce
support for virtlogd later.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: unify code for reporting errors from QEMU log files
There are two pretty similar functions qemuProcessReadLog and
qemuProcessReadChildErrors. Both read from the QEMU log file
and try to strip out libvirt messages. The latter then reports
an error, while the former lets the callers report an error.
Re-write qemuProcessReadLog so that it uses a single read
into a dynamically allocated buffer. Then introduce a new
qemuProcessReportLogError that calls qemuProcessReadLog
and reports an error.
Convert all callers to use qemuProcessReportLogError.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
qemu: remove writing to QEMU log file for rename operation
The rename operation only works on inactive virtual machines,
but it none the less writes to the log file used by the QEMU
processes. This log file is not intended to provide a general
purpose audit trail of operations performed on VMs. The audit
subsystem has recording of important operations. If we want
to extend that to cover all significant public APIs that is
a valid thing to consider, but we shouldn't arbitrarily log
specific APIs into the QEMU log file in the meantime.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add the virLogManager API which allows for communication with
the virtlogd daemon to RPC program. This provides the client
side API to open log files for guest domains.
The virtlogd daemon is setup to auto-spawn on first use when
running unprivileged. For privileged usage, systemd socket
activation is used instead.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Define a new RPC protocol for the virtlogd daemon that provides
for handling of logs. The initial RPC method defined allows a
client to obtain a file handle to use for writing to a log
file for a guest domain. The file handle passed back will not
actually refer to the log file, but rather an anonymous pipe.
The virtlogd daemon will forward I/O between them, ensuring
file rotation happens when required.
Initially the log setup is hardcoded to cap log files at
128 KB, and keep 3 backups when rolling over, which gives
a max usage of 512 KB per guest.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Import stripped down virtlockd code as basis of virtlogd
Copy the virtlockd codebase across to form the initial virlogd
code. Simple search & replace of s/lock/log/ and gut the remote
protocol & dispatcher. This gives us a daemon that starts up
and listens for connections, but does nothing with them.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
util: add APIs for reading/writing from/to rotating files
Add virRotatingFileReader and virRotatingFileWriter objects
which allow reading & writing from/to files with automation
rotation to N backup files when a size limit is reached. This
is useful for guest logging when a guaranteed finite size
limit is required. Use of external tools like logrotate is
inadequate since it leaves the possibility for guest to DOS
the host in between invokations of logrotate.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
According to the documentation, CreateMachine accepts only 7bit ASCII
characters in the machinename parameter, so let's make sure we can start
machines with unicode names with systemd. We already have a function
for that, we just forgot to use it.
Jiri Denemark [Tue, 10 Nov 2015 12:43:04 +0000 (13:43 +0100)]
qemu: Use qemuProcessLaunch in migration Prepare phase
Using qemuProcess{Init,Launch,FinishStartup} allows us to run
pre-migration commands on destination before asking QEMU to wait for
incoming migration data.
Jiri Denemark [Tue, 10 Nov 2015 11:56:29 +0000 (12:56 +0100)]
qemu: Skip starting NBD servers for offline migration
NBD storage migration will not work with offline migration anyway and we
already checked that the user did not ask for it. Thus it doesn't make
sense to keep the code after 'done' label where we jump in case of
offline migration.
Jiri Denemark [Tue, 10 Nov 2015 11:41:01 +0000 (12:41 +0100)]
qemu: Kill QEMU process if Prepare phase fails
Some failure paths in qemuMigrationPrepareAny forgot to kill the just
started QEMU process. This patch fixes this by combining 'stop' and
'endjob' label into a new label 'stopjob'. This name was chosen to avoid
confusion with the most common semantics of 'endjob'. Normally, 'endjob'
is always called at the end of an API to stop the job we entered at the
beginning. In qemuMigrationPrepareAny we only want to stop the job in
failure path; on success we need to carry the job over to the Finish
phase.
Jiri Denemark [Tue, 10 Nov 2015 15:58:12 +0000 (16:58 +0100)]
qemu: Introduce qemuProcessInit
qemuProcessStart is going to be split in three parts: qemuProcessInit,
qemuProcessLaunch, and qemuProcessFinish so that migration Prepare phase
can insert additional code in the process. qemuProcessStart will be a
small wrapper for all other callers.
qemuProcessInit prepares the domain up to the point when priv->qemuCaps
is initialized.
Dmitry Andreev [Tue, 24 Nov 2015 12:26:36 +0000 (15:26 +0300)]
Allow multiple panic devices
'model' attribute was added to a panic device but only one panic
device is allowed. This patch changes panic device presence
from 'optional' to 'zeroOrMore'.
Dmitry Andreev [Tue, 24 Nov 2015 12:26:33 +0000 (15:26 +0300)]
qemu: add support for hv_crash feature as a panic device
Panic device type used depends on 'model' attribute.
If no model is specified then device type depends on hypervisor
and guest arch. 'pseries' model is used for pSeries guest and
'isa' model is used in other cases.
Dmitry Andreev [Tue, 24 Nov 2015 12:26:31 +0000 (15:26 +0300)]
conf: add 'model' attribute for panic device with values isa, pseries, hyperv
Libvirt already has two types of panic devices - pvpanic and pSeries firmware.
This patch introduces the 'model' attribute and a new type of panic device.
'isa' model is for ISA pvpanic device.
'pseries' model is a default value for pSeries guests.
'hyperv' model is the new type. It's used for Hyper-V crash.
Schema and docs are updated for the new attribute.
Laine Stump [Mon, 23 Nov 2015 19:19:13 +0000 (14:19 -0500)]
nodedev: report maxCount for virtual_functions capability
A PCI device may have the capability to setup virtual functions (VFs)
but have them currently all disabled. Prior to this patch, if that was
the case the the node device XML for the device wouldn't report any
virtual_functions capability.
With this patch, if a file called "sriov_totalvfs" is found in the
device's sysfs directory, its contents will be interpreted as a
decimal number, and that value will be reported as "maxCount" in a
capability element of the device's XML, e.g.:
This will be reported regardless of whether or not any VFs are
currently enabled for the device.
NB: sriov_numvfs (the number of VFs currently active) is also
available in sysfs, but that value is implied by the number of items
in the list that is inside the capability element, so there is no
reason to explicitly provide it as an attribute.
sriov_totalvfs and sriov_numvfs are available in kernels at least as far
back as the 2.6.32 that is in RHEL6.7, but in the case that they
simply aren't there, libvirt will behave as it did prior to this patch
- no maxCount will be displayed, and the virtual_functions capability
will be absent from the device's XML when 0 VFs are enabled.
I've just discovered that the virtual_functions and physical_functions
capabilities are not supported in the virNodeDeviceParse functions,
only in virNodeDeviceFormat (I suppose because they are only reported,
not set from XML). This should probably be remedied, but is less
immediately useful than the current patch.
Peter Krempa [Mon, 19 Oct 2015 12:36:14 +0000 (14:36 +0200)]
conf: Drop useless check when parsing cpu scheduler info
The checked predicate is a deduction from the following checks:
1) maximum cpu id is checked for every parsed <vcpusched> element
2) the resulting bitmaps are checked for overlaps
3) there has to be at least one cpu per <vcpusched>
From the above checks we can indeed deduce that if we have one
<vcpusched> element per CPU we will have at most 'maxvcpus' of them.
Ján Tomko [Tue, 24 Nov 2015 12:14:29 +0000 (13:14 +0100)]
qemu: pass the asyncJob to qemuProcessStartCPUs
Now that new domains are started inside a QEMU_ASYNC_JOB_START job,
we need to pass it down to qemuProcessStartCPUs too.
This removes the warning:
qemuDomainObjEnterMonitorInternal:1750 : This thread seems to be the
async job owner; entering monitor without asking for a nested job is
dangerous
Introduced by commit 04c721f, before that this code path was only
executed with QEMU_ASYNC_JOB_NONE.
(This code is not executed on migration, because qemuMigrationPrepareAny
sets the VIR_QEMU_PROCESS_START_PAUSED flag.)
Ján Tomko [Thu, 19 Nov 2015 13:25:44 +0000 (14:25 +0100)]
Simplify qemuSetupChrSourceCgroup and its callers
The domain definition is not needed in any of these functions.
Only pass it to qemuSetupChardevCgroup, which is used as a callback
for virDomainChrDefForeach.
Use the right type for passing virDomainObjPtr instead of
void* where possible.
Rather than using just open on the path, allow for the possibility that
the path to be opened resides on an NFS root-squash target and was created
under a different uid/gid.
Without using virFileOpenAs an attempt to get the volume size data may fail
if the current user doesn't have permissions to read the volume, such as
would be the case if mode wasn't supplied in the volume XML and the default
VIR_STORAGE_DEFAULT_VOL_PERM_MODE (e.g. 0600) was used. Under this scenario
the owner/group is not root:root, thus this path run under root would fail
to open/read the volume.
NB: The virFileOpenAs code using OPEN_FORK will only work when the failure
is not EACESS/EPERM and the path resolves to a shared file system.
Although commit id '77346f27' resolves part of the problem regarding creating
a qemu-img image in an NFS root-squash environment, it really didn't fix the
entire problem. Unfortunately it only masked the problem. It seems qemu-img
must open/create the image using 0644, which if used by target.perms would
result in the chmod not being called since the mode desired and set match.
Although qemu-img could conceivably ignore the mode when creating, libvirt
has more knowledge of the environment and can make the adjustment to the
mode far more easily by using virFileOpenAs with VIR_FILE_OPEN_FORCE_MODE.
If that's successful, then we know on return the file will have the right
owner and mode, so we can declare success
Andrea Bolognani [Fri, 13 Nov 2015 09:37:12 +0000 (10:37 +0100)]
qemu: Add ppc64-specific math to qemuDomainGetMlockLimitBytes()
The amount of memory a ppc64 domain might need to lock is different
than that of a equally-sized x86 domain, so we need to check the
domain's architecture and act accordingly.
Andrea Bolognani [Wed, 18 Nov 2015 11:10:33 +0000 (12:10 +0100)]
qemu: Use qemuDomainRequiresMlock() when attaching PCI hostdev
The function is used everywhere else to check whether the locked
memory limit should be set / updated, and it should be used here
as well.
Moreover, qemuDomainGetMlockLimitBytes() expects the hostdev to
have already been added to the domain definition, but we only do
that at the end of qemuDomainAttachHostPCIDevice(). Work around
the issue by adding the hostdev before adjusting the locked memory
limit and removing it immediately afterwards.
Jiri Denemark [Wed, 4 Nov 2015 11:45:15 +0000 (12:45 +0100)]
qemu: Close logfd when closing monitor
Remembering to call qemuMonitorSetDomainLog in the right paths before
calling qemuProcessStop is annoying and easy to forget. And I already
forgot to do so in commit v1.2.8-52-g0389060: logfd may be leaked if
QEMU process dies between Prepare and Finish migration phases.