2. use the same cpu in 2 cell, can set success(cpu count = 8 < 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-3' memory='512000' unit='KiB'/>
<cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
3. use the same cpu in 2 cell, cannot set success(cpu count = 11 > 10):
<vcpu placement='static'>10</vcpu>
<cell id='0' cpus='0-6' memory='512000' unit='KiB'/>
<cell id='1' cpus='0-3' memory='512000' unit='KiB'/>
Add a check for numa cpus, check if duplicate use one cpu in more
than one cell.
Signed-off-by: Luyao Huang <lhuang@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Mon, 27 Apr 2015 12:54:19 +0000 (14:54 +0200)]
qemu: Implement GIC
The only version that's supported in QEMU is version 2, currently.
Fortunately, it is enabled by aarch64 automatically, so there's
nothing for us that needs to be put onto command line.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Michal Privoznik [Mon, 27 Apr 2015 12:03:19 +0000 (14:03 +0200)]
Introduce GIC feature
Some platforms, like aarch64, don't have APIC but GIC. So there's
no reason to have <apic/> feature turned on. However, we are
still missing <gic/> feature. This commit introduces the feature
to XML parser and formatter, adds documentation and updates RNG
schema.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Jiri Denemark [Mon, 4 May 2015 20:21:23 +0000 (22:21 +0200)]
qemu: Properly rename persistent def after migration
When migrating a domain while changing its name and using
VIR_MIGRATE_PERSIST_DEST flag, libvirt would fail to properly change the
name in the persistent definition. The inconsistency results in weird
behavior when dumping domain XML, destroying the domain, restarting
libvirtd and likely in several other situations.
Since the new name is already stored in vm->def->name, we just need to
make sure the persistent definition uses this new name too.
polkit: Allow password-less access for 'libvirt' group
Many users, who admin their own machines, want to be able to access
system libvirtd via tools like virt-manager without having to enter
a root password. Just google 'virt-manager without password' and
you'll find many hits. I've read at least 5 blog posts over the years
describing slightly different ways of achieving this goal.
Let's finally add official support for this.
Install a polkit-1 rules file granting password-less auth for any user
in the new 'libvirt' group. Create the group on RPM install
storage: fs: Don't try to chown directory unless user requested
Currently we try to chown any directory passed to virDirCreate,
even if the user didn't request any explicit owner/group via the
pool/vol XML.
This causes issues with qemu:///session: try to build a pool of
a root owned directory like /tmp, and it fails trying to chown the
directory to the session user. Instead it should just leave things
as they are, unless the user requests changing permissions via
the pool XML.
Similarly this is annoying if creating a storage pool via system
libvirtd of an existing directory in user $HOME, it's now owned
by root.
The virDirCreate function is pretty convoluted, since it needs to
fork off in certain specific cases. Try to document that, to make
it clear where exactly we are changing behavior.
storage: fs: Don't attempt directory creation if it already exists
The current code attempts to handle this, but it only catches mkdir
failing with EEXIST. However if say trying to build /tmp for an
unprivileged qemu:///session, mkdir will fail with EPERM.
Rather than catch any errors, just don't attempt mkdir if the directory
already exists.
Set the capability based on qmp query, or qemu version. The qmp query
includes vmport with 2.2, but no longer with 2.3. It lists only
non-machine specific capabilities, so check the qemu version too until a
machine-specific query is supported.
The QEMU machine vmport option allows to set the VMWare IO port
emulation. This emulation is useful for absolute pointer input when the
guest has vmware input drivers, and is enabled by default for kvm.
However it is unnecessary for Spice-enabled VM, since the agent already
handles absolute pointer and multi-monitors. Furthermore, it prevents
Spice from switching to relative input since the regular ps/2 pointer
driver is replaced by the vmware driver. It is thus advised to disable
vmport when using a Spice VM. This will permit the Spice client to
switch from absolute to relative pointer, as it may be required for
certain games or applications.
Luyao Huang [Fri, 20 Mar 2015 14:39:03 +0000 (15:39 +0100)]
tools: fix the wrong check when use virsh setvcpus --maximum
The --maximum option wasn't properly parsed and the equivalent flag
wasn't set. Fix this bug and also rewrite the way we check this option
by using new macro. The new approach is that --maximum requires
--config, no other combination is allowed, because they don't make sense.
The new error will be:
# virsh setvcpus test --maximum 10
error: Option --config is required by option --maximum
Pavel Hrdina [Wed, 25 Mar 2015 21:33:10 +0000 (22:33 +0100)]
qemu: use new macros for setvcpus to check flags and cleanup the code
Now that we have macros for exclusive flags and flag requirements we can
use them to cleanup the code for setvcpus and error out for all wrong
flag combination.
John Ferlan [Thu, 30 Apr 2015 19:52:03 +0000 (15:52 -0400)]
qemu: Fix bus and lun checks when scsi-disk.channel not present
Found by Laine and discussed a bit on internal IRC.
Commit id c56fe7f1d6 added support for creating a command line to support
scsi-disk.channel.
Series was here:
http://www.redhat.com/archives/libvir-list/2012-February/msg01052.html
Which pointed to a design proposal here:
http://permalink.gmane.org/gmane.comp.emulators.libvirt/50428
Which states (in part):
Libvirt should check for the QEMU "scsi-disk.channel" property. If it
is unavailable, QEMU will only support channel=lun=0 and 0<=target<=7.
However, the check added was ensuring that bus != lun *and* bus != 0. So
if bus == lun and both were non zero, we'd never make the second check.
Changing this to an *or* check fixes the check, but still is less readable
than the just checking each for 0
Pavel Hrdina [Thu, 30 Apr 2015 15:18:59 +0000 (17:18 +0200)]
rpm-build: update %files section for libxl
Recent commit 198cc1d3 introduced integration of lockd and sanlock into
libxl, but forget to update libvirt.spec.in to also list new files
distributed via package.
Ján Tomko [Wed, 29 Apr 2015 12:59:08 +0000 (14:59 +0200)]
iscsi: do not fail to stop a stopped pool
Just as we allow stopping filesystem pools when they were unmounted
externally, do not fail to stop an iscsi pool when someone else
closed the session externally.
The phyp driver stuffed it into a DomainDefPtr during its attachdevice
routine, but the value is never advertised via capabilities so it should
be safe to drop.
Have the phyp driver use OSTYPE_LINUX, which is what it advertises via
capabilities.
Michael Chapman [Thu, 16 Apr 2015 09:24:23 +0000 (19:24 +1000)]
qemu: migration: use sync block job helpers
In qemuMigrationDriveMirror we can start all disk mirrors in parallel.
We wait until they are all ready, or one of them aborts.
In qemuMigrationCancelDriveMirror, we wait until all mirrors are
properly stopped. This is necessary to ensure that destination VM is
fully in sync with the (paused) source VM.
If a drive mirror can not be cancelled, then the destination is not in a
consistent state. In this case it is not safe to continue with the
migration.
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
qemuBlockJobSyncBegin and qemuBlockJobSyncEnd delimit a region of code
where block job events are processed "synchronously".
qemuBlockJobSyncWait and qemuBlockJobSyncWaitWithTimeout wait for an
event generated by a block job.
The Wait* functions may be called multiple times while the synchronous
block job is active. Any pending block job event will be processed by
only when Wait* or End is called. disk->blockJobStatus is reset by
these functions, so if it is needed a pointer to a
virConnectDomainEventBlockJobStatus variable should be passed as the
last argument. It is safe to pass NULL if you do not care about the
block job status.
All functions assume the VM object is locked. The Wait* functions will
unlock the object for as long as they are waiting. They will return -1
and report an error if the domain exits before an event is received.
if (qemuBlockJobSyncWaitWithTimeout(driver, vm, disk,
timeout, &status) < 0) {
/* domain died while waiting for event */
ret = -1;
goto error;
}
} while (status == -1);
qemuBlockJobSyncEnd(driver, vm, disk, NULL);
Signed-off-by: Michael Chapman <mike@very.puzzling.org>
Peter Krempa [Mon, 27 Apr 2015 11:57:20 +0000 (13:57 +0200)]
qemu: blockCopy: Allow reuse of raw image for shallow block copy
The documentation states that for shallow block copy the image has to
have the same guest visible content as backing file of the current
image if the file is being reused. This condition can be achieved also
with a raw file (or a qcow without a backing file) so remove the
condition that would disallow it.
(This patch additionally fixes crash described in
https://bugzilla.redhat.com/show_bug.cgi?id=1215569 )
Zhang Bo [Tue, 28 Apr 2015 01:16:13 +0000 (09:16 +0800)]
tests: free ChardevInfo correctly in qemumonitorjsontest
The free callback should be qemuMonitorChardevInfoFree rather
than just 'free' when virHashCreate'ing the chardevInfo hash.
==29959== 24 bytes in 2 blocks are definitely lost in loss record 19 of 53
==29959== at 0x4C29F80: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==29959== by 0xB95C679: strdup (in /lib64/libc-2.20.so)
==29959== by 0x63C6546: virStrdup (virstring.c:709)
==29959== by 0x4805ED: qemuMonitorJSONExtractChardevInfo (qemu_monitor_json.c:3429)
==29959== by 0x4807A5: qemuMonitorJSONGetChardevInfo (qemu_monitor_json.c:3479)
==29959== by 0x434AEC: testQemuMonitorJSONqemuMonitorJSONGetChardevInfo (qemumonitorjsontest.c:1824)
==29959== by 0x436F2F: virtTestRun (testutils.c:211)
==29959== by 0x436932: mymain (qemumonitorjsontest.c:2404)
==29959== by 0x4382EA: virtTestMain (testutils.c:863)
==29959== by 0x436B27: main (qemumonitorjsontest.c:2423)
Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Zhou Yimin <zhouyimin@huawei.com>
storage: fs: Ignore volumes that fail to open with EACCESS/EPERM
Trying to use qemu:///session to create a storage pool pointing at
/tmp will usually fail with something like:
$ virsh pool-start tmp
error: Failed to start pool tmp
error: cannot open volume '/tmp/systemd-private-c38cf0418d7a4734a66a8175996c384f-colord.service-kEyiTA': Permission denied
If any volume in an FS pool can't be opened by the daemon, the refresh
fails, and the pool can't be used.
This causes pain for virt-install/virt-manager though. Imaging a user
downloads a disk image to /tmp. virt-manager wants to import /tmp as
a storage pool, so we can detect what disk format it is, and set the
XML correctly. However this case will likely fail as explained above.
Change the logic here to skip volumes that fail to open. This could
conceivably cause user complaints along the lines of 'why doesn't
libvirt show $ROOT-OWNED-VOLUME-FOO', but figuring that currently
the pool won't even startup, I don't think there are any current
users that care about that case.
storage: If driver startup state syncing fails, delete statefile
If you end up with a state file for a pool that no longer starts up
or refreshes correctly, the state file is never removed and adds
noise to the logs everytime libvirtd is started.
If the initial state syncing fails, delete the statefile.
John Ferlan [Mon, 27 Apr 2015 18:24:34 +0000 (14:24 -0400)]
qemu: qemuProcessDetectIOThreadPIDs invert checks
If we received zero iothreads from the monitor, but were perhaps
expecting to receive something, then the code was skipping the check
to ensure what's in the monitor matches our expectations. So invert
the checks to check that what we get back matches expectations and
then check there are zero iothreads returned.
John Ferlan [Mon, 27 Apr 2015 18:16:54 +0000 (14:16 -0400)]
qemu: Remove need for qemuDomainParseIOThreadAlias
Rather than have a separate routine to parse the alias of an iothread
returned from qemu in order to get the iothread_id value, parse the alias
when returning and just return the iothread_id in qemuMonitorIOThreadInfoPtr
This set of patches removes the function, changes the "char *name" to
"unsigned int" and handles all the fallout.
CC conf/libvirt_conf_la-domain_conf.lo
conf/domain_conf.c:13377:9: error: variable 'cpumask' is used
uninitialized whenever 'if' condition is true
[-Werror,-Wsometimes-uninitialized]
if (!(tmp = virXMLPropString(node, "cpuset"))) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
and many other similar errors regarding the 'cpuset' variable.
Laine Stump [Fri, 24 Apr 2015 18:15:44 +0000 (14:15 -0400)]
network: check newDef for used bridge names in addition to def
If someone has updated a network to change its bridge name, but the
network is still active (so that bridge name hasn't taken effect yet),
we still want to disallow another network from taking that new name.
Laine Stump [Thu, 23 Apr 2015 18:29:08 +0000 (14:29 -0400)]
network: check for bridge name conflict with existing devices
Since some people use the same naming convention as libvirt for bridge
devices they create outside the context of libvirt, it is much nicer
if we check for those devices when looking for a bridge device name to
auto-assign to a new network.
Laine Stump [Thu, 23 Apr 2015 16:49:59 +0000 (12:49 -0400)]
network: move auto-assign of bridge name from XML parser to net driver
We already check that any auto-assigned bridge device name for a
virtual network (e.g. "virbr1") doesn't conflict with the bridge name
for any existing libvirt network (via virNetworkSetBridgeName() in
conf/network_conf.c).
We also want to check that the name doesn't conflict with any bridge
device created on the host system outside the control of libvirt
(history: possibly due to the ploriferation of references to libvirt's
bridge devices in HOWTO documents all around the web, it is not
uncommon for an admin to manually create a bridge in their host's
system network config and name it "virbrX"). To add such a check to
virNetworkBridgeInUse() (which is called by virNetworkSetBridgeName())
we would have to call virNetDevExists() (from util/virnetdev.c); this
function calls ioctl(SIOCGIFFLAGS), which everyone on the mailing list
agreed should not be done from an XML parsing function in the conf
directory.
To remedy that problem, this patch removes virNetworkSetBridgeName()
from conf/network_conf.c and puts an identically functioning
networkBridgeNameValidate() in network/bridge_driver.c (because it's
reasonable for the bridge driver to call virNetDevExists(), although
we don't do that yet because I wanted this patch to have as close to 0
effect on function as possible).
There are a couple of inevitable changes though:
1) We no longer check the bridge name during
virNetworkLoadConfig(). Close examination of the code shows that
this wasn't necessary anyway - the only *correct* way to get XML
into the config files is via networkDefine(), and networkDefine()
will always call networkValidate(), which previously called
virNetworkSetBridgeName() (and now calls
networkBridgeNameValidate()). This means that the only way the
bridge name can be unset during virNetworkLoadConfig() is if
someone edited the config file on disk by hand (which we explicitly
prohibit).
2) Just on the off chance that somebody *has* edited the file by hand,
rather than crashing when they try to start their malformed
network, a check for non-NULL bridge name has been added to
networkStartNetworkVirtual().
(For those wondering why I don't instead call
networkValidateBridgeName() there to set a bridge name if one
wasn't present - the problem is that during
networkStartNetworkVirtual(), the lock for the network being
started has already been acquired, but the lock for the network
list itself *has not* (because we aren't adding/removing a
network). But virNetworkBridgeInuse() iterates through *all*
networks (including this one) and locks each network as it is
checked for a duplicate entry; it is necessary to lock each network
even before checking if it is the designated "skip" network because
otherwise some other thread might acquire the list lock and delete
the very entry we're examining. In the end, permitting a setting of
the bridge name during network start would require that we lock the
entire network list during any networkStartNetwork(), which
eliminates a *lot* of parallelism that we've worked so hard to
achieve (it can make a huge difference during libvirtd startup). So
rather than try to adjust for someone playing against the rules, I
choose to instead give them the error they deserve.)
3) virNetworkAllocateBridge() (now removed) would leak any "template"
string set as the bridge name. Its replacement
networkFindUnusedBridgeName() doesn't leak the template string - it
is properly freed.
Laine Stump [Mon, 27 Apr 2015 17:43:06 +0000 (13:43 -0400)]
test: Fix actual vs. expected in virtTestCompareFiles
Commit ca329299 added a utility function virtTestCompareFiles() to
eliminate repetitive code in several test programs. It unfortunately
calls virtTestDifference() with the arguments in the wrong order -
strcontent is the "actual" output gathered by the test rig, while
filecontent is the "expected", and virtTestDifference() wants expected
(filecontent) followed by actual (strcontent), but
virtTestCompareFiles() does the opposite, which can make the output a
bit confusing when there is a failure.
John Ferlan [Mon, 27 Apr 2015 11:25:22 +0000 (07:25 -0400)]
qemu: Resolve Coverity DEADCODE
Coverity notes that the switch() used to check 'connected' values has
two DEADCODE paths (_DEFAULT & _LAST). Since 'connected' is a boolean
it can only be one or the other (CONNECTED or DISCONNECTED), so it just
seems pointless to use a switch to get "all" values. Convert to if-else
OPTIONS
[--domain] <string> domain name, id or uuid
[--id] <number> iothread for the new IOThread
--config affect next boot
--live affect running domain
--current affect current domain
$ virsh iothreaddel --help
NAME
iothreaddel - delete an IOThread from the guest domain
DESCRIPTION
Delete an IOThread from the guest domain.
OPTIONS
[--domain] <string> domain name, id or uuid
[--id] <number> iothread_id for the IOThread to delete
--config affect next boot
--live affect running domain
--current affect current domain
Assuming a running $dom with multiple IOThreads assigned and that
that the $dom has disks assigned to IOThread 1 and IOThread 2:
$ virsh iothreadinfo $dom
IOThread ID CPU Affinity
---------------------------------------------------
1 2
2 3
3 0-1
$ virsh iothreadadd $dom 1
error: invalid argument: an IOThread is already using iothread_id '1' in iothreadpids
$ virsh iothreadadd $dom 1 --config
error: invalid argument: an IOThread is already using iothread_id '1' in persistent iothreadids
John Ferlan [Wed, 18 Mar 2015 10:51:12 +0000 (06:51 -0400)]
qemu: Add support to Add/Delete IOThreads
Add qemuDomainAddIOThread and qemuDomainDelIOThread in order to add or
remove an IOThread to/from the host either for live or config optoins
The implementation for the 'live' option will use the iothreadpids list
in order to make decision, while the 'config' option will use the
iothreadids list. Additionally, for deletion each may have to adjust
the iothreadpin list.
IOThreads are implemented by qmp objects, the code makes use of the existing
qemuMonitorAddObject or qemuMonitorDelObject APIs.
John Ferlan [Thu, 23 Apr 2015 18:01:48 +0000 (14:01 -0400)]
domain: Introduce virDomainIOThreadSchedDelId
We're about to allow IOThreads to be deleted, but an iothreadid may be
included in some domain thread sched, so add a new API to allow removing
an iothread from some entry.
Then during the writing of the threadsched data and an additional check
to determine whether the bitmap is all clear before writing it out.
John Ferlan [Tue, 21 Apr 2015 21:21:28 +0000 (17:21 -0400)]
conf: Adjust the iothreadsched expectations
With iothreadid's allowing any 'id' value for an iothread_id, the
iothreadsched code needs a slight adjustment to allow for "any"
unsigned int value in order to create the bitmap of ids that will
have scheduler adjustments. Adjusted the doc description as well.
John Ferlan [Tue, 21 Apr 2015 17:35:33 +0000 (13:35 -0400)]
conf: Move virDomainPinIsDuplicate and make static
Since it's only ever referenced in domain_conf.c, make the function
static, but also will need to move it to somewhere before it's referenced
rather than forward referencing it.
John Ferlan [Fri, 10 Apr 2015 13:21:23 +0000 (09:21 -0400)]
qemu: Use domain iothreadids to IOThread's 'thread_id'
Add 'thread_id' to the virDomainIOThreadIDDef as a means to store the
'thread_id' as returned from the live qemu monitor data.
Remove the iothreadpids list from _qemuDomainObjPrivate and replace with
the new iothreadids 'thread_id' element.
Rather than use the default numbering scheme of 1..number of iothreads
defined for the domain, use the iothreadid's list for the iothread_id
Since iothreadids list keeps track of the iothread_id's, these are
now used in place of the many places where a for loop would "know"
that the ID was "+ 1" from the array element.
The new tests ensure usage of the <iothreadid> values for an exact number
of iothreads and the usage of a smaller number of <iothreadid> values than
iothreads that exist (and usage of the default numbering scheme).
John Ferlan [Thu, 2 Apr 2015 23:59:25 +0000 (19:59 -0400)]
conf: Add new domain XML element 'iothreadids'
Adding a new XML element 'iothreadids' in order to allow defining
specific IOThread ID's rather than relying on the algorithm to assign
IOThread ID's starting at 1 and incrementing to iothreads count.
This will allow future patches to be able to add new IOThreads by
a specific iothread_id and of course delete any exisiting IOThread.
Each iothreadids element will have 'n' <iothread> children elements
which will have attribute "id". The "id" will allow for definition
of any "valid" (eg > 0) iothread_id value.
On input, if any <iothreadids> <iothread>'s are provided, they will
be marked so that we only print out what we read in.
On input, if no <iothreadids> are provided, the PostParse code will
self generate a list of ID's starting at 1 and going to the number
of iothreads defined for the domain (just like the current algorithm
numbering scheme). A future patch will rework the existing algorithm
to make use of the iothreadids list.
On output, only print out the <iothreadids> if they were read in.
Michal Privoznik [Sat, 25 Apr 2015 08:06:29 +0000 (10:06 +0200)]
openvz: Drop useless domain lookup
The lookup is just for check whether a domain we are about to add does
not already exists. Well, the virDomainObjListAdd() function does that
for us already so there's no need to duplicate the check.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Peter Krempa [Fri, 24 Apr 2015 14:48:26 +0000 (16:48 +0200)]
qemu: Connect to guest agent after channel hotplug
If a user hot-attaches the guest agent channel libvirt would ignore it
until the restart of libvirtd or shutdown/destroy and start of the VM
itself.
This patch adds code that opens or closes the guest agent connection
according to the state of the guest agent channel according to
connect/disconnect events.
To allow opening the channel from the event handler qemuConnectAgent
needed to be exported.
Peter Krempa [Fri, 24 Apr 2015 14:43:38 +0000 (16:43 +0200)]
qemu: agent: Differentiate errors when the agent channel was hotplugged
When the guest agent channel gets hotplugged to a VM, libvirt would
still report that "QEMU guest agent is not configured" rather than
stating that the connection was not established yet.
Currently the code won't be able to connect to the agent after hotplug
but that will change in a later patch.
As the qemuFindAgentConfig() helper is quite helpful in this case move
it to a more usable place and export it.