Commit be1bb6c95 changed the way volumes were stored from a forward
linked list to a hash table. In doing so, it required that each vol
object would have 3 unique values as keys into tables - key, name,
and path. Due to how vHBA/NPIV LUNs are created/used this resulted
in a failure to utilize all the LUN's found during processing.
During virStorageBackendSCSINewLun processing fetch the key (or
serial value) for NPIV LUN's using virStorageFileGetNPIVKey which
will formulate a more unique key based on the serial value and
the port for the LUN.
Signed-off-by: John Ferlan <jferlan@redhat.com> ACKed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
John Ferlan [Wed, 16 Jan 2019 00:07:07 +0000 (19:07 -0500)]
util: Introduce virStorageFileGetNPIVKey
The vHBA/NPIV LUNs created via the udev processing of the
VPORT_CREATE command end up using the same serial value
as seen/generated by the /lib/udev/scsi_id as returned
during virStorageFileGetSCSIKey. Therefore, in order to
generate a unique enough key to be used when adding the
LUN as a volume during virStoragePoolObjAddVol a more
unique key needs to be generated for an NPIV volume.
The problem is illustrated by the following example, where
scsi_host5 is a vHBA used with the following LUNs:
$ lsscsi -tg
...
[5:0:4:0] disk fc:0x5006016844602198,0x101f00 /dev/sdh /dev/sg23
[5:0:5:0] disk fc:0x5006016044602198,0x102000 /dev/sdi /dev/sg24
...
Note that althrough /dev/sdh and /dev/sdi are separate LUNs, they
end up with the same serial number used for the vol->key value.
When virStoragePoolFCRefreshThread calls virStoragePoolObjAddVol
the second LUN fails to be added with the following message
getting logged:
To resolve this, virStorageFileGetNPIVKey will use a similar call
sequence as virStorageFileGetSCSIKey, except that it will add the
"--export" option to the call. This results in more detailed output
which needs to be parsed in order to formulate a unique enough key
to be used. In order to be unique enough, the returned value will
concatenate the target port as returned in the "ID_TARGET_PORT"
field from the command to the "ID_SERIAL" value.
Signed-off-by: John Ferlan <jferlan@redhat.com> ACKed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
John Ferlan [Tue, 15 Jan 2019 20:59:21 +0000 (15:59 -0500)]
storage: Rework virStorageBackendSCSISerial
Alter the code to use the virStorageFileGetSCSIKey helper
to fetch the unique key for the SCSI disk. Alter the logic
to follow the former code which would return a duplicate
of @dev when either the virCommandRun succeeded, but returned
an empty string or when WITH_UDEV was not true.
Signed-off-by: John Ferlan <jferlan@redhat.com> ACKed-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
The way we currently start qemu from CPU affinity POV is as
follows:
1) the child process is set affinity to all online CPUs (unless
some vcpu pinning was given in the domain XML)
2) Once qemu is running, cpuset cgroup is configured taking
memory pinning into account
Problem is that we let qemu allocate its memory just anywhere in
1) and then rely in 2) to be able to move the memory to
configured NUMA nodes. This might not be always possible (e.g.
qemu might lock some parts of its memory) and is very suboptimal
(copying large memory between NUMA nodes takes significant amount
of time).
The solution is to set affinity to one of (in priority order):
- The CPUs associated with NUMA memory affinity mask
- The CPUs associated with emulator pinning
- All online host CPUs
Later (once QEMU has allocated its memory) we then change this
again to (again in priority order):
- The CPUs associated with emulator pinning
- The CPUs returned by numad
- The CPUs associated with vCPU pinning
- All online host CPUs
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Erik Skultety [Thu, 24 Jan 2019 09:33:01 +0000 (10:33 +0100)]
qemu: caps: Use CAP_DAC_OVERRIDE for probing to avoid permission issues
This is mainly about /dev/sev and its default permissions 0600. Of
course, rule of 'tinfoil' would be that we can't trust anything, but the
probing code in QEMU is considered safe from security's perspective + we
can't create an udev rule for this at the moment, because ioctls and
file system permissions aren't cross-checked in kernel and therefore a
user with read permissions could issue a 'privileged' operation on SEV
which is currently only limited to root.
Erik Skultety [Mon, 21 Jan 2019 13:50:11 +0000 (14:50 +0100)]
qemu: cgroup: Expose /dev/sev/ only to domains that require SEV
SEV has a limit on number of concurrent guests. From security POV we
should only expose resources (any resources for that matter) to domains
that truly need them.
Signed-off-by: Erik Skultety <eskultet@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
This adds an additional directive to the dnsmasq configuration file that
notifies clients via dhcp about the link's MTU. Guests can then choose
adjust their link accordingly.
Signed-off-by: Casey Callendrello <cdc@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Instead of repeating the same platform for every test,
set it once, since we do the same tests with the same
input for all platforms, it's just the output that differs.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
John Ferlan [Wed, 30 Jan 2019 15:26:48 +0000 (10:26 -0500)]
tests: Fix storagepoolxml2xmltest execution for XML namespaces
Only run the pool-netfs-ns-mountopts if built WITH_STORAGE_FS and only
run pool-rbd-ns-configopts if built with WITH_STORAGE_RBD since the
namespace support is only enabled if the pool is enabled.
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
The 'qemu' binary used to provide the i386 emulator until it was renamed
to qemu-system-i386 in QEMU 1.0. Since we don't support such old
versions we don't need to check for 'qemu' when probing capabilities.
Reviewed-by: Erik Skultety <eskultet@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
storage: change custom namespace URIs to drop '/source' component
The custom namespaces were originally registered against the storage
pool source struct, but during review this was changed to the top level
storage pool struct. The namespace URIs were not updated to match, so
had a redundant '/source' component.
Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Peter Krempa [Tue, 29 Jan 2019 16:17:29 +0000 (17:17 +0100)]
qemu: blockjob: Don't report block job progress at 100% if job isn't ready
Some clients poll virDomainGetBlockJobInfo rather than wait for the
VIR_DOMAIN_BLOCK_JOB_READY event. In some cases qemu can get to 100% and
still not reach the synchronised phase. Initiating a pivot in that case
will fail.
Given that computers are interacting here, the error that the job
can't be finalized yet is not handled very well by those specific
implementations.
Our docs now correctly state to use the event. We already do a similar
output adjustment in case when the progress is not available from qemu
as in that case we'd report 0 out of 0, which some apps also incorrectly
considered as 100% complete.
In this case we subtract 1 from the progress if the ready state is not
signalled by qemu if the progress was at 100% otherwise.
storage: pass struct _virStorageBackendQemuImgInfo by reference
The struct _virStorageBackendQemuImgInfo is quite large so it is
preferrable to pass it by reference instead of by value. This requires
us to stop modifying the "compat" field.
Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
'val' is initialized from virDomainCapsFeatureTypeFromString and a
few lines earlier there was already a check for 'val < 0'.
The 'val >= 0' is thus always true. The enum conversion similarly
ensures that the val will be less than VIR_DOMAIN_CAPS_FEATURE_LAST,
so "val < VIR_DOMAIN_CAPS_FEATURE_LAST' is thus always true too.
Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Peter Krempa [Wed, 23 Jan 2019 14:54:53 +0000 (15:54 +0100)]
qemu: Label backing chain of user-provided target of blockCopy when starting the job
Be more sensible when setting labels of the target of a
virDomainBlockCopy operation. Previously we'd relabel everything in case
it's a copy job even if there's no unlabelled backing chain. Since we
are also not sure whether the backing chain is shared we don't relabel
the chain on completion of the blockjob. This certainly won't play nice
with the image permission relabelling feature.
While this does not fix the case where the image is reused and has
backing chain it certainly sanitizes all the other cases. Later on it
will also allow to do the correct thing in cases where only one layer
was introduced.
The change is necessary as in case when -blockdev will be used we will
need to hotplug the backing chain and thus labeling needs to be setup in
advance and not only at the time of pivot. To avoid multiple code paths
move the labeling now.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Wed, 23 Jan 2019 13:28:31 +0000 (14:28 +0100)]
qemu: hotplug: Refactor qemuHotplugPrepareDiskAccess to work on virStorageSource
Rather than passing in a virStorageSource which would override the
originally passed disk->src we can now drop passing in a disk completely
as all functions called inside here require a virStorageSource.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Wed, 23 Jan 2019 12:53:14 +0000 (13:53 +0100)]
qemu: driver: Remove disk source munging in qemuDomainBlockPivot
Previously there weren't any suitable functions which would allow
setting up host side of a full disk chain so we've opted to replace the
'src' in a virDomainDiskDef by the new image source.
That is now no longer necessary so remove the munging.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Wed, 23 Jan 2019 10:50:33 +0000 (11:50 +0100)]
security: Remove security driver internals for disk labeling
Security labeling of disks consists of labeling of the disk image
itself and it's backing chain. Modify
virSecurityManager[Set|Restore]ImageLabel to take a boolean flag that
will label the full chain rather than the top image itself.
This allows to delete/unify some parts of the code and will also
simplify callers in some cases.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Wed, 16 Jan 2019 14:49:07 +0000 (15:49 +0100)]
qemu: cgroup: Change qemu[Setup|Teardown]DiskCgroup to take virStorageSource
Since the disk is necessary only to get the source modify the functions
to take the source directly and rename them to
qemu[Setup|Teardown]ImageChainCgroup.
Additionally drop a pointless comment containing the old function name.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Wed, 16 Jan 2019 14:33:07 +0000 (15:33 +0100)]
qemu: domain: Allow overriding disk source in qemuDomainDetermineDiskChain
When we need to detect a chain for a image which will become the new
source for a disk (e.g. after a disk media change or a blockjob) we'd
need to replace disk->src temporarily to do so.
Move the 'disksrc' temporary variable to an argument and adjust callers.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Wed, 16 Jan 2019 14:17:24 +0000 (15:17 +0100)]
qemu: domain: Clarify temp variable scope in qemuDomainDetermineDiskChain
The function at first validates the top image of the chain, then
traverses the chain as declared in the XML (if any) and then procedes to
detect the rest of the chain from images. All of the steps have their
own temporary iterator.
Clarify the use scope of the steps by introducing a new temp variable
holding the top level source and adding comments.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
John Ferlan [Wed, 30 Jan 2019 14:31:31 +0000 (09:31 -0500)]
tests: Add storagepoolxml2argvtest source to EXTRA_DIST
Commit f2f84b4d4 added storagepoolxml2argvtest processing; however,
it didn't follow alter the else to !WITH_STORAGE and add the source
itself to the EXTRA_DIST like the other WITH_STORAGE options for
virstorageutiltest and storagevolxml2argvtest.
Signed-off-by: John Ferlan <jferlan@redhat.com> ACKed-by: Michal Privoznik <mprivozn@redhat.com>
John Ferlan [Wed, 30 Jan 2019 14:29:37 +0000 (09:29 -0500)]
tests: Fix build issue with storagevolxml2xmltest
Commit 7a227688a caused a build failure on mingw. Following
other uses of including ../src/libvirt_driver_storage_impl.la
I moved to under the WITH_STORAGE conditional.
Signed-off-by: John Ferlan <jferlan@redhat.com> ACKed-by: Michal Privoznik <mprivozn@redhat.com>
John Ferlan [Tue, 8 Jan 2019 14:28:03 +0000 (09:28 -0500)]
rbd: Utilize storage pool namespace to manage config options
Allow for adjustment of RBD configuration options via Storage
Pool XML Namespace adjustments. When namespace arguments are
used to start the pool, add a VIR_WARN to indicate that the
startup was tainted by custom config_opts.
John Ferlan [Wed, 12 Dec 2018 22:41:14 +0000 (17:41 -0500)]
storage: Add storage pool namespace options to fs and netfs command lines
If the Storage Pool Namespace XML data exists, format the mount
options on the MOUNT command line and issue a VIR_WARN to indicate
that the storage pool was tainted by custom mount_opts.
When the pool is started, the options will be generated on the
command line along with the options already defined.
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
John Ferlan [Mon, 7 Jan 2019 22:14:57 +0000 (17:14 -0500)]
storage: Add infrastructure to manage XML namespace options
Introduce the virStoragePoolFSMountOptionsDef to be used to
manage the Storage Pool XML Namespace for mount options.
Using a new virStorageBackendNamespaceInit function, set the
virStoragePoolXMLNamespace into the _virStoragePoolOptions when
the storage backend is loaded.
Modify the storagepool.rng to allow for the usage of a different
XML namespace to parse the fs_mount_opts to be included with
the fs and netfs storage pool definitions.
Modify the storagepoolxml2xmltest to utilize a properly modified
XML file to parse and format the namespace for a netfs storage pool.
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
John Ferlan [Thu, 3 Jan 2019 16:47:48 +0000 (11:47 -0500)]
conf: Introduce virStoragePoolXMLNamespace
Introduce the infrastructure necessary to manage a Storage Pool XML
Namespace. The general concept is similar to virDomainXMLNamespace,
except that for Storage Pools the storage backend specific details
can be stored within the _virStoragePoolOptions unlike the domain
processing code which manages its xmlopt's via the virDomainXMLOption
which is allocated/passed around for each domain.
This patch defines the add the parse, format, free, and href methods
required to process the XML and callout from the Storage Pool Def
parse, format, and free API's to perform the action on the XML data
for/from the backend.
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Modify the command generation to add some default options to the
fs/netfs storage pools based on the OS type. For Linux, it'll be
the "nodev, nosuid, noexec". For FreeBSD, it'll be "nosuid, noexec".
For others, just leave the options alone.
Modify the storagepoolxml2argvtest to handle the fact that the
same input XML could generate different output XML based on whether
Linux, FreeBSD, or other was being built.
Signed-off-by: John Ferlan <jferlan@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
When the guest is running we already check if pinning is
possible in the qemuDomainPinVcpuLive method, so this
check was adding no benefit.
When the guest is not running, we cannot know whether the
subsequent launch will use MTTCG or TCG, so we must allow
the pinning request. If the guest does use TCG on the next
launch it will fail, but this is no worse than if the user
had done a virDomainDefineXML with an XML doc specifying
vCPU pinning.
Reviewed-by: John Ferlan <jferlan@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
MTTCG is the new multi-threaded impl of TCG which follows
KVM in having one host OS thread per vCPU. Historically
we have discarded all PIDs reported for TCG guests, but
we must now selectively honour this data.
We don't have anything in the domain XML that indicates
whether a guest is using TCG or MTTCG. While QEMU does
have an option (-accel tcg,thread=single|multi), it is
not desirable to expose this in libvirt. QEMU will
automatically use MTTCG when the host/guest architecture
pairing is known to be safe. Only developers of QEMU TCG
have a strong reason to override this logic.
Thus we use two sanity checks to decide if the vCPU
PID information is usable. First we see if the PID
duplicates the main emulator PID, and second we see
if the PID duplicates any other vCPUs.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Peter Krempa [Mon, 21 Jan 2019 15:01:57 +0000 (16:01 +0100)]
lib: domain: Emphasise that users should wait for block job READY state via events
The transition to the ready state is best observed by events as it's
ansynchronous and does not hint users to do polling. As currently only
the qemu driver supports block copy and block commit and the ready state
event was introduced by qemu 1.3 we can fully switch to the new
approach.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Peter Krempa [Mon, 21 Jan 2019 11:28:25 +0000 (12:28 +0100)]
qemu: Don't reject making domain persistent if block copy is running
Add documentation that the 'VIR_DOMAIN_BLOCK_COPY_TRANSIENT_JOB' flag
is auto-assumed if the block copy job is started while the VM is
transient and remove the restriction to define the domain when copy
is running.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
util: move virtual network firwall rules into private chains
The previous commit created new chains to hold the firewall rules. This
commit changes the code that creates rules to place them in the new
private chains instead of the builtin top level chains.
With two networks running, the rules in the filter table now look like
-N LIBVIRT_FWI
-N LIBVIRT_FWO
-N LIBVIRT_FWX
-N LIBVIRT_INP
-N LIBVIRT_OUT
-A INPUT -j LIBVIRT_INP
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A OUTPUT -j LIBVIRT_OUT
-A LIBVIRT_FWI -d 192.168.0.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A LIBVIRT_FWI -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWI -d 192.168.1.0/24 -o virbr1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A LIBVIRT_FWI -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWO -s 192.168.0.0/24 -i virbr0 -j ACCEPT
-A LIBVIRT_FWO -i virbr0 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWO -s 192.168.1.0/24 -i virbr1 -j ACCEPT
-A LIBVIRT_FWO -i virbr1 -j REJECT --reject-with icmp-port-unreachable
-A LIBVIRT_FWX -i virbr0 -o virbr0 -j ACCEPT
-A LIBVIRT_FWX -i virbr1 -o virbr1 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A LIBVIRT_INP -i virbr1 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr1 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr1 -p udp -m udp --dport 67 -j ACCEPT
-A LIBVIRT_INP -i virbr1 -p tcp -m tcp --dport 67 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
-A LIBVIRT_OUT -o virbr1 -p udp -m udp --dport 68 -j ACCEPT
While in the nat table:
-N LIBVIRT_PRT
-A POSTROUTING -j LIBVIRT_PRT
-A LIBVIRT_PRT -s 192.168.0.0/24 -d 224.0.0.0/24 -j RETURN
-A LIBVIRT_PRT -s 192.168.0.0/24 -d 255.255.255.255/32 -j RETURN
-A LIBVIRT_PRT -s 192.168.0.0/24 ! -d 192.168.0.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.0.0/24 ! -d 192.168.0.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.0.0/24 ! -d 192.168.0.0/24 -j MASQUERADE
-A LIBVIRT_PRT -s 192.168.1.0/24 -d 224.0.0.0/24 -j RETURN
-A LIBVIRT_PRT -s 192.168.1.0/24 -d 255.255.255.255/32 -j RETURN
-A LIBVIRT_PRT -s 192.168.1.0/24 ! -d 192.168.1.0/24 -p tcp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.1.0/24 ! -d 192.168.1.0/24 -p udp -j MASQUERADE --to-ports 1024-65535
-A LIBVIRT_PRT -s 192.168.1.0/24 ! -d 192.168.1.0/24 -j MASQUERADE
util: create private chains for virtual network firewall rules
Historically firewall rules for virtual networks were added straight
into the base chains. This works but has a number of bugs and design
limitations:
- It is inflexible for admins wanting to add extra rules ahead
of libvirt's rules, via hook scripts.
- It is not clear to the admin that the rules were created by
libvirt
- Each rule must be deleted by libvirt individually since they
are all directly in the builtin chains
- The ordering of rules in the forward chain is incorrect
when multiple networks are created, allowing traffic to
mistakenly flow between networks in one direction.
To address all of these problems, libvirt needs to move to creating
rules in its own private chains. In the top level builtin chains,
libvirt will add links to its own private top level chains.
Addressing the traffic ordering bug requires some extra steps. With
everything going into the FORWARD chain there was interleaving of rules
for outbound traffic and inbound traffic for each network:
-A FORWARD -d 192.168.3.0/24 -o virbr1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.3.0/24 -i virbr1 -j ACCEPT
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -d 192.168.2.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.2.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
The rule allowing outbound traffic from virbr1 would mistakenly
allow packets from virbr1 to virbr0, before the rule denying input
to virbr0 gets a chance to run.
What we really need todo is group the forwarding rules into three
distinct sets:
* Cross rules - LIBVIRT_FWX
-A FORWARD -i virbr1 -o virbr1 -j ACCEPT
-A FORWARD -i virbr0 -o virbr0 -j ACCEPT
* Incoming rules - LIBVIRT_FWI
-A FORWARD -d 192.168.3.0/24 -o virbr1 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -d 192.168.2.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -o virbr0 -j REJECT --reject-with icmp-port-unreachable
* Outgoing rules - LIBVIRT_FWO
-A FORWARD -s 192.168.3.0/24 -i virbr1 -j ACCEPT
-A FORWARD -i virbr1 -j REJECT --reject-with icmp-port-unreachable
-A FORWARD -s 192.168.2.0/24 -i virbr0 -j ACCEPT
-A FORWARD -i virbr0 -j REJECT --reject-with icmp-port-unreachable
There is thus no risk of outgoing rules for one network mistakenly
allowing incoming traffic for another network, as all incoming rules
are evalated first.
With this in mind, we'll thus need three distinct chains linked from
the FORWARD chain, so we end up with:
Peter Krempa [Thu, 24 Jan 2019 09:35:48 +0000 (10:35 +0100)]
qemu: Don't double-free disk->mirror if block commit initialization fails
disk->mirror would not be cleared while the local pointer was freed in
qemuDomainBlockCommit if qemuDomainObjExitMonitor or qemuBlockJobDiskNew
would return a failure.
Since block job handling is executed in the separate handler which needs
a qemu job, we don't need to pre-set the mirror state prior to starting
the job. Similarly the block copy job does not do that.
Move the setting of the data after starting the job so that we avoid
this problem.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Thu, 24 Jan 2019 09:31:38 +0000 (10:31 +0100)]
qemu: blockjob: Mark job as started only when it's new
Switching a block job to some states (e.g. QEMU_BLOCKJOB_STATE_READY)
might not require a job, thus if it will become ready asynchronously we
should not overwrite the state any more.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Thu, 24 Jan 2019 08:49:26 +0000 (09:49 +0100)]
qemu: blockjob: Make sure that internal states are not reported as event
While the callers should make sure that they don't call
qemuBlockJobEmitEvents for any internal state or job, let's add checks
that prevents us from emitting wrong events altogether.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Peter Krempa [Thu, 24 Jan 2019 08:46:57 +0000 (09:46 +0100)]
lib: Fix docs generated for enum virDomainBlockJobType
Mixing documentation strings trailing the enum value and preceeding the
enum value ends in a big mixup. Fix docs string for
VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN so that it's not squished together
with the next one.
Signed-off-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Ceph can be mounted just like any other filesystem and in fact is
a shared and cluster filesystem. The filesystem magic constant
was taken from kernel sources as it is not in magic.h yet.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
Michal Privoznik [Mon, 28 Jan 2019 13:41:37 +0000 (14:41 +0100)]
lib: Use more of VIR_STEAL_PTR()
We have this very handy macro called VIR_STEAL_PTR() which steals
one pointer into the other and sets the other to NULL. The
following coccinelle patch was used to create this commit:
@ rule1 @
identifier a, b;
@@
- b = a;
...
- a = NULL;
+ VIR_STEAL_PTR(b, a);
Some places were clean up afterwards to make syntax-check happy
(e.g. some curly braces were removed where the body become a one
liner).
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> Reviewed-by: Erik Skultety <eskultet@redhat.com>
* Define virDomainXMLNamespace for the bhyve driver, which
at this point supports only the 'commandline' element
described above,
* Update command generation code to inject these command line
arguments between driver-generated arguments and the vmname
positional argument.
Signed-off-by: Roman Bogorodskiy <bogorodskiy@gmail.com> Reviewed-by: John Ferlan <jferlan@redhat.com>
Laine Stump [Thu, 10 Jan 2019 00:42:41 +0000 (19:42 -0500)]
network: remove stale function
networkMigrateStateFiles was added nearly 5 years ago when the network
state directory was moved from /var/lib/libvirt to /var/run/libvirt
just prior to libvirt-1.2.4). It was only required to maintain proper
state information for networks that were active during an upgrade that
didn't involve rebooting the host. At this point the likelyhood of
anyone upgrading their libvirt from pre-1.2.4 directly to 5.0.0 or
later *without rebooting the host* is probably so close to 0 that no
properly informed bookie would take *any* odds on it happening, so it
seems appropriate to remove this pointless code.
Signed-off-by: Laine Stump <laine@laine.org> Reviewed-by: Andrea Bolognani <abologna@redhat.com>
Eric Blake [Fri, 25 Jan 2019 03:32:55 +0000 (21:32 -0600)]
virjson: add convenience wrapper for appending string to array
Upcoming patches need an array of strings for use in QMP
block-dirty-bitmap-merge. A convenience wrapper cuts down
on the verbosity of creating the array, similar to the
existing virJSONValueObjectAppendString().
Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Eric Blake [Fri, 25 Jan 2019 03:28:23 +0000 (21:28 -0600)]
virjson: always raise vir error on append failures
A function that returns -1 for multiple possible failures, but only
raises a libvirt error for some of those failures, can be hard to
use correctly. Yet both of our JSON object/array appenders fall in
that pattern. True, the silent errors represent coding bugs that
none of the callers should ever trigger, while the noisy errors
represent memory failures that can happen anywhere, so we happened
to never end up failing without an error. But it is better to
either use the _QUIET memory allocation variants, and make callers
decide to report failure; or make all failure paths noisy. This
patch takes the latter approach.
Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com>
Now that the inner loop does not require any other variables,
it can be easily separated. Apart from reducing the indentation
level this will allow it to be called from different code paths.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Cole Robinson <crobinso@redhat.com>
Ján Tomko [Mon, 21 Jan 2019 14:49:23 +0000 (15:49 +0100)]
virPortAllocatorSetUsed: ignore port 0
Similar to what commit 86dba8f3 did for virPortAllocatorRelease,
ignore port 0 in virPortAllocatorSetUsed.
For all the reasonable use cases the callers already check that
the port is non-zero, however if the port from the XML overflows
unsigned short and turns into 0, it can be set as used by
virPortAllocatorSetUsed but not released by virPortAllocatorRelease.
Also skip port '0' in virPortAllocatorSetUsed to make this behavior
symmetric.
The serenity was disturbed by commit 5dbda5e9 which started using
virPortAllocatorRelease instead of virPortAllocatorSetUsed (false).
Thomas Huth [Fri, 25 Jan 2019 09:50:28 +0000 (10:50 +0100)]
docs/governance: Clarify the version number of the LGPL
There is no "GNU Lesser General Public License, version 2",
only version 2.1 and later. In "version 2", the license was
still called "Library" instead of "Lesser". So assume that
version 2.1 is meant here.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Thomas Huth [Fri, 25 Jan 2019 09:50:27 +0000 (10:50 +0100)]
tools/virt-xml-validate: Fix GPL information
The tools/virt-xml-validate.in file is licensed under the terms of
the GPL, but then says "You should have received a copy of the
GNU *Lesser* General Public License". Thus scratch the "Lesser" here.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Thomas Huth [Fri, 25 Jan 2019 09:50:26 +0000 (10:50 +0100)]
bootstrap.conf: Fix LGPL information
The bootstrap.conf is licensed under the terms of the LGPL, but then
suggests to "See the GNU General Public License for more details".
That should be the "GNU Lesser General Public License" instead, of
course.
Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Signed-off-by: Ján Tomko <jtomko@redhat.com>
Cole Robinson [Tue, 22 Jan 2019 21:15:03 +0000 (16:15 -0500)]
qemu: command: Make BuildVirtioDevStr more generic
Switch qemuBuildVirtioDevStr to use virDomainDeviceSetData: callers
pass in the virDomainDeviceType and the void * DefPtr. This will
save us from having to repeatedly extend the function argument
list in subsequent patches.
Reviewed-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Cole Robinson <crobinso@redhat.com>