Balazs Gibizer [Thu, 9 Mar 2017 16:28:02 +0000 (17:28 +0100)]
Fix missing instance.delete notification
Change I8742071b55f018f864f5a382de20075a5b444a79 introduced cases
where an instance object is destroyed without the instance.delete
notification being emitted.
This patch adds the necessary notification to restore legacy
behaviour.
Ionuț Bîru [Fri, 10 Mar 2017 15:22:22 +0000 (17:22 +0200)]
Correctly set up deprecation warning
In the current state, no warning is output in the logs.
With this fix, a warning is output in the logs and the value from
[DEFAULT] is used correctly.
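As a hedged illustration of the mechanism (the option shown is
hypothetical, not the one from this patch), oslo.config emits the
deprecation warning and falls back to the old location when the option
declares its deprecated source:

    from oslo_config import cfg

    CONF = cfg.CONF
    # Hypothetical option: declaring the old [DEFAULT] location as
    # deprecated makes oslo.config log a warning and correctly use the
    # [DEFAULT] value when the new group is not set.
    opt = cfg.StrOpt('dhcp_domain',
                     deprecated_group='DEFAULT',
                     deprecated_name='dhcp_domain')
    CONF.register_opt(opt, group='api')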
Stephen Finucane [Tue, 21 Feb 2017 20:34:50 +0000 (15:34 -0500)]
conf: Fix invalid rST comments
These were breaking the openstack-docs-tools build, which is used to
build the central config guide. We should probably add a unit test to
ensure all config options use valid (or at least, error-free) rST, but
that's a job for another day.
Matt Riedemann [Tue, 7 Mar 2017 01:25:55 +0000 (20:25 -0500)]
Remove unused placement_database config options
The placement_database config options were added in Newton,
but the code that used them was reverted and they are now
unused. The help text has a typo (Ocata is 15.0.0, not 14.0.0)
and, more importantly, makes it sound like you should be setting
up a separate database for the placement service, when in fact we
use the api_database options for all of the placement data models.
To avoid confusion until a separate placement database is actually
supported, let's just remove the options, which should have been
cleaned up as part of 39fb302fd9c8fc57d3e4bea1c60a02ad5067163f anyway.
Matt Riedemann [Tue, 7 Mar 2017 00:00:31 +0000 (19:00 -0500)]
libvirt: pass log_path to _create_pty_device for non-kvm/qemu
log_path is required in _create_pty_device if:
1. serial consoles are disabled
2. libvirt/qemu are new enough that they support virtlogd
This was working fine for kvm and qemu since _create_consoles_s390x
and _create_consoles_qemu_kvm pass in the log_path, but for the
non-kvm/qemu cases, like xen, the log_path wasn't provided.
This wasn't caught by the XenProject CI since it's using libvirt
1.3.1 which does not have virtlogd support so this path was
not exercised and apparently not unit tested either.
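A minimal standalone sketch of the constraint being fixed (names and
signature simplified from the driver code):

    def _create_pty_device(console_type, log_path=None,
                           serial_enabled=False, supports_virtlogd=True):
        # When serial consoles are disabled and virtlogd is available,
        # console output must go to a log file, so log_path is mandatory.
        # The non-kvm/qemu callers (e.g. xen) were omitting it.
        if not serial_enabled and supports_virtlogd and log_path is None:
            raise ValueError('log_path is required when serial consoles '
                             'are disabled and virtlogd is supported')
        return {'type': console_type, 'log_path': log_path}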
A release note is provided since this is a pretty severe bug if
you're running new enough libvirt/qemu and not using kvm/qemu as
the virt type: because CONF.serial_console.enabled is False by
default, you're going to have failed server creates immediately
upon upgrading to Ocata.
Stephen Finucane [Wed, 22 Feb 2017 18:02:47 +0000 (13:02 -0500)]
libvirt: Handle InstanceNotFound exception
In 'ad1c7ac2', we stopped returning NovaException from certain libvirt
operations in favour of more specific exception types. Unfortunately, as
part of this changeover we missed an exception type. Correct this
oversight.
m4cr0v [Tue, 13 Dec 2016 08:17:34 +0000 (16:17 +0800)]
Fix spice channel type
Nova generated the wrong spice channel type, 'pty'; it should be 'spicevmc'.
Closes-Bug: #1634495
Change-Id: I58e9a8df9f40f900eeabd9d40429663cbbedb8d6
(cherry picked from commit f4be97d8cf8811a64a2d9f7d990e79d45cdf0d62)
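For reference, a hedged before/after sketch of the guest XML element in
question (the SPICE agent channel; plain strings, not driver code):

    # Wrong: a pty-backed channel cannot carry the SPICE agent protocol.
    wrong = ("<channel type='pty'>"
             "<target type='virtio' name='com.redhat.spice.0'/></channel>")
    # Right: the channel type must be 'spicevmc'.
    fixed = ("<channel type='spicevmc'>"
             "<target type='virtio' name='com.redhat.spice.0'/></channel>")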
Matt Riedemann [Fri, 17 Feb 2017 18:31:41 +0000 (13:31 -0500)]
Only create vendordata_dynamic ksa session if needed
We're logging a warning about vendordata dynamic auth
not being configured every time we create a server with
a config drive. The dynamic vendordata v2 feature is entirely
optional and controlled by configuring:
CONF.api.vendordata_dynamic_targets
This change only attempts to create the ksa session
when we try to make a request, which would only happen
if CONF.api.vendordata_dynamic_targets is configured.
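A hedged sketch of the lazy-initialization pattern (class and attribute
names simplified; assumes the vendordata_dynamic_auth session options
are registered on CONF):

    from keystoneauth1 import loading as ks_loading

    from nova import conf

    CONF = conf.CONF

    class DynamicVendorData(object):
        def __init__(self):
            # Do not build (or warn about) auth configuration up front.
            self._session = None

        def _get_session(self):
            # The ksa session is created on the first real request, which
            # only happens when vendordata_dynamic_targets is configured,
            # so unconfigured auth no longer warns on every config drive.
            if self._session is None:
                self._session = ks_loading.load_session_from_conf_options(
                    CONF, 'vendordata_dynamic_auth')
            return self._session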
Maciej Józefczyk [Thu, 23 Feb 2017 11:56:04 +0000 (12:56 +0100)]
Ensure that the instance directory is removed after a successful migration/resize
Nova recreates the instance directory on the source host after a successful
migration/resize. This patch removes the migrated instance's directory
from the source host.
Jordan Pittier [Fri, 27 Jan 2017 14:22:25 +0000 (15:22 +0100)]
Fix unspecified behavior on GET /servers/detail?tenant_id=X as admin
When an admin calls /v2.1/servers/detail?tenant_id=XX, then the
`get_all` method of nova.compute.api.API is called with 2
conflicting search options: {'tenant_id': XX, 'project_id': YY}.
But because, later on, that `get_all` method defines a dict called
filter_mapping, over which we iterate, which value takes precedence
depends on the order in which the dict is iterated.
As that order is unpredictable and varies between Py2 and Py3, this
is problematic. In particular, hash randomization is disabled by
default on Python 2 but enabled by default on Py3 (unless the
PYTHONHASHSEED env var is set to a fixed value) [0].
The (unreliable) order in which we iterate over the items of that
`filter_mapping` dict is why the Tempest
test_list_servers_by_admin_with_specified_tenant test randomly fails on Py35.
This patch ensures that, if the all_tenants search option is
not set, then the `tenant_id` search option is ignored. Note that
this *is* the current behavior on Py27, as documented in lp:#1185290
and tested by Tempest here [1].
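A minimal sketch of the resulting behavior (hypothetical helper, not the
literal patch):

    def normalize_search_opts(search_opts, caller_project_id):
        # Without all_tenants, the tenant_id filter is ignored entirely,
        # so no dict-iteration order can let it override project_id.
        opts = dict(search_opts)
        if not opts.pop('all_tenants', False):
            opts.pop('tenant_id', None)
            opts['project_id'] = caller_project_id
        return opts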
Dan Smith [Mon, 27 Feb 2017 15:52:29 +0000 (07:52 -0800)]
Ignore deleted services in minimum version calculation
When we go to detect the minimum version for a given service, we
should ignore any deleted services. Without this, we will return
the minimum version of all records, including those that have been
deleted with "nova service-delete". This patch filters deleted
services from the query.
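A hedged SQLAlchemy sketch of the corrected query (the Service model and
session are assumed from the surrounding DB API):

    from sqlalchemy import func

    def minimum_service_version(session, binary):
        # Soft-deleted rows keep their old version value, so they must
        # be excluded before taking the minimum.
        return (session.query(func.min(Service.version))
                .filter(Service.binary == binary)
                .filter(Service.deleted == 0)
                .scalar())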
Use the keystone session loader in the placement reporting
Using load_session_from_conf_options has the advantage that it honors
session settings like cafile and insecure, making it possible to use
non-system TLS certificates (or to disable certificate checks entirely).
Client certificates and timeout values can also be specified.
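A hedged example of loading such a session for the [placement] group
with keystoneauth1:

    from keystoneauth1 import loading as ks_loading
    from oslo_config import cfg

    CONF = cfg.CONF
    # Registers cafile/certfile/keyfile/insecure/timeout under
    # [placement], then builds a session that honors all of them.
    ks_loading.register_session_conf_options(CONF, 'placement')
    session = ks_loading.load_session_from_conf_options(CONF, 'placement')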
melanie witt [Fri, 17 Feb 2017 17:27:57 +0000 (17:27 +0000)]
Skip soft-deleted records in 330_enforce_mitaka_online_migrations
The 330_enforce_mitaka_online_migrations migration considers
soft-deleted records as unmigrated (the blocker migration uses the
select function from sqlalchemy), but the online migrations only
migrate non-deleted records (the migrations use the model_query
function which defaults to read_deleted='no'). So even after running
all of the online migrations, operators can get stuck until they can
hard delete any soft-deleted compute_nodes, aggregates, and
pci_devices records they have.
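A hedged sketch of aligning the blocker's SELECT with what the online
migrations actually consider (the table object and the "unmigrated"
criterion here are simplified assumptions):

    from sqlalchemy import select

    # The online migrations use model_query, which defaults to
    # read_deleted='no'; the blocker must likewise skip soft-deleted
    # rows or it counts records that will never be migrated.
    unmigrated = (select([compute_nodes.c.id])
                  .where(compute_nodes.c.uuid == None)  # noqa: E711
                  .where(compute_nodes.c.deleted == 0))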
Matt Riedemann [Thu, 16 Feb 2017 17:15:12 +0000 (12:15 -0500)]
Deprecate xenserver.vif_driver config option and change default
There are two in-tree options for the xenserver.vif_driver,
the bridge driver and the ovs driver. The XenAPI subteam has
confirmed that the bridge driver is for nova-network (which is
deprecated) and the ovs driver is for Neutron, and that's how
things are tested in CI.
Since we changed the default of use_neutron to True for Ocata,
we need to change the default of vif_driver to the ovs
driver so it works with the default config, which is Neutron.
We're deprecating the option, though, since we can use the use_neutron
option to decide which vif driver to load, which will make
deploying and configuring nova with xen as the backend simpler.
Huan Xie [Sun, 22 Jan 2017 11:08:40 +0000 (03:08 -0800)]
Fix live migrate with XenServer
Live migration with XenServer as the hypervisor failed with the xapi
error "VIF_NOT_IN_MAP". There are two reasons for this
problem:
(1) Before XS7.0, XenServer supported VM live migration without
setting vif_ref and network_ref explicitly if the destination
host had the same network, but since XS7.0 this is no longer
supported and we must supply a vif_ref to network_ref mapping.
(2) In nova, the XenServer driver introduced an interim network to
fix OVS updating the wrong port in neutron (see bug 1268955);
the interim network also helps support neutron security
groups (linux bridge), as we cannot connect a VIF to a
linux bridge directly via XAPI.
To achieve this, we add {src_vif_ref: dest_network_ref}
mapping information. In pre_live_migration, we first create the
interim network on the destination host and store
{neutron_vif_uuid: dest_network_ref} in migrate_data; then on the
source host, before live_migration, we calculate the
{src_vif_ref: dest_network_ref} mapping and pass it as a parameter
to xapi when calling VM.migrate_send. We also handle the
case where the destination host is running older code that
doesn't have this new src_vif_ref mapping, such as live migrating
from an Ocata compute node to a Newton compute node.
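A hedged sketch of computing the mapping on the source host (the XAPI
calls follow the XenAPI SDK; the vif_uuid_map attribute name is an
assumption):

    def build_vif_ref_map(session, vm_ref, vif_uuid_map):
        # vif_uuid_map is the {neutron_vif_uuid: dest_network_ref} data
        # stored in migrate_data by pre_live_migration on the destination.
        vif_map = {}
        for vif_ref in session.VM.get_VIFs(vm_ref):
            other_config = session.VIF.get_other_config(vif_ref)
            neutron_id = other_config.get('nicira-iface-id')
            dest_network_ref = vif_uuid_map.get(neutron_id)
            if dest_network_ref:
                vif_map[vif_ref] = dest_network_ref
        # Passed to xapi as a parameter of VM.migrate_send.
        return vif_map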
Corey Bryant [Wed, 15 Feb 2017 21:43:50 +0000 (16:43 -0500)]
Enable defaults for cell_v2 update_cell command
Initialize optional parameters for update_cell() to None and
enable getting the transport_url and db_connection from
nova.conf if not specified as arguments.
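A hedged sketch of the defaulting behavior (field names per the
CellMapping object; the command plumbing is simplified):

    from nova import conf
    from nova import objects

    CONF = conf.CONF

    def update_cell(ctxt, cell_uuid, name=None, transport_url=None,
                    db_connection=None):
        # Fall back to the values in nova.conf when not given explicitly.
        transport_url = transport_url or CONF.transport_url
        db_connection = db_connection or CONF.database.connection
        cell = objects.CellMapping.get_by_uuid(ctxt, cell_uuid)
        if name is not None:
            cell.name = name
        cell.transport_url = transport_url
        cell.database_connection = db_connection
        cell.save()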
Matt Riedemann [Mon, 13 Feb 2017 20:48:43 +0000 (15:48 -0500)]
Cleanup some issues with CONF.placement.os_interface
This change fixes a few things with the recently added
"os_interface" option in the [placement] config group.
1. It adds tests for the scheduler report client that
were missing in the original change that added the
config option.
2. It uses the option in the "nova-status upgrade check"
command so it is consistent with how the scheduler
report client uses it.
3. It removes the restrictive choices list from the
config option definition. keystoneauth1 allows an
"auth" value for the endpoint interface which means
don't use the service catalog to find the endpoint
but instead just read it from the "auth_url" config
option. Also, the Keystone v3 API performs strict
validation of the endpoint interface when creating
an endpoint record. The list of supported interfaces
may change over time, so we shouldn't encode that
list within Nova.
4. As part of removing the choices, the release note
associated with the new option is updated and changed
from a 'feature' release note to simply 'other' since
it's not really a feature as much as it is a bug fix.
Dan Smith [Fri, 10 Feb 2017 15:37:51 +0000 (07:37 -0800)]
Remove straggling use of main db flavors in cellsv1 code
This remaining use of the flavor query routine from the cellsv1 code
still looks at the main database. This patch converts it to use the
object which looks in the right place.
Andreas Jaeger [Sun, 5 Feb 2017 14:56:27 +0000 (15:56 +0100)]
Prepare for using standard python tests
Add a simple script to set up MySQL and PostgreSQL databases. This script
can be run by users during testing and will be run by CI systems for
specific setup before running unit tests. This is exactly what is
currently done by OpenStack CI in project-config.
This allows changing the python-db jobs in project-config to
python-jobs, since python-jobs will call this script initially.
See also
http://lists.openstack.org/pipermail/openstack-dev/2016-November/107784.html
Update devref for this.
Needed-By: Iea42a0525b2c5a5cdbf8604eb23a6e7b029f6b48
Change-Id: Ie9bae659077dbe299eea131572117036065bdccf
(cherry picked from commit d60dffc6bef4d2de32b9509a62f894136c434c3c)
John Garbutt [Mon, 6 Feb 2017 17:42:59 +0000 (17:42 +0000)]
Default live_migration_progress_timeout to off
live_migration_progress_timeout aims to time out a live-migration well
before the live_migration_completion_timeout limit, by looking for when
it appears that no progress has been made copying the memory between the
hosts. However, it turns out there are several problems with the way we
monitor progress. In production and stress testing, having
live_migration_progress_timeout > 0 has caused random timeout failures
for live-migrations that take longer than live_migration_progress_timeout.
One problem is that block_migrations appear to show no progress, as it
seems we only look for progress in copying memory. Also the way we query
QEMU via libvirt breaks when there are multiple iterations of memory
copying.
We need to revisit this bug and either fix the progress mechanism or
remove all the code that checks for progress (including the
automatic trigger for post-copy). But in the meantime, let's default to
having no timeout, and warn users that have overridden this
configuration by deprecating the live_migration_progress_timeout
configuration option.
For users concerned about live-migration timeout errors, I have
cleaned up the configuration option descriptions, so they have a better
chance of avoiding the timeout errors they may come across.
Matt Riedemann [Wed, 8 Feb 2017 01:28:13 +0000 (20:28 -0500)]
Allow None for block_device_mapping_v2.boot_index
The legacy v2 API allowed None for the boot_index [1]. It
allowed this implicitly because the API code would convert
the block_device_mapping_v2 dict from the request into a
BlockDeviceMapping object, which has a boot_index field that
is nullable (allows None).
The API reference documentation [2] also says:
"To disable a device from booting, set the boot index
to a negative value or use the default boot index value,
which is None."
It appears that with the move to v2.1 and request schema
validation, the boot_index schema was erroneously set to
not allow None for a value, which is not backward compatible
with the v2 API behavior.
This change fixes the schema to allow boot_index=None again
and adds a test to show it working.
This should not require a microversion bump since it's fixing
a regression in the v2.1 API which worked in the v2 API and
is already handled throughout Nova's block device code.
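A hedged approximation of the corrected schema fragment:

    # boot_index may be an integer, an integer-like string, or null;
    # null (and negative values) mean the device is not bootable.
    boot_index_schema = {
        'type': ['integer', 'string', 'null'],
        'pattern': '^-?[0-9]+$',
    }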
Dan Smith [Tue, 7 Feb 2017 16:43:42 +0000 (08:43 -0800)]
Add an update_cell command to nova-manage
In case a cell record was incorrectly created, or needs to be updated
to point to a new database or mq endpoint, we need a way for users to
be able to update a CellMapping record. Since deleting and re-creating
is definitely not an option, this patch adds an update_cell which
allows updating the mutable fields if necessary.
Lee Yarwood [Tue, 31 Jan 2017 18:39:15 +0000 (18:39 +0000)]
libvirt: Remove redundant bdm serial mangling and saving during swap_volume
During an initial swap_volume call the serial of the original volume was
previously stashed in the connection_info of the new volume by the
compute layer. This was used by I19d5182d11 to allow the virt driver to
look up and update the existing BDM with the new volume's connection_info
after it had been used by connect_volume.
Future calls to swap_volume in the compute layer would not
update the serial found in the old volume's connection_info. This would
result in an invalid serial being copied into the new volume's
connection_info and used to perform a lookup of a BDM that didn't exist.
To correct this, we now explicitly set the serial of the new volume to
the new volume id. While the correct serial id should already be
present in the connection_info provided by most backend Cinder volume
drivers, the act of updating this dict is required by our own functional
tests to invoke a failure case:
https://git.io/vDmRE
The serial is updated once more to match the volume id returned
by os-migrate-volume-completion prior to the BDM being updated in the
compute layer.
The BDM lookup and save from the virt layer are also removed, as the
compute layer retains a reference to new_cinfo and will update the BDM
with this, including any modifications, at the end of swap_volume.
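A hedged one-line sketch of the compute-layer rule after this change
(helper name hypothetical):

    def init_new_connection_info(new_cinfo, new_volume_id):
        # The serial always comes from the new volume's id; any stale
        # serial copied from the old volume's connection_info is
        # overwritten rather than trusted.
        new_cinfo['serial'] = new_volume_id
        return new_cinfo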
Finally, the associated Tempest admin test is also extended by the
following change to attempt a second volume swap, verifying these
changes:
Dan Smith [Fri, 3 Feb 2017 17:20:11 +0000 (09:20 -0800)]
Update the upgrades part of devref
This removes some references to "online schema migrations", as well
as some references to in-progress things that have been (long since)
completed. It also clarifies some of the upgrade steps, and unifies
the notion of "offline" and "live" upgrades, calling out only a couple
places where the process differs.
This came from me explaining the document to someone and calling out
things that were no longer accurate.
EdLeafe [Thu, 2 Feb 2017 18:48:35 +0000 (18:48 +0000)]
Delete a compute node's resource provider when node is deleted
Currently when a compute node is deleted, its record in the cell DB is
deleted, but its representation as a resource provider in the placement
service remains, along with any inventory and allocations. This could
cause the placement engine to return that provider record, even though
the compute node no longer exists. And since the periodic "healing" by
the resource tracker only updates compute node resources for records in
the compute_nodes table, these old records are never removed.
This patch adds a call to delete the resource provider when the compute
node is deleted. It also adds a method to the scheduler report client
to make these calls to the placement API.
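A hedged sketch of the report client addition (the client's delete
helper and LOG are assumed from the surrounding class; error handling
trimmed):

    def delete_resource_provider(self, rp_uuid):
        # Remove the compute node's provider record from the placement
        # service so stale providers are no longer returned.
        url = '/resource_providers/%s' % rp_uuid
        resp = self.delete(url)
        if resp.status_code not in (204, 404):
            LOG.warning('Failed to delete resource provider %s: status %s',
                        rp_uuid, resp.status_code)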
John Garbutt [Thu, 2 Feb 2017 18:41:46 +0000 (18:41 +0000)]
Stop swap allocations being wrong due to MB vs GB
Swap is in MB, but allocations for disk are in GB.
We really should claim disk in GB; for now, let's just round up the swap
allocation to the next GB. While this is wasteful, it's the only safe
answer to ensure you don't overcommit resources on the node.
Updated the test so the swap is 1023MB; after rounding up, this should
claim the same 1GB of extra space for swap.
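The rounding itself, as a hedged sketch:

    import math

    def swap_claim_gb(swap_mb):
        # Never rounds down, so the node cannot be overcommitted:
        # 1023 MB -> 1 GB, 1025 MB -> 2 GB.
        return int(math.ceil(swap_mb / 1024.0))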
Matt Riedemann [Thu, 2 Feb 2017 18:34:32 +0000 (13:34 -0500)]
Clarify the [cells] config option help
People are easily confused into thinking that for cells v2
they need to configure things in the [cells] group in
nova.conf, but the cells options are strictly for cells v1
functionality, which we don't want people using anymore.
This change adds some wording to those options to make it
clear that they are for cells v1 and that is not a
recommended way to deploy Nova.
ghanshyam [Thu, 2 Feb 2017 10:02:53 +0000 (10:02 +0000)]
Fix access_ip_v4/6 filters params for servers filter
While adding the JSON schema for the servers filter query,
we added 'accessIPv4' and 'accessIPv6' as allowed params,
but they do not match what the DB has. It is 'access_ip_v4'
and 'access_ip_v6' in the DB.
This makes the 'access_ip_v4' and 'access_ip_v6' filters stop working.
The schema should be fixed accordingly to allow 'access_ip_v4'
and 'access_ip_v6' as valid filters.
'accessIPv4' and 'accessIPv6' are what the API accepts
and returns; internally, the API layer translates those params
to their respective DB fields ('access_ip_v4' and 'access_ip_v6').
So users do not know anything about 'access_ip_v4' and
'access_ip_v6'; they are not actually part of the API representation.
Eventually, the list filter and sort params should be the same as the
fields returned in GET or accepted in POST/PUT, which are 'accessIPv4'
and 'accessIPv6'. But that would be new attribute support in filters
and can be done later after more discussion.
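A hedged sketch of the translation involved (constant name
hypothetical):

    # What the API accepts and returns vs. the DB columns the filter
    # code actually matches on; the query schema must use the DB names.
    API_TO_DB_FILTER_KEYS = {
        'accessIPv4': 'access_ip_v4',
        'accessIPv6': 'access_ip_v6',
    }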
Matt Riedemann [Tue, 31 Jan 2017 19:20:55 +0000 (14:20 -0500)]
doc: add upgrade notes to the placement devref
This just adds some more detailed docs for upgrading
to Ocata and notable things for the Placement service
and related components as part of that upgrade.
The Ocata release notes will have some of this
information too, but I think it's good to also have
a single place for some of this, like an install/upgrade
guide for Placement.