David Scott [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
Remove all mutable state within the database layer, leaving one single global reference (to the master's single "database"). Allow the type-safe Db.* API to be used on more than one database at a time, by adding the "current database" to Context.t. Add a notion of database callbacks which are used by xapi for both the redo-log(s) and the event system.
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
David Scott [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
Remove the xapi_minor from the database manifest because it shouldn't have been used: version checks should have considered the schema version instead.
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
David Scott [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
Remove the xapi_major from the database manifest because it shouldn't have been used: version checks should have considered the schema version instead.
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
David Scott [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
Remove the pool_token from the database manifest since it doesn't make sense to preserve the pool secret if all other hosts are being deleted anyway (on database restore)
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Mike McClurg [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
CA-47663 The host blob sync logs 10s of thousands of log lines per iteration
The rsync command called by Xapi_sync.sync_host was given the -v (verbose) option, which made it spew tens of thousands of lines into the xensource.log file. I removed the -v option and added --stats in order to keep the file transfer summaries in the output.
Signed-off-by: Mike McClurg <mike.mcclurg@citrix.com>
Mike McClurg [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
CA-47663: Fix indentation and whitespace problems in xapi_sync.ml
This patch is in preparation to resolve issue CA-47663. I wanted to make a lot of whitespace and indentation changes to this file for readability's sake, and decided to keep those changes separate from the (minor) changes that I will make to actually fix this issue.
Signed-off-by: Mike McClurg <mike.mcclurg@citrix.com>
Alex Zeffertt [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
CA-47556: Revert to PV template for RHEL 6 64 bit for now
From ticket description:
See CP-1876 - RH have changed their kernel to require an option to use PV on HVM.
We currently have no way to set that option ourselves so would rely on the customer
doing it which they will be very unlikely to do in most cases. Also it's not easy
to switch from emulated drivers to PV therefore the chances are customers will end
up with a poorly performing fully HVM VM.
For RTM I suggest we revert to using the PV template for 64 bit (just like we do
for 32 bit). This is a simple revert of the template code.
Long term we want to move to HVM to avoid the performance penalty of 64 bit PV but
RH6 isn't quite ready yet. This could be a change we make when we turn RH6 support
into fully supported, rather than experimental, in a future LCM update
Signed-off-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com>
Mike McClurg [Wed, 26 Jan 2011 17:39:05 +0000 (17:39 +0000)]
CA-47135 Add "experimental" to RHEL 6 and Debian Squeeze templates
Added "is_experimental" argument to make_long_name function, and added
optional "is_experimental" argument to template building functions. If
a template is experimental (untested), pass the argument
~is_experimental:true to the *_template function.
CA-48240: Debian Squeeze 32-bit shouldn't be experimental, Ubuntu 32-bit
and 64-bit should.
CP-1686: In Boston the Ubuntu templates should not be marked experimental
as they will be tested and supported.
Signed-off-by: Mike McClurg <mike.mcclurg@citrix.com> Signed-off-by: David Scott <dave.scott@eu.citrix.com>
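A minimal sketch of the optional-argument pattern described above. The real signatures, the template function name (linux_template here is hypothetical) and the exact suffix text are not shown in this log, so all three are assumptions:

```ocaml
(* Assumed shape: make_long_name takes the flag as a plain argument and
   appends a marker to the template name when it is set. *)
let make_long_name name architecture is_experimental =
  let base = Printf.sprintf "%s (%s)" name architecture in
  if is_experimental then base ^ " (experimental)" else base

(* Hypothetical template builder: the flag is optional and defaults to
   false, so existing callers need no change. *)
let linux_template ?(is_experimental = false) name architecture =
  make_long_name name architecture is_experimental

let rhel6 = linux_template ~is_experimental:true "RHEL 6" "64-bit"
let squeeze = linux_template "Debian Squeeze" "32-bit"
```

Defaulting the flag to false means only the untested templates have to mention it.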
Fixed whitespace (again) in a few files. Modified xapi_ha.ml to
deactivate statefile/metadata vdis before detaching them. Added helper
files in xha_*.ml to do this for us.
Signed-off-by: Mike McClurg <mike.mcclurg@citrix.com>
Zheng Li [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
CA-41553: Fix logic bugs in vm_install_real and do some code cleanup
There were two logic bugs in vm_install_real
* When a user creates a VM based on a snapshot (which XenServer also treats as a template) and specifies neither sr-name-label nor sr-uuid (neither is needed anyway), the code fails if the pool has no default SR set (which should not be required either). This is the problem spot in CA-41553.
* When both sr-uuid and sr-name-label are specified on the command line at the same time:
- If they contradict each other, say the SR with sr-uuid doesn't have the name specified in sr-name-label, XenServer only takes sr-name-label into consideration and ignores sr-uuid without a warning.
- If sr-name-label corresponds to several SRs in the system, then instead of using sr-uuid to restrict the candidates to one, XenServer simply fails and complains "Multiple SRs with that name-label found".
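One plausible reading of the intended behaviour, sketched under assumptions: the types, the function name choose_sr and every error string except "Multiple SRs with that name-label found" are hypothetical, and this is not the code in vm_install_real.

```ocaml
(* Hypothetical, simplified view of an SR. *)
type sr = { uuid : string; name_label : string }

type choice =
  | Use of sr
  | Unspecified  (* nothing requested: the caller decides whether an SR is needed *)
  | Fail of string

let choose_sr ?sr_uuid ?sr_name_label (srs : sr list) : choice =
  match sr_uuid, sr_name_label with
  | None, None -> Unspecified
  | Some u, None -> (
      match List.filter (fun s -> s.uuid = u) srs with
      | [ s ] -> Use s
      | _ -> Fail "No SR with that uuid found")
  | None, Some n -> (
      match List.filter (fun s -> s.name_label = n) srs with
      | [ s ] -> Use s
      | [] -> Fail "No SR with that name-label found"
      | _ -> Fail "Multiple SRs with that name-label found")
  | Some u, Some n -> (
      (* The uuid is unique, so it disambiguates a shared name-label;
         a mismatch is reported rather than silently ignored. *)
      match List.filter (fun s -> s.uuid = u) srs with
      | [ s ] when s.name_label = n -> Use s
      | [ _ ] -> Fail "sr-uuid and sr-name-label refer to different SRs"
      | _ -> Fail "No SR with that uuid found")
```

Returning Unspecified rather than failing leaves the snapshot case free to proceed without a default SR.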
xapi was failing nastily on start-up if the xensource-inventory file was
missing. Now it generates a minimal one if none exists. This does not include
a build number, so version.ml now falls back to using a build number from the
Make environment if one is not available from the inventory, i.e. it falls
back to the behaviour from before Matthias's commit for CA-43574 (build number
from xensource-inventory).
Signed-off-by: Thomas Sanders <thomas.sanders@citrix.com>
Alex Zeffertt [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
Call /opt/xensource/sm/mpathcount.py after xapi started if root disk is multipathed
mpathcount.py now updates the host object when the root disk is multipathed as well
as updating pbd objects. (It writes values into other-config to show XenCenter
how many paths are active and how many are failed.)
Normally it is multipathd that calls mpathcount.py, but in the case of the root
disk the /dev/mapper node is created by the initrd before multipathd is started.
Signed-off-by: Alex Zeffertt <alex.zeffertt@eu.citrix.com>
Jon Ludlam [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
CP-1981: Hook in the reset-vdis script to HA
The 'locks' for the VDIs, which are maintained by the SM backends now, are stored in xapi's database. In the event of a master failover, the pool database may revert to a previous backup, which is then resynced with reality. Unfortunately there's no logic to resync the SM backends' data, nor any hook point to do this. Ideally, the SM backends would store their critical data internally somehow, but until this happens xapi will have to contain the important logic to resynchronise these locks.
This patch adds a host-post-declare-dead script that causes the reset of the locks of VDIs that were present on a host that has been declared dead.
Signed-off-by: Jon Ludlam <jonathan.ludlam@eu.citrix.com>
Jon Ludlam [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
CP-1981: Resynchronise the locks in the sm_config maps of VDIs in dbsync_slave
The 'locks' for the VDIs, which are maintained by the SM backends now, are stored in xapi's database. In the event of a master failover, the pool database may revert to a previous backup, which is then resynced with reality. Unfortunately there's no logic to resync the SM backends' data, nor any hook point to do this. Ideally, the SM backends would store their critical data internally somehow, but until this happens xapi will have to contain the important logic to resynchronise these locks.
This patch implements the logic to resynchronise the sm_config keys based on the information stored in the local database
Signed-off-by: Jon Ludlam <jonathan.ludlam@eu.citrix.com>
Jon Ludlam [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
CP-1981: Track the vdi_activations in the local database
The 'locks' for the VDIs, which are maintained by the SM backends now, are stored in xapi's database. In the event of a master failover, the pool database may revert to a previous backup, which is then resynced with reality. Unfortunately there's no logic to resync the SM backends' data, nor any hook point to do this. Ideally, the SM backends would store their critical data internally somehow, but until this happens xapi will have to contain the important logic to resynchronise these locks.
This patch maintains a list of the VDIs that have been activated (including RW/RO mode) in the local database.
Signed-off-by: Jon Ludlam <jonathan.ludlam@eu.citrix.com>
Jonathan Knowles [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
[CA-46591] Prevents build_pre from overwriting xen_maxmem.
Signed-off-by: Jonathan Knowles <jonathan.knowles@eu.citrix.com>
Previously, during a VM.resume, both of the following functions would overwrite xen_maxmem:
Xen hg user [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
[whitespace] Conservatively corrects the whitespace for a small number of functions that invoke VM start, in preparation for further patches that will add parameters to VM start.
Signed-off-by: Jonathan Knowles <jonathan.knowles@eu.citrix.com>
Proof that this patch introduces no semantic changes:
Jonathan Knowles [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
[CA-47369] Enables shadow memory by default for PV domains, with a hard-wired multiplier of 1.
Signed-off-by: Jonathan Knowles <jonathan.knowles@eu.citrix.com> Acked-by: Jonathan Ludlam <jonathan.ludlam@eu.citrix.com>
This change enables successful migrations of PV domains away from hosts with no spare memory.
David Scott [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
SCTX-525: only write the names of interfaces ("current interfaces") into the inventory file and cause them to be ifup'ed on system boot if they actually have an IP address configuration in dom0. This avoids initialising bridges for bonds and VLANs which are only for guests and which can be initialised on demand.
Stats:
* 5 host pool
* 300 VLANs, none used
Pool reboot time drops from 45 mins to 8 mins
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
David Scott [Wed, 26 Jan 2011 17:39:04 +0000 (17:39 +0000)]
Move defaults for environment variables PRODUCT_VERSION, PRODUCT_BRAND, BUILD_NUMBER into the OMakefile, to make it easier to use omake directly for building.
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
Jonathan Davies [Thu, 28 Oct 2010 16:01:05 +0000 (17:01 +0100)]
CA-42914, SCTX-434: Speed up writes of the database to the redo-log
Anecdotal evidence suggests that the default 1 KiB block size provided by Unixext.read_in_chunks causes write throughput to be very slow on some storage substrates.
Some (not massively scientific) timings of dd from /dev/zero to a block-attached VDI on various SR types agree with this observation. The data below shows that 4 KiB is the minimum block size that should be considered for use: throughput is universally higher above this threshold and often substantially lower below it. We'll default to 16 KiB to be on the safe side.
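To illustrate why the block size matters, here is a small sketch that counts how many writes a payload needs at a given chunk size. iter_chunks and count_writes are hypothetical stand-ins; the real signature of Unixext.read_in_chunks is not shown in this log.

```ocaml
(* New default from the commit message: 16 KiB. *)
let default_block_size = 16 * 1024

(* Hypothetical stand-in for a chunked reader: calls [f] once per chunk
   of at most [block_size] bytes. *)
let iter_chunks ?(block_size = default_block_size) f data =
  let len = String.length data in
  let rec loop off =
    if off < len then begin
      let n = min block_size (len - off) in
      f (String.sub data off n);
      loop (off + n)
    end
  in
  loop 0

(* Number of chunk-sized writes needed for [data]. *)
let count_writes ?block_size data =
  let n = ref 0 in
  iter_chunks ?block_size (fun _ -> incr n) data;
  !n
```

With a 64 KiB payload, the old 1 KiB block size needs 64 writes where 16 KiB needs 4.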
Jonathan Davies [Thu, 28 Oct 2010 16:00:59 +0000 (17:00 +0100)]
CA-42914: Deal with unexpected closure of data socket caused by exception in block_device_io
Previously, the closing of the data socket caused xapi's code that writes the database to an fd to raise Sys_error("Connection reset by peer").
Instead, we can safely ignore the unexpected closing of the data socket and wait until we hear what happened over the control socket. Any exception that may be raised during transfer_data_from_sock_to_fd in block_device_io (that causes the data socket to be prematurely closed) gets caught in the exception handlers in action_writedb that call send_failure. So suppress all Sys_error("Connection reset by peer") exceptions that xapi may raise during the writing of the database to the fd because full details should be forthcoming on the control socket.
Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
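A minimal sketch of the suppression described above. The wrapper name ignoring_connection_reset is hypothetical, and matching on the exact message string is an assumption based on the text quoted in this commit message:

```ocaml
(* Run [f], swallowing only the specific Sys_error that signals the data
   socket was closed by the peer. Any other exception still propagates. *)
let ignoring_connection_reset (f : unit -> unit) : unit =
  try f () with
  | Sys_error msg when msg = "Connection reset by peer" -> ()
```

The guard keeps the handler narrow, so unrelated Sys_error values (for example a full disk) are not hidden.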
Jonathan Davies [Thu, 28 Oct 2010 16:00:51 +0000 (17:00 +0100)]
CA-42914, SCTX-434: Increase the timeout for writing the database to the redo-log
Previously, we had a flat timeout of 2 seconds for all redo-log operations. However, this has been shown to be too impatient for writing large databases over slow connections. (This resulted in timeouts firing prematurely, resulting in lots of METADATA_LUN_BROKEN alerts, and the redo-log being entirely useless for large pools as the database would never be successfully written!)
The timeout for each operation can now be specified independently. The new default for database writes is 30 seconds.
Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
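A sketch of per-operation timeouts, under assumptions: the operation names are hypothetical, and keeping the old 2-second value for everything except database writes is a guess, since only the 30-second figure is stated above.

```ocaml
(* Hypothetical set of redo-log operations. *)
type operation = Write_db | Write_delta | Read | Empty

(* Timeout in seconds for each operation. Only the 30 s database-write
   value comes from the commit message. *)
let timeout_for = function
  | Write_db -> 30.0
  | Write_delta | Read | Empty -> 2.0
```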
Jonathan Davies [Thu, 28 Oct 2010 16:00:42 +0000 (17:00 +0100)]
CA-42914: Catch other exceptions when reading commands from client in block_device_io
Previously, we only caught End_of_file which Unixext.really_read throws when the client sends EOF. Other exceptions dribbled through to the deeper exception handler, which was supposed to be reserved exclusively for problems opening the block device.
Now, we also catch other exceptions in the same place as the End_of_file exception is handled.
Signed-off-by: Jonathan Davies <jonathan.davies@citrix.com>
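The handler structure described above might look like the following sketch. The result type and read_command are hypothetical; the reader is passed in as a function so the example does not depend on Unixext.really_read.

```ocaml
type command_result =
  | Command of string
  | Client_closed         (* the client sent EOF *)
  | Read_failed of string (* any other exception while reading *)

(* Both End_of_file and all other exceptions are handled here, so nothing
   leaks through to the handler reserved for block-device problems. *)
let read_command (read : unit -> string) : command_result =
  try Command (read ()) with
  | End_of_file -> Client_closed
  | e -> Read_failed (Printexc.to_string e)
```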
Jon Ludlam [Thu, 28 Oct 2010 15:59:54 +0000 (16:59 +0100)]
Fix the allowed-operations check for VMs.
The function was long and had some non-obvious short-cut termination clauses in the long if...else if ... section. It has now been changed to use an option type containing the current error. Checks should be made in order of 'severity', i.e. power-state first, and e.g. PV driver status later.
Signed-off-by: Jon Ludlam <jonathan.ludlam@eu.citrix.com>
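A minimal sketch of the "current error in an option" pattern, assuming a simplified VM record. The types, the helper names and the choice of error strings are illustrative, not the actual xapi code:

```ocaml
type power_state = Running | Halted
type vm = { power_state : power_state; has_pv_drivers : bool }

(* Keep the first error found; run the next check only if there is none. *)
let check current f =
  match current with Some _ -> current | None -> f ()

(* Checks are listed in order of severity: power state first, PV drivers later. *)
let check_clean_shutdown vm =
  let current_error = None in
  let current_error =
    check current_error (fun () ->
        if vm.power_state <> Running then Some "VM_BAD_POWER_STATE" else None)
  in
  let current_error =
    check current_error (fun () ->
        if not vm.has_pv_drivers then Some "VM_MISSING_PV_DRIVERS" else None)
  in
  current_error
```

Because every check goes through the same helper, the order of the checks is the only thing that decides which error is reported.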
David Scott [Thu, 28 Oct 2010 15:51:04 +0000 (16:51 +0100)]
CA-46955: make code robust to parallel deletions of VBDs
In general we should be very careful in code like this not to expect the configuration of "other VMs" to remain static while we run. In this case a parallel thread deleted a VBD which it "owned" and this cross-talk killed this thread.
Signed-off-by: David Scott <dave.scott@eu.citrix.com>
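A sketch of the defensive style described above, assuming a lookup that raises Not_found when a record has vanished. The Hashtbl stands in for the database and devices_in_use is a hypothetical name; the real exception raised by the Db layer is not shown in this log.

```ocaml
(* Collect the device of every VBD that still exists. A VBD deleted by a
   parallel thread between listing and lookup is skipped, not fatal. *)
let devices_in_use (lookup : string -> string) (vbds : string list) =
  List.fold_right
    (fun vbd acc ->
      match (try Some (lookup vbd) with Not_found -> None) with
      | Some device -> device :: acc
      | None -> acc)
    vbds []

(* Stand-in database: vbd2 has already been deleted by another thread. *)
let db : (string, string) Hashtbl.t = Hashtbl.create 4
let () = Hashtbl.replace db "vbd1" "xvda"
let () = Hashtbl.replace db "vbd3" "xvdc"

let devices = devices_in_use (Hashtbl.find db) [ "vbd1"; "vbd2"; "vbd3" ]
```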