Ian Jackson [Wed, 6 May 2015 22:47:30 +0000 (23:47 +0100)]
cs-bisection-step: Abandon repro attempts after a bit
If we have had a number of attempts at a repro, and none of them have
produced a pass or fail, something is probably wrong and we should
give up rather than carrying on.
Handle this with the machinery we use for conflicting test results.
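The give-up rule described above can be sketched as follows. This is an illustration in Python, not the Perl of cs-bisection-step; the threshold MAX_INCONCLUSIVE and the result strings are assumptions, since the commit message does not state the actual limit.

```python
# Hypothetical threshold; the real limit used by cs-bisection-step
# is not stated in the commit message.
MAX_INCONCLUSIVE = 3


def should_abandon(results, limit=MAX_INCONCLUSIVE):
    """Return True if at least `limit` repro attempts have been made
    and none of them produced a definite pass or fail."""
    definite = [r for r in results if r in ("pass", "fail")]
    return len(results) >= limit and not definite
```

A caller would then route an abandoned revision into the same path used for conflicting results, rather than scheduling another attempt.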
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 6 May 2015 22:41:00 +0000 (23:41 +0100)]
cs-bisection-step: Report conflict if basis pass/fail are wrong
It can happen that the (for example) supposed basis pass (originally
only tested on another host) failed (when reproduced, or for some
other reason). When that happens do not attempt to get it to pass;
instead, treat it the same way we would if we had actually got
conflicting results at that revision.
(Conversely, do not attempt to get a basis fail if the basis fail has
already passed on the selected host. This is, as it happens,
impossible in a bisection triggered by sg-report-flight with the
current invocation arrangements - but cs-bisection-step should
handle it correctly.)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 1 May 2015 14:54:03 +0000 (15:54 +0100)]
sg-report-flight: Report stepno and testid of first worst fail
This makes reading the scoreboard considerably easier.
We abuse the local variable @worst slightly, pushing the extra info we
are going to print onto the end of it.
We also have to defer printing the cells, because we compute the cell
to duplicate in column order but we have to output them in row order.
For symmetry we accumulate both rows rather than only the second row.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 30 Apr 2015 15:23:56 +0000 (16:23 +0100)]
sg-report-job-history: Avoid full runvars table scan (!)
sg-report-job-history wants to know the potential names of runvars
relating to hosts. To do this it tries to find a list of distinct
runvar names which exist in the flights it's processing.
However, it fails to limit the runvar query appropriately, and as a
result postgresql must scan almost the complete runvars table to
produce an answer. This is very slow if the table is bigger than the
database server's RAM.
Fix this by limiting the runvars table query to relevant flights.
Specifically:
* Break the `100' from the LIMIT clause on the flights search
into a local variable $limit.
* Break the bulk of the flights search sql statement text into
a local variable $fromstuff.
* In the runvars statement, add a condition on flights which uses
LIMIT and OFFSET, based on results of the flights query.
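One way to picture the restricted query is the sketch below, written in Python against SQLite so that it is self-contained. The two-table schema, the `branch` filter and the `'%host'` name pattern are simplifying assumptions; the real script is Perl talking to postgresql and its exact SQL is not shown in this message.

```python
import sqlite3


def host_runvar_names(db, branch, limit=100, offset=0):
    """List distinct host-related runvar names, looking only at the
    flights that the flights search itself would return."""
    # Shared query text, playing the role of the commit's $fromstuff.
    fromstuff = "FROM flights WHERE branch = ? ORDER BY flight DESC"
    sql = f"""
        SELECT DISTINCT name FROM runvars
         WHERE name LIKE '%host'
           AND flight IN (SELECT flight {fromstuff} LIMIT ? OFFSET ?)
         ORDER BY name
    """
    return [row[0] for row in db.execute(sql, (branch, limit, offset))]
```

Because the subquery bounds the set of flights, the database only needs to visit runvars rows for those flights instead of scanning the whole table.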
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 30 Apr 2015 15:20:51 +0000 (16:20 +0100)]
cs-bisection-step: Use pbm tools, not graphicsmagick/imagemagick
Graphicsmagick / imagemagick have very poor performance with images
with large pixel sizes. The bisector can generate some very large
images.
In an example I have seen, a 21595x21048 png occupied only 2.6Mby of
disk space, yet an invocation of `convert' to resize it was using 3Gby
of RAM and lots of CPU. The pbm utilities, by contrast, can process this
with much less memory and a tiny fraction of the cpu time.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Wed, 29 Apr 2015 15:06:29 +0000 (16:06 +0100)]
ts-kernel-build: Enable CONFIG_SCSI_SAS_ATA
(Some) SAS storage controller drivers do not recognise attached SATA
disks when this option is not set. It is inexplicably not set by
default in Linux 3.14.36 (at least).
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Tue, 21 Apr 2015 16:39:24 +0000 (17:39 +0100)]
target_cmd_build: Delete build-ok-stamp before starting
Many of the callers of target_cmd_build use a build-ok-stamp idiom to
detect failed builds. This idiom does not work if the stamp file
exists already, so delete it.
In the future we may move more of the test build-ok-stamp, echo ok,
into TestSupport, but this will do for now.
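The stamp idiom and the reason a stale stamp defeats it can be sketched as below. This is a Python illustration under assumptions: the helper name run_build and the exact shell wrapper are hypothetical, and the real callers are Perl scripts running commands on a target host.

```python
import os
import subprocess


def run_build(builddir, build_cmd):
    """Run build_cmd in builddir; report success via a stamp file."""
    stamp = os.path.join(builddir, "build-ok-stamp")
    # Delete any stale stamp first, so that a stamp left behind by an
    # earlier successful run cannot mask a failure of this run.
    if os.path.exists(stamp):
        os.remove(stamp)
    # The stamp is created only if the build command itself succeeds.
    subprocess.run(
        ["sh", "-c", f"({build_cmd}) && touch build-ok-stamp"],
        cwd=builddir,
    )
    return os.path.exists(stamp)
```

Without the deletion step, a failing build in a directory that already contains the stamp would be reported as a success.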
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Tue, 21 Apr 2015 15:28:37 +0000 (16:28 +0100)]
ts-kernel-build: Enable x86 IOMMU options
This has a variety of beneficial implications:
* The kernel becomes more like the kind of distro kernels that Xen
users are probably using.
* We are more likely to discover any bugs in Linux where Linux
running under Xen (eg as dom0) fights with Xen for control of io
mediation resources or otherwise mishandles the situation.
* A pleasant side effect is that in a kernel which does not yet have
"config: Enable NEED_DMA_MAP_STATE when SWIOTLB is selected"
(a bugfix), enabling INTEL_IOMMU has the side effect of enabling
NEED_DMA_MAP_STATE and thus working around the bug.
The list of options to enable was derived by eyeballing
drivers/iommu/Kconfig from 3.14.34.
I will leave the question of whether to enable any ARM IOMMU options
for the Xen ARM folks to consider.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: David Vrabel <david.vrabel@citrix.com>
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Tue, 24 Mar 2015 14:23:58 +0000 (14:23 +0000)]
Arrange for core dumps to be placed in /var/core and collect them
Refactor the $kvp_replace helper in ts-xen-install into a generic
helper (which requires using ::EO and ::EI for namespacing) for use
with target_editfile and use it to edit /etc/sysctl.conf to set
kernel.core_pattern on boot.
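A key/value replacement helper of the kind described might look like the sketch below. It is written in Python for illustration only; the function name mirrors $kvp_replace, but the signature, the comment handling and the example core pattern are assumptions, not the Perl helper's actual behaviour.

```python
def kvp_replace(lines, key, value):
    """Return a copy of `lines` in which `key = value` appears exactly
    once: an existing setting is replaced in place, duplicates are
    dropped, and the setting is appended if it was absent."""
    out, done = [], False
    for line in lines:
        active = line.split("#", 1)[0]          # ignore comment text
        name = active.split("=", 1)[0].strip()
        if "=" in active and name == key:
            if not done:
                out.append(f"{key} = {value}")
                done = True
            continue
        out.append(line)
    if not done:
        out.append(f"{key} = {value}")
    return out
```

For /etc/sysctl.conf this would be called with the key kernel.core_pattern and a pattern under /var/core (for example the hypothetical value /var/core/%e.%p.core).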
Tested in standalone mode by installing and running a C program
containing "*(int *)0 = 1;" which, after running "ulimit -c unlimited"
produces the expected core file. ts-logs-capture when run in
standalone mode then picks them up.
I've not yet figured out how to make the desired rlimit take effect
for all processes (including e.g. daemons spawned on boot). Likely
this will involve some combination of pam_limits.so PAM module and
adding explicit ulimit calls to the initscripts which we care about
(primarily xencommons and libvirt initscripts).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Wed, 1 Apr 2015 13:24:00 +0000 (14:24 +0100)]
cambridge: Stop publishing logs to chiark
http://osstest.cam.xci-test.com/~osstest/testlogs already exists and
points to the live logs directory, so switch PubBaseUrl to that in the
Cambridge config such that email reports etc contain it. This won't be
externally accessible but I think that won't matter now that the
master production instance is elsewhere.
Arrange that cr-publish-flight-logs doesn't publish the corresponding
thing if either LogsPublish or ResultsPublish is not set, and unset
them in the Cambridge config.
Likewise arrange that cr-ensure-disk-space doesn't do anything if the
configuration variable passed as an option is not set, and unset
Publish (the base for {Logs,Results}Publish) in the Cambridge config.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Check the config variable and not its name.
v3: Adjust for control VM move to xs.citrite.net
Ian Campbell [Wed, 1 Apr 2015 13:12:51 +0000 (14:12 +0100)]
cambridge: Do not try to push harness to XenProject instance output
By arranging for cr-publish-flight-logs to ignore --push-harness if
either of HarnessPublishGitRepoDir or HarnessPublishGitUserHost is
not specified.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2:
- Avoid logm which isn't available here, wasn't saying much of use
anyway.
- Syntax fix (is not a function, so exit not return)
Perhaps we should have our own tree for such things, but for now just
nobble it.
Ian Campbell [Wed, 1 Apr 2015 12:50:50 +0000 (13:50 +0100)]
Handle osstest's own local push gate in non-master production instances
We want to arrange that the master XenProject instance continues to
test its own pretest branch while any downstream instances will pick up
changes from the master instance's production (i.e. tested) branch,
which is published at git://xenbits.xen.org/osstest.git#master. We
want to also be able to use local pretest for local changes (which may
or may not get merged back upstream).
Add a new configuration option OsstestUpstream which by default is
"git://xenbits.xen.org/osstest.git master" and which is cleared to
nothing on the master instance via production-config.
If the option is not set then the existing behaviour is unchanged.
If the option is set then osstest branch flights will still prefer to
test the local pretest branch, but if nothing is pending there then it
will proceed by merging the upstream branch into the local production
branch and testing the result.
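The decision described in the last three paragraphs can be summarised by the sketch below. It is a Python illustration, not the shell logic of ap-fetch-version; the function name, the returned action strings and the boolean pending_locally are hypothetical, and how "pending" is detected is deliberately left to the caller.

```python
def choose_action(osstest_upstream, pending_locally):
    """Decide what an osstest branch flight should test.

    osstest_upstream: the OsstestUpstream setting ("" or None if unset).
    pending_locally:  True if the local pretest branch has changes that
                      have not yet reached local production.
    """
    if not osstest_upstream:
        # Option unset (e.g. on the master instance): existing behaviour.
        return "test-local-pretest"
    if pending_locally:
        # Local changes take priority over merging from upstream.
        return "test-local-pretest"
    # Nothing pending locally: merge upstream into local production
    # and test the result.
    return "merge-upstream-and-test"
```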
This merge must be done:
- in a clone not in the main testing.git in order to avoid inserting
merge conflict markers into the active set of scripts.
- in a non-bare repo because git merge requires it.
$repos/osstest is a bare repo which we want to keep that way because
using repo_tree_rev_fetch_git to fetch the remote branch is
convenient.
So we use $repos/osstest-merge as a temporary merge repo and reclone
from the active local repo each time.
All of this happens in ap-fetch-version.
As part of this arrange that the result is always left in the ap-fetch
branch of the for-osstest.git repo (even for existing cases) and the
sha1 is produced as output. Resetting to that revision is handled by
cr-daily-branch.
If the merge fails then manual intervention (i.e. a manual merge and
push to the _local_ pretest) will be required. Likewise if local
pretest and local production have diverged manual intervention will be
required.
In ap-push we stop pushing to xenbits#master except for the master
instance if an upstream is defined. At some point it might be useful
to add a configuration option for where to push to but I don't have
that requirement right now.
ap-fetch-version-old requires no changes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
v4:
- Use git update-ref properly, i.e. with the full ref name, otherwise
it creates random .git/ap-fetch refs
v3:
- Only merge from upstream if there is nothing pending locally.
- Always update ap-fetch.
v2:
- Arrange for $OSSTEST_USE_HEAD=y to take precedence
- drop LOCALREV (which was wrong anyway) in favour of inline
branchname
- Rename OSSTEST_REVISION_MERGE as revision_merge to avoid implying
it can be set and will be honoured.
- Git in Debian Squeeze lacks -C and --no-edit, adjust accordingly.
Ian Campbell [Fri, 1 May 2015 10:20:45 +0000 (11:20 +0100)]
Osstest/Debian.pm: Use Fqdn hostprop when collecting host keys
Otherwise hosts which are not in the same DnsDomain are not processed,
resulting in log messages such as:
2015-05-01 10:06:19 Z skipping host key for nonexistent host marilith-n4.xs.citrite.net
2015-05-01 10:06:20 Z skipping host key for nonexistent host lace-bug.xs.citrite.net
The practical impact of this appears to be that the pair migration
tests can fail with:
Ian Campbell [Wed, 29 Apr 2015 10:14:58 +0000 (11:14 +0100)]
cambridge: Switch configuration to use osstest.xs.citrite.net
The VM has moved to different infrastructure and its new name is
osstest.xs.citrite.net.
Update ExecutiveDbnamePat. The DB is still in the XC infrastructure so
using DnsDomain (the default) no longer works.
Set {Owner,Queue}DaemonHost to refer to the new VM host and not the
default ControlDaemonHost value of control-daemons.osstest.cam.xci-test.com
(which will be removed later).
We set both variables rather than just ControlDaemonHost in case we
ever want to move one but not the other.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Wed, 1 Apr 2015 10:54:47 +0000 (11:54 +0100)]
allow instance specific settings
cri-args-hostlists and invoke-daemon now check for
$HOME/.xen-osstest/settings which can contain things like "export
OSSTEST_CONFIG=production-config-cambridge" to tailor things for a
particular instance of osstest running in production mode.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- add invoke-daemon too and reword commit message accordingly ]
FreeBSD: Cleanups relating to guest images and ts-freebsd-install script
Remove some unused variables from ts-freebsd-install script. Also make the
third parameter of target_put_guest_image optional and fix both callers of
this function.
New 10.1 images are larger than the previous 10.0 images, so change
the size of the LVM volume to accommodate them, in preparation.
Increase the size to 24000 in case of future increases upstream.
Ian Campbell [Tue, 31 Mar 2015 15:06:46 +0000 (16:06 +0100)]
tcl: Handle environment variables which are unset.
This allows wrappers such as the standalone wrapper to do
OSSTEST_SIMULATE=$foo ./sg-run-job
and not worry if $foo is unset.
Do likewise for OSSTEST_TCL_JOBDB_DEBUG.
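The intended tolerance can be illustrated as follows. This is a Python sketch, not the Tcl code; the helper name env_setting and the default of "0" are assumptions. It treats a variable that is missing and one that is set to the empty string (which is what `OSSTEST_SIMULATE=$foo` yields when $foo is unset) the same way.

```python
import os


def env_setting(name, default="0", environ=os.environ):
    """Return the value of an environment variable, falling back to
    `default` when the variable is absent or set to an empty string."""
    value = environ.get(name, "")
    return value if value != "" else default
```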
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 13 Mar 2015 15:19:43 +0000 (15:19 +0000)]
HostnameSortSwapWords: Make name order mangling configurable
We still default to having the mangling enabled. Arguably this is
wrong, but I am minimising the number of things that will be wrong for
the existing Cambridge instance.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Mon, 9 Mar 2015 16:22:58 +0000 (12:22 -0400)]
cr-ensure-disk-space: Permit argument to specify local directory.
If the argument is Logs rather than LogsPublish (ie, refers to a local
directory (without `:') rather than a remote one (with `:')), do things
locally (by invoking sh -ec so that we have identical quoting rules to
ssh).
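The local/remote dispatch can be sketched as below. This is a Python illustration only; the function name and return shape are hypothetical, and the real script is not Python. The point is that in both cases a single script string is handed to a shell, so the quoting rules are the same.

```python
def disk_space_command(target, script):
    """Build the argv used to run `script` for a configured directory.

    target is either 'host:/path' (remote) or '/path' (local).
    Returns (argv, path)."""
    if ":" in target:
        host, path = target.split(":", 1)
        # Remote: ssh passes the script string to the remote shell.
        return ["ssh", host, script], path
    # Local: sh -ec passes the same script string to a local shell.
    return ["sh", "-ec", script], target
```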
Effectively no functional change with the current configuration.
We still always do a local deletion. This is anomalous and will
disappear shortly.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Sun, 8 Mar 2015 12:20:34 +0000 (12:20 +0000)]
daemons: Allow QueueDaemon and OwnerDaemon to run on different hosts
We want the OwnerDaemon to run on the same host as the database (for
fate-sharing reasons). OTOH the QueueDaemon is less critical if it
fails, and it generates reports etc., and wants to be more frequently
updated, so it should run on the osstest VM.
Permit this by:
* Providing OwnerDaemonHost and QueueDaemonHost config settings
which default to the value of ControlDaemonHost.
* Using those everywhere.
* In the daemons' Tcl code, have main-daemon take the string `Owner'
or `Queue' so that it can look up both the host and port.
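The defaulting rule in the list above can be sketched like this. It is a Python illustration over a plain dictionary, not the osstest config machinery; the function names and the `<which>DaemonPort` key names are assumptions made for the example.

```python
def daemon_host(config, which):
    """Return the host for the `Owner' or `Queue' daemon, falling back
    to ControlDaemonHost when no specific host is configured."""
    if which not in ("Owner", "Queue"):
        raise ValueError(f"unknown daemon: {which}")
    return config.get(f"{which}DaemonHost") or config["ControlDaemonHost"]


def daemon_endpoint(config, which):
    """Look up both host and port from the single string `which`."""
    # The <which>DaemonPort key name is an assumption for this sketch.
    return daemon_host(config, which), config[f"{which}DaemonPort"]
```

Passing the string `Owner' or `Queue' to one lookup function is what lets the daemons be moved independently by setting only one of the two host options.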
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 6 Mar 2015 18:39:44 +0000 (18:39 +0000)]
pxe setup: New TftpPxeTemplatesReal feature
Rather than having mg-hosts mkpxedir hardcode the strange thing done
in the XenClient test lab in the Citrix Cambridge office, provide a
somewhat more general and correct approach:
* Generalise host_pxefile to support [Tftp]PxeTemplatesReal as well
as [Tftp]PxeTemplates.
* Default [Tftp]PxeTemplatesReal to ''
mg-hosts mkpxedir now uses these templates, as follows:
* Create the host's PxeTemplates-based pxe file's parent
directories and make the parent directory be owned by PxeGroup.
* If the PxeTemplatesReal is specified and different, make a symlink
named according to PxeTemplatesReal pointing at the previous file.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Thu, 5 Mar 2015 19:04:43 +0000 (14:04 -0500)]
PDU: pdu-msw: Support APC v6 firmware
APC PDUs with firmware 6.x have a different OID space for turning
ports on and off, to the one for querying. The old namespace still
works to turn the port on and off but returns a genErr error response!
Support a new command-line option --apc6 to use this other OID.
Ian Jackson [Thu, 5 Mar 2015 19:02:09 +0000 (14:02 -0500)]
PDU: pdu-msw: Split $read_oid and $write_oid
Some PDUs have a different OID space for turning ports on and off, to
the one for querying. To make this easier to handle, split the
variable $oid into $read_oid and $write_oid.
Also move $baseoid settings earlier so that we can modify them with
command-line arguments.
No functional change in this patch.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Mon, 2 Feb 2015 19:57:13 +0000 (19:57 +0000)]
mfi-common, make-flight: create XSM test jobs
Duplicate Debian PV and HVM test jobs for XSM testing.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v8:
1. Make libvirtbuildjob = ${bfi}build-$dom0arch-libvirt
Changes in v6:
1. Skip generating xsm job for different platforms.
2. Use "xsms".
3. Reformat some long lines.
Wei Liu [Mon, 2 Feb 2015 19:53:26 +0000 (19:53 +0000)]
make-flight: factor out do_pv_debian_tests
Pure code motion. No effect on job generation.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Fri, 12 Sep 2014 15:29:00 +0000 (16:29 +0100)]
Debian.pm: load flask policy in uboot
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v15:
1. Use new flask option "flask=enforcing".
Changes in v10:
1. Correctly get $flaskpolicy.
Changes in v9:
1. Add "xen,multiboot-module".
Changes in v8:
1. Append flask_enforcing=1 and flask_enabled=1.
Wei Liu [Mon, 23 Feb 2015 12:03:16 +0000 (12:03 +0000)]
ts-xen-build: only move hypervisor to xeninstall
... so that we can leave xenpolicy-* in tools tarball.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v13:
1. Use find rune to get list of files to move.
Wei Liu [Mon, 8 Sep 2014 15:06:52 +0000 (16:06 +0100)]
ts-xen-build: build with XSM support if requested
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
Changes in v14:
1. Use target_cmd_output instead of target_cmd_output_root.
Changes in v5:
1. Only set XSM_ENABLE when runvar is defined.
2. Fix inconsistent whitespace.
Wei Liu [Sun, 12 Oct 2014 16:04:34 +0000 (17:04 +0100)]
overlay: update overlay/etc/grub.d/20_linux_xen
This file was originally created to work around Debian bug #633127
("/etc/grub/20_linux does not recognise some old Xen kernels").
According to the Debian bug tracker [0], bug #633127 is fixed in Wheezy. As
we're now using Wheezy in OSSTest we can safely remove the old overlay
file if there are no further bugs discovered.
However we have another bug #690538 ("grub-common: Please make submenu
creation optional or at least allow users to disable it easily") that
would break OSSTest. We're now using Wheezy in production. There's no
way to disable submenu in Wheezy. And submenu breaks OSSTest's grub menu
parser.
So update this overlay file to the one in Wheezy's grub-common
1.99-27+deb7u2 and take care of Debian bug #690538 by removing the lines
to generate submenu.
Also work around GRUB bug #43420 ("20_linux_xen doesn't support Xen XSM
policy file") by applying a small patch proposed in [2].
Add a note to reference #633127 and #690538 above grub2 setup function.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
Changes in v15:
1. Use new flask option "flask=enforcing".