xenbits.xensource.com Git - people/liuw/osstest.git/log
9 years agoDO NOT APPLY point to my own trees wip.stubdom-split-build-volatile
Wei Liu [Thu, 7 Apr 2016 19:35:24 +0000 (20:35 +0100)]
DO NOT APPLY point to my own trees

9 years agosg-run-job: add build-stubdom recipe
Wei Liu [Wed, 16 Mar 2016 15:47:21 +0000 (15:47 +0000)]
sg-run-job: add build-stubdom recipe

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agomake-flight: pass split_stubdom runvar to script
Wei Liu [Wed, 16 Mar 2016 15:46:33 +0000 (15:46 +0000)]
make-flight: pass split_stubdom runvar to script

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agomfi-common: expose stubdombuildjob runvar to test jobs
Wei Liu [Mon, 11 Apr 2016 23:08:57 +0000 (00:08 +0100)]
mfi-common: expose stubdombuildjob runvar to test jobs

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agomfi-common: create stubdom build job
Wei Liu [Wed, 16 Mar 2016 15:34:42 +0000 (15:34 +0000)]
mfi-common: create stubdom build job

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agots-xen-install: extract stubdom if necessary
Wei Liu [Wed, 16 Mar 2016 14:45:02 +0000 (14:45 +0000)]
ts-xen-install: extract stubdom if necessary

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agoAdd ts-stubdom-build
Wei Liu [Tue, 15 Mar 2016 16:24:48 +0000 (16:24 +0000)]
Add ts-stubdom-build

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agoap-common: add mini-os tree
Wei Liu [Wed, 16 Mar 2016 12:06:41 +0000 (12:06 +0000)]
ap-common: add mini-os tree

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agoap-common: add stubdom variables
Wei Liu [Mon, 14 Mar 2016 14:26:50 +0000 (14:26 +0000)]
ap-common: add stubdom variables

XXX need to use the actual tree url and name, currently using
stubdom.git.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agoREADME: mention OSSTEST_JOBS_ONLY
Wei Liu [Thu, 7 Apr 2016 20:02:08 +0000 (21:02 +0100)]
README: mention OSSTEST_JOBS_ONLY

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
9 years agoIncrease priority of xen-unstable-coverity master
Ian Campbell [Wed, 17 Feb 2016 16:13:11 +0000 (16:13 +0000)]
Increase priority of xen-unstable-coverity

Since we are limited on the number of these we can do per week (to 2)
we would like these to happen fairly promptly after the time given in
the crontab, otherwise we can potentially end up with the Wednesday
run not actually happening until late Saturday, right before the
Sunday run which might happen right away.

Therefore specify OSSTEST_RESOURCE_PRIORITY=-15, which is right behind
xen-unstable-smoke in priority order.

We don't have much data yet but based on what we have so far
ts-coverity-build takes up to 1000s (around a quarter of an hour) and
ts-coverity-upload a little over half an hour. So including host
install (if needed, it can use a share of an existing build host if
one is around) the whole thing comes in at well under an hour, so
having this slip to the head of the queue is unlikely to cause
problems.

Also put mg-allocate and mg-blockage in the correct order in the doc.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-show-flight-runvars: avoid "SELECT .. AND TRUE" for sqlite
Ian Campbell [Wed, 17 Feb 2016 10:50:01 +0000 (10:50 +0000)]
mg-show-flight-runvars: avoid "SELECT .. AND TRUE" for sqlite

c5e29f93fb6e "mg-show-flight-runvars: recurse on buildjobs upon
request" broke standalone mode with:
    Error: no such column: TRUE
from sqlite. Do as is done for $syntcond and use (1=1) instead.
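
The (1=1) idiom can be illustrated with a minimal sketch, written here
in Python with the standard sqlite3 module rather than the Perl of the
actual tool. The table layout and the helper name build_query are
assumptions made for illustration only.

```python
import sqlite3

def build_query(extra_cond=None):
    # (1=1) is a neutral condition that every SQL dialect accepts.
    # The sqlite of the time had no TRUE keyword and parsed it as a
    # column name, hence "no such column: TRUE".
    cond = extra_cond if extra_cond else "(1=1)"
    return "SELECT job, name, val FROM runvars WHERE flight = ? AND " + cond

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runvars (flight INTEGER, job TEXT, name TEXT, val TEXT)")
conn.execute("INSERT INTO runvars VALUES (1, 'build-amd64', 'arch', 'amd64')")
rows = conn.execute(build_query(), (1,)).fetchall()
```

Keeping a placeholder condition means the query text has the same
shape whether or not an extra condition is supplied.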

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agocrontab: Add a coverity run on a Wednesday
Ian Campbell [Tue, 16 Feb 2016 14:56:20 +0000 (14:56 +0000)]
crontab: Add a coverity run on a Wednesday

In addition to the current Sunday run.

Projects of Xen's size are currently allowed 2 builds per week (max 1
per day), so make use of both.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomake-flight: Use older Debian for host and guest OS with older Xen
Ian Campbell [Mon, 18 Jan 2016 14:28:57 +0000 (14:28 +0000)]
make-flight: Use older Debian for host and guest OS with older Xen

Sometimes when updating osstest to use a newer version of Debian as a
baseline we find that the new compiler or other tools pick up latent
errors in older code bases for which the fixes are invasive or
otherwise inappropriate for a stable branch.

This is the case with Debian Jessie and Xen 4.3 and earlier, so
restrict those branches to keep using Wheezy.

This only applies to xen-X.Y-testing branches and
qemu-upstream-X.Y-testing branches since other branches all use
xen-unstable as their Xen.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomfi-common: usual_debianhvm_image: derive version from $guestsuite
Ian Campbell [Mon, 18 Jan 2016 14:28:56 +0000 (14:28 +0000)]
mfi-common: usual_debianhvm_image: derive version from $guestsuite

This more likely matches the caller's intention.

Move the setting into production-config* alongside the Suite and
TftpDiVersion settings. Continue to support $DEBIAN_IMAGE_VERSION as an
override. The value for Wheezy is from what was replaced
in 610ea1628363 "Switch to Debian 8.0 (jessie) as OS for test hosts".

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoQualify TftpDiVersion with the suite.
Ian Campbell [Mon, 18 Jan 2016 14:28:55 +0000 (14:28 +0000)]
Qualify TftpDiVersion with the suite.

This allows the version to differ e.g. between Wheezy and Jessie.

Update production-config* to set TftpDiVersion_jessie instead of just
TftpDiVersion, also add TftpDiVersion_wheezy using the version
replaced in commit f610ea162836 "Switch to Debian 8.0 (jessie) as OS
for test hosts".

In mfi-common we need to check for TftpDiVersion_$suite (_$guestsuite)
and TftpDiVersion manually since getconfig in that context will not
see any DebianSuite override in the environment.

This ensures that when a non-default suite is configured a
corresponding useful version of DI is selected.
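
A minimal sketch of the manual lookup described above, in Python for
illustration (the real code lives in the mfi-common shell library).
The function name di_version and the version strings are hypothetical;
only the key names TftpDiVersion_$suite and TftpDiVersion come from
the text.

```python
def di_version(config, suite):
    # Prefer the suite-qualified setting, then the unqualified one.
    for key in (f"TftpDiVersion_{suite}", "TftpDiVersion"):
        if key in config:
            return config[key]
    return None

# Hypothetical config values, for illustration only.
config = {
    "TftpDiVersion_jessie": "version-for-jessie",
    "TftpDiVersion_wheezy": "version-for-wheezy",
}
chosen = di_version(config, "wheezy")
```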

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomfi-common: Set di_version for build & test host install
Ian Campbell [Mon, 18 Jan 2016 14:28:54 +0000 (14:28 +0000)]
mfi-common: Set di_version for build & test host install

This means that bisections will use the same version, even if
production-config changed in the meantime.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomake-flight: Set di_version runvar on d-i based test jobs.
Ian Campbell [Mon, 18 Jan 2016 14:28:53 +0000 (14:28 +0000)]
make-flight: Set di_version runvar on d-i based test jobs.

Note that make-distros-flight does not want this, since it uses d-i
fetched from the web not the version in our config.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agots-debian-di-install: Allow Di Version to come from runvars
Ian Campbell [Mon, 18 Jan 2016 14:28:52 +0000 (14:28 +0000)]
ts-debian-di-install: Allow Di Version to come from runvars

and following the lead of the suite arrange for a version selected
from the defaults to be written back to the runvars.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- missing s/diversion/di_version/ in ts-debian-di-install,
         drop unnecessary \ wrapping from $di_path assignment ]

9 years agots-host-install: Support DiVersion coming from runvars
Ian Campbell [Mon, 18 Jan 2016 14:28:51 +0000 (14:28 +0000)]
ts-host-install: Support DiVersion coming from runvars

To do so initialise $ho->{DiVersion} in selecthost and use it in
ts-host-install.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc missing s/diversion/di_version in selecthost ]

9 years agomfi-common: Always add debian_suite to debian_runvars
Ian Campbell [Mon, 18 Jan 2016 14:28:50 +0000 (14:28 +0000)]
mfi-common: Always add debian_suite to debian_runvars

This adds an explicit debian_suite to some jobs which didn't already
have one, meaning that those jobs will remain the same when cloned for
a bisect and run in a tree where $c{DebianGuestSuite} has changed
since the original construction.

No expected semantic change.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomfi-common: always add host suite to hostos_runvars
Ian Campbell [Mon, 18 Jan 2016 14:28:49 +0000 (14:28 +0000)]
mfi-common: always add host suite to hostos_runvars

This avoids situations where production-config* has changed
DebianSuite but the bisector is still picking up baselines etc from
before the change and reusing their runvars (without suite) with an
inconsistent config.

Switch selecthost() to use target_var when querying the suite. This
means it will check the "{ident}_suite" runvar first as before but
fallback to just looking at the "all_host_suite" runvar. We also
change the existing host_suite to all_host_suite in mfi-common so
that test_matrix_iterate() needn't worry about ident=host vs
=src_host/dst_host etc (of course this can still be overridden if
desired by using src_host_suite etc, but nowhere does).

Other uses of $c{DebianSuite} have been abolished already.

Note that "$suite != $defsuite" is not true for any current production
invocation of osstest. If this was ever true then we would have set
the host_suite runvar, whereas now we always set all_host_suite.
However any old flights with host_suite would still be interpreted
the same. Note also that the "$suite != $defsuite" case was previously
broken for the -pair tests since the host idents there are 'src_host'
and 'dst_host', so the previous code would have fallen back to
$c{DebianSuite} without looking at the host_suite runvar.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomfi-common: Rename $suite_runvars as $hostos_runvars
Ian Campbell [Mon, 18 Jan 2016 14:28:48 +0000 (14:28 +0000)]
mfi-common: Rename $suite_runvars as $hostos_runvars

Later in the series more runvars to control the host install will be
added.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agotarget_var: Support fallback to all_(guest|host)_$vn
Ian Campbell [Mon, 18 Jan 2016 14:28:47 +0000 (14:28 +0000)]
target_var: Support fallback to all_(guest|host)_$vn

Having to set {ident}_foo for all idents used in a job (e.g host vs
src_host+dst_host) in make-flight would be a little fiddly.

Instead follow the lead of all_hostflags and consult all_host_$vn.

I have no immediate use for all_guest_$vn, but support it for
consistency.
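
The fallback order can be sketched as follows, in Python for
illustration (target_var itself is Perl). The dict of runvars and the
"kind" parameter are assumptions; how the real code decides between
host and guest is not shown in this message.

```python
def target_var(runvars, ident, vn, kind="host"):
    # 1. The ident-specific runvar, e.g. src_host_suite.
    specific = f"{ident}_{vn}"
    if specific in runvars:
        return runvars[specific]
    # 2. The shared fallback, e.g. all_host_suite (or all_guest_...).
    return runvars.get(f"all_{kind}_{vn}")

# Hypothetical runvars for a -pair job.
runvars = {"all_host_suite": "jessie", "dst_host_suite": "wheezy"}
src_suite = target_var(runvars, "src_host", "suite")
dst_suite = target_var(runvars, "dst_host", "suite")
```

With this order a single all_host_suite setting covers host, src_host
and dst_host alike, while a per-ident runvar still wins when present.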

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoDebian: Abolish $suite and $xopts{Suite} from preseed_* interfaces.
Ian Campbell [Mon, 18 Jan 2016 14:28:46 +0000 (14:28 +0000)]
Debian: Abolish $suite and $xopts{Suite} from preseed_* interfaces.

Generating a preseed for a suite which does not match the ->{Suite} of
the underlying guest or host object does not seem useful, so remove
this option and use ->{Suite} instead.

For guests ->{Suite} is set by debian_guest_suite() (which is called
from preseed_guest_create(), although it is often also called prior to
that) and by selectguest()

For hosts $ho->{Suite} is initialised by selecthost if we are in the
context of a $job (and if we aren't we had best not be trying to
reinstall a host).

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoAdd a weekly coverity flight
Ian Campbell [Fri, 5 Feb 2016 09:30:39 +0000 (09:30 +0000)]
Add a weekly coverity flight

This primarily consists of ts-coverity-{build,upload} and
make-coverity-flight which constructs the sole job.

The branch is named "xen-unstable-coverity" which matches various xen*
in the cr-* scripts. Places which needed special treatment are
handled by matching xen-*-coverity, which leaves the possibility of
xen-4.7-testing-coverity etc in the future, but note that care would
be needed since coverity's tracking of new vs existing issues would
likely be confused by uploading different branches without
differentiating somehow (I don't know how this is supposed to work).

The most recently scanned revision is pushed to a new
coverity-scanned/master branch in the usual xen.git, tests are run on
the master branch.

I initially thought that $c{CoverityEmail} would need to be an actual
account registered with scan, however a manual experiment using
email=security@xen.org was accepted by the service. An "analysis
complete" message was sent to security@ while individual results mails
were sent to each member of the coverity project who was configured to
receive them. I think this is what we want. The "analysis complete"
mail contained no sensitive data, but also no real information other
than "success" (or presumably "failure" if that were to be the case).
I think going to security@ is probably OK.

The upload URL defaults to a dummy local URL, which will fail (it
would be possible in principle to put a stunt CGI there though). When
run with "cr-daily-branch --real" (i.e. in full on production mode)
then this is set instead to the value of CoverityUploadUrl from the
config (production-config etc). This means that adhoc and play runs
still exercise all the code (but the curl will fail) while --real runs
upload to a site-configurable location. (Note that the URL includes
the coverity project name, which would likely differ for different
instances).

I have run this via cr-daily-branch --real on the production infra
and it did upload as expected (flight 80516). Since
master==coverity-tested/master at this point it came out as a baseline
test which didn't attempt ap-push, which I would have expected to fail
anyway since it was running as my user in the colo which cannot push
to osstest@xenbits.

In my experiments the curl command took ~35 minutes to complete (rate
in the 100-200k range). Not sure if this is a problem, but use curl
--max-time passing it an hour to bound things. Note that curl is run
on the controller (via system_checked).  timeout etc.

Note that the token must be supplied with </path/to/token and not
@/path/to/token. The latter appears to the server as a file upload
rather than a text field in a form which doesn't work. In early
attempts I thought that the trailing \n in /path/to/token might be an
issue and hence wrote a big comment. However having discovered < vs @
I am no longer 100% sure that is the case, but I left the comment
anyway since I can observe on the wire that the \n is included in the
upload (but each test takes ~35 mins and there is a ratelimit on the
server side too).
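
The shape of the curl invocation described above can be sketched as
follows, in Python for illustration. --max-time and the --form "<" and
"@" prefixes are standard curl features; the form field names other
than "description", the helper name and the example URL are
assumptions, not taken from ts-coverity-upload.

```python
def coverity_upload_argv(url, token_path, email, tarball, description,
                         max_time_s=3600):
    return [
        "curl",
        "--max-time", str(max_time_s),       # bound the upload to an hour
        "--form", f"token=<{token_path}",    # '<': file content sent as a text field
        "--form", f"email={email}",
        "--form", f"file=@{tarball}",        # '@': attached as a file upload
        "--form", f"description={description}",
        url,
    ]

argv = coverity_upload_argv(
    "http://localhost/dummy-upload",         # hypothetical dummy URL
    "/path/to/token", "security@xen.org", "cov-int.tar.gz",
    "80516: example description")
```

Passing the arguments as a list avoids any shell quoting of the
description text.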

A final niggle is that the description field in the web ui ends up as:
    80516:\ git://xenbits.xen.org/xen.git\ 9937763265d9597e5f2439249b16d995842cdf0
(i.e. spaces are \ escaped). I've confirmed with curl --trace-ascii
that the uploaded data is not escaped (this is from an earlier attempt
which did not include the flight number):

009a: Content-Disposition: form-data; name="description"
00ce:
00d0: git://xenbits.xen.org/xen.git 9937763265d9597e5f2439249b16d99584
0110: 2cdf0f

Due to the limitations on the numbers of uploads I've not experimented
with possible fixes yet (e.g. URL escaping the upload). Worst case we
either live with it or adjust the syntax to avoid the problematic
characters.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
9 years agoMove collectversions from ts-xen-build into Osstest::BuildSupport
Ian Campbell [Fri, 5 Feb 2016 09:30:38 +0000 (09:30 +0000)]
Move collectversions from ts-xen-build into Osstest::BuildSupport

I'm going to have a need for it elsewhere.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-show-flight-runvars: recurse on buildjobs upon request
Ian Campbell [Wed, 3 Feb 2016 12:37:33 +0000 (12:37 +0000)]
mg-show-flight-runvars: recurse on buildjobs upon request

By looping over @rows looking for buildjobs runvars and adding those
jobs to the output until nothing changes.

The output is re-sorted by runvar name, which is the desired default
behaviour. As usual it can be piped to sort(1) to sort by flight+job.
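
The "until nothing changes" loop can be sketched as a fixed-point
iteration, in Python for illustration (the tool itself is Perl). The
data layout, the function name and the convention that a buildjob
runvar holds either "job" or "flight.job" are assumptions of this
sketch.

```python
def collect_recursive(flight, all_runvars):
    """all_runvars maps (flight, job) -> {runvar name: value}."""
    jobs = {key for key in all_runvars if key[0] == flight}
    changed = True
    while changed:                       # repeat until no new job is found
        changed = False
        for key in list(jobs):
            for name, val in all_runvars.get(key, {}).items():
                if not name.endswith("buildjob"):
                    continue
                if "." in val:           # assumed form "flight.job"
                    f, j = val.split(".", 1)
                    ref = (int(f), j)
                else:                    # bare job name: same flight
                    ref = (key[0], val)
                if ref not in jobs:
                    jobs.add(ref)
                    changed = True
    rows = [(name, f, j, val)
            for (f, j) in jobs
            for name, val in all_runvars.get((f, j), {}).items()]
    return sorted(rows)                  # sorted by runvar name first

all_runvars = {
    (2, "test-amd64"): {"xenbuildjob": "1.build-amd64"},
    (1, "build-amd64"): {"arch": "amd64"},
    (1, "unrelated"): {"arch": "i386"},
}
rows = collect_recursive(2, all_runvars)
```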

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-show-flight-runvars: include $flight. prefix on job name if -r (recurse)
Ian Campbell [Mon, 1 Feb 2016 14:28:31 +0000 (14:28 +0000)]
mg-show-flight-runvars: include $flight. prefix on job name if -r (recurse)

Adds a new -r (==recurse) option which for now only adds "$flight." to
the job name, i.e. nothing is recursive yet.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-show-flight-runvars: calculate @colsw from @rows not via SQL
Ian Campbell [Mon, 1 Feb 2016 14:28:30 +0000 (14:28 +0000)]
mg-show-flight-runvars: calculate @colsw from @rows not via SQL

This will work even once @rows is not all collected by the same SQL
statement.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-show-flight-runvars: move collection into a sub
Ian Campbell [Mon, 1 Feb 2016 14:28:29 +0000 (14:28 +0000)]
mg-show-flight-runvars: move collection into a sub

This will make it easier to collect more rows.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-show-flight-runvars: collect rows into @rows, output in second step
Ian Campbell [Mon, 1 Feb 2016 14:28:28 +0000 (14:28 +0000)]
mg-show-flight-runvars: collect rows into @rows, output in second step

This will make it easier to collect more rows.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoproduction-config*: Update TftpDiVersion for Debian 8.3 point release
Ian Campbell [Sun, 24 Jan 2016 10:18:24 +0000 (10:18 +0000)]
production-config*: Update TftpDiVersion for Debian 8.3 point release

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomake-flight: Support specifying a mini-os tree+revision
Ian Campbell [Tue, 19 Jan 2016 12:48:08 +0000 (12:48 +0000)]
make-flight: Support specifying a mini-os tree+revision

This is useful for standalone or adhoc use as well as (presumably)
bisection.

There is no ap-* or cr-daily-* integration here because I didn't need
it (i.e. I'm not intending to create a new mini-os branch here).

In order to cope with Xen <= 4.5 where extras/mini-os exists but is
part of xen.git and not something cloned from elsewhere add a
$optional argument (itself optional) to dir_identify_vcs which if true
causes dir_identify_vcs to return 'none' instead of failing.

Previously dir_identify_vcs failed with:
    bash: line 5: fail: command not found
because the fail command is undefined. Instead echo fail and use that
to trigger the $optional handling.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agocs-adjust-flight: Add jobs-rename command which applies a perlop to job names
Ian Campbell [Tue, 19 Jan 2016 12:47:49 +0000 (12:47 +0000)]
cs-adjust-flight: Add jobs-rename command which applies a perlop to job names

My intention was to allow creation of adhoc jobs based on a template
but modified e.g. to enable/disable XSM with a sequence something
like:

./cs-adjust-flight $flight copy-jobs $template test-foo-xsm
./cs-adjust-flight $flight jobs-rename test-foo-xsm 's/-xsm$//'
./cs-adjust-flight $flight runvar-set $job enable_xsm false
./cs-adjust-flight ... update %buildjob

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agostop allowing libvirt failures
Ian Campbell [Mon, 18 Jan 2016 15:54:15 +0000 (15:54 +0000)]
stop allowing libvirt failures

In Feb/Mar 2015 (not long after adding the libvirt tests) we appear to
have added test-@@-libvirt@@ to the set of allowed failures in
response to some issues with libvirtd crashing.

However looking at the history of test-@@-libvirt@@ on all branches
both in the COLO and in Cambridge (which was the production instance
back then) I don't see any evidence that this issue is still ongoing
(which matches my recollection of it having been fixed).

Therefore remove the entries allowing libvirt failures.

This effectively reverts:

00023a5af6ff allow files: Allow all libvirt test failures on other branches
83b8c8eafb18 allow.all: Do not regard libvirt guest start failures as regressions

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agosg-report-job-history: alternate color of osstest column only when it changes
Ian Campbell [Wed, 6 Jan 2016 11:08:43 +0000 (11:08 +0000)]
sg-report-job-history: alternate color of osstest column only when it changes

Currently the bgcolor of the osstest column alternates on each line,
rather than only when it changes as the other revision columns do.

A given flight might touch multiple osstest revisions (although in
practice they rarely do) but it seems reasonable to simply consider
any change as a change.
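
The intended behaviour can be sketched as follows, in Python for
illustration (the report generator is Perl); the function name is
hypothetical.

```python
def alternating_flags(values):
    """Return one boolean per row; it flips only when the value changes."""
    flags = []
    flag = False
    prev = object()          # sentinel that never equals a real value
    for v in values:
        if v != prev:
            flag = not flag
            prev = v
        flags.append(flag)
    return flags

flags = alternating_flags(["rev1", "rev1", "rev2", "rev2", "rev1"])
```

Rows sharing a revision then share a background colour, and a change
of revision is visible as a change of colour.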

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoDebian: erase-other-disks: rescan partition tables after erasing whole disk
Ian Campbell [Wed, 20 Jan 2016 15:06:21 +0000 (15:06 +0000)]
Debian: erase-other-disks: rescan partition tables after erasing whole disk

This appears to happen anyway, but force it to be sure.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoDebian: erase-other-disks: erase partitions first
Ian Campbell [Wed, 20 Jan 2016 15:06:20 +0000 (15:06 +0000)]
Debian: erase-other-disks: erase partitions first

It seems that when sdX is zeroed there is some chance that sdX[0-9]
will disappear before we get to them.

When partman comes along and recreates the partitions it is likely
that they will occupy the same disk space as before (since d-i's
autopartition is deterministic), meaning that LVM will find the old
PV headers again.

This is in particular problematic on multi disk systems where we end
up with an LV spanning sda5 and sdb. sdb is successfully erased here
but sda5 is not, however LVM will still find the LV with missing PV,
which is sufficient to trigger partman-lvm's checks for erasing
devices which weren't explicitly listed, resulting in:

    !! ERROR: Unable to automatically remove LVM data

    Because the volume group(s) on the selected device also consist of physical
    volumes on other devices, it is not considered safe to remove its LVM data
    automatically. If you wish to use this device for partitioning, please remove
    its LVM data first.

which cannot be preseeded around.

If the autopartitioning is not deterministic (as might be the case
when installing a different version of Debian to last time) then
going from layout A -> B -> A' risks B (by chance) not destroying the
headers created by A, meaning that A' will find them and suffer again
from the problem above. This is handled via the use of
ts-host-install-twice which will cause A' to run twice, i.e. A -> B
-> (A' -> A''). In this case A' will fail as above, but A'' will
startup seeing the partition layout put in place by A' (which matches
A) and erase those partitions, leading to success later on.

Also erase partitions for all sd/hd? not just sda+hda.
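
The ordering constraint (partitions before the whole disks that
contain them) can be sketched as below, in Python for illustration.
This only computes an order and erases nothing; the function name and
the device-name pattern are assumptions, and the real logic is a shell
script run inside d-i.

```python
import re

def erase_order(devices):
    def is_whole_disk(dev):
        # sdX / hdX with no trailing partition number
        return re.fullmatch(r"[sh]d[a-z]", dev) is not None
    # False sorts before True, so partitions come first.
    return sorted(devices, key=lambda d: (is_whole_disk(d), d))

order = erase_order(["sda", "sdb", "sda5", "sda1"])
```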

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoDebian: erase-other-disks: add a log() helper
Ian Campbell [Wed, 20 Jan 2016 15:06:19 +0000 (15:06 +0000)]
Debian: erase-other-disks: add a log() helper

Writing it out each time is too verbose.

At the same time log the set of devices present before and after each
batch of erasing, with a udev settle before the second to ensure any
changes to /dev have happened.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agots-debian-install: increase time allowed for xen-create-image
Ian Campbell [Fri, 15 Jan 2016 13:35:30 +0000 (13:35 +0000)]
ts-debian-install: increase time allowed for xen-create-image

This step is consistently timing out when run on cubietruck-*. Judging
from the logs it appears to be completing during the 30s slack added
by tcmdex (i.e. after the timeout message the rest of the output
appears in the test step log).

Looking at the results on arndale-* (which looks to pass reasonably
reliably) I see that the regular test-armhf-armhf-xl job takes around
550s to do the xen-create-image while test-armhf-armhf-xl-rtds
typically takes around 1100s (twice as long).

On cubietruck-braque test-armhf-armhf-xl uses 900s. One could
therefore extrapolate that test-armhf-armhf-xl-rtds might need more
than 1800s and not be too surprised that it appears to need something
a bit more than 2000s in practice. 2500s seems like sufficient
headroom.

For comparison with arm, on x86 godello takes around 210s in the
normal case and 680s with RTDS (>3x slower) while nocera takes 265s
and 640s (2.4x). (Those are from nearby but not identical flights in
order to match up the host).
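
The extrapolation above is simple proportional scaling, reproduced
here as a small Python sketch; the function name is hypothetical and
the numbers are the ones quoted in this message.

```python
def extrapolate(base_on_target, slow_on_ref, base_on_ref):
    # Scale the target's normal time by the slowdown seen on the
    # reference host.
    return base_on_target * slow_on_ref / base_on_ref

# cubietruck-braque normal run, scaled by the arndale RTDS slowdown
estimate = extrapolate(900, 1100, 550)
new_timeout = 2500
```

The estimate comes out at 1800s, so the observed need for a bit more
than 2000s is plausible and 2500s leaves some margin.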

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Dario Faggioli <dario.faggioli@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoAllow longer timeout when creating backing file for a raw disk.
Ian Campbell [Fri, 15 Jan 2016 12:23:58 +0000 (12:23 +0000)]
Allow longer timeout when creating backing file for a raw disk.

I noticed this dd timing out when recommissioning the 3 cubietrucks
(picasso, metzinger, gleizes) but looking at the log shows this has
been happening on braque too.

The current code assumes 65MB/s arriving at a timeout of 153s for the
10G file. On arndale-* the logs indicate that it is achieving 95MB/s
and taking 105-107s which results in a warning but not a failure:

   execution took 105 seconds [**>153.846153846154/2**]

In experiments on a local cubietruck I observed it achieving a much
lower throughput of 40MB/s, which seems to be consistent with what
others are seeing:
https://groups.google.com/forum/#!category-topic/cubieboard/troubleshooting/7R4HlCDNCTU

Therefore calculate the timeout assuming a throughput of 20MB/s, in
practice for a 10GB file this will result in a 500s timeout.
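
The arithmetic can be checked with a short Python sketch. The function
names are hypothetical, 10GB is taken as 10000MB (which is what the
153.846s figure implies), and the half-timeout warning rule is
inferred from the log line quoted above rather than confirmed.

```python
def dd_timeout(size_mb, rate_mb_s=20):
    """Seconds allowed for writing size_mb megabytes at an assumed rate."""
    return size_mb / rate_mb_s

def is_slow(elapsed_s, timeout_s):
    # Assumption: a warning is logged once more than half of the
    # timeout has been used.
    return elapsed_s > timeout_s / 2

old_timeout = dd_timeout(10000, 65)   # the previous 65MB/s assumption
new_timeout = dd_timeout(10000)       # the new 20MB/s assumption
```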

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agots-xen-build: support XSM/FLASK via Kconfig
Doug Goldstein [Wed, 6 Jan 2016 19:19:54 +0000 (13:19 -0600)]
ts-xen-build: support XSM/FLASK via Kconfig

In anticipation of XSM and FLASK migrating to Kconfig add support for
building them via Kconfig or the existing mechanism.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agostandalone-generate-dump-flight-runvars: include cri-getconfig
Ian Campbell [Fri, 18 Dec 2015 12:02:27 +0000 (12:02 +0000)]
standalone-generate-dump-flight-runvars: include cri-getconfig

Commit fb373a2096dc "cri-getconfig: Break out exec_resetting_sigint."
refactored this functionality, and asserted that cri-getconfig is the
one library which everything includes.

standalone-generate-dump-flight-runvars appears to have been the
exception to that rule.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-allocate: In planner mode, pre-check the arguments
Ian Jackson [Thu, 17 Dec 2015 13:29:40 +0000 (13:29 +0000)]
mg-allocate: In planner mode, pre-check the arguments

Now, attempting to allocate a nonexistent host fails immediately with
a sensible message, rather than queueing up and then reporting the
message only later:

mariner:testing.git> OSSTEST_CONFIG=/u/iwj/.xen-osstest/config:local-config.test-database_iwj ./mg-allocate -U 1h spong
2015-12-17 17:05:14 Z pre-checking resources (dry run)...
2015-12-17 17:05:14 Z (precheck) task 196916 static iwj@mariner: iwj@mariner manual
*** no candidates for spong! ***
mariner:testing.git>

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomg-allocate: Better error handling when no candidates
Ian Jackson [Thu, 17 Dec 2015 13:17:27 +0000 (13:17 +0000)]
mg-allocate: Better error handling when no candidates

Spot when our db search revealed no candidates for the resources to
allocate, and:
 - when doing an immediate allocation, call it an error
 - when doing a planned allocation, cause it to prevent allocation
   on this iteration, and print a suitably unreassuring message

Previously it would simply say `nothing available'.

Implement this as follows:
 - Report lack of candidates as $ok=-1 from alloc_1rescand
 - In alloc_1res, return this -1 as with any non-zero $ok
 - Handle the new $ok at all the call sites, in particular
 - In plan(), rename `allok' to `worstok' and have it be
   the worst relevant $ok value.  If $ok gives -1, return
   undef, rather than a booking list, to the allocator core.
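
The aggregation in plan() can be sketched as follows, in Python for
illustration (mg-allocate is Perl). The encoding 1 = allocatable,
0 = nothing available yet, -1 = no candidates is partly an assumption:
only the -1 value is stated above. The data layout is hypothetical.

```python
def plan(results):
    """results: list of (ok, booking) pairs, one per requested resource."""
    worstok = 1
    bookings = []
    for ok, booking in results:
        worstok = min(worstok, ok)      # keep the worst outcome seen
        bookings.append(booking)
    if worstok == -1:
        return None                     # no candidates: no booking list
    return bookings

no_candidates = plan([(1, "host/a"), (-1, None)])
feasible = plan([(1, "host/a"), (0, "host/b")])
```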

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoExecutive DB retry: Avoid an undefined warning
Ian Jackson [Thu, 17 Dec 2015 13:27:30 +0000 (13:27 +0000)]
Executive DB retry: Avoid an undefined warning

If something other than the DB statements inside need_retry throws an
exception, ->err will normally be undef (because
$dbh_tests->begin_work will clear it, if nothing else).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agodb_retry: Suppress an "exiting via last" warning
Ian Jackson [Thu, 17 Dec 2015 13:16:08 +0000 (13:16 +0000)]
db_retry: Suppress an "exiting via last" warning

This warning appears when db_retry_abort is used, since 2b069b6c
"Database locking: Perl: Retry all deadlocks in PostgreSQL".

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoms-planner: Improve an error message
Ian Jackson [Thu, 17 Dec 2015 12:10:34 +0000 (12:10 +0000)]
ms-planner: Improve an error message

I experienced this `die' due to mg-schema-test-database failing to
borrow shared hosts properly, and added this Dumper for debugging.

I have not bothered to improve any of the other quite terse `die's in
ms-planner.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomg-schema-test-database: Borrow shares properly
Ian Jackson [Thu, 17 Dec 2015 13:16:37 +0000 (13:16 +0000)]
mg-schema-test-database: Borrow shares properly

Previously, the test database would be generated in a broken state:
resources share-host/foo/{1,2,...} exist but the resource host/foo/0
is allocated to magic/xdbref rather than to magic/shared.  This causes
various resource allocation machinery to crash.  (Even if the host is
entirely un-borrowed.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Expand commit message.

9 years agomg-schema-test-database: Wipe previous local plan data
Ian Jackson [Thu, 17 Dec 2015 12:10:07 +0000 (12:10 +0000)]
mg-schema-test-database: Wipe previous local plan data

Whatever is in the user's cwd is unlikely to correspond to anything
real.  In principle it might be possible to obtain an official copy
from the real daemons, and massage it, or something, but that's a lot
of work.

Instead, just remove it when we start the test db daemons.

In principle it would be more correct to remove it when we set up the
test db, because it is at that point that we create the new view of
the world.  Removing the old plan data when we start daemons means
that if the user, who is testing, restarts the daemons, the
newly-created queue daemon does not have information about allocations
made with the previous daemon, and instead regards those allocations
as rogue.

However, removing the file only when the daemons are started means
that if the user has saved a data-plan.pl in their cwd for some other
reason we don't remove it unless the user is actually going to run the
daemons.  So I think this is preferable.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomg-schema-test-database: Provide some timeouts which are better for testing
Ian Jackson [Thu, 17 Dec 2015 12:09:44 +0000 (12:09 +0000)]
mg-schema-test-database: Provide some timeouts which are better for testing

The default timeouts mean that after starting a test db queue daemon
and a test db allocation attempt, we have to wait two minutes.

Lower timeouts increase the risk that we might lose noncritical races
and allocate resources to the `wrong' tasks.  And they reduce the
duration of an outage which will cause a planned allocation attempt to
fail.

I think we don't care about those problems for test instances.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoREADME.dev: Document the blessings
Ian Jackson [Thu, 17 Dec 2015 16:51:17 +0000 (16:51 +0000)]
README.dev: Document the blessings

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Improvements from review

9 years agomfi-common: Only test migrupgrade from 4.5 onwards
Ian Campbell [Mon, 16 Nov 2015 10:24:50 +0000 (10:24 +0000)]
mfi-common: Only test migrupgrade from 4.5 onwards

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoretain the configuration that Xen was built with
Doug Goldstein [Tue, 22 Dec 2015 15:44:44 +0000 (09:44 -0600)]
retain the configuration that Xen was built with

This should retain the .config file from the Kconfig process so that we
know how this build of Xen was configured.

Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoms-* html generation: Provide right title for projection
Ian Jackson [Mon, 4 Jan 2016 16:17:15 +0000 (16:17 +0000)]
ms-* html generation: Provide right title for projection

When ms-queuedaemon generates a resource-projection.html, it sometimes
does so from data-plan.pl (see proc report-plan).  This means that
ms-planner does not get a reliable indication of whether it is being
run for the plan or the projection, and the resource-projection.html
sometimes claims to be the plan.

Fix with a new ms-planner option -W which tells it what to put in the
title, defaulting to the value passed to -w.

DEPLOYMENT NOTE:

The new ms-planner works with the old queuedaemon, so when upgrading,
it is OK to simply update the daemons-testing.git and then restart the
ms-queuedaemon.

If it is necessary to downgrade, rewinding to the old commit with a
running ms-queuedaemon will cause errors from the old ms-planner being
passed -W -- but these errors are trapped and ignored.  So in this
case reports will be out of date until ms-queuedaemon is also
restarted.

In either case nothing will go badly wrong.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoms-* html generation: Provide <title>
Ian Jackson [Tue, 22 Dec 2015 13:08:47 +0000 (13:08 +0000)]
ms-* html generation: Provide <title>

This means that these browser windows will actually get titles!

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotcl daemons: Fix reentrancy hazard in chan-read
Ian Jackson [Tue, 15 Dec 2015 18:26:15 +0000 (18:26 +0000)]
tcl daemons: Fix reentrancy hazard in chan-read

If the callback called by chan-read sets up a different read handler,
and the data for that other read handler arrives before chan-read
returns, chan-read would go round its loop again and eat and process
the new data.  This is wrong.

Instead, return from chan-read after processing one result from
`gets'.  If there is more to do, with this handler, the fileevent will
arrange for us to be reentered.

This is most easily done by changing the `while' into an `if', and all
of the `continue's into `return's.  (There are no `break's.)
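
The hazard and the fix can be illustrated with a small simulation.  This
is a sketch in Python rather than Tcl; Channel, on_readable_loop,
on_readable_once and run are hypothetical names and are not taken from
the osstest daemons.

```python
from collections import deque


class Channel:
    """Toy stand-in for a line-oriented channel with a swappable read handler."""

    def __init__(self, lines):
        self.pending = deque(lines)
        self.handler = None


def on_readable_loop(chan, callback):
    # Old behaviour: keep looping while lines are available, always
    # calling the callback that was captured when this handler started.
    while chan.pending:
        callback(chan, chan.pending.popleft())


def on_readable_once(chan, callback):
    # New behaviour: process at most one line, then return to the event loop.
    if chan.pending:
        callback(chan, chan.pending.popleft())


def run(read_step):
    """Feed two lines through read_step and record which callback saw each."""
    log = []

    def second(chan, line):
        log.append(("second", line))

    def first(chan, line):
        log.append(("first", line))
        # The callback installs a different read handler.
        chan.handler = lambda c: read_step(c, second)

    chan = Channel(["hello", "world"])
    chan.handler = lambda c: read_step(c, first)
    # Minimal event loop: dispatch to the current handler while readable.
    while chan.pending:
        chan.handler(chan)
    return log
```

With on_readable_loop the first callback also consumes "world", although
it had already handed the channel over; with on_readable_once the second
handler receives it.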

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
9 years agoSwitch to Debian 8.0 (jessie) as OS for test hosts
Ian Campbell [Mon, 16 Nov 2015 10:31:29 +0000 (10:31 +0000)]
Switch to Debian 8.0 (jessie) as OS for test hosts

mg-debian-installer-update-all has been run on the production instance
and TftpDiVersion is also updated to match.

The resulting binaries have also been copied to the Cambridge
instance, so update Cambridge config too.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoDatabase locking: Tcl: Limit number of retries
Ian Jackson [Tue, 15 Dec 2015 16:17:59 +0000 (16:17 +0000)]
Database locking: Tcl: Limit number of retries

If there is something fundamentally wrong, don't just sit looping
around every 500ms.
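
A bounded retry loop of this general shape might look as follows.  This
is a Python sketch, not the Tcl code; the function name with_retries and
the default of 10 attempts are assumptions, and only the 500ms delay
comes from the text above.

```python
import time


def with_retries(body, max_tries=10, delay=0.5, sleep=time.sleep):
    """Call body() until it succeeds, giving up after max_tries attempts."""
    for attempt in range(1, max_tries + 1):
        try:
            return body()
        except Exception:
            if attempt == max_tries:
                # Something is fundamentally wrong: stop looping and
                # let the caller see the last error.
                raise
            sleep(delay)
```

The sleep parameter exists only so that a test can replace the real
delay with a no-op.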

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoDatabase locking: Tcl: Cover LOCK TABLEs with catch
Ian Jackson [Tue, 15 Dec 2015 16:08:44 +0000 (16:08 +0000)]
Database locking: Tcl: Cover LOCK TABLEs with catch

Previously we would retry only the body, but not LOCK TABLEs.

We got away with it before because of the heavyweight locking of even
long-running read-only transactions, but now the LOCK TABLEs can fail
(at least in a mixed-version system, and perhaps even in a system with
only new code).

Additionally, if one of the LOCK TABLEs fails, the code's use of the
db handle becomes stuck because of the failed transaction: the error
is caught by the daemon's main loop error handler, but the db handle
is not subjected to ROLLBACK and all future attempts to use it will
fail.

So: move the LOCK TABLEs (and the SET TRANSACTION) into the catch, so
that deadlocks in LOCK TABLEs are retried (after ROLLBACK).

The COMMIT remains outside the catch but this should be unaffected by
DB deadlocks if the LOCK TABLEs are right.

Note that this code does not attempt to distinguish DB deadlock errors
from other errors.  Arguably this is quite wrong.  Fixing it to
distinguish deadlocks is awkward because pg_execute does not leave the
error information anywhere it can be found.  Contrary to what the
documentation seems to imply, it does not set errorCode (!)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoDatabase locking: Perl: Increase retry count
Ian Jackson [Tue, 15 Dec 2015 15:36:51 +0000 (15:36 +0000)]
Database locking: Perl: Increase retry count

It seems to me that this deadlock might actually become fairly common
in some setups.  There is little harm in trying it for 100s rather
than 20s, and there may be some benefit.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoDatabase locking: Perl: Retry all deadlocks in PostgreSQL
Ian Jackson [Tue, 15 Dec 2015 15:14:34 +0000 (15:14 +0000)]
Database locking: Perl: Retry all deadlocks in PostgreSQL

Previously we would retry all COMMITs but nothing else.  This is
correct for SQLite3 but not for PostgreSQL.

We got away with it before because of the heavyweight locking of even
long-running read-only transactions, but now the LOCK TABLEs can fail
(at least in a mixed-version system, and perhaps even in a system with
only new code).

So: cover all of the database work in db_retry with the eval, and
explicitly ask the JobDB adaptation layer (via a new need_retry
method) whether to go around again.  We tell the JobDB layer whether
the problem was during commit, so that we can avoid making any overall
semantic change to the interaction with SQLite3.

In the PostgreSQL case, the db handle can be asked whether there was
an error and what the error code was.  Deadlock has its own error
code.

(One side effect here is that db_retry_retry, which sets
$db_retry_stop='retry', is now no longer affected by the retry count
in db_retry.  But there are no callers and that may be more right
anyway.  db_retry_abort always exits the loop, as before.)
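
The division of labour between the retry loop and the need_retry hook
can be sketched like this.  It is written in Python rather than Perl,
and the DBError class, the function signatures and the max_tries default
are hypothetical.  40P01 is PostgreSQL's SQLSTATE for deadlock_detected;
how the real need_retry methods decide is not shown in this message, so
the two policies below are assumptions based on the description above.

```python
DEADLOCK_SQLSTATE = "40P01"  # PostgreSQL's deadlock_detected error code


class DBError(Exception):
    """Hypothetical database error carrying an SQLSTATE code."""

    def __init__(self, sqlstate):
        super().__init__(sqlstate)
        self.sqlstate = sqlstate


def need_retry_postgres(err, committing):
    # Retry deadlocks wherever they happen; anything else is a real failure.
    return isinstance(err, DBError) and err.sqlstate == DEADLOCK_SQLSTATE


def need_retry_sqlite(err, committing):
    # Keep the old semantics: only a failed COMMIT is retried.
    return committing


def db_retry(begin, body, commit, rollback, need_retry, max_tries=5):
    """Run begin/body/commit as one unit, retrying when need_retry says so."""
    for _ in range(max_tries):
        committing = False
        try:
            begin()
            result = body()
            committing = True
            commit()
            return result
        except Exception as err:
            rollback()
            if not need_retry(err, committing):
                raise
    raise RuntimeError("too many database retries")
```

All of the database work, including begin(), sits inside the try, so a
deadlock while taking locks is retried in the same way as one in the
body.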

I have tested this with the following rune:

 OSSTEST_CONFIG=/u/iwj/.xen-osstest/config:local-config.test-database_iwj perl -w -MData::Dumper -e 'use strict; use Osstest::Executive; use Osstest; csreadconfig(); print Dumper($dbh_tests->{AutoCommit}); eval { $dbh_tests->do("BOGUS"); }; db_begin_work($dbh_tests, [qw(flights resources)])'

adding a sleep(2) to the loop Osstest::JobDB::Executive::begin_work,
and running a second copy of the rune with the tables to lock in the
other order.

Acked-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v2: Mention db_retry_retry in commit message.

9 years agoSchema: When creating, check that no updates are applied
Ian Jackson [Thu, 10 Dec 2015 15:31:37 +0000 (15:31 +0000)]
Schema: When creating, check that no updates are applied

If you try to run mg-schema-create on an existing instance it bombs
out right at the beginning because it tries to create the `flights'
table, which already exists.

But in the future the `flights' table might be removed in an update,
which would remove this safety catch.  Then running the create might
partially succeed, leaving debris in a production instance.

Detect this situation by looking for applied schema updates, and
bombing out if there are any.
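
The check amounts to something like the following sketch.  It uses
Python with an in-memory SQLite database purely so that it is
self-contained; the real tool talks to PostgreSQL, and the table name
schema_updates, its column updatename and both function names are
hypothetical.

```python
import sqlite3

UPDATES_TABLE = "schema_updates"  # hypothetical name for the bookkeeping table


def applied_updates(conn):
    """Return names of applied updates; an absent table means none."""
    exists = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (UPDATES_TABLE,),
    ).fetchone()
    if not exists:
        return []
    rows = conn.execute("SELECT updatename FROM " + UPDATES_TABLE)
    return [row[0] for row in rows]


def create_schema(conn, initial_sql):
    """Create the initial schema, refusing if any update was ever applied."""
    updates = applied_updates(conn)
    if updates:
        raise RuntimeError(
            "refusing to create schema: updates already applied: %r" % updates
        )
    conn.executescript(initial_sql)
```

The point of checking the update records, rather than relying on a
particular table already existing, is that the check keeps working even
if a later update drops that table.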

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v4: Add comment.

9 years agoSchema: drop old resource_log table
Ian Jackson [Thu, 10 Dec 2015 13:39:04 +0000 (13:39 +0000)]
Schema: drop old resource_log table

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoSchema: Check that schema creation and update runs as the right user
Ian Jackson [Thu, 10 Dec 2015 13:50:00 +0000 (13:50 +0000)]
Schema: Check that schema creation and update runs as the right user

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoSchema: Support database schema updates
Ian Jackson [Thu, 10 Dec 2015 13:26:00 +0000 (13:26 +0000)]
Schema: Support database schema updates

See schema/README.schema, introduced in this patch, for the design.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v4: Add comment about test db safety catch.

v3: Fix spurious message from ./mg-schema-updates apply.
    Fix grammar error in README.updates.

v2: Slightly increase schema update name length format.
    Docs fixes:
    Change erroneous `three' to `four'.
    Change `state' to `status' throughout.
    Explain scope of <status>.
    Sort out (and renumber) `Update order for Populate-then-rely'.
    Sort out "Statuses" explanations.
    Encourage use of DML update, rather than ad-hoc scripts,
     for populating new columns.

9 years agoSchema: Introduce mg-schema-create
Ian Jackson [Thu, 10 Dec 2015 12:29:32 +0000 (12:29 +0000)]
Schema: Introduce mg-schema-create

There is a fair amount of option parsing clobber here that will be
relevant shortly.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoSchema: Remove SET OWNER and GRANT/REVOKE from schema/initial.sql
Ian Jackson [Thu, 10 Dec 2015 12:13:58 +0000 (12:13 +0000)]
Schema: Remove SET OWNER and GRANT/REVOKE from schema/initial.sql

Really, we don't want the initial schema setup to mess about with
permissions.  Instead, we simply expect to run the creation as the
correct role user.

So:
 - Remove the code in mg-schema-test-database to remove the
   permission settings from initial.sql;
 - Instead, run exactly that code on initial.sql and commit the
   result.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoSchema: Rename schema file
Ian Jackson [Mon, 7 Dec 2015 18:25:14 +0000 (18:25 +0000)]
Schema: Rename schema file

We are going to have multiple schema snippets and this is going to be
just the initial baseline.

Rename the file and change references to it.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomg-schema-test-database: Fix argument parsing for _SUFFIX
Ian Jackson [Wed, 9 Dec 2015 12:04:03 +0000 (12:04 +0000)]
mg-schema-test-database: Fix argument parsing for _SUFFIX

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agoExecutive DB: Reduce strength of DB locks
Ian Jackson [Fri, 11 Dec 2015 16:13:00 +0000 (16:13 +0000)]
Executive DB: Reduce strength of DB locks

The purpose of these locks is partly to prevent transactions being
aborted (which I'm not sure the existing code would in practice cope
with, although this is a bug) and also to avoid bugs due to the fact
that
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
does not mean that the transactions are necessarily serialisable!
  http://www.postgresql.org/docs/8.3/static/transaction-iso.html

In SQL in general it is possible for read-only transactions to
conflict with writing transactions.

However, in PostgreSQL this is not a problem because Postgres uses
multi-version concurrency control: it retains the old version of the
data while the read transaction is open:
  http://www.postgresql.org/docs/8.3/static/transaction-iso.html

So a read transaction cannot cause a write transaction to abort, nor
vice versa.  So there is no need for the explicit database table
locks to prevent concurrent read access.

Preventing concurrent read access means that simple and urgent updates
can be unnecessarily delayed by long-running reader transactions in
the history reporters and archaeologists.

So, reduce the lock mode from ACCESS EXCLUSIVE to EXCLUSIVE.  This still
conflicts with all kinds of updates and prospective updates, but no
longer with SELECT:
  http://www.postgresql.org/docs/8.3/static/explicit-locking.html
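
In PostgreSQL's table of lock modes, the mode that conflicts with every
kind of update but not with a plain SELECT is EXCLUSIVE.  The relevant
part of the conflict table from the explicit-locking documentation can
be written out as a small Python sketch; the dictionary layout and the
function name blocks are illustrative only.

```python
# All table-level lock modes, weakest first.
ALL_MODES = [
    "ACCESS SHARE", "ROW SHARE", "ROW EXCLUSIVE", "SHARE UPDATE EXCLUSIVE",
    "SHARE", "SHARE ROW EXCLUSIVE", "EXCLUSIVE", "ACCESS EXCLUSIVE",
]

# Modes that each of the two explicit locks discussed here conflicts with.
CONFLICTS = {
    "EXCLUSIVE": set(ALL_MODES) - {"ACCESS SHARE"},
    "ACCESS EXCLUSIVE": set(ALL_MODES),
}

# Lock mode taken implicitly by some common statements.
STATEMENT_MODE = {
    "SELECT": "ACCESS SHARE",
    "SELECT FOR UPDATE": "ROW SHARE",
    "INSERT": "ROW EXCLUSIVE",
    "UPDATE": "ROW EXCLUSIVE",
    "DELETE": "ROW EXCLUSIVE",
}


def blocks(held_mode, statement):
    """Would a table lock held in held_mode make this statement wait?"""
    return STATEMENT_MODE[statement] in CONFLICTS[held_mode]
```

So a holder of the weaker lock still excludes INSERT, UPDATE, DELETE
and SELECT FOR UPDATE, while ordinary readers proceed.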

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix grammar and typo in commit message.

9 years agoExecutive DB: Eliminate SQL locking for read-only transactions
Ian Jackson [Fri, 11 Dec 2015 16:04:11 +0000 (16:04 +0000)]
Executive DB: Eliminate SQL locking for read-only transactions

Our transactions generally run with
  SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
(which, incidentally, does not mean that the transactions are
necessarily serialisable!)

In SQL in general it is possible for a read-only transaction to fail
and need to be retried because some writer has updated things.

However, in PostgreSQL this is not possible because Postgres uses
multi-version concurrency control: it retains the old version of the
data while the read transaction is open:
  http://www.postgresql.org/docs/8.3/static/transaction-iso.html

(And, of course, SQLite uses MVCC too, and all transactions in SQLite
are fully serialisable.)

So it is not necessary for these read-only operations to take out
locks.  When they do so they can unnecessarily block other important
work for long periods of time.

With this change, we go further from the ability to support databases
other than PostgreSQL and SQLite.  However, such support was very
distant anyway because of differences in SQL syntax and semantics, our
reliance in Executive mode on Postgres's command line utilities, and
so on.

We retain the db_retry framing because (a) although the retry loop is
not necessary in these cases, the transaction framing is (b) it will
make it slightly easier to reverse this decision in the future if we
ever decide to do so (c) it is less code churn.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix minor error in commit message

9 years agoExecutive: Permit OSSTEST_TASK=<refkey> (for static tasks)
Ian Jackson [Tue, 8 Dec 2015 14:11:47 +0000 (14:11 +0000)]
Executive: Permit OSSTEST_TASK=<refkey> (for static tasks)

If OSSTEST_TASK is not set, we construct a <refkey> from the username
and the nodename, and look for such a static task.  If OSSTEST_TASK
/is/ set, we would require it to contain `<taskid> <type> <refkey>'.

In this patch, permit OSSTEST_TASK to be set simply to the <refkey>.
This is much more convenient and doesn't involve manually looking up
taskids.  The risk of error seems negligible.
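
The two accepted forms could be told apart along these lines.  This is
a Python sketch, not the Perl implementation; the function name, the
assumption that <taskid> is an integer, and the assumption that a bare
<refkey> contains no whitespace are all mine.

```python
def parse_osstest_task(value):
    """Return (taskid, type, refkey); taskid is None if it must be looked up."""
    parts = value.split()
    if len(parts) == 3:
        # Full form: `<taskid> <type> <refkey>'.
        taskid, tasktype, refkey = parts
        return int(taskid), tasktype, refkey
    if len(parts) == 1:
        # Short form: just the <refkey> of a static task.
        return None, "static", parts[0]
    raise ValueError("bad OSSTEST_TASK: %r" % value)
```

In the short form the caller would then look the task up in the
database by its refkey, as it does for the default constructed from the
username and nodename.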

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomg-execute-flight: create tmp directory before using it
Wei Liu [Mon, 7 Dec 2015 14:51:07 +0000 (14:51 +0000)]
mg-execute-flight: create tmp directory before using it

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-allocate: fix comment for deallocation
Wei Liu [Mon, 7 Dec 2015 14:38:56 +0000 (14:38 +0000)]
mg-allocate: fix comment for deallocation

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-hosts: document "power" command
Wei Liu [Mon, 7 Dec 2015 17:24:23 +0000 (17:24 +0000)]
mg-hosts: document "power" command

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agostandalone: Log things we are running via with_logging
Ian Campbell [Fri, 4 Dec 2015 15:27:59 +0000 (15:27 +0000)]
standalone: Log things we are running via with_logging

Turning on set -x generally in this script is too verbose, so run the
command in a subshell which sets -x.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agomg-schema-test-database: Add workflow doc comment
Ian Jackson [Mon, 7 Dec 2015 17:06:12 +0000 (17:06 +0000)]
mg-schema-test-database: Add workflow doc comment

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New patch

9 years agomg-schema-test-database: Safety catch in JobDB database open
Ian Jackson [Mon, 7 Dec 2015 16:33:57 +0000 (16:33 +0000)]
mg-schema-test-database: Safety catch in JobDB database open

When we open the `osstest' database, see whether this is a parent DB
(main DB) from which a test DB has been spawned by this user.

If it has, bomb out, unless the user has specified a suitable regexp
matching the DB name in the env var
  OSSTEST_DB_USEREAL_IGNORETEST

This means that when a test database is in play, the user who created
it cannot accidentally operate on the real DB.
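
The logic of the safety catch is roughly as follows.  This is a Python
sketch rather than the Perl in the JobDB code; the function name and
the has_test_child parameter are hypothetical, and whether the real
code anchors the regexp is not stated here, so the unanchored search is
an assumption.

```python
import os
import re


def check_real_db_allowed(dbname, has_test_child, environ=os.environ):
    """Raise unless it is safe for this user to open dbname."""
    if not has_test_child:
        # No test DB has been spawned from this DB by this user.
        return
    pattern = environ.get("OSSTEST_DB_USEREAL_IGNORETEST")
    if pattern is not None and re.search(pattern, dbname):
        # The user has explicitly said that using the real DB is intended.
        return
    raise RuntimeError(
        "database %s has a test DB spawned by you; set "
        "OSSTEST_DB_USEREAL_IGNORETEST to override" % dbname
    )
```

Passing environ explicitly makes the override easy to exercise without
touching the process environment.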

The safety catch does not affect Tcl programs, which get the DB config
directly, but in general that just means sg-execute-flight and
sg-run-job which already have a fair amount of safety catch because
they demand flight numbers.

mg-schema-test-database hits this feature over the head.  We assume
that the caller of mg-schema-test-database knows what they are doing;
particularly, that if they create nested test DBs (!), they do not
need the assistance of this feature to stop themselves operating
mg-schema-test-database incorrectly.  Anyone who creates nested test
DBs will hopefully recognise the potential for confusion!

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Fix unclarity in a comment.

9 years agomg-schema-test-database: Bump flight sequence number in test DB
Ian Jackson [Mon, 7 Dec 2015 17:22:54 +0000 (17:22 +0000)]
mg-schema-test-database: Bump flight sequence number in test DB

This makes test flights have different numbers to those currently in
production, which will help avoid accidents.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New patch

9 years agomg-schema-test-database: Change username for back-to-main-db xref
Ian Jackson [Mon, 7 Dec 2015 16:37:03 +0000 (16:37 +0000)]
mg-schema-test-database: Change username for back-to-main-db xref

The `username' of the xdbref task in the test db, referring to the
main db, is changed to `PARENT' (from `<username>@<nodename>').

Currently this is purely cosmetic, but it is going to be useful to
distinguish the two cases:
 * This is a test DB and contains references to a parent
 * This is a parent DB (probably the main DB) which contains
   references to child test DB(s).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Fix `DBS' in commit message to `DB(s)'.
v2: New patch

9 years agoConfiguration: Introduce $c{Username}
Ian Jackson [Mon, 7 Dec 2015 16:03:40 +0000 (16:03 +0000)]
Configuration: Introduce $c{Username}

This makes it easier to share the output of whoami.  As a beneficial
side effect it can now be overridden.

Replace many open-coded calls to `whoami` etc. with references to
$c{Username}.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: New patch

9 years agomg-schema-test-database: Sort out daemons; provide `daemons' subcommand
Ian Jackson [Fri, 4 Dec 2015 18:24:44 +0000 (18:24 +0000)]
mg-schema-test-database: Sort out daemons; provide `daemons' subcommand

We arrange for the test configuration to look for the daemons on a
different host and port, and we provide a convenient way to run such a
pair of daemons.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Moved setting of *Daemon{Host,Port} to this patch (was
     previously in `mg-schema-test-database: New script')

9 years agomg-schema-test-database: Move setting of test_cfg_setting to dbname
Ian Jackson [Fri, 4 Dec 2015 19:13:08 +0000 (19:13 +0000)]
mg-schema-test-database: Move setting of test_cfg_setting to dbname

This will make it available to a wider subset of the script, which is
going to be important in a moment.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomg-schema-test-database: New script
Ian Jackson [Fri, 4 Dec 2015 17:57:54 +0000 (17:57 +0000)]
mg-schema-test-database: New script

This allows a user in non-standalone mode to make a whole new test
database, which is largely a clone of the original database.

The new db refers to the same resources (hosts), and more-or-less
safely borrows some of those hosts.

Currently we don't do anything about the queue and owner daemons.
This means that queue-daemon-based resource allocation is broken when
clients are pointed at the test db.  But non-queue-based allocation
(eg, ./mg-allocate without -U) works, and the test db can be used for
db-related experiments and even support individual ts-* scripts (other
than ts-hosts-allocate of course).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Do not set *Daemon{Host,Port} - move this chunk to a later patch

9 years agoOsstest.pm: Break out and export globalconfigfiles
Ian Jackson [Fri, 4 Dec 2015 18:06:09 +0000 (18:06 +0000)]
Osstest.pm: Break out and export globalconfigfiles

No functional change; no callers as yet.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agocri-getconfig: Provide debugging for get_psql_cmd
Ian Jackson [Wed, 25 Nov 2015 15:34:04 +0000 (15:34 +0000)]
cri-getconfig: Provide debugging for get_psql_cmd

This allows us to execute only the first <some number> SQL
invocations.  The first non-executed one is dumped, instead, by having
get_psql_cmd print a rune involving ./mg-debug-fail (which the
caller will then execute).

The locking makes things work roughly-correctly if get_psql_cmd is run
in multiple processes at once: it is not defined exactly which
invocations get which counter values, but they will all work properly
and get exactly one counter value each.
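
A locked counter with that property might be sketched as below.  This
is Python for illustration (the real code is a shell function with a
perl rune); the function names and the counter-file format are
assumptions, and fcntl.flock is Unix-only.

```python
import fcntl
import os


def next_counter(path):
    """Atomically read, increment and store a counter kept in a file."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # Serialise concurrent callers.
        data = os.read(fd, 64).decode().strip()
        value = int(data) + 1 if data else 1
        os.lseek(fd, 0, os.SEEK_SET)
        os.ftruncate(fd, 0)
        os.write(fd, ("%d\n" % value).encode())
        return value
    finally:
        os.close(fd)  # Closing the descriptor also releases the lock.


def should_execute(path, limit):
    """True for the first `limit` invocations, False afterwards."""
    return next_counter(path) <= limit
```

Which process gets which value is up to the scheduler, but because the
read and the write happen under one exclusive lock, no two callers can
get the same value.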

If set -x is in force, turn it off for get_psql_cmd: our perl rune is
uninteresting to see repeated ad infinitum in debugging output.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agocri-getconfig: Provide get_psql_cmd and get_pgdump_cmd
Ian Jackson [Fri, 4 Dec 2015 18:03:30 +0000 (18:03 +0000)]
cri-getconfig: Provide get_psql_cmd and get_pgdump_cmd

This is for (non-standalone-mode) shell scripts which want to access
the postgresql database.

get_psql_cmd provides `-v ON_ERROR_STOP' because it is not the
default (!) and no sane caller would not want it.

No callers as yet.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Fix typo in comment.

9 years agomg-debug-fail: Catch attempts to read from a tty
Ian Jackson [Fri, 4 Dec 2015 18:41:10 +0000 (18:41 +0000)]
mg-debug-fail: Catch attempts to read from a tty

When stdin is a tty, do not try to dump it.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agomg-debug-fail: New utility script for debugging
Ian Jackson [Fri, 4 Dec 2015 18:12:38 +0000 (18:12 +0000)]
mg-debug-fail: New utility script for debugging

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Use "egrep ''" rather than "egrep .".  Both sanitise
     missing-final-newline but "egrep ''" will print blank lines,
     which is desirable here.

9 years agoConfiguration: No longer set password=<~/.xen-osstest/db-password>
Ian Jackson [Fri, 4 Dec 2015 18:00:48 +0000 (18:00 +0000)]
Configuration: No longer set password=<~/.xen-osstest/db-password>

Instead, expect the user to provide ~/.pgpass.

This is a good idea because we don't really want to be handling
passwords ourselves if we can help it.  And, we are shortly going to
want to do some exciting mangling of the database access
configuration, which would be complicated by the presence of this
password expansion.

This may break for some users of existing Executive (non-standalone)
setups which are using production-config-cambridge or the default
built-in configuration.

DEPLOYMENT NOTE: After this passes the push gate in Cambridge,
/export/home/osstest/.{xen-,}osstest/db-password should be deleted to
avoid confusion in the future.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agocri-getconfig: Break out exec_resetting_sigint.
Ian Jackson [Fri, 4 Dec 2015 19:11:41 +0000 (19:11 +0000)]
cri-getconfig: Break out exec_resetting_sigint.

Move this oddity (and the associated comment) from
standalone-generate-dump-flight-runvars to cri-getconfig.  We are
going to want it elsewhere.

We put this in cri-getconfig because that is the one library of
generic shell functions which everything includes.  Perhaps this file
is misnamed.

No overall functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agotcl daemons: log host and port number we bind to, at startup
Ian Jackson [Fri, 4 Dec 2015 19:03:16 +0000 (19:03 +0000)]
tcl daemons: log host and port number we bind to, at startup

If the socket setup fails, this makes it easier to see what the
program was trying to do.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agocr-try-bisect-adhoc: Set OSSTEST_PRIORITY=-30
Ian Campbell [Thu, 3 Dec 2015 14:55:48 +0000 (14:55 +0000)]
cr-try-bisect-adhoc: Set OSSTEST_PRIORITY=-30

This makes adhoc bisects slightly more important than smoke tests, on
the basis that a smoke test can choose another host while an adhoc
bisect cannot.

Document this in README.planner and while there make a note of the
usage of OSSTEST_RESOURCE_WAITSTART by cr-try-bisect.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agoproduction-config-cambridge: Use new squid proxy
Ian Jackson [Fri, 4 Dec 2015 13:57:59 +0000 (13:57 +0000)]
production-config-cambridge: Use new squid proxy

Specify both HttpProxy and DebianMirrorProxy.  In my tests this seems
to improve some of the apparently-intercepting-proxy-related failures,
and it will certainly improve logging.

I set DebianMirrorProxy too so that queries to security.d.o go through
the proxy.  Ideally we would have an apt cache that could be used as an
http proxy rather than as an origin server; when that happens we can
set DebianMirrorProxy to point to it and do away with DebianMirrorHost
(as we do in Massachusetts).

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Acked-by: Ian Campbell <ian.campbell@citrix.com>
9 years agocr-try-bisect-adhoc: Add a reminder about OSSTEST_EMAIL_HEADER
Ian Campbell [Thu, 3 Dec 2015 15:35:30 +0000 (15:35 +0000)]
cr-try-bisect-adhoc: Add a reminder about OSSTEST_EMAIL_HEADER

... I'm forever spamming Ian J for a couple of iterations when I set
one of these up and forget to override the default destination.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agocr-try-bisect-adhoc: Set laundered_testid so graph URL is correct
Ian Campbell [Wed, 2 Dec 2015 16:05:04 +0000 (16:05 +0000)]
cr-try-bisect-adhoc: Set laundered_testid so graph URL is correct

Otherwise the testid is missing from the filename, resulting in e.g.
http://osstest.test-lab.xenproject.org/~osstest/pub/results-adhoc/bisect/xen-unstable/test-amd64-amd64-qemuu-nested-intel..svg

Instead of test-amd64-amd64-qemuu-nested-intel.debian-hvm-install-l1-l2.svg

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agocr-try-bisect-adhoc: Ensure tmp exists.
Ian Campbell [Wed, 2 Dec 2015 15:58:58 +0000 (15:58 +0000)]
cr-try-bisect-adhoc: Ensure tmp exists.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
9 years agosg-run-job: Coalesce a couple of repetitions
Ian Jackson [Fri, 27 Nov 2015 16:36:23 +0000 (16:36 +0000)]
sg-run-job: Coalesce a couple of repetitions

Fold `guest-localmigrate.2' into `guest-localmigrate/x10' and move
`guest-start.2' to after `guest-start.repeat' (reversing the contents
of the latter so that the start comes before the stop).
(guest-start.2 is still necessary because the start/stop test leaves
the guest stopped, whereas the subsequent destroy test ought to happen
with the guest running.)

This change will allow the heisenbug compensator to see more of these
failures as the same failures.

The overall effect includes a reduction of the number of localhost
migrations from 11 to 10, but this is better than leaving a misleading
testid containing the string `x10' (or changing the testid).

It is best to fold this way, keeping the testid of the step which
previously had most of the regressions, because: the alternative,
keeping the testid of the low-repetition step, would allow osstest to
use previous lucky passes of the low-repetition step to justify
current failures of the now-high-repetition step.

To check that the effect of the patch is as intended, I did a before
run and an after run with OSSTEST_SIMULATE=1, and (a) collected, sedded
and diffed the sg-run-job transcripts and (b) looked in the db.

I also ran a real test (65261 in the Xen Project test lab) with a very
similar version, which passed, and will re-run that before pushing.

(a):

  c&p transcripts from mg-execute-flight email reports
  perl -i~ -pe 's/\b(38371|38370|65261|38395|38397)\b/FLIGHT/; s/^2015-11-\d\d \S+ /TIME /' [tu]
  diff -u [tu] >v
  grep starting v

 =>

 TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-build-check  build-check(1)
 TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-guest-saverestore host debian guest-saverestore.2
-TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-guest-localmigrate host debian guest-localmigrate.2
 TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-guest-localmigrate x10 host debian guest-localmigrate/x10
 TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-guest-stop host debian guest-stop
+TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-repeat-test 10 ts-guest-start host debian {;} ts-guest-stop host debian guest-start/debian.repeat
 TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-guest-start host debian guest-start.2
-TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-repeat-test 10 ts-guest-stop host debian {;} ts-guest-start host debian guest-start/debian.repeat
 TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-guest-destroy host debian guest-destroy
 TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-leak-check check host leak-check/check
-TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-logs-capture host capture-logs(24)
+TIME Z [test-amd64-i386-xl] starting FLIGHT.test-amd64-i386-xl ts-logs-capture host capture-logs(23)
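
The laundering and diffing in (a) can also be sketched in Python. This is
an illustration only: it reuses the two regular expressions from the perl
one-liner, but the sample transcript lines and their timestamps are made
up, and the names launder, before_raw and after_raw are hypothetical.

```python
import difflib
import re

# Same patterns as the perl one-liner above. Like perl's s/// without /g,
# each substitution is applied at most once per line (count=1).
FLIGHT_RE = re.compile(r"\b(38371|38370|65261|38395|38397)\b")
TIME_RE = re.compile(r"^2015-11-\d\d \S+ ")


def launder(line: str) -> str:
    """Replace the flight number and leading timestamp with placeholders."""
    line = FLIGHT_RE.sub("FLIGHT", line, count=1)
    return TIME_RE.sub("TIME ", line, count=1)


# Made-up sample lines standing in for the pasted transcripts.
before_raw = [
    "2015-11-26 12:00:01 Z [test-amd64-i386-xl] starting "
    "38370.test-amd64-i386-xl ts-guest-localmigrate host debian guest-localmigrate.2",
    "2015-11-26 12:00:02 Z [test-amd64-i386-xl] starting "
    "38370.test-amd64-i386-xl ts-guest-stop host debian guest-stop",
]
after_raw = [
    "2015-11-27 09:30:00 Z [test-amd64-i386-xl] starting "
    "38395.test-amd64-i386-xl ts-guest-stop host debian guest-stop",
]

before = [launder(line) for line in before_raw]
after = [launder(line) for line in after_raw]

# Equivalent of `diff -u [tu] >v` followed by `grep starting v`.
diff = list(difflib.unified_diff(before, after, "t", "u", lineterm=""))
starting = [line for line in diff if "starting" in line]

for line in starting:
    print(line)
```

Once the flight numbers and timestamps are laundered, only real changes to
the step sequence remain in the diff.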

(b):

osstestdb=> select * from (select job,stepno,step,status,testid from steps where flight=38370) before full outer join (select job,stepno,step,status,testid from steps where flight=38400) after using (testid) order by coalesce(before.stepno, after.stepno);
          testid           |        job         | stepno |             step             | status |        job         | stepno |             step             | status
---------------------------+--------------------+--------+------------------------------+--------+--------------------+--------+------------------------------+--------
 build-check(1)            | test-amd64-i386-xl |      1 | ts-build-check               | pass   | test-amd64-i386-xl |      1 | ts-build-check               | pass
 hosts-allocate            | test-amd64-i386-xl |      2 | ts-hosts-allocate            | pass   | test-amd64-i386-xl |      2 | ts-hosts-allocate            | pass
 host-install(3)           | test-amd64-i386-xl |      3 | ts-host-install-twice        | pass   | test-amd64-i386-xl |      3 | ts-host-install-twice        | pass
 host-ping-check-native    | test-amd64-i386-xl |      4 | ts-host-ping-check           | pass   | test-amd64-i386-xl |      4 | ts-host-ping-check           | pass
 xen-install               | test-amd64-i386-xl |      5 | ts-xen-install               | pass   | test-amd64-i386-xl |      5 | ts-xen-install               | pass
 xen-boot                  | test-amd64-i386-xl |      6 | ts-host-reboot               | pass   | test-amd64-i386-xl |      6 | ts-host-reboot               | pass
 host-ping-check-xen       | test-amd64-i386-xl |      7 | ts-host-ping-check           | pass   | test-amd64-i386-xl |      7 | ts-host-ping-check           | pass
 leak-check/basis(8)       | test-amd64-i386-xl |      8 | ts-leak-check                | pass   | test-amd64-i386-xl |      8 | ts-leak-check                | pass
 debian-install            | test-amd64-i386-xl |      9 | ts-debian-install            | pass   | test-amd64-i386-xl |      9 | ts-debian-install            | pass
 debian-fixup              | test-amd64-i386-xl |     10 | ts-debian-fixup              | pass   | test-amd64-i386-xl |     10 | ts-debian-fixup              | pass
 guest-start               | test-amd64-i386-xl |     11 | ts-guest-start               | pass   | test-amd64-i386-xl |     11 | ts-guest-start               | pass
 migrate-support-check     | test-amd64-i386-xl |     12 | ts-migrate-support-check     | pass   | test-amd64-i386-xl |     12 | ts-migrate-support-check     | pass
 saverestore-support-check | test-amd64-i386-xl |     13 | ts-saverestore-support-check | pass   | test-amd64-i386-xl |     13 | ts-saverestore-support-check | pass
 guest-saverestore         | test-amd64-i386-xl |     14 | ts-guest-saverestore         | pass   | test-amd64-i386-xl |     14 | ts-guest-saverestore         | pass
 guest-localmigrate        | test-amd64-i386-xl |     15 | ts-guest-localmigrate        | pass   | test-amd64-i386-xl |     15 | ts-guest-localmigrate        | pass
 guest-saverestore.2       | test-amd64-i386-xl |     16 | ts-guest-saverestore         | pass   | test-amd64-i386-xl |     16 | ts-guest-saverestore         | pass
 guest-localmigrate.2      | test-amd64-i386-xl |     17 | ts-guest-localmigrate        | pass   |                    |        |                              |
 guest-localmigrate/x10    | test-amd64-i386-xl |     18 | ts-guest-localmigrate        | pass   | test-amd64-i386-xl |     17 | ts-guest-localmigrate        | pass
 guest-stop                | test-amd64-i386-xl |     19 | ts-guest-stop                | pass   | test-amd64-i386-xl |     18 | ts-guest-stop                | pass
 guest-start.2             | test-amd64-i386-xl |     20 | ts-guest-start               | pass   | test-amd64-i386-xl |     20 | ts-guest-start               | pass
 guest-start/debian.repeat | test-amd64-i386-xl |     21 | ts-repeat-test               | pass   | test-amd64-i386-xl |     19 | ts-repeat-test               | pass
 guest-destroy             | test-amd64-i386-xl |     22 | ts-guest-destroy             | pass   | test-amd64-i386-xl |     21 | ts-guest-destroy             | pass
 leak-check/check          | test-amd64-i386-xl |     23 | ts-leak-check                | pass   | test-amd64-i386-xl |     22 | ts-leak-check                | pass
 capture-logs(23)          |                    |        |                              |        | test-amd64-i386-xl |     23 | ts-logs-capture              | pass
 capture-logs(24)          | test-amd64-i386-xl |     24 | ts-logs-capture              | pass   |                    |        |                              |
(25 rows)

osstestdb=>
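
For readers without access to the database, the comparison in (b) can be
approximated in plain Python. This is a sketch, not osstest code: it
copies the testid/stepno pairs from the tail of the table above into two
dicts and emulates the full outer join on testid. The function name
full_outer_join is hypothetical, and the tie-break on testid is an
addition for determinism that the SQL query does not specify.

```python
def full_outer_join(before, after):
    """Join two testid -> stepno dicts on testid, keeping unmatched rows.

    Returns (testid, before_stepno, after_stepno) tuples ordered like
    `order by coalesce(before.stepno, after.stepno)`.
    """
    rows = [(t, before.get(t), after.get(t)) for t in set(before) | set(after)]
    rows.sort(key=lambda r: (r[1] if r[1] is not None else r[2], r[0]))
    return rows


# testid -> stepno, copied from the last rows of the table above.
before = {
    "guest-saverestore.2": 16,
    "guest-localmigrate.2": 17,
    "guest-localmigrate/x10": 18,
    "guest-stop": 19,
    "guest-start.2": 20,
    "guest-start/debian.repeat": 21,
    "guest-destroy": 22,
    "leak-check/check": 23,
    "capture-logs(24)": 24,
}
after = {
    "guest-saverestore.2": 16,
    "guest-localmigrate/x10": 17,
    "guest-stop": 18,
    "guest-start/debian.repeat": 19,
    "guest-start.2": 20,
    "guest-destroy": 21,
    "leak-check/check": 22,
    "capture-logs(23)": 23,
}

rows = full_outer_join(before, after)

# Steps present on only one side show up with None on the other.
only_before = [t for t, b, a in rows if a is None]
only_after = [t for t, b, a in rows if b is None]

print("only before:", only_before)
print("only after:", only_after)
```

The unmatched rows correspond to the blank cells in the psql output: the
folded guest-localmigrate.2 step and the renumbered capture-logs step.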

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v2: Do not increment the count of migration tests, since that would
     make the testid misleading.
    Do the change to the start/stop test differently.