Ian Jackson [Wed, 11 Nov 2015 11:51:37 +0000 (11:51 +0000)]
Serial::xenuse: Send xenuse output to /dev/null
Like sympathy, attaching via xenuse causes xenuse to send output from
the host to its own stdout.
But we don't want the ts-logs-capture stdout to contain this serial
output, interleaved with its own log messages. We'll capture the
whole serial log from the xenuse logfile. So redirect it to /dev/null.
I have checked that xenuse does (at least sometimes, eg when given a
nonexistent hostname) use stderr when something actually goes wrong.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v17: New patch
Robert Ho [Mon, 2 Nov 2015 09:21:13 +0000 (17:21 +0800)]
Serial: Add new serial method object for `guest' type
L1 guests' serial ports are owned by qemu in L0. We can send them
debug keys by writing to the qemu pipe.
(xl debug-key looks like it would be useful but it actually sends
debug keys to the hypervisor of the host it is running on. We want to
send the debug keys to the hypervisor and kernel from the outside.)
Log fetching is not needed because from the POV of the L0 the L1 is a
guest, so the L0's log capture will already fetch the L1's serial
console output.
Signed-off-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v17: Get FIFO path right.
Log the ssh command.
Put quotes around the redirection, as required.
Export sshuho from TestSupport.
Make ::guest->fetch_logs actually use @_ so that it works.
v16: Mostly rewritten: now uses new keys_real base class, and
uses the qemu pipe rather than xl debug-keys.
v15: New patch.
Ian Jackson [Tue, 10 Nov 2015 19:11:01 +0000 (19:11 +0000)]
Serial: Factor out Osstest::Serial::keys_real
The sympathy and xenuse serial modules had too much in common. Factor
out the common code, which is now responsible for
- knowledge of the Xen console switch
- splitting strings up into individual keys
- timing decisions
- error trapping and logging
This new class is an abstract base class for the concrete serial
method classes, and calls back to its derived class to prepare, send
each actual key, and shut down.
There is some functional change: notably, after failure to send the
first debug key, sending the remainder will not be attempted.
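As an illustration only, the division of labour described above can be sketched in shell (the real code is a Perl base class). The name send_keys, its argument order and the default delay are hypothetical. The sketch stops at the first failure of any key, which covers the first-key case mentioned above.

```shell
# Hypothetical sketch: split a string into individual keys, hand each one
# to a caller-supplied command, pause between keys, and stop on error.
send_keys () {
    send_cmd=$1      # command that sends one key (assumed interface)
    keys=$2          # string of keys, one character per key
    delay=${3:-1}    # seconds to wait between keys (assumed default)
    while [ -n "$keys" ]; do
        rest=${keys#?}            # everything after the first character
        key=${keys%"$rest"}       # the first character
        if ! "$send_cmd" "$key"; then
            echo "send_keys: failed to send '$key', giving up" >&2
            return 1
        fi
        keys=$rest
        if [ -n "$keys" ]; then
            sleep "$delay"
        fi
    done
    return 0
}
```

A derived method would supply its own send command, eg one which writes the key to a console or fifo.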
While we're here, fix a typo `dettach' to `detach'.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v17: Move `force attach' and `force detach' writes into xenuse.pm
where they belong.
Add typo fix.
Ian Jackson [Tue, 10 Nov 2015 18:00:03 +0000 (18:00 +0000)]
HVM guests: Use qemu "pipe:" for serial output logging
Modern qemu has the "pipe:/PATH" character driver. This opens
/PATH.in for reading and /PATH.out for writing. In my tests, I found
that:
- contrary to the documentation, they do not need to be pipes
(at least, /PATH.out can be a file)
- but they must both already exist
- qemu will follow symlinks, so /PATH.out can be a symlink to
a file
- if /PATH.in is a fifo, qemu will tolerate other processes opening
it for writing, and writing things, only occasionally. (Probably,
qemu opens it O_RDWR; or perhaps it reopens it after EOF.)
Use this feature to achieve the following:
- guest serial output ends up in /var/log/xen/osstest-serial-GUEST.log
(which is already captured by ts-logs-capture) rather than
interleaved with qemu's stderr output (in the libxl-created logfile)
- guest serial input comes from a pipe in /root which we can open
and write to if we want to talk to the guest
We are mostly interested in the final bullet point, because that will
allow us to send debug keys to the emulated serial port of an L1
nested HVM guest.
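A minimal sketch of the file setup this implies, assuming the observations above hold. The helper name and its arguments are hypothetical, and the paths used by osstest itself are not reproduced here.

```shell
# Hypothetical helper: prepare the two files behind qemu's "pipe:BASE".
# BASE and LOGFILE are illustrative arguments, not osstest's real paths.
setup_serial_pipe () {
    base=$1
    logfile=$2
    rm -f "$base.in" "$base.out"
    mkfifo -m 600 "$base.in" || return 1   # guest serial input: a fifo we can write to
    : >>"$logfile" || return 1             # both ends must exist before qemu starts
    ln -s "$logfile" "$base.out"           # qemu follows the symlink to the log file
}
```

The guest would then be given a serial device of pipe:BASE, and a key could be sent by writing to BASE.in.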
Looking at the source code of qemu in 4.2 and 4.6 I think the above
approach will work with all relevant qemu-xen's.
If the device model version is qemu-xen-traditional, `pipe:' is not
supported. If device_model_version is not set, we will be using
whatever the xen.git we used defaults to. For Xen 4.1 and earlier
that is qemu-xen-traditional, and I'm slightly loath to break osstest
for those earlier versions. There doesn't seem to be anything else in
the runvars that would clue us in. So be cautious and do not use the
new feature unless device_model_version is explicitly set.
The nested tests are all -qemuu so set device_model_version.
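The cautious rule above amounts to the following decision, sketched here in shell with a hypothetical function name. The real check is made in osstest's Perl code against the runvar.

```shell
# Hypothetical sketch: succeed only when it is safe to use `pipe:'.
# $1 is the value of device_model_version, or empty if it is not set.
use_serial_pipe () {
    case "${1:-}" in
        "")                   return 1 ;;  # unset: the default may be traditional
        qemu-xen-traditional) return 1 ;;  # `pipe:' is not supported
        *)                    return 0 ;;  # explicitly set to something else
    esac
}
```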
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v17: Fix fifo mode to be 600, not 700
v16: New patch
Robert Ho [Sat, 31 Oct 2015 05:42:03 +0000 (13:42 +0800)]
Osstest/Testsupport.pm: use get_target_property() for some host setup
In nested cases, a nested host can inherit its host's properties for
DHCP watch setup and ether_prefix setup.
Signed-off-by: Robert Ho <robert.hu@intel.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v15: New patch
Ian Jackson [Sat, 31 Oct 2015 05:35:26 +0000 (13:35 +0800)]
Osstest/Testsupport.pm: change target's default kernkind to 'pvops'
This is safe only if no existing flights would be affected. (That is,
the meaning of no existing sets of runvars would be changed.)
To check whether this would make any difference I did some database
searches. Since any time target_kernkind_check is called it sets a
corresponding `console' runvar, I can search for `console' without a
corresponding `kernkind'. I ran this query:
select * from (select *, (select name from runvars r2 where
r2.flight=r1.flight and r2.job=r1.job and r2.name=
replace(r1.name,'console','kernkind')) kk from runvars r1 where
r1.name like '%console') iq where kk is null order by flight desc;
and it found nothing since flight 7682. So I think we can change the
default.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Signed-off-by: Robert Ho <robert.hu@intel.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v15: New patch
Ian Jackson [Thu, 10 Sep 2015 17:09:16 +0000 (18:09 +0100)]
ts-xen-install: networking: Rename `nodhcp' to `ensurebridge'
This function does not (now) always undo the DHCP configuration.
Sometimes it leaves it. Its main function is to ensure that we have
a bridge for use by guests.
So rename the function.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: This patch was previously 4/4 of a miniature series containing
a different way of dealing with the Nested HVM L1 DHCP problem.
Robert Ho [Fri, 14 Aug 2015 03:55:50 +0000 (11:55 +0800)]
ts-xen-install: Properly handle hosts without a static IP address
Check IpStatic, and if it is not set, provide a dhcp stanza in
/etc/network/interfaces, rather than an `inet static' one.
This is necessary for L1 nested hosts, because they don't have a
static IP address.
In principle this makes matters more correct for physical hosts
without static IP addresses, but these are currently not supported
by selecthost().
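For illustration, the choice between the two stanzas might look like this. It is a sketch with hypothetical names: the interface name, the argument layout and the example address are assumptions, and the real stanza written by ts-xen-install contains more settings.

```shell
# Hypothetical sketch: emit an /etc/network/interfaces stanza.
# An empty second argument stands for "IpStatic is not set".
iface_stanza () {
    iface=$1
    ip_static=${2:-}
    netmask=${3:-}
    echo "auto $iface"
    if [ -n "$ip_static" ]; then
        echo "iface $iface inet static"
        echo "    address $ip_static"
        echo "    netmask $netmask"
    else
        echo "iface $iface inet dhcp"
    fi
}
```

For example, `iface_stanza xenbr0` gives the dhcp form and `iface_stanza xenbr0 192.0.2.10 255.255.255.0` gives the static form.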
Signed-off-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: Only use `inet dhcp' if !$ho->{IpStatic}.
Robert Ho [Mon, 17 Aug 2015 08:40:22 +0000 (16:40 +0800)]
Nested HVM: Add test job to appropriate flights
Signed-off-by: longtao.pang <longtaox.pang@intel.com> Signed-off-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com>
---
v17: Use usual_debianhvm_image
v14: Use default gueststorage_size, rather than setting runvar.
Dropped acked from Ian Campbell.
Robert Ho [Mon, 17 Aug 2015 09:07:02 +0000 (17:07 +0800)]
Nested HVM: Provide test-nested recipe
Signed-off-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: ts-nested-setup command line syntax updated.
v15: (Robert Ho) remove the unnecessary l1 destroy; as it will
be implicitly powered off by the framework as a nested host.
This change hasn't been confirmed by Ian Jackson yet; I
planned to separate it as a fix patch but by mistake squashed
it in. Ian Jackson may want to revert this if he doesn't
agree.
Robert Ho [Mon, 17 Aug 2015 07:58:01 +0000 (15:58 +0800)]
Nested HVM: Provide ts-nested-setup to help make L1 usable as a host
* Provide the L1 with some storage for its own guests' disks
* Install some packages in the L1
* Optionally, set a runvar defining the L1 for the rest of the job
The recipe is going to run ts-xen-install etc.
Signed-off-by: longtao.pang <longtaox.pang@intel.com> Signed-off-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: Use target_check_ip (renamed, earlier, in a new patch)
ts-nested-setup now has a default gueststorage_size of 20G
and this is implicitly used by make-flight.
Adjusted for new selecthost nested host syntax and correspondingly
completely changed invocation syntax.
Only optionally sets the runvar, if you pass --define. (This
will make it easier to play around with interactively.)
Broken the pieces of work out into subroutines for clarity.
Comment about guest storage slightly edited and rewrapped.
Guest storage runvars and perl variable names etc. renamed to
`gueststorage' rather than `guest_storage'.
LVM and VG names and perl variable names changed to be clearer.
Install `ed' too.
Use lv_create and toolstack()->block_attach.
Dropped ack from Ian Campbell.
Ian Jackson [Tue, 30 Jun 2015 16:19:32 +0000 (17:19 +0100)]
sg-run-job: Provide infrastructure for layers of nesting
Provides nested-layer-descend, which can be called in an individual
test job at the appropriate point (after the L1 has been set up).
The inner host is a guest of the outer host; powering it off means
destroying it. Putting the poweroff at this point in the loop, rather
than in per-host-finish, avoids powering off physical servers. The
use of `.' rather than `!.' for iffail means we do not power off
after failures (as we might want to preserve the state for debugging
etc).
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v14: Squash syntax fix from Robert Ho into this patch
v15: Remove spurious "=" from final-poweroff step invocation
Ian Jackson [Sat, 31 Oct 2015 03:31:54 +0000 (11:31 +0800)]
sg-run-job: Break out per-host-prep and per-host-finish
No functional change.
We now call the per-host-ts finish steps unconditionally, rather than
only if !$need_build_host. per-host-ts is a (complicated) no-op if
$need_build_host, since in that case $need_xen_hosts is {}.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: Squash typo fix from Robert into this patch
Ian Jackson [Fri, 25 Sep 2015 18:03:06 +0000 (19:03 +0100)]
Toolstack::xl: Provide block_attach method
It is possible that this may work some of the time with xm, so I have
taken no measures to prevent it running then.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Robert Hu <robert.hu@intel.com> Tested-by: Robert Hu <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: New patch
v15: Fix missing $ho assignment in block-attach method
Ian Jackson [Fri, 25 Sep 2015 18:01:46 +0000 (19:01 +0100)]
LVM: Break out lv_create
We are going to want to reuse this.
lv_create doesn't (want to) take a $gho, but the $vg and $lv names
directly (so that callers can use it when they don't have a suitable
$gho whose $gho->{Lvdev} they want to use).
In the one existing call site we pass $gho->{Vg} and $gho->{Lv} so
that the effect is the same.
There is a minor functional change: $gho->{Lvdev} has been put through
lv_dev_mapper. But we don't care about that in lv_create (since the
LVM operations, and dd, are perfectly happy to use the `real',
non-/dev/mapper, names). So we can just use /dev/$vg/$lv.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Signed-off-by: Robert Ho <robert.hu@intel.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v17: Discuss change to /dev/$vg/$lv from $gho->{Lvdev}.
v14: New patch
v15: Change some trivial typo, so to resolve conflicts with
production tree.
Robert Ho [Sat, 15 Aug 2015 12:37:50 +0000 (20:37 +0800)]
await_tcp(): Run check_ip on each loop iteration
await_tcp is often invoked after a reboot.
In this situation the target's IP address may change. If this happens
while await_tcp is running, we would continue to poll the old IP address.
Fix this by running target_check_ip on each iteration.
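The shape of the fix can be sketched as a polling loop which looks the address up again on every pass. This is a shell illustration with hypothetical names; the real await_tcp is Perl and calls target_check_ip.

```shell
# Hypothetical sketch: lookup_cmd prints the target's current IP address,
# probe_cmd succeeds when the given address answers.
await_reachable () {
    lookup_cmd=$1
    probe_cmd=$2
    tries=$3
    interval=${4:-1}
    while [ "$tries" -gt 0 ]; do
        ip=$("$lookup_cmd")          # re-resolve on every iteration
        if [ -n "$ip" ] && "$probe_cmd" "$ip"; then
            echo "$ip"
            return 0
        fi
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}
```

Had the lookup been done once before the loop, a target which came back from its reboot with a new address would never be found.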
Signed-off-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: Dropped change to selecthost, which was in code which is no
longer present in this version of the series.
Rewritten to use target_check_ip.
Dropped IMO-unnecessary comment.
Ian Jackson [Fri, 25 Sep 2015 17:07:42 +0000 (18:07 +0100)]
target_check_ip: Rename and improve from guest_check_ip
Make this function suitable for running on targets with static IP
addresses. (Ie, on physical hosts.) Accordingly, rename it and
adjust all call sites.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: New patch
Ian Jackson [Fri, 25 Sep 2015 17:08:45 +0000 (18:08 +0100)]
DhcpWatch::leases: Fix a reporting message
This talks about `guest_check_ip', but this code is now factored out
into a method. Use the correct method name in reporting.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: New patch
Robert Ho [Fri, 28 Aug 2015 01:52:15 +0000 (09:52 +0800)]
Nested hosts: Provide PDU power method
This `guest' power method uses VM create/destroy. It is automatically
used for nested hosts. It would not make much sense to configure it
manually.
For a nested host/guest, powering on or off means that its host
invokes the $(toolstack)->create/destroy method.
Signed-off-by: Robert Ho <robert.hu@intel.com> Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: Mostly rewritten by iwj
Ian Jackson [Fri, 25 Sep 2015 16:24:00 +0000 (17:24 +0100)]
selecthost: Support nested hosts (guests which are also hosts)
We introduce a new syntax: instead of a hostname (which might appear
in a command line argument to a ts-* script and hence be passed to
selecthost, or which might be in a runvar), we now support
<hostspec>:<domname>.
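A sketch of how such a string might be taken apart, in shell with a hypothetical function name. Splitting at the last colon, so that the <hostspec> part could itself be nested, is an assumption here and is not confirmed by this message.

```shell
# Hypothetical sketch: classify a host specification.
# Prints "nested HOSTSPEC DOMNAME" or "plain HOSTNAME".
parse_hostspec () {
    case "$1" in
        *:*) echo "nested ${1%:*} ${1##*:}" ;;   # split at the last colon (assumed)
        *)   echo "plain $1" ;;
    esac
}
```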
Such `hosts' (let us refer to such a thing as an L1, although in
principle further nesting may be possible) are expected to be
dynamically created. So they do not have flags and properties in the
configuration (or in an Executive instance's database).
The IP address is determined dynamically from the leases file. If the
L1 is not running, then no IP address may be found. This is not an
error. Users of this facility will need to make sure that ts-*
scripts which are unaware of the L1's special status are only invoked
when it is known that the L1 is up and has obtained its IP address.
`Power cycling' the L1 will be done by VM control operations in the
L0; this will come in a subsequent patch.
`Serial access' to the L1 guest will likewise need to be done via the
console arrangements in L0.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v14: New patch
Ian Jackson [Fri, 25 Sep 2015 16:19:35 +0000 (17:19 +0100)]
selecthost: Minor cleanups
Document the syntax for $ident.
Log the ident as well as the selected hostname.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Tested-by: Robert Ho <robert.hu@intel.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v17: Fix typo in doc message.
v14: New patch
Ian Jackson [Thu, 29 Oct 2015 15:51:01 +0000 (15:51 +0000)]
build_clone: git clean newly cloned trees
This may seem redundant, however:
git does not track empty directories. So it can happen that a
directory is created as part of `git clone', but is empty in the
revision switched to with `git checkout'.
In this situation, the tree we are going to build ought not to contain
this directory, because that directory will not (in general) be
produced, eg when the revision being switched to becomes master.
We can use git clean to produce a working tree whose contents -
including the presence or absence of empty directories - depends only
on the commit we are trying to check out, and not on the previous
states of the git history or working tree.
For example, if a directory is made empty (ie, deleted, since git does
not distinguish) in xen.git#staging, osstest's clones of
xen.git#master will produce the directory, but `git checkout' of
staging won't delete it. If the xen.git build system mistakenly
depends on this directory, we won't detect this until the deletion
reaches master. This situation actually occurred with xen.git#598e97f
"tools/python: remove broken xl binding" (fixed in b261366f).
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> CC: Andrew Cooper <andrew.cooper3@citrix.com> Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Ian Jackson [Fri, 23 Oct 2015 16:12:48 +0000 (16:12 +0000)]
cs-bisection-step: Mark which revision(s) changed in each node
We compare the revision rtuple elements with each of the parents, and
mark changes by surrounding the revision id with *asterisks*.
In the SVG output, we have to strip these when generating the URL, and
we can show this by emboldening the relevant revision instead. This
involves changing the other output to be of normal weight by deleting
the emboldening output by dot.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 23 Oct 2015 13:46:04 +0000 (13:46 +0000)]
cs-bisection-step: Make hyperlinks in SVG revision graph
Make various elements in the output into hyperlinks and hence document
that the SVG version of the graph is best to use.
AFAICT dot does not provide a way to put literal SVG elements into its
output. So we postprocess it. Luckily we produced the input to dot
so we know a lot about what the output will look like.
In theory it would be better to feed the SVG into an XML parser and do
this editing at the ESIS level, via XSLT. However, I don't understand
XSLT, and this regexp-based version will work until the authors of dot
decide to change the output syntax. If they do, the hyperlinking will
go away, but everything else will still work. I think this approach
will be less effort overall. This is particularly true as even using
XSLT would involve us knowing how dot's output is structured, and
changes to the syntax of the dot output are not very likely unless the
dot authors also change the semantics.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 20 Oct 2015 17:23:02 +0000 (18:23 +0100)]
ms-queuedaemon: Do not spin if client input is delayed/truncated
chan-read-data would spin if `read' returns early because of
nonblocking mode.
Check whether the return value is the empty string (which can only
happen on eof or nonblocking lack of data, and we checked eof just
before), and if so, simply return. The fileevent remains set up so we
will be called again when more data arrives.
(Deployment/testing note: this change is currently live in Cambridge,
as I cowboyed it directly into ~osstest/daemons-testing.git, on
observing this misbehaviour.)
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Mon, 19 Oct 2015 16:06:43 +0000 (17:06 +0100)]
README.bisection: New consumer-oriented document
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> CC: Wei Liu <wei.liu2@citrix.com>
---
v2: Untabify
Ian Jackson [Mon, 19 Oct 2015 14:38:55 +0000 (15:38 +0100)]
Rename README.bisection to NOTES.bisection
This contains osstest-developer-oriented information about the
bisector, and is not really pitched at osstest consumers. (It's also
rather more of a sketch than a complete algorithm doc, and somewhat
out of date.)
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 23 Oct 2015 10:00:27 +0000 (11:00 +0100)]
ap-fetch-version-old: Fix qemu branch handling after merging of qemu trees
ap-fetch-version-old should always reference the output gate, but 99e92a6b3991 "Switch to merged qemu-xen{,-traditional}.git trees"
switched it to use TREE_QEMU_UPSTREAM directly, which can be
overridden by cr-daily-branch. This broke at least when
OSSTEST_BASELINES_ONLY=y since "cr-daily-branch qemu-mainline" ends up looking
for an "upstream-tested" branch in the qemu.org git tree, when it should be
looking at our output tree on xenbits.
Follow pattern of TREE_LINUX and set BASE_TREE_QEMU_UPSTREAM to the
output gate and then conditionally set TREE_QEMU_UPSTREAM to the
BASE_TREE if it is not already set. Switch ap-fetch-version-old to use
BASE_TREE.
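The pattern is the usual shell default-assignment idiom. A minimal sketch, wrapped in a hypothetical function and using a placeholder URL rather than the real output gate:

```shell
# Sketch of the BASE_TREE pattern described above.
set_qemu_trees () {
    # Placeholder URL standing in for the output gate (not the real one).
    BASE_TREE_QEMU_UPSTREAM=git://xenbits.example/qemu-xen.git
    # Keep any value the caller (eg cr-daily-branch) has already set.
    : "${TREE_QEMU_UPSTREAM:=$BASE_TREE_QEMU_UPSTREAM}"
}
```

Code which must always consult the output gate reads BASE_TREE_QEMU_UPSTREAM, which an override of TREE_QEMU_UPSTREAM cannot change.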
I have confirmed that for
qemu-{mainline,upstream-unstable,4.6-testing} both
TREE_QEMU_UPSTREAM=git://git.qemu.org/qemu.git OSSTEST_BASELINES_ONLY=y ./ap-fetch-version-old $branch
and
TREE_QEMU_UPSTREAM=git://git.qemu.org/qemu.git ./ap-fetch-version-old $branch
are consulting the correct trees (and produce the same answers) and that
./ap-fetch-version $branch is also correct in each case.
I have done a dummy cr-daily-branch qemu-mainline (with standalone make-flight)
with baselines forced and it now appears correct.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Thu, 30 Jul 2015 12:42:03 +0000 (13:42 +0100)]
Switch to merged qemu-xen{,-traditional}.git trees
The qemu-mainline flights now push to the upstream-tested branch of
qemu-xen.git (while still pulling from upstream).
The qemu-upstream-unstable flights pull from staging and push to
master in qemu-xen.git
All of the qemu-upstream-X.Y-testing pull from staging-X.Y and push to
stable-X.Y branch in qemu-xen.git.
For now we also continue pushing to the old trees for qemu-upstream
4.2 through to 4.6-testing. Once those branches have updated their
Config.mk and done a point release we can consider removing these.
Abolish $LOCALREV_QEMU_MAINLINE in favour of $LOCALREV_QEMU_UPSTREAM,
it was used inconsistently.
While changing things ensure all pushes are done to refs/heads/$thing
to avoid issues when output branches do not exist.
Note that xenbranch_forqemu is no longer needed.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 9 Oct 2015 11:22:30 +0000 (12:22 +0100)]
smoke tests: Consider osstest revision when reusing builds
Build results done with one version of osstest are not necessarily
reusable with a different version of osstest. For example, the suite
may have changed. The smoke tests try to reuse builds from
xen-unstable but if osstest changes incompatibly, the smoke tests
might repeatedly fail until a xen-unstable flight using the new
osstest completes.
(This issue is a problem for bisection too but that's less critical
and there is less of an easy answer.)
Address this by also considering the osstest revision when searching
for builds to reuse, so the smoke tests only reuse builds made with
the same version of osstest. (If we are running with uncommitted
changes, we ignore that aspect and instead insist on only builds which
used the committed revision of osstest: this makes testing the
xen-unstable-smoke machinery marginally easier, and will make no
difference to production runs.)
This introduces a new problem, though: after an osstest push, there
will be no suitable builds until the next xen-unstable flight passes.
So each smoke test would run its own build. This would delay the
smoke tests, and waste capacity.
To address this we permit the smoke tests to reuse (i) builds from a
suitable osstest flight (hopefully there will be an osstest flight
which justified the push of the osstest version we are running) or
(ii) previous builds done by a xen-unstable-smoke test (this is a
useful backstop).
sg-check-tested always reports the highest-numbered flight which
matches all the specified conditions, so overall this means that:
1. Normally, the most recent relevant build for each job will be a
xen-unstable build; xen-unstable-smoke will reuse those. Recent
flights on the osstest branch will be unsuitable because they use
different osstest; and there will be no recent relevant builds on
xen-unstable-smoke because xen-unstable-smoke will prefer to reuse its
own old builds rather than make new ones.
2. After a normal osstest push (whether force-pushed manually on the
basis of a test flight, or automatically pushed), the xen-unstable
builds are unsuitable. However, the osstest push _is_ suitable, and
its builds will be used.
3. After a manual force push of an untested osstest, there are no
suitable builds on either xen-unstable or osstest. The first
xen-unstable-smoke run will have to do all the builds. However,
subsequent xen-unstable-smoke runs can just pick up those builds.
These same builds will be reused until a xen-unstable flight using the
new osstest produces a passing build.
(If the force pushed osstest causes a build to break, then
xen-unstable-smoke will keep retrying and failing that individual
build until a fix is pushed to osstest#production or xen#staging.)
We honour an environment variable SMOKE_HARNESS_REV to override the
automatic determination of the desired test harness revision.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Fri, 9 Oct 2015 16:18:42 +0000 (17:18 +0100)]
ap-fetch-*: Support $AP_FETCH_PLACEHOLDERS envvar which outputs a placeholder
And use this in standalone-generate-dump-flight-runvars. In general I
don't think we are interested in the specific revision_* runvars when
using this tool but when it matters this new behaviour can be avoided
by setting AP_FETCH_PLACEHOLDERS=n.
This is quicker even than using memoisation on the ap-fetch
invocations and produces output like:
This is useful when doing comparisons of before and after changes to
e.g. make-flight since they do not pick up noise if something/someone
does a push in the middle.
The memoisation bits of standalone-generate-dump-flight-runvars are
disabled if AP_FETCH_PLACEHOLDERS=y.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Updated comment with more accurate discussion, references to
bash "trap - INT" bug, and proposed new shopt.
No code change.
Ian Campbell [Mon, 5 Oct 2015 09:12:04 +0000 (10:12 +0100)]
ts-logs-capture: Gather /proc/modules
Knowing which modules are loaded can be useful.
/proc/modules is slightly less human-readable than the output of
"lsmod", but also includes the load address, which might plausibly be
useful for decoding a stack trace (although I've never used it for
that myself)
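As a sketch of how the captured file might be used, the following pulls out each module's name and load address. The function name is hypothetical. It relies on the /proc/modules field order of name, size, reference count, dependencies, state and address.

```shell
# Hypothetical helper: read /proc/modules-format text on stdin and
# print "NAME ADDRESS" for each module.
module_addrs () {
    # Fields: name size refcount dependencies state address
    awk '{ print $1, $6 }'
}
```

It would be run as `module_addrs </proc/modules`, or against a copy of the file saved by the log capture.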
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 1 Oct 2015 15:01:26 +0000 (16:01 +0100)]
Debian: Use dtbs from kernel dist when booting that kernel
The kernel dist built by ts-kernel-build puts the corresponding dtbs
into /boot/dtbs/$kvers.
The host installer's dtbs remain in /boot/dtbs and are used when
booting the native kernel.
It's possible that this change will expose bugs which exist in the
DTBs in previous kernel branches (3.18 and 4.1). I've not
investigated, I think we should accept this possibility and deal with
it via backport requests (and maybe some force pushes if appropriate)
as necessary.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 1 Oct 2015 15:04:22 +0000 (16:04 +0100)]
Debian: Enable interpolation in uboot_scr_load_dtb here doc
By switching <<'END' to <<END. A future patch is going to want to put
a variable here which requires interpretation by the Perl.
Unfortunately this means lots of extra backslashes to escape things
such that they pass through Perl and Shell and end up as ${foo} in the
resulting u-boot script.
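The same quoting rule exists in shell here documents, which gives a small way to see the effect. This is an analogy only: the code in question is a Perl here doc and has a second layer of escaping for the shell. The variable, its value and the `load' line are illustrative.

```shell
bootfile=/boot/vmlinuz-test     # hypothetical value to interpolate

# Quoted delimiter: nothing is expanded, so ${dev} and $bootfile pass
# through literally.
quoted () {
cat <<'END'
load ${dev} $bootfile
END
}

# Unquoted delimiter: $bootfile is expanded, so the ${dev} which must
# survive for u-boot now needs a backslash.
interpolated () {
cat <<END
load \${dev} $bootfile
END
}
```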
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 1 Oct 2015 10:08:49 +0000 (11:08 +0100)]
ts-kernel-build: Include dtbs in dist file
These are installed to $(INSTALL_PATH)/dtbs/$(KERNEL_RELEASE) where
$(INSTALL_PATH) defaults to /boot but we override it to our staging
/boot.
Note that ts-host-install will install the OS dtbs directly into
/boot/dtbs without the subdirectory, so this won't clash and could be
considered a fallback hence I don't propose to move those ones.
The dtbs_install target has been available since v3.14, whereas we
only test v3.16 onwards on ARM, hence no arrangements are needed to
conditionalise this installation over and above the per-arch
arrangements made here.
Having now set $(INSTALL_PATH) I think the "install" target could now
take over the installation of System.map, vmlinux and .config into
/boot but I've not checked this with all historical kernel versions
and don't intend to make this change now.
Remove any previous dist dir on install, otherwise the kernel tends to
create dist/boot/dtbs.old with the previous contents on repeated use.
Seems like good hygiene anyway.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Sat, 3 Oct 2015 10:47:28 +0000 (11:47 +0100)]
ts-freebsd-install: Pass -s option to kpartx
This is "Sync mode. Don't return until the partitions are created",
which seems to be needed in Jessie. The option also exists in Wheezy,
according to the manpage.
Without this the following mount fails having apparently raced against
the creation of the device nodes.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Cope with this here, by treating the x86/xen (PV) case as the special
case that it is (due to the requirement of a separate i686-pae
installer kernel).
This goes unnoticed in the distros-debian flights because the kernel
is downloaded at runtime via a runvar. It matters once regular
(non-distro-debian) flights run with Jessie because then the d-i which
is used by the di based tests is taken from the result of
mg-debian-installer-update (e.g. for the -qcow2 flights etc).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 22 Sep 2015 09:59:40 +0000 (10:59 +0100)]
ts-freebsd-install: Use $gho->{Lvdev} instead of target_guest_lv_name
prepareguest has already assigned this so we should use it instead of
replicating (perhaps wrongly since target_guest_lv_name and
target_choose_vg can behave differently if multiple vgs are present).
$gho->{Lvdev} has been adjusted to return /dev/mapper/vg-lv paths
which is required to be able to add partition numbers since kpartx
does not create the /dev/vg/lv form.
Ian Campbell [Fri, 2 Oct 2015 09:31:13 +0000 (10:31 +0100)]
TestSupport: Use lv_dev_mapper in guest_find_lv
This has the effect of switching $gho->{Lvdev} from /dev/$VG/$LV to
/dev/mapper/$VG-$LV (where $VG and $LV have an s/-/--/ transformation
applied).
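The mapping can be sketched in shell as follows. This is an illustration, not the Perl lv_dev_mapper itself, and it assumes that every hyphen in each name is doubled.

```shell
# Sketch: map a VG and LV name to the /dev/mapper form.
lv_dev_mapper () {
    vg=$(printf '%s' "$1" | sed 's/-/--/g')   # double each hyphen in the VG name
    lv=$(printf '%s' "$2" | sed 's/-/--/g')   # and in the LV name
    printf '/dev/mapper/%s-%s\n' "$vg" "$lv"
}
```

The doubling is what keeps the single hyphen between the two names unambiguous.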
The two paths point to the same device and most call sites don't care
about the distinction. Some places which use "kpartx $dev" to Lvdev
expect to be able to append a partition number to $dev, while kpartx
only creates the /dev/mapper form, meaning such places cannot use
Lvdev. By making this switch we allow these places (such as
ts-freebsd-install) to use kpartx.
The only other place I'm aware of which requires one form or the other
is the Debian initramfs code which expects root=/dev/mapper/vg-lv and
does not accept /dev/vg/lv. However this is already handled correctly.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 22 Sep 2015 09:33:00 +0000 (10:33 +0100)]
Debian: uboot: Use di_vg_name() and lv_dev_mapper() for root=
root is not a "guest lv", so using target_guest_lv_name is misleading.
target_guest_lv_name also fails to handle the d-i vg name
correctly, which di_vg_name does. Specifically under Jessie an extra
"-vg" is appended to the hostname compared with Wheezy and Squeeze.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 14 Jul 2015 14:32:54 +0000 (15:32 +0100)]
ms-flights-summary: Produce an HTML report of all active flights
Jobs are categorised by a new ->Job field. This is added by
ts-hosts-allocate-Executive and propagated by the planner after
recent patches. It contains $flight.$job.
Jobs which do not include this are anonymous and are listed
separately, using the resource name and info field (if present) as the
job name.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 5 Oct 2015 14:10:36 +0000 (15:10 +0100)]
standalone: Use fail() from mgi-common in most places
Functional change is simply to prepend "$0: ", to change the exit
code for unknown operation and to slightly alter the error message
when no arguments are given.
A few "exit 0" and "exit $rc" remain.
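For illustration, a helper of this shape would produce the "$0: "
prefix mentioned above. This is a sketch, not a copy of mgi-common;
the exit code of 1 is an assumption.

```shell
# Sketch of a fail() helper: report on stderr, prefixed with the
# script name, then exit nonzero. The exit code 1 is assumed here.
fail() {
    echo >&2 "$0: $1"
    exit 1
}
```

A caller would then write something like `fail "unknown operation"`
instead of an ad-hoc echo followed by exit.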
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 22 May 2015 11:56:15 +0000 (12:56 +0100)]
standalone: Make it possible to pass options to run-test
Currently the remainder of the command line is passed after the host=
ident, which allows for other idents to be given, which isn't all that
useful in practice.
Instead arrange that any additional options up to a "--" marker are
passed before host= and anything after are passed after.
Since the options themselves have a leading -- this can confuse the
script's own option parsing, meaning you may need more than one "--"
marker, the first to separate the standalone helper args from the ts
args and a second to separate from any ident options.
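The splitting rule can be sketched as follows. This is a simplified
model, not the standalone script itself: the function name is
hypothetical, it handles only a single "--" marker, and it joins words
with spaces purely to show the resulting order.

```shell
# Sketch: words before "--" go before host=, words after it go after.
# compose_run_test_args is a hypothetical name used for illustration.
compose_run_test_args() {
    host=$1
    shift
    line=
    # Collect everything up to (not including) the "--" marker.
    while [ "$#" -gt 0 ] && [ "$1" != "--" ]; do
        line="$line $1"
        shift
    done
    # Drop the marker itself, if there was one.
    if [ "$#" -gt 0 ]; then
        shift
    fi
    line="$line host=$host"
    # Whatever remains is passed after host= (e.g. further idents).
    for ident in "$@"; do
        line="$line $ident"
    done
    printf '%s\n' "${line# }"
}

compose_run_test_args h1 -v -- guest=g1   # prints -v host=h1 guest=g1
```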
Ian Campbell [Fri, 25 Sep 2015 13:06:13 +0000 (14:06 +0100)]
cs-adjust-flight: Add job-status to report job stats
The return code of sg-run-job does not reflect the state of the job,
which is instead written to the database. For the benefit of running
tests in a loop until failure add a command to retrieve the status to
stdout.
Add a get-job-status command to the standalone helper script.
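A loop of the kind this is meant to support might look like the
sketch below. run_job and get_job_status are hypothetical stand-ins
for the real invocations (whose exact command lines are not given
here), and the status strings "pass" and "fail" are assumed. The stubs
simulate a job which fails on its third run.

```shell
# Hypothetical stand-ins: the real versions would run the job and
# query its status from the database.
run_job() { runs=$((runs + 1)); }
get_job_status() {
    if [ "$runs" -lt 3 ]; then echo pass; else echo fail; fi
}

loop_until_failure() {
    runs=0
    while :; do
        # The exit code of the run says nothing about the job state,
        # so ignore it and ask for the recorded status instead.
        run_job || true
        status=$(get_job_status)
        [ "$status" = pass ] || break
    done
    echo "stopped after $runs runs with status $status"
}

loop_until_failure   # prints: stopped after 3 runs with status fail
```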
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 29 Sep 2015 10:24:53 +0000 (11:24 +0100)]
production-config: Switch DebianMirrorProxy to port 3143
This is running apt-cacher-ng rather than apt-cacher (which remains on
port 3142 for the time being). It seems that apt-cacher exposes a bug
(#795284) in Jessie's version of apt.
apt-cacher-ng also seems more popular these days.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 29 Sep 2015 09:44:53 +0000 (10:44 +0100)]
Debian: Uninstall flash-kernel when creating our own boot.scr
flash-kernel will run from various kernel postinst hooks and overwrite
our own boot scripts. While this might be tolerable for the initial
installation we don't want to risk it occurring after we have created
our own boot.scr to boot xen.
dpkg --purge succeeds if the package wasn't installed.
This happened to show up with Jessie since it now supports the two
boards in our test lab while Wheezy didn't (so flash-kernel didn't
know about them and did nothing).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 30 Sep 2015 15:02:49 +0000 (16:02 +0100)]
Executive HTML output: Use #888888 (grey) for queued jobs
Either the flight hasn't started yet, or the job is blocked waiting
for other jobs to finish. In any case their state is not very
interesting.
Most usefully this change visually distinguishes, in the plan summary,
jobs which are waiting for prior jobs to finish, from ones which have
entered the planning queue.
Also replace a $f->{status} with $status, which is less confusing.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v2: Get test right.
Update commit message.
Ian Jackson [Thu, 1 Oct 2015 10:55:47 +0000 (11:55 +0100)]
ts-logs-capture: Actually do hard host reboot sometimes
The logic in try_fetch_logs for setting $ok was wrong. $ok would be
set if we reached the end of any outer (pattern) loop iteration. If
the host is actually dead all the pattern expansions would fail, but
some of the patterns are literals and do not need expansion. The
inner (logfile) loop would say `next' if the logfile fetch failed, but
that just goes onto the next logfile. So this code would always set
$ok.
Instead, set $ok to 1 when we successfully fetch any logfile or
successfully expand any pattern (even if it didn't match any files).
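The corrected logic, transliterated into a shell sketch (the real code
is Perl in ts-logs-capture). expand_pattern and fetch_logfile are
hypothetical stand-ins for the remote operations; as defined here they
simulate a dead host, so the function reports failure.

```shell
# Hypothetical stand-ins; both fail, as they would on a dead host.
expand_pattern() { return 1; }
fetch_logfile() { return 1; }

try_fetch_logs() {
    ok=0
    for pattern in "$@"; do
        case $pattern in
            *[*?[]*)
                # Needs expansion on the host; skip the pattern if
                # that fails.
                files=$(expand_pattern "$pattern") || continue
                ok=1    # expansion worked, even if nothing matched
                ;;
            *)
                files=$pattern    # literal: nothing to expand
                ;;
        esac
        while IFS= read -r f; do
            [ -n "$f" ] || continue
            fetch_logfile "$f" || continue
            ok=1        # at least one logfile really was fetched
        done <<EOF
$files
EOF
    done
    [ "$ok" -eq 1 ]
}
```

Merely reaching the end of an outer iteration no longer counts as
success, so a literal pattern whose fetch fails leaves ok at 0.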
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Wed, 30 Sep 2015 12:01:40 +0000 (13:01 +0100)]
make-flight: Trim the matrix of disk format flights
We don't need to test every combination of toolstack, architecture,
and disk format. We don't expect many architecture-specific bugs in
the per-disk-format code in the toolstack layers.
We _do_ want to test every combination of toolstack and disk format
(since the format configuration machinery is toolstack specific) and a
reasonable selection of architectures for each disk format (since
arch-specific bugs in actual underlying disk drivers are a
possibility).
The implementation strategy is for do_pv_debian_tests to select a
particular architecture for each combination of toolstack and format.
(Because the architecture is actually in an outer loop, we recalculate
that selection multiple times, and skip inner iterations for the other
architectures. This is all in bash code so the wasted computation is
not particularly important.)
We have a safety catch which spots if any architecture is entirely
untested in any of these combinations; this would happen if a new
architecture is introduced elsewhere and not added to the list. We do
not have a safety catch which spots when a (toolstack,format)
combination becomes untested due to deletion of an architecture.
(That would be more fiddly to implement without restructuring.)
We list armhf twice because we would like to do at least as many ARM
as x86 tests (particularly given our current workload and capacity).
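A much simplified sketch of this strategy follows. The lists, the
function names and the round-robin selection rule are all assumptions
made for illustration; they are not taken from make-flight.

```shell
# Illustrative lists only.
archs="armhf amd64 armhf i386"   # armhf twice, to weight towards ARM
toolstacks="xl libvirt"
formats="raw qcow2 vhd"

# Print the architecture chosen for one (toolstack, format) pair.
# Assumed rule: number the pairs in order, pick archs round-robin.
select_arch() {
    sel_index=0
    for sel_ts in $toolstacks; do
        for sel_fmt in $formats; do
            if [ "$sel_ts" = "$1" ] && [ "$sel_fmt" = "$2" ]; then
                set -- $archs
                shift $(( sel_index % $# ))
                printf '%s\n' "$1"
                return 0
            fi
            sel_index=$(( sel_index + 1 ))
        done
    done
    return 1
}

generate_jobs() {
    for arch in amd64 i386 armhf; do    # architecture is the outer loop
        tested=0
        for ts in $toolstacks; do
            for fmt in $formats; do
                # Recomputed on every outer iteration; other
                # architectures skip this combination.
                [ "$(select_arch "$ts" "$fmt")" = "$arch" ] || continue
                printf '%s %s %s\n' "$arch" "$ts" "$fmt"
                tested=1
            done
        done
        # Safety catch: an architecture nothing selects is an error.
        if [ "$tested" -eq 0 ]; then
            echo >&2 "architecture $arch untested in disk format matrix"
            return 1
        fi
    done
}

generate_jobs
```

Each (toolstack, format) pair appears exactly once in the output, and
the safety catch fires only if some architecture is never selected.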
The result is that the set of generated jobs is adjusted as follows:
Ian Jackson [Tue, 29 Sep 2015 14:25:47 +0000 (15:25 +0100)]
ts-hosts-allocate-Executive: Finish a couple of transactions
Call $equivflagscheckq->finish() at the end, rather than in the
candidate search loop. This avoids missing a `next' control path.
Call $resprop_q->finish() at all.
One of these is responsible for this message:
DBI::db=HASH(0x88e79c4)->disconnect invalidates 1 active statement
handle (either destroy statement handles or call finish on them
before disconnecting) at Osstest/Executive.pm line 708, <GEN4> line
53.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 29 Sep 2015 13:47:33 +0000 (14:47 +0100)]
ts-hosts-allocate-Executive: Do not allocate specific host with wrong blessing
If the job contains a runvar specifying a specific host, still check
that host's blessing. Otherwise bisections can run on unblessed
hosts. They should fail instead.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 29 Sep 2015 13:19:05 +0000 (14:19 +0100)]
Do not multiply console hvc0 getty entries
target_kernkind_console_inittab is supposed to edit inittab to make
sure that there is a getty running on hvc0.
However:
- It is not idempotent; if run more than once it can produce duplicate
entries `1:' and `xc:'.
- It works by copying and editing the entry `1:' but it might be that
there is already another console entry for hvc0 for some other
reason.
If we end up with multiple entries for hvc0, we can have two copies of
getty fighting, and if you manage to log in, one of them will be
fighting with your shell.
Guard the script with a grep, which looks for inittab entries
mentioning the intended console. This makes it do nothing if
nothing is needed (and therefore it also makes it idempotent).
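The shape of such a guard, as a standalone sketch. This is not the
script generated by target_kernkind_console_inittab; the function
name, the sed expression and the assumption that the `1:' entry ends
in " tty1" are illustrative only.

```shell
# Sketch: add a getty entry for a console unless one already exists.
ensure_console_getty() {
    inittab=$1
    console=$2
    # Guard: if any non-comment entry already mentions the console,
    # there is nothing to do. This is what makes reruns harmless.
    if grep -q "^[^#].*$console" "$inittab"; then
        return 0
    fi
    # Derive a new `xc:' entry from the existing `1:' entry.
    entry=$(sed -n "s/^1:\(.*\) tty1\$/xc:\1 $console/p" "$inittab")
    if [ -n "$entry" ]; then
        printf '%s\n' "$entry" >> "$inittab"
    fi
}
```

Running it twice against the same file leaves a single entry for the
console, and an entry added by some other means also satisfies the
guard.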
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Mon, 28 Sep 2015 15:29:33 +0000 (16:29 +0100)]
TestSupport: Honour $stdin fh argument to cmd, tcmd and tcmdex
These are internal functions, so the change is entirely within
TestSupport. All the call sites are adjusted to pass undef so there
is no functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
(See the comment in the new file for the explanation.)
This change affects all our Debian installs (both hosts and guests)
which are done with preseeding, because preseed_base() arranges to
install overlay/.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 24 Sep 2015 17:03:00 +0000 (18:03 +0100)]
Shell fixup: Use bash in posix mode
When bash is started as /bin/sh it runs in posix compatibility mode.
But when invoked as /bin/bash it does some things ... differently.
Most notably:
Subshells spawned to execute command substitutions inherit the
value of the -e option from the parent shell. When not in posix
mode, bash clears the -e option in such subshells.
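The difference can be observed with a small experiment, assuming a
bash binary is on PATH. The function names are made up for this
sketch; each starts a fresh bash so the caller's options are not
affected.

```shell
# Default mode: -e is cleared inside the command substitution, so
# the echo after the failing command still runs.
errexit_default_mode() {
    bash -c 'set -e; echo "[$(false; echo survived)]"'
}

# Posix mode: the substitution inherits -e and stops at `false'.
errexit_posix_mode() {
    bash -c 'set -o posix; set -e; echo "[$(false; echo survived)]"'
}

errexit_default_mode   # prints [survived]
errexit_posix_mode     # prints []
```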
It is a mystery why anyone thought the `non-posix' behaviour was
desirable. One effect in practice is that osstest's cr-daily-branch
can blunder on if one of its version fetches fails.
AFAICT the only documented way to get rid of this anomalous behaviour
is to switch bash to posix mode. I have read through the wheezy
bash(1) manpage and searched for posix, and the following behavioural
differences are described:
* Differences in interactive startup (not relevant to us).
* Minor (irrelevant) differences in which startup files are read
during noninteractive startup. (Eg, BASH_ENV not honoured.)
* Differences to the parsing of invocations of `time'
* `test a == b' may be unsupported (but it's wrong and we say `=')
* -e not inherited by some subshells (this is what I am trying to fix)
* `.' and `source' do not search the cwd.
* `set' with no arguments does not print shell functions etc.
So I think, with the previous patch, that these changes are all
desirable or at least harmless.
I have not added `set -o posix' to shell script fragments invoked by
various scripts (eg Perl and Tcl scripts). Those scripts might be
processed by bash if /bin/sh is bash, but when bash is invoked as sh
it runs in posix mode anyway.
I have done some ad-hoc testing but it seems like much of this is
difficult to test. I suggest we push it at a time when we can keep a
close eye on the behaviour.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Thu, 24 Sep 2015 15:46:44 +0000 (16:46 +0100)]
Shell fixup: Make all invocations of `.' (`source') use ./
In POSIX, `.' (the shell builtin) respects PATH, and does not search
`.' (the current directory).
Change all the invocations which refer to files which are part of
osstest to say `. ./foo' instead of simply `. foo'.
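A self-contained sketch of the preferred form, using a throwaway
directory and a made-up fragment file rather than any real osstest
file:

```shell
# Sketch: source a file from the current directory by explicit path.
source_fragment_demo() {
    dir=$(mktemp -d) || return 1
    printf 'greeting=hello\n' > "$dir/fragment"
    # Subshell, so neither the cd nor the variable leaks to the caller.
    # `. ./fragment' names the file by path, so PATH is not consulted.
    ( cd "$dir" && . ./fragment && printf '%s\n' "$greeting" )
    rc=$?
    rm -rf "$dir"
    return "$rc"
}

source_fragment_demo   # prints hello
```

With a bare `. fragment' the result would depend on PATH and on
whether the shell is in posix mode, which is the ambiguity the ./
prefix removes.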
I have checked the results of
git-grep '^[ \t]*\. [^./]'
after this patch and the remaining five hits are of no concern.
As a double-check of my hand-editing, I have also done this
perl -i~ -pe 's#^(\s*\. )\./#$1#' *
and verified that the resulting tree is almost identical to that
before this commit. There is one difference, where the original
code already said `. ./job'.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Campbell [Fri, 25 Sep 2015 10:21:08 +0000 (11:21 +0100)]
Do not run baseline tests on osstest branch.
If an osstest instance is running flights with
OSSTEST_BASELINES_ONLY=y (e.g. on a secondary site, like the Citrix
Cambridge instance) then it will also be running a regular osstest
flight to merge from the upstream osstest and does not want to also do
baseline testing.
I expect this will be the normal configuration in all sites other than
the master colo production instance, so arrange for it via
branch-settings.osstest rather than some site local configuration file.
In the colo production instance we never set OSSTEST_BASELINES_ONLY=y,
so this has no effect. If we did set that option I think suppressing it
for the osstest branch would still be correct.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>