In
    dm restrict audit: install newer chiark-scripts for fishdescriptor
arrangements were made to install suitable chiark-scripts for
jessie and stretch.
For buster and later, the mainline Debian version of chiark-scripts is
indeed sufficient, but nothing installed it. Do that.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 5 Apr 2019 13:55:53 +0000 (14:55 +0100)]
dm restrict audit: actually install fishdescriptor in host
In
    dm restrict audit: install newer chiark-scripts for fishdescriptor
arrangements were made to install a backport of chiark-scripts, but
the code was mistakenly placed in preseed_create_guest, whereas of
course it is needed in the host.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 5 Apr 2019 13:04:26 +0000 (14:04 +0100)]
dm restrict audit: actually install right package for fishdescriptor
In
    dm restrict audit: install newer chiark-scripts for fishdescriptor
a locally-provided chiark-scripts_6.0.2_all.deb was installed for
jessie. For stretch a backport was installed, but mistakenly
of chiark-utils-bin rather than chiark-scripts.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 5 Apr 2019 12:31:07 +0000 (13:31 +0100)]
Debian: set partman-lvm/device_remove_lvm_span
Web searching[1] suggests that this suppresses this error:
    !! ERROR: Unable to automatically remove LVM data
    Because the volume group(s) on the selected device also consist of
    physical volumes on other devices, it is not considered safe to
    remove its LVM data automatically. If you wish to use this device
    for partitioning, please remove its LVM data first.
It doesn't, though. I am only experiencing it now because the ad-hoc
disk-erasing (25erase-other-disks) is broken for other reasons. But
let's have it anyway as it sounds like a thing we might want.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 5 Apr 2019 12:24:52 +0000 (13:24 +0100)]
Debian: partman scripts: Run right away too
We are switching the installation of these to partman/early_command
which runs as a result of a /lib/partman/init.d hook. That means that
things we install don't get picked up, so run them right away (too).
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 2 Apr 2019 15:24:08 +0000 (16:24 +0100)]
preseed_base: chmod ssh host private keys to placate sshd
Otherwise:
    Could not load host key: /etc/ssh/ssh_host_ecdsa_key
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    @ WARNING: UNPROTECTED PRIVATE KEY FILE! @
    @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
    Permissions 0640 for '/etc/ssh/ssh_host_ed25519_key' are too open.
    It is recommended that your private key files are NOT accessible by others.
    This private key will be ignored.
    key_load_private: bad permissions
    Could not load host key: /etc/ssh/ssh_host_ed25519_key
This seems to start happening with stretch. Presumably stretch is
more annoyingly picky than jessie.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 30 Nov 2018 16:42:26 +0000 (16:42 +0000)]
TestSupport: target_somefile_leaf rename and change a variable
Rename this function. `getleaf' contains `get' which makes it sound
like the function copies something, or returns answers suitable for
getting, or something.
Also rename `$rdest' to `$rfile' since it might be a source too.
(Although we are not about to make it a source...)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Julien Grall [Fri, 30 Nov 2018 15:23:42 +0000 (15:23 +0000)]
ts-xen-build: Enable ITS driver in Xen
The ITS driver was added in Xen 4.10 as a technical preview feature.
However, it is required in order to boot Xen on Thunder-X because
PCI devices don't support legacy interrupts.
Ian Jackson [Thu, 29 Nov 2018 16:00:56 +0000 (16:00 +0000)]
persistent-net: Include initramfs script to copy to target
This is the piece which actually copies the installer's network names
to the target. It should not appear on the installed system, so it's
not in overlay-persistent-net.
Technically this is only useful when the installer has the
overlay-persistent-net in it, which is done only in ts-host-install
and not in all the places where setup_netboot_firstboot is used.
But without overlay-persistent-net it is harmless, and it is most
convenient to put it here.
The little script fragment was copied out of a jessie debian-installer
initramfs environment.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 27 Nov 2018 14:33:41 +0000 (14:33 +0000)]
persistent-net: Add overlay in installer >= stretch
We are going to need this in the installer so that the interface names
from the installer environment are captured so that they can be the
same on the host.
This prepares the ground for turning off net.ifnames. The actual
rules are gated on net.ifnames so right now there is no change.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 27 Nov 2018 14:18:38 +0000 (14:18 +0000)]
overlay-persistent-net: Copy from jessie
These were copied from a system running Debian jessie.
The nontrivial files are:
# Copyright (C) 2006 Marco d'Itri <md@Linux.IT>
# Copyright (C) 2007 Kay Sievers <kay.sievers@vrfy.org>
and licenced GPLv2+. That is compatible with osstest's AGPLv3+.
Right now we do nothing with these.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 27 Nov 2018 16:50:26 +0000 (16:50 +0000)]
contents_make_cpio: Include symlinks
We are going to introduce some symlinks into one of our preprepared
overlays. We must therefore arrange to copy them as appropriate.
The syntax `-type f,l' is an extension in GNU find. If this causes
trouble in the future we will then have to introduce the obvious
circumlocution involving ( ).
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Tue, 8 May 2018 15:51:39 +0000 (16:51 +0100)]
ts-kernel-build: disable host1x, which doesn't build
Empirically, on stretch armhf:
    drivers/gpu/host1x/cdma.c: In function `host1x_pushbuffer_init':
    drivers/gpu/host1x/cdma.c:94:48: error: passing argument 3 of `dma_alloc_wc' from incompatible pointer type [-Werror=incompatible-pointer-types]
    pb->mapped = dma_alloc_wc(host1x->dev, size, &pb->phys,
    ^
    etc.
This is blocking the upgrade of the Xen Project CI to Debian stretch
so disable it for now.
Wei Liu [Wed, 9 May 2018 08:59:37 +0000 (09:59 +0100)]
Drop rumprun tests
These have been failing for some time and it no longer looks like
this will be an attractive route to stub device models. (At
least two Xen downstream projects are using Linux-based stub device
models.)
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v4: Expand commit message.
Signed-off-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v4: Correct suite name capitalisation in commit message and comment.
Wei Liu [Thu, 2 Nov 2017 15:15:04 +0000 (15:15 +0000)]
make-flight: guest should use jessie to test pvgrub
stretch has the 64bit feature enabled for ext4, which pvgrub can't cope with.
We want to continue to test pvgrub, so specify jessie in the guest
suite field.
A consequence is that this test will test jessie forever. Eventually
jessie will rot so badly that this test fails and then we will no
longer be testing pvgrub1. Hopefully by then no-one will be using it.
CC: Juergen Gross <jgross@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v4: Adjust commit message slightly.
Wei Liu [Fri, 20 Oct 2017 14:35:06 +0000 (15:35 +0100)]
ts-guests-nbd-mirror: make it work with stretch
On the server side, only add oldstyle= and port= on wheezy and jessie.
stretch doesn't support or need those anymore.
On the client side, generate new style configuration file.
Reorder nbd-client setup a bit. Install it first, then write our own
configuration file, then start it. This stops dpkg asking what to
do regarding configuration files.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v3: invert some tests, rearrange client setup code.
v4: Fix commit message grammar.
Wei Liu [Tue, 17 Oct 2017 16:10:10 +0000 (17:10 +0100)]
ts-debian-fixup: merge origin extra= to our own if necessary
The original extra= was not removed, so there were two extra= in the
resulting config file.
It wasn't a problem for xl because the second extra= took precedence.
However libvirt tests would only pick up the first extra= so they
worked by chance.
Fix this issue by merging the original. If there isn't already extra=
in $cfg, use our own.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
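
The merge described in this commit can be sketched roughly as below. This is an illustrative Python sketch, not the Perl in ts-debian-fixup: the function name `merge_extra`, the simplified quoting rules, and the choice to append our options after the original ones are all assumptions made for the example.

```python
import re

# Matches a whole `extra = '...'` or `extra = "..."` config line.
# The quoting rules are deliberately simplified (assumption).
EXTRA_RE = re.compile(r"""^\s*extra\s*=\s*(['"])(.*)\1\s*$""")


def merge_extra(cfg, ours):
    """Return cfg with exactly one extra= line that includes `ours`.

    Hypothetical helper: any existing extra= values are kept, in order,
    and `ours` is appended after them. If cfg has no extra= line, the
    result simply gains one containing `ours`.
    """
    kept = []
    originals = []
    for line in cfg.splitlines():
        match = EXTRA_RE.match(line)
        if match:
            originals.append(match.group(2))
        else:
            kept.append(line)
    merged = " ".join(part for part in originals + [ours] if part)
    kept.append(f"extra = '{merged}'")
    return "\n".join(kept) + "\n"
```

With a single extra= line left in the output, xl and libvirt have nothing to disagree about regarding which occurrence takes precedence. The sketch does not handle a single quote inside `ours`.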
---
v3: handle situation when no extra= is in $cfg
Ian Jackson [Tue, 2 Apr 2019 14:56:57 +0000 (15:56 +0100)]
power: Fix uninitialised variable warning
In
    power: Record approach used for power cycles in runvars
we introduced a reference to $r{$rv} which might be undef,
resulting in this:
Use of uninitialized value in concatenation (.) or string at Osstest/TestSupport.pm line 1069.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Mon, 25 Feb 2019 15:46:52 +0000 (15:46 +0000)]
jessie: Disable use of security.debian.org
We have about a 10% failure rate from a problem whose symptom is
that the test box fails to fetch something from security.debian.org.
The apt-cacher-ng logs show the relevant test box's ip address
fetching the file that it is supposed to. But, it is possible that
there are different timeouts, so that does not mean the problem is
inside the colo.
Fetches from other apt sources, notably the main Debian archive and
snapshot.d.o, do not seem to be affected.
Specifically, I searched the logs for the last 1000 host install steps,
and looked for the failures, with the following rune:
select flight,job,logfile,started from (select *, (select val from runvars r where r.job=steps.job and r.flight=steps.flight and r.name='host') from steps where testid like 'host-install%' and flight>130000 order by finished desc limit 1000) sub where status='broken';
I then used these runes to correlate that with the syslogs from the
installer:
perl <~/t -ne 'use strict; s/^ *//; my ($flight,$job,$logf) = split / +\| +/; next unless $flight =~ m/^\d+$/; my $f = "$flight/$job/?.ts-syslog-server.log"; my @y = glob $f; print $_, "\n" for @y;' >~/u
xargs <~/junk/u egrep -L 'Failed to fetch http://security\.debian\.org.*Connection failed'
The only logs which did not mention that error message were three
failed jobs on the same host, joubertin1, which seems not to be
rebooting reliably.
So I think this is a problem with the security.debian.org CDN.
For now, disable security updates entirely. We don't really care
about the security patch status of test boxes anyway. Hopefully this
will cause the system to become reliable again.
CC: Juergen Gross <jgross@suse.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
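
For readers who find the perl rune above dense, here is a rough Python rendering of its parsing step. It is only a sketch: the function name `syslog_patterns` is made up for illustration, and it assumes the input is psql's default aligned output with `|` column separators, as the rune itself does.

```python
import re


def syslog_patterns(psql_output):
    """Turn psql result rows into glob patterns for installer syslogs.

    Mirrors the perl rune: strip leading spaces, split each row on
    ` | `, skip anything whose first column is not a flight number,
    and build "<flight>/<job>/?.ts-syslog-server.log".
    """
    patterns = []
    for line in psql_output.splitlines():
        fields = re.split(r" +\| +", line.lstrip(" "))
        # Header, separator and "(N rows)" lines do not start with digits.
        if len(fields) < 2 or not re.fullmatch(r"\d+", fields[0]):
            continue
        flight, job = fields[0], fields[1]
        patterns.append(f"{flight}/{job}/?.ts-syslog-server.log")
    return patterns
```

Each returned pattern could then be expanded with `glob.glob` from the logs directory, which corresponds to the `glob $f` call in the rune.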
Ian Jackson [Wed, 13 Feb 2019 17:07:23 +0000 (17:07 +0000)]
backports snapshot: Use 20190206T211314Z for jessie-backports
Some time on 2019-02-07, Debian removed linux-base from
jessie-backports. This caused everything to break: apt wasn't happy
to get linux-base from jessie-security (because of our -t
jessie-backports, probably) and that meant there was no linux-base
suitable for linux-image-4.9.x on arm64. We ended up trying to
boot the installed system with 3.16, which does not work on our two
SoftIron arm64 test boxes.
Also, jessie-backports is about to be completely removed.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 25 Jan 2019 12:04:21 +0000 (12:04 +0000)]
power: ssh: Wait for the target to appear to go down
When we `power on' with the ssh method, we actually run ssh reboot.
On some systems (notably, FreeBSD) the kernel does not simply reboot
immediately even with the runes we provide here, ie for FreeBSD
reboot -nq
Eg, I have seen reboots with several messages like this:
Jan 25 14:17:59.100044 Waiting (max 60 seconds) for system thread `bufspacedaemon-2' to stop... done
This can result in the ssh method failing spuriously, because the
`power on' appears to complete while the host is still up in the
previous environment. In one of my test runs I saw an ssh to the host
succeed, and print the uptime (of the existing environment), between
the reboot command being issued and the host actually rebooting.
So, wait (up to just over a minute) until the host does not respond to
ping. (target_await_down runs ping -c 5.)
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
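
The wait-for-down step can be pictured with the following minimal Python sketch. It is not target_await_down: the name `await_down`, the 65-second default (standing in for "just over a minute") and the injectable `is_up` probe (which in practice might wrap something like `ping -c 5`) are assumptions for illustration.

```python
import time


def await_down(is_up, timeout=65.0, interval=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Poll until the host stops responding.

    is_up    -- hypothetical callable returning True while the host
                still answers (for example, a ping wrapper)
    Returns True as soon as is_up() reports False, or False if the
    timeout expires while the host is still answering.
    """
    deadline = clock() + timeout
    while True:
        if not is_up():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Only once this returns True would the caller start waiting for the host to come back, so a response from the old environment cannot be mistaken for a completed reboot.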
Ian Jackson [Thu, 24 Jan 2019 18:25:24 +0000 (18:25 +0000)]
power: ts-freebsd-host-install: Use power_reboot_attempts
We look at the installer environment uptime, to
| check that this is the installer environment
as requested by the comment
| in particular $await must only succeed if the host really did
| reboot into the boot environment that $await expects.
near the top of power_reboot_attempts
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 25 Jan 2019 12:08:41 +0000 (12:08 +0000)]
power: ssh: Fix handling of $delay
The script fragment contains a reference to $delay which is a perl
variable, not a variable in the script fragment. We therefore need to
not ''-quote the script.
Without this, the ssh method will often fail spuriously: the exiting
parent (which will signal success back to the osstest controller)
races with the attempt to reboot.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
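
The quoting problem has a direct analogue in other languages: the delay must be substituted on the controller before the script text is sent, otherwise the remote shell sees a literal, unset `$delay`. The Python sketch below illustrates that idea only. The script shape (a backgrounded `sleep` followed by the FreeBSD `reboot -nq` rune) and the name `remote_reboot_script` are hypothetical, not the actual fragment used by the ssh method.

```python
from string import Template

# Hypothetical remote script: the parent returns at once (signalling
# success to the controller) while a background child waits and reboots.
SCRIPT = Template("( sleep $delay; reboot -nq ) >/dev/null 2>&1 &")


def remote_reboot_script(delay):
    """Return the script text with the delay filled in locally."""
    return SCRIPT.substitute(delay=int(delay))
```

If the placeholder were sent unsubstituted, the remote shell would expand it to the empty string, which removes the intended pause between the parent exiting and the reboot starting.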
Ian Jackson [Tue, 22 Jan 2019 17:58:54 +0000 (17:58 +0000)]
power: power_reboot_attempts: Honour an $approach_re
The semantics are slightly different here: not specifying it means to
try everything rather than only the hardest. But the effect is
similar: not specifying $approach_re means we must succeed.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
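
The selection rule can be sketched as follows. This is an illustrative Python sketch rather than the Perl in power_reboot_attempts; the function name `select_approaches` and any approach names other than `Only` are made up for the example.

```python
import re


def select_approaches(approaches, approach_re=None):
    """Return the approach names to attempt, in their original order.

    With no approach_re, every approach is tried. Otherwise only the
    approaches whose name matches the regular expression are tried.
    """
    if approach_re is None:
        return list(approaches)
    pattern = re.compile(approach_re)
    return [name for name in approaches if pattern.search(name)]
```

For example, `select_approaches(["Soft", "Hard"], r"^Soft$")` would restrict the attempt to the hypothetical `Soft` approach, while omitting the pattern tries both in order.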
Ian Jackson [Tue, 22 Jan 2019 14:44:52 +0000 (14:44 +0000)]
power: Provide `ssh' power method
This is not really a power method but it can pretend to be one. On
power off, it does nothing. On power on it logs into the host to ask
it to do a hard reboot.
This is rather best effort, but it is eminently suitable for our new
approach/attempts arrangements because those will try another approach
if ssh didn't work.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 22 Jan 2019 15:50:00 +0000 (15:50 +0000)]
power: New ILOM/PDU arrangements - try just IPMI
We honour two new host properties PowerPDU and PowerILOM, in
preference to PowerMethod. The semantics are going to be properly
documented in a later patch, but, briefly:
If only one of these is supplied, it works like PowerMethod, except
that `nest' is applied by default.
If both are supplied, we make two approaches: one is just ILOM. The
other is to use ILOM and PDU together, with a pause in between, and
with try_off applied to ILOM.
The current configuration in Massachusetts is, for hosts with IPMI, to
provide a PowerMethod specifying ad hoc to use PDU and then IPMI, and
also to provide both PowerPDU and PowerILOM.
The overall result of this patch, with that configuration, is to avoid
using the PDU at all if an IPMI-requested reboot is successful. This
should significantly reduce the number of hard power cycles for hosts
with IPMI.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 22 Jan 2019 17:01:02 +0000 (17:01 +0000)]
power: Provide `try_off' pdu method; deprecate ipmi_try
We are going to want to use this magically, in our new approach. Make
a general version, and deprecate ipmi_try (which will be obsoleted by
the new approach and which has probably not been used very much).
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
This new variable contains a list of different approaches to try.
* Move the meat of power_state into power_approach_invoke.
* power_state now looks for a single approach to try.
* The default for power_state is to pick the last approach in
the list, which by definition is supposed to be the most reliable.
* Currently there will only be one approach, `Only'.
No overall functional change other than to log messages.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
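
The default described in the bullets above amounts to "take the last entry". A minimal Python sketch, assuming each approach is represented as a (name, methods) pair; the representation and the name `pick_approach` are hypothetical, not the structure osstest uses:

```python
def pick_approach(approaches, name=None):
    """Choose the single approach that power_state-like code would run.

    approaches -- ordered list of (name, methods) pairs; by convention
                  the last one is the most reliable
    name       -- optional approach name to select explicitly
    """
    if not approaches:
        raise ValueError("no power approaches configured")
    if name is None:
        return approaches[-1]
    for approach in approaches:
        if approach[0] == name:
            return approach
    raise KeyError(name)
```

With the single `Only` approach mentioned above, the default and the explicit selection return the same entry, which is why there is no functional change yet.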
Ian Jackson [Thu, 17 Jan 2019 15:20:11 +0000 (15:20 +0000)]
line wrapping: Use tmp/$flight.report in two extra places
The output from sg-report-flight might in principle contain long
lines, although this is not expected. So we are going to want to feed
it through the new cr-fold-long-lines.
Rather than piping, we are going to keep a copy of the .report file,
as is done in mg-execute-flight. So for now, just make that change.
No overall change other than to leave behind the tmp/$flight.report
file. It will be tidied up by the usual cleanup processes.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Thu, 17 Jan 2019 15:17:22 +0000 (15:17 +0000)]
line wrapping: Provide cr-fold-long-lines script
This is a reversible transformation which usually just introduces a \
where it splits lines.
We are going to use this to wrap the lines in our emails. SMTP has a
999-byte length limit (including a CR-LF pair). This can cause our
emails to go astray. We don't really want our messages to be q-p or
base64-encoded if we can avoid it, and MTAs don't do that anyway (so
we would have to organise it). So instead, we will simply wrap any
long lines that occur.
This transformation is not suitable for headers, but we don't intend
or want to generate long lines which would need further wrapping. (A
reversible transformation suitable for headers would be quite ugly and
would only be right for a subset of headers anyway...)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
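
One way to get a reversible fold of this kind is sketched below in Python. This is not the algorithm in cr-fold-long-lines, whose exact rules are not given here; it is one possible scheme under stated assumptions: widths are counted in characters (so ASCII input is assumed when comparing against a byte limit), and a line that already ends in a backslash is protected by adding a continuation backslash followed by an empty line. The names `fold_line` and `unfold_lines` are made up for the example.

```python
def fold_line(line, width):
    """Split one line into pieces of at most `width` characters.

    Every piece except the last ends with a continuation backslash.
    A short line with no trailing backslash is returned unchanged.
    """
    if width < 2:
        raise ValueError("width must be at least 2")
    if len(line) <= width and not line.endswith("\\"):
        return [line]
    room = width - 1  # leave space for the continuation backslash
    pieces = [line[i:i + room] for i in range(0, len(line), room)]
    if pieces[-1].endswith("\\"):
        # Keep the final output line free of a trailing backslash so
        # that unfolding knows where the logical line ends.
        pieces.append("")
    return [piece + "\\" for piece in pieces[:-1]] + [pieces[-1]]


def unfold_lines(lines):
    """Invert fold_line over a sequence of folded lines."""
    result = []
    buffered = ""
    pending = False
    for line in lines:
        if line.endswith("\\"):
            buffered += line[:-1]
            pending = True
        else:
            result.append(buffered + line)
            buffered = ""
            pending = False
    if pending:
        raise ValueError("input ends with a dangling continuation")
    return result
```

Because every folded line ends with a piece that has no trailing backslash, the outputs of several `fold_line` calls can be concatenated and still unfolded unambiguously.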
Ian Jackson [Wed, 16 Jan 2019 11:32:06 +0000 (11:32 +0000)]
ts-livepatch-run: Fix erroneous $$ in double-check
The doubled $s here are simply a mistake. The result is to make this
test ineffective, since `$$file' means `the value of the variable
whose name is in the variable $file', which here will never exist.
This produces a `Use of uninitialized value' warning and substitutes
the empty string, so overall we test the existence of the directory.
The missing check is not of much consequence since this check is not
really expected ever to fail, and if it does, some actual test
execution would fail due to the missing file.
So overall I think the only change is to log output.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Ian Jackson [Wed, 16 Jan 2019 11:30:49 +0000 (11:30 +0000)]
ts-livepatch-run: Print a message about expected failures
target_cmd_output_root_status prints the command exit status. If that
was a failure and the failure was as expected, this can be confusing
to readers who do not know that this is a possibility. So print a
message about it.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>