xenbits.xensource.com Git - osstest.git/log
osstest.git
4 years agots-host-reuse: tolerate unremoveable lv
Ian Jackson [Fri, 17 Nov 2017 14:05:34 +0000 (14:05 +0000)]
ts-host-reuse: tolerate unremoveable lv

It might be a symlink in the pair tests.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agots-host-reuse: New script, to do reuse state changes
Ian Jackson [Tue, 21 May 2019 16:06:50 +0000 (17:06 +0100)]
ts-host-reuse: New script, to do reuse state changes

This will be made part of the test job recipes.

We calculate the sharing scope (sharetype) by reference to a lot of
runvars, etc.

This version of the script is rather far from the finished working
one, but it seems better to preserve the actual history for how it got
the way it is.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agots-hosts-allocate-Executive: Better message for hosts abandoned mid-test
Ian Jackson [Mon, 6 Nov 2017 17:23:34 +0000 (17:23 +0000)]
ts-hosts-allocate-Executive: Better message for hosts abandoned mid-test

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoresource reporting, nfc: Break out report_rogue_task_description
Ian Jackson [Fri, 28 Aug 2020 15:53:18 +0000 (16:53 +0100)]
resource reporting, nfc: Break out report_rogue_task_description

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoresource reporting: Print username when listing "rogue tasks"
Ian Jackson [Fri, 28 Aug 2020 15:45:53 +0000 (16:45 +0100)]
resource reporting: Print username when listing "rogue tasks"

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoplan_search: Track last sharing state to determine $share_reuse
Ian Jackson [Wed, 8 Nov 2017 16:29:07 +0000 (16:29 +0000)]
plan_search: Track last sharing state to determine $share_reuse

What matters for the purpose of $share_reuse is not whether the host
is actually being _shared_ (ie, there are other concurrent allocations
and therefore a concurrent Event with Share information).  What we
really want to know is whether the *last* use of this host was a
suitable sharing setup - because we actually want to know if we will
be able to skip our setup.

So track that explicitly.  (The slightly odd structure, where there
are two loops in one, means that we reset $last_eshare when we go onto
the next $req ie the next host to check.)

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoplan search: Move $share_compat_ok further up the file
Ian Jackson [Wed, 8 Nov 2017 16:43:34 +0000 (16:43 +0000)]
plan search: Move $share_compat_ok further up the file

We are going to want to use this outside the loop.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoplan_search: Use plan's Wear information rather than tracking it ourselves
Ian Jackson [Wed, 8 Nov 2017 16:39:37 +0000 (16:39 +0000)]
plan_search: Use plan's Wear information rather than tracking it ourselves

There is no reason not to use this information from the plan.
Not computing it ourselves saves some confusing logic here.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoplan_search: Improve debugging of $share_compat_ok->()
Ian Jackson [Wed, 8 Nov 2017 16:36:07 +0000 (16:36 +0000)]
plan_search: Improve debugging of $share_compat_ok->()

No change other than to debugging output.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoplan_search: Break out $share_compat_ok
Ian Jackson [Wed, 8 Nov 2017 16:16:29 +0000 (16:16 +0000)]
plan_search: Break out $share_compat_ok

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost allocation: *_shared_mark_ready: Only prod when $newstate is ready
Ian Jackson [Mon, 30 Oct 2017 17:25:43 +0000 (17:25 +0000)]
host allocation: *_shared_mark_ready: Only prod when $newstate is ready

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost allocation: Support new reuse-* magic hostflag
Ian Jackson [Mon, 30 Oct 2017 16:33:50 +0000 (16:33 +0000)]
host allocation: Support new reuse-* magic hostflag

This is like share-* except it has different MaxTasks and MaxWear
parameters.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost allocation: *_shared_mark_ready: allow alternative $oldtypes
Ian Jackson [Fri, 27 Oct 2017 17:23:41 +0000 (18:23 +0100)]
host allocation: *_shared_mark_ready: allow alternative $oldtypes

$oldtype may now be a hashref, where keys mapping to truthy values are
permitted for the sharetype precondition.

No functional change for existing callers.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
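
Illustrative sketch only (osstest itself is Perl; the function name and
example sharetypes here are hypothetical): a precondition that accepts
either a single old sharetype or a mapping whose truthy keys are all
permitted.

```python
def sharetype_allowed(oldtype, current):
    """Return True if `current` satisfies the `oldtype` precondition.

    `oldtype` is either one permitted sharetype (a string) or a dict
    whose keys mapping to truthy values are the permitted sharetypes.
    """
    if isinstance(oldtype, dict):
        return bool(oldtype.get(current))
    # A plain string keeps the old single-value behaviour.
    return oldtype == current
```

Existing callers that pass a plain string see the same behaviour as
before, which matches the "no functional change" note above.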
4 years agohost allocation: selecthost: allow sort-of-selection of prospective hosts
Ian Jackson [Fri, 27 Oct 2017 16:52:49 +0000 (17:52 +0100)]
host allocation: selecthost: allow sort-of-selection of prospective hosts

If one passes a trueish value for $prospective, selecthost does not
worry about whether any host has actually been selected.  It does a
limited amount of prep work.

This will be useful if we want to know some of the non-host-specific
information selecthost computes - in particular, $ho->{Suite} etc.

No functional change with existing callers.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost allocation: *_shared_mark_ready: Make $sharetype check optional
Ian Jackson [Fri, 27 Oct 2017 14:42:39 +0000 (15:42 +0100)]
host allocation: *_shared_mark_ready: Make $sharetype check optional

We are going to want to be able to set shares to other than ready,
without double-checking the sharetype.

The change to the UPDATE statement makes no difference because
resource_check_allocated_core has just got that sharetype out of the
db.  (This does remove one safety check against bugs, sadly.)

No functional change for existing callers.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost allocation: *_shared_mark_ready: Allow other states
Ian Jackson [Fri, 27 Oct 2017 14:41:31 +0000 (15:41 +0100)]
host allocation: *_shared_mark_ready: Allow other states

Generalise these functions so they can set the state to something
other than `ready', and so that they can expect a state other than
`prep'.

No functional change with existing callers.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agodb_retry: Make the sleeps random and increasing
Ian Jackson [Tue, 21 Nov 2017 17:18:09 +0000 (17:18 +0000)]
db_retry: Make the sleeps random and increasing

When there's a thundering herd, this can run out of retries.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
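
A minimal sketch of the idea, in Python rather than osstest's Perl. The
names, default delays and retry count are assumptions, not the values
db_retry uses: each retry sleeps for a random time whose upper bound
grows with the attempt number, so clients that failed together do not
all retry together.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       sleep=time.sleep):
    """Call operation() until it succeeds or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts:
                raise
            # Random sleep, with an upper bound that increases per attempt.
            sleep(random.uniform(0, base_delay * attempt))
```

The `sleep` parameter exists only so that the delays can be observed or
stubbed out when testing.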
4 years agosg-run-job: Use +! in per-host-ts implementation
Ian Jackson [Wed, 22 May 2019 15:34:11 +0000 (16:34 +0100)]
sg-run-job: Use +! in per-host-ts implementation

This makes this slightly clearer, even more so in a moment.

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-run-job: support +! for *only* adding things to TESTID
Ian Jackson [Tue, 21 May 2019 15:43:51 +0000 (16:43 +0100)]
sg-run-job: support +! for *only* adding things to TESTID

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agots-hosts-allocate-Executive: Fix handling of failed preps for same sharing
Ian Jackson [Fri, 3 Nov 2017 17:40:42 +0000 (17:40 +0000)]
ts-hosts-allocate-Executive: Fix handling of failed preps for same sharing

This code was previously unreachable.  It ought to be executed when
all the shares are allocatable or prep: in that case, we can unshare
and re-share the host.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost allocation: Executive: Honour $xparams{InfraPriority}
Ian Jackson [Fri, 17 Nov 2017 15:33:02 +0000 (15:33 +0000)]
host allocation: Executive: Honour $xparams{InfraPriority}

And pass it to ms-queuedaemon.  No functional change with existing
callers since no-one sets this yet.

Forthcoming test host sharing machinery uses this.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost allocation: Remove some unnecessary definedness tests
Ian Jackson [Fri, 17 Nov 2017 15:31:41 +0000 (15:31 +0000)]
host allocation: Remove some unnecessary definedness tests

$set_info->() already checks for undef, and returns immediately in

that case.  So there is no point checking at the call site.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoDebian: osstest-erase-other-disks: Slightly guard against races
Ian Jackson [Mon, 24 Aug 2020 17:54:18 +0000 (18:54 +0100)]
Debian: osstest-erase-other-disks: Slightly guard against races

Apparently it can happen that something decides to rescan a partition
table, removing a partition block device, while it is being zeroed:

 osstest-erase-other-disks-6081: hd devices present after: /dev/hd*
 osstest-erase-other-disks-6081: Erasing /dev/sda
 osstest-erase-other-disks-6081: Erasing /dev/sda1
 osstest-erase-other-disks-6081: /dev/sda1 is no longer a block device!

To try to narrow the window during which this race occurs, do not care
if the thing we just zeroed no longer exists after we zeroed it.

We still bomb out if it exists but is not a block device - that would
probably mean we had written it out as a file.

This is all quite unfortunate.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
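
The post-erase check described above might look roughly like this. It
is a Python sketch under assumptions; the real script is a shell
program and the function name is hypothetical. A device that has
vanished is tolerated, while a path that exists but is not a block
device is still an error.

```python
import os
import stat


def check_after_erase(path):
    """Return 'gone' or 'block'; raise if path is some other file type."""
    try:
        mode = os.stat(path).st_mode
    except FileNotFoundError:
        # The partition device disappeared while we were zeroing it
        # (for example after a partition table rescan): do not care.
        return "gone"
    if not stat.S_ISBLK(mode):
        # Probably means we wrote the zeroes out as an ordinary file.
        raise RuntimeError(f"{path} is no longer a block device!")
    return "block"
```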
4 years agoabolish "kernkind"; desupport non-pvops kernels
Ian Jackson [Tue, 25 Aug 2020 11:02:13 +0000 (12:02 +0100)]
abolish "kernkind"; desupport non-pvops kernels

This was for distinguishing the old-style Xenolinux kernels from pvops
kernels.

We have not actually tested any non-pvops kernels for a very very long
time.  Delete this now because the runvar is slightly in the way of
test host reuse.

(Sorry for the wide CC but it seems better to make sure anyone who
might object can do so.)

All this machinery exists just to configure the guest console
device (Xenolinux used "xvc" rather than "hvc") and the guest root
block device (Xenolinux stole "hda"/"sda" rather than using "xvda").

Specifically, in this commit:
 * In what is now target_setup_rootdev_console_inittab, do not
   look at any kernkind runvar and simply do what we would if
   it were "pvops" or unset, as it is in all current jobs.
 * Remove the runvar from all jobs creation and example runes.
   (This has no functional change even for jobs running with
   the previous osstest code because we have defaulted to "pvops"
   for a very long time.)

We retain the setting of the shell variable "kernbuild", because that
ends up in build jobs' names.  All our kernel build jobs now end in
-pvops and I intend to retain that name component since abolishing it
is nontrivial.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
CC: Boris Ostrovsky <boris.ostrovsky@oracle.com>
CC: Juergen Gross <jgross@suse.com>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wei.liu@kernel.org>
CC: Paul Durrant <paul@xen.org>
CC: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
CC: Andrew Cooper <Andrew.Cooper3@citrix.com>
CC: Olivier Lambert <olivier.lambert@vates.fr>
4 years agotarget setup refactoring: Add a doc comment
Ian Jackson [Tue, 25 Aug 2020 11:08:42 +0000 (12:08 +0100)]
target setup refactoring: Add a doc comment

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agotarget setup refactoring: Merge target_kernkind_*
Ian Jackson [Tue, 25 Aug 2020 11:00:47 +0000 (12:00 +0100)]
target setup refactoring: Merge target_kernkind_*

Combine these two functions.  Rename them to a name which doesn't
mention "kernkind".

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agotarget setup refactoring: Move target_kernkind_console_inittab
Ian Jackson [Tue, 25 Aug 2020 10:51:27 +0000 (11:51 +0100)]
target setup refactoring: Move target_kernkind_console_inittab

We move this earlier.  This is OK because it depends only on the
console runvar (inside the sub; this is set by target_kernkind_check),
$ho and $gho (which are set by this point); and $mountpoint (which is
set by access()).

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agotarget setup refactoring: Move target_kernkind_check
Ian Jackson [Tue, 25 Aug 2020 10:49:08 +0000 (11:49 +0100)]
target setup refactoring: Move target_kernkind_check

This is OK because nothing in access() looks at the rootdev or console
runvars, which are what target_kernkind_check sets.

No functional change other than perhaps to log output.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agocr-publish-flight-logs: Fix abs_time calls
Ian Jackson [Mon, 24 Aug 2020 11:00:16 +0000 (12:00 +0100)]
cr-publish-flight-logs: Fix abs_time calls

There was a missing space in these messages, since they were
introduced in 31b7cae19fe1
  timing traces: cr-publish-flight-logs: Report more progress

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohost reuse: ms-planner: Abbreviate reporting of test shares
Ian Jackson [Fri, 4 Sep 2020 20:58:51 +0000 (21:58 +0100)]
host reuse: ms-planner: Abbreviate reporting of test shares

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohost reuse: ms-planner: Do not show reuse as shared in the plan
Ian Jackson [Mon, 30 Oct 2017 16:52:24 +0000 (16:52 +0000)]
host reuse: ms-planner: Do not show reuse as shared in the plan

If the number of shares is 1, do not show it as shared, and also
ignore the Unshare events.

This clarifies the display, especially when used with forthcoming test
host reuse work.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agohost reuse: ms-planner: Bring some variables forward
Ian Jackson [Mon, 30 Oct 2017 16:50:20 +0000 (16:50 +0000)]
host reuse: ms-planner: Bring some variables forward

Move the scope of $share earlier in cmd_show_html, and also introduce
$shared in the colour computation.  This makes the next changes easier.

No functional change.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agots-hosts-allocate-Executive: Add a comment about a warning
Ian Jackson [Fri, 3 Nov 2017 17:40:30 +0000 (17:40 +0000)]
ts-hosts-allocate-Executive: Add a comment about a warning

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoshow_abs_time: Represent undef $timet as <undef>
Ian Jackson [Mon, 30 Oct 2017 11:36:16 +0000 (11:36 +0000)]
show_abs_time: Represent undef $timet as <undef>

This can happen, for example, if a badly broken flight has steps which
are STARTING and have NULL in the start time column, and is then
reported using sg-report-flight.

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agosg-run-job: Improve some internal API docs
Ian Jackson [Fri, 2 Oct 2020 15:00:28 +0000 (16:00 +0100)]
sg-run-job: Improve some internal API docs

Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agosg-run-job: Minor whitespace (formatting) changes
Ian Jackson [Tue, 21 May 2019 16:35:23 +0000 (17:35 +0100)]
sg-run-job: Minor whitespace (formatting) changes

No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoREADME.planner: Document magic job hostflags
Ian Jackson [Mon, 30 Oct 2017 16:32:27 +0000 (16:32 +0000)]
README.planner: Document magic job hostflags

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoExecutive.pm planner: fix typo
Ian Jackson [Mon, 6 Nov 2017 18:07:24 +0000 (18:07 +0000)]
Executive.pm planner: fix typo

Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
4 years agoms-queuedaemon: Update for newer Tcl's socket channel ids
Ian Jackson [Fri, 21 Aug 2020 10:37:51 +0000 (11:37 +0100)]
ms-queuedaemon: Update for newer Tcl's socket channel ids

Now we have things like "sock55599edaf050" where previously we had
something like "sock142".  So the output is misaligned.

Bump the sizes.  And with these longer names, when showing the front
of the queue only print the full first entry and the start of the next
one.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoTolerate lack of platform-specific hosts in old Xen branches
Ian Jackson [Thu, 1 Oct 2020 14:17:44 +0000 (15:17 +0100)]
Tolerate lack of platform-specific hosts in old Xen branches

Right now we have a situation where these can't all be made to work
because some older Xen branches are hard to make work on
current Debian stable, and we have some hardware (which we have tagged
as specific "platforms") which doesn't work with oldstable.

This seems like a general problem, so fix it this way.

Note that we still treat these failed allocations as failures, so they
are subject to regression analysis and ought not to appear willy-nilly
on existing branches.

Runvar dump shows the addition of this runvar
   hostalloc_missing_expected=1
to
   qemu-upstream-4.6-testing
   xen-4.6-testing
   ...
   qemu-upstream-4.14-testing
   xen-4.14-testing
inclusive.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agostandalone-generate-dump-flight-runvars: Simulate cri-getplatforms
Ian Jackson [Thu, 1 Oct 2020 15:36:29 +0000 (16:36 +0100)]
standalone-generate-dump-flight-runvars: Simulate cri-getplatforms

Set MF_SIMULATE_PLATFORMS to a suitable value if it is
not *set*.  (Distinguishing unset from set to empty.)

I have verified that this, plus the preceding commits to
cri-getplatforms, produces no change in the output of
  MF_SIMULATE_PLATFORMS='' OSSTEST_CONFIG=standalone-config-example eatmydata ./standalone-generate-dump-flight-runvars

Without the MF_SIMULATE_PLATFORMS setting it adds several new jobs to
each flight, named like this:
  test-amd64-$arch1-xl-simplat-$arch2-$suite

The purpose of this right now is to provide a way to dry-run test the
next change.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
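
The unset-versus-empty distinction can be sketched as follows. This is
Python for illustration only; the real script is shell, and the
function name and `default` argument are hypothetical.

```python
def simulate_platforms_setting(environ, default):
    """Pick the value to use for MF_SIMULATE_PLATFORMS.

    Only an *unset* variable gets the default.  A variable that is set
    to the empty string is kept as it is.
    """
    if "MF_SIMULATE_PLATFORMS" not in environ:
        return default
    return environ["MF_SIMULATE_PLATFORMS"]
```

Calling it with `os.environ` would mirror the behaviour described
above: an explicit `MF_SIMULATE_PLATFORMS=''` suppresses the simulated
jobs, while leaving the variable unset enables them.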
4 years agocri-getplatforms: Honour new MF_SIMULATE_PLATFORMS env var
Ian Jackson [Thu, 1 Oct 2020 15:36:17 +0000 (16:36 +0100)]
cri-getplatforms: Honour new MF_SIMULATE_PLATFORMS env var

This is to be expanded by the shell, using eval, so that it can refer
to $xenarch, $suite and $blessing.

No functional change if this variable is unset, or empty.  If it is
set to a single space, cri-getplatforms produces no output (as it does
anyway in standalone mode).

Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agocri-getplatforms: Give names to xenarch and suite
Ian Jackson [Thu, 1 Oct 2020 15:35:56 +0000 (16:35 +0100)]
cri-getplatforms: Give names to xenarch and suite

No functional change.  This will be useful in a moment.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agots-hosts-allocate-Executive: Allow to tolerate missing resources
Ian Jackson [Thu, 1 Oct 2020 14:18:39 +0000 (15:18 +0100)]
ts-hosts-allocate-Executive: Allow to tolerate missing resources

Now, a job can specify that lack of a suitable host should be treated
as a plain test failure (ie, subject to the usual regression analysis)
rather than as an infrastructure or configuration problem.

This will be useful for some tests which don't work in some branches
because of lack of suitable hardware.  We want to avoid encoding our
hardware availability situation in make-flight.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agosg-run-job: Preserve step state "fail" if set by test script
Ian Jackson [Thu, 1 Oct 2020 16:02:48 +0000 (17:02 +0100)]
sg-run-job: Preserve step state "fail" if set by test script

If the test script exits nonzero but after setting the step status to
'fail', we can leave it that way.  This is particularly relevant if
the iffail in the job spec says 'broken' or something.  After this
change, a step can decide to override that.

An alternative would be to have the step script exit zero, but of
course that would (generally) leave the job to continue running more
steps!

Signed-off-by: Ian Jackson <iwj@xenproject.org>
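
A rough sketch of the decision described above, in Python. sg-run-job
is Tcl, and the function name, the argument names and the "pass" result
for a zero exit are assumptions made for illustration.

```python
def final_step_status(exit_code, status_set_by_script, iffail):
    """Choose the status to record for a finished step."""
    if exit_code == 0:
        return "pass"
    if status_set_by_script == "fail":
        # The script already recorded a plain failure: leave it alone,
        # even if the job spec's iffail says something like "broken".
        return "fail"
    return iffail
```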
4 years agostandalone: Use mkdir -p
Ian Jackson [Thu, 1 Oct 2020 14:18:33 +0000 (15:18 +0100)]
standalone: Use mkdir -p

These two mkdir calls could fail if
standalone-generate-dump-flight-runvars is run without a log
directory, because they were not concurrency-correct.

mkdir -p should fix that.

Signed-off-by: Ian Jackson <iwj@xenproject.org>
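
For comparison, the same idempotent directory creation in Python (an
illustration, not part of osstest; `ensure_dir` is a hypothetical
helper): `os.makedirs` with `exist_ok=True` behaves like `mkdir -p`, so
it does not fail when another process has already created the
directory.

```python
import os


def ensure_dir(path):
    """Create path and any missing parents; succeed if it already exists."""
    os.makedirs(path, exist_ok=True)
```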
4 years agoExecutive: Fix an undef warning message
Ian Jackson [Thu, 1 Oct 2020 14:08:29 +0000 (15:08 +0100)]
Executive: Fix an undef warning message

$onhost can be undef too

Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agoUpdate TftpDiVersion_buster
Ian Jackson [Mon, 28 Sep 2020 12:05:52 +0000 (13:05 +0100)]
Update TftpDiVersion_buster

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoTftpDiVersion: Update to latest installer for stretch
Ian Jackson [Thu, 24 Sep 2020 16:14:25 +0000 (16:14 +0000)]
TftpDiVersion: Update to latest installer for stretch

The stretch (Debian oldstable) kernel has been updated, causing our
Xen 4.10 tests (which are still using stretch) to break.  This update
seems to fix it.

Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Ian Jackson <iwj@xenproject.org>
4 years agoTCP fix: Do not wait for ownerdaemon to speak
Ian Jackson [Mon, 28 Sep 2020 11:43:30 +0000 (12:43 +0100)]
TCP fix: Do not wait for ownerdaemon to speak

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoTCP fix: Do not wait for queuedaemon to speak
Ian Jackson [Mon, 28 Sep 2020 11:41:13 +0000 (12:41 +0100)]
TCP fix: Do not wait for queuedaemon to speak

This depends on the preceding daemonlib patch and an ms-queuedaemon
restart.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agodaemonlib: Provide a "noop" command
Ian Jackson [Mon, 28 Sep 2020 11:37:33 +0000 (12:37 +0100)]
daemonlib: Provide a "noop" command

We are going to want clients to speak before waiting for the server
banner.  A noop command is useful for that.

Putting this here makes it apply to both ownerdaemon and queuedaemon.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoschema: Provide index on flights by start time
Ian Jackson [Wed, 19 Aug 2020 14:59:09 +0000 (15:59 +0100)]
schema: Provide index on flights by start time

We often use flight number as a proxy for ordering, but this is not
always appropriate and not always done (and sometimes it's a bit of a
bodge).

Provide an index to find flights by start time.  This significantly
speeds up the host allocation $equivstatusq query, and the duration
estimator.

(I have tested this by creating a trial index in the production
database.  That index can be dropped again, preferably after this
commit makes it to production.)

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agots-hosts-allocate-Executive: Do a pre-check
Ian Jackson [Wed, 19 Aug 2020 11:52:04 +0000 (12:52 +0100)]
ts-hosts-allocate-Executive: Do a pre-check

Call attempt_allocation with an empty plan and $mayalloc=0.

In the usual case this will arrange to prime our memoisation caches
before we get involved with the queueing system.

It will also arrange for various errors to be reported sooner.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: Improved error and result handling

4 years agohost allocation: Memoise $equivstatus query results
Ian Jackson [Wed, 19 Aug 2020 12:09:49 +0000 (13:09 +0100)]
host allocation: Memoise $equivstatus query results

This provides a very significant speedup.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohost allocation: Memoise duration estimates
Ian Jackson [Wed, 19 Aug 2020 12:00:58 +0000 (13:00 +0100)]
host allocation: Memoise duration estimates

We look at our own branch to estimate durations.  If somehow we are
one of multiple concurrent flights on this branch with the appropriate
blessing, we don't mind not noticing the doings of our peer flights,
even if that leaves our estimates a bit out of date.

So it is fine to use an estimate no older than our own runtime.

Right now we generate a new duration estimator during each queueing
round, because it contains a statement handle and we must disconnect
from the db while waiting.  So the internal memo table gets thrown
away each time and is useless.

To actually memoise, pass our own hash which lives as long as we do.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoduration estimates: Memoise results
Ian Jackson [Wed, 19 Aug 2020 11:55:20 +0000 (12:55 +0100)]
duration estimates: Memoise results

The caller may provide a memoisation hash.  If they don't we embed
one in the estimator.

The estimator contains a db statement handle so shouldn't be so
long-lived that this gives significantly wrong answers.

I am aiming this work at ts-hosts-allocate-Executive, but it is
possible that this might speed up sg-report-flight.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
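
The memoisation scheme in this commit and the one above it could be
sketched like this. It is Python, with hypothetical names; the real
estimator is a Perl closure around a db statement handle. The caller
may pass its own memo table, which then outlives any single estimator.
Otherwise the estimator keeps a private one.

```python
def make_duration_estimator(query, memo=None):
    """Return estimate(job, host), caching results of query(job, host)."""
    if memo is None:
        memo = {}  # private table, discarded along with the estimator

    def estimate(job, host):
        key = (job, host)
        if key not in memo:
            memo[key] = query(job, host)
        return memo[key]

    return estimate
```

A long-lived caller would create one dict up front and pass it to each
new estimator it builds, so results survive across queueing rounds even
though the estimators themselves are thrown away.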
4 years agoresource allocation: Provide OSSTEST_ALLOC_FAKE_PLAN test facility
Ian Jackson [Wed, 19 Aug 2020 11:13:23 +0000 (12:13 +0100)]
resource allocation: Provide OSSTEST_ALLOC_FAKE_PLAN test facility

Set this variable (to a data-plan.final.pl, say) and it becomes
possible to test host allocation programs without actually allocating
anything and without engaging with the queue system.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agots-hosts-allocate-Executive: Fix broken call to $duration_estimator
Ian Jackson [Wed, 19 Aug 2020 12:05:22 +0000 (13:05 +0100)]
ts-hosts-allocate-Executive: Fix broken call to $duration_estimator

The debug subref is passed to the constructor (and indeed we do that).
The final argument to the actual estimator is $uptoincl_testid (but we
didn't say $will_uptoincl_testid, so it is ignored).

The code was wrong, but with no effect.  So no functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Increase default limit
Ian Jackson [Mon, 10 Aug 2020 16:36:35 +0000 (17:36 +0100)]
sg-report-job-history: Increase default limit

Now this is a *lot* faster, we can print a lot more history.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Provide --time-limit
Ian Jackson [Mon, 10 Aug 2020 16:32:58 +0000 (17:32 +0100)]
sg-report-job-history: Provide --time-limit

Calculate a minflight based on the time limit, and set the time limit
to a year ago by default.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-host-history: Cache report_run_getinfo
Ian Jackson [Wed, 12 Aug 2020 10:08:44 +0000 (11:08 +0100)]
sg-report-host-history: Cache report_run_getinfo

No logical change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Cache report_run_getinfo
Ian Jackson [Wed, 12 Aug 2020 09:59:47 +0000 (10:59 +0100)]
sg-report-job-history: Cache report_run_getinfo

No logical change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Abolish $fromstuff
Ian Jackson [Mon, 10 Aug 2020 16:34:55 +0000 (17:34 +0100)]
sg-report-job-history (nfc): Abolish $fromstuff

This used to be reused, but that is no longer the case.  Do away with
it, for clarity and simplicity.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting: Break out minflight_by_time
Ian Jackson [Mon, 10 Aug 2020 16:30:20 +0000 (17:30 +0100)]
history reporting: Break out minflight_by_time

Move this from sg-report-host-history so we can reuse it.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Cache osstestrevs
Ian Jackson [Mon, 10 Aug 2020 16:20:53 +0000 (17:20 +0100)]
sg-report-job-history: Cache osstestrevs

No logical change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Refactor osstestrevs code
Ian Jackson [Mon, 10 Aug 2020 16:19:09 +0000 (17:19 +0100)]
sg-report-job-history (nfc): Refactor osstestrevs code

Split this into (1) get the data from the db (2) process it into the
form we want.

This will make it easy to cache (1).

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Cache the per-flight revisions
Ian Jackson [Mon, 10 Aug 2020 16:08:44 +0000 (17:08 +0100)]
sg-report-job-history: Cache the per-flight revisions

This involves changing %revisions to %$revisions in the code which
uses the value.

No logical change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Quote keys too
Ian Jackson [Mon, 10 Aug 2020 16:14:40 +0000 (17:14 +0100)]
history reporting (nfc): Quote keys too

Right now all the callers have keys which don't need quoting.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): url-quoting: quote = too
Ian Jackson [Fri, 14 Aug 2020 15:24:08 +0000 (16:24 +0100)]
history reporting (nfc): url-quoting: quote = too

We are going to want to url-encode keys.  If key contains =, we still
need to be able to tell where it ends, so it must be encoded.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
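
Why `=` must be quoted can be shown with a small Python sketch. It uses
the standard `urllib.parse` functions rather than osstest's own Perl
`url_quote`/`url_unquote`, and the pair helpers are hypothetical. Once
`=` inside a key is percent-encoded, the first literal `=` always marks
the end of the key.

```python
from urllib.parse import quote, unquote


def encode_pair(key, value):
    """Encode key=value so that '=' inside either part is escaped."""
    # safe="" makes quote() percent-encode '=' (and '/', '&', ...) too.
    return quote(key, safe="") + "=" + quote(value, safe="")


def decode_pair(text):
    """Split on the first literal '=' and undo the quoting."""
    key, value = text.split("=", 1)
    return unquote(key), unquote(value)
```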
4 years agohistory reporting (nfc): Break out $url_ok_chars
Ian Jackson [Fri, 14 Aug 2020 15:22:20 +0000 (16:22 +0100)]
history reporting (nfc): Break out $url_ok_chars

We will want this in a moment.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Break out url_unquote
Ian Jackson [Mon, 10 Aug 2020 16:13:10 +0000 (17:13 +0100)]
history reporting (nfc): Break out url_unquote

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Break out url_quote
Ian Jackson [Mon, 10 Aug 2020 16:11:59 +0000 (17:11 +0100)]
history reporting (nfc): Break out url_quote

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Introduce use of cache, for hosts query
Ian Jackson [Mon, 10 Aug 2020 15:52:01 +0000 (16:52 +0100)]
sg-report-job-history: Introduce use of cache, for hosts query

* Set up the cache.
* Call the per-row setup hook.
* Cache the computation of $ri->{Hosts}.
* Call the per-row cache write hook.
* Finalise the cache.

Output is the same, but with cache information in the output html, and
faster.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Provide cache_set_task_print
Ian Jackson [Mon, 10 Aug 2020 15:57:44 +0000 (16:57 +0100)]
history reporting (nfc): Provide cache_set_task_print

This takes a string which gets added to the cache messages.  This
will allow us to distinguish the output from different processes
when using parallel by fork.

Nothing sets this yet.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Drop $hostsq query
Ian Jackson [Wed, 12 Aug 2020 12:51:32 +0000 (13:51 +0100)]
sg-report-job-history (nfc): Drop $hostsq query

We have eliminated all the users of @hostvarcols before @hostvarcols2
is calculated from the row data.

The query which produces this is very slow and can't be cached.  We
can abolish it now and just use the @hostvarcols2 calculation.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Query hosts runvars in one go
Ian Jackson [Mon, 10 Aug 2020 15:10:50 +0000 (16:10 +0100)]
sg-report-job-history (nfc): Query hosts runvars in one go

Rather than doing one query for each entry in @hostvarcols, do one
query for all the relevant runvars.  This is quite a bit faster and
will enable us to use the cache.

This is correct because @hostvarcols was the union of all the host
runvars, so this produces the same answers as the individual queries.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Add new hostvarcols calculation
Ian Jackson [Mon, 10 Aug 2020 15:04:39 +0000 (16:04 +0100)]
sg-report-job-history (nfc): Add new hostvarcols calculation

We are going to want to replace the existing @hostvarcols
calculation.  Provide a new one based on $ri->{Hosts}.

Right now, die if they don't produce the same answers.  This still
works, which shows that the calculation is right.
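
The "compute it both ways and die on mismatch" technique can be
sketched like this.  It is a Python illustration under assumptions:
each row is taken to carry a Hosts mapping keyed by runvar name, and
the function names are hypothetical.

```python
def hostvarcols_from_rows(rows):
    # New calculation: union of the Hosts keys across all rows.
    cols = set()
    for ri in rows:
        cols.update(ri["Hosts"].keys())
    return sorted(cols)


def checked_hostvarcols(old_cols, rows):
    # Transitional check: refuse to continue if old and new disagree.
    new_cols = hostvarcols_from_rows(rows)
    if sorted(old_cols) != new_cols:
        raise RuntimeError("hostvarcols mismatch: %r vs %r" % (old_cols, new_cols))
    return new_cols
```

Once the check has been seen to pass in practice, the old calculation
can be dropped and only the new one kept.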

Tested-by: Ian Jackson <ian.jackson@eu.citrix.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Make $ri->{Hosts} a hash
Ian Jackson [Mon, 10 Aug 2020 15:02:00 +0000 (16:02 +0100)]
sg-report-job-history (nfc): Make $ri->{Hosts} a hash

This will make it easier to cache this.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Refactor "ALL" handling
Ian Jackson [Mon, 10 Aug 2020 14:43:33 +0000 (15:43 +0100)]
sg-report-job-history: Refactor "ALL" handling

* Make an explicit entry ALL in @branches, rather than implicitly
  processing ALL as well.

* Consequently, put explicit ALL entries in @tasks too, rather than
  putting in entries without a branch name.

* Pass ALL to processjobbranch rather than undef, and turn it into
  the internally-used undef at the start.

When used with --flight (findflight), this has no functional change.
When used with --job, ALL must now be included in the branch
list passed to --branches.  The only in-tree call is with --flight.
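
A minimal sketch of this arrangement, in Python rather than the
script's Perl: ALL is an ordinary entry in the branch list and in the
task list, and is only turned into "no branch filter" at the start of
the per-report function.  The function names and the SQL fragment are
hypothetical.

```python
ALL = "ALL"


def make_tasks(jobs, branches):
    # Every (job, branch) pair is explicit, including the ALL entries.
    return [(job, branch) for job in jobs for branch in branches]


def process_job_branch(job, branch):
    # Convert the external ALL marker to the internal "no filter" value.
    branch_filter = None if branch == ALL else branch
    where = "job = ?"
    params = [job]
    if branch_filter is not None:
        where += " AND branch = ?"
        params.append(branch_filter)
    return where, params
```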

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Make cacheable_fn work without cache
Ian Jackson [Wed, 12 Aug 2020 09:55:43 +0000 (10:55 +0100)]
history reporting (nfc): Make cacheable_fn work without cache

This would allow us to use this in call sites without a cache.

I changed my mind about the code that prompted this, but it still
seems plausibly useful, so I'm keeping this commit.
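
The idea is roughly the following (a Python analogue with a
hypothetical signature, not the Perl function): when no cache is
supplied, the wrapper simply calls the underlying function.

```python
def cached_call(cache, key, fn):
    # With no cache at all, just compute the value every time.
    if cache is None:
        return fn()
    # Otherwise compute once and reuse the stored value afterwards.
    if key not in cache:
        cache[key] = fn()
    return cache[key]
```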

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Add another test rune to the notes
Ian Jackson [Mon, 10 Aug 2020 12:06:48 +0000 (13:06 +0100)]
history reporting (nfc): Add another test rune to the notes

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Remove needless conditional
Ian Jackson [Tue, 4 Aug 2020 16:27:04 +0000 (17:27 +0100)]
sg-report-job-history (nfc): Remove needless conditional

$htmlout is now always defined.

Nothing other than indentation change, and removal of the surrounding
if block.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Always write HTML output
Ian Jackson [Tue, 4 Aug 2020 16:25:40 +0000 (17:25 +0100)]
sg-report-job-history: Always write HTML output

Previously, unlike sg-report-host-history, if you didn't specify
--html-dir, it would do a lot of work to no effect.

This is not useful and nothing calls it this way.  So abolish this
notion.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history (nfc): Have main program decide HTML filename
Ian Jackson [Tue, 4 Aug 2020 16:26:38 +0000 (17:26 +0100)]
sg-report-job-history (nfc): Have main program decide HTML filename

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Use one child per report
Ian Jackson [Tue, 4 Aug 2020 16:23:18 +0000 (17:23 +0100)]
sg-report-job-history: Use one child per report

Rather than one child per job, which then did one report per branch.

This will mean we can use the cache machinery, which is rather global
so wouldn't cope well with processing multiple job history reports
within a process.
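
One child per report could be sketched as below.  This is a Python
illustration using os.fork, not the code in HistoryReport; the function
name, the max_children limit and the exit-code convention are
assumptions.  Each child handles exactly one task, so any per-process
global state (such as a cache) only ever sees one report.

```python
import os


def run_reports_in_children(tasks, report_fn, max_children=4):
    running = {}     # pid -> task, in start order
    exit_codes = {}  # task -> exit code of its child

    def reap_oldest():
        # Wait for the oldest child we started (simple, not optimal).
        pid = next(iter(running))
        _, status = os.waitpid(pid, 0)
        task = running.pop(pid)
        exit_codes[task] = os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1

    for task in tasks:
        while len(running) >= max_children:
            reap_oldest()
        pid = os.fork()
        if pid == 0:
            # Child: produce one report, then exit without returning.
            code = 1
            try:
                report_fn(task)
                code = 0
            finally:
                os._exit(code)
        running[pid] = task

    while running:
        reap_oldest()
    return exit_codes
```

A caller would pass a list of (job, branch) pairs and a function that
writes the report for one pair, then inspect the returned exit codes.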

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Use fork-based parallelism
Ian Jackson [Tue, 4 Aug 2020 16:08:14 +0000 (17:08 +0100)]
sg-report-job-history: Use fork-based parallelism

For now, one child per job (for all branches).  This is already a
speedup.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Prep for fork: Move $revisionsq query
Ian Jackson [Mon, 10 Aug 2020 12:09:28 +0000 (13:09 +0100)]
sg-report-job-history: Prep for fork: Move $revisionsq query

We will need to prepare this in add_revisions so that it works when we
do each (job,branch) in a different process.

It is OK that it is still global, because we only call add_revisions
in the children.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-job-history: Prep for fork: Move $buildsq query
Ian Jackson [Mon, 10 Aug 2020 11:50:59 +0000 (12:50 +0100)]
sg-report-job-history: Prep for fork: Move $buildsq query

We will need to prepare this once per (job,branch) so that it works
when we do each of those in a different process.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoparallel by fork: Fix a variable name to $task
Ian Jackson [Mon, 10 Aug 2020 15:57:35 +0000 (16:57 +0100)]
parallel by fork: Fix a variable name to $task

This code came from sg-report-host-history where tasks were hosts.
But in the more general context, the names are wrong.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoparallel by fork: Disconnect $dbh_tests as well as undefing it
Ian Jackson [Thu, 6 Aug 2020 11:57:31 +0000 (12:57 +0100)]
parallel by fork: Disconnect $dbh_tests as well as undefing it

If the caller is buggy and has statement handles still open, they can
still "work" even if we have thrown away the db handle.

Where, after forking, "work" means "use the same connection in
multiple processes simultaneously, without locking".  This could
result in arbitrary crazy behaviour (eg, TLS crypto failures).

No functional change with existing callers since they don't have this
bug.
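
The difference between merely dropping the handle and disconnecting it
can be illustrated with Python's sqlite3 module.  This is an analogy
only: the real code uses Perl DBI against a networked database, and the
function name and the state dictionary here are hypothetical.  After an
explicit close, a leftover cursor fails loudly instead of quietly
continuing to use the connection.

```python
import sqlite3


def drop_handle_before_fork(state):
    # Close the connection explicitly, then forget the handle.
    if state["dbh"] is not None:
        state["dbh"].close()
    state["dbh"] = None


def leftover_cursor_still_works(cursor):
    # Report whether a cursor created earlier can still run a query.
    try:
        cursor.execute("SELECT 1")
        return True
    except sqlite3.ProgrammingError:
        return False
```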

Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agoparallel by fork: Break out into HistoryReport
Ian Jackson [Tue, 4 Aug 2020 17:10:25 +0000 (18:10 +0100)]
parallel by fork: Break out into HistoryReport

Move this code from sg-report-host-history to HistoryReport, so that
it can be reused.

No significant functional change.  Some changes to debug messages.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agosg-report-host-history: Write cache entry for unfinished jobs
Ian Jackson [Tue, 4 Aug 2020 15:20:49 +0000 (16:20 +0100)]
sg-report-host-history: Write cache entry for unfinished jobs

We have to also check ->{finished}, rather than the existence of a row
at all, since now unfinished jobs can appear in the cache.

Because the cache key includes the job status, when the job becomes
finished the cache entry will be invalidated.
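
A sketch of why this works, in Python with hypothetical names and an
assumed key layout: because the status is part of the key, the entry
written while the job was still running is never looked up again once
the status changes, and callers test the finished field rather than
mere presence of an entry.

```python
def cache_key(row):
    # The job status is part of the key (assumed layout).
    return (row["flight"], row["job"], row["status"])


def store(cache, row, entry):
    cache[cache_key(row)] = entry


def finished_time(cache, row):
    # An entry may exist for an unfinished job, so check the field.
    entry = cache.get(cache_key(row))
    if entry is None or entry.get("finished") is None:
        return None
    return entry["finished"]
```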

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting: Improve an error message slightly
Ian Jackson [Tue, 4 Aug 2020 14:35:31 +0000 (15:35 +0100)]
history reporting: Improve an error message slightly

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting: Print debug for cache misses
Ian Jackson [Tue, 4 Aug 2020 16:48:50 +0000 (17:48 +0100)]
history reporting: Print debug for cache misses

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting: Cache data limit now in History module
Ian Jackson [Tue, 4 Aug 2020 16:49:55 +0000 (17:49 +0100)]
history reporting: Cache data limit now in History module

Replace the ad-hoc query-specific limit strategy in
sg-report-host-history with a new, more principled, arrangement, in
HistoryReport.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Documentation for the new module
Ian Jackson [Wed, 5 Aug 2020 12:09:59 +0000 (13:09 +0100)]
history reporting (nfc): Documentation for the new module

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting: Skip undefined keys
Ian Jackson [Tue, 4 Aug 2020 13:41:09 +0000 (14:41 +0100)]
history reporting: Skip undefined keys

This makes it work if the caller's cached hash contains a key which
is bound to undef.

sg-report-host-history already does this, which currently causes:

 Use of uninitialized value $_ in substitution (s///) at Osstest/HistoryReport.pm line 134.
 Use of uninitialized value $_ in printf at Osstest/HistoryReport.pm line 135.
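
The fix amounts to something like the following, shown as a Python
sketch rather than the Perl in HistoryReport.pm.  The function name and
the output format are hypothetical, and html.escape merely stands in
for whatever substitution the real code applies.

```python
import html


def format_cache_entries(cached):
    parts = []
    for key in sorted(cached):
        value = cached[key]
        if value is None:
            # Skip keys bound to an undefined value instead of formatting them.
            continue
        parts.append("%s=%s" % (key, html.escape(str(value))))
    return " ".join(parts)
```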

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Rename some module variables, remove "cache"
Ian Jackson [Tue, 4 Aug 2020 17:22:20 +0000 (18:22 +0100)]
history reporting (nfc): Rename some module variables, remove "cache"

This makes the code terser and easier to read.  No functional change.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
4 years agohistory reporting (nfc): Move cache code into HistoryReport module
Ian Jackson [Wed, 5 Aug 2020 11:41:09 +0000 (12:41 +0100)]
history reporting (nfc): Move cache code into HistoryReport module

Finally this is now reusable code and we can put it in the
HistoryReport module.

Pure cut-and-paste.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>