Ian Campbell [Mon, 7 Sep 2015 12:58:29 +0000 (13:58 +0100)]
ts-xen-install: Rewrite /etc/hosts to comment out 127.0.1.1 entry
Debian creates an entry such as:
127.0.1.1 lace-bug.xs.citrite.net lace-bug
This causes local lookups of the FQDN to get 127.0.1.1, which is
unhelpful if you are looking for an address to bind to and were hoping
to get the public IP address, as libvirt does on the target host for
migration.
Here we remove (actually, comment) any 127.0.1.1 line in /etc/hosts.
This means that lookups of a host's own name (fqdn or just dn) now rely
on DNS, which may not be ideal. However for a host which uses DHCP I'm
not aware of a way to keep /etc/hosts up to date with the actual IP
address the machine has. In our infra the test host IP addresses are
all static, but I don't think we want to rely on that any more than we
already do.
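The rewrite described above can be sketched as follows. This is an illustration in Python, not the code in ts-xen-install; the function name and the choice of a bare `#' prefix are assumptions.

```python
import re

def comment_out_loopback_alias(hosts_text: str) -> str:
    """Comment out any /etc/hosts line whose address field is 127.0.1.1."""
    out = []
    for line in hosts_text.splitlines():
        # Match only lines whose first field is exactly 127.0.1.1;
        # lines that are already commented out do not match.
        if re.match(r"\s*127\.0\.1\.1(\s|$)", line):
            line = "#" + line
        out.append(line)
    return "\n".join(out) + "\n"

sample = (
    "127.0.0.1 localhost\n"
    "127.0.1.1 lace-bug.xs.citrite.net lace-bug\n"
)
print(comment_out_loopback_alias(sample), end="")
```

Commenting the line rather than deleting it keeps the original entry visible to anyone inspecting the host later.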
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 13 Aug 2015 15:18:37 +0000 (16:18 +0100)]
cambridge: arrange to test each new baseline
Provide a new cr-daily-branch setting OSSTEST_BASELINES_ONLY which
causes it to only attempt to test the current baseline (if it is
untested) and never the tip version. Such tests will not result in any
push.
Each new baseline is tested exactly once (i.e. we aren't repeating
hoping for a pass), hence the correct revision is just the one tested
by the last run on the branch.
Add a cronjob to Cambridge which runs in this manner, ensuring that
there will usually be some sort of reasonably up to date baseline for
any given branch which can be used for comparisons in adhoc testing or
bisections.
This will also give us some data on the success of various branches on
the set of machines in Cambridge, which can be useful/interesting.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Fri, 31 Jul 2015 10:58:48 +0000 (11:58 +0100)]
Osstest/TestSupport: Hide $ho->{Toolstack} from casual use
This should only be accessed via toolstack($ho), which is responsible
for caching the value. Rename the field to _Toolstack to deter code
from using it.
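The accessor pattern this relies on looks roughly like the sketch below (Python rather than Perl). detect_toolstack and the value "xl" are hypothetical stand-ins; only the idea of caching in a field named _Toolstack comes from the message.

```python
def detect_toolstack(ho: dict) -> str:
    # Hypothetical stand-in for the real lookup; counts how often it runs.
    detect_toolstack.calls += 1
    return "xl"

detect_toolstack.calls = 0

def toolstack(ho: dict) -> str:
    """Return the host's toolstack, computing it at most once per host."""
    if "_Toolstack" not in ho:
        ho["_Toolstack"] = detect_toolstack(ho)
    return ho["_Toolstack"]

ho = {"Name": "lace-bug"}
toolstack(ho)
toolstack(ho)
print(detect_toolstack.calls)
```

The leading underscore is only a convention: it signals that callers should go through toolstack() rather than read the field directly.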
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Anthony PERARD [Thu, 6 Aug 2015 17:03:28 +0000 (18:03 +0100)]
ts-xen-install: Add dom0_mem runvar to control dom0 memory
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 9 Sep 2015 15:46:21 +0000 (16:46 +0100)]
production-config: Update TftpDiVersion
I have already run mg-debian-installer-update-all
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- also update production-config-cambridge ]
Ian Jackson [Mon, 7 Sep 2015 13:00:51 +0000 (14:00 +0100)]
Manual allocation: Break out manual_allocation_base_jobinfo from mg-blockage
This is called `jobinfo' because it ought to be used in
alloc_resources's JobInfo xparam, rather than an Xinfo in the booking:
JobInfo is per planning client; Xinfo is per individual resource.
mg-blockage currently gets this wrong; we will fix that shortly.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: New patch
v3: Fix "joinfo" to "jobinfo".
Ian Jackson [Mon, 7 Sep 2015 13:15:25 +0000 (14:15 +0100)]
Manual allocation: Report better info in plan for rogue tasks
(This will only take effect as such tasks appear in the plan for the
first time. Ie, once a rogue task is found, the plan is populated by
whatever version of the planner is running at that time. So the
effect will not be immediately visible.)
Signed-off-by: Ian Jackson <iwj@osstest.xs.citrite.net>
---
v2: New patch
Ian Jackson [Mon, 7 Sep 2015 14:14:10 +0000 (15:14 +0100)]
Planner: ms-queuedaemon: Better log message for Tcl `after idle'
This does not mean the planner is `idle' in any general sense of the
word. It just means that the Tcl event loop has finished processing
outstanding events. Change the debug message to be less confusing.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v2: New patch
Ian Jackson [Fri, 4 Sep 2015 16:44:21 +0000 (17:44 +0100)]
Planner: Remove O(n^2) problem from plan restart
Change `./ms-planner unprocessed' to take a file of infos on stdin,
and when we restart the planning, invoke it once.
(This would be an incompatible change to the planner, needing a
queuedaemon restart, if this patch were applied separately from the
previous "Report unprocessed planning clients".)
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
---
v2: New patch
Ian Jackson [Mon, 7 Sep 2015 14:08:19 +0000 (15:08 +0100)]
Planner: Report unprocessed planning clients
With recent changes, it can happen that a queue daemon client is not
given an opportunity to report itself in the plan. This makes the
plan incomplete.
(For resource-plan.html, because the planning run was restarted to try
to quickly allocate new resources; for resource-projection.html,
because it's an old client that doesn't support feature-noalloc.)
When this happens, provide an explicit indication of this in the plan:
* Invent a new entry Unprocessed in data-*.pl for this information.
* Display the first 50 in ms-planner show-html.
* Provide a new ms-planner invocation `unprocessed' to record one.
* Note unprocessed when we skip a client due to !feature-noalloc.
* Note unprocessed for remaining queue when we restart planning.
For now this algorithm can be rather unfortunately O(n^2) when
draining the planning queue, because each `ms-planner unprocessed'
invocation adds only one job but needs to read and write the whole
plan. This will be fixed shortly.
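To see why this is quadratic, count the plan entries handled while draining a queue of n clients. The sketch below is a simplified cost model (it ignores whatever else is already in the plan), not a measurement of ms-planner.

```python
def cost_one_per_invocation(n: int) -> int:
    # Each invocation reads and rewrites the whole plan, which already
    # holds the entries recorded by the earlier invocations, plus one new.
    return sum(size + 1 for size in range(n))

def cost_single_invocation(n: int) -> int:
    # One invocation records all n entries in a single read/write pass.
    return n

n = 1000
print(cost_one_per_invocation(n), cost_single_invocation(n))
```

The first figure grows as n(n+1)/2, the second linearly in n.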
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v2: New patch
Ian Jackson [Thu, 3 Sep 2015 11:46:27 +0000 (12:46 +0100)]
Plan reporting: Provide get-last-plan queuedaemon command
This allows retrieval, by monitoring clients which are not
participating in the planning queue, of the finished projection, or
the unfinished plan as it was at the time of last restart.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v2: Fix invocation of return-plan-to-client.
Use data-W.final.pl, not data-W-final.pl, to fit
with existing .gitignore, and be slightly neater.
Ian Jackson [Wed, 2 Sep 2015 14:12:48 +0000 (15:12 +0100)]
Planner: ms-queuedaemon: Restart planning when resources become free
This solves a performance problem with the existing planner.
The problem is that with a large installation, and a big queue, a full
plan can take a long time to prepare. (In our current installation,
perhaps as long as half an hour.) Any resource which becomes free
during one plan run cannot be allocated to a new job until the next
plan run starts. This means resources (test machines) are often
sitting around idle.
Fix this by restarting the planning process as soon as any new
resource becomes free. This means that jobs at the front of the queue
get a chance to allocate it right away, so it will probably be
allocated soon.
If it is only interesting to jobs later in the queue, then there may
be a delay in reallocating it, but presumably the resource is not much
in demand and those later jobs will allocate it when they get a bit
closer to the head.
But, there is a problem with this: it means that the plan is generally
never completed. So we have no overview any more of when which
flights will finish and what the overall queue is like. We solve this
problem by running a second instance of the planner algorithm, all the
way to completion, in a `dummy' mode where no actual resource
allocation takes place. This second `projection' instance comes into
being whenever the main `plan' instance is restarted, and it inherits
the planning state from the main `plan' instance.
Global livelock (where we keep restarting the plan but never manage to
allocate anything) is not possible because each restart involves a new
resource becoming free. If nothing gets allocated because we can't
get that far before being restarted, then eventually there will be
nothing left allocated to become newly free.
Starvation, of a form, is possible: a late-in-queue job which wants a
resource available right now might have difficulty allocating it
because the planner is spending its effort rescheduling early-in-queue
jobs which want resources which are in greater demand - so that the
late-in-queue job never gets called. Arguably this is an appropriate
allocation of planning time.
With this arrangement we can generate two reports: a `plan' report
containing the short term plan which was used for actual resource
allocation, and which is frequently restarted and therefore not
necessarily complete; and a `projection' report which contains a
complete plan for all work the system is currently aware of, but which
is less-frequently updated.
Because planner clients do not contain the planning algorithm state,
the only client change needed is the ability to run in a `dummy' mode
without actual allocation; this is the `noalloc' feature earlier in
this series.
The main work is in ms-queuedaemon. We have prepared the ground for
multiple instances of the planning algorithm; from the point of view
of ms-queuedaemon, an instance of the planning algorithm is mainly a
walk over the job queue. So we call them `walkers'.
Therefore, what we do here is introduce a new `projection' walker,
as follows:
Add `projection' to the global list of possible walkers.
Invent a new section of code, the `restarter', which is responsible
for managing the relationship between the two walkers. (It uses
direct knowledge of the queue state data structures, etc., to avoid
having to invent a complete formal interface to a walker.)
If we ever finish the plan walker's queue, we update both the
projection report output and the plan report output, from the same
plan. Finishing the projection walker's queue means we have a
complete projection, but we don't touch the plan.
In principle it might happen that the plan walker might overtake the
projection walker, and then complete, write out a complete and up to
date plan as the projection, and that the projection walker would then
complete and overwrite the projection with less-up-to-date
information. We don't explicitly exclude this. Of course such a
result will be rectified soon enough by another planning run.
The restarter can ask the database for the list of currently-available
resources, and can therefore detect when resources become newly free.
The rest of the code remains largely ignorant of the operation of the
restarter. There are a few hooks:
runneeded-perhaps-start notifies the restarter when we start the
plan; this is used by the restarter to record the set of free
resources at the start of a planning run, so that it can see later
whether any /new/ resources have become free.
restarter-maybe-provoke-restart is called when we get notification
from the owner daemon that resources may have become idle. We
look for newly-idle resources, and if there are any, and we are
running the plan walker, we directly edit the plan walker's queue to
put RESTART at the front.
queuerun-perhaps-step spots the special entry RESTART in its queue and
calls back into the restarter when it finds it. This deferred
approach is necessary because we can't do the restart operation while
a client is thinking (because we would have to change that client's
cogitation from the `live, can allocate' mode to the `dummy, cannot
allocate' mode; and because that would make the code more complex).
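The interaction of the first two hooks can be sketched like this. It is a Python illustration of the idea, not the Tcl in ms-queuedaemon; the class, method names and the check that avoids queueing RESTART twice are assumptions.

```python
from collections import deque

RESTART = "RESTART"  # special queue entry, as described above

class Restarter:
    def __init__(self):
        self.free_at_start = set()

    def plan_started(self, free_now):
        # Analogue of the runneeded-perhaps-start notification: remember
        # what was already free when this planning run began.
        self.free_at_start = set(free_now)

    def maybe_provoke_restart(self, free_now, plan_queue, plan_running):
        # Only resources that were NOT free at the start count as new.
        newly_free = set(free_now) - self.free_at_start
        already_queued = bool(plan_queue) and plan_queue[0] == RESTART
        if newly_free and plan_running and not already_queued:
            plan_queue.appendleft(RESTART)  # handled at the next queue step
        return newly_free

r = Restarter()
queue = deque(["job-a", "job-b"])
r.plan_started({"host1"})
r.maybe_provoke_restart({"host1"}, queue, plan_running=True)
r.maybe_provoke_restart({"host1", "host2"}, queue, plan_running=True)
print(list(queue))
```

Putting a marker in the queue, rather than restarting immediately, is what lets the restart wait until no client is in the middle of thinking.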
The main work is done in the restarter-restart-now hook. It reports
the current (incomplete) plan, and then checks to see if a projection
walker is running; if it is, it leaves it alone, and simply abandons
the current plan run and arranges for a new run to be started. If a
projection walker is not running it copies all the plan walker's state
(including the data-plan.pl disk file containing the plan-in-progress)
to the projection walker, and sets the projection walker going.
We update .gitignore to ignore data-plan.* and data-projection.*.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v2: Update .gitignore too.
Use `walker-globals' not `walker-runvars' (which does not exist).
Remove wrap damage `#' from comment.
Fix typo in commit message.
Fix several silly bugs in for-free-resources
Fix three silly bugs relating to handling of $newly_free
Fix a wrong bracket syntax error in restarter-maybe-provoke-restart
Properly return from queuerun-perhaps-step on RESTART;
restarter-restart-now has taken the flow of control.
Reorder operations in restarter-restart-now so as to make it work
Correct some wrong log messages in restarter-restart-now
Add a log message when we restart planning
Minor code layout changes
In notify-to-think, process feature-noalloc properly
Ian Jackson [Tue, 1 Sep 2015 18:04:53 +0000 (19:04 +0100)]
Planner: ms-queuedaemon: Break out queuerun-finished/<walker>
This formalises the queue-completed interface, allowing parts outside
the queuerun machinery to cleanly be notified when a queue is
completed, and relieving the queuerun-perhaps-step of the need to know
what to do for the end of any particular walker's queue.
Currently there is still only one walker, `plan'.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
If multiple walkers want to ask the same chan, we want to serialise
them. This is actually straightforward: Firstly, we arrange that
each walker finishing a thought will prompt _all_ walkers to
reconsider whether they need to continue. Then we can simply do
nothing if we want a chan to think that another walker is already
waiting for; since that other walker will prompt us later.
Still no actual functional change because there is still only one
walker.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 1 Sep 2015 15:54:04 +0000 (16:54 +0100)]
Planner: ms-queuedaemon: Prep for multiple walkers
We are going to introduce multiple concurrent streams of planning
processing, called `walkers'.
Prepare the ground for this with some formulaic changes which will
otherwise greatly clutter substantive patches.
(A client will still only think for one walker at once, because that's
what the client protocol expects - and anything else would be far too
confusing.)
General:
* Introduce the concept of a `walker' to ms-queuedaemon.
* Provide a list of the walkers which might exist, `walkers'
* Provide some helper procedures for iterating over these,
and easily accessing their state.
Queue handling:
* Add a new `w' argument to many procs: specifically, most of the
procs in the section `machinery for running the queue'.
* Log the walker ($w) at the start of all relevant log messages.
* Pass the -w option to ms-planner and ms-planner-debug.
* Add safety catches which will crash the ms-queuedaemon if it finds
it is asking the same client to think for more than one walker.
* we-are-thinking and check-we-are-thinking tell the caller what
walker the client is thinking for.
* In the resource-plan.html filename, replace `plan' with the walker
filename.
Elsewhere:
* Teach dequeue-chan to deal with all the walkers, including
maybe the (one) walker for which the client is thinking.
* Teach log-state to report on all the walkers.
* In the runneeded logic, hardcode `plan' as the walker to use.
There is still actually only one walker.
No overall functional change, except to some log messages.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
---
v2: Fix walker-globals to import the $w/$v from #0, ie the global scope
Correct invocation of upvar in walker-globals
Use walker-globals everywhere, not obsolete name walker-vars
Do not pass w to do-book-resources (which does not want it
because it uses chan-we-are-thinking)
Ian Jackson [Tue, 1 Sep 2015 15:52:17 +0000 (16:52 +0100)]
Planner: ms-planner support -w option
We are going to introduce multiple concurrent streams of planning
processing, called `walkers' in ms-queuedaemon. The work-in-progress
plan is stored, server-side, during planning, in data-plan.pl. But we
need to have more than one of these.
Update ms-planner and ms-planner-debug to honour a -w option, to
specify a replacement for the word `plan' in `data-plan.pl'.
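The effect of the option is simply to parameterise the data file name. A minimal sketch in Python, where the helper name and the use of argparse are assumptions (ms-planner itself is not written this way):

```python
import argparse

def plan_filename(walker: str = "plan") -> str:
    """Name of the work-in-progress plan file for a given walker."""
    return f"data-{walker}.pl"

parser = argparse.ArgumentParser()
parser.add_argument("-w", dest="walker", default="plan")

# Without -w the name is unchanged; with it, `plan' is replaced.
print(plan_filename(parser.parse_args([]).walker))
args = parser.parse_args(["-w", "projection"])
print(plan_filename(args.walker))
```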
No overall functional change, since nothing uses these options yet.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 1 Sep 2015 13:56:46 +0000 (14:56 +0100)]
Planner: client side: New `!OK think noalloc' protocol
Introduce a way for the queue daemon to tell its client that it must
not allocate anything in this planning iteration.
In the client:
* Advertise the new feature via set-info.
* Accept the `noalloc' part of `!OK think noalloc';
* Print that in our log message;
* Honour it by passing it to $resourcecall.
And document the new protocol. However, there is no server-side yet,
so this does not yet introduce any overall change to the system.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 1 Sep 2015 13:50:52 +0000 (14:50 +0100)]
Planner: client side: $mayalloc parameter to $resourcecall->()
Add a new parameter to $resourcecall which allows the alloc_resources
loop in Osstest::Executive to specify to its clients that on this
occasion they should not make any actual allocations.
The callers of alloc_resources are all adjusted to honour this new
parameter:
* ts-hosts-allocate-Executive avoids allocating unless $mayalloc
* mg-allocate avoids allocating unless $mayalloc
* mg-blockage never allocates anyway.
Currently we always pass 1, so no functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
---
v3: Add missing my $mayalloc. ($plan is global.)
Ian Jackson [Tue, 1 Sep 2015 18:15:32 +0000 (19:15 +0100)]
Planner: Fix indefinite holdoff
runneeded-ensure-will would always reset the runneeded_holdoff_after
timer. So no new queue run would start until no runneeded-ensure-will
has occurred for (currently) 30s.
Instead, only start the timer if it's not already running.
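The difference between the two behaviours can be shown with a small simulation on logical time. This is an illustrative Python model, not the Tcl in ms-queuedaemon; the class and function names are made up, and the 30s delay is the figure quoted above.

```python
class Holdoff:
    def __init__(self, delay):
        self.delay = delay
        self.deadline = None  # None means no timer is running

    def ensure_will_buggy(self, now):
        # Old behaviour: every request pushes the deadline back.
        self.deadline = now + self.delay

    def ensure_will_fixed(self, now):
        # New behaviour: start the timer only if it is not already running.
        if self.deadline is None:
            self.deadline = now + self.delay

def first_run_time(method_name, request_times, delay=30):
    """Time at which the first queue run would start."""
    h = Holdoff(delay)
    for t in request_times:
        if h.deadline is not None and t >= h.deadline:
            return h.deadline  # the timer fired before this request
        getattr(h, method_name)(t)
    return h.deadline

requests = list(range(0, 100, 10))  # one request every 10s
print(first_run_time("ensure_will_buggy", requests))
print(first_run_time("ensure_will_fixed", requests))
```

With a steady stream of requests the old version only runs after the stream stops, while the fixed version runs one delay after the first request.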
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Campbell [Wed, 1 Apr 2015 16:55:12 +0000 (17:55 +0100)]
ms-planner: Propagate a booking's Job to the plan
This needs to be done in several places:
- When booking resources (cmd: book-resources), to initially propagate
from the booking (e.g. from ts-hosts-allocate-Executive's input).
- On reset (cmd: reset) so that the Events corresponding to actual
allocations retain their Job.
- When retrieving the plan (cmd: get-plan), so it would be available
for logging etc.
The Job is added by a following patch "ts-hosts-allocate-Executive:
Add the requesting Job to the booking".
This patch has been deployed on the Cambridge instance for testing
with no ill-effects.
cmd_reset does not include a ->Job for jobs which are "(preparing)",
corresponding to a job which is going to use a shared host which is
currently being installed by another job. I was unable to figure out a
way to include these.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 13 Aug 2015 15:43:47 +0000 (16:43 +0100)]
Disable proxy for all preseeded wget
At least in some contexts scripts can be run with http_proxy pointing
to the apt proxy (I noticed it in /usr/lib/base-installer.d/ hook used
for ucode installation).
Since all of these particular fetches are from a webserver known to be
local, just disable proxying altogether.
With busybox wget in d-i this is done with the -Y argument.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 13 Aug 2015 16:52:41 +0000 (17:52 +0100)]
Debian: Create /boot/boot -> . symlink on ARM when PvMenuLst enabled
This is under the same conditional as the nobootloader confirmation
one, since they effectively both stem from the lack of a boot loader
and the consequential use of the pv-grub-menu package.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 13 Aug 2015 16:52:39 +0000 (17:52 +0100)]
Debian: ARM: only apply no bootloader workaround if xopts{PvMenuLst}
This workaround is only necessary because of how pv-grub-menu works,
so we should only apply both or neither of them.
This results in a long line and I'm about to add a second workaround
to this block, so switch to a regular if block instead of postfixing
on the one command. Move the comment inside that block in preparation
for other workarounds as well.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 13 Aug 2015 16:52:36 +0000 (17:52 +0100)]
ts-debian-di-install: Use exit/poweroff in preference to exit/always_halt
always_halt results in d-i calling "halt", which does not necessarily
power off the host (it seems to for x86/PV Xen guests, but does not for
ARM). Using exit/poweroff calls "poweroff", which is equivalent to
"halt -p"; doing so results in ARM guests powering off as desired.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Fri, 4 Sep 2015 10:46:37 +0000 (11:46 +0100)]
cs-bisection-step: Properly handle external job refs in template
cs-bisection-step has had, for a long time, code which is supposed to
handle the situation where the template flight contains build job
references to other flights.
However:
- The regexp to spot these other-flight job reference runvars would
never match because it said \s where \S was probably intended (and
. would be better);
- If it were to match, the flight and job arguments to the recursive
preparejob invocation were the wrong way round. preparejob takes
the job name first.
Fix these two bugs. Now it does seem to work properly.
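Both bugs are easy to reproduce in miniature. The sketch below uses Python; the pattern, the runvar value and the preparejob stub are hypothetical and are not the expressions used in cs-bisection-step.

```python
import re

value = "37629.build-amd64"  # hypothetical "<flight>.<job>" runvar value

# \s+ demands whitespace after the dot, so a job name never matches.
buggy = re.compile(r"^(\d+)\.(\s+)$")
# \S+ accepts the non-space job name.
fixed = re.compile(r"^(\d+)\.(\S+)$")

print(buggy.match(value))
m = fixed.match(value)
flight, job = m.group(1), m.group(2)

def preparejob(job, flight):
    # Hypothetical stand-in: the job name comes first, as noted above.
    return f"job={job} flight={flight}"

print(preparejob(job, flight))
```

Because the first bug prevented any match, the swapped arguments in the second bug were never exercised, which is presumably how both survived for so long.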
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Fri, 4 Sep 2015 10:38:35 +0000 (11:38 +0100)]
cs-bisection-step: Print our command line at the start
The usual approach for debugging the cs-bisection-step is to repro the
problem (with --max-flight), which is most easily done by copying the
command line provided during a run which did the wrong thing.
Print the command line at startup, so that it appears in the report.
This will save us grobbling through the logs and cron mail.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Find all the places where adhoc-revtuple-generator runs subprograms
and have it add set -x (either by adding $OSSTEST_AHRTG_SETX to an
existing set -e, or using $setx which is either : or `set -x').
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 3 Sep 2015 10:39:37 +0000 (11:39 +0100)]
cr-daily-branch: Make sg-report-flight ignore bisections
sg-report-flight when testing X' (with a baseline of X) can justify a
failure of T(X',Y,Z) with a bisection failure of T(X,Y'',Z).
If Y'' breaks T then this makes it look to sg-report-flight like T was
already broken in X; cr-daily-branch could then push X' even though it
is actually broken.
This happened rarely, because cr-daily-branch's sg-report-flight would
only look at flights on the right branch, so only a bisection of T on
that branch can cause this, but nevertheless this can produce bad
pushes.
So: have cr-daily-branch pass a --blessings option to sg-report-flight,
so that it only looks at (usually) `real' rather than the default of
`real' and also `real-bisect'.
An alternative, more complicated, approach would be for
sg-report-flight to compare versions of Y, Z, et al, when looking for
justifications, but I'm not sure this is desirable because it would
effectively reset the heisenbug compensator each time any other tree
changed.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Fri, 28 Aug 2015 11:28:26 +0000 (11:28 +0000)]
Other-revision-jobs: Update cs-bisection-step
This is rather more subtle. We want to be able to bisect over all the
relevant inputs.
What we actually want to do if one of the *prev* tests fails is to
treat the "previous Xen branch" as a separate "tree" when bisecting,
so each revision tuple has both "current" and "old" Xen versions.
That way if the stable-4.x branch has broken forward migration, we
will report it properly.
Indeed, this needs to be extended not just to the Xen revision, but
all the inputs to the *prev* build.
We achieve this with a new concept, `other-revision job suffix',
introduced in the previous patch. The bisector now works internally
always with tree names which are `<tree>[ <suffix>]' (delimited by a
space). (Henceforth, we'll call `[ <suffix>]' the `othrev'.)
That is, all the revisions specified in prev build jobs are treated as
revisions of different trees to the revisions of apparently-same trees
in non-prev jobs.
The specific changes needed to cs-bisection-step are very small. We
only need to adjust the code which reads and writes the database:
* When we do the cross join on urls and revisions which generates the
rev tuple for a particular flight, also have the database compute
the othrev for each tree. Then, print the othrev in the debug
output, and append it to the tree name.
That resulting name is used everywhere:
It affects `mixed revision' detection, so we consider build-*-prev
jobs with differing revisions to be problematic, or main-revision build
jobs with differing revisions, but we treat each category of build
job separately so the fact that the prev and main build jobs have
different revisions is fine.
The name is used for the key that is returned from flight_rmap.
Thence it is used for the Name in @treeinfos, and therefore the
results from flight_rtuple will be in terms of this decorated tree
namespace.
* When we are preparing a new job to go, we need to (effectively) undo
this transformation. The query which finds the `tree_' variables
for a particular tree name is arranged to take an additional
parameter, which is the othrev. If the othrev does not match the
job, the name is not returned in the results.
Actually, because both the job and the othrev are query parameters,
what happens is either that they match (ie, the othrev in the tree
name from @treeinfos is indeed the othrev for the job we are
constructing) in which case we process the variable as before; or
they don't match, in which case the query contains contradictory
conditions in its AND clauses, and returns no rows.
So the ultimate effect is that we process each Name from @treeinfos
only if it is for this kind of job. This slightly convoluted
implementation arises from the fact that the job-to-othrev mapping
is implemented as SQL, so we need to ask the database.
There is no need to change any of the output processing and reporting,
because "<tree> prev" is a perfectly good thing to print in all the
relevant contexts.
And there is no need to change how we drive adhoc-revtuple-generator,
because we do not pass it tree names at all, only urls.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Fri, 28 Aug 2015 14:17:09 +0000 (15:17 +0100)]
Other-revision-jobs: Provide other_revision_job_suffix
This is a string, a function of the job name, that identifies the
class of `other revisions'. It is empty for main-revision jobs
and currently there is only `<delimiter>prev' for build-*-prev.
We are going to use this in the bisector.
Reimplement main_revision_job_cond in terms of this. No functional
change, except that the SQL optimiser may have more work to do.
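A sketch of what such SQL-expression-building helpers can look like, written in Python against an in-memory SQLite database for illustration. The CASE/LIKE expression, the space delimiter default and the table layout are assumptions; the real helpers live in Osstest.pm and target the osstest database.

```python
import sqlite3

def other_revision_job_suffix(job_expr: str, delimiter: str = " ") -> str:
    """SQL expression giving the suffix for a job-name SQL expression."""
    return (f"(CASE WHEN {job_expr} LIKE 'build-%-prev' "
            f"THEN '{delimiter}prev' ELSE '' END)")

def main_revision_job_cond(job_expr: str) -> str:
    """SQL boolean expression: true when the job is a main-revision job."""
    return f"({other_revision_job_suffix(job_expr)} = '')"

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE jobs (job TEXT)")
db.executemany(
    "INSERT INTO jobs VALUES (?)",
    [("build-amd64",), ("build-amd64-prev",), ("test-amd64-amd64-xl",)],
)
rows = db.execute(
    f"SELECT job, {other_revision_job_suffix('job')}, "
    f"{main_revision_job_cond('job')} FROM jobs ORDER BY job"
).fetchall()
for row in rows:
    print(row)
```

Returning SQL text rather than evaluating in the caller means the classification can be embedded in larger queries, at the price of the extra work for the optimiser mentioned above.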
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com>
Ian Jackson [Fri, 28 Aug 2015 11:16:40 +0000 (11:16 +0000)]
Other-revision-jobs: Provide central test
Since 75fbbc19 "Arrange to test migration from the previous Xen
version", some flights have contained additional jobs build-*-prev,
which build a different revision of xen.git.
However, this violates an existing assumption in several of the
automatic archaeologists, namely that a flight should contain only
runvars referring to a single revision of a tree.
We will need to adjust all the places where this assumption is baked
in. The question arises, as to how the code in general is supposed to
know. There are many possible schemes, but almost all of them would
involve some kind of schema change and/or would be violated by
now-recorded history.
For now we adopt the following rule: the job name tells you. That is,
revision runvars in jobs with certain job names are disregarded. We
call non-disregarded jobs `main-revision jobs', since they use the
`main' revisions of everything, and others `other-revision jobs'.
We provide a single function in Osstest.pm which takes as argument a
SQL expression string representing a job name, and returns a SQL
expression string evaluating to a boolean, specifying whether the job
is a main revision job. This can be used in queries.
In subsequent patches I will go through all plausibly-relevant output
from
git-grep 'revision_\|revision\\\\_'
and update each piece in turn.
There are obviously-irrelevant hits in TestSupport (build_clone and
store_vcs_revision) and in BuildSupport.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Wei Liu [Tue, 11 Aug 2015 20:25:09 +0000 (21:25 +0100)]
Toolstack/libvirt: use URI in migration command
Virsh migrate expects an URI, not a host. We don't actually care what
kind of transport it uses, the main objective is to test migration, so
use xen+ssh for the time being.
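In outline, the command is built from a destination URI rather than a bare host name. The following Python sketch shows the shape only; the helper name, the host name and the exact URI form (for instance whether a user or trailing path is included) are assumptions, not the code in Toolstack/libvirt.

```python
import shlex

def migrate_command(guest: str, dst_host: str) -> str:
    # Destination URI using the xen+ssh transport named above.
    duri = f"xen+ssh://{dst_host}"
    argv = ["virsh", "migrate", "--live", guest, duri]
    # Quote each argument so the string is safe to hand to a remote shell.
    return " ".join(shlex.quote(a) for a in argv)

print(migrate_command("debian.guest.osstest", "dst.example.net"))
```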
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com> Cc: Ian Jackson <ian.jackson@eu.citrix.com>
[ ijc -- added --live and factored out $duri ]
Ian Campbell [Tue, 11 Aug 2015 16:25:10 +0000 (17:25 +0100)]
Arrange to test migration from the previous Xen version
There are several steps to this:
- Identify $prevxenbranch, that is the branch which precedes
$xenbranch.
- Create appropriate build jobs.
- Add support in ts-xen-install for overriding {xen,}buildjob on a
per-ident basis
- Add a new recipe test-pair-oneway which only migrates from
src_host to dst_host and not the reverse
- Create appropriate test jobs, overriding the default builds for
src_host.
Currently we only do this for xen* branches and using xl, but in the
future we may wish to add to the libvirt branch too.
In make-flight if REVISION_PREVXEN is not supplied (e.g. called from
standalone-reset or by hand etc) then we create the build-$arch-prev jobs
with no revision_xen, same as build-$arch
It would be nice to try and reuse the builds from the last flight
which tested the $prevxenbranch baseline. I've not done that here.
Ian Campbell [Wed, 5 Aug 2015 12:48:27 +0000 (13:48 +0100)]
libvirt: Pass correct arguments to virsh migrate
$dst is a host hash/object, resulting in:
2015-08-04 22:35:25 Z executing ssh ... root@172.16.144.34 virsh
migrate debian.guest.osstest HASH(0x28f4310)
bash: -c: line 0: syntax error near unexpected token `('
bash: -c: line 0: `virsh migrate debian.guest.osstest HASH(0x28f4310)'
Switch to using the same pattern as xl.pm, which is to call the
argument (containing the host hash) $dho and for $dst to be a local
variable containing $dho->{Name}.
Also s/$ho/$sho/ to match xl.pm, since I think that is clearer about
what role everything has.
Fix the prototype too while editing this function.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Wei Liu <wei.liu2@citrix.com>
Ian Campbell [Mon, 27 Jul 2015 12:51:27 +0000 (13:51 +0100)]
ts-debian-hvm-install: Use xargs -0 to avoid massive filelist in logs.
The current arrangement is a bit odd, I'm not sure why it would be
that way and it results in a huge list of files in the middle of the
log which is rather boring to scroll through.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 27 Jul 2015 12:51:26 +0000 (13:51 +0100)]
ts-debian-hvm-install: use di_installcmdline_core
This is primarily to get DEBIAN_FRONTEND=text, for easier to read
logging.
Previously the command line consisted of the console and
preseed/file=/preseed.cfg. After this it is more complex.
The preseed file uses file= which is an alias for preseed/file. Extra
options are given including DEBIAN_FRONTEND and DEBCONF_DEBUG and the
following are preseeded via the command line:
Previously implied were "auto=true preseed", which are now explicit.
In addition the following harmless (in this context) options are
added:
hw-detect/load_firmware=
hostname=
netcfg/dhcp_timeout=
netcfg/choose_interface=
The caller could also cause debconf/priority to be set, but doesn't
here.
ts-debian-di-install in the distro test series also uses
di_installcmdline_core for guest uses.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 27 Jul 2015 12:51:25 +0000 (13:51 +0100)]
ts-debian-hvm-install: Remove VGA console runes.
I don't think there is any point in these since c60b6d20b0fd
"ts-debian-hvm-install: Arrange for installed guest to use a serial
console" and they represent an unexplained difference between the
isolinux and grub cases.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Wed, 5 Aug 2015 10:42:10 +0000 (11:42 +0100)]
Executive: Support host_check_allocated outside a job.
When called outside a job there are no hostflags, so get_hostflags
cannot be used. Instead assume a new pseudo-flag "OUTSIDE-JOB" when
there is no $job.
Otherwise uses of select_host such as "mg-hosts mkpxedir" fail.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
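A rough sketch of the fallback described above, assuming a hypothetical `effective_hostflags` helper and a caller-supplied lookup function (this is not the Osstest::Executive code):

```python
def effective_hostflags(job, get_hostflags):
    """Return the flags to use when checking a host allocation.

    job is None when running outside a job. get_hostflags is a
    hypothetical callable that looks up the flags recorded for a job.
    """
    if job is None:
        # No job means there are no recorded hostflags to look up, so
        # use a single pseudo-flag instead of calling the lookup.
        return ["OUTSIDE-JOB"]
    return get_hostflags(job)


def fake_lookup(job):
    # Illustrative data only.
    return ["arch-amd64", "suite-jessie"]


outside = effective_hostflags(None, fake_lookup)
inside = effective_hostflags("test-amd64-amd64-xl", fake_lookup)
```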
[ ijc -- wrong commit message ]
Wei Liu [Sun, 12 Jul 2015 11:29:02 +0000 (12:29 +0100)]
TestSupport: don't put kernel= in HVM config when using xl and libvirt
Setting kernel to hvmloader is ignored in xl but not in libvirt. Libvirt
config converter will translate that then pass it to QEMU. QEMU
complains there is no kernel called hvmloader and exits.
Remove this option for xl and libvirt. Xl is not affected and libvirt
will be able to create HVM guests. Xend might still need it.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
[ ijc -- use toolstack($ho) not $ho->{Toolstack} ]
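The effect of the change can be sketched as a config generator that emits kernel= only for toolstacks that may still need it. This is an illustration under assumptions: `hvm_config_lines` is a hypothetical helper, and the config lines shown are examples rather than the exact ones TestSupport writes.

```python
def hvm_config_lines(toolstack_name, guest_name):
    """Return guest config lines, adding kernel= only where it is wanted."""
    lines = [
        f"name = '{guest_name}'",
        "builder = 'hvm'",
    ]
    if toolstack_name not in ("xl", "libvirt"):
        # xl ignores this setting, but libvirt's converter would pass it
        # on to QEMU, so only other toolstacks (e.g. xend) get it.
        lines.append("kernel = 'hvmloader'")
    return lines


xl_cfg = hvm_config_lines("xl", "debianhvm.guest")
xend_cfg = hvm_config_lines("xend", "debianhvm.guest")
```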
Wei Liu [Sun, 8 Feb 2015 16:05:16 +0000 (16:05 +0000)]
ts-debian-hvm-install: stub out libvirt + ovmf / rombios
Libvirt's configuration converter doesn't know how to deal with BIOS
selection. The end result is it always uses the default one (seabios).
Stub out ovmf and rombios to avoid false positive results.
This restriction will be removed once libvirt's converter knows how to
deal with BIOS selection.
Note that we don't expect to see such configurations any time soon.
These configurations will be filtered in make-flight. The changes here
are more of an extra level of safety check.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Wei Liu [Sun, 8 Feb 2015 15:44:31 +0000 (15:44 +0000)]
sg-run-job: remove save/restore dependency on local migration support
Since we've introduced different checks for save / restore and local
migration, it's possible to run save / restore tests without running
local migration tests.
Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Tue, 16 Dec 2014 12:12:44 +0000 (12:12 +0000)]
osstest migrate support check catch -> variables
The goal here is to skip the following test steps if the check fails.
Instead of using catch to turn an exception into a value, we can just
use spawn-ts and reap-ts to do that. This pattern will be useful when
we later add an extra check for save / restore.
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
[ wei: write commit message ] Signed-off-by: Wei Liu <wei.liu2@citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
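The idea of turning a check's outcome into a plain value, rather than an exception, can be sketched in Python with a child process. This is only an analogy for the spawn-ts / reap-ts pattern; the function names and step labels below are hypothetical.

```python
import subprocess
import sys


def spawn_check(supported):
    """Start a hypothetical support check as a child process.

    The child exits 0 when the feature is supported and 1 otherwise.
    """
    code = "import sys; sys.exit(0 if %r else 1)" % supported
    return subprocess.Popen([sys.executable, "-c", code])


def reap_check(proc):
    """Wait for the child and report its outcome as a boolean."""
    return proc.wait() == 0


def run_steps(migrate_supported):
    """Run a job, skipping the migration step if the check failed."""
    steps = ["guest-start"]
    check = spawn_check(migrate_supported)
    if reap_check(check):
        steps.append("guest-localmigrate")
    steps.append("guest-stop")
    return steps
```

Because the result is an ordinary value, a second check (for example one for save / restore) can be added alongside the first without nesting exception handlers.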
Ian Campbell [Thu, 30 Jul 2015 14:18:20 +0000 (15:18 +0100)]
crontab-cambridge: Add a commented out adhoc bisect line
This is handy to have; editing it in locally just means one cannot
simply use "crontab crontab-cambridge" to load a new one without
remembering the content of the line for next time.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 23 Jul 2015 08:26:31 +0000 (09:26 +0100)]
cambridge: Adjust configuration for move to xs.citrite.net subnet
The Cambridge osstest instance is moving from the .cam.xci-test.com
infrastructure to the xs.citrite.net infrastructure. Adjust the
configuration accordingly.
- The database has been moved from osstestdb.db.cam.xci-test.com to a
new dedicated pgres server at osstestdb.xs.citrite.net. The data has
been transferred. (README.dev is updated to use the production
instance's name instead)
- DHCP leases now come from dns1.uk.xensource.com:5556
- PXE has switched to /usr/groups/netboot. Also switch the templates
to use %name%/pxelinux.cfg with a symlink from
pxelinux.cfg/%ipaddrhex%, which will make it easier to find relevant
files.
- osstser1 is not in the .xs.citrite.net domain
- Logs mount point is now /home/osstest not /home/xc_osstest.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
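For the PXE layout, a small sketch of how the symlink name could be derived. It assumes %ipaddrhex% is the host's IPv4 address written as eight uppercase hex digits, which is the file name pxelinux itself looks for; the helper names, host name, and address below are illustrative.

```python
import ipaddress


def ipaddr_hex(addr):
    """Return an IPv4 address as eight uppercase hex digits."""
    return "%08X" % int(ipaddress.IPv4Address(addr))


def pxe_link(name, addr):
    """Return (link path, link target path) for one host.

    Both paths are relative to the netboot root: the per-host config
    lives under <name>/pxelinux.cfg and the hex-named entry in
    pxelinux.cfg/ is a symlink that resolves to it.
    """
    return f"pxelinux.cfg/{ipaddr_hex(addr)}", f"{name}/pxelinux.cfg"


link, target = pxe_link("example-host", "192.0.2.10")
# link is "pxelinux.cfg/C000020A", target is "example-host/pxelinux.cfg"
```

Keeping the real file under the host's name means the relevant config can be found without first converting the address to hex.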
Ian Jackson [Thu, 23 Jul 2015 16:31:52 +0000 (17:31 +0100)]
cs-bisection-step: Fix memoisation of search_compute_length_at
There was a half-implemented memoisation. Memoisation is necessary
because otherwise the algorithm is exponential in the commit history
depth (with base equal to the commit parent fanout).
Sort this out:
* Break out the actual computation into a new
search_compute_length_at_intern
* Delete the individual memo assignments, which incidentally
means we no longer miss an (unimportant) one.
* Actually have the new memoising function search_compute_length_at
check $n->{LengthAt} (this is the bugfix).
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
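The structure of the fix can be sketched as a memoising wrapper around an inner computation. This is an illustration, not the cs-bisection-step code: the quantity computed here (longest chain of ancestors) and the names `Node`, `build_ladder`, and `counter` are assumptions chosen to show the effect.

```python
class Node:
    """A commit with parents and a memo slot, analogous to $n->{LengthAt}."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.length_at = None


counter = {"calls": 0}  # counts how often the real computation runs


def compute_length_at_intern(n):
    """The actual computation; recurses only through the wrapper."""
    counter["calls"] += 1
    if not n.parents:
        return 1
    return 1 + max(compute_length_at(p) for p in n.parents)


def compute_length_at(n):
    """Memoising wrapper: check the memo before computing."""
    if n.length_at is None:
        n.length_at = compute_length_at_intern(n)
    return n.length_at


def build_ladder(depth):
    """Build a history where every commit has two parents (fanout 2)."""
    level = [Node("root")]
    for i in range(depth):
        level = [Node(f"a{i}", level), Node(f"b{i}", level)]
    return Node("tip", level)


tip = build_ladder(30)
length = compute_length_at(tip)
```

With the memo check in place the inner function runs once per node (62 times here). Without it, the same history would need on the order of 2^30 calls, which is the exponential behaviour described above.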
To reproduce a recent bisection problem I needed to exclude not just
all flights after a certain number, but also one earlier flight. So I
invented this option (and associated yaks).
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 23 Jul 2015 17:32:16 +0000 (18:32 +0100)]
Flight restriction: Change implementation of --max-flight
Abolish $maxflight. All the users outside Osstest::Executive have
been eliminated, so this is fine. Replace it with
$restrictflight_cond, which can accumulate multiple conditions.
There is a minor functional change: when multiple --max-flight options
are specified, _all_ of them take effect (effectively using the lowest
value). That option is not used in production, and I don't expect
people elsewhere to be passing multiple different such options.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
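A sketch of how accumulated restrictions behave, assuming a column called `flight`, an inclusive upper bound, and hypothetical helper names (the real $restrictflight_cond is built in Osstest::Executive and may differ in detail):

```python
def restrictflight_cond(max_flights):
    """Combine every --max-flight value into one SQL boolean expression."""
    conds = [f"flight <= {int(m)}" for m in max_flights]
    if not conds:
        # With no restrictions the expression is trivially true.
        return "TRUE"
    return "(" + " AND ".join(conds) + ")"


def flight_allowed(flight, max_flights):
    """Evaluate the same restriction in Python for a single flight."""
    return all(flight <= int(m) for m in max_flights)


cond = restrictflight_cond([100, 90])
# cond is "(flight <= 100 AND flight <= 90)"
```

Since all conditions are ANDed together, passing several limits is equivalent to passing only the lowest one.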
Ian Jackson [Thu, 23 Jul 2015 16:54:53 +0000 (17:54 +0100)]
Flight restriction: Make report_blessingscond use implicit $maxflight
We have $maxflight in Osstest::Executive now, set appropriately.
Use that in report_blessingscond and all its callers including
report_find_push_age_info and hence in mg-all-branch-statuses and
sg-report-flight and sg-report-job-history.
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>
Ian Jackson [Thu, 23 Jul 2015 17:01:22 +0000 (18:01 +0100)]
Flight restriction: Update cs-bisection-step
Use restrictflight_arg and restrictflight_cond.
This entails replacing $maxflight_cond (which is empty or contains a
series of texts each starting with AND) with $restrictflight_cond
(which is actually an expression, and might be just "TRUE").
No functional change.
Signed-off-by: Ian Jackson <Ian.Jackson@eu.citrix.com> Acked-by: Ian Campbell <ian.campbell@citrix.com>