This call to guest_umount_lv is here for the benefit of ad-hoc reruns
of (eg) ts-guest-start tidy up any ad-hoc messing about (eg from
earlier runs of ts-debian-fixup or something). It is not needed in
production runs.
Serendipitously, this osstest code discovered a bug in the Linux
blkback: when tearing down, it sets the backend state to 6 before it
has closed the underlying block devices. This ultimately means that
after "xl destroy" or "xl shutdown -w" there is a period when the
guest's open handle onto its storage is still open. This is wrong.
This detection depends on us winning a tricky race. So it shows up in
osstest as a very low probability heisenbug. The bug is currently in
all versions of Linux and causing a bit of a nuisance.
It would be best to add a proper check for this bug. However, this is
quite fiddly: really, it ought to be done as close to the xl command
completion as possible, in the same ssh invocation. That would
involve a fair bit of plumbing and ad-hocery. I don't think that
would be proportionate for such a low-impact bug.
So instead in this patch I just disable this cleanup code in the
troublesome case, unless it is explicitly requested by the user
setting OSSTEST_GUEST_DISK_MOUNT_CLEANUP to a trueish value. (This
would be reasonably convenient for the ad-hoc testing that this call
serves.)
Thanks to Roger for diagnosing the Linux kernel bug.
CC: Jürgen Groß <jgross@suse.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
Reviewed-by: Roger Pau Monne <roger.pau@citrix.com>
sub guest_prepare_disk ($) {
my ($gho) = @_;
- guest_umount_lv($gho->{Host}, $gho);
+ guest_umount_lv($gho->{Host}, $gho)
+ if $ENV{'OSSTEST_GUEST_DISK_MOUNT_CLEANUP'};
return if ($gho->{Diskfmt} // 'none') eq "none";