xenbits.xensource.com Git - people/dstodden/blktap.git/log
people/dstodden/blktap.git
15 years agoCA-40696: Use system macros for fd set sizes.
Daniel Stodden [Fri, 23 Apr 2010 21:59:24 +0000 (14:59 -0700)]
CA-40696: Use system macros for fd set sizes.

15 years agoCA-40696: Bump up max number of tapdisks.
Daniel Stodden [Fri, 23 Apr 2010 02:38:12 +0000 (19:38 -0700)]
CA-40696: Bump up max number of tapdisks.

 * Kernel space blktap is good for 1024 disks.
 * Blktapctrl only for 504, due to fd_set size.

15 years agoCA-40171: Validate vhd parent chain during LV reactivation.
Daniel Stodden [Fri, 16 Apr 2010 02:06:04 +0000 (19:06 -0700)]
CA-40171: Validate vhd parent chain during LV reactivation.

Implementation of tapdisk_vbd_reactivate_volumes follows the vhd
parent chain, but therein lacks a critical check matching child and
parent uuids.

This creates a race window wherein reactivation hits an lv resize on
the master. The new last sector, while unrewritten, may carry garbage
footers. Results may vary, from plain reactivation failures to chain
traversal running off into the weeds.

Fixed with a proper uuid check. Adds some eprintfs to aid debugging
the corner cases.
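
The check described above can be sketched as follows. This is a minimal illustration in Python, not the tapdisk code (which is C): the `images` mapping, its keys, and `ChainError` are hypothetical names chosen for the example.

```python
class ChainError(Exception):
    """Raised when a child's recorded parent uuid does not match the parent."""


def walk_chain(images, leaf):
    """Follow parent links from leaf, checking uuids at every hop.

    images maps an image name to a dict with the keys 'uuid',
    'parent' (name or None) and 'parent_uuid' (the uuid the child
    expects its parent to carry). All of these are illustrative.
    """
    chain = [leaf]
    node = images[leaf]
    while node["parent"] is not None:
        parent = images[node["parent"]]
        # The critical check: stop instead of following a garbage footer.
        if parent["uuid"] != node["parent_uuid"]:
            raise ChainError(f"{node['parent']}: parent uuid mismatch")
        chain.append(node["parent"])
        node = parent
    return chain
```

Without the comparison, a stale or garbage footer would be followed like any other parent link, which is the "running off into the weeds" case.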

15 years agoCA-39974: Revert -r486:94bcdedc9a6d
Daniel Stodden [Thu, 8 Apr 2010 06:43:45 +0000 (23:43 -0700)]
CA-39974: Revert -r486:94bcdedc9a6d

Known to risk queue deadlocks without 487:94a6007d887b.

15 years agoCA-39974; Unbreak tapdisk behaviour on trunk.
Julian Chesterfield [Wed, 7 Apr 2010 11:05:37 +0000 (11:05 +0000)]
CA-39974; Unbreak tapdisk behaviour on trunk.

15 years agoCA-35276: Deproprietarize all blktap sources.
Daniel Stodden [Tue, 6 Apr 2010 01:24:27 +0000 (18:24 -0700)]
CA-35276: Deproprietarize all blktap sources.

Remove the 'XenSource proprietary' stamp from all source headers,
adding a neat BSD-style disclaimer instead.

Won't bother removing build system support code.
Just set relevant source file lists to null.

15 years agoCA-29373: Unwedge queue after new request failure.
Daniel Stodden [Mon, 5 Apr 2010 19:35:18 +0000 (12:35 -0700)]
CA-29373: Unwedge queue after new request failure.

A tapdisk_vbd_issue_request failure is an indication to back off from
further queue processing. The call however will also return error
status of a synchronous request failure.

When failing a new request immediately, we stop making progress. We
break out of the loop, subsequent incoming requests stay on the
new_requests lists, failed_requests is empty, so we block
indefinitely.

Patch decouples queue status from request status, we only back off if
the queue test fails, not the request. Otoh, this means we won't back
off after the first EBUSY encountered. One may argue the decision
about what's busy and not is better made by the driver, not the VBD.
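
A rough sketch of the decoupling, assuming a loop shaped like the one described. `issue` and `queue_ready` are hypothetical callables standing in for the request call and the queue test; the real code is C and differs in detail.

```python
from collections import deque


def issue_new_requests(new_requests, failed_requests, issue, queue_ready):
    """Drain new_requests; back off only when the queue itself is not ready.

    issue(req) returns 0 on success or an errno-style code on failure.
    queue_ready() is a hypothetical stand-in for the queue test.
    """
    while new_requests:
        if not queue_ready():
            break  # queue status says stop; requests stay queued for later
        req = new_requests.popleft()
        err = issue(req)
        if err:
            # A failed request is recorded, but does not end the loop.
            failed_requests.append((req, err))
```

The point of the sketch is that a synchronous request failure lands on `failed_requests` while the loop keeps making progress.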

15 years agoCA-29373: Stop retrying the pathetic case.
Daniel Stodden [Mon, 5 Apr 2010 19:35:16 +0000 (12:35 -0700)]
CA-29373: Stop retrying the pathetic case.

Separating the retryable from the recoverable errors should make the
control path more responsive to broken SRs or images. Fail vreqs with
irrecoverable errors immediately. Presently known ones comprise ESTALE
and ENOSPC. Both don't have a great prospect to improve within the
next two minutes.

Note that this somewhat obsoletes ENOSPC, as our original reason for
adding a tapdisk-level forced-shutdown. However, there may still be
reasons to discard retryable requests.
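
The split between the two error classes might look like the sketch below. The error set comes from the message above; the 120 second figure mirrors the "two minutes" mentioned there, and the function name is made up for illustration.

```python
import errno

IRRECOVERABLE = {errno.ESTALE, errno.ENOSPC}  # the presently known ones
REQUEST_TIMEOUT = 120.0  # seconds; assumed from the "two minutes" above


def vreq_should_fail(err, age):
    """Return True if a failed vreq should be completed with an error now.

    err is the errno reported for the request, age its lifetime in seconds.
    """
    if err in IRRECOVERABLE:
        return True  # retrying will not help, fail immediately
    return age >= REQUEST_TIMEOUT  # retryable: give up only on timeout
```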

15 years agoCP-1613: Fix compiler warnings
Daniel Stodden [Mon, 5 Apr 2010 19:35:16 +0000 (12:35 -0700)]
CP-1613: Fix compiler warnings

The vhd_cache_init/enabled calls were redeclared inline after their
original declaration, yielding an ugly warning. These are module
members, not header macros. Safe to leave the inlining to the
compiler.

15 years agoCA-39535: Break call chain recursion during force-shutdown.
Daniel Stodden [Mon, 5 Apr 2010 19:35:15 +0000 (12:35 -0700)]
CA-39535: Break call chain recursion during force-shutdown.

Pending requests during shutdown flag TD_VBD_SHUTDOWN_REQUESTED,
resulting in an endless vbd_close -> vbd_kick -> vbd_check_state ->
vbd_close loop.

The vbd_check_state call was added to unblock canonical (synchronous)
I/O mode in cset 45c15fdaed55 (Separate tapdisk raw I/O into different
backends).

We only need the queue dispatch during vbd_kick. Fixed by breaking a
vbd_check_queue_state out of vbd_check_state.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
15 years agoImport libvhdio from the XenClient fork of blktap (required for transfervm VHD tests)
Andrei Lifchits [Thu, 1 Apr 2010 02:23:32 +0000 (19:23 -0700)]
Import libvhdio from the XenClient fork of blktap (required for transfervm VHD tests)

15 years agoCP-1613: CR-60/REQ222 - VHD import through Transfer VM (partial)
Ewan Mellor [Tue, 30 Mar 2010 03:53:36 +0000 (04:53 +0100)]
CP-1613: CR-60/REQ222 - VHD import through Transfer VM (partial)

Minor changes to libvhd so that it compiles in the Transfer VM's
environment.

15 years agoCA-39320: fix the endianness of the "vhd-util read -B" output
Andrei Lifchits [Tue, 23 Mar 2010 18:52:29 +0000 (11:52 -0700)]
CA-39320: fix the endianness of the "vhd-util read -B" output

15 years agoCA-37486: Prevent watchdog timeouts on paused VBDs.
Daniel Stodden [Wed, 17 Feb 2010 19:20:56 +0000 (11:20 -0800)]
CA-37486: Prevent watchdog timeouts on paused VBDs.

The 10s watchdog timeout is easily spent pausing. Since
470:e5e6122a457e we will drop the log into a more permanent location,
so avoid that.

Adds tapdisk_vbd_mark_progress() for timestamping. Adds an extra
timestamp mark to unpause.

15 years agoCA-37486: Prevent bogus EIO warnings while quiescing.
Daniel Stodden [Wed, 17 Feb 2010 19:20:56 +0000 (11:20 -0800)]
CA-37486: Prevent bogus EIO warnings while quiescing.

Forwarding treqs while trying to pause barfs. An EBUSY is kind of
misleading (no more than EIO) but right now the only silent one.

15 years agoCA-36298: add more logging to vhd-util-scan for EINVAL error cases
Andrei Lifchits [Mon, 15 Feb 2010 23:16:28 +0000 (15:16 -0800)]
CA-36298: add more logging to vhd-util-scan for EINVAL error cases

15 years agoCA-37618: add a switch to print VHD BAT information as a bitmap
Andrei Lifchits [Wed, 10 Feb 2010 17:53:38 +0000 (09:53 -0800)]
CA-37618: add a switch to print VHD BAT information as a bitmap

15 years agoCA-36385: Prefer AIO eventfd support on kernels >= 2.6.22
Keir Fraser [Fri, 29 Jan 2010 08:55:27 +0000 (08:55 +0000)]
CA-36385: Prefer AIO eventfd support on kernels >= 2.6.22

Mainline kernel support for eventfd(2) in linux aio was added between
2.6.21 and 2.6.22. Libaio after 0.3.107 has the header file, but
presently few systems support it. Neither do we rely on an up-to-date
libc6.

Instead, this patch adds a header which defines custom iocb_common
struct, and works around a potentially missing sys/eventfd.h.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
15 years agoCA-36385: Separate tapdisk raw I/O into different backends.
Keir Fraser [Fri, 29 Jan 2010 08:54:51 +0000 (08:54 +0000)]
CA-36385: Separate tapdisk raw I/O into different backends.

Hide tapdisk support for different raw I/O interfaces behind a new
struct tio. Libaio remains to dominate the interface, requiring
everyone to dispatch iocb/ioevent structs.

Backends:
 - lio:  Kernel AIO via libaio.
 - rwio: Canonical read/write() mode.

Misc:
 - Fixes a bug in tapdisk-vbd which locks up the sync io mode.
 - Wants a PERROR macro in blktaplib.h
 - Removes dead code in qcow2raw to make it link again.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
Signed-off-by: Jake Wires <jake.wires@citrix.com>
15 years agoCA-27472: Revert poll event debug code.
Daniel Stodden [Wed, 8 Jul 2009 02:39:30 +0000 (19:39 -0700)]
CA-27472: Revert poll event debug code.

16 years agoCA-27472: Revert error dispersion debug code.
Daniel Stodden [Wed, 8 Apr 2009 00:25:14 +0000 (17:25 -0700)]
CA-27472: Revert error dispersion debug code.

15 years agoCA-34981: Set a noreturn attribute on td_panic()
Daniel Stodden [Wed, 20 Jan 2010 16:38:05 +0000 (08:38 -0800)]
CA-34981: Set a noreturn attribute on td_panic()

Prevents bogus uninitialized variable usage warnings.

15 years agoCA-35363: fix VHD metadata corruption introduced with cset 453:b86e2a0bbccd
Andrei Lifchits [Sat, 28 Nov 2009 01:52:52 +0000 (17:52 -0800)]
CA-35363: fix VHD metadata corruption introduced with cset 453:b86e2a0bbccd

15 years agoCA-34846: Integrate tapdisk-syslog with the tlog_error path.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Integrate tapdisk-syslog with the tlog_error path.

Currently removes the caller line filter, based on the fact that we
are only reporting terminal vreq failures anyway.

[The alternative would have been to keep filtering and flush the log
only once per loop iteration.]

With an overall request timeout of 2 minutes per request, there should
presently be no need to filter.

This may likely change in future, so bursts of errors would lead to
message loss. We anticipate this by logging to both syslog and the
logfile, which is then reliable.

15 years agoCA-34846: Add non-blocking syslog client.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Add non-blocking syslog client.

Limited talk to syslog on the datapath. Integrates with the event loop
and directly talks to /dev/log. EAGAIN redirects messages into a
fixed-size ring buffer.

The main result is that datapath errors get reported immediately,
instead of being deferred until the next control path intervention.
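
As an illustration of the EAGAIN handling, here is a small Python model. `send` is a hypothetical non-blocking send that raises `BlockingIOError` (Python's mapping of EAGAIN); the ring size and the drop-oldest policy are assumptions, since the message only says the buffer is fixed-size.

```python
from collections import deque


class SyslogClient:
    def __init__(self, send, ring_size=8):
        self.send = send  # hypothetical non-blocking send to /dev/log
        # Assumption: when the ring is full, the oldest message is dropped.
        self.ring = deque(maxlen=ring_size)

    def log(self, msg):
        """Queue msg and try to push everything out right away."""
        self.ring.append(msg)
        self.flush()

    def flush(self):
        """Send queued messages in order; stop at the first EAGAIN."""
        while self.ring:
            try:
                self.send(self.ring[0])
            except BlockingIOError:
                return False  # not writable; retry from the event loop
            self.ring.popleft()
        return True
```

In an event-loop setting, `flush()` would be called again once the socket is reported writable.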

15 years agoCA-34846: Remove disktypes.h from tapdisk.h
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Remove disktypes.h from tapdisk.h

Leads to name clashes: the tapdisk_log driver vs. tapdisk-log.

15 years agoCA-34846: Enable event masking.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Enable event masking.

15 years agoCA-34846: Enable single loop iterations in tapdisk-server.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Enable single loop iterations in tapdisk-server.

15 years agoCA-34846: Support recursive event loop iterations.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Support recursive event loop iterations.

Split scheduler_run_events() into two phases:
 1. scheduler_check_events() processes the results from select().
 2. scheduler_run_events() dispatches results collected during the prior check.

Given that fd notifications are typically level-triggered, this does
slightly more than absolutely necessary (likely might just as well
re-select the event sets instead).

But the given approach should generate a little less overhead and the
split above would also be the way to go when integrating tapdisks with
foreign event loops.
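
The two phases can be modelled roughly as below. This is a Python sketch with invented field names, not the scheduler's actual data structures; `readable` stands for the fd set returned by select().

```python
class Scheduler:
    def __init__(self):
        self.events = []  # each event: {'fd', 'cb', 'pending'}

    def register(self, fd, cb):
        self.events.append({"fd": fd, "cb": cb, "pending": False})

    def check_events(self, readable):
        """Phase 1: record which events fired, given select()'s result set."""
        for ev in self.events:
            if ev["fd"] in readable:
                ev["pending"] = True

    def run_events(self):
        """Phase 2: dispatch whatever the prior check collected."""
        ran = 0
        for ev in list(self.events):
            if ev["pending"]:
                ev["pending"] = False  # clear first, so a nested run skips it
                ev["cb"](ev["fd"])
                ran += 1
        return ran
```

Clearing the pending mark before invoking the callback is what lets a callback call `run_events()` again without re-dispatching itself.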

15 years agoCA-34846: Fix a scheduler glitch.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Fix a scheduler glitch.

Because max_fd = 0 would indicate we're polling stdin in the unlikely
case where no events whatsoever are to be polled. We never poll empty
fd sets, so this should yield no practical effect.

15 years agoCA-34846: Eliminate the restart flag.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34846: Eliminate the restart flag.

Working towards a reentrant scheduler_run_events(). The restart flag
is meant to recover from event struct removals. Instead, do not delete
events during scheduler_event_callback(). Mark them as dead instead,
and collect them on the return path.
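
A minimal model of mark-then-collect, using plain dicts as hypothetical event structs:

```python
def unregister(ev):
    """Called from within callbacks: only mark, never unlink."""
    ev["dead"] = True


def run_callbacks(events):
    """Invoke callbacks, then collect dead events on the return path."""
    for ev in events:  # safe: the list is not mutated during the walk
        if not ev["dead"]:
            ev["cb"](ev)
    # Sweep: drop everything that was marked dead while dispatching.
    events[:] = [ev for ev in events if not ev["dead"]]
```

Because nothing is unlinked during the walk, there is no need for a restart flag to recover the iteration.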

15 years agoCA-34981: Remove nonexistent function declaration.
Daniel Stodden [Sat, 21 Nov 2009 02:54:31 +0000 (18:54 -0800)]
CA-34981: Remove nonexistent function declaration.

15 years agoCA-27472: Sort out tapdisk AIO init.
Daniel Stodden [Sat, 14 Nov 2009 04:26:14 +0000 (20:26 -0800)]
CA-27472: Sort out tapdisk AIO init.

Move event callbacks registration into tapdisk-queue. This should also
remove the need for the dummy pollfd pipe in the synchronous case.

15 years agoCA-27472: Sort out tapdisk IPC init.
Daniel Stodden [Sat, 14 Nov 2009 04:26:13 +0000 (20:26 -0800)]
CA-27472: Sort out tapdisk IPC init.

Move I/O and event callbacks setup out of tapdisk-server, into tapdisk-ipc.

15 years agoCA-27472: Switch vreq failure termination from retry counts to realtime.
Daniel Stodden [Sat, 14 Nov 2009 04:26:11 +0000 (20:26 -0800)]
CA-27472: Switch vreq failure termination from retry counts to realtime.

15 years agoCA-27472: Revert 375:8a93ed55587e (CA-25683 - Don't bump the retry counter..)
Daniel Stodden [Sat, 14 Nov 2009 04:26:10 +0000 (20:26 -0800)]
CA-27472: Revert 375:8a93ed55587e (CA-25683 - Don't bump the retry counter..)

15 years agoCA-34981: Integrate tapdisk-logfile into tapdisk-log.
Daniel Stodden [Sat, 14 Nov 2009 04:26:09 +0000 (20:26 -0800)]
CA-34981: Integrate tapdisk-logfile into tapdisk-log.

15 years agoCA-34981: New tapdisk logfile support.
Daniel Stodden [Sat, 14 Nov 2009 04:26:08 +0000 (20:26 -0800)]
CA-34981: New tapdisk logfile support.

Does
- plain buffered, blocking stdio.
- managed buffer mode and sizes.
- file names and renaming derived from the (sys-)log id.
- syslog-style timestamp prefixes.

15 years agoCA-34981: Disentangle td-utils.o from tapdisk-log.o
Daniel Stodden [Sat, 14 Nov 2009 04:26:07 +0000 (20:26 -0800)]
CA-34981: Disentangle td-utils.o from tapdisk-log.o

Moves tapdisk_start/stop_logging into tapdisk-log.o

15 years agoCA-27617: remove a syslog call on the datapath (oops)
Andrei Lifchits [Wed, 11 Nov 2009 19:44:39 +0000 (11:44 -0800)]
CA-27617: remove a syslog call on the datapath (oops)

15 years agoCA-27617: improve VHD sequential write performance by not skipping bitmap regions...
Andrei Lifchits [Wed, 11 Nov 2009 19:06:42 +0000 (11:06 -0800)]
CA-27617: improve VHD sequential write performance by not skipping bitmap regions when overwriting fully allocated blocks
(i.e., introduce redundant writes to make the pattern on the underlying block device sequential)

15 years agoCA-27732: improve (buffered) read performance on long chains by not splitting vreqs...
Andrei Lifchits [Wed, 11 Nov 2009 19:03:35 +0000 (11:03 -0800)]
CA-27732: improve (buffered) read performance on long chains by not splitting vreqs into multiple treqs.
This reduces the overhead by having fewer treqs percolate through the VHD chain.

15 years agoCA-27472: Add missing ifdef braces to header.
Daniel Stodden [Mon, 9 Nov 2009 23:34:59 +0000 (15:34 -0800)]
CA-27472: Add missing ifdef braces to header.

15 years agoCA-34845: Reduce blktapctrl chatter.
Daniel Stodden [Mon, 9 Nov 2009 21:24:53 +0000 (13:24 -0800)]
CA-34845: Reduce blktapctrl chatter.

   1. Remove the debug log statements in the low-level xs library
      embedded into blktapctrl.

   2. Add dedicated brace messages (got watch/handled watch) only
      after the watch has been validated for processing.

15 years agoCA-32254: Unbreak bvt after -r445:ff5db28c637f
Daniel Stodden [Sat, 7 Nov 2009 00:10:03 +0000 (16:10 -0800)]
CA-32254: Unbreak bvt after -r445:ff5db28c637f

15 years agoCA-32254: Do not indicate channel states in tapdisk errors.
Daniel Stodden [Fri, 6 Nov 2009 19:40:08 +0000 (11:40 -0800)]
CA-32254: Do not indicate channel states in tapdisk errors.

Silly.

15 years agoCA-34208: Upgrade atomic writes to a uuid check.
Daniel Stodden [Fri, 6 Nov 2009 19:16:42 +0000 (11:16 -0800)]
CA-34208: Upgrade atomic writes to a uuid check.

Writes to recycled VBDs are yet to be seen, but it should not hurt either.

15 years agoCA-34208: Quiesce watches for dead/broken channels.
Daniel Stodden [Fri, 6 Nov 2009 19:16:41 +0000 (11:16 -0800)]
CA-34208: Quiesce watches for dead/broken channels.

Notably pause watches otherwise tend to generate spurious errors, if
the channel failed early.

Also fixes a potential bug driving RECYCLED -> PAUSED/UNPAUSED if we
get a spurious pause watch at the wrong moment. Didn't catch this one
when recycled was added.

15 years agoCA-32254: Clean up watch validation.
Daniel Stodden [Fri, 6 Nov 2009 19:16:40 +0000 (11:16 -0800)]
CA-32254: Clean up watch validation.

 * Do not test watch path existence. Let watch handlers check this
   themselves, so pause events don't have to bypass the results.

 * Don't signal EINVAL when the vbd path is gone/recycled. That's not
   nice, but there isn't much left to fatalize either way.

 * The uuid check implies the vbd path check, so skip the extra
   read.

15 years agoCA-32254: Verbose pausing -> paused
Daniel Stodden [Thu, 29 Oct 2009 00:30:12 +0000 (17:30 -0700)]
CA-32254: Verbose pausing -> paused

15 years agoCA-34208: Make additional xenstore updates atomic.
Daniel Stodden [Thu, 29 Oct 2009 00:30:12 +0000 (17:30 -0700)]
CA-34208: Make additional xenstore updates atomic.

Make sure xenstore writes do not accidentally recreate the VBD
node. This used to be done for triggering VBD reprobes only.

The failure-free case mainly covers pause-done.

To cover cases where (additional) failures occur, include
tapdisk-error as well.

15 years agoCA-34208: Anticipate VBD gone when removing pause-done; plus add more verbosity.
Daniel Stodden [Thu, 29 Oct 2009 00:30:12 +0000 (17:30 -0700)]
CA-34208: Anticipate VBD gone when removing pause-done; plus add more verbosity.

15 years agoCA-31481: Add syslog facility option to blktapctrl and tapdisk.
Daniel Stodden [Mon, 26 Oct 2009 12:48:23 +0000 (05:48 -0700)]
CA-31481: Add syslog facility option to blktapctrl and tapdisk.

15 years agoCA-33766: Consolidate start/shutdown-tapdisk to tapdisk-request.
Daniel Stodden [Mon, 26 Oct 2009 12:37:09 +0000 (05:37 -0700)]
CA-33766: Consolidate start/shutdown-tapdisk to tapdisk-request.

15 years agoCA-32254: Minor log output corrections.
Daniel Stodden [Mon, 26 Oct 2009 12:36:42 +0000 (05:36 -0700)]
CA-32254: Minor log output corrections.

15 years agoCA-32254: Fix a memory leak.
Daniel Stodden [Mon, 26 Oct 2009 12:36:26 +0000 (05:36 -0700)]
CA-32254: Fix a memory leak.

15 years agoCA-33664: Close unplug/plug race window since CA-32254.
Daniel Stodden [Tue, 13 Oct 2009 22:07:50 +0000 (15:07 -0700)]
CA-33664: Close unplug/plug race window since CA-32254.

Race window in tapdisk_daemon_probe_vb if VBD recreation is observed
before we managed to close the previous channel.

We don't permit duplicate channels on the same XS keys. Instead add
TAPDISK_VBD_RECYCLED, which indicates a channel whose vbd state went
one step beyond DEAD, pending a synchronous re-probe.

Trying to avoid spurious probe failures, we cannot just rerun probe()
after destruction. Instead, perform a test/write transaction cycle.
We abort in the unlikely case where we destroy the channel after the
backend path was already gone again.

The related case of observing a remove event in RECYCLED state would
drive us back to DEAD, which should work okay.

15 years agoCA-32254: Fix error code display broken by 433:660ec5748510.
Daniel Stodden [Tue, 13 Oct 2009 22:07:50 +0000 (15:07 -0700)]
CA-32254: Fix error code display broken by 433:660ec5748510.

15 years agoCA-32254: Rewrite channel/vbd state machine.
Daniel Stodden [Thu, 8 Oct 2009 05:47:56 +0000 (22:47 -0700)]
CA-32254: Rewrite channel/vbd state machine.

 * Refine channel states:

   - Replaces the former IDLE state with a more detailed representation:
       - DEAD (just after spawning the channel)
       - LAUNCHED (just after spawning the channel)
       - PID (following a pid response)
       - RUNNING (following an open resume response)
       - PAUSED (following a pause response)

   - Former channel->open replaced with TAPDISK_CHANNEL_IPC_OPEN(),
     based on the above channel states.

   - Former channel->connected removed. It was used as an indicator
     for pause request legitimacy, but process state is not a criterion
     anymore.

 * Add 'vbd state', reflecting agent control state:
    - UNPAUSED (may be RUNNING)
    - PAUSING  (following a pause request)
    - PAUSED   (finished PAUSING)
    - BROKEN   (following a fatal error)
    - DEAD     (following channel->path removal)

 * Add 'shutdown state', reflecting kernel control state:
    - UP       (tapdisk shall live)
    - DOWN     (tapdisk shall die)

 * Reimplement pause/unpause on top of that:
    - mapping vbd and shutdown states to a channel state to be driven.
    - tapdisk-daemon drives vbd DEAD state on path removal
    - allow start-tapdisk to resurrect channels: CLOSED/DEAD -> LAUNCHED
    - the point is that this maintains vbd paused state across tapdisk exits.

15 years agoCA-32254: Sort out tapdisk_channel_launch_tapdisk complement.
Daniel Stodden [Tue, 6 Oct 2009 03:10:48 +0000 (20:10 -0700)]
CA-32254: Sort out tapdisk_channel_launch_tapdisk complement.

15 years agoCA-32254: Separate code for XS pause/unpause completion indications.
Daniel Stodden [Tue, 6 Oct 2009 03:09:42 +0000 (20:09 -0700)]
CA-32254: Separate code for XS pause/unpause completion indications.

15 years agoCA-32254: Clean up tapdisk_channel_receive_open_response.
Daniel Stodden [Tue, 6 Oct 2009 02:05:27 +0000 (19:05 -0700)]
CA-32254: Clean up tapdisk_channel_receive_open_response.

15 years agoCA-32254: Sort out tapdisk_channel_open complement.
Daniel Stodden [Tue, 6 Oct 2009 02:05:27 +0000 (19:05 -0700)]
CA-32254: Sort out tapdisk_channel_open complement.

15 years agoCA-32254: Sort out tapdisk_channel_gather_info complement.
Daniel Stodden [Tue, 6 Oct 2009 02:05:26 +0000 (19:05 -0700)]
CA-32254: Sort out tapdisk_channel_gather_info complement.

15 years agoCA-32254: Sort out tapdisk_channel_init complement.
Daniel Stodden [Tue, 6 Oct 2009 02:05:25 +0000 (19:05 -0700)]
CA-32254: Sort out tapdisk_channel_init complement.

15 years ago[mq]: tapdisk-channel-clear-watches.diff
Daniel Stodden [Mon, 5 Oct 2009 18:49:45 +0000 (11:49 -0700)]
[mq]: tapdisk-channel-clear-watches.diff

15 years agoCA-30816: Fix shutdown phase on device reset.
Daniel Stodden [Sat, 11 Jul 2009 01:22:19 +0000 (18:22 -0700)]
CA-30816: Fix shutdown phase on device reset.

Cset 397:272912c3d0ec (CA-26523) broke the tapdisk shutdown phase run
on a reset of the backend's XenStore root node. I missed the fact that
this is the point where the shutdown message in channel_close() is
essential.

This patch keeps the channel_reap() operation to close the startup
failure loophole, but doesn't defer channel_close(). We now free the
channel only - but unconditionally - after tapdisk exit.

Removed the comment on safe errors since it doesn't apply any more. At
the expense of splitting the original close code.

15 years agoCA-27472: Dump event log on EBUSY vreq failure
Daniel Stodden [Wed, 8 Jul 2009 02:39:31 +0000 (19:39 -0700)]
CA-27472: Dump event log on EBUSY vreq failure

Add debug code to let make_response() dump the event ring if we fail
the retry count with an EBUSY status.

15 years agoCA-27472: Log past events to a ring buffer
Daniel Stodden [Wed, 8 Jul 2009 02:39:30 +0000 (19:39 -0700)]
CA-27472: Log past events to a ring buffer

Adds debug code to the main event loop to trace unexpectedly high
event frequencies we sometimes seem to come across.

Comprises a 1K ring buffer and makes the select loop append fd sets
plus relative time into it. We cover up to 128 events. We only cover
the lower 32 fds to save space, but that's usually more than needed.
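
The sizing works out to 1024 / 128 = 8 bytes per event. One layout that fits, assumed here purely for illustration, is a 32-bit fd mask plus a 32-bit relative time; the actual debug code may pack its entries differently.

```python
import struct

RING_BYTES = 1024
ENTRY = struct.Struct("<II")  # assumed layout: 32-bit fd mask, 32-bit time
SLOTS = RING_BYTES // ENTRY.size  # 1024 / 8 = 128 events


class EventRing:
    def __init__(self):
        self.buf = bytearray(RING_BYTES)
        self.count = 0

    def append(self, fds, rel_usec):
        mask = 0
        for fd in fds:
            if 0 <= fd < 32:  # only the lower 32 fds are recorded
                mask |= 1 << fd
        slot = self.count % SLOTS  # overwrite the oldest entry when full
        ENTRY.pack_into(self.buf, slot * ENTRY.size,
                        mask, rel_usec & 0xFFFFFFFF)
        self.count += 1

    def entries(self):
        """Return (mask, rel_usec) tuples, oldest first."""
        n = min(self.count, SLOTS)
        return [ENTRY.unpack_from(self.buf, (i % SLOTS) * ENTRY.size)
                for i in range(self.count - n, self.count)]
```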

15 years agoCA-27472: Add failure counts to vbd_debug()
Daniel Stodden [Wed, 8 Jul 2009 02:38:08 +0000 (19:38 -0700)]
CA-27472: Add failure counts to vbd_debug()

While we are at it, dump the whole timeval resolution instead of
milliseconds.

15 years agoCA-27472: Fix 405:929fa443deff
Daniel Stodden [Tue, 7 Jul 2009 22:09:04 +0000 (15:09 -0700)]
CA-27472: Fix 405:929fa443deff

Aiieee, vreq time, not the vbd stamp.

15 years agoCA-30475; Use the BLKGETSIZE64 ioctl when querying a raw device for its size.
Julian Chesterfield [Thu, 25 Jun 2009 13:25:23 +0000 (14:25 +0100)]
CA-30475; Use the BLKGETSIZE64 ioctl when querying a raw device for its size.

15 years agoUse xen_rmb() as rmb() is no longer defined by the libxc headers
Tim Deegan [Thu, 11 Jun 2009 09:45:50 +0000 (10:45 +0100)]
Use xen_rmb() as rmb() is no longer defined by the libxc headers

15 years agoCA-30196; Advertise the tapdisk-pid on initialisation to enable QoS settings to be...
Julian Chesterfield [Fri, 5 Jun 2009 11:05:07 +0000 (12:05 +0100)]
CA-30196; Advertise the tapdisk-pid on initialisation to enable QoS settings to be applied to
the correct process.

16 years agoCA-29519: fix tapdisk-diff EBUSY errors
Andrei Lifchits [Fri, 15 May 2009 18:36:52 +0000 (11:36 -0700)]
CA-29519: fix tapdisk-diff EBUSY errors

16 years agoCA-29010: do not enforce OPEN_STRICT (for exclusivity of RW mode) because we don...
Andrei Lifchits [Thu, 7 May 2009 00:54:17 +0000 (17:54 -0700)]
CA-29010: do not enforce OPEN_STRICT (for exclusivity of RW mode) because we don't have a story for crash recoveries (this problem was uncovered after fixing CA-28285). [The previous cset (417:417711d49f8f) also refers to CA-29010, not CA-19010]

16 years agoCA-19010: Make vhd-util repair work on block devices.
Daniel Stodden [Wed, 6 May 2009 18:08:53 +0000 (11:08 -0700)]
CA-19010: Make vhd-util repair work on block devices.

The repair facility opens a VHD, recovers metadata from the backup
footer, then writes the data back to the primary footer at
end-of-file.

We only had it implemented for file-based VDIs, the difference being
that our definition of 'end-of-file' is a slightly different one on
block devices.

This patch removes vhd-util's own assumptions about what the right
footer offset is and lets instead vhd_write_footer() do the math. The
price is a redundant backup footer re-write in the repair case and a
redundant ftruncate() issued by tapdisk's vhd_close() in the
non-faulty case.

None of those are on a critical path. So let's avoid declaring more
funny functions non-static.

16 years agoCA-28285: fix prior cset 413:61009e9850e9
Daniel Stodden [Wed, 6 May 2009 18:05:32 +0000 (11:05 -0700)]
CA-28285: fix prior cset 413:61009e9850e9

16 years agoCA-28511: don't deactivate LVs in tapdisk
Andrei Lifchits [Fri, 24 Apr 2009 05:18:13 +0000 (22:18 -0700)]
CA-28511: don't deactivate LVs in tapdisk

16 years agoCA-25742: make sure the user can still delete a VHD in ENOSPACE conditions when we...
Andrei Lifchits [Thu, 16 Apr 2009 21:12:27 +0000 (14:12 -0700)]
CA-25742: make sure the user can still delete a VHD in ENOSPACE conditions when we can't write the primary footer

16 years agoCA-28285: heed an invalid primary footer when opening a VHD in RW mode
Andrei Lifchits [Thu, 16 Apr 2009 21:09:26 +0000 (14:09 -0700)]
CA-28285: heed an invalid primary footer when opening a VHD in RW mode

16 years agoCA-27472: Improve the timestamps kept on tlog errors.
Daniel Stodden [Fri, 10 Apr 2009 03:49:40 +0000 (20:49 -0700)]
CA-27472: Improve the timestamps kept on tlog errors.

Apart from the buffering, the tapdisk-log code collects log entries in
buckets. Every error condition is saved only once, later occurrences
just increment a counter. Presently this means one only gets to see
the timestamp of the very first occurrence.

While this is often sufficient, a bit more information may sometimes
help to tell transient from recurrent issues. Dump first and last
occurrence per dispersion instead.
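
A sketch of the bucket bookkeeping, assuming each bucket is keyed by its error condition and timestamps are `(sec, usec)` pairs mirroring struct timeval. The class and field names are illustrative only.

```python
class ErrorBuckets:
    def __init__(self):
        self.buckets = {}

    def record(self, key, tv):
        """Count an occurrence of key at time tv, a (sec, usec) tuple."""
        b = self.buckets.get(key)
        if b is None:
            # First occurrence: it is both the first and the last so far.
            self.buckets[key] = {"count": 1, "first": tv, "last": tv}
        else:
            b["count"] += 1
            b["last"] = tv  # later occurrences only move the 'last' stamp
```

A bucket whose first and last stamps are close together points at a transient burst; stamps far apart suggest a recurrent issue.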

Saving tv structs also simplifies the string formatting code a little.

16 years agoCA-27472: Remove redundant function name from error formatting.
Daniel Stodden [Fri, 10 Apr 2009 03:49:40 +0000 (20:49 -0700)]
CA-27472: Remove redundant function name from error formatting.

The function name is output during tlog flush anyway, and at a more
prominent position. Save the space and make output more concise.

16 years agoCA-27472: Limit dispersion tracking to timevals.
Daniel Stodden [Fri, 10 Apr 2009 03:49:40 +0000 (20:49 -0700)]
CA-27472: Limit dispersion tracking to timevals.

Get rid of the FPU stuff. We're only tracking time this way. The
floats lack the necessary precision to track absolute time, and fixed
integer arithmetic is nicer anyways.
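
To illustrate the precision argument, assuming the floats in question were single precision: near an epoch value of about 1.2e9, adjacent 32-bit floats are 128 seconds apart, so absolute timestamps cannot be stored exactly. The helpers below are an illustrative Python sketch, not the tapdisk code.

```python
import struct


def tv_sub(a, b):
    """a - b for (sec, usec) pairs, using integer arithmetic only."""
    sec, usec = a[0] - b[0], a[1] - b[1]
    if usec < 0:
        # Borrow one second, as the usual timersub logic does.
        sec, usec = sec - 1, usec + 1_000_000
    return (sec, usec)


def as_float32(x):
    """Round x to the nearest single-precision float."""
    return struct.unpack("f", struct.pack("f", x))[0]
```

Comparing `as_float32(t)` with `t` for an epoch timestamp shows the rounding, while `tv_sub` stays exact to the microsecond.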

16 years agoCA-27472: Strformat buffered tapdisk error timestamps, syslog-style.
Daniel Stodden [Fri, 10 Apr 2009 03:49:40 +0000 (20:49 -0700)]
CA-27472: Strformat buffered tapdisk error timestamps, syslog-style.

Avoids tedious conversion of <sec>.<usec> formatted output to ascii
time in order to determine failure context during triage.

And the space advantage of dotted notation appears rather negligible.

16 years agoCleanup watchdog timeout calculation.
Daniel Stodden [Fri, 10 Apr 2009 03:49:40 +0000 (20:49 -0700)]
Cleanup watchdog timeout calculation.

Just cleanup, but off by a 500 ms average.

16 years agoCA-27472: Count/log incoming/outgoing VBD kicks in tapdisk.
Daniel Stodden [Fri, 10 Apr 2009 03:49:40 +0000 (20:49 -0700)]
CA-27472: Count/log incoming/outgoing VBD kicks in tapdisk.

The notification count helps debugging some interesting VBD
problems. One is that dom0 plugs have a problem with batching
requests issued through the frontend device (xvd). Looks like there is
no batching at all.

16 years agostatic tapdisk_vbd_queue_ready()
Daniel Stodden [Wed, 8 Apr 2009 17:17:32 +0000 (10:17 -0700)]
static tapdisk_vbd_queue_ready()

Queue status should be of no concern to drivers nor servers, and
actually isn't. Declare as static, so it can be inlined if
appropriate.

16 years agoCA-27472: Add cheap vreq time-to-failure dispersion statistics.
Daniel Stodden [Wed, 8 Apr 2009 00:25:14 +0000 (17:25 -0700)]
CA-27472: Add cheap vreq time-to-failure dispersion statistics.

We cannot just dump request timestamps, since the log would explode if
failures occur frequently. Instead, produce some minimum amount of
statistics.

For single, apparently spurious failures (e.g. CA-26804), this may
help make a more informed guess on whether we are looking at image
driver issues or whether the current event and retry logic is to blame.

16 years agoCA-27472: Add start-of-day timestamps on vreqs.
Daniel Stodden [Wed, 8 Apr 2009 00:25:02 +0000 (17:25 -0700)]
CA-27472: Add start-of-day timestamps on vreqs.

Upcoming csets will depend on vreq lifetime calculations, so saving
vreq allocation time is what it does.

16 years agoCA-27472: Add slightly more debug info to the request failure path.
Daniel Stodden [Tue, 7 Apr 2009 23:46:19 +0000 (16:46 -0700)]
CA-27472: Add slightly more debug info to the request failure path.

Replace the hardcoded EIO we dump to the logs. We do have an actual
error to display. The interesting piece is actually vreq->error, response
status is rather boring.

Doing it like this should also prevent people from attributing
failures to *real* EIO, as would be the case with NFS or physical
device issues.

Note that vreq->error may be zero on some failure paths. Consider this
useful information as well; these cases can be attributed to
tapdisk-vbd itself, meaning the respective treqs may have never made
it down to the block driver.

16 years agoMerge consecutive gettimeofdays.
Daniel Stodden [Tue, 7 Apr 2009 22:47:55 +0000 (15:47 -0700)]
Merge consecutive gettimeofdays.

Three cheers for a leaner strace.

16 years agoUse vbd_add_image() inline macro where appropriate.
Daniel Stodden [Tue, 7 Apr 2009 22:47:54 +0000 (15:47 -0700)]
Use vbd_add_image() inline macro where appropriate.

16 years agoRemove executables from .phony target set.
Daniel Stodden [Fri, 2 Jan 2009 22:32:32 +0000 (14:32 -0800)]
Remove executables from .phony target set.

Avoid pointlessly relinking utility programs.

16 years agoCA-27988: reactivate all LVs in the VHD chain even on resume
Andrei Lifchits [Tue, 7 Apr 2009 04:18:54 +0000 (21:18 -0700)]
CA-27988: reactivate all LVs in the VHD chain even on resume

16 years agoCA-26523: Allow for alternate tapdisk programs spawned by blktapctrl.
Daniel Stodden [Mon, 6 Apr 2009 20:22:21 +0000 (13:22 -0700)]
CA-26523: Allow for alternate tapdisk programs spawned by blktapctrl.

Some experiments, such as profiling blktap, get a whole lot easier if
there's a simple way to wrap tapdisk processes in custom code. A
typical example would be running tapdisk processes through Valgrind.

This patch makes blktapctrl sensitive to an environment variable
TAPDISK, which can be set to point to a replacement program which
blktapctrl is then supposed to spawn instead. PATH search per execvp()
remains unaffected.
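
In outline, the lookup could work as sketched below. The default program name `tapdisk` and the helper names are assumptions made for the example; only the `TAPDISK` variable and the use of execvp() come from the message.

```python
import os

DEFAULT_TAPDISK = "tapdisk"  # assumed default program name


def tapdisk_argv(args, environ=os.environ):
    """Build the argv to hand to execvp(), honouring $TAPDISK if set."""
    program = environ.get("TAPDISK") or DEFAULT_TAPDISK
    return [program] + list(args)


def spawn_tapdisk(args):
    """Fork and exec the (possibly replaced) tapdisk program."""
    argv = tapdisk_argv(args)
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # PATH search, as with execvp(3)
        finally:
            os._exit(127)  # only reached if the exec failed
    return pid
```

Pointing `TAPDISK` at a wrapper script that runs the real binary under Valgrind would be one way to use this.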

16 years agoCA-26523: Adjust blktapctrl exit status to main loop result.
Daniel Stodden [Mon, 6 Apr 2009 20:22:21 +0000 (13:22 -0700)]
CA-26523: Adjust blktapctrl exit status to main loop result.

In summary, this remains as unused as the prior code: the catch is
that we presently never exit voluntarily.

A watchdog parent + termination on signals, fatal conditions, etc,
would change the landscape. Especially since exiting with running
tapdisks is presently irrecoverable.

So for now just clean stuff up a little to prevent people from having
to investigate where all those return values go.

16 years agoCA-26523: Maintain child affiliation in blktapctrl.
Daniel Stodden [Mon, 6 Apr 2009 20:22:21 +0000 (13:22 -0700)]
CA-26523: Maintain child affiliation in blktapctrl.

Early fatal tapdisk failures are rare but presently go unnoticed. The
less improbable pathologic example case covered here is an AIO init
failure due to resource exhaustion.

At the very least, we would (a) start tapdisk1 synchronously and test
for errors or (b) defer post-daemonization to somewhere after PID_MSG.

Synchronous starts recently becoming more popular -- we used to
daemonize from the first instruction -- sheds light on a more general
issue: tapdisk1 shouldn't detach from blktapctrl at all.

Ergo:

 - Eliminate daemon()ization from tapdisk1.

 - Leaves detaching as a (hopefully unpopular) option, in case
   something is ever found to disagree.

 - Makes blktapctrl reap children.

 - Associates child exits with respective tapdisk-channels.

 - Lets channels detect and escalate (via XS) unclean tapdisk losses.

At this point, even doing the right thing (tm) and suspending guests
in face of crashed tapdisks becomes an option, but this takes an
additional patch or two on the agent.
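
The reaping step might be modelled like this. `channels` is a hypothetical pid-to-channel mapping, and treating anything other than a zero exit status as an unclean loss is an assumption of the sketch, not something the message specifies.

```python
import os


def reap_children(channels, waitpid=os.waitpid):
    """Collect exited children; return channels that lost a tapdisk uncleanly.

    channels maps a tapdisk pid to a channel dict (illustrative structure).
    """
    lost = []
    while True:
        try:
            pid, status = waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        channel = channels.pop(pid, None)
        if channel is None:
            continue  # not a tapdisk we know about
        channel["pid"] = None
        clean = os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
        if not clean:
            lost.append(channel)  # candidate for escalation via XS
    return lost
```

The `waitpid` parameter exists only so the function can be exercised without forking real children.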

16 years agoCA-26523: Void tapdisk_daemon_check_fds().
Daniel Stodden [Mon, 6 Apr 2009 20:22:21 +0000 (13:22 -0700)]
CA-26523: Void tapdisk_daemon_check_fds().

Failures go unused either way. Would not disrupt overall event
dispatch for any individual failure.

16 years agoCA-26523: Add global PERROR macro.
Daniel Stodden [Mon, 6 Apr 2009 20:22:21 +0000 (13:22 -0700)]
CA-26523: Add global PERROR macro.