Owen Smith [Fri, 11 Oct 2019 13:51:10 +0000 (14:51 +0100)]
Fix > 2TB disks
In order to determine the size of a disk, Windows will issue a
SCSIOP_READ_CAPACITY. Disks larger than 2TB will respond with a max LBA
of 0xFFFFFFFF, which causes Windows to issue a SCSIOP_READ_CAPACITY16.
The read capacity 16 is passed with a 12 byte buffer to be filled in
using the READ_CAPACITY_DATA_EX structure, not the 16 or 32byte
(depending on packing) READ_CAPACITY16_DATA buffer.
Also adds Error labels to the failure conditions.
Owen Smith [Thu, 19 Sep 2019 08:44:15 +0000 (09:44 +0100)]
Attempt to process responses on the ring
When Disabling the ring, outstanding responses need to be completed.
Poll the ring to complete outstanding responses if the backend is still
connected and valid.
Owen Smith [Thu, 19 Sep 2019 08:30:41 +0000 (09:30 +0100)]
Rework BlkifRingDisable
Clean up all prepared and submitted requests when the ring is disabled,
so that outstanding SRBs are returned to storport for queueing. This is
especially important on the return from suspend path, as the ring is no
longer valid, and any submitted requests would be lost and trigger a
storport target reset.
Also ignores missing requests for responses.
Owen Smith [Thu, 19 Sep 2019 08:24:28 +0000 (09:24 +0100)]
Attempt to process responses on the ring
When Disabling the ring, outstanding responses need to be completed.
Poll the ring to complete outstanding responses if the backend is still
connected and valid.
Owen Smith [Thu, 5 Sep 2019 15:29:04 +0000 (16:29 +0100)]
Rework request submission
Make BlkifRingPostRequests return success for submitting 0 or more requests,
or failure when the ring is full. This prevents the loop in
BlkifRingSchedule() from preparing the next SRB when the ring is already
full.
Also attempt to notify the backend of changes every iteration of the loop in
BlkifRingSchedule(), to trigger the backend as soon as possible.
Owen Smith [Thu, 5 Sep 2019 08:42:38 +0000 (09:42 +0100)]
Update rsp_event during BlkifRingPoll()
Currently, by updating it in __BlkifRingPushRequests(), the code is
attempting to defer events until all of the posted requests have responses
on the ring. This is likely to lead to a cycle of fill...empty...fill...
empty etc., which is bad for performance.
This patch instead updates rsp_event when BlkifRingPoll() completes, such
that the very next response placed on the ring by the backend should
cause an event to be sent.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
[Expanded commit message] Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Fri, 14 Jun 2019 15:42:49 +0000 (16:42 +0100)]
Add PowerShell build scripts, version.vcxproj
Based on the sequence of commits to xenbus, add powershell scripts to
build the solution using the EWDK
version.vcxproj generates versioned files (version.h and xenvbd.inf) using
scripts/genfiles.ps1
Strips duplicated functionality from build.py to produce consistant
builds between python and powershell.
Paul Durrant [Fri, 26 Apr 2019 08:53:34 +0000 (09:53 +0100)]
Try to avoid dumping non-RAM pages
When XENCRSH sets up blkif requests they may end up referring to PFNs that
are ballooned out. When these requests reach the backend driver, it will
unsurprisingly encounter failures when trying to map or copy the data from
these PFNs, generally resulting in the request as a whole being failed and
a lot of noise being emitted to various logs.
This patch adds a check into PrepareReadWrite() to check the P2M type of
PFNs being dumped. If the type is found to be anything other than writable
RAM then the PFN is substituted with a buffer PFN, which will just contain
zeroes. The storage backend will be able to map or copy these pages, so
stalls in the dump process and useless log messages will be avoided.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 9 Apr 2019 16:03:08 +0000 (17:03 +0100)]
Report disk size and logical sector size in XENDISK...
...rather than XENVBD.
This allows us to use the PDO name rather than the more obscure target
number. Also, report the size in MB rather rather than in sectors (now
that sector size may be something other than 512B).
Also fix some whitespace bugs while in the neighbourhood.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 5 Apr 2019 15:51:08 +0000 (16:51 +0100)]
Add 'feature-large-sector-size'
As explained in Xen commit 67e1c050 "public/io/blkif.h: try to fix the
semantics of sector based quantities" [1], frontends that always
supply and interpret sector based quantities in terms of the 'sector-size'
of the backend should declare 'feature-large-sector-size'.
Owen Smith [Wed, 26 Sep 2018 09:47:36 +0000 (10:47 +0100)]
Fix BSOD on RingDestroy
Zero Frontend->MaxQueues after calling RingDestroy, as RingDestroy will
query this value to free each BlkifRing, which will decrement an
unsigned value below 0.
Also adds an ASSERT to detect if FrontendGetMaxQueues returns 0.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Test that Index != 0 rather than > 0, since it is an unsigned quantity.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Fri, 24 Aug 2018 16:46:43 +0000 (17:46 +0100)]
Conditionally package DPInst
Since DPInst.exe is not shipped with the Windows Driver Kit 10, an
environment variable must point to local copies. Make the inclusion of
DPInst conditional on DPINST_REDIST being defined and that path
existing. This simplifies building packages which do not require DPInst
for installation, and removes a required step to create a working build.
Paul Durrant [Mon, 6 Aug 2018 09:47:58 +0000 (10:47 +0100)]
Remove bogus ASSERTion
In a checked build the code in BlkifRingSchedule() sometimes hits the
ASSERTion:
ASSERT3U(State->Count, ==, 0);
This check is there because this code was ported across from XENVIF. In
the context of that driver the check is valid because it should never be
possible to post a partial sequence of netif requests (since that would
violate the protocol). However, in the context of XENVBD posting blkif
requests, it is perfectly reasonable for a subset of blkif requests for
a single SRB to be posted, and hence __BlkifRingPostRequests() may exit
before State->Count falls to zero. Thus the ASSERTion is invalid in this
context and needs to be removed.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Fri, 3 Aug 2018 13:48:17 +0000 (14:48 +0100)]
Add some sensible default overrides
Without overrides, "max-ring-page-order" and "multi-queue-max-queues"
use values that will consume large amounts of grant references.
"max-ring-page-order" will default to 4 (16 pages per ring)
"multi-queue-max-queues" will default to the lowest of guest vCPU count
or backend's vCPU count.
Override "max-ring-page-order" to 1 (2 pages per ring) and
"multi-queue-max-queues" to 2 (2 rings per block device)
Owen Smith [Mon, 4 Jun 2018 14:37:36 +0000 (15:37 +0100)]
Set StorPortInitializePerfOpts
Sets DPC_REDIRECTION, OPTIMIZE_FOR_COMPLETION_DURING_STARTIO and
CONCURRENT_CHANNELS (to "multi-queue-max-queues") to improve StorPort's
distribution of SRBs to vCPUs.
Owen Smith [Mon, 4 Jun 2018 14:11:26 +0000 (15:11 +0100)]
Implement multi-queues
Splits XENVBD_RING into multiple XENVBD_BLKIF_RINGs, one for each shared
ring. Up-to "multi-queue-max-queues" rings are used to pass
blkif_requests and blkif_responses between frontend and backend. Reworks
the ring interactions to remove the locks used by XENVBD_QUEUE,
implementing a queue system similar to XenVifs transmitter queues.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
s/Packets/SRBs in ring.c
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Mon, 4 Jun 2018 13:57:43 +0000 (14:57 +0100)]
Force single queue in crash/hibernate path
Remove "multi-queue-num-queues" from frontend area when the crashump
driver is active. This will force the backend to use a single queue,
which the crashdump frontend is supplying
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Add whitespace after cast for consistency with other code.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Mon, 4 Jun 2018 13:47:15 +0000 (14:47 +0100)]
Dont close ring during FrontendReset
Closing the ring (and destroying the shared pages, etc) is not required
when a HwStorResetBus or SRB_FUNCTION_RESET_DEVICE is triggered.
Disabling the ring will cause any outstanding blkif_requests and SRBs to
be failed.
Owen Smith [Mon, 4 Jun 2018 13:42:23 +0000 (14:42 +0100)]
Remove RingTrigger
RingTrigger calls XENBUS_EVTCHN(Trigger..) on during the suspend
callback. Just before this, the ring is recreated and enabled, which
also calls XENBUS_EVTCHN(Trigger..). The explicit call RingTrigger is
unneccessary
Owen Smith [Mon, 4 Jun 2018 13:36:21 +0000 (14:36 +0100)]
Query "multi-queue-max-queues"
Query "multi-queue-max-queues", and override if neccessary, and work
out a suitable value for the number of queues used. Also adds the
commented out writing code to set "multi-queue-num-queues"
Paul Durrant [Mon, 4 Jun 2018 13:19:52 +0000 (14:19 +0100)]
Update Xen header files
The original patch from Owen contained too much whitespace damage. This
patch replaces it with the results of an invocation of get_xen_headers.py
and includes all resulting header updates.
NOTE: This patch adds one extra header (physdev.h) and a fix to remove
now-duplicate definitions of DOMID_INVALID from frontend.c.
Suggested-by: Owen Smith <owen.smith@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Tue, 10 Apr 2018 10:51:18 +0000 (11:51 +0100)]
Abort any partially completed SRBs
When disabling, a SRB may be comprised of requests that have been
prepared and completed. Set the SrbStatus as early as possible, so
that any partially completed SRB is always failed.
Owen Smith [Tue, 10 Apr 2018 10:48:38 +0000 (11:48 +0100)]
Remove FreshSrbs queue
Any SRB that fails to prepare will be completed with SRB_STATUS_BUSY,
so that storport can manage the queue. SRBs are prepared in the StartIo
handler, and appended to the PreparedReqs queue. All possible PreparedReqs
are submitted, and if there is more PreparedReqs, the DPC is triggered to
submit more when possible.
Owen Smith [Tue, 13 Feb 2018 16:58:28 +0000 (16:58 +0000)]
Use a threaded DPC for ring processing
The current DPC back-off mechanism uses a timer to reschedule long
running DPCs. Timers seem to always have a minimum tick period, so
rescheduled DPCs always have an additional level of latency.
Using a threaded DPC (which executes at PASSIVE_LEVEL) will allow
other drivers to interrupt the vbd DPC, and maintain system
responsiveness. Once the vbd's DPC can be pre-empted by another
DPC, a DPC back-off mechanism is no longer required.
Paul Durrant [Thu, 25 Jan 2018 13:37:22 +0000 (13:37 +0000)]
Make sure Mdl->StartVa is set by __AllocatePages()
wdm.h carries this comment:
// Notice that while in the context of the subject thread, the base virtual
// address of a buffer mapped by an MDL may be referenced using the
// following:
//
// Mdl->StartVa | Mdl->ByteOffset
//
Hence it is important that a mapped MDL has a valid StartVa field as well
as a valid MappedSystemVa field. Unfortunately, for reasons best known to
Microsoft, MmMapLockedPagesSpecifyCache() does not ensure this is the
case, so it needs to be fixed up by __AllocatePages() itself.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Mon, 23 Oct 2017 16:22:38 +0000 (17:22 +0100)]
Remove RingReQueueRequests
Since the Frontend transitions the ring through Closed and back to
Enabled on a resume from suspend, which aborts all queued and
prepared requests, de-queueing and re-queuing all requests from the
Target suspend callback is unneccessary. Also removes the Target
suspend callback and interface which is now unused.
Paul Durrant [Mon, 2 Oct 2017 10:07:45 +0000 (11:07 +0100)]
Fix BSOD on shutdown due to freeing NULL pointer
Unfortunately ExFreePoolWithTag() is not NULL-safe, and Base64Free() is
coded assuming it is. This patch adds an extra check for NULL to fix the
problem.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Fri, 29 Sep 2017 16:23:31 +0000 (17:23 +0100)]
Use CACHE interface for REQUESTs, SEGMENTs and INDIRECTs
Replaces the ExLookasideList* calls with CACHE interface.
This allows the INDIRECT object constructor to allocate the indirect
page, and not allocate the indirect page on every object retrieval.
Also defers the indirect page free until the object is destroyed, and
not on every object put.
Owen Smith [Fri, 29 Sep 2017 15:58:13 +0000 (16:58 +0100)]
Add override for MaxRingPageOrder
Allow a Windows admin to limit the ring page count without
affecting any other VMs on the host. The override can be used
to restrict blkback's generous 16 page ring to a smaller value.
Owen Smith [Fri, 29 Sep 2017 15:45:27 +0000 (16:45 +0100)]
Use CACHE interface for bounce buffers
When the incomming read/write buffer is not aligned, the data needs
bouncing via a page-aligned buffer. Use the CACHE interface to provide
this buffer instead of a custom lookaside collection. Also moves the
storage for mapping the source buffers to the bounce buffer structure,
which will reduce the allocation size of the segments used on the fast
path (aligned buffers). The Indirect pages are also pre-allocated in the
object construction, and not for every object retrieval, and likewise
indirect pages are only freed on object destruction, and not for every
object put.
This will remove the log spam caused by the custom implementation.
Owen Smith [Fri, 22 Sep 2017 12:18:34 +0000 (13:18 +0100)]
Move Inquiry handling into target.c
Removes pdoinquiry.[ch] and its opaque inquiry pointer in favour of
making target.c the location of all target SRB handlers.
Removes a XenServer specific extension to the page83 inquiry data.
Owen Smith [Mon, 21 Aug 2017 14:37:00 +0000 (15:37 +0100)]
Add DPC timeout check
Check the DPC has not exceeded 1/2 its alloted time every
* 1/4 ring of responses processed
* all outstanding prepared requests submitted
* 1 queued SRB prepared
Eric Mackay [Wed, 28 Jun 2017 13:54:56 +0000 (14:54 +0100)]
Added a function called from DriverEntry where...
we can safely read registry keys and convert strings at PASSIVE_LEVEL
The MSDN documentation for various registry key access and string
conversion functions requires the caller to be at PASSIVE_LEVEL.
One of the reasons for this is that the string conversion tables used by
functions such as RtlAnsiStringToUnicodeString are stored in paged pool
memory. Both the page fault handler and the process scheduler run at
DISPATCH_LEVEL, therefore you must not touch memory at DISPATCH_LEVEL that
could be paged out. A process running at DISPATCH_LEVEL cannot be
preempted for the page fault handler to run.
To aid Static Driver Verifier code analysis and inform developers, I have
added SAL annotations that indicate PASSIVE_LEVEL is required on certain
registry access functions.
Signed-off-by: Eric Mackay <mackayem@amazon.com>
Re-based onto master and adjusted for style. Note this involved fixing
whitespace issues in frontend.c.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
v2:
- Remove unused FeatureName array
- Make more use of const qualifier for override name
Paul Durrant [Wed, 28 Jun 2017 15:32:45 +0000 (16:32 +0100)]
Make sure AdapterInitialize is called at PASSIVE_LEVEL
AdapterHwFindAdapter() can be called at DISPATCH_LEVEL and therefore it
is not safe to call AdapterInitialize() directly from there.
This patch, instead, calls StorPortEnablePassiveInitialization() from
AdapterHwInitialize() to request a PASSIVE_LEVEL initialization routine
is called immediately after AdapterHwFindAdapter() so that
AdapterInitialize() can be called from there.
Suggested-by: Owen Smith <owen.smith@citrix.com> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 8 May 2017 16:09:02 +0000 (17:09 +0100)]
Reboot request keys should be volatile
When a driver makes a reboot request it should use a volatile registry key.
The monitor service will explicitly remove the key prior to reboot but,
if the reboot is initiated in some other way and the key is non-volatile,
the monitor service will then needlessly prompt for a second reboot.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>