Owen Smith [Mon, 28 Oct 2024 13:02:16 +0000 (13:02 +0000)]
Add DeviceID info to DisplayNames
Append the DeviceID info to the DisplayName strings.
Use the current genfiles.ps1 script, as part of the version project, to
populate the %XenBusName_VEND% value (or remove the lines) to match the
binding hardware ID.
Owen Smith [Fri, 5 Jul 2024 07:32:21 +0000 (08:32 +0100)]
Add safety check to Unplug
If a driver does not have the matching Enum keys to trigger an AddDevice,
then block the Unplug of any emulated device.
When a boot start driver is removed, the Enum keys are removed, but the
driver is not unloaded. Since there is no CoInstaller to remove the Unplug
value, there needs to be a check to prevent emulated devices from being
unplugged while there are no driver bindings for the child device, so the
child device will not create the PV device.
Owen Smith [Fri, 5 Jul 2024 07:32:20 +0000 (08:32 +0100)]
Use MmGetSystemRoutineAddress to test for IoOpenDriverRegistryKey
Server 2016 does not define the function IoOpenDriverRegistryKey, use
MmGetSystemRoutineAddress to dynamically find the function so that a
single binary can be used on Server 2016 (and Win10-1607) and Server 2025.
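The resolve-by-name-then-fall-back pattern described above can be sketched in portable C. This is a user-mode stand-in only: the export tables, resolve() and both open_key_* functions are hypothetical, and the real driver would call MmGetSystemRoutineAddress instead of searching a table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature standing in for an optional OS routine. */
typedef int (*open_key_fn)(const char *name);

/* Hypothetical implementations; the return value only identifies which ran. */
static int open_key_new(const char *name)    { (void)name; return 1; }
static int open_key_legacy(const char *name) { (void)name; return 2; }

struct export {
    const char  *name;
    open_key_fn  fn;
};

/* Hypothetical export tables for an OS with and without the routine. */
static const struct export NEWER_OS[] = {
    { "IoOpenDriverRegistryKey", open_key_new }
};
static const struct export OLDER_OS[] = {
    { "SomeOtherRoutine", open_key_legacy }
};

/* Stand-in for a by-name lookup: NULL means the routine is not exported. */
static open_key_fn resolve(const struct export *table, size_t count,
                           const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (strcmp(table[i].name, name) == 0)
            return table[i].fn;
    }
    return NULL;
}

/* Prefer the newer routine when present, otherwise use the legacy path. */
static int open_parameters_key(const struct export *table, size_t count)
{
    open_key_fn fn = resolve(table, count, "IoOpenDriverRegistryKey");

    if (fn == NULL)
        fn = open_key_legacy;

    return fn("Parameters");
}
```

Because the decision is made at run time, the same binary takes the newer path where the routine exists and the legacy path where it does not.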
Owen Smith [Mon, 1 Jul 2024 10:32:41 +0000 (11:32 +0100)]
Conditionally use IoOpenDriverRegistryKey
IoOpenDriverRegistryKey is not available in Server 2016 and Windows 10 before 1803.
Use a conditional to modify the RegistryOpenParametersKey function to use the
correct API to open the parameters key.
Set '#define VERIFIER_REG_ISOLATION' when compiling for Server 2025, and do not
include this definition when compiling to include support for Server 2016.
Owen Smith [Fri, 7 Jun 2024 07:17:00 +0000 (08:17 +0100)]
Add UNPLUG v3
Adds UnplugBootEmulated, which reports if the SystemStartOptions indicates
the boot disk should remain emulated (i.e. not unplugged)
Adds UnplugReboot, which can be used to issue the reboot request, so that
child drivers can abide by Driver Verifier's registry isolation rules
Owen Smith [Fri, 7 Jun 2024 07:16:59 +0000 (08:16 +0100)]
Move Registry operations to xen.sys
Driver Verifier's registry isolation rules are not applied to xen.sys during
WHQL testing. Move remaining operations that do not comply to xen.sys.
Operations related to Active Device, Reboot Requests and SystemStartOptions
are exposed by xen.sys to allow xenbus.sys and other drivers to successfully
execute with Driver Verifier enabled.
Owen Smith [Fri, 7 Jun 2024 07:16:58 +0000 (08:16 +0100)]
Remove FdoSetFriendlyName
RegistryOpenHardwareKey uses an absolute path to open the parent key
of IoOpenDeviceRegistryKey. Since this is a violation of the registry
isolation rules in Driver Verifier, it's not possible to open the correct
registry key to set the device instance's friendly name.
Owen Smith [Fri, 7 Jun 2024 07:16:57 +0000 (08:16 +0100)]
Move FiltersInstall/FiltersUninstall to xen.sys
Since WHQL will enable Driver Verifier's registry isolation violation
detection on xenbus.sys, move the registry manipulation for inserting
xenfilt.sys into the appropriate device class UpperFilters to xen.sys.
Owen Smith [Fri, 7 Jun 2024 07:16:56 +0000 (08:16 +0100)]
Add RegistryOpenParametersKey
Use IoOpenDriverRegistryKey to avoid opening an absolute registry path.
Driver Verifier can detect registry isolation violations when running WHQL
tests on Server 2025. The rule states that a driver may not open an absolute
registry key path. Use the specific API to open the 'Parameters' key with
KEY_READ when querying settings.
Signed-off-by: Owen Smith <owen.smith@cloud.com>
Cosmetic fix-up.
Owen Smith [Mon, 20 May 2024 09:15:20 +0000 (10:15 +0100)]
Re-work Cache->Cursor handling in CacheDestroySlab()
Since the advent of lazy slab spilling in commit 7f8b622668fb ("Improve the
performance of the slab allocator"), if the cursor slab is being destroyed,
it no longer means it is the only slab in the list. Remove the ASSERTions
and simply set the new cursor to the current cursor's Flink. This will
either be the next slab which (because slabs are kept in decreasing order
of occupancy) will also be empty, or it will be the list anchor (which
indicates that all slabs in the list are full). Call CacheAudit() after
slab removal to re-verify these invariants.
Also clean up a typo in a commit comment in CacheAudit().
Signed-off-by: Owen Smith <owen.smith@cloud.com>
[Re-worked original patch]
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
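A minimal sketch of the cursor update described above, assuming a circular doubly-linked slab list with an anchor entry. The type and function names are hypothetical and the real allocator's bookkeeping is omitted; only the "move the cursor to Flink before unlinking" step is shown.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list_entry {
    struct list_entry *Flink;
    struct list_entry *Blink;
} list_entry;

typedef struct slab {
    list_entry Entry;
    unsigned   Allocated;   /* number of objects currently in use */
} slab;

typedef struct cache {
    list_entry  Anchor;     /* list head; not a slab */
    list_entry *Cursor;
} cache;

static void cache_init(cache *c)
{
    c->Anchor.Flink = c->Anchor.Blink = &c->Anchor;
    c->Cursor = &c->Anchor;
}

/* Append at the tail; callers keep slabs in decreasing order of occupancy. */
static void cache_append_slab(cache *c, slab *s, unsigned allocated)
{
    s->Allocated = allocated;
    s->Entry.Flink = &c->Anchor;
    s->Entry.Blink = c->Anchor.Blink;
    c->Anchor.Blink->Flink = &s->Entry;
    c->Anchor.Blink = &s->Entry;
}

/* Destroy an empty slab, moving the cursor on if it pointed at that slab. */
static void cache_destroy_slab(cache *c, slab *s)
{
    assert(s->Allocated == 0);

    if (c->Cursor == &s->Entry)
        c->Cursor = s->Entry.Flink;   /* next slab, or the anchor */

    s->Entry.Blink->Flink = s->Entry.Flink;
    s->Entry.Flink->Blink = s->Entry.Blink;
}
```

With one full slab followed by two empty ones, destroying the first empty slab leaves the cursor on the second; destroying that one leaves the cursor on the anchor.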
Owen Smith [Wed, 10 Jan 2024 13:42:22 +0000 (13:42 +0000)]
ASSERT(Frame->Mdl != NULL) before dereference
CodeAnalysis detects a false positive, where Frame->Mdl could be NULL in
GnttabContract. Without asserting a non-NULL pointer, SDV will generate a
DVL log file that will fail WHQL testing.
Martin Harvey [Wed, 10 Jan 2024 13:42:21 +0000 (13:42 +0000)]
Asynchronous power handling.
Replace a static, single thread for System and Device power transitions with
an IO_WORKITEM for each transition. IO_WORKITEMs are only used when a transition
requires calling codepaths that must be called at PASSIVE_LEVEL.
Move the bulk of power transitions to pending IRPs and IRP completion routines.
Signed-off-by: Martin Harvey <martin.harvey@citrix.com>
Refactored to only apply to XenBus power
Signed-off-by: Owen Smith <owen.smith@cloud.com>
Cleaned up out-of-style comments.
Martin Harvey [Wed, 10 Jan 2024 13:42:20 +0000 (13:42 +0000)]
Asynchronous power handling
XenFilt requires minimal IRP_MN_SET_POWER/IRP_MN_QUERY_POWER interactions.
No IoWorkItems are required as operations perform no significant work.
Power handlers are limited to tracking state changes and calling PoSetPowerState.
Signed-off-by: Martin Harvey <martin.harvey@citrix.com>
Refactored and limited to XenFilt changes
Owen Smith [Mon, 27 Nov 2023 11:16:11 +0000 (11:16 +0000)]
Windows 0xEF Bugcheck Handler
Adds a bugcheck handler for 0xEF (CRITICAL_PROCESS_DIED) which dumps the
process image file name (if available)
Adds ProcessGetImageFileName() to get the image file name, which relies
on calling MmGetSystemRoutineAddress("PsGetProcessImageFileName")
Suggested-by: Rabish Kumar <rabish.kumar@citrix.com>
Signed-off-by: Owen Smith <owen.smith@cloud.com>
Owen Smith [Thu, 16 Nov 2023 15:05:26 +0000 (15:05 +0000)]
Fix CodeQL issue
ProcNumber should be initialized before calling KeGetCurrentProcessorNumberEx
in case the call fails and doesn't populate the PROCESSOR_NUMBER. This function
should never fail but the annotations in Windows headers don't state this
correctly.
Owen Smith [Wed, 18 Oct 2023 09:14:32 +0000 (10:14 +0100)]
Reference XenBus_Unplug section
9daf0cc470c failed to add the XenBus_Unplug section to AddReg, which
would result in a XenBus upgrade failing to clear the unplug keys, leading
to a 0x7B bugcheck. Add the AddReg reference to correct this issue.
Owen Smith [Wed, 13 Sep 2023 13:45:17 +0000 (14:45 +0100)]
Remove CoInstaller from INF
Windows 11 22H2 WHQL requires INF files pass "InfVerif /k", which highlights
several issues
- PnpLockdown=1 needs to be specified
- CoInstallers are no longer allowed
The CoInstaller has several functions that will need alternative solutions:
- The AllowUpdate mechanism is no longer possible
- The ability to block upgrades from installing on a different vendor's drivers
- The safety checks that ensure interface versionings remain compatible
- The cleanup of xenbus_monitor on uninstall.
Interface safety checks need to be handled by changes to child device bindings,
and assuming upgrade via emulated devices is safe. The unplug keys are cleared
in the INF to revert to emulated on the next boot, in case the current child
drivers rely on an interface that is no longer present (note: in this case,
child drivers will need updating).
Owen Smith [Wed, 13 Sep 2023 13:45:16 +0000 (14:45 +0100)]
Add Unplug v2 interface (REV_0900000A)
Unplug v2 adds a query call to determine if a device type has had its unplug issued.
This is useful during upgrade cases when the Unplug keys have been set to 0, and can
be used to prevent XenVif from starting whilst emulated devices are present, but those
emulated devices have not been assigned a valid configuration yet (emulated devices
will receive valid configuration, but not at this point in the startup sequence during
upgrade)
Signed-off-by: Owen Smith <owen.smith@cloud.com>
Renamed UnplugWasRequested() to UnplugIsRequested() in the XENBUS_UNPLUG_INTERFACE_V2
by renaming the clashing XEN_API function from UnplugIsRequested() to UnplugGetRequest()
Owen Smith [Wed, 13 Sep 2023 13:45:15 +0000 (14:45 +0100)]
Remove REV from DeviceID
Driver upgrades use HardwareIDs (or CompatibleIDs) to match the child INF DDInstall
section (stored as matching device id), but use the DeviceID to generate the device
instance path. By keeping the device instance path the same over upgrades, the network
stack should identify this as an upgrade, rather than 'replacement hardware', and
not generate a new network connection, which would require network settings to be
copied from the existing network connection to the new network connection.
Note: Adds a strict requirement on child INF DDInstall sections, to specify the full
hardware ID (including revision) to guarantee interface versions are correctly supported
Owen Smith [Wed, 13 Sep 2023 13:45:14 +0000 (14:45 +0100)]
Fix Length calculation in PdoQueryId
Decrease Length by the string length of the current ID before moving
the Buffer value to the end of the current ID. Without this, Length
is not decreased, leading to potential issues with the next call to
RtlStringCbPrintfW.
Note: the second chunk is to maintain consistent ordering of operations
for clarity, and has no functional change.
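The bookkeeping described above can be illustrated with a narrow-character sketch. It uses snprintf in place of RtlStringCbPrintfW, and build_multi_sz() is a hypothetical helper rather than the PdoQueryId code; the point is that the remaining length shrinks by each ID's size before the write pointer moves past it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Write each ID as a NUL-terminated string, followed by one extra NUL to
 * terminate the list. Returns the number of bytes used, or 0 if the
 * buffer is too small.
 */
static size_t build_multi_sz(char *buffer, size_t length,
                             const char *const *ids, size_t count)
{
    char   *cursor = buffer;
    size_t  remaining = length;

    for (size_t i = 0; i < count; i++) {
        int written = snprintf(cursor, remaining, "%s", ids[i]);

        if (written < 0 || (size_t)written >= remaining)
            return 0;                       /* would not fit */

        size_t step = (size_t)written + 1;  /* string plus its NUL */

        remaining -= step;                  /* shrink the space left first */
        cursor += step;                     /* then move past this ID */
    }

    if (remaining < 1)
        return 0;

    *cursor = '\0';                         /* list terminator */
    return (size_t)(cursor - buffer) + 1;
}
```

If remaining were left unchanged, a later snprintf call would be told it has more room than is really left after cursor, which is the kind of problem the fix avoids.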
Owen Smith [Wed, 13 Sep 2023 13:45:13 +0000 (14:45 +0100)]
Reset StorNvme's StartOverride
When StorNvme does not enumerate any devices during boot start, it sets
the StartOverride value, which means that StorNvme does not start during
the next boot.
This can be an issue when attempting to revert to emulated boot, such as
during a driver upgrade, as StorNvme will not be loaded and not take control
of the boot disk, leading to a 0x7B bugcheck.
Paul Durrant [Mon, 10 Jul 2023 12:21:31 +0000 (13:21 +0100)]
Don't rely on contiguous grant table
If the size of the grant table is large then it may not be possible to
get sufficient contiguous memory to cover it. However, there's no
actual need for the entire grant table to be physically contiguous, or
even virtually contiguous, so allocate space from the hole as the table
is expanded rather than doing a single up-front allocation.
Paul Durrant [Wed, 12 Jul 2023 10:50:00 +0000 (11:50 +0100)]
Get rid of the single contiguous memory hole
A standard grant table with 64 pages only uses half the allocated 2 MiB of
space and, empirically, getting more than 2 MiB of physically contiguous
memory doesn't seem to work very often.
By changing FdoAllocateHole() and FdoFreeHole() to use an MDL we can do
discrete contiguous allocations on-demand, which is all that is actually
required by callers. It's straightforward to adapt __AllocatePages() for
this purpose by having it pass the MM_ALLOCATE_REQUIRE_CONTIGUOUS_CHUNKS
flag to MmAllocatePagesForMdlEx().
NOTE: This means that the generic XENBUS_HOLE becomes XENBUS_PCI_HOLE.
Also re-work the function naming so FdoAllocateHole() becomes
FdoHoleAllocate() and other names are similarly restructured. It
makes the code marginally neater.
Take the opportunity to also put zeroing of FrameIndex in the
XENBUS_GNTTAB Context in the right place in the GnttabAcquire()
error path, and adjust GnttabRelease() accordingly.
Paul Durrant [Wed, 12 Jul 2023 10:05:39 +0000 (11:05 +0100)]
Add support for XENMEM_remove_from_physmap...
... and make use of it to remove shared_info and grant table pages from the
P2M when we're closing down. This makes sure we don't leave such pages lying
around in the Xen platform PCI device's BAR.
NOTE: Now that we're making GnttabUnmap() actually do something, tidy up the
implementation of GnttabMap() so it is aligned.
Paul Durrant [Fri, 7 Jul 2023 15:28:27 +0000 (16:28 +0100)]
Add more comments and ASSERTions in the cache allocator
It was clear from the thread at [1] that there is some confusion (including
my own) over the semantics of the slab list. Add some more comments and
ASSERTions to better explain.
Also turn a silly secondary 'if' into the 'else' clause it should have
always been.
Paul Durrant [Tue, 4 Jul 2023 17:26:41 +0000 (18:26 +0100)]
Avoid unnecessary check for non-NULL Processor->Interrupt in EvtchnRelease()
If EvtchnIsProcessorEnabled() is TRUE then Processor->Interrupt should be
valid. Hence use an ASSERTion instead. Also replicate the check of
EvtchnIsProcessorEnabled() in the error path in EvtchnAcquire().
While we're at it, let's also use EvtchnIsProcessorEnabled() in
EvtchnInterruptEnable() and EvtchnInterruptDisable().
Owen Smith [Tue, 20 Jun 2023 14:33:19 +0000 (15:33 +0100)]
Only call EvtchnFlush on valid Cpus
The Evtchn processor array is created using KeQueryMaximumProcessorCountEx, which
can include processors that do not get initialized.
Skip cleanup and flushing of uninitialized event channels.
Signed-off-by: Owen Smith <owen.smith@cloud.com>
Use EvtchnIsProcessorEnabled() rather than SystemProcessorVcpuId() as the test.
Owen Smith [Thu, 25 May 2023 15:14:09 +0000 (16:14 +0100)]
Fix buffer overrun when suspending VMs with many vCPUs
Dynamically allocate the KDPC array. __Section is defined as a PAGE_SIZE
region, which can only contain a limited number of KDPC objects in addition
to the SYNC_CONTEXT header. Dynamic allocation of the KDPC objects will
remove this restriction.
Signed-off-by: Owen Smith <owen.smith@cloud.com>
Always dynamically allocate KDPC objects, allowing us to get rid of
__Section and define SyncContext as a simple static global. Hence amend
the original commit comment.
Also use __AllocatePoolWithTag() and __FreePoolWithTag() rather than
open-coding them.
Owen Smith [Tue, 18 Apr 2023 08:50:45 +0000 (09:50 +0100)]
Rebuild CodeQL builds
CodeQL can sometimes fail to detect any source code if the codebase is
not rebuilt. Use the Rebuild target to force all intermediate build artifacts
to be cleaned beforehand.
Owen Smith [Thu, 16 Feb 2023 11:55:45 +0000 (11:55 +0000)]
Skip uninitialized CPUs
EvtchnFifoAcquire() will loop through all CPUs to call EVTCHNOP_init_control.
Skip any CPUs that are not initialized, which is indicated by
SystemProcessorVcpuId() failing, instead of failing the Acquire operation.
This is primarily an issue when KeQueryMaximumProcessorCountEx() returns
a different value to KeQueryActiveProcessorCountEx(), or the system
processor callback has not been called with KeProcessorAddCompleteNotify
for that CPU.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Fix up error path that also calls SystemProcessorVcpuId().
Paul Durrant [Tue, 14 Feb 2023 17:36:48 +0000 (17:36 +0000)]
Avoid race when checking for an active transaction
The code in StoreTransactionEnd() checks, under the protection of Context->Lock,
that the transaction is active. It then drops the lock and calls
StorePrepareRequest(), which also tests whether the transaction is active but
*not* under the protection of the lock. If the domain is suspended and resumed
in between the two checks then this will cause StorePrepareRequest() to fail
for non-NULL transactions.
This patch makes sure that Context->Lock is held across all calls to
StorePrepareRequest(), along with any prior tests for whether transactions or
watches are active, and drops the internal acquisition of the lock (which was
done to protect the increment of Context->RequestId).
Reported-by: Owen Smith <owen.smith@citrix.com>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
Martin Harvey [Mon, 14 Nov 2022 11:32:26 +0000 (11:32 +0000)]
Correct return codes during racy destruction.
Errors in PnP return codes found when testing under driver
verifier with mixed VM lifecycle operations. Under some
rare cases, it is possible to get more than one PnP
"remove-like" operation. This results in a PnP remove
operation being processed whilst the device is already
in the deleted state.
This patch fixes the immediate cause of the bugchecks,
by fixing the return code. Device destruction is
unchanged. Investigation into the root cause is still
ongoing.
Signed-off-by: Martin Harvey <martin.harvey@citrix.com>
Cosmetic fixes.
Owen Smith [Fri, 18 Nov 2022 10:06:10 +0000 (10:06 +0000)]
Pass SignMode to MSBuild
Allows overriding of SignMode to "Off" to prevent signing binaries with the PFX
file. This is useful if wrapper builds sign binaries with alternative signatures
or when signing is not required.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Small whitespace fix.
Owen Smith [Wed, 12 Oct 2022 10:21:34 +0000 (11:21 +0100)]
Add build options for EWDK 22621
VisualStudioVersion = 17.0 maps to Visual Studio 2022
* Adds project files for vs2022
* Adds mapping from VisualStudioVersion 17.0 to "vs2022" project folder
* Adds mapping from VisualStudioVersion 17.0 to "Windows 10" build target
* Adds guard to build.ps1 - EWDK 22621 does not build x86 binaries
* Adds include directive where compiler intrinsics are used
Paul Durrant [Wed, 31 Aug 2022 15:14:27 +0000 (16:14 +0100)]
Lazily construct slab objects
To avoid a large overhead in both time and potentially space when a new slab
is created, only construct objects as they are allocated. When they are freed
we keep them constructed to increase the chance of finding an already-
constructed object during subsequent allocations.
Paul Durrant [Tue, 30 Aug 2022 08:45:40 +0000 (09:45 +0100)]
Add a new XENBUS_CACHE_MASK abstraction
This abstracts away the current array along with the size of the mask. This
slightly shortens the slab pre-amble, potentially allowing more objects per
slab.
The __CacheMaskScan() is also dropped in favour of a simple loop implemented
directly in CacheGetObjectFromSlab(). This is done to simplify subsequent
patches.
Paul Durrant [Wed, 31 Aug 2022 13:11:36 +0000 (14:11 +0100)]
Add an explicit type parameter to the P2ROUNDUP() macro
Because it uses signed logic internally it is currently quite vulnerable to
mismatched argument types leading to weird evaluations. Therefore it's safer
to give it an explicit type parameter and have it cast its other arguments to
that type.
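As an illustration of why an explicit type parameter helps, the sketch below compares an untyped power-of-two round-up with a typed one. Both macro bodies are assumptions written for this example, not copies of the driver's definition, and the example shows a width mismatch as one way mixed argument types can go wrong.

```c
#include <assert.h>
#include <stdint.h>

/* Untyped form: each argument is negated in its own type. */
#define P2ROUNDUP_UNTYPED(x, a) \
    (-(-(x) & -(a)))

/* Typed form: both arguments are cast to one type before any arithmetic. */
#define P2ROUNDUP(t, x, a) \
    (-(-((t)(x)) & -((t)(a))))

/* Round a 64-bit value up to a 32-bit alignment using the typed macro. */
static uint64_t round_up_typed(uint64_t value, uint32_t alignment)
{
    return P2ROUNDUP(uint64_t, value, alignment);
}

/* Same operation with the untyped macro; the alignment is negated as 32-bit. */
static uint64_t round_up_untyped(uint64_t value, uint32_t alignment)
{
    return P2ROUNDUP_UNTYPED(value, alignment);
}
```

In the untyped version the negated 32-bit alignment is zero-extended before the AND, so the upper bits of the mask are lost and values above 4 GiB round to the wrong result. The typed version widens the alignment first and so keeps the full mask.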
Owen Smith [Thu, 5 May 2022 07:02:45 +0000 (08:02 +0100)]
Fix compiler options
Adds '/ZH:SHA_256' '/CETCOMPAT' '/sdl' to compiler and '/SafeSEH' to x86 linker
command lines
These changes were prompted by binskim https://github.com/microsoft/binskim
Note: Rule BA2004 (Warning_NativeWithInsecureStaticLibraryCompilands) is still
reported for xenbus_coinst.dll and xenbus_monitor.exe
Rule BA2007 (Error_WarningsDisabled) is still reported for all drivers
Rule BA2018 (Error, SSE table is empty) is still reported for x86 xen.sys
Paul Durrant [Fri, 6 May 2022 11:03:58 +0000 (12:03 +0100)]
Remove the 'Success' field from SUSPEND_CONTEXT
Now that there are dedicated SyncRunEarly() and SyncRunLate() functions there
is no need for this value; we can simply make the function invocations
contingent on the success of the hypercall (which tells us whether we are
doing fast-resume or not).
Paul Durrant [Thu, 5 May 2022 18:12:40 +0000 (19:12 +0100)]
Separate running the 'late' SYNC_CALLBACKs from exiting the DPC
This patch introduces a new dedicated request to ensure that *all* callbacks
have been completed before *any* CPU exits the DPC, thereby allowing threads
to be scheduled or other DPCs to run.
Paul Durrant [Thu, 5 May 2022 17:29:17 +0000 (18:29 +0100)]
Remove the SYNC_PROCESSOR structure
A previous commit left this structure with only a single remaining field:
the KDPC structure. This patch simply replaces the SYNC_PROCESSOR array in
SYNC_CONTEXT with a KDPC array. The now-unused 'Processor' pointer in
SyncWorker() is also cleaned up.
NOTE: There is a little re-formatting done in the definition of SYNC_CONTEXT:
The field names were excessively indented.
Paul Durrant [Thu, 5 May 2022 17:19:17 +0000 (18:19 +0100)]
Move 'Request' from SYNC_PROCESSOR to SYNC_CONTEXT
By keeping a local 'Request' value on stack in SyncWorker() to track the last
completed request, we can avoid the need to initiate operations using a per-
processor value and simply use a global one. This means we no longer need
the loops iterating over all SYNC_PROCESSORs in SyncDisableInterrupts(),
SyncEnableInterrupts() and SyncRelease().
Paul Durrant [Thu, 5 May 2022 14:56:21 +0000 (15:56 +0100)]
Reduce code duplication
Introduce helper functions for disabling/enabling interrupts and waiting for
completion. The functions are then used in place of the current open-coding of
these operations.
NOTE: To avoid compiler/prefast noise, some warnings are disabled. The static
analysis can't cope with the IRQL manipulation.
Owen Smith [Mon, 28 Feb 2022 11:47:01 +0000 (11:47 +0000)]
All items in SYSTEM_PROCESSOR array may not be initialized
The SYSTEM_PROCESSOR array is allocated to fit the maximum number of supported
CPUs, but elements are only initialized when the SystemProcessorChangeCallback
callback is called with KeProcessorAddCompleteNotify.
Check if the SYSTEM_PROCESSOR structure is initialized before accessing any
other members, and fail SystemProcessorVcpuId with STATUS_NOT_SUPPORTED for any
uninitialized CPUs
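A minimal sketch of the check described above, with hypothetical names and a plain integer standing in for STATUS_NOT_SUPPORTED. Slots are sized for the maximum CPU count but only marked initialized when an "add complete" notification arrives, so callers skip slots that report failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CPUS           4     /* hypothetical maximum processor count */
#define STATUS_OK          0
#define STATUS_UNSUPPORTED (-1)  /* stand-in for STATUS_NOT_SUPPORTED */

typedef struct {
    bool     initialized;
    unsigned vcpu_id;
} processor;

/* Sized for the maximum; zero-initialized, so no slot starts initialized. */
static processor processors[MAX_CPUS];

/* Models the callback being invoked with "add complete" for one CPU. */
static void processor_added(size_t cpu, unsigned vcpu_id)
{
    if (cpu < MAX_CPUS) {
        processors[cpu].vcpu_id = vcpu_id;
        processors[cpu].initialized = true;
    }
}

/* Check the initialized flag before touching any other member. */
static int processor_vcpu_id(size_t cpu, unsigned *vcpu_id)
{
    if (cpu >= MAX_CPUS || !processors[cpu].initialized)
        return STATUS_UNSUPPORTED;

    *vcpu_id = processors[cpu].vcpu_id;
    return STATUS_OK;
}

/* Callers skip CPUs that fail the query instead of failing outright. */
static size_t count_usable_cpus(void)
{
    size_t usable = 0;

    for (size_t cpu = 0; cpu < MAX_CPUS; cpu++) {
        unsigned vcpu_id;

        if (processor_vcpu_id(cpu, &vcpu_id) != STATUS_OK)
            continue;

        usable++;
    }

    return usable;
}
```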
Owen Smith [Mon, 7 Feb 2022 13:15:03 +0000 (13:15 +0000)]
SDV: RemoveLock rule violations
Calls to IoAcquireRemoveLock and IoReleaseRemoveLock should be paired within
the same dispatch entry point, unless the IoCompletionRoutine does some work.
Remove completion routines that are not required and call IoReleaseRemoveLock
after the IRP has been passed to IoCallDriver.
Owen Smith [Mon, 7 Feb 2022 13:15:02 +0000 (13:15 +0000)]
SDV: ZwRegistryOpen rule violations
Don't hold the ParametersKey open, SDV treats this as a mismatched
ZwRegistryOpen and ZwClose pair.
Open the registry key when required, and close it once it's no longer
required.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Remove DriverGetParametersKey() from xenfilt/driver.h and don't add the
implementation of DriverOpenParametersKey() in xenfilt/driver.c.
Paul Durrant [Thu, 18 Nov 2021 21:05:45 +0000 (21:05 +0000)]
The PV console may not always be available
In some Xen deployments the tool-stack may not allocate a PV console ring
and event channel to the guest, so XENBUS should deal with this situation
gracefully.
Paul Durrant [Tue, 16 Nov 2021 16:35:42 +0000 (16:35 +0000)]
Introduce an alternative hole type using the platform PCI device BAR
Using a memory hole burns 2M of RAM and is only helpful in the case where
the guest has pass-through devices causing Xen to make accesses to all PCI
BARs uncacheable. In the case where guest-visible devices are all emulated
this will not be the case and so we can save the 2M of RAM by using the
platform PCI device BAR as the hole.
This patch adds the necessary code to do that, defaulted off but enabled
by setting the XENBUS registry parameter DWORD:UseMemoryHole to 0.
Richard Turner [Fri, 8 Oct 2021 13:22:44 +0000 (09:22 -0400)]
xenfilt: Move list pointer to next entry when pdo is missing
The pointer to the list of fdo entries is never advanced
when the pdo is missing, causing a BSOD. When a device
is missing, advance the list pointer to the next entry.
Signed-off-by: Richard Turner <turnerr@ainfosec.com>
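The class of bug fixed above can be sketched with a simple singly-linked list. The node type and count_present() are hypothetical and much simpler than the xenfilt structures. In this sketch, a skip branch that did not advance the pointer would revisit the same entry on every iteration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    struct node *next;
    void        *pdo;    /* NULL models a missing PDO */
} node;

/* Count the entries that have a PDO, skipping those that do not. */
static size_t count_present(const node *head)
{
    size_t      present = 0;
    const node *entry = head;

    while (entry != NULL) {
        if (entry->pdo == NULL) {
            entry = entry->next;   /* advance before skipping */
            continue;
        }

        present++;
        entry = entry->next;
    }

    return present;
}
```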
Paul Durrant [Mon, 20 Sep 2021 08:26:29 +0000 (09:26 +0100)]
Fix issues raised by CodeQL (part 2)
Swap strtol() for strtoul() in emulated.c (since we're not interested in
negative values anyway) and then check the returned value *before* checking
the end pointer.
Reported-by: Owen Smith <owen.smith@citrix.com>
Signed-off-by: Paul Durrant <pdurrant@amazon.com>
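One way to order the checks described above is sketched below. parse_index() and its max parameter are hypothetical, and the exact checks in emulated.c may differ. The returned value is validated first, then the end pointer.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Parse a non-negative decimal number no larger than max.
 * Returns true and stores the value on success.
 */
static bool parse_index(const char *text, unsigned long max,
                        unsigned long *out)
{
    char          *end;
    unsigned long  value;

    errno = 0;
    value = strtoul(text, &end, 10);

    /* Check the returned value first: overflow or out of range. */
    if ((value == ULONG_MAX && errno == ERANGE) || value > max)
        return false;

    /* Then check the end pointer: no digits, or trailing characters. */
    if (end == text || *end != '\0')
        return false;

    *out = value;
    return true;
}
```

Note that strtoul() accepts a leading minus sign and returns the negated value as an unsigned number, so the range check is what rejects input such as "-1" here.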
Owen Smith [Tue, 10 Aug 2021 15:40:48 +0000 (16:40 +0100)]
Fix issues raised by CodeQL (part 1)
- ExAllocatePoolWithTag is deprecated in Windows 10 2004 and replaced with
ExAllocatePool2. Use ExAllocatePoolUninitialized to maintain support for
earlier versions of Windows.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Split up original patch.
Owen Smith [Tue, 10 Aug 2021 15:40:47 +0000 (16:40 +0100)]
Fix SDV/CodeQL log generation
- sarif files need to be stored with SDV logs when generating the DVL file
- Disable PREFast and CodeAnalysis by default
- Run a separate CodeAnalysis build after SDV, but before generating DVL file
DVL file should contain multiple summary lines for SDV, at least 1 line
for CodeAnalysis and at least 1 line for Semmle (CodeQL)
Paul Durrant [Mon, 6 Sep 2021 07:46:50 +0000 (08:46 +0100)]
Fix build with later WDKs:
- Adds alias for GetProjectInfoForReference target to version.vcxproj
Later kits seem to have renamed the build target, and will fail without
this alias target.
- Adds "/fd sha256" to signtool command line
WDK 20344 and later require binaries signed with a SHA256 file digest, or
the build outputs are deleted.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Re-worked from Owen's original patch:
- Squashes warnings 4061 and 26052.
- Casts XENBUS_STORE_PERMISSION_MASK to ULONG in switch statement to avoid
complaint about case using '|'.
Owen Smith [Mon, 19 Jul 2021 10:03:34 +0000 (11:03 +0100)]
Remove MINIMUM_OBJECT_SIZE
MINIMUM_OBJECT_SIZE would make all cached objects at least 0x80 bytes, which
would limit the number of objects in each slab to 31 objects.
This limitation is not needed, as the slab's mask is dynamically allocated to
cope with the correct number of objects that can fit into a single slab.
Cache object's sizes are rounded up to the nearest pointer boundary to maintain
object alignment. Removing the minimum size allows more objects per cache slab,
reducing the memory overhead of caches.
Martin Harvey [Thu, 15 Jul 2021 13:15:02 +0000 (14:15 +0100)]
Disable CONS debug logging
In some cases, third party antivirus products may send many
PnP query IRP's down the stack. This tends to fill the logfiles up
with unnecessary repeated lines, making debugging of other
failures difficult.
Previous loglevel was (INFO|WARNING|ERROR|CRITICAL)
Signed-Off-By: Martin Harvey <martin.harvey@citrix.com>
Martin Harvey [Thu, 15 Jul 2021 13:15:01 +0000 (14:15 +0100)]
Add logging for XenFilt AddDevice.
Recent releases of Windows (10 and 11 in particular) allow
online edition updates which involve a driver migration step,
which occurs in SAFEBOOT mode.
In some rare cases, this step may fail (for a variety of reasons).
This additional logging added to debug such upgrade cases.
Signed-Off-By: Martin Harvey <martin.harvey@citrix.com>
Martin Harvey [Thu, 15 Jul 2021 13:15:00 +0000 (14:15 +0100)]
Additional logging for module loading.
Recent releases of Windows (10 and 11 in particular) allow
online edition updates which involve a driver migration step,
which occurs in SAFEBOOT mode.
In some rare cases, this step may fail (for a variety of reasons).
This additional logging added to debug such upgrade cases.
Signed-Off-By: Martin Harvey <martin.harvey@citrix.com>
Owen Smith [Mon, 28 Jun 2021 12:58:39 +0000 (13:58 +0100)]
Add emulated NVMe to IsDiskPresent results
IsDiskPresent currently only reports the presence of emulated IDE disks. When
using emulated NVMe disks, it's possible to start booting off the emulated disk,
but have XenVbd 'take over' resulting in storage requests to the emulated NVMe
disk timing out and failing. This results in a Windows error on boot
"Status 0xc000000e. A required device isn't connected or can't be accessed"
Query the CompatibleIDs and, if present, add the last CompatibleID to emulated
objects of type PCI. When querying if a disk is present, also check for PCI
devices which match the CompatibleID "PCI\CC_0108". This will prevent XenVbd
enumerating a PV disk which has a matching emulated NVMe device.
Owen Smith [Mon, 28 Jun 2021 12:58:38 +0000 (13:58 +0100)]
Avoid potential race with FiltersInstall
In certain situations, a race between XENFILT and XENBUS can lead to XENFILT
not being loaded on the root PCI device node. This is due to XENBUS!DriverEntry
removing the registry value just before the PnP manager determines what filters
to load, and fails to load XENFILT on the root PCI node. This leads to XENBUS
being unable to determine the correct ActiveDevice. Without an ActiveDevice,
no Unplugs are issued, and emulated devices are used for boot, leading to a
reboot prompt before XENVBD can be used as the boot device. The race appears to
be reliable once triggered, and a reboot will follow the same sequence. This
appears to be caused by OS upgrades which affect the order the PnP manager
starts different driver stacks.
This contains a reversion to 9d28a9e9b79, which fixed an upgrade issue that
triggered multiple reboot requirements to reload XENFILT correctly.
If an incompatibility is detected, which can be resolved by a reboot to
complete the driver installation, XENFILT is inserted into the UpperFilters so
that XENFILT is loaded on this reboot. This avoids requiring a second reboot so
that XENFILT can load and determine the ActiveDevice.
Owen Smith [Thu, 17 Jun 2021 12:33:52 +0000 (13:33 +0100)]
Skip stale device config when checking child compatibility
When a device is updated, the Enum key for the old binding is not deleted.
This can lead to a device binding that is not in use (has been replaced by
a later binding) triggering the coinstaller to fail the upgrade to a newer
version. This is especially prevalent when the older stale information was
bound to a revision that is not present in the new driver INF file.
This fix ignores the stale entries under the Enum key when performing the
compatibility checks.
e.g.
tag 8.2.1 has 0x08000009 to 0x08000009 for its bindings
tag 9.0.0 has 0x08000009 to 0x09000007 for its bindings
commit a9631142d0be removed v8 revisions, leaving only 0x0900000x revisions
It should be possible to upgrade from tag 8.2.1 to tag 9.0.0 and then to
commits after a9631142d0be. At each stage of this upgrade, the revisions
overlap, even if the initial and end revisions do not have an overlap.
It is not possible to upgrade directly from tag 8.2.1 to commit a9631142d0be,
as there is no common revision that can be used.
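The upgrade-path reasoning in the example above amounts to an interval overlap test. In the sketch below the type and names are hypothetical, and the lower bound used for the post-a9631142d0be range (0x09000000) is an assumption, since the text only says "0x0900000x".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t min;
    uint32_t max;
} revision_range;

/* Ranges quoted in the example above. */
static const revision_range TAG_8_2_1 = { 0x08000009, 0x08000009 };
static const revision_range TAG_9_0_0 = { 0x08000009, 0x09000007 };

/* Assumed bounds: the text only states that v8 revisions were removed. */
static const revision_range AFTER_V8_REMOVAL = { 0x09000000, 0x09000007 };

/* Two inclusive ranges share a revision if neither ends before the other starts. */
static bool ranges_overlap(revision_range a, revision_range b)
{
    return a.min <= b.max && b.min <= a.max;
}
```

Each step of 8.2.1 to 9.0.0 to later commits has a shared revision, while the direct jump from 8.2.1 to the later commits does not.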
Owen Smith [Thu, 17 Jun 2021 12:33:51 +0000 (13:33 +0100)]
Clear unplug keys if Active device is not the Vendor device
When a VM has both Vendor device and the standard device, upgrades can be made
for XenBus on the inactive device. In this case, the driver binaries are
replaced but the coinstaller is not executed for the Active device, leading to
the unplug keys remaining. When the VM is rebooted to complete the driver
installation, both the Active and Inactive devices will use the new driver
binaries, but the Active device will require the child devices rebinding to the
potentially new hardware IDs exposed by the newer binary. This is not possible
during early boot, and the absence of an emulated disk and not being able to
enumerate the PV disk will result in a 0x7B bugcheck.
The Vendor device is designed to be the preferred device, but is not required
to be the active device (this is the case if the VM's configuration is changed
after the drivers have been installed).
It is possible to detect if the Active device is not the Vendor device during
the Active device coinstaller, and clear the Unplug keys to avoid the problem
where the VM will attempt to boot with unplugged emulated disks and PV disks
that require rebinding, which results in a 0x7B bugcheck.
Owen Smith [Wed, 24 Feb 2021 08:19:57 +0000 (08:19 +0000)]
Add CodeQL build stage
CodeQL logs will be required for future WHQL submissions. Add a stage
that generates the required SARIF files. CodeQL is a semantic code
analysis engine, which will highlight vulnerabilities that will need
fixing.
In order to use CodeQL, the CodeQL binaries must be on the path and the
Windows-Driver-Developer-Supplemental-Tools must be on the path defined
by the CODEQL_QUERY_SUITE environment variable (if defined), or under
the parent folder (if CODEQL_QUERY_SUITE variable is not defined)
Note: Due to the way the codeql command line is built, using quotes in a
MSBuild command line is not possible, so generate a batch file to wrap
the command line.
Paul Durrant [Mon, 22 Feb 2021 09:45:11 +0000 (09:45 +0000)]
Fix PDO revision
Commit 58760cc3dd94 ("Add XENBUS_SHARED_INFO method to check whether event
upcalls are supported") added a new version of the XENBUS_SHARED_INFO
interface but there was a typo in the line added into revisions.h and hence
the XENBUS PDO revision was left as 0x09000008 rather than being increased to
0x09000009. This patch rectifies the situation.