Paul Durrant [Thu, 10 Dec 2015 11:03:50 +0000 (11:03 +0000)]
Introduce __FreePoolWithTag()
Being able to interpose on memory allocation can be useful during
debugging. We already have __AllocatePoolWithTag() so this patch matches
it with __FreePoolWithTag().
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 10 Dec 2015 10:37:07 +0000 (10:37 +0000)]
Don't use C runtime versions of toupper() and tolower()
It seems that, despite their trivial functionality, the runtime
implementation insists on converting to Unicode! This means those functions
are actually only safe at PASSIVE_LEVEL.
This patch implements __toupper() and __tolower() as replacements with
no such hidden nastiness and modifies callers to use those.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
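For illustration, the kind of replacement described in the commit above can be sketched as follows. This is a minimal sketch, not the code from the patch: the names AsciiToUpper(), AsciiToLower() and AsciiCompareNoCase() are hypothetical stand-ins for __toupper(), __tolower() and a typical caller, and the sketch assumes only ASCII input needs handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for __toupper()/__tolower(): plain ASCII
 * arithmetic, with no locale lookup and no Unicode conversion. */
static char
AsciiToUpper(char Character)
{
    if (Character >= 'a' && Character <= 'z')
        return (char)(Character - 'a' + 'A');

    return Character;
}

static char
AsciiToLower(char Character)
{
    if (Character >= 'A' && Character <= 'Z')
        return (char)(Character - 'A' + 'a');

    return Character;
}

/* Example caller: case-insensitive comparison of two ASCII strings.
 * Returns 0 when the strings match, non-zero otherwise. */
static int
AsciiCompareNoCase(const char *Left, const char *Right)
{
    assert(Left != NULL && Right != NULL);

    for (;;) {
        char    L = AsciiToLower(*Left++);
        char    R = AsciiToLower(*Right++);

        if (L != R)
            return (unsigned char)L - (unsigned char)R;

        if (L == '\0')
            return 0;
    }
}
```

Because the helpers touch nothing but their argument, they make no assumption about the IRQL at which they are called.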
Paul Durrant [Wed, 9 Dec 2015 14:35:42 +0000 (14:35 +0000)]
Use new SystemProcessorCount() function for XENBUS_EVTCHN initialization
Since it's necessary in a few places in the EVTCHN code to map processor
number to vcpu_id, the available processors should be limited to those for
which such a mapping exists.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 3 Dec 2015 12:31:57 +0000 (12:31 +0000)]
Verify that all interfaces have been released when going into S4
Because a transition into and out of S4 means a new domain is built, it's
crucial that all XENBUS interfaces are released (so that things like
event channels, grant tables and the xenstore ring get re-constructed).
This patch adds BUG_ONs to ensure this is the case.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Nov 2015 16:17:14 +0000 (16:17 +0000)]
Make sure registry updates and deletes are flushed
In most cases it is desirable to make sure any updates are committed to
the registry hive on storage before any further operations are performed.
This patch adds ZwFlushKey() calls to ensure that is the case.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Nov 2015 12:43:50 +0000 (12:43 +0000)]
Use the registry to check for vendor device
Using the XENFILT_PVDEVICE interface to select active device (which entails
checking for the presence of a vendor device) means that XENBUS requires a
reboot on installation before any instance can create PDOs. By using the
registry to check for vendor device presence (by checking for a key
under HKLM/System/CurrentControlset/Enum) there is no longer any need for
that reboot.
This patch amends the code as necessary, essentially pulling most of the
implementation of XENFILT_PVDEVICE into src/xenbus/driver.c.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 3 Nov 2015 11:48:30 +0000 (11:48 +0000)]
Add STORE watchdog
There have been occasions during testing when xenstored has apparently
missed sending notification to the frontend that data is on the ring.
This patch adds a watchdog to the code to notice when either of the rings
has stalled and try to move things along.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 9 Oct 2015 16:07:48 +0000 (17:07 +0100)]
Add a registry override to veto driver installations
There are certain cases where a local installer package may wish to
prevent Windows Update installations of drivers. This can be achieved by
having the co-installer check for a registry value and fail its pre-install
phase if the value is present.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 10 Sep 2015 09:05:01 +0000 (10:05 +0100)]
Fix list walking in hash_table.c
Neither HashTableLookup() nor HashTableRemove() update the iterator in their
attempted list walks, leading to an endless spin. This patch changes the
while loops to for loops and fixes the problem.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reported-by: Rafal Wojdyla <omeg@invisiblethingslab.com>
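The shape of the list-walking fix above can be sketched as follows. This is an illustrative sketch only: the NODE type and the ListLookup() and ListRemove() helpers are hypothetical, and a singly-linked list is assumed for brevity rather than whatever list type hash_table.c actually uses. The point is that a for loop steps the iterator in its header, so no path through the body can forget to advance.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bucket entry; the real structure is not shown in the log. */
typedef struct _NODE {
    struct _NODE    *Next;
    unsigned long   Key;
    unsigned long   Value;
} NODE;

/* Walk the list with the iterator update in the loop header. */
static NODE *
ListLookup(NODE *Head, unsigned long Key)
{
    NODE    *Node;

    for (Node = Head; Node != NULL; Node = Node->Next) {
        if (Node->Key == Key)
            return Node;
    }

    return NULL;
}

/* Unlink and return the first node matching Key, or NULL if none. */
static NODE *
ListRemove(NODE **Head, unsigned long Key)
{
    NODE    **Link;

    assert(Head != NULL);

    for (Link = Head; *Link != NULL; Link = &(*Link)->Next) {
        NODE    *Node = *Link;

        if (Node->Key == Key) {
            *Link = Node->Next;
            Node->Next = NULL;
            return Node;
        }
    }

    return NULL;
}
```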
Paul Durrant [Wed, 9 Sep 2015 15:37:46 +0000 (16:37 +0100)]
Add Wait method to XENBUS_EVTCHN and use it in XENBUS_STORE
This patch adds a Wait method to the XENBUS_EVTCHN interface to allow
a subscriber to wait for an event channel to be signalled. This is useful
in XENBUS_STORE to avoid polling the ring state too often.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 9 Sep 2015 12:39:13 +0000 (13:39 +0100)]
Fix hash table overflow
There is a flaw in HashTableHash() which means that, for example, an Array
value of 0xff added to an Accumulator value of 0xff will lead to more than
4 bits of Overflow. The 5th bit is missed by the mask and is hence not
folded back into the lower order bits of the Accumulator. The upshot of
this is an ASSERTion failure for a debug build or an array overflow in the
caller for a non-debug build.
This patch fixes this issue by increasing the overflow mask to 8 bits
instead of 4 (although 5 bits would actually be sufficient).
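The arithmetic behind the overflow can be sketched as follows. This is a sketch under stated assumptions, not the code from hash_table.c: it assumes the Accumulator is shifted left by 4 bits before each Array byte is added (which is consistent with the 5 bits of Overflow described above), that the table has 256 buckets, and that HashFold() and NR_BUCKETS are hypothetical names. The overflow mask is a parameter so the old and new behaviour can be compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_BUCKETS 256u /* assumed table size */

/* Fold a byte array into an 8-bit bucket index. OverflowMask selects
 * which bits above the low 8 are folded back in on each step. */
static uint32_t
HashFold(const uint8_t *Array, size_t Length, uint32_t OverflowMask)
{
    uint32_t    Accumulator = 0;
    size_t      Index;

    assert(Array != NULL);

    for (Index = 0; Index < Length; Index++) {
        uint32_t    Overflow;

        /* 0xff << 4 plus 0xff is 0x10ef: 5 bits above the low 8. */
        Accumulator = (Accumulator << 4) + Array[Index];

        Overflow = Accumulator & OverflowMask;
        if (Overflow != 0) {
            Accumulator ^= Overflow >> 8;   /* fold into the low bits */
            Accumulator ^= Overflow;        /* clear the folded bits */
        }
    }

    return Accumulator;
}
```

With a 4-bit mask (0x0f00) the bytes { 0xff, 0xff } leave bit 12 set, so the result is not a valid bucket index. With an 8-bit mask (0xff00) every bit above the low 8 is folded back in and the result stays below NR_BUCKETS.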
Paul Durrant [Tue, 8 Sep 2015 15:21:25 +0000 (16:21 +0100)]
Parameterize vendor prefix and PCI device id
The XenServer PV vendor prefix ('XS') and PCI device (C000) are still
hard-coded into the XENBUS package. These need to be stripped out and
replaced by values that can be customized at build time. This patch does
that.
The patch also reverts to building version.h and customizing xenbus.inf
directly in build.py.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 8 Sep 2015 13:20:19 +0000 (14:20 +0100)]
Don't treat a missing Driver key as a hard failure
When looking to see whether an incumbent child driver will patch the
PDO names created by the new version of XENBUS, ignore any cases where
we find that the Driver key referenced in the Device key is actually
missing.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
In WHQL testing I suspect the removal and re-creation of filter objects
when IRP_MN_REMOVE_DEVICE is processed in the case that underlying PDO is
not actually going away may cause problems.
By reverting 632cc904 this bouncing is prevented but the code needs more
work to fix the hanging object references from filtDO to PDO that were the
motivation for 632cc904 in the first place.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 5 Aug 2015 07:56:56 +0000 (08:56 +0100)]
Registry string value types cannot be inferred
For instance, the UpperFilters key needs to be a REG_MULTI_SZ
even if it contains only one string. Thus the type needs to be
passed explicitly to RegistryUpdateSzValue.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 23 Jul 2015 10:53:22 +0000 (11:53 +0100)]
Install filters on first FDO creation and remove on last deletion
When XENBUS binds to two devices (as it may when the vendor PCI device
is present) then installing/removing filters on a per-FDO basis does not
work properly.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 21 Jul 2015 13:52:27 +0000 (14:52 +0100)]
Move filter installation and active device selection logic into drivers
When XENBUS creates its FDO object it will query up to XENFILT for a new
PDEVICE interface. This is used for getting/setting the active device
instance.
If the query fails then it is taken to mean that XENFILT has not been
installed into the system class UpperFilters and so this is done, and a
reboot requested (the FDO creation succeeding but remaining inactive).
If the query succeeds then the code attempts to get the active device
instance. If that succeeds then the FDO identity is checked to see if
it should be active. If, however, it fails then the code attempts to
claim the active device instance.
When XENBUS destroys its FDO object then the active device instance is
cleared (if the FDO was active) and XENFILT is removed from UpperFilters.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 17 Jul 2015 09:25:46 +0000 (10:25 +0100)]
Introduce new mechanism to unplug emulated devices
This makes an incompatible change and so the PDO revision is bumped up
without retaining any previous revisions.
With this patch a new unplug interface is exported by XENBUS (so it is
available for query before installing XENFILT). This interface exports
a Request method which is now the one true way of requesting unplug
of emulated devices. Co-installers need not mess with registry keys
any more. Instead drivers should request unplug when they find their
PDOs blocked by aliasing emulated devices, or when they successfully
come online. The reason for the latter case is that unplug is now
single-shot. It needs to be re-requested by PV drivers each time their
PDOs come online otherwise emulated devices will be re-instated on
next reboot.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 8 Jul 2015 12:53:45 +0000 (13:53 +0100)]
Avoid PDO namespace conflicts...
...by encoding the driver major version in the upper byte of the
revision.
This clearly implies that any future change in the driver major version
will start a new PDO namespace, but that is almost certainly the correct
thing to do in that case.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
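The encoding described in the commit above can be sketched as follows. This is a minimal sketch, assuming a 32-bit revision whose top byte holds the driver major version and whose low 24 bits hold the revision number proper. The width and the names PdoRevision() and PdoRevisionMajor() are assumptions for illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bits 31..24 = driver major version,
 * bits 23..0 = revision number within that major version. */
static uint32_t
PdoRevision(uint8_t MajorVersion, uint32_t Revision)
{
    assert(Revision <= 0x00ffffffu);

    return ((uint32_t)MajorVersion << 24) | Revision;
}

/* Recover the major version from an encoded revision. */
static uint8_t
PdoRevisionMajor(uint32_t Encoded)
{
    return (uint8_t)(Encoded >> 24);
}
```

Under this layout the same revision number under two different major versions yields two distinct values, which is what gives each major version its own namespace.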
Paul Durrant [Thu, 2 Jul 2015 09:23:26 +0000 (10:23 +0100)]
Fix fall-back to two-level EVTCHN ABI
When the EVTCHN code attempts to acquire the FIFO ABI it may fail to do
so because the version of Xen may not support it. In this case the code
was issuing an EventChannelReset() which has the unfortunate side effect of
killing any toolstack-created channels, such as the xenstored channel.
This patch moves the existing EvtchnFifoReset function into the base
evtchn source module (since it's not ABI specific) and uses that function
as the only mechanism of issuing an EventChannelReset() since it contains
code to preserve event channel bindings. (Prior to the move it only
preserved the xenstore channel but this patch adds code to preserve the
console event channel too, if it exists).
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 1 Jul 2015 15:11:22 +0000 (16:11 +0100)]
Fix potential buffer overflow
The __min in XENFILT's FdoQueryDeviceRelations() should be a __max. The only
reason this mistake did not lead to an immediate buffer overflow was because
the allocation incorrectly used sizeof (DEVICE_OBJECT) rather than
sizeof (PDEVICE_OBJECT).
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 2 Apr 2015 10:32:42 +0000 (11:32 +0100)]
Fix unplugging of emulated devices on resume from suspend
Due to a mis-ordering of the interface initialization calls in FdoCreate(),
the SUSPEND interface never gets hold of the UNPLUG interface and thus,
on resume from suspend, emulated device unplug is not done and the
emulated network and disk devices re-appear in the VM.
This patch re-orders the initialization code to fix this problem and also
makes SuspendTrigger() fail if the UNPLUG interface is not available.
NOTE: The change of type of SuspendTrigger() does not require a new
interface revision since it is a change from a void function to a
non-void function, so older client code will simply ignore the return
value.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 1 Apr 2015 07:53:54 +0000 (08:53 +0100)]
Make sure SYNC per-processor structures are zeroed after resume
Since the per-processor data in the SYNC code was split out from the
main context structure, the code that zeroes that structure on resume
no longer clears the per-processor Exit flag. This means that a multi-
vcpu VM can only be suspended once; subsequent attempts will fail.
This patch fixes the problem by zeroing the full page containing the SYNC
context structure and any per-processor data.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 30 Mar 2015 12:46:06 +0000 (13:46 +0100)]
Windows Server 2008 compatibility fix
Use of the CONNECT_FULLY_SPECIFIED_GROUP flag to IoConnectInterruptEx() is
not supported prior to Windows 7, so when Group == 0 (which will always be
true for any OS prior to Windows 7) just use CONNECT_FULLY_SPECIFIED
in which case it is documented that Windows will assume Group == 0.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Mar 2015 13:43:01 +0000 (13:43 +0000)]
Don't use a stack based DPC structure in the System per-CPU code
Whilst this is believed to be safe, there is no documentation to say that
Windows does not make use of the DPC structure after the DPC routine has
completed. Instead, make the DPC structure part of the per-CPU structure.
Also fix an ASSERT on the per-CPU array pointer not being NULLed.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Mar 2015 13:39:32 +0000 (13:39 +0000)]
Fix an ASSERT failure and BugCheck on XENBUS unload
The Processor array pointer in the EVTCHN code is not being NULLed, leading
to an ASSERT failure. There is also a race in zero-ing out the per-processor
DPCs and them being present on kernel queues, which leads to a BugCheck.
This patch fixes both issues.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 24 Mar 2015 10:11:35 +0000 (10:11 +0000)]
Improve auditing in CACHE and GNTTAB interfaces
Add 'Get' and 'Put' counters to CACHEs which can then be checked for
equality at destruction time to make sure all objects have been returned.
Also add a list of GNTTAB caches so that the code can BUG on any
outstanding caches at Release.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
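The Get/Put auditing described in the commit above can be sketched as follows. This is an illustrative user-mode sketch, not the XENBUS CACHE implementation: the CACHE structure, the function names and the use of calloc()/free() in place of a real object cache are all assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache with audit counters. */
typedef struct _CACHE {
    size_t          Size;       /* size of each object */
    unsigned long   GetCount;   /* objects handed out */
    unsigned long   PutCount;   /* objects returned */
} CACHE;

static void
CacheInitialize(CACHE *Cache, size_t Size)
{
    Cache->Size = Size;
    Cache->GetCount = 0;
    Cache->PutCount = 0;
}

/* Hand out an object; only successful allocations are counted. */
static void *
CacheGet(CACHE *Cache)
{
    void    *Object = calloc(1, Cache->Size);

    if (Object != NULL)
        Cache->GetCount++;

    return Object;
}

/* Take an object back and count the return. */
static void
CachePut(CACHE *Cache, void *Object)
{
    assert(Object != NULL);

    free(Object);
    Cache->PutCount++;
}

static int
CacheIsBalanced(const CACHE *Cache)
{
    return Cache->GetCount == Cache->PutCount;
}

/* Destruction is where a leaked object is caught. */
static void
CacheDestroy(CACHE *Cache)
{
    assert(CacheIsBalanced(Cache));

    Cache->GetCount = 0;
    Cache->PutCount = 0;
}
```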
Paul Durrant [Tue, 17 Mar 2015 15:53:05 +0000 (15:53 +0000)]
Make XEN, XENFILT and XENBUS processor group aware
Processor groups have been around for a long time in Windows and
continuing to ignore them becomes ever more painful when trying to
pass the HCM multiple processor group device test. This patch, therefore,
modifies all the code that uses the non-group-aware kernel calls to use
the newer group aware calls.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 19 Mar 2015 12:35:15 +0000 (12:35 +0000)]
Add strict version check between XEN and XENFILT/XENBUS
Because the ABI between XEN and the other drivers in the package is
(intentionally) unstable, add a strict version check using the single
function XenTouch to prevent XENBUS or XENFILT loading if an incumbent
XEN is from a previous package installation.
Also add code to the co-installer to request a reboot, as this is needed
to bring up a compatible set of modules.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
The code relies on post-4.5.0 header changes so we need a more recent
repository. The commit id here is just master on the day this update was
done. The headers will be updated to a tag once a suitable one is
available.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 11 Mar 2015 17:14:45 +0000 (17:14 +0000)]
Squash a prefast warning
Prefast warns that following the Flink pointer of the EVTCHN context list
may be unsafe due to it being NULL. It should never be NULL so this patch
adds an ASSERTion and hence squashes the warning.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 5 Mar 2015 12:08:08 +0000 (12:08 +0000)]
__DbgPrintEnable() cannot be called on paging path
The system power up code in XENFILT re-enables DbgPrint hooking.
Unfortunately the undocumented kernel call it uses may require a page-in
and so causes a deadlock when XENVBD is responsible for the page file,
since it depends upon XENFILT to power up.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 4 Mar 2015 12:55:50 +0000 (12:55 +0000)]
Fix WHQL Multiple processor group device test BSOD
Some of the fakery done by this test causes a BSOD in the code in the
XEN driver that attempts to build a mapping from Windows CPU index to
Xen vcpu id. Applying this patch to re-work the code slightly fixes the
BSOD and allows the test to pass.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 4 Mar 2015 10:09:51 +0000 (10:09 +0000)]
Only call PdoSuspend() and PdoResume() when in the correct state
The FDO code calls PdoSuspend() and PdoResume() when a PDO is removed or
added (respectively) to the FDO list. However, if the FDO is not in state
D0, this is not the correct thing to do. Currently PdoSuspend() and
PdoResume() do nothing in XENBUS, but the equivalent functions in XENVIF
do and a call to PdoSuspend() when the FDO was in D3 was causing a BSOD.
Commit 5b895423957de23a82dc763dc378e305be16817a fixes this problem in
XENVIF. This patch brings XENBUS into line.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 2 Mar 2015 13:08:12 +0000 (13:08 +0000)]
Fix VS2013 SDV failures
A mis-annotation of some ZwQueryXXX operations is causing SDV to fail
when it notices code in registry.c using the length being passed back
from a failed call. The code is correct according to the documentation of
those functions so this patch suppresses the warnings.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 27 Feb 2015 18:03:18 +0000 (18:03 +0000)]
Add registry tweak to make PDOs ejectable
Add a registry tweak called 'AllowPdoEject' so that PDOs can be made
ejectable and rename the tweak to make PDOs removable to 'AllowPdoRemove'
to bring it in line.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 27 Feb 2015 18:00:09 +0000 (18:00 +0000)]
Release lower bus interface before passing removal IRP to PDO
If the PDO is surprise removed then the FDO tries to retain its
reference to the PDO's bus interface beyond its destruction. This may cause
memory corruption.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 27 Feb 2015 13:48:46 +0000 (13:48 +0000)]
Fix event channel unmasking for two-level ABI
The two-level ABI requires that an event is masked for the unmask
hypercall to raise the event, so the test-and-clear operation in the
guest basically means that pending events get stuck. The simple fix
is to re-mask pending events before making the hypercall. This is
unnecessary when the FIFO ABI is used, but it's safe. Hence this patch
unconditionally re-masks pending events, regardless of ABI, before
making the unmask hypercall.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Feb 2015 14:23:43 +0000 (14:23 +0000)]
Re-instate bus enumeration filtering in XENFILT
The filtering code was removed in commit
ff034f7ebd4010f7b502c56cb1cca6ba40a7f1aa when it was realized that
simply modifying the reported instance ID of the active device to be
constant, even if it moves location on the PCI bus, was sufficient to
keep PV drivers binding. However, this does not protect against the
active device going away so you can still end up with a non-bootable VM.
This patch re-instates the code, modified to check for presence of the
active device with a wildcard instance rather than a fixed instance.
NOTE: This patch does include a change to emulated_interface.h, to allow
a NULL instance ID to be passed to IsDevicePresent. Whilst this is
a new semantic, it does not affect compatibility so there is no
change to the interface version.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 9 Feb 2015 13:12:39 +0000 (13:12 +0000)]
Fix SDV build
SDV builds are failing with this error:
..\..\src\xenbus\fdo.c(758) : error C2220: warning treated as error - no 'object' file generated
c:\git\xenbus\src\xenbus\fdo.c(744) : warning C6246: Local declaration of 'status' hides declaration of the same name in outer scope. For additional information, see previous declaration at line '675' of 'c:\git\xenbus\src\xenbus\fdo.c'.: Lines: 675
This patch removes the erroneous declaration.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 3 Feb 2015 11:38:33 +0000 (11:38 +0000)]
EVTCHN FIFO ABI should keep per-vcpu queue head shadow
There is currently only one shadow head per priority which means that
polling event channels of the same priority on different CPUs can conflict,
leading to lost events. This patch fixes the problem.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 28 Jan 2015 16:27:36 +0000 (16:27 +0000)]
Remove use of KeNumberProcessors from EVTCHN code
The crucial things are the virtual CPUs to which the ABI can steer
interrupts and the number of interrupts which the FDO code successfully
allocated. By using an intersection of these two things we can drop any
use of KeNumberProcessors.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 28 Jan 2015 14:47:31 +0000 (14:47 +0000)]
Remove use of KeNumberProcessors from SYSTEM code
The Xen SYSTEM module queries system information, including per-cpu
information. It is therefore best to make use of processor callback
functions rather than iterating over boot-time active processors, as this
makes the code robust to processors hot-added after boot.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 20 Jan 2015 17:39:58 +0000 (17:39 +0000)]
Re-work EVTCHN Close
When using the FIFO ABI, the Close operation needs to be synchronized with
polling. This requires some re-structuring of the close code to defer the
actual closure of the event channel in Xen until at least one poll after
masking and starting the tear-down.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 20 Jan 2015 11:40:39 +0000 (11:40 +0000)]
Re-work EVTCHN Trigger
By re-working the event channel poll callback to add pending channels to a
list and then service them, we can tidy up Trigger by forcibly appending a
channel to the same list and then scheduling a DPC on the correct cpu to
service the list (unless an interrupt callback came along in the interim
and did the job).
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 16 Jan 2015 15:09:40 +0000 (15:09 +0000)]
Use a DPC per CPU for EVTCHN Trigger
Using a single DPC potentially means re-affinitizing it for each use, and
this potentially means Windows could try to insert it onto multiple CPU
queues at the same time, which probably won't end well. Using a DPC per
CPU seems a lot safer.
Also add a DPC flush before we zero out data structures.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 19 Jan 2015 09:54:38 +0000 (09:54 +0000)]
Eagerly enable and disable interrupts
There's really no need to lazily enable event channel interrupts. It makes
the code simpler to enable at the end of the first acquire and disable at
the start of the last release.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
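The ordering described in the commit above can be sketched as follows. This is a minimal sketch with hypothetical names: the CONTEXT_SKETCH structure and the SketchAcquire() and SketchRelease() functions are assumptions, and plain flags stand in for the real set-up and interrupt state. Interrupts are enabled as the last step of the first acquire and disabled as the first step of the last release.

```c
#include <assert.h>

/* Hypothetical reference-counted context. */
typedef struct _CONTEXT_SKETCH {
    unsigned long   References;
    int             Initialized;        /* stands in for set-up state */
    int             InterruptsEnabled;
} CONTEXT_SKETCH;

static void
SketchAcquire(CONTEXT_SKETCH *Context)
{
    if (Context->References++ != 0)
        return;

    /* First acquire: set everything up, then enable at the end. */
    Context->Initialized = 1;
    Context->InterruptsEnabled = 1;
}

static void
SketchRelease(CONTEXT_SKETCH *Context)
{
    assert(Context->References != 0);

    if (--Context->References != 0)
        return;

    /* Last release: disable at the start, then tear down. */
    Context->InterruptsEnabled = 0;
    Context->Initialized = 0;
}
```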
Paul Durrant [Mon, 19 Jan 2015 16:18:05 +0000 (16:18 +0000)]
Get rid of per-channel interrupt pointer
It's kind of redundant as we have a Cpu index value. Also, remove code that
tries to ensure everything is on the right cpu even across an event channel
re-bind. Since we now always use the hypercall to handle unmasking then we
don't need to care so much.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 20 Jan 2015 15:11:28 +0000 (15:11 +0000)]
Fix EvtchnOpen() error path
It's possible for an error in the latter stages of EvtchnOpen() to cause the
channel to be left open in Xen. Also the Mask boolean was not being cleared
which would lead to ASSERTion failures in checked builds.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>