David Buches [Tue, 22 Nov 2016 11:08:36 +0000 (11:08 +0000)]
Fixed improper translation of SCHEDOP_Shutdown return code
The documentation for the SCHEDOP_Shutdown hyper-call states that when
invoked with the SHUTDOWN_Suspend reason code, the return value indicates
that the guest domain either suspended (and resumed) in a new domain (0),
or that the operation was canceled (1).
The problem - the SchedShutdown() wrapper wasn't properly translating the
return value for SHUTDOWN_Suspend - it returned a success value for both
successful and canceled suspend operations, which resulted in suspend
callbacks erroneously being invoked for canceled operations, producing
undesirable side effects (suspend callbacks are only supposed to be
invoked when resuming on a new domain).
The code now returns an appropriate status value when SHUTDOWN_Suspend
operations are canceled.
Signed-off-by: David Buches <davebuch@amazon.com>
Slightly re-factored for cosmetic reasons.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
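The return-value translation described in the SCHEDOP_Shutdown fix above can be sketched as below. This is a minimal user-mode illustration, not the driver's code: `translate_suspend_result`, `should_run_suspend_callbacks` and the `suspend_status` values are hypothetical names, and treating any other return value as a failure is an assumption. Only the meanings of 0 and 1 come from the commit message.

```c
/* Hypothetical status values for this sketch (not the driver's NTSTATUS codes). */
typedef enum {
    SUSPEND_RESUMED,    /* hypercall returned 0: resumed in a new domain */
    SUSPEND_CANCELLED,  /* hypercall returned 1: the suspend was canceled */
    SUSPEND_FAILED      /* any other value: treated as an error (assumption) */
} suspend_status;

/* Map the raw SHUTDOWN_Suspend return value onto a distinct status. */
suspend_status translate_suspend_result(long rc)
{
    switch (rc) {
    case 0:
        return SUSPEND_RESUMED;
    case 1:
        return SUSPEND_CANCELLED;
    default:
        return SUSPEND_FAILED;
    }
}

/* Suspend callbacks should only run after a resume in a new domain. */
int should_run_suspend_callbacks(long rc)
{
    return translate_suspend_result(rc) == SUSPEND_RESUMED;
}
```

With this shape a canceled suspend (return value 1) no longer looks like success, so the callbacks are skipped.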
Paul Durrant [Wed, 2 Nov 2016 11:00:11 +0000 (11:00 +0000)]
Remove S4 BUG_ONs for interfaces that don't depend on Xen
Some interfaces don't depend on Xen (e.g. CACHE, RANGE_SET) and so it
is safe for them to have outstanding references across an S4 transition
or suspend/resume (i.e. transitions which result in a new domain). Only
interfaces that actually depend on Xen (e.g. GNTTAB, EVTCHN) cannot
have outstanding references in these cases, so limit the BUG_ONs to those.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 21 Sep 2016 12:49:20 +0000 (13:49 +0100)]
Fix a couple of issues picked up by Windows 10 verifier
- It's possible for MmAllocatePagesForMdlEx() not to satisfy the
full allocation request, but not fail. Thus AllocatePage() should
check that the completed allocation actually matches what it
asks for.
- RegistryCreateKey() has a memory leak.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Tue, 20 Sep 2016 17:06:14 +0000 (18:06 +0100)]
Step through hardware revision list in reverse order
Windows treats the HardwareID list as a descending order of specialization
where the first entry is the most specific, and last entry is least
specific. This can lead to install issues when the newer driver has a
less-specific HardwareID, as the older ("more-specific") HardwareID is
used for the match. Reordering the HardwareID list, so that the newest
revision is first, will stop Windows selecting the wrong driver package
to install.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Re-factored slightly for code consistency.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
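The reordering described in the HardwareID change above amounts to walking the revision list backwards. A minimal sketch, assuming the revisions are held in an array sorted oldest to newest; `order_newest_first` is a hypothetical helper and the revision values in the usage note are illustrative only:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy `count` revisions from `oldest_first` into `newest_first` in reverse
 * order, so that the newest (most specific) revision ends up at index 0.
 */
void order_newest_first(const unsigned *oldest_first, size_t count,
                        unsigned *newest_first)
{
    size_t out = 0;

    assert(count == 0 || (oldest_first != NULL && newest_first != NULL));

    /* Step through the input from the last element down to the first. */
    for (size_t i = count; i-- > 0; )
        newest_first[out++] = oldest_first[i];
}
```

For example, an input of { 0x08000009, 0x0800000A, 0x0800000B } comes out with 0x0800000B first, which is the entry Windows would try to match before the others.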
Paul Durrant [Mon, 22 Aug 2016 12:59:10 +0000 (13:59 +0100)]
Don't assume a 32-page grant table
The default grant table size in Xen is 32 pages, but it is tunable.
This patch allows the XENBUS_GNTTAB interface to take advantage of an
increased grant table size.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 22 Aug 2016 07:40:58 +0000 (08:40 +0100)]
Add missing patch
I missed a 'git add' for the latest code in registry.c resulting in the
code here being slightly behind that in XENVIF and XENVBD. This patch
brings it into line.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 19 Aug 2016 10:29:41 +0000 (11:29 +0100)]
Bring RegistryCreateKey()'s semantics in line with Win32 RegCreateKeyEx()
RegCreateKeyEx() will create intermediate keys in a path whereas
ZwCreateKey() will not. Thus, to align the semantics, this patch will
parse the path passed to RegistryCreateKey() and create subkeys one by
one.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
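The "create subkeys one by one" approach described above can be sketched in plain C as follows. This is an illustration under stated assumptions, not the code in registry.c: `create_key_path`, `create_one_fn`, `struct create_log` and `log_create` are hypothetical names, the callback stands in for a single-level key creation, and the 256-byte limit is arbitrary.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback standing in for a single-level key creation. */
typedef int (*create_one_fn)(const char *prefix, void *ctx);

/*
 * Call `create_one` for every prefix of `path` that ends at a component
 * boundary, e.g. "A\\B\\C" yields "A", then "A\\B", then "A\\B\\C".
 * Returns 0 on success or the first non-zero callback result.
 */
int create_key_path(const char *path, create_one_fn create_one, void *ctx)
{
    char prefix[256];
    size_t len = strlen(path);

    if (len == 0 || len >= sizeof(prefix))
        return -1;

    for (size_t i = 0; i <= len; i++) {
        /* A separator or the end of the string closes one component. */
        if (path[i] != '\\' && path[i] != '\0')
            continue;
        if (i == 0 || path[i - 1] == '\\')
            continue;               /* skip empty components */

        memcpy(prefix, path, i);
        prefix[i] = '\0';

        int rc = create_one(prefix, ctx);
        if (rc != 0)
            return rc;
    }
    return 0;
}

/* Example callback: records how many keys were requested and the last one. */
struct create_log {
    int calls;
    char last[256];
};

int log_create(const char *prefix, void *ctx)
{
    struct create_log *rec = ctx;

    rec->calls++;
    strncpy(rec->last, prefix, sizeof(rec->last) - 1);
    rec->last[sizeof(rec->last) - 1] = '\0';
    return 0;
}
```

Passing "Software\\Vendor\\Driver" with `log_create` results in three calls, the last of them for the full path.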
Paul Durrant [Fri, 12 Aug 2016 13:39:35 +0000 (14:39 +0100)]
monitor: get dialog parameters from the registry
It is easier to localise the monitor dialog if it picks up the reboot dialog
title and message from registry parameters rather than having them hardcoded
or in a string table. This patch does this and sets default values in the
INF file.
This patch also adds a call to wait for driver installations to complete
before initiating a reboot.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 8 Aug 2016 15:18:59 +0000 (16:18 +0100)]
Re-work monitor service registry keys
Instead of using the monitor service key directly to place reboot
requests, use a key under HKLM\SOFTWARE. This is a better place to handle
interactions between separate PV driver packages.
Also, give the monitor service a description and add a parameter to control
the reboot prompt dialog timeout.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 28 Jul 2016 09:34:13 +0000 (10:34 +0100)]
XENBUS_MONITOR: don't delete the registry value until a reboot is pending
If a reboot is requested whilst there is no active session then the
monitor will not be able to prompt for reboot. We need to leave the
registry value in place until we have prompted.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 27 Jul 2016 09:53:02 +0000 (10:53 +0100)]
XENBUS_MONITOR refinements
Use a string table for the dialog message rather than coding it inline.
Also, trim the DisplayName pulled from the registry because Windows 10
seems to prefix it with useless tags.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 20 Jul 2016 13:40:23 +0000 (14:40 +0100)]
Check 'Reboot' value in the 'Request' key
If the 'Reboot' value is set with a service name then pop up a message in
the active session indicating that the specified service requires a system
reboot in order to complete installation. If the session user responds
affirmatively to the message then initiate a reboot.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 8 Jul 2016 16:29:37 +0000 (17:29 +0100)]
Add a new monitor service
This patch adds the boilerplate for a service called XENBUS_MONITOR.
The service does not yet have any functionality. This will be added
by subsequent patches.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 19 Jul 2016 13:13:51 +0000 (14:13 +0100)]
Reduce priority of suspend thread
In cycles of repeated suspend/resume, attempt to make sure other threads
get to run by:
a) Dropping the priority of the suspend thread as low as possible.
b) Deliberately waiting for DPCs on other CPUs to complete before
checking xenstore again.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 6 Jul 2016 12:58:49 +0000 (13:58 +0100)]
Clear Unplug keys when PDO names change
When upgrading XENBUS the names of PDOs may change because a new
interface version is added.
The co-installer will check for compatibility with child drivers, but
even a compatible child driver will need to re-bind if the name of the PDO
to which it binds has changed. This is a problem for boot-start drivers
because the CDDB was removed in Windows 7, which means the setupapi must
do the re-bind and that means a 0x7B BSOD will ensue if XENVBD's binding
needs to change.
To avoid this problem, if the co-installer detects that PDO names will
change, the Unplug keys are cleared causing a fall-back to emulated devices
on reboot thus allowing the setupapi to run and fix the bindings of other
PV drivers.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 12 May 2016 09:14:19 +0000 (10:14 +0100)]
Don't free memory at HIGH IRQL
The hash table remove function is invoked by the EVTCHN early callback on
resume from suspend. This means it is invoked at HIGH level with interrupts
disabled, which means that memory can neither be allocated nor freed. The
code, however, does indeed free a data structure and this may well lead
to memory corruption. This patch addresses the issue by deferring freeing
the memory to a subsequently scheduled DPC.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
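The deferral described in the HIGH IRQL fix above follows a common pattern: in the restricted context only unlink the object and queue it, then free it later from a context where the allocator may be called. A minimal single-threaded sketch in user-mode C; `struct node`, `defer_free` and `drain_deferred` are hypothetical names, `free()` stands in for the pool free, and the locking a real driver would need is omitted:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
    struct node *next;
    int key;
};

/* Nodes removed in the restricted context, waiting to be freed later. */
static struct node *pending_free;

/* Safe in the restricted context: pointer updates only, no allocator calls. */
void defer_free(struct node *n)
{
    assert(n != NULL);
    n->next = pending_free;
    pending_free = n;
}

/*
 * Runs later, where freeing is allowed (the commit uses a DPC for this).
 * Returns the number of nodes released.
 */
size_t drain_deferred(void)
{
    size_t freed = 0;

    while (pending_free != NULL) {
        struct node *n = pending_free;

        pending_free = n->next;
        free(n);
        freed++;
    }
    return freed;
}
```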
Paul Durrant [Tue, 1 Mar 2016 14:04:00 +0000 (14:04 +0000)]
Don't veto everything on InitSafeBootMode
In safe mode we want to fall back to using emulated devices (which have
in-box drivers) just in case there is a problem using PV devices. However,
the current scheme of bailing very early in DriverEntry() hence not
supplying an AddDevice() entry point, hence not creating any FDOs and hence
no PDOs is problematic. This is because, when no child FDOs are created,
un-installing a child driver does not invoke the child driver co-installer
and thus cleanup, such as removing unplug registry keys, does not occur.
This then leads to a potential 0x7B BSOD on reboot if XENVBD was removed in
safe mode.
This patch gets rid of the global veto and instead simply vetoes unplug of
emulated devices. This should be sufficient for other PV drivers to
deactivate and let Windows use the emulated devices, but won't get in the
way of normal driver un-install behaviour.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Owen Smith [Tue, 15 Dec 2015 11:30:18 +0000 (11:30 +0000)]
BSOD if initial balloon thread has not completed within 20 minutes
Since there is no way of reporting balloon failures to the toolstack,
the only way of stopping a VM from attempting to balloon indefinitely
is to BSOD after a large timeout.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Largely cosmetic changes (comments and #defines).
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 10 Dec 2015 11:03:50 +0000 (11:03 +0000)]
Introduce __FreePoolWithTag()
Being able to interpose on memory allocation can be useful during
debugging. We already have __AllocatePoolWithTag() so this patch matches
it with __FreePoolWithTag().
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 10 Dec 2015 10:37:07 +0000 (10:37 +0000)]
Don't use C runtime versions of toupper() and tolower()
It seems that, despite their trivial functionality, the runtime
implementation insists on converting to Unicode! This means those functions
are actually only safe at PASSIVE_LEVEL.
This patch implements __toupper() and __tolower() as replacements with
no such hidden nastiness and modifies callers to use those.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
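Replacements of the kind described above can be as small as a range check on ASCII letters. A minimal sketch; the names `ascii_toupper` and `ascii_tolower` are hypothetical (the patch itself uses __toupper() and __tolower()), and the code assumes an ASCII execution character set:

```c
/* ASCII-only case conversion: no locale lookup and no Unicode conversion. */
char ascii_toupper(char c)
{
    return (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c;
}

char ascii_tolower(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}
```

Because these touch nothing but their argument, they have no hidden dependency that would restrict the IRQL at which they can be called.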
Paul Durrant [Wed, 9 Dec 2015 14:35:42 +0000 (14:35 +0000)]
Use new SystemProcessorCount() function for XENBUS_EVTCHN initialization
Since it's necessary in a few places in the EVTCHN code to map processor
number to vcpu_id, the available processors should be limited to those for
which such a mapping exists.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 3 Dec 2015 12:31:57 +0000 (12:31 +0000)]
Verify that all interfaces have been released when going into S4
Because a transition into and out of S4 means a new domain is built, it's
crucial that all XENBUS interfaces are released (so that things like
event channels, grant tables and the xenstore ring get re-constructed).
This patch adds BUG_ONs to ensure this is the case.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Nov 2015 16:17:14 +0000 (16:17 +0000)]
Make sure registry updates and deletes are flushed
In most cases it is desirable to make sure any updates are committed to
the registry hive on storage before any further operations are performed.
This patch adds ZwFlushKey() calls to ensure that is the case.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Nov 2015 12:43:50 +0000 (12:43 +0000)]
Use the registry to check for vendor device
Using the XENFILT_PVDEVICE interface to select active device (which entails
checking for the presence of a vendor device) means that XENBUS requires a
reboot on installation before any instance can create PDOs. By using the
registry to check for vendor device presence (by looking if there is a key
under HKLM/System/CurrentControlset/Enum) there is no longer any need for
that reboot.
This patch amends the code as necessary, essentially pulling most of the
implementation of XENFILT_PVDEVICE into src/xenbus/driver.c.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 3 Nov 2015 11:48:30 +0000 (11:48 +0000)]
Add STORE watchdog
There have been occasions during testing when xenstored has apparently
missed sending notification to the frontend that data is on the ring.
This patch adds a watchdog to the code to notice when either of the rings
has stalled and try to move things along.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 23 Oct 2015 10:39:06 +0000 (11:39 +0100)]
Dump information about viridian enlightenments
Sometimes, for diagnosis, it's useful to have a log of what viridian
enlightenments are visible to a VM. This patch adds new code into the
XEN system module to dump relevant information at boot time.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 9 Oct 2015 16:07:48 +0000 (17:07 +0100)]
Add a registry override to veto driver installations
There are certain cases where a local installer package may wish to
prevent Windows Update installations of drivers. This can be achieved by
having the co-installer check for a registry value and fail its pre-install
phase if the value is present.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
PDO revision 0x0800000B includes STORE interface version 2 (added
StorePermissionsSet()) and GNTTAB interface version 2 (added
GnttabMapForeignPages() and GnttabUnmapForeignPages()).
Signed-off-by: Rafal Wojdyla <omeg@invisiblethingslab.com>
Acked-by: Paul Durrant <paul.durrant@citrix.com>
Add foreign page mapping functions to the GNTTAB interface
GNTTAB interface now includes functions to map and unmap memory pages
granted by a foreign domain. The page(s) are mapped under an address
allocated from the PCI BAR space.
Signed-off-by: Rafal Wojdyla <omeg@invisiblethingslab.com>
Some cosmetic tweaking and BUG_ON unmap failure rather than using a
dedicated bugcheck code. The Errors in grant_table.c are changed
to Warnings with expanded information on the precise map/unmap
that failed.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 10 Sep 2015 09:05:01 +0000 (10:05 +0100)]
Fix list walking in hash_table.c
Neither HashTableLookup() nor HashTableRemove() update the iterator in their
attempted list walks, leading to an endless spin. This patch changes the
while loops to for loops and fixes the problem.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reported-by: Rafal Wojdyla <omeg@invisiblethingslab.com>
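The list-walking fix above relies on a for loop keeping the iterator advance in the loop header, where it cannot be forgotten. A minimal sketch using a singly linked bucket; `struct entry` and `bucket_lookup` are hypothetical, and the list type actually used in hash_table.c may differ:

```c
#include <stddef.h>

struct entry {
    struct entry *next;
    unsigned long key;
    unsigned long value;
};

/*
 * Walk one bucket. The `e = e->next` step lives in the for header, so every
 * iteration advances even when the body does not match.
 */
struct entry *bucket_lookup(struct entry *head, unsigned long key)
{
    for (struct entry *e = head; e != NULL; e = e->next) {
        if (e->key == key)
            return e;
    }
    return NULL;    /* reached the end of the bucket without a match */
}
```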
Paul Durrant [Wed, 9 Sep 2015 15:37:46 +0000 (16:37 +0100)]
Add Wait method to XENBUS_EVTCHN and use it in XENBUS_STORE
This patch adds a Wait method to the XENBUS_EVTCHN interface to allow
a subscriber to wait for an event channel to be signalled. This is useful
in XENBUS_STORE to avoid polling the ring state too often.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 9 Sep 2015 12:39:13 +0000 (13:39 +0100)]
Fix hash table overflow
There is a flaw in HashTableHash() which means that, for example, an Array
value of 0xff added to an Accumulator value of 0xff will lead to more than
4 bits of Overflow. The 5th bit is missed by the mask and is hence not
folded back into the lower order bits of the Accumulator. The upshot of
this is an ASSERTion failure for a debug build or an array overflow in the
caller for a non-debug build.
This patch fixes this issue by increasing the overflow mask to 8 bits
instead of 4 (although 5 bits would actually be sufficient).
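The effect of the mask width can be reproduced with a single accumulate-and-fold step. The sketch below is an assumption about the shape of the hash step: the shift by 4, the XOR fold and the name `fold_step` are illustrative and not taken from hash_table.c. Only the 0xff example and the 4-bit versus 8-bit masks come from the commit message.

```c
/*
 * One accumulate-and-fold step. Bits selected by `overflow_mask` are folded
 * back into the low byte and then cleared. Bits above the mask are missed.
 */
unsigned fold_step(unsigned accumulator, unsigned char byte,
                   unsigned overflow_mask)
{
    unsigned overflow;

    accumulator = (accumulator << 4) + byte;

    overflow = accumulator & overflow_mask;
    if (overflow != 0) {
        accumulator ^= overflow >> 8;   /* fold overflow into the low bits */
        accumulator ^= overflow;        /* clear the overflow bits */
    }
    return accumulator;
}
```

With an accumulator of 0xff and a byte of 0xff the intermediate value is 0x10ef. A 4-bit mask (0x0f00) leaves bit 12 set, so the result no longer fits in 8 bits, whereas an 8-bit mask (0xff00) folds it back below 0x100.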
Paul Durrant [Tue, 8 Sep 2015 15:21:25 +0000 (16:21 +0100)]
Parameterize vendor prefix and PCI device id
The XenServer PV vendor prefix ('XS') and PCI device (C000) are still
hard-coded into the XENBUS package. These need to be stripped out and
replaced by values that can be customized at build time. This patch does
that.
The patch also reverts to building version.h and customizing xenbus.inf
directly in build.py.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 8 Sep 2015 13:20:19 +0000 (14:20 +0100)]
Don't treat a missing Driver key as a hard failure
When looking to see whether an incumbent child driver will match the
PDO names created by the new version of XENBUS, ignore any cases where
we find that the Driver key referenced in the Device key is actually
missing.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
In WHQL testing I suspect the removal and re-creation of filter objects
when IRP_MN_REMOVE_DEVICE is processed in the case that underlying PDO is
not actually going away may cause problems.
By reverting 632cc904 this bouncing is prevented but the code needs more
work to fix the hanging object references from filtDO to PDO that were the
motivation for 632cc904 in the first place.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 5 Aug 2015 07:56:56 +0000 (08:56 +0100)]
Registry string value types cannot be inferred
For instance, the UpperFilters key needs to be a REG_MULTI_SZ
even if it contains only one string. Thus the type needs to be
passed explicitly to RegistryUpdateSzValue.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 23 Jul 2015 10:53:22 +0000 (11:53 +0100)]
Install filters on first FDO creation and remove on last deletion
When XENBUS binds to two devices (as it may when the vendor PCI device
is present) then installing/removing filters on a per-FDO basis does not
work properly.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 21 Jul 2015 13:52:27 +0000 (14:52 +0100)]
Move filter installation and active device selection logic into drivers
When XENBUS creates its FDO object it will query up to XENFILT for a new
PDEVICE interface. This is used for getting/setting the active device
instance.
If the query fails then it is taken to mean that XENFILT has not been
installed into the system class UpperFilters and so this is done, and a
reboot requested (the FDO creation succeeding but remaining inactive).
If the query succeeds then the code attempts to get the active device
instance. If that succeeds then the FDO identity is checked to see if
it should be active. If, however, it fails then the code attempts to
claim the active device instance.
When XENBUS destroys its FDO object then the active device instance is
cleared (if the FDO was active) and XENFILT is removed from UpperFilters.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Fri, 17 Jul 2015 09:25:46 +0000 (10:25 +0100)]
Introduce new mechanism to unplug emulated devices
This makes an incompatible change and so the PDO revision is bumped up
without retaining any previous revisions.
With this patch a new unplug interface is exported by XENBUS (so it is
available for query before installing XENFILT). This interface exports
a Request method which is now the one true way of requesting unplug
of emulated devices. Co-installers need not mess with registry keys
any more. Instead drivers should request unplug when they find their
PDOs blocked by aliasing emulated devices, or when they successfully
come online. The reason for the latter case is that unplug is now
single-shot. It needs to be re-requested by PV drivers each time their
PDOs come online otherwise emulated devices will be re-instated on
next reboot.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 8 Jul 2015 12:53:45 +0000 (13:53 +0100)]
Avoid PDO namespace conflicts...
...by encoding the driver major version in the upper byte of the
revision.
This clearly implies that any future change in the driver major version
will start a new PDO namespace, but that is almost certainly the correct
thing to do in that case.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
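The encoding described above, with the driver major version in the upper byte of the revision, can be sketched as follows. `pdo_revision` and `pdo_revision_major` are hypothetical helpers, and treating the lower 24 bits as a single revision counter is an assumption:

```c
#include <assert.h>
#include <stdint.h>

/* Pack the driver major version into bits 31..24 of the PDO revision. */
uint32_t pdo_revision(uint8_t driver_major, uint32_t interface_revision)
{
    assert(interface_revision <= 0x00FFFFFFu);  /* must fit in the low 24 bits */
    return ((uint32_t)driver_major << 24) | interface_revision;
}

/* Recover the driver major version from a packed revision. */
uint8_t pdo_revision_major(uint32_t revision)
{
    return (uint8_t)(revision >> 24);
}
```

Under this layout, major version 8 with revision 0x0B packs to 0x0800000B, and any two revisions that differ in major version can never collide.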
Paul Durrant [Thu, 2 Jul 2015 09:23:26 +0000 (10:23 +0100)]
Fix fall-back to two-level EVTCHN ABI
When the EVTCHN code attempts to acquire the FIFO ABI it may fail to do
so because the version of Xen may not support it. In this case the code
was issuing an EventChannelReset() which has the unfortunate side effect of
killing any toolstack-created channels, such as the xenstored channel.
This patch moves the existing EvtchnFifoReset function into the base
evtchn source module (since it's not ABI specific) and uses that function
as the only mechanism of issuing an EventChannelReset() since it contains
code to preserve event channel bindings. (Prior to the move it only
preserved the xenstore channel but this patch adds code to preserve the
console event channel too, if it exists).
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 1 Jul 2015 15:11:22 +0000 (16:11 +0100)]
Fix potential buffer overflow
The __min in XENFILT's FdoQueryDeviceRelations() should be a __max. The only
reason this mistake did not lead to an immediate buffer overflow was because
the allocation incorrectly used sizeof (DEVICE_OBJECT) rather than
sizeof (PDEVICE_OBJECT).
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 2 Apr 2015 10:32:42 +0000 (11:32 +0100)]
Fix unplugging of emulated devices on resume from suspend
Due to a mis-ordering of the interface initialization calls in FdoCreate(),
the SUSPEND interface never gets hold of the UNPLUG interface and thus,
on resume from suspend, emulated device unplug is not done and the
emulated network and disk devices re-appear in the VM.
This patch re-orders the initialization code to fix this problem and also
makes SuspendTrigger() fail if the UNPLUG interface is not available.
NOTE: The change of type of SuspendTrigger() does not require a new
interface revision since it is a change from a void function to a
non-void function, so older client code will simply ignore the return
value.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Wed, 1 Apr 2015 07:53:54 +0000 (08:53 +0100)]
Make sure SYNC per-processor structures are zeroed after resume
Since the per-processor data in the SYNC code was split out from the
main context structure, the code that zeroes that structure on resume
no longer clears the per-processor Exit flag. This means that a multi-
vcpu VM can only be suspended once; subsequent attempts will fail.
This patch fixes the problem by zeroing the full page containing the SYNC
context structure and any per-processor data.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Mon, 30 Mar 2015 12:46:06 +0000 (13:46 +0100)]
Windows Server 2008 compatibility fix
Use of the CONNECT_FULLY_SPECIFIED_GROUP flag to IoConnectInterruptEx() is
not supported prior to Windows 7, so when Group == 0 (which will always be
true for any OS prior to Windows 7) just use CONNECT_FULLY_SPECIFIED
in which case it is documented that Windows will assume Group == 0.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Mar 2015 13:43:01 +0000 (13:43 +0000)]
Don't use a stack based DPC structure in the System per-CPU code
Whilst this is believed to be safe, there is no documentation to say that
Windows does not make use of the DPC structure after the DPC routine has
completed. Instead, make the DPC structure part of the per-CPU structure.
Also fix an ASSERT on the per-CPU array pointer not being NULLed.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Thu, 26 Mar 2015 13:39:32 +0000 (13:39 +0000)]
Fix an ASSERT failure and BugCheck on XENBUS unload
The Processor array pointer in the EVTCHN code is not being NULLed, leading
to an ASSERT failure. There is also a race in zero-ing out the per-processor
DPCs and them being present on kernel queues, which leads to a BugCheck.
This patch fixes both issues.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Paul Durrant [Tue, 24 Mar 2015 10:11:35 +0000 (10:11 +0000)]
Improve auditing in CACHE and GNTTAB interfaces
Add 'Get' and 'Put' counters to CACHEs which can then be checked for
equality at destruction time to make sure all objects have been returned.
Also add a list of GNTTAB caches so that the code can BUG on any
outstanding caches at Release.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
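The 'Get' and 'Put' counters described above can be sketched as a pair of counters compared at destruction time. A minimal single-threaded illustration; `struct cache_audit` and the helper names are hypothetical, and a real cache would need atomic or locked updates:

```c
#include <assert.h>
#include <stddef.h>

/* Audit counters kept alongside a cache (hypothetical layout). */
struct cache_audit {
    unsigned long gets;     /* objects handed out */
    unsigned long puts;     /* objects returned */
};

void cache_note_get(struct cache_audit *audit)
{
    assert(audit != NULL);
    audit->gets++;
}

void cache_note_put(struct cache_audit *audit)
{
    assert(audit != NULL);
    audit->puts++;
}

/* Non-zero when every object handed out has been returned. */
int cache_is_balanced(const struct cache_audit *audit)
{
    assert(audit != NULL);
    return audit->gets == audit->puts;
}
```

A destroy routine would check `cache_is_balanced()` and treat a mismatch as a bug, since it means some caller still holds an object.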