annotate Documentation/PCIEBUS-HOWTO.txt @ 897:329ea0ccb344

balloon: try harder to balloon up under memory pressure.

Currently if the balloon driver is unable to increase the guest's
reservation it assumes the failure was due to reaching its full
allocation, gives up on the ballooning operation and records the limit
it reached as the "hard limit". The driver will not try again until
the target is set again (even to the same value).

However it is possible that ballooning has in fact failed due to
memory pressure in the host and therefore it is desirable to keep
attempting to reach the target in case memory becomes available. The
most likely scenario is that some guests are ballooning down while
others are ballooning up and therefore there is temporary memory
pressure while things stabilise. You would not expect a well behaved
toolstack to ask a domain to balloon to more than its allocation nor
would you expect it to deliberately over-commit memory by setting
balloon targets which exceed the total host memory.

This patch drops the concept of a hard limit and causes the balloon
driver to retry increasing the reservation on a timer in the same
manner as when decreasing the reservation.

Also if we partially succeed in increasing the reservation
(i.e. receive less pages than we asked for) then we may as well keep
those pages rather than returning them to Xen.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Fri Jun 05 14:01:20 2009 +0100 (2009-06-05)
parents 831230e53067
rev   line source
ian@0 1 The PCI Express Port Bus Driver Guide HOWTO
ian@0 2 Tom L Nguyen tom.l.nguyen@intel.com
ian@0 3 11/03/2004
ian@0 4
ian@0 5 1. About this guide
ian@0 6
ian@0 7 This guide describes the basics of the PCI Express Port Bus driver
ian@0 8 and provides information on how to enable the service drivers to
ian@0 9 register/unregister with the PCI Express Port Bus Driver.
ian@0 10
ian@0 11 2. Copyright 2004 Intel Corporation
ian@0 12
ian@0 13 3. What is the PCI Express Port Bus Driver
ian@0 14
ian@0 15 A PCI Express Port is a logical PCI-PCI Bridge structure. There
ian@0 16 are two types of PCI Express Port: the Root Port and the Switch
ian@0 17 Port. The Root Port originates a PCI Express link from a PCI Express
ian@0 18 Root Complex and the Switch Port connects PCI Express links to
ian@0 19 internal logical PCI buses. The Switch Port, which has its secondary
ian@0 20 bus representing the switch's internal routing logic, is called the
ian@0 21 switch's Upstream Port. The switch's Downstream Port bridges from the
ian@0 22 switch's internal routing bus to a bus representing the downstream
ian@0 23 PCI Express link from the PCI Express Switch.
ian@0 24
ian@0 25 A PCI Express Port can provide up to four distinct functions,
ian@0 26 referred to in this document as services, depending on its port type.
ian@0 27 A PCI Express Port's services include native hotplug support (HP),
ian@0 28 power management event support (PME), advanced error reporting
ian@0 29 support (AER), and virtual channel support (VC). These services may
ian@0 30 be handled by a single complex driver or be individually distributed
ian@0 31 and handled by corresponding service drivers.
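
As a rough illustration of "up to four services per port", the sketch
below models the service set as a bitmask. All names here (port_service,
port_type, port_has_service, port_service_count) are hypothetical and are
not taken from the kernel headers; this is a userspace model, not driver
code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical service bits, one per service named in the text. */
enum port_service {
    SERVICE_PME = 1 << 0,  /* power management event support */
    SERVICE_AER = 1 << 1,  /* advanced error reporting support */
    SERVICE_HP  = 1 << 2,  /* native hotplug support */
    SERVICE_VC  = 1 << 3   /* virtual channel support */
};

/* Hypothetical port types; which services exist depends on the type. */
enum port_type { PORT_ROOT, PORT_UPSTREAM, PORT_DOWNSTREAM };

struct port {
    enum port_type type;
    unsigned int services;  /* OR of enum port_service bits */
};

/* Returns nonzero if the port offers the given service. */
int port_has_service(const struct port *p, enum port_service s)
{
    assert(p != NULL);
    return (p->services & (unsigned int)s) != 0;
}

/* Counts the services a port offers (at most four in this model). */
int port_service_count(const struct port *p)
{
    unsigned int bits;
    int n = 0;

    assert(p != NULL);
    for (bits = p->services & 0xFu; bits != 0; bits >>= 1)
        n += (int)(bits & 1u);
    return n;
}
```

A Root Port offering PME and AER would be described as
{ PORT_ROOT, SERVICE_PME | SERVICE_AER } in this model.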
ian@0 32
ian@0 33 4. Why use the PCI Express Port Bus Driver?
ian@0 34
ian@0 35 In existing Linux kernels, the Linux Device Driver Model allows a
ian@0 36 physical device to be handled by only a single driver. The PCI
ian@0 37 Express Port is a PCI-PCI Bridge device with multiple distinct
ian@0 38 services. To maintain a clean and simple solution each service
ian@0 39 may have its own software service driver. In this case several
ian@0 40 service drivers will compete for a single PCI-PCI Bridge device.
ian@0 41 For example, if the PCI Express Root Port native hotplug service
ian@0 42 driver is loaded first, it claims a PCI-PCI Bridge Root Port. The
ian@0 43 kernel therefore does not load other service drivers for that Root
ian@0 44 Port. In other words, it is impossible to have multiple service
ian@0 45 drivers load and run on a PCI-PCI Bridge device simultaneously
ian@0 46 using the current driver model.
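
The one-driver-per-device restriction can be pictured with a small
userspace model. The types and the bind_driver function below are
hypothetical simplifications, not the kernel's driver core: the first
driver to bind wins and every later attempt is refused.

```c
#include <assert.h>
#include <stddef.h>

struct driver {
    const char *name;
};

struct device {
    const char *name;
    const struct driver *bound;  /* NULL until some driver claims it */
};

/* Returns 0 on success, -1 if another driver already owns the device. */
int bind_driver(struct device *dev, const struct driver *drv)
{
    assert(dev != NULL && drv != NULL);
    if (dev->bound != NULL)
        return -1;  /* a physical device has a single owner */
    dev->bound = drv;
    return 0;
}
```

In this model, binding a hotplug driver to a Root Port first makes a
later bind_driver call for an AER driver on the same port return -1.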
ian@0 47
ian@0 48 Running multiple service drivers simultaneously requires a PCI
ian@0 49 Express Port Bus driver, which manages all populated PCI Express
ian@0 50 Ports and distributes all provided service requests to the
ian@0 51 corresponding service drivers as required. Some key advantages of
ian@0 52 using the PCI Express Port Bus driver are listed below:
ian@0 53
ian@0 54 - Allow multiple service drivers to run simultaneously on
ian@0 55 a PCI-PCI Bridge Port device.
ian@0 56
ian@0 57 - Allow service drivers to be implemented in an independent,
ian@0 58 staged approach.
ian@0 59
ian@0 60 - Allow one service driver to run on multiple PCI-PCI Bridge
ian@0 61 Port devices.
ian@0 62
ian@0 63 - Manage and distribute resources of a PCI-PCI Bridge Port
ian@0 64 device to requested service drivers.
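
The advantages listed above can be sketched as a tiny dispatcher: one
owner (the port bus) holds the port and hands each offered service to
whichever registered service driver handles it. This is a simplified
userspace model; port_bus, service_driver and the functions below are
hypothetical names, not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SERVICE_DRIVERS 8

struct service_driver {
    const char *name;
    unsigned int service;                /* the one service bit handled */
    int (*probe)(unsigned int service);  /* returns 0 on success */
};

struct port_bus {
    const struct service_driver *drivers[MAX_SERVICE_DRIVERS];
    size_t count;
};

/* Trivial probe callback used for illustration: always succeeds. */
int accept_probe(unsigned int service)
{
    (void)service;
    return 0;
}

/* Adds a service driver to the bus; returns -1 when the table is full. */
int port_bus_register(struct port_bus *bus, const struct service_driver *drv)
{
    assert(bus != NULL && drv != NULL && drv->probe != NULL);
    if (bus->count == MAX_SERVICE_DRIVERS)
        return -1;
    bus->drivers[bus->count++] = drv;
    return 0;
}

/*
 * Offers the services of one port to every registered driver and
 * returns how many drivers probed successfully.
 */
int port_bus_probe_port(const struct port_bus *bus, unsigned int port_services)
{
    size_t i;
    int bound = 0;

    assert(bus != NULL);
    for (i = 0; i < bus->count; i++) {
        const struct service_driver *drv = bus->drivers[i];

        if ((port_services & drv->service) != 0 &&
            drv->probe(drv->service) == 0)
            bound++;
    }
    return bound;
}
```

Because the bus, not the individual service driver, owns the port, the
same registered driver can be probed against any number of ports by
calling port_bus_probe_port once per port.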
ian@0 65
ian@0 66 5. Configuring the PCI Express Port Bus Driver vs. Service Drivers
ian@0 67
ian@0 68 5.1 Including the PCI Express Port Bus Driver Support into the Kernel
ian@0 69
ian@0 70 Including the PCI Express Port Bus driver depends on whether the PCI
ian@0 71 Express support is included in the kernel config. The kernel will
ian@0 72 automatically include the PCI Express Port Bus driver as a kernel
ian@0 73 driver when the PCI Express support is enabled in the kernel.
ian@0 74
ian@0 75 5.2 Enabling Service Driver Support
ian@0 76
ian@0 77 PCI device drivers are implemented based on the Linux Device Driver
ian@0 78 Model. All service drivers are PCI device drivers. As discussed above,
ian@0 79 a service driver cannot bind directly to a port once the PCI Express
ian@0 80 Port Bus Driver has claimed it. Conforming to the PCI Express Port Bus
ian@0 81 Driver Model requires some minimal changes to existing service drivers
ian@0 82 that have no impact on their functionality.
ian@0 83
ian@0 84 A service driver is required to use the two APIs shown below to
ian@0 85 register and unregister its service with the PCI Express Port Bus
ian@0 86 driver (see sections 5.2.1 and 5.2.2). It is important that a service
ian@0 87 driver initializes the pcie_port_service_driver data structure, included
ian@0 88 in header file /include/linux/pcieport_if.h, before calling these APIs.
ian@0 89 Failure to do so will result in an identity mismatch, which prevents
ian@0 90 the PCI Express Port Bus driver from loading a service driver.
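
To show how an identity mismatch might arise, here is a minimal sketch
of ID-table matching. It assumes, without confirmation from the source,
that vendor and device may be wildcards while port type and service type
must match exactly, and that the table ends with an all-zero entry. The
struct and function names are hypothetical, and ANY_ID is a local
stand-in for a wildcard such as PCI_ANY_ID.

```c
#include <assert.h>
#include <stddef.h>

#define ANY_ID 0xFFFFFFFFu  /* local wildcard value (assumption) */

struct service_id {
    unsigned int vendor;
    unsigned int device;
    unsigned int port_type;
    unsigned int service_type;
};

/* A table field matches when it is the wildcard or equals the value. */
static int field_matches(unsigned int want, unsigned int have)
{
    return want == ANY_ID || want == have;
}

/*
 * Walks a table terminated by an all-zero entry and returns the first
 * entry matching the device, or NULL when nothing matches.
 */
const struct service_id *match_service_id(const struct service_id *table,
                                          const struct service_id *dev)
{
    assert(table != NULL && dev != NULL);
    for (; table->vendor || table->device ||
           table->port_type || table->service_type; table++) {
        if (field_matches(table->vendor, dev->vendor) &&
            field_matches(table->device, dev->device) &&
            table->port_type == dev->port_type &&
            table->service_type == dev->service_type)
            return table;
    }
    return NULL;
}
```

A driver whose table was left uninitialized (all zeroes) matches nothing
in this model, which is one way to picture the mismatch described above.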
ian@0 91
ian@0 92 5.2.1 pcie_port_service_register
ian@0 93
ian@0 94 int pcie_port_service_register(struct pcie_port_service_driver *new)
ian@0 95
ian@0 96 This API replaces the Linux Driver Model's pci_module_init API. A
ian@0 97 service driver should always call pcie_port_service_register at
ian@0 98 module init. Note that once the service driver is loaded, calls such
ian@0 99 as pci_enable_device(dev) and pci_set_master(dev) are no longer
ian@0 100 necessary since the PCI Express Port Bus driver executes them.
ian@0 101
ian@0 102 5.2.2 pcie_port_service_unregister
ian@0 103
ian@0 104 void pcie_port_service_unregister(struct pcie_port_service_driver *new)
ian@0 105
ian@0 106 pcie_port_service_unregister replaces the Linux Driver Model's
ian@0 107 pci_unregister_driver. A service driver always calls it when its
ian@0 108 module exits.
ian@0 109
ian@0 110 5.2.3 Sample Code
ian@0 111
ian@0 112 Below is sample service driver code to initialize the port service
ian@0 113 driver data structure.
ian@0 114
ian@0 115 static struct pcie_port_service_id service_id[] = { {
ian@0 116 .vendor = PCI_ANY_ID,
ian@0 117 .device = PCI_ANY_ID,
ian@0 118 .port_type = PCIE_RC_PORT,
ian@0 119 .service_type = PCIE_PORT_SERVICE_AER,
ian@0 120 }, { /* end: all zeroes */ }
ian@0 121 };
ian@0 122
ian@0 123 static struct pcie_port_service_driver root_aerdrv = {
ian@0 124 .name = (char *)device_name,
ian@0 125 .id_table = &service_id[0],
ian@0 126
ian@0 127 .probe = aerdrv_load,
ian@0 128 .remove = aerdrv_unload,
ian@0 129
ian@0 130 .suspend = aerdrv_suspend,
ian@0 131 .resume = aerdrv_resume,
ian@0 132 };
ian@0 133
ian@0 134 Below is sample code for registering/unregistering a service
ian@0 135 driver.
ian@0 136
ian@0 137 static int __init aerdrv_service_init(void)
ian@0 138 {
ian@0 139 int retval = 0;
ian@0 140
ian@0 141 retval = pcie_port_service_register(&root_aerdrv);
ian@0 142 if (!retval) {
ian@0 143 /*
ian@0 144 * FIX ME
ian@0 145 */
ian@0 146 }
ian@0 147 return retval;
ian@0 148 }
ian@0 149
ian@0 150 static void __exit aerdrv_service_exit(void)
ian@0 151 {
ian@0 152 pcie_port_service_unregister(&root_aerdrv);
ian@0 153 }
ian@0 154
ian@0 155 module_init(aerdrv_service_init);
ian@0 156 module_exit(aerdrv_service_exit);
ian@0 157
ian@0 158 6. Possible Resource Conflicts
ian@0 159
ian@0 160 Since all service drivers of a PCI-PCI Bridge Port device are
ian@0 161 allowed to run simultaneously, a few possible resource conflicts
ian@0 162 are listed below with proposed solutions.
ian@0 163
ian@0 164 6.1 MSI Vector Resource
ian@0 165
ian@0 166 The MSI capability structure enables a device software driver to call
ian@0 167 pci_enable_msi to request MSI based interrupts. Once MSI interrupts
ian@0 168 are enabled on a device, it stays in this mode until a device driver
ian@0 169 calls pci_disable_msi to disable MSI interrupts and revert to
ian@0 170 INTx emulation mode. Since service drivers of the same PCI-PCI Bridge
ian@0 171 port share the same physical device, if an individual service driver
ian@0 172 calls pci_enable_msi/pci_disable_msi it may result in unpredictable
ian@0 173 behavior. For example, two service drivers run simultaneously on the
ian@0 174 same physical Root Port. Both service drivers call pci_enable_msi to
ian@0 175 request MSI based interrupts. A service driver may not know whether
ian@0 176 any other service drivers have run on this Root Port. If either one
ian@0 177 of them calls pci_disable_msi, it puts the other service driver
ian@0 178 in a wrong interrupt mode.
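
The scenario above can be replayed in a small userspace model. The
types and functions below are hypothetical and do not call the real
pci_enable_msi/pci_disable_msi; they only track the single interrupt
mode that both services share through the same physical port.

```c
#include <assert.h>
#include <stddef.h>

enum irq_mode { MODE_INTX, MODE_MSI };

/* One physical port has exactly one interrupt mode. */
struct shared_port {
    enum irq_mode mode;
};

/* Each service remembers the mode it believes the port is in. */
struct service {
    struct shared_port *port;
    enum irq_mode expected;
};

void service_enable_msi(struct service *s)
{
    assert(s != NULL && s->port != NULL);
    s->port->mode = MODE_MSI;
    s->expected = MODE_MSI;
}

void service_disable_msi(struct service *s)
{
    assert(s != NULL && s->port != NULL);
    s->port->mode = MODE_INTX;  /* also changes the mode for its sibling */
    s->expected = MODE_INTX;
}

/* Nonzero when the port is in the mode this service expects. */
int service_mode_is_consistent(const struct service *s)
{
    assert(s != NULL && s->port != NULL);
    return s->port->mode == s->expected;
}
```

If services A and B both enable MSI and A later disables it, B's
consistency check fails in this model: B still expects MSI while the
port is back in INTx mode.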
ian@0 179
ian@0 180 To avoid this situation, no service driver is permitted to switch
ian@0 181 the interrupt mode on its device. The PCI Express Port Bus driver
ian@0 182 is responsible for determining the interrupt mode and this should be
ian@0 183 transparent to service drivers. Service drivers need to know only
ian@0 184 the vector IRQ assigned to the field irq of struct pcie_device, which
ian@0 185 is passed in when the PCI Express Port Bus driver probes each service
ian@0 186 driver. Service drivers should use (struct pcie_device*)dev->irq to
ian@0 187 call request_irq/free_irq. In addition, the interrupt mode is stored
ian@0 188 in the field interrupt_mode of struct pcie_device.
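
A service driver following this rule only consumes the IRQ it is given.
The sketch below is a userspace mock, not kernel code: mock_pcie_device
carries just the two fields named in the text, and mock_request_irq and
mock_free_irq are hypothetical stand-ins for request_irq/free_irq with
simplified signatures. The mock keeps one handler per IRQ, which is a
simplification.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_IRQ 64

/* Stand-in for struct pcie_device: only the fields named in the text. */
struct mock_pcie_device {
    int irq;             /* vector assigned by the port bus driver */
    int interrupt_mode;  /* chosen by the port bus driver, read-only here */
};

typedef void (*mock_handler_t)(int irq, void *context);

static mock_handler_t mock_handlers[MOCK_MAX_IRQ];

/* Installs a handler; returns 0 on success, -1 on a bad argument. */
int mock_request_irq(int irq, mock_handler_t handler)
{
    if (irq < 0 || irq >= MOCK_MAX_IRQ || handler == NULL)
        return -1;
    mock_handlers[irq] = handler;
    return 0;
}

/* Removes the handler installed for the given IRQ, if any. */
void mock_free_irq(int irq)
{
    if (irq >= 0 && irq < MOCK_MAX_IRQ)
        mock_handlers[irq] = NULL;
}

static void sample_handler(int irq, void *context)
{
    (void)irq;
    (void)context;
}

/* Probe: use dev->irq as given and never touch the interrupt mode. */
int sample_service_probe(struct mock_pcie_device *dev)
{
    assert(dev != NULL);
    return mock_request_irq(dev->irq, sample_handler);
}

/* Remove: release the same IRQ that probe requested. */
void sample_service_remove(struct mock_pcie_device *dev)
{
    assert(dev != NULL);
    mock_free_irq(dev->irq);
}
```

Note that neither callback enables or disables MSI; the interrupt mode
stays whatever the port bus driver selected.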
ian@0 189
ian@0 190 6.2 MSI-X Vector Resources
ian@0 191
ian@0 192 As with MSI, a device driver for an MSI-X capable device can
ian@0 193 call pci_enable_msix to request MSI-X interrupts. No service driver
ian@0 194 is permitted to switch the interrupt mode on its device. The PCI
ian@0 195 Express Port Bus driver is responsible for determining the interrupt
ian@0 196 mode and this should be transparent to service drivers. Any attempt
ian@0 197 by a service driver to call pci_enable_msix/pci_disable_msix may
ian@0 198 result in unpredictable behavior. Service drivers should use
ian@0 199 (struct pcie_device*)dev->irq and call request_irq/free_irq.
ian@0 200
ian@0 201 6.3 PCI Memory/IO Mapped Regions
ian@0 202
ian@0 203 Service drivers for PCI Express Power Management (PME), Advanced
ian@0 204 Error Reporting (AER), Hot-Plug (HP) and Virtual Channel (VC) access
ian@0 205 PCI configuration space on the PCI Express port. In all cases the
ian@0 206 registers accessed are independent of each other. This patch assumes
ian@0 207 that all service drivers will be well behaved and not overwrite
ian@0 208 other service drivers' configuration settings.
ian@0 209
ian@0 210 6.4 PCI Config Registers
ian@0 211
ian@0 212 Each service driver runs its PCI config operations on its own
ian@0 213 capability structure except the PCI Express capability structure, in
ian@0 214 which the Root Control and Device Control registers are shared
ian@0 215 between PME and AER. This patch assumes that all service drivers
ian@0 216 will be well behaved and not overwrite other service drivers'
ian@0 217 configuration settings.
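
One way a driver can stay "well behaved" on a shared register is a
read-modify-write that touches only the bits it owns. The helper below
is a minimal sketch of that idea; the function name is hypothetical, and
the bit masks in the usage note are illustrative assumptions rather than
values taken from this document.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns the new value of a shared 16-bit register in which only the
 * bits selected by owned_mask are replaced by new_bits. All other bits
 * keep their current value, so another driver's settings survive.
 */
uint16_t update_shared_reg(uint16_t current, uint16_t owned_mask,
                           uint16_t new_bits)
{
    /* The caller must not try to set bits it does not own. */
    assert((new_bits & ~owned_mask) == 0);
    return (uint16_t)((current & ~owned_mask) | (new_bits & owned_mask));
}
```

For example, if AER is assumed to own bits 0x0007 and PME bit 0x0008 of
a shared register, an AER driver calling
update_shared_reg(reg, 0x0007, 0x0007) leaves the PME bit as it found it.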