ia64/linux-2.6.18-xen.hg

annotate Documentation/cpu-hotplug.txt @ 854:950b9eb27661

usbback: fix urb interval value for interrupt urbs.

Signed-off-by: Noboru Iwamatsu <n_iwamatsu@jp.fujitsu.com>
author Keir Fraser <keir.fraser@citrix.com>
date Mon Apr 06 13:51:20 2009 +0100 (2009-04-06)
parents 831230e53067
children
rev   line source
ian@0 1 CPU hotplug Support in Linux(tm) Kernel
ian@0 2
ian@0 3 Maintainers:
ian@0 4 CPU Hotplug Core:
ian@0 5 Rusty Russell <rusty@rustycorp.com.au>
ian@0 6 Srivatsa Vaddagiri <vatsa@in.ibm.com>
ian@0 7 i386:
ian@0 8 Zwane Mwaikambo <zwane@arm.linux.org.uk>
ian@0 9 ppc64:
ian@0 10 Nathan Lynch <nathanl@austin.ibm.com>
ian@0 11 Joel Schopp <jschopp@austin.ibm.com>
ian@0 12 ia64/x86_64:
ian@0 13 Ashok Raj <ashok.raj@intel.com>
ian@0 14 s390:
ian@0 15 Heiko Carstens <heiko.carstens@de.ibm.com>
ian@0 16
ian@0 17 Authors: Ashok Raj <ashok.raj@intel.com>
ian@0 18 Lots of feedback: Nathan Lynch <nathanl@austin.ibm.com>,
ian@0 19 Joel Schopp <jschopp@austin.ibm.com>
ian@0 20
ian@0 21 Introduction
ian@0 22
ian@0 23 Modern advances in system architectures have introduced advanced error
ian@0 24 reporting and correction capabilities in processors. CPU architectures permit
ian@0 25 partitioning support, where compute resources of a single CPU could be made
ian@0 26 available to virtual machine environments. A couple of OEMs also
ian@0 27 support NUMA hardware that is hot-pluggable, where physical
ian@0 28 node insertion and removal require support for CPU hotplug.
ian@0 29
ian@0 30 Such advances require that CPUs available to a kernel can be removed, either
ian@0 31 for provisioning reasons or for RAS purposes, to keep an offending CPU off
ian@0 32 the system execution path. Hence the need for CPU hotplug support in the
ian@0 33 Linux kernel.
ian@0 34
ian@0 35 A more novel use of CPU hotplug support is in suspend/resume
ian@0 36 support for SMP. Dual-core and HT support mean that even
ian@0 37 a laptop runs SMP kernels, which did not support these methods. SMP support
ian@0 38 for suspend/resume is a work in progress.
ian@0 39
ian@0 40 General Stuff about CPU Hotplug
ian@0 41 --------------------------------
ian@0 42
ian@0 43 Command Line Switches
ian@0 44 ---------------------
ian@0 45 maxcpus=n Restrict boot time cpus to n. For example, if you have 4 cpus,
ian@0 46 maxcpus=2 will boot only 2. You can choose to bring the
ian@0 47 other cpus online later; read the FAQ below for more info.
ian@0 48
ian@0 49 additional_cpus*=n Use this to limit hotpluggable cpus. This option sets
ian@0 50 cpu_possible_map = cpu_present_map + additional_cpus
ian@0 51
ian@0 52 (*) Option valid only for the following architectures
ian@0 53 - x86_64, ia64, s390
ian@0 54
ian@0 55 ia64 and x86_64 use the number of disabled local apics in the ACPI MADT table
ian@0 56 to determine the number of potentially hot-pluggable cpus. The implementation
ian@0 57 should rely on this only to count the number of cpus, and *MUST NOT* rely on
ian@0 58 the apicid values in those tables for disabled apics. In the event the BIOS
ian@0 59 doesn't mark such hot-pluggable cpus as disabled entries, one could use the
ian@0 60 parameter "additional_cpus=x" to represent those cpus in the cpu_possible_map.
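
The relation cpu_possible_map = cpu_present_map + additional_cpus can be
illustrated with a small userspace model. This is only a sketch: TOY_NR_CPUS
and toy_possible_count() are hypothetical names, and capping the result at a
compile-time maximum is an assumption made for illustration, not a statement
about the kernel's implementation.

```c
#include <assert.h>

/* Hypothetical compile-time limit, standing in for the kernel's NR_CPUS. */
#define TOY_NR_CPUS 32

/*
 * Toy model: number of bits that would be set in cpu_possible_map when
 * "additional_cpus=n" is given on the command line.
 * Assumption: the result is capped at TOY_NR_CPUS.
 */
static int toy_possible_count(int present, int additional)
{
	int possible;

	assert(present >= 1 && additional >= 0);
	possible = present + additional;
	if (possible > TOY_NR_CPUS)
		possible = TOY_NR_CPUS;
	return possible;
}
```

For example, a machine that boots with 4 cpus present and additional_cpus=2
would, in this model, reserve room for 6 possible cpus.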
ian@0 61
ian@0 62 s390 uses the number of cpus it detects at IPL time to also set the number of
ian@0 63 bits in cpu_possible_map. If it is desired to add additional cpus at a later
ian@0 64 time, the number should be specified using this option or the possible_cpus option.
ian@0 65
ian@0 66 possible_cpus=n [s390 only] Use this to set hotpluggable cpus.
ian@0 67 This option sets possible_cpus bits in
ian@0 68 cpu_possible_map, thus keeping the number of bits set
ian@0 69 constant even if the machine gets rebooted.
ian@0 70 This option overrides additional_cpus.
ian@0 71
ian@0 72 CPU maps and such
ian@0 73 -----------------
ian@0 74 [For more on cpumaps and the primitives that manipulate them, see
ian@0 75 include/linux/cpumask.h, which has more descriptive text.]
ian@0 76
ian@0 77 cpu_possible_map: Bitmap of possible CPUs that can ever be available in the
ian@0 78 system. This is used to allocate some boot time memory for per_cpu variables
ian@0 79 that aren't designed to grow/shrink as CPUs are made available or removed.
ian@0 80 Once set during the boot time discovery phase, the map is static, i.e. no bits
ian@0 81 are added or removed afterwards. Trimming it accurately for your system needs
ian@0 82 upfront can save some boot time memory. See below for how we use heuristics
ian@0 83 in the x86_64 case to keep this in check.
ian@0 84
ian@0 85 cpu_online_map: Bitmap of all CPUs currently online. It is set in __cpu_up()
ian@0 86 after a cpu is available for kernel scheduling and ready to receive
ian@0 87 interrupts from devices. It is cleared when a cpu is brought down using
ian@0 88 __cpu_disable(), before which all OS services including interrupts are
ian@0 89 migrated to another target CPU.
ian@0 90
ian@0 91 cpu_present_map: Bitmap of CPUs currently present in the system. Not all
ian@0 92 of them may be online. When physical hotplug is processed by the relevant
ian@0 93 subsystem (e.g. ACPI), the map can change: a bit is either added to or removed
ian@0 94 from it depending on whether the event is a hot-add or a hot-remove. There are
ian@0 95 currently no locking rules. Typical usage is to init topology during boot,
ian@0 96 at which time hotplug is disabled.
ian@0 97
ian@0 98 You really don't need to manipulate any of the system cpu maps. They should
ian@0 99 be read-only for most uses. When setting up per-cpu resources, almost always use
ian@0 100 cpu_possible_map/for_each_possible_cpu() to iterate.
ian@0 101
ian@0 102 Never use anything other than cpumask_t to represent a bitmap of CPUs.
ian@0 103
ian@0 104 #include <linux/cpumask.h>
ian@0 105
ian@0 106 for_each_possible_cpu - Iterate over cpu_possible_map
ian@0 107 for_each_online_cpu - Iterate over cpu_online_map
ian@0 108 for_each_present_cpu - Iterate over cpu_present_map
ian@0 109 for_each_cpu_mask(x,mask) - Iterate over an arbitrary cpu mask.
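
To show what iterating over a cpu mask amounts to, here is a userspace model.
It is a sketch only: toy_cpumask_t, toy_next_cpu() and toy_for_each_cpu() are
hypothetical stand-ins limited to 32 cpus, not the definitions found in
include/linux/cpumask.h.

```c
#include <assert.h>

/* Toy stand-in for cpumask_t: one bit per cpu, at most 32 cpus. */
typedef unsigned int toy_cpumask_t;
#define TOY_NR_CPUS 32

/* Return the first cpu set in mask at or after 'start', or TOY_NR_CPUS. */
static int toy_next_cpu(toy_cpumask_t mask, int start)
{
	int cpu;

	assert(start >= 0);
	for (cpu = start; cpu < TOY_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return TOY_NR_CPUS;
}

/* Visit every cpu set in mask, lowest numbered first. */
#define toy_for_each_cpu(cpu, mask)				\
	for ((cpu) = toy_next_cpu((mask), 0);			\
	     (cpu) < TOY_NR_CPUS;				\
	     (cpu) = toy_next_cpu((mask), (cpu) + 1))

/* Count the cpus in a mask by iterating over it. */
static int toy_count_cpus(toy_cpumask_t mask)
{
	int cpu, n = 0;

	toy_for_each_cpu(cpu, mask)
		n++;
	return n;
}
```

The point of the model is that the iterators skip cpus whose bit is clear, so
code written against them keeps working when the set of cpus is sparse.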
ian@0 110
ian@0 111 #include <linux/cpu.h>
ian@0 112 lock_cpu_hotplug() and unlock_cpu_hotplug():
ian@0 113
ian@0 114 The above calls are used to inhibit cpu hotplug operations. While holding the
ian@0 115 cpucontrol mutex, cpu_online_map will not change. If you merely need to avoid
ian@0 116 cpus going away, you could also use preempt_disable() and preempt_enable()
ian@0 117 for those sections. Just remember the critical section cannot call any
ian@0 118 function that can sleep or schedule this process away. The preempt_disable()
ian@0 119 will work as long as stop_machine_run() is used to take a cpu down.
ian@0 120
ian@0 121 CPU Hotplug - Frequently Asked Questions.
ian@0 122
ian@0 123 Q: How do I enable my kernel to support CPU hotplug?
ian@0 124 A: When doing make defconfig, enable CPU hotplug support:
ian@0 125
ian@0 126 "Processor type and Features" -> Support for Hotpluggable CPUs
ian@0 127
ian@0 128 Make sure that you have CONFIG_HOTPLUG, and CONFIG_SMP turned on as well.
ian@0 129
ian@0 130 You would need to enable CONFIG_HOTPLUG_CPU for SMP suspend/resume support
ian@0 131 as well.
ian@0 132
ian@0 133 Q: What architectures support CPU hotplug?
ian@0 134 A: As of 2.6.14, the following architectures support CPU hotplug.
ian@0 135
ian@0 136 i386 (Intel), ppc, ppc64, parisc, s390, ia64 and x86_64
ian@0 137
ian@0 138 Q: How to test if hotplug is supported on the newly built kernel?
ian@0 139 A: You should now notice an entry in sysfs.
ian@0 140
ian@0 141 Check if sysfs is mounted, using the "mount" command. You should notice
ian@0 142 an entry as shown below in the output.
ian@0 143
ian@0 144 ....
ian@0 145 none on /sys type sysfs (rw)
ian@0 146 ....
ian@0 147
ian@0 148 If this is not mounted, do the following.
ian@0 149
ian@0 150 #mkdir /sys
ian@0 151 #mount -t sysfs sys /sys
ian@0 152
ian@0 153 Now you should see entries for all present cpus. The following is an example
ian@0 154 on an 8-way system.
ian@0 155
ian@0 156 #pwd
ian@0 157 /sys/devices/system/cpu
ian@0 158 #ls -l
ian@0 159 total 0
ian@0 160 drwxr-xr-x 10 root root 0 Sep 19 07:44 .
ian@0 161 drwxr-xr-x 13 root root 0 Sep 19 07:45 ..
ian@0 162 drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu0
ian@0 163 drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu1
ian@0 164 drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu2
ian@0 165 drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu3
ian@0 166 drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu4
ian@0 167 drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu5
ian@0 168 drwxr-xr-x 3 root root 0 Sep 19 07:44 cpu6
ian@0 169 drwxr-xr-x 3 root root 0 Sep 19 07:48 cpu7
ian@0 170
ian@0 171 Under each directory you would find an "online" file which is the control
ian@0 172 file to logically online/offline a processor.
ian@0 173
ian@0 174 Q: Does hot-add/hot-remove refer to physical add/remove of cpus?
ian@0 175 A: The terms hot-add/remove are not used very consistently in the code.
ian@0 176 CONFIG_HOTPLUG_CPU enables logical online/offline capability in the kernel.
ian@0 177 To support physical addition/removal, one would need some BIOS hooks and
ian@0 178 the platform should have something like an attention button in PCI hotplug.
ian@0 179 CONFIG_ACPI_HOTPLUG_CPU enables ACPI support for physical add/remove of CPUs.
ian@0 180
ian@0 181 Q: How do I logically offline a CPU?
ian@0 182 A: Do the following.
ian@0 183
ian@0 184 #echo 0 > /sys/devices/system/cpu/cpuX/online
ian@0 185
ian@0 186 Once the logical offline is successful, check
ian@0 187
ian@0 188 #cat /proc/interrupts
ian@0 189
ian@0 190 You should no longer see the CPU that you removed. Also, the online file will
ian@0 191 report the state as 0 when a cpu is offline and 1 when it is online.
ian@0 192
ian@0 193 #To display the current cpu state.
ian@0 194 #cat /sys/devices/system/cpu/cpuX/online
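
The same state can be read from a program. The following userspace sketch
interprets the contents of an "online" file; parse_online_state() and
read_online_state() are hypothetical helper names, and the only behaviour
assumed about the file is what is described above (it holds 0 or 1).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Interpret the contents of an "online" file: returns 1 for online,
 * 0 for offline, and -1 if the text is neither 0 nor 1.
 */
static int parse_online_state(const char *text)
{
	int state;

	assert(text != NULL);
	if (sscanf(text, "%d", &state) != 1)
		return -1;
	return (state == 0 || state == 1) ? state : -1;
}

/*
 * Read the state of a cpu from the given path, for example
 * /sys/devices/system/cpu/cpu1/online. Returns -1 if the file
 * cannot be opened or read.
 */
static int read_online_state(const char *path)
{
	char buf[16];
	FILE *f = fopen(path, "r");
	int state = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		state = parse_online_state(buf);
	fclose(f);
	return state;
}
```

A caller should treat -1 as "unknown" rather than "offline", since the file
may simply not exist for a given cpu.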
ian@0 195
ian@0 196 Q: Why can't I remove CPU0 on some systems?
ian@0 197 A: Some architectures may have some special dependency on a certain CPU.
ian@0 198
ian@0 199 For example, on IA64 platforms we have the ability to send platform interrupts
ian@0 200 to the OS, a.k.a. Corrected Platform Error Interrupts (CPEI). In current ACPI
ian@0 201 specifications, we don't have a way to change the target CPU. Hence if the
ian@0 202 current ACPI version doesn't support such re-direction, we prevent that CPU
ian@0 203 from being taken offline by marking it not removable.
ian@0 204
ian@0 205 In such cases you will also notice that the online file is missing under cpu0.
ian@0 206
ian@0 207 Q: How do I find out if a particular CPU is not removable?
ian@0 208 A: Depending on the implementation, some architectures may show this by the
ian@0 209 absence of the "online" file. This is done if it can be determined ahead of
ian@0 210 time that this CPU cannot be removed.
ian@0 211
ian@0 212 In some situations, this can be a run time check, i.e. if you try to remove the
ian@0 213 last CPU, this will not be permitted. You can detect such failures by
ian@0 214 checking the return value of the "echo" command.
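
A program can perform the same check by looking at the result of the write.
This is a minimal userspace sketch; set_cpu_online() is a hypothetical helper,
and which errno value a refused request produces is not specified here, so the
caller should only rely on the result being negative on failure.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Write "0" or "1" to an "online" control file. Returns 0 on success
 * or a negative errno value if the file cannot be opened or the
 * write is rejected.
 */
static int set_cpu_online(const char *path, int online)
{
	FILE *f;
	int err = 0;

	assert(path != NULL);
	f = fopen(path, "w");
	if (!f)
		return errno ? -errno : -EIO;
	if (fputs(online ? "1" : "0", f) == EOF)
		err = errno ? -errno : -EIO;
	/* Buffered data is written out when the stream is closed. */
	if (fclose(f) == EOF && !err)
		err = errno ? -errno : -EIO;
	return err;
}
```

A missing "online" file shows up here as a failure to open the path, which
matches the case where a cpu is known ahead of time to be not removable.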
ian@0 215
ian@0 216 Q: What happens when a CPU is being logically offlined?
ian@0 217 A: The following happen, listed in no particular order :-)
ian@0 218
ian@0 219 - A notification is sent to in-kernel registered modules by sending the event
ian@0 220 CPU_DOWN_PREPARE
ian@0 221 - All processes are migrated away from this outgoing CPU to a new CPU
ian@0 222 - All interrupts targeted to this CPU are migrated to a new CPU
ian@0 223 - Timers/bottom halves/tasklets are also migrated to a new CPU
ian@0 224 - Once all services are migrated, the kernel calls an arch specific routine
ian@0 225 __cpu_disable() to perform arch specific cleanup.
ian@0 226 - Once this is successful, an event for successful cleanup is sent:
ian@0 227 CPU_DEAD.
ian@0 228
ian@0 229 "It is expected that each service cleans up when the CPU_DOWN_PREPARE
ian@0 230 notifier is called; when CPU_DEAD is called, it is expected that there is
ian@0 231 nothing running on behalf of the CPU that was offlined."
ian@0 232
ian@0 233 Q: If I have some kernel code that needs to be aware of CPU arrival and
ian@0 234 departure, how do I arrange for proper notification?
ian@0 235 A: This is what you would need in your kernel code to receive notifications.
ian@0 236
#include <linux/cpu.h>

/* Called by the hotplug core for every cpu event; hcpu carries the cpu number. */
static int __cpuinit foobar_cpu_callback(struct notifier_block *nfb,
					 unsigned long action, void *hcpu)
{
	unsigned int cpu = (unsigned long)hcpu;

	switch (action) {
	case CPU_ONLINE:
		foobar_online_action(cpu);
		break;
	case CPU_DEAD:
		foobar_dead_action(cpu);
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block __cpuinitdata foobar_cpu_notifier =
{
	.notifier_call = foobar_cpu_callback,
};
ian@0 258
ian@0 259 You need to call register_cpu_notifier() from your init function.
ian@0 260 Init functions could be of two types:
ian@0 261 1. early init (init function called when only the boot processor is online).
ian@0 262 2. late init (init function called _after_ all the CPUs are online).
ian@0 263
ian@0 264 For the first case, you should add the following to your init function
ian@0 265
ian@0 266 register_cpu_notifier(&foobar_cpu_notifier);
ian@0 267
ian@0 268 For the second case, you should add the following to your init function
ian@0 269
ian@0 270 register_hotcpu_notifier(&foobar_cpu_notifier);
ian@0 271
ian@0 272 You can fail PREPARE notifiers if something doesn't work to prepare resources.
ian@0 273 This will stop the activity and send a subsequent CANCELED event back.
ian@0 274
ian@0 275 CPU_DEAD should not be failed; it is just a goodness indication, but bad
ian@0 276 things will happen if a notifier in the path sends a BAD notify code.
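
The PREPARE/CANCELED behaviour can be pictured with a toy userspace model of a
notifier chain. Everything here is an assumption made for illustration: the
toy_* names and values are hypothetical, and the model delivers the CANCELED
event to every notifier in the chain, which may differ from what the kernel
does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy event and return codes; names follow the text, values are arbitrary. */
enum toy_event { TOY_UP_PREPARE, TOY_UP_CANCELED, TOY_ONLINE };
enum toy_ret { TOY_NOTIFY_OK, TOY_NOTIFY_BAD };

typedef enum toy_ret (*toy_notifier_fn)(enum toy_event ev, int cpu);

/*
 * Toy model of bringing a cpu up: notifiers see UP_PREPARE first. If one
 * of them fails, the bring-up stops and all notifiers get UP_CANCELED;
 * otherwise all of them get ONLINE.
 * Returns 0 on success, -1 if the bring-up was canceled.
 */
static int toy_cpu_up(toy_notifier_fn *chain, size_t n, int cpu)
{
	size_t i;
	int failed = 0;

	assert(chain != NULL || n == 0);
	for (i = 0; i < n && !failed; i++)
		if (chain[i](TOY_UP_PREPARE, cpu) == TOY_NOTIFY_BAD)
			failed = 1;

	for (i = 0; i < n; i++)
		chain[i](failed ? TOY_UP_CANCELED : TOY_ONLINE, cpu);

	return failed ? -1 : 0;
}

/* Sample notifier that counts the ONLINE and UP_CANCELED events it sees. */
static int toy_online_seen, toy_canceled_seen;

static enum toy_ret toy_counting_notifier(enum toy_event ev, int cpu)
{
	(void)cpu;
	if (ev == TOY_ONLINE)
		toy_online_seen++;
	if (ev == TOY_UP_CANCELED)
		toy_canceled_seen++;
	return TOY_NOTIFY_OK;
}

/* Sample notifier that cannot prepare resources for cpu 3. */
static enum toy_ret toy_picky_notifier(enum toy_event ev, int cpu)
{
	if (ev == TOY_UP_PREPARE && cpu == 3)
		return TOY_NOTIFY_BAD;
	return TOY_NOTIFY_OK;
}
```

With both sample notifiers in the chain, bringing up cpu 1 in the model ends
with an ONLINE event, while bringing up cpu 3 ends with a CANCELED event and
no ONLINE event.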
ian@0 277
ian@0 278 Q: I don't see my action being called for all CPUs already up and running?
ian@0 279 A: Yes, CPU notifiers are called only when new CPUs are onlined or offlined.
ian@0 280 If you need to perform some action for each cpu already in the system, do this:
ian@0 281
	for_each_online_cpu(i) {
		/* The callback takes the cpu number as a void pointer. */
		foobar_cpu_callback(&foobar_cpu_notifier, CPU_UP_PREPARE,
				    (void *)(long)i);
		foobar_cpu_callback(&foobar_cpu_notifier, CPU_ONLINE,
				    (void *)(long)i);
	}
ian@0 286
ian@0 287 Q: If I would like to develop cpu hotplug support for a new architecture,
ian@0 288 what do I need at a minimum?
ian@0 289 A: The following are required for the CPU hotplug infrastructure to work
ian@0 290 correctly.
ian@0 291
ian@0 292 - Make sure you have an entry in Kconfig to enable CONFIG_HOTPLUG_CPU
ian@0 293 - __cpu_up() - Arch interface to bring up a CPU
ian@0 294 - __cpu_disable() - Arch interface to shutdown a CPU, no more interrupts
ian@0 295 can be handled by the kernel after the routine
ian@0 296 returns. This includes shutting down local APIC
ian@0 297 timers etc.
ian@0 298 - __cpu_die() - This is supposed to ensure the death of the CPU.
ian@0 299 Look at some example code in other archs
ian@0 300 that implement CPU hotplug. The processor is taken
ian@0 301 down from the idle() loop for that specific
ian@0 302 architecture. __cpu_die() typically waits for some
ian@0 303 per_cpu state to be set, to be positively sure that
ian@0 304 the processor dead routine has been called.
ian@0 305
ian@0 306 Q: I need to ensure that a particular cpu is not removed while some
ian@0 307 work specific to this cpu is in progress.
ian@0 308 A: First switch the current thread context to the preferred cpu:
ian@0 309
int my_func_on_cpu(int cpu)
{
	cpumask_t saved_mask, new_mask = CPU_MASK_NONE;
	int curr_cpu, err = 0;

	saved_mask = current->cpus_allowed;
	cpu_set(cpu, new_mask);
	err = set_cpus_allowed(current, new_mask);

	if (err)
		return err;

	/*
	 * If we got scheduled out just after the return from
	 * set_cpus_allowed() before running the work, this ensures
	 * we stay locked.
	 */
	curr_cpu = get_cpu();

	if (curr_cpu != cpu) {
		err = -EAGAIN;
		goto ret;
	} else {
		/*
		 * Do work : But can't sleep, since get_cpu() disables preempt
		 */
	}
ret:
	put_cpu();
	set_cpus_allowed(current, saved_mask);
	return err;
}
ian@0 342
ian@0 343
ian@0 344 Q: How do we determine how many CPUs are available for hotplug?
ian@0 345 A: There is no clear spec-defined way in ACPI that can give us that
ian@0 346 information today. Based on some input from Natalie of Unisys,
ian@0 347 the ACPI MADT (Multiple APIC Description Table) marks those possible
ian@0 348 CPUs in a system with disabled status.
ian@0 349
ian@0 350 Andi implemented some simple heuristics that count the number of disabled
ian@0 351 CPUs in the MADT as hotpluggable CPUs. In case there are no disabled CPUs,
ian@0 352 we assume 1/2 the number of CPUs currently present can be hotplugged.
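
The heuristic just described fits in a few lines. This is a userspace sketch
of the rule as stated in the text, not the kernel's code:
toy_hotpluggable_cpus() is a hypothetical name, and rounding down when the
number of present cpus is odd is an assumption.

```c
#include <assert.h>

/*
 * Estimate the number of hotpluggable cpus: use the count of disabled
 * MADT entries if there are any, otherwise assume half of the cpus
 * currently present (rounded down, which is an assumption).
 */
static int toy_hotpluggable_cpus(int present, int disabled)
{
	assert(present >= 1 && disabled >= 0);
	if (disabled > 0)
		return disabled;
	return present / 2;
}
```

For an 8-way system whose MADT lists no disabled entries, the model yields 4
hotpluggable cpus; with 2 disabled entries it yields 2, regardless of how many
cpus are present.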
ian@0 353
ian@0 354 Caveat: Today's ACPI MADT can only provide 256 entries since the apicid field
ian@0 355 in MADT is only 8 bits.
ian@0 356
ian@0 357 User Space Notification
ian@0 358
ian@0 359 Hotplug support for devices is common in Linux today. It is being used to
ian@0 360 support automatic configuration of network, usb and pci devices. A hotplug
ian@0 361 event can be used to invoke an agent script to perform the configuration task.
ian@0 362
ian@0 363 You can add /etc/hotplug/cpu.agent to handle hotplug notifications in user
ian@0 364 space scripts.
ian@0 365
#!/bin/bash
# $Id: cpu.agent
# Kernel hotplug params include:
#ACTION=%s [online or offline]
#DEVPATH=%s
#
cd /etc/hotplug
. ./hotplug.functions

case $ACTION in
	online)
		echo `date` ":cpu.agent" add cpu >> /tmp/hotplug.txt
		;;
	offline)
		echo `date` ":cpu.agent" remove cpu >> /tmp/hotplug.txt
		;;
	*)
		debug_mesg CPU $ACTION event not supported
		exit 1
		;;
esac