annotate Documentation/infiniband/user_verbs.txt @ 524:7f8b544237bf

netfront: Allow netfront in domain 0.

This is useful if your physical network device is in a utility domain.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
author Keir Fraser <keir.fraser@citrix.com>
date Tue Apr 15 15:18:58 2008 +0100 (2008-04-15)
parents 831230e53067

The ib_uverbs module, built by enabling CONFIG_INFINIBAND_USER_VERBS,
enables direct userspace access to IB hardware via "verbs," as
described in chapter 11 of the InfiniBand Architecture Specification.

To use the verbs, the libibverbs library, available from
<http://openib.org/>, is required. libibverbs contains a
device-independent API for using the ib_uverbs interface.
libibverbs also requires the appropriate device-dependent kernel and
userspace drivers for your InfiniBand hardware. For example, to use
a Mellanox HCA, you will need the ib_mthca kernel module and the
libmthca userspace driver to be installed.

User-kernel communication

Userspace communicates with the kernel for slow path, resource
management operations via the /dev/infiniband/uverbsN character
devices. Fast path operations are typically performed by writing
directly to hardware registers mmap()ed into userspace, with no
system call or context switch into the kernel.

Commands are sent to the kernel via write()s on these device files.
The ABI is defined in drivers/infiniband/include/ib_user_verbs.h.
The structs for commands that require a response from the kernel
contain a 64-bit field used to pass a pointer to an output buffer.
Status is returned to userspace as the return value of the write()
system call.
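
As a rough illustration of the command format described above, the
following sketch fills in a command whose struct carries a 64-bit
field holding the address of an output buffer. The struct layouts and
all demo_* / DEMO_* names are hypothetical, modeled only on the
description here; the real definitions are in ib_user_verbs.h. No
device file is opened.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical command header: which command is being issued, and how
 * large its input and output are, counted in 32-bit words. */
struct demo_cmd_hdr {
    uint32_t command;
    uint16_t in_words;
    uint16_t out_words;
};

/* Hypothetical response that the kernel would fill in. */
struct demo_query_resp {
    uint32_t value;
    uint32_t reserved;
};

/* Hypothetical command body. The 64-bit field carries the userspace
 * address of the output buffer, whatever the native pointer size is. */
struct demo_query_cmd {
    struct demo_cmd_hdr hdr;
    uint64_t response;
};

enum { DEMO_CMD_QUERY = 1 };

/* Fill in a command so that it could be handed to write(). */
static void demo_build_query(struct demo_query_cmd *cmd,
                             struct demo_query_resp *resp)
{
    memset(cmd, 0, sizeof(*cmd));
    cmd->hdr.command   = DEMO_CMD_QUERY;
    cmd->hdr.in_words  = sizeof(*cmd) / 4;
    cmd->hdr.out_words = sizeof(*resp) / 4;
    cmd->response      = (uint64_t)(uintptr_t)resp;
}
```

A caller would then pass the filled-in struct to write() on the
device file and take the return value of write() as the status.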

Resource management

Since creation and destruction of all IB resources is done by
commands passed through a file descriptor, the kernel can keep track
of which resources are attached to a given userspace context. The
ib_uverbs module maintains idr tables that are used to translate
between kernel pointers and opaque userspace handles, so that kernel
pointers are never exposed to userspace and userspace cannot trick
the kernel into following a bogus pointer.

This also allows the kernel to clean up when a process exits and
prevent one process from touching another process's resources.
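
The handle translation described above can be pictured with a toy
table in plain C. This is not the kernel's idr API: the fixed-size
array, the integer context IDs, and all demo_* names are assumptions
made for illustration. It shows only the idea that userspace sees a
small integer while the pointer stays on the kernel side.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_HANDLES 8

/* Toy stand-in for an idr table: the slot index is the opaque handle.
 * The table must start out zero-initialized. */
struct demo_handle_table {
    void *object[DEMO_MAX_HANDLES];  /* kernel-side pointer, never exported */
    int   owner[DEMO_MAX_HANDLES];   /* context that created the object */
};

/* Store a pointer and return its handle, or -1 if the table is full. */
static int demo_handle_alloc(struct demo_handle_table *t, void *obj, int ctx)
{
    for (int h = 0; h < DEMO_MAX_HANDLES; h++) {
        if (t->object[h] == NULL) {
            t->object[h] = obj;
            t->owner[h] = ctx;
            return h;
        }
    }
    return -1;
}

/* Translate a handle back to a pointer. Out-of-range, unused, or
 * foreign handles yield NULL instead of a pointer to follow. */
static void *demo_handle_lookup(const struct demo_handle_table *t,
                                uint32_t handle, int ctx)
{
    if (handle >= DEMO_MAX_HANDLES)
        return NULL;
    if (t->object[handle] == NULL || t->owner[handle] != ctx)
        return NULL;
    return t->object[handle];
}

/* Release every entry owned by a context, as would happen when a
 * process exits. Returns the number of entries released. */
static int demo_handle_cleanup(struct demo_handle_table *t, int ctx)
{
    int freed = 0;

    for (int h = 0; h < DEMO_MAX_HANDLES; h++) {
        if (t->object[h] != NULL && t->owner[h] == ctx) {
            t->object[h] = NULL;
            freed++;
        }
    }
    return freed;
}
```

Because every lookup is checked against the table and the owning
context, a handle invented by userspace or borrowed from another
process simply fails to resolve.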

Memory pinning

Direct userspace I/O requires that memory regions that are potential
I/O targets be kept resident at the same physical address. The
ib_uverbs module manages pinning and unpinning memory regions via
get_user_pages() and put_page() calls. It also accounts for the
amount of memory pinned in the process's locked_vm, and checks that
unprivileged processes do not exceed their RLIMIT_MEMLOCK limit.

Pages that are pinned multiple times are counted each time they are
pinned, so the value of locked_vm may be an overestimate of the
number of pages pinned by a process.
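
The accounting rule above can be sketched as simple arithmetic. The
code below is a userspace model, not the kernel implementation: the
demo_mm struct, the page-count limit, and the privileged flag are
assumptions chosen to mirror the description, and no memory is
actually pinned.

```c
#include <stdint.h>

/* Bookkeeping for one process; the names are hypothetical. */
struct demo_mm {
    uint64_t locked_vm;      /* pages currently accounted as pinned */
    uint64_t memlock_limit;  /* RLIMIT_MEMLOCK expressed in pages */
};

/* Number of pages touched by the byte range [addr, addr + len). */
static uint64_t demo_pages_spanned(uint64_t addr, uint64_t len,
                                   uint64_t page_size)
{
    if (len == 0)
        return 0;
    uint64_t first = addr / page_size;
    uint64_t last  = (addr + len - 1) / page_size;
    return last - first + 1;
}

/* Account for pinning a region. Returns 0 on success, or -1 if an
 * unprivileged caller would exceed its limit. */
static int demo_account_pin(struct demo_mm *mm, uint64_t addr, uint64_t len,
                            uint64_t page_size, int privileged)
{
    uint64_t npages = demo_pages_spanned(addr, len, page_size);

    if (!privileged && mm->locked_vm + npages > mm->memlock_limit)
        return -1;
    mm->locked_vm += npages;  /* counted again even if already pinned */
    return 0;
}
```

Pinning the same one-page region twice raises locked_vm by two in
this model, which is the overestimate mentioned above.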

/dev files

To create the appropriate character device files automatically with
udev, a rule like

    KERNEL=="uverbs*", NAME="infiniband/%k"

can be used. This will create device nodes named

    /dev/infiniband/uverbs0

and so on. Since the InfiniBand userspace verbs should be safe for
use by non-privileged processes, it may be useful to add an
appropriate MODE or GROUP to the udev rule.
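
One way to confirm that a rule took effect is to examine the
resulting node from a small program. The helper below is a sketch
using only stat(); the function name demo_check_chardev is
hypothetical and nothing in it is specific to InfiniBand.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if path is a character device, 0 if it is something else,
 * and -1 if it cannot be examined (for example, it does not exist).
 * When mode_out is non-NULL it receives the permission bits. */
static int demo_check_chardev(const char *path, mode_t *mode_out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    if (mode_out != NULL)
        *mode_out = st.st_mode & 07777;
    return S_ISCHR(st.st_mode) ? 1 : 0;
}
```

Calling demo_check_chardev("/dev/infiniband/uverbs0", &mode) after
the module is loaded would report whether the node exists and which
permission bits any MODE setting in the rule produced.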