coroutine: switch per-thread free pool to a global pool
ucontext-based coroutines use a free pool to reduce allocations and
deallocations of coroutine objects. The pool is per-thread, presumably
to improve locality. However, as coroutines are usually allocated in
a vcpu thread and freed in the I/O thread, the pool accounting gets
screwed up and we end allocating and freeing a coroutine for every I/O
request. This is expensive since large objects are allocated via the
kernel, and are not cached by the C runtime.
Fix by switching to a global pool. This is safe since we're protected
by the global mutex.
Signed-off-by: Avi Kivity <avi@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>