Reduce size of Xen-qemu shared ioreq structure to 32 bytes. This has two
advantages:
1. We can support up to 128 VCPUs with a single shared page
2. If/when we want to go beyond 128 VCPUs, a whole number of ioreq_t
structures will pack into a single shared page, so a multi-page array will
have no ioreq_t straddling a page boundary
Also, while modifying qemu, replace a 32-entry vcpu-indexed array with a
dynamically-allocated array.